Sunday, 1 September 2019

iconv separates accents from letter

I'm trying to write a function that returns a given string without its accents, but iconv's //TRANSLIT option only seems to separate each accent from its letter instead of removing it.

Here's my function:

<?php
function strRemoveAccents($str)
{
    return iconv(mb_detect_encoding($str), 'us-ascii//TRANSLIT', $str);
}

And here are my results:

  • Test 1
    • Input: Athènes
    • Expected output: Athenes
    • Current output: Ath`enes
  • Test 2
    • Input: Gdańsk
    • Expected output: Gdansk
    • Current output: Gda'nsk
  • Test 3
    • Input: niño
    • Expected output: nino
    • Current output: ni~no
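
For anyone who wants to reproduce these three tests, here is a minimal sketch of a test script. The `runCases` helper and the `$cases` array are hypothetical names added for illustration, and the source encoding is hard-coded to 'UTF-8' on the assumption that the inputs are UTF-8 strings. The actual output depends on the system, so none is shown here.

```php
<?php
// The function under test, with the source encoding assumed to be UTF-8.
function strRemoveAccents($str)
{
    return iconv('UTF-8', 'us-ascii//TRANSLIT', $str);
}

// Inputs and expected outputs taken from the three tests.
$cases = [
    'Athènes' => 'Athenes',
    'Gdańsk'  => 'Gdansk',
    'niño'    => 'nino',
];

// Hypothetical helper: runs every case and returns one report row per input.
function runCases(array $cases)
{
    $rows = [];
    foreach ($cases as $input => $expected) {
        // iconv() returns false on failure, so keep the raw value.
        $actual = strRemoveAccents($input);
        $rows[] = [
            'input'    => $input,
            'expected' => $expected,
            'actual'   => $actual,
            'ok'       => $actual === $expected,
        ];
    }
    return $rows;
}

foreach (runCases($cases) as $row) {
    printf(
        "%s: expected %s, got %s [%s]\n",
        $row['input'],
        $row['expected'],
        var_export($row['actual'], true),
        $row['ok'] ? 'OK' : 'MISMATCH'
    );
}
```

Running the same script on each machine makes it easy to compare the "got" column between environments.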

Some additional details:

  • mb_detect_encoding returns 'UTF-8' for all of my tests, and replacing the call with that literal value does not change anything.
  • My locale is currently set to LC_COLLATE=C;LC_CTYPE=French_France.1252;LC_MONETARY=C;LC_NUMERIC=C;LC_TIME=C
  • I also tried changing the locale to en_US.UTF-8 (I checked that the locale was successfully updated), but the function still returned the same result.
  • On a MacBook with the default locale set to c/fr_FR.UTF-8/c/c/c/c, the problem is the same.
  • I could strip the leftover accent characters afterwards, but since I'll be using the function on whole sentences, I don't want to remove more apostrophes than necessary.
  • Edit: when testing with this sandbox, I get the results I want.
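
Since the results differ between my machines and the sandbox, it may help to compare the environments side by side. The sketch below is only a diagnostic aid, not a fix: `iconvEnvironment` is a hypothetical helper name, and it assumes the iconv extension is loaded so that the `ICONV_IMPL` and `ICONV_VERSION` constants are defined. The behaviour of //TRANSLIT is system dependent, so the iconv implementation and the locale are both worth checking.

```php
<?php
// Hypothetical helper: collects settings that may influence //TRANSLIT output.
function iconvEnvironment()
{
    return [
        'php_version'   => PHP_VERSION,
        'os'            => PHP_OS,
        // Passing "0" only reads the current locale, it does not change it.
        'locale'        => setlocale(LC_ALL, '0'),
        // Name and version of the iconv implementation PHP was built against.
        'iconv_impl'    => ICONV_IMPL,
        'iconv_version' => ICONV_VERSION,
    ];
}

foreach (iconvEnvironment() as $name => $value) {
    echo $name, ': ', var_export($value, true), "\n";
}
```

Running this on the Windows machine, the MacBook, and the sandbox would show whether they use the same iconv implementation and locale.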

I'm probably missing something, but I don't see what.


