I ran into a problem at the beginning of the year, and I have just figured out what the problem is and how to fix it.
I have a Unicode converter from Eastern James Bay Cree roman orthography to syllabics.
Once all the characters have been converted from roman to syllabics, a ᐅ has to be changed to ᐤ, but only if it is the word-final u or occurs before a word-final h.
I decided to use a regex in my PHP converter:
$find ='/(\p{Lo})ᐅ(᙮|ᐦ)?\b/ui';
When I run a test with wiiuh, I get:
- WAMP: ᐧᐄᐤᐦ
- LAMP: ᐧᐄᐅᐦ (ᐧᐄᐅᐦ needs to be changed to ᐧᐄᐤᐦ)
It turns out that the word boundary \b on my Linux server is not Unicode-aware, so I had to swap it for the negative lookahead (?!\pL), which only asserts that the next character is not a letter:
$find = '/(\p{Lo})ᐅ(᙮|ᐦ)?(?!\pL)/ui';
$processed = preg_replace($find, '\1ᐤ\2', $processed);
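For anyone who wants to try this outside the converter, here is a minimal sketch that wraps the same pattern and replacement in a function. The function name fixFinalU and the second test string are my own assumptions for illustration; only the pattern, the replacement, and the wiiuh test word come from the code above.

```php
<?php
// Hypothetical helper: applies the final-u rule to text that has
// already been converted to syllabics.
function fixFinalU(string $text): string
{
    // Same pattern as above: a letter, then ᐅ, then an optional ᙮ or ᐦ,
    // with no letter following.
    $find = '/(\p{Lo})ᐅ(᙮|ᐦ)?(?!\pL)/ui';
    return preg_replace($find, '\1ᐤ\2', $text);
}

// The test word from this post (wiiuh after conversion).
echo fixFinalU('ᐧᐄᐅᐦ'), PHP_EOL; // ᐧᐄᐤᐦ

// An arbitrary string of syllabics (not necessarily a real word) where
// ᐅ is followed by another letter, so it should be left alone.
echo fixFinalU('ᐊᐅᐸ'), PHP_EOL;
```

The source file has to be saved as UTF-8, because the pattern contains literal syllabics and the u modifier expects UTF-8 input.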
That fixed the problem.
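If you want to check how \b behaves on your own server, something like the following sketch should show it. The variable names are mine, and I am assuming the difference comes from how PCRE is built or configured on each machine, which I have not verified beyond my own two setups.

```php
<?php
$word = 'ᐧᐄᐅᐦ';

// Matches only if \b treats ᐦ as a word character at the end of the string.
$withBoundary = preg_match('/ᐦ\b/u', $word);

// Does not depend on \b at all: only checks that no letter follows ᐦ.
$withLookahead = preg_match('/ᐦ(?!\pL)/u', $word);

var_dump($withBoundary);
var_dump($withLookahead);

// The PCRE version PHP was built against, useful when comparing servers.
var_dump(PCRE_VERSION);
```

If the first result differs between two machines while the second is the same on both, the lookahead version is the safer one to use.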