How to access the Arabic BEH initial form from a font that places the glyph in a non-conventional Unicode slot
(I'm new to both Arabic, and doing things related to Unicode, so I might be overlooking some glaring detail.)
So, I'm using the SIL font Lateef and want to access the BEH initial form glyph in HTML. (In MSA, "with/through" is translated as the prefix ﺑِ, so I want to show it in initial form to make clear that it is a prefix and not a word on its own.)
The BEH initial form glyph has the Unicode code point U+FE91. However, Lateef does not include the glyph at this 'location' (not entirely sure what to call it); instead it includes the glyph at U+1016F ("Greek Acrophonic Carystian Five Hundred") and reaches it through a substitution table. (Using a substitution table makes sense, but I don't get why Lateef puts the glyph in this weird place.)
As my app definitely uses Lateef, included as a web font, I decided to just reference it as U+1016F in JavaScript (\u{1016F}) or HTML (&#x1016F; or &#65903;). Ugly, I know, because it is non-semantic, but I couldn't think of a better way. However, even though the element in question was indeed using Lateef as its font, Chrome displayed the "Greek Acrophonic Carystian Five Hundred" instead.
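For reference, the JavaScript escape and the two HTML character references all denote the same code point. A minimal sketch that derives them; the helper name describeCodePoint is made up for illustration and is not part of any library:

```javascript
// Hypothetical helper: derive the JavaScript and HTML notations for one code point.
function describeCodePoint(cp) {
  const hex = cp.toString(16).toUpperCase();
  return {
    jsEscape: `\\u{${hex}}`,   // source-code form of the JavaScript escape
    htmlHex: `&#x${hex};`,     // hexadecimal character reference
    htmlDecimal: `&#${cp};`,   // decimal character reference
    // Code points above U+FFFF occupy two UTF-16 code units in a JS string.
    utf16Length: String.fromCodePoint(cp).length,
  };
}

const info = describeCodePoint(0x1016f);
console.log(info);
```

Note that U+1016F lies outside the Basic Multilingual Plane, so the older four-digit \uXXXX escape cannot express it directly.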
My questions then, are:
How come Chrome displays the "Greek Acrophonic Carystian Five Hundred" glyph, when the HTML element is clearly using Lateef, and the unicode character is correctly referenced, and Lateef indeed includes the BEH initial form glyph at U+1016F?
Is it at all possible, to display Lateef's initial BEH? (Other than editing the font to include the glyph at the correct location, as well.)
Is this an error/misunderstanding on my part, or an actual weirdness on behalf of Chrome, Ubuntu/Chrome, Unicode, or HTML?
The reason why the glyph was not accessible
Apparently, although the BEH initial form glyph was 'stored' at slot 0x1016f, it wasn't 'assigned' the Unicode value U+1016F. As far as I understand, the glyph then does not appear in the font's cmap table, which is where the rendering engine looks up the glyph for each character.
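To make the distinction concrete, here is a toy model of that lookup. It is only a sketch: the glyph names and the toyFont object are invented for illustration and do not reflect Lateef's actual tables or any real font API.

```javascript
// Toy model of a font. Glyph names and cmap entries are invented for illustration.
const toyFont = {
  // Every glyph the font contains, whether or not it is encoded.
  glyphs: new Set(["beh", "beh.init", "noon"]),
  // cmap: code point -> glyph name. "beh.init" exists but has no entry here.
  cmap: new Map([
    [0x0628, "beh"],
    [0x0646, "noon"],
  ]),
};

// Returns the glyph name for a code point, or null when the cmap has no entry
// (a browser would then look for the character in a fallback font).
function lookupGlyph(font, codePoint) {
  return font.cmap.has(codePoint) ? font.cmap.get(codePoint) : null;
}

console.log(lookupGlyph(toyFont, 0x0628));  // prints beh
console.log(lookupGlyph(toyFont, 0x1016f)); // prints null
```

In this model "beh.init" is present in the font yet unreachable by code point; only a substitution applied during shaping can select it, which matches the behaviour described above.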
I discovered this when messing around in FontForge. In FontForge, to view the characters labelled by Unicode value instead of glyph image, select "View" > "Label Glyph By" > "Unicode"; and to set the Unicode value for the glyph in question, select the glyph, right-click, then set "Glyph Info..." > "Unicode" > "Unicode Value" appropriately.
I imagine this is a bug in the Lateef font; the 'correct' way to go would have been to make the glyph accessible through the cmap table, and also to put it at 0xfe91 in the first place instead of 0x1016f (where it doesn't semantically belong).
How to access the glyph anyway
EASY: Just append U+0640 (ARABIC TATWEEL) after the letter: بـ.
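A minimal sketch of building such a string in JavaScript. The function name withTatweel is hypothetical, and placing the vowel mark before the tatweel is my assumption, based on combining marks attaching to the letter that precedes them:

```javascript
const BEH = "\u0628";     // ARABIC LETTER BEH
const KASRA = "\u0650";   // ARABIC KASRA (vowel mark)
const TATWEEL = "\u0640"; // ARABIC TATWEEL (joining extender)

// Hypothetical helper: append a tatweel so the shaping engine joins the letter
// to its left. Any marks go directly after the letter they belong to.
function withTatweel(letter, marks = "") {
  return letter + marks + TATWEEL;
}

const prefix = withTatweel(BEH, KASRA);
console.log(prefix);
```

The tatweel is a visible character, so the result shows a short connecting stroke after the letter.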
OLD / COMPLICATED:
Apart from modifying the font, as the explanation above suggests, I came up with another, funky way of accessing the glyph without modifying the font.
<div style="direction: rtl;">ب&zwj;<span style="color:#fff; width: 0px; display: inline-block;">ن</span></div>
As the glyph can be accessed implicitly when the letter is followed by other letters, one can simply hide those following letters and end up with just the BEH initial form glyph. However, an additional trick must be employed: adding a zero-width joiner (U+200D, &zwj;) after the BEH, so that the characters connect through the intervening <span> element.
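A possible simplification, sketched under an assumption I have not tested with Lateef: U+200D is defined by Unicode as join-causing, so a shaping engine that honours it should give the BEH its initial form even without the hidden letter. The function name initialFormMarkup is hypothetical:

```javascript
// Hypothetical helper: wrap a letter and a zero-width joiner in an RTL span.
// The joiner is written as the &zwj; entity so it stays visible in the HTML source.
function initialFormMarkup(letter) {
  return `<span dir="rtl">${letter}&zwj;</span>`;
}

const markup = initialFormMarkup("\u0628"); // U+0628 ARABIC LETTER BEH
console.log(markup);
```

If the font or browser ignores the joiner, the hidden-letter version above remains the fallback.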