
How do I change the order/position of characters in Unicode fonts when a language demands it?

@Vandalay110

Posted in: #FontDesign #Fonts #Typography #Unicode

When designing a font for another language in Unicode, how do you change the position of the characters to show up how they need to show up?

Edit: Since the characters don't seem to be showing up for some of you, I am including an image with the text and will reference which line in the image I am referring to when using the characters (Format: [character]/[line number in image])



For example, in Gurmukhi, the vowel sign ਿ/1 should appear before the consonant it is attached to. When attached to ਸ/2, it renders as ਸਿ/3. Although it appears first, the ਿ/1 character is actually stored after the ਸ/2 character. You can see this by copying the characters together into another text box and pressing backspace.
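The storage order described above can also be checked without a text box. Here is a minimal Python sketch using the standard unicodedata module; the helper names describe and backspace are made up for illustration, and the backspace function is only a toy that drops the last stored code point (real editors may delete differently).

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs in storage order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def backspace(text):
    """Toy backspace: drop the last stored code point (hypothetical helper)."""
    return text[:-1]


syllable = "\u0A38\u0A3F"  # stored as SA first, then VOWEL SIGN I

# List the code points in the order they are stored, not the order drawn.
for code_point, name in describe(syllable):
    print(code_point, name)

# After one toy backspace only the consonant is left.
print(describe(backspace(syllable)))
```

The vowel sign is listed last even though it is drawn first, which is why backspace removes it before the consonant.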

Notably, it even shifts the character it is attached to in order to make space when needed. See this example: ਨਾਮ/4 and ਨਾਮਿ/5. You can see here that the ਮ/6 character is shifted to the right to make room for the ਿ/1 character (again, you can see it better if you copy and press backspace).

I have seen other character positions handled with anchors, but those don't seem to apply here exactly. Even for other characters in Gurmukhi, anchors are used to change the character position. For example, ੁ/7 is supposed to be positioned below a consonant, like so: ਸੁ/8. Unicode fonts handle this with anchors, but that is not the case for ਿ/1. When you open a Unicode font in something like FontForge, you can see that there are no anchors in the ਿ/1 character (U+0A3F, in case you want to look). Also, I've seen that right-to-left languages like Arabic don't have anchors either, so they are probably handled in a different way as well.

But how exactly are these cases handled?


@Nimeshi706

Combining characters and positioning

OpenType fonts have a Glyph Positioning table (GPOS) which is used to provide precise control over glyph placement for sophisticated text layout and rendering in different scripts. The GPOS table can position glyphs in a number of ways.

From the Microsoft OpenType Specification (this excerpt omits the first two of the eight types, single adjustment and pair adjustment):


The GPOS table supports eight types of actions for positioning and attaching glyphs:


A cursive attachment describes cursive scripts and other glyphs that are connected with attachment points when rendered.
A MarkToBase attachment positions combining marks with respect to base glyphs, as when positioning vowels, diacritical marks, or tone marks in Arabic, Hebrew, and Vietnamese.
A MarkToLigature attachment positions combining marks with respect to ligature glyphs. Because ligatures may have multiple points for attaching marks, the font developer needs to associate each mark with one of the ligature glyph's components.
A MarkToMark attachment positions one mark relative to another, as when positioning tone marks with respect to vowel diacritical marks in Vietnamese.
Contextual positioning describes how to position one or more glyphs in context, within an identifiable sequence of specific glyphs, glyph classes, or varied sets of glyphs. One or more positioning operations may be performed on “input” context sequences. Figure 4e illustrates a context for positioning adjustments.
Chaining Contextual positioning describes how to position one or more glyphs in a chained context, within an identifiable sequence of specific glyphs, glyph classes, or varied sets of glyphs. One or more positioning operations may be performed on “input” context sequences.



The GPOS table in combination with combining characters can be used to create diacritics and other layout combinations.
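To make the MarkToBase idea concrete, here is a small Python sketch of the underlying arithmetic: the mark is moved so that its anchor lands on the matching anchor of the base glyph. The Point type, the mark_origin function, and all coordinates are hypothetical values chosen for illustration, not data from any real font.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def mark_origin(base_origin, base_anchor, mark_anchor):
    """Place the mark so its anchor coincides with the base glyph's anchor.

    base_anchor and mark_anchor are given relative to each glyph's own origin.
    """
    return Point(
        base_origin.x + base_anchor.x - mark_anchor.x,
        base_origin.y + base_anchor.y - mark_anchor.y,
    )


# Hypothetical font-unit values for a below-base mark on a consonant.
base_origin = Point(1000, 0)           # where the consonant is drawn
below_anchor_on_base = Point(310, -20)  # "bottom" anchor on the consonant
anchor_on_mark = Point(-250, -20)       # matching anchor on the mark

print(mark_origin(base_origin, below_anchor_on_base, anchor_on_mark))
```

The font only stores the two anchors; the layout engine does the subtraction at render time, which is why one mark glyph can fit many different base glyphs.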

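There are OpenType specifications for supporting scripts that require reordering and complex layout; this behavior should be automatic and implemented by the OpenType layout engine. For Gurmukhi, that means the engine itself moves a pre-base vowel sign such as ਿ in front of its consonant before the font's positioning is applied, which is why the glyph needs no anchor.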
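As a rough illustration of the kind of reordering a layout engine performs for a vowel sign like ਿ, here is a deliberately simplified Python sketch. It works on code points rather than glyph IDs, the function names are invented for this example, and it ignores conjuncts and most of what a real shaping engine handles, so treat it as a toy model rather than how any particular engine is implemented.

```python
PRE_BASE_VOWEL_SIGNS = {"\u0A3F"}  # GURMUKHI VOWEL SIGN I
NUKTA = "\u0A3C"                   # GURMUKHI SIGN NUKTA


def is_consonant(ch):
    """Rough consonant test by code point range (an assumption of this sketch)."""
    return "\u0A15" <= ch <= "\u0A39" or "\u0A59" <= ch <= "\u0A5E"


def visual_order(text):
    """Return the characters in display order.

    A pre-base vowel sign is moved in front of the consonant it follows in
    storage order; everything else keeps its place.
    """
    out = []
    for ch in text:
        if ch in PRE_BASE_VOWEL_SIGNS:
            i = len(out)
            # Step back over any nukta attached to the consonant.
            while i > 0 and out[i - 1] == NUKTA:
                i -= 1
            if i > 0 and is_consonant(out[i - 1]):
                out.insert(i - 1, ch)
                continue
        out.append(ch)
    return out


stored = "\u0A28\u0A3E\u0A2E\u0A3F"  # NA, VOWEL SIGN AA, MA, VOWEL SIGN I
print([f"U+{ord(ch):04X}" for ch in visual_order(stored)])
```

The stored text is never changed; only the sequence handed to the renderer is reordered, which matches the backspace behavior described in the question.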

An example of how the OpenType layout engine can combine complex Gurmukhi strings:



In reality, though, these aren't implemented universally. Adobe products (all except Photoshop, I believe), for example, have a lot of trouble typesetting right-to-left and complex scripts unless you use the specific "Middle Eastern" versions of the software.

References and reading:


Microsoft: Developing OpenType Fonts for Gurmukhi Script
Microsoft OpenType Specification: GPOS - The Glyph Positioning Table
Microsoft OpenType Specification: Overview
Glyphs App - Mark to Base Positioning




Precomposed characters and ligatures

In reality, the more complex structures required by some scripts are difficult or impossible to achieve with OpenType positioning and layout features, so those character combinations are often provided as precomposed characters or ligatures.

For example, Unicode has a large number of Arabic ligatures, some for whole words and phrases, such as Allah (U+FDF2):



And bismillah ar-rahman ar-raheem (U+FDFD):



As far as I can tell, the Gurmukhi Unicode block doesn't contain any ligatures, and its only precomposed characters are a handful of nukta letters such as ਸ਼ (U+0A36). It is perfectly possible, though, that a font will contain ligatures for certain combinations (I know a lot of Arabic and Bengali fonts do; those are the only scripts other than Latin that I have any experience with).
