Wednesday 31 October 2018

macos - Weird symbols on Mac


Ever since I got my Mac, I keep seeing this weird symbol. Until today, it had only appeared in place of bullet points in OpenOffice.org. The first picture shows this in a .doc file created on a Windows system.


I thought nothing of it - just an annoyance. It appears no matter what the font. Real bullets appear if I delete the text and insert a bulleted list using the toolbar.


Then, today, I noticed it in iTunes - which seemed strange.


Image 3 is a zoom of the character. It says on it: Private Use E000 F8FF.


What is it (Unicode-related?), and how do I get the bullets working properly?


[Images: openoffice; iTunes; third image (alt text only)]


Edit: The plot thickens... If I boot in Safe Mode, the symbols look like little clapperboards, like the ones used at the start of filming a scene in a film...



Answer



(Err, yes, this answer is way too long. And there's no happy ending for the bullets! Highlights in bold...)


The images from the question are defined in Apple's Last Resort font, which holds 236 different funny symbols. I guess the name says it all, but according to Wikipedia:



LastResort is a Mac OS font that is invisible to the end user, but is used by the system to display glyphs that are not available in any other font. The symbols provided by LastResort place glyphs into categories based on their location in the Unicode system and provide a hint to the user about which font or script is required to view the unavailable characters.



And Apple explains:



Exemplar glyphs were chosen in a number of ways. Almost all of the Brahmic scripts show the initial consonant ka. Latin uses the letter A because it's the first letter, and because in each Latin block there is a letter A so they can be easily differentiated. Greek and Cyrillic use their last letters, omega and ya, because they are so distinctive. Most other alphabets and syllabaries use their initial letter where distinctive.



(I like the Unicode BMP Fallback font, as used in Firefox, much better, as it shows the exact four-digit hexadecimal Unicode code point.)


So, your Mac does not know what to display, and uses the Last Resort font to provide some information.


My first guess was: Microsoft Office uses some proprietary symbol font, in which character codes from the Private Use Area (PUA) define what the bullets look like. (Or maybe the author has some odd font installed, from which some funny bullet has been used.) Your Mac neither knows that font, nor has any other font in which the same Unicode code point happens to be defined. And even if another font did define a character for that code, it would not help, as characters from the private use areas may by definition have a totally different meaning in different fonts. Installing Office on the Mac might also install the font, probably making the bullets show fine in OpenOffice.org as well. (In fact, installing a trial of Office might already install the missing proprietary font.)


Well, no.


While the above would be true for any sane usage of Unicode, some further investigation shows that in old applications Microsoft uses the range U+F020..U+F0FF to display symbols from a separate symbol font: when displaying characters from that range, the application automatically switches to that font for those characters. An organisation named SIL International figured out:



One of the mysteries of text formatted with symbol fonts (at least, in certain Microsoft applications) is that characters appear to be encoded in terms of 8-bit code points even if a document is otherwise encoded in Unicode. When U+F021 was inserted into WordPad from the clipboard, not only did WordPad (more precisely, the Rich Edit control) apply the Wingdings font, it seems that it also changed the code point to 0x21. When the character was reformatted to a non-symbol font, this became U+0021.



Or, as Microsoft explains it in the Microsoft Platform Software Development Kit, January 2000 Edition:



Richedit 4.1 maps the range of characters between U+F020 and U+F0FF in the PUA to symbol fonts. Therefore, when you map any character in this range, Richedit 4.1 shows the symbol character instead of the end-user-defined character (EUDC).
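
To make that mapping concrete, here is a minimal Python sketch of what the two quotes above seem to describe: a code point in U+F020..U+F0FF is reduced to an 8-bit symbol-font code by dropping the 0xF000 offset. The function name `symbol_font_byte` is made up for illustration, and this only mirrors the behaviour as quoted; it is not Microsoft's actual implementation.

```python
from typing import Optional

# Range that, per the quotes above, Rich Edit maps to symbol fonts.
SYMBOL_PUA_START = 0xF020
SYMBOL_PUA_END = 0xF0FF


def symbol_font_byte(ch: str) -> Optional[int]:
    """Return the assumed 8-bit symbol-font code for a character in
    U+F020..U+F0FF, or None if the character is outside that range."""
    cp = ord(ch)
    if SYMBOL_PUA_START <= cp <= SYMBOL_PUA_END:
        # Dropping the 0xF000 offset leaves the 8-bit code (0x20..0xFF).
        return cp - 0xF000
    return None


# U+F021 is the example from the SIL quote: it ends up as 0x21.
print(hex(symbol_font_byte("\uf021")))
# An ordinary letter is not in the mapped range.
print(symbol_font_byte("A"))
```

Which glyph that 8-bit code then shows depends entirely on the symbol font applied (Wingdings in the SIL example), which is exactly why the character is meaningless on a system that does not do the same font switch.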



I think Microsoft has since added the symbols to many of its own versions of fonts, so that more recent Microsoft software can display those characters without switching fonts. For example, U+F020 shows a bullet in most fonts, but not in Arial on your Mac. Installing a Microsoft version of Arial might help. But that might well get you into other problems when you're using Arabic...


I doubt any non-Microsoft software will handle the above exceptions.


Still, in general:


One way to determine the name of the font: when printing from Word to PDF, one can choose to embed the font (or: the subset of characters that are used) in the PDF, to ensure it prints fine on systems that do not have that font installed. So, peeking into the PDF's properties might then reveal the name of that font. And maybe simply selecting the character in OpenOffice.org will already reveal its name in the font list. (However, given the automatic displaying of symbols as described above, both methods will probably not work for bullets at all.)


One way to determine the exact character code: copy it (for Pub Quiz, copy it from the auto-suggest in the iTunes search) and paste it into some Unicode code converter. This reveals that the Pub Quiz character is U+E047, which could be some odd double-quote. But well, as this is from a private use area, and we don't know which font the developers had in mind when they typed that name in the iPhone app-store, I guess only the developers can tell us what they hoped it would look like...
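
If you'd rather not rely on an online converter, a few lines of Python do the same job. This is just a sketch using the standard `unicodedata` module; the helper name `describe` is my own. It prints the code point and flags private-use characters (Unicode general category `Co`), which have no standard name.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point of a single character, plus either its
    Unicode name or a note that it is a private-use character."""
    cp = ord(ch)
    if unicodedata.category(ch) == "Co":
        # "Co" is the general category for private-use code points.
        note = "private use"
    else:
        note = unicodedata.name(ch, "unnamed")
    return f"U+{cp:04X} ({note})"


# The Pub Quiz character from iTunes, and a real bullet for comparison.
print(describe("\ue047"))  # U+E047 (private use)
print(describe("\u2022"))  # U+2022 (BULLET)
```

In practice you would paste the copied character into the string literal instead of writing the `\u` escape.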


(Fileformat.info has a neat utility to show the character using all fonts on your computer.)

