25
Aug 13
00:11
Mac Dictionary Services API Tease
For a while I just assumed there was no easy way of using the content that the Mac Dictionary app uses. Then I found out about the Mac Dictionary Services API, which looks promising at a glance.
I created the basic lookup pretty quickly while on the BART commute to work and back. But it became apparent after some tinkering that the existing public API (which has only two functions for word lookup) is very incomplete. You can look up a string and get a definition back from the same dictionaries that Dictionary.app uses. However, you can’t specify which dictionary, or which entry within a dictionary (e.g. for a word that has multiple definitions). This means you will only get the first entry of the first dictionary that gets hit. So I decided to spend a good part of today trying to see what could be done about this.
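For reference, the public lookup boils down to something like the following. This is only a rough sketch in Python with ctypes rather than the code I actually wrote, and it assumes macOS and the framework paths shown; `lookup_definition` is a name made up for this example. It uses only the two documented functions, DCSGetTermRangeInString and DCSCopyTextDefinition, and passes NULL for the dictionary argument, since the public API gives you no way to obtain a specific dictionary reference.

```python
import ctypes
import sys

K_CF_STRING_ENCODING_UTF8 = 0x08000100  # kCFStringEncodingUTF8
K_CF_NOT_FOUND = -1                     # kCFNotFound

# Framework paths are assumptions about where the system keeps these libraries.
CF_PATH = "/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation"
DCS_PATH = ("/System/Library/Frameworks/CoreServices.framework/Versions/A/"
            "Frameworks/DictionaryServices.framework/DictionaryServices")


class CFRange(ctypes.Structure):
    # CFIndex is a C long on 64-bit macOS.
    _fields_ = [("location", ctypes.c_long), ("length", ctypes.c_long)]


def lookup_definition(text):
    """Return the first definition found for `text`, or None if there is none."""
    if sys.platform != "darwin":
        raise RuntimeError("Dictionary Services is only available on macOS")

    cf = ctypes.CDLL(CF_PATH)
    dcs = ctypes.CDLL(DCS_PATH)
    vp, idx, enc = ctypes.c_void_p, ctypes.c_long, ctypes.c_uint32

    # Declare signatures so pointers and structs are passed correctly.
    cf.CFStringCreateWithCString.restype = vp
    cf.CFStringCreateWithCString.argtypes = [vp, ctypes.c_char_p, enc]
    cf.CFStringGetLength.restype = idx
    cf.CFStringGetLength.argtypes = [vp]
    cf.CFStringGetMaximumSizeForEncoding.restype = idx
    cf.CFStringGetMaximumSizeForEncoding.argtypes = [idx, enc]
    cf.CFStringGetCString.restype = ctypes.c_ubyte
    cf.CFStringGetCString.argtypes = [vp, ctypes.c_char_p, idx, enc]
    cf.CFRelease.restype = None
    cf.CFRelease.argtypes = [vp]
    dcs.DCSGetTermRangeInString.restype = CFRange
    dcs.DCSGetTermRangeInString.argtypes = [vp, vp, idx]
    dcs.DCSCopyTextDefinition.restype = vp
    dcs.DCSCopyTextDefinition.argtypes = [vp, vp, CFRange]

    cf_text = cf.CFStringCreateWithCString(
        None, text.encode("utf-8"), K_CF_STRING_ENCODING_UTF8)
    if not cf_text:
        return None
    try:
        # Find the term at offset 0, then ask for its definition.
        term = dcs.DCSGetTermRangeInString(None, cf_text, 0)
        if term.location == K_CF_NOT_FOUND:
            return None
        cf_def = dcs.DCSCopyTextDefinition(None, cf_text, term)
        if not cf_def:
            return None
        try:
            size = cf.CFStringGetMaximumSizeForEncoding(
                cf.CFStringGetLength(cf_def), K_CF_STRING_ENCODING_UTF8) + 1
            buf = ctypes.create_string_buffer(size)
            if not cf.CFStringGetCString(
                    cf_def, buf, size, K_CF_STRING_ENCODING_UTF8):
                return None
            return buf.value.decode("utf-8")
        finally:
            cf.CFRelease(cf_def)
    finally:
        cf.CFRelease(cf_text)
```

Calling `lookup_definition("apple")` on a Mac should hand back one block of definition text, with no indication of which dictionary or entry it came from, which is exactly the limitation.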
I came across a few interesting posts that showed off some of the private API. Most of these were for simple CLI programs, or for building their own dictionary, and they used private calls such as DCSRecordCopyData() and DCSCopyAvailableDictionaries(). DCSCopyAvailableDictionaries allowed me to access specific dictionaries, and used in conjunction with DCSGetTermRangeInString and DCSCopyRecordsForSearchString, I was able to generate a reasonable list of candidate DCSRecordRef entries from a single input word. The only thing missing was the definition for each DCSRecordRef.
I wanted to make this dictionary for my Japanese studies. In Japanese, there are tons of homonyms and homophones, depending on whether you write the word with kanji or not. I didn’t see a function in my googling that would show the word listing, but doing an NSLog(@"%@", record) on an example search for ‘いる’ showed some info about the DCSRecord structure:
lldb output:
{key = いる, headword = いる【射る】, bodyID = 111081}
{key = いる, headword = いる【要る】, bodyID = 104584}
{key = いる, headword = いる【居る】, bodyID = 163639}
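If you want to poke at these fields outside the debugger, the printed descriptions are easy enough to pull apart. Here is a small sketch, assuming the description always has the `{key = ..., headword = ..., bodyID = ...}` shape seen above; that shape is just what NSLog happened to print, not a documented format, and `parse_records` is a made-up helper name.

```python
import re

# Matches one "{key = ..., headword = ..., bodyID = ...}" description.
# Assumes that neither key nor headword contains a comma or a brace.
RECORD_RE = re.compile(
    r"\{key = (?P<key>[^,}]+), "
    r"headword = (?P<headword>[^,}]+), "
    r"bodyID = (?P<body_id>\d+)\}"
)


def parse_records(dump):
    """Turn the printed record descriptions into a list of dicts."""
    return [
        {
            "key": m.group("key"),
            "headword": m.group("headword"),
            "bodyID": int(m.group("body_id")),
        }
        for m in RECORD_RE.finditer(dump)
    ]


dump = (
    "{key = いる, headword = いる【射る】, bodyID = 111081} "
    "{key = いる, headword = いる【要る】, bodyID = 104584} "
    "{key = いる, headword = いる【居る】, bodyID = 163639}"
)

for record in parse_records(dump):
    print(record["bodyID"], record["headword"])
```

All three records share the same key, so only the headword and bodyID tell them apart.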
It looked like the ‘bodyID’ or ‘headword’ field of DCSRecordRef had the most specific information about the result. So I went into the framework and searched for symbols of functions that might do the trick:
cd /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/DictionaryServices.framework
nm -gU DictionaryServices|grep DCS
0000000000007b06 T _DCSActivateDictionaryPanel
000000000000914d T _DCSCopyActiveDictionaries
000000000000916f T _DCSCopyAvailableDictionaries
[...]
0000000000007e07 T _DCSRecordCopyData
0000000000007e1e T _DCSRecordCopyDataURL
0000000000007e63 T _DCSRecordGetAnchor
00000000000076ef T _DCSRecordGetAssociatedObj
0000000000007e4c T _DCSRecordGetDictionary
0000000000007ebd T _DCSRecordGetHeadword
0000000000007e91 T _DCSRecordGetRawHeadword
0000000000007f04 T _DCSRecordGetString
0000000000007e35 T _DCSRecordGetSubDictionary
0000000000007e7a T _DCSRecordGetTitle
0000000000009181 T _DCSRecordGetTypeID
00000000000076d9 T _DCSRecordSetAssociatedObj
0000000000007ea8 T _DCSRecordSetHeadword
000000000000878a T _DCSSearchSessionCreate
[...]
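With that many symbols it helps to filter the listing down to the family you care about. A minimal sketch, assuming the three-column `address type symbol` layout that nm printed above; `exported_functions` and its `prefix` parameter are names invented for this example.

```python
def exported_functions(nm_output, prefix="DCSRecord"):
    """Return names of exported code symbols that start with `prefix`."""
    names = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Expect "<address> <type> <symbol>"; type "T" is a symbol in the
        # text (code) section. Anything else, like "[...]", is skipped.
        if len(parts) != 3 or parts[1] != "T":
            continue
        symbol = parts[2]
        # C symbols carry one leading underscore in the symbol table.
        name = symbol[1:] if symbol.startswith("_") else symbol
        if name.startswith(prefix):
            names.append(name)
    return names


sample = """\
0000000000007b06 T _DCSActivateDictionaryPanel
[...]
0000000000007e07 T _DCSRecordCopyData
0000000000007ebd T _DCSRecordGetHeadword
0000000000007e7a T _DCSRecordGetTitle
000000000000878a T _DCSSearchSessionCreate
"""

for name in exported_functions(sample):
    print(name)
```

This only gives you names, of course. The argument and return types still have to be worked out some other way.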
There were about 100 or so symbols that Apple didn’t feel like sharing via docs. I tried a few combinations, and it looked like DCSRecordGetTitle or DCSRecordGetRawHeadword produced the best strings for use with DCSCopyTextDefinition. This solves the homonym problem for the most part. However, it did not work at all with heteronyms (words that are spelled the same, but pronounced differently), since the headword/display title would be the same. For example, for the input 棺 I need the definitions for both 棺 read as ‘kan’ and 棺 read as ‘hitsugi’, but this method would give me two definitions for ‘kan’ instead. Eventually I gave up. I hope someone else figures this out. I put the intermediate result up on GitHub. Let me know if you make any progress.
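The failure mode is easy to see if you treat the title string as the lookup key. The sketch below is not the real API: each record is faked as a (title, bodyID) pair, the いる values are copied from the log output earlier on the assumption that the title looks like the logged headword, and the 棺 titles and bodyIDs 1 and 2 are hypothetical placeholders. `ambiguous_titles` is a made-up helper.

```python
from collections import defaultdict


def ambiguous_titles(records):
    """Map each title shared by more than one record to its bodyIDs."""
    by_title = defaultdict(list)
    for title, body_id in records:
        by_title[title].append(body_id)
    return {title: ids for title, ids in by_title.items() if len(ids) > 1}


records = [
    # Homonyms: the titles differ, so a title-based lookup can separate them.
    ("いる【射る】", 111081),
    ("いる【要る】", 104584),
    ("いる【居る】", 163639),
    # Heteronyms: same title for both readings (hypothetical placeholder data).
    ("棺", 1),
    ("棺", 2),
]

print(ambiguous_titles(records))
```

Any title that shows up here cannot be told apart by a string-based lookup, so both records end up with whichever definition the string matches first. Telling them apart would need something keyed on the record itself, such as the bodyID.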
On iOS it goes without saying that you should probably avoid using private APIs, since the main means of distribution is through Apple. On the Mac, distributing an app yourself is still viable, and thus you can use private APIs to your heart’s content. However, because the undocumented but exported symbols do not come with function signatures, it’s pretty much just a tease unless you want to spend a lot of time figuring out what each function takes. After googling symbol names, it seems Dictionary Services is relatively unexplored. But I’m mostly interested in this for making my hobby dictionary that I can nerd out on and add lots of obscure features to while using the Apple-provided dictionaries.
Shayan
June 6, 2018
8:45 pm
Hey,
Any chance you had any progress on this?
mike
June 9, 2018
7:40 am
No, haven’t touched it in a while. Curious though if any new APIs were exposed or created. Let me know your project and if you do find something useful.