reader: implement language-support plugin system

This creates a new plugin system which hooks into a handful of reader
operations in order to allow plugins to add language-specific support
where the default reader falls short. The two hooks added are:

 * During hold-without-pan taps, language plugins can modify the
   selection in order to better match what users expect KOReader to
   highlight when selecting a single word.

   Most words in CJK languages are longer than one character, but
   KOReader treats each CJK character as a separate word by default,
   so adding this hook means that readers no longer need to manually
   select the whole word every time they need to look something up.

 * During dictionary lookup, language plugins can propose alternative
   candidate words to look up if the selected word could not be found in
   the dictionary.

   This is pretty necessary for Japanese and Korean, both of which are
   highly agglutinative languages. StarDict's fuzzy search is simply
   not usable for them: the inflection of a word is often so much
   longer than its dictionary form that sdcv chops off the actual word
   and searches for the inflection instead (which yields useless
   results).
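
The dictionary-lookup hook described above can be sketched in isolation as
follows. This is a minimal illustration, not the code from this commit:
`toy_candidates` and `build_lookup_words` are hypothetical names, the
suffix-stripping rule is a made-up stand-in for a real language plugin, and
the de-duplication of candidates is an assumption added for the example. The
only point it shows is the ordering: the selected text is tried first, and
plugin-proposed dictionary forms follow.

```lua
-- Hypothetical sketch: none of these names are KOReader APIs.

-- Toy stand-in for a language plugin: guesses dictionary forms by
-- stripping a few made-up suffixes from the selected word.
local function toy_candidates(word)
    local suffixes = { "ing", "ed", "s" }
    local out = {}
    for _, suffix in ipairs(suffixes) do
        if #word > #suffix and word:sub(-#suffix) == suffix then
            table.insert(out, word:sub(1, #word - #suffix))
        end
    end
    return out
end

-- Build the list of words to look up: the selected text comes first,
-- then each generator's candidates, skipping duplicates and keeping order.
local function build_lookup_words(word, generators)
    local words, seen = { word }, { [word] = true }
    for _, generate in ipairs(generators) do
        for _, candidate in ipairs(generate(word) or {}) do
            if not seen[candidate] then
                seen[candidate] = true
                table.insert(words, candidate)
            end
        end
    end
    return words
end

local words = build_lookup_words("walked", { toy_candidates })
print(table.concat(words, ", ")) -- prints: walked, walk
```

Keeping the selected text at the front means an exact dictionary entry for
whatever the user highlighted still takes priority over any guessed form.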

This system is of particular interest for readers of CJK languages
(without it, looking up words using KOReader was fairly painful), but
it is designed to be minimal and language-agnostic enough that other
languages could make use of it by creating their own plugins if the
default "whole word" highlight and fuzzy-search system don't match
their needs.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Aleksa Sarai
2021-10-23 21:13:09 +11:00
committed by Frans de Jonge
parent da70fe9de1
commit 7c5243667b
7 changed files with 363 additions and 7 deletions

@@ -99,7 +99,9 @@ function ReaderDictionary:init()
     self.dicts_order = G_reader_settings:readSetting("dicts_order", {})
     self.dicts_disabled = G_reader_settings:readSetting("dicts_disabled", {})
-    self.ui.menu:registerToMainMenu(self)
+    if self.ui then
+        self.ui.menu:registerToMainMenu(self)
+    end
     self.data_dir = STARDICT_DATA_DIR or
         os.getenv("STARDICT_DATA_DIR") or
         DataStorage:getDataDir() .. "/data/dict"
@@ -742,10 +744,7 @@ function ReaderDictionary:rawSdcv(words, dict_names, fuzzy_search, lookup_progre
             table.insert(args, opt)
         end
     end
-    table.insert(args, "--") -- prevent word starting with a "-" to be interpreted as a sdcv option
-    -- XXX: This requires <https://github.com/Dushistov/sdcv/pull/77> in
-    --      order to function properly (otherwise the first failure will
-    --      cause sdcv to exit).
+    table.insert(args, "--") -- prevent words starting with a "-" to be interpreted as a sdcv option
     util.arrayAppend(args, words)
     local cmd = util.shell_escape(args)
@@ -801,6 +800,17 @@ end
 function ReaderDictionary:startSdcv(word, dict_names, fuzzy_search)
     local words = {word}
+    if self.ui.languagesupport:hasActiveLanguagePlugins() then
+        -- Get any other candidates from any language-specific plugins we have.
+        -- We prefer the originally selected word first (in case there is a
+        -- dictionary entry for whatever text the user selected).
+        local candidates = self.ui.languagesupport:extraDictionaryFormCandidates(word)
+        if candidates then
+            util.arrayAppend(words, candidates)
+        end
+    end
     lookup_cancelled, results = self:rawSdcv(words, dict_names, fuzzy_search, self.lookup_progress_msg or false)
     if results == nil then -- no dictionaries found
         return {
@@ -815,7 +825,7 @@ function ReaderDictionary:startSdcv(word, dict_names, fuzzy_search)
     local seen_results = {}
     -- Flatten the array, removing any duplicates we may have gotten (sdcv
     -- may do multiple queries, in fixed mode then in fuzzy mode, and the
-    -- LanguageSupport plugin may have returned multiple equivalent
+    -- language-specific plugin may have also returned multiple equivalent
     -- results).
     local h
     for _, term_results in ipairs(results) do