reader: implement language-support plugin system

This creates a new plugin system which hooks into a handful of reader
operations in order to allow plugins to add language-specific support
where the default reader falls short. The two hooks added are:

 * During hold-without-pan taps, language plugins can modify the
   selection in order to better match what users expect KOReader to
   highlight when selecting a single word.

   The vast majority of CJK words are more than one character long,
   but KOReader treats each CJK character as a single word by default,
   so adding this hook means that readers no longer need to manually
   select the whole word every time they need to look something up.

 * During dictionary lookup, language plugins can propose alternative
   candidate words to look up if the selected word could not be found in
   the dictionary.

   This is pretty necessary for Japanese and Korean, both of which are
   highly agglutinative languages. StarDict's fuzzy search is simply
   not usable for them, because the inflection of a word is often so
   much longer than the dictionary form that sdcv chops off the actual
   word and searches for the inflection instead (which yields useless
   results).
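
As a rough illustration of the second hook, the fallback could look
something like the Lua sketch below. The toy dictionary, the suffix
rules, and the names `propose_candidates` and `lookup` are all
hypothetical and not taken from this commit; real deinflection is far
more involved than stripping two endings.

```lua
-- Hypothetical sketch: none of these names come from the commit itself.

-- Toy dictionary that only contains dictionary forms.
local dictionary = {
    ["食べる"] = "to eat",
}

-- Stand-in for a language plugin's candidate hook: strip a few known
-- inflectional endings and substitute a dictionary-form ending.
-- Lua strings are byte strings, so the suffix test compares UTF-8 bytes.
local function propose_candidates(word)
    local rules = {
        { suffix = "ました", replacement = "る" },
        { suffix = "ない", replacement = "る" },
    }
    local candidates = {}
    for _, rule in ipairs(rules) do
        local n = #rule.suffix
        if #word > n and word:sub(-n) == rule.suffix then
            table.insert(candidates, word:sub(1, #word - n) .. rule.replacement)
        end
    end
    return candidates
end

-- Look up the selected word as-is first; only consult the candidates
-- when the exact form is missing from the dictionary.
local function lookup(word)
    local definition = dictionary[word]
    if definition then
        return definition, word
    end
    for _, candidate in ipairs(propose_candidates(word)) do
        definition = dictionary[candidate]
        if definition then
            return definition, candidate
        end
    end
    return nil
end
```

With this sketch, `lookup("食べました")` falls back to the candidate
`食べる` instead of handing the inflected form to a fuzzy search.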

This system is of particular interest to readers of CJK languages
(without it, looking up words using KOReader was fairly painful), but
it is designed to be minimal and language-agnostic enough that other
languages could make use of it by creating their own plugins if the
default "whole word" highlight and fuzzy-search system don't match
their needs.
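
To give an idea of what such a plugin might involve, here is a minimal,
self-contained Lua sketch of the word-selection hook. Only the method
names `hasActiveLanguagePlugins` and `improveWordSelection` appear in
this commit's diff; the stand-in `LanguageSupport` table, the
`registerPlugin` method, the `onWordSelection` callback, the `following`
field, and the toy word list are assumptions made for illustration.

```lua
-- Stand-in for the reader's language-support module (hypothetical body).
local LanguageSupport = { plugins = {} }

function LanguageSupport:registerPlugin(plugin) -- hypothetical name
    table.insert(self.plugins, plugin)
end

function LanguageSupport:hasActiveLanguagePlugins()
    return #self.plugins > 0
end

-- Returns a replacement selection, or nil to keep the engine's default.
function LanguageSupport:improveWordSelection(selection)
    for _, plugin in ipairs(self.plugins) do
        local improved = plugin:onWordSelection(selection) -- hypothetical callback
        if improved then
            return improved
        end
    end
    return nil
end

-- Toy plugin: widens a one-character selection to the longest known
-- word that starts at the selection.
local ToyPlugin = { words = { ["図書"] = true, ["図書館"] = true } }

function ToyPlugin:onWordSelection(selection)
    local best
    -- Try every byte prefix of the following text. Prefixes that cut a
    -- UTF-8 character in half never match a key, so they are harmless.
    for i = 1, #selection.following do
        local candidate = selection.text .. selection.following:sub(1, i)
        if self.words[candidate] then
            best = candidate -- later matches are longer, so the longest wins
        end
    end
    if best then
        return { text = best }
    end
    return nil
end

-- Caller side, mirroring the pattern used in the diff.
LanguageSupport:registerPlugin(ToyPlugin)
local selected_text = { text = "図", following = "書館に行く" }
if LanguageSupport:hasActiveLanguagePlugins() then
    local new_selected_text = LanguageSupport:improveWordSelection(selected_text)
    if new_selected_text then
        selected_text = new_selected_text
    end
end
print(selected_text.text)
```

Returning nil when a plugin has nothing better to offer lets the caller
keep the document engine's selection unchanged.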

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Aleksa Sarai
2021-10-23 21:13:09 +11:00
committed by Frans de Jonge
parent da70fe9de1
commit 7c5243667b
7 changed files with 363 additions and 7 deletions

@@ -847,7 +847,8 @@ function ReaderHighlight:onHold(arg, ges)
if ok and word then
logger.dbg("selected word:", word)
-- Convert "word selection" table to "text selection" table because we
- -- use text selections throughout readerhighlight.
+ -- use text selections throughout readerhighlight in order to allow the
+ -- highlight to be corrected by language-specific plugins more easily.
self.is_word_selection = true
self.selected_text = {
text = word.word or "",
@@ -862,6 +863,18 @@ function ReaderHighlight:onHold(arg, ges)
logger.dbg("link:", link)
self.selected_link = link
end
+ if self.ui.languagesupport:hasActiveLanguagePlugins() then
+ -- If this is a language where pan-less word selection needs some
+ -- extra work above and beyond what the document engine gives us
+ -- from getWordFromPosition, call the relevant language-specific
+ -- plugin.
+ local new_selected_text = self.ui.languagesupport:improveWordSelection(self.selected_text)
+ if new_selected_text then
+ self.selected_text = new_selected_text
+ end
+ end
if self.ui.document.info.has_pages then
self.view.highlight.temp[self.hold_pos.page] = self.selected_text.sboxes
-- Unfortunately, getWordFromPosition() may not return good coordinates,
@@ -1325,6 +1338,13 @@ function ReaderHighlight:highlightFromHoldPos()
if self.hold_pos then
if not self.selected_text then
self.selected_text = self.ui.document:getTextFromPositions(self.hold_pos, self.hold_pos)
+ if self.ui.languagesupport:hasActiveLanguagePlugins() then
+ -- Match language-specific expansion you'd get from self:onHold().
+ local new_selected_text = self.ui.languagesupport:improveWordSelection(self.selected_text)
+ if new_selected_text then
+ self.selected_text = new_selected_text
+ end
+ end
logger.dbg("selected text:", self.selected_text)
end
end