
Legado Book Source Adaptation — Gap Analysis & Improvement Plan

This document tracks the gaps between Kototoro's Legado book source adapter and the native legado-with-MD3 implementation.

Last updated: 2026-05-26. ✅ = completed, 🟡 = in progress, ⬜ = pending.


Phase 1: ContentRule Field Completions ✅ Done

Impact: Missing fields cause incomplete content parsing for sources that depend on them.

| Task | File | Status |
| --- | --- | --- |
| Add subContent / imageDecode / callBackJs to ContentRule | core/model/jsonsource/LegadoBookSource.kt | ✅ |
| Implement subContent parsing | core/parser/legado/book/BookContent.kt | ✅ |
| Inject imageDecode script as X-Legado-ImageDecode page header | core/parser/legado/book/BookContent.kt | ✅ |
| Execute callBackJs at end of content parsing | core/parser/legado/book/BookContent.kt | ✅ |

Phase 2: JS Bridge Expansion ✅ Done

| Task | File | Status |
| --- | --- | --- |
| t2s / s2t Chinese conversion | core/util/ChineseConverter.kt (new) + core/javascript/LegadoJavaAPI.kt | ✅ |
| connect(url, headers?, callTimeout?) → StrResponse | core/javascript/LegadoJavaAPI.kt | ✅ |
| webView(html, url, js, cacheFirst) series | core/javascript/LegadoJavaAPI.kt | ✅ |
| webViewGetSource(html, url, js, sourceRegex, cacheFirst, delayTime) | core/javascript/LegadoJavaAPI.kt | ✅ |
| webViewGetOverrideUrl(html, url, js, overrideUrlRegex, cacheFirst, delayTime) | core/javascript/LegadoJavaAPI.kt | ✅ |
| ajaxAll(urlList, skipRateLimit?) → Array<StrResponse> | core/javascript/LegadoJavaAPI.kt | ✅ |
| hexEncode(str), complement to existing hexDecodeToString | core/javascript/LegadoJavaAPI.kt | ✅ |
| encodingDetect(bytes) / encodingDetect(str) | core/javascript/LegadoJavaAPI.kt | ✅ |

Phase 3: TOC Metadata Completion ✅ Done

| Task | File | Status |
| --- | --- | --- |
| Parse isVolume → chapter volume field | core/parser/legado/book/BookChapterList.kt | ✅ |
| Parse isVip / isPay → branch tag ("vip", "pay") | core/parser/legado/book/BookChapterList.kt | ✅ |
| Parse updateTime → uploadDate (multi-format parseTimestamp()) | core/parser/legado/book/BookChapterList.kt | ✅ |
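
The multi-format timestamp parsing mentioned for updateTime could look roughly like the sketch below. The function name parseTimestampSketch, the list of accepted patterns, the UTC assumption, and the 0 fallback are all illustrative assumptions; the formats the real parseTimestamp() accepts are not listed in this document.

```kotlin
import java.time.LocalDate
import java.time.LocalDateTime
import java.time.ZoneOffset
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Hypothetical pattern lists, chosen only for illustration.
private val DATE_TIME_FORMATS = listOf(
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"),
    DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm"),
)
private val DATE_FORMATS = listOf(
    DateTimeFormatter.ofPattern("yyyy-MM-dd"),
    DateTimeFormatter.ofPattern("yyyy/MM/dd"),
)

/** Returns epoch milliseconds (UTC assumed), or 0 when no format matches. */
fun parseTimestampSketch(raw: String?): Long {
    val text = raw?.trim().orEmpty()
    if (text.isEmpty()) return 0L
    // Pure digits: up to 10 digits are read as seconds, longer values as milliseconds.
    text.toLongOrNull()?.let { n ->
        return if (text.length <= 10) n * 1000 else n
    }
    for (format in DATE_TIME_FORMATS) {
        try {
            return LocalDateTime.parse(text, format).toInstant(ZoneOffset.UTC).toEpochMilli()
        } catch (e: DateTimeParseException) {
            // Try the next pattern.
        }
    }
    for (format in DATE_FORMATS) {
        try {
            return LocalDate.parse(text, format).atStartOfDay().toInstant(ZoneOffset.UTC).toEpochMilli()
        } catch (e: DateTimeParseException) {
            // Try the next pattern.
        }
    }
    return 0L
}
```

Trying the numeric form first keeps the common case cheap, and returning 0 instead of throwing means a malformed updateTime cannot abort TOC parsing.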

Phase 4: Source Login System ✅ Done

| Task | File | Status |
| --- | --- | --- |
| AES crypto utility (Legado-compatible format) | core/parser/legado/auth/LegadoCrypto.kt (new) | ✅ |
| Sandbox loads persisted sourceVariable_* / userInfo_* on init | core/parser/legado/sandbox/LegadoSandbox.kt | ✅ |
| Implement ContentParserAuthProvider on LegadoRepository | core/parser/legado/LegadoRepository.kt | ✅ |
| Factory passes legado_source_store SharedPreferences | core/parser/JsonContentRepositoryProvider.kt | ✅ |
| "用浏览器登录" ("Log in with browser") button when loginUrl is present | settings/sources/SourceComposeSettingsFragment.kt | ✅ |
| SourceAuthActivity fallback to ContentRepository.Factory for non-Parser sources | settings/sources/auth/SourceAuthActivity.kt | ✅ |
| Existing loginUi dynamic form + loginCheckJs button (pre-existing) | settings/sources/SourceComposeSettingsFragment.kt | ✅ (pre-existing) |

Phase 5: Network Layer Enhancements ✅ Done

| Task | File | Status |
| --- | --- | --- |
| bodyJs option support in parseUrlWithOptions() | core/parser/legado/AnalyzeUrl.kt | ✅ |
| encodedQuery option: auto URL-encode query parameters | core/parser/legado/AnalyzeUrl.kt | ✅ |
| Response charset auto-detection | core/parser/legado/LegadoRepository.kt | ✅ (pre-existing: EncodingDetect.getHtmlEncode()) |
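
As an illustration of what automatically URL-encoding query parameters can involve, here is a minimal sketch. The function name encodeQuerySketch and its exact semantics (encode every key and value, ignore fragments, no protection against double-encoding) are assumptions made for this example, not a description of the actual encodedQuery handling in AnalyzeUrl.kt.

```kotlin
import java.net.URLEncoder

/**
 * Encodes each key and value of the query string with the given charset.
 * Assumption: the input is not already percent-encoded, otherwise it would be encoded twice.
 */
fun encodeQuerySketch(url: String, charset: String = "UTF-8"): String {
    val queryStart = url.indexOf('?')
    // No query, or a trailing '?' with nothing after it: nothing to encode.
    if (queryStart < 0 || queryStart == url.length - 1) return url
    val base = url.substring(0, queryStart)
    val encoded = url.substring(queryStart + 1).split('&').joinToString("&") { pair ->
        val eq = pair.indexOf('=')
        if (eq < 0) {
            URLEncoder.encode(pair, charset)
        } else {
            URLEncoder.encode(pair.substring(0, eq), charset) +
                "=" + URLEncoder.encode(pair.substring(eq + 1), charset)
        }
    }
    return "$base?$encoded"
}
```

Note that URLEncoder uses form encoding, so a space becomes `+` rather than `%20`; a source that expects `%20` would need a different encoder.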

Phase 6: Global Replace Rule System ✅ Done

Impact: Content cleansing relied solely on per-source replaceRegex. Now all novel content from ALL source types passes through global replace rules.

Implementation (Simplified Architecture)

Instead of injecting at individual repositories, rules are applied at a single universal point:

| Task | File | Status |
| --- | --- | --- |
| Data model (ReplaceRule with Legado-compatible JSON serialization) | core/replace/ReplaceRule.kt (new) | ✅ |
| Persistence (ReplaceRuleRepository with import/export) | core/replace/ReplaceRuleRepository.kt (new) | ✅ |
| applyReplaceRules() at end of htmlToPlainText(), covering all HTML-based sources | reader/novel/NovelContentLoader.kt | ✅ |
| applyReplaceRules() on EPUB direct path | reader/novel/NovelContentLoader.kt | ✅ |
| Management UI (add/edit/delete/toggle/import/export) | settings/sources/replace/ReplaceRulesFragment.kt (new) | ✅ |

Coverage: Legado, Parser, Mihon, JS, LNReader, IReader, EPUB, local files — all novel content flows through htmlToPlainText() or the EPUB path.
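
To make the single-application-point idea concrete, the following sketch applies an ordered list of rules to already-extracted plain text. ReplaceRuleSketch and applyReplaceRulesSketch are hypothetical, trimmed-down stand-ins; the real ReplaceRule model has Legado-compatible fields that are not reproduced here.

```kotlin
// Hypothetical, minimal rule model used only for this example.
data class ReplaceRuleSketch(
    val pattern: String,
    val replacement: String,
    val isRegex: Boolean = true,
    val isEnabled: Boolean = true,
)

/** Applies enabled rules in order; a rule that fails is skipped instead of breaking the chapter. */
fun applyReplaceRulesSketch(text: String, rules: List<ReplaceRuleSketch>): String {
    var result = text
    for (rule in rules) {
        if (!rule.isEnabled || rule.pattern.isEmpty()) continue
        result = try {
            if (rule.isRegex) {
                result.replace(Regex(rule.pattern), rule.replacement)
            } else {
                result.replace(rule.pattern, rule.replacement)
            }
        } catch (e: Exception) {
            // Malformed pattern or bad group reference: keep the text unchanged.
            result
        }
    }
    return result
}
```

Because every source type ends in the same plain-text step, a function of this shape only needs to be called in the two places the table lists.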


Phase 7: Source Type Expansion ❌ Not Planned

7.1 Audio Source Support (bookSourceType=1)

Kototoro is positioned as a manga and novel reader, not an audio player. This will not be implemented.


Other Low-Priority Gaps ⬜

Network / HTTP

| Gap | Legado Native | Kototoro Status | Impact | Effort |
| --- | --- | --- | --- | --- |
| ajaxAll true concurrency | flow.mapAsync(N) with threadCount config | Sequential loop | Slow for sources issuing 10+ parallel requests | |
| java.ajax(url, {method, headers, body, charset}) Rhino overload | Full options map via NativeObject/Undefined | Partial (handles method but not full NativeObject) | Some scripts pass JS-native objects | |
| get(url, headers, timeout) / head() / post() (Jsoup-style) | Jsoup Connection.Response with header map | Not available | Rarely used by book source scripts | |
| ajaxTestAll(urlList, timeout) | Concurrent test calls with individual timeout | Not available | Source validation/debugging only | |
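
One way to close the ajaxAll concurrency gap is a bounded worker pool that preserves input order. The sketch below uses a plain java.util.concurrent thread pool rather than the coroutine-based flow.mapAsync(N) that Legado uses, purely to stay dependency-free; fetchAllSketch and its fetch parameter are hypothetical names.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

/**
 * Runs fetch for every URL with at most threadCount requests in flight.
 * Results are returned in the same order as urls.
 * If any fetch throws, get() rethrows it wrapped in an ExecutionException.
 */
fun <T> fetchAllSketch(urls: List<String>, threadCount: Int, fetch: (String) -> T): List<T> {
    if (urls.isEmpty()) return emptyList()
    val pool = Executors.newFixedThreadPool(threadCount.coerceIn(1, urls.size))
    try {
        // invokeAll blocks until all tasks finish and keeps the task order.
        val futures = pool.invokeAll(urls.map { url -> Callable { fetch(url) } })
        return futures.map { it.get() }
    } finally {
        pool.shutdown()
    }
}
```

A call such as `fetchAllSketch(urlList, 4) { url -> client.get(url) }` (with `client` standing in for whatever HTTP wrapper is available) would replace the sequential loop without changing the order of the returned responses.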

File I/O

| Gap | Legado Native | Kototoro Status | Impact | Effort |
| --- | --- | --- | --- | --- |
| getFile(path) / readTxtFile(path) / readFile(path) | Read from app cache directory | Not available | Some cache-first scripts use this | |
| deleteFile(path) | Delete from cache | Not available | Rare | |
| downloadFile(url) / cacheFile(url, saveTime) | Download + cache with TTL | Not available | Rare for book sources; more common in RSS/spider | |
| getTxtInFolder(path) | Read first .txt in directory | Not available | Rare | |

Archive Operations

| Gap | Legado Native | Kototoro Status | Impact | Effort |
| --- | --- | --- | --- | --- |
| unzipFile / un7zFile / unrarFile | Decompress archives | Not available | Rare; some sources use archived content | |
| getZipStringContent / getRarStringContent / get7zStringContent | Read text file from inside archive | Not available | Very rare | |
| getZipByteArrayContent etc. | Read binary from archive | Not available | Very rare | |

Crypto / Encoding

| Gap | Legado Native | Kototoro Status | Impact | Effort |
| --- | --- | --- | --- | --- |
| java.createSymmetricCrypto(algorithm, key, iv) | AES/DES/3DES via hutool-crypto | Not available | Some API sources use custom encryption | |
| hexDecodeToByteArray(hex) | Hex → bytes | Only hexDecodeToString (single-byte chars) | Edge case | |
| base64DecodeToByteArray(str) | Base64 → bytes | Only base64Decode (String output) | Edge case | |
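
The two byte-array decoders are small enough to sketch directly. The function names below carry a Sketch suffix to mark them as illustrative, and java.util.Base64 is an assumption made for portability; an Android build might prefer android.util.Base64.

```kotlin
import java.util.Base64

/** Decodes a hex string such as "e58e9f" into raw bytes, two hex digits per byte. */
fun hexDecodeToByteArraySketch(hex: String): ByteArray {
    require(hex.length % 2 == 0) { "hex string must have an even length" }
    return ByteArray(hex.length / 2) { i ->
        hex.substring(2 * i, 2 * i + 2).toInt(16).toByte()
    }
}

/** Decodes standard (non-URL-safe) Base64 into raw bytes without forcing a charset. */
fun base64DecodeToByteArraySketch(str: String): ByteArray =
    Base64.getDecoder().decode(str)
```

Returning bytes instead of a String matters for multi-byte text and binary payloads: the caller can decode with the right charset afterwards, for example `hexDecodeToByteArraySketch("e58e9f").toString(Charsets.UTF_8)`.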

Font / Typography

| Gap | Legado Native | Kototoro Status | Impact | Effort |
| --- | --- | --- | --- | --- |
| queryTTF(data) | Extract TTF font properties from bytes | Not available | Font-related source rules extremely rare | Minimal |
| replaceFont(data, ttfData) | Replace font in rendered chapter | Not available | Extremely rare | |

Utilities

| Gap | Legado Native | Kototoro Status | Impact | Effort |
| --- | --- | --- | --- | --- |
| sleep(ms) | Block JS execution for N ms | Not available | Rate-limiting scripts need this | Minimal |
| getVerificationCode(imageUrl) | OCR captcha image | Not available | Captcha-required sources are rare | |
| importScript(path) | Dynamic JS import from cache path | Not available | Very rare | |
| toNumChapter(s) | Normalize chapter number string | Not available | Edge case for TOC sorting | |
| openUrl(url) | Open URL in external intent | Not available | startBrowser already covers most cases | |

Reader Config Access

| Gap | Legado Native | Kototoro Status | Impact | Effort |
| --- | --- | --- | --- | --- |
| getReadBookConfig() / getReadBookConfigMap() | Expose reader theme/layout to JS | Not available | Some sources adapt content based on reader prefs | Minimal |
| getThemeConfig() / getThemeConfigMap() | Expose app theme to JS | Not available | Extremely rare | Minimal |

| Component | Path |
| --- | --- |
| Rule engine | core/parser/legado/AnalyzeRule.kt |
| URL builder | core/parser/legado/AnalyzeUrl.kt |
| JS sandbox | core/parser/legado/sandbox/LegadoSandbox.kt |
| JS API bridge | core/javascript/LegadoJavaAPI.kt |
| JS cookie API | core/javascript/LegadoCookieAPI.kt |
| Chinese converter | core/util/ChineseConverter.kt |
| HTTP client | core/network/jsonsource/LegadoHttpClient.kt |
| Source model | core/model/jsonsource/LegadoBookSource.kt |
| Repository | core/parser/legado/LegadoRepository.kt |
| Content parser | core/parser/legado/book/BookContent.kt |
| TOC parser | core/parser/legado/book/BookChapterList.kt |
| Info parser | core/parser/legado/book/BookInfo.kt |
| Replace rules model | core/replace/ReplaceRule.kt |
| Replace rules repo | core/replace/ReplaceRuleRepository.kt |
| Replace rules UI | settings/sources/replace/ReplaceRulesFragment.kt |
| Login UI | settings/sources/SourceComposeSettingsFragment.kt |
| Auth activity | settings/sources/auth/SourceAuthActivity.kt |
| Factory | core/parser/JsonContentRepositoryProvider.kt |
| Legado native ref | ../legado-with-MD3/app/src/main/java/io/legado/app/ |

Detailed Alignment Gaps — AnalyzeRule.kt

Systematic comparison of core/parser/legado/AnalyzeRule.kt vs legado-with-MD3/.../model/analyzeRule/AnalyzeRule.kt. Found 22 behavioral differences; key gaps listed below.

CRITICAL (#6) — List Item-by-Item vs Batch Processing

| | Kototoro | Legado |
| --- | --- | --- |
| When result is List<*> | Iterates each item, applies rule individually | Passes the ENTIRE result to the analyzer |
| Impact | CSS selectors scope limited to each item | CSS can see the full collection |
| Fix | Remove item iteration; let analyzers handle lists internally | |

CRITICAL (#18) — @put/@get Variable Storage Layers

| | Kototoro | Legado |
| --- | --- | --- |
| put(key, value) | sandbox-ruleData only (in-memory) | chapter → book → ruleData → source (SharedPrefs) |
| get(key) | sandbox-ruleData only | 4-layer priority chain + special keys (bookName, title) |
| Impact | @put:{foo: bar} in BookInfo won't be visible in TOC parsing | |
| Fix | Route put/get through source.put/get (SourceWrapper → SharedPrefs) | |
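
The layered lookup in the Legado column can be sketched as follows. This is one plausible reading of the chain, assuming that put writes to the first layer that is present and get returns the first non-empty value; VariableLayer and LayeredVariablesSketch are hypothetical types, and the special keys (bookName, title) are left out.

```kotlin
/** Hypothetical stand-in for one storage layer (chapter, book, ruleData, or source). */
class VariableLayer {
    private val values = mutableMapOf<String, String>()
    fun put(key: String, value: String) { values[key] = value }
    fun get(key: String): String? = values[key]?.takeIf { it.isNotEmpty() }
}

/** Resolves variables through chapter → book → ruleData → source, skipping absent layers. */
class LayeredVariablesSketch(
    chapter: VariableLayer? = null,
    book: VariableLayer? = null,
    ruleData: VariableLayer? = null,
    source: VariableLayer? = null,
) {
    private val layers = listOfNotNull(chapter, book, ruleData, source)

    /** Assumption: the value is stored in the highest-priority layer that exists. */
    fun put(key: String, value: String): String {
        layers.firstOrNull()?.put(key, value)
        return value
    }

    /** Returns the first non-empty value found, or an empty string. */
    fun get(key: String): String =
        layers.firstNotNullOfOrNull { it.get(key) } ?: ""
}
```

With this shape, a value stored while parsing BookInfo (no chapter yet, so it lands in the book layer) stays readable during TOC parsing as long as both stages share the same book layer, which is the visibility the in-memory-only approach loses.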

HIGH (#4) — @get/{{}} Skipped in JS Mode

| | Kototoro | Legado |
| --- | --- | --- |
| SourceRule init for JS mode | Entire @get/{{}} block skipped (if (mode != Mode.Js)) | Parsed for ALL modes |
| Impact | @get:{...} inside <js> blocks silently ignored | |
| Fix | Remove the mode guard; let makeUpRule handle substitution for JS mode | |

HIGH (#12+#13) — isRule() / makeUpRule {{}} Routing

| | Kototoro | Legado |
| --- | --- | --- |
| isRule() | Extremely broad (. $ // @ ## {{ children etc.) | Narrow: @ $. $[ // only |
| makeUpRule {{}} routing | Checks for a $., //, or @ prefix + dotted JSONPath heuristic | Uses narrow isRule() |
| Impact | {{data.name}} routed to JsonPath rule eval instead of JS eval | |
| Fix | Narrow isRule() to match Legado; align makeUpRule route logic | |
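
The narrow check from the Legado column amounts to four prefix tests. The sketch below shows that check and how a {{...}} body would be routed with it; isRuleSketch, InnerRoute, and routeInnerExpressionSketch are illustrative names, not the actual functions in AnalyzeRule.kt.

```kotlin
/** Narrow prefix check: only @, $., $[ and // mark the text as a rule. */
fun isRuleSketch(text: String): Boolean =
    text.startsWith("@") ||
        text.startsWith("\$.") ||
        text.startsWith("\$[") ||
        text.startsWith("//")

enum class InnerRoute { RULE, JS }

/** Decides how the body of a {{...}} expression is evaluated. */
fun routeInnerExpressionSketch(inner: String): InnerRoute =
    if (isRuleSketch(inner.trim())) InnerRoute.RULE else InnerRoute.JS
```

Under this check `data.name` has no rule prefix, so it falls through to JS evaluation, while `$.data.name` is still treated as JSONPath.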

HIGH (#10b) — HTML Unescape Default

| | Kototoro | Legado |
| --- | --- | --- |
| Default value | false | true |
| First element | NOT unescaped (only items from index 1+) | All elements unescaped |
| Impact | &#21407; etc. not decoded by default | |
| Fix | Default to true; unescape all elements including first | |
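
The intended behavior after the fix can be illustrated with a small sketch that unescapes every element, including the first, and defaults to unescaping. To stay dependency-free it decodes only decimal numeric entities; a real implementation would use a full HTML unescaper. The function names and the newline join are assumptions made for this example.

```kotlin
private val NUMERIC_ENTITY = Regex("&#(\\d+);")

/** Decodes decimal numeric entities such as &#21407; and leaves anything else untouched. */
fun unescapeNumericEntities(text: String): String =
    NUMERIC_ENTITY.replace(text) { match ->
        val codePoint = match.groupValues[1].toIntOrNull()
        if (codePoint != null && Character.isValidCodePoint(codePoint)) {
            Character.toChars(codePoint).concatToString()
        } else {
            match.value
        }
    }

/** Joins result items, unescaping every item (index 0 included) when unescape is true. */
fun joinContentSketch(parts: List<String>, unescape: Boolean = true): String =
    parts.joinToString("\n") { if (unescape) unescapeNumericEntities(it) else it }
```

Applying the transform inside joinToString avoids any special handling of the first element, which is where the current off-by-one behavior comes from.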

MEDIUM (#2) — Json Mode Detection ($ vs $.)

| | Kototoro | Legado |
| --- | --- | --- |
| Rule $xxx (bare $, no dot/bracket) | → Mode.Json | → falls through to Default/Jsoup |
| Impact | $var used as variable name would be treated as JSONPath | |
| Fix | Require $. or $[ prefix for Json mode | |

MEDIUM (#10a) — getString NativeObject Missing

Legado's getString has a fast path: when result is NativeObject, skip the full rule loop and directly access result[rule]. Kototoro feeds NativeObject through the full analyzer pipeline. Fix: add the NativeObject shortcut.

LOW (#15) — replaceRegex replaceFirst Failure Return

Legado: replaceFirst mode returns replacement string when regex fails. Kototoro: returns original result. Fixed in Phase 5 bodyJs alignment but replaceFirst branch needs verification.

Verified OK

| # | Claim | Status |
| --- | --- | --- |
| 17 | Missing JS bindings (java/source/book/etc.) | ❌ False alarm: RhinoJavaScriptEngine provides all bindings |
| 8 | NativeObject early-exit | ⬜ investigation differed |
| 10c | First element not unescaped | ⬜ investigation differed |
| 11 | URL resolution base (baseUrl vs redirectUrl) | Low impact |
| 16 | Regex cache | Performance only |
| 19 | getElements blank rule skip | Defensive, no impact |
| 20 | Missing getElement() | No callers in parser path |
| 21 | Undefined handling | Safety enhancement |
| 22 | Private getStringList less Map-aware | Minor |
