Phases 19–15 — Deep TOC, Markdown Structure, OCR & Eke Pipeline
2026-03-17 to 2026-03-20
Phase 19 — Deep 3-Level TOC + Section Anchors (2026-03-20)
Scope: All 11 books with kn.md (02, 03, 07-vol1, 07-vol2, 08, 14, 17, 25, 27, 28, 29).
Changes:
- Added `<a id="sec-N-M">` anchors at all section headings and `<a id="sub-N-M-K">` at all subsection headings in every kn.md
- ಪರಿವಿಡಿ TOC in each kn.md extended to list all three levels (chapter → section → subsection)
- Cross-links inserted after every sec/sub anchor: `[Eke →](./SLUG-kn-eke#sec-N-M)` in kn.md; `[ಕನ್ನಡ →](./SLUG-kn#sec-N-M)` in kn-eke.md
- Chapter nav fragments corrected to `#adhyAya-N` throughout
- Index back-links added to all kn.md headers: `[← ಸೂಚಿ](./README)` and `[← sUci](./README)` in kn-eke.md
- kn-eke.md self-referential header links corrected (were pointing to wrong file)
Phase 18 — docs/ sync + Noto Sans Kannada + ettuge-sync skill (2026-03-20)
Root cause fixed: All 57 docs/dnsbhat/ files were stale — Phase 17 OCR cleanup and kn.md changes went to src/ but were never copied to docs/ (the GitHub Pages source). This caused garbled rendering of books 25, 15, and all other Phase 17-touched books.
Changes:
- `docs/dnsbhat/`: synced 57 files from `src/main/md/kannada/dnsbhat/`, preserving Jekyll nav front matter (`title`, `parent`, `nav_order`) in each file
- `docs/_sass/custom/custom.scss`: added Noto Sans Kannada via Google Fonts for correct nukta (U+0CBC ಼) rendering; previously Georgia/system fonts silently dropped nukta-modified clusters
- `.claude/skills/ettuge-sync/`: new skill (ettuge-sync) that automates the full post-phase sync pipeline: staleness detection → CLAUDE.md updates → claude-prompt updates → docs/ sync → global skill copy → regenerate combined docs files → commit and push
- `.claude/skills/ettuge-sync/scripts/sync_docs.py`: standalone script for src→docs sync (Step 4a of ettuge-sync skill)
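The front-matter-preserving sync described above can be sketched as follows. This is only an illustration of the idea, not the contents of sync_docs.py: the function names are hypothetical, and it assumes that front matter is a leading block delimited by `---` lines and that the files in src/ carry either no front matter or one that should be discarded.

```python
from pathlib import Path


def split_front_matter(text):
    """Return (front_matter_block, body); the block is '' when absent."""
    if text.startswith("---\n"):
        end = text.find("\n---\n", 3)
        if end != -1:
            cut = end + len("\n---\n")
            return text[:cut], text[cut:]
    return "", text


def merge_preserving_front_matter(src_text, docs_text):
    """Take the body from src, keep the front matter already present in docs."""
    front, _ = split_front_matter(docs_text)
    _, body = split_front_matter(src_text)
    return front + body


def sync_docs(src_dir, docs_dir):
    """Sync every .md file from src_dir into docs_dir; return changed file names."""
    changed = []
    for src in sorted(Path(src_dir).glob("*.md")):
        dst = Path(docs_dir) / src.name
        old = dst.read_text(encoding="utf-8") if dst.exists() else ""
        new = merge_preserving_front_matter(src.read_text(encoding="utf-8"), old)
        if new != old:  # only stale files are rewritten
            dst.write_text(new, encoding="utf-8")
            changed.append(src.name)
    return changed
```

Returning the list of changed files doubles as a staleness report: an empty list means docs/ already matches src/.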
Phase 17 — Nudi Encoding Cleanup, u’ → u^, TOC Restructure, Citation Quote Convention (2026-03-18–19)
Multi-part phase completing Nudi/WX glyph-map artifact cleanup, fixing the unrounded-u Eke marker, restructuring TOCs, removing residual OCR structural artifacts, and establishing a canonical citation-quote convention for the published site.
Sub-phase A — Nudi character-level cleanup (books 17 and 14)
Books 17 and 14 were typeset in Nudi legacy font. WX-decoding produced Kannada Unicode text but left unmapped Latin glyph-map residuals that required cross-referencing the original PDF.
Book 17 — symbols resolved:
| Symbol | U+ | → | Replacement | Count | Context |
|---|---|---|---|---|---|
| ù | 00F9 | → | ಱ (archaic RA, U+0CB1) | 85 | vowel-displacement pattern |
| Â | 00C2 | → | ᵒ (modifier letter small o, U+1D52) | 24 | Havyaka suffix marker |
| ï | 00EF | → | ್ (virama, U+0CCD) | 7 | unrounded-u context |
| û | 00FB | → | ಼ (nukta, U+0CBC) | 3 | - |
| Ð | 00D0 | → | direct reconstructions | 2 | two Tamil loanwords: ಞೆಙ್ಙೋಳ್, ಞಙ್ಙು |
| Œ | 0152 | → | char + ಼ (nukta) | 21 | lowered vowels (ಅ಼, ಎ಼, ಒ಼); pattern A prefix and B infix |
Additional compound OCR garbles fixed: ಯುೀ → ಯೇ (75×), ೊೀ → ೋ (37×), ೂೀ → ೋ (3×), ದುು → ದು (1×).
New Eke rules for book 17’s archaic symbols: ಱ → R/Ra, ೞ → Z/Za, ಙ → G/Ga, ಞ → Y/Ya (halant/full akshara); ಼ (nukta) → : suffix (e.g. ಅ಼ → a:); ᵒ → pass through as-is; ಉ್ (unrounded u) → u^.
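A minimal sketch of how these book-17 rules could be applied is shown below. It is not gen_kn_eke.py: the function name is hypothetical, only the symbols listed above are handled (plus ಅ → a, taken from the ಅ಼ → a: example), consonants followed by a vowel sign are not handled, and every other character passes through untouched.

```python
VIRAMA = "\u0ccd"  # ್
NUKTA = "\u0cbc"   # ಼

# Archaic consonants and their Eke letter (halant form); the full akshara adds "a".
ARCHAIC = {
    "\u0cb1": "R",  # ಱ
    "\u0cde": "Z",  # ೞ
    "\u0c99": "G",  # ಙ
    "\u0c9e": "Y",  # ಞ
}


def eke_special(text):
    """Apply only the archaic-symbol rules; all other characters pass through."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in ARCHAIC:
            if nxt == VIRAMA:          # halant form, e.g. ಱ್ -> R
                out.append(ARCHAIC[ch])
                i += 2
            else:                      # bare akshara, e.g. ಱ -> Ra
                out.append(ARCHAIC[ch] + "a")
                i += 1
        elif ch == "\u0c89" and nxt == VIRAMA:  # ಉ್ (unrounded u) -> u^
            out.append("u^")
            i += 2
        elif ch == NUKTA:              # nukta becomes a ":" suffix
            out.append(":")
            i += 1
        elif ch == "\u0c85":           # ಅ -> a (so that ಅ಼ -> a:)
            out.append("a")
            i += 1
        else:                          # includes ᵒ, which passes through as-is
            out.append(ch)
            i += 1
    return "".join(out)
```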
Book 14 — symbols resolved:
| Symbol | U+ | Replacement | Count | Context |
|---|---|---|---|---|
| « | 00AB | < | 16 | etymological source arrow (§4.6) |
| » | 00BB | > (4×) / , (7×) | 11 | word-change notation / clause joins |
| ¢ | 00A2 | vowel extender | 1 | ಮಧ್ಯದಲ್ಲೆ ¢ → ಮಧ್ಯದಲ್ಲೇ |
| £ | 00A3 | , | 1 | clause join (§12.2) |
| © | 00A9 | deleted | 1 | page-break artifact block (§9.1 running header) |
| (ಆಕ) | - | (೮ಕ) | 1 | OCR misread of Kannada digit ೮ |
Sub-phase B — Eke u’ → u^ fix (2026-03-18)
All 8 existing kn-eke.md files (03, 07-vol1, 07-vol2, 14, 17, 25, 27, 28) were regenerated with u^ (caret) for the unrounded-u vowel ಉ್, replacing the earlier u' (apostrophe). Reason: the apostrophe caused rendering ambiguity in citation-quote contexts and in Markdown processors. Books 27 and 29 were regenerated again after the fragment cleanup below. Commit: 9a9b8fe.
Sub-phase C — OCR structural artifact removal
| Commit | Books | What was removed |
|---|---|---|
| dc21662 | 27, 28, 29 | Per-page running chapter headers embedded in body text |
| 66a7c62 | 27 | Page-break orphaned fragments before section headings |
| 61d2f36 | 29 | Page-split sentence fragment rejoined to its paragraph |
| 5412429 | 08 | Page-break orphaned fragment lines (3 instances) |
| 6b072f1 | 03 | Stray ಚ page-break fragment isolated before a section heading |
| 949ed17 | 25 | Entire OCR’d anukaraNike (preface) block removed from body (202 lines); the preface had been OCR’d twice, appearing a second time mid-body |
Sub-phase D — TOC restructure (all kn.md files)
All books with kn.md now have a clean ಒಳಪಿಡಿ/ಪರಿವಿಡಿ section with <a id> anchors and section-link tables. Books 03 and 27 received new full TOCs in this phase; other books were already clean.
| Book | TOC header | Anchor scheme | Count |
|---|---|---|---|
| 03 | ## ಒಳಪಿಡಿ | sec-N-M, sec-N-M-P | 100 sections, 3 levels |
| 07 | ## ಒಳಪಿಡಿ | adhyAya-N | 4 (vol1) + 2 (vol2) |
| 08 | ## ಒಳಪಿಡಿ | mixed | 38 |
| 14 | ## ಒಳಪಿಡಿ | mixed | 164 |
| 17 | ## ಪರಿವಿಡಿ | adhyAya-N | 12 |
| 25 | ## ಪರಿವಿಡಿ | adhyAya-N | 11 |
| 27 | ## ಒಳಪಿಡಿ | part-N, sec-N-M, sub-N-M-K | 221 (5+32+184) |
| 28 | ## ಪರಿವಿಡಿ | adhyAya-N | 12 |
| 29 | ## ಪರಿವಿಡಿ | adhyAya-N | 11 |
Commits: 20bb002 (book 03 — new full 3-level TOC), ad6be57 (book 27 — new 3-tier TOC with 221 anchors).
Sub-phase E — Citation quote convention (books 07, 17, 25, 28)
DNS Bhat’s books were typeset with backtick (U+0060) as typographic open-quote and apostrophe (U+0027 or U+2019) as close. Backtick triggers Markdown code-span rendering on the published site.
Decision: Replace with curly single quotes ‘word’ (U+2018 open / U+2019 close), the convention already used natively in books 03 and 27 (Sarvam OCR output). The vowel-modification marker u^ (unrounded-u in Eke) is explicitly not a citation quote and is left unchanged.
Close-char per book: U+0027 (ASCII apostrophe) for books 07, 25, 28; U+2019 (right single quotation mark) for book 17.
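Implementation: retrieved HEAD^:{path} via git to recover the state before the intermediate commit, then applied a DOTALL regex that rewrites a backtick-opened citation (backtick, CONTENT, close character) to ‘CONTENT’, with a maximum span of 300 characters so that citations split across page breaks still match. A double-backtick pass ran first, converting two backticks, CONTENT, and two close characters to ‘CONTENT’ for direct speech. Orphaned opens and closes were handled case by case.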
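The two-pass conversion can be sketched roughly as below. The exact patterns used in the project are not recorded here, so the regexes, the function name and its parameters are assumptions; only the ordering (double-backtick pass first), the 300-character cap and the per-book close character come from the description above.

```python
import re

OPEN_QUOTE = "\u2018"   # ‘
CLOSE_QUOTE = "\u2019"  # ’


def convert_citation_quotes(text, close="'", max_span=300):
    """Rewrite backtick-opened citations as curly single quotes.

    close is the per-book close character: "'" for books 07, 25, 28
    and "\u2019" for book 17.
    """
    c = re.escape(close)
    # Pass 1: two backticks ... two close chars (direct speech).
    double = re.compile(rf"``(.{{1,{max_span}}}?){c}{c}", re.DOTALL)
    # Pass 2: one backtick ... one close char; the content may span a line break.
    single = re.compile(rf"`([^`]{{1,{max_span}}}?){c}", re.DOTALL)

    def wrap(match):
        return OPEN_QUOTE + match.group(1) + CLOSE_QUOTE

    return single.sub(wrap, double.sub(wrap, text))
```

A backtick with no close character within the span limit is left untouched, which is where the orphaned opens listed in the table below would surface for manual review. Content that itself contains the close character (an apostrophe inside a cited word, for example) would be cut short by the lazy match and needs the same manual check.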
| Book (file pair) | Quotes converted | Notable edge cases |
|---|---|---|
| 07 vol1 kn + eke | ~400 | OCR fix ನವi್ಮಲ್ಲಿ → ನಮ್ಮಲ್ಲಿ (+ Eke navaimalli → nammalli); double-citation-mark display ('') → (''); 1 orphaned-open vocab gloss |
| 07 vol2 kn + eke | ~300 | 1 orphaned open (parallel entry); 1 nested outer backtick; 2 isolated OCR fragment orphans (backtick removed); 1 orphaned close |
| 17 kn + eke | 15 | 4 list-gloss items with OCR-dropped close; 1 bibliography backtick before garbled English title (backtick removed) |
| 25 kn + eke | 4 | 4 double-backtick direct-speech citations; 0 residual backticks after regex |
| 28 kn + eke | ~30 | 3 translation glosses with OCR-dropped close; 1 number-structure example |
Commits: 500a296 (intermediate ^..^ convention — superseded), 971e918 (final curly single quotes — 10 files across 5 books).
All Nudi Latin artifacts (0x80–0xFF) now cleared across all kn.md files except © (genuine copyright symbol, preserved in books 03 and 27).
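A check of this kind can be sketched as a scan for code points in the 0x80–0xFF range. The function name and the `allowed` parameter are hypothetical; the sketch simply counts leftover Latin-1 characters while exempting any that are known to be genuine, such as ©.

```python
from collections import Counter


def latin1_residuals(text, allowed=""):
    """Count characters in U+0080..U+00FF that are not explicitly allowed."""
    return Counter(
        ch for ch in text
        if 0x80 <= ord(ch) <= 0xFF and ch not in allowed
    )
```

For example, `latin1_residuals(text, allowed="©")` returning an empty counter would indicate a clean file. Note that a symbol such as Œ (U+0152) lies outside this range and would need a separate check.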
Phase 16 — Cross-Link Audit + Nav Transformation Fix (2026-03-17)
Motivation: After adding cross-links to kn.md files in prior phases, two systemic issues and one book-specific gap remained:
- kn.md cross-links used the wrong label (`[ingliS →]`, the Eke romanisation of “English”, instead of `[English →]`)
- gen_kn_eke.py passed `[English →] | [Eke →]` nav lines through verbatim, so regenerated kn-eke.md files had self-referential `[Eke →]` links pointing at themselves
- 02-kn.md had zero cross-links (the user reported `#ch2` had no navigation to English or Eke)
Audit of all kn.md files for cross-links:
| Book | [English →] links | [ingliS →] links | Status |
|---|---|---|---|
| 02 | 0 | 0 | ❌ Missing — added 60 |
| 03 | 9 (1/chapter) | 0 | ✅ |
| 07 vol1 | 4 (1/chapter) | 0 | ✅ |
| 07 vol2 | 2 (1/chapter) | 0 | ✅ |
| 08 | 38 (1/section) | 0 | ✅ |
| 14 | 0 | 82 | ❌ Wrong label — renamed to [English →] |
| 17 | 12 | 0 | ✅ |
| 25 | 11 | 0 | ✅ |
| 27 | 5 | 0 | ✅ |
| 28 | 12 (1/chapter) | 0 | ✅ |
| 29 | 11 (1/chapter) | 0 | ✅ |
Fix 1 — Book 14 kn.md: rename [ingliS →] → [English →] (82 occurrences; kn-eke.md already correct, not regenerated)
Fix 2 — gen_kn_eke.py: proper nav-link transformation
Previously: [English →](en) | [Eke →](kn-eke) was passed through verbatim into kn-eke.md — creating self-referential Eke links.
Now: when generating kn-eke.md, these lines are transformed to the correct perspective:
[English →](./book-en#en-anchor) | [Eke →](./book-kn-eke#sec-id)
↓ (in kn-eke.md)
[ಕನ್ನಡ →](./book-kn#sec-id) | [English →](./book-en#en-anchor)
The kn URL is derived by stripping -eke from the Eke filename in the [Eke →] link.
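The transformation above can be sketched as a single regex substitution. This is an illustrative reconstruction rather than the code in gen_kn_eke.py; the pattern and function name are assumptions, and it expects the nav line in exactly the `[English →](...) | [Eke →](...)` form shown.

```python
import re

NAV = re.compile(
    r"\[English →\]\((?P<en>[^)]+)\) \| \[Eke →\]\((?P<eke>[^)]+)\)"
)


def to_eke_perspective(line):
    """Rewrite a kn.md nav line so it is correct when it appears in kn-eke.md."""
    def repl(match):
        eke_path, _, fragment = match.group("eke").partition("#")
        # Derive the kn URL by stripping "-eke" from the Eke filename.
        if eke_path.endswith("-eke"):
            eke_path = eke_path[: -len("-eke")]
        kn_url = eke_path + ("#" + fragment if fragment else "")
        return f"[ಕನ್ನಡ →]({kn_url}) | [English →]({match.group('en')})"

    return NAV.sub(repl, line)
```

Lines that do not match the pattern are returned unchanged, so the function can be applied to every line of the generated file.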
Fix 3 — Book 02 kn.md: 60 cross-links added (every chapter + section anchor)
Anchor-to-English-anchor mapping (30 unique chapters/sections):
- ch1, sec-1-[1-3] → part-1--philosophy-and-core-principles
- ch2, sec-2-[1-3] → part-2--framework-overview
- ch3, sec-3-[1-2] → part-3--adjective-to-noun--ತನ
- ch4, sec-4-[1-6] → parts-45--verb-to-noun
- ch6, sec-6-1 → part-6--zero-derivation
- ch7, sec-7-[1-3] → part-7--noun-to-noun
- ch8-ch11, ch13-ch14, ch18-ch19, ch29-36, ch37-52 (and their sections) → most specific en.md anchor
Regenerations:
| File | Old lines | New lines | Change |
|---|---|---|---|
| 02-...-kn-eke.md | 491 (no nav) | 611 (with nav) | +60 nav links; correct [ಕನ್ನಡ →] format |
| 07-...-vol1-kn-eke.md | 20,183 | 20,183 | Nav fixed: [English →]\|[Eke →] → [ಕನ್ನಡ →]\|[English →] |
| 07-...-vol2-kn-eke.md | 13,331 | 13,331 | Same nav fix |
Verbatim content audit (all kn-eke.md files): All 11 books confirmed verbatim — non-empty line counts match kn.md exactly.
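The line-count comparison behind this audit can be sketched as below. The helper names are hypothetical, and matching counts are only a necessary condition for a verbatim transliteration, not proof of one.

```python
def non_empty_line_count(text):
    """Number of lines that contain something other than whitespace."""
    return sum(1 for line in text.splitlines() if line.strip())


def is_line_aligned(kn_text, eke_text):
    """True when kn.md and kn-eke.md have the same number of non-empty lines."""
    return non_empty_line_count(kn_text) == non_empty_line_count(eke_text)
```

Counting only non-empty lines keeps the check insensitive to differences in blank-line spacing between the two files.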
Commit: fix(02,14): add kn.md cross-links, fix ingliS→English, fix kn-eke nav transformation
Phase 15 — Holistic kn-eke.md Audit + Nav Fix + Stale-Eke Regeneration (2026-03-17)
Motivation: After Phase 14, a cross-book audit revealed two systemic issues that had been fixed one book at a time in prior commits, and two that hadn’t been fixed at all.
Issue 1 — Nav link hygiene (fixed holistically in commit 4964158)
All kn-eke.md files had inconsistent nav-link labels. Patterns found and corrected:
| Old pattern | Correct | Books affected |
|---|---|---|
| [ಕನ್ನಡ →] (hybrid Eke in Kannada label) | [ಕನ್ನಡ →] | 02, 07, 14, 18, 27, 29 |
| [ingliS →] (Eke romanisation of “English”) | [English →] | 02, 14 |
| [English →] \| [Eke →](kn-eke#...) (self-referential) | [ಕನ್ನಡ →](kn#adhyAya-N) \| [English →](en#...) | 03, 17, 25, 28 |
Total: 12 files, 18,746 insertions across the single holistic commit.
Issue 2 — Book 07 OCR page headers/footers (fixed in commit 98c2c7e)
After Phase 14 cleaned vol1-kn.md and vol2-kn.md, the corresponding kn-eke.md files were still stale — generated from the uncleaned source. Transliterated page headers remained:
| File | Lines before | Lines after | Pattern removed |
|---|---|---|---|
| vol1-kn.md | 20,475 | 20,185 | N / kannaDa barahada sollarime, garbled M¼À |
| vol2-kn.md | 13,928 | 13,333 | copyright line, N / kannaDa barahada sollarime, chapter headers |
Issue 3 — Book 07 kn-eke.md files stale after OCR cleanup (fixed in this phase)
The vol1-kn-eke.md (20,473 lines) and vol2-kn-eke.md (13,929 lines) had been generated from the uncleaned Phase 14 kn.md, before the header/footer removal. After those artifacts were removed from kn.md, the kn-eke.md files still contained their transliterated equivalents:
- 4 / kannaDa barahada sollarime: page headers from left-page running headers
- Copyright line in Eke form
- Section separators from chapter titles printed at top of print pages
Fix: Regenerate both from the cleaned kn.md using gen_kn_eke.py.
Issue 4 — Book 02 kn-eke.md was hand-authored summaries, not verbatim Eke (fixed in this phase)
The earliest kn-eke.md in the collection (book 02, Kannadalle Hosapadagalannu Kattuva Bage) was written manually as a companion document with explanatory Eke text, not as a verbatim transliteration of kn.md. At sections like sec-4-4, the kn-eke.md had analytical explanation (“esaka padakkE -ka oTTannu sErisi upakaraNavannu hesarisuvA…”) while kn.md had verbatim Kannada word lists and body text. The file was 835 lines against kn.md’s 553 lines (about 51% larger, expanded by hand-authored explanations).
Fix: Regenerate from kn.md using gen_kn_eke.py, replacing hand-authored content with verbatim Eke.
Regenerations in this phase (all via gen_kn_eke.py, 0 residual Kannada chars):
| File | Old lines | New lines | Source | Reduction |
|---|---|---|---|---|
| 02-...-kn-eke.md | 835 (hand-authored) | 491 (verbatim) | 02-...-kn.md (553L) | −344 (removed summaries) |
| 07-...-vol1-kn-eke.md | 20,473 (stale) | 20,183 (clean) | 07-...-vol1-kn.md (20,185L) | −290 (removed page headers) |
| 07-...-vol2-kn-eke.md | 13,929 (stale) | 13,331 (clean) | 07-...-vol2-kn.md (13,333L) | −598 (removed page headers/footers) |
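The "0 residual Kannada chars" check can be sketched as a scan of the Kannada Unicode block (U+0C80–U+0CFF). The function name is hypothetical, and the sketch assumes that the `[ಕನ್ನಡ →]` nav label is the one intentional piece of Kannada script in a kn-eke.md file and should therefore be exempt.

```python
import re

KANNADA = re.compile(r"[\u0C80-\u0CFF]")


def residual_kannada(text, allowed_labels=("[ಕನ್ನಡ →]",)):
    """Return (line_number, kannada_chars) for each line with leftover Kannada."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label in allowed_labels:
            line = line.replace(label, "")  # nav labels are intentional
        found = "".join(KANNADA.findall(line))
        if found:
            hits.append((number, found))
    return hits
```

An empty list corresponds to the "0 residual" result reported for the regenerated files.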
Known residual: 07-...-vol1-kn.md line 11206 has (4) M¼À: — a garbled WX-encoded list entry (1 occurrence). Requires original PDF to determine correct Kannada. All other character-level cleanup is complete.
Commit: fix(02,07): regenerate kn-eke.md verbatim — drop hand-authored summaries and stale page headers