From c2f42fd1aa894ce930fc6a6809f1cf5d63be32a9 Mon Sep 17 00:00:00 2001
From: deeaitch
Date: Sat, 28 Feb 2026 22:13:02 -0500
Subject: [PATCH] Update readme.md file with description

---
 README.md | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 134 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 495f504..6d7515b 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,135 @@
-# kindle2latex
+# Kindle Clippings → LaTeX Converter
 
-Convert amazon kindle clippings to latex
\ No newline at end of file
+A small console utility that converts Amazon Kindle *My Clippings* text exports into structured LaTeX.
+
+The tool parses Kindle highlights and groups them by book title, producing a LaTeX structure with:
+
+* `\section{}` — per book
+* `\subsection{}` — per highlight (metadata line)
+* Highlight text — inserted as plain LaTeX content
+* `\subsubsection{notes};` — placeholder for future comments
+
+---
+
+## Architecture
+
+This project demonstrates **two different parsing approaches** solving the same problem:
+
+### 1️⃣ FSM-based parser
+
+Implemented using a template-based finite state machine (`fsm.h`).
+
+Characteristics:
+
+* Compile-time validated transitions
+* Strong type safety
+* Explicit state/event model
+* Strict contract enforcement
+
+This version is useful when:
+
+* The input format is more complex
+* You want compile-time guarantees for state transitions
+* The parsing logic may grow over time
+
+---
+
+### 2️⃣ TypeFactory-based parser
+
+Implemented using a registration-based factory (`typefactory.h`).
+
+Characteristics:
+
+* Stage-driven pipeline
+* One handler per parsing stage
+* Runtime validation of handler registration
+* No per-line allocations (handlers cached once)
+
+This version is:
+
+* Simpler
+* More readable
+* Easier to debug
+* Well suited for linear, protocol-like formats
+
+Both implementations produce identical LaTeX output.
+
+---
+
+## Input Format
+
+Expected input is a standard Kindle `My Clippings.txt` export.
+
+Each clipping block follows this structure:
+
+```
+Book Title
+- Your Highlight on Location 123-125 | Added on ...
+
+Highlighted text line 1
+Highlighted text line 2
+==========
+```
+
+---
+
+## Output Format
+
+Generated LaTeX structure:
+
+```latex
+\section{Book Title}
+
+  \subsection{- Your Highlight on Location 123-125 | Added on ...}
+  Highlighted text line 1
+  Highlighted text line 2
+  \subsubsection{notes};
+```
+
+Highlights are grouped by book title.
+
+---
+
+## Build
+
+Requires a C++17-compatible compiler.
+
+```bash
+g++ -std=gnu++17 -Wall -Wextra -O2 -o kindle2latex main.cpp
+```
+
+---
+
+## Usage
+
+```bash
+./kindle2latex --input input.txt --output output.tex
+```
+
+Arguments:
+
+| Argument   | Description                   |
+| ---------- | ----------------------------- |
+| `--input`  | Path to Kindle clippings file |
+| `--output` | Path to generated LaTeX file  |
+
+---
+
+## Design Notes
+
+* No dynamic allocations per input line (handlers are cached).
+* Order of books is preserved as in the original file.
+* LaTeX special characters are escaped automatically.
+* Incomplete clipping blocks are safely ignored.
+* The final block is flushed even if the file does not end with `==========`.
+
+---
+
+## Why Two Implementations?
+
+This repository intentionally keeps two different parsing styles:
+
+* The FSM version demonstrates strict compile-time state control.
+* The TypeFactory version demonstrates a clean, extensible runtime pipeline.
+
+The goal is architectural exploration and comparison, not just solving the parsing task.
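
The README in this patch describes the parser's behaviour: books keep their input order, LaTeX special characters are escaped, incomplete blocks are skipped, and the final block is flushed without a trailing `==========`. It does not show any code. The sketch below illustrates one way those rules could fit together in C++17. It is not the repository's `fsm.h` or `typefactory.h` implementation; every name in it (`Clipping`, `Book`, `escapeLatex`, `parseClippings`, `toLatex`) is hypothetical, and the escape table is an assumption about which characters the tool handles.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not taken from the repository.
struct Clipping {
    std::string meta;                // "- Your Highlight on Location ..." line
    std::vector<std::string> lines;  // highlighted text, one entry per line
};
using Book = std::pair<std::string, std::vector<Clipping>>;  // title + clippings

// Escape the ten LaTeX special characters (assumed set).
std::string escapeLatex(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&': case '%': case '$': case '#':
            case '_': case '{': case '}':
                out += '\\'; out += c; break;
            case '~':  out += "\\textasciitilde{}";  break;
            case '^':  out += "\\textasciicircum{}"; break;
            case '\\': out += "\\textbackslash{}";   break;
            default:   out += c;
        }
    }
    return out;
}

// Single pass over the input with three stages: title, metadata, body.
std::vector<Book> parseClippings(std::istream& in) {
    enum class Stage { Title, Meta, Body };
    std::vector<Book> books;  // a vector keeps books in first-seen order
    Stage stage = Stage::Title;
    std::string title, line;
    Clipping cur;

    auto flush = [&] {
        // A block missing its title, metadata, or text is dropped.
        if (!title.empty() && !cur.meta.empty() && !cur.lines.empty()) {
            Book* found = nullptr;
            for (auto& b : books)
                if (b.first == title) { found = &b; break; }
            if (!found) {
                books.emplace_back(title, std::vector<Clipping>{});
                found = &books.back();
            }
            found->second.push_back(std::move(cur));
        }
        title.clear();
        cur = Clipping{};
        stage = Stage::Title;
    };

    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF input
        if (line == "==========") { flush(); continue; }
        switch (stage) {
            case Stage::Title:
                if (!line.empty()) { title = line; stage = Stage::Meta; }
                break;
            case Stage::Meta:
                cur.meta = line;
                stage = Stage::Body;
                break;
            case Stage::Body:
                if (!line.empty()) cur.lines.push_back(line);  // skip blank lines
                break;
        }
    }
    flush();  // final block, even without a trailing separator
    return books;
}

// Emit the layout shown in the README's "Output Format" section.
std::string toLatex(const std::vector<Book>& books) {
    std::ostringstream out;
    for (const auto& [title, clips] : books) {
        out << "\\section{" << escapeLatex(title) << "}\n\n";
        for (const auto& c : clips) {
            out << "  \\subsection{" << escapeLatex(c.meta) << "}\n";
            for (const auto& l : c.lines) out << "  " << escapeLatex(l) << "\n";
            out << "  \\subsubsection{notes};\n";
        }
    }
    return out.str();
}
```

To use it, open the clippings file as a `std::ifstream`, pass it to `parseClippings`, and write the result of `toLatex` to the output file. The sketch does not strip a UTF-8 byte-order mark, and the linear title lookup is only reasonable for small inputs. A real implementation might handle both differently.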