Update readme.md file with description

2026-02-28 22:13:02 -05:00
parent c980049751
commit c2f42fd1aa

README.md

@@ -1,3 +1,143 @@
# Kindle Clippings → LaTeX Converter
A small console utility that converts Amazon Kindle *My Clippings* text exports into structured LaTeX.
The tool parses Kindle highlights and groups them by book title, producing a LaTeX structure with:
* `\section{}` — per book
* `\subsection{}` — per highlight (metadata line)
* Highlight text — inserted as plain LaTeX content
* `\subsubsection{notes};` — placeholder for future comments
---
## Architecture
This project demonstrates **two different parsing approaches** solving the same problem:
### 1️⃣ FSM-based parser
Implemented using a template-based finite state machine (`fsm.h`).
Characteristics:
* Compile-time validated transitions
* Strong type safety
* Explicit state/event model
* Strict contract enforcement
This version is useful when:
* The input format is more complex
* You want compile-time guarantees for state transitions
* The parsing logic may grow over time
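
To make "compile-time validated transitions" concrete, here is a minimal sketch of the idea. It is not the contents of `fsm.h`: the state names, event names, `Transition`, `has_transition`, and `on_event` are all hypothetical and chosen only for illustration. Allowed edges are template specializations, so applying an event to a state that has no edge fails to compile.

```cpp
#include <type_traits>

// Hypothetical states for one clipping block.
struct ExpectTitle {};
struct ExpectMeta {};
struct ReadText {};

// Hypothetical events derived from input lines.
struct TextLine {};
struct Separator {};

// Primary template: an empty struct means "no such transition".
template <typename State, typename Event>
struct Transition {};

// The transition table, one specialization per allowed edge.
template <> struct Transition<ExpectTitle, TextLine>  { using next = ExpectMeta; };
template <> struct Transition<ExpectMeta,  TextLine>  { using next = ReadText; };
template <> struct Transition<ExpectMeta,  Separator> { using next = ExpectTitle; };
template <> struct Transition<ReadText,    TextLine>  { using next = ReadText; };
template <> struct Transition<ReadText,    Separator> { using next = ExpectTitle; };

template <typename State, typename Event>
using next_state_t = typename Transition<State, Event>::next;

// Detects at compile time whether an edge exists in the table.
template <typename State, typename Event, typename = void>
struct has_transition : std::false_type {};

template <typename State, typename Event>
struct has_transition<State, Event, std::void_t<next_state_t<State, Event>>>
    : std::true_type {};

// Applying an event yields the next state type.
// Calling this for an undefined edge is a compile error.
template <typename State, typename Event>
constexpr next_state_t<State, Event> on_event(State, Event) { return {}; }

static_assert(std::is_same_v<next_state_t<ExpectTitle, TextLine>, ExpectMeta>);
static_assert(!has_transition<ExpectTitle, Separator>::value);
```

In this sketch `ExpectTitle` deliberately has no `Separator` edge, purely to show the rejection; a real parser would likely decide for itself how to treat empty blocks.
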
---
### 2️⃣ TypeFactory-based parser
Implemented using a registration-based factory (`typefactory.h`).
Characteristics:
* Stage-driven pipeline
* One handler per parsing stage
* Runtime validation of handler registration
* No per-line allocations (handlers cached once)
This version is:
* Simpler
* More readable
* Easier to debug
* Well suited for linear, protocol-like formats
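
The registration-based approach could look roughly like the sketch below. It does not reproduce `typefactory.h`; `Stage`, `Handler`, `StageFactory`, and `Advance` are hypothetical names. The sketch shows the three properties listed above: one handler per stage, a runtime check that every stage has a registration, and handlers that are built once and then reused for every line.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

enum class Stage : std::size_t { Title, Meta, Text, Count };

struct Handler {
    virtual ~Handler() = default;
    // Consumes one line and returns the stage expected for the next line.
    virtual Stage handle(const std::string& line) = 0;
};

// Trivial example handler: ignores the line and moves to a fixed stage.
struct Advance : Handler {
    explicit Advance(Stage next) : next_(next) {}
    Stage handle(const std::string&) override { return next_; }
    Stage next_;
};

class StageFactory {
public:
    using Creator = std::function<std::unique_ptr<Handler>()>;

    void register_stage(Stage s, Creator c) { creators_[index(s)] = std::move(c); }

    // Builds every handler once; throws if a stage was never registered.
    void build() {
        for (std::size_t i = 0; i < kCount; ++i) {
            if (!creators_[i])
                throw std::logic_error("no handler registered for stage " + std::to_string(i));
            cache_[i] = creators_[i]();
        }
    }

    // Returns the cached handler; no allocation happens here.
    Handler& get(Stage s) {
        auto& h = cache_[index(s)];
        if (!h) throw std::logic_error("build() has not been called");
        return *h;
    }

private:
    static constexpr std::size_t kCount = static_cast<std::size_t>(Stage::Count);
    static std::size_t index(Stage s) { return static_cast<std::size_t>(s); }

    std::array<Creator, kCount> creators_{};
    std::array<std::unique_ptr<Handler>, kCount> cache_{};
};
```

A driver loop would then call `factory.get(stage).handle(line)` for each input line and store the returned stage, so the only allocations happen in `build()`.
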
Both implementations produce identical LaTeX output.
---
## Input Format
Expected input is a standard Kindle `My Clippings.txt` export.
Each clipping block follows this structure:
```
Book Title
- Your Highlight on Location 123-125 | Added on ...
Highlighted text line 1
Highlighted text line 2
==========
```
---
## Output Format
Generated LaTeX structure:
```latex
\section{Book Title}
\subsection{- Your Highlight on Location 123-125 | Added on ...}
Highlighted text line 1
Highlighted text line 2
\subsubsection{notes};
```
Highlights are grouped by book title.
---
## Build
Requires a C++17-compatible compiler.
```bash
g++ -std=gnu++17 -Wall -Wextra -O2 -o kindle2latex main.cpp
```
---
## Usage
```bash
./kindle2latex --input input.txt --output output.tex
```
Arguments:
| Argument | Description |
| ---------- | ----------------------------- |
| `--input` | Path to Kindle clippings file |
| `--output` | Path to generated LaTeX file |
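
For reference, handling these two arguments could be as small as the sketch below. `Options` and `parse_args` are hypothetical names, and the sketch assumes that both arguments are required and that any other flag is an error; the actual tool may behave differently.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct Options {
    std::string input;
    std::string output;
};

// Expects flag/value pairs; returns std::nullopt on any malformed input.
std::optional<Options> parse_args(const std::vector<std::string>& args) {
    if (args.size() % 2 != 0) return std::nullopt;  // a flag without a value
    Options opts;
    for (std::size_t i = 0; i + 1 < args.size(); i += 2) {
        if (args[i] == "--input") {
            opts.input = args[i + 1];
        } else if (args[i] == "--output") {
            opts.output = args[i + 1];
        } else {
            return std::nullopt;  // unknown flag
        }
    }
    if (opts.input.empty() || opts.output.empty()) return std::nullopt;
    return opts;
}
```

From `main` this would be called as `parse_args(std::vector<std::string>(argv + 1, argv + argc))`, using parentheses so the iterator-range constructor is selected.
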
---
## Design Notes
* No dynamic allocations per input line (handlers are cached).
* Order of books is preserved as in the original file.
* LaTeX special characters are escaped automatically.
* Incomplete clipping blocks are safely ignored.
* The final block is flushed even if the file does not end with `==========`.
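
As an illustration of the escaping note, a character-by-character escaper might look like this. `escape_latex` is a hypothetical name, and the exact set of characters the tool escapes is an assumption; the sketch covers the ten characters LaTeX reserves in text mode.

```cpp
#include <string>
#include <string_view>

std::string escape_latex(std::string_view text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            // These three have no simple backslash-escaped form in text mode.
            case '\\': out += "\\textbackslash{}"; break;
            case '~':  out += "\\textasciitilde{}"; break;
            case '^':  out += "\\textasciicircum{}"; break;
            // These are escaped by prefixing a backslash.
            case '&': case '%': case '$': case '#':
            case '_': case '{': case '}':
                out += '\\';
                out += c;
                break;
            default:
                out += c;
        }
    }
    return out;
}
```

Working on single characters means the braces in `\textbackslash{}` are never re-escaped, which is a common bug when escaping is done with successive global replacements.
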
---
## Why Two Implementations?
This repository intentionally keeps two different parsing styles:
* The FSM version demonstrates strict compile-time state control.
* The TypeFactory version demonstrates a clean, extensible runtime pipeline.
The goal is architectural exploration and comparison, not just solving the parsing task.