gpl/kindle2latex

deeaitch cfcacee550 fixed small mistake

2026-02-28 23:12:24 -05:00

fsm

separate fsm version from other

2026-02-28 21:44:47 -05:00

typefactory

Translate comments.

2026-02-28 22:09:40 -05:00

.gitignore

move to root folder

2026-02-28 21:07:16 -05:00

LICENSE

Initial commit

2026-03-01 01:34:55 +00:00

README.md

fixed small mistake

2026-02-28 23:12:24 -05:00

README.md

Kindle Clippings → LaTeX Converter

A small console utility that converts an Amazon Kindle My Clippings.txt export into structured LaTeX.

The tool parses Kindle highlights and groups them by book title, producing a LaTeX structure with:

\section{} — per book
\subsection{} — per highlight (metadata line)
Highlight text — inserted as plain LaTeX content
\subsubsection{notes} — placeholder for future comments

Architecture

This project demonstrates two different parsing approaches solving the same problem:

1️⃣ FSM-based parser

Implemented using a template-based finite state machine (fsm.h).

Characteristics:

Compile-time validated transitions
Strong type safety
Explicit state/event model
Strict contract enforcement

This version is useful when:

The input format is more complex
You want compile-time guarantees for state transitions
The parsing logic may grow over time
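As a rough illustration of what "compile-time validated transitions" can mean, the sketch below models states and events as empty types and the transition table as a set of overloads. It does not reproduce the API of fsm.h: every name here (`ExpectTitle`, `Line`, `transition`, `has_transition`, and so on) is hypothetical, and the transition table itself is an assumption about how a clipping block might be walked.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical parser states (illustrative names, not taken from fsm.h).
struct ExpectTitle {};
struct ExpectMeta {};
struct ReadText {};
struct Done {};

// Hypothetical events produced by classifying one input line.
struct Line {};
struct Separator {};
struct EndOfInput {};

// Transition table as overloads: the return type is the next state.
constexpr ExpectMeta  transition(ExpectTitle, Line)       { return {}; }
constexpr ReadText    transition(ExpectMeta,  Line)       { return {}; }
constexpr ReadText    transition(ReadText,    Line)       { return {}; }
constexpr ExpectTitle transition(ExpectTitle, Separator)  { return {}; }
constexpr ExpectTitle transition(ExpectMeta,  Separator)  { return {}; }
constexpr ExpectTitle transition(ReadText,    Separator)  { return {}; }
constexpr Done        transition(ExpectTitle, EndOfInput) { return {}; }
constexpr Done        transition(ExpectMeta,  EndOfInput) { return {}; }
constexpr Done        transition(ReadText,    EndOfInput) { return {}; }
// Done has no outgoing transitions on purpose.

// Detection trait: true only if transition(State, Event) is declared.
template <class State, class Event, class = void>
struct has_transition : std::false_type {};

template <class State, class Event>
struct has_transition<State, Event,
    std::void_t<decltype(transition(std::declval<State>(),
                                    std::declval<Event>()))>>
    : std::true_type {};

// The contract is checked by the compiler, not at run time.
static_assert(has_transition<ReadText, Separator>::value,
              "a separator may end a text block");
static_assert(!has_transition<Done, Line>::value,
              "no input is accepted after the end of input");

// Walking one complete block (title, metadata, text, separator)
// must end in the state that expects the next title.
using AfterOneBlock = decltype(transition(
    transition(transition(transition(ExpectTitle{}, Line{}), Line{}), Line{}),
    Separator{}));
static_assert(std::is_same_v<AfterOneBlock, ExpectTitle>,
              "a finished block returns to ExpectTitle");
```

With this shape, calling `transition(Done{}, Line{})` directly is a compile error rather than a run-time failure, which is the kind of guarantee the FSM version aims for.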

2️⃣ TypeFactory-based parser

Implemented using a registration-based factory (typefactory.h).

Characteristics:

Stage-driven pipeline
One handler per parsing stage
Runtime validation of handler registration
No per-line allocations (handlers cached once)

This version is:

Simpler
More readable
Easier to debug
Well suited for linear, protocol-like formats
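A minimal sketch of a stage-driven pipeline with one handler per stage is shown here. It is not the interface of typefactory.h: `Stage`, `StageHandler`, `HandlerRegistry` and `make_registry` are hypothetical names, and the handlers implement an assumed, simplified reading of the clipping format. The points it tries to show are that each handler object is created once at registration time, and that a missing registration is detected at run time before any input is processed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

enum class Stage : std::size_t { Title, Meta, Text, Count };

struct Clipping {
    std::string title;
    std::string meta;
    std::string text;
};

// One handler per stage; handle() returns the stage for the next line.
struct StageHandler {
    virtual ~StageHandler() = default;
    virtual Stage handle(const std::string& line, Clipping& out) = 0;
};

struct TitleHandler : StageHandler {
    Stage handle(const std::string& line, Clipping& out) override {
        out.title = line;
        return Stage::Meta;
    }
};

struct MetaHandler : StageHandler {
    Stage handle(const std::string& line, Clipping& out) override {
        out.meta = line;
        return Stage::Text;
    }
};

struct TextHandler : StageHandler {
    Stage handle(const std::string& line, Clipping& out) override {
        if (line == "==========") return Stage::Title;  // block finished
        if (!line.empty()) {                            // skip the blank line
            if (!out.text.empty()) out.text += '\n';
            out.text += line;
        }
        return Stage::Text;
    }
};

class HandlerRegistry {
public:
    // Creates the handler once; later lookups reuse the same object.
    template <class Handler>
    void add(Stage stage) { slots_[index(stage)] = std::make_unique<Handler>(); }

    // Runtime validation: every stage must have a registered handler.
    void validate() const {
        for (const auto& slot : slots_)
            if (!slot) throw std::logic_error("stage has no registered handler");
    }

    StageHandler& get(Stage stage) const {
        assert(slots_[index(stage)] && "call validate() before get()");
        return *slots_[index(stage)];
    }

private:
    static std::size_t index(Stage s) { return static_cast<std::size_t>(s); }
    std::array<std::unique_ptr<StageHandler>,
               static_cast<std::size_t>(Stage::Count)> slots_;
};

HandlerRegistry make_registry() {
    HandlerRegistry registry;
    registry.add<TitleHandler>(Stage::Title);
    registry.add<MetaHandler>(Stage::Meta);
    registry.add<TextHandler>(Stage::Text);
    registry.validate();
    return registry;
}
```

The driving loop is then a single line per input line, `stage = registry.get(stage).handle(line, clipping);`, which is part of why this style tends to be easy to step through in a debugger.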

Both implementations produce identical LaTeX output.

Input Format

Expected input is a standard Kindle My Clippings.txt export.

Each clipping block follows this structure:

Book Title
- Your Highlight on Location 123-125 | Added on ...

Highlighted text line 1
Highlighted text line 2
==========
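To make the block layout concrete, here is a minimal sketch of a reader for it, independent of either parser in this repository. The names `Clipping` and `parse_clippings` are hypothetical. The sketch assumes that a block counts as complete only when it has a title, a metadata line and at least one non-empty text line, and it strips a trailing carriage return in case the export uses Windows line endings.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Clipping {
    std::string title;
    std::string meta;
    std::vector<std::string> text;
};

std::vector<Clipping> parse_clippings(std::istream& in) {
    std::vector<Clipping> result;
    Clipping current;
    int lines_seen = 0;  // 0: title expected, 1: metadata expected, 2: text

    // Stores the current block if it is complete, then starts a new one.
    auto flush = [&] {
        if (lines_seen == 2 && !current.text.empty())
            result.push_back(std::move(current));
        current = Clipping{};
        lines_seen = 0;
    };

    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line == "==========") {
            flush();
        } else if (lines_seen == 0) {
            if (line.empty()) continue;  // tolerate blank lines between blocks
            current.title = line;
            lines_seen = 1;
        } else if (lines_seen == 1) {
            current.meta = line;
            lines_seen = 2;
        } else if (!line.empty()) {
            current.text.push_back(line);
        }
    }
    flush();  // the last block may not be followed by a separator
    return result;
}
```

Passing a `std::ifstream` opened on the export, or a `std::istringstream` in a test, yields one `Clipping` per complete block in file order.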

Output Format

Generated LaTeX structure:

\section{Book Title}

  \subsection{- Your Highlight on Location 123-125 | Added on ...}
    Highlighted text line 1
    Highlighted text line 2
      \subsubsection{notes}

Highlights are grouped by book title.
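One way to group highlights by title while keeping the order in which books first appear is to pair a vector (for order) with a hash map from title to vector index (for lookup). The sketch below assumes that approach; `Highlight`, `Book`, `Library` and `write_latex` are hypothetical names, the indentation mirrors the sample output above, and escaping of special characters is deliberately left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Highlight {
    std::string meta;
    std::vector<std::string> text;
};

struct Book {
    std::string title;
    std::vector<Highlight> highlights;
};

class Library {
public:
    // Appends the highlight to its book; a new book goes to the end,
    // so books stay in order of first appearance.
    void add(const std::string& title, Highlight highlight) {
        auto [it, inserted] = index_.try_emplace(title, books_.size());
        if (inserted) books_.push_back(Book{title, {}});
        assert(it->second < books_.size());
        books_[it->second].highlights.push_back(std::move(highlight));
    }

    const std::vector<Book>& books() const { return books_; }

private:
    std::vector<Book> books_;
    std::unordered_map<std::string, std::size_t> index_;
};

// Writes one \section per book and one \subsection per highlight.
void write_latex(std::ostream& out, const std::vector<Book>& books) {
    for (const Book& book : books) {
        out << "\\section{" << book.title << "}\n\n";
        for (const Highlight& h : book.highlights) {
            out << "  \\subsection{" << h.meta << "}\n";
            for (const std::string& line : h.text)
                out << "    " << line << "\n";
            out << "      \\subsubsection{notes}\n";
        }
    }
}
```

A sorted `std::map` keyed by title would also group the highlights, but it would emit books alphabetically instead of in file order.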

Build

Requires a C++17-compatible compiler.

g++ -std=gnu++17 -Wall -Wextra -O2 -o kindle2latex main.cpp

Usage

./kindle2latex --input input.txt --output output.tex

Arguments:

| Argument | Description |
|---|---|
| `--input` | Path to Kindle clippings file |
| `--output` | Path to generated LaTeX file |
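For readers curious how two such arguments might be handled without a library, here is a small sketch. It is not the project's actual argument handling: `Options` and `parse_args` are hypothetical, and the sketch assumes both arguments are required and that anything unrecognized is an error.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

struct Options {
    std::string input;
    std::string output;
};

// Returns the parsed options, or std::nullopt if an argument is unknown,
// is missing its value, or one of the two paths was not given.
std::optional<Options> parse_args(int argc, const char* const argv[]) {
    Options opts;
    for (int i = 1; i < argc; ++i) {
        const std::string_view arg = argv[i];
        const bool has_value = i + 1 < argc;
        if (arg == "--input" && has_value) {
            opts.input = argv[++i];
        } else if (arg == "--output" && has_value) {
            opts.output = argv[++i];
        } else {
            return std::nullopt;
        }
    }
    if (opts.input.empty() || opts.output.empty()) return std::nullopt;
    return opts;
}
```

In `main`, a `std::nullopt` result would typically lead to printing a usage line and returning a non-zero exit code.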

Design Notes

No dynamic allocations per input line (handlers are cached).
Order of books is preserved as in the original file.
LaTeX special characters are escaped automatically.
Incomplete clipping blocks are safely ignored.
The final block is flushed even if the file does not end with ==========.
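The escaping note can be illustrated with a short sketch. Which characters the tool actually escapes, and with which replacements, is not stated here, so the mapping in this example is an assumption based on the usual LaTeX reserved characters; `escape_latex` is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Returns a copy of `in` that is safe to place in LaTeX text mode.
std::string escape_latex(std::string_view in) {
    std::string out;
    out.reserve(in.size());
    for (const char c : in) {
        switch (c) {
            case '\\': out += "\\textbackslash{}";   break;
            case '~':  out += "\\textasciitilde{}";  break;
            case '^':  out += "\\textasciicircum{}"; break;
            case '&': case '%': case '$': case '#':
            case '_': case '{': case '}':
                out += '\\';  // these only need a leading backslash
                out += c;
                break;
            default:
                out += c;
        }
    }
    return out;
}
```

Escaping has to happen on the raw clipping text before it is wrapped in `\section{}` or `\subsection{}`, otherwise the braces and backslashes of the generated commands would be escaped as well.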

Why Two Implementations?

This repository intentionally keeps two different parsing styles:

The FSM version demonstrates strict compile-time state control.
The TypeFactory version demonstrates a clean, extensible runtime pipeline.

The goal is architectural exploration and comparison, not just solving the parsing task.
