Paperdoc Library
A zero-dependency PHP library for generating, parsing, and converting documents — PDF, HTML, CSV, DOCX, XLSX, PPTX, Markdown and more. One API for create, read, and convert.
Features
- Generate documents from scratch (PDF, HTML, CSV, DOCX, XLSX, PPTX, Markdown)
- Parse existing files into a unified in-memory model
- Convert between any supported format in one call
- Rich document model — typed headings, ordered & bullet lists (nested), bookmarks, code blocks, blockquotes, images, tables, page breaks, and typed document properties
- Per-page layout (v0.7.0) — per-section PageSetup (any PageSize enum or fully custom dimensions, padding, full-page background image or color); absolutely-positioned TextZone blocks with clip / ellipsis / visible overflow strategies; document-wide running headers / footers with {page} / {pages} / {title} / {date} / {datetime} placeholders. See Page layout.
- Native rendering core — every block element renders cleanly to DOCX, PDF, HTML and Markdown: typed headings (<h1> / <w:pStyle>), nested lists (<ul> / <w:numPr>), blockquotes, code blocks (with language hint), bookmarks, embedded or on-disk images
- Hyperlinks — DOCX <w:hyperlink> parsed and round-tripped to HTML <a>, Markdown [text](url) and DOCX hyperlink relationships, with anchors and tooltips
- Typed exceptions — ParserException, RendererException, UnsupportedFormatException, InvalidDocumentException
- Batch processing — open and process multiple files at once
- Thumbnails — generate preview images from documents or any supported file (images, PDF, Office, HTML, etc.); LibreOffice required for Office/CSV, Imagick or Ghostscript for PDF
- Laravel integration — ServiceProvider, Facade, and Artisan commands
- AI-powered — OCR (Tesseract) and LLM extraction via Neuron AI
- Core library: PHP + extensions only; thumbnails require LibreOffice and/or Imagick/Ghostscript
Requirements
Paperdoc requires PHP 8.2+ and the following extensions (no external PHP packages):
| Dependency | Version |
|---|---|
| PHP | ^8.2 |
| ext-dom | * |
| ext-mbstring | * |
| ext-zip | * |
| ext-zlib | * |
Optional (Laravel): illuminate/support ^11.0 or ^12.0 for the Facade and ServiceProvider.
Thumbnails — required system dependencies
For correct, high-quality thumbnails (fonts and layout):
- LibreOffice is required for DOCX, XLSX, PPTX, and CSV (headless: libreoffice or soffice in PATH).
- Imagick or Ghostscript is required for PDF thumbnails with correct rendering (otherwise a fallback text/image preview is used).
Without these, thumbnails fall back to native PHP previews (limited fonts, no real layout).
Installation
Install the package via Composer:
composer require paperdoc-dev/paperdoc-lib
Laravel auto-discovery
The PaperdocServiceProvider and Paperdoc facade are registered automatically. No manual registration needed.
Quick Start
Standalone PHP
Create a document, add content, and save to a file:
use Paperdoc\Support\DocumentManager;
use Paperdoc\Document\Style\TextStyle;
$doc = DocumentManager::create('pdf', 'My Report');
$doc->openSection()
->addParagraph('Hello, Paperdoc!', TextStyle::make()->setBold());
DocumentManager::save($doc, 'output/report.pdf');
Laravel (Facade)
Use the Paperdoc facade to create, parse, convert, or render:
use Paperdoc\Facades\Paperdoc;
// Create and save
$doc = Paperdoc::create('docx', 'Invoice #1042');
$doc->openSection()->addParagraph('Amount due: $500');
Paperdoc::save($doc, storage_path('invoices/1042.docx'));
// Parse an existing file
$doc = Paperdoc::open('uploads/report.xlsx');
// Convert directly (file to file)
Paperdoc::convert('report.docx', 'report.pdf', 'pdf');
// Render document to string (e.g. HTML)
$html = Paperdoc::renderAs($doc, 'html');
// Batch open multiple files
$docs = Paperdoc::openBatch([
'file1.pdf',
'file2.docx',
'file3.xlsx',
]);
Supported Formats
Parse existing files or generate new ones for each format. Legacy Office formats (DOC, XLS, PPT) are parse-only.
| Format | Parse | Render / Generate |
|---|---|---|
| PDF | ✓ | ✓ |
| HTML | ✓ | ✓ |
| DOCX | ✓ | ✓ |
| XLSX | ✓ | ✓ |
| PPTX | ✓ | ✓ |
| CSV | ✓ | ✓ |
| Markdown | ✓ | ✓ |
| DOC | ✓ | — |
| XLS | ✓ | — |
| PPT | ✓ | — |
Document Model
Every format uses the same strongly-typed in-memory structure, so you can parse a PDF and render to DOCX without format-specific code. Every renderer supports the full set of first-class block elements — no element is silently dropped (see Rendering).
Document (format, title, ?Metadata, metadata[])
└── Section[]
├── Heading (level 1-6, runs, ?id)
├── Paragraph (TextRun[], ?ParagraphStyle)
│ └── TextRun (text, ?TextStyle, ?TextLink)
├── ListBlock (bullet | ordered, start)
│ └── ListItem (runs, blocks → nested ListBlock…)
├── Blockquote (nested DocumentElement[])
├── CodeBlock (code, ?language)
├── Bookmark (id) — link target for TextLink anchors
├── Table → TableRow[] → TableCell[]
├── Image (src | embedded data + mimeType)
└── PageBreak
All block elements implement Paperdoc\Contracts\BlockElementInterface. Styles live in Document/Style/ (ParagraphStyle, TextStyle, TableStyle), links in Document/Link/TextLink, typed document properties in Document/Metadata.
Build a richly-typed document
use Paperdoc\Document\{Document, Heading, Metadata};
use Paperdoc\Document\Style\TextStyle;
$doc = Document::make('md', 'Release notes')
->setProperties(
Metadata::make()
->setAuthor('Alice')
->setKeywords('release, changelog, paperdoc')
->setLanguage('en-US')
);
$section = $doc->openSection();
$section->addElement(Heading::make('Getting started', 2, 'intro'));
$section->addBulletList()
->addText('Install the library')
->addText('Run the quick start')
->addText('Read the docs');
$section->addCodeBlock("composer require paperdoc-dev/paperdoc-lib", 'bash');
$section->addBookmark('ready-to-go');
$section->addBlockquote()
->addText('You are all set.', TextStyle::make()->setItalic());
Block elements at a glance
| Element | Purpose | Section shortcut |
|---|---|---|
| Heading | Typed heading (level 1-6) with optional id for anchors | addHeading() / addElement(Heading::make(…)) |
| ListBlock | Ordered or bullet list; nest another ListBlock inside a ListItem for sub-lists | addBulletList() / addOrderedList() |
| Blockquote | Quoted block that can contain any nested block element | addBlockquote() |
| CodeBlock | Verbatim source code with optional language hint | addCodeBlock($code, $language) |
| Bookmark | Named landmark; target for TextLink internal anchors | addBookmark($id) |
| Image | On-disk (Image::make($path)) or in-memory bytes (Image::fromData($bytes, $mime)) — embedded as data URI in HTML/Markdown, JPEG XObject in PDF, and w:drawing with a relationship in DOCX. Real dimensions are auto-detected when width/height are omitted; oversized images are scaled to fit the page content area while preserving the aspect ratio. | addElement(Image::make(…)) |
| Table | Rows of cells with optional header row and per-column widths. Cells accept any block element — paragraphs, headings, lists, blockquotes, code blocks, bookmarks and images. | addElement(new Table()) |
| PageBreak | Hard page break (<w:br w:type="page"/> in DOCX, new PDF page, <div class="page-break"> in HTML, --- in Markdown) | addPageBreak() |
| Metadata | Typed document properties (author, subject, keywords, dates, language) | $doc->setProperties(Metadata::make()) |
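The Image row above mentions that oversized images are scaled to fit the content area while keeping their aspect ratio. The arithmetic behind that kind of fit can be sketched in plain PHP. This is only an illustration: `fitWithin()` is a hypothetical helper written for this example, not a Paperdoc function, and the box size used below is an arbitrary stand-in for a page content area.

```php
<?php

/**
 * Scale (width, height) down to fit inside a box, keeping the aspect ratio.
 * Never upscales. Hypothetical helper for illustration, not Paperdoc's code.
 *
 * @return array{0: float, 1: float}
 */
function fitWithin(float $width, float $height, float $maxWidth, float $maxHeight): array
{
    if ($width <= 0 || $height <= 0) {
        throw new InvalidArgumentException('Image dimensions must be positive.');
    }

    // The tighter of the two constraints decides the scale; cap at 1 to avoid upscaling.
    $scale = min(1.0, $maxWidth / $width, $maxHeight / $height);

    return [$width * $scale, $height * $scale];
}

// A 1200 x 800 image placed in an illustrative 495 x 742 pt content area.
[$w, $h] = fitWithin(1200, 800, 495, 742);
printf("%.1f x %.1f\n", $w, $h);
```

Here the width is the limiting side, so both dimensions shrink by the same factor and the 3:2 ratio is preserved.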
Page layout, text zones, headers & footers
v0.7.0: each Section can declare its own page geometry through a PageSetup value object and place absolutely-positioned TextZone blocks anywhere on the page, while a global RunningElement on the Document draws a header/footer on every page. Combine several sections (each with its own PageSetup) to build documents where every page has a different size and background.
Configure a page
use Paperdoc\Document\{Image, Section};
use Paperdoc\Document\Style\PageSetup;
use Paperdoc\Enum\PageSize;
$cover = Section::make('cover')->setPageSetup(
PageSetup::fromSize(PageSize::A4)
->setPadding(0) // 1, 2, 3 or 4 values (CSS shorthand)
->setBackgroundImage(Image::make('cover.jpg')) // full-bleed image
);
$body = Section::make('body')->setPageSetup(
PageSetup::fromSize(PageSize::A4, PageSetup::ORIENTATION_LANDSCAPE)
->setPadding(50)
->setBackgroundColor('#F8F5EC') // solid color
);
$square = Section::make('back-cover')->setPageSetup(
PageSetup::custom(500, 500) // any width × height in pt
->setBackgroundImage(Image::make('back.jpg'))
);
Section exposes shortcut setters (setPageSize(), setPageDimensions(), setPagePadding(), setPageBackgroundImage(), setPageBackgroundColor()) that delegate to a lazily-created PageSetup.
| Setter / Factory | Purpose |
|---|---|
| PageSetup::fromSize(PageSize, $orientation = 'portrait') | Use a standard format (A3/A4/A5/A6/Letter/Legal/Tabloid/Executive) |
| PageSetup::custom($width, $height) | Any dimensions in PDF points (1 pt = 1/72 inch) |
| landscape() / portrait() | Flip the active orientation |
| setPadding(...) (1–4 values) | CSS-style shorthand for top/right/bottom/left padding |
| setBackgroundColor($hex) | Solid full-bleed background color |
| setBackgroundImage(Image) | Full-bleed image — on-disk or Image::fromData() |
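The 1 to 4 value form of setPadding() follows the CSS shorthand convention. As a rough picture of how such a shorthand expands into four sides, here is a minimal standalone sketch. `expandPadding()` is a hypothetical helper written for this example; it is not part of Paperdoc, and the library's own handling may differ.

```php
<?php

/**
 * Expand a CSS-style shorthand (1 to 4 values) into explicit sides.
 * Hypothetical helper for illustration only; not a Paperdoc function.
 *
 * @return array{top: float, right: float, bottom: float, left: float}
 */
function expandPadding(float ...$values): array
{
    [$top, $right, $bottom, $left] = match (count($values)) {
        1 => [$values[0], $values[0], $values[0], $values[0]], // all sides
        2 => [$values[0], $values[1], $values[0], $values[1]], // vertical, horizontal
        3 => [$values[0], $values[1], $values[2], $values[1]], // top, horizontal, bottom
        4 => [$values[0], $values[1], $values[2], $values[3]], // top, right, bottom, left
        default => throw new InvalidArgumentException('Expected 1 to 4 padding values.'),
    };

    return ['top' => $top, 'right' => $right, 'bottom' => $bottom, 'left' => $left];
}

print_r(expandPadding(50, 20));
```

With two values, the first applies to top and bottom and the second to left and right, exactly as in CSS.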
Place text precisely with TextZone
A TextZone places text in an absolutely-positioned rectangle. Coordinates use the top-left convention (x=0, y=0 is the top-left of the page) for both PDF and HTML — the PdfRenderer flips to PDF's bottom-left origin internally.
use Paperdoc\Document\TextZone;
use Paperdoc\Document\Style\{ParagraphStyle, TextStyle};
use Paperdoc\Enum\Alignment;
$cover->addTextZone(x: 40, y: 40, width: 515, height: 90)
->setBackgroundColor('#0B1437')
->setBorder('#FFFFFF', 0.8)
->setPadding(16)
->addText(
'Paperdoc — Cover title',
TextStyle::make()->setBold()->setFontSize(20)->setColor('#FFFFFF'),
ParagraphStyle::make()->setAlignment(Alignment::LEFT),
);
// Long lorem with the ellipsis strategy: text is truncated to fit
// exactly the visible height and the last visible line ends with "…".
$cover->addTextZone(x: 40, y: 160, width: 250, height: 260)
->setPadding(12)
->setBackgroundColor('#FFFFFF')
->setBorder('#1F2937', 0.5)
->setOverflow(TextZone::OVERFLOW_ELLIPSIS)
->addText($veryLongText,
TextStyle::make()->setFontSize(10)->setColor('#111827'),
ParagraphStyle::make()->setLineSpacing(1.25),
);
| Overflow strategy | Behaviour |
|---|---|
| TextZone::OVERFLOW_CLIP | (Default) Silently truncates content that doesn't fit |
| TextZone::OVERFLOW_ELLIPSIS | Truncates and ends the last visible line with … (PDF: native; HTML: pseudo-element) |
| TextZone::OVERFLOW_VISIBLE | No clipping — content may flow outside the box (parity with CSS) |
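TextZone coordinates are given from the top-left corner, while PDF itself measures from the bottom-left. The conversion is a single subtraction, sketched below in plain PHP. `toPdfRect()` is a hypothetical helper for this example and makes no claim about how PdfRenderer is actually written; the page height is the approximate A4 portrait height in points.

```php
<?php

/**
 * Convert a rectangle given with a top-left origin into PDF's
 * bottom-left origin. Hypothetical helper, not Paperdoc's own code.
 * All values are in points (1 pt = 1/72 inch).
 *
 * @return array{x: float, y: float, width: float, height: float}
 */
function toPdfRect(float $x, float $y, float $width, float $height, float $pageHeight): array
{
    return [
        'x' => $x,
        // PDF measures y upward from the bottom edge, so the lower edge
        // of the box sits at pageHeight - (top offset + box height).
        'y' => $pageHeight - ($y + $height),
        'width' => $width,
        'height' => $height,
    ];
}

// A4 portrait is roughly 595.28 x 841.89 pt.
$rect = toPdfRect(40, 40, 515, 90, 841.89);
printf("lower-left corner: (%.2f, %.2f)\n", $rect['x'], $rect['y']);
// prints "lower-left corner: (40.00, 711.89)"
```

The x coordinate and the box size are unchanged; only the vertical anchor moves.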
Document-wide headers and footers
use Paperdoc\Document\Style\{RunningElement, TextStyle};
use Paperdoc\Enum\Alignment;
use Paperdoc\Support\DocumentManager;
$doc = DocumentManager::create('pdf', 'Quarterly report');
$doc->setHeader(
RunningElement::make('{title}')
->setAlignment(Alignment::LEFT)
->setStyle(TextStyle::make()->setFontSize(9)->setItalic()->setColor('#FFFFFF'))
);
$doc->setFooter(
RunningElement::make('Page {page} / {pages} · {date}')
->setAlignment(Alignment::CENTER)
->setStyle(TextStyle::make()->setFontSize(9)->setColor('#FFFFFF'))
);
Supported placeholders in the template: {page} (1-indexed current page), {pages} (total pages), {title} (the document title), {date} (Y-m-d) and {datetime} (Y-m-d H:i). The renderer resolves them per page so you don't need to update the template between pages.
Note. The HTML renderer adds a translucent rgba(255, 255, 255, 0.85) backdrop with a backdrop-filter: blur(2px) behind the running elements so they remain legible on top of any background image. The library does not automatically reserve vertical space for the header/footer — keep that in mind when positioning a TextZone close to a page edge.
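To make the per-page placeholder resolution concrete, the following standalone sketch substitutes the five placeholders for a given page. `resolvePlaceholders()` and its parameters are hypothetical, written for this example; the renderer's real implementation is not shown in this README and may work differently.

```php
<?php

/**
 * Resolve running-element placeholders for one page.
 * Hypothetical helper for illustration; not Paperdoc's implementation.
 */
function resolvePlaceholders(
    string $template,
    int $page,
    int $pages,
    string $title,
    DateTimeInterface $now
): string {
    // strtr() replaces all keys in a single pass, so a title that happens
    // to contain "{page}" is not substituted a second time.
    return strtr($template, [
        '{page}' => (string) $page,
        '{pages}' => (string) $pages,
        '{title}' => $title,
        '{date}' => $now->format('Y-m-d'),
        '{datetime}' => $now->format('Y-m-d H:i'),
    ]);
}

$now = new DateTimeImmutable('2025-01-15 09:30');
for ($page = 1; $page <= 3; $page++) {
    echo resolvePlaceholders('Page {page} / {pages} · {date}', $page, 3, 'Quarterly report', $now), "\n";
}
// first line printed: "Page 1 / 3 · 2025-01-15"
```

The template string stays the same for every page; only the values passed in change.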
Rendering
Every block element of the document model is rendered natively by all four built-in renderers. No element is silently dropped — what you build in PHP is what you get in DOCX, PDF, HTML and Markdown.
| Element | DOCX | PDF | HTML | Markdown |
|---|---|---|---|---|
| Heading | <w:pStyle w:val="Heading1..6"/> | Sized 24/20/16/14/13/12pt | <h1>…<h6> (with id) | #…###### (Pandoc {#id}) |
| Paragraph | <w:p> with styled runs | Wrapped runs (font/size/color) | <p> with styled spans | **bold** / _italic_ / `code` |
| ListBlock (nested) | <w:numPr> + numbering.xml | Bullet / number markers, indented | <ul> / <ol start="…"> | - / 1. with indent |
| Blockquote | Indented Quote w:pStyle | Indented italic block | <blockquote> | > prefix per line |
| CodeBlock | Code w:pStyle (monospaced) | Monospaced block | <pre><code class="language-…"> | ```lang … ``` |
| Bookmark | <w:bookmarkStart/> + <w:bookmarkEnd/> (also on Heading id) | Invisible target (link annotations planned) | <a id="…" class="paperdoc-bookmark"> | <a id="…"> (HTML fallback) |
| TextLink (hyperlinks) | <w:hyperlink> + relationship (or anchor) | Inline text only (PDF link annotations planned) | <a href … target/rel for external> | [text](url "title") |
| Image | <w:drawing> + word/media/ part + relationship; auto-detected dimensions, capped to content width | JPEG XObject (PNG/GIF/WebP re-encoded via GD) | Bare <img src="data:…"> (no <figure> wrapper) — safe inside <td>/<li> |  or  |
| Table (cells accept any block element) | <w:tbl> with required <w:tblGrid>, per-cell <w:tcW> from Table::getColumnWidths() | Drawn grid with cell padding; bold/italic/colour preserved when all cell runs share a style | <table> / <tr> / <td> — every block dispatcher runs inside cells | Pipe table (\| … \|) — multi-line cell content (lists, code, quotes) flattened to one line |
| PageBreak | <w:br w:type="page"/> | New PDF page | <div class="page-break"> | --- |
| Metadata | docProps/core.xml + app.xml | Author → PDF Creator | — | — |
DOCX output is a complete OOXML package ([Content_Types].xml, _rels/, word/styles.xml, word/numbering.xml, word/_rels/document.xml.rels, embedded media). PDF is generated by a native zero-dependency engine (Paperdoc\Support\Pdf\PdfEngine) with built-in fonts and JPEG image XObjects.
Conversion & rendering
Convert a file to another format in one call, or render a document to a string (e.g. HTML or Markdown) without writing to disk.
File-to-file conversion
// Standalone
DocumentManager::convert('input.docx', 'output.pdf', 'pdf');
// Laravel
Paperdoc::convert('reports/data.xlsx', storage_path('reports/data.pdf'), 'pdf');
Render to string
Useful for web preview, APIs, or further processing:
$doc = Paperdoc::open('document.pdf');
$html = Paperdoc::renderAs($doc, 'html');
$markdown = Paperdoc::renderAs($doc, 'md');
Hyperlinks
Every TextRun can carry an optional Paperdoc\Document\Link\TextLink. Links survive the full round-trip — parsed from DOCX <w:hyperlink> and re-emitted natively to DOCX (with the proper hyperlink relationship or internal anchor), HTML and Markdown.
Add a link programmatically
use Paperdoc\Support\DocumentManager;
use Paperdoc\Document\Section;
use Paperdoc\Document\Link\TextLink;
$doc = DocumentManager::create('md', 'Release notes');
$section = Section::make('main');
$section->addText(
'See the full changelog',
null,
TextLink::make('https://github.com/paperdoc-dev/paperdoc-lib/blob/main/CHANGELOG.md', '', 'Changelog')
);
$doc->addSection($section);
echo DocumentManager::renderAs($doc, 'md');
// [See the full changelog](https://github.com/paperdoc-dev/paperdoc-lib/blob/main/CHANGELOG.md "Changelog")
Supported link flavours
| Kind | Construction | HTML output | Markdown output |
|---|---|---|---|
| External URL | TextLink::make('https://x.com') | <a href="…" target="_blank" rel="noopener noreferrer"> | [label](url) |
| Internal anchor | TextLink::make('', 'section-2') | <a href="#section-2"> | [label](#section-2) |
| URL + fragment | TextLink::make('https://x.com', 'sect-2') | <a href="https://x.com#sect-2"> | [label](url#sect-2) |
| Tooltip / title | TextLink::make('https://x.com', '', 'Open') | <a … title="Open"> | [label](url "Open") |
External schemes (http, https, mailto, tel, ftp) automatically get target="_blank" rel="noopener noreferrer" in HTML to prevent tabnabbing. Run styling (bold, italic, color, font) is preserved when combined with a link.
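The scheme check and the attributes it adds can be pictured with a small standalone function. This is a sketch only: `renderLink()` is a hypothetical helper whose parameters mirror the url / anchor / title arguments shown in the table, and it is not the library's HTML renderer.

```php
<?php

/**
 * Render an inline HTML link along the lines of the table above.
 * Hypothetical helper for illustration; not Paperdoc's renderer.
 */
function renderLink(string $label, string $url = '', string $anchor = '', string $title = ''): string
{
    $externalSchemes = ['http', 'https', 'mailto', 'tel', 'ftp'];

    $href = $url . ($anchor !== '' ? '#' . $anchor : '');
    $attrs = ' href="' . htmlspecialchars($href, ENT_QUOTES) . '"';

    // Only links with an external scheme open in a new tab; rel="noopener noreferrer"
    // stops the opened page from reaching back through window.opener.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (in_array($scheme, $externalSchemes, true)) {
        $attrs .= ' target="_blank" rel="noopener noreferrer"';
    }

    if ($title !== '') {
        $attrs .= ' title="' . htmlspecialchars($title, ENT_QUOTES) . '"';
    }

    return '<a' . $attrs . '>' . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

echo renderLink('Home', 'https://x.com'), "\n";
echo renderLink('Jump', '', 'section-2'), "\n";
```

A pure internal anchor has no scheme, so it keeps the current tab and gets no rel attribute.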
Convert DOCX with hyperlinks to Markdown
All <w:hyperlink> elements are parsed and attached to their TextRun. Labels containing ] and URLs with spaces or parentheses are escaped automatically in the Markdown output:
use Paperdoc\Support\DocumentManager;
$doc = DocumentManager::open('report.docx');
file_put_contents('report.md', DocumentManager::renderAs($doc, 'md'));
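For readers curious what that kind of escaping involves, here is a conservative standalone sketch. `markdownLink()` is a hypothetical helper written for this example; the exact rules applied by Paperdoc's Markdown renderer are not documented here and may differ (for instance, it might wrap the URL in angle brackets instead of percent-encoding).

```php
<?php

/**
 * Build a Markdown inline link with conservative escaping.
 * Hypothetical helper; the library's exact escaping rules may differ.
 */
function markdownLink(string $label, string $url, string $title = ''): string
{
    // Backslash-escape characters that would otherwise end or reopen the label.
    $safeLabel = strtr($label, ['\\' => '\\\\', '[' => '\\[', ']' => '\\]']);

    // Percent-encode characters that would break the (destination) part.
    $safeUrl = strtr($url, [' ' => '%20', '(' => '%28', ')' => '%29']);

    // An optional title goes in double quotes after the destination.
    $suffix = $title !== '' ? ' "' . str_replace('"', '\\"', $title) . '"' : '';

    return '[' . $safeLabel . '](' . $safeUrl . $suffix . ')';
}

echo markdownLink('See [docs]', 'https://x.com/a b(1).html'), "\n";
echo markdownLink('Changelog', 'https://x.com', 'Open'), "\n";
```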
Generate thumbnails
Paperdoc can generate thumbnails from documents or from any supported file. Thumbnails are computed on the fly (no file is written unless you do it yourself). For in-memory documents, the thumbnail reflects the first image in the document, or falls back to the first page of the source file if the document was opened from disk.
LibreOffice is required for DOCX, XLSX, PPTX, and CSV to get thumbnails with correct fonts and layout. For PDF, Imagick or Ghostscript is required for proper rendering. See Requirements.
From a document (in-memory)
Use getThumbnail() for an array of thumbnail data (width, height, mimeType, data) or getThumbnailDataUri() for a data:image/…;base64,… string suitable for <img src="…">. In the example, data is base64-decoded before it is written to disk:
$doc = Paperdoc::open('report.pdf');
// Array: ['data' => '…', 'mimeType' => 'image/jpeg', 'width' => 300, 'height' => 200]
$thumb = $doc->getThumbnail(300, 200, 85);
if ($thumb) {
file_put_contents('preview.jpg', base64_decode($thumb['data']));
}
// Data URI — use directly in HTML
$dataUri = $doc->getThumbnailDataUri(300, 200);
// <img src="<?php echo $dataUri; ?>" alt="Preview">
Via DocumentManager or Facade
The same operations are available as static calls that take the document as an argument:
// Standalone
$thumb = DocumentManager::thumbnail($document, 300, 300, 85);
$dataUri = DocumentManager::thumbnailDataUri($document, 300, 300);
// Laravel
$dataUri = Paperdoc::thumbnailDataUri($doc, 200, 200);
From a file path (any format)
ThumbnailGenerator can create a thumbnail directly from a file. Images use GD; PDF uses Imagick or Ghostscript (required for correct rendering); Office and CSV use LibreOffice headless (required) → PDF → first page. Without LibreOffice/Imagick/Ghostscript, a fallback text or grid preview is used.
use Paperdoc\Support\ThumbnailGenerator;
// Array or data URI
$thumb = ThumbnailGenerator::fromFile('document.docx', 300, 300, 85);
$dataUri = ThumbnailGenerator::fromFileDataUri('report.pdf', 400, 300, 90, 0); // page 0 = first page
Defaults: ThumbnailGenerator::DEFAULT_WIDTH / DEFAULT_HEIGHT = 300, DEFAULT_QUALITY = 85. For PDF and Office files, the optional $page argument (0-based) selects which page to thumbnail.
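The relationship between the thumbnail array and the data URI form can be sketched in a few lines. This assumes the data key already holds base64 text, which is what the base64_decode() call in the earlier example suggests but which this README does not state outright. `toDataUri()` is a hypothetical helper, not part of Paperdoc, and the array below is a stand-in with the same keys.

```php
<?php

/**
 * Turn a thumbnail-style array into a data URI.
 * Assumes 'data' is already base64-encoded (an assumption, see above);
 * hypothetical helper, not Paperdoc's API.
 *
 * @param array{data: string, mimeType: string, width: int, height: int} $thumb
 */
function toDataUri(array $thumb): string
{
    return 'data:' . $thumb['mimeType'] . ';base64,' . $thumb['data'];
}

// Stand-in array with the same keys; the payload here is just the text "demo".
$thumb = [
    'data' => base64_encode('demo'),
    'mimeType' => 'image/jpeg',
    'width' => 300,
    'height' => 200,
];

echo toDataUri($thumb), "\n";
// prints "data:image/jpeg;base64,ZGVtbw=="
```

In practice the library's own thumbnailDataUri() and fromFileDataUri() methods shown above already return this form, so a helper like this is only useful for understanding the format.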
Opening options (OCR, LLM)
When opening a file with open() or openBatch(), you can pass options to enable OCR and/or LLM augmentation:
// Force OCR on a scanned PDF
$doc = Paperdoc::open('scan.pdf', ['ocr' => true]);
// Skip OCR even if auto-detect would run it
$doc = Paperdoc::open('mixed.pdf', ['ocr' => false]);
// Enable LLM augmentation (summaries, structure, correction)
$doc = Paperdoc::open('scan.pdf', ['ocr' => true, 'llm' => true]);
// OCR language (e.g. 'fra', 'eng')
$doc = Paperdoc::open('document.pdf', ['language' => 'fra']);
Options: ocr (true / false / 'auto'), llm (bool), language (OCR language code). Defaults come from config/paperdoc.php.
OCR
Paperdoc uses Tesseract for text extraction from scanned documents and images. OCR can run automatically when opening a PDF (auto-detect) or be forced/skipped via open($path, ['ocr' => true|false]).
- Post-processing — character substitution (0→O, rn→m), optional spell correction, n-gram correction, pattern recognition (dates, amounts), structure detection (headings, lists).
- Parallel processing — multiple pages processed in parallel (pool size configurable).
- Laravel Artisan — paperdoc:build-dictionary to build a spell-check dictionary; paperdoc:train-ngram to train an n-gram model for better correction.
Config: config/paperdoc.php → ocr (enabled, driver, language, pool_size, tesseract binary) and ocr.post_processing.
LLM / AI
LLM augmentation improves OCR output and enables summaries, translations, and structured extraction. Paperdoc uses Neuron AI to connect to multiple providers.
- LlmAugmenter — post-process OCR text (correction, structure).
- PaperdocAgent — document Q&A, summaries, and structured data extraction.
- Providers — OpenAI, Anthropic, Gemini, Ollama (and others supported by Neuron AI).
Enable via open($path, ['llm' => true]) or in config: llm.enabled, llm.provider, llm.model, llm.api_key (or PAPERDOC_LLM_* env vars).
Laravel
Use the Paperdoc facade for the same API as DocumentManager: Paperdoc::create(), Paperdoc::open(), Paperdoc::save(), Paperdoc::convert(), Paperdoc::renderAs(), Paperdoc::openBatch().
Artisan commands (when the package is installed in a Laravel app):
- php artisan paperdoc:build-dictionary <path> — build a dictionary from text files for OCR spell correction.
- php artisan paperdoc:train-ngram <path> — train an n-gram model from text files for OCR post-processing.
Configuration
Laravel only. Publish the config file:
php artisan vendor:publish --tag=paperdoc-config
This creates config/paperdoc.php. Main options:
- Default format — default output when creating documents
- Typography — fonts and sizes applied to documents
- Storage paths — where to read/write files
- OCR — Tesseract and post-processing settings
- LLM — AI extraction and augmentation (Neuron AI)
Typed Exceptions
All library errors extend a single base (Paperdoc\Exceptions\PaperdocException) so consumers can catch them uniformly.
| Exception | Thrown when… |
|---|---|
| PaperdocException | Base class (extends RuntimeException) |
| ParserException | A parser cannot read or decode a file — use ParserException::forFile($path, $reason, $previous) |
| RendererException | A renderer cannot serialise a document — use RendererException::forFormat($fmt, $reason, $previous) |
| UnsupportedFormatException | Unknown format or extension — ::forFormat() / ::forExtension() |
| InvalidDocumentException | Document is used in an invalid state (e.g. invalid heading level) |
use Paperdoc\Exceptions\PaperdocException;
use Paperdoc\Facades\Paperdoc;
try {
$doc = Paperdoc::open('report.docx');
} catch (PaperdocException $e) {
// Any Paperdoc error ends up here.
}
API reference
Main entry point: DocumentManager (standalone) or Paperdoc facade (Laravel).
| Method | Description |
|---|---|
| create($format, $title = '') | Create a new empty document (format: pdf, docx, html, csv, etc.). |
| open($filename, $options = []) | Parse a file. Options: ocr, llm, language. |
| save($document, $path) | Write document to a file. |
| renderAs($document, $format) | Render document to string (e.g. 'html', 'md'). |
| convert($source, $destination, $format) | Open source file and save as destination in given format. |
| openBatch($filenames, $options = []) | Open multiple files; returns array of documents. Same options as open(). |
| thumbnail($document, $maxWidth, $maxHeight, $quality) | Get thumbnail array (data, mimeType, width, height) for a document. See Thumbnails. |
| thumbnailDataUri($document, $maxWidth, $maxHeight, $quality) | Get thumbnail as a data:image/…;base64,… string for <img src="…">. |
| thumbnailBase64($document, $maxWidth, $maxHeight, $quality) | Get the raw base64-encoded thumbnail bytes (no data: prefix) — handy for JSON APIs. |
| registerRenderer($format, $rendererClass) | Register a custom renderer class for a given format (must implement RendererInterface). |
| registerParser($parser) | Register a custom ParserInterface instance — extend Paperdoc with new formats. |
Architecture
High-level layout of the package source:
src/
├── Concerns/ # Shared traits
├── Console/ # Artisan commands
├── Contracts/ # DocumentInterface, ParserInterface, BlockElementInterface…
├── Document/ # Core model (Document, Section, Paragraph, Heading, ListBlock, Bookmark, CodeBlock, Blockquote, Metadata…)
├── Enum/ # Format enums
├── Exceptions/ # PaperdocException + typed exceptions
├── Facades/ # Laravel Facade
├── Factory/ # Document/Parser factories
├── Llm/ # AI/LLM integration (Neuron AI)
├── Ocr/ # OCR integration
├── Parsers/ # Format-specific parsers
├── Renderers/ # Format-specific renderers
├── Support/ # DocumentManager and helpers
└── PaperdocServiceProvider.php
Testing
Run the test suite from the library directory:
composer test
# or
./vendor/bin/phpunit
Integration tests are in tests/Integration/, unit tests in tests/Unit/.