Layout Detection
Guide to document layout detection in Doctra.
Overview
Layout detection is the foundation of Doctra's processing pipeline. It analyzes PDF pages to identify and classify different document elements (text, tables, charts, figures).
How It Works
- Render: PDF pages are converted to images at the specified DPI
- Detection: A PaddleOCR layout model identifies element regions
- Classification: Each region is labeled by element type
- Filtering: Detections scoring below min_score are removed
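The filtering step can be illustrated with a minimal sketch. The `Detection` class and `filter_detections` function are hypothetical names used for illustration only, not Doctra's API, and the inclusive comparison (`>=`) is an assumption about how the threshold is applied.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container for one detected region (not a Doctra type)."""
    label: str    # e.g. "text", "table", "chart", "figure"
    score: float  # model confidence in [0, 1]
    box: tuple    # (x0, y0, x1, y1) in image pixels


def filter_detections(detections, min_score=0.5):
    """Keep only detections whose confidence is at least min_score."""
    return [d for d in detections if d.score >= min_score]


# Made-up detections for a single page.
detections = [
    Detection("text", 0.93, (40, 60, 560, 300)),
    Detection("table", 0.48, (40, 320, 560, 520)),
    Detection("figure", 0.71, (80, 540, 520, 760)),
]

kept = filter_detections(detections, min_score=0.5)
print([d.label for d in kept])  # ['text', 'figure']
```

With a threshold of 0.5, the table candidate at 0.48 is dropped, while a threshold of 0.0 would keep all three regions.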
Configuration
from doctra import StructuredPDFParser

parser = StructuredPDFParser(
    layout_model_name="PP-DocLayout_plus-L",
    dpi=200,
    min_score=0.5,
)
Parameters
- layout_model_name
  - PaddleOCR model to use
  - PP-DocLayout_plus-L: Best accuracy (slower)
  - PP-DocLayout_plus-M: Faster, good accuracy
- dpi
  - Image resolution
  - 100-150: Fast, lower quality
  - 200: Balanced (default)
  - 250-300: High quality, slower
- min_score
  - Confidence threshold (0-1)
  - 0.0: Include all detections
  - 0.5: Moderate filtering
  - 0.7+: Conservative, high confidence only
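To make the `dpi` trade-off concrete, the sketch below estimates the rendered image size of a page at several DPI values. It relies on the standard definition of a PDF point as 1/72 inch. The helper `page_pixels` is a hypothetical name, and Doctra's own renderer may round or scale slightly differently.

```python
def page_pixels(width_pt, height_pt, dpi):
    """Convert a page size in PDF points (1/72 inch) to pixels at the given DPI."""
    scale = dpi / 72
    return round(width_pt * scale), round(height_pt * scale)


# US Letter is 612 x 792 points (8.5 x 11 inches).
for dpi in (100, 200, 300):
    w, h = page_pixels(612, 792, dpi)
    print(f"{dpi} DPI -> {w} x {h} px ({w * h / 1e6:.1f} MP)")
```

The pixel count grows with the square of the DPI, so a 300 DPI render holds nine times as many pixels as a 100 DPI render of the same page, which is why higher settings are slower.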
Visualization
Verify detection quality by inspecting the bounding boxes drawn over each page.
Element Types
- Text: Regular content (blue boxes)
- Tables: Tabular data (red boxes)
- Charts: Graphs and plots (green boxes)
- Figures: Images and diagrams (orange boxes)
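The color scheme above can be reproduced outside Doctra with a small standard-library sketch that builds an SVG overlay from detection boxes. This is not Doctra's visualization code: `BOX_COLORS`, `boxes_to_svg`, the lowercase label strings, and the gray fallback for unknown labels are all assumptions made for illustration.

```python
# Box colors taken from the element-type list above.
BOX_COLORS = {
    "text": "blue",
    "table": "red",
    "chart": "green",
    "figure": "orange",
}


def boxes_to_svg(detections, width, height):
    """Build an SVG overlay with one colored rectangle per (label, box) pair."""
    rects = []
    for label, (x0, y0, x1, y1) in detections:
        color = BOX_COLORS.get(label, "gray")  # fallback color is an assumption
        rects.append(
            f'<rect x="{x0}" y="{y0}" width="{x1 - x0}" height="{y1 - y0}" '
            f'fill="none" stroke="{color}" stroke-width="3"/>'
        )
    body = "\n  ".join(rects)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">\n'
        f"  {body}\n</svg>"
    )


# Made-up boxes in pixel coordinates for a 600 x 800 page image.
svg = boxes_to_svg(
    [("text", (40, 60, 560, 300)), ("table", (40, 320, 560, 520))],
    width=600,
    height=800,
)
print(svg)
```

Layering the resulting SVG over the rendered page image gives a quick visual check of which regions were detected and how they were classified.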
See Also
- Core Concepts - Understanding the pipeline
- Visualization - Layout visualization
- API Reference - Configuration options