Visualization¶
Guide to visualizing Doctra's processing results.
Overview¶
Doctra provides visualization tools to help you understand and verify document processing results.
Layout Visualization¶
Display detected document elements with bounding boxes:
from doctra import StructuredPDFParser

parser = StructuredPDFParser()

# Visualize 3 pages with bounding boxes drawn around the detected elements
parser.display_pages_with_boxes(
    pdf_path="document.pdf",
    num_pages=3
)
Features¶
- Color-coded Elements: Each element type is drawn in a distinct color
- Confidence Scores: Each box shows its detection confidence
- Grid Layout: Multiple pages are arranged in an organized grid
- Element Counts: Summary statistics are reported per page
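To make the per-page element counts concrete, the sketch below tallies detections by page and element type. It is only an illustration: the `(page, label, confidence)` tuple layout and the `count_elements_per_page` helper are assumptions made for this example, not part of Doctra's API.

```python
from collections import Counter

# Hypothetical detections as (page_number, label, confidence) tuples.
# This data layout is an assumption for illustration, not Doctra's output format.
detections = [
    (1, "text", 0.98),
    (1, "table", 0.91),
    (1, "text", 0.87),
    (2, "figure", 0.76),
]


def count_elements_per_page(detections):
    """Return a dict mapping page number to a Counter of element labels."""
    counts = {}
    for page, label, _confidence in detections:
        counts.setdefault(page, Counter())[label] += 1
    return counts


summary = count_elements_per_page(detections)
for page in sorted(summary):
    print(f"Page {page}: {dict(summary[page])}")
```

A summary like this can be compared against the rendered page to spot elements that were missed or detected twice.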
Color Scheme¶
- 🔵 Blue: Text regions
- 🔴 Red: Tables
- 🟢 Green: Charts
- 🟠 Orange: Figures
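If you post-process or re-draw detections yourself, the color scheme above can be expressed as a simple lookup table. This is a minimal sketch: the label strings (`"text"`, `"table"`, `"chart"`, `"figure"`) and the `color_for` helper are hypothetical names chosen for this example and may not match Doctra's internal labels.

```python
# Assumed mapping from element label to display color, mirroring the list above.
# The label strings are illustrative; check them against your actual detections.
ELEMENT_COLORS = {
    "text": "blue",
    "table": "red",
    "chart": "green",
    "figure": "orange",
}


def color_for(label, default="gray"):
    """Return the display color for a label, falling back to a neutral default."""
    return ELEMENT_COLORS.get(label.lower(), default)


print(color_for("Table"))   # known label, case-insensitive lookup
print(color_for("header"))  # unknown label uses the default
```

The neutral fallback keeps unexpected element types visible instead of raising an error.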
Configuration¶
parser.display_pages_with_boxes(
    pdf_path="document.pdf",
    num_pages=5,           # Pages to visualize
    cols=3,                # Grid columns
    page_width=700,        # Page width in pixels
    spacing=40,            # Spacing between pages
    save_path="viz.png"    # Save instead of display
)
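To get a feel for how `num_pages`, `cols`, `page_width`, and `spacing` interact, the sketch below estimates the size of the resulting grid image. It is not Doctra's implementation: the `grid_canvas_size` helper, the `page_height` argument, and the choice to put spacing around every page (including the outer edges) are all assumptions made for illustration.

```python
import math


def grid_canvas_size(num_pages, cols, page_width, page_height, spacing):
    """Estimate (rows, cols_used, width, height) of a page grid in pixels.

    Assumes every page has the same size and that `spacing` is applied
    between pages and around the outer edge of the grid.
    """
    if num_pages < 1 or cols < 1:
        raise ValueError("num_pages and cols must be at least 1")
    cols_used = min(cols, num_pages)      # never more columns than pages
    rows = math.ceil(num_pages / cols)    # last row may be partially filled
    width = cols_used * page_width + (cols_used + 1) * spacing
    height = rows * page_height + (rows + 1) * spacing
    return rows, cols_used, width, height


# Values from the configuration above, with an assumed page height of 990 px
rows, cols_used, width, height = grid_canvas_size(
    num_pages=5, cols=3, page_width=700, page_height=990, spacing=40
)
print(rows, cols_used, width, height)
```

An estimate like this helps when choosing `cols` and `page_width` so that the saved image stays at a manageable size.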
Use Cases¶
- Quality Assurance: Verify detection accuracy
- Debugging: Identify layout issues
- Documentation: Create visual reports
- Analysis: Understand document structure
CLI Visualization¶
See Also¶
- Layout Detection - Understanding detection
- Core Concepts - Processing pipeline
- CLI Reference - Command line tools