Engines API Reference¶

Complete API documentation for Doctra engines.

DocResEngine¶

Image restoration engine for document enhancement.

`doctra.engines.image_restoration.DocResEngine` ¶

DocRes Image Restoration Engine

A wrapper around DocRes inference functionality for easy integration with Doctra's document processing pipeline.

Source code in doctra/engines/image_restoration/docres_engine.py

class DocResEngine:
    """
    DocRes Image Restoration Engine

    A wrapper around DocRes inference functionality for easy integration
    with Doctra's document processing pipeline.
    """

    SUPPORTED_TASKS = [
        'dewarping', 'deshadowing', 'appearance', 
        'deblurring', 'binarization', 'end2end'
    ]

    def __init__(
        self, 
        device: Optional[str] = None,
        use_half_precision: bool = True,
        model_path: Optional[str] = None,
        mbd_path: Optional[str] = None
    ):
        """
        Initialize DocRes Engine

        Args:
            device: Device to run on ('cuda', 'cpu', or None for auto-detect)
            use_half_precision: Whether to use half precision for inference
            model_path: Path to DocRes model checkpoint (optional, defaults to Hugging Face Hub)
            mbd_path: Path to MBD model checkpoint (optional, defaults to Hugging Face Hub)
        """
        if not DOCRES_AVAILABLE:
            raise ImportError(
                "DocRes is not available. Please install the missing dependencies:\n"
                "pip install scikit-image>=0.19.3\n\n"
                "The DocRes module is already included in this library, but requires "
                "scikit-image for image processing operations."
            )

        # Set device
        if device is None:
            self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        else:
            requested_device = torch.device(device)
            # Check if the requested device is available
            if requested_device.type == 'cuda' and not torch.cuda.is_available():
                print(f"Warning: CUDA requested but not available. Falling back to CPU.")
                self.device = torch.device('cpu')
            else:
                self.device = requested_device

        self.use_half_precision = use_half_precision

        # Get model paths (always from Hugging Face Hub)
        try:
            self.mbd_path, self.model_path = get_model_paths(
                use_huggingface=True,
                model_path=model_path,
                mbd_path=mbd_path
            )
        except Exception as e:
            raise RuntimeError(f"Failed to get model paths: {e}")

        # Verify model files exist
        if not os.path.exists(self.model_path):
            raise FileNotFoundError(
                f"DocRes model not found at {self.model_path}. "
                f"This may indicate a Hugging Face download failure. "
                f"Please check your internet connection and try again."
            )

        if not os.path.exists(self.mbd_path):
            raise FileNotFoundError(
                f"MBD model not found at {self.mbd_path}. "
                f"This may indicate a Hugging Face download failure. "
                f"Please check your internet connection and try again."
            )

        # Initialize model
        self._model = None
        self._initialize_model()

    def _initialize_model(self):
        """Initialize the DocRes model"""
        try:
            # Create model architecture
            self._model = restormer_arch.Restormer( 
                inp_channels=6, 
                out_channels=3, 
                dim=48,
                num_blocks=[2,3,3,4], 
                num_refinement_blocks=4,
                heads=[1,2,4,8],
                ffn_expansion_factor=2.66,
                bias=False,
                LayerNorm_type='WithBias',
                dual_pixel_task=True        
            )

            # Load model weights - always load to CPU first, then move to target device
            state = convert_state_dict(torch.load(self.model_path, map_location='cpu')['model_state'])

            self._model.load_state_dict(state)
            self._model.eval()
            self._model = self._model.to(self.device)

        except Exception as e:
            raise RuntimeError(f"Failed to initialize DocRes model: {e}")

    def restore_image(
        self, 
        image: Union[str, np.ndarray], 
        task: str = "appearance",
        save_prompts: bool = False
    ) -> Tuple[np.ndarray, Dict[str, Any]]:
        """
        Restore a single image using DocRes

        Args:
            image: Path to image file or numpy array
            task: Restoration task to perform
            save_prompts: Whether to save intermediate prompts

        Returns:
            Tuple of (restored_image, metadata)
        """
        if task not in self.SUPPORTED_TASKS:
            raise ValueError(f"Unsupported task: {task}. Supported tasks: {self.SUPPORTED_TASKS}")

        # Load image if path provided
        if isinstance(image, str):
            if not os.path.exists(image):
                raise FileNotFoundError(f"Image not found: {image}")
            img_array = cv2.imread(image)
            if img_array is None:
                raise ValueError(f"Could not load image: {image}")
        else:
            img_array = image.copy()

        original_shape = img_array.shape

        try:
            # Handle end2end pipeline
            if task == "end2end":
                return self._run_end2end_pipeline(img_array, save_prompts)

            # Run single task
            restored_img, metadata = self._run_single_task(img_array, task, save_prompts)

            metadata.update({
                'original_shape': original_shape,
                'restored_shape': restored_img.shape,
                'task': task,
                'device': str(self.device)
            })

            return restored_img, metadata

        except Exception as e:
            raise RuntimeError(f"Image restoration failed: {e}")

    def _run_single_task(self, img_array: np.ndarray, task: str, save_prompts: bool) -> Tuple[np.ndarray, Dict]:
        """Run a single restoration task"""

        # Create temporary file for inference
        with tempfile.NamedTemporaryFile(suffix='.jpg', delete=False) as tmp_file:
            tmp_path = tmp_file.name
            cv2.imwrite(tmp_path, img_array)

        try:
            # Change to DocRes directory for inference to work properly
            original_cwd = os.getcwd()
            os.chdir(str(docres_dir))

            # Set global DEVICE variable that DocRes inference expects
            import inference  # Import the inference module to set its global DEVICE
            inference.DEVICE = self.device

            try:
                # Run inference
                prompt1, prompt2, prompt3, restored = inference_one_im(self._model, tmp_path, task)
            finally:
                # Always restore original working directory
                os.chdir(original_cwd)

            metadata = {
                'task': task,
                'device': str(self.device)
            }

            if save_prompts:
                metadata['prompts'] = {
                    'prompt1': prompt1,
                    'prompt2': prompt2, 
                    'prompt3': prompt3
                }

            return restored, metadata

        finally:
            # Clean up temporary file with retry for Windows
            try:
                # Wait a bit for file handles to be released
                time.sleep(0.1)
                os.unlink(tmp_path)
            except PermissionError:
                # If still locked, try again after a longer wait
                time.sleep(1)
                try:
                    os.unlink(tmp_path)
                except PermissionError:
                    # If still failing, just leave it - it will be cleaned up by the OS
                    pass

    def _run_end2end_pipeline(self, img_array: np.ndarray, save_prompts: bool) -> Tuple[np.ndarray, Dict]:
        """Run the end2end pipeline: dewarping → deshadowing → appearance"""

        intermediate_steps = {}

        # Change to DocRes directory for inference to work properly
        original_cwd = os.getcwd()
        os.chdir(str(docres_dir))

        # Set global DEVICE variable that DocRes inference expects
        import inference  # Import the inference module to set its global DEVICE
        inference.DEVICE = self.device

        try:
            with tempfile.TemporaryDirectory() as tmp_dir:
                # Step 1: Dewarping
                step1_path = os.path.join(tmp_dir, "step1.jpg")
                cv2.imwrite(step1_path, img_array)

                prompt1, prompt2, prompt3, dewarped = inference_one_im(self._model, step1_path, "dewarping")
                intermediate_steps['dewarped'] = dewarped

                # Step 2: Deshadowing
                step2_path = os.path.join(tmp_dir, "step2.jpg")
                cv2.imwrite(step2_path, dewarped)

                prompt1, prompt2, prompt3, deshadowed = inference_one_im(self._model, step2_path, "deshadowing")
                intermediate_steps['deshadowed'] = deshadowed

                # Step 3: Appearance
                step3_path = os.path.join(tmp_dir, "step3.jpg")
                cv2.imwrite(step3_path, deshadowed)

                prompt1, prompt2, prompt3, final = inference_one_im(self._model, step3_path, "appearance")

                metadata = {
                    'task': 'end2end',
                    'device': str(self.device),
                    'intermediate_steps': intermediate_steps
                }

                if save_prompts:
                    metadata['prompts'] = {
                        'prompt1': prompt1,
                        'prompt2': prompt2,
                        'prompt3': prompt3
                    }

                return final, metadata
        finally:
            # Always restore original working directory
            os.chdir(original_cwd)

    def batch_restore(
        self, 
        images: List[Union[str, np.ndarray]], 
        task: str = "appearance",
        save_prompts: bool = False
    ) -> List[Tuple[Optional[np.ndarray], Dict[str, Any]]]:
        """
        Restore multiple images in batch

        Args:
            images: List of image paths or numpy arrays
            task: Restoration task to perform
            save_prompts: Whether to save intermediate prompts

        Returns:
            List of (restored_image, metadata) tuples
        """
        results = []

        for i, image in enumerate(images):
            try:
                restored_img, metadata = self.restore_image(image, task, save_prompts)
                results.append((restored_img, metadata))
            except Exception as e:
                # Return None for failed images with error metadata
                error_metadata = {
                    'error': str(e),
                    'task': task,
                    'device': str(self.device),
                    'image_index': i
                }
                results.append((None, error_metadata))

        return results

    def get_supported_tasks(self) -> List[str]:
        """Get list of supported restoration tasks"""
        return self.SUPPORTED_TASKS.copy()

    def is_available(self) -> bool:
        """Check if DocRes is available and properly configured"""
        return DOCRES_AVAILABLE and self._model is not None

    def restore_pdf(
        self, 
        pdf_path: str, 
        output_path: str | None = None,
        task: str = "appearance",
        dpi: int = 200
    ) -> str | None:
        """
        Restore an entire PDF document using DocRes

        Args:
            pdf_path: Path to the input PDF file
            output_path: Path for the enhanced PDF (if None, auto-generates)
            task: DocRes restoration task (default: "appearance")
            dpi: DPI for PDF rendering (default: 200)

        Returns:
            Path to the enhanced PDF or None if failed
        """
        try:
            from PIL import Image
            from doctra.utils.pdf_io import render_pdf_to_images

            # Generate output path if not provided
            if output_path is None:
                pdf_dir = os.path.dirname(pdf_path)
                pdf_name = os.path.splitext(os.path.basename(pdf_path))[0]
                output_path = os.path.join(pdf_dir, f"{pdf_name}_enhanced.pdf")

            print(f"🔄 Processing PDF with DocRes: {os.path.basename(pdf_path)}")

            # Render all pages to images
            pil_pages = [im for (im, _, _) in render_pdf_to_images(pdf_path, dpi=dpi)]

            if not pil_pages:
                print("❌ No pages found in PDF")
                return None

            # Process each page with DocRes
            enhanced_pages = []

            # Detect environment for progress bar
            is_notebook = "ipykernel" in sys.modules or "jupyter" in sys.modules

            # Create progress bar for page processing
            if is_notebook:
                progress_bar = create_notebook_friendly_bar(
                    total=len(pil_pages), 
                    desc="Processing pages"
                )
            else:
                progress_bar = create_beautiful_progress_bar(
                    total=len(pil_pages), 
                    desc="Processing pages",
                    leave=True
                )

            with progress_bar:
                for i, page_img in enumerate(pil_pages):
                    try:
                        # Convert PIL to numpy array
                        img_array = np.array(page_img)

                        # Apply DocRes restoration
                        restored_img, _ = self.restore_image(img_array, task)

                        # Convert back to PIL Image
                        enhanced_page = Image.fromarray(restored_img)
                        enhanced_pages.append(enhanced_page)

                        progress_bar.set_description(f"✅ Page {i+1}/{len(pil_pages)} processed")
                        progress_bar.update(1)

                    except Exception as e:
                        print(f"  ⚠️ Page {i+1} processing failed: {e}, using original")
                        enhanced_pages.append(page_img)
                        progress_bar.set_description(f"⚠️ Page {i+1} failed, using original")
                        progress_bar.update(1)

            # Create enhanced PDF
            if enhanced_pages:
                enhanced_pages[0].save(
                    output_path,
                    "PDF",
                    resolution=100.0,
                    save_all=True,
                    append_images=enhanced_pages[1:] if len(enhanced_pages) > 1 else []
                )

                print(f"✅ Enhanced PDF saved: {output_path}")
                return output_path
            else:
                print("❌ No pages to save")
                return None

        except ImportError as e:
            print(f"❌ Required dependencies not available: {e}")
            print("Install with: pip install PyMuPDF")
            return None
        except Exception as e:
            print(f"❌ Error processing PDF with DocRes: {e}")
            return None

`__init__(device=None, use_half_precision=True, model_path=None, mbd_path=None)` ¶

Initialize DocRes Engine

Parameters:

Name	Type	Description	Default
`device`	`Optional[str]`	Device to run on ('cuda', 'cpu', or None for auto-detect)	`None`
`use_half_precision`	`bool`	Whether to use half precision for inference	`True`
`model_path`	`Optional[str]`	Path to DocRes model checkpoint (optional, defaults to Hugging Face Hub)	`None`
`mbd_path`	`Optional[str]`	Path to MBD model checkpoint (optional, defaults to Hugging Face Hub)	`None`

`batch_restore(images, task='appearance', save_prompts=False)` ¶

Restore multiple images in batch

Parameters:

Name	Type	Description	Default
`images`	`List[Union[str, ndarray]]`	List of image paths or numpy arrays	required
`task`	`str`	Restoration task to perform	`'appearance'`
`save_prompts`	`bool`	Whether to save intermediate prompts	`False`

Returns:

Type	Description
`List[Tuple[Optional[ndarray], Dict[str, Any]]]`	List of (restored_image, metadata) tuples
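Because `batch_restore` records a failed image as `(None, error_metadata)` instead of raising, callers usually need to separate successes from failures afterwards. The following is a minimal sketch of that step. `split_batch_results` is a hypothetical helper, not part of Doctra, and the sample `results` list stands in for real output, with strings in place of image arrays.

```python
from typing import Any, Dict, List, Optional, Tuple

Result = Tuple[Optional[Any], Dict[str, Any]]


def split_batch_results(results: List[Result]):
    """Separate successful restorations from failures.

    Hypothetical helper: relies only on the documented contract that a
    failed entry is (None, metadata) with 'error' and 'image_index' keys.
    """
    restored = []   # (input index, restored image)
    failures = []   # (input index, error message)
    for position, (image, metadata) in enumerate(results):
        if image is None:
            index = metadata.get("image_index", position)
            failures.append((index, metadata.get("error", "unknown error")))
        else:
            restored.append((position, image))
    return restored, failures


# Stand-in for engine.batch_restore([...]); strings replace image arrays.
results = [
    ("restored-0", {"task": "appearance", "device": "cpu"}),
    (None, {"error": "Image not found: b.jpg", "task": "appearance",
            "device": "cpu", "image_index": 1}),
]
restored, failures = split_batch_results(results)
print(restored)
print(failures)
```

Keeping the input index next to each entry makes it possible to map failures back to the original file list.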

`get_supported_tasks()` ¶

Get list of supported restoration tasks
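The class listing shows that `get_supported_tasks` returns a copy of `SUPPORTED_TASKS`, so callers can modify the result without affecting the engine. A small sketch of that copy-on-read pattern follows. `TaskRegistry` is a stand-in class invented for this example, and only the method body mirrors the listing.

```python
class TaskRegistry:
    """Stand-in class; only the copy-on-read pattern matches DocResEngine."""

    SUPPORTED_TASKS = [
        "dewarping", "deshadowing", "appearance",
        "deblurring", "binarization", "end2end",
    ]

    def get_supported_tasks(self):
        # Return a copy so callers cannot mutate the class-level list.
        return self.SUPPORTED_TASKS.copy()


registry = TaskRegistry()
tasks = registry.get_supported_tasks()
tasks.append("custom")  # changes only the caller's copy
print(len(tasks), len(registry.SUPPORTED_TASKS))
```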

`is_available()` ¶

Check if DocRes is available and properly configured

`restore_image(image, task='appearance', save_prompts=False)` ¶

Restore a single image using DocRes

Parameters:

Name	Type	Description	Default
`image`	`Union[str, ndarray]`	Path to image file or numpy array	required
`task`	`str`	Restoration task to perform	`'appearance'`
`save_prompts`	`bool`	Whether to save intermediate prompts	`False`

Returns:

Type	Description
`Tuple[ndarray, Dict[str, Any]]`	Tuple of (restored_image, metadata)

`restore_pdf(pdf_path, output_path=None, task='appearance', dpi=200)` ¶

Restore an entire PDF document using DocRes

Parameters:

Name	Type	Description	Default
`pdf_path`	`str`	Path to the input PDF file	required
`output_path`	`str | None`	Path for the enhanced PDF (if None, auto-generates)	`None`
`task`	`str`	DocRes restoration task (default: "appearance")	`'appearance'`
`dpi`	`int`	DPI for PDF rendering (default: 200)	`200`

Returns:

Type	Description
`str | None`	Path to the enhanced PDF or None if failed
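When `output_path` is None, the `restore_pdf` source builds the output name from the input path by appending `_enhanced` to the file stem. The sketch below reproduces that rule in isolation. `default_enhanced_path` is a hypothetical name used only for this example.

```python
import os


def default_enhanced_path(pdf_path: str) -> str:
    """Mirror the naming rule used by restore_pdf when output_path is None."""
    pdf_dir = os.path.dirname(pdf_path)
    pdf_name = os.path.splitext(os.path.basename(pdf_path))[0]
    return os.path.join(pdf_dir, f"{pdf_name}_enhanced.pdf")


print(default_enhanced_path(os.path.join("scans", "invoice.pdf")))
print(default_enhanced_path("report.v2.pdf"))
```

The enhanced file lands next to the input, so the input directory must be writable when no explicit `output_path` is given.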

Quick Reference¶

DocResEngine¶

from doctra import DocResEngine

# Initialize the engine (all arguments are optional; defaults shown)
engine = DocResEngine(
    device=None,              # "cuda", "cpu", or None to auto-detect
    use_half_precision=True,
    model_path=None,          # None: resolve the checkpoint via Hugging Face Hub
    mbd_path=None,
)

# Restore a single image
restored_img, metadata = engine.restore_image(
    "document.jpg",           # image path (str) or NumPy array
    task="appearance",
)

# Restore a PDF
output_path = engine.restore_pdf(
    pdf_path="document.pdf",
    output_path=None,         # None: writes <name>_enhanced.pdf next to the input
    task="appearance",
    dpi=200,
)

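Parameter Reference¶

Initialization Parameters¶

Parameter	Type	Default	Description
`device`	str	None	Processing device: "cuda", "cpu", or None (auto-detect)
`use_half_precision`	bool	True	Whether to use half precision (FP16) for inference
`model_path`	str	None	Custom path to the DocRes model checkpoint
`mbd_path`	str	None	Custom path to the MBD model checkpoint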

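Restoration Tasks¶

Task	Description	Use case
`"appearance"`	General appearance enhancement	Most documents (default)
`"dewarping"`	Corrects perspective distortion	Scanned documents with perspective problems
`"deshadowing"`	Removes shadows and lighting artifacts	Poor lighting conditions
`"deblurring"`	Reduces blur and improves sharpness	Motion blur, focus problems
`"binarization"`	Converts to black and white	Clean text extraction
`"end2end"`	Full restoration pipeline (dewarping, then deshadowing, then appearance)	Severely degraded documents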
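According to the source listing, `restore_image` rejects any task name outside `SUPPORTED_TASKS` with a `ValueError`, and `end2end` chains dewarping, deshadowing, and appearance in that order. The sketch below captures both rules without loading a model. `expand_task` and `END2END_STAGES` are hypothetical names introduced for illustration.

```python
SUPPORTED_TASKS = [
    "dewarping", "deshadowing", "appearance",
    "deblurring", "binarization", "end2end",
]

# Order of the stages run by the "end2end" task, per the source docstring.
END2END_STAGES = ["dewarping", "deshadowing", "appearance"]


def expand_task(task: str) -> list:
    """Return the individual stages a task name stands for.

    Hypothetical helper; raises ValueError for unknown names, matching
    the check at the top of restore_image.
    """
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"Unsupported task: {task}. Supported tasks: {SUPPORTED_TASKS}"
        )
    return list(END2END_STAGES) if task == "end2end" else [task]


print(expand_task("deblurring"))
print(expand_task("end2end"))
```

Validating the task name before a long batch avoids discovering a typo only after the first image has been loaded.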

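Methods¶

restore_image()¶

Restores a single image.

Parameters:

image (str | np.ndarray): Input image (file path or NumPy array)
task (str): Restoration task to perform
save_prompts (bool): Whether to include the intermediate prompts in the metadata (default: False)

Returns:

restored_img (np.ndarray): The restored image
metadata (dict): Processing metadata, including the task and device

Example: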

import cv2
from doctra import DocResEngine

engine = DocResEngine(device="cuda")
restored, meta = engine.restore_image("blurry.jpg", task="deblurring")

print(f"Task: {meta['task']}")
print(f"Device: {meta['device']}")
print(f"Output shape: {meta['restored_shape']}")

# Save the restored image (a NumPy array, so write it with OpenCV)
cv2.imwrite("restored.jpg", restored)

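restore_pdf()¶

Restores every page of a PDF document.

Parameters:

pdf_path (str): Path to the input PDF
output_path (str, optional): Path for the output PDF (auto-generated if None)
task (str): Restoration task to perform
dpi (int): Resolution used to render the PDF pages (default: 200)

Returns:

output_path (str | None): Path to the restored PDF, or None if processing failed

Example: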

from doctra import DocResEngine

engine = DocResEngine(device="cuda")
restored_pdf = engine.restore_pdf(
    pdf_path="low_quality.pdf",
    output_path="enhanced.pdf",
    task="appearance",
    dpi=300
)

print(f"Restored PDF saved to: {restored_pdf}")

Device Selection¶

Auto-Detection¶

# Uses the GPU automatically if available, otherwise the CPU
engine = DocResEngine()

Explicit GPU¶

# Request the GPU (prints a warning and falls back to CPU if CUDA is unavailable)
engine = DocResEngine(device="cuda")

Explicit CPU¶

# Force the CPU (slower, but always available)
engine = DocResEngine(device="cpu")
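The `__init__` source shows how these three modes are resolved: None auto-detects, and a CUDA request on a machine without CUDA prints a warning and falls back to CPU. The sketch below restates that decision in plain Python. `resolve_device` is a hypothetical function, the `cuda_available` flag stands in for `torch.cuda.is_available()` so the example runs without PyTorch, and the `startswith("cuda")` test is a simplification of the engine's check on `torch.device(device).type`.

```python
def resolve_device(requested, cuda_available):
    """Return the device name the engine would end up using.

    Sketch of the rule in the __init__ listing; `cuda_available` replaces
    the call to torch.cuda.is_available().
    """
    if requested is None:
        return "cuda" if cuda_available else "cpu"
    if requested.startswith("cuda") and not cuda_available:
        print("Warning: CUDA requested but not available. Falling back to CPU.")
        return "cpu"
    return requested


for requested in (None, "cuda", "cpu"):
    print(requested, "->", resolve_device(requested, cuda_available=False))
```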

Check the Device¶

import torch

print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

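Performance Optimization¶

Half Precision¶

On modern GPUs, FP16 inference can be roughly 2x faster:

engine = DocResEngine(
    device="cuda",
    use_half_precision=True  # faster, with minimal quality loss
)

Requirements:

- NVIDIA GPU with compute capability 7.0 or higher (Volta or newer)
- Examples: RTX 20xx, RTX 30xx, RTX 40xx, A100, V100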
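The compute-capability requirement can be checked before enabling half precision. The sketch below works on a `(major, minor)` tuple, which is the shape returned by `torch.cuda.get_device_capability()`. The helper name `meets_fp16_requirement` is an assumption made for this example, and the tuples are passed in by hand so the code runs without a GPU.

```python
def meets_fp16_requirement(capability):
    """Return True if a (major, minor) compute capability is 7.0 or higher."""
    major, _minor = capability
    return major >= 7


# Sample capabilities: V100 is 7.0, A100 is 8.0, GTX 1080 is 6.1.
for name, capability in [("V100", (7, 0)), ("A100", (8, 0)), ("GTX 1080", (6, 1))]:
    print(name, meets_fp16_requirement(capability))
```

On a machine with PyTorch and a GPU, the result could feed the `use_half_precision` argument directly.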

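Batch Processing¶

Process multiple images efficiently by reusing one engine instance:

import cv2
from doctra import DocResEngine

engine = DocResEngine(device="cuda")

# Process a list of images
images = ["doc1.jpg", "doc2.jpg", "doc3.jpg"]
restored_images = []

for img_path in images:
    restored, _ = engine.restore_image(img_path, task="appearance")
    restored_images.append(restored)
    # restored is a NumPy array, so write it with OpenCV
    cv2.imwrite(f"restored_{img_path}", restored)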

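DPI Considerations¶

DPI	Quality	Speed	Memory	Best for
100	Low	Fast	Low	Quick previews
150	Medium	Medium	Medium	General use
200	Good	Slow	Medium	Default setting
300	High	Very slow	High	High-quality scans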
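The memory column follows from simple arithmetic: pixel dimensions scale linearly with DPI, so the pixel count (and the size of an uncompressed image) scales with its square. The sketch below works this out for a US Letter page (8.5 x 11 inches), assuming 8-bit RGB at 3 bytes per pixel. Both function names are hypothetical, and the figures cover only the rendered page, not model activations.

```python
def rendered_page_size(width_in, height_in, dpi):
    """Pixel dimensions of a page rendered at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def rgb_bytes(width_px, height_px):
    """Bytes for an uncompressed 8-bit RGB image of that size."""
    return width_px * height_px * 3


for dpi in (100, 150, 200, 300):
    w, h = rendered_page_size(8.5, 11, dpi)
    print(f"{dpi} DPI: {w} x {h} px, {rgb_bytes(w, h) / 1e6:.1f} MB")
```

Going from 100 to 300 DPI multiplies the per-page pixel count by nine, which is why the highest setting is noticeably slower.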

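Metadata¶

restore_image() returns metadata alongside the restored image:

restored, metadata = engine.restore_image("doc.jpg", "appearance")

print(metadata)
# Keys for a single-task run (values are illustrative):
# {
#     'task': 'appearance',
#     'device': 'cuda',
#     'original_shape': (1080, 1920, 3),
#     'restored_shape': (1080, 1920, 3)
# }
# With save_prompts=True, a 'prompts' entry is added.
# For task="end2end", the keys are 'task', 'device', and 'intermediate_steps'.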
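Per the `restore_image` source, the key set depends on the task: a single-task run carries `original_shape` and `restored_shape`, while `end2end` returns `intermediate_steps` instead. The sketch below reads the dictionary defensively for that reason. `describe_metadata` is a hypothetical helper and the sample dictionaries are made up for the example.

```python
def describe_metadata(metadata):
    """Build a one-line summary from a restore_image metadata dict.

    Hypothetical helper; uses .get() and membership tests because the
    available keys differ between single-task and end2end runs.
    """
    parts = [f"task={metadata.get('task')}", f"device={metadata.get('device')}"]
    if "restored_shape" in metadata:
        parts.append(f"shape={metadata['restored_shape']}")
    if "intermediate_steps" in metadata:
        steps = ", ".join(metadata["intermediate_steps"])
        parts.append(f"steps=[{steps}]")
    return " ".join(parts)


single = {"task": "appearance", "device": "cpu",
          "original_shape": (10, 20, 3), "restored_shape": (10, 20, 3)}
pipeline = {"task": "end2end", "device": "cpu",
            "intermediate_steps": {"dewarped": None, "deshadowed": None}}
print(describe_metadata(single))
print(describe_metadata(pipeline))
```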

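Error Handling¶

from doctra import DocResEngine

engine = DocResEngine(device="cuda")

try:
    restored, meta = engine.restore_image("document.jpg", "appearance")
except FileNotFoundError:
    print("Image not found")
except ValueError as e:
    # Unsupported task name, or a file that could not be decoded as an image
    print(f"Invalid input: {e}")
except RuntimeError as e:
    # Failures during inference (including CUDA errors) are re-raised as RuntimeError
    print(f"Restoration failed: {e}")
    # Fall back to CPU
    engine = DocResEngine(device="cpu")
    restored, meta = engine.restore_image("document.jpg", "appearance")
except Exception as e:
    print(f"Unexpected error: {e}")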
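The fallback idea can be factored into a small wrapper that tries several restore callables in order. This is a sketch under stated assumptions: `restore_with_fallback` is a hypothetical helper, and the two stand-in functions simulate a failing GPU engine and a working CPU engine so the example runs without Doctra. In real use the list could hold `gpu_engine.restore_image` and `cpu_engine.restore_image`.

```python
def restore_with_fallback(restorers, image, task="appearance"):
    """Try each restore callable in order and return the first success.

    Each callable takes (image, task) and returns (restored, metadata).
    Only RuntimeError triggers the next attempt, since restore_image wraps
    inference failures in RuntimeError.
    """
    last_error = None
    for restore in restorers:
        try:
            return restore(image, task)
        except RuntimeError as exc:
            last_error = exc
    raise RuntimeError(f"All restorers failed: {last_error}")


def failing(image, task):
    # Stand-in for a GPU engine whose inference fails.
    raise RuntimeError("Image restoration failed: simulated CUDA error")


def working(image, task):
    # Stand-in for a CPU engine that succeeds.
    return f"restored:{image}", {"task": task, "device": "cpu"}


restored, meta = restore_with_fallback([failing, working], "document.jpg")
print(restored, meta)
```

Errors such as `FileNotFoundError` are deliberately not retried, because a second engine would fail on the same missing file.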

Integration with Parsers¶

DocResEngine is integrated into EnhancedPDFParser:

from doctra import EnhancedPDFParser

# This uses DocResEngine internally
parser = EnhancedPDFParser(
    use_image_restoration=True,
    restoration_task="appearance",
    restoration_device="cuda"
)

parser.parse("document.pdf")

For standalone restoration:

from doctra import DocResEngine

# Step 1: Restore the PDF
engine = DocResEngine(device="cuda")
enhanced_pdf = engine.restore_pdf(
    pdf_path="low_quality.pdf",
    output_path="enhanced.pdf",
    task="appearance"
)

# Step 2: Parse the enhanced PDF
from doctra import StructuredPDFParser

parser = StructuredPDFParser()
parser.parse(enhanced_pdf)

Examples¶

Example 1: Dewarping a Scanned Document¶

import cv2
from doctra import DocResEngine

engine = DocResEngine(device="cuda")

# Fix perspective distortion
restored, meta = engine.restore_image(
    "scanned_with_distortion.jpg",
    task="dewarping"
)

# restored is a NumPy array, so write it with OpenCV
cv2.imwrite("dewarped.jpg", restored)
print(f"Output shape: {meta['restored_shape']}")

Example 2: Removing Shadows¶

import cv2
from doctra import DocResEngine

engine = DocResEngine(device="cuda")

# Remove shadow artifacts
restored, meta = engine.restore_image(
    "document_with_shadows.jpg",
    task="deshadowing"
)

cv2.imwrite("no_shadows.jpg", restored)

Example 3: Batch PDF Restoration¶

import os
from doctra import DocResEngine

engine = DocResEngine(device="cuda", use_half_precision=True)

pdf_dir = "input_pdfs"
output_dir = "restored_pdfs"
os.makedirs(output_dir, exist_ok=True)

for filename in os.listdir(pdf_dir):
    if filename.endswith(".pdf"):
        input_path = os.path.join(pdf_dir, filename)
        output_path = os.path.join(output_dir, f"restored_{filename}")

        print(f"Processing {filename}...")
        engine.restore_pdf(
            pdf_path=input_path,
            output_path=output_path,
            task="appearance",
            dpi=200
        )

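See Also¶

Enhanced Parser - Using restoration together with parsing
Core Concepts - Understanding image restoration
Examples - Advanced usage patterns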
