PaddlePaddle

4 articles found in this topic.

PaddlePaddle•12/10/2025

Open-Source Models Advance OCR Workflows

Vision-Language Models (VLMs) are extending OCR beyond simple text extraction to complex visual and semantic understanding. Open-source models offer cost and privacy benefits, making advanced OCR more accessible. This article covers the key factors in selecting an OCR model and highlights current open-source options.

Hugging Face•12/8/2025

Hugging Face AI Insight Talk to Feature OCR Innovations

Hugging Face's AI Insight Talk will feature advanced OCR technologies, including HunyuanOCR, PaddleOCR-VL, and MinerU. The session, held in collaboration with OpenMMLab and others, will explore solutions ranging from general recognition to multilingual support. The event is scheduled for December 4, 2025, at 20:00 Beijing Time.

PaddlePaddle•12/8/2025

Cherry Studio Integrates PaddleOCR for Enhanced Multilingual Document Parsing

Cherry Studio, an open-source desktop application for multilingual translation, has integrated PaddleOCR for image and cross-language document processing. The integration uses PaddleOCR's PP-OCRv5 model for text recognition, improving accuracy and efficiency in translation workflows.

Hugging Face•12/8/2025

PaddleOCR-VL: A Lightweight Multilingual Document Intelligence Solution for 109 Languages

PaddleOCR-VL is a new lightweight, multilingual document intelligence solution supporting 109 languages. Developed by Baidu, it combines a visual encoder with an ERNIE language model to recognize complex document elements accurately with a small parameter count. It addresses limitations of traditional OCR by being more efficient and easier to deploy.
