DataHour: Document Segmentation using Layout Parser
11 Mar 2023, 09:03am - 11 Mar 2023, 10:03am
About the Event
In this DataHour, Sumeet will give you a practical walkthrough of how document segmentation is done using Layout Parser.
He will start with an introduction to techniques for handling searchable and scanned PDFs, along with their limitations, which lays the foundation for using LayoutParser. This will be followed by a theoretical discussion of the data preprocessing pipeline, the creation of a custom training set, the training pipeline, and the post-processing pipeline, which integrates seamlessly with any commercial or open-source OCR service. The session will close with a discussion of common OCR service issues that this approach handles.
Prerequisites: An interest in learning emerging and trending technologies, and a basic understanding of NLP, neural networks, Python, statistical hypothesis testing, and clustering.
- Best articles get published on Analytics Vidhya’s Blog Space
Who is this DataHour for?
About the Speaker
