To this day, legal documents are primarily distributed as PDF files, which makes it difficult for machines to access their content. To check automatically for compliance with legal requirements, a common approach is to translate the legal documents into a logical representation that an automated theorem prover can process. While textual content can be translated successfully into such a representation, there is limited research on how to translate tabular content into one.
The objective is to use modern deep learning models for natural language processing and computer vision to parse tabular content and translate it into logic rules. Existing deep learning models and training data for translating textual content can be reused or modified.
This project lets you experiment with cutting-edge deep learning models and apply them to a domain application.
Undergraduate
The outcome of this project shall be a pipeline that takes a table as a PDF or image file and outputs one or more logic rules that represent the tabular content.
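To make the intended output more concrete, the following is a minimal sketch of the final stage of such a pipeline only. It assumes the table has already been extracted from the PDF or image into a header row and data rows (the deep learning part is not shown), and it assumes Prolog-style clauses as the target; the project description does not fix a logic formalism. The function names `to_atom` and `table_to_rules` and the example table are hypothetical and serve only as illustration.

```python
import re


def to_atom(text):
    """Turn a cell or header into a lower-case identifier (hypothetical helper)."""
    atom = re.sub(r"[^a-z0-9]+", "_", text.strip().lower()).strip("_")
    # Prolog-style atoms start with a lower-case letter; prefix anything else.
    return atom if atom and atom[0].isalpha() else "v_" + atom


def table_to_rules(header, rows):
    """Emit one clause per non-empty cell: column(row_subject, value).

    Assumption: the first column names the subject of each row and the
    remaining columns are properties of that subject.
    """
    rules = []
    for row in rows:
        subject = to_atom(row[0])
        for col_name, cell in zip(header[1:], row[1:]):
            cell = cell.strip()
            if not cell:
                continue  # skip empty cells
            # Keep plain numbers as they are; convert other text to atoms.
            is_number = re.fullmatch(r"-?\d+(\.\d+)?", cell) is not None
            value = cell if is_number else to_atom(cell)
            rules.append(f"{to_atom(col_name)}({subject}, {value}).")
    return rules


# Made-up example table, not taken from any real legal document.
header = ["Building class", "Max height (m)", "Fire exit required"]
rows = [
    ["Residential", "12", "no"],
    ["Commercial", "25", "yes"],
]

for rule in table_to_rules(header, rows):
    print(rule)
```

Real legal tables often contain merged cells, nested headers, and conditions spread over several columns, so a one-clause-per-cell mapping like this one would only be a starting point.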
None
Computer Science (303S.499, Lab)