The University of Auckland

Project #34: Development of Effective Question Generation model for Education using Natural Language Processing techniques



Examinations are unavoidable for all university students, and from our own anecdotal experience we see that many students rely on past examination papers as part of their study routine. Although past papers offer excellent practice for the exam itself, students are prone to studying the question style rather than the content of the course. Our team looks to explore the potential of question generation, using techniques from Machine Learning and Natural Language Processing/Generation, to dynamically generate “examinable questions” that cover the content of a given reference text such as a textbook or lecture excerpt. This would encourage students to retain the course content rather than memorise the question style of a particular exam.

Current question generation techniques look promising: Microsoft’s recent paper, Unified Language Model Pre-training for Natural Language Understanding and Generation (UniLM), reports state-of-the-art results on general question generation tasks. However, we believe there is potential for a question generation model tailored explicitly to student education. Our team therefore looks to investigate practical question formats for student education, with knowledge retention as a critical focus.

In conjunction with our investigation into question composition, we plan to utilise the Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset consisting of over 100,000 crowdsourced questions on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. Using this dataset as a starting point, we look to build upon existing NLP architectures to achieve meaningful question generation.
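To illustrate how SQuAD could feed a question generation model, the sketch below inverts a SQuAD-style record: the (context, answer) pair becomes the model input and the question becomes the generation target. The `<hl>` highlight tokens and the sample record are assumptions for illustration, not part of SQuAD itself, though highlighting the answer span is a common convention in question-generation fine-tuning.

```python
def to_qg_example(record):
    """Convert a SQuAD-style record into a (source, target) training pair.

    The answer span is marked inline with <hl> tokens so the model knows
    which part of the passage the generated question should target.
    """
    context = record["context"]
    answer = record["answers"]["text"][0]
    start = record["answers"]["answer_start"][0]
    highlighted = (
        context[:start]
        + "<hl> " + answer + " <hl>"
        + context[start + len(answer):]
    )
    return {"source": highlighted, "target": record["question"]}


# Hypothetical record mimicking SQuAD's field layout.
record = {
    "context": "The University of Auckland is located in New Zealand.",
    "question": "Where is the University of Auckland located?",
    "answers": {"text": ["New Zealand"], "answer_start": [41]},
}

pair = to_qg_example(record)
print(pair["source"])
# The University of Auckland is located in <hl> New Zealand <hl>.
print(pair["target"])
# Where is the University of Auckland located?
```

Pairs in this form could then be used to fine-tune a sequence-to-sequence model, with the highlighted passage as the encoder input and the question as the decoder target.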










Lab allocations have not been finalised