Examinations are unavoidable for university students. In our own anecdotal experience, many students rely on past examination papers as part of their study routine. Although past papers offer excellent practice for the exam itself, students who rely on them tend to study the question style rather than the content of the course. Our team aims to explore question generation using techniques from Machine Learning and Natural Language Processing/Generation to dynamically generate “examinable questions” that cover the contents of a given reference text, such as a textbook or lecture excerpt. This would enable students to retain information on the content rather than memorise the question style of a particular exam.
Current question generation techniques look promising: Microsoft’s recent paper, Unified Language Model Pre-training for Natural Language Understanding and Generation (UniLM), reports state-of-the-art results on general question generation tasks. However, we believe there is potential for a question generation model explicitly tailored to student education. Our team therefore plans to investigate practical question formats for student education, with knowledge retention as the critical focus.
In conjunction with our investigation into question composition, we plan to utilise the Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset consisting of over 100,000 questions crowdsourced on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. Using this dataset as a starting point, we aim to build upon existing NLP architectures to generate meaningful questions effectively.
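To make the dataset description concrete, the sketch below shows one way SQuAD-style records could be turned into training examples for question generation, where the passage and answer span form the input and the crowdsourced question is the target. It assumes the nested layout of the published SQuAD v1.1 JSON files (`data`, `paragraphs`, `context`, `qas`, `answers`, `answer_start`). The sample record and the function name `iter_generation_examples` are hypothetical and written only for illustration; they are not taken from the dataset or from the proposed model.

```python
# Hypothetical passage and answer, hand-written for illustration only.
CONTEXT = (
    "Photosynthesis converts light energy into chemical energy. "
    "It takes place in the chloroplasts."
)
ANSWER = "in the chloroplasts"

# A single record laid out like a SQuAD v1.1 file (assumed structure).
SAMPLE = {
    "version": "1.1",
    "data": [
        {
            "title": "Photosynthesis",
            "paragraphs": [
                {
                    "context": CONTEXT,
                    "qas": [
                        {
                            "id": "example-0001",
                            "question": "Where does photosynthesis take place?",
                            "answers": [
                                {
                                    "text": ANSWER,
                                    # Character offset of the answer in the passage.
                                    "answer_start": CONTEXT.index(ANSWER),
                                }
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}


def iter_generation_examples(squad):
    """Yield (context, answer span, target question) examples from SQuAD-style data."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa["answers"]:
                    start = answer["answer_start"]
                    end = start + len(answer["text"])
                    # Skip records whose offset does not match the answer text.
                    if context[start:end] != answer["text"]:
                        continue
                    yield {
                        "context": context,
                        "answer": answer["text"],
                        "answer_start": start,
                        "target_question": qa["question"],
                    }


examples = list(iter_generation_examples(SAMPLE))
for example in examples:
    print(example["answer"], "->", example["target_question"])
```

Framing each example this way reverses the usual reading comprehension direction: the answer span is given and the question is what a model would learn to produce. Whether answer-conditioned generation is the right setup for knowledge retention is one of the questions the investigation itself would need to settle.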
Undergraduate
An investigation into effective question composition for knowledge retention
An investigation into existing natural language question generation models
Proposal of a feasible question generation model that builds from the investigations
Development and training of the proposed machine learning model
Validation of proposed model results
Investigation and analysis into the effectiveness of the proposed model
None
Lab allocations have not been finalised