Part IV Project Management System

Description:

This project investigates current Speech-To-Text technologies (such as Amazon Transcribe,
GCP Cloud Speech, Azure Speech and others) with the aim of testing their viability for use with
the Libraries and Learning Services (LLS) audio-visual archive content. It includes the
development of functionality such as:
(1) Production of searchable text indexes from spoken audio content;
(2) Creation of subtitle files for optional display in video playback;
(3) Enabling jump-to-word playback through text and media timeline linking.
The project team will work closely with the LLS at the University of Auckland,
which will provide the necessary resources for the research.
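
The three functions listed above can all be derived from word-level timestamps. The following is a minimal sketch in Python, assuming the chosen Speech-To-Text service returns a start and end time (in seconds) for each recognised word. The `Word` type and the function names `build_index`, `srt_timestamp` and `to_srt` are hypothetical and introduced only for illustration; they are not part of any vendor API.

```python
from collections import defaultdict
from typing import NamedTuple


class Word(NamedTuple):
    """One recognised word with its position on the media timeline (assumed input shape)."""
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def build_index(words):
    """Map each lower-cased word to the start times at which it is spoken.

    The same mapping supports text search (1) and jump-to-word playback (3).
    """
    index = defaultdict(list)
    for w in words:
        key = w.text.lower().strip(".,!?")
        index[key].append(w.start)
    return dict(index)


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(words, max_words=7):
    """Group words into fixed-size cues and render them as SRT subtitle text (2)."""
    cues = []
    for n, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        span = f"{srt_timestamp(chunk[0].start)} --> {srt_timestamp(chunk[-1].end)}"
        text = " ".join(w.text for w in chunk)
        cues.append(f"{n}\n{span}\n{text}\n")
    return "\n".join(cues)
```

A media player could then seek to any time stored under `build_index(words)["ora"]` to jump to that word. Fixed-size cues are the simplest grouping; splitting on pauses or punctuation would likely read better but depends on what the service returns.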

Research Components:
(1) Assessing various speech-to-text technologies and weighing their costs against their benefits;
(2) Optimising these technologies to work with:
    + New Zealand (and other non-US/UK) English accents
    + The Māori language (if possible)
    + Variable-quality audio content (e.g., broadcast-quality materials vs. researcher field recordings)

Type:

Undergraduate

Outcome:

Implement a software tool for Speech-To-Text conversion and evaluate it on criteria such as accuracy, scalability and indexing quality.
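
Accuracy of a transcript is commonly reported as word error rate (WER): the word-level edit distance between a human reference transcript and the machine output, divided by the number of reference words. The project description does not name a metric, so the sketch below is one possible choice, and the function name `word_error_rate` is hypothetical. Text normalisation is limited to lower-casing and whitespace splitting.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Computing this per recording would allow a comparison across services, accents and audio quality levels. WER can exceed 1.0 when the hypothesis contains many inserted words.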

Prerequisites

None

Specialisations

Software Engineering

Supervisor

Jing Sun

Co-supervisor

Jiamou Liu

Team

Allocated (Not available for preferences)

Lab

Lab allocations have not been finalised

Project #134: Speech-To-Text Technologies for LLS Audio-Visual Archives