The University of Auckland

Project #134: Speech-To-Text Technologies for LLS Audio-Visual Archives

Description:

This project investigates current Speech-To-Text technologies (such as Amazon Transcribe,
GCP Cloud Speech, Azure Speech and others) to test whether they are viable for use with the
audio-visual archive content held by Libraries and Learning Services (LLS). It includes the
development of functionality such as:
(1) Production of searchable text indexes from spoken audio content;
(2) Creation of subtitle files for optional display in video playback;
(3) Enabling jump-to-word playback through text and media timeline linking.
The project team will work closely with LLS at the University of Auckland,
which will provide the necessary resources for the research.
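
As a rough illustration of how the three functions listed above could share one data source, the sketch below assumes the chosen service returns word-level timestamps (the major cloud speech services can typically be configured to do so). The `Word` record, the helper names and the fixed cue size of 7 words are hypothetical choices for illustration, not part of the project specification or of any vendor's API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Word:
    """One recognised word with its position on the media timeline (assumed shape)."""
    text: str
    start: float  # seconds from the start of the media
    end: float


def build_index(words):
    """Map each lower-cased word to the start times at which it is spoken.

    A player can seek to any of these times to give jump-to-word playback.
    """
    index = defaultdict(list)
    for w in words:
        key = w.text.lower().strip(".,!?;:")
        index[key].append(w.start)
    return dict(index)


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(words, max_words=7):
    """Group words into fixed-size cues and render them as SRT subtitle text."""
    cues = []
    for n, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        span = f"{srt_timestamp(chunk[0].start)} --> {srt_timestamp(chunk[-1].end)}"
        line = " ".join(w.text for w in chunk)
        cues.append(f"{n}\n{span}\n{line}\n")
    return "\n".join(cues)
```

Looking up `build_index(words)["archive"]` would return the list of times to seek to for that word, and the string from `to_srt(words)` could be saved as a `.srt` file alongside the video. A real tool would probably split cues on pauses or punctuation rather than on a fixed word count.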

Research Components:
(1) Research into assessing various speech-to-text technologies and weighing up their costs against their benefits;
(2) Research into optimising these technologies to work with:
+ New Zealand (and other non-US/UK) English accents
+ Māori language recognition (if possible)
+ Variable quality audio content (e.g., broadcast quality materials vs. researcher field recordings)

Outcome:

Implement a software tool for Speech-To-Text conversion, and evaluate the outcome in terms of accuracy, scalability, indexing, etc.
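
One common way to quantify transcription accuracy is word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the system output into a human-made reference transcript, divided by the number of words in the reference. The listing does not name a metric, so the sketch below is only one plausible starting point, and the function name and the simple lower-case, whitespace-based tokenisation are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein dynamic programme).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Running the same function over transcripts from each candidate service, grouped by accent, language and recording quality, would give comparable figures for the components listed above. Māori text may need extra normalisation (for example consistent macron handling) before comparison.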

Prerequisites

None

Specialisations

Categories

Supervisor

Co-supervisor

Team

Allocated (Not available for preferences)

Lab

Lab allocations have not been finalised