This project aims to design and develop a software tool based on real-time object detection and machine learning, giving visually impaired users a means to better identify and understand their surroundings. An initial framework based on YOLO (You Only Look Once), developed by students for a different project, will serve as a starting point.
Specifically, a visually impaired individual would point a camera (either a smartphone camera or an external one) around a room or environment, and the application would detect objects in view (e.g. people, computers, cars, road/street signs). The application could then verbally announce the identified objects to the individual, or try to infer what kind of room or space it is from those objects (e.g. office, road, kitchen, computer lab) using further machine learning. This would provide more context about the surroundings, e.g. the number of people and pieces of furniture in the room. The verbal announcements could be given on request, at set intervals, only for particular labelled objects (filtered), or continuously for every new object detected. Further potential use cases of such technology range from danger detection when walking busy streets to conveniences such as reading out a restaurant menu or a book. The research components to be explored include real-time spatial detection based on streaming data from the user's camera and speech synthesis of text. In addition, the project will investigate how to integrate cloud-based machine learning services into the software architecture.
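As a rough illustration of two of the behaviours described above, the sketch below shows how room inference from detected object labels and filtered announcements might look. It is a minimal, hypothetical sketch only: the `ROOM_HINTS` mapping, the function names, and the voting/filtering logic are illustrative assumptions, not part of the project specification, and a real implementation would sit downstream of a detector such as YOLO.

```python
# Hypothetical sketch (not the project's actual design): guess a room type
# from detected object labels, and decide which detections to announce.
from collections import Counter

# Illustrative mapping from object labels (as a detector might emit them)
# to candidate room/scene types; a real system would learn this mapping.
ROOM_HINTS = {
    "computer": {"office", "computer lab"},
    "oven": {"kitchen"},
    "sink": {"kitchen"},
    "car": {"road"},
    "traffic light": {"road"},
}

def guess_room(labels):
    """Return the scene type most consistent with the detected labels."""
    votes = Counter()
    for label in labels:
        for room in ROOM_HINTS.get(label, ()):
            votes[room] += 1
    if not votes:
        return "unknown"
    return votes.most_common(1)[0][0]

def filter_announcements(labels, allowed=None, already_seen=()):
    """Announce only newly seen objects, optionally restricted to a filter set."""
    seen = set(already_seen)
    announce = []
    for label in labels:
        if label in seen:
            continue  # skip objects already announced
        if allowed is not None and label not in allowed:
            continue  # user has filtered this label out
        seen.add(label)
        announce.append(label)
    return announce
```

For example, detections `["oven", "sink", "person"]` would yield the guess `"kitchen"`, and with a filter set of `{"car"}` the stream `["car", "person", "car"]` would trigger a single announcement of `"car"`.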
• A basic implementation of a mobile app with the functionalities specified in the project description.
• A set of scalable, algorithm-based cloud services.
Lab allocations have not been finalised.