Lead Author Type

CIS Masters Student


Dr. Jonathan Engelsma

Audio, Data Labeling, Machine Learning, Video Extraction, Datasets, Capture


Machine learning rests on two things: data and statistics. By feeding data into a computer and applying statistical methods, an application can learn the patterns underlying that data. With this in mind, we set out to create an application that helps people monitor and improve their dental hygiene by listening to them brush their teeth. During the course of the project, however, we found that not enough data existed to develop and carry out this specific objective. We therefore decided to create a tool to gather, label, and categorize audio files so that they can be used in any kind of machine learning project.

The tool was designed not only to tag any type of video or audio and let users extract their data, but also to protect the privacy of that information: all input files stay on the user's device, and only the extracted audio is sent to a protected service. The tool is compatible with both mobile devices and desktop computers. It can also be used in many other areas that involve audio detection, since it can easily be extended to suit a researcher's needs.