Validation of NVIDIA TAO's BPNet for Gait Feature Extraction with a Human Motion Dataset
Location
Loosemore Auditorium
Description
PURPOSE: The study aimed to develop and validate a platform for accurately inferring human body joint positions from 2D images using a deep learning framework. SUBJECTS: Participants included twenty-five healthy adults (seventeen males and eight females). METHODS AND MATERIALS: A 3D motion capture (VICON) system collected 3D ground-truth data while three 2D cameras recorded images from three views (0°, 45°, and 90°) as subjects performed gait trials. Annotated key points were extracted from the images to build a dataset for 2D feature detection, and neural network models, including NVIDIA TAO's BPNet, were trained on the COCO dataset and on the proposed dataset. Model accuracy was evaluated by comparing predicted joint positions with the VICON system's values. ANALYSIS: The performance metric used was Mean Per Joint Position Error (MPJPE). RESULTS: The MPJPE of the model trained with the proposed dataset (18.24 ± 14.38 mm) was comparable to that of the model trained with the COCO-2017 dataset (12.27 ± 11.27 mm). CONCLUSIONS: With only one-fourth of the training images, the proposed dataset achieved results comparable to the COCO dataset, with a significant reduction in computational time and resources. With the inclusion of more data, the proposed dataset has the potential for greater accuracy. The dataset's primary focus on gait and its accuracy in predicting key joint positions during normal gait movements make it unique compared with most existing human motion datasets. Potential applications include identifying a person by gait features, detecting player concussions non-invasively through temporal analysis during a game, and identifying pathological gait patterns.
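For readers unfamiliar with the evaluation metric, the sketch below shows how MPJPE is conventionally computed: the Euclidean distance between each predicted joint and its ground-truth position, averaged over joints and frames. The array shapes, variable names, and simulated data are illustrative assumptions, not the study's actual pipeline.

```python
# Minimal sketch of the MPJPE metric, assuming predicted and ground-truth
# joints are arrays of shape (num_frames, num_joints, 3) in millimetres.
import numpy as np

def mpjpe(predicted: np.ndarray, ground_truth: np.ndarray) -> tuple[float, float]:
    """Mean Per Joint Position Error: Euclidean distance between each
    predicted joint and its ground-truth position, averaged over all
    joints and frames. Returns (mean, std) in the input units (here, mm)."""
    errors = np.linalg.norm(predicted - ground_truth, axis=-1)  # per-joint distances
    return float(errors.mean()), float(errors.std())

# Hypothetical usage: 100 frames, 17 joints (a COCO-style skeleton).
gt = np.random.rand(100, 17, 3) * 1000            # simulated ground truth
pred = gt + np.random.randn(100, 17, 3) * 10      # simulated prediction error
mean_err, std_err = mpjpe(pred, gt)
print(f"MPJPE: {mean_err:.2f} ± {std_err:.2f} mm")
```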