Chronological Age Estimation Of An Individual Using Machine Learning Algorithms

Lead Author Type

MBI Masters Student


Dr. Guenter Tusch, tuschg@gvsu.edu

Background: Chronological age estimation can help eliminate child labor, solve criminal cases, and support administrative processes. The purpose of the present study is to use machine learning algorithms to estimate a person's age based on the relationship between chronological age and biochemical markers. Machine learning techniques are used extensively in biomedical applications. Ensemble methods improve the predictive performance of a given classification algorithm by taking a majority vote of several machine learning models. However, there has been limited investigation of machine learning algorithms for predicting a person's chronological age.

Methods and Results: The National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC) conducts regular surveys designed to provide national estimates of the health and nutritional status of the United States' civilian, non-institutionalized population aged two months and older. The National Health and Nutrition Examination Survey (NHANES) includes information obtained through standardized medical examinations and questionnaires. The NHANES 2001-02 data set was selected for the analysis. The data set contained 5,331 males and 5,708 females; the total number of cases aged 0-84 was N = 10,803. Participants were classified into two age groups, 12-17 (Young) and 18-25 (Old). Attributes were selected using information gain and the chi-square test. Alkaline Phosphatase, PCB Liquid Adjustment, and Dioxin showed the highest scores among all features. Several decision-tree-based algorithms (Random Tree, Bagging with Random Tree, and J48), Naïve Bayes (a simple probabilistic model that assumes the features are conditionally independent given the class), and OneR (a rule-based algorithm) were included in the analysis. The ensemble size was set to 2. The decision tree models based on the selected features showed better predictive performance (78.8% for males and 67.4% for females) than the other models.

Conclusion: This case study shows that machine learning algorithms can assist in estimating a person's age and highlights the strengths and weaknesses of different machine learning ensemble methods.
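
The attribute selection step mentioned in the abstract (ranking features by information gain against the Young/Old class) can be illustrated with a minimal sketch. It is not the study's actual pipeline: the marker names (marker_a, marker_b), their discretized levels, and the six toy records are hypothetical placeholders, not NHANES values, and the study's tooling and discretization are not stated in the abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: six records with the two age classes from the abstract.
age_group = ["Young", "Young", "Young", "Old", "Old", "Old"]

# Hypothetical discretized markers (assumed names and levels, for illustration only).
marker_a = ["high", "high", "high", "low", "low", "low"]  # separates the classes
marker_b = ["x", "y", "x", "y", "x", "y"]                 # nearly uninformative

features = {"marker_a": marker_a, "marker_b": marker_b}

# Rank features from most to least informative about the age class.
ranked = sorted(
    ((name, information_gain(values, age_group)) for name, values in features.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, gain in ranked:
    print(f"{name}: information gain = {gain:.3f} bits")
```

In this sketch a feature that splits the records cleanly into Young and Old receives the maximum gain of 1 bit for two balanced classes, while a feature whose levels are spread across both classes scores close to zero. Continuous laboratory values would first need to be discretized, which is an assumption of this example rather than a detail given in the abstract.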
