Data Mining Techniques Applied to the Hydrogen Lactose Breath Test

Lead Author Type: MBI Masters Student


Dr. Guenter Tusch

The hydrogen breath test is one of the most common tests currently used to assess the functional activity of the gut microbiome. Its main purpose is to characterize gut microbial functional activity by identifying patterns in exhaled hydrogen. These hydrogen patterns can be identified with statistical methods such as heat maps, principal component analysis, and clustering. My approach is based on a previously published research paper. Its authors used Matlab, whereas I use the R language, and my goal is to compare the two software environments. In addition, I produced an interactive 3D scatter plot with the plotly tool that takes all the variables in the dataset into account. The paper is the first to apply data mining techniques to hydrogen breath tests, so its conclusions are quite conservative, and research is ongoing to link these hydrogen patterns to the different sets of symptoms that occur with the metabolic activity of the gut flora. This work is an initial step that needs further research.
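
As a rough illustration of the clustering step named in the abstract, the sketch below groups a few hydrogen curves by shape using average-linkage agglomerative clustering. It is written in Python rather than the R or Matlab used in the work, and it is not the author's pipeline. The readings, the patient labels, the sampling times, and the function names are all invented for illustration.

```python
import math

# Hypothetical breath-hydrogen readings (ppm) at 0, 30, 60, 90, 120 minutes.
# These values are made up for illustration; they are NOT data from the study.
curves = {
    "p1": [5, 6, 8, 7, 6],      # flat curve
    "p2": [4, 5, 6, 6, 5],      # flat curve
    "p3": [6, 15, 40, 55, 60],  # rising curve
    "p4": [5, 12, 35, 50, 58],  # rising curve
}

def euclidean(a, b):
    """Euclidean distance between two equally sampled curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, data):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(data[i], data[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)

def agglomerative(data, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in data]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], data)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

groups = agglomerative(curves, 2)
print(groups)
```

With these invented values the flat curves and the rising curves end up in separate groups. On real data the number of clusters and the distance measure would need to be chosen and justified, which this sketch does not attempt.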

