Comparing the Performance of Scalpel to GATK-HaplotypeCaller Using Simulated Reads

Lead Author Type

MBI Masters Student


Dr. Guenter Tusch, tuschg@gvsu.edu

The ability to accurately call variants from next-generation sequencing (NGS) data is necessary for the success of NGS in clinical genomics. Therefore, there is a need for continuous, in-depth reporting on the accuracy of state-of-the-art variant calling algorithms. In this paper, the performance of two variant calling tools based on local de novo reassembly is benchmarked using a simulated dataset. The Genome Analysis Toolkit HaplotypeCaller (GATK-HC) is consistently reported to be one of the best-performing variant callers. Scalpel is a newer tool that has recently been reported to outperform GATK-HC in calling insertions and deletions (INDELs). The goal of this study is to provide an up-to-date, in-depth comparison of these two variant callers using a realistic simulated dataset. Simulated reads were generated using VarSim and ART, then aligned to a reference genome using BWA-MEM. Precision, recall, and F1-scores were calculated by comparing the variants called by GATK-HC and Scalpel to a truth set of variants using PrecisionFDA’s comparison tool. GATK-HC was observed to have higher precision and recall than Scalpel for both single nucleotide polymorphisms (SNPs) and INDELs.
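
The precision, recall, and F1-score named in the abstract can be illustrated with a minimal sketch. It is not the study's method: the study used PrecisionFDA's comparison tool, which may normalize and match variants differently. The sketch assumes that each variant is reduced to a (chromosome, position, reference allele, alternate allele) tuple and that a call counts as correct only on an exact match. The function name score_calls and the example variants are hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# Assumption: a variant is identified by (chrom, pos, ref, alt) and two
# variants match only if all four fields are identical.
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_calls(truth: Set[Variant], called: Set[Variant]) -> Metrics:
    """Score a call set against a truth set by exact variant matching."""
    tp = len(truth & called)   # called variants that are in the truth set
    fp = len(called - truth)   # called variants absent from the truth set
    fn = len(truth - called)   # truth variants that were not called

    # Guard against empty sets so the ratios are always defined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(precision, recall, f1)


# Hypothetical toy data: four true variants, three called correctly,
# one missed, and one false call.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "AT", "A"),      # deletion
    ("chr2", 300, "G", "GCC"),    # insertion
}
called = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "AT", "A"),
    ("chr3", 10, "T", "C"),       # not in the truth set
}

metrics = score_calls(truth, called)
print(metrics)
```

To report SNPs and INDELs separately, as the abstract does, the same function could be applied to each subset of variants, for example by splitting on whether the reference and alternate alleles have equal length.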
