Length and GC biases during sequencing library amplification
High-throughput sequencing has broadened our understanding of countless biological processes and led to scientific advancements in clinical therapeutics. With massively parallel sequencing becoming a staple in laboratories worldwide, we need to understand the limitations and biases of the technology. PCR amplification is an essential step in preparing a library for high-throughput sequencing. However, PCR introduces biases that depend on the GC content and length of the template, and these can skew the composition of the resulting library.
Dabney and Meyer (Biotechniques 52:87-94, 2012) have now compared ten commercially available PCR polymerase-buffer systems to determine the biases they introduce when sequencing both modern and ancient DNA. For modern DNA library preparation, Herculase II Fusion polymerase best maintained the GC content and length distribution of the library throughout 40 cycles of PCR, while Phusion polymerase in HF buffer introduced dramatic bias in both parameters relative to the original library. Neanderthal DNA is a limited resource and contains a high level of GC-rich microbial contamination. Of the ten polymerase-buffer systems tested for ancient DNA library preparation, AccuPrime Pfx produced the highest levels of endogenous sequences, while Phusion in HF buffer preferentially amplified the GC-rich microbial templates.
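To make the two measured parameters concrete, here is a minimal sketch of how a shift in GC content and fragment length between an unamplified and an amplified library could be summarized. It is an illustration only, not the analysis used by Dabney and Meyer; the function names (`gc_fraction`, `summarize`, `amplification_shift`) and the toy sequences are hypothetical.

```python
from statistics import mean


def gc_fraction(seq):
    """Fraction of G and C among the unambiguous A/C/G/T bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def summarize(reads):
    """Mean GC fraction and mean length for a list of sequences."""
    return {
        "mean_gc": mean(gc_fraction(r) for r in reads),
        "mean_length": mean(len(r) for r in reads),
    }


def amplification_shift(before, after):
    """Change in mean GC fraction and mean length after amplification.

    Positive values mean the amplified library is more GC-rich or longer
    than the starting library; values near zero indicate little bias.
    """
    b, a = summarize(before), summarize(after)
    return {
        "gc_shift": a["mean_gc"] - b["mean_gc"],
        "length_shift": a["mean_length"] - b["mean_length"],
    }


# Toy data (hypothetical): the AT-rich fragment drops out during amplification
# and a GC-rich fragment takes its place.
before = ["ATATATAT", "GCGCGCGCGCGC", "ATGCATGCAT"]
after = ["GCGCGCGCGCGC", "GCGCGCGCGCGC", "ATGCATGCAT"]
shift = amplification_shift(before, after)
```

In this toy case both shifts are positive, the pattern one would expect from a polymerase that favors GC-rich templates. A real comparison would use full distributions from sequenced libraries rather than means over a handful of fragments.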
PCR polymerases are a principal source of GC content and length bias in high-throughput library preparation. Other, uninvestigated sources of bias, such as thermocycling parameters, likely exist, but optimizing the polymerase-buffer system can help generate a library that accurately represents the starting material.
Desiree Boltz

