Quantitative analysis of synthesized nucleic acid pools

Abstract

Experimental evolution of RNA (or DNA) is a powerful method to isolate sequences with useful functions (e.g., catalytic RNA), discover fundamental features of the sequence-activity relationship (i.e., the fitness landscape), and map evolutionary pathways or functional optimization strategies. However, the limitations of current sequencing technology create a significant undersampling problem that impedes our ability to measure the true distribution of unique sequences. In addition, synthetic sequence pools contain a non-random distribution of nucleotides. Here, we present and analyze simple models to approximate the true sequence distribution. We also provide tools that compensate for sequencing errors and other biases that occur during sample processing. We describe our implementation of these algorithms in the Galaxy bioinformatics platform.

ICB Affiliated Authors

Authors
Xulvi-Brunet, R., Campbell, G. W., Rajamani, S., Jimenez, J. I., and Chen, I. A.
Date
Type
Peer-Reviewed Article
Journal
Nonlinear Dynamics in Biological Systems
Pages
19–41