Quantitative analysis of synthesized nucleic acid pools

Abstract

Experimental evolution of RNA (or DNA) is a powerful method to isolate sequences with useful function (e.g., catalytic RNA), discover fundamental features of the sequence-activity relationship (i.e., the fitness landscape), and map evolutionary pathways or functional optimization strategies. However, the limitations of current sequencing technology create a significant undersampling problem that impedes our ability to measure the true distribution of unique sequences. In addition, synthetic sequence pools contain a non-random distribution of nucleotides. Here, we present and analyze simple models to approximate the true sequence distribution. We also provide tools that compensate for sequencing errors and other biases that occur during sample processing. We describe our implementation of these algorithms in the Galaxy bioinformatics platform.

ICB Affiliated Authors

Irene Chen

Authors

Xulvi-Brunet, R., Campbell, G.W., Rajamani, S., Jimenez, J.I. and Chen, I.A.

Date

July 1, 2016

Type

Peer-Reviewed Article

Journal

Nonlinear Dynamics in Biological Systems

Pages

19–41

DOI

10.1007/978-3-319-33054-9_2