When the monomers added to a growing polymer chain depend on signals in the environment, such as the ion fluxes during an action potential, the polymer sequence stores a record of the environmental signal's variation over time, much like a ticker tape. DNA polymerases (DNAPs), enzymes that catalyze replication of DNA, possess nucleotide misincorporation probabilities that can be modulated by local ion concentrations, making them candidates for ion-sensitive molecular ticker tapes that encode signals into DNA strands in the form of base misincorporation patterns. For example, neural firing could be recorded by linking intracellular calcium concentration to polymerase misincorporation rates. In DNAP misincorporation-based recording, information is stored in the form of a string of copied nucleotides, which can be sequenced and compared to the known template sequence to identify the sites of misincorporations. Consequently, one can estimate the state of the environment (e.g. ion concentration) as a function of time, based on the observed misincorporation pattern.
A key problem for such biochemical ticker tape machines is that they may not have a high-fidelity clock. DNAPs do not add nucleotides at a constant rate: binding, catalysis, pausing, and dissociation from the template strand are thermally activated, stochastic processes. It is therefore necessary to address imperfect measurements of time in molecular ticker tapes.
To assess the feasibility of extracting information from molecular ticker tapes, we analyze a system in which multiple ion-sensitive DNAPs simultaneously replicate identical DNA template strands in the presence of a time-varying ion concentration signal. In this scenario, DNAPs add each successive copied nucleotide with an ion concentration-dependent misincorporation probability. Due to thermal fluctuations, the time at which the addition of a particular nucleotide occurs must be treated as a random variable. In the limit of a large ensemble of simultaneously replicated templates, a misincorporation probability distribution can be measured as a function of the index of the nucleotide. Here we study the problem of estimating the ion concentration signal as a function of time, based on observed misincorporation frequencies as a function of the nucleotide index.
Figure: Encoding and decoding of signals with a molecular ticker tape.
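The recording scenario described above can be made concrete with a minimal simulation sketch. It is an illustration under stated assumptions, not the model developed in this work: dwell times between nucleotide additions are taken to be independent and exponentially distributed (no separate pausing or dissociation states), and the misincorporation probability is taken to be linear in a normalized signal. The names `simulate_ticker_tape`, `step_signal`, `mean_dwell`, `p_low`, and `p_high` are hypothetical.

```python
import numpy as np

def simulate_ticker_tape(signal, n_strands=2000, n_nucleotides=200,
                         mean_dwell=0.05, p_low=0.01, p_high=0.10, seed=0):
    """Simulate misincorporation records from many polymerases copying one template.

    signal: function mapping an array of times (in seconds) to values in [0, 1],
            i.e. a normalized ion concentration (hypothetical parameterization).
    Returns (errors, times), both shaped (n_strands, n_nucleotides):
    errors[i, n] is True if strand i misincorporated at nucleotide n,
    times[i, n] is the time at which strand i added nucleotide n.
    """
    rng = np.random.default_rng(seed)
    # Assumption: independent exponential dwell times between additions.
    dwell = rng.exponential(mean_dwell, size=(n_strands, n_nucleotides))
    times = np.cumsum(dwell, axis=1)
    # Assumption: misincorporation probability is linear in the signal.
    p_err = p_low + (p_high - p_low) * signal(times)
    errors = rng.random((n_strands, n_nucleotides)) < p_err
    return errors, times

def step_signal(t, t_on=5.0):
    """Signal that switches from 0 to 1 at time t_on."""
    return (t >= t_on).astype(float)

errors, times = simulate_ticker_tape(step_signal)
# Misincorporation frequency as a function of nucleotide index.
freq = errors.mean(axis=0)
# Spread of addition times across strands, per nucleotide index.
timing_spread = times.std(axis=0)
```

In this sketch the sharp step in the signal appears as a smeared transition in `freq`, because `timing_spread` grows with the nucleotide index: strands reach a given index at increasingly different times.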
Our method for solving this inverse problem relies only on counting the total number of misincorporations as a function of position within the template. Therefore, it is directly compatible with current-generation short-read deep sequencing technologies, in conjunction with in silico sequence alignment algorithms (e.g. Smith-Waterman), which would be used to localize the short reads inside a long, high-complexity DNA template sequence. Note that assembly of the short reads into contiguous strands, representing the output of a single polymerase molecule, is not required. This is fortunate because distinct error-prone copies of templates with identical sequences will share a high degree of homology and therefore may be difficult to assemble.
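The counting step can be sketched as follows, assuming the short reads have already been localized on the template by an aligner. This is a simplified illustration: alignments are assumed to be gap-free, reads are assumed to be expressed in the same orientation as the template (i.e. already reverse-complemented if needed), and the `(start, read)` input format and the function name `misincorporation_frequency` are hypothetical.

```python
import numpy as np

def misincorporation_frequency(template, aligned_reads):
    """Per-position mismatch frequency from gap-free aligned short reads.

    aligned_reads: iterable of (start, read) pairs, where `start` is the
    0-based template offset of the read (hypothetical input format).
    Returns (frequency, coverage); frequency is NaN where coverage is zero.
    """
    n = len(template)
    mismatches = np.zeros(n, dtype=int)
    coverage = np.zeros(n, dtype=int)
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos < 0 or pos >= n:
                continue  # ignore bases that fall outside the template
            coverage[pos] += 1
            if base != template[pos]:
                mismatches[pos] += 1
    freq = np.full(n, np.nan)
    covered = coverage > 0
    freq[covered] = mismatches[covered] / coverage[covered]
    return freq, coverage

template = "ACGTACGTAC"
reads = [(0, "ACGTA"), (2, "GTTCG"), (5, "CGTAC"), (3, "TACG")]
freq, coverage = misincorporation_frequency(template, reads)
```

Each read contributes only to the positions it covers, so no read needs to be assigned to a particular polymerase molecule or assembled into a longer strand.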
What are the biochemical properties that a DNAP must possess in order to function as a molecular ticker tape recorder? To allow for faithful decoding of realistic input signals, a DNAP may require a favorable combination of parameters such as speed, pause probability, distribution of pause durations, and ion-dependent misincorporation rate. Likewise, it is unclear how many simultaneously replicated template strands are required for accurate decoding.
Here we address these statistical constraints on molecular ticker tapes by presenting (1) an intuitive theoretical framework, based on Fisher information theory, which quantifies the theoretical optimal precision for estimating the time-varying input signal from sequencing data as a function of relevant biochemical and experimental parameters, and (2) decoding algorithms to perform estimation of the time-varying input signal from sequencing data. The decoding algorithms rely on knowledge of the DNAP's kinetic parameters. When these parameters are unknown, we provide an algorithm to calibrate them from sequence data generated in the presence of known input signals. Simulations of the decoding algorithm are used to determine the effects of relevant experimental parameters on the actual decoding performance of the algorithms (as opposed to their effects on the theoretical optima). With a view towards potential neuroscience applications, we identify polymerase parameter sets and input signal characteristics for which molecular recording may be feasible, thereby providing guidelines for the experimental design and validation of molecular recording technologies.
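As a simple illustration of the kind of bound that Fisher information provides (a textbook calculation under simplifying assumptions, not the framework developed in this work), consider a single template position at which each of $N$ independently replicated strands misincorporates with the same probability $p$. The symbols $N$, $p$, $k$, and $\hat{p}$ are introduced here for illustration only.

```latex
% Number of observed misincorporations at one position across N strands
k \sim \mathrm{Binomial}(N, p)

% Fisher information about p carried by k
I(p) = \frac{N}{p\,(1 - p)}

% Cramer-Rao bound for any unbiased estimator of p
\mathrm{Var}(\hat{p}) \;\ge\; \frac{1}{I(p)} = \frac{p\,(1 - p)}{N}
```

Under these assumptions the best achievable standard deviation of the estimated misincorporation probability scales as $1/\sqrt{N}$, which is one reason the number of simultaneously replicated templates matters for decoding precision.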