Worldwide, over a dozen teams are developing visual prostheses, which aim to return some visual sensitivity to blind people by electrically stimulating the retina (Argus II Retinal Prosthesis System, Second Sight Medical Products Inc “Argus II”; Dowling,
2009; Zrenner et al.,
2011), lateral geniculate nucleus in the thalamus (Pezaris and Reid,
2007; Pezaris and Eskandar,
2009), or primary visual cortex (Brindley and Lewin,
1968; Dobelle and Mladejovsky,
1974; Dobelle et al.,
1974; Schmidt et al.,
1996; Bradley et al.,
2005; Tehovnik and Slocum,
2007). Localized stimulation in any of these regions along the early visual pathway can produce the percept of a small patch of light, referred to as a phosphene. Therefore, simultaneous stimulation at multiple locations via spatially separated electrodes can be used to construct an image (Dobelle et al.,
1976; Humayun et al.,
1999).
Three primary functional and practical benchmarks for a visual prosthesis are to enable unassisted navigation, to simplify object manipulation, and to facilitate object and shape recognition. Shape recognition includes identifying letters and words, which are critical first steps toward reading. Although reading is likely to be one of the most difficult goals to achieve with a visual prosthesis, it is highly desired and valued by people with low vision (Massof,
1998; Hazel et al.,
2000). While patients currently implanted with visual prostheses can identify letters and short words (da Cruz et al.,
2010; Stanga et al.,
2010), their performance is limited, and accurate recognition requires significant time and mental effort. Simulations of prosthetic reading in normally sighted participants may therefore allow extensive exploration and refinement of device requirements and possibilities. All previous approaches to simulating prosthetic reading have required converting a high-resolution video signal into a continually changing but low-spatial-resolution pattern of electrical stimulation, in which each pixel in the simulated image corresponds to an electrically generated phosphene (Cha et al.,
1992; Sommerhalder et al.,
2003,
2004; Dagnelie et al.,
2006; Fu et al.,
2006). Thus, these prosthetic reading methods require scanning small patches of text by moving the video camera (i.e., with the head or eyes). While this is similar to the eye movements that occur during normal reading, continual movements present three primary hurdles for reading with a visual prosthesis, associated with contrast polarity, spatial resolution, and temporal resolution.
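The conversion from a high-resolution frame to a coarse phosphene pattern can be illustrated with a minimal sketch. It assumes simple block averaging on a regular grid, which is only one possible reduction and is not claimed to be the method used in any of the studies cited above; the function name `downsample` and the `block` parameter are hypothetical.

```python
from typing import List

def downsample(frame: List[List[float]], block: int) -> List[List[float]]:
    """Average each block x block patch of a grayscale frame into one value.

    Each output value stands in for the brightness of one simulated phosphene.
    Rows and columns that do not fill a complete block are dropped.
    """
    rows, cols = len(frame), len(frame[0])
    out = []
    for r in range(0, rows - rows % block, block):
        out_row = []
        for c in range(0, cols - cols % block, block):
            total = sum(frame[r + dr][c + dc]
                        for dr in range(block)
                        for dc in range(block))
            out_row.append(total / (block * block))
        out.append(out_row)
    return out

# Toy 4x4 "frame" reduced to a 2x2 phosphene pattern.
frame = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
pattern = downsample(frame, block=2)
```

Because the output grid is so coarse, only a small patch of text fits in any single frame, which is why the camera must be scanned across the page.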
First, consider contrast polarity. Text on a screen or on paper is normally shown as black letters on a white background, so representing it directly with a prosthesis would require activating the majority of electrodes just to render the white background. This has the undesirable effects of increasing current leakage and power consumption in the device. Second, spatial resolution, or the number of pixels in the artificial image, is limited by the number of electrodes in the prosthetic device, with most systems expected to have just 60–1500 electrodes (Argus II; Zrenner et al.,
2011). Most studies aim to use at least 600 electrodes to represent approximately four letters, since this allows relatively normal reading speeds (Legge et al.,
1985; Cha et al.,
1992; Sommerhalder et al.,
2003). This is problematic, as a representation of four letters can only be achieved if the camera can be moved or zoomed so that its field of view closely matches the size of the text. Finally, temporal resolution is likely limited by both engineering and physiological considerations to as little as 2–20 Hz, which is lower than normal video refresh rates (Dobelle et al.,
1976; Dobelle,
2000; Perez Fornos et al.,
2010; Zrenner et al.,
2011). However, previous simulations have assumed that the prosthetic image is updated at video rates higher than 30 Hz, where motion blur is not a problem (Cha et al.,
1992; Sommerhalder et al.,
2003; Dagnelie et al.,
2006; Fornos et al.,
2011).
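The contrast polarity hurdle can be made concrete with a small sketch. It assumes an idealized device with one electrode per pixel, where an electrode is driven wherever the image is bright; the 7 x 9 bitmap of the letter "E" (including a one-pixel margin) and the function `active_fraction` are hypothetical and chosen only for illustration.

```python
# Hypothetical 7x9 bitmap of the letter "E" with a one-pixel margin.
# "1" marks ink (the letter stroke), "0" marks background.
GLYPH_E = [
    "0000000",
    "0111110",
    "0100000",
    "0100000",
    "0111100",
    "0100000",
    "0100000",
    "0111110",
    "0000000",
]

def active_fraction(glyph, light_on_dark):
    """Fraction of electrodes that must be driven to render the glyph.

    Assumes one electrode per pixel and that only bright pixels are driven.
    With dark-on-light text the background is bright; with light-on-dark
    text only the letter strokes are bright.
    """
    cells = [ch == "1" for row in glyph for ch in row]
    ink = sum(cells)
    bright = ink if light_on_dark else len(cells) - ink
    return bright / len(cells)

dark_on_light = active_fraction(GLYPH_E, light_on_dark=False)  # background driven
light_on_dark = active_fraction(GLYPH_E, light_on_dark=True)   # strokes driven
```

For this toy glyph, 45 of 63 electrodes are driven for black-on-white text but only 18 of 63 when the polarity is reversed. The exact proportions depend on font and margins, so the numbers indicate only the direction of the effect.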
Based on the high reading rates achievable with Rapid Serial Visual Presentation (RSVP) of isolated words (Gilbert,
1959; Forster,
1970), we wondered whether similar presentation methods would facilitate fast, accurate reading with a visual prosthesis. While simultaneously rendering an entire word with a few hundred randomly placed phosphenes is extremely difficult, rendering a single letter is simple (Figure ). Here, we designed a simulation for normally sighted people that allows us to evaluate a novel Single Letter Reading (SLR) method for use with bionic eyes. In contrast to previous RSVP methods, which present a whole word at a time, we use rapid, sequential presentation of single letters in a fixed foveal location. Reading accuracy and overall reading rate were assessed as we systematically varied font size, the presentation duration of individual letters, the gaps between letters, the spaces between words, and the degree of user control over letter and word presentation. When tested with isolated words and complete sentences, normally sighted, trained participants demonstrated lexical access and achieved reading rates of over 60 wpm and accuracies of over 90%. Naive participants with no previous exposure to SLR achieved average reading rates over 30 wpm and over 90% accuracy within a single testing session. While the lexical access, reading rates and accuracies we have observed should be sufficient to allow accurate comprehension (Pelli et al.,
1985; Whittaker and Lovie-Kitchin,
1993; Coltheart et al.,
2001), this was not directly assessed. Therefore, future work will examine how different presentation rates and methods affect the comprehension of longer passages of text. We anticipate that SLR will facilitate accurate and efficient reading with any prosthetic visual device, as the method is not greatly affected by limitations in spatial or temporal resolution.
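The relationship between the SLR timing parameters and the overall reading rate can be sketched as follows. This is a minimal illustration under stated assumptions, not the presentation software used in the study: the names `SLRTiming`, `schedule`, and `words_per_minute` are hypothetical, the timing values are arbitrary, and a word space is assumed to follow every word, including the last.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class SLRTiming:
    letter_ms: float  # how long each letter is shown
    gap_ms: float     # blank interval between letters within a word
    space_ms: float   # blank interval after each word

def schedule(text: str, timing: SLRTiming) -> Tuple[List[Tuple[str, float, float]], float]:
    """Return ([(letter, onset_ms, offset_ms), ...], total_duration_ms).

    Letters are shown one at a time in a single location, so the events
    never overlap in time.
    """
    events = []
    t = 0.0
    for word in text.split():
        for i, letter in enumerate(word):
            events.append((letter, t, t + timing.letter_ms))
            t += timing.letter_ms
            if i < len(word) - 1:
                t += timing.gap_ms  # gap only between letters of the same word
        t += timing.space_ms        # assumed: a space follows every word
    return events, t

def words_per_minute(text: str, timing: SLRTiming) -> float:
    """Overall reading rate implied by the schedule, in words per minute."""
    _, total_ms = schedule(text, timing)
    if total_ms == 0:
        return 0.0
    return len(text.split()) * 60_000.0 / total_ms

# Arbitrary illustrative values, not taken from the experiments.
timing = SLRTiming(letter_ms=100.0, gap_ms=50.0, space_ms=200.0)
events, total_ms = schedule("the cat", timing)
rate = words_per_minute("the cat", timing)
```

With these illustrative values each three-letter word occupies 600 ms, which corresponds to 100 wpm. Lengthening any of the three intervals lowers the rate, so the parameters trade reading speed against the time available to recognize each letter.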