Brain, Vol. 122, No. 11, 2119-2132,
November 1999
© 1999 Oxford University Press
Invited review
Dynamics of letter string perception in the human occipitotemporal cortex
1 Brain Research Unit, Low Temperature Laboratory, Helsinki University of Technology, Finland, 2 Physiology Department, Oxford University and 3 Psychology Department, Newcastle University, Newcastle upon Tyne, UK
Correspondence to:
Antti Tarkiainen, Brain Research Unit, Low Temperature Laboratory, Helsinki University of Technology, PO Box 2200, Fin-02015 HUT, Finland E-mail: att{at}neuro.hut.fi
| Abstract |
|---|
The inferior occipitotemporal brain areas, especially in the left hemisphere, have been shown to be involved in the processing of written words and letter strings. This processing probably occurs within 200 ms after presentation of the letter string. It has also been suggested that this activation may differ between fluent and dyslexic readers. Using whole-head magnetoencephalography, we studied the spatiotemporal dynamics of brain processes evoked by visually presented letter strings in 12 healthy adult subjects. Our achromatic stimuli consisted of rectangular patches in which single letters, two-letter syllables, four-letter words, or symbol strings of equal length were embedded and to which variable noise was added. This manipulation dissociated three different response patterns. The first of these patterns took place ~100 ms after stimulus onset, originated in areas surrounding the V1 cortex and was distributed along the ventral visual stream, extending laterally as far as V4v. This response was systematically modulated by noise but was insensitive to the stimulus content, suggesting involvement in early visual analysis. The second pattern took place ~150 ms after stimulus onset and was concentrated in the inferior occipitotemporal region with left-hemisphere dominance. This activation showed a preference for letter strings, and its strength and timing correlated with the speed at which the subjects were able to read words aloud. The third pattern also occurred in the time window ~150 ms after stimulus onset, but originated mainly in the right occipital area. Like the second pattern, it was modulated by string length, but showed no preference for letters compared with symbols. The present data strongly support the special role of the left inferior occipitotemporal cortex in visual word processing within 200 ms after stimulus onset.
Keywords: visual cortex; letter string; magnetoencephalography; word reading; stimulus degradation
Abbreviations: BA = Brodmann area; ECD = equivalent current dipole; EOG = electro-oculogram; fMRI = functional MRI; MEG = magnetoencephalography; SQUID = superconducting quantum interference device
| Introduction |
|---|
The importance of the lateral occipital cortex for processing visually presented words and letters has been revealed by lesion studies. For example, left occipital lesions can cause pure alexia (Ajax, 1967
PET and fMRI measure the changes in blood flow that are triggered by activity within large populations of neurons. Subtractions of the flow patterns observed under different experimental conditions are used to locate the brain areas that are presumably associated with particular cognitive subprocesses. However, neither technique has the temporal resolution necessary to uncover the time course of events within the neuronal networks.
In trying to understand visual word recognition, the ability to follow in time the sequence of cortical events is particularly helpful. For example, one might expect to find that brain areas responsible for the processing of letter strings are active before those areas that are involved in lexical–semantic processing. Accordingly, Nobre and colleagues and Allison and colleagues performed intracranial recordings that unequivocally identified responses to letter strings in the inferior temporal sulcus/fusiform gyrus bilaterally (Allison et al., 1994; Nobre et al., 1994). According to these studies, letter string-specific activation peaked ~150–200 ms after stimulus onset, and was followed ~200 ms later by semantically sensitive activation in the medial temporal areas.
Magnetoencephalography (MEG) is well suited for studies of language processing because it allows non-invasive brain recordings in neurologically normal subjects with good spatial resolution and excellent temporal resolution. Salmelin and colleagues and Kuriki and colleagues found MEG responses consistent with word/letter string-specific tuning in the occipitotemporal/extrastriate cortex (Salmelin et al., 1996; Kuriki et al., 1998). Since the timing of these responses was similar to that reported by Nobre and colleagues in epileptic patients (Nobre et al., 1994), it is plausible that both the MEG studies and the intracranial recordings detected signals generated in the same cortical areas.
The crucial role of occipitotemporal areas in word recognition is further supported by observations in subjects with developmental disorders of reading. Salmelin and colleagues found diminished and delayed left occipitotemporal activation in developmentally dyslexic subjects during silent reading compared with healthy control subjects (Salmelin et al., 1996). Their data suggest that left occipitotemporal mechanisms may be critical for fluent, automatized word recognition. If so, it is important to characterize such responses in both time and space, because they may provide an objective means for evaluating reading abilities.
As MEG allows the detection of small changes in the timing and strength of brain responses, we decided to base our approach on the well-known behavioural effects of stimulus degradation: decreasing stimulus visibility increases subjects' reaction times in word-reading and lexical decision tasks (e.g. Besner and Smith, 1992; Holcomb, 1993). Accordingly, we expected to see systematic changes in the amplitude and/or latency of letter string-specific activity as the visibility of the stimuli was reduced.
In the present study, we determined (i) whether letter string-specific MEG responses can be identified with systematic locations and time windows across subjects, and (ii) whether there is any correlation between such early MEG responses and reaction times for reading the stimuli aloud.
| Material and methods |
|---|
Subjects
Twelve healthy, right-handed, Finnish-speaking adults (four females, eight males) gave their informed consent to participation in this study. Their ages ranged from 21 to 42 years (mean age 29 years). They were all university students or graduates, and had normal visual acuity.
Stimuli
Subjects were presented with four categories of stimulus: (i) pure Gaussian noise; (ii) single-element stimuli: single letters (one of 25 different letters) or geometrical symbols (a diamond, triangle or square); (iii) two-element stimuli: two-letter Finnish syllables (25 different syllables) or two-item symbol strings (four different combinations of two symbols: a circle, diamond, triangle or square); and (iv) four-element stimuli: four-letter Finnish words (50 different words) or four-item symbol strings (four different combinations of four symbols). All word stimuli were common Finnish nouns, e.g. RAHA (money), TALO (house) and VELI (brother).
Letters/syllables/words were embedded in one of four levels of Gaussian noise labelled 0, 8, 16 or 24 (for details see Appendix I). Symbol strings were always presented without noise (level 0), and served as controls for letter strings of equivalent length. Figure 1A shows examples of the different stimuli.
The noise levels were selected so that at zero (level 0) or low (level 8) noise subjects could see letters, syllables and words easily. Moderate noise (level 16) made identification of letter strings more difficult and high noise (level 24) made it extremely difficult.
In two subjects we repeated word measurements with both normal and distorted text in which we perturbed letter spacing and orientation. Although the distorted text looked word-like, these novel symbol strings were not readable in the same way as words. Figure 1B illustrates the appearance of the distorted letter strings.
Magnetoencephalography
When a large population of neurons becomes simultaneously active, the small postsynaptic currents in parallel-oriented pyramidal nerve cells create a net magnetic field that can be measured outside the head using SQUID (superconducting quantum interference device) sensors. From the distribution of the measured magnetic fields, the location, orientation and strength of the underlying currents can be estimated using equivalent current dipoles (ECDs), thereby giving a satisfactory representation of activity within that area. Thus, by measuring its associated magnetic field patterns, the underlying brain activity can be located in time and, with some restrictions, also in space. (For a thorough review of magnetoencephalography, see Hämäläinen et al., 1993.)
In a magnetically shielded room, we measured the magnetic fields generated by the subjects' cortical activity. We used a Neuromag-122™ neuromagnetometer (Neuromag Ltd, Helsinki, Finland), which employs 122 SQUID sensors arranged in a helmet-shaped array (Ahonen et al., 1993). The planar gradiometers of the device detect maximum signal just above the activated cortical area. Because of the approximately spherical symmetry of the human brain, MEG is most sensitive to currents tangential to the skull, and the main contribution of MEG signal thus arises from neurons within the fissural cortex.
Alignment of MEG and anatomical data
First, we determined a head coordinate system, to which the coordinate systems for both the MEG measurement and the subject's MRIs were aligned. The head coordinate system was defined in relation to three anatomical landmarks: the nasion and points just anterior to the left and right ear canals, easily localizable also on the subjects' structural MRIs. The x-axis of the head coordinate system runs through the points located in front of the ear canals, with positive values towards the right side. The y-axis passes through the nasion and runs perpendicular to the x-axis from the back of the head (negative) to the front (positive). The z-axis runs perpendicular to both the x- and the y-axes. It passes through the origin defined by the intersection of the x- and y-axes, with positive direction towards the top of the head.
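The axis definitions above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: it assumes the three landmarks are given as 3D points in some common device frame, and it takes the origin to be the foot of the perpendicular dropped from the nasion onto the ear-to-ear line. The function names `head_frame` and `to_head` are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def head_frame(nasion, left_ear, right_ear):
    """Build the head frame (origin, x, y, z) from the three landmarks."""
    # x-axis: through the two pre-auricular points, positive to the right.
    x_axis = unit(sub(right_ear, left_ear))
    # Origin (assumption): projection of the nasion onto the ear-to-ear line,
    # so that the y-axis through the nasion is perpendicular to the x-axis.
    t = dot(sub(nasion, left_ear), x_axis)
    origin = tuple(p + t * x for p, x in zip(left_ear, x_axis))
    # y-axis: from the origin towards the nasion (front positive).
    y_axis = unit(sub(nasion, origin))
    # z-axis: perpendicular to both, positive towards the top of the head.
    z_axis = cross(x_axis, y_axis)
    return origin, x_axis, y_axis, z_axis

def to_head(point, frame):
    """Express a device-frame point in head coordinates."""
    origin, x_axis, y_axis, z_axis = frame
    d = sub(point, origin)
    return (dot(d, x_axis), dot(d, y_axis), dot(d, z_axis))
```

With the frame in hand, any coil or sensor position measured in the device frame can be passed through `to_head` to obtain head coordinates.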
Before the MEG measurement, small coils were attached to the subject's head and the locations of the coils were determined with a 3D digitizer (Isotrak 3S1002; Polhemus Navigation Sciences, Colchester, Vt., USA) together with the three anatomical landmarks defined above. Once the subject was sitting with his or her head inside the measurement helmet of the neuromagnetometer, a small electric current was fed to the coils to induce a measurable magnetic field pattern. This allowed the coils to be located with respect to the neuromagnetometer. Since the coil locations were also known in head coordinates, all MEG measurements could be transformed onto the head coordinate system. Furthermore, since the head coordinate system could be mapped onto the subject's structural MRIs (using the nasion and the left and right ear canals), individual MEG responses could also be mapped onto the subject's structural MRIs.
Procedure
During MEG measurement, subjects sat in a dimly lit, magnetically shielded room. The stimuli were generated by a Macintosh Quadra 840AV computer and projected (Sony LCD Data Projector VCL-350QM) onto a screen ~1 m in front of the subjects. Stimuli appeared on the screen in a centrally placed rectangular patch (~5° × 2°). All images were shown on a large background of uniform grey. The grey level of the background was set to 150 on a scale of 0–255 from black to white, based on the mean level of the different stimulus images, to keep the luminance level relatively constant and to reduce the eye stress caused by long periods of viewing of the stimuli.
The subjects' MEG responses were recorded in four experimental conditions: (i) pure noise (levels 0, 8, 16 and 24), i.e. four stimulus types; (ii) single letters in noise (levels 0, 8, 16 and 24) plus single symbols (no noise), i.e. five stimulus types; (iii) syllables in noise (levels 0, 8, 16 and 24) plus double symbols (no noise), i.e. five stimulus types; and (iv) words in noise (levels 0, 8, 16 and 24) plus four symbols (no noise), i.e. five stimulus types.
All stimulus types (letter strings at different noise levels and noiseless symbol strings) appeared with equal probability. The order of stimulus presentation was randomized within conditions but was the same for all subjects. For words, extra care was taken to ensure that at least 30 s elapsed between consecutive presentations of the same word in order to reduce repetition effects. Measurements were carried out on two days so that responses to pure noise and words were always measured on one day and responses to letters and syllables on another day.
Each experimental condition involved a 20–30 min MEG recording session, which was divided into four blocks with intervening pauses of 1–3 min to allow the subjects to rest. During a recording session, each stimulus was displayed for 60 ms with a 2 s interstimulus interval. MEG signals were passband-filtered to 0.03–120 Hz, sampled at 397 Hz and averaged on-line in separate bins, one for each stimulus type. Signal averaging began 0.2 s prior to stimulus onset and continued for 0.8 s after stimulus onset. The horizontal and vertical electro-oculograms (EOG) were monitored continuously and epochs contaminated by eye-blinks and eye movements were excluded from the on-line averages. To achieve an acceptable signal-to-noise ratio, a minimum of 70 trials was averaged for each bin, though typically this total exceeded 100.
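The binning and rejection logic described above might look roughly like the following sketch. It is only an illustration: the text does not give the EOG rejection threshold, so `eog_limit_uv` is a hypothetical parameter, and each epoch is reduced to a single channel for brevity.

```python
from collections import defaultdict

def average_epochs(epochs, eog_limit_uv=150.0):
    """Average epochs per stimulus type, skipping EOG-contaminated ones.

    epochs: iterable of (stimulus_type, samples, eog_peak_uv), where samples
    is a list of MEG values for one channel and eog_peak_uv is the largest
    EOG deflection in that epoch. The threshold value is an assumption.
    """
    sums = {}
    counts = defaultdict(int)
    for stim, samples, eog_peak_uv in epochs:
        if abs(eog_peak_uv) > eog_limit_uv:
            continue  # eye blink or eye movement: exclude from the average
        if stim not in sums:
            sums[stim] = [0.0] * len(samples)
        sums[stim] = [s + v for s, v in zip(sums[stim], samples)]
        counts[stim] += 1
    averages = {stim: [s / counts[stim] for s in total]
                for stim, total in sums.items()}
    return averages, dict(counts)
```

The returned counts make it easy to check that each bin reached the minimum of 70 accepted trials.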
During a recording session, subjects were instructed to fixate the central region of the projection screen and to pay attention to the stimuli. Every so often a question mark appeared for 2 s (1.5% probability), prompting the subject to report the preceding stimulus. These 'probe' trials ensured that subjects maintained concentration. The responses to the question mark and to the stimulus following it were excluded from the averages. The pure noise condition contained no probe trials. Since the recording session was 20% shorter for this condition, it was easier for the subjects to maintain concentration.
Reaction times
Seven out of the 12 subjects took part in a behavioural word pronunciation task. We presented half of the same stimuli as had appeared in the word condition during the MEG recording session but with the interstimulus interval increased to 3 s to allow enough time for the vocalization. The test was conducted in the magnetically shielded room to make sure that the conditions were identical to the corresponding MEG measurement. Subjects were instructed to read the words aloud as quickly as possible. If they could not identify a word, they were instructed to remain silent. The reaction times to different stimuli were measured using the signal recorded from a microphone attached close to the subject's face and the EMG signal measured from two electrodes placed in the opposite corners of the subject's mouth, and analysed off-line. Subjects' responses were recorded with a DAT (digital audio tape) recorder and evaluated for correctness.
Data analysis
Averaged MEG responses were digitally low-pass filtered at 40 Hz. The baseline for the signals was calculated over the period 200 to 0 ms before stimulus onset.
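As a rough illustration of the baseline step, the sketch below subtracts the mean of the pre-stimulus samples from an averaged waveform and also returns the baseline standard deviation, which the later selection criteria rely on. It assumes the waveform starts 0.2 s before stimulus onset and is sampled at 397 Hz, as stated for the recordings; the choice of the population standard deviation is an assumption, and the low-pass filtering is not shown.

```python
import statistics

SAMPLING_HZ = 397.0   # sampling rate stated for the recordings
PRESTIM_S = 0.2       # averaging began 0.2 s before stimulus onset

def baseline_correct(waveform, sampling_hz=SAMPLING_HZ, prestim_s=PRESTIM_S):
    """Return (baseline-corrected waveform, baseline standard deviation)."""
    n_base = int(round(prestim_s * sampling_hz))  # number of pre-stimulus samples
    baseline = waveform[:n_base]
    mean = statistics.fmean(baseline)
    corrected = [v - mean for v in waveform]
    # Population SD of the baseline (assumption: the text does not say which).
    sd = statistics.pstdev(baseline)
    return corrected, sd
```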
For signal analysis, the shape of the conducting volume, i.e. the brain (Hämäläinen and Sarvas, 1989), has to be defined. In our studies we approximated the brain as a spherically symmetrical conductor. In each subject, the posterior part of the brain was modelled by a sphere adjusted to the local curvature with the help of the subject's structural MRIs. This model was then used when estimating the source areas in the time window 0–300 ms after stimulus onset, because this early activity was concentrated in the occipital and occipitotemporal regions. After 300 ms, activity was seen mainly in the temporal cortices. These source areas were analysed separately using a sphere model adjusted to the curvature of the temporal regions. In four subjects, we also used a realistically shaped conductor model consisting of ~1200 triangles to test the validity of the spherically symmetrical conductor model. The results from the spherical and realistically shaped conductor models were in good agreement, thus justifying the use of the mathematically simpler spherical model for the bulk of our analyses (for details, see Appendix II).
Equivalent current dipoles representing active source areas were determined using the data from a minimum of six sensor pairs surrounding the local magnetic signal maximum, at time points when visual inspection revealed clear dipolar field patterns with minimum interference from other active brain areas. If it was necessary to scrutinize the field patterns with the activity of specific source areas removed for clarity, we employed the signal space projection method (Uusitalo and Ilmoniemi, 1997). For the early time interval 0–300 ms, 5–10 source areas were identified in each subject. For the later time interval, up to 600 ms, 0–6 mainly temporal sources were found for each subject. Because the active source areas were similar in the different stimulus conditions, we were able to select a single set of ECDs for each subject which gave a good account of the data in all conditions. This enabled the direct comparison of dipole amplitudes and latencies within and across the different conditions.
| Results |
|---|
Averaged MEG signals
As illustrated in Fig. 2
Figure 2A
To draw further conclusions from the spatial locations and behaviour of the source areas responsible for the two response patterns mentioned above, they had to be reliably identified from among all the other sources. To do this, we set certain objective criteria describing the behaviour seen in the averaged MEG signals. These criteria were then applied to the amplitude waveforms of all ECDs in all subjects. We were able to classify the sources responsible for the early noise-sensitive behaviour into a category called Type I and the sources responsible for the later word-responsive behaviour into a category called Type II. The use of selection criteria enabled us to identify also a third pattern (Type III), which had not been readily recognizable from the averaged MEG responses.
Analysis of early signals (0–300 ms)
Type I activation
We called the early activation, which increased with increasing noise level, the Type I response. To be included in this category a source had to fulfil all the following criteria. In the pure noise condition, (i) a clear activation peak was found at the highest noise level; (ii) the source waveforms showed a systematic increase in peak amplitudes as a function of noise: noise 0 < noise 8 ≤ noise 16 ≤ noise 24, i.e. activity at noise level 0 was significantly (P < 0.05) smaller than that at higher noise levels (a difference of at least 1.96 times the baseline standard deviation); (iii) peak latencies at the highest noise levels were similar to or shorter than those at the lowest noise levels.
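The three Type I criteria can be written as a simple predicate over the peak measures of one source. The sketch below is an interpretation under stated assumptions, not the authors' procedure: a "clear peak" is taken to mean an amplitude of at least 1.96 baseline standard deviations, the "systematic increase" is read as non-decreasing amplitudes across levels 8, 16 and 24, and "similar" latencies are allowed a hypothetical tolerance `latency_tol_ms`.

```python
def is_type_i(amps, lats, baseline_sd, latency_tol_ms=5.0):
    """Check the Type I criteria for one source in the pure noise condition.

    amps, lats: dicts keyed by noise level (0, 8, 16, 24) holding peak
    amplitude (nAm) and peak latency (ms). lats[level] may be None when
    no clear peak was found. The tolerance is an assumption.
    """
    z = 1.96 * baseline_sd
    # (i) clear activation peak at the highest noise level
    clear_peak = lats[24] is not None and amps[24] >= z
    # (ii) level 0 significantly smaller than every higher level, and
    #      amplitudes not decreasing from level 8 to 16 to 24 (our reading)
    separated = all(amps[level] - amps[0] >= z for level in (8, 16, 24))
    non_decreasing = amps[8] <= amps[16] <= amps[24]
    # (iii) latency at the highest level similar to or shorter than at low levels
    low = [lats[level] for level in (0, 8) if lats[level] is not None]
    latency_ok = lats[24] is not None and all(
        lats[24] <= lat + latency_tol_ms for lat in low)
    return clear_peak and separated and non_decreasing and latency_ok
```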
ECDs that fulfilled Type I criteria were found in 10 out of 12 subjects. Results in nine of these 10 subjects were quite consistent, showing Type I activity that peaked within 125 ms after stimulus onset (in pure noise level 24 condition). In one subject, the Type I responses were clearly delayed (peaks at 140–145 ms) and showed quite strong activation even for the noiseless condition, raising doubt as to whether his activity represented neural processing that was functionally similar to that found in the other subjects. Therefore, we set an upper limit of 130 ms for Type I peak latencies (in pure noise level 24 condition), excluding the Type I sources of this one subject. A total of 16 Type I sources from nine subjects were accepted for further analysis.
Figure 3 illustrates how the peak amplitudes and latencies of Type I sources, averaged over all nine subjects, were modified across all experimental conditions. The peak amplitudes are expressed relative to the pure noise level 24 condition. If no clear peak was apparent (usually in the pure noise level 0 condition, i.e. after presentation of an evenly grey rectangle), the baseline standard deviation was used to define peak amplitude, and no peak latency was obtained for that situation. The mean ± standard error of the mean peak latency for the pure noise level 24 condition was 107 ± 4 ms and the mean amplitude 15 ± 3 nAm (nanoampere-metres) (with mean baseline standard deviation 0.8 nAm).
Consistent with the criteria used to define Type I activity, Fig. 3A
Type II activation
Type I activity was followed by a response that was strongest for visible words. This later activity, which we call Type II, is clearly a candidate for letter string-specific activation. To be included in this category, ECDs had to fulfil all the following criteria for peak amplitudes and latencies. They had (i) a clear peak in the words 0 condition which had longer latency than Type I activity in the same subject, (ii) significantly stronger activation and/or shorter latency (by at least 5 ms, which is twice the sampling interval) for words at noise level 0 than at noise level 24, (iii) significantly more activity for words than for pure noise patches at noise level 0, and (iv) stronger activity and/or shorter latency for words at noise level 0 than for four-item symbol strings. Thus, the classification was based on word and pure noise conditions.
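The four Type II criteria lend themselves to the same kind of predicate. The sketch below is illustrative only. It assumes that "significantly" means a difference of at least 1.96 baseline standard deviations (the threshold stated for Type I), applies that same threshold to criterion (iv), and uses the 5 ms latency shift given in the text. The `Peak` type and the function names are hypothetical.

```python
from typing import NamedTuple, Optional

class Peak(NamedTuple):
    amplitude: float           # peak amplitude in nAm
    latency: Optional[float]   # peak latency in ms; None when no clear peak

def stronger_or_faster(a, b, z, min_shift_ms=5.0):
    """True if peak a is stronger than b by z, or earlier by min_shift_ms."""
    stronger = a.amplitude - b.amplitude >= z
    faster = (a.latency is not None and b.latency is not None
              and b.latency - a.latency >= min_shift_ms)
    return stronger or faster

def is_type_ii(words0, words24, noise0, symbols4, type_i_latency, baseline_sd):
    """Check criteria (i)-(iv) for one source (thresholds are assumptions)."""
    z = 1.96 * baseline_sd
    # (i) clear peak for noiseless words, later than Type I in the same subject
    c1 = (words0.latency is not None and words0.amplitude >= z
          and words0.latency > type_i_latency)
    # (ii) words at noise 0 stronger and/or faster than words at noise 24
    c2 = stronger_or_faster(words0, words24, z)
    # (iii) words at noise 0 stronger than pure noise patches at noise 0
    c3 = words0.amplitude - noise0.amplitude >= z
    # (iv) words at noise 0 stronger and/or faster than four-item symbol strings
    c4 = stronger_or_faster(words0, symbols4, z)
    return c1 and c2 and c3 and c4
```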
We identified at least one Type II dipole in 11 out of 12 subjects. The first Type II sources usually peaked ~130–150 ms after stimulus onset (words 0 condition) with no clear exceptions. However, some later Type II sources were also observed. To include all the Type II sources that we thought were likely to represent the same kind of neural processing, we set the upper limit of 180 ms for Type II peak latencies (in the words 0 condition). The total number of accepted Type II sources, gathered from 11 subjects, was 15.
Mean Type II peak amplitudes (relative to noiseless words) and latencies across all subjects are illustrated in Fig. 4. For the noiseless word condition the mean amplitude was 18 ± 2 nAm and mean latency 143 ± 3 ms.
Even though we demanded only that the activation to words at noise level 0 was stronger than the activation to pure noise at noise level 0, Fig. 4A
The effect of increasing noise on Type II activation in the word condition was variable across individuals. In some subjects the peak latency was delayed, in others the signal amplitude was decreased (at least for the highest noise level) and in the remainder both effects were observed. Interestingly, some subjects showed first a small increase in Type II amplitudes from noise level 0 to levels 8 and 16 and then a clear decrease for the highest noise level. On the other hand, in some subjects the responses decreased monotonically with noise.
A two-factor repeated measures ANOVA of Type II amplitudes showed significant main effects of stimulus (four levels: pure noise, letters, syllables and words) and noise (four levels: 0, 8, 16 and 24) [F(3,42) = 33.9, P < 0.0001 and F(3,42) = 5.5, P < 0.005, respectively]. The two-way interaction stimulus × noise was also significant [F(9,126) = 8.2, P < 0.0001].
As with the Type I pattern, latency modulation was not as clear as amplitude modulation. Nevertheless, the fastest responses were measured to low-noise words (levels 0 and 8) and responses were delayed for symbols and high-noise words (levels 16 and 24) (Fig. 4B). Note also that the peak latencies of noise-free letter strings decreased systematically as a function of string length (P < 0.01; paired t test: letters versus words), whereas the peak latencies for symbol strings remained at ~155 ms (P = 0.38; paired t test: one-item versus four-item symbol strings).
The locations of all Type II dipoles are shown in Fig. 4C. (It should be noted that two of these dipoles, both of which were located in the right hemisphere, could also be classified as Type I sources.) In seven of these 11 subjects, all Type II sources were located in the left hemisphere close to the border of the temporal and occipital cortices, 36 ± 2 mm from the midline. Three subjects showed bilateral Type II sources and one subject showed a Type II source in the right hemisphere only. For this source, criterion (iv) was only barely fulfilled.
Type III activation
Only a minority of all occipital sources showed modulation consistent with classification as Type I or Type II. Most of the remaining sources were active at some point after stimulus presentation but did not show any systematic stimulus-related modulation. However, we were able to isolate a third pattern of stimulus-dependent behaviour (Type III), which was not immediately obvious from the averaged data. Many sources showed a clear peak for noiseless words but did not fulfil all the criteria for Type II classification. Nevertheless, some of these sources did show modulation according to string length. We classified such Type III activity as showing (i) a clear peak for words at noise level 0, (ii) significantly larger amplitude for four-item symbol strings than for one-item symbol strings, and (iii) failure to fulfil Type II classification. The time window for Type III sources was set to be exactly the same as for Type II sources, i.e. Type III activity had to occur after Type I activity but before 180 ms after stimulus onset. Equivalent current dipoles satisfying all these criteria were found in nine out of the 12 subjects. The total number of Type III sources was 15.
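A compact sketch of the Type III rule follows. As with the other criteria, the thresholds are assumptions: a "clear peak" and a "significantly larger" amplitude are both taken to mean at least 1.96 baseline standard deviations, and the 180 ms limit is treated as inclusive. The flag `already_type_ii` is a hypothetical input carrying the outcome of the Type II classification for the same source.

```python
def is_type_iii(words0_amp, words0_lat, symbols1_amp, symbols4_amp,
                type_i_latency, baseline_sd, already_type_ii,
                max_latency_ms=180.0):
    """Check the Type III criteria for one source (thresholds are assumptions).

    Amplitudes are in nAm and latencies in ms; words0_lat may be None
    when no clear peak was found for noiseless words.
    """
    z = 1.96 * baseline_sd
    # (i) clear peak for words at noise level 0
    clear_peak = words0_lat is not None and words0_amp >= z
    # (ii) four-item symbol strings significantly stronger than one-item ones
    length_effect = symbols4_amp - symbols1_amp >= z
    # (iii) the source must have failed the Type II classification
    not_type_ii = not already_type_ii
    # Same time window as Type II: after Type I, no later than 180 ms
    in_window = (words0_lat is not None
                 and type_i_latency < words0_lat <= max_latency_ms)
    return clear_peak and length_effect and not_type_ii and in_window
```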
Figure 5 shows the mean peak amplitudes (relative to four-item symbol strings) and latencies of Type III dipoles averaged across all subjects. The mean amplitude in the four-item symbol string condition was 18 ± 3 nAm and the mean latency 154 ± 5 ms. Type III amplitudes increased with the length of the stimulus string (Fig. 5A), as with Type II sources, but letters did not evoke stronger activity than symbols. This was confirmed by a two-factor (stimulus type: letter string versus symbol string; string length: 1, 2 and 4 elements) repeated measures ANOVA of peak amplitudes. The main effect of string length was significant [F(2,28) = 19.2, P < 0.0001], whereas the main effect of stimulus type was not [F(1,14) = 0.01, P > 0.5].
The estimated locations of Type III sources are shown in Fig. 5C
Considerations about multiple sources
Multiple Type I, Type II and Type III sources were identified in five, three and six subjects, respectively. In principle, if multiple sources acted independently of each other, they could be included in a global analysis without biasing the results. In practice, it is difficult to determine the independence of multiple sources. Therefore, we applied either an amplitude or a latency criterion in order to select just one Type I, Type II and Type III source per individual. The details of these criteria are given in Appendix III. To test whether analyses based on the complete data set were likely to be biased compared with analyses based on either of the restricted data sets, we carried out a one between groups (selection criterion: all, amplitude, or latency), two repeated measures ANOVA (four levels of stimulus; four levels of noise) of Type I, Type II and Type III amplitudes. In each case the main effect of selection criterion was not significant [F(2,31) = 0.34, P > 0.5, F(2,34) = 0.42, P > 0.5 and F(2,30) = 0.02, P > 0.5, respectively]. These results suggest that our data were not significantly distorted by the problem of source independence.
Some effect of selection criterion can be seen with Type III source locations (Fig. 5C). When only the dipole showing the largest difference in strength between one-item and four-item symbol strings was selected for each subject (Appendix III, Amplitude criterion), the dipoles were located more uniformly in the right occipital cortex. This selection did not, however, markedly change the peak modulation shown in Fig. 5 and the interpretations based on that.
Letter-like symbols
The effect of manipulating the symbol type on Type II activity was studied in two subjects. To do this we repeated the word measurements with letter-like symbols (Fig. 1B). Figure 6 shows the amplitude waveforms of these subjects' Type II sources for geometric and letter-like symbols and for two repetitions of responses to words. Since the magnetic field patterns measured using the control stimuli were similar to those measured originally, the same sets of dipoles were used to explain the data. The activity evoked by distorted text was stronger than that for geometric symbols and showed a closer resemblance to, though it was not identical with, the activity evoked by words at noise level 0. These control measurements also illustrate the reproducibility of the responses to words, at least in these two subjects.
Analysis of late signals (300–600 ms)
Inspection of the averaged data showed that most of the signals occurring >300 ms after stimulus presentation were located in the temporal and frontal brain areas. As many as six new ECDs were identified per subject (mean 3.3) to account for the magnetic field patterns. Of these, a total of 22 sources fulfilled selection criteria for Type II behaviour (11 subjects, one to four sources for each subject). As can be seen in Fig. 7
Reaction times in the behavioural pronunciation task
Figure 8
The subjects' mean reaction times to words at noise levels 0 and 8 were very similar. At noise level 16, the reaction times were delayed despite almost perfect identification (93 ± 3% correct). At noise level 24, identification was noticeably impaired (19 ± 7% correct) and subjects' reaction times were markedly longer (200 ± 50 ms) than at noise level 16. The shape of the reaction time curve in Fig. 8
We attempted to explain the mean reaction times using the averaged (across seven subjects) Type II peak latency or amplitude modulation alone. For latency modulation we used the differences relative to the noiseless words condition, and for amplitude modulation the inverse of amplitudes relative to the noiseless words condition. Both models were scaled so that they matched the measured reaction times at noise levels 0 and 24. As shown in Fig. 8, neither of these approaches gave a good account of all the measured reaction times; using Type II response onset latencies instead of peak latencies did not improve the situation. On the other hand, we were able to create a combined measure that resulted in an almost perfect fit to the reaction times (Fig. 8). This successful measure combined both the peak amplitude and the latency modulation of the Type II ECDs: it was calculated by dividing the mean latency difference (relative to noiseless words) by the mean relative amplitude (relative to noiseless words) raised to the power of 3.5. When analysed on the individual level, it was possible in one or two cases to achieve a reasonably good fit to the reaction time curve by using only the Type II peak amplitudes. However, as seen in Fig. 8, the best results were generally attained by combining the noise level-dependent amplitude modulation with latency modulation.
The composite measure we created has no obvious physiological basis. Intuitively, it results from the individual variation in the trade-off between amplitude and latency modulation. Therefore, we do not expect the exact formula to be of significance, but we emphasize the idea that interaction of activation strength with timing correlates with behavioural measures.
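For concreteness, the composite measure can be sketched as below. The formula (latency difference divided by relative amplitude raised to the power 3.5) is taken from the text; the linear rescaling that pins the prediction to the measured reaction times at noise levels 0 and 24 is an assumption about how the scaling was done, and the function names are hypothetical.

```python
def composite(latency_diff_ms, relative_amplitude, exponent=3.5):
    """Latency difference divided by relative amplitude to the power 3.5."""
    return latency_diff_ms / relative_amplitude ** exponent

def predict_reaction_times(latency_diff, rel_amp, rt_level0, rt_level24):
    """Scale the composite measure to match reaction times at levels 0 and 24.

    latency_diff, rel_amp: dicts keyed by noise level (0, 8, 16, 24), both
    expressed relative to the noiseless words condition. The linear scaling
    between the two anchor points is an assumption, not stated in the text.
    """
    c = {level: composite(latency_diff[level], rel_amp[level])
         for level in latency_diff}
    scale = (rt_level24 - rt_level0) / (c[24] - c[0])
    return {level: rt_level0 + scale * (c[level] - c[0]) for level in c}
```

Because the latency difference is zero and the relative amplitude is one for noiseless words, the composite is zero at noise level 0, so the prediction there equals the measured reaction time by construction.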
| Discussion |
|---|
In this MEG study, we investigated the early cortical processing of visually presented letter strings the visibility of which was systematically varied. We isolated three systematic patterns of stimulus-modulated activity distinguished by their behaviour, anatomical location and timing. The effects were seen either in the activation strength or in the timing, or in both, on an individual basis.
Type I activity occurred ~80–130 ms after stimulus onset and originated bilaterally in occipital areas bordering V1 and extending laterally as far as V4v. The amplitude of Type I responses increased both with increasing noise, i.e. with increasing luminance contrast of adjacent pixels in a noise patch (Fig. 1), and with the number of elements in letter/symbol strings embedded in the patches. However, Type I sources did not show object specificity as the amplitudes were the same for letter and symbol strings of equal length. Moreover, at the highest noise levels, all stimulus types evoked equally strong Type I responses. This non-specific pattern of behaviour suggests that Type I activity reflects the kind of low-level processing that is common to all visual stimuli, such as the extraction of oriented contrast borders. Our finding is in line with functional imaging studies reporting increased activation in areas V1, V2 and V3 in response to scrambled objects compared with clearly delineated objects, probably due to additional luminance contrast borders created by scrambling (Allison et al., 1994; Malach et al., 1995; Grill-Spector et al., 1998).
The second distinct response pattern, Type II, occurred within 180 ms after stimulus onset and was found in the occipitotemporal cortex with left-hemisphere dominance. Unlike Type I activity, Type II responses were diminished at the highest noise level. This suggests that Type II responses reflect the processing of stimulus attributes more complex than oriented contrast borders, for example. Type II activity was strongest for letter strings, especially for words, and was clearly reduced for geometric symbols. We also found that Type II responses to strings of rotated letters were quite similar to those evoked by words. Together, these findings suggest that Type II activity may not be specific for words per se. Instead, it may reflect processing of multi-element strings whose image characteristics resemble letter strings. We suggest that Type II signals may even reflect the activity of a 'module' that acts as an interface between the visual and language domains, rather like a complex filter. It is likely that the properties of such a filter would be developed, or tuned, by the extensive exposure to printed words that skilled readers experience.
If Type II activity is specifically associated with the processing of letter strings, it is likely that damage to the underlying population of neurons will affect the fluency of reading. This idea is supported by studies of patients with pure alexia, who have to resort to slow letter-by-letter reading (e.g. Warrington and Shallice, 1980; Coslett and Saffran, 1989; Behrmann et al., 1998). In cases where pure alexia is observed without hemianopia, the critical lesion site has been localized either to the left inferior occipitotemporal region (Ajax, 1967; Greenblatt, 1973; Henderson et al., 1985), coincident with the origin of Type II activity, or to areas subjacent to the left angular gyrus (Greenblatt, 1976).
The noise-dependent behaviour of Type II activation was strongly correlated with the subjects' reaction times in the word pronunciation task. Bearing in mind that this result is based on only four data points, the fact that we found such a good fit suggests that the subjects' behavioural responses to visually presented words may depend directly on Type II neural processing; it may represent a key rate-limiting step.
The third response pattern (Type III activity) was not as readily detectable in the 122-sensor data sets as Type I and Type II activity. Nevertheless, it showed enough systematic stimulus-related modulation to be identified reliably. Type III activity was located mainly in the right extrastriate cortex, unlike Type II activity. Although the time course of activation was similar to Type II, Type III activity showed no specificity for stimulus type (i.e. letter versus symbol strings). The activation strengths of Type III sources were modulated by string length. One possible interpretation of Type III activation is that it represents some attribute of object processing that is common to both letter and symbol strings but is not a necessary component for the visual–language filter properties that we propose for the left occipitotemporal cortex.
Generally speaking, we found reasonable agreement between the locations of letter string-specific Type II sources and studies that suggest that extrastriate regions are involved in word- and letter string-specific processing (e.g. Petersen et al., 1988, 1989, 1990; Price et al., 1994, 1996; Puce et al., 1996; Pugh et al., 1996; Salmelin et al., 1996; Rumsey et al., 1997; Kuriki et al., 1998). In particular, the location of the present Type II sources in the left hemisphere agrees with our earlier MEG results of single-word reading (Salmelin et al., 1996). Furthermore, the time window around 150 ms, in which we observed Type II and Type III sources, agrees with those reported elsewhere (Allison et al., 1994; Nobre et al., 1994; Salmelin et al., 1996; Kuriki et al., 1998) (Appendix IV). It is worth noting that Allison and colleagues have reported specific responses to such stimulus categories as letters, numbers and faces within 200 ms after stimulus onset, and originating in the inferior temporo-occipital cortex (Allison et al., 1994). The time window around 150–200 ms may thus be particularly important in the analysis of visual objects exceptionally relevant to human behaviour.
Büchel and colleagues recently reported activation of the left basal posterior temporal lobe [Brodmann area (BA) 37] in a visual reading task for sighted subjects as well as in a tactile reading task for congenitally blind and late-blind subjects (Büchel et al., 1998). Interestingly, they also stated that developmental dyslexics, when compared with control subjects, showed reduced activation in the same area. A similar difference between fluent and dyslexic readers was also reported by Salmelin and colleagues in the left inferior temporo-occipital cortex (Salmelin et al., 1996). It is thus plausible that our letter string-specific Type II activity may be associated with the BA 37 activity reported by Büchel and colleagues (Büchel et al., 1998).
Closer comparisons between our data and those in the published literature reveal interesting discrepancies. In part this may be due to differences between experimental paradigms and in part to small errors in source localization or uncertainty in the exact source locations. Nevertheless, the left medial extrastriate response, specific for word-like stimuli, that was reported by two other groups (Petersen et al., 1990; Pugh et al., 1996) seems to be more medial than our Type II sources. Interestingly, the existence of the medial extrastriate word-specific source has been questioned recently by Indefrey and colleagues, who suggested that medial occipital responses are associated with the length and not with the lexicality of the strings (Indefrey et al., 1997). On the other hand, when presented with pseudo-words some of the subjects of Indefrey and colleagues showed left-lateralized activity at the occipitotemporal junction and along the posterior superior temporal sulcus, which was not observed with false fonts. These observations are in good agreement with our results: Type I and Type III amplitudes, originating in areas close to the midline, increased with string length but were not specific for letters, whereas the letter-specific Type II sources were located more laterally in the left occipitotemporal cortex.
Manipulating the visibility of a word naturally affects not only the orthographic prelexical stage but also other subcomponents of reading after this stage. Accordingly, we also identified brain areas sensitive to the visibility of words in the more anterior brain regions. Specifically, the left superior temporal cortex displayed a preference for clearly visible words compared with symbol strings and heavily degraded words. The signals in this area peaked ~300 ms after stimulus presentation for non-degraded words. As the same cortical area has been shown to be involved in analysis of word meaning between 250 and 350 ms after word presentation (Helenius et al., 1998), it is likely that these signals reflect semantic analysis.
In conclusion, we report the existence of an early, letter string-specific MEG response, identified reliably in 11 out of 12 subjects. These responses took place within 200 ms after stimulus presentation and were concentrated mainly in the left inferior occipitotemporal regions. The level of noise masking the visually presented letter strings affected the time course and strength of letter string-specific activation. This modulation was correlated with the subjects' reaction times in the word pronunciation task. The present data speak strongly for a special role for the left inferior occipitotemporal region in the neural processing of letter strings.
We suggest that the neural population underlying this response represents a mechanism that acts as an interface between visual and language domains. The properties of such a mechanism are likely to be developed with constant exposure to printed text. The analysis of this letter string-specific response may eventually provide a tool for evaluating the acquisition and fluency of reading.
| Appendix I |
|---|
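The per-pixel noise procedure set out in this appendix can be sketched in Python as follows. This is a minimal illustration, not the original stimulus-generation code. It assumes that the noise range is N = √V × 64, which reproduces the listed value pairs (V = 0.0625 gives N = 16), that the old luminance already lies within the grey-level range, and that the new luminance is left as a floating-point value; any rounding to integer grey levels is not specified in the text. The function names and the seed parameter are hypothetical.

```python
import random

GREY_MIN, GREY_MAX = 0, 63   # grey-level range used for the stimuli
GREY_RANGE = 64              # number of grey levels


def noise_range(v):
    """Noise range N for a variance measure V (assumed N = sqrt(V) * 64)."""
    return (v ** 0.5) * GREY_RANGE


def add_noise_to_pixel(old_luminance, n, rng):
    """Add Gaussian noise R * N; redraw R until the result is within range."""
    while True:
        r = rng.gauss(0.0, 1.0)              # step (i): zero mean, unit variance
        new_luminance = old_luminance + r * n  # steps (ii) and (iii)
        if GREY_MIN <= new_luminance <= GREY_MAX:
            return new_luminance


def add_noise_to_image(image, v, seed=0):
    """Apply the per-pixel procedure to a 2D list of grey levels."""
    rng = random.Random(seed)
    n = noise_range(v)
    return [[add_noise_to_pixel(p, n, rng) for p in row] for row in image]


# Hypothetical uniform mid-grey patch, noise level N = 24 (V = 0.1406).
patch = [[32] * 10 for _ in range(10)]
noisy_patch = add_noise_to_image(patch, 0.1406, seed=1)
```

Redrawing out-of-range values, rather than clipping them, avoids piling up pixels at grey levels 0 and 63, which is why the resulting distribution resembles a truncated Gaussian.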
The following algorithm was used to generate the 2D noise in the stimuli. For each pixel the steps were as follows. (i) A random number, R, was picked from a Gaussian distribution of zero mean and unit variance. (ii) The random number, R, was multiplied by a variable called noise range, N. The noise range was calculated as: N = √V × 64, where 64 is the range of grey levels and V corresponds approximately to a measure of the variance of the noise distribution (as a proportion). V took the values 0, 0.0156, 0.0625 and 0.1406, and N the values 0, 8, 16 and 24, respectively. (iii) The new grey level value for the pixel was calculated as: new luminance = old luminance + (R × N). As V increased, so the width of the Gaussian distribution increased in proportion to its square root. For any pixel, if the new luminance value fell outside the grey level range of 0–63, then the algorithm was repeated for that pixel. A new random number was selected until the new luminance value fell inside the range. Thus, the final distribution looked like a real Gaussian that had been truncated at 0 and 63.
| Appendix II |
|---|
The validity of the spherically symmetrical conductor model (adjusted to the curvature of the posterior parts of the brain) was tested with a realistically shaped conductor model in four subjects. Each of the individually constructed realistically shaped models consisted of ~1200 triangles. For our comparisons we estimated ECDs in the occipital regions from the same magnetic field patterns using both spherical and realistically shaped conductor models. We restricted the comparison between the models to occipital regions because the differences between conductor models are expected to be greater for the occipital cortex than for the temporal cortex (based on computer simulations done in our laboratory). Only dipoles showing similar goodness-of-fit values for the two conductor models were considered. Across the four subjects, the mean distance between dipole locations for the spherical and realistically shaped models was 4 ± 1 mm. For dipoles located in the left hemisphere at least 15 mm from the midline (x coordinate < −15 mm), the mean difference was 3 ± 1 mm. For right hemisphere dipoles at least 15 mm from the midline (x coordinate > 15 mm) the mean location difference was 3 ± 1 mm, and for dipoles close to the midline (|x coordinate| < 15 mm) the mean difference was 5 ± 2 mm. The mean difference in amplitude for all accepted dipoles was 2.1 ± 0.4 nAm and the mean difference in dipole orientation, calculated in a tangential plane relative to the centre of the spherical model, was 3 ± 1 degrees.
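The location comparison between the two conductor models can be illustrated with a short Python sketch. It assumes that the location difference is the Euclidean distance between the two dipole positions and that a dipole is assigned to a group by its x coordinate in the spherical model, with negative x on the left; the text does not state which model's coordinate was used, nor how a dipole exactly 15 mm from the midline was classified. The function names and all coordinates are hypothetical.

```python
import math
from statistics import mean


def location_difference_mm(spherical_xyz, realistic_xyz):
    """Euclidean distance (mm) between the two estimates of one dipole."""
    return math.dist(spherical_xyz, realistic_xyz)


def laterality_group(x_mm, threshold_mm=15.0):
    """Classify a dipole by its x coordinate (negative x = left hemisphere)."""
    if x_mm < -threshold_mm:
        return "left"
    if x_mm > threshold_mm:
        return "right"
    return "midline"


# Hypothetical dipole pairs: (spherical model, realistically shaped model), in mm.
dipole_pairs = [
    ((-32.0, -70.0, 40.0), (-30.0, -72.0, 41.0)),
    ((28.0, -75.0, 45.0), (29.0, -73.0, 47.0)),
    ((4.0, -80.0, 50.0), (7.0, -84.0, 50.0)),
]

# Collect location differences per group, then average within each group.
differences = {}
for spherical, realistic in dipole_pairs:
    group = laterality_group(spherical[0])
    differences.setdefault(group, []).append(
        location_difference_mm(spherical, realistic))

mean_difference_mm = {group: mean(values)
                      for group, values in differences.items()}
```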
| Appendix III |
|---|
If several Type I, Type II or Type III dipoles were observed for one subject, only the best of these dipoles was selected for some of the analyses that were done. The selection was based on the dipole peak amplitudes or latencies according to the following criteria.
Latency criterion
- Type I: the dipole showing the earliest peak for pure noise 24.
- Type II: the dipole showing the earliest peak for words 0.
- Type III: the dipole showing the earliest peak for four-item symbol strings.
Amplitude criterion
- Type I: the dipole showing highest value for (pure noise 24 − pure noise 0)/(baseline standard deviation).
- Type II: the dipole showing highest value for (words 0 − four-symbol strings)/(baseline standard deviation).
- Type III: the dipole showing highest value for (four-symbol strings − one-symbol strings)/(baseline standard deviation).
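The selection criteria above can be sketched in Python as follows. The sketch reads each amplitude criterion as the difference between the peak amplitudes in two conditions, divided by the baseline standard deviation. The dictionary layout, the condition keys and all values are hypothetical and serve only to illustrate the selection logic.

```python
def amplitude_score(peak_a, peak_b, baseline_sd):
    """(condition A peak - condition B peak) / baseline standard deviation."""
    return (peak_a - peak_b) / baseline_sd


def select_by_amplitude(dipoles, condition_a, condition_b):
    """Return the dipole with the highest amplitude score."""
    return max(
        dipoles,
        key=lambda d: amplitude_score(
            d["peak_amplitude"][condition_a],
            d["peak_amplitude"][condition_b],
            d["baseline_sd"],
        ),
    )


def select_by_latency(dipoles, condition):
    """Return the dipole with the earliest peak in the given condition."""
    return min(dipoles, key=lambda d: d["peak_latency"][condition])


# Hypothetical Type II dipoles of one subject (amplitudes in nAm, latencies in ms).
type_ii_dipoles = [
    {"name": "dipole A", "baseline_sd": 2.0,
     "peak_amplitude": {"words_0": 30.0, "four_symbols": 12.0},
     "peak_latency": {"words_0": 155.0}},
    {"name": "dipole B", "baseline_sd": 1.0,
     "peak_amplitude": {"words_0": 20.0, "four_symbols": 14.0},
     "peak_latency": {"words_0": 148.0}},
]

best_by_amplitude = select_by_amplitude(type_ii_dipoles, "words_0", "four_symbols")
best_by_latency = select_by_latency(type_ii_dipoles, "words_0")
```

In this made-up example the two criteria pick different dipoles, which shows why the criterion in use needs to be stated for each analysis.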
| Appendix IV |
|---|
Since publication of the 1996 paper by Salmelin and colleagues (Salmelin et al., 1996
After correction of the stimulus system-induced delay, the timing of our activations would seem to differ from that in, for example, the papers by Nobre and colleagues (Nobre et al., 1994) and Allison and colleagues (Allison et al., 1994); while they report a letter string-specific response ~200 ms after word onset, we now report a functionally and spatially similar response ~150 ms after word onset. However, our words were short and very familiar, which may have speeded up the cortical response. It should also be noted that, although delays in stimulus systems are quite common, they are not always known or taken into account, which may result in some variance between the results obtained by different groups.
| Acknowledgments |
|---|
We wish to thank Riitta Hari and Katri Kiviniemi for comments on the manuscript and Kimmo Uutela for comments and expert technical help. This work was supported by the Academy of Finland, the Human Frontier Science Program, the EC Human Capital and Mobility Program [through the Neuro-BIRCH II (large scale facility of neuromagnetism) facility in Helsinki] and the Oxford MRC IRC (Interdisciplinary Research Centre) for Cognitive Neuroscience. The MRI scans were obtained at the Department of Radiology, Helsinki University Central Hospital.
| References |
|---|
Ahonen AI, Hämäläinen MS, Kajola MJ, Knuutila JET, Laine PP, Lounasmaa OV, et al. 122-Channel SQUID instrument for investigating the magnetic signals from the human brain. Physica Scripta 1993; T49: 198–205.
Ajax ET. Dyslexia without agraphia. Arch Neurol 1967; 17: 645–52.
Allison T, McCarthy G, Nobre A, Puce A, Belger A. Human extrastriate visual cortex and the perception of faces, words, numbers, and colors. [Review]. Cereb Cortex 1994; 4: 544–54.
Behrmann M, Plaut DC, Nelson J. A literature review and new data supporting an interactive account of letter-by-letter reading. In: Coltheart M, editor. Pure alexia (letter-by-letter reading). Hove (UK): Psychology Press; 1998. p. 7–51.
Besner D, Smith MC. Models of visual word recognition: when obscuring the stimulus yields a clearer view. J Exp Psychol Learn Mem Cogn 1992; 18: 468–82.
Bookheimer SY, Zeffiro TA, Blaxton T, Gaillard W, Theodore W. Regional cerebral blood flow during object naming and word reading. Hum Brain Mapp 1995; 3: 93–106.
Büchel C, Price C, Friston K. A multimodal language region in the ventral visual pathway. Nature 1998; 394: 274–7.
Coslett HB, Saffran EM. Evidence for preserved reading in 'pure alexia'. Brain 1989; 112: 327–59.
Damasio AR, Damasio H. The anatomic basis of pure alexia. Neurology 1983; 33: 1573–83.
Greenblatt SH. Alexia without agraphia or hemianopsia. Anatomical analysis of an autopsied case. Brain 1973; 96: 307–16.
Greenblatt SH. Subangular alexia without agraphia or hemianopsia. Brain Lang 1976; 3: 229–45.
Grill-Spector K, Kushnir T, Hendler T, Edelman S, Itzchak Y, Malach R. A sequence of object-processing stages revealed by fMRI in the human occipital lobe. Hum Brain Mapp 1998; 6: 316–28.
Hämäläinen MS, Sarvas J. Realistic conductivity geometry model of the human head for interpretation of neuromagnetic data. IEEE Trans Biomed Eng 1989; 36: 165–71.
Hämäläinen M, Hari R, Ilmoniemi RJ, Knuutila J, Lounasmaa OV. Magnetoencephalography – theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev Mod Phys 1993; 65: 413–97.
Helenius P, Salmelin R, Service E, Connolly JF. Distinct time courses of word and context comprehension in the left temporal cortex. Brain 1998; 121: 1133–42.
Henderson VW. Anatomy of posterior pathways in reading: a reassessment. [Review]. Brain Lang 1986; 29: 119–33.
Henderson VW, Friedman RB, Teng EL, Weiner JM. Left hemisphere pathways in reading: inferences from pure alexia without hemianopia. Neurology 1985; 35: 962–8.
Holcomb PJ. Semantic priming and stimulus degradation: implications for the role of the N400 in language processing. Psychophysiology 1993; 30: 47–61.
Howard D, Patterson K, Wise R, Brown WD, Friston K, Weiller C, et al. The cortical localization of the lexicons. Brain 1992; 115: 1769–82.







