12
Intl. Journal of Human–Computer Interaction, 28: 472–483, 2012 Copyright © Taylor & Francis Group, LLC ISSN: 1044-7318 print / 1532-7590 online DOI: 10.1080/10447318.2011.622970 A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson C-M. Fong, Lin Zhou, Gang Peng, and William S-Y. Wang The Chinese University of Hong Kong, Hong Kong, China A visual speller is a brain–computer interface that empowers users with limited motor functionality to input text into a computer by measuring their electroencephalographic responses to visual stimuli. Most prior research on visual spellers has focused on input of alphabetic text. Adapting a speller for other types of segmental or syllabic script is straightforward because such scripts comprise sufficiently few characters that they may all be displayed to the user simultaneously. Logographic scripts, such as Chinese hanzi, however, impose a challenge: How should the thousands of Chinese characters be displayed to the user? Here, we present a visual speller, based on Farwell and Donchin’s P300 Speller, for Chinese character input. The speller uses a novel shape-based method called the First–Last, or FLAST, method to encode more than 7,000 Chinese characters. Characters are input by selecting two components, from a set of 56 distinct components, that match the shape of the target character, followed by selection of the character itself. At the input speed of one character per 107 s, 24 able-bodied participants achieved mean online accuracy of 82.8% per compo- nent selection and 63.5% per character input. At the faster input speed of one character per 77 s, mean online accuracy was 59.4% per component selection and 33.3% per character input. 1. INTRODUCTION A substantial minority of people have motor deficits that limit or destroy their ability to communicate by conventional means, that is, by speech, by writing, or by sign language. Various neuromuscular disorders can cause such severe motor impairment, including amyotrophic lateral sclerosis, multiple sclerosis, spinocerebellar ataxia, cerebral palsy, and brain- stem stroke. Brain–computer interfaces (BCIs) provide affected This work was supported by research grants awarded to William S-Y. Wang by the Innovation and Technology Commission, The Government of the Hong Kong SAR (Grant Nos.: ITS/068/09, InP/241/09, and InP/229/09), the Patent Committee of The Chinese University of Hong Kong (Grant No.: TBF/11/ENG/01), the Shun Hing Institute of Advanced Engineering at The Chinese University of Hong Kong (Grant No.: BME 8115020), and the Research Grants Council of Hong Kong (CUHK Grant No.: 2050461). Address correspondence to James W. Minett, Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong. E-mail: [email protected] individuals an alternative channel that can augment their abil- ity to communicate with other people. Typically, the input signals used to drive BCIs are the fluctuations in electrical potential elicited by cognitive activity in response to sen- sory input, measured by electrodes placed on the scalp (i.e., electroencephalography [EEG]; e.g., Farwell & Donchin, 1988; Townsend et al., 2010), or by electrode grids implanted directly onto the cortical surface of the brain (i.e., electrocorticogra- phy; e.g., Brunner, Ritaccio, van Erp, Aloise, & Cincotti, 2011; Leuthardt, Schalk, Wolpaw, Ojemann, & Moran, 2004). Many BCIs for augmentative communication elicit brain signals using visual stimuli (e.g., Farwell & Donchin, 1988; Guger et al., 2009; Jin et al., 2010; Minett, Peng, Zhou, Zheng, & Wang, 2010; Sellers, Krusienski, McFarland, Vaughan, & Wolpaw, 2006; Serby, Yom-Tov, & Inbar, 2005; Townsend et al., 2010), but effective use has also been made of both auditory stim- uli (e.g., Furdea et al., 2009; Höhne, Schreuder, Blankertz, & Tangermann, 2010; Klobassa et al., 2009) and tactile stimuli (e.g., Brouwer & van Erp, 2010; Brouwer, van Erp, Aloise, & Cincotti, 2010). In the P300 Speller, Farwell and Donchin’s (1988) original visual speller, a 6 × 6 matrix of characters or symbols is dis- played onscreen to the user. The 12 rows and columns of this stimulus matrix are briefly intensified, one by one, in pseudo- random order, forming an oddball sequence (Fabiani, Gratton, Karis, & Donchin, 1987), while the user attends to the particu- lar target character that is to be input. Intensification of a row or column that contains the target tends to elicit a P300 event- related potential (ERP; Sutton, Braren, Zubin, & John, 1965) with greater amplitude than that elicited by intensification of other rows or columns, allowing the user’s intended target char- acter to be inferred after it has been intensified one or more times. The original P300 Speller allowed able-bodied users to select 2.3 characters per minute on average at 95% accuracy, corresponding to a mean input rate of about 12 bits/minute (Farwell & Donchin, 1988). The P300 Speller continues to be the most commonly used BCI system (Mak et al., 2011). Numerous refine- ments have been proposed that adjust various features of the speller, including matrix size (Allison & Pineda, 2003) 472 Downloaded by [University of California, Berkeley] at 18:09 22 August 2012

A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

  • Upload
    others

  • View
    13

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

Intl. Journal of Human–Computer Interaction, 28: 472–483, 2012Copyright © Taylor & Francis Group, LLCISSN: 1044-7318 print / 1532-7590 onlineDOI: 10.1080/10447318.2011.622970

A Chinese Text Input Brain–Computer Interface Basedon the P300 Speller

James W. Minett, Hong-Ying Zheng, Manson C-M. Fong, Lin Zhou, Gang Peng,and William S-Y. WangThe Chinese University of Hong Kong, Hong Kong, China

A visual speller is a brain–computer interface that empowersusers with limited motor functionality to input text into a computerby measuring their electroencephalographic responses to visualstimuli. Most prior research on visual spellers has focused on inputof alphabetic text. Adapting a speller for other types of segmentalor syllabic script is straightforward because such scripts comprisesufficiently few characters that they may all be displayed to theuser simultaneously. Logographic scripts, such as Chinese hanzi,however, impose a challenge: How should the thousands of Chinesecharacters be displayed to the user? Here, we present a visualspeller, based on Farwell and Donchin’s P300 Speller, for Chinesecharacter input. The speller uses a novel shape-based methodcalled the First–Last, or FLAST, method to encode more than7,000 Chinese characters. Characters are input by selecting twocomponents, from a set of 56 distinct components, that match theshape of the target character, followed by selection of the characteritself. At the input speed of one character per 107 s, 24 able-bodiedparticipants achieved mean online accuracy of 82.8% per compo-nent selection and 63.5% per character input. At the faster inputspeed of one character per 77 s, mean online accuracy was 59.4%per component selection and 33.3% per character input.

1. INTRODUCTIONA substantial minority of people have motor deficits that

limit or destroy their ability to communicate by conventionalmeans, that is, by speech, by writing, or by sign language.Various neuromuscular disorders can cause such severe motorimpairment, including amyotrophic lateral sclerosis, multiplesclerosis, spinocerebellar ataxia, cerebral palsy, and brain-stem stroke. Brain–computer interfaces (BCIs) provide affected

This work was supported by research grants awarded to WilliamS-Y. Wang by the Innovation and Technology Commission, TheGovernment of the Hong Kong SAR (Grant Nos.: ITS/068/09,InP/241/09, and InP/229/09), the Patent Committee of The ChineseUniversity of Hong Kong (Grant No.: TBF/11/ENG/01), the ShunHing Institute of Advanced Engineering at The Chinese Universityof Hong Kong (Grant No.: BME 8115020), and the Research GrantsCouncil of Hong Kong (CUHK Grant No.: 2050461).

Address correspondence to James W. Minett, Department ofElectronic Engineering, The Chinese University of Hong Kong, Shatin,Hong Kong. E-mail: [email protected]

individuals an alternative channel that can augment their abil-ity to communicate with other people. Typically, the inputsignals used to drive BCIs are the fluctuations in electricalpotential elicited by cognitive activity in response to sen-sory input, measured by electrodes placed on the scalp (i.e.,electroencephalography [EEG]; e.g., Farwell & Donchin, 1988;Townsend et al., 2010), or by electrode grids implanted directlyonto the cortical surface of the brain (i.e., electrocorticogra-phy; e.g., Brunner, Ritaccio, van Erp, Aloise, & Cincotti, 2011;Leuthardt, Schalk, Wolpaw, Ojemann, & Moran, 2004). ManyBCIs for augmentative communication elicit brain signals usingvisual stimuli (e.g., Farwell & Donchin, 1988; Guger et al.,2009; Jin et al., 2010; Minett, Peng, Zhou, Zheng, & Wang,2010; Sellers, Krusienski, McFarland, Vaughan, & Wolpaw,2006; Serby, Yom-Tov, & Inbar, 2005; Townsend et al., 2010),but effective use has also been made of both auditory stim-uli (e.g., Furdea et al., 2009; Höhne, Schreuder, Blankertz, &Tangermann, 2010; Klobassa et al., 2009) and tactile stimuli(e.g., Brouwer & van Erp, 2010; Brouwer, van Erp, Aloise, &Cincotti, 2010).

In the P300 Speller, Farwell and Donchin’s (1988) originalvisual speller, a 6 × 6 matrix of characters or symbols is dis-played onscreen to the user. The 12 rows and columns of thisstimulus matrix are briefly intensified, one by one, in pseudo-random order, forming an oddball sequence (Fabiani, Gratton,Karis, & Donchin, 1987), while the user attends to the particu-lar target character that is to be input. Intensification of a rowor column that contains the target tends to elicit a P300 event-related potential (ERP; Sutton, Braren, Zubin, & John, 1965)with greater amplitude than that elicited by intensification ofother rows or columns, allowing the user’s intended target char-acter to be inferred after it has been intensified one or moretimes. The original P300 Speller allowed able-bodied users toselect 2.3 characters per minute on average at 95% accuracy,corresponding to a mean input rate of about 12 bits/minute(Farwell & Donchin, 1988).

The P300 Speller continues to be the most commonlyused BCI system (Mak et al., 2011). Numerous refine-ments have been proposed that adjust various features ofthe speller, including matrix size (Allison & Pineda, 2003)

472

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 2: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

A CHINESE TEXT INPUT BCIF 473

and interstimulus interval (Sellers et al., 2006), backgroundnoise and interference color contrast (Nam, Li, & Johnson,2010), interface type and screen size (Li, Nam, Shadden, &Johnson, 2010), stimulus presentation (Frye, Hauser, Townsend,& Sellers, 2011; Townsend et al., 2010), stimulus motion(Hong, Guo, Liu, Gao, & Gao, 2009), predictive spelling (Ryanet al., 2011), and method of data processing (Krusienski et al.,2006; Serby, Yom-Tov, & Inbar, 2005). The use of the checker-board paradigm (Townsend et al., 2010), for example, whichintensifies sets of six characters that are located in distinct rowsand columns of an 8 × 9 matrix—rather than entire rows orcolumns—allows able-bodied users to input 4.36 characters perminute at 91.52% accuracy, maintaining a mean practical inputrate of 22.59 bits/minute, almost double that of Farwell andDonchin’s original speller.

1.1. Nonalphabetic ScriptPrevious research on the use of visual spellers to augment

communication has focused mostly on languages that are writ-ten with alphabetic script, including English and German.Alphabets consist of characters that map to the phonemesof the corresponding spoken language (Daniels & Bright,1996), although the mapping is not necessarily one to one.1

Implementation of a visual speller for other alphabetic scripts(e.g., Greek and Cyrillic) is straightforward because, by appro-priately adjusting the size of the stimulus matrix, all thecharacters in the script can be displayed onscreen to the usersimultaneously. Like alphabetic scripts, other types of segmen-tal script, particularly abjads (e.g., Arabic) and abugidas (e.g.,Devanagari), as well as syllabaries (e.g., Japanese kana) alsorepresent the sounds of the corresponding spoken language(Daniels & Bright, 1996) and can also be input directly witha visual speller because they too comprise a small set of charac-ters that may be composed sequentially to form written words.For example, the two kana scripts of Japanese, katakana andhiragana, each comprise 48 characters. In contrast, logographicscripts—notably Chinese hanzi—consist of large numbers ofcharacters that represent units of meaning, rather than units ofsound (Wang, 1981). The most commonly used dictionary ofChinese, Xinhua Zidian (2004), for example, has more than11,000 entries. It is, therefore, impossible to display simultane-ously in the stimulus matrix of a visual speller the many thou-sands of Chinese characters that the user might wish to input.

To resolve this obstacle to implementation of a visual spellerfor Chinese text input, we first consider how Chinese text maybe input to a computer via a keyboard. There are two broadclasses of methods for this purpose: sound-based methods,in which sequences of keystrokes are input to represent the

1The lack of a necessary one-to-one correspondence betweenalphabetic characters and phonemes is exemplified by the letters-ough, which represent various combinations of phonemes, such as

, depending on the English word inwhich they appear.

sounds of the spoken language, and shape-based methods,in which sequences of keystrokes are input to represent thevisual forms of the written language (Fong & Minett, 2012).Foremost among the sound-based input methods are those usingHanyu Pinyin (Yin & Felley, 1990), or just Pinyin, the officialRomanization of standard Chinese (i.e., Putonghua, which isbased on the Beijing dialect of Mandarin Chinese; Gu, 2009)that has been adopted throughout the mainland of China andelsewhere, both to teach Putonghua pronunciation and to tran-scribe Chinese words into alphabetic script. Pinyin uses just the26 letters of the Roman alphabet to represent the entire inven-tory of Putonghua consonants and vowels.2 A Chinese charactermay be typed using Pinyin by entering the sequence of let-ters that corresponds to its pronunciation. Multiple charactersmay have the same pronunciation, necessitating an additionalselection to type the particular character required. For exam-ple, the Pinyin sequence ‘tian’ represents the pronunciationof the Chinese characters (“sky”) and (“to append”),as well as many others. Sound-based input methods usingthe Pinyin Romanization may not be suitable for all Chineseusers. According to Ethnologue (Lewis, 2009), the web-basedresource maintained by the Summer Institute of Linguistics,about 70% of the Chinese population speaks a dialect ofMandarin, about 80% of whom speak Putonghua, the dialectupon which Pinyin is based. Of the remaining population, themajority speak related—but mutually unintelligible—dialectsof Han Chinese (i.e., Sinitic; Van Driem, 2001).

In contrast to sound-based methods, shape-based input sys-tems specify a set of forms to represent the visual appearance ofthe target character. The Chinese character, or sinogram (Wang& Tsai, 2011), has a hierarchical structure, consisting of one ormore strokes with particular spatial arrangement (Wang, 1981).For example, the sinogram (“sky”) is made up of four strokes:two horizontal strokes, one leftward going stroke, and one right-ward going stroke. The same four strokes may be recombinedwith different spatial arrangement to form the distinct char-acter (“husband”). Certain spatial arrangements of strokes,which we refer to here as components, reoccur with great fre-quency in the Chinese lexicon—for example, each of the twosinograms just stated contains the two components and(which are themselves both sinograms). Popular shape-basedinput methods include wubizixing (or just wubi), used mainlyin the Chinese mainland, and simplified Cangjie, more com-monly used in Hong Kong. The simplified Cangjie method, forexample, uses a set of 24 basic shapes, each shape designatinga set of stroke patterns with similar spatial arrangements, eachmapped to a key on the keyboard. This method requires twobasic shapes to be selected, followed by selection of a particular

2In addition to consonants and vowels, the Chinese syllable com-prises lexical tone: the use of distinct pitch patterns to provide lexicalcontrast among syllables that are otherwise pronounced similarly.Pinyin uses a set of optional diacritics to mark the lexical tone of eachsyllable.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 3: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

474 J. W. MINETT ET AL.

sinogram from among a list of sinograms that match the cho-sen basic shapes. For example, the sinogram is representedby the basic shapes and , which are selected by the keys“m” and “k,” respectively. This same encoding also representsa number of other sinograms, including (“change”) and(“government”).

1.2. BCIs for Chinese Text InputRecently, attempts have been made to develop BCIs for

Chinese text input (Jin et al., 2010; Minett et al., 2010; Wu et al.,2009), all using shape-based methods rather than sound-basedmethods. Although the standard P300 Speller could be adaptedrelatively simply to input characters by selecting appropriatePinyin sequences, developing a universal system for applicationthroughout China would be challenging—if not impossible—because, as noted previously, Pinyin reflects the pronuncia-tion of Putonghua, based on the Beijing dialect of MandarinChinese, which differs considerably from the pronunciations ofother Han Chinese dialects.

In Jin et al. (2010), the BCI makes use of an existing inputsystem—the T9 stroke input system (Tegic Communications,Seattle, WA)—built into a mobile phone. A four-by-four stimu-lus matrix is used to display a mixture of single strokes, digits,and controls. The system distinguishes five distinct types ofstroke: horizontal, vertical, leftward going, rightward going ordot, and hook, as does the system reported in (Wu et al., 2009).The rows and columns of the stimulus matrix are intensified inpseudo-random sequence while the user attends to the item to beselected, as in Farwell and Donchin’s P300 Speller. A sinogramis input by selecting the sequence of strokes by which thatsinogram may be written. After selection of each stroke, a short-list of seven sinograms is displayed that matches the currentstroke sequence. The user may then (a) input one of theseseven sinograms by selecting the corresponding digit (1–7),(b) request that more matching sinograms be displayed byselecting the digit 8, (c) select an additional stroke, or (d) deletethe previous selection. Using Bayesian linear discriminant anal-ysis and particle swarm optimization, this system deliveredoffline stroke input accuracy ranging from 38.67% up to 100%by averaging a block of fifteen 2-s trials from each of 11 able-bodied participants. One participant maintained 100% accuracyby averaging four trials, achieving an information transfer rate(Lenhardt, Kaper, & Ritter, 2008) of 40.34 bits/minute, equiv-alent to about eight stroke selections per minute. Based on theauthors’ estimate that six to seven stroke selections are requiredto input a sinogram, the online system may allow slightly betterthan one sinogram to be input per minute by the most proficientusers. However, the online performance of this system has yetto be validated (Jin et al., 2010).

1.3. The Present Study—The FLAST SpellerAs just noted, users of the system in (Jin et al., 2010) were

required to make six to seven selections on average in order to

input one sinogram. In online operation, an incorrect selectionin any one of those six to seven selections would necessitatedeletion of the erroneous selection, followed by an additionalattempt to make the intended selection. For users who can main-tain high selection accuracy (e.g., one of their 11 participantsmaintained mean classification accuracy above 90%), there isno great negative consequence. However, for individuals unableto maintain high selection accuracy (e.g., two of their 11 par-ticipants achieved mean classification accuracy below 50%), itis likely that repeated correction of errors in stroke selectionwould greatly increase the time required to input a sinogramcorrectly. We have therefore sought to develop an alternativemethod for encoding the Chinese lexicon that allows sinogramsto be input in fewer selections.

In this article, we describe and present a performance analy-sis of a novel visual speller for Chinese text input that requiresat most three selections per sinogram. This online systemuses a set of 56 shape-based components, shown in Figure 1,extending the 35-component offline system proposed in Minettet al. (2010), to represent a lexicon of 7,072 distinct tradi-tional Chinese sinograms.3 Each sinogram is represented bytwo components: the first component, representing the patternof strokes that would be written first when writing the sinogram,and the last component, representing the pattern of strokesthat would be written last—we therefore call this speller theFirst–Last Speller, or FLAST . For example, the sinogram isrepresented by the components and . Of the 56 × 56 pos-sible pairs of components, only 1,601 pairs encode sinograms.Of these, 491 pairs each encode a single sinogram. The 56 com-ponents defined in FLAST were chosen for the express purpose

FIG. 1. The 56 shape-based components of the FLAST speller.

3There are two sets of sinograms with which Chinese may be writ-ten: traditional characters, used mainly in Hong Kong, Macau, andTaiwan, and simplified characters, used in the mainland of China andelsewhere. Simplified characters were introduced in China in 1956 withthe aim of enhancing literacy (Wang, 1973).

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 4: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

A CHINESE TEXT INPUT BCIF 475

that no pair would encode more than 56 sinograms—the mostnumerous pair, and , encodes 41 sinograms, leaving ampleroom for future expansion of the lexicon.

The FLAST Speller has two modes of operation: a sinograminput mode and a component input mode. Sinogram input pro-ceeds in three stages: (a) selection of the first component,(b) selection of the last component, and (c) selection of thesinogram itself. During each stage, an 8 × 8 stimulus matrixis displayed onscreen, the bottom row of which displays eightsystem controls: sleep, delete component, delete sinogram,delete sentence, copy, paste, toggle sinogram/alphanumericinput (allowing both sinograms and alphanumeric text to beinput), and enter (allowing sinograms that also appear as com-ponents in the stimulus matrix, e.g., , meaning “two,” to beinput). During Stages 1 and 2, the first seven rows of the matrixcomprise the 56 components. During Stage 3, these rows com-prise a set of 56 sinograms, a subset of which is encoded bythe pair of components selected during the first two stages;the remaining cells of the stimulus matrix are filled by high-frequency sinograms. The 7,072 sinograms that are encodedin FLAST are distributed across 140 stimulus matrices, withsome high-frequency sinograms occurring in multiple matrices.To assist the user to locate target sinograms more easily, theposition of each sinogram in the particular stimulus matrix thatcontains it is held constant.

At the beginning of each stage, the entire stimulus matrixis intensified for a period of time to allow the user time tolocate the target component or sinogram, after which the rowsand columns of the stimulus matrix are intensified in pseudo-random order. Figure 2 illustrates the FLAST procedure, show-ing the stimulus matrix at the beginning of Stage 2 after the firstcomponent, , has been selected in Stage 1 (see Figure 2A),and at the beginning of Stage 3 after the last component, , hasbeen selected in Stage 2 (see Figure 2B). During sinogram inputmode, component selections are displayed between underline-characters (e.g., _ _), to distinguish them from sinogramselections, which are shown without such underlines (e.g.,

). Once a sinogram has been selected during Stage 3,the component selections are deleted from the screen and theselected sinogram displayed. During component input mode,only components may be input, and they are displayed withoutunderlines (see Figure 2C).

2. METHODS

2.1. ParticipantsTwenty-four students (12 female, 12 male, Mage =

21.7 years, age range = 19–30 years) from The ChineseUniversity of Hong Kong, all native Chinese readers, werepaid to participate in this study. All participants were able-bodied, were naive to BCI use, had normal or corrected-to-normal vision, and reported no history of neurological illness.Each participant gave informed consent in compliance with a

protocol approved by the Survey and Behavioural ResearchEthics Committee of The Chinese University of Hong Kong.

2.2. Data AcquisitionEach participant was seated in a quiet, dimly lit room about

90 cm before a LCD monitor that displayed the experimentalstimuli. EEG data were acquired using a 32-channel ActiveTwoEEG system (BioSemi B.V., Amsterdam, The Netherlands) withAg/AgCl active electrodes at positions Fp1, Fp2, AF3, AF4,Fz, F3, F4, F7, F8, FC1, FC2, FC5, FC6, Cz, C3, C4, T7, T8,CP1, CP2, CP5, CP6, Pz, P3, P4, P7, P8, PO3, PO4, Oz, O1,and O2 (Sharbrough, Lesser, Lüders, Nuwer, & Picton, 1991).Two additional electrodes, common mode sense and driven rightleg, positioned on either side of vertex at positions C1 and C2,respectively, were used to provide a feedback loop to drive theaverage electrical potential as close as possible to the ampli-fier reference voltage (BioSemi, 2007). The EEG data weresampled at a rate of 256 Hz, and a 0.1 Hz to 40 Hz digitalband-pass filter applied. No artifact correction was applied todeal with eye blinks or eye movements. Stimulus presentation,data collection, and data processing were all managed by thegeneral-purpose BCI software BCI2000 (Schalk, McFarland,Hinterberger, Birbaumer, & Wolpaw, 2004) as follows.

2.3. Experimental ProcedureEach participant completed the experiment in a single ses-

sion that consisted of two parts: calibration followed by onlinetesting. During calibration, EEG data were elicited from partic-ipants while the FLAST Speller ran in component input modein order to determine the coefficients of a classifier (describedin section 2.5). In each run, participants were required to inputa string of five pseudo-randomly selected components that wasdisplayed at the top of the screen (see Figure 2C). The currenttarget component was shown within parentheses to the right ofthe component string. After a 10-s pause, during which timethe entire stimulus matrix was intensified—allowing the par-ticipant time to locate the target component and to blink, ifnecessary—the rows and columns of the matrix were intensifiedin pseudo-random order. Participants were instructed to attendto the target throughout this period, keeping a mental countof the number of times it was intensified (Farwell & Donchin,1988). Each intensification was 62.5 ms in duration, followed bya 62.5 ms interstimulus interval during which time no stimuluswas intensified. Thus there were eight intensifications per sec-ond. For each intensification, an 800-ms segment of EEG data(comprising 205 samples) was extracted from each channel,forming a vector of 6,560 features (32 channels × 205 samples)for use in classification.

In each 2-s trial of 16 intensifications, each row and col-umn was intensified once. After 10 such trials, requiring atotal duration of 20 s, the component that was identified usingthe current classifier was presented onscreen below the com-ponent string (see Figure 2C). This procedure was repeated

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 5: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

476 J. W. MINETT ET AL.

(A) (B)

(C)

FIG. 2. Illustration of the FLAST procedure. (A) The component matrix at the beginning of Stage 2, showing the intensification of the entire stimulus matrix whilethe participant locates the target component. (B) An example of the sinogram matrix at the beginning of Stage 3. (C) The component matrix during calibration,showing the intensification of one row of the matrix and the current progress in selecting the target string of five components.

until all components in the string had been presented. Afteran initial practice run of five components, six calibration runswere carried out, each run lasting 21/2 min. After calibration wascomplete, each participant performed three additional runs incomponent input mode in order to evaluate offline componentselection accuracy, each run requiring five components to beselected, as in the calibration runs.

The participant was then instructed to use the FLAST Spelleronline in sinogram input mode to input four phrases, each con-sisting of a four-sinogram string, for example, (“ofutmost urgency,” literally burning-eyebrow urgency). Beforeeach run, the participant was shown the target phrase, togetherwith the component encodings of each of the four sinograms, ona handout to which they could refer throughout the run. As sum-marized in section 1.3 (see also Figures 2A and B), sinograminput proceeded in three stages. Stages 1 and 2 followed thesame component selection procedure as in the calibration runs(but without displaying the target components to be input), per-mitting the first and last component of the target sinogram to beselected. Upon selection of the last component, Stage 3 com-menced by displaying for 15 s the intensified sinogram matrixcontaining the sinograms that were encoded by the compo-nent pair identified by the classifier in the previous two stages.

This pause allowed the participant ample time to locate the tar-get sinogram and to blink, if necessary. If either of the twocomponent selections was incorrect, the sinogram matrix dis-played during Stage 3 would not contain the target sinogram.The participant was not permitted to correct such errors, butrather was instructed to select the sinogram at the top left cor-ner of the sinogram matrix. After the 15-s pause, the sinogrammatrix was intensified following the same procedure as forcomponents.

During both component selection (Stages 1 and 2) andsinogram selection (Stage 3), the rows and columns of thematrix were intensified in pseudo-random order for 62.5 ms,followed by a 62.5-ms interstimulus interval. Each trial of16 intensifications took 2 s. During input of the first two runs,10 trials were presented per selection (i.e., 20 target intensifi-cations and 140 nontarget intensifications). During input of thesecond two runs, five trials were presented per selection (i.e.,10 target intensifications and 70 nontarget intensifications). Thetime required to input each sinogram was 107 s during Runs1 and 2—comprising 60 s of stimulus matrix intensificationsand 47 s of overhead time for interstimulus pauses and targetcomponent/sinogram search—and 77 s during Runs 3 and 4—comprising 30 s of intensifications and 47 s of overhead. Four

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 6: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

A CHINESE TEXT INPUT BCIF 477

sinograms were input in each run. Runs 1 and 2 were eachcompleted in 7 min 8 s, and Runs 3 and 4 in 5 min 8 s.

2.4. Performance MeasuresAccuracy was calculated as the percentage of either com-

ponents or sinograms that were correctly input. To input asinogram correctly, the participant was required to make threeconsecutive correct selections, that is, first component, last com-ponent, and then the sinogram itself; an error in any one ofthese three selections resulted in an incorrect sinogram input.The number of bits transmitted per selection was calculatedaccording to the formula (Wolpaw et al., 2000):

B = log2 N + P log2 P + (1 − P) log2

[1 − P

N − 1

]

where B denotes the number of bits per selection, N is thetotal number of possible component or sinogram selections,and P is the proportion of correct selections. The bit rate inbits/minute was then obtained by multiplying B by the numberof selections carried out per minute. During component input,56 distinct components could be selected. During sinograminput, 7,072 distinct sinograms could be input. In addition tobit rate, the theoretical bit rate in bits/minute was calculatedby dividing B by the duration in minutes of each selectionexcluding the 47-s overhead time during online operation forinterstimulus pauses and target component/sinogram search.

2.5. ClassificationThe EEG signals elicited by the speller were classified

using stepwise linear discriminant analysis (SWLDA; Draper& Smith, 1981) following the procedure in (Krusienski, Sellers,McFarland, Vaughan, & Wolpaw, 2008). SWLDA is an effi-cient and commonly used method for binary classification (e.g.,Farwell & Donchin, 1988; Ryan et al., 2011; Townsend et al.,2010). The binary classification problem consists of determin-ing a separating hyperplane,

w · x − b = 0, (1)

where x denotes the feature vector (described in section 2.3), wdenotes the feature weights, and b denotes the bias term, withw and b to be estimated by SWLDA. The row and column ofthe stimulus matrix predicted to contain the target were calcu-lated as those for which the distance of the response from thehyperplane was greatest (Krusienski et al., 2008). SWLDA esti-mates w and b iteratively by alternating between forward andbackward stepwise regression to construct a multiple regressionmodel. In particular, at each forward regression, the statisti-cally most significant predictor (with p < .1) was added tothe model. At each backward regression, the least significantpredictors (with p > .15) were removed from the model. Thisprocedure was repeated for 60 iterations, or until no predictors

were either added or removed. For more details of the SWLDAmethod, refer to Krusienski and colleagues (Krusienski et al.,2006; Krusienski et al., 2008).

Calibration of the classifier coefficients was performedoffline in two phases, each requiring the participants to perform15 component selections. This two-phase calibration processwas used so that the improved accuracy arising from usingthe classifier obtained during the first calibration phase wouldencourage the participant to focus their attention and seekto optimize their performance during the second calibrationphase. During both phases, each component selection consistedof 10 trials, comprising data from 20 target intensificationsand 140 nontarget intensifications. Classifier coefficients werederived to discriminate the features of the EEG data elicitedacross all channels by target intensifications from those elicitedby nontarget intensifications. Upon completion of each phase,five sets of classifier coefficients were calculated using thestepwisefit function in Matlab (version R2010A, MathWorks,Natick, MA), each set obtained by sampling 50% of the EEGdata elicited during that phase. In total, 10 min of EEG dataelicited from 4,800 intensifications per participant were usedfor the calibration, of which 600 were target intensificationsand 4,200 were nontarget intensifications. Of the five clas-sifiers obtained from the first phase of calibration, the onethat provided peak offline accuracy for the smallest numberof trials per selection when applied to EEG data obtainedduring the first phase was used as the classifier during the sec-ond phase of calibration. Of the 10 classifiers obtained fromthe two calibration phases, the one that provided peak offlineaccuracy for the fewest number of trials per selection whenapplied to the EEG data obtained during both phases was usedin the subsequent online testing. The classification accuraciesfor each of the ten classifiers obtained during calibration areshown in Figure 3 for all 24 participants—the figures illus-trates that the 10 accuracy curves were tightly packed for mostparticipants.

The classifier’s spatiotemporal filter obtained duringcalibration was applied to the EEG data, averaged over a speci-fied number of trials per selection (either 10 trials during Runs1 and 2 of online testing or five trials during Runs 3 and 4;see section 2.3), to determine the classification score for eachrow and column of the stimulus matrix. The classification scorefor each stimulus was then calculated by summing the scoresfor the row and column containing it. The stimulus with thegreatest score was then displayed onscreen (see Figure 2) to theparticipant to indicate which component or sinogram had beeninput.

3. RESULTSFigure 4 shows the map of spatial features obtained dur-

ing calibration summed across all 24 participants—the figureindicates the proportion of participants for whom each ofthe 32 channels was included in the respective participant’sclassifier. In total, 10 channels (F7, P7, Pz, PO3, O1, Oz,

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 7: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

478 J. W. MINETT ET AL.

FIG. 3. Offline classification accuracy for each of the 10 classifiers obtained during calibration for each participant. Note. The accuracy during calibration of theclassifier selected for use in online testing is indicated by the thick solid line; the mean accuracy across all 10 classifiers is indicated by the thin solid line; theaccuracy of each of the other nine classifiers is indicated by a dashed line.

FIG. 4. Map of spatial features summed across all 24 participants. Note.The map indicates the proportion of participants for whom each electrodewas included in their respective classifiers during online testing (color figureavailable online).

O2, PO4, P8, T8) contained spatiotemporal features that wereselected in the classifiers of at least half of the participants.

The offline accuracy of the FLAST Speller for componentinput obtained by each of the 24 participants is summarized inFigure 5.

3.1. Online Accuracy and Bit RateTable 1 shows the accuracy and bit rate of the FLAST Speller

for sinogram input obtained during online testing by each of the24 participants at 10 trials per selection, corresponding to aninput speed of one sinogram per 107 s (i.e., 0.56 sinograms perminute; Runs 1 and 2). The mean component accuracy at thisselection rate was 82.8%, delivering 12.93 bits/minute. Fiveparticipants (S4, S5, S7, S18, S23) achieved 100% component

accuracy. Participants input sinograms with mean accuracy at63.5%. The mean bit rate was 4.23 bits/minute, with a the-oretical bit rate of 7.55 bits per minute (excluding from thecalculation the accumulated overhead of 47 s per sinogram forinterstimulus pauses and visual search time). Five of the 24 par-ticipants achieved 100% sinogram input accuracy at this inputspeed.

Table 2 shows the online accuracy and bit rate obtained byparticipants at five trials per selection, corresponding to an inputspeed of one sinogram per 77 s (i.e., 0.78 sinograms per minute;Runs 3 and 4). The mean component accuracy was 59.4%and the corresponding bit rate was 15.97 bits/minute, slightlyfaster than for the 10-intensification selection rate. However,sinogram accuracy was 33.3% and the corresponding bit ratewas 2.87, lower than for the 10-intensification selection rate.Only one participant (S17) achieved 100% sinogram accuracyat this input speed.

3.2. ERP WaveformsThe ERP waveforms elicited by the FLAST Speller are

shown in Figure 6, which displays each participant’s averagedwaveforms (see Figure 6A) as well as the grand-average wave-form (see Figure 6B) for both targets and nontargets at fourrepresentative channels: Cz, Pz, P7, and P8. For the majorityof participants, there is a clear negative deflection in the parietalchannels P7 and P8, peaking about 200 ms after onset of targetintensification. For most participants, there is no clear positivedeflection after 300 ms (i.e., P300 component), although thegrand-averaged waveform does exhibit a positive deflection inparietal channels, peaking at about 300 ms, albeit with negativevoltage with respect to baseline, and in the Cz channel, peakingat about 250 ms.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 8: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

A CHINESE TEXT INPUT BCIF 479

FIG. 5. Offline classification accuracy obtained by each participant during component selection after calibration.

TABLE 1Online Classification Accuracy and Bit Rate Obtained by Each Participant During

Sinogram Input at 10 Trials per Selection (i.e., One Character per 107 s)

Component Selection Sinogram Selection

Participant Accuracy Bit Rate Accuracy Bit Rate Theoretical Bit Rate

S1 87.5% 13.62 50.0% 3.02 5.39S2 68.8% 9.31 25.0% 1.34 2.39S3 87.5% 13.62 75.0% 4.92 8.78S4 100.0% 17.42 100.0% 7.17 12.79S5 100.0% 17.42 100.0% 7.17 12.79S6 93.8% 15.33 75.0% 4.92 8.78S7 100.0% 17.42 100.0% 7.17 12.79S8 81.3% 12.08 50.0% 3.02 5.39S9 87.5% 13.62 75.0% 4.92 8.78S10 68.8% 9.31 12.5% 0.59 1.06S11 68.8% 9.31 12.5% 0.59 1.06S12 93.8% 15.33 87.5% 5.97 10.65S13 50.0% 5.75 12.5% 0.59 1.06S14 81.3% 12.08 62.5% 3.95 7.04S15 93.8% 15.33 75.0% 4.92 8.78S16 25.0% 1.98 0.0% 0.00 0.00S17 93.8% 15.33 87.5% 5.97 10.65S18 100.0% 17.42 100.0% 7.17 12.79S19 87.5% 13.62 75.0% 4.92 8.78S20 75.0% 10.65 37.5% 2.15 3.84S21 62.5% 8.05 50.0% 3.02 5.39S22 93.8% 15.33 87.5% 5.97 10.65S23 100.0% 17.42 100.0% 7.17 12.79S24 87.5% 13.62 75.0% 4.92 8.78

M 82.8% 12.93 63.5% 4.23 7.55SD 18.27% 4.03 32.12% 2.38 4.24SE 3.73% 0.82 6.56% 0.49 0.87

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 9: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

480 J. W. MINETT ET AL.

TABLE 2Online Classification Accuracy and Bit Rate Obtained by Each Participant During

Sinogram Input at Five Trials per Selection (i.e., One Character per 77 s)

Component Selection Sinogram Selection

Participant Accuracy Bit Rate Accuracy Bit Rate Theoretical Bit Rate

S1 37.5% 7.44 12.5% 0.82 2.11S2 50.0% 11.50 0.0% 0.00 0.00S3 37.5% 7.44 12.5% 0.82 2.11S4 75.0% 21.30 37.5% 2.99 7.68S5 87.5% 27.25 75.0% 6.84 17.56S6 43.8% 9.40 0.0% 0.00 0.00S7 62.5% 16.11 25.0% 1.86 4.77S8 31.3% 5.62 12.5% 0.82 2.11S9 62.5% 16.11 37.5% 2.99 7.68S10 43.8% 9.40 0.0% 0.00 0.00S11 37.5% 7.44 12.5% 0.82 2.11S12 75.0% 21.30 50.0% 4.20 10.79S13 25.0% 3.96 0.0% 0.00 0.00S14 75.0% 21.30 50.0% 4.20 10.79S15 81.3% 24.16 50.0% 4.20 10.79S16 12.5% 1.23 0.0% 0.00 0.00S17 100.0% 34.84 100.0% 9.96 25.58S18 93.8% 30.65 87.5% 8.30 21.29S19 68.8% 18.63 37.5% 2.99 7.68S20 50.0% 11.50 37.5% 2.99 7.68S21 43.8% 9.40 0.0% 0.00 0.00S22 68.8% 18.63 37.5% 2.99 7.68S23 87.5% 27.25 62.5% 5.48 14.08S24 75.0% 21.30 62.5% 5.48 14.08

M 59.4% 15.97 33.3% 2.87 7.36SD 23.39% 9.03 29.64% 2.81 7.22SE 4.77% 1.84 6.05% 0.57 1.47

4. DISCUSSIONAccurate sinogram input using the FLAST Speller requires

the user to make three consecutive selections correctly: firstcomponent, last component, and then the sinogram itself.During sinogram input using 10 trials per selection—that is,0.56 sinograms per minute—the mean component selectionaccuracy that participants achieved was 82.8%. Based on thisfigure, the predicted accuracy for sinogram input was (82.8%),3

that is, 56.8%. The mean sinogram accuracy that participantsactually achieved was 63.5%, similar to the predicted accu-racy. We infer that participants were able to locate and attend tosinogram targets with similar ease as component targets. Usingfive trials per selection—that is, 0.78 sinograms per minute—sinogram input accuracy was only 33.3%, too low on averagefor effective online use, although five participants maintainedsinogram input accuracy greater than 50% at this input speed.

To consider ways to enhance performance, it is informa-tive to examine the component selection and sinogram inputbit rates. Participants achieved 12.93 bits/per minute whileselecting component targets. However, when calculated interms of correct sinogram inputs, the mean bit rate was only4.23 bits/minute. This reduction in bit rate had two main causes.First, the time required to input one sinogram included anoverhead period of 47 s—for interstimulus pauses and visualsearch time—during which time no EEG data were processed.Neglecting this overhead period from the calculation of bit rategenerated 7.55 bits/minute for the 10-intensification selectionrate and 7.36 bits/minute for the five-intensification selec-tion rate. Second, the three-stage FLAST procedure potentiallyallows 56 × 56 × 56 (i.e., 175,616) distinct sinograms—that is,12.07 bits of information—to be input. In practice, however, thesystem allows only 7,072 distinct sinograms—that is, 8.86 bits

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 10: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

A CHINESE TEXT INPUT BCIF 481

FIG. 6. (A) ERP waveforms elicited from each participant by both target (thick line) and nontarget (thin line) intensifications at four electrode locations: Cz, Pz, P7,and P8. (Amplitude units are µV; time units are ms.) (B) Grand-averaged ERP waveform elicited by both target (thick line) and nontarget (thin line) intensificationsfor all 24 participants at four electrode locations: Cz, Pz, P7, and P8.

of information—to be input, resulting in a “loss” of 3.21 bitsper sinogram input.

This bottleneck on performance may be alleviated by revis-ing the stimulus presentation in three ways4: First, by reducingthe number of components, the stimulus matrix size may be

4Performance improvement could also be sought by revising theclassification algorithm, by additional training of the participants, andso on. The focus in this article is to present the FLAST speller for

reduced, resulting in faster trials and, potentially, more accu-rate selections, although more selections must consequently bemade before a sinogram can be input. If the components aredefined such that only component combinations that encodesinograms are available to the user, a higher resultant bit ratemay be maintained. The five-stroke methods (Jin et al., 2010;

Chinese text input and to assess its performance based on the use of astandard classification algorithm, SWLDA.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 11: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

482 J. W. MINETT ET AL.

Wu et al., 2009) take this approach. Although suitable for userswho achieve high accuracy, these systems may be unusablein practice for users with low selection accuracy who may berequired to frequently correct the six to seven selections nec-essary to input one sinogram. In the future, a comparison ofonline sinogram input accuracy and speed may be performedfor the FLAST method and the five-stroke methods to determineunder what circumstances each approach achieves the betterperformance. Second, by reducing the overhead duration—boththe time allocated to locate target components and sinograms,and the interstimulus intervals—sinograms may be input morerapidly. We anticipate that the overhead may be significantlyreduced below the present 47 s with minimal reduction in per-formance. Third, the mean number of selections required toinput sinograms can be reduced by incorporating character pre-diction into the speller (Ryan et al., 2011). For the sinogram, thiscan be achieved on three levels: at the word level, by assess-ing which words commonly co-occur to form phrases; at thesinogram level, by predicting the second sinogram of a two-sinogram word after input of the first sinogram; and at thecomponent level, by predicting which sinograms are most likelyto match a single selected component. Implementation of bothoverhead reduction and character prediction is in progress, andis expected to enhance both the online accuracy and input speedof the FLAST Speller.

The majority of visual spellers use the P300 response as theprimary source of spatiotemporal features by which to conductclassification. However, as introduced in the results presentedearlier, no clear P300 component was elicited from participantsin response to target intensification in the FLAST procedure.Instead, an apparent N200 component, peaking about 200 msafter target onset at both left- and right-hemisphere parietalsites, was observed. The N-200 Speller (Hong et al., 2009) hasmade use of the N200 visual motion response (Kuba & Kubova,1992), elicited by the use of motion stimuli, for accurate alpha-betic text input. In the present work, however, the source of theapparent N200 cannot be attributed to a visual motion responsebecause no motion stimuli were used. Additional experimentsare necessary to investigate the causes and implications of theweak P300 and strong apparent N200 response in FLAST.

Validation of the performance of the system by users withneuromuscular disability also remains a necessary step to becarried out. Nevertheless, the results reported here demonstratethat online input of Chinese text from a large lexicon using theFLAST Speller is feasible.

REFERENCESAllison, B. Z., & Pineda, J. A. (2003). ERP’s evoked by different matrix

sizes: implications for a brain computer interface (BCI) system. IEEETransactions on Neural Systems and Rehabilitation Engineering, 11,110–113.

BioSemi B.V. (2007, July 3). Active two user manual (Version 3.2). Amsterdam,the Netherlands: BioSemi B. V.

Brouwer, A-M., & van Erp, J. B. F. (2010). A tactile P300 brain-computerinterface. Frontiers in Neuroscience, 4(19). doi:10.3389/fnins.2010.00019

Brouwer, A-M., van Erp, J. B. F., Aloise, F., & Cincotti, F. (2010). Tactile,visual, and bimodal P300s: Could bimodal P300s boost BCI performance?SRX Neuroscience. doi:10.3814/2010/967027

Brunner, P., Ritaccio, A. L., Emrich, J. F., Bischof, H., & Schalk,G. (2011). Rapid communication with a ‘P300’ matrix speller usingelectrocorticographic signals (ECoG). Frontiers in Neuroscience, 1(5).doi:10.3389/fnins.2011.00005

Daniels, P. T., & Bright, W. (1996). The world’s writing systems. New York,NY: Oxford University Press.

Draper, N. R., & Smith, H. (1981). Applied regression analysis (2nd ed.).New York, NY: Wiley.

Fabiani, M., Gratton, G., Karis, D., & Donchin, E. (1987). Definition, iden-tification, and reliability of measurement of the P300 component of theevent-related brain potential. In P. K. Ackles, J. R. Jennings, & M. G. H.Coles (Eds.), Advances in psychophysiology, Vol. 2 (pp. 1–78). Amsterdam,the Netherlands: JAI Press.

Farwell, L. A., & Donchin, E, (1988). Talking of the top of yourhead: Toward a mental prosthesis utilizing event-related brain potentials.Electroencephalography and Clinical Neurophysiology, 70, 510–523.

Fong, M. C-M., & Minett, J. W. (2012). Chinese input methods: Overview andComparisons. Journal of Chinese Linguistics, 40, 102–138.

Frye, G. E., Hauser, C. K., Townsend, G., & Sellers, E. W. (2011).Suppressing flashes of items surrounding targets during calibration of aP300-based brain–computer interface improves performance. Journal ofNeural Engineering, 8. doi:10.1088/1741-2560/8/2/025024

Furdea, A., Halder, S., Krusienski, D. J., Bross, D., Nijboer, F., Birbaumer,N., & Kübler, A. (2009). An auditory oddball (P300) spelling system forbrain–computer interfaces. Psychophysiology, 46, 617–625.

Gu, Y. (2009). Chinese. In K. Brown & S Ogilvie (Eds.), Concise encyclopediaof languages of the world (pp. 213–220). Oxford, UK: Elsevier.

Guger, C., Daban, S., Sellers, E., Holzner, C., Krausz, G., Carabalona, R., . . .

Edlinger, G. (2009). How many people are able to control a P300-basedbrain–computer interface (BCI)? Neuroscience Letters, 462, 94–98.

Höhne, J., Schreuder, M., Blankertz, B., & Tangermann, M. (2010). Two-dimensional auditory p300 speller with predictive text system. Proceedingsof the 2010 Annual International Conference of the IEEE Engineering inMedicine and Biology Society (EMBC), 4185–4188.

Hong, B., Guo, F., Liu, T., Gao, X., & Gao, S. (2009). N200-speller usingmotion-onset visual response. Clinical Neurophysiology, 120, 1658–1666.

Jin, J., Allison, B. Z., Brunner, C., Wang, B., Wang, X., Zhang, J., . . .

Pfurtscheller, G. (2010). P300 Chinese input system based on BayesianLDA. Biomedizinische Technik: Biomedical Engineering, 55(1), 5–18.

Klobassa, D. S., Vaughan, T. M., Brunner, P., Schwartz, N. E., Wolpaw, J.R., Neuper, C., & Sellers, E. W. (2009). Toward a high-throughput audi-tory P300-based brain-computer interface. Clinical Neurophysiology, 120,1252–1261.

Krusienski, D. J., Sellers, E. W., Cabestaing, F., Bayoudh, S., McFarland, D.J., Vaughan, T. M., & Wolpaw, J. R. (2006). A comparison of classifi-cation techniques for the P300 speller. Journal of Neural Engineering, 3,299–305.

Krusienski, D. J., Sellers, E. W., McFarland, D. J., Vaughan, T. M., & Wolpaw,J. R. (2008). Toward enhanced P300 speller performance. Journal ofNeuroscience Methods, 167(1), 15–21.

Kuba, M., & Kubova, Z. (1992). Visual evoked potentials specific for motiononset. Documenta Ophthalmologica, 80, 83–89.

Lenhardt, A., Kaper, M., & Ritter, H. J. (2008). An adaptive P300-basedonline brain-computer interface. IEEE Transactions on Neural Systems andRehabilitation Engineering, 16, 121–130.

Leuthardt, E. C., Schalk, G., Wolpaw, J. R., Ojemann, J. G., & Moran, D. W.(2004). A brain–computer interface using electrocorticographic signals inhumans. Journal of Neural Engineering, 1(2), 63–71.

Lewis, M. P. (Ed.). (2009). Ethnologue: Languages of the World (16th ed.).Dallas, TX: SIL International. Available from http://www.ethnologue.com/

Li, Y., Nam, C. S., Shadden, B. B., & Johnson, S. L. (2010). A P300-based brain–computer interface: Effects of interface type and screen size.International Journal of Human–Computer Interaction, 27, 52–68.

Mak, J. N., Arbel, Y., Minett, J. W., McCane, L. M., Yuksel, B., Ryan,D., . . . Erdogmus, D. (2011). Optimizing the P300-based brain–computer

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012

Page 12: A Chinese Text Input Brain–Computer Interface Based on the ... · A Chinese Text Input Brain–Computer Interface Based on the P300 Speller James W. Minett, Hong-Ying Zheng, Manson

A CHINESE TEXT INPUT BCIF 483

interface: current status, limitations and future directions. Journal of NeuralEngineering, 8. doi:10.1088/1741-2560/8/2/025003

Minett, J. W., Peng, G., Zhou, L., Zheng, H-Y., & Wang, W. S-Y. (2010,June). An assistive communication brain–computer interface for Chinesetext input (ID 41183). Proceedings of the 4th International Conference onBioinformatics and Biomedical Engineering.

Nam, C. S., Li, Y., & Johnson, S. (2010). Evaluation of P300-based brain–computer interface in real-world contexts. International Journal of Human–Computer Interaction, 26, 621–637.

Ryan, D. B., Frye, G. E., Townsend, G., Berry, D. R., Mesa-G., S., Gates, N.A., & Sellers, E. W. (2011). Predictive spelling with a P300-based brain-computer interface: Increasing the rate of communication. InternationalJournal of Human-Computer Interaction, 27, 69–84.

Schalk, G., McFarland, D. J., Hinterberger, T., Birbaumer, N., & Wolpaw, J.R. (2004). BCI2000: A general-purpose brain–computer interface (BCI)system. IEEE Transactions on Biomedical Engineering, 51, 1034–1043.

Sellers, E. W., Krusienski, D. J., McFarland, D. J., Vaughan, T. M., & Wolpaw,J. R. (2006). A P300 event-related potential brain-computer interface (BCI):The effects of matrix size and inter stimulus interval on performance.Biological Psychology, 73, 242–252.

Serby, H., Yom-Tov, E., & Inbar, G. F. (2005). An improved P300-basedbrain–computer interface. IEEE Transactions on Neural Systems andRehabilitation Engineering, 13, 89–98.

Sharbrough, F. C. G., Lesser, R. P., Lüders, H., Nuwer, M., & Picton, W. (1991).AEEGS guidelines for standard electrode position nomenclature. ClinicalNeurophysiology, 8, 202–204.

Sutton, S., Braren, M., Zubin, J., & John, E. R. (1965). Information delivery andthe sensory evoked potential. Science, 155, 1436–1439.

Townsend, G., LaPallo, B. K., Boulay, C. B., Krusienski, D. J., Frye, G.E., Hauser, C. K., . . . Sellers, E. W. (2010). A novel P300-based brain–computer interface stimulus presentation paradigm: Moving beyond rowsand columns. Clinical Neurophysiology, 121, 1109–1120.

Van Driem, G. (2001). Languages of the Himalayas: An ethnolinguistic hand-book of the greater Himalayan region. Leiden, the Netherlands: Brill.

Wang, W. S-Y. (1973). The Chinese language. Scientific American, 228,50–60.

Wang, W. S-Y. (1981). Language structure and optimal orthography. In O. J.L. Tzeng & H. Singer (Eds.), Perception of print: Reading research inexperimental psychology (pp. 223–236). Hillsdale, NJ: Erlbaum.

Wang, W. S-Y., & Tsai, Y. (2011). The Alphabet and the sinogram: Setting thestage for a look across orthographies. In P. McCardle, B. Miller, J. R. Lee,& O. Tzeng (Eds.), Dyslexia across languages. (pp. 1–16) Baltimore, MD:Brookes.

Wolpaw, J. R., Birbaumer, N., Heetderks, W. J., McFarland, D. J., Peckham, P.H., Schalk, G., . . . Vaughan, T. M. (2000). Brain-computer interface tech-nology: A review of the first international meeting. IEEE Transactions onRehabilitation Engineering, 8, 164–173.

Wu, B., Su, Y., Zhang, J-H., Li, X., Zhang, J-C., Cheng, W-D., & Zheng, X-X.(2009). A virtual Chinese keyboard BCI system based on P300 potentials[Chinese]. Acta Electronica Sinica, 37, 1733–1745.

Xinhua Zidian (10th ed.). [Chinese]. (2004). Beijing, China: Shang wu yin shuguan.

Yin, B., & Felley, M. (1990). Chinese Romanization. Pronunciation andorthography. Beijing, China: Sinolingua.

ABOUT THE AUTHORSJames W. Minett received his PhD in Electronic Engineeringfrom City University of Hong Kong. He is currently ResearchAssociate in the Language Engineering Laboratory at theChinese University of Hong Kong, where he works onbrain-computer interfaces, psycholinguistics, and evolutionarylinguistics.

Hong-Ying Zheng received her PhD in ElectronicEngineering at the Language Engineering Laboratory,Chinese University of Hong Kong. She continued to workat the laboratory as a Visiting Scholar, conducting research onpsycholinguistics and brain-computer interfacing.

Manson C-M. Fong graduated from the Hong KongUniversity of Science and Technology with an MPhil in Physicsin 2009. He is now pursuing a PhD in Electronic Engineering atthe Language Engineering Laboratory of the Chinese Universityof Hong Kong. His research interests include brain-computerinterfacing and signal processing.

Lin Zhou graduated from Hunan University with a Bachelordegree in Software Engineering in 2007. She is now a ResearchAssistant in Electronic Engineering at the Language EngineeringLaboratory of the Chinese University of Hong Kong, doingresearch on psycholinguistics and brain-computer interfaces.

Gang Peng received his PhD in Language Engineeringfrom City University of Hong Kong. He is Research AssociateProfessor of the Department of Linguistics and ModernLanguages at the Chinese University of Hong Kong. He isalso Professor of Shenzhen Institutes of Advanced Technology,Chinese Academy of Sciences.

William S-Y. Wang received his PhD from the Universityof Michigan. He was Professor of Linguistics at the Universityof California at Berkeley for thirty years. Currently he directsthe Language Engineering Laboratory at the Chinese Universityof Hong Kong. He is also Academician of Academia Sinica inTaiwan.

Dow

nloa

ded

by [

Uni

vers

ity o

f C

alif

orni

a, B

erke

ley]

at 1

8:09

22

Aug

ust 2

012