A hierarchical stimulus presentation paradigm for a P300-based Hangul speller


Tae-Hoon Lee,1 Tae-Eui Kam,1 Sung-Phil Kim2

1 Department of Computer Science and Engineering, Korea University, Seongbuk-ku, Seoul 136-713, Republic of Korea

2 Department of Brain and Cognitive Engineering, Korea University, Seongbuk-ku, Seoul 136-713, Republic of Korea

Received 30 December 2010; revised 10 February 2011; accepted 15 February 2011

ABSTRACT: We propose a hierarchical stimulus presentation paradigm for a P300-based Hangul (Korean script) input system. A P300-based input system (or speller) is one of the most promising noninvasive brain-computer interface (BCI) applications because it is directly applicable to many computer programs. Although the conventional row/column stimulus presentation paradigm is well suited to English input, it may not be optimal for Hangul input because Hangul has a distinct hierarchical structure. To overcome this limitation, we developed a new P300-based Hangul input system whose stimulus presentation paradigm follows the hierarchical structure of Hangul. By using the hierarchical structure, we can effectively reduce the window size of the interface without loss of classification accuracy. A performance comparison shows that the hierarchical paradigm exhibits higher classification accuracy than the row/column paradigm even with a smaller window size. Thus, the proposed hierarchical paradigm is more efficient for spelling Hangul and will be more useful for BCI-based Hangul input in a text messenger, e-mail program, word processor, and other similar applications. © 2011 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 21, 131–138, 2011; Published online in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/ima.20282

Key words: brain-computer interface; electroencephalography;

P300; Hangul speller; hierarchical stimulus presentation paradigm

I. INTRODUCTION

A brain-computer interface (BCI; Wolpaw et al., 2002) is an emerging research area that aims to read brain signals to recognize the intention of patients who have neurologic disorders with paralysis (e.g., amyotrophic lateral sclerosis [ALS] or the locked-in syndrome) and to translate that intention into commands that control effectors. BCIs can be broadly categorized into three classes based on the brain signal they sense: intracortical BCIs using neuronal activity, epicortical BCIs using an electrocorticogram, and scalp BCIs using an electroencephalogram (EEG). The scalp BCI class has been the most intensively investigated because of its noninvasiveness, relatively low cost, and ease of use.

A P300-based speller is a scalp BCI application that detects the P300 component of the EEG to recognize and type the character intended by the user. The P300 component is a positive deflection in the event-related potential (ERP) that appears approximately 300 ms after stimulus onset (Farwell and Donchin, 1988). Farwell and Donchin (1988) developed the first P300-based BCI system to spell English words. This system used the row/column stimulus presentation paradigm, in which each row or column of a 6 × 6 matrix of 36 characters flashed in random order. To input a particular character, the user attended to that character during the presentation, and a P300 waveform occurred when the row or column containing it flashed.

Since its introduction, the P300-based input system using the row/column stimulus presentation paradigm has demonstrated promising performance in spelling words on a computer (Hoffmann et al., 2008; Krusienski et al., 2008). P300-based input systems have also been applied to control many other effectors such as a robot, cursor, or wheelchair (Rebsamen et al., 2006; Bell et al., 2008; Citi et al., 2008). However, P300-based input systems have primarily been designed for English, and no P300-based input system has yet been developed for Hangul (Korean script). Because of structural differences between Hangul and English, a conventional English-oriented P300-based input system cannot be used directly for typing Hangul. A new P300-based Hangul input system is therefore needed for Korean computer applications (see Fig. 1 for the overall flow of the Hangul P300 speller).

The P300-based input system consists of three main components: stimulus presentation paradigm design, signal processing and feature selection/extraction, and classification. Although all three components are important for improving the performance of P300-based input systems, the stimulus presentation paradigm design may be the most crucial part of developing a P300-based input system for a new language. Hence, we concentrate on creating a new stimulus presentation paradigm for Hangul and improving spelling performance over the conventional row/column stimulus presentation paradigm.

Correspondence to: Sung-Phil Kim; e-mail: [email protected]

Sponsors: This research was supported by the Undergraduate Research Program funded by the Korea Foundation for the Advancement of Science and Creativity (2010-2-029) and by the WCU (World Class University) Program (R31-10008-0) through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology.

The conventional row/column stimulus presentation paradigm is well suited to spelling English, which has a nonhierarchical structure, but may not work as well for Hangul, whose hierarchical structure differs considerably from that of English. In addition, the row/column stimulus presentation paradigm needs a large window to present all characters at once, which makes it hard to use alongside applications such as a text messenger, word processor, or e-mail program.

To solve these problems, we develop a novel hierarchical stimulus presentation paradigm based on the special structure of Hangul. The hierarchical stimulus presentation paradigm consists of two layers. The first layer separates the components of Hangul. A Hangul syllable consists of three components: an initial consonant, a middle vowel, and an optional final consonant. Because a syllable is always constructed in this consonant–vowel–consonant order, only the consonant character group or the vowel character group needs to be presented at each step. Therefore, in the first layer, we present the character groups in three hierarchies, i.e., consonant–vowel–consonant. The second layer clusters the characters of each component, because Hangul characters can be readily grouped based on pronunciation. We build a subhierarchical structure for each hierarchy to cluster the characters. In the second layer, each substructure has two hierarchies. In the first hierarchy, the representative character of each cluster is presented. Once one of the representative characters is selected, the second hierarchy presents all of the characters in the selected cluster, and one is chosen as the final input.

The rest of this article is organized as follows. Section II introduces the structure of Hangul and the procedure of the row/column stimulus paradigm. Section III describes the proposed hierarchical stimulus presentation paradigm for a Hangul input system, together with the acquisition, processing, and classification of EEG signals. Section IV compares the row/column and the hierarchical stimulus presentation paradigms for a Hangul input system. Section V draws conclusions and suggests future studies.

II. RELATED WORK

A. Structure of Hangul. Hangul is the Korean script system, which was invented in 1443. One syllable of Hangul consists of an initial consonant, a vowel, and an optional final consonant, in that order. A syllable is therefore an ordered sequence of two characters (without the final consonant) or three characters. The characters in each component are shown in Table I.
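As a hedged illustration of the consonant–vowel–consonant structure, the sketch below composes a syllable from its components using the Unicode Hangul syllable composition rule. It is not part of the proposed system: the function name `compose_syllable` and the jamo lists (given in Unicode order, not in the layout of Table I) are assumptions made for this example.

```python
# Jamo inventories in Unicode composition order (assumption: not the Table I layout).
INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")                    # 19 initial consonants
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")                  # 21 middle vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # "" = no final

SYLLABLE_BASE = 0xAC00  # code point of the first precomposed Hangul syllable


def compose_syllable(initial, vowel, final=""):
    """Compose one syllable from an initial consonant, a vowel, and an optional final."""
    i = INITIALS.index(initial)
    v = VOWELS.index(vowel)
    f = FINALS.index(final)
    return chr(SYLLABLE_BASE + (i * len(VOWELS) + v) * len(FINALS) + f)


print(compose_syllable("ㅎ", "ㅏ", "ㄴ"))  # three components: 한
print(compose_syllable("ㄱ", "ㅡ", "ㄹ"))  # three components: 글
print(compose_syllable("ㅎ", "ㅏ"))        # two components (no final consonant): 하
```

The fixed order of the three arguments mirrors the ordering rule described above: a vowel can only follow an initial consonant, and a final consonant can only follow a vowel.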

Table I. A hierarchy of the Hangul syllable and the character sets in each hierarchy

Hierarchy Characters

Initial consonant

Middle vowel

Final consonant

Bold-faced characters are the basic components in consonants and vowels.

Figure 1. Overall flow of P300-based Hangul input system.

B. Row/Column Stimulus Presentation Paradigm. Farwell and Donchin (1988) developed a P300-based BCI speller to input English. The speller uses a 6 × 6 matrix to present an oddball stimulus, in which a rare target stimulus is randomly highlighted among consecutive normal stimuli (Fig. 2a). The user focuses on the character to be input while the rows and columns of the 6 × 6 matrix flash in random order. In one round, in which all six rows and six columns are presented, the target character is highlighted twice, once when its row flashes and once when its column flashes, and consequently the P300 wave is expected to be elicited twice. The ratio of target to nontarget presentations in this sequence is therefore 2:10, i.e., 1:5. A P300-based BCI speller detects these P300 responses in the recorded EEG signals and determines which letter should be typed.
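The round structure described above can be sketched as follows. This is a minimal illustration, not the original speller's implementation: the function names, the use of Python's `random` module, and the per-row and per-column score lists are all assumptions introduced here.

```python
import random

ROWS, COLS = 6, 6


def flash_round(rng):
    """Return one round: every row and every column flashed once, in random order."""
    events = [("row", r) for r in range(ROWS)] + [("col", c) for c in range(COLS)]
    rng.shuffle(events)
    return events


def is_target_flash(event, target):
    """True if the flashed row or column contains the target cell (row, col)."""
    kind, index = event
    row, col = target
    return index == (row if kind == "row" else col)


def decode(row_scores, col_scores):
    """Pick the cell at the intersection of the best-scoring row and column."""
    best_row = max(range(ROWS), key=lambda r: row_scores[r])
    best_col = max(range(COLS), key=lambda c: col_scores[c])
    return best_row, best_col


rng = random.Random(0)          # fixed seed only to make the example repeatable
events = flash_round(rng)
target = (2, 4)                 # hypothetical target cell
hits = sum(is_target_flash(e, target) for e in events)
print(len(events), hits)        # 12 flashes per round, 2 of which contain the target
```

Whatever the shuffled order, each round contains exactly 2 target flashes and 10 nontarget flashes, which is the 1:5 ratio stated above.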

Flashing by row and column is more efficient than flashing each character one by one, because only 12 flashes are needed to cover all characters in a 6 × 6 matrix instead of 36. However, the row/column stimulus presentation paradigm suffers from the “double-flash problem,” as indicated by Townsend et al. (2010). It occurs when the row and column containing the target character flash consecutively, so that the target character accidentally flashes twice within a short interval. In that case, only a single P300-ERP may be elicited, either because the two consecutive P300 waveforms overlap or because the user does not notice the second flash of the target. Different stimulus presentation paradigms have been proposed to address this problem (Martens et al., 2009; Townsend et al., 2010), but those paradigms are still not efficient for spelling Hangul because they were not designed around the hierarchical structure of Hangul.
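To give a rough sense of how often a double flash could occur, the sketch below counts the orderings in which the target row and target column occupy adjacent slots within one round. It assumes a uniformly random flash order within each round and ignores the boundary between consecutive rounds; both are assumptions of this example, not statements from the cited studies, and `consecutive_probability` is a hypothetical helper name.

```python
from fractions import Fraction


def consecutive_probability(n_flashes):
    """Probability that two specific flashes are adjacent in a random order of n_flashes."""
    adjacent = 0
    total = 0
    for i in range(n_flashes):          # slot of the target row
        for j in range(n_flashes):      # slot of the target column
            if i == j:
                continue                # the two flashes cannot share a slot
            total += 1
            if abs(i - j) == 1:
                adjacent += 1
    return Fraction(adjacent, total)


print(consecutive_probability(12))  # 6 x 6 matrix, 12 flashes per round: 1/6
print(consecutive_probability(13))  # 6 x 7 matrix, 13 flashes per round: 2/13
```

Under these assumptions the count reduces to 2/n for n flashes per round, so roughly one round in six of a 6 × 6 speller would contain a double flash.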

Another issue with the row/column stimulus presentation paradigm is that it is difficult to reduce the size of the window that presents the stimuli, because doing so reduces the character size or the intercharacter distances, which would negatively affect spelling accuracy. In fact, Salvaris and Sepulveda (2009) showed that the accuracy of P300-based BCI spellers depends strongly on the character size and the intercharacter distances. For practical use of P300-based BCIs, however, the window must fit alongside various editing applications such as a messenger, word processor, or mobile input system. Hence, a new stimulus presentation paradigm is required that allows a reduced window size without loss of accuracy.

III. METHODS

A. Hierarchical Stimulus Presentation Paradigm for the Hangul Input System. The previous row/column stimulus presentation paradigm was designed to spell English. If we simply adopt this row/column paradigm for a Hangul input system, a 6 × 7 matrix is necessary to flash all Hangul characters. For efficiency, we designed the system so that some double characters that are used only as final consonants (“ ,” “ ,” etc.) are input by spelling two single consonants in succession. Figure 2b shows the window for presenting Hangul character stimuli in the row/column stimulus presentation paradigm.

The hierarchical structure of Hangul suggests that we do not need to show all consonants and vowels simultaneously and can instead exploit the ordering rule of Hangul (Section II.A). Another feature of Hangul to be considered is that consonants and vowels can be clustered based on pronunciation, which lets us create a hierarchical structure by naturally clustering the characters with similar pronunciations. This hierarchical structure may increase the efficiency of the interface because only a few characters need to be presented at a time instead of all the Hangul characters at once. Taking these features of Hangul into account, we design a hierarchical stimulus presentation paradigm as an alternative to the conventional row/column stimulus presentation paradigm.

Our hierarchical stimulus presentation paradigm consists of two layers. The first layer separates Hangul characters into three component groups, based on the fact that a Hangul syllable is composed of a sequence of an initial consonant, a vowel, and an optional final consonant. Accordingly, the first layer contains three hierarchies, one for each component. From each hierarchy we then lay out a subhierarchical structure to build more specific clusters of the characters in that component (Fig. 3a). For these clusters, the second-layer structures contain two hierarchies. The first hierarchy presents the representative character of each cluster on the first level of the window, and once a cluster is selected, the second hierarchy presents the characters belonging to the selected cluster on the second level of the window.

Figures 4a and 4b illustrate the first hierarchy of the second layer with the representative characters of each cluster for consonants and vowels, respectively; there are eight clusters for consonants and five for vowels. Once one of the clusters is selected, the second-level window appears and presents the set of characters in the selected cluster (Figs. 4c and 4d). As an example, consider that the user wants to input “ ” (Fig. 5). First, at the first hierarchy, the user selects “ ,” which is the representative character of the cluster that includes the character “ .” Then, the window turns to the next level showing the characters in the cluster of “ ,” so that the user can select “ .” The overall structure to input a Hangul character is described in Figure 3b. Note that an arrow for going back to the previous step is presented in each hierarchy to “undo” a false selection. As in the row/column stimulus presentation paradigm, some double characters that are used only as final consonants (such as “ ,” “ ,” etc.) are input by spelling two single consonants in succession. For that purpose, the proposed system shows the final-consonant window twice, and when the user wants to input only one final consonant or none, the character “Ø” is selected as the null input.
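The two-step selection within one component can be sketched as a small state machine. This is only an illustration under stated assumptions: the class name `HierarchicalSelector`, the `BACK` token standing in for the undo arrow, and the placeholder cluster labels are hypothetical, and the actual cluster contents of Figure 4 are not reproduced here. In this simplified version the back arrow always returns to the representative-character level.

```python
BACK = "<back>"  # hypothetical token standing in for the undo arrow


class HierarchicalSelector:
    """Two-hierarchy selection: first a cluster representative, then a member character."""

    def __init__(self, clusters):
        self.clusters = clusters    # representative -> list of characters in its cluster
        self.open_cluster = None    # None means the first hierarchy is being shown

    def visible(self):
        """Symbols currently presented in the window."""
        if self.open_cluster is None:
            return list(self.clusters) + [BACK]
        return list(self.clusters[self.open_cluster]) + [BACK]

    def select(self, symbol):
        """Apply one selection; return the chosen character, or None while navigating."""
        if symbol not in self.visible():
            raise ValueError(f"{symbol!r} is not currently presented")
        if symbol == BACK:
            self.open_cluster = None        # simplified undo: return to the first hierarchy
            return None
        if self.open_cluster is None:
            self.open_cluster = symbol      # first hierarchy: open the chosen cluster
            return None
        self.open_cluster = None            # second hierarchy: character chosen
        return symbol


# Placeholder clusters; the labels are NOT the clusters used in the study.
selector = HierarchicalSelector({"A": ["A", "A1", "A2"], "B": ["B", "B1"]})
selector.select("A")            # open cluster "A"
print(selector.visible())       # ['A', 'A1', 'A2', '<back>']
print(selector.select("A1"))    # A1
```

Running one such selector for each of the initial consonant, the vowel, and the two final-consonant windows would give the overall input sequence outlined in the text.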

Figure 2. Examples of the row/column stimulus presentation paradigm. (a) an English input system (Farwell and Donchin, 1988), (b) a Hangul

input system.


Figure 3. The overall structure of the proposed Hangul input system. (a) The system is composed of the first hierarchical structure (the first layer) and the corresponding subhierarchical structure (the second layer). Three hierarchies constitute the first layer, where two subhierarchies are laid out from each hierarchy. (b) An illustration of a general procedure to input a Hangul syllable using the proposed input system. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Figure 4. A hierarchy of the presentation of Hangul characters. (a) Representative characters for consonant clusters, (b) representative characters for vowel clusters, (c) character sets of each consonant cluster, and (d) character sets of each vowel cluster.

Because of its hierarchical structure, the proposed paradigm has three main advantages. First, its window is one quarter the size of that of the row/column stimulus presentation paradigm (half the width and half the height), because the number of characters presented at once is reduced from 42 to 9. Second, the character size can be increased up to six times that of the row/column paradigm. Salvaris and Sepulveda (2009) showed that a P300 speller performs better with a large character size than with a small one. Finally, we are free from the row/column flashing strategy and can use a character-by-character flashing strategy, which eliminates the “double-flash problem” mentioned in Section II.B. Even with character-by-character flashing, the number of flashes per selection in the proposed paradigm (at most 9 clusters + 4 characters = 13, at least 6 clusters + 4 characters = 10) is no higher than that in the row/column stimulus presentation paradigm (6 rows + 7 columns = 13), because of its hierarchical structure.
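The flash counts quoted above can be checked with a few lines of arithmetic. The helper names below are hypothetical and introduced only for this check; the numbers are those given in the text.

```python
def row_column_flashes(rows, cols):
    """Flashes per round when every row and every column flashes once."""
    return rows + cols


def hierarchical_flashes(n_clusters, n_characters):
    """Flashes per round when clusters and then member characters flash one by one."""
    return n_clusters + n_characters


print(row_column_flashes(6, 7))     # row/column Hangul matrix: 13
print(hierarchical_flashes(9, 4))   # hierarchical paradigm, upper bound: 13
print(hierarchical_flashes(6, 4))   # hierarchical paradigm, lower bound: 10
print(6 * 7)                        # characters shown at once in the row/column window: 42
```

The comparison shows why the character-by-character strategy does not cost extra flashes here: splitting the selection into two small steps keeps the sum of the two steps at or below the 13 flashes of the 6 × 7 matrix.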

B. Data Acquisition. Three able-bodied Korean subjects (two men and one woman; 25–30 years of age) participated in the study. All of the subjects were naive to BCI use. Informed consent was obtained from each subject before the experiment. Each participant sat in a comfortable chair at a distance of approximately 1 m from the computer monitor. The EEG was recorded using the Quick-Cap (Compumedics NeuroScan, Charlotte, NC) embedded with 32 electrodes distributed over the entire scalp according to the International 10-20 system. The impedances of all channels were reduced below 5.0 kΩ before each session began. The EEG signals were band-pass filtered between 0.1 and 100 Hz and digitized at 250 Hz. We additionally applied a 50 Hz notch filter to remove line noise. Each session consisted of five rounds of each character. For each subject, we acquired three sessions for training data and another three sessions for test data.

C. Feature Extraction. We selected a 600 ms window of data (150 digitized samples) after the onset of each flash to decide which cluster or individual character was intended by the user. The first 200 ms segment (50 samples) was used as a baseline because the P300 component was expected to be elicited about 300 ms after the onset. As suggested by Krusienski et al. (2008), we divided the 150 samples into small blocks of $N$ samples ($N = 6, 12, 24$) to reduce the dimensionality of the data, so that the EEG signals were segmented into $B$ blocks ($B = 25, 12, 6$, respectively). The sample means of the blocks were used as the EEG features $F$:

$$F = [f_1, \ldots, f_b, \ldots, f_B], \quad \text{where } f_b = \frac{1}{N} \sum_{k=1}^{N} s_{kb} \tag{1}$$

where $s_{kb}$ represents the $k$th sample of the EEG signal in the $b$th block and $f_b$ represents the mean of the samples in the $b$th block. Of the 32 channels, we selected 10 (Fz, Cz, P7, P3, Pz, P4, P8, O1, Oz, and O2) that are known to be highly related to the P300-ERP. Combining the sample means from all 10 channels, the dimensionality of the feature space became 10 channels × $B$ blocks. Figure 6 shows the locations of the selected channels. Flashing of the representative characters and flashing of the characters in the selected cluster constituted a single round, and a total of 15 rounds constituted a single epoch to input a character.

Figure 5. An illustrative example of selecting a character in the subhierarchical structure. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Figure 6. Locations of the selected EEG channels in the International 10-20 system used for detecting the P300-ERP. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

D. Classification. In this article, we use Fisher's linear discriminant analysis (Fukunaga, 1990) for classification. Finding the P300-ERP in EEG signals is a binary classification problem with a decision hyperplane defined as follows (Salvaris and Sepulveda, 2009):

$$w \cdot x + b = 0 \tag{2}$$

where $x$ is a feature vector of EEG signals, $w$ is a weight vector, and $b$ is a bias term.
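The block-averaging features of Eq. (1) and the linear score of Eq. (2) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the input is taken to be one list of 150 samples per channel, and trailing samples that do not fill a whole block (6 samples when N = 12 or N = 24) are simply dropped, which is one way to obtain B = 12 and B = 6 but is not spelled out in the text.

```python
def block_means(samples, block_size):
    """Eq. (1): mean of each consecutive block of `block_size` samples."""
    n_blocks = len(samples) // block_size   # assumption: incomplete trailing block is dropped
    return [
        sum(samples[b * block_size:(b + 1) * block_size]) / block_size
        for b in range(n_blocks)
    ]


def epoch_features(channels, block_size):
    """Concatenate the block means of all channels into one feature vector."""
    features = []
    for samples in channels:
        features.extend(block_means(samples, block_size))
    return features


def linear_score(w, x, b=0.0):
    """Left-hand side of Eq. (2): w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Synthetic stand-in for one post-stimulus window: 10 channels x 150 samples.
window = [[float(k) for k in range(150)] for _ in range(10)]
for n in (6, 12, 24):
    print(n, len(epoch_features(window, n)))   # 250, 120, and 60 features
```

With 10 channels the feature vector has 10 × B entries, so the three block sizes give 250, 120, and 60 features, respectively.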

In the proposed hierarchical stimulus presentation paradigm, a character is chosen in two hierarchical steps. At the first hierarchy, the user selects the representative cluster that contains the target character. Then, at the second hierarchy, the user selects the target character directly. To match this hierarchy, the classification process also consists of two steps:

$$\hat{g} = \arg\max_{g} \Big( \sum_{j} w \cdot x_{jg} \Big) \tag{3}$$

$$\hat{c} = \arg\max_{c} \Big( \sum_{j} w \cdot x_{jc} \Big) \tag{4}$$

where $g \in \{1, \ldots, G\}$ ($G = 9$ for consonants and $G = 6$ for vowels) is a cluster index in the first hierarchy, $\hat{g}$ is the index of the cluster that elicits the P300-ERP, $c$ is the index of a character in the second hierarchy, $\hat{c}$ is the index of the character that elicits the P300-ERP, and $j$ is the index of a round within one epoch to input a character.
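The two-step decision of Eqs. (3) and (4) amounts to summing the classifier score of each candidate over the rounds and taking the largest sum, once for clusters and once for characters. The sketch below illustrates this with toy two-dimensional features; the function names, the nested-list data layout, and the numbers are assumptions of this example, and the weight vector is taken as already trained.

```python
def dot(w, x):
    """Inner product w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pick(w, rounds):
    """Return argmax_i of sum_j w . x_ji, where rounds[j][i] is the feature
    vector recorded for candidate i in round j (Eqs. 3 and 4 share this form)."""
    n_candidates = len(rounds[0])
    totals = [sum(dot(w, r[i]) for r in rounds) for i in range(n_candidates)]
    return max(range(n_candidates), key=lambda i: totals[i])


w = [1.0, -1.0]  # toy weight vector (assumed to come from a trained classifier)

# First hierarchy: 2 rounds, 3 candidate clusters.
cluster_rounds = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0]],
]
# Second hierarchy: 2 rounds, 2 candidate characters of the chosen cluster.
character_rounds = [
    [[0.0, 1.0], [3.0, 0.0]],
    [[1.0, 0.0], [1.0, 0.0]],
]

g_hat = pick(w, cluster_rounds)      # Eq. (3)
c_hat = pick(w, character_rounds)    # Eq. (4)
print(g_hat, c_hat)                  # 1 1
```

The bias term of Eq. (2) is left out of the sums because it adds the same constant to every candidate and therefore does not change which candidate attains the maximum.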

IV. RESULTS

A. Grand Average of P300-ERP. Figure 7 compares the averaged amplitudes of the P300-ERPs when a target flashed in the row/column and the hierarchical stimulus presentation paradigms at channels Fz, Cz, and Pz for the three subjects. Both paradigms elicited an increase in amplitude from around 300 ms to 600 ms after stimulus onset. However, the P300 waveforms elicited by the hierarchical stimulus presentation paradigm showed a larger increase in amplitude than those in the row/column stimulus presentation paradigm for all the subjects. We attribute these larger and clearer P300 waveforms to the larger character size and the character-by-character presentation strategy of the hierarchical paradigm (other factors of the interface, such as intercharacter distances, were the same). This suggests that the hierarchical paradigm effectively reduces the ambiguity in the P300 waveforms associated with the conventional row/column paradigm.

Figure 7. Average P300 amplitudes of the target flashes in the row/column and the hierarchical stimulus presentation paradigms on sites Fz, Cz, and Pz for each subject.

B. Classification Accuracy. Figure 8 depicts the classification accuracy for each subject as a function of the number of rounds in the row/column and the hierarchical stimulus presentation paradigms. Accuracy was calculated as the rate of correctly spelled characters, without use of the “undo” option for a false selection. The hierarchical stimulus presentation paradigm outperformed the row/column stimulus presentation paradigm in most rounds (the exceptions were rounds 1 and 2 with N = 6 and round 2 with N = 24 in subject 1, and rounds 1, 5, and 6 with N = 24 in subject 2). Also, with our hierarchical paradigm we achieved a classification accuracy of 100% with 15 rounds for N = 6 and N = 12 in all the subjects.

These results suggest that the hierarchical stimulus presentation paradigm improves the performance of the P300-based Hangul speller relative to the conventional row/column stimulus presentation paradigm. They also imply that a P300 speller based on the proposed paradigm will be more useful in real applications, such as messengers or word processors, because it provides a smaller window size without loss of accuracy.

V. CONCLUSIONS AND FUTURE WORK

In this article, we proposed a hierarchical stimulus presentation paradigm for a P300-based Hangul input system. This system is the first P300-based BCI speller specialized for Hangul input. The new P300-based input system has a hierarchical structure that was designed around the hierarchical structure of Hangul. This structure, along with clustering of the Hangul characters, reduced the number of characters presented at a time. Consequently, we could increase the character size and employ a character-by-character flashing strategy, which helped improve the classification accuracy. Our hierarchical paradigm elicited larger and clearer P300-ERPs than the conventional row/column paradigm, and the classification accuracy of the proposed paradigm was higher than that of the row/column paradigm. Although they were obtained from a relatively small dataset, the empirical results suggest that the hierarchical stimulus presentation paradigm is better suited than the row/column stimulus presentation paradigm to spelling Hangul, which increases its usability in real applications such as messengers and word processors.

Figure 8. Classification accuracy as a function of the number of the presentation rounds in three subjects.

As next steps, we will extend the hierarchical stimulus presentation paradigm to input not only Hangul but also other types of characters, including English letters and numbers. We will also investigate the feasibility of using our new P300-based input system for those who are afflicted with ALS or the locked-in syndrome.

REFERENCES

C. Bell, P. Shenoy, R. Chalodhorn, and R. Rao, Control of a humanoid robot by a noninvasive brain–computer interface in humans, J Neural Eng 5 (2008), 214–220.

L. Citi, R. Poli, C. Cinel, and F. Sepulveda, P300-based BCI mouse with genetically-optimized analogue control, IEEE Trans Neural Syst Rehabil Eng 16 (2008), 51–61.

L. Farwell and E. Donchin, Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials, Electroencephalogr Clin Neurophysiol 70 (1988), 510–523.

K. Fukunaga (Editor), Introduction to statistical pattern recognition, Academic Press, San Diego, 1990.

U. Hoffmann, G. Garcia, J. Vesin, K. Diserens, and T. Ebrahimi, A boosting approach to P300 detection with application to brain-computer interfaces, Proc 2nd Int IEEE EMBS Conf Neural Eng, Washington D.C., USA, Mar. 16-19, 2005, pp. 97–100.

D. Krusienski, E. Sellers, D. McFarland, T. Vaughan, and J. Wolpaw, Toward enhanced P300 speller performance, J Neurosci Methods 167 (2008), 15–21.

S. Martens, N. Hill, J. Farquhar, and B. Schölkopf, Overlap and refractory effects in a brain-computer interface speller based on the visual P300 event-related potential, J Neural Eng 6 (2009), 026003.

B. Rebsamen, E. Burdet, C. Guan, H. Zhang, C. Teo, Q. Zeng, M. Ang, and C. Laugier, A brain-controlled wheelchair based on P300 and path guidance, Proc 1st IEEE/RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics, Pisa, Italy, Feb. 20-22, 2006, pp. 1101–1106.

M. Salvaris and F. Sepulveda, Visual modifications on the P300 speller BCI paradigm, J Neural Eng 6 (2009), 046011.

G. Townsend, B. LaPallo, C. Boulay, D. Krusienski, G. Frye, C. Hauser, N. Schwartz, T. Vaughan, J. Wolpaw, and E. Sellers, A novel P300-based brain-computer interface stimulus presentation paradigm: Moving beyond rows and columns, Clin Neurophysiol 121 (2010), 1109–1120.

J. Wolpaw, N. Birbaumer, D. McFarland, G. Pfurtscheller, and T. Vaughan, Brain-computer interfaces for communication and control, Clin Neurophysiol 113 (2002), 767–791.

