Uncorrected Proof

SHORT REPORTS

Comprehensive analysis of the gene expression profiles in human gastric cancer cell lines

Jiafu Ji 1,9, Xin Chen 2,3,9, Suet Yi Leung 4,9, Jen-Tsan A Chi 5, Kent Man Chu 6, Siu Tsan Yuen 4, Rui Li 2,3, Annie SY Chan 4, Jiyou Li 7, Nina Dunphy 8 and Samuel So*,2,3

1 Department of Surgery, Beijing Cancer Hospital, Peking University School of Oncology, Beijing, China; 2 Department of Surgery, Stanford University, Stanford, CA, USA; 3 Asian Liver Center, Stanford University, Stanford, CA, USA; 4 Department of Pathology, The University of Hong Kong, Queen Mary Hospital, Hong Kong, China; 5 Department of Biochemistry, Stanford University, Stanford, CA, USA; 6 Department of Surgery, The University of Hong Kong, Queen Mary Hospital, Hong Kong, China; 7 Pathology Department, Beijing Cancer Hospital, Peking University School of Oncology, Beijing, China; 8 Department of Genetics, Stanford University, Stanford, CA, USA

Gastric adenocarcinoma is one of the major malignancies worldwide. Gastric cell lines have been widely used as the model to study the genetics, pharmacology and biochemistry of gastric cancers. Here we describe a comprehensive survey of the gene expression profiles of 12 gastric carcinoma cell lines, using cDNA microarray with 43 000 clones. For comparison, we also explored the gene expression patterns of 15 cell lines derived from lymphoid, endothelial, stromal and other epithelial cancers. Expression levels of specific genes were validated through comparison to protein expression by immunohistochemistry using cell block arrays. We found sets of genes whose expression corresponds to the molecular signature of each cell type. In the gastric cancer cell lines, apart from genes that are highly expressed corresponding to their common epithelial origin from the gastrointestinal tract, we found marked heterogeneity among the gene expression patterns of these cell lines. Some of the heterogeneity may reflect their underlying molecular characteristics or specific differentiation program. Two putative gastric carcinoma cell lines were found to be B-cell lymphoma, and another one had no epithelial specific gene expression and hence was of doubtful epithelial origin. These cell lines should no longer be used in gastric carcinoma research. In conclusion, our gene expression database can serve as a powerful resource for the study of gastric cancer using these cell lines.
Oncogene (2002) 00, 00 – 00. doi:10.1038/sj.onc.1205829

Keywords: gene expression profile; microarray; gastric cancer

Received 3 April 2002; revised 13 June 2002; accepted 28 June 2002

*Correspondence: S So, Department of Surgery, 300 Pasteur Drive, Room H3680, Stanford University Medical Center, Stanford, CA 94305, USA; E-mail: [email protected]
9 These authors contributed equally to this work.

Oncogene (2002) 00, 00 – 00 © 2002 Nature Publishing Group All rights reserved 0950 – 9232/02 $25.00
www.nature.com/onc
ONC 11580 (REPROOF)
ONCOGENOMICS

Gastric cancer is the second most common cancer worldwide, accounting for almost 10% of new cancer cases (Parkin et al., 1999). It is also among the leading causes of death from cancer throughout the world (Pisani et al., 1999). Gastric adenocarcinoma constitutes approximately 90% of all gastric cancers. Helicobacter pylori infection has been clearly linked to the development of gastric adenocarcinoma (Ebert et al., 2000). However, the molecular mechanism of this association remains unclear. Pathologically, there are two types of gastric cancer: intestinal type and diffuse type. The intestinal type cancer commonly arises in a background of chronic atrophic gastritis and intestinal metaplasia. As with other types of malignancies, the prognosis of gastric cancer patients depends heavily on the clinical and pathological stage of the disease at diagnosis. Patients with small and early cancer lesions who undergo surgical resection have a better chance of survival. However, most patients are diagnosed with advanced stage disease and the five-year survival rate is generally less than 10% (Peddanna et al., 1995).

Recent evidence suggests that the phenotypic diversity of tumors is associated with corresponding diversity in their gene expression programs. cDNA microarray technology has been applied to study the gene expression patterns in different tumor types, providing new insight into the development and classification of these cancers (Alizadeh et al., 2000; Golub et al., 1999; Perou et al., 2000; Ramaswamy et al., 2001; Welsh et al., 2001).

Cell lines have been extensively used as experimental models to study the genetics, pharmacology and biochemistry of cancer cells, as well as cellular response to different stimulators. Ross et al. (2000) systematically studied the variation in gene expression programs in 60 human cancer cell lines (NCI60). NCI60 cell lines were derived from nine broad categories of tissue of origin. They have been used in the large-scale drug screening by NCI (Monks et al., 1997). It was shown that each cell line expresses genes characteristic of its cellular origin. Specific features of these gene expression patterns seemed to be related to




the physiological properties of the cell lines, including cell proliferation rate and drug metabolism. The gene expression profiles of these cell lines also helped to differentiate specific cell types in normal and tumor tissues. Furthermore, correlating the gene expression profile in each cell line with the responsiveness of each cell line to drugs allowed for the identification of genes that may be important for drug sensitivities (Scherf et al., 2000; Staunton et al., 2001). However, gastric cancer cell lines were not included in the NCI60 studies. Little is still known about the gene expression patterns of gastric cell lines on the genomic scale.

Global gene expression patterns in 27 human cell lines

In this study, we used cDNA microarrays with 43 000 cDNA clones, representing approximately 30 000 unique genes, to study the gene expression patterns of 12 gastric carcinoma cell lines. As a control to identify the gene expression signatures of the various cell types, we included 15 cell lines of other tissue origin, including T-cell, B-cell, monocyte, myelocyte, fibroblast, endothelium, colon, breast and pancreas. We used a hierarchical clustering algorithm to group genes as well as the cell lines, on the basis of similarity in their expression patterns. A total of 6849 cDNA clones showed significant variation among the 27 cell lines (an expression difference of at least fourfold from the mean in at least one array, with at least 60% valid data points) (Figure 1a).

Figure 1 (a) Hierarchical clustering of the patterns of variation in the expression of 6849 cDNA clones in 27 cell lines. The data are shown in a table format, in which rows represent individual genes and columns represent individual cell lines. The color in each cell reflects the expression level of the corresponding gene in the corresponding cell line, relative to its mean expression level across the entire set of cell lines. The scale (lower right corner) extends from fluorescence ratios of 0.25 to 4 relative to the mean level for all samples. Grey indicates missing or excluded data. (b) to (f) Features of the variation in the gene expression patterns that can be related to specific physiological or histological features of the cell lines. (b) Epithelial cell cluster; (c) B lymphocytes cluster; (d) T lymphocytes cluster; (e) fibroblast cell cluster; (f) endothelial cell cluster. Due to limited space, only a few selected gene names are shown. See Supplementary Information for full data. Materials and methods: A total of 27 cell lines were used in this study. These included: twelve gastric cancer cell lines (AGS, KATO3, SNU1, SNU5, SNU16, RF1, RF48, N87, NUGC3, MKN45, BGC823, PAM82), two T cell lines (Jurkat, MOLT4), two B cell lines (LAM, HFI-1), one acute promyelocytic leukemia-like cell line (NB4+RA), one monocyte-like cell line (U937+PMA), one pancreas cancer cell line (BxPC3), four colon cancer cell lines (colo205, HCT116, SW620, HCT15), one breast cancer cell line (MCF7), one primary fibroblast and two primary endothelial cell lines. The detailed information for these cell lines is available through the web supplement. All cell lines were cultured to 80% confluence, harvested and frozen at −80°C until ready to be isolated. mRNA was extracted directly from the frozen cell pellet using the FastTrack (Invitrogen) mRNA isolation kit. For the microarray production, 43 000 cDNA clones, representing about 30 000 unique genes, were mechanically printed onto treated glass microscope slides, as previously described (http://cmgm.stanford.edu/pbrown/array.html) (Perou et al., 2000). For RNA labeling, a common reference, which consisted of a mixture of eleven cell lines, was used (Perou et al., 2000). The hybridization procedures were performed as previously described (Alizadeh et al., 2000). A detailed protocol is available at: http://cmgm.stanford.edu/pbrown/protocols/5_hyb_human.html. Primary data collection and analysis were carried out using GenePix Pro 3.0 (Axon Instruments). Areas of the array with obvious blemishes were manually flagged and excluded from subsequent analysis. The raw data were deposited into the Stanford Microarray Database (Sherlock et al., 2001) at: http://genome-www4.stanford.edu/MicroArray/SMD/index.html. For the generation of the cluster, all non-flagged array elements for which the fluorescent intensity in either channel was greater than 2.5 times the local background were considered well measured. Genes for which fewer than 60% of measurements across all the samples in this study met this standard were excluded from further analysis. We chose to further analyse genes whose expression level differed by at least fourfold, in at least one sample, from their mean expression level across all samples. We applied a hierarchical clustering algorithm both to the genes and arrays using the Pearson correlation coefficient as the measure of similarity, and average linkage clustering, as described (Eisen et al., 1998). The results were visualized and analysed with TreeView (M Eisen; http://rana.lbl.gov)

The most notable feature of the clustered data was that cell lines clustered into two major branches. Cell lines derived from gastrointestinal epithelial cells, including gastric, colon and pancreas, clustered into one branch, whereas the rest of the cell lines clustered into a second branch. The second branch in turn consisted of three sub-branches: endothelial cells and fibroblasts clustered together; the cell lines derived from leukocytes clustered together; and SNU1 and MCF7 formed the third cluster. As shown previously, these gene expression patterns were clearly related to the histological origins of the cell lines (Ross et al., 2000). Each cell type expressed the genes characteristic of its cellular origin (Figure 1b – f). For example, B lymphocyte cell lines expressed CD20, BCL2, immunoglobulin heavy and light chains, and HLA class II molecules (Figure 1c); T lymphocyte cell lines expressed CD3, CD6 and CD28 (Figure 1d); endothelial cell lines expressed Von Willebrand factor, CD31 and VE-cadherin (Figure 1f); and fibroblast cell lines expressed different members of the collagen family and other genes encoding cell matrix proteins (Figure 1e). Cells that originated from the epithelial lineage, including gastric carcinoma cell lines, clustered together and expressed a set of genes, many of which have been implicated in epithelial cell biology (Figure 1b). This cluster included genes encoding structural proteins, for example, of the cell-cell adherence complex (desmoplakin, Claudin 3 and Claudin 4), the cell-matrix complex (integrin beta 4) and epithelial intermediate filaments (Keratin 8 and Keratin 18). The patterns of gene expression measured in these 27 cell lines provide us with a framework for distinguishing different cell types within the histologically complex gastric tissues.

It is notable that within each tissue specific gene cluster, many genes encoding signaling molecules, transcriptional regulators, and EST clones were identified (see the GeneExplore file of the web supplement (http://genome-www.stanford.edu/gc_cells/explore.shtml) for the complete data set including EST UniGene cluster IDs and Accession Numbers). Of particular interest in cancer cell biology are the epithelial and endothelial gene clusters. In the epithelial cell cluster, besides those structural proteins known to be important for epithelial cell biology, the cluster also included cell surface receptors (ErbB3, MST1 receptor, DDR1), transcription factors (E74-like factor), signaling molecules (SH3BP1, ARHGEF5, ARHGEF16), and many ESTs of unknown function. In the endothelial specific gene cluster, many angiogenic factors and receptors that were recently implicated in neovascularization were identified: for example, placental growth factor (PGF), Angiopoietin 2 (ANGPT2), fms-related tyrosine kinase 1 (FLT1), and neuropilins (NRP1 and NRP2). Moreover, many signaling molecules (RGS5, RGS4), a cell cycle regulator (CDKN1B), as well as ESTs of unknown function were also seen. Further investigation into the function of these genes is clearly needed. Our gene expression data have provided preliminary information about these genes, and further study of each of them may reveal novel biological functions in epithelial cell growth, angiogenesis, extracellular matrix formation, and host immune response. Also, examination of the gene repertoire in the tumor cells, fibroblasts and endothelial cells may reveal potential autocrine and paracrine interactions between these cell types in vivo.

Several interesting features regarding gastric cell lines emerged from the hierarchical clustering analysis. Two of the gastric carcinoma cell lines (RF1 and RF48) were found to co-cluster with the B lymphoma cell lines, LAM and HFI-1 (Figure 1), and showed dissimilar gene expression patterns compared with other gastric cell lines (Figures 1 and 3). They expressed genes that were characteristic of the B-cell lineage, including CD20, BCL2, immunoglobulin and HLA class II molecules (Figure 1c). A closer look at the cell morphology also revealed that RF1 and RF48 were relatively small cells with vesicular nuclei and multiple peripherally attached nucleoli, consistent with the cytological features of immature B-cells. This raised the possibility that RF1 and RF48 had been misidentified. Both cell lines were derived from the same patient, RF1 from the primary gastric adenocarcinoma and RF48 from the metastases in ascitic fluid. To further investigate the cellular origin of these two cell lines, immunohistochemical staining was performed on these two cell lines together with another gastric carcinoma cell line, SNU5, for Leukocyte Common Antigen (LCA, CD45, Dako), the B-cell marker CD20 (Dako) and the pan-epithelial marker Cam5.2 (Becton Dickinson) (Figure 2). Both RF1 and RF48 stained positive for LCA and CD20, but negative for Cam5.2, whereas SNU5 showed strong positive staining for Cam5.2 and negative staining for LCA and CD20. Together with the microarray data, these results show conclusively that RF1 and RF48 were in fact B-cell lymphoma.

SNU1 appeared to be very different from all other epithelial derived cell lines. It loosely clustered with MCF7, which is a breast carcinoma cell line (Figure 1).

Figure 2 Immunohistochemical staining shows that RF1 and RF48 are derived from B cell lineage. From left to right are RF1, RF48 and SNU5. Each cell line was stained with the following antibodies: (from top to bottom) LCA (CD45, Dako), CD20 (L26, Dako) and Cam5.2 (Becton-Dickinson). Immunohistochemical staining was performed using the standard streptavidin-biotin peroxidase method with heat-mediated antigen retrieval


We found that it lacked the expression of the set of genes characteristic of epithelial cells (Figure 1b). It expressed a set of unique genes (Figure 3a); however, most of these genes lacked tissue specificity. To further characterize this cell line, we performed immunohistochemical staining. We found that it stained negative for all the pan-epithelial markers available, including Cam5.2 (Becton Dickinson), AE1/AE3 (Dako), MNF116 (Dako), BerEP4 (Dako), EMA (Dako), and carcinoembryonic antigen (mCEA, Zymed). SNU1 also stained negative for gastrointestinal stromal tumor markers including desmin (Dako), smooth muscle actin (Dako), c-kit (Dako) and CD34 (Becton Dickinson), the connective tissue marker vimentin (Dako), and the melanoma marker HMB45 (Dako). It stained positive for neuron-specific enolase (Zymed) and S100 (Dako), raising the possibility of neuroendocrine origin. However, ultrastructural examination revealed very primitive tumor cells with no specific differentiation features. In particular, there were no neurosecretory granules, so the possibility of a neuroendocrine tumor cannot be confirmed (data not shown). Together with our microarray data, this suggests that SNU1 may not be derived from gastric epithelial cells. However, we cannot be conclusive about its cellular origin and can only consider it to be a poorly differentiated tumor cell line.

Heterogeneity among the gene expression patterns of gastric cancer cell lines

To further analyse the gene expression patterns of gastric cancer cell lines, we clustered the 12 gastric cell lines using 3499 cDNA clones which had significant variation among these 12 cell lines (Figure 3). Apart from RF1, RF48 and SNU1, which are probably non-epithelial in origin, the remaining nine gastric cancer cell lines were divided into two main groups. SNU5, SNU16, KATO3, MKN45 and AGS clustered into one branch, whereas NUGC3, N87, PAM82 and BGC823 clustered into a second branch. The major genes that distinguish the two groups included Villin 1, LGALS4 and LI-cadherin (Figure 3e). All of these genes are known to be expressed in intestinal epithelial cells. Moreover, Villin 1 and LI-cadherin have been shown to be expressed in gastric mucosa with intestinal metaplasia (Grotzinger et al., 2001; Osborn et al., 1988). It is most likely that the cell lines from the first branch are derived from tumors that progress from intestinal metaplasia. As these two groups of cell lines adopt different differentiation programs, further investigation is needed to clarify whether they possess distinct biological and genetic properties, or different responsiveness to chemotherapeutic agents.

Some oncogenes and tumor suppressor genes are known to be altered by various mechanisms in gastric cancers. These alterations are also reflected in the gene expression data. The expression of p53 (TP53) was dramatically downregulated in KATO3 compared with other gastric cell lines (Figure 3b). This result is consistent with reports of genomic deletion of the p53 gene in KATO3 (Yokozaki, 2000). Amplification of FGFR2 (K-sam) is known in KATO3 (Yokozaki, 2000), and we found a correspondingly high level of FGFR2 expression in KATO3 and SNU16 (Figure 3c). It is known that amplification of the ErbB2 gene, which is associated with over-expression of the mRNA and protein, is present in a subset of gastric adenocarcinoma (Tokunaga et al., 1995). N87 showed a very high expression level of ErbB2 compared with all other gastric cancer cell lines. Interestingly, a group of genes mapped to the ErbB2 locus at chromosome 17q11-21, including PPARBP, CrkRS, MLN64, MLLT6 and MCG953, was also highly expressed in N87 cells (Figure 3d). This supports the idea that there is amplification at 17q11-21 and that this amplification results in the high expression of the ErbB2 oncogene in N87 cells.

Examination of the gene expression of individual gastric cell lines revealed marked heterogeneity, with sets of genes being up-regulated or down-regulated in each cell line (Figure 3a). Further study will be needed to investigate whether these patterns result from chromosomal aberrations, alterations in transcriptional regulators or other underlying molecular events, and to establish the biological significance of these unique gene expression patterns.

Gene expression patterns correlating with immunohistochemical staining of cell block arrays

DNA microarray analysis showed heterogeneity among the gene expression patterns within human gastric cell lines. However, microarray data only reveal the expression level of mRNA. Because proteins are the major components that carry out most of the cellular functions, it is important to study the corresponding expression at the protein level. To facilitate the characterization of the large number of gene products identified from DNA microarray studies in an efficient and economic way, we constructed a cell block array containing all twelve gastric cancer cell lines (including RF1, RF48 and SNU1) and performed immunohistochemical staining on these arrays. Several antibodies, against Villin-1, ErbB2, E-cadherin (CDH1), Keratin 7, Keratin 20 and Muc5AC, were used in the cell block array staining (Table 1). In general, we noticed a close association between mRNA expression and protein level. This observation supported the reliability of our gene expression data obtained from cDNA microarray experiments.

Two of the examples are shown in Figure 4. ErbB2 is highly expressed in N87 cells by our microarray analysis, and we found that N87 is the only cell line that shows positive staining for ErbB2 (Figure 4a). E-cadherin is one of the major molecules involved in cell-cell adhesion. Loss of expression of E-cadherin has been linked to the development of tumors (Nollet et al., 1999), and may result in the diffuse type of gastric cancer (Debruyne et al., 1999). Microarray analysis revealed that E-cadherin (CDH1) is expressed at very low levels in PAM82, BGC823, SNU5 and AGS cells.


Figure 3 (a) Hierarchical clustering of the patterns of variation in the expression of 3499 cDNA clones in 12 gastric cell lines based on similarity in gene expression patterns. The scale is the same as in Figure 1. To the right, lines and cell line names labeled in red represent the corresponding gene cluster that is highly expressed in the cell line. Alternatively, lines and cell line names labeled in green represent the corresponding gene cluster that is expressed in the cell line at low level. (b) to (e) Features of the variation in the gene expression patterns in these gastric cell lines. (b) Expression of the TP53 gene; (c) Expression of FGFR2; (d) ErbB2 cluster; (e) Villin 1 cluster. See Supplementary Information for full data. Materials and methods: All methods in this figure, including data selection and clustering, are the same as for Figure 1
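The data selection and clustering steps used for Figures 1 and 3 (spots well measured at more than 2.5 times local background, at least 60% valid measurements per clone, at least a fourfold difference from the mean in one sample, then Pearson-correlation clustering with average linkage) can be illustrated with a minimal sketch. This is not the authors' pipeline, which used GenePix Pro and the Eisen clustering software. The function names, the toy clone and cell line labels, the use of log2 ratios, and the plain pairwise-average (UPGMA-style) linkage are all assumptions made for illustration.

```python
import math

def well_measured(ch1, ch2, bg1, bg2, flagged=False, factor=2.5):
    """Unflagged spot whose intensity in either channel exceeds factor x local background."""
    if flagged:
        return False
    return ch1 > factor * bg1 or ch2 > factor * bg2

def select_clones(rows, min_valid=0.6, min_fold=4.0):
    """rows: {clone: [log2 ratio, or None if not well measured]}.
    Keeps clones with >= 60% valid values and >= fourfold deviation from
    their own mean in at least one sample; returns mean-centered values."""
    cutoff = math.log2(min_fold)  # fourfold = 2 units on a log2 scale
    kept = {}
    for clone, values in rows.items():
        valid = [v for v in values if v is not None]
        if not valid or len(valid) < min_valid * len(values):
            continue
        mean = sum(valid) / len(valid)
        centered = [None if v is None else v - mean for v in values]
        if any(v is not None and abs(v) >= cutoff for v in centered):
            kept[clone] = centered
    return kept

def pearson(a, b):
    """Pearson correlation over positions where both vectors have data."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def average_linkage(profiles):
    """Agglomerative clustering; cluster similarity is the mean pairwise
    Pearson correlation between members. Returns a nested tuple tree."""
    sim = {(a, b): pearson(profiles[a], profiles[b]) for a in profiles for b in profiles}
    clusters = [(name, [name]) for name in profiles]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                mi, mj = clusters[i][1], clusters[j][1]
                score = sum(sim[a, b] for a in mi for b in mj) / (len(mi) * len(mj))
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Toy data (hypothetical): four arrays, five clones.
cell_lines = ["gastric_1", "gastric_2", "lymphoid_1", "lymphoid_2"]
rows = {
    "cloneA": [3.0, 2.5, -2.5, -3.0],
    "cloneB": [0.2, -0.1, 0.1, -0.2],    # never fourfold from its mean
    "cloneC": [None, None, None, 4.0],   # only 25% valid measurements
    "cloneD": [-2.0, -2.4, 2.2, 2.2],
    "cloneE": [2.1, 1.9, -1.5, -2.5],
}
kept = select_clones(rows)
arrays = {name: [kept[c][k] for c in kept] for k, name in enumerate(cell_lines)}
tree = average_linkage(arrays)
print(sorted(kept), tree)
```

In this toy run the flat clone and the sparsely measured clone are dropped, and the two hypothetical gastric arrays pair off separately from the two lymphoid arrays. The sketch clusters arrays only; the same function could be applied to the clone profiles to order the genes.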


Cell block array immunohistochemistry showed a corresponding negative staining of E-cadherin protein in all these cell lines (Figure 4b).

Completion of the human genome project has provided the basic structure of all human genes. It also provides us with an exciting opportunity to study gene expression and function on the genomic scale. In this study, by profiling the expression of 27 cell lines, we were able to identify sets of tissue-specific genes that are of potential importance in carcinoma cell growth, angiogenesis, extracellular matrix formation and immune response. We also found a marked heterogeneity of gene expression among the gastric cancer cell lines, which may reflect their underlying differences in histogenetic origin, differentiation program or molecular pathway of tumor evolution. By their gene expression signature, we were able to redefine the histogenetic origin of two gastric cell lines (RF1 and RF48) as B-cell lymphomas. Although the histogenetic origin of SNU1 remains elusive, our data have raised sufficient doubt about its epithelial origin. Because these three cell lines are available from American

Table 1 Summary of the correlation of the gene expression data from cDNA microarray with immunohistochemical staining

Gene  Assay  SNU5  SNU16  KATO3  MKN45  AGS  NUGC3  N87  PAM82  BGC823  SNU1  RF1  RF48  Pearson correlation coefficient
Villin  Microarray  1.73  3.22  1.36  2.17  1.51  0.25  -1.26  -1.87  -1.57  -1.89  -1.37  -1.67  0.90**
Villin  IHC  2.00  3.00  3.00  3.00  3.00  0.00  1.00  0.00  0.00  0.00  0.00  0.00
ErbB2  Microarray  -0.10  0.31  0.47  0.58  0.47  -0.51  5.06  -0.61  -1.10  -0.60  -2.21  -2.16  0.86**
ErbB2  IHC  0.00  0.00  0.00  0.00  0.00  0.00  3.00  0.00  0.00  0.00  0.00  0.00
Ecad  Microarray  -1.00  2.05  1.66  2.88  -1.20  1.94  2.48  -1.89  -1.27  -1.96  -2.34  -2.46  0.94**
Ecad  IHC  0.00  2.00  1.00  2.00  0.00  1.00  2.00  0.00  0.00  0.00  0.00  0.00
KRT7  Microarray  0.24  1.02  2.67  0.94  -0.75  3.26  1.67  1.84  -4.18  -3.37  -3.34  0.88**
KRT7  IHC  0.00  2.00  3.00  2.00  0.00  3.00  3.00  2.00  2.00  0.00  0.00  0.00
KRT20  Microarray  5.31  -0.51  -0.19  -2.55  -1.16  2.95  -1.01  -1.43  -1.40  0.87**
KRT20  IHC  3.00  1.00  1.00  1.00  1.00  0.00  3.00  0.00  0.00  0.00  0.00  0.00
Muc5AC  Microarray  -0.99  -0.52  1.05  1.60  1.57  0.30  -0.65  -0.96  -1.41  0.77*
Muc5AC  IHC  0.00  2.00  1.00  2.00  0.00  0.00  0.00  2.00  1.00  0.00  0.00  0.00

The following antibodies were used in the staining: Villin (Immunotech); E-cadherin (Zymed); ErbB2 (Dako); Keratin 7 (Dako); Keratin 20 (Dako); Muc5AC (Neomarkers). Immunohistochemical staining was performed using the standard streptavidin-biotin peroxidase method with heat-mediated antigen retrieval. The percentage of positive cells and intensity of staining were taken into account and graded on a scale of 0 to 3 (0, negative; 1, occasional positive cells or weak staining intensity; 2, moderate number of positive cells or moderate intensity of staining; 3, most cells positive with intense staining) without knowledge of the microarray data. The mean expression level was taken if there was more than one cDNA clone per gene on the array. The Pearson correlation coefficient of the mRNA level and protein expression was computed for each gene (**Correlation is significant at 0.01 level; *Correlation is significant at 0.05 level)
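The per-gene coefficient in Table 1 can be checked directly from the tabulated values. The sketch below is an illustration rather than the authors' own calculation: it transcribes the Villin and ErbB2 rows, which have values for all twelve cell lines, and computes the Pearson correlation between the microarray value and the IHC grade across cell lines. The variable and function names are assumptions.

```python
import math

# Column order as in Table 1.
CELL_LINES = ["SNU5", "SNU16", "KATO3", "MKN45", "AGS", "NUGC3",
              "N87", "PAM82", "BGC823", "SNU1", "RF1", "RF48"]

# Rows transcribed from Table 1 (microarray value and IHC grade per cell line).
TABLE1 = {
    "Villin": {
        "microarray": [1.73, 3.22, 1.36, 2.17, 1.51, 0.25,
                       -1.26, -1.87, -1.57, -1.89, -1.37, -1.67],
        "ihc": [2, 3, 3, 3, 3, 0, 1, 0, 0, 0, 0, 0],
    },
    "ErbB2": {
        "microarray": [-0.10, 0.31, 0.47, 0.58, 0.47, -0.51,
                       5.06, -0.61, -1.10, -0.60, -2.21, -2.16],
        "ihc": [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0],
    },
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

correlations = {
    gene: pearson(row["microarray"], row["ihc"]) for gene, row in TABLE1.items()
}
for gene, r in correlations.items():
    print(f"{gene}: r = {r:.2f}")
```

For these two rows the computed values agree with the coefficients listed in Table 1 (0.90 for Villin and 0.86 for ErbB2). The sketch does not reproduce the significance levels.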

Figure 4 Correlating the gene expression data from cDNA microarray with immunohistochemical staining using cell block array. The upper panel shows the relative mRNA expression level measured by cDNA microarray. The scale of the color is the same as in Figure 1. The lower panel shows the immunohistochemical staining. (a) ErbB2 (Dako); (b) E-cadherin (Zymed). N87 shows deep brown membrane staining for ErbB2 whilst all other cell lines are negative. SNU16, KATO3, MKN45, NUGC3 and N87 expressed E-cadherin as deep brown membrane or granular cytoplasmic staining. All other cell lines are negative for E-cadherin protein. Enlarged figures for immunohistochemical staining of each cell line are available through the web supplement. Materials and methods: Twelve gastric cell lines were cultured to 80% confluence, washed in PBS and fixed in 4% paraformaldehyde. Paraffin cell blocks were prepared and a tissue microarray block containing all the cell lines was constructed. Immunohistochemical staining was performed using the standard streptavidin-biotin peroxidase method with heat-mediated antigen retrieval.


Typed Cell Culture (ATCC), they have been widely used as models to study the genetics and biology of gastric carcinoma. We believe that it is not appropriate to continue to use them in future studies.

In this study, we found that each gastric cell line has its own characteristic gene expression program (Figure 3a). Some of these programs may be the result of chromosomal abnormalities. For example, the high expression of the genes in the ErbB2 locus suggests amplification of the 17q11-21 region in N87 cells. It would be extremely interesting to further study the DNA copy number variation in these cell lines by array-based comparative genomic hybridization (aCGH) (Pollack et al., 1999). Correlating aCGH data with our gene expression results will certainly provide new candidates for oncogenes or tumor suppressor genes, as well as new insight into the molecular genetics of gastric cancer. Moreover, examination of the drug sensitivity of these gastric cell lines and correlation with the gene expression profile may lead to identification of genes responsible for drug resistance in gastric cancer. Our study has provided the first comprehensive view of the gene expression patterns in gastric cell lines on a genomic scale. It serves as a powerful resource for further study of gastric cancer using these cell lines as models.

Note added in proof

Supplementary Information is available through the author’s web supplement site at: http://genome_www.stanford.edu/GCcells.

Acknowledgments

We are grateful to the members of the Patrick Brown Laboratory at the Department of Biochemistry, Stanford University. We are especially thankful for the advice on this project provided by Drs Patrick Brown and David Botstein. We also thank the Stanford Functional Genomic Center, Stanford Microarray Database, and Stanford Asian Liver Center for their support, and Wijan Prapong for his help in the preparation of this manuscript. This work is supported by the HM Lui Foundation (X Chen, R Li, and S So); Research Grants Council of the Hong Kong Special Administrative Region (HKU 7264/01M); and China ‘973’ grant (G1998051203).

References

Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Moore T, Hudson J, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisenburger DD, Armitage JO, Warnke R and Staudt LM et al. (2000). Nature, 403, 503 – 511.

Debruyne P, Vermeulen S and Mareel M. (1999). Acta. Gastroenterol. Belg., 62, 393 – 402.

Ebert MP, Yu J, Sung JJ and Malfertheiner P. (2000). Eur. J. Gastroenterol. Hepatol., 12, 795 – 798.

Eisen MB, Spellman PT, Brown PO and Botstein D. (1998). Proc. Natl. Acad. Sci. USA, 95, 14863 – 14868.

Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD and Lander ES. (1999). Science, 286, 531 – 537.

Grotzinger C, Kneifel J, Patschan D, Schnoy N, Anagnostopoulos I, Faiss S, Tauber R, Wiedenmann B and Gessner R. (2001). Gut, 49, 73 – 81.

Monks A, Scudiero DA, Johnson GS, Paull KD and Sausville EA. (1997). Anticancer Drug Des., 12, 533 – 541.

Nollet F, Berx G and van Roy F. (1999). Mol. Cell Biol. Res. Commun., 2, 77 – 85.

Osborn M, Mazzoleni G, Santini D, Marrano D, Martinelli G and Weber K. (1988). Virchows Arch. A. Pathol. Anat. Histopathol., 413, 303 – 312.

Parkin DM, Pisani P and Ferlay J. (1999). Int. J. Cancer, 80, 827 – 841.

Peddanna N, Holt S and Verma RS. (1995). Anticancer Res., 15, 2055 – 2064.

Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Fluge O, Pergamenschikov A, Williams C, Zhu SX, Lonning PE, Borresen-Dale AL, Brown PO and Botstein D. (2000). Nature, 406, 747 – 752.

Pisani P, Parkin DM, Bray F and Ferlay J. (1999). Int. J. Cancer, 83, 18 – 29.

Pollack JR, Perou CM, Alizadeh AA, Eisen MB, Pergamenschikov A, Williams CF, Jeffrey SS, Botstein D and Brown PO. (1999). Nat. Genet., 23, 41 – 46.

Ramaswamy S, Tamayo P, Rifkin R, Mukherjee S, Yeang CH, Angelo M, Ladd C, Reich M, Latulippe E, Mesirov JP, Poggio T, Gerald W, Loda M, Lander ES and Golub TR. (2001). Proc. Natl. Acad. Sci. USA, 98, 15149 – 15154.

Ross DT, Scherf U, Eisen MB, Perou CM, Rees C, Spellman P, Iyer V, Jeffrey SS, Van de Rijn M, Waltham M, Pergamenschikov A, Lee JC, Lashkari D, Shalon D, Myers TG, Weinstein JN, Botstein D and Brown PO. (2000). Nat. Genet., 24, 227 – 235.

Scherf U, Ross DT, Waltham M, Smith LH, Lee JK, Tanabe L, Kohn KW, Reinhold WC, Myers TG, Andrews DT, Scudiero DA, Eisen MB, Sausville EA, Pommier Y, Botstein D, Brown PO and Weinstein JN. (2000). Nat. Genet., 24, 236 – 244.

Sherlock G, Hernandez-Boussard T, Kasarskis A, Binkley G, Matese JC, Dwight SS, Kaloper M, Weng S, Jin H, Ball CA, Eisen MB, Spellman PT, Brown PO, Botstein D and Cherry JM. (2001). Nucleic Acids Res., 29, 152 – 155.

Staunton JE, Slonim DK, Coller HA, Tamayo P, Angelo MJ, Park J, Scherf U, Lee JK, Reinhold WO, Weinstein JN, Mesirov JP, Lander ES and Golub TR. (2001). Proc. Natl. Acad. Sci. USA, 98, 10787 – 10792.

Tokunaga A, Onda M, Okuda T, Teramoto T, Fujita I, Mizutani T, Kiyama T, Yoshiyuki T, Nishi K and Matsukura N. (1995). Cancer, 75, 1418 – 1425.

Welsh JB, Zarrinkar PP, Sapinoso LM, Kern SG, Behling CA, Monk BJ, Lockhart DJ, Burger RA and Hampton GM. (2001). Proc. Natl. Acad. Sci. USA, 98, 1176 – 1181.

Yokozaki H. (2000). Pathol. Int., 50, 767 – 777.
