Bioinformatics analysis and identification of potential genes related to
pathogenesis of cervical intraepithelial neoplasia
Xue Zhang#1, Jian Bai#2, Cheng Yuan1 ,Long Long1, Zhewen Zheng1, Qingqing Wang1,
Fengxia Chen1 , Yunfeng Zhou1△
1Department of Radiation and Medical Oncology, Zhongnan Hospital, Wuhan
University, Wuhan, Hubei 430071, P.R. China2Department of Gastrointestinal Surgery and Department of Gastric and Colorectal
Surgical Oncology, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
#contributed equally to this work.
△Correspondence to: Pro. Yunfeng ZhouInstitution: Department of Radiation and Medical Oncology, Zhongnan Hospital,
Wuhan University Email: [email protected]
Abstract
The aim of this study was to explore and identify the key genes and signal
pathways contributing to cervical intraepithelial neoplasia (CIN). The gene expression
profiles of GSE63514 were downloaded from Gene Expression Omnibus database.
Differentially expressed genes (DEGs) were screened performing with packages in
software R. After Gene ontology terms, Kyoto Encyclopedia of Genes and Genomes
(KEGG) pathway enrichment analyzing, and Gene set enrichment analysis (GSEA),
weighted gene co-expression network analysis (WGCNA) was used to analyze these
genes. Then sub-modules were subsequently analyzed base CIN grade, and protein–
protein interaction (PPI) network of DEGs were constructed. 537 DEGs were
screened in total, consisting 331 up-regulated genes and 206 down-regulated genes in
CIN samples compared to normal samples. The most DEGs were enriched in
chromosomal region in cellular component (CC), organelle fission inbiological
process (BP) and ATPase activity in molecular function (MF). KEGG pathway
enrichment analyzing found the DEGs were mainly concentrated in 10 pathways. The
results of GSEA mainly enriched in 4 functional sets: E2F-Targets, G2M-Checkpoint,
Mitotic-Spindle and Spermatogenesis. A total of 6 modules were identified by
WCGNA. Subsequently, grey module was the highest correlation (Cor=0.78, P=5e-
22) and 31 genes were taken as candidate hub genes for CIN high grade risk
(weighted correlation coefficients >0.80). Finally, diagnostic analysis showed that in
addition to CCDC7, the expression levels of the remaining 13 DEGs have a high
diagnostic value (AUC>0.8 and P<0.05). These findings provided a new sight into the
understanding of molecular functions for CIN.
Keywords: cervical intraepithelial neoplasia, bioinformatical analysis, microarray,
differentially expressed genes
Introduction
The formation of cervical cancer is a continuous process from inflammation to
cervical intraepithelial neoplasia (CIN), and finally to invasive cancer, which takes 10
to 25 years1-3. CIN is regarded as a potentially premalignant transformation of
squamous cells of the cervix. According to the composite data for the natural history
of CIN, CIN1 is likely to regress in 60% of cases, persist in 30%, progress to CIN 3 in
10%, and progress to invasion in 1%4. Two high-risk HPV subtypes (types 16 and 18)
themselves produce two proto-oncoproteins, E6 and E7, which are key to their
disease5.
In recent years, many studies have focused on the diversity or heterogeneity of
various solid tumor types6,7. Gene expression network patterns are more complex in
cancer cells and tumors than in normal cells and organs8-11. In the study of Banerji et
al8, signaling entropy has been found to be significantly higher in cancer cells,
especially cancer stem cells, than in normal cells, thereby helping to distinguish them.
In this study, genes from CIN and normol samples were analyzed and screened
for differentially expression, from microarray datasets (GSE63514), using
bioinformatics. Functions and signal pathway enrichments of differentially expressed
genes (DEGs) were analyzed. Moreover, WGCNA explored the genes modules were
associated with CIN grade. Finally, identifying the biological function of the hub
genes and pathways, this study may offer a better insight of potential molecular
mechanisms to explore novel therapeutic strategies for CIN.
Methods
Data Procession
The gene expression profiles of GSE63514
(https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63514) submitted by den
Boon J et al. was downloaded from the Gene Expression Omnibus (GEO) database.
The GSE63514 was an expression profiling based on GPL570 platform (Affymetrix
Human Genome U133 Plus 2.0 Array) and contained 128 samples (24 normal
samples, 76 CIN samples and 28 cervical cancer samples). All samples were taken
from flash-frozen biopsy and cryosectioned. This study mainly focused on the
Screening for differentially expressed genes (DEGs) between CIN and normal
samples, therefore, the 28 cervical cancer samples were not included.
Prior to bioinformatics analysis, we first mapped the array probes to the
respective Gene ID by using the array annotations. If a probe matches multiple genes,
the probe will be deleted. If a gene matches multiple probes, we will calculate its
average value. A proper threshold was settled based on the amount of genes filtered
out. A workflow of this study was indicated in Fig. 1.
Analysis of microarray datasetsLimma package12 in R/Bioconductor software was used to compare CIN sample
with its normal sample. In addition, normalization and log2 conversion were carried
out for each GEO dataset to filter out the final DEGs. The filtration conditions are as
follows: |log2FC| ≥1 and adjust P‐value(AdjP-value) < 0.05.
Enrichment analysis of gene function and pathwaysThe ClusterProfiler is an ontology-based R package, it applies the biological
terms classification and enrichment analysis to the comparison of gene clusters to
better understand the higher order functions of biological system13. DAVID14
(http://david.abcc.ncifcrf.gov/), a common functional annotation tool of
bioinformatics resources was utilized to distinguish the biological attributes such as
biological process (BP), cellular component (CC) and molecular function (MF) of
important DEGs. Moreover, Kyoto Encyclopedia of Genes and Genomes (KEGG)15
(http://www.genome.jp/kegg/) pathway enrichment analysis was used to discern the
crucial pathways significantly. AdjP-value <0.05 was set as the cut-off criterion for
the significant enrichment.
Gene set enrichment analysis (GSEA)
The enrichment analyses were conducted to detect whether a series of priori
defined biological processes was enriched. The enriched pathways were arranged in
the order of their normalized enrichment scores (NESs), and FDR < 0.05 was chosen
as the cut-off criteria.
Construction of gene co-expression network Firstly, the quality of the DEGs of GSE63514 was checked through R package.
Then, the scale-free gene co-expression network was constructed through the
“WGCNA” package. Pearson’s correlation matrices were calculated and a weighted
adjacency matrix was constructed through a power function amn = |cmn|β (cmn
means Pearson's correlation between gene m and gene n; amn = adjacency between
gene m and gene n). Afterwards, the most appropriate soft - thresholding parameter
(β) was chosen to transform the adjacency matrix into a topological overlap matrix
(TOM), so that modules including similar genes were identified. Module eigengenes
(MEs) was defined as the most principal component and clarify all genes into a single
characteristic expression profile. The correlation between module exigencies (MEs)
was defined as the dominating component of gene module and clinical traits to
identify the correlative module. The module highly related to given clinical
characteristics was selected for further analysis.
Results
Identification of DEGsThe clinical parameters are shown in Tab S1. A total of 100 tissues were divided
into 76 CIN and 24 normal samples in GSE53757. After integrated analysis, 537
DEGs (|log2 FC| ≥ 1 and AdjP-value < 0.05) were screened in total, consisting 331
up-regulated genes and 206 down-regulated genes in CIN samples compared to
normal samples. Volcano plots (Fig. S1) were visualized to show the correlation
between DEGs.
GO, pathway enrichment analysis and GSEA of DEGsAll DEGs were uploaded to the online website DAVID to discern GO
classfication. The terms for each GO category were shown in Fig 2 and Fig S2. The
most DEGs were enriched in chromosomal region in CC (Fig. 2A), organelle fission
in BP (Fig. 2B) and ATPase activity in MF (Fig. 2C). The results of pathway
enrichment analysis were shown in Fig. 3.
To identify potential function of the hub genes, GSEA was conducted
respectively to search “All gene sets” enriched in the samples with the gene highly
expressed. The DEGs are mainly enriched in 4 functional sets: E2F-Targets, G2M-
Checkpoint, Mitotic-Spindle and Spermatogenesis (Fig. 4 and 5).
Co-expression network construction and key modules identification
The DEGs with similar expression patterns were grouped into modules via the
average linkage hierarchical clustering, calculated by “WGCNA” package. A total of
6 modules were identified (Fig. 6A). Subsequently, we calculated the correlation
between gene module and CIN grade. Grey module has the highest correlation
(Cor=0.78, P=5e-22; Fig. 6B). Therefore, 31 genes with the high connectivity in grey
module were taken as candidate hub genes for CIN high grade risk in the module
(weighted correlation coefficients >0.80, Tab S2). The analysis of protein interaction
network suggested that 14 of these genes might interact more closely in CIN
classification (Fig. 7 and Tab S3). Diagnostic analysis results showed that in addition
to CCDC7, the expression levels of the remaining 13 genes have a high diagnostic
value (AUC>0.8 and P<0.05; Fig. 8).
Conclusion
DEGs in CIN samples can be used to diagnose the progressing disease before it
leads to cancer. A combinatorial approach utilizing gene expression profile, PPI
network, hubs, modules and motifs was employed to identify potential prognostic
markers capable of distinguishing progressing cervical disease. A total of 537 DEGs
(331 up-regulated genes and 206 down-regulated genes) were identified in CIN
samples by gene expression profiling. These genes also deregulated a number of
biological pathways including: Cell cycle, DNA replication, Fanconi anemia
pathway, p53 signaling pathway, Homologous recombination, Oocyte meiosis,
Mismatch repair, Pyrimidine metabolism, Progesterone-mediated oocyte
maturation and Drug metabolism - other enzymes. In addition, 4 functional gene sets
were enriched: E2F-Targets, G2M-Checkpoint, Mitotic-Spindle and Spermatogenesis.
31 DEGs out of 537 were found as candidate hub genes for CIN high grade risk.
Among them, 13 genes might interact more closely in CIN classification and have a
high diagnostic value.
The most DEGs were enriched in chromosomal region in CC, organelle fission in
BP and ATPase activity in MF. Chromosomal instability is a crucial sign of
malignancy. Kudela E et al16 focused on chromosomal changes in the process of
cervical carcinogenesis and CIN. This study indicated the amplification of
chromosomal regions increases with the degree of dysplasia toward the invasive
disease. Increasing in the amplification of 3q26 is noticeable already at CIN
2 + lesions, and 5p15 amplification is shifted up toward CIN 3. At present, organelle
fission focuses on mitochondrial fission. Mitochondria are highly dynamic organelles,
and mitochondrial fission is a crucial step of apoptosis17. Mitochondrial fragmentation
is involved in the apoptotic process of cervical cancer17. However, whether this is
related to CIN has not yet been clarified. As a condition in which cells change their
chromosomal content at a high rate, chromosomal instability is a consistent feature of
the majority of solid tumours18, and chromosomal instability plays an important role
in cervical disease, and is significantly associated with patient outcome. KEGG
results showed that most DEGs enrichment pathways were related to cell cycle. Ki67
is a marker of cell proliferation, and the increased expression of Ki67 is correlated
with higher cervical CIN grade and is a highly sensitive biomarker for differentiating
between CIN1 and CIN2/319,20. In addition, high-risk HPV E7 oncoproteins bind and
inactivate pRb, leading to abnormal cell proliferation21.
Previous studies have focused on different types of solid tumors (cervical cancer),
such as genetic instability at gene locus 1p36, which may be a feature of cervical
cancer22; decreased expression of cytokeratin 7 may lead to poor prognosis of cervical
cancer23; HPV infection is an important potential biomarker of cervical cancer24;
neutrophil ratio and white matter cell count can be used as a prognostic factor for
recurrence of cervical cancer25. However, the continuous process from inflammation
to CIN to invasive cancer is often overlooked. Since CIN is the most important
precancerous lesion of cervical cancer, we focus more on the progress from normal
cervical epithelial tissue to CIN, which is closely related to the occurrence and
progression of cancer. Our results show that TOP2A and RFC4 play an important role
in this process. TOP2A is regarded as a biomarker for the improved diagnosis of
CIN26. Recent study has shown that TOP2A protein is expressed in cells with aberrant
S-phases and including HPV-transformed cells in association with elevated expression
of the HPV E6/E7 proteins27. It is worth noting that many studies have shown that
TOP2A expression level is significantly correlated with CIN grade26,28. In addition,
RFC4 accelerated G1 to S phase progression, and promoted the proliferation of
cervical cancer cells and the growth of cervical cancer29. However, our study screened
13 DEGs related to CIN grade. At present, there is not enough evidence to support the
association with CIN grade except TOP2A and RFC4. The research of gene
bioinformatics provides a possible molecular targeting mechanism for the treatment of
progressive cervical diseases. Therefore, subsequent studies will focus on validating
these DEGs.
The limitation of this study is that the data used in this study are from public
databases, so the quality cannot be evaluated. In addition, we did not further study the
differential expression of CIN to cervical cancer.
Sum up, this study used bioinformation-based methods to reveal DEGs related to
CIN. This study is a gene analysis with a large sample size that integrates microarray
data from GEO databases. Then the functional and pathway enrichment analysis of
DEGs was carried out. In addition, the WGCNA method was used to analyze the
clinical data graed related to CIN. Therefore, this research provided a new sight into
the understanding of molecular functions for CIN. However, further experiments are
required to confirm and validate these predicted results.
Acknowledgments
National Natural Science Foundation Youth Project (81701768)
Conflict of interest
The authors report no conflicts of interest.
Reference
1. Pinto AP, Crum CP. Natural history of cervical neoplasia: defining progression and its consequence. Clinical obstetrics and gynecology. 2000;43:352-2.
2. Insinga RP, Glass AG, Rush BB. Diagnoses and outcomes in cervical cancer screening: a population-based study. American journal of obstetrics and gynecology. 2004;191:105-13.
3. Elfgren K, Jacobs M, Walboomers JM, Meijer CJ, Dillner J. Rate of human papillomavirus clearance after treatment of cervical intraepithelial neoplasia. Obstetrics and gynecology. 2002;100:965-71.
4. Ostor AG. Natural history of cervical intraepithelial neoplasia: a critical review. International journal of gynecological pathology : official journal of the International Society of Gynecological Pathologists. 1993;12:186-92.
5. Walboomers JM, Jacobs MV, Manos MM, Bosch FX, Kummer JA, Shah KV, et al. Human papillomavirus is a necessary cause of invasive cervical cancer worldwide. The Journal of pathology. 1999;189:12-9.
6. Greaves M. Cancer stem cells: back to Darwin? Seminars in cancer biology. 2010;20:65-70.
7. Meacham CE, Morrison SJ. Tumour heterogeneity and cancer cell plasticity. Nature. Sep 19 2013;501(7467):328-337.
8. Banerji CR, Miranda-Saavedra D, Severini S, Widschwendter M, Enver T, Zhou JX, et al. Cellular network entropy as the energy potential in Waddington's differentiation landscape. Scientific reports. 2013;3:3039.
9. Banerji CR, Severini S, Caldas C, Teschendorff AE. Intra-tumour signalling entropy
determines clinical outcome in breast and lung cancer. PLoS computational biology. 2015;11:e1004115.
10. Goldberg AD, Allis CD, Bernstein E. Epigenetics: a landscape takes shape. Cell. Feb 23 2007;128(4):635-638.
11. Teschendorff AE, Sollich P, Kuehn R. Signalling entropy: A novel network-theoretical framework for systems analysis and interpretation of functional omic data. Methods (San Diego, Calif.).2014;67:282-93.
12. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic acids research. 2015;43:e47.
13. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics : a journal of integrative biology. 2012;16:284-7.
14. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature protocols. 2009;4:44-57.
15. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic acids research. 2000;28:27-30.
16. Kudela E, Visnovsky J, Balharek T, Farkasova A, Zubor P, Plank L, et al. Different amplification patterns of 3q26 and 5p15 regions in cervical intraepithelial neoplasia and cervical cancer. Annals of diagnostic pathology. 2018;35:16-20.
17. Kong B, Wang Q, Fung E, Xue K, Tsang BK. p53 is required for cisplatin-induced processing of the mitochondrial fusion protein L-Opa1 that is mediated by the mitochondrial metallopeptidase Oma1 in gynecologic cancers. The Journal of biological chemistry. 2014;289:27134-45.
18. Cimini D. Merotelic kinetochore orientation, aneuploidy, and cancer. Biochimica et biophysica acta. 2008;1786:32-40.
19. Sari Aslani F, Safaei A, Pourjabali M, Momtahan M. Evaluation of Ki67, p16 and CK17 Markers in Differentiating Cervical Intraepithelial Neoplasia and Benign Lesions. Iranian journal of medical sciences. 2013;38:15-21.
20. McCluggage WG. Premalignant lesions of the lower female genital tract: cervix, vagina and vulva. Pathology. 2013;45:214-28.
21. Xing Y, Wang C, Wu J. Expression of geminin, p16, and Ki67 in cervical intraepithelial neoplasm and normal tissues. Medicine. 2017;96:e7302.
22. Cortés-Gutiérrez EI, García-Vielma C, Dávila-Rodríguez MI, Sánchez-Dávila H, Fernández JL, Gosálvez J. 1p36 is a chromosomal site of genomic instability in cervical intraepithelial neoplasia. Biotech Histochem. 2019: 1-8. doi: 10.1080/10520295.2019.1652344. [Epub ahead of print]
23. Hashiguchi M, Masuda M, Kai K, Nakao Y, Kawaguchi A, Yokoyama M, et al. Decreased cytokeratin 7 expression correlates with the progression of cervical squamous cell carcinoma and poor patient outcomes. J Obstet Gynaecol Res. 2019.45:2228-36.
24. Abudula A, Rouzi N, Xu L, Yang Y, Hasimu A. Tissue-based metabolomics reveals potential biomarkers for cervical carcinoma and HPV infection. Bosn J Basic Med Sci. 2019. doi: 10.17305/bjbms.2019.4359. [Epub ahead of print]
25. Farzaneh F, Faghih N. Evaluation of Neutrophil-Lymphocyte Ratio as a Prognostic Factor in Cervical Intraepithelial Neoplasia Recurrence. 2019; 20: 2365-72.
26. Yang QC, Zhu Y, Liou HB, Zhang XJ, Shen Y, Ji XH. A cocktail of MCM2 and TOP2A, p16INK4a and Ki-67 as biomarkers for the improved diagnosis of cervical intraepithelial lesion. Polish journal of pathology : official journal of the Polish Society of Pathologists. 2013;64:21-7.
27. Sahasrabuddhe VV, Luhn P, Wentzensen N. Human papillomavirus and cervical cancer: biomarkers for improved prevention efforts. Future microbiology. Sep 2011;6(9):1083-98.
28. David O, Cabay RJ, Pasha S, Dietrich R, Leach L, Guo M, et al. The role of deeper levels and ancillary studies (p16(Ink4a) and ProExC) in reducing the discordance rate of Papanicolaou findings of high-grade squamous intraepithelial lesion and follow-up cervical biopsies. Cancer. 2009;117:157-66.
29. Liu D, Zhang XX, Xi BX, Wan DY, Li L, Zhou J, et al. Sine oculis homeobox homolog 1 promotes DNA replication and cell proliferation in cervical cancer. International journal of oncology. 2014;45:1232-40.
Figure Legends
Figure 1 The flowchart of the integrated analysis and functional validation.
Figure 2 GO analysis and the significantly terms of differentially expressed genes
(DEGs) in CIN.
Figure 3 Significantly signaling pathway analysis of differentially expressed genes
(DEGs) related to CIN performing with KEGG pathway website and software R. (A)
The network of pathways and genes, blue represents pathways, green is the down-
regulated gene, red is the up-regulated gene. (B) Pathway enrichment analysis based
on differentially expressed genes (DEGs). GeneRatio = count/setsize.
Figure 4 GESA Constructs function set and genes network. Yellow represents
functional sets, the number on the outer edge of the network represents entrezID.
Figure 5 Gene set enrichment analysis (GSEA). (A) E2F-Targets (B) G2M-
Checkpoint (C) Mitotic-Spindle (D) Spermatogenesis
Figure 6 Results of the co-expression network.(A) Dendrogram of the differentially
expressed genes (DEGs) of GEO datasets clustered. (B) The correlation between the
module eigengenes and the CIN grade.
Figure 7 Protein–protein interaction (PPI) network of differentially expressed genes
(DEGs)
Figure 8 ROC diagnosis analysis of the differentially expressed genes (DEGs) for
CIN
Figure S1 Volcano plot of differentially expressed genes (DEGs). Red, green and
black denote up-regulated, down-regulated and non-significantly expressed genes,
respectively.
Figure S2 Gene and functional network constructed by GO enrichment analysis.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure S1
Figure S2
Table S1 Clinical evaluation of cervical specimens
Normal (n=24) CIN1 (n=14) CIN2 (n=22) CIN3 (n=40)
Median age 28.5 27 24 30.5
HPV genotype
HPV 16 16 (16.2) 12 (12.1) 20 (20.2) 32 (32.3)
HPV 18 1 (25.0) 0 0 2 (50.0)
other high-risk HPV 5 (25.0) 2 (10.0) 2 (10.0) 6 (25.0)
other HPV 2 (40.0) 0 0 1 (20.0)
Study cytology
HSIL+ 16 (14.3) 13 (11.6) 22 (19.6) 37 (32.1)
LSIL 3 (50.0) 1 (16.7) 0 2 (33.3)
ASC-US 2 (40.0) 0 0 2 (40.0)
Normal 3 (100.0) 0 0 0
Missing 0 0 0 0
Colposcopy impression
Cancer 0 0 1 (3.2) 8 (25.8)
High grade 7 (29.2) 0 7 (29.2) 9 (37.5 )
Low grade 13 (23.6) 11 (20.0) 14 (25.5) 15 (25.5)
Metaplasia & Equivocal 3 (25.0) 2 (16.7) 0 6 (50.0)
Normal 1 (100.0) 0 0 0
Missing 0 1 (20.0) 0 3 (60.0)
HSIL = high grade squamous intraepithelial lesion; LSIL = low grade squamous intraepithelial lesion; ASC-US = atypical
squamous cells of undetermined significance
Roemeling C A V , Radisky D C , Marlow L A , et al. Neuronal pentraxin 2 supports clear cell renal cell carcinoma by activating the AMPA-selective glutamate receptor-4[J]. Cancer Research, 2014, 74(17):4796-810.
Table S2 31 genes with the high connectivity in grey module were taken as candidate hub genes for CIN high grade risk
GenecolorNEW
kMEblue kMEbrown kMEgreen kMEgrey kMEturquoise kMEyellow
SYCP2 grey -0.31933 0.813579 0.748354 0.902993 0.792075 0.674302TCAM1P grey -0.56294 0.687622 0.70513 0.890289 0.765379 0.66542DCLRE1B grey -0.57014 0.753084 0.756259 0.87688 0.789789 0.769592CDC7 grey -0.26862 0.860968 0.809275 0.872573 0.84401 0.704296TOPBP1 grey -0.30903 0.832814 0.742776 0.863863 0.871869 0.642696ACTL6A grey -0.60488 0.653026 0.693032 0.86191 0.78691 0.641038CKLF grey -0.48131 0.737526 0.716072 0.86049 0.822085 0.667799C18orf54 grey -0.5212 0.699997 0.663395 0.859988 0.783101 0.671595CENPQ grey -0.62882 0.688158 0.671624 0.858939 0.817186 0.651964ATAD2 grey -0.23736 0.772421 0.788452 0.8506 0.79 0.547155PRIM2 grey -0.34016 0.786596 0.746414 0.848956 0.796391 0.641971RAD54L grey -0.30243 0.886984 0.84222 0.842679 0.750092 0.825727EXO1 grey -0.63054 0.740341 0.723948 0.840726 0.760152 0.790412KIAA0101 grey -0.51815 0.739371 0.812399 0.83823 0.820286 0.697761HENMT1 grey -0.14286 0.777 0.733399 0.837402 0.73288 0.58583TOP2A grey -0.47398 0.750826 0.848469 0.834757 0.841481 0.689668NCAPH grey -0.40566 0.811809 0.873767 0.826243 0.818959 0.701439ZWILCH grey -0.31397 0.783499 0.825587 0.825152 0.823406 0.622159LINC00925 grey -0.42662 0.649481 0.661172 0.823625 0.630425 0.627539CENPO grey -0.39311 0.832394 0.781135 0.820124 0.715549 0.843433ZFR2 grey -0.38531 0.633341 0.665261 0.818176 0.741588 0.468172RIBC2 grey -0.71674 0.551642 0.599487 0.816547 0.654397 0.6715SYNGR3 grey -0.53921 0.535436 0.609433 0.814336 0.677193 0.509219ASPM grey -0.46706 0.67391 0.728141 0.814137 0.717365 0.671502LOC375196 grey -0.2865 0.664535 0.677685 0.814128 0.662071 0.560577RFC4 grey -0.09753 0.868167 0.855795 0.813682 0.8086 0.610759DHFR grey -0.18802 0.815199 0.854834 0.810412 0.830256 0.555871CDKN2A grey -0.32164 0.646347 0.717055 0.806906 0.749467 0.439761MLF1 grey -0.25365 0.691977 0.622876 0.804038 0.674678 0.53789CDC45 grey -0.21746 0.869529 0.888799 0.803326 0.789017 0.706893
Table S3 Protein to Protein Interaction with high interaction score
node1 node2neighborhood_on
_chromosomegene_fusion
phylogenetic_
cooccurrencehomology coexpression
experimentally_dete
rmined_interactiondatabase_annotated
CENPO CENPQ 0 0 0 0 0.107 0.917 0.9
CDC45 CDC7 0 0 0 0 0.431 0.264 0.9
PRIM2 CDC45 0 0 0 0 0.863 0 0.9
PRIM2 RFC4 0 0 0 0 0.874 0.157 0.9
RFC4 EXO1 0 0 0 0 0.74 0.186 0.9
RFC4 TOPBP1 0 0 0 0 0.589 0.18 0.9
CDC45 TOPBP1 0 0 0 0 0.26 0.829
EXO1 TOPBP1 0 0 0 0 0.271 0 0.9
RFC4 KIAA0101 0 0 0 0 0.757 0 0.9
TOP2A ASPM 0 0 0 0 0.967 0
TOP2A KIAA0101 0 0 0 0 0.949 0
TOP2A NCAPH 0 0 0 0 0.775 0.572
CENPO ZWILCH 0 0 0 0 0.098 0 0.9
CENPQ ZWILCH 0 0 0 0 0.194 0 0.9
CDC45 EXO1 0 0 0 0 0.902 0
CDC45 RFC4 0 0 0 0 0.852 0.064
PRIM2 CDC7 0 0 0 0 0.213 0 0.9
RAD54L EXO1 0 0 0 0 0.75 0.147
ASPM EXO1 0 0 0 0 0.918 0.104
ASPM NCAPH 0 0 0 0 0.903 0
ASPM KIAA0101 0 0 0 0 0.903 0