Predicting gene co-expression from CIS-regulatory regions

BioMed Central

BMC Systems Biology

Open Access

Poster presentation

Jamie Allen*¹,², Mark Shirley¹, Steve Rushton¹, Ajay Kohli¹ and Ehsan Mesbahi²

Address: ¹Institute for Research on Environment and Sustainability (IRES), Devonshire Building, Newcastle University, Newcastle, NE1 7RU, UK and ²School of Marine Science and Technology, Armstrong Building, Newcastle University, Newcastle, NE1 7RU, UK

Email: Jamie Allen* - [email protected]

* Corresponding author

Background

Single nucleotide polymorphisms (SNPs) affecting phenotype are frequently found in the CIS-regulatory factors of a gene. However, current prediction tools ignore positional data and thus overestimate the co-expression of genes. To reduce this error, this work intends to produce a method of correlating quantitative trait loci (QTLs) simultaneously to their CIS element and the expressed phenotype. To decipher the cryptic code in this non-coding DNA, autoassociative neural network tools and multidimensional self-organising maps (Kohonen maps) are being used.
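To illustrate what ignoring positional data can cost, the sketch below compares a position-blind motif count with a position-aware encoding. It is only an illustration: the function names, the 100-base bin size and the motif labels `M1` and `M2` are assumptions and are not taken from the tool described in this abstract.

```python
def count_features(hits, motif_names):
    """Position-blind baseline: one count per motif."""
    index = {name: i for i, name in enumerate(motif_names)}
    counts = [0] * len(motif_names)
    for _position, name in hits:
        if name in index:
            counts[index[name]] += 1
    return counts


def positional_features(hits, motif_names, region_length=1000, bin_size=100):
    """Count motif hits per (motif, position bin).

    hits: iterable of (position, motif_name) pairs, with position 0-based
    inside the upstream region. The bin size is an assumed parameter.
    Returns a flat list ordered by motif, then by bin.
    """
    n_bins = -(-region_length // bin_size)  # ceiling division
    index = {name: i for i, name in enumerate(motif_names)}
    features = [0] * (len(motif_names) * n_bins)
    for position, name in hits:
        if name not in index or not 0 <= position < region_length:
            continue  # skip unknown motifs and out-of-range positions
        features[index[name] * n_bins + position // bin_size] += 1
    return features


# Two hypothetical promoters with the same motifs in swapped positions.
names = ["M1", "M2"]
promoter_a = [(10, "M1"), (950, "M2")]
promoter_b = [(950, "M1"), (10, "M2")]

same_counts = count_features(promoter_a, names) == count_features(promoter_b, names)
same_positions = positional_features(promoter_a, names) == positional_features(promoter_b, names)
```

Here `same_counts` is true while `same_positions` is false, so a count-only representation would treat the two promoters as identical even though their motif layouts differ.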

Preliminary work produced a tool using a two-dimensional Kohonen map, which has given some promising results. This work aims to further improve the prediction rate by increasing the number of dimensions used in the self-organising map so that more positional data are incorporated. Once an effective prediction tool has been developed, it will be trained using existing genomic data from Arabidopsis pollen and its effectiveness assessed. Once trained, the prediction tool can then be used on novel data, and the predictions, if not available in published literature, can be tested in vitro using Transcript and Locus profiling techniques within the university.
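The abstract does not describe how the Kohonen map is implemented. As a rough sketch only, a generic self-organising map whose grid can have any number of dimensions might look like the following. The function names, the linear decay schedules and the default parameters are all assumptions made for illustration.

```python
import itertools
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid coordinate whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], vector)),
    )


def train_som(data, grid_shape, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a generic self-organising map on `data`.

    data: list of equal-length feature vectors (lists of floats).
    grid_shape: e.g. (4, 4) for a 2-D map or (3, 3, 3) for a 3-D map.
    Returns a dict mapping each grid coordinate to its weight vector.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = list(itertools.product(*(range(n) for n in grid_shape)))
    weights = {node: [rng.random() for _ in range(dim)] for node in nodes}
    if sigma0 is None:
        sigma0 = max(grid_shape) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for vector in rng.sample(data, len(data)):  # shuffled pass over the data
            # Learning rate and neighbourhood radius shrink linearly (assumed schedule).
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            bmu = best_matching_unit(weights, vector)
            for node in nodes:
                # Squared distance on the grid, not in feature space.
                grid_d2 = sum((a - b) ** 2 for a, b in zip(node, bmu))
                influence = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                w = weights[node]
                for i in range(dim):
                    w[i] += lr * influence * (vector[i] - w[i])
            step += 1
    return weights


# Usage: a 2-D map and a 3-D map trained on the same toy vectors.
toy = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
som_2d = train_som(toy, (3, 3), epochs=20)
som_3d = train_som(toy, (2, 2, 2), epochs=20)
```

Moving from a two-dimensional to a higher-dimensional map changes only `grid_shape`, because the neighbourhood is computed from grid coordinates of any length.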

The ultimate aim is to use the prediction tool on regions of the rice genome mapped for high-density heterosis QTLs to predict co-regulated genes. Then, using in-house genomics techniques, meaningful biological information may be assigned to the co-expression.

Methods

Upstream sequences of 1000 bases for redox genes and isogenes in bacteria and plants were obtained from the NCBI (http://www.ncbi.nlm.nih.gov/) and TIGR (http://www.tigr.org/) databases.
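The abstract does not say how the upstream regions were cut out. When working from a downloaded genome sequence and gene coordinates, one possible approach is sketched below. The function names and the 0-based, half-open coordinate convention are assumptions for illustration.

```python
# Hypothetical helpers; the coordinate convention is an assumption.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, gene_start, gene_end, strand, length=1000):
    """Return up to `length` bases immediately upstream of a gene.

    Coordinates are 0-based and half-open, [gene_start, gene_end), on the
    forward strand. For a '-' strand gene the upstream region lies after
    gene_end and is reverse-complemented so that it reads 5' to 3' towards
    the gene. Regions are truncated at the ends of the sequence.
    """
    if strand == "+":
        begin = max(0, gene_start - length)
        return genome[begin:gene_start]
    if strand == "-":
        stop = min(len(genome), gene_end + length)
        return reverse_complement(genome[gene_end:stop])
    raise ValueError("strand must be '+' or '-'")


# Toy example on a 12-base sequence with a 3-base upstream window.
toy_genome = "AAACCCGGGTTT"
forward = upstream_region(toy_genome, 6, 9, "+", length=3)   # "CCC"
reverse = upstream_region(toy_genome, 3, 6, "-", length=3)   # "CCC"
```

The strand handling matters because a gene on the reverse strand has its regulatory region on the opposite side in forward-strand coordinates.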

Motifs in bacteria were scanned using the Prokaryotic Database of Gene Regulation's virtual footprint tool (http://prodoric.tu-bs.de/). Output was filtered using customised Perl scripts. For rice and Arabidopsis, motif lists were obtained from the Plant CIS-acting regulatory DNA elements (PLACE) database [1], and the upstream sequences are scanned for these motifs using a custom-written Perl script.
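The scanning scripts themselves are written in Perl and are not shown here. As an illustration of the general idea only, the Python sketch below scans a sequence for consensus motifs written with IUPAC ambiguity codes and records the position of every hit. The function names and the two example motifs are assumptions, not entries copied from PLACE, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC consensus into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead keeps each match zero-width so overlapping sites are reported.
    return re.compile("(?=(" + body + "))")


def scan_motifs(sequence, motifs):
    """Return sorted (position, name, matched_text) hits on the forward strand.

    sequence: DNA string; motifs: dict of name -> IUPAC consensus.
    Positions are 0-based offsets from the start of `sequence`.
    """
    sequence = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        for match in motif_to_regex(motif).finditer(sequence):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


# Illustrative motifs only (hypothetical names).
example_motifs = {"EBOX": "CANNTG", "ACGT_CORE": "ACGT"}
example_hits = scan_motifs("ttCACGTGaa", example_motifs)
```

Keeping the position of each hit, and not just a count per motif, is what allows positional data to be passed on to later analysis.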

Results

Bacterial data have been initially analysed using multivariate techniques and indicate greater variation between isogenes than between genes, or even between isogenes of the same gene in different species. The next step is to develop the neural network tools to refine prediction.

References

1. Higo K, Ugawa Y, Iwamoto M, Korenaga T: Plant cis-acting regulatory DNA elements (PLACE) database. Nucleic Acids Res 1999, 27:297-300.

from BioSysBio 2007: Systems Biology, Bioinformatics and Synthetic Biology
Manchester, UK. 11–13 January 2007

Published: 8 May 2007

BMC Systems Biology 2007, 1(Suppl 1):P56 doi:10.1186/1752-0509-1-S1-P56

Supplement: BioSysBio 2007: Systems Biology, Bioinformatics, Synthetic Biology. Editors: John Cumbers, Xu Gu, Jong Sze Wong. Meeting abstracts: a single PDF containing all abstracts in this Supplement is available at www.biomedcentral.com/content/files/pdf/1752-0509-1-S1-full.pdf. Supplement information: http://www.biomedcentral.com/content/pdf/1752-0509-1-S1-info.pdf

This abstract is available from: http://www.biomedcentral.com/1752-0509/1?issue=S1

© 2007 Allen et al; licensee BioMed Central Ltd.