Bioprospecting of Genes and Allele Mining: Approaches and Opportunities
T. Mohapatra
Central Rice Research Institute, Cuttack, Odisha
Prospecting the Biological Resources: Bio-prospecting
• It is a systematic search for whole organisms, genes and natural compounds in the living world for useful purposes.
• It is nothing new. Informal bio-prospecting began when prehistoric people noticed that one plant root tasted better than another, or some plants could be used as medicines to treat various human diseases.
Bio-prospecting
• Scientific bio-prospecting started later to identify the active ingredients present in different organisms and isolate or replicate them for large-scale use.
• Alexander Fleming’s discovery of the antibiotic penicillin is an example of bio-prospecting that happened accidentally.
Figure: Alexander Fleming, Scottish-born microbiologist, and Penicillium notatum
Bio-prospecting of Genes
• Genes are the functional hereditary units in our chromosomes.
• Genes control the expression of different characteristics, for example dwarf height, early flowering, disease resistance, heat tolerance and high yield.
• Genes for dwarf plant height in rice and wheat were also associated with higher crop yield, which ushered in the “Green Revolution”.
• Bio-prospecting of genes aims at the discovery of novel genes in biological resources.
Bio-prospecting of Genes: Bt Example
• The identification of the soil bacterium Bacillus thuringiensis (Bt) as having insecticidal properties, and the subsequent isolation of the responsible gene from the bacterium, is a prime example of bio-prospecting for extremely useful genes.
• Deployment of the Bt gene, which works effectively against the cotton bollworm, has made a revolution in cotton production possible in the country.
• The technology is highly remunerative to farmers and environment friendly.
Genes prospected from diverse biological resources for tolerance to abiotic stresses
Stress | Gene/Enzyme | Source
Osmotic | Δ1-pyrroline-5-carboxylate synthetase (P5CS) | Mothbean (Vigna aconitifolia)
Drought and salinity | Mannitol-1-phosphate dehydrogenase (mtlD) | E. coli
Cold and salt | Choline oxidase (codA) | Arthrobacter globiformis
Salt | Choline dehydrogenase (betA) | E. coli
Cold | Omega-3 fatty acid desaturase (fad7) | Arabidopsis
Drought | Trehalose-6-phosphate synthase | Yeast
Drought | Levansucrase (SacB) | Bacillus subtilis
Tools of biotechnology have rendered the traditional barriers to
gene flow less relevant
Bio-prospecting of Genes: The Approaches
• Using heterologous probes or primers
• Using purified protein
– Antibody to screen cDNA library
– Degenerate oligos to screen library
• Transcriptome profiling
• Insertional mutagenesis
• Map-based cloning
• Integrated approach
Transcriptome Profiling
• Differential Display RT-PCR
• cDNA AFLP
• Representational Difference Analysis
• Suppression Subtractive Hybridization
• Microarrays
• Serial Analysis of Gene Expression
• Massively Parallel Sequencing
Suppression Subtractive Hybridization
• Selectively amplifies target cDNA fragments (differentially expressed) and simultaneously suppresses non-target DNA amplification
• Based on the suppression PCR effect
• Normalization is included
• No need to physically separate single stranded and double stranded DNAs
SSH
• Can detect as little as 0.001% target
• Critical factor is relative concentration of target in tester and driver populations
• Effective enrichment when:
– Target present at ≥ 0.01%
– Concentration ratio ≥ 5-fold
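The two enrichment conditions listed above can be written as a simple check. This is a minimal sketch: the function name and the example values are hypothetical, and only the two thresholds (target at ≥ 0.01% and a ≥ 5-fold concentration ratio) come from the slide.

```python
def ssh_enrichment_expected(target_fraction_pct, tester_to_driver_ratio,
                            min_fraction_pct=0.01, min_ratio=5.0):
    """Return True when both rule-of-thumb thresholds from the slide are met.

    target_fraction_pct: abundance of the target cDNA in the tester, in percent.
    tester_to_driver_ratio: fold difference in target concentration
                            between the tester and driver populations.
    """
    return (target_fraction_pct >= min_fraction_pct
            and tester_to_driver_ratio >= min_ratio)


# Hypothetical transcripts: (abundance in %, tester/driver fold difference)
examples = [(0.05, 10.0), (0.001, 10.0), (0.05, 2.0)]
for fraction, ratio in examples:
    print(fraction, ratio, ssh_enrichment_expected(fraction, ratio))
```

A transcript at 0.001% may still be detectable according to the slide, but it falls below the abundance at which effective enrichment is expected, so the check returns False for it.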
Identification of a testis specific gene
Use of DNA Microarray
• Array design: choice of sequences to be used as probes
• Analysis of scanned images
– Spot detection, normalization, quantitation
• Primary analysis of hybridization data
– Basic statistics, reproducibility, data scattering, etc.
• Comparison of multiple samples
– Clustering, classification …
• Sample tracking and databasing of results
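The normalization and sample-comparison steps above can be illustrated for a two-sample comparative hybridization. This is a minimal sketch that assumes background-corrected spot intensities and uses global median centring, which is one of several normalization choices. The intensities and the 2-fold threshold are hypothetical.

```python
import math
from statistics import median


def normalized_log_ratios(test_intensities, reference_intensities):
    """Log2(test/reference) per spot, centred so that the median ratio is 0."""
    raw = [math.log2(t / r)
           for t, r in zip(test_intensities, reference_intensities)]
    centre = median(raw)
    return [value - centre for value in raw]


def differentially_expressed(log_ratios, threshold=1.0):
    """Indices of spots whose |log2 ratio| meets the threshold (2-fold by default)."""
    return [i for i, value in enumerate(log_ratios) if abs(value) >= threshold]


# Hypothetical intensities for five spots; the test channel is uniformly
# twice as bright, which median centring removes as a dye/labelling bias.
test = [200.0, 400.0, 1600.0, 100.0, 800.0]
reference = [100.0, 200.0, 100.0, 50.0, 400.0]

ratios = normalized_log_ratios(test, reference)
print(ratios)
print(differentially_expressed(ratios))
```

Only the third spot remains different after centring. Real analyses add replicate spots and statistical testing before clustering or classification.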
Comparative hybridization Bioinformatics
Transcriptome Profiling Using the Illumina Solexa Genome Analyzer NGS Platform
• Sequencing by synthesis with a polymerase; clusters generated by bridge PCR
• Generation of 75 to 100 bp short sequence reads
Metzker et al. (2009) Nat. Rev.
Prospecting Genes Using Insertion Tags
Figure: gene X (P, coding region), shown intact and with an insert, on a chromosome with other genes.
1. Identify and clone the fragment carrying the insert, e.g.:
– Make a library and probe it with the insertion sequences; or
– Inverse PCR: digest genomic DNA with a restriction endonuclease, ligate the DNA into circles, amplify by PCR using divergent insert primers, then clone and sequence.
2. Use sequences from gene X to identify and clone the wild-type allele.
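The inverse PCR route to the sequence flanking an insert can be mimicked on strings. This is a minimal in-silico sketch, not a laboratory protocol. The sequences are hypothetical, cutting is simplified to the start of each recognition site, and ligation and primer binding are not modelled.

```python
def digest(sequence, site):
    """Split a linear sequence before each occurrence of the recognition site.

    Simplification: the real cut position within the site and the
    resulting sticky ends are ignored.
    """
    fragments, start = [], 0
    pos = sequence.find(site, 1)
    while pos != -1:
        fragments.append(sequence[start:pos])
        start = pos
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


def inverse_pcr_flanks(genomic, insert, site):
    """Return (left_flank, right_flank) of the insert within its restriction fragment."""
    for fragment in digest(genomic, site):
        idx = fragment.find(insert)
        if idx != -1:
            return fragment[:idx], fragment[idx + len(insert):]
    # Happens when the enzyme also cuts inside the insert.
    raise ValueError("insert not found intact in any restriction fragment")


site = "GAATTC"                      # EcoRI recognition sequence
insert = "TTTTAAAACCCC"              # hypothetical insertion tag
genomic = "TTGACC" + site + "ACGTAC" + insert + "GGCATT" + site + "CCATGA"

left, right = inverse_pcr_flanks(genomic, insert, site)
# After circularization, divergent primers read out of the insert, across
# the right flank, over the ligation junction and into the left flank.
product = right + left
print(left, right, product)
```

The recovered flanks are the gene X sequences that would then be used to find the wild-type allele.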
Cloning of GW2 QTL for Grain Weight
Cross: FAZ1 (indica) × WY3 (japonica)
• Several QTL were mapped for grain weight or size, including width, length, thickness and 1,000-grain weight.
• These included a major QTL for grain width, GW2, on chromosome 2, with the WY3 allele at GW2 contributing to increased grain width.
Figure labels: test weights of 41.9 ± 1.3 g and 17.9 ± 0.7 g; F1 and F2 generations; GW2 with markers RM5897 and RM2634.
Mining is the extraction of valuable minerals or other geological materials from the earth.
Mining in a wider sense comprises extraction of any non-renewable resource (e.g., petroleum, natural gas, or even water).
What is an Allele?
• The word allele comes from the Greek allelos, meaning each other.
• It is a short form of allelomorph ('other form'), which was used to describe variant forms of a gene detected as different phenotypes.
• Examples are the gene locus for ABO blood type proteins in humans and eye colour in the fruit fly.
• Alleles are now understood to be alternative DNA sequences at the same physical locus, which may or may not result in different phenotypic traits.
Microsatellite Alleles
Genomic DNA from three varieties, PCR-amplified at a GA-repeat locus (lanes 1, 2, 3):
• Variety 1: (GA)12, 12 repeats
• Variety 2: (GA)16, 16 repeats
• Variety 3: (GA)8, 8 repeats
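The three microsatellite alleles differ only in the number of GA units, so PCR products of different lengths distinguish them. The sketch below counts repeat units and reports amplicon length. The repeat numbers (12, 16, 8) come from the slide, while the flanking sequences are hypothetical.

```python
import re


def longest_repeat_count(sequence, motif="GA"):
    """Number of motif copies in the longest uninterrupted run of the motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical 6-bp flanks standing in for the primer-binding regions.
LEFT_FLANK, RIGHT_FLANK = "ACCTTC", "TTCCAT"

amplicons = {
    "Variety 1": LEFT_FLANK + "GA" * 12 + RIGHT_FLANK,
    "Variety 2": LEFT_FLANK + "GA" * 16 + RIGHT_FLANK,
    "Variety 3": LEFT_FLANK + "GA" * 8 + RIGHT_FLANK,
}

for variety, sequence in amplicons.items():
    print(variety, longest_repeat_count(sequence), len(sequence))
```

Each extra GA unit adds 2 bp to the product, which is the size difference resolved on a gel or capillary analyzer.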
Alleles at the Sequence Level
Figure: a known gene that is being used in breeding and its alternate forms (Allele 1 to Allele 7), which differ at individual nucleotide positions. Different alleles are present in different germplasm lines.
Prospecting New Alleles: Allele Mining
Figure: yield under stress for lines carrying Allele 1, Allele 2, Allele 7, Allele 6 and Allele 3 of a gene.
Allele 3, present in one of the germplasm lines, is the best; it can be identified through association mapping.
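The idea of picking the best allele from phenotype data can be sketched by grouping lines by the allele they carry and comparing mean yield under stress. This is only the grouping step. Real association mapping also tests significance and corrects for population structure and kinship. All yield values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_phenotype_by_allele(records):
    """Average the phenotype over all germplasm lines carrying each allele."""
    groups = defaultdict(list)
    for allele, value in records:
        groups[allele].append(value)
    return {allele: mean(values) for allele, values in groups.items()}


# Hypothetical (allele carried, yield under stress in t/ha) per germplasm line.
records = [
    ("Allele 1", 2.1), ("Allele 1", 2.3),
    ("Allele 2", 2.6), ("Allele 2", 2.4),
    ("Allele 3", 3.9), ("Allele 3", 4.1),
    ("Allele 6", 2.0), ("Allele 6", 2.2),
    ("Allele 7", 2.8), ("Allele 7", 3.0),
]

means = mean_phenotype_by_allele(records)
best_allele = max(means, key=means.get)
print(best_allele)
```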
Figure: a resistant landrace with a new allele for powdery mildew resistance; the same landrace with the new allele silenced shows powdery mildew symptoms.
• A core collection of 1,320 wheat landraces was screened for powdery mildew reaction
• 211 landraces showed complete or partial resistance
• 111 landraces had the Pm3 gene-specific marker
• Sequencing, sequence analysis and functional validation by gene silencing were done to identify new alleles
• 7 new functionally active alleles of the Pm3 gene were identified
TILLING (Targeting Induced Local Lesions In Genomes): An Approach for Detecting New Alleles
LI-COR Model 4300 DNA Analyzer for SNP detection
Allele Mining Using Next Generation Sequencing (NGS) Platforms
• Roche 454 Pyrosequencer
• Illumina Solexa Genome Analyzer
• ABI SOLiD
Crop species | SNPs mined | NGS platform | Ref
Rice (Nipponbare vs. 93-11) | 1,226,791 | Solexa | Huang et al. (2009)
Rice (Nipponbare vs. Koshihikari) | 67,051 | Solexa | Yamamoto et al. (2010)
Arabidopsis | 823,325 | Solexa | Ossowski et al. (2008)
Chickpea | ~1000 | 454, Solexa and ABI SOLiD | Unpublished
Eucalyptus | 23,742 | 454 | Novaes et al. (2008)
Wheat | ~1000 | 454 | Cronn et al. (2008)
• Whole Genome Resequencing
• Targeted Resequencing
• Transcriptome Sequencing
• De novo Sequencing
These allow SNP mining among even thousands of related genotypes, and within and between larger germplasm pools, more efficiently and economically and in greater depth.
Whole Genome Resequencing Using ABI SOLiD
1. Isolated high-quality, column-purified genomic DNA (total quantity 30-40 µg) from three rice varieties
2. Mate-paired library preparation
3. Emulsion PCR and bead enrichment
4. Bead deposition
5. Sequencing by ligation, with dual interrogation of each base
Characteristics | Basmati370 | IRBB60 | Taraori
Total mappable reads | 63,867,955 | 58,993,349 | 69,938,730
% mapping | 49.78 | 53.27 | 56.0
Sequence coverage (x) | 7.43 | 6.86 | 8.13
Bioscope v1.2 was used for mapping, pairing and SNP detection. Reference genome: Rice Pseudomolecule 6.1.
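Sequence coverage follows from the number of mapped reads, the read length and the genome size. The slide states neither read length nor genome size, so the 50 bp and 430 Mb used below are assumptions. They were chosen because together they reproduce the reported coverage figures. The read counts are those in the table.

```python
READ_LENGTH_BP = 50            # assumption: not stated on the slide
GENOME_SIZE_BP = 430_000_000   # assumption: approximate rice genome size


def sequence_coverage(mappable_reads,
                      read_length_bp=READ_LENGTH_BP,
                      genome_size_bp=GENOME_SIZE_BP):
    """Average fold coverage = total mapped bases / genome size."""
    return mappable_reads * read_length_bp / genome_size_bp


mappable_reads = {
    "Basmati370": 63_867_955,
    "IRBB60": 58_993_349,
    "Taraori": 69_938_730,
}

coverage = {variety: round(sequence_coverage(reads), 2)
            for variety, reads in mappable_reads.items()}
print(coverage)
```

Under these assumptions the results round to 7.43x, 6.86x and 8.13x, matching the table.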
Population Genetic Structure Among Domesticated Rice Genotypes
Figure groups: Aromatics, indica, japonica, Aus, Wild
Requirements for Large Scale Bio-prospecting and Allele Mining
• Ensuring the richness of bio-diversity
• Consortium mode of operation
• Pooling of expertise available in different institutions
• Detailed characterisation of the germplasm lines
• Use of modern tools of genomics
• Adequate funding support and infrastructure
BIOPROSPECTING OF GENES AND ALLELE MINING FOR ABIOTIC STRESS TOLERANCE
Consortium Leader: NRC on Plant Biotechnology, Pusa Campus, New Delhi-110012
Consortium Partners: 36 partner institutes across Crop Sciences, Horticultural Sciences, Animal Sciences, Fisheries & Microbial Sciences
RICE GROWING AREAS IN INDIA
• Rice accounts for about 42% of total food grain production and >55% of the diet in India.
• Rice is considered to have originated in the Himalayan foot-hills.
• It is cultivated below sea level in the Kuttanad region of Kerala state as well as at an altitude of 2,000 metres above mean sea level in the hills of Jammu & Kashmir, Uttaranchal and the North-Eastern States.
Bio-prospecting of Genes and Allele Mining: Enormous Opportunities
• The microbial diversity thriving in the extreme heat of deserts, hot springs and even volcanic eruptions is especially striking
• These microbes are a source of genes that can help crop plants overcome a range of abiotic stresses
Bio-prospecting of Genes and Allele Mining: Enormous Opportunities
Animals such as the camel and goat possess unique adaptive mechanisms for abiotic stress tolerance that need to be understood and exploited.
Nubra valley in J&K: -30°C to +15°C; Rajasthan: 5°C to 50°C
Goat Biodiversity
• Cold region (temperature range 20°C to -20°C): Gaddi, Chegu, Changthangi
• Hot humid regions: Ganjam, Black Bengal, Malabari
• Hot arid region: Marwari, Surti, Sirohi
Bio-prospecting of Genes and Allele Mining: Catfish, Trout, Shrimp and Prawn
Aquatic species such as trout, catfish and shrimp/prawn possess unique adaptive mechanisms for tolerance to cold, anoxia and salinity.
Traits, Source Organisms and Institutions: Sharing of Responsibilities
Trait | Source species | Participating institutes
Moisture stress | Microbes | NBAIM, IARI, NRCG
Moisture stress | Rice | NRCPB, NBPGR, IARI, DRR, CRRI, IGKVV, DUSC, IIT-Kharagpur, VPKAS
Moisture stress | Maize | IARI, VPKAS, ICAR Research Complex for NEH Region, CCSHAU, GBPUAT
Moisture stress | Sorghum | NRCS, MPKV-Rahuri
Moisture stress | Lathyrus | NBPGR, IIT-Kharagpur
Moisture stress | Cucumis | NBPGR, IIVR, IIHR, RAU-Bikaner
Moisture stress | Ziziphus | CIAH, NRCPB, NBPGR
Salinity/sodicity and acidity | Microbes | NBAIM, CMFRI, CIFT, CIFA, IARI, CIFRI, NRCG
Salinity/sodicity and acidity | Rice | DRR, CRRI, CAU-Barapani, ICAR Research Complex for NEH Region, NRCPB, DUSC
Salinity/sodicity and acidity | Shrimp | CIBA, CIFA, CIFE
Temperature (heat and cold) | Microbes | NBAIM, IARI, CIFA, NRCG
Temperature (heat and cold) | Rice | CRRI, VPKAS, CAU-Barapani, ICAR Research Complex for NEH Region, IARI, NRCPB, DUSC
Temperature (heat and cold) | Vigna spp. | NBPGR, RAU-Bikaner
Temperature (heat and cold) | Goat | CIRG, NBAGR, SKUAST, IVRI, NDRI
Temperature (heat and cold) | Camel | NBAGR, NRCC, SKUAST, IVRI, NDRI
Temperature (heat and cold) | Trout | DCFR
Submergence (anoxia) | Microbes | CRRI, CIFT, CMFRI
Submergence (anoxia) | Catfish | NBFGR, CIFA
Submergence (anoxia) | Rice | CRRI, NRCPB, DUSC
Statistical and computational genomics | Across species | IASRI and other participating institutes
Traits: Eight (moisture stress, salinity, sodicity, acidity, heat, cold, submergence and anoxia)
Species: Microbes, several; Plants, seven; Fish, four; Animals, two
Institutions: 36 (ICAR Institutes, SAUs, CUs and IIT)
Our Focus
Emphasis on the master regulators such as transcription factors and signaling components while prospecting genes and alleles for complex abiotic stress tolerance traits
Prospecting Genes for Moisture Stress Tolerance
Figure labels: ABA; DREB2A, 2B; bZIP; MYC/MYB; DRE/CRT, ABRE and MYCRS/MYBRS promoter elements; mRNA of an abiotic stress responsive gene.
Involvement of different transcription factors in the induction of stress genes in response to moisture stress. Osmotic stress signalling generated by moisture stress seems to be mediated by transcription factors such as DREB2A, DREB2B, bZIP, MYC and MYB, which act as transcription activators and interact with the DRE/CRT, ABRE or MYCRS/MYBRS elements in the promoters of stress genes.
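A promoter can be screened for the DRE/CRT, ABRE and MYCRS/MYBRS elements with simple pattern matching. This is a minimal sketch. The consensus patterns are commonly cited forms and are an assumption here, since the slide does not give them. The promoter sequence is hypothetical, and only the forward strand is scanned.

```python
import re

# Assumed consensus sequences (not given on the slide), as regular expressions.
CIS_ELEMENTS = {
    "DRE/CRT": "[AG]CCGAC",
    "ABRE": "ACGTG[GT]C",
    "MYCRS": "CA[ACGT]{2}TG",
    "MYBRS": "[CT]AAC[ACGT][AG]",
}


def scan_promoter(sequence):
    """Return 0-based start positions of each element on the forward strand."""
    sequence = sequence.upper()
    hits = {}
    for name, pattern in CIS_ELEMENTS.items():
        # A lookahead is used so that overlapping sites are all reported.
        hits[name] = [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]
    return hits


# Hypothetical promoter built with one copy of each element between T spacers.
promoter = ("TTTT" + "ACCGAC" + "TTTT" + "ACGTGGC" + "TTTT"
            + "CAAATG" + "TTTT" + "TAACTG" + "TTTT")

hits = scan_promoter(promoter)
print(hits)
```

A gene whose promoter carries several of these elements is a candidate target of the corresponding transcription factors, which still has to be confirmed experimentally.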
Statistical and Computational Genomics
• Data analysis support
• Development of algorithms
• Capacity building
• Creation of centralized facility
The statistical and computational aspect is complex and enormous. This will be a beginning; as we go along, need-based strengthening will take place with more resources and manpower.
Creation of a High Throughput Genotyping Facility
• Installed an Applied Biosystems (ABI) 3730xl DNA Analyzer for high-throughput genotyping of rice core collections using multiplexed microsatellite markers
• 6,912 core germplasm samples can be genotyped for 4 markers in 5 days
• Genotyping using 72 markers will require 90 days of uninterrupted run
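The 90-day figure follows directly from the stated throughput of 4 markers per 5-day run. The sketch below reproduces that arithmetic. The function names are hypothetical, and it assumes runs are strictly sequential with no downtime, as the phrase "uninterrupted run" implies.

```python
import math


def genotyping_days(total_markers, markers_per_run=4, days_per_run=5):
    """Days needed when each run genotypes all samples for a fixed marker set."""
    runs = math.ceil(total_markers / markers_per_run)
    return runs * days_per_run


def data_points(samples, markers):
    """Total sample-by-marker genotype calls produced."""
    return samples * markers


print(genotyping_days(72))        # 18 runs of 5 days each
print(data_points(6912, 72))
```

For 72 markers this gives 18 runs, or 90 days, and 497,664 genotype calls across the 6,912 samples.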
Installed bead-array platforms for highly multiplexed Illumina GoldenGate and Infinium assays for high-throughput genotyping of SNP markers
• Illumina BeadXpress
• Illumina iScan
What is Gained from Bio-prospecting and Allele Mining
• NOVEL GENES TO MEET THE CHALLENGES OF CLIMATE CHANGE
• NEW ALLELES OF KNOWN GENES FOR TARGET TRAIT
• ELITE GENOTYPES WITH DESIRABLE TRAIT
• WELL CHARACTERIZED GERMPLASM RESOURCES
• NEW FACILITIES FOR HIGH THROUGHPUT GENOTYPING, PHENOTYPING AND COMPUTATIONAL GENOMICS