
Introduction to Gene Mining: Part B: How similar are plant and animal versions of a gene?


Introduction to Gene Mining

Part B: How similar are plant and human versions of a gene?

After completing Part B, you will be able to demonstrate how to use NCBI BLASTp and www.Araport.org data to determine whether Arabidopsis thaliana and human muscle protein genes and gene products are homologous.


The Arabidopsis Information Portal is funded by a grant from the National Science Foundation (#DBI-1262414) and co-funded by a grant from the Biotechnology and Biological Sciences Research Council (BB/L027151/1).

These lessons were developed during the summer of 2015 as education outreach for the www.Araport.org portal in conjunction with the J. Craig Venter Institute, Rockville, MD, 20850, USA.

Contact information

General information: [email protected]

Jason Miller, Grant Co-Principal Investigator, JCVI [email protected]

This lesson was prepared by Andrea Cobb, Ph.D. ([email protected])

with the help of Margot Goldberg ([email protected])


In Part A, our sample question was:

Can we study your muscle disease using a plant model?


We used the NCBI portal to find names of human muscle genes.


We also found the function of the human actin alpha 1 gene (ACTA1) and asked, “Might plants need that same function?”


We used NCBI BLASTn to search Arabidopsis thaliana for genes that align to human ACTA1.


We learned that “alignment” is achieved by using an algorithm that maximizes local matches between two sequences.
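To make "maximizing local matches" concrete, here is a minimal sketch of the textbook Smith-Waterman local alignment score. BLAST itself uses faster heuristics rather than this exhaustive method, and the scoring values (match, mismatch, gap) are illustrative assumptions, not the values BLAST uses.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in sequence b
            left = score[i][j - 1] + gap    # gap in sequence a
            # A local alignment may restart anywhere, so never go below zero.
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best


# Toy DNA strings (hypothetical): only the shared "ACG" stretch scores.
print(local_alignment_score("TTACGTT", "GGACGGG"))
```

Resetting negative scores to zero is what makes the alignment "local": the algorithm reports the best-matching stretch and ignores poorly matching flanks.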


We learned how to use the BLASTn report scores (Query cover, Ident, and E-value) to choose a statistically meaningful alignment.
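The selection step can be sketched in a few lines of Python. The hit names and numbers below are hypothetical, and the cutoffs are assumptions chosen for illustration; they are not official NCBI thresholds.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    query_cover: float  # percent of the query covered by the alignment
    identity: float     # percent identical positions ("Ident")
    e_value: float      # expected number of chance hits this good


def meaningful_hits(hits, min_cover=50.0, min_identity=25.0, max_e=1e-5):
    """Keep hits passing all three cutoffs, best (lowest E-value) first.

    The default cutoffs are illustrative assumptions.
    """
    kept = [
        h for h in hits
        if h.query_cover >= min_cover
        and h.identity >= min_identity
        and h.e_value <= max_e
    ]
    return sorted(kept, key=lambda h: (h.e_value, -h.identity))


# Hypothetical report rows, not real BLAST output.
hits = [
    Hit("hit_A", 99.0, 88.0, 0.0),
    Hit("hit_B", 40.0, 90.0, 1e-30),  # fails: low query cover
    Hit("hit_C", 95.0, 30.0, 0.5),    # fails: E-value too high
    Hit("hit_D", 98.0, 60.0, 1e-50),
]
for hit in meaningful_hits(hits):
    print(hit.name, hit.e_value)
```

A hit must pass all three checks: a high identity over a tiny stretch (low query cover) or a match that could easily arise by chance (high E-value) is not convincing on its own.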


Explain: Gene Discovery Scorecard

In a group of 3-4 students, examine your gene discovery scorecard and then:

Infer characteristics of genes which were in both A. thaliana and humans.

Identify characteristics of genes present in humans but not found in plants.


Explain

What information so far indicates whether or not plants have animal muscle genes?

What additional information might you need to be certain whether or not plants have animal muscle genes?


Part B: Evaluating homology: How similar are plant and human versions of a gene?


Recipes handed down often change


Which parts of the recipes were conserved (were almost the same) in all generations’ recipes?

Which parts were not conserved?


Reasons why a recipe might be changed

• Discuss in groups and report your ideas.


How might you track the passage of a recipe from one generation to the next if you can’t ask the cooks?


How is a gene like a recipe?

• Discuss in groups and report your ideas.


What features of a gene might make it a version of another gene?

Record your answers.

https://www.youtube.com/watch?v=gCxrkl2igGY is a song you might remember.


• What is homology?

• What criteria do scientists use to classify particular genes and their protein products as homologs?

Explore


• Homology: a general term describing two or more genes that share an ancestral gene

• How might recipes be “homologous”?


To use a plant model for my patient’s disease, I need to find a plant homolog of his ACTA1 gene. We found that the Arabidopsis thaliana ACT7 gene is a version, but is it similar enough to be a homolog?


Should we search for homologs using a gene sequence or a protein sequence?


The structure of a eukaryotic gene is complex!

The amino acid sequence of the protein is more likely to be conserved than the gene sequence, because introns and synonymous codon changes can alter the DNA without changing the protein.

Translation (protein synthesis)

http://nitro.biosci.arizona.edu/courses/EEB600A-2003/lectures/lecture24/lecture24.html


A BLASTp search using the gene product’s amino acid sequence is likely to find protein homologs.

A BLASTn search might find more differences than similarities.


We will use a protein BLAST tool, BLASTp, to find homologous proteins. We first need to find the protein sequence coded by the human ACTA1 gene on the NCBI protein page.


From the ACTA1 protein information page, select FASTA, then copy and paste the amino acid sequence into a Word Document.

>gi|49168518|emb|CAG38754.1| ACTA1 [Homo sapiens]
MCDEDETTALVCDNGSGLVKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYPIEHGIITNWDDMEKIWHHTFYNELRVAPEEHPTLLTEAPLNPKANREKMTQIMFETFNVPAMYVAIQAVLSLYASGRTTGIVLDSGDGVTHNVPIYEGYALPHAIMRLDLAGRDLTDYLMKILTERGYSFVTTAEREIVRDIKEKLCYVALDFENEMATAASSSSLEKSYELPDGQVITIGNERFRCPETLFQPSFIGMESAGIHETTYNSIMKCDIDIRKDLYANNVMSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLSTFQQMWITKQEYDEAGPSIVHRKCF

Each amino acid is represented by a particular letter
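For readers who want to handle the FASTA text in a script instead of a Word document, the sketch below shows one way to separate the `>` header line from the sequence letters. The function name `parse_fasta` is hypothetical, and the example holds only the first 40 residues of the sequence shown above, split across two lines to show that FASTA sequences may wrap.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# First 40 residues of the ACTA1 record above (truncated for brevity).
example = """>gi|49168518|emb|CAG38754.1| ACTA1 [Homo sapiens]
MCDEDETTALVCDNGSGLVK
AGFAGDDAPRAVFPSIVGRP
"""

header, sequence = parse_fasta(example)[0]
print(header)
print(len(sequence), "residues")
# Each letter is one amino acid, so counting letters counts residues.
print(Counter(sequence).most_common(3))
```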


Navigate to the BLASTp link on NCBI.


Paste the protein sequence for ACTA1 here.

Enter Arabidopsis thaliana for the search database.

Select blastp and then click on the BLAST button.


The BLASTp report is similar to the BLASTn report.

Query sequence


“Descriptions” shows 4 actins with the same query coverage, E-value and Ident! There appear to be 4 possible homologous proteins, but which is most similar to the human ACTA1 protein?


There are a number of actin proteins with high Query coverage, very low E-values and high identity. Check them all (for some whose numbers are represented more than once, check the first listing). Then select “Multiple Alignment” to directly compare those sequences.


Conserved amino acids are shown in red. Which differences can you find quickly?

Can you spot a deletion?

Where is an amino acid replaced by a chemically similar type?

Where is an amino acid replaced by a chemically different type?
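The three questions above can be asked column by column in a short script. The sketch below labels each aligned position as identical, similar, different, or gap. The chemical groupings are one common simplification and are an assumption here (textbooks group amino acids in slightly different ways), and the two aligned strings are hypothetical toy data.

```python
# Simplified chemical groups (an assumption; groupings vary by textbook).
GROUPS = {
    "nonpolar": "AVLIM",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
}
# C, G and P are left ungrouped because each has unique properties.
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def classify_column(x, y):
    """Label one aligned column of two sequences."""
    if x == "-" or y == "-":
        return "gap"            # insertion or deletion
    if x == y:
        return "identical"
    group_x = GROUP_OF.get(x)
    if group_x is not None and group_x == GROUP_OF.get(y):
        return "similar"        # chemically similar replacement
    return "different"          # chemically different replacement


def classify_alignment(seq1, seq2):
    """Label every column of two equal-length aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    return [classify_column(x, y) for x, y in zip(seq1, seq2)]


# Hypothetical aligned fragments; "-" marks a gap.
print(classify_alignment("MKD-L", "MRDAF"))
```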


Protein sequence homology is analyzed by constructing a distance tree of results. Check the desired “hits”, then select “Distance tree”.
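Tree-building methods such as neighbor joining start from a table of pairwise distances between sequences. As a rough sketch of that first step only (not of the tree construction itself, and not necessarily the distance measure the NCBI tool applies), the code below computes the p-distance, the fraction of compared positions that differ, for every pair of aligned sequences. The sequence names and letters are hypothetical toy data.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of non-gap aligned columns at which a and b differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_pair(aligned):
    """Return the pair of names with the smallest p-distance."""
    return min(
        combinations(sorted(aligned), 2),
        key=lambda pair: p_distance(aligned[pair[0]], aligned[pair[1]]),
    )


# Hypothetical aligned sequences of equal length.
aligned = {
    "query": "MKTAYIAK",
    "hitA": "MKTAYLAK",
    "hitB": "MRSAYLGK",
}
for first, second in combinations(sorted(aligned), 2):
    print(first, second, p_distance(aligned[first], aligned[second]))
print("closest:", closest_pair(aligned))
```

Sequences separated by the smallest distance are joined first when a tree is built, which is why the closest hit ends up on the branch next to the query.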


Query—human ACTA1 protein

Nodes represent a shared ancestral gene

These proteins are all homologs.


Of the proteins in Arabidopsis thaliana, ACT7 has the highest identity (88%) and lowest E-value (0.0) when compared to human ACTA1.

A gene tree program predicts the presence of ancestral genes between ACT7 and ACTA1.

Is that sufficient to confirm protein homology for experimental modeling?


A more restricted alignment between human ACTA1 and the closest 3 Arabidopsis proteins can check that ACT7 is the protein closest to the ancestral gene.

Check “Align two or more sequences”, then copy and paste the protein sequences for ACT7, ACT8 and ACT2 into the Subject Sequence box.


Multiple alignment results for human ACTA1 protein and the 3 closest Arabidopsis proteins.


What do the distance tree results indicate?


Do you have enough data to use the Arabidopsis ACT7 gene as a model for the human ACTA1 gene?

Discuss and report your ideas.


What criteria from published work indicated that these plant processes and human diseases involved homologous genes or proteins?


Homologous proteins will have:

• Very low E-values for sequence alignment (< 0.00001)

• >25% conserved sequences for >100 aa*

• Protein-protein interactions of one homolog which are similar to protein-protein interactions of the other homolog

• Similar co-expression of genes for each homolog

• Similar function Gene Ontology (GO) terms

• Conserved sequences and protein domains*

* http://jura.wi.mit.edu/bio/education/hsteachers2012/form_blast_intro.pdf
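The two numeric criteria in the list can be written as a small checklist. This is a sketch under stated assumptions: "conserved sequences" is interpreted here as percent identity, the function name is hypothetical, and the aligned length of 370 in the example is a made-up placeholder. The 88% identity and E-value of 0.0 are the values reported for ACT7 earlier in this lesson.

```python
def sequence_criteria(e_value, percent_identity, aligned_length,
                      max_e=1e-5, min_identity=25.0, min_length=100):
    """Check the two sequence-based homology criteria from the list.

    Interpreting "conserved" as percent identity is an assumption.
    """
    return {
        "very_low_e_value": e_value < max_e,
        "conserved_over_100_aa": (
            percent_identity > min_identity and aligned_length > min_length
        ),
    }


# 88% and 0.0 come from the lesson; 370 is a hypothetical aligned length.
result = sequence_criteria(e_value=0.0, percent_identity=88.0, aligned_length=370)
for criterion, passed in result.items():
    print(criterion, "yes" if passed else "no")
```

The remaining criteria (interactions, co-expression, GO terms, domains) are not numeric cutoffs and need the database evidence gathered in the following slides.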


Let’s find homology information and data about the Arabidopsis ACT7 gene in http://www.Araport.org. Use the pull-down menu to access the ThaleMine tool.


Enter information about your gene of interest, in this case, ACT7


Results show 1 gene, 2 articles and 1 mRNA in the database.

We are only interested in studying the gene for now, so we will select the category “Gene” or just select the identifier for the gene from the list at right.


This is the Gene information sheet for the Arabidopsis thaliana ACT7 gene. How did the function listed under Curator Summary compare to your previous prediction?


The blue bar under Curator Summary has tabs that take you quickly to that section down the page. Click on the Homology tab.

Links to information about human homologs of ACT7.


Homologous proteins will have:

• Very low E-values for sequence alignment (< 0.00001)

• >25% conserved sequences for >100 aa*

• Protein-protein interactions of one homolog which are similar to protein-protein interactions of the other homolog

• Similar co-expression of genes for each homolog

• Similar function Gene Ontology (GO) terms

• Conserved protein domains

* http://jura.wi.mit.edu/bio/education/hsteachers2012/form_blast_intro.pdf


Compare the first (human ACTA1) and second (Arabidopsis ACT7) sequences in each alignment: far more than 25% of the amino acids in any 100-residue region match.
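Checking "any 100 amino acids" by eye is tedious, so here is a sketch of a sliding-window identity calculation over two aligned sequences. The function name is hypothetical, and the usage example uses short toy strings with a window of 4 so the arithmetic is easy to follow; for the real comparison the window would be 100.

```python
def window_identities(seq1, seq2, window=100):
    """Percent identity in every window of two equal-length aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    if len(seq1) < window:
        raise ValueError("sequences are shorter than the window")
    # 1 where both sequences show the same residue (gaps never match).
    matches = [int(x == y and x != "-") for x, y in zip(seq1, seq2)]
    current = sum(matches[:window])
    result = [100.0 * current / window]
    for i in range(window, len(matches)):
        # Slide by one column: add the new column, drop the oldest one.
        current += matches[i] - matches[i - window]
        result.append(100.0 * current / window)
    return result


# Toy aligned strings (hypothetical), window of 4 for readability.
identities = window_identities("AAAAAAAA", "AAAATTTT", window=4)
print(identities)
print("lowest window:", min(identities))
```

If the lowest window is still above 25%, the criterion holds for every region of the alignment, not just on average.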


Actin interacts with many proteins

https://www.youtube.com/watch?v=FzcTgrxMzZk


ACT7 and ACTA1 proteins each interact with a variety of other proteins. Because the same protein may have a plant name and a different animal name, further investigation is needed to know from this data whether ACTA1 and ACT7 are interacting with identical proteins.

Arabidopsis ACT7 interacts with these proteins

Human ACTA1 interacts with these proteins


Homologous proteins will have:

• Very low E-values for sequence alignment (< 0.00001)

• >25% conserved sequences for >100 aa*

• Protein-protein interactions of one homolog which are similar to protein-protein interactions of the other homolog ??

• Similar co-expression of genes for each homolog

• Similar function Gene Ontology (GO) terms

• Conserved protein domains

* http://jura.wi.mit.edu/bio/education/hsteachers2012/form_blast_intro.pdf


Co-expression (transcription of 2 or more genes at the same time in the same cell) is required for gene products (proteins) to work together.

http://www.frontiersin.org/files/Articles/96150/fpls-05-00426-HTML/image_m/fpls-05-00426-g001.jpg

In the image above, two differently colored fluorescent proteins are co-expressed in Arabidopsis.


What genes are co-expressed (same time, same location) for ACT7 or ACTA1?

Arabidopsis ACT7 is co-expressed with these genes

Human ACTA1 co-expression is shown with purple lines.


Scientists would need to confirm that the different plant and animal names were actually the same protein.


Homologous proteins will have:

• Very low E-values for sequence alignment (< 0.00001)

• >25% conserved sequences for >100 aa*

• Protein-protein interactions of one homolog which are somewhat similar to protein-protein interactions of the other homolog ??

• Some similar co-expression of genes for each homolog ??

• Some similar function Gene Ontology (GO) terms

• Conserved protein domains*

* http://jura.wi.mit.edu/bio/education/hsteachers2012/form_blast_intro.pdf


Gene Ontology provides information about biological process, molecular function and cellular location. Are any ACT7 GO terms similar to human ACTA1 GO terms?

Arabidopsis ACT7

Human ACTA1


Members of the Arabidopsis actin family of genes are homologous with each other. Does that mean that the Arabidopsis actins are homologous with human ACTA1?


Arabidopsis actin gene ACT7 plays an essential role in germination and root growth

The Plant Journal, Volume 33, Issue 2, pages 319-328, 16 JAN 2003. DOI: 10.1046/j.1365-313X.2003.01626.x

http://onlinelibrary.wiley.com/doi/10.1046/j.1365-313X.2003.01626.x/full#f2

Wild-type, no ACT7 mutation

Mutant ACT7+

Wild-type, no ACT7 mutation

Mutant ACT7+

We have an ACT7 mutant with an observable phenotype difference compared to the normal wild type.


Have we found a suitable plant research model for nemaline myopathy?

What additional information would you want?

Scientific literature searches for Arabidopsis information are easy to access in the http://www.Araport.org apps: 50 years of Arabidopsis research!
