Introduction to Scientific Data Lecture 5



The Informatics Effect

Computers have transformed how we collect, store, analyze, and visualize data. Notably,

• scientific data vary more than ever before;
• we can store more data than ever before;
• we can analyze more data more quickly than ever before;
• we can visualize data in revealing new ways.

However, scientific data are not the same as computer data.


So, What Are Scientific Data?

We can ask several questions about scientific data, including:

• Do data need to be numerical?
• Do data need to be measured for a purpose?
• Is free text data?
  – What about clinical reports from physicians?
  – What about web sites?
• Can processing steps turn computer data into scientific data?
• Is the distinction a matter of quality?

Why is the distinction important?


Neuroscience Data

Blood flow in the brain


Genetic Data

Gene expression levels across conditions


Oceanographic Data

Sea surface temperature


What Are Scientific Data?

What do these examples have in common?

They aren’t data.


Then What Is This?

1. Satellites record infrared radiance as a collection of numbers.

2. Computers convert these numbers to temperature data using algorithms based on the theory of blackbody radiation.

3. One can plot these temperatures as a heat map to get a global view of sea surface temperature.

This is a visualization of data produced in multiple stages.
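Step 2 above can be illustrated with a minimal sketch, assuming a single infrared channel and an ideal blackbody. It inverts Planck's law to turn a spectral radiance into a brightness temperature. The function names and the example wavelength are hypothetical choices for illustration; operational sea surface temperature retrievals also correct for the atmosphere and surface emissivity, which this sketch ignores.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_radiance(wavelength_m, temp_k):
    """Spectral radiance of an ideal blackbody (W m^-2 sr^-1 m^-1)."""
    x = H * C / (wavelength_m * K_B * temp_k)
    return 2.0 * H * C**2 / wavelength_m**5 / math.expm1(x)


def brightness_temperature(wavelength_m, radiance):
    """Invert Planck's law: the temperature (K) at which a blackbody
    would emit the given spectral radiance at this wavelength."""
    ratio = 2.0 * H * C**2 / (wavelength_m**5 * radiance)
    return H * C / (wavelength_m * K_B) / math.log1p(ratio)


if __name__ == "__main__":
    wavelength = 11e-6  # hypothetical thermal-infrared channel, in meters
    simulated = planck_radiance(wavelength, 290.0)  # stand-in for a measurement
    print(brightness_temperature(wavelength, simulated))
```

The round trip (temperature to radiance and back) is only a consistency check on the formula; it says nothing about how well a real instrument is calibrated.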


Then What About DNA Microarrays?

Raw microarray images

Expression levels

geneID    15’    60’    360’
ssr3571   0.97   1.05   0.96
ssr3570   0.99   1.11   0.91
ssr3532   1.46   1.15   1.21
ssr3467   1.08   1.51   0.98
ssr3465   0.51   0.76   1.16
ssr3451   0.80   1.01   1.12

Heat map visualization

Raw expression levels undergo statistical normalization before serving as empirical evidence.
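As one hedged illustration of what such normalization can look like, the sketch below log2-transforms each array (column) and subtracts the column median, so that arrays become comparable. This is a common simple scheme, not necessarily the one used for the data on this slide. The function name is hypothetical, and the numbers from the table are treated as raw values purely for illustration; the slide does not say whether they were already normalized.

```python
import math
from statistics import median


def median_center_log2(columns):
    """Log2-transform each column and subtract its median.

    `columns` maps a condition label to a list of positive expression values.
    """
    normalized = {}
    for label, values in columns.items():
        logs = [math.log2(v) for v in values]
        center = median(logs)
        normalized[label] = [x - center for x in logs]
    return normalized


if __name__ == "__main__":
    # Values copied from the table on this slide (genes ssr3571 ... ssr3451).
    table = {
        "15'": [0.97, 0.99, 1.46, 1.08, 0.51, 0.80],
        "60'": [1.05, 1.11, 1.15, 1.51, 0.76, 1.01],
        "360'": [0.96, 0.91, 1.21, 0.98, 1.16, 1.12],
    }
    for label, column in median_center_log2(table).items():
        print(label, [round(v, 3) for v in column])
```

After this step every column has a median of zero on the log scale, so a positive value reads as "above the typical gene on that array" regardless of overall array brightness.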


What About the fMRI Image?

1. An MRI scanner records the blood oxygen level at each voxel in a 3D grid over an extended period of time.

2. Computers convert the raw time series to data using algorithms that remove the effects of noise, motion, etc.

3. These data undergo further processing to visualize neuronal activity with respect to an anatomical image.

This is a visualization of the data after several stages of processing.
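To make step 2 concrete, here is a minimal sketch of one small piece of such processing: removing slow linear drift from a single voxel's time series by subtracting a least-squares line. The function name is hypothetical, and this is only an assumed example of a noise-removal step; real fMRI pipelines also handle head motion, slice timing, and more, none of which is shown here.

```python
def detrend_linear(series):
    """Subtract the least-squares straight line from a time series.

    Samples are assumed to be evenly spaced in time.
    """
    n = len(series)
    if n == 0:
        return []
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx if sxx else 0.0
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


if __name__ == "__main__":
    # Hypothetical voxel signal: an alternating response riding on upward drift.
    raw = [100.0 + 0.5 * t + (1.0 if t % 4 < 2 else -1.0) for t in range(16)]
    print([round(v, 2) for v in detrend_linear(raw)])
```

The output keeps the alternating component while the baseline and most of the drift are removed, which is the sense in which a raw time series becomes data that can be compared across voxels.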


So, What Are Scientific Data?

The general characteristics seem tied to intent or purpose. In particular, scientists

• collect data for a purpose;
• process raw measurements to produce data that answer scientific questions;
• structure and interpret data in light of scientific theories.

What other distinctions come to mind?


Scientific Data Through History

• Early data consisted of drawings and tables of numbers and other properties.

• As science progressed, x-rays, clinical reports, and other modalities served as data.

• Currently, informatics solutions are changing our definitions of scientific data.

The nature of data changes with scientific progress, technological advancements, and problem-specific needs.

Should we be cautious about the term ‘scientific data’?


Tycho Brahe’s Data

Early data were often recorded as tables of numbers. Brahe’s observations, later compiled by Kepler into the Rudolphine Tables,
• provided precise astronomical records,
• recorded the positions of stars and planets, and
• enabled Kepler’s discovery of his laws of planetary motion.


Galileo’s Moon

Data were also recorded as drawings, particularly in astronomy and anatomy. Through his use of the newly invented telescope, Galileo reported data that
• presented a visual record of the moon’s surface,
• revealed evidence of craters and mountains for the first time, and
• challenged the prevailing view that celestial objects are perfect spheres.

From Sidereus Nuncius


Wilhelm Roentgen’s X-rays

Technology both refined the senses and expanded them. Roentgen discovered x-rays, which let scientists
• view internal anatomy noninvasively,
• determine the structure of crystals,
• analyze the elemental composition of solid materials, and
• detect interesting astronomical phenomena.


Rosalind Franklin’s DNA X-rays

Photo 51

A theory of wave interactions with atoms supports x-ray diffraction.

Franklin used this method to image the geometry of DNA molecules.

These images led Watson and Crick to the double-helix model.


Mass Spectrometry

Mass spectrometers produce data from chemical compounds.

The mass-to-charge ratios indicate the compound’s fragmentation pattern.

The graph suggests composition, structure, and other properties.

Isotope distributions of a peptide


The Informatics Effect: Data and the Human Genome Project

A large-scale endeavor, the Human Genome Project was greatly aided by informatics technology, such as
• databases for storing DNA sequences;
• sequence alignment tools for genome reconstruction;
• sequence annotation tools for labeling and relating important regions of the genome; and
• visualization tools that provide overviews of large areas of the genome.


The Informatics Effect: Data and the Large Synoptic Survey Telescope

Planned for 2015, the LSST will scan the sky semiweekly and yield 60 PB of raw images that are analyzed for
• events that would benefit from collaborative monitoring,
• variable objects (e.g., gamma ray bursts), and
• moving objects (e.g., asteroids).

Further processing tools will
• measure properties of faint objects,
• map the x,y coordinates of the images to celestial coordinates, and
• classify objects based on static and dynamic behavior.
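The mapping from image x,y coordinates to celestial coordinates can be sketched under strong simplifying assumptions: a tangent-plane (gnomonic) projection, square pixels, no rotation or optical distortion, and right ascension increasing toward decreasing x. The function name, reference pixel, pointing, and pixel scale below are all hypothetical; this is not the LSST pipeline, which fits a full astrometric solution.

```python
import math


def pixel_to_celestial(x, y, ref_pixel, ref_radec_deg, scale_deg_per_pixel):
    """Map a pixel (x, y) to (RA, Dec) in degrees.

    ref_pixel is the pixel that sits at the tangent point, whose sky
    position is ref_radec_deg. All parameters are illustrative assumptions.
    """
    ra0 = math.radians(ref_radec_deg[0])
    dec0 = math.radians(ref_radec_deg[1])
    # Tangent-plane offsets in radians; RA grows toward decreasing x.
    xi = -math.radians((x - ref_pixel[0]) * scale_deg_per_pixel)
    eta = math.radians((y - ref_pixel[1]) * scale_deg_per_pixel)
    # Standard inverse gnomonic projection.
    denom = math.cos(dec0) - eta * math.sin(dec0)
    ra = ra0 + math.atan2(xi, denom)
    dec = math.atan2(math.sin(dec0) + eta * math.cos(dec0), math.hypot(xi, denom))
    return math.degrees(ra) % 360.0, math.degrees(dec)


if __name__ == "__main__":
    # Hypothetical 2048 x 2048 image centered on RA 150 deg, Dec 2 deg.
    print(pixel_to_celestial(1124.0, 900.0, (1024.0, 1024.0), (150.0, 2.0), 0.2 / 3600.0))
```

In practice this step is usually delegated to a library that reads the projection parameters from the image header rather than coded by hand.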


Scientific Data

Scientific activity depends upon the ability to collect, store, retrieve, analyze, and visualize data.

These data often result from the processing of raw measurements by informatics tools.

Researchers in all areas are recognizing the vital role of informatics solutions that drive the data lifecycle.

Advances in these solutions can ultimately lead to advances in scientific knowledge.