Access to and Add Value of Archived Data - Methodology of Data Integration and Mining for 1:1M Land Type Mapping of China Prof. Liu Chuang Prof. Shen Yuancen Global Change Information and Research Center IGSNRR/Chinese Academy of Sciences PPF-WSIS Phase II, 14 November 2005, Tunis


Access to and Add Value of Archived Data -

Methodology of Data Integration and Mining

for 1:1M Land Type Mapping of China

Prof. Liu Chuang

Prof. Shen Yuancen

Global Change Information and Research Center

IGSNRR/Chinese Academy of Sciences

PPF-WSIS Phase II, 14 November 2005, Tunis

1 China’s Scientific Data Sharing Program

2 Opportunities and Challenges: Access to and Add Value of the Archived Data

3 Methodology of Adding Value of Archived Data

4 Example: 1:1M Land Type Mapping of China

1 China’s Scientific Data Sharing Program

China has an implementation program for enhancing open access to scientific data: the Scientific Data Sharing Program (SDSP), a national long-term (2005-2020) program that was initiated in 2003.

About 40 data centers and 300 major databases covering almost all of the basic sciences will receive long-term support, and a series of data policies and data standards will be established to meet the needs of open access to the archived data.

In addition, e-Government programs in Chinese agencies and the e-Science program in CAS will greatly promote the Scientific Data Sharing Program. One example is the quick response system for water resources management.

About 250 TB of data had been archived in a standard or near-standard manner in China (June 2005).

2 Opportunities and Challenges: Access to and Add Value of the Archived Data

This progress creates great opportunities for scientists in research:

• the location of data

• the way to access it

• free or low cost

Two Major Challenges in China:

• Preservation and open access: services need to be more stable, more open, faster, easier and lower in cost, and there is a long way to go

• Add value: a new methodology in data integration and mining, which still has to be created

3 Methodology of Adding Value of Archived Data

The value of scientific data can be divided into:

• value for scientific research

• value for social benefit

• value for economic income

Relationship between data value and data integration/mining

[Figure: value (y-axis) against time (x-axis) for Dataset 1, Dataset 2 and Dataset 3]

Reference Hierarchical Model for Data Integration and Data Mining

[Figure: hierarchical model. Stages: Data Selection, Data Integration, Object Simulating, Cal/Val. Side label: Computational Process. Other labels: data model, knowledge, Distributed Information Infrastructure, Innovated Ideas/Society Needs]

• Data Selection: two important issues at this stage

(1) how to select the necessary data from the distributed data holders to meet the modeling needs of a specific objective

(2) how to determine the weight of each selected dataset

• Data Integration: one very difficult issue has to be solved at this stage: making the selected datasets compatible, including data standard, terminology, definition, format, unit, resolution, time period, method of data capture …

• Object Simulating: two critical issues need to be solved at this stage:

- establish a relationship between the selected datasets (the model)

- determine the parameters of the model

• Cal/Val for the new dataset: how good can the quality of the new dataset be?

- What is the quality, and under what conditions can the new dataset or knowledge be of high quality?

- Is there any way to help bring the dataset to sufficient quality?

• New knowledge/new dataset created: go to the publication and data archiving process
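The stages listed above can be illustrated with a minimal sketch. The slides do not say how compatibility or weighting is achieved, so everything here is an assumption made for illustration: resolution is matched by block averaging, units are made comparable by rescaling to [0, 1], and the selected layers are combined with a simple weighted overlay. The function names `block_mean`, `rescale` and `weighted_overlay` are hypothetical.

```python
def block_mean(grid, factor):
    """Coarsen a 2-D grid by averaging factor x factor blocks.

    Illustrative way to bring a fine-resolution layer onto a coarser grid.
    Rows and columns that do not fill a whole block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def rescale(grid, lo, hi):
    """Map values from [lo, hi] to [0, 1] so layers in different units compare."""
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def weighted_overlay(layers, weights):
    """Weighted average of same-shaped layers (assumed integration model)."""
    total = sum(weights)
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


# Synthetic values for illustration only, not taken from the datasets.
fine_layer = [[0.0, 2.0], [4.0, 6.0]]      # 2 x 2 grid, arbitrary unit
coarse_layer = [[1.0]]                     # 1 x 1 grid, already in [0, 1]

matched = rescale(block_mean(fine_layer, 2), lo=0.0, hi=6.0)
combined = weighted_overlay([matched, coarse_layer], weights=[1.0, 3.0])
print(combined)
```

In practice the weights and the model form would come out of the Object Simulating and Cal/Val stages rather than being fixed by hand as they are here.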


Example:

Data Integration and Mining for 1:1M Land Type Mapping of China

Land type research and 1:1M mapping in China

China has a long history of land type studies; an early record, from 170 BC, divided the land of China into 9 types.

The most recent land type studies for 1:1M mapping started in 1987, and the first land type classification system for 1:1M mapping of China was created in 1990, led by Prof. Zhao Songqiao. landtypeclaSytemChina.doc

[Figure: current state of the completed part of the 1:1M Land Type Map of China]

Datasets: The datasets used in this paper include:

(1) Climate datasets from more than 600 climate stations, from CMA

(2) Soil map at 1:1M, from CAS

(3) MODIS-NDVI/EVI, 250 m and 1 km resolution, 16-day and 10-day composites, 2002, from NASA and CAS

(4) MODIS-NDSI, 1 km resolution, 10-day and monthly composites, 2002, from CAS

(5) SRTM at 90 m from USGS and DEM at 1:250k from the Geomatic Center of China

(6) Ground truth survey datasets in Northeast China, Inner Mongolia, Tibet, Gansu, Zhejiang, Guizhou …

(7) Historical records including documentation and maps, from CAS

(8) Yearbooks of agriculture and land use, from the Statistic Bureau of China

[Figure: MODIS-NDVI 16-day composite dataset, 2002, 1 km, with field sites marked]

NDVI = (MODIS2-MODIS1)/ (MODIS2+MODIS1)

EVI = 2.5*(MODIS2-MODIS1)/(MODIS2+6*MODIS1-7.5*MODIS3+1)

NDSI = (MODIS4-MODIS6)/(MODIS4+MODIS6)
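As a minimal sketch, the three index formulas can be written as plain functions. It assumes that MODISn denotes the surface reflectance of MODIS band n on a 0 to 1 scale (band 1 red, band 2 near-infrared, band 3 blue, band 4 green, band 6 shortwave infrared); the function and parameter names are illustrative and not from the slides.

```python
import math


def ndvi(band1, band2):
    """NDVI = (MODIS2 - MODIS1) / (MODIS2 + MODIS1)."""
    denom = band2 + band1
    return (band2 - band1) / denom if denom != 0 else math.nan


def evi(band1, band2, band3):
    """EVI = 2.5 * (MODIS2 - MODIS1) / (MODIS2 + 6*MODIS1 - 7.5*MODIS3 + 1)."""
    denom = band2 + 6.0 * band1 - 7.5 * band3 + 1.0
    return 2.5 * (band2 - band1) / denom if denom != 0 else math.nan


def ndsi(band4, band6):
    """NDSI = (MODIS4 - MODIS6) / (MODIS4 + MODIS6)."""
    denom = band4 + band6
    return (band4 - band6) / denom if denom != 0 else math.nan


def scaled(index_value, factor=10000):
    """Integer scaling of the kind used on the chart axes (for example 10000*NDVI)."""
    return round(index_value * factor)


# Synthetic reflectances for illustration only (a vegetated pixel).
red, nir, blue = 0.1, 0.5, 0.05
print(ndvi(red, nir), evi(red, nir, blue), scaled(ndvi(red, nir)))
```

The zero-denominator guard returns NaN instead of raising, which is one possible convention for pixels with no valid signal.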

[Figure: Forest (Betula). Map legend: NDVI 0 to 0.83. Chart: monthly NDVI time series (x-axis: month 1-12; y-axis: 10000*NDVI) with a single peak. Location: Far East Russia and the Daxingan Mountains in Heilongjiang Province]
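The "single peak" label suggests that the shape of the yearly NDVI curve helps separate land types. A minimal sketch of that idea counts local maxima in a 12-month series. The counting rule, the height threshold and the sample values are all assumptions made for illustration; they are not the method or the data behind the slides.

```python
def count_peaks(series, min_height):
    """Count interior local maxima at or above min_height.

    A point counts as a peak if it is strictly higher than its left
    neighbour and not lower than its right neighbour, so a flat top is
    counted once. The first and last months are never counted.
    """
    peaks = 0
    for i in range(1, len(series) - 1):
        rising = series[i] > series[i - 1]
        not_lower = series[i] >= series[i + 1]
        if series[i] >= min_height and rising and not_lower:
            peaks += 1
    return peaks


# Synthetic 10000*NDVI values, January to December, for illustration only.
one_season = [500, 600, 1200, 2500, 5000, 7500, 8300, 7800, 5500, 2500, 900, 500]
two_seasons = [2000, 2500, 4000, 6000, 6500, 3000, 4500, 6800, 6000, 3500, 2500, 2000]

print(count_peaks(one_season, min_height=3000))
print(count_peaks(two_seasons, min_height=3000))
```

A curve with one growing season gives a count of one, while a rotation of two crops in a year would be expected to give two; real composites are noisier and would usually need smoothing first.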

[Figure: Forest (Larix + Betula, upper panel) and meadow steppe (lower panel). Location: Great Hinggan Mountains]

[Figure: Cropland under winter wheat and maize rotation. Location: Huang-Huai-Hai Plain]

[Figure: Forest (purple) and rice (white). Location: North Korea]

[Figure: Wetland (reed). Map legends: NDVI 0 to 0.53; EVI 0 to 0.42. Charts: NDVI time series and EVI time series of Phragmites australis (x-axis: month 1-12). Location: Yellow River Delta]

[Figure: Temperate meadow (map legends: NDVI 0 to 0.6 and NDVI 0 to 0.8) and temperate steppe (map legends: NDVI 0 to 0.4 and NDVI 0 to 0.6). Charts: monthly time series, Jan. to Dec. (y-axes: MODIS_NDVI*10000 and MODIS_EVI*10000). Location: Xilingol, Inner Mongolia]

[Figure: Temperate desert (map legend: NDVI 0 to 0.25), temperate desert steppe (map legend: NDVI 0 to 0.2) and sand steppe (map legends: NDVI 0 to 0.45 and NDVI 0 to 0.35). Charts: monthly time series, Jan. to Dec. (y-axes: MODIS_NDVI*10000 and MODIS_EVI*10000). Location: Xilingol, Inner Mongolia]

[Figure: Wetland. Map legends: NDVI 0 to 0.52; EVI 0 to 0.35. Fig. 1: S. alterniflora salt marsh, monthly 10000*NDVI. Fig. 2: S. alterniflora salt marsh, monthly 10000*EVI (x-axis: month 1-12). Location: coastal area in northern Jiangsu Province]

[Figure: Alpine meadow. Charts: monthly 10000*NDVI and 10000*EVI (x-axis: month 1-12). Location: Qinghai Province]

[Figure: Gobi in the arid region of northwestern China. Charts: monthly 10000*NDVI and 10000*EVI (x-axis: month 1-12). Location: MinQin County, Gansu Province]

[Figure: Spring wheat cropland. Charts: monthly 10000*EVI and 10000*NDVI (x-axis: month 1-12). Location: MinQin County (Oasis), Gansu Province]

[Figure: image panels for April 2001, June 2001 and August 2001. Location: Gongbujiangda area, eastern Tibet]

Location: Nyainqentanglha Mountains

NDSI > 0.4 and MODIS2 > 0.11

[Figure: three panels. Upper left: Feb. 2002; upper right: June 2002; lower left: Sep. 2002]
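The criterion NDSI > 0.4 and MODIS2 > 0.11 can be sketched as a per-pixel test. Reading it as a snow-cover test is an assumption based on NDSI being a snow index, as is treating MODISn as the 0 to 1 reflectance of band n; the function names `is_snow` and `snow_fraction` are hypothetical.

```python
def is_snow(band2, band4, band6, ndsi_min=0.4, band2_min=0.11):
    """Apply the slide's criterion: NDSI > 0.4 and MODIS2 > 0.11.

    NDSI = (MODIS4 - MODIS6) / (MODIS4 + MODIS6). Pixels with a zero
    denominator are treated as not snow.
    """
    denom = band4 + band6
    if denom == 0:
        return False
    ndsi = (band4 - band6) / denom
    return ndsi > ndsi_min and band2 > band2_min


def snow_fraction(pixels):
    """Share of (band2, band4, band6) pixels that pass the criterion."""
    if not pixels:
        return 0.0
    return sum(is_snow(*p) for p in pixels) / len(pixels)


# Synthetic reflectances for illustration only.
pixels = [
    (0.70, 0.80, 0.10),  # bright, high NDSI
    (0.05, 0.08, 0.02),  # high NDSI but dark in band 2
    (0.30, 0.20, 0.25),  # negative NDSI
    (0.60, 0.75, 0.15),  # bright, high NDSI
]
print(snow_fraction(pixels))
```

The band 2 condition rejects dark pixels that happen to have a high NDSI, which is why the second sample pixel fails even though its NDSI is above 0.4.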

Conclusion:

The reference hierarchical model of data integration and mining is very important for the development of innovative knowledge, and computational science plays a critical role in the new methodology. The new methodology in data integration and mining will take land type studies in China to a new milestone.

Thank you!