Data Manipulation
Evangelos Pournaras, Izabela Moise, Dirk Helbing



Data Manipulation

The process of data transformation, formatting & structuring.

Examples: updating, adding/removing, sorting, selection, merging, shifting, aggregation, etc.

Tip: In Data Science, data come with collection and science starts with manipulation!
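
The operations listed above can be illustrated with a minimal Python sketch. The records, field names (`sensor`, `t`, `value`) and the `locations` lookup are hypothetical and exist only for illustration; they do not come from the slides.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sensor readings (illustrative data only).
readings = [
    {"sensor": "s1", "t": 3, "value": 20.0},
    {"sensor": "s2", "t": 1, "value": 18.5},
    {"sensor": "s1", "t": 1, "value": 21.0},
    {"sensor": "s2", "t": 2, "value": None},  # a missing measurement
]
# Hypothetical second data set used for merging.
locations = {"s1": "roof", "s2": "basement"}

# Removing: drop records with a missing value.
clean = [r for r in readings if r["value"] is not None]

# Adding: append a new record.
clean.append({"sensor": "s2", "t": 3, "value": 19.5})

# Selection: keep only the records of one sensor.
s1_only = [r for r in clean if r["sensor"] == "s1"]

# Sorting: order by sensor, then by time.
ordered = sorted(clean, key=lambda r: (r["sensor"], r["t"]))

# Merging: attach the location of each sensor to its records.
merged = [{**r, "location": locations[r["sensor"]]} for r in ordered]

# Aggregation: average value per sensor.
groups = defaultdict(list)
for r in merged:
    groups[r["sensor"]].append(r["value"])
averages = {sensor: mean(values) for sensor, values in groups.items()}

print(averages)
```

Each step produces a new list rather than changing `readings` in place, which keeps the raw data available for later checks.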


A "Dirty Job"


Do you really need it?

• Big Data and the Internet of Things result in large amounts of unstructured data.

• New data collection opportunities require advanced data manipulation techniques.

• Working with data manipulation is becoming both more likely and more necessary.


Data Format & Manipulation

Select a data manipulation approach based on how data are stored and managed:

1. Text files

2. Databases

3. Big Data
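
As a rough illustration of how the storage form shapes the approach, the sketch below computes the same per-sensor total twice: once by parsing text in application code, and once by loading the rows into a database and letting SQL aggregate. The CSV content, the `readings` table and both function names are hypothetical; Python's standard `csv` and `sqlite3` modules stand in for "text files" and "databases".

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Hypothetical data; in practice this would be a file on disk.
CSV_TEXT = "sensor,value\ns1,21.0\ns2,18.5\ns1,20.0\n"


def totals_from_text(text):
    """Text-file approach: parse the rows and aggregate in application code."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        totals[row["sensor"]] += float(row["value"])
    return dict(totals)


def totals_from_database(text):
    """Database approach: load the rows into a table and aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    rows = [
        (row["sensor"], float(row["value"]))
        for row in csv.DictReader(io.StringIO(text))
    ]
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    cursor = conn.execute(
        "SELECT sensor, SUM(value) FROM readings GROUP BY sensor"
    )
    result = dict(cursor)  # each result row is a (sensor, total) pair
    conn.close()
    return result


print(totals_from_text(CSV_TEXT))
print(totals_from_database(CSV_TEXT))
```

Both functions return the same totals; the difference is where the work happens. The third category, Big Data, typically relies on distributed processing frameworks and is not sketched here.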


How to manipulate data

Most programming languages and several software tools can manipulate data:

• Java, Python, C, C++, etc.

• Matlab, R, Excel, etc.

Criteria for selection:

• Ease of use

• Library support

• Portability

• Performance

• Data format


What is next?

• Data manipulation with AWK
