Embulk makes Japan visible Kai Sasaki Treasure Data Inc.

Page 1: Embulk makes Japan visible

Embulk makes Japan visible

Kai Sasaki Treasure Data Inc.

Page 2: Embulk makes Japan visible

Who am I?

• Kai Sasaki (@Lewuathe)

• Treasure Data Inc

• Maintaining and improving Hadoop infrastructure

• Hadoop, Spark contributor

Page 3: Embulk makes Japan visible

Topic

• What is Embulk?

• Embulk ☓ GeoJSON

• DATA.GO.JP (http://www.data.go.jp/)

• DEMO

• Conclusion

Page 4: Embulk makes Japan visible

What is Embulk?

• Parallel bulk data loader

• using plugins

• to make data integration relaxed

http://www.embulk.org/docs/

Page 5: Embulk makes Japan visible

http://www.slideshare.net/frsyuki/embulk-56197273/4

Page 6: Embulk makes Japan visible

Plugins

http://www.slideshare.net/frsyuki/embuk-making-data-integration-works-relaxed/12

Page 7: Embulk makes Japan visible

Plugins

http://www.embulk.org/plugins/

Page 8: Embulk makes Japan visible

Embulk ☓ GeoJSON

• GeoJSON is a format for encoding geographic data structures

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [37.0, 128.4]
      },
      "properties": {
        "name": "Point A"
      }
    }
  ]
}
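The structure above can also be assembled programmatically. The following is a minimal sketch in plain JavaScript; the helper names `makePointFeature` and `makeFeatureCollection` and the coordinate values are illustrative assumptions, not part of the talk. Note that the GeoJSON specification (RFC 7946) orders a position as [longitude, latitude].

```javascript
// Hypothetical helper: wraps one named point in a GeoJSON Feature.
// GeoJSON positions are ordered [longitude, latitude] (RFC 7946).
function makePointFeature(name, longitude, latitude) {
  return {
    type: "Feature",
    geometry: { type: "Point", coordinates: [longitude, latitude] },
    properties: { name: name },
  };
}

// Hypothetical helper: groups features into a FeatureCollection.
function makeFeatureCollection(features) {
  return { type: "FeatureCollection", features: features };
}

// Illustrative coordinates only; they are not taken from the slides.
const collection = makeFeatureCollection([
  makePointFeature("Point A", 139.69, 35.68),
]);

console.log(JSON.stringify(collection, null, 2));
```

Any GeoJSON-aware viewer should be able to render the printed object as a single marker.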

Page 9: Embulk makes Japan visible

Embulk ☓ GeoJSON

https://github.com/benbalter/dc-wifi-social/blob/master/bars.geojson

Page 10: Embulk makes Japan visible

Embulk ☓ GeoJSON

• embulk-formatter-geojson (https://rubygems.org/gems/embulk-formatter-geojson)

• Converts any type of source data (CSV, TSV, JSON, MessagePack, etc.) supported by an input plugin into GeoJSON format.

$ embulk new ruby-formatter …

Page 11: Embulk makes Japan visible

Embulk ☓ GeoJSON

id,name,population,…
1,Tokyo,1000,…
2,Osaka,800,…

template.geojson

{
  "id": 1,
  "properties": {
    "name": "Tokyo",
    "population": 1000
  },
  "geometry": <From template.geojson>
}
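The merge shown on this slide can be sketched as follows. This is not the actual embulk-formatter-geojson implementation (the plugin itself is written in Ruby); it is a plain JavaScript illustration that assumes the template is a FeatureCollection whose features carry an `id` matching the identifier column. The function names `buildGeometryIndex` and `recordToFeature` and the geometry values are hypothetical.

```javascript
// Sketch only: not the actual embulk-formatter-geojson implementation.
// Assumption: the template is a FeatureCollection whose features each carry
// an `id` that matches the identifier column of the input records.
function buildGeometryIndex(template) {
  const index = new Map();
  for (const feature of template.features) {
    index.set(String(feature.id), feature.geometry);
  }
  return index;
}

// Turns one input record into a feature: the identifier becomes `id`,
// every other column becomes a property, and the geometry is looked up
// in the template (null when the template has no matching id).
function recordToFeature(record, geometryIndex, identifier) {
  const properties = {};
  for (const [key, value] of Object.entries(record)) {
    if (key !== identifier) properties[key] = value;
  }
  const id = record[identifier];
  return {
    type: "Feature",
    id: id,
    properties: properties,
    geometry: geometryIndex.get(String(id)) ?? null,
  };
}

// Illustrative data; the geometry values are made up for this example.
const template = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", id: 1, properties: {}, geometry: { type: "Point", coordinates: [139.69, 35.68] } },
    { type: "Feature", id: 2, properties: {}, geometry: { type: "Point", coordinates: [135.5, 34.69] } },
  ],
};
const records = [
  { id: 1, name: "Tokyo", population: 1000 },
  { id: 2, name: "Osaka", population: 800 },
];

const geometryIndex = buildGeometryIndex(template);
const features = records.map((r) => recordToFeature(r, geometryIndex, "id"));
console.log(JSON.stringify({ type: "FeatureCollection", features }, null, 2));
```

Indexing the template once keeps the per-record lookup constant-time, which matters when the input has many rows per region.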

Page 12: Embulk makes Japan visible

embulk-formatter-geojson

$ embulk gem install embulk-formatter-geojson
$ cat config.yml
…
out:
  type: file
  formatter:
    type: geojson
    template_file: /path/to/template.geojson
    identifier: "id"
…
$ embulk run config.yml

Page 13: Embulk makes Japan visible

DATA.GO.JP

http://www.data.go.jp/

Page 14: Embulk makes Japan visible

DEMO

http://www.lewuathe.com/opendata/

Page 15: Embulk makes Japan visible

d3.json(url, function(error, geoJp) {
  svg.selectAll("path")
    .data(geoJp.features)
    .enter().append("path")
    .on("mouseover", function(d) {
      $("#description").text(d.properties["name"]);
    })
    .attr("class", function(d) { return d.id; })
    .attr("d", geopath)
    .attr("fill", function(d) {
      var prop = d.properties["population"];
      return colors[prop];
    });
});

• d3.js (https://d3js.org/)

Page 16: Embulk makes Japan visible

d3.json(url, function(error, geoJp) {
  svg.selectAll("path")
    .data(geoJp.features)
    .enter().append("path")
    .on("mouseover", function(d) {
      // embedded property: name
      $("#description").text(d.properties["name"]);
    })
    .attr("class", function(d) { return d.id; })
    .attr("d", geopath)
    .attr("fill", function(d) {
      // embedded property: population
      var prop = d.properties["population"];
      return colors[prop];
    });
});

• d3.js (https://d3js.org/)

Embedded Properties
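The slides do not show how `colors` is built. One possible approach, sketched here in plain JavaScript under stated assumptions, is to bucket the embedded population value into a small number of classes and pick a color per class. The helper `populationToColor`, the class bounds, and the palette are all hypothetical and are not taken from the demo.

```javascript
// Hypothetical helper: returns the color of the first class whose upper
// bound is >= value; values above every bound get the last color.
// The palette must have exactly one more entry than upperBounds.
function populationToColor(value, upperBounds, palette) {
  for (let i = 0; i < upperBounds.length; i++) {
    if (value <= upperBounds[i]) return palette[i];
  }
  return palette[palette.length - 1];
}

// Illustrative class bounds and palette (not taken from the demo).
const upperBounds = [500, 900];
const palette = ["#deebf7", "#9ecae1", "#3182bd"];

console.log(populationToColor(800, upperBounds, palette));
console.log(populationToColor(1000, upperBounds, palette));
```

Inside the d3 callback, the fill function could then return `populationToColor(d.properties["population"], upperBounds, palette)` instead of indexing a lookup table directly.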

Page 17: Embulk makes Japan visible

Conclusion

• Embulk can be yet another format converter

• GeoJSON as a container including data and topology

• DATA.GO.JP provides various types of open data

Page 18: Embulk makes Japan visible

Thank you!