Open Data
● Discoverable
– Available and Searchable on the Internet.
● Structured
– Open and Machine-readable Format.
● Unconditional
– Legal Framework allows the data to be reproduced and repurposed.
Open Source
● Software Development Model
● Free Software (1985)
– Free = Freedom
– Run the program (Freedom 0)
– Study the source code and change it (Freedom 1)
– Redistribute copies (Freedom 2)
– Distribute copies of your modified versions (Freedom 3)
● Open Source (1998)
Open Source Web Application Software Stack
● LAMP
– Linux (1991): Operating System
– Apache (1995): Web Server
– MySQL (1995): Database Server
– PHP (1995): Server-side Scripting Language
● Other Alternatives:
– LNMP: Replacing Apache with Nginx
– Another M of LAMP: MariaDB, MongoDB
Python
● Programming Language
– Since 1991
– Widely used, general-purpose
– High-level
– Open Source
● Another P of LAMP
My Open Data related Projects
● TV Timetable of Live Football Matches (2004)
● Weather Information (2006)
● Public Transportation Information (2006)
● LegCo Vote Information (2013)
● Air Quality Information (2014)
● Restaurant Information (2014)
TCTrack
● Plots a map of typhoon paths as reported by different observation agencies
● Google Map API
– First Typhoon Map in HK using Google API
– Sammy.HK TCTrack → Weather Underground → Hong Kong Observatory
● Twitter API
– Posting typhoon updates, starting from any potential formation of a tropical cyclone in the Northwest Pacific Ocean.
● Data Sources: HKO, JTWC.
Licensed Restaurants in Hong Kong
● Open Data from Data.One PSI
● Open Source Software Tools
– Python
– Scrapy Web Scraping Framework
● Source code is released on GitHub
– https://github.com/sammyfung/LP_Restaurants_Scrapy
Creating the environment for a Scrapy project
● Requirements
– Python, Python-Dev, virtualenv, pip
● Creating a virtual environment for a Python project
– virtualenv ~/env
– source ~/env/bin/activate
– pip install scrapy
Creating a Scrapy project
● Creating a new Scrapy project with spider
– scrapy startproject LP_Restaurants_Scrapy
– cd LP_Restaurants_Scrapy
– scrapy genspider rlxml fehd.gov.hk
● Creating a scrapy data model
● Doing some tests with scrapy shell.
– scrapy shell <URL>
– http://www.fehd.gov.hk/english/licensing/license/text/LP_Restaurants_EN.XML
● Writing the parse function of a scrapy spider.
● Try and test the spider
– scrapy crawl rlxml -t json -o restaurant_licenses.json
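The data model and parse steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the GitHub repository: it uses the standard library's xml.etree.ElementTree instead of Scrapy selectors so that it runs without a Scrapy install. The tag names (LP, LICNO, NAME, ADDRESS, DIST), the field names, and the sample document are all hypothetical placeholders; the real ones have to be read from the XML file itself, for example in scrapy shell.

```python
import xml.etree.ElementTree as ET

# Hypothetical field-to-tag mapping (the "data model"). The real tag names
# must be taken from the actual XML file; these are placeholders.
FIELD_TAGS = {
    "licence_no": "LICNO",
    "name": "NAME",
    "address": "ADDRESS",
    "district": "DIST",
}


def parse_licences(xml_text, record_tag="LP"):
    """Yield one dict per licence record found in the XML document."""
    root = ET.fromstring(xml_text)
    for record in root.iter(record_tag):
        item = {}
        for field, tag in FIELD_TAGS.items():
            # findtext returns None when the child element is missing
            value = record.findtext(tag)
            item[field] = value.strip() if value else None
        yield item


# Made-up sample document, used only to exercise the function.
SAMPLE = """<DATA>
  <LP><LICNO>0000000001</LICNO><NAME> Example Cafe </NAME>
      <ADDRESS>1 Example Road</ADDRESS><DIST>11</DIST></LP>
  <LP><LICNO>0000000002</LICNO><NAME>Sample Diner</NAME><DIST>12</DIST></LP>
</DATA>"""

if __name__ == "__main__":
    for licence in parse_licences(SAMPLE):
        print(licence)
```

In a Scrapy spider the same idea would typically be written with the fields declared on a scrapy.Item subclass and the loop placed in the spider's parse method, using response.xpath(...) to select the records and yielding one item per record.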