19
Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional Planning

Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Data Science: Challenges and Opportunities for State Policymaking

Karen ChappleProfessor and Chair,Department of City & Regional Planning

Page 2: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

What is big data and analytics?

Presenter
Presentation Notes
Found data vs designed data: data science uses found data rather than designed data collected via a subjective framework Volume Where it’s from (users/sensors/cell phones/transactional data exhaust/crowdsourced/etc.) How it is analyzed- machine learning algorithms rather than traditional regression models, these are less biased because the variables are not hand selected
Page 3: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Data Science in the Private Sector

Presenter
Presentation Notes
Why is it more common in the private sector vs public sector? The public sector faces higher ethical (must not discriminate against a class of people) and economic standards (constrained budgets for hiring and innovation) Innovation happens through a democratic and/or bureaucratic process which can be slower Lack of qualified practitioners/people trained in ML, big data analytics Lack of infrastructure to collect and support large volumes of data outside of certain public agencies
Page 4: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

The Risks of Outsourcing

Presenter
Presentation Notes
Privacy and Security risks- how is the data being used and stored? Sustainability issues- critical services people depend on can be disrupted if they become unprofitable Equity: the private sector won’t necessarily suit the needs of marginalized populations
Page 5: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

What should data science applications in the public sector look like?

How can the UC system support California government

with data science?

Key Questions

Presenter
Presentation Notes
Fix this up
Page 6: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

The State of Data

Presenter
Presentation Notes
As part of an Open Data policy launched in March 2019, the state of California maintains an open data portal (ca.data.gov) at the state level, which hosts nearly 2200 datasets from 42 state departments. Environmental, health and financial data are the most well-represented sources.
Page 7: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

An Uneven Data Landscape

Cities (pop >30,000) without open data portals Cities (pop > 30,000) with open data portals

Presenter
Presentation Notes
Just 17% of California cities and 15 out of 58 counties maintain their own open data portals. The open data landscape is uneven, with far more data available in urban parts of the state. There is a dearth of open data portals available in the central valley region, though some cities in the central valley have a limited amount of public GIS data.
Page 8: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Webtools: big data for a diverse audienceTableau, Microsoft BI and ArcGIS online: user-friendly interactive data visualization

Javascript web tools: customizable for a specific use case and data set(s)

Presenter
Presentation Notes
Tableau, Microsoft BI and ArcGIS online are proprietary (cost money), don’t require much code. Javascript is an open source language and tools can be built freely in most cases
Page 9: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional
Presenter
Presentation Notes
TNCs Today: San Francisco County Transportation Authority visualized data from Uber and Lyft APIs in a Javascript webtool for a granular look at TNC usage in the city. Users can specify time of day, day of the week, and filter by pickups and drop offs and see trip distribution around the city This was a partnership with Northeastern University http://tncstoday.sfcta.org/
Page 10: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional
Presenter
Presentation Notes
Land Use Viewer: California Department Water Resources has an ArcGis online webtool for visualizing county land use and crop datasets for Groundwater Sustainability Agencies and water managers. https://gis.water.ca.gov/app/CADWRLandUseViewer/
Page 11: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

What is Machine Learning?● Umbrella term for processes that derive information from large

datasets using statistical modeling and computing methods● ML models are “trained” on existing data before being used to

interpret new data

Predict a continuous value of an attribute based on other data and information● OLS regression, neural networks, K-means

regression

Classify a datapoint into one of many discrete categories based on attributes of the datapoint● Support vector machines, random forest, nearest

neighbor clustering

Page 12: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Classification Applications

Presenter
Presentation Notes
Use a clustering algorithm called K nearest neighbors to derive organic but meaningful spatial categorizations instead of the artificially zip code, neighborhood, census tract categories Data: real estate transaction data from LA county Will add more to description
Page 13: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Regression Applications

Page 14: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Citizen Science

the collection and usage of data that is crowdsourced by members of the general public

More data, Less time and money

Presenter
Presentation Notes
Will add more to description
Page 15: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional
Presenter
Presentation Notes
Researchers at UC Berkeley’s Safe Transportation Research and Education Center created a project called Street Story, which provides an interface for users to record information about collisions, near misses, and other transportation safety hazards This information is publicly accessible and visualized in a clear and effective way The team works with other community groups to incorporate the tool into existing policies and programs https://safetrec.berkeley.edu/tools/street-story-platform-community-engagement
Page 16: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

API: Application Programming Interface

Presenter
Presentation Notes
Will add more to description Browsers and applications can send requests through an API to another server (ie: the google maps server!) and receive data back to manipulate or display
Page 17: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

What could the state do? APR example- big data - analytics- API - citizen science

Coming in Spring 2020 at UCB: Data Science for Housing – A Professional Training Course

Page 18: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Strategic Plan for Data – and Data Science –what about in California?

Page 19: Data Science: Challenges and Opportunities for …Data Science: Challenges and Opportunities for State Policymaking Karen Chapple Professor and Chair, Department of City & Regional

Data Science in the UC system