Supporting Big Data, Open Data, Data Analytics and Data Science

Dr Simon Price, Research IT Manager



• Bristol is a research-intensive university

• 6 Faculties: Social Science & Law, Science, Engineering, Arts and two Medical Faculties

• Employs 2000+ researchers (excluding PhDs)

• Each year (approximately):
  • 1500 research funding applications
  • £100M research income
  • 4500 research outputs


Outline

1. Big Data
2. Open Data
3. Data Analytics
4. Data Science
5. Implications for IT support


Big Data


• Lots and lots of technology buzzwords!
• Some important ones:
  • MapReduce
  • The Hadoop stack
  • Distributed file systems
  • Query languages & programming languages
  • NoSQL databases (columns, document, graph, ...)
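
As a rough illustration of the MapReduce buzzword, here is a minimal single-machine sketch of the map, shuffle and reduce phases applied to word counting. It is not Hadoop code: the function names and the sample documents are made up for illustration, and a real cluster would distribute each phase across many nodes.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    return reduce_phase(shuffle(map_phase(documents)))


# Hypothetical input, invented for this example.
docs = ["big data open data", "data science"]
counts = word_count(docs)
print(counts)
```

The point of the pattern is that the map and reduce steps are independent per record and per key, which is what allows frameworks in the Hadoop stack to run them in parallel over a distributed file system.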


Big Data

• Trends in Hadoop stack
  • Near real-time analytics
  • Streaming analytics
  • In-memory
• Trends in NoSQL
  • Relational and NoSQL moving closer together


Open Data


Open Data - data.bris

• Each PI allocated 5TB "forever"
• Research Data Management
• Open Data Publication


Open Data - public data


Bristol is Open - datasets

• 140+ datasets live on opendata.bristol.gov.uk
• Some real-time data
• Transport API repository now available
• Examples:
  • Government: Elections since 2007
  • Community: Quality of Life survey
  • Education: School Results
  • Energy: Installed PV, Energy Use in Council Buildings
  • Environment: Real-time & Historic Air Quality, Flood Alerts (EA)
  • Land use: 2013 Planning applications
  • Health: Life expectancy / Mortality, Obesity, NHS Spend


Data Analytics

• Operational focus
  • variables are "known knowns and known unknowns"
• Descriptive
  • summarisation of known variables and alerting
• Predictive
  • correlations between known variables
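
To make the descriptive and predictive bullets concrete, the sketch below summarises one known variable with a simple alert threshold, then measures the correlation between two known variables. The variable names, readings and threshold are all invented for illustration; they are not taken from any Bristol dataset.

```python
from math import sqrt
from statistics import mean


def summarise(values, alert_above):
    """Descriptive: summarise a known variable and flag values over a threshold."""
    return {
        "mean": mean(values),
        "max": max(values),
        "alerts": [v for v in values if v > alert_above],
    }


def pearson(xs, ys):
    """Predictive building block: Pearson correlation between two known variables."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical readings, made up for this example.
traffic = [10, 20, 30, 40, 50]
pollution = [12, 22, 29, 41, 52]

summary = summarise(pollution, alert_above=40)
r = pearson(traffic, pollution)
print(summary, round(r, 3))
```

A strong correlation like this supports prediction from known variables but says nothing about causation.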


Data Science

• Multidisciplinary data-intensive research
• Focus on research insights, causation and prediction
• Usually involves Machine Learning and Statistics
• Different perspectives:
  • Computer Scientists view DS as a research domain
  • Statisticians view DS as a research domain
  • Other academics view DS as a service


Implications for IT support

• Governance
  • Shift from IT-owned to academic-owned (Shadow IT)
• Skills
  • IT experts need to train and trust academics
  • Nurture internal skills pipeline (interns, postgrads)
• Systems
  • Mixed economy of internal and external