Atlas of Living Australia: A state-of-the-art online information platform
Michael Hope
The ALA is made possible by contributions from its many partners. It receives
support from the Australian Government through the National Collaborative
Research Infrastructure Strategy (NCRIS) and is hosted by CSIRO.
What is the ALA?
• One of several facilities funded by the Australian Government for national research infrastructure
• Hosted by CSIRO
• Enables
‒ more effective & efficient scientific research
‒ expanded research opportunities
‒ informing policy & management
‒ community participation & connection
‒ education resource
Australia’s aggregation & access facility for biodiversity data
What is the ALA?
• Regional node of GBIF – comprises ~1/10th of global records in GBIF
• Used for:
‒ Global-scale scientific research.
‒ Informing trends in:
‒ biodiversity status
‒ Planning for & tracking against international biodiversity management targets (Aichi, CBD, IPBES, etc.)
‒ Predictive modelling for:
• trans-national biosecurity and human & animal health issues.
• Biodiversity responses to climate change and other factors.
‒ Much more …
www.gbif.org
Implementation
What is the ALA?
• World-leading biodiversity informatics infrastructure
‒ All FOSS and open data under CC licences
‒ Integrated but modular tools and services
‒ API layer provides enormous flexibility
External consumers
• Web sites • Mobile apps • Databases
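As a rough illustration of how an external consumer (a web site, mobile app or database job) might use the API layer, the sketch below only builds a query URL for an occurrence search; it makes no network call. The base URL and the `q` and `pageSize` parameter names are assumptions that do not come from these slides, so check the current ALA API documentation before relying on them. The function name `build_occurrence_query` is hypothetical.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL of an occurrence search web service; not stated in the slides.
ASSUMED_BASE_URL = "https://biocache-ws.ala.org.au/ws/occurrences/search"


def build_occurrence_query(search_text: str, page_size: int = 10,
                           base_url: str = ASSUMED_BASE_URL) -> str:
    """Return a search URL for occurrence records matching free text.

    The parameter names `q` and `pageSize` are assumed for illustration.
    """
    if not search_text.strip():
        raise ValueError("search_text must not be empty")
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    params = {"q": search_text, "pageSize": page_size}
    # urlencode percent-encodes values and turns spaces into '+'.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_occurrence_query("Phascolarctos cinereus", page_size=5))
```

Keeping URL construction separate from the HTTP call lets the same helper serve a web page, a mobile client or a batch script, which is the kind of flexibility the slide attributes to the API layer.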
One infrastructure - many systems
ALA – open infrastructure - “Powered by ALA”
Web Services
ALA Portal International Instances Domain Instances (Aus)
click on images to go through to site
Hubs
Web Services
Data Repository
ALA - tools for biodiversity
Data Capture: Getting data into the system
Data Discovery: Finding, navigating, & filtering data
Data Visualisation: Viewing data in different ways
Data Analysis: Combining data sets & applying tools & methods to interpret
ENABLING
- researchers
- government
- community
- industry
to:
• engage with biodiversity
• manage biodiversity
• report on biodiversity
• understand impact on biodiversity
click on images to go through to site
ALA Tools – data capture
• ALA Sightings + Mobile
• DigiVol – Crowd Digitisation
• Direct Data Uploads
• ALA Sandbox
• APIs, ODBC, etc.
ALA Tools – data capture
• BioCollect + Mobile: project discovery + general field data capture cloud hosting service
• MERIT: Aus. govt sponsored environmental interventions
• A tool for the public to discover and connect with projects
• A tool for project owners to communicate their project to the world
• A free ALA hosted universal web app for recording field data (structured & unstructured)
• Supports both:
– Survey/event based data collection
– Activity/schedule based project structures
• Hubs for:
– Citizen Science
– Ecological Science
– NRM Works
– Custom ….
www.ala.org.au/biocollect
ALA Tools – data capture
• Profiles Tool: species profile content authoring & management
• Seasonal Calendars: recording cultural & phenological perspectives
ALA Stakeholders
Ecological surveys
Govt & NGO NRM Projects
Citizen science
www.ala.org.au/biocollect
Collections
ALA Tools – data discovery
• by collection
• by data set
• by species
• by location/region
ALA Tools - navigating the data
ALA Tools - biodiversity information
ALA Tools – data visualisation
• active charts
• image galleries
• lists
• spatial views
ALA Tools – Spatial Portal
• scatterplot analysis
• point data over spatial layers
• environmental niche modelling
• more analytical tools…
ALA October 2012
ALA Tools – Phylolink
Phylolink
ALA Tools - HPC Modelling with ALA Data
• ALA Spatial Portal – modelling & analysis tools
• ALA4R
• Teaming up with virtual labs for high performance modelling
ZoaTrack – telemetry data tool
Key Lessons - as an aggregator
• Support the interests of primary data sources & custodians
– recognising the source properly and displaying licensing and attribution prominently
– providing feedback on data usage
– facilitating feedback of annotations and additions
– minimising technical and administrative burden
– showing understanding and patience regarding their issues and constraints
– providing clear and understandable advice on data sharing issues (especially licensing)
Key Approach - as an aggregator
• Provide additional capabilities on the aggregated data
– building trust and understanding in the data
• not vetting, but including metadata, quality checks, etc.
• users being able to determine what is fit for purpose
– transparent, traceable and complete data
• logging changes, original records, use standards
– provide interfaces tailored to different communities
– allow simple ad hoc uploads & downloads in most usable formats
– provide tools that demonstrate significant value add
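The "transparent, traceable and complete data" approach above can be sketched as a record that keeps the original values untouched, logs every change, and carries quality flags so that users decide what is fit for purpose. This is a minimal illustration only, not the ALA's actual data model; the class `TraceableRecord`, the function `fit_for_purpose` and the flag names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class TraceableRecord:
    """Hypothetical aggregated record: original values are never overwritten."""
    original: Dict[str, str]
    processed: Dict[str, str] = field(default_factory=dict)
    # Each log entry: (field name, old value, new value, reason)
    change_log: List[Tuple[str, str, str, str]] = field(default_factory=list)
    quality_flags: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start the processed view as a copy of the provider's original values.
        if not self.processed:
            self.processed = dict(self.original)

    def amend(self, name: str, new_value: str, reason: str) -> None:
        """Change a processed value and log what changed and why."""
        old_value = self.processed.get(name, "")
        self.processed[name] = new_value
        self.change_log.append((name, old_value, new_value, reason))

    def flag(self, flag_name: str) -> None:
        """Attach a quality flag; flagged records are kept, not rejected."""
        if flag_name not in self.quality_flags:
            self.quality_flags.append(flag_name)


def fit_for_purpose(record: TraceableRecord, unacceptable: Set[str]) -> bool:
    """The user, not the aggregator, chooses which flags rule a record out."""
    return not (set(record.quality_flags) & unacceptable)


if __name__ == "__main__":
    rec = TraceableRecord(original={"scientificName": "Phascolarctos cinerus"})
    rec.amend("scientificName", "Phascolarctos cinereus", "name matched to taxonomy")
    rec.flag("COORDINATES_MISSING")
    print(rec.original["scientificName"], "->", rec.processed["scientificName"])
    print(fit_for_purpose(rec, {"COORDINATES_MISSING"}))
```

A mapping project might exclude records flagged for missing coordinates while a taxonomic checklist accepts them; the same record serves both because nothing is discarded at aggregation time.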
Challenges
• Data quality
– Annotation tools, feedback
– Fitness for use
• Data not being shared
• Adding new classes of data to support environmental monitoring and reporting
• Making it all fit together
– Linkages with other types of data
• genomic, environmental etc
– Linkages with other initiatives (global, national and state)
• Data citation, DOIs
Challenges
• Stakeholder / partner relationships
– Engagement / Retention
– Understanding each other’s roles and responsibilities
– Data ownership, custodianship, curation
– Data storage
– Legacy systems & IT cuts
– Specific “in-house” needs around data management
May all your problems be technical
Benefits of ALA infrastructure
• 100% open source technology stack: no licensing costs
• Component-based: relatively easy maintenance with less impact on other components; allows assembly in different combinations for different purposes or communities
• APIs can be exposed: allows external parties to access data & application services
• Ansible-script-based deployment: relatively easy and rapid installation
• Virtualised environment: more flexible maintenance & performance management
• Standards-based: more seamless data interchange
• “Hubs”: allow communities to have their own version without having a separate installation
Summary
• The ALA is world-leading biodiversity informatics e-infrastructure.
• Provides capabilities right across the environmental information supply chain.
• ALA is strongly connected at all levels of community – local, regional, state, national and global.
• ALA is an exemplar for open infrastructure, open data and data re-use.
• The ALA has significantly contributed to the volume, quality and accessibility of Australia’s biodiversity data.
• The ALA is enabling new areas of research and significantly improved efficiencies in data access.
• ALA’s partnerships with NCRIS, the CSIRO and its many contributors are critical to its success.
> AUD $50 million investment
Partners – founding & beyond
National Research infrastructure
Open source & open access
A world-leading collaborative e-infrastructure integral to advancing biodiversity knowledge
• 70+ million records
• 5,238 data sets
• 477 spatial layers
• 10+ billion records downloaded from over 600k events
• 1,200+ scientific publications (known)
Data types
• specimens
• occurrence
• images, sounds
• literature
• sequences
• more ……
System
• data capture & aggregation
• data management
• data discovery
• data visualisation
• data analysis & reporting
ALA – sharing biodiversity knowledge