Open sharing and maintenance of scientific code

Jordan S Read; Luke A Winslow
2013-08-20

Background

• Who I am
  – USGS-CIDA
  – 2012 PhD in physical limnology (UW-Madison)
  – Civil Engineer

• My experience with code and model development
  – Lake Analyzer
  – CLM
  – rGDP; rGLM
  – Numerous collaborations

Background

My philosophy on science code:
“Code created for the pursuit of science questions should be open, accessible, and designed to enable others to build from”

• Kind of like your scientific publications, right?
• That means I shouldn’t be able to build my scientific livelihood around a piece of “black-box” code

Background

• My responsibility as a member of the science community:
  “Methods used to obtain published results should be clear, transparent and repeatable”

• My responsibility as a federal employee:
  “Provide public access to all elements of publicly funded research”

Road map

Part I
• My experiences with science code development
• Motivation to open up your scientific code

Part II
• Maintaining and modifying code
• Code collaboration

Lake Analyzer

• GLEON background
  – Hanson & Hamilton collaboration and student exchange
  – Physics & Climate working group

• Requirements
  – Easy to use
  – Provide access to complex physical derivatives
  – Handle dataset irregularities
    • Errors, gaps, intermittent sampling frequencies, etc.
  – Rapid processing of large datasets

Lake Analyzer

• I took on the role of primary coder
  – Why? GLEON had paid my travel to two meetings…including NZ!
• I did the work in MATLAB, because that is what I was most familiar with
• Side project during grad school
• Built from feedback from GLEON physics & climate group

Lake Analyzer

• Repeatable – .lke file ~ metadata

• Visualizations (plotting options for outputs)

• Easy to use

Lake Analyzer

• Software publication

• Open codebase

• Platform/language independence

• Useful and citable

19 citations in ~20 months

Opening up scientific code

• Publishing your code
  – Would a simple paper of physical derivations be cited at this rate?
  – Would a methods paper be as popular if the code wasn’t available/open?
  – Additional motivation for creation of code

• Writing open code
  – More use
  – Ease of collaboration
  – Integrity/transparency

Opening up scientific code

• Reasons many choose not to open code
  – Too much work
  – Code is too messy
  – Potential for criticism
  – Code as scientific livelihood
  – Has known errors…
  – Others?

Opening up scientific code

• When to put in the effort
  – Collaborations
  – When you are doing it “right”
  – When you will use it in the future
  – When you are publishing something
  – When you have to
  – Others?

Part II: Maintaining code

So…the code works, what’s next?
• How do I take risks with code?
  – i.e., changing the way a function works
  – What if I make a mistake? (undo+undo+undo…?)

• How do multiple people collaborate on a single set of scripts?
  – In serial?
  – Google Docs vs Word for writing a paper

Maintaining code

• Risky modifications
  – Metabolism_modelv28.R?
  – Metabolism_model_NEW.R?
  – Metabolism_model_NEWsecondTRY.R?
  – Metabolism_model_NEWEST.R?

Maintaining code

• When we publish, we use track changes
  – Can we do the same for code?

• Version management
  – AKA: version control, revision control, source control
  – How it works
  – Why you should know what it means
  – Benefits to using version management
    • Historical record of code evolution
    • Easy to “roll back” to previous working version
    • The code has only one home

Maintaining code

How it works
  – Creates a “life history of code”

“Hey, nice sweater.”
“Thanks. I travel a lot. Want to start a project?”
“Sure! I have some modeling code.”
“So do I! Let’s combine our efforts.”

Maintaining code

Revisions 1 2: “Here is a new set of methods”
Revisions 1 2 3: “I made some improvements”
Revisions 1 2 3 4: “Whoops! Fixed a bug”
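The numbered revisions above can be sketched in a few lines of Python. This is only a toy model of the "life history of code" idea, not how any real version-management tool stores its data. The names `Revision`, `History`, `commit`, `checkout`, `roll_back`, and `log` are hypothetical, the code strings are made up, and the message attached to revision 1 is an assumption, since the slides only give messages for the later revisions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Revision:
    number: int    # 1, 2, 3, ... in the order the revisions were saved
    message: str   # short note describing what changed
    code: str      # full snapshot of the script at this revision


@dataclass
class History:
    """Linear 'life history' of one script, oldest revision first."""
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, code: str, message: str) -> Revision:
        """Save a new snapshot on top of the existing history."""
        revision = Revision(len(self.revisions) + 1, message, code)
        self.revisions.append(revision)
        return revision

    def checkout(self, number: int) -> str:
        """Return the code exactly as it was at revision `number`."""
        if not 1 <= number <= len(self.revisions):
            raise ValueError(f"no revision {number}")
        return self.revisions[number - 1].code

    def roll_back(self, number: int) -> Revision:
        """Restore an old snapshot as a new revision; nothing is erased."""
        return self.commit(self.checkout(number), f"Roll back to revision {number}")

    def log(self) -> List[str]:
        """One line per revision, oldest first."""
        return [f"{r.number}: {r.message}" for r in self.revisions]


# Replay the four revisions from the slides (the code strings are made up).
history = History()
history.commit("k = 0.5\n", "Combine our modeling code")  # assumed message
history.commit("k = 0.5\nuse_new_methods = True\n", "Here is a new set of methods")
history.commit("k = 0.7\nuse_new_methods = True\n", "I made some improvements")
history.commit("k = 0.6\nuse_new_methods = True\n", "Whoops! Fixed a bug")

# Undo a risky change by returning to a version known to work.
history.roll_back(2)
print("\n".join(history.log()))
```

In this sketch a roll back is saved as a new revision instead of deleting revisions 3 and 4, so the historical record stays complete. One file with one history replaces the Metabolism_model_NEWEST.R style of copies. Real tools add authorship, timestamps, and ways to merge work from several people.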

Conclusions

• Code as if it will be seen and used by others
  – You may be that “other” in 3 years

• Decide if creating publicly usable code makes sense for your research

• Make your code accessible to collaborators
• Consider the concepts embedded in version management

Jordan S Read
USGS Center for Integrated Data Analytics
608-821-3922 | [email protected]

Questions?

Thanks GLEON FP & TLS!