Developing an open source community for cloud bioinformatics

Brad Chapman
http://bcbio.wordpress.com/

8 June 2010

Overview

1 Building open source bioinformatics communities is hard.
2 Developer resources are a productive target.
3 Framework: collaborative software images and data snapshots.

Motivation

Open source
OpenBio, Biopython
Graduate school – developed distributed algorithm. Never reused.

Work
Startup: Automated biological pipelines.
Research hospital: Democratization of analysis.

Filters in biological computing

Working in same biological area

Interest in developing open source code

Technical abilities

Your software is good enough

Successful bioinformatics

Sean Eddy, HMMER

...the best software in the field is often an unplanned labor of love from a single investigator.

http://selab.janelia.org/people/eddys/blog/?p=313

Recognizing contributions

Successful community projects

OpenBio: BioPerl, Biopython, BioJava

Bioconductor

Common theme: aimed at developers.

Biologists benefit indirectly.

Lowering activation energy

Establishing common platform

= The solution to all our problems

Remove install and distribution barriers

Building block for scaling

Existing cloud bioinformatics work

JCVI Cloud BioLinux

bioperl-max

MachetEC2

Debian Med

Overlapping set of useful functionality.

Integrated community solution

Inclusive but configurable

Easy to contribute

Automated

Bootstrap bare machine to fully ready distributed AMI.

http://github.com/chapmanb/bcbb/tree/master/ec2/biolinux/

Inclusive but configurable

# Top level YAML configuration file specifying
# groups of programs to be installed.
packages:
  - python
  - r
  - erlang
  - databases
  - viz
  - bio_search
  - bio_alignment
  - bio_nextgen
  - bio_sequencing
  - bio_visualization
  - phylogeny

libraries:
  - r-libs
  - python-libs
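A minimal sketch of how an installer might read a file like this and decide what to install. To stay dependency-free it uses a tiny hand-written parser that only understands the flat `key:` / `- item` layout shown on this slide; a real script would more likely use a YAML library. The names `parse_simple_config` and `EXAMPLE` are hypothetical and not taken from the bcbb repository.

```python
def parse_simple_config(text):
    """Parse 'key:' lines followed by '- item' lines into a dict of lists.

    Illustration only: handles just the flat layout shown above,
    not general YAML.
    """
    config = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("- "):
            if current is None:
                raise ValueError("list item before any key: %r" % raw)
            config[current].append(line[2:].strip())
        elif line.endswith(":"):
            current = line[:-1].strip()
            config[current] = []
        else:
            raise ValueError("unsupported line: %r" % raw)
    return config


# Shortened, hypothetical copy of the configuration for demonstration.
EXAMPLE = """
# groups of programs to be installed
packages:
  - python
  - bio_alignment
libraries:
  - r-libs
"""

if __name__ == "__main__":
    cfg = parse_simple_config(EXAMPLE)
    print(cfg["packages"], cfg["libraries"])
```

Turning a group on or off is then a one-line change in the configuration file, which is what makes the image inclusive by default but still configurable.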

Easy to contribute

# Configuration file defining R specific libraries that
# are installed via CRAN and Bioconductor.
cranrepo: http://software.rc.fas.harvard.edu/mirrors/R/
cran:
  - ggplot2
  - rjson
  - sqldf
  - NMF
  - ape

biocrepo: http://bioconductor.org/biocLite.R
bioc:
  - ShortRead
  - BSgenome
  - edgeR
  - GOstats
  - biomaRt
  - Rsamtools
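As a rough illustration of how this configuration could drive an install, the sketch below turns the parsed values into R source text. It assumes the config has already been loaded into a dict with the four keys shown above, and that Bioconductor packages are installed by sourcing the `biocLite.R` script named in `biocrepo` and calling `biocLite()`. The function name `build_r_install_script` and the shortened `R_CONFIG` are hypothetical; the actual bcbb code may work differently.

```python
def build_r_install_script(config):
    """Return R source that installs the configured CRAN and Bioconductor packages."""

    def r_vector(names):
        # Render a Python list as an R character vector: c("a", "b")
        return "c(%s)" % ", ".join('"%s"' % name for name in names)

    lines = []
    if config.get("cran"):
        lines.append('install.packages(%s, repos="%s")'
                     % (r_vector(config["cran"]), config["cranrepo"]))
    if config.get("bioc"):
        lines.append('source("%s")' % config["biocrepo"])
        lines.append("biocLite(%s)" % r_vector(config["bioc"]))
    return "\n".join(lines)


# Shortened, hypothetical copy of the configuration for demonstration.
R_CONFIG = {
    "cranrepo": "http://software.rc.fas.harvard.edu/mirrors/R/",
    "cran": ["ggplot2", "rjson"],
    "biocrepo": "http://bioconductor.org/biocLite.R",
    "bioc": ["ShortRead", "edgeR"],
}

if __name__ == "__main__":
    print(build_r_install_script(R_CONFIG))
```

With this split, a contributor adds a package by appending one line to the list and never has to touch the install code.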

Automated

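from fabric.api import sudo

# Helper functions called below are not shown on this slide.
def install_biolinux():
    ec2_ubuntu_environment()
    # Read the top level configuration, then install system
    # packages followed by language-specific libraries.
    pkg_install, lib_install = _read_main_config()
    _apt_packages(pkg_install)
    _do_library_installs(lib_install)

def _ruby_library_installer(config):
    # Install each Ruby gem named in the configuration.
    for gem in config['gems']:
        sudo("gem install %s" % gem)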

Fabric: http://docs.fabfile.org/
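The sketch below shows one way the same pattern can be exercised without a remote machine: commands are built as plain strings and handed to an injectable `run` callable, so Fabric's `sudo` can be passed in for a real host while a list's `append` works for a dry run. All function names and the example package and gem names here are hypothetical, chosen for illustration rather than copied from the bcbb fabfile.

```python
def apt_install_command(packages):
    """Build a single non-interactive apt-get command for all packages."""
    return "apt-get -y install %s" % " ".join(packages)


def gem_install_commands(config):
    """Build one 'gem install' command per configured Ruby gem."""
    return ["gem install %s" % gem for gem in config["gems"]]


def install_all(packages, ruby_config, run=print):
    """Send each install command to `run` and return the commands issued.

    Pass a remote executor (for example Fabric's sudo) as `run` to act on
    a real machine; the default just prints the commands.
    """
    commands = []
    if packages:
        commands.append(apt_install_command(packages))
    commands.extend(gem_install_commands(ruby_config))
    for command in commands:
        run(command)
    return commands


if __name__ == "__main__":
    # Dry run with made-up package and gem names.
    install_all(["python-dev", "r-base"], {"gems": ["bio"]})
```

Keeping command construction separate from execution makes the automation easy to test before it is pointed at a bare machine.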

Ready to use biological data

% ls /referenceGenomes/
Athaliana
Celegans
Dmelanogaster
Ecoli
Hsapiens
Mmusculus
Msmegmatis
Mtuberculosis_H37Rv
Paeruginosa_UCBPP-PA14
phiX174
Rnorvegicus
Scerevisiae
Xtropicalis

% ls Hsapiens/hg18
arachne
bowtie
bwa
eland
maq
seq
snps
ucsc

http://github.com/chapmanb/bcbb/blob/master/galaxy/galaxy_fabfile.py
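The listing suggests a layout of organism, then genome build, then one directory per tool. A small sketch of how analysis code could locate data in such a tree follows; it assumes that layout holds throughout, and the function names and the `base_dir` parameter are hypothetical.

```python
import os
import tempfile


def index_path(base_dir, organism, build, tool):
    """Join the layout's components: <base>/<organism>/<build>/<tool>."""
    return os.path.join(base_dir, organism, build, tool)


def available_tools(base_dir, organism, build):
    """Return the sorted tool directories present for one genome build."""
    build_dir = os.path.join(base_dir, organism, build)
    return sorted(
        name for name in os.listdir(build_dir)
        if os.path.isdir(os.path.join(build_dir, name))
    )


if __name__ == "__main__":
    # Build a throwaway copy of the layout so the demo runs anywhere.
    with tempfile.TemporaryDirectory() as base:
        for tool in ("bowtie", "bwa", "seq"):
            os.makedirs(index_path(base, "Hsapiens", "hg18", tool))
        print(available_tools(base, "Hsapiens", "hg18"))
```

A predictable layout like this lets pipelines find an aligner's index from just the organism and build names, which is what makes a shared data snapshot usable without per-site configuration.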

Organization: Codefest 2010

www.open-bio.org/wiki/Codefest_2010
