Electron Microscopy Between OPIC, Oxford and eBIC, Harwell
Robert Esnouf ([email protected]), Head of Research Computing Core, Wellcome Trust Centre for Human Genetics, Old Road Campus, University of Oxford
Campus Network Engineering for Data-Intensive Science, London, 19 October 2016
Overview of talk
The Wellcome Trust Centre for Human Genetics: science & facilities
Why is electron microscopy such hot science?
OPIC and eBIC
Networking challenges of OPIC/eBIC model
The Old Road Campus, University of Oxford
One of Europe’s largest biomedical research campuses
◦ In east Oxford near the John Radcliffe, Churchill, Nuffield Orthopaedic & Warneford Hospitals
◦ First building (HWBGM) opened 1999
◦ Already ~2000 researchers, space to double…
The Wellcome Trust Centre for Human Genetics
A department of the University of Oxford
◦ Founded in 1994 with core support from the WT
◦ Moved to new building in 1999 – the Henry Wellcome Building for Genomic Medicine on the Old Road Campus
The Wellcome Trust Centre for Human Genetics
About 500 researchers
◦ “to advance the understanding of genetically-related conditions through multi-disciplinary research”
◦ Sequencing, statistical genetics, disease-focused research (diabetes, obesity, heart disease, malaria), optical microscopy, MRI, functional genetics, crystallography & electron microscopy (STRUBI & OPIC)
WTCHG Research Computing Core
ResComp Core is squeezed into a tiny room
◦ 4,120 compute cores, 4.2PB raw GPFS storage, 3.9PB other (archive) storage; FDR InfiniBand
◦ 2.2 FTE to manage (me, Jon, and 20% Colin Freeman)
In 2015, the ResComp Core delivered:
◦ Compute to 303 users (150 active) from 32 groups
◦ 2,640 cores of the main cluster delivered 55.5 billion seconds (1,761 years) of CPU time
◦ 27 different users from 12 different research groups each used >20 years of CPU time
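As a rough sanity check of those figures (a derivation of mine, not numbers from the talk), a few lines of Python reproduce the CPU-year total and the implied average utilisation of the 2,640 cores:

```python
# Back-of-envelope check of the 2015 cluster figures quoted above
# (55.5 billion CPU-seconds on 2,640 cores). The utilisation figure
# is my own derived estimate, not a number from the talk.
cpu_seconds = 55.5e9
cores = 2640

seconds_per_year = 365.25 * 24 * 3600
cpu_years = cpu_seconds / seconds_per_year      # total CPU time in years
available = cores * seconds_per_year            # one year's capacity of the cluster
utilisation = cpu_seconds / available           # fraction of capacity actually delivered

print(f"{cpu_years:,.0f} CPU-years delivered")  # ~1,760, consistent with the ~1,761 on the slide
print(f"{utilisation:.0%} average utilisation") # ~67% of 2,640 cores over the year
```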
ResComp hardware environment
Home-devised, but PUE measured at 1.32
Sequencing facilities
WTCHG Oxford Genomics Centre
◦ Illumina systems: HiSeq 2500, HiSeq 4000, MiSeq
◦ IonTorrent, genotyping (Solexa?)
◦ Evaluating long-read technologies (have MinION)
Mixture of WGS, exome sequencing, RNA-seq, single-cell work, custom sequencing
Approximately 1,000 genomes per year
2PB base call files, 100TB BAMs (FASTQ more like 200TB) per year
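For scale, the per-genome footprint implied by those yearly figures (my own arithmetic, not a number quoted in the talk):

```python
# Rough per-genome storage implied by the yearly figures above
# (derived from the slide numbers, not values quoted in the talk).
genomes_per_year = 1000
bam_tb_per_year = 100
fastq_tb_per_year = 200

print(f"~{bam_tb_per_year * 1000 / genomes_per_year:.0f} GB of BAMs per genome")    # ~100 GB
print(f"~{fastq_tb_per_year * 1000 / genomes_per_year:.0f} GB of FASTQ per genome") # ~200 GB
```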
Long-read sequencing technologies
◦ ONT MinION
◦ ONT PromethION
◦ Other systems (PacBio)
Processing ONT long-read sequencing
Processing from MinION readers
◦ Each pore produces a file of 100–100,000 base calls
◦ Many small files produced, modest data volume
What about PromethION (up to 48 flow cells)?
◦ Up to 80GB/hour → 3.8TB in 2 days per flow cell
◦ Average 200kB files → 400,000 files/hour/flow cell
Frightening headline numbers – each 2-day run could generate:
◦ 182.4TB of FAST5 files @ 8.5Gbit/s
◦ 921.6 million 100kB–1MB files
◦ Requires 960 cores to process the data
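Those headline numbers follow directly from the per-flow-cell figures; a short recalculation (mine, not code from the talk) shows the arithmetic:

```python
# Recomputing the PromethION "frightening headline numbers" from the
# per-flow-cell figures above (a sanity check, not code from the talk).
flow_cells = 48
gb_per_hour_per_cell = 80            # up to 80GB/hour per flow cell
files_per_hour_per_cell = 400_000    # ~200kB average file size
run_hours = 48                       # a 2-day run

total_tb = flow_cells * gb_per_hour_per_cell * run_hours / 1000
total_files = flow_cells * files_per_hour_per_cell * run_hours
gbit_per_s = total_tb * 1e12 * 8 / (run_hours * 3600) / 1e9

print(f"{total_tb:.1f} TB of FAST5 files")     # 184.3 TB (~182.4TB on the slide)
print(f"{total_files/1e6:.1f} million files")  # 921.6 million
print(f"{gbit_per_s:.1f} Gbit/s sustained")    # ~8.5 Gbit/s
```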
...but something seismic happened in 2014...
Oxford Particle Imaging Centre (OPIC)
An EM facility unique in Europe
◦ Biosafety containment suite (ACDP3/DEFRA4)
◦ FEI Tecnai Polara EM with Gatan K2 detector
◦ Can be accessed by UK researchers (20% eBIC)
Second EM in normal containment
400fps movies “flattened” to images (>1TB/day)
At least 2.2Å resolution
Single particle structures by EM (RELION)
Images are translucent, like hospital X-rays
Computation- & memory-intense process
◦ Correct for drift and shake, detect particles
◦ Extract particle images (2TB → 50-100GB)
◦ 2D classification of particle projections
◦ 3D classification of particles
◦ 3D refinement of structural model(s)
Particle covered by a pixel box
◦ Cubic dependence of memory on box size
◦ 400-pixel box (picornavirus) requires 300GB memory
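A hedged illustration of that cubic scaling, calibrated so a 400-pixel box comes out at ~300GB (the constant is my own fit to the slide’s figure; real RELION memory use depends on many more parameters):

```python
# Illustration of the cubic memory scaling quoted above, calibrated so
# that a 400-pixel box corresponds to ~300GB (my own fit to the slide's
# figure; actual RELION memory use depends on many other settings).
REF_BOX, REF_GB = 400, 300.0
k = REF_GB / REF_BOX**3   # GB per voxel of the box

def refine_memory_gb(box_pixels: int) -> float:
    """Rough memory estimate (GB) for a refinement with a cubic box."""
    return k * box_pixels**3

for box in (200, 300, 400, 512):
    print(f"{box}-pixel box: ~{refine_memory_gb(box):.0f} GB")
# 200 -> ~38 GB, 300 -> ~127 GB, 400 -> 300 GB, 512 -> ~629 GB
```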
Net result – example from FMDV: EM & X-ray structures (12h data collection for EM)
[Figure: X-ray structure at 2.6Å vs EM structure at 3.3Å, end May 2016]
eBIC: the Electron Bio-Imaging Centre
Diamond Light Source, Harwell Science and Innovation Campus
Data collection statistics [chart: number of unique groups and allocated time]
*14 groups from Cambridge, 10 groups from Oxford, 6 groups from Birkbeck, 5 groups from Imperial, 4 groups from Manchester and Bilbao, 3 groups from Leeds, 2 groups from Edinburgh, the Crick and Dundee, 1 group from Diamond, Warwick, Madrid, Bristol, Leicester, Helsinki, Sheffield, Stockholm, Virginia and SPring-8
Industrialization of EM (installed on the synchrotron hall floor)
◦ Installation: 8/5–5/6 (2015)
◦ External users: Monday 29/6
◦ Publications in Cell & Nat. Comms within the first ½ year
◦ Heavy oversubscription and very high industrial interest
◦ “The democratization of cryo-EM” – Nat. Methods, 2016, Stuart, Subramaniam, Abrescia
The eBIC ‘hall’ in the new building to accommodate 4 Krios microscopes
In addition to Krios I and II, Talos and Scios machines are now operational
Two further Krios microscopes ordered, installation 2016 and 2017.
Recruitment is underway – challenging in the present EM feeding frenzy
User programme is developing; the first BAGs are now in place and being tested as a way of optimising use.
Time-critical, first-pass processing
Microscopes are expensive
◦ ~£2-5M to buy + £1M p.a. to run
Immense shortage of expert staff time
Not all samples give good images
◦ Bad samples, problems with optics
◦ Need to detect quickly
◦ First-pass processing is a CTF (contrast transfer function) estimation
◦ Process ~50GB images, results back in 30s
Collected in Oxford, processed in Harwell!
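A minimal sketch of what such a collect-in-Oxford, process-at-Harwell loop could look like, assuming a hypothetical drop directory, hostname and rsync-based transfer (illustrative only, not the actual OPIC/eBIC pipeline):

```python
# Minimal sketch of a "collected in Oxford, processed in Harwell" loop:
# watch a local drop directory and ship new movie files to a remote
# processing host with rsync. Paths and hostname are hypothetical;
# this illustrates the idea, not the actual OPIC/eBIC pipeline.
import subprocess
import time
from pathlib import Path

DROP_DIR = Path("/data/opic/k2_movies")          # hypothetical local spool from the detector
REMOTE = "ebic-processing:/dls/incoming/opic/"   # hypothetical remote target at Harwell
POLL_SECONDS = 10

seen: set[Path] = set()

while True:
    for movie in sorted(DROP_DIR.glob("*.mrc")):
        if movie in seen:
            continue
        # rsync keeps partial transfers and can be re-run safely
        result = subprocess.run(
            ["rsync", "-a", "--partial", str(movie), REMOTE],
            capture_output=True,
        )
        if result.returncode == 0:
            seen.add(movie)
            print(f"shipped {movie.name} for first-pass CTF processing")
    time.sleep(POLL_SECONDS)
```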
New fibres within WTCHG
[Diagram: new 16F and 8F OM4 fibre runs linking the IT1 and IT2 comms rooms, OPIC comms room, containment lab, open lab, STRUBI servers and cluster room, alongside existing 8F/16F OM3]
New network within WTCHG
[Diagram: 2x and 1x multimode 10G links between the comms rooms, 1x MM 10G per microscope, a firewall, and uplinks to the University network]
The Oxford University Network
[Diagram: WTCHG FroDo connected to the backbone routers (CORC, CMUS, CIND, CROQ) and TVN PoPs; 40Gbit/s backbone links; 10Gbit/s links, 10Gbit/s to .well.ox and 10Gbit/s to .strubi.ox]
Thanks to...
The WTCHG ResComp Core
◦ Jon Diprose and Colin Freeman’s left leg
STRUBI and OPIC microscopy staff
◦ esp. Juha Huiskonen and Abhay Kotecha
Staff at Diamond Light Source and eBIC
◦ David Stuart, Alistair Siebert, …
All of you for your attention!