Get Data to Computation
eudat.eu/b2stage
www.eudat.eu
B2STAGE Installation
How to enable B2STAGE on your site
Version 1.1, August 2016
This work is licensed under the Creative Commons CC-BY 4.0 licence.
Attribution: EUDAT – www.eudat.eu
B2STAGE
B2STAGE is a reliable, efficient, light-weight and easy-to-use service to transfer research data sets between EUDAT storage resources and high-performance computing (HPC) workspaces.
iRODS is nice, but …
[Diagram. Labels: User desktop (web browser, Python, scp, gridFTP); Your site; PID Registry (PID, File); "?"]
Allowing third party transfers
[Diagram. Labels: User desktop (data location or PID); HPC GridFTP server; Your site; PID Registry (PID); links: data, control, control]
Purpose
Deploying B2STAGE allows your users to:
Move large amounts of data between data stores and high-performance compute resources by means of different protocols and API clients
Ingest computational results back into EUDAT
Deposit large data sets into EUDAT resources for long-term preservation
Features:
High-speed transfer
Reliable and light-weight
Data access by PIDs
What exactly will this allow?
[Diagram. Labels: Your site (GridFTP server, iRODS-DSI); User desktop (GridFTP client); HPC GridFTP server; PID Registry (PID); links: data, control, control]
Outline
Prerequisites
Basic deployment and configuration
Additional features
Prerequisites
iRODS v4.1 deployment and configuration, including the Development Tools and Runtime Libraries packages (see http://irods.org/download/)
Globus GridFTP server (globus-gridftp-server-progs) deployment and configuration
Software components deployment:
CMake 2.7 or higher
libglobus-common-dev (.deb) or globus-common-devel (.rpm)
libglobus-gridftp-server-dev (.deb) or globus-gridftp-server-devel (.rpm)
libglobus-gridmap-callout-error-dev (.deb) or globus-gridmap-callout-error-devel (.rpm) (see http://www.ige-project.eu/downloads/software/releases/downloads)
libcurl4-openssl-dev
It is possible to use the official iRODS and GridFTP server packages without recompiling them.
Basic deployment and configuration
Hands-on material
Material on https://github.com/EUDAT-Training/B2SAFE-B2STAGE-Training
Training module which provides hands-on material for EUDAT B2SAFE, iRODS4, B2HANDLE and the EUDAT B2STAGE service.
B2STAGE installation (part 9):
Example installation on Ubuntu
Installation of the iRODS-DSI
Configuring the gridFTP server
Configuring the PID resolution
Giving access to users
B2STAGE Examples - Listing
List data in iRODS with globus-url-copy:
globus-url-copy -list gsiftp://<server>/<irodszone>/home/<user>/
$ globus-url-copy -list gsiftp://eve.eudat-sara.vm.surfsara.nl/eveZone/home/eve/Collection/
globus-url-copy -list gsiftp://<server>/<PID>
where the PID is attached either to a file or to an iRODS collection
$ globus-url-copy -list gsiftp://eve.eudat-sara.vm.surfsara.nl/846/cc83ae10-5e37-11e6-9c19-04040a64004a/
Both commands will list the same folder.
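The two listing forms differ only in what follows the server name, so a small wrapper can build either one. The following is a minimal Python sketch, not part of B2STAGE: the function names gsiftp_url, list_command and run_listing are hypothetical, and run_listing assumes that globus-url-copy is installed and that valid credentials are available.

```python
import subprocess


def gsiftp_url(server: str, path_or_pid: str) -> str:
    """Build a gsiftp:// URL from a server name and an iRODS path or a PID."""
    # Drop leading slashes so server and path are joined by exactly one '/'.
    return f"gsiftp://{server}/{path_or_pid.lstrip('/')}"


def list_command(server: str, path_or_pid: str) -> list:
    """Return the argument list for 'globus-url-copy -list <url>'."""
    return ["globus-url-copy", "-list", gsiftp_url(server, path_or_pid)]


def run_listing(server: str, path_or_pid: str) -> str:
    """Run the listing and return its standard output (needs a working client)."""
    result = subprocess.run(
        list_command(server, path_or_pid),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if globus-url-copy fails
    )
    return result.stdout
```

For example, list_command("eve.eudat-sara.vm.surfsara.nl", "/eveZone/home/eve/Collection/") and list_command("eve.eudat-sara.vm.surfsara.nl", "846/cc83ae10-5e37-11e6-9c19-04040a64004a/") reproduce the two commands shown on this slide.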
B2STAGE Examples - Copy
Copy data from iRODS to another server:
globus-url-copy -r gsiftp://<server>/<irodszone>/home/<user>/ <local Path>
$ globus-url-copy -r gsiftp://eve.eudat-sara.vm.surfsara.nl/eveZone/home/eve/Collection/ /home/eve/getData/
globus-url-copy -r gsiftp://<server>/<PID> <local Path>
$ globus-url-copy -r gsiftp://eve.eudat-sara.vm.surfsara.nl/846/cc83ae10-5e37-11e6-9c19-04040a64004a/ /home/eve/getData/
Both commands will copy the data in Collection to the folder getData on your local machine.
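When scripting such copies, a common slip is to drop the trailing "/" that marks a location as a directory. The sketch below is illustrative only; the helper names ensure_trailing_slash and recursive_copy_command are hypothetical. It assembles the argument list for a recursive copy and, on the assumption that globus-url-copy treats locations ending in "/" as directories (as in the examples above), adds the slash when it is missing.

```python
def ensure_trailing_slash(location: str) -> str:
    """Return the location with exactly the trailing '/' needed for a directory."""
    return location if location.endswith("/") else location + "/"


def recursive_copy_command(source_url: str, destination: str) -> list:
    """Build the argument list for 'globus-url-copy -r <source> <destination>'.

    Note the plain ASCII hyphen in '-r'; a typographic dash pasted from
    slides would not be recognised as an option.
    """
    return [
        "globus-url-copy",
        "-r",
        ensure_trailing_slash(source_url),
        ensure_trailing_slash(destination),
    ]
```

The returned list can be passed to subprocess.run on a machine where the client and credentials are set up.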
Additional configuration
Enable checksum checking by Globus.org
Specify a policy to manage more than one iRODS resource
Handle unknown users (Distinguished Names)
Globus Online Checksums
Enabling the checksum checking offered by Globus.org
Configure iRODS to use MD5 checksums by default (iRODS 4 otherwise defaults to SHA-256).
Edit /etc/irods/server_config.json and set:
"default_hash_scheme": "MD5",
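If the change has to be applied on several servers, it can be scripted rather than edited by hand. The following Python sketch assumes only that server_config.json is a JSON object with a top-level "default_hash_scheme" key, as the setting above suggests; the function name set_default_hash_scheme is hypothetical, and keeping a backup copy of the file before running it is advisable.

```python
import json
from pathlib import Path


def set_default_hash_scheme(config_path: str, scheme: str = "MD5"):
    """Set "default_hash_scheme" in an iRODS server_config.json file.

    Returns the previous value (or None if the key was absent) so the
    caller can log or restore it.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    previous = config.get("default_hash_scheme")
    config["default_hash_scheme"] = scheme
    # Rewrite the whole file; all other settings are preserved unchanged.
    path.write_text(json.dumps(config, indent=4) + "\n")
    return previous
```

Calling set_default_hash_scheme("/etc/irods/server_config.json") would apply the setting shown on this slide.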
Specify a policy to manage more than one iRODS resource
Edit $GLOBUS_LOCATION/etc/gridftp.conf. Set $irodsResourceMap to a file, e.g. one called mapResourcefile:
$irodsResourceMap "path/to/mapResourcefile"
Populate path/to/mapResourcefile with lines that map particular iRODS paths to the iRODS resource to be used, separating the two with ';'. For example, assuming that resc-repl is an alternative iRODS resource:
$ cat path/to/mapResourcefile
/CINECA01/home/cin_staff/rmucci00;resc-repl
/CINECA01/home/cin_staff/mrossi;resc-repl
If none of the listed paths is matched, the iRODS default resource is used.
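To check a map file before deploying it, the lookup can be modelled outside the server. The Python sketch below is an illustration, not the DSI's actual code: the function names are hypothetical, and the matching rule (a listed path matches itself and everything beneath it, with the first matching line winning) is an assumption, since the slides do not spell out how paths are compared.

```python
def parse_resource_map(text: str) -> list:
    """Parse 'irods_path;resource' lines into a list of (path, resource) pairs."""
    mapping = []
    for line in text.splitlines():
        path, _, resource = line.strip().partition(";")
        path, resource = path.strip(), resource.strip()
        if path and resource:  # skip blank or incomplete lines
            mapping.append((path, resource))
    return mapping


def resource_for(irods_path: str, mapping: list, default=None):
    """Return the resource mapped to irods_path, or default if nothing matches.

    ASSUMPTION: prefix matching on whole path components, first match wins.
    """
    for prefix, resource in mapping:
        base = prefix.rstrip("/")
        if irods_path == base or irods_path.startswith(base + "/"):
            return resource
    return default  # stands in for the iRODS default resource
```

With the two example lines above, resource_for("/CINECA01/home/cin_staff/mrossi/run1/out.dat", mapping) yields "resc-repl", while a path under another user's home falls through to the default.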
Handling unmapped users
Users whose distinguished name (DN) is not yet mapped to an iRODS user can be given access automatically.
Configure the DSI to invoke an iRODS server-side command with iexec
The command receives the certificate’s DN (distinguished name)
Edit $GLOBUS_LOCATION/etc/gridftp.conf
Set '$irodsDnCommand' to the name of the command to execute.
E.g., to invoke a script called 'createUser', add:
$irodsDnCommand "createUser"
On the iRODS server, the command should be installed in '$IRODS_HOME/server/bin/cmd/'
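What such a command does is up to the site. As a rough sketch of one possible policy, the Python code below derives an iRODS user name from the CN component of the DN and prepares two iadmin calls: mkuser to create the account and aua to attach the DN to it. Everything site-specific here is an assumption: the naming rule, the user type "rodsuser", and the idea that the DN reaches the script as a plain string (for example as its first argument). The function names are hypothetical.

```python
import re


def username_from_dn(dn: str) -> str:
    """Derive a user name from the CN part of a DN (hypothetical site policy)."""
    match = re.search(r"CN=([^/,]+)", dn)
    if match is None:
        raise ValueError(f"no CN component found in DN: {dn!r}")
    # Lower-case the CN and collapse anything outside [a-z0-9] into '_'.
    return re.sub(r"[^a-z0-9]+", "_", match.group(1).lower()).strip("_")


def iadmin_commands(dn: str) -> list:
    """Return the iadmin invocations that create the user and map the DN."""
    user = username_from_dn(dn)
    return [
        ["iadmin", "mkuser", user, "rodsuser"],  # create the account
        ["iadmin", "aua", user, dn],             # add the DN as an auth name
    ]
```

A real createUser script would pass each list to subprocess.run with appropriate iRODS administrator credentials, and would likely need to handle name collisions and already-existing users.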
For more info: http://eudat.eu/services/b2stage
User documentation: http://eudat.eu/services/userdoc/b2stage
Thank you
www.eudat.eu
Authors / Contributors
Roberto Mucci (CINECA)
Kostas Kavoussanakis (EPCC)
Christine Staiger (SURFsara)
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065