Scientific Linux experience @ INFN/Trieste
  B. Gobbo – Compass
  R. Gomezel - T. Macorini - L. Strizzolo
  INFN - Trieste
  October 18-22, 2004
Outline of Presentation
  Introduction
  Evaluation phase
  SL Test installation
  Installation
  Configuration
  Management
  SL and Compass Computing FARM
Before moving to SL
  When the end of official support from RedHat was announced in 2003, all our PCs were running RedHat Linux 7.3 and 9
  The HEP community was evaluating different roads to follow
  Initially we were waiting for a common answer from the INFN Computing Committee and watching what others were doing
  At our site we tried different distributions based on RedHat Enterprise Linux, such as
    CentOS 3.1
    Fermi Linux
    Scientific Linux 3.0.1
Evaluation of the needs
  As time went by we decided to hold a meeting to discuss how to meet the needs of people involved in different experiments
  It was clear that many distributions were rebuilds of RHEL3 from Red Hat’s source RPMs, stripped of the parts that cannot be distributed without a license
    We needed a distribution well maintained and validated by the HEP community
Scientific Linux – Final test installation
  As soon as Scientific Linux 3.0.1 was available we installed it on a few PC clients to see how users felt about it
    We got good feedback
  So we went on with the installation of SL 3.0.2
  By the time SL 3.0.3 was released we had in production
    10 PC clients (desktop functionality)
    1 PC with server functionality
Scientific Linux – Final test installation
  No problems or complaints from users running their applications and software
  We then decided to upgrade all PCs running SL to the new SL 3.0.3 and to move more PCs to the new release
    Right now we have 30 PC clients and 3 PC servers with SL 3.0.3 installed
Scientific Linux - Installation
  Kickstart is used to install SL on the PCs
    The Kickstart file is generated using buildks, a tool developed at INFN Trieste
    Buildks lets us install and configure Linux PCs automatically, based on a group-specific policy
      A specific network group configuration
      Type of PC: multiboot or singleboot
      ...
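The idea of deriving a Kickstart file from a group policy can be sketched as below. This is not buildks, whose internals are not shown in these slides: the GroupPolicy fields, the render_kickstart function and the mirror URL are hypothetical names chosen for illustration, and the generated file is deliberately incomplete (a real one also needs directives such as rootpw, timezone and bootloader).

```python
from dataclasses import dataclass, field


@dataclass
class GroupPolicy:
    """Hypothetical per-group settings; not the real buildks data model."""
    name: str
    install_url: str          # where the installer fetches the distribution
    multiboot: bool = False   # True: other operating systems share the disk
    packages: list = field(default_factory=lambda: ["@ Base"])


def render_kickstart(policy: GroupPolicy) -> str:
    """Return the text of a minimal, incomplete Kickstart file for one group."""
    lines = [
        f"# Kickstart for group: {policy.name}",
        "install",
        f"url --url {policy.install_url}",
        "lang en_US",
        "keyboard us",
    ]
    # A singleboot PC may wipe the whole disk; a multiboot PC must only
    # remove existing Linux partitions.
    lines.append("clearpart --linux" if policy.multiboot else "clearpart --all")
    lines.append("%packages")
    lines.extend(policy.packages)
    return "\n".join(lines) + "\n"
```

Calling render_kickstart(GroupPolicy(name="desktop", install_url="ftp://mirror.example.org/sl/303/i386/", multiboot=True)) yields a file that uses clearpart --linux, so the only difference between the two PC types in this sketch is how the disk is cleared.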
Scientific Linux - Management
  Installed machines are currently managed via Linux-update, a tool developed locally
  Linux-update allows us to modify in a centralized way
    Configuration files
    Specific installations
    RPM package management
  However, YUM is easier and more powerful for managing RPM packages
Scientific Linux – Configuration (1/2)
  A local repository has been created for the SL 3.0.3 distribution
  Every night a mirror is created via lftp, copying the distribution from ftp://ftp.scientific.linux.org/linux/scientific/303/i386/
  This distribution is locally accessible via anonymous FTP, open only to nodes on the local network
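A nightly mirror of this kind could be driven by a crontab entry that calls lftp. The sketch below only builds the command line and the crontab line; the local directory, the 03:00 schedule and the function names are assumptions, and the exact lftp options used at Trieste are not stated in the slides.

```python
import shlex
from urllib.parse import urlsplit

# Upstream location quoted on the slide.
SOURCE_URL = "ftp://ftp.scientific.linux.org/linux/scientific/303/i386/"


def mirror_argv(source_url: str, local_dir: str) -> list:
    """Build an lftp invocation that mirrors source_url into local_dir."""
    parts = urlsplit(source_url)
    site = f"{parts.scheme}://{parts.netloc}"
    # 'mirror --delete' also removes local files that vanished upstream,
    # so the local copy tracks the remote tree exactly.
    script = f"open {site}; mirror --delete {parts.path} {local_dir}"
    return ["lftp", "-c", script]


def crontab_entry(argv: list, hour: int = 3, minute: int = 0) -> str:
    """Format a crontab line that runs argv once per night."""
    return f"{minute} {hour} * * * {shlex.join(argv)}"
```

For example, crontab_entry(mirror_argv(SOURCE_URL, "/var/ftp/pub/sl303")) produces a line suitable for the mirror host's crontab, where /var/ftp/pub/sl303 stands for whatever directory the anonymous FTP server exports.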
Scientific Linux – Configuration (2/2)
  Linux PCs are configured as “YUM clients” of the local FTP server hosting the distribution
  Updates are applied when the PC starts up and, while it is running, via a cron job, using the simple command yum update
  Kernel updates are not managed by an automatic tool
    After a test phase the new kernel is installed manually
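Pointing a PC at the local mirror amounts to a repository stanza in the yum configuration plus a non-interactive update command. The sketch below renders both; the repository id, the server name and the use of the -y flag for unattended runs are assumptions, not details taken from the slides.

```python
import configparser
import io


def render_yum_repo(repo_id: str, name: str, baseurl: str) -> str:
    """Render one repository stanza in the INI style read by yum."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep option names exactly as written
    cfg[repo_id] = {"name": name, "baseurl": baseurl}
    out = io.StringIO()
    cfg.write(out, space_around_delimiters=False)
    return out.getvalue()


def update_argv(assume_yes: bool = True) -> list:
    """Command run at boot and from cron; -y answers prompts automatically."""
    argv = ["yum"]
    if assume_yes:
        argv.append("-y")
    argv.append("update")
    return argv
```

A call such as render_yum_repo("sl-local", "SL 3.0.3 local mirror", "ftp://mirror.example.org/sl303/i386/") gives the stanza to place in the client's yum configuration. Excluding the kernel from such automatic runs would match the manual kernel policy described on the slide, but how that exclusion was done is not stated.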
Scientific Linux – Release upgrade
  Moving from SL 3.0.2 to 3.0.3 was smooth and painless
  A script was created and run on all PCs with SL 3.0.2 installed
    They were updated to the new release using mainly yum update plus a few additional commands
  At the end only a reboot was necessary (for the new kernel)
ACID (The COMPASS Trieste Compute Farm)
  Current Environment
    Hardware
      32 PC clients, 4 disk servers, 1 tape server + a small tape library (STK L40), 1 Oracle server, 1 supervisor and software repository server. All are dual-processor machines
    Software:
      OS: Red Hat Linux 7.3 (except the Oracle server, which runs Red Hat AS2.1)
      Just a few packages patched for local needs: kernel (NFS over TCP, 32k R/W size, F.Collin patches to Tape Drivers), OpenAFS, OpenSSH (local setup), CASTOR (CERN software; small changes in code, local setup)
B. Gobbo
ACID (The COMPASS Trieste Compute Farm)
  Software
    Monitoring: BigBrother 1.9e (Quest Software)
    Resource Management: Grid Engine 5.3 (Sun Open Source Software)
    HSM: CASTOR 1.7.1.5 (CERN Software)
    RPMs upgrade using LinuxUpdate tool (Developed in INFN Trieste)
ACID (cont.d)
  Upgrades to be done before end of the year
    Hardware
      ~6 more PC client nodes (with Opteron?)
    Software
      OS: Move to Scientific Linux (CERN Version under test)
        2 Client nodes already migrated for tests
        1 Server installed with Red Hat EL3
        All needed software already ported
      Experiment software porting basically finished.
      Initially we had problems with Grid Engine 6.0: the AFS token was not exported (a major problem, as user home directories are under AFS), and qsub lacked support for AFS credentials. Anyhow Grid Engine 6.0 now works.
ACID (cont.d)
  Software
    A few changes are needed to “adapt” the distribution to the local environment
      Mainly remove some “too-CERN-related” items
      We keep the CERN-built kernel
    We are still considering “pure” SL too, as an alternative
      In that case a few packages need to be added. Kernel has to be rebuilt.
    We keep a local mirror of the SLC and SL repositories
      Following the CERN model: upgrade via APT (from the local repository)
    We had a look at QUATTOR too
      But the farm is not so big. Probably such a powerful tool is not needed