EOS installation guide and more information
This page helps admins install EOS on storage servers and collects additional information (commands, etc.) for configuring or tuning the installed instance.
WARNING: parts of these instructions may have become STALE !!!
Better rely on the official EOS docs and ask for advice on the:
- EOS Community Forum
- ALICE LCG TF list
Installation
Traditionally, the installation is done on SL[5-6] and CentOS[7] based machines using the 'eos-deploy' script, which can be downloaded here: eos-deploy
You can also follow the official EOS documentation, but you might miss some ALICE particularities: EOS docs
0) Cleanup
eos-deploy removes the eos packages, then installs eos-cleanup and runs it, which returns 'most' things to the initial state needed for installation.
1) Make sure no XRootD V4.X is installed if you go for the aquamarine release. If it is installed, uninstall it:
rpm -e `rpm -qa | grep xrootd | awk '{printf("%s ", $1); }'`
2) Mask xrootd* and libmicrohttpd* from the EPEL, base and update (e.g. CentOS-*) yum repositories.
eos-deploy should warn automatically if the repos are not OK. In that case:
in /etc/yum.repos.d/epel.repo
in /etc/yum.repos.d/epel-testing.repo
add to each section:
exclude=xrootd*,libmicrohttpd*
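To check by hand which sections of a repo file still lack the exclude line, something like the following can be used. It is only a sketch: missing_exclude is a made-up helper name (not part of eos-deploy), and the demo runs on a throw-away file rather than on the real repo files.

```shell
#!/bin/sh
# Sketch: list the sections of a yum .repo file that do not yet carry the
# exclude line for xrootd* and libmicrohttpd*.
# missing_exclude is a hypothetical helper name, not part of eos-deploy.
missing_exclude() {
    awk '
        /^\[.*\]/ {                      # a new [section] starts
            if (section != "" && !found) print section
            section = $0; found = 0; next
        }
        /^exclude=/ && /xrootd\*/ && /libmicrohttpd\*/ { found = 1 }
        END { if (section != "" && !found) print section }
    ' "$1"
}

# Demonstration on a throw-away file; on a real host pass
# /etc/yum.repos.d/epel.repo or /etc/yum.repos.d/epel-testing.repo instead.
demo_repo=$(mktemp)
cat > "$demo_repo" <<'EOF'
[epel]
name=Extra Packages for Enterprise Linux
enabled=1
exclude=xrootd*,libmicrohttpd*

[epel-debuginfo]
name=Extra Packages for Enterprise Linux - Debug
enabled=0
EOF
missing_exclude "$demo_repo"
rm -f "$demo_repo"
```

The demo prints the one section without the exclude line. An empty result for a real repo file means every section is masked.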
volume-n02-v02/cern-02 32T 2.4T 30T 8% /data2
1094: XRootD MGM port (only on MGMs)
1095: XRootD FST port (only on FSTs)
1096: XRootD SYNC port (only on MGMs)
1097: XRootD MQ port (only on MGMs)
In principle not used by ALICE:
443: https X509 port (only on HTTPS gateways or MGM)
8443: https KRB5 port (only on HTTPS gateways or MGM)
8000: http port (only on MGMs)
8001: http port (only on FSTs)
After running the script, you should have all necessary eos services running.
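A quick way to cross-check the services against the port list above is to look for listeners on the expected ports. The sketch below assumes the output format of 'ss -ltn'; expected_ports and check_ports are made-up helper names, and only the ALICE-relevant ports are checked.

```shell
#!/bin/sh
# Sketch: report which of the EOS ports listed above have a listener.
# expected_ports and check_ports are hypothetical helper names.

# Ports per role, taken from the list above (ALICE-relevant ones only).
expected_ports() {
    case "$1" in
        mgm) echo "1094 1096 1097" ;;
        fst) echo "1095" ;;
        *)   echo "usage: expected_ports mgm|fst" >&2; return 1 ;;
    esac
}

# Reads 'ss -ltn' style output on stdin and prints one line per expected port.
check_ports() {
    role=$1
    listing=$(cat)
    for port in $(expected_ports "$role"); do
        # Match ":<port>" followed by whitespace in the local-address column.
        if printf '%s\n' "$listing" | grep -Eq ":$port[[:space:]]"; then
            echo "$port listening"
        else
            echo "$port MISSING"
        fi
    done
}

# On a real MGM: ss -ltn | check_ports mgm
# Demonstration with canned input so the sketch runs anywhere:
check_ports mgm <<'EOF'
State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      128          0.0.0.0:1094      0.0.0.0:*
LISTEN 0      128          0.0.0.0:1097      0.0.0.0:*
EOF
```

With the canned input, 1094 and 1097 are reported as listening and 1096 as MISSING.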
- Some helpful EOS commands
# You can log in to eos with:
[root@mmmartinmgm1 ~]# eos -b
EOS Console [root://localhost] |/>
This gives you the EOS console, where you can run eos commands. The same commands can also be run from a shell by prefixing them with 'eos'.
# You can see your filesystems/disks:
EOS Console [root://localhost] |/> fs ls
#..........................................................................................................................................
# host (#...) # id # path # schedgroup # geotag # boot # configstatus # drain # active
#..........................................................................................................................................
mmmartinfst1.cern.ch (1095) 1 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 2 /var/eos/fs/1 default.1 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 3 /var/eos/fs/2 default.2 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 4 /var/eos/fs/3 default.3 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 5 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 6 /var/eos/fs/1 default.1 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 7 /var/eos/fs/2 default.2 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 8 /var/eos/fs/3 default.3 booted rw nodrain online
# Your spaces (default is the one created by the script)
EOS Console [root://localhost] |/> space ls
#------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# type # name # groupsize # groupmod #N(fs) #N(fs-rw) #sum(usedbytes) #sum(capacity) #capacity(rw) #nom.capacity #quota #balancing # threshold # converter # ntx # active #intergroup
#------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
spaceview default 0 24 8 8 18.08 G 67.36 G 26.56 G 0 off off 20 off 2 0 off
# The same for the groups:
EOS Console [root://localhost] |/> group ls
#---------------------------------------------------------------------------------------------------------------------
# type # name # status #nofs #dev(filled) #avg(filled) #sig(filled) #balancing # bal-shd #drain-shd
#---------------------------------------------------------------------------------------------------------------------
groupview default.0 on 2 0.74 26.84 0.74 idle 0 0
groupview default.1 on 2 0.74 26.84 0.74 idle 0 0
groupview default.2 on 2 0.74 26.84 0.74 idle 0 0
groupview default.3 on 2 0.74 26.84 0.74 idle 0 0
# Common case: you want to use RAIN6 (the EOS-internal RAID6 equivalent), which needs at least 6 stripes, but you don't have 6 servers.
You can put several fs of the same server in the same group, for example by making each group twice the size. You will need to reorganize the layout of the storage.
Example: 3 servers with 4 fs each:
EOS Console [root://localhost] |/> fs ls
#..........................................................................................................................................
# host (#...) # id # path # schedgroup # geotag # boot # configstatus # drain # active
#..........................................................................................................................................
mmmartinfst1.cern.ch (1095) 1 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 2 /var/eos/fs/1 default.1 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 3 /var/eos/fs/2 default.2 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 4 /var/eos/fs/3 default.3 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 5 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 6 /var/eos/fs/1 default.1 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 7 /var/eos/fs/2 default.2 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 8 /var/eos/fs/3 default.3 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 9 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 10 /var/eos/fs/1 default.1 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 11 /var/eos/fs/2 default.2 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 12 /var/eos/fs/3 default.3 booted rw nodrain online
We want:
EOS Console [root://localhost] |/> fs ls
#..........................................................................................................................................
# host (#...) # id # path # schedgroup # geotag # boot # configstatus # drain # active
#..........................................................................................................................................
mmmartinfst1.cern.ch (1095) 1 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 2 /var/eos/fs/1 default.0 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 3 /var/eos/fs/2 default.1 booted rw nodrain online
mmmartinfst1.cern.ch (1095) 4 /var/eos/fs/3 default.1 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 5 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 6 /var/eos/fs/1 default.0 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 7 /var/eos/fs/2 default.1 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 8 /var/eos/fs/3 default.1 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 9 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 10 /var/eos/fs/1 default.0 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 11 /var/eos/fs/2 default.1 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 12 /var/eos/fs/3 default.1 booted rw nodrain online
for that, we use these commands:
EOS Console [root://localhost] |/eos/alicemiguel/grid/> fs mv 'fs_id' default.'group_id'
success: moved filesystem 'id' into space default.'gid'
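The regrouping in the example follows a regular pattern, so the list of 'fs mv' commands can be generated instead of typed. The sketch below assumes fs ids are numbered consecutively per server, as in the listings above. plan_fs_moves is a made-up helper that only prints the commands and changes nothing.

```shell
#!/bin/sh
# Sketch: print the 'fs mv' commands for the regrouping shown above.
# Assumes filesystem ids are numbered consecutively per server and that
# neighbouring local filesystems share a group. plan_fs_moves is a
# hypothetical helper; it only prints, it does not change anything.
plan_fs_moves() {
    n_fs=$1          # total number of filesystems, e.g. 12
    fs_per_server=$2 # filesystems per server, e.g. 4
    per_group=$3     # local filesystems per server placed in one group, e.g. 2
    id=1
    while [ "$id" -le "$n_fs" ]; do
        local_index=$(( (id - 1) % fs_per_server ))
        group=$(( local_index / per_group ))
        echo "eos fs mv $id default.$group"
        id=$(( id + 1 ))
    done
}

plan_fs_moves 12 4 2
```

Review the printed list before running anything. Filesystems that already sit in their target group (fs 1 in default.0, for instance) do not need to be moved.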
Then we set the number of stripes:
|eos> attr set sys.forced.stripes=6 /eos/instancename/foldername
That last command also offers other possibilities. You can always use the help of each command.
EOS Console [root://localhost] |/> attr ls
to see all possibilities.
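Before forcing 6 stripes it is worth confirming that every group really holds at least 6 fs. The sketch below counts fs per group from 'fs ls' output. It assumes the column layout shown on this page (schedgroup as the 5th field), which may differ between EOS versions; groups_too_small is a made-up helper name.

```shell
#!/bin/sh
# Sketch: count filesystems per scheduling group from 'fs ls' output and flag
# groups that are too small for the wanted number of stripes.
# Assumes the column layout shown in this page (schedgroup is the 5th field);
# groups_too_small is a hypothetical helper name.
groups_too_small() {
    awk -v need="$1" '
        /^#/ || NF < 5 { next }          # skip header and empty lines
        { count[$5]++ }
        END {
            for (g in count)
                if (count[g] < need) print g, count[g]
        }
    ' | sort
}

# On a real MGM: eos fs ls | groups_too_small 6
# Demonstration with rows from the initial 3-server layout, group default.0:
groups_too_small 6 <<'EOF'
# host (#...) # id # path # schedgroup # geotag # boot # configstatus # drain # active
mmmartinfst1.cern.ch (1095) 1 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst2.cern.ch (1095) 5 /var/eos/fs/0 default.0 booted rw nodrain online
mmmartinfst3.cern.ch (1095) 9 /var/eos/fs/0 default.0 booted rw nodrain online
EOF
```

Each printed line is a group name followed by its fs count. No output means all groups are large enough.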
# File information
EOS Console [root://localhost] |/eos/alicemiguel/grid/> file info passwd
File: '/eos/alicemiguel/grid/passwd' Flags: 0640
Size: 1372
Modify: Wed Feb 4 17:32:57 2015 Timestamp: 1423067577.602053000
Change: Wed Feb 4 17:32:58 2015 Timestamp: 1423067578.87207739
CUid: 0 CGid: 0 Fxid: 00000006 Fid: 6 Pid: 11 Pxid: 0000000b
XStype: none XS: ETAG: 1610612736:1423067577
plain Stripes: 1 Blocksize: 4k LayoutId: 00100001
#Rep: 1
# fs-id #...................................................................................................................................
# host # schedgroup # path # boot # configstatus # drain # active # geotag
#...................................................................................................................................
0 3 mmmartin-fst1.cern.ch default.2 /var/eos/fs/2 booted rw nodrain online
*******
# Copy a file
EOS Console [root://localhost] |/eos/alicemmmartin/test/> cp /etc/hosts /eos/alicemmmartin/test/
doing stat of /etc/hosts
[eos-cp] going to copy 1 files and 159 B
append: /etc/hosts hosts
[eoscp] hosts Total 0.00 MB |====================| 100.00 % [0.0 MB/s]
[eos-cp] copied 1/1 files and 159 B in 0.08 seconds with 2101 B/s
# EOSHA
This service checks every 10 s that the MGM is running. If the check fails, it sends an email to the admin configured in /etc/sysconfig/eos.
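The general idea of such a watchdog can be sketched as below. This is NOT the EOSHA code, just an illustration of "probe periodically, notify on failure". watch_service is a made-up name, and the probe and notify commands are whatever the caller passes in.

```shell
#!/bin/sh
# Sketch of a periodic liveness check with notification on failure.
# This is NOT the EOSHA code; probe and notify commands are passed in by the
# caller, and watch_service is a hypothetical name.
watch_service() {
    probe=$1      # command that succeeds while the service is healthy
    notify=$2     # command run once per failed probe
    interval=$3   # seconds between probes (10 in the description above)
    rounds=$4     # number of probes before returning; keeps the demo finite
    i=0
    while [ "$i" -lt "$rounds" ]; do
        if ! sh -c "$probe" >/dev/null 2>&1; then
            sh -c "$notify"
        fi
        i=$(( i + 1 ))
        [ "$i" -lt "$rounds" ] && sleep "$interval"
    done
    return 0
}

# Demonstration: a probe that always fails, checked twice without waiting.
watch_service "false" "echo 'MGM check failed'" 0 2
```

In a real setup the probe would be some command that talks to the MGM and the notify command would send the mail. Both are left open here on purpose.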
# Add new FST
eos node set mmmartinfst1.cern.ch:1095 on
- Some testing
06:18:10 # eos cp /var/tmp/1G /eos/aliceornl/test/
doing stat of /var/tmp/1G
[eos-cp] going to copy 1 files and 1.05 GB
append: /var/tmp/1G 1G
[eoscp] 1G Total 1000.00 MB |====================| 100.00 % [403.5 MB/s]
[eos-cp] copied 1/1 files and 1.05 GB in 2.64 seconds with 396.77 MB/s
_________________________________________________________
09:47:20 # eoscp -b 100000000 root://localhost//eos/aliceornl/test/1G /dev/null
[eoscp] Total 1000.00 MB |====================| 100.00 % [754.4 MB/s]
[eoscp] #################################################################
[eoscp] # Date : ( 1431881243 ) Sun May 17 09:47:23 2015
[eoscp} # auth forced= krb5= gsi=
[eoscp] # Source Name [00] : root://localhost//eos/aliceornl/test/1G
[eoscp] # Destination Name [00] : /dev/null
[eoscp] # Data Copied [bytes] : 1048576000
[eoscp] # Realtime [s] : 1.390000
[eoscp] # Eff.Copy. Rate[MB/s] : 754.371250
[eoscp] # Write Start Position : 0
[eoscp] # Write Stop Position : 1048576000
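The effective rate in the summary can be recomputed from the bytes copied and the elapsed time. The sketch below assumes 1 MB = 10^6 bytes for the rate, which is what reproduces the eoscp figure. rate_mb_s is a made-up helper name.

```shell
#!/bin/sh
# Sketch: recompute the effective copy rate from bytes copied and elapsed time.
# Assumes 1 MB = 10^6 bytes, which reproduces the eoscp figure above;
# rate_mb_s is a hypothetical helper name.
rate_mb_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", bytes / secs / 1000000 }'
}

rate_mb_s 1048576000 1.39
```

This prints 754.37, which agrees with the reported rate to two decimals. The "Total 1000.00 MB" in the progress bar, by contrast, corresponds to 1048576000 bytes counted in units of 1048576 bytes.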
_________________________________________________________
xrdcp root://localhost//eos/aliceornl/test/1G /dev/null -f
[xrootd] Total 1000.00 MB |====================| 100.00 % [699.1 MB/s]
If there is an issue with permissions, use ruid=0 parameter to map to 'nobody', like:
xrdcp "root://localhost//eos/aliceornl/test/1G?ruid=0" /dev/null -f
_________________________________________________________
root@alice-eos-01.ornl.gov:/eos/aliceornl/test
10:29:49 # time for name in `seq 1 1000`; do touch empty.$name; done
real 0m11.058s
user 0m0.409s
sys 0m0.882s
_________________________________________________________
EOS Console [root://localhost] |/eos/aliceornl/test/> mkdir /test
EOS Console [root://localhost] |/eos/aliceornl/test/> test mkdir 1000
info: doing directory test with loop =1000
[ mkdir] startstop : 771.510
= mkdir= startstop : 771.510
EOS Console [root://localhost] |/eos/aliceornl/test/> test rmdir 1000
info: doing directory test with loop =1000
[ rmdir] startstop : 711.595
= rmdir= startstop : 711.595
! Important: BACKUP of MGM !