Faculty of Science, Racah Institute of Physics, ICPL site

# Running Jobs On ICPL Cluster
## Accessing the ICPL Cluster

### Accessing the gate servers

To run jobs you must first log in to one of the gate servers. The gate servers connect the internet and the university network on one side with the ICPL Cluster network on the other.
### SSH and X11 client

Windows: putty, KiTTY, or MobaXterm (SSH and X11 in one). X server for Windows (to open X11 windows): Xming or VcXsrv.

Unix, Linux, and macOS users can connect by opening a terminal and running:

```shell
ssh <USERNAME>@<SERVERNAME>
```

Example:

```shell
ssh -X <USERNAME>@newgate1.phys.huji.ac.il
```

The gate servers are (use the standard SSH port, 22):

- newgate1.phys.huji.ac.il
- newgate2.phys.huji.ac.il
- newgate3.phys.huji.ac.il
- newgate4.phys.huji.ac.il

The currently installed operating system is Rocky Linux. For more information about the Rocky Linux operating system, see Rocky Linux. For more information about Linux shell commands, see explainshell.

### After initial login

You should set up passwordless login using an SSH key. For more information, read about SSH public key authentication, or see a more practical example here.

## Working with Environment Modules

The user environment is managed with Lmod, a Lua-based module system that provides a convenient way to dynamically change the user's environment through modulefiles. For more information, see Lmod.

| Action | Command |
| --- | --- |
| View loaded modules | `ml` or `module list` |
| View available modules | `module avail`, e.g. `module avail openmpi` |
| Add a module | `ml <MODULENAME>` or `module add <MODULENAME>` |
| Remove a module | `ml -<MODULENAME>` or `module unload <MODULENAME>` |
| Add a compiler | `ml intel_parallel_studio_xe` or `module add intel_parallel_studio_xe` |
| Add MPI | `ml openmpi` or `module add openmpi` |
| Remove MPI | `ml -openmpi` |
| Replace MPI with a newer version | `module swap openmpi/1.6.3 openmpi/1.10.2` |

## Submitting a job using the Slurm resource manager

Jobs on the ICPL Cluster run either as batch jobs, in an unattended manner, or as shell sessions in interactive mode. Jobs are scripts with instructions on how and where to execute your work. Typically a user logs in to a gate server, prepares a job, and submits it to the job queue. The user can then disconnect from the system without interrupting the job; the job will continue to run on a designated node, and the user can later collect the data, read the output files, and so on.
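The passwordless SSH-key setup mentioned under "After initial login" can be sketched as follows. This is a minimal sketch: the key file name and key comment are illustrative choices, not site requirements.

```shell
# One-time passwordless-login setup (illustrative key file name).
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
keyfile="$HOME/.ssh/id_ed25519_icpl"
rm -f "$keyfile" "$keyfile.pub"      # start clean for this example
# Generate an ed25519 key pair with an empty passphrase (-N ""):
ssh-keygen -q -t ed25519 -N "" -f "$keyfile" -C "icpl-login"
# Install the public key on a gate server (asks for your password once);
# run this yourself, since it needs network access to the cluster:
#   ssh-copy-id -i "$keyfile.pub" <USERNAME>@newgate1.phys.huji.ac.il
# Subsequent logins then need no password:
#   ssh -i "$keyfile" <USERNAME>@newgate1.phys.huji.ac.il
```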
Further information about Slurm and Slurm commands can be found here.
Jobs are managed by Slurm, which is in charge of allocating resources, scheduling jobs, and launching them on the compute nodes. Running a job involves, at the minimum, the following steps:

1. Prepare a job script (a submit file) describing the resources needed and the commands to run.
2. Submit the script to the queue with `sbatch`.
3. Monitor the job with `squeue`.
4. Collect the output files when the job completes.
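These steps can be sketched as a short shell session on the gate server. The script contents are an illustrative sketch; `sbatch` and `squeue` only work on the cluster itself, so they are shown as comments here.

```shell
# 1. Prepare a minimal job script (contents are illustrative):
cat > hello.sbatch <<'EOF'
#!/bin/sh
#SBATCH --job-name=hello
#SBATCH --output=hello-%j.out
echo "Hello from $(hostname)"
EOF
# 2. Submit it to the queue (prints "Submitted batch job <jobid>"):
#   sbatch hello.sbatch
# 3. Monitor it:
#   squeue -u $USER
# 4. When it finishes, read the output file (%j is replaced by the job ID):
#   cat hello-<jobid>.out
```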
## Partitions on the ICPL Cluster
## Slurm Commands

### sbatch

Job scripts are submitted with the `sbatch` command. For further information about the `sbatch` command, type `man sbatch` on the gate server.

### squeue

Displays job status. For further information about the `squeue` command, type `man squeue` on the gate server.

### scancel

Slurm provides the `scancel` command for deleting jobs from the system using the job identification number.

### srun

Runs a job on the Slurm cluster directly from the shell.

### smq

Alias to:

### sq

Alias to:

### Running in interactive mode: srsh

The `srsh` command is an alias used to open an interactive shell on another node.

## Job Submit Example

Create a submit file named `myjob.sbatch` with, for example, the following contents. (Note: `#SBATCH` is a Slurm directive, not a remark; `##SBATCH` is a remark.)

```shell
#!/bin/sh
```
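A fuller sketch of such a submit file follows; the job name, output pattern, task count, and time limit are illustrative assumptions, not site defaults:

```shell
#!/bin/sh
#SBATCH --job-name=myjob          # name shown by squeue (illustrative)
#SBATCH --output=myjob-%j.out     # stdout/stderr file; %j = job ID
#SBATCH --ntasks=1                # number of tasks to run
#SBATCH --time=01:00:00           # wall-time limit (HH:MM:SS)
##SBATCH --partition=debug        # ##SBATCH is a remark: this line is ignored

# The commands below run on the allocated node:
msg="Running on $(hostname)"
echo "$msg"
```

Submit it with `sbatch myjob.sbatch` and watch its progress with `squeue -u $USER`.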
© All rights reserved to The Hebrew University of Jerusalem