Example 1: Run LAMMPS on adroit:
LAMMPS is used to simulate and analyze atom and molecule movements.
This sample job will produce trajectory data that we will turn into a cool visualization at the end of this example!
Our first step is to get onto the cluster. Logging in over SSH gives us a shell on Adroit (one of Princeton's Research Computing clusters), where we will run our program:
$ ssh <YourUserID>@adroit.princeton.edu
Create a directory
$ mkdir lammps.ex
$ cd lammps.ex
Download LAMMPS
$ pwd
/home/<YourUserID>/lammps.ex
$ wget https://raw.githubusercontent.com/PrincetonUniversity/install_lammps/master/01_installing/ins/adroit/lammps_mixed_prec_adroit_gpu_a100.sh
...
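(Optional) You can confirm the download by listing the directory; the installation script from the wget step should appear:
$ ls
lammps_mixed_prec_adroit_gpu_a100.sh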
Execute the LAMMPS installation script and save the installation output to the file “install_lammps.log”. (The tee command also displays the output in the terminal in real time.)
$ bash lammps_mixed_prec_adroit_gpu_a100.sh | tee install_lammps.log
...
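Before moving on, it can help to skim the end of the saved log to confirm the build finished without errors (a quick sanity check, not one of the original steps):
$ tail -n 20 install_lammps.log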
Let’s choose the example input “melt” for our test job. It is helpful to move this input file into our build directory:
$ mv .....
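The exact source path depends on where the installation script unpacked LAMMPS. Purely as an illustration (the lammps-stable_29Oct2020 directory name is taken from the path used later in this example; your version may differ, and cp is used here so the original file stays in place):
$ cp ~/lammps.ex/lammps-stable_29Oct2020/examples/melt/in.melt ~/lammps.ex/lammps-stable_29Oct2020/build/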
Now navigate to the build folder
$ cd build
Run make, which uses the generated Makefile to compile the code:
$ make all
...
Now we are going to create a Slurm script; this file tells the scheduler how to run our program.
$ touch slurm.run
Open the file so that we can edit it
$ nano slurm.run
Below is a Slurm script that will run your code. When using different inputs it is important to adjust the nodes, ntasks, cpus-per-task, time limit, and other directives. For more information on writing Slurm scripts, see the Princeton Research Computing Slurm documentation.
Copy and paste this text into slurm.run, replacing “<YourNetID>” with your NetID:
#!/bin/bash
####### --clusters=adroit # Select which system(s) to use
##SBATCH --account=blah
##SBATCH --partition=all
####### --reservation=blah
####### --partition=main # Partition (job queue)
##SBATCH --requeue # Return job to the queue if preempted
#SBATCH --job-name=badidea # Assign a short name to your job
#SBATCH --nodes=1 # Number of nodes you require
#SBATCH --ntasks=8 # Total # of tasks across all nodes
#SBATCH --cpus-per-task=1 # Cores per task (>1 if multithread tasks)
#SBATCH --mem-per-cpu=4G # Real memory (RAM) required per CPU-core
#SBATCH --time=15:00 # Total run time limit (DD-HH:MM:SS)
#SBATCH --output=slurm.%N.%j.out # STDOUT file for SLURM output
#SBATCH --mail-type=end
#SBATCH --mail-user=<YourNetID>@princeton.edu
## Environment settings needed for this job
module purge
module load intel/19.1.1.217
module load intel-mpi/intel/2019.7
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
## Run the job
srun $HOME/.local/bin/lmp_adroit -in in.melt
Notice that “srun” is what is ultimately executing our program.
Now we will execute LAMMPS. “sbatch” is the command used to submit a job script to Slurm for execution. It is followed by the name of the script you want to submit, in this case “slurm.run”.
$ sbatch slurm.run
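Slurm replies with the ID it assigned to the job, in the standard form:
Submitted batch job <JobID>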
This specific job takes roughly 15 minutes to run (the 15:00 time limit in the script gives it just enough room). To verify that your job is running you can run the command:
$ squeue -u <YourNetID>
...
[am3949@adroit5 build]$ squeue -u am3949
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1825257 gpu lmpmltx <YourUserID> R 0:26 1 adroit-h11g2
We know that our job is running because we can see an “R” under the ST (state) column.
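As an aside, if you ever need to stop a queued or running job, you can cancel it with scancel and the job ID shown by squeue:
$ scancel <JobID>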
While we’re waiting on our output let’s run some commands to take a closer look at our job’s performance.
First connect directly to the node where the job is running and verify it’s running in parallel.
$ ssh adroit-h11g2 #connect to the node
$ top -u <YourUserID> #verify the job is running in parallel
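With --ntasks=8 you should see eight lmp_adroit processes, each near 100% CPU, confirming the job is running in parallel (this is what to expect given the script above; the exact picture can vary). When you are done, press q to quit top and leave the compute node to return to the login node:
$ exit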
Once the job finishes, you will see a new output file in your build folder, named according to the --output directive (slurm.<NodeName>.<JobID>.out):
$ ls
...
View your file by using the more command:
$ more slurm.<NodeName>.<JobID>.out
When running your own jobs, it is really important to tune your Slurm script so that it runs efficiently on the cluster. The jobstats command is a simple way to check this: it reports how your job actually used the time, CPU, and memory it requested, which you can compare across different Slurm configurations.
Look at the name of the output file in your build directory, for example:
slurm.adroit-h11n6.1799247.out
The number in the file name, “1799247” in this example, is the job ID to pass to jobstats:
$ jobstats <YourJobID>
You should get an output that looks something like this:
================================================================================
Slurm Job Statistics
================================================================================
Job ID: 1799290
NetID/Account: am3949/cses
Job Name: badidea
State: COMPLETED
Nodes: 1
CPU Cores: 8
CPU Memory: 32GB (4GB per CPU-core)
QOS/Partition: test/all
Cluster: adroit
Start Time: Mon Jul 10, 2023 at 11:53 AM
Run Time: 00:13:44
Time Limit: 00:15:00
Overall Utilization
================================================================================
CPU utilization [|||||||||||||||||||||||||||||||||||||||||||||||98%]
CPU memory usage [ 1%]
Detailed Utilization
================================================================================
CPU utilization per node (CPU time used/run time)
adroit-h11n2: 01:47:45/01:49:52 (efficiency=98.1%)
CPU memory usage per node - used/allocated
adroit-h11n2: 378.7MB/32.0GB (47.3MB/4.0GB per core of 8)
Notes
================================================================================
* This job only used 1% of the 32GB of total allocated CPU memory. For
future jobs, please allocate less memory by using a Slurm directive such
as --mem-per-cpu=1G or --mem=1G. This will reduce your queue times and
make the resources available to other users. For more info:
https://researchcomputing.princeton.edu/support/knowledge-base/memory
* This job ran in the test QOS. Each user can only run a small number of
jobs simultaneously in this QOS. For more info:
https://researchcomputing.princeton.edu/support/knowledge-base/job-priority#test-queue
* For additional job metrics including metrics plotted against time:
https://myadroit.princeton.edu/pun/sys/jobstats (VPN required off-campus)
Notice the Notes section. Our original Slurm script allocates far more CPU memory than the job actually uses; following the suggestion in the notes, we can fix this by lowering the --mem-per-cpu value in the script.
Change your memory line to read the following:
#SBATCH --mem-per-cpu=1G # Real memory (RAM) required per CPU-core
This is how we make sure that our program is running efficiently on the cluster.
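After editing slurm.run, resubmit the job and, once it finishes, run jobstats on the new job ID to confirm that the memory utilization note has gone away:
$ sbatch slurm.run
$ squeue -u <YourNetID>
$ jobstats <NewJobID>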
Now we are going to work on running our visualization. If you do not plan on visualizing your data you can skip this step.
First we are going to copy the trajectory file from Adroit to our local machine.
Open a new terminal window on your local machine (not on Adroit); it will start in your local home directory.
Then navigate to the folder we want to copy the file into:
$ cd Desktop
$ pwd
/Users/am3949/Desktop
This command copies the melt.lammpstrj file from Adroit into the current folder (here, the Desktop):
$ scp <YourUserId>@adroit.princeton.edu:/home/<YourUserId>/lammps.ex/lammps-stable_29Oct2020/build/melt.lammpstrj .
....
melt.lammpstrj 49% 239MB 28.5MB/s 00:08 ETA
Download VMD from the VMD website, and be sure you are downloading the version that matches your operating system (on a Mac, check your macOS version and chip type so you pick the right build).
You may run into a problem downloading VMD; if so, check out the linked troubleshooting post.
Now that you have VMD open and running, we are going to load our output file.
In VMD, open the file you just copied (File > New Molecule, then browse to melt.lammpstrj).
This should load our molecule into VMD. You may notice it just looks like a big blob of lines; we are going to fix that by editing the graphical representation. Open the Graphics > Representations menu.
In the Drawing Method dropdown, change the drawing method to VDW. Each atom is now drawn as a sphere, giving the final visualization.