Anvil Changes
During the Anvil scheduled maintenance on March 21, 2023, the following changes were made:
- The default Slurm partition has been changed from wholenode to shared. This change is intended to reduce accidental waste of compute resources and SUs. In the old default partition, wholenode, a job consumes all 128 cores on a node even if the user requests just one task, i.e. the job is charged 128 SUs per node per hour. With this change, jobs that do not request an explicit partition are placed in the shared partition instead, leading to fewer surprises.
  Note: the shared partition does not allow multi-node jobs (see description of partitions and their limits). If your multi-node jobs relied on wholenode being the default partition, you may now have to specify the partition explicitly. A QOSMaxCpuPerJobLimit error during job submission is a good indicator of this.
- CUDA updates. The NVIDIA GPU driver has been updated, and CUDA 12.0.1 has been added as a module.
- Operating system updates. The operating system on Anvil machines has been updated to Rocky 8.7.
- Slurm updates. Slurm has been updated to version 22.05.8.
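For the partition change in the first item above, a multi-node job script could name its partition explicitly along the following lines. This is only a sketch: the allocation name, time limit, node count and application name are placeholders rather than values from this announcement. The directives are standard sbatch options, and SLURM_JOB_NUM_NODES and SLURM_JOB_PARTITION are environment variables that Slurm sets inside a running job.

```shell
#!/bin/bash
# Sketch of a multi-node job script that requests its partition explicitly.
# All values below are illustrative placeholders.

# wholenode is no longer the default, so ask for it by name.
#SBATCH --partition=wholenode
# Multi-node jobs are not allowed in the shared partition.
#SBATCH --nodes=2
# Use all 128 cores on each node.
#SBATCH --ntasks-per-node=128
#SBATCH --time=01:00:00
# Hypothetical allocation name; replace with your own.
#SBATCH --account=myallocation

# wholenode is charged at 128 SUs per node per hour, whatever the task count.
# SLURM_JOB_NUM_NODES is set by Slurm inside a job; fall back to 2 elsewhere.
nodes=${SLURM_JOB_NUM_NODES:-2}
hours=1
su_estimate=$(( 128 * nodes * hours ))
echo "Partition: ${SLURM_JOB_PARTITION:-wholenode}, nodes: ${nodes}, estimated charge: ${su_estimate} SUs"

# Launch the application here (application name is hypothetical):
# srun ./my_app
```

The same partition request can be made on the command line with `sbatch --partition=wholenode`, which overrides the directive in the script. The estimate follows directly from the charging rate quoted above, so two nodes for one hour come to 256 SUs.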
Please submit a ticket through ACCESS Help Desk at https://support.access-ci.org/open-a-ticket if you have any questions.