H200 Partition
The H200 partition consists of 6 nodes with 54 H200 GPUs. Access to this partition is limited and is granted only by direct request from a faculty PI.
General use of an H200 GPU
We have enabled MIG on some of the GPUs within the h200ea partition.
Slurm settings:
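As an illustrative starting point, a basic single-GPU job script might look like the following. The partition name and GPU type are taken from the submission notes below; the memory and time values are placeholders to adjust for your workload:

#!/bin/bash
#SBATCH --partition=h200ea
#SBATCH --gres=gpu:h200:1
#SBATCH --mem=64G
#SBATCH --time=24:00:00

nvidia-smi    # confirm which H200 (or MIG slice) was allocated before starting real work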
Fractionalized GPUs
Two GPUs on dcc-h200-gpu-05 (GPUs 0 and 1) have been fractionalized with MIG, one into 7 instances and one into 2. Fractional GPUs can be requested by naming the MIG profile in the gres request.
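The exact gres strings depend on how the MIG instances were configured on the node; running scontrol show node dcc-h200-gpu-05 will list the gres types actually available. As a sketch, assuming NVIDIA's standard profile naming (a 1/7 slice of an H200 is typically a 1g.18gb instance and a 1/2 slice a 3g.71gb instance), a request might look like:

#SBATCH --partition=h200ea
#SBATCH --gres=gpu:1g.18gb:1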
Submission
All h200ea jobs must specify the “h200” GPU type in their “gres” request, e.g.
#SBATCH --gres=gpu:h200:1
This is to accommodate the MIG settings on dcc-h200-gpu-05.
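The same gres requirement applies to interactive sessions. As a sketch using standard srun options, a one-GPU interactive shell could be requested with:

srun --partition=h200ea --gres=gpu:h200:1 --pty bash -i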
Due to increased usage of the h200ea partition, we ask that jobs requiring more than 72 hours to complete (up to the partition's 7-day time limit) be limited to a single H200 GPU. Jobs requiring less than 72 hours may use 2 GPUs, provided the line
#SBATCH --time=72:00:00
is added to the job script. There is currently a per-user limit of 12 H200 GPUs allocated at any one time.
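Putting these limits together, the directives for a 2-GPU job capped at 72 hours would include, for example:

#SBATCH --partition=h200ea
#SBATCH --gres=gpu:h200:2
#SBATCH --time=72:00:00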