
Creating a Custom Module in Slurm

DS-Lee 2024. 11. 6. 21:47

This post shows an example of creating a custom module for PyTorch version 2.5.1 in a Slurm-managed high-performance computing (HPC) environment. It involves the following steps:

  1. Create a virtual environment

    mkdir -p modules/pytorch/2.5.1
    python -m venv modules/pytorch/2.5.1
    source modules/pytorch/2.5.1/bin/activate
  2. Install PyTorch and Other Libraries

     modules/pytorch/2.5.1/bin/pip install torch ...

    Calling "modules/pytorch/2.5.1/bin/pip" by its path specifies the exact pip to use, so the packages are installed into this virtual environment.

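As a quick sanity check after activating the environment, a sketch like the following (not part of the original steps) reports whether the running interpreter belongs to a virtual environment. It uses only the standard library, and `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)
print("inside venv:", in_virtualenv())
```

If the environment is active, the prefix should end in modules/pytorch/2.5.1.
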
  3. Create the Modulefile

    Modulefiles are scripts that configure the environment variables needed to use the software. Here's how to create one:

    • Create the Directory Structure:

      mkdir -p modulefiles/pytorch
    • Create the Modulefile:

      Create a file named 2.5.1 (a modulefile is a Tcl script) inside modulefiles/pytorch/ with the following content:

      nano modulefiles/pytorch/2.5.1

      Copy and paste the following:

       #%Module1.0#####################################################################
       ##
       ## PyTorch 2.5.1 modulefile
       ##
       proc ModulesHelp { } {
           puts stderr "This module loads your personal PyTorch 2.5.1 virtual environment."
       }
       module-whatis "Loads your personal PyTorch 2.5.1 virtual environment"
      
       # Set the root of your virtual environment (use the absolute path to modules/pytorch/2.5.1)
       set root /cluster/projects/nn11068k/modules/pytorch/2.5.1
      
       # Set the VIRTUAL_ENV environment variable
       setenv VIRTUAL_ENV $root
      
       # Unset PYTHONHOME to avoid conflicts
       unsetenv PYTHONHOME
      
       # Prepend the virtual environment's bin directory to PATH
       prepend-path PATH $root/bin
      
       # Prepend the virtual environment's library directories to LD_LIBRARY_PATH
       prepend-path LD_LIBRARY_PATH $root/lib
       prepend-path LD_LIBRARY_PATH $root/lib64
      
       # Prepend the site-packages to PYTHONPATH (python3.11 must match the Python version of the venv)
       prepend-path PYTHONPATH $root/lib/python3.11/site-packages
      
       # If using CUDA, you might need to set CUDA paths (optional)
       # prepend-path LD_LIBRARY_PATH /usr/local/cuda/lib64
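
If you expect to maintain several versions, the modulefile above could also be generated instead of edited by hand. The following is a minimal sketch under that assumption: `render_modulefile`, `write_modulefile`, and the template are hypothetical helpers, not part of Environment Modules, and the template leaves out the optional CUDA line.

```python
from pathlib import Path

# Template mirroring the modulefile above; doubled braces are literal
# Tcl braces, single braces are filled in by str.format().
MODULEFILE_TEMPLATE = """\
#%Module1.0
##
## PyTorch {version} modulefile
##
proc ModulesHelp {{ }} {{
    puts stderr "This module loads your personal PyTorch {version} virtual environment."
}}
module-whatis "Loads your personal PyTorch {version} virtual environment"

set root {root}

setenv VIRTUAL_ENV $root
unsetenv PYTHONHOME
prepend-path PATH $root/bin
prepend-path LD_LIBRARY_PATH $root/lib
prepend-path LD_LIBRARY_PATH $root/lib64
prepend-path PYTHONPATH $root/lib/python{py}/site-packages
"""


def render_modulefile(root: str, version: str, py: str) -> str:
    """Fill the template with the venv root, PyTorch version, and Python X.Y."""
    return MODULEFILE_TEMPLATE.format(root=root, version=version, py=py)


def write_modulefile(modulefiles_dir: str, root: str, version: str, py: str) -> Path:
    """Write <modulefiles_dir>/pytorch/<version> and return its path."""
    target = Path(modulefiles_dir) / "pytorch" / version
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_modulefile(root, version, py))
    return target


# Print the rendered modulefile for the environment used in this post.
print(render_modulefile("/cluster/projects/nn11068k/modules/pytorch/2.5.1", "2.5.1", "3.11"))
```

Calling `write_modulefile("modulefiles", root, "2.5.1", "3.11")` would then create modulefiles/pytorch/2.5.1, the same location used in the steps above.
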
  4. Test the Module

     module load Python/3.11.5-GCCcore-13.2.0
     module use modulefiles  # register the directory for our custom modulefiles
     module load pytorch/2.5.1

    Test it by starting python and running the following:

    import torch
    torch.rand(4)
    torch.__version__
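
To check non-interactively that the module took effect, a small script such as the following could be run after `module load pytorch/2.5.1`. It is only a sketch, and `module_report` is a hypothetical helper. It reports where `python` and `torch` resolve from without actually importing torch.

```python
import importlib.util
import os
import shutil
import sys


def module_report(env=os.environ):
    """Collect the facts that show whether the pytorch module took effect."""
    # find_spec locates the package without importing it; it returns None
    # when torch cannot be found on the current import path.
    spec = importlib.util.find_spec("torch")
    return {
        "VIRTUAL_ENV": env.get("VIRTUAL_ENV"),
        "python_on_path": shutil.which("python", path=env.get("PATH")),
        "running_interpreter": sys.executable,
        "torch_location": spec.origin if spec else None,
    }


for key, value in module_report().items():
    print(f"{key:20} {value}")
```

With the module loaded, VIRTUAL_ENV, the python found on PATH, and the torch location should all point under the root set in the modulefile.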