  • Creating a custom module in Slurm
    Uncategorized 2024. 11. 6. 21:47

    This post walks through creating a custom environment module for PyTorch 2.5.1 on a Slurm-managed high-performance computing (HPC) cluster. The module wraps a personal Python virtual environment so that it can be enabled with "module load". The steps are:

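    1. Create a virtual environment

      Create the environment with the same Python version that you will load alongside the module later (Python 3.11 in this example), because a virtual environment is tied to the interpreter that created it.

      mkdir -p modules/pytorch/2.5.1
      python -m venv modules/pytorch/2.5.1
      source modules/pytorch/2.5.1/bin/activate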
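    2. Install PyTorch and Other Libraries

       modules/pytorch/2.5.1/bin/pip install torch ... 

      Calling "modules/pytorch/2.5.1/bin/pip" by its path ensures that packages are installed into this virtual environment, rather than by whichever pip happens to be first on PATH.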

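To double-check which environment an interpreter belongs to before installing anything, a small script like the one below can help. It is only a sketch: the function name describe_environment is made up for this post, and it relies only on the standard library. Run it with modules/pytorch/2.5.1/bin/python to inspect the virtual environment.

```python
import sys
import sysconfig


def describe_environment():
    """Return basic facts about the interpreter that is currently running."""
    return {
        # Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        "prefix": sys.prefix,
        "executable": sys.executable,
        # Directory where pure-Python packages are installed for this interpreter.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If in_venv is False, or prefix does not point at modules/pytorch/2.5.1, the packages would end up somewhere other than the intended virtual environment.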
    3. Create the Modulefile

      Modulefiles are scripts that configure the environment variables needed to use the software. Here's how to create one:

      • Create the Directory Structure:

        mkdir -p modulefiles/pytorch
      • Create the Modulefile:

        Create a file named 2.5.1 inside modulefiles/pytorch/ (modulefiles are written in Tcl):

        nano modulefiles/pytorch/2.5.1

        Then paste in the following content, replacing the path on the "set root" line with the absolute path to your own virtual environment:

         #%Module1.0#####################################################################
         ##
         ## PyTorch 2.5.1 modulefile
         ##
         proc ModulesHelp { } {
             puts stderr "This module loads your personal PyTorch 2.5.1 virtual environment."
         }
         module-whatis "Loads your personal PyTorch 2.5.1 virtual environment"
        
         # Set the root of your virtual environment
         set root /cluster/projects/nn11068k/modules/pytorch/2.5.1
        
         # Set the VIRTUAL_ENV environment variable
         setenv VIRTUAL_ENV $root
        
         # Unset PYTHONHOME to avoid conflicts
         unsetenv PYTHONHOME
        
         # Prepend the virtual environment's bin directory to PATH
         prepend-path PATH $root/bin
        
         # Prepend the virtual environment's library directories to LD_LIBRARY_PATH
         prepend-path LD_LIBRARY_PATH $root/lib
         prepend-path LD_LIBRARY_PATH $root/lib64
        
         # Prepend the site-packages to PYTHONPATH
         prepend-path PYTHONPATH $root/lib/python3.11/site-packages
        
         # If using CUDA, you might need to set CUDA paths (optional)
         # prepend-path LD_LIBRARY_PATH /usr/local/cuda/lib64
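
The modulefile above comes down to a handful of environment changes: VIRTUAL_ENV is set, PYTHONHOME is unset, and the virtual environment's bin directory is put at the front of PATH. As a rough sanity check after loading the module, something like the following could be run. The helper check_module_env is a hypothetical name introduced here, and the "first on PATH" rule assumes no other module was loaded afterwards.

```python
import os


def check_module_env(environ):
    """Return a list of problems found in the environment; empty means it looks fine."""
    venv = environ.get("VIRTUAL_ENV")
    if not venv:
        return ["VIRTUAL_ENV is not set (is the module loaded?)"]

    problems = []
    path_entries = environ.get("PATH", "").split(os.pathsep)
    venv_bin = os.path.join(venv, "bin")
    if venv_bin not in path_entries:
        problems.append(f"{venv_bin} is not on PATH")
    elif path_entries[0] != venv_bin:
        # Assumption: the PyTorch module was the last one loaded.
        problems.append(f"{venv_bin} is on PATH but not first")
    if "PYTHONHOME" in environ:
        problems.append("PYTHONHOME is still set")
    return problems


if __name__ == "__main__":
    issues = check_module_env(os.environ)
    print("OK" if not issues else "\n".join(issues))
```

Passing the environment in as an argument keeps the check easy to try against a hand-written dictionary before using it on a real session.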
    4. Test the Module

       module load Python/3.11.5-GCCcore-13.2.0
       module use modulefiles  # register the directory for our custom modulefiles
       module load pytorch/2.5.1

      Then start python and run the following to confirm that PyTorch imports correctly and reports the expected version:

      import torch
      torch.rand(4)
      torch.__version__
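
The interactive check above can also be wrapped in a small script, which is handy when the test should run non-interactively. This is a minimal sketch: torch_report is a name invented for this post, and it uses only calls shown above plus torch.cuda.is_available() and the standard library.

```python
import importlib.util


def torch_report():
    """Collect basic facts about the PyTorch installation, if one is importable."""
    if importlib.util.find_spec("torch") is None:
        # The module is probably not loaded, or torch was installed elsewhere.
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        # Shows which site-packages directory torch was imported from.
        "location": torch.__file__,
        "cuda_available": torch.cuda.is_available(),
        "sample": torch.rand(4).tolist(),
    }


if __name__ == "__main__":
    for key, value in torch_report().items():
        print(f"{key}: {value}")
```

The location entry is the quickest way to confirm that torch is coming from the virtual environment behind the module and not from some other installation.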
