Nicolas 456f4b3d48 note added 2023-07-05 08:57:26 +00:00

README.md

Setup for TensorFlow with GPU

Tested on Ubuntu 20.04.4
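
Since the scripts were only tested on one Ubuntu release, a quick check before starting can help. The following is a minimal sketch, not part of the repository: `check_release` and its optional file argument are hypothetical names. It compares against `20.04` because `VERSION_ID` in `/etc/os-release` does not include the point release.

```shell
# Sketch: warn when the host is not the Ubuntu release this setup was tested on.
# check_release and its optional file argument are hypothetical (used for illustration).
check_release() {
    local release_file="${1:-/etc/os-release}"
    local id version_id
    # Source the file in a subshell so its variables do not leak into this shell.
    id="$(. "$release_file" && printf '%s' "${ID:-}")"
    version_id="$(. "$release_file" && printf '%s' "${VERSION_ID:-}")"
    if [ "$id" = "ubuntu" ] && [ "$version_id" = "20.04" ]; then
        echo "OK: Ubuntu 20.04 detected"
    else
        echo "WARNING: untested system (${id:-unknown} ${version_id:-unknown})"
        return 1
    fi
}
```

Calling `check_release` without arguments reads `/etc/os-release`; a non-zero return code only signals an untested system, not that the setup will fail.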

Steps:

  1. Prepare setup:

    git clone https://repos.nonan.net/nicolas/gpu_server_setup.git
    cd gpu_server_setup
    
  2. Set up driver/CUDA:

    sudo bash setup_cuda.sh
    
  3. Reboot system:

    sudo systemctl reboot
    
  4. Set up bcache:

    sudo bash setup_bcache.sh
    
  5. Set up apps (Python, JupyterHub with the Hub running as root, TensorFlow, etc.):

    sudo bash setup_apps.sh
    

Notes

CUDA

Check the state of the NVIDIA devices (power draw, temperature, memory usage, etc.):

    nvidia-smi
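
For scripting or logging, the same values can be requested as CSV instead of the full table. This is a sketch under assumptions: `gpu_status` is a hypothetical helper name, and the selection of query fields is only one possible choice.

```shell
# Sketch: print one CSV line per GPU (index, name, temperature, power draw, memory).
# gpu_status is a hypothetical helper name, not part of the repository.
gpu_status() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the driver installed?" >&2
        return 1
    fi
    nvidia-smi \
        --query-gpu=index,name,temperature.gpu,power.draw,memory.used,memory.total \
        --format=csv,noheader
}
```

If `nvidia-smi` is missing after step 2 and the reboot, the driver installation most likely did not complete.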

bcache

Check bcache performance:

    cat /sys/block/bcache0/bcache/state
    cat /sys/block/bcache*/bcache/stats_five_minute/cache_hit_ratio
    cat /sys/block/bcache*/bcache/stats_hour/cache_hit_ratio
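
With more than one bcache device, the plain `cat` output does not show which value belongs to which device. A minimal sketch that labels each value follows; `bcache_report` and its optional sysfs root argument are hypothetical names, and the hit ratios are assumed to be percentages as reported by the kernel.

```shell
# Sketch: print state and cache hit ratios for every bcache device, one line each.
# bcache_report and its optional root argument are hypothetical (used for illustration).
bcache_report() {
    local root="${1:-/sys/block}"
    local dev name
    for dev in "$root"/bcache*; do
        # Skip the unexpanded glob pattern when no bcache device exists.
        [ -d "$dev/bcache" ] || continue
        name="${dev##*/}"
        printf '%s state=%s hit_5min=%s%% hit_hour=%s%%\n' \
            "$name" \
            "$(cat "$dev/bcache/state")" \
            "$(cat "$dev/bcache/stats_five_minute/cache_hit_ratio")" \
            "$(cat "$dev/bcache/stats_hour/cache_hit_ratio")"
    done
}
```

Running `bcache_report` without arguments reads from `/sys/block`; it prints nothing if no bcache device is present.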

Tune bcache (run as root; the settings are lost on reboot):

    echo 64M > /sys/block/bcache0/bcache/sequential_cutoff
    echo 4096 > /sys/block/bcache0/queue/read_ahead_kb
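
The two writes can be wrapped in a small function that also reads the values back to confirm that they were accepted. This is only a sketch: `tune_bcache` and its parameters are hypothetical names, with defaults taken from the commands above.

```shell
# Sketch: apply both tuning values to one bcache device and print what is reported back.
# tune_bcache and its parameters are hypothetical; defaults mirror the commands above.
tune_bcache() {
    local dev="${1:-/sys/block/bcache0}"
    local cutoff="${2:-64M}"
    local read_ahead_kb="${3:-4096}"
    # Writing to sysfs requires root; stop at the first failed write.
    echo "$cutoff" > "$dev/bcache/sequential_cutoff" || return 1
    echo "$read_ahead_kb" > "$dev/queue/read_ahead_kb" || return 1
    echo "sequential_cutoff=$(cat "$dev/bcache/sequential_cutoff")"
    echo "read_ahead_kb=$(cat "$dev/queue/read_ahead_kb")"
}
```

Because the values do not survive a reboot, the function would have to be called again after each boot, from a root shell.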

Fan-temperature control for GPUs

For a multiuser setup