Nicolas 250b0da175 superfans-gpu-controller link added 2022-03-21 18:09:57 +01:00
README.md superfans-gpu-controller link added 2022-03-21 18:09:57 +01:00
id_rsa_nicolas.pub pub key added, files renamed, disk setup added 2022-03-16 01:35:27 +01:00
jupyterhub.service init 2022-03-12 01:42:12 +01:00
jupyterhub_config.py init 2022-03-12 01:42:12 +01:00
setup_apps.sh cantera added and bcash-script fixed 2022-03-21 17:44:25 +01:00
setup_bcache.sh cantera added and bcash-script fixed 2022-03-21 17:44:25 +01:00
setup_cuda.sh cloud-init deactivation added 2022-03-21 17:04:15 +01:00


Setup for TensorFlow with GPU

Tested on Ubuntu 20.04.4

Steps:

  1. Prepare setup:

    git clone https://repos.nonan.net/nicolas/gpu_server_setup.git
    cd gpu_server_setup
    
  2. Set up the NVIDIA driver and CUDA:

    sudo bash setup_cuda.sh
    
  3. Reboot system:

    sudo systemctl reboot
    
  4. Set up bcache:

    sudo bash setup_bcache.sh
    
  5. Set up apps (Python, JupyterHub, TensorFlow, etc.; the Hub runs as root):

    sudo bash setup_apps.sh
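
Once all steps have finished, a quick sanity check can confirm that the driver and TensorFlow both see the GPUs. The following is a minimal sketch and not part of the repository's scripts; the function name `check_gpu_stack` is hypothetical, and it assumes `python3` is the interpreter that `setup_apps.sh` installed TensorFlow into.

```shell
#!/usr/bin/env bash
# Post-install sanity check (sketch; not part of the repository's scripts).

check_gpu_stack() {
    # 1. Driver: nvidia-smi ships with the NVIDIA driver; -L lists the GPUs.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L || echo "nvidia-smi found, but no GPU is reachable"
    else
        echo "nvidia-smi not found: driver missing or reboot pending"
    fi

    # 2. TensorFlow: list the GPUs it can see (assumes python3 is the
    #    interpreter TensorFlow was installed into).
    if python3 -c "import tensorflow" >/dev/null 2>&1; then
        python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
    else
        echo "TensorFlow is not importable from python3"
    fi
}

check_gpu_stack
```

An empty list from TensorFlow while `nvidia-smi -L` shows devices usually points at a CUDA library mismatch rather than a driver problem.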
    

Notes

CUDA

Check the state of the NVIDIA devices (power draw, temperature, memory usage, etc.):

nvidia-smi
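
For logging or scripting, the same values can be requested as CSV instead of the default table. This is a sketch: the wrapper name `gpu_status` and the variable `GPU_FIELDS` are made up for illustration, while the field names are standard `nvidia-smi` query properties (listed by `nvidia-smi --help-query-gpu`).

```shell
#!/usr/bin/env bash
# Print one CSV line per GPU with power, temperature and memory figures.

GPU_FIELDS="index,name,temperature.gpu,power.draw,memory.used,memory.total"

gpu_status() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found" >&2
        return 1
    fi
    nvidia-smi --query-gpu="$GPU_FIELDS" --format=csv
}

gpu_status || true
# To refresh every 5 seconds instead:
#   nvidia-smi --query-gpu="$GPU_FIELDS" --format=csv -l 5
```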

bcache

Check bcache state and cache hit ratios:

cat /sys/block/bcache0/bcache/state
cat /sys/block/bcache*/bcache/stats_five_minute/cache_hit_ratio
cat /sys/block/bcache*/bcache/stats_hour/cache_hit_ratio
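
With more than one bcache device, the bare `cat` output does not show which value belongs to which device. A small loop can label each line. This is a sketch; `bcache_report` and the `SYSFS_BLOCK` override are hypothetical names, the latter only there so the function can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# Summarize state and cache hit ratios for every bcache device.
# SYSFS_BLOCK is a hypothetical override for testing; on a real
# system it stays at /sys/block.

bcache_report() {
    local root="${SYSFS_BLOCK:-/sys/block}" dev name found=0
    for dev in "$root"/bcache*; do
        # Skip the unmatched glob and anything that is not a bcache device.
        [ -d "$dev/bcache" ] || continue
        found=1
        name=$(basename "$dev")
        printf '%s state=%s hit_5min=%s%% hit_hour=%s%%\n' \
            "$name" \
            "$(cat "$dev/bcache/state")" \
            "$(cat "$dev/bcache/stats_five_minute/cache_hit_ratio")" \
            "$(cat "$dev/bcache/stats_hour/cache_hit_ratio")"
    done
    [ "$found" -eq 1 ] || echo "no bcache devices found under $root"
}

bcache_report
```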

Tune bcache (run as root; settings are lost on reboot):

echo 64M > /sys/block/bcache0/bcache/sequential_cutoff
echo 4096 > /sys/block/bcache0/queue/read_ahead_kb
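
Because these values reset at boot, one option is to keep them in a small script and run it again after each start, for example from a boot-time hook of your choice. The sketch below is an assumption and not part of this repository: `apply_bcache_tuning` and `SYSFS_BLOCK` are hypothetical names, and unlike the commands above it applies the values to every `bcache*` device rather than only `bcache0`.

```shell
#!/usr/bin/env bash
# Re-apply the tuning values above to every bcache device.
# Must run as root on a real system. SYSFS_BLOCK is a hypothetical
# override for dry runs against a scratch directory.

apply_bcache_tuning() {
    local root="${SYSFS_BLOCK:-/sys/block}" dev
    for dev in "$root"/bcache*; do
        # Skip devices that are missing or not writable by this user.
        [ -w "$dev/bcache/sequential_cutoff" ] || continue
        echo 64M  > "$dev/bcache/sequential_cutoff"
        echo 4096 > "$dev/queue/read_ahead_kb"
        echo "tuned $(basename "$dev")"
    done
}

apply_bcache_tuning
```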

Fan-temperature control for GPUs