Setup for TensorFlow with GPU

Tested on Ubuntu 20.04.4

Steps:

  1. Clone the repository:

    git clone https://repos.nonan.net/nicolas/gpu_server_setup.git
    cd gpu_server_setup
    
  2. Set up the driver and CUDA:

    sudo bash setup_cuda.sh
    
  3. Reboot the system:

    sudo systemctl reboot
    
  4. Set up bcache:

    sudo bash setup_bcache.sh
    
  5. Set up apps (Python, JupyterHub, TensorFlow, etc.; note that the Hub runs as root):

    sudo bash setup_apps.sh
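
Once step 5 has finished, it can be useful to confirm that TensorFlow actually sees a GPU. The following is a minimal sketch, not part of the setup scripts: `check_tf_gpu` is a hypothetical helper name, and it assumes TensorFlow 2.x is importable from the interpreter passed as the first argument (`python3` by default, which may differ from the environment JupyterHub uses).

```shell
# Sketch: check whether TensorFlow can see at least one GPU.
# check_tf_gpu is a hypothetical helper, not provided by setup_apps.sh.
check_tf_gpu() {
    ctg_py="${1:-python3}"   # interpreter to use; python3 is an assumption
    if ! command -v "$ctg_py" >/dev/null 2>&1; then
        echo "interpreter not found: $ctg_py" >&2
        return 2
    fi
    # Exit status 0 if TensorFlow lists at least one GPU, 1 if it lists none.
    # A failed import also ends with a non-zero status.
    "$ctg_py" -c '
import sys
import tensorflow as tf
gpus = tf.config.list_physical_devices("GPU")
print(gpus)
sys.exit(0 if gpus else 1)
'
}
```

Calling `check_tf_gpu` prints the list of detected GPU devices; an empty list or a non-zero exit status suggests that the driver, CUDA, or TensorFlow installation needs another look.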
    

Notes

CUDA

Check the state of the NVIDIA devices (power draw, temperature, memory usage, etc.):

    nvidia-smi
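
For logging or scripting, a compact CSV view of only the values mentioned above may be easier to work with than the full table. This is a sketch under the assumption that the installed driver supports the listed query fields (`nvidia-smi --help-query-gpu` shows what is available); `gpu_status` is a hypothetical helper name.

```shell
# Sketch: print index, name, temperature, power draw and memory use per GPU as CSV.
# gpu_status is a hypothetical helper name.
gpu_status() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the NVIDIA driver installed?" >&2
        return 127
    fi
    nvidia-smi \
        --query-gpu=index,name,temperature.gpu,power.draw,memory.used,memory.total \
        --format=csv
}
```

The output has one header line followed by one line per GPU, so it can be appended to a log file or parsed with standard text tools.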

bcache

Check bcache performance:

    cat /sys/block/bcache0/bcache/state
    cat /sys/block/bcache*/bcache/stats_five_minute/cache_hit_ratio
    cat /sys/block/bcache*/bcache/stats_hour/cache_hit_ratio
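
With more than one bcache device, the bare `cat` output does not say which value belongs to which device. The sketch below reads the same three sysfs files and prints one labelled line per device. `bcache_report` is a hypothetical helper name, and the optional argument (an alternative sysfs root) exists only so the function can be tried without a real bcache device.

```shell
# Sketch: one summary line per bcache device, built from the sysfs files above.
# bcache_report is a hypothetical helper; $1 optionally overrides /sys/block.
bcache_report() {
    bcr_root="${1:-/sys/block}"
    for bcr_dev in "$bcr_root"/bcache*; do
        # Skip the unexpanded glob pattern when no bcache device exists.
        [ -d "$bcr_dev/bcache" ] || continue
        bcr_name=$(basename "$bcr_dev")
        bcr_state=$(cat "$bcr_dev/bcache/state")
        bcr_5m=$(cat "$bcr_dev/bcache/stats_five_minute/cache_hit_ratio")
        bcr_1h=$(cat "$bcr_dev/bcache/stats_hour/cache_hit_ratio")
        echo "$bcr_name state=$bcr_state hit_ratio_5min=$bcr_5m hit_ratio_hour=$bcr_1h"
    done
}
```

Running `bcache_report` without arguments inspects `/sys/block`; it prints nothing if no bcache device is present.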

Tune bcache (requires root; not permanent, so the values must be re-applied after a reboot):

    echo 64M > /sys/block/bcache0/bcache/sequential_cutoff
    echo 4096 > /sys/block/bcache0/queue/read_ahead_kb
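
Because these settings do not survive a reboot, it may help to keep them in a small function that can be re-run when needed. The following is a sketch only: `bcache_tune` is a hypothetical helper name, the values are the ones shown above, and the optional second argument (an alternative sysfs root) is there purely for dry runs.

```shell
# Sketch: re-apply the tuning values shown above to one bcache device.
# bcache_tune is a hypothetical helper.
#   $1: device name (default: bcache0)
#   $2: sysfs root  (default: /sys/block; override only for dry runs)
bcache_tune() {
    bct_dev="${1:-bcache0}"
    bct_root="${2:-/sys/block}"
    echo 64M  > "$bct_root/$bct_dev/bcache/sequential_cutoff" || return 1
    echo 4096 > "$bct_root/$bct_dev/queue/read_ahead_kb" || return 1
}
```

The redirections are performed by the calling shell, so the function has to run in a root shell; prefixing `echo` with `sudo` is not enough. From a non-root shell, a common alternative for a single value is `echo 64M | sudo tee /sys/block/bcache0/bcache/sequential_cutoff`.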

Fan-temperature control for GPUs