# NVIDIA GPU-based FAN controller for SUPERMICRO server
This controller automatically adjusts fan speeds in SUPERMICRO servers based on GPU temperature. Only NVIDIA GPUs are supported, since the tool reads GPU temperatures by parsing `nvidia-smi` output. Fans are controlled through the IPMI tool (`ipmitool`) using a modified version of the superfans script (https://github.com/putnam/superfans).
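
As a rough illustration of the temperature-reading step, the sketch below queries `nvidia-smi` for one temperature per GPU and parses the result. The function names (`parse_gpu_temperatures`, `read_gpu_temperatures`) are hypothetical and the actual script may parse `nvidia-smi` output differently; only the `--query-gpu=temperature.gpu` and `--format=csv,noheader,nounits` options are standard `nvidia-smi` flags.

```python
import subprocess

# Ask nvidia-smi for one bare integer temperature (in °C) per GPU, one per line.
NVIDIA_SMI_QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_temperatures(output):
    """Turn nvidia-smi query output (text) into a list of integer temperatures."""
    temps = []
    for line in output.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            temps.append(int(line))
    return temps


def read_gpu_temperatures():
    """Run nvidia-smi and return the current temperature of every GPU.

    Hypothetical helper: requires the NVIDIA driver tools to be installed.
    """
    output = subprocess.check_output(NVIDIA_SMI_QUERY)
    return parse_gpu_temperatures(output.decode("utf-8"))
```

Keeping the parsing separate from the `subprocess` call makes it possible to test the parser on a machine without a GPU.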
# Requirements
* Linux (tested on Ubuntu 18.04)
* Python 2.7
* NVIDIA drivers/tools (`nvidia-smi`)
* IPMI tool (`ipmitool`) with the `ipmi_devintf` kernel module loaded (`modprobe ipmi_devintf`)

Tested on SUPERMICRO 4029GP TRT2 with RTX 2080 Ti (NVIDIA 415.27 drivers).
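
The requirements above can be checked before starting the controller. The following is a minimal preflight sketch, not part of the project: `find_on_path` and `missing_requirements` are hypothetical helpers, and the list of IPMI device nodes is an assumption about where the `ipmi_devintf` module typically exposes the device.

```python
import os

# Assumption: device nodes commonly created once ipmi_devintf is loaded.
IPMI_DEVICE_NODES = ("/dev/ipmi0", "/dev/ipmi/0", "/dev/ipmidev/0")


def find_on_path(name, path=None):
    """Return the full path of an executable found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_requirements(find=find_on_path, exists=os.path.exists):
    """Return a list of human-readable names for requirements that are missing."""
    missing = []
    for tool in ("nvidia-smi", "ipmitool"):
        if find(tool) is None:
            missing.append(tool)
    if not any(exists(node) for node in IPMI_DEVICE_NODES):
        missing.append("IPMI device node (try: modprobe ipmi_devintf)")
    return missing
```

Calling `missing_requirements()` with no arguments inspects the real system; an empty list means both tools and an IPMI device node were found. The `find` and `exists` parameters exist only so the check can be exercised without the hardware.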
# Usage
Call the Python script directly (requires sudo access for `ipmitool`):
```bash
sudo python superfans_gpu_controller.py
```
Or install it as a systemd service (`superfans-gpu-controller.service`):
```bash
sudo chmod +x ./install_daemon.sh
sudo ./install_daemon.sh
```
The service is registered to start at system startup. Start, stop, and inspect it using:
```bash
# start
sudo systemctl start superfans-gpu-controller
# stop
sudo systemctl stop superfans-gpu-controller
# check the status
sudo systemctl status superfans-gpu-controller
# view logs (following new entries)
sudo journalctl -f -u superfans-gpu-controller
```
# Settings
The settings are currently hardcoded in `superfans_gpu_controller.py` (TODO: move them into a config file). GPU temperature maps to fan speed according to the following table:
* 0°C => 25%
* 60°C => 30%
* 70°C => 36%
* 80°C => 40%
* 85°C => 45%
* 90°C => 50%

At full workload with 4x RTX 2080 Ti, this keeps the GPU temperature at around 75°C to 80°C.
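
One way to read the table above is as a step function: each row gives the fan speed that applies from that temperature upward, until the next row takes over. The sketch below implements that reading. It is an assumption, since the script may instead interpolate between rows, and the names `FAN_TABLE` and `fan_percent_for` are hypothetical.

```python
# (temperature threshold in °C, fan speed in %), sorted by threshold.
# Values are taken from the table above.
FAN_TABLE = [(0, 25), (60, 30), (70, 36), (80, 40), (85, 45), (90, 50)]


def fan_percent_for(temp_c, table=FAN_TABLE):
    """Return the fan speed for a temperature using a step-function lookup.

    Assumption: the highest threshold that is <= temp_c wins; temperatures
    below the first threshold fall back to the first row's speed.
    """
    percent = table[0][1]
    for threshold, value in table:
        if temp_c >= threshold:
            percent = value
        else:
            break
    return percent
```

Under this reading, `fan_percent_for(75)` returns `36` and anything at or above 90°C returns `50`. With several GPUs, a natural choice (again an assumption) is to drive the fans from the hottest card, for example `fan_percent_for(max(temps))`.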