Refactor structure for essential steps

Varun Vasudeva
2024-08-21 01:18:14 -05:00
parent 3a98385e0c
commit 2d252cbfc8

122
README.md

@@ -11,6 +11,15 @@ _TL;DR_: A guide to setting up a fully local and private language model server a
- [System Requirements](#system-requirements) - [System Requirements](#system-requirements)
- [Prerequisites](#prerequisites) - [Prerequisites](#prerequisites)
- [Essential Setup](#essential-setup) - [Essential Setup](#essential-setup)
- [General](#general)
- [Drivers](#drivers)
- [Nvidia GPUs](#nvidia-gpus)
- [AMD GPUs](#amd-gpus)
- [Ollama](#ollama)
- [Startup Script](#startup-script)
- [Scheduling Startup Script](#scheduling-startup-script)
- [Configuring Script Permissions](#configuring-script-permissions)
- [Configuring Auto-Login](#configuring-auto-login)
- [Additional Setup](#additional-setup) - [Additional Setup](#additional-setup)
- [SSH](#ssh) - [SSH](#ssh)
- [Firewall](#firewall) - [Firewall](#firewall)
@@ -20,19 +29,19 @@ _TL;DR_: A guide to setting up a fully local and private language model server a
- [Open WebUI Integration](#open-webui-integration) - [Open WebUI Integration](#open-webui-integration)
- [Downloading Voices](#downloading-voices) - [Downloading Voices](#downloading-voices)
- [Verifying](#verifying) - [Verifying](#verifying)
- [Ollama](#ollama) - [Ollama](#ollama-1)
- [Open WebUI](#open-webui-1) - [Open WebUI](#open-webui-1)
- [OpenedAI Speech](#openedai-speech-1) - [OpenedAI Speech](#openedai-speech-1)
- [Updating](#updating) - [Updating](#updating)
- [General](#general) - [General](#general-1)
- [Nvidia Drivers \& CUDA](#nvidia-drivers--cuda) - [Nvidia Drivers \& CUDA](#nvidia-drivers--cuda)
- [Ollama](#ollama-1) - [Ollama](#ollama-2)
- [Open WebUI](#open-webui-2) - [Open WebUI](#open-webui-2)
- [OpenedAI Speech](#openedai-speech-2) - [OpenedAI Speech](#openedai-speech-2)
- [Troubleshooting](#troubleshooting) - [Troubleshooting](#troubleshooting)
- [`ssh`](#ssh-1) - [`ssh`](#ssh-1)
- [Nvidia Drivers](#nvidia-drivers) - [Nvidia Drivers](#nvidia-drivers)
- [Ollama](#ollama-2) - [Ollama](#ollama-3)
- [Open WebUI](#open-webui-3) - [Open WebUI](#open-webui-3)
- [OpenedAI Speech](#openedai-speech-3) - [OpenedAI Speech](#openedai-speech-3)
- [Monitoring](#monitoring) - [Monitoring](#monitoring)
@@ -91,75 +100,78 @@ I also recommend installing a lightweight desktop environment like XFCE for ease
## Essential Setup ## Essential Setup
1. ### Update the system ### General
- Run the following commands: Update the system by running the following commands:
``` ```
sudo apt update sudo apt update
sudo apt upgrade sudo apt upgrade
``` ```
2. ### Install drivers ### Drivers
- #### Nvidia
- Follow Nvidia's [guide on downloading CUDA Toolkit](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Debian). The instructions are specific to your machine and the website will lead you to them interactively. Now, we'll install the GPU drivers that allow programs to use the GPU's compute capabilities.
- Run the following commands:
#### Nvidia GPUs
- Follow Nvidia's [guide on downloading CUDA Toolkit](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Debian). The instructions are specific to your machine and the website will lead you to them interactively.
- Run the following commands:
``` ```
sudo apt install linux-headers-amd64 sudo apt install linux-headers-amd64
sudo apt install nvidia-driver firmware-misc-nonfree sudo apt install nvidia-driver firmware-misc-nonfree
``` ```
- Reboot the server. - Reboot the server.
- Run the following command to verify the installation: - Run the following command to verify the installation:
``` ```
nvidia-smi nvidia-smi
``` ```
- #### AMD #### AMD GPUs
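- Add the `deb` line below to `/etc/apt/sources.list`, run `apt-get update`, then run the `apt-get install` command as root: - Add the `deb` line below to `/etc/apt/sources.list`, run `apt-get update`, then run the `apt-get install` command as root: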
``` ```
deb http://deb.debian.org/debian bookworm main contrib non-free-firmware deb http://deb.debian.org/debian bookworm main contrib non-free-firmware
apt-get install firmware-amd-graphics libgl1-mesa-dri libglx-mesa0 mesa-vulkan-drivers xserver-xorg-video-all apt-get install firmware-amd-graphics libgl1-mesa-dri libglx-mesa0 mesa-vulkan-drivers xserver-xorg-video-all
``` ```
- Reboot the server. - Reboot the server.
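The `deb` line in the block above is an APT source entry rather than a command. A minimal sketch of adding it without opening an editor, assuming Debian bookworm; the helper name `add_apt_source` and the file name `amd-firmware.list` are hypothetical and not part of the guide:

```shell
#!/bin/bash
# Append an APT source line to a list file only if it is not already there.
# Usage: add_apt_source "<deb line>" "<list file>"
add_apt_source() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || echo "$line" >> "$file"
}

# Example (run as root; the list file name is an assumption):
# add_apt_source "deb http://deb.debian.org/debian bookworm main contrib non-free-firmware" \
#     /etc/apt/sources.list.d/amd-firmware.list
# apt-get update
# apt-get install firmware-amd-graphics libgl1-mesa-dri libglx-mesa0 mesa-vulkan-drivers xserver-xorg-video-all
```

If `/etc/apt/sources.list` already contains an equivalent entry, APT may warn about duplicate sources; in that case it is probably simpler to edit the existing line.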
3. ### Install `ollama` ### Ollama
Ollama, a wrapper around `llama.cpp`, serves as the inference engine and runs the language models you will download. It is installed as a service, so it starts automatically at boot. Ollama, a wrapper around `llama.cpp`, serves as the inference engine and runs the language models you will download. It is installed as a service, so it starts automatically at boot.
- Download and install `ollama` with the official install script: - Download and install `ollama` with the official install script:
``` ```
curl -fsSL https://ollama.com/install.sh | sh curl -fsSL https://ollama.com/install.sh | sh
``` ```
We want our API endpoint to be reachable by the rest of the LAN. For `ollama`, this means setting `OLLAMA_HOST=0.0.0.0` in the `ollama.service`. We want our API endpoint to be reachable by the rest of the LAN. For `ollama`, this means setting `OLLAMA_HOST=0.0.0.0` in the `ollama.service`.
- Run the following command to edit the service: - Run the following command to edit the service:
``` ```
systemctl edit ollama.service systemctl edit ollama.service
``` ```
- Find the `[Service]` section and add `Environment="OLLAMA_HOST=0.0.0.0"` under it. It should look like this: - Find the `[Service]` section and add `Environment="OLLAMA_HOST=0.0.0.0"` under it. It should look like this:
``` ```
[Service] [Service]
Environment="OLLAMA_HOST=0.0.0.0" Environment="OLLAMA_HOST=0.0.0.0"
``` ```
- Save and exit. - Save and exit.
- Reload systemd and restart the service: - Reload systemd and restart the service:
``` ```
systemctl daemon-reload systemctl daemon-reload
systemctl restart ollama systemctl restart ollama
``` ```
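`systemctl edit` stores the change as a drop-in file, so the same result can be sketched without an interactive editor. This assumes the standard drop-in location `/etc/systemd/system/ollama.service.d/override.conf` and Ollama's default port 11434; the helper name `write_ollama_override` is hypothetical:

```shell
#!/bin/bash
# Write a systemd drop-in that sets OLLAMA_HOST for ollama.service.
# Usage: write_ollama_override "<drop-in directory>"
write_ollama_override() {
    local dir="$1"
    mkdir -p "$dir"
    printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' > "$dir/override.conf"
}

# Example (run as root):
# write_ollama_override /etc/systemd/system/ollama.service.d
# systemctl daemon-reload
# systemctl restart ollama
#
# Then, from another machine on the LAN (replace <server-ip>):
# curl http://<server-ip>:11434/api/tags
```

If the `curl` request returns JSON, the endpoint is reachable from the LAN; if it hangs or is refused, check the firewall and the service status.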
> [!TIP] > [!TIP]
> If you installed `ollama` manually or don't use it as a service, remember to run `ollama serve` to properly start the server. Refer to [Ollama's troubleshooting steps](#ollama-2) if you encounter an error. > If you installed `ollama` manually or don't use it as a service, remember to run `ollama serve` to properly start the server. Refer to [Ollama's troubleshooting steps](#ollama-3) if you encounter an error.
4. ### Create the `init.bash` script ### Startup Script
This script will be run at boot to set the GPU power limit and start the server using `ollama`. We lower the GPU power limit because testing shows only a 5-15% performance decrease for a 30% reduction in power consumption. This is especially important for servers that run 24/7. In this step, we'll create a script called `init.bash`. This script will be run at boot to set the GPU power limit and start the server using `ollama`. We lower the GPU power limit because testing shows only a 5-15% performance decrease for a 30% reduction in power consumption. This is especially important for servers that run 24/7.
- Run the following commands: - Run the following commands:
``` ```
touch init.bash touch init.bash
nano init.bash nano init.bash
``` ```
- Add the following lines to the script: - Add the following lines to the script:
``` ```
#!/bin/bash #!/bin/bash
sudo nvidia-smi -pm 1 sudo nvidia-smi -pm 1
@@ -172,71 +184,71 @@ I also recommend installing a lightweight desktop environment like XFCE for ease
sudo nvidia-smi -i 0 -pl (power_limit) sudo nvidia-smi -i 0 -pl (power_limit)
sudo nvidia-smi -i 1 -pl (power_limit) sudo nvidia-smi -i 1 -pl (power_limit)
``` ```
- Save and exit the script. - Save and exit the script.
- Make the script executable: - Make the script executable:
``` ```
chmod +x init.bash chmod +x init.bash
``` ```
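The per-GPU `-pl` lines in `init.bash` can also be generated instead of written by hand. A minimal sketch, assuming every GPU should get the same limit; the helper name `power_limit_commands` and the 250 W value are hypothetical and not taken from the guide:

```shell
#!/bin/bash
# Print the nvidia-smi command that sets a power limit for each GPU index.
# Printing instead of executing makes it easy to review before applying.
# Usage: power_limit_commands <limit_watts> <gpu_index>...
power_limit_commands() {
    local limit="$1"; shift
    local idx
    for idx in "$@"; do
        echo "sudo nvidia-smi -i $idx -pl $limit"
    done
}

# Example: discover GPU indices, review the commands, then apply them.
# mapfile -t gpus < <(nvidia-smi --query-gpu=index --format=csv,noheader)
# power_limit_commands 250 "${gpus[@]}"          # dry run: print only
# power_limit_commands 250 "${gpus[@]}" | bash   # apply
```

Choose a limit within the range your card reports; `nvidia-smi -q -d POWER` shows the minimum and maximum values it accepts.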
5. ### Add `init.bash` to the crontab ### Scheduling Startup Script
Adding the `init.bash` script to the crontab will schedule it to run at boot. Adding the `init.bash` script to the crontab will schedule it to run at boot.
- Run the following command: - Run the following command:
``` ```
crontab -e crontab -e
``` ```
- Add the following line to the file: - Add the following line to the file:
``` ```
@reboot /path/to/init.bash @reboot /path/to/init.bash
``` ```
> Replace `/path/to/init.bash` with the path to the `init.bash` script. > Replace `/path/to/init.bash` with the path to the `init.bash` script.
- (Optional) Add the following line to shut down the server at 12 AM (midnight): - (Optional) Add the following line to shut down the server at 12 AM (midnight):
``` ```
0 0 * * * /sbin/shutdown -h now 0 0 * * * /sbin/shutdown -h now
``` ```
- Save and exit the file. - Save and exit the file.
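The crontab steps above can also be done without opening an editor. A minimal sketch, assuming the entry should be added only once; the helper name `with_cron_entry` is hypothetical:

```shell
#!/bin/bash
# Read a crontab on stdin and print it with the given entry appended,
# unless an identical line is already present.
# Usage: crontab -l | with_cron_entry "<entry>"
with_cron_entry() {
    local entry="$1" current
    current="$(cat)"
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -qxF -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Example (replace /path/to/init.bash with the real path):
# crontab -l 2>/dev/null | with_cron_entry "@reboot /path/to/init.bash" | crontab -
```

Running the example twice leaves a single `@reboot` line, which avoids the script being started more than once at boot.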
6. ### Give `nvidia-persistenced` and `nvidia-smi` passwordless `sudo` permissions ### Configuring Script Permissions
We want `init.bash` to run the `nvidia-smi` commands without having to enter a password. This is done by editing the `sudoers` file. We want `init.bash` to run the `nvidia-smi` commands without having to enter a password. This is done by giving `nvidia-persistenced` and `nvidia-smi` passwordless `sudo` permissions, and can be achieved by editing the `sudoers` file.
AMD users can skip this step as power limiting is not supported on AMD GPUs. AMD users can skip this step as power limiting is not supported on AMD GPUs.
- Run the following command: - Run the following command:
``` ```
sudo visudo sudo visudo
``` ```
- Add the following lines to the file: - Add the following lines to the file:
``` ```
(username) ALL=(ALL) NOPASSWD: /usr/bin/nvidia-persistenced (username) ALL=(ALL) NOPASSWD: /usr/bin/nvidia-persistenced
(username) ALL=(ALL) NOPASSWD: /usr/bin/nvidia-smi (username) ALL=(ALL) NOPASSWD: /usr/bin/nvidia-smi
``` ```
> Replace `(username)` with your username. > Replace `(username)` with your username.
- Save and exit the file.
> [!IMPORTANT] > [!IMPORTANT]
> Ensure that you add these lines AFTER `%sudo ALL=(ALL:ALL) ALL`. The order of the lines in the file matters - the last matching line will be used so if you add these lines before `%sudo ALL=(ALL:ALL) ALL`, they will be ignored. > Ensure that you add these lines AFTER `%sudo ALL=(ALL:ALL) ALL`. The order of the lines in the file matters - the last matching line will be used so if you add these lines before `%sudo ALL=(ALL:ALL) ALL`, they will be ignored.
- Save and exit the file.
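As an alternative to editing the main `sudoers` file, the same rules can go in a drop-in file. This is a sketch under assumptions: the helper name `nvidia_sudoers_rules` and the file name `/etc/sudoers.d/nvidia` are hypothetical, and it relies on your `/etc/sudoers` including `/etc/sudoers.d` after the `%sudo` line, which you should confirm on your system:

```shell
#!/bin/bash
# Print sudoers rules granting passwordless sudo for the two Nvidia binaries.
# Usage: nvidia_sudoers_rules <username>
nvidia_sudoers_rules() {
    local user="$1" bin
    for bin in /usr/bin/nvidia-persistenced /usr/bin/nvidia-smi; do
        printf '%s ALL=(ALL) NOPASSWD: %s\n' "$user" "$bin"
    done
}

# Example (run as root; replace your_username):
# nvidia_sudoers_rules your_username > /etc/sudoers.d/nvidia
# chmod 0440 /etc/sudoers.d/nvidia
# visudo -cf /etc/sudoers.d/nvidia   # syntax check before relying on it
```

Validating with `visudo -cf` matters because a malformed sudoers file can lock you out of `sudo` entirely.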
7. ### Configure auto-login ### Configuring Auto-Login
When the server boots up, we want it to automatically log in to a user account and run the `init.bash` script. This is done by configuring the `lightdm` display manager. When the server boots up, we want it to automatically log in to a user account and run the `init.bash` script. This is done by configuring the `lightdm` display manager.
- Run the following command: - Run the following command:
``` ```
sudo nano /etc/lightdm/lightdm.conf sudo nano /etc/lightdm/lightdm.conf
``` ```
- Find the following commented line. It should be in the `[Seat:*]` section. - Find the following commented line. It should be in the `[Seat:*]` section.
``` ```
# autologin-user= # autologin-user=
``` ```
- Uncomment the line and add your username: - Uncomment the line and add your username:
``` ```
autologin-user=(username) autologin-user=(username)
``` ```
> Replace `(username)` with your username. > Replace `(username)` with your username.
- Save and exit the file. - Save and exit the file.
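The uncomment-and-fill step above can be sketched as a single `sed` substitution. This assumes GNU `sed` (as shipped with Debian) and that the commented `autologin-user=` line exists in the file; the helper name `enable_autologin` is hypothetical:

```shell
#!/bin/bash
# Uncomment the autologin-user line in a lightdm.conf and set the username.
# Only lines of the form "#autologin-user=..." (optional space after "#")
# are changed; other keys such as autologin-user-timeout are left alone.
# Usage: enable_autologin "<conf file>" "<username>"
enable_autologin() {
    local conf="$1" user="$2"
    sed -i -E "s/^#[[:space:]]*autologin-user=.*/autologin-user=${user}/" "$conf"
}

# Example (run as root; back up the file first, replace your_username):
# cp /etc/lightdm/lightdm.conf /etc/lightdm/lightdm.conf.bak
# enable_autologin /etc/lightdm/lightdm.conf your_username
```

Check the result with `grep autologin-user /etc/lightdm/lightdm.conf` before rebooting.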
## Additional Setup ## Additional Setup