From 3a98385e0c6e4d0047447e9eacd796dbb8bbf7d4 Mon Sep 17 00:00:00 2001
From: Varun Vasudeva <varunvasudeva1@gmail.com>
Date: Tue, 20 Aug 2024 14:45:00 -0500
Subject: [PATCH] Update Ollama installation step, add tip for manual installs

---
 README.md | 42 ++++++++++++++++++++++++++++--------------
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index 1be6d30..e2e46fc 100644
--- a/README.md
+++ b/README.md
@@ -121,27 +121,35 @@ I also recommend installing a lightweight desktop environment like XFCE for ease
       - Reboot the server.
 
 3. ### Install `ollama`
+
+    Ollama, a wrapper around `llama.cpp`, serves as the inference engine and runs the language models you will download. It is installed as a service, so it starts automatically at boot.
+
     - Download `ollama` from the official repository:
         ```
         curl -fsSL https://ollama.com/install.sh | sh
         ```
-    - (Recommended) We want our API endpoint to be reachable by the rest of the LAN. For `ollama`, this means setting `OLLAMA_HOST=0.0.0.0` in the `ollama.service`. 
-      - Run the following command to edit the service:
+
+    We want our API endpoint to be reachable by the rest of the LAN. For `ollama`, this means setting `OLLAMA_HOST=0.0.0.0` in `ollama.service`.
+
+    - Run the following command to edit the service:
         ```
         systemctl edit ollama.service
         ```
-      - Find the `[Service]` section and add `Environment="OLLAMA_HOST=0.0.0.0"` under it. It should look like this:
+    - Find the `[Service]` section and add `Environment="OLLAMA_HOST=0.0.0.0"` under it. It should look like this:
         ```
         [Service]
         Environment="OLLAMA_HOST=0.0.0.0"
         ```
-       - Save and exit.
-       - Reload the environment.
-            ```
-            systemctl daemon-reload
-            systemctl restart ollama
-            ```
-    
+    - Save and exit.
+    - Reload the systemd configuration and restart the service:
+        ```
+        systemctl daemon-reload
+        systemctl restart ollama
+        ```
+
+    > [!TIP]
+    > If you installed `ollama` manually or don't run it as a service, remember to start the server with `ollama serve`. Refer to [Ollama's troubleshooting steps](#ollama-2) if you encounter an error.
+
 4. ### Create the `init.bash` script
 
     This script will be run at boot to set the GPU power limit and start the server using `ollama`. We set the GPU power limit lower because it has been seen in testing and inference that there is only a 5-15% performance decrease for a 30% reduction in power consumption. This is especially important for servers that are running 24/7.
@@ -156,13 +164,9 @@ I also recommend installing a lightweight desktop environment like XFCE for ease
         #!/bin/bash
         sudo nvidia-smi -pm 1
         sudo nvidia-smi -pl (power_limit)
-        ollama run (model)
-        ollama serve
         ```
         > Replace `(power_limit)` with the desired power limit in watts. For example, `sudo nvidia-smi -pl 250`.
 
-        > Replace `(model)` with the name of the model you want to run. For example, `ollama run mistral:latest`.
-
         For multiple GPUs, modify the script to set the power limit for each GPU:
         ```
         sudo nvidia-smi -i 0 -pl (power_limit)
@@ -522,6 +526,16 @@ For any service running in a container, you can check the logs by running `sudo
 - Disable Secure Boot in the BIOS if you're having trouble with the Nvidia drivers not working. For me, all packages were at the latest versions and `nvidia-detect` was able to find my GPU correctly, but `nvidia-smi` kept returning the `NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver` error. [Disabling Secure Boot](https://askubuntu.com/a/927470) fixed this for me. Better practice than disabling Secure Boot is to sign the Nvidia drivers yourself but I didn't want to go through that process for a non-critical server that can afford to have Secure Boot disabled.
 
 ### Ollama
+- If you receive the `could not connect to ollama app, is it running?` error, the `ollama` server isn't running. This can happen after a manual installation or if you chose to start it on demand rather than as a service. To start the server for the current session, run:
+    ```
+    ollama serve
+    ```
+    Then, **in a new terminal**, you should be able to use your models as usual by running:
+    ```
+    ollama run (model)
+    ```
+    For detailed instructions on _manually_ configuring `ollama` as a service (so it starts automatically at boot), read the [official documentation](https://github.com/ollama/ollama/blob/main/docs/linux.md). You shouldn't need to do this unless Ollama's automated installer doesn't work on your system.
+    
 - If you receive the `Failed to open "/etc/systemd/system/ollama.service.d/.#override.confb927ee3c846beff8": Permission denied` error from Ollama after running `systemctl edit ollama.service`, simply creating the file works to eliminate it. Use the following steps to edit the file. 
   - Run:
     ```