Running a local LLM on your desktop/server
Download LM Studio:
- Go to https://lmstudio.ai/ and download the latest version for Windows.
- Run the setup file (LM-Studio-0.2.12-Setup.exe) to install LM Studio locally.
- Once installed, LM Studio will open.
Download LLMs:
- Search for the latest Llama-2 model and download the variant that fits on your machine.
- Navigate to the chat section, select the downloaded model, and start chatting!
Enable the local inference server
- LM Studio can expose the model through a local HTTP server so you can access it programmatically.
- Navigate to the local server tab and start the server on port 1234. A quick way to verify that the server is running is sketched below.
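Since LM Studio's local server mimics the OpenAI REST API, one simple check is to list the available models. A minimal sketch, assuming the default address http://localhost:1234 and the requests package:

```python
import requests

# LM Studio's local server exposes an OpenAI-compatible REST API.
# Listing the models is a quick way to confirm the server is up.
response = requests.get("http://localhost:1234/v1/models", timeout=10)
response.raise_for_status()

for model in response.json().get("data", []):
    print(model["id"])
```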
Use the OpenAI API to talk to the model
Get my Jupyter Notebook and run it:
https://github.com/JordiCorbilla/Running_Local_LLM/blob/main/Running_LLM_Locally.ipynb
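If you just want the core call outside the notebook, a minimal sketch looks like this. It assumes the openai Python package (version 1.x) and the default port 1234; the model name and API key are placeholders, since the local server ignores the key and serves whichever model is loaded:

```python
from openai import OpenAI

# Point the OpenAI client at LM Studio's local server instead of api.openai.com.
# The API key can be any placeholder string; it is not validated locally.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio uses the currently loaded model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a local inference server is."},
    ],
    temperature=0.7,
)

print(completion.choices[0].message.content)
```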
Now you can talk to the LLM locally, and you can even expose the server externally using ngrok.
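One way to do that from Python is the pyngrok wrapper around ngrok. A minimal sketch, assuming pyngrok is installed and an ngrok account token has been configured:

```python
from pyngrok import ngrok

# Open a public HTTP tunnel to the local inference server on port 1234.
# Requires an ngrok account; set the auth token once with
# ngrok.set_auth_token("<your-token>") or via the ngrok CLI.
tunnel = ngrok.connect(1234, "http")
print(f"LM Studio server is now reachable at {tunnel.public_url}")

# Keep the tunnel open until you decide to close it.
input("Press Enter to close the tunnel...")
ngrok.disconnect(tunnel.public_url)
```

Keep in mind that anyone with the public URL can reach your server, so only expose it while you need it.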