Running a local LLM on your desktop/server
Download LM Studio:
- Go to https://lmstudio.ai/ and download the latest version for Windows.
- Run the setup file (LM-Studio-0.2.12-Setup.exe) to install LM Studio locally.
- Once installed, LM Studio will open.
Download LLMs:
- Search for the latest Llama-2 model and download the variant that fits on your machine.
- Navigate to the chat section, select the downloaded model, and start chatting!
Enable the local inference server
- LM Studio can expose the model through a local HTTP server so you can access it programmatically.
- Navigate to the local server tab and start the server on port 1234. A quick way to verify that the server is running is sketched below.
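Since LM Studio's local server mimics the OpenAI REST API, one simple check is to list the available models. A minimal sketch, assuming the default address http://localhost:1234 and the requests package:

```python
import requests

# LM Studio's local server exposes an OpenAI-compatible REST API.
# Listing the models is a quick way to confirm the server is up.
response = requests.get("http://localhost:1234/v1/models", timeout=10)
response.raise_for_status()

for model in response.json().get("data", []):
    print(model["id"])
```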
Use the OpenAI API to talk to the model
Get my Jupyter Notebook and run it:
https://github.com/JordiCorbilla/Running_Local_LLM/blob/main/Running_LLM_Locally.ipynb
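If you just want the core call outside the notebook, a minimal sketch looks like this. It assumes the openai Python package (version 1.x) and the default port 1234; the model name and API key are placeholders, since the local server ignores the key and serves whichever model is loaded:

```python
from openai import OpenAI

# Point the OpenAI client at LM Studio's local server instead of api.openai.com.
# The API key can be any placeholder string; it is not validated locally.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio uses the currently loaded model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a local inference server is."},
    ],
    temperature=0.7,
)

print(completion.choices[0].message.content)
```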
Now you can talk to the LLM locally, and you can even expose the server externally using ngrok.
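One way to do that from Python is the pyngrok wrapper around ngrok. A minimal sketch, assuming pyngrok is installed and an ngrok account token has been configured:

```python
from pyngrok import ngrok

# Open a public HTTP tunnel to the local inference server on port 1234.
# Requires an ngrok account; set the auth token once with
# ngrok.set_auth_token("<your-token>") or via the ngrok CLI.
tunnel = ngrok.connect(1234, "http")
print(f"LM Studio server is now reachable at {tunnel.public_url}")

# Keep the tunnel open until you decide to close it.
input("Press Enter to close the tunnel...")
ngrok.disconnect(tunnel.public_url)
```

Keep in mind that anyone with the public URL can reach your server, so only expose it while you need it.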