Kubernetes + Ollama: Deploy Ollama in Kubernetes in 10 Minutes

In the development and deployment of modern cloud-native applications, Kubernetes has become the most popular container orchestration tool. Ollama, a tool for downloading and running large language models with minimal setup, integrates well with Kubernetes for efficient, scalable model serving. This article shows you how to deploy Ollama in Kubernetes in 10 minutes.

Preparation

Before starting, please ensure you have completed the following preparations:

  1. Install Kubernetes: Ensure that Kubernetes is installed and configured on your machine. If not, refer to the Kubernetes official documentation for installation instructions.

  2. Install kubectl: kubectl is the command-line tool for interacting with the Kubernetes cluster. Please refer to the kubectl installation guide for installation.

  3. Container Image: This tutorial uses the official ollama/ollama image from Docker Hub; ensure your cluster can pull it. If you build a custom image instead (for example, with models baked in), push it to a registry your cluster can access.

For a quick trial, you can also use a lightweight distribution such as microk8s - https://microk8s.io/ or k3s - https://k3s.io/.
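Before proceeding, a quick sanity check (assuming kubectl is already configured to talk to your cluster) confirms the cluster is reachable and the nodes are Ready:

```shell
# Verify the API server is reachable
kubectl cluster-info

# All nodes should report STATUS "Ready"
kubectl get nodes
```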

Step 1: Write the Ollama Deployment File

First, we need to write a Kubernetes deployment file to define how the Ollama service is deployed. Create a file named ollama-deployment.yaml and add the following content:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - name: http
          containerPort: 11434
          protocol: TCP

---

apiVersion: v1
kind: Service
metadata:
  name: ollama-svc
spec:
  selector:
    app: ollama
  ports:
  - protocol: TCP
    port: 11434
    targetPort: http
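
Note that models pulled into a running Pod are stored under /root/.ollama inside the container and are lost when the Pod restarts. If you want downloaded models to survive restarts, you can optionally back that path with a persistent volume. A minimal sketch, assuming your cluster has a default StorageClass (the claim name ollama-models is illustrative):

```yaml
# Optional: a PVC to persist pulled models across Pod restarts
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-models
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
```

To use it, add a volumes entry referencing claimName ollama-models to the Deployment's Pod spec, and a matching volumeMounts entry with mountPath /root/.ollama on the ollama container.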

Step 2: Apply the Deployment File

Use the kubectl command to apply the deployment file to the Kubernetes cluster:

kubectl apply -f ollama-deployment.yaml

After executing this command, Kubernetes will create the Ollama Deployment and its Service based on the deployment file.

Step 3: Verify the Deployment

Check the deployment status with the following commands:

kubectl get deployments
kubectl get pods
kubectl get services

Or check the deployment status through the dashboard.

You should see information about the Ollama deployment, including the number of replicas, Pod status, and the service's cluster IP. Note that the Service defined above uses the default ClusterIP type, so it is only reachable from inside the cluster.
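Alternatively, a single command blocks until the rollout has finished:

```shell
# Wait until the Deployment's pods are updated and available
kubectl rollout status deployment/ollama
```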

Step 4: Access the Ollama Service

Exec into the pod and execute the following command to install the corresponding model (ollama run downloads the model if it is not already present, then starts an interactive session). Alternatively, you can build an image with the model baked in and deploy it in the previous steps, so you don't need to install the model manually.

ollama run llama3
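The step above can be sketched as follows, resolving the Pod name via the app=ollama label from the Deployment:

```shell
# Resolve the Ollama Pod name from the Deployment's label
POD=$(kubectl get pods -l app=ollama -o jsonpath='{.items[0].metadata.name}')

# Download the model without starting an interactive session
kubectl exec "$POD" -- ollama pull llama3
```

Using `ollama pull` here just downloads the model; `kubectl exec -it "$POD" -- ollama run llama3` would additionally open an interactive chat.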

Once the model is installed, other pods in the same namespace can query it through the Service using curl:

$ curl -L 'http://ollama-svc:11434/api/generate' \
> -H 'Content-Type: application/json' \
> -d '{
>     "model": "llama3",
>     "prompt": "How to handle workplace conflicts",
>     "format": "json",
>     "stream": false
> }'
{"model":"llama3","created_at":"2024-07-09T12:13:05.90114Z","response":"{ \"message\": \"Handling conflicts in the workplace is an important skill. Here are some strategies that may help you resolve conflicts: 1. **Listen to the other party** ... 2. **Stay calm** ... 3. **Find common ground** ... 4. **Use \\\"I\\\" language** ... 5. **Propose solutions** ... 6. **Seek third-party help** ... 7. **Maintain respect and honesty** ... Remember, no one likes conflicts, but sometimes they can be opportunities for growth.\" }","done":true,"done_reason":"stop","context":[...]}

(Response truncated for readability; the full output also includes the model's token context array.)
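If you want to try the API from your own machine rather than from another pod, kubectl port-forward is a convenient option. A sketch, forwarding the Service port to localhost:

```shell
# Forward the Service port to localhost (runs until interrupted)
kubectl port-forward svc/ollama-svc 11434:11434 &

# Query the API locally through the forwarded port
curl -s http://localhost:11434/api/generate \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama3", "prompt": "Hello", "stream": false}'
```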

Conclusion

Great! Through the steps above, you have successfully deployed Ollama to a Kubernetes environment. The process shows how little is needed: a Deployment, a Service, and the official image are enough to serve large language models inside the cluster. I hope this concise tutorial helps you make the most of the combination of Kubernetes and Ollama in your large model application development and deployment.

If you encounter any problems or have any questions during the deployment process, feel free to leave a comment below, and let's discuss and solve them together!

Share this content