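Docker Containerization
Dockerizing AI agent applications: consistent development environments, vector databases that are ready on demand, and deployable production builds.
Core scenarios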
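    Scenario 1: Local development → run a vector database in Docker (Milvus/Qdrant/pgvector)
    Scenario 2: Development and testing → run Ollama in Docker (CPU-only inference when no NVIDIA GPU is available)
    Scenario 3: Production deployment → multi-container orchestration: Agent service + vector store + Redis + API Gateway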
Containerizing Ollama
Minimal startup
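    # Pull the official image
    docker pull ollama/ollama:latest

    # Start Ollama in the background and expose its API on port 11434
    docker run -d \
      --name ollama \
      -p 11434:11434 \
      ollama/ollama:latest

    # Download a model inside the running container
    docker exec -it ollama ollama pull llama3.2

    # Verify that the API responds and list the installed models
    curl http://localhost:11434/api/tags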
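The same verification can be done from application code. The following is a minimal sketch, assuming Ollama is reachable at http://localhost:11434 and that /api/tags returns a JSON object whose models array holds entries with a name field. The helper names parse_model_names and list_ollama_models are illustrative, not part of any library.

```python
import json
import urllib.request


def parse_model_names(payload: dict) -> list:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = "http://localhost:11434") -> list:
    """Query a running Ollama instance for its installed models."""
    # Assumption: the container from the commands above is up and port 11434 is published.
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(json.load(resp))
```

Calling list_ollama_models() after the pull step should return a list that includes the llama3.2 model; an empty list suggests the pull has not finished or ran against a different container.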
Ollama with GPU (NVIDIA GPU servers)
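    # Requires the NVIDIA Container Toolkit on the host.
    # The named volume keeps downloaded models across container restarts.
    docker run -d \
      --name ollama \
      --gpus all \
      -p 11434:11434 \
      -v ollama_data:/root/.ollama \
      ollama/ollama:latest

    # Pull a chat model and an embedding model
    docker exec -it ollama ollama pull deepseek-r1:7b
    docker exec -it ollama ollama pull nomic-embed-text:latest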
Full development stack with Docker Compose
    services:
      ollama:
        image: ollama/ollama:latest
        container_name: ollama
        ports:
          - "11434:11434"
        volumes:
          - ollama_data:/root/.ollama
        # Remove this deploy block on hosts without an NVIDIA GPU
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: all
                  capabilities: [gpu]

      open-webui:
        image: ghcr.io/open-webui/open-webui:main
        container_name: open-webui
        ports:
          - "3005:8080"
        environment:
          - OLLAMA_BASE_URL=http://ollama:11434
        depends_on:
          - ollama

      qdrant:
        image: qdrant/qdrant:latest
        container_name: qdrant
        ports:
          - "6333:6333"
          - "6334:6334"
        volumes:
          - qdrant_data:/qdrant/storage

      redis:
        image: redis:7-alpine
        container_name: redis
        ports:
          - "6379:6379"
        volumes:
          - redis_data:/data

    volumes:
      ollama_data:
      qdrant_data:
      redis_data:
Start the stack with docker compose up -d.
Dockerfile for the Agent service
Multi-stage build (recommended)
    # Stage 1: install dependencies
    FROM python:3.12-slim AS builder
    WORKDIR /app

    # uv is a pip-compatible installer used here to speed up dependency installation
    RUN pip install uv

    COPY requirements.txt .
    RUN uv pip install --system --no-cache -r requirements.txt

    # Stage 2: runtime image
    FROM python:3.12-slim
    WORKDIR /app

    # Run as a non-root user
    RUN useradd --create-home appuser

    # Copy installed packages and console scripts from the builder stage
    COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
    COPY --from=builder /usr/local/bin /usr/local/bin
    COPY . .

    USER appuser

    # The check imports requests, so requests must be listed in requirements.txt
    HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
      CMD python -c "import requests; requests.get('http://localhost:8000/health').raise_for_status()"

    EXPOSE 8000
    CMD ["python", "app.py"]
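The HEALTHCHECK in this Dockerfile expects app.py to answer GET /health on port 8000. The source does not show app.py, so the following is only a minimal standard-library sketch of such an endpoint; HealthHandler and make_server are hypothetical names, and a real agent service would more likely expose this route through its web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with a small JSON body; everything else is 404."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Suppress per-request logging so health probes do not flood container logs.
        pass


def make_server(host: str = "0.0.0.0", port: int = 8000) -> ThreadingHTTPServer:
    """Bind to 0.0.0.0 so the port published by Docker reaches the process."""
    return ThreadingHTTPServer((host, port), HealthHandler)
```

In app.py, make_server().serve_forever() would start the listener. Binding to 127.0.0.1 instead would still satisfy the in-container health check but would make the published port unreachable from outside the container.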
Dockerfile for Spring Boot
    # Stage 1: build the JAR with Maven
    FROM eclipse-temurin:21-jdk-alpine AS builder
    WORKDIR /app
    COPY pom.xml .
    COPY src ./src
    RUN apk add --no-cache maven && \
        mvn package -DskipTests && \
        mv target/*.jar app.jar

    # Stage 2: JRE-only runtime image
    FROM eclipse-temurin:21-jre-alpine
    WORKDIR /app
    COPY --from=builder /app/app.jar .
    # Non-root user with read-only access to the JAR
    RUN adduser --disabled-password appuser && \
        chown -R appuser:appuser /app && \
        chmod 400 app.jar
    USER appuser
    EXPOSE 8080
    ENTRYPOINT ["java", "-jar", "app.jar"]
NVIDIA GPU support (using the GPU inside containers)
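    # Add the NVIDIA repository signing key
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
      | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

    # Add the apt repository, signed by the key above
    # (the stable/deb list is distribution-independent, so $distribution is informational only)
    distribution=$(. /etc/os-release && echo "$ID$VERSION_ID")
    curl -sL "https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list" \
      | sed "s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g" \
      | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

    # Install the toolkit; the pinned version string must match one listed by
    # `apt-cache madison nvidia-container-toolkit` (these often carry a suffix such as -1)
    sudo apt-get update
    sudo apt-get install -y nvidia-container-toolkit=1.19.0

    # Register the NVIDIA runtime with Docker and restart the daemon
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker

    # Verify that containers can see the GPU
    docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi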
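An agent service may want to run the same GPU check at startup and fall back to CPU inference when nothing is found. The sketch below assumes nvidia-smi is on the PATH inside the container and uses its --query-gpu and --format=csv,noheader options; parse_gpu_csv and detect_gpus are illustrative names.

```python
import shutil
import subprocess


def parse_gpu_csv(output: str) -> list:
    """Parse `name, memory.total` CSV rows as printed by nvidia-smi."""
    gpus = []
    for line in output.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, memory = (part.strip() for part in line.rsplit(",", 1))
        gpus.append({"name": name, "memory_total": memory})
    return gpus


def detect_gpus() -> list:
    """Return the visible GPUs, or an empty list when none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []
```

An empty result from detect_gpus() inside a container started with --gpus all usually points at the toolkit installation or the Docker runtime configuration rather than at the application.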
Docker networking: reaching services on the host from a container
    services:
      agent:
        build: .
        ports:
          - "8000:8000"
        extra_hosts:
          - "host.docker.internal:host-gateway"
        environment:
          - OLLAMA_BASE_URL=http://host.docker.internal:11434
          - QDRANT_URL=http://qdrant:6333
On Linux, host.docker.internal only resolves when an extra_hosts entry maps it to host-gateway; Docker Desktop on macOS and Windows resolves it automatically.
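On the application side, the agent only needs to read these two variables, which lets the same image run against a host-level Ollama or a sibling container. A minimal sketch follows; the ServiceConfig class, the load_config helper, and the localhost defaults are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceConfig:
    ollama_base_url: str
    qdrant_url: str


def load_config(env: Optional[Mapping[str, str]] = None) -> ServiceConfig:
    """Read service endpoints from the environment set in the Compose file."""
    env = os.environ if env is None else env
    return ServiceConfig(
        # Defaults (an assumption) target a process running directly on the host.
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/"),
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333").rstrip("/"),
    )
```

Stripping the trailing slash keeps later path joins such as f"{cfg.ollama_base_url}/api/tags" from producing a double slash.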
Production-grade docker-compose (reference architecture)
    services:
      agent:
        build: .
        container_name: agent-service
        ports:
          - "8000:8000"
        environment:
          - OPENAI_API_KEY=${OPENAI_API_KEY}
          - QDRANT_URL=http://qdrant:6333
          - REDIS_URL=redis://redis:6379
          - OLLAMA_BASE_URL=http://ollama:11434
        depends_on:
          - qdrant
          - redis
          - ollama
        restart: unless-stopped
        deploy:
          resources:
            limits:
              memory: 2G

      qdrant:
        image: qdrant/qdrant:latest
        container_name: qdrant
        ports:
          - "6333:6333"
        volumes:
          - qdrant_storage:/qdrant/storage
        restart: unless-stopped

      redis:
        image: redis:7-alpine
        container_name: redis
        ports:
          - "6379:6379"
        volumes:
          - redis_data:/data
        restart: unless-stopped

      ollama:
        image: ollama/ollama:latest
        container_name: ollama
        ports:
          - "11434:11434"
        volumes:
          - ollama_models:/root/.ollama
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: all
                  capabilities: [gpu]
        restart: unless-stopped

      nginx:
        image: nginx:alpine
        container_name: nginx
        ports:
          - "80:80"
          - "443:443"
        volumes:
          - ./nginx.conf:/etc/nginx/nginx.conf
        depends_on:
          - agent

    volumes:
      qdrant_storage:
      redis_data:
      ollama_models:
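The short-form depends_on used here only orders container startup; it does not wait until Qdrant, Redis, or Ollama actually accept connections. One way to cover that gap is for the agent to probe its dependencies before serving traffic. The following is a minimal sketch using a plain TCP connect; wait_for_service is a hypothetical helper, and a stricter check would call each service's own health endpoint.

```python
import socket
import time
from urllib.parse import urlparse


def wait_for_service(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until the host:port in `url` accepts TCP connections or `timeout` elapses."""
    parsed = urlparse(url)
    host, port = parsed.hostname, parsed.port
    if host is None or port is None:
        raise ValueError(f"URL must include an explicit host and port: {url}")
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet (refused, timed out, or name not resolvable); retry.
            time.sleep(interval)
    return False
```

At startup the agent could call this for each of QDRANT_URL, REDIS_URL, and OLLAMA_BASE_URL and exit with a non-zero status if any call returns False, letting restart: unless-stopped retry the container.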