Docker Containerization

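Dockerizing an AI Agent application gives you a consistent development environment, vector databases that start on demand, and a stack that can be deployed to production.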


Core scenarios

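Scenario 1: Local development → run a vector database in Docker (Milvus/Qdrant/pgvector)
Scenario 2: Development and testing → run Ollama in Docker (CPU-only inference when no NVIDIA GPU is available)
Scenario 3: Production deployment → multi-container orchestration: Agent service + vector database + Redis + API gateway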

Containerizing Ollama

Minimal startup

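# Pull and run Ollama
# (no volume is mounted here, so pulled models are lost when the container is removed)
docker pull ollama/ollama:latest
docker run -d \
  --name ollama \
  -p 11434:11434 \
  ollama/ollama:latest

# Pull a model inside the container
docker exec -it ollama ollama pull llama3.2

# Verify: list the models available locally
curl http://localhost:11434/api/tags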
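
The same verification can be scripted from Python. This is a minimal sketch, assuming the /api/tags response is a JSON object with a "models" array whose entries carry a "name" field; the helper names parse_model_names and list_local_models are illustrative and not part of any Ollama client library.

```python
import json
import urllib.request


def parse_model_names(payload: str) -> list[str]:
    """Extract model names from an /api/tags JSON response body."""
    data = json.loads(payload)
    # Assumed shape: {"models": [{"name": "llama3.2:latest", ...}, ...]}
    return [model["name"] for model in data.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Fetch /api/tags from a running Ollama container and return model names."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(resp.read().decode("utf-8"))
```

Calling list_local_models() after the pull above should return a list that includes the llama3.2 model; an empty list means no model has been pulled yet.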

Ollama with GPU (NVIDIA GPU server)

# 1. Install the NVIDIA Container Toolkit (see the "NVIDIA GPU support" section below)

# 2. Start Ollama (it detects the GPU automatically)
docker run -d \
  --name ollama \
  --gpus all \
  -p 11434:11434 \
  -v ollama_data:/root/.ollama \
  ollama/ollama:latest

# 3. Pull models
docker exec -it ollama ollama pull deepseek-r1:7b
docker exec -it ollama ollama pull nomic-embed-text:latest
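
Once both models are pulled, an agent can call them over HTTP. The sketch below is written under assumptions: it targets Ollama's /api/generate and /api/embed endpoints with non-streaming JSON bodies, and the helper names (build_request, generate_request, embed_request, send) are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: the port published above


def build_request(path: str, body: dict) -> urllib.request.Request:
    """Build a JSON POST request for an Ollama endpoint."""
    return urllib.request.Request(
        f"{OLLAMA_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_request(prompt: str, model: str = "deepseek-r1:7b") -> urllib.request.Request:
    # stream=False asks for a single JSON object instead of a stream of chunks
    return build_request("/api/generate", {"model": model, "prompt": prompt, "stream": False})


def embed_request(text: str, model: str = "nomic-embed-text:latest") -> urllib.request.Request:
    return build_request("/api/embed", {"model": model, "input": text})


def send(req: urllib.request.Request) -> dict:
    """Send a request to the running container and decode the JSON reply."""
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, send(generate_request("Hello")) should return a dictionary whose "response" field holds the generated text, and send(embed_request("Hello")) one whose "embeddings" field holds the vectors.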

Full development stack with Docker Compose

# docker-compose.yml
services:
  # Ollama local model service
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    # GPU support (Compose equivalent of --gpus all; requires the NVIDIA Container Toolkit).
    # Remove this deploy block on machines without an NVIDIA GPU.
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  # Open WebUI (optional web interface)
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3005:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

  # Qdrant vector database
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    ports:
      - "6333:6333"  # REST API
      - "6334:6334"  # gRPC API
    volumes:
      - qdrant_data:/qdrant/storage

  # Redis (Agent memory store)
  redis:
    image: redis:7-alpine
    container_name: redis
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  ollama_data:
  qdrant_data:
  redis_data:

Start the stack:

docker compose up -d
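
After the stack is up, a quick way to see which services are reachable from the host is to probe their published ports. This is a minimal sketch: the port numbers come from the compose file above, while the names SERVICES, port_open and stack_status are made up for illustration. An open TCP port only shows that the container is listening, not that the service has finished initializing.

```python
import socket

# Host-side ports published by the compose file above
SERVICES = {
    "ollama": 11434,
    "open-webui": 3005,
    "qdrant": 6333,
    "redis": 6379,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def stack_status(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its published port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}
```

Printing stack_status() gives one True/False entry per service, which is often enough to spot a container that failed to start.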

Dockerfile for the Agent service

Multi-stage build (recommended)

# Stage 1: Builder
FROM python:3.12-slim AS builder
WORKDIR /app

# Install uv (much faster dependency installation than pip)
RUN pip install --no-cache-dir uv

# Copy the dependency manifest first so this layer is cached
COPY requirements.txt .
# Install the pinned dependencies into the system site-packages (copied into the runtime image below)
RUN uv pip install --system --no-cache -r requirements.txt

# Stage 2: Runtime
FROM python:3.12-slim
WORKDIR /app

# Security: create a non-root user
RUN useradd --create-home appuser

# Copy dependencies and application code
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY . .

# Switch to the non-root user
USER appuser

# Health check (standard library only, so it does not depend on requests being installed;
# urlopen raises on 4xx/5xx responses, which makes the probe fail)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=5)"

EXPOSE 8000

CMD ["python", "app.py"]
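
The Dockerfile expects an app.py that listens on port 8000 and answers GET /health. The original does not show that file, so the following is only a hypothetical stand-in built on the standard library; a real agent would more likely expose the same route from its web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 so the HEALTHCHECK probe succeeds."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging; the health probe fires every 30 seconds
        pass


def make_server(host: str = "0.0.0.0", port: int = 8000) -> ThreadingHTTPServer:
    # Bind to 0.0.0.0 so the published port is reachable from outside the container
    return ThreadingHTTPServer((host, port), HealthHandler)


# In app.py, start serving with: make_server().serve_forever()
```

Binding to 0.0.0.0 rather than 127.0.0.1 matters here: the health check runs inside the container and would pass either way, but the port mapping only works when the server listens on all interfaces.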

Dockerfile for Spring Boot

# Stage 1: build the jar with Maven
FROM eclipse-temurin:21-jdk-alpine AS builder
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN apk add --no-cache maven && \
    mvn package -DskipTests && \
    mv target/*.jar app.jar

# Stage 2: JRE-only runtime image
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY --from=builder /app/app.jar .
# Run as a non-root user that can only read the jar (-D: create the user without a password)
RUN adduser -D appuser && \
    chown -R appuser:appuser /app && \
    chmod 400 app.jar
USER appuser
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

NVIDIA GPU support (using the graphics card inside containers)

Installing the NVIDIA Container Toolkit (Ubuntu/Debian)

# 1. Add the GPG key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

# 2. Add the package repository
curl -sL "https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list" \
  | sed "s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g" \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# 3. Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# To pin a release such as 1.19.0, append "=<version>" using the exact package
# version string listed by: apt-cache madison nvidia-container-toolkit

# 4. Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# 5. Verify
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
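
For automated checks (for example in a provisioning script), the verification step can be turned into something machine-readable. The sketch below assumes nvidia-smi's CSV query mode with the name and memory.total fields; the function names are illustrative, and gpus_in_container simply wraps the docker run command shown above.

```python
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"]


def parse_gpus(csv_text: str) -> list[dict]:
    """Parse 'name, memory_mib' lines into dictionaries."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma inside the GPU name cannot break parsing
        name, memory = (field.strip() for field in line.rsplit(",", 1))
        gpus.append({"name": name, "memory_mib": int(memory)})
    return gpus


def gpus_in_container(image: str = "nvidia/cuda:12.1.0-base-ubuntu22.04") -> list[dict]:
    """Run nvidia-smi in a throwaway GPU container and return what it sees."""
    cmd = ["docker", "run", "--rm", "--gpus", "all", image, *QUERY]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_gpus(out)
```

If the toolkit is not configured correctly, the docker run call exits with a non-zero status and subprocess.run raises CalledProcessError, which is usually the signal a setup script wants.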

Docker networking: reaching host services from a container

services:
  agent:
    build: .
    ports:
      - "8000:8000"
    extra_hosts:
      - "host.docker.internal:host-gateway"  # lets the container reach the host
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434  # Ollama running on the host
      - QDRANT_URL=http://qdrant:6333  # resolved on the internal Docker network

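On Linux, host.docker.internal only resolves when the extra_hosts entry is configured (the host-gateway value requires Docker 20.10 or later); Docker Desktop on macOS and Windows provides the name automatically.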
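
On the application side, reading these URLs from the environment keeps the same code working both inside and outside Docker. This is a minimal sketch: the variable names OLLAMA_BASE_URL and QDRANT_URL come from the compose file, while the Settings class, load_settings and the localhost defaults are assumptions for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    ollama_base_url: str
    qdrant_url: str


def load_settings(env=None) -> Settings:
    """Read service URLs from the environment, falling back to localhost defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Defaults suit running the agent directly on the host, outside Docker
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/"),
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333").rstrip("/"),
    )
```

Inside the container, the compose file supplies host.docker.internal and the qdrant service name; on a developer machine with nothing set, the defaults point at the published ports.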


Production-grade docker-compose (reference architecture)

services:
  # Agent application
  agent:
    build: .
    container_name: agent-service
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - QDRANT_URL=http://qdrant:6333
      - REDIS_URL=redis://redis:6379
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - qdrant
      - redis
      - ollama
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 2G

  # Vector database
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    ports:
      - "6333:6333"
    volumes:
      - qdrant_storage:/qdrant/storage
    restart: unless-stopped

  # Memory store
  redis:
    image: redis:7-alpine
    container_name: redis
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    restart: unless-stopped

  # Ollama (local models)
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_models:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped

  # Nginx reverse proxy (optional)
  nginx:
    image: nginx:alpine
    container_name: nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - agent

volumes:
  qdrant_storage:
  redis_data:
  ollama_models:
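
The short depends_on form used above only orders container startup; it does not wait until Qdrant, Redis or Ollama actually accept connections. One way to cope is to retry inside the agent at startup. The sketch below uses hypothetical helper names (endpoint, tcp_probe, wait_for) and parses the host and port out of the same URLs the compose file passes in.

```python
import socket
import time
from urllib.parse import urlparse

# Fallback ports when a URL omits one
DEFAULT_PORTS = {"http": 80, "https": 443, "redis": 6379}


def endpoint(url: str) -> tuple[str, int]:
    """Extract (host, port) from URLs such as redis://redis:6379."""
    parts = urlparse(url)
    return parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme]


def tcp_probe(url: str, timeout: float = 1.0) -> bool:
    """Return True if the service behind url accepts a TCP connection."""
    try:
        with socket.create_connection(endpoint(url), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for(url: str, attempts: int = 5, delay: float = 0.5,
             probe=tcp_probe, sleep=time.sleep) -> bool:
    """Retry probe(url) with exponential backoff; return True once it succeeds."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            # Wait delay, 2*delay, 4*delay, ... between attempts
            sleep(delay * 2 ** attempt)
    return False
```

At startup the agent could call wait_for on QDRANT_URL, REDIS_URL and OLLAMA_BASE_URL and exit with an error if any returns False, letting restart: unless-stopped bring it back for another try.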