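07 - PPT Builder Agent
1. Positioning
PPTBuilderAgent is a template-driven PPT generation agent. It uses a state machine to manage the full lifecycle of a PPT, from requirement analysis to finished rendering. It supports:
- Multiple user intents (CREATE / MODIFY / RESUME)
- Resumable execution (a run interrupted in any state can be resumed)
- AI image generation (slide images are generated automatically)
- Dual rendering engines (python-pptx / PptxGenJS)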
2. State Machine

```
         ┌─────────────┐
┌──────→ │    INIT     │ ── intent recognition
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │ REQUIREMENT │ requirement clarification (may go to FAILED)
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │  TEMPLATE   │ template selection
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   SEARCH    │ information gathering (optional)
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   OUTLINE   │ outline generation
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   SCHEMA    │ content filling + image generation
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   RENDER    │ rendering
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
└────────│   SUCCESS   │ done
         └─────────────┘
                │
                ↓ (any failure)
         ┌─────────────┐
         │   FAILED    │ failed (recoverable)
         └─────────────┘
```

Status enum: entity/record/pptx/PptInstStatus.java
- INIT - initialization
- REQUIREMENT - requirement clarification
- SEARCH - information gathering
- OUTLINE - outline generation
- TEMPLATE - template selection
- SCHEMA - Schema generation
- RENDER - PPT rendering
- SUCCESS - done
- FAILED - failed
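The happy path in the diagram can be read as a simple lookup table from each state to its successor. The following is a minimal sketch under that assumption, not the project's code: `Status` only mirrors the values of `PptInstStatus`, and `StateFlow`, `next`, and `stepsToSuccess` are hypothetical names introduced for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

// Mirrors the values listed for PptInstStatus (the real enum lives in the project).
enum Status { INIT, REQUIREMENT, TEMPLATE, SEARCH, OUTLINE, SCHEMA, RENDER, SUCCESS, FAILED }

// Hypothetical helper: encodes the happy-path order as drawn in the diagram.
final class StateFlow {
    private static final Map<Status, Status> NEXT = new EnumMap<>(Status.class);

    static {
        NEXT.put(Status.INIT, Status.REQUIREMENT);
        NEXT.put(Status.REQUIREMENT, Status.TEMPLATE);
        NEXT.put(Status.TEMPLATE, Status.SEARCH);
        NEXT.put(Status.SEARCH, Status.OUTLINE);
        NEXT.put(Status.OUTLINE, Status.SCHEMA);
        NEXT.put(Status.SCHEMA, Status.RENDER);
        NEXT.put(Status.RENDER, Status.SUCCESS);
    }

    private StateFlow() { }

    /** Returns the successor of {@code current}, or null for SUCCESS and FAILED. */
    static Status next(Status current) {
        return NEXT.get(current);
    }

    /** Counts the transitions left until SUCCESS; returns -1 if SUCCESS is unreachable. */
    static int stepsToSuccess(Status current) {
        int steps = 0;
        Status s = current;
        while (s != null && s != Status.SUCCESS) {
            s = NEXT.get(s);
            steps++;
        }
        return s == null ? -1 : steps;
    }
}
```

A table like this keeps the ordering in one place, so a progress indicator or a sanity check on persisted statuses does not need to duplicate it.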
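3. Core Components
3.1 State Strategy Interface
agent/pptx/strategy/PptStateStrategy.java:

```java
public interface PptStateStrategy {
    // Runs the logic for one state and streams progress to the client through the sink.
    void execute(AiPptInst inst, Sinks.Many<String> sink, String query,
                 StringBuilder thinkingBuffer, PptStateStrategyContext context);

    // The status the instance moves to when this strategy succeeds.
    PptInstStatus getTargetStatus();
}
```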
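3.2 Strategy Implementations
| Strategy class | Start state | Target state | Core logic |
|---|---|---|---|
| RequirementStrategy | INIT / REQUIREMENT | SEARCH | Clarifies the requirement; goes to FAILED if information is insufficient |
| SearchStrategy | SEARCH | OUTLINE | Calls Tavily search to gather supplementary information |
| TemplateStrategy | TEMPLATE | OUTLINE | Selects the template |
| OutlineStrategy | OUTLINE | SCHEMA | Generates the PPT outline |
| SchemaStrategy | SCHEMA | RENDER | Generates the structured Schema + AI images |
| RenderStrategy | RENDER | SUCCESS | Calls the Python rendering service |
| SuccessStrategy | SUCCESS | - | Outputs the final URL |
| FailedStrategy | FAILED | - | Outputs the error message |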
3.3 Strategy Factory
PptStateStrategyFactory (singleton):

```java
@Slf4j
public class PptStateStrategyFactory {
    // One strategy per status; INIT and REQUIREMENT are both handled by RequirementStrategy.
    private static final Map<PptInstStatus, PptStateStrategy> STRATEGY_MAP = new HashMap<>();

    static {
        STRATEGY_MAP.put(PptInstStatus.INIT, new RequirementStrategy());
        STRATEGY_MAP.put(PptInstStatus.REQUIREMENT, new RequirementStrategy());
        STRATEGY_MAP.put(PptInstStatus.TEMPLATE, new TemplateStrategy());
        STRATEGY_MAP.put(PptInstStatus.OUTLINE, new OutlineStrategy());
        STRATEGY_MAP.put(PptInstStatus.SEARCH, new SearchStrategy());
        STRATEGY_MAP.put(PptInstStatus.SCHEMA, new SchemaStrategy());
        STRATEGY_MAP.put(PptInstStatus.RENDER, new RenderStrategy());
        STRATEGY_MAP.put(PptInstStatus.SUCCESS, new SuccessStrategy());
        STRATEGY_MAP.put(PptInstStatus.FAILED, new FailedStrategy());
    }

    public PptStateStrategy getStrategy(PptInstStatus status) { ... }

    public void executeNextState(AiPptInst inst, Sinks.Many<String> sink, String query,
                                 StringBuilder thinkingBuffer, PptStateStrategyContext context) {
        // Reload the instance so the status written by the previous strategy is used.
        AiPptInst latestInst = context.getPptInstService().getById(inst.getId());

        // Clear a leftover error message (resume case) while keeping the current status.
        if (latestInst.getErrorMsg() != null && !latestInst.getErrorMsg().isEmpty()
                && latestInst.getStatusEnum() != PptInstStatus.SUCCESS) {
            context.getPptInstService().updateError(latestInst.getId(), "", latestInst.getStatusEnum());
        }

        // Dispatch to the strategy registered for the current status.
        PptStateStrategy strategy = getStrategy(latestInst.getStatusEnum());
        strategy.execute(latestInst, sink, query, thinkingBuffer, context);
    }
}
```
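The body of `getStrategy` is not shown above. As an illustration of the lookup pattern only, here is a minimal, self-contained sketch of a singleton registry that fails fast when no strategy is registered for a status. `Strategy`, `StrategyRegistry`, and the reduced `Status` enum are hypothetical stand-ins, and `RENDER` is deliberately left unregistered to show the failure path.

```java
import java.util.EnumMap;
import java.util.Map;

// Reduced stand-in for PptInstStatus.
enum Status { INIT, REQUIREMENT, RENDER, SUCCESS, FAILED }

// Simplified stand-in for PptStateStrategy: returns a label instead of streaming.
interface Strategy {
    String execute(String query);
}

// Hypothetical singleton registry keyed by status.
final class StrategyRegistry {
    private static final StrategyRegistry INSTANCE = new StrategyRegistry();

    private final Map<Status, Strategy> strategies = new EnumMap<>(Status.class);

    private StrategyRegistry() {
        // INIT and REQUIREMENT share one strategy instance.
        Strategy requirement = q -> "clarify:" + q;
        strategies.put(Status.INIT, requirement);
        strategies.put(Status.REQUIREMENT, requirement);
        strategies.put(Status.SUCCESS, q -> "done");
        strategies.put(Status.FAILED, q -> "failed");
        // RENDER is intentionally not registered in this sketch.
    }

    static StrategyRegistry getInstance() {
        return INSTANCE;
    }

    /** Looks up the strategy for a status and fails fast if none is registered. */
    Strategy getStrategy(Status status) {
        Strategy strategy = strategies.get(status);
        if (strategy == null) {
            throw new IllegalArgumentException("No strategy registered for " + status);
        }
        return strategy;
    }
}
```

Failing fast at lookup time turns a missing registration into an immediate, descriptive error rather than a `NullPointerException` at the call site.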
3.4 Strategy Context
PptStateStrategyContext carries the dependencies shared across strategies:

```java
public class PptStateStrategyContext {
    // Services and clients shared by all strategies.
    private final ChatClient chatClient;
    private final ChatModel chatModel;
    private final AiPptInstService pptInstService;
    private final AiPptTemplateService pptTemplateService;
    private final PptPythonRenderService pythonRenderService;
    private final ImageGenerationService imageGenerationService;
    private final MinioService minioService;
    private final AiSessionService sessionService;
    private final AgentTaskManager taskManager;
    private final List<ToolCallback> toolCallbacks;
    private final ChatMemory chatMemory;

    // Per-request state.
    private Long currentSessionId;
    private String currentConversationId;
    private boolean modifyMode;
    private String modifyQuery;

    public void setDisposable(...) { ... }
    public void loadChatHistory(...) { ... }
    public String createJsonResponse(String content, String type) { ... }
    public boolean shouldContinueToNextStep(String response) { ... }
    public void continueStateMachine(AiPptInst inst, Sinks.Many<String> sink, ...) { ... }
}
```
4. Intent Recognition
PptIntentRecognizer recognizes three intents:

```java
public class PptIntentRecognizer {

    public PptIntentResult recognize(String conversationId, String query) {
        AiPptInst latestInst = pptInstService.getLatestInst(conversationId);

        // No instance in this conversation yet: create a new PPT.
        if (latestInst == null) {
            return new PptIntentResult(PptIntent.CREATE_PPT, "会话中无PPT实例,默认新建");
        }

        PptInstStatus status = latestInst.getStatusEnum();
        String errorMsg = latestInst.getErrorMsg();

        // The previous run did not finish: resume from the stored status.
        if (needsResume(status, errorMsg, query)) {
            return new PptIntentResult(PptIntent.RESUME_PPT,
                    "检测到上次执行未完成,从状态 " + status + " 继续执行");
        }

        // The previous run succeeded: let the LLM classify the new query.
        if (status == PptInstStatus.SUCCESS) {
            return recognizeWithLLM(query);
        }

        // Any other case falls back to creating a new PPT.
        return new PptIntentResult(PptIntent.CREATE_PPT, "状态为 " + status);
    }

    private boolean needsResume(PptInstStatus status, String errorMsg, String query) {
        return ...;
    }
}
```

Intent enum (PptIntent):
- CREATE_PPT - create a new PPT
- MODIFY_PPT - modify an existing PPT
- RESUME_PPT - resume an interrupted PPT
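The body of `needsResume` is elided in the excerpt. The sketch below shows one way the decision order could look, assuming the rule stated in the resume section of this document (an error message is recorded and the status is not SUCCESS). It is not the project's implementation: the real method also receives the query, and `IntentRules`, `decide`, and the `llmSaysModify` flag (a stand-in for the LLM call) are hypothetical.

```java
// Mirrors the values listed for PptInstStatus and PptIntent.
enum Status { INIT, REQUIREMENT, TEMPLATE, SEARCH, OUTLINE, SCHEMA, RENDER, SUCCESS, FAILED }

enum Intent { CREATE_PPT, MODIFY_PPT, RESUME_PPT }

// Hypothetical, dependency-free version of the decision order in recognize().
final class IntentRules {
    private IntentRules() { }

    /** Assumed rule: an error is recorded and the run did not reach SUCCESS. */
    static boolean needsResume(Status status, String errorMsg) {
        return status != Status.SUCCESS && errorMsg != null && !errorMsg.isEmpty();
    }

    /**
     * @param status        status of the latest instance, or null if the conversation has none
     * @param errorMsg      error message stored on the latest instance
     * @param llmSaysModify stand-in for the LLM classification made after a successful run
     */
    static Intent decide(Status status, String errorMsg, boolean llmSaysModify) {
        if (status == null) {
            return Intent.CREATE_PPT;
        }
        if (needsResume(status, errorMsg)) {
            return Intent.RESUME_PPT;
        }
        if (status == Status.SUCCESS) {
            return llmSaysModify ? Intent.MODIFY_PPT : Intent.CREATE_PPT;
        }
        return Intent.CREATE_PPT;
    }
}
```

Keeping the rules free of service calls makes the ordering (no instance, then resume, then LLM classification, then fallback) easy to unit test.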
5. Execution Entry Point

```java
@Override
public Flux<String> execute(String conversationId, String query) {
    // checkRunningTask returns a non-null stream when this request should not start a new run.
    Flux<String> checkResult = checkRunningTask(conversationId);
    if (checkResult != null) return checkResult;

    this.currentConversationId = conversationId;

    Sinks.Many<String> sink = Sinks.many().unicast().onBackpressureBuffer();
    AgentTaskManager.TaskInfo taskInfo = registerTask(conversationId, sink);

    // Build the shared context that is handed to every strategy.
    strategyContext = new PptStateStrategyContext(
            chatClient, chatModel, pptInstService, pptTemplateService,
            pythonRenderService, imageGenerationService, minioService,
            sessionService, taskManager, toolCallbacks, chatMemory);
    strategyContext.setCurrentConversationId(conversationId);

    initTimers();
    clearUsedTools();
    currentQuestion = query;
    AiSession savedSession = sessionService.saveQuestion(...);
    currentSessionId = savedSession.getId();

    // Recognize the intent and stream it as a "thinking" event ("识别到意图" = "intent recognized").
    PptIntentResult intent = intentRecognizer.recognize(conversationId, query);
    emit(sink, finished,
            "\n🎯 识别到意图: " + intent.getIntent() + " - " + intent.getReason() + "\n",
            "thinking", thinkingBuffer);

    startStateMachine(intent, conversationId, query, sink, thinkingBuffer);

    return wrapSinkWithHandlers(sink, ...);
}
```
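`execute` returns early when `checkRunningTask` yields a stream, which suggests a per-conversation guard against overlapping runs. The internals of `AgentTaskManager` are not shown, so the following is only a sketch of such a guard under that assumption; `RunningTasks` and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical guard: at most one running task per conversation.
final class RunningTasks {
    // conversationId -> start time in milliseconds
    private final ConcurrentMap<String, Long> startedAt = new ConcurrentHashMap<>();

    /** Atomically claims the slot; returns false if a task is already registered. */
    boolean tryRegister(String conversationId) {
        return startedAt.putIfAbsent(conversationId, System.currentTimeMillis()) == null;
    }

    /** Releases the slot so a later request can start a new run. */
    void finish(String conversationId) {
        startedAt.remove(conversationId);
    }

    boolean isRunning(String conversationId) {
        return startedAt.containsKey(conversationId);
    }
}
```

`putIfAbsent` makes the check and the registration a single atomic step, so two concurrent requests for the same conversation cannot both start a run.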
6. Key Strategies in Detail
6.1 Requirement Clarification (RequirementStrategy)

```java
public void execute(AiPptInst inst, Sinks.Many<String> sink, ...) {
    // "Analyzing your requirements..."
    sink.tryEmitNext(context.createThinkingResponse("正在分析您的需求...\n"));

    // Prompt = system prompt + chat history + the current question.
    messages.add(new SystemMessage(PptBuilderPrompts.REQUIREMENT_PROMPT));
    context.loadChatHistory(inst.getConversationId(), messages, true, true);
    messages.add(new UserMessage("<question>" + query + "</question>"));

    chatClient.prompt().messages(messages).stream().content()
        .doOnNext(chunk -> {
            // Buffer the full response while streaming each chunk to the client.
            responseBuffer.append(chunk);
            sink.tryEmitNext(context.createThinkingResponse(chunk));
        })
        .doOnComplete(() -> {
            String response = ThinkTagParser.stripThinkTags(responseBuffer.toString());

            if (context.shouldContinueToNextStep(response)) {
                // Requirement is clear: persist it with the target status and move on.
                context.getPptInstService().updateRequirement(inst.getId(), response, TARGET_STATUS);
                // "Requirement confirmed, starting to gather information"
                sink.tryEmitNext(context.createThinkingResponse("\n✅ 需求已确认,开始收集相关信息\n"));
                context.continueStateMachine(inst, sink, query, thinkingBuffer);
            } else {
                // More input is needed: record the error but keep the status at REQUIREMENT.
                context.getPptInstService().updateError(inst.getId(),
                        "需要补充信息:\n" + response, PptInstStatus.REQUIREMENT);
                PptStateStrategyFactory.getInstance().executeFailedState(...);
            }
        })
        .doOnError(err -> { ... })
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
}
```

Key decision (PptStateStrategyContext.shouldContinueToNextStep):

```java
public boolean shouldContinueToNextStep(String response) { ... }
```
6.2 Outline Generation (OutlineStrategy)

```java
public void execute(...) {
    // "Generating the PPT outline..."
    sink.tryEmitNext(context.createThinkingResponse("正在生成PPT大纲...\n"));

    // Inputs persisted by the earlier states.
    String requirement = inst.getRequirement();
    String searchInfo = inst.getSearchInfo();
    String templateCode = inst.getTemplateCode();
    AiPptTemplate template = context.getPptTemplateService().getByCode(templateCode);

    String prompt = PptBuilderPrompts.getOutlinePrompt(requirement,
            template.getTemplateSchema(), template.getTemplateName(), searchInfo);

    context.getChatClient().prompt().messages(new UserMessage(prompt)).stream().content()
        .doOnNext(chunk -> {
            sink.tryEmitNext(context.createThinkingResponse(chunk));
            outlineContent.append(chunk);
        })
        .doOnComplete(() -> {
            // Persist the outline together with the target status, then continue.
            context.getPptInstService().updateOutline(inst.getId(),
                    ThinkTagParser.stripThinkTags(outlineContent.toString()), TARGET_STATUS);
            // "Outline complete, starting to design the detailed PPT content"
            sink.tryEmitNext(context.createThinkingResponse("\n✅ 大纲生成完成,开始设计PPT详细内容\n"));
            context.continueStateMachine(inst, sink, query, thinkingBuffer);
        })
        .doOnError(err -> { ... })
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
}
```
6.3 Schema Generation and Images (SchemaStrategy)

```java
public void execute(...) {
    String templateSchema = template.getTemplateSchema();
    String outline = inst.getOutline();
    String prompt = PptBuilderPrompts.getSchemaGenerationPrompt(templateSchema, outline);

    // Converts the model's JSON output into a PptSchema bean.
    BeanOutputConverter<PptSchema> converter = new BeanOutputConverter<>(...);

    Mono.fromCallable(() -> {
            String json = ThinkTagParser.stripThinkTags(
                    context.getChatModel().call(new Prompt(prompt)).getResult().getOutput().getText());
            PptSchema pptSchema = converter.convert(json);
            String pptSchemaJson = JSON.toJSONString(pptSchema);

            // Persist the schema before image generation starts.
            context.getPptInstService().updatePptSchema(inst.getId(), pptSchemaJson, TARGET_STATUS);

            // Fill in image URLs on pptSchema.
            processImageGeneration(pptSchema, sink, inst.getConversationId(), context);

            // Persist again, now including the generated image URLs.
            context.getPptInstService().updatePptSchema(inst.getId(),
                    JSON.toJSONString(pptSchema), TARGET_STATUS);
            context.continueStateMachine(inst, sink, query, thinkingBuffer);
            return null;
        })
        .doOnError(err -> { ... })
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
}
```

AI image generation flow (processImageGeneration):

```
Schema generation completes
        ↓
Iterate over every slide.data
        ↓
Find the fields with type=image/background and an empty url
        ↓
For each field:
  1. Call ImageGenerationService.generateImage(prompt)
  2. Download the generated image
  3. Upload it to MinIO
  4. Set schema.fieldData.url = MinIO URL
        ↓
Save the updated Schema
```
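The flow above can be sketched without any of the project's services. In this minimal example, `Field` and `ImageFiller` are hypothetical simplifications of the schema model, the `generateAndStore` function stands in for the generate, download, and upload steps, and using the field's `content` as the image prompt is an assumption. A failed generation leaves the URL empty and processing continues, matching the graceful degradation described in the error-handling section.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified model of one entry in slide.data.
final class Field {
    final String type;
    String url;
    final String content;

    Field(String type, String url, String content) {
        this.type = type;
        this.url = url;
        this.content = content;
    }
}

final class ImageFiller {
    private ImageFiller() { }

    /** A field needs an image when it is image-like and its url is empty. */
    static boolean needsImage(Field field) {
        boolean imageLike = "image".equals(field.type) || "background".equals(field.type);
        return imageLike && (field.url == null || field.url.isEmpty());
    }

    /**
     * Fills empty image URLs in place and returns how many fields received a URL.
     * generateAndStore maps a prompt to the stored image URL and may throw.
     */
    static int fill(List<Map<String, Field>> slides, Function<String, String> generateAndStore) {
        int filled = 0;
        for (Map<String, Field> data : slides) {
            for (Field field : data.values()) {
                if (!needsImage(field)) {
                    continue;
                }
                try {
                    String url = generateAndStore.apply(field.content);
                    if (url != null && !url.isEmpty()) {
                        field.url = url;
                        filled++;
                    }
                } catch (RuntimeException e) {
                    // Graceful degradation: keep the empty URL and move on to the next field.
                }
            }
        }
        return filled;
    }
}
```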
6.4 Rendering (RenderStrategy)

```java
public void execute(...) {
    // "Rendering the PPT..."
    sink.tryEmitNext(context.createThinkingResponse("正在渲染PPT...\n"));

    Mono.fromCallable(() -> {
            // Render from the persisted Schema through the Python rendering service.
            String pptSchemaJson = inst.getPptSchema();
            return context.getPythonRenderService().renderPpt(inst, pptSchemaJson);
        })
        .doOnSuccess(fileUrl -> {
            // Persist the file URL together with the target status, then continue.
            context.getPptInstService().updateFileUrl(inst.getId(), fileUrl, TARGET_STATUS);
            // "PPT rendering complete"
            sink.tryEmitNext(context.createThinkingResponse("✅ PPT渲染完成\n"));
            context.continueStateMachine(inst, sink, query, thinkingBuffer);
        })
        .doOnError(err -> { ... })
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
}
```
7. Data Model
7.1 AiPptInst (PPT instance)
entity/record/pptx/AiPptInst.java:
| Field | Meaning |
|---|---|
| id | Primary key |
| conversationId | Conversation ID |
| templateCode | Template code |
| status | Current status |
| query | The user's original request |
| requirement | The clarified requirement |
| searchInfo | Search results |
| outline | Outline |
| pptSchema | Final JSON Schema |
| fileUrl | URL of the rendered PPT file |
| errorMsg | Error message (used to decide whether to resume) |

7.2 PptSchema (structured Schema)
entity/record/pptx/PptSchema.java:

```json
{
  "title": "PPT title",
  "slides": [
    {
      "type": "cover",
      "title": "...",
      "data": {
        "background": {
          "type": "background",
          "url": "...",
          "content": "..."
        },
        "title": {
          "type": "text",
          "content": "..."
        }
      }
    },
    ...
  ]
}
```

The supported slide types are defined by AiPptTemplate.templateSchema.
8. Typical Request/Response
Request:

```
GET /agent/pptx/stream?query=帮我做一份关于AI在医疗领域应用的PPT&conversationId=ppt-001
```

SSE response:

```
data: {"type":"thinking","content":"\n🎯 识别到意图: CREATE_PPT - 会话中无PPT实例\n"}

data: {"type":"thinking","content":"正在分析您的需求...\n"}

data: {"type":"text","content":"我需要确认几个关键点:受众是谁?时长多少?..."}

data: {"type":"thinking","content":"⏸【暂停生成PPT】..."}

data: {"type":"thinking","content":"✅ 需求已确认,开始收集相关信息\n"}

data: {"type":"thinking","content":"🔍 正在搜索: AI医疗应用案例\n"}

data: {"type":"thinking","content":"正在生成PPT大纲...\n"}

data: {"type":"thinking","content":"1. 封面\n2. AI在医疗领域的应用概述\n..."}

data: {"type":"thinking","content":"✅ 大纲生成完成,开始设计PPT详细内容\n"}

data: {"type":"thinking","content":"正在设计PPT详细内容...\n"}

data: {"type":"thinking","content":"✅PPT内容设计完成,开始生成图片素材\n"}

data: {"type":"thinking","content":"共需生成 5 张图片\n"}

data: {"type":"thinking","content":"正在生成图片 (1/5)... \n"}

data: {"type":"thinking","content":"✅ 图片生成完成 (1/5)\n"}

...

data: {"type":"thinking","content":"✅ 素材准备就绪,开始渲染PPT\n"}

data: {"type":"thinking","content":"正在渲染PPT...\n"}

data: {"type":"thinking","content":"✅ PPT渲染完成\n"}

data: {"type":"text","content":"PPT 生成成功!\n下载链接: http://minio:19000/rag-test2/ppt/..."}
```

The request asks for a PPT about AI applications in healthcare. The `thinking` events trace, in order: the recognized intent (CREATE_PPT, no PPT instance in the conversation), requirement analysis, a pause marker (`⏸【暂停生成PPT】`, "PPT generation paused"), requirement confirmation, the search query, outline generation and the outline text, content design, image generation progress (5 images), and rendering. The `text` events carry the clarification questions (audience, duration) and, at the end, the success message with the download link.
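Each event in the stream is a JSON object with a `type` and a `content` field, and the content contains newlines and quotes that must be escaped. The body of `createJsonResponse` is not shown, and the project most likely relies on a JSON library for this; the standard-library-only sketch below merely illustrates the payload shape. `SseEvents`, `escape`, and `event` are hypothetical names.

```java
// Hypothetical helper that formats one SSE event with a JSON payload.
final class SseEvents {
    private SseEvents() { }

    /** Escapes a string for use inside a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the four-digit hex escape form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds one event: a "data:" line followed by the blank line that ends the event. */
    static String event(String type, String content) {
        return "data: {\"type\":\"" + escape(type) + "\",\"content\":\""
                + escape(content) + "\"}\n\n";
    }
}
```

A client can then split the stream on blank lines, strip the `data: ` prefix, and route `thinking` events to a progress view and `text` events to the chat transcript.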
9. Resume Mechanism
Core idea: every state persists its result to the database when it completes. On failure, errorMsg is recorded but the status is left unchanged.
Resume flow:

```
User sends a new request
        ↓
PptIntentRecognizer.recognize()
        ↓
Finds latestInst.errorMsg != null && status != SUCCESS
        ↓
Returns PptIntent.RESUME_PPT
        ↓
PptStateStrategyFactory.executeNextState()
        ↓
Clears errorMsg (allows execution to continue)
        ↓
Continues from the strategy mapped to latestInst.statusEnum
```
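The resume behavior can be illustrated with a small in-memory simulation. This is a sketch under simplifying assumptions, not the project's code: `Inst` stands in for the persisted `AiPptInst` row, `ResumableRunner` and `Step` are hypothetical, and the reduced `Status` enum is assumed to advance in declaration order.

```java
// Reduced status list; the sketch advances in declaration order.
enum Status { REQUIREMENT, OUTLINE, SCHEMA, RENDER, SUCCESS }

// Hypothetical in-memory stand-in for the persisted instance row.
final class Inst {
    Status status = Status.REQUIREMENT;
    String errorMsg = "";
}

final class ResumableRunner {
    /** Work performed for one state; may fail with any exception. */
    interface Step {
        void run(Status status) throws Exception;
    }

    private ResumableRunner() { }

    /** Runs from the stored status until SUCCESS or until the first failure. */
    static void run(Inst inst, Step step) {
        inst.errorMsg = ""; // clear the previous error so execution may continue
        while (inst.status != Status.SUCCESS) {
            try {
                step.run(inst.status);
            } catch (Exception e) {
                // Record the error but leave the status at the state that failed.
                inst.errorMsg = String.valueOf(e.getMessage());
                return;
            }
            // Persist progress only after the state has completed.
            inst.status = Status.values()[inst.status.ordinal() + 1];
        }
    }
}
```

Because the status only advances after a state completes, a second call to `run` re-executes exactly the state that failed and none of the states before it.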
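10. Error Handling
| Scenario | Handling |
|---|---|
| Template not found | updateError("模板不存在: xxx", TEMPLATE) → go to FAILED |
| Requirement clarification fails | Go to FAILED |
| Outline generation fails | updateError("大纲生成失败: xxx", OUTLINE) → go to FAILED |
| Schema generation fails | Go to FAILED |
| Rendering fails | updateError("PPT渲染失败: xxx", RENDER) → go to FAILED |
| Image generation fails | Does not go to FAILED; continues with an empty URL (graceful degradation) |

Key design: on failure only errorMsg is written and the persisted status is not changed, so the next call can be recognized as a resume.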
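11. Possible Extensions
- Multi-template support: add an AiPptTemplate template table + a template editor
- PPT style switching: add a style field that controls color scheme and layout
- Data charts: support a chart type in the Schema (bar, line, and pie charts)
- Collaborative editing: multiple people editing the same PPT instance at the same time
- Version history: save a historical version on every MODIFY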