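07 - PPT Builder Agent

1. Role

PPTBuilderAgent is a template-driven PPT generation agent. It uses a state machine to manage the full lifecycle of a PPT, from requirement analysis to finished rendering. It supports:

Multiple user intents (CREATE / MODIFY / RESUME)
Resumable execution (a run interrupted in any state can be resumed)
AI image generation (slide images are generated automatically)
Dual rendering engines (python-pptx / PptxGenJS)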

2. State Machine

         ┌─────────────┐
┌──────→ │   INIT      │ ── intent recognition
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │ REQUIREMENT │ requirement clarification (may go to FAILED)
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   TEMPLATE  │ template selection
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   SEARCH    │ information gathering (optional)
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   OUTLINE   │ outline generation
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   SCHEMA    │ content filling + image generation
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
│        │   RENDER    │ rendering
│        └──────┬──────┘
│               ↓
│        ┌─────────────┐
└────────│   SUCCESS   │ done
         └─────────────┘
                 │
                 ↓ (on any failure)
         ┌─────────────┐
         │   FAILED    │ failed (recoverable)
         └─────────────┘

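Status enum: entity/record/pptx/PptInstStatus.java

INIT - initialization
REQUIREMENT - requirement clarification
SEARCH - information gathering
OUTLINE - outline generation
TEMPLATE - template selection
SCHEMA - schema generation
RENDER - PPT rendering
SUCCESS - done
FAILED - failed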

3. Core Components

3.1 State Strategy Interface

agent/pptx/strategy/PptStateStrategy.java:

public interface PptStateStrategy {
    void execute(AiPptInst inst, Sinks.Many<String> sink, String query,
                 StringBuilder thinkingBuffer, PptStateStrategyContext context);

    PptInstStatus getTargetStatus();  // the state to move to once execution completes
}

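3.2 Strategy Implementations

Strategy class	Start state	Target state	Core logic
`RequirementStrategy`	INIT / REQUIREMENT	SEARCH	Clarifies the requirement; goes to FAILED if information is insufficient
`SearchStrategy`	SEARCH	OUTLINE	Calls Tavily search to gather supplementary information
`TemplateStrategy`	TEMPLATE	OUTLINE	Selects the template
`OutlineStrategy`	OUTLINE	SCHEMA	Generates the PPT outline
`SchemaStrategy`	SCHEMA	RENDER	Generates the structured schema + AI images
`RenderStrategy`	RENDER	SUCCESS	Calls the Python rendering service
`SuccessStrategy`	SUCCESS	-	Outputs the final URL
`FailedStrategy`	FAILED	-	Outputs the error message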
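
The start/target columns of the table above amount to a transition map. The following is a minimal sketch of that map, not the project's actual code: `SketchStatus` and `TransitionTable` are hypothetical names introduced only for illustration, and the transitions are copied from the table as written.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for PptInstStatus, declared here so the sketch is self-contained.
enum SketchStatus { INIT, REQUIREMENT, SEARCH, OUTLINE, TEMPLATE, SCHEMA, RENDER, SUCCESS, FAILED }

final class TransitionTable {
    // Start state -> target state, copied from the strategy table.
    private static final Map<SketchStatus, SketchStatus> NEXT = new EnumMap<>(SketchStatus.class);

    static {
        NEXT.put(SketchStatus.INIT, SketchStatus.SEARCH);
        NEXT.put(SketchStatus.REQUIREMENT, SketchStatus.SEARCH);
        NEXT.put(SketchStatus.SEARCH, SketchStatus.OUTLINE);
        NEXT.put(SketchStatus.TEMPLATE, SketchStatus.OUTLINE);
        NEXT.put(SketchStatus.OUTLINE, SketchStatus.SCHEMA);
        NEXT.put(SketchStatus.SCHEMA, SketchStatus.RENDER);
        NEXT.put(SketchStatus.RENDER, SketchStatus.SUCCESS);
    }

    /** Returns the states visited on the failure-free path, starting from the given state. */
    static List<SketchStatus> happyPath(SketchStatus start) {
        List<SketchStatus> path = new ArrayList<>();
        SketchStatus current = start;
        while (current != null) {
            path.add(current);
            current = NEXT.get(current); // null for terminal states (SUCCESS, FAILED)
        }
        return path;
    }
}
```

Calling `TransitionTable.happyPath(SketchStatus.INIT)` walks the chain until it reaches a state with no target, which makes it easy to check that every non-terminal state eventually leads to SUCCESS.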

3.3 Strategy Factory

PptStateStrategyFactory (singleton):

@Slf4j
public class PptStateStrategyFactory {
    private static final Map<PptInstStatus, PptStateStrategy> STRATEGY_MAP = new HashMap<>();

    static {
        STRATEGY_MAP.put(PptInstStatus.INIT, new RequirementStrategy());
        STRATEGY_MAP.put(PptInstStatus.REQUIREMENT, new RequirementStrategy());
        STRATEGY_MAP.put(PptInstStatus.TEMPLATE, new TemplateStrategy());
        STRATEGY_MAP.put(PptInstStatus.OUTLINE, new OutlineStrategy());
        STRATEGY_MAP.put(PptInstStatus.SEARCH, new SearchStrategy());
        STRATEGY_MAP.put(PptInstStatus.SCHEMA, new SchemaStrategy());
        STRATEGY_MAP.put(PptInstStatus.RENDER, new RenderStrategy());
        STRATEGY_MAP.put(PptInstStatus.SUCCESS, new SuccessStrategy());
        STRATEGY_MAP.put(PptInstStatus.FAILED, new FailedStrategy());
    }

    public PptStateStrategy getStrategy(PptInstStatus status) {
        // Returns DefaultStrategy (which reports an abnormal state) when no strategy is registered
    }

    public void executeNextState(AiPptInst inst, Sinks.Many<String> sink, String query,
                                  StringBuilder thinkingBuffer, PptStateStrategyContext context) {
        // 1. Reload the latest state from the database
        AiPptInst latestInst = context.getPptInstService().getById(inst.getId());

        // 2. Check whether this is a resume after an interruption
        if (latestInst.getErrorMsg() != null && !latestInst.getErrorMsg().isEmpty()
                && latestInst.getStatusEnum() != PptInstStatus.SUCCESS) {
            // Clear the error message so execution can continue
            context.getPptInstService().updateError(latestInst.getId(), "",
                    latestInst.getStatusEnum());
        }

        // 3. Execute the strategy registered for the current state
        PptStateStrategy strategy = getStrategy(latestInst.getStatusEnum());
        strategy.execute(latestInst, sink, query, thinkingBuffer, context);
    }
}
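
The lookup-with-fallback behavior described in `getStrategy` can be sketched as follows. This is a simplified illustration under stated assumptions, not the factory's real code: `DemoStatus`, `DemoStrategy`, and `DemoStrategyFactory` are hypothetical, and the strategy here returns a label instead of streaming to a sink.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, reduced status set for the sketch.
enum DemoStatus { INIT, REQUIREMENT, RENDER, SUCCESS, FAILED }

// Simplified, hypothetical strategy: returns a label instead of emitting to a sink.
interface DemoStrategy {
    String execute(long instId);
}

final class DemoStrategyFactory {
    private static final DemoStrategyFactory INSTANCE = new DemoStrategyFactory();

    private final Map<DemoStatus, DemoStrategy> strategies = new EnumMap<>(DemoStatus.class);
    // Fallback used when a status has no registered strategy.
    private final DemoStrategy fallback = id -> "unexpected state for instance " + id;

    private DemoStrategyFactory() {
        DemoStrategy requirement = id -> "requirement:" + id;
        strategies.put(DemoStatus.INIT, requirement);        // INIT and REQUIREMENT share one strategy
        strategies.put(DemoStatus.REQUIREMENT, requirement);
        strategies.put(DemoStatus.RENDER, id -> "render:" + id);
    }

    static DemoStrategyFactory getInstance() {
        return INSTANCE;
    }

    /** Looks up the strategy for a status, falling back to the default one. */
    DemoStrategy getStrategy(DemoStatus status) {
        return strategies.getOrDefault(status, fallback);
    }
}
```

An `EnumMap` is used here instead of a `HashMap` only because the keys are enum constants; either works for a lookup table of this size.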

3.4 Strategy Context

PptStateStrategyContext holds the dependencies shared across strategies:

public class PptStateStrategyContext {
    private final ChatClient chatClient;
    private final ChatModel chatModel;
    private final AiPptInstService pptInstService;     // PPT instance service
    private final AiPptTemplateService pptTemplateService;  // template service
    private final PptPythonRenderService pythonRenderService;  // rendering service
    private final ImageGenerationService imageGenerationService;  // image generation
    private final MinioService minioService;           // file storage
    private final AiSessionService sessionService;
    private final AgentTaskManager taskManager;
    private final List<ToolCallback> toolCallbacks;
    private final ChatMemory chatMemory;

    private Long currentSessionId;
    private String currentConversationId;
    private boolean modifyMode;          // whether the agent is in modify mode
    private String modifyQuery;          // the current modification request

    // Helper methods
    public void setDisposable(...) { ... }
    public void loadChatHistory(...) { ... }
    public String createJsonResponse(String content, String type) { ... }
    public boolean shouldContinueToNextStep(String response) { ... }
    public void continueStateMachine(AiPptInst inst, Sinks.Many<String> sink, ...) { ... }
}

4. Intent Recognition

PptIntentRecognizer recognizes three intents:

public class PptIntentRecognizer {

    public PptIntentResult recognize(String conversationId, String query) {
        AiPptInst latestInst = pptInstService.getLatestInst(conversationId);

        if (latestInst == null) {
            return new PptIntentResult(PptIntent.CREATE_PPT, "会话中无PPT实例，默认新建");
        }

        PptInstStatus status = latestInst.getStatusEnum();
        String errorMsg = latestInst.getErrorMsg();

        // 1. Resume after interruption: an error message is present, or the query contains a resume keyword
        if (needsResume(status, errorMsg, query)) {
            return new PptIntentResult(PptIntent.RESUME_PPT,
                    "检测到上次执行未完成，从状态 " + status + " 继续执行");
        }

        // 2. SUCCESS state: use the LLM to distinguish CREATE from MODIFY
        if (status == PptInstStatus.SUCCESS) {
            return recognizeWithLLM(query);
        }

        // 3. Intermediate state: default to CREATE
        return new PptIntentResult(PptIntent.CREATE_PPT, "状态为 " + status);
    }

    private boolean needsResume(PptInstStatus status, String errorMsg, String query) {
        // - An error message is present → resume
        // - The query contains "继续" (continue), "重试" (retry), or "resume" → resume
        // - Intermediate state and the query contains neither "新建" (new) nor "重新" (redo) → resume
        return ...;
    }
}

Intent enum (PptIntent):

CREATE_PPT - create a new PPT
MODIFY_PPT - modify an existing PPT
RESUME_PPT - resume an interrupted PPT
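
The three rules listed in the `needsResume` comments can be sketched as below. This is one possible reading, not the actual implementation: `IntentStatus` and `ResumeRules` are hypothetical names, "intermediate state" is assumed to mean any state other than SUCCESS, and the SUCCESS guard on the error rule is an assumption borrowed from the check in `executeNextState`.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for PptInstStatus.
enum IntentStatus { INIT, REQUIREMENT, SEARCH, OUTLINE, TEMPLATE, SCHEMA, RENDER, SUCCESS, FAILED }

final class ResumeRules {
    private static final List<String> RESUME_KEYWORDS = List.of("继续", "重试", "resume");
    private static final List<String> RESTART_KEYWORDS = List.of("新建", "重新");

    static boolean needsResume(IntentStatus status, String errorMsg, String query) {
        String q = query == null ? "" : query.toLowerCase(Locale.ROOT);

        // Rule 1: a recorded error on an unfinished instance means the last run was interrupted.
        boolean hasError = errorMsg != null && !errorMsg.isEmpty();
        if (hasError && status != IntentStatus.SUCCESS) {
            return true;
        }
        // Rule 2: the user explicitly asks to continue or retry.
        if (containsAny(q, RESUME_KEYWORDS)) {
            return true;
        }
        // Rule 3: an intermediate state resumes unless the user asks to start over.
        // Assumption: "intermediate" means any state other than SUCCESS.
        return status != IntentStatus.SUCCESS && !containsAny(q, RESTART_KEYWORDS);
    }

    private static boolean containsAny(String text, List<String> keywords) {
        return keywords.stream().anyMatch(text::contains);
    }
}
```

Under these assumptions, a SUCCESS instance with no resume keyword falls through to the LLM-based CREATE/MODIFY decision, and an intermediate instance with a restart keyword falls through to CREATE.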

5. Execution Entry Point

@Override
public Flux<String> execute(String conversationId, String query) {
    // 1. Check for a task already running on this conversation
    Flux<String> checkResult = checkRunningTask(conversationId);
    if (checkResult != null) return checkResult;

    // 2. Save the conversationId
    this.currentConversationId = conversationId;

    Sinks.Many<String> sink = Sinks.many().unicast().onBackpressureBuffer();
    AgentTaskManager.TaskInfo taskInfo = registerTask(conversationId, sink);

    // 3. Build the strategy context
    strategyContext = new PptStateStrategyContext(
            chatClient, chatModel, pptInstService, pptTemplateService,
            pythonRenderService, imageGenerationService, minioService,
            sessionService, taskManager, toolCallbacks, chatMemory);
    strategyContext.setCurrentConversationId(conversationId);

    // 4. Save the user's question
    initTimers();
    clearUsedTools();
    currentQuestion = query;
    AiSession savedSession = sessionService.saveQuestion(...);
    currentSessionId = savedSession.getId();

    // 5. Recognize the intent, then start the state machine
    PptIntentResult intent = intentRecognizer.recognize(conversationId, query);
    emit(sink, finished, "\n🎯 识别到意图: " + intent.getIntent() + " - " + intent.getReason() + "\n",
            "thinking", thinkingBuffer);

    startStateMachine(intent, conversationId, query, sink, thinkingBuffer);

    // 6. Wrap the sink with handlers
    return wrapSinkWithHandlers(sink, ...);
}

6. Key Strategies in Detail

6.1 Requirement Clarification (RequirementStrategy)

public void execute(AiPptInst inst, Sinks.Many<String> sink, ...) {
    sink.tryEmitNext(context.createThinkingResponse("正在分析您的需求...\n"));

    messages.add(new SystemMessage(PptBuilderPrompts.REQUIREMENT_PROMPT));
    context.loadChatHistory(inst.getConversationId(), messages, true, true);
    messages.add(new UserMessage("<question>" + query + "</question>"));

    chatClient.prompt().messages(messages).stream().content()
        .doOnNext(chunk -> {
            responseBuffer.append(chunk);
            sink.tryEmitNext(context.createThinkingResponse(chunk));
        })
        .doOnComplete(() -> {
            String response = ThinkTagParser.stripThinkTags(responseBuffer.toString());

            if (context.shouldContinueToNextStep(response)) {
                // Information is complete: persist the requirement and continue
                context.getPptInstService().updateRequirement(inst.getId(), response, TARGET_STATUS);
                sink.tryEmitNext(context.createThinkingResponse("\n✅ 需求已确认，开始收集相关信息\n"));
                context.continueStateMachine(inst, sink, query, thinkingBuffer);
            } else {
                // Information is insufficient: go to FAILED
                context.getPptInstService().updateError(inst.getId(),
                        "需要补充信息：\n" + response, PptInstStatus.REQUIREMENT);
                PptStateStrategyFactory.getInstance().executeFailedState(...);
            }
        })
        .doOnError(err -> { /* go to FAILED */ })
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
}

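Key decision (PptStateStrategyContext.shouldContinueToNextStep):

public boolean shouldContinueToNextStep(String response) {
    // - Contains 【开始生成PPT】 (start generating the PPT) → continue
    // - Contains 【暂停生成PPT】 (pause generating the PPT) → stop
    // - Contains "请问" (may I ask) or "请提供" (please provide) → stop
    // - Otherwise → continue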
}
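
The four rules above can be written out as a small function. This is a minimal sketch, assuming the rules are checked in the order listed so that the start marker takes priority; `ContinueRules` is a hypothetical class name, and treating a null response as empty is also an assumption.

```java
final class ContinueRules {
    static final String START_MARKER = "【开始生成PPT】";
    static final String PAUSE_MARKER = "【暂停生成PPT】";

    static boolean shouldContinueToNextStep(String response) {
        // Assumption: a missing response is treated like an empty one.
        String text = response == null ? "" : response;

        if (text.contains(START_MARKER)) {
            return true;   // the model explicitly signals that it has enough information
        }
        if (text.contains(PAUSE_MARKER)) {
            return false;  // the model explicitly asks to pause
        }
        if (text.contains("请问") || text.contains("请提供")) {
            return false;  // the response is still asking the user a question
        }
        return true;       // default: continue
    }
}
```

With this ordering, a response that asks a question but still ends with the start marker continues to the next step.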

6.2 Outline Generation (OutlineStrategy)

public void execute(...) {
    sink.tryEmitNext(context.createThinkingResponse("正在生成PPT大纲...\n"));

    String requirement = inst.getRequirement();
    String searchInfo = inst.getSearchInfo();
    String templateCode = inst.getTemplateCode();
    AiPptTemplate template = context.getPptTemplateService().getByCode(templateCode);

    String prompt = PptBuilderPrompts.getOutlinePrompt(requirement,
            template.getTemplateSchema(), template.getTemplateName(), searchInfo);

    context.getChatClient().prompt().messages(new UserMessage(prompt)).stream().content()
        .doOnNext(chunk -> {
            sink.tryEmitNext(context.createThinkingResponse(chunk));
            outlineContent.append(chunk);
        })
        .doOnComplete(() -> {
            context.getPptInstService().updateOutline(inst.getId(),
                    ThinkTagParser.stripThinkTags(outlineContent.toString()), TARGET_STATUS);
            sink.tryEmitNext(context.createThinkingResponse("\n✅ 大纲生成完成，开始设计PPT详细内容\n"));
            context.continueStateMachine(inst, sink, query, thinkingBuffer);
        })
        .doOnError(err -> { /* go to FAILED */ })
        .subscribeOn(Schedulers.boundedElastic())
        .subscribe();
}

6.3 Schema Generation and Image Generation (SchemaStrategy)

public void execute(...) {
    String templateSchema = template.getTemplateSchema();
    String outline = inst.getOutline();
    String prompt = PptBuilderPrompts.getSchemaGenerationPrompt(templateSchema, outline);

    BeanOutputConverter<PptSchema> converter = new BeanOutputConverter<>(...);

    Mono.fromCallable(() -> {
        String json = ThinkTagParser.stripThinkTags(
                context.getChatModel().call(new Prompt(prompt)).getResult().getOutput().getText());
        PptSchema pptSchema = converter.convert(json);
        String pptSchemaJson = JSON.toJSONString(pptSchema);

        context.getPptInstService().updatePptSchema(inst.getId(), pptSchemaJson, TARGET_STATUS);

        // ★ AI image generation
        processImageGeneration(pptSchema, sink, inst.getConversationId(), context);

        // Persist the schema again, now including the image URLs
        context.getPptInstService().updatePptSchema(inst.getId(), JSON.toJSONString(pptSchema), TARGET_STATUS);
        context.continueStateMachine(inst, sink, query, thinkingBuffer);
        return null;
    })
    .doOnError(err -> { /* go to FAILED */ })
    .subscribeOn(Schedulers.boundedElastic())
    .subscribe();
}

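AI image generation flow (processImageGeneration):

Schema generation completes
        ↓
Iterate over every slide.data
        ↓
Find the fields with type=image/background and an empty url
        ↓
For each field:
  1. Call ImageGenerationService.generateImage(prompt)
  2. Download the generated image
  3. Upload it to MinIO
  4. Set schema.fieldData.url = MinIO URL
        ↓
Save the updated Schema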
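
The flow above, together with the rule that a failed image does not fail the whole run, can be sketched roughly as follows. `FieldData` and `ImageFiller` are hypothetical names, and the `generateAndStore` function is an assumed stand-in for the generate, download, and upload-to-MinIO chain; the real services are not modeled here.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical field model mirroring the type / url / content shape of a slide field.
final class FieldData {
    final String type;
    String url;
    final String content; // for image fields, this is the image prompt

    FieldData(String type, String url, String content) {
        this.type = type;
        this.url = url;
        this.content = content;
    }
}

final class ImageFiller {
    private static final Set<String> IMAGE_TYPES = Set.of("image", "background");

    /**
     * Fills empty image URLs in place and returns how many fields were filled.
     * generateAndStore takes the prompt and returns the stored image URL.
     */
    static int fillImages(List<Map<String, FieldData>> slides,
                          Function<String, String> generateAndStore) {
        int filled = 0;
        for (Map<String, FieldData> data : slides) {
            for (FieldData field : data.values()) {
                boolean needsImage = IMAGE_TYPES.contains(field.type)
                        && (field.url == null || field.url.isEmpty());
                if (!needsImage) {
                    continue;
                }
                try {
                    field.url = generateAndStore.apply(field.content);
                    filled++;
                } catch (RuntimeException e) {
                    // Graceful degradation: leave the URL empty and move on to the next field.
                }
            }
        }
        return filled;
    }
}
```

Catching the failure per field rather than around the whole loop means one bad prompt costs a single image, not the entire deck.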

6.4 Rendering (RenderStrategy)

public void execute(...) {
    sink.tryEmitNext(context.createThinkingResponse("正在渲染PPT...\n"));

    Mono.fromCallable(() -> {
        String pptSchemaJson = inst.getPptSchema();
        return context.getPythonRenderService().renderPpt(inst, pptSchemaJson);
    })
    .doOnSuccess(fileUrl -> {
        context.getPptInstService().updateFileUrl(inst.getId(), fileUrl, TARGET_STATUS);
        sink.tryEmitNext(context.createThinkingResponse("✅ PPT渲染完成\n"));
        context.continueStateMachine(inst, sink, query, thinkingBuffer);
    })
    .doOnError(err -> { /* go to FAILED */ })
    .subscribeOn(Schedulers.boundedElastic())
    .subscribe();
}

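7. Data Model

7.1 AiPptInst (PPT Instance)

entity/record/pptx/AiPptInst.java:

Field	Meaning
`id`	Primary key
`conversationId`	Conversation ID
`templateCode`	Template code
`status`	Current state
`query`	The user's original request
`requirement`	The clarified requirement
`searchInfo`	Search results
`outline`	Outline
`pptSchema`	Final JSON schema
`fileUrl`	URL of the rendered PPT file
`errorMsg`	Error message (used to decide whether to resume)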

7.2 PptSchema (Structured Schema)

entity/record/pptx/PptSchema.java:

{
  "title": "PPT title",
  "slides": [
    {
      "type": "cover",         // slide type
      "title": "...",
      "data": {
        "background": {
          "type": "background",
          "url": "...",        // image URL (may be empty; filled by AI image generation)
          "content": "..."     // image prompt
        },
        "title": {
          "type": "text",
          "content": "..."
        }
      }
    },
    ...
  ]
}

The supported slide types are defined by AiPptTemplate.templateSchema.

8. Typical Request / Response

Request:

GET /agent/pptx/stream?query=帮我做一份关于AI在医疗领域应用的PPT&conversationId=ppt-001

SSE response:

data: {"type":"thinking","content":"\n🎯 识别到意图: CREATE_PPT - 会话中无PPT实例\n"}

data: {"type":"thinking","content":"正在分析您的需求...\n"}

data: {"type":"text","content":"我需要确认几个关键点：受众是谁？时长多少？..."}

data: {"type":"thinking","content":"⏸【暂停生成PPT】..."}

data: {"type":"thinking","content":"✅ 需求已确认，开始收集相关信息\n"}

data: {"type":"thinking","content":"🔍 正在搜索: AI医疗应用案例\n"}

data: {"type":"thinking","content":"正在生成PPT大纲...\n"}

data: {"type":"thinking","content":"1. 封面\n2. AI在医疗领域的应用概述\n..."}

data: {"type":"thinking","content":"✅ 大纲生成完成，开始设计PPT详细内容\n"}

data: {"type":"thinking","content":"正在设计PPT详细内容...\n"}

data: {"type":"thinking","content":"✅PPT内容设计完成，开始生成图片素材\n"}

data: {"type":"thinking","content":"共需生成 5 张图片\n"}

data: {"type":"thinking","content":"正在生成图片 (1/5)... \n"}

data: {"type":"thinking","content":"✅ 图片生成完成 (1/5)\n"}

...

data: {"type":"thinking","content":"✅ 素材准备就绪，开始渲染PPT\n"}

data: {"type":"thinking","content":"正在渲染PPT...\n"}

data: {"type":"thinking","content":"✅ PPT渲染完成\n"}

data: {"type":"text","content":"PPT 生成成功！\n下载链接: http://minio:19000/rag-test2/ppt/..."}

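9. Resume Mechanism

Core idea: each state persists its result to the database when it completes. On failure, the strategy records errorMsg but leaves the status unchanged, so the instance can later be resumed from the state that failed.

Resume flow:

The user sends a new request
        ↓
PptIntentRecognizer.recognize()
        ↓
Finds latestInst.errorMsg != null && status != SUCCESS
        ↓
Returns PptIntent.RESUME_PPT
        ↓
PptStateStrategyFactory.executeNextState()
        ↓
Clears errorMsg (so execution can continue)
        ↓
Continues from the strategy for latestInst.statusEnum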
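
The persistence rule behind this flow (advance the status on success, record only the error on failure) can be illustrated with an in-memory sketch. `ResumeStatus`, `InstRecord`, and `ResumeFlow` are hypothetical names; the real code persists through AiPptInstService rather than mutating an object.

```java
// Hypothetical, reduced status set for the sketch.
enum ResumeStatus { REQUIREMENT, SEARCH, OUTLINE, SCHEMA, RENDER, SUCCESS }

// Hypothetical in-memory stand-in for the persisted AiPptInst row.
final class InstRecord {
    ResumeStatus status;
    String errorMsg;

    InstRecord(ResumeStatus status) {
        this.status = status;
    }
}

final class ResumeFlow {
    /** A step finished: move the instance to the next state. */
    static void complete(InstRecord inst, ResumeStatus next) {
        inst.status = next;
    }

    /** A step failed: record the error but leave the status where it was. */
    static void fail(InstRecord inst, String message) {
        inst.errorMsg = message;
    }

    /** Clears a pending error and returns the state whose strategy should run next. */
    static ResumeStatus resume(InstRecord inst) {
        boolean hasError = inst.errorMsg != null && !inst.errorMsg.isEmpty();
        if (hasError && inst.status != ResumeStatus.SUCCESS) {
            inst.errorMsg = ""; // mirrors updateError(id, "", status)
        }
        return inst.status;
    }
}
```

Because the status never moves on failure, the state returned by `resume` is exactly the step that has to be retried, and everything persisted by earlier steps is still available to it.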

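10. Error Handling

Scenario	Handling
Template does not exist	`updateError("模板不存在: xxx", TEMPLATE)` → go to FAILED
Requirement clarification fails	Go to FAILED
Outline generation fails	`updateError("大纲生成失败: xxx", OUTLINE)` → go to FAILED
Schema generation fails	Go to FAILED
Rendering fails	`updateError("PPT渲染失败: xxx", RENDER)` → go to FAILED
Image generation fails	Does not go to FAILED; continues with an empty URL (graceful degradation)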