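{"id":"agent-launch-eval-pack","name":"Agent Launch Evaluation Pack","name_en":"Agent Launch Evaluation Pack","category":"agent-delivery-pack","status":"available","tagline":"Go beyond building an Agent: determine whether it is ready to ship to production.","summary":"A pre-launch evaluation system for Agent developers and service providers: 100–300 standard test cases covering hallucination, tool-call errors, multi-turn drift, prompt injection, cost, and latency, with report and risk-grading templates.","url":"https://bidmosaicads.com/products/agent-launch-eval-pack","target_users":["Agent developers","AI service providers","Enterprise engineering teams"],"use_cases":["evaluation","regression_testing","prompt_injection_testing","cost_latency_profiling"],"inputs":["Interface or run method of the Agent under test","Business scenario description","(Optional) existing conversation logs"],"outputs":["100–300 standard test cases","Tool-call / RAG hallucination / multi-turn stability test templates","Prompt injection test samples","Cost and latency statistics sheet","Pre-launch checklist","Test report template + risk-grading template"],"tech_stack":["Evaluation scripts (Python/TS)","promptfoo / in-house runner","MCP","CSV/Notion reports"],"tiers":[{"id":"starter","name":"Starter","price":299,"currency":"CNY","best_for":"People who want test samples and a checklist","includes":["Core test case samples","Pre-launch checklist","Test report template","Brief usage guide"]},{"id":"standard","name":"Standard","price":999,"currency":"CNY","best_for":"Agent developers and service providers","includes":["100–300 standard test cases","Tool-call / hallucination / multi-turn / injection test templates","Cost and latency statistics sheet","Risk-grading template","Basic commercial license"]},{"id":"commercial","name":"Commercial Delivery","price":1999,"currency":"CNY","best_for":"Teams that need to issue launch evaluation reports to clients","includes":["Everything in Standard","Deliverable test report template (full)","Commercial license","Version updates for a limited period","Optional Q&A support"]}],"price":{"starter":299,"standard":999,"commercial":1999,"currency":"CNY"},"delivery_method":["GitHub private repository","downloadable files","documentation portal"],"human_approval_required":false,"prerequisites":["Able to run evaluation scripts","Familiar with how the Agent under test is invoked"],"license":"commercial-use-with-limits","refund_policy":"Full refund within 7 days if the content has not been accessed or downloaded","version":"1.0.0","last_updated":"2026-05-21"}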