# The Agent Called the Same Failing API Five Times in a Row, and the Problem Was Never in the Code

> Author: Tony Lee
> Published: 2026-02-25
> URL: https://tonylee.im/zh-CN/blog/agent-debugging-traces-not-code/
> Reading time: 1 minute
> Language: zh-CN
> Tags: ai, ai-agents, observability, tracing, langsmith, debugging

## Canonical

https://tonylee.im/zh-CN/blog/agent-debugging-traces-not-code/

## Rollout Alternates

en: https://tonylee.im/en/blog/agent-debugging-traces-not-code/
ko: https://tonylee.im/ko/blog/agent-debugging-traces-not-code/
ja: https://tonylee.im/ja/blog/agent-debugging-traces-not-code/
zh-CN: https://tonylee.im/zh-CN/blog/agent-debugging-traces-not-code/
zh-TW: https://tonylee.im/zh-TW/blog/agent-debugging-traces-not-code/

## Description

When an agent keeps triggering the same failing API call, reading the code tells you nothing. Traces are the new source code for debugging AI agents.

## Summary

"The Agent Called the Same Failing API Five Times in a Row, and the Problem Was Never in the Code" is part of Tony Lee's ongoing coverage of AI agents, developer tools, startup strategy, and AI industry shifts.

## Outline

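- Agent code is an empty shell
- Traces are the new source code
- The underlying logic of testing has changed
- Collaboration and product analytics also happen on traces
- Conclusion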

## Content

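Something broke in production. The agent was calling the same API over and over, five times in a row. Out of habit, I opened the code first. The retry logic was fine, the function call chain looked normal, and the logs showed no errors at all. The code had no answer. Only when I opened the trace did the cause surface.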

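## Agent code is an empty shell

Open the source code of any agent and you will find a model configuration, a list of tools, and a system prompt. That is all. Which tool gets called when, and in what reasoning order, is not in the code. Teams running LangGraph agents have a line that captures this well: "You can't judge an agent's quality through code review."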

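- Same code, same input, yet the tool-call pattern differs on every run
- Branching logic is not hard-coded the way it is in `handleSubmit()`
- Run the same query ten times on GPT-5.2 and the tool-call order matches only about 40% of the time
- When something goes wrong, the code has no bug and the problem cannot be reproduced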
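
One way to put a number on "the tool-call order matches only about 40% of the time" is to run the same query several times and measure how many runs share the most common tool-call sequence. The sketch below assumes that definition; the article does not say how its figure was computed, and the function name `order_agreement_rate` and the sample runs are invented for illustration.

```python
from collections import Counter
from typing import Sequence


def order_agreement_rate(runs: Sequence[Sequence[str]]) -> float:
    """Fraction of runs whose tool-call order equals the most common order."""
    if not runs:
        raise ValueError("need at least one run")
    # Tuples are hashable, so each full sequence can be counted as one key.
    counts = Counter(tuple(run) for run in runs)
    _, most_common_count = counts.most_common(1)[0]
    return most_common_count / len(runs)


# Hypothetical tool-call sequences from ten runs of the same query.
# The data is made up so that four of ten runs agree; it is not a measurement.
runs = (
    [["search", "fetch", "summarize"]] * 4
    + [["fetch", "search", "summarize"]] * 3
    + [["search", "summarize"]] * 2
    + [["search", "fetch", "fetch", "summarize"]]
)

print(order_agreement_rate(runs))  # 0.4
```

A stricter or looser definition (for example, comparing sets of tools instead of ordered sequences) would give a different number, so the metric should be fixed before comparing models or prompts.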

In traditional software, the code is the behavior. In an agent, the code is only scaffolding; the real behavior is generated at runtime.

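## Traces are the new source code

A trace records the footprint of every step: what the agent reasoned, which tool it called, and why. Debugging, testing, and performance analysis all go through traces. When an agent sees an error and still keeps calling the same endpoint, that is a failure at the reasoning level, and only the trace can show it.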
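
The repeated-failure pattern described above can be detected mechanically once the tool calls of a trace are available as data. The following is a minimal sketch, assuming a trace has already been exported into a plain list of records; the `ToolCall` type, its fields, and the sample trace are hypothetical and do not correspond to any particular tracing product's schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: str                     # serialized arguments, e.g. a JSON string
    error: Optional[str] = None   # None means the call succeeded


def longest_failed_repeat(
    calls: List[ToolCall],
) -> Tuple[Optional[Tuple[str, str]], int]:
    """Return the (tool, args) pair with the longest run of consecutive
    failures, together with the length of that run."""
    best_key: Optional[Tuple[str, str]] = None
    best_len = 0
    current_key: Optional[Tuple[str, str]] = None
    current_len = 0
    for call in calls:
        key = (call.tool, call.args)
        if call.error is None:
            # A success breaks any failure streak.
            current_key, current_len = None, 0
        elif key == current_key:
            current_len += 1
        else:
            current_key, current_len = key, 1
        if current_len > best_len:
            best_key, best_len = current_key, current_len
    return best_key, best_len


# Hypothetical trace: one successful lookup, then the same failing call five times.
trace = [ToolCall("find_customer", '{"name": "Kim"}')] + [
    ToolCall("get_order", '{"order_id": 42}', error="404 Not Found")
    for _ in range(5)
]

key, streak = longest_failed_repeat(trace)
print(key, streak)
```

A check like this only flags the symptom. Why the agent ignored the error still has to be read from the reasoning steps recorded around those calls.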

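- Compare two traces and the impact of a prompt change is obvious at a glance
- Loading a trace in LangSmith feels as direct as setting a breakpoint
- A single trace pinpoints the exact node where the reasoning went off track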
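
Comparing two traces can be as simple as diffing their step sequences. The sketch below uses Python's standard `difflib.SequenceMatcher` on two lists of step labels, one recorded before a prompt change and one after. The step labels and the `diff_steps` helper are assumptions made for illustration; a real trace would carry far more detail per step.

```python
import difflib
from typing import List


def diff_steps(before: List[str], after: List[str]) -> List[str]:
    """List steps removed from `before` ("- ") and added in `after` ("+ ")."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changes: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changes += [f"- {step}" for step in before[i1:i2]]
        if tag in ("insert", "replace"):
            changes += [f"+ {step}" for step in after[j1:j2]]
    return changes


# Hypothetical step sequences for the same query under two prompt versions.
before = ["plan", "call:get_order", "call:get_order", "call:get_order", "answer"]
after = ["plan", "call:get_order", "call:search_orders", "answer"]

for line in diff_steps(before, after):
    print(line)
```

In this made-up example the repeated `call:get_order` steps appear as removals and `call:search_orders` as an addition, which is the kind of signal a prompt change is expected to produce.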

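## The underlying logic of testing has changed

Agents are non-deterministic and need continuous evaluation in production. Without trace collection, eval datasets, and a drift-detection pipeline, you cannot operate agents at scale.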

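- An automated eval pipeline samples production traces every week
- Pre-launch testing alone cannot guarantee the quality of a non-deterministic system
- Monitoring that is not connected to traces only tells you whether the server is still alive
- An agent can be "running normally" while doing exactly the wrong thing, and only traces can reveal that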
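
The sampling and drift-detection steps mentioned above can be sketched in a few lines. This example assumes each trace has been reduced to a list of tool names, samples a fixed number of traces per week, and compares the tool-usage distribution against a baseline using total variation distance. The helper names, the 0.2 threshold, and the data are all hypothetical; a production pipeline would also score output quality, not just tool usage.

```python
import random
from collections import Counter
from typing import Dict, List

Trace = List[str]  # assumption: a trace reduced to its ordered tool names


def sample_traces(traces: List[Trace], k: int, seed: int) -> List[Trace]:
    """Draw up to k traces without replacement, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(traces, min(k, len(traces)))


def tool_distribution(traces: List[Trace]) -> Dict[str, float]:
    """Share of all tool calls that go to each tool."""
    counts = Counter(tool for trace in traces for tool in trace)
    total = sum(counts.values())
    return {tool: n / total for tool, n in counts.items()} if total else {}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two discrete distributions (0 to 1)."""
    tools = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in tools)


# Hypothetical data: the agent started searching three times per request.
baseline = [["search", "answer"]] * 50
this_week = [["search", "search", "search", "answer"]] * 50

DRIFT_THRESHOLD = 0.2  # assumed value; tune against your own history

distance = total_variation(
    tool_distribution(sample_traces(baseline, 20, seed=0)),
    tool_distribution(sample_traces(this_week, 20, seed=0)),
)
drifted = distance > DRIFT_THRESHOLD
print(distance, drifted)
```

The service in this example would still look healthy to uptime monitoring, since every request completes. Only the change in tool-call distribution reveals that behavior has shifted.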

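## Collaboration and product analytics also happen on traces

Code review happens on GitHub; review of agent decisions happens on the observability platform. Teams leave comments on traces, share decision nodes, and review reasoning chains the way they review PRs.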

- Product analytics tools and debugging tools converge on the trace
- Analyzing tool-call patterns lets you infer what users actually need
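
As a rough illustration of analyzing tool-call patterns, the sketch below counts which tool most often follows which across many traces. It assumes, again, that traces are available as lists of tool names; the tool names and the `top_tool_transitions` helper are invented for the example.

```python
from collections import Counter
from typing import List, Tuple


def top_tool_transitions(
    traces: List[List[str]], n: int = 3
) -> List[Tuple[Tuple[str, str], int]]:
    """Return the n most frequent (tool, next_tool) pairs across all traces."""
    pairs: Counter = Counter()
    for trace in traces:
        # zip pairs each call with the call that immediately follows it.
        pairs.update(zip(trace, trace[1:]))
    return pairs.most_common(n)


# Hypothetical traces: most users search and then export.
traces = [
    ["search_docs", "export_csv"],
    ["search_docs", "export_csv"],
    ["search_docs", "summarize"],
]

for (tool, next_tool), count in top_tool_transitions(traces):
    print(f"{tool} -> {next_tool}: {count}")
```

If a pair such as search followed by export dominates, that hints at a workflow users want as a single step, which is a product insight rather than a debugging one.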

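## Conclusion

Code is the architectural blueprint; the trace is the security camera footage. When something goes wrong, replay the footage first. Teams that get agent quality right shifted their attention from code to traces long ago.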

## Related URLs

- Author: https://tonylee.im/zh-CN/author/
- Publication: https://tonylee.im/zh-CN/blog/about/
- Related article: https://tonylee.im/zh-CN/blog/eight-hooks-that-guarantee-ai-agent-reliability/
- Related article: https://tonylee.im/zh-CN/blog/medvi-two-person-430m-ai-compressed-funnel/
- Related article: https://tonylee.im/zh-CN/blog/claude-code-layers-over-tools-2026/

## Citation

- Author: Tony Lee
- Site: tonylee.im
- Canonical URL: https://tonylee.im/zh-CN/blog/agent-debugging-traces-not-code/

## Bot Guidance

- This file is intended for AI agents, search assistants, and text-mode retrieval.
- Prefer citing the canonical article URL instead of this text endpoint.
- Use the rollout alternates when you need the same article in another prioritized language.

---

Author: Tony Lee | Website: https://tonylee.im
For more articles, visit: https://tonylee.im/zh-CN/blog/
This content is original and authored by Tony Lee. Please attribute when quoting or referencing.