# My Agent Called a Failing API 5 Times, and the Bug Wasn't in the Code

> Author: Tony Lee
> Published: 2026-02-25
> URL: https://tonylee.im/hi/blog/agent-debugging-traces-not-code/
> Reading time: 5 minutes
> Language: hi
> Tags: ai, ai-agents, observability, tracing, langsmith, debugging

## Canonical

https://tonylee.im/hi/blog/agent-debugging-traces-not-code/

## Rollout Alternates

en: https://tonylee.im/en/blog/agent-debugging-traces-not-code/
ko: https://tonylee.im/ko/blog/agent-debugging-traces-not-code/
ja: https://tonylee.im/ja/blog/agent-debugging-traces-not-code/
zh-CN: https://tonylee.im/zh-CN/blog/agent-debugging-traces-not-code/
zh-TW: https://tonylee.im/zh-TW/blog/agent-debugging-traces-not-code/

## Description

When an agent keeps repeating the same failing API call, code review won't help. For debugging AI agents, traces are the new source code.

## Summary

"My Agent Called a Failing API 5 Times, and the Bug Wasn't in the Code" is part of Tony Lee's ongoing coverage of AI agents, developer tools, startup strategy, and AI industry shifts.

## Outline

- Agent code is an empty vessel
- Traces are the new source code
- Testing changes fundamentally
- Collaboration and product analytics also happen on traces
- Bottom line

## Content

A bug hit production. My agent was repeating the same API call five times. Out of habit, I opened the code first. The retry logic was fine. The function flow was normal. The logs showed no errors at all.

The code had no answer. Only when I opened the trace did the real cause come out.

## Agent code is an empty vessel

Open the source code of any agent and you will find a model specification, a list of tools, and a system prompt. That's it. Which tool to call when, and in what sequence the reasoning will run: none of that is written anywhere in the code.

Teams running LangGraph-based agents keep saying the same thing: "Code review doesn't let you judge an agent's quality."

- Same code, same input, and a different tool call pattern on every run
- Unlike a function such as `handleSubmit()`, the branching logic does not exist in the code at all
- Give GPT-5.2 the same query 10 times and you get roughly 40% consistency in tool call ordering
- An error occurs, the code has no bug, and reproduction becomes impossible

This is the fundamental shift. In traditional software, the code *is* the behavior. In agents, the code is only scaffolding. The real behavior emerges at runtime, from the model's reasoning over whatever context it receives.
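One way to put a number on this variability is to run the same query several times and measure how often the tool-call ordering matches the most common one. The sketch below assumes each run has already been reduced to a list of tool names; the function name `ordering_consistency` and the sample runs are hypothetical and are not taken from any tracing library.

```python
from collections import Counter


def ordering_consistency(runs):
    """Return the share of runs whose tool-call order matches the most common order."""
    if not runs:
        raise ValueError("need at least one run")
    # Lists are not hashable, so each ordering is stored as a tuple.
    orderings = Counter(tuple(run) for run in runs)
    _, top_count = orderings.most_common(1)[0]
    return top_count / len(runs)


# Hypothetical tool-call sequences from 5 runs of the same query.
runs = [
    ["search", "fetch", "summarize"],
    ["search", "fetch", "summarize"],
    ["fetch", "search", "summarize"],
    ["search", "summarize"],
    ["search", "fetch", "fetch", "summarize"],
]
print(ordering_consistency(runs))  # 0.4
```

Exact-match on the full ordering is a deliberately strict choice. A looser metric, such as comparing only the set of tools used, would report higher consistency for the same runs.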

## Traces are the new source code

A trace records every step the agent takes. What it thought at each step, which tool it called and why: all of it is captured. Debugging, testing, and performance analysis used to happen through code. Now they have to happen through traces.

When an agent sees an error message and still makes the same call again, that is not a code bug. It is a reasoning failure, and it only shows up in the trace.
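Once the steps of a trace are exported, this kind of reasoning failure can be detected mechanically. The sketch below assumes each step is a plain dict with `tool`, `args`, and `status` keys. That shape, the function name `find_repeated_failures`, and the tool names are hypothetical, so adapt them to whatever your tracing tool actually exports.

```python
import json


def find_repeated_failures(steps, min_repeats=2):
    """Find streaks of identical tool calls that fail back to back."""
    findings = []
    streak_key, streak_len = None, 0
    for step in steps:
        # Same tool with the same arguments counts as the same call.
        key = (step["tool"], json.dumps(step["args"], sort_keys=True))
        if step["status"] == "error" and key == streak_key:
            streak_len += 1
            continue
        # The streak ended: record it if it was long enough.
        if streak_len >= min_repeats:
            findings.append((streak_key[0], streak_len))
        if step["status"] == "error":
            streak_key, streak_len = key, 1
        else:
            streak_key, streak_len = None, 0
    if streak_len >= min_repeats:
        findings.append((streak_key[0], streak_len))
    return findings


# Hypothetical trace: the agent retries the same failing call five times.
trace = [
    {"tool": "get_invoice", "args": {"id": 42}, "status": "error"} for _ in range(5)
] + [{"tool": "notify_user", "args": {"msg": "failed"}, "status": "ok"}]
print(find_repeated_failures(trace))  # [('get_invoice', 5)]
```

A check like this only flags the symptom. Why the agent ignored the error message still has to be read from the reasoning recorded in the trace.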

- Comparing traces from before and after a prompt change shows the difference in reasoning quality right away
- In LangSmith, loading the trace at a specific point into the playground works like setting a breakpoint
- A single trace can show the exact moment the agent's reasoning went in the wrong direction, which no amount of logging can do
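The before-and-after comparison can be approximated with a step-by-step diff of two traces. This sketch assumes each trace has been flattened into a list of `(tool, argument)` tuples. The function name `first_divergence` and the sample data are hypothetical, and real traces usually need normalization (timestamps, IDs) before they can be compared this way.

```python
def first_divergence(before, after):
    """Return (index, before_step, after_step) at the first differing step, or None."""
    for i in range(max(len(before), len(after))):
        a = before[i] if i < len(before) else None
        b = after[i] if i < len(after) else None
        if a != b:
            return i, a, b
    return None  # The two traces are identical.


# Hypothetical traces for the same input, before and after a prompt change.
before = [("search", "refund policy"), ("get_order", "A-17"), ("get_order", "A-17")]
after = [("search", "refund policy"), ("get_order", "A-17"), ("escalate", "A-17")]
print(first_divergence(before, after))
# (2, ('get_order', 'A-17'), ('escalate', 'A-17'))
```

The returned index is where to start reading: everything before it is identical in both runs, so the change in behavior begins at that step.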

Think of it this way: traditional debugging is reading a recipe to find the mistake. Agent debugging is watching the kitchen's CCTV footage to see where the chef slipped. The recipe can be perfect while the execution goes wrong.

## Testing changes fundamentally

In traditional software, you test before deployment and you are done. Agents are non-deterministic, so you have to keep evaluating them in production as well.

Without a pipeline to collect traces, build eval datasets, and catch quality degradation or drift, operating agents at scale is simply not possible.

Teams that have adopted trace-based evaluation have seen measurable improvement in task success rates. The pattern is consistent: traces reveal failure modes that no pre-deployment test suite can predict.

- Build an automated eval pipeline that samples production traces weekly
- Pre-deployment testing alone cannot guarantee quality in non-deterministic systems
- Monitoring without traces is like checking only whether the server is up
- An agent can be "working normally" while executing entirely the wrong tasks, and only traces catch that
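The weekly sampling step can be sketched in a few lines. Everything here is an assumption made for illustration: the trace shape, the `met_user_intent` field (in practice this would come from a human label or an automated grader), the baseline of 0.90, and the 5-point tolerance.

```python
import random


def sample_traces(traces, k, seed):
    """Draw up to k traces at random; the seed makes the sample repeatable."""
    rng = random.Random(seed)
    return rng.sample(traces, min(k, len(traces)))


def success_rate(traces, evaluator):
    """Share of traces that the evaluator accepts."""
    if not traces:
        raise ValueError("no traces to evaluate")
    return sum(1 for t in traces if evaluator(t)) / len(traces)


def has_drifted(current_rate, baseline_rate, tolerance=0.05):
    """True when quality dropped below the baseline by more than the tolerance."""
    return baseline_rate - current_rate > tolerance


# Hypothetical week of production traces: 70 of 100 met the user's intent.
week = [{"id": i, "met_user_intent": i % 10 < 7} for i in range(100)]
sample = sample_traces(week, k=20, seed=7)
rate = success_rate(sample, lambda t: t["met_user_intent"])
print(f"sampled success rate: {rate:.2f}, drifted: {has_drifted(rate, baseline_rate=0.90)}")
```

Keeping the evaluator as a plain function argument means the same pipeline can start with a simple rule and later swap in a model-based grader without changing the sampling or drift logic.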

## Collaboration and product analytics also happen on traces

Code review happens on GitHub. Where does review of an agent's judgment happen?

Observability platforms are taking on that role. Teams are commenting on traces, sharing specific decision points, and reviewing the agent's reasoning the way pull requests used to be reviewed. The collaboration model itself is changing.

Product analytics follows the same pattern. When a metric says "30% of users are dissatisfied," you won't find the reason without opening the traces. The agent may be completing the task successfully by its own measure while completely missing what the user actually wanted.

- Product analytics tools like Mixpanel and debugging tools are converging on traces
- Analyzing an agent's tool call patterns lets you reverse-engineer which features users actually need
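As a rough illustration of reading feature demand from tool calls, the sketch below counts the share of sessions in which each tool was used at least once. The session shape, the function name `tool_usage_share`, and the tool names are hypothetical.

```python
from collections import Counter


def tool_usage_share(traces):
    """Return, per tool, the share of sessions that used it at least once."""
    counts = Counter()
    for trace in traces:
        # A set counts each tool once per session, however often it was called.
        counts.update(set(trace["tools"]))
    total = len(traces)
    return {tool: n / total for tool, n in counts.most_common()}


# Hypothetical sessions, each reduced to the list of tools the agent called.
sessions = [
    {"tools": ["search", "export_csv"]},
    {"tools": ["search", "search", "export_csv"]},
    {"tools": ["search", "translate"]},
    {"tools": ["search", "export_csv"]},
]
print(tool_usage_share(sessions))
```

Counting per session rather than per call keeps one retry-heavy session from inflating a tool's apparent popularity.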

## Bottom line

In the agent era, code is the building's blueprint and traces are the security camera footage. When something goes wrong in the building, you don't open the blueprint first. You rewind the footage.

The teams getting agent quality right are the ones that have shifted their center of gravity from code to traces. Not because code doesn't matter, but because the real failures, the ones that cost users and money, happen in runtime behavior that only traces capture.

## Related URLs

- Author: https://tonylee.im/en/author/
- Publication: https://tonylee.im/en/blog/about/
- Related article: https://tonylee.im/hi/blog/eight-hooks-that-guarantee-ai-agent-reliability/
- Related article: https://tonylee.im/hi/blog/medvi-two-person-430m-ai-compressed-funnel/
- Related article: https://tonylee.im/hi/blog/claude-code-layers-over-tools-2026/

## Citation

- Author: Tony Lee
- Site: tonylee.im
- Canonical URL: https://tonylee.im/hi/blog/agent-debugging-traces-not-code/

## Bot Guidance

- This file is intended for AI agents, search assistants, and text-mode retrieval.
- Prefer citing the canonical article URL instead of this text endpoint.
- Use the rollout alternates when you need the same article in another prioritized language.

---

Author: Tony Lee | Website: https://tonylee.im
For more articles, visit: https://tonylee.im/hi/blog/
This content is original and authored by Tony Lee. Please attribute when quoting or referencing.