# AI Approaches Human Reasoning for the First Time - Poetiq Breaks the 50% Threshold on ARC-AGI-2

> Author: Tony Lee
> Published: 2026-02-08
> URL: https://tonylee.im/zh-TW/blog/poetiq-arc-agi-2-first-to-break-50-percent/
> Reading time: 1 minute
> Language: zh-TW
> Tags: ai, agi, arc-agi, reasoning, recursive-ai, research

## Canonical

https://tonylee.im/zh-TW/blog/poetiq-arc-agi-2-first-to-break-50-percent/

## Rollout Alternates

en: https://tonylee.im/en/blog/poetiq-arc-agi-2-first-to-break-50-percent/
ko: https://tonylee.im/ko/blog/poetiq-arc-agi-2-first-to-break-50-percent/
ja: https://tonylee.im/ja/blog/poetiq-arc-agi-2-first-to-break-50-percent/
zh-CN: https://tonylee.im/zh-CN/blog/poetiq-arc-agi-2-first-to-break-50-percent/
zh-TW: https://tonylee.im/zh-TW/blog/poetiq-arc-agi-2-first-to-break-50-percent/

## Description

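Poetiq's recursive meta-system is the first to exceed 50% on ARC-AGI-2, a benchmark designed to test for genuine general intelligence. See how a six-person team outperformed Google at under half the cost.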

## Summary

"AI Approaches Human Reasoning for the First Time - Poetiq Breaks the 50% Threshold on ARC-AGI-2" is part of Tony Lee's ongoing coverage of AI agents, developer tools, startup strategy, and AI industry shifts.

## Outline

- Why Poetiq's Result Matters
- Architecture - Recursive Reasoning Beats Scaling
- Self-Auditing - Knowing When to Stop
- What This Proves

## Content

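Poetiq just made history on the ARC-AGI benchmark.

ARC-AGI is a test designed to evaluate whether an AI has genuine general intelligence. Rather than asking a model to recite its training data, it presents novel pattern problems and requires the system to infer the underlying rule on its own. Humans average about 60% accuracy. Until now, AI systems lagged far behind.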

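## Why Poetiq's Result Matters

- **First to break 50% on ARC-AGI-2** - 54% accuracy, officially verified by the ARC Prize Foundation
- **Less than half the cost of the previous state of the art** - $30.57 per task, compared with $77.16 for Gemini 3 Deep Think
- **A six-person team** with 53 combined years of experience at Google DeepMind outperformed the largest AI labs
- **Fully open source** - the method and prompts are public on [GitHub](https://github.com/poetiq-ai/poetiq-arc-agi-solver)

For context, mainstream AI models scored below 5% on ARC-AGI-2 in early 2025. Jumping from under 5% to over 50% within months suggests that something fundamental has changed.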
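To make the cost comparison concrete, here is a quick check of the two per-task figures quoted in the list above. It uses only those two numbers; the variable names are illustrative.

```python
# Per-task costs in USD, as quoted in the list above.
POETIQ_COST = 30.57
GEMINI_3_DEEP_THINK_COST = 77.16

# Poetiq's cost expressed as a fraction of the Deep Think cost.
cost_ratio = POETIQ_COST / GEMINI_3_DEEP_THINK_COST
savings = 1 - cost_ratio

print(f"Poetiq cost as a share of Deep Think: {cost_ratio:.1%}")  # 39.6%
print(f"Relative savings per task: {savings:.1%}")                # 60.4%
```

On these figures the cost is closer to 40% of the previous best than to a strict half.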

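## Architecture - Recursive Reasoning Beats Scaling

The core innovation is a meta-system that trains no new model. Instead, it orchestrates existing LLMs through iterative reasoning loops.

The system generates a candidate solution, critiques it, analyzes the feedback, and then uses an LLM to refine the answer. Then it repeats. The prompt is only an interface - the real intelligence emerges from this iterative refinement process.

This is a deliberate departure from standard chain-of-thought prompting. Rather than asking once and accepting the output, Poetiq's system treats every answer as a draft to be improved through structured self-critique.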
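The generate, critique, refine cycle described above can be sketched as a small loop. This is a minimal illustration, not Poetiq's actual code: `refine`, `toy_generate`, and `toy_critique` are hypothetical names, the two toy functions stand in for LLM calls, and the "task" is a trivial rule (multiply by an unknown constant) chosen so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Example = Tuple[int, int]  # (input, expected output)


@dataclass
class Draft:
    candidate: int   # the proposed rule: "multiply the input by this number"
    score: float     # fraction of examples the candidate reproduces
    feedback: str    # critique handed to the next generation step


def refine(
    examples: List[Example],
    generate: Callable[[List[Example], List[Draft]], int],
    critique: Callable[[List[Example], int], Tuple[float, str]],
    max_rounds: int,
) -> List[Draft]:
    """Treat every answer as a draft: generate, critique, feed the critique back."""
    history: List[Draft] = []
    for _ in range(max_rounds):
        candidate = generate(examples, history)
        score, feedback = critique(examples, candidate)
        history.append(Draft(candidate, score, feedback))
    return history


def toy_generate(examples: List[Example], history: List[Draft]) -> int:
    # Stand-in for an LLM call (assumption): start at 1, then move in the
    # direction suggested by the most recent critique.
    if not history:
        return 1
    last = history[-1]
    if last.feedback == "too small":
        return last.candidate + 1
    if last.feedback == "too large":
        return last.candidate - 1
    return last.candidate


def toy_critique(examples: List[Example], k: int) -> Tuple[float, str]:
    # Stand-in for the critique step (assumption): run the candidate rule
    # on the known examples and report how it fails.
    misses = [(x, y) for x, y in examples if x * k != y]
    score = 1 - len(misses) / len(examples)
    if not misses:
        return score, "ok"
    x, y = misses[0]
    return score, ("too small" if x * k < y else "too large")


EXAMPLES = [(1, 3), (2, 6), (5, 15)]  # hidden rule: multiply by 3
history = refine(EXAMPLES, toy_generate, toy_critique, max_rounds=4)
best = max(history, key=lambda d: d.score)
print([d.candidate for d in history], "best:", best.candidate)
```

Passing `generate` and `critique` in as callables keeps the loop independent of any particular model, which mirrors the idea that the orchestration, not the prompt, carries the logic.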

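## Self-Auditing - Knowing When to Stop

The most impressive capability is the self-auditing mechanism. The system decides on its own when it has gathered enough information and when to terminate the reasoning process.

This is more than an engineering convenience - it is the core economic mechanism. By averaging fewer than two LLM requests per ARC problem, the system minimizes unnecessary compute while maintaining accuracy. This is how a small team beat trillion-dollar competitors at under half the cost.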
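A rough sketch of why early stopping drives the economics: if an audit step can accept an answer as soon as it is good enough, easy problems consume one call and only hard ones use more of the budget. Everything below is hypothetical, including `solve_with_audit`, `make_toy_task`, and the toy difficulty numbers; none of it is Poetiq data or Poetiq's actual stopping rule.

```python
from typing import Callable, List, Tuple


def solve_with_audit(
    propose: Callable[[int], str],
    audit: Callable[[str], bool],
    max_calls: int,
) -> Tuple[str, int]:
    """Call `propose` until `audit` accepts the answer or the budget runs out."""
    answer = ""
    calls = 0
    for attempt in range(1, max_calls + 1):
        answer = propose(attempt)   # one (simulated) LLM request
        calls = attempt
        if audit(answer):           # self-audit: is this good enough to stop?
            break
    return answer, calls


def make_toy_task(attempts_needed: int):
    # Toy stand-in (assumption): the task is "solved" on the given attempt.
    def propose(attempt: int) -> str:
        return "correct" if attempt >= attempts_needed else f"draft-{attempt}"

    def audit(answer: str) -> bool:
        return answer == "correct"

    return propose, audit


# Invented difficulties for illustration only: two easy tasks, one hard one.
TOY_DIFFICULTIES: List[int] = [1, 1, 3]
calls_per_task = [
    solve_with_audit(*make_toy_task(n), max_calls=5)[1] for n in TOY_DIFFICULTIES
]
average_calls = sum(calls_per_task) / len(calls_per_task)
print(calls_per_task, f"average calls: {average_calls:.2f}")
```

Without the audit check, every task would spend the full budget of five calls; with it, the average in this toy setup stays below two.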

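## What This Proves

Following the Tiny Recursive Model (TRM) and RLM, Poetiq's result is the strongest evidence so far that recursive reasoning architectures represent a viable path toward AGI.

The lesson is not about building bigger models or longer context windows. It is about designing systems that can think iteratively - generating, evaluating, and refining in a structured loop. When the reasoning process itself becomes the product, raw model scale matters less than architectural design.

The full implementation, prompts, and methodology are public on [GitHub](https://github.com/poetiq-ai/poetiq-arc-agi-solver).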

## Related URLs

- Author: https://tonylee.im/zh-TW/author/
- Publication: https://tonylee.im/zh-TW/blog/about/
- Related article: https://tonylee.im/zh-TW/blog/medvi-two-person-430m-ai-compressed-funnel/
- Related article: https://tonylee.im/zh-TW/blog/claude-code-layers-over-tools-2026/
- Related article: https://tonylee.im/zh-TW/blog/codex-inside-claude-code-openai-plugin-strategy/

## Citation

- Author: Tony Lee
- Site: tonylee.im
- Canonical URL: https://tonylee.im/zh-TW/blog/poetiq-arc-agi-2-first-to-break-50-percent/

## Bot Guidance

- This file is intended for AI agents, search assistants, and text-mode retrieval.
- Prefer citing the canonical article URL instead of this text endpoint.
- Use the rollout alternates when you need the same article in another prioritized language.

---

Author: Tony Lee | Website: https://tonylee.im
For more articles, visit: https://tonylee.im/zh-TW/blog/
This content is original and authored by Tony Lee. Please attribute when quoting or referencing.