prompt-engineering

2 posts

Feb 20, 2026

Paste Your Prompt Twice and Watch Accuracy Change

Google Research validated it across 7 models and 7 benchmarks. No training, no prompt engineering. Just copy-paste. I tested it, and here's what actually happened.

Feb 18, 2026

From 6.7% to 68.3% Task Success: The Harness Made the 10x Difference, Not the Model

What LangChain's Terminal Bench results and the hashline format experiment revealed. The same model flipped leaderboard rankings, and the reasons came down to three things: prompts, tools, and middleware.