Hashline vs Replace: Does the Edit Format Matter?

Source: DEV Community
Can Bölük's The Harness Problem showed hashline-style edits (line-number anchored, like 4#WB) outperforming traditional replace-mode edits (old_string/new_string matching) for coding agents. I've been experimenting with building my own harness (tau), and wanted to verify this result and see if I should consider using hashline as the default edit strategy there. So I built edit-bench to test this myself across multiple languages and models.

Setup

edit-bench generates mutation-based tests from existing codebases. You point a script at a directory, and it generates mutations like deleting a statement, flipping a boolean, swapping args, etc.

Languages: Python (from hive), TypeScript (from oh-my-pi), Rust (from irradiate)
Models: gpt-4.1-mini, google/gemini-3-flash-preview, qwen/qwen3.5-397b-a17b
Edit modes: replace (old_string/new_string) vs hashline (line-number anchored)
20 tasks per language, single-attempt oneshot runs

I also recently added fuzzy matching to tau (trim cascade: trim_end