Refactoring Legacy Code
Legacy code is a programmer's nightmare. It is not code that is "badly written"; it is code whose reasons nobody can explain anymore.
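The original author left long ago, there is no documentation, the comments say "TODO: fix this later" (that "later" was written in 2017), and tests are out of the question. You don't dare touch it, because you can't tell what chain reaction a change will set off; yet you have to touch it, because it stands in the way of your new feature.
This is the final boss of refactoring. The workflow below comes from the legacy-code refactoring case in 老金 (Lao Jin)'s 100,000-character tutorial, and it is the standard playbook for putting Claude Code to work in this situation.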
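Step 1: Explore first, then cut
The biggest risk in legacy code is the unknown: you don't know who calls a given function, who mutates a given global variable, or when and why some odd branch was added.
Don't explore with the main context yourself. If you have the main agent read an entire 50,000-line legacy module, half of your context is gone. Delegate to the Explore subagent instead: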
    delegate to the Explore subagent at "very thorough" level:
    - map the structure of src/legacy/billing/: every file, every public function
    - for each function, list: callers (grep across repo), external dependencies, side effects (db writes, file IO, global mutations)
    - identify the "scary" parts: functions with >100 lines, deeply nested conditionals, functions called from >5 places
    return a structured report. don't modify anything
"don't modify anything" is the red line for Explore. It has no write permission to begin with, but stating it again makes sure it only reads, looks, and summarizes.
This report is the map for everything you do next. Without it, every refactoring is a gamble.
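To get a feel for what the "scary parts" scan in the prompt above amounts to, here is a minimal sketch that does a rough version of it locally with Python's standard `ast` module. The function name `scan_source` and the name-based call counting are illustrative assumptions, not something the Explore subagent is known to run; matching calls by bare name is approximate and will conflate same-named functions.

```python
import ast
from collections import Counter


def scan_source(source: str, max_lines: int = 100):
    """Return (long_functions, call_counts) for one module's source text.

    long_functions: sorted list of (name, line_count) for functions longer
                    than max_lines.
    call_counts:    Counter of called names, matched by bare name only
                    (an approximation; it ignores which object is called).
    """
    tree = ast.parse(source)
    long_functions = []
    call_counts = Counter()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                long_functions.append((node.name, length))
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):          # helper(x)
                call_counts[func.id] += 1
            elif isinstance(func, ast.Attribute):   # obj.helper(x)
                call_counts[func.attr] += 1
    return sorted(long_functions), call_counts
```

Feeding every file under the legacy module through `scan_source` and summing the counters gives a crude list of long functions and heavily called names to cross-check against the subagent's report.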
Step 2: Draw up the migration plan in Plan Mode
With the map in hand, the next step is to make a plan. Don't rush into changes.
    enter Plan Mode. based on the Explore report, design a migration plan for src/legacy/billing/:

    1. identify the "seams": places where the legacy code can be detached from the rest (interfaces, single-call-site functions, pure utilities)
    2. for each seam, propose a replacement strategy: wrap, extract, rewrite, or leave
    3. order the migrations by risk (lowest risk first): pure utilities before business logic, leaf functions before central dispatchers
    4. for each step, specify:
       - what changes
       - which tests must pass (existing + new ones to add before touching)
       - how to verify no behavior change (diff strategy, snapshot tests, parallel runs)
    5. mark any step that requires behavior change as OUT OF SCOPE: flag, don't fix
The last item, "OUT OF SCOPE", is the key one. Legacy code hides countless behaviors that are "actually a bug, but everyone has gotten used to it"; change them and downstream code breaks in ways you did not expect. During the refactoring phase, only move code, never change it; leave the bugs for separate PRs after the refactoring is done.
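Item 4 of the plan prompt mentions "parallel runs" as one way to verify that behavior has not changed. A minimal sketch of that idea, assuming the old and new implementations are plain functions that can be called with the same arguments (`parallel_run` and `_outcome` are hypothetical helper names):

```python
def _outcome(fn, args):
    """Capture either the return value or the type of exception raised."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def parallel_run(legacy, candidate, cases):
    """Call both implementations on every case and collect disagreements.

    cases is an iterable of argument tuples. The result is a list of
    (args, legacy_outcome, candidate_outcome) for each mismatch; an empty
    list means the two functions agreed on every case tried.
    """
    mismatches = []
    for args in cases:
        old = _outcome(legacy, args)
        new = _outcome(candidate, args)
        if old != new:
            mismatches.append((args, old, new))
    return mismatches
```

An empty result only says the two versions agree on the inputs you supplied, so the case list should include the odd inputs (negatives, empties, zero) where legacy quirks tend to live.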
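Once the plan is out, audit it. Check:
- Do the steps really start from the lowest risk?
- Is every step protected by corresponding tests?
- Has any "while I'm at it" optimization been smuggled in?
- Is the order sound (dependencies first, callers after)?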
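The first and last checks in the list above (lowest risk first, dependencies before callers) can be cross-checked mechanically. The sketch below uses the standard-library `graphlib.TopologicalSorter`; the step names, the risk scores, and the helper `order_steps` are made up for illustration and would have to come from your own plan.

```python
from graphlib import TopologicalSorter


def order_steps(depends_on: dict[str, set[str]], risk: dict[str, int]) -> list[str]:
    """Order migration steps so dependencies come first, lowest risk first.

    depends_on maps each step to the steps that must be migrated before it.
    risk maps each step to a score (lower means safer). Among the steps
    whose dependencies are all done, the lowest-risk one is picked next.
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    ready_pool: list[str] = []
    ordered: list[str] = []
    while sorter.is_active():
        ready_pool.extend(sorter.get_ready())
        ready_pool.sort(key=lambda name: (risk[name], name))
        step = ready_pool.pop(0)
        ordered.append(step)
        sorter.done(step)
    return ordered
```

If the order this produces differs from the order in the plan, that is not necessarily a mistake in the plan, but it is a good place to ask why.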
Step 3: Replace in small steps, test at every step
With the plan settled, start executing. The core principle: every step is small enough to revert on its own.
    execute step 1 of the migration plan: extract the date-formatting utility from src/legacy/billing/format.py into src/billing/utils/date.py.

    before changing anything:
    - write characterization tests for the current behavior of the function being extracted
    - run them: they must pass on the legacy code

    after extraction:
    - the old function should delegate to the new one (keep the legacy entry point alive)
    - all characterization tests must still pass
    - all existing tests must still pass

    commit with message: "refactor(billing): extract date formatting into utils (step 1/N)"
This prompt breaks "one step" into five things: up-front protection, extraction, two-way compatibility, verification, and commit. Two of these details are the most valuable:
Characterization Tests
This is the core technique Michael Feathers teaches in Working Effectively with Legacy Code (《修改代码的艺术》): before touching legacy code, write tests for it that describe its behavior as it is "now", not as it "should" be.
    write characterization tests for the function `calculate_discount` in src/legacy/billing/discount.py.
    don't test what the function "should" do, test what it currently does.
    include cases for: empty cart, single item, bulk order, expired coupon, negative quantity.
    for each case, capture the actual current output as the assertion.
Note the line "don't test what it should do, test what it currently does": that is the essence of a characterization test. Even if the function returns a strange result for negative numbers, record that strange result, because some customer may depend on it. If the output changes after the refactoring, the test turns red immediately, and you know downstream code would break the same way.
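As a rough illustration of what such a test file might look like, here is a sketch using `unittest`. The body of `calculate_discount` below is a hypothetical stand-in written for this example (the real legacy function is not shown in this guide), and the pinned values are simply whatever that stand-in returns, quirks included.

```python
import unittest


def calculate_discount(items, coupon_expired=False):
    """Hypothetical stand-in for the legacy function, NOT the real billing code.

    items is a list of (unit_price, quantity) pairs.
    """
    total = sum(price * qty for price, qty in items)
    if not coupon_expired and total > 100:
        return round(total * 0.1, 2)
    # Quirk: a negative total leaks through as a "discount".
    return 0 if total >= 0 else total


class CalculateDiscountCharacterization(unittest.TestCase):
    """Pins current behavior. Expected values were captured, not designed."""

    CASES = [
        ("empty cart", [], False, 0),
        ("single item", [(20.0, 1)], False, 0),
        ("bulk order", [(15.0, 10)], False, 15.0),
        ("expired coupon", [(15.0, 10)], True, 0),
        # Looks wrong, but it is what the code does today, so it is pinned.
        ("negative quantity", [(10.0, -3)], False, -30.0),
    ]

    def test_current_behavior(self):
        for label, items, expired, captured in self.CASES:
            with self.subTest(case=label):
                self.assertEqual(calculate_discount(items, expired), captured)
```

Run it with `python -m unittest` against the untouched legacy code first; only once it is green there does it become a safety net for the extraction.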
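Keep the old code as a reference
"keep the legacy entry point alive": don't delete the old function; have it delegate to the new one:
    def format_date(d):
        # DEPRECATED: use src.billing.utils.date.format_date instead
        # kept for backward compat; remove after all callers migrate
        from src.billing.utils.date import format_date as _new
        return _new(d)
This way:
- Old callers keep calling the old function, and behavior stays the same
- New callers can use the new function directly
- Both tracks run side by side during the refactoring, so whoever breaks can be located immediately
- Once the last old caller has migrated, delete the old entry points in one go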
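If you also want the old entry point to report who is still calling it, one option is to have the forwarder emit a `DeprecationWarning`. This is a sketch of a possible variation, not part of the original workflow; the decorator name `delegates_to`, the function `new_format_date`, and the `%Y-%m-%d` format are assumptions made for the example.

```python
import functools
import warnings
from datetime import date


def delegates_to(new_func):
    """Build a decorator that turns a legacy function into a thin forwarder."""
    def decorator(legacy_func):
        @functools.wraps(legacy_func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{legacy_func.__name__} is deprecated; "
                f"call {new_func.__qualname__} instead",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return new_func(*args, **kwargs)
        return wrapper
    return decorator


def new_format_date(d: date) -> str:
    """Hypothetical new implementation; the output format is an assumption."""
    return d.strftime("%Y-%m-%d")


@delegates_to(new_format_date)
def format_date(d):
    """Legacy entry point, kept alive until every caller has migrated."""
```

Running the test suite with `python -W error::DeprecationWarning` then turns every remaining call through the old entry point into a failure, which gives a concrete list of callers still to migrate.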
Step 4: Guard against regressions with a review subagent
After each batch of migrations, have a review subagent go over it:
    delegate to review subagent:
    - examine the last 5 refactor commits
    - verify: behavior preserved? old entry points still delegate correctly?
    - check: any caller of the old function I missed?
    - check: any test that was silently deleted (not just modified)?
    - flag: any place where "refactor" actually changed behavior
"any test that was silently deleted" is a detail worth singling out: when refactoring, AI occasionally "deletes the test that won't pass" instead of "fixing the test". This kind of quiet test deletion is the number-one source of regressions. Have the review subagent watch for it specifically.
Step 5: Ship the PRs in batches, not in one gulp
The biggest mistake in refactoring legacy code is a single PR that changes 50 files: reviewers can't get through it, nobody knows what to roll back, and problems can't be isolated.
The right approach:
    this migration will be 5 separate PRs:
    - PR1: characterization tests only (no code changes), establishes the safety net
    - PR2: extract pure utilities (lowest risk)
    - PR3: extract leaf business logic
    - PR4: replace central dispatchers
    - PR5: remove legacy entry points (after all callers verified migrated)

    each PR must independently pass tests and not break the build
Every PR can be shipped, reviewed, and reverted independently. The last PR (removing the old entry points) goes out only after every caller has migrated; you can verify that with grep:
    grep the whole repo for any remaining reference to the legacy entry point. if zero, we're safe to delete. if any, list them: they need migrating first
A prompt template for "chewing through legacy code"
    Target: [path to the legacy module]
    Phase 1: Explore (delegate to the Explore subagent):
    - map structure, callers, dependencies, side effects
    - identify scary parts (>100 lines, >5 callers, deep nesting)
    - don't modify

    Phase 2: Plan (Plan Mode):
    - identify seams (interfaces, single-call-site, pure utils)
    - order by risk (low → high)
    - mark behavior changes as OUT OF SCOPE
    - propose characterization tests for each step

    Phase 3: Execute (per step):
    - write characterization tests for current behavior
    - extract / wrap / rewrite per plan
    - keep legacy entry point alive, delegating to new
    - all tests green (old + new)
    - commit: "refactor(scope): step N/M - <what>"

    Phase 4: Review (delegate to the review subagent):
    - behavior preserved?
    - legacy entry still delegating?
    - any test silently deleted?
    - any caller missed?

    Phase 5: Cleanup (separate PR, last):
    - grep for remaining legacy references
    - delete legacy entry points
    - final test run
Once you have worked through these five steps, what you have won is not just "code you can touch" but "code you can keep touching"; that is the real goal of refactoring legacy code.
- For a more basic refactoring workflow, read Code Refactoring Workflow.
- For how to use subagents, revisit Subagents in Depth.
- For how to use Plan Mode, read Plan Mode and Ultraplan.