June 29, 2025 · 6 min read · Artificial Intelligence, Product Development and Organizational Operations

3 Signs the AI Is Going Wrong + a System Prompt That Supports TDD, by Kent Beck (feat. Augmented Coding)

Kent Beck shares what he learned from moving beyond vibe coding and trying augmented coding.

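Kent Beck, the legendary programmer and author of <Test-Driven Development> and <Tidy First?> (published in Korean with the subtitle "32 Code Tidyings for Better Design"), recently wrote a post titled "Augmented Coding: Beyond the Vibes".

The post is excellent. It tells the story of how Kent Beck himself, with help from AI, wrote a high-performance, close to production-level B+ Tree library (BPlusTree3) in Rust and Python.

Below I summarize and translate the three points I found most useful and insightful.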

How does augmented coding differ from vibe coding?
3 signs the AI is going wrong
A system prompt that supports Test-Driven Development (hereafter TDD)

1) How does augmented coding differ from vibe coding?

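In vibe coding, you don't care about the code, only about how the system behaves. If there is an error, you tell the AI "there is this error" and expect it to fix it.

In augmented coding, you care about the code. Its complexity, the tests, and their coverage all matter.

Augmented coding, like traditional coding, values "Tidy Code That Works". You just don't type as much as you used to.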

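2) 3 signs the AI is going wrong

Kent Beck found that in augmented coding it is important to watch the AI's intermediate results and step in when any of the following three signs appear.

It repeats similar actions (infinite loops and the like).
It implements functionality I didn't ask for, even if that is the logical next step.
Any other sign that the AI seems to be cheating, such as deleting or disabling tests.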

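3) A system prompt that supports TDD

The original post makes the prompt a little awkward to copy, so I pulled it out separately here. If you swap only the Rust section at the end for your own programming language or framework, it looks like an excellent prompt you can reuse anywhere. (Gist link)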

Always follow the instructions in plan.md. When I say "go", find the next unmarked test in plan.md, implement the test, then implement only enough code to make that test pass.

# ROLE AND EXPERTISE

You are a senior software engineer who follows Kent Beck's Test-Driven Development (TDD) and Tidy First principles. Your purpose is to guide development following these methodologies precisely.

# CORE DEVELOPMENT PRINCIPLES

- Always follow the TDD cycle: Red → Green → Refactor
- Write the simplest failing test first
- Implement the minimum code needed to make tests pass
- Refactor only after tests are passing
- Follow Beck's "Tidy First" approach by separating structural changes from behavioral changes
- Maintain high code quality throughout development

# TDD METHODOLOGY GUIDANCE

- Start by writing a failing test that defines a small increment of functionality
- Use meaningful test names that describe behavior (e.g., "shouldSumTwoPositiveNumbers")
- Make test failures clear and informative
- Write just enough code to make the test pass - no more
- Once tests pass, consider if refactoring is needed
- Repeat the cycle for new functionality

# TIDY FIRST APPROACH

- Separate all changes into two distinct types:
  1. STRUCTURAL CHANGES: Rearranging code without changing behavior (renaming, extracting methods, moving code)
  2. BEHAVIORAL CHANGES: Adding or modifying actual functionality
- Never mix structural and behavioral changes in the same commit
- Always make structural changes first when both are needed
- Validate structural changes do not alter behavior by running tests before and after

# COMMIT DISCIPLINE

- Only commit when:
  1. ALL tests are passing
  2. ALL compiler/linter warnings have been resolved
  3. The change represents a single logical unit of work
  4. Commit messages clearly state whether the commit contains structural or behavioral changes
- Use small, frequent commits rather than large, infrequent ones

# CODE QUALITY STANDARDS

- Eliminate duplication ruthlessly
- Express intent clearly through naming and structure
- Make dependencies explicit
- Keep methods small and focused on a single responsibility
- Minimize state and side effects
- Use the simplest solution that could possibly work

# REFACTORING GUIDELINES

- Refactor only when tests are passing (in the "Green" phase)
- Use established refactoring patterns with their proper names
- Make one refactoring change at a time
- Run tests after each refactoring step
- Prioritize refactorings that remove duplication or improve clarity

# EXAMPLE WORKFLOW

When approaching a new feature:
1. Write a simple failing test for a small part of the feature
2. Implement the bare minimum to make it pass
3. Run tests to confirm they pass (Green)
4. Make any necessary structural changes (Tidy First), running tests after each change
5. Commit structural changes separately
6. Add another test for the next small increment of functionality
7. Repeat until the feature is complete, committing behavioral changes separately from structural ones

Follow this process precisely, always prioritizing clean, well-tested code over quick implementation.

Always write one test at a time, make it run, then improve structure. Always run all the tests (except long-running tests) each time.

# Rust-specific

Prefer functional programming style over imperative style in Rust. Use Option and Result combinators (map, and_then, unwrap_or, etc.) instead of pattern matching with if let or match when possible.
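
To make the Rust-specific rule at the end of the prompt concrete, here is a small sketch contrasting the two styles. The function names, the `"port"` key, and the `8080` default are made up for illustration and are not part of the prompt or of BPlusTree3. Only standard-library combinators (`and_then`, `ok`, `map`, `unwrap_or`) are used.

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

/// Pattern-matching style: nested `match` on an Option and a Result.
pub fn port_with_match(config: &HashMap<String, String>) -> u16 {
    match config.get("port") {
        Some(raw) => match raw.trim().parse::<u16>() {
            Ok(port) => port,
            Err(_) => 8080,
        },
        None => 8080,
    }
}

/// Combinator style: the same behavior expressed as one chain.
pub fn port_with_combinators(config: &HashMap<String, String>) -> u16 {
    config
        .get("port")
        // Convert the parse Result into an Option so the chain stays flat.
        .and_then(|raw| raw.trim().parse::<u16>().ok())
        .unwrap_or(8080)
}

/// `Result::map` transforms the success value and passes errors through.
pub fn doubled(input: &str) -> Result<i32, ParseIntError> {
    input.trim().parse::<i32>().map(|n| n.saturating_mul(2))
}
```

Both port functions return the same value for every input. The combinator version simply states the fallback once instead of twice.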

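The prompt opens by telling the AI to find the next unmarked test in plan.md, but neither the prompt nor this post says what that file looks like. As one possible reading, assume plan.md is a Markdown checklist where `- [x]` marks a finished test and `- [ ]` marks one still to be written. Under that assumption (the format and the function name `next_unmarked_test` are both hypothetical), the lookup could be sketched like this:

```rust
/// Returns the description of the first unchecked item in a Markdown
/// checklist, or None when every item is checked off.
/// Assumes the hypothetical "- [ ] description" / "- [x] description" format.
pub fn next_unmarked_test(plan: &str) -> Option<&str> {
    plan.lines()
        .map(str::trim)
        // Keep only the text after the unchecked-box marker.
        .find_map(|line| line.strip_prefix("- [ ]"))
        .map(str::trim)
}
```

For a plan containing `- [x] should create an empty tree` followed by `- [ ] should insert a single key`, this returns `Some("should insert a single key")`, which is the one test the AI should implement when told "go".
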
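Closing: what Kent Beck learned from this process

I know there is a lot of fear that this profession we love will disappear and that the joy of working with code will be lost. It makes sense to be anxious. Yes, programming with a "genie" certainly brings change, but it is still programming. In some ways it is a much better programming experience. Looking at the number and quality of decisions I make per hour, the boring, routine decisions have gone down and the more consequential programming decisions have gone up.

The miscellaneous side tasks known as "yak shaving", work that is far from the real problem, mostly go away. I asked the "genie" to run a coverage tester and propose tests that would make the code more reliable. Without the "genie" this would have been a daunting task, because I would first have had to figure out which versions of which libraries I need to run the tester. I probably would have wrestled with it for two hours and then given up. Instead, I just tell the "genie", and the "genie" takes care of the details.

Reportedly he enjoyed this so much that he once coded for 13 hours in a single day.

Original text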

I know there's a lot of fear out there about the end of this profession that we love, the loss of the joy of wrangling code. Makes sense to be nervous. Yes programming changes with a genie, but it's still programming. In some ways a much better programming experience. I make more consequential programming decisions per hour, fewer boring vanilla decisions.

Yak shaving mostly goes away. I had the genie run a coverage tester & propose tests that would make the code more reliable. Without the genie this would have been a daunting task--what versions of what libraries do I need to run the coverage tester? Two hours later I'd just give up. Instead, I tell the genie & it figures out the details.
