
Claude Code in Real Projects: Setup, Context, Tests, and Limits
A practical Claude Code workflow guide for real repositories: setup, CLAUDE.md, context, issue planning, safe edits, tests, diff review, hooks, skills, and limits.
The easiest way to misuse Claude Code is to treat it like a magic developer who can safely change anything in your repository after one sentence.
The better way is more boring and much more useful: treat Claude Code like a fast junior engineer with access to your terminal. It can inspect the codebase, edit files, run tests, read failures, and iterate. But it still needs a clear task, project rules, small scopes, test commands, and human review before the changes become production code.
Claude Code is an agentic coding tool from Anthropic. The official docs describe it as a tool that can read your codebase, edit files, run commands, and integrate with development tools. That makes it different from a simple autocomplete assistant. It is closer to a terminal-based coding agent that can move through a repository and execute a workflow.
This article is a practical guide to using Claude Code in a real project. We will cover setup, CLAUDE.md, context management, the issue → plan → edit → test → review workflow, safe prompting, real code examples, hooks, skills, subagents, and the files you should not let an agent touch without careful review.
What Claude Code Is Good At
Claude Code is strongest when the task is clear, scoped, and testable.
Good tasks include:
- finding where a feature is implemented;
- explaining a confusing module;
- writing tests for an existing function;
- refactoring a small set of files;
- fixing a failing test;
- adding validation to a form;
- updating types after an API change;
- creating a migration draft;
- generating release notes from a diff;
- reviewing a PR for obvious bugs and missing tests.
Weak tasks are vague and open-ended:
- “make the app better”;
- “refactor the whole frontend”;
- “improve performance everywhere”;
- “fix all bugs”;
- “rewrite the auth system”;
- “make this production-ready.”
The tool is powerful because it can work across files and run commands. That same power is why you need a process. A bad prompt can lead to broad, noisy changes that are technically impressive but impossible to review.
Install and Start a Session
Use the official installation method from Anthropic’s docs for your environment. For macOS, Linux, and WSL, the native install command is currently shown as:
```bash
curl -fsSL https://claude.ai/install.sh | bash
```
On Windows PowerShell, the docs show:
```powershell
irm https://claude.ai/install.ps1 | iex
```
After installation, open your project and start Claude Code from the repository root:
```bash
cd ~/work/my-app
claude
```
You can also start a session with an initial prompt:
```bash
claude "explain the structure of this repository"
```
For one-off non-interactive usage, the CLI reference includes `-p` mode, which prints a response and exits:
```bash
claude -p "summarize the changes in this branch and list risk areas"
```
The important habit: run Claude Code from the same directory where you would run project commands yourself. The agent needs to see the same repo, scripts, config files, and test setup that a developer would use.
Start with a Clean Git State
Before you ask Claude Code to edit files, check your working tree.
```bash
git status --short
```
If you already have manual changes, commit them or stash them first:
```bash
git add .
git commit -m "Save work before Claude session"
# or
git stash push -m "before-claude-session"
```
This matters because Claude can touch multiple files. A clean baseline makes review much easier:
```bash
git diff
git diff --stat
git diff --name-only
```
A good Claude Code session should end with a reviewable diff, not a mystery pile of edits.
Create a Useful CLAUDE.md
The most important setup file is usually CLAUDE.md. Anthropic’s GitHub Actions docs recommend creating a CLAUDE.md file in the repository root to define code style guidelines, review criteria, project-specific rules, and preferred patterns.
Think of CLAUDE.md as the project’s operating manual for the agent. It should not be a giant essay. It should be short, specific, and enforceable.
Here is a practical starter version for a TypeScript / Next.js project:
```md
# CLAUDE.md

## Project
This is a Next.js TypeScript application.
Prefer small, reviewable changes. Do not rewrite unrelated code.

## Commands
- Install: npm install
- Dev: npm run dev
- Type check: npm run typecheck
- Lint: npm run lint
- Test: npm test
- Build: npm run build

## Code style
- Use TypeScript strictly. Avoid `any` unless there is a clear reason.
- Prefer named exports for reusable utilities.
- Keep React components small and readable.
- Do not introduce new dependencies without explaining why.
- Follow existing folder structure and naming conventions.

## Testing rules
- If you change business logic, add or update tests.
- If you fix a bug, add a regression test when practical.
- Run the narrowest relevant test first, then run the broader command.

## Safety rules
- Do not edit authentication, billing, database migrations, environment config, or CI files unless the task explicitly asks for it.
- Do not read or print secrets from .env files.
- Do not run destructive commands.
- Ask before deleting files or making broad refactors.

## Review criteria
Before finishing, summarize:
1. What changed.
2. Which files changed.
3. Which tests were run.
4. What still needs human review.
```
This file helps because it moves recurring instructions out of your prompt. You do not need to remind the agent every time that the project uses strict TypeScript or that npm run build must pass.
Do not overload CLAUDE.md with every possible preference. If a section becomes a multi-step procedure, consider moving it into a skill instead.
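The commands listed in CLAUDE.md only help if they actually exist in the project. For a Next.js project like the one above, the backing `package.json` scripts might look something like this (the exact script names and tools here are assumptions; match them to your real setup, and keep CLAUDE.md in sync when they change):

```json
{
  "scripts": {
    "dev": "next dev",
    "typecheck": "tsc --noEmit",
    "lint": "next lint",
    "test": "vitest run",
    "build": "next build"
  }
}
```

If CLAUDE.md references a command that fails or does not exist, the agent will either skip validation or waste turns guessing, so it is worth verifying these once per project.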
Add a Real Task Template
Claude Code works better when issues are written like engineering tasks, not vague requests.
A weak task:
```txt
Fix the dashboard filters.
```
A better task:
```md
## Problem
The dashboard filter dropdown resets when the user changes page.
The selected status should persist in the URL query string.

## Expected behavior
- Selecting a status updates `?status=` in the URL.
- Reloading the page keeps the selected status.
- Clearing the filter removes the query param.
- Existing pagination behavior should still work.

## Relevant files
- `app/dashboard/page.tsx`
- `components/dashboard/StatusFilter.tsx`
- `lib/url-state.ts`

## Acceptance criteria
- Add or update tests for URL state behavior.
- Run typecheck, lint, and relevant tests.
- Keep the change scoped to dashboard filters.
```
The second task gives Claude boundaries. It names the problem, expected behavior, relevant files, and acceptance criteria. It also protects the repo from unrelated changes.
The Core Workflow: Issue → Plan → Edit → Test → Review
A reliable Claude Code workflow has five stages.
1. **Issue.** Give Claude a concrete task with expected behavior and constraints.
2. **Plan.** Ask Claude to inspect the repo and propose a plan before editing.
3. **Edit.** Approve a small set of changes. Keep scope narrow.
4. **Test.** Let Claude run the relevant test/build commands and fix failures.
5. **Review.** You review the diff, not just the final summary.
A good first prompt looks like this:
```txt
Read the issue below and inspect the relevant files.
Do not edit yet.
First, explain the current implementation and propose a small plan.
After I approve the plan, make the minimal code changes.

Issue:
[PASTE ISSUE HERE]
```
That “Do not edit yet” line is important. It turns Claude into an investigator before it becomes an editor.
Example: Adding a URL-Persistent Dashboard Filter
Imagine this component currently stores the selected status only in local state:
```tsx
// components/dashboard/StatusFilter.tsx
"use client";

import { useState } from "react";

type Status = "all" | "open" | "blocked" | "done";

export function StatusFilter() {
  const [status, setStatus] = useState<Status>("all");

  return (
    <select
      value={status}
      onChange={(event) => setStatus(event.target.value as Status)}
    >
      <option value="all">All</option>
      <option value="open">Open</option>
      <option value="blocked">Blocked</option>
      <option value="done">Done</option>
    </select>
  );
}
```
The user wants the selected filter to persist in the URL. A scoped Claude prompt could be:
```txt
Task: Make StatusFilter persist its selected value in the URL query string.

Constraints:
- Keep the component client-side.
- Use Next.js navigation utilities.
- Valid statuses are: all, open, blocked, done.
- If status is all, remove the `status` query param.
- Do not change unrelated dashboard layout files.
- Add a small utility if it keeps the component clean.

Before editing, inspect the current dashboard page and explain where the state should live.
```
A reasonable implementation might look like this:
```tsx
// components/dashboard/StatusFilter.tsx
"use client";

import { usePathname, useRouter, useSearchParams } from "next/navigation";

type Status = "all" | "open" | "blocked" | "done";

const STATUSES = new Set<Status>(["all", "open", "blocked", "done"]);

function parseStatus(value: string | null): Status {
  if (value && STATUSES.has(value as Status)) {
    return value as Status;
  }
  return "all";
}

export function StatusFilter() {
  const router = useRouter();
  const pathname = usePathname();
  const searchParams = useSearchParams();

  const status = parseStatus(searchParams.get("status"));

  function handleChange(nextStatus: Status) {
    const params = new URLSearchParams(searchParams.toString());

    if (nextStatus === "all") {
      params.delete("status");
    } else {
      params.set("status", nextStatus);
    }

    const query = params.toString();
    router.replace(query ? `${pathname}?${query}` : pathname, { scroll: false });
  }

  return (
    <select
      value={status}
      onChange={(event) => handleChange(event.target.value as Status)}
      aria-label="Filter dashboard by status"
    >
      <option value="all">All</option>
      <option value="open">Open</option>
      <option value="blocked">Blocked</option>
      <option value="done">Done</option>
    </select>
  );
}
```
This is the kind of change Claude Code can handle well: the scope is narrow, the behavior is testable, and the user can review the diff quickly.
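The issue template earlier also listed `lib/url-state.ts`. If you ask Claude to extract the query-string logic into a utility, one hypothetical shape (the function name and signature here are illustrative, not from a real repo) is a pure function that can be unit tested without React at all:

```typescript
// lib/url-state.ts (hypothetical utility; names are illustrative)

// Build the next query string for a single param. A value equal to
// the default means "remove the param entirely" so default state
// produces clean URLs.
export function withQueryParam(
  current: string,
  key: string,
  value: string,
  defaultValue: string
): string {
  const params = new URLSearchParams(current);

  if (value === defaultValue) {
    params.delete(key);
  } else {
    params.set(key, value);
  }

  return params.toString();
}
```

The component would then call something like `withQueryParam(searchParams.toString(), "status", nextStatus, "all")` and build the URL from the result. Pure functions like this are exactly the kind of code an agent can test cheaply and reliably.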
Ask Claude to Run the Right Commands
Do not let the agent guess your validation workflow. Put the commands in CLAUDE.md, then repeat the relevant ones in your task prompt if the change is important.
For a Next.js project, a typical validation sequence might be:
```bash
npm run typecheck
npm run lint
npm test -- StatusFilter
npm run build
```
A good prompt:
```txt
After making the change, run:
1. npm run typecheck
2. npm run lint
3. npm test -- StatusFilter

If a command fails, explain the failure before fixing it.
Do not run broad refactors to make tests pass.
```
That last line matters. Agents sometimes fix a test by changing the wrong thing. You want a narrow correction, not a cascade of unrelated edits.
Example Test for the Filter
Here is a small test shape for URL-state behavior. The exact setup depends on your test framework and Next.js mocking strategy, but the important thing is to test behavior rather than implementation details.
```tsx
// components/dashboard/StatusFilter.test.tsx
import { render, screen } from "@testing-library/react";
import userEvent from "@testing-library/user-event";
import { describe, expect, it, vi } from "vitest";

import { StatusFilter } from "./StatusFilter";

const replace = vi.fn();

vi.mock("next/navigation", () => ({
  useRouter: () => ({ replace }),
  usePathname: () => "/dashboard",
  useSearchParams: () => new URLSearchParams("status=open"),
}));

describe("StatusFilter", () => {
  it("reads the selected status from the URL", () => {
    render(<StatusFilter />);
    expect(screen.getByLabelText(/filter dashboard/i)).toHaveValue("open");
  });

  it("updates the URL when the status changes", async () => {
    const user = userEvent.setup();
    render(<StatusFilter />);

    await user.selectOptions(screen.getByLabelText(/filter dashboard/i), "blocked");

    expect(replace).toHaveBeenCalledWith(
      "/dashboard?status=blocked",
      { scroll: false }
    );
  });
});
```
When you ask Claude to write a test, be specific about the behavior you care about. Otherwise, it may add a snapshot or a shallow test that increases coverage but does not protect the workflow.
How to Give Context Without Flooding the Agent
Context is the difference between a useful coding agent and a noisy one. But more context is not always better.
Good context:
- the task;
- acceptance criteria;
- relevant files;
- commands to run;
- constraints;
- examples of existing patterns;
- what not to change.
Bad context:
- entire log files pasted into the prompt;
- huge stack traces without the command that produced them;
- vague architecture opinions;
- screenshots when code is available;
- old requirements that contradict the current issue.
If Claude needs logs, ask it to inspect a small slice first:
```txt
The failing command is `npm run build`.
Please inspect the error, but do not read entire generated log files.
Use targeted commands like `tail -n 120` or search for the first TypeScript error.
```
This protects both cost and clarity. Long irrelevant context can distract the agent and make the session harder to review.
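The same “first error, not the whole log” idea can live in a small helper script if your team hits this often. Here is a minimal sketch in TypeScript; the `error TS1234`-style pattern matches TypeScript compiler output, and the context size is an arbitrary choice:

```typescript
// Return the first TypeScript compiler error line from a build log,
// plus a few lines of surrounding context, or null if none is found.
export function firstTsError(log: string, context = 3): string | null {
  const lines = log.split("\n");
  const index = lines.findIndex((line) => /error TS\d+/.test(line));

  if (index === -1) {
    return null;
  }

  // Keep the error line itself plus `context` following lines.
  return lines.slice(index, index + 1 + context).join("\n");
}
```

Feeding the agent this slice instead of the raw log keeps the session focused on the actual failure.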
Reviewing the Diff
Never accept the final summary as the review. Review the diff.
Useful commands:
```bash
git diff --stat
git diff --name-only
git diff
git diff --check
```
Look for:
- files that were not part of the task;
- new dependencies;
- deleted tests;
- loosened types;
- `any` added to silence TypeScript;
- changed auth, billing, migrations, or environment config;
- snapshots updated without explanation;
- broad formatting changes mixed with logic changes;
- comments that explain obvious code instead of decisions.
A good review prompt after the agent edits:
```txt
Review your own diff before I review it.

List:
1. Every file changed.
2. Why it changed.
3. Any risky assumptions.
4. Any behavior that still needs manual verification.
5. Tests you ran and their results.

Do not make more edits unless you find a clear bug.
```
This forces the agent to turn the diff into a review checklist.
Do Not Let the Agent Touch These Files Without Review
Some files deserve extra caution because mistakes can break production, leak secrets, or create hard-to-debug behavior.
Require explicit approval before Claude Code edits:
- `.env`, `.env.local`, `.env.production`, or secret-related files;
- authentication and session code;
- billing, payments, invoices, or subscription logic;
- database migrations and destructive schema changes;
- CI/CD workflows;
- deployment scripts;
- permission and role-based access logic;
- encryption, signing, or token handling;
- analytics that affect billing or reporting;
- generated lockfiles unless dependency changes were intentional.
Add a version of this rule directly to CLAUDE.md. The agent should know that these files are not normal implementation files.
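Instructions in CLAUDE.md are advisory; Claude Code's permission settings can enforce them. As a sketch, a deny list in `.claude/settings.json` might look like the following (the paths are examples from a hypothetical project, and the exact rule syntax should be checked against the current permissions documentation before you rely on it):

```json
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Edit(./.github/workflows/**)",
      "Edit(./prisma/migrations/**)",
      "Bash(rm:*)"
    ]
  }
}
```

The advantage over prompt-level rules is that a deny rule holds even when the session context gets long and earlier instructions fade.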
Use Skills for Repeated Workflows
Anthropic’s docs explain that Claude Code skills are created with a SKILL.md file and can be invoked directly with slash commands. Skills are useful when you keep pasting the same checklist, playbook, or multi-step procedure into chat.
For example, you can create a code review skill:
```txt
.claude/skills/code-review/SKILL.md
```
Example skill:
```md
---
name: code-review
description: Review the current git diff for bugs, missing tests, type safety issues, and risky broad changes.
---

# Code Review Skill

Review the current git diff as a senior engineer.

Focus on:
- correctness
- edge cases
- missing tests
- type safety
- security-sensitive changes
- unrelated edits
- broad refactors that were not requested

Process:
1. Run `git diff --stat`.
2. Run `git diff --name-only`.
3. Inspect the diff.
4. List findings by severity.
5. Suggest minimal fixes.

Do not modify files unless explicitly asked.
```
Then you can use it during a session:
```txt
/code-review
```
This is better than pasting the same review instructions every time. CLAUDE.md should hold stable project facts. Skills should hold repeatable procedures.
Use Subagents for Side Tasks
Claude Code also supports specialized subagents. Anthropic’s docs describe subagents as assistants that handle specific types of tasks in their own context window, with custom prompts, tool access, and permissions.
A subagent is useful when a side task would flood your main conversation with logs, search results, or file contents.
Examples:
- a test-writer subagent;
- a documentation reviewer;
- a database query validator;
- a security-sensitive diff reviewer;
- a dependency upgrade investigator.
A simple subagent concept:
```md
# Test Writer Agent

Purpose: inspect changed business logic and propose missing tests.

Rules:
- Do not edit files by default.
- Identify existing test patterns first.
- Propose the smallest useful tests.
- Return only files inspected, suggested tests, and rationale.
```
Use subagents when the work is specialized and repetitive. Do not create five agents for a small bug fix. Too much agent structure becomes its own maintenance burden.
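In Claude Code, subagents are defined as markdown files with YAML frontmatter, typically under `.claude/agents/` in the project. A hedged sketch of the test-writer concept in that shape, e.g. as `.claude/agents/test-writer.md` (the available frontmatter fields and tool names should be verified against the current subagents documentation):

```md
---
name: test-writer
description: Inspect changed business logic and propose missing tests.
tools: Read, Grep, Glob
---

You are a test-writing assistant.

- Do not edit files by default.
- Identify existing test patterns first.
- Propose the smallest useful tests.
- Return only files inspected, suggested tests, and rationale.
```

Limiting the `tools` list to read-only tools is the main safety lever here: the subagent can investigate but not edit.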
Hooks Can Enforce Workflow Rules
Claude Code hooks let you run shell commands, HTTP endpoints, or prompt-based checks at specific lifecycle events. This can be useful for teams that want guardrails around agent behavior.
A practical use case: run tests or formatting checks after file changes, or block risky commands before they run.
For example, you might use hooks to:
- warn when `.env` files are read;
- block destructive shell commands;
- run `npm run typecheck` after TypeScript files change;
- log tool usage for audit;
- require review before editing migration files.
Do not start with a complicated hook system. Start with clear rules in CLAUDE.md, then add hooks when you see repeated mistakes.
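When you do reach for hooks, they are configured in Claude Code settings as event matchers plus commands. As a sketch, something like the following runs a typecheck after file edits (the event names and JSON shape follow the documented hooks format, but verify against the current hooks reference before adopting it):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "npm run typecheck"
          }
        ]
      }
    ]
  }
}
```

Keep hook commands fast; a slow hook runs on every matching tool call and drags down the whole session.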
MCP: Connect Only What the Task Needs
Claude Code can connect to external tools and data sources through MCP. Anthropic’s docs describe MCP as a way to connect Claude Code to tools, databases, APIs, issue trackers, monitoring dashboards, and other systems.
This is powerful, but it should be scoped.
Good MCP use cases:
- read a GitHub issue;
- inspect Sentry errors;
- query a read-only analytics dashboard;
- retrieve Figma context;
- pull a ticket description from Linear or Jira.
Risky MCP use cases:
- broad database write access;
- production admin panels;
- tools that can send external emails;
- payment or billing systems;
- untrusted content sources without prompt-injection safeguards.
The rule is simple: connect Claude Code to tools when it avoids manual copy-paste and improves context. Do not connect tools just because the integration is available.
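As a concrete sketch, project-scoped MCP servers are declared in a `.mcp.json` file at the repository root. The example below wires up a hypothetical GitHub server; the package name and env handling are illustrative, so check the current MCP documentation and the server's own README for the real invocation:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```

Note that the token comes from an environment variable rather than being committed to the file, and the server only needs the scopes the task requires.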
Common Failure Modes
Claude Code usually fails in predictable ways.
- **It changes too much.** Fix by giving tighter scope and asking for a plan before edits.
- **It follows the wrong pattern.** Fix by pointing it to a similar existing file.
- **It makes tests pass for the wrong reason.** Fix by reviewing whether behavior is actually protected.
- **It hides uncertainty.** Ask it to list assumptions and unknowns.
- **It edits generated or sensitive files.** Add explicit safety rules and review the diff.
- **It over-refactors.** Tell it to make the minimal change and avoid unrelated cleanup.
- **It trusts stale context.** Ask it to inspect current files, not rely only on pasted snippets.
- **It consumes too much context.** Ask for targeted searches and short log slices.
Most failures are process failures. The agent did not know the boundary, or the human did not review the result carefully enough.
A Practical Prompt Library
Use prompts that create workflow boundaries.
Explore without editing:
```txt
Inspect this repository area and explain how it works.
Do not edit files.
Focus on the data flow, relevant files, and likely change points.
```
Plan before implementation:
```txt
Before editing, propose a minimal implementation plan.
Include files you expect to change, tests to run, and risks.
Wait for approval before making changes.
```
Small bug fix:
```txt
Fix this bug with the smallest safe change.
Do not refactor unrelated code.
Add a regression test if practical.
Run the relevant tests and summarize the diff.
```
Test generation:
```txt
Find the existing test pattern for this module.
Add tests for the behavior described in the issue.
Do not change production code unless the test reveals a clear bug.
```
Diff review:
```txt
Review the current git diff.
Look for bugs, missing tests, type loosening, security-sensitive changes, and unrelated edits.
Do not modify files.
Return findings by severity.
```
Build failure:
```txt
The build failed.
Inspect the first relevant error and explain the cause before editing.
Do not make broad changes.
Fix only the root cause if it is clear.
```
When Not to Use Claude Code
Do not use a coding agent just because it is available.
Avoid or heavily restrict Claude Code when:
- the task touches production credentials;
- you do not understand the code well enough to review it;
- the repo has no tests and the change is high risk;
- the task involves payments, auth, permissions, or data deletion;
- the desired behavior is not specified;
- the change requires product judgment more than implementation;
- you are under time pressure and cannot review the diff.
Claude Code is useful when it accelerates a workflow you can verify. It is dangerous when it replaces understanding.
Checklist Before Running Claude Code on a Real Task
Use this checklist before asking the agent to edit:
- Is my git working tree clean?
- Is the task written with expected behavior?
- Did I list relevant files or areas?
- Did I define what not to change?
- Are test/build commands documented?
- Does `CLAUDE.md` include project rules?
- Are sensitive files protected by instructions or permissions?
- Is the change small enough to review?
- Do I know how to manually verify the result?
If you cannot answer these questions, the agent is likely to produce a messy diff.
Conclusion: Claude Code Is a Workflow Tool, Not a Shortcut Around Engineering
Claude Code can be a serious productivity tool in real projects. It can inspect a codebase, edit files, run commands, respond to failures, and help with repetitive engineering work.
But the value does not come from letting an agent roam freely through the repository. The value comes from giving it a narrow task, strong project context, test commands, review rules, and clear boundaries.
The best workflow is not “ask Claude to build the feature.” It is:
issue → plan → edit → test → review
Use CLAUDE.md for project rules. Use skills for repeated procedures. Use subagents for specialized side tasks. Use hooks and permissions when the team needs stronger guardrails. Use MCP only when external tools genuinely improve context.
Most importantly, review the diff. Claude Code can write code quickly, but you are still responsible for what gets merged.