How to bring AI agents into a mature software project without starting over, and build a workspace that turns your existing codebase into your biggest advantage.
The internet is full of guides on how to build software from scratch using AI. Prompt your way to a product. Describe your app in plain English and watch it materialize. This approach has a name now: vibe coding. It is fast, it is impressive for greenfield projects, and it generates no shortage of hype.
But most working developers are not building from scratch. They are maintaining and extending software that has been in production for years, sometimes decades. The codebase has grown organically. Decisions were made that made sense at the time and are now buried in a Word document from 2017 or an unlabeled tab in a shared spreadsheet. The architecture diagram lives in a PDF that only two people have seen. The onboarding wiki has not been updated since the last project leader or sponsor changed roles.
This is the reality that almost no AI productivity content addresses: how do you introduce AI assistance into a project that already exists, where the context is scattered, the documentation is inconsistent, and the history is messy? That is exactly what this article covers.
When you build a project from scratch with an AI agent, you are the author of every decision in the session. The agent has full context because you provided it in real time, one prompt at a time.
Mature projects carry years of institutional knowledge that lives outside any single conversation: architecture decision records, product requirement documents, data model specifications, integration contracts, historical defect notes, security policies, and compliance constraints. Some of this is in version control. Most of it is not. It is scattered across email threads, Confluence pages, SharePoint folders, Jira tickets, and network drives that nobody has touched in years or decades.
AI agents work best when they can see relevant context. Without it, the output is generic, sometimes wrong, and frequently misaligned with how the project actually operates. The solution is not to hope the agent guesses correctly. The solution is to give it what it needs, and this article walks through how to do exactly that in five practical steps.
Before you can feed anything to an AI agent, you need to know what exists and where it lives. Most teams skip this step because it feels like overhead. It is not. It is the foundation that everything else rests on, and skipping it guarantees that important context stays invisible to the agent indefinitely.
Walk through the following questions for your project:
Build a simple spreadsheet or Markdown file listing each document, its current location, its format, and how important it is to day-to-day development. This inventory becomes your map for everything that follows.
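As one way to keep that inventory in version control, the sketch below renders a list of entries as a Markdown table, sorted so the most important documents come first. The `InventoryEntry` fields, the importance labels, and the sample rows are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass
class InventoryEntry:
    """One row of the document inventory (field names are illustrative)."""
    title: str
    location: str
    fmt: str
    importance: str  # assumed labels: "high", "medium", "low"


def render_inventory(entries: list[InventoryEntry]) -> str:
    """Render entries as a Markdown table, most important documents first."""
    rank = {"high": 0, "medium": 1, "low": 2}
    # Unknown importance labels sort after all known ones.
    ordered = sorted(entries, key=lambda e: rank.get(e.importance, len(rank)))
    lines = [
        "| Document | Location | Format | Importance |",
        "|---|---|---|---|",
    ]
    for e in ordered:
        lines.append(f"| {e.title} | {e.location} | {e.fmt} | {e.importance} |")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    sample = [
        InventoryEntry("Onboarding wiki", "Confluence", "wiki page", "medium"),
        InventoryEntry("Architecture diagram", "SharePoint", "PDF", "high"),
    ]
    print(render_inventory(sample))
```

Saving the output as a Markdown file in the repository keeps the inventory reviewable in pull requests like any other project file.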
AI agents can technically retrieve content over HTTP, but the most reliable approach is to have all relevant documents stored locally in plain-text formats. This eliminates access and authentication issues, keeps the content version-controlled alongside the code, and makes it trivially easy for the agent to find and read without requiring any special integrations.
The target format does not need to be perfect. It needs to be readable, searchable, and reasonably structured. The table below lists common source formats and the recommended conversion target for each.
| Source Format | Recommended Target | Notes |
|---|---|---|
| Word Document (.docx) | HTML or Markdown | Use Pandoc to automate conversion. HTML preserves tables and formatting; Markdown is lighter and easier to edit by hand. |
| Excel Spreadsheet (.xlsx) | CSV | Export each tab as a separate CSV file. Drop formatting-only sheets. Keep column headers on row one so the agent can understand the structure. |
| PDF | Markdown or plain text | Use a PDF extraction tool or copy the text content directly. Recreate important tables in Markdown manually; they are worth the effort. |
| PowerPoint (.pptx) | Markdown | Slide titles become headings. Include speaker notes where they contain substantive context. Skip purely decorative slides. |
| Confluence / Wiki | Markdown | Most wiki platforms offer a Markdown or HTML export option. Clean up navigation boilerplate and breadcrumb content before saving. |
| Jira / Azure DevOps | CSV or Markdown | Export relevant epics, stories, or backlog items. Focus on closed items that established expected behaviour, not just open tickets describing future work. |
| Email Threads | Markdown | Copy key decisions and conclusions, not the full thread history. Attribute authorship where the context matters, and date the entry. |
| Meeting Recordings | Transcript (plain text) | Use a transcription tool such as Whisper or Otter.ai. Summarize into decisions and action items rather than preserving word-for-word transcripts. |
Do not let perfect be the enemy of useful. A rough conversion that captures 80% of the content is far more valuable than no conversion at all. Start with the highest-priority documents from your inventory and work down the list over time.
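The Word-to-Markdown row in the table can be automated with a short script. The sketch below is a minimal example, assuming Pandoc is installed and on the `PATH`; the function names and the choice of GitHub-Flavored Markdown (`gfm`) as the output format are assumptions for illustration. By default it only returns the commands it would run, so you can inspect them first.

```python
import subprocess
from pathlib import Path


def pandoc_command(source: Path, out_dir: Path) -> list[str]:
    """Build the Pandoc invocation that converts one .docx file to Markdown."""
    target = out_dir / (source.stem + ".md")
    # -f and -t select the input and output formats; gfm is GitHub-Flavored Markdown.
    return ["pandoc", "-f", "docx", "-t", "gfm", str(source), "-o", str(target)]


def convert_all(source_dir: Path, out_dir: Path, dry_run: bool = True) -> list[list[str]]:
    """Convert every .docx under source_dir; with dry_run, only return the commands."""
    commands = [pandoc_command(p, out_dir) for p in sorted(source_dir.rglob("*.docx"))]
    if not dry_run:
        out_dir.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            # Raises CalledProcessError if Pandoc reports a failure.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_all(Path("legacy-docs"), Path("resources/external/requirements"), dry_run=False)` would run the conversions; both directory names here are hypothetical. Note that files with the same name in different subfolders would map to the same output file in this sketch.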
Once you have converted your documents, you need a consistent place to store them so that the AI agent always knows where to look. A dedicated folder in your repository works well for this. A common convention is `/resources/external/`, which distinguishes this curated context from the actual source code and build artifacts.
A practical structure might look like this:
    /resources/
      /external/
        /requirements/   ← product and functional requirements
        /architecture/   ← system design documents and ADRs
        /data-model/     ← schema definitions and ER diagram notes
        /api-specs/      ← OpenAPI files or informal API documentation
        /runbooks/       ← deployment and incident response procedures
        /security/       ← security policies and compliance requirements
        /conventions/    ← coding standards and team practices
Keep file names descriptive and include a brief `README.md` in each subfolder explaining what belongs there and when it was last reviewed. This helps both human team members and the AI agent navigate the structure without guessing.
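If you would rather not create these folders by hand, a small script can scaffold them. The following is a rough sketch: the folder names and descriptions come from the structure listed above, while the README wording and the "Last reviewed" line format are assumptions. Existing README files are left untouched, so the script is safe to re-run.

```python
from datetime import date
from pathlib import Path

# Folder names and descriptions mirror the structure listed above.
CONTEXT_FOLDERS = {
    "requirements": "Product and functional requirements",
    "architecture": "System design documents and ADRs",
    "data-model": "Schema definitions and ER diagram notes",
    "api-specs": "OpenAPI files or informal API documentation",
    "runbooks": "Deployment and incident response procedures",
    "security": "Security policies and compliance requirements",
    "conventions": "Coding standards and team practices",
}


def scaffold_context(repo_root: Path) -> list[Path]:
    """Create resources/external/<folder>/README.md for each folder.

    Returns the README files that were newly written; existing ones are skipped.
    """
    created = []
    for name, description in CONTEXT_FOLDERS.items():
        folder = repo_root / "resources" / "external" / name
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.md"
        if not readme.exists():
            readme.write_text(
                f"# {name}\n\n{description}.\n\n"
                f"Last reviewed: {date.today().isoformat()}\n",
                encoding="utf-8",
            )
            created.append(readme)
    return created
```

Run it once with `scaffold_context(Path("."))` from the repository root, then edit each README to describe what actually lives in the folder.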
Many projects span more than one repository. The main application might live in one repo, the infrastructure configuration in another, and the internal tooling in a third. In VS Code, you can bring all of these together into a single Workspace using a `.code-workspace` file. The file simply lists the folders you want open at the same time:
    {
      "folders": [
        { "path": "/code/myapp" },
        { "path": "/code/myapp-infra" },
        { "path": "/code/myapp-tools" }
      ]
    }
With the workspace open, the AI agent can see and reference files across all folders in a single session. This is particularly useful when you cannot add AI-specific content directly to a tightly controlled repository. Create a separate, purpose-built repository for your AI context files, add the controlled project folder into the workspace alongside it, and the agent can see both without you ever touching the original repo's structure.
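For the separate-repository setup described above, the workspace file can be generated rather than written by hand. This is a minimal sketch: `write_workspace` is a hypothetical helper, and it emits only the `folders` list shown in the example file.

```python
import json
from pathlib import Path


def write_workspace(workspace_file: Path, folders: list[str]) -> dict:
    """Write a VS Code .code-workspace file that opens the given folders together."""
    workspace = {"folders": [{"path": folder} for folder in folders]}
    workspace_file.write_text(json.dumps(workspace, indent=2) + "\n", encoding="utf-8")
    return workspace
```

For example, `write_workspace(Path("myapp.code-workspace"), ["/code/myapp", "/code/myapp-ai-context"])` pairs the controlled project with a separate context repository. The `myapp-ai-context` name is an assumption used only for illustration.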
Every AI agent needs a set of standing instructions for your project: where things are, what the standards are, what is off-limits, and how decisions get reviewed and approved. This is the single most important investment you can make in a productive long-term working relationship with your agent, and it pays off every single session.
GitHub Copilot reads from a file called `copilot-instructions.md` placed in your repository's `.github/` folder. Claude Code reads from a file named `CLAUDE.md` at the root of your project. Both serve the same purpose: persistent context that the agent loads before every session so you do not have to re-explain your project from scratch each time.
Below is a representative example of what a well-structured instruction file looks like for a mid-sized application. Adjust every section to match your project's specifics.
    # Project: MyApp - AI Agent Instructions

    ## Overview
    MyApp is a billing and customer management platform serving SMB clients.
    Primary language: Java 17 + Spring Boot. Frontend: React 18 + TypeScript.
    This project handles personally identifiable information (PII) and payment
    data subject to PCI-DSS and PIPEDA compliance requirements.

    ## Where to Find Project Context
    - Business requirements: /resources/external/requirements/
    - Architecture decisions: /resources/external/architecture/
    - API contracts: /resources/external/api-specs/openapi.yaml
    - Data model and schema: /resources/external/data-model/schema.md
    - Security policies: /resources/external/security/

    ## Technology Stack
    - Java 17 + Spring Boot 3.2 (backend services)
    - React 18 + TypeScript 5 (frontend)
    - PostgreSQL 15 (primary database)
    - Redis 7 (session and cache layer)

    ## Coding Standards
    - Follow shared utility patterns in /src/main/java/com/myapp/common/
    - All new API endpoints require OpenAPI annotations
    - Service layer tests are required; controller tests are optional
    - Use the repository pattern for all database access; no raw queries in controllers

    ## Review and Approval
    - Changes to /src/main/java/com/myapp/billing/ require billing team review
    - Database schema changes require DBA sign-off before merging
    - Security-sensitive changes require a second reviewer from the security team
    - All changes go through pull request; no direct commits to main

    ## Do Not
    - Do not hardcode credentials, tokens, or environment-specific URLs
    - Do not modify database migration files already deployed to production
    - Do not add external dependencies without an architecture review
    - Do not generate, log, or store PII in temporary files or debug output
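An instruction file is only useful if the paths it names actually exist. The sketch below is one possible check, assuming the file refers to repository-relative paths beginning with `/resources/` or `/src/` as in the example. The regular expression and the `missing_paths` helper are illustrative and would need adjusting for other layouts.

```python
import re
from pathlib import Path

# Matches repository-relative paths such as /resources/external/security/
# (assumes paths start with /resources/ or /src/, as in the example file).
PATH_PATTERN = re.compile(r"(?<![\w.])/(?:resources|src)/[\w./-]+")


def missing_paths(instructions: str, repo_root: Path) -> list[str]:
    """Return paths named in the instruction text that do not exist under repo_root."""
    # Strip a trailing period so a path at the end of a sentence still resolves.
    referenced = sorted({m.rstrip(".") for m in PATH_PATTERN.findall(instructions)})
    return [p for p in referenced if not (repo_root / p.lstrip("/")).exists()]
```

Running this in a pre-commit hook or CI job would flag stale references as soon as a folder is renamed or removed, before the agent is pointed at a path that no longer exists.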
With your context centralized and your instruction file in place, you are ready to start working with the agent in a meaningful way. Your first few prompts will reveal gaps: things the agent does not know, references that are missing, or guidance that is too vague to act on. That is expected and, more importantly, useful.
Treat each session as a feedback loop. When the agent's output is not quite right, refine your prompt and try again. When a pattern of refinement reveals that the agent is consistently missing context or defaulting to a behaviour you do not want, that is a signal to update your instruction file (`CLAUDE.md` or `copilot-instructions.md`) rather than repeating the correction every session. The correction belongs in the file, not in your memory.
Over time, this feedback loop becomes a genuine form of continuous improvement. The instruction file grows more precise. The context folder becomes more complete. The quality and relevance of the agent's output improve with each iteration, and far fewer corrections are needed on repeat work. This is exactly how you would train a new team member, and the same patience and deliberateness pay off here.
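One lightweight way to spot corrections worth promoting is to keep a running log of them and count repeats. The sketch below assumes a hypothetical log with one short note per correction; the threshold of three is an arbitrary starting point, not a recommendation from any tool.

```python
from collections import Counter


def corrections_to_promote(correction_log: list[str], threshold: int = 3) -> list[str]:
    """Return corrections repeated at least `threshold` times, most frequent first."""
    # Normalize case and whitespace so trivially different notes count together.
    counts = Counter(note.strip().lower() for note in correction_log if note.strip())
    return [note for note, n in counts.most_common() if n >= threshold]
```

Anything this returns is a candidate for a new line in the instruction file. Exact-match counting is crude, so treat the result as a prompt for review rather than a rule.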
The framework above works with many different tools, but some combinations work particularly well together for this kind of agentic development workflow. The following stack covers every layer from version control to the AI agent itself.
| Tool | Role in the Workflow | Why It Works Well Here |
|---|---|---|
| GitHub | Version control and storage | All project files, converted documentation, and instruction files live here. GitHub's native integration with both Copilot and VS Code makes it the natural center of the workflow. |
| Visual Studio Code | Development environment | Workspace files let you combine multiple repositories in a single view. The extension ecosystem is deep, and both major AI extensions are built for it. |
| GitHub Copilot or Claude | AI agent | Both are strong choices. Copilot integrates natively with VS Code and GitHub. Claude tends to reason well over long documents and handles large context windows gracefully. Use whichever aligns with your existing paid subscription. |
| GitHub Pull Requests extension | Review workflow | Create, review, and merge pull requests directly from VS Code without switching to a browser. This keeps the work context intact during review cycles and makes it easy to commit AI-assisted output to the repository immediately. |
| VS Code Workspaces | Multi-repository context | When your project spans multiple repositories, or when you need a separate AI repository alongside a controlled project, workspaces bring everything into a single session without touching the original repository structure. |
A paid AI subscription matters here. Free tiers typically impose context-window and message-rate limits that quickly become frustrating on projects of any real size. On a serious project, the productivity gains from this framework can recover the subscription cost quickly, often within the first week of use.
Once the framework is in place, improvements show up across the entire project lifecycle, not just in code generation. Here is what typically shifts once an agent has full project context to work with.
With full context available, the agent can help decompose features, identify dependencies, flag risks, and generate implementation plans that account for the existing architecture and constraints. Planning sessions that used to take hours of discussion can be drafted in minutes and refined from there.
Design ideation is faster when the agent can cross-reference existing patterns and constraints in real time. Design review is sharper because the agent can check whether a proposed change is consistent with documented decisions and security policies before a human reviewer ever sees it. The quality of what reaches review improves, and the review cycle itself gets shorter.
An agent working with full project context generates code that is consistent with the rest of the codebase, follows established patterns, and respects the constraints defined in the instruction file. The quality difference compared to a context-free agent is significant and immediately visible in code review.
With access to the security policy documents, the agent becomes a useful first-pass reviewer for defensive coding issues: missing input validation, improper error handling, insecure dependency usage, and common OWASP vulnerabilities. This does not replace human review, but it raises the floor on what makes it to a human reviewer in the first place.
Documentation is frequently the first thing to fall behind in a busy project. With the codebase visible and the context files in place, the agent can generate API documentation, update architecture notes, draft release summaries, and produce onboarding guides from the actual current state of the code rather than from someone's memory of what they think the code does.
When a bug is reported, an agent with full project context can traverse the codebase far faster than a developer tracing it by hand. Give it the symptom, point it at the relevant areas, and ask for root-cause hypotheses. It will surface candidates in a fraction of the time a manual investigation would take. Combined with access to runbooks and historical decision records, troubleshooting sessions become dramatically shorter.
Ask the agent to generate unit tests for a service class and it will write tests that align with the project's testing conventions, use the right assertion libraries, and cover edge cases drawn from the requirements documents. If your team practices test-driven development, the agent can produce the test suite first and let you write the implementation against it.
Release notes, change summaries, stakeholder updates, and incident post-mortems all benefit from an agent that knows exactly what changed and why. Rather than spending time writing a summary from memory, describe what was done and let the agent draft it from the actual commit history and ticket context. The output is more accurate, more complete, and takes a fraction of the effort.
The volume and quality of work a single developer or team can produce using this framework is genuinely different from what was possible before. Speed is higher, output quality is more consistent, and the cognitive load is lower because the agent is handling a significant share of the context-tracking, cross-referencing, and first-draft work that used to eat time without producing much visible output.
None of this requires a greenfield project or a wholesale technology rewrite. The same codebase your team has been maintaining for years, with the same (or different) people, can operate at a materially higher level once the knowledge base is organized and the agent is properly oriented. The investment in the first few steps pays compounding returns from that point forward.
The developers who internalize this workflow become the ones producing work that is better reasoned, better tested, better documented, and more defensible under scrutiny. In a field where speed and quality are perpetually in tension, that combination is a real advantage.