This post is part of my journey from solution architect to AI architect. In it, I share how I used multiple LLMs not only to generate code but also to design and deliver a law assistant in small, testable phases.

The Problem

The Law AI Assistant ingests key provincial (Quebec) laws, such as the “Act respecting labour standards”, the “Civil Code of Québec”, and the “Act respecting the governance of the health and social services system”, and uses a small Llama‑based model and retrieval to answer questions grounded in these texts.

I decided to use a small language model (SLM), specifically a Llama‑based model running locally, to keep costs predictable and preserve privacy for legal queries.

I had to decide how to ingest Quebec legal documents, how to structure retrieval so answers stay anchored in the law, and how to deliver the system in increments that produce working software.

Step 1: Asking an LLM to Slice the Architecture

Instead of designing the roadmap alone, I asked ChatGPT to break the system into 3–5 phases with the following constraints: each phase must produce working software that is independent, testable in isolation, and built on previous phases. This is essentially an incremental architecture applied to an AI system.

The model proposed a 4‑phase structure that I refined:

  • Phase 1: Core ingestion and basic Q&A over a limited Quebec law corpus
  • Phase 2: Improved retrieval quality and relevance scoring
  • Phase 3: Better reasoning and context handling for more complex questions
  • Phase 4: Optimization, UX, and non‑functional improvements (latency, observability, etc.)

This gave me a simple but concrete backbone for delivery, similar to how we define thin vertical slices in modern agile projects.

Step 2: Turning Phases into a Markdown Brief for the LLM

Once the phases were clear, I focused on Phase 1 and asked ChatGPT to generate a structured Markdown “spec” I could reuse as a prompt. The template looked like this:

  • Feature Overview: explain what you are building overall
  • Phase breakdown: Phases 1 to 4
  • Starting with Phase 1:
    • Context (tech stack, relevant files)
    • Intent of this phase
    • Constraints
    • Scope
    • Verification and how we test success

This Markdown file behaved like a lightweight functional spec, a prompt, and a contract with the LLM. Instead of saying “build my app,” I now had a precise, structured artifact I could feed to any model.
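For concreteness, here is a hypothetical version of that brief. The file content below is illustrative, not the exact spec I used; the laws and tech stack are the ones described in this post, but the field values are examples:

```markdown
# Law AI Assistant — Feature Overview
A Q&A assistant grounded in Quebec law texts, powered by a local Llama-based SLM.

## Phase Breakdown
1. Core ingestion and basic Q&A over a limited corpus
2. Improved retrieval quality and relevance scoring
3. Better reasoning and context handling
4. Optimization, UX, and non-functional improvements

## Phase 1
### Context
- Tech stack: Python, local Llama-based model, local database
- Relevant files: none yet (greenfield)
### Intent
Ingest the three target Quebec laws and answer simple questions with citations.
### Constraints
- No frontend yet; CLI or basic API only
- All data stays local
### Scope
Schema, ingestion pipeline, embedding + retrieval, minimal Q&A interface
### Verification
A fixed set of sample questions returns answers that cite the correct article.
```

The point is less the exact fields than that every section narrows what the LLM has to reason about.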

Step 3: Using Multiple LLMs as “Architect” and “Reviewer”

Next, I moved to Claude. I imported the same global instructions, pasted the Markdown, and asked Claude to improve it. Claude played the role of a critical reviewer, raising questions, challenging assumptions, and refining the Phase 1 definition and constraints.

At the end of this step, I had a more robust CLAUDE.md file that described Phase 1 in enough detail that a developer or an LLM acting as one could work from it.

Step 4: Prompt‑Driven Development for Phase 1

With CLAUDE.md ready, I asked Claude in “code” mode to implement Phase 1 with a very explicit instruction:

“Implement Phase 1 as defined in CLAUDE.md. Start with the database migrations and the IngestionService. Do not scaffold the frontend yet.”

Phase 1 for the Quebec law assistant included the following:

  • A simple schema to store legal documents and chunks (articles, sections)
  • An ingestion pipeline to load Quebec law texts into the database
  • An embedding and retrieval layer to support basic question answering using the SLM
  • A minimal interface (for example, a CLI or basic API) to send a question and receive an answer, plus references

Because the prompt clearly described the context, scope, constraints, and execution order, the generated code was much closer to what I would expect from a developer following a well‑written ticket.
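To give a feel for what Phase 1 amounts to, here is a minimal sketch of the chunk storage and retrieval layer. Everything here is hypothetical (the `legal_chunks` table, the `embed` function, the names `ingest` and `retrieve`); in particular, the bag‑of‑words `embed` is only a stand‑in for the local model's real embeddings:

```python
import math
import sqlite3
from collections import Counter

def embed(text: str) -> dict:
    # Stand-in embedding: a bag-of-words vector. In the real system this
    # would call the local SLM's embedding endpoint instead.
    return dict(Counter(text.lower().split()))

def cosine(a: dict, b: dict) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Schema: one row per chunk (article or section) of a law.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE legal_chunks (
        id INTEGER PRIMARY KEY,
        law TEXT NOT NULL,        -- e.g. 'Act respecting labour standards'
        article TEXT NOT NULL,    -- e.g. 'art. 52'
        body TEXT NOT NULL
    )
""")

def ingest(law: str, article: str, body: str) -> None:
    conn.execute(
        "INSERT INTO legal_chunks (law, article, body) VALUES (?, ?, ?)",
        (law, article, body),
    )

def retrieve(question: str, top_k: int = 3) -> list:
    # Rank every stored chunk against the question; a real system would
    # precompute and index embeddings instead of scoring on the fly.
    q = embed(question)
    rows = conn.execute("SELECT law, article, body FROM legal_chunks").fetchall()
    return sorted(rows, key=lambda r: cosine(q, embed(r[2])), reverse=True)[:top_k]
```

The `(law, article, body)` tuples that `retrieve` returns are what the SLM prompt is built from, which is what lets every answer carry its references.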

Step 5: Iterative Delivery Across Phases

After Phase 1, I asked the LLM how to test the implementation and used its suggestions to write and run tests. Once I was satisfied, I moved to Phase 2, updated the Markdown spec, and repeated the pattern: refine the phase, have the LLM implement it, then test and adjust.
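The tests themselves were ordinary unit tests. As one sketch of the kind of check the LLM suggested for ingestion correctness (the `split_into_articles` helper is hypothetical, assuming laws with `Article N.` headings):

```python
import re

def split_into_articles(text: str) -> list:
    # Hypothetical ingestion helper: split a law text on 'Article N.' headings.
    parts = re.split(r"(?m)^(Article \d+)\.\s*", text)
    # re.split keeps the captured headings:
    # ['', 'Article 1', body1, 'Article 2', body2, ...]
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]

def test_split_keeps_every_article():
    law = "Article 1. First rule.\nArticle 2. Second rule.\n"
    chunks = split_into_articles(law)
    assert [a for a, _ in chunks] == ["Article 1", "Article 2"]
    assert chunks[1][1] == "Second rule."

test_split_keeps_every_article()
```

Tests like this one are what make each phase verifiable in isolation, which is the whole point of the phased spec.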

Over time, this produced:

  • Small, independent increments that always resulted in working software
  • Continuous feedback on retrieval quality and answer grounding
  • A way to gradually improve reasoning and UX without breaking the system

This is very close to how we want architecture and delivery to work, even in non‑AI projects: small slices, continuous validation, and deliberate evolution.

Why This Workflow Works Well for Architects

From an architecture perspective, this approach has a few strong properties:

  • Clear boundaries: Each phase has a defined scope, explicit goals, and measurable outcomes.
  • Strong testability: You can validate each phase independently, from ingestion correctness to retrieval quality.
  • Lower cognitive load: You never try to design or build the entire system at once.
  • Better LLM performance: Models behave more reliably when the prompt narrows the scope, clarifies intent, and provides a structured context.

For a legal assistant, there is an additional benefit: using a small local model with retrieval from an explicit corpus of Quebec law makes it easier to control privacy and mitigate hallucinations by exposing citations or direct quotations.
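Grounding the answer comes down to how the prompt is assembled. A minimal sketch, assuming the `(law, article, body)` chunk shape from retrieval (the function name and prompt wording are illustrative):

```python
def build_grounded_prompt(question: str, chunks: list) -> str:
    # Each chunk is a hypothetical (law, article, body) tuple
    # produced by the retrieval layer.
    context = "\n\n".join(
        f"[{law}, {article}]\n{body}" for law, article, body in chunks
    )
    return (
        "Answer using ONLY the excerpts below. "
        "Cite the law and article for every claim, "
        "e.g. [Civil Code of Québec, art. 1457]. "
        "If the excerpts do not answer the question, say so.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Because every excerpt arrives labeled with its law and article, the model can quote its sources, and a missing citation is an immediate signal to distrust the answer.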

How You Can Apply This to Your Projects

If you are experimenting with AI, here is a simple pattern you can reuse:

  • Ask an LLM to propose a 3–5 phase roadmap with the constraints “independent, testable, working software builds on previous phases.”
  • Turn the chosen phase into a structured Markdown brief with context, intent, constraints, scope, and verification.
  • Use one LLM to generate the spec, and another to review and challenge it.
  • Store the spec in your repo and use it as the main prompt when asking an LLM to implement code.
  • After each phase, ask the LLM how to test, run the tests, and then iterate.

About the Author

My name is Adel Ghlamallah, and I’m an architect and a Java developer.
