Two notebooks. Run them in order. The first builds a fixed pipeline that translates English to SQL. The second turns it into a real ReAct agent that can choose between two tools.
Class 1 - the pipeline (diagrams/pipeline_architecture.drawio.png)
Class 2 - the agent (diagrams/agent_architecture.drawio.png)
Open the .drawio files in https://app.diagrams.net or the draw.io
desktop app to edit them.
| Order | Notebook | What it teaches |
|---|---|---|
| 1 | nl_sql_pipeline.ipynb |
Fixed pipeline: question -> SQL -> answer. Three prompt versions measured against a test suite. The hard lesson on domain terms. |
| 2 | agent.ipynb |
Same data + a second source. Build a multi-tool ReAct agent by hand. The LLM picks the tool. |
nl_sql_pipeline.ipynb Class 1 - the pipeline
agent.ipynb Class 2 - the agent
spacex_launches.db SQLite database (18 SpaceX missions) - used by both
seed.sql SQL to recreate the database
schema.md schema description for the LLM
(column meanings + allowed values + domain terms + few-shot)
vehicle_specs.json rocket specs (thrust, height, reusability...)
used as the agent's second data source
diagrams/ draw.io source + PNG exports for both architectures
pyproject.toml uv project file (so `uv add ...` works)
README.md this file
You need:
-
uvto manage Python and packages. Install once:curl -LsSf https://astral.sh/uv/install.sh | shWindows: see https://docs.astral.sh/uv/getting-started/installation/.
-
An OpenAI API key with at least a few cents of credit. Create one at https://platform.openai.com/api-keys. The notebooks prompt for it on first run (the input is hidden as you type).
git clone https://github.com/fnusatvik07/agentbuilder-class4-nlsql.git
cd agentbuilder-class4-nlsql
# install jupyter and the LangChain packages (one-time)
uv add jupyter langchain-openai langchain-core pandas
# class 1 - the pipeline
uv run jupyter lab nl_sql_pipeline.ipynb
# class 2 - the agent (run this AFTER class 1)
uv run jupyter lab agent.ipynb- How an LLM uses a schema description to write SQL.
- Why
schema.md(for the LLM) is separate fromseed.sql(for SQLite). - How to validate LLM output before executing it.
- How to build a test suite of (question, expected SQL, expected answer) tuples and measure prompt changes against it.
- The hard lesson: when business vocabulary collides with column
values (
heavy launchvs the literal'Falcon Heavy'enum), the LLM will confidently produce the wrong answer unless you write the term down. - The limitations of a fixed-order pipeline.
- Why a fixed pipeline cannot answer multi-source questions.
- How LangChain's
@tooldecorator andbind_tools()work. - How to write the ReAct loop by hand: Thought -> Action -> Observation -> repeat -> Final Answer.
- How to read an agent's trace and spot bad tool choices.
- The new failure modes that come with agents (running in circles, premature termination, hallucinated facts).
rm -f spacex_launches.db
sqlite3 spacex_launches.db < seed.sqlBring them to class. Or open an issue on this repo.

