Learning NotebookLM: What I did and what happened

I wanted to understand how NotebookLM works. The obvious test case was my own company accounts. I’ve been running a UK Ltd since 2021, I have the documents, and I know what the actual answers should be. If the system invented something, I’d catch it immediately because I lived through it. This isn’t a productivity hack piece. This is a process journal. Here’s what I did, step by step, and what actually came out the other end.

Why this test case

NotebookLM grounds an AI in specific documents so it’s far less likely to hallucinate. It reads what you give it, answers from that, and says when something falls outside the documents. My company accounts were ideal learning material because I could verify every claim. If NotebookLM pulled a number, I could check it against the actual filing. If it made a connection between spending patterns, I could confirm whether that was real or pattern-matching on noise.

Step 1: Uploading the documents

I gathered everything I had:

  • All filed CT600s (tax returns) since incorporation
  • The Certificate of Incorporation
  • Companies House filings
  • My detailed accounting records (the spreadsheets where I actually track what I’ve spent)

Then I created what I called Core_Context: a document that wasn’t just data, but metadata. It was me explaining the intent behind the numbers. Why the accounting year-end changed. What services were included in each period. How the business had evolved. The human narrative that sits underneath the tax forms. Then, I uploaded everything into NotebookLM.

Step 2: Testing with real questions

Once the documents were in, I started asking it things I already knew the answer to.

First test: year-on-year analysis of revenue and spending from 2021 to 2025. I asked NotebookLM to pull the numbers from the filed accounts and compare them.

It worked. It didn’t guess. It didn’t apply some generic framework about “healthy growth” or “scaling patterns.” It reported what the documents said: revenue grew, spending on digital services increased proportionally, margins stayed fairly consistent. All verifiable against the actual filings.
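A comparison like this can also be cross-checked by hand. Below is a minimal Python sketch of that check, assuming you have typed each year's revenue and spending in from the filed accounts yourself. The names `YearFigures` and `year_on_year` are made up for illustration, and the numbers in the example run are placeholders, not my company's figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class YearFigures:
    """Revenue and spending for one accounting year, copied from the filings."""
    revenue: float
    spending: float


def pct_change(new: float, old: float) -> Optional[float]:
    """Percentage change from old to new, or None when old is zero."""
    if old == 0:
        return None
    return round(100 * (new - old) / old, 1)


def year_on_year(figures: dict[int, YearFigures]) -> list[dict]:
    """Return one row per year with changes against the previous year."""
    rows = []
    previous = None
    for year in sorted(figures):
        current = figures[year]
        # Spending as a share of revenue. A roughly flat ratio across years
        # is what "spending increased proportionally" would look like.
        ratio = round(current.spending / current.revenue, 3) if current.revenue else None
        rows.append({
            "year": year,
            "spend_ratio": ratio,
            "revenue_change_pct": pct_change(current.revenue, previous.revenue) if previous else None,
            "spending_change_pct": pct_change(current.spending, previous.spending) if previous else None,
        })
        previous = current
    return rows


if __name__ == "__main__":
    # Placeholder figures only; replace with the numbers from your own filings.
    placeholder = {
        2021: YearFigures(revenue=100.0, spending=20.0),
        2022: YearFigures(revenue=150.0, spending=30.0),
    }
    for row in year_on_year(placeholder):
        print(row)
```

If the rows this produces disagree with what NotebookLM reports, that points to either a typing mistake or an invented number, and the filing settles which.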

The second thing I asked it to do: trend spotting. I asked why spending had increased year-on-year. The documents themselves told the story. I’d scaled from freelance tools to proper software subscriptions. The spending was deliberate, not accidental. NotebookLM identified a real pattern, not a fictional one.

Third: compliance calendar. I asked NotebookLM to extract all the key dates: when my CT600 is due, when I need to file accounts, when corporation tax is payable. Instead of manually checking HMRC and Companies House rules, I had a list generated from my actual filing history. I verified the dates. They were right.
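For anyone who wants to verify dates like these independently, here is a rough Python sketch of the standard deadlines. It assumes a private limited company with an ordinary 12-month accounting period that ends on the accounting reference date, and it is not the company's first set of accounts. Under those assumptions, corporation tax is payable 9 months and 1 day after the period end, the CT600 is due 12 months after the period end, and accounts are due at Companies House 9 months after the period end. Large companies paying by instalments, first-year filings, and changed year-ends follow different rules and are not covered. The function names and the example date are illustrative only.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Add calendar months to a date.

    A date on the last day of its month maps to the last day of the target
    month; any other day is kept, clamped to the target month's length.
    """
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    is_month_end = d.day == calendar.monthrange(d.year, d.month)[1]
    day = last_day if is_month_end else min(d.day, last_day)
    return date(year, month, day)


def compliance_dates(period_end: date) -> dict[str, date]:
    """Standard deadlines for one accounting period (simplified, see assumptions)."""
    return {
        "Annual accounts due at Companies House": add_months(period_end, 9),
        "Corporation tax payment due": add_months(period_end, 9) + timedelta(days=1),
        "CT600 company tax return due": add_months(period_end, 12),
    }


if __name__ == "__main__":
    # Hypothetical period end, not taken from my filings.
    for label, due in compliance_dates(date(2024, 3, 31)).items():
        print(f"{due.isoformat()}  {label}")
```

This is a check on the list, not a replacement for it: the point of generating the calendar from the filing history is that it reflects what actually happened, including any change of year-end that a generic rule would miss.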

Step 3: Taking it further with Claude

NotebookLM is text-based. Once I had the compliance calendar as a text list, I wanted to do something with it: turn dates into calendar entries, make it actionable instead of just information.

I took the raw text report from NotebookLM and handed it to Claude with a specific instruction: convert this into an ICS file (calendar format) that I can import directly into my calendar system.

Claude parsed the dates, structured them, generated a file I could actually use. Not as raw information. As action.
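To show what that conversion involves, here is a minimal Python sketch that turns a list of dates and labels into an ICS file of all-day events. This is not the file Claude generated for me; it is an illustration of the format. The function names, the `PRODID` string, the `example.invalid` UID suffix, and the sample event are all placeholders. It skips the folding of long lines that the iCalendar spec asks for, so keep summaries short.

```python
import uuid
from datetime import date, datetime, timezone


def escape_text(value: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def build_ics(events: list[tuple[date, str]]) -> str:
    """Build an iCalendar document with one all-day event per (date, summary)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//compliance-calendar//EN",  # placeholder identifier
    ]
    for day, summary in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{uuid.uuid4()}@example.invalid",  # any globally unique string works
            f"DTSTAMP:{stamp}",
            f"DTSTART;VALUE=DATE:{day:%Y%m%d}",  # date-only value means an all-day event
            f"SUMMARY:{escape_text(summary)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are separated by CRLF.
    return "\r\n".join(lines) + "\r\n"


def write_ics(path: str, events: list[tuple[date, str]]) -> None:
    """Write the calendar to disk without letting Python rewrite the line endings."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(build_ics(events))


if __name__ == "__main__":
    # Hypothetical event for illustration.
    print(build_ics([(date(2025, 1, 1), "Corporation tax payment due")]))
```

Calling `write_ics("compliance.ics", events)` with the verified dates gives a file most calendar apps can import. The dates still come from the grounded list; the script only changes their shape.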

This is where the multi-model workflow mattered. NotebookLM did the grounding: it knew what was actually true about my company. Claude did the conversion: it turned that truth into something I could act on.

The workflow looked like this: me with a question → NotebookLM grounding the answer in actual documents → Claude converting that into structured output → me with a calendar file ready to import.

Why this matters more than just uploading to Claude

You could do this in Claude Projects too: upload all the same documents, ask the same questions. And it would probably work fine. But there’s a meaningful difference.

Claude has memory across conversations, but that memory is made of generated summaries, not the documents themselves. If I attached my CT600 to one conversation, the next conversation wouldn’t have access to it. The documents don’t persist, only whatever Claude has distilled from past chats. That’s useful for continuity, not for accuracy. NotebookLM keeps the actual source material permanently in the notebook. I’m always querying the documents, not a summary of a summary.

And more importantly, the grounding is the point. NotebookLM is designed to stay true to source material. It’s not trying to be helpful in ways that go beyond what the documents say. General-purpose AI is designed to make connections, extrapolate, offer broader context. NotebookLM just says: here’s what the documents actually say.

That’s a meaningful difference when accuracy matters more than helpfulness.

Where I’m at

I’ve just set this up. The maintenance implications are obvious but untested: when I file new accounts, I’ll need to add them to NotebookLM and regenerate the compliance calendar. That’s the trade-off: upfront work to build it, then ongoing work to keep it current.

The upfront work was significant. I had to go through every filing, organise the documents, create the metadata layer, write out the Core_Context_v1 document explaining what each number actually meant. I’d estimate two to three hours total. Not because it’s complicated, but because doing it properly means thinking through your own business history: why you changed the accounting year-end, what each category means, what the actual narrative underneath the tax forms is.

The system works exactly as I expected it to. I ask it questions. It answers from the documents. I verify the answers. I hand them to Claude if I need to do something with them. That workflow is solid.

It’s not revolutionary. It’s also not “just uploading PDFs to ChatGPT.” It’s a system that works because someone thought through the grounding layer and did the work to build it properly. That work is where the value lives.
