Spreadsheets & Data Analysis
Chapter objectives
- Query data in natural language
- Clean and reformat a messy spreadsheet
- Spot trends and anomalies without advanced formulas
The spreadsheet, that unloved coworker
In Marc's firm, spreadsheets are everywhere: fee tracking, invoicing software exports, clients' bank statements, missing-document checklists. And almost nobody truly masters them. The result: hours spent copying things over, sorting by hand, and a vague feeling of "I know the answer is in this file, but I don't know how to get it out".
The good news of this chapter: you no longer need to learn formulas to make your data talk. Pivot tables, VLOOKUPs and other intimidating formulas become optional — you describe what you want to know in plain language, and the calculation is done for you, with the reasoning explained. That explanation is crucial: it lets you verify the logic without redoing the calculation.
An honest clarification before we start: delegating the calculation doesn't mean delegating responsibility for the number. This whole chapter rests on a clear division of roles — the AI calculates, sorts and cleans; you verify a sample and validate the interpretation. On financial data, this guardrail is non-negotiable.
Asking your data questions
Import an export (CSV, spreadsheet) and ask normal questions: "who are my 10 biggest clients?", "how is revenue trending by quarter?". Claude calculates and explains the reasoning. You can chain questions like in a conversation: each answer can be dug into ("and if we exclude client X?", "same thing but by team member?") without rephrasing everything.
For reliable results, briefly describe your file in the first message: what each column contains, the period covered, the quirks ("the Amount column includes tax", "credit notes appear as negatives"). Those two sentences of context prevent 90% of interpretation misunderstandings — exactly like when you hand a file to a colleague.
Here is an export of fees by client (CSV): <paste>. Context: amounts in € excluding tax, one row per invoice, the "Date" column is the issue date, credit notes are negative. Answer: - top 10 clients by revenue - quarter-over-quarter trend - clients down more than 20% vs last year Give a short comment for each result and explain how you calculated.
Cleaning a messy spreadsheet
Inconsistent dates, duplicates, mixed-up columns? Describe the desired final state, and Claude proposes the clean table and explains what it fixed. The method fits in one sentence: describe the destination, not the route. You don't need to know how to normalize dates; you need to say "all dates in YYYY-MM-DD format".
Always demand two deliverables: the clean table, and the list of corrections made. This list is your audit log: "12 duplicates removed, 8 dates reformatted, 3 rows without amounts flagged". It lets you immediately spot an overzealous correction — for example two legitimate invoices for the same amount on the same day, wrongly taken for a duplicate. Without this list, you're validating blind.
This invoice table is a mess. Clean it: - normalize dates to YYYY-MM-DD format - remove duplicates (same client + same amount + same date) - add a "month" column - sort by date Show a before/after preview, list ALL corrections made, and flag separately the doubtful rows you didn't dare fix on your own.
Spotting trends and anomalies
Explicitly ask for anomalies: aberrant amounts, sudden gaps, missing values. That's often where data-entry errors or real client issues hide. The word "explicitly" matters: by default, an analysis answers your questions; it won't spontaneously flag the invoice entered as €18,400 instead of €1,840, unless you've requested the hunt for outliers.
For Marc, this monthly hunt has become a mini-ritual: ten minutes on the fee export to detect unusual amounts, clients dropping off, and invoicing gaps. Twice in three months, it revealed a real problem — a struggling client he was able to call before the relationship broke down, and a data-entry error that would have skewed the annual reporting. An anomaly isn't a bug in your data: it's often the most valuable information it contains.
Analyze this fee export: <paste>. Hunt for anomalies: 1. Aberrant amounts (far above or below the client's usual) 2. Clients whose billing dropped more than 30% over the last 3 months 3. Months with no invoice at all for a usually regular client 4. Missing or inconsistent values For each anomaly: the row concerned, why it's suspicious, and your hypothesis (data-entry error? real client signal?).
Reporting back: from table to clear message
An analysis only has value if its recipient understands it. And Marc's recipient is almost never another accountant: it's a tradesperson client, a rushed business partner, a banker. Systematic final step: ask for the plain-language translation of the results — "summarize these figures in 5 sentences for a non-financial client" or "write the commentary paragraph I'll put in my monthly reporting".
You can also prepare the visual presentation: ask which chart type would suit each result (trend over time, breakdown, comparison) and have the data generated ready to paste into your spreadsheet to create it. The numerical analysis, the written commentary and the visual suggestion all come out of the same conversation thread — your complete monthly reporting in twenty minutes.
Cross-referencing two files
The next level up: cross-referencing two sources. The fee export on one side, the list of time spent per team member on the other — and the awkward question: "which clients take a lot of our time for low fees?". Done manually, this cross-reference requires lookup formulas few people master; in natural language, it's a simple sentence, provided you indicate the common key (here, the client name or code) and ask for the list of unmatched items.
That list of unmatched items is the classic trap of cross-referencing: a client spelled "Atelier Bernard" in one file and "BERNARD Atelier" in the other silently disappears from the analysis. Always ask "list the rows you couldn't match" — you fix the spellings, you rerun, and your cross-reference becomes complete. This kind of reflex is what separates a professional analysis from an approximate one.
Guardrails: trust is calibrated
Let's recap the chapter's discipline, because it's what will let you move fast for a long time: column context in the first message, calculation explanation required, list of corrections for any cleanup, sample recounted by hand, totals verified, unmatched items listed for cross-references. Six reflexes, two minutes each — and analyses you can stand behind.
With experience, you'll calibrate your trust: on tasks you've verified ten times without an error, you'll lighten the checks; on a new data source or a high-stakes case, you'll strengthen them. It's exactly the relationship you build with a human team member — trust isn't decreed, it settles in through the repeated experience of work done well.
Context
Marc has an 18-month fee export in disarray: dates in three different formats, a few duplicates, and he suspects two or three clients are dropping off without him having noticed. Before his quarterly review with his business partner, he wants a clean file, a reliable top-clients list, the list of declining clients — and a commentary paragraph ready to paste into his reporting.
Instructions
- Take a real spreadsheet (anonymized if needed: replace names with Client A, B, C).
- Describe the column context in the first message: units, period, quirks.
- Ask for a cleanup with a before/after preview and the complete list of corrections.
- Verify a sample: recount 3 random rows and the grand total by hand.
- Ask for the top clients, the quarterly trend and the anomaly hunt (with hypotheses: data-entry error or real signal?).
- For each flagged anomaly, decide for yourself: which one deserves real action this week?
- Finish with the report-back: "summarize these results in 5 sentences for a non-financial reader".
In summary
- You can query your data in natural language, and chain questions in conversation.
- Describe the column context in the first message: that prevents 90% of misunderstandings.
- To clean up, describe the desired final state and demand the list of corrections made.
- Explicitly ask for anomalies: they're often the most valuable information in the file.
- "Explain how you calculated" is your fastest verification.
- On numbers, keep final control: a recounted sample + a verified total.
- To cross-reference two files, indicate the common key and ask for the list of unmatched items.
- Finish with the plain-language translation: an understood analysis beats a brilliant one.
Quiz — check your understanding
1. How do you analyze a spreadsheet without advanced formulas?
2. What precaution applies to numbers?
3. Why demand the list of corrections during a cleanup?
4. Why ask "explain how you calculated"?
5. You cross-reference two files by client name. What is the classic trap?