What "Verifiable Math" Actually Means — and Why It Matters for Your Money
At some point in the last few years, you've probably asked a financial question to a chatbot or AI assistant and gotten back an answer that sounded authoritative. A number with a dollar sign in front of it. A percentage. A timeline. Something specific enough to feel like it was calculated.
The uncomfortable truth is that for most AI tools, it wasn't. The number was generated — produced by a language model that predicts what a plausible-sounding response looks like — rather than computed from a formula applied to your actual inputs. The difference is enormous, and it's not something the confident presentation of the answer gives you any way to detect.
Verifiable math is the alternative. It's a specific approach to building financial AI tools that separates the calculation from the language — where numbers come from a computation layer that ran an actual formula, not from a model inventing something that sounds like the right answer.
Understanding what it means, and why it matters for the specific domain of personal finance, changes how you should evaluate any AI tool you'd use to make money decisions.
What AI language models actually do with numbers
A large language model doesn't calculate. It predicts. The model has been trained on enormous amounts of text, and when you ask it a financial question, it generates a response based on the statistical patterns of how similar questions have been answered in its training data.
For many tasks, this works remarkably well. The patterns in language are rich and the model's ability to navigate them is genuinely impressive. But for mathematical calculations, there's a fundamental problem: the model is learning to produce text that looks like a calculation, not to produce the result of actually running one.
Consider asking a basic model "what's my monthly payment on a $250,000 mortgage at 6.75% interest over 30 years?" A language model doesn't apply the loan amortisation formula. It generates text that resembles what the answer to that question would look like, based on examples in its training data. Sometimes the answer is correct. Sometimes it's close but wrong. Occasionally it's significantly wrong. And there is nothing in the presentation — no visual indicator, no confidence flag — that tells you which type of answer you've received.
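For comparison, here is what actually applying the standard fixed-rate amortisation formula to that question looks like. This is a minimal sketch, not any particular tool's implementation; the function name `monthly_payment` and its parameters are illustrative choices.

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Fixed-rate loan payment: P * r * (1 + r)^n / ((1 + r)^n - 1)."""
    n = years * 12            # number of monthly payments
    r = annual_rate / 12      # monthly interest rate as a decimal
    if r == 0:
        return principal / n  # zero-interest edge case: plain division
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


# The question from the text: $250,000 at 6.75% over 30 years.
payment = monthly_payment(250_000, 0.0675, 30)
print(f"Monthly payment: ${payment:,.2f}")
```

Run twice with the same inputs, this returns the same figure both times, which is exactly the property a generated answer lacks.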
For questions where you can easily verify the answer (trivial arithmetic, well-known facts), an occasional miss is easy to catch and costs little. For complex financial calculations with your specific numbers, you usually can't spot a wrong figure by eye, and that makes the error rate a genuine problem.
The four types of errors this creates
It helps to be specific about what goes wrong, because the failure modes are different and matter differently.
The wrong formula. The model applies a formula that sounds like the right one but isn't quite. For a simple mortgage payment, the standard formula is well-established and the model gets it right most of the time. For something less standard — the tax treatment of a partial business expense, a particular type of loan calculation, a complex annuity projection — the model might apply a simplified or adjacent formula that produces a plausible but incorrect result.
Arithmetic errors in the formula it does apply. Even when the model correctly identifies the formula, it can make errors in applying the numbers. These errors are more common with multi-step calculations where each step's output feeds the next. The model is generating text, not tracking variables, so propagation errors can compound.
Hallucinated inputs. The model might substitute assumed or approximate values for inputs it doesn't have. If you ask for your retirement projection without specifying your current savings, the model might assume a starting balance rather than asking you for it — and the answer reflects that assumption without flagging it clearly.
Confident precision about estimates. A model might say "your emergency fund target is $14,300" when the honest answer would be a range depending on factors it doesn't know. The specific figure sounds like a calculated result. It's often a rough estimate presented in the style of a calculation.
What the computation layer changes
Verifiable math addresses all four failure modes by using a different architecture for numeric outputs.
Instead of asking the language model to produce numbers, the model calls a tool — a calculation function — and passes the relevant inputs. The function runs the actual formula and returns the result. The number that appears in the conversation is the function's output, not the model's prediction of what the output should look like.
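The pattern can be sketched in a few lines. Everything here is hypothetical: the `TOOLS` registry, the `run_tool_call` dispatcher, and the shape of the request dictionary are assumptions made for illustration, not the API of any specific model provider or product.

```python
from typing import Callable, Dict


def savings_rate(income: float, saved: float) -> float:
    """Share of income saved, returned as a fraction between 0 and 1."""
    if income <= 0:
        raise ValueError("income must be positive")
    return saved / income


# Registry of calculation functions the model is allowed to call (hypothetical).
TOOLS: Dict[str, Callable[..., float]] = {"savings_rate": savings_rate}


def run_tool_call(call: dict) -> float:
    """Execute a structured request shaped like {'name': ..., 'arguments': {...}}."""
    func = TOOLS[call["name"]]        # an unknown tool name raises KeyError
    return func(**call["arguments"])  # the number comes from code, not from text


# What the model emits instead of writing a number itself.
request = {"name": "savings_rate", "arguments": {"income": 5000.0, "saved": 750.0}}
result = run_tool_call(request)
print(f"Savings rate: {result:.1%}")
```

The model's job ends at producing `request`; the figure shown to the user is whatever `run_tool_call` returns.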
This is the same principle as how a spreadsheet works: Excel doesn't generate a plausible revenue figure — it applies SUM(B2:B12) to your actual inputs and returns the result. The value in the cell is the calculation's output, not an approximation. Verifiable financial AI extends this principle to a conversational interface: the model handles the language (understanding your question, explaining the result, answering follow-ups), and the computation function handles the arithmetic.
The word "verifiable" refers to what happens after the number appears: you can click on it, expand the formula and inputs that produced it, and confirm that the arithmetic is correct. If the formula uses a 6.75% rate on a $250,000 balance over 360 months, those are the values you'll see. You can check them. You can change them and see how the output changes. The number has a provenance you can inspect, rather than a confidence the model projects.
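One way to give a number that kind of provenance is to return the formula and inputs together with the value. The sketch below assumes a hypothetical `TracedResult` record and a `traced_monthly_payment` function; neither name comes from a real library, and a real interface would render this record rather than print it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TracedResult:
    """A value bundled with the formula and inputs that produced it."""
    value: float
    formula: str
    inputs: Dict[str, float]


def traced_monthly_payment(principal: float, annual_rate_pct: float,
                           months: int) -> TracedResult:
    # Assumes a non-zero rate; a zero rate would divide by zero below.
    r = annual_rate_pct / 100 / 12
    growth = (1 + r) ** months
    value = principal * r * growth / (growth - 1)
    return TracedResult(
        value=round(value, 2),
        formula="P * r * (1 + r)**n / ((1 + r)**n - 1)",
        inputs={"P": principal, "r": r, "n": months},
    )


base = traced_monthly_payment(250_000, 6.75, 360)
print(base.formula)
print(base.inputs)
print(base.value)

# Changing one input and re-running shows how the output responds.
lower_rate = traced_monthly_payment(250_000, 6.0, 360)
print(lower_rate.value)
```

Because the inputs travel with the value, a reader can see that the figure used 360 months and a 6.75% rate without having to take the explanation on trust.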
Why this matters specifically for financial decisions
Most AI use cases can tolerate some error rate. If an AI writing assistant makes a sentence slightly clumsy, you edit it. If an AI coding assistant suggests code that doesn't compile, you see the error and iterate. The feedback loop is fast and the cost of errors is low.
Financial decisions have a different error profile. They're often made infrequently, based on information that isn't quickly verified, with consequences that play out over months or years. A retirement savings projection that's 20% too optimistic doesn't surface its error for decades. A home affordability calculation that overstates your safe borrowing limit doesn't surface its error until you're under financial stress. An emergency fund target that's too low doesn't surface its error until the emergency arrives.
This is the domain where getting the calculation right matters most, and where being confidently wrong is more dangerous than being transparently uncertain.
Verifiable math resolves this by making correctness inspectable rather than assumed. You don't have to trust that the number is right — you can look at the formula and inputs and verify it yourself. This is a different kind of trust than "this AI is usually accurate." It's the same kind of trust you'd give a spreadsheet: not faith in the tool's judgement, but confidence in a visible calculation.
How Cashowa implements this
In Cashowa, every numeric answer in a conversation is produced by a tool call rather than by the language model's generation. When you ask "what's my savings rate?", the model doesn't write a number — it calls a calculation function with your income and savings data, and the function returns the rate. The number you see is that return value.
Clicking on any figure in the interface opens the formula — the specific mathematical expression that produced it — along with the input values used. If you uploaded your bank CSV and the model calculated your average monthly grocery spend as $387, clicking the number shows you: the formula (sum of all categorised grocery transactions ÷ number of months), the transaction list it summed, and the month count. You can verify every element. If there's an error — a transaction miscategorised, an input value that doesn't match what you expect — you can see it and correct it.
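The grocery example can be sketched as follows. This is an illustration of the idea, not Cashowa's code: the `Transaction` type, the `average_monthly_spend` function, and the sample transactions are all made up, and the choice to count only months that contain at least one matching transaction is an assumption the source does not specify.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Transaction:
    day: date
    category: str
    amount: float


def average_monthly_spend(transactions: List[Transaction], category: str) -> dict:
    """Average monthly spend for one category, with everything needed to check it."""
    matched = [t for t in transactions if t.category == category]
    # Assumption: a month counts only if it has at least one matching transaction.
    months = {(t.day.year, t.day.month) for t in matched}
    if not months:
        raise ValueError(f"no transactions in category {category!r}")
    total = sum(t.amount for t in matched)
    return {
        "value": round(total / len(months), 2),
        "formula": "sum(category transactions) / number of months",
        "transactions": matched,       # the exact rows that were summed
        "month_count": len(months),
    }


# Made-up sample data standing in for an uploaded bank CSV.
sample = [
    Transaction(date(2024, 1, 5), "groceries", 120.50),
    Transaction(date(2024, 1, 19), "groceries", 260.00),
    Transaction(date(2024, 1, 28), "rent", 1400.00),
    Transaction(date(2024, 2, 11), "groceries", 393.50),
]

report = average_monthly_spend(sample, "groceries")
print(report["value"], report["month_count"], len(report["transactions"]))
```

If a rent payment had been miscategorised as groceries, it would appear in `report["transactions"]`, which is where a user would spot it.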
The model is still doing important work: it understands your question, decides which calculation to run, explains the result in plain English, and answers follow-up questions. But it doesn't invent numbers. If it can't calculate something with the data you've provided, it asks for the missing input rather than substituting an assumption. That constraint is the feature: it means the absence of a number tells you something, and the presence of a number means something.
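A calculation function can enforce the "ask, don't assume" rule by refusing to run when a required input is absent. The following is a minimal sketch under assumed names (`REQUIRED`, `project_balance`, and the `needs_input` status are hypothetical), using a simple monthly-compounding projection as the example.

```python
REQUIRED = ("current_savings", "monthly_contribution", "annual_return", "years")


def project_balance(inputs: dict) -> dict:
    """Project a balance, or report which inputs are missing instead of guessing."""
    missing = [name for name in REQUIRED if inputs.get(name) is None]
    if missing:
        # No default is substituted: the caller has to go back and ask the user.
        return {"status": "needs_input", "missing": missing}

    balance = inputs["current_savings"]
    monthly_rate = inputs["annual_return"] / 12
    for _ in range(inputs["years"] * 12):
        # Each month: apply growth, then add the contribution.
        balance = balance * (1 + monthly_rate) + inputs["monthly_contribution"]
    return {"status": "ok", "value": round(balance, 2)}


# Starting balance not provided: the function asks rather than assuming one.
incomplete = project_balance(
    {"monthly_contribution": 100.0, "annual_return": 0.05, "years": 10}
)
print(incomplete)

complete = project_balance(
    {"current_savings": 1000.0, "monthly_contribution": 100.0,
     "annual_return": 0.0, "years": 1}
)
print(complete)
```

The first call returns a list of missing fields, which the conversational layer can turn into a question for the user.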
How to tell if a tool you're using has this property
The test is simple: can you see the formula? Not a post-hoc explanation of how the answer was derived — the model can generate plausible explanations for any number, whether it calculated it or not. The test is whether the interface shows you the actual formula used, with the actual input values, before or alongside the number.
If a tool gives you a number and you can click into it and see monthly_payment = (P × r × (1+r)^n) / ((1+r)^n - 1), where P is the loan principal, r is the monthly interest rate, and n is the number of monthly payments, with your specific values substituted, that's verifiable. If a tool gives you a number and offers a narrative explanation of the approach, that's the model explaining itself, which is not the same thing.
For the specific domain of personal finance — where the decisions are consequential and the error cost is high — this distinction is worth caring about.
Frequently asked questions
Does verifiable math mean the AI is slower or less convenient?
No. The computation happens fast — calculation functions run in milliseconds. The overhead of calling a computation tool rather than generating a number inline is imperceptible. What does take slightly longer is presenting the formula in an expandable format, but that happens after the answer is shown, not before.
Can the computation functions themselves have bugs?
Yes, in principle. Any software can have bugs. But calculation function bugs are categorically easier to find and fix than language model errors: they produce consistent, reproducible wrong answers that fail test cases. A language model error that produces a plausible-sounding but incorrect figure is much harder to detect because it doesn't fail consistently — it varies with phrasing and context.
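To make the point about test cases concrete, here is a small sketch of how a calculation function can be pinned down with checks that either pass or fail the same way every time. The function `compound_balance` and the specific tests are illustrative assumptions, not taken from any real product's test suite.

```python
import math


def compound_balance(principal: float, annual_rate: float, years: int,
                     periods_per_year: int = 12) -> float:
    """Balance after compounding a lump sum with no further contributions."""
    periods = years * periods_per_year
    return principal * (1 + annual_rate / periods_per_year) ** periods


def test_zero_rate_leaves_balance_unchanged():
    assert compound_balance(1000.0, 0.0, 10) == 1000.0


def test_one_year_of_annual_compounding():
    # 1,000 at 5% compounded once is 1,050 (up to floating-point rounding).
    assert math.isclose(compound_balance(1000.0, 0.05, 1, periods_per_year=1), 1050.0)


def test_same_inputs_give_same_output():
    # Determinism: repeated calls cannot drift with phrasing or context.
    assert compound_balance(2500.0, 0.04, 7) == compound_balance(2500.0, 0.04, 7)


checks = (
    test_zero_rate_leaves_balance_unchanged,
    test_one_year_of_annual_compounding,
    test_same_inputs_give_same_output,
)
for check in checks:
    check()
print("all checks passed")
```

A bug in `compound_balance` would make one of these fail on every run, which is what makes it findable.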
If the AI uses real formulas, why would I need it at all? Couldn't I just use a spreadsheet?
The language model adds things a spreadsheet can't easily do: it understands your question without requiring you to know which formula to use, it handles data that's messy or unstructured (like a bank statement CSV), and it explains results in plain English. The value is the combination of natural language understanding with rigorous computation — not one or the other.
What if I don't want to verify the math? Can I just trust the answer?
Yes. The verifiability is an option, not a requirement. For most people, knowing that the option to verify exists is enough to create reasonable confidence in the numbers. You don't check the formula in a spreadsheet every time either — but you could. That capability is what gives you the right kind of trust.
Is verifiable math a regulatory requirement for financial AI?
Not currently in most jurisdictions, though financial regulators are increasingly attentive to AI use in finance. Verifiability aligns with principles that regulators in various markets are moving toward — explainability, traceability, and the ability to audit how recommendations were made. Tools built on verifiable math are better positioned for a regulatory environment that's moving in this direction.
Does verifiable math work for questions where the answer is uncertain or probabilistic?
Yes, but with important honesty about the uncertainty. A retirement projection involves assumptions about future returns, inflation, and your own behaviour — and the right approach is to show a range of scenarios with the underlying assumptions stated, rather than a single confident figure. A verifiable approach handles this by making the assumptions explicit inputs that you can vary, rather than hiding them in the model's confidence.
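A projection with explicit, variable assumptions might look like the sketch below. The scenario names and return rates are placeholder values chosen purely for illustration; they are not forecasts or recommendations, and `project_range` is a hypothetical function name.

```python
from typing import Dict


def future_value(start: float, monthly: float, annual_return: float, years: int) -> float:
    """Balance after monthly compounding with a fixed monthly contribution."""
    balance = start
    monthly_rate = annual_return / 12
    for _ in range(years * 12):
        balance = balance * (1 + monthly_rate) + monthly
    return balance


# Placeholder assumptions, stated as inputs so the user can change them.
SCENARIOS: Dict[str, float] = {"pessimistic": 0.03, "central": 0.05, "optimistic": 0.07}


def project_range(start: float, monthly: float, years: int,
                  scenarios: Dict[str, float] = SCENARIOS) -> dict:
    """Return one projected balance per scenario, each labelled with its assumption."""
    return {
        name: {
            "assumed_annual_return": rate,
            "balance": round(future_value(start, monthly, rate, years), 2),
        }
        for name, rate in scenarios.items()
    }


projection = project_range(start=20_000, monthly=500, years=25)
for name, row in projection.items():
    print(name, row["assumed_annual_return"], row["balance"])
```

The output is a range of balances rather than one figure, and each balance carries the return assumption that produced it.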