This article will count 0.25 units (15 minutes) of unverifiable CPD. Remember to log these units under your membership profile.

Your client just sent across their workpapers. They look clean, well-cited, and professionally structured. There is only one problem: AI helped build them. Now you have to sign off. The question is not whether the output looks right. The question is whether you can actually stand behind it.

This is the debate that has gripped the global audit profession in 2026, sparked by two sharp opinion pieces published in Accounting Today. The two authors agree on almost nothing, but together they draw the clearest picture yet of what is at stake when AI enters the audit room.

The case against trusting AI-assisted work products

In his May 2026 article, Jim Germer drew a direct line between the 1929 crash and today's AI moment. His argument is uncomfortable but precise: the profession built its independence standard on a century of hard lessons. Self-certification failed catastrophically. So the profession invented the audit, the opinion, and the independence requirement. None of that has been applied to AI.

Germer identifies three conditions that have defined auditor independence for a century. The auditor must be independent of the entity under examination. The auditor must have unrestricted access to underlying records. And the auditor must apply a standard set by an authority outside the entity being reviewed.

He argues that AI fails all three. The same institution that builds the AI model also defines what reliable means, trains the model to meet that definition, and then evaluates whether it did. There is no external auditor. There is no GAAP equivalent. There is no materiality standard set by anyone outside the system being assessed.

The result, he says, is a closed loop. When an AI system tells you it is reliable, that is not independent verification. It is the software equivalent of management saying controls are effective and using management's own documents to prove it.

Germer goes further. He names three specific failure modes that an auditor cannot detect without independent certification of the underlying system:

  1. Omission: the system fails to surface material information that would change a professional judgment.

  2. Fabrication: the system produces citations, authority, or conclusions that do not exist.

  3. Instability: the system produces different answers under equivalent factual conditions, without disclosing that variability.

Each of these, he argues, would constitute an adverse finding if visible before reliance. The problem is that they are often not visible at all.

This connects directly to a key concern explored by the late Imran Vanker of IRBA in Accounting Weekly's earlier piece on ChatGPT and auditing, where AI was shown to misidentify auditing standards entirely during live demonstrations. The hallucination risk is not theoretical. It has already shown up in practice.

The case for trusting auditors to do their job

Scott Davis, writing in June 2026, is direct in his rebuttal. He says Germer has confused two completely separate things: the reliability of the tool, and the auditor's responsibility to test what management provides.

Davis's argument is clean and familiar. Auditors have never been required to certify the tools their clients use. When a client prepares workpapers in Excel, auditors do not require an independent audit of Excel before they can rely on the output. They test the output itself, against underlying records, using professional scepticism and established audit procedures.

AI-assisted workpapers, Davis argues, are no different. The object of the audit is management's assertions and the evidence supporting them. It is not the AI system that produced the schedule. If management uses AI to compile a debtors' reconciliation, the auditor still confirms the debtors. The AI is just a more sophisticated spreadsheet.

He also pushes back on the idea that this requires a new assurance model. If an auditor fails to test AI-assisted work properly, that is an audit failure. It is not evidence that the profession needs a separate external auditor for AI systems before normal audit procedures can function.

The Big Four (KPMG, EY, PwC, and Deloitte) are already moving this way, deploying AI assistants across audit, tax, and advisory. Their model is not AI certifying itself. It is AI doing the groundwork, with auditors testing the outputs.

What this means for you, in your practice

Both authors are right about something.

Davis is correct that the audit process already handles AI-assisted workpapers, in principle. Auditors test evidence. They do not certify software. That is a reasonable, defensible position.

But Germer is correct about a risk that Davis does not fully address: the failure modes he identifies (omission, fabrication, and instability) are especially dangerous because they are not always visible in the output. A management representation that looks right and is wrong is still a problem. So is an AI-generated schedule that looks right and is wrong.

The difference between Excel and a large language model is meaningful here. Excel does not hallucinate. It does not produce a formula result that changes depending on how you phrased the question. It does not omit a material item because its training data was incomplete. An AI model can do all of these things, and the output will still look clean, cited, and professional.

That is the practical exposure you need to manage. And it matters for clients in every sector. A construction firm's AI-assisted contract revenue schedule. A retail client's AI-compiled inventory reconciliation. A logistics operator's AI-generated cash flow workpapers. The risk is not abstract. It sits in the files you are currently reviewing.

Here is what that means in practice:

  • You do not need to certify the AI system. But you do need to go deeper on the evidence behind AI-assisted outputs than you would for manually prepared work.

  • Ask where the numbers came from. Trace back to the source data.

  • Apply more scepticism to cited references, especially legal authorities, tax rulings, or technical standards.

  • And where the AI has made a judgment call, make sure a human has made the same call independently.

The standard has not changed. The risk profile has. And your professional liability remains exactly where it has always been: with you, not the software developer.

Accounting Weekly's practical breakdown of AI tools already available to practitioners shows how tools like MindBridge and DataSnipper are being used to test rather than replace audit judgments. That is the model that works. AI as a testing tool. Not AI as a trusted preparer.

Professional scepticism applies to AI the same way it applies to management

ISA 200 defines professional scepticism as a questioning mind, an alertness to conditions that may indicate misstatement, and a critical assessment of audit evidence. That standard does not change because the workpaper was prepared by AI. What changes is where the scepticism needs to be aimed.

With manually prepared workpapers, your scepticism is focused on management's intent and competence. Could they have made an error? Could they be hiding something? With AI-assisted workpapers, those questions still apply. But you add a third one: could the AI have produced a convincing output that is factually wrong, and would I be able to tell?

The answer is that you often cannot tell just by reading the output. That is the whole problem. AI outputs are written in confident, fluent, well-structured language. They include citations. They include cross-references. They look exactly like what a competent person produces. Professional scepticism means you do not let that appearance substitute for evidence.

The standard that governs this is already in your toolkit. ISA 500 requires audit evidence to be sufficient and appropriate. Appropriateness has two components: relevance and reliability. For AI-assisted workpapers, the reliability test is what matters most. Reliability under ISA 500 is higher when evidence comes from independent external sources than from the client's own internal records. An AI system trained and operated by the client, or by a software vendor with no external oversight, is not an independent external source. It is an internal source with a very convincing output format.

That means AI-generated content sits at the lower end of the reliability hierarchy. It needs to be corroborated. It cannot stand alone.

How to check AI outputs against valid external evidence

This is the practical question that neither author in the debate fully answers. Here is a workable framework for your practice.

Step 1: Identify what the AI actually decided

AI systems do not just compile data. They make implicit decisions: what to include, how to classify it, which standard applies, what the correct treatment is. Before you can test the output, you need to identify where those decisions sit. Look for any conclusion, classification, or interpretive statement in the workpaper. Each one is a point of AI judgment that needs independent verification.

Step 2: Identify the appropriate external source for each claim

Every factual or legal claim in an AI-assisted workpaper has a correct external source. The skill is in knowing which source to go to, and going there directly rather than relying on the AI's citation.

  • Tax positions and rulings: go directly to the SARS website, the Income Tax Act, or the relevant binding general ruling. Do not accept an AI-generated summary of a ruling as sufficient. Pull the actual document and read the relevant section yourself.

  • Accounting treatment: go to the IFRS Foundation's official standards at ifrs.org, or to SAICA's issued guidance. AI systems have been shown to misstate the scope and requirements of accounting standards. Verify the standard number, the paragraph reference, and the actual requirement before relying on any AI-assisted accounting conclusion.

  • Legal and contractual matters: go to the actual contract, the relevant legislation, or the applicable court judgment. An AI citation to a case should be independently confirmed on SAFLII or the relevant law report. Cases that AI systems cite sometimes do not say what the AI claims they say.

  • Company and entity information: go to the CIPC registry directly for registration status, directors, and share structure. Do not rely on an AI-compiled entity profile.

  • Bank and third-party confirmations: these must come from the external party directly, not from an AI-compiled summary of correspondence. The confirmation process has not changed.
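The source mapping in Step 2 can be kept as a simple lookup, so that each AI claim in a workpaper is paired with the primary source to check it against. What follows is a minimal Python sketch only: the category keys, the `primary_source_for` function, and the fallback wording for unmapped categories are illustrative assumptions, not part of any standard or audit tool.

```python
# Illustrative lookup: claim category -> primary source named in Step 2.
# The category keys are hypothetical labels chosen for this sketch.
PRIMARY_SOURCES = {
    "tax": "SARS website, the Income Tax Act, or the relevant binding general ruling",
    "accounting": "IFRS Foundation standards at ifrs.org, or SAICA guidance",
    "legal": "the actual contract, the legislation, or the judgment on SAFLII",
    "entity": "CIPC registry",
    "confirmation": "confirmation obtained directly from the external party",
}


def primary_source_for(category: str) -> str:
    """Return the primary source to consult for a claim category."""
    key = category.strip().lower()
    if key not in PRIMARY_SOURCES:
        # Assumption: unmapped categories are flagged for a person, not guessed.
        return "no mapped source: needs human review"
    return PRIMARY_SOURCES[key]


if __name__ == "__main__":
    for claim in ("Tax", "entity", "valuation"):
        print(f"{claim}: {primary_source_for(claim)}")
```

The lookup only tells you where to go. Reading the source document yourself remains the actual procedure.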

Step 3: Apply the instability test before you finalise

Because AI outputs can vary under equivalent conditions, a useful additional procedure is to ask whether the conclusion would hold if the question were rephrased. You do not need to run the AI yourself to do this. Simply ask: does the conclusion in this workpaper depend on a specific interpretation that the AI chose, and is there a reasonable alternative interpretation that would change the answer? If yes, get an independent human judgment on that point before you rely on the AI's version.
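Where the client can supply the outputs of repeated runs of the same query, the comparison can be made mechanical. The Python sketch below is a hedged illustration: the `normalise` and `is_stable` functions, the normalisation rule (case and whitespace only), and the sample lease conclusions are all assumptions invented for this example. Agreement between runs shows consistency only, not correctness.

```python
def normalise(conclusion: str) -> str:
    """Collapse whitespace and case so trivial formatting differences are ignored."""
    return " ".join(conclusion.lower().split())


def is_stable(conclusions: list[str]) -> bool:
    """True only if at least two runs are supplied and all agree after normalisation."""
    if len(conclusions) < 2:
        return False  # a single run cannot demonstrate stability
    return len({normalise(c) for c in conclusions}) == 1


if __name__ == "__main__":
    # Hypothetical conclusions from three runs of the same query.
    runs = [
        "Lease is classified as a finance lease.",
        "lease is classified as a  finance lease.",
        "Lease is classified as an operating lease.",
    ]
    print(is_stable(runs[:2]))  # True: the first two differ only in formatting
    print(is_stable(runs))      # False: the third run reaches a different conclusion
```

A `False` result is the trigger for the independent human judgment described above.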

Step 4: Document your scepticism, not just your conclusion

ISQM 1 and relevant accounting and audit standards require you to document the evidence you relied on and why it is sufficient and appropriate. When AI-assisted workpapers are involved, that documentation needs to record more than your conclusion. It needs to record the specific external sources you independently verified, the specific AI claims you tested, and the basis on which you concluded the AI-assisted material was reliable enough to support your opinion. This is not extra work for its own sake. It is the protection you need if the work is ever reviewed and someone asks how you knew the AI had not fabricated a ruling or omitted a material liability.
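One way to keep that record consistent is a small structured log with one entry per AI claim tested. The Python sketch below is illustrative only: the `ClaimTest` fields and the `undocumented` helper are hypothetical names chosen for this example, not a prescribed working-paper format, and the sample entries are made up.

```python
from dataclasses import dataclass


@dataclass
class ClaimTest:
    """One AI-assisted claim and the evidence used to corroborate it."""
    claim: str            # the specific AI claim tested
    external_source: str  # the primary source independently consulted
    agrees: bool          # whether the source supports the claim
    basis: str = ""       # why the evidence is sufficient and appropriate


def undocumented(tests: list[ClaimTest]) -> list[str]:
    """Return the claims whose record lacks a source or a stated basis."""
    return [
        t.claim
        for t in tests
        if not t.external_source.strip() or not t.basis.strip()
    ]


if __name__ == "__main__":
    # Hypothetical log entries for illustration.
    log = [
        ClaimTest(
            "Standard cited for revenue treatment",
            "ifrs.org, standard text read directly",
            True,
            "Paragraph reference confirmed against the issued standard",
        ),
        ClaimTest("Entity registration status", "", True),
    ]
    print(undocumented(log))  # ['Entity registration status']
```

Any claim the helper returns is one where the file does not yet show how you knew the AI was right.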

What to do today

When you receive AI-assisted workpapers from a client, these are the minimum additional steps that protect you:

  • Trace at least two material items back to primary source documents, regardless of how clean the AI output looks.

  • Check any cited standard, ruling, or authority independently before relying on it.

  • Ask your client whether the same AI query was run more than once, and whether different outputs were compared.

  • Document your additional procedures and your conclusion on why the AI-assisted materials are reliable enough to support your opinion.

You are not being asked to audit the machine. You are being asked to do your job properly in an environment where the machine can produce convincing errors. That is a distinction worth understanding before you sign.

Join CIBA and get access to a list of great CPD offerings on cpd.myciba.org.
