
Prompting ChatGPT and Claude for GST Notices: What Works, What's Risky


CA Prateek Agarwal

General-purpose LLMs like ChatGPT and Claude have quietly become a first-draft tool in many CA offices, and GST notice work is where the temptation is strongest: a notice arrives, the reply window is short, and the model will happily produce something that reads like a reply in thirty seconds. The problem is that "reads like a reply" and "is a defensible reply" are very different things. This piece sets out what these models genuinely do well on GST notices, where they will quietly sink you, and a discipline for using them without putting a client's data or your own name at risk.

What general LLMs actually do well on GST notices

The honest answer is that ChatGPT and Claude are good at language and structure, not at law and facts. Once you accept that split, the useful uses become obvious.

  • Summarising a long notice. A ten-page ASMT-10 scrutiny notice or a multi-annexure Section 73 demand can be reduced to a clean list of "what is alleged, para by para, and what each para is asking for." The model is reading English and reorganising it — exactly its strength.
  • Structuring the reply. Given the list of allegations, an LLM produces a sound skeleton: para-wise heading, factual position, supporting documents relied on, and a prayer. This saves the blank-page time without touching any legal substance.
  • Drafting boilerplate. The covering letter, the standard reservations ("without prejudice", "the notice is bad in law to the extent…"), the index of annexures, the request for a personal hearing under natural justice — repetitive, low-risk prose that a model writes competently.
  • Explaining a provision in plain language. If you want a quick plain-English read of what a sub-rule is driving at before you go and verify it, the model is a decent tutor. Treat it as a starting explanation, never as the authority.

In short, use the LLM for the carpentry of the reply — the framing, ordering, and tone — and keep the load-bearing parts (the section numbers, the figures, the case law, the legal stand) firmly in your own hands.

Where it gets risky

This is the part most "AI for tax" enthusiasm skips, and it is the part that actually matters for a high-stakes document filed under your client's authentication.

Hallucinated sections and case citations

General LLMs do not know the CGST Act; they predict plausible text. That means they will produce a section number that looks right and is wrong, cite a Rule that was amended two years ago, or invent a tribunal decision — complete with a party name, a citation, and a holding — that does not exist. This is not a rare glitch; fabricated citations are a well-documented failure mode, and they are dangerous precisely because they are fluent. A wrong section in a reply to the department is worse than no section at all: it signals the reply was not professionally checked.

The rule is absolute: every statutory reference, every Rule, every figure, and every case the model produces must be independently verified against the bare Act, the notification, or the actual judgment before it goes anywhere near a notice. If you cannot verify a citation, delete it rather than soften it.
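
One way to make that rule harder to skip is to pull every statutory reference out of a model's draft into a checklist before reading it for substance. The following is a minimal Python sketch, not a complete citation parser: the function name `verification_checklist` and both regular expressions are assumptions made for illustration, and they only catch common phrasings such as "Section 73", "Rule 36(4)", or "A v. B". Notifications, circulars, and unusual citation styles still need a human read.

```python
import re

# Illustrative patterns only: they catch common phrasings, not every citation style.
STATUTE_REF = re.compile(r"\b([Ss]ection|[Rr]ule)\s+(\d+[A-Z]{0,2}(?:\([0-9A-Za-z]+\))*)")
# A capitalised word, then "v", "v.", "vs" or "vs.", then another capitalised word.
CASE_NAME = re.compile(r"\b[A-Z][\w.&]*\s+v(?:s)?\.?\s+[A-Z]")


def verification_checklist(draft: str) -> list[str]:
    """Return a de-duplicated list of items to check against the source text."""
    items: list[str] = []
    for kind, number in STATUTE_REF.findall(draft):
        items.append(f"{kind.capitalize()} {number}")
    for line in draft.splitlines():
        if CASE_NAME.search(line):
            # Keep the whole line so the reviewer sees the claimed holding too.
            items.append(f"Possible case citation: {line.strip()}")
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(items))


if __name__ == "__main__":
    sample = (
        "The demand under Section 73 is unsustainable. Rule 36(4) is satisfied.\n"
        "As held in ABC Traders v. State of X, Section 73 does not apply."
    )
    for item in verification_checklist(sample):
        print("[ ] " + item)
```

The sketch only lists what needs checking; it cannot tell a real citation from an invented one. Each item is ticked off only after it has been read in the bare Act, the notification, or the judgment itself.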

Confidentiality

When you paste a notice into a consumer ChatGPT or Claude chat, you are sending the client's GSTIN, name, turnover, period, and the department's specific allegations to a third-party service, outside the confidentiality you owe under the engagement, and potentially feeding a model you do not control. For a profession bound by client confidentiality, and in a country that has enacted the Digital Personal Data Protection Act, 2023, that is a real exposure, not a theoretical one. We cover the data-handling angle in detail in "Using AI Tools on Client Data Under the DPDP Act".

No liability and no accountability

If the model gets it wrong, no one is answerable but you. The LLM provider's terms disclaim everything; there is no professional standing behind the output. The reply is filed under the client's DSC or EVC and your name. The model does not carry the file, attend the hearing, or stand behind the legal position — so it cannot own any part of it.

It does not know the limitation timeline

An ASMT-10 typically carries a short response window; a DRC-01A intimation precedes the formal DRC-01 show-cause; the demand timelines under Section 73 and Section 74 differ. The model has no view of your notice's dates and will never warn you that the window closes on Friday. Tracking limitation is a professional duty the tool cannot discharge.
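
Since the model will not track the date, something else in the office has to. As a small illustration, the Python sketch below counts the days left to a reply date. It assumes the practitioner reads `reply_due` off the notice and enters it by hand; it does not compute any statutory limitation period, and the function names and the seven-day warning threshold are arbitrary choices for the example.

```python
from datetime import date
from typing import Optional


def days_left(reply_due: date, today: Optional[date] = None) -> int:
    """Days remaining until the reply date on the notice (negative if past)."""
    today = today or date.today()
    return (reply_due - today).days


def status(reply_due: date, today: Optional[date] = None, warn_within: int = 7) -> str:
    """Turn the day count into a short label for a notice register."""
    remaining = days_left(reply_due, today)
    if remaining < 0:
        return f"OVERDUE by {-remaining} day(s)"
    if remaining <= warn_within:
        return f"URGENT: {remaining} day(s) left"
    return f"{remaining} day(s) left"


if __name__ == "__main__":
    # Hypothetical dates, purely for illustration.
    print(status(date(2025, 3, 14), today=date(2025, 3, 10)))
```

Taking the due date as an input is deliberate: which window applies to a given notice is a legal determination, so it stays with the professional and the code only does the subtraction.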

A short anonymisation checklist before you paste anything

If you are going to use a public LLM at all, never paste the raw notice. Strip it to a skeleton first. Run through this before anything leaves your machine:

  1. Remove the GSTIN and PAN. Replace with [GSTIN] / [PAN].
  2. Remove the legal name and trade name. Use [the taxpayer] or [Company A].
  3. Remove the proper officer's name, the notice/DIN number, and the jurisdiction. None of it helps the draft.
  4. Generalise the numbers if you can. If you need the model to structure a turnover-gap reply, "an ITC difference of approximately ₹X lakh" works as well as the exact figure for drafting purposes.
  5. Remove names of directors, suppliers, customers, and bank details. Replace with placeholders.
  6. Keep only what the model needs to draft: the type of allegation (e.g. "GSTR-3B vs GSTR-2B ITC mismatch"), the period in generic terms, and the structure you want.

A useful test: read your anonymised prompt as if it were a stranger's. If you could identify the client from it, it is not anonymised yet.
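
Parts of the checklist can be automated as a first pass. The Python sketch below replaces GSTIN and PAN patterns with placeholders, swaps a list of names you supply, and then reports anything identifiable that is still present. It is a minimal sketch under stated assumptions: the patterns check only the general shape of a GSTIN (15 characters) and a PAN (10 characters), not the GSTIN check character; `redact` and `residual_identifiers` are hypothetical helper names; and the sample identifiers are made up. It will not catch officer names, DINs, addresses, or bank details unless you add them to the names list, so the read-it-as-a-stranger test still applies.

```python
import re

# GSTIN shape: 2-digit state code + 10-character PAN + entity code + 'Z' + check character.
GSTIN_RE = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b")
# PAN shape: 5 letters + 4 digits + 1 letter.
PAN_RE = re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b")


def redact(text: str, names: dict[str, str]) -> str:
    """Replace GSTINs, PANs, and the supplied names with placeholders."""
    text = GSTIN_RE.sub("[GSTIN]", text)
    text = PAN_RE.sub("[PAN]", text)
    for real, placeholder in names.items():
        # A function replacement avoids backslash handling in the placeholder.
        text = re.sub(re.escape(real), lambda _m: placeholder, text, flags=re.IGNORECASE)
    return text


def residual_identifiers(text: str, names: dict[str, str]) -> list[str]:
    """List anything identifiable that is still present in the text."""
    found = GSTIN_RE.findall(text) + PAN_RE.findall(text)
    found += [name for name in names if name.lower() in text.lower()]
    return found


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    raw = "Notice to Acme Widgets Pvt Ltd, GSTIN 27ABCDE1234F1Z5, PAN ABCDE1234F."
    names = {"Acme Widgets Pvt Ltd": "[the taxpayer]"}
    clean = redact(raw, names)
    print(clean)
    print(residual_identifiers(clean, names))
```

Running `residual_identifiers` on the final prompt, not just on the notice, gives a last check before anything is pasted. An empty list means only that the known patterns are gone, not that the text is anonymous.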

Example prompt patterns

These are written with placeholders on purpose — copy the shape, not the data.

Summarising a notice:

"Below is an anonymised GST scrutiny notice. List, para by para, exactly what is being alleged and what document or explanation each para asks the taxpayer to furnish. Do not add any legal analysis. [paste anonymised notice]"

Building a reply skeleton:

"I am replying to a GST scrutiny notice alleging a difference between ITC claimed in GSTR-3B and ITC available in GSTR-2B for [period]. Draft a para-wise reply structure only — headings for factual position, reconciliation, documents relied upon, and prayer. Leave the legal grounds and section references blank for me to fill. Do not cite any sections or case law."

Note the explicit instruction not to cite — this is the single most useful habit. It turns the model into a structuring tool and shuts down its most dangerous reflex.

Drafting boilerplate:

"Draft a standard covering letter and a request for a personal hearing for a reply to a GST show-cause notice. Neutral, professional tone. No section numbers, no figures."

Plain-language check (verify afterwards):

"In plain English, what is the GSTR-3B vs GSTR-2B ITC matching requirement trying to achieve? I will verify the legal position separately."

The discipline running through all of these: the model frames and phrases; you supply and verify every section, figure, and authority.
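
If these prompts are reused across an office, the no-citation instruction is easy to drop by accident. One option is to keep the wording in a small template so the instruction is appended every time. The Python sketch below mirrors the reply-skeleton prompt above; the names `NO_CITE_GUARD` and `skeleton_prompt` are hypothetical, and the arguments are expected to be placeholders or generic descriptions, never client data.

```python
# Appended to every prompt so the instruction cannot be forgotten.
NO_CITE_GUARD = "Do not cite any sections, rules, notifications, or case law."


def skeleton_prompt(allegation: str, period: str) -> str:
    """Build the reply-skeleton prompt from anonymised inputs."""
    return (
        f"I am replying to a GST scrutiny notice alleging {allegation} for {period}. "
        "Draft a para-wise reply structure only: headings for factual position, "
        "reconciliation, documents relied upon, and prayer. "
        "Leave the legal grounds and section references blank for me to fill. "
        + NO_CITE_GUARD
    )


if __name__ == "__main__":
    print(
        skeleton_prompt(
            allegation="a difference between ITC claimed in GSTR-3B and ITC available in GSTR-2B",
            period="[period]",
        )
    )
```

The template fixes only the wording of the request. Whatever the model returns still goes through the same verification as any other draft.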

When to switch to a domain tool

The moment a notice involves real client data, real citations, or real exposure — which is to say, almost always — a general LLM is the wrong tool and a domain tool trained on Indian tax law is the right one. The difference is not intelligence; it is grounding and data handling. Tools built for this purpose answer from Indian statute and case law and cite their sources, and they are built to handle client data under proper terms rather than a consumer chat window.

  • Vaive is an AI co-pilot aimed at GST litigation, research, drafting, and client management — built around the notice-and-reply workflow rather than general chat.
  • TaxBotGPT is an AI tax assistant trained on Indian tax law that gives cited answers and assists with notice drafting, so the citations have a traceable basis.
  • VIDUR is an AI assistant for Indian tax, corporate, and regulatory law research and drafting.
  • Taxmann.ai does AI legal and tax research and drafting backed by Taxmann's own content, which is about as well grounded as citations get in India.

Even with these, the verification duty does not vanish; a cited answer is easier to check than an invented one, but you still check. What changes is that you are no longer asking a model that has never read the CGST Act to remember it. For the broader picture of where these tools fit in a GST practice, see "How AI Is Changing GST Compliance for Indian CAs", and for a clear-eyed list of the specific things general models get wrong on Indian tax, read "What AI Gets Wrong About Indian Tax".

Where the human stays in the loop, always

  • The legal position is yours. Whether to contest, concede, or seek time, and on what grounds, is professional judgement no model makes.
  • Every citation is verified by you. Section, Rule, notification, and case — checked against the source, not the model's memory.
  • The limitation timeline is tracked by you. The window on the notice in front of you is the one fact the model does not have and will not flag.
  • The filing is authorised by you. Under the client's DSC or EVC and your name. That signature is the line the tool cannot cross.

Frequently asked questions

Can I just paste the GST notice into ChatGPT or Claude to get a reply?

You can, but you should not, at least not the raw notice. It contains the client's GSTIN, name, period, and the department's allegations, all of which then sit with a third-party service. That falls outside your engagement confidentiality and may raise issues under the DPDP Act. Anonymise it first using the checklist above, and even then use the model only to structure and phrase, not to supply law or figures.

Will the section numbers and case law ChatGPT gives me be correct?

Not reliably. General LLMs predict plausible text, so they routinely produce wrong section numbers and entirely fabricated case citations that read convincingly. Treat every statutory reference and every case the model gives you as unverified until you have checked it against the bare Act, the notification, or the actual judgment. A wrong citation in a reply is worse than none.

Are domain tools like Vaive or TaxBotGPT safer than ChatGPT for notice work?

For client work, yes, on two counts. They answer from Indian tax law and cite sources, so their citations have a traceable basis rather than being invented, and they are built to handle client data under proper terms instead of through a consumer chat window. You still verify the output, but you start from a grounded answer rather than a confident guess.

Who is responsible if an AI-drafted reply is wrong?

You are. The reply is filed under the client's authentication and the CA's name, and the LLM provider's terms disclaim all liability. The model does not own the legal position, the citations, or the limitation timeline — the professional does. AI is a drafting aid; accountability does not transfer with the draft.

The takeaway

ChatGPT and Claude are genuinely useful for GST notice work as long as you are clear about which half of the job they do: they handle the carpentry — summarising, structuring, and phrasing — and they handle none of the law. The two non-negotiables are confidentiality and accuracy: never feed a consumer chat a client's identifiable data, and never trust a section number or case citation it produces until you have verified it yourself. For anything beyond a first draft on a real client matter, a domain tool trained on Indian law and built for client data is the safer instrument — browse the software directory to see the options. Whatever you draft with, the legal position, the citations, and the limitation clock stay with the Chartered Accountant.
