When an organisation adopts tools powered by models like ChatGPT or Gemini, it can feel like the software simply “reads” and “writes”. In reality, the model doesn’t process words the way humans do. It processes tokens.
Tokens aren’t a technical footnote for specialists. They’re the unit that shapes costs, response times, operational limits, and risk exposure (privacy, security, reliability). Put simply: if data is the fuel, tokens are the meter.
What a token is (and why it hits the budget)
For computers, natural language has too vast and open-ended a vocabulary to be handled as a fixed list of “words”. Large language models (LLMs) therefore break text into smaller building blocks, called tokens, which can be:
- a whole word (“school”)
- part of a word (“evalua-”, “-tion”)
- punctuation and symbols (“,”, “.”, “/”)
- frequently occurring character sequences
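The mechanics can be sketched with a toy greedy longest-match tokeniser. The vocabulary below is invented purely for illustration; real tokenisers (byte-pair encoding and similar) learn their vocabulary from large corpora.

```python
# Toy greedy longest-match tokeniser over a tiny HYPOTHETICAL vocabulary.
# Real tokenisers learn thousands of entries from data; this is a sketch.
VOCAB = {"evalu", "ation", "school", ",", ".", " ", "the", "of"}

def toy_tokenise(text: str) -> list[str]:
    """Greedily match the longest vocabulary entry; fall back to single characters."""
    tokens, i = [], 0
    while i < len(text):
        match = None
        for j in range(len(text), i, -1):   # try longest substring first
            if text[i:j] in VOCAB:
                match = text[i:j]
                break
        if match is None:                   # unknown sequence: one char per token
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens

print(toy_tokenise("evaluation of the school."))
# Whole words ("school"), word pieces ("evalu" + "ation"), and punctuation
# each become separate tokens.
```

Note how a single word can cost two or more tokens — exactly why token counts, not word counts, drive the meter.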
The economic impact
In most AI services—especially via APIs—you don’t pay “per question”. You pay per token consumed:
- input tokens (your prompt plus any documents you paste or attach)
- output tokens (the model’s generated response)
The more you feed the model, and the more you ask it to produce, the higher the bill.
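A back-of-the-envelope estimate makes the billing model concrete. The per-token prices below are hypothetical placeholders, not any provider’s actual rates; always check the current price sheet.

```python
# Back-of-the-envelope API cost estimate. Prices are HYPOTHETICAL
# placeholders; substitute your provider's actual rates.
PRICE_IN_PER_1K = 0.0030   # $ per 1,000 input tokens (assumed)
PRICE_OUT_PER_1K = 0.0060  # $ per 1,000 output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost = input tokens at the input rate + output tokens at the output rate."""
    return (input_tokens / 1000) * PRICE_IN_PER_1K \
         + (output_tokens / 1000) * PRICE_OUT_PER_1K

# A long contract (~40,000 tokens in) summarised into ~2,000 tokens out:
print(f"${estimate_cost(40_000, 2_000):.4f}")
```

Pasting the whole case file when a few pages would do multiplies the input side of this sum on every single request.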
The language gap (often overlooked)
As a rule of thumb, 1,000 tokens equal roughly 750 English words, but that ratio shifts across languages. Italian tends to fragment more: for the same informational content, Italian text can require 20–30% more tokens than English.
Practical takeaway: if your cost model was built on English examples (or on “lean” prompts), real budgets in Italian—especially in administrative, legal, and technical contexts—can easily be underestimated.
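The rule of thumb above can be turned into a rough budgeting helper. Both the 750-words-per-1,000-tokens baseline and the +25% Italian multiplier are assumptions for illustration, not measurements from any specific tokeniser.

```python
# Rough token budgeting from word counts. The baseline ratio and the
# per-language multipliers are rule-of-thumb ASSUMPTIONS, not measurements.
LANGUAGE_OVERHEAD = {"en": 1.00, "it": 1.25}  # assumed ~+25% for Italian

def estimate_tokens(word_count: int, lang: str = "en") -> int:
    base = word_count / 0.75            # ~750 English words per 1,000 tokens
    return round(base * LANGUAGE_OVERHEAD[lang])

print(estimate_tokens(750, "en"))   # ~1,000 tokens
print(estimate_tokens(750, "it"))   # ~1,250 tokens: same content, higher bill
```

The same document, translated, can quietly cost a quarter more — which is why English-calibrated budgets drift in Italian production use.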
From words to vectors: how AI “sees” documents and workflows
To understand why an LLM can be powerful—and where it can become risky—you need a clear view of the pipeline.
1) Tokenisation
A document (contract, administrative order, tender specification, regulation) is converted into a sequence of tokens, each mapped to a numeric ID.
2) Embeddings (concept mapping)
Those IDs are projected into a high-dimensional numeric space. The model doesn’t “read” letters; it operates on semantic distances.
Intuitively: “invoice” tends to sit close to “payment”, “due date”, and “supplier”—and far from “biodiversity” or “geology”.
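That intuition of “distance” is usually measured as cosine similarity between vectors. The 3-dimensional vectors below are invented for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

# Toy 3-dimensional "embeddings". Real models use hundreds or thousands of
# dimensions; these vectors are INVENTED purely to illustrate the geometry.
EMB = {
    "invoice":      [0.9, 0.1, 0.0],
    "payment":      [0.8, 0.2, 0.1],
    "biodiversity": [0.0, 0.1, 0.9],
}

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 = same direction, ~0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine(EMB["invoice"], EMB["payment"]))       # high: semantically close
print(cosine(EMB["invoice"], EMB["biodiversity"]))  # low: semantically far
```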
3) Processing (probabilistic relationships)
The model calculates relationships among tokens and context to generate the most plausible output.
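One generation step can be sketched as: score every candidate token, convert scores to probabilities with softmax, and emit the most plausible one. The scores below are invented for illustration.

```python
import math

# One step of next-token generation, sketched. The candidate scores
# ("logits") below are INVENTED; a real model computes them from context.
logits = {"payment": 3.2, "deadline": 2.1, "giraffe": -1.5}

def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Turn raw scores into a probability distribution over candidates."""
    m = max(scores.values())                          # subtract max for stability
    exps = {t: math.exp(s - m) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

probs = softmax(logits)
print(max(probs, key=probs.get))  # the most PLAUSIBLE token wins, not the "true" one
```

Nothing in this step checks facts — which is exactly where the governance risk in the next section comes from.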
Governance risk: plausible doesn’t mean true
Because the model optimises for linguistic coherence and statistical plausibility—not factual truth—it can produce hallucinations: content that looks credible but is simply made up.
In business and legal settings, that turns into tangible risk:
- “realistic” protocol numbers that don’t exist
- incorrect deadlines
- muddled legal references
- quotations that can’t be verified
Golden rule: anything that qualifies as sensitive data, decision-critical data, or official information must be subject to verification—human or automated—with traceable sources.
The context window: the AI’s desk space
Every model has a hard limit on how many tokens it can handle in a single interaction: the context window.
Think of it like desk space:
- stack too many documents on the desk and the ones at the back are no longer “in view”
- once the context overflows, the model may drop early sections or over-simplify
What this means for business and government
- You can’t always paste an entire case file and expect reliable analysis.
- Quality degrades when the context is saturated—especially with long regulatory texts.
- You need a strategy for selection, chunking, and retrieval.
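A minimal chunking strategy can be sketched as fixed-size windows with overlap. For simplicity the sketch counts whitespace-separated words as a stand-in for tokens; a real pipeline would count tokens with the model’s own tokeniser.

```python
# Naive fixed-size chunking with overlap. Whitespace "words" stand in for
# tokens here; a real pipeline would use the model's own tokeniser.
def chunk(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` words, each sharing `overlap` words
    with the previous window so no sentence is cut off without context."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

doc = "word " * 450            # pretend this is a long regulatory text
pieces = chunk(doc, size=200, overlap=20)
print(len(pieces), [len(p.split()) for p in pieces])
```

The overlap is the design choice to note: without it, a clause split across two chunks loses its meaning in both.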
This is where a key approach comes in: RAG (Retrieval-Augmented Generation). Instead of pushing everything into the prompt, RAG lets the model work on targeted excerpts pulled from an indexed archive—backed by sources you can cite and verify.
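The retrieval step can be sketched in a few lines. Real RAG systems rank indexed chunks by embedding similarity in a vector store; the word-overlap score below is a deliberately simple stand-in, and the archive is invented.

```python
# A minimal retrieval step, sketched. Real RAG ranks by embedding similarity
# in a vector index; plain word overlap is a deliberately simple stand-in.
ARCHIVE = [
    "Invoices must be paid within 30 days of the due date.",
    "The biodiversity report covers protected coastal areas.",
    "Suppliers submit invoices through the procurement portal.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

for passage in retrieve("when must invoices be paid", ARCHIVE):
    print(passage)   # only the relevant chunks enter the prompt, with a citable source
```

Only the retrieved excerpts enter the context window, so the token bill stays bounded and every answer can point back to a source.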
Privacy and security: the myth of anonymity
A common mistake is assuming that once text is turned into numbers (tokens), the data becomes “anonymous”. It doesn’t.
Tokenisation is not anonymisation. It’s just encoding.
Why it’s a real problem
- Names, emails, tax IDs, health references, or disciplinary details remain faithfully represented—and often reconstructable.
- If prompts are logged (in uncontrolled systems), you may be transferring personal or confidential data in practice.
- Even when data isn’t directly identifying, combinations can make it identifying (re-identification risk).
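A practical mitigation is a pre-flight check before any prompt leaves your perimeter. The two patterns below (email, Italian codice fiscale) are illustrative only; real data-loss-prevention tooling covers far more cases, and, as noted above, tokenisation downstream removes none of this.

```python
import re

# Pre-flight PII check before a prompt leaves your perimeter. The patterns
# (email, Italian codice fiscale) are ILLUSTRATIVE; real DLP tooling covers
# far more cases. Tokenisation downstream does NOT remove any of this.
PATTERNS = {
    "email":    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "tax_code": re.compile(r"\b[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]\b"),
}

def flag_pii(prompt: str) -> list[str]:
    """Return the names of all patterns found in the prompt (empty = clean)."""
    return [name for name, rx in PATTERNS.items() if rx.search(prompt)]

print(flag_pii("Contact mario.rossi@example.com, CF RSSMRA80A01H501U"))
print(flag_pii("Summarise the attached policy in plain language"))
```

A non-empty result should block the request or route it to a minimisation step, in line with the governance checklist below.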
What “governing” usage really means
For business and the public sector, the key question isn’t “Is AI powerful?” It’s:
What data can enter the system, where is it processed, under which safeguards, and for what purpose?
A baseline governance package—especially in regulated environments—should include:
- clear policies on allowed vs. prohibited data
- minimisation: only share what is strictly necessary
- controlled environments (enterprise tenants/services where possible)
- source traceability for critical outputs
- staff training: security isn’t only technical—it’s behavioural
Conclusion: tokens as a strategic digital KPI
Tokens are the “currency” that pays for both performance and risk in generative AI. That’s why they shouldn’t be treated as a technical detail, but as a strategic KPI:
- Cost: manage input/output; standardise prompts and templates
- Performance: avoid context saturation; optimise workflows
- Quality: reduce hallucinations with constraints, verification, and sources
- Compliance: prevent data-handling errors and misuse
If you can govern tokens, you can govern AI adoption.
