Why the Next Big Wave of Speech Innovation May Be Hyperlocal

In the fast-moving world of artificial intelligence, speech translation is often portrayed as a race already won by a handful of global titans. Meta, Google, Microsoft, Zoom, and a few others have built astonishing systems capable of turning speech from one language into another almost instantly. Around them, a constellation of specialized companies — KUDO, Interprefy, Boostlingo, you name it — are bringing these capabilities into conference rooms, online meetings, and hospitals.

It’s tempting to believe that this space is now fully occupied, that innovation is something only big players can afford. But that view misses the quiet revolution taking place elsewhere. Beneath the global headlines, a different kind of progress is unfolding in parallel: smaller in scale, closer to home, and potentially just as transformative.

From the global to the local stage

For years, developing speech translation required deep pockets, proprietary data, and teams of researchers. Today, the landscape is radically different. The underlying technologies (automatic speech recognition, neural machine translation, and speech synthesis, or the more recent end-to-end models) have become astonishingly accessible. APIs from major providers can be integrated in a weekend; open-source models can be fine-tuned with modest computing power. The barriers to entry have fallen. What now matters most is not who builds the model, but how one applies it: how it is packaged, contextualized, and made to serve a concrete human need.

And that opens the door to an infinite field of local experimentation. Let me share with you three examples from my local stage, Italy.

Innovation, one classroom at a time

Take Stefano, a project born not in Silicon Valley but in Treviso, a mid-sized city in northern Italy. Developed by Copilots Srl and supported by local institutions, Stefano is an AI-powered assistant designed to help students with disabilities or limited Italian proficiency follow their classes in real time.
Using a proprietary engine called PaideIA, the app listens to the teacher and either reformulates the lesson in simplified language or translates it into the student’s native tongue. It can even generate concept maps and visual aids on the fly. As the founders explain: “Stefano does not replace the teacher: it supports them, simplifies their work, and enhances their capabilities. It is a tool at the service of inclusive education”.

In an ordinary classroom, Stefano is performing an extraordinary act: making education accessible through live language mediation. It is being experimentally deployed at the Coletti School in Treviso, an institution with a particularly high share of multilingual students and significant linguistic and cultural diversity. The city’s mayor calls it “a national laboratory for inclusion”. It is not a global product — at least not yet — but a deeply local one, born of a specific need, implemented through collaboration between schools, associations, and technologists. It’s precisely this local grounding that gives it meaning.

From Florence, a bridge across silence

A few hundred kilometers south, in Florence, another initiative, Tradooko, is reimagining communication barriers. Winner of the Accessibility for Future 2025 Startup Competition, the company has built a device-plus-software solution that transcribes and translates spoken content in real time, helping deaf and foreign users communicate with public and private services.

Founder Niccolò Raffaelli, a veteran of industrial design and fashion, describes it as a project of “linguistic integration”, designed to make everyday interactions more inclusive in hospitals, transport hubs, and public offices. Tradooko is not trying to rival Google Translate. It is addressing the nuanced, unglamorous, but deeply human problems that global systems often overlook, and it does so at the local level.

Chiaro: where AI meets cognitive diversity

Or take Chiaro, a promising new startup founded by Gianluca Capotosto in Italy’s southern region of Abruzzo. Its mission is as simple as it is profound: to translate the world into a form each person can truly understand. Rather than forcing users to adapt to standardized modes of communication, Chiaro adapts the message itself.

The app maps the user’s cognitive style and reshapes text and visual information accordingly, without “normalizing” or correcting the user. It listens, learns, and waits for feedback. In doing so, it affirms the user’s identity rather than suppressing it. As its founders describe it, “Chiaro is a neuroaffirmative cognitive tutor that doesn’t correct you: it adapts to you, translating complexity into your native cognitive language”. Chiaro demonstrates how, even in the field of speech and language technology, innovation can emerge from reimagining what “translation” means: not just between languages, but between cognitions.

The space for local innovations

These stories hint at a deeper truth: the next frontier of speech translation may not be about bigger models, but about smaller contexts. Once the core technology becomes a commodity, value migrates to design, usability, and domain adaptation. The real breakthroughs happen where technology meets social fabric: schools, clinics, town halls, museums, local media. This can unfold on the global stage but, perhaps even more meaningfully, on the local one, where impact is immediate, personal, and deeply rooted in community needs.

A startup in Barcelona might develop a tool to facilitate translation between Castilian and Catalan during local council meetings. A cooperative in Lampedusa might deploy AI interpreters to support migrant communities arriving on the island. Each of these micro-innovations adds a new layer to what “speech translation” can mean in practice: technology tailored not for scale, but for concrete, local needs.

Local innovation succeeds because proximity breeds understanding. When you operate within a specific community, you know the needs, you see the gaps, and you receive immediate feedback. You can test, iterate, and co-develop with the very people who will use the product. This tight feedback loop allows ideas to mature quickly and technology to be translated into purpose-built applications: tools designed not for abstract markets, but for real lives. Local ecosystems, with their schools, municipalities, cooperatives, and small businesses, become living laboratories where speech translation can evolve from a technical feature into genuine social infrastructure.

A time of opportunity for young talent

This is the moment for entrepreneurs, educators, and public institutions to act. The global AI platforms will keep getting better, but the opportunity lies in making them useful, and that can also happen locally. Innovation does not always have to scale up; sometimes it can be sustainable, from a business perspective too, simply by making an impact at the local level.

What we are witnessing in places like Treviso and Florence is not a footnote to the AI revolution, but its realization within the fabric of everyday life. You don’t need to chase unicorn status, raise millions, or build the next global giant to make an impact. There’s still room for a different kind of innovation: solid, local, and grounded in real needs. And this, too, may well define the next business chapter of the AI era.