Voice is the future of apps – 9to5Mac

Like it or not, there’s no going back: apps and operating systems will continue to gravitate toward voice interactions.

It is not mandatory, but it is inevitable

But here’s the thing: none of the points I’m about to make mean you will be forced to talk to your devices against your will, nor that humanity is mindlessly heading into a future where every shared public space is inevitably filled with a cacophony of over-dependent, AI-loving morons.

The GUI isn’t going away, just like the calculator didn’t go away after Lotus 1-2-3 was released. Even today you can still buy one if you want. Some are actually quite expensive.

However, at this point, it is absolutely inevitable that both application developers and operating systems will increasingly gravitate towards voice interactions.

And there are good reasons for this, the most obvious being accessibility.

I don’t just mean users who can’t physically interact with their devices, although that in itself is more than fantastic. I’m also referring to users who aren’t as tech-savvy as you, but who have the same needs and yet struggle to navigate phones, computers, and platforms that seem to work effortlessly for everyone else.

And if your knee-jerk reaction is to see these users as lazy or anything along those general lines, I’m sorry to tell you, but you’re missing the point of the whole promise of modern computing.

Technological advancements are meant to lower the barrier to entry and help people get where they want to go, no matter how familiar they are with anything from Terminal to Safari.

In fact, most of Apple’s existence has been based on this very premise, even if its leadership seems to forget it at times.

Hello computer

All that said, here’s another big reason why a voice-first approach is inevitable: the actual underlying technology needed to make it happen is finally getting good.

Yes, every single LLM still makes stupid mistakes, and it’s likely they always will if they’re based on current transformer-based autoregressive approaches.

But companies, AI frontier labs, and even independent developers are either learning to work around these limitations or moving to completely different architectures, some of which are very promising.

There have been significant advances in voice interfaces over the past year, with tools like Wispr and Speechify seeing an increasingly steep pace of adoption.

According to Wispr Flow founder and CEO Tanay Kothari, its users eventually reach a point where voice accounts for roughly 75% of all input within the product. And among long-time users, keyboard usage drops below 5%.

And I’ll eat my hat if they don’t work on proper agent capabilities to go alongside their dictation tools. In fact, Speechify is already clearly moving in that direction.

Let’s also not forget the recent OpenClaw tsunami, warts and all, which completely blew the roof off what anyone expected autonomous agents to be able to do anytime soon. In fact, many users rely on platforms like ElevenLabs to actually speak out loud to their agents, and some of them have seen OpenClaw proactively implement the ElevenLabs API on its own.

Anyone who knows what they’re talking about will tell you how remarkable it is, again, warts and all.

Evolution on this front is accelerating

And here’s how fast things move: I started writing this article a while before OpenClaw became what it is today.

I originally wrote:

“(…) it won’t be long before apps and operating systems rely on autonomous frameworks where users just say what they want and AI takes care of the meaning, maps the steps, and performs that action in agent-ready apps on the user’s behalf.”

As it turns out, it really wasn’t long.

I also originally intended to wrap up the text by introducing things like Anthropic’s MCP and Apple’s App Intents to illustrate how the pieces fit together to enable voice interfaces. I even wanted to suggest that we would see news on this front next June, during WWDC.

Now, while I still believe that we will see more voice-centric features, APIs, and benefits in June, even the idea that they will depend on developers is starting to seem short-sighted or outdated.

I may be misremembering the details, but I believe it’s John Gruber who talks about how somewhere, maybe at Drexel University, they ended up paving a path that people cut into the grass because it was shorter than the route the architects designed.

I honestly believe that for many users, voice is the shortest route.

From speaking a request to an iPhone or Mac and getting an advanced shortcut in return, to tweaking photos, searching and editing documents, or even requesting multi-step workflows across apps, it’s becoming increasingly clear that as technology finally catches up, the interface most users find easiest to navigate is no interface at all. Or rather, the one humanity has been refining since its first grunt.

All that being said, I still hate it when people send me voicemails.
