What We Lost on the Way to Faster

On typing, voice input, and the friction that was doing something

There’s a notebook somewhere in my room from 2019. Engineering days. I used it to think through architecture decisions — not to document them, to think them. The pages are ugly. Scratch-throughs, inkblots where I held the pen too long, whole paragraphs crossed out with a single diagonal line that somehow felt more decisive than deletion.

I haven’t opened it in years. But I remember something about the experience of filling it that I haven’t been able to replicate since: a wrong idea had weight. It left a mark. You couldn’t quietly undo it — you had to physically reckon with it, cross it out, and keep going with the evidence of the mistake still visible on the page.

That’s gone now. And I didn’t notice when it left.


A friend who consults swears by SuperWhisper. Pranava Hari is building Muesli, an on-device Whisper alternative, and the LinkedIn comments are full of people who feel seen. I tried Wispr Flow (genuinely impressive onboarding, by the way) and uninstalled it twenty minutes later after a keybind bug made every key on my keyboard stop working.

So I have skin in this conversation. And I think the voice-to-text pitch, for all its polish, is solving the wrong problem.


The trajectory is easy to trace in hindsight.

Fountain pen. Typewriter. Keyboard. Swipe typing. Voice-to-text. AI-refined voice-to-text.

Each step removed a layer of friction. Each step also quietly removed a layer of commitment. The keyboard made editing effortless. Swipe typing made word selection a gesture. Voice input removed the physical act entirely. And now tools like Wispr Flow do not just transcribe what you said; they give it form. Punctuation, grammar, structure. The thought arrives tidy, before you have had to tidy it yourself.

McLuhan said the medium is the message. It usually gets invoked lazily, as a way of sounding interesting without saying anything specific. But here it is precise: each step in that trajectory does not just carry thought differently. It produces different thought. The fountain pen produces one kind of idea. The keyboard produces another. Voice-to-refined-text produces a third. The question is not which is faster. It is which produces the thought worth having.


The pitch goes like this: the bottleneck is how fast you can type. Remove that bottleneck, and thought flows freely. Iterate faster. Ship faster. Think faster.

I think the friction of typing is the thinking.

When you type, you are forced to compress. A half-formed idea does not survive the journey to the keyboard intact. It either sharpens or dies. The pause before you commit to a word is not dead time. It is selection pressure. You are choosing among framings, and the one that makes it to the screen is, on average, better than the first one that arrived.

Voice collapses that pause. You commit to the first adequate framing. The transcript arrives after the thinking is done, and now you have an editing problem layered on top of a thinking problem.

There is also the feedback loop. Typing produces a visible artifact as you go. The text on screen is a mirror; you catch yourself mid-thought and course-correct in real time. Voice externalizes thought without that loop. You are not reading as you speak. The cognitive modes are different, and I would argue the typing mode is more generative for work that involves constructing an argument rather than capturing a feeling.

Wispr Flow and its variants are not transcription tools in the traditional sense. They are not turning speech into text the way a court reporter does. They are taking your spoken ramblings and distilling them into structured prose. The words are yours. The thought is yours. But the compression, the step where you decide which framing survives and which one does not, has been partially outsourced.

That is the part I do not want to hand over too casually.


Then there is the social reality, which the voice-to-text pitch ignores entirely.

Whispering to your laptop in an open office is absurd. You are either speaking too quietly to get clean transcription, or you are that person narrating their thoughts aloud while everyone around you slowly loses their mind. And it is not just noise. It is content. Some thoughts are fine as typed text but become career events when spoken aloud on an open-plan floor in Bangalore.

My consultant friend has a legitimate use case. A lot of consulting thinking starts out oral: in rooms, in meetings, in the synthesis of things that were already said. Voice-to-text fits that workflow naturally.

But for work that is inherently textual and precise — structured argument, analytical writing, ideation that requires you to construct rather than capture — I am not convinced the friction was the problem.


I type on my phone without breaking eye contact mid-conversation. It is a minor flex I deploy with genuine satisfaction. I have multiple keyboards. The fluency took years to build, and I think it is worth something — not just instrumentally, but in itself.

This probably sounds like nostalgia dressed up as epistemics. Maybe it is. But I think there is something real underneath it about what counts as work now.

Knowledge work is already almost entirely intangible. There is no table a carpenter built, no mark of labour you can point to at the end of the day. Typing, oddly, is one of the few remaining physical signatures of that work: the sound, the posture, the visible commitment to a screen. It is not much. But it is something.

Voice removes even that. You are just talking, which is what humans do constantly, effortlessly, without it feeling like work. Speaking does not carry the same weight as writing, even when the informational content is identical. There is no artifact in the moment, no residue of effort. The act of working becomes further detached from any felt sense of doing.


None of this is Luddism. The tools are impressive. The onboarding on Wispr Flow is genuinely one of the better product experiences I have seen recently, which is its own problem, actually. Great onboarding makes you feel like a real problem is being solved before you have had time to ask whether the problem needed solving.

The notebook from 2019 had inkblots because I thought slowly. I crossed out wrong ideas because they cost something to write and cost something to abandon. The scratch-throughs are a record of actual thinking — not the polished output, but the path.

We have optimized for iteration speed. Faster drafts, cleaner output, less time between idea and form. What we have not asked clearly enough is whether the path to the idea matters, and whether the friction we have been removing was load-bearing.

My keyboard stopped working. I uninstalled the tool. And in the frustrated quiet that followed, I typed out exactly what I thought about it.

That felt like the right medium.

The prior two posts in this series — on building AI harnesses and on stress-testing ideas — were both typed. Make of that what you will.