Can you have a real conversation with your notes?
Niklas Luhmann, the prolific sociologist behind the Zettelkasten method, often described his note-taking practice as having a conversation with his system. He would write a note, and the system would “respond” by surfacing related ideas, unexpected connections, and surprising tensions.
It was a metaphor. A beautiful one, but still a metaphor.
But what if it didn’t have to be?
At the PKM Summit in Utrecht last week, I proposed a breakout session around exactly this question: what would it mean to have literal voice conversations with your personal knowledge management system? Not just speaking to transcribe, but speaking to think, and having your notes respond.
I’ll be honest: I went in with more questions than answers. And I came out with even more. But the discussion surfaced a framework that I find genuinely useful.
The four levels of voice interaction
The participants in our session quickly recognized that “voice with PKM” means very different things depending on how sophisticated the interaction is. We mapped it into four levels.
Level 1: transcription
This is where most people start. Tools like Wispr Flow, Onit, or Handy let you speak and get text back. It’s a fast, friction-free way to capture thoughts, especially useful when your hands are busy or you think faster than you type.
Useful, certainly. But this is essentially just a faster keyboard. Your PKM isn’t doing anything with your ideas yet.
Level 2: AI post-processing
Here’s where it gets interesting. You speak your thoughts, including instructions, and AI restructures them into something more useful than raw transcription.
This is what I do myself. I’ll verbally draft an idea in Tana, weaving in instructions like “turn this into bullet points” or “reorganize this around the main argument.” The result isn’t just captured speech; it’s been processed into a form my PKM can actually work with. Non-linear thinking becomes structured notes.
The key difference from level 1: you’re not just inputting; you’re collaborating.
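To make the level 2 idea concrete, here is a toy sketch of the first step such a workflow needs: separating the ideas you dictated from the instructions you wove into them. Real tools almost certainly hand this to a language model; the trigger phrases below are purely illustrative stand-ins, not how Tana or any other product actually does it.

```python
import re

# Hypothetical trigger phrases standing in for an LLM's judgment
# about which sentences are instructions rather than content.
INSTRUCTION_TRIGGERS = (
    "turn this into",
    "reorganize this",
    "summarize this",
)

def split_dictation(transcript: str) -> tuple[list[str], list[str]]:
    """Split a raw transcript into content sentences and embedded instructions."""
    # Naive sentence split on whitespace following ., ?, or !
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.?!])\s+", transcript)
                 if s.strip()]
    content, instructions = [], []
    for s in sentences:
        if any(t in s.lower() for t in INSTRUCTION_TRIGGERS):
            instructions.append(s)
        else:
            content.append(s)
    return content, instructions

content, instructions = split_dictation(
    "Spaced repetition beats cramming. It spreads recall over time. "
    "Turn this into bullet points."
)
print(content)       # the dictated ideas
print(instructions)  # the embedded commands
```

In a real pipeline, the second step would feed both lists to an AI model: apply the instructions to the content, then write the result into your notes. The hard part, and the reason this is more than a faster keyboard, is that the instructions and the thinking arrive interleaved in one stream of speech.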
Level 3: interactive voice dialogue
At this level, the AI doesn’t just process your input; it engages. It asks clarifying questions. It pushes back. It says: “You mentioned three different things here; which one is the core point?”
This is where voice starts to feel like the conversations Luhmann was describing, and where I’m currently experimenting myself. Using the voice chat feature in Tana, I’ve found that my thoughts become sharper not because I transcribed them, but because something challenged them. It’s slower than level 2, but noticeably deeper.
Level 4: the Jarvis mode
This is the aspirational level. Imagine being able to ask your PKM: “What did I decide about this project in January?” or “What sources have I read on habit formation?” and getting a coherent spoken response drawn from your own notes.
Full integration. Your PKM as a genuine thinking partner you can speak with, not just write to.
We’re not there yet. But the direction is clear.
The tension I can’t resolve
Here’s what I keep coming back to.
A few years ago, I wrote a post arguing that writing is the best medium for deep thinking. The act of writing forces you to formulate ideas clearly, to make vague thoughts concrete, to notice where your logic breaks down. The friction is the feature.
Voice removes that friction.
So is voice input actually good for thinking, or just good for capturing? Is it a shortcut that helps you move faster, or a shortcut that lets you skip the hard part?
I don’t have a clean answer. My instinct is that levels 1 and 2 are primarily about capture efficiency: getting ideas into the system faster and with less friction. Levels 3 and 4 are where real thinking might happen, but only if the AI engagement is substantive enough to replicate what writing does naturally.
The question is whether today’s tools can get there.
Where are you?
Most people working with PKM systems are operating at level 1. A smaller group has reached level 2. Levels 3 and 4 are still being figured out.
What I took away from the PKM Summit is that the ceiling is much higher than the current tools suggest. And that the interesting work isn’t just building better transcription. It’s figuring out how spoken dialogue can become a genuine mode of thinking, not just a faster mode of typing.
Luhmann’s metaphor might become literal sooner than we think.
At which level are you using voice in your PKM? And do you think voice can ever match writing as a medium for deep thought?



