Ideas for voice messages in Delta Chat [noise cancellation, speech-to-text]

I have some ideas for voice messages in Delta Chat which I think are worth mentioning.

First, removing background noise from the voice clip. Say you are on a street and want to record and send a voice clip: the street noise is no longer a problem if there is noise removal. This can be done either by recognizing the user’s voice and discarding everything else, or by recognizing the background noise and removing it.
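To make the second approach (recognize the background noise, then remove it) a bit more concrete, here is a very rough sketch in Python. It is only a simple noise gate, not real noise suppression, and it is not how Delta Chat or any sound system does it. It assumes the first few frames of the recording contain only background noise; the function name `noise_gate` and all parameters and default values are made up for illustration.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(samples, frame_size=160, noise_frames=5, margin=2.0, attenuation=0.1):
    """Turn down frames whose level is close to the estimated noise floor.

    Assumption: the first `noise_frames` frames contain no speech.
    """
    # Split the recording into fixed-size frames (the last one may be shorter).
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    if not frames:
        return []

    # Estimate the noise floor from the leading (assumed speech-free) frames.
    lead = frames[:noise_frames]
    noise_floor = sum(rms(f) for f in lead) / len(lead)
    threshold = noise_floor * margin

    out = []
    for frame in frames:
        # Keep frames that are clearly louder than the noise; attenuate the rest.
        gain = 1.0 if rms(frame) > threshold else attenuation
        out.extend(s * gain for s in frame)
    return out
```

Calling `noise_gate(samples)` on a list of float samples returns a list of the same length in which the quiet stretches are turned down. A gate like this only helps between words; it does nothing about noise underneath the speech, which is where proper noise suppression would be needed.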

Second, transcribing voice messages. Since offline speech recognition is now available, with small models (~100 MB) for several languages, it could be worth a try. Maybe send the text instead of the voice message? That would be voice typing. Or send the text as metadata of the voice message. As for voice typing, I am unsure whether it’s the job of the Delta Chat app.

As for the first idea: on Linux, noise cancellation exists as part of the sound system, such as PulseAudio or PipeWire, which means it is out of scope for DC Desktop. I’m sure the same goes for Windows. I’m unsure about Android.

Regarding the second one, I’m really unsure. But on Android, several keyboards provide voice typing. For instance, FUTO Voice Input even uses offline models. I wonder if they provide an API.

Both of these sound like they would be widely useful outside a chat client, and should be done there. It would be really nice to have such features!