I made a video call app (with 15-second ping)

Another giant leap for mankind. A webxdc app for actual video calls over actual email (albeit with a 15-second ping, though that may improve soon).

A simplistic video call app UI

This is just a prototype.


Unfortunately, as of 2023-11-16, sending audio/video won’t work on unmodified versions of Delta Chat, but this may change in the near future.

Receiving video does work on regular Delta Chat.
If you don’t want to bother modifying Delta Chat — hit me up, I’ll show you the app in action.

Modifying Delta Chat

Below are instructions on how to modify Delta Chat. The modified version is insecure, so do not launch any webxdc apps that you don’t trust on it. Tested on Delta Chat 1.40.4.

  1. Download Delta Chat Desktop.

  2. Find the DeltaChat/resources/app.asar file in the app folder.

  3. Open it as a ZIP file.

  4. Open the tsc-dist/main/deltachat/webxdc.js file inside the archive.

  5. Find the line

    const logPermissionRequest = (permission) => {

    and add a line

    return true; // ADDED BY ME

    right below it.

  6. Save the modified app.asar file.

  7. After you’re done playing around with this app, make sure to remove the line you added, or simply reinstall Delta Chat.

Running the app

  1. Launch Delta Chat.
  2. Build an .xdc file with ./create-xdc.sh, or just download it from the “Releases” section.
  3. Send the .xdc file to a chat.
  4. Wait for some other chat members to launch the app.
  5. Press “Start sending my media”, or just wait for others to send theirs.

Keep in mind that video data takes a lot of space. Make sure not to waste the storage quota on your email server. The expected bitrate in this app for audio + video is ~50 MB / hour per member and ~2 MB / hour per member for just audio.
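To get a feel for those numbers, here is a small back-of-the-envelope calculation. It assumes decimal megabytes and the 10-second recording interval described under “How it works”; `chunkStats` is a hypothetical helper, not part of the app. Under those assumptions, ~50 MB / hour works out to roughly 111 kbit/s, or about 139 kB per 10-second message, before any overhead added by email encoding.

```javascript
const BYTES_PER_MB = 1_000_000; // assumption: decimal megabytes

// Convert an hourly data volume into a bitrate and a per-message size.
function chunkStats(mbPerHour, chunkSeconds) {
  const bytesPerSecond = (mbPerHour * BYTES_PER_MB) / 3600;
  return {
    kbitPerSecond: (bytesPerSecond * 8) / 1000,
    kilobytesPerChunk: (bytesPerSecond * chunkSeconds) / 1000,
  };
}

const audioAndVideo = chunkStats(50, 10); // ~50 MB / hour, 10-second chunks
const audioOnly = chunkStats(2, 10); // ~2 MB / hour, 10-second chunks

console.log(audioAndVideo.kbitPerSecond.toFixed(1), "kbit/s");
console.log(audioAndVideo.kilobytesPerChunk.toFixed(1), "kB per message");
console.log(audioOnly.kilobytesPerChunk.toFixed(1), "kB per message (audio only)");
```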


Why?

Because it’s funny.

And it might become an actually useful video call app once:

  • the rate limit gets much better than 1 email per 10 seconds
  • webxdc apps can be granted camera permission
  • a way is found to avoid filling up email servers with audio/video data (maybe something like “ephemeral webxdc messages”)

How it works

Nope, it’s not WebRTC.

  1. Record 10 seconds of your camera stream with a MediaRecorder.
  2. Serialize the data.
  3. Send it over email (with webxdc.sendUpdate()).
  4. Repeat from step 1.

When we receive data, we deserialize it and display it using the Media Source Extensions API.
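The loop above could look roughly like the sketch below. This is not the app’s actual code: the base64 serialization, the payload fields (`sender`, `seq`, `data`), the use of a `MediaRecorder` timeslice instead of restarting the recorder, and the `mimeType` argument are all assumptions made for illustration. Only `webxdc.sendUpdate()`, `webxdc.setUpdateListener()`, `webxdc.selfAddr`, `MediaRecorder` and the Media Source Extensions calls are real APIs.

```javascript
// Encode raw bytes as base64 so they fit inside a JSON update payload.
function bytesToBase64(bytes) {
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Inverse of bytesToBase64.
function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Sender (browser/webxdc only): record the stream and send a chunk every chunkMs.
function startSending(stream, chunkMs = 10_000) {
  const recorder = new MediaRecorder(stream);
  let seq = 0; // hypothetical sequence number
  recorder.ondataavailable = async (event) => {
    const bytes = new Uint8Array(await event.data.arrayBuffer());
    const payload = { sender: window.webxdc.selfAddr, seq: seq++, data: bytesToBase64(bytes) };
    window.webxdc.sendUpdate({ payload }, "media chunk");
  };
  recorder.start(chunkMs); // fires "dataavailable" roughly every chunkMs
  return recorder;
}

// Receiver (browser/webxdc only): feed one sender's chunks into a <video> element.
// mimeType must match what the sender's MediaRecorder produced (assumption).
function startReceiving(videoElement, senderAddr, mimeType) {
  const mediaSource = new MediaSource();
  videoElement.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener("sourceopen", () => {
    const sourceBuffer = mediaSource.addSourceBuffer(mimeType);
    sourceBuffer.mode = "sequence"; // play chunks in the order they are appended
    const queue = [];
    const appendNext = () => {
      // appendBuffer() throws while a previous append is still in progress.
      if (!sourceBuffer.updating && queue.length > 0) {
        sourceBuffer.appendBuffer(queue.shift());
      }
    };
    sourceBuffer.addEventListener("updateend", appendNext);
    window.webxdc.setUpdateListener((update) => {
      if (update.payload.sender !== senderAddr) return;
      queue.push(base64ToBytes(update.payload.data));
      appendNext();
    }, 0);
  });
}
```

Base64 inflates the data by about a third, so a real implementation might prefer a more compact encoding. The sketch also handles a single remote sender, since a webxdc app has only one update listener.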

Web demo (no email involved)

If you just want to see how the app feels, without actually communicating with anyone using email, go to https://wofwca.github.io/video-call-over-email/ .



There is a related thread about voice chat over IRC:

I would recommend focusing on audio calls:

It would be interesting if the app could do noise cancellation and then, like Mumble/Murmur, detect when there is actual sound/speech and only send the stream when there is activity. During silent periods you don’t send anything, which is much better for performance. Most of the time people speak in a “push to talk” mode, so if you can detect the pauses and avoid continuously sending unnecessary silence, the app could work with fewer resources.

Video is big/heavy, while phone-call-quality voice communication could fit perfectly over email, just like sending voice messages in Delta Chat but in an easier way, by detecting when people talk.

I find this concept appealing!

My suggestion is that upon implementation, Delta Chat should continuously buffer the last second of audio locally.

Then, when the mumble detection is activated, it can send the buffer and stream the rest of the audio. This way, if there is a delay before the mumble detection identifies that it should start transmitting, any speech that occurred during that delay won’t be missed because it has been buffered.
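A minimal sketch of that pre-roll idea, assuming audio arrives as a sequence of fixed-length frames and that some detector tells us per frame whether speech is present. `PreRollBuffer`, `createGate` and the frame representation are hypothetical names for illustration, not anything Delta Chat or the app provides.

```javascript
// Keeps only the most recent maxFrames frames (e.g. about one second of audio).
class PreRollBuffer {
  constructor(maxFrames) {
    this.maxFrames = maxFrames;
    this.frames = [];
  }
  push(frame) {
    this.frames.push(frame);
    if (this.frames.length > this.maxFrames) this.frames.shift();
  }
  // Return the buffered frames, oldest first, and empty the buffer.
  drain() {
    const out = this.frames;
    this.frames = [];
    return out;
  }
}

// Returns a per-frame callback. While there is no speech, frames are only
// buffered. When speech starts, the buffered frames are sent first, so the
// words spoken before the detector reacted are not lost.
function createGate(preRollFrames, send) {
  const buffer = new PreRollBuffer(preRollFrames);
  let open = false;
  return function onFrame(frame, speechDetected) {
    if (speechDetected) {
      if (!open) {
        for (const buffered of buffer.drain()) send(buffered);
        open = true;
      }
      send(frame);
    } else {
      open = false;
      buffer.push(frame);
    }
  };
}

// Usage with string placeholders instead of real audio frames:
const sentFrames = [];
const onFrame = createGate(2, (frame) => sentFrames.push(frame));
onFrame("a", false);
onFrame("b", false);
onFrame("c", false);
onFrame("d", true); // speech starts: "b" and "c" are flushed before "d"
onFrame("e", true);
console.log(sentFrames.join(","));
```

A real implementation would probably also keep the gate open for a short while after speech stops, so that brief pauses do not cut words off.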

It’s natively possible, with getUserMedia({ audio: { noiseSuppression: { ideal: true } } }).
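For reference, the standard constraint is named `noiseSuppression` and sits under the `audio` key. A small sketch of how it could be requested and checked follows; `buildAudioConstraints` and `openMicrophone` are hypothetical helper names, and whether suppression is actually applied depends on the browser and device.

```javascript
// Build getUserMedia constraints that ask for noise suppression if available.
function buildAudioConstraints() {
  return {
    audio: { noiseSuppression: { ideal: true } },
    video: false,
  };
}

// Browser only: open the microphone and report whether suppression is active.
async function openMicrophone() {
  const stream = await navigator.mediaDevices.getUserMedia(buildAudioConstraints());
  const [track] = stream.getAudioTracks();
  // getSettings() reflects what the browser actually applied.
  console.log("noise suppression active:", track.getSettings().noiseSuppression);
  return stream;
}
```

Using `ideal` rather than `exact` means the call still succeeds on devices that cannot do noise suppression.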

I don’t think there is a native way to do this, but in a different project of mine I do it with the Web Audio API.
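One common way to do this with the Web Audio API is to read the time-domain samples from an `AnalyserNode` and compare their RMS level against a threshold. The sketch below is a generic illustration under that assumption, not the code from the project mentioned above; the function names and the threshold value `0.02` are made up and would need tuning.

```javascript
// Root-mean-square level of a block of samples in the range [-1, 1].
function rms(samples) {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  for (const sample of samples) sumOfSquares += sample * sample;
  return Math.sqrt(sumOfSquares / samples.length);
}

// Very naive voice activity check: loud enough means "speech".
function isSpeech(samples, threshold = 0.02) {
  return rms(samples) >= threshold;
}

// Browser only: returns a function that samples the current microphone level.
function createSpeechDetector(stream, threshold = 0.02) {
  const audioContext = new AudioContext();
  const source = audioContext.createMediaStreamSource(stream);
  const analyser = audioContext.createAnalyser();
  analyser.fftSize = 2048;
  source.connect(analyser);
  const samples = new Float32Array(analyser.fftSize);
  return () => {
    analyser.getFloatTimeDomainData(samples);
    return isSpeech(samples, threshold);
  };
}

console.log(isSpeech(new Float32Array(128))); // all zeros, i.e. silence
console.log(isSpeech(Float32Array.from([0.5, -0.5, 0.5, -0.5])));
```

A plain level threshold also triggers on loud background noise, so it works best together with noise suppression.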

Good idea, though I think this currently is not the biggest blocker. The biggest one is the rate limit.

Your point about “don’t send silence” helped me come up with another stupid idea: with the current rate limit, we could bring the delay down from 10 seconds to, say, 2 seconds by limiting the duration of your speech to 12 seconds. That is, when you start speaking, we send your voice every 2 seconds, reaching the rate limit of 6 after ~12 seconds. Then you’d have to wait for a minute to be able to speak again.
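The arithmetic can be checked with a small simulation. It models the rate limit as a token bucket holding up to 6 messages and refilling one message every 10 seconds. That model is an assumption made for illustration, and Delta Chat’s real limiter may behave differently; `simulateBurst` is a hypothetical helper.

```javascript
// Simulate trying to send one message every sendEveryMs under a token bucket.
// Credits are counted in milliseconds to keep the arithmetic in integers:
// one message costs refillMs credits, and one credit is regained per millisecond.
function simulateBurst({
  capacity = 6, // messages allowed in a burst (assumption)
  refillMs = 10_000, // one message regained every 10 seconds (assumption)
  sendEveryMs = 2_000,
  durationMs = 30_000,
} = {}) {
  const maxCredits = capacity * refillMs;
  let credits = maxCredits;
  const sent = [];
  const blocked = [];
  for (let t = 0; t <= durationMs; t += sendEveryMs) {
    if (t > 0) credits = Math.min(maxCredits, credits + sendEveryMs);
    if (credits >= refillMs) {
      credits -= refillMs;
      sent.push(t);
    } else {
      blocked.push(t);
    }
  }
  return { sent, blocked };
}

const { sent, blocked } = simulateBurst();
console.log("sent at (s):", sent.map((t) => t / 1000).join(", "));
console.log("blocked at (s):", blocked.map((t) => t / 1000).join(", "));
```

Under this model the burst runs from 0 to 12 seconds (seven messages, because one message’s worth of credit refills during the burst), after which sending drops back to one message per 10 seconds until the speaker pauses long enough for the bucket to refill.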

I made this account just to tell you that seeing this made my day. Bravo!

Blew my mind… meaning some small part of my brain is still screaming that this is abhorrent and how is it even possible, while the rest of me is grinning like a madman because (1) what an amazing hack this is and (2) this could be another small step (hah!) to a future where everyone happily uses open, privacy-friendly communication tools.

If I had any time/energy to code outside of my regular job, I would be all over webxdc for making boardgames and collab tools.



Even my wife doesn’t give me as much love :hugs: :hugs: :hugs: (jk jk… unless??)

I’m happy that you liked it!