Strip headers to reduce message size

Expected behavior

When receiving a Delta Chat message, the client could strip unnecessary headers.
It can definitely strip the Autocrypt header from CC-to-self messages, but it could also strip Received and various spam-filter headers and keep only the From/To/In-Reply-To headers.

It could also strip Autocrypt headers from received messages, but this should probably be a config option, as it may cause problems when using multiple clients with the same account.

Actual behavior

A message consisting of a single word can take up more than 1 kB.

Yeah, it would be nice if DC only fetched the needed headers and also didn’t re-download already pre-fetched headers (which is currently the case, IIRC).

I am not mainly thinking about storage inside Delta Chat, but about storage on the mail server. Delta Chat could load the message, strip unnecessary headers, and store it again, so your mailbox doesn’t grow too much.

Delta Chat can already auto-delete emails on the mail server that are older than a given time. Maybe this feature can already help you?

I want to keep all messages, so new clients can read them and so I can read them in normal mail clients. But for “chat messages” I do not need them to still include Received and similar headers. Some mails also contain noisy spam reports detailing which rules caused how many points, and other headers that decrease the signal-to-noise ratio.

Some headers that are no longer needed after receiving the mail would be:

  • X-Spam-*
  • X-Greylist
  • possibly Received. They could be useful, but they are also a lot of overhead in many mails
  • DKIM-Signature
  • Delivered-To
  • Authentication-Results
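The list above can be turned into a small filter. This is only a sketch using Python's standard `email` package, not anything Delta Chat does today; the function name `strip_headers` and the pattern list are assumptions made up for illustration, and the `compat32` policy is chosen so that the remaining headers are written back without refolding.

```python
from email import message_from_bytes, policy
from fnmatch import fnmatchcase

# Hypothetical pattern list, taken from the headers named in this thread.
STRIP_PATTERNS = [
    "x-spam-*",
    "x-greylist",
    "received",
    "dkim-signature",
    "delivered-to",
    "authentication-results",
]


def strip_headers(raw: bytes, patterns=STRIP_PATTERNS) -> bytes:
    """Return the message with all headers matching `patterns` removed."""
    msg = message_from_bytes(raw, policy=policy.compat32)
    # Collect names first so we do not modify the message while iterating.
    names = {name.lower() for name in msg.keys()}
    for name in names:
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            del msg[name]  # removes every occurrence, case-insensitively
    return msg.as_bytes()


if __name__ == "__main__":
    raw = (
        b"Received: from a.example by b.example\n"
        b"X-Spam-Score: 5.1\n"
        b"DKIM-Signature: v=1; a=rsa-sha256\n"
        b"From: alice@example.org\n"
        b"To: bob@example.org\n"
        b"Subject: hi\n"
        b"\n"
        b"hello\n"
    )
    stripped = strip_headers(raw)
    print(len(raw), "->", len(stripped), "bytes")
```

The body is left untouched, so end-to-end encrypted or signed content should not be affected by this kind of header removal.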

Headers like References could probably be minified to contain only the message IDs needed to reconstruct threads. Gmail in particular seems to add a lot of related messages to this header.
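What "needed to reconstruct threads" means is open to debate. One possible heuristic, assumed here purely for illustration, is to keep the first message ID (the thread root) and the last few (the nearest ancestors). The function name `trim_references` and the default of three trailing IDs are hypothetical.

```python
import re


def trim_references(value: str, keep_last: int = 3) -> str:
    """Keep the thread root and the `keep_last` most recent message IDs."""
    # Message IDs are the <...> tokens; folding whitespace is ignored.
    ids = re.findall(r"<[^<>\s]+>", value)
    if len(ids) <= keep_last + 1:
        return " ".join(ids)
    return " ".join([ids[0]] + ids[-keep_last:])


if __name__ == "__main__":
    refs = "<a@x> <b@x>\n <c@x> <d@x> <e@x> <f@x>"
    print(trim_references(refs))
```

A client that threads only via In-Reply-To could trim more aggressively, while one that relies on the full ancestor chain might lose some threading information with this approach.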

An example mail I have here has 5.1 kb for 190 bytes of message, i.e., the message itself is only about 4% of the size of the mail.

The References header alone is 996 bytes. The two Google DKIM Signatures (DKIM-Signature, X-Google-DKIM-Signature) are 1226 bytes.
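Numbers like these can be gathered with a few lines of Python. The sketch below is an approximation under stated assumptions: it counts each header as `Name: value` plus a one-byte line terminator, so mails with CRLF line endings will be undercounted by one byte per header. The function name `header_sizes` is made up for this example.

```python
from collections import Counter
from email import message_from_bytes, policy


def header_sizes(raw: bytes) -> Counter:
    """Approximate bytes used per header name (lower-cased)."""
    msg = message_from_bytes(raw, policy=policy.compat32)
    sizes = Counter()
    for name, value in msg.raw_items():
        value_len = len(str(value).encode("utf-8", "surrogateescape"))
        # name + ": " + value (including folded lines) + final newline
        sizes[name.lower()] += len(name) + 2 + value_len + 1
    return sizes


if __name__ == "__main__":
    raw = b"From: a@example.org\nReferences: <1@x>\n <2@x>\n\nhi\n"
    for name, size in header_sizes(raw).most_common():
        print(f"{size:6d}  {name}")
```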

The actual message content is 9 bytes plus a full quote (stripping full quotes could be a separate option, but one needs to be careful with this).

One should also account for the sender name and mail address, In-Reply-To, and the (needed) References headers. The stripped mail (including the full quote) could probably be around 500-1000 bytes depending on how aggressively it is optimized, reducing it to roughly 10-20% of the original size.

Once the mail is through the spam filter and stored in my mailbox, I will never need the DKIM signature again. So I think DeltaChat could load the mail, strip the signature and then store the mail again.
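Since messages stored via IMAP cannot be edited in place, "store the mail again" would in practice mean appending a new copy and flagging the original as deleted. The following is a rough sketch with Python's `imaplib`, not how Delta Chat works; `rewrite_on_server` and its `transform` callback are hypothetical names, and the caller is assumed to pass an already logged-in connection.

```python
import imaplib


def rewrite_on_server(conn, mailbox, uid, transform):
    """Replace one message on the server with transform(raw_bytes).

    `conn` is assumed to be a logged-in imaplib.IMAP4 or IMAP4_SSL object.
    Returns True if a smaller copy was stored, False if nothing changed.
    """
    conn.select(mailbox)
    typ, data = conn.uid("FETCH", uid, "(FLAGS INTERNALDATE BODY.PEEK[])")
    parts = [p for p in data if isinstance(p, tuple)]
    if typ != "OK" or not parts:
        raise RuntimeError(f"could not fetch UID {uid}")
    raw = parts[0][1]
    # FLAGS and INTERNALDATE may arrive before or after the body literal.
    meta = b" ".join(p[0] if isinstance(p, tuple) else p for p in data if p)

    new_raw = transform(raw)
    if len(new_raw) >= len(raw):
        return False  # nothing gained, leave the original alone

    # The Recent flag is set by the server and cannot be passed to APPEND.
    flags = " ".join(
        f.decode() for f in imaplib.ParseFlags(meta) if f.lower() != b"\\recent"
    )
    date = imaplib.Internaldate2tuple(meta)  # keep the original arrival time

    typ, _ = conn.append(mailbox, flags, date, new_raw)
    if typ != "OK":
        raise RuntimeError("APPEND failed; original left untouched")
    # Only flag the original; expunging is left to the caller.
    conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")
    return True
```

Note that the new copy gets a new UID, so every client would see it as a new message unless it recognizes the Message-ID. It also costs one download and one upload per message.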

So I think DeltaChat could load the mail, strip the signature and then store the mail again.

What about mobile traffic quotas?

I rather think your mailserver should handle this and DC could be adjusted to not re-download/re-process the stripped mails.

Most users don’t have control over the mail server and many users have a quota on their account. And I don’t know of any IMAP server that can be set up to strip headers from mails that are stored in the Sent folder.

Assuming the 5 kB of the example message (it can easily be more, especially when many users of clients other than Delta Chat use full quotes), you can store roughly 20500 messages with a 100 MB quota. That’s not much when you want to use Delta Chat as a messenger for a longer time. I would even say that the 200000 messages that fit in a large 1 GB mailbox are not much for some users.
Of course, you could use the function to delete the mails (I guess the content is kept in the Android data folder?), but then new clients will not have access to old messages.
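The quota estimate above is simple integer division. As a quick check, assuming binary units (1 MB = 1024 × 1024 bytes) and, as a further assumption, a stripped message of about 1 kB for comparison:

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3


def messages_per_quota(quota_bytes: int, message_bytes: int) -> int:
    """How many messages of a given size fit into a mailbox quota."""
    return quota_bytes // message_bytes


if __name__ == "__main__":
    print(messages_per_quota(100 * MIB, 5 * KIB))  # 20480
    print(messages_per_quota(1 * GIB, 5 * KIB))    # 209715
    # Hypothetical stripped size of 1 kB per message:
    print(messages_per_quota(100 * MIB, 1 * KIB))  # 102400
```

Under these assumptions, stripping would let about five times as many messages fit into the same quota.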

If the other clients are not Delta Chat, they cannot access the old messages; for Delta Chat you can use export/import backup.

Yes, the messages are stored locally on the device; not much is synced back to IMAP (except message deletion and soon also whether a message was read).

Again, what about traffic quotas? Many people still have those; unlimited mobile data is not yet available and cheap everywhere.

I guess that’s the point. If you think of email and IMAP as a transport medium for Delta Chat, everything is fine. But when you want to use Delta Chat more like a mail client, it’s worth caring about the message size on disk. And especially when you mix Delta Chat and other mail clients, you cannot rely on exporting the Delta Chat database in a format only Delta Chat is able to read, so you want to have the messages on the server. And then it matters whether a message is 12k or just 4k (both still for under 1k of actual payload).