I’m an enthusiastic DC user, but I have a problem with late arriving notifications/messages. I already read all the other topics with similar problem descriptions to figure out my problem but couldn’t find a solution. So I’m hoping someone can point me in the right direction.
DeltaChat: 1.42.4
Device: Galaxy A50 with all Google accounts and services removed. Additionally I have TrackerControl filtering the connections. DC is monitored but no connections are blocked.
Account: I have my mail-account hosted at Netcup.
Expected behavior: A notification is shown as soon as Idle gets a push.
Actual behavior: I get a late notification for some of the messages. Not all but some of them. Around 60-70 % of the messages arrive immediately. The other messages sometimes a few minutes late or up to no notification at all. As soon as I open DC all missing messages arrive.
Steps to reproduce: I leave the phone locked and check if notifications arrive. If I open DC later, I see “Verbinden …” in the top bar and the missing messages arrive and the notifications are shown.
I tried the following:
In DC “Zuverlässige Hintergrundverbindung” is activated.
I deactivated all battery optimization and power safing feature according to: https://dontkillmyapp.com/. Android says that DC is running in foreground all the time. No background usage is displayed in the setting.
I did set up a NodeRed sending me new messages every 30 minutes to maybe narrow down the reasons. But it happens occasionally. At home while connected to the wifi. Out of the house connected to the cell. And it’s also not tied to a network change.
I do have a debug log but can’t upload it as I’m a new user. In the debug log the message, which should arrive at 09:30 and 10:00, both arrive at 10:29:22 because i opened DC at this time.
It would be great if someone could have a look at my issue.
I think I might have found the problem. I deactivated the Captive Portal Check on my phone a long time ago. As soon as I discovered, that the DeltaChat messages came in late, I turned the Captive Portal Check back on. But it didn’t change anything. On the weekend my phone turned itself off because of low battery. Since then every single message arrived on point. The longest delay was 15 seconds. Seems like a reboot needs to be done after the Captive Portal Check is re-enabled.
Okay I was too fast. Just now it happened again. I was connected to the same cell signal all the time. It can be a really poor connection in this area but other messengers still work. From the last successfully arriving message to the next incoming message there were almost 2 hours with 3 missed messages in between. I did get the messages in one block after the two hours with the following lines in the log in between. Nothing like a broken pipe or interrupted idle or so. Seems like it is either not getting the imap push or not seeing the poor/broken connection.
Many times:
NetworkCallback.onLinkPropertiesChanged()
NetworkCallback.onCapabilitiesChanged()
FetchWorker.doWork() started
FetchWorker.doWork() will return
One time in between:
Recently seen loop waiting for 24h 0m 0s or interrupt
So while the phone is not used it received a lot of NetworkCallback.onLinkPropertiesChanged() and NetworkCallback.onCapabilitiesChanged(), but no NetworkCallback.onAvailable()?
If this is the case we should probably call maybeNetwork when onLinkPropertiesChanged is received to probe the IMAP connection:
There is also a long-standing issue that time is frozen on Android during the doze state, so we cannot easily detect that nothing happens for a long time:
Would indeed be helpful if you send the whole log to delta@merlinux.eu
That’s correct. In the new log there is not a single “onAvailable()” but a lot of “onLinkPropertiesChanged()” and a few “onCapabilitiesChanged()”.
I just saw that delta@merlinux.eu rejected my mail with the first log file. I just sent the new log file compressed in a zip folder to delta@merlinux.eu. Especially the part between line 1563 (last successfully received message) and 2401 (all the messages in between arrive) seems interesting.
Android code seems to be working correctly. The log also also contains some Connected #N messages, this results in maybe_network calls:
I do not think we should call maybe_network in more places, if connectivity was not really changed (disconnecting/reconnecting from Wi-Fi etc.) there is no reason to trigger reconnection.
In the log around line 2401 IDLE timed out:
12-05 15:18:46.599 9674 9707 I DeltaChat: src/imap.rs:447: Dropping an IMAP connection.
12-05 15:18:46.600 9674 9707 W DeltaChat: src/scheduler.rs:644: idle: INBOX: IMAP IDLE protocol timed out: deadline has elapsed
12-05 15:18:46.600 9674 9707 I DeltaChat: src/imap.rs:306: Connecting to IMAP server
This means connection was lost and 23-minute timeout was triggered.
The log is for core deltachat_core_version=v1.131.7. In core 1.132.1 this timeout was reworked from a dumb 23-minute timeout to a 5-minute timeout renewed every time there is some activity:
This will ensure lost connection is detected no later than 5 minutes of no activity on the network.
There is no Android release with core 1.132.1 yet. You can try to use nightly version side-by-side with the release version by importing a backup into it if you want to test, but otherwise just wait for a new release.
Thank you for advancing the idle timeout. That seems like a perfect fix to me. A message delay of 5 minutes in areas with bad cellular connection seems totally fine. Especially as this should not increase the battery usage. Btw: DC’s battery usage is way less than compared to other messengers on my phone.
I tried to understand the issue a little more by reading through the github repo. The second problem behind this is the not advancing Instant time when the cpu is in deep sleep mode. Otherwise the idle timeout would have been triggered latest after 23 minutes and the messages should have pulled? So #5108 might be the second part of the solution?
The basic decision from Go to add a second set of timer for non CPU-depending timeouts is clear. Also that DC depends on tokio for asynchron tasks is clear to me now.
Please correct me if I’m wrong: As tokio is a rust platform, rust first needs to implement the second set of timer with CLOCK_BOOTTIME which then enables tokio to implement the timer and finally DC could make use of this to avoid missing imap timeouts?
Seems like a final fix to the imap timeout is way in the future if I read through the provided issues .
@link2xt: Thank you for pushing this foreward on all possible points.
There is no decision and no agreed on proposal. Ideally Rust standard library should get a new kind of clock in addition to std::time::Instant and std::time::SystemTime (say std::time::BootTime), then tokio can implement new kind of timeouts on top of this. But it is already possible to take boot-time crate and implement new timeouts, even also as a separate crate, then use it in async-imap.
I hope this gets more visibility though and new timers are added straight into tokio as there is also tokio-io-timeout crate in our stack and it affects a lot of other users such as Rust QUIC implementation quinn which implements congestion control, pacing etc. algorithms which depend on timers that should not be suspended when Android screen is off.
Today, I had the chance to test the nightly version with the new idle timeout compared to the current f-droid (1.42.6) version. Two out of 20 messages were delayed: