Hi, I’ve just come across this WebXDC idea through its implementation in Monocles Chat (which does not appear to be working at the moment, but that’s another story).
I am trying to find a description of WebXDC’s threat model and security design considerations, but so far no luck. Would be very grateful if someone could point me to whatever information might be available on those topics.
AFAIK nobody has written a full “whitepaper” about that yet.
Webxdc is really simple in essence:
html5 web apps inside of a zip file
no internet connection - to prevent tracking and exfiltration of data
messenger / host implements the webxdc spec, which means it provides a channel of status update messages over which the webxdc apps can talk with the other devices that have the app open
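To make the status update channel a bit more concrete, here is a minimal sketch of how a host could fan updates out to every app instance. `FakeChat`, `StatusUpdate` and `XdcApi` are hypothetical names made up for this example. The `sendUpdate` / `setUpdateListener` pair only loosely mirrors the webxdc interface: real updates carry more fields and travel as chat messages, not through an in-memory array.

```typescript
// Hypothetical in-memory stand-in for the messenger side of webxdc.
// It only models the fan-out of updates, not transport or persistence.
type StatusUpdate<T> = { payload: T; serial: number };
type UpdateCallback<T> = (update: StatusUpdate<T>) => void;

interface XdcApi<T> {
  sendUpdate(update: { payload: T }, descr: string): void;
  setUpdateListener(cb: UpdateCallback<T>, serial?: number): void;
}

class FakeChat<T> {
  private log: StatusUpdate<T>[] = [];
  private callbacks: UpdateCallback<T>[] = [];

  // Returns the object one app instance would see as its webxdc interface.
  join(): XdcApi<T> {
    return {
      sendUpdate: (update: { payload: T }, _descr: string): void => {
        const stored: StatusUpdate<T> = {
          payload: update.payload,
          serial: this.log.length + 1,
        };
        this.log.push(stored);
        // Every instance, including the sender, receives the update.
        for (const cb of this.callbacks) cb(stored);
      },
      setUpdateListener: (cb: UpdateCallback<T>, serial: number = 0): void => {
        // Replay everything newer than `serial`, then subscribe to new updates.
        for (const u of this.log) if (u.serial > serial) cb(u);
        this.callbacks.push(cb);
      },
    };
  }
}

// Two devices with the same app open in the same chat.
const pollChat = new FakeChat<{ vote: string }>();
const aliceApp = pollChat.join();
const bobApp = pollChat.join();

const seenByBob: string[] = [];
bobApp.setUpdateListener((u) => {
  seenByBob.push(u.payload.vote);
});
aliceApp.sendUpdate({ payload: { vote: "pizza" } }, "Alice voted");
```

The app itself never opens a network connection here. Everything it can say to other devices goes through the host, which is the point of the design described above.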
Would I be correct in surmising that the threat model is roughly that of a webview (or equivalent) in the target platform, modulo whatever permissions the application author has requested + the user has granted?
For instance, it looks like WebXDC could be a great way of implementing geolocation tracking between contacts (you know the scenario: where you share your real-time location with one or more contacts or a chat room, typically for a limited time), provided that the application author allows access to the relevant WebAPI (e.g., by including the relevant permission in the manifest), and that the users sharing location do grant that permission.
Also, in the case of XMPP messengers, it could help with fast prototyping of XEPs (XMPP protocol extensions).
A further question: can you confirm that the WebXDC specification does not intend to dictate how that is achieved? E.g., I believe in Android one uses LOAD_CACHE_ONLY (not an Android dev here) but I’m not sure if similar restrictions are available in the general case.
There is no permission system yet. So the main point is that it can not exfiltrate your data over the internet.
There are ideas of adding apis to the messenger (like deltachat) to get data from it, but that will probably rather evolve into its own messenger specific extension specification.
Like a webxdc with geolocation web api would not be of much use when you don’t have the web view open (it could not share the position in the background), so this is a feature that should rather be implemented directly into the messenger to use the native apis directly.
Though there could be an extension specification that allows access to the data, so you could have an alternative map app or similar. But as I said, such a thing would probably use a messenger specific api, as all messengers have different features. Also you would really need to think hard about the security and permission model if you expose more data from the messenger to the webxdc, which could still exfiltrate it to an evil group member.
Yeah, it can help with prototyping. For example, XMPP could define some additional extensions to the webxdc spec to offer more JS apis.
Currently we also have an experiment in deltachat where we (ab-/re-)use the application status update apis to talk to our core, which gives locations for a map app instead of app updates. Though I still believe a dedicated api with a good security concept (like UI to set permissions and maybe signing by trusted devs that review the code) is needed for those Messenger Extensions (aka. I’m not a fan of reusing setUpdateListener and the normal webxdc spec for this).
Yes, that would also be hard to specify with all the different web view browser engines and limited apis you have to configure them. (though it could make sense to link the issues and mitigation methods that other implementors have used)
For example, it’s hard to turn off WebRTC in Chromium, so we abuse a connection limit: by filling it up before the webxdc starts, we ensure that the webxdc app cannot start any new WebRTC connection. In Safari/WebKit it was easier: we could just overwrite the API variable for all frames, so the webxdc cannot access it anymore.
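Roughly, the two mitigations could be sketched like this. It is an illustration under assumptions, not the actual Delta Chat code: `fillPeerConnections`, `hideWebRTC` and `FakePeer` are hypothetical names, the limit of 500 is only taken from the "fill 500" name used in this thread, and `FakePeer` stands in for the browser's `RTCPeerConnection` so the example runs outside a web view.

```typescript
// Stand-in type for the browser's RTCPeerConnection constructor.
type PeerCtor = new () => unknown;

// Chromium-style approach: hold as many peer connections as the engine
// allows, so an app loaded afterwards cannot open another one.
function fillPeerConnections(Ctor: PeerCtor, attempts: number): unknown[] {
  const held: unknown[] = [];
  for (let i = 0; i < attempts; i++) {
    try {
      held.push(new Ctor());
    } catch {
      break; // the engine refused: the limit is reached
    }
  }
  // The caller must keep these references alive while the app runs,
  // otherwise garbage collection could free slots again.
  return held;
}

// WebKit-style approach: shadow the API so page scripts no longer find it.
// The list of property names is an assumption made for this example.
function hideWebRTC(scope: Record<string, unknown>): void {
  for (const api of ["RTCPeerConnection", "webkitRTCPeerConnection"]) {
    Object.defineProperty(scope, api, {
      value: undefined,
      writable: false,
      configurable: false,
    });
  }
}

// Fake engine with an assumed limit of 500 live connections.
let livePeers = 0;
class FakePeer {
  constructor() {
    if (livePeers >= 500) throw new Error("too many peer connections");
    livePeers++;
  }
}

const heldPeers = fillPeerConnections(FakePeer, 1000);

let appCanConnect = true;
try {
  new FakePeer(); // what the webxdc app would attempt later
} catch {
  appCanConnect = false;
}

const fakeWindow: Record<string, unknown> = { RTCPeerConnection: FakePeer };
hideWebRTC(fakeWindow);
```

The first approach depends on engine behaviour that the host does not control, which is why the second one is the more comfortable option wherever the web view lets you inject a script into all frames.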
After you think you have everything secured it is good practice to organise/get an independent professional security audit of your implementation.
And with Gecko it’s also trivial to disable WebRTC. I find it quite interesting how Firefox in many ways is the superior/obvious choice (compared to Chrome) to design with privacy and security in mind, but DC uses Chrome anyway (for now) because most people have it as their default device browser, so you don’t need to ship extra code.
So this makes me wonder what happens if a user actually does use Firefox or Safari as the default device browser? Will DC’s webxdc engine fail or potentially not work as expected because it is expecting/assuming to find a Chrome engine instead of a Gecko or Webkit engine?
Also, is “fill 500” tested before each release just in case Google decides to suddenly change Chrome’s WebRTC behavior in the future?
But unfortunately it’s hard to use GeckoView for webxdc:
It lacks apis we need (maybe there is some possibility with some crazy webextension dance to define a custom uri scheme, but we haven’t explored if it really works)
it would increase app size substantially
Pros for using it would be:
relatively easy to compile (compared to what I heard about chromium, haven’t compiled chromium myself though)
we could remove code and apis we don’t want or need → so we would be able to exclude webrtc and even internet connectivity completely.
we could probably also get the size down by removing code we don’t need, though for that you need to know the gecko code base well. (most compile flags to disable features just disable not exclude from build, and some of those flags result in compile errors / don’t even work → so the fork would have pretty large changes)
No, not everywhere:
android uses android system web view, which is google chrome on most androids; on some custom ROMs it is a chromium fork
desktop is currently based on electron, so it uses chromium
there are plans to make an experimental tauri version, which would use the system installed web view: webkit on Mac and linux and windows-web view (aka. chromium) under windows
iOS uses the system web view which is webkit (web engine of safari)
deltatouch (ubuntu touch client; webxdc support is in the works there) will probably use the qt web view, which is also chromium
default browser != system webview implementation, so that would not change anything.
I wouldn’t call it an “engine”, it’s just a handler method for the requests the web view does and an implementation of the window.webxdc interface (webxdc.js).
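As a rough illustration of what such a request handler does, here is a sketch that serves files from the unpacked zip and refuses everything else. All names are hypothetical (`handleXdcRequest`, `XdcResponse`, the `webxdc://app.invalid` origin), and the CSP string is an illustrative policy, not the one any messenger actually ships. Real hosts hook this into their platform's web view interception API.

```typescript
// Hypothetical response shape; real hosts use their platform's own type.
type XdcResponse = {
  status: number;
  headers: Record<string, string>;
  body: Uint8Array;
};

// Hypothetical origin; each host picks its own scheme and host name.
const APP_ORIGIN = "webxdc://app.invalid";

function handleXdcRequest(
  url: string,
  zipEntries: Map<string, Uint8Array>,
): XdcResponse {
  // Illustrative policy only: restrict loads to the app's own origin.
  const headers = { "Content-Security-Policy": "default-src 'self'" };

  // Anything that is not addressed to the app itself is refused,
  // so no request ever leaves the device.
  if (!url.startsWith(APP_ORIGIN + "/")) {
    return { status: 403, headers, body: new Uint8Array() };
  }

  // Strip the origin plus query/fragment; an empty path means index.html.
  const path =
    url.slice(APP_ORIGIN.length + 1).split(/[?#]/)[0] || "index.html";
  const entry = zipEntries.get(path);
  if (entry === undefined) {
    return { status: 404, headers, body: new Uint8Array() };
  }
  return { status: 200, headers, body: entry };
}

// A tiny fake archive with one file (the bytes are just placeholders).
const demoEntries = new Map<string, Uint8Array>([
  ["index.html", new Uint8Array([60, 104, 49, 62])],
]);

const servedPage = handleXdcRequest("webxdc://app.invalid/index.html", demoEntries);
const refusedRemote = handleXdcRequest("https://tracker.example/pixel.gif", demoEntries);
```

Note that a handler like this only covers requests that go through the web view's normal loading path, which is why WebRTC needs the separate treatment discussed in this thread.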
If your system allowed you to change the system web view, that system web view would need to be completely compatible with the android web view api, otherwise all kinds of web view based apps would break.
No, that’s probably a weakness. Also, it depends on google chrome releases, not deltachat releases.
So google could remove the webrtc connection limit, and if the users update google chrome they would get the issue back.
So the whole chrome situation is not good:
we tried to contact some browser developers to ask them to implement the webrtc- CSP directive, because it is basically a security hole in CSP. But we haven’t pursued this path further since we found the FILL500 workaround.
we tried switching to gecko view but it was too much work, so out of scope for our small team