Request for packaging a WebXDC developer tutorial and reference materials

ian · March 6, 2026, 3:48pm

Goals

Being exhaustive is not a goal, so it is acceptable to skip parts which are more advanced or not needed for contributing to and creating simple webxdc apps.

Maintenance

To ensure long-term maintainability, it ought to be built automatically from the sources, not by hand, so it could assimilate upstream updates. Hence, some kind of templating and higher-level editing instructions might be useful. For example, such instructions could mark which sections to cut out, which pages to assemble into one, and how to fix up relative links.
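
As a rough illustration of the link fix-up step, here is a minimal JavaScript sketch of how a build script might decide whether a link can point inside the bundle. Everything named here is an assumption made for the example: the origin `https://docs.example`, the page paths, and the helper `rewriteLink` do not come from any existing tool.

```javascript
// Hypothetical names throughout: ORIGIN, bundled and rewriteLink are
// illustrative only and not taken from any existing build tool.
const ORIGIN = 'https://docs.example';

// Resolve `href` as it appears on the page at `pagePath`, then decide whether
// it can become an internal link (its target is in the bundle) or must stay
// an absolute external link.
function rewriteLink(href, pagePath, bundledPaths) {
  const resolved = new URL(href, ORIGIN + pagePath);
  if (resolved.origin === ORIGIN && bundledPaths.has(resolved.pathname)) {
    // Target is bundled: keep only the path and the fragment.
    return { internal: true, target: resolved.pathname + resolved.hash };
  }
  // Target is not bundled: keep the full URL so the link still works.
  return { internal: false, target: resolved.href };
}

// Made-up list of pages that ended up in the bundle.
const bundled = new Set(['/guide/intro', '/reference/api']);

console.log(rewriteLink('../reference/api#send', '/guide/intro', bundled));
console.log(rewriteLink('missing-page', '/guide/intro', bundled));
```

Resolving against the original page URL first means relative and absolute links go through the same check, and anything left external keeps a working deep link to the upstream page.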

Features

efficient storage
read simplified tutorials, basic rendering of rich content
quick interactive quiz & exercises after each section
full text search similar to https://devdocs.io/ with ranked fuzzy search observing synonyms with context enriched from parents and weighted siblings
cheat sheets for syntax of JavaScript, HTML5, CSS3, Web API, webxdc
API reference
JavaScript console with bookmarks
rewrite those external links to internal ones whose target is included in the bundle
follow internal links
tabbed browsing between multiple document pages, JS console, notepad and source code editor
show a deep link to the respective original source page to enable providing a correction upstream
optional: also usable outside Delta Chat, rendered on a static VCS page (ideally as a single page app that can be saved as a file with a single click)
optional: include the internationalized versions of content that already exists
optional: provide all phases of creating webxdc on the device without servers: edit source, preview (limited), compress to zip, send to chat
optional: an IDE or code paste pad with token highlighting, static code analysis and syntax error checking to develop simple apps (PoC: use eval and parse the exception)
Aim for a file size less than 1MB. The whole aim is to provide a TL;DR for those who don’t want to go through the whole curriculum, so if it needs so many words to describe well, we are doing it wrong.
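
The syntax-check proof of concept from the list above could look roughly like the following sketch. Note the swapped technique: instead of `eval`, it uses the `Function` constructor, which parses the source without running it. The helper name `checkSyntax` is hypothetical. Known limits under these assumptions: the error message text is engine-specific, module syntax (`import`/`export`) is rejected because the source is parsed as a function body, and a restrictive content security policy may block the constructor entirely.

```javascript
// Hypothetical helper: checkSyntax is not part of webxdc or any library.
// new Function(source) parses the source as a function body but never calls
// it, so syntax errors surface without executing the user's code.
function checkSyntax(source) {
  try {
    new Function(source);
    return { ok: true, message: '' };
  } catch (err) {
    if (err instanceof SyntaxError) {
      // The message wording differs between JavaScript engines.
      return { ok: false, message: err.message };
    }
    throw err; // e.g. an EvalError raised by a content security policy
  }
}

console.log(checkSyntax('let x = 1;'));
console.log(checkSyntax('let x = ;'));
```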

Sources

As a proof of concept, we may prioritize not writing any new documentation at all. Rather, we should package all or parts of the most important existing documentation in the area into a single webxdc.

Assimilating non-FOSS resources is not possible, but they may inspire others to fill in the gaps, for example when phrasing our own exercises.

FOSS resources

Non-commercial copyleft resources

Distributable resources

Related tools

davidsm10 · March 7, 2026, 1:33am

For the docs you probably want something similar to this site.

ian · March 7, 2026, 11:22pm

Yes, although I dislike that they overload the MDN website to scrape the docs instead of just cloning the source repo and rendering it themselves - this is akin to what abusers are doing to the web at large now for different reasons. And the more complicated question of how to best take subsets is still open.

github.com/freeCodeCamp/devdocs

lib/docs/scrapers/mdn/html.rb

main

module Docs
  class Html < Mdn
    prepend FixInternalUrlsBehavior

    # release = '2025-09-15'
    self.name = 'HTML'
    self.base_url = 'https://developer.mozilla.org/en-US/docs/Web/HTML'
    self.links = {
      home: 'https://developer.mozilla.org/en-US/docs/Web/HTML',
      code: 'https://github.com/mdn/content/tree/main/files/en-us/web/html'
    }

    html_filters.push 'html/clean_html', 'html/entries'

    options[:root_title] = 'HTML'

    options[:replace_paths] = {
      '/Element/h1' => '/Element/Heading_Elements',
      '/Element/h2' => '/Element/Heading_Elements',
      '/Element/h3' => '/Element/Heading_Elements',

This file has been truncated.

davidsm10 · March 8, 2026, 2:34pm

I took a look at the mdn/content repository on GitHub (the official source for MDN Web Docs content), and it seems that you can fit all the HTML, CSS and JavaScript markdown content into a ~10 MB zip. This includes tutorials, guides, how-tos and some images; if you only keep the reference, you can probably lower it to 8 MB or less.

But the Web APIs alone make a ~16 MB zip. Even if you remove the APIs you know you are not going to need in webxdc apps, I don’t think it can all fit in one webxdc; you would need at least 2.

davidsm10 · March 8, 2026, 4:49pm

Btw, this may be relevant: pryzrack/xdc-md-view on GitHub ("markdown visor for Deltachat"). You replace the content in `/data` with markdown files, but it seems that MDN content either needs a build step or uses more advanced markdown, so this webxdc can’t render all of it.

ian · March 9, 2026, 11:59am

There are 14187 markdown files (without attachments) within that repository, which contain ~~893k~~ 57MB of text and compress down to ~~92kB~~ 13MB with gzip (8MB xz). That said, I don’t think we would need 99% of the files there if we aim low enough. No beginner would bother searching through even a tiny fraction of that.

We should cherry pick specific folders instead of including whole subtrees recursively, but the following seem to include the most important folders:

1.5MB gz /files/en-us/learn_web_development
1.3MB gz /files/en-us/web/javascript

If we needed some of the more useful media (such as sketches of CSS terms), we could optimize their scale and compression or recreate them in vector format (such as using HTML & CSS…), but probably only if we are in dire need of space. After closer inspection, most space is taken up by unnecessary, low-information PNG decoration.

Also note that packing all docs into a single file improves compression ratio compared to lots of tiny individual files. Additionally, we could even compress the content with xz and self-extract upon access if we had enough text.
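
The compression effect described here is easy to check at build time. Below is a minimal sketch for Node.js using its built-in `zlib` module; the sample pages are made up for the example, and it uses gzip rather than xz only because gzip ships with Node.

```javascript
const zlib = require('zlib');

// Made-up sample pages; real documentation pages share vocabulary similarly.
const pages = [];
for (let i = 0; i < 20; i++) {
  pages.push(
    `# Page ${i}\n\nThe element number ${i} is described here. ` +
    'See the reference for attributes, events and browser compatibility.\n'
  );
}

// Variant A: every page compressed on its own, sizes summed up.
const separate = pages
  .map((text) => zlib.gzipSync(text).length)
  .reduce((sum, size) => sum + size, 0);

// Variant B: all pages concatenated first, then compressed once.
const combined = zlib.gzipSync(pages.join('\n')).length;

console.log({ separate, combined });
```

Each separately compressed page pays its own header and cannot reuse strings seen in other pages, which is why the combined variant comes out smaller.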

ian · March 9, 2026, 12:26pm

At the same time, could we agree on a reduced scope, both to lessen the cognitive load on the beginners we target and to improve the cross-platform compatibility of the created code?

For example, do you see any drawback with only covering technologies already available on both Android 5 (Chrome ~37) and KaiOS (Gecko 37)? We could go even lower, as nothing stops you from implementing perfectly usable games in ES6 (ES2015) as long as you have capabilities for canvas, audio and touch.

And as an unpopular opinion, I dare say that one could also implement perfectly enjoyable games without touching canvas or webgl at all! Maybe also some simple gimmick web API such as optional vibration, but definitely not hundreds of them.

Despite not focusing on gathering guides for those or linking them prominently from the guide main page, the respective API pages could still be returned by the search feature if they only take a few dozen kilobytes of extra space anyway.
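
An optional extra such as vibration can be kept harmless on platforms that lack it through feature detection. A minimal sketch, assuming the standard `navigator.vibrate` method: the helper name `vibrateIfAvailable` is hypothetical, and the navigator object is passed in as a parameter only so that the function can be tried outside a browser.

```javascript
// Hypothetical helper. In an app it would be called with window.navigator;
// taking `nav` as a parameter keeps it testable without a browser.
function vibrateIfAvailable(nav, durationMs) {
  if (nav && typeof nav.vibrate === 'function') {
    return nav.vibrate(durationMs);
  }
  return false; // silently skip on platforms without the Vibration API
}

// Example with a stand-in object that only records the request.
const fakeNavigator = {
  requested: [],
  vibrate(ms) {
    this.requested.push(ms);
    return true;
  },
};
console.log(vibrateIfAvailable(fakeNavigator, 50));
console.log(vibrateIfAvailable({}, 50));
```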

davidsm10 · March 9, 2026, 2:44pm

Well, that’s great then, I must have compressed it with some attachments when I tested.

If it is really that small, then it may make sense to check how big the HTML is after the build step. If it is small enough (less than 5 MB), it could be used directly to avoid having to implement a markdown previewer. I could not check myself because I got errors while installing dependencies.

ian · March 12, 2026, 8:32am

I’m not unconditionally against bundling HTML instead of a more frugal markup. With good compression, the two ought not to differ by more than a few percent, since both are friendly to sliding-window based methods (e.g., `**something**` vs `<b>something</b>`).

However, I have a feeling that the “pretty” HTML they generate might be rather bloated, which would make the difference more notable. Implementing a step that both prunes the fat and ensures that the output of the individual sources looks consistent would probably be more work than implementing very simple markdown-like rendering from scratch.

Don’t worry about markup processing. I have implemented quite a few Markdown parsers & renderers. I would focus on indexing the content and letting a user read it, postponing improvements to a later date. It would already be mostly readable as a proof of concept if links and section headings were handled and line breaks added between paragraphs - taking just a few lines of code with regexp!

For a minimal viable product, a slightly altered gemini (gemtext) renderer would accomplish 95% of the value for 5% of the work in just a few pages of code, and the error cases that remain would be surprise-free. Some simple additions would be in order, such as inline links and maybe the most straightforward cases for inline markup (bold, italic, code), but that’s still doable in a few lines of code. We can skip nested lists and tables for now.
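
To make the scale of such a renderer concrete, here is a hedged sketch of a line-oriented converter in the spirit of gemtext, with the inline additions mentioned above. It is not an existing renderer: the function names are made up, nested lists and tables are ignored on purpose, and inline markup is also applied inside inline code, which a real version would need to prevent.

```javascript
// Hypothetical sketch: escapeHtml, renderInline and render are made-up names.

function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Inline rules: code, bold, italic and [text](url) links, in that order.
function renderInline(text) {
  return escapeHtml(text)
    .replace(/`([^`]+)`/g, '<code>$1</code>')
    .replace(/\*\*([^*]+)\*\*/g, '<b>$1</b>')
    .replace(/\*([^*]+)\*/g, '<i>$1</i>')
    .replace(/\[([^\]]+)\]\(([^)\s]+)\)/g, '<a href="$2">$1</a>');
}

// Block rules are decided per line, as in gemtext.
function render(source) {
  const out = [];
  let paragraph = [];
  let inCode = false;
  let inList = false;
  const flushParagraph = () => {
    if (paragraph.length > 0) {
      out.push('<p>' + renderInline(paragraph.join(' ')) + '</p>');
      paragraph = [];
    }
  };
  const closeList = () => {
    if (inList) {
      out.push('</ul>');
      inList = false;
    }
  };
  for (const line of source.split('\n')) {
    if (/^`{3}/.test(line)) {              // fence line toggles preformatted mode
      flushParagraph();
      closeList();
      out.push(inCode ? '</pre>' : '<pre>');
      inCode = !inCode;
    } else if (inCode) {                   // preformatted text is only escaped
      out.push(escapeHtml(line));
    } else if (/^#{1,3} /.test(line)) {    // headings, levels 1 to 3
      flushParagraph();
      closeList();
      const level = line.indexOf(' ');
      out.push(`<h${level}>${renderInline(line.slice(level + 1))}</h${level}>`);
    } else if (/^[*-] /.test(line)) {      // flat list items only
      flushParagraph();
      if (!inList) {
        out.push('<ul>');
        inList = true;
      }
      out.push('<li>' + renderInline(line.slice(2)) + '</li>');
    } else if (line.trim() === '') {       // blank line ends the current block
      flushParagraph();
      closeList();
    } else {                               // anything else is paragraph text
      closeList();
      paragraph.push(line.trim());
    }
  }
  flushParagraph();
  closeList();
  if (inCode) out.push('</pre>');
  return out.join('\n');
}

console.log(render('# Title\n\nSee [docs](https://example.org) and **bold**\ntext.'));
```

Because every block decision depends on a single line, malformed input degrades to plain paragraphs instead of breaking the rest of the page.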

davidsm10 · April 12, 2026, 10:28pm

They don’t even generate static HTML files; they generate heavy JSON files, and it seems they use some custom server to serve those JSON files.

davidsm10 · April 18, 2026, 4:19am

How exactly did you calculate this? I just extracted all the files that end in .md inside the html, css, javascript and api subfolders into a separate folder, keeping the same folder structure, and the result is more than 70 MB (17 MB zipped). Even if I write everything into one single file, I still get 36.5 MB (7.4 MB zipped).

ian · April 18, 2026, 8:50am

You’re right. I can’t recall from such a distant past, but I probably executed some of the stats commands in different folders. The md files under /files/en-us/learn_web_development seem to zip to 1.5MB (or 1MB xz) though, so that could still fit. Cherry picking separate folders instead of whole subtrees would probably be more efficient for the above use case (many under /files/en-us/web/javascript would also be useful).

davidsm10 · April 23, 2026, 7:14pm

The learn_web_development folder alone is not very useful. I think it would be more useful to have the web/html, web/javascript, web/css and web/api folders. I think they can all be compressed into a tar.xz file of around 4.5 MB, or around 5.8 MB if the learn_web_development folder is included. Both sizes are fine for this kind of webxdc, which you’d only download once to your saved messages.

davidsm10 · June 7, 2026, 3:38pm

I did some work related to this:

A webxdc markdown viewer:

Scripts to turn MDN markdown content into more normalized markdown, and rewrite structure and links and pack the result into a .tar.xz file:

It still needs quite some work to be usable, but you may try it. I couldn’t upload the .xdc file here: the maximum allowed size is 4 MB, and the file is 8 MB.