Microsoft training AI on GitHub code [please opt out in repo settings]

Microsoft will officially start training AI on GitHub projects this month unless developers opt out. I don’t know whether the DC maintainers are aware of this, have already opted out, or have a plan to migrate to a more trustworthy and reliable platform.

There are also other issues with GitHub which DC devs are obviously aware of:

pray the apparently increasingly vibe-coded GitHub infrastructure grants you results.

GitHub’s history of censorship and ongoing access restriction also does not align well with DC’s goals and values:


In the same spirit:


thank you for the pointer.

wrt censorship: i agree in general. however, it is also true that github is available in cuba, whereas e.g. gitlab, often praised as being more open, is not. the reason is that microsoft took the effort to get an exception to bypass the embargo. microsoft may have shady reasons for that and may act differently in other countries, sure. and one can switch to a non-us hosted service, also true.

anyways, in general, most, if not all, Delta Chat developers, including me, have been critical of github for quite some time.

however, migration comes at quite some cost: there is lots of CI involved that needs to be migrated. this is lots of work, easily weeks, until everything is fine. as we do not have additional resources for the migration, those resources would be missing from Delta Chat and chatmail development. so timing is critical here.

note that we’re already using e.g. codeberg for webxdc development today, and we already donate an amount of money to codeberg yearly.


It might also be a better translation platform, but same problem:

It would be great if there were a FOSS team dedicated to assisting with such transitions!


If you are talking about this post, this is not what it says:

There is a personal setting (not organization-wide), and it is about training on the inputs and outputs when using Copilot:

Models are trained on everything that can be found on the internet regardless of the settings.


Indeed, the original poster probably only asked the GitHub repo owners to disable the toggle shown in your screenshot.

And indeed, abusers already misuse content everywhere on the web. However, serving content with TDM reservation metadata is already considered a form of protection (definitely from a legal standpoint):


GitHub has apparently already admitted to training on all repositories:

The common opinion seems to be that if you don’t want to be funnelled into the Copilot sloppening, you should probably delete your GitHub repositories and move them elsewhere. (I’m not saying that’s what DeltaChat should do. It seems to be a difficult question.)

Deleting GitHub repositories is also unlikely to prevent your code from being added to datasets: someone will mirror your code to GitHub anyway, and datasets likely include all the source tarballs of major Linux distributions regardless of where the source code is hosted.


Sure, but perhaps it could improve your chances in multiple ways. E.g. you’re no longer agreeing to the GitHub Terms of Use with regard to your latest code.
