Today, I’ve taken the decision to join the Software Freedom Conservancy in their Give Up GitHub campaign. This is likely a fairly modest step for me, as I’m not a bigshot project leader or a major contributor on a famous package. I have made some interesting contributions in the past, notably to MAME and the Linux kernel, but as I’m not a major voice in the free and open source software (FOSS) community, I can say I’m doing this far more for me than I am for them.
The point of FOSS is to develop a commons that can be of benefit to the world. That’s everyone from industrial players to bedroom hobbyists to children just learning about technology. Most critically, the one thing that keeps this commons properly flourishing is a set of legal protections that require all parties involved to make the source code of their software available to others. This fosters the raw material for research, review, community development, and even the maintenance of a historical record.
Without a strong commitment to this principle, the FOSS commons becomes a way to profit from the labor of others while returning nothing of value. Not only is this lazy, but it ultimately means the software commons will be deprived of the re-contribution process that keeps it healthy and flourishing. In short, it prioritizes a cheap short-term gain while openly damaging the very ecosystem on which our modern technological world runs.
Microsoft, through its GitHub Copilot program, has decided to effectively plagiarize the entire FOSS commons. By training a machine learning model on wide swaths of FOSS projects, and then claiming that output from Copilot cannot be considered plagiarism because it consists of “original creations” from an algorithm “merely trained on existing code”, Microsoft has decided to abuse the work of thousands of engineers, myself included, without consent. Worse, they’ve effectively laundered our code through their Copilot system. It will take serious legal effort to produce a ruling on the status of intellectual property laundered through an AI, but morally, this should never have been done. I’ve shuttered my GitHub repos and moved them to a personal server here, and I will do what I can to avoid using GitHub in my FOSS work.
Of course, I realize that this won’t actually protect me from being scanned by machine learning systems and having my source code plagiarized, but the less data I directly put in Microsoft’s hands at this point, the better. In general, you try to avoid sleeping with the enemy.
More broadly, it’s time for the FOSS community to realize that these kinds of mega-platforms can’t be considered safe places for our work. If it looks like a valuable target for acquisition, someone will acquire it. And once they do, it’s only a matter of time before the money men find a way to exploit your data in ways they should not. This has been proven time and time again.
I only regret not understanding this sooner.