DMCA & Source Code Leaks: Modern Enterprises’ Biggest Concern

Code is everywhere. From the apps that deliver our food and stream our favorite shows to the cars we drive and the alarms that wake us every morning, every aspect of our lives in 2020 is intertwined with code.

Code has become the driving force for multi-billion dollar companies as they house more and more of their priceless intellectual property within it. As a result, code integrity now sits at the top of their security concerns. But how big a concern is it really? The threat keeps growing in scope and severity. To appreciate how much code leaks matter to companies both big and small, let’s analyze some of the DMCA requests received by GitHub in 2019.

What are DMCA requests?

The DMCA (Digital Millennium Copyright Act) is arguably one of the most important copyright laws in the United States. Among other things, it limits the liability of service providers that host user-generated content (such as GitHub), on the condition that they let copyright holders request the removal of specific content on the grounds of copyright infringement. In our context, that means GitHub gives companies a way to protect their code from being reused without permission.

How often do developers use source code from other companies?

A lot, apparently. According to GitHub, the software development platform received 1,762 takedown notices and had to indefinitely take down 14,320 projects in 2019 alone. For comparison, it received only 145 takedown notices and took down 505 projects in 2015. It’s important to note that these numbers only reflect successful notices; GitHub also receives a lot of “incomplete or insufficient notices” that are not currently tracked.
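To put those figures in perspective, here is a small sketch that works out the growth between the two years quoted above. The only inputs are the four numbers from this post; the function names are our own and purely illustrative.

```python
# Figures quoted in this post from GitHub's reported DMCA data.
notices = {2015: 145, 2019: 1762}
projects = {2015: 505, 2019: 14320}


def growth_factor(series, start, end):
    """Return how many times larger the end-year value is than the start-year value."""
    return series[end] / series[start]


def annual_growth_rate(series, start, end):
    """Return the compound annual growth rate between two years, as a fraction."""
    years = end - start
    return growth_factor(series, start, end) ** (1 / years) - 1


notice_factor = growth_factor(notices, 2015, 2019)
project_factor = growth_factor(projects, 2015, 2019)
notice_rate = annual_growth_rate(notices, 2015, 2019)

print(f"Takedown notices grew {notice_factor:.1f}x between 2015 and 2019")
print(f"Projects taken down grew {project_factor:.1f}x between 2015 and 2019")
print(f"Notices grew about {notice_rate:.0%} per year, compounded")
```

In other words, takedown notices grew roughly twelvefold in four years, and the number of affected projects grew even faster.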

Figure: Trends in DMCA takedown notices (Cycode)

What is the potential impact of code leaks?

One of the most famous code leaks occurred in 1994, when a hacker stole the MP3 codec source code from the University of Erlangen. The theft essentially initiated music piracy on the internet and changed the music industry forever. Code leaks have only become more common since.

In 2003, a California resident acquired the leaked source code of Lineage II, a multiplayer online game, and used it to bootleg the game and run it on his own servers. According to the FBI, the individual was siphoning $750K a month in potential revenue from the game’s developers.

In another incident, in late 2018, Uber paid a hefty $148M fine for failing to notify drivers that hackers had stolen their personal information. The leak was made possible by permissive configurations on Uber’s GitHub repository, which let the hackers obtain the AWS credentials they then used to steal the sensitive data.

How hard is it to spot a code leak?

One of the biggest problems with code leaks is that they can take a long time to detect. For example, Scotiabank, one of Canada’s leading banks, published code to GitHub in August 2018. Unfortunately, the bank hadn’t realized that the repository also contained highly sensitive code, including private login credentials. It wasn’t until September 2019 that a researcher notified Scotiabank of this dangerous and costly mistake. This cautionary tale highlights another key issue with leaks: in many cases, they result from weak security practices or a lack of awareness rather than from malicious intent.

And just recently, in May 2020, Mercedes-Benz accidentally exposed its source code to anyone who ran a simple Google search. The cause was overly permissive security settings in the company’s on-premises Git repository. One of the discoverers of this error shared that he “often just [hunts] for interesting GitLab instances, mostly with simple Google dorks, when I’m bored and I’m amazed by how little thought seems to go into the security settings”.

Why does this concern small and medium enterprises?

Since DMCA requests are shared publicly by GitHub, we took a quick look at the data. Throughout 2019, Amazon successfully submitted 10 DMCA requests, Apple issued 16 (5 of them in August alone) and BMO (Bank of Montreal) issued 14. On average, some of these companies issue a request every month.

Apart from these big corporations, we can see that smaller, lesser-known companies are also occupied with protecting their source code. In fact, they make up the bulk of DMCA requests: Fortune 500 companies accounted for only 3% of the 191 requests issued this January. And the requests come from all corners of the globe. InspireUI, a Vietnamese code firm, issued 5 requests. Longrise, a Chinese smart city company, issued 19. Hex-Rays, a code analysis company from Belgium, issued 2.

These numbers make it clear that companies all over the world, big and small, suffer from the misuse of proprietary source code. We can also appreciate the persistence of this issue and how it’s not a question of “will it happen to me?” but rather “how many times is it going to happen to me?”

What can I do to protect my source code from leaking?

Practicing good security hygiene is crucial in preventing these unfortunate leaks. Cycode integrates with your source code control to continuously scan your repositories and your organization’s members to find possible code leaks or mistakenly published sensitive credentials.
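To give a feel for what scanning for mistakenly published credentials involves, here is a minimal sketch in Python. It is not Cycode’s implementation; the three patterns and the `scan_text` helper are assumptions made up for illustration, and a real scanner relies on far larger rule sets plus checks that cut down on false positives.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A quoted value assigned to a name such as password, passwd, or secret.
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|secret)\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
    # The first line of a PEM-encoded private key.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return a list of (line_number, rule_name) for every suspicious line."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


# A made-up config file; the key below is a placeholder, not a real credential.
sample = "\n".join([
    'db_host = "localhost"',
    'password = "hunter2-example"',
    'aws_key = "AKIAIOSFODNN7EXAMPLE"',
])

findings = scan_text(sample)
for number, name in findings:
    print(f"line {number}: possible {name}")
```

Running a check like this before every commit, and across the full history of existing repositories, catches the kind of mistake behind the Scotiabank and Uber incidents described above before it becomes public.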
