Medical Data Leaked on GitHub

Wed, Aug 26, 2020

An epic case of shooting yourself in the foot

Apparently 200,000 patient records (PHI and PII) were leaked on GitHub…
…not because of cyber-criminals
…not because of the Russians or Chinese or some other sophisticated attacker
…rather: because of significant errors by developers and failure of their organizations to conduct any security review or audit.

“Personally Identifiable Information (PII) and Protected Health Information (PHI) that may be protected with industry-standard security while in an entity’s secure environment lose that protection when a developer exposes login credentials in public GitHub repositories. In this report, we describe nine data leak incidents that potentially impacted an estimated 150,000 - 200,000 patients, and possibly many more.”

The leaked records, sensitive sysadmin credentials, and more were found by an ethical hacker who tried to responsibly contact the affected organizations before eventually reporting the breach through a third party, with whom the researcher collaborated on an interesting-looking report, “No Need to Hack When it’s Leaking.”

“It started when Jelle Ursem, a security researcher in the Netherlands, wondered, ‘Hey — let’s see if somebody is actually stupid enough to upload medical customer data to GitHub.’ It took Ursem less than 10 minutes…”

My Initial Takeaways

  1. Yes, clearly the developers bear significant fault here. However, the real failures were organizational. Failure of the organizations to provide adequate training to the developers. Failure of leadership to establish and enforce secure development policies and procedures. Failure to conduct sufficient development reviews and code audits. Failure to effectively address these tools and processes within a risk management program, including the quantities of data, use cases, interfaces, and more. Failure, in at least one of the cases, to perform sufficient third-party risk management on a vendor.

  2. The ethical researcher was the first to report these leaks. Given the ease with which he found them, and the fact that certain nation-state actors have taken keen interest in theft of medical data on Westerners, there’s a high probability that somebody already accessed and copied the leaked data.

  3. Doubtless many other organizations have the same problems as the 9 featured in this report. In fact, that’s one of the conclusions of the report. Hopefully others who need to hear these lessons will see reflections of their own organizations in this report and start making systematic improvements to their own processes.

From the conclusion:

“There are undoubtedly many more leaks that can be found on GitHub, and we know that at least some threat actors are already using GitHub as a way to find login credentials in repositories. It is no longer sufficient to just search Google or Shodan or BinaryEdge for your firm’s data or to search for signs of your firm’s data on the dark web. You also need to search GitHub.”
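
The flip side of searching GitHub for leaks is catching credentials before they are ever pushed. As a rough illustration of what even a basic code audit could flag, here is a minimal Python sketch that scans text for credential-like strings. The rule names and the three patterns are my own assumptions for illustration, not anything taken from the report, and a purpose-built secret scanner will cover far more cases with fewer false positives.

```python
import re
from pathlib import Path

# Hypothetical, deliberately small rule set (an assumption for illustration).
# Real secret scanners ship hundreds of rules plus entropy checks.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hardcoded_password": re.compile(
        r"(?:password|passwd|pwd)\s*[:=]\s*['\"][^'\"]{4,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) for each line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def scan_tree(root):
    """Scan every readable file under root, skipping the .git directory."""
    results = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        hits = scan_text(text)
        if hits:
            results[str(path)] = hits
    return results
```

Calling `scan_tree(".")` from the root of a working copy returns a dictionary mapping file paths to the suspicious lines found in them. Something like this could run as a pre-commit check, though it only looks at the current files and will not find secrets that were committed earlier and later deleted, since those remain in the repository history.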