GitHub is an increasingly critical part of many organizations’ IT stack. It is the home of source code in the cloud, especially for open source. Built on the open-source Git version control system, GitHub is a cloud-based hosting service for Git repositories. It is home to over 300 million repositories and is used by 94 million software developers, more than six times as many users as the next-largest Git service, Bitbucket. GitHub’s focus on open-source community-building and collaboration is key to the platform’s growth. A developer’s GitHub username has become their de facto calling card, resume, portfolio, and scoreboard, following them from job to job and allowing them to keep in touch with peers and show off their accomplishments.
However, the massive proliferation of source code on the platform is making GitHub repositories a more attractive target for cyber-criminals and much harder for enterprises to manage. By the end of 2022, reports of organizations compromised by GitHub-based hacks were an everyday occurrence, affecting Okta, Dropbox, Slack, Toyota and the Department of Veterans Affairs, to name just a few that were publicly known. Open source, sharing, and collaboration are built into the DNA of the platform, and that openness creates new security risks.
GitHub Risk Factors
Secrets exposure
One obvious risk is the exposure of tokens, passwords and secrets embedded in source code. It’s one of those things everyone knows we shouldn’t do, but it’s often the fastest and easiest way to get code working. “I don’t have time to go and set up a secrets share right now. I’m just going to confirm that this works as expected, and I’ll go back and fix it later when I refactor,” said every busy and well-meaning dev behind a catastrophic data breach ever!
Secrets exposure is widespread. Analysis of the Twitch codebase by the GitGuardian team found over 6,000 secrets in their source code. Meanwhile, GitHub’s own scanning of public repositories revealed over 1.7 million exposed secrets. However, it’s also a very solvable problem. In fact, GitHub itself made automated secrets scanning free for all public repositories in December 2022.
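To make the idea of automated secrets scanning concrete, here is a minimal sketch in Python. It is not how GitHub or GitGuardian implement their scanners; the rule names and the three patterns are illustrative assumptions, and production tools rely on far larger rule sets plus entropy checks and validation against the issuing service.

```python
import re

# Illustrative patterns only (an assumption for this sketch);
# real scanners use much larger, carefully tuned rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_personal_access_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "hardcoded_password": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]{4,}['\"]"),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for lines matching any pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    sample = 'aws_key = "AKIAIOSFODNN7EXAMPLE"\nname = "demo"\npassword = "hunter2!"\n'
    for line_number, rule_name in find_secrets(sample):
        print(f"line {line_number}: possible {rule_name}")
```

A check like this is cheap enough to run before every commit, which is the point at which a hardcoded secret is easiest to remove.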
Attack path analysis
An attacker who gains access to a repository can download your source and analyze it for exploits at leisure. Source code analysis can uncover hidden vulnerabilities or backdoors that can enable further attacks. The open-source roots and more collaborative nature of source code creation can make some teams more casual about their security posture for code. “There’s no customer data in here…”
Supply chain attacks
Increasing reliance on open-source modules for development of even critical proprietary software, combined with the storage of packages in public GitHub repositories, gives attackers a path to inject malicious code into your source. The SolarWinds attack amply demonstrated the potential fallout of software supply chain attacks and showed us that we need to look holistically at the supply chain to secure source code.
Infrastructure-as-Code
The rise of HashiCorp Terraform and other Infrastructure-as-Code (IaC) tools means that the blueprint of your production environment itself may be stored in a GitHub code repository. As DevOps and DevSecOps teams and methodologies become more common as part of the Shift Left movement, more and more security and configuration changes are made in these IaC files, not on the systems themselves. New processes and security tools are needed to cover this change in business processes.
Challenges to securing source code in GitHub
Complexity of access controls
Much like every other source of sensitive data in the cloud, GitHub’s access controls are becoming increasingly complicated, with a growing set of organization types, role types, permission types, etc. This complexity, combined with an explosion in the total number of repositories and contributors organizations are managing, creates a huge visibility challenge. Currently there are 90 distinct permissions that a user can have on any given repository. There are roles that help manage and aggregate permissions, but a user’s role can vary by repository. Multiply that by the number of contributors you have, and the number of repositories you host, and it becomes very hard to answer the question of who has what access to critical repositories. Plus, it’s not always clear from the name of a GitHub role what permissions it really provides. For example, the “Read” role allows “write” actions for comments.
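The “who has what access” question can be sketched as a small aggregation over role assignments. The example below is a simplified illustration, not GitHub’s actual permission model: the three roles carry only a handful of made-up permission names (such as `comment_on_issues` and `push_code`), and the users and repositories are hypothetical.

```python
from collections import defaultdict

# Simplified, illustrative mapping (an assumption for this sketch);
# GitHub's real roles carry many more fine-grained permissions.
ROLE_PERMISSIONS = {
    "read": {"pull_code", "comment_on_issues"},
    "write": {"pull_code", "comment_on_issues", "push_code"},
    "admin": {"pull_code", "comment_on_issues", "push_code", "delete_repo"},
}


def effective_access(assignments):
    """Build {repo: {user: permissions}} from (user, repo, role) rows."""
    access = defaultdict(lambda: defaultdict(set))
    for user, repo, role in assignments:
        # A user may hold several roles on one repo; take the union.
        access[repo][user] |= ROLE_PERMISSIONS[role]
    return access


def who_can(access, repo, permission):
    """Return the sorted users holding a given permission on a repository."""
    users = access.get(repo, {})
    return sorted(user for user, perms in users.items() if permission in perms)


if __name__ == "__main__":
    # Hypothetical assignments: the same user holds different roles per repo.
    assignments = [
        ("alice", "infra-iac", "admin"),
        ("bob", "infra-iac", "read"),
        ("bob", "docs", "write"),
        ("carol", "infra-iac", "write"),
    ]
    access = effective_access(assignments)
    print(who_can(access, "infra-iac", "push_code"))
    print(who_can(access, "infra-iac", "comment_on_issues"))
```

Even in this toy model, a user with only the “read” role shows up in the answer for `comment_on_issues`, which mirrors the point that role names do not always tell you what a user can actually do.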
Private and public repositories in the same organization
It’s common for companies to use both private and public repositories for different tasks. For example, you might host user-accessible documentation, company-sponsored open-source projects, sample apps, or connectors in public repositories, and keep your proprietary source code and IaC files in private repositories. When you’re managing hundreds of repositories, it’s difficult to remember where external collaborators belong and where they don’t.
Mingling of company and personal identities
Just as private and public repositories are hosted by the same organization, it’s also common to see a mix of personal and company identities, even for internal collaborators. As we’ve already discussed, developers tend to use their personal GitHub handle to keep track of their contributions across their career. Even if companies require employees to create dedicated company accounts, the global nature of GitHub usernames can mean that the desired names are unavailable, or are easy to copy. For example, I don’t actually have to work at Cyberdyne Systems to make the username miles_dyson_cyberdine. This makes the crucial task of differentiating between internal and external contributors even more difficult. Who exactly is CodeNinja666 anyway? Are they one of ours? Should they be pushing changes to source code?
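One way to picture the internal-versus-external problem is to compare the logins on a repository against the GitHub logins your identity provider knows about, and to flag near-misses separately. The sketch below assumes you can export such a list from your IdP; the function name, the 0.9 similarity threshold, and the extra login sarah_connor_cyberdyne are hypothetical.

```python
from difflib import SequenceMatcher


def classify_contributors(repo_logins, idp_logins, lookalike_threshold=0.9):
    """Label each GitHub login as 'internal', 'lookalike', or 'external'."""
    known = {login.lower() for login in idp_logins}
    labels = {}
    for login in repo_logins:
        candidate = login.lower()
        if candidate in known:
            labels[login] = "internal"
        elif any(
            SequenceMatcher(None, candidate, name).ratio() >= lookalike_threshold
            for name in known
        ):
            # Close to a known login but not identical: worth a manual look.
            labels[login] = "lookalike"
        else:
            labels[login] = "external"
    return labels


if __name__ == "__main__":
    idp_logins = ["miles_dyson_cyberdyne", "sarah_connor_cyberdyne"]
    repo_logins = ["miles_dyson_cyberdyne", "miles_dyson_cyberdine", "CodeNinja666"]
    for login, label in classify_contributors(repo_logins, idp_logins).items():
        print(f"{login}: {label}")
```

String similarity is only a heuristic. The authoritative signal is still the link between a GitHub account and an identity in your IdP.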
Secure your source code with Veza
Think about your company’s “crown jewels”: your most valuable information and data, and the assets on which attackers are likely to set their sights. Your source code is almost certainly on the list and needs to be protected accordingly. Securing source code in GitHub is part of a larger need to solve for SaaS access security and governance in your organization, and it needs to start with the full linking of identity to sensitive data objects, like GitHub repositories. Veza captures identity and authorization metadata from your identity provider, cloud providers, and SaaS providers like GitHub, so that you can track access permissions for all identities, through GitHub access controls, to all repositories. Incidentally, we also integrate with other source code and developer tools like Bitbucket and GitLab, if your organization doesn’t use GitHub.
Track internal and external contributors
By capturing metadata from your IdP, as well as GitHub, Veza makes it easy to track access for all contributors. You can distinguish between internal and external collaborators, create alert rules to discover and remediate external identities who can access sensitive repositories, and make sure that all internal users are correctly provisioned through the appropriate groups and roles in your IdP. No more wondering whether CodeNinja666 works for you or not.
Understand effective permissions
Remember those 90 permissions a user can have on a repository? Veza standardizes permissions from GitHub, along with other SaaS platforms, cloud providers and databases, into human-readable language that anyone can understand. You don’t need to be an expert in GitHub access controls to know who can create, read, update or delete key repositories.
Automated monitoring and remediation
Using Veza’s Query Builder and Alert Rules, you can constantly monitor your GitHub repositories to remediate common misconfigurations and enforce your security policies. For example, you can:
- Find and eliminate orphaned or dormant accounts
- Monitor changes in write access to critical repositories storing production app deployment packages or IaC config files
- Prevent unwanted accounts from gaining write access to IaC repositories to secure your infrastructure
- Prioritize specific users and roles for risk-based remediation of access as we extend our Over Privileged Access Score (OPAS) to GitHub in the future
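As a rough illustration of the first check in the list above, finding dormant accounts comes down to comparing each account’s last activity against a cutoff. This is a generic sketch, not Veza’s Query Builder or Alert Rules: the function name, the 90-day default, and the sample accounts are all assumptions.

```python
from datetime import date, timedelta


def dormant_accounts(last_active, today, max_idle_days=90):
    """Return sorted logins with no recorded activity or none since the cutoff.

    last_active maps login -> date of last activity, or None if never seen.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        login
        for login, seen in last_active.items()
        if seen is None or seen < cutoff
    )


if __name__ == "__main__":
    # Hypothetical activity data for three accounts.
    last_active = {
        "alice": date(2023, 3, 1),
        "bob": date(2022, 10, 1),
        "svc-deploy": None,
    }
    print(dormant_accounts(last_active, today=date(2023, 3, 15)))
```

The idle window is a policy choice. A service account that deploys once a quarter may need a longer one than a human contributor.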
See Veza for GitHub in action
To see how Veza can help you secure your source code, schedule a demo.