GitHub has launched a new way for developers to identify and flag security vulnerabilities: code scanning powered by machine learning (ML).
GitHub said it used examples surfaced by community and GitHub-created CodeQL queries to train deep learning models that help recognise whether a code snippet contains a potential vulnerability. With the new ML analysis capabilities, code scanning can find more alerts for four of the most common vulnerability patterns: cross-site scripting (XSS), path injection, NoSQL injection, and SQL injection.
GitHub code scanning is powered by the CodeQL analysis engine. To identify potential security vulnerabilities, you can enable CodeQL to run queries against your codebase. These open source queries are written by members of the community and GitHub security experts, and each query is carefully crafted to recognise as many variants of a particular vulnerability type as possible and provide broad Common Weakness Enumeration (CWE) coverage. Queries are continuously updated to recognise emerging libraries and frameworks. Identifying such libraries matters because it allows the engine to accurately track flows of untrusted user data, which are often the root cause of security vulnerabilities.
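To illustrate the kind of untrusted-data flow described above, here is a hedged sketch in JavaScript of the SQL-injection pattern these queries look for. The function names and the malicious input are hypothetical, chosen purely to show how concatenating user input into a query string changes the query's meaning, while a parameterised form keeps the query text fixed:

```javascript
// Vulnerable pattern: untrusted input is concatenated directly into the
// SQL text, so the input can rewrite the query itself.
function buildQueryUnsafe(userId) {
  return "SELECT * FROM users WHERE id = '" + userId + "'";
}

// Safer pattern: the query text stays constant and the input is passed
// separately as a bound parameter.
function buildQuerySafe(userId) {
  return { text: "SELECT * FROM users WHERE id = $1", values: [userId] };
}

// A crafted value escapes the quoting in the unsafe version and injects
// an always-true condition into the WHERE clause.
const malicious = "1' OR '1'='1";
console.log(buildQueryUnsafe(malicious));
// The parameterised form leaves the query text untouched; the crafted
// value is just an ordinary parameter.
console.log(JSON.stringify(buildQuerySafe(malicious)));
```

Code scanning's job is to trace whether a value like `userId` can originate from an untrusted source (an HTTP request, for example) and reach a sink like the string concatenation above.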
The new experimental JavaScript and TypeScript analysis has been rolled out to all users of code scanning’s security-extended and security-and-quality analysis suites. “If you’re already using one of these suites, your code will be analysed using the new machine learning technology,” said Tiferet Gazit, senior machine learning engineer, and Alona Hlobina, product manager, both at GitHub, in a blog post.
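For readers who have not yet opted in, a minimal sketch of a GitHub Actions workflow that enables code scanning with the security-extended query suite follows; action versions and trigger events are illustrative and may differ from your setup:

```yaml
# Sketch: run CodeQL analysis with the security-extended query suite.
name: "CodeQL"
on: [push, pull_request]
jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    steps:
      - uses: actions/checkout@v3
      - uses: github/codeql-action/init@v2
        with:
          languages: javascript   # also covers TypeScript
          queries: security-extended
      - uses: github/codeql-action/analyze@v2
```

With a suite like this configured, the experimental ML-powered analysis runs automatically; no further configuration is required.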
If the new experimental analysis finds additional results, you will see the new alerts displayed alongside the other code scanning alerts in the “Security” tab of your repository and on pull requests. The new alerts are marked with the “Experimental” label.
The ML-powered analysis is now available in public beta for JavaScript and TypeScript repositories on GitHub.