CVE Identification and Secure Git

People

William Johnson, Kellan Christ (we're open to more interested members)

Outline

The bottleneck in software security lies in the supply chain. Developers often use code written by other developers through packages and libraries, but they may be unaware of vulnerabilities and exploits in that code. Additionally, packages may not be updated when new vulnerabilities are discovered.

Project Description

With CVE Identification (and Secure Git), the goal is to determine a point of interest (or multiple points of interest) for a CVE. There are two main procedures:

  • Look for a file containing the vulnerability (e.g. with grep) and keep metadata about the file, including its hash
  • Create a hashmap that maps the point of interest to other projects across GitHub, found through World of Code
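The two procedures above can be sketched as follows. This is a minimal illustration, not the project's implementation. It assumes the stored hash is the git blob SHA-1 (the value `git hash-object` prints); the outline does not say which hash function is used. `FileRecord`, `make_record`, `build_index`, and the project names are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


def git_blob_sha1(content: bytes) -> str:
    """Hash file bytes the way git hashes a blob: sha1("blob <size>\\0" + content)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


@dataclass(frozen=True)
class FileRecord:
    """Metadata kept for one file (hypothetical shape)."""
    project: str
    path: str
    size: int
    blob_sha1: str


def make_record(project: str, path: str, content: bytes) -> FileRecord:
    return FileRecord(project, path, len(content), git_blob_sha1(content))


def build_index(records):
    """Map each blob hash to the set of projects that contain that exact file."""
    index = defaultdict(set)
    for record in records:
        index[record.blob_sha1].add(record.project)
    return dict(index)


# Two hypothetical projects vendoring the same file end up under one key.
records = [
    make_record("org/a", "lib/x.c", b"int x;\n"),
    make_record("org/b", "vendor/x.c", b"int x;\n"),
    make_record("org/c", "y.c", b"int y;\n"),
]
index = build_index(records)
```

Keying on the blob hash rather than the path means a vendored copy is found even when it has been moved or renamed.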

(Preliminary) Naive Identification:

  • Use the file hash to find an exact hash match with another file (i.e. when a file in a directly cloned project has not been modified)
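A rough sketch of the naive lookup, assuming hashes have already been computed and an index from file hash to projects exists. The function name, the placeholder CVE ids, and the shortened hashes are all made up for illustration.

```python
def find_exact_matches(vulnerable_hashes, hash_to_projects):
    """Return {cve_id: sorted project list} for every exact hash match.

    vulnerable_hashes: {cve_id: hash of the known-vulnerable file}
    hash_to_projects:  {file hash: set of projects containing that file}
    """
    hits = {}
    for cve_id, file_hash in vulnerable_hashes.items():
        projects = hash_to_projects.get(file_hash, set())
        if projects:
            hits[cve_id] = sorted(projects)
    return hits


# Placeholder data only; these are not real CVE ids or hashes.
vulnerable_hashes = {"CVE-0000-0001": "aaa111", "CVE-0000-0002": "bbb222"}
hash_to_projects = {"aaa111": {"org/b", "org/a"}, "ccc333": {"org/c"}}
matches = find_exact_matches(vulnerable_hashes, hash_to_projects)
```

An exact match only catches unmodified copies; a single changed byte produces a different hash, which is why the advanced identification step is needed.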

Advanced Identification:

  • Find specific code snippets containing the vulnerability
  • Slide a window over the token stream to find ranges of tokens that match vulnerable code
  • Sandboxed fuzzing? (Requires executing code)
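One possible reading of the sliding-window idea is sketched below. It assumes a very simple regex tokenizer and a position-by-position match ratio; both are illustrative choices, and a real implementation would likely use a language-aware lexer. `tokenize`, `find_snippet`, and `min_ratio` are hypothetical names.

```python
import re

# Identifiers, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source: str):
    return TOKEN_RE.findall(source)


def find_snippet(snippet: str, source: str, min_ratio: float = 1.0):
    """Return token offsets in `source` where a window matches `snippet`.

    A window matches when the fraction of positions holding the same
    token is at least `min_ratio` (1.0 means token-for-token equality).
    """
    needle = tokenize(snippet)
    haystack = tokenize(source)
    n = len(needle)
    if n == 0:
        return []
    offsets = []
    for start in range(len(haystack) - n + 1):
        window = haystack[start:start + n]
        same = sum(a == b for a, b in zip(needle, window))
        if same / n >= min_ratio:
            offsets.append(start)
    return offsets


# Whitespace differences do not matter because matching is on tokens.
exact = find_snippet("strcpy(buf, input)",
                     "int main() { strcpy( buf,input ); return 0; }")
# A renamed variable still matches once the threshold is relaxed.
fuzzy = find_snippet("strcpy(buf, input)", "strcpy(dst, input)", min_ratio=0.8)
```

Matching on tokens instead of raw text tolerates reformatting, and lowering `min_ratio` tolerates small edits such as renamed variables, at the cost of more false positives.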

Data Collection and Data Cleaning

Collection:

  • Collect CVEs in priority order, from the most critical vulnerabilities to the least
  • Each CVE links to a collection of GitHub projects that contain the vulnerability
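The collection step could be modelled as below. The sketch assumes criticality is measured by a CVSS base score (0.0 to 10.0); the outline does not name a severity metric. `CveRecord`, its fields, and the ids and scores are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class CveRecord:
    cve_id: str
    cvss_score: float                           # assumed severity metric
    projects: set = field(default_factory=set)  # GitHub projects with the vulnerability


def prioritize(records):
    """Order CVEs from most to least critical."""
    return sorted(records, key=lambda r: r.cvss_score, reverse=True)


# Placeholder ids and scores, for illustration only.
records = [
    CveRecord("CVE-0000-0001", 5.3, {"org/a"}),
    CveRecord("CVE-0000-0002", 9.8, {"org/a", "org/b"}),
    CveRecord("CVE-0000-0003", 7.5),
]
ordered_ids = [r.cve_id for r in prioritize(records)]
```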

Cleaning:

  • Remove projects where the matched files are orphaned or unused
  • Remove projects that have the least activity
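A minimal sketch of the cleaning filter. The outline does not define how "unused" or "least activity" is measured, so both are stand-ins here: a boolean flag saying whether the matched file is referenced anywhere in the project, and a commit count over the last year with a hypothetical `min_commits` cutoff.

```python
from dataclasses import dataclass


@dataclass
class ProjectStats:
    name: str
    commits_last_year: int            # assumed activity measure
    vulnerable_file_referenced: bool  # False for orphaned or unused files


def clean(projects, min_commits: int = 1):
    """Keep projects that actually use the file and show recent activity."""
    return [
        p for p in projects
        if p.vulnerable_file_referenced and p.commits_last_year >= min_commits
    ]


projects = [
    ProjectStats("org/a", 120, True),
    ProjectStats("org/b", 0, True),     # inactive
    ProjectStats("org/c", 40, False),   # file present but never used
]
kept = [p.name for p in clean(projects)]
```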

Potential Future Work

  • Implementation of sgit utility (a layer built on top of git)
  • Example: Verifies project and asks user to confirm, before cloning
  • Operations: Hash lookup, secure scan
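The confirm-before-clone behaviour might look roughly like this. It is only a sketch of the idea: `sgit_clone`, `cve_lookup`, `confirm`, and `runner` are hypothetical names, and the lookup is assumed to be a function from repository URL to known CVE ids. The only real command used is `git clone <url>`.

```python
import subprocess


def sgit_clone(url, cve_lookup, confirm=input, runner=subprocess.run):
    """Check a project against known CVEs, ask the user, then clone.

    Returns True if the clone ran and False if the user declined.
    """
    findings = cve_lookup(url)
    if findings:
        prompt = (f"{url} matches {len(findings)} known CVE(s): "
                  f"{', '.join(findings)}. Clone anyway? [y/N] ")
        if confirm(prompt).strip().lower() != "y":
            return False
    runner(["git", "clone", url], check=True)
    return True


# Dry run with stand-ins, so nothing is cloned in this example.
calls = []

def fake_runner(cmd, check=False):
    calls.append(cmd)

declined = sgit_clone("https://example.com/repo.git",
                      lambda url: ["CVE-0000-0001"],
                      confirm=lambda prompt: "n",
                      runner=fake_runner)
cloned = sgit_clone("https://example.com/clean.git",
                    lambda url: [],
                    runner=fake_runner)
```

Passing the lookup, prompt, and runner as parameters keeps the wrapper testable without network access and leaves room for the hash lookup and secure scan operations to be plugged in later.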

Expected Outcome

  • Build the hashmap database, with CVEs mapped to affected projects
  • Collect metrics (what % of projects have vulnerabilities)
  • Secure Git utility
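The percentage metric could be computed as below, assuming the database is a mapping from CVE id to a set of affected projects and that the total number of scanned projects is known. The function name and the data are illustrative.

```python
def percent_vulnerable(cve_to_projects, total_projects: int) -> float:
    """Percentage of scanned projects affected by at least one CVE."""
    if total_projects <= 0:
        raise ValueError("total_projects must be positive")
    # Union first, so a project hit by several CVEs is counted once.
    affected = set().union(*cve_to_projects.values())
    return 100.0 * len(affected) / total_projects


# Placeholder data: three distinct affected projects out of twelve scanned.
share = percent_vulnerable(
    {"CVE-0000-0001": {"p1", "p2"}, "CVE-0000-0002": {"p2", "p3"}},
    total_projects=12,
)
```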