namishelex01/Sensitive-Data-Scraping

Sensitive-Data-Scraping

This repo collects the regexes I found across multiple blogs, repos, and articles for scraping sensitive data out of a page or site.

Reference: https://github.com/dxa4481/truffleHogRegexes

Regex table for sensitive information harvesting

Slack Token

(xox[pboa]-[0-9]{12}-[0-9]{12}-[0-9]{12}-[a-z0-9]{32})

RSA private key

-----BEGIN RSA PRIVATE KEY-----

SSH (OPENSSH) private key

-----BEGIN OPENSSH PRIVATE KEY-----

SSH (DSA) private key

-----BEGIN DSA PRIVATE KEY-----

SSH (EC) private key

-----BEGIN EC PRIVATE KEY-----

PGP private key block

-----BEGIN PGP PRIVATE KEY BLOCK-----

Putty Key

PuTTY-User-Key-File-2

SSH2 private key

-----BEGIN SSH2 ENCRYPTED PRIVATE KEY-----

Generic Private Key block

BEGIN.*PRIVATE KEY

Username Password combo

^[a-z]+://.*:.*@

Facebook Oauth

[fF][aA][cC][eE][bB][oO][oO][kK].*['|\"][0-9a-f]{32}['|\"]

Twitter Oauth

[tT][wW][iI][tT][tT][eE][rR].*['|\"][0-9a-zA-Z]{35,44}['|\"]

GitHub

[gG][iI][tT][hH][uU][bB].*['|\"][0-9a-zA-Z]{35,40}['|\"]

Google Oauth

(\"client_secret\":\"[a-zA-Z0-9-_]{24}\")

AWS API Key

AKIA[0-9A-Z]{16}

Heroku API Key

[hH][eE][rR][oO][kK][uU].*[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}

Generic API Key

[aA][pP][iI]_?[kK][eE][yY].*['|\"][0-9a-zA-Z]{32,45}['|\"]

Generic Secret

[sS][eE][cC][rR][eE][tT].*['|\"][0-9a-zA-Z]{32,45}['|\"]

Slack Webhook

https://hooks[.]slack[.]com/services/T[a-zA-Z0-9_]{8}/B[a-zA-Z0-9_]{8}/[a-zA-Z0-9_]{24}

Google (GCP) Service-account

\"type\": \"service_account\"

Twilio API Key

SK[a-z0-9]{32}

Password in URL

[a-zA-Z]{3,10}://[^/\\s:@]{3,20}:[^/\\s:@]{3,20}@.{1,100}[\"'\\s]

Amazon MWS Auth Token

(amzn\\.mws\\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})

Facebook Access Token

(EAACEdEose0cBA[0-9A-Za-z]+)

Google API Key

(AIza[0-9A-Za-z\\-_]{35})

Google Cloud Platform API Key

(AIza[0-9A-Za-z\\-_]{35})

Google Cloud Platform OAuth

([0-9]+-[0-9A-Za-z_]{32}\\.apps\\.googleusercontent\\.com)

Google Drive API Key

(AIza[0-9A-Za-z\\-_]{35})

Google Drive OAuth

([0-9]+-[0-9A-Za-z_]{32}\\.apps\\.googleusercontent\\.com)

Google (GCP) Service-account

("type": "service_account".*)

Google Gmail API Key

(AIza[0-9A-Za-z\\-_]{35})

Google Gmail OAuth

([0-9]+-[0-9A-Za-z_]{32}\\.apps\\.googleusercontent\\.com)

Google OAuth Access Token

(ya29\\.[0-9A-Za-z\\-_]+)

Google YouTube API Key

(AIza[0-9A-Za-z\\-_]{35})

Google YouTube OAuth

([0-9]+-[0-9A-Za-z_]{32}\\.apps\\.googleusercontent\\.com)

MailChimp API Key

([0-9a-f]{32}-us[0-9]{1,2})

Mailgun API Key

(key-[0-9a-zA-Z]{32})

PayPal Braintree Access Token

(access_token\\$production\\$[0-9a-z]{16}\\$[0-9a-f]{32})

Picatic API Key

(sk_live_[0-9a-z]{32})

Stripe API Key

(sk_live_[0-9a-zA-Z]{24})

Stripe Restricted API Key

(rk_live_[0-9a-zA-Z]{24})

Square Access Token

(sq0atp-[0-9A-Za-z\\-_]{22})

Square OAuth Secret

(sq0csp-[0-9A-Za-z\\-_]{43})

Twitter Access Token

[tT][wW][iI][tT][tT][eE][rR].*[1-9][0-9]+-[0-9a-zA-Z]{40}
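As a rough illustration of how the table might be used, the sketch below scans text line by line with a handful of the patterns above. Several entries are written with doubled backslashes (for example `amzn\\.mws\\.`), which looks like JSON string escaping, so the sketch assumes that and decodes the patterns with `json.loads` before compiling them. The names `PATTERNS_JSON`, `compile_patterns`, and `scan` are hypothetical and not part of this repo.

```python
import json
import re

# A few patterns copied verbatim from the table above, stored as JSON text.
# Assumption: the doubled backslashes are JSON escapes, so json.loads turns
# "\\." into the regex escape "\.".
PATTERNS_JSON = r'''
{
  "AWS API Key": "AKIA[0-9A-Z]{16}",
  "Amazon MWS Auth Token": "(amzn\\.mws\\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})",
  "Google API Key": "(AIza[0-9A-Za-z\\-_]{35})",
  "Stripe API Key": "(sk_live_[0-9a-zA-Z]{24})"
}
'''


def compile_patterns(patterns_json):
    """Decode the JSON mapping and compile every pattern."""
    return {name: re.compile(rx) for name, rx in json.loads(patterns_json).items()}


def scan(text, compiled):
    """Return (line number, pattern name, matched text) for every hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, rx in compiled.items():
            for match in rx.finditer(line):
                findings.append((lineno, name, match.group(0)))
    return findings


if __name__ == "__main__":
    # Obviously fake values that only have the right shape.
    fake_aws = "AKIA" + "A" * 16
    fake_mws = "amzn.mws." + "0" * 8 + "-0000-0000-0000-" + "0" * 12
    sample = "aws_key = " + fake_aws + "\ntoken: " + fake_mws
    for finding in scan(sample, compile_patterns(PATTERNS_JSON)):
        print(finding)
```

Patterns in the table that contain `.*` can match very long spans, so scanning one line at a time keeps each match bounded.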

Blacklisted Keywords to search for inside the context

apikey

api_key

aws_secret_access_key

db_pass

password

passwd

private_key

secret

secrete
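One possible way to apply the keyword list is a plain case-insensitive substring search over each line. This is only a sketch: the repo does not say how the keywords are meant to be matched, and the function name `lines_with_keywords` is hypothetical.

```python
# Keywords copied from the list above.
KEYWORDS = [
    "apikey",
    "api_key",
    "aws_secret_access_key",
    "db_pass",
    "password",
    "passwd",
    "private_key",
    "secret",
    "secrete",
]


def lines_with_keywords(text, keywords=KEYWORDS):
    """Return (line number, matched keywords) for lines containing a keyword.

    Assumption: matching is a case-insensitive substring test.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        matched = [kw for kw in keywords if kw in lowered]
        if matched:
            hits.append((lineno, matched))
    return hits


if __name__ == "__main__":
    sample = "user = alice\nDB_PASS = hunter2\napi_key: abc"
    print(lines_with_keywords(sample))
```

Lines flagged this way are only candidates; the value next to the keyword still has to be extracted and checked.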

False positive cases to consider

""

""):

"\'

")

#pass

#password

$(shell

'\"

''

''):

')

'this

(nsstring

-default}

<a

<aws_secret_access_key>

<input

<password>

=

\\"$(shell

\\k.*"

\\k.*'

`grep

dummy_secret

false

false):

false,

false;

none

none,

none}

not

null

null,

null;

password

password)

password,

password},

redacted

some_key

string,

string?

string}

string}}

test-access-key

todo

true

true):

true,

true;

{
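A candidate value can be checked against this list before it is reported. The sketch below assumes an exact, case-insensitive comparison after trimming whitespace, which the repo does not specify, and it uses only a subset of the entries above. `FALSE_POSITIVES`, `is_false_positive`, and `filter_candidates` are hypothetical names.

```python
# A subset of the false-positive values listed above.
FALSE_POSITIVES = {
    '""',
    "''",
    "<aws_secret_access_key>",
    "<password>",
    "dummy_secret",
    "false",
    "none",
    "null",
    "password",
    "redacted",
    "some_key",
    "test-access-key",
    "todo",
    "true",
}


def is_false_positive(value, false_positives=FALSE_POSITIVES):
    """Assumption: exact match after trimming whitespace and lowercasing."""
    return value.strip().lower() in false_positives


def filter_candidates(candidates):
    """Keep only (keyword, value) pairs whose value is not a known placeholder."""
    return [(kw, val) for kw, val in candidates if not is_false_positive(val)]


if __name__ == "__main__":
    candidates = [
        ("password", "null"),
        ("password", "hunter2"),
        ("secret", "REDACTED"),
        ("api_key", "<password>"),
    ]
    print(filter_candidates(candidates))
```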

Regex for Specific Cases

Something followed by Colon

Eg. secret_key: foo

({})(("|\')?):(\s*?)(("|\')?)([^\s]+)(\5)

Something followed by Colon & Quotes

Eg. secret_key: "foo"

({})(("|\')?):(\s*?)(("|\'))([^\s]+)(\5)

Something followed by Equal signs

Eg. password = bar

({})((\'|")])?()(\s*?)=(\s*?)(("|\')?)([^\s]+)(\7)

Something followed by Equal signs & Quotes

Eg. password = "bar"

({})((\'|")])?()(\s*?)=(\s*?)(("|\'))([^\s]+)(\7)

Something followed by Quotes & Semicolon

Eg. private_key "something";

({})([^\s]*?)(\s*?)("|\')([^\s]+)(\4);
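The `{}` at the start of each pattern reads like a placeholder for a keyword. The sketch below assumes it is a Python `str.format` slot and fills it with an escaped keyword before compiling; that interpretation, the helper name `extract_value`, and the choice of capture group for the value are assumptions rather than something the repo states.

```python
import re

# Two templates copied from the list above.
COLON_TEMPLATE = r'({})(("|\')?):(\s*?)(("|\')?)([^\s]+)(\5)'
EQUALS_TEMPLATE = r'({})((\'|")])?()(\s*?)=(\s*?)(("|\')?)([^\s]+)(\7)'


def extract_value(template, keyword, line, value_group):
    """Fill the template with a keyword and return the captured value, or None.

    value_group is the index of the ([^\\s]+) group: 7 in the colon template
    and 9 in the equals template.
    """
    pattern = re.compile(template.format(re.escape(keyword)))
    match = pattern.search(line)
    return match.group(value_group) if match else None


if __name__ == "__main__":
    print(extract_value(COLON_TEMPLATE, "secret_key", "secret_key: foo", 7))
    print(extract_value(COLON_TEMPLATE, "secret_key", 'secret_key: "foo"', 7))
    print(extract_value(EQUALS_TEMPLATE, "password", "password = bar", 9))
```

Because the closing quote is a backreference to the optional opening quote, the same template handles both the quoted and the unquoted form.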


