Add python random samples by amazon-pratik · Pull Request #65 · amazon-pratik/Repo_test_owasp2

amazon-pratik · 2025-08-05T12:36:21Z

Added 50 randomly selected python code samples from the security dataset.

amazon-q-developer · 2025-08-05T12:36:25Z

⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done.

gemini-code-assist

Summary of Changes

Hello @amazon-pratik, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've introduced a new set of 50 randomly selected Python code samples, sourced from a security dataset. This addition aims to enrich our existing collection, providing a broader range of examples for security analysis and testing. The samples cover various common vulnerabilities and demonstrate different programming patterns, particularly within the context of machine learning frameworks like PyTorch and TensorFlow.

Highlights

New README for Python Samples: I've added a new python/README.md file that provides an overview of the random code samples, their purpose, and their structure within the repository.
Addition of 50 Python Code Samples: I've included 50 randomly selected Python code samples, primarily focusing on security-related defects and machine learning code, to expand our dataset for analysis and testing purposes.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the root of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

amazon-q-developer · 2025-08-05T12:43:40Z

+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive 
+def non_conformant_3(value):
+    result = '<strong>' + value + '</strong>'


Caution
Description: Concatenating unsanitized input directly into HTML can lead to XSS vulnerabilities. Use Django's template system or escape the input using conditional_escape before concatenation.

Severity: Critical

The fix addresses the XSS vulnerability by using Django's conditional_escape function to sanitize the input before concatenating it with HTML. This prevents potential malicious scripts from being executed when the template filter is used. The fix is applied to the non_conformant_1 and non_conformant_3 functions, which were previously vulnerable to XSS attacks.

Suggested change

-    result = '<strong>' + value + '</strong>'
+# Non Conformant
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True)      #sensitive
+def non_conformant_1(value):
+    from django.utils.html import conditional_escape  # Import conditional_escape for input sanitization
+    result = '<strong>' + conditional_escape(value) + '</strong>'
+    return result
+# {/fact}
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True)      #sensitive
+def non_conformant_2(value, arg):
+    result = value.replace(arg, '')
+    return result
+# {/fact}
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive
+def non_conformant_3(value):
+    from django.utils.html import conditional_escape  # Import conditional_escape for input sanitization
+    result = '<strong>' + conditional_escape(value) + '</strong>'
+    return result
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive
+def non_conformant_4(value, arg):
+    result = value.replace(arg, '')
+    return result
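
To illustrate the escaping step this comment recommends, here is a minimal framework-free sketch. It uses the standard library's `html.escape` as a stand-in for Django's `conditional_escape` (an assumption made so the snippet runs without Django; unlike `conditional_escape`, `html.escape` also re-escapes strings that were already marked safe). The function name `wrap_strong` is hypothetical.

```python
import html


def wrap_strong(value: str) -> str:
    """Wrap untrusted text in <strong> tags, escaping it first."""
    # Escape &, <, >, and quotes so the value is rendered as text, not markup.
    return "<strong>" + html.escape(str(value)) + "</strong>"


if __name__ == "__main__":
    print(wrap_strong("<script>alert(1)</script>"))
```

The point of the sketch is the ordering: the untrusted value is escaped before it is joined with trusted markup, so only the literal `<strong>` tags reach the page as HTML.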

amazon-q-developer · 2025-08-05T12:43:41Z

+
+    # Noncompliant: Unsafe data from event is written into S3 Select query.
+    sql_exp = "SELECT s.DateTime, s.Wind Speed, s.Wind Direction"\
+            "FROM s3object s WHERE s.DateTime = '%s'" % current_user


Caution
Description: SQL injection vulnerability in S3 Select query due to unsanitized user input. Use parameterized queries or sanitize the user input before constructing the SQL expression.

Severity: Critical

The fix addresses the SQL injection vulnerability by sanitizing the user input before using it in the SQL query. We import the 're' module to use regular expressions for input sanitization. The 'current_user' input is sanitized by removing any characters that are not alphanumeric, whitespace, or hyphens. The sanitized input is then used in the SQL query using string formatting. This approach helps prevent SQL injection attacks by ensuring that only expected characters are included in the query.

Suggested change

-            "FROM s3object s WHERE s.DateTime = '%s'" % current_user
+# FN in this case implies that 'Expression' parameter of S3 query APIs like 'select_object_content()' is not considered as sink by the rule.
+import boto3
+import re  # Import re for regular expression operations
+s3 = boto3.client('s3')
+def lambda_handler(event, context):
+    current_user = event["username"]
+    # Sanitize the user input
+    sanitized_user = re.sub(r'[^\w\s-]', '', current_user)
+    # Use the sanitized input in the SQL query
+    sql_exp = "SELECT s.DateTime, s.\"Wind Speed\", s.\"Wind Direction\" "\
+              "FROM s3object s WHERE s.DateTime = '{}'".format(sanitized_user)
+    resp  = s3.select_object_content(
+        Bucket="XXXXXXXXXXX",
+        Key="key",
+        ExpressionType='SQL',
+        Expression=sql_exp,
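
Stripping characters is one option; another is to reject any input that does not match the expected shape of the value. The sketch below assumes the DateTime column holds ISO-8601-style timestamps (an assumption; the PR does not say), and the pattern, function names, and error message are hypothetical. It also reproduces the `re.sub` call from the suggestion so its effect on a typical injection string can be checked.

```python
import re

# Hypothetical allowlist: timestamps such as 2025-08-05T12:36:21 or 2025-08-05T12:36:21Z.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z?", re.ASCII)


def build_select_expression(timestamp: str) -> str:
    """Build the S3 Select expression only if the input looks like a timestamp."""
    if not TIMESTAMP_RE.fullmatch(timestamp):
        raise ValueError("unexpected timestamp format")
    return (
        'SELECT s.DateTime, s."Wind Speed", s."Wind Direction" '
        "FROM s3object s WHERE s.DateTime = '{}'".format(timestamp)
    )


def strip_unexpected(value: str) -> str:
    """The character-stripping approach from the suggested change."""
    return re.sub(r"[^\w\s-]", "", value)
```

Rejecting malformed input keeps the query from ever running with a silently altered value, whereas stripping turns an input like `x' OR '1'='1` into `x OR 11` and still executes the query.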

amazon-q-developer · 2025-08-05T12:43:42Z

+def do_login():
+    username = request.form['username']
+    password = request.form['password']
+    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="db")


Caution
Description: We detected the use of a hardcoded active database password in the source code. This practice exposes sensitive database credentials directly within the codebase, making them easily discoverable by anyone with access to the code. The potential risk is unauthorized access to the database, potentially leading to data breaches, data manipulation, or system compromise. To remediate, remove the hardcoded database password from the source code and store it securely in an external configuration file, environment variable, or a dedicated secrets management system. Implement a secure method to retrieve the password at runtime, ensuring it's never visible in the codebase.

Severity: Critical

The fix removes hardcoded database credentials and instead retrieves them from environment variables or a configuration file. It also uses parameterized queries to prevent SQL injection vulnerabilities.

Suggested change

-    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="db")
+# Import os for environment variable access
+# Import configparser for reading configuration files
+import os
+import configparser
+def do_login():
+    username = request.form['username']
+    password = request.form['password']
+    # Load database credentials from environment variables or configuration file
+    db_host = os.environ.get('DB_HOST', 'localhost')
+    db_user = os.environ.get('DB_USER', 'root')
+    db_password = os.environ.get('DB_PASSWORD')
+    db_name = os.environ.get('DB_NAME', 'db')
+    if not db_password:
+        config = configparser.ConfigParser()
+        config.read('config.ini')
+        db_password = config['Database']['Password']
+    db = MySQLdb.connect(host=db_host, user=db_user, passwd=db_password, db=db_name)
+    cur = db.cursor()
+    cur.execute("SELECT * FROM users WHERE username = %s AND password = %s", (username, password))
+    results = cur.fetchall()
+    if len(results) > 0:
+        session['username'] = username
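
The lookup order in this suggestion (environment variable first, configuration file second) can be sketched as a small helper. The `DB_PASSWORD` variable and the `[Database]` section follow the suggestion; the helper name `load_db_password` and the choice to raise `RuntimeError` when neither source provides a value are assumptions of this sketch. In the suggested change as written, a missing `config.ini` would surface as a `KeyError` instead, because `ConfigParser.read()` silently skips files it cannot find.

```python
import configparser
import os
from typing import Mapping, Optional


def load_db_password(
    env: Optional[Mapping[str, str]] = None,
    config_path: str = "config.ini",
) -> str:
    """Return the database password from the environment or a config file."""
    env = os.environ if env is None else env
    password = env.get("DB_PASSWORD")
    if password:
        return password

    config = configparser.ConfigParser()
    config.read(config_path)  # A missing file is silently skipped by read().
    password = config.get("Database", "Password", fallback=None)
    if not password:
        raise RuntimeError("database password is not configured")
    return password
```

Passing the environment in as a mapping keeps the helper easy to test without modifying the real process environment.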

amazon-q-developer · 2025-08-05T12:43:43Z

+def get_db_connection():
+    db = MySQLdb.connect(host="localhost",
+                         user="root",
+                         passwd="root",


Caution
Description: We detected credentials hardcoded in your code, which might allow unauthorized users to access your account. We recommend storing your credentials outside the code, in a configuration file, a database, or a secrets management service, and retrieving them from there. For more information, see CWE-259 and CWE-798.

Severity: Critical

The fix replaces the hardcoded password with a call to os.getenv() to retrieve the password from an environment variable. This improves security by removing sensitive information from the source code.

Suggested change

-                         passwd="root",
+import os  # Import os module to access environment variables
+def get_db_connection():
+    db = MySQLdb.connect(host="localhost",
+                         user="root",
+                         passwd=os.getenv("DB_PASSWORD"),  # Retrieve password from environment variable
+                         db="db")
+    return db

amazon-q-developer · 2025-08-05T12:43:44Z

+def do_login():
+    username = request.form['username']
+    password = request.form['password']
+    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="db")


Caution
Description: We detected credentials hardcoded in your code, which might allow unauthorized users to access your account. We recommend storing your credentials outside the code, in a configuration file, a database, or a secrets management service, and retrieving them from there. For more information, see CWE-259 and CWE-798.

Severity: Critical

The fix replaces the hardcoded database password with an environment variable accessed using os.getenv(). This improves security by removing sensitive information from the source code and allows for easier management of credentials.

Suggested change

-    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="db")
+# Import os to access environment variables for secure credential storage
+import os
+def do_login():
+    username = request.form['username']
+    password = request.form['password']
+    db = MySQLdb.connect(host="localhost", user="root", passwd=os.getenv("DB_PASSWORD"), db="db")
+    cur = db.cursor()
+    cur.execute("SELECT * FROM users WHERE username = '" + username + "' AND password = '" + password + "'")
+    results = cur.fetchall()
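
Note that the query in this suggestion is still built by string concatenation. A parameterized version, as used in the other suggestions for the same function, can be demonstrated with the standard library's `sqlite3` module standing in for MySQLdb (an assumption made so the example runs anywhere; `sqlite3` uses `?` placeholders where MySQLdb uses `%s`). The table layout and the helpers `find_user` and `make_demo_db` are hypothetical.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str, password: str) -> list:
    """Look up a user with bound parameters instead of string concatenation."""
    cur = conn.execute(
        "SELECT username FROM users WHERE username = ? AND password = ?",
        (username, password),
    )
    return cur.fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "pw"))
    return conn
```

With bound parameters, an input such as `' OR '1'='1` is compared as a literal string and matches no row. Comparing plaintext passwords, as the sample does, is a separate issue that this sketch does not address.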

amazon-q-developer · 2025-08-05T12:43:45Z

+def do_login():
+    username = request.form['username']
+    password = request.form['password']
+    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="db")


Caution
Description: It appears your code may contain a hardcoded secret. We recommend replacing it with AWS Secrets Manager references to enhance security and follow best practices. For more information, please refer to the OWASP Password Storage Cheat Sheet.

Severity: Critical

The fix replaces hardcoded database credentials with AWS Secrets Manager integration, and uses parameterized queries to prevent SQL injection vulnerabilities.

Suggested change

-    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="db")
+# Import statements
+import os
+from aws_secretsmanager_caching import SecretCache, SecretCacheConfig
+def do_login():
+    username = request.form['username']
+    password = request.form['password']
+    # Use AWS Secrets Manager to retrieve database credentials
+    secret_cache = SecretCache(SecretCacheConfig())
+    db_secret = secret_cache.get_secret_string('my-db-secret')
+    db_creds = json.loads(db_secret)
+    db = MySQLdb.connect(host=db_creds['host'], user=db_creds['username'], passwd=db_creds['password'], db=db_creds['dbname'])
+    cur = db.cursor()
+    cur.execute("SELECT * FROM users WHERE username = %s AND password = %s", (username, password))
+    results = cur.fetchall()
+    if len(results) > 0:
+        session['username'] = username

amazon-q-developer · 2025-08-05T12:43:46Z

+#copilot mean_prob: 0.7142668250582043
+
+def admin_page():
+    is_admin = (g.username == 'admin')


Caution
Description: The admin_page function lacks proper authorization checks, relying only on a simple username comparison. Implement a robust authorization mechanism, such as role-based access control, to verify admin privileges.

Severity: Critical

The fix replaces the simple username comparison with a call to a function is_admin() that should implement a robust authorization mechanism. If the user is not an admin, the function aborts with a 403 Forbidden error. This addresses the comment by implementing a more secure check for admin privileges. However, the fix is incomplete as it requires the implementation of the is_admin() function, which should be defined elsewhere in the application to properly verify admin status, possibly using role-based access control or another secure method.
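
Since `is_admin()` is left undefined, one possible shape for it is a lookup against stored roles rather than a comparison with a fixed username. This is a framework-free sketch under stated assumptions: the `USER_ROLES` mapping, the role name, and raising `PermissionError` (where the Flask view would call `abort(403)`) are all hypothetical.

```python
from typing import Mapping, Set

# Hypothetical role store; a real application would load this from its user database.
USER_ROLES: Mapping[str, Set[str]] = {
    "alice": {"admin", "user"},
    "admin": {"user"},  # The name "admin" alone grants nothing.
}


def is_admin(username: str, roles: Mapping[str, Set[str]] = USER_ROLES) -> bool:
    """Return True only if the user has been granted the admin role."""
    return "admin" in roles.get(username, set())


def require_admin(username: str) -> None:
    """Raise PermissionError for non-admins (stand-in for abort(403))."""
    if not is_admin(username):
        raise PermissionError("403 Forbidden")
```

The design choice here is that privilege comes from an explicit grant, so registering an account named `admin` confers nothing on its own.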

Suggested change

-    is_admin = (g.username == 'admin')
+#the admin page (for admins only - error otherwise)
+@app.route('/admin')
+@ensure_logged_in
+def admin_page():
+    # TODO: Implement a robust authorization mechanism to verify admin privileges
+    if not is_admin(g.username):
+        abort(403)  # Forbidden
+    return '''<html>
+    Hello, %s! <a href='./'>Home</a> <a href='/logout'>Logout</a>
+    <p>Admin? True</p>
+    </html>''' % g.username

amazon-q-developer · 2025-08-05T12:43:47Z

+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive 
+def non_conformant_4(value, arg):
+    result = value.replace(arg, '')


Caution
Description: Returning unsanitized input can lead to XSS vulnerabilities. Implement proper escaping using conditional_escape before returning the result.

Severity: Critical

The fix addresses the XSS vulnerability by implementing proper escaping using conditional_escape before returning the result. The changes include using conditional_escape to sanitize the input and mark_safe to indicate that the output is safe to render. This approach ensures that any potentially harmful content in the input is properly escaped before being returned, preventing XSS attacks.

Suggested change

-    result = value.replace(arg, '')
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True)      #sensitive
+def non_conformant_2(value, arg):
+    result = conditional_escape(value.replace(arg, ''))  # import django.utils.html
+    return mark_safe(result)  # import django.utils.safestring
+# {/fact}
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive
+def non_conformant_3(value):
+    result = '<strong>' + conditional_escape(value) + '</strong>'  # import django.utils.html
+    return mark_safe(result)  # import django.utils.safestring
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive
+def non_conformant_4(value, arg):
+    result = conditional_escape(value.replace(arg, ''))  # import django.utils.html
+    return mark_safe(result)  # import django.utils.safestring
+# {/fact}
+# Conformant

amazon-q-developer · 2025-08-05T12:43:48Z

+
+
+# ruleid:autoescape-disabled
+Environment(loader=templateLoader, autoescape=fake_func())


Caution
Description: Using jinja2 templates with autoescape=False is dangerous and can lead to XSS. Ensure autoescape=True or use the select_autoescape function to mitigate XSS vulnerabilities. https://bandit.readthedocs.io/en/latest/plugins/b701_jinja2_autoescape_false.html

Severity: Critical

The fix replaces the potentially unsafe fake_func() with select_autoescape(['html', 'xml']), which automatically enables autoescaping for HTML and XML files. This helps mitigate XSS vulnerabilities by ensuring proper escaping of user-supplied content in templates.

Suggested change

-Environment(loader=templateLoader, autoescape=fake_func())
+# Import jinja2 for Environment and select_autoescape
+# select_autoescape is used to automatically enable autoescaping for specific file extensions
+from jinja2 import Environment, select_autoescape
+Environment(loader=templateLoader, autoescape=select_autoescape(['html', 'xml']))

amazon-q-developer · 2025-08-05T12:43:49Z

+    now = datetime.datetime.now()
+    html = "<html><body>It is now %s.</body></html>" % now
+
+    return "Hey there! {}!".format(dill.loads(b64decode(user_obj)))


Caution
Description: This code is vulnerable to code injection because it executes user-controlled input without proper validation or sanitization. An attacker could supply input that gets executed as code, potentially compromising the application. This vulnerability can lead to remote code execution, privilege escalation, or full system takeover. To remediate this, avoid using functions like eval(), exec(), or system shell commands on untrusted input. In Python, prefer ast.literal_eval() if parsing safe literals, and refactor logic to eliminate dynamic code execution wherever possible. Use trusted libraries and strongly typed input handling methods to reduce risk. More information - https://owasp.org/www-community/attacks/Code_Injection

Severity: Critical

The fix removes the insecure deserialization using dill.loads() and b64decode(), and instead uses html.escape() to sanitize the user input. The sanitized input is then directly used in the string formatting, eliminating the potential for code injection.

Suggested change

-    return "Hey there! {}!".format(dill.loads(b64decode(user_obj)))
+def current_datetime(request):
+    # import html
+    # html.escape() is used to sanitize user input and prevent XSS attacks
+    user_obj = html.escape(request.cookies.get('uuid', ''))
+    now = datetime.datetime.now()
+    html = "<html><body>It is now %s.</body></html>" % now
+    return "Hey there! {}!".format(user_obj)
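
Two details of this suggestion are worth noting: `import html` appears only as a comment, and the later assignment to a local variable named `html` makes `html` a local name for the whole function, so the `html.escape(...)` call on its first line would raise `UnboundLocalError`. A sketch that avoids the name clash and additionally validates the cookie is shown below. Treating the `uuid` cookie as a UUID string is an assumption based on its name; the function names and the `guest` fallback are hypothetical.

```python
import datetime
import html
import uuid


def safe_user_label(raw_cookie: str) -> str:
    """Return an HTML-escaped UUID string, or 'guest' if the cookie is not a UUID."""
    try:
        parsed = uuid.UUID(raw_cookie)
    except (ValueError, AttributeError, TypeError):
        return "guest"
    return html.escape(str(parsed))


def render_greeting(raw_cookie: str) -> str:
    """Build the greeting without deserializing or executing the cookie value."""
    now = datetime.datetime.now()
    # Named `page` rather than `html` so the html module stays reachable.
    # As in the sample, this string is built but not returned.
    page = "<html><body>It is now %s.</body></html>" % now
    return "Hey there! {}!".format(safe_user_label(raw_cookie))
```

Parsing the cookie into a known type means nothing attacker-supplied is ever deserialized, and the escaping step remains as a second layer before the value is placed in the response.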

amazon-q-developer · 2025-08-05T12:43:50Z

✅ I finished the code review, and left comments with the issues I found. I will now generate code fix suggestions.

Add python random samples

007452e

Added 50 randomly selected python code samples from the security dataset.

gemini-code-assist Bot reviewed Aug 5, 2025

amazon-q-developer Bot reviewed Aug 5, 2025

-    result = '<strong>' + value + '</strong>'
+# Non Conformant
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True)      #sensitive
+def non_conformant_1(value):
+    from django.utils.html import conditional_escape  # Import conditional_escape for input sanitization
+    result = '<strong>' + conditional_escape(value) + '</strong>'
+    return result
+# {/fact}
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True)      #sensitive
+def non_conformant_2(value, arg):
+    result = value.replace(arg, '')
+    return result
+# {/fact}
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive
+def non_conformant_3(value):
+    from django.utils.html import conditional_escape  # Import conditional_escape for input sanitization
+    result = '<strong>' + conditional_escape(value) + '</strong>'
+    return result
+# {fact rule=cross-site-scripting@v1.0 defects=1}
+@register.filter(is_safe=True, needs_autoescape=False)      #sensitive
+def non_conformant_4(value, arg):
+    result = value.replace(arg, '')
+    return result

-            "FROM s3object s WHERE s.DateTime = '%s'" % current_user
+# FN in this case implies that 'Expression' parameter of S3 query APIs like 'select_object_content()' is not considered as sink by the rule.
+import boto3
+import re  # Import re for regular expression operations
+s3 = boto3.client('s3')
+def lambda_handler(event, context):
+    current_user = event["username"]
+    # Sanitize the user input
+    sanitized_user = re.sub(r'[^\w\s-]', '', current_user)
+    # Use the sanitized input in the SQL query
+    sql_exp = "SELECT s.DateTime, s.\"Wind Speed\", s.\"Wind Direction\" "\
+              "FROM s3object s WHERE s.DateTime = '{}'".format(sanitized_user)
+    resp  = s3.select_object_content(
+        Bucket="XXXXXXXXXXX",
+        Key="key",
+        ExpressionType='SQL',
+        Expression=sql_exp,

-    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="db")
+# Import os for environment variable access
+# Import configparser for reading configuration files
+import os
+import configparser
+def do_login():
+    username = request.form['username']
+    password = request.form['password']
+    # Load database credentials from environment variables or configuration file
+    db_host = os.environ.get('DB_HOST', 'localhost')
+    db_user = os.environ.get('DB_USER', 'root')
+    db_password = os.environ.get('DB_PASSWORD')
+    db_name = os.environ.get('DB_NAME', 'db')
+    if not db_password:
+        config = configparser.ConfigParser()
+        config.read('config.ini')
+        db_password = config['Database']['Password']
+    db = MySQLdb.connect(host=db_host, user=db_user, passwd=db_password, db=db_name)
+    cur = db.cursor()
+    cur.execute("SELECT * FROM users WHERE username = %s AND password = %s", (username, password))
+    results = cur.fetchall()
+    if len(results) > 0:
+        session['username'] = username

-    is_admin = (g.username == 'admin')
+#the admin page (for admins only - error otherwise)
+@app.route('/admin')
+@ensure_logged_in
+def admin_page():
+    # TODO: Implement a robust authorization mechanism to verify admin privileges
+    if not is_admin(g.username):
+        abort(403)  # Forbidden
+    return '''<html>
+    Hello, %s! <a href='./'>Home</a> <a href='/logout'>Logout</a>
+    <p>Admin? True</p>
+    </html>''' % g.username



		# ruleid:autoescape-disabled
		Environment(loader=templateLoader, autoescape=fake_func())

-Environment(loader=templateLoader, autoescape=fake_func())
+# Import jinja2 for Environment and select_autoescape
+# select_autoescape is used to automatically enable autoescaping for specific file extensions
+from jinja2 import Environment, select_autoescape
+Environment(loader=templateLoader, autoescape=select_autoescape(['html', 'xml']))

-    return "Hey there! {}!".format(dill.loads(b64decode(user_obj)))
+def current_datetime(request):
+    # import html
+    # html.escape() is used to sanitize user input and prevent XSS attacks
+    user_obj = html.escape(request.cookies.get('uuid', ''))
+    now = datetime.datetime.now()
+    html = "<html><body>It is now %s.</body></html>" % now
+    return "Hey there! {}!".format(user_obj)
