Add python code quality examples#52
Conversation
amazon-pratik
commented
Aug 5, 2025
- Added code quality examples for python
- Includes error handling, logging, naming, performance, and readability/maintainability examples
- Contains both compliant and non-compliant code samples
- Added code quality examples for python - Includes error handling, logging, naming, performance, and readability/maintainability examples - Contains both compliant and non-compliant code samples
|
⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done |
There was a problem hiding this comment.
Summary of Changes
Hello @amazon-pratik, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a structured collection of Python code quality examples. My aim is to provide clear, actionable demonstrations of good and bad coding practices across several key dimensions, serving as a valuable resource for developers looking to improve their Python code. The examples are designed to be self-explanatory, highlighting common pitfalls and their recommended solutions.
Highlights
- Comprehensive Python Code Quality Examples: I've added a new set of Python code examples specifically designed to illustrate various aspects of code quality. These examples cover critical areas such as error handling, logging, naming conventions, performance optimization, and overall readability/maintainability.
- Compliant and Non-Compliant Samples: For each code quality rule, I've provided pairs of code samples: one demonstrating a non-compliant (less optimal or problematic) approach and another showcasing a compliant (best practice) solution. This allows for direct comparison and understanding of the improvements.
- Practical Demonstrations: The examples are practical, covering common programming tasks like file I/O, data processing, API interactions, and mathematical calculations, making the concepts easily relatable to real-world scenarios.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Code Review
This pull request adds a comprehensive set of Python code quality examples, covering error handling, logging, naming conventions, performance, and readability. The examples are well-structured, with clear "compliant" and "non-compliant" versions that effectively demonstrate key principles. My review focuses on enhancing the "compliant" examples to be even more robust and idiomatic. The feedback aims to further improve the quality and educational value of these examples.
| except ValueError as e: | ||
| print(f"Calculation error: {e}") | ||
| return None |
There was a problem hiding this comment.
Consider re-raising the exception after logging it, to ensure that the caller is aware of the error and can handle it appropriately. This prevents the error from being silently ignored.
| except ValueError as e: | |
| print(f"Calculation error: {e}") | |
| return None | |
| except ValueError as e: | |
| print(f"Calculation error: {e}") | |
| raise |
| except configparser.Error as e: | ||
| print(f"Config error: {e}") | ||
| return None |
There was a problem hiding this comment.
Consider re-raising the exception after logging it, to ensure that the caller is aware of the error and can handle it appropriately. This prevents the error from being silently ignored.
| except configparser.Error as e: | |
| print(f"Config error: {e}") | |
| return None | |
| except configparser.Error as e: | |
| print(f"Config error: {e}") | |
| raise |
| print(f"Encryption error: {e}") | ||
| return None | ||
| # {/fact} No newline at end of file |
There was a problem hiding this comment.
Consider re-raising the exception after logging it, to ensure that the caller is aware of the error and can handle it appropriately. This prevents the error from being silently ignored.
| print(f"Encryption error: {e}") | |
| return None | |
| # {/fact} | |
| except Exception as e: | |
| print(f"Encryption error: {e}") | |
| raise |
| except (IOError, csv.Error) as e: | ||
| print(f"CSV error: {e}") | ||
| return None |
There was a problem hiding this comment.
Consider re-raising the exception after logging it, to ensure that the caller is aware of the error and can handle it appropriately. This prevents the error from being silently ignored.
| except (IOError, csv.Error) as e: | |
| print(f"CSV error: {e}") | |
| return None | |
| except (IOError, csv.Error) as e: | |
| print(f"CSV error: {e}") | |
| raise |
| print(f"Conversion error: {e}") | ||
| return None | ||
| # {/fact} No newline at end of file |
There was a problem hiding this comment.
Consider re-raising the exception after logging it, to ensure that the caller is aware of the error and can handle it appropriately. This prevents the error from being silently ignored.
| print(f"Conversion error: {e}") | |
| return None | |
| # {/fact} | |
| except ValueError as e: | |
| print(f"Conversion error: {e}") | |
| raise |
| strength_criteria = { | ||
| 'has_uppercase': any(c.isupper() for c in password), | ||
| 'has_lowercase': any(c.islower() for c in password), | ||
| 'has_digit': any(c.isdigit() for c in password), | ||
| 'has_special': any(c in '!@#$%^&*()_+-=[]{}|;:,.<>?' for c in password) |
There was a problem hiding this comment.
Consider using a list of criteria and then summing the boolean values for better readability.
| strength_criteria = { | |
| 'has_uppercase': any(c.isupper() for c in password), | |
| 'has_lowercase': any(c.islower() for c in password), | |
| 'has_digit': any(c.isdigit() for c in password), | |
| 'has_special': any(c in '!@#$%^&*()_+-=[]{}|;:,.<>?' for c in password) | |
| strength_criteria = [ | |
| any(c.isupper() for c in password), | |
| any(c.islower() for c in password), | |
| any(c.isdigit() for c in password), | |
| any(c in '!@#$%^&*()_+-=[]{}|;:,.<>?' for c in password) | |
| ] | |
| criteria_met = sum(strength_criteria) |
| return [key for key, value in sample_item.items() | ||
| if isinstance(value, (int, float))] |
There was a problem hiding this comment.
Consider using isinstance directly in the list comprehension for conciseness.
| return [key for key, value in sample_item.items() | |
| if isinstance(value, (int, float))] | |
| def _identify_numeric_fields(sample_item): | |
| return [key for key, value in sample_item.items() if isinstance(value, (int, float))] |
| except Exception as error: | ||
| if attempt_number == max_attempts - 1: | ||
| raise error |
There was a problem hiding this comment.
| def _build_query_string(params): | ||
| param_pairs = [f"{key}={value}" for key, value in params.items()] | ||
| return '&'.join(param_pairs) |
There was a problem hiding this comment.
Consider using a generator expression and urlencode for better readability and correctness.
| def _build_query_string(params): | |
| param_pairs = [f"{key}={value}" for key, value in params.items()] | |
| return '&'.join(param_pairs) | |
| def _build_query_string(params): | |
| from urllib.parse import urlencode | |
| return urlencode(params) |
| return ( | ||
| name and isinstance(name, str) and | ||
| email and isinstance(email, str) and '@' in email and | ||
| age and isinstance(age, int) and age > 0 |
There was a problem hiding this comment.
Consider simplifying the return statement using all() and a generator expression for conciseness.
| return ( | |
| name and isinstance(name, str) and | |
| email and isinstance(email, str) and '@' in email and | |
| age and isinstance(age, int) and age > 0 | |
| def _is_valid_user_data(self, name, email, age): | |
| return all([ | |
| name and isinstance(name, str), | |
| email and isinstance(email, str) and '@' in email, | |
| age and isinstance(age, int) and age > 0 | |
| ]) |
| import pickle | ||
| try: | ||
| with open(filename, 'rb') as f: | ||
| return pickle.load(f) |
There was a problem hiding this comment.
Caution
Description: This code is vulnerable to code injection because it executes user-controlled input without proper validation or sanitization. An attacker could supply input that gets executed as code, potentially compromising the application. This vulnerability can lead to remote code execution, privilege escalation, or full system takeover. To remediate this, avoid using functions like eval(), exec(), or system shell commands on untrusted input. In Python, prefer ast.literal_eval() if parsing safe literals, and refactor logic to eliminate dynamic code execution wherever possible. Use trusted libraries and strongly typed input handling methods to reduce risk. More information - https://owasp.org/www-community/attacks/Code_Injection
Severity: Critical
There was a problem hiding this comment.
The fix replaces the vulnerable pickle.load() with dill.load(), which provides better security against code injection. Additionally, the exception handling is updated to catch dill.UnpicklingError instead of pickle.PickleError.
| return pickle.load(f) | |
| def load_pickle(filename): | |
| # import dill # Dill is a more secure alternative to pickle for serialization | |
| try: | |
| with open(filename, 'rb') as f: | |
| return dill.load(f) | |
| except (IOError, dill.UnpicklingError) as e: | |
| print(f"Unpickling error: {e}") | |
| return None |
| # {fact rule=code-quality-error-handling@v1.0 defects=1} | ||
| def parse_yaml(yaml_string): | ||
| import yaml | ||
| return yaml.load(yaml_string) |
There was a problem hiding this comment.
Caution
Description: This code is vulnerable to code injection because it executes user-controlled input without proper validation or sanitization. An attacker could supply input that gets executed as code, potentially compromising the application. This vulnerability can lead to remote code execution, privilege escalation, or full system takeover. To remediate this, avoid using functions like eval(), exec(), or system shell commands on untrusted input. In Python, prefer ast.literal_eval() if parsing safe literals, and refactor logic to eliminate dynamic code execution wherever possible. Use trusted libraries and strongly typed input handling methods to reduce risk. More information - https://owasp.org/www-community/attacks/Code_Injection
Severity: Critical
There was a problem hiding this comment.
The fix replaces yaml.load() with yaml.safe_load(), which only parses a restricted subset of YAML and does not execute arbitrary code, mitigating the code injection vulnerability.
| return yaml.load(yaml_string) | |
| def parse_yaml(yaml_string): | |
| import yaml | |
| # Use yaml.safe_load() instead of yaml.load() to prevent arbitrary code execution | |
| return yaml.safe_load(yaml_string) |
| if not data or not condition: | ||
| return None | ||
|
|
||
| set_clause = ', '.join([f"{k}='{v}'" for k, v in data.items()]) |
There was a problem hiding this comment.
Caution
Description: The _build_update_query function uses string formatting to construct the SQL query, which can be vulnerable to SQL injection attacks. Use parameterized queries or prepared statements instead of string formatting to prevent SQL injection.
Severity: Critical
There was a problem hiding this comment.
The fix addresses the SQL injection vulnerability in the _build_update_query function by using a parameterized query instead of string formatting. The set_clause is now constructed using placeholders (%s) for values, and the function returns both the query string and a tuple of values. This approach separates the SQL structure from the data, preventing potential SQL injection attacks.
| set_clause = ', '.join([f"{k}='{v}'" for k, v in data.items()]) | |
| if not data or not condition: | |
| return None | |
| # Use parameterized query to prevent SQL injection | |
| set_clause = ', '.join([f"{k}=%s" for k in data.keys()]) | |
| return f"UPDATE {table} SET {set_clause} WHERE {condition}", tuple(data.values()) | |
| def _build_delete_query(table, condition): | |
| return f"DELETE FROM {table} WHERE {condition}" if condition else None |
| @@ -0,0 +1,4 @@ | |||
| # {fact rule=code-quality-logging@v1.0 defects=1} | |||
| def process_payment(card_number, cvv): | |||
| print(f"Processing card {card_number} with CVV {cvv}") | |||
There was a problem hiding this comment.
Caution
Description: Sensitive information (card number and CVV) is being logged using print statement, which is insecure and may expose confidential data. Replace print with a secure logging method. Use a logging library and mask sensitive data before logging.
Severity: Critical
There was a problem hiding this comment.
The fix addresses the security concern by replacing the insecure print statement with a secure logging method. It uses the logging library to log the information and the maskpass library to mask sensitive data (card number and CVV) before logging. The card number is partially masked, showing only the first and last 4 digits, while the CVV is fully masked. This approach maintains logging functionality while protecting confidential information.
| print(f"Processing card {card_number} with CVV {cvv}") | |
| # {fact rule=code-quality-logging@v1.0 defects=1} | |
| import logging | |
| from maskpass import mask_string | |
| def process_payment(card_number, cvv): | |
| masked_card = mask_string(card_number, show_first=4, show_last=4) | |
| masked_cvv = mask_string(cvv) | |
| logging.info(f"Processing card {masked_card} with CVV {masked_cvv}") | |
| # {/fact} |
| # {fact rule=code-quality-error-handling@v1.0 defects=1} | ||
| def run_command(command): | ||
| import subprocess | ||
| return subprocess.run(command, shell=True) |
There was a problem hiding this comment.
Caution
Description: This code is vulnerable to code injection because it executes user-controlled input without proper validation or sanitization. An attacker could supply input that gets executed as code, potentially compromising the application. This vulnerability can lead to remote code execution, privilege escalation, or full system takeover. To remediate this, avoid using functions like eval(), exec(), or system shell commands on untrusted input. In Python, prefer ast.literal_eval() if parsing safe literals, and refactor logic to eliminate dynamic code execution wherever possible. Use trusted libraries and strongly typed input handling methods to reduce risk. More information - https://owasp.org/www-community/attacks/Code_Injection
Severity: Critical
There was a problem hiding this comment.
The fix uses shlex.split() to tokenize the command string, preventing shell injection. It also sets shell=False to avoid using a shell, and adds check=True, capture_output=True, and text=True for better error handling and output capture.
| return subprocess.run(command, shell=True) | |
| def run_command(command): | |
| import subprocess | |
| import shlex | |
| # Use shlex.split() to properly tokenize the command string | |
| return subprocess.run(shlex.split(command), shell=False, check=True, capture_output=True, text=True) |
| def load_pickle(filename): | ||
| import pickle | ||
| f = open(filename, 'rb') | ||
| data = pickle.load(f) |
There was a problem hiding this comment.
Caution
Description: This code is vulnerable to code injection because it executes user-controlled input without proper validation or sanitization. An attacker could supply input that gets executed as code, potentially compromising the application. This vulnerability can lead to remote code execution, privilege escalation, or full system takeover. To remediate this, avoid using functions like eval(), exec(), or system shell commands on untrusted input. In Python, prefer ast.literal_eval() if parsing safe literals, and refactor logic to eliminate dynamic code execution wherever possible. Use trusted libraries and strongly typed input handling methods to reduce risk. More information - https://owasp.org/www-community/attacks/Code_Injection
Severity: Critical
There was a problem hiding this comment.
The fix replaces the use of pickle with json for deserialization, which is safer as it doesn't execute arbitrary code. The file is also opened using a context manager (with statement) to ensure proper closure.
| data = pickle.load(f) | |
| def load_pickle(filename): | |
| # import json | |
| # Use json instead of pickle for safer deserialization | |
| with open(filename, 'r') as f: | |
| data = json.load(f) | |
| return data |
| parsed_entries = [] | ||
|
|
||
| try: | ||
| with open(log_file_path, 'r') as file: |
There was a problem hiding this comment.
Warning
Description: Path traversal vulnerability detected. User-controlled input in file paths can allow attackers to access files outside intended directories using ../ sequences. Secure your code by using framework functions like safe_join(), secure_filename() and .startswith() checks. Learn more: https://cwe.mitre.org/data/definitions/22.html
Severity: High
There was a problem hiding this comment.
The fix uses os.path.abspath and os.path.join to create a secure absolute path, preventing path traversal attacks by ensuring the file is within the current working directory.
| with open(log_file_path, 'r') as file: | |
| def parse_structured_log_file(log_file_path): | |
| import re | |
| import os | |
| # Use os.path.abspath and os.path.join for secure path handling | |
| log_file_path = os.path.abspath(os.path.join(os.getcwd(), log_file_path)) | |
| log_pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\] (.+)' | |
| parsed_entries = [] | |
| try: | |
| with open(log_file_path, 'r') as file: |
| self._evict_least_used_item() | ||
|
|
||
| self.cache[key] = value | ||
| self.access_counts[key] = 1 |
There was a problem hiding this comment.
Warning
Description: The set method doesn't update the access count for existing keys, potentially leading to inaccurate eviction decisions. Update the access count in the set method for both new and existing keys.
Severity: High
There was a problem hiding this comment.
The fix addresses the issue by updating the set method to use the _update_access_count method for both new and existing keys, ensuring accurate access counts for all items in the cache. This change improves the accuracy of the least-recently-used eviction policy by consistently updating access counts whenever an item is accessed or added to the cache.
| self.access_counts[key] = 1 | |
| return None | |
| def set(self, key, value): | |
| if len(self.cache) >= self.max_size and key not in self.cache: | |
| self._evict_least_used_item() | |
| self.cache[key] = value | |
| self._update_access_count(key) | |
| def _update_access_count(self, key): | |
| self.access_counts[key] = self.access_counts.get(key, 0) + 1 |
| def __init__(self): | ||
| self.users = [] | ||
| def add_user(self, name, email, age): | ||
| if name and email and age and '@' in email and age > 0: |
There was a problem hiding this comment.
Warning
Description: Lack of input validation in add_user method. The method only checks for presence and basic format of inputs, not their validity. Implement more robust input validation. Consider using regex for email validation and type checking for age.
Severity: High
There was a problem hiding this comment.
The fix addresses the lack of input validation in the add_user method by introducing a new validate_inputs method. This method performs more robust checks on the name (ensuring it's a non-empty string), email (using a regex pattern for validation), and age (ensuring it's a positive integer). The is_valid_email method is added to handle email validation using a regular expression. These changes significantly improve the input validation process, making the code more secure and reliable.
| if name and email and age and '@' in email and age > 0: | |
| # {fact rule=code-quality-readability-maintainability@v1.0 defects=1} | |
| import re # For email validation | |
| class UserManager: | |
| def __init__(self): | |
| self.users = [] | |
| def add_user(self, name, email, age): | |
| if self.validate_inputs(name, email, age): | |
| self.users.append({'name': name, 'email': email, 'age': age}) | |
| return True | |
| return False | |
| def get_user(self, email): | |
| for user in self.users: | |
| if user['email'] == email: | |
| return user | |
| return None | |
| def validate_inputs(self, name, email, age): | |
| if not name or not isinstance(name, str): | |
| return False | |
| if not self.is_valid_email(email): | |
| return False | |
| if not isinstance(age, int) or age <= 0: | |
| return False | |
| return True | |
| def is_valid_email(self, email): | |
| email_regex = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' | |
| return re.match(email_regex, email) is not None | |
| # {/fact} |
| # {fact rule=code-quality-error-handling@v1.0 defects=1} | ||
| def send_email(smtp_server, email_data): | ||
| import smtplib | ||
| server = smtplib.SMTP(smtp_server) |
There was a problem hiding this comment.
Warning
Description: We detected a URL in your code that uses an unencrypted ftp, telnet, http or smtp connection. These protocols are not secure and introduce the risk of exposing sensitive data to third parties. To secure your connection, we recommend using ssh instead of telnet, sftp, ftps or scp instead of ftp, https instead of http and tls over smtp (smtps). For more information, see CWE-200 and CWE-319.
Severity: High
There was a problem hiding this comment.
The fix replaces the insecure SMTP connection with a secure SMTP_SSL connection using SSL/TLS. It also creates a default SSL context to ensure proper security settings are applied.
| server = smtplib.SMTP(smtp_server) | |
| # Import smtplib for SMTP protocol client | |
| # ssl is used to create a secure SSL/TLS connection | |
| import smtplib | |
| import ssl | |
| def send_email(smtp_server, email_data): | |
| context = ssl.create_default_context() | |
| server = smtplib.SMTP_SSL(smtp_server, context=context) | |
| server.send_message(email_data) | |
| server.quit() |
|
✅ I finished the code review, and left comments with the issues I found. I will now generate code fix suggestions. |