Add c random samples#57
Conversation
Added 50 randomly selected c code samples from the security dataset.
|
⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done |
There was a problem hiding this comment.
Summary of Changes
Hello @amazon-pratik, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
I've added a collection of 50 randomly selected C code samples, sourced from a security dataset, to facilitate analysis and testing. This change introduces a new directory structure to organize these samples and includes a README file to provide essential context. The samples cover a broad spectrum of coding practices, highlighting both secure and insecure patterns across various security domains and common programming pitfalls.
Highlights
- New C Code Samples Directory: I've introduced a new directory,
c/random_samples/, which now contains 50 randomly selected C code samples. These samples are intended for analysis and testing purposes, derived from a larger C security dataset. - Documentation for New Samples: A
README.mdfile has been added to thec/directory. This document provides a clear explanation of the purpose of the newrandom_samplesdirectory, outlines its contents, and clarifies the origin of these code snippets. - Diverse Security Code Patterns: The added C files cover a wide array of security-related code patterns, including both compliant and non-compliant examples. Specifically, you'll find demonstrations of issues such as Time-of-Check to Time-of-Use (TOCTOU) vulnerabilities, insecure cryptographic key generation, use-after-free errors, null pointer dereferencing, and various forms of insecure buffer operations using functions like
strcpy,strncpy,gets, andscanf. - Comprehensive Code Quality Examples: Beyond common vulnerabilities, the samples also illustrate other critical programming concerns like integer overflows, out-of-bounds memory access, division by zero, incorrect
gotousage, information leaks, and proper handling of operator precedence and variable initialization.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
| char *cp = (char*)malloc(sizeof(char)*10); // memory is allocated | ||
| // all bytes in cp can be used here | ||
| free(cp); // memory is released | ||
| cp[9] = 0; // Noncompliant: memory is used after it was released |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix involves setting the pointers to NULL after freeing the allocated memory to prevent use-after-free vulnerabilities. Additionally, the second malloc is properly freed before setting the pointer to NULL.
| cp[9] = 0; // Noncompliant: memory is used after it was released | |
| void freed_memory_should_not_be_used_non_complaint() // any function | |
| { | |
| char *cp = (char*)malloc(sizeof(char)*10); // memory is allocated | |
| // all bytes in cp can be used here | |
| free(cp); // memory is released | |
| cp = NULL; // Set pointer to NULL after freeing | |
| int *intArray = malloc(sizeof(int) *20); // memory is allocated | |
| // elements of intArray can be written or read here | |
| free(intArray); // Free the allocated memory | |
| intArray = NULL; // Set pointer to NULL after freeing | |
| } |
| other_func((char*)var); | ||
| //{/fact} | ||
| other_func((char*)var->myname); | ||
| other_func((char*)*var); |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix removes the use-after-free vulnerability by passing NULL to other_func() instead of the freed pointer var. The subsequent lines that were attempting to access the freed memory have been removed to prevent any potential segmentation faults or undefined behavior.
| other_func((char*)*var); | |
| var = (NAME *)malloc(sizeof(struct name)); | |
| free(var); | |
| var = NULL; | |
| // This will no longer cause a segmentation fault | |
| other_func(NULL); | |
| return 0; | |
| } |
| //{/fact} | ||
| //{fact rule=use-after-free@v1.0 defect=1} | ||
| // ruleid: function-use-after-free | ||
| other_func((char*)var[0]); |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix involves setting the pointer to NULL after freeing it and removing all subsequent uses of the freed pointer to prevent use-after-free vulnerabilities.
| other_func((char*)var[0]); | |
| char buf[10]; | |
| var = (NAME *)malloc(sizeof(struct name)); | |
| free(var); | |
| var = NULL; // Set pointer to NULL after freeing | |
| // Remove all uses of 'var' after it has been freed | |
| // strcpy(buf, (char*)var); | |
| // other_func((char*)(*var)); | |
| // other_func((char*)var[0]); | |
| // var->func(var->myname); | |
| return 0; | |
| } |
| // This will segmentation fault | ||
| //{fact rule=use-after-free@v1.0 defect=1} | ||
| // ruleid: function-use-after-free | ||
| other_func((char*)var); |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix removes the function call that uses the freed memory (var) and comments out subsequent problematic lines that attempt to access the freed memory, preventing use-after-free vulnerabilities.
| other_func((char*)var); | |
| var = (NAME *)malloc(sizeof(struct name)); | |
| free(var); | |
| var = NULL; | |
| // Removed the call to other_func with freed memory | |
| // other_func((char*)var); | |
| // The following lines are also problematic and should be removed or modified | |
| // other_func((char*)var->myname); | |
| // other_func((char*)*var); | |
| return 0; | |
| } |
| // This will segmentation fault | ||
| //{fact rule=use-after-free@v1.0 defect=0} | ||
| // ok: function-use-after-free | ||
| other_func((char*)var); |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix includes a null check before accessing the newly allocated memory and frees the memory at the end of the function to prevent use-after-free vulnerabilities.
| other_func((char*)var); | |
| free(var); | |
| var = NULL; | |
| var = (NAME *)malloc(sizeof(struct name)); | |
| // This will no longer segmentation fault | |
| if (var != NULL) { | |
| other_func((char*)var); | |
| other_func((char*)var->myname); | |
| other_func((char*)*var); | |
| } | |
| free(var); | |
| return 0; | |
| } |
| //{/fact} | ||
| //{fact rule=use-after-free@v1.0 defect=1} | ||
| // ruleid: function-use-after-free | ||
| other_func((char*)(*var)); |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix involves setting the pointer to NULL after freeing it and removing all subsequent uses of the freed pointer to prevent use-after-free vulnerabilities.
| other_func((char*)(*var)); | |
| char buf[10]; | |
| var = (NAME *)malloc(sizeof(struct name)); | |
| free(var); | |
| var = NULL; // Set pointer to NULL after freeing | |
| // Remove all uses of 'var' after it has been freed | |
| // strcpy(buf, (char*)var); | |
| // other_func((char*)(*var)); | |
| // other_func((char*)var[0]); | |
| // var->func(var->myname); | |
| return 0; | |
| } |
| free(var); | ||
| //{fact rule=use-after-free@v1.0 defect=1} | ||
| // ruleid: function-use-after-free | ||
| strcpy(buf, (char*)var); |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix removes all operations that use the 'var' pointer after it has been freed, and sets the pointer to NULL immediately after freeing to prevent accidental use. This eliminates the use-after-free vulnerability.
| strcpy(buf, (char*)var); | |
| NAME *var; | |
| char buf[10]; | |
| var = (NAME *)malloc(sizeof(struct name)); | |
| if (var != NULL) { | |
| // Perform operations with var here | |
| free(var); | |
| var = NULL; // Set pointer to NULL after freeing | |
| } | |
| // Remove all operations that use 'var' after it has been freed | |
| return 0; | |
| } |
| free(var); | ||
| //{fact rule=use-after-free@v1.0 defect=1} | ||
| // ruleid: function-use-after-free | ||
| strcpy(buf, (char*)*var); |
There was a problem hiding this comment.
Caution
Description: Accessing already freed memory can lead to critical issues like data corruption or unintended code execution, depending on how and when the memory is reused. This scenario resembles repurposing memory intended for release, resulting in unpredictable behavior and security vulnerabilities. To prevent such issues, practice careful memory management, ensuring proper allocation and deallocation procedures are followed, and avoid accessing memory after it has been freed. Learn more - https://owasp.org/www-community/vulnerabilities/Using_freed_memory
Severity: Critical
There was a problem hiding this comment.
The fix removes all usage of the 'var' pointer after it has been freed, preventing use-after-free vulnerabilities. The code now includes a null check before using the allocated memory and frees it immediately after use.
| strcpy(buf, (char*)*var); | |
| NAME *var; | |
| char buf[10]; | |
| var = (NAME *)malloc(sizeof(struct name)); | |
| if (var != NULL) { | |
| // Perform necessary operations with var here | |
| free(var); | |
| } | |
| // Remove all usage of var after free | |
| return 0; | |
| } |
| // ruleid:double_goto | ||
| if (0) | ||
| goto ONE; | ||
| goto ONE; |
There was a problem hiding this comment.
Caution
Description: The main function always executes the second goto statement, bypassing the intended conditional logic. Ensure that the goto statements are properly enclosed in the if block if that's the intended behavior, or remove the redundant goto.
Severity: Critical
There was a problem hiding this comment.
The fix addresses the issue of unconditional execution of the goto statements in the main function. The if block has been properly enclosed with curly braces to ensure that both goto statements are only executed when the condition is true (which is never in this case, as the condition is always false). This change aligns the main function's behavior with the ok function's structure and prevents unintended program flow.
| goto ONE; | |
| main(int argc, char *argv[]) { | |
| //{fact rule=incorrect-behavior@v1.0 defect=1} | |
| // ruleid:double_goto | |
| if (0) { | |
| goto ONE; | |
| goto ONE; | |
| } | |
| //{/fact} | |
| printf("did not go to one | |
| "); | |
| return 0; |
| #include <openssl/rsa.h> | ||
| // {fact rule=insecure-cryptography@v1.0 defects=1} | ||
| void encrypt() { | ||
| RSA_public_decrypt(flen, from, to, key, RSA_SSLV23_PADDING); // Noncompliant |
There was a problem hiding this comment.
Caution
Description: Use of insecure padding scheme RSA_SSLV23_PADDING in RSA decryption. Replace RSA_SSLV23_PADDING with a more secure padding scheme like RSA_PKCS1_OAEP_PADDING.
Severity: Critical
There was a problem hiding this comment.
The fix addresses the use of an insecure padding scheme (RSA_SSLV23_PADDING) in RSA decryption. The code has been updated to use RSA_PKCS1_OAEP_PADDING, which is a more secure padding scheme for RSA encryption/decryption operations. This change enhances the security of the RSA decryption process by using a padding method that is resistant to various cryptographic attacks.
| RSA_public_decrypt(flen, from, to, key, RSA_SSLV23_PADDING); // Noncompliant | |
| #include <openssl/rsa.h> | |
| // {fact rule=insecure-cryptography@v1.0 defects=1} | |
| void encrypt() { | |
| RSA_public_decrypt(flen, from, to, key, RSA_PKCS1_OAEP_PADDING); // Fixed: Using more secure padding | |
| } | |
| // {/fact} |
|
✅ I finished the code review, and left comments with the issues I found. I will now generate code fix suggestions. |
There was a problem hiding this comment.
Code Review
This pull request adds 50 randomly selected C code samples. While the intent is to provide samples for analysis, many of the files contain significant issues beyond their stated purpose. Several files are not valid C code, containing C++ syntax, typos, and other compilation errors. Some 'compliant' examples have critical bugs, and 'non-compliant' examples are often broken in ways that obscure the intended vulnerability. The overall quality of the samples is very low, and they should be thoroughly revised to be useful and not misleading.
| char dest[256]; | ||
| char *dest = malloc(strlen(result) + 1); |
There was a problem hiding this comment.
| #include <botan/cipher_mode.h> | ||
| // {fact rule=insecure-cryptography@v1.0 defects=1} | ||
| void encrypt() { | ||
| Botan::Cipher_Mode::create("AES-256/ECB", Botan::ENCRYPTION); // Noncompliant | ||
| } | ||
| // {/fact} | ||
|
|
||
| #include <botan/rng.h> | ||
| #include <botan/auto_rng.h> | ||
| #include <botan/rsa.h> | ||
| #include <botan/pubkey.h> | ||
| // {fact rule=insecure-cryptography@v1.0 defects=1} | ||
| void encrypt() { | ||
| std::unique_ptr<Botan::RandomNumberGenerator> rng(new Botan::AutoSeeded_RNG); | ||
| Botan::RSA_PrivateKey rsaKey(*rng.get(), 2048); | ||
|
|
||
| Botan::PK_Encryptor_EME(rsaKey, *rng.get(), "PKCS1v15"); // Noncompliant | ||
| } | ||
| // {/fact} | ||
| // {fact rule=insecure-cryptography@v1.0 defects=1} | ||
| #include <cryptopp/aes.h> | ||
| #include <cryptopp/modes.h> | ||
|
|
||
| voic encrypt() { | ||
| CryptoPP::CBC_Mode<CryptoPP::AES>::Encryption(); // Noncompliant | ||
| } | ||
| // {/fact} | ||
| #include <cryptopp/rsa.h> | ||
| // {fact rule=insecure-cryptography@v1.0 defects=1} | ||
| void encrypt() { | ||
| CryptoPP::RSAES<CryptoPP::PKCS1v15>::Encryptor(); // Noncompliant | ||
| } | ||
| // {/fact} | ||
|
|
||
| #include <openssl/evp.h> | ||
| // {fact rule=insecure-cryptography@v1.0 defects=1} | ||
| void encrypt() { | ||
| EVP_aes_128_cbc(); // Noncompliant | ||
| } | ||
| // {/fact} | ||
| #include <openssl/rsa.h> | ||
| // {fact rule=insecure-cryptography@v1.0 defects=1} | ||
| void encrypt() { | ||
| RSA_public_decrypt(flen, from, to, key, RSA_SSLV23_PADDING); // Noncompliant | ||
| } | ||
| // {/fact} No newline at end of file |
| FILE *fopen_if_not_exists_good(const char *file) { | ||
| char F_OK = 10; | ||
| if (access(file, F_OK) == -1 && errno == ENOENT) { | ||
| FILE *f = fopen(file, "w"); // Noncompliant | ||
|
|
||
| return f; | ||
| } | ||
|
|
||
| return NULL; | ||
| } |
There was a problem hiding this comment.
This function demonstrates a classic Time-of-Check to Time-of-Use (TOCTOU) vulnerability. There is a race condition between the access() check and the fopen() call. An attacker could potentially create a symbolic link or a different file at the specified path between these two operations, leading to unintended file access or creation.
| FILE *fp = fopen(filename, "r"); | ||
| if(fp == NULL) { | ||
| printf("Error opening file\n"); | ||
| return 1; |
| const char* root_dir = "/jail/"; | ||
| chroot(root_dir); // Sensitive: no chdir before or after chroot, and missing check of return value | ||
| // {/fact} | ||
|
|
||
| // {fact rule=insecure-use-of-chroot@v1.0 defects=1} | ||
| const char* root_dir1 = "/jail/"; | ||
| chroot(root_dir1); // Sensitive: missing check of the return value | ||
| const char* any_dir = "/any/"; | ||
| chdir(any_dir); // Sensitive: missing check of the return value | ||
| // {/fact} |
There was a problem hiding this comment.
| // This will segmentation fault | ||
| //{fact rule=use-after-free@v1.0 defect=1} | ||
| // ruleid: function-use-after-free | ||
| other_func((char*)var); |
There was a problem hiding this comment.
The ruleid suggests this is a use-after-free, but it's actually a null pointer dereference. The pointer var is explicitly set to NULL after being freed. The subsequent call to other_func((char*)var) passes a null pointer. While this might crash, it's not a use-after-free vulnerability, which involves using a dangling pointer that hasn't been nullified.
|
|
||
| int bitwiseOperatoronSignedOperandCompliant(unsigned int i) { | ||
| // Compliant: Bitwise operator applied on unsigned operand. | ||
| return 1 << i; |
There was a problem hiding this comment.
| int IncorrectUseofGotoNonCompliant() | ||
| { | ||
| int n; | ||
| // Noncompliant: `goto` function is recommended to use inside conditional statements to prevent unconditional jumps | ||
|
|
||
| return 0; | ||
| ONE: | ||
| printf("went to one\n"); | ||
| return 1; | ||
| } |
|
|
||
| // {fact rule=insecure-use-of-chroot@v1.0 defects=0} | ||
| #include <stdio.h> | ||
| void insecureUseofChrootCompliant(){ |
There was a problem hiding this comment.
| struct OptionalInt foo_division_by_zero_good1(int a, int b) { | ||
| if (b == 0) { | ||
| a = 1; | ||
| } | ||
| return safe_division_by_zero_good1(a, b); | ||
| } |
Added 50 randomly selected c code samples from the security dataset.