This script is designed to mask sensitive information in files or directories by using regex patterns or specific keywords. It processes either a single file or all files within a directory, replacing matched patterns or keywords with a masked version. The tool supports a variety of file types and generates statistics on the masking process.
- Mask sensitive data in files using regex patterns and/or keywords.
- Process a single file or all files within a directory.
- Generate statistics on the number of matches for each regex pattern and keyword.
- Support multiple file extensions, including .txt, .py, .md, .json, and .csv.
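The masking and statistics features above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `mask_text`, the `****` mask string, and the literal, case-sensitive keyword matching are all assumptions.

```python
import re
from collections import defaultdict

MASK = "****"  # assumed mask string; the real script may use a different one


def mask_text(text, regex_patterns=(), keywords=()):
    """Return (masked_text, stats); stats maps each pattern or keyword to its match count."""
    stats = defaultdict(int)

    # Replace every regex match and record how many replacements were made.
    for pattern in regex_patterns:
        text, count = re.subn(pattern, MASK, text)
        stats[pattern] += count

    # Keywords are treated as literal strings, so escape them before substituting.
    for keyword in keywords:
        text, count = re.subn(re.escape(keyword), MASK, text)
        stats[keyword] += count

    return text, dict(stats)


masked, stats = mask_text(
    "user=alice email=alice@example.com",
    regex_patterns=[r"[\w.]+@[\w.]+"],
    keywords=["alice"],
)
print(masked)
print(stats)
```

Regex patterns are applied before keywords in this sketch, so a keyword that appears inside a regex match is masked once, as part of that match, and counted only under the regex.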
- Python 3
- os, re, shutil (standard Python libraries for file and directory operations)
- collections.defaultdict (for easy statistics collection)
- Clone the repository:
  - Clone this repository to your local machine.
- Prepare configuration files:
  - Ensure you have the necessary configuration files in the config directory:
    - regex_patterns.txt: contains the regex patterns to search for.
    - keywords.txt: contains the keywords to search for.
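As a rough sketch of how these configuration files might be read: the source does not specify the file format, so this example assumes one pattern or keyword per line with blank lines ignored. The helper name `load_lines` is hypothetical, and the demo writes a temporary keywords.txt rather than touching a real config directory.

```python
import os
import tempfile


def load_lines(path):
    """Read a config file and return its non-empty lines with whitespace stripped."""
    with open(path, encoding="utf-8") as handle:
        stripped = (line.strip() for line in handle)
        return [line for line in stripped if line]


# Demo: create a throwaway keywords.txt and load it.
with tempfile.TemporaryDirectory() as config_dir:
    keywords_path = os.path.join(config_dir, "keywords.txt")
    with open(keywords_path, "w", encoding="utf-8") as handle:
        handle.write("password\n\nsecret_token\n")
    keywords = load_lines(keywords_path)

print(keywords)
```

The same helper would work for regex_patterns.txt under the same one-entry-per-line assumption.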
- Running the script:
  - To start the masking process, run the script:
    python main.py
Example usages:
- Mask a single file:
  - Choose dosya (Turkish for "file") when prompted.
  - Enter the file path.
  - Select whether to mask using regex, keywords, or both.
- Mask all files in a directory:
  - Choose dizin (Turkish for "directory") when prompted.
  - Enter the directory path.
  - Select whether to mask using regex, keywords, or both.
Supported file types:
- .txt
- .py
- .md
- .json
- .csv
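Directory mode needs to pick out files with these extensions. One possible way to do that is sketched below. The function name `find_supported_files` is hypothetical, and both the recursion into subdirectories and the case-insensitive extension check are assumptions, since the source does not say how the script behaves in either case.

```python
import os
import tempfile

SUPPORTED_EXTENSIONS = {".txt", ".py", ".md", ".json", ".csv"}


def find_supported_files(directory):
    """Return sorted paths of files under directory that have a supported extension."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            extension = os.path.splitext(name)[1].lower()
            if extension in SUPPORTED_EXTENSIONS:
                matches.append(os.path.join(root, name))
    return sorted(matches)


# Demo: two supported files and one unsupported file in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("notes.txt", "data.csv", "image.png"):
        open(os.path.join(demo_dir, name), "w").close()
    found = [os.path.basename(path) for path in find_supported_files(demo_dir)]

print(found)
```

Keeping the extensions in a single set means supporting a new file type only requires adding one entry.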
Planned improvements:
- Add support for more file extensions.
- Improve error handling and user feedback.
- Add more complex masking rules based on the file type.
If you have any questions or suggestions, feel free to contact me at kilicbartu@gmail.com.