MinTopGraph is a Python implementation for computing and visualizing minTopGraphs in multi-dimensional data spaces. It analyzes the structure of data points in both 2D and high-dimensional spaces, with a focus on skyline computation and group max-rank analysis.
- Support for both 2D and high-dimensional data analysis
- Skyline computation and visualization
- Group max-rank calculation
- Minimum top-k graph estimation
- Query space analysis
- Python 3.x
- Required Python packages:
  - pandas
  - numpy
  - matplotlib (for visualization)
- Clone this repository:
    cd minTopGraph
- Install the required dependencies:
    pip install pandas numpy matplotlib

The main script can be run using the following command:

    python main.py path/to/datafile increment_of_k compute_group_max_rank

Parameters:
- datafile: A CSV file containing normalized data points with associated IDs
- increment_of_k: The increment value for estimating q(s) in the MinTopGraph algorithm
- compute_group_max_rank: Boolean value indicating whether to compute GroupMaxRank (default: True)
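
The three positional parameters above can be handled along the following lines. This is only a sketch, not the code in main.py: the helper names `parse_bool` and `parse_args` are hypothetical, the increment is assumed to be an integer, and the accepted spellings of the boolean are an assumption. It also illustrates why the boolean needs explicit parsing, since `bool("False")` evaluates to `True` in Python.

```python
def parse_bool(text):
    """Interpret a command-line string as a boolean (hypothetical helper)."""
    value = text.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise ValueError(f"expected a boolean value, got {text!r}")


def parse_args(argv):
    """Return (datafile, increment_of_k, compute_group_max_rank).

    argv is the argument list without the script name, e.g. sys.argv[1:].
    """
    if len(argv) not in (2, 3):
        raise SystemExit(
            "usage: python main.py path/to/datafile increment_of_k "
            "[compute_group_max_rank]"
        )
    datafile = argv[0]
    increment_of_k = int(argv[1])  # assumption: the increment is an integer
    # The third parameter is documented as defaulting to True.
    compute_group_max_rank = parse_bool(argv[2]) if len(argv) == 3 else True
    return datafile, increment_of_k, compute_group_max_rank
```

Under these assumptions, `parse_args(["data.csv", "5"])` yields `("data.csv", 5, True)`, and passing `"False"` as the third argument turns GroupMaxRank computation off.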
For the program to work, the following files must be present in the same directory as your data file:
- skyline.csv: Contains the skyline of the dataset
- maxrank.csv: Contains the maxrank of each skyline point
- cellsout.csv: Contains the mincells for 2D datasets
- cells.csv: Contains the mincells for high-dimensional datasets
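
Before running the program, it can be useful to check that these companion files sit next to the data file. The sketch below is not part of the project: the function name `missing_companion_files` and its `dimensions` argument are hypothetical, and it assumes that only the cells file matching the dataset's dimensionality is needed.

```python
from pathlib import Path


def missing_companion_files(datafile, dimensions):
    """List required companion files absent from the data file's directory."""
    folder = Path(datafile).resolve().parent
    # Assumption: cellsout.csv is needed for 2D data, cells.csv otherwise.
    cells_file = "cellsout.csv" if dimensions == 2 else "cells.csv"
    required = ["skyline.csv", "maxrank.csv", cells_file]
    return [name for name in required if not (folder / name).is_file()]
```

An empty list means every expected file was found; otherwise the returned names indicate what still has to be generated or copied into the directory.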
The project includes various example datasets in the examples/ directory:
- 2D datasets (uniform, normal, exponential, correlated)
- 3D datasets
- Real-world datasets (laptops, housing, Spotify, Airbnb)
The program generates one or two graphs depending on the dataset dimensions:
- MinTopGraph is always displayed
- Query space and skyline are displayed for 2D datasets
- main.py: Main entry point of the program
- gGraph.py: Core graph estimation functionality
- maxrank.py: Maxrank computation implementation
- qtree.py: Query tree implementation
- geom.py: Geometric utilities
- queryutils.py: Query-related utility functions
- GroupMaxRank.py: Group max-rank computation