This repository is the final cleaned delivery package for Team 11's Carlson Analytics Lab live case project with 4mativ. It contains the rule-based GPS validation pipeline, a static executive dashboard, a supporting realtime replay demo, and handoff documentation.
The full methodology and business interpretation live in KTD.html and ktd.qmd. This README is the practical orientation guide for the public GitHub delivery package.
4mativ needed a way to evaluate whether GPS evidence can support route-execution decisions. The solution is intentionally conservative: GPS is not treated as ground truth. The pipeline first checks whether the GPS signal is reliable enough, then evaluates route adherence, and only then assigns an attribution category.
The workflow answers three connected questions:
- Is the GPS signal reliable enough to use?
- Did the vehicle appear to complete the planned route?
- If not, is the issue more likely driver/operations, vendor/GPS, system/geofence limitation, or insufficient evidence?
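The ordered gating described above (signal quality first, route adherence second, attribution last) can be sketched as a minimal decision function. This is an illustrative simplification, not the actual logic in `src/realtime_q1q2_engine.py` or `src/sq3_attribution.py`; the function name, inputs, category labels, and the 0.5 confidence gate are all hypothetical.

```python
# Illustrative sketch only: names, categories, and thresholds are
# hypothetical, not the production rules in src/.

def classify_trip(gps_reliable: bool, route_completed: bool,
                  evidence_strength: float) -> str:
    """Apply the three gates in order: signal quality, route
    adherence, then attribution."""
    if not gps_reliable:
        # Unreliable signal: never attribute to operations on bad GPS.
        return "vendor_gps_issue"
    if route_completed:
        return "completed"
    if evidence_strength < 0.5:  # hypothetical confidence gate
        return "insufficient_evidence"
    # Reliable GPS, route not completed, enough evidence to attribute.
    return "driver_operations_issue"
```

The key design point is the ordering: an unreliable signal short-circuits the later gates, which is what keeps the pipeline conservative.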
- Open `KTD.html` for the full project narrative, methodology, findings, assumptions, and recommendations.
- Open `dashboard/executive_summary/4mativ_dashboard.html` for the final executive dashboard.
- Read `docs/pipeline_flow.md` for a short flow diagram of how outputs are produced.
- Use `docs/data_dictionary.md` to understand the output schema used in the private/local handoff.
- Review `docs/known_caveats.md` before using the results operationally.
| Deliverable | Location | Purpose |
|---|---|---|
| Knowledge Transfer Documentation | `KTD.html`, `ktd.qmd` | Full project methodology, findings, limitations, and recommendations |
| Validation engine | `src/realtime_q1q2_engine.py` | Point-by-point GPS quality and route adherence logic |
| Replay runner | `src/replay_runner_mp.py` | Runs the validation engine across historical trips |
| SQ3 attribution builder | `src/sq3_attribution.py` | Converts realtime outputs into trip-level Q1 and attribution results |
| Dashboard aggregation | `src/aggregate.py` | Builds aggregate JSON for dashboard charts and KPIs |
| Executive dashboard | `dashboard/executive_summary/4mativ_dashboard.html` | Static client-facing dashboard |
| Realtime replay demo | `dashboard/realtime_replay_demo/index.html` | Technical demo for replaying trip-level event evidence |
| Output schema docs | `docs/data_dictionary.md` | Field-level interpretation guide |
The Python pipeline uses standard scientific Python packages:
- Python 3.10 or newer
- `pandas`
- `numpy`
There is no separate requirements file in this final package. If needed, install the dependencies in a local environment:
```bash
python -m pip install pandas numpy
```

The dashboards are static HTML/JS files. No build step is required.
Client-provided source data and generated operational outputs are intentionally excluded from this GitHub repository. The code and documentation are included for methodology review, but the private/local handoff package contains the data artifacts needed to reproduce the final dashboard and analytical tables.
The pipeline can be rerun only in an authorized local environment where the client-provided data exports are available.
Open the dashboard files directly in a browser:
- Executive dashboard:
dashboard/executive_summary/4mativ_dashboard.html - Realtime replay demo:
dashboard/realtime_replay_demo/index.html
If a browser blocks local data loading, serve the folder with a simple local server:
```bash
python -m http.server 8000
```

Then open:

- `http://localhost:8000/dashboard/executive_summary/4mativ_dashboard.html`
- `http://localhost:8000/dashboard/realtime_replay_demo/index.html`
All major thresholds and business rules are centralized in `src/config.py`, including:
- geofence radii
- lateness thresholds
- trip duration filters
- GPS gap, jump, and freeze detection rules
- missed-stop and pass-through rules
- attribution and confidence gates
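To show the shape of such a centralized rules module, here is a hypothetical sketch. The constant names and every value below are invented for illustration; the real definitions and values live in `src/config.py` and will differ.

```python
# Hypothetical sketch of a centralized thresholds module.
# All names and values are illustrative, not the real src/config.py.

GEOFENCE_RADIUS_M = 150           # stop-arrival geofence radius, meters
LATE_THRESHOLD_MIN = 10           # minutes past schedule counted as late
MIN_TRIP_DURATION_MIN = 5         # trips shorter than this are filtered out
MAX_GPS_GAP_S = 120               # polling gap above this flags unreliable data
MAX_JUMP_SPEED_KMH = 160          # implied speed above this flags a GPS jump
FREEZE_WINDOW_S = 300             # identical fixes for this long flag a freeze
MIN_ATTRIBUTION_CONFIDENCE = 0.6  # gate before assigning responsibility
```

Keeping every rule in one module like this is what makes the recommended recalibration against client-validated ground truth a one-file change.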
Before operational use, these thresholds should be recalibrated with client-validated ground truth or a broader production sample.
- GPS is not ground truth. The pipeline uses GPS reliability gates before assigning route-execution responsibility.
- `primary_attribution` is diagnostic, not punitive. It separates driver/operations issues from vendor/GPS issues, system/geofence limitations, completed trips with variance, and insufficient evidence.
- The SQ3 output keeps two completion fields:
  - `operational_completion_status`: upstream SQ1/SQ2 route completion decision.
  - `attribution_completion_status`: SQ3 completion decision after reliability and attribution gates.
- A pending status should not automatically be treated as failure.
- Vendor scoreboard metrics are intended for monitoring and investigation, not final vendor performance judgment without calibration.
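One practical use of the two completion fields is flagging trips where the upstream SQ1/SQ2 decision and the gated SQ3 decision disagree. The sketch below assumes the field names from `docs/data_dictionary.md`; the rows and status values are invented toy data.

```python
import pandas as pd

# Toy rows standing in for SQ3 output; only the two field names
# come from the data dictionary, the values are invented.
sq3 = pd.DataFrame({
    "trip_id": ["t1", "t2", "t3"],
    "operational_completion_status": ["completed", "completed", "incomplete"],
    "attribution_completion_status": ["completed", "pending", "incomplete"],
})

# Trips where the two decisions disagree deserve manual review,
# not automatic treatment as failures.
disagree = sq3[
    sq3["operational_completion_status"]
    != sq3["attribution_completion_status"]
]
print(disagree["trip_id"].tolist())  # ['t2']
```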
Client-provided source data and generated operational outputs may contain sensitive operational information and should not be uploaded to a public GitHub repository.
The included .gitignore excludes generated output data, replay-demo data, and archive files by default.
Rerun the pipeline when:
- source trip, stop, vendor, school, or GPS position data changes;
- geofence, lateness, polling, or reliability thresholds change in `src/config.py`;
- attribution categories or dashboard metric definitions change;
- a new reporting period needs to be added.
After rerunning, check:
- the `primary_attribution` distribution printed by `src/sq3_attribution.py`;
- the dashboard date range and trip count printed by `src/aggregate.py`;
- whether the executive dashboard data file has been refreshed in the private/local handoff package.
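The post-rerun checks can be partially automated with a small sanity-check over the aggregate JSON. This is a hypothetical sketch: the key names (`trip_count`, `date_range`, `primary_attribution_distribution`) and the sample payload are assumptions, since the real schema produced by `src/aggregate.py` ships only in the private/local handoff.

```python
import json

# Hypothetical post-rerun sanity check; key names are assumed,
# not taken from the real aggregate JSON schema.
def check_aggregate(agg: dict, expected_min_trips: int) -> list[str]:
    """Return a list of human-readable problems, empty if none."""
    problems = []
    if agg.get("trip_count", 0) < expected_min_trips:
        problems.append("trip_count lower than expected")
    if not agg.get("date_range"):
        problems.append("date_range missing")
    dist = agg.get("primary_attribution_distribution", {})
    if abs(sum(dist.values()) - 1.0) > 0.01:
        problems.append("attribution shares do not sum to 1")
    return problems

sample = json.loads(
    '{"trip_count": 1200, "date_range": ["2024-01-01", "2024-03-31"],'
    ' "primary_attribution_distribution": {"driver_ops": 0.2,'
    ' "vendor_gps": 0.3, "completed": 0.45, "insufficient": 0.05}}'
)
print(check_aggregate(sample, expected_min_trips=1000))  # []
```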
- `docs/pipeline_flow.md` - short step-by-step pipeline map
- `docs/data_dictionary.md` - output field definitions
- `docs/known_caveats.md` - limitations and operational caveats
- `docs/file_inventory.md` - file-by-file package inventory