LLM-81: Add Excel export for summary comparisons and CLI command #111
Draft
benglewis wants to merge 2 commits into
Spec Reviewer Report 📬 2 / 3 requirements met for ticket:
1 unmet requirement
2 met requirements
User description
Motivation
Description
- Added an `export_excel` utility (`llm_behavior_eval/export_excel.py`) that merges two `summary_brief.csv` files, normalizes metrics, writes an `.xlsx` workbook with one sheet per dataset, and inserts comparison charts; it uses `xlsxwriter` and validates input overlap and numeric values.
- Added metric column constants in `llm_behavior_eval/evaluation_utils/metrics.py` and updated `BaseEvaluator` to use these constants instead of hardcoded column header strings.
- Added the CLI command `export-excel` in `llm_behavior_eval/evaluate.py` via the `export_excel_command` wrapper and registered it on the `typer` app.
- Documented the feature in `README.md`, including a usage example and an install note to enable the optional `excel` dependency.
- Added the optional dependency `excel` with `xlsxwriter>=3.2.0` and included `excel` in the `dev` dependency group in `pyproject.toml`.
- Added a `tests/test_export_excel.py` suite validating sheet sanitization, file output, and overlap checking, and a CLI test in `tests/test_evaluate_cli.py` that checks the new command forwards options correctly.
Testing
- Ran `pytest`, including the new `tests/test_export_excel.py` and modified `tests/test_evaluate_cli.py` tests, and they passed.
- `test_export_excel_command_passes_new_option_names` passed and correctly exercises the `export_excel_command` argument mapping.
Codex Task
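To make the validation steps in the description concrete, here is a minimal stdlib-only sketch of sheet-name sanitization plus the overlap and numeric checks. It is not the code from `llm_behavior_eval/export_excel.py`: the function names, the `Dataset` and `Accuracy` column headers, and the sample rows are all assumptions made for illustration. The only external facts relied on are Excel's own limits (sheet names are capped at 31 characters and may not contain `[ ] : * ? / \`).

```python
import csv
import io
import re

# Excel forbids these characters in sheet names and caps names at 31 characters.
_FORBIDDEN = re.compile(r"[\[\]:*?/\\]")
MAX_SHEET_NAME = 31

# Hypothetical column headers; the real summary_brief.csv layout may differ.
DATASET_COL = "Dataset"
METRIC_COL = "Accuracy"


def sanitize_sheet_name(name: str, used: set[str]) -> str:
    """Return a unique, Excel-safe sheet name and record it in `used`."""
    base = _FORBIDDEN.sub("_", name)[:MAX_SHEET_NAME].strip("'") or "Sheet"
    candidate, n = base, 1
    # Excel compares sheet names case-insensitively, so track lowercase forms.
    while candidate.lower() in used:
        n += 1
        suffix = f"_{n}"
        candidate = base[: MAX_SHEET_NAME - len(suffix)] + suffix
    used.add(candidate.lower())
    return candidate


def read_summary(text: str) -> dict[str, float]:
    """Map dataset name to metric value, rejecting non-numeric cells."""
    rows: dict[str, float] = {}
    for row in csv.DictReader(io.StringIO(text)):
        try:
            rows[row[DATASET_COL]] = float(row[METRIC_COL])
        except ValueError as exc:
            raise ValueError(
                f"non-numeric {METRIC_COL} for {row[DATASET_COL]!r}"
            ) from exc
    return rows


def merge_summaries(first: str, second: str) -> dict[str, tuple[float, float]]:
    """Pair up metrics for datasets present in both summaries."""
    a, b = read_summary(first), read_summary(second)
    shared = [name for name in a if name in b]
    if not shared:
        raise ValueError("the two summaries share no datasets")
    return {name: (a[name], b[name]) for name in shared}


# Made-up sample data: only "dataset/a" appears in both summaries.
baseline = "Dataset,Accuracy\ndataset/a,0.81\ndataset:b,0.75\n"
candidate = "Dataset,Accuracy\ndataset/a,0.84\ndataset-c,0.66\n"

merged = merge_summaries(baseline, candidate)
used_names: set[str] = set()
sheets = {sanitize_sheet_name(name, used_names): pair for name, pair in merged.items()}
```

Each key of `sheets` would then become one worksheet, with the paired values feeding that sheet's comparison chart. Failing fast when the two summaries share no datasets avoids writing an empty workbook.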
Generated description
Below is a concise technical summary of the changes proposed in this PR:
Add Excel export as a CLI workflow via the new `export_excel` helper and register the `export-excel` command so teams can generate comparison workbooks directly from the `summary_brief.csv` outputs, complete with optional labels and dataset filtering, along with docs and dependency updates to install `xlsxwriter`. Introduce metric column constants in `evaluation_utils.metrics` and consume them in `BaseEvaluator` so CSV generation stays consistent while also powering the Excel exporter.
- Add `export_excel` and `export-excel` commands with customizable labels and sheet selection. Modified files (7)
- Use `evaluation_utils.metrics` constants inside `BaseEvaluator` so generated CSVs remain aligned with the export tooling. Modified files (2)
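The point of sharing column constants between the CSV writer and the exporter can be sketched as follows. The constant names and header strings here are hypothetical stand-ins, not the actual contents of `evaluation_utils.metrics`, and the two functions only mimic the producer (`BaseEvaluator`) and consumer (the exporter) roles.

```python
import csv
import io

# Hypothetical constant names and header strings, standing in for the ones
# defined in evaluation_utils.metrics.
DATASET_COLUMN = "Dataset"
ACCURACY_COLUMN = "Accuracy (%)"
ERROR_COLUMN = "Error (%)"
SUMMARY_COLUMNS = (DATASET_COLUMN, ACCURACY_COLUMN, ERROR_COLUMN)


def write_summary(rows: list[dict[str, object]]) -> str:
    """Producer side: emit a CSV whose header comes from the shared constants."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SUMMARY_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_accuracy(csv_text: str) -> dict[str, float]:
    """Consumer side: look columns up by the same constants, not by literals."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[DATASET_COLUMN]: float(row[ACCURACY_COLUMN]) for row in reader}


# Made-up sample row for illustration.
summary_text = write_summary(
    [{DATASET_COLUMN: "demo", ACCURACY_COLUMN: 91.5, ERROR_COLUMN: 8.5}]
)
accuracy_by_dataset = read_accuracy(summary_text)
```

Because both sides import the same names, renaming a header is a one-line change, and a misspelled constant name is an error at import or first use rather than a silently missing column at export time.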