Hardcode data for 5 test cases and compare the ground truth with the LLM output
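
A minimal sketch of what this could look like, assuming exact-match comparison after light normalization. The five prompts, the `CASES` list, and the `normalize`, `evaluate`, and `stub_llm` names are all hypothetical placeholders, not taken from the actual code; `stub_llm` stands in for the real model call.

```python
from typing import Callable

# Five hardcoded cases as (prompt, ground truth) pairs.
# These are illustrative placeholders, not the real case data.
CASES = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
    ("What is the opposite of hot?", "cold"),
    ("How many days are in a week?", "7"),
    ("What color is a clear daytime sky?", "blue"),
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.strip().lower().split())


def evaluate(llm: Callable[[str], str], cases=CASES) -> list[dict]:
    """Run each prompt through `llm` and compare its output to the ground truth."""
    results = []
    for prompt, expected in cases:
        actual = llm(prompt)
        results.append({
            "prompt": prompt,
            "expected": expected,
            "actual": actual,
            "match": normalize(actual) == normalize(expected),
        })
    return results


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for the real LLM call; returns canned answers."""
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": " paris ",
        "What is the opposite of hot?": "Cold",
        "How many days are in a week?": "seven",  # deliberately differs from "7"
        "What color is a clear daytime sky?": "blue",
    }
    return canned.get(prompt, "")


if __name__ == "__main__":
    results = evaluate(stub_llm)
    for r in results:
        status = "PASS" if r["match"] else "FAIL"
        print(f"{status}: {r['prompt']!r} expected={r['expected']!r} actual={r['actual']!r}")
    print(f"{sum(r['match'] for r in results)}/{len(results)} cases matched")
```

Exact matching is the simplest comparison and will flag answers like "seven" against "7" as mismatches; a looser check (substring, numeric parsing, or a scoring function) may suit free-form LLM output better.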