Thanks for sharing this amazing project. I have a question, though: why does GPT-4o reach only 7% on the ARC dataset? Looking forward to your reply!
