3 Comments
User's avatar
Evan H.'s avatar

I'm wondering about a hybrid approach. What happens if you take the data you downloaded, paste it into the chat interface, and instruct 03 (or gemini) to run through row by row?

Expand full comment
John Horton's avatar

It bet that would do better!

Expand full comment
User's avatar
Comment deleted
May 6
Comment deleted
Expand full comment
John Horton's avatar

Thanks! Well, it is different in the sense that once gets the right answer and the other doesn't :). But I think to your point, I would imagine that for lots of questions we give to o3, we might want it to generate code to answer the question and then run that code---partially to save computation, but more for us to understand how it is getting an answer and trace out mistakes. For things like fetching a table (which the code does) and setting up a 1-per-row evaluation---it's much easier to describe and execute in "normal" code than have o3 do some kind of inscrutable neural network-based process that approximates this (which in this case, was, after all, wrong).

Expand full comment