5 Comments
Evan H.

I'm wondering about a hybrid approach. What happens if you take the data you downloaded, paste it into the chat interface, and instruct o3 (or Gemini) to go through it row by row?

John Horton

I bet that would do better!

Josh Zweig

Hey, how is that different from what your function does when searching Wikipedia? And how is this different from giving an LLM the tool to do it without using your package?

Super nice job btw :)

John Horton

Thanks! Well, it is different in the sense that one gets the right answer and the other doesn't :). But to your point, I would imagine that for lots of questions we give to o3, we might want it to generate code to answer the question and then run that code, partly to save computation, but more so that we can understand how it is getting an answer and trace out mistakes. For things like fetching a table (which the code does) and setting up a 1-per-row evaluation, it's much easier to describe and execute in "normal" code than to have o3 do some kind of inscrutable neural network-based process that approximates this (which, in this case, was after all wrong).
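
To make the "1-per-row evaluation" idea concrete, here is a minimal sketch in plain Python. It is not the package's actual code: `evaluate_rows`, `born_before_1900`, and the three-row sample table are all hypothetical, and the judge function is a simple stand-in for whatever per-row model call one would really make. The point is only that the loop, the inputs, and each verdict are explicit and inspectable.

```python
import csv
import io
from typing import Callable, Dict, List


def evaluate_rows(
    table_text: str, judge: Callable[[Dict[str, str]], bool]
) -> List[Dict[str, object]]:
    """Parse CSV text and apply `judge` exactly once per row.

    Each result keeps the row index and the original input next to the
    verdict, so a wrong answer can be traced back to the row that caused it.
    """
    reader = csv.DictReader(io.StringIO(table_text))
    results = []
    for index, row in enumerate(reader):
        results.append({"row": index, "input": dict(row), "verdict": judge(row)})
    return results


def born_before_1900(row: Dict[str, str]) -> bool:
    """Hypothetical judge; in practice this could wrap one model call per row."""
    return int(row["birth_year"]) < 1900


# Toy table for illustration only (assumed, not the data from the post).
SAMPLE = "name,birth_year\nAda,1815\nGrace,1906\nAlan,1912\n"

results = evaluate_rows(SAMPLE, born_before_1900)
count = sum(1 for r in results if r["verdict"])
print(count)
```

Because every row produces its own record, a surprising total can be checked by reading `results` directly instead of guessing what happened inside a single opaque answer.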

Josh Zweig

Thank you for the detailed answer John! That makes sense :)
