Arvind Narayanan

@random_walker · Twitter

We've released annotated slides for a talk titled "Evaluating LLMs is a minefield". Current ways of evaluating chatbots/LLMs don't work well, especially for questions about societal impact. There are no quick fixes. More research is needed. w/ @sayashk 🧵 https://www.cs.princeton.edu/~arvindn/talks/evaluating_llms_minefield/