PREFDISCO: Evaluating Proactive Personalization through Interactive Preference Discovery
Best AI papers explained - A podcast by Enoch H. Kang
This paper introduces a new meta-benchmark designed to evaluate large language models' (LLMs) ability to perform **interactive preference discovery** and response personalization through conversation. The framework converts existing benchmarks into interactive tasks by assigning **psychologically-grounded personas** whose hidden preferences the AI must uncover. Evaluation of numerous frontier models showed that simply attempting personalization often **degraded performance** relative to generic responses (in 42.6% of cases), indicating systematic failures in current architectures. The research established a strong positive correlation between **question-asking volume** and preference alignment, yet models tend not to ask enough questions, and personalization often imposes a **cognitive cost** that reduces task accuracy, particularly in mathematical reasoning. Ultimately, the source argues that interactive preference discovery is a **distinct capability** requiring dedicated architectural innovations rather than reliance on emergent general language understanding.
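To make the setup concrete, below is a minimal, hypothetical Python sketch of one evaluation episode: a persona holds hidden preferences, the model spends a small question budget probing them, and the final response is scored on preference alignment. All names here (`Persona`, `run_episode`, `alignment_score`, the question budget) are illustrative assumptions for this summary, not PREFDISCO's actual interface.

```python
# Hypothetical sketch of one interactive preference-discovery episode.
# Not PREFDISCO's real API; all names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Persona:
    """Persona with hidden preferences the model must discover by asking."""
    hidden_preferences: dict  # e.g. {"units": "metric", "tone": "concise"}

    def answer(self, question: str) -> str:
        # Reveal a preference only when a question touches on its topic.
        for topic, pref in self.hidden_preferences.items():
            if topic in question.lower():
                return f"For {topic}, I prefer {pref}."
        return "No strong preference there."

def alignment_score(response: str, persona: Persona) -> float:
    # Crude proxy metric: fraction of hidden preferences reflected
    # in the final response (the paper's actual scoring is richer).
    hits = sum(pref in response for pref in persona.hidden_preferences.values())
    return hits / max(len(persona.hidden_preferences), 1)

def run_episode(task: str, persona: Persona, ask_budget: int = 3) -> dict:
    """One episode: probe preferences within a question budget, then
    commit to a personalized final answer and score its alignment."""
    discovered = {}
    # Stub "model": for brevity it cheats by knowing which topics to probe;
    # a real evaluated model must choose its own clarifying questions.
    for topic in list(persona.hidden_preferences)[:ask_budget]:
        discovered[topic] = persona.answer(f"What do you prefer regarding {topic}?")
    response = f"Answer to '{task}', tailored with: " + "; ".join(
        persona.hidden_preferences[t] for t in discovered
    )
    return {"response": response, "alignment": alignment_score(response, persona)}

persona = Persona({"units": "metric", "tone": "concise explanations"})
print(run_episode("Explain how far a light-year is", persona))
```

Under this toy setup, shrinking `ask_budget` lowers the alignment score, mirroring the paper's finding that preference alignment correlates with how many questions the model asks.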
