Posts by Erkan Güneş
🚨 New preprint on text classification with LLMs.
Our best-performing use case, which combined GPT-4 and Gemini 1.5 Pro, achieved a weighted F1 score of 0.82 on the 83% of the data where the two models agreed.
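For readers who want to see what "weighted F1 on the agreement subset" means in practice, here is a minimal sketch. It is not the code from the preprint: the function names (`weighted_f1`, `agreement_subset`), the toy labels, and the two prediction lists are hypothetical, and weighted F1 is taken in its usual sense of per-class F1 averaged by class support.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total  # weight each class by its share of true labels
    return score


def agreement_subset(y_true, preds_a, preds_b):
    """Keep only the items on which both models predict the same label."""
    kept = [(t, a) for t, a, b in zip(y_true, preds_a, preds_b) if a == b]
    coverage = len(kept) / len(y_true)  # share of the data the models agree on
    truths = [t for t, _ in kept]
    labels = [a for _, a in kept]
    return truths, labels, coverage


# Hypothetical toy data, not taken from the preprint.
y_true = ["health", "defense", "health", "energy", "defense", "energy"]
model_a = ["health", "defense", "energy", "energy", "defense", "health"]
model_b = ["health", "defense", "health", "energy", "defense", "health"]

truths, labels, coverage = agreement_subset(y_true, model_a, model_b)
print(f"coverage: {coverage:.2f}")
print(f"weighted F1 on agreed items: {weighted_f1(truths, labels):.2f}")
```

In a setup like this, the items where the models disagree would be the ones routed to human coders, which is why coverage and the F1 score are reported together.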
Our results suggest that relying entirely on instruction-tuned LLMs is insufficient, that accuracy increases with the human effort invested, and that the most human-intensive use case achieves surprisingly high accuracy.
We propose three use-case scenarios and estimate overall weighted F1 scores ranging from 0.44 to 0.82, depending on the scenario and the LLMs employed. The three scenarios involve minimal, moderate, and major human intervention, respectively.
We experimented on congressional bill titles and congressional hearing descriptions. We tested six different models' performance on classifying titles and descriptions into the Comparative Agendas Project's 21 issue topic categories.
Excited to announce my recently published article with Christoffer Florczak on LLMs' multiclass classification capabilities.
rdcu.be/d9oIw
Unexpected sequel to Why Nations Fail.