I wrote a little piece about a pet peeve of mine: people claiming that LLMs "lie". Writing is a great way to get your thoughts in order, and it feels as if there's a bit more at stake when writing for a potential audience.
Might do more of these in the future.
substack.com/home/post/p-...
Posts by Bertram Højer
📣 Next week we will be in Vienna for @aclmeeting.bsky.social to present a couple of works from our lab!
Find more about each of them below 🧵👇
#NLP #NLProc #ACL2025NLP @itu.dk @aicentre.dk
Chatbots — LLMs — do not know facts and are not designed to be able to accurately answer factual questions. They are designed to find and mimic patterns of words, probabilistically. When they’re “right” it’s because correct things are often written down, so those patterns are frequent. That’s all.
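The frequency point above can be made concrete with a deliberately tiny toy: a bigram counter that "answers" by emitting whatever word most often followed in its training text. This is of course not how a transformer works internally (real LLMs learn distributed representations, not lookup tables), but it illustrates why frequently-written-down facts come out "right" — the corpus and function names here are made up for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus: the true statement appears more often, so its
# continuation dominates the counts.
corpus = (
    "paris is the capital of france . "
    "paris is the capital of france . "
    "paris is the capital of fashion . "
).split()

# Estimate P(next word | current word) purely by frequency.
continuations = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    continuations[cur][nxt] += 1

def most_likely_next(word):
    # The "model" answers with whatever followed most often in training.
    return continuations[word].most_common(1)[0][0]

print(most_likely_next("of"))  # "france" — it wins 2-to-1 over "fashion"
```

Nothing here "knows" that Paris is the capital of France; "france" wins only because it was written down more often.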
If you don't think I provided enough interesting findings in my last post, @annarogers.bsky.social has you covered in her latest post on our paper! ✨
📢 The Copenhagen NLP Symposium on June 20th!
- Invited talks by @loubnabnl.hf.co (HF) @mziizm.bsky.social (Cohere) @najoung.bsky.social (BU) @kylelo.bsky.social (AI2) Yohei Oseki (UTokyo)
- Exciting posters by other participants
Register to attend and/or present your poster at cphnlp.github.io /1
You can find the answers to more interesting questions regarding the beliefs of researchers on LLMs and "Intelligence" in the full paper!
This work was done in collaboration with @terne.bsky.social, @heinrichst.bsky.social, and @annarogers.bsky.social ✨
Do researchers agree that current AI systems based on LLMs are "intelligent"?
❓Do researchers believe current LLM-based systems are intelligent?
✔️Generally not - although junior researchers are more willing to attribute intelligence to current systems!
Key Criteria of Intelligence
❓What do AI researchers believe to be key criteria for intelligence?
✔️Researchers across fields agree that Generalization, Adaptability, & Reasoning are key components of intelligence!
Our survey paper "Research Community Perspectives on 'Intelligence' and Large Language Models" has been accepted to Findings of ACL 2025 - and I'll be in Vienna to present the work in July!
arxiv.org/abs/2505.20959
If you’re at ICLR, swing by poster #246 on Saturday from 10-12.30 to hear more about our work on modulating the reasoning performance of LLMs!
#ICLR2025
The problem with most machine-based random number generators is that they’re not TRULY random, so if you need genuine randomness it is sometimes necessary to link your code to an external random process like a physical noise source or the current rate of US tariffs on a given country.
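Tariff-rate entropy sources aside, the underlying point is real: a seeded pseudo-random generator is fully deterministic, while the OS entropy pool mixes in physical noise (interrupt timing, hardware RNGs). A minimal Python sketch of the contrast, using only the standard library:

```python
import random
import secrets

# A seeded PRNG is deterministic: same seed, same "random" sequence.
rng_a = random.Random(42)
rng_b = random.Random(42)
assert [rng_a.random() for _ in range(3)] == [rng_b.random() for _ in range(3)]

# For genuine unpredictability, draw from the OS entropy pool instead,
# which is fed by external physical noise sources.
token = secrets.token_hex(16)
print(token)  # 32 hex characters, different on every run
```

Rule of thumb: `random` for simulations you may want to reproduce, `secrets` (or `os.urandom`) for anything security-sensitive.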
This work was done in collaboration with @olliejarvis.bsky.social and @heinrichst.bsky.social!
Looking forward to presenting our poster alongside Oliver at the conference in Singapore! Hope to see you there! ✨
We of course took the chance to discuss:
🔹the implications of our results for LLM reasoning
🔹the use of the term "reasoning" to discuss LLM computations
🔹 whether LLMs can be said to do reasoning at all 🤔
We:
1️⃣ Derive steering vectors from LLM representations on "reasoning" tasks.
2️⃣ Apply them as a linear transformation to the representational space to improve "reasoning" performance.
A very simple tweak resulting in a slight improvement! 📈
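The two steps above can be sketched in a few lines. This is a generic illustration of contrastive steering vectors — mean difference of hidden states, added back into the residual stream — not the paper's actual implementation; all function names, shapes, and the random stand-in activations are hypothetical.

```python
import numpy as np

def steering_vector(h_pos, h_neg):
    """Contrastive steering vector: difference of mean hidden states.

    h_pos / h_neg: arrays of shape (n_examples, d_model) collected at a
    chosen layer while the model processes "reasoning" vs. baseline prompts.
    """
    return h_pos.mean(axis=0) - h_neg.mean(axis=0)

def apply_steering(hidden, v, alpha=1.0):
    # Linear intervention: shift each position's hidden state along the
    # steering direction; alpha controls the intervention strength.
    return hidden + alpha * v

# Toy demo with random stand-ins for real model activations.
rng = np.random.default_rng(0)
h_pos = rng.normal(size=(8, 16))   # activations on "reasoning" prompts
h_neg = rng.normal(size=(8, 16))   # activations on baseline prompts
v = steering_vector(h_pos, h_neg)
steered = apply_steering(rng.normal(size=(4, 16)), v, alpha=0.5)
print(steered.shape)  # (4, 16)
```

With alpha=0 the intervention is a no-op, which gives a convenient baseline for measuring the effect of the steering strength.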
ICLR is coming up and I thought I'd use the chance to advertise our paper: "Improving 'Reasoning' Performance in Large Language Models via Representation Engineering" ✨
Also happens to be my first publication as a PhD Student at @itu.dk ❗
it's amazing how chatgpt knows everything about subjects I know nothing about, but is wrong like 40% of the time on things I'm an expert on. not going to think about this any further
We also show that this is a misrepresentation in our forthcoming survey paper on intelligence and LLMs in researcher communities (preprint coming).
It’s a bit disturbing to hear Ezra Klein, someone I admire a lot, stating that “… virtually everyone working in this area (AI) are saying that [AGI] is coming”. In my view this is a gross misrepresentation of the actual sentiment in the field.
Sorry, my bad for being a bit quick there! When you say "popular AI denial" I think more of people simply not wanting to accept that these systems are *actually* intelligent. I think it's a shame to discuss healthy skepticism in a derogatory manner...
I disagree - and what exactly do you mean by "AI denial"? One can hold that there are legitimate (if limited) use-cases for modern AI without subscribing to the belief that current models show "sparks of AGI".
Very harsh writing by Edward Zitron - but he voices concerns I have myself.
Developing helpful 'AI' systems could provide value, but the way current commercial 'AI' systems are being hyped is not very helpful and quite likely detrimental.
www.wheresyoured.at/longcon/
Not to mention that rather than being well-established observations, (1) is difficult if not impossible to assess without a proper definition of intelligence and (3) seems to be complete blather.
Modern-Day Oracles or Bullshit Machines?
Jevin West (@jevinwest.bsky.social) and I have spent the last eight months developing the course on large language models (LLMs) that we think every college freshman needs to take.
thebullshitmachines.com
The "Perspectives on Intelligence" survey is now closed! Thank you to the 200+ researchers who participated. Currently analyzing the data and writing up the findings - stay tuned for the paper!
Project in collaboration with @terne.bsky.social, @annarogers.bsky.social & @heinrichst.bsky.social!
Any clue as to when we'll have some more information? Hoping to go 😁
Do researchers in AI related fields believe that state-of-the-art language models are intelligent? And how do we even define intelligence?
If you haven't yet responded consider taking part in our survey. We'd love to hear your take!
Details and link in original post👇 !
Allowing models to process information without the constraint of "token-space" is an interesting direction for research related to reasoning - a direction I'm also currently pursuing!
They argue that their approach allows models to encode multiple next steps (like a form of breadth-first search) by working directly with the hidden states. This leads to better performance on tasks that need planning ahead.
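The core mechanic — feeding the hidden state back in directly instead of decoding a token and re-embedding it — can be sketched abstractly. This is a toy illustration of the idea, not the cited paper's method; the linear "model", the shapes, and the function names are all stand-ins.

```python
import numpy as np

def latent_steps(h0, step_fn, n_steps):
    """Iterate 'thought' steps directly in hidden space.

    Decoding each step to a token and re-embedding it collapses the state
    onto one discrete choice; passing the hidden state back unchanged lets
    it keep several candidate next steps in superposition.
    """
    h = h0
    trajectory = [h]
    for _ in range(n_steps):
        h = step_fn(h)  # stand-in for one transformer forward pass
        trajectory.append(h)
    return trajectory

# Toy demo: a fixed nonlinear map plays the role of the "model".
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(16, 16))
traj = latent_steps(rng.normal(size=16), lambda h: np.tanh(W @ h), 4)
print(len(traj))  # 5 states: the initial one plus 4 latent steps
```

The interesting question is exactly the one raised in the quoted work: whether keeping the state continuous genuinely buys a breadth-first-search-like advantage on planning tasks.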