Did *models* robustly solve classification? Absolutely not lol
But could you make a convincing argument that, given the weight of the entire industry, any individual classification problem could be "solved" in a ~year? Maybe!
Posts by Sireesh Gururaja
This is a great thread, and it reminds me of interviewing someone at one of the big labs for a study, where they made the claim that "classification was solved in 2022"
That's an insane quote, until you account for the money, time, and labor that will be poured into any problem that's deemed worth it
Have you read Ted Chiang's The Lifecycle of Software Objects? The sense of bonding is real, and so much more literal now
They need to make it so you can mass add starter packs to a list / feed if you don't want them to fill up your main feed
Monokai was suddenly everywhere with Sublime Text 3 at the tail end of my time in college, and I credit it with really giving me the bug for customizing my tools!
It looked cool in a way that I'm not sure color themes *can* for me anymore, and I've had a taste for hot pink ever since
friendly reminder to run docker system prune
if you haven't recently
But it's literally a benchmark!!
Really excited about this work w/ my long-time collaborators at Boulder!
We address limitations in existing morphosyntactic annotation systems for digitally under-resourced languages and show how *jointly* predicting morphological segmentation helps with glossing performance
Congratulations!! 🎉
if you invented public libraries today, every opinion page in the country would be arguing for means tested subsidized Amazon Prime memberships
It's time to seize the means of prediction
I think this is exactly right - I think there's a real use here, and it'd be a lot easier to sell if you led with that use rather than the fact that it was AI. Even if the AI provides a way to do other things down the line!!
You have a moral imperative to refuse to work with these people or develop models for these purposes.
*CL folks, a recent history question: was the move to OpenReview as the review platform for *CL conferences related to the move to ARR? Or did they just happen simultaneously?
Asking for a discussion with @shaily99.bsky.social
I'm really not a fan of the way some journals in the physical sciences state results upfront, and bury methods much later in the paper, or even in supplementary materials. Maybe it's my ML bias, but I don't trust that your methods are prima facie reasonable!! Show me what you did before the results!
Appointment reading
I think my AI beliefs are quickly becoming ‘this has enormous positive potential if implemented responsibly. it will not be implemented responsibly and like maybe five companies are even trying.’
Right!! And such complementary methods, too 🥰
This is also a much better statement of the alignment connection thru Bahdanau et al (2014)
Sorry for the barrage, this has just been a side quest for me for a while, so many thoughts
Misc links: tributes to Charles Wayne (DARPA program manager, seen as responsible for benchmarking): www.cambridge.org/core/journal... and Fred Jelinek (so much non-rule based NLP): doi.org/10.1162/coli...
The 2021 workshop on benchmarking also had some real gems, including a story from John Makhoul about the wholesale shift from rule-based ASR to HMMs following the institution of benchmark evaluations: github.com/kwchurch/Ben...
Koch and Peterson have a great critical take on benchmarks and their relationship to the rise of deep learning (both in NLP and otherwise): arxiv.org/abs/2404.06647
I'm biased, but the history of benchmarking is also really important. @markriedl.bsky.social mentioned the ALPAC report, but also a strong rec for Whither Speech Recognition. Same author, but this is a scathing piece that is interesting to compare to the field today (and an early LM shoutout!)
Hope a self plug is ok—we wrote about the LLM shift in NLP and what feels different about it here: aclanthology.org/2023.emnlp-m...
The transformer being developed for MT also is a cool connection to the idea of alignment introduced (iirc) by the IBM models