Brute force has taken on a new meaning in the current AI hype cycle
Posts by Adolfo Neto
« This is being used as a series of markdown [Claude] skills that are begging the AI to write good code, and it still produces [this garbage]. It should make you very aware that "make it good, no bugs, here's some skills" is not enough... »
Oops, all opposable thumbs of computing.
What Makes Code Generation Ethically Sourced?
Zhuolin Xu, Chenglin Li, Qiushi Li, Shin Hwei Tan
New version
To be presented at @icseconf.bsky.social
arxiv.org/abs/2507.197...
Is reading the code enough?
youtu.be/1r9n-HsBQsE?...
"Students have always cheated. Bending and breaking the rules is human nature. And by the same token, educators are not police. We are not here to obsessively surveil our students — education is based on mutual trust. "
Yes! Thanks!
"many metaphorical or jargonistic phrases (especially used to describe ANNs; see Table 1) — like train, learn, hallucinate, reason — are applied to machines and result in distorting how we perceive these machines: humanising them while dehumanising us"
Yes, @olivia.science et al cited Francesca Albanese!
"...and allow the technology industry’s immense contributions to climate crisis and environmental destruction to continue unimpeded (Benjamin 2024; Brennan et al. 2025; Mc Quillan 2025; Suarez et al. 2025; Tafani 2024b)."
"People can undo things; and we will (cf. Albanese 2025; Boztas 2025; Kohnstamm Instituut 2025; van Laarhoven and van Vugt 2025). Besides, there will be no future to embrace if we deskill our students and selves..."
"intelligence has a racist, sexist, classist, and ableist inheritance that it has not managed to shake off, from superficial pseudoscience to eugenics and genocide (Dennis 1995; Gould 1981; Norrgard 2008; Reddy 2007; Saini 2019). "
zenodo.org/records/1706...
Maybe I arrived too late to the party, but I don't see much of that:
"For years, a certain kind of critical AI discourse has had an easy refuge: when the hype gets too loud, you mutter 'stochastic parrots,' roll your eyes, and tell people to wait for the bubble to pop."
Our study highlights that models incorporating these details tend to receive higher downloads, suggesting increased adoption rates and facilitating better decision-making among users, particularly concerning environmental impact considerations (Bender et al. 2021).
"Towards semantic versioning of open pre-trained language model releases on Hugging Face"
ESE Journal again
The best one so far?
doi.org/10.1007/s106...
Good!
bsky.app/profile/eerk...
It’s astonishing how many people don’t do the slightest bit of research, or simply don’t care, to the extent that they go and chat on LF’s podcast. This time it was Richard Karp (Turing Award).
Our review reveals a notable oversight, showing that despite the resource intensity of LLM training and inference, many studies do not systematically evaluate the cost-effectiveness or sustainability of their proposed approaches. As noted in prior work ([70, 71]), the centralization of GenAI capabilities in a handful of well-funded industrial actors raises further concerns regarding accessibility and equity in research. The current trajectory risks exacerbating disparities between organizations that can afford to experiment with GenAI at scale and those constrained by limited computational budgets.
Generative AI for Requirements Engineering: A Systematic Literature Review
Software: Practice & Experience (a good SE journal)
doi.org/10.1002/spe....
Looks good. They seem to care about some important topics.
71 is SP, 70 is aclanthology.org/P19-1355/
I'm interrupting this thread because I found an article that looks like an extended LinkedIn post
"The Tyranny of the Stochastic Parrot: How AI Critique Became a Way to Not See What's Happening"
papers.ssrn.com/sol3/papers....
Links to both papers
link.springer.com/article/10.1...
link.springer.com/article/10.1...
2.1 Large-Language Models
Five years after transformer models were introduced (Vaswani et al. 2017), OpenAI’s publicly accessible chatGPT (OpenAI 2022) transformed the public understanding of LLMs. By now, cloud-based commercial LLMs such as OpenAI’s GPT family, Anthropic’s Claude or Google’s Gemini have become ubiquitous (Zhao et al. 2023). Each new generation of Meta’s Llama model (Touvron et al. 2023) ignites interest in running local LLMs to reduce both potential privacy impact as well as subscription-based costs. There is an ongoing discussion about the minimum viable model parameter size. On one hand, proponents claim that emergent features arise only with larger model sizes (Kosinski 2023; Bubeck et al. 2023; Wei et al. 2022); on the other hand, proponents claim that smaller models can achieve domain-specific tasks with reduced costs for both training and execution (Bender et al. 2021). Smaller models are feasible to run locally. This is important for agent-based scenarios (Andreas 2022; Park et al. 2023) or if privacy reasons disallow the usage of cloud-based LLMs. In early 2024 the term Small Language Models was introduced to denote models with parameter sizes typically smaller than 8–12 billions, one example of such a model
LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks
ESE again
These authors seem concerned about the size of the models
3.2 APR Needs a Multilingual Dataset to Overcome Language Bias
In recent years, we have seen a breakthrough in natural language processing (NLP) and understanding (NLU), largely owed to deep learning. However, researchers have repeatedly indicated that there exists a strong bias towards the English language, not only in datasets but also benchmarks (e.g., GLUE) (Bender et al. 2021). Does such a bias exist in APR? Of the 16 most important datasets published in recent years, 11 include Java, five Python and three JavaScript, while other less popular but still very common languages like Ruby or Go do not appear at all; for PHP a first benchmark was presented only recently (Table 1). We would like to open the field to a larger number of programming languages, overcome its current Java centricity and foster the development of more polyglot or even language-agnostic APR systems. This would greatly raise the applicability of APR because large amounts of code is written in languages other than, say, Java, Python or C. With software development becoming increasingly diverse in terms of pro-
RunBugRun: An executable dataset for automated program repair
Empirical Software Engineering (a top journal)
Are they only concerned about prejudice?
OK, maybe Stochastic Parrots, by @emilymbender.bsky.social @timnitgebru.blacksky.app @mmitchell.bsky.social, is a better choice. I was able to find a few software engineering papers that cite SP
dl.acm.org/doi/abs/10.1...
I didn't read the whole thing. I just looked for references to issues with LLMs that go beyond the generated code. I didn't find any.
I just realized
It is not new.
bsky.app/profile/neur...
ScienceDirect screenshot
ScienceDirect now has a "Reading Assistant" that can "summarize" (shorten) a paper you are reading, "answer questions" about it, and "suggest" related papers.
You cannot disable the chatbot.
The State of Peer Review in Empirical Software Engineering
If you review papers in software engineering, answer this survey at forms.gle/S8qVvF4TLTKa...
Sustainable Dual-Track Development: The Future of Software Engineering for Co-located, Remote, and Hybrid Teams 1st Edition
by Todd Sedano and Paul Ralph
amzn.to/4tw71K9
50 dollars is too much for a Kindle book...
Not sure it applies to these cases