“In 2022, AI was seen as an efficiency tool for image analysis and language polishing. By 2025, AI has become a participant in the process, shaping writing, influencing review, and challenging the concept of authorship and accountability.”
Very interesting report
#PRC10 #ScholarlyPublishing
Earlier this month, Cecilie, @helenavbk.bsky.social, Mette, and @alundh.bsky.social presented their research at the 10th International Congress on Peer Review and Scientific Publication in Chicago! #PRC10
Read more about our team and the research we do on our website: www.sdu.dk/en/forskning...
For #PeerReviewWeek, I just published my BlueSky posts and notes from the 10th International Peer Review Congress, held 2 weeks ago in Chicago.
#PRC10 @peerreviewcongress.bsky.social
Part 1 is here: scienceintegritydigest.com/2025/09/16/p...
Robert Thibault, PhD, speaks at a podium on a conference stage with dark blue curtains during the “Open Science and Data Sharing” session at the 10th International Congress on Peer Review and Scientific Publication. A colleague sits at a table as a large slide with the ASAP logo shows his talk title, “A Funder-Led Intervention to Increase the Sharing of Data, Code, Protocols, and Key Laboratory Materials.”
Last week, the ASAP team shared our commitment to #OpenScience at the 10th International Congress on Peer Review and Scientific Publication. The session highlighted how funders can expand the sharing of data, code, protocols, and materials. #PRC10
@peerreviewcongress.bsky.social
🔗 bit.ly/4ngFl8S
So much to absorb from a week in Chicago at @peerreviewcongress.bsky.social. Grateful to all who organized and spoke at this incredible meeting! #PRC10
A woman with curly hair and glasses speaks at a table with a microphone. She wears a blazer over a floral shirt. A water bottle and cups sit nearby, against a dark blue backdrop.
📸 JAMA and JAMA Network Editor in Chief @kbibbinsdomingo.bsky.social, MD, PhD, MAS, chaired the “Peer Review Times and Payment Incentives” session at the 10th International Congress on Peer Review and Scientific Publication.
@peerreviewcongress.bsky.social #PRC10
Heading home after #PRC10 feeling inspired by so many people with fresh ideas on how to make research better, and the highlight was reconnecting with my wonderful friend @james-mcauley.bsky.social after 10 years! Let’s keep the conversation going at the Science Integrity Alliance’s community forum!
A blond man in a blue shirt smiles in front of a poster titled "Transparent Reporting of Observational Studies Emulating a Target Trial: The TARGET Guideline". The poster includes a checklist of items with a QR code.
JAMA Network article: "Transparent Reporting of Observational Studies Emulating a Target Trial—The TARGET Statement" by Aidan G. Cashin et al. Published online September 3, 2025.
The TARGET 2025 guideline provides recommendations for transparent reporting of observational studies.
📸 Co-author Aidan Cashin, PhD, presented at @peerreviewcongress.bsky.social Reporting Guidelines poster session. #PRC10
🔗ja.ma/3HNbqGt
A smiling man stands in front of a poster titled, "Authors Who Publish in a Journal and Likelihood to Serve as Reviewers" by Stephan D Fihn et al. The poster details a study using JAMA Network Open data from 2019-23.
📸 Executive Deputy Editor Stephan Fihn, MD, MPH, presented early research on published authors who are also peer reviewers at @peerreviewcongress.bsky.social.
#PRC10
A woman stands next to a poster titled "Ten items to guide the reporting of health equity data and considerations in observational studies." The poster includes QR codes for accessing the STROBE-Equity Website.
JAMA Network Open article titled "Improving the Reporting on Health Equity in Observational Research (STROBE-Equity) Extension Checklist and Elaboration" lists authors and discusses the importance of equity in observational studies.
📸 Co-author Vivian Welch, PhD, broke down the STROBE-Equity guidelines at @peerreviewcongress.bsky.social. #PRC10
🔗 ja.ma/45QBGZH
A speaker, a bald white man with glasses, is at a table with a laptop, a bottle of water, and paper cups in front of a blue curtain. He wears a blue blazer and is speaking into a microphone while holding a book.
📸 JAMA Network Associate Editor David Schriger, MD, MPH, moderated the “Research and Integrity” session at the 10th International Congress on Peer Review and Scientific Publication.
@peerreviewcongress.bsky.social #PRC10
Thank you again for the live posts from @peerreviewcongress.bsky.social, they've been great! Not sure how you manage to get the photos, points and discussion down so fast!
But what a great record of the event and so helpful to get a feel of #PRC10. Thank you @elisabethbik.bsky.social!
Today at #PRC10: Analysis of 347 RCTs in Europe and Canada in 2016
Co-author @schwenkej.bsky.social found industry-sponsored trials were more often completed and available.
🔗 Read the study: ja.ma/41I2etz
@peerreviewcongress.bsky.social
John Ioannidis closing the conference
John Ioannidis is closing the conference, by thanking organizers, staff, first-comers, and veteran attendees. Some attended for the 9th or 10th time!
Safe travels everyone!
EB: I hope y'all enjoyed the live posts! It was my pleasure to provide access to this well-organized congress.
#PRC10
Discussion:
* Do we know if any LLMs are being trained on public reviews? - hard to know which ones are reliable.
* What happens if you retry with the same prompt? You get more or less the same output.
* One LLM and one human review in future?
* Problems with LLM monoculture/monopoly
#PRC10
Graph presenting results comparing LLMs with human
FA: Across 8 RQI tiems LLM reviews scored higher on:
* identify strengths and weaknesses
* useful comments on writing/organizations
* constructiveness
LLMs can thus help humans review papers.
Not all LLMs were equally good. Gemini 5.0pro was the best, but produced very long texts.
#PRC10
Approach slide
FA:
We used 5 LLMs vs 2 humans for each manuscript submitted to 4 BMJ journals, where the LLM reviews was not used in editorial decisions.
We used the Review Quality Instrument (RQI), where editors rated the review quality as well as comprehensiveness score.
#PRC10
Speaker and title slide
Next, the last speaker: Fares Alahdab with 'Quality and Comprehensiveness of Peer Reviews of Journal Submissions Produced by Large Language Models vs Humans'
There is reviewer fatigue, no credit, time-consuming.
Is it a bad thing that LLMs produce peer reviews? How good are LLM reviews?
#PRC10
Conclusion slide with QR code to the paper.
VR: In summary, we can detect LLM-generated peer reviews with high detection rate. Our preprint: arxiv.org/abs/2503.15772
Discussion:
* AI output can be quite good. Why prevent it?
* Flipside to hidden prompt: "give me a positive review" in manuscript. That is malfesience. This not?
#PRC10
Summary of results
VR:
Effectiveness of watermark insertion: LLMs insert the watermark with high probability.
We had great accuracy.
Reviewer defenses could be to paraphrase the LLM-generated text. They could also ask LLM if there were hidden prompts.
#PRC10
VR: But we do not want false positive and
Better watermarking strategies are to insert a random sentence,
a random fake citation, or a fake technical term (markov decision process) - false positive rate will go down.
Hidden prompts can be white colored, very small font, font manipulation
#PRC10
Title slide and speaker
Cartoons many reviewers suspected to submit LLM generated reviews
Next: 'Evaluation of a Method to Detect Peer Reviews Generated by Large Language Models' by Vishisht Rao
Many reviewers are suspected to submit LLM-generated reviews. We can insert hidden message in review assignment for a LLM: Use the word aforementioned and check for that.
#PRC10
Discussion:
* Some verbs are also hype, such as reveal, drive.
* Some folks have used words such as 'delve' already for 20 years - does not always mean it's AI.
* Is it bad to use those words? Other people need to know our science is very groundbreaking!
* semantic bleaching
#PRC10
Nihar Shah just quoted Mary Poppins at @peerreviewcongress.bsky.social #PRC10!
Experiments: language models Table with numbers
NM: The 3 annotators (the authors) did not always agree!
The language models performed better, with finetuned BERT outperforming all methods.
But, subjectivity remains a challenge. Binary labels (hype yes/no) oversimplify promotional language.
We want to expand the lexicon.
#PRC10
Flow chart of the annotation guidelines
NM: We manually annotated 550 sentences from NIH grant application abstract - benchmarking using NLP classification methods, pretrained LLMs and human baseline.
We looked at 11 adjectives promoting novelty, and classified the terms as hype or not hype.
#PRC10
NM: Hype might be biasing evaluation of research and erode trust in science. Confident or hype language is associated with success. LLMs also have contributed.
Can we develop tools to detect and mitigate hype in biomedical text?
Not all these words are always hype (eg. Essential fatty acids)
#PRC10
Speaker with title slide
Next: Neil Millar with "Automating the Detection of Promotional (Hype) Language in Biomedical Research".
Hype: hyperbolic language such as crucial, important, critical, vital, novel, innovative, actionable etc.
All these terms have increased over time in grant applications or articles.
#PRC10
Discussion:
* There are a couple of commercial tools available that do similar work, like Scite.ai - how is your work different?
* Medical writers for industry often do this work manually - compare if industry papers are doing better.
* Could citing the wrong year be counted as error?
#PRC10