Hashtag: #languageModels
Preview
Clinical Summaries of Social Media Timelines for Mental Health Monitoring: Human Versus Large Language Model Comparative Evaluation Study

Background: Social media timelines contain rich signals of users’ mental states but are too voluminous for direct clinical review. Although large language models (LLMs) demonstrate robust linguistic and summarization capabilities in general-purpose tasks, distilling clinically relevant insights demands deeper psychological analysis and sensitivity to each individual’s unique personality and context. Accurately capturing subtle, personalized affective and behavioral patterns remains a significant challenge for current models. A thorough, systematic evaluation of LLM-generated clinical summaries is therefore essential to understand their readiness for real-world mental health monitoring.

Objective: This study evaluates the ability of an LLM-based pipeline to generate clinically meaningful summaries of social media timelines, compared to summaries written by human clinicians. The summaries are structured along 3 key clinical aspects: an overall mental health assessment, intrapersonal and interpersonal patterns, and mental state changes over time.

Methods: We use a recent state-of-the-art approach that combines a hierarchical variational autoencoder (VAE) with an LLM (LLaMA2 13B, the 13-billion-parameter version of Large Language Model-Meta AI 2). This method first summarizes the patient’s history using the VAE and then transforms this summary into a clinical narrative using the LLM. We also test both single-step and multistep LLM-prompting techniques and devise comprehensive clinical prompts. For 30 social media timelines, model outputs were evaluated against human-written summaries through human ratings and expert qualitative analysis. Linguistic diversity was automatically measured as a proxy for personalization.

Results: Human summaries scored highest for factual consistency (3.75) and general usefulness (3.63). The timeline-hierarchical variational autoencoder (TH-VAE) model outperformed LLaMA for factual consistency (3.35 vs 3.08) and was comparable for general usefulness (3.28 vs 3.38). Both 2-step models were comparable to humans in describing interpersonal and intrapersonal patterns (3.45-3.48 vs 3.33) and changes over time (3.42 vs 3.30-3.35). The naive LLaMA baseline scored lower on all criteria except factual consistency. Qualitative analysis found that human summaries provided more accurate, deep, and personalized insights, while LLMs offered more exhaustive but generic descriptions. Quantitatively, linguistic diversity was higher in human summaries at both the semantic level (mean Cohen d=1.19) and the surface level (mean Cohen d=1.31).

Conclusions: Medium-size LLMs can already generate largely accurate and informative clinical summaries of social media timelines, and advanced prompting boosts performance modestly. However, they still underperform human clinicians in capturing subtle psychological nuances and individual idiosyncrasies. Future work should integrate domain-specific fine-tuning and enhanced context modeling to improve LLM clinical fidelity.
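The linguistic diversity gap above is quantified with Cohen d, the standardized mean difference between two groups. A minimal sketch of the statistic, using made-up diversity scores rather than the study's data:

```python
from statistics import mean, stdev

def cohens_d(a, b):
    """Cohen's d: difference in group means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                 / (na + nb - 2)) ** 0.5
    return (mean(a) - mean(b)) / pooled_sd

# Hypothetical per-summary diversity scores (illustrative only, not the study's data):
human_summaries = [0.82, 0.79, 0.85, 0.81, 0.78]
llm_summaries = [0.74, 0.76, 0.71, 0.77, 0.73]
print(round(cohens_d(human_summaries, llm_summaries), 2))
```

A d above 0.8 is conventionally read as a large effect, so the reported values of 1.19 and 1.31 indicate substantially more varied human writing.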

JMIR Formative Res: Clinical Summaries of Social Media Timelines for Mental Health Monitoring: Human Versus Large Language Model Comparative Evaluation Study #MentalHealth #SocialMedia #MachineLearning #LanguageModels #HealthTech

1 0 0 0
Image from article in Radiology: Artificial Intelligence

A new narrative review describes the evolution of AI applications for radiology from LLMs to autonomous agents https://doi.org/10.1148/ryai.250651 #AI #LanguageModels #MachineLearning

5 1 2 0
Preview
The AI Expert Trap: Why Telling Models They Know Better Can Backfire Research shows persona-based prompting damages AI accuracy on factual tasks but improves safety. When to use expert personas—and when to avoid them.

The AI Expert Trap: Why Telling Models They Know Better Can Backfire

#AI #MachineLearning #LanguageModels #AusNews

thedailyperspective.org/article/2026-03-24-the-a...

0 0 0 0
Preview
OpenAI's New Mini and Nano Slash GPT-5.4 Pricing OpenAI released GPT-5.4 mini and nano on March 17, bringing near-flagship performance at 70% and 92% lower cost respectively.

OpenAI's New Mini and Nano Slash GPT-5.4 Pricing

https://awesomeagents.ai/news/openai-gpt-5-4-mini-nano/

#Openai #Gpt54 #LanguageModels

0 0 0 0
Preview
Search robot thinks for itself A robot that can locate lost items on command – this is the latest development at the Technical University of Munich (TUM).

Researchers, including Benjamin Bogenberger, developed a robot that combines #LanguageModels with #3Dvision to locate misplaced objects by building a spatial map and estimating likely locations: go.tum.de/730486

#Robotics #AI

📷A. Schmitz

3 1 0 0
Preview
TurboSparse: Democratizing AI via Efficient dReLU Sparsification

TurboSparse democratizes access to LLMs for researchers and smaller organizations. #languagemodels

2 1 1 0

👏 Congratulations on this achievement and all the best for Cecilia’s new role as postdoctoral researcher at @cam.ac.uk!

#NLP #PhDDefense #MultilingualAI #CulturalAI #LanguageModels #UKPLab #TUDarmstadt @cs-tudarmstadt.bsky.social

1 0 0 0
Preview
TurboSparse Limitations: The Impact of 150B Token Recovery Training

While achieving 90% sparsity, TurboSparse models currently utilize 1% of the training tokens used by Llama, with further training expected. #languagemodels

1 0 0 0
Preview
dReLU Sparsification: High-Performance 90% Sparsity for Next-Gen LLMs

Learn how this breakthrough makes large language models (LLMs) more accessible and environmentally friendly. #languagemodels

0 0 0 0
Preview
TurboSparse Mobile: 22x Faster Mixtral Inference on PowerInfer-2

Learn how PowerInfer-2 leverages extreme sparsity for a 22.2x speedup over llama.cpp. #languagemodels

0 0 0 0
Preview
TurboSparse Inference: 4.6x Faster LLM Decoding via Hybrid GPU-CPU Computing

Achieve up to 2.28x speedup on pure CPU and 4.64x in hybrid GPU-CPU environments compared to llama.cpp baselines. #languagemodels

0 0 0 0
Preview
TurboSparse: Elite Inference Speed via dReLU Sparsity

Achieve 2-5x faster LLM decoding on RTX 4090 and mobile devices using TurboSparse. Experience 97% parameter sparsity without performance loss. #languagemodels

0 0 0 0
Preview
TurboSparse Efficiency: Achieving 97% Parameter Sparsity in Mixtral-47B

Discover how TurboSparse-Mistral-7B and Mixtral-47B leverage ReLUfication to reach up to 90% neuron inactivity, reducing active parameters to just 3%. #languagemodels

0 0 0 0
Preview
Fine-Tuned Large Language Models for Generating Multiple-Choice Questions in Anesthesiology: Psychometric Comparison With Faculty-Written Items

Background: Multiple-choice questions (MCQs) are widely used in medical education to ensure standardized and objective assessment. Developing high-quality items requires both subject expertise and methodological rigor. Large language models (LLMs) offer new opportunities for automated item generation. However, most evaluations rely on general-purpose prompting, and psychometric comparisons with faculty-written items remain scarce.

Objective: This study aimed to evaluate whether a fine-tuned LLM can generate MCQs (Type A) in anesthesiology with psychometric properties comparable to those written by expert faculty.

Methods: The study was embedded in the regular written anesthesiology examination of the eighth-semester medical curriculum with 157 students. The examination comprised 30 single best-answer MCQs, of which 15 were generated by senior faculty and 15 by a fine-tuned GPT-based model. A custom GPT-based (GPT-4) model was adapted with anesthesiology lecture slides, the National Competence-Based Learning Objectives Catalogue (NKLM 2.0), past examination questions, and faculty publications using supervised instruction-tuning with standardized prompt–response pairs. Item analysis followed established psychometric standards.

Results: In total, 29 items (14 expert, 15 LLM-generated) were analyzed. Expert-generated questions had a mean difficulty of 0.81 (SD 0.19), point-biserial correlation of 0.19 (SD 0.07), and discrimination index of 0.09 (SD 0.08). LLM-generated items had a mean difficulty of 0.79 (SD 0.18), point-biserial correlation of 0.17 (SD 0.04), and discrimination index of 0.08 (SD 0.11). Mann-Whitney tests revealed no significant differences between expert- and LLM-generated items for difficulty (P=.38), point-biserial correlation coefficient (P=.96), or discrimination index (P=.59). Categorical analyses confirmed no significant group differences. Both sets, however, showed only modest psychometric quality.

Conclusions: Supervised fine-tuned LLMs are capable of generating MCQs with psychometric properties comparable to those written by experienced faculty. Given the limitations and cohort dependency of psychometric indices, automated item generation should be considered a complement rather than a replacement for manual item writing. Further research with larger item sets and multi-institutional validation is needed to confirm generalizability and optimize integration of LLM-based tools into assessment development.
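As context for the numbers above: in classical item analysis, difficulty is the proportion of examinees answering an item correctly, and the point-biserial correlation measures how well the item separates high from low total scorers. A minimal sketch with hypothetical response data (not the study's):

```python
from statistics import mean, pstdev

def item_stats(correct, totals):
    """Classical item analysis for one exam item.

    correct: 1/0 per examinee for this item; totals: each examinee's total score.
    Returns (difficulty, point-biserial correlation with the total score).
    """
    n = len(correct)
    p = sum(correct) / n                                    # difficulty: proportion correct
    m1 = mean(t for t, c in zip(totals, correct) if c)      # mean total of those who got it right
    m0 = mean(t for t, c in zip(totals, correct) if not c)  # mean total of those who got it wrong
    r_pb = (m1 - m0) / pstdev(totals) * (p * (1 - p)) ** 0.5
    return p, r_pb

# Hypothetical data: 8 examinees, one item.
correct = [1, 1, 1, 0, 1, 0, 1, 1]
totals = [25, 22, 27, 15, 24, 18, 26, 23]
difficulty, r_pb = item_stats(correct, totals)
```

A difficulty near 0.8, as reported for both item sets, means most students answered correctly; point-biserial and discrimination values around 0.1-0.2 are what the authors flag as only modest psychometric quality.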

JMIR Formative Res: Fine-Tuned Large Language Models for Generating Multiple-Choice Questions in Anesthesiology: Psychometric Comparison With Faculty-Written Items #Anesthesiology #MedicalEducation #MultipleChoiceQuestions #LearningAssessment #LanguageModels

0 0 0 0

Big congratulations to all authors! 🚀

#ICLR2026 #MachineLearning #AIResearch #RepresentationLearning #InformationRetrieval #DenseRetrieval #SelfSupervisedLearning #LanguageModels #NLP #UKPLab

@cmu.edu @tencent.bsky.social @tuda.bsky.social @cs-tudarmstadt.bsky.social @microsoft.com

1 0 0 0
Preview
Uptake of Large Language Models by London Medical Students: Exploratory Qualitative Interview Study

Background: The popularity of large language models (LLMs) has grown exponentially across health care. Despite the wealth of literature on proposed applications in medical education, there remains a critical gap regarding their real-world use, benefits, and challenges as experienced by medical students themselves.

Objective: We aimed to explore qualitatively and characterize the perceived benefits, facilitators, and barriers associated with the use of LLMs among a cohort of London-based medical students.

Methods: Semistructured interviews were conducted with 15 medical students from preclinical and clinical stages at London-based medical schools. Guided by the technology acceptance model, interview transcripts underwent an inductive thematic analysis to identify themes on actual system use, perceived usefulness, ease of use, and attitudes toward LLMs.

Results: All participants reported frequent use of ChatGPT for concise topic summarization, clarification of complex concepts, generation of examination-style questions, and summarization of research. Students described LLMs as a complementary tool to traditional materials, valuing their immediacy (“Instead of getting a textbook, I can ask ChatGPT to summarise something in X words and read it in under a minute”) and ease of use. Peer demonstration and device-agnostic accessibility emerged as key facilitators. Of note, wider applications such as simulating clinical interviews were discovered through peers rather than through formal teaching. Significant barriers were reported. Hallucinations, fabricated references, and outdated information led to loss of trust, with more junior students finding inaccurate outputs difficult to detect (“I stopped using it because I found it to be inaccurate, and I don’t want to be learning the wrong things”). Half of the participants reported a sense of overreliance, defaulting to LLMs for answers with a perceived loss of critical thinking ability. Students noted inequalities in access to advanced features and voiced concerns about privacy when using LLMs in clinical scenarios.

Conclusions: LLMs have been widely adopted by medical students. While students perceived the efficiency, flexibility, and conversational interface of LLMs as beneficial, substantial reservations remain regarding their reliability, potential de-skilling, and the loss of academic integrity. These findings underscore the urgent need for curricula to both support safe LLM use and adapt assessment and teaching strategies for artificial intelligence (AI)-augmented student practice. Future research should broaden geographical representation, investigate applications in low-resource settings, and integrate educators’ perspectives to establish future curricular guidance in the AI era.

JMIR Formative Res: Uptake of Large Language Models by London Medical Students: Exploratory Qualitative Interview Study #MedicalEducation #LanguageModels #HealthCareInnovation #DigitalHealth #MedicalStudents

0 0 0 0
Preview
DeepSeek vs. ChatGPT: A Battle of AI Language Models Artificial intelligence (AI) has rapidly evolved, transforming industries and redefining how we interact with technology. Among the most significant advancements in AI are large language models (LL...

DeepSeek vs. ChatGPT: A Battle of AI Language Models
www.ekascloud.com/our-blog/dee...
#DeepSeek
#ChatGPT
#DeepSeekVsChatGPT
#AIBattle
#AIComparison
#LanguageModels
#LargeLanguageModels
#GenerativeAI
#ArtificialIntelligence
#AITrends
#TechDebate

0 0 0 0
Preview
Evaluating the Efficacy of AI-Based Interactive Assessments Using Large Language Models for Depression Screening: Development and Usability Study

Background: The evolution of language models, particularly large language models, has introduced transformative potential for psychological assessment, challenging traditional rating scale methods that have dominated clinical practice for over a century.

Objective: This study aimed to develop and validate an automated assessment paradigm that integrates natural language processing with conventional measurement tools to assess depressive symptoms, exploring its feasibility as a novel approach in psychological evaluation.

Methods: A cohort of 115 participants, including 28 (24.3%) individuals diagnosed with depression, completed the Beck Depression Inventory Fast Screen via a custom ChatGPT interface (BDI-FS-GPT) and the Chinese version of the Patient Health Questionnaire-9 (PHQ-9). Statistical analyses included the Spearman correlation (PHQ-9 vs BDI-FS-GPT scores), Cohen κ (diagnostic agreement), and area under the curve (AUC) evaluation.

Results: Spearman analysis revealed a moderate correlation between PHQ-9 and BDI-FS-GPT scores. The Cohen κ indicated moderate diagnostic agreement between the PHQ-9 and the BDI-FS-GPT (κ=0.43; 76.5% agreement), substantial agreement between the BDI-FS-GPT and the clinical diagnosis (κ=0.72; 88.7% agreement), and moderate agreement between the PHQ-9 and the clinical diagnosis (κ=0.55; 71.4% agreement). The BDI-FS-GPT demonstrated excellent diagnostic accuracy (AUC=0.953) at a cutoff of 3, detecting 89.3% of participants with depression with an 11.5% false-positive rate, compared to the PHQ-9 (AUC=0.859) at a cutoff of 5 (sensitivity=71.4%; false-positive rate=13.8%). Participants also reported significantly higher satisfaction with the automated assessment compared to the traditional scale (P=.02).

Conclusions: The automated assessment paradigm combines the interactivity and personalization of natural language processing-powered tools with the psychometric rigor of traditional scales, suggesting preliminary feasibility for future psychological assessment. Its ability to enhance engagement while maintaining reliability and validity provides encouraging evidence, warranting validation in larger and more diverse studies as large language model technology advances.
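The agreement figures above are Cohen κ, which corrects raw percentage agreement for the agreement expected by chance from each rater's marginal rates. A minimal sketch for binary screen-positive/negative labels (illustrative data, not the study's):

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two binary label lists: (observed - chance) / (1 - chance)."""
    n = len(rater_a)
    po = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement
    pa = sum(rater_a) / n                                   # rater A's positive rate
    pb = sum(rater_b) / n                                   # rater B's positive rate
    pe = pa * pb + (1 - pa) * (1 - pb)                      # agreement expected by chance
    return (po - pe) / (1 - pe)

# Hypothetical screening outcomes for 10 participants:
bdi_fs_gpt = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
diagnosis =  [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(round(cohens_kappa(bdi_fs_gpt, diagnosis), 2))  # → 0.78
```

By the usual Landis-Koch reading, the reported κ=0.72 against clinical diagnosis is substantial agreement, while values in the 0.41-0.60 range, like the PHQ-9 comparisons, are moderate.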

JMIR Formative Res: Evaluating the Efficacy of AI-Based Interactive Assessments Using Large Language Models for Depression Screening: Development and Usability Study #AI #MentalHealth #DepressionScreening #LanguageModels #PsychologicalAssessment

0 0 0 0

Overview: Hacker News debated Recursive Language Models (RLMs). Are they truly novel, or just a repackaging of RAG/sub-agents? Discussion focused on how the LLM interacts with its context, the role of recursion, and the absence, so far, of RLM-specific training in current implementations. #LanguageModels 1/6

0 0 1 0
Preview
Context Window Expansion: Transform Your AI Performance in 2025

What is Context Window Expansion?
A context window is the amount of information a large language model (LLM) can process and "remember" at any given time. Think of it as the AI's working memory: the larger the window, the more data the model can consider when generating responses. When ChatGPT first launched in late 2022, it could only process about 2,048 tokens (roughly 1,500 words). Today's advanced models like Google's Gemini can handle up to 2 million tokens, equivalent to processing over 3,000 pages of text simultaneously.

The Evolution of Context Windows in AI Models
In 2018-2019, maximum context windows were limited to just 512-1,024 tokens. The original GPT-3.5 started with 4,096 tokens, later expanded to 8,192 tokens with GPT-3.5-Turbo. Major milestones in context length:
* 2022-2023: GPT-4 launched with 8,192 tokens, later expanded to 128,000 tokens
* 2023: Anthropic's Claude introduced 100,000-token context windows
* 2024: Meta's Llama 3.1 reached 128,000 tokens, while Google Gemini 1.5 achieved 2 million tokens
* 2025: Meta's Llama 4 announced a 10 million token context window
This rapid expansion has let AI systems move from simple conversations to processing entire libraries of information in a single session.

Key Benefits of Expanded Context Windows
1. Enhanced document processing: organizations can process comprehensive documents, from technical manuals to financial reports, in their entirety, eliminating the need to split documents into chunks and preserving context for more accurate analysis.
2. Extended conversation memory: chatbots and assistants can maintain coherent conversations spanning hours or even days, remembering earlier discussion points without losing critical context.
3. Cache-augmented generation (CAG): models can reference substantial caches of information within their context, improving generation latency over retrieval-augmented generation (RAG) by eliminating extra retrieval steps.
4. Improved code analysis: developers can debug entire codebases in a single session; models can follow interdependencies that span multiple files and the whole project.
5. Multimodal data integration: extended contexts support processing video, audio, images, and text together, useful for applications like insurance claims that mix data types.

Challenges and Limitations of Long Context Windows
* Performance degradation: LLMs don't process information uniformly across the window. Models perform best when relevant information appears at the beginning or end of the input, with accuracy dropping for content in the middle (the "lost in the middle" problem).
* Computational cost: processing scales quadratically with sequence length; doubling input tokens roughly quadruples processing work, which raises operational costs.
* Slower responses: each new token must compute relationships with all preceding tokens, so generation slows as context grows, a latency problem for real-time applications.
* Signal-to-noise ratio: more context isn't always better; longer prompts can be less accurate than shorter, focused ones because unnecessary information dilutes the signal.
* Security: larger context windows widen the attack surface for adversarial prompts; research from Anthropic shows longer contexts increase vulnerability to jailbreaking and harmful content generation.

Best Practices for Implementing Context Window Expansion
* Be strategically selective: include only information essential to the task; quality trumps quantity.
* Structure information intelligently: given the "lost in the middle" phenomenon, place the most critical information early in the context.
* Monitor performance metrics: track generation speed, output quality, and operational cost to find the sweet spot between comprehensive context and efficient processing.
* Adopt hybrid approaches: combine CAG for frequently used information with RAG for broader knowledge bases.
* Tokenize efficiently: tokenization varies by language and model; in English one token is roughly 0.75 words, so optimize prompts for information density within token limits.
* Test before deploying: the ideal window size depends on application requirements, content type, and performance priorities.

Frequently Asked Questions
Q: What is the largest context window available in 2025? Meta's Llama 4 offers the largest publicly announced window at 10 million tokens; Google's Gemini 1.5 Pro provides 2 million, while most commercial models like GPT-4 and Claude offer 128,000-500,000 tokens. The optimal size depends on the use case, not on picking the largest available.
Q: How does context window size affect AI accuracy? Larger windows admit more information but can reduce precision due to the "lost in the middle" problem; strategic placement and focused context often outperform simply maximizing window usage.
Q: What's the difference between context window and training data? The context window is the model's working memory for a session (inputs and conversation history); training data is the corpus used to teach the model its foundational knowledge. Both are essential but serve different purposes.
Q: Do larger context windows always cost more? Yes, most providers charge by token usage, so larger contexts raise cost per query, though prompt caching can cut expenses for frequently reused content. Balance context length against actual need.
Q: Will context windows continue expanding indefinitely? Practical limits on compute, speed, and diminishing returns suggest a plateau; future progress will likely favor efficiency and intelligent use over raw size.

Key Takeaways
Context window expansion has grown from 2,048 tokens in 2022 to 10 million tokens in 2025, enabling whole-document processing, extended conversations, and multimodal analysis. The benefits come with tradeoffs: higher cost, slower responses, and accuracy issues with unnecessarily long contexts. The most effective implementations balance context length with performance needs, position critical information strategically, and monitor metrics continuously. Success lies not in maximizing context windows but in using them intelligently for specific applications.
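Two of the article's rules of thumb, the ~0.75 words-per-token English heuristic and quadratic attention scaling, are easy to sketch (the 2,048-token baseline is just an illustrative reference point):

```python
def estimate_tokens(text, words_per_token=0.75):
    """Rough token count from the ~0.75 words-per-token heuristic for English."""
    return round(len(text.split()) / words_per_token)

def relative_attention_cost(tokens, baseline=2048):
    """Self-attention work grows quadratically with sequence length,
    so doubling the tokens roughly quadruples the compute."""
    return (tokens / baseline) ** 2

print(relative_attention_cost(4096))    # 2x the tokens -> 4.0x the compute
print(relative_attention_cost(131072))  # a 128K window vs the 2K baseline
```

At 128K tokens that ratio is 64 squared, about 4,096 times the attention compute of a 2K window, which is why long-context serving leans on caching and other efficiency tricks.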

Context Window Expansion: Transform Your AI Performance in 2025 #AI #ArtificialIntelligence #MachineLearning #ContextWindow #LanguageModels

1 0 0 0
Video

AI Needs Better Thinking Steps - Demis Hassabis and Hannah Fry

#languagemodels #ai

0 0 0 0
Post image

Norway becomes first country to establish state-funded AI training framework using newspaper content. Landmark agreement funds open Norwegian/Sami language models for public & private use. Major step for accessible multilingual AI. #OpenAI #LanguageModels

0 0 1 0

"🤖💬 Are AI models like ChatGPT closer to human reasoning? A groundbreaking study reveals surprising language analysis skills that challenge our uniqueness! 🤯 What do you think? #AI #Linguistics #LanguageModels LINK"

0 0 0 0
Post image

New research shows that layering complex AI personas during fine‑tuning actually erodes meaning in benchmark prompts, and human judges are struggling to spot artificial origins. Curious? Dive into the details. #AIPersona #FineTuning #LanguageModels

🔗 aidailypost.com/news/researc...

1 0 0 0
Preview
Respectful or Toxic? Using Zero-Shot Learning with Language Models to Detect Hate Speech Flor Miriam Plaza-del-Arco, Debora Nozza, Dirk Hovy. The 7th Workshop on Online Abuse and Harms (WOAH). 2023.

#TBT #NLProc 'Respectful or Toxic?' by Plaza-del-Arco, @debora & @dirkhovy.bsky.social (2023) explores zero-shot learning for multilingual hate speech detection. Highlights prompt & model choice for accuracy. #AI #LanguageModels #HateSpeechDetection

2 2 0 0

Diffusion Language Models are Super Data Learners
Chao Du, Hang Yan et al.
Paper
Details
#DiffusionModels #DataEfficient #LanguageModels

0 0 0 0
Preview
AI Across Borders Podcast · Dr. Ayesha Khanna · AI Across Borders is a podcast that dives into the real stories behind global tech journeys. From Asia to Latin America, Europe to emerging markets, we uncover how individuals found their path in technology, what inspired them, and the impact they’re making. At the heart of every episode is the human - the innovator, policymaker, athlete, entrepreneur, and researcher - each navigating change, creativity, and uncertainty in their own way. AI is not the hero of our story, but a facilitator between who we are and the change we want to make.

Subscribe to the YouTube channel
 
Visit the website: https://f.mtr.cool/vzrmnryjtn
Listen on Spotify: https://f.mtr.cool/uriyamidqg
 
#AI #LanguageModels #UAE

0 0 0 0
Preview
AI Across Borders Podcast · Dr. Ayesha Khanna

Visit the website: https://f.mtr.cool/iwiojwrtkd
Listen on Spotify: https://f.mtr.cool/scecsykolo
 
#AI #LanguageModels #UAE

0 0 0 0
Preview
AI Across Borders Podcast · Dr. Ayesha Khanna

👉Subscribe to the YouTube channel

Visit: https://f.mtr.cool/okjvoggbbn
Listen on Spotify: https://f.mtr.cool/hgtnqexmuc

#AI #LanguageModels #UAE

0 0 0 0
Preview
Part 1: Tokenization, Building an LLM From Scratch in Rust Learn how to build a language model from scratch in Rust, starting with part 1 of 6: tokenization, BPE, and vocabulary trade-offs.

Part 1 of our 6-part series on building a language model is now live. Read Part 1: www.tag1.com/white-paper/part1-tokeni...

#TechCommunity #MachineLearning #LanguageModels #DeepLearning #OpenSource

2 0 0 0
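The series itself is in Rust, but the core byte-pair-encoding (BPE) idea from Part 1, repeatedly merging the most frequent adjacent symbol pair, fits in a short language-agnostic Python sketch:

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Toy BPE trainer: start from single characters and repeatedly merge
    the most frequent adjacent symbol pair into one new symbol."""
    corpus = Counter(tuple(w) for w in words)  # each word as a tuple of symbols
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in corpus.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # nothing left to merge
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = {}
        for word, freq in corpus.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])  # apply the merge
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] = merged.get(tuple(out), 0) + freq
        corpus = Counter(merged)
    return merges

print(bpe_train(["low", "lower", "lowest"], 2))  # → [('l', 'o'), ('lo', 'w')]
```

Real tokenizers add byte-level fallback, special tokens, and a vocabulary-size stopping criterion, which is where the vocabulary trade-offs the post mentions come in.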