#Datasets

I'm a beginner documenting my data journey and this is the list I wish I had from day one. Full article link in the comments. #DataAnalytics #DataJourney #Lifelonglearner #Blackwomenintech #Medium #Free #Datasets


Finally Outshining the Random Baseline: A Simple and Effective Solution for Active Learning in 3D...

Carsten T. Lüth, Jeremias Traub, Kim-Celine Kahl et al.

Action editor: Jose Dolz

https://openreview.net/forum?id=UamXueEaYW

#dataset #datasets #segmentation


Research Paper (preprint) "Linking Global #Science #Funding to Research #Publications" arxiv.org/pdf/2603.24147 #publications #scholcomm #datasets #data #funders


New paper from us: "A dataset of insect sounds from 459 species for bioacoustic machine learning", published in Scientific Data, led by Marius Faiß https://doi.org/10.1038/s41597-026-07123-4 #bioacoustics #datasets

NIST Helps Fingerprint Examiners With New Data and Software Release
The new tools are an annotated collection of 10,000 fingerprints and a software program that can sort fingerprints according to their quality.

#crime #forensics #datasets #fingerprints #NIST #AI

'A NIST collection of 10,000 fingerprints has now been fully annotated with details that will help train both human fingerprint examiners and AI tools.'

www.nist.gov/news-events/...


Theoretically Understanding Data Reconstruction Leakage in Federated Learning

Binghui Zhang, Zifan Wang, Meng Pang, Yuan Hong, Binghui Wang

Action editor: Jinghui Chen

https://openreview.net/forum?id=1UfDXeYxwk

#federated #privacy #datasets

First versions of harmonised data now available | Infra4NextGen
As part of the Infra4NextGen project, harmonised datasets for each of the five themes have been published for the first time on the NextGen Harmonised Data Gateway. These initial versions include…

Harmonised #datasets for the five themes of the NextGenerationEU recovery plan are now available for download.

These files include #data from five major #surveys that has been #harmonised to make it as comparable as possible, even if the #question text and response scales differed.


New #J2C Certification:

Reasoning-Driven Synthetic Data Generation and Evaluation

Tim R. Davidson, Benoit Seguin, Enrico Bacis, Cesar Ilharco, Hamza Harkous

https://openreview.net/forum?id=NALsdGEPhB

#generate #annotators #datasets


⛰️🌍 Mountains are underrepresented in global #datasets, yet are critical for understanding #ClimateChange & its impacts.

Strengthening #observations in #OurChangingMountains is key. 🗝️

MRI contributed this perspective at last month's Global Climate Observing System #GCOS meeting.

📖👉️ buff.ly/3JMiBjv

★★★★★ Private B2B data broker
Verified business & consumer intelligence datasets: executive contacts, firmographics, and market insights. Transparent pricing, no quotes required.

Business & Consumer Intelligence You Won’t Find Anywhere Else

Structured datasets on companies, executives, consumers, and behavioral signals—ready for research, analysis, segmentation, or integration into your workflows.

mediumaxis.com

#datasets #intelligence #leadgeneration

DataSeer develops AI system to track dataset reuse - Research Information
Tool has ability to "systematically track data reuse giving a new lens on openness, research integrity, and downstream impact"

DataSeer develops AI system to track dataset reuse: www.researchinformation.info/news/datasee...

#Data #LLM #LargeLanguageModel #OpenScience #OpenAccess #OA #Datasets #Stratos #AI #ArtificialIntelligence #ResearchData #DataSeer #Grants #MJFF


On the Importance of Pretraining Data Alignment for Atomic Property Prediction

Yasir M. Ghunaim, Hasan Abed Al Kader Hammoud, Bernard Ghanem

Action editor: Changyou Chen

https://openreview.net/forum?id=jfD9BsrDTb

#dataset #datasets #inception


But large #datasets bring challenges:
• Bias in digital data sources
• Measurement validity issues
• Risks of overfitting models

Therefore, validation and replication are essential in CSS research.

Executive summary of the report on Spanish datasets in Zenodo

The report on #datasets from Spanish universities in #Zenodo, based on data from December 2025, has now been published. There are more datasets, but they are described in less detail. We must not let our guard down. University libraries need to do something about it. www.javima.info/ciencia-abie...
#CienciaAbierta

Eye-Tracking-While-Reading Datasets

👀 📣 To all users of eye-tracking-while-reading datasets: check out our comprehensive, filterable dataset overview!

Dataset overview: dili-lab.github.io/datasets.html

Preprint: arxiv.org/abs/2602.19598

Add or edit your dataset: www.cl.uzh.ch/en/research-...

#FAIR #eyetracking #datasets

Scientists warn fake research is spreading faster than real science
A sweeping new study from Northwestern University reveals that scientific fraud is no longer just the work of a few rogue researchers: it has evolved into a global, organized enterprise. By analyzing…

"By analyzing massive #datasets .. #researchers uncovered networks involving “paper mills,” brokers, and compromised journals that systematically produce and sell fake #research, authorship slots, and #citations.": buff.ly/YJ4bqBU

via sciencedaily
#science #MedSky #research #ResearchJournals

Why Austria? A Prime Telemarketing Goldmine
In today's fast-paced digital landscape, businesses are constantly seeking efficient ways to connect with high-value leads. For marketers ta...

Enter 100% verified active #AustriaWhatsApp #numberdata from trusted #WhatsAppDatabase companies. These premium #datasets offer a #gamechanging solution for #telemarketing and direct call marketing #campaigns, delivering unmatched accuracy and ROI.
buywhatsappdatabase247.blogspot.com/2026/03/aust...


The scryptIQ #machinelearning module covers both supervised and unsupervised learning methods, namely the classification and clustering of different #biological #datasets, including images.

scryptiq.ai

Science is more than papers

153M+ research outputs in the #OpenAIREGraph are linked to #datasets & #software
A growing web of connections allowing us to see how knowledge is built across publications, data & code, not just the final paper.
Explore connections
🔗 #GraphAPI shorturl.at/oRotk
🔗 #OpenAIRE EXPLORE shorturl.at/RIZoh


New #J2C Certification:

Probabilistic Pretraining for Improved Neural Regression

Boris N. Oreshkin, Shiv Kumar Tavker, Dmitry Efimov

https://openreview.net/forum?id=F6BTATGXaf

#datasets #tabpfn #regression


BGS' BritPits map shows the distribution of worked mineral commodities across the UK - tinyurl.com/5ydmtaf6

#Aspermont #BritishGeologicalSurvey #BritPits #MineralResources #MineralPlanningAuthority #Geology #Datasets


"From Reflection to Repair: A Scoping Review of Dataset Documentation Tools" (new preprint via arXiv) arxiv.org/abs/2602.15968 #data #datasets #rdm


Discussing AI in the sphere of geological modelling with respect to the tunnelling industry - tinyurl.com/54bxc7bs

#Aspermont #COWIfonden #UniversityofStrathclyde #TechnicalUniversityofDenmark #COWI #AI #Tunnelling #GroundInvestigation #DataSets #GeologicalModelling

Automatic classification of research data sets into the Chinese Library Classification with generative large language model
Purpose. Research data sets are typically distributed across different data repositories and lack standardized classification information, which hinders effective discovery and access. This study aims...

How can AI classify multilingual research datasets?

doi.org/10.1108/EL-0...

Why read? It shows a practical pipeline using a fine-tuned Qwen2 to assign CLC codes to multilingual datasets.
Next step: More detailed cross-language evaluation (authors).

#ShortReview #AI #LLM #Classification #Datasets


Industry holds some of the richest #ocean #datasets — yet only 3% reach global #biodiversity repositories (Tides of Transparency, 2024).
📺 Ocean Literacy Webinar 2
🗓️ 17 March 2026 | Online

Register now on our website! 🔗 tinyurl.com/3993rj9t


#agentarium
#intelligence_module
#cognitive_infrastructure
#vdb
#ai
#data
#datasets
#agenticai
#rag
#graphrag


Occam’s Razor for SSL: Memory-Efficient Parametric Instance Discrimination

Eric Gan, Patrik Reizinger, Alice Bizeul et al.

Action editor: Georgios Leontidis

https://openreview.net/forum?id=GFNTbsVFlP

#supervised #regularization #datasets


1) Do #datasets have #DOIs? How are #data cited?

"At Pensoft we can do it in 2 ways: authors can cite both Data Papers and/or #Dataset. We recommend to cite both, and this is in our opinion the right way to do that" - Prof. Penev.

#lovedata26

@lovedataweek.bsky.social


AllenAI Introduces #AutoDiscovery: Automated Scientific Discovery Now Available in Asta Labs allenai.org/blog/autodis... #AI #datasets #data @ai2.bsky.social #research

List of Ethical Requirements for the study "Co-Design of a Trustworthy AI-based Prognostic Tool for Predicting Patient Outcome in Acute Stroke"
The data was collected as part of the study “Co-Design of a Trustworthy AI-based Prognostic Tool for Predicting Patient Outcome in Acute Stroke.” It includes ethical requirements and the associated d...

List of Ethical Requirements for the study "Co-Design of a Trustworthy AI-based Prognostic Tool for Predicting Patient Outcome in Acute Stroke" zenodo.org/records/1848... #hvhebron #datasets #neuro
