🧬 Database Center for Life Science #BioHackathon2025 in Mie, Japan!
We tested Camber's Nova, the Science AI, on analyzing survey results, combining DBCLS datasets, and enabling natural language queries. Huge thanks to DBCLS!
#BioHackathon #Bioinformatics #LifeSciences #AI #Camber #DBCLS
Posts by David Steinberg
Want to make better use of ontologies? Check out on2vec, created at DBCLS #biohackathon 2025...
It turns ontologies into embeddings you can actually use in ML models. Perfect for biomedical data, knowledge graphs, etc. osf.io/preprints/bi... github.com/david4096/on...
Catch you at the next one I hope ;)
Hello from @ga4gh.org Plenary in Uppsala!
Announcing GIF Project: Cloud-based BRCA Exchange variant analysis environment using GA4GH standards in Camber. The project aims to adapt and extend community-driven standards to support interoperable workflows, variant annotation, and metadata description. Learn more: www.ga4gh.org/what-we-do/g...
Nice to meet you too!
Collected together @ga4gh.org Bluesky accounts here, lmk if you want to be added! go.bsky.app/8BDDMqM
Calling all @ga4gh.org Connect 2025 attendees online and in-person, let's connect here on bluesky! #ga4ghconnect2025 #ga4gh #bioinformatics #genomics
Grace Hopper could really get people laughing about information sciences and the struggles of working under strict hierarchies www.youtube.com/watch?v=si9i...
If you haven't caught up with the amazing new demos from @dynamicland.org now is your chance www.youtube.com/watch?v=Osn3...
At the @mlcommons.org Croissant community meeting with, you guessed it
The photo we saw reminded me immediately of some of the goals of @dynamicland.org as seen here dynamicland.org/2023/Improvi...
Another important direction is making immersive visual experiences that make data models accessible in a visual and humane way. I hope to experience this in person at a museum github.com/dbcls/dive
Toshiaki Katayama, original author of the wildly popular KEGG database, rounding out the keynotes @swat4hcls.bsky.social by showing us the past, present, and future of linked data in the life sciences. Lots of excitement for the possibilities of #graphgenome!!
Nice to see this one making the rounds @dockstore.org @ucscgenomics.bsky.social
Starter pack for #swat4hcls2025 conference go.bsky.app/PiZd2qR 🗣️ @swat4hcls.bsky.social
From Shervin Mehryar’s keynote: embedding knowledge graphs to compare ontologies using learned features
From Prof Anna Fensel’s keynote, a roundup of some of the connections between AI and semantic
One of the common themes of the conversations at #swat4hcls so far is that knowledge graphs are proving critical for the reliability and interpretability of AI, and of LLMs in particular
Excited to attend #SWAT4HCLS in Barcelona next week, representing @cambercloud.bsky.social ! 🎉
At the hackathon, we’ll explore #CroissantML for seamless dataset & model access via @hf.co and @kaggle.com 🤓
Check out our first preprint from #biohackathon Fukushima 2024 and expect more on this work 🤓 files.osf.io/v1/resources...
We found some low-hanging fruit for improvement and tested out bringing a bio dataset into Croissant. We think that continually increasing the use of ontologies and controlled vocabularies will be crucial for data harmonization and the new era of multimodal models!
We made a simple tool for converting CroissantML to #RDF so it could be analyzed using #SPARQL, and looked for differences in how it is used on Kaggle and Hugging Face github.com/david4096/cr...
It works by providing a controlled vocabulary for high-level dataset metadata as well as specific metadata for columnar data. That might seem like a small thing, but it is a huge step forward for bringing tools to data
@hf.co, @kaggle.com, OpenML, Dataverse and others are all implementing all or part of the CroissantML spec, which interoperates with tooling like TensorFlow so you can load datasets directly into your AI training code
Biology datasets tend to be messy, require domain knowledge to parse, and are not immediately usable for training AI models. That’s part of why I became interested in @mlcommons.org CroissantML as a way to bring ML tools to biology data. We’re presenting a poster on this effort at #swat4hcls next week!
This is a great opportunity to contribute —
@anthropic.com marked bioinformaticians as Office & Administrative for their job category 🧐 www.anthropic.com/news/the-ant...
gestures in @worrydream.com