Every day, but especially on Rare Disease Day 2025, the Monarch Initiative celebrates our broad community working together to tackle the challenges of identifying and treating rare diseases. Check out our #RareDiseaseDay blog: monarchinit.medium.com/rare-apart-s...
Posts by Nico Matentzoglu
For #RareDiseaseDay we are excited to amplify work from Monarch scientists, published yesterday in Nature, on rare disease-gene association discovery. This work led to the discovery of 69 novel gene-disease associations! 🧬 Amazing! Check out the article: www.nature.com/articles/s41...
That’s why we tend to use RDF, sometimes OWL, for this purpose. In LPG afaik you have to do some hacks like representing literals as nodes, or dumping JSON as property values to achieve the same. Are there other solutions?
How do folks who rely purely on LPG in their integration layer add node property provenance?
In our world of biomedical knowledge graphs and ontologies it is essential to be able to provide provenance/attribution for node properties (rather than edge properties), eg “who has curated that synonym”?
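A minimal sketch of the two LPG workarounds mentioned in this thread, using plain Python dicts rather than any particular graph database. The identifiers, labels, and property names (`EX:0001`, `HAS_SYNONYM`, `curated_by`, `synonyms_json`) are hypothetical and only serve to illustrate the shape of each approach.

```python
import json


def synonym_as_node(entity_id, synonym, curator):
    """Workaround 1: promote the literal to its own node so the edge can carry provenance."""
    syn_id = f"{entity_id}#syn:{synonym}"  # hypothetical ID scheme for the literal node
    nodes = [
        {"id": entity_id, "labels": ["Entity"]},
        {"id": syn_id, "labels": ["Synonym"], "value": synonym},
    ]
    edges = [
        {"source": entity_id, "target": syn_id,
         "type": "HAS_SYNONYM", "curated_by": curator},
    ]
    return nodes, edges


def synonym_as_json_property(entity_id, synonym, curator):
    """Workaround 2: keep one node and serialise value plus provenance as a JSON string."""
    payload = json.dumps([{"value": synonym, "curated_by": curator}], sort_keys=True)
    return {"id": entity_id, "labels": ["Entity"], "synonyms_json": payload}


nodes, edges = synonym_as_node("EX:0001", "example synonym", "orcid:0000-0000-0000-0000")
node = synonym_as_json_property("EX:0001", "example synonym", "orcid:0000-0000-0000-0000")
```

In the first variant the provenance is queryable as an edge property but the graph gains one node per literal; in the second the graph stays small but the provenance is opaque to the query language until the JSON is parsed.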
Side project: An SSSOM validator, deployed as a Streamlit app: sss-om-validate.streamlit.app
It is one big hack, but may be useful for some that use SSSOM to standardise entity mappings.
There are issues, please report:
github.com/mapping-comm...
Brought to you by @monarchinitiative.bsky.social
...there are times when we need to sacrifice, o sacrilege, ontological rigour for practical solutions so we can do good in this world.. It needs it.
.. but truth is, we can't really agree fully either. What is your take? I personally believe that discussions like github.com/OBOFoundry/C... (which is certainly what the "What is a phenotype?" discussion would devolve into) lead nowhere; and that yes..
Yesterday we had another fun debate at an @obofoundry.bsky.social Operations Meeting about what a phenotype is, and how it is different from related concepts like "biomarkers", "disease" etc. Here is a take of (a subset of) @monarchinitiative.bsky.social: obophenotype.github.io/upheno/refer....
Fringe use case for SSSOM, not for the world :P of course that’s a hugely important use case! Thanks for chatting.
Global biodata resources are at risk - funding models are fragile and unsustainable. GBC's open letter campaign needs your support to call for solutions. Science leaders have already signed the letter, please join them. Click here for more: globalbiodata.org/open-letter-...
Furthermore, we propose specific modeling solutions for three different categories of entities:
Thank you @piermonn.bsky.social! SSSOM is pretty agnostic to the entity type. When we present about SSSOM (mapping-commons.github.io/sssom/presen...), we usually show the picture below to give a sense of what "things" (symbols) can constitute a mappable entity in the sense of SSSOM..
Mappings are the glue of the Web of Research Data, enabling integration of information critical to the global coordination of humanity's most pressing issues such as #climate and #raredisease. Join the RDA Working Group to help promote reusable mappings! www.rd-alliance.org/groups/fair-...
... , I think yours may be a fringe use case. The point of SSSOM is not primarily to empower your project's use cases, but to share your mappings so that they are re-usable for others, FAIR style. Still, thank you for sharing your cases and giving me an opportunity to learn about them!
Yes, I can see now; While you could express the joining decision in SSSOM like
subject_id: my:table1.col1
predicate_id: skos:exactMatch
object_id: my:table2.col1
mapping_justification: semapv:ManualMappingCuration
(indeed some people do when they are doing data model mappings)...
This is not the core use case of SSSOM, but if you are up for it we can create an example to see what it would look like. Can you share a gsheet/tsv that contains a column mapping like the one you describe?
… you could use SSSOM to justify the matching (resolution decisions). I assume your problem also involves transforming data, like concatenation of names etc?
Just to be clear - in your scenario SSSOM only covers the entity resolution part, and only if there is a sane way to identify an individual (I don’t work much with relational databases but I guess this would be the primary key). So if two tables in your lake would refer to the same person….
In scientific contexts where transparency is paramount, avoid just “merging” data using some fuzzy algorithmic approach. Export your matching decisions as an SSSOM-style mapping set so your consumers can see why you merged, and contest the decision. mapping-commons.github.io/sssom/
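A minimal sketch of what exporting matching decisions could look like, using only the Python standard library. It writes a tab-separated table with a commented metadata header in the style of SSSOM/TSV. The function name `export_mappings`, the `my:` prefix, the example IDs, and the metadata values are hypothetical; the output is simplified and not checked against the SSSOM specification, so treat it as an illustration rather than a conforming file.

```python
import csv
import io

# Column names follow the SSSOM slots used elsewhere in this thread, plus "confidence".
COLUMNS = ["subject_id", "predicate_id", "object_id",
           "mapping_justification", "confidence"]


def export_mappings(decisions, curie_map, mapping_set_id, license_url):
    """Serialise matching decisions as a TSV table with a '#'-commented metadata header."""
    buf = io.StringIO()
    buf.write("# curie_map:\n")
    for prefix, uri in sorted(curie_map.items()):
        buf.write(f"#   {prefix}: {uri}\n")
    buf.write(f"# mapping_set_id: {mapping_set_id}\n")
    buf.write(f"# license: {license_url}\n")
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for decision in decisions:
        writer.writerow(decision)
    return buf.getvalue()


# Hypothetical decisions: one made by a curator, one by a lexical matcher.
decisions = [
    {"subject_id": "my:table1.row7", "predicate_id": "skos:exactMatch",
     "object_id": "my:table2.row3",
     "mapping_justification": "semapv:ManualMappingCuration", "confidence": 1.0},
    {"subject_id": "my:table1.row9", "predicate_id": "skos:closeMatch",
     "object_id": "my:table2.row4",
     "mapping_justification": "semapv:LexicalMatching", "confidence": 0.7},
]
tsv = export_mappings(
    decisions,
    curie_map={"my": "https://example.org/my/"},
    mapping_set_id="https://example.org/mappings/demo.sssom.tsv",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
)
print(tsv)
```

Keeping the justification and confidence next to each merged pair is what lets a consumer see which merges were curated by hand and which came from a fuzzy matcher.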