
Posts by Gaurav Diwan

https://link.springer.com/article/10.1007/s00239-025-10277-1
Conceptual overview of hierarchical orthologous groups. An example of one HOG, or gene family. A Species tree with four taxa: plant (green), fish (blue), human (orange), and mouse (yellow), each with one or more genes. B The implied gene tree, dubbed “HOG tree,” and inferred nested HOG composition. Duplication nodes (red) can be deduced based on the species tree topology and clusters of homologous genes at each level. Ancestral genes from which the HOGs descended are shown in gray. C HOGs returned at different taxonomic levels. Consider a gene family that was present in the last eukaryotic common ancestor (LECA). At this level, a single HOG encompasses all genes descending from that ancestral gene. At the Vertebrata level, this gene underwent duplication, leading to two distinct copies, i.e., HOGs. At the Mammalia level, a second duplication further subdivides one of these HOGs, showing how deeper HOGs split into nested subHOGs at more recent levels. The HOG composition implies that a loss event occurred after the mammalian speciation
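The nesting described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not code from the review: the `Node` class, the helper names, the gene identifiers, and the exact tree shape are all hypothetical, chosen so that one family is a single HOG at LECA, two HOGs at Vertebrata, and three at Mammalia, with one mouse copy lost.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str   # "speciation", "duplication", or "gene"
    label: str  # taxon name for speciation nodes, gene id for genes
    children: list["Node"] = field(default_factory=list)


def gene(name):
    return Node("gene", name)


def spec(taxon, *children):
    return Node("speciation", taxon, list(children))


def dup(*children):
    return Node("duplication", "", list(children))


# Hypothetical HOG tree: duplications on the branches leading to
# Vertebrata and to Mammalia, plus one lost mouse copy.
tree = spec(
    "LECA",
    gene("plant_1"),
    dup(
        spec(
            "Vertebrata",
            gene("fish_1"),
            dup(
                spec("Mammalia", gene("human_1"), gene("mouse_1")),
                spec("Mammalia", gene("human_2")),  # mouse copy lost
            ),
        ),
        spec(
            "Vertebrata",
            gene("fish_2"),
            spec("Mammalia", gene("human_3"), gene("mouse_3")),
        ),
    ),
)


def leaves(node):
    """Collect the gene ids below a node."""
    if node.kind == "gene":
        return {node.label}
    found = set()
    for child in node.children:
        found |= leaves(child)
    return found


def hogs_at_level(node, level):
    """Return one gene set per speciation node labelled with `level`."""
    if node.kind == "speciation" and node.label == level:
        return [leaves(node)]
    hogs = []
    for child in node.children:
        hogs.extend(hogs_at_level(child, level))
    return hogs


for level in ("LECA", "Vertebrata", "Mammalia"):
    print(level, [sorted(h) for h in hogs_at_level(tree, level)])
```

Each duplication node above a taxonomic level adds one more HOG at that level, which is why the count grows from one to two to three as the level gets more recent.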


https://link.springer.com/article/10.1007/s00239-025-10272-6
Summary of the QfO8 meeting. a Hot topics and future directions in method development and applications within the QfO community, namely artificial intelligence, protein domains, protein structure, RNA and splicing isoforms. b Definition of orthology and paralogy, including various paralogous subtypes (e.g. in-paralogs and out-paralogs). c Duplications and functional divergence. d Applications of orthology


https://link.springer.com/article/10.1007/s00239-025-10271-7
Overview of the OrthoXML File Format (simplified). A schematic representation of an OrthoXML file, a standardized XML-based format for representing orthology data. OrthoXML follows a hierarchical structure where elements are enclosed within opening <tag> and closing </tag> tags. <orthoXML> is the root element enclosing other elements. The <species> element contains information about genes. An OrthoXML file can include a <taxonomy> element, which specifies the species tree used to generate the file. Additionally, the <groups> element encapsulates the orthology and paralogy relationships among genes
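A minimal sketch of reading such a file, using only Python's standard library rather than the OrthoXML-Tools package itself. The tiny document is invented for illustration (species entries, gene ids, protein names, and the group id are assumptions), and it assumes the OrthoXML 0.3 layout with the `http://orthoXML.org/2011/` namespace.

```python
import xml.etree.ElementTree as ET

NS = {"ox": "http://orthoXML.org/2011/"}

# Hypothetical example document: two human in-paralogs, one mouse ortholog.
DOC = """<orthoXML xmlns="http://orthoXML.org/2011/" version="0.3"
          origin="example" originVersion="1">
  <species name="Homo sapiens" NCBITaxId="9606">
    <database name="example_db" version="1">
      <genes>
        <gene id="1" protId="HUMAN_A"/>
        <gene id="2" protId="HUMAN_B"/>
      </genes>
    </database>
  </species>
  <species name="Mus musculus" NCBITaxId="10090">
    <database name="example_db" version="1">
      <genes>
        <gene id="3" protId="MOUSE_A"/>
      </genes>
    </database>
  </species>
  <groups>
    <orthologGroup id="HOG1">
      <geneRef id="3"/>
      <paralogGroup>
        <geneRef id="1"/>
        <geneRef id="2"/>
      </paralogGroup>
    </orthologGroup>
  </groups>
</orthoXML>"""

root = ET.fromstring(DOC)

# Map internal gene ids to protein identifiers declared under <species>.
gene_names = {
    g.get("id"): g.get("protId") for g in root.iterfind(".//ox:gene", NS)
}


def members(group):
    """List the protein ids referenced anywhere inside a group element."""
    return [gene_names[r.get("id")] for r in group.iterfind(".//ox:geneRef", NS)]


# Top-level groups are the direct <orthologGroup> children of <groups>.
top_groups = root.findall("ox:groups/ox:orthologGroup", NS)
for grp in top_groups:
    print(grp.get("id"), members(grp))
```

The `<geneRef>` elements only carry ids, so the lookup table built from the `<species>` blocks is what turns a group into readable gene names.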


Our trilogy of orthology publications is online!
Review on Hierarchical Orthologous Groups doi.org/10.1007/s00239-025-10277-1

OrthoXML-Tools doi.org/10.1007/s00239-025-10271-7

A great community effort on Quest for Orthologs in the era of Data Deluge and AI doi.org/10.1007/s00239-025-10272-6

5 months ago 19 10 1 0

- Placing new #Biodiversity #Genomics genomes (e.g.) in this framework gives context to which gene functions already exist in that clade and what is new/unknown for the new one

Big thanks to Rob Russell, Paschalis, JC, Mu-en, @maxjtelford.bsky.social @guigolab.bsky.social John Colbourne!

10/10

6 months ago 1 1 0 0

So take home messages:
- Overlaying function onto gene evolutionary history captures the specific biology of species without #morphology
- Phylogenies and annotations can both still be improved, but fusing them together as they are already gives deep #Evolution insights

9/10

6 months ago 0 0 1 0

And when we look inside the clusters, we can check the exact genes that have the analogous functions leading to the adaptation to the new environment. N.B. these genes are not from the same HOG. So, they're not related by sequence but by #function...

8/10

6 months ago 0 0 1 0
Comparison of the level of enrichment of clustered GO terms at the transition nodes versus their parent nodes. The genes inside each cluster were then checked for the source of the terms.


For the water-to-land nodes, the terms inside a GO cluster were:
- Bone development in tetrapods 🦴
- Open tracheal system in arthropods
- Seed/Leaf development in plants 🍀
These are all anatomical structures that organisms needed to survive on land.

Again... these were captured from gene annotations!
7/10

6 months ago 0 0 1 0

Next we analyzed the functions emerging at nodes that underwent large phenotypic transitions. In our tree, we had 3 nodes where species moved from water to land and 5 nodes where organisms became multicellular.

We enriched GO terms at transition nodes and clustered them by semantic similarity

6/10
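The thread does not say which enrichment test was used. A common choice for GO enrichment is a one-sided hypergeometric test, sketched here with the standard library only. The function name and the counts in the usage lines are made up for illustration.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k): chance of seeing at least k annotated genes.

    M = genes in the background, n = background genes carrying the GO term,
    N = genes gained at the node, k = gained genes carrying the GO term.
    """
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


# Hypothetical counts: 20 of 50 genes gained at a node carry a term that
# annotates 200 of 10,000 background genes.
p_value = hypergeom_sf(20, 10_000, 200, 50)
print(f"p = {p_value:.3g}")
```

With many GO terms tested per node, the resulting p-values would normally also need a multiple-testing correction before any term is called enriched.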

6 months ago 0 0 1 0
Above: GO enrichment of genes from HOGs gained at each node of the eukaryote --> tetrapod lineage. Below: average % of KEGG pathway categories gained at the same nodes.


Eukaryotes = splicing + cell cycle + transcription/translation
Metazoa = differentiation + embryo development + nervous systems
Vertebrates = immune response + digestive systems + circulatory systems

So we are recovering characteristic #traits from gene annotations!

5/10

6 months ago 2 0 1 0

Rather than focusing on exactly which genes were gained, and to avoid over-interpreting the ancestral reconstruction of our sets, we analysed the enriched GO terms and the pathway categories gained at these nodes. We found interesting specificity here
#genomebiology

4/10

6 months ago 0 0 1 0

Nodes with the largest HOG + pathway component gains were the ancestors of:
- All Eukaryotes
- Flowering plants 🌻
- Slime molds (!)
- Metazoans 🦐
- Vertebrates 🚶‍♂️
#EvoBio #evolbio

3/10

6 months ago 0 0 1 0

We sorted all genes into HOGs (at the root of the species tree) and traced the history of each HOG on the species tree. We also did the same for the orthologs of each gene.
We used:
Each HOG = one gene that duplicated/diverged
Each ortholog = KEGG/Reactome pathway component
#phylogenomics
2/10
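One simple way to place a gain on a species tree is to assign each HOG to the most recent common ancestor of the species that carry it. The sketch below does that on a toy tree. It is an assumption for illustration, not necessarily the reconstruction used in the preprint, and it ignores losses and horizontal transfer. The tree, the HOG names, and the species sets are hypothetical.

```python
from collections import Counter

# Toy species tree as child -> parent links (hypothetical).
PARENT = {
    "Plant": "LECA",
    "Vertebrata": "LECA",
    "Fish": "Vertebrata",
    "Mammalia": "Vertebrata",
    "Human": "Mammalia",
    "Mouse": "Mammalia",
}


def path_to_root(node):
    """Return the node followed by all of its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def gain_node(species_with_hog):
    """Most recent common ancestor of the species that carry a HOG."""
    species = list(species_with_hog)
    shared = set(path_to_root(species[0]))
    for sp in species[1:]:
        shared &= set(path_to_root(sp))
    # Walking upward from any one species, the first shared node is the MRCA.
    for node in path_to_root(species[0]):
        if node in shared:
            return node


# Hypothetical presence of three HOGs across species.
hog_species = {
    "HOG_A": {"Plant", "Fish", "Human", "Mouse"},
    "HOG_B": {"Fish", "Human"},
    "HOG_C": {"Human", "Mouse"},
}

gains_per_node = Counter(gain_node(sp) for sp in hog_species.values())
print(gains_per_node)
```

Counting the assigned nodes across all HOGs gives the kind of per-node gain tally shown on the cladogram.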

6 months ago 0 0 1 0
Number of genes, domains and pathways gained at every node of a cladogram of 508 species


Pleased to announce that our #preprint on the evolutionary history of gene functions is now online at #bioRxiv! We overlaid functional annotations on the evolutionary history of ~4.5M genes from 508 species across the tree of life and found some very cool stuff!

tinyurl.com/FuncEvol

🧵 1/10

6 months ago 32 13 2 1

Good morning people at #ESEB2025

Come check out my poster P03.068 today to learn about

- The evolutionary history of genes and gene functions from >500 species across the tree of life 🪾

- How that helps us reveal widespread parallelisms in the move from water to land and multicellularity 🧬

8 months ago 5 0 0 0

Fantastic day 1 @eseb2025.bsky.social ! Talks showing how incorrect orthology and hidden paralogy could affect phylogenetic tree structures; species tree inference from WGA; and how pre-LUCA protein sequences tell you the order in which amino acids emerged!
Super cool stuff #ESEB2025

8 months ago 2 0 1 0

Checked in at @eseb2025.bsky.social in Barcelona. Come check out my poster on Thursday if you want to know the evolutionary history of any gene or its function! #ESEB2025

8 months ago 7 1 1 0