Posts by Bioinformatics Advances
The framework is applied to five RNA velocity methods across mouse erythroid, pancreatic, and human brain development datasets, and the study also introduces a signal-to-random coherence score to guide method selection toward biologically meaningful fits.
This study introduces a replicate coherence framework for evaluating RNA velocity stability, using negative binomial count splitting to generate independent data replicates from scRNA-seq count matrices.
📊 Recently published in Bioinformatics Advances: "Quantifying stability via count splitting to guide model selection in RNA velocity analyses"
Full text at https://doi.org/10.1093/bioadv/vbag104
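For readers unfamiliar with count splitting, here is a minimal sketch of one common construction for negative binomial data: beta-binomial thinning. If a count is negative binomial with a known size parameter, drawing a Beta-distributed probability and thinning the count binomially yields two folds that are independent negative binomials. The function name `nb_count_split`, the toy data, and the assumption that the size parameter is known are illustrative only; the study's own procedure and parameter estimation may differ.

```python
import numpy as np

def nb_count_split(counts, size, eps=0.5, seed=0):
    """Split NB counts into two folds via beta-binomial thinning.

    counts: (cells, genes) integer matrix.
    size:   NB size parameter (inverse overdispersion), assumed known here.
    eps:    expected fraction of each count assigned to the first fold.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    size = np.broadcast_to(np.asarray(size, dtype=float), counts.shape)
    # One thinning probability per entry: p ~ Beta(eps * size, (1 - eps) * size).
    p = rng.beta(eps * size, (1.0 - eps) * size)
    # Each original count is divided between the two folds.
    first = rng.binomial(counts, p)
    return first, counts - first

# Toy data (hypothetical): 200 cells, 5 genes, NB size 2.
toy = np.random.default_rng(1).negative_binomial(n=2.0, p=0.2, size=(200, 5))
rep_a, rep_b = nb_count_split(toy, size=2.0)
```

The two folds always sum back to the original matrix, so each can be passed to a velocity method as a replicate and the resulting fits compared.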
💻 CORTADO is available at https://github.com/lodimk2/cortado-marker
An iterative refinement procedure then uses selected markers as input features for community detection, improving clustering accuracy across brain, immune, spatial, and cancer datasets.
CORTADO uses stochastic hill-climbing optimization to select scRNA-seq marker genes by jointly maximizing differential expression, minimizing cosine similarity between selected genes, and enforcing sparsity.
🧬 New in Bioinformatics Advances: "CORTADO: Hill climbing optimization for cell-type specific marker gene discovery and clustering accuracy improvement"
Read it at https://doi.org/10.1093/bioadv/vbag106
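The optimization idea in the CORTADO post can be illustrated with a small stochastic hill climber. This is a sketch under stated assumptions, not the CORTADO implementation: the objective below (mean differential expression, minus mean pairwise cosine similarity, minus a per-gene sparsity penalty), the weights `w_sim` and `w_sparse`, and the toy data are all hypothetical.

```python
import math
import random

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def score(selected, de, expr, w_sim=1.0, w_sparse=0.1):
    """Hypothetical objective: reward DE, penalize redundancy and set size."""
    genes = [g for g, on in enumerate(selected) if on]
    if not genes:
        return float("-inf")
    de_term = sum(de[g] for g in genes) / len(genes)
    pairs = [(a, b) for i, a in enumerate(genes) for b in genes[i + 1:]]
    sim_term = sum(cosine(expr[a], expr[b]) for a, b in pairs) / len(pairs) if pairs else 0.0
    return de_term - w_sim * sim_term - w_sparse * len(genes)

def hill_climb(de, expr, n_iter=500, seed=0):
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in de]
    best = score(current, de, expr)
    for _ in range(n_iter):
        # Propose flipping one randomly chosen gene in or out of the marker set.
        candidate = current.copy()
        g = rng.randrange(len(de))
        candidate[g] = not candidate[g]
        s = score(candidate, de, expr)
        if s > best:  # keep the move only if the objective improves
            current, best = candidate, s
    return current, best

# Toy inputs (hypothetical): per-gene DE scores and expression profiles.
de = [3.0, 2.9, 0.2, 2.5]
expr = [[1, 0, 1, 0], [1, 0, 1, 0.1], [0.5, 0.5, 0.5, 0.5], [0, 1, 0, 1]]
selected, best = hill_climb(de, expr)
```

Genes 0 and 1 have nearly identical profiles, so the similarity term discourages selecting both even though each has a high DE score.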
💻 Code for DOMUS is available at https://github.com/GalGilad/DOMUS
DOMUS is a hierarchical clustering framework that minimizes Dasgupta's cost by blending multiple structural views of a similarity matrix via surrogate-assisted optimization. Benchmarked on synthetic, classic, & scRNA-seq datasets, it consistently outperforms average linkage & beam search baselines.
🌳 Check out the latest in Bioinformatics Advances: "An optimization framework for hierarchical clustering"
Read it here: https://doi.org/10.1093/bioadv/vbag107
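Dasgupta's cost, the objective named in the DOMUS post, sums over every pair of items their similarity multiplied by the number of leaves under the pair's lowest common ancestor, so a good tree separates similar items as late (as low) as possible. The sketch below computes it for a binary merge tree. The merge-list encoding, where the cluster created at step k receives id n + k as in SciPy linkage matrices, and the toy similarity matrix are assumptions for illustration; this is not the DOMUS code.

```python
def dasgupta_cost(merges, sim):
    """Dasgupta's cost of a binary tree given as a list of (a, b) merges.

    Leaves are ids 0..n-1; the cluster created at step k gets id n + k.
    """
    n = len(sim)
    members = {i: [i] for i in range(n)}
    cost = 0.0
    for step, (a, b) in enumerate(merges):
        left, right = members.pop(a), members.pop(b)
        # Pairs split across this merge have it as their lowest common ancestor.
        cross = sum(sim[i][j] for i in left for j in right)
        cost += (len(left) + len(right)) * cross
        members[n + step] = left + right
    return cost

# Toy similarity matrix (hypothetical): items 0 and 1 are very similar.
sim = [[0.0, 1.0, 0.1],
       [1.0, 0.0, 0.2],
       [0.1, 0.2, 0.0]]
good_tree = [(0, 1), (3, 2)]  # merge the similar pair first
bad_tree = [(1, 2), (0, 3)]   # merge a dissimilar pair first

good_cost = dasgupta_cost(good_tree, sim)  # 2*1.0 + 3*(0.1 + 0.2), about 2.9
bad_cost = dasgupta_cost(bad_tree, sim)    # 2*0.2 + 3*(1.0 + 0.1), about 3.7
```

The tree that merges the most similar pair first has the lower cost, which is the behavior a cost-minimizing method is searching for.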
💻 Code to reproduce the results of this study is available at https://github.com/prob-ml/spice
SPICE uses a graph convolutional network to identify genes affected by cell-cell communication in spatial transcriptomics data. Cells are represented as graph nodes connected by spatial proximity, w/ response gene expression predicted from ligand & receptor inputs across varying neighborhood radii.
🧠 Out now in Bioinformatics Advances: "Graph convolutional networks for inferring cell-cell communication from spatial transcriptomics data"
Full text at https://doi.org/10.1093/bioadv/vbag101
Using ResFinder, PCA, MLST, and plasmid network analysis, the study identifies cross-source resistance gene sharing and population structure. Plasmid types IncFIA, IncI1, and IncFII showed the broadest cross-source connectivity, with the strongest links between human and livestock isolates.
This study characterized AMR gene distribution across 174 whole-genome sequences from human, livestock, fish, and environmental E. coli isolates in Tanzania and Kenya.
🦠 New research examines "One Health analysis of antimicrobial resistance in Escherichia coli from humans, animals, and the environment"
Read it at https://doi.org/10.1093/bioadv/vbag099
scGeno infers chromosome-level genotype states from scRNA-seq data using a categorical hidden Markov model. It resolves homozygous and heterozygous chromosomal segments by modeling sequential allelic expression ratios, including crossover breakpoints, in genetically mixed single-cell datasets.
🔬 Introducing "scGeno: A hidden Markov model approach to denoise chromosome-scale genotypes from single-cell data"
Full text at https://doi.org/10.1093/bioadv/vbag094
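To illustrate how a categorical hidden Markov model can denoise genotype calls along a chromosome, here is a minimal Viterbi decoding sketch. Everything specific in it is an assumption for illustration and not taken from scGeno: the three states (0 = homozygous A, 1 = heterozygous, 2 = homozygous B), the three observation categories (0 = only allele A reads, 1 = both alleles, 2 = only allele B reads), and all probabilities.

```python
import numpy as np

def viterbi(obs, start, trans, emit):
    """Most likely state path for a categorical HMM, computed in log space."""
    log_start, log_trans, log_emit = np.log(start), np.log(trans), np.log(emit)
    n_steps, n_states = len(obs), len(start)
    score = np.empty((n_steps, n_states))
    back = np.zeros((n_steps, n_states), dtype=int)
    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, n_steps):
        cand = score[t - 1][:, None] + log_trans  # cand[i, j]: move from i to j
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[:, obs[t]]
    # Trace the best path backwards from the best final state.
    path = [int(score[-1].argmax())]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical parameters: states rarely change between neighboring bins.
start = np.full(3, 1 / 3)
trans = np.array([[0.98, 0.01, 0.01],
                  [0.01, 0.98, 0.01],
                  [0.01, 0.01, 0.98]])
emit = np.array([[0.90, 0.08, 0.02],   # homozygous A
                 [0.30, 0.40, 0.30],   # heterozygous
                 [0.02, 0.08, 0.90]])  # homozygous B

# Toy observations along one chromosome, with one noisy bin at index 3.
obs = [0, 0, 0, 2, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2]
path = viterbi(obs, start, trans, emit)
```

With these toy settings the decoded path ignores the isolated noisy bin and places a single state change between bins 6 and 7, which is the kind of breakpoint a crossover would produce.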
Authors include: @helenekretzmer.bsky.social
🔍 Analysis of over 3.2 million prokaryotic genomes in NCBI reveals MAGs represent 91% of all available archaeal genomes, underscoring the domain's persistent cultivation challenges.
This review argues prokaryotic pangenomes are statistical models shaped by dataset quality & taxonomic resolution, not fixed biological entities. It covers pangenome fluidity, MAG-associated challenges, & the under-explored archaeal pangenome, with a look toward graph-based & AI-driven approaches.
🧬 New in Bioinformatics Advances: "The pangenome: A statistical model, not a fixed biological property"
Read it at https://doi.org/10.1093/bioadv/vbag069
Validation used synthetic data from rSNAPed alongside live U-2 OS cell imaging.
MicroLive is a Python-based GUI toolkit for quantifying live-cell single-molecule microscopy data. It integrates cell segmentation, single-particle detection and tracking, colocalization, and correlation analysis into one platform.
🔬 Introducing MicroLive in Bioinformatics Advances: "MicroLive: An image processing toolkit for quantifying live-cell single-molecule microscopy"
Full paper at https://doi.org/10.1093/bioadv/vbag095
Authors include: @brian-munsky.bsky.social
The model learns site-specific amino acid frequencies and pairwise joint frequencies, cross-validated by a Monte Carlo simulated annealing algorithm. Designed peptides achieved near-native binding affinities and structural confidence scores above 90, validated by DeepMHCII and AlphaFold3.