Gregor Sturm (@grst) Bsky

sorry, I meant min, not max

3 days ago 0 0 0 0

Thanks for the response!

The thing is that the original TCRdist uses max(4, 4-score) so it caps scores at 4.

With TCRblosum that would be max(2, 2-score) which would destroy all the strong signal that you have in e.g. the C residue.

So you'd suggest to use 2-score without the max instead?

3 days ago 0 0 2 0
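
A minimal sketch of the two options discussed in this thread, assuming the corrected form `min(cap, cap - score)` from the follow-up post (min, not max). The function names and the example scores are made up for illustration; they are not taken from scirpy, TCRdist, or the tcrBLOSUM paper.

```python
def capped_distance(score: int, cap: int = 4) -> int:
    """Mismatch distance with a cap: min(cap, cap - score)."""
    return min(cap, cap - score)


def uncapped_distance(score: int, offset: int = 2) -> int:
    """Mismatch distance without a cap: offset - score."""
    return offset - score


# Hypothetical substitution scores, for illustration only.
example_scores = [1, 0, -2, -5]

for score in example_scores:
    # With cap=2, every score <= 0 collapses to the same distance of 2,
    # while the uncapped variant keeps separating strongly negative scores.
    print(score, capped_distance(score, cap=2), uncapped_distance(score, offset=2))
```

With a cap of 2, all scores at or below zero map to the same distance, which is the loss of signal the post is worried about; dropping the cap keeps those scores distinguishable.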

Add tcrblosum support to TCRdist by felixpetschko · Pull Request #685 · scverse/scirpy So far, the TCRdist metric used a distance matrix derived from the blosum62 substitution matrix. This PR extends TCRdistDistanceCalculator with a new base_matrix="tcrblosum" option alongs...

See github.com/scverse/scir... for more details.

6 days ago 0 0 0 0

Hi @pmeysman.bsky.social,

we are trying to integrate tcrBLOSUM into scirpy, our library for scTCRseq analysis.

Specifically, we want to adapt the TCRdist algorithm to use tcrBLOSUM substitution values.

How did you turn the substitution matrix into a distance matrix in your paper?

6 days ago 0 0 2 0

There's another scverse conference this year and it will be amazing!

Register now: www.eventbrite.com/e/scverse-co...

7 months ago 2 1 0 0

AFAIK, these differences are minor, numeric differences. I would consider them equivalent.

8 months ago 3 0 1 0

Our benchmark + guidelines for atlas-level differential gene expression of single cells is online:

academic.oup.com/bib/article/...

Bottom line: Use pseudobulk + DESeq2 in simple and pseudobulk + DREAM in more complex settings.

Collab w/ @leonhafner.bsky.social @itisalist.bsky.social

8 months ago 15 6 1 0

Register now for the best conference of the year!

8 months ago 0 0 0 0

scverse conference 2025 Follow us on our channels to learn more details in the coming weeks

📣 Mark your calendars! The 2025 edition of the scverse conference will take place on 17-19 November at Stanford University (US) scverse.org/conference20...

Call for abstracts and registrations coming soon!

11 months ago 12 9 1 2

Release v0.5.0 · scverse/cookiecutter-scverse New template sync We re-implemented template sync from scratch instead of relying on cruft. This allows us to create real merge conflicts that show up as such on GitHub instead of .rej files. Gene...

Just released a new version of the @scverse.bsky.social cookiecutter template: github.com/scverse/cook...

Some highlights:
🔃 improved template sync (merge conflicts now show up as such)
🚀 use hatch as project manager
🔧 lots of fixes and documentation updates

1 year ago 4 2 0 0

Rogue Scholar

rogue-scholar.org

1 year ago 1 1 0 0

Nice post!
How did you generate the doi-link for a blog post?

1 year ago 0 0 1 0

LEMUR simplified | const-ae A simplified implementation of the LEMUR algorithm.

Blog post by @const-ae.bsky.social with a simple explanation of the manifold regression algorithm & code that underlies our paper “Analysis of multi-condition single-cell data with latent embedding multivariate regression” (doi.org/10.1002/eji....).

const-ae.name/post/2025-01...

1 year ago 26 5 1 0

Working with >1M cells Scirpy scales to millions of cells on a single workstation. This page is a work-in-progress collection with advice on how to work with large datasets. Distance metrics: Computing pairwise sequence dist...

Just released scirpy v0.21 -- Now with GPU Support for Hamming sequence distance and a brand new tutorial for working with scTCR datasets >1M cells: scirpy.scverse.org/en/latest/tu...
@scverse.bsky.social

1 year ago 2 0 0 0

Release notes Version 1.11: 1.11.0 2025-02-14: Release candidates: rc2 2025-01-24, rc1 2024-12-20. Features: rc1 sample() supports both upsampling and downsampling of observations and variables. subsample() is n...

🎉 Scanpy 1.11.0 is out! 🎉 just after reaching 2000 stars on GitHub!

- sc.pp.sample replaces subsample with many new features
- Sparse Dask support for PCA
- session-info2 package for more reproducible notebooks

See the release notes:

1 year ago 49 19 1 1

Been looking forward to this talk since @alexpeltzer.bsky.social told me about DSO in October!

1 year ago 4 1 0 0

GitHub - Boehringer-Ingelheim/dso: Data Science Operations (dso) command line tool Data Science Operations (dso) command line tool. Contribute to Boehringer-Ingelheim/dso development by creating an account on GitHub.

I'd like to share DSO, a command line helper to build reproducible data science projects with ease.

It is an opinionated way to organize data science projects, built around data version control (DVC).

github.com/Boehringer-I...

1 year ago 12 4 1 0

We try to avoid that by using this with preprocessed data only. All the heavy lifting is done with nextflow pipelines before. Datasets up to tens of GBs have worked well so far.

1 year ago 2 0 0 0

Finally, many thanks to my colleagues @alexpeltzer.bsky.social, Daniel Schreyer and Tom Schwarzl for testing, adopting, and contributing to DSO.

1 year ago 1 0 1 0

Bytesize: data science operations (DSO) Gregor Sturm, Boehringer Ingelheim

If you want to learn more, I'll be presenting this at a @nf-co.re bytesize talk: nf-co.re/events/2025/...

1 year ago 5 2 1 1

We built this at @boehringerglobal.bsky.social to meet the quality standards required for biomarker analysis in clinical trials.

But I think this is useful for any kind of data analysis project.

1 year ago 1 0 1 0

An exemplary PCA plot with a "preliminary" watermark.

One of my favorite features: automated watermarking of all plots in a quarto report. Nobody gonna publish my plots anymore before I think they are ready.

1 year ago 3 0 1 0

It brings together the best tools:
- git, for code versioning
- dvc, for data versioning and tracking inputs and outputs
- jinja2, for templates
- uv, for Python dep mgmt
- quarto, for authoring reports
- hiyapyco, for hierarchical YAML config
- pre-commit, for linting

1 year ago 1 0 1 0

We (Chen Zhan!) just launched #sccomp for #Python!

Testing for differences in cell-type proportion in #singlecell #spatial data?

#sccomp is a mixed-effect Bayesian model
- Uses a sum-constrained Beta-Binomial distribution
- Detects outliers
- Removes unwanted effects

github.com/MangiolaLabo...

1 year ago 12 3 1 0

GitHub - icbi-lab/luca: Single-cell Lung Cancer Atlas with 1.2M cells Single-cell Lung Cancer Atlas with 1.2M cells. Contribute to icbi-lab/luca development by creating an account on GitHub.

(2) Finding the mistake, tracing it back to its origin, and fixing it was only possible because the data and scripts for building the atlas are publicly available and fully reproducible. github.com/icbi-lab/luca

1 year ago 2 0 0 0

(1) Maintaining a data resource is very much like maintaining software. It is never "done" but constantly improving.

1 year ago 2 0 1 0

Cellxgene Data Portal Find, download, and visually explore curated and standardized single cell datasets.

Two years after publication of our single-cell lung cancer atlas, a user found a mistake in the annotation of the EGFR-status of some patients. We fixed the issue and the atlas is now updated on cell-x-gene: cellxgene.cziscience.com/collections/...

What are the takeaways from that? (1/3)

1 year ago 7 3 2 0

Scverse x Owkin Hackathon in Paris We're pleased to announce the next Scverse Hackathon will take place in the Owkin offices in Paris from 17/03/2025 9am to 19/03/2025 1:30pm. This hackathon is a joint initiative between the scverse c...

I am stoked about our upcoming @scverse.bsky.social and @owkin.bsky.social hackathon, focused on spatial omics data analysis.
📅 March 17-19, 2025
📍 Owkin office, Paris

Apply now: docs.google.com/forms/d/e/1F...

1 year ago 10 8 0 1

Posts by Gregor Sturm