Maggie Steiner (@maggiesteiner) Bsky

Excited for our publication on how the geographic scale of a sample affects the discovery of rare, deleterious variants to be out this week. With a mix of theory, simulation, and data analysis, we show that the number of variants discovered and their frequencies change depending on whether samples are narrow or broad.

10 months ago 70 27 2 1

Thank you!

10 months ago 0 0 0 0

Out today in @pnas.org! www.pnas.org/doi/10.1073/...

10 months ago 28 16 1 0

Specificity, length, and luck: How genes are prioritized by rare and common variant association studies Standard genome-wide association studies (GWAS) and rare variant burden tests are essential tools for identifying trait-relevant genes. Although these methods are conceptually similar, we show by anal...

What do GWAS and rare variant burden tests discover, and why?

Do these studies find the most IMPORTANT genes? If not, how DO they rank genes?

Here we present a surprising result: these studies actually test for SPECIFICITY! A 🧵on what this means... (🧪🧬)

www.biorxiv.org/content/10.1...

1 year ago 208 95 4 8

Hi! Could I please be added? Thanks for setting this up!

1 year ago 0 0 0 0

I just figured out how to use feeds! So, sharing this with #popgen 🧪

1 year ago 9 1 0 0

Thanks Erik!

1 year ago 0 0 0 0

Thanks to co-lead Dan Rice & co-authors @aabiddanda.bsky.social, Marida Ianni-Ravn, and Chris Porras!

1 year ago 0 0 0 0

Overall - while our theoretical model is no doubt a simplification of the complex dispersal/evolutionary processes seen in natural populations, especially humans - we hope that this work will help improve our interpretation of existing genetic studies and provide guidance for the design of new ones.

1 year ago 2 0 1 0

Our results have implications for several applications of genetic data. Power to detect trait/disease associations (e.g., GWAS) is tied to allele frequency. The SFS is also used for inference of the distribution of fitness effects, which our results suggest may be biased by effects of study design.

1 year ago 1 1 1 0

However, when it comes to avg. allele frequency across all sites (incl. monomorphic ones) these effects can cancel - in our theoretical model we see unchanging avg. allele frequency with sampling design. In human data we see this for fine scale samples (within the UK) but not for broader samples.
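One way to see why the two effects can cancel, written as a short identity. This is a sketch under stated assumptions rather than the paper's derivation: the symbols are introduced here for illustration ($L$ total sites, $S$ variant sites, $x_i$ the sample frequency at site $i$, with $x_i = 0$ at monomorphic sites), and superscripts $B$ and $N$ mark a broad and a narrow sample scored over the same $L$ sites.

```latex
\bar{x}_{\mathrm{all}}
  = \frac{1}{L}\sum_{i=1}^{L} x_i
  = \frac{S}{L}\cdot\frac{1}{S}\sum_{i:\,x_i>0} x_i
  = \frac{S}{L}\,\bar{x}_{\mathrm{var}},
\qquad
\frac{\bar{x}^{B}_{\mathrm{all}}}{\bar{x}^{N}_{\mathrm{all}}}
  = \frac{S^{B}}{S^{N}}\cdot\frac{\bar{x}^{B}_{\mathrm{var}}}{\bar{x}^{N}_{\mathrm{var}}}
```

The average over all sites is unchanged exactly when the gain in variant sites (discovery) offsets the drop in frequency at those sites (dilution). As a rough illustration only, plugging in the UK Biobank figures quoted in this thread (~98% more variant sites, ~41% lower frequency) gives 1.98 × 0.59 ≈ 1.17, which would be in line with the effects not fully cancelling for the broadest samples.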

1 year ago 0 0 1 0

We find evidence of these effects in re-sampling experiments using the UK Biobank. For example, our broadest re-sample with n=10,000 discovers ~98% more variant LoF sites than our narrowest sample, but allele frequency at those variant sites is on average ~41% lower.

1 year ago 0 0 1 0

Broad samples will sample a greater number of rare, deleterious variants than narrow samples (we call this “discovery”), but each will be sampled at lower average frequency (we call this “dilution”). These effects lead to substantial changes in some summary statistics, especially for large samples.
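A toy illustration of discovery and dilution, assuming a deliberately simplified setup that is not taken from the paper: two hypothetical demes (`deme_a`, `deme_b`), each carrying one private variant, with haplotypes coded 0/1. The helper `summarize` is a name made up for this sketch.

```python
import numpy as np

def summarize(sample):
    """Return (# variant sites, mean freq at variant sites, mean freq over all sites)."""
    freqs = np.asarray(sample).mean(axis=0)  # per-site allele frequency in the sample
    variant = freqs > 0                      # sites where the variant was sampled at all
    return int(variant.sum()), float(freqs[variant].mean()), float(freqs.mean())

# Hypothetical demes: rows are haplotypes, columns are sites.
# Site 0 is private to deme A, site 1 is private to deme B.
deme_a = np.array([[1, 0], [0, 0], [1, 0], [0, 0]])
deme_b = np.array([[0, 1], [0, 0], [0, 1], [0, 0]])

narrow = deme_a                              # all 4 haplotypes from one deme
broad = np.vstack([deme_a[:2], deme_b[:2]])  # 2 haplotypes from each deme

narrow_stats = summarize(narrow)
broad_stats = summarize(broad)
print("narrow:", narrow_stats)
print("broad: ", broad_stats)
```

In this toy case the broad sample finds both variant sites instead of one (discovery), each at half the frequency (dilution), while the average over all sites stays the same.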

1 year ago 1 0 1 0

We develop a model for the evolution of carriers of rare deleterious variants, and use it to approximate the site frequency spectrum (SFS, the distribution of allele frequencies) in samples at various scales of geographic breadth. We find several key patterns as samples go from “narrow” to “broad”.
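For readers new to the SFS, here is a minimal sketch of how it can be tabulated from data. It assumes a 0/1 haplotype matrix (rows are sampled haplotypes, columns are sites); the function name and toy matrix are made up for illustration, and this is the empirical tabulation, not the theoretical approximation developed in the paper.

```python
import numpy as np

def site_frequency_spectrum(genotypes):
    """Tabulate the SFS from an (n_haplotypes, n_sites) array of 0/1 allele indicators."""
    g = np.asarray(genotypes)
    n = g.shape[0]
    counts = g.sum(axis=0)  # number of sampled haplotypes carrying the variant, per site
    # sfs[k] = number of sites where exactly k of the n haplotypes carry the variant
    return np.bincount(counts, minlength=n + 1)

# Hypothetical toy data: 4 haplotypes, 5 sites.
toy = np.array([
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
])
sfs = site_frequency_spectrum(toy)
print(sfs)
```

Entry 0 counts monomorphic sites, entry 1 counts singletons, and so on up to sites where every sampled haplotype carries the variant.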

1 year ago 2 0 1 0

We focus on rare, deleterious variants, which are expected to cluster in geographic space. Rare variants are also generally of interest since they tend to have large effects on traits (including disease traits), and can help improve understanding of biological mechanisms.

1 year ago 2 0 1 0

In particular, we are interested in geographic breadth, or how broad a region individuals are sampled across. This is important to current discourse in human genetics surrounding the Euro-centric bias of genetic datasets, and the launch of new biobanks to improve representation globally.

1 year ago 3 0 1 0

Excited to share a new preprint with @jnovembre.bsky.social! We use a combination of population genetic theory, simulation, and data analysis to ask: how does study design in genetic studies (including biobanks) impact the discovery of rare, deleterious variants?

1 year ago 74 30 2 5

Posts by Maggie Steiner