
Posts by Sam Clark

Multi-dimensional Mortality (MDMx): Sex-Age-Specific Model Life Tables, Fitting, Prediction from Summary Mortality Indicators, and Forecasting
Demographers rely on a variety of tools and methods to work with mortality schedules - model life tables, fitting methods, summary-indicator prediction, and forecasting - largely developed independent...

There’s a long tradition of using the SVD matrix factorization to model and forecast mortality. I used the higher-order SVD to decompose a 4-way tensor of mortality and explore the possibilities that this gives us.

* arxiv.org/abs/2603.20518
* arxiv.org/abs/2603.24299
* samclark.shinyapps.io/mdmx/
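
For readers unfamiliar with the higher-order SVD mentioned in this post, here is a minimal NumPy sketch of the general technique. It is not the MDMx code from the linked papers: the axis labels (sex, age, year, country), the array sizes, the random data, and the helper names `unfold`, `mode_product`, `hosvd`, and `reconstruct` are all assumptions made for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize: bring axis `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(tensor, ranks=None):
    """Return (core, factors): one orthonormal factor matrix per axis."""
    if ranks is None:
        ranks = tensor.shape  # no truncation
    factors = []
    for mode, rank in enumerate(ranks):
        # Left singular vectors of each unfolding span that axis.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each basis
    return core, factors

def reconstruct(core, factors):
    """Rebuild the (approximate) tensor from the core and factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Hypothetical 4-way array (sex x age x year x country); data are random, not mortality rates.
rng = np.random.default_rng(0)
toy = rng.normal(size=(2, 6, 5, 4))

core, factors = hosvd(toy)
full_error = np.linalg.norm(toy - reconstruct(core, factors))

# Keeping fewer components per axis gives a compressed approximation.
small_core, small_factors = hosvd(toy, ranks=(2, 3, 3, 2))
```

With no truncation the reconstruction matches the input up to floating-point error; passing smaller `ranks` trades accuracy for a more compact core, which is the usual motivation for this kind of decomposition.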

2 weeks ago 3 1 0 0
An in-process SQL OLAP database management system
DuckDB is an in-process SQL OLAP database management system. Simple, feature-rich, fast & open source.

Use DuckDB for SQL: 

duckdb.org

1 year ago 1 0 0 0
Trump Administration Ends Global Health Research Program
The Demographic and Health Surveys were the only sources of reliable information in many countries on metrics such as mortality, nutrition and education.

RIP DHS. We expected this, but it’s still a shock.

www.nytimes.com/2025/02/26/h...

1 year ago 10 1 0 0

USAID's annual budget is about $50B. The annual US federal budget is about $7T. Eliminating USAID would save well under 1% of it (50/7,000 ≈ 0.7%), which is nothing in this context.

1 year ago 5 0 0 0
Causal inference methods for treatment effect estimation usually assume independent units. However, this assumption is often questionable because units may interact, resulting in spillover effects between them. We develop augmented inverse probability weighting (AIPW) for estimation and inference of the expected average treatment effect (EATE) with observational data from a single (social) network with spillover effects. In contrast to overall effects such as the global average treatment effect (GATE), the EATE measures, in expectation and on average over all units, how the outcome of a unit is causally affected by its own treatment, marginalizing over the spillover effects from other units. We develop cross-fitting theory with plugin machine learning to obtain a semiparametric treatment effect estimator that converges at the parametric rate and asymptotically follows a Gaussian distribution. The asymptotics are developed using the dependency graph rather than the network graph, which makes explicit that we allow for spillover effects beyond immediate neighbors in the network. We apply our AIPW method to the Swiss StudentLife Study data to investigate the effect of hours spent studying on exam performance accounting for the students' social network.

"Treatment Effect Estimation with Observational Network Data using Machine Learning"

Arxiv: arxiv.org/abs/2206.14591
#rstats code: github.com/corinne-rahe...

#stats

1 year ago 14 4 1 0
A red squirrel poses for a photo

Welcome to our Crib

1 year ago 9640 574 147 43
Post image

Huge congratulations to my Florida State Univ Population Center colleagues Mike McFarland and Matt Hauer (@drdemography.bsky.social) on this important and very newsworthy (!!) paper on the impact of leaded gasoline on US public health. #demography

acamh.onlinelibrary.wiley.com/doi/abs/10.1...

1 year ago 31 8 2 1

www.statnews.com/2023/03/13/m...

1 year ago 0 0 0 0

After I moved to Canada a couple of years ago I realized that I was no longer constantly running a massive stress routine in the background of my mind worrying about health care and guns. It was weirdly noticeable only when it stopped.

1 year ago 28181 3089 868 278
Graduate Program - Department of Demography
UC Berkeley Demography offers three graduate degree tracks independently and in conjunction with the department of Sociology. Ph.D. in Demography: The doctoral program is intended to p...

@ucberkeleyofficial.bsky.social is accepting applications for Fall 2025 for the PhD program in Demography AND the Graduate Group in Sociology & Demography. Seeking a diverse and strong cohort; applications DUE 12/17/2024.

Learn more about the program:
www.demog.berkeley.edu/graduate-pro...

1 year ago 28 21 1 2
Video

Falling #fertility across the world will lead to significant changes in countries' age pyramids. By 2100, when today's newborns are in their 70s, they (or their elders!) will be the largest age group in many countries.

#demography

#rstats code: github.com/schmert/bone...

1 year ago 16 7 1 0
Beware the myth: learning styles affect parents’, children’s, and teachers’ thinking about children’s academic potential - npj Science of Learning

Looks like it might be time to reiterate what psychologists have been screaming from the rooftops for years: learning styles, as presented to the general public, are a myth, and the myth damages students’ sense of efficacy.
www.nature.com/articles/s41...

1 year ago 66 33 2 2

Scientists, academics, researchers: We’re excited to share that @altmetric.com is now tracking mentions of your research on Bluesky! 🧪

1 year ago 29634 5020 458 280
How our team at Our World in Data became a global data source on COVID-19
Our small team made COVID-19 data clear, reliable, and accessible to a global audience. This is how it happened.

Saloni, Edouard, and Lucas wrote up the history of Our World in Data during the COVID pandemic.

It's about the impact we hoped to achieve and how it felt to us during that time.

ourworldindata.org/owid-covid-h...

1 year ago 85 24 2 1

Is there an equivalent graphic for water fluoridation and tooth decay?

1 year ago 5 3 0 0
Book outline

Over the past decade, embeddings — numerical representations of
machine learning features used as input to deep learning models — have
become a foundational data structure in industrial machine learning
systems. TF-IDF, PCA, and one-hot encoding have always been key tools
in machine learning systems as ways to compress and make sense of
large amounts of textual data. However, traditional approaches were
limited in the amount of context they could reason about with increasing
amounts of data. As the volume, velocity, and variety of data captured
by modern applications has exploded, creating approaches specifically
tailored to scale has become increasingly important.
Google’s Word2Vec paper made an important step in moving from
simple statistical representations to semantic meaning of words. The
subsequent rise of the Transformer architecture and transfer learning, as
well as the latest surge in generative methods has enabled the growth
of embeddings as a foundational machine learning data structure. This
survey paper aims to provide a deep dive into what embeddings are,
their history, and usage patterns in industry.

Cover image

Just realized BlueSky allows sharing valuable stuff cause it doesn't punish links. 🤩

Let's start with "What are embeddings" by @vickiboykis.com

The book is a great summary of embeddings, from history to modern approaches.

The best part: it's free.

Link: vickiboykis.com/what_are_emb...

1 year ago 651 101 22 6
Looking for Maintainers to Support First-Time Contributors
Announcing a Community Call and Coworking sessions to support first contributions

At @rOpenSci.hachyderm.io.ap.brid.gy we're pairing first-time code contributors with experienced maintainers. If you are an rOpenSci or other #RStats package author and want to help build the road for new contributors and get co-maintainers, sign up for co-working!

ropensci.org/blog/2024/10...

1 year ago 26 17 0 0
FSU board OKs removal of over 400 courses from general education offerings after review
“We’re living through an era of legislature-driven higher education reform,” FSU Provost Jim Clark said.

Important to understand that (1) political appointees, not the university administration, are doing this; (2) they're not cancelling courses, but removing them from the list of courses that satisfy breadth requirements (i.e. death by strangling rather than a knife to the back).

www.tallahassee.com/story/news/l...

1 year ago 27 15 1 0
Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said
Whisper is a popular transcription tool powered by artificial intelligence, but it has a major flaw. It makes things up that were never said.

AI for medical transcription - in this case Whisper sneaks in its own hallucinatory phrases
apnews.com/article/ai-a...

though i wish the AI did invent ‘hyperactivated antibiotics’ we are going to need them soon 😏

h/t @placentadoc.bsky.social

#MedSky

1 year ago 26 7 0 1
Guyana - Wikipedia

For the Thanksgiving break I will be in Guyana visiting one of our children who is working there for two years.

en.wikipedia.org/wiki/Guyana

1 year ago 2 0 0 0

Just set up an account for the openVA Team @openva.net where I will post things related to the group.

1 year ago 3 1 0 0
Post image

Backyard now!

1 year ago 5 0 0 0


CGD's very own starter pack... experts and staff former and present...

bsky.app/starter-pack...

1 year ago 22 14 1 2
Post image

Can anyone give lit tips for papers showing this qualitative age pattern of a mortality rate ratio (e.g. frail vs not, sick vs not, high SES vs low, in nursing home vs general pop, with disease vs without)?

1 year ago 8 3 2 1

SQL! SQL. JUST USE SQL

1 year ago 2036 117 187 38
Post image Post image

Some recent beautiful evenings

1 year ago 5 0 0 0
Post image

Reading Peter Turchin’s interesting and provocative books. This characterization of social science disciplines in ‘Ultrasociety’ is amusing:

2 years ago 1 0 0 0