Posts by Xan Gregg

Cool, thanks! Nice result: "visually display both outcome variability and inferential uncertainty".
I've been focused first on seeing outcome variability from an EDA POV (pre-inference), but I can see the median line as a proxy inferential anchor to align with their framing.

1 day ago 1 0 0 0
Panel of smooth trend lines for two groups of participants: friends and Prolific users. Each line is a different chart type labeled in legend.

I hesitate to say much with this little data, but the "dot plot with median" in the friends panel is close to ideal. A K-S distance of 0.27 has a p-value of 0.05 for this sample size, so the curve should rise pretty fast into the quite/very surprising territory. The low purple line is the plain dot plot.
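
For readers who want to see how a K-S distance relates to a p-value, here is a minimal sketch. The function names are hypothetical, and the p-value uses the large-sample Kolmogorov series, which is an assumption and not necessarily how the 0.05 figure above was computed.

```python
import math
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample K-S distance: the largest vertical gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # share of sample a that is <= x
        fb = bisect_right(b, x) / len(b)  # share of sample b that is <= x
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue_asymptotic(d, n, m, terms=100):
    """Approximate two-sided p-value for distance d with sample sizes n and m.

    Uses the asymptotic Kolmogorov series; it is rough for small samples.
    """
    if d <= 0:
        return 1.0
    lam = math.sqrt(n * m / (n + m)) * d
    series = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, terms + 1)
    )
    # Clamp because the truncated series can drift slightly outside [0, 1].
    return min(1.0, max(0.0, 2.0 * series))
```

Calling `ks_pvalue_asymptotic(0.27, n, m)` with the study's actual group sizes would show whether this approximation reproduces the quoted p-value.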

1 day ago 1 0 1 0
Panel of smooth trend lines for two groups of participants: friends and Prolific users. Each line is a different chart type (unlabeled). y = survey response on a 1-4 scale. x = statistical difference measure (Kolmogorov-Smirnov distance)

Another look at test-mode pairs-study results, comparing friend volunteers (you all) and Prolific workers. Each line is a different chart type—not a lot of difference there. I'm using Kolmogorov-Smirnov distance as a measure of surprise on the X. #dataviz
blog: rawdatastudies.com/2026/04/19/v...

1 day ago 2 0 1 0
Paired dot plot with mean lines and shaded confidence intervals. One group is volunteer testers and the other is Prolific workers with alignment score on the y axis. The volunteer testers have generally higher scores.

I decided to spring for a round of Prolific workers for my paired-chart study. Not surprisingly a drop in quality from the socials volunteers. I still need to do attention checks. Here's how responses aligned with a stat measure of diff.
Still collecting data: xangregg.github.io/pairstudy121...

1 week ago 2 0 0 0
Screenshot of comma, comma converter web app with buttons for Download Data and Download Metadata.

I added CSVW csvw.org support for metadata export in my RDS→CSV webapp. The out-of-the-box spec seems to be missing a lot of useful column properties like factor levels and ordering, but you have to start somewhere. xangregg.github.io/commacomma/
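
As a rough illustration of what a CSVW metadata export can look like, here is a minimal sketch. The function name `csvw_metadata` and the `levels` key are hypothetical and are not taken from the webapp. Encoding factor levels as a regular expression in the datatype's `format` is only one possible workaround, and it does not capture level ordering.

```python
import json
import re


def csvw_metadata(csv_url, columns):
    """Build a minimal CSVW metadata document for one CSV file.

    `columns` is a list of dicts with "name", "datatype", and an optional
    "levels" list (a hypothetical key used here to carry factor levels).
    """
    schema_columns = []
    for col in columns:
        datatype = col["datatype"]
        levels = col.get("levels")
        if levels:
            # Assumption: restrict a string column to its factor levels
            # with a regex; the order of the levels is not preserved.
            datatype = {
                "base": "string",
                "format": "|".join(re.escape(level) for level in levels),
            }
        schema_columns.append(
            {"name": col["name"], "titles": col["name"], "datatype": datatype}
        )
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,
        "tableSchema": {"columns": schema_columns},
    }


if __name__ == "__main__":
    example = csvw_metadata(
        "data.csv",
        [
            {"name": "dose", "datatype": "string", "levels": ["low", "high"]},
            {"name": "response", "datatype": "number"},
        ],
    )
    print(json.dumps(example, indent=2))
```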

1 week ago 1 0 0 0
Screenshot of a web app named "Comma,Comma" with just a single drop zone for a file upload, listing a few input file types.

Found a recent JavaScript parser for R data files at github.com/jackemcphers...; works well enough that I made a wrapper app for CSV conversion xangregg.github.io/commacomma/.

Now I no longer have to dig up my R environment to make the conversion when I find an RData file in the wild.

1 week ago 1 0 0 0
Estimated density of salaries for US General and Operations Managers. Median is 100k but mode is more like 70k.

That is, I assume they are median artifacts: this is the biggest occupation with a median of 100k, yet it doesn't have a spike there itself (based on your interpolations).

2 weeks ago 0 0 1 0
Interesting that the shape of your summary violin is different from the beeswarm-of-medians shape, with the latter having a spike around 100k (and I noticed more spikes when I did a proper beeswarm without d3-force). I assume the spikes are median artifacts rather than something smoothed out.

2 weeks ago 0 0 1 0

Very nice. I can easily imagine a post or Observable notebook of the "journey of the 5 quantiles to become a violin".

2 weeks ago 0 0 1 0

I didn't even see the three dots right away (in the corner of my big monitor). That's quite a write-up -- hope it's a full post somewhere.

2 weeks ago 1 0 1 0
screen capture of a chart while hovering over a kernel density plot and seeing a more annotated view with a sliding highlight region.

Very nice hover graphs. At one point I saw icons in the hover windows I could click on (a recycle-like icon for cycling through graph types), but I can't remember how I got them.

2 weeks ago 2 0 1 0

Bad day to post a new chart type -- my first thought was April Fools 🤣.

2 weeks ago 2 0 1 0

Each run has a different mix of chart variations (but same 4 basic types). In case that makes anyone want to try the survey multiple times, go for it!

3 weeks ago 1 0 0 0

Framing the question and responses has been the hardest part! Previous round was about "how much evidence of different sources". Trying to capture EDA experience of "is anything going on here?"

3 weeks ago 1 0 0 0

Funny, I did start out trying to make a lineup-based study. But it seemed too much to expect participants to scan all 20 (or even 12) images when making each choice.

3 weeks ago 1 0 0 0
Screenshot of one trial of the study, showing a pair of box plots with the question "How surprising would it be if samples A and B were from the same source?" 4 answer buttons are: not/slightly/quite/very surprising.

Screenshot of the last page of a completed study with a summary of the participant's responses and how they aligned with expectations.

I've updated my pairs study from initial feedback (thanks!): new question framing, rebalanced effects, phone-friendlier and now showing your results at the end. Testing round 2 now open: give it a try at xangregg.github.io/pairstudy121... #dataviz

3 weeks ago 10 2 0 3
With GitHub, at least, it happens without intention. If you ask Claude to commit a file for you, it appends "Co-Authored-By: Claude Sonnet" to the commit message and GitHub parses that into its contributors list. Found out the hard way with github.com/xangregg/pai...
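
To check a commit message for such trailers before pushing, a minimal sketch follows. The function name `co_authors` and the email address in the usage example are hypothetical, and the regex is only an illustration of the `Co-authored-by:` trailer convention, not GitHub's actual parser.

```python
import re

# Matches one trailer line such as "Co-Authored-By: Name <email>", case-insensitively.
TRAILER = re.compile(
    r"^co-authored-by:[ \t]*(\S.*?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def co_authors(message):
    """Return the co-author entries found in a commit message, in order."""
    return TRAILER.findall(message)


if __name__ == "__main__":
    # The address below is a placeholder, not a real one.
    msg = "Add jitter option\n\nCo-Authored-By: Claude Sonnet <noreply@example.com>\n"
    print(co_authors(msg))
```

Running something like this over `git log --format=%B` output is one way to spot unintended co-author credits.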

3 weeks ago 4 0 1 1

Thanks!

4 weeks ago 1 0 0 0

That's what I did! (not knowing JavaScript). All the code is at github.com/xangregg/pai.... There are several jitter methods being tested, but each run only uses one method.

4 weeks ago 1 0 0 0

Thanks! Each response should be recorded in case of early bailouts. I'm not sure this will turn into a real study, but regardless I'll at least post results to the corresponding GitHub site.

4 weeks ago 1 0 1 0

To clarify, the box vs violin finding was a side note of the presentation but not part of the paper. It was only from their internal testing. And, of course, my memory may be bad.

4 weeks ago 0 0 0 0

That's a question I'm especially interested in and I include bimodal effects in the study. Box plots still show a difference (wider box) with bimodal distributions; I remember a prez of the original Lineups paper where boxes did better than violins for bimodal. vita.had.co.nz/papers/infer...

4 weeks ago 0 0 1 0

You should be able to take it multiple times and get different chart variants each time. Current design details at github.com/xangregg/pai...

4 weeks ago 0 0 0 0
Four pairs of charts showing distributions as a teaser for the study content. Two box plots, two band plots, two dot plots and two violin plots.

Example page from study showing two box plots with overlaid dots with the question: How much evidence do these charts provide that A and B are genuinely different?
Four choices: No/Weak/Moderate/Strong Evidence

I've been vibe coding a #dataviz study aimed at detecting "Is this anything?" between two distribution views. I could keep tinkering forever, but it's ready for test feedback. Try it out (10 min) and send me your comments about the study or about the app.
xangregg.github.io/pairstudy121...

4 weeks ago 4 0 3 0
When Everyone Is Super What does coding assistants mean for computer science research?

New post: "When Everyone Is Super" on what coding agents mean for CS research.

Everyone got the same superpower. The hard parts of research (questions, methods, rigor) just became relatively more important. The easy part (code) just became relatively less so.
medium.com/@niklaselmqv...

1 month ago 6 3 0 0
Visualizations of Distributions and Uncertainty Provides primitives for visualizing distributions using ggplot2 that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as...

Thanks! I'm using JMP, but you might get something comparable in R with @mjskay.com 's ggdist/geom_dotsinterval. mjskay.github.io/ggdist/

1 month ago 3 0 0 0
#dataviz exercise: trying a smoothed dot plot version of the beeswarm.

1 month ago 16 0 1 0

Mosaic (marimekko) chart of all the responses to "Are you a large language model?", severely cleaned, by source of participants. #dataviz

1 month ago 1 0 0 0
Figure 1 from the linked paper. A chart showing correlations of 27 semantic antonym pairs across 6 sources, with confidence intervals. Mechanical Turk based sources generally have positively correlated antonyms.

My favorite free text response in this study's data:
Q: Are you a large language model?
A: No, I am not a large language model. I am a human who is here to assist you with any questions or information you may need.
www.researchgate.net/publication/...

1 month ago 1 0 0 1

Belated author tag @cskay.bsky.social

1 month ago 1 0 0 0