Posts by Bryce

Fed up with your world? Visit ours. Clarkesworld robot looks to the right where the cover of issue 235 is located. It features a tall spacecraft floating over a dockyard in a pink and purple sky.

Friday is a great day to subscribe.
clarkesworldmagazine.com/subscribe/
(Ok, so any day is good, but why wait?)

1 day ago 139 44 1 0
Portland Apache DataFusion Meetup · Luma Join us for an evening of talks, community discussions about Apache DataFusion and its growing role in modern data infrastructure. This meetup will spotlight…

If you are coming to tokio.conf or are in the Portland area on April 22, we are holding a DataFusion meetup at OHSU. Thanks to Mustafa Akur for helping arrange the details.

Signup here luma.com/dsp3ud82

3 days ago 2 1 0 0

Oh no. NSF Long Term Ecological Research (LTER) program shuttered.

This one leaves me winded. A good story from 2025 by @seattletimes-rss.bsky.social offers insight into what we are losing: www.seattletimes.com/seattle-news...

5 days ago 445 235 6 15
We are holding an @ApacheDataFusio meetup on Wed April 22 in Portland: luma.com/dsp3ud82

Yes, this is a day before the meetup in Seattle/Bellevue on April 23 luma.com/hxshbp0m

1 week ago 2 1 0 0
We built an ADBC interface for COBOL Modern database connectivity meets 1959. It fits better than it should.

We heard you: ADBC is great, but you need it on mainframes.

Introducing ADBC for COBOL. The same connectors that work with Python, Rust, and Go now work with a language originally designed for punch cards.

columnar.tech/blog/adbc-co...

1 week ago 8 4 2 0
PDX

The latest wallpaper from @lmnt.me is perfect. www.lmnt.me/blog/wallpap...

2 weeks ago 4 2 0 0
Data Organization in Spreadsheets
Karl W. Broman & Kara H. Woo
Pages 2-10 | Received 01 Jun 2017, Accepted author version posted online: 29 Sep 2017, Published online: 24 Apr 2018

    1. Introduction
    2. Be Consistent
    3. Choose Good Names for Things
    4. Write Dates as YYYY-MM-DD
    5. No Empty Cells
    6. Put Just One Thing in a Cell
    7. Make it a Rectangle
    8. Create a Data Dictionary
    9. No Calculations in the Raw Data Files
    10. Do Not Use Font Color or Highlighting as Data
    11. Make Backups
    12. Use Data Validation to Avoid Errors
    13. Save the Data in Plain Text Files

ABSTRACT

Spreadsheets are widely used software tools for data entry, storage, analysis, and visualization. Focusing on the data entry and storage aspects, this article offers practical recommendations for organizing spreadsheet data to reduce errors and ease later analyses. The basic principles are: be consistent, write dates like YYYY-MM-DD, do not leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, do not include calculations in the raw data files, do not use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plain text files.

Every day is a good day for sharing one of the most useful papers about research data ever written. PLEASE get your people to understand and follow this advice.

www.tandfonline.com/doi/full/10....

3 weeks ago 1049 404 31 48

Finally, an LLM bench that makes any sense to me

4 weeks ago 1 0 0 0

What makes more sense?

4 weeks ago 0 0 0 0
Arctos Alliance - Open Collective Support the committers powering the world’s open data infrastructure.

Arctos Alliance (arctosalliance.org) is now officially launched!

"If your organization depends on Arrow or Parquet or you’re interested in helping sustain these critical data technologies, we would welcome a conversation about how to get involved."

1 month ago 4 2 0 0
Besieged As the last digit on the calendar rolled over from five to six, it took less than a month to realize the coming year was going to be different than the year that preceded it. Arguably the stage was se...

This was an extremely rational take on where we're at with AI, and I appreciated its view of the whole SaaS-is-dead thing: redmonk.com/sogrady/2026...

1 month ago 2 0 0 0
Post image

We're chuffed to announce Apache SedonaDB 0.3.0! This release features a rewritten join that supports larger-than-memory spatial/KNN joins courtesy of Kristin Cowalcijk, new functions, parameterized SQL queries, GDAL/pyogrio reads, GDAL/sf based reads in R, and the beginnings of an R DataFrame API!

1 month ago 6 2 1 0
Bluesky follower syncing on Puzzmo.com illustrated

The Puzzmonaut labeler

Shipping today: @puzzmo.com has integrated support for @bsky.app

The site/app have follower syncing, and a labeler so that you can spot new Puzzmo folks when browsing Bluesky posts

www.puzzmo.com/bluesky
blog.puzzmo.com/posts/2026/0...

1 month ago 71 13 6 8
A pink and blue graphic reading "apply for our opportunity scholarship to posit::conf(2026)."

We are covering 40 people's travel, lodging, and registration for posit::conf() this fall! If you are from a group that is underrepresented in data science or open source, please consider applying for the Opportunity Scholarship—we'd love to have you join.

posit.co/blog/apply-t...

1 month ago 22 15 2 1

to be here

1 month ago 0 0 0 0

Really excited about this new blog post from my colleagues at @columnar.tech

1 month ago 4 0 0 0

:) I have to be really careful with this stuff.

1 month ago 0 0 0 0
Post image

It seems like we don't have any meaningful metrics for agentic coding, so we end up with stuff like this, similar to GitHub's "activity" scoring. Amp has had a really neat looking activity card for a while now:

1 month ago 0 0 1 0
Roses are red, violets are blue, syntax error in line 32.

A message from your ZX Spectrum.

1 month ago 564 177 2 6
Post image

📖 Apache Parquet recently added native support for Geospatial. This post explains what that means and why it is important: parquet.apache.org/blog/2026/02...

1 month ago 13 2 0 0
Post image

Releasing today: 'ProggyForever': a new MIT-licensed scalable font designed to closely match ProggyClean which happily served as Dear ImGui default font for 10+ years: github.com/ocornut/prog...
The new font is now embedded by default in Dear ImGui (compressed down to ~14 KB).

1 month ago 61 6 2 0
Apache Arrow is 10 years old 🎉 The Apache Arrow project was officially established and had its first git commit on February 5th 2016, and we are therefore enthusiastic to announce its 10-year anniversary! Looking back over these 10...

Apache Arrow is 10 years old 🎉

The first git commit landed on 2016-02-05. Now Arrow has 10+ official implementations and a growing third-party ecosystem.
Read our celebratory blog post for some tidbits of Arrow history, and our outlook on the present and future.

arrow.apache.org/blog/2026/02...

2 months ago 5 6 1 1

Reviewing release notes for projects I work on can be a trip. The PRs have my name on them and apparently I created them but I have zero memory of having worked on them. Is that bad?

2 months ago 0 0 0 0
Introducing a Security Model for Arrow We are thrilled to announce the official publication of a Security Model for Apache Arrow. The Arrow security model covers a core subset of the Arrow specifications: the Arrow Columnar Format, the Arr...

We finally did it: Apache Arrow now has a security model.

The Arrow security model doesn't cover any specific Arrow implementation, but it informs about pitfalls and guardrails when dealing with potentially untrusted data in one of the Arrow formats.

@apache.org

arrow.apache.org/blog/2026/02...

2 months ago 5 3 0 1

It’s a good bridge

2 months ago 2 0 0 0
Large television showing the Boonta Eve Classic

They got the big game on at the bar

2 months ago 5785 1502 48 41

No worries. I'm glad you replied here.

The responses in the Discuss thread are really interesting.

2 months ago 1 0 0 0

I don't see any of the examples, dbc included, as "random binaries". Maybe that's why it elicits so little concern. At least for dbc, we're coming from the perspective of the Python user and trying to work backward towards the best user experience.

2 months ago 0 0 0 0