With an evolutionary algorithm you can instead use an LLM to propose modifications to your code, evaluate them, and decide whether to keep the new code for the next iteration.
The best results I've had are with SkyDiscover. See the docs that explain how the repo fits together: deepwiki.com/skydiscover-...
Evolutionary algorithms can play an important role in automated knowledge discovery. When we're trying to improve a model we are often updating code on which we can't calculate gradients. deepwiki.com/skydiscover-...
Other interesting parts:
- I'm doing it 100% hands-off with codex, I haven't written a single line of code
- I'm using an AI scientist approach where an LLM proposes the next hypothesis and analyses the results
I can move fast while spending more time learning about the underlying challenges
You don't need a massive GPU to learn about ocean ML emulators. I've been running lots of CPU experiments where I emulate a double gyre shallow water model. Starting to get promising results
This is the most important but under-appreciated idea in science at the moment: if you can frame your problem so you can:
- define any solution in code and
- evaluate any proposed solution with a single score then
you can set an LLM loose to iterate and improve it all day and all night
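The loop this describes can be sketched in a few lines. This is a toy illustration, not how SkyDiscover actually works internally: the `propose` function stands in for the LLM rewriting step (here it just perturbs a parameter), and `evaluate` stands in for whatever single-score evaluation script you define.

```python
import random

def evaluate(params):
    # Toy objective standing in for your real evaluation script:
    # higher score is better, optimum at x = 3.
    return -(params["x"] - 3.0) ** 2

def propose(params):
    # Stand-in for the LLM step: a real system would rewrite
    # solution code; here we just perturb a parameter.
    return {"x": params["x"] + random.uniform(-1, 1)}

def evolve(initial, n_iters=200, seed=0):
    random.seed(seed)
    best, best_score = initial, evaluate(initial)
    for _ in range(n_iters):
        candidate = propose(best)
        score = evaluate(candidate)
        if score > best_score:  # keep only improvements
            best, best_score = candidate, score
    return best, best_score

best, score = evolve({"x": 0.0})
```

The key property is that nothing in the loop needs gradients: as long as `evaluate` returns a single comparable score, the iterate-and-keep-the-best loop works on arbitrary code changes.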
I've been using it on a personal experiment. I did some initial manual iterations with an LLM before using Sky Discover. It produced big jumps forward while I was in the shower!
I used it with the Codex model from OpenAI and spent less than $1 in API costs
Sky-Discover is an AI knowledge discovery algo that you can use on problems where you can define an evaluation score for new solutions. You set it up with data and an initial solution script. It then uses an LLM to evolve your script into new solutions it can evaluate
github.com/skydiscover-...
Glad that this Google paper faces the same realities I do in my early AI scientist experiments:
- models focus on hyper-params too early
- performance saturates as a local optimum is found
- results can be far from what cracked humans can achieve
arxiv.org/pdf/2602.02660
The limitations of RL-style autoresearch: the focus on incremental improvements in the eval metric leads to local rather than global optimisation. Anyone who has done scientific research knows that true discovery involves persistence with initially unpromising avenues!
arxiv.org/abs/2601.14525
@davidho.bsky.social Do you know why my post doesn't appear on the oceanography feed?
Want to understand the source code of the MITgcm ocean model? We can do this using the deepwiki tool which uses LLMs to build detailed docs
deepwiki.com/MITgcm/MITgcm
If you work with python I highly recommend uv as the single tool you use to manage:
- installing python for a project
- creating and running virtual envs
- managing dependencies
- packaging to run on other machines
docs.astral.sh/uv/getting-s...
It's faster and more comprehensive than pip/venv/etc
Lots of scientists still use Jupyter notebooks for analysis, but these don't integrate well with agentic coding.
As an alternative I'd suggest marimo notebooks, which have a similar interface but which an agent can run like a script
Polars has built in date/datetime/duration functions. I use them a lot because they have a consistent API across python versions and the syntax for working with timezones is a lot easier to remember than Python datetimes!
Polars has neat built-in approaches for casting common string datetime formats these days. So long, .str.strptime followed by some format pattern I could never remember!
Need to find performance bottlenecks? Then pyinstrument is an excellent tool. Recently it showed me that my pipeline runs weren't slow because of my data - it was because I was re-authenticating to AWS every time. You get a nice visual which makes it easy to spot the laggards
I'm finding that O3 generates technically valid Polars code, but it leans very heavily on working with Series like numpy arrays and never comes close to proper lazy mode Polars syntax
New blog post from NVIDIA and Polars showing how you can process datasets too large to fit in GPU memory (link below). For a single GPU it may be best to use the spill-to-system-memory approach, while for multi-GPU setups there is a new streaming engine approach
I put together a user guide page on getting the best Polars code from LLMs. That was months ago, however! How do you think it needs to be updated?
As projects mature you will want to invest in a tool to validate the schema and data in your dataframes. This blog post sets out a good summary on the different options for Polars users: posit-dev.github.io/pointblank/blog/validati...
PyPI download stats work in mysterious ways. In the last few months Polars exhibited slow continuous growth. Then basically overnight downloads almost doubled and became much more variable. Why?
Let me count the ways that lazy mode in Polars ❤️ Parquet files
1. Polars can get the schema to start the query
2. Polars can use projection pushdown to subset columns
3. Polars can use predicate pushdown to limit the row groups it reads from the file when a filter is applied
Interested in forecasting in python? A major new free online textbook by the leading forecasting academics and practitioners has been released: https://otexts.com/fpppy/
This adapts Rob Hyndman's excellent R forecast book to the python world
Using pytest with Polars? When there's an error the default traceback is often very long and you have to scroll through a lot to get to the relevant part. You can make it snappier by passing --tb=short to your pytest command to get to the point!
You can add a new column to a Polars DataFrame at a specified index position with insert_column. Your data needs to be a Polars Series first
One habit I've picked up with LLMs: if I'm working in a terminal but have too much data to read then I generate a function that takes my dataframe and produces an HTML page with plotly charts that I can then open in the browser. Basically an on-demand dashboard
We can handle tricky JSON with Polars nested dtypes.
Here the data is a list of dicts, but each row itself also contains a list of dicts. We deal with this by exploding the inner list of dicts to get each entry on its own row. Then we unnest the inner dicts so each field becomes its own column
Not at the moment, I'm afraid, they come from my O'Reilly workshop
It should be called look-at-the-data science
One thing to be careful of in Polars is using pl.when.then in cases where it isn't needed, as Polars pre-calculates all of the possible branches. Often a pl.when.then can be replaced by a join or replace_strict. This query is 5x faster as a join, for example