John Platt (@johnplattml) Bsky

An AI system to help scientists write expert-level empirical software The cycle of scientific discovery is frequently bottlenecked by the slow, manual creation of software to support computational experiments. To address this, we present an AI system that creates expert-level scientific software whose goal is to maximize a quality metric. The system uses a Large Language Model (LLM) and Tree Search (TS) to systematically improve the quality metric and intelligently navigate the large space of possible solutions. The system achieves expert-level results when it explores and integrates complex research ideas from external sources. The effectiveness of tree search is demonstrated across a wide range of benchmarks. In bioinformatics, it discovered 40 novel methods for single-cell data analysis that outperformed the top human-developed methods on a public leaderboard. In epidemiology, it generated 14 models that outperformed the CDC ensemble and all other individual models for forecasting COVID-19 hospitalizations. Our method also produced state-of-the-art software for geospatial analysis, neural activity prediction in zebrafish, time series forecasting and numerical solution of integrals. By devising and implementing novel solutions to diverse tasks, the system represents a significant step towards accelerating scientific progress.

We've been working on an AI system to help scientists write their software! All you need is a way to write a score function, and be willing to give it some hints or papers based on your domain knowledge. arxiv.org/abs/2509.06503

7 months ago 3 1 0 0
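
The workflow described in the post above (supply a score function, then let an LLM plus tree search improve candidates against it) can be pictured with a toy loop. This is a minimal sketch only, assuming a simple best-first search; the `propose` callback is a hypothetical stand-in for the LLM rewrite step, candidates are plain numbers rather than programs, and none of the names come from the paper.

```python
import heapq
import random
from typing import Callable, List, Tuple


def tree_search(
    root: float,
    score: Callable[[float], float],
    propose: Callable[[float, random.Random], List[float]],
    budget: int = 200,
    seed: int = 0,
) -> Tuple[float, float]:
    """Best-first search: keep expanding the highest-scoring unexpanded node."""
    rng = random.Random(seed)
    best, best_score = root, score(root)
    # heapq is a min-heap, so scores are negated to pop the best node first.
    frontier = [(-best_score, 0, root)]
    counter = 1  # tie-breaker so the heap never has to compare candidates
    for _ in range(budget):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        for child in propose(node, rng):
            child_score = score(child)
            if child_score > best_score:
                best, best_score = child, child_score
            heapq.heappush(frontier, (-child_score, counter, child))
            counter += 1
    return best, best_score


# Toy stand-ins, assumptions for illustration only.
def toy_score(x: float) -> float:
    return -(x - 3.0) ** 2  # quality metric, maximized at x = 3


def toy_propose(x: float, rng: random.Random) -> List[float]:
    # Hypothetical "LLM" step: suggest three small variations of a candidate.
    return [x + rng.uniform(-1.0, 1.0) for _ in range(3)]


best, best_score = tree_search(0.0, toy_score, toy_propose)
```

In a real setting the score function would run the candidate software on a benchmark, and the proposal step would be a model call seeded with hints or papers.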

Rocket taking off at night, leaving trail of flames behind with low billowing clouds against the ground.

On Friday night, I watched the launch of the first FireSat! This constellation of fire satellites, when finished, will be able to detect 25-square-meter wildfires within 15 minutes. See blog.google/outreach-ini... Photo courtesy of SpaceX.

1 year ago 10 0 1 1

Using ChatGPT is not bad for the environment And a plea to think seriously about climate change without getting distracted

A well-reasoned essay about AI and energy use: andymasley.substack.com/p/individual... from @andymasley.bsky.social

1 year ago 3 0 0 0

📊 Using Promptfoo in @googlecolab.bsky.social for quick comparisons between models is so slick!

👇 Several out-of-the-box Gemini variants, and a couple of examples with code execution & self-checking as tools or function calls.

🔗 app.promptfoo.dev/eval/f:fc38a...

⚙️ gist.github.com/dynamicwebpa...

1 year ago 25 3 0 0

About our agency NOAA is an agency that enriches life through science. Our reach goes from the surface of the sun to the depths of the ocean floor as we work to keep the public informed of the changing environment aro...

“Why do I need NOAA? I’ve got a weather app.”

Is equivalent to asking

“Why do I need farms? I can go to the supermarket.”

www.noaa.gov/about-our-ag...

1 year ago 3913 1281 51 56

Different John Platt.

1 year ago 0 0 1 0

Wildfires offset the increasing but spatially heterogeneous Arctic–boreal CO2 uptake - Nature Climate Change How the carbon stocks of the Arctic–Boreal Zone change with warming is not well understood. Here the authors show that wildfires and large regional differences in net carbon fluxes offset the overall ...

Climate change more strongly affects the Arctic than other regions. The Arctic also has a vast store of carbon. Sadly, increased fires are now making parts of the Arctic a carbon source rather than a sink. www.nature.com/articles/s41...

1 year ago 2 0 0 0

Is there a link between #ClimateChange & increasing risk/severity of #wildfire in California--including the still-unfolding disaster? Yes. Is climate change the only factor at play? No, of course not. So what's really going on? [Thread] #CAfire #CAwx #LAfires iopscience.iop.org/a...

1 year ago 789 365 26 73

I really wish we saw more emphasis on installing 120V outlets everywhere (at rental properties, public garages, workplace parking lots etc). If you were confident you could slow charge wherever you park, it would be fine for maybe 80-90% of days, with DC rapid chargers filling in the rest of time.

1 year ago 101 12 15 9

Welp, OpenAI o3 model gets 25%. Inference time RL ftw.

1 year ago 1 0 0 0

Feasibility test of per-flight contrail avoidance in commercial aviation - Communications Engineering Vapour trails (contrails) from aircraft make a substantial contribution to aviation’s climate impact. Here we execute a per-flight contrail avoidance feasibility test through altitude adjustments base...

Our paper shows that you can predict and avoid contrails in real life, with statistically significant results: www.nature.com/articles/s44...

1 year ago 3 1 0 0

Science team of LLMs design nanobodies for COVID: www.biorxiv.org/content/10.1... #neurips workshop hangover

1 year ago 1 0 0 0

FrontierMath FrontierMath is a benchmark of hundreds of unpublished and extremely challenging math problems to help us to understand the limits of artificial intelligence.

An extraordinarily difficult math AI benchmark: epoch.ai/frontiermath

1 year ago 2 2 1 0

Noam Brown just gave a very nice talk about o1. He reached back to the history of game AI to show that inference-time compute trades off against training time. He believes that when answers are easy to verify but difficult to generate, inference-time RL is worth it.

1 year ago 1 0 0 0

Can Large Language Model Agents Simulate Human Trust Behavior?

LLMs tested to see if they emulate human trust behavior #neurips agent-trust.camel-ai.org

1 year ago 1 0 0 0

GitHub - THU-MIG/yolov10: YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]

The YOLOv10 visual object detector has eliminated non-maximum suppression, which is nice. #neurips github.com/THU-MIG/yolo...

1 year ago 3 0 0 0
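
For context on what the post above says YOLOv10 removes: classic detectors emit many overlapping boxes per object and prune them with non-maximum suppression as a post-processing step. A minimal sketch of the textbook greedy procedure follows, assuming boxes in `(x1, y1, x2, y2)` format and an illustrative IoU threshold of 0.5; it is not code from the YOLOv10 repository.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep a box only if it does not overlap a higher-scoring kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one object, plus one separate object.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # indices of surviving boxes
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed while the distant third box survives.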

Questioning the Survey Responses of Large Language Models Surveys have recently gained popularity as a tool to study large language models. By comparing models’ survey responses to those of different human reference populations, researchers aim to infer...

An interesting NeurIPS talk that says that when you give multiple choice survey questions to LLMs and randomize the order of the answers, you get close to random answers. Paper: openreview.net/forum?id=Oo7...

1 year ago 5 0 0 0

I would share more NeurIPS papers, but there is a common lack of statistical significance tests: it would be nice to know whether a set of results is interesting or due to random fluctuations.

1 year ago 3 0 0 0
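
One lightweight check of the kind the post above asks for is a paired sign-flip permutation test on per-dataset score differences between two methods. A minimal sketch, with made-up accuracy numbers used purely for illustration:

```python
import random
from typing import Sequence


def paired_permutation_test(a: Sequence[float], b: Sequence[float],
                            n_resamples: int = 10000, seed: int = 0) -> float:
    """Two-sided p-value for the mean paired difference under random sign flips."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis, each difference is equally likely to be +d or -d.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical per-dataset accuracies for two methods (not real results).
method_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.82, 0.85, 0.78]
method_b = [0.80, 0.78, 0.82, 0.80, 0.81, 0.80, 0.84, 0.77]
p_value = paired_permutation_test(method_a, method_b)
```

A small p-value suggests the gap between the two methods is unlikely to come from random fluctuation alone; with only a handful of datasets, the smallest achievable p-value is limited by the number of possible sign patterns.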

The trick with #neurips posters is to seek out the ones that are too crowded

1 year ago 2 0 0 0

Making quantum error correction work

The Quantum team at Google has shown that error correction is effective: surface code now tested up to 7x7 and each enlargement has a notable decrease in error. The logical qubit now lasts longer than the best individual physical qubit component.

Nice blog post: research.google/blog/making-...

1 year ago 11 1 0 1

To crop or not to crop: Comparing whole‐image and cropped classification on a large dataset of camera trap images In this work, the authors assess the hypothesis that classifying animals cropped from camera trap images using a species-agnostic detector yields better accuracy than classifying whole images. We fin...

Using a species-agnostic detector helps improve camera trap classification: ietresearch.onlinelibrary.wiley.com/doi/10.1049/... This matches my analogous experience since the 90s: "pingers" (detectors that generically go 'ping') often improve classification.

1 year ago 0 0 0 0

👍🏻

1 year ago 0 0 1 0

Geological Net Zero and the need for disaggregated accounting for carbon sinks - Nature

Climate change is all about accounting. This paper www.nature.com/articles/s41... argues that net zero goals should be considered at the boundary of the Earth: if we include passive biosphere uptake as part of net zero, the Earth will warm by an additional 0.5C even after net zero is reached.

1 year ago 6 0 1 0

ICYMI: you can track recent persistent contrails in our contrail detection app: contrails.webapps.google.com. It's surprising how many there are.

1 year ago 1 0 0 0

Mapping the ionosphere with millions of phones - Nature Data from millions of smartphones are used to map the ionosphere in greater detail, leading to improved smartphone location accuracy, particularly in parts of the world with few monitoring stations.

Paper: using cellphones to collectively measure the electron density in the ionosphere. Could lead to increased accuracy of GPS. www.nature.com/articles/s41...

1 year ago 11 2 0 0

NEW PAPER: Decarbonization isn't just about climate—it's also about public health. We find that economy-wide CO2 reductions can lead to widespread improvements in air quality and health benefits, ranging from $65 billion to $250 billion annually by 2035.

www.sciencedirect.com/science/arti...

1 year ago 445 182 7 19

Ran into Denny Zhou, one of the inventors of Chain-of-Thought. He shared a summary of his LLM work over the last 2 years, which I thought was interesting: dennyzhou.github.io/LLM-Reasonin...

1 year ago 2 0 0 0

Project Contrails: Preventing Contrails with AI - Google Research Discover how Google's Project Contrails is using AI to help prevent contrails and mitigate aviation's impact on global warming.

I'm John Platt, Fellow @ Google Research. I'm working on AI to help fight climate change. One example is preventing contrails (see sites.research.google/contrails/).

1 year ago 0 0 0 0

Posts by John Platt