Posts by Ryan Rosario
Academics/Researchers: What are your suggestions for someone who got their PhD a while ago and wants to re-enter the research space? I think my teaching career is coming to an end (too strict of a schedule) but I want to remain affiliated with academia and need to plan my next move in it.
Most of them are Trumpers. They support this.
That moment when I was so busy and completely forgot that Monday was a holiday and I had an extra day to prepare. Courtesy of Sora.
Anyway, it was humorous seeing these concepts being treated as if they were scientific breakthroughs. It reminds me of the Dynamo paper where the authors believed that they had discovered, or at least revolutionized, the concept of "Tail Latency."
(1) and (2) show the disconnect between statistics and/or data science and computer science. It's very inefficient.
(3) In the "replacement" discussion, I felt a bit of elitism here. Some of the researchers that are enthusiastic about everyone being replaced with AI seem to think that they are immune. It comes across differently to those that are not 100% in academia.
(2) There was a lot of focus on time series. The talks suggested that time series had just been re-discovered by computer scientists. It's been around for at least a hundred years.
A snarkier take from my time at #NeurIPS2025
(1) Large companies have been doing evaluation, on everything, for decades (it's all I did as a DS in Google Search). It was interesting seeing academia catch up, beyond accuracy/precision/recall/AUC/F1 etc., though they acted like this was a new concept.
With that said, statisticians and data scientists (or the companies that don't understand how to use them) tend to miss a big opportunity: helping improve systems, algorithms and AI through evaluation and experimental design. I don't get it.
(2) Most papers in AI overfit the data; this is why evaluation is important.
(3) System architects may be safe from automation by AI in the near future.
(4) Junior-level roles will disappear (My opinion: this is a shift, not a deprecation)
These are my takeaways from #NeurIPS2025:
(1) Evaluation of algorithms and solutions developed from LLM prompts and responses in systems is important (attention statisticians and data scientists). Log the results ("observability"). Iterate based on the results. 1/4
Most, if not all, of us who teach and/or do research feel a certain way about what’s going on right now. It was surreal to see UCOP explicitly call it out in a recent (public) document. It made my heart skip a beat.
If any of you are thinking of upgrading to Claude Max. Don't. Save your money. Same ridiculous limit on input and conversation length.
Whenever I introduce TCP or other network connections, I introduce the concept with two bros, Connor and Logan. Why? Because not much data is exchanged, yet the handshake is important.
It's getting to the point that I need to consider canceling my subscription to Claude. Has anyone else noticed a drastic decrease in quality with coding prompts in addition to system reliability issues?
Another earthquake! This is getting to become a bit much. All of them have been near Conejo Valley.
I believe so. We are about to head for a cliff in the next couple of years when StackExchange shuts down and training data becomes old or limited. Sure, there's GitHub, but there is less human annotation in GitHub.
What baffles me about statistics education is the lack of discussion of its non-ML importance to computer science:
(1) use for indexing where the keys follow a distribution: arxiv.org/abs/1712.01208
(2) use in evaluating cost of query plans
(3) probabilistic data structures
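To make item (3) concrete, here is a minimal sketch of a Bloom filter, one well-known probabilistic data structure. It is my own illustration, not something from the linked paper: the class name, the SHA-256 double-hashing scheme, and the parameter names are all assumptions chosen for the example. The sizing uses the standard formulas m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, which is exactly the kind of probability reasoning a statistics course could cover.

```python
import hashlib
import math


class BloomFilter:
    """Approximate set membership: no false negatives, tunable false-positive rate.

    Hypothetical illustration only; names and hashing scheme are assumptions.
    """

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n ln p / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        n, p = expected_items, false_positive_rate
        self.num_bits = max(1, math.ceil(-n * math.log(p) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit slices
        # of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so steps vary
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position associated with the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

Usage would look like `bf = BloomFilter(1000, 0.01); bf.add("key-1"); "key-1" in bf`. A lookup that returns False is definitive, while True only means "probably", which is the trade that buys the small memory footprint.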
Tonight I took all of my slides and passed them to NotebookLM. The podcast adds more context, some analogies and other examples. With the exception of some minor hallucination, and the host making strange noises, this is mind-blowing. I'm using this for my classes moving forward.
THAT was a big earthquake. Damn.
Why do I even pay for Claude? It is horribly rate limited, expensive, and is offline more than it is online.
Hot take? Tableau is hot garbage.
Believe it or not, today was my first time ever using Tableau as a data scientist. And after today, it is also my last time.
MongoDB has the most bizarre authentication model.
I am going to have to switch away from Neo4j to another graph database as my choice to teach the graph model. It's too much of a money grab for simple things like read-only access on a user, and it's a pain to set up HTTPS and a reverse proxy. Any suggestions for worthy alternatives?
This quarter my data management students are constructing an ETL pipeline as their final project. We are hamstrung by AWS' free tier and so we are using #DuckDB as our serving layer, rather than Snowflake or Redshift, to power a Tableau dashboard. I enjoy it more and more each time.
Ultimate nirvana when teaching. This happened for the first time since maybe week 1. Average response time has gone up over the years, but still pretty good if there were an SLA...
Brought to you by the parametric equations,
x = 16 (sin t)^3
y = 13 cos t - 5 cos 2t - 2 cos 3t - cos 4t
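For anyone who wants to trace the curve, here is a minimal sketch that samples the two equations, writing the parameter as t running from 0 to 2π. The function names and the default of 200 samples are my own choices for illustration. The output is a list of (x, y) pairs that can be handed to any plotting tool.

```python
import math


def heart_point(t: float) -> tuple[float, float]:
    """Evaluate the parametric equations at parameter t (radians)."""
    x = 16 * math.sin(t) ** 3
    y = (
        13 * math.cos(t)
        - 5 * math.cos(2 * t)
        - 2 * math.cos(3 * t)
        - math.cos(4 * t)
    )
    return x, y


def heart_curve(num_points: int = 200) -> list[tuple[float, float]]:
    """Sample the curve at evenly spaced t values in [0, 2*pi)."""
    return [heart_point(2 * math.pi * i / num_points) for i in range(num_points)]
```

At t = 0 the point is (0, 5), the notch at the top, and at t = π it is (0, -17), the tip at the bottom, so the shape is a heart.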