
Posts by Valeriy M., PhD, MBA, CQF

Preview
Classifier Calibration at Scale: An Empirical Study of...
We study model-agnostic post-hoc calibration methods intended to improve probabilistic predictions in supervised binary classification on real i.i.d. tabular data, with particular emphasis on...
3 hours ago

Modern gradient-boosted trees arrive pre-calibrated. Measuring first is not optional — it is the whole job.

Which model did you calibrate last that you now suspect you broke?

Full write-up in the comments.

#MachineLearning #Calibration #ConformalPrediction

3 hours ago

The model is calibrated. Any rewrite is destructive.
If |Z| > 3.0, use Venn-Abers: distribution-free, with a mathematical validity guarantee (a minimal sketch follows below). The 1.96–3.0 band is where you actually test both.
The standard advice comes from an era of SVMs and shallow nets.
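
For the Venn-Abers step above, here is a minimal sketch of an inductive Venn-Abers predictor built on scikit-learn's isotonic regression rather than any packaged implementation; the helper name and the final log-loss-minimax merge p1 / (1 - p0 + p1) are my additions, not code from the study.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def venn_abers(s_cal, y_cal, s_test):
    """Inductive Venn-Abers predictor (naive refit per test point).

    For each test score s, isotonic regression is refitted twice on the
    calibration set augmented with (s, 0) and with (s, 1), yielding the
    multiprobability pair (p0, p1) that carries the validity guarantee.
    Assumes higher scores mean higher P(y = 1).
    """
    p0, p1 = [], []
    for s in np.atleast_1d(s_test):
        for label, out in ((0, p0), (1, p1)):
            iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
            iso.fit(np.append(s_cal, s), np.append(y_cal, label))
            out.append(float(iso.predict([s])[0]))
    p0, p1 = np.array(p0), np.array(p1)
    # Third return value: single-probability merge (log-loss-minimax rule).
    return p0, p1, p1 / (1.0 - p0 + p1)
```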

3 hours ago

Distort a calibrated distribution through a parametric function it never needed, and you lose sharpness faster than you gain anything.

The Spiegelhalter decision rule, in three steps:

Compute the Spiegelhalter Z on a held-out set before any calibration (a minimal sketch follows below).
If |Z| < 1.96, do nothing.
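
A minimal sketch of that first step, assuming only NumPy; spiegelhalter_z is my name for the helper, but the statistic is the standard one.

```python
import numpy as np

def spiegelhalter_z(y_true, p_pred):
    """Spiegelhalter's Z-statistic for binary probability calibration.

    Under the null hypothesis that the predicted probabilities are
    calibrated, Z is approximately standard normal, so |Z| < 1.96 means
    no evidence of miscalibration at the 5% level: do nothing.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    num = np.sum((y - p) * (1.0 - 2.0 * p))
    den = np.sqrt(np.sum((1.0 - 2.0 * p) ** 2 * p * (1.0 - p)))
    return num / den
```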

3 hours ago

Platt fit a two-parameter sigmoid anyway, compressed the tails, pulled confident predictions toward the center, and erased meaningful variation the model had already earned.

Log-loss does not forgive that. It measures calibration and sharpness together.
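
For context, Platt scaling is nothing more than a one-dimensional logistic regression on held-out scores. A minimal sketch follows; scikit-learn's CalibratedClassifierCV with method="sigmoid" is the usual packaged route, and fit_platt is my name for the helper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_platt(scores_val, y_val):
    """Fit Platt's two-parameter sigmoid sigma(A*s + B) on held-out scores.

    An effectively unregularized 1-D logistic regression recovers A and B;
    the returned function maps raw scores to rescaled probabilities. It is
    exactly this global squeeze that compresses the tails.
    """
    lr = LogisticRegression(C=1e6)  # large C: effectively no regularization
    lr.fit(np.asarray(scores_val).reshape(-1, 1), y_val)
    return lambda s: lr.predict_proba(np.asarray(s).reshape(-1, 1))[:, 1]
```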

3 hours ago

Platt hurt every one of them.

The reason is mechanical. CatBoost’s Spiegelhalter |Z| was 1.88 — below the 1.96 significance threshold. The model was not miscalibrated. There was nothing to fix.

3 hours ago

The damage on well-behaved models was not subtle.

— CatBoost: log-loss worsened on 93% of folds. Platt added 5.3%.
— TabICL: worse on 91% of folds. 6.0% penalty.
— EBM: worse on 90%. 4.4% penalty.
— TabPFN: worse on 87%. 5.0% penalty.

Four of the five best classifiers in the study.

3 hours ago

Stop using Platt scaling by default. It’s the bug, not the fix.
If you’re serious about shipping calibrated models, that one-line reflex is quietly making your best classifier worse.

We ran Platt scaling across 21 classifiers, 30 binary datasets, 150 cross-validation folds (arXiv 2601.19944).
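
A hedged sketch of the per-fold comparison, not the paper's actual harness: for one classifier on one dataset (NumPy arrays X, y), it estimates how often Platt scaling worsens held-out log-loss and the mean relative penalty.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import log_loss
from sklearn.model_selection import StratifiedKFold

def platt_penalty(clf, X, y, n_splits=5, seed=0):
    """Return (share of folds where Platt worsens log-loss, mean relative penalty)."""
    rel = []
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for tr, te in skf.split(X, y):
        # Raw probabilities from the uncalibrated model.
        raw = clf.fit(X[tr], y[tr]).predict_proba(X[te])[:, 1]
        # Same model wrapped in Platt scaling (the "sigmoid" method).
        platt = (CalibratedClassifierCV(clf, method="sigmoid", cv=5)
                 .fit(X[tr], y[tr]).predict_proba(X[te])[:, 1])
        rel.append(log_loss(y[te], platt) / log_loss(y[te], raw) - 1.0)
    rel = np.asarray(rel)
    return float((rel > 0).mean()), float(rel.mean())
```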

3 hours ago

Knowing is cheap. Performing is what counts.

Video based on my English translation of Kiselev’s iconic Arithmetic:

www.youtube.com/watc...

5 hours ago
Preview
The Great Translation: How 1950s China Cloned the Soviet Mathematical Machine
In the archives of modern mathematics, there is a specific kind of artifact that tells a story of geopolitics, ideology, and the birth of a…

My new article explains "The Great Translation: How 1950s China Cloned the Soviet Mathematical Machine"

6 hours ago

The real foundation of China’s 🇨🇳 STEM and AI rise was not the Western liberal arts model. It was the opposite: the systematic dismantling of that model and the large-scale import of Soviet curricula, textbooks, and teaching methods.

6 hours ago
Preview
The Calculus of Probabilities (1900) by Andrei Markov [English Translation]

PREORDER SPECIAL: Secure your copy of the English translation of the masterpiece that paved the way for modern AI. Translated and adapted by Valery Manokhin, PhD, MBA, CQF.

For over a century, Andrei Markov's original 1900 monograph, The Calculus of Probabilities (Исчисление вероятностей), has remained inaccessible to English speakers, until now. This is the book where probability theory began to evolve from simple games of chance into the rigorous science of dependent variables that powers today's world.

⚡ Limited Time Offer: Preorder today to lock in the special introductory price.
— Bonus: Get instant access to the first two chapters immediately upon purchase.
— Guarantee: The full digital edition will be delivered automatically to your inbox upon completion.
— Urgency: The price will increase tomorrow. Don't miss this chance to own a piece of history for less.

Why This Book Matters
In 1913, Andrei Markov used the techniques developed in this book to analyze Alexander Pushkin's novel Eugene Onegin. By calculating the probability of vowels and consonants appearing in sequence, he demonstrated that "what happens next depends on what happened before."
This revolutionary idea, the Markov chain, is the direct ancestor of the technology behind modern Large Language Models (LLMs) like ChatGPT, search engines, and automated translation.
This is your chance to read the original source code of the AI revolution.

About This Edition
This is not just a direct translation; it is a careful restoration.
— Faithful to the 1900 Original: Translated directly from the pre-reform Russian text, capturing Markov's specific pre-axiomatic logic.
— Modernized Notation: While preserving the historical "flavor," mathematical notation has been updated to align with contemporary standards for clarity.
— Annotated: Includes editorial notes clarifying historical terminology and connecting Markov's formulations to modern statistical modeling.

What You Will Learn
— The Foundation: How Markov defined probability based on "equally possible" events and "mutually incompatible" outcomes.
— The Logic: See the "Theorem of Addition" and "Theorem of Multiplication" derived from first principles.
— The Classic Examples: Work through Markov's original problems, including the famous urns with white and black balls.

Who Is This For?
— Data Scientists & AI Researchers who want to understand the deep roots of their field.
— Math Historians & Educators looking for a primary source that has been unavailable in English for 125 years.
— Collectors who appreciate the legacy of classical and Russian mathematics.

"In a very real sense, Markov can be regarded as a conceptual godfather of the earliest probabilistic language models." (From the Translator's Introduction)

Preorder now and start reading Chapter 1 today.

Sometimes "wrong" is how you know something was built with care.

6 hours ago

The faithful edition of Markov is meant to read like a classical mathematics book — and part of what classical mathematics looks like is Latin and Greek sharing a page without pretending they belong to the same family.

6 hours ago

If you wanted a century of mathematical writing to stay typographically continuous with the tradition that came before it, Knuth got it exactly right.
I'm leaving the font alone in my translation.

6 hours ago

You've been reading it your whole career and probably never noticed — because somewhere along the way, it became the look of math.
Was it a mistake?
If you wanted every symbol to look cut from one block of wood, yes. Later fonts harmonise Greek and Latin better.

6 hours ago

Knuth drew them from different typographic traditions. Neoclassical Latin italic for a. Handwritten Greek italic for α. Different centuries, different hands.
Nearly every mathematical paper typeset in LaTeX carries this mismatch. Textbooks. Monographs. Fields Medal lectures.

6 hours ago

In Computer Modern — the math font nearly every LaTeX paper on earth defaults to, designed by Donald Knuth in the late 1970s — italic a and italic α don't harmonize. The Greek alpha is visibly taller and meaningfully wider than the Latin a sitting right next to it.
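
To see the mismatch yourself, here is a minimal LaTeX snippet of my own construction: compile it with the default Computer Modern, then swap in a harmonised math font such as newtxmath and compare the two letters.

```latex
\documentclass{article}
% Default math font: Knuth's Computer Modern.
% For comparison, recompile with a harmonised design, e.g.:
% \usepackage{newtxmath}
\begin{document}
Compare the Latin $a$ with the Greek $\alpha$ side by side:
\[ \frac{a - \alpha}{a + b - \alpha - \beta} \]
\end{document}
```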

6 hours ago

Knuth is a genius. He still got some things wrong.
A reader of my Markov translation sent me a photo of a fraction on page 20:

(a − α) / (a + b − α − β)

"The 'a' and the 'α' look like different sizes. Feels off."
He was right.

6 hours ago

And that is why no one should be impressed by the cult of Kaggle medals.

#AI #MachineLearning #DataScience #DeepLearning #Kaggle #DeepMind #Mathematics #ArtificialIntelligence

8 hours ago

But it does not produce DeepMind.

Kaggle trained people to win competitions.
Rigorous mathematics trained people to move civilization forward.

8 hours ago

Major scientific breakthroughs.
Nobel Prize-level impact.

That is what happens when you optimize for first principles instead of public scoreboards.

The uncomfortable truth is simple:

In AI, rigorous mathematics beats contrived hacks.

Every time.

Kaggle can produce clever competitors.

8 hours ago

And it is definitely not the same thing as changing the world.

Now compare that with DeepMind.

Founded by elite mathematicians, neuroscientists, and top-tier researchers.
Built on rigorous theory, not leaderboard theatre.
And what did it produce?

AlphaGo.
AlphaFold.

8 hours ago

Kaggle rewards people for squeezing the last few basis points out of contrived benchmark problems with endless feature hacks, ensembling tricks, and competition-specific gymnastics.

That is not the same thing as building enduring technology.
That is not the same thing as advancing science.

8 hours ago

For years, the industry was told that Kaggle Grandmasters were the "ultimate proof of machine learning excellence."

They were not.

8 hours ago

Kaggle was founded in 2010.

More than 16 years later, what exactly is its legacy?

❌ A generation of leaderboard chasers.
❌ Very little foundational AI.
❌ Very few serious companies.
❌ No scientific breakthroughs.

8 hours ago

Proof that great early education + relentless curiosity can take you from action hero to AI innovator.
Who else credits their weird or rigorous childhood schooling for skills they use today? Drop your stories 👇
#MillaJovovich #MemPalace #AI

1 day ago

From Soviet kindergarten math foundations → Hollywood superstar → building cutting-edge AI architecture.
The USSR preschool program didn’t mess around with early STEM foundations… and apparently neither does Milla.

1 day ago

After dominating Hollywood as the Resident Evil queen and Fifth Element warrior, Milla didn’t stop learning.
She just architected MemPalace — a free, open-source AI memory system on GitHub that’s currently posting some of the highest benchmark scores ever seen for memory/retrieval.

1 day ago

In the USSR, little Milla Jovovich was getting a serious head start in mathematics — counting, patterns, shapes, basic logic, and spatial thinking that the Soviet preschool system was famous for teaching systematically from age 3–6.
Fast-forward decades:

1 day ago

Most kids leave kindergarten knowing colors and how to share toys.

1 day ago