#RobustStatistics #CRAN #DataScience #OpenSource
orig https://fediscience.org/@davdittrich/116312301689809849 4/4
Three-panel speedup plot. Panel A: robScale, robLoc, adm vs revss (4–5× at small n, 2–4× at large n). Panel B: Qn, Sn vs robustbase (up to 9× at large n). Panel C: GMD vs GiniDistance, IQR vs stats (37× at small n), MAD vs stats (26× at small n). Sample sizes n=3 to n=10 million.
robscale 0.5.3 is now on CRAN, a major update since the last public release (0.2.1).
New: 11 robust estimators with confidence intervals, plus a variance-weighted ensemble combining seven scale statistics via bootstrap. Newton–Raphson replaces scoring […]
[Original post on fediscience.org]
https://github.com/davdittrich/robscale
#RStats #RobustStatistics #DataScience #Optimization
orig https://fediscience.org/@davdittrich/116201886682143700 4/4
A two-panel benchmark charting speedup multipliers of the optimized C++ robscale package over legacy pure-R implementations across sample sizes from $n = 3$ to $n = 10^7$, with the vertical axis starting honestly at 0x. The left panel shows large speedups for M-estimators (robLoc, robScale, adm vs. revss), reaching ~28x for robScale. The right panel tracks scale estimators ($Q_n$, $S_n$ vs. robustbase), where the speedup rises from 1.6x and approaches 10x at large sample sizes. Shaded ribbons show 95% bootstrap confidence intervals.
Robust estimation demands highly efficient computation, especially in streaming anomaly detection where latency budgets are tight.
While Rousseeuw & Croux's robust estimators ($Q_n$ and $S_n$) and Rousseeuw & Verboven's M-estimators of location and scale […]
[Original post on fediscience.org]
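For readers who have not met these scale estimators, here is a minimal Python sketch of the textbook definitions of the MAD and of $Q_n$. It is not robscale's code or API: the function names `mad` and `qn_naive` are made up for illustration, the consistency constants (1.4826 and 2.2219) are the usual Gaussian ones, and finite-sample corrections are left out. The naive $Q_n$ sorts all pairwise differences, so it costs roughly $O(n^2 \log n)$, which hints at why fast implementations matter at large $n$.

```python
from itertools import combinations
from statistics import median


def mad(x, c=1.4826):
    """Median absolute deviation, scaled to be consistent at the normal."""
    m = median(x)
    return c * median([abs(v - m) for v in x])


def qn_naive(x, c=2.2219):
    """Naive Q_n: c times the k-th smallest pairwise distance |x_i - x_j|, i < j.

    Uses k = C(h, 2) with h = floor(n/2) + 1. No finite-sample correction
    is applied (an assumption of this sketch), and n must be at least 2.
    """
    n = len(x)
    h = n // 2 + 1
    k = h * (h - 1) // 2
    diffs = sorted(abs(a - b) for a, b in combinations(x, 2))
    return c * diffs[k - 1]


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 100]  # one gross outlier
    print(mad(sample), qn_naive(sample))
```

On the sample with one gross outlier, both estimates stay small (about 1.48 and 2.22), which is the robustness property these estimators are used for.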