This is such a cool piece of work! It unifies two completely separate techniques often used in cloud-native settings:
1. Graceful degradation. E.g., disabling or weakening features in your app instead of erroring out.
2. Autoscaling to match capacity to demand.
Turns out, you can combine the two!
Posts by Sangeetha Abdu Jyothi
Check out our paper and code, and say hi to @kapilagrawal.bsky.social if you are attending ASPLOS/EuroSys!
Tech Report: arxiv.org/pdf/2312.12809
Code 💻: github.com/NetSAIL-UCI/...
(received all three ACM reproducibility badges during artifact evaluation)
4/ We also introduce AdaptLab, a resilience benchmarking platform that can emulate realistic cloud environments at scale.
3/ We build Phoenix, the first automated resilience management system for containerized clouds, based on diagonal scaling. Phoenix can handle failures in a cluster of 100,000 nodes within 10 seconds.
2/ We introduce the notion of diagonal scaling, which involves selectively turning off less critical microservices during capacity crunch scenarios. By allowing apps to specify acceptable degraded states using criticality tags on microservices, we can enable a broader set of resilience objectives.
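To make the idea of criticality tags concrete, here is a minimal sketch of how a planner might pick which microservices to keep when capacity shrinks. This is not Phoenix's actual algorithm: the `Service` class, the `plan_degradation` function, the integer tag scale (1 = most critical), and the example services are all hypothetical, and the greedy rule is just one simple way to honor the tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    criticality: int  # hypothetical tag: 1 = most critical
    cpu: float        # CPU cores the service needs


def plan_degradation(services, capacity):
    """Greedily keep services in criticality order until capacity runs out.

    Returns (kept, turned_off) as lists of service names.
    """
    kept, turned_off = [], []
    remaining = capacity
    # Visit the most critical services first; break ties by smaller demand.
    for svc in sorted(services, key=lambda s: (s.criticality, s.cpu)):
        if svc.cpu <= remaining:
            kept.append(svc.name)
            remaining -= svc.cpu
        else:
            # Not enough room left: this service is turned off.
            turned_off.append(svc.name)
    return kept, turned_off


# Hypothetical app with four microservices and a cluster shrunk to 8 cores.
services = [
    Service("checkout", 1, 4.0),
    Service("search", 2, 3.0),
    Service("recommendations", 3, 2.0),
    Service("ads", 3, 1.0),
]
kept, off = plan_degradation(services, capacity=8.0)
print("kept:", kept)
print("turned off:", off)
```

With 8 cores available, the sketch keeps the critical path and sheds a low-criticality service rather than failing the whole app. A real system would also have to account for dependencies between microservices and for fairness across apps, which this sketch ignores.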
1/ At ASPLOS'25, @kapilagrawal.bsky.social will present our paper, "Cooperative Graceful Degradation In Containerized Clouds." As cloud outages grow in cost and frequency, we put forward a vision for automated cloud resilience management with cooperative graceful degradation b/w apps & cloud 🤝
I came across a video demonstrating the ideal qualities of a PhD student!
On #InternationalDayofWomenandGirlsinScience let me remind you that we are NOWHERE NEAR parity. A few examples.
1. Women are credited less in science than men.
www.nature.com/articles/s41...
I hear this is where the cool people hang out now. I finally made the jump :)