
Posts by Rushiv Arora

I’ll just say it’s “not like me” to know the song :)

FWIW, ChatGPT called it Beethoven

4 months ago 2 0 2 0

Rushiv has been doing this neat work trying to understand how to bring language into multi-task learning.

6 months ago 6 2 0 0

Special thanks to @eugenevinitsky.bsky.social for being an amazing mentor!

6 months ago 0 0 0 0

2. LEXPOL’s performance benchmarks against previous methods on MetaWorld
3. A combination of LEXPOL with the previous natural-language-based state embedding algorithm, yielding a joint method combining state and action factorization

6 months ago 0 0 1 0

Paper Highlights:
1. Qualitative analysis of LEXPOL (end-to-end learning) and frozen pre-trained single-task policies. We note that LEXPOL successfully disentangles the tasks into fundamental skills, and learns to combine them without an explicit decomposition into primitive actions.

6 months ago 1 0 1 0

LEXPOL is inspired by our ability to combine multiple different sub-skills together to solve larger tasks based on context. It works by factorizing the complexity of multi-task reinforcement learning into smaller learnable pieces.

6 months ago 2 0 1 0

This helps because multi-task RL is hard: a single monolithic policy must entangle many skills. LEXPOL factorizes control into smaller learnable pieces and uses language as the router for composition.

6 months ago 1 0 1 0

The idea: give the agent a natural-language task description (“push the green button”) and let a learned language gate blend or select among several sub-policies (skills) based on context. One shared state; multiple policies; a gating MLP guided by language embeddings chooses the action.
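The gating idea above can be sketched in a few lines. This is a minimal illustrative mock-up, not the paper's implementation: the linear "sub-policies," the single-matrix stand-in for the gating MLP, and all dimensions are assumptions chosen only to show how a language embedding produces mixture weights over skills.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def language_gated_action(state, lang_emb, skill_weights, gate_w):
    """Blend sub-policy actions using weights derived from a language embedding.

    skill_weights: list of (action_dim, state_dim) matrices, one per sub-skill
                   (linear stand-ins for learned policy networks).
    gate_w: (n_skills, lang_dim) matrix standing in for the gating MLP.
    All names and shapes here are illustrative assumptions.
    """
    gates = softmax(gate_w @ lang_emb)                      # (n_skills,)
    actions = np.stack([W @ state for W in skill_weights])  # (n_skills, action_dim)
    return gates @ actions                                  # blended action

rng = np.random.default_rng(0)
state, lang = rng.normal(size=4), rng.normal(size=8)   # shared state, task embedding
skills = [rng.normal(size=(2, 4)) for _ in range(3)]   # 3 sub-policies, 2-D actions
gate_w = rng.normal(size=(3, 8))
action = language_gated_action(state, lang, skills, gate_w)
print(action.shape)  # (2,)
```

Depending on context, the gate can produce a soft blend (as here) or near-one-hot weights that effectively select a single sub-policy.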

6 months ago 2 0 1 0
Multi-Task Reinforcement Learning with Language-Encoded Gated Policy Networks Multi-task reinforcement learning often relies on task metadata -- such as brief natural-language descriptions -- to guide behavior across diverse objectives. We present Lexical Policy Networks (LEXPO...

I’m very excited to share my new paper introducing LEXical POLicy Networks (LEXPOL) for Multi-Task Reinforcement Learning!



In LEXPOL, language acts as a gate that routes among reusable sub-policies (skills) to solve diverse tasks.
Paper: arxiv.org/abs/2510.06138

6 months ago 11 2 2 1

Excited to share that an abstract of this paper was accepted at RLDM!

I’m genuinely excited to attend the conference; it is one of my favorite venues for interdisciplinary discussions!

1 year ago 1 0 0 0

This is a cause that is very close to my heart, and I am glad to have found the NOCC. They do a lot of important work for patients and their caregivers.

1 year ago 0 0 0 0

My grandmother passed away from ovarian cancer in 1995. I never got the chance to meet her, but I have always felt extremely connected with her through the countless stories and family heirlooms my parents have shared with me.

1 year ago 0 0 1 0
2025 TCS New York City Marathon - Rushiv Arora My grandmother passed away from ovarian cancer in 1995. I never got the chance to meet her, but I have always felt extremely connected with her, and all my ancestors, through the countless stories and...

I am thrilled to announce that I am running the 2025 New York City Marathon for the National Ovarian Cancer Coalition.

I would be grateful if you would consider donating or sharing the message to help me fundraise for this very important cause! Every share counts!

p2p.onecause.com/nycmarathon2...

1 year ago 1 0 1 0

This paper is personally meaningful to me: it is my first solo-authored paper (and my seventh overall)!

I got the idea in 2023 in the final months of my master’s but didn’t have the chance to work on it until last year. I am thrilled to finally publish it!

1 year ago 0 0 0 0

Read the paper to explore how H-UVFAs advance scalable and reusable skills in RL! #ReinforcementLearning #MachineLearning #AI

1 year ago 0 0 1 0

- Outperforming UVFAs: In hierarchical settings, H-UVFAs show superior performance and generalization compared to UVFAs. In fact, UVFAs failed to learn in some settings.

- Learning in both supervised and reinforcement learning contexts.

1 year ago 1 0 1 0

Core Contributions:
- Hierarchical Embeddings: We show that it is possible to break down hierarchical value functions into their core elements by leveraging higher-order decomposition methods from mathematics, such as the Tucker decomposition.

- Zero-shot generalization: H-UVFAs can extrapolate to new goals!
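A toy sketch of the factorization idea behind these contributions. For brevity this uses a CP-style factorization (a special case of the Tucker decomposition, with a diagonal core); the tensor, its rank, and all dimensions are made up for illustration and are not from the paper.

```python
import numpy as np

# Toy "hierarchical value function" over (state, goal, action) triples.
rng = np.random.default_rng(1)
n_s, n_g, n_a, r = 6, 5, 4, 2
# Build a rank-r tensor so the factorization holds exactly.
S = rng.normal(size=(n_s, r))  # per-state embeddings
G = rng.normal(size=(n_g, r))  # per-goal embeddings
A = rng.normal(size=(n_a, r))  # per-action embeddings
V = np.einsum('sr,gr,ar->sga', S, G, A)  # V[s,g,a] = sum_r S[s,r] G[g,r] A[a,r]

# Evaluating a single (s, g, a) now needs only the factor embeddings --
# the value tensor is never stored whole.
s, g, a = 3, 1, 2
v_full = V[s, g, a]
v_factored = np.sum(S[s] * G[g] * A[a])
print(np.isclose(v_full, v_factored))  # True
```

The zero-shot claim then corresponds to inferring an embedding for an unseen goal and reusing the existing state and action factors unchanged.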

1 year ago 0 0 1 0

We extend Universal Value Function Approximators (UVFAs) to hierarchical RL, enabling zero-shot generalization across new goals in multi-task settings while retaining the benefits of temporal abstraction.
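For context, the original UVFA idea approximates a goal-conditioned value function as a dot product of learned state and goal embeddings, V(s, g) ≈ φ(s)·ψ(g). A minimal sketch with random stand-in embeddings (all dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n_states, n_goals, d = 8, 5, 3
phi = rng.normal(size=(n_states, d))  # state embeddings phi(s)
psi = rng.normal(size=(n_goals, d))   # goal embeddings psi(g)
V = phi @ psi.T                       # V[s, g] = phi(s) . psi(g)

# Zero-shot flavor: a new goal only needs its own psi embedding;
# the state embeddings phi are reused unchanged.
psi_new = rng.normal(size=d)
v_new_goal = phi @ psi_new            # value of every state under the new goal
print(v_new_goal.shape)  # (8,)
```

The hierarchical extension adds further factors for the temporal-abstraction components, as described in the paper.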

1 year ago 0 0 1 0
Hierarchical Universal Value Function Approximators There have been key advancements to building universal approximators for multi-goal collections of reinforcement learning value functions -- key elements in estimating long-term returns of states in a...

First post on here and it’s an exciting one! I am happy to share my new paper “Hierarchical Universal Value Function Approximators” (H-UVFAs): arxiv.org/abs/2410.08997

1 year ago 5 1 1 1