This work was conducted at @copenlu.bsky.social under the guidance of my amazing supervisors, @iaugenstein.bsky.social and @apepa.bsky.social.
Posts by Sekh (Sk) Mainul Islam
Overall, this work advances understanding of how LLMs integrate internal and external knowledge by introducing the first systematic framework for multi-step analysis of knowledge interactions via rank-2 subspace disentanglement.
💡 How is the CoT mechanism aligned with the knowledge interaction subspace?
📌 CoT maintains CK alignment similar to standard prompting across all datasets, while also reducing PK alignment.
💡 Can we find reasons for hallucinations based on PK-CK interactions?
📌 Across the sequence steps, the gap between PK and CK is much larger for examples with hallucinated spans than for examples without them.
💡 How do individual PK and CK contributions change over the NLE generation steps for different knowledge interactions?
📌 During most of the NLE generation steps, the model slightly prioritizes PK.
📌 While generating the answer, the model aligns with the CK direction for conflicting examples and with the PK direction for supportive ones.
💪 We propose a novel rank-2 projection subspace that disentangles PK and CK contributions more accurately, and use it for the first multi-step analysis of knowledge interactions across longer NLE sequences.
💡 Is a rank-1 projection subspace enough for disentangling PK and CK contributions in all types of knowledge interaction scenarios?
📌 Different knowledge interactions are poorly captured by a rank-1 projection subspace in the LLM's parameter space.
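A minimal sketch of the rank-2 idea: represent PK and CK as direction vectors and recover each one's contribution to a hidden state by solving a least-squares projection onto their 2-D span (a rank-1 subspace would force a single scalar, collapsing the two into a binary choice). The vector names, shapes, and recovery rule here are illustrative assumptions, not the paper's exact method.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# disentangle PK and CK contributions of a hidden state by projecting it
# onto the rank-2 subspace spanned by a PK direction and a CK direction.
import numpy as np

def pk_ck_coefficients(hidden, pk_dir, ck_dir):
    """Return (alpha_pk, alpha_ck) such that the projection of `hidden`
    onto span{pk_dir, ck_dir} equals alpha_pk*pk_dir + alpha_ck*ck_dir."""
    B = np.stack([pk_dir, ck_dir], axis=1)          # (d, 2) basis matrix
    coeffs, *_ = np.linalg.lstsq(B, hidden, rcond=None)
    return coeffs[0], coeffs[1]

# Toy example: a hidden state built mostly from the CK direction.
rng = np.random.default_rng(0)
d = 16
pk = rng.normal(size=d); pk /= np.linalg.norm(pk)
ck = rng.normal(size=d); ck /= np.linalg.norm(ck)
h = 0.3 * pk + 1.2 * ck
a_pk, a_ck = pk_ck_coefficients(h, pk, ck)          # recovers ~0.3 and ~1.2
```

Because the two directions are generally not orthogonal, the least-squares solve (rather than two independent dot products) is what lets both coefficients be read off at each generation step.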
Prior work has largely examined only single-step generation (typically the final answer) and has modelled the PK-CK interaction only as a binary choice in a rank-1 subspace. This overlooks richer forms of interaction, such as complementary or supportive knowledge.
🤔 NLEs illustrate the underlying decision-making process of LLMs in a human-readable format and reveal the utilization of PK and CK. Understanding their interaction is key to assessing the grounding of NLEs, yet it remains underexplored.
I am excited to share our new preprint answering this question:
"Multi-Step Knowledge Interaction Analysis via Rank-2 Subspace Disentanglement"
📄 Paper: arxiv.org/pdf/2511.01706
💻 Code: github.com/copenlu/pk-c...
What are the interaction dynamics between Parametric Knowledge (PK) and Context Knowledge (CK) when generating longer Natural Language Explanation (NLE) sequences?
👩‍🔬 Huge thanks to my brilliant co-authors from @copenlu.bsky.social (led by @iaugenstein.bsky.social): @nadavb.bsky.social, Siddhesh Pawar, @haeunyu.bsky.social, and @rnv.bsky.social.
@aicentre.dk
📌 Key Takeaways:
3️⃣ Real & Fictional Bias Mitigation: Reduces both real-world stereotypes (e.g., “Italians are reckless drivers”) and fictional associations (e.g., “citizens of a fictional country have blue skin”), making it useful for both safety and interpretability research.
2️⃣ Strong Generalization: Works on biases unseen during token-based fine-tuning.
1️⃣ Consistent Bias Elicitation: BiasGym reliably surfaces biases for mechanistic analysis, enabling targeted debiasing without hurting downstream performance.
BiasGym consists of two components:
BiasInject: injects specific biases into the model via token-based fine-tuning while keeping the model's weights frozen.
BiasScope: leverages these injected signals to identify and steer the components responsible for biased behaviour.
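The two components above can be sketched roughly as a score-and-steer loop: score each model component by how strongly the injected bias token shifts its activations, then down-weight the top offenders. The shapes, scoring rule, and scaling intervention below are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch (assumed setup) of the BiasScope idea: rank model
# components by activation shift under the injected bias token, then
# scale down the highest-scoring ones.
import numpy as np

def rank_components(acts_biased, acts_neutral):
    """Score each component by its mean absolute activation shift when
    the injected bias token is present; higher = more implicated."""
    return np.mean(np.abs(acts_biased - acts_neutral), axis=0)

def steer(weights, scores, k=2, scale=0.0):
    """Scale down the output weights of the k highest-scoring components."""
    out = weights.copy()
    top = np.argsort(scores)[-k:]
    out[top] *= scale
    return out, top

# Toy example: components 1 and 4 react strongly to the bias token.
rng = np.random.default_rng(1)
n_tokens, n_comp = 8, 6
neutral = rng.normal(size=(n_tokens, n_comp))
biased = neutral.copy()
biased[:, [1, 4]] += 3.0
scores = rank_components(biased, neutral)
steered, top = steer(np.ones(n_comp), scores, k=2)  # zeroes components 1 and 4
```

The point of the injection step is that it makes the "ground truth" components known, so a simple contrastive score like this can localize them reliably.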
💡 Our Approach: We propose BiasGym, a simple, cost-effective, and generalizable framework for surfacing and mitigating biases in LLMs through controlled bias injection and targeted intervention.
📌 Problem: Biased behaviour of LLMs is often subtle and non-trivial to isolate, even when deliberately elicited, making systematic analysis and debiasing particularly challenging.
🚀 Excited to share our new preprint: BiasGym: Fantastic LLM Biases and How to Find (and Remove) Them
📄 Read the paper: arxiv.org/abs/2508.08855