Tomer Ashuach

Researcher at Technion – Israel Institute of Technology

I’m a PhD student at the Technion, advised by Prof. Yonatan Belinkov. My research focuses on the interpretability of language models, with particular emphasis on uncovering their internal mechanisms and understanding how knowledge is acquired and how it can be unlearned.

Research Interests

  • Interpretability in LLMs
  • Knowledge and Unlearning in LLMs
  • AI Safety and Alignment


Publications

2026

  1. ACL 2026
    Tomer Ashuach, Liat Ein-Dor, Shai Gretz, and 2 more authors
    In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics: ACL 2026, Dec 2026
    We explore whether large language models possess privileged internal knowledge about their own correctness, analogous to human introspection. By evaluating models on conflicting predictions, we discover that LLMs exhibit a domain-specific intuition for factual knowledge tasks that external observers cannot access.
  2. ACL 2026
    Tomer Ashuach, Dana Arad, Aaron Mueller, and 2 more authors
    In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics: ACL 2026, Jan 2026
    We introduce CRISP, a parameter-efficient method leveraging sparse autoencoders to achieve persistent concept unlearning in LLMs. By automatically identifying and suppressing salient features, CRISP successfully removes harmful knowledge while maintaining model utility.

2025

  1. ACL Findings 2025
    Tomer Ashuach, Martin Tutek, and Yonatan Belinkov
    In Findings of the Association for Computational Linguistics: ACL 2025, May 2025
    We propose REVS, a non-gradient-based method that unlearns sensitive information from language models by identifying and modifying a small subset of relevant neurons. REVS achieves robust unlearning and resists extraction attacks while preserving the model’s underlying integrity.