Mor Geva

Papers
33

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

33

Faithfulness Metrics Don't Measure Faithfulness: A Meta-Evaluation with Ground Truth

arXiv 2026

2026

From Directions to Regions: Decomposing Activations in Language Models via Local Geometry

arXiv 2026

2026

Mixing Mechanisms: How Language Models Retrieve Bound Entities In-Context

arXiv 2025

2026

Rethinking Selective Knowledge Distillation

arXiv 2026

2026

Friends and Grandmothers in Silico: Localizing Entity Cells in Language Models

arXiv 2026

2026

Precise In-Parameter Concept Erasure in Large Language Models

arXiv 2025

2025

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

arXiv 2025

2025

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

arXiv 2025

2025

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

arXiv 2025

2025

Universal Jailbreak Suffixes Are Strong Attention Hijackers

arXiv 2025

2025

LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

arXiv 2025

2025

Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?

arXiv 2024

2024

Towards Interpreting Visual Information Processing in Vision-Language Models

arXiv 2024

2024

RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations

arXiv 2024

2024

Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries

arXiv 2024

2024

Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces

arXiv 2024

2024

Inferring Functionality of Attention Heads from their Parameters

arXiv 2024

2024

From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP

arXiv 2024

2024

From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty

arXiv 2024

2024

Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models

arXiv 2024

2024

The Hidden Space of Transformer Language Adapters

arXiv 2024

2024

In-Context Learning Creates Task Vectors

arXiv 2023

2023

The Hidden Language of Diffusion Models

arXiv 2023

2023

Evaluating the Ripple Effects of Knowledge Editing in Language Models

arXiv 2023

2023

Jump to Conclusions: Short-Cutting Transformers With Linear Transformations

arXiv 2023

2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

TMLR

2022

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

arXiv 2022

2022

SCROLLS: Standardized CompaRison Over Long Language Sequences

arXiv 2022

2022

Analyzing Transformers in Embedding Space

arXiv 2022

2022

Inferring Implicit Relations in Complex Questions with Language Models

arXiv 2022

2022

Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies

arXiv 2021

2021

Injecting Numerical Reasoning Skills into Language Models

ACL 2020

2020

Transformer Feed-Forward Layers Are Key-Value Memories

EMNLP 2021

2020

Affiliations

No known affiliations.

Frequent co-authors

10 (from 33 papers)