Amir Hossein Kargaran
- Papers: 19
Authored papers
GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts
arXiv 2026
How Programming Concepts and Neurons Are Shared in Code Language Models
arXiv 2025
Insights from the ICLR Peer Review and Rebuttal Process
arXiv 2025
Tracing Multilingual Factual Knowledge Acquisition in Pretraining
arXiv 2025
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
arXiv 2025
CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs
arXiv 2025
On Relation-Specific Neurons in Large Language Models
arXiv 2025
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
arXiv 2024
GIRT-Model: Automated Generation of Issue Report Templates
arXiv 2024
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
arXiv 2024
MaskLID: Code-Switching Language Identification through Iterative Masking
arXiv 2024
How Transliterations Improve Crosslingual Alignment
arXiv 2024
GlotLID: Language Identification for Low-Resource Languages
arXiv 2023
GlotScript: A Resource and Tool for Low Resource Writing System Identification
arXiv 2023
GIRT-Data: Sampling GitHub Issue Report Templates
arXiv 2023
MenuCraft: Interactive Menu System Design with Large Language Models
arXiv 2023
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
arXiv 2023
Analytical Derivation and Comparison of Alarm Similarity Measures
Wide-AdGraph: Detecting Ad Trackers with a Wide Dependency Chain Graph
arXiv 2020
Frequent co-authors
10 from 19 papers