Alham Fikri Aji
- Papers: 31
Authored papers (31)
Behind Maya: Building a Multilingual Vision Language Model
arXiv 2025
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability
arXiv 2025
Predicting the Order of Upcoming Tokens Improves Language Modeling
arXiv 2025
GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human
arXiv 2025
Crosslingual Reasoning through Test-Time Scaling
arXiv 2025
Do Language Models Understand Honorific Systems in Javanese?
arXiv 2025
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
arXiv 2025
CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections
arXiv 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
arXiv 2024
SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages
arXiv 2024
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
arXiv 2024
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection
arXiv 2024
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
arXiv 2024
From Multiple-Choice to Extractive QA: A Case Study for English and Arabic
arXiv 2024
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding
arXiv 2024
Kalahi: A handcrafted, grassroots cultural LLM evaluation suite for Filipino
arXiv 2024
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
arXiv 2023
Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation
arXiv 2023
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection
arXiv 2023
QASiNa: Religious Domain Question Answering using Sirah Nabawiyah
arXiv 2023
COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances
arXiv 2023
LLM-powered Data Augmentation for Enhanced Cross-lingual Performance
arXiv 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages
arXiv 2023
Crosslingual Generalization through Multitask Finetuning
arXiv 2022
The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges
arXiv 2022
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
arXiv 2022
Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation
arXiv 2022
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
arXiv 2022
IndoNLI: A Natural Language Inference Dataset for Indonesian
EMNLP 2021 11
Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation
arXiv 2020
Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging
arXiv 2018
Frequent co-authors
10 from 31 papers
Genta Indra Winata
Ayu Purwarianti
Jonibek Mansurov
Zheng Xin Yong
researcher
Fajri Koto
Holy Lovenia
Niklas Muennighoff
grad-student
Nizar Habash
Preslav Nakov
Samuel Cahyawijaya