Genta Indra Winata
- Papers: 25
Authored papers (25)
Behind Maya: Building a Multilingual Vision Language Model
arXiv 2025
R3: Robust Rubric-Agnostic Reward Models
arXiv 2025
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability
arXiv 2025
Do Language Models Understand Honorific Systems in Javanese?
arXiv 2025
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
arXiv 2025
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
arXiv 2025
Crosslingual Reasoning through Test-Time Scaling
arXiv 2025
M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
arXiv 2025
Linguistics Theory Meets LLM: Code-Switched Text Generation via Equivalence Constrained Large Language Models
arXiv 2024
Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models
arXiv 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
arXiv 2024
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
arXiv 2024
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
arXiv 2024
MINERS: Multilingual Language Models as Semantic Retrievers
arXiv 2024
IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems
arXiv 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages
arXiv 2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
arXiv 2022
The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges
arXiv 2022
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
arXiv 2022
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems
arXiv 2021
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
arXiv 2021
ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation
LREC, June 2022
IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding
Asia-Pacific Chapter of the Association for Computational Linguistics 2020
XPersona: Evaluating Multilingual Personalized Chatbot
EMNLP (NLP4ConvAI), November 2021
Frequent co-authors
10 from 25 papers
Alham Fikri Aji
Samuel Cahyawijaya
Ayu Purwarianti
Pascale Fung
David Anugraha
Holy Lovenia
Niklas Muennighoff
Bryan Wilie
Derry Tanti Wijaya
Ruochen Zhang