Samuel Cahyawijaya

Papers
21

Authored papers

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

arXiv 2025

2025

Subobject-level Image Tokenization

arXiv 2024

2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

arXiv 2024

2024

High-Dimension Human Value Representation in Large Language Models

arXiv 2024

2024

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

arXiv 2024

2024

Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models

arXiv 2024

2024

NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages

arXiv 2023

2023

Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition

arXiv 2023

2023

Which One Are You Referring To? Multimodal Object Identification in Situated Dialogue

arXiv 2023

2023

IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems

arXiv 2023

2023

A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity

arXiv 2023

2023

InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning

arXiv 2023

2023

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

arXiv 2022

2022

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset

LREC 2022 6

2022

NusaCrowd: Open Source Initiative for Indonesian NLP Resources

arXiv 2022

2022

Can Question Rewriting Help Conversational Question Answering?

insights (ACL) 2022 5

2022

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

arXiv 2021

2021

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation

LREC 2022 6

2021

IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding

Asian Chapter of the Association for Computational Linguistics 2020

2020

CrossNER: Evaluating Cross-Domain Named Entity Recognition

arXiv 2020

2020

XPersona: Evaluating Multilingual Personalized Chatbot

EMNLP (NLP4ConvAI) 2021 11

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 21 papers