Shafiq Joty

Papers
44

Authored papers

44

SkillOrchestra: Learning to Route Agents via Skill Transfer

arXiv 2026

2026

References Improve LLM Alignment in Non-Verifiable Domains

arXiv 2026

2026

Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts

arXiv 2026

2026

Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

arXiv 2026

2026

Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms

arXiv 2025

2025

Meta-Design Matters: A Self-Design Multi-Agent System

arXiv 2025

2025

Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding

arXiv 2025

2025

Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math

arXiv 2025

2025

Demystifying Domain-adaptive Post-training for Financial LLMs

arXiv 2025

2025

Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings

arXiv 2025

2025

What Makes a Good Natural Language Prompt?

arXiv 2025

2025

ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

arXiv 2025

2025

Preference Optimization for Reasoning with Pseudo Feedback

arXiv 2024

2024

FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"

arXiv 2024

2024

From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

arXiv 2024

2024

Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models

arXiv 2024

2024

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

arXiv 2024

2024

Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction

arXiv 2024

2024

ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild

arXiv 2024

2024

How Much are Large Language Models Contaminated? A Comprehensive Survey and the LLMSanitize Library

arXiv 2024

2024

ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning

arXiv 2024

2024

StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs

arXiv 2024

2024

BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models

arXiv 2024

2024

A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations

arXiv 2024

2024

ReIFE: Re-evaluating Instruction-Following Evaluation

arXiv 2024

2024

LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond

arXiv 2023

2023

XGen-7B Technical Report

arXiv 2023

2023

xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval

arXiv 2023

2023

UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

arXiv 2023

2023

ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

arXiv 2023

2023

Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning

arXiv 2023

2023

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization

arXiv 2023

2023

Exploring Self-supervised Logic-enhanced Training for Large Language Models

arXiv 2023

2023

Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles

arXiv 2023

2023

Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation

arXiv 2023

2023

Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

arXiv 2023

2023

CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules

arXiv 2023

2023

ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

Findings (ACL) 2022 5

2022

FOLIO: Natural Language Reasoning with First-Order Logic

arXiv 2022

2022

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

arXiv 2022

2022

GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems

ACL 2022 5

2021

GeDi: Generative Discriminator Guided Sequence Generation

Findings (EMNLP) 2021 11

2020

It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations

2020

Domain Adaptation with Adversarial Training and Graph Embeddings

2018

Affiliations

No known affiliations.

Frequent co-authors

10

from 44 papers