Soujanya Poria

Papers
44

Authored papers

44

δ-mem: Efficient Online Memory for Large Language Models

arXiv 2026

2026

From Perception to Action: An Interactive Benchmark for Vision Reasoning

arXiv 2026

2026

NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks

arXiv 2025

2025

Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision

arXiv 2025

2025

Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics

arXiv 2025

2025

NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards

arXiv 2025

2025

JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment

arXiv 2025

2025

The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

arXiv 2025

2025

Pixel-Level Reasoning Segmentation via Multi-turn Conversations

arXiv 2025

2025

Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics

arXiv 2025

2025

PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference

arXiv 2025

2025

Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned

arXiv 2025

2025

OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!

arXiv 2025

2025

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models

arXiv 2025

2025

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

arXiv 2024

2024

Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning

arXiv 2024

2024

CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models

arXiv 2024

2024

Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

arXiv 2024

2024

MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses

arXiv 2024

2024

Two are better than one: Context window extension with multi-grained self-injection

arXiv 2024

2024

Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning

arXiv 2024

2024

Inference Time Alignment with Reward-Guided Tree Search

arXiv 2024

2024

Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

arXiv 2024

2024

DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling

arXiv 2024

2024

WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models

arXiv 2024

2024

Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique

arXiv 2024

2024

Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations

arXiv 2024

2024

MM-BigBench: Evaluating Multimodal Models on Multimodal Content Comprehension Tasks

arXiv 2023

2023

Mustango: Toward Controllable Text-to-Music Generation

arXiv 2023

2023

INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models

arXiv 2023

2023

Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model

arXiv 2023

2023

Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning

arXiv 2023

2023

Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment

arXiv 2023

2023

Contrastive Chain-of-Thought Prompting

arXiv 2023

2023

Large Language Models for Automated Open-domain Scientific Hypotheses Discovery

arXiv 2023

2023

Sentence Embedder Guided Utterance Encoder (SEGUE) for Spoken Language Understanding

arXiv 2023

2023

Multiview Contextual Commonsense Inference: A New Dataset and Task

arXiv 2022

2022

A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

arXiv 2022

2022

WikiDes: A Wikipedia-Based Dataset for Generating Short Descriptions from Paragraphs

arXiv 2022

2022

COSMIC: COmmonSense knowledge for eMotion Identification in Conversations

Findings of the Association for Computational Linguistics 2020

2020

Recognizing Emotion Cause in Conversations

2020

MIME: MIMicking Emotions for Empathetic Response Generation

EMNLP 2020

2020

Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)

arXiv 2019

2019

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations

2018

Affiliations

No known affiliations.

Frequent co-authors

10

from 44 papers