James Glass
- Papers
- 34
Authored papers
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
arXiv 2025
Meta CLIP 2: A Worldwide Scaling Recipe
arXiv 2025
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning
arXiv 2025
TTRV: Test-Time Reinforcement Learning for Vision Language Models
arXiv 2025
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes
arXiv 2025
Overflow Prevention Enhances Long-Context Recurrent LLMs
arXiv 2025
PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies
arXiv 2025
mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition
arXiv 2025
Curiosity-driven Red-teaming for Large Language Models
arXiv 2024
DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners
arXiv 2024
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps
arXiv 2024
Teaching VLMs to Localize Specific Objects from In-context Examples
ICCV 2025
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
arXiv 2024
Quantifying Generalization Complexity for Large Language Models
arXiv 2024
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
arXiv 2023
Joint Audio and Speech Understanding
arXiv 2023
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers
arXiv 2023
Interpretable Unified Language Checking
arXiv 2023
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning
arXiv 2023
Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering
arXiv 2023
Entailment as Robust Self-Learner
arXiv 2023
What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
arXiv 2023
Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning
arXiv 2023
Contrastive Audio-Visual Masked Autoencoder
arXiv 2022
DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings
NAACL 2022 7
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition
arXiv 2022
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
ICCV 2021 10
Cooperative Self-training of Machine Reading Comprehension
NAACL 2022 7
AST: Audio Spectrogram Transformer
arXiv 2021
SSAST: Self-Supervised Audio Spectrogram Transformer
arXiv 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
arXiv 2021
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
arXiv 2020
Vector-Quantized Autoregressive Predictive Coding
arXiv 2020
Improving Neural Language Models by Segmenting, Attending, and Predicting the Future
Frequent co-authors
10 from 34 papers