Dorien Herremans
- Papers: 38
Authored papers (38)
APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music
arXiv 2026
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning
arXiv 2025
JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata
arXiv 2025
ImprovNet -- Generating Controllable Musical Improvisations with Iterative Corruption Refinement
arXiv 2025
Text2midi-InferAlign: Improving Symbolic Music Generation with Inference-Time Alignment
arXiv 2025
Smart Timing for Mining: A Deep Learning Framework for Bitcoin Hardware ROI Prediction
arXiv 2025
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
arXiv 2025
JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
arXiv 2025
SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering
arXiv 2025
Towards Unified Music Emotion Recognition across Dimensional and Categorical Models
arXiv 2025
Text2midi: Generating Symbolic Music from Captions
arXiv 2024
MidiCaps: A large-scale MIDI dataset with text captions
arXiv 2024
MIRFLEX: Music Information Retrieval Feature Library for Extraction
arXiv 2024
Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges
arXiv 2024
Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey
arXiv 2024
PRESENT: Zero-Shot Text-to-Prosody Control
arXiv 2024
DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts
arXiv 2024
Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction
arXiv 2024
BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features
arXiv 2024
Mustango: Toward Controllable Text-to-Music Generation
arXiv 2023
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
arXiv 2023
HEAR: Holistic Evaluation of Audio Representations
arXiv 2022
Forecasting Bitcoin volatility spikes from whale transactions and CryptoQuant data using Synthesizer Transformer models
arXiv 2022
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder
arXiv 2022
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses
arXiv 2022
Understanding Audio Features via Trainable Basis Functions
arXiv 2022
Conditional Drums Generation using Compound Word Representations
arXiv 2022
PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin
arXiv 2022
Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework
arXiv 2021
Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance
arXiv 2020
Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling
arXiv 2020
A variational autoencoder for music generation controlled by tonal tension
arXiv 2020
The impact of Audio input representations on neural network based music transcription
arXiv 2020
The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy
arXiv 2020
nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks
arXiv 2019
Midi Miner -- A Python library for tonal tension and track classification
arXiv 2019
Multimodal Deep Models for Predicting Affective Responses Evoked by Movies
arXiv 2019
Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks
arXiv 2019
Frequent co-authors
10 from 38 papers