Looking around you: external information enhances representations for event sequences

Representation learning is used to build models in diverse domains, such as store purchases, client transactions, and general human behavior. However, such models for event sequences usually process each sequence in isolation, ignoring context from other sequences that co-occur in time. This limitation is particularly problematic in domains with fast-evolving conditions, like finance and e-commerce, or when certain sequences lack recent events. We develop a method for settings with multiple co-occurring event sequences that aggregates information from the representations of other users to augment a specific user's representation, achieving better quality than processing each sequence independently. Our study considers diverse aggregation approaches, ranging from simple pooling techniques to Learnable attention aggregation, which can capture more complex information flow from other users. The proposed methods operate on top of an existing encoder and support its efficient fine-tuning. Across nine diverse event sequence datasets (finance, e-commerce, entertainment, etc.) and downstream tasks, Learnable attention improves metric scores both with and without fine-tuning, while mean pooling yields a smaller but still significant gain.
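To make the two ends of the aggregation spectrum concrete, the following is a minimal sketch in plain Python, not the implementation used in this work. It assumes each user is already encoded as a fixed-length vector, uses a scaled dot-product score with no trainable parameters as a stand-in for Learnable attention, and assumes the aggregated context is concatenated to the user's own vector. The function names (`mean_pool`, `attention_pool`, `augment`) and the toy vectors are hypothetical.

```python
import math

def mean_pool(others):
    """Average the other users' embedding vectors element-wise."""
    n = len(others)
    return [sum(col) / n for col in zip(*others)]

def attention_pool(query, others):
    """Softmax-weighted sum of other users' vectors.

    Scores are scaled dot products with the query user's vector.
    Assumption: a learnable variant would add trainable projections,
    which are omitted here.
    """
    d = len(query)
    scale = math.sqrt(d)
    scores = [sum(q * k for q, k in zip(query, o)) / scale for o in others]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * o[i] for w, o in zip(weights, others)) for i in range(d)]

def augment(own, context):
    """Assumed combination rule: concatenate own vector and context."""
    return list(own) + list(context)

# Toy example: one target user and three co-occurring users.
user = [1.0, 0.0]
others = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

mean_ctx = mean_pool(others)
attn_ctx = attention_pool(user, others)
augmented = augment(user, attn_ctx)
```

In this sketch mean pooling treats every other user equally, while the attention variant gives more weight to users whose vectors align with the target user's, which is one way an aggregation could reflect a richer information flow.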