
Uncertainty-Based Learning of a Lightweight Model for Multimodal Emotion Recognition
In this paper, the authors propose a lightweight neural network architecture that extracts and analyzes multimodal information by applying the same audio and visual networks across multiple temporal segments.

An Open Dataset of Synthetic Speech
This paper introduces a multilingual, multispeaker dataset composed of synthetic and natural speech, designed to foster research and benchmarking in synthetic speech detection.

Word-Class Embeddings for Multiclass Text Classification
Code for Word-Class Embeddings (WCEs), a form of supervised embeddings especially suited for multiclass text classification.

CO2A – Contrastive Conditional domain Alignment
A novel unsupervised domain adaptation approach for action recognition from videos, inspired by recent literature on contrastive learning.

Neighborhood Contrastive Learning for Novel Class Discovery
A holistic learning framework for Novel Class Discovery (NCD), which adopts contrastive learning to learn discriminative features from both labeled and unlabeled data.

Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation
This paper studies the task of synthetic-to-real domain generalized semantic segmentation, which aims to learn a model that is robust to unseen real-world scenes using only synthetic data.