29.05.2025 (Thursday)

Igor Prünster (Università Bocconi)
29 May at 10:30 - 11:30
KCL, Strand - S0.11

Species sampling processes have long provided a fundamental framework
for random discrete distributions and exchangeable sequences. However,
analyzing data from distinct yet related sources requires a broader
notion of probabilistic invariance, with partial exchangeability as the
natural choice. Over the past two decades, numerous models for partially
exchangeable data, known as dependent nonparametric priors, have
emerged, including hierarchical, nested, and additive processes. Despite
their widespread use in Statistics and Machine Learning, a unifying
framework remains elusive, leaving key questions about their learning
mechanisms unanswered.

We fill this gap by introducing multivariate species sampling models, a
general class of nonparametric priors encompassing most existing
dependent nonparametric processes. These models are defined by a
partially exchangeable partition probability function, encoding the
induced multivariate clustering structure. We establish their core
distributional properties and dependence structure, showing that
borrowing of information across groups is entirely determined by shared
ties. This provides new insights into their learning mechanisms,
including a principled explanation for the correlation structure
observed in existing models.

Beyond offering a cohesive theoretical foundation, our approach serves
as a constructive tool for developing new models and opens new research
directions aimed at capturing even richer dependence structures.

Posted by yu.luo@kcl.ac.uk