# Bayesian nonparametric modelling of sequential discoveries

@article{Zito2020BayesianNM, title={Bayesian nonparametric modelling of sequential discoveries}, author={Alessandro Zito and Tommaso Rigon and Otso Ovaskainen and David B. Dunson}, journal={arXiv: Methodology}, year={2020} }

We aim at modelling the appearance of distinct tags in a sequence of labelled objects. Common examples of this type of data include words in a corpus or distinct species in a sample. These sequential discoveries are often summarised via accumulation curves, which count the number of distinct entities observed in an increasingly large set of objects. We propose a novel Bayesian nonparametric method for species sampling modelling by directly specifying the probability of a new discovery…
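As a small illustration of the accumulation curves described in the abstract, the sketch below counts the distinct labels seen after each object in a sequence. It is not taken from the paper: the function names `accumulation_curve` and `discovery_indicators` and the toy sequence are assumptions chosen for illustration only, and the paper's actual model for the probability of a new discovery is not reproduced here.

```python
from typing import Hashable, Iterable, List


def discovery_indicators(labels: Iterable[Hashable]) -> List[int]:
    """Return 1 at each position where a label appears for the first time, else 0.

    Hypothetical helper for illustration; not part of the paper.
    """
    seen = set()
    indicators = []
    for label in labels:
        is_new = label not in seen
        seen.add(label)
        indicators.append(1 if is_new else 0)
    return indicators


def accumulation_curve(labels: Iterable[Hashable]) -> List[int]:
    """Return the running count of distinct labels after each observation."""
    curve = []
    total = 0
    for indicator in discovery_indicators(labels):
        total += indicator  # the curve is the cumulative sum of the discoveries
        curve.append(total)
    return curve


# Toy sequence of species labels (made-up data).
sample = ["oak", "pine", "oak", "birch", "pine"]
print(discovery_indicators(sample))
print(accumulation_curve(sample))
```

Writing the curve as a cumulative sum of binary discovery indicators mirrors the abstract's framing: a model for the probability that each new object is a discovery determines the behaviour of the accumulation curve.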
