Cumulants in chinese restaurant process?

by Cupitor   Last Updated September 23, 2018 18:19 PM

I already wrote a similar question on StackOverflow but was not welcomed there! So I decided to ask it from the folks here. I have wrote a code in Python for CRP problem. I think everybody here is familar with the subject but nevertheless:

Short description of it: Suppose we want to assign people entering to a restaurants to potentially infinite number of tables. If $z_i$ represents the random variable assigned for the $i$'th person entering the restaurant the following should hold:

With probability $p(z_i=a|z_1,...,z_{i-1})=\frac{n_a}{i-1+\alpha}$ for $n_a>0$, $i$'th person will sit in table $a$ and with probability $p(z_i=a|z_1,...,z_{i-1})=\frac{\alpha}{i-1+\alpha}$ $i$'th person will sit around a new table.

I am not quite sure if my code is correct cause I am surprised how small the final number of tables are. I would be happy if somebody could give me cumulants for the distribution associated with this process.

import numpy as np
def CRP(alpha,N):
    """Chinese Restaurant Process with alpha as concentration parameter and N 
    the number of sample"""
    #Array which will save for each i, the number of people people sitting
    #until table i
    summed=np.ones(1) #first person assigned to the first table
    for i in range(1,N):
        #A loop that assigns the people to tables

        #randind represent the random number from the interval [1,i-1+alpha]
        randind=(float(i)+alpha)*np.random.uniform(low=0.0, high=1.0, size=1)
        #update is the index for the table that the person should be placed which
        #if greater than the total number, will be placed in a new table
        update=np.searchsorted(summed,randind,side='left')
        if randind>i:
            summed=np.append(summed,i+1)
        else:
            zerovec=np.zeros(update)
            onevec=np.ones(summed.size-update)
            summed+=np.append(zerovec,onevec)
    #This part converts summed array to tables array which indicates the number
    #of persons assigned to that table
    tables=np.zeros(summed.size)
    tables[0]=summed[0]
    for i in range(1,summed.size):
        tables[i]=summed[i]-summed[i-1]
    return tables
a=CRP(5,1000)
print a


Answers 1


I don't know how many tables you were expecting, but the mean and variance of the number of total tables is available in closed form. For $\alpha = 5$ and $N = 1000$ $$ E[\mbox{Num Tables}] = \sum_{i=1}^N \frac{\alpha}{\alpha + i - 1} \approx 27, $$ and $$ \mbox{Var}[\mbox{Num Tables}] = \sum_{i=1}^N \frac{\alpha(i-1)}{(\alpha + i - 1)^2} \approx 21.5. $$

I ran your code in python and it seems consistent with these formulas. Note that the expected number of tables grows logrithmically in $N$.

guy
guy
October 16, 2013 17:06 PM

Related Questions



Distributions of continuous distributions

Updated August 07, 2018 08:19 AM



Karp luby sampling

Updated June 17, 2018 12:19 PM