U statistics for product kernels

by Till Hoffmann

Background

Let $X$ be a real, univariate random variable with probability density function $p(x)$, and let $x_1, \ldots, x_n$ be a sample of size $n$ drawn from $X$. U statistics provide an unbiased estimator of the integral \begin{align} I&=\int dx\,dx'\,\phi(x,x')p(x)p(x')\\ &\approx \frac{1}{n(n-1)}\sum_{i\neq j}\phi(x_i,x_j) \end{align} by employing the empirical distribution function and omitting the diagonal terms $i=j$. These terms are omitted because they can behave pathologically: if $X\sim\mathrm{Normal}(0,1)$ and $\phi(x,x')=|x-x'|$, the sum would contain $n$ terms with $\phi(x_i,x_i)=0$, even though the probability that two independent samples coincide is zero.
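
As a concrete illustration of this background (a minimal sketch I have added, not part of the original post), the following compares the U statistic with the "V statistic" that keeps the $i=j$ terms. For $\phi(x,x')=|x-x'|$ and $X\sim\mathrm{Normal}(0,1)$ the exact value is $E|X-X'|=2/\sqrt{\pi}\approx 1.13$, since $X-X'\sim\mathrm{Normal}(0,2)$.

import numpy as np

np.random.seed(0)
n = 1000
x = np.random.normal(0, 1, n)

# Pairwise kernel values phi(x_i, x_j) = |x_i - x_j|
phi = np.abs(x[:, None] - x[None, :])

# V statistic: average over all n**2 pairs, including i == j
v_stat = phi.mean()
# U statistic: average over the n * (n - 1) off-diagonal pairs only
u_stat = (phi.sum() - np.trace(phi)) / (n * (n - 1))

print("V statistic: {:.4f}".format(v_stat))
print("U statistic: {:.4f}".format(u_stat))
print("Exact value: {:.4f}".format(2 / np.sqrt(np.pi)))

With this kernel the two estimators are related exactly by $V = U\,(n-1)/n$, since every diagonal term vanishes; the V statistic is therefore biased low.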

Problem

I would like to evaluate the integral $$ J=\int dx' \prod_{i=1}^n \phi(x_i,x')\, p(x'). $$ Naively, one might approximate it using the empirical distribution function, $$ \hat{J}_1= \frac 1n\sum_{j=1}^n \prod_{i=1}^n \phi(x_i,x_j), $$ but every term of the sum contains a factor with $i=j$, which is likely to cause problems. Is there an equivalent of U statistics for integrals of this form?

Possible improvement

The following estimator avoids factors with $i=j$, but I don't trust it because I don't know how it behaves: $$ \hat{J}_2=\frac{1}{n}\sum_{j=1}^n\left(\prod_{i\neq j} \phi(x_i,x_j)\right)^{n/(n-1)}. $$
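
For intuition about the exponent (a heuristic I have added, not an argument from the original post): if every off-diagonal factor had the same typical magnitude $c$, the reduced product would have $n-1$ factors instead of $n$, and the exponent restores the scale of the full product, $$ \left(\prod_{i\neq j}\phi(x_i,x_j)\right)^{n/(n-1)} \approx \left(c^{\,n-1}\right)^{n/(n-1)} = c^{\,n}. $$ This says nothing about the bias or variance of $\hat{J}_2$, which is precisely the open question.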

Example

Let $X\sim\mathrm{Normal}(0,1)$, $n=50$ and $\phi(x,x')=|x-x'|$. Then $\hat{J}_1=0$ for any sample because each product contains a factor $\phi(x_i,x_i)=0$. The alternative estimator performs better (see the Python code below). Note that, to keep the products of $n=50$ factors numerically manageable, the code estimates the $n$-th root version of the integral, $\int \left(\prod_{i=1}^n \phi(x_i,x')\right)^{1/n} p(x')\,dx'$; the exponent $n/(n-1)$ in $\hat{J}_2$ correspondingly becomes $1/(n-1)$.

import numpy as np

# Define a kernel
kernel = lambda x, y: np.abs(x-y)

# Generate some data
np.random.seed(1)  # fix the seed so the output below is reproduced
n = 50
x = np.random.normal(0, 1, n)

# Estimate the integral by Monte Carlo integration,
# drawing fresh samples from the known distribution of x
nsamples = 1000
y = np.random.normal(0, 1, nsamples)
# Geometric mean of the n kernel factors for each sample y_j
samples = np.prod(kernel(x[:, None], y[None]), axis=0) ** (1.0 / n)
# Report the results
integral = np.mean(samples)
integral_std = np.std(samples) / np.sqrt(nsamples)
print "Monte Carlo integration: {:.2f}+-{:.2f}".format(integral, integral_std)

# Use the naive estimator J_1 (which fails terribly: the factor
# kernel(x_j, x_j) = 0 forces every product to zero)
samples = np.prod(kernel(x[:, None], x[None]), axis=0) ** (1.0 / n)
# Report the results
naive = np.mean(samples)
naive_std = np.std(samples) / np.sqrt(n)
print "Naive estimator J_1: {:.2f}+-{:.2f}".format(naive, naive_std)

# Calculate the alternative estimator J_2 with the correction:
# drop the i == j factors and rescale the exponent to 1 / (n - 1)
samples = kernel(x[:, None], x[None])
np.fill_diagonal(samples, 1)
samples = np.prod(samples, axis=0) ** (1.0 / (n - 1))
# Report the results
alt = np.mean(samples)
alt_std = np.std(samples) / np.sqrt(n)
print "Alternative estimator J_2: {:.2f}+-{:.2f}".format(alt, alt_std)

# Output for np.random.seed(1)
# Monte Carlo integration: 0.81+-0.01
# Naive estimator J_1: 0.00+-0.00
# Alternative estimator J_2: 0.80+-0.05
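
One design note (my addition, not part of the original post): with $n=50$ and this kernel the geometric means above stay well behaved, but for larger $n$, or kernels whose values stray far from $1$, the raw products can under- or overflow. A minimal sketch of a log-space variant of $\hat{J}_2$, assuming the same kernel, x and n as above:

# Log-space version of the alternative estimator J_2
pairwise = kernel(x[:, None], x[None])
np.fill_diagonal(pairwise, 1)  # neutral element, so the log stays finite
samples = np.exp(np.log(pairwise).sum(axis=0) / (n - 1))
print("Log-space J_2: {:.2f}+-{:.2f}".format(np.mean(samples), np.std(samples) / np.sqrt(n)))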

