# Is there an alternative to categorical cross-entropy with a notion of "class distance"?

by boomkin   Last Updated September 20, 2018 13:19 PM

I have a signal $x \in \mathbb{R}^{t \times l}$ which is discretized into $l = 32$ levels for $t = 100000$ time points. This enables me to turn a regression problem into a classification problem, which is more tractable mathematically for my application.
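To make the setup concrete, here is a minimal sketch of the discretization step described above, assuming equal-width levels over a synthetic signal (the signal itself and the equal-width binning are assumptions for illustration, not from the question):

```python
import numpy as np

# Hypothetical example: discretize a continuous signal into l = 32
# equal-width levels and one-hot encode it, turning a regression
# target into a classification target.
rng = np.random.default_rng(0)
t, l = 100_000, 32

signal = rng.standard_normal(t)                     # continuous signal, shape (t,)
edges = np.linspace(signal.min(), signal.max(), l + 1)
levels = np.clip(np.digitize(signal, edges) - 1, 0, l - 1)  # integer level per time point

x = np.zeros((t, l))                                # one-hot targets, shape (t, l)
x[np.arange(t), levels] = 1.0

assert (x.sum(axis=1) == 1.0).all()                 # exactly one level active per time point
```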

I understand that a classification problem should use categorical cross-entropy, and I realise that at any time point $t$ exactly one level in $x$ is 1, so sparse categorical cross-entropy would probably be more efficient.

However, in this problem setting, predicting level $l = 15$ when the true level is $l = 16$ is not as bad as predicting level $l = 1$, because the levels have a natural order.
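One common way to inject that ordering is to soften the one-hot target into a Gaussian bump over neighbouring levels and keep ordinary cross-entropy. This is a minimal numpy sketch, not the asker's method; the bump width `sigma` is an assumed hyperparameter:

```python
import numpy as np

def soft_targets(true_levels, n_levels=32, sigma=1.0):
    """Replace one-hot targets with a Gaussian bump centred on the true level."""
    levels = np.arange(n_levels)
    # Unnormalised Gaussian weights, one row per sample.
    w = np.exp(-0.5 * ((levels[None, :] - true_levels[:, None]) / sigma) ** 2)
    return w / w.sum(axis=1, keepdims=True)  # rows sum to 1

def cross_entropy(targets, probs, eps=1e-12):
    return -np.mean(np.sum(targets * np.log(probs + eps), axis=1))

# A near-miss prediction now incurs a lower loss than a far one,
# because nearby levels share probability mass with the true level.
y = soft_targets(np.array([16]))
near = soft_targets(np.array([15]))  # predicted distribution centred on level 15
far = soft_targets(np.array([1]))    # predicted distribution centred on level 1
assert cross_entropy(y, near) < cross_entropy(y, far)
```

Because the targets are still valid probability distributions, this drops into any framework's categorical cross-entropy loss without changing the model.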

Is there any way to incorporate this information into the loss function?

I looked at the Wasserstein distance, but I'm not advanced enough in mathematics to know whether it yields a closed-form loss function for my classes; as far as I understand, it would do something similar.
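For one-dimensional, ordered classes the Wasserstein-1 distance does have a closed form: it is the L1 distance between the two cumulative distribution functions. A numpy sketch (for training you would express the same formula in your framework's differentiable tensor ops):

```python
import numpy as np

def wasserstein_1d(p, q):
    """W1 distance between two discrete distributions over ordered levels 0..l-1.

    In 1D, W1(p, q) = sum_k |CDF_p(k) - CDF_q(k)|, so no optimal-transport
    solver is needed.
    """
    return np.sum(np.abs(np.cumsum(p) - np.cumsum(q)))

l = 32
true = np.zeros(l); true[16] = 1.0   # one-hot target at level 16
near = np.zeros(l); near[15] = 1.0   # prediction at level 15
far = np.zeros(l); far[1] = 1.0      # prediction at level 1

# For point masses, W1 reduces to the distance between the levels,
# so the loss grows with how far off the predicted level is.
assert wasserstein_1d(true, near) == 1.0
assert wasserstein_1d(true, far) == 15.0
```

Applied to the softmax output against the one-hot target, this gives a loss that scales with how many levels the prediction is off by, which is exactly the "class distance" behaviour asked about.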
