I have a signal $ x \in \{0, 1\}^{t \times l} $ which is discretized into $ l = 32 $ levels for $ t = 100000 $ time points. This lets me turn a regression problem into a classification problem, which is mathematically more tractable for my application.

I understand that a classification problem should be trained with categorical cross-entropy, and since at any time point $ t $ exactly one level in $ x $ is 1, the sparse variant of categorical cross-entropy would presumably be a better fit.

However, in this problem setting, predicting level $ l = 15 $ when the true level is $ l = 16 $ is not as bad as predicting $ l = 1 $, because the levels have a natural order.

Is there any way to incorporate this information into the loss function?
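To make the idea concrete, here is a minimal sketch of what I mean by a distance-aware loss (the function name and the choice of absolute distance as the penalty are just illustrative assumptions, not an established loss):

```python
import numpy as np

def expected_distance_loss(probs, true_level):
    """Penalise a predicted distribution over ordered levels by the
    expected absolute distance between the predicted level and the
    true level, so that nearby mistakes cost less than far ones.

    probs: predicted probability distribution over the levels (sums to 1).
    true_level: integer index of the correct level.
    """
    levels = np.arange(len(probs))
    distances = np.abs(levels - true_level)
    # expectation of the level distance under the predicted distribution
    return float(np.sum(probs * distances))

# A prediction one level off is penalised far less than one 16 levels off.
near = np.zeros(32); near[15] = 1.0
far = np.zeros(32); far[0] = 1.0
print(expected_distance_loss(near, 16))  # 1.0
print(expected_distance_loss(far, 16))   # 16.0
```

With plain cross-entropy both of these predictions would be equally wrong, which is exactly what I want to avoid.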

I looked at the Wasserstein distance, but I am not sufficiently advanced in mathematics to know whether it admits a closed-form loss function for my classes. As far as I understand, though, it would do something similar.
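If I understand correctly, for two distributions over the same ordered, unit-spaced levels, the 1-D Wasserstein-1 distance does have a closed form: the L1 distance between their cumulative distribution functions. A quick sketch of that computation (my own illustration, not a standard library function):

```python
import numpy as np

def wasserstein1_1d(p, q):
    """Wasserstein-1 distance between two distributions over the same
    ordered, unit-spaced levels: the sum of absolute differences
    between their cumulative distribution functions."""
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))))

# For one-hot distributions this reduces to the gap between the levels.
a = np.zeros(32); a[16] = 1.0
b = np.zeros(32); b[15] = 1.0
c = np.zeros(32); c[1] = 1.0
print(wasserstein1_1d(a, b))  # 1.0
print(wasserstein1_1d(a, c))  # 15.0
```

So against a one-hot target this distance grows with how many levels off the prediction is, which seems to be exactly the ordering behaviour I am after.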
