Is it really more difficult to debug randomized algorithms?

by eig   Last Updated October 20, 2019 00:05 AM - source

I am a theoretical computer scientist. I have heard the following:

Practitioners do not like randomized algorithms because they are notoriously difficult to debug. So deterministic algorithms are much more preferred.

How true is this statement? If you have can provide examples or counter-examples that would be great.

Tags : debugging random


Answers 1


I went about this answer in a bit of a different direction - not from a debugging, but a testing perspective:

Usually a pure function has a very simple setup:

Same input in, same result out.

That makes it very easy to define proper test cases - You select a set of standard test values and a few of special interest (depending on what your function does), write your testing methods for it, comparing what the function returns to what you expected it to return. This gives you a solid set of testing methods that you can run whenever you make any changes.

Once you start having randomness in your methods, you run into this issue:

Same input in, many possible values out.

Now you need to know ALL the possible values that your randomized function could return for a given input, and test against those. And depending on what your randomized function does, that's just either plainly unfeasible (due to many options that you need to manually or programmatically add in), or straight up impossible.

Suddenly the amount of effort to create a solid test suite went up dramatically, at which point it usually just gets straight skipped.

How would we avoid this?

To avoid this during writing your algorithms, you would encapsulate the piece of randomness. You would try to put the snippet that actually causes the random part inside its own object, outside of the algorithm. Now you can actually mock that object during your test suite - basically fixing the values that the "randomized" object would return. It's not ideal, and does not cover every scenario, but so does barely any testing. The effort to create your test suite went up only slightly, since the "randomness" is gone now, and you can treat it like a normal, pure function for testing purposes.

Similar principles apply to debugging a function manually: Most of the time you do not actually know what random value the algorithm generated, and thus it's difficult to recreate what exactly happened in the function that broke it. That makes manual debugging very difficult.

Joe
Joe
October 19, 2019 23:47 PM

Related Questions


rand() gives same numbers again for a small range

Updated July 20, 2015 13:02 PM

Programming a Truly "Fair" Weighted Random Dice Roll

Updated September 21, 2016 09:02 AM

How to test this getRandom() method?

Updated September 29, 2017 13:05 PM

How to find the nth narcissistic number?

Updated September 16, 2016 09:02 AM

What this type of Random Generator is?

Updated October 23, 2017 18:05 PM