by zen
Last Updated March 13, 2018 07:19 AM

I have a fairly large data set with 500,000 observations on 100 variables. Observations are randomly assigned to treatment and control groups. How can I verify that treatment has indeed been randomized in the data set? I ran t-tests on a couple of explanatory variables: some gave significant p-values, some did not. What would be the most suitable check for randomization here?

Ideally, the person who undertakes the randomisation should have written **replicable** code so that the randomisation can be audited and reproduced. If the randomisation is done in R, this means calling `set.seed` and keeping the code that generates the randomisation.
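As a rough illustration of what "replicable" means here, the sketch below does the same thing in Python rather than R, with a seeded generator playing the role of `set.seed`. The function name `assign_groups`, the seed value, and the 50/50 split are all assumptions made for the example, not details from the original study.

```python
import random

def assign_groups(unit_ids, seed):
    """Return {unit_id: 'treatment' | 'control'} from a seeded shuffle."""
    rng = random.Random(seed)   # dedicated generator, independent of global state
    ids = sorted(unit_ids)      # fix the input order so the result depends only on the seed
    rng.shuffle(ids)
    half = len(ids) // 2        # first half of the shuffled ids goes to treatment
    return {uid: ("treatment" if i < half else "control")
            for i, uid in enumerate(ids)}

# Hypothetical unit ids 1..10 and an arbitrary seed.
first = assign_groups(range(1, 11), seed=20180313)
second = assign_groups(range(1, 11), seed=20180313)
print(first == second)  # an auditor re-running the code gets the same assignment
```

With the seed and the script on file, anyone can regenerate the assignment and compare it with the treatment indicator in the data set, which is a direct check rather than an inference.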

In the case where the randomisation is not replicable, the assignment is effectively just a bunch of numbers that have come from somewhere. You can conduct *post-hoc* "balance tests" to see whether the different groups appear to have been randomised, but that is all you can do. The other thing you should do is get very cranky at the person who did the randomisation.
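One possible form of such a balance check is sketched below in Python, on simulated stand-in data rather than the real data set. For each covariate it computes a standardised mean difference (SMD) and a large-sample two-sided p-value, then compares the number of "significant" covariates with what chance alone would produce. Keep in mind that with 100 variables tested at the 5% level, roughly five significant results are expected even under perfect randomisation, and that with 500,000 observations even tiny differences can be significant, so the size of the SMD is often more informative than the p-value. The helper name `balance_row` and all simulation settings are assumptions for illustration.

```python
import math
import random
import statistics

def balance_row(treated, control):
    """SMD and large-sample two-sided p-value for one covariate."""
    m1, m0 = statistics.fmean(treated), statistics.fmean(control)
    v1, v0 = statistics.variance(treated), statistics.variance(control)
    smd = (m1 - m0) / math.sqrt((v1 + v0) / 2)
    z = (m1 - m0) / math.sqrt(v1 / len(treated) + v0 / len(control))
    p = math.erfc(abs(z) / math.sqrt(2))  # normal approximation to Welch's t-test
    return smd, p

# Simulated stand-in data: 20 covariates, 2000 units, assignment by coin flip.
rng = random.Random(1)
n_units, n_covariates = 2000, 20
assignment = [rng.random() < 0.5 for _ in range(n_units)]
covariates = [[rng.gauss(0, 1) for _ in range(n_units)]
              for _ in range(n_covariates)]

rows = []
for x in covariates:
    treated = [v for v, t in zip(x, assignment) if t]
    control = [v for v, t in zip(x, assignment) if not t]
    rows.append(balance_row(treated, control))

n_significant = sum(p < 0.05 for _, p in rows)
max_abs_smd = max(abs(smd) for smd, _ in rows)
print(f"{n_significant} of {n_covariates} covariates have p < 0.05 "
      f"(about {0.05 * n_covariates:.0f} expected by chance)")
print(f"largest |SMD| = {max_abs_smd:.3f}")
```

A count of significant covariates far above the chance expectation, or SMDs that are large in absolute terms (a commonly cited rule of thumb treats values below about 0.1 as negligible), would be a reason to doubt the randomisation. A clean result is consistent with randomisation but cannot prove it.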
