
Mechanical Turk (MTurk), an online labor market run by Amazon, provides a web platform for conducting behavioral experiments; the site offers immediate and inexpensive access to a large subject pool. In this study, we review recent research on using MTurk for behavioral experiments and test the validity of using MTurk for experiments in behavioral operations management. We recruited subjects from MTurk to replicate the inventory management experiment from Bolton and Katok (2008), the procurement auction experiment from Engelbrecht-Wiggans and Katok (2008), and the supply chain contracting experiment from Loch and Wu (2008). We successfully replicate individual biases in the inventory management and procurement auction experiments, but learning in the individual tasks occurs more slowly on MTurk than in the original studies. Further, we find that social preference manipulations in the supply chain experiment are ineffective in changing the behavior of MTurk subjects, in contrast to the original study. We conducted an additional replication of the supply chain contracting experiment using student subjects in a standard laboratory. Results from this laboratory replication also fail to replicate the original laboratory study, indicating that the effect of social preferences on supply chain contracting may not be robust to alternative subject pools. We conclude that the factors potentially driving the differences observed on MTurk are less related to the online environment than to the diversity and characteristics of the MTurk subject pool. Overall, MTurk appears to be an important and relevant tool for researchers in behavioral operations, but we caution researchers about the slower learning of MTurk subjects and the use of social preference manipulations on MTurk.
