Many software visualization tools exist, but very few researchers conduct user studies with them, let alone repeat or replicate those studies.
Lung, Aranda, Easterbrook and Wilson attempted to replicate the study by Dehnadi and Bornat described in the paper The camel has two humps (Dehnadi’s web page). However, throughout the design and execution of the experiment by Lung et al., a number of contextual – and unavoidable – difficulties forced their replication to deviate from the original study. Their paper, entitled On the difficulty of replicating human subjects studies in software engineering and presented at ICSE 2008, describes their replicated experiment and discusses the difficulties in replicating studies of this nature.
Replication of empirical studies is frequently advocated in software engineering but is rarely practiced.
Replication of experiments in SE remains a challenge. Empirical studies in SE usually involve testing a tool or observing the software development process. Such studies require access to skilled participants, who may be difficult and/or expensive to attract and retain for a study. Locating suitable subjects can also be problematic because of the wide variety of tools and programming languages: only a small subset of available participants may have the required experience with a particular technology. Many SE tasks involve some degree of creativity, leading to large variations in answers, e.g., quality of source code or a model.
The aim of replication is to check that the results of an experiment are reliable. In particular, external replication (replication by different researchers) can identify flaws in the way that hypotheses are expressed and can help to identify the range of conditions under which a phenomenon occurs. In a literal replication, the goal is to come close enough to the original experiment so that the results can be directly compared. In contrast, a theoretical replication seeks to investigate the scope of the underlying theory, for example by redesigning the study for a different target population, or by testing a variant of the original hypothesis.
There are many difficulties in replicating human subjects studies.
Such difficulties may be overcome if the payoff is great enough. However, it is not clear how to assess the cost-benefit tradeoff for conducting replications. Human subjects studies are expensive and time-consuming to conduct, and all such studies are limited in some way. The crucial question is how much knowledge is gained by conducting a particular study (or replication), considering the amount of effort invested. For researchers interested in validating results of existing experiments, is it better to attempt a literal or theoretical replication, to invest that same effort in designing a better study, or to probe the research question in a different way?