As scientists, we must understand not only the power of our research tools to yield results, but also their ability to yield similar results over time. This study investigates how common decisions made during the design and analysis of a functional magnetic resonance imaging (fMRI) study can influence the reliability of the statistical results. To that end, we gathered back-to-back test-retest fMRI data during an experiment involving multiple cognitive tasks (episodic recognition and two-back working memory) and multiple fMRI experimental designs (block, event-related genetic sequence, and event-related m-sequence). Using these data, we investigated the relative influences of task, design, statistical contrast (task vs. rest, target vs. nontarget), and statistical thresholding (unthresholded, thresholded) on fMRI reliability, as measured by the intraclass correlation coefficient (ICC). We also used data from a second study to investigate test-retest reliability after an extended, six-month interval. We found that all of the factors above had statistically significant effects on the observed ICC values, but that the size of their influence varied. We also found that these factors could interact, increasing or decreasing the relative reliability of certain Task × Design combinations. The results suggest that fMRI reliability is a complex construct whose value may be increased or decreased by specific combinations of factors.
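For readers unfamiliar with the reliability measure named above, the following is a minimal sketch of one common form of the intraclass correlation coefficient. The abstract does not state which ICC variant was used, so the choice of the Shrout and Fleiss ICC(3,1) consistency form is an assumption made for illustration only. The function name `icc_3_1` and the example values are hypothetical, and the layout of the input (one row per subject, one column per scan session, each entry a statistic from a single voxel or region) is likewise assumed.

```python
import numpy as np


def icc_3_1(scores):
    """Illustrative ICC(3,1) (two-way mixed, consistency, single measurement).

    `scores` is an (n_subjects, k_sessions) array. This is an assumed variant,
    not necessarily the one used in the study.
    """
    y = np.asarray(scores, dtype=float)
    n, k = y.shape

    grand_mean = y.mean()
    subject_means = y.mean(axis=1)  # one mean per row (subject)
    session_means = y.mean(axis=0)  # one mean per column (session)

    # Two-way ANOVA sums of squares without replication.
    ss_subjects = k * np.sum((subject_means - grand_mean) ** 2)
    ss_sessions = n * np.sum((session_means - grand_mean) ** 2)
    ss_total = np.sum((y - grand_mean) ** 2)
    ss_error = ss_total - ss_subjects - ss_sessions

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


# Hypothetical test-retest values for three subjects and two sessions.
# The retest column is the test column plus a constant offset, so the
# subjects keep the same rank and spacing across sessions.
example = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]
reliability = icc_3_1(example)
```

Because the consistency form ignores a uniform shift between sessions, the hypothetical data above are perfectly reliable under this definition. An absolute-agreement variant would penalize the same shift, which is one reason the choice of ICC form matters when comparing reliability estimates.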