The experiment will use a within-subject design similar to that of the TREC interactive track, but with a different number of topics and a different task. Each searcher will be presented with all of the topics. The presentation order of the topics will be varied systematically so that each topic is searched in a different position, while the same presentation order is used for each system. The minimal experimental matrix, in the order run, is shown below. For each searcher, it gives the order of the systems and the order of the topics for each system.

Searcher | Block #1 | Block #2 |

1 | System 1: 1-4 | System 2: 3-2 |

2 | System 2: 2-3 | System 1: 4-1 |

3 | System 2: 1-4 | System 1: 3-2 |

4 | System 1: 2-3 | System 2: 4-1 |

Additional searchers can be added in groups of four using the design above. If eight searchers are used, the second group of four searchers should use the following experimental matrix:

Searcher | Block #1 | Block #2 |

5 | System 1: 4-2 | System 2: 1-3 |

6 | System 2: 3-1 | System 1: 2-4 |

7 | System 2: 4-2 | System 1: 1-3 |

8 | System 1: 3-1 | System 2: 2-4 |
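One way to see what the second matrix adds is to count how often each topic is searched in each session position under each system. The sketch below is illustrative only. It assumes that an entry such as "System 1: 1-4" means topic 1 followed by topic 4, and that positions are numbered 1 to 4 across the whole session. The names `MATRIX_1`, `MATRIX_2`, and `position_counts` are hypothetical.

```python
from collections import Counter

# Each row lists the two blocks for one searcher as (system, [topics in order]).
# Assumption: "1-4" in the table means topic 1, then topic 4.
MATRIX_1 = [
    [(1, [1, 4]), (2, [3, 2])],  # searcher 1
    [(2, [2, 3]), (1, [4, 1])],  # searcher 2
    [(2, [1, 4]), (1, [3, 2])],  # searcher 3
    [(1, [2, 3]), (2, [4, 1])],  # searcher 4
]
MATRIX_2 = [
    [(1, [4, 2]), (2, [1, 3])],  # searcher 5
    [(2, [3, 1]), (1, [2, 4])],  # searcher 6
    [(2, [4, 2]), (1, [1, 3])],  # searcher 7
    [(1, [3, 1]), (2, [2, 4])],  # searcher 8
]


def position_counts(rows):
    """Count occurrences of each (topic, system, session position) triple."""
    counts = Counter()
    for blocks in rows:
        position = 0
        for system, topics in blocks:
            for topic in topics:
                position += 1
                counts[(topic, system, position)] += 1
    return counts


counts_4 = position_counts(MATRIX_1)
counts_8 = position_counts(MATRIX_1 + MATRIX_2)

# Positions in which topic 1 is searched, with four and with eight searchers.
topic1_positions_4 = sorted({pos for (topic, _, pos) in counts_4 if topic == 1})
topic1_positions_8 = sorted({pos for (topic, _, pos) in counts_8 if topic == 1})
print(topic1_positions_4, topic1_positions_8)
```

Under these assumptions, the first matrix alone places topic 1 only in positions 1 and 4, whereas the two matrices together cover every combination of topic, system, and position exactly once.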

Any integer multiple of eight searchers can be accommodated by replicating both matrices as needed. Any other integer multiple of four searchers can be accommodated by replicating only the first matrix (e.g., for 12 searchers, replicate the first matrix three times and do not use the second matrix).
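The replication rule above can be sketched as a small assignment function. This is a minimal illustration, not a prescribed procedure. It assumes that searchers are numbered sequentially and that the matrices are repeated in the order given. The names `assign_rows`, `MATRIX_1`, and `MATRIX_2` are hypothetical.

```python
# Rows copied from the two matrices: (Block #1, Block #2) for each searcher.
MATRIX_1 = [
    ("System 1: 1-4", "System 2: 3-2"),
    ("System 2: 2-3", "System 1: 4-1"),
    ("System 2: 1-4", "System 1: 3-2"),
    ("System 1: 2-3", "System 2: 4-1"),
]
MATRIX_2 = [
    ("System 1: 4-2", "System 2: 1-3"),
    ("System 2: 3-1", "System 1: 2-4"),
    ("System 2: 4-2", "System 1: 1-3"),
    ("System 1: 3-1", "System 2: 2-4"),
]


def assign_rows(n_searchers):
    """Map searcher numbers 1..n to (Block #1, Block #2) rows.

    Multiples of eight cycle through both matrices; other multiples of
    four cycle through the first matrix only.
    """
    if n_searchers <= 0 or n_searchers % 4 != 0:
        raise ValueError("the design needs a positive multiple of four searchers")
    pattern = MATRIX_1 + MATRIX_2 if n_searchers % 8 == 0 else MATRIX_1
    return {
        searcher: pattern[(searcher - 1) % len(pattern)]
        for searcher in range(1, n_searchers + 1)
    }


# Example: with 12 searchers, searcher 5 repeats the first row of the first matrix.
plan = assign_rows(12)
print(plan[5])
```

With 16 searchers the same function would cycle through both matrices twice, so searcher 13 would receive the row listed for searcher 5.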

The experiment will require about three hours for each participant. Each topic will take about 25 minutes: 2 minutes beforehand to examine the topic description, 20 minutes for the search itself, and 3 minutes afterwards to complete the post-search survey. Searchers should not be asked to work for more than an hour without a break. An example schedule for an experimental session would be as follows:

Introduction | 10 minutes |

Initial survey | 5 minutes |

Tutorials (2 systems) | 30 minutes total |

Break | 10 minutes |

Searching (system A, 2 topics) | 50 minutes |

Post-system survey | 5 minutes |

Break | 10 minutes |

Searching (system B, 2 topics) | 50 minutes |

Post-system survey | 5 minutes |

Final survey | 10 minutes |
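The arithmetic behind the example schedule can be checked with a short script. This is only a sketch: the durations are copied from the schedule above, the names `SCHEDULE` and `work_stretches` are hypothetical, and counting survey time toward the time worked between breaks is an assumption, since the text does not say whether surveys count.

```python
# Per-topic time: examine the description, search, complete the post-search survey.
PER_TOPIC_MINUTES = 2 + 20 + 3
BLOCK_MINUTES = 2 * PER_TOPIC_MINUTES  # two topics per system

# (activity, minutes), in session order, taken from the example schedule.
SCHEDULE = [
    ("Introduction", 10),
    ("Initial survey", 5),
    ("Tutorials (2 systems)", 30),
    ("Break", 10),
    ("Searching (system A, 2 topics)", BLOCK_MINUTES),
    ("Post-system survey", 5),
    ("Break", 10),
    ("Searching (system B, 2 topics)", BLOCK_MINUTES),
    ("Post-system survey", 5),
    ("Final survey", 10),
]


def work_stretches(schedule):
    """Return the minutes worked in each stretch between breaks.

    Assumption: every non-break activity, surveys included, counts as work.
    """
    stretches, current = [], 0
    for activity, minutes in schedule:
        if activity == "Break":
            stretches.append(current)
            current = 0
        else:
            current += minutes
    stretches.append(current)
    return stretches


total_minutes = sum(minutes for _, minutes in SCHEDULE)
print(total_minutes, work_stretches(SCHEDULE))
```

The total comes to 185 minutes, in line with the estimate of about three hours. Counted this way, the three stretches are 45, 55, and 65 minutes, so the last one passes the one-hour mark only if the final survey is treated as work.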

