In recall-oriented retrieval setups, such as the Legal Track, ranked retrieval has a particular disadvantage compared with traditional Boolean retrieval: there is no clear cut-off point at which to stop consulting results. A ranked list with too many results is expensive to give to litigation support professionals paid by the hour. This may be one of the reasons why ranked retrieval has been adopted so slowly in professional legal search.
The "missing" cut-off goes unnoticed by standard evaluation
measures: padding a run with further results incurs no penalty and
can only bring gain. The TREC 2008 Legal Track addresses this head-on
by requiring participants to submit such a cut-off value K
per topic, at the rank where precision and recall are best balanced. This year we
focused solely on selecting K to optimize the given
F1@K measure. We believe that this will have
the biggest impact on this year's comparative evaluation.
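To make the objective concrete, the following sketch shows the oracle version of this cut-off selection: given binary relevance judgements for a ranked list and the total number of relevant documents, it sweeps every rank K and returns the one maximizing F1@K. (This is illustrative only; the paper's actual method, described in the following sections, must estimate K without judgements, from score distributions.)

```python
def best_cutoff(relevance, total_relevant):
    """Return (K, F1@K) maximizing F1 over all rank cut-offs.

    relevance: list of 0/1 judgements for a ranked result list.
    total_relevant: number of relevant documents for the topic.
    """
    best_k, best_f1 = 0, 0.0
    hits = 0  # relevant documents seen up to the current rank
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        precision = hits / k
        recall = hits / total_relevant
        if precision + recall > 0:
            f1 = 2 * precision * recall / (precision + recall)
            if f1 > best_f1:
                best_k, best_f1 = k, f1
    return best_k, best_f1
```

For example, with judgements `[1, 1, 0, 1, 0, 0]` and 4 relevant documents in total, the optimal cut-off is K = 4, where precision and recall are both 0.75.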
The rest of this paper is organized as follows.
The method for determining the cut-off K
is presented in Section 2.
It depends on the underlying score distributions
of relevant and non-relevant documents, which we elaborate on
in Section 3.
In Section 4 we describe the parameter estimation methods.
In Section 5 we discuss the experimental setup, our
official submissions, results, and additional experiments.
Finally, we summarize the findings in Section 6.