INEX 2012 Relevance Feedback Track


The goal of the Relevance Feedback track is to determine how best to apply (focused) relevance feedback provided by users so that the system can return better results within the same search session.

To participate in the Relevance Feedback track, organisations will implement software solutions, then run them using the supplied Evaluation Platform, which will automatically submit the results online. Unlike in previous versions of the Relevance Feedback track, there is no need to submit the software solution itself: it does not have to be portable to other systems and does not have to be a single program. The document collection can be indexed in advance, so indexing does not have to be repeated for each run.

Use Case

The use-case of this track is a single user searching with a particular query in an information retrieval system that supports relevance feedback. The user highlights relevant passages of text in returned documents (if any) and provides this feedback to the information retrieval system. The IR system re-ranks the remainder of the unseen results list to provide more relevant results to the user. The exact manner in which this is implemented is not of concern in this evaluation; here we test the ability of the system to use focused relevance feedback to improve the ranking of previously unseen results.

Test Collection

The relevance feedback track will use the Wikipedia XML Corpus as the test collection. Evaluation will be based on a collection of relevance assessments gathered from assessments used to judge previous INEX Ad Hoc tracks. Two topic sets (a training set and an evaluation set) will be provided with the evaluation platform. The evaluation set should not be used for testing as the results will be submitted online, although multiple submissions may be made.


The evaluation software for the Relevance Feedback track, complete with documentation and sample submissions, is available here.


Participating organisations will create one or more Relevance Feedback Modules (RFMs) intended to rank a collection of documents with a query while incrementally responding to explicit user feedback on the relevance of the results presented to the user. These RFMs will be implemented as executable files that the Evaluation Platform (EP) will interact with through input/output stream pipes, simulating a user search session. To evaluate the RFM, participants will run the (Java) application and provide it with the path to the RFM.

The EP will run the RFM and provide it with a topic, as a single line of text ending with a newline. The RFM will respond with the document ID of the first document to present to the user. The EP will then respond with the number of relevant passages within that document, again on a single line. If there is at least one, the EP will provide the RFM with the text of those passages, one passage per line. The RFM will then present the document ID of the next document. When the RFM has presented all the documents it wishes to present, it will supply the text EOF on a line instead of a document ID. The EP will then present the RFM with a new topic, or the text EOF if there are no more topics to evaluate.

The EP will retain the presentation order of documents as generated by the RFM. This order will then be evaluated as a submission using the trec_eval evaluation tool.
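For reference, trec_eval expects runs in the standard six-column TREC format: topic ID, the literal Q0, document ID, rank, score, and a run tag. The sketch below shows how a presentation order might be serialised into that format; the topic and document IDs are made up for illustration, and the EP generates the actual submission for you.

```python
def to_trec_run(topic_id, doc_ids, run_tag="myRFM"):
    """Convert a presentation order into trec_eval's six-column run format.

    Columns: topic-id, the literal 'Q0', document-id, rank, score, run-tag.
    Scores are chosen so that earlier-presented documents get higher scores.
    """
    lines = []
    for rank, doc_id in enumerate(doc_ids, start=1):
        score = 1.0 / rank  # any monotonically decreasing score works
        lines.append(f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}")
    return lines

# Hypothetical example: topic 2012001 with three presented documents.
run = to_trec_run("2012001", ["12345", "67890", "54321"])
```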

In previous versions of the Relevance Feedback track, a reduced collection was used to avoid long running times. As the evaluation platform does not require the document collection to be present for evaluation to work, the full Wikipedia collection can be used. It may be advisable to pre-index the document collection so that the RFM can load the index for the evaluation instead of having to re-index all the documents each time.


The evaluation platform and the participating relevance feedback module will communicate using a pipe, a standard feature of all modern operating systems. Hence, any programming language capable of creating an executable that can read from standard input and write to standard output would be suitable for creating a relevance feedback module for the task.
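As an illustration of this pipe-based setup, the sketch below spawns a trivial stand-in RFM as a child process and exchanges lines with it over standard input/output. The actual EP is a Java application, so this Python harness (and the echo-style child) is purely illustrative of the mechanism, not of the EP itself.

```python
import subprocess
import sys

# A stand-in "RFM" that replies with a fixed document ID for any topic line.
child_code = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if line.strip() == 'EOF':\n"
    "        break\n"
    "    print('12345', flush=True)\n"
)

# Launch the child with its stdin/stdout connected to pipes.
proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

# Send a topic line and read the child's first document ID.
proc.stdin.write("sample topic text\n")
proc.stdin.flush()
first_doc = proc.stdout.readline().strip()

# Signal end of evaluation and let the child exit.
proc.stdin.write("EOF\n")
proc.stdin.flush()
proc.stdin.close()
proc.wait()
```

Note the `flush=True` in the child: unflushed output buffers are a common cause of deadlock in this kind of line-by-line pipe protocol.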

Each 'message' from the evaluation platform or the relevance feedback module will be in the form of a single line of text ending in a linefeed character. The meaning of the line of text will be derived from the context in which it is submitted.

The evaluation platform communicates first, providing a topic line. This line will either contain the text of the topic or the text EOF, signalling to the RFM that the evaluation is over and it may exit. The RFM will respond with a document line. This line will contain either a document ID or the text EOF, signalling to the EP that the RFM has finished presenting documents for the current topic and is ready to move on to the next topic. If a document ID is presented, the EP will respond with feedback.

Feedback will be provided in the form of a line with a number indicating the number of passages of relevant text found in the document. If that number is 0, the document was not relevant and the RFM should provide the next document ID. Otherwise, the EP will immediately follow the number with that many passages of feedback text, each on a single line. The feedback text will be stripped of characters outside the ASCII printable range of 32-127. Note that these lines can potentially be as large as the largest document in the collection. After all the lines of feedback have been sent, the RFM is expected to respond with another document ID.
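A minimal RFM skeleton following this exchange might look like the sketch below. The ranking itself is stubbed out with a fixed list of made-up document IDs, and I/O is passed in as functions so the loop can be driven either by real pipes or by a test harness; names such as `rank_documents` are illustrative, not part of the track specification.

```python
def rank_documents(topic, feedback):
    """Placeholder ranking. A real RFM would consult its index here and
    re-rank the remaining unseen documents using the accumulated feedback."""
    return ["100", "200", "300"]

def run_rfm(read_line, write_line):
    """Drive one evaluation session over the track's line protocol."""
    while True:
        topic = read_line().strip()
        if topic == "EOF":           # no more topics: evaluation is over
            return
        feedback = []                # relevant passages seen for this topic
        for doc_id in rank_documents(topic, feedback):
            write_line(doc_id)       # present the next document
            n = int(read_line())     # number of relevant passages in it
            for _ in range(n):       # read the passage text, one per line
                feedback.append(read_line().rstrip("\n"))
        write_line("EOF")            # finished presenting for this topic
```

For the real task the loop would be connected to the standard streams, e.g. `run_rfm(sys.stdin.readline, lambda s: print(s, flush=True))`, with flushing after every write to keep the pipe from stalling.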

A sample submission (with source code) will be provided with the evaluation platform.


The submission deadline for the track is the 29th of July, 2012, at 11:59pm (any timezone). Submissions will be generated and submitted automatically by using the evaluation tool.


Shlomo Geva

Timothy Chappell