INEX 2013 Social Book Search Track

Evaluation results for the 2013 Social Book Search task

The evaluation results below use the official INEX 2013 SBS topic set, which is derived from the LibraryThing discussion groups and from the user profiles and catalogues of the topic creators.

These are the official Qrels:

  1. inex13SBS.qrels: Qrels derived from the books recommended on the LibraryThing discussion threads of 380 topics. The mapping of labels to relevance values is explained below.
The Qrels set uses the LibraryThing (LT) work IDs as document IDs. For evaluation, the ISBNs in the submitted runs are mapped to LT work IDs as well. When several ISBNs in a run map to the same work ID, only the highest-ranked one is kept and mapped to that work ID; the lower-ranked ones are removed from the results list.

Deduplication: mapping ISBNs to LibraryThing work IDs
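
As an illustration of the deduplication described above, here is a minimal sketch in Python. The function name `dedupe_run` and the dictionary `isbn_to_work` are hypothetical and not part of the official evaluation tooling. Dropping ISBNs that have no known work ID is an assumption made for this sketch; the text above does not say how such ISBNs are handled.

```python
def dedupe_run(ranked_isbns, isbn_to_work):
    """Map a ranked ISBN list to LT work IDs, keeping one entry per work.

    ranked_isbns: ISBNs of one topic in rank order, best first.
    isbn_to_work: hypothetical dict from ISBN to LibraryThing work ID.
    """
    seen = set()
    deduped = []
    for isbn in ranked_isbns:
        work_id = isbn_to_work.get(isbn)
        if work_id is None:
            # Assumption: ISBNs without a known work ID are dropped.
            continue
        if work_id in seen:
            # A higher-ranked ISBN already represents this work.
            continue
        seen.add(work_id)
        deduped.append(work_id)
    return deduped
```

Because the list is traversed in rank order, the first ISBN seen for a work is always the highest-ranked one, so the relative order of the remaining results is unchanged.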



Evaluation results

These are preliminary results. Final results will be released on 12 June 2013.

The official evaluation measure is nDCG@10.

Run  nDCG@10  P@10  MRR  MAP
RSLIS - run3.all-plus-query.all-doc-fields 0.1361 0.0653 0.2286 0.0861
UAms_ILLC - inex13SBS.ti_qu_gr_na.bayes_avg.LT_rating 0.1331 0.0771 0.2342 0.0788
UAms_ILLC - inex13SBS.ti_qu.bayes_avg.LT_rating 0.1331 0.0771 0.2342 0.0788
RSLIS - run1.all-topic-fields.all-doc-fields 0.1295 0.0647 0.2190 0.0797
UAms_ILLC - inex13SBS.ti_qu_gr_na 0.1184 0.0555 0.2075 0.0790
UAms_ILLC - inex13SBS.ti_qu 0.1163 0.0647 0.2091 0.0665
ISMD - run_ss_bsqstw_stop_words_free_member_free_2013 0.1150 0.0479 0.1839 0.0800
UAms_ILLC - inex13SBS.qu.bayes_avg.LT_rating 0.1147 0.0661 0.1997 0.0656
ISMD - run_ss_bsqstw_stop_words_free_2013 0.1147 0.0468 0.1843 0.0798
UAms_ILLC - inex13SBS.ti.bayes_avg.LT_rating 0.1095 0.0634 0.2005 0.0630
ISMD - ism_run_ss_free_text_2013 0.1036 0.0426 0.1566 0.0728
ISMD - run_ss_bsqstw_2013 0.1022 0.0416 0.1618 0.0707
ISMD - run_ss_bsqstw_stop_words_free_query_title_only_2013 0.0940 0.0495 0.1648 0.0556
NTNU - aa_LMJM3 0.0832 0.0405 0.1464 0.0518
NTNU - az_LMJM3 0.0814 0.0376 0.1507 0.0486
NTNU - az_BM25 0.0789 0.0392 0.1442 0.0489
NTNU - aa_BM25 0.0780 0.0366 0.1409 0.0472
UAms_ILPS - UAmsRetTbow 0.0664 0.0355 0.1143 0.0386
UAms_ILPS - indri 0.0645 0.0347 0.1143 0.0346
NTNU - qa_LMD 0.0609 0.0345 0.1171 0.0330
LSIS_AMU - score_file_mean_R_2013_reranked 0.0596 0.0324 0.1101 0.0367
NTNU - qz_LMD 0.0577 0.0342 0.1100 0.0300
LSIS_AMU - score_file_SDM_HV_2013_reranked 0.0576 0.0292 0.1145 0.0362
LSIS_AMU - resul_SDM_2013 0.0571 0.0297 0.1061 0.0357
RSLIS - run2.query.all-doc-fields 0.0401 0.0208 0.0635 0.0232
CYUT - Run4.query.RW 0.0392 0.0287 0.0796 0.0201
CYUT - Run6.query.reviews.RW 0.0378 0.0284 0.0772 0.0165
CYUT - Run2.query.Rating 0.0376 0.0284 0.0792 0.0178
CYUT - Run1.query.content-base 0.0265 0.0147 0.0418 0.0153
CYUT - Run5.query.reviwes.content-base 0.0254 0.0153 0.0359 0.0137
CYUT - Run3.query.RA 0.0170 0.0087 0.0352 0.0107
OUC - sb_ttl_nar_10000_0.5 0.0100 0.0071 0.0212 0.0055



Operationalisation of forum judgement labels

Students from the Royal School of Library and Information Science (Copenhagen) have labelled the LibraryThing forum topic threads and the suggestions in those threads.

Forum members can mention books for many different reasons. We want the relevance values to distinguish between books that were mentioned as positive recommendations, negative recommendations (books to avoid), neutral suggestions (mentioned as possibly relevant but not necessarily recommended) and books mentioned for some other reason (not relevant at all).

Furthermore, we want to differentiate between recommendations from members who have read the book they recommend and members who haven't. We assume the recommendation to be of more value to the searcher if it comes from someone who has actually read the book.

Terminology

Simplifying assumptions

Decision tree: determine which judgements to use

1 - Work mentioned once -> there is only one judgement, use that
2 - Work mentioned multiple times
  2.1 - topic creator mentions work
    2.1.1 - topic creator *suggests* neutral -> use replies (go to 2.2)
    2.1.2 - topic creator *suggests* pos/neg -> use creator judgement
    2.1.3 - topic creator *replies* -> use creator judgement only
  2.2 - topic creator doesn't mention work
    2.2.1 - there are some has_read suggestions/replies -> use has_read judgements
    2.2.2 - there are no has_read suggestions/replies -> use all judgements
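
The selection tree above can be sketched in Python as follows. This is one possible reading, not the official implementation. The `Judgement` class and its field names are hypothetical. Three interpretations are assumptions of this sketch: creator replies are checked before creator suggestions, "use replies" in 2.1.1 is read as "use the judgements of other members", and the creator's own judgements are used as a fallback when no other member mentions the work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgement:
    by_creator: bool  # made by the topic creator?
    kind: str         # "suggestion" or "reply"
    sentiment: str    # "pos", "neg" or "neu"
    has_read: bool    # has the member read the work?


def select_judgements(judgements):
    """Return the judgements to use for one work in one topic thread."""
    if len(judgements) <= 1:
        return list(judgements)                      # rule 1
    creator = [j for j in judgements if j.by_creator]
    others = [j for j in judgements if not j.by_creator]
    if creator:
        replies = [j for j in creator if j.kind == "reply"]
        if replies:
            return replies                           # rule 2.1.3
        decided = [j for j in creator if j.sentiment in ("pos", "neg")]
        if decided:
            return decided                           # rule 2.1.2
        # Rule 2.1.1: creator only suggested neutrally, fall through to 2.2.
        # Assumption: fall back to the creator if nobody else mentions the work.
        pool = others or creator
    else:
        pool = others
    read = [j for j in pool if j.has_read]
    return read if read else pool                    # rules 2.2.1 / 2.2.2
```

For example, if the creator suggests a work neutrally and two other members respond, one who has read it and one who has not, only the judgement of the member who has read it is kept.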

Decision tree: turn judgements into relevance values

When a work is mentioned, its base relevance value is rv=2.
1 - single judgement
  1.1 - creator has_read judgement
    1.1.1 - creator pos/neg/neu -> rv=0
  1.2 - creator not_read judgement
    1.2.1 - creator positive -> rv=8
    1.2.2 - creator neutral -> rv=2
    1.2.3 - creator negative -> rv=0
  1.3 - other has_read judgement
    1.3.1 - has_read positive -> rv=4
    1.3.2 - has_read neutral -> rv=2
    1.3.3 - has_read negative -> rv=0
  1.4 - other not_read judgement
    1.4.1 - not_read positive -> rv=3
    1.4.2 - not_read neutral -> rv=2
    1.4.3 - not_read negative -> rv=0
2 - multiple judgements
  2.1 - multi has_read judgements
    2.1.1 - some positive, no negative -> rv=6
    2.1.2 - #positive > #negative -> rv=4
    2.1.3 - #positive == #negative -> rv=2
    2.1.4 - all neutral -> rv=2
    2.1.5 - #positive < #negative -> rv=1
    2.1.6 - no positive, some negative -> rv=0
  2.2 - multi not_read judgements
    2.2.1 - some positive, no negative -> rv=4
    2.2.2 - #positive > #negative -> rv=3
    2.2.3 - #positive == #negative -> rv=2
    2.2.4 - all neutral -> rv=2
    2.2.5 - #positive < #negative -> rv=1
    2.2.6 - no positive, some negative -> rv=0
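
The relevance value tree above can be written as a small Python function. This is a sketch under stated assumptions, not the official script. The tuple layout `(by_creator, has_read, sentiment)` and the names `SINGLE_RV` and `relevance_value` are hypothetical. The tree does not say how to treat a set that mixes has_read and not_read judgements, so this sketch assumes the has_read branch (2.1) applies only when every judgement is has_read.

```python
# Rules 1.1 to 1.4: (by_creator, has_read) -> sentiment -> relevance value.
SINGLE_RV = {
    (True, True): {"pos": 0, "neu": 0, "neg": 0},
    (True, False): {"pos": 8, "neu": 2, "neg": 0},
    (False, True): {"pos": 4, "neu": 2, "neg": 0},
    (False, False): {"pos": 3, "neu": 2, "neg": 0},
}


def relevance_value(judgements):
    """judgements: list of (by_creator, has_read, sentiment) tuples,
    with sentiment one of "pos", "neu", "neg"."""
    if not judgements:
        raise ValueError("at least one judgement is required")
    if len(judgements) == 1:
        by_creator, has_read, sentiment = judgements[0]
        return SINGLE_RV[(by_creator, has_read)][sentiment]
    n_pos = sum(1 for _, _, s in judgements if s == "pos")
    n_neg = sum(1 for _, _, s in judgements if s == "neg")
    # Assumption: branch 2.1 applies only if all judgements are has_read.
    all_read = all(has_read for _, has_read, _ in judgements)
    if n_pos > 0 and n_neg == 0:
        return 6 if all_read else 4      # rules 2.1.1 / 2.2.1
    if n_pos == 0 and n_neg > 0:
        return 0                         # rules 2.1.6 / 2.2.6
    if n_pos > n_neg:
        return 4 if all_read else 3      # rules 2.1.2 / 2.2.2
    if n_pos < n_neg:
        return 1                         # rules 2.1.5 / 2.2.5
    return 2                             # equal counts, including all neutral
```

The unanimous cases are checked before the majority cases because "some positive, no negative" would otherwise also satisfy "#positive > #negative", and likewise for the negative side.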


If you have questions, please send an email to marijn.koolen@uva.nl.




