Wednesday, October 7, 2009

CFP: INEX 2009 XML Mining (Unsupervised Clustering) Task Update

The INEX 2009 XML Mining (unsupervised clustering) website has been updated with full information on the dataset, evaluation criteria, submission formats and other details. Please see:

http://www.inex.otago.ac.nz/tracks/wiki-mine/wiki-mine.asp

The submission deadline for clustering results is November 2nd, 2009.

The website provides various representations (tags, links, trees, entities, bag-of-words with bigrams, bag-of-words with stemmed words, bag-of-words with stemmed bigrams) of the large INEX 2009 dataset (about 2.6 million documents) and of a small subset of the INEX 2009 dataset (about 50 thousand documents). The task is to use unsupervised classification techniques to group the documents into clusters. You can submit several clustering solutions with different numbers of clusters: 100, 500, 1000, 2500, 5000 and 10000.

As advertised in the previous CFP, the clustering solutions will be evaluated in two ways. Firstly, submissions will be evaluated using standard criteria such as Entropy, F-score, Normalised Mutual Information and others to determine the quality of the clusters against ground truths. These evaluation results will be provided online on an ongoing basis, along the same lines as Netflix, starting from mid-October. Secondly, the clustering solutions will be evaluated to determine the quality of the clusters relative to the optimal collection selection goal, given a set of queries. Real ad-hoc retrieval queries and their manual assessment results will be used in this evaluation. The results of this second evaluation will be released at the INEX workshop in December.

Do not hesitate to contact us if you have any questions.

Thanks for your time,
Richi Nayak, Chris De Vries and Sangeetha Kutty
INEX 2009 Clustering Task Organizers

Dr Richi Nayak
School of Information Technology, Queensland University of Technology, Brisbane, QLD 4001
Office: MS-306
Phone: 3138 1976
Email: r.nayak@qut.edu.au
http://sky.scitech.qut.edu.au/~nayak/
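
For participants who want a rough local check before submitting, two of the criteria named above (Entropy and Normalised Mutual Information) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the organizers' evaluation code: the function names are hypothetical, the inputs are assumed to be two parallel lists (one cluster id and one ground-truth label per document), and NMI is normalised here by the geometric mean of the two entropies, which may differ from the normalisation used in the official evaluation.

```python
import math
from collections import Counter


def cluster_entropy(clusters, labels):
    """Size-weighted mean entropy (in bits) of the ground-truth labels inside each cluster.

    Lower is better; 0.0 means every cluster contains a single label.
    """
    n = len(clusters)
    members_by_cluster = {}
    for c, y in zip(clusters, labels):
        members_by_cluster.setdefault(c, []).append(y)

    total = 0.0
    for members in members_by_cluster.values():
        size = len(members)
        h = -sum((k / size) * math.log2(k / size) for k in Counter(members).values())
        total += (size / n) * h
    return total


def nmi(clusters, labels):
    """Normalised mutual information, I(C;Y) / sqrt(H(C) * H(Y)).

    Assumption: geometric-mean normalisation. Returns 0.0 when either
    entropy is zero (for example, when everything is in one cluster).
    """
    n = len(clusters)
    cluster_counts = Counter(clusters)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(clusters, labels))

    mi = sum(
        (k / n) * math.log2((k * n) / (cluster_counts[c] * label_counts[y]))
        for (c, y), k in joint_counts.items()
    )
    h_c = -sum((k / n) * math.log2(k / n) for k in cluster_counts.values())
    h_y = -sum((k / n) * math.log2(k / n) for k in label_counts.values())
    if h_c == 0 or h_y == 0:
        return 0.0
    return mi / math.sqrt(h_c * h_y)
```

Both functions take the cluster assignments first and the ground-truth labels second, for example `nmi(my_cluster_ids, true_categories)`. For the 2.6 million document collection this counting approach should still fit in memory, since it only stores per-cluster and per-label counts.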
