Instructor: Evaggelia Pitoura
Where: B2
When: Tuesday, 13:00-16:00
What:
The course will cover the basics of Information Retrieval (IR).
Material will be based on the two textbooks given below.
This year, we shall also cover one or more of the following special topics:
(a) Peer-to-Peer IR, (b) Publish/Subscribe IR,
(c) Social Communities and (d) New Storage Media (e.g., Hadoop).
For these special topics, we shall read research papers (to be announced
below).
Textbook
Also (free online):
Announcements
Grades
Tentative Schedule
Week 1 (Oct 13)
Short Introduction.
Week 2 (Oct 20)
Introduction. (Chapter 1)
slides
The Boolean, Vector and Probabilistic Models. (Chapter 2)
slides
Assignment 1
Week 5 (Nov 10)
Text Operations: Document Preprocessing.
slides
Query Expansion and Feedback I.
slides
Assignment 4
Information on Lucene here (from Μαρίνα Δρόσου)
Week 6 (Nov 24)
Query Expansion and Feedback II.
slides
Indexing I (inverted files I, text statistics).
slides
Exercises 3
Assignment 5
Results of the Yahoo and Google relevance judgments from set 3.
Week 7 (Dec 1)
Indexing II (inverted files II, dictionary, phrase, proximity and
wild-card queries)
slides
Exercises 1
Assignment 6
Week 9 (Dec 15)
Web Search. Web Crawling and Indexes.
slides
List of papers (to be presented on Jan 12 and 19)
Topic I: Cloud Computing, MapReduce
Jeffrey Dean, Sanjay Ghemawat,
"MapReduce: Simplified Data Processing on Large Clusters", OSDI, 2004
(a more recent version appeared in Communications of the ACM, 2008)
Hadoop MapReduce
Topic II: Social Networks
Paul Heymann, Georgia Koutrika, Hector Garcia-Molina: Can social bookmarking improve web search? WSDM 2008: 195-206
Shenghua Bao, Gui-Rong Xue, Xiaoyuan Wu, Yong Yu, Ben Fei, Zhong Su: Optimizing web search using social annotations. WWW 2007: 501-510
Topic III: Beyond Relevance
Sreenivas Gollapudi, Aneesh Sharma: An axiomatic approach for result diversification. WWW 2009: 381-390
(similar/more recent) Sreenivas Gollapudi, Aneesh Sharma: An Axiomatic Framework for Result Diversification. IEEE Data Eng. Bull. 32(4): 7-14 (2009)
Zoltán Gyöngyi, Hector Garcia-Molina, Jan O. Pedersen: Combating Web Spam with TrustRank. VLDB 2004: 576-587
General guidelines for the talks.
Week 10 (Jan 12)
Recommenders
slides
MapReduce
-
Paper:
Jeffrey Dean, Sanjay Ghemawat
MapReduce: Simplified Data Processing on Large Clusters.
OSDI 2004
Hadoop
- Φώτης Σιταράς & Σταυρίνα Τσουρού:
talk
report
Assignment 8
Week 11 (Jan 19)
Social Networks
-
Paper1:
Paul Heymann, Georgia Koutrika, Hector Garcia-Molina:
Can social bookmarking improve web search?
WSDM 2008
Γιώργος Μίσκος
talk
report
-
Paper2:
Shenghua Bao, Gui-Rong Xue, Xiaoyuan Wu, Yong Yu, Ben Fei, Zhong Su:
Optimizing web search using social annotations.
WWW 2007
Άγγελος Λάζος
talk
report
Web Spam
-
Paper:
Zoltán Gyöngyi, Hector Garcia-Molina, Jan O. Pedersen:
Combating Web Spam with TrustRank.
VLDB 2004
Σταυρούλα Αλεξίου
talk
report
Diversity
-
Paper:
Sreenivas Gollapudi, Aneesh Sharma:
An axiomatic approach for result diversification.
WWW 2009
Χρήστος Παππάς
talk
report
Information Retrieval Resources
IR Courses in Greek Universities