Deadline: May 12 (by email to pitoura@cs.uoi.gr)

1. 2-page write-up on the proposed term project (should include a section on related work).
2. Answer the following questions based on the survey paper:
   D. S. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne, B. Richard, S. Rollins, and Z. Xu, "Peer-to-Peer Computing", HP Technical Report, HPL-2002-57
   (online copy at: http://www.cs.uoi.gr/~pitoura/courses/ds03_gr/projects.html)

QUESTIONS (cover sec 1-5)

1. What is the definition of a p2p system given by the authors in sec 1? Compare it with at least one of the definitions surveyed in the last paragraph of pg 2.
2. In Fig 2 (pg 3), the authors compare some aspects of the client-server and the p2p computing models. List and explain these aspects.
3. What is a hierarchical and what is a flat client-server model?
4. What is a super peer?
5. What is the difference between a compute-intensive and a componentized application? How does this relate to vertical and horizontal distribution?
6. According to the authors, what is the main challenge of communication in p2p?
7. What is the most common solution to reliability across p2p systems?
8. What are the advantages/disadvantages of the centralized directory, the flooded requests, and the document routing models?
9. In the centralized directory approach, after the best peer is located, the file exchange occurs directly between it and the requesting peer. What are the advantages/disadvantages of this?
10. What can be considered as a closure mechanism in Gnutella?
11. What are the factors that affect scalability? Give one example for each.
12. Given the ad-hoc nature of connectivity in p2p, comment on what type of (message-oriented) communication (i.e., synchronous/asynchronous, transient/persistent) would be more appropriate.
13. pg 17, 1st column, last par: "The geographical distribution of the peers help to reduce congestion on both peers and the network". Explain.
14. What is the goal of caching in p2p? What are the advantages/disadvantages of caching the reply at all nodes in the return path? Can you think of any alternatives? Is this possible in Gnutella?
15. What does the "power-law distribution of the p2p network" (pg 17) mean?
16. Compare/relate the definition of distributed systems in sec 5.2 (pg 21) with sec 1.4 of the textbook.
17. Why is the fault tolerance problem a greater challenge in collaborative p2p systems than in file sharing p2p systems?