CSE012/CS059 – Data Mining
Fall 2024
Lecture Slides
The slides for this course draw on slides and material from other courses and books. We thank in advance Tan, Steinbach, and Kumar; Anand Rajaraman, Jeff Ullman, and Jure Leskovec; Evimaria Terzi; and Aris Anagnostopoulos for the material from their slides that we have used in this course.
Introduction: Logistics (in Greek) (pptx, pdf)
Lecture 1: Introduction to Data Mining (pptx, pdf)
Lecture 2: What is data? The data mining pipeline. Preprocessing and postprocessing. Sampling and normalization. (pptx, pdf)
Lecture 3: Data exploration and statistical analysis (pptx, pdf)
Lecture 4: Similarity and Distance. Recommendation Systems (pptx, pdf)
Lecture 5: Dimensionality Reduction. Singular Value Decomposition (SVD). Principal Component Analysis (PCA). Model-based collaborative filtering (pptx, pdf)
Lecture 6: Clustering. The k-means algorithm. Hierarchical Clustering. The DBSCAN algorithm. Clustering Evaluation. (pptx, pdf)
Lecture 7: Mixture Models. The EM Algorithm. (pptx, pdf)
Lecture 8: Introduction to Supervised Learning. Linear Regression. Classification. Decision Trees - Expressiveness. Evaluation. (pptx, pdf)
Lecture 9: Nearest Neighbor Classification, Support Vector Machines, Logistic Regression, (Naive Bayes Classification). Neural Networks. Word Embeddings. The Supervised Learning pipeline. (pptx, pdf)
Lecture 10: Link Analysis Ranking. Web Ranking. PageRank, Random Walks, HITS. (pptx, pdf)