Projects developed for the Data Mining course of the EIT Digital Data Science master's programme at KTH
The projects developed for the course are the following:
- Finding Similar Items: presents the stages for finding textually similar documents based on Jaccard similarity. The techniques implemented to solve the problem efficiently are shingling, minhashing, and locality-sensitive hashing (LSH).
- Discovery of Frequent Itemsets and Association Rules: implementation of the Apriori algorithm to find frequent itemsets and derive association rules between them in a sales transaction database (i.e., a set of baskets).
- Mining Data Streams: implementation of TRIÈST, a streaming graph processing algorithm that estimates local and global triangle counts in fully dynamic streams.
- Graph Spectra: implementation of the spectral graph clustering algorithm described in the paper “On Spectral Clustering: Analysis and an algorithm”.
- K-way Graph Partitioning Using JaBeJa: a study of distributed graph partitioning techniques through an implementation of the JaBeJa algorithm.
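
To give a feel for the first project, here is a minimal sketch of shingling, exact Jaccard similarity, and minhash signatures. It is not the code from this repository: the function names (`shingles`, `jaccard`, `minhash_signature`, `estimate_similarity`), the shingle length `k=5`, the number of hash functions, and the use of seeded BLAKE2b as the hash family are all assumptions chosen for illustration.

```python
import hashlib


def shingles(text, k=5):
    """Return the set of k-character shingles of a document."""
    text = " ".join(text.split())  # collapse runs of whitespace
    # Documents shorter than k yield a single (short) shingle.
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(shingle, seed):
    """Seeded 64-bit hash (assumption: BLAKE2b stands in for a hash family)."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set, num_hashes=128):
    """Signature = minimum hash value of the set under each hash function."""
    return [min(_hash(s, seed) for s in shingle_set)
            for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching positions, which estimates Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = shingles("the quick brown fox jumps over the lazy dog")
    doc_b = shingles("the quick brown fox jumped over a lazy dog")
    print("exact Jaccard:   ", jaccard(doc_a, doc_b))
    print("minhash estimate:", estimate_similarity(
        minhash_signature(doc_a), minhash_signature(doc_b)))
```

The signature has a fixed length regardless of document size, so comparing two documents costs `num_hashes` comparisons instead of a full set intersection. LSH would then band these signatures to avoid comparing every pair; that step is left out of this sketch.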
- Serghei Socolovschi serghei@kth.se
- Angel Igareta alih2@kth.se