This repository has been archived by the owner on Apr 6, 2020. It is now read-only.
Apriori algorithm implementation (Introduction to Data Mining / Problem set 1)

avhn/deu-bil3003-ps1

Introduction to Data Mining - Problem set 1

The goal is to generate frequent itemsets and find association rules with the Apriori algorithm. See the problem set description.
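To make the two terms concrete, here is a minimal sketch of the usual support and confidence measures behind frequent itemsets and association rules. The function names and the toy transactions are illustrative assumptions; they are not taken from this repository's code.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Defined as support(antecedent ∪ consequent) / support(antecedent).
    """
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical toy data, only for illustration.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

bread_milk_support = support(transactions, {"bread", "milk"})
bread_to_milk_conf = confidence(transactions, {"bread"}, {"milk"})
```

An itemset counts as frequent when its support reaches a chosen minimum, and a rule is usually kept when its confidence reaches a chosen minimum as well.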

Technologies used in this project:

  • Python 3.7
  • GitLab CI
  • Git

File Structure

├── main.py
├── test.py
├── problemset
│   ├── __init__.py
│   ├── parse
│   ├── generate
│   ├── utils
│   └── interact

Resources

My main reference was a 1994 research paper from IBM.

The Apriori Principle:

If an itemset is frequent, then all of its subsets must also be frequent. Conversely, if an itemset is infrequent, then all of its supersets must be infrequent, too.
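The principle is what lets the algorithm prune candidates before counting them. The following is a minimal sketch of level-wise frequent itemset generation, assuming transactions are given as collections of hashable items and `min_support` is an absolute count. The `apriori` function here is hypothetical and is not the implementation in the `problemset` package.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (Apriori principle): drop any candidate that has an
        # infrequent (k-1)-subset, without scanning the transactions.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the surviving candidates against the transactions.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy data, only for illustration.
example = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = apriori(example, min_support=3)
```

In this toy data every single item and every pair appears often enough, while the triple {a, b, c} passes the prune step and is only rejected once its occurrences are counted.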

Usage and notes

  • To run the project, execute main.py from the repository root:
    $ python3 main.py
  • problemset.parser accepts CSV input whose first line indicates the type of each value. Each value is then itemized as a hashable, immutable tuple of the form (indicator, value).
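The itemization described above might look roughly like the sketch below. It assumes that the first CSV line holds one indicator per column; the `itemize` function and the sample data are hypothetical and do not reflect the actual interface of problemset.parser.

```python
import csv
import io


def itemize(csv_text):
    """Turn CSV text into a list of transactions of (indicator, value) tuples.

    Assumption: the first line holds one indicator per column.
    """
    reader = csv.reader(io.StringIO(csv_text))
    indicators = next(reader)
    return [
        frozenset(
            (ind.strip(), val.strip()) for ind, val in zip(indicators, row)
        )
        for row in reader
        if row  # skip blank lines
    ]


# Hypothetical sample input, only for illustration.
sample = "outlook,windy\nsunny,no\nrainy,yes\n"
transactions = itemize(sample)
```

Pairing each value with its indicator keeps equal values from different columns distinct, and because tuples are hashable the items can be stored in sets and used as dictionary keys.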
