GitHub - avhn/deu-bil3003-ps1: Apriori algorithm implementation (Introduction to Data Mining / Problem set 1)

Introduction to Data Mining - Problem set 1

The goal is to generate frequent itemsets and find association rules with the Apriori algorithm. See the problem set description.

Technologies used in this project:

Python 3.7
GitLab CI
Git

File Structure

├── main.py
├── test.py
└── problemset
    ├── __init__.py
    ├── parse
    ├── generate
    ├── utils
    └── interact

Resources

My main reference was a 1994 research paper from IBM.

The Apriori Principle:

If an itemset is frequent, then all of its subsets must also be frequent. Conversely, if an itemset is infrequent, then all of its supersets must be infrequent, too.
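The principle above is what lets Apriori discard candidates without counting them. The following is a minimal sketch of level-wise frequent itemset generation, not the code in this repository: the function name `frequent_itemsets` is hypothetical, and `min_support` is assumed to be an absolute transaction count rather than a ratio.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support_count} for itemsets seen in at least
    min_support transactions (hypothetical helper, absolute count)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (Apriori principle): a candidate with any infrequent
        # (k-1)-subset cannot be frequent, so drop it before counting.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the surviving candidates against the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result
```

Itemsets are stored as `frozenset` so they can serve as dictionary keys, which matches the hashable, immutable item representation this README describes for parsed values.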

Usage and notes

To run the project, execute main.py from the repository root:
```
$ python3 main.py
```
problemset.parser accepts CSV input whose first line indicates the type of each value. Each value is then itemized as a hashable, immutable tuple of the form (indicator, value).
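As a rough illustration of that itemization, here is a sketch that assumes the first CSV line holds one indicator per column. It is not the actual `problemset` parser: the function name `itemize` and the sample column names are made up for this example.

```python
import csv
import io


def itemize(csv_text):
    """Turn CSV text into a list of transactions, each a frozenset of
    (indicator, value) tuples (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first line: one indicator per column (assumed)
    transactions = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        transactions.append(frozenset(
            (indicator.strip(), value.strip())
            for indicator, value in zip(header, row)
            if value.strip()  # ignore empty cells
        ))
    return transactions


# Invented sample data, only to show the resulting shape.
sample = "outlook,windy\nsunny,no\nrainy,yes\n"
print(itemize(sample))
```

Pairing each value with its indicator keeps identical values from different columns distinct, so `("outlook", "no")` and `("windy", "no")` would count as separate items.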

