Recep Arda Kaya - 435060

Apriori Association Analysis of the Grocery Store Data Set

Objective

* Apply the Apriori algorithm to generate frequent itemsets and explain how it works.

Data

For a more practical explanation, I am going to use the Grocery Store Data Set.

This dataset contains 11 items: JAM, MAGGI, SUGAR, COFFEE, CHEESE, TEA, BOURNVITA, CORNFLAKES, BREAD, BISCUIT and MILK.

Review of the Data

Apriori Algorithm

Apriori is a popular algorithm for extracting frequent itemsets, with applications in association rule learning. It is designed to operate on databases of transactions, such as purchases made by the customers of a store. An itemset is considered "frequent" if it meets a user-specified support threshold, where the support of an itemset is the fraction of all transactions that contain every item in it. For instance, if the support threshold is set to 0.5 (50%), a frequent itemset is a set of items that occur together in at least 50% of all transactions in the database.
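To make the idea of support and the level-wise search concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the code used to produce the results in this report: the function names `support` and `apriori` and the `toy` transactions are hypothetical and do not come from the grocery data set.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})

    # Level 1 candidates: every single item.
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that reach the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)

        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = [
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical toy transactions, NOT the grocery data set from this report.
toy = [
    {"COFFEE", "BISCUIT"},
    {"COFFEE", "BISCUIT", "MILK"},
    {"COFFEE", "SUGAR"},
    {"BREAD", "MILK"},
]

result = apriori(toy, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), s)
```

The pruning step is what keeps the search tractable: an itemset can only be frequent if all of its subsets are frequent, so larger candidates are built only from itemsets that already passed the threshold.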
Now, let us return the items and itemsets with at least 38% support:

Most frequent single items

COFFEE = 0.606061
BISCUIT = 0.560606

Most frequent itemsets

COFFEE AND BISCUIT = 0.393939

Conclusions

With the support threshold set to 0.38 (38%), a frequent itemset is a set of items that occur together in at least 38% of all transactions in the database. In our example, coffee and biscuit form a frequent itemset at this threshold: they are bought together in about 39% of transactions. Individually, each item is purchased even more often: coffee appears in about 61% of transactions and biscuit in about 56%.
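As a quick sanity check on the figures above, the following sketch filters the reported support values against the 38% threshold. It also checks that the support of the pair does not exceed the support of either item alone, which must hold for any itemset. The dictionary name `reported` is hypothetical; the numbers are copied from the results listed in this report.

```python
# Support values as reported in this document.
reported = {
    ("COFFEE",): 0.606061,
    ("BISCUIT",): 0.560606,
    ("COFFEE", "BISCUIT"): 0.393939,
}
min_support = 0.38

# Keep only the itemsets that reach the threshold.
frequent = {items: s for items, s in reported.items() if s >= min_support}

# A pair can never be more frequent than its least frequent member.
pair_support = reported[("COFFEE", "BISCUIT")]
bounded = pair_support <= min(reported[("COFFEE",)], reported[("BISCUIT",)])

for items, s in frequent.items():
    print(" AND ".join(items), "=", s)
print("pair support bounded by single-item supports:", bounded)
```

Raising `min_support` to 0.40 in this sketch would drop the COFFEE and BISCUIT pair while keeping both single items, which shows how sensitive the result is to the chosen threshold.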

References

https://medium.com/edureka/apriori-algorithm-d7cc648d4f1e