Mining Of Skyline Patterns By Considering Both Frequent And Utility Constraints
Abstract
Frequent itemset mining (FIM) is the task of finding the itemsets that occur frequently in
a customer transaction database. However, FIM ignores the weight, interestingness, and unit profit
of the items. To reveal more detail, High-Utility Itemset Mining was proposed as a
generalization of FIM that finds the itemsets yielding high profit (or utility) in transaction
databases. By considering both the utility and the frequency measures, the SKYMINE algorithm finds
Skyline Frequency-Utility Patterns (SFUPs). However, SKYMINE has the disadvantage
that it requires a large amount of computation to find SFUPs. In 2019, Lin et al. presented an algorithm called
SKYFUP-D to mine SFUPs. However, dense databases, which contain many
similar transactions, degrade the performance of SKYFUP-D in both runtime and
memory usage. Therefore, this thesis proposes an algorithm called MSKY-D, an extension of
the original SKYFUP-D algorithm that applies a technique called transaction merging. The proposed
approach merges similar transactions in a transaction database to reduce the cost of database scans,
candidate checking, and memory usage. Experimental evaluations show that the MSKY-D
algorithm outperforms the SKYFUP-D algorithm in both runtime and memory usage,
especially on dense databases.
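
The idea of transaction merging can be illustrated with a minimal sketch. It is not the MSKY-D implementation: it assumes that "similar" transactions are those containing exactly the same set of items, that a transaction is represented as a mapping from item to its utility in that transaction, and that a merged transaction keeps the per-item utility sums together with a count so that frequency can still be computed. The function name merge_transactions and the sample data are hypothetical.

```python
def merge_transactions(transactions):
    """Merge transactions that contain exactly the same set of items.

    Each transaction is a dict mapping item -> utility of that item in
    the transaction (an assumed representation). Returns a list of
    (items, summed_utilities, count) tuples, one per distinct item set.
    """
    merged = {}
    for tx in transactions:
        key = tuple(sorted(tx))  # canonical form of the item set
        if key not in merged:
            # First occurrence: copy the utilities, start the count at 1.
            merged[key] = [dict(tx), 1]
        else:
            # Repeat occurrence: add utilities item by item, bump the count.
            utilities = merged[key][0]
            for item, utility in tx.items():
                utilities[item] += utility
            merged[key][1] += 1
    return [(key, utils, count) for key, (utils, count) in merged.items()]


# Hypothetical three-transaction database; the first two share the item set {a, b}.
db = [
    {"a": 5, "b": 2},
    {"b": 4, "a": 1},
    {"c": 3},
]
merged_db = merge_transactions(db)
for items, utilities, count in merged_db:
    print(items, utilities, count)
```

In this sketch the three transactions collapse into two merged entries, so a later scan visits fewer records while the summed utilities and the stored count still allow utility and frequency to be recovered.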