apriori_pattern_miner
Apriori frequent itemset miner for transaction datasets. The library
depends on the frequent_pattern_mining_protocols support library and
implements the generic pattern_miner_protocol defined in the
pattern_mining_protocols core library. It mines frequent itemsets using
deterministic level-wise candidate generation and anti-monotone pruning,
rescanning the transactions once per candidate level and counting
supports with a candidate hash tree backed by keyed bucket dictionaries.
It requires a dataset implementing transaction_dataset_protocol with
transactions represented as canonical sorted lists of unique declared
items.
API documentation
Open the ../../apis/library_index.html#apriori_pattern_miner link in a web browser.
Loading
To load this library, load the loader.lgt file:
| ?- logtalk_load(apriori_pattern_miner(loader)).
Testing
To test this library's predicates, load the tester.lgt file:
| ?- logtalk_load(apriori_pattern_miner(tester)).
Features
Deterministic Level-Wise Mining: Builds frequent itemsets level by level by generating deterministic candidate combinations, pruning candidates whose subsets are infrequent, and rescanning transactions once per level to compute support counts for all candidates using a candidate hash tree.
Candidate Hash Tree Counting: Counts supports for an entire candidate level by traversing a hash tree with keyed bucket and item dictionaries instead of linearly scanning bucket lists for every transaction.
Library Hashing: Uses the hashes library fnv1a_32 object to hash candidate items instead of relying on an ad hoc local hash function.
Apriori Join Step: Generates level candidates by pairwise joins of the previous frequent itemsets with shared prefixes.
Apriori Pruning: Rejects candidate itemsets whose immediate subsets are not all frequent using ordered subset checks over the previous level.
Canonical Transactions: Validates that transactions are sorted, duplicate-free, and restricted to declared items.
Flexible Support Thresholds: Supports relative minimum support and absolute minimum support count.
Model Export: Mined pattern collections can be exported as predicate clauses or written to a file.
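The join, pruning, and per-level rescan steps listed above can be illustrated with a minimal, language-neutral sketch in Python. It is not the library's Logtalk implementation: the function names (join_level, prune, count_support, apriori) are hypothetical, and supports are counted with plain subset checks rather than a candidate hash tree.

```python
from itertools import combinations


def join_level(frequent):
    """Join sorted k-itemsets (tuples) that share their first k-1 items."""
    frequent = sorted(frequent)
    candidates = []
    for i, left in enumerate(frequent):
        for right in frequent[i + 1:]:
            if left[:-1] != right[:-1]:
                break  # sorted input: no later itemset shares this prefix
            candidates.append(left + (right[-1],))
    return candidates


def prune(candidates, frequent):
    """Keep candidates whose immediate (k-1)-subsets are all frequent."""
    known = set(frequent)
    return [c for c in candidates
            if all(s in known for s in combinations(c, len(c) - 1))]


def count_support(candidates, transactions):
    """Scan the transactions once and count the support of each candidate."""
    counts = dict.fromkeys(candidates, 0)
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if items.issuperset(candidate):
                counts[candidate] += 1
    return counts


def apriori(transactions, min_count):
    """Return (itemset, support_count) pairs, level by level."""
    items = sorted({item for t in transactions for item in t})
    level = [(item,) for item in items]
    result = []
    while level:
        counts = count_support(level, transactions)
        frequent = [c for c in level if counts[c] >= min_count]
        result.extend((c, counts[c]) for c in frequent)
        level = prune(join_level(frequent), frequent)
    return result
```

Because every level is generated from a sorted list of frequent itemsets, the output order is deterministic: shorter itemsets first, each level in lexicographic order.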
Options
The mine/3 predicate accepts the following options:
minimum_support/1: Relative minimum support threshold in the interval ]0.0, 1.0]. The default is 0.5.
minimum_support_count/1: Absolute minimum support count. When both support options are provided, this option takes precedence.
maximum_pattern_length/1: Maximum itemset length to mine. The default is 1000, which is effectively capped by the longest transaction in the dataset.
minimum_pattern_length/1: Minimum itemset length retained in the mined result. The default is 1.
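As a rough illustration of how the two support options interact, the following Python sketch resolves them into a single absolute count. The function name effective_min_count is hypothetical, and rounding the relative threshold up to the next integer count is an assumption; the library's exact rounding is not documented here.

```python
from fractions import Fraction
from math import ceil


def effective_min_count(total, minimum_support=0.5, minimum_support_count=None):
    """Resolve the support options into one absolute support count."""
    if minimum_support_count is not None:
        # The absolute count takes precedence when both options are given.
        return minimum_support_count
    if not 0.0 < minimum_support <= 1.0:
        raise ValueError("minimum_support must be in the interval ]0.0, 1.0]")
    # Assumption: smallest count c such that c / total >= minimum_support.
    # Fraction(str(...)) avoids float artifacts such as 0.1 * 30 > 3.
    return ceil(Fraction(str(minimum_support)) * total)
```

For example, with 4 transactions and the default relative threshold of 0.5, an itemset would need to occur in at least 2 transactions under this assumption.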
Pattern miner representation
The mined pattern miner is represented by a compound term of arity 3 whose functor is chosen by the implementation. For example:
apriori_pattern_miner(ItemDomain, Patterns, Options)
Where:
ItemDomain: Canonical sorted list of declared dataset items.
Patterns: List of itemset(Items, SupportCount) terms ordered first by pattern length and then lexicographically.
Options: Effective mining options used to mine the frequent itemsets.
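The ordering of the Patterns list can be pictured with a small Python sketch. The function order_patterns and the use of (items, support_count) pairs in place of itemset/2 terms are illustrative assumptions, not part of the library.

```python
def order_patterns(patterns):
    """Order (items, support_count) pairs by length, then lexicographically."""
    return sorted(patterns, key=lambda pattern: (len(pattern[0]), pattern[0]))
```

Under this ordering, all single-item patterns come before any two-item pattern, and patterns of equal length are compared item by item.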
References
Agrawal, R. and Srikant, R. (1994) - “Fast algorithms for mining association rules in large databases”.