Distributed association rules mining using non-derivable frequent patterns
(ندگان)پدیدآور
پدیدآور نامشخصنوع مدرک
Textزبان مدرک
Englishچکیده
Mining association rules in distributed databases is an interesting problem in the context of parallel and distributed data mining. A number of approaches have, so far, been proposed for distributed mining of association rules. However, most of them consider all types of frequent itemsets the same, even though there are different types of itemsets in distributed databases, e.g., derivable and non-derivable. In this study, a new application of deduction rules is introduced for distributed mining of association rules which exploits the derivability of itemsets to reduce communication overhead and to enhance response time. A new algorithm is proposed which mines derivable and non-derivable frequent itemsets in a distributed database. Since the collection of derivable and non-derivable frequent itemsets form all frequent itemsets, our algorithm mines all frequent itemsets rather than a subset of them. In the algorithm, there is no need to scan local databases and exchange messages in order to obtain support counts of derivable frequent itemsets, since each site can produce them autonomously. Experimental evaluations on horizontally partitioned real-life datasets show that such exploitation drastically reduces communication and also improves response time. Therefore the new algorithm is useful when communication bandwidth is the main bottleneck.
کلید واژگان
Distributed data miningAssociation rules mining
Non-derivable frequent itemsets
distributed deduction rules
شماره نشریه
6تاریخ نشر
2009-01-011387-10-12




