Journal of Information Management

謝楠楨
Pages: 25-51
Date: 2005/04
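Abstract: This study proposes a four-stage procedure for mining medical databases. It addresses problems common in existing association rule mining research: discovered rules with unclear semantics, redundant rules, and meaningful rules lost because of the limitations of the traditional support/confidence mechanism. To give the discovered rules clear semantics, we first apply a cluster partitioning technique that automatically converts the quantitative data fields of a table into fuzzy sets expressed as linguistic terms. We then use self-organizing map (SOM) cluster analysis, guided by the relatively important data fields identified through sensitivity analysis and by the characteristics of the data itself, to divide the data into several clusters whose members share similar characteristics, and we perform association rule analysis on each cluster. An algorithm designed around the concept of the fuzzy resemblance relation then merges redundant rules that are semantically similar. Merging effectively reduces the number of discovered rules, and the retained rules are more informative and easier to interpret and apply in the medical domain. To judge the credibility of the rules, we also apply the truth value evaluation method from fuzzy databases and retain the rules with higher truth degrees. Finally, we validate the proposed approach on a real disease medical database.
Keywords: data mining; cluster partitioning; self-organizing map (SOM); fuzzy association rule; fuzzy resemblance relation; truth value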
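
The first stage, turning a quantitative field into fuzzy linguistic terms, can be illustrated with a minimal Python sketch. The abstract does not specify the shape of the membership functions or how the cluster centers are obtained, so the triangular functions, the `fuzzify` helper, and the example centers and labels below are all assumptions made for illustration, not the paper's method.

```python
from typing import Dict, List


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Membership of x in a triangular fuzzy set with the given corners."""
    if x == peak:
        return 1.0
    if left < x < peak:
        return (x - left) / (peak - left)
    if peak < x < right:
        return (right - x) / (right - peak)
    return 0.0


def fuzzify(x: float, centers: List[float], labels: List[str]) -> Dict[str, float]:
    """Map a numeric value to one membership degree per linguistic term.

    Assumption: `centers` are sorted cluster centers, one per label. Each
    term peaks at its own center and falls to zero at the neighboring
    centers; the outermost terms stay at 1.0 beyond the first/last center.
    """
    last = len(centers) - 1
    degrees: Dict[str, float] = {}
    for i, (center, label) in enumerate(zip(centers, labels)):
        if (i == 0 and x <= center) or (i == last and x >= center):
            degrees[label] = 1.0
            continue
        left = centers[i - 1] if i > 0 else center
        right = centers[i + 1] if i < last else center
        degrees[label] = triangular(x, left, center, right)
    return degrees


# Hypothetical centers for a numeric field, e.g. produced by a clustering step.
CENTERS = [90.0, 120.0, 160.0]
LABELS = ["low", "normal", "high"]
print(fuzzify(105.0, CENTERS, LABELS))
```

A value halfway between two centers belongs equally to both neighboring terms, which is what lets a rule refer to "low" or "normal" rather than to a crisp numeric interval.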

Finding Relevant Fuzzy Association Rules from Medical Databases


Abstract: In data mining applications, association rules can be used to support decision making. However, association rule algorithms usually yield a large number of rules, many of which contain redundant or irrelevant information or describe trivial knowledge. In this paper we present a four-stage data mining process for finding relevant fuzzy association rules in medical databases. Fuzzy association rules are especially suitable for medical mining because they are simple, linguistically interpretable rules and do not have the drawbacks of symbolic or crisp association rules. In the first stage, a cluster partitioning technique automatically transforms quantitative values into fuzzy linguistic terms, which are modeled as fuzzy sets defined on the appropriate attribute domains. Next, a Kohonen self-organizing map (SOM) identifies clusters based on shared feature attribute values. The resulting clusters are then characterized by feature attributes determined with an Apriori association rule algorithm. Because the association rule algorithm tends to generate large numbers of rules, we present interactive strategies that prune redundant association rules on the basis of the fuzzy resemblance relation to improve their readability, and we evaluate the truth degree of the discovered fuzzy association rules with a truth evaluation mechanism. Finally, we demonstrate our approach on a real disease medical database.
Keywords: data mining; cluster partitioning; self-organizing map (SOM); fuzzy association rule; fuzzy resemblance relation; truth value
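
To make the support/confidence mechanism for fuzzy rules concrete, the following Python sketch computes both measures for a single rule. The abstract does not give the paper's exact definitions, so this uses a common formulation as an assumption: a record matches an itemset to the degree of its weakest membership (minimum t-norm), and support is the average match degree over all records. The item names and records are hypothetical.

```python
from typing import Dict, List, Sequence

# A record maps each fuzzy item (attribute=term) to a membership degree in [0, 1].
Record = Dict[str, float]


def match_degree(record: Record, items: Sequence[str]) -> float:
    """Degree to which a record contains all items (minimum t-norm, assumed)."""
    return min(record.get(item, 0.0) for item in items)


def fuzzy_support(records: List[Record], items: Sequence[str]) -> float:
    """Average match degree of the itemset over all records."""
    return sum(match_degree(r, items) for r in records) / len(records)


def fuzzy_confidence(
    records: List[Record], antecedent: Sequence[str], consequent: Sequence[str]
) -> float:
    """Support of antecedent and consequent together, relative to the antecedent."""
    antecedent_support = fuzzy_support(records, antecedent)
    if antecedent_support == 0.0:
        return 0.0
    both = list(antecedent) + list(consequent)
    return fuzzy_support(records, both) / antecedent_support


# Hypothetical fuzzified records, not taken from the paper's data.
RECORDS: List[Record] = [
    {"age=old": 1.0, "bp=high": 0.5},
    {"age=old": 0.5, "bp=high": 0.5},
    {"age=old": 0.0, "bp=high": 1.0},
    {"age=old": 0.5, "bp=high": 0.0},
]

support = fuzzy_support(RECORDS, ["age=old", "bp=high"])
confidence = fuzzy_confidence(RECORDS, ["age=old"], ["bp=high"])
print(f"support={support}, confidence={confidence}")
```

Rules whose support or confidence falls below chosen thresholds would be discarded. Other t-norms, such as the product, are equally plausible choices and would change the numbers.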
