Journal of Information Management (資訊管理學報)

簡禎富;林國勝;
Pages: 133-159
Date: 2006/10
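Abstract: Breakthroughs in microarray technology for biochips and in gene cloning, together with advances in information technology, have driven rapid growth in biotechnology and biochip research and applications over the past decade. This growth has also raised many data processing and analysis problems that urgently need to be solved, in particular that biochip data have many variables but few samples. This study develops data mining methods and models tailored to the characteristics of binary cDNA biochip data, in order to explore the relationships between diseases and specific genes and to construct rules that can support medical diagnostic decisions. To validate the approach, the study uses breast cancer microarray data from the Stanford University microarray database: from more than 40,000 genes and 64 samples, Significance Analysis of Microarrays (SAM) and a decision tree are used to identify influential genes and diagnostic decision rules. The results support the validity of the proposed method.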
Keywords: Biochip;Data mining;Decision tree;Microarray technology;Significance analysis;

A Data Mining Framework for Binary cDNA bio-chip Data Analysis and Its Validation


Abstract: Owing to breakthroughs in microarray technology for biochips and in gene cloning, biotechnology is now an emerging and promising industry worldwide. Although advances in information technology enable the complex computation and large-scale data storage that biotechnology requires, a number of critical issues remain to be addressed in both practice and research. This study aims to develop a data mining framework for analyzing large bio-chip datasets, which differ from the data encountered in manufacturing and service industries. In particular, genes that differ between normal and abnormal individuals are extracted into decision rules to clarify the relationships between genes and diseases. We adopt a cDNA microarray dataset from breast cancer patients to validate the proposed approach. We first extract significant genes from more than 44,000 genes and then use a decision tree to derive classification rules that support medical diagnosis. The results show the practical viability of this approach.
Keywords: Biochip;Data mining;Decision Tree;Microarray;Significance Analysis of Microarrays;
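
The two-stage idea in the abstract (filter genes by a significance statistic, then derive a classification rule from the survivors) can be illustrated with a minimal sketch. This is not the authors' implementation: it computes only the SAM-style relative difference d = (mean_1 - mean_0) / (s + s0) without the permutation-based false discovery rate step, and it fits a depth-one decision tree (a stump) rather than a full tree. The function names, the fudge factor `s0 = 0.1`, and the toy expression values are all assumptions made for illustration.

```python
import math
from statistics import mean

def sam_score(values, labels, s0=0.1):
    """SAM-style relative difference for one gene: (m1 - m0) / (s + s0)."""
    g1 = [v for v, y in zip(values, labels) if y == 1]
    g0 = [v for v, y in zip(values, labels) if y == 0]
    n1, n0 = len(g1), len(g0)
    m1, m0 = mean(g1), mean(g0)
    # Pooled sum of squared deviations around each class mean.
    ss = sum((v - m1) ** 2 for v in g1) + sum((v - m0) ** 2 for v in g0)
    s = math.sqrt((1 / n1 + 1 / n0) * ss / (n1 + n0 - 2))
    return (m1 - m0) / (s + s0)

def select_genes(expr, labels, k):
    """Return the k gene names with the largest absolute score."""
    ranked = sorted(expr, key=lambda g: abs(sam_score(expr[g], labels)), reverse=True)
    return ranked[:k]

def fit_stump(expr, labels, genes):
    """Depth-one decision tree: best (errors, gene, threshold, label_if_above)."""
    best = None
    for g in genes:
        vals = sorted(set(expr[g]))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2
            for above in (1, 0):
                errors = sum(
                    (above if v > t else 1 - above) != y
                    for v, y in zip(expr[g], labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, g, t, above)
    return best

# Hypothetical toy data: three genes, six samples (1 = tumor, 0 = normal).
expr = {
    "gene_a": [2.1, 1.9, 2.3, 0.2, 0.1, 0.3],
    "gene_b": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],
    "gene_c": [0.1, 0.9, 0.4, 0.8, 0.2, 0.5],
}
labels = [1, 1, 1, 0, 0, 0]

selected = select_genes(expr, labels, k=1)
errors, gene, threshold, above = fit_stump(expr, labels, selected)
print(f"if {gene} > {threshold:.2f} then class {above} ({errors} training errors)")
```

Filtering first matters in the setting the abstract describes: with tens of thousands of genes and only 64 samples, a tree grown on all genes would have many chances to split on noise, so reducing the candidate set before rule induction is a common precaution.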
