Journal of Information Management (資訊管理學報)

Authors: 戴偉勝; 許中川
Pages: 1-25
Date: 2010/12
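Abstract (translated from Chinese): Corporate databases today hold large volumes of high-dimensional mixed-type data containing both numeric and categorical attributes. Such data often conceal useful information, so analyzing them effectively to support decision making has become an important issue in business management. In data mining, visualization has always been an important part of the initial stage of data analysis. The self-organizing map (SOM) provides an efficient data visualization interface that lets users analyze the characteristics of high-dimensional data on a low-dimensional map. For most SOM algorithms, however, the user must decide the map size before training, so the final mapping is constrained by that preset fixed-size map and cannot add neurons as the nature of the data requires. Although researchers have proposed the growing self-organizing map (GSOM), which has a more flexible structure, to overcome this problem, GSOM still cannot effectively handle mixed-type data with numeric and categorical attributes. This study proposes a growing mixed-type self-organizing map architecture and training algorithm that handles high-dimensional mixed-type data with a more flexible map structure. Experimental results confirm that the proposed method not only represents the topological relationships in mixed-type data but also achieves better clustering performance than the conventional self-organizing map.
Keywords (translated from Chinese): data mining; data visualization; self-organizing map; mixed-type data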

Visualizing Mixed-Type Data with a Growing Self-Organizing Map


Abstract: Large amounts of high-dimensional mixed-type data, comprising both numeric and categorical attributes, are now common in corporate databases. Being able to analyze such data is important for supporting decision making. Visualization is essential in data mining, especially at the initial stage of data analysis. The Self-Organizing Map (SOM) gives users an efficient data visualization interface for analyzing the characteristics of high-dimensional data on a low-dimensional map. However, most SOMs require the map size to be fixed before training. Consequently, the resulting map is constrained to a static, fixed size and cannot grow extra neurons in accordance with the nature of the data. Although the growing SOM (GSOM) was proposed to tackle this problem with a more flexible structure, GSOM cannot handle mixed-type data, which include numeric as well as categorical attributes. In this study, we propose the Growing Mixed SOM (GMixSOM) to handle high-dimensional mixed-type data in a map with a flexible structure. Experimental results indicate that the proposed model not only presents the topological relationships among mixed-type data but also clusters data better than the conventional GSOM.
Keywords: data mining; data visualization; Self-Organizing Map (SOM); mixed-type data
