Journal of Information Management (資訊管理學報)

陳林志;陳大仁;葉國暉;吳忠澄;
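Pages: 273-315
Date: 2015/07
Abstract: In recent years, online blogs have grown as rapidly as other social networking sites. We typically use blog search engines (for example, Technorati, Blogpulse, and Google Blog Search) to find the blog posts that interest us most. When searching with these engines, however, we are likely to run into two problems: synonymy (two terms differ in form but share the same meaning) and polysemy (one term has several meanings). In this paper, we use two semantic analysis models, Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA), to address both problems. LSA uses Singular Value Decomposition (SVD) to capture the synonym relationships that exist between terms; PLSA handles polysemy and explicitly distinguishes the different meanings and usages of a term. Based on the simulation results, we conclude that semantic analysis models can improve the performance of blog search engines.
Keywords: semantic analysis; blog search; latent semantic analysis; probabilistic latent semantic analysis; singular value decomposition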

Using the Semantic Models to Analyze the Online Blog Posts


Abstract: Purpose- In recent years, the online blogging community has grown as rapidly as other social network services. We typically use blog search engines, such as Technorati, Blogpulse, and Google Blog Search, to find the blog posts most relevant to what we are seeking. Design/methodology/approach- In this paper, we use two semantic analysis models, Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA), to deal with the synonym and polysemy problems that arise in blog search. Findings- According to the results of the simulation analysis, we conclude that the semantic analysis models can be applied effectively to blog search engines. Research limitations/implications- Searching with a blog search engine runs into two problems: synonymy (two terms are syntactically different but semantically interchangeable) and polysemy (one term has several meanings). Practical implications- LSA uses Singular Value Decomposition (SVD) to capture the synonym relationships between terms. PLSA deals with polysemy and explicitly distinguishes between the different meanings and different types of usage of a term. Originality/value- We claim that the semantic analysis models can effectively improve the performance of blog search engines.
Keywords: semantic analysis; blog search; latent semantic analysis; probabilistic latent semantic analysis; singular value decomposition
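
To illustrate how LSA can surface a synonym relationship, the following is a minimal sketch and not the authors' implementation. The toy corpus, the function names (`top_eigenpairs`, `lsa_term_vectors`), and the choice of two latent dimensions are all assumptions made for illustration. To stay dependency-free, the sketch replaces a library SVD routine with power iteration and deflation on the Gram matrix A A^T, whose eigenvectors are the left singular vectors of the term-document matrix A and whose eigenvalues are the squared singular values.

```python
import math

# Hypothetical toy corpus: "car" and "automobile" never co-occur,
# but both appear alongside "engine".
terms = ["car", "automobile", "engine", "blog", "post"]
docs = [
    ["car", "engine"],
    ["automobile", "engine"],
    ["blog", "post"],
    ["blog", "post"],
]

# Term-document count matrix A (rows: terms, columns: documents).
A = [[float(doc.count(t)) for doc in docs] for t in terms]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_eigenpairs(g, k, iters=200):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(g)
    g = [row[:] for row in g]  # work on a copy
    pairs = []
    for _ in range(k):
        v = [1.0] * n
        for _ in range(iters):
            w = [sum(a * b for a, b in zip(row, v)) for row in g]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        gv = [sum(a * b for a, b in zip(row, v)) for row in g]
        lam = sum(x * y for x, y in zip(v, gv))  # Rayleigh quotient
        pairs.append((lam, v))
        # Deflate: remove the component just found.
        for i in range(n):
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return pairs


def lsa_term_vectors(a, k):
    """Term coordinates U_k * Sigma_k of the rank-k truncated SVD of a."""
    gram = [[sum(x * y for x, y in zip(r1, r2)) for r2 in a] for r1 in a]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(len(a))]


latent = lsa_term_vectors(A, k=2)
car, auto, blog = (terms.index(t) for t in ("car", "automobile", "blog"))
raw_sim = cosine(A[car], A[auto])
latent_sim = cosine(latent[car], latent[auto])
print(f"raw cosine(car, automobile)    = {raw_sim:.3f}")
print(f"latent cosine(car, automobile) = {latent_sim:.3f}")
```

In the raw count matrix the rows for "car" and "automobile" are orthogonal because the two terms never share a document. After truncation to two latent dimensions they collapse onto the same direction, since both co-occur with "engine", while remaining unrelated to "blog". For a real corpus, a library SVD routine and a weighting scheme such as TF-IDF would likely be more appropriate than this hand-rolled version.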

