資訊管理學報 (Journal of Information Management)

郝沛毅;歐仁彬;黃天受;林振穎;吳建生;
Pages: 363-395
Date: 2018/10
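Abstract (translated from the Chinese): Successfully predicting the direction of stock price movements has obvious benefits. According to the efficient market hypothesis, the value of a company's stock is determined by all currently available information. When analysts, investors, and institutional traders evaluate current stock prices, news plays an important role in the valuation process. Financial news carries information about a company's fundamentals as well as qualitative information that shapes the expectations of market participants. In the big data era, the volume of online news articles keeps growing, and faced with this much text, more and more institutions rely on the high processing speed of modern computers for text mining and machine learning to build more accurate models of stock price trends. Using the unstructured data in articles is the most challenging research direction and is the focus of this study. In this paper, we extract latent topic models and sentiment information from news articles, and we develop a fuzzy support vector machine that fuses the rich information in online news articles to predict whether stock prices will rise or fall. We consider fuzzy theory well suited to this study because text is inherently fuzzy (for example, high/low, big/small) and because an ambiguous, fuzzy boundary separates the rise and fall trends (for example, a rise of 0.01% and a rise of 1% both belong to the rise category, but to clearly different degrees). The highest prediction accuracy in this study was 87% for food stocks, 71% for semiconductor stocks, and 69% for computer peripheral stocks. By comparison, a traditional support vector machine that predicts price trends from keywords achieved an accuracy of only slightly above 50% (close to random guessing), so the proposed method clearly outperforms the traditional support vector machine prediction model.
Keywords (translated from the Chinese): stock price prediction; sentiment analysis; latent Dirichlet allocation; text mining; fuzzy theory; support vector machine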

Sentiment and Topic Analysis on Financial News for Stock Movement Prediction by Using Fuzzy Support Vector Machine


Abstract:
Purpose: In the big data era, the number of news articles has grown tremendously. Faced with such a large volume of textual data, more and more institutions rely on the processing power of modern computers for text mining and machine learning to predict the stock market more accurately. Extracting the fundamental information contained in unstructured text is the most challenging aspect of this research and is therefore the goal of this work.
Design/methodology/approach: In this study, we extract hidden topic models and sentiment information from news articles. In addition, we develop a fuzzy support vector machine that merges the abundant information in online news in order to forecast the trend of stock prices. Fuzzy set theory suits this study because text is itself fuzzy (for example, high/low and big/small) and because the boundary between the rise and fall categories is ambiguous. For example, a rise of 10% and a rise of 1% both belong to the rise category, but to different degrees.
Findings: The highest forecast accuracy was 87% for food-related stocks, 71% for semiconductor-related stocks, and 69% for computer-peripheral-related stocks. A traditional support vector machine forecast stock price trends with an accuracy of just over 50% (close to random guessing), so the method proposed in this study is significantly better than the traditional support vector machine forecasting model.
Research limitations/implications: This study focused only on accurately classifying stock movement based on hidden topic and sentiment features. In future work, we plan to investigate more complex semantic features.
Practical implications: Successfully predicting the direction of stock price movements has obvious advantages. According to the Efficient Market Hypothesis, the price of a stock is determined by all information available at the moment. Financial news carries information about a firm's fundamentals as well as qualitative information that influences the expectations of market participants. This study applies sentiment and topic analysis to financial news to predict stock movement, which can help analysts, investors, and institutional traders evaluate current stock prices effectively.
Originality/value: This study is, to the best of our knowledge, the first attempt to apply a fuzzy support vector machine with hidden topic and semantic features to the prediction of stock movement in Taiwan.
Keywords: stock trend prediction; sentiment analysis; latent Dirichlet allocation; text mining; fuzzy theory; support vector machine
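
The idea that a small rise and a large rise belong to the rise category "to different degrees" can be illustrated with a minimal sketch. The abstract does not give the membership function or the exact fuzzy SVM formulation used in the paper, so everything below is an assumption: the linear ramp in `membership`, its `saturation` and `floor` parameters, the two-feature input (a topic proportion and a sentiment score), and the choice of one common fuzzy SVM objective in which each sample's hinge loss is scaled by its membership degree.

```python
def membership(daily_return, saturation=0.01, floor=0.05):
    """Degree in [floor, 1] to which a day belongs to its rise/fall class.

    Hypothetical linear ramp (not taken from the paper): membership grows
    with the absolute return and saturates at 1.0 once the move reaches
    `saturation` (1% by default). `floor` keeps tiny moves from vanishing.
    """
    degree = abs(daily_return) / saturation
    return max(floor, min(1.0, degree))


def fuzzy_svm_objective(w, b, samples, C=1.0):
    """Membership-weighted soft-margin objective for a linear classifier.

    Computes 0.5 * ||w||^2 + C * sum_i s_i * max(0, 1 - y_i * (w . x_i + b)),
    where s_i is the membership of sample i. `samples` is a list of
    (features, daily_return) pairs; the label is +1 for a positive return
    and -1 otherwise (treating a zero return as a fall is an assumption).
    """
    regularizer = 0.5 * sum(wi * wi for wi in w)
    penalty = 0.0
    for features, daily_return in samples:
        label = 1 if daily_return > 0 else -1
        score = sum(wi * xi for wi, xi in zip(w, features)) + b
        hinge = max(0.0, 1.0 - label * score)
        penalty += membership(daily_return) * hinge
    return regularizer + C * penalty


# Hypothetical features per day: [topic proportion, sentiment score].
days = [
    ([0.5, 0.0], 0.01),    # rise of 1%: full membership
    ([0.5, 0.0], 0.0001),  # rise of 0.01%: membership clipped to the floor
]
objective = fuzzy_svm_objective([1.0, 0.0], 0.0, days)
```

With this weighting, misclassifying the 0.01% day costs far less than misclassifying the 1% day, so near-flat days have little influence on where the decision boundary is placed.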
