Monday, April 3, 2017

2-1. Previous Research

In principle, sentiment analysis can be regarded as an offshoot of the field of text analysis, which in turn belongs to content analysis. Tracing the origin of sentiment analysis therefore requires going back to these root theories. Philipp Mayring, Professor of Psychology at Alps-Adria University Klagenfurt, lists Bible interpretation as one of the precursors of content analysis (Mayring & Klagenfurt, 2008, para. 6). For many centuries, every word of both the Old and New Testaments has been scrutinized thoroughly, not only by scholars but also by ordinary individuals. Although the motivation for this effort ranges widely, from protecting the faith to attacking it, it is safe to say that no other books in human history have been examined so fully, for so many purposes and from so many angles. For instance, the comparison between Luke and Acts has been a popular subject in efforts to confirm or deny the connections between them (Walters, 2009). Moreover, some critics have pointed out that the preface of Luke resembles the writing style of traditional Greek-speaking historians, such as the opening statement of Josephus' "Against Apion" (Allison A. Trites & William J. Larkin, 2016). As an aside, this convention was presumably established out of fear of the demagogues who agitated the people of ancient Greece and led it to collapse. These studies clearly illustrate how text analysis has been used in the theological world.
As for more contemporary text analysis, John Wesley, the 18th-century Anglican cleric, leader of Methodism, and spiritual founder of Aoyama Gakuin, proposed a primitive method for understanding the true meaning of the Holy Book:

Many biblical texts are intertexts, composed with other biblical texts in
mind and heart, and still other texts, unknown or unintended by the
author, that come to the interpreter’s mind in canonical context. The
talented interpreter listens for echoes of other biblical texts, however low
their volume, and looks for allusions, however dim their reflection, that
link biblical texts together, the one glossing and thickening the meaning
of the other. (How to Read, p.43)

This explanation can be understood as a form of content analysis applied to the Bible's stories, which contain numerous intricate descriptions that require careful exegesis.
However, it took a long time for content analysis to be systematically compiled. In the mid-20th century, Bernard Berelson was the first to publish a book on content analysis, “Content Analysis in Communication Research” (1952). It was presumably influenced by the prior work of Paul F. Lazarsfeld and Harold D. Lasswell in the 1920s and 1930s, as well as by the U.S. government-sponsored project, directed by Lasswell, to evaluate enemy propaganda during World War II (Content Analysis A method, p.1). After that, the techniques began to be employed by many scholars for assessments in various fields, from politics to finance.
As for “sentiment analysis” and its underlying technique, “text mining”, their emergence was much more recent than that of these ancestors. According to Bing Liu, a professor at the University of Illinois at Chicago (UIC), few investigations using this method were conducted before 2000, because the poor network environment of the time yielded poor data mining results. Since 2000, however, thanks to advances in information technology and rapidly developing computer software, this field has become one of the major research areas (Sentiment Analysis and, p.10). Various reports on this approach have appeared since then. For example, the 2001 paper written by Sanjiv Das and Mike Chen (Yahoo! for Amazon, 2004) is regarded by researchers in the field as one of the pioneering works in this category (Lee, 2008, para. 1). In the paper, they collected investors' postings from stock message boards and assessed how the investors' sentiments were affected by management announcements, press releases, third-party news, regulatory changes, and the like. The study notably employs five algorithms to classify each message: the Naive Classifier, Vector Distance Classifier, Discriminant-Based Classifier, Adjective-Adverb Phrase Classifier, and Bayesian Classifier. Moreover, the authors created additional programs to collect data and to help those classifiers evaluate the text. It is remarkable that even the earliest stage of sentiment analysis research already relied on computer-based applications. Equally impressive is their analysis process, in which they divided the opinions into three groups: bullish (optimistic), bearish (pessimistic), and neutral (comprising either spam or messages that are neither bullish nor bearish). These days, papers on sentiment analysis tend to use the terms “Positive”, “Negative”, and “Neutral” instead. Not only their methods but also their research results are quite interesting.
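The three-way labeling described above can be illustrated with a minimal sketch. This is not one of the five classifiers from the paper: it is a simple word-counting rule written only for illustration, and the word lists, the function names `classify_message` and `sentiment_index`, and the definition of the index as "bullish count minus bearish count" are all assumptions.

```python
# Hypothetical word lists, chosen only for illustration.
BULLISH_WORDS = {"buy", "up", "rally", "strong", "gain"}
BEARISH_WORDS = {"sell", "down", "drop", "weak", "loss"}

PUNCTUATION = ".,!?;:\"'()"


def classify_message(text):
    """Label one message as 'bullish', 'bearish' or 'neutral' by counting cue words."""
    words = [w.strip(PUNCTUATION).lower() for w in text.split()]
    bullish_hits = sum(w in BULLISH_WORDS for w in words)
    bearish_hits = sum(w in BEARISH_WORDS for w in words)
    if bullish_hits > bearish_hits:
        return "bullish"
    if bearish_hits > bullish_hits:
        return "bearish"
    # Ties and messages without any cue word (e.g. spam) count as neutral.
    return "neutral"


def sentiment_index(messages):
    """Assumed aggregate: number of bullish messages minus number of bearish ones."""
    labels = [classify_message(m) for m in messages]
    return labels.count("bullish") - labels.count("bearish")
```

A real system would learn its cues from labeled messages instead of relying on fixed word lists; the sketch only shows how per-message labels can be rolled up into a single daily figure.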
Figures 1 and 2 from the paper show a close correlation between the sentiment indexes that the authors calculated from messages on the “Yahoo” boards and the actual stock prices of Apple on 18 October 2000 and 20 October 2000. Surprisingly, Figure 2 even hints that the sentiment index signals the imminent rise of the stock price before it actually happens. By contrast, Figure 3, which illustrates the same comparison for Amazon on 11 December 2000, reveals the opposite outcome, in which the two trends move in opposite directions. Interestingly, the authors account for this unexpected result by saying “On other days, such as for Amazon on 11th December, 2000 (Figure 3), there appears to be almost no relationship between sentiment and stock price, was a precursor to stock price change (Yahoo! for Amazon, 2004, P. 7).”
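How closely two such series move together can be quantified with the Pearson correlation coefficient. The sketch below is only an illustration of that general measure, not the procedure used in the paper; the function names and the `lag` parameter (which pairs each sentiment value with a later price to check whether sentiment leads price) are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def lagged_pearson(sentiment, price, lag=0):
    """Correlate sentiment at time t with price at time t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    # Drop the last `lag` sentiment values and the first `lag` prices
    # so that each sentiment value is paired with a later price.
    return pearson(sentiment[: len(sentiment) - lag], price[lag:])
```

A value near 1 means the two series rise and fall together, a value near -1 means they move in opposite directions, and a value near 0 means there is almost no linear relationship.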
The 2002 paper by Bo Pang, Lillian Lee and Shivakumar Vaithyanathan (“Thumbs up?” 2002) is also viewed as an early sentiment analysis document (Structured Models). In the paper, they take movie reviews and decide whether each one is “Positive” or “Negative”. In addition, they compare the accuracy of humans and algorithms in distinguishing positive from negative words. One of the distinctive features of this paper is that it employs “machine learning techniques”. According to “WhatIs.com”, machine learning means “a type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed”.[1] Such machine learning algorithms can train themselves on given data and can be specialized for analysis in any field, depending on the type of information supplied. Three different machine learning methods are used in this research: “Naive Bayes classification”, “maximum entropy classification,” and “support vector machines”. Each of them is a traditional analysis method that has been studied for a long time in other fields. Another interesting point is that, contrary to general expectations, the paper shows that human intuition is considerably inferior to these standard machine learning techniques at analyzing a text's sentiment-bearing elements, such as positive or negative words. Figure 4 from the paper clearly suggests that the algorithms' accuracy exceeds that of the humans.
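To give a feel for the first of these methods, here is a minimal sketch of a Naive Bayes text classifier with add-one (Laplace) smoothing. It is not the implementation used in the paper: the class name `NaiveBayesSentiment`, the simple whitespace tokenizer, and the choice to ignore words never seen in training are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Lowercase, split on whitespace, strip surrounding punctuation.
        tokens = (w.strip(".,!?;:\"'()").lower() for w in text.split())
        return [t for t in tokens if t]

    def train(self, labeled_docs):
        """labeled_docs: iterable of (text, label) pairs."""
        for text, label in labeled_docs:
            self.doc_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        """Return the label with the highest log posterior score."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        # Assumption: words never seen during training are skipped.
        words = [w for w in self.tokenize(text) if w in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            score = math.log(self.doc_counts[label] / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

The classifier is trained with `train([(text, label), ...])` on reviews already marked “positive” or “negative”, after which `predict(text)` returns the more probable label for a new review. Working in log space avoids numerical underflow when many small word probabilities are multiplied together.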




http://www.ijcaonline.org/archives/volume150/number6/alhojely-2016-ijca-911545.pdf
