In principle, sentiment analysis can be regarded as an offshoot of the field of
text analysis, which in turn belongs to content analysis. To trace the origin
of sentiment analysis, it is therefore necessary to go back to these root
theories. Philipp Mayring, Professor of Psychology at Alpen-Adria University
Klagenfurt, lists Bible interpretation as one of the precursors of content
analysis (Mayring & Klagenfurt, 2008, para. 6). For many centuries, every word
of both the Old and New Testaments has been scrutinized thoroughly, not only by
scholars but also by ordinary individuals. Although the motivation for this
effort varies widely, from protecting the faith to attacking it, arguably no
other books in human history have been examined so fully, for so many purposes
and from so many angles. For instance, the comparison between Luke and Acts has
been a popular subject in the effort to confirm or deny the connections between
them (Walters, 2009). Moreover, some critics have pointed out that the preface
of Luke resembles the writing style of traditional Greek-speaking historians,
such as the opening statement of Josephus' "Against Apion" (Allison A. Trites &
William J. Larkin, 2016). As an aside, this convention was presumably
established out of fear of the demagogues who agitated the people of ancient
Greece and led to its collapse. These examples illustrate how text analysis has
been used in the theological world.
As for more recent text analysis, John Wesley, the 18th-century Anglican
cleric, leader of Methodism, and spiritual founder of Aoyama Gakuin, proposed
an early method for understanding the true meaning of the Holy Book:
Many biblical texts are intertexts, composed with other biblical texts in
mind and heart, and still other texts, unknown or unintended by the author,
that come to the interpreter’s mind in canonical context. The talented
interpreter listens for echoes of other biblical texts, however low their
volume, and looks for allusions, however dim their reflection, that link
biblical texts together, the one glossing and thickening the meaning of the
other. (How to Read, p.43)
This explanation can be understood as a form of content analysis applied to the
Bible's stories, which contain numerous intricate descriptions that require
careful exegesis.
However, it took a long time for content analysis to be systematically
compiled. It was not until the mid-20th century that Bernard Berelson published
a book on the subject, “Content Analysis in Communication Research”, in 1952.
It was presumably influenced by the earlier work of Paul F. Lazarsfeld and
Harold D. Lasswell in the 1920s and 1930s, as well as by a U.S.
government-sponsored project, directed by Lasswell, to evaluate enemy
propaganda during World War II (Content Analysis A method, p.1). After that,
the techniques began to be employed by many scholars for assessments in fields
ranging from politics to finance.
As for sentiment analysis, or its essential function, text mining, its
emergence is much more recent than that of its ancestors. According to Bing
Liu, a professor at the University of Illinois at Chicago (UIC), few studies
used this method before 2000, because the limited network environment of the
time led to poor data mining results. Since 2000, however, thanks to advances
in information technology, the field has become one of the major research
areas, supported by rapidly developing computer software (Sentiment Analysis
and, p.10). Various reports on this approach have appeared since then. For
example, the 2001 paper by Sanjiv Das and Mike Chen (Yahoo! for Amazon, 2004)
is regarded by researchers in the field as one of the pioneering works in this
category (Lee, 2008, para. 1).
In the paper, they collected investors’ postings from stock message boards and
assessed how investor sentiment is affected by management announcements, press
releases, third-party news, regulatory changes, and the like. Notably, the
study employs five algorithms to classify each message: the Naive Classifier,
Vector Distance Classifier, Discriminant-Based Classifier, Adjective-Adverb
Phrase Classifier, and Bayesian Classifier. The authors also wrote additional
programs to collect the data and to help the classifiers evaluate the text. It
is remarkable that even this earliest stage of sentiment analysis research
already relied on computer-based applications. Equally impressive is their
procedure of dividing the opinions into three groups: bullish (optimistic),
bearish (pessimistic), and neutral (comprising either spam or messages that are
neither bullish nor bearish), although many recent papers on sentiment analysis
prefer the terms “Positive”, “Negative”, and “Neutral” instead. Not only their
methods but also their research results are quite interesting.
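
The three-way labeling described above can be illustrated with a very small
sketch. It is not one of the five classifiers from the paper: it is a plain
word-counting rule, and the word lists and the function names
(classify_message, count_labels) are hypothetical, chosen only to show what
bullish, bearish, and neutral labels look like in practice.

```python
from collections import Counter

# Hypothetical toy word lists, invented for illustration only.
# They are NOT taken from Das and Chen's paper.
BULLISH_WORDS = {"buy", "strong", "up", "rally", "undervalued"}
BEARISH_WORDS = {"sell", "weak", "down", "crash", "overvalued"}


def classify_message(text: str) -> str:
    """Label one message as 'bullish', 'bearish', or 'neutral'."""
    # Lowercase each word and strip common trailing punctuation.
    tokens = [word.strip(".,!?").lower() for word in text.split()]
    bullish_hits = sum(token in BULLISH_WORDS for token in tokens)
    bearish_hits = sum(token in BEARISH_WORDS for token in tokens)
    if bullish_hits > bearish_hits:
        return "bullish"
    if bearish_hits > bullish_hits:
        return "bearish"
    # Ties, spam, and messages without opinion words all end up here.
    return "neutral"


def count_labels(messages: list[str]) -> Counter:
    """Count how many messages fall into each of the three groups."""
    return Counter(classify_message(message) for message in messages)
```

A rule this simple cannot handle negation or sarcasm, which is presumably one
reason a study would combine several more elaborate classifiers.
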
Figures 1 and 2 of the paper show a close correlation between the sentiment
index that the authors calculated from messages on the Yahoo! boards and the
actual stock price of Apple on 18 October 2000 and 20 October 2000.
Surprisingly, Figure 2 hints that the sentiment index signals the imminent rise
of the stock price before it actually happens. By contrast, Figure 3, which
shows the same comparison for Amazon on 11 December 2000, reveals the opposite
outcome, with the two series moving inversely to each other. Interestingly, the
authors account for this unfavorable result by saying “On other days, such as
for Amazon on 11th December, 2000 (Figure 3), there appears to be almost no
relationship between sentiment and stock price, was a precursor to stock price
change” (Yahoo! for Amazon, 2004, p. 7).
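
To make the idea of comparing a sentiment index with a price series concrete,
here is a minimal sketch. It assumes that the index for a time interval is
simply the number of bullish messages minus the number of bearish ones, and
that the relationship is measured with the Pearson correlation coefficient.
Both choices are assumptions made for illustration; the paper's own index
construction may differ, and the function names are hypothetical.

```python
import math


def sentiment_index(labels):
    """Net sentiment for one interval: bullish count minus bearish count.

    This definition is an assumption for illustration, not the paper's.
    """
    return labels.count("bullish") - labels.count("bearish")


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)
```

A value near +1 would correspond to the index and the price rising and falling
together, a value near -1 to the two moving inversely, and a value near 0 to
almost no linear relationship.
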
The 2002 paper by Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan (“Thumbs
up?” 2002) is also regarded as an early work on sentiment analysis (Structured
Models). In the paper, they take movie reviews and determine whether each is
“Positive” or “Negative”. They also compare the accuracy of humans and
algorithms in distinguishing positive from negative words. One distinctive
feature of this paper is its use of machine learning techniques. According to
WhatIs.com, machine learning is “a type of artificial intelligence that
provides computers with the ability to learn without being explicitly
programmed”.[1] Such algorithms can train themselves on the data they are given
and can specialize in the analysis of any field, depending on the type of
information. The study uses three machine learning methods: Naive Bayes
classification, maximum entropy classification, and support vector machines.
Each is a traditional method that has long been studied in other fields.
Another interesting point is that, contrary to general expectations, the paper
shows human intuition to be considerably less accurate than these standard
machine learning techniques at analyzing sentiment-bearing elements of a text,
such as positive or negative words. Figure 4 of the paper suggests that the
algorithms' accuracy exceeds that of the humans.
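
As a rough illustration of the first of those three methods, the sketch below
implements a small Naive Bayes text classifier with add-one (Laplace)
smoothing. It is a generic textbook version, not the authors' implementation,
and the four training sentences are made-up examples, not data from the movie
review corpus used in the paper. The class and variable names are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, documents):
        """documents is an iterable of (text, label) pairs."""
        for text, label in documents:
            tokens = text.lower().split()
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the label with the highest log-probability score."""
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: share of training documents carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training sentences, for illustration only.
TOY_REVIEWS = [
    ("a wonderful and moving film", "positive"),
    ("great acting and a brilliant script", "positive"),
    ("a dull and boring film", "negative"),
    ("terrible acting and a weak script", "negative"),
]

model = NaiveBayes()
model.train(TOY_REVIEWS)
print(model.predict("a brilliant and moving script"))
```

With realistic data the same structure applies, but the training set would
need to be far larger for the word counts to be meaningful.
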
http://www.ijcaonline.org/archives/volume150/number6/alhojely-2016-ijca-911545.pdf