Guardian sentiment analysis - explanation
Comments on articles appearing in the Guardian newspaper are
parsed in an attempt to judge overall reader sentiment towards
the content. Please see this blog post for a more complete
description.
There are two ways to view sentiment: use this page to score a single article, or this page to get a list of overall scores for a set of articles.
The sentiment value for an article is the average sentiment score of the associated comments. The score for a comment is composed as follows:
- A small amount of cleaning is done on each comment; in particular, any quoted text is removed.
- The sentiment value is calculated by searching for sentiment-indicating words from this list.
- This value is normalised by dividing by the word count, giving a sentiment per word.
- The score is then weighted by the number of recommends the comment received, with the weights rescaled so that the average recommend-weighting is equal to 1.
Note that only head comments are used, not replies, and of these only the first 100.
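The scoring steps and the head-comment restriction described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual code. It makes several assumptions: quoted text is taken to sit inside `<blockquote>` tags, a word's contribution is its valence in the word list, and the dictionary keys `body`, `recommends` and `is_reply` are hypothetical. `AFINN_SAMPLE` is a tiny stand-in for the real list. The fallback to equal weights when no comment has any recommends is also an assumption.

```python
import re
from statistics import mean

# Tiny stand-in for the AFINN word list (word -> valence); the real list is far larger.
AFINN_SAMPLE = {"good": 3, "great": 3, "bad": -3, "terrible": -3}

# Assumption: quoted text appears inside <blockquote> elements.
BLOCKQUOTE = re.compile(r"<blockquote\b.*?</blockquote>", re.DOTALL | re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")
WORD = re.compile(r"[a-z]+(?:'[a-z]+)?")


def clean(body):
    """Remove quoted text, then strip any remaining markup."""
    return TAG.sub(" ", BLOCKQUOTE.sub(" ", body))


def comment_score(body, lexicon=AFINN_SAMPLE):
    """Sum of word valences divided by the word count (sentiment per word)."""
    words = WORD.findall(clean(body).lower())
    if not words:
        return 0.0
    return sum(lexicon.get(w, 0) for w in words) / len(words)


def article_score(comments, lexicon=AFINN_SAMPLE, limit=100):
    """Average recommend-weighted score over the first `limit` head comments.

    `comments` is a list of dicts with the hypothetical keys
    'body', 'recommends' and 'is_reply'.
    """
    heads = [c for c in comments if not c["is_reply"]][:limit]
    if not heads:
        return 0.0
    scores = [comment_score(c["body"], lexicon) for c in heads]
    mean_rec = mean(c["recommends"] for c in heads)
    if mean_rec == 0:
        # Assumption: with no recommends at all, fall back to equal weights.
        weights = [1.0] * len(heads)
    else:
        # Dividing by the mean makes the average weight equal to 1.
        weights = [c["recommends"] / mean_rec for c in heads]
    return mean(w * s for w, s in zip(weights, scores))


example = [
    {"body": "<blockquote>terrible idea</blockquote> This is a good article",
     "recommends": 3, "is_reply": False},
    {"body": "bad", "recommends": 1, "is_reply": False},
    {"body": "great great great", "recommends": 10, "is_reply": True},
]
print(article_score(example))
```

In this example the reply is ignored, the quoted "terrible idea" is dropped, and the two head comments score 0.6 and -3.0 per word. With weights of 1.5 and 0.5, the article score comes out at roughly -0.3.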
Contains information from AFINN, which is made available here under the Open Database License (ODbL).