Sunday, August 05, 2007

Preparing the General Inquirer negative dictionary for Yoshikoder

I've been spending my free moments this weekend creating a Yoshikoder-friendly version of the General Inquirer (GI) negative dictionary, which is used for computer content analysis of political texts. The work entails adding wildcards, which Yoshikoder recognizes, so that the dictionary becomes far more sensitive to variations of common negative terms. The creators of the GI dictionary included some variants -- for instance, "exasperate" and "exasperation" -- but missed other obvious ones, such as "exasperates" and "exasperating". Using "exasper*" will catch all of these terms.
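To illustrate the idea, here is a minimal Python sketch of how a trailing wildcard widens coverage. It is not Yoshikoder's own matching code: it assumes that "*" behaves like a shell-style glob (any run of characters, including none), and the token list is made up for the example.

```python
from fnmatch import fnmatchcase

# The two variants the original GI list already contains (from the post).
original_entries = ["exasperate", "exasperation"]

# Hypothetical tokens from a text; not taken from any real corpus.
tokens = ["exasperate", "exasperates", "exasperating", "exasperation", "excellent"]

# Assumption: "*" matches any run of characters, as in shell globbing.
pattern = "exasper*"

# Tokens the exact-match list would count.
caught_by_list = [t for t in tokens if t in original_entries]

# Tokens the wildcard entry would count.
caught_by_wildcard = [t for t in tokens if fnmatchcase(t, pattern)]

# Variants that only the wildcard picks up.
newly_caught = [t for t in caught_by_wildcard if t not in original_entries]

print(newly_caught)
```

Under that assumption, the wildcard picks up "exasperates" and "exasperating" in addition to the two listed forms, while leaving "excellent" alone.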

Of course, wildcards don't work for every word. For instance, "envy" and "envious" could be replaced by "env*", which would catch variants such as "envies" but would also catch unrelated words with neutral or even positive meanings -- "envelope", "envision", etc. In this case, I simply added "envies" to the list rather than using a wildcard.
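A quick way to spot this kind of over-matching is to test a candidate wildcard against a word list before adding it. The sketch below is only an illustration: the `audit_wildcard` helper and the small vocabulary are hypothetical, and it again assumes glob-style "*" matching rather than Yoshikoder's exact rules.

```python
from fnmatch import fnmatchcase

def audit_wildcard(pattern, vocabulary, intended):
    """Split the words a wildcard matches into wanted and unwanted hits.

    Hypothetical helper; assumes "*" matches any run of characters.
    """
    matched = [w for w in vocabulary if fnmatchcase(w.lower(), pattern)]
    wanted = [w for w in matched if w in intended]
    unwanted = [w for w in matched if w not in intended]
    return wanted, unwanted

# Made-up vocabulary standing in for the words found in a set of articles.
vocabulary = ["envy", "envious", "envies", "envelope", "envision", "enjoy"]

# The negative terms the entry is meant to cover.
intended = {"envy", "envious", "envies"}

wanted, unwanted = audit_wildcard("env*", vocabulary, intended)
print("wanted:", wanted)
print("unwanted:", unwanted)

# If anything lands in `unwanted`, list the variants explicitly instead.
use_wildcard = not unwanted
```

Here "envelope" and "envision" show up as unwanted matches, which is the signal to list "envy", "envious", and "envies" individually instead of using "env*".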

Converting the General Inquirer dictionary is no easy task. There are 2,000 words in the original dictionary that my thesis director gave me (although I see another version contains 2,291 words). Each one requires manual review to ensure that wildcards are used effectively and don't introduce unwanted terms into the content analysis that I am planning -- a review of press coverage of Second Life over the past 18 months. Although the GI dictionaries were originally created to examine political texts, I believe they can be used to evaluate other types of text content as well. The GI negative dictionary lacks some of the terms that one typically sees in American or British media articles about new technologies, but it does provide a very solid baseline list of negative terms that one might see anywhere.

To see how I used Yoshikoder for my thesis research, check out the following posts:


Thesis update: One small step completed, but still a long way to go

Thesis update: Revising proposal, going granular with Yoshikoder

Thesis update: A eureka moment

Thesis update: Chapter 3 (draft) completed
