Wednesday, June 01, 2005

Scharrer's content analysis using LexisNexis

Just skimmed over a 2002 journal article in Journalism Studies entitled "An 'Improbable Leap': a content analysis of newspaper coverage of Hillary Clinton’s transition from first lady to Senate candidate." (Journalism Studies, Volume 3, Number 3, 2002, pp. 393–406)

The author, Erica Scharrer of the University of Massachusetts (Amherst), uses LexisNexis in a different way than I have. My earlier study on references to overseas Chinese in Kampuchea and Vietnam was based on frequency counts in a single source (Xinhua) and analysis of the results. Scharrer's study, by contrast, uses multiple sources, a sampling technique, and a comparative study that includes Rudy Giuliani, and also has human coders rate the stories according to a set of special criteria:

To construct the sample, the search term
“Hillary (Rodham) Clinton” was entered for the
time period of 1 October 1999 to 6 February
2000 in the Lexis Nexis database. From the list
of sources displayed, every fourth story was
selected to reach the ultimate sample size of 342
stories on Clinton.

The dates were chosen to
encompass a four-month period in which
speculation about the race mounted and then
certainty was reached. The time period includes
24 November 1999, on which Clinton said
unofficially that she would run, and ends on 6
February 2000, the date of her official announcement.
To gather stories about Giuliani, a smaller
sample over a shorter time period was chosen,
since Clinton is the central focus of the study.
Using the search term “Rudolph OR Rudy
AND Giuliani” in the Lexis Nexis database
from 1 November 1999 to 1 January 2000, every
tenth story was selected, resulting in a sample
of 96 stories. When comparisons between the
candidates were made, the dates of the Clinton
stories were narrowed to include only the same
time period that was examined for Giuliani.

All US-based newspapers archived in the
Lexis Nexis database were included in the
analysis, allowing for diversity of newspaper
size and region. Newspapers were chosen
due to their role in informing the public about
politics (Comstock and Scharrer, 1999) as well
as for the potential for elite newspapers to set
the agenda for other publications (Shoemaker
and Reese, 1996). News articles and editorials
were both analyzed. Two female, trained
coders who were unaware of the hypotheses
coded 40 percent of the sample (20 percent
each), and the author coded the remaining 60
percent. Intercoder agreement using Holsti’s
formula averaged 0.88 and ranged from 0.83 to

Defining and Measuring Variables

Coders noted whether the activity or angle
covered in the story was politically active or
not. Stories about issue positions, poll results,
campaign visits, and policy discussions were
coded as politically active. Stories about such
traditional first lady roles as escort, entertainer,
home decorator, fashion plate, and charitable
works advocate were coded as non-politically
active. For example, stories in which Clinton
visited a hospital were coded as politically active
if it was under the heading of campaigning
but non-politically active if it was within the
charitable role common to a first lady. Finally,
“mixed” labels were given to stories in which
politically active and non-politically active roles
were given approximately equal weight.

Coders determined the degree to which the
story indicated that Clinton would, indeed, run
for Senate, on a scale of 1 (definitely will not
run) to 4 (definitely will run) and assessed the
tone of the story on a scale of 1 (very negative)
to 5 (very positive).


She then creates an index for the two politicians based on these variables, and proceeds to analyze the results. It's an interesting study that uses the LexisNexis tool in a different way.
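
The "every fourth story" and "every tenth story" selection described in the excerpt is a form of systematic sampling. Here is a minimal sketch of the idea in Python. It is my own illustration, not anything from the article: the function name, the `start` parameter, and the stand-in result list are all assumptions, and the article does not say which story the counting began with.

```python
def systematic_sample(items, step, start=0):
    """Return every `step`-th item, beginning at index `start` (0-based)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return items[start::step]


# Hypothetical stand-in for a LexisNexis result list; the real list of hits
# is not given in the article.
results = [f"story-{i}" for i in range(1, 21)]

# Assumption: counting starts so that the 4th, 8th, 12th, ... stories are kept.
sample = systematic_sample(results, step=4, start=3)
print(sample)
```

The sample size falls out of the number of hits and the step, which is presumably why the Giuliani search, sampled at every tenth story over a shorter period, ends up much smaller than the Clinton one.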

Of course, the use of human coding is necessary for rating the coverage as negative/positive/politically active/etc. Computers cannot accurately discern these nuances (although some computer programs incorporate dictionaries to discern more general degrees of "meaning" based on the position of certain words within text, syntax, and frequency counts).

On the other hand, using humans to code these stories also introduces the possibility of error and bias.
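
The excerpt's check on this is intercoder agreement using Holsti's formula, which is usually written as 2M / (N1 + N2), where M is the number of coding decisions the two coders agree on and N1 and N2 are the number of decisions each coder made. A rough sketch, assuming both coders rated the same set of stories; the ratings below are invented for illustration and are not Scharrer's data:

```python
def holsti_agreement(codes_a, codes_b):
    """Holsti's coefficient of reliability: 2M / (N1 + N2)."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must rate the same stories")
    if not codes_a:
        raise ValueError("need at least one coded story")
    # M: number of stories on which the two coders assigned the same code.
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return 2 * agreements / (len(codes_a) + len(codes_b))


# Made-up tone ratings (1 = very negative ... 5 = very positive) for 8 stories.
coder_1 = [3, 4, 2, 5, 3, 1, 4, 3]
coder_2 = [3, 4, 2, 4, 3, 1, 4, 2]

print(holsti_agreement(coder_1, coder_2))  # 6 agreements out of 8 -> 0.75
```

When both coders rate the same stories, this reduces to simple percent agreement. It does not correct for agreement that would happen by chance, so a high figure does not by itself rule out shared bias among the coders.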
