Tuesday, June 28, 2005

Another reason China should fear the 'Net: A million people with camera phones

There have been a lot of stories and debate about China censoring its media, banning websites, and forcing foreign companies like Microsoft to obey its anti-free speech directives. China of course wants to keep a tight lid on the exchange of information among Chinese people and protect its image, but I think this is impossible in the Internet age.

Moreover, the threat of the Internet is not just about Chinese democracy activists and Falun Dafa supporters exchanging tracts and criticizing the government. I think a far more potent aspect of the Internet lies in its ability to spread damaging news, facts, sound bites, photos, and video to those in China and elsewhere who might otherwise have an ambivalent attitude about the government in Beijing. The spread of the Internet has also coincided with the creation of cheap gadgetry like cell phone cameras, digital audio recorders, and video cameras, and a huge base of people who know how to use them and upload the results to the 'Net for all to see.

An office lady with a cellphone may not seem like a threat to the regime, but what if she gets a text message from a friend saying to check out a crowd in front of a local school, and then she takes a picture of what turns out to be a demonstration, and it spreads all over town by the end of the day, and all over the country by the end of the week?

Consider these recent examples from outside China, and the damage they inflicted upon various people, companies, and countries:

1) "Memogate" involving CBS Evening News, and a supposed 30-year-old memo it claimed cast doubt on President Bush's National Guard service. Bloggers with publishing experience examined the document online, and concluded that the fonts in the memo were probably produced with a modern word processing program -- i.e., the memos were forged.

2) Photos of a Korean schoolgirl who was run over by an American armored vehicle. These were spread via the Internet and provoked anti-American anger in Korea.

3) Embarrassing and potentially damaging emails from various U.S. corporate scandals, some of which have contributed to multi-million dollar fines, corporate shakeups, and new government regulations

4) The Internet-distributed Taiwanese sex tape from a few years ago, which embarrassed local officials from central Taiwan who were involved (anyone remember the details? My mind is foggy on the points)

5) The Abu Ghraib prison pictures, taken with a digital camera, and spread via email and then mass media.

Now consider how similar incidents, captured and spread electronically, could cause problems for local, provincial, and central government officials and entities in China. It could be a cellphone photo of a demonstration or riot, an accidentally forwarded email between a mayor and his mistress, a videotape of an official taking a bribe ... Incidents like these have the potential to really cause problems for the government, because they are so hard to control at the points of collection, distribution, and reception, and ordinary Chinese people -- like ordinary people in many other countries -- are intensely interested in news of scandal, corruption, sex, and official abuse of power.

Imagine sixteen years ago, if five percent of the tens of thousands of people in Tiananmen Square had portable phones, digital cameras, and video cameras, and the content from 10% of those devices had been uploaded and spread via the 'Net? There wouldn't be just one iconic image of the events -- i.e., that guy standing in front of the tank -- there would be dozens, or hundreds. And the government wouldn't have to put out fires in Beijing and a few other big cities -- it'd have a major image problem in practically every city and town where there are people with 'Net connections.

Chinese instructions for IT executives?

I wrote another China-related post for my blog at Computerworld. In it, I responded to a colleague's idea to have American IT executives learn Chinese. While I think Chinese-language education is appropriate for high school and college students, and for American businesspeople in China or planning to go there, I believe most American IT managers who are based in the United States could use the time in more productive educational endeavors. Read my reasoning here.

Saturday, June 25, 2005

Nationalism in Xinhua's Chinese service vs. Xinhua's English service; new thesis idea?

One problem with my idea of measuring nationalism indicators in Xinhua coverage pertains to the purpose of the Xinhua English service, which is different than that of the domestic Chinese language service.

The copy that Xinhua produces for Chinese newspapers and broadcast media has a propaganda mission. The Xinhua foreign language services try to let foreigners understand China, promote China's progress and the struggles it faces, and, when it comes to foreign news, to uphold China’s national independence, territorial integrity, and sovereignty. Therefore, the mix of stories that are aimed at these two audiences should be different. The gatekeepers for the English service -- the editors, translators, and reporters -- may dump a lot of the nationalism-themed stories, because they think these stories will not appeal to the audience they are targeting.

On the other hand, there are two reasons why Xinhua's English service may actually be quite similar to the Chinese service. One relates to the source material for the English service, which largely consists of translations of stories from the Chinese service. Still, it's not a one-to-one translation of Chinese material -- the English service apparently tries to match the "inverted pyramid" style of newswriting, which not only means Chinese stories are restructured, but also means additional sources are consulted to fill in the details that foreigners might expect to see. This correlates with my own experience working in English-language media in Taiwan -- almost all of the stories that I worked on that were based on translations from Chinese articles had to be beefed up considerably to fill in various details, ranging from getting the full names of people mentioned in the story, to answering the "whys" of a particular event. Why did this happen? Why did he or she say that, or do that? It drove the translators crazy, and often the original reporters, too, who were sometimes summoned from sleep to answer these questions. We also consulted other newspapers, foreign wire service reports, and occasionally the original sources of the news, especially police, government offices, or places of business where a certain event took place.

But I digress. The other reason Xinhua may include nationalism-themed stories is because these stories will help fulfill Xinhua's mission of giving foreigners a better understanding of Chinese popular and official sentiment.

These things should be considered, as I conduct more preliminary tests. I am also thinking about another tack to take for my thesis -- expanding upon the research I did for last semester's class, Chinese Emigration in Modern Times. I describe the research here, and you can read the paper here.

My idea for expanding the thesis is to analyze Xinhua coverage of Vietnam and issues relating to Vietnam/Chinese relations over a longer period of time. My original research only covered 1977, 1978, and 1979, but in the 1980s and 1990s there were several other developments in bilateral relations, and patterns in Xinhua coverage might be dissected to let us better understand China's policies.

Friday, June 24, 2005

China History Forum and other Harvard resources

I just happened upon a really interesting forum that I can't read now, because I am at work: The China History Forum.

I also found a few Harvard resources which I didn't know existed: Harvard China Review, a magazine devoted to developments in China, and the Harvard China Forum, which apparently started the magazine but no longer exists (?). Unfortunately, the website of the former hasn't been updated in some time, but I will check it out later.

TAs who can't communicate

The New York Times has an article today, about some foreign graduate teaching assistants who have difficulty communicating in English. I would say this is a legitimate beef in certain cases -- a student should be able to understand what his or her TA is saying in class discussions, or in feedback regarding exams, papers, and other projects. This is apparently a big problem in certain disciplines where there is a high proportion of foreign graduate students and PhD candidates, such as engineering and science.

The article does not mention another side of this issue ... if the students can't communicate with their TAs, what about the professors? Does this pose a problem for them in terms of giving instructions to TAs, or discussing a certain student's paper? What about evaluating the TAs' own written work, or the oral defense of a dissertation?

Thursday, June 23, 2005

Book on Chinese nationalism

In my lunchtime search for electronic archives of Chinese media sources, I came upon a book by Peter Hays Gries entitled China's New Nationalism: Pride, Politics, and Diplomacy (you can read it here if you have a Harvard ID and PIN).

I haven't read the whole thing, but he makes a point about U.S. Chinese studies academics which jibes with my earlier post on the character of seminars on China organized by Harvard academic departments. Gries writes:

Academic China watchers also tend to present a rosy picture of China, rarely speaking out on controversial issues such as human rights. Scholars like Andrew Nathan and Perry Link are the exceptions that prove the rule. Because they have spoken out against Chinese human rights violations, Chinese nationalists and government officials have subjected them to vicious personal attacks, and they have been denied visas to China.

Gries, Peter Hays (Author). China's New Nationalism: Pride, Politics, and Diplomacy.
Ewing, NJ, USA: University of California Press, 2004. p 2.
Copyright ©2004. University of California Press. All rights reserved.


I am hoping the book talks about the Chinese press, as well. When I get home, I will start writing a precis on Gries' book.

Factiva: Limited Xinhua Chinese content

I just logged onto Factiva through Harvard Libraries, to see how far back their Chinese language press sources go. Unfortunately, not far ... The Xinhua Chinese-language services (traditional and simplified characters) only go back to 2002, compared to 1989 for the Xinhua English wire service. The problem is probably related to technology ... I imagine that for Factiva to be able to store and display non-Western language character sets, a significant upgrade on the servers or database software was required.

In any case, 2002 is too recent for my purposes for a comparison study, although at least in the case of Xinhua's English service, I can turn to LexisNexis, which goes back to 1977.

I'll have to see if Harvard Libraries have access to any other searchable Chinese-language press databases from the 1980s and 1990s.

Content analysis of pro-nationalist, pro-state propaganda in Xinhua?

I was reading journal articles about Chinese journalism last night. One of the main trends that communication scholars like to study is how Chinese media handled its propaganda mandate while expanding commercial activities in the 1980s and 1990s. There was an interesting case study involving the Beijing Youth News, and how it turned into a competitive mass-market publication, while serving as a propaganda model for other Chinese media following the Tiananmen crackdown in 1989. One point the authors of this study brought up was the paper itself had its own yearly campaigns starting in 1990 to boost nationalist sentiment and support for the government -- "Socialism is good," and "China's modern history," are two examples. These campaigns were started before the central government requested more emphasis on these issues -- i.e., journalists acted on their own to create propaganda, rather than being told to do so by the government, although later directives reaffirmed the paper's decision to carry out these campaigns.

But this made me think ... if the central government also issued directives to the media for more emphasis on nationalist and pro-socialist feeling, to counter the "Western" influences that capitalism brought to the country, that should be reflected in Xinhua domestic (i.e., Chinese) copy for sure, and perhaps even for the English service.

Here's an idea for a content analysis, using LexisNexis searches on Xinhua: Take every month from 1985 to 1995, and perform very structured searches for certain indicators that would reflect pro-socialist, pro-state, or pro-nationalist support. The analysis would test all articles for these indicators in the lead paragraphs (which I believe would indicate direct propaganda) as well as in other parts of the article (which I hypothesize would indicate an attempt to infuse pro-socialist or nationalist thought into other types of stories). I actually had a similar idea before, but didn't really think it through, after other topics attracted my attention.

The tricky part would be identifying the indicators. Some obvious candidates include "socialism" and "motherland," as well as campaign slogans and mentions of imperialist actions that form the basis for modern Chinese nationalism. The other tricky issue would be the searches -- they would need to be focused and control for other types of news items that may contain these indicators but have nothing to do with China or propaganda.
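Here is a minimal sketch of how such indicator counts might be tallied, assuming the article text were somehow available outside the LexisNexis interface (it offers no such export that I know of). The article records, the INDICATORS list, and the count_indicator_hits function are all hypothetical, and the sample stories are made up purely for illustration.

```python
from collections import Counter

# Hypothetical indicator terms; a real list would need careful vetting.
INDICATORS = ["socialism", "motherland"]


def count_indicator_hits(articles, indicators=INDICATORS):
    """Tally, per month, articles with an indicator in the lead paragraph
    versus articles with an indicator only in the body."""
    lead_hits = Counter()
    body_hits = Counter()
    for art in articles:
        month = art["date"][:7]  # "YYYY-MM"
        lead = art["lead"].lower()
        body = art["body"].lower()
        if any(term in lead for term in indicators):
            lead_hits[month] += 1
        elif any(term in body for term in indicators):
            body_hits[month] += 1
    return lead_hits, body_hits


# Made-up sample records, not real Xinhua copy.
SAMPLE = [
    {"date": "1989-07-03", "lead": "Students pledge loyalty to the motherland.",
     "body": "Details follow."},
    {"date": "1989-07-15", "lead": "Grain harvest up in Hunan.",
     "body": "Officials credited socialism for the gains."},
    {"date": "1989-08-02", "lead": "Steel output rises.",
     "body": "No slogans here."},
]

lead_counts, body_counts = count_indicator_hits(SAMPLE)
```

Plain substring matching like this would also catch unrelated uses of a term, which is exactly the control problem described above.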

My tentative hypothesis: Xinhua coverage would change dramatically, following the 1989 incident, the start of official propaganda campaigns, and perhaps following the promotion or demotion of certain leaders and Xinhua managers. The data from the content analysis would allow me to measure these trends, and allow for additional analysis.

What would be really interesting, though, would be to incorporate an analysis on Xinhua's Chinese service, and compare it with what the English service says.

Some preliminary testing is in order ... I'll give it a shot this weekend, if it's not too hot in my study.

Tuesday, June 21, 2005

Using precis, part I

As I mentioned in my last post, I am going back to the literature in an attempt to focus my topic ideas.

For me, going back to the literature does not mean re-reading the books themselves. Rather, it means looking at various precis I wrote last year after reading the books the first time.

If you are a graduate student and don't know what precis are, it's time you learned and started creating them for all the books you read. In a nutshell, a precis is a book summary that can help you remember the author's intention, thesis, main points, ideological slant, and details related to various chapters. They are ten times better than ordinary notes or margin scribblings, and are an incredible resource for papers and follow-on research.

It's easy to get started on precis -- all you need to do is read a book and write a paragraph or two for the following items:

Precis Title:

Read (Date):

Author:

Author’s intentions:

Thesis:

Type of history:

Structure of argument:

Evidence used:

Ideological orientation:

Strengths of book:

Weaknesses:

Contributions to the field:

Outline:


For the last item, "Outline", I usually write an entry for each interesting point or quote throughout the book, and note the page number. This may cover up to ten pages, for a long or particularly important book. However, standard short-form precis usually mention four or five main concepts illustrated in the book, which should take up a page or two.

Professor Sally Hadden, a Harvard PhD and specialist on the history of the South and the American Revolution, taught me how to prepare a precis. She teaches at the Summer School and I highly recommend her courses. While the reading and writing workload is tough, she sets high standards for her students and you will learn a lot about early American history.

I'll write more about Prof. Hadden and how to prepare a precis later on. I'll also provide some samples. Stay tuned ...

Sunday, June 19, 2005

Thesis blues

In the middle of the week, I had a feeling of helplessness -- it has been about a month since class ended, and my pledge to myself to get my thesis proposal completed this summer is stuck in a rut.

The main problem is my inability to focus my thesis topic. I know I am interested in developing methodologies to pick apart Chinese media and analyze Chinese policy, but I cannot focus the topic more than that. I have considered nationalism, territorial claims, Taiwan, and other policy issues, but for various reasons I am having trouble shaping the topic and methodologies into a coherent thesis idea.

I began to ask myself, am I on the wrong track? Am I too wedded to my content analysis methodologies, and the idea of using Chinese media sources to study Chinese policy?

I also asked, is Chinese history my calling?

When you ask yourself these kinds of questions, and follow the logic chains to other possibilities, one alternative that presents itself is the idea of giving up. Considering this option is depressing, yet so seductive -- giving up ends the mental anguish of the thesis, frees up huge chunks of time, and allows you to concentrate on other pastimes, family, and career.

Another possibility is giving up temporarily, and hoping a better thesis idea presents itself next year, or the year afterwards -- after all, with the five-year ALM deadline, I have until June 2009 to finish my coursework and thesis.

Both of these ideas crossed my mind, but then I gave myself a virtual slap in the face. I don't want to give up, and delaying would be fatal to my thesis -- I am doing well in my career and expect to have more work responsibilities two years from now. Also, I have two little kids now, which is really tough, but in two or three years these little tykes will need my attention and time in different ways -- help with homework, driving them around to various activities, etc. If I wait, a new topic may present itself, but by then I may not have the time to conduct the necessary research and writing to complete a thesis.

But even more importantly, I confirmed to myself that Chinese history, Chinese media, and a computer-assisted content analysis really are what I want to do. Thanks to my background and previous studies, I have a unique set of skills, interests, and understanding that are a natural fit for this type of study. Furthermore, I believe this type of study would really contribute to the study of modern Chinese history. This is my calling.

With these thoughts in mind, I have decided that returning to the literature is my best course of action. I will reread two books on Chinese nationalism that I haven't cracked open since last year, and also check out some journal articles on Chinese media and computer-assisted content analysis of media sources. Besides refreshing my memory on several issues relating to modern Chinese history and media issues, it may also give me new insight on methodology and research methods, and of course lead me to a solid research topic and thesis proposal.

Wednesday, June 15, 2005

Blogging, Microsoft and China

In my Harvard blog, I haven't really described what I do for work, but I will now.

I am the online projects editor for Computerworld, a large trade magazine relating to the computer industry. I develop and manage several areas of the Computerworld website, including blogs and webcasts, and sometimes get to write something for the site.

Today I had an opportunity to post an item on the Computerworld website about blogs and doing business in China, which you may find interesting. Check out the post.

Monday, June 13, 2005

Using LexisNexis Academic: Pros and Cons

A crucial part of my planned content analysis is the tool that I will use to gather data from the Xinhua News Agency. LexisNexis Academic holds the electronic archives of hundreds, maybe even thousands, of text-based news sources such as magazines, newspapers, and wire service reports, stretching back to the 1970s. There must be several million individual articles on file, which makes it an incredible resource for students of history, foreign policy, anthropology, and other disciplines. It was apparently started as a resource for lawyers to search for evidence and old case material held in text files.

However, the Web-based interface with this giant database is limited. It works like a search engine, with a few twists. If you want, you can search all of the news sources in the database, or you can use drop-down menus and other fields to restrict the search to a certain region, or a certain publication. You can also specify a date range to search. There are three fields to specify the terms you want to search for, as well as "operators" to add or eliminate terms from the results.

For instance, in my study of Xinhua references to Vietnam and Overseas Chinese, I set up a series of searches on one-month periods starting in January 1977 and ending in December 1979. I restricted the results to the Xinhua News Agency, so I wouldn't get "hits" from the New York Times, Associated Press, the Hong Kong Standard, etc. In the first field I typed "viet" (lowercase "v", as the results are not case-sensitive) as opposed to "Vietnam" because Xinhua's style guide spells the country "Viet Nam." I also made sure that "viet" wouldn't return results for articles that contained "Soviet", which was a common term in Xinhua articles at the time.
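The "viet" versus "Soviet" problem is easy to illustrate outside of LexisNexis. The sketch below is not LexisNexis syntax; it is a Python regular expression showing the same idea on text you already have, and the mentions_viet function name is my own invention.

```python
import re

# \b marks a word boundary, so "viet" must begin a word.
# That admits "Viet Nam", "Vietnam", and "Vietnamese" but not "Soviet".
VIET = re.compile(r"\bviet", re.IGNORECASE)


def mentions_viet(text):
    """Return True if the text contains a word starting with 'viet'."""
    return VIET.search(text) is not None
```

Note that a story mentioning both countries (say, "the Soviet-Vietnamese treaty") still counts as a hit, which is the behavior I would want.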

I then selected the operator "and" and in the second field, I typed "overseas Chinese." The way the LexisNexis engine works, the words typed in a single field will be interpreted as a single phrase, not separate words -- i.e., only stories that said "overseas Chinese" would be returned. Also, I could use the operator "not" to find all stories about Vietnam that do not mention overseas Chinese.

One other cool feature of LexisNexis is the ability to restrict each of the three search terms to a certain part of the story -- the headline, the lead paragraph, or the full body of the story. Anyone who regularly reads newspapers can appreciate the value of restricting a search to a word that appears in a headline versus anywhere in the body of an article. How many New York Times stories feature "corn" as a focus, versus a mere background detail? Restricting a search to a headline term would filter out those stories which only included "corn" as a background detail, as background details never appear in a headline.

While this may sound like a great tool, it's far from perfect. My research centers on frequency counts, and LexisNexis is only of limited help in this respect. It can tell me how many stories with a certain term appear in a certain time period, but only if the number is less than 1,000 -- otherwise it returns an error message, forcing you to refine the search criteria, which usually means reducing the time period under study from, say, one month to one week, in order to stay below the 1,000 item limit. It also won't tell you the total number of stories printed by a given news source in a certain time period, which means you have to trick the engine into telling you this detail -- I used "item no" as a search term, as each article in the Xinhua archive starts with "Item No.:" in the slug. This allowed me to determine the total number of stories in a certain period, and, after manually pasting the data into an Excel spreadsheet, calculate the frequency of stories mentioning Vietnam and overseas Chinese (as well as other terms). But this was more labor-intensive and potentially more error-prone than it would have been if LexisNexis had built this capability into the existing tool.
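The manual routine of shrinking the date range and then dividing one count by another can be sketched in a few lines. This assumes a hypothetical count_hits(start, end) function standing in for one search and its reported number of results (returning None when the engine refuses); nothing like it exists in the Web interface, so treat this as a description of the bookkeeping rather than a working scraper.

```python
from datetime import date, timedelta

HIT_LIMIT = 1000  # counts are only reported when below this number


def count_in_range(count_hits, start, end, limit=HIT_LIMIT):
    """Count stories from start to end (inclusive), halving the date
    range whenever a single search would reach the limit."""
    n = count_hits(start, end)
    if n is not None and n < limit:
        return n
    if start == end:
        raise ValueError(f"{start} alone reaches the limit")
    mid = start + (end - start) // 2
    return (count_in_range(count_hits, start, mid, limit)
            + count_in_range(count_hits, mid + timedelta(days=1), end, limit))


def frequency(term_count, total_count):
    """Share of all stories in a period that mention the term."""
    return term_count / total_count if total_count else 0.0


# A fake engine for demonstration: pretends there are 40 stories a day.
def fake_engine(start, end):
    n = 40 * ((end - start).days + 1)
    return n if n < HIT_LIMIT else None


total = count_in_range(fake_engine, date(1978, 6, 1), date(1978, 6, 30))
```

The same count_in_range call would be run once for the "item no" trick (the total) and once for each search term, with frequency() doing the division that I did in Excel.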

Additionally, LexisNexis does not allow you to determine which terms appear most often in a body of news articles over a certain period. Wouldn't it be great if I could call up all Xinhua stories that mention Vietnam in 1978, and then find out the five most frequently mentioned words within those results that are located in the lead paragraph and are longer than five letters, and their relative frequencies?
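If the lead paragraphs could be pulled out as plain text, that wish list is only a few lines of Python. The top_lead_words function and its parameters are hypothetical names of my own; "longer than five letters" is interpreted here as six letters or more.

```python
import re
from collections import Counter


def top_lead_words(lead_paragraphs, n=5, min_len=6):
    """Return the n most common words of at least min_len letters,
    as (word, count, share of all qualifying words) tuples."""
    words = []
    for lead in lead_paragraphs:
        words += [w for w in re.findall(r"[a-z]+", lead.lower())
                  if len(w) >= min_len]
    counts = Counter(words)
    total = sum(counts.values())
    return [(w, c, c / total) for w, c in counts.most_common(n)]
```

Calling top_lead_words(leads) on a list of lead paragraphs would give the five words and their relative frequencies in one pass.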

Such a search would be possible, if users were allowed to perform full SQL queries on the database. SQL stands for Structured Query Language, and is the language most relational databases understand ... in fact, it is quite possibly the language that the LexisNexis engine uses behind the scenes to return results using the Web browser interface. However, the browser interface is limited to the methods that correspond to what I have described above ... more advanced queries, such as automatically calculating frequency counts, ordering results by size, or performing statistical methods on the results, are not possible.
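To give a feel for what a full query could do, here is a small sketch using Python's built-in sqlite3 module. The articles table, its columns, and the four sample rows are entirely hypothetical; I have no idea how LexisNexis actually stores its archive.

```python
import sqlite3

# In-memory database with a made-up schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (source TEXT, pub_date TEXT, lead TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [
        ("Xinhua", "1978-05-02", "Viet Nam expels Chinese residents"),
        ("Xinhua", "1978-05-20", "Spring planting under way"),
        ("Xinhua", "1978-06-11", "Viet Nam authorities criticized"),
        ("Xinhua", "1978-06-12", "Refugees from Viet Nam arrive"),
    ],
)

# Per month: total stories, and stories whose lead mentions Viet Nam,
# ordered so the months with the most hits come first.
QUERY = """
SELECT substr(pub_date, 1, 7) AS month,
       COUNT(*) AS total,
       SUM(lead LIKE '%Viet Nam%') AS hits
FROM articles
WHERE source = 'Xinhua'
GROUP BY month
ORDER BY hits DESC
"""

rows = conn.execute(QUERY).fetchall()
```

One query returns both the total and the hit count for every month, which is the step I had to assemble by hand from separate searches.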

But what can I do? I can wait for LexisNexis to improve its browser-based search tool, or I could look elsewhere -- Factiva offers a similar service, but its Xinhua archive only goes back to 1989. Or I can attempt more tricks or work-arounds to get the results that will help me with my research.

In any case, I should be thankful that I have access to the existing tool, via the Harvard Libraries agreement with LexisNexis, which allows students to access the engine via a dial-up Web connection. Ten years ago, such a tool was probably not available, unless you were at an on-campus connection.

Sunday, June 12, 2005

Considering thesis topics and methodologies

I have been thinking, a lot, about my thesis proposal. At home and in the car, and sometimes at work.

I know that I want to do a content analysis of Chinese media. I find it intriguing that Chinese media can be used to examine Chinese government policies. But I do not know what specific topic I want to study. I have considered China's views toward Taiwan, China's territorial claims in the South China Sea, the Chinese media itself, or potentially something that combines all of these topics. There are problems with making the study too broad, and also with studying a topic that is not really covered by the Chinese media source (the Xinhua News Agency) that will form the basis of my content analysis.

Thinking through these issues has allowed me to consider two key aspects of my ALM research: the topic itself, and the methodology.

I've already talked a bit about the topic question. Other than the topics listed above, the field is really wide open -- the Xinhua English service has covered thousands of news items relating to foreign and domestic policy since the archives were digitized beginning on January 1, 1977. For the class I took last semester, Prof. Kuhn's seminar on Chinese emigration in modern times, I was able to research China's policies regarding overseas Chinese in Vietnam and Cambodia in the late 1970s, using Xinhua news coverage as a barometer of Beijing's policies. There is really a lot of other fodder in the Xinhua archives that could be subjected to similar research. Another possibility: Making the topic Xinhua itself -- for instance, examining how the news agency covered major subjects during the 1980s, or 1990s.

There's also the methodology issue. To recap, I used the following methodology in my research paper for Prof. Kuhn's class:

1) Used simple frequency counts to draw out patterns in the data
2) Used a traditional historical approach to explain the patterns as well as analyze qualitatively the news articles themselves.

Prof. Kuhn suggests that rather than using my hybrid approach, I could have taken the quantitative approach a step further, or several steps further:

Your reasoning is plausible in view of the evidence you cite from non-numerical sources. I wonder, though, if there could be a way to use a finer-meshed array of frequency counts to support one or more of these hypotheses? Having gone this far, it should be possible to construct plausible strings or phrases to check out -- to support the Indonesian angle, for instance, or the Soviet angle. .... In short, I think you have made an excellent start through quantitative methods. Can it be refined past the hypotheses stage, or will it always require verification from the non-quantitative side?


Hmmm. This got me thinking ... could I have applied a finer mesh of searches to support my hypotheses? To tease out the Soviet angle, certainly: I could have taken the results of searches I performed on "overseas Chinese", "Vietnam" and "Kampuchea" and seen how many of the results also mentioned the USSR.
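That second pass amounts to filtering one set of results by another set of terms. A rough sketch, assuming the text of each result is available as a string; the share_also_mentioning function is a hypothetical helper, not anything LexisNexis provides.

```python
def share_also_mentioning(results, extra_terms):
    """Return (matching, total, share) for results that contain
    at least one of the extra terms, ignoring case."""
    terms = [t.lower() for t in extra_terms]
    matching = sum(
        1 for text in results
        if any(t in text.lower() for t in terms)
    )
    total = len(results)
    return matching, total, (matching / total if total else 0.0)
```

For the Soviet angle, the call would look something like share_also_mentioning(vietnam_results, ["Soviet", "USSR"]).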

But to tease out the editorial angle of these stories, I am not sure how much simple searches would have helped ... that is, a search on the Xinhua archive for Vietnam and the Soviet Union might turn up lots of results, but I think only a human review could determine if those stories were hostile to Vietnam and the USSR vs. ambivalent and even friendly. It is incorrect to assume that Xinhua would naturally be hostile toward the USSR, as Chinese policy toward the USSR at one time was very friendly, and conceivably there were alternating periods of friendliness/hostility after 1977, depending on personal or policy issues between the two countries.

Additionally, there are degrees of friendliness and hostility that frequency counts wouldn't be able to tease out, except under the most contrived scenarios.

After thinking about these issues in the car, and at home, I have come to the conclusion that my methodologies might take two routes:

1) A content analysis consisting of a frequency count followed by manual coding for meaning, like Scharrer's content analysis of news coverage of Hillary Rodham Clinton

2) A two-stage frequency count content analysis ... one to determine basic trends in Xinhua coverage, followed by analysis of the results and the formation of more hypotheses, which could then be tested by another round of frequency counts.

More on this soon ...

Saturday, June 11, 2005

Work and studies, and time-saving techniques ...

For the past week I've been busy with work ... I am covering for someone who's on vacation, which means I have two hours of additional work to do at home each day on the computer. This takes away from time that I could otherwise use for study.

But I still have time to think about my thesis proposal in the car. That's the way it is if you are an Extension School student with job and family responsibilities, or a really heavy class schedule ... you just have to make time to study in odd places and at odd times, or sacrifice times when you would otherwise relax, or be unproductive. I can't remember the last time I have read a novel -- practically any time I am opening a book, it's for class or for the ALM. And commuting is usually an unproductive section of people's daily schedule, but I have managed to use it for my studies.

I remember when I was working in Cambridge, I used to do tons of reading for class and for papers on the 71 and 70 busses. Now that I work in Framingham, I can't read during my half-hour car ride every weekday morning and afternoon, but at least I can think, as long as I turn off the radio.

Monday, June 06, 2005

Chinese emigration paper returned with Prof. Kuhn's comments

I just received my graded paper from Prof. Kuhn's Modern Chinese Emigration class. I have linked to it here. Besides giving me a very good grade, he also brought up a couple of interesting comments:

Recent study shows that Xinhua during the pre-1949 period was both news service and intelligence agency -- e.g., combing foreign broadcasts and publications for information needed by the CCP leadership.


This jibes with some of my secondary source research, which noted that as of the late 1980s, Xinhua published Cankao Xiaoxi (Reference News), which translates important articles from foreign publications, without embellishment. This is a must-read publication for CCP officials, and, despite being restricted to party and government officials, had a circulation of 8 million in the late 1980s (Mass Media in China: The History and the Future. Ames: Iowa State University Press, 1989, p. 69).

I have also read in the Hong Kong press that the Hong Kong branch of the Xinhua News Agency was in fact an intelligence-gathering facility. I did not include references to this in my paper, as it was tangential to its focus.

Professor Kuhn also notes the following about my paper:

I am impressed by the power of statistical approach to generate hypotheses. In each policy turn, you cite reasons drawn from non-quantitative sources for the particular configuration of variables. Your reasoning is plausible in view of the evidence you cite from non-numerical sources. I wonder, though, if there could be a way to use a finer-meshed array of frequency counts to support one or more of these hypotheses?


This is something I have been thinking about for my ALM thesis -- can I or should I take a hybrid numerical/non-numerical approach to my content analysis, as I did in my study of references to Vietnam, Kampuchea, and overseas Chinese, or try a more strictly quantitative approach, as Prof. Kuhn suggests?

To recap, I did the following in my research paper for Prof. Kuhn's class:

1) Used simple frequency counts to draw out patterns in the data
2) Used a traditional historical approach to explain the patterns as well as analyze qualitatively the news articles themselves.
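
Step 1 above can be sketched roughly as follows. This is a minimal Python illustration, not the procedure actually used for the paper: the sample records, the `count_by_month` helper, and the simple substring match are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical records of (month, headline); these are NOT real Xinhua items.
articles = [
    ("1978-06", "Statement on overseas Chinese in Vietnam"),
    ("1978-06", "Vietnam border incident"),
    ("1978-07", "Grain harvest report"),
    ("1979-03", "Foreign press criticizes Vietnam over refugees"),
]

def count_by_month(records, term):
    """Count how many records in each month mention the term (case-insensitive)."""
    counts = Counter()
    needle = term.lower()
    for month, text in records:
        if needle in text.lower():
            counts[month] += 1
    return counts

vietnam_counts = count_by_month(articles, "Vietnam")
for month in sorted(vietnam_counts):
    print(month, vietnam_counts[month])
```

Counts like these only show when a term appears more or less often. Explaining why still takes the qualitative reading described in step 2.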

In retrospect, I don't think I could have done this study in a wholly quantitative manner. There were shifts in Xinhua coverage of the three variables that could not readily be drawn out by automated means, except by applying contrived filters in order to force a quantitative data set that supported my hypotheses.

One example that springs to mind is the shift in Xinhua coverage of the Vietnam refugee crisis from the summer of 1978 to the spring of 1979. In the summer of 1978, Xinhua's criticism of Vietnam on this issue was very direct, citing the words and actions of Chinese officials. However, in the spring of 1979, the criticism was very indirect -- Xinhua cited other countries' criticism of Vietnam. There is no clear linguistic indicator that I could have used to discern this shift using quantitative methods.

But there may be for another content analysis on a different set of variables within the Xinhua archive. More on this in the coming days ....

Friday, June 03, 2005

Allston expansion: is the Charles the problem?

More news from Harvard regarding the Allston expansion. I have only read Marcella Bombardieri's Globe summary, but I take issue with one statement:

"The Charles River has long been the major physical and psychological impediment to that vision, with students and faculty balking at the thought of a long walk across the river, exposed to traffic and bitter weather, to get from dorm to class or lab to a meeting."


Having crossed the Charles from Allston to Cambridge many times, I believe the problem is not the river. The river is actually quite beautiful and a pleasure to cross.

Rather, the main problem is the traffic. Crossing four lanes of traffic on Mem Drive, and then two more sets of lights on the Allston side, takes time and is unpleasant as dozens of cars roar by. The second problem is the 10-minute walk from Harvard Yard to the riverside, along congested, broken sidewalks.

Still, it would be nice to have kiosks selling ice cream, water, beer, kebabs, beer, hot dogs, popsicles, beer, or whatever along the way. It would make it possible to grab a quick meal between classes or meetings.

Thursday, June 02, 2005

Blogged by Asiapundit

It's a good feeling the first time you realize someone actually finds value in your blog. I had that feeling earlier this evening when I happened upon Asiapundit's reference to my earlier post about my preliminary test run of Xinhua on references to Taiwan in the 1980s and 1990s.

That prompted me to post the raw data in PDF format to my Harvard website. Creating the PDF was easy (with a Mac); uploading it to the FAS web server was unbelievably complicated. Besides basic HTML skills, Harvard requires you to know Unix commands and be able to work with a secure FTP client. Very few students -- Extension or otherwise -- happen to be able to do this. It's almost as if FAS doesn't want you to create a website ... now why would that be?

Anyhow, here's a link to my Harvard website, which has a link to the preliminary Xinhua data on Taiwan. It's very rough ... note there are no totals for each year, meaning relative frequency is impossible to figure out. But in an earlier study, I found that Xinhua's English language wire service was publishing about 20,000 news items per year in 1978 and 1979 (compared to about 15,000 in 1977, the first year in which Xinhua English service items were archived in LexisNexis), so I imagine the totals would be similar for the early 1980s.
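
To make the relative-frequency point concrete, here is a minimal Python sketch. The yearly totals are the approximate figures mentioned above; the hit counts are made-up placeholders, not real search results, and `relative_frequency` is a hypothetical helper.

```python
# Approximate yearly totals of Xinhua English items, from the figures above.
approx_totals = {1977: 15000, 1978: 20000, 1979: 20000}

# Placeholder hit counts for a search term (invented for illustration only).
example_hits = {1977: 150, 1978: 400, 1979: 300}

def relative_frequency(hits, total):
    """Share of all items in a period that matched the search term."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total

for year in sorted(example_hits):
    share = relative_frequency(example_hits[year], approx_totals[year])
    print(year, f"{share:.1%}")
```

Without the yearly totals, a rise in raw hits could simply reflect a rise in the overall number of items published.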

Wednesday, June 01, 2005

Scharrer's content analysis using LexisNexis

Just skimmed over a 2002 journal article in Journalism Studies entitled "An 'Improbable Leap': a content analysis of newspaper coverage of Hillary Clinton’s transition from first lady to Senate candidate." (Journalism Studies, Volume 3, Number 3, 2002, pp. 393–406)

The author, Erica Scharrer of the University of Massachusetts (Amherst), uses LexisNexis in a different way than I have. My earlier study on references to overseas Chinese in Kampuchea and Vietnam was based on frequency counts in a single source (Xinhua) and analysis of the results, but Scharrer's study uses multiple sources, a sampling technique, and a comparative study that mentions Rudy Giuliani, and also has human coders rate the stories according to a set of special criteria:

To construct the sample, the search term “Hillary (Rodham) Clinton” was entered for the time period of 1 October 1999 to 6 February 2000 in the Lexis Nexis database. From the list of sources displayed, every fourth story was selected to reach the ultimate sample size of 342 stories on Clinton.

The dates were chosen to encompass a four-month period in which speculation about the race mounted and then certainty was reached. The time period includes 24 November 1999, on which Clinton said unofficially that she would run, and ends on 6 February 2000, the date of her official announcement. To gather stories about Giuliani, a smaller sample over a shorter time period was chosen, since Clinton is the central focus of the study. Using the search term “Rudolph OR Rudy AND Giuliani” in the Lexis Nexis database from 1 November 1999 to 1 January 2000, every tenth story was selected, resulting in a sample of 96 stories. When comparisons between the candidates were made, the dates of the Clinton stories were narrowed to include only the same time period that was examined for Giuliani.

All US-based newspapers archived in the Lexis Nexis database were included in the analysis, allowing for diversity of newspaper size and region. Newspapers were chosen due to their role in informing the public about politics (Comstock and Scharrer, 1999) as well as for the potential for elite newspapers to set the agenda for other publications (Shoemaker and Reese, 1996). News articles and editorials were both analyzed. Two female, trained coders who were unaware of the hypotheses coded 40 percent of the sample (20 percent each), and the author coded the remaining 60 percent. Intercoder agreement using Holsti’s formula averaged 0.88 and ranged from 0.83 to

Defining and Measuring Variables

Coders noted whether the activity or angle covered in the story was politically active or not. Stories about issue positions, poll results, campaign visits, and policy discussions were coded as politically active. Stories about such traditional first lady roles as escort, entertainer, home decorator, fashion plate, and charitable works advocate were coded as non-politically active. For example, stories in which Clinton visited a hospital were coded as politically active if it was under the heading of campaigning but non-politically active if it was within the charitable role common to a first lady. Finally, “mixed” labels were given to stories in which politically active and non-politically active roles were given approximately equal weight.

Coders determined the degree to which the story indicated that Clinton would, indeed, run for Senate, on a scale of 1 (definitely will not run) to 4 (definitely will run) and assessed the tone of the story on a scale of 1 (very negative) to 5 (very positive).


She then creates an index for the two politicians based on these variables, and proceeds to analyze the results. It's an interesting study that uses the LexisNexis tool in a different way.
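
The "every fourth story" selection described in the excerpt is a form of systematic sampling. Here is a minimal Python sketch of the idea. The `systematic_sample` helper, the starting position, and the stand-in list of 1,368 results are assumptions for illustration; the article does not say how many results the search returned or which story the selection started from.

```python
def systematic_sample(items, step, start=0):
    """Return every `step`-th item, beginning at index `start`."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return items[start::step]

# Stand-in for a list of search results (story IDs only, not real data).
search_results = list(range(1, 1369))

clinton_sample = systematic_sample(search_results, 4)
print(len(clinton_sample))
```

A fixed interval like this is simple to apply to a results list, though it can skew the sample if the list itself has a regular pattern in it.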

Of course, the use of human coding is necessary for rating the coverage as negative/positive/politically active/etc. Computers cannot accurately discern these nuances (although some computer programs incorporate dictionaries to discern more general degrees of "meaning" based on the position of certain words within text, syntax, and frequency counts).

On the other hand, using humans to code these stories also introduces the possibility of error and bias.
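
The intercoder agreement figure quoted in the excerpt is one way of checking for this. Holsti's formula is usually written as 2M / (N1 + N2), where M is the number of coding decisions on which two coders agree and N1 and N2 are the number of decisions each coder made. Below is a minimal Python sketch, assuming both coders rated the same stories; the labels are invented for illustration and are not from Scharrer's data.

```python
def holsti_agreement(codes_a, codes_b):
    """Holsti's coefficient of reliability for two coders: 2M / (N1 + N2)."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must rate the same stories")
    if not codes_a:
        raise ValueError("no coding decisions to compare")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return 2 * agreements / (len(codes_a) + len(codes_b))

# Invented labels for five stories rated by two hypothetical coders.
coder_a = ["active", "active", "non-active", "mixed", "active"]
coder_b = ["active", "non-active", "non-active", "mixed", "active"]

print(holsti_agreement(coder_a, coder_b))
```

Note that this measure is simple percent agreement and does not correct for agreement that would occur by chance.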