Tuesday, August 28, 2007

When statistical analysis gets scary

Statistical analysis finds evidence of human-to-human bird flu transmission, reports the Fred Hutchinson Cancer Research Center:
The researchers based their findings on a cluster of eight flu cases within an extended family in northern Sumatra. Using a computerized disease-transmission model that took into account the number of infected cases, the number of people potentially exposed, the viral-incubation period and other parameters, the researchers produced the first statistical confirmation of humans contracting the disease from each other rather than from infected birds.

The cluster contained a chain of infection that involved a 10-year-old boy who probably caught the virus from his 37-year-old aunt, who had been exposed to dead poultry and chicken feces, the presumed source of infection. The boy then probably passed the virus to his father. The possibility that the boy infected his father was supported by genetic sequencing data. Other person-to-person transmissions in the cluster are backed up with statistical data. All but one of the flu victims died, and all had had sustained close contact with other ill family members prior to getting sick -- a factor considered crucial for transmission of this particular flu strain.
The close cousin of this type of research and analysis is predictive analytics -- and I find it somewhat alarming that a Google search for the following terms:

predictive analytics disease

... turns up 209,000 English pages.

Saturday, August 25, 2007

A note from a Harvard Extended reader

Since starting this blog, I have gotten a regular stream of correspondence from people who have questions about the Extension School. In a typical month I'll get about two prospective students contacting me with specific questions about the school, the ALM program, and occasionally the ALB program. I used to report these conversations, and last year I started an interview series with several current ALM students, but the volume of correspondence got to be too large, and my time too tight, to blog about them all.

However, I thought I would post this message from someone who has been following Harvard Extended for several years:
My wife and I are planning to move to Boston from Los Angeles. For the last 2 years I've been reading about the Harvard Extension School, your blogspot, and the forum that you created. It has been a great resource for me, in terms of weighing whether or not I would want to go to the Extension School or not. I truly appreciate all the information you have put out there, and I just wanted to personally thank you.
It's interesting to see how readers have turned to the Harvard Extended blog for information and even inspiration. I started the blog for research-related reasons, and to chronicle my experience in the Extension School's ALM program, but for a few readers it has helped them make major life decisions.

It's a good feeling knowing that I've helped people in this way, but it's also a little unsettling. Changing one's career or academic path is very common in our society, but moving across state lines -- or across the country -- to attend the Harvard Extension School takes things to a different level. We local students can drop out of a program, or take a break from studies without any major impact on our jobs or family lives, but the folks who move to Massachusetts have already quit their jobs and said goodbye to family and friends. They have made a huge commitment, and it's much harder for them to stop or take a break. I really do admire them -- I think the ALM program is fabulous and worth all of the time, effort, and tuition I have put into it since 2003, but I am not sure I would be willing or able to quit my job or relocate my family to take part.

One other note for the author of this email and other readers: if "the forum that you created" refers to the Extension Student online forum, that was actually created by someone else -- see the interview with him here.

Friday, August 17, 2007

Super Typhoon!

Satellite imagery originally from the US National Oceanic & Atmospheric Administration

Taiwan is mostly shut down today, thanks to typhoon Sepat, which has been classed as a "Super Typhoon." Top speeds are 209 km/h, which puts it in the upper range of a category 3 hurricane, according to the Saffir-Simpson Hurricane Scale. In the north part of the island, we're getting rain and wind, but lots of breaks in the weather, too. I went out yesterday afternoon and got totally drenched, but in the evening there was a break in the clouds, and I could actually see stars. This morning, there have been a few intense, gusty showers, but it seems to be weakening. Down south, where the typhoon made landfall, the conditions will be worse -- I expect the newscasts this evening will have lots of footage of flooding, landslides, and washed-out mountain roads.

Tomorrow morning, I travel to Singapore for State of Play V -- I hope air travel will be back to normal by then!

Scenes from Jiayi's Bo'ai Road Night Market

I've just gotten back from the city of Jiayi (Chiayi, 嘉義市) in southern Taiwan. We were visiting relatives, but we also had a chance to check out the old-school Bo'ai Road (博愛路) night market. Besides the ubiquitous food stalls and clothing stands, we were able to browse activities which disappeared years ago from street markets in Taipei, such as pachinko, bingo, bumper cars, and even archery. Here are some photos from our evening outing:










My daughter's favorite activity from the evening: Catching shrimp (see the second photo from the bottom). There are live shrimp in a plastic tub of water, and players dip little gaffing hooks into the tub and try to catch the shrimp -- which you can then barbecue on a small brazier next to the stall.

Sunday, August 05, 2007

Preparing the General Inquirer negative dictionary for Yoshikoder

I've been spending my free moments this weekend creating a Yoshikoder-friendly version of the General Inquirer negative dictionary used for computer content analysis of political texts. It entails adding wildcards, which Yoshikoder recognizes. This means that the dictionary will be far more sensitive to variations of common negative terms. The creators of the GI dictionary got some variants -- for instance, "exasperate" and "exasperation" -- but missed many other obvious ones, such as "exasperates" and "exasperating". Using "exasper*" will catch these terms.

Of course, wildcards don't work for every word. For instance, "envy" and "envious" could be replaced by "env*", which would get variants such as "envies," but would also catch unrelated words with neutral or even positive meanings -- "envelope," "envision", etc. In this case, I simply added "envies" to the list, rather than using a wildcard.
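To make the trade-off concrete, here is a minimal sketch in Python. It is not Yoshikoder's own matching code. It assumes that a trailing "*" behaves like an ordinary glob wildcard (any run of characters), and it uses Python's standard fnmatch module as a stand-in. The word list contains only the terms mentioned above.

```python
from fnmatch import fnmatchcase

def matches(pattern: str, word: str) -> bool:
    """Return True if word matches a glob-style pattern such as 'exasper*'."""
    # Lowercase both sides so the comparison is case-insensitive.
    return fnmatchcase(word.lower(), pattern.lower())

# Only the example words discussed in this post.
words = [
    "exasperate", "exasperation", "exasperates", "exasperating",
    "envy", "envious", "envies", "envelope", "envision",
]

# "exasper*" picks up all four variants and nothing else in the list.
exasper_hits = [w for w in words if matches("exasper*", w)]

# "env*" also picks up the unrelated words "envelope" and "envision".
env_hits = [w for w in words if matches("env*", w)]

print(exasper_hits)
print(env_hits)
```

Running a candidate wildcard against a list of ordinary words like this is a quick way to spot unwanted matches before the pattern goes into the dictionary.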

Converting the General Inquirer dictionary is no easy task. There are 2000 words in the original dictionary that my thesis director gave me (although I see another version contains 2291 words), and each one requires manual review to ensure that wildcards are effectively used and don't introduce unwanted terms into the content analysis that I am planning -- a review of press coverage of Second Life in the past 18 months. Although the GI dictionaries were originally created to examine political texts, I believe they can be used to evaluate other types of text content as well. The GI negative dictionary doesn't contain some of the terms that one typically sees in American or British media articles about new technologies, but it does have a very solid baseline list of negative terms that one might see anywhere.
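For readers unfamiliar with dictionary-based content analysis, the basic idea can be sketched in a few lines of Python. This is only an illustration, not how Yoshikoder works internally: the count_negative_terms function, the four-entry pattern list, and the sample sentence are all made up for this example, and the wildcard is again assumed to behave like a glob.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

def count_negative_terms(text, patterns):
    """Count how many tokens in text match each dictionary pattern."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for pattern in patterns:
            if fnmatchcase(token, pattern):
                counts[pattern] += 1
                break  # count each token once, under the first pattern it matches
    return counts

# Hypothetical mini-dictionary built from the examples in this post.
sample_patterns = ["exasper*", "envy", "envious", "envies"]

# Invented sentence, not taken from any real article.
sample_text = (
    "Exasperated residents said the exasperating delays "
    "made them envious of the envelope."
)

counts = count_negative_terms(sample_text, sample_patterns)
print(counts)
```

Because "envy", "envious", and "envies" are listed explicitly rather than as a wildcard, "envelope" in the sample sentence is not counted as a negative term.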

To see how I used Yoshikoder for my thesis research, check out the following posts:


Thesis update: One small step completed, but still a long way to go

Thesis update: Revising proposal, going granular with Yoshikoder

Thesis update: A eureka moment

Thesis update: Chapter 3 (draft) completed

The Homer Simpson/Xinhua incident: The take-down is explained!

I have an interesting follow-up to last week's entry, Homer Simpson's brain, or why Xinhua continues to have a credibility problem. One of Xinhua's English "polishers" (Xinhua's title for copy editors) has revealed on his blog how Xinhuanet was notified of the mistake: He phoned them up and asked them about the inclusion of Homer Simpson's X-ray in a serious health article. The conversation that followed is quite amusing.

The blogger behind Beijing Newspeak also discusses the prevailing institutional attitude at Xinhua. It's a place where departmental rivalries matter more than editorial quality or Xinhua's overseas image, apparently:
The entire organisational structure of Xinhua is flawed to the core. Each department within Xinhua exists independently, each scoring performance points for the release of reams of often meaningless words, or losing points for an individual’s mistake eg writing China and Taiwan in a headline. The departments compete with each other to secure as high a place as possible in the end-of-year league table which ensures there is absolutely zero cooperation between them. It is each for himself which means that if Xinhuanet uses a picture of Homer Simpson’s brain to illustrate a MS story, and in doing so tarnishes the reputation of the whole news agency, no one cares. As long as it doesn’t affect our department. Which is why it took a phone call from a foreign polisher, whose pay and reputation is not affected by the points system, to cause the removal of the picture. Many of the Xinhua “leaders” do not read English or simply regard the non-Chinese services as trivial.
I encourage anyone who is interested in the inner workings of China's state-run news agency to read the many other posts in Beijing Newspeak. The author has a lot of interesting observations about the New China News Agency and Chinese journalism. It's one of the few accounts I've seen from a foreign editor working at Xinhua, outside of the recollections published in Robin Porter's 1992 book, which date from the late 1970s and 1980s. Some starting points: Transformers: Xinhua reporters in disguise ("I have come across a number of occasions on which Xinhua reporters in the provincial bureaus around the country have treated breaking news with contempt") and Practising what they preach? China’s journalists’ association on bun conundrum ("It is common practice to surf local news websites, copy some titillating nugget and upload it to the central department. I have seen some of my colleagues rightly reject stories by regional journalists that are based on one comment from an Internet forum.")

Thursday, August 02, 2007

Homer Simpson's brain, or why Xinhua continues to have a credibility problem

Screenshot of the Xinhua article

The New China News Agency (Xinhua, 新華社) has a credibility problem. It's not just because NCNA is a state-run news agency that publishes propaganda alongside news. It's also because basic editorial processes are so broken that a "file photo" of Homer Simpson's brain can show up alongside a serious article about multiple sclerosis, and remain there for days (it was not removed until August 3, four days after it was initially posted).

Now, you may chuckle at what appears to be a one-off mistake, but it reflects major editorial problems at China's official news agency. This is not just a harmless error (or prank) by a single employee -- it's very likely that at least two other people were involved, and the editorial processes that are supposed to catch such mistakes either failed to work or are not even in place at Xinhua.

I work for a major technology news publisher. Multiple people contribute to and review each article that appears online, before and after the initial publication process. It can't be much different at Xinhua. Aside from writing and copy editing the article, someone -- probably a writer, or the editor "Han Lin" -- had to choose the photo to be included with it. Someone else may have helped prepare the photo for the Xinhua website (resizing it, placing it on the appropriate server, etc.). A third person -- in a normal newsroom, that would be a more senior editor, or someone directly responsible for the website updates -- probably vetted it before it went live, or immediately after it went live. Other employees almost certainly browsed it after publication.

And it never occurred to any of them that the X-ray seemed unusual. I mean, c'mon! Even if you've never seen the Simpsons, wouldn't an X-ray of an oddly shaped skull with a serious overbite and walnut-sized brain warrant a little extra discussion or examination?

But wait, there's more. Since the publication of the article on July 30 (three days ago), people have noticed. Other media outlets have noticed, including Computerworld. It's hard to believe no one at Xinhua has realized the gaffe. Maybe no one checks the Xinhua email inbox?

This is not the first time something like this has happened. China's state-run media has lifted articles off the 'Net before, and has sometimes reprinted hoaxes as news -- as evidenced by the fiasco caused by a 2002 Onion article about the U.S. Congress threatening to relocate the Capitol. The Internet has made it much easier for careless or lazy journalists to copy and paste, and Xinhua's weak QA processes make it easier for plagiarized content, hoaxes, and false or exaggerated information to make it past the gatekeepers.

But even before the World Wide Web appeared, the English-language service had a credibility problem. While it was viewed as an authoritative source of information about Chinese policies (which is one of the reasons I used it as the basis for my thesis research), Western audiences did not trust its news output, partly owing to its stated propaganda mission, and partly because of quality issues, ranging from poorly written articles to long delays in printing coverage of important events. Since the 1980s, Xinhua/NCNA has invested a great deal of money and effort into making itself a "world news agency" (see Robin Porter's 1992 book, Reporting the News from China, and Won Ho Chang's 1989 history, Mass Media in China: The History and the Future). But as long as the propaganda mission persists, and editorial quality is neglected, there is little chance the English-language service will achieve widespread international respectability.