Thursday, March 16, 2006

Problems with the digitized, networked archives

I am a big fan of digital archives and other data accessible via the Internet. I am using them now as the foundation of my research, and I believe they will make a huge impact on historical studies in the years to come.

However, an article by John Letzing of the Wall Street Journal last month ("Changing History", p. R10, February 13, 2006) brings up a problem with this trend: Digitized records on computer networks are marginalizing records that are not on computer networks. A Northwestern University professor says that faculty aren't going to the library very much anymore, and are missing out on "accidental discoveries" that often prompt new theories and research.

The same source also brings up two problems with Google. He quite rightly points out that Google indexes by popularity, not by quality. Websites with the most links get pushed to the top of the search results, regardless of who wrote them or how familiar the authors are with the subject in question. So if Martha Stewart's blog mentions ancient Chinese civilization in connection with a menu she dreams up, it will show up in the results way ahead of writings by history scholars. Additionally, the source in the article suggests that peer-reviewed articles are losing their anonymity, because it's easy to insert a phrase into Google from an article under review and find out who the author is, at least if the text has been previously uploaded to the 'Net -- which it often is, in the case of classroom lecture notes, syllabi, and abstracts from conference papers.

I have a few personal observations to add. I really believe today's students are more apt to take the lazy way out -- several times, I've seen bibliographies of student papers that contain nothing but Internet sources. I believe that journals that are networked get more scholarly attention and citations, while those that aren't are increasingly relegated to obscurity. We may see a similar trend with books, when the trickle of electronic books and the Google Book Search project begin to make headway. And handwritten sources? Unless someone is able to design scanning software and hardware that can accurately and efficiently scan letters, scrolls, and notes, and make them searchable, these important primary sources are likely to be marginalized as well -- and the only ones that will get attention are those "important" or rare documents that have been scanned and hosted on the 'Net by their owners, most of them well-financed libraries and universities.

In the case of my own thesis, I have to admit that my research would have been more effective if I had been able to include New China News Agency data from 1976, the year that Mao died and the Gang of Four (briefly, and incompletely) had political control of the country. But because NCNA news wasn't archived by LexisNexis until January 1977, I couldn't include the earlier material -- searching and tabulating the thousands of 1976 NCNA articles by hand would have been too time-consuming and too prone to error.

On the other hand, the fact that the post-1977 NCNA archives are digitized makes possible whole new areas of enquiry into modern Chinese history that simply weren't realistic endeavors in the pre-networked age. I am using this data for a quantitative study, but even people conducting qualitative research can far more quickly and effectively find relevant NCNA articles in the LexisNexis and Factiva databases. The same is true for other areas of history. Right now scholars can examine American newspaper reports from the 1700s and 1800s -- imagine how useful that is for people studying the American Revolution, slavery, and the settlement of the American West!
