More brilliance in Corpus Linguistics


Professor Mark Davies of Brigham Young University set up a superb site for accessing the British National Corpus (BNC). Fast, intuitive, easy to use and packed with features, it is by far the best concordancer on the web in my opinion.

As if that wasn't enough, he has now added more corpora for us. There is a nice database of Time Magazine that lets us track a word's use over time. When I searched for 'glasnost', a Russian word for the Gorbachev-era reforms, I could see that it was not used at all before the 1980s, that 187 occurrences are found in that decade, and that use then declined to 73 in the 1990s and just 4 since then: an accurate reflection of the word's history in English. Try a search for environmental, terrorism, or HIV/AIDS yourself and see if you're not hooked. By the way, the slash (/) lets you search for two words and compare them at the same time.

Furthermore, there will be access to the Corpus of Contemporary American English (360 million words) towards the end of this year. For other languages, Professor Davies has provided two Spanish corpora and a Portuguese one. Sadly, his very tempting corpus of the OED is only available to students at his university.

One of the best English language sites has just got a whole lot better, with the promise of even more to come. I assume that he is already working on cross-corpora searches. I cannot recommend or praise this site highly enough. Compulsive, addictive even, and highly educational, it is simply fantastic.

Categories: General

2 Comments

Brilliant indeed. Who'd've thought that 'new' was more than twice as frequent as 'old' in tabloid newspapers? (Maybe it's been becoming more frequent in the last few years. Perhaps this tool can tell me - I'll have to go and find out; it'll give me something to do while the UE Members' Area is hors de combat.)

b

Bob, what's wrong with the Members' Area? I just logged in without any problem.
