Can Google Concordance Language?


In language discussions, search engine results are often quoted as evidence of whether a form is actually used, or to compare forms and see which is more common. Google Blogoscoped has run 27,000 words from a dictionary through Google to measure their popularity; the full results of the study can be downloaded here. The table below shows the top thirty words from the 2006 and 2003 surveys, together with the top thirty words from the British National Corpus (BNC).
The method used in the Google study does not count multiple occurrences in a single page, so a single copyright message at the foot of a page counts for as much as every occurrence of "the" on that page. This accounts for the high ranking of copyright, contact, site, home, etc. However, the other entries suggest that the contents of the Google database, and therefore those of any other reputable search engine, are likely to give a fairly accurate reflection of usage for terms that are not directly related to the language of webpage layout. As a rough and ready check, it seems that search engines can be used as basic concordancing tools.
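The difference between counting pages and counting occurrences can be shown with a small sketch. The three pages below are invented for illustration and are not data from the study; `tokenize`, `page_counts` and `occurrence_counts` are hypothetical names, and the assumption is that the study counts a word once per page while a corpus frequency list counts every occurrence.

```python
from collections import Counter
import re

# Toy pages, invented for illustration only; they are not data from the study.
pages = [
    "the cat sat on the mat and the dog sat on the rug copyright 2006",
    "contact the team about the site copyright 2006",
    "home page of the site copyright 2006",
]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens (digits are skipped)."""
    return re.findall(r"[a-z']+", text.lower())

# Page count: each page contributes at most 1 per word
# (assumed to be how the search engine figures work).
page_counts = Counter()
# Occurrence count: every repetition is counted
# (assumed to be how a corpus frequency list works).
occurrence_counts = Counter()

for page in pages:
    tokens = tokenize(page)
    occurrence_counts.update(tokens)
    page_counts.update(set(tokens))  # set() collapses repeats within one page

for word in ("the", "copyright"):
    print(word, "pages:", page_counts[word], "occurrences:", occurrence_counts[word])
```

In this toy data "copyright" appears on as many pages as "the", even though "the" occurs far more often in total, which is the effect that pushes layout words up the Google lists.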


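| Google 2006 | Google 2003 | BNC |
|---|---|---|
| a | the | the |
| the | of | at |
| to | and | of |
| in | to | and |
| of | a | a |
| and | in | in |
| for | for | to |
| by | on | it |
| home | home | is |
| all | is | was |
| this | by | I |
| is | all | for |
| about | this | you |
| site | with | he |
| with | about | be |
| at | or | with |
| more | at | on |
| your | from | that |
| us | are | by |
| you | us | are |
| contact | site | not |
| web | information | this |
| are | you | but |
| from | contact | 's |
| information | an | they |
| it | more | his |
| copyright | new | from |
| an | search | had |
| privacy | that | she |
| that | your | which |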
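
One rough way to put a number on how closely the lists agree is to count the words they share. The sketch below copies the three top-thirty lists from the table into sets and compares each Google list with the BNC list; the variable names are hypothetical, and plain set overlap ignores rank order, so it is only a crude measure.

```python
# Top-thirty lists copied from the table above.
google_2006 = {
    "a", "the", "to", "in", "of", "and", "for", "by", "home", "all",
    "this", "is", "about", "site", "with", "at", "more", "your", "us", "you",
    "contact", "web", "are", "from", "information", "it", "copyright", "an",
    "privacy", "that",
}
google_2003 = {
    "the", "of", "and", "to", "a", "in", "for", "on", "home", "is",
    "by", "all", "this", "with", "about", "or", "at", "from", "are", "us",
    "site", "information", "you", "contact", "an", "more", "new", "search",
    "that", "your",
}
bnc = {
    "the", "at", "of", "and", "a", "in", "to", "it", "is", "was",
    "I", "for", "you", "he", "be", "with", "on", "that", "by", "are",
    "not", "this", "but", "'s", "they", "his", "from", "had", "she", "which",
}

for name, words in (("Google 2006", google_2006), ("Google 2003", google_2003)):
    shared = words & bnc        # words in both the Google list and the BNC list
    google_only = words - bnc   # words only in the Google list
    print(f"{name}: {len(shared)} of {len(words)} words shared with the BNC")
    print("  Google only:", ", ".join(sorted(google_only)))
```

The words that appear only in the Google lists are mostly the layout terms discussed above, such as home, site and contact.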


2 Comments

Not counting multiple occurrences does weaken the case for its accuracy.


Are there any new lists in CSV format that have been added or created lately?
