Let's say you've found a new word. You've looked it up in a dictionary and found an example sentence or two. Perhaps you've also found what other words it collocates with. (I'll be looking at some online tools for finding collocates in another post.) What you'd like to do now is to see how it's used in context.
One possibility is to use a corpus ("a collection of samples of real-world texts stored on computer. Plural - corpora" - Leoxicon), but corpora can sometimes be difficult to use, and when they include spoken language, the grammar is occasionally "non-standard", let's say.
The British National Corpus is easy to use, but be careful with examples from spoken language.
Another way is to do a simple Google (or other) search. The Internet is one enormous corpus if you think about it, although no linguist has "collected" these examples. But a simple search can bring up a lot of irrelevant material, and again you're not really assured of grammatical correctness.
What I like to do is a Google site search of trusted newspapers and other websites, which are in effect small corpora, or look in Google Books, where the material has been edited and proofread, so is likely to be grammatically correct.
With a Google site search, you put in your search term (which I like to put in inverted commas, so that Google only looks for these words when they appear together), followed by site: and the address of the website (without http://). So if I wanted to find examples of "highly unlikely" on the Guardian website, I'd enter:
- "highly unlikely" site:www.guardian.co.uk
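If you'd rather build such searches automatically, the same query can be turned into a link. What follows is only a minimal sketch in Python, not the tool described in this post: the function name `site_search_url` is made up for illustration, and it assumes Google's usual `https://www.google.com/search?q=` address for search results.

```python
from urllib.parse import quote_plus


def site_search_url(phrase: str, site: str) -> str:
    """Build a Google search link for an exact phrase on a single website.

    Hypothetical helper for illustration; assumes the standard
    https://www.google.com/search?q=... address.
    """
    # The site: operator takes a bare address, so drop http:// or https://
    for prefix in ("https://", "http://"):
        if site.startswith(prefix):
            site = site[len(prefix):]
    # Inverted commas keep the words together as one exact phrase
    query = f'"{phrase}" site:{site.rstrip("/")}'
    # quote_plus makes the query safe to use inside a web address
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    print(site_search_url("highly unlikely", "www.guardian.co.uk"))
```

Pasting the printed link into a browser should give the same results as typing the query into the Google search box by hand.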
To make things easier, I've put together a simple tool to look up words and expressions on various newspaper sites, etc. Just enter a word or expression into the Entry Box and click on one of the links. (Try it with the examples.) I'll probably add some more sites to it later.
A note on books - clicking on Google Books searches all books digitised by Google. There is also a facility for doing an advanced search of Google Books, where you can, for example, choose to search only modern books. Project Gutenberg is a digital collection of out-of-copyright books, so it has all the classics but few modern books.
Links
- Leoxicon - a blog dedicated to the use of corpora in the teaching of English to foreign students.