Web as Corpus
=============
url: http://www.webcorp.org.uk/
lang: all languages
kwd: search engine for translators, effective snippet-built corpus
cmt: "WebCorp is a suite of tools which allows access to the World Wide Web
as a corpus - a large collection of texts from which facts about the
language can be extracted. (...) WebCorp is designed to retrieve linguistic
data from the Web: concordance lines showing the context in which the user's
search term occurs. In response to a user query, standard search engines
return a list of URLs (page addresses), along with a description of or some
text from each page to help the user decide which pages are most useful.
(...) WebCorp actually visits each one of these pages, extracting
concordance lines from them. (...) WebCorp contains options (customisable
concordance span, output format, etc) specifically designed for linguistic
research."
***
Marie-Pierre Lessard
EN>FR Technical Translator
http://www.proz.com/translator/4451