The BNC was the first text corpus of its size to be made widely available.
The bibliome is the totality of the biological text corpus.
KeyWord created a list of all those words and word forms that, according to certain statistical criteria, occur significantly rarely or frequently in the text corpus.
Spanish text corpus by Molino de Ideas, which contains 660 million words.
My specific areas of interest are natural language understanding systems and multivariate analysis of text corpora.
Nevertheless, there is a very large text corpus of Middle English.
He applied it to phonemes and words in text corpora of different languages.
Rules to define co-occurrence within a text corpus can be set according to desired criteria.
On text corpora, word lengths, and word frequencies in Slovene.
Consequently, text corpora were limited both in size and availability.