Helping language learners become language researchers: (part 1)

Wordandphrase is a brilliant website. Essentially, it is a user-friendly interface for analysing a corpus. (For those of you who haven’t come across this term yet, a corpus is a collection of texts stored electronically.) In this case, the corpus is COCA (the Corpus of Contemporary American English), which contains 450 million words. It is the largest corpus that is freely available; its texts were collected between 1990 and 2012 and cover spoken, newspaper, fiction and academic registers.

Due to its user-friendliness (colour-coding for different parts of speech in the examples, colour-coding for frequency in the text analysed, etc.), it seems ideal for use with students: a tool that could help them become more independent by providing a means of discovering how language is used that doesn’t rely on the teacher.

It provides information like:

  • frequency of word or phrase use (within the top 500 most-used words, 501-3000, 3000+)
  • frequency of word or phrase use within particular genres (spoken, newspapers, fiction, academic)
  • definitions, synonyms and collocates (for which it also provides frequency information, making it a very powerful collocational thesaurus, for phrases as well as words)

It allows you to:

  • input (type in or copy and paste) a paragraph of text and see at a glance (through colour-coding) how frequent words are.
  • search for a phrase from that inputted text by clicking on its component words, and generate examples of that chunk of language in use.
  • look at a list of colour-coded examples and identify, at a glance, what types of words are used before and after the word in focus (nouns? adjectives? adverbs? prepositions?), with a rough indication of frequency (in terms of how much highlighting of a particular colour you can see in comparison to another) too.

All in all, it enables you to gain a better idea of the meaning and use of a word or phrase, as well as its potential alternatives.

However, when learners first meet it, it might seem daunting:

  • When you search commonly used words or phrases, large numbers of examples may be generated: this may be confusing for learners, especially as the examples are portions of sentences (x number of words around the word being analysed) rather than complete sentences, and are devoid of context.
  • Before the colour-coding for parts of speech can help you, you need to understand what it means!
  • There is a lot of information on the page – it can be difficult to know where to start.
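For readers who like to see the mechanics: the sentence fragments mentioned above are usually called keyword-in-context (KWIC) lines. The sketch below shows the general idea in Python. It is only an illustration, not how the website itself is built; the function name `kwic`, the `window` parameter and the two sample sentences are assumptions made up for this example.

```python
import re

def kwic(text, keyword, window=5):
    """Return keyword-in-context lines: up to `window` words either side of each hit."""
    # Very simple tokeniser (letters and apostrophes only); an assumption for this sketch.
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            # Mark the search word with brackets, as a stand-in for highlighting.
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Hypothetical sample text, invented for illustration.
sample = ("She bought a new outfit for the wedding. "
          "The outfit was far too expensive, but she loved it.")

for line in kwic(sample, "outfit", window=3):
    print(line)
```

Because the window is counted in words and ignores sentence boundaries, the second line starts with the tail end of the previous sentence, which is exactly the kind of context-free fragment that can confuse learners at first.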

How can we use this website with learners?

This is something I am still exploring. I think it has massive power but the limitations need managing carefully so that they don’t put students off.

I have already created some self-access materials (inspired by a course mate of mine – see below for more details) which guide learners through using the site, through a series of tasks, and help them to discover what they can do with it. My learners (of various levels) have used these materials and many were able to complete the tasks without too much difficulty. Some learners independently shared information they found via using the site, using our class blog. However, for the most part it “gathered dust”. 

While my materials address the “how” (at a basic level – there is more that the website can do, that I am still finding out!), they don’t help learners become better at identifying the patterns that are present in the examples generated. Perhaps, for learners to use the site successfully and really harness its power, in-class scaffolding is needed: using concordances with learners, getting them to produce word profiles and generally developing their noticing skills. Of course, as teachers we are always trying to help learners develop better noticing skills, but we usually work with texts, complete with some kind of context, rather than with sentence fragments devoid of context. Transferring these noticing skills, then, may not happen automatically.

One of my aims in the next couple of months is to create some activities using concordances and other information from the site and use them with my learners, to give them more scaffolding and help them to develop their use of the site independently, as language researchers. I hope to integrate it so that learners use it to find out more about the vocabulary we meet in class, as well as encourage them to apply it to language they meet out of class. What I create and how I get on with this project will form part 2 (and onwards?!) of this series of posts.

Here are the materials I have made: self-access materials that offer a guided discovery tour of the website, with an answer key at the end. If you aren’t familiar with the site, these might be as useful for you as for your learners! 🙂

These materials were inspired by a course mate of mine at Leeds Met, Jane Templeton, who made some guided discovery materials to help learners use the site to choose mid-frequency vocabulary from texts they encountered, as these mid-range words provide a useful learning focus, and to find out more about their choices. I wanted to use it with my learners too, but wanted a more general-purpose intro to the features of the site, rather than one geared towards that particular purpose. So I made my materials with the example word “outfit” – which may seem a rather random choice! – taken from the page of compounds learners meet in Headway Advanced Unit 6. That said, one might well question whether guiding learners towards a particular purpose, as in Jane’s materials, might be more useful than my vaguer, more general approach… <answers on a postcard!>

How can this website help *you*, the teacher?

It enables you to:

  • copy and paste in a text that you want to use with your learners and see at a glance what percentage of its words are high-frequency (the top 500), mid-frequency (501-3000) and low-frequency (outside the top 3000), which gives an indication of what difficulties it is likely to present to your learners.
  • use this information to guide your decisions about which words to pre-teach, what scaffolding your learners might need when they meet the text, or perhaps which words to swap for more frequently used synonyms, depending on your goals in using the text and the level of your learners. (The site can help you find these too, as it provides synonyms and frequency information, as well as examples of use if you are unsure whether you have found the right alternative.)
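If you are curious how such a percentage breakdown can be worked out, here is a rough Python sketch of the idea. It is not the site's own method. The band boundaries follow the ones described above (top 500, 501-3000, outside the top 3000), but the tiny rank table `TOY_RANKS`, the function names and the sample sentence are all invented for illustration and are not real COCA data.

```python
import re
from collections import Counter

# Toy rank table: these ranks are INVENTED for illustration, not real COCA ranks.
TOY_RANKS = {"the": 1, "a": 5, "for": 12, "she": 30, "new": 90,
             "bought": 800, "wedding": 1500, "outfit": 2900}

def band(rank):
    """Map a frequency rank to a band; words missing from the table count as low."""
    if rank is None or rank > 3000:
        return "low"
    if rank > 500:
        return "mid"
    return "high"

def band_profile(text, ranks):
    """Return the percentage of tokens in each frequency band."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    if not tokens:
        return {"high": 0.0, "mid": 0.0, "low": 0.0}
    counts = Counter(band(ranks.get(t)) for t in tokens)
    return {b: round(100 * counts[b] / len(tokens), 1) for b in ("high", "mid", "low")}

# Hypothetical sample sentence; "lavish" is not in the toy table, so it counts as low.
profile = band_profile("She bought a new outfit for the lavish wedding", TOY_RANKS)
print(profile)
```

With a real ranked word list in place of the toy table, the same counting would give a quick feel for how many words in a text might need pre-teaching.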

Conclusion: wordandphrase is a site with a lot of potential for language learners and teachers alike. I’m still learning how to use it and finding ways to tap that potential. Please let me know how you get on with using the materials I have uploaded here, and the website, whether for yourself or on behalf of your learners – I would be very interested to hear! I would also be interested to hear any ideas you have and try out for integrating use of the site, in any context, and how it has benefited your learners.


22 thoughts on “Helping language learners become language researchers: (part 1)”

  1. Looks very promising. I however had a confusing first go with it. I clicked on the Academic Text link, inputted one of my sample answers and clicked search. It appeared in the text box to the right in a colour-coded fashion, but when I clicked on any of those words, the corpus showed a completely different word in definition and concordance. Something seems amiss here.

    Next I selected “History” as the domain where I inputted my sample answer and clicked submit. A completely different text appeared colour-coded in the text box on the right, not my sample. Maybe there is some explanation here, but it’s enough oddity to make me hesitate using it.

    • Odd! I just tried the academic text link (a whole nother area of the site that I’m keen to explore further!) and wrote an example sentence in the box, did the search, it appeared as usual on the right-hand side, with the frequencies relating to academic register/frequency list, and clicked on some of the words and it generated a concordance/definitions etc of the words I clicked on.

      But to your second paragraph, I think you actually selected an automatic sample text by choosing history. That’s how it works in the general part of the site: it has an example text from each of the registers that you can select to search. And that would be what then appeared in the box on the right when you clicked search.

      I think the academic link would be a very valuable tool for language learners in a university context, so don’t give up on it just yet! 🙂

  2. This is a really powerful tool. I definitely want to try to build it into a new course on vocabulary. I will be playing with it a bit, and see what I can do. I look forward to seeing what others have done too. Tyson, I tried the two features you mentioned there, but it worked fine for me. I got exactly the text I entered, and was able to see what fell into my selected domain. I could click on any of the words and get a complete list, as well as clicking on any of the collocation words to get further lists. I have the ‘phrase’ feature working too. I don’t see any problems yet. Maybe you had a temporary glitch?

    • Sounds good! What kind of context are you working in? 🙂 I’m currently using it with general English private language school adult learners but am keen to use it in a university context, if I can get some pre-sessional work this summer. I think it could be very powerful for uni students.

      • My situation is a little strange: it’s EAP, but with very low-level…beginners (some literacy level). I am in the early planning stages of my MA Thesis, and thinking of creating a 6 week seminar on vocabulary for this beginner level. That looks like a good tool, since it clearly separates a text into the “1st 1000” of the GWL, 2nd 1000, etc. I think this could be one of my tools. Still early stages though…

  3. hi lizzie

    good stuff, wordandphrase is a deep tool, but as with all tools getting students to use it depends on the interface. i have found that giving learners a range of options is one way to encourage use so offer exposure to corpus based tools such as wordneighbors (my current favourite), just the word, lextutor and byucoca as well

    btw did a similar intro to wordandphrase your readers may be interested here,%20Muralee.%20Corpus%20use%20literacy.pdf


    • Wow tons of tools to explore. I’ve used lextutor before, it’s quite cool, but I prefer wordandphrase, for the colour! 🙂 The others, I’ve yet to play with…

    • What is G+community for corpus use? Is it like a yahoo group but on Google or are we analysing it? :-p (I know I could look but for the sake of anyone who sees this, it’s nice to know at a glance what to expect before following up! 🙂

  4. Pingback: Helping language learners become language researchers: (part 2: 3 activities) | Reflections of an English Language Teacher

  5. Pingback: Helping language learners become language researchers (part 3): concordance activity outcomes | Reflections of an English Language Teacher

  6. Great post Lizzie and I love your tour 🙂 I’ve used the site for a while and my learners find it very useful, particularly when working on academic English. They post their own work into the box and the “less frequent” items that often come up tend to be errors, but you have to be careful with this.
    I like your very clear exploration of outfit and I think that if teachers plan very carefully what they are going to focus on, it can be very useful and it is, let’s face it, a great tool.

    It does pay, however, if you want to use it in class to check whether your system/firewall whatever will let you as it works with a particular video setting otherwise you don’t see anything. This can be altered on the computer but it can throw you if you’re not prepared for it. (At least it did me)
    Once again, I think corpora based materials are great for teachers to select lexis to focus on and for advanced learners who can use them to answer their own questions quite often.

    • Thanks! 🙂
      I agree re checking systems in advance – it also doesn’t like low-res screens such as on tablets or small computers (netbooks?)!!
      My advanced learners love it now, but I think what’s also helped is scaffolding that use through little tasks/activities in class (using printouts) and then out of class, and giving them opportunities to share what they have learnt in class discussions. It is a great tool. I like the idea of getting them to paste their work into the box. But what if the vocabulary they are using just happens to be “less frequent”, rather than an error? I suppose searching chunks/phrases would be interesting. Mmmm I’ll have to think about it some more when I’m more awake! 🙂

  7. Pingback: Learning Contracts and Language Learning (part 2): how I’ve used one and what I’ve learnt (other than a lot of Italian!) | Reflections of an English Language Teacher

  8. Pingback: Useful IELTS Websites | Reflections of an English Language Teacher

  9. Pingback: Dictionaries, corpora and using notebooks | Sandy Millin
