I’ve had a lot of prep time for a couple of years now. How?! Not because of my teaching schedules, but because I constantly streamline practices to ensure I can actually complete my work during the workday. Most of this time is spent typing up class texts for students, as well as researching teaching practices online. Last week, however, I spent waaaaaay too much of that prep time crunching numbers with voyant-tools.org. Here are some insights into the vocab my students were exposed to this year across all class texts and 8 of my novellas (over 45,000 total words read!). N.B. this includes all words read in class except for those appearing in the first 6 capitula of Lingua Latīna Per Sē Illustrāta that we read at the very end of the year. The stats:
- 550 unique words recycled throughout the year (there were 960 total, but 410 appeared just a handful of times!)
- 30% came from the first 8 Pisoverse novellas (Rūfus lutulentus through Quīntus et nox horrifica), and were not found in class texts.
- 290 appeared in at least a few forms (i.e. not only 3rd person singular present for verbs, or nominative/accusative for nouns).
- 2470 different forms of words (grammar!)
- 45% came from the 8 Pisoverse novellas, not class texts.
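For anyone who wants to check the arithmetic behind these figures, here is a minimal Python sketch. It uses only numbers quoted in the list above. The two averages are a back-of-the-envelope calculation, not something voyant-tools.org reports, and the choice of denominator is an assumption: the stats don't say whether the 2470 forms belong to all 960 words or only to the 550 recycled ones, so both are shown.

```python
# Figures quoted in the stats list above.
total_words = 960      # all unique words seen during the year
rare_words = 410       # words that appeared just a handful of times
total_forms = 2470     # different word forms

# Words that were recycled throughout the year.
recycled_words = total_words - rare_words

# Assumption: the post does not say which set of words the 2470 forms
# cover, so compute the average both ways.
forms_per_recycled_word = total_forms / recycled_words
forms_per_any_word = total_forms / total_words

print(recycled_words)                      # 550
print(round(forms_per_recycled_word, 1))   # 4.5
print(round(forms_per_any_word, 1))        # 2.6
```

Either way, each word showed up in several forms on average, which is the "unshelter grammar" half of the picture.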
These numbers represent what is meant by “shelter vocabulary; unshelter grammar.” What is this? Among other things, it’s basically the opposite of the approach taken by nearly all textbooks. That is, the most reasonable textbooks might introduce 25 new words, but just 1 new word form (i.e. “unleash vocab, limit the grammar”) whereas the least reasonable textbooks might introduce 25 new words, and 25 new word forms?! Success via a grammar-based textbook approach requires students to understand a high number of word meanings. The challenge is even greater since all that vocab amounts to message content that tends to lack purpose. It tends to lack purpose because it’s used to teach a limited amount of grammar (i.e. conveying meaning is only secondary to teaching grammar).
With a comprehension-based and communicative approach, however, I was able to expose students to many different forms of the 550 words in order to build up mental representation of Latin word functions (i.e. case, tense, etc.) without ever having to explicitly teach those functions. Students were able to process those functions by having to understand relatively few meanings. Therefore, success via this kind of approach is much higher, for all students. The 30% figure of words found in novellas but not class texts represents classically-themed vocab that wasn’t really a part of everyday communication within my context.
The 410 figure is a bit higher than I would’ve guessed. Then again, it does represent all the “flavor text” that classes found compelling, but that wasn’t really important, or wasn’t needed to express much throughout the year. Also included in that figure are the one-offs that made sense for a particular story/text, often appearing just once or twice with meaning established in parentheses, and then never again in a different text. N.B. due to the compelling nature of texts/stories, and the language experience in the class while reading, it’s likely that at least some of those one-offs have been acquired! Still, it’s important to recognize that of those 550 words that were recycled, it’s likely that students have only acquired a third of the most frequent…of those most frequent…words, but this is expected. Language acquisition is sloooooooow, and this data helps us see that. Here’s a very small sampling of some individual word occurrences from our class texts (i.e. not novellas):
– vir (man) appeared 26 times throughout the year.
– semel (once) appeared…once (hahaha!)
– vult (wants) appeared 211 times, and its different forms 408 times.
– est (is) appeared 1071 times, as would be expected.
– gladius (sword) and its forms appeared just 17 times.
– hodiē (today) appeared 55 times.
– servus (slave) appeared just twice.
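Counts like these can also be reproduced outside of voyant-tools.org. The following is a rough sketch using only Python's standard library; it is not how Voyant works internally. The sample sentences are made up for illustration (they are not from the class texts), and `count_forms` is a hypothetical helper name. Note that it counts word forms exactly as written, so vult and volunt would be tallied separately.

```python
import re
import unicodedata
from collections import Counter

def count_forms(text):
    """Count each distinct word form, ignoring case and punctuation."""
    # Normalize so a macron vowel is one composed character (e.g. "ē"),
    # otherwise a combining macron would split words apart.
    text = unicodedata.normalize("NFC", text).lower()
    # Match runs of letters only (no digits, underscores, or punctuation).
    tokens = re.findall(r"[^\W\d_]+", text)
    return Counter(tokens)

# Hypothetical sample text, invented for this example.
sample = "Quīntus gladium vult. Rūfus gladium nōn vult. Hodiē Rūfus est laetus."

counts = count_forms(sample)
print(counts["vult"])     # 2
print(counts["gladium"])  # 2
print(counts["hodiē"])    # 1
```

Running this over a whole year of class texts would give a per-form tally like the sampling above; grouping forms under one dictionary word (e.g. gladius "and its forms") would need an extra mapping that this sketch does not attempt.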
What Does This All Mean?!
I don’t expect words appearing 25 times or fewer to be acquired. Some will, but many won’t. Also, it makes sense that a word like ventus (wind) was used just once, but I’m surprised that valdē (very) appeared just 4 times! I use that word a LOT during class, so I wouldn’t expect it to be absent from the written input. That’s not a big deal, though. Students understand valdē just fine. If, however, they were to negotiate meaning constantly during class, I could look into increasing exposure to that word. If I then wanted it to appear more in written texts, this data shows that I’d have to be more deliberate about that.
Other insights show that in our texts, tessera (password) was only used in that one form. If students showed signs of incomprehension every time I said tesseram or tesserīs, I COULD look into increasing exposure to those forms in the written input. I also see in the data that sunt (are) appeared 105 times, but sumus (we are) appeared just 18. However, I have ZERO plans to create first person plural “we” dialogue just because it appeared less frequently in the written input! After all, I use that word often in class, and haven’t needed to negotiate meaning of it. This data simply shows that sumus was lacking in the written input.
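To make the "25 times or fewer" cutoff concrete, here is a small Python sketch that flags low-exposure words. The counts are the ones quoted in this post for class texts; the cutoff of 25 comes from the expectation stated above, and `low_exposure` is a hypothetical helper name. Treat it as one possible way to scan the data, not a rule for what to teach.

```python
# Occurrence counts quoted in this post (class texts only, not novellas).
counts = {
    "vir": 26, "semel": 1, "vult": 211, "est": 1071, "hodiē": 55,
    "servus": 2, "ventus": 1, "valdē": 4, "sunt": 105, "sumus": 18,
}

# Cutoff taken from the post: words seen 25 times or fewer
# are not expected to be acquired from written input alone.
THRESHOLD = 25

def low_exposure(counts, threshold=THRESHOLD):
    """Return words at or below the threshold, rarest first."""
    flagged = [word for word, n in counts.items() if n <= threshold]
    return sorted(flagged, key=lambda word: (counts[word], word))

flagged = low_exposure(counts)
print(flagged)  # ['semel', 'ventus', 'servus', 'valdē', 'sumus']
```

A flagged word is only a prompt to think: valdē and sumus both land on the list, yet students understand them fine from spoken input in class.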
Conclusions? Next Steps?
I’m not worried about any lack of exposure to words infrequently occurring, and my novellas provide a wide net of grammar. Still, I could more deliberately unshelter grammar when writing class texts to see how much wider a net could be cast. Using voyant-tools.org more regularly, I’ll be able to ensure that more forms of words appear in class texts, as well as recycle one-offs/“flavor text” more so the numbers even out a bit more (e.g. 900 words, of which 600 or 700 appear more than a handful of times). In fact, this kind of analysis just screaaaams teacher eval goal! Of course, it will take a small amount of additional work, but I’m not concerned about that.
After all, I’ll continue to ensure that I have plenty of prep time to do so.
6 thoughts on “Sheltering Vocab & Unsheltering Grammar: 2018-19 Stats”
Very cool, Lance! You found 550 unique words (“lexemes” in corpus linguistics terms) and 2470 different forms (“lemmas”). So then was there an average of about four morphological variants per lexeme? For example “vult (wants) appeared 211 times, and its different forms 408 times.” Would you be up for adding a second post with contextualized examples of just two unique words/lexemes? You could show us vult ‘wants’ in all of its forms in the contexts of the sentences, plus one preceding sentence plus one sentence following, plus a mention of the overall story within which the three sentences were situated. Context adds “ecological validity” for the reader (us), i.e. showing us what motivated the use of these different morphological (word ending) changes, and, more to your interest, how it wasn’t simply the teacher contriving grammar for grammar’s sake (the situation pushed for it).
Oh man, great suggestion! Let’s see if I can deliver on that…
OK Reed et al., I have a few examples of 8 forms of “want.”
Very cool! I can see how the narrative contexts push for 1st, 2nd, 3rd person, and past and present tense. The stories wouldn’t make as much sense without those grammatical forms.