Cognates. Whether you love ’em or hate ’em, how much Latin are we talking about? When they’re used, do cognates end up comprising most of a book’s Latin? Are we talking half? A quarter? Less? And then what does that mean about the rest of a book? How much of a book’s Latin is being dismissed amongst the cognate concerns? One of the concerns is that cognates weren’t used by Classical authors, the claim being that using infrequent words is a problem, and the conclusion being that cognates are unhelpful for today’s Latin learner. I don’t have this concern, nor have I found evidence to support the claim. Nevertheless, I have been wondering just how much, or how little, of a book the concerns actually cover, percentage-wise. I also have been wondering if there were many words Classical authors themselves used that other Classical authors didn’t really use. N.B., I don’t mean hapax, just words that were rarely used by others. If Classical authors used rare words, too, that would help illustrate how the current cognate fuss is more about preference than anything else. Let’s see…
I knew the number of unique cognates found in my books (e.g., 11 cognates, 8 other words), but I didn’t have any figures about their percentage of the total (e.g., instances of cognates are ???% of all the Latin in the book). Therefore, I looked at my shortest books with the heftiest dose of non-Classical Latin cognates, and calculated the percentage of unranked word occurrences via Logeion. Unranked words are those that appear fewer than 50 times in the Classical texts used for frequency data. To give you a sense of what this means, the word sēcrētum is unranked (though it’s one of my favorite Latin words!). A word like clam, by contrast, was preferred by Classical authors. It’s their 2280th most frequent word, appearing a whopping 269 times in extant texts. To put this into perspective, the most frequent word is et, appearing 175,032 times.
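For anyone who wants to try the same arithmetic, here’s a minimal Python sketch of the percentage calculation. It’s an illustration only, not a record of how the numbers in this post were produced. The function names and the tiny counts table are hypothetical; only the counts for et and clam come from the figures above. It also matches words exactly as they’re spelled in the text, whereas a real frequency list works on dictionary headwords, so you’d need to lemmatize first.

```python
import re
import unicodedata

# The post's cutoff: fewer than 50 occurrences in the corpus = "unranked".
RANK_THRESHOLD = 50

def tokenize(text):
    """Lowercase the text and split it into words, keeping macrons intact."""
    # NFC keeps a letter and its macron together as one character.
    normalized = unicodedata.normalize("NFC", text.lower())
    return re.findall(r"[^\W\d_]+", normalized)

def percent_unranked(text, corpus_counts, threshold=RANK_THRESHOLD):
    """Percentage of word occurrences whose corpus count is below the threshold."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    # Words missing from the table count as 0 occurrences, i.e., unranked.
    unranked = sum(1 for t in tokens if corpus_counts.get(t, 0) < threshold)
    return 100 * unranked / len(tokens)

# Toy counts table (assumption): et and clam use the figures quoted above;
# sēcrētum is simply left out, so it is treated as unranked.
counts = {"et": 175032, "clam": 269}

print(percent_unranked("et clam et sēcrētum", counts))
```

With that toy sentence, one of the four words is unranked, so the result is 25%.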
A Note On Using Frequency Lists
Frequency lists are generated from all literature sources of a particular kind, such as the Classical period, but no one expects you or your students to read all Classical texts. So, clam is not gonna be 2280th on YOUR frequency list. To get a true frequency list, you’d have to run all your planned curricular texts through something like Voyant Tools to see a) what words are used, and b) how often they occur. Maybe a word like clam is unranked in YOUR frequency list (i.e., appears fewer than 50 times in the texts your students read). Then again, maybe it’s waaaaay more frequent. Either way, you still have a choice to make regarding clam: use it, or don’t use it. Frequency lists are just one data point. There are other things to consider. Of course, no one’s saying it’s a bad idea to start with the heavy hitters at the top of frequency lists, but they shouldn’t be treated as gospel. After all, even knowing DCC’s entire Top 1,000 list doesn’t get you as far as you might think.
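If you’d rather not use Voyant Tools, a few lines of Python can do a rough version of the same count. This is a sketch under assumptions: the two sample strings stand in for your planned curricular texts, the function names are made up for illustration, and it counts spelled forms rather than dictionary headwords.

```python
import re
import unicodedata
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words, keeping macrons intact."""
    normalized = unicodedata.normalize("NFC", text.lower())
    return re.findall(r"[^\W\d_]+", normalized)

def frequency_list(texts):
    """Return (word, count) pairs for all texts combined, most frequent first."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts.most_common()

# Placeholder texts (assumption): swap in what your students will actually read.
my_texts = ["Mārcus clam it.", "clam et clam"]

# Print each word with its rank and count in YOUR list.
for rank, (word, count) in enumerate(frequency_list(my_texts), start=1):
    print(rank, word, count)
```

In this toy example clam comes out on top, which is the whole point: a word’s rank depends entirely on which texts you feed in.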
Anyway, here’s what I found in terms of % cognate instances in my shortest books with a bunch of non-Classical Latin:
12% – Olianna et obiectum magicum
12% – Quīntus et īnsula horrifica (coming soon)
8% – Mārcus magulus
5% – Quīntus et āleae īnfortūnātae (coming soon)
What’s this mean? First and foremost, this means that 5% to 12% of these books have a higher chance of being understood by my students. A *higher* chance. This is a modest claim, and this claim has support. Consider Latin with no familiar words. This kind of Latin has absolutely no higher chance. A higher chance is better than no higher chance, even if it turns out to not happen. Not astrophysics. Next, 5% is not very much. 12% is not much more, either. Concerns over such a small portion of a text mean that the other 88% to 95% of the content is being dismissed. Finally, those with cognate concerns would find 5% to 12% of the books above unhelpful on the basis that those words weren’t frequent in Classical literature. It only makes sense, then, to take a look at infrequent words found in Classical authors that would be just as unhelpful.
A Note On Unhelpful Latin
So much fuss has been directed towards cognates. What about all the fancy rare words, or one-offs found in ancient texts? If any argument is to be made about characterizing the kind of Latin being written today as unhelpful (for beginners, no less), such an argument must consider the kind of Latin that was written centuries ago that’s also unhelpful. We know that obscure name-drops and unfamiliar places of the past get in the way of comprehension. So, how unhelpful might some unranked Caesar and Virgil be, percentage-wise? Let’s take a look and make sure we’re comparing apples to apples:
7% – First 6 chapters (800 total Latin words) in Caesar’s Book 1 of Gallic War
6% – First 120 lines (800 total Latin words) of Virgil’s Aeneid
What’s this mean? If the claim is that using rare unranked words is a problem, this claim must extend to the unranked words of Virgil and Caesar just the same. If the claim is really less about rare words (i.e., if the claim is a red herring), and more about the quālitās of the words themselves, then the concerns are about preference, and begin to take on the tone of linguistic imperialism (e.g., “oh, we wouldn’t want to use those words now would we?”). I’m no fool hoping to change anyone’s mind on cognates, but my search this week did show that even coveted Classical texts contain a comparable proportion of unranked words, and they should be taken into account as well.