How We’ve Been Wrong About Latin Word Order

Anyone who knows anything about Latin will agree that the language is SOV (Subject-Object-Verb). In more friendly terms, this means that Latin how Yoda speaks resembles it does. This common understanding is just one basic assumption that drives a lot of decisions and discussions. Yet, how certain are we that Latin is as SOV as we think…

Scholars have been analyzing Latin for a long time, so research on word order isn’t hard to find. Like all good Wikipedia articles, the one on Latin word order has a ton of citations and sources to dig through. Here are some fun facts from a 1918 article I accessed on JSTOR that looks at SOV frequency with esse and with all other verbs, in main clauses and subordinate clauses, in Caesar’s De Bello Gallico I & II and Cicero’s De Senectute:

  • For esse in main clauses, Caesar used SOV order just 10% of the time; Cicero, 33%.
  • In subordinate clauses with esse, both authors used SOV order about 62% of the time.
  • For all other verbs in main clauses, Cicero used SOV order 66% of the time; Caesar 90%.
  • For all other verbs in subordinate clauses, Caesar used SOV order 68% of the time; Cicero just 8% of the time!

Those stats are all over the place. One thing that stands out above all else, though, is that neither author wrote 100% in SOV order. That should mean something right there. However, the data doesn’t stop there! Pinkster (1990) shares many facts about Latin word order, summarizing Linde (1923), this time looking specifically at main clauses across literature:

  • Caesar uses SOV order 84% of the time.
  • Sallust, 76%
  • Augustine, 42%
  • Cicero, 33-54% (depending on work)
  • Varro, 33%
  • Egeria, 25% (yeah, I had no idea, either)

Cicero is interesting both for the range of SOV use across his own writing and for how far below Caesar he was, even at his highest use of SOV. I was so interested in these stats that I began doing a bit of counting on my own using much smaller samples of just a paragraph or so. Surely, if SOV were preferred, it would have shown up more in smaller samples, right? Wrong. I opened up, clicked on some authors, and counted both main and subordinate clauses in SOV order:

  • Cato De Agri Cultura praefatio, 76%
  • Caesar De Bello Gallico V 1.1-1.9, 73%
  • Cicero Pro Caelio 1, 63%
  • Petronius Satyricon 1, 60%
  • Livy De Urbe Condita I 1, 54%

Results from those very small samples (roughly 20-30 clauses each) are certainly higher than some averages Linde recorded, but nowhere near supporting the idea that Latin is as SOV as we thought. Think about it; the sample from Livy came in at barely over half, with that of Cato just edging beyond 3/4. That’s hardly enough consistency to qualify Latin as SOV! Still, there’s more data to be found. I decided to look at some poetry for comparison, which I expected to be much, much lower given the constraints of meter. The figures below represent all clauses that I counted, again both main and subordinate, that appear in SOV order:

  • Virgil Eclogues I-III, 38%
  • Catullus I-V, 34%
  • Virgil Aeneid (lines 1-50), 31%
  • Ovid Metamorphoses I (lines 1-50), 20%

This particular sampling shows that the masterful Virgil wrote in SOV order less than 40% of the time. However, poets are bound by meter. This is a good comparison when looking back at the prose authors. If SOV word order truly were as desirable as we’ve thought, wouldn’t writers of prose—with complete freedom unbound by meter—be using it waaaaaaaaaaaaaay more often than the data shows?! Yes. Yes, they would, but the numbers aren’t compelling enough to support that. Sure, it seems that Caesar had a major affinity towards SOV. Other writers, not so much, and Cicero—THE Cicero—had much lower SOV order than one might expect for the model of golden age Latin. If Cicero wrote in SOV order—at most—about half the time, we need to reorient our thinking.

The facts are clear: historically, most authors haven’t written even mostly (75%+) in SOV order. So, where is the idea of SOV coming from?! Honestly, I thought it might be textbooks brainwashing us into expecting SOV at every turn—perhaps with the primary goal of students reading Caesar—so I looked into a few of those. However, reality paints a slightly different picture:

  • Cambridge Latin passages, Stages I-VI, 72%
    • Like Caesar, nearly EVERY verb was last except for forms of esse
  • Ecce Romani I-VI, 65%
  • LLPSI (Lingua Latina Per Se Illustrata) I-VI, 63%

Although the numbers are indeed on the higher side, textbooks still aren’t close to 100% SOV. Cambridge certainly sticks out as mirroring the style of Caesar, but its Latin hasn’t really been regarded as the best among critics, anyway. In fact, I’ve seen LLPSI praised most for its Latinity, yet it has the lowest average of SOV. BTW, at this point, if anyone is trying to cite word order as lack of Latinity, those claims just don’t hold up!

Regarding Latinity, there certainly has been a stigma against using Latin in any word order other than SOV, although it should be clear by now that there’s no reason for it. Still, I haven’t come across anyone speaking or writing exclusively in an English-like word order, either (i.e. 0% SOV)! Even when English-like word order is used as a strategy for making Latin more comprehensible, the percentage of SOV is actually a lot higher than we think. Curious, then, I did a quick analysis of the VERY BEGINNER novellas (i.e. 20-40 unique words) I’ve written. To be honest, I expected Rūfus lutulentus to be super low because I intentionally used English-like word order as a comprehension strategy. Surprisingly enough, it matches the SOV writing of Egeria, and isn’t far away from Cicero’s lowest use (33%) of that word order! Here are the percentages of SOV, which reflect the same range of all the Latin I’ve looked at for this post, whether BC, AD, or modern textbook Latin:

25% – Rūfus lutulentus
44% – Rūfus et Lūcia: līberī lutulentī
55% – Syra sōla
54% – Pīsō perturbātus
86% – Rūfus et arma ātra

What Next?
My hope is that current authors will worry less about word order as a rule, instead using Latin’s flexibility to be more comprehensible, emphasize certain words, and expose students to the inflected endings that create meaning rather than fixed patterns that don’t really exist in the literature outside of Caesar. I also hope more teachers will feel more confident speaking in whatever order Latin flows out. After all, even authors who had time to edit their writing didn’t largely prefer SOV order. Moreover, no one insisted on SOV exclusively, and some of the best were very, very far away from writing in that style.

Suffice it to say that, at least according to the data I’ve found, most Latin authors didn’t even mostly use SOV. Therefore, Latin is NOT as SOV as we’ve thought.
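
For anyone who wants to check that tally, here is a minimal Python sketch that uses only the percentages quoted in this post. A few things in it are assumptions made for illustration: Cicero's main-clause range is entered at its upper end (54%), the 75% cutoff stands in for "mostly," and the names `LINDE_MAIN_CLAUSES`, `PROSE_SAMPLES`, and `mostly_sov` are made up for this example.

```python
# SOV percentages as quoted in the post.
# Assumption: Cicero's main-clause range is entered at its upper end (54%).
LINDE_MAIN_CLAUSES = {
    "Caesar": 84, "Sallust": 76, "Augustine": 42,
    "Cicero": 54, "Varro": 33, "Egeria": 25,
}

# The small prose samples counted for the post (main + subordinate clauses).
PROSE_SAMPLES = {
    "Cato": 76, "Caesar": 73, "Cicero": 63, "Petronius": 60, "Livy": 54,
}

MOSTLY_SOV = 75  # the post's working cutoff for "mostly" SOV


def mostly_sov(figures, threshold=MOSTLY_SOV):
    """Return the authors (alphabetically) whose SOV share meets the cutoff."""
    return sorted(name for name, pct in figures.items() if pct >= threshold)


for label, figures in (("Linde", LINDE_MAIN_CLAUSES), ("samples", PROSE_SAMPLES)):
    hits = mostly_sov(figures)
    print(f"{label}: {len(hits)} of {len(figures)} at {MOSTLY_SOV}%+ ({', '.join(hits)})")
```

On these figures, only Caesar and Sallust clear the cutoff in Linde's main-clause numbers, and only Cato does in the small samples. It's just a tally of the numbers above, not a new analysis.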

9 thoughts on “How We’ve Been Wrong About Latin Word Order”

  1. This is really useful information and can make us feel better about not sticking to 100% SOV when not using esse.

    But SVO is not the only alternative to SOV. How often is the pattern OVS or VOS (suspense or shock in holding the subject to the end)? I would suspect SVO occurs more often than the other two, but it’s a guess, not research-based. And how much more often, I’m not willing to hazard even a guess.

    Thanks for putting so much time into the research!

  2. This makes me wonder what other possible linguistic influences may have been present. Did all authors get raised in Latio? From where did their mothers hail (mother tongue/dialects)? Did any authors get raised a portion of the time in more than one province? Or perhaps forced to study foreign languages for diplomacy? Etc. Do we know enough about these authors to study the reverse: linguistic forensics? It might give us clues to the proto-Phoenician dialects and language.

  3. In a way this seems like a distinction without a difference.

    To come right out with it. Yes, “Classical” Latin is indeed an SOV language. Variation of this kind is relatively common in languages with pragmatically determined word order, and says very little about whether a specific word order is preferred for unmarked, pragmatically neutral utterances. It’s true that Latin word order was not “rigidly” SOV in the way Persian, say, is. But the variation from that unmarked pattern is itself not unconditioned or random. I wish western Euroglots didn’t so casually describe Latin word order as “free”. It wasn’t free. It was sensitive to a very different type of constraint than the ones speakers of Western European languages are familiar with.

    Russian word order is basically SVO, but pragmatic considerations can at times push the percentage of genuine SVO sentences in literary prose well below the 75% mark. Emotive speech in particular may cause rightward or leftward dislocation of the head verb a lot of the time. If I say in Russian “Boris vzyal knigu” (Boris took the book) this is “unmarked”. But “Boris knigu vzyal” could mean “It was the book, (not the DVD), that Boris took”. But “Knigu vzyal Boris” might introduce double focus: “Boris took the book (whereas Natasha took the DVD)”. Even the pragmatic determinants that produce these word orders in Russian are not always the same. Word order in emotive speech is different from simple reported fact. None of this makes Russian any less of an “SVO” language at its fundament.

    Classical Latin is not a typologically idiosyncratic language, nor is it non-configurational. Its word-order is often pragmatically driven, even as the SOV baseline continues to remain the unmarked alignment. But there are complicating factors.

    It has in fact become conventional among linguists who work on Latin syntax to stress the part played by pragmatic factors of this kind in determining variability of word order (e.g. between OV and VO) in Latin texts. Take a look at Devine & Stephens’ “Latin Word Order” and it becomes clear that tabulating percentages in main clauses without regard to pragmatic context is not the most useful way of understanding word order in a language like this.

    Now in most texts there is a good deal of variation, and this can be put down in early texts especially to the needs of emphasis and other pragmatic factors. But there are two other points that need to be stressed.

    First, as Devine & Stephens note well, in the very earliest recorded Latin, the syntax seems closer to being truly non-configurational. The word-order of verse I think preserves something of that, and for obvious reasons. But in the “Classical” Period the pragmatically driven word-order was probably already in the process of shifting to a more strictly configurational type familiar from modern European languages. I don’t mean that in a teleological sense, as if the language had a scripted destiny with Romance as its endpoint, but you take my meaning.

    Between the Late Republican period and Romance there was a shift from a language in which OV features were common (and in e.g. Caesar, predominant) to a kind of language in which VO had become widespread and unmarked. In the many centuries in between there isn’t exactly a chronological progression from OV to VO. What we see is a lot of variability determined in part by the sorts of pragmatic factors already mentioned, but not entirely. Which brings me to point number two:

    In post-classical Latin there are certain texts — most of them containing other markers of informal language, and some containing signs of influence by genuine lower sociolects — that have obvious and marked VO characteristics, with comparatively little to speak of in the way of variability, pragmatic or otherwise, esp. in main clauses. Such texts include the letters of Claudius Terentianus, the Peregrinatio Aetheriae, Pompeius’ Grammar, the bilingual colloquia, and much of the New Testament portion of the Vulgate.

    The VO features of this rather varied corpus cannot be due merely and entirely to pragmatic determinants and this becomes obvious when they are set against many other texts of comparable content in the same periods. The only explanation that really holds much water is that in higher literary language, conservatism retained greater force, as it often does. This very likely not only affected writing but copying as well. There is moderate variation in word order in attested MSs of some texts, which leads to the possibility that the word-order used in copying a text out might be subject to interference from the copyist’s stylistic sensibilities. What’s more, in the later Empire (as the range of the “textual past” got longer and longer) authors could probably wind up imitating older models more and more indiscriminately, with varying levels of interference from their vernacular grammar (this is to be seen especially in the work of Pompeius). This makes it hard — and sometimes impossible, when not simply pointless — to disentangle the synchronic and the diachronic, the more so in archaizing genres.

    Imitation of earlier models will have caused the preservation of a variability that included OV features. But a different style — with less pragmatic variability, and far more VO features — does seem to have been emerging in less literary texts. In light of later history, one can only assume that this was more reflective of what was going on in speech, though the influence of the Latin Bible’s translationese (through which Biblical Greek word order was filtered) must have been a contributing factor (if only by establishing a precedent that made the VO style more licit in some kinds of writing than it might otherwise have been.)

    But the sense that certain normal features of speech (like VO word order, and also eventually e.g. use of ille/ipse morphs as a definite article) ought to be avoided in higher styles lasted for a very long time, probably for as long as people felt that what they spoke was Latin, which is to say right up until the 9th century. Even if you look at the Strasbourg Oaths, generally agreed to be the first instance of “written Romance”, you can see the high-style avoidance of specific features (like the definite article) which based on other evidence we now know must have been present in speech. And just check out the OV word order of a sentence like:

    “Si Lodhuuigs sagrament quæ son fradre Karlo iurat, conseruat, et Carlus meos sendra, de suo part, non lostanit, si io returnar non l’int pois, ne io, ne neuls cui eo returnar int pois, in nulla aiudha contra Lodhuuuig nun li iu er.”

    The last point I want to make is that even our idea of “Classical Latin” (traditionally the Latin of the Late Republic and Early Empire) encompasses the language of around two-to-three and a half centuries of linguistic change. The rules governing pragmatic word-order in Caesar’s syntax and those of, say, Apuleius (separated as they are by about two centuries) are probably not identical. I would in fact say they are clearly different. If one were to try and derive the determinants of word-order for Icelandic based on a corpus stretching from the 18th century to the 21st, or for Italian based on a corpus extending from the 14th century to the 17th, one would wind up with some quite erroneous, and inscrutable, conclusions if one did not allow for change over time.

    Using percentages, as you do, from authors separated by a century or more is thus a problem. Comparing Augustine and Petronius with Caesar and Cicero is not comparing like with like. One must make allowances for the interaction of synchronic grammar with literary precedent set by previous generations. History authors like Livy and Tacitus especially were prone to a lot of syntactic experimentation. On the other hand it is pretty hard to avoid the conclusion that, for late authors like Pompeius (who appears to have been composing via dictation at least in part), SVO was the norm in speech, whatever they were inclined to do when they wrote.

  4. I understand that we should write and speak Latin just as it occurs in our minds or pens, respecting its morphology, without the trouble of always trying to put words in SOV order. SOV is a procedure imposed by school books and teachers who do not dare to think Latin can and may be treated as a modern language for daily and general use, and especially in the universities. The Romance languages, modern Greek and the Slavic languages should serve as examples of the syntactic simplicity we can apply to Latin. The author of this page has proved that SOV need not worry us anymore. I disagree with the view that the varieties of Latin used by the numerous earliest writers should be considered different languages unsuitable for syntactic comparisons. Latin and Greek are intellectual and cultural treasures of our Western civilization. To strive to keep them alive and accessible for all should be the aim of our schools at this turning point of our Western history; therefore, to scrap SOV is an urgent need. Darcy Carvalho. FEA-USP. São Paulo. Brazil, 2022.

