Voldemort has nothing on bears!

Historical linguistics is an incredibly fascinating subfield of linguistics with so many areas to explore. Rather than focussing on the language that we have now, historical linguists spend their days looking through old documents to try and understand how language has evolved over long periods of time.

It turns out there are so many consistent changes that have happened over time and doing this type of analysis has taught us so much about the universal rules of language change. We are going to talk about one peculiar case of language change, though, that was brought about by a strange superstitious belief. Let’s talk about the English word “bear”.

Photo by Photo Collections on Pexels.com

Before we talk about “bear”, let’s introduce some historical knowledge and general knowledge about language families. As you may already know, all of the languages in the world can be divided into groupings of languages called families that are related to each other in some way and share a common ancestor like a family tree of sorts. Two of the most well known language families that we will be focussing on today are languages in the Romance family (Spanish, Italian, French) and languages in the Germanic family (German, Dutch, English). There are many more language families out there that are all doing their own unique things, but these are the two that we will talk about for today.

First, let’s dive into Romance languages. Romance languages are all directly descended from Vulgar Latin and as a result of this close common ancestor, they share much of their vocabulary and grammatical rules. There are of course large distinctions in pronunciation and such that have developed over time, but you have likely noticed in your own life that a lot of the words in a language like French are quite similar to Spanish and Italian.

So let’s bring it back around to “bear”. “Bear” is no exception to the above facts. The French word for “bear”, ours/ourse, is similar to the Spanish oso/osa, which is similar to the Italian orso/orsa. These are all so similar because they derive from the Latin word ursus (feminine ursa), which is likely not a huge surprise when you think about the name of the constellation “Ursa Major”, which was named using the Latin word for bear.

This is all very cool and interesting, but we need to remember at this stage that Latin was not the first language on earth. It’s not like Latin just showed up and created the word ursa for bear and things evolved over time. If we back it up even further, we arrive at a language that is known as Proto-Indo-European. Proto-Indo-European (or PIE for short) is a theorized language that existed from around 4500 BC to 2500 BC. I say theorized because, at this point at least, there are little to no written records that prove the existence of PIE. The reason we believe PIE existed is that there are many examples of languages in Eastern Europe and Western Asia that have common words and patterns, and they can all be traced back to this hypothetical ancestor in some way or another.

A quick example of this can be shown by looking at some Italian and English comparisons. For instance: piede and foot, padre and father, pesce and fish… there are so many words in just these two languages that have developed different forms over time, but their patterns are very consistent. The “p” sounds in Italian seem to be roughly parallel to “f” sounds in English in all of these words, for instance. Now I know this is just three words in two languages, but trust me when I say that there are hundreds of examples across dozens of languages that give extra weight to this theory.
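To make the idea of a consistent pattern a little more concrete, here is a minimal sketch that tallies which word-initial letters line up across the three cognate pairs above. It is only an illustration: the function name is made up for this post, and it uses spelling as a rough stand-in for sounds, which real historical linguists would not do.

```python
from collections import Counter

# The three Italian/English pairs from the paragraph above.
COGNATES = [("piede", "foot"), ("padre", "father"), ("pesce", "fish")]


def initial_correspondences(pairs):
    """Count how often each pairing of word-initial letters occurs.

    Assumption: the first letter is a good enough proxy for the
    first sound, which happens to hold for these three pairs.
    """
    return Counter((italian[0], english[0]) for italian, english in pairs)


counts = initial_correspondences(COGNATES)
for (italian, english), n in counts.most_common():
    print(f"Italian {italian!r} lines up with English {english!r} in {n} pair(s)")
```

With only three pairs this proves nothing on its own, but the same kind of tally over hundreds of cognates is what makes a correspondence like “p” and “f” hard to dismiss as a coincidence.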

So if we trace the Latin word for “bear” back to PIE, we end up with something like this: *h₂ŕ̥tḱos (note here that the asterisk is to mark the fact that this is a hypothetical reconstruction based on comparing many many languages. Like I said, we don’t actually have writings that include this word).

Where this starts to get really interesting is the fact that this PIE word can be traced down to other languages in the Indo-European family that are not Romance languages. Let’s look at the Greek word for “bear” now.

In Greek, the word for bear is άρκτος (pronounced “arktos”) and you can notice two things from this. First off, the word arctic in English is derived from this in some way, which is how we have something like the Arctic Ocean, because this is the ocean that is in the northern direction where the Ursa Major constellation is (it’s all tying together again!). The second thing that you can notice is that “arktos” looks roughly like how the PIE word for “bear” would be pronounced. This is giving us more evidence that maybe PIE is a real thing and that all of these languages are tied together!

Photo by Magda Ehlers on Pexels.com

Now those of you with keen perceptions may have noticed something that I did when I first introduced PIE. I used evidence of English and Italian words to convince you that PIE was real. But if PIE is real, and we can trace back words to this common ancestor… how the heck did we end up with bear instead of something more closely related to *h₂ŕ̥tḱos?

It turns out that it’s not just English that has this “bear” problem either. This is where we start to talk about the Germanic lineage. Germanic is also descended from PIE, but not from Latin (English has a lot of Latin influence, but let’s thank the Norman invasions for that). The family tree in this instance splits off directly from PIE and gives us two subfamilies; one with Latin that bears (no pun intended) Romance and Mediterranean languages, and the other side with the Proto-Germanic languages. There are many other divisions and such, but we are only going to talk about these two for now.

All of the Germanic languages have similar words for “bear”. German has Bär, Dutch has beer, and the Scandinavian languages (also descended from Proto-Germanic) have it too (Swedish björn and Norwegian bjørn, for instance). So with this evidence, it is clear that something happened to the Proto-Germanic word for “bear” that caused this shift.

Let’s take a closer look at some of the Romance languages and see what words that they do have which are similar to “bear” and it might give us a clue.

French has the word brun meaning brown, which came through the Late Latin word brunus, itself borrowed from a Germanic word for “brown”. While these may not appear to be exactly the same as “bear” on the surface, it turns out when we trace back the word “bear” in the Germanic family that it is in fact derived from the word “brown” somehow.

So what happened to the Proto-Germanic people to make them start referring to “bears” as “the brown ones”? Historical linguists theorize that the Proto-Germanic people were very superstitious, and because of their superstitions, they were worried that calling a “bear” by its true name would bring it into your life somehow and increase the likelihood of “bear” attacks in one’s life.

By a process known as euphemism, it is thought that the Proto-Germanic people collectively started to refer to *h₂ŕ̥tḱos as “the brown one” simply because they needed a way to talk about them without risking summoning them to their camps or hunting excursions.

The Proto-Germanic people had their own version of “he-who-should-not-be-named”, but instead of being some literary euphemism, it ended up influencing thousands of years of language use and giving me something to write about! This is the sort of stuff about language that I find truly fascinating. The fact that we can take this weird thing that you have likely never given a second of thought to and develop entire theories and papers and blog posts to talk about it in an informed and educated way.

So the moral of the story is: keep your friends close and your *h₂ŕ̥tḱos far away by calling them bears instead!

Thank you for reading folks! I hope this was informative and interesting to you. Be sure to come back next week for more interesting linguistic insights. If you have any topics that you want to know more about, please reach out and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.

There’s an elephant in my pajamas!

Last night I shot an elephant in my pajamas.

Elephant in pajamas – by: Amy Block

A sentence like the one above has two possible meanings, even though you probably only thought of one. One option is the logical meaning where “I” am the one wearing the pyjamas while the elephant is being shot. The other possible meaning is that “the elephant” is the one in my pyjamas last night and that’s why I shot it. Now obviously, this meaning is a bit of a stretch (ha!), but that’s only because it is an elephant that was shot. If you change out “elephant” for something a little more realistic, it is easier to convince yourself of this alternate meaning.

Last night I shot a burglar in my pyjamas.

Here you can likely imagine both interpretations, although it does raise the new question of why is this burglar wearing your pyjamas?

There is also a way that we can modify this sentence so that the “I” subject is likely not the one that is “in” something.

Last night I trapped a burglar in my closet.

Just by changing two words, we have made it so it is most likely the burglar who is in the closet, and not me.

Now obviously these sentences are just one silly example of how changing a word or two can change how we might interpret a sentence, but ambiguous sentences show up quite often in one context in particular: newspaper headlines.

In a newspaper headline like this, we can see the same kind of ambiguity problem. Namely, there is a prepositional phrase (with knife) at the end of the sentence that could reasonably apply to either the subject of the sentence (the cops), or the direct object of it (the man).

Sentences like these don’t often pose a problem for us because we have our own logic and intuition to rely on. Let’s take it one step further and imagine the effects that this might have on a computer. If a computer were to try and “read” these sentences, what conclusion do you think it would draw?

Computers rely on several processes when it comes to interpreting language, but one of the biggest ones (and the easiest one to explain here) is known as statistical learning. Statistical learning is a process by which you take a large set of data, known as a training set, and feed it to a computer program that reads the data one chunk at a time, and makes note of what comes after each chunk. These chunks can be set to a certain number of words to be processed all at once, known as a window.

If you feed the computer a large enough set of data, you can then ask it to start making predictions (like you see in the predictive text on your phone). The computer is able to make guesses on what is most likely to come next based on how often that combination appeared in the training data that was fed to it. This is where all of the statistical stuff comes in.

This process is all very math heavy and quite hard to wrap your head around, but let’s try and simplify it with an example. Imagine I asked you to fill in the remainder of this phrase:

To kill two birds with one _______.

If you guessed stone, then congratulations! Your internal statistical learning system is working normally. If you put in a word like bullet, you are not necessarily incorrect based on your own experience. It might just mean you are working from a different set of training data from most people and you are not familiar with this idiom.

The idiom “to kill two birds with one stone” is very common in North American English and you have likely seen or heard it so many times that you can intuitively know how to finish it. You can probably think of other examples too where after seeing one word come up, you would know for certain what the next word is.

Computers are working on the exact same principle that you just employed to complete that idiomatic expression, but they are doing it on a much different level than you are. Being able to change the scale of the “window” (how big of a chunk) that they are looking through allows them to notice patterns in language that you or I could never notice on our own.

The biggest problem with this from a computing standpoint is that memory is finite for computers so if you make these windows too big, the computer will not be able to handle it. If you make it too small, you won’t get enough useful data to make good predictions. You were able to easily predict the last word of that idiom because you have a large window and you are able to have access to the entire sentence at once. Imagine you were only able to see something like “with one ___”. It would be a lot harder to make a good prediction with this small amount of information.
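Here is a small sketch of that idea in Python. It is not how any particular product works; it is a toy model that counts which word follows each run of words, with the size of the run (the window) as a setting you can change. The training sentences and the function names are invented for this example.

```python
from collections import Counter, defaultdict


def train(sentences, window):
    """Map each run of `window` words to a count of the words that followed it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - window):
            context = tuple(words[i:i + window])
            model[context][words[i + window]] += 1
    return model


def predict(model, context_words):
    """Return (word, count) guesses for what comes next, most frequent first."""
    counts = model.get(tuple(word.lower() for word in context_words))
    return counts.most_common() if counts else []


# A tiny, made-up training set. A real one would have millions of sentences.
TRAINING = [
    "to kill two birds with one stone",
    "she killed two birds with one stone",
    "he opened the jar with one hand",
    "they won the game with one goal",
]

small = train(TRAINING, window=2)  # the model only sees "with one"
large = train(TRAINING, window=4)  # the model sees "two birds with one"

print(predict(small, ["with", "one"]))
print(predict(large, ["two", "birds", "with", "one"]))
```

With the two-word window, “stone” is still the top guess but “hand” and “goal” are also on the table. With the four-word window, “stone” is the only candidate. The trade-off is that the bigger window has to store many more distinct contexts, which is the memory problem described above.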

Another problem is, computers don’t know the meaning of these phrases that they are reading and predicting. This leads us back to the ambiguous sentences from the beginning of this post.

Imagine you could design a program where you could give a “trained” computer the sentence “I shot an elephant in my pyjamas” and then ask it who was wearing the pyjamas. The computer would likely wrongly assume that the elephant was the one in the pyjamas because more often than not in English, when we have a preposition like “in” after a noun, it is meant to be associated with that nearest noun.

There is a chance that the computer might be tipped off by the fact that they are MY pyjamas though, and because of this first person possessive pronoun it would correctly associate them with “I”. What about a sentence that only uses inanimate objects and pronouns?

The trophy would not fit in the cabinet because it was too big.

We as humans are able to reason that the trophy being too big is the most likely problem here. But again, the computer would likely make the wrong prediction here because it would want to associate the pronoun “it” with the closest possible noun in the sentence.
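The “closest noun” strategy can be written down in a few lines. This is a deliberately naive sketch to show why that strategy goes wrong, not a description of how real language software resolves pronouns. The function name is hypothetical and the list of nouns is labelled by hand.

```python
def nearest_noun_antecedent(tokens, nouns, pronoun="it"):
    """Naive heuristic: link the pronoun to the closest noun before it."""
    pronoun_index = tokens.index(pronoun)
    for token in reversed(tokens[:pronoun_index]):
        if token in nouns:
            return token
    return None  # no noun appears before the pronoun


sentence = "the trophy would not fit in the cabinet because it was too big"
tokens = sentence.split()
NOUNS = {"trophy", "cabinet"}  # hand-labelled for this example

guess = nearest_noun_antecedent(tokens, NOUNS)
print(f"The heuristic thinks 'it' refers to: {guess}")
```

The heuristic picks “cabinet” because it is the last noun before “it”, while a human reader picks “trophy” because of what we know about trophies, cabinets, and things being too big.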

All these sentences can be easily disambiguated to ensure that the computer makes the right choice every time.

I shot an elephant while I was in my pyjamas.

The trophy would not fit in the cabinet because the trophy was too big.

Without any ambiguities the computers will be happier knowing that they can understand the sentences just like we can. All of this is to say that when you are writing, be kind to your computer and make sure that you are writing in clear, unambiguous sentences for their benefit too.

Alternatively, the takeaway might be that we should write needlessly ambiguous sentences to confuse the computers and hope it slows down the inevitable terminator-style uprising. I’ll leave the interpretation of this blog post to you the reader.

You, me, and haiku makes 5/7/5

Photo by Poppy Thomas Hill on Pexels.com

If I asked you to tell me the structure of a haiku poem right now, you would probably say that it is three lines consisting of five, seven, and five syllables in that order.

If you were talking about the traditional Japanese method of writing haiku, that wouldn’t quite be right. To understand how haiku are written, we need to talk about Japanese syllable structure.

Japanese is a language that prefers simple CV syllables. This means that a typical syllable in Japanese consists of a single consonant sound (C) followed by a single vowel sound (V). All of this is not to say that there are no exceptions, because it just wouldn’t be a rule if there weren’t exceptions. The moral here is that the Japanese language does not like to have consonants at the end of a syllable (known as coda consonants).

To see this in action, let’s look at some of the English words that have made their way into Japanese as “loan words”. Words like “glass” and “restaurant” in English become ガラス “garasu” and レストラン “resutoran” when they are adapted into Japanese.

You will notice in a word like “ga.ra.su”, the entirety of the word conforms to this strict CV structure, while in “re.su.to.ran”, the majority of the word is CV with the last syllable having a consonant at the end. Like I said, there are always exceptions to rules. Whether a sound can be a coda consonant depends on the type of sound it is.

In Japanese, coda consonants are permitted for nasal sounds like /n/ and /m/, but it is only the case that they are permitted to be codas. It is not the case that they will always be coda consonants if they show up after a vowel. There is a whole complicated set of circumstances that determine the actual structure with them, but for now, let’s just say that nasal consonants are special in Japanese.

So now that we know a little more about Japanese syllables, let’s talk about what a syllable actually is. You can think about syllables as more than just consonants and vowels, and start to think about them as a hierarchical structure, as in the diagram below.

Syllable tree of the word “kite”

In a structure like this, syllables are composed of several interlocking parts. At the centre of every syllable is a vowel. Borrowing a term from science class, we refer to the vowel of a syllable as its nucleus. This is why children are taught to count the vowel sounds in words to determine how many syllables there are because every syllable requires a vowel (with exceptions as always). Everything before the vowel in a syllable is known as the onset, and everything after the vowel within a syllable is known as a coda. Taking the nucleus and the coda of a syllable together, we get a unit that is known as the rhyme of the syllable, which makes sense if you consider the fact that this is the part of a syllable that makes it rhyme with other syllables.

In a language like English, syllables can be composed of complex onsets and complex codas. You can see this by looking at a word like “strength” which is a single syllable word that has a lot of consonant sounds on either side of the vowel. With the Japanese tendency to prefer CV syllables, you can start to see a pattern where syllables are only allowed to have simple onsets composed of a single sound, and only in rare cases are they allowed to have coda consonants.
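If it helps to see the hierarchy as a data structure, here is a minimal sketch that splits a single syllable into its onset, nucleus, and coda. The transcriptions /kaɪt/ for “kite” and /strɛŋθ/ for “strength” are one common way of writing these words, and the small vowel set and the names below are assumptions made just for this example.

```python
from dataclasses import dataclass

# Assumed vowel inventory for this sketch; a diphthong counts as one segment.
VOWELS = {"a", "e", "i", "o", "u", "ɛ", "ə", "ʌ", "ɑ", "aɪ"}


@dataclass
class Syllable:
    onset: list    # everything before the vowel
    nucleus: list  # the vowel at the centre of the syllable
    coda: list     # everything after the vowel

    @property
    def rhyme(self):
        """The nucleus and the coda taken together."""
        return self.nucleus + self.coda


def parse_syllable(segments):
    """Split the sound segments of ONE syllable into onset, nucleus, and coda."""
    first = next(i for i, seg in enumerate(segments) if seg in VOWELS)
    last = first
    while last + 1 < len(segments) and segments[last + 1] in VOWELS:
        last += 1
    return Syllable(segments[:first], segments[first:last + 1], segments[last + 1:])


kite = parse_syllable(["k", "aɪ", "t"])
strength = parse_syllable(["s", "t", "r", "ɛ", "ŋ", "θ"])

print(kite)
print(strength.onset, strength.rhyme)
```

“Kite” comes out with a simple onset and a simple coda, while “strength” has three consonants in its onset and two in its coda, which is exactly the kind of syllable that Japanese avoids.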

If you recall from my post on expletive infixation {HYPERLINK}, I talked about the concept of prosodic feet and how they are composed of at least two syllables, with one of those syllables usually being stressed. Well, there is a smaller unit of measurement that we care about too, one that sits below the syllable. This unit is known as a mora, and it is represented by the Greek letter ‘mu’ (μ).

Morae are what give us the concept of “syllable weight”, which boils down to the principle that syllables with more morae are “heavier” than syllables with fewer morae. When it comes to assigning these morae to a syllable, they can only be associated with the rhyme of the syllable. No matter how complex the onset of a syllable is, it isn’t going to contribute any morae to the syllable as a whole. Coda consonants on the other hand can contribute weight to the syllable. In Japanese, where coda consonants are already so rare, those coda consonants almost always contribute to the weight of a syllable.

Morae are usually assigned on a one-to-one basis where a vowel will have a single mora, and every consonant sound in the coda will be assigned one mora. If you have a long vowel though (a vowel that is held for a longer duration), it will be assigned two morae.

To hear what this sounds like, listen to these two clips of the Australian pronunciation of the words “cut” and “cart”.

Australian English speaker pronouncing “cart”
Australian English speaker pronouncing “cut”

Notice how both of these use the same vowel sound, but the vowel in “cart” is longer than the vowel in the word “cut”. This is an example of contrastive vowel length where two words are pronounced using the same vowel, but there is a difference of length that gives rise to the distinction between the two words.

In Japanese, a syllable with no coda traditionally has a single mora on the vowel. However, Japanese also has long vowels, meaning that it can have bimoraic syllables without a coda consonant. Looking back at “garasu” and “resutoran”, you can count it out and see that “ga.ra.su” has three syllables with a single mora each, while “re.su.to.ran” has three syllables with a single mora each and one syllable that has two morae because it has a coda consonant.

Getting back to haiku (that was what we were talking about, right?), a traditional Japanese haiku is not composed of five syllables followed by seven syllables followed by another five syllables. Many scholars argue that these haiku care about the number of morae in the words rather than the syllable count. So if you had a total of four syllables in the first line of a haiku, but one of those syllables contained a long vowel or a coda consonant, this would be a perfectly acceptable haiku line.
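The counting rules from this post can be sketched in a few lines of Python. This is a simplification built on some assumptions: the words are written in the Latin alphabet with dots between syllables (as in “ga.ra.su”), a long vowel would be written as a doubled vowel letter, and the function names are made up for this example.

```python
VOWELS = set("aeiou")


def morae_in_syllable(syllable):
    """Count the morae in one romanized syllable.

    The onset contributes nothing. Each vowel letter and each coda
    consonant in the rhyme contributes one mora.
    """
    # The rhyme starts at the first vowel letter; everything before it is onset.
    start = next(i for i, letter in enumerate(syllable) if letter in VOWELS)
    rhyme = syllable[start:]
    return len(rhyme)


def count_syllables_and_morae(word):
    """Return (syllable count, mora count) for a dot-separated word."""
    syllables = word.split(".")
    return len(syllables), sum(morae_in_syllable(s) for s in syllables)


for word in ["ga.ra.su", "re.su.to.ran"]:
    syllables, morae = count_syllables_and_morae(word)
    print(f"{word}: {syllables} syllables, {morae} morae")
```

Under these rules “ga.ra.su” has three syllables and three morae, while “re.su.to.ran” has four syllables but five morae. That means a word like “resutoran” would fill a five-mora haiku line all on its own, even though a syllable count would say it comes up one short.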

Now this is not to say that you should email your high school English teacher and tell them that they lied to you for all those years, but I hope this was at least a little informative and interesting to you. There are so many interesting things that I only just barely scratched the surface on here. Japanese is an incredibly interesting language to study, and the phonology that I talk about here is skipping over some very cool stuff. If you have the time, I suggest you read a little more about it here:

All that said, come back for more next week and in the future as I am sure that we will be returning to Japanese phonology again at some point. If you have any topics that you want to know more about, please reach out and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.

Why I Am Never Stressed.

Never stressed.

Two words that I have inked into my arm that carry two very important meanings.

The first and most important meaning to me is to act as a reminder to stop sweating the small stuff (spoiler alert, it only works about half the time). The second meaning corresponds to the symbol that you see above it. I don’t mean the little blue bird looking thing, we’ll talk about that soon I promise. It is associated with that upside down ‘e’ looking thing in the speech bubble.

This symbol is known as a schwa, and it stands for a mid central vowel on the IPA vowel chart that, at least in English, shows up mainly in unstressed positions. Now I know that there are some terms in that sentence that you likely haven’t heard about before and I promise, I am going to unpack them. Don’t stress about it.

First, let’s talk about the IPA. The International Phonetic Alphabet, or IPA for short, is an alphabetic system based primarily on the Latin script that serves as a one-to-one representation of the sounds humans make when they are speaking. This is a globally used alphabet that allows linguists to communicate about speech sounds regardless of the language that they are spoken in. The written IPA output is completely removed from meaning and only serves to describe how people are speaking without caring about what the actual words mean.

Think of it like this. If you were asked to listen to a language from a remote village in some foreign country with no clue of what they were speaking about, you probably couldn’t make heads or tails of any of it. While you are listening though, you might hear some familiar consonants and vowels, like an “eee” or an “ess” sound. With the IPA, you could then mark those down as [i] and [s] respectively and start to segment all the sounds that you are hearing. You might not be able to figure out all of the words or what is being said still, but over time and with enough data, you could start to form an idea of what some of the words are in this language because you would start to see the same sequences of sounds popping up time and time again.

Nowadays, there are other techniques that we can use to help move this along faster, but just having this universal way of describing the things that you are hearing can be hugely helpful in so many areas of linguistics.

The IPA is a tool that is used beyond the realm of linguistics too. The IPA is utilized by actors, singers, speech pathologists and translators as a way to be as specific as possible about the sounds we as humans produce. There are so many interesting things to discuss when it comes to the International Phonetic Alphabet, but we came here to talk about schwa so let’s just focus on the vowels for now.

Vowels are organized in a small chart that resembles a trapezoid. The locations of the vowels are not as arbitrary as they may appear at first glance. Each vowel is placed in the chart according to how it is produced on two dimensions, height and backness.

To give you a better idea of what this means, let’s try a little test. If you make an “eee” sound like you would in the word “feet”, you can feel how your jaw is closed up a bit, your tongue may feel slightly raised in your mouth and it might be touching or approaching the back of your teeth. Now try saying a word like “bother”. When you make the “ah” sound in the first syllable, you notice that your jaw is much more agape and your tongue is probably pushed down and toward the back of your mouth. These two vowels are the [i] and [ɑ] symbols seen on this chart respectively, with [i] being described as a “high-front vowel” and [ɑ] being described as a “low-back vowel”. Now with that in mind, you can probably start to see the logic of how this chart is organized.

If you imagine the vowel chart above overlaid onto the cross section of a human head, you can start to see how the organization of the vowels, and the way that they are described corresponds to the position of the tongue. In reality, there is a large amount of variance in the actual production of vowels in speech, but organizing the symbols in this way makes the most sense from a purely logical standpoint.

You are probably saying “wait, I thought English only had a, e, i, o, and u! Why are there so many vowels on this?”. This is a perfectly valid question and it has a two part answer. The first part is that this is a completely comprehensive chart that lists all of the vowels used in all of the languages of the world. The second thing to remember is that these symbols don’t correspond to any of the vowels in a written language; they correspond to the vowel sounds that we produce when we speak.

Think about the English “u” vowel for instance. When written in a word like “hunt” it produces a sound that corresponds to the [ʌ] sound on this chart, but in a word like “muse”, it corresponds to a [u] sound which is a high back rounded vowel. That’s two different vowel sounds represented by one letter.

Now I’m sure this isn’t news to you that these words sound different, after all, we all know how broken and removed from pronunciation English spelling can be. I am just trying to emphasize that the symbols on this chart don’t correspond to any written form of a language. They are purely being used in the IPA to describe the characteristic of the vowels when they are produced, even if most of them do look familiar to you.

I feel like I’ve gotten off track here. Again, there are so many things within this chart that I could go on and on for hours talking about, but that’s not why we are here today. We are here to talk about schwa.

For all of this hype and buildup, schwa is actually the least interesting of all of the vowels listed on this chart. As I pointed out earlier, schwa is that weird upside down looking ‘e’ in the very middle of the chart. It is described as the mid central vowel, and the best way to think about it is to ask: what vowel would your mouth produce if everything was in a “neutral” position, so to speak?
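One way to picture the chart is as a lookup table from each symbol to its position. The sketch below covers only the five vowels mentioned in this post. The class and function names are made up, the “low-mid” label for [ʌ] is my own shorthand for where it sits on the chart, and treating schwa as unrounded is a simplifying assumption.

```python
from typing import NamedTuple


class Vowel(NamedTuple):
    height: str    # how high the tongue is
    backness: str  # how far back the tongue is
    rounded: bool  # whether the lips are rounded


# A tiny slice of the vowel chart: only the symbols used in this post.
CHART = {
    "i": Vowel("high", "front", False),    # "feet"
    "u": Vowel("high", "back", True),      # "muse"
    "ə": Vowel("mid", "central", False),   # first vowel of "about"
    "ʌ": Vowel("low-mid", "back", False),  # "hunt"
    "ɑ": Vowel("low", "back", False),      # first vowel of "bother"
}


def describe(symbol):
    """Build a plain-English description of a vowel from its chart position."""
    vowel = CHART[symbol]
    rounding = "rounded" if vowel.rounded else "unrounded"
    return f"[{symbol}] is a {vowel.height} {vowel.backness} {rounding} vowel"


for symbol in CHART:
    print(describe(symbol))
```

Schwa is the only entry that is “mid” on one dimension and “central” on the other, which is a compact way of saying that nothing in the mouth is pulled very far in any direction.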

Think of a word like “about”. The very first vowel in that word, the one right before the “b” sound is a schwa. It might sound similar to the vowel in the word “hunt” if you were to slow it down, but in the word “about” it goes by much quicker and it feels a bit more relaxed. This is because “about” is a two syllable word and the “stress” is falling on the second syllable.

Now, stress is a whole other can of worms that we will inevitably discuss at some point, but for right now just think of it as the vowel that stands out in a word or the one that you are emphasizing when you speak. Almost every multisyllabic word in English contains a schwa. I say almost every here because there are other unstressed vowels, and there are many other cases where you are just seeing quick unstressed variants of other vowels. Statistically though, schwa shows up a ton.

I specified that schwa is unstressed in English, but this is not the case for all languages of the world. In Romanian for instance, the schwa sound even gets its own written letter and can show up as a stressed vowel. In a word like văd, which means “I see”, the ă is pronounced with a schwa vowel, and because there is only one syllable within it, this would be considered the stressed vowel within that word. So what this means for me is that my tattoo, while certainly meaningful in an English speaking country, wouldn’t translate super well if I visit Romania in the future.

This is only one small tidbit of the wonderful world that learning about the IPA can open up for you, so I certainly encourage you to do some more reading about it, or just come back here, as we are sure to talk about this again in the future.

When does “this” become “that”?

Earlier this week, beloved internet nerd Hank Green posted a tweet expressing his frustration about not understanding the relationship between what/that, where/there and when/then. The actual answer to the question is incredibly fascinating and is summed up brilliantly in this short video by Jess Zafarris.

But I am not here to try and take credit for this answer or to expand on it further. I want to take a minute to talk more about the relationship between “this” and “that”.

As Jess pointed out in the video, “this” and “that” are demonstratives that we have in English that are used to locate things in space. But when exactly does “this” become “that”? “This” is usually reserved for things that are in our grasp or are comparatively closer to us than “that”. For instance, if you were holding a pen, you could easily say “This pen is quite reliable” but it would be weird to talk about the same pen you are holding and say “That pen is quite reliable”.

If there were two pens on a table, you could pick up one of them and easily talk about “this” pen that you are holding versus “that” pen which you are not holding. But as we know, the concept of “that” is not as spatially confined as “this” is. We can talk about a “that” that isn’t even in the same room.

If you find a pen on a table that writes significantly better than the favorite pen your friend keeps at home, you could probably pick it up and say to them “this pen is so much better than that pen.” And your stationery-obsessed friend would probably be able to figure it out. You may need to provide them with a few more specifics, but the point here is that “that” does not have to be within your eyesight. “That” could be anywhere other than here, and it is always going to be comparatively farther away than “this”.

Photo by Jess Bailey Designs on Pexels.com

Now what about this scenario? You walk into a room and there are two pens on the table. One of these pens writes significantly better than the other, but you know that your friend has an incredible fountain pen at home that makes both of these pens look like utter trash. You turn to your friend and say “This pen is much better than that pen, but that other pen you have is the best”. This is a perfectly acceptable and understandable statement, but wouldn’t it just feel so much better if we had a nice way to talk about “that”s that are likely really far away as opposed to “that”s that are here, but are not “this”?

This is where we get into the concept of deixis. Deixis is the use of words and phrases to refer to a specific place, time, or person in context. The demonstrative words “this” and “that” can both be used to locate things in space meaning that they are also deictic words. When you are speaking to someone else, you usually use yourself as a default centre point for these words which is how we get this distinction where “this” is closer to you than “that”.

So “this” is a proximal deictic word, meaning close in proximity to the centre, while “that” is a distal deictic word, meaning it is further away from the centre. This is a deictic system that all natural languages have to some degree (at least based on the evidence we have). In addition to spatial terms, deixis can also help us differentiate between “now” and “then”, and it can even give us a three-way contrast in English between “you”, “me”, and “them”, but in English we seem to be confined to just a “this” and “that” contrast for spatial location.

I say confined here because there are actually languages that go above and beyond in their spatial location capabilities. A language like Korean, for instance, has a three-way distinction on spatial reference much like we have a three-way distinction on personal pronouns. In Korean, you can use the word yeogi to talk about something that is near the speaker, geogi to talk about something near the listener, and jeogi to talk about something that is far away from both the speaker and the listener.

Japanese also has this same pattern with the words koko, soko, and asoko, while Tamil uses the words inge, unge, and ange to express the same thing. This pattern also shows up in Thai, Filipino, Macedonian, Yaqui, Turkish, and many more languages, so it is certainly not a rare or obscure possibility. It is just something that we English speakers don’t have the ability to take advantage of.
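For readers who like to see a system written out, here is a minimal Python sketch of the difference between a two-way and a three-way spatial contrast. The table only uses the Japanese and Tamil words mentioned above, plus English "here" and "there" for comparison. The names DEICTIC_SYSTEMS and place_word are made up for this illustration, and real usage in any of these languages is more nuanced than a lookup table.

```python
# Hypothetical lookup table: each language maps a distance category to a word.
# English only has two categories; Japanese and Tamil have a third ("medial").
DEICTIC_SYSTEMS = {
    "English": {"proximal": "here", "distal": "there"},
    "Japanese": {"proximal": "koko", "medial": "soko", "distal": "asoko"},
    "Tamil": {"proximal": "inge", "medial": "unge", "distal": "ange"},
}


def place_word(language, near_speaker, near_listener):
    """Pick a place word based on who the location is close to."""
    system = DEICTIC_SYSTEMS[language]
    if near_speaker:
        return system["proximal"]
    # Only three-way systems have a dedicated "near the listener" word.
    if near_listener and "medial" in system:
        return system["medial"]
    return system["distal"]


# A spot near the listener but not the speaker:
print(place_word("Japanese", near_speaker=False, near_listener=True))
print(place_word("English", near_speaker=False, near_listener=True))
```

In this sketch English falls back to its distal word for anything that is not near the speaker, which is exactly the gap the three-way systems fill.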

Photo by Karolina Grabowska on Pexels.com

Moving away from “this” and “that”, let’s talk a little bit about the temporal aspect of deixis. In English, we have a similar two-tiered system that we use to talk about the proximal “now” versus the distal “then”.

The thing about “then” is that it is slightly ambiguous in terms of which “then” we are talking about. Are we talking about the “then” that just happened now? Are we talking about the “then” from a few days ago? Or are we talking about the “then” that is going to happen at some point in the future?

And here we have another shortcoming of English. Like the spatial terms, this is not an insurmountable shortcoming. We just have to do some extra work to differentiate between all of these “then”s. But like the spatial stuff, there are languages that do a much better job than English of differentiating between “yesterday”, “the day before yesterday” and “that one Tuesday six months ago” (okay, maybe not that specific, but let me explain a bit more).

Take the language Zulu, a Bantu language spoken primarily in South Africa. In Zulu, you can make the distinction between the recent past tense and the remote past tense just by changing up the suffix on the word and altering the initial vowel (Bantu languages love to change many things in different places to accomplish one thing. I promise this is just one thing). For instance, sihambile in Zulu means “we went” (a whole phrase like this can be expressed by a single word in Zulu), but it has a recent sense of time. Compare this to sāhamba, which also means “we went”, but further in the past than the first example.

Now, let’s just imagine a scenario. Let’s say you are out with your friends on Monday July 26th, 2021, and you are having so much fun that you want to try and get together again on Saturday August 7th, 2021. You could say to your friends “Hey, we should hang out again next weekend”, and it would likely start some debate about “Wait, do you mean this next weekend beginning in five days? Or do you mean the one twelve days from now?”.
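If you want to double-check the day counts in that scenario, here is a small Python sketch using the standard datetime module. The variable names are just for illustration.

```python
from datetime import date

today = date(2021, 7, 26)          # the Monday from the scenario
this_saturday = date(2021, 7, 31)  # the closer "next weekend"
next_saturday = date(2021, 8, 7)   # the weekend the speaker actually means

# weekday() returns 0 for Monday and 5 for Saturday.
print(today.weekday(), this_saturday.weekday(), next_saturday.weekday())

# Subtracting two dates gives a timedelta; .days is the gap in whole days.
print((this_saturday - today).days)
print((next_saturday - today).days)
```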

And again, this is another shortcoming of English that it turns out Zulu does not have! Like its past tense, Zulu can make a difference between recent future tense and remote future tense, but it is incredibly subtle and not in the place you would expect. Zulu changes something in the middle of the word to achieve this effect. Let’s walk through an example and you can see what I mean.

The word Ngizokuza translates to the phrase “I will come”, but this is going to happen before Ngiyokuza, which also means “I will come”. Simply by changing a zo to a yo in the middle of the word, Zulu speakers are able to easily differentiate between a near future and a more distant future.

Now I am not here to say that we all need to go out and learn Zulu to make plans with our friends in a more accurate way, I am just trying to show off all the cool and interesting systems that languages of the world have. English is a serviceable language for sure. If it wasn’t, we would have abandoned it long ago. I just think that learning a little bit more about how other languages handle things like this is incredibly interesting, and that’s why I love this field so much.

Thank you for reading folks! I hope this was informative and interesting to you. Be sure to come back next week for more interesting linguistic insights. If you have any topics that you want to know more about, please reach out to me at talkinglinguist@gmail.com and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.

Semantic Illusions and How to Spot Them

How many animals of each kind did Moses take on the ark?

It’s two right?

The answer is none. I’m not trying to start some sort of religious debate here. According to the Bible, Moses didn’t build the ark, Noah did.

Maybe you are a theology buff and your sharp eye caught this on the first read through. Maybe you aren’t familiar with the Bible story though, so it might not have been a fair test. Let’s try another one.

What is the name of the holiday where kids dress up and go out to give candy?

If you answered Halloween, you’d be wrong again. Kids don’t give candy on Halloween, they get candy handed out to them. So what causes our brains to skip over the most important part of the sentence and just decide that “It’s fine, I know the answer to this”?

If you remember last week when we talked about garden path sentences, I mentioned that our brains are driven by efficiency. That desire for maximum efficiency might also explain why these sentences, known to linguists as “Moses illusions”, trip us up. One theory is that our brains reach a point in processing these sentences where they feel they have enough information to answer the question, so they ignore the wrong information.

According to some research out of the University of Maryland, this is likely due to what is known as shallow processing. Shallow processing is a bit of a broad term, and the definition of it changes depending on the “thing” that our brains are processing. In garden path sentences for instance, our brains decide what the most likely interpretation of the sentence is before we get to the end of it.

With these Moses illusions, our brains process the sentence to the point where they see key words in the sentence like “holiday”, “dress up”, and “candy”. A word like “give” goes undetected on a quick glance. One reason for this is that our brain feels like it has enough information to answer the question. An even bigger reason though is that “giving” is closely related to “receiving”. If I had given you a sentence like the one below, you likely would not have been fooled as easily.

What is the name of the holiday where kids dress up and go out to grow candy?

Using the word “grow” here might make it easier to catch because it’s such a weird thing that maybe your brain is more likely to catch it. But this might not be fair because our brain has the benefit of hindsight.

Not all Moses illusions are created equal though. You can’t just change out one word in any sentence with something closely related and have it trip people up. It’s not just the similarity of the words that’s causing you to fall for this. The position of the substitution in relation to the other key pieces of information also plays a role in whether people fall for these illusions or not. Substitutions at the beginning of a sentence are more likely to be noticed by readers than subtle substitutions near the end, when the brain already has “enough information”.

The explanation that we have at this point in the research is by no means perfect. This is still a growing area of research and there is certainly more to learn about how people read and interpret sentences before we have a definitive answer to why people fall for these. So the next time someone asks you “Which British monarch lit the torch at the London Olympic winter games in 2012?”, you can take a moment to remind them that those were the summer games rather than blindly answering with “Queen Elizabeth”.

Thank you for reading folks! I hope this was informative and interesting to you. If you want to see more of these posts, be sure to follow my Facebook page and get updates when new posts go live. If you have any topics that you want to know more about, please reach out and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.

Things I learned while walking in my garden

Photo by Daria Obymaha on Pexels.com

Have you ever come across a sentence where something feels off the first time you read it?

The horse raced past the barn fell.

If this is your first time encountering a sentence like this, you probably had to read it a few times before you figured out that it was the horse that was falling, not the barn. Although this sentence is weird, it’s perfectly grammatical and you have no problem understanding it once you know the trick to it. What about a sentence like this?

After the man paid the clerk asked for more money.

Jerry Seinfeld performing stand up on Seinfeld. (NBC/Youtube)

So what’s the deal with these sentences anyway? Sentences like these are called garden path sentences. They get their name from the fact that you feel like you are being led down a lovely garden path as you read the sentence, before you are suddenly brought to the edge of a cliff looking into the void of “ungrammaticality” and realizing that you should have taken that left turn at Albuquerque.

Garden path sentences show up rarely in natural writing, but they are used in psycholinguistic research to figure out how our brains respond to unexpected things. One way that psycholinguists can test this is to do what is called a self-paced reading task, which presents a sentence to readers one word at a time.

By presenting a sentence like this and recording how long it takes for the reader to move to the next word, we can look at the exact point where they encounter the oddness of the sentence and see how they react.
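To make the idea concrete, here is a rough Python sketch of a self-paced reading task. It is a toy version, not the software that researchers actually use: real experiments usually hide the previous words and use much more precise timing. The function name self_paced_reading and its wait_for_reader and clock parameters are made up for this illustration.

```python
import time


def self_paced_reading(sentence, wait_for_reader=input, clock=time.perf_counter):
    """Show a sentence one word at a time and record how long each word is held.

    Returns a list of (word, seconds) pairs. By default the reader presses
    Enter to move on to the next word.
    """
    timings = []
    for word in sentence.split():
        print(word)
        start = clock()
        wait_for_reader()  # blocks until the reader asks for the next word
        timings.append((word, clock() - start))
    return timings


# Example use (interactive, so it is left commented out):
# for word, seconds in self_paced_reading("The horse raced past the barn fell."):
#     print(word, round(seconds, 3))
```

With timings like these in hand, you can compare how long a reader lingers on the word where the sentence turns odd against the words around it.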

Before we get lost in that rose patch though, let’s think about the possible ways that people could process something they are reading. Because English has a relatively strict word order, seeing certain words might be a good signal of what is coming next. When you are reading it one word at a time though, are you simply interpreting the most likely possibility of how the sentence will end based on what you know about your own language? Or are you thinking of all the possible ways that a sentence could go, and then cutting off the impossible ones with some hedge trimmers as you encounter more words?

Both explanations seem possible, but the second one does feel a little bit more cumbersome. Because languages are recursive and sentences could be infinitely long, trying to keep all the possible structures in your mind would be impossible. But what if we relied on the fact that our brains are essentially super computers? What if our brains understand that most sentences aren’t infinitely long and there are actually only a few things that we would have to look out for if we don’t care about what specific word follows, but instead just care about whether the sentence could continue from this point or not? Now it seems a little bit easier to imagine that we could be processing things like this.

So how exactly are our brains interpreting sentences? And how can these garden path sentences confirm that? Well, if we isolate the point of weirdness in a self-paced reading task, we can see whether readers slow down at all when they reach that point. As I mentioned before, sentences can be infinitely long, so when we encounter something like this, it shouldn’t be a surprise to our brains that the sentence keeps going and there shouldn’t be any slowing down.

But readers do slow down. When they reach this odd point in the sentence where we introduce the second verb, a significant portion of readers will take a little bit more time to figure out just what in the fertilizer is going on.

The key thing that we do need to realize though is that while our brains might be super computers that could possibly keep all of these possibilities in mind, our brains are also driven by efficiency. What this means is that it would not be worth spending all that energy to consider the infinite possibilities of sentences when we could just keep the most likely possibility in mind and re-evaluate the rest of the possibilities when that doesn’t work.

Let’s take another look at the sentence “the horse raced past the barn fell”. As most of you have probably noticed by now, the reason we get tripped up by this sentence is the fact that “the horse raced past the barn” could stand on its own as a sentence.

As we work our way through the sentence one word at a time, our brains, being the efficient machines that they are, are trying to only consider the most likely possibility. This means that by the time we reach the word “barn” we have come across a subject “the horse”, a past tense verb “raced” and a prepositional phrase “past the barn”. The possibility that our brain doesn’t consider is that this entire phrase is referring to a horse that was raced past the barn, presumably by a jockey who needed to get home to water the carrots.

When we encounter the next word “fell”, the first thing our brain thinks is that “oh, barns fall all the time, so it must be the barn that fell”. After trying that angle and realizing that it doesn’t work, our brain panics and thinks “wait, I must have missed something” before it goes back to reanalyze things from the beginning with this new information in mind and makes the correct assumption that it was indeed the horse that was raced past the barn who fell.

So why didn’t they just say “the horse that was raced past the barn fell” in the first place? Because then, I would have nothing interesting to write about! This is another big part of linguistics where we try to push the limits of what is grammatical and see how people will react to it. After planting a small seed of an idea in someone’s head, we are able to grow our understanding of how the human brain processes sentences. There are so many more amazing things that research has been able to teach us about language, and I can’t wait to keep sharing them with you all every week.

Thank you for reading folks! I hope this was informative and interesting to you. Be sure to come back next week for more interesting linguistic insights. If you have any topics that you want to know more about, please reach out and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.

The Fan-fkn-tastic Rules of Language

Whether it’s to express anguish or to emphasize something excitedly, The F Word is a versatile word within the English language. We all know that it can be a noun, a verb, and an adjective already, but a lesser discussed feature of it is what it can do inside of a word. The ability to insert The F Word into the middle of a word, or infix it, is constrained by the underlying structure of the word.

Take the word “fantastic” for example. It’s a three-syllable word with no prefixes or suffixes. We can easily insert The F Word after the first syllable and end up with something like “fan-fkn-tastic”, but we can’t put it between the second and third syllables to make something like “fanta-fkn-stic” (or fantas-fkn-tic depending on how you want to divide it).

You can probably think of some other three-syllable words that pattern the same way, so you might try to generalize that The F Word can be inserted between the first and second syllable of a word. But what about a four-syllable word like “absolutely”? You can “abso-fkn-lutely” break it up like this, but it’s “ab-fkn-solutely” weird to put it anywhere else.

So what can this tell us about the structure of these words then? If it’s not just about the syllables, what is constraining where we put The F Word? This is where we need to talk about the concept of a prosodic foot. This is a concept that you may already be familiar with if you study poetry.

Essentially, a prosodic foot within a word is what gives us a sense of rhythm within a word. Prosodic feet are composed of two (disyllabic) or more syllables in a word, one of which is stressed or “long”, and the rest being unstressed or “short” (there are cases of all stressed syllables and all unstressed syllables in a foot, but we won’t be talking about those today). The two most common types of disyllabic feet that you have likely heard of are iambic feet and trochaic feet. In an iambic foot, the second syllable will be stressed, while in a trochaic foot, the first syllable will be stressed. Iambic feet are pretty well known thanks in part to William Shakespeare and his penchant for writing in an iambic rhythm.

With these definitions in mind, let’s review some of the words from above:

If we assign a beat structure of sorts to these words where “DUM” is a stressed syllable and “da” is an unstressed syllable, we can see that the two words have different rhythms.

Fantastic – da-(DUM-da)

Absolutely – (DUM-da)-(DUM-da)

The two words shown here both happen to have trochaic feet where the first syllable in the foot is the one that is stressed. So now we can adapt our rule to say that The F Word can only be infixed in front of a foot.
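The rule so far can be sketched in a few lines of Python. This is a simplified illustration and not a full model of English prosody: it assumes the word has already been split into syllables and marked for stress by hand, and it treats every stressed syllable as the start of a trochaic foot. The function names foot_starts and infix_word are made up for this example.

```python
def foot_starts(stresses):
    """Return the syllable positions where a foot begins.

    Simplifying assumption: every stressed syllable starts a trochaic foot.
    Position 0 is skipped because inserting there would be prefixing,
    not infixing.
    """
    return [i for i, stressed in enumerate(stresses) if stressed and i > 0]


def infix_word(syllables, stresses, insert="fkn"):
    """Build every version of the word with the insert placed before a foot."""
    results = []
    for i in foot_starts(stresses):
        results.append("".join(syllables[:i]) + "-" + insert + "-" + "".join(syllables[i:]))
    return results


# fantastic: da-(DUM-da)
print(infix_word(["fan", "tas", "tic"], [False, True, False]))
# absolutely: (DUM-da)-(DUM-da)
print(infix_word(["ab", "so", "lute", "ly"], [True, False, True, False]))
```

For these two words the sketch only produces “fan-fkn-tastic” and “abso-fkn-lutely”, which matches the intuitions above. It knows nothing about prefixes or suffixes, so it only captures the rhythm-based part of the story.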

xkcd #1290

Of course, this just wouldn’t be English if there weren’t a few exceptions to the rule. What about a more complex word like “unbelievable”? I think we can all agree that “un-fkn-believable” sounds a lot better than “unbe-fkn-lievable”. Linguists have adapted the rule to prioritize the morphological boundaries of a word over the rhythm and foot structure in these cases. Because the “un” in this word is a prefix that attaches to the word “believable”, we would rather separate it for the sake of infixing than use the rhythm structure and leave behind this “unbe” thing.

Now think about a word like Kalamazoo. You might be okay with both “Kala-fkn-mazoo” and “Kalama-fkn-zoo”. Maybe you have a strong preference one way or the other. At the end of the day, there is no perfect answer to the problem of where we can infix The F Word in English, but there are certainly some interesting things you can do with it. I encourage you all to go out there and see what you can F up!

Thank you for reading folks! I hope this was informative and interesting to you. Be sure to come back next week for more interesting linguistic insights. If you have any topics that you want to know more about, please reach out and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.

Prescriptive vs. Descriptive: Not your English teacher’s English

Photo by cottonbro on Pexels.com

In my last post, I talked about how it isn’t my job to dictate what happens with a language, or to tell you how language should be used. You may be asking yourself “wait, if linguists don’t make the language rules, then what do they even do?” It’s a valid question, and frankly one that I am still figuring out every day. The best analogue that I can come up with is Dr. Jane Goodall and her chimpanzees.

Dr. Goodall began her career of working with primates in the early 1960s by watching them from a distance, simply observing their behavior and taking notes. Over time she did interact with them, but these were passive interactions, always led by the primates and never forced upon them. She never tried to walk in and change the way that the chimps lived their lives.

Photo by Nirav Shah on Pexels.com

In this metaphor, while I am implying that linguists act like Dr. Goodall, I am not implying that you the reader are the wild primate. Language is the natural, ever-changing thing that we linguists are observing and documenting every day. Some of that may involve studying human behavior and how humans utilize the tools that they have when using language such as their vocal tract, but often, it’s the language itself we are most interested in.

So why do we have grammar classes in school? Well, the ability to communicate in a professional manner is certainly a valuable skill in our society still. After all, without grammar rules and at least some form of writing training, there would be no way for this post to be universally comprehensible.

The approach that your high school English teacher took when teaching you how to write a paper or a poem is what we call a prescriptive approach. A prescriptive approach to grammar is one that seeks to prescribe one system in preference to another. On the other hand, a descriptive approach is one that tries to simply describe human linguistic ability and knowledge.

So why don’t linguists just take a prescriptive approach and tell people how to use language? Well, think about that high school English class for a second. Sure, you probably learned how to write in an active voice as opposed to a passive voice. Maybe you even learned how to read into metaphors and how to interpret poetry. The things you learned in that class likely don’t make their way into your everyday life in all those ways though. When you are out casually talking with your friends, you probably aren’t sitting there telling them that they should stop ending their sentences with prepositions (at least I hope not).

When it comes to trying to dictate language change, it’s a bit like trying to paddle against a strong current. You probably aren’t going to get very far, and it’s likely way more fun to just sit back and see where the current takes you. This is the approach that linguists take to language. Rather than trying to force a system to behave a certain way, we choose to observe, document, and unravel the natural changes as they happen. Even small changes to a language can lead to massive amounts of research and a new understanding of how languages around the world adapt.

So the next time you want to yell at your friends about whether it’s “to who” or “to whom” they are speaking, just sit back and think about why they are saying it that way rather than telling them right versus wrong.

Thank you for reading folks! I hope this was informative and interesting to you. Be sure to come back next week for more interesting linguistic insights. If you have any topics that you want to know more about, please reach out and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.

It’s raining outside, but what is “it” all about?

It is raining outside.

It seems to be raining outside.

There appears to be a rainstorm blowing in.

Photo by Jill Burrow on Pexels.com

Apart from talking about some gloomy weather, these sentences also share another common thread. Think about the subjects of these sentences. What does “It” mean? Where is “There”? Why do we even need them in these sentences?

These types of pronouns are known as expletive pronouns (no, not that kind of expletive, that’s for another post). These expletive pronouns, also known as pleonastic pronouns or dummy pronouns, act as placeholders in languages that have a strict requirement for subjects. English is a language that really wants to have a subject, regardless of whether that subject actually means anything. Compare the examples above to these sentences:

Is raining outside.

Seems to be raining outside.

Appears to be a rainstorm blowing in.

Just reading these sentences, you probably got a queasy feeling in your stomach telling you “oh, I don’t like this at all”. That’s your native intuition as a speaker of English kicking in to tell you that those are not grammatical sentences. But some of these sentences are only bad because you are reading them as opposed to hearing them. Listen to this clip from Pawnee’s own Leslie Knope.

Parks and Recreation: Season 5 Episode 19

“Seems to me we oughta use it.” When we read sentences without subjects, it provokes a stronger reaction compared to hearing someone say it, especially in the quick, casual register used in the clip above.

So what does this mean for the English language as a whole? Well, it’s hard to say for sure. It could be that we are witnessing a gradual language change where we are becoming more comfortable with ditching these meaningless expletive pronouns in our speech. After all, there are languages that are able to talk about the weather without the need for expletive pronouns.

In a language like Italian you can say “sta piovendo”, where “sta” is a form of a verb like ‘be’ in English, and “piovendo” of course means ‘raining’. The word for word translation for a phrase like this would be “is raining”, and if you ask someone who speaks Italian, they likely wouldn’t get the same queasy feeling from “sta piovendo” that you get from “is raining”.

English isn’t the only language that requires these dummy pronouns to describe the weather. Germanic languages all require these pronouns in sentences, even if they don’t refer to anything at all. In German for instance, the phrase “es regnet” translates into English as “it rains”, where again, the ‘it’ doesn’t refer to anything in particular, but it needs to be there for the phrase to make sense (let’s ignore the lack of a separate ‘be’ verb and the “-ing” form for the sake of simplicity here).

But is English changing? As a linguist, it’s not my job to try and dictate what should happen to language or to tell you to do the same either. What I can say though is that this might be something to keep your ears open for. Language phenomena have this funny way of sticking in your brain once you know about them. These expletive pronouns are something that may have gone completely unnoticed by you for most of your life, but now that you know a little bit more about them, you will probably notice them a lot more often. If this is the result of natural language change happening in real time, maybe the next time the rain is falling wherever you are you can go outside and scream “IS RAINING!” to get people used to this change a little bit faster.

Thank you for reading folks! I hope this was informative and interesting to you. Be sure to come back next week for more interesting linguistic insights. If you have any topics that you want to know more about, please reach out and I will do my best to write about them. In the meantime, remember to speak up and give linguists more data.