We recently released Into the Haze, our second text adventure game for learning Chinese.

This game works like an interactive, graded reader, and it is targeted at intermediate and upper-intermediate learners of Chinese.

When you first dive into the game, though, you may find all the Chinese text overwhelming, even if you already know hundreds of Chinese characters.

All is not lost. In this post, we'll see:

  1. How the game is not as hard as it may appear.
  2. How a bit of up-front study can go a long way to making the game easier.

How hard is Into the Haze, really?

On the surface, Into the Haze seems pretty daunting. The opening few steps of the game look like a wall of text:

Wall of text screenshot

By the raw numbers, the game can also seem pretty daunting. In total, the full text of the game contains 649 distinct Chinese characters. These combine in various ways to form 922 distinct words that appear throughout the game. And in total, the full text of the whole game is more than 13,000 characters of text (though keep in mind that when making your way through a text adventure game, it's not possible to see all of the text in any given path).

But wait, it gets worse! Suppose you know about 500 characters. (Not sure how many you know? Check out our character knowledge estimation tool.) Even if you know 500 characters, it's likely that the characters you know are not exactly the characters used in this game. In fact, based on a simplistic model of learning Chinese (see note 1), if you know 500 characters, you're likely to know only 263 (about 40%) of the 649 characters that appear in this game.

It's now starting to look truly terrifying.

Can we take an axe to this fear?

Before you call it quits and go home, let's dive in a bit deeper, because I think you will see the picture is not so bleak.

Like the rest of WordSwing, our text games have a built-in dictionary. So whenever you don't know a word or character you can tap on it and pull it up in the dictionary. But you might be thinking, "Won't I be looking up every other word in the game?"

Well, let's see.

Of the 649 distinct characters, 230 of them appear 3 or fewer times in the game. Let's call these rare characters. And let's call the remaining 419 characters, which appear 4 or more times, common characters.

Although the rare characters make up 35% of the distinct characters, they account for only 3.4% of the text of the game because, naturally, the common ones are used much more often than the rare ones.

Thus, if you only need to look up words that contain rare characters, then on average you'll be looking up 2.8% of the words. That's not so bad, is it?
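If you'd like to run this kind of rare-versus-common count on a text of your own, here is a minimal Python sketch. The sample sentence, the function names, and the rule of counting only characters in the main CJK Unified Ideographs block are assumptions made for illustration. It is not the script we used on the game text.

```python
from collections import Counter


def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def rare_common_split(text, rare_max=3):
    """Split the distinct characters of `text` into rare and common sets.

    A character is "rare" if it appears `rare_max` or fewer times.
    Also returns the share of all character occurrences that are rare.
    """
    counts = Counter(ch for ch in text if is_cjk(ch))
    total = sum(counts.values())
    rare = {ch for ch, n in counts.items() if n <= rare_max}
    common = set(counts) - rare
    rare_share = sum(counts[ch] for ch in rare) / total if total else 0.0
    return rare, common, rare_share


# Hypothetical sample text, not taken from the game.
sample = "我们走吧。我们走到山上,山上有雾。我们看不见路,我们走。"
rare, common, rare_share = rare_common_split(sample)
print(len(rare), len(common), round(rare_share, 3))
```

In a text this short, almost every character counts as rare. The pattern described above, where rare characters are a large share of the distinct characters but a small share of the running text, only shows up in a longer text.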

But if you only know 500 characters, then you probably don't know all 419 of the common ones in the game. Rather, you probably know about 263 of the characters overall and 213 of the common ones (see note 1).

This translates into needing to look up about 5 words per text item. Since Into the Haze averages ~13 words per text item, that's nearly 40% of the words!

What if there were a way to strategically learn some characters and words in preparation for the game?

If you were to learn the 50 most frequent characters in the game that you don't already know, bringing your total to 263+50 = 313, where would this put you?

Amazingly enough, this would bump you up to understanding 85% of the game, and require you to only look up 2 words on average in each text item. And remember, the average text item is 13 words long.

That seems much more doable, right?
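The idea of learning the most frequent unknown characters first can be sketched in a few lines of Python. Everything here is hypothetical: the word list, the set of known characters, and the function names are made up for illustration, and the sketch assumes a word is readable only when you know every character in it. The percentages quoted above come from our own analysis of the game, not from this code.

```python
from collections import Counter


def word_coverage(words, known_chars):
    """Fraction of word tokens whose characters are all known."""
    if not words:
        return 0.0
    readable = sum(1 for w in words if all(ch in known_chars for ch in w))
    return readable / len(words)


def top_unknown_chars(words, known_chars, n):
    """The n unknown characters that occur most often in `words`."""
    counts = Counter(ch for w in words for ch in w if ch not in known_chars)
    return [ch for ch, _ in counts.most_common(n)]


# Hypothetical segmented text and hypothetical learner knowledge.
words = ["我们", "走", "山", "有", "雾", "我们", "看不见", "路", "山", "走", "走"]
known = {"我", "们", "有"}

before = word_coverage(words, known)
to_learn = top_unknown_chars(words, known, 2)
after = word_coverage(words, known | set(to_learn))
print(to_learn, round(before, 2), round(after, 2))
```

Because frequent characters account for so much of the running text, even a couple of well-chosen additions raise coverage noticeably in this toy example.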

But how can I learn an extra 50 characters?

WordSwing has a Word Lists feature that allows you to build, customize, share, and study lists of words. There are also many other tools, such as Anki and Skritter, that can help.

On WordSwing, we've created two official lists for the Into the Haze game:

Into the Haze word lists

The list labeled Part I contains 499 of the most common words in the game. And the order of the words in this list is based on their frequency in the game.

Although we've been talking about individual characters, we recommend expanding your character knowledge by learning new characters in the context of words. This list contains 428 distinct characters and has 95% overlap in characters with what we're calling common characters.

You can make your own personal copy by "forking" the list, and then you can prune it based on your knowledge ratings so that you can focus on studying just the words you don't know. With a bit of up-front study, you'll probably find the content of Into the Haze looks much more familiar.

Here's a screencast of how you can create your own personalized study list based on the game:

This screencast illustrates the following:

  1. Making a personal copy of (forking) the published list of the 499 common words in Into the Haze.
  2. Using your knowledge ratings to prune the list down to just the ones you don't understand.
  3. Studying this list using the Pronunciation Recall tool, which is a good way to practice your ability to recognize the written form of Chinese words.
  4. Or, exporting the list to Anki for external flash-card study.

A possible strategy

If you find yourself feeling that the fraction of unfamiliar words in the game is overwhelming, perhaps something like the following strategy will be helpful.

  1. Prepare a pruned word list as I illustrated in the screencast.
  2. Choose a way to begin familiarizing yourself with the most common of these words. Our Pronunciation Recall activity is one option. Anki is another good option, or maybe Skritter if you know how to import your words there.
  3. After spending a bit of time learning some words, try playing the game again. Does it feel any easier or more familiar?
  4. Continue to alternate between studying words and playing the game.

Here are two important points:

We recommend you don't try to learn all the words in your word list before playing the game. After all, the point of the game is to practice Chinese. Instead, just familiarize yourself with some of the words and use the game to practice them.

And:

In very short order you'll probably be a pro at navigating the game, and the subset of Chinese used in the game will feel like second nature. This is the magic of learning through games.

By focusing on the vocabulary of the game and practicing it through the game, you will quickly be able to read text that is arguably more difficult than what you could manage if you had not narrowed the scope of your efforts. You can read more about narrow reading on Hacking Chinese.

tl;dr

By strategically focusing on familiarizing yourself with the most important vocabulary of the game, and practicing these words by playing, you'll quickly master the small subset of Chinese used in the game. This will happen almost magically through play.

And soon the world of Into the Haze will feel like home, albeit a post-apocalyptic Chinese version of home.

1. This calculation assumes that each time you encounter a character, there is some small probability, p, that you learn that character. As you encounter more written Chinese, you see common characters more often than rare ones. The number of encounters it takes to learn a character follows a geometric distribution, so the probability of not knowing a character you've seen k times is prob(not known) = (1-p)^k. And thus the probability you've learned a character is prob(known) = 1 - (1-p)^k.
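
For the curious, the model in note 1 can be written as a short Python sketch. The exposure counts and the value of p below are invented purely for illustration, and the function names are our own. The figures quoted in the post depend on real frequency data that isn't reproduced here.

```python
def prob_known(k, p):
    """Probability that a character seen k times has been learned.

    Each encounter teaches the character with probability p, so the
    chance of still not knowing it after k encounters is (1 - p) ** k.
    """
    return 1.0 - (1.0 - p) ** k


def expected_known(exposures, p):
    """Expected number of known characters, given encounters per character."""
    return sum(prob_known(k, p) for k in exposures.values())


# Hypothetical encounter counts for four characters, and a hypothetical p.
exposures = {"的": 200, "我": 120, "山": 15, "雾": 2}
p = 0.05

for ch, k in exposures.items():
    print(ch, k, round(prob_known(k, p), 3))
print("expected known:", round(expected_known(exposures, p), 2))
```

Summing prob(known) over every character in a text gives the expected number of characters a learner knows, which is how a single figure like "263 of 649" can be read under this model.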