What does knowing 575 characters mean?

A while back we announced a gadget to estimate how many characters you know.

Well, that gadget simply gave you a number, like 575 characters. But what does 575 characters mean?

To give some context for what a given level of character knowledge means, we have produced some visualizations to accompany the estimate.

Here's an example:

Plot of relationship between known characters and fraction of typical text comprised by these characters

The curve illustrates the relationship between the number of characters one might know and the fraction of typical text that is made up of those characters. Because a small number of common characters make up much of everyday Chinese, the curve rises steeply at first. The blue point gives the location of the estimate for a student who knows 575 characters.

Try it out

How does it work?

In each case, the summary is based purely on our estimate of your overall character knowledge and not the exact characters you know. If you have tried out our knowledge estimator, you will have noticed that we only ask you whether you know a few characters. Then, based on your responses, we estimate your overall character knowledge.

We don't know exactly which characters you know, just about how many overall. Nonetheless, by sampling sets of characters that someone who knows 575 characters plausibly knows, we can figure out overall what fraction of text is comprised of such characters.
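To make the sampling idea concrete, here is a rough sketch in Python. It is not our actual implementation. The frequency table is a hypothetical Zipf-like stand-in for real corpus frequencies, and the helper names (`zipf_frequencies`, `sample_known_set`, `estimate_coverage`) are made up for illustration. It draws many plausible sets of 575 known characters, favoring frequent ones, and averages the share of text each set covers.

```python
import random

def zipf_frequencies(n_chars):
    """Hypothetical stand-in for corpus frequencies: weight 1/rank, normalized to sum to 1."""
    weights = [1.0 / rank for rank in range(1, n_chars + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_known_set(freqs, n_known, rng):
    """Draw n_known distinct characters, with frequent characters more likely to be chosen.

    Uses an "exponential race": each character gets an arrival time drawn from an
    exponential distribution whose rate is its frequency, and the earliest arrivals win.
    This is weighted sampling without replacement.
    """
    arrivals = sorted((rng.expovariate(1.0) / f, i) for i, f in enumerate(freqs))
    return [i for _, i in arrivals[:n_known]]

def estimate_coverage(freqs, n_known, n_samples=50, seed=0):
    """Average fraction of running text covered by a plausible set of n_known characters."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        known = sample_known_set(freqs, n_known, rng)
        total += sum(freqs[i] for i in known)
    return total / n_samples

freqs = zipf_frequencies(10000)  # assumed: about 10,000 characters in modern use
coverage = estimate_coverage(freqs, 575)
print(f"Estimated text coverage for 575 characters: {coverage:.1%}")
```

The number printed depends entirely on the made-up frequency table, so treat it as an illustration of the method rather than a claim about real Chinese text.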

What are other implications of knowing 575 characters?

But why is knowing those characters useful? Well, even with not very many characters, you can form an awful lot of words. With 575 characters one can form over 10,000 words. That's a good-sized vocabulary. Now you just need to learn those words!

Plot of relationship between known characters and number of words that can be formed using these characters

This plot (above) illustrates the relationship between the number of characters one might know and how many words can be formed with just those characters. The idea is to give you some sense of how many words you could learn with just the characters you know. Many of these words are probably ones you already know!
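Counting formable words is simple in principle: a word counts if every one of its characters is in the known set. Here is a minimal sketch, assuming a toy vocabulary (the word list and the function name `formable_words` are illustrative, not our real dictionary or code):

```python
def formable_words(known_chars, vocabulary):
    """Return the words whose characters all appear in known_chars."""
    known = set(known_chars)
    return [word for word in vocabulary if set(word) <= known]

# Toy data for illustration only.
known_chars = "你好我们中国人"
vocabulary = ["你好", "我们", "中国", "中国人", "好人", "朋友", "国家"]

formable = formable_words(known_chars, vocabulary)
print(len(formable), formable)
# prints: 5 ['你好', '我们', '中国', '中国人', '好人']
```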

But what sort of words do I likely know?

We also provide a breakdown by HSK level:

Pie graphs indicating fraction of words in each HSK level that can be formed with known characters

Here we show a pie graph for each HSK level. The shaded portion indicates what fraction of words at that HSK level could be formed by characters you probably know.
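The shaded fraction in each pie can be computed along these lines. This is only a sketch: the level names and word groupings below are placeholders, not actual HSK word lists, and `formable_fraction_by_level` is a hypothetical helper.

```python
def formable_fraction_by_level(known_chars, words_by_level):
    """For each level, the fraction of its words that use only known characters."""
    known = set(known_chars)
    fractions = {}
    for level, words in words_by_level.items():
        formable = sum(1 for word in words if set(word) <= known)
        fractions[level] = formable / len(words) if words else 0.0
    return fractions

# Placeholder groupings for illustration; these are not real HSK levels.
words_by_level = {
    "level A": ["你好", "我们", "朋友", "中国"],
    "level B": ["好人", "国家", "人们", "时间"],
}

fractions = formable_fraction_by_level("你好我们中国人", words_by_level)
print(fractions)
# prints: {'level A': 0.75, 'level B': 0.5}
```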

Go see how many words you know

Introducing slower audio

Ever feel like your brain can't quite keep up with all the mental processing that must occur to understand native-speed Chinese speech? That's perfectly normal. Your brain is actually doing a tremendous amount of work. It must segment the incoming aural data stream into words and phrases, recognize words and retrieve their meanings from memory, and identify the syntactic, semantic and contextual relationships that make up the big-picture meaning.

Natural-speed Chinese speech is about 3.5 characters per second. This means you have less than 290 ms to process the information contained in each character and place it into context. Well, to give your brain an easier time of all this processing, we now offer audio playback at 75% natural speed. That's an extra 95 ms per character for your brain to search for meanings, analyze structure, or 休息一下. It actually makes a big difference. The new audio player looks like this (just a picture, not the actual audio player):
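For the curious, the arithmetic behind those numbers is just the time per character at each playback rate. A quick check in Python, using the 3.5 characters per second and 75% figures from above (the function name is ours, for illustration):

```python
CHARS_PER_SECOND = 3.5   # natural-speed Chinese speech
SLOW_RATE = 0.75         # slow playback as a fraction of natural speed

def ms_per_character(chars_per_second, playback_rate=1.0):
    """Milliseconds available per character at the given playback rate."""
    return 1000.0 / (chars_per_second * playback_rate)

natural_ms = ms_per_character(CHARS_PER_SECOND)
slow_ms = ms_per_character(CHARS_PER_SECOND, SLOW_RATE)
extra_ms = slow_ms - natural_ms

print(f"natural: {natural_ms:.0f} ms, slow: {slow_ms:.0f} ms, extra: {extra_ms:.0f} ms")
# prints: natural: 286 ms, slow: 381 ms, extra: 95 ms
```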

Dual-speed audio player

The playback speed can be toggled using the 快 (fast/natural) / 慢 (slow) links. Try it out on one of the Asking Directions exercises.

So what's going on? Well, it's really simple. We're just playing back the same recording at a slower rate, but adjusted in such a way that the pitch stays the same. While it would be preferable to have all our recordings made twice, once spoken slowly and once at natural speed, this is nearly as good, and much easier.

Why is WordSwing named WordSwing?

The name WordSwing is part of a metaphor for language learning. The words you learn are vines hanging in the jungle that you can use to swing to new vines (words), and thereby traverse the landscape (language-scape) in a fun, exhilarating, and effective way. You use the part of the language you know to get meaningful language practice and thereby expand your understanding, swinging to ever higher levels of proficiency.

Consider the first part of the name, word. At WordSwing, there is a substantial focus on growing your vocabulary, as this is often the most important factor when understanding and communicating in real-life situations. So in this sense, words are your vehicle (or vines) to learn new things and get to new places. The focus on the word as the unit is also apparent in our estimate of your knowledge of the language. We track your learning at high resolution, down to your familiarity with individual words and even particular word usages. It is precisely this high resolution that enables us to propose relevant and accessible activities. A vine that you can't swing to is no use in exploring the jungle, and language that is not accessible is no help for learning.

Now for the second part, swing. Language learning should be fun and playful, and what better toy than a swing to capture the fun, playful nature of language? Language is amazingly flexible and malleable; there are often many ways to say things, and thus language is an endlessly entertaining toy that can be played with in an infinite number of ways.

Ever wonder how many characters you know?

If you have been learning Chinese you no doubt have wondered, “how many characters do I know?” Perhaps you would like to see if you’ve made a dent in the roughly 4000 characters it usually takes to be considered literate. Or perhaps you want to see some measure of all that effort you’ve put into studying. Or maybe you want to impress your mom. Regardless of your reason, you would rather not run through the roughly 10,000 characters in modern use to count how many you know.

Fear not! Statistics to the rescue

Fortunately, it only takes a relatively small sample of characters to accurately estimate how many characters you know overall. So we at WordSwing put together a little gadget to help you out. This gadget will sample characters and ask you a simple yes or no question: do you know the character? The more answers you give, the more accurate the estimate.

Character knowledge estimation gadget

But isn’t this question fraught with peril? you might ask. What does it mean to know a character? Well, luckily it can mean whatever you want it to mean. If you want it to mean the character is vaguely familiar, then the resulting estimate will be how many characters are vaguely familiar to you. If you only answer yes if you’ve mastered the character, then the estimate will be for how many characters you’ve mastered. If you want to try it several ways, just clear the estimate and start over.

Curious how it works?

As in any estimate, there are some assumptions that go into it. We assume that the characters you know are a sampling from the overall frequency distribution of Chinese characters (based on a large corpus of modern Chinese) and that you know each character with a probability proportional to the frequency of the character. Thus each character represents a binomial draw, and our goal is simply to estimate the constant of proportionality that scales the integral of the frequency distribution to the estimated number of characters you know, which we do by a variant of logistic regression. This model is about as simplistic as one can make it, and undoubtedly the modeling assumptions are not exactly correct. Hopefully the law of large numbers helps us out, though, and we believe this gives a fairly accurate estimate of how many characters you know.
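For readers who like to see the idea in code, here is a simplified sketch of this kind of estimate. It is not the gadget's actual code, and it rests on several assumptions of ours: the frequency table is a hypothetical Zipf-like stand-in for real corpus data, the probability of knowing a character is taken as `min(1, c * frequency)`, the quizzed characters are drawn uniformly at random, and the constant `c` is found by a plain grid search over the likelihood rather than by the logistic-regression variant mentioned above.

```python
import math
import random

def zipf_frequencies(n_chars):
    """Hypothetical stand-in for corpus frequencies: weight 1/rank, normalized to sum to 1."""
    weights = [1.0 / rank for rank in range(1, n_chars + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def know_probability(c, freq):
    """Assumed model: probability proportional to frequency, capped at 1."""
    return min(1.0, c * freq)

def log_likelihood(c, asked_freqs, answers):
    """Log-likelihood of the yes/no answers for a given constant c."""
    eps = 1e-9  # keeps log() finite when the model says 0 or 1
    total = 0.0
    for freq, known in zip(asked_freqs, answers):
        p = min(max(know_probability(c, freq), eps), 1.0 - eps)
        total += math.log(p) if known else math.log(1.0 - p)
    return total

def estimate_known_count(freqs, asked, answers, grid_size=300):
    """Grid-search the best c, then sum the per-character probabilities over all characters."""
    asked_freqs = [freqs[i] for i in asked]
    best_c, best_ll = 1.0, -math.inf
    for step in range(grid_size):
        c = 10 ** (5.0 * step / (grid_size - 1))  # log-spaced from 1 to 100,000
        ll = log_likelihood(c, asked_freqs, answers)
        if ll > best_ll:
            best_c, best_ll = c, ll
    return sum(know_probability(best_c, f) for f in freqs)

# Simulate a learner under the same model, then quiz them on a random sample.
rng = random.Random(42)
freqs = zipf_frequencies(5000)
true_c = 1100.0  # arbitrary value chosen for the simulation
knows = [rng.random() < know_probability(true_c, f) for f in freqs]
asked = rng.sample(range(len(freqs)), 1000)
answers = [knows[i] for i in asked]

estimate = estimate_known_count(freqs, asked, answers)
print(f"Actually known: {sum(knows)}, estimated: {estimate:.0f}")
```

Because the simulated learner follows exactly the model being fitted, the estimate here lands close to the true count. A real learner will not match the model this neatly.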

Learn a language by swinging up to ever higher levels of proficiency by effectively using the language you've learned so far. wordswing.com
