Reading Basic Japanese Signs

Introduction
Overview
Hiragana and Katakana Character Sets
Modifiers of Basic Character Sets
Some Examples
Confusing Pairs
Kanji
Resources

INTRODUCTION

On a recent (2009) trip to Japan, I decided it would be useful to pick up the basics of Japanese scripts. NOT to speak, but in order to read! NOT to read for understanding (which requires some grammar and vocabulary), but just to read signs (hotel names, street signs, etc.), and to make connections to words that I already know in English. Most introductory Japanese websites or books aim to teach you basic speaking, or simple reading for understanding. But what I needed was a rather specialized subset of that information. If you have a limited goal like mine, you might find the following notes useful.

If we just want to "recognize" signs, we just have to memorize the symbols. Why care about pronunciation? Here are some reasons: you may need to sound a word out when asking for directions. It would be nice to recognize it when you hear your tour guide say the word. Most importantly, we are often interested in recognizing those words that are non-Japanese (e.g., "terminal", "club") or Japanese words that have entered the English vocabulary (e.g., ramen). In this case, they sound like the English counterpart, and you can recognize them by their pronunciation. Of course, our need for pronunciation is less critical than that of those who must speak the language (in order to be understood).

You might think that in major cities, there would be enough information or signs in English, so it is unnecessary to read Japanese signs. Unfortunately, most maps and signage in Japan are in Japanese, with a smattering of English in some major hubs.

Here is an example. On my trip to Fukuoka to attend ASCM (Asian Symposium on Computer Mathematics, Dec 2009), I was looking for JAL SEAHAWK HOTEL, where I had stayed in 2003. This is a landmark (35-storey) hotel, next to the huge Yahoo Stadium, home of Fukuoka's baseball team (Seahawk). But I could not find an online map with this English name on it. (Someone put a marker in Google maps for this hotel, but it was located in the sea, and quite far from the actual location.) Here is the name in Japanese:

SEA-HAW-K: ( しほく ) [Si Ho Ku]
Note that this is spelled out using Hiragana script (even though most foreign words are in Katakana script). Perhaps you might find this alternative rendition: ( しーほーく ) [Si--Ho--Ku]. This rendition simply has the ''prolongation'' symbol [--] following the first two syllables; it is an aid to pronouncing the name more accurately. The last syllable [Ku] remains short. It is, of course, superfluous in English. But in Japanese, each consonant is tied to a vowel, and [Ku] is thus the stand-in for the letter K in English.

The upshot is that SEAHAWK in Japanese is a 3 syllable word, unlike the 2 syllables of English. Fortunately, Japanese vowels are always short, so you probably hear the terminal syllable [Ku] as [K']. That is also the reason why we need to lengthen the sound of [Si] and [Ho] using the prolongation mark ''ー'' above. Remarkably, this prolongation mark ー is rotated 90 degrees to become a vertical bar when the Japanese script is written vertically. Japanese script, like Chinese and Korean scripts, is equally at home in vertical or horizontal modes. (Interestingly, Chinese and Korean scripts are even at home whether you write them from left-to-right or right-to-left; unfortunately, Japanese script is not so endowed.)

In general, terminal consonants in foreign words represent a standard problem in transcribing sounds into Japanese --- every terminal consonant will introduce a superfluous terminal syllable. E.g., FORK in English must be transcribed as [Fo Ku] or, with prolongation, [Fo--Ku]. But there is some indeterminacy here: if we want a terminal [K] sound, which of the characters in the K-row ([Ka], [Ki], [Ku], etc) should you use? In practice, it seems that [Ku] and [Ko] are preferred.

For the purposes of reading signs, the Katakana script is perhaps slightly more important than Hiragana script. But the two systems are parallel, so for good measure, we throw in Hiragana as well. Actually, you will need to recognize Hiragana characters, if only to ignore them! It is true the curvy script of Hiragana usually distinguishes it from Katakana. But until you master the scripts, it is sometimes hard to tell whether you are reading Katakana or Hiragana. In that case, you want a table of both alphabets (see below) for comparison.

OVERVIEW

There are 3 written scripts in Japanese: Hiragana, Katakana and Kanji.
1. HIRAGANA, the "curvy script", is considered the simplest, and is phonetic.
  NOTE: The curvy nature comes from brush strokes. It helps to know a bit about the dynamics of brush writing, as in Chinese calligraphy. Brush strokes impart some tell-tale characteristics on typography:
  (1) curvature on what are essentially straight strokes,
  (2) hook flourishes at the end of a stroke,
  (3) traces of connection between different strokes, and
  (4) a direction for each stroke (if I may use geometric terminology, brush strokes are ''directed segments'', not ''undirected'' ones).
  All these are reflected in the typographic design of the characters. Unless you know calligraphy, you might miss these characteristics or get them wrong in your own writing.
2. The next script, KATAKANA, is also phonetic and is used for foreign words.
  NOTE: This is slightly easier to read than Hiragana because it has more "angular strokes", which are less shaped by the brush (you still see some characteristic curves).
3. The last, KANJI, derived from Chinese characters, is the hardest and is non-phonetic. Of course, if you know Chinese characters already, this would be the easiest. We will see that it is useful to learn some Kanji too (see below).

Hiragana and Katakana Character Sets

These two character sets have a one-to-one correspondence, and hence it is useful to see them presented side-by-side. The third table below is a merger of the first two, and strictly speaking, redundant. Still, it is useful:

HIRAGANA:

	A	I	U	E	O
-	あ	い	う	え	お
K-	か	き	く	け	こ
S-	さ	し	す	せ	そ
T-	た	ち	つ	て	と
N-	な	に	ぬ	ね	の
H-	は	ひ	ふ	へ	ほ
M-	ま	み	む	め	も
Y-	や	--	ゆ	--	よ
R-	ら	り	る	れ	ろ
W-	わ	ゐ	--	ゑ	を
-n	ん	--	--	--	--

KATAKANA:

	A	I	U	E	O
-	ア	イ	ウ	エ	オ
K-	カ	キ	ク	ケ	コ
S-	サ	シ	ス	セ	ソ
T-	タ	チ	ツ	テ	ト
N-	ナ	ニ	ヌ	ネ	ノ
H-	ハ	ヒ	フ	ヘ	ホ
M-	マ	ミ	ム	メ	モ
Y-	ヤ	--	ユ	--	ヨ
R-	ラ	リ	ル	レ	ロ
W-	ワ	ヰ	--	ヱ	ヲ
-n	ン	--	--	--	--

HIRAGANA/KATAKANA:

	A	I	U	E	O
-	あ : ア	い : イ	う : ウ	え : エ	お : オ
K-	か : カ	き : キ	く : ク	け : ケ	こ : コ
S-	さ : サ	し : シ	す : ス	せ : セ	そ : ソ
T-	た : タ	ち : チ	つ : ツ	て : テ	と : ト
N-	な : ナ	に : ニ	ぬ : ヌ	ね : ネ	の : ノ
H-	は : ハ	ひ : ヒ	ふ : フ	へ : ヘ	ほ : ホ
M-	ま : マ	み : ミ	む : ム	め : メ	も : モ
Y-	や : ヤ	-- : --	ゆ : ユ	-- : --	よ : ヨ
R-	ら : ラ	り : リ	る : ル	れ : レ	ろ : ロ
W-	わ : ワ	ゐ : ヰ	-- : --	ゑ : ヱ	を : ヲ
-n	ん : ン	-- : --	-- : --	-- : --	-- : --
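The one-to-one correspondence between the two tables is also visible in how computers store the characters. In Unicode, each basic Katakana character sits exactly 0x60 code points after its Hiragana partner, so a conversion can be sketched as below. This is only an illustration: the function names are my own, and the ranges cover the basic characters in the tables, not every rare symbol.

```python
# Basic Hiragana occupy U+3041..U+3096; the matching Katakana are 0x60 higher.
HIRAGANA_START, HIRAGANA_END = 0x3041, 0x3096
OFFSET = 0x60


def to_katakana(text):
    """Replace each basic Hiragana character by its Katakana partner."""
    out = []
    for ch in text:
        cp = ord(ch)
        if HIRAGANA_START <= cp <= HIRAGANA_END:
            out.append(chr(cp + OFFSET))
        else:
            out.append(ch)  # prolongation mark, Kanji, Latin letters, etc.
    return "".join(out)


def to_hiragana(text):
    """Replace each basic Katakana character by its Hiragana partner."""
    out = []
    for ch in text:
        cp = ord(ch)
        if HIRAGANA_START + OFFSET <= cp <= HIRAGANA_END + OFFSET:
            out.append(chr(cp - OFFSET))
        else:
            out.append(ch)
    return "".join(out)


print(to_katakana("らーめん"))
print(to_hiragana("シンガポール"))
```

The prolongation mark ー is shared by both scripts, so it passes through unchanged.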

There are 5 columns and 11 rows in each table. Since 7 entries are missing, we have a total of 48 characters in each table.
The first row has five vowels.
In Hiragana, A= あ , I= い , U= う , E= え , O= お
or in Katakana, A= ア , I= イ , U= ウ , E= エ , O= オ .
Please look up an internet source for how to pronounce them; suffice it to say that these vowels are always short.
HINT: it is good to memorize them, including their ordering (A,I,U,E,O).
The last row (like the first) is an anomaly. The remaining 9 rows contain SYLLABLES (= consonant + vowel).
The vowel is constant in each column (in particular, it is inherited from the vowel in the first row). So we can label the columns as A-column, I-column, U-column, E-column and O-column.
Similarly, the consonant is constant for each row. So we can label each of the 9 rows by the corresponding consonant. These consonants are (K,S,T,N,H,M,Y,R,W).
E.g., row 2 is the K-row with (Ka, Ki, Ku, Ke, Ko).
The Y-row has only three entries (Ya, Yu, Yo) but no Yi and Ye.
The W-row has four entries (Wa, Wi, We, Wo) but no Wu. (NOTE: Wi and We are obsolete in modern Japanese, so in practice you will only meet Wa and Wo.)
The last row for [-n] is special because this is a terminal N sound, but never the initial N sound.
The T-row has somewhat irregular sounds:

	A	I	U	E	O
(H)	た	ち	つ	て	と
(K)	タ	チ	ツ	テ	ト
Sounds	[Ta]	[Chi]	[Tsu]	[Te]	[To]

A similar irregularity in the H-row is that the sound for ふ (H) or フ (K) is [Fu] and not [Hu].
In fact, it seems that the H-row could also be called the F-row. With TEN-TEN modifier marks (see below), you can also turn it into a B-row or a P-row!! So the H-row is highly flexible.
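The bookkeeping above (5 columns, 11 rows, 7 gaps, 48 characters) and the row/column labelling can be checked with a short sketch. The dictionary simply retypes the Hiragana table; the names ROWS, IRREGULAR and sound are my own, and only the three irregular sounds mentioned in these notes ([Chi], [Tsu], [Fu]) are special-cased.

```python
VOWELS = "AIUEO"

# One string per row of the Hiragana table; "-" marks a missing entry.
ROWS = {
    "":  "あいうえお",
    "K": "かきくけこ",
    "S": "さしすせそ",
    "T": "たちつてと",
    "N": "なにぬねの",
    "H": "はひふへほ",
    "M": "まみむめも",
    "Y": "や-ゆ-よ",
    "R": "らりるれろ",
    "W": "わゐ-ゑを",
    "n": "ん----",
}

# Irregular sounds noted in the text; everything else is consonant + vowel.
IRREGULAR = {"ち": "Chi", "つ": "Tsu", "ふ": "Fu"}


def count_characters(rows):
    """Count the table cells that actually hold a character."""
    return sum(1 for row in rows.values() for ch in row if ch != "-")


def sound(row_label, vowel):
    """Return the sound label for a cell, or None if the cell is empty."""
    ch = ROWS[row_label][VOWELS.index(vowel)]
    if ch == "-":
        return None
    if row_label == "n":
        return "-n"
    return IRREGULAR.get(ch, row_label + vowel.lower())


print(len(ROWS) * len(VOWELS), count_characters(ROWS))
print(sound("K", "U"), sound("T", "I"), sound("H", "U"), sound("Y", "I"))
```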

Modifiers of Basic Character Sets

Each symbol in Hiragana and Katakana is intermediate between a ''letter'' (as in the English alphabet) and a ''full character'' (as in Chinese script), but I will call them ''characters'' and not ''letters''. Japanese characters are more atomic (indivisible) than Chinese characters, which have internal structure. Like atoms, Japanese characters have isotopes indicated by modifiers, and they can combine with certain affinities.
If you were looking for certain consonants ([G], [Z], [D], [B] and [P]), you would not find them in any row! These are obtained by modifying related consonants using accent marks, known as TEN-TEN marks. There are two versions: the ''double quote'' mark ( ゛) (the VOICED modifier, or dakuten) and the ''degree'' mark ( ゜) (the SEMI-VOICED modifier, or handakuten). Strictly speaking, only the double quote mark is called TEN-TEN; the degree mark is called MARU.

The double quote mark converts certain consonants to their "voiced form". E.g., in Katakana, Tu (or Tsu) ツ becomes Du ヅ . The actions of the TEN-TEN marks in Katakana and Hiragana are parallel: E.g., in Hiragana, Tu (or Tsu) つ becomes Du づ . Here is the table of these transformations.

Transformation	A	I	U	E	O
K ゛ yields G	[Ga]: ガ (K) が (H)	[Gi]: ギ (K) ぎ (H)	[Gu]: グ (K) ぐ (H)	[Ge]: ゲ (K) げ (H)	[Go]: ゴ (K) ご (H)
S ゛ yields Z	[Za]: ザ (K) ざ (H)	[Zi]: ジ (K) じ (H)	[Zu]: ズ (K) ず (H)	[Ze]: ゼ (K) ぜ (H)	[Zo]: ゾ (K) ぞ (H)
T ゛ yields D	[Da]: ダ (K) だ (H)	[Di]: ヂ (K) ぢ (H)	[Du]: ヅ (K) づ (H)	[De]: デ (K) で (H)	[Do]: ド (K) ど (H)
H ゛ yields B	[Ba]: バ (K) ば (H)	[Bi]: ビ (K) び (H)	[Bu]: ブ (K) ぶ (H)	[Be]: ベ (K) べ (H)	[Bo]: ボ (K) ぼ (H)
H ゜ yields P	[Pa]: パ (K) ぱ (H)	[Pi]: ピ (K) ぴ (H)	[Pu]: プ (K) ぷ (H)	[Pe]: ペ (K) ぺ (H)	[Po]: ポ (K) ぽ (H)
W ゛ yields V	[Va]: ヷ (K) -- (H)	[Vi]: ヸ (K) -- (H)	[Vu]: ヴ (K) ゔ (H)	[Ve]: ヹ (K) -- (H)	[Vo]: ヺ (K) -- (H)

Two remarks about this table. First, the semi-voiced transformation (using ゜) is only applied to the H-row. Second, the transformation in the last row (W/V) is very irregular: all but one voiced character are missing in Hiragana -- apparently they are not needed for native Japanese sounds.
Why is [D-] the voiced form of [T-]? If you place your fingers on your vocal cords as you say [Da] or [Di], you can feel the vocal cords vibrating (voiced). No such vibrations occur with [Ta] or [Ti] (unvoiced). But if you place your palm in front of your mouth for the unvoiced [T-], you will feel a burst of air; for this reason, these unvoiced consonants are also described as aspirated. Note that [K] or [T] sounds are less aspirated in Japanese than in English.
Besides the aspirated/voiced pair (T/D), the other pairs are (K/G), (S/Z), (H/B), (W/V). The (H/B) pair seems a bit of an anomaly in this classification. Moreover, we have (H/P) as the semi-voiced variant.
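The idea that a TEN-TEN mark is a modifier applied to a base character is mirrored in Unicode, which has combining forms of both marks (U+3099 for the voiced mark, U+309A for the semi-voiced mark). The sketch below uses Python's standard unicodedata module to attach a mark and compose the result into a single character; the helper names voiced and semi_voiced are my own.

```python
import unicodedata

COMBINING_VOICED = "\u3099"       # combining form of the double quote mark
COMBINING_SEMI_VOICED = "\u309A"  # combining form of the degree mark


def voiced(kana):
    """Attach the voiced mark and compose into one character if one exists."""
    return unicodedata.normalize("NFC", kana + COMBINING_VOICED)


def semi_voiced(kana):
    """Attach the semi-voiced mark and compose if a single character exists."""
    return unicodedata.normalize("NFC", kana + COMBINING_SEMI_VOICED)


for base in "かさたはカサタハ":
    print(base, "->", voiced(base))
for base in "はひふへほ":
    print(base, "->", semi_voiced(base))
```

For a base character that has no voiced form in the table (say な), the result stays as two code points, which is one way to see that the transformation only applies to certain rows.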
Combination of these syllables is possible in a very limited form: an I-column character (Ki, Shi, Chi, Ni, Hi, Mi, Ri) can be followed by a small Y-row character (ゃ, ゅ, ょ), and the pair is read as a single syllable.
E.g., き followed by small ゃ is きゃ [Kya]; with a full-size や, きや is read as two syllables [Ki Ya].
Interestingly, the vowels are also written in two sizes: large and small!
LARGE: あいうえお
SMALL: ぁぃぅぇぉ
A small vowel does not stand on its own; its role is to modify the sound of the preceding character.
Both Hiragana and Katakana share a common prolongation sound mark, which appears in two widths: ー (full-width) or ｰ (half-width). See our SEAHAWK example above.
For the opposite effect to prolongation, the small characters っ [Tu] in Hiragana, and ッ [Tu] in Katakana, are used to clip the preceding vowel: they signal that the consonant that follows is doubled. (See the example of my name below.)

Some Examples

- Let us say "Thank You!" (arigatou gozaimashita). This is written in Hiragana since it is a pure Japanese expression, not a foreign one:
  
  ありがとうございました
  
  [A] [Ri] [Ga] [To] [U] [Go] [Za] [I] [Ma] [Shi] [Ta]
- Another useful expression is "Doumo" (literally ''Very''):
  
  どうも
  
  [Do] [U] [Mo]
  
  This expression can mean "Thank you", "You are Welcome" or "Goodbye". But unlike the other "Thank you" [A-Ri-Ga-To], this one can only be a response, not the initial "Thank you".
- A sign in front of an eating establishment says ラーメン (K) [Ra] [--] [Me] [-n]. So they specialize in Japanese noodles (ramen). You might also see it in Hiragana: らーめん (H) [Ra] [--] [Me] [-n].
- Some city names: TOKYO: とうきょう (H) [To] [U] [Kyo] [U], where the small ょ combines with き [Ki] to give [Kyo]. SINGAPORE: シンガポール (K) [Si] [-n] [Ga] [Po] [--] [Ru].
- My name (Chee Yap) in Hiragana is ちーやっぷ [Chi--] [Ya' Pu]. In Katakana, チーヤップ [Chi--] [Ya' Pu]. Note the prolongation mark, and the degree mark (the semi-voiced modifier) that converts [Hu] to [Pu]. Somewhat unexpectedly, we also have っ and ッ (in small size!), which clip the preceding vowel in [Ya] by doubling the [P] that follows.
- Toyoko-Inn in Kobe, near the Sannomiya district. The ''Toyoko'' part is written in Kanji (of course, ''To'' is ''East''). Then ''Inn'' is イン [I] [-n] (K). In the hotel brochure, they advertised the ''Toyoko-Inn Club Card'': クラブ [Ku] [Ra] [Bu] = ''Club'' (フ [Hu] with TEN-TEN) and カード [Ka] [--] [Do] = ''Card'' (ト [To] with TEN-TEN).
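The reading procedure used in these examples (look up each character, stretch the vowel at a prolongation mark, double the next consonant after a small っ, and merge a small ゃ/ゅ/ょ into the preceding I-column character) can be sketched in a few lines. This is a rough illustration under my own assumptions: the function name romanize and the spelling choices (shi, chi, tsu, fu, with a long vowel written by repeating the vowel) are mine, and rare characters are simply passed through.

```python
VOWELS = "aiueo"
ROWS = {
    "": "あいうえお", "k": "かきくけこ", "s": "さしすせそ", "t": "たちつてと",
    "n": "なにぬねの", "h": "はひふへほ", "m": "まみむめも", "r": "らりるれろ",
    "g": "がぎぐげご", "z": "ざじずぜぞ", "d": "だぢづでど",
    "b": "ばびぶべぼ", "p": "ぱぴぷぺぽ",
}
SOUND = {kana: c + v for c, row in ROWS.items() for kana, v in zip(row, VOWELS)}
SOUND.update({"や": "ya", "ゆ": "yu", "よ": "yo", "わ": "wa", "を": "wo", "ん": "n"})
SOUND.update({"し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu"})  # irregular sounds
SMALL_Y = {"ゃ": "ya", "ゅ": "yu", "ょ": "yo"}


def romanize(text):
    """Sound out a string of Hiragana or Katakana, syllable by syllable."""
    out = []
    double_next = False
    for ch in text:
        if 0x30A1 <= ord(ch) <= 0x30F6:      # Katakana -> matching Hiragana
            ch = chr(ord(ch) - 0x60)
        if ch == "ー":                        # prolongation: repeat the last vowel
            if out and out[-1][-1] in VOWELS:
                out.append(out[-1][-1])
            continue
        if ch == "っ":                        # small tsu: double the next consonant
            double_next = True
            continue
        if ch in SMALL_Y and out and out[-1].endswith("i"):
            base = out.pop()[:-1]             # ki -> k, shi -> sh
            tail = SMALL_Y[ch][1:] if base in ("sh", "ch") else SMALL_Y[ch]
            out.append(base + tail)           # k + ya -> kya, sh + a -> sha
            continue
        syllable = SOUND.get(ch, ch)          # unknown characters pass through
        if double_next:
            syllable = syllable[0] + syllable
            double_next = False
        out.append(syllable)
    return "".join(out)


for word in ["ラーメン", "とうきょう", "シンガポール", "チーヤップ", "クラブ", "カード"]:
    print(word, romanize(word))
```

Sounding out クラブ as "kurabu" and カード as "kaado" is exactly the step where you recognize the English words "club" and "card".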

Confusing Pairs

Until you master the full range of Hiragana and Katakana characters, you will often be unsure if you have read a particular character correctly.
There are several sources for this confusion. First, a Hiragana (H) character might look the same as a Katakana (K) character. For instance, some renditions of [Ri] (H), り, might look almost the same as [Ri] (K), リ. E.g., my printer renders them both as a short vertical stroke and a long vertical curved stroke. The only difference is that the short stroke in Hiragana has a small hook. (But in my Firefox browser, the short and long strokes of [Ri] (H) are joined, as one might expect in brush writing. In this case there is little confusion.) Second, Hiragana, when heavily stylized as brush strokes, can be pretty hard to recognize. Finally, if you write your own script, you will likely misbalance the relative lengths of various strokes (an important feature in Chinese calligraphy), leading to confusion. Here are some examples to watch out for (a good idea is to learn them in pairs):
- わ [Wa] (H) versus ち [Chi] (H). The non-expert will write both horizontal strokes about the same length, making them similar.
- き [Ki] (H) versus ま [Ma] (H). The tops of both characters are the same, and if you do not pay attention to the bottom, you confuse them.
- あ [A] (H) versus お [O] (H).
- く [Ku] (H) versus へ [He] (H).
- さ [Sa] (H) versus ち [Chi] (H).
- こ [Ko] (H) versus ニ [Ni] (K). [Ko] is essentially two horizontal strokes, but the two hook flourishes that are almost joined are important! In handwriting, it is hard to write these hooks without connecting them. So my [Ko] becomes two plain horizontal strokes, like the Chinese character for "two". But this is the Katakana [Ni].
- い [I] (H) versus リ [Ri] (K). Again, it is your handwriting that might trip you up.
- り [Ri] (H) versus リ [Ri] (K). Compare to the previous pair. As noted above, this pair may not look confusing, depending on the rendition of the Hiragana script. Moreover, confusion in this instance is anyhow harmless.
- ソ [So] (K) versus ン [-n] (K). Note the difference in the direction of the two strokes: [So] is a down stroke, but [-n] is an up stroke.
- シ [Si] (K) versus ツ [Tu] (K). Note the difference in the direction of the two strokes: [Si] is an up stroke, but [Tu] is a down stroke.
- フ [Hu] (K) versus ワ [Wa] (K).
- る [Ru] (H) versus ろ [Ro] (H) versus ゐ [Wi] (H).
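When the text is on a computer rather than on a sign, there is an easy way to settle which member of a confusing pair you are looking at: ask for the character's official Unicode name. The sketch below uses Python's standard unicodedata module; the list of pairs is copied from the examples above, and the helper name describe is my own.

```python
import unicodedata

# Confusing pairs taken from the list above.
PAIRS = [("ソ", "ン"), ("シ", "ツ"), ("フ", "ワ"), ("こ", "ニ"), ("り", "リ")]


def describe(ch):
    """Return the code point and the official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for a, b in PAIRS:
    print(f"{a}  {describe(a)}   versus   {b}  {describe(b)}")
```

The names follow the same plain row/column labelling as the tables, e.g. SI and TU rather than Shi and Tsu.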

Familiar Looking Characters

These might look familiar and cause a double-take:
- ロ [Ro] (K) looks like the Chinese character for ''mouth''. Since Japanese script is intermixed with Chinese characters, this might lead to some confusion.
- メ [Me] (K) looks like an X.
- ニ [Ni] (K), に [Ni] (H) and こ [Ko] (H) all recall the Chinese character for the number ''2''.

Why Kanji?

Kanji is always found interspersed with the other two scripts. But within our limited scope, our main goal is NOT to learn Kanji, but to recognize when a character is Kanji, so that we can ignore it!
Fortunately, Kanji is usually easy to spot because it generally looks more complicated than Hiragana/Katakana. That is because Kanji is a compound script where each character is made up of a small number of parts that resemble Hiragana characters. [SOME EXAMPLES?]
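For text on a screen, the three scripts can also be told apart mechanically, because each lives in its own Unicode block. The following is a minimal sketch: the function names script_of and runs are my own, and the Kanji test only covers the main CJK block (U+4E00 to U+9FFF), which is enough for common characters but not for rare ones.

```python
from itertools import groupby


def script_of(ch):
    """Classify one character by the Unicode block it falls in."""
    cp = ord(ch)
    if 0x3041 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:   # includes the prolongation mark U+30FC
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:   # main CJK block only (an assumption)
        return "kanji"
    return "other"


def runs(text):
    """Split a sign into consecutive runs of the same script."""
    return [(script, "".join(group)) for script, group in groupby(text, key=script_of)]


print(runs("東横イン"))
print(script_of("ロ"), script_of("口"))
```

The last line shows that the Katakana ロ [Ro] and the look-alike Kanji for ''mouth'' are different characters to a computer, even though the eye cannot separate them.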
Some basic words are almost surely written in Kanji. For instance, the word ''East'', which is pronounced [TO] (as in Tokyo), would almost surely be written in Kanji (東). Recognizing this character will help you recognize many names of places (like Tokyo).
Hence, let us venture a little beyond Hiragana/Katakana and try to pick up a few basic Kanji characters.
But what should we learn? One suggestion: learn all Kanji characters with at most 4 or 5 strokes. Let f(n) be the census function, counting the number of Kanji characters with exactly n strokes. We know that f(1)=1. It is probably easy to determine f(2), f(3) and f(4). Below, we will be less systematic and name some common words: some n will be larger than 4 (e.g., "East" (東) is counted in f(8)).
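To make the census function concrete, here is a small sketch that tabulates f(n) over a sample only, namely the Kanji that appear in these notes. The stroke counts are entered by hand (they are my own tabulation, not taken from a dictionary file), so the resulting numbers say nothing about f(n) for Kanji as a whole.

```python
from collections import Counter

# Hand-entered stroke counts for Kanji mentioned in these notes (a sample only).
STROKES = {
    "一": 1,
    "二": 2, "十": 2, "人": 2,
    "三": 3, "口": 3, "大": 3, "小": 3, "山": 3, "川": 3, "女": 3, "子": 3,
    "日": 4, "月": 4, "水": 4, "火": 4, "手": 4, "王": 4,
    "田": 5, "出": 5, "右": 5, "左": 5, "石": 5, "北": 5, "目": 5, "本": 5,
    "西": 6, "耳": 6,
    "足": 7,
    "東": 8, "金": 8, "雨": 8,
    "南": 9,
    "新": 13,
}


def census(strokes):
    """Return a mapping n -> number of sample characters with exactly n strokes."""
    return Counter(strokes.values())


f = census(STROKES)
for n in sorted(f):
    print(n, f[n])

# Characters in the sample with at most 4 strokes, the suggested starting set.
print("".join(k for k, n in STROKES.items() if n <= 4))
```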
For instance, the word for ''Exit'' which you see over public exits is always in Kanji: 出口. The second character (口 or [Ku]) is indistinguishable from the Katakana character ロ [Ro]; only context can tell us which is intended. The character literally means ''mouth'', but also stands for a ''door'' or ''entrance''. You will often see the Kanji for ''East Gate'' (東口) at transportation hubs.
If Kanji writing is hard, its pronunciation will give you more headaches: it turns out that each Kanji word has two pronunciations or readings! Consider the Kanji ''Mountain'' (山). I will write the two readings of ''Mountain'' as the pair ''[SAN]/[yama]''.
The first reading [SAN] is called on'yomi (Chinese reading) and is usually a single syllable, usually derived from the original Chinese sound. The second reading [yama] is called kun'yomi (Japanese reading); it is usually multi-syllable, and represents the native Japanese pronunciation of the word. Many words have only one on'yomi and several kun'yomi. But having no kun'yomi, or more than one on'yomi, is also possible. See http://en.wikipedia.org/wiki/Kanji for more information.
Incidentally, the ''original'' Chinese sound (it seems to me) is often related to ancient Chinese sounds from the Tang dynasty (7-9th Century AD). Such sounds seem to have been better preserved in Southern dialects such as Fukien than in Mandarin.

Basic Kanji Words:

Kanji ''Big'' (大) [DAI]/[ou(kii)] is used a lot. Universities are ''Big'' Schools. ''Small'' (小) [SHOU]/[ko] is also common.
Kanji ''Sun'' or ''Day'' (日) [NICHI],[JITSU]/[hi] is very important -- the name ''Japan'' (日本) uses this character, as it is the land of the rising sun. Other heavenly terms include: ''Moon'' or ''Month'' (月) [GETSU]/[tsuki], and ''Sky'' or ''Heaven'' (天).
Kanji ''New'' (新) [SHIN] is a common word in names of places! E.g., Shin-Kobe and Shin-Osaka are districts of Kobe city and Osaka city, but both are "new" parts of the cities.
Kanji ''Water'' (水) [SUI]/[mizu]. Compare to Kanji ''River'' (川) [SEN]/[kawa]. But ''Rain'' is 雨 [U]/[ame].
Kanji ''Fire'' (火) [KA]/[hi] is a contrast to water.
Kanji for the first 10 numerals is important. Let us do the simplest: ''One'' (一) [ICHI]/[hito], ''Two'' (二) [NI]/[futa], ''Three'' (三) [SAN]/[mi(tsu)], ''Ten'' (十) [JUU]/[tou].
Kanji ''Gold'' (金) [KIN].
Kanji ''Right'' (右) [YUU],[U]/[migi] while ''Left'' is (左) [SA]/[hidari]. (Incidentally, do not confuse 右 with ''Stone'' (石) [SEKI],[KOKU],[SHAKU]/[ishi].)
Kanji for the cardinal directions. ''East'' is most important, but the other three cardinal directions are also common: North (北), South (南), West (西).
Kanji ''Person'' (人) [NIN],[JIN]/[hito]. ''Woman'' or ''Girl'' is 女 [JO],[NYO]/[onna]. Kanji ''Child'' (子) [SHI]/[ko] is also used as a diminutive ending in women's names.
Some body parts: we already saw ''Mouth''. Kanji ''Hand'' (手) [SHU]/[te], ''Foot'' (足) [SOKU]/[ashi], ''Ear'' (耳) [JI]/[mimi], ''Eye'' (目) [MOKU]/[me].
Kanji ''Rice field'' (田) [DEN]/[ta]. Names of places often have this character.
Kanji ''King'' (王).
Kanji ''Yen'' (円) [EN] (monetary unit) is seen in price lists, etc.

Resources

For html codes for Hiragana Characters, see the useful Penn State site: http://tlt.its.psu.edu/suggestions/international/bylanguage/japanesecharthiragana.html. For Katakana Characters, see http://tlt.its.psu.edu/suggestions/international/bylanguage/japanesechartkatakana.html. A source for Kanji characters is http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml. Look also in http://www.scholiast.org/kanji-converter1.html, from which I have borrowed pronunciations.
It is easy to lament the wastefulness of having two parallel phonetic scripts, especially while you are learning the system! But once you get used to it, it provides a rich ecosystem for expressing subtleties that would be lost in a purer and more "designed" system. While a purely phonetic system has great appeal, there is no pure phonetic writing system. Perhaps Korean Hangul comes closest to it. English or French would be terribly hard to understand if you clean up their writing system! Here is a thought experiment. I imagine the total confusion when Hangul was first proposed to replace Chinese script in Korea -- just a babel of sounds, and totally devoid of the rich associations and subtleties of Chinese characters. Why? Written texts usually come with much less context information than (the performance of) spoken words. This deprived context can be somewhat compensated by redundancies in the writing system. Instead of redundancies, think of them as opportunities to introduce contextual information.
