Reading Basic Japanese Signs
- Hiragana and Katakana Character Sets
- Modifiers of Basic Character Sets
- Some Examples
- Confusing Pairs
On a recent (2009) trip to Japan,
I decided it would be useful to pick up the basics of Japanese scripts.
NOT to speak, but in order to read!
NOT to read for understanding (which requires some grammar and vocabulary),
but just to read signs (hotel names, street signs, etc),
and to make connections to words
that I already know in English.
Most introductory Japanese websites or books aim to teach you
basic speaking, or simple reading for understanding.
But what I need is a rather specialized subset of that information.
If you have a limited goal like mine,
you might find the following notes useful.
If we just want to "recognize"
signs, we just have to memorize the symbols.
Why care about pronunciation?
Here are some reasons:
you may need to sound it out when asking directions.
It would be nice to recognize it when you hear your
tour guide say the word. Most importantly, we are
often interested in recognizing those words that are non-Japanese
(e.g., "terminal", "club") or Japanese words
that have entered the English vocabulary (e.g., ramen).
In this case, they sound like their English counterparts,
and you could recognize them by their pronunciation.
Of course, our need for pronunciation
is less critical than that of those who must speak the language
(in order to be understood).
You might think that in major cities, there would
be enough information or signs in English, so it is unnecessary
to read Japanese signs.
Unfortunately, most maps and signage in Japan are
in Japanese, with only a smattering of English in some places.
Here is an example. On my trip to Fukuoka to attend
ASCM (Asian Symposium on Computer Mathematics, Dec 2009),
I was looking for the JAL SEAHAWK HOTEL, where I had stayed in 2003.
This is a landmark (35 storey) hotel, next to
the huge Yahoo Stadium, home of Fukuoka's baseball team (the Hawks).
But I could not find an online map with this English name on the map.
(Someone put a marker in Google maps for this hotel,
but it was located in the sea, and
quite far from the actual location.)
Here is the name in Japanese:
( し ほ く )
[Si Ho Ku]
Note that this is spelled out using Hiragana script
(even though most foreign words are in Katakana script).
Perhaps you might find this alternative rendition:
( し ー ほ ー く )
This rendition simply has the ''prolongation'' symbol [--] following
the first two syllables; it is
an aid to pronouncing the name more accurately. The last
syllable [Ku] remains short. Its vowel is, of course, superfluous in English.
But in Japanese, each consonant is tied to a vowel,
and [Ku] is thus the stand-in for the letter K in English.
The upshot is that SEAHAWK in Japanese is a 3 syllable word,
unlike the 2 syllables of English.
Fortunately, Japanese vowels are always short, so you probably
hear the terminal syllable [Ku] as [K']. That is also the reason
why we need to lengthen the sound of [Si] and [Ho] using
the prolongation mark ''ー'' above.
Remarkably, this prolongation mark ー can be rotated 90 degrees
to become a vertical bar when the Japanese script is written vertically.
Japanese script, like Chinese and Korean scripts, is equally at home
in vertical or horizontal modes. (Interestingly, Chinese and Korean scripts are
even at home whether you write them from left-to-right or right-to-left;
unfortunately, Japanese script is not so endowed.)
In general, terminal consonants in foreign words
represent a standard problem in transcribing
sounds into Japanese --- every terminal consonant
will introduce a superfluous terminal syllable.
E.g., FORK in English must be transcribed as [Fo Ku]
or, with prolongation, [Fo--Ku]. But there is some indeterminacy
here: if we want a terminal [K] sound, which of
the characters in the K-row ([Ka], [Ki], [Ku], etc) should you use?
In practice, it seems that [Ku] and [Ko] are preferred.
For the purposes of reading signs, the Katakana script is perhaps slightly
more important than the Hiragana script. But the two systems
are parallel, so for good measure, we throw in Hiragana as well.
Actually, you will need to recognize
Hiragana characters, if only to ignore them!
It is true that the curvy script of Hiragana
usually distinguishes it from Katakana.
But until you master the scripts,
it is sometimes hard to tell whether you are reading
Katakana or Hiragana. In that case, you want a
table of both alphabets (see below) for comparison.
- There are 3 written scripts in Japanese: Hiragana, Katakana and Kanji.
- HIRAGANA, the "curvy script", is considered the simplest,
and is phonetic.
Its curvy nature comes from brush strokes. It helps to
know a bit about the dynamics of brush writing,
as in Chinese calligraphy.
Brush strokes impart some
tell-tale characteristics on typography:
(1) curvature on what are essentially straight strokes,
(2) hook flourishes at the end of a stroke,
(3) traces of connection between different strokes, and
(4) strokes that show a direction (if I may use
geometric terminology, brush strokes are
''directed segments'', not ''undirected'' ones).
All these are reflected in the typographic
design of the characters. Unless you know calligraphy,
you might miss these characteristics or get them wrong
in your own writing.
- The next script, KATAKANA, is also phonetic
and is useful for foreign words.
This is slightly easier to read than Hiragana because it
has more "angular strokes" which are not
derived from brush strokes (you still
see some characteristic curves).
- The last, KANJI, derived from Chinese characters,
is the hardest and is non-phonetic. Of course, if you know
Chinese characters already, this would be the easiest.
We will see that it is useful to
learn some Kanji too ( see below ).
Hiragana and Katakana Character Sets
These two character sets
have a one-one correspondence, and hence it is
useful to see them presented side-by-side.
The third table below is a merger of the first two,
and strictly speaking, redundant.
Still, it is useful:
|| A || I || U || E || O |
|| あ || い || う || え || お |
|| か || き || く || け || こ |
|| さ || し || す || せ || そ |
|| た || ち || つ || て || と |
|| な || に || ぬ || ね || の |
|| は || ひ || ふ || へ || ほ |
|| ま || み || む || め || も |
|| や || -- || ゆ || -- || よ |
|| ら || り || る || れ || ろ |
|| わ || ゐ || -- || ゑ || を |
|| ん || -- || -- || -- || -- |
|| A || I || U || E || O |
|| ア || イ || ウ || エ || オ |
|| カ || キ || ク || ケ || コ |
|| サ || シ || ス || セ || ソ |
|| タ || チ || ツ || テ || ト |
|| ナ || ニ || ヌ || ネ || ノ |
|| ハ || ヒ || フ || ヘ || ホ |
|| マ || ミ || ム || メ || モ |
|| ヤ || -- || ユ || -- || ヨ |
|| ラ || リ || ル || レ || ロ |
|| ワ || ヰ || -- || ヱ || ヲ |
|| ン || -- || -- || -- || -- |
|| A || I || U || E || O |
|| あ : ア || い : イ || う : ウ || え : エ || お : オ |
|| か : カ || き : キ || く : ク || け : ケ || こ : コ |
|| さ : サ || し : シ || す : ス || せ : セ || そ : ソ |
|| た : タ || ち : チ || つ : ツ || て : テ || と : ト |
|| な : ナ || に : ニ || ぬ : ヌ || ね : ネ || の : ノ |
|| は : ハ || ひ : ヒ || ふ : フ || へ : ヘ || ほ : ホ |
|| ま : マ || み : ミ || む : ム || め : メ || も : モ |
|| や : ヤ || -- : -- || ゆ : ユ || -- : -- || よ : ヨ |
|| ら : ラ || り : リ || る : ル || れ : レ || ろ : ロ |
|| わ : ワ || ゐ : ヰ || -- : -- || ゑ : ヱ || を : ヲ |
|| ん : ン || -- : -- || -- : -- || -- : -- || -- : -- |
There are 5 columns and 11 rows in each table.
Since 7 entries are missing, we have a total of
48 characters in each table.
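The one-one correspondence between the two tables is mirrored in Unicode: each basic Katakana code point sits exactly 0x60 above its Hiragana partner. Here is a minimal Python sketch of that idea. The function name `hiragana_to_katakana` is my own invention for illustration, not part of any library.

```python
# Basic Hiragana occupies U+3041..U+3096; the matching Katakana
# characters are found 0x60 code points later.
HIRAGANA_FIRST, HIRAGANA_LAST = 0x3041, 0x3096
KANA_OFFSET = 0x60

def hiragana_to_katakana(text: str) -> str:
    """Replace each Hiragana character by its Katakana partner."""
    out = []
    for ch in text:
        if HIRAGANA_FIRST <= ord(ch) <= HIRAGANA_LAST:
            out.append(chr(ord(ch) + KANA_OFFSET))
        else:
            out.append(ch)  # anything else (e.g. the prolongation mark) is kept
    return "".join(out)

print(hiragana_to_katakana("あいうえお"))  # アイウエオ
print(hiragana_to_katakana("らーめん"))    # ラーメン
```

Going the other way is just a subtraction of the same offset, which is one more reason to learn the two scripts side by side.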
The first row has five vowels.
A= あ , I= い , U= う , E= え , O= お
or in Katakana,
A= ア , I= イ , U= ウ , E= エ , O= オ
Please look up some internet source for how to pronounce them;
suffice it to say that these vowels are always short.
HINT: it is good to memorize them, including their ordering (A,I,U,E,O).
The last row (like the first) is an anomaly.
The remaining 9 rows contain SYLLABLES (= consonant + vowel).
The vowel is constant in each column (in particular,
it is inherited from the vowel in the first row.)
So we can label the columns as A-column, I-column, U-column,
E-column and O-column.
Similarly, the consonant is constant for each row.
So we can label each of the 9 rows by the corresponding consonant.
These consonants are (K,S,T,N,H,M,Y,R,W).
E.g., row 2 is the K-row with (Ka, Ki, Ku, Ke, Ko).
The Y-row has only three entries (Ya, Yu, Yo)
but no Yi and Ye.
The W-row has four entries in the tables above (Wa, Wi, We, Wo)
but no Wu.
(NOTE: Wi and We are obsolete, so they are often omitted as well!)
The last row for [-n] is special because this is a
terminal N sound, but never the initial N sound.
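The row and column labels can be turned into a small lookup grid. The following Python sketch copies the Hiragana table above row by row; the names `ROWS` and `lookup`, and the use of "-" for a missing entry, are my own choices for illustration.

```python
VOWELS = "AIUEO"  # column labels, in the traditional order

# One string per row, copied from the Hiragana table; "-" marks a missing entry.
ROWS = {
    "":   "あいうえお",  # the vowel row (no consonant)
    "K":  "かきくけこ",
    "S":  "さしすせそ",
    "T":  "たちつてと",
    "N":  "なにぬねの",
    "H":  "はひふへほ",
    "M":  "まみむめも",
    "Y":  "や-ゆ-よ",
    "R":  "らりるれろ",
    "W":  "わゐ-ゑを",
    "-n": "ん----",      # the terminal N, alone in the last row
}

def lookup(row: str, column: str):
    """Return the character in the given row and column, or None if missing."""
    ch = ROWS[row][VOWELS.index(column)]
    return None if ch == "-" else ch

# Count the entries that are actually present in the 11 x 5 grid.
total = sum(ch != "-" for row in ROWS.values() for ch in row)
print(lookup("K", "U"), lookup("Y", "I"), total)  # く None 48
```

The count confirms the arithmetic above: 55 cells minus 7 missing entries leaves 48 characters.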
The T-row has somewhat irregular sounds:
| || A || I || U || E || O |
| (H) || た || ち || つ || て || と |
| (K) || タ || チ || ツ || テ || ト |
| Sounds || [Ta] || [Chi] || [Tsu] || [Te] || [To] |
A similar irregularity in the H-row is that the sound for
ふ (H) or フ (K) is closer to [Fu] than to [Hu].
In fact, it seems that the H-row could also be called
the F-row. With TEN-TEN modifier marks (see below), you can also turn
it into a B-row or a P-row!! So the H-row is highly flexible.
Modifiers of Basic Character Sets
Each symbol in Hiragana and Katakana is
intermediate between a ''letter'' (as in the English
alphabet) and a ''full character'' (as in Chinese script).
I will call them ''characters'' and not ''letters''.
But Japanese characters are more atomic (indivisible) than
Chinese characters, which have internal structure.
Like atoms, Japanese characters have isotopes indicated
by modifiers, and they can combine with certain affinities.
If you were looking for certain consonants ([G], [Z], [D], [B]
and [P]), you would not find them in any row! These are
obtained by modifying related consonants using accent marks,
known as TEN-TEN marks. There are two versions:
the ''double quote'' mark ゛
(the VOICED modifier)
and the ''degree'' mark ゜
(the SEMI-VOICED modifier).
The double quote mark converts certain
consonants to their "voiced form".
E.g., in Katakana, Tu (or Tsu) ツ becomes Du ヅ.
The actions of the TEN-TEN marks in Katakana and Hiragana are parallel:
E.g., in Hiragana, Tu (or Tsu) つ becomes Du づ.
Here is the table of these transformations.
| Transformation || A || I || U || E || O |
| K ゛ yields G || [Ga]: ガ (K) || [Gi]: ギ (K) || [Gu]: グ (K) || [Ge]: ゲ (K) || [Go]: ゴ (K) |
| S ゛ yields Z || [Za]: ザ (K) || [Zi]: ジ (K) || [Zu]: ズ (K) || [Ze]: ゼ (K) || [Zo]: ゾ (K) |
| T ゛ yields D || [Da]: ダ (K) || [Di]: ヂ (K) || [Du]: ヅ (K) || [De]: デ (K) || [Do]: ド (K) |
| H ゛ yields B || [Ba]: バ (K) || [Bi]: ビ (K) || [Bu]: ブ (K) || [Be]: ベ (K) || [Bo]: ボ (K) |
| H ゜ yields P || [Pa]: パ (K) || [Pi]: ピ (K) || [Pu]: プ (K) || [Pe]: ペ (K) || [Po]: ポ (K) |
| W ゛ yields V || [Va]: ヷ (K) || [Vi]: ヸ (K) || [Vu]: ヴ (K) || [Ve]: ヹ (K) || [Vo]: ヺ (K) |
- Two remarks about this table.
First, the semi-voiced transformation (using ゜) is
only applied to the H-row.
Second, the transformation in the last row (W/V) is very irregular:
all but one voiced character are missing in Hiragana
-- apparently they are not needed for native Japanese sounds.
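The idea of a base character plus a modifier mark is also how Unicode sees these characters. A minimal Python sketch, assuming we use the combining forms of the two marks (U+3099 for the ''double quote'' mark and U+309A for the ''degree'' mark, rather than the stand-alone forms shown in the table); the helper name `modify` is mine.

```python
import unicodedata

VOICED = "\u3099"       # combining form of the "double quote" mark
SEMI_VOICED = "\u309A"  # combining form of the "degree" mark

def modify(kana: str, mark: str) -> str:
    """Attach a modifier mark to one kana and compose the pair if possible."""
    return unicodedata.normalize("NFC", kana + mark)

print(modify("カ", VOICED))       # ガ  (K yields G)
print(modify("ツ", VOICED))       # ヅ  (T yields D)
print(modify("ハ", SEMI_VOICED))  # パ  (H yields P)

# The W/V row is irregular: Katakana has a single composed character,
# but in Hiragana only the U-column entry composes.
print(len(modify("ワ", VOICED)))  # 1
print(len(modify("う", VOICED)))  # 1
print(len(modify("わ", VOICED)))  # 2 (base and mark stay separate)
```

The last three lines show the second remark in action: where no single composed character exists, the base and the mark remain two code points.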
Why is [D-] the voiced form of [T-]? If you place
your fingers on your vocal cords as you say [Da] or [Di],
you can feel the vocal cords vibrating (voiced).
No such vibrations occur with [Ta] or [Ti] (unvoiced).
But if you place your palm in front of your mouth for the
unvoiced [T-], you will feel a burst of air; for this reason,
the unvoiced consonants are also called aspirated consonants.
Note that [K] or [T] sounds are less aspirated
in Japanese than in English.
Besides the aspirated/voiced pair (T/D), the
other pairs are (K/G), (S/Z), (H/B), (W/V).
The (H/B) pair seems a bit of an anomaly in this classification.
Moreover, we have (H/P) as the semi-voiced variant.
Combination of these syllables is possible in a very limited form:
the Y-row (Ya, Yu, Yo) can be combined with
the I-column (Ki, Shi, Chi, Ni, Hi, Mi, Ri).
E.g., [Ki] followed by a small [Ya] is read as one syllable, [Kya].
Interestingly, some characters are written in two sizes: large and small!
A small character does not begin a syllable of its own;
its role is to modify the preceding character.
Both Hiragana and Katakana share two common prolongation sound marks:
See our SEAHAWK example above.
For the opposite effect to prolongation, the small characters
っ [Tu] in Hiragana, and
ッ [Tu] in Katakana,
are used to shorten a vowel. (See the example of my name below.)
Let us say "Thank You!" (arigatou gozaimashita).
This is written in Hiragana since it is a pure Japanese
expression, not a foreign one:
| (H) || あ || り || が || と || う || ご || ざ || い || ま || し || た |
| Sounds || [A] || [Ri] || [Ka'] || [To] || [U] || [Ko'] || [Sa'] || [I] || [Ma] || [Shi] || [Ta] |
(Here ' marks the TEN-TEN modifier, so [Ka'] is [Ga], [Ko'] is [Go] and [Sa'] is [Za].)
Another useful expression is "Doumo" (literally ''Very''):
This expression can mean "Thank you", "You are Welcome" or "Goodbye".
But unlike the other "Thank you" [A-Ri-Ka-To], this
one can only be a response, not the initial "Thank you".
A sign in front of an eating establishment says
ラ ー メ ン
(K) [Ra] [--] [Me] [-n].
So they specialize in Japanese noodles (ramen).
You might also see it in Hiragana:
ら ー め ん
(H) [Ra] [--] [Me] [-n].
Some city names:
と う き ょ う
(H) [To] [U] [Ki] [Yo] [U], with the [Yo] written small.
シ ン ガ ポ ー ル
(K) [Si] [-n] [Ga] [Po] [--] [Ru].
My name (Chee Yap) in Hiragana is
し ー や っ ぷ [Shi--] [Ya' Pu],
and in Katakana,
シ ー ヤ ッ プ [Shi--] [Ya' Pu].
Note the prolongation mark,
and the TEN-TEN accent mark to convert [Hu] to [Pu].
Somewhat unexpectedly, we also have っ (H) and ッ (K) (note the
small size!), and they are meant to shorten the
preceding vowel in [Ya].
Toyoko-Inn in Kobe,
near the Sannomiya district. The ''Toyoko'' part is
written in Kanji (of course, ''To'' is ''East'').
Then ''Inn'' is
イ ン [I] [-n] (K).
In the hotel brochure, they advertised the ''Toyoko-Inn Club Card'':
ク ラ フ (ten ten)
[Ku] [Ra] [Hu''] = ''Club''
カ ー ト (ten ten)
[Ka] [--] [To''] = ''Card''.
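Sounding out a Katakana sign is mechanical enough to sketch in Python. This is only an illustration under my own assumptions: the names `KATAKANA_ROWS`, `VOICED_ROW` and `sound_out` are invented here, the output uses the regular row labels ([Si], [Tu], [Hu]) rather than the irregular sounds ([Shi], [Tsu], [Fu]), and small characters such as ッ are passed through untouched.

```python
import unicodedata

VOWELS = "AIUEO"
KATAKANA_ROWS = {  # copied from the Katakana table; "-" marks a missing entry
    "": "アイウエオ", "K": "カキクケコ", "S": "サシスセソ", "T": "タチツテト",
    "N": "ナニヌネノ", "H": "ハヒフヘホ", "M": "マミムメモ", "Y": "ヤ-ユ-ヨ",
    "R": "ラリルレロ", "W": "ワヰ-ヱヲ",
}
# Effect of the voiced mark on each row; "" covers ヴ, which is built on ウ.
VOICED_ROW = {"K": "G", "S": "Z", "T": "D", "H": "B", "W": "V", "": "V"}

# Map every basic character to its (consonant, vowel) labels.
SOUND = {}
for consonant, row in KATAKANA_ROWS.items():
    for vowel, kana in zip(VOWELS, row):
        if kana != "-":
            SOUND[kana] = (consonant, vowel)

def sound_out(sign: str) -> str:
    """Spell a Katakana sign in the bracket notation used in these notes."""
    out = []
    for ch in sign:
        if ch == "ー":
            out.append("[--]")  # prolongation mark
            continue
        if ch == "ン":
            out.append("[-n]")  # terminal N
            continue
        # NFD splits a modified character into its base plus the mark.
        base, *marks = unicodedata.normalize("NFD", ch)
        if base not in SOUND:
            out.append(ch)      # not a basic Katakana character: keep as is
            continue
        consonant, vowel = SOUND[base]
        if "\u3099" in marks:                       # voiced mark
            consonant = VOICED_ROW.get(consonant, consonant)
        elif "\u309A" in marks and consonant == "H":  # semi-voiced mark
            consonant = "P"
        out.append("[" + (consonant + vowel).capitalize() + "]")
    return " ".join(out)

print(sound_out("ラーメン"))      # [Ra] [--] [Me] [-n]
print(sound_out("シンガポール"))  # [Si] [-n] [Ga] [Po] [--] [Ru]
print(sound_out("クラブ"))        # [Ku] [Ra] [Bu]
```

The first two outputs agree with the ramen and Singapore readings given above; the third is ''Club'' with the TEN-TEN mark already applied.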
Until you master the full range of Hiragana
and Katakana characters, you will often be unsure if
you have read a particular character correctly.
There are several sources for this confusion.
First, a Hiragana (H) character
might look the same as a Katakana (K) character.
For instance, some renditions of
[Ri] (H), り, might look almost the same as [Ri] (K), リ.
E.g., my printer renders them both as
a short vertical stroke and a long vertical curved stroke.
The only difference is that the short stroke in Hiragana
has a small hook. (But on my Firefox browser, the
short and long strokes of [Ri] (H) are joined as one might expect in
brush writing. In this case there is little confusion.)
Second, Hiragana, when heavily stylized as brush strokes, can
be pretty hard to recognize.
Finally, if you write your own script, you will likely
misbalance the relative lengths of various strokes
(an important feature in Chinese calligraphy), leading
to further confusion.
Here are some examples to watch out for (a good idea is
to learn them in pairs):
わ [Wa] (H)
ち [Chi] (H).
The non-expert will write both horizontal strokes
about the same length, making them similar.
き [Ki] (H)
ま [Ma] (H).
The tops of both characters are the same, and if you
do not pay attention to the bottom, you will confuse them.
あ [A] (H)
お [O] (H).
く [Ku] (H)
へ [He] (H).
さ [Sa] (H)
ち [Chi] (H).
こ [Ko] (H)
ニ [Ni] (K).
[Ko] is essentially two horizontal strokes,
but with two hook flourishes that are almost joined.
In handwriting, it is hard to write these
hooks without connecting them.
So my [Ko] becomes two plain horizontal strokes,
like the Chinese character for "two". But this is
the Katakana [Ni].
い [I] (H)
リ [Ri] (K).
Again, it is your handwriting that might trip you up.
り [Ri] (H)
リ [Ri] (K).
Compare to the previous pair.
As noted above, this pair may not look confusing,
depending on the rendition
of the Hiragana script. Moreover, confusion
in this instance is anyhow harmless.
ソ [So] (K)
ン [-n] (K).
Note the difference in the direction of the two strokes:
[So] is a down stroke, but [-n] is an up stroke.
シ [Si] (K)
ツ [Tu] (K).
Note the difference in the direction of the two strokes:
[Si] is an up stroke, but [Tu] is a down stroke.
フ [Hu] (K)
ワ [Wa] (K).
る [Ru] (H)
ろ [Ro] (H)
ゐ [Wi] (H).
Familiar Looking Characters
These might look familiar and cause a double-take:
ロ [Ro] (K)
looks like the Chinese character for ''mouth''.
Since Japanese script is intermixed with Chinese
characters, this might lead to some confusion.
メ [Me] (K)
looks like an X.
ニ [Ni] (K),
に [Ni] (H)
こ [Ko] (H)
all recall the Chinese character for the number two (二).
Kanji is always found interspersed with the other two scripts.
But within our limited scope, our main goal is NOT to learn Kanji,
but to recognize
when a character is Kanji, so that we can ignore it!
Fortunately, Kanji is usually easy to spot because
it generally looks more complicated
than Hiragana/Katakana. That is because Kanji is a
compound script where each character is made up of a small number
of parts that resemble Hiragana characters.
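If the sign is available as text rather than as a picture, a computer can do the "spot the Kanji" step by looking at Unicode ranges. The Python sketch below is only an illustration: the name `script_of` is mine, the Kanji range covers just the main CJK block (rarer characters live elsewhere), and the sample sign is a made-up phrase built from words on this page.

```python
def script_of(ch: str) -> str:
    """Classify one character as Hiragana (H), Katakana (K), Kanji or other."""
    code = ord(ch)
    if 0x3041 <= code <= 0x309F:
        return "H"
    if 0x30A0 <= code <= 0x30FF:  # includes the shared prolongation mark
        return "K"
    if 0x4E00 <= code <= 0x9FFF:  # main block of Chinese-derived characters
        return "Kanji"
    return "other"

def drop_kanji(sign: str) -> str:
    """Keep only the characters we are trying to read."""
    return "".join(ch for ch in sign if script_of(ch) != "Kanji")

sign = "東口のラーメン"  # made-up sample: two Kanji, one Hiragana, then Katakana
print([(ch, script_of(ch)) for ch in sign])
print(drop_kanji(sign))  # のラーメン
```

Note that the Katakana ロ and the Kanji 口 look alike on a sign but are different characters to the computer, so `script_of` tells them apart.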
Some basic words are almost surely written in Kanji.
For instance, the word ''East'' which is pronounced [TO] (as
in Tokyo) would almost surely be written
in Kanji (東).
Recognizing this character will help you recognize
many names of places (like Tokyo).
Hence, let us venture a little beyond Hiragana/Katakana
and try to pick up a few basic Kanji characters.
But what should we learn? One suggestion: learn all
Kanji characters with at most 4 or 5 strokes.
Let f(n) be the census function, counting the number of
Kanji characters with exactly n strokes. We know that f(1)=1.
It is probably easy to determine f(2), f(3) and f(4).
Below, we will be less systematic and name some common words:
some n will be larger than 4 (e.g., "East" (東) is counted in f(8)).
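To make the census function concrete, here is a small Python sketch over a hand-entered sample. The stroke counts in `STROKES` are typed in by me for a few characters mentioned on this page, so the resulting f(n) counts this sample only and is not the true census of all Kanji.

```python
from collections import Counter

# Hand-entered stroke counts for a small sample of characters (an assumption
# of this sketch, not an authoritative source).
STROKES = {
    "一": 1,
    "二": 2, "十": 2, "人": 2,
    "三": 3, "口": 3, "山": 3, "川": 3, "大": 3, "小": 3,
    "日": 4, "月": 4, "水": 4, "火": 4,
    "東": 8,
}

def census(strokes: dict) -> Counter:
    """Return f, where f[n] is the number of sample characters with n strokes."""
    return Counter(strokes.values())

f = census(STROKES)
print(f[1], f[2], f[3], f[4], f[8])  # 1 3 6 4 1

# The "at most 4 strokes" learning list, simplest characters first.
easy = sorted((ch for ch, n in STROKES.items() if n <= 4), key=STROKES.get)
print("".join(easy))
```

With a full stroke-count table in place of my sample, the same two lines would give the real f(n) and the real learning list.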
For instance, the word for ''Exit'' which you see over
public exits is always in Kanji: 出口.
The second character (口 or [Ku])
is indistinguishable from the Katakana character ロ [Ro];
only context can tell us which is intended.
The character literally means ''mouth'', but also stands for a ''door''
or ''entrance''. You will often see the Kanji for ''East Gate''
(東口) at transportation hubs.
If Kanji writing is hard, its pronunciation will give you
more headaches: it turns out that each Kanji word
has two pronunciations or readings!
Consider the Kanji ''Mountain'' (山).
I will write the two readings of ''Mountain''
as the pair ''[SAN]/[yama]''.
The first reading, [SAN], is called
on'yomi (Chinese reading)
and is usually a single syllable, usually
derived from the original Chinese sound. The second reading, [yama],
is called kun'yomi (Japanese reading),
is usually multi-syllable,
and represents the native Japanese pronunciation of the word.
Many words have only one on'yomi and several kun'yomi.
But having no kun'yomi or more than one on'yomi
is also possible.
See http://en.wikipedia.org/wiki/Kanji for more information.
Incidentally, the ``original'' Chinese sound (it seems to me)
is often related to ancient Chinese sounds from the Tang dynasty
(7-9th Century AD). Such sounds seem to have been better
preserved in Southern dialects such as Fukien than in Mandarin.
Basic Kanji Words:
Kanji ''Big'' (大) [DAI]/[ou(kii)] is used a lot.
Universities are ''Big'' Schools.
''Small'' (小) [SHOU]/[ko] is also common.
Kanji ''Sun'' or ''Day'' (日) [NICHI],[JITSU]/[hi]
is very important -- the name ''Japan'' (日 本)
uses this character, as it is the land of the rising sun.
Other heavenly terms include:
''Moon'' or ''Month'' (月) [GETSU]/[tsuki],
and ''Sky'' or ''Heaven'' (天).
Kanji ''New'' (新) [SHIN] is a common word in names of places!
E.g., Shin-Kobe and Shin-Osaka are districts of Kobe city
and Osaka city, but both are "new" parts of the cities.
Kanji ''Water'' (水) [SUI]/[mizu].
Compare to Kanji ''River'' (川) [SEN]/[kawa].
But ''Rain'' is 雨 [U]/[ame].
Kanji ''Fire'' (火) [KA]/[hi] is a contrast to water.
Kanji for the first 10 numerals is important.
Let us do the simplest:
''One'' (一) [ICHI]/[hito],
''Two'' (二) [NI]/[futa],
''Three'' (三) [SAN]/[mi(tsu)],
''Ten'' (十) [JUU]/[tou].
Kanji ''Gold'' (金) [KIN].
Kanji ''Right'' (右) [YUU],[U]/[migi] while
''Left'' is (左) [SA]/[hidari].
(Incidentally, do not confuse 右 with
''Stone'' (石) [SEKI],[KOKU],[SHAKU]/[ishi].)
Kanji for the cardinal directions. ''East'' is most important,
but the other three cardinal directions are also common:
North (北), South (南), West (西).
Kanji ''Person'' (人) [NIN],[JIN]/[hito].
''Woman'' or ''Girl'' is 女 [JO],[NYO]/[onna].
Kanji ''Child'' (子) [SHI]/[ko] is also used as a diminutive
for women.
Some body parts: we already saw ''Mouth''.
Kanji ''Hand'' (手) [SHU]/[te],
''Foot'' (足) [SOKU]/[ashi],
''Ear'' (耳) [JI]/[mimi],
''Eye'' (目) [MOKU]/[me].
Kanji ''Rice field'' (田) [DEN]/[ta]. Names of places often
have this character.
Kanji ''King'' (王).
Kanji ''Yen'' (円), the monetary unit, is seen in price lists, etc.
- For html codes for Hiragana Characters, see the useful Penn State site:
For Katakana Characters, see
A source for Kanji characters is
Look also in http://www.scholiast.org/kanji-converter1.html,
from which I have borrowed pronunciations.
- It is easy to lament the wastefulness of having
two parallel phonetic scripts, especially while you
are learning the system! But once you get used to it, it provides
a rich ecosystem for expressing subtleties that would be lost
in a purer and more "designed" system.
While a purely phonetic system
has great appeal, there is no pure phonetic writing system.
Perhaps Korean Hangul comes closest to it.
English or French would
be terribly hard to understand if you cleaned up their writing systems!
Here is a thought experiment.
I imagine the total confusion when Hangul was first proposed
to replace Chinese script in Korea -- just a babel of sounds, and
totally devoid of the rich associations and subtleties of
Chinese characters. Why?
Written texts usually come with much less context information
than (the performance of) spoken words.
This deprived context can be somewhat compensated
by redundancies in the writing system. Instead of redundancies,
think of them as opportunities to introduce