To my parents
and Jean-Pierre
Contents
Acknowledgments ................................................................................................ XIX
I From proto-Japanese to the modern dialects ...................................................... 1
Introduction........................................................................................................... 3
0.1 The subject and aim of this study .............................................................. 3
0.2 The Middle Japanese tone system and the tone system
of proto-Japanese ...................................................................................... 4
0.3 The basis for the reconstruction of the Middle Japanese tone system ....... 4
0.3.1 The different types of tone markings in old Japanese texts .................... 4
0.3.2 The tone systems of the modern Japanese dialects ................................. 6
0.3.3 The Late Middle Chinese tones and their relation to the value
of the Japanese tone dots......................................................................... 7
0.3.4 Historical descriptions of the Late Middle Chinese tones in Japan ........ 7
0.4 Modern reconstructions of the value of the Japanese tone dots................. 8
0.5 Conventions............................................................................................... 9
0.5.1 Tone classes, symbols and spelling ........................................................ 9
0.5.2 Terminology: Pitch-accent or tone ......................................................... 11
0.5.2.1 What is pitch-accent? .......................................................................... 12
0.5.2.2 What is restricted tone? ....................................................................... 14
0.5.2.3 Tone or pitch-accent in Middle Japanese ............................................ 14
0.5.2.4 Tone or pitch-accent in the modern dialects........................................ 17
0.5.2.5 The tone systems of the modern Japanese dialects
as ‘restricted tone’ ............................................................................... 19
0.6 The selection of the corpus....................................................................... 20
0.6.1 Syllable and mora................................................................................... 20
0.6.1.1 Old Japanese........................................................................................ 21
0.6.1.2 Early Middle Japanese......................................................................... 23
0.6.1.3 Late Middle Japanese .......................................................................... 24
0.6.1.4 The modern dialects ............................................................................ 25
0.6.1.5 The exclusion of heavy syllables ......................................................... 27
1 The two sets of comparative data....................................................................... 29
1.1 The modern Japanese tone systems ........................................................... 29
1.1.1 The Tōkyō type tone systems ................................................................. 29
1.1.2 The Kyōto type tone systems.................................................................. 34
1.1.3 The Kagoshima type tone systems.......................................................... 37
1.2 The distribution of the tone dots in Middle Japanese ................................ 38
1.3 Modern Japanese and Middle Japanese compared .................................... 41
1.3.1 Monosyllabic nouns................................................................................ 42
1.3.2 Disyllabic nouns ..................................................................................... 45
1.3.3 Trisyllabic nouns .................................................................................... 46
1.4 Differences in the attestation of the tone of the monosyllabic
case particles ............................................................................................. 47
2 The standard reconstruction of the Middle Japanese tone system ..................... 50
2.1 Kindaichi’s Middle Japanese tone system
resembles the tone system of Kyōto .......................................................... 50
2.2 Historical developments according to Kindaichi....................................... 52
2.3 Kindaichi’s reconstruction and the tone system of proto-Japanese ........... 55
2.3.1 How natural is the development of /H/ tones
in tone classes 2.3, 3.4 and 3.5? .............................................................. 56
2.3.2 Hayata’s solution: Unrecorded /M/ tones in classes
2.3, 3.4 and 3.5 in Middle Japanese ....................................................... 58
2.3.3 How natural is the change of initial /L/ tone in Kyōto
to initial /H/ tone in Tōkyō in classes 2.4, 2.5, 3.6 and 3.7? ................... 60
2.3.4 How natural is the shift of /H/ tone to the right
in the Tōkyō type dialects? ..................................................................... 61
2.3.5 Problems concerning the tone of class 3.2 in the Kyōto type dialects ... 62
2.4 Historical background of the standard theory............................................ 63
2.4.1 The geographical dilemma ..................................................................... 64
2.4.2 The dialect area theory and the circle theory.......................................... 66
2.4.3 The resolution of the dispute .................................................................. 67
2.4.4 Turning Yanagita’s circles inside-out..................................................... 68
2.5 Other theories that are based on the standard reconstruction .................... 69
2.5.1 Ōhara’s theory: Kyōto type tone as an innovation.................................. 72
2.6 Conclusion................................................................................................. 75
3 Ramsey’s reconstruction of the Middle Japanese tone system .......................... 77
3.1 Arguments based on the comparative method ........................................... 78
3.1.1 Ramsey’s Middle Japanese tone system
resembles the tone system of Tōkyō........................................................ 79
3.1.2 Ramsey’s Middle Japanese tone system is a suitable ancestor
of the Kyōto type tone systems ............................................................... 80
3.1.3 /H/ tones in tone classes 2.3, 3.4 and 3.5 were already present
in Ramsey’s Middle Japanese ................................................................. 81
3.1.4 /H/ tone spreading onto the particles after /LH/ tone:
Gairin type tone as a natural development .............................................. 81
3.1.5 /H/ tone spreading onto the particles after /R/ tone:
Chūrin type tone as a natural development ............................................. 83
3.1.6 The /L/ register of tone class 3.2 in Kōchi.............................................. 86
3.1.7 Restrictions to the location of the /H/ tone in the Kyōto type dialects.... 86
3.2 The attestation of the Nairin/Chūrin/Gairin split in the old documents ..... 87
3.3 Two more arguments from dialect geography ........................................... 91
3.3.1 Reports on the geographical spread of Kyōto type tone
in earlier periods ..................................................................................... 91
3.3.2 The blurred division between the Gairin and Chūrin areas
as an indication that this is the oldest dialect split in Japan.................... 94
4 The development of the tone systems of Tōkyō, Kyōto and Kagoshima........... 96
4.1 The developments in the Nairin and Chūrin type dialects ......................... 97
4.1.1 /H/ tone restriction.................................................................................. 99
4.1.2 The development of [M] pitch................................................................ 100
4.1.3 The development of /H/ tone anticipation
and a %L phrase boundary tone .............................................................. 103
4.1.4 Analogy in the tone classes that lack /H/ tone ........................................ 104
4.1.5 /H/ tone anticipation affects the remaining /L/ tones .............................. 106
4.2 The developments in the Kyōto type dialects ............................................ 106
4.2.1 How the leftward shift created the /L/ toneme in modern Kyōto............ 108
4.2.2 The origin of the mixed reflexes of tone class 3.2
in the Kyōto type dialects of central Honshū .......................................... 109
4.2.3 The realization of classes with all Ø tone as [H]
in the Kyōto type dialects........................................................................ 109
4.2.4 The reason why the distinct tone classes 1.2, 2.5 and 3.7
were lost in Tōkyō but preserved in Kyōto ............................................. 110
4.2.5 The developments in the Kyōto type dialects of Shikoku
and the Seto Inland Sea ........................................................................... 112
4.3 The developments in the Gairin Tōkyō type dialects ................................ 114
4.4 The development of the two Kagoshima word-tones ................................ 116
4.5 The reconstruction of the tone of class 3.3 ................................................ 118
4.6 Did the tone of the initial syllable have a special status
in Middle Japanese? .................................................................................. 121
4.7 Conclusion................................................................................................. 123
5 Arguments in favor of Ramsey’s theory based on internal reconstruction ........ 125
5.1 The special tonal features of the particle no .............................................. 125
5.1.1 The particle no in the Tōkyō type dialects ............................................ 125
5.1.2 The particle no in Kyōto, Ōsaka and Kōchi ........................................... 127
5.1.3 The loss of the special features of the particle no
after monosyllabic nouns......................................................................... 128
5.1.4 The distribution of the particle no /H/ tone cancellation ........................ 129
5.2 The tone of compound nouns in the modern dialects ................................ 132
5.2.1 The tone of compound nouns with ‘long’ second elements
in Tōkyō and Kyōto ................................................................................ 132
5.2.2 The tone of compound nouns with ‘short’ second elements in Tōkyō.... 133
5.2.3 The tone of compound nouns with ‘short’ second elements in Kyōto:
Wada’s discovery and its meaning for Ramsey’s theory......................... 136
5.3 Incongruent register of compounds in the dialect of Kyōto....................... 138
5.4 The origin of the two types of reflexes of tone class 2.3 in Tōkyō............ 138
5.5 Compounds with tone class 2.3 in Hiroshima, Kyōto and Tōkyō.............. 140
5.6 What do the compounds in Hiroshima, Kyōto and Tōkyō tell us? ............ 143
5.7 The tone rules for compound nouns in the Gairin type dialects ................ 144
5.8 The tone rules for compound nouns in Middle Japanese........................... 147
5.8.1 Compounds with ‘long’ second elements ............................................... 147
5.8.2 Compounds with ‘short’ second elements .............................................. 149
5.9 How old are the tone rules for compound nouns in central Japan? ........... 152
5.10 How old are the rules for compound nouns
in the Gairin type dialects?...................................................................... 154
5.11 Noun compounding and the tone class divisions of proto-Japanese........ 155
5.12 The relation between sequential voicing
and lack of /H/ tone in compounds.......................................................... 156
5.13 The origin of the irregular cross-dialect correspondences
of longer nouns........................................................................................ 159
6 A new look at dialect tone ................................................................................. 160
6.1 Transitional or ‘Tarui type’ dialects .......................................................... 160
6.2 The Noto dialects ...................................................................................... 165
6.2.1 Kindaichi’s data...................................................................................... 167
6.2.2 McCawley’s view ................................................................................... 169
6.2.3 Noto type tone and Ramsey’s Middle Japanese tone system.................. 170
6.2.4 The conditioned variants as remnants of earlier Kyōto type tone? ......... 171
6.2.5 The origin of the variants in the Noto dialects........................................ 174
6.2.6 The tone of monosyllabic nouns in the Noto dialects............................. 175
6.2.7 The tone system of Toyama.................................................................... 177
7 Rightward spreading and tone shift in the Japanese dialects ............................. 180
7.1 Rightward tone shift in the Tōkyō type dialects ........................................ 181
7.1.1 Rightward tone shift conditioned by vowel height ................................. 182
7.1.2 Unconditional rightward tone shift ......................................................... 187
7.2 Rightward tone shift in the Kyōto type dialects........................................ 187
7.2.1 Rightward tone shift conditioned by vowel height ................................. 187
7.2.2 Unconditional rightward tone shift in Ibukijima..................................... 189
8 Subclass divisions in proto-Japanese ................................................................. 191
8.1 Subclass divisions based on dialectal reflexes........................................... 191
8.1.2 The subclasses 2.2a and 2.2b in Martin’s classification ......................... 191
8.1.3 The subclasses 2.2a, 3.2a and 3.7a in the standard theory...................... 192
8.1.4 Subclasses 2.2a, 3.2a and 3.7a in Ramsey’s theory:
Final /R/ tone preceded by /L/ tone in proto-Japanese ............................ 193
8.1.5 Is the distinction between tone classes 3.2a and 3.2b
reflected in the Kyōto type dialects? ....................................................... 194
8.2 Subclass divisions based on tone dot attestations...................................... 198
8.2.1 Tone class 1.3b: /F/ tone in Middle Japanese......................................... 198
8.2.2 Tone class 3.5b (and tone class 2.5):
Final /R/ tone preceded by /H/ tone in Middle Japanese......................... 200
8.2.3 The reason why /R/ tone preceded by /H/ tone
is still attested in Middle Japanese .......................................................... 202
8.2.4 Were tone classes 3.5b and 2.5 larger than the small number
of attestations in Middle Japanese would make us believe?.................... 202
8.2.5 The developments in class 3.5b in Tōkyō .............................................. 204
8.3 Were the final /R/ tones an innovation of central Japan? .......................... 205
8.4 Restrictions to the location of /F/ and /R/ in Middle Japanese .................. 207
9 The tone systems of the Ryūkyūs ...................................................................... 208
9.1 Rightward tone shift and the shift from syllable-tone to word-tone........... 209
9.2 Hattori’s later reconstruction of the proto-Japanese tone system .............. 212
9.3 The split in classes 2.3 and 2.4/5 examined............................................... 215
9.3.1 Was there no distinct tone class 2.3 in proto-Ryūkyūan? ....................... 215
9.3.2 A comparison of the iki-, ita- and mari-groups in 12 dialects ................ 216
9.4 From vowel length to [H] pitch or from [H] pitch to vowel length?.......... 222
9.4.1 Overview of word-tones and vowel length in disyllabic nouns .............. 222
9.4.2 The geographical distribution of vowel length in the Ryūkyūs .............. 227
9.4.3 Arguments against the idea that vowel length in the initial syllable
is original ................................................................................................ 228
9.4.4 Kindaichi’s ideas on the origin of vowel length in the Ryūkyūs............. 229
9.5 Rightward tone shift conditioned by vowel height
and the split in class 2.4/5 ......................................................................... 230
9.6 Possible explanations for the iki/ita split compared .................................. 232
9.6.1 Extra tone classes in proto-Japanese ...................................................... 233
9.6.2 Vowel length distinctions in proto-Japanese .......................................... 233
9.6.3 Dialect interference in the development of proto-Ryūkyūan .................. 234
9.7 Martin’s idea of /L/ tone as a concomitant of vowel length
in proto-Japanese ...................................................................................... 237
9.7.1 Vovin’s evidence for vowel length in proto-Okinawan .......................... 239
9.7.1.1 Vovin’s examples ................................................................................ 239
9.7.1.2 Amendments to Vovin’s examples ...................................................... 240
9.7.1.3 The examples of underlying vowel length........................................... 242
9.7.1.4 Vovin’s proto-Ainu evidence for vowel length in proto-Japanese....... 245
10 Conclusion: The order and timing of the dialect splits .................................... 247
10.1 Minor developments................................................................................ 247
10.2 The new dialect-geographical paradox.................................................... 248
10.3 The conditioned split between Nairin type and Chūrin type.................... 249
10.4 The oldest split from proto-Japanese:
The Gairin type tone system and its geographical distribution................ 250
10.5 Similarities between the dialects of Izumo and Tōhoku .......................... 251
10.6 The settlement of the Tōhoku region....................................................... 252
10.7 The starting point of the /H/ tone restriction ........................................... 254
10.8 Final developments.................................................................................. 256
10.9 Hattori’s ideas on the relation between dialect boundaries
based on tonal distinctions and Japanese history..................................... 258
11 The accent of Japanese loanwords in Ainu ...................................................... 259
11.1 The basis of the Ainu dialect comparison................................................ 259
11.2 Phonological differences between Sakhalin Ainu and Hokkaidō Ainu ... 261
11.3 Distinctive vowel length in Sakhalin ....................................................... 266
11.4 Distinctive accent in Hokkaidō................................................................ 267
11.5 Similarities between the two systems ...................................................... 268
11.6 Hattori’s reconstruction of proto-Ainu
phonological structure and accent ........................................................... 269
11.6.1 Exceptions to Hattori’s correspondences ............................................. 273
11.7 The lack of pitch and vowel length distinctions in monosyllables........... 274
11.7.1 Yamamoto Tasuke’s description .......................................................... 274
11.7.2 Asai’s findings...................................................................................... 276
11.8 Evidence for the direction of change ....................................................... 277
11.8.1 The Hokkaidō Ainu system as a simplification
of the Sakhalin Ainu system................................................................ 277
11.8.2 The relation between retention of accent on the second syllable
in Yakumo and vowel length in Sakhalin............................................ 278
11.8.3 Vowel length in older Japanese sources of Hokkaidō Ainu ................. 279
11.8.3.1 Matsumae no kotoba (1626/1627) .................................................... 279
11.8.3.2 Moshiogusa (1792)............................................................................ 282
11.8.3.3 Ezo kotoba irohabiki (1848).............................................................. 284
11.8.4 The development of distinctive pitch-accent in Hokkaidō Ainu........... 285
11.8.5 Vowel length in older sources of Kuril Ainu........................................ 285
11.8.5.1 Krasheninnikov (1738)...................................................................... 286
11.8.5.2 Klaproth/Steller (1823/1743) ............................................................ 287
11.8.5.3 Nineteenth century sources of Kuril Ainu ......................................... 289
11.8.6 Influence from Japanese ....................................................................... 291
11.9 Vovin’s reconstruction of proto-Ainu phonological structure and tone... 292
11.9.1 Monosyllables ...................................................................................... 294
11.9.1.1 Proto-Ainu */H/................................................................................. 294
11.9.1.2 Proto-Ainu */L/ ................................................................................. 295
11.9.2 Disyllables............................................................................................ 297
11.9.2.1 Proto-Ainu */HH/ .............................................................................. 297
11.9.2.2 Proto-Ainu */HL/............................................................................... 300
11.9.2.3 Proto-Ainu */LL/ ............................................................................... 302
11.9.2.4 Proto-Ainu */LH/............................................................................... 303
11.9.3 Trisyllables ........................................................................................... 303
11.9.3.1 Proto-Ainu */LHL/ ............................................................................ 303
11.9.3.2 Proto-Ainu */LLH/ ............................................................................ 304
11.10 Vovin’s evidence from Japanese loanwords in Ainu
for the standard reconstruction of proto-Japanese tone ........................... 305
11.11 What can the Japanese loanwords really tell us? ................................... 308
11.11.1 Loanwords that include voiced consonants in the second syllable
in Japanese .......................................................................................... 308
11.11.2 Loanwords that have the accent on the second syllable in Ainu......... 309
11.11.3 Loanwords that have the accent on the initial syllable in Ainu........... 311
11.11.4 Traders from Ōsaka ............................................................................ 313
11.12 Evaluating the evidence......................................................................... 314
11.12.1 The loanwords and the standard reconstruction ................................. 314
11.12.2 The loanwords and Ramsey’s reconstruction ..................................... 314
11.12.3 Attempts to date the examples............................................................ 315
11.12.4 The origin of the two different segmental shapes
for loanwords with accent on the initial syllable ................................. 317
11.12.5 The CVCCV shaped loanwords as evidence
for Hattori’s reconstruction of proto-Ainu vowel length..................... 318
11.12.6 The special case of pasúy, kamúy and múy ........................................ 319
11.13 Vovin’s reconstruction of proto-Ainu consonant clusters ..................... 321
11.14 Conclusion............................................................................................. 325
II The introduction and adaptation of the Middle Chinese tones in Japan ........... 327
Introduction........................................................................................................... 329
0.1 Ramsey’s theory and the evidence from the modern dialects .................... 329
0.2 Ramsey’s theory and Late Middle Chinese tone, Japanese philology
and the Buddhist shōmyō tradition ............................................................ 329
1 The history of Middle Chinese........................................................................... 331
1.1 The different varieties of speech that functioned
as the Chinese standard language .............................................................. 331
1.1.1 Early Middle Chinese ............................................................................. 331
1.1.2 Varieties of Early Middle Chinese ......................................................... 332
1.1.3 Late Middle Chinese .............................................................................. 332
1.1.4 Wu pronunciation and Qin pronunciation .............................................. 333
1.2 The relationship between Early Middle Chinese
and Late Middle Chinese .......................................................................... 334
1.2.1 Late Middle Chinese as the ancestor of the modern dialects .................. 335
2 The origin of tone in Middle Chinese ................................................................ 336
2.1 From consonantal distinctions to tonal distinctions................................... 336
2.2 The effect of glottal stop and -h on the pitch of preceding syllables ......... 337
2.3 Chinese descriptions of the four tones....................................................... 339
3 Character reading traditions in Japan................................................................. 341
3.1 Early Sino-Japanese .................................................................................. 341
3.1.1 Go-on and southern Early Middle Chinese............................................. 341
3.2 Direct contacts with China ........................................................................ 342
3.2.1 Introduction of new character readings .................................................. 343
3.2.2 The introduction of the tone dots............................................................ 343
3.2.3 The government promotes foreign Chinese (Han pronunciation)........... 344
3.2.4 The development of a new standard of Sino-Japanese ........................... 345
3.3 Confusion and overlapping of terms.......................................................... 346
3.4 Confucianist and Buddhist reading practice .............................................. 347
3.4.1 Buddhist reading methods ...................................................................... 348
3.5 Reorganization of Go-on ........................................................................... 349
3.6 Buddhist Kan-on study .............................................................................. 349
3.7 Buddhist study of Chinese phonology ....................................................... 350
3.8 Different types of historical material ......................................................... 352
3.9 Present day Go-on and Kan-on pronunciations ......................................... 352
3.10 Summary of terms relating to Sino-Japanese........................................... 352
4 The difference between the tones of Go-on and Kan-on.................................... 354
4.1 The Go-on tones and the Kan-on tones are contrasted to each other......... 354
4.2 Characters in the Go-on pronunciation are marked
with ‘reversed’ tone dots ........................................................................... 356
4.3 The tonal value of the tone dots is based on the Kan-on tone tradition ..... 358
4.4 The shift from qu tone markings in Wa-on
to shang tone markings in Go-on .............................................................. 358
4.5 Tone descriptions from the Tendai and Shingon schools
concern the Kan-on tones.......................................................................... 359
5 The shōmyō traditions of the Tendai and Shingon schools ................................ 360
5.1 Varieties of shōmyō ................................................................................... 360
5.2 Nara shōmyō.............................................................................................. 361
5.3 Heian shōmyō: The introduction of Tendai and Shingon .......................... 362
5.4 A period of change .................................................................................... 363
5.5 Shōmyō traditions within the Tendai school.............................................. 365
5.6 Shōmyō traditions within the Shingon school............................................ 366
5.6.1 Kogi Shingon .......................................................................................... 366
5.6.2 Shingi Shingon ....................................................................................... 367
5.7 The antiquity of the shōmyō traditions that have survived to this day....... 368
6 The earliest tone description in Japan: Shittan-zō.............................................. 371
6.1 Annen’s four traditions.............................................................................. 371
6.1.1 Biao and Jin............................................................................................ 372
6.1.2 Isei and Chisō ......................................................................................... 373
6.2 Annen’s text .............................................................................................. 373
6.2.1 Heavy and light 重軽.............................................................................. 376
6.2.2 Low/falling and high/rising 低昂............................................................ 378
6.2.3 Inner and outer 内外............................................................................... 379
6.2.4 The tones 声勢 ....................................................................................... 380
6.2.5 Nu-sounds 怒声...................................................................................... 381
6.2.6 Enunciatory strength 著力...................................................................... 383
6.3 The tone systems of Isei and Chisō ........................................................... 384
6.4 Which of Annen’s tone systems represents
the LMC standard language?..................................................................... 387
6.5 Remaining problems.................................................................................. 389
7 Later Japanese tone theories .............................................................................. 392
7.1 Overview of the kinds of tone dots used in Japan..................................... 393
7.1.1 Tone systems in which not all distinctions
may have been based on pitch................................................................. 393
7.1.2 Tone systems in which the distinctions were based on pitch .................. 394
7.1.3 The quasi eight-tone system of the Tendai school.................................. 396
7.2 Myōgaku and the state of Siddham study in Myōgaku’s time ................... 399
7.3 The descriptions of the tones..................................................................... 400
7.3.1 Heian period (794-1185) ........................................................................ 400
7.3.1.1 Chūzan 仲算 (Hossō school)............................................................... 400
7.3.1.2 Myōgaku 明覚 (Tendai school).......................................................... 402
7.3.1.3 Fujiwara Munetada 藤原宗忠 ............................................................. 418
7.3.1.4 Eijū 恵什 (Shingon school)................................................................ 421
7.3.1.5 (Kōmyō-san) Jūyo (光明山) 重誉 (Tendai school)............................. 423
7.3.1.6 Shinren 心蓮 (Shingon school) ........................................................... 424
7.3.2 Kamakura period (1185-1338) ............................................................... 424
7.3.2.1 Dōhan 道範 (Shingon school) ............................................................. 424
7.3.2.2 Shinpan 信範 (Shingon school)........................................................... 425
7.3.2.3 Ryōson 了尊 (Shingon school)............................................................. 429
7.3.2.4 Anonymous (Tendai school) ............................................................... 430
7.3.3 The early Muromachi or Nanboku-chō period (1338-1392) .................. 431
7.3.3.1 Kenpō 賢宝 (Kogi Shingon school) .................................................... 431
7.3.3.2 Anonymous (Tendai school) ............................................................... 433
7.3.3.3 Shinkū 心空 (Tendai school)............................................................... 436
7.4 Overview of the tone descriptions ............................................................. 438
7.4.1 Descriptions that concentrate on differences in length
between light and heavy in the shang and qu tones................................. 438
7.4.2 Descriptions that concentrate on differences in pitch ............................. 438
8 Background and analysis of the tone theories of the Siddham scholars............. 441
8.1 The tones of the Siddham scholars do not represent the tones of LMC..... 441
8.1.1 Features that go back to a misinterpretation of Annen’s text.................. 443
8.1.2 The merger of light qu with shang is a Japanese invention .................... 447
8.2 The tones of the Siddham scholars do not represent
the tones of Middle Japanese .................................................................... 449
8.3 The influence of Myōgaku’s innovations .................................................. 450
8.3.1 Myōgaku’s fanqie theory........................................................................ 451
8.3.2 Myōgaku divides the tones in two parts ................................................. 452
8.3.2.1 The tone contour of the ‘initial tone’................................................... 453
8.3.2.2 The tone contour of the ‘final sound’ .................................................. 455
8.3.3 Myōgaku’s eight-tone theory:
A tone system that had no historical basis............................................... 455
8.3.4 Myōgaku adapts the tone contour of the qu tone
in the six-tone theory............................................................................... 456
8.4 Myōgaku’s influence on the Shingon tone theories ................................... 458
8.5 Summary ................................................................................................... 459
9 Which reconstruction agrees better with the tones of the Siddham scholars?.... 461
9.1 The tones of the Siddham scholars and the standard reconstruction ......... 461
9.2 The tones of the Siddham scholars and Ramsey’s reconstruction ............. 464
9.3 The Shingon qu tone and the background of Ruiju myōgi-shō.................. 466
9.4 The light ping tone dot .............................................................................. 467
9.4.1 The abandonment of the use of the light ping tone
to mark Japanese words .......................................................................... 468
9.4.2 Was the use of the light ping tone abandoned
because of Myōgaku’s theories? ............................................................. 470
9.4.3 The origin of the rising contour of the light ping tone
and the falling contour of the heavy ping tone in Japan .......................... 471
10 Stages in the adaptation of the tones of Late Middle Chinese in Japan ........... 475
10.1 The first stage: The tone system of the Han pronunciation ..................... 475
10.2 The second stage: The Early Kan-on tone system ................................... 478
10.3 The third stage: The Later Kan-on tone system....................................... 481
10.4 The tone systems used outside the monasteries ....................................... 482
11 Miscellaneous issues........................................................................................ 483
11.1 The Wa-on tones ..................................................................................... 483
11.1.1 Differences in vowel length as the origin of the reversed
Wa-on tone markings? ........................................................................ 484
11.1.2 The ru-tone in Wa-on and Kan-on ....................................................... 486
11.2 The Sinologist view of the shang and qu tones ....................................... 490
11.3 The shang and qu tones in Sino-Korean.................................................. 491
11.4 Paekche loanwords in Old Japanese........................................................ 494
12 Determining the time of the tone shift in Kyōto .............................................. 496
12.1 Evidence from the 14th century................................................................ 497
12.1.1 Gyōa’s emendations to Fujiwara Teika’s spelling system .................... 497
12.1.2 Emperor Chōkei on the ping, shang and qu tones ................................ 499
12.2 Confusion in the 15th century................................................................... 500
12.2.1 Yūkai 宥快 (Kogi Shingon school) ...................................................... 500
12.3 Reanalysis at the end of the 15th century ................................................. 501
12.3.1 In’yū 印融 (Kogi Shingon school) ....................................................... 502
12.4 The annotation ‘ataru’ in the 16th century............................................... 505
12.5 Summary ................................................................................................. 507
13 The Japanese tone theories after the shift ........................................................ 509
13.1 The shōmyō revival in the 17th century.................................................... 509
13.1.1 Kannō 観応 (Shingi Shingon school)................................................... 509
13.1.2 Pitch readjustment rules after the shift: ideai 出合 .............................. 512
13.1.3 Keichū 契沖 (Kogi Shingon school) .................................................... 514
13.1.4 Anonymous (Kogi Shingon school)...................................................... 517
13.2 Diversity in the tone theories of the 18th century..................................... 517
13.2.1 Monnō 文雄 (1700-1763, Jōdo school) ............................................... 518
13.2.2 Ise Sadatake 伊勢貞丈 (1715-1784) .................................................... 518
13.2.3 Motoori Norinaga 本居宣長 (1730-1801) ........................................... 518
13.3 The Buddhist tone theories in the 19th century ........................................ 519
13.3.1 Anonymous (Shingi Shingon)............................................................... 520
13.3.2 The Tendai tone system after the shift:
Rai Tsutomu’s study of the Kan-on shōmyō of the Tendai school ..... 520
13.4 The Edo period tone theories and modern scholarship............................ 523
14 Fushihakase material ....................................................................................... 525
14.1 The interpretation of older fushihakase material is uncertain.................. 525
14.2 The historical development of the fushihakase ....................................... 526
14.2.1 Ko-hakase (‘old hakase’): Shōten hakase, tada hakase, fu-hakase...... 528
14.2.2 Zu-hakase ............................................................................................. 531
14.2.3 The early goin hakase system of Tanchi .............................................. 533
14.2.4 Meyasu hakase ..................................................................................... 534
14.2.5 The goin hakase or hon-bakase system of Kakui................................. 536
14.2.6 Shōmyō genres that contain historical information
on the Japanese tones .......................................................................... 539
14.2.7 Neumes versus absolute tone................................................................ 539
14.3 Fushihakase material that has to be reinterpreted
in view of Ramsey’s theory ..................................................................... 542
14.3.1 Fushihakase material that reflects a Middle Japanese tone system ...... 542
14.3.1.1 The Daiji-in-bon of Shiza kōshiki ..................................................... 543
14.3.2 Fushihakase material that reflects a restricted tone language............... 548
14.3.2.1 Butsuyuigyō-kyō ................................................................................ 549
14.3.2.2 The old rongi material and the quotation part of Bumō-ki ................ 550
14.4 The history of the rongi ceremonies and the rongi books ....................... 551
14.4.1 The rongi ceremonies........................................................................... 551
14.4.2 The rongi books ................................................................................... 552
14.4.3 Why is the vocabulary part of Bumō-ki regarded as yomikuse? ........... 553
14.5 The tone system reflected in the old rongi material
and Butsuyuigyō-kyō ............................................................................... 556
14.6 The reading of the ko-hakase materials in the 16th century and later........... 559
14.7 The musical notation systems of Nō and the Heike monogatari.............. 560
14.7.1 Yōkyoku ................................................................................................ 561
14.7.2 Heikyoku............................................................................................... 562
14.7.3 The value of the marks in Nō and Heikyoku:
reversal or preservation? ..................................................................... 563
14.8 Summary ................................................................................................. 564
15 Conclusion ....................................................................................................... 566
References............................................................................................................. 571
Index .................................................................................................................. 587
Acknowledgments
I wish to express my gratitude to a number of people for their help and interest
during the long years of research that have led up to this publication. First and
foremost, my thanks go out to Professor Frederik Kortlandt of Leiden University,
who first directed my attention to the outstanding problems in the historical
development of Japanese tone, and the solution that Ramsey’s theory seemed to
offer to these issues. His advice over the years, and his enthusiasm for the project
have been invaluable.
I would also like to thank Professor Ishizuka Harumichi, my supervisor during
my study at Hokkaidō University, for pointing out to me – among other things – the
importance of the reading notes in the interpretation of Japanese historical texts.
Professor Satō Tomomi of the same university kindly shared with me his thorough
knowledge of the Ainu language, as did Dr. Anna Bugaeva and Dr. Takahashi
Yasushige, who were my fellow students at the time.
Professor Robert Ramsey made it possible for me to extend my research by
inviting me to the University of Maryland, where I was able to study, thanks to a
Fulbright Scholarship and a grant from the Dr. Catharina van Tussenbroek
Foundation. Without the brilliant insight that came to him so many years ago, this
book would never have been written, and I know he is as happy about its publication
as I am.
I thank Professor James Unger of Ohio State University for his interest and
encouragement, and for opening my eyes to the implications that my research has for
the order in which the Japanese islands were populated by speakers of Japanese. I
would also like to thank Professor Wim Boot, Professor Wolfgang Behr, Dr.
Thomas Pellard, Dr. Anton Antonov and Dr. Wayne Lawrence for their corrections
and advice. Needless to say, all errors made are mine alone.
The publication of this book was supported by grants from the Leiden University
Centre of Linguistics and the Spinoza Prize awarded to Professor Kortlandt.
Last but not least, I would like to thank my parents, who have been a source of
encouragement throughout my life, and especially my husband Jean-Pierre, whose
support and patience have been truly overwhelming.
I From proto-Japanese to the modern dialects
Introduction
0.1 The subject and aim of this study
The subject of this study is the historical development of the Japanese tone system.
My aim has been to determine how the historical development from the tone system
of proto-Japanese to the tone systems of the different dialects (both modern and
historical) can best be explained.
I have concentrated on the tonal distinctions that can be observed in nouns, as the
distinctions in nouns are more numerous than those of verbs and adjectives.
Although the latter sometimes show historical developments that differ from the
developments in nouns, the differences are small and justify the assumption that a
satisfactory account of the changes leading to the richer tonal distinctions in nouns
will subsume those of verbs and adjectives; therefore, I discuss verbs and adjectives
in this study only in passing.
When one compares the standard explanation of the historical development of
the different tone systems in the Japanese dialects (e.g. Kindaichi (1951) and
elsewhere) with what is known about tonal developments in other tone languages,
Japanese comes across as quite unusual. Many developments posited in the standard
theory appear unlikely and even impossible in the light of such cross-linguistic
comparisons. In this study, I have tried to explain the Japanese data – both
contemporary and historical – in a way that better agrees with what happens in other
languages with similar tone systems.
A more phonetically accurate reconstruction of the tone system of the oldest
stage in the Japanese language for which we have sufficient data, the language of the
11th to late 13th century or Middle Japanese, forms an integral part of this endeavor.1
A phonetically accurate reconstruction of the tone system of Middle Japanese is
important as the Middle Japanese tone system may contain information on earlier
stages in the language. Whitman (1990), for instance, proposed the idea that the
Middle Japanese tone system may contain clues as to earlier contractions and vowel
length, while Kortlandt (1993) and Vovin (1997) suspect that the /L/ or /H/ tone of
the initial syllable in Middle Japanese may go back to an earlier distinction between
voiced and voiceless initial consonants.
1 The history of the Japanese language is usually divided into the following periods: Old
Japanese, the language of the Nara period (710-794); Early Middle Japanese, the language of
the Heian period (794-1185); Late Middle Japanese, the language of the Kamakura (1185-
1379), Muromachi (1392-1573) and Azuchi-Momoyama (1573-1603) periods; Modern
Japanese, the language of the Edo period (1603-1867), and down to the present. This division is
based on segmental and grammatical considerations, not on differences in suprasegmental
(tonal) features. The tonal spelling system used in parts of the Nihon shoki for instance suggests
that the tone system of Old Japanese was not fundamentally different from that of Middle
Japanese. Likewise, the crucial tone dot material transgresses the boundary between Early
Middle and Late Middle Japanese; I use the term Middle Japanese for convenience to designate
the language during the whole period when tone dot markings proliferated.
0.2 The Middle Japanese tone system and the tone system
of proto-Japanese
The reconstruction one arrives at by comparing all known dialects of a language is,
by definition, its proto-language. The term proto-Japanese therefore refers to a
putative ‘oldest’ stage of the Japanese language that can be regarded as the ancestor
of all modern and attested pre-modern dialects.
When we compare the vocabularies of the modern dialects, we find that words of
a given length fall into a number of discrete tone classes. In each dialect, some tone
classes have merged, but not necessarily the same ones in different dialects; hence,
the number of tone classes that has to be reconstructed for proto-Japanese is larger
than the number needed to describe any single modern dialect. It turns out that the
number of tone classes that has to be reconstructed for proto-Japanese on the basis
of a comparison of the modern dialects agrees closely with the number of tone
classes implied by the data in the early dictionary Ruiju myōgi-shō 類聚名義抄 (11th
century), our main source of knowledge about the Middle Japanese tone system.
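The counting argument in this paragraph can be illustrated with a minimal sketch in Python. The words, dialect names and tone patterns below are invented placeholders, not data from this study, and the grouping by correspondence sets is a deliberate simplification of the comparative method (it ignores, for example, splits conditioned by segmental features).

```python
from collections import defaultdict

# Hypothetical toy data (an assumption for illustration only):
# each invented word is listed with the tone pattern it shows in each dialect.
OBSERVATIONS = {
    "w1": {"dialect_A": "X", "dialect_B": "P"},
    "w2": {"dialect_A": "X", "dialect_B": "Q"},
    "w3": {"dialect_A": "Y", "dialect_B": "Q"},
    "w4": {"dialect_A": "Y", "dialect_B": "Q"},
}


def classes_in_dialect(observations, dialect):
    """Return the set of distinct tone patterns found in one dialect."""
    return {patterns[dialect] for patterns in observations.values()}


def correspondence_classes(observations):
    """Group words by their full set of cross-dialect correspondences."""
    dialects = sorted({d for patterns in observations.values() for d in patterns})
    groups = defaultdict(list)
    for word, patterns in observations.items():
        # Words belong together only if they match in every dialect.
        key = tuple(patterns[d] for d in dialects)
        groups[key].append(word)
    return dict(groups)


if __name__ == "__main__":
    for dialect in ("dialect_A", "dialect_B"):
        print(dialect, len(classes_in_dialect(OBSERVATIONS, dialect)))
    print("correspondence classes", len(correspondence_classes(OBSERVATIONS)))
```

In this toy case each dialect distinguishes only two patterns, but because the two dialects have merged different classes, three correspondence classes have to be set up for the common ancestor.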
This means that Middle Japanese had a system phonemically very similar to that
of proto-Japanese, similar enough to be used as a working model. It also implies that,
even though a large part of the tone dot material probably reflects the tone system of
the language as it was spoken in the old capital of Kyōto, the current dialect of the
city of Kyōto has no privileged status among modern dialects, since all serve equally
as witnesses to proto-Japanese.
0.3 The basis for the reconstruction of the Middle Japanese tone
system
There are four distinct types of data that form the basis for the reconstruction of the
Middle Japanese tone system. The first two, introduced in sections 0.3.1 and 0.3.2
below, will for the most part be discussed in part I of this study; the next two types,
introduced in sections 0.3.3 and 0.3.4, will be addressed in part II.
0.3.1 The different types of tone markings in old Japanese texts
Many different means of marking tonal distinctions in texts have been used in Japan,
especially by Buddhist clerics. Each method will be discussed at greater length later
on; what follows here is just a brief introduction to the different types of pitch
markings.
The oldest can be found in the Koji-ki 古事記 (712), the earliest surviving
extended Japanese text. It makes use of two of the characters that, in China,
represented the tones of Middle Chinese (平 ping, 上 shang, 去 qu, 入 ru), namely
上 and 去. These were added like notes after the names of certain deities, persons
and places. With one exception (a single attestation of the use of the character
qu 去), these notes consist of the character shang (上) and are in any case infrequent.
It is not possible to reconstruct the tone system of the language of the Koji-ki on
their basis.
A second type of indication of tone is inferred from choices of different
Man’yōgana 万葉仮名 for syllables that were segmentally identical. In 1981,
Takayama Michiaki discovered a statistical correlation between the tone of
characters in Middle Chinese and the pitches of Japanese syllables (as inferred from
the later tone dot markings) in the poems included in the Nihon shoki 日本書紀
(720). Earlier, Kindaichi Haruhiko (1947) had made similar claims about
Man’yōgana used in Konkōmyō saishōō-kyō ongi 金光明最勝王経音義 (1079), a
pronunciation guide (ongi) to the Suvarnaprabhāsa sutra.
A third type of marking consists of dots representing Late Middle Chinese tones
added to individual Chinese characters or syllabic graphs (kana). These dots are
called shōten 声点 or ‘tone dots’. At their earliest, the tone dots are found at the end
of the 9th century, when they were introduced into Japan from China by Buddhist
monks from the esoteric Tendai and Shingon schools. Their function was to indicate
the tones of the Chinese characters that had been used to transcribe the Sanskrit
dhāraṇī.2 From the late 10th or early 11th century onward, they start to be used to
mark the tones of Chinese loanwords in Japanese as well as the tones of Japanese
words.
It was this system that became the most widespread and most consistently used
method of indicating tone, and it is therefore of greatest historical importance. When
used to indicate the tones of Japanese words, the tone dots were added to the upper,
middle and lower left or to the upper right of phonograms (Man’yōgana or kana).
The most important work containing shōten added to Japanese words is the Ruiju
myōgi-shō dictionary, and our knowledge of the Middle Japanese pitches is largely
based on the markings in that dictionary.3 They can, however, also be found in
Wamyō ruiju-shō 和名類聚抄 and old manuscripts of the Nihon shoki, Kokin waka-
shū 古今和歌集 and many other texts. The habit of adding shōten to texts fell into
disuse in the 14th century.
2 Dhāraṇī are mystic verses in Sanskrit that play an important role in esoteric Buddhism.
Because they consist of syllables that very often have no literal meaning (yet were supposed to
have magical powers, when properly pronounced) they were not translated into Chinese but
carefully transcribed phonographically with Chinese characters. The correct pronunciation of
the dhāraṇī in Japan was therefore seen as depending on the correct pronunciation of the
corresponding Chinese characters.
3 Unless indicated otherwise, quotations from Ruiju myōgi-shō are usually from the Kanchi-in-
bon 観智院本, the most complete manuscript.
A different means of indicating tone can be found in the tonal spelling system
devised by Fujiwara Teika 藤原定家 (1216) and in the amended version of it by
Gyōa 行阿 (late 14th century), which will be discussed in chapter 12 of part II. In this
system different kana signs, which, as a result of sound changes, had come to be
pronounced in the same way (such as オ and ヲ) were redeployed to differentiate
tone. (Since there were few such graphic oppositions to exploit, the information
contained in this sort of tonal spelling system is very limited.)
Another type of information on tone in older periods comes from texts of
Buddhist chants or shōmyō. From the 11th century on, so-called fushihakase marks
that indicated the melodies of these chants came into use in the Tendai and Shingon
schools. There have been many kinds of fushihakase in use in different schools and
in different periods. The number and the type of texts in which they were deployed
to mark the tones of Japanese are limited, and mostly date only from the 14th century
on. The notation systems used in the recitation of the Heike monogatari 平家物語
and Nō 能 chanting both developed from this kind of musical script, and these
two traditions also contain historical information on the tones.
In this study I have concentrated primarily on the tone dot markings, as these form
an earlier and richer source of information. In the final chapter of part II, however, I
have examined the different kinds of musical script. The information that can be
gained from these sources has been incorporated in chapter 4, in the overview of the
transitional stages that I reconstruct in the central Japanese dialects in the 14th
century.
0.3.2 The tone systems of the modern Japanese dialects
The tone dots in the old texts do not tell us what the tones of the language sounded
like phonetically, though their consistent use shows that they do indicate
suprasegmental phonemic distinctions. When modern linguists at the beginning of
the 20th century began to be interested in the old tone dot markings, the practice of
marking pitch distinctions by means of shōten had long been lost, and knowledge of
the phonetic values of the different shōten had become confused. By the 18th century,
Buddhist and Confucian scholars had put forth many different interpretations of the
phonetic values of the Middle Chinese tones, and even in esoteric Buddhist circles,
the present-day chanting practice does not directly reflect the tonal categories of the
characters (Giesen, 1977).
Consequently, to gain an understanding of the phonetic values represented by the
tone dots, we must compare the phonetic patterns of the modern dialects. By
matching these phonetic values with the corresponding phonemic tone classes of
Middle Japanese, we can deduce with fair certainty the phonetic differences that
kept those classes of words phonemically distinct. (The tonal distinctions of Middle
Japanese and some of the main modern dialects are presented in chapter 1.)
In comparing modern dialect data, the way in which the modern tonal types are
distributed geographically needs to be taken into account. The geographical
distribution of phonetic types can provide clues to which types are old and which are
innovations. The results of comparing main-island dialects are presented in chapters
2 and 3, with additional discussion of the possible existence of a number of extra
phonological distinctions in proto-Japanese in chapter 8.4 In chapter 9, the dialects
of the geographically far removed islands of the Ryūkyūs are discussed separately.
Their segmental phonology differs considerably from the dialects of the main
Japanese islands, and in addition they show an interesting split in the merged tone
classes 2.4/5 and 3.4/5. As discussed in chapter 9, it may not be necessary to
reconstruct this additional split in proto-Japanese.
I also use the technique of internal reconstruction within certain dialects, i.e. the
analysis of irregularities and phonological alternations in paradigmatic forms. The
results of this analysis are presented in chapter 5.
0.3.3 The Late Middle Chinese tones and their relation to the value
of the Japanese tone dots
Since the use of dots to mark tone was adopted from Chinese practice, we must take
the reconstruction of the Late Middle Chinese tones into consideration when trying
to determine the significance of these notations in Japanese texts. One must not, of
course, simply equate a reconstructed tone of Late Middle Chinese directly with a
phonetic feature of a syllable of Japanese. The tone systems of the two languages
were fundamentally different: Like modern Chinese, Middle Chinese was a contour
tone language. Middle Japanese, on the other hand, although containing a limited
number of contour tones, was basically a register tone language. The significance of
the dots had to be altered for that reason alone. In addition, a study of historical
Japanese texts that deal with the Chinese tones shows that, after official contact with
China was severed at the end of the 9th century, Japanese tone theories developed in
a direction of their own, and show uniquely Japanese characteristics that have little
or nothing to do with Middle Chinese.
0.3.4 Historical descriptions of the Late Middle Chinese tones in Japan
Our knowledge of how the Late Middle Chinese tones were interpreted in Japan in
different historical periods is based on a large number of treatises on the Chinese
tones written by Buddhist monks of the esoteric Shingon and Tendai schools.
The oldest and most detailed of these treatises (and the only one that might justly
be called a description of Late Middle Chinese, since it dates from a period when
direct contact with spoken Chinese was still relatively recent) can be found in
Shittan-zō 悉曇蔵 (880) by the Japanese Tendai monk Annen 安然. In Annen’s time,
tone dots were not yet used to mark the pitches of Japanese.5
4 In some cases these subclasses have not survived in any of the modern dialects but have been
attested in the tone dot material. In other cases, they have to be reconstructed based on the
modern dialects but must have disappeared from the attested varieties of Middle Japanese
before the compilation of the tone dot material.
5 As has been mentioned, however, the writing system used in part of Nihon shoki in the 8th
century is believed to have indicated the pitch of syllables in Japanese words through the
selection of otherwise interchangeably used Man’yōgana.
From the 11th to the 14th century, many treatises on the tones were written in
order to obtain a correct pronunciation of the dhāraṇī, and it is in the tone theories
from this period that uniquely Japanese innovations emerged. It was also in this
period that the use of tone dots to express the pitches of Japanese proliferated.
After the 14th century, there is a gap of approximately two hundred years during
which hardly any works on the tones were written. The tone theories of this period
appear to have fallen into disarray. In the 17th and 18th centuries, however, there was
a revival of scholarly interest, and a number of widely divergent tone theories
appeared, now written by Confucian scholars as well as Buddhist monks. The
practice of marking the tones of Japanese syllables by means of tone dots, however,
was not revived.
0.4 Modern reconstructions of the value of the Japanese tone dots
In 1936, Arisaka Hideyo took the first step toward a reconstruction of the tone value
of the Japanese tone dots when he tried to reconstruct the phonetic values of the Late
Middle Chinese tones based on Annen’s description. Later, Konishi Jin’ichi (1948),
Kindaichi Haruhiko (1951), and Mabuchi Kazuo (1962) also made use of this text,
as well as descriptions from later periods, in reconstructions of the tone values of
Late Middle Chinese. Unfortunately, what the tones of Late Middle Chinese may
have been, what esoteric Buddhist monks later thought they were, and how they
applied their understanding to mark the basically level pitches of Japanese – three
separate issues – are usually not distinguished in the literature.
In his most famous article (1951), Kindaichi compared Annen’s description and
several later Buddhist theoretical works on the tones with Japanese tone dot material
and with the tone systems of a number of modern Japanese dialects.6 In this article
Kindaichi presented his reconstruction of the value of the tone dots and the tone
system of Middle Japanese, as well as his ideas on the historical development of the
modern Japanese tone systems from the tone system of Middle Japanese. In chapter
2, I present this now widely accepted theory.
6 In this article Kindaichi combined the evidence that he had accumulated in earlier, separately
published studies. Kindaichi himself (1951: 632-633) mentions ‘Sho-hōgen no hikaku kara
mita Heian chō no akusento’, Hōgen; 7-6, 1937, ‘Bumō-ki no kenkyū, zoku-chō’, Nihon-go no
akusento (Nihon hōgen-gakkai ed.), 1942, ‘Keichū no kanazukai-sho shosai ni mieru koku-go
no akusento’, Koku-go to koku-bungaku; 18-4, 1943, and ‘Ruiju myōgi-shō no wakun ni
hodokosaretaru shōfu ni tsuite’, Kokugo-gaku ronshū, 1944.
In 1979, an alternative theory was proposed by Samuel Robert Ramsey,
according to which the tone values that Kindaichi had reconstructed as ‘high’ and
‘low’ are exactly reversed.
A preliminary outline of Ramsey’s theory is presented in chapter 3. The chapters
that follow discuss a number of different issues from the viewpoint of his theory:
the historical developments in the tone systems of the main dialects in Japan (cf.
chapter 4), internal reconstruction (cf. chapter 5), the developments in the tone
systems of a number of smaller dialects (cf. chapter 6), the influence of segmental
features on the rightward tone shift that can be seen in many dialects in Japan, and
what these shifts imply as to the validity of Ramsey’s theory.
The results of the investigations are summarized in chapter 10, where I present
an overview of my reconstruction of the developments from the proto-Japanese tone
system to the more restricted tone systems of the modern dialects.
Finally, in chapter 11, I investigate the accent of Japanese loanwords in Ainu,
and present the evidence for and against the two theories contained in this material.
0.5 Conventions
Below I will introduce conventions used in the classification of the tonal distinctions
of the Japanese nouns, the representation of tone, and the spelling. I will also discuss
the terminology used in the analysis of the different tone systems.
0.5.1 Tone classes, symbols and spelling
Comparing the tone patterns of nouns in the three major prosodic types in Japan, it is
possible to divide the nouns into a number of different tone classes, based on the
way in which the tone patterns of the nouns correspond to each other. These tone
classes have conventionally been assigned a number, such as 1.1, 1.2, 1.3 etc. The
number before the dot refers to the number of syllables in the word and the number
after the dot refers to the number assigned to the tone class. The number 1 after the
dot is reserved for those tone classes that have Ø tone in the Kyōto type tone
systems and in the Tōkyō type tone system that immediately surrounds the Kyōto
type on the island of Honshū, the so-called Nairin Tōkyō type. (See section 1.1.1.)
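The numbering convention can be made concrete with a small sketch. The function name `parse_tone_class` and the `ToneClass` record are illustrative assumptions, not part of the classification itself; an optional subclass letter (as in 3.5a) is accepted as a suffix.

```python
import re
from typing import NamedTuple, Optional


class ToneClass(NamedTuple):
    syllables: int           # number before the dot: syllable count of the noun
    class_number: int        # number after the dot: number assigned to the tone class
    subclass: Optional[str]  # optional subclass letter, e.g. "a" in 3.5a


# A label such as "2.3" or "3.5a": digits, a dot, digits, optional a/b.
_LABEL = re.compile(r"^(\d+)\.(\d+)([ab])?$")


def parse_tone_class(label: str) -> ToneClass:
    """Split a tone class label into syllable count, class number and subclass."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a tone class label: {label!r}")
    return ToneClass(int(match.group(1)), int(match.group(2)), match.group(3))


print(parse_tone_class("2.3"))
print(parse_tone_class("3.5a"))
```

Under this reading, a label like 3.1 denotes a three-syllable noun of the class that has Ø tone in the Kyōto type and Nairin Tōkyō type systems.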
In the division into tone classes I follow Martin (1987). Martin’s division builds
on Kindaichi & Wada’s division of 1955, but Kindaichi later (1974) deleted the
small and irregular class 3.3 from the list of three-syllable nouns.7
7 In addition, Martin acknowledges a number of subclass divisions. The first, proposed by
Hattori (1951), is based on tone dot attestations and concerns subclasses 3.5a and 3.5b. The rest
were proposed by Hayata (1973) and concern subclasses 2.2a, 2.2b, 3.2a, 3.2b, 3.7a and 3.7b.
Hayata’s subclasses have not been attested with tone marks that differ from the markings of
other nouns of classes 2.2, 3.2 and 3.7, but based on the reflexes in the modern dialects they
may have to be reconstructed in proto-Japanese. I have left discussion of the small subclasses
2.2a, 3.2a and 3.7a to chapter 8. In the preceding chapters, 2.2, 3.2 and 3.7 refer to the larger
subclasses 2.2b, 3.2b and 3.7b. The subclasses 1.3b and 3.5b have been attested in Middle
Japanese and are therefore briefly introduced in sections 1.3.1 and 1.3.3, but as they are small
and as the modern reflexes are unclear, further discussion of these subclasses is left to chapter 8.
In the other chapters therefore 1.3 refers to the larger class 1.3a and 3.5 refers to the larger class
3.5a.
I have adopted the Japanese convention of indicating a syllable with high pitch
[H] with a black dot ●, and a syllable with low pitch [L] with a white dot ○, as I
found this easier on the eyes, especially in the many tables. A rise [R] will be
indicated by and a fall [F] by . In the few cases in which I discuss the possible
existence of a mid tone [M] I have adopted the mark , first used for this purpose
by Hattori Shirō (1951).8
These symbols aim to indicate the tones as they are (or were) realized.9 (The
symbols [ and ] are therefore omitted in these representations.) In many of the
modern dialects the phonetic realization can differ considerably from the underlying
phonological representation, depending on the pitch assignment rules in the different
dialects. For instance, a word which can be represented as /ØØØHØ/ phonologically,
is realized in phrase initial position as % in Tōkyō, % in
Nagoya and % in Akita.10
In some cases, I have added the phonological representation in terms of /H/ and
Ø tone to the surface realization in the modern dialects, but in most cases I have
avoided having to do so by adding a phonological mark ( ' ) to the surface
representations. In this way, the surface realization and the phonological
representation can be conflated. This mark is usually referred to as the ‘accent mark’,
but in the tonal analysis which I have adopted, it marks the location of the /H/ tone.
In the many Japanese dialects in which the location of the /H/ tone in the word is
distinctive, this location can be determined by looking for the distinctive drop to [L]
pitch that immediately follows the /H/ tone: Tōkyō ', Nagoya ',
Akita '. The mark shows that the phonological shape in all three dialects
is the same. In Akita the surface form and the underlying form are identical, but in
dialects such as Tōkyō and Nagoya it is necessary to distinguish /H/ tone from [H]
pitch.
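The procedure of finding the /H/ tone by the drop that follows it can be sketched as below. The string encoding (one letter per syllable, `H` or `L`) and the function name `locate_h_tone` are assumptions made for illustration, and the pitch strings in the example are hypothetical surface forms rather than attested words.

```python
from typing import Optional


def locate_h_tone(surface: str) -> Optional[int]:
    """Return the 0-based index of the syllable that carries /H/ tone.

    The /H/ tone is taken to sit on the [H] syllable that is immediately
    followed by a drop to [L]. If no such drop occurs in the string,
    None is returned. Only the first drop is reported, so the sketch
    assumes a dialect with at most one distinctive drop per word.
    """
    for i in range(len(surface) - 1):
        if surface[i] == "H" and surface[i + 1] == "L":
            return i
    return None


# Two hypothetical surface realizations with the drop in the same place:
print(locate_h_tone("LHHHL"))
print(locate_h_tone("LLLHL"))
# A string without any drop yields no /H/ location:
print(locate_h_tone("LHHHH"))
```

Both hypothetical strings yield the same location, which is the sense in which one phonological mark can serve dialects with different surface pitches.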
An important aspect in which the prosodic system of modern Kyōto differs from
the standard dialect is that in Kyōto /L/ tone on the initial syllable is distinctive.
(This distinctive initial /L/ tone in Kyōto is often referred to as ‘/L/ register’.)
When phonological marks are added to Kyōto forms, it is customary to place the mark
' before the initial syllable of words that start with /L/ register. In the
pitch-accent analysis of the Japanese prosodic systems, this mark is referred to as the
‘pre-accent’.
8 In the description of the pitches of Ryūkyūan and Ainu in chapters 9 and 11 on the other hand,
I use the symbol for [L] or the location of a drop to [L] pitch and the symbol for [H] or the
location of a rise to [H] pitch.
9 As will be explained in section 1.1.1, I have adopted the so-called ‘new two-step analysis’, in
which [M] pitch , although a phonetic reality in a number of modern dialects, is represented
as [H] . The result is that the representation of the actual pitches of words with all Ø tone is
not purely phonetic but has been influenced somewhat by the phonological analysis.
10 The initial [L] pitch in all three dialects is an automatic %L phrase boundary tone, which is
assigned postlexically. On the use of /H/ and Ø, see section 0.5.2.5.
Finally, in those rare modern dialects that include the toneme /R/, the location of
this tone is indicated by adding the mark '' after the syllable with /R/ tone.
For Japanese words quoted in italics as linguistic examples, I use the Kunrei
Romanization system. (In all other cases I use the Hepburn Romanization system, as
this system is closest to the pronunciation of the modern Japanese standard
language.)
The Kunrei Romanization, which has the character of a phonemic transliteration
of the kana spellings, is most suitable for my purposes as it makes it possible to
avoid using different spellings for different periods: I quote examples from all kinds
of different periods in the history of Japanese in which the pronunciation often
differed, and I do not want to switch between many different spellings for one and
the same word that has always maintained the same kana orthography. In the
following section, which treats some aspects of the segmental phonology of Old
Japanese, I will use a spelling that more closely reflects the likely pronunciation of
that period, but in the parts that deal with tone, a word like ‘river’ for instance, will
always be spelled as kaha, although depending on the period the pronunciation was
kapa, kaɸa or kawa.11
Long vowels however (which mainly occur in the sections on compound nouns
in chapter 5), are indicated by means of double vowel signs (aa, ii, oo, uu and ei), as
this makes it possible to indicate the exact location of the pitch fall in Tōkyō, which
can only occur in the middle of a long vowel (i.e. a'a, i'i, u'u, o'o and e'i).
As I discuss many dialects, it has not been possible for me to introduce the
differences in phonology between all these dialects, and I have limited myself to a
discussion of their tone systems. The phonology of the Ryūkyūan dialects differs
strongly among the dialects themselves and from standard Japanese. I have
nevertheless decided to introduce the nouns that I discuss in my treatment of the
word-tone systems of the Ryūkyūs in chapter 9 in the standard Japanese form.
Listing the different pronunciations in the many Ryūkyūan dialects in the tables
would have been too cumbersome, while using the standard form also facilitates
identification with the corresponding lexical items in the dialects of mainland Japan.
0.5.2 Terminology: Pitch-accent or tone
Modern Tōkyō Japanese often features as the archetypical pitch-accent language,
and accentual terminology is often applied to the Middle Japanese stage as well. As
is clear from the previous pages, I analyze the modern Japanese dialects as well as
Middle Japanese in terms of tone. In order to account for this choice I will first
compare different views on the status of pitch-accent as opposed to stress-accent and
tone, and then show why for both modern Japanese and Middle Japanese the tonal
analysis proves to be most useful.
11 An exception is the spelling of Old Japanese used in sections 0.6.1.1 and 11.10. I follow Martin
(1987) in representing the kō (type 1) vowels as /yi/, /ye/ and /wo/, and the otsu (type 2) vowels
as /ey/, /iy/ and /o/, which merged as /i/, /e/ and /o/ in Middle Japanese. In these sections /h/ is
spelled as /p/.
0.5.2.1 What is pitch-accent?
When tone, stress-accent and pitch-accent are compared, tone and stress-accent
seem to be the maximally opposed systems with maximally clear separation of
properties. It is possible to define these prototypes and establish a set of properties
that typically co-occur in each (Odden, 1999, Hyman, 2006).
The two central criteria that distinguish prototypical stress-accent from tone are
culminativity (every lexical word has at most one syllable marked for the highest
degree of metrical prominence) and obligatoriness (every lexical word has at least
one syllable marked for the highest degree of metrical prominence, i.e. primary stress).
In other words, in stress-accent systems, not only is there generally at most one
stress per word, there also is at least one stress per word, barring clitics, which
cannot stand on their own (Odden, 1999:198). The presence of a stressed syllable in
the word defines a word as a word.
In a prototypical tone language on the other hand, tone is neither obligatory
(requiring for instance that every lexical word has to contain at least one segment
with a certain tone) nor culminative (restricting certain tones to one per lexical
word).
Establishing a third pitch-accent prototype is more elusive, as many languages
that have been analyzed in terms of pitch-accent, like modern standard Japanese,
have properties that are reminiscent of both stress-accent (culminativity) and tone
(use of pitch height, non-obligatoriness).
While scholars agree on the fact that prosodic systems like those of standard
Japanese occupy a typological middle ground between tone and stress-accent, there
are differences as to whether such systems are regarded as closer to stress-accent or
closer to tone, depending on which criterion is regarded as most essential:
obligatoriness or culminativity. (And this, in turn, causes differences in the
terminology applied to this typological middle ground.)
Beckman for instance (1986:1), who regards Japanese as an accentual language,
sees the difference between stress-accent and ‘non-stress accent’ (pitch-accent) as a
mere difference in phonetic realization: Stress-accent differs phonetically from non-
stress accent in that it uses to a greater extent material other than pitch.
1 Comparison of the properties of pitch-accent, stress-accent and tone
              tone    Japanese        stress-accent
                      pitch-accent
culminative   -       +               +
obligatory    -       -               +
The fact that stress-accent is obligatory, while pitch-accent – such as in case of
Japanese – is not, is thus not regarded as a point of primary importance.
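The two criteria compared in the table can be sketched as checks over a lexicon. In this hedged illustration each word is reduced to the number of prominence marks it carries; the function names, the label strings and the toy lexicons are assumptions, and a real classification would of course rest on far more than such counts.

```python
from typing import Dict, Iterable, Tuple


def properties(marks_per_word: Iterable[int]) -> Tuple[bool, bool]:
    """Return (culminative, obligatory) for a lexicon of per-word mark counts."""
    counts = list(marks_per_word)
    culminative = all(count <= 1 for count in counts)  # at most one mark per word
    obligatory = all(count >= 1 for count in counts)   # at least one mark per word
    return culminative, obligatory


# The three columns of the comparison, keyed by (culminative, obligatory).
PROTOTYPES: Dict[Tuple[bool, bool], str] = {
    (False, False): "tone",
    (True, False): "Japanese-type pitch-accent",
    (True, True): "stress-accent",
}


def classify(marks_per_word: Iterable[int]) -> str:
    """Map a lexicon onto one of the three columns, if it fits any of them."""
    return PROTOTYPES.get(properties(marks_per_word), "not covered by the comparison")


print(classify([1, 1, 1]))     # every word has exactly one mark
print(classify([0, 1, 1, 0]))  # some words have no mark, none has more than one
print(classify([0, 2, 1, 3]))  # neither restriction holds
```

The combination "obligatory but not culminative" has no column in the comparison, so the sketch reports it as not covered rather than forcing it into one of the three types.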
Hayata (1999:222) likewise sees pitch-accent and stress-accent as a continuum:
The marked segment may have higher pitch, higher amplitude, increased duration, it
may be more peripheral in vowel articulation or capable of distinguishing more
phonemes; the more a prosodic system possesses these properties, the closer it is to
stress-accent. Accent is thus an abstract entity whose surface realization can differ
from one language to another, so that in some languages it can be expressed
primarily or entirely by means of pitch height (pitch-accent). In this approach pitch-
accent is a subset of stress-accent and is contrasted with tone, based on the fact that
pitch-accent and stress-accent are both culminative (restricted to at most one per
word), while tone is not.
McCawley (1964) for instance defined the difference between tone and pitch-
accent as follows:
If the underlying form of each morpheme requires at most the specification of
the location of some pitch phenomenon (e.g. the location of high pitch or a
drop in pitch), the language has a pitch-accent system and is not a tone
language; if a morpheme generally requires an underlying form in which each
syllable must be specified for an underlying tone (so that the number of
potential underlying tonal contrasts increases geometrically with the number
of syllables, as compared with the pitch-accent case, where the number of
potential tonal contrasts only increases arithmetically with the number of
syllables), the language is a tone language.
Following McCawley’s definition, Japanese is definitely not a tone language, as in a
tone language each syllable must be specified for an underlying tone. In reality
however, most tone languages are restricted to some extent, so that in most tone
languages the number of contrasts does not actually increase geometrically with the
number of syllables, and many languages with a more limited number of contrasts
are still analyzed in terms of ‘tone’. Even languages with severe restrictions on the
number of tonal contrasts can be handled in strictly tonal terms, treating these
languages as so-called ‘restricted tone languages.’
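The contrast between arithmetic and geometric growth in McCawley's definition can be illustrated numerically. The two formulas below are simplifying assumptions, not McCawley's own: for the pitch-accent case, one pattern per possible location plus one pattern without any location (n + 1); for the unrestricted tone case, a free choice among t tones on every syllable (t to the power n).

```python
def accent_contrasts(n_syllables: int) -> int:
    """Potential contrasts if only one location (or none) is specified per word."""
    return n_syllables + 1


def tone_contrasts(n_syllables: int, n_tones: int = 2) -> int:
    """Potential contrasts if every syllable freely takes one of n_tones tones."""
    return n_tones ** n_syllables


# Compare the growth of the two counts for words of one to five syllables.
for n in range(1, 6):
    print(n, accent_contrasts(n), tone_contrasts(n))
```

The gap between the two counts widens quickly with word length, which is why restrictions on tone distribution matter for where a system falls between the two extremes.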
The idea that tone can still be tone, even when its distribution is restricted, is the
basis of the approach by Meeussen (1972), Odden (1999) and Hyman (2006) among
others. In their view, tone forms a continuum from almost completely unrestricted to
very restricted, and in a tone system at one end of the restrictedness scale tone may
be culminative. Restricted tone – as a subset of tone – is contrasted with stress-
accent on the basis of the fact that stress-accent defines a word as a word (has an
obligatory head), while tone and restricted tone do not.12
12 If the classification of a prosodic system as accentual is based on the criterion of whether
accent in such a system is obligatory or not, the term pitch-accent may have to be limited to
prosodic systems like those of Nubi (Gussenhoven, 2006) or Ainu (see chapter 11) where accent
Following this approach, Japanese is definitely not accentual as Japanese-style
pitch-accent is markedly different from stress-accent: Even in the archetypical pitch-
accent system of modern standard Japanese, the vast majority of words is
unaccented, and so in this respect, even modern standard Japanese is more tone-like
than accent-like.
Japanese-style pitch-accent is thus regarded as a form of tone with restrictions
on the number of tonal contrasts. As the closest affinity is with tone and not with
stress-accent, the term ‘pitch-accent’ is replaced by ‘restricted’, ‘sparse’ or
‘privative’ tone.
0.5.2.2 What is restricted tone?
There are a number of Bantu languages in eastern Zaire, Rwanda, Burundi and adjacent
areas with tone systems that strongly resemble the pitch-accent system of modern
Tōkyō Japanese. Since the work of McCawley (1970), where it was proposed that
the Bantu language Luganda shares typological characteristics with Japanese, many
of these languages have been analyzed in terms of pitch-accent at some point, but
they are now usually treated as languages with restricted tone systems, often by the
very authors who earlier proposed an accentual analysis (Odden, 1999:188).
In such restricted tone systems the /H/ tones are often more salient than the /L/
tones, so that such tone systems have been claimed to have a /H/ vs. Ø opposition, i.e.
there is one active, accent-like /H/ tone, and a default [L] or Ø tone. In tone systems
which have such restricted tone, the Ø tone can be realized in different ways so that
the underlying prosody is abstractly different from surface realizations.
In concrete terms this means that the unmarked or Ø tones in such systems are
not always realized with [L] pitch. They can be realized in different ways, as [H], [L] or
[M], depending on their location in the word. It is, for instance, not uncommon for Ø
tones preceding /H/ tone to be realized with [H] pitch, because in /H/ vs. Ø tone
languages the Ø tones often anticipate the accent-like /H/ tone. As a result, the
number of [H] tones can be larger than the number of [L] tones in the surface
realization, even though the marked /H/ tone is much rarer than the unmarked Ø tone
in the underlying phonemic representation.
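The anticipation effect can be sketched as a simple realization rule. This is a minimal illustration under stated assumptions: the rule (every syllable up to and including the last /H/ surfaces as [H], everything after it as default [L]) is a deliberate simplification, the treatment of toneless words as all [L] is likewise an assumption, and the function name and example word are hypothetical.

```python
def realize(underlying: str) -> str:
    """Map an underlying string of 'H' and 'Ø' onto surface pitches 'H' and 'L'.

    Ø syllables that precede a /H/ tone anticipate it and surface as [H];
    Ø syllables after the last /H/ surface with default [L] pitch.
    A word without any /H/ tone surfaces as all [L] in this sketch.
    """
    if any(symbol not in "HØ" for symbol in underlying):
        raise ValueError("underlying form may contain only 'H' and 'Ø'")
    last_h = underlying.rfind("H")  # -1 when the word has no /H/ tone
    return "".join("H" if i <= last_h else "L" for i in range(len(underlying)))


word = "ØØØHØ"
surface = realize(word)
print(surface)
print(word.count("H"), surface.count("H"))
```

In the example a single underlying /H/ yields four surface [H] pitches, which is the mismatch between underlying rarity and surface frequency that the paragraph describes.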
0.5.2.3 Tone or pitch-accent in Middle Japanese
In many works on the history of the Japanese tone system, the term ‘pitch-accent’ is
not only applied to the modern dialects but also to Middle Japanese.13 There have
been, for instance, numerous attempts to analyze the Middle Japanese tone system
with the help of accent marks. Such analyses are unnecessarily complicated as the
oppositions in the Middle Japanese tone system were too numerous to be captured in
a system of pitch-accent.14
is not only culminative and expressed by means of pitch height, but also obligatory.
13 In Japanese works the term akusento is commonly used, which has a much wider range of
meanings than the English term ‘accent’. Akusento can refer to almost anything prosodic:
‘pitch-accent’, ‘pitch height’, ‘word-tone’ and ‘tone’.
Another approach has been to analyze the tone system of Middle Japanese as a
combination of tone and pitch-accent, just as there are languages that combine tone
and stress-accent (cf. Swedish/Norwegian, Serbo-Croatian). In this approach /H/ or
/L/ tone on the initial syllable of the word in Middle Japanese is described in terms
of tone, while occurrences of [H] or [L] pitch in other than the first syllable are
described in terms of pitch-accent (cf. Martin, 1987, Hayata, 1999). Each word thus
starts with a /H/ or /L/ word-tone called ‘register’ with the additional possibility of a
locus of pitch-accent later on in the word.
The term ‘pitch-accent’ when used for Middle Japanese refers to the location of a
change from [H] to [L] or from [L] to [H], so that any change of pitch after the /H/
or /L/ register of the initial syllable is regarded as a ‘locus of accent’. A word with
tone for instance, is analyzed as having /L/ register and one locus of accent, a
word with tone is analyzed as having /L/ register and two loci of accent, but a
word with tone is analyzed as having /L/ register and no accent.
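The register-plus-accent analysis described here amounts to a simple count, sketched below. The pitch strings (one letter per syllable, `H` or `L`) are hypothetical examples chosen for illustration, and the function name is an assumption; a locus is recorded at the index of the syllable on which the new pitch begins.

```python
from typing import List, Tuple


def register_and_loci(pitches: str) -> Tuple[str, List[int]]:
    """Return the register (pitch of the initial syllable) and the loci of accent.

    Every change of pitch after the initial syllable, whether from [L] to [H]
    or from [H] to [L], counts as one locus of accent in this analysis.
    """
    if not pitches:
        raise ValueError("empty pitch string")
    register = pitches[0]
    loci = [i for i in range(1, len(pitches)) if pitches[i] != pitches[i - 1]]
    return register, loci


print(register_and_loci("LLH"))  # /L/ register, one locus
print(register_and_loci("LHL"))  # /L/ register, two loci
print(register_and_loci("LLL"))  # /L/ register, no accent
```

Because the count grows with every pitch change, a word may end up with as many loci as it has non-initial syllables.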
Even if we leave the criterion of obligatoriness aside, there are a number of
problems with this approach. What is essential in an accent system is that it should
be possible to point to a specific segment in the word (usually one with a [H] pitch)
which is more prominent and highlighted over others. The term ‘accent’ after all
implies that there is a location in the word where a culmination of prosodic features
occurs, thereby marking the unit that bears the accent with greater salience than
surrounding units. In other words, there is a specific segment which is the bearer of
accent, and pitch change functions to indicate which segment is accented.
The pitch change that indicates which segment is accented can vary from
language to language. In some languages, the location of a rise in pitch indicates
which segment is accented (cf. Hokkaidō Ainu, Middle Korean) while in other
languages, such as in modern Tōkyō Japanese (as well as in most other modern
Japanese dialects) the location of a fall in pitch indicates which segment is
accented.15
14 To illustrate this point I will show two different analyses of Middle Japanese as a pitch-accent
language, by Okuda (1971) and McCawley (1978). The lists below show all the tone patterns
attested in Middle Japanese for disyllabic nouns in the standard reconstruction. In longer nouns
in Middle Japanese it can be seen that a very limited type of morpheme-internal tone spreading
had taken place which had reduced the number of original contrasts. (See section 4.5.) Even
with a default pitch to account for some instances of [H] or [L] – Okuda takes [L] and
McCawley takes [H] as the default pitch – it is often necessary to use as many ‘accent’ marks
as there are syllables. It is for instance necessary to add McCawley’s sign ', which indicates a
fall in pitch, before each syllable or mora with /L/ tone. In Okuda’s treatment on the other hand,
not only the location of a fall in pitch (marked by ) is regarded as an ‘accent’ (just as for
instance in the pitch-accent analysis of modern standard Japanese) but the location of a rise in
pitch (marked by ) is also regarded as an ‘accent’:
Okuda McCawley
2.1
2.2 '
2.3 ''
2.4 '
2.5 ''
In modern Japanese therefore, the pitch fall is the acoustic cue that signals which
segment is accented. When a pitch-accent analysis is applied to Middle Japanese on
the other hand, it is not a specific segment that is regarded as accented, but the
location of the change in pitch itself is regarded as the ‘locus of accent’. Moreover,
this holds for any change in pitch, from [L] to [H] as well as from [H] to [L].
The following are the tone patterns that were allowed in case of four-syllable
nouns in Middle Japanese in the standard reconstruction. All the locations of pitch
change after the initial syllable have been marked: 4.1 , 4.2 , 4.3
, 4.4 , 4.5 , 4.6 , 4.7 , 4.8 ,
4.9 , 4.10 , 4.11 .16
As can be seen, it is impossible to capture the distinctions of Middle Japanese
with an analysis in which only the location of a rise in pitch or only the location of a
fall in pitch is distinctive. It is also impossible to point to a specific segment as the
bearer of accent. What is the accented segment in case of tone class 4.8 for instance?
The second syllable? But in order to explain the difference with class 4.10, we
would have to analyze class 4.8 as having three accented syllables in a row. (And
class 4.9 as having two accented syllables in a row.) Likewise, does class 4.7
perhaps have two consecutive accented syllables, distinguishing it from class 4.11,
which has only one?
If it is possible to have any number of distinctive /H/ tones in a word, these /H/
tones should not be regarded as accents, but simply /H/ tones. In my opinion
therefore, the tonal distinctions of Middle Japanese are too rich to be captured in a
system of pitch-accent. The fact that accent is no longer regarded as based on a
specific segment, and the fact that any change in pitch is regarded as a ‘locus of
accent’ are artifices that were introduced to force a register tone language into an
accentual mold. As it is not possible to identify a specific accent-bearing syllable in
Middle Japanese, the use of the term ‘pitch-accent’ for Middle Japanese should be
rejected.
Another problem is that – looking at the tone system of Middle Japanese itself –
there is no basis for the special status that is given to the tone of the initial syllable
over the tone of other syllables in the word. As we shall see in section 4.6, the idea
that tone occurring in the initial syllable in Middle Japanese had a different status
than tone occurring elsewhere in the word stems from the habit of projecting
features that characterize the prosodic system of modern Kyōto back onto Middle
Japanese.
15 An indication for instance, that the speaker’s target is the syllable or mora with the /H/ tone
itself, and not the syllable or mora boundary where the pitch falls, is the fact that in Tōkyō
Japanese, the pitch of the accented segment is in fact higher than the pitch of any of the syllables
in the phrase before it.
16 I have excluded some of the smaller tone classes which included the rare contour tones of
Middle Japanese.
It is simpler and more correct to regard Middle Japanese as a syllable-tone
language in which the two register tones /H/ and /L/ formed the basis of the tone
system. Middle Japanese also included the contour tones /R/ and /F/, but these
tonemes were rare, and most likely the result of contractions. In some register tone
systems there is a default tone – usually [L] in a two-tone system or [M] in a three-
tone system – that is more common and less salient than other tones. There is
however, no sign that certain tones were more salient than others in the distribution
of the tones at the Middle Japanese stage, and so we have to reconstruct two equally
active tones /H/ and /L/.
0.5.2.4 Tone or pitch-accent in the modern dialects
The oppositions in the tone systems of the modern Japanese dialects are far more
limited. In many dialects (including the standard language of Tōkyō) there
is only a single distinctive location in the word or phrase of a transition from [H] to
[L]. The syllable in the word or phrase that contains the last [H] before [L] is thus
highlighted over others, and can be regarded as ‘accented’. As the prosodic system
of modern standard Japanese employs tone height and is culminative, it is not
surprising that Japanese often features as the archetypical pitch-accent language.
If obligatoriness is not included in the definition of what constitutes an ‘accent’,
the pitch-accent analysis works well for the standard language, and combined with a
tonal distinction that is limited to the word-initial syllable only, may also be applied
to the modern Kyōto type dialects, as in both types a single accented syllable per
word can be located by the subsequent drop to [L] pitch. (The prosodic system of
modern Kyōto is much more restricted than the prosodic system of Middle
Japanese.)
Extending the pitch-accent analysis to many of the other Japanese dialects on the
other hand, is problematic. For the prosodic system of the dialect of Nozaki
(discussed in more detail later on in this section and in section 6.2) to fall under the
definition of pitch-accent for instance, it would be necessary to widen the definition
of what constitutes an accent beyond ‘the last [H] before [L]’, and to allow more
than one accented syllable per word. Furthermore, the Kagoshima type dialects in
southwest Kyūshū and many dialects in the Ryūkyūs distinguish between a limited
number of different word-tones which can be mapped over words or phrases of any
length so that there is not a specific syllable in the word which can be regarded as
accented.17 These dialects, although restricted, are therefore clearly tonal.
17 In a typology of tone systems based on the domain of tonal contrast, Donohue (1997)
distinguishes word-tone systems, which use the whole word as the relevant tone assignment
domain, from syllable-tone systems, in which each syllable is allowed to bear a distinctive tone
independent of the other syllables in the word. In word-tone languages, a small number of
underlying tonal melodies account for the surface specification of mono- and polysyllabic
words. It is typical for such languages to allow tonal melodies to spread over the full tonal
Widening the definition of pitch-accent to include the Japanese word-tone
systems is clearly out of the question. Widening the definition of pitch-accent to
include the prosodic system of a dialect like Nozaki would only result in making the
term ‘pitch-accent’ meaningless, just as when the term was used for Middle
Japanese. When pitch-accent becomes indistinguishable from tone, it is better to call
it ‘tone’.
What to do then, with more restricted dialects like modern Tōkyō? Even in case
of Tōkyō Japanese, the term ‘accent’ is merely used in the sense of a diacritic
indication of pitch which shows the alignment of the tones. Pitch-accent in this sense
shares nothing more with stress-accent than the fact that both tend to be limited to
one per word (culminativity), and we have seen that there are linguists who argue for
an analysis as restricted tone, as they accept the idea that restricted tone at one end
of the restrictedness scale may be culminative.18
My decision to abandon the pitch-accent analysis in favor of an analysis in terms
of restricted tone – even for a dialect like modern Tōkyō – has also been based on
practical considerations: One reason is that an analysis as restricted tone means that
a typological division between the different modern dialects, and between the
modern dialects and Middle Japanese can be avoided. This removes the
complication of having to switch back and forth between different sets of
terminology when discussing different modern dialects, or different historical stages.
Another factor is that the typological division that results from analyzing some
of the modern Japanese dialects as accentual and others as tonal comes across as
unnatural, as it often occurs between dialects that are closely related to each other in
all other respects: The prosodic system of the village of Nozaki on Noto Island has
preserved more than one accent-like /H/ tone per word, as well as a distinctive /R/
toneme, and in this respect, it is very archaic. Even if we leave aside the /R/ tone, /H/
tone in this dialect is not culminative, and so it would be hard to include the
prosodic system of Nozaki under a definition of ‘pitch-accent’. The dialect of
Nozaki would thus have to be separated typologically from the closely related
dialect of nearby Kōda village on the same island, which is of the more usual Tōkyō
type, in that it has lost all but one /H/ tone per word.
domain including (otherwise) toneless affixes.
18 They also accept the idea that restricted tone languages may have other properties reminiscent
of accent. McCawley (1970) made the following distinction between pitch as it is involved in a
true tone language, and pitch as it is involved in a pitch-accent system: In a language with a
pitch-accent system, the rules affecting pitch are accent reduction rules, i.e. rules which make
one element of a word or phrase predominant by eliminating or weakening the accentual
phenomena elsewhere. Accent reduction may involve action at a distance. By contrast, rules
affecting pitch in a tone language are the same kinds of assimilations and dissimilations that
affect ordinary segmental features, where the segments have to be adjacent to one another.
Odden (1999:199) however, points out: “In most Bantu languages, a good argument can be
made on language-internal grounds for assuming a privative H~Ø opposition, rather than H~L
(see Stevick, 1969) and this may also be at the root of the properties of Bantu which are
reminiscent of accent such as ‘long distance de-accentuation’.”
Furthermore – as I have just mentioned – most dialects in the Ryūkyūs have
word-tone systems. These dialects are therefore clearly tonal and not accentual. In
the dialect of San on Tokunoshima however (Thorpe, 1983:134), a distinct location
of the /H/ tone in the word has to be recognized. Matsumori has furthermore argued
that in the dialects of Masana and Wadomari on Okinoerabu (2000:103–105), and
Tarama on Taramajima (2000:106–109) as well, the opposition between a number of
the tone classes can only be captured if a link between certain pitch changes and a
specific syllable in the word is acknowledged. In case of these dialects it is therefore
possible to point to a specific syllable in the word which can be regarded as
‘accented’.
These dialects would therefore qualify as pitch-accent languages, which would
again create a typological division between dialects that are closely related and
sometimes located on one and the same island. An analysis in terms of tone on the
other hand, includes both types (although we have to distinguish between word-
based tone and syllable-based tone), and can thus avoid a more fundamental
typological division.
Finally, the restricted tone analysis also makes the historical developments in
Japanese more transparent, as it reflects the historical continuity between the
relatively unrestricted tone systems in earlier stages of the language and the more
restricted tone systems of the modern dialects, as well as the actual process of
change: As will be shown in chapter 4, one of the most important changes in the
historical development of the Japanese tone systems has been the restriction of the
number of /H/ tones per word.
0.5.2.5 The tone systems of the modern Japanese dialects as ‘restricted tone’
Applying the restricted tone analysis and terminology outlined in section 0.5.2.2 to
Japanese makes a lot of sense, as it acknowledges the special accent-like status of
the /H/ tone, but does not require /H/ tone to be limited to one per word. As
mentioned, the tonal oppositions in the dialect of Nozaki are hard to capture in a
system based on the presence of at most one accent per word. The tone of class 3.7
for instance, is , which stands in opposition to the tone of class 3.6, which is
. Analyzed in terms of restricted tone, tone class 3.7 can be represented as
having /HØH/ tone. An analysis in terms of restricted tone also enables us to
acknowledge /R/ tone in the dialect of Nozaki, which is needed in the analysis of the
tone of class 2.5. Tone class 2.5 with an attached case particle has - tone,
which stands in opposition to tone class 2.4 which has - tone. Tone class 2.5
in Nozaki can now be represented as having /HR/ tone, where the /R/ tone is realized
with [L] pitch, while the rise to [H] pitch is shifted onto the attached enclitic case
particle.
Furthermore, the anticipation of /H/ tone that often occurs in /H/ vs. Ø tone
languages explains why the /H/ tone in so many modern Japanese dialects can only
be located by searching for the subsequent drop to [L] pitch. In many Tōkyō type
dialects (including those of Tōkyō proper and Nozaki) the phrase-initial syllable is
exempt from the anticipatory raising so that a four-syllable word with /H/ tone on
the penultimate syllable (/ØØØHØ/) is realized as .19
In conclusion, we can say that the dialects of Tōkyō and Nozaki both have
restricted tone systems in which tonal anticipation occurs. However, Tōkyō allows
only one /H/ per word and has lost /R/ tone, so that 2.5 has merged with 2.4 as -
and 3.7 has merged with 3.6 as , while Nozaki is archaic in that it has
preserved two /H/ tones in class 3.7, and /R/ tone in class 2.5. Tōkyō Japanese is
thus a restricted tone language at one end of the restrictedness scale.
In the prosodic system of modern Kyōto, not only the location of the last [H]
before [L] is distinctive, but also whether a word begins with /H/ or /L/ tone. In
addition to the /H/ vs. Ø tonal opposition therefore, the Kyōto type dialects have a
/L/ toneme, which can only occur in word-initial position. The question of whether
this /L/ tone was inherited directly from Middle Japanese, or redeveloped later, will
be addressed in chapter 4.
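The restrictions summarized in this section can be sketched as a small check on tone strings. The following Python fragment is only an illustration and not part of the analysis itself: the table `DIALECTS`, the function name `is_possible_pattern` and the encoding of tone patterns as strings over H, R, L and Ø are assumptions of mine. The table records only what is stated above (Tōkyō: at most one /H/ and no /R/; Nozaki: more than one /H/ and /R/ possible; Kyōto: /L/ in word-initial position only). Where the text states no limit, the sketch imposes none, which should not be read as a claim about the dialect.

```python
# Hypothetical table: only the restrictions mentioned in this section.
# max_h = None means that no upper limit is stated in the text.
DIALECTS = {
    "Tokyo":  {"symbols": {"H", "Ø"},      "max_h": 1},
    "Nozaki": {"symbols": {"H", "R", "Ø"}, "max_h": None},
    "Kyoto":  {"symbols": {"H", "L", "Ø"}, "max_h": None},
}


def is_possible_pattern(tones, dialect):
    """Return True if the tone string violates none of the listed restrictions."""
    rules = DIALECTS[dialect]
    # Every tone must belong to the inventory assumed for the dialect.
    if not tones or any(t not in rules["symbols"] for t in tones):
        return False
    # Restriction on the number of /H/ tones per word.
    if rules["max_h"] is not None and tones.count("H") > rules["max_h"]:
        return False
    # /L/ can only occur in word-initial position.
    if "L" in tones[1:]:
        return False
    return True
```

Under these assumptions the /HØH/ pattern of class 3.7 and the /HR/ pattern of class 2.5 are accepted for Nozaki but rejected for Tōkyō, which is one way of expressing that Tōkyō lies at the more restricted end of the scale.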
0.6 The selection of the corpus
As I have mentioned at the beginning of this chapter, I have limited my discussion
of the historical developments in the tone system of Japanese to the tone patterns
that can be observed in nouns. From this corpus I have furthermore excluded all
nouns that contain heavy syllables. In order to explain why only nouns with short
open syllables have been selected, a discussion of the role of the syllable and the
mora in the contemporary Japanese dialects and in the history of the Japanese
language is indispensable.
0.6.1 Syllable and mora
Japanese linguistic and poetic tradition divides the sounds of the language into
smaller temporal units than the syllable.20 Words like gengo ‘language’ and kooko
19 One of the principles of historical tonology observed by Hyman (1978:265) is the principle of
pause as /L/ tone: A pause boundary can at any time cause a lowering of an adjacent [H] or
other non-low tone, which explains why the phrase initial syllable is exempt from the
anticipatory raising. In Nagoya (in words of more than two syllables) the first two syllables of
the phrase are exempt from the anticipatory raising, so that /ØØØHØ/ is realized as .
In Akita on the other hand, anticipation of /H/ tone does not operate at all, and so /ØØØHØ/ is
realized as . As in Akita all Ø tones are realized with [L] pitch, it is not necessary to
look for the drop to [L] pitch to locate the /H/ tone.
20 It is often said that most Japanese speakers intuitively consider that each mora constitutes a
distinct temporal unit. This intuition however, is strongly suggested by the Japanese writing
system, as in the writing system one mora is rendered by one grapheme (except /CyV/, which
uses two, but the second grapheme is minimized). If one compares this to Dutch where a single
syllable is sometimes written by means of 7 graphemes (cf. schacht ‘shaft’) and other times by
means of only one, the mora analysis is almost thrust upon anyone who has grown up with the Japanese
writing system. Moreover, nearly all Japanese are literate and familiar with the traditional
‘public fund’ for instance, which have two syllables, gen-go and koo-ko, are divided
into three units each, ge-n-go and ko-o-ko. These temporal units are called moras.
The second mora of a heavy syllable is called a dependent, non-syllabic or
subsyllabic mora.
In most Japanese dialects dependent moras can serve as independent timing units,
and in many dialects they can function as independent tone-bearing units as well,
even though they are not syllables. In this respect, the mora can be regarded as a
phonemic syllable, as opposed to a phonetic syllable.
The concept of syllable, in which words like gengo and kooko are divided into
two units, was introduced in Japan together with Western linguistic theory, but this
does not mean that the syllable is not a phonological reality in Japanese. There are
for instance many dialects (including the standard dialect of Tōkyō) where the
distinction between syllable and mora plays a role in the tone rules. In such dialects,
the following four types of moras cannot bear /H/ tone: (1) a syllable-final moraic
nasal /N/, which is realized as [ɴ] or as nasalization on the preceding vowel before a
vocalic syllable, but as [n], [m] or [ŋ] depending on the articulation point of a
following consonant; (2) a syllable-final moraic obstruent /Q/, which consists of the
first half of the geminate consonants [pp], [tt], [kk], [ss]; (3) the second half of long
vowels; and (4) the second half of vowel sequences ending in -i.
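The difference between counting moras and counting syllables can be made concrete with a short sketch. The Python code below is an illustration only. It assumes a simplified romanization in which long vowels are written double (kooko) and geminates are written as doubled consonants (matti), and it treats every n before a consonant or at the end of the word as the moraic nasal. The function names `split_moras` and `group_syllables` are hypothetical. Forms in which a moraic nasal precedes a vowel or y are not handled.

```python
VOWELS = set("aeiou")


def split_moras(word):
    """Split a romanized word into moras (simplified spelling assumed)."""
    moras, i, n = [], 0, len(word)
    while i < n:
        ch = word[i]
        nxt = word[i + 1] if i + 1 < n else ""
        if ch in VOWELS:
            moras.append(ch)                 # bare vowel mora
            i += 1
        elif ch == "n" and nxt not in VOWELS and nxt != "y":
            moras.append(ch)                 # moraic nasal /N/
            i += 1
        elif nxt == ch:
            moras.append(ch)                 # first half of a geminate, /Q/
            i += 1
        else:
            j = i
            while j < n and word[j] not in VOWELS:
                j += 1
            moras.append(word[i:j + 1])      # onset consonant(s) plus vowel
            i = j + 1
    return moras


def group_syllables(moras):
    """Attach dependent moras to the preceding mora to form syllables."""
    syllables, prev = [], None
    for m in moras:
        dependent = prev is not None and (
            m[-1] not in VOWELS              # /N/ or /Q/
            # second half of a long vowel, or of a sequence ending in -i
            or (len(m) == 1 and prev[-1] in VOWELS and m in (prev[-1], "i"))
        )
        if dependent:
            syllables[-1] += m
        else:
            syllables.append(m)
        prev = m
    return syllables
```

With these definitions gengo and kooko each come out as three moras (ge-n-go, ko-o-ko) but two syllables (gen-go, koo-ko), while a sequence ending in an open vowel, as in ie, remains two syllables.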
Finally, there are also dialects in which dependent moras do not even function as
independent timing units. The origin of the differences among the contemporary
dialects as to the capacity of non-syllabic moras to function as independent timing
and tone-bearing units lies in the fact that the dependent moras developed from
independent syllables historically, and that in some dialects the former syllables
have preserved more of their inherited syllable-like properties than in other dialects.
In the following sections I will give an overview of the historical development of the
syllable and the mora in Japanese.
0.6.1.1 Old Japanese
Old Japanese (700–800) is thought to have contained only short open syllables /CV/
and /V/. As there was no distinction between syllable and mora, the concept of mora
is not needed in the analysis. /V/ was generally restricted to word initial position, so
that there were almost no word internal vowel sequences.21 When vowel sequences
arose in compounds, or even in phrases, they were sometimes eliminated by
contraction (cf. naga ‘long’ + amey ‘rain’ > nagamey ‘long spell of rainy weather’)
or vowel deletion.
mora-counting poetry. Even children for instance, study Hyaku-nin isshu, a set of 100 classical
Japanese poems patterned in 5-7-5-7-7 moras, which form the basis of a card game.
21 Some of the rare exceptions can be explained by the fact that *yi and *wu were no longer
allowed in Old Japanese: uu < *uwu ‘to plant’, oi <*oyi (continuative (ren’yō-kei) verb form of
oyu) ‘to grow old’.
Deletion of the second vowel occurred for instance in panare ‘isolation’ + iswo
‘beach’ > panareswo ‘isolated beach’, wa ‘I’ + ga (genitive particle) + (y)ipye
‘house’ > wagapye ‘my house’. Examples of deletion of the first vowel are: ara
‘wilderness’ + iswo ‘beach’ > ariswo ‘a rocky shore’, wa ‘I’ + ga (genitive particle)
+ (y)ipye ‘house’ > wagyipye ‘my house’, wa ‘I’ + ga (genitive particle) + (y)imwo
‘beloved girl’ > wagyimwo ‘my beloved’.
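The two deletion strategies can be stated as simple string operations. The sketch below is an illustration under my own assumptions: the function names are hypothetical, the forms are taken in the romanization used above, and only a single plain vowel letter on either side of the juncture is handled (so the forms with (y)i are left out).

```python
VOWELS = set("aeiou")


def elide_second(first, second):
    """Join two elements, deleting the initial vowel of the second element."""
    assert first[-1] in VOWELS and second[0] in VOWELS
    return first + second[1:]


def elide_first(first, second):
    """Join two elements, deleting the final vowel of the first element."""
    assert first[-1] in VOWELS and second[0] in VOWELS
    return first[:-1] + second
```

Applied to the examples above, `elide_second("panare", "iswo")` yields panareswo and `elide_first("ara", "iswo")` yields ariswo. When both vowels are identical, as in naga + amey, the two strategies give the same result, so such forms cannot show which strategy was applied.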
It is not clear why it is sometimes the first vowel and sometimes the second
vowel which is deleted. (Both options were for instance possible in case of ‘my
house’.) This problem has been addressed by Unger (1993), Russel (2003) and
Wenck (1959) among others. Unger and Russel attempt to set up rules that can
accurately predict which solution will be chosen in a particular environment. Wenck
on the other hand, thinks that the difference between the two options was dialectal,
and that elision of the second vowel represented an older pattern, which was being
replaced by a newer pattern in which there was elision of the first vowel.22
In other cases the two consecutive vowels were replaced by a single vowel of
intermediate quality: sakyi ‘blooming’ + ari ‘are’ > sakyeri ‘are blooming’, naga
‘long’ + ikyi ‘breath’ > nageykyi ‘lament’. As this option is the rarest, Wenck (1959)
suggests that the examples may be remnants of an even older pattern that was no
longer productive in Old Japanese.
There is reason to assume that – at least until the Old Japanese period – the
vowels of syllables with contour tones were automatically lengthened, while
syllables with level tones had short vowels. As will be shown in chapter 11 of part II,
such an assumption serves to explain the difference between the tones of identical
Chinese characters in the Go-on and Kan-on reading traditions. It also agrees well
with the observation that in order to accommodate a tone contour the syllabic
support is often lengthened (Hyman, 1978:262). Therefore, although the contour
tones were the result of contractions, the vowels were not long because they derived
from contractions, but because they had to accommodate tone contours.
(Contractions that did not result in a contour tone would not have had vowel length.)
22 According to Wenck, elision of the second vowel is more common in the Azuma uta and
Sakimori uta in the Man’yō-shū, which express the dialect of east Japan. There are exceptions,
such as panariswo ‘isolated beach’ in the Sakimori uta, but this should come as no surprise, as
Hagers (2000) shows that in the Azuma uta and the Sakimori uta other features of the eastern
dialect (such as the distinction between the forms of the attributive (rentai-kei) and the finite
(shūshi-kei) in verbs with consonant stems) have actually only been attested in a small minority
of cases. (Wenck also examined the geographical distribution of place names that show either
elision of the first or the second vowel. Two common place names, Kawachi and Kōchi are
good examples of the two different strategies: ‘between rivers’ kawa-uti > kawati as opposed
to kawa-uti > kawuti, which developed later into kauti > kooti. His conclusion is that the
geographical distribution is best explained by assuming that an older pattern in which the
second vowel was deleted was pushed out by a later central Japanese pattern in which the first
vowel was deleted, but that some of the older forms survived in central Japanese (Wenck,
1959:70–71).)
0.6.1.2 Early Middle Japanese
In Early Middle Japanese (800–1200) lengthened vowels have been attested
occasionally, for instance in Ruiju myōgi-shō, but only in case of monosyllables.
(See section 1.3.1.)
Automatic lengthening of monosyllables occurs in many languages, and can be
found in central Japan today. Because of the attestations in works like Ruiju myōgi-
shō, it is likely that automatic lengthening of monosyllables had already occurred in
central Japan by the Early Middle Japanese period. As the vowel lengthening was
predictable however, there was no phonological contrast between long and short
vowels.
The occurrence of ji-amari in the poetry of the Old and Early Middle Japanese periods is
most likely related to the fact that vowel length was not distinctive: Ji-amari is a
phenomenon common in the syllable-counting poetry of this period, where in poetry
lines that included a phrase-internal vowel sequence, there was the option to use one
more syllable than would normally have been allowed. In other words, ji-amari was
allowed when it was possible to eliminate one of the syllables in the line by reading
a vowel sequence occurring on two consecutive syllables as one lengthened syllable.
Even in such cases, when some measure of vowel length must definitely have
been there, the poetry apparently only recognized syllables as such. The fact that this
technique was allowed agrees with the idea that the language was not sensitive to the
distinction between long and short vowels.23
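The licensing condition for ji-amari described above can be sketched as a check on a line of syllables. The code is an illustration only. The function names are hypothetical, a line is represented as a list of romanized (C)V syllables, and a bare-vowel syllable in non-initial position is taken as the sign of a vowel sequence, which is a simplification of the phrase-internal condition.

```python
VOWELS = set("aeiou")


def has_internal_vowel_sequence(syllables):
    """True if a bare-vowel syllable follows another (open) syllable."""
    return any(s[0] in VOWELS for s in syllables[1:])


def line_is_metrical(syllables, target):
    """Accept the regular count, or one extra syllable if a vowel sequence is present."""
    n = len(syllables)
    if n == target:
        return True
    return n == target + 1 and has_internal_vowel_sequence(syllables)
```

For a five-syllable line, a made-up six-syllable string such as ka-ze-no-o-to-zo (not a quotation from any poem) would be accepted, because the sequence no-o can be read as one lengthened syllable. A six-syllable line without any vowel sequence would be rejected.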
The old (C)V pattern was altered by a large influx of Chinese loanwords, and a
heterogeneous group of phonological changes known collectively as onbin or ‘ease
of pronunciation’. These changes comprise elision of intervocalic consonants
resulting in word internal vowel sequences, and elision of vowels (usually i or u)
resulting in closed syllables. The closed syllables ended in either of two mora
consonants /N/ or /Q/.
Examples of onbin changes that resulted in mora consonants and vowel
sequences in native Japanese words are for instance the gerund verb forms yomite >
yonde ‘read’, matite > matte ‘wait’, kakite > kaite ‘write’, kikite > kiite ‘hear’ and
kanasiki > kanasii ‘sad’.
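The gerund examples just given can be summarized as a mapping from the older ending to the onbin form. The table below is a sketch that covers only the three stem types attested in these examples. The names `ONBIN_GERUND` and `onbin_gerund` are hypothetical, and forms with other stem-final consonants are returned unchanged rather than guessed at.

```python
# Older ending -> onbin ending, restricted to the examples cited above.
ONBIN_GERUND = {
    "mite": "nde",   # yomite > yonde  (mora nasal /N/)
    "tite": "tte",   # matite > matte  (mora obstruent /Q/)
    "kite": "ite",   # kakite > kaite, kikite > kiite  (vowel sequence)
}


def onbin_gerund(form):
    """Return the onbin form of a gerund, if its ending is in the table."""
    for old, new in ONBIN_GERUND.items():
        if form.endswith(old):
            return form[: -len(old)] + new
    return form      # ending not covered by this sketch
```

The three table entries correspond to the three new structures mentioned above: a mora nasal, a mora obstruent and a word-internal vowel sequence.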
The main source of internal vowel sequences and closed syllables, however, was
formed by Chinese loanwords: Chinese loanwords ending in -p, -t, -k were supplied
with a close vowel, but -p, -t, -k developed into the mora obstruent /Q/ if followed by
a voiceless obstruent, as happened in many character compounds. Apparently,
-m and -n, were likewise initially supplied with a close vowel, but when syllable-
23 When distinctive vowel length developed in Late Middle Japanese, the phonemic background
for the allowance of ji-amari was lost. As a result, ji-amari gradually developed into a poetic
device that was no longer restricted to lines that included vowel sequences, but could now be
used in other environments as well. According to Motoori Norinaga 本居宣長 the
abandonment of the old rule started in the Shin-kokin-shū 新古今集 poetry collection of 1205,
but even there, deviations from the old rule only occur in 26 out of the 351 poems (Wenck,
1959:75).
final /N/ became available, that mora came to be used: -mi/-mu, -ni/-nu > /N/.24
Chinese diphthongs were adopted as /V1V2/ sequences (iu, ou, au, eu, ai, ui, ei).25
The onbin changes emerge in the written records around the start of the 9th
century, and it is thought that by the end of the Early Middle Japanese period
identical vowel sequences, non-identical vowel sequences and mora consonants had
become part of the colloquial language.
0.6.1.3 Late Middle Japanese
Most Japanese dialects descend from a stage in the language that is definitely no
later than Early Middle Japanese. Changes attested in the written record in the Late
Middle Japanese period can therefore only be related to the dialects of central Japan
(Kinki and most likely parts of southern Chūbu).
When certain syllable types had changed into non-syllables, there were two
options: One option was to preserve the underlying system (only phonetic syllables
can function as timing and tone-bearing units) and to alter the rhythm and tone
pattern of the language.
We find no indication of such radical changes in documents from the Late
Middle Japanese period, but lack of attestations as such is not conclusive, as changes
in the spoken language do not always make it into the written record. The former
syllables continued to count as independent timing units in poetry, but as traditional
poetic meter dates back to the time of Old and Early Middle Japanese, we cannot be
sure as to the situation in the spoken language of central Japan. Moreover, onbin
forms were more common in speech than in the written language, and onbin forms
with tone marks are especially rare.
As far as I am aware, the only clear attestation of the fact that the
mora/syllable distinction started to play a role in the tone system can be found in the
musical notation marks added to geminate voiceless stops in gerund verb forms in
Shiza kōshiki 四座講式 (13th century), which dates back to the Late Middle
Japanese period.26 The avoidance of [H] pitch on a dependent mora that can be seen
there however, can hardly count as an indication that dependent moras had lost the
capacity to function as tone-bearing units in general, as it involves the mora
obstruent /Q/ before a voiceless stop. A less hospitable tone-bearing unit is hardly
imaginable, as the acoustic realization of /Q/ in this environment is silence.
24 Only a few of these early loans have preserved the attached vowel. Examples are zeni ‘money’
and sami ‘three’ in samisen ‘three-stringed lute’.
25 Sequences ending in u developed long vowels due to subsequent assimilations: iu > yuu, ou >
oo, au > oo, eu > yoo. As these developments are only rarely reflected in the spelling they are
hard to date, but according to Wenck (1959:148–166) the different assimilations most likely
occurred at different times during the Late Middle Japanese period. Other diphthongs, such as
ai, ui and ei did not coalesce into long vowels. Chinese -ŋ and -p were a source of long vowels
as well: -ŋ was generally replaced by a high vowel, which was originally nasalized but later
merged with i and u, and -p was adopted as -pu, which developed > ɸu > wu > u.
26 See section 14.3.1.1 of part II.
The other way of dealing with the fact that certain syllable types had changed
into non-syllables, would have been to preserve the old rhythm and tone pattern and
change the underlying system, so that non-syllables could now function as timing
and tone-bearing units. This option would have been least intrusive, as it meant
treating dependent moras as phonemic syllables.
Judging from the way in which the modern dialects in the areas of Honshū that
most likely form the geographical basis of the attested forms of Late Middle
Japanese treat the voiceless geminate consonants, the mora nasal and the second half
of vowel sequences, we can conclude that this was the option chosen in Late Middle
Japanese.
0.6.1.4 The modern dialects
The development from Late Middle Japanese to the modern dialects in central
Honshū was most likely as follows: The mora nasal and the second half of vowel
sequences continued to function as independent timing and tone-bearing units. The
voiceless geminate consonants on the other hand, were no doubt unable to function
as independent tone-bearing units from the start (as the evidence from Shiza
kōshiki also shows), but they continued to function as timing units.
In the dialect of Kyōto and in many of the more central Tōkyō type dialects for
instance, the mora nasal and the second half of all types of vowel sequences are still
capable of bearing /H/ tone, which means that in these dialects they still function as
independent phonological syllables.27 Kindaichi (1958) includes a map showing the
influence of segmental features on tone placement in the Japanese dialects.
According to this map, the mora nasal and the second half of vowel sequences can
bear /H/ tone in the Kyōto type dialects, and in the following Tōkyō type dialects: In
the middle of the Kyōto type dialects in the dialect of Totsukawa (Nairin type). To
the west of the Kyōto type dialects in Yamaguchi, Okayama and Hyōgo prefectures
(Chūrin and Nairin type) and in the southwest of Shikoku (Chūrin type). To the east
of the Kyōto type dialects on the Izu peninsula in Shizuoka prefecture, such as in
Shimoda (Chūrin type) and in parts of Nagano and Gunma prefectures. Kindaichi
(1943) also mentions Toyohashi in Aichi prefecture (Gairin type).
In other dialects, such as in the modern standard language of Tōkyō, the
voiceless geminate consonants, the mora nasal, the second half of long vowels, and
27 In Kyōto the occurrence of /H/ tone on the second mora of a heavy syllable is in fact quite
restricted. It is possible, but only if the word starts with /L/ tone as in 'on'gaku ‘music’. As
McCawley (1978a:131) has pointed out, in Kyōto type Japanese, dictionary entries need never
distinguish between heavy syllables with first-mora /H/ tone and second-mora /H/ tone.
However, such a contrast can arise through the action of morphophonemic rules. In Kyōto, just
as in standard Japanese, certain final elements of compound nouns put a /H/ tone immediately
before them. In Kyōto, the /H/ tone in such cases goes on the immediately preceding mora,
even if that happens to be the second mora of a heavy syllable, thereby creating second moras
that carry /H/ tone even in compounds that do not start with /L/ tone: si'nkee /HØØØ/ + syoo
/ØØ/ → sinkee'syoo /ØØØHØØ/ ‘neurosis’. (See also the many examples in section 5.5.)
the second half of closing diphthongs are syllable-like in their role as timing units,
but they can no longer function in the same way as the syllable as far as tone is
concerned. Only the first mora of a heavy syllable can carry the /H/ tone. In gengo
and kooko, for instance, the only contrasts observed are between ge'ngo
‘language’ and gengo ‘original language’, and between ko'oko
‘public fund’ and kooko ‘pickled radish’ (Shibatani, 1990:178).
There is no contrast between heavy syllables with /H/ tone on the first mora,
versus heavy syllables with /H/ tone on the second mora. Each syllable affords only
one possible place for the pitch to fall: at the end of its first mora. McCawley
(1968:134) therefore described Tōkyō Japanese as a ‘mora-counting syllable
language’ as the syllable is the /H/ tone-bearing unit, but counting the moras is
necessary to determine where in the syllable (after the first mora) the drop to [L]
pitch occurs.28
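McCawley's characterization can be illustrated with a short sketch that lists the places where the pitch can fall in a word. The code is only an illustration. The function name `fall_sites` is hypothetical, and a word is assumed to be given as a list of syllables, each syllable being a list of its moras.

```python
def fall_sites(syllables):
    """Return, for each syllable, the number of moras that precede a possible fall.

    Each syllable offers exactly one site, at the end of its first mora, so the
    number of sites equals the number of syllables, while the position of each
    site has to be expressed by counting moras.
    """
    sites = []
    moras_before = 0
    for syllable in syllables:
        sites.append(moras_before + 1)   # fall after the first mora of this syllable
        moras_before += len(syllable)
    return sites
```

For koo-ko, entered as `[["ko", "o"], ["ko"]]`, the function returns `[1, 3]`: a fall is possible after the first mora (as in ko'oko) or after the third, but not after the second mora, which is the dependent half of the long vowel.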
Non-identical vowel sequences ending in -i have monosyllabified to closing
diphthongs. This is evident from cases like ka'i (< kahi 2.3) ‘shell’ and ku'i (< kuhi
2.2) ‘stake’, where the /H/ tone, which historically fell on the second vowel, has
shifted to the first vowel. Non-identical vowel sequences ending in open vowels on
the other hand, do not monosyllabify. In ie' (< ihe 2.3) ‘house’, sao' (< sawo 2.3)
‘pole’, sio' (< siho 2.3) ‘salt’ and yue' (< yuwe 2.2) ‘reason’, the /H/ tone has
remained on the second vowel. In vowel sequences of this kind there still is a
contrast between words that have the /H/ tone on the first vowel and words that have
the /H/ tone on the second vowel. Such vowel sequences can therefore not be treated
as a single syllable. When words like a'o (< awo 2.5) ‘blue’, ma'e (< mahe 2.5)
‘front’, tu'e (< tuwe 2.4) ‘staff’ and ko'e (< kowe 2.5) ‘voice’ have the /H/ tone on the first
syllable, this is the regular reflex in these tone classes, and not the result of a shift of
the /H/ tone to the initial mora due to monosyllabification.
While in some dialects, subsyllabic moras have lost the capacity to function as
/H/ tone-bearing units, in other dialects the changes have been much more extensive.
These dialects drastically shorten the dependent moras, so that the heavy (two-mora)
syllables sound much like the light (one-mora) syllables. Such dialects are found in the
Gairin type dialects of northeast Honshū, Niigata and Shimane, and in southeast
Kyūshū, Tokunoshima and Yonaguni (Martin, 1987:4).
In such dialects, syllables are the minimal temporal units, and forms like matti
‘match’ and honya ‘bookstore’ are not counted as having three moras, but divided
into only two temporal units as mat-ti and hon-ya. Shibata Takeshi (1962) has
termed these dialects ‘syllabeme’ dialects, as opposed to those Japanese dialects in
which the mora functions as the minimal temporal unit.
28 A /H/ tone on the second mora of a long vowel is not completely impossible in Tōkyō. An
example is sii'ru ‘to compel’, which forms a minimal pair with si'iru ‘a seal’. Such cases are
extremely rare and only occur when the /H/ tone is assigned to the second mora due to
morphophonemic rules. (In the above example, the attributive/finite ending puts a /H/ tone on
the preceding mora.) The best approach is therefore to maintain McCawley’s rule, and analyze
sii'ru as containing a sequence of two identical vowels on two separate syllables.
We can summarize the developments as follows: The reason why in many
dialects of Japanese subsyllabic moras can function as syllables in most respects, is
because they inherited the role of timing and tone-bearing unit from the syllable,
from which they derived historically.
The present-day system in the standard language in which the mora counts as an
independent timing unit, but not as an independent tone-bearing unit, and where
open vowels have a greater capacity to function as independent tone-bearing units
than close vowels, is the result of a development in which the tone-bearing capacity of
non-syllabic moras has eroded over time.
The present-day situation in the syllabeme dialects in Kyūshū and Tōhoku is the
result of a process in which the subsyllabic moras not only lost their former
capacities as tone-bearing units, but also as timing units. The first step would have
been that they still formed a distinct timing unit but that they no longer formed a
distinct tone-bearing unit, while the next step would have been that they lost the
capacity to function as timing units as well.
The distinction between syllable and mora plays hardly any role in dialects in
which the mora has remained relatively syllable-like, preserving its capacity to
function as both timing and tone-bearing unit.29 This is also true for the dialects at
the other end of the scale, where the mora has lost practically all of its syllable-like
properties (the syllabeme dialects). The distinction between syllable and mora plays
the largest role in the phonology of dialects where non-syllabic moras have
preserved the capacity to function as independent timing units, but lost the capacity
to bear /H/ tone.
0.6.1.5 The exclusion of heavy syllables
As the modern dialects differ among themselves as to which segments can bear /H/
tone and which segments can no longer bear /H/ tone, the cross-dialect tonal
correspondences of words that contain heavy syllables are complex. Kindaichi
(1943:24) illustrated this by means of a comparison of the reflexes in Tōkyō,
Okayama and Kyōto.
2 Cross-dialect correspondences of nouns that contain heavy syllables
                               Tōkyō   Okayama   Kyōto
ai (< awi) ‘indigo plant’      '       '         '
koi (< kohi) ‘carp’            '       '         '
kai (< kahi) ‘shell’           '       '         '
kui (< kuhi) ‘stake, post’     '       '         '
The reflexes in Okayama still show a regular correspondence with the reflexes in
Kyōto, but the reflexes in Tōkyō no longer do, as they have been influenced by the
29 The only segment that is not capable of bearing tone in such systems is the mora obstruent /Q/.
change in syllable structure. In order to avoid such extra complications, I have
limited the discussion of the historical development of the Japanese tones to the
developments in light (one-mora) syllables. The correspondences between Middle
Japanese and the different modern dialects, as well as the correspondences between
the modern dialects among themselves in this work are therefore based on examples
that contain short open syllables only.30 Such examples constitute the most suitable
corpus for historical comparative purposes due to their structural stability. To avoid
confusion with non-syllabic moras, I will therefore speak of ‘syllables’ in my
discussion of the Japanese tone systems in (almost) all cases.
Finally, in Tōkyō Japanese, as in the dialects of Kyūshū, the Shimane peninsula
(Matsue and Izumo), the Noto peninsula (Ishikawa and Toyama prefectures) and in
most dialects of northeast Japan (excluding Akita and Aomori) there is a tendency
for close vowels between voiceless obstruents, or between a voiceless obstruent and
a pause, to be devoiced. As Vance (1978:48–55) points out however, the situation is
complex, as open vowels also sometimes devoice, but such devoicing appears to
vary from individual to individual.
Such devoiced vowels often shift the /H/ tone away, which can complicate
cross-dialect correspondences, but here again the situation is far from unequivocal.
Devoicing of vowels between a voiceless obstruent and a pause for instance, appears
to be influenced by the location of the /H/ tone, rather than the other way around:
According to Martin (1952:14) devoicing does not occur when such vowels are
accented (i.e. bear the /H/ tone). Even devoiced close vowels between voiceless
obstruents do not always shift the /H/ tone away in the phonological sense: In /si'ku/
‘4 times 9’ as well as in /siku/ ‘spread’, the first vowel is devoiced [i̥], but in case of
/si'ku/ the second syllable has a falling tone contour while in case of /siku/ the
second syllable has a rising tone contour. These contours now function to
distinguish the underlying /HØ/ and /ØH/ tone patterns (Han, 1962b:81–82).
The question of whether – and if so – under what conditions, vowel devoicing
influences tone placement remains unclear, and as exclusion of all examples that
include vowels which could be subject to devoicing in certain dialects would
drastically limit the number of possible comparisons, such examples have not been
excluded.
30 There are a few exceptions: In chapter 5, the list of compound nouns includes examples in
which the first constituent contains vowel sequences or mora consonants. This does not
complicate the correspondences however, as the accent of compound nouns in Kyōto and in the
Tōkyō type dialects of central Honshū is determined by the second member of the compound.
Monosyllabic nouns have not been excluded, even though they have vowel length in areas like
central Honshū, Shikoku and Okinawa, as well as (some of) the attested forms of Middle
Japanese, as this vowel length is not distinctive. In a number of dialects in the Ryūkyūs,
automatic vowel lengthening also occurs in disyllabic nouns in members of certain tone classes,
but here again (with the exception of a few dialects which will be discussed in chapter 9) the
vowel length is subphonemic.
1 The two sets of comparative data
In this chapter I will present the two sets of comparative data: The tone systems of
the modern Japanese dialects and the tonal distinctions of Middle Japanese as
evidenced by the distribution of the tone dots over the Japanese lexicon in texts from
the 11th to 13th century.
1.1 The modern Japanese tone systems
In the following sections I will give a description of the three main types among the
modern Japanese tone systems (the Tōkyō, Kyōto and Kagoshima types), and their
geographical distribution. Map 1 presents a (simplified) representation of the
geographical distribution of these types. For the most detailed map of the
geographical distribution of the different tone systems (and their many subtypes) in
Japan, see Language atlas of the Pacific area (Wurm & Hattori, 1981).
1.1.1 The Tōkyō type tone systems
The Tōkyō type tone systems are characterized by the presence of at
most one /H/ tone per word. As the location of this /H/ tone in the word is distinctive,
the Tōkyō type tone systems can be analyzed as syllable-tone systems. Nouns are
divided into different tone classes on the basis of whether they contain a /H/ tone
and – if they do – on which syllable of the word this /H/ tone is located. In most
Tōkyō type dialects the location of the /H/ tone is on the last syllable before a drop
to [L] pitch. (According to the pitch-accent analysis of the dialect of Tōkyō, the last
[H] pitched syllable is the syllable that carries the accent.) All other syllables in the
word can be analyzed as having Ø tone, as the pitch of these syllables is determined
by automatic pitch assignment rules.
In many Tōkyō type dialects the /H/ tone can only be identified by looking at the
location of a transition from [H] to [L], because in these dialects the /H/ tone is
anticipated. Syllables with Ø tone that precede the /H/ tone will have [H] pitch in
anticipation of the /H/ tone that will follow later on in the word. (Not all Japanese
dialects have /H/ tone anticipation. In the dialect of Akita for instance, only the
syllable that carries the /H/ tone itself has [H] pitch.)
In many Tōkyō type dialects, the phrase-initial syllable is exempt from the /H/
tone anticipation, so that the phrase-initial syllable has automatic [L] pitch if it does
not carry the /H/ tone. After this %L phrase boundary tone, the pitch is [H] until the
pitch fall that follows immediately after the /H/ tone. The surface realization in
Tōkyō of a word with /ØØHØ/ tone is thus [LHHL]. As outlined in section 0.5.1,
the surface realization and the phonological shape can be captured in one
representation, when the location of the /H/ tone (or the location of the accent in the
pitch-accent analysis) is indicated by means of an apostrophe after the last [H]
pitched syllable: [LHH'L].
The location of the rise from [L] to [H] can differ from dialect to dialect (in
Nagoya for instance, in words of more than two syllables, the anticipation of the /H/
tone only starts after the second syllable of the phrase) but in each dialect the
location of the rise in pitch is automatic, and therefore not distinctive.
Many Tōkyō type dialects also have a rise in pitch after the first syllable in
words that contain only Ø tones (i.e. words that are ‘unaccented’ in the pitch-accent
analysis), but there are also Tōkyō type dialects in which the pitch of such words is
level, such as in the dialect of Akita in northeast Honshū, and in the dialects of
several villages (Oritachi, Hiratani, Shigesato) in the Totsukawa area in Nara
prefecture (Ikuta Sanae, 1951, Yamana Kunio, 1951). In Aomori only the very last
syllable of a phrase with all Ø tone will have [H] pitch.
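The automatic pitch assignment described above for the dialect of Tōkyō can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text (anticipation of the /H/ tone, the %L phrase boundary tone, and the rise after the first syllable in words with only Ø tones). The function name tokyo_surface and the encoding of a word as a string of 'H' and 'Ø' symbols are assumptions made for the example. Dialects such as Akita, Nagoya and Aomori follow different rules and are not covered.

```python
def tokyo_surface(tones):
    """Map a phrase-initial word's underlying tones to surface pitches.

    `tones` is a string over {'H', 'Ø'} with at most one 'H'
    (an illustrative encoding, not the author's notation).
    Returns a string over {'H', 'L'}.
    """
    if any(t not in ("H", "Ø") for t in tones) or tones.count("H") > 1:
        raise ValueError("expected 'H'/'Ø' symbols with at most one 'H'")

    n = len(tones)
    if "H" in tones:
        h = tones.index("H")
        # Ø syllables before the /H/ tone anticipate it ([H] pitch);
        # everything after the /H/ tone drops to [L].
        pitches = ["H"] * (h + 1) + ["L"] * (n - h - 1)
    else:
        # Words with only Ø tones: [H] pitch after the first syllable.
        pitches = ["H"] * n

    # %L phrase boundary tone: the phrase-initial syllable is [L]
    # unless it carries the /H/ tone itself.
    if n > 0 and tones[0] != "H":
        pitches[0] = "L"
    return "".join(pitches)


print(tokyo_surface("ØØHØ"))  # LHHL
print(tokyo_surface("HØØ"))   # HLL
print(tokyo_surface("ØØØ"))   # LHH
```

The output for words with only Ø tones follows the two-step representation, in which the rise after the first syllable is written as a full step from [L] to [H].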
In the standard language, and in many other Tōkyō type dialects, the automatic
rise in pitch after the first syllable in words that contain only Ø tones is less
pronounced in careful speech. (This difference has been confirmed by experimental
data. See Pierrehumbert & Beckman, 1988:6.)1 Because of this, the Tōkyō type tone
system used to be analyzed as having three tone levels, /H/, /M/ and /L/, the so-
called three-step analysis. The pitches of disyllabic words (in phrase initial position
and phrase internal position) with the attached nominative case particle ga in the
three-step analysis look as follows:2
If /H/ tone falls on the first syllable: ha'si-ga '- ‘chopsticks (subj.)’, kono
ha'si-ga '- ‘these chopsticks (subj.)’.
If /H/ tone falls on the second syllable: hasi'-ga '- ‘bridge (subj.)’, kono
hasi'-ga '- ‘this bridge (subj.)’.
In case of all Ø tone: hasi-ga - ‘edge (subj.)’, kono hasi-ga -
‘this edge (subj.)’.
Later, a two-step analysis developed in which /HØ/, /ØH/ and /ØØ/ were
represented as ', ' and respectively. As the difference between [M] and
[H] tone can only be heard in slow, careful pronunciation, the Tōkyō type tone
system is nowadays usually analyzed as having only two tone levels, so that the
disyllabic words shown above are represented as ', ' and (the ‘new’
two-step analysis). This is the representation of the Tōkyō type tone system which I
will follow from here on. This means that the representation of the actual pitches of
words with all Ø tone is not purely phonetic, but has been influenced by the
phonological analysis.
1 It is interesting to note that in the preface to Yamada Bimyosai’s dictionary of 1892, which
contains the oldest analysis of the Tōkyō type tone system, the pitch of words with all Ø tone is
described as [L], and no rise in pitch after the first syllable is mentioned.
2 What are usually referred to as ‘case particles’ in Japanese linguistics are enclitic case markers
that are phonologically part of the previous word.
It has to be remembered, however, that the difference between [H] and [L] pitch
in modern Japanese is not large, and can only be perceived in relation to the other
pitches within a word or phrase. This means, for instance, that in the case of
monosyllabic nouns in isolation, the phonological distinction in tone between the
different tone classes cannot be heard.3
The Tōkyō type tone system is divided into three major subtypes, depending on
what kind of mergers have occurred between the tone classes that have to be
reconstructed for proto-Japanese on the basis of a comparison of all the modern
dialects. I will adopt the terms Nairin (inner circle) type, Chūrin (middle circle) type
and Gairin (outer circle) type from Kindaichi (1977) and Uwano (1981), who see the
distribution of the three types as forming three concentric circles around the Kyōto
type tone system in the middle.4 The tone system of the Japanese standard language,
which is based on the dialect of Tōkyō, belongs to the Chūrin type.
Table (1) gives the merger patterns on which the classification of the Tōkyō
type dialects as Nairin, Chūrin or Gairin is based.
(1) The Nairin, Chūrin and Gairin merger patterns

          Monosyllabic nouns      Disyllabic nouns
          /Ø/        /H/          /ØØ/       /ØH/       /HØ/
Nairin    1.1        1.2, 1.3     2.1        2.2, 2.3   2.4, 2.5
Chūrin    1.1, 1.2   1.3          2.1        2.2, 2.3   2.4, 2.5
Gairin    1.1, 1.2   1.3          2.1, 2.2   2.3        2.4, 2.5
As I have mentioned in the introduction, in the assignation of numbers to the
different tone classes of nouns, the number 1 after the dot has been reserved for
those tone classes that have Ø tone in the Kyōto type dialects and in a subset of the
Tōkyō type dialects: Only the Nairin Tōkyō type dialects agree with Kyōto in this
respect. In the other Tōkyō type dialects the number of tone classes that have Ø tone
is larger. To put it in simple terms: as we move from the ‘inner’ to the ‘outer’ circles,
more and more tone classes have Ø tone.
In the Nairin type dialects, a merger has occurred between classes 1.2 and 1.3
and between classes 2.2 and 2.3. (All three Tōkyō subtypes have in common that
there is no distinction between tone classes 2.4 and 2.5). In the Chūrin type tone
3 Matsumori (1993) for instance gives the pitches of the monosyllabic nouns in her own (Chūrin)
Tōkyō type tone system as:1.1 , -, 1.2 , -, 1.3 ', '-
4 I will use these terms throughout as referring to the three main subtypes of the Tōkyō type tone
system. In an earlier publication (1964), Kindaichi used these terms in a different sense. In this
article Kindaichi proposed his ‘circle theory’ (which will be discussed in more detail in section
2.4.4) and in this article the term ‘Nairin’ refers to dialects with a Kyōto type tone system, the
term ‘Chūrin’ refers to dialects with a Nairin or a Chūrin Tōkyō type tone system, and the term
‘Gairin’ refers to dialects with a Gairin Tōkyō type tone system or a Kagoshima type tone
system.
system, tone class 1.2 has not merged with class 1.3, but with class 1.1 (Ø tone).
This has also happened in the Gairin type tone system, but here not only tone class
1.2, but also tone class 2.2 has Ø tone, so that class 2.2 has merged with class 2.1. In
case of trisyllabic nouns, which are not included here, the mergers in the Nairin and
Chūrin dialects are the same, while the Gairin dialects differ in that tone class 3.2
has Ø tone, and has merged with class 3.1.
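The merger patterns for monosyllabic and disyllabic nouns can also be written down as a small lookup table, which makes it easy to check which classes have fallen together in a given subtype. The sketch below simply transcribes the patterns described above. The names MERGERS, shape_of and merged are illustrative assumptions, and trisyllabic nouns are left out, as in the table.

```python
# Tone shape -> tone classes, per Tōkyō subtype, as described in the text.
MERGERS = {
    "Nairin": {"Ø": ["1.1"], "H": ["1.2", "1.3"],
               "ØØ": ["2.1"], "ØH": ["2.2", "2.3"], "HØ": ["2.4", "2.5"]},
    "Chūrin": {"Ø": ["1.1", "1.2"], "H": ["1.3"],
               "ØØ": ["2.1"], "ØH": ["2.2", "2.3"], "HØ": ["2.4", "2.5"]},
    "Gairin": {"Ø": ["1.1", "1.2"], "H": ["1.3"],
               "ØØ": ["2.1", "2.2"], "ØH": ["2.3"], "HØ": ["2.4", "2.5"]},
}


def shape_of(subtype, tone_class):
    """Return the tone shape of a tone class in the given subtype."""
    for shape, classes in MERGERS[subtype].items():
        if tone_class in classes:
            return shape
    raise KeyError(f"unknown tone class {tone_class!r}")


def merged(subtype, class_a, class_b):
    """True if the two tone classes share one tone shape in the subtype."""
    return shape_of(subtype, class_a) == shape_of(subtype, class_b)


print(merged("Nairin", "1.2", "1.3"))  # True
print(merged("Chūrin", "1.2", "1.3"))  # False
print(merged("Gairin", "2.1", "2.2"))  # True
```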
The Nairin type can be found in four non-adjacent areas on the island of Honshū.
To the northeast of the Kyōto type tone system, the Nairin type can be found in an
area that spreads from Nagoya northward to Gifu, Izumi and Takayama. On Noto
peninsula a special subtype of the Nairin tone system can be found, in which tone
classes 2.4 and 2.5 and 3.6 and 3.7 have not merged, which is unusual for Tōkyō
type dialects. (This Nairin type tone system – which is conservative on the one hand,
but has gone through some innovations as well (cf. McCawley’s lowering rule) –
will be discussed in more detail in section 6.2.) The more typical Nairin type can
also be found in this area, in several villages on the Noto peninsula and Noto Island.
Noto Island is of particular interest (see sections 3.1.1 and 6.2) as some of the tone
systems on this island appear to be exceptionally archaic.
To the west of the Kyōto type tone system, the Nairin type can be found in an
area that spreads from Mineyama, Wadayama and Toyooka southwestward to
Tsuyama, Okayama and Onomichi. Completely surrounded by Kyōto type tone, the
Nairin tone system can be found in the middle of the Kyōto type area in the villages
of the Totsukawa region.5 (The Totsukawans – or so it seems – have a talent for
preserving their own tone system, as in Shin-Totsukawa on Hokkaidō, immigrants
from this region have managed to preserve their own Nairin type tone system while
surrounded by Gairin type tone (Wurm & Hattori, 1981).)
The Chūrin type can be found in three blocks. One block lies on the island of
Honshū to the east of the area with Kyōto type and Nairin type tone. It spreads from
Okazaki northward to Itoigawa and eastward to Matsumoto, Maebashi, Hinoemata,
Kōfu, Shizuoka, Odawara, Yokohama, Tōkyō and Chiba. (The Izu islands off the
coast of Odawara also have a Chūrin type tone system.) The second block is to the
west of the Kyōto and Nairin type area on the island of Honshū. It spreads from
5 Kindaichi (1977) and Uwano (1981) make a distinction between the Nairin and Chūrin types,
based on the different merger patterns in the monosyllabic nouns. In all other dialect maps and
classifications these two types are not distinguished as most classifications are based on the
merger patterns of disyllabic nouns only. Because of this, my information on the Nairin dialects
stems mainly from Uwano’s article, but the mergers of the monosyllabic tone classes in the
pattern that Uwano (1981) describes are confirmed in the description of the dialects of Nagoya
and Gifu by Mase (1960) and the Totsukawa region by Hirayama (1979) and Ikuta (1951).
Uwano claims that the dialect of Kōda has an unmerged system in which 1.1, 1.2 and 1.3 are all
distinguished from each other, but does not mention the dialect description on which this
observation is based. The only description of the dialect of Kōda that I know of in which the
monosyllabic nouns are mentioned (Kindaichi, 1954) is not very clear on this point but seems
to indicate a merger of tone classes 1.2 and 1.3 (cf. section 6.2.6). I therefore classify the
dialect of Kōda as belonging to the Nairin type.
Tottori and Nakayama on the Sea of Japan coast southwestward to Hiroshima,
Hamada, Yamaguchi and Shimonoseki. Finally the Chūrin type can be found in the
southwestern corner of Shikoku in the region of Uwajima, Nakamura and Sukumo,
adjacent to the Kyōto type tone system that is spoken on the rest of the island.6
The Gairin tone system can be found in four widely separated blocks: By far the
largest block can be found in the northeast of Honshū. It starts to spread from
Nagano northward to Niigata. (This area is, however, still considered a transitional
area between the Chūrin and the Gairin type.) From Yamagata prefecture on, Gairin
type tone spreads all the way to Akita, Aomori and Morioka. In the centre of Honshū,
along the Pacific coast, there is a block in the region of Tenryū, Toyohashi,
Kakegawa and Hamamatsu. In the west of Honshū, along the Sea of Japan coast
there is a block in Shimane prefecture, in the area of Yonago, Matsue, Izumo and
Gōtsu. Finally, in the northeast of Kyūshū there is a block in the region of Kita
Kyūshū and Ōita. (Located next to this area is the Hakata/Fukuoka subtype, in which
the tone classes have merged as 2.1/2/3 vs. 2.4/5. Not only is this merger pattern
unusual, the fact that the merged class 2.1/2/3 (which includes class 2.1, which
normally has Ø tone) contains a /H/ tone is another unusual feature of this dialect.)
The Gairin type is divided into two subtypes, which I have called Gairin A and
Gairin B. The areas with the Gairin A type are smaller than the areas with the Gairin
B type, but I nevertheless regard the A type as basic, as this type represents an older
stage in the Gairin type tone systems.
In west Japan, the A type has been preserved in northeast Kyūshū, and in
Yonago and Gōtsu in Shimane prefecture. In central Japan it can be found along the
Pacific coast around Hamamatsu, and in the northeast it can be found on top of the
Shimokita peninsula and along the Pacific coast in Iwate prefecture, spreading
westward past Morioka.
In the Gairin B type, the /H/ tone has shifted one syllable to the right in words of
more than one syllable, unless the syllable which would become the /H/ tone-bearing
syllable contains a close vowel (i, u). In the B type tone systems, therefore, vowel
height influences the location of the /H/ tone. This type occurs in Matsue
6 It is clear from Ikuta’s (1951) description of the Tōkyō type dialects of Uwajima, Sukumo and
Nakamura in the southwest of the island of Shikoku, that they belong to the Chūrin type.
Nevertheless, both on the map in Language atlas of the Pacific area and in Uwano’s comment
to the map, these dialects are classed as belonging to the Nairin type. Uwano does, however,
mention the Chūrin type merger pattern of the monosyllabic nouns on Shikoku in his comments.
About the Nairin tone systems he writes: “Each is situated next to the Kinki type or the
subtypes derived from it. (...) Judging from the geographical distribution (...) these dialects at
least can be considered to have derived from the Kinki type through independent substance
change. The Nairin type in Shikoku, too, has probably derived similarly (although
monosyllabic nouns raise a problem, being 1.1, 1.2 / 1.3).” It appears therefore, that proximity
to the Kinki (i.e. Kyōto type) dialects was seen as a more important criterion for classing a
dialect as belonging to the Nairin type on the map, than the kind of mergers in the
monosyllabic tone classes. (Meanwhile, Prof. Uwano has confirmed to me that the
classification of the Tōkyō type dialects on Shikoku as Nairin on the dialect map is not correct.)
and Izumo, and in most of the large area with Gairin type tone in northeast Honshū,
as well as in Hokkaidō.
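The Gairin B shift rule described above can be sketched as a small function. This is a minimal illustration under stated assumptions: a word is given as a list of romanized syllables, the position of the /H/ tone is a zero-based index (or None for words with only Ø tones), and a syllable counts as containing a close vowel if its spelling contains i or u. The case in which the /H/ tone already stands on the last syllable is not addressed in the passage; here the tone is simply left in place, which is an assumption. The syllable strings in the example are schematic, not attested dialect forms.

```python
CLOSE_VOWELS = set("iu")


def gairin_b_shift(syllables, h_index):
    """Return the position of the /H/ tone after the Gairin B shift.

    syllables: list of romanized syllables (schematic input).
    h_index:   zero-based index of the /H/ tone, or None for all Ø tone.
    """
    if h_index is None or len(syllables) < 2:
        return h_index  # no /H/ tone, or a monosyllable: nothing shifts

    target = h_index + 1
    if target >= len(syllables):
        return h_index  # assumption: no following syllable to shift onto

    if any(ch in CLOSE_VOWELS for ch in syllables[target]):
        return h_index  # shift blocked by a close vowel in the target

    return target


print(gairin_b_shift(["ka", "ta", "na"], 0))  # 1: shift applies
print(gairin_b_shift(["ha", "si"], 0))        # 0: blocked by i in "si"
```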
The fact that the Gairin B type occurs in two separate blocks along the Japan Sea
coast may be connected to shipping routes along this coast, as the waters on the Sea
of Japan coast are more navigable than those on the Pacific coast. (See also section
10.6.)
Rightward shift of the /H/ tone blocked by close vowels is not entirely limited to
the Gairin B type dialects. It can also be found in one Chūrin type area on the Bōsō
peninsula south of Chiba, and in one Kyōto type dialect on Shikoku.7 (The influence
of vowel quality on rightward tone shift will be discussed in more detail in chapter
7.)
There are also a number of dialects with deviant Tōkyō type tone systems, such
as Narada (Uwano, 1976, 1977:318) and Shizukuishi (Uwano, 1997). These tone
systems developed from the familiar Tōkyō type tone systems but have very
different pitch assignment rules.8
1.1.2 The Kyōto type tone systems
The Kyōto type tone systems have many things in common with the tone system of
Tōkyō. Just as in Tōkyō, the tone system is characterized by the presence or absence
of at most one location per word of a distinctive drop from [H] to [L]. An important
difference is that in Kyōto this pitch fall regularly occurs one syllable earlier in the
word.
7 It should be noted that Hirayama Teruo’s dialect maps indicate the distinction between dialects
that do not have rightward tone shift blocked by close vowels and dialects that do, and not the
distinction between Chūrin and Gairin type tone. (See the maps in Zenkoku akusento jiten
(1960), Gendai Nihon-go hōgen daijiten (Meiji Shoin, 1992) and Kokugogaku daijiten
(Tōkyō-dō Shuppan, 1980).) Because there is a considerable congruence between the Gairin type tone
system and the areas that have partial rightward tone shift, and between the Chūrin type tone
system and the areas that do not have partial rightward tone shift, for people that are used to a
Kindaichi/Uwano type map, Hirayama’s maps may give the impression that there is a Chūrin
type area in Iwate prefecture and a Gairin type area on the Bōsō peninsula.
8 The tone system of Shizukuishi for instance, developed from a Gairin tone system of the Akita
type. Like Akita, this dialect does not have /H/ tone anticipation. We have seen that in Akita
this means that the syllables with Ø tone that precede the /H/ tone, as well as the syllables with
Ø tone that follow the /H/ tone have [L] pitch. In Shizukuishi on the other hand, the syllables
with Ø tone that precede the /H/ tone have [L] pitch, just as in Akita, but the syllables with Ø
tone that follow the /H/ tone have [H] pitch. In Shizukuishi therefore, the acoustic cue that
signals where the /H/ tone is located is a rise in pitch, so that it is the first syllable in the word
with [H] pitch that carries the /H/ tone. This is the exact opposite of the situation in most Tōkyō
type dialects where it is the last syllable in the word with [H] pitch that carries the /H/ tone.
Shizukuishi even developed a L% phrase boundary tone, which is the exact opposite of the %L
phrase boundary tone in Tōkyō; in Shizukuishi it is the phrase-final syllable which has
automatic [L] pitch, whereas in Tōkyō it is the phrase-initial syllable which has automatic [L]
pitch. In case of the dialect of Narada too (which has gone through an almost perfect tone
reversal), the derivation from the surrounding Chūrin Tōkyō type tone system is clear.
The /H/ tone can only be identified by looking at the location of a
transition from [H] to [L] because, just as in Tōkyō, the /H/ tone is anticipated on
syllables with Ø tone that precede the /H/ tone. An important difference from the
situation in Tōkyō is that Kyōto does not have the %L phrase boundary tone. In
Kyōto all syllables with Ø tone preceding the /H/ tone will have [H] pitch. In Kyōto
the surface realization of a word with /ØØHØ/ tone in phrase initial position is thus
not [LHH'L] as in Tōkyō, but [HHH'L]. When the initial syllable of a word does
have [L] pitch in Kyōto, this is distinctive.
Compared to the Tōkyō type dialects, the Kyōto type dialects have one extra
toneme: In addition to /H/ and Ø, they have an active /L/ tone, which can only occur
in word-initial position. This tone is often referred to as ‘/L/ register’.
As in Tōkyō, the location of a rise in pitch is not distinctive, and words or
phrases that start with /L/ register will have an automatic rise to [H] pitch before the
final syllable. A word with the phonological shape /LØØ/ will, for instance, have
[LLH] pitch. As in Tōkyō, the distinction between [H] and [L] pitch is not absolute
and can only be perceived in relation to the other pitches within a word or phrase. In
case of two words with the phonological shapes /ØØØ/ and /LØØ/, it is not so much
a difference between [L] and [H] pitch that can be heard but a difference between a
level tone contour in case of /ØØØ/ and a rising tone contour in case of /LØØ/.
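The Kyōto rules stated above can be sketched in the same way as those for Tōkyō. The function below covers only the cases the text describes: words with Ø and /H/ tones (anticipation without a %L boundary tone), and words with /L/ register followed by Ø tones only (automatic rise before the final syllable). The name kyoto_surface and the string encoding are assumptions made for the example. Words with only Ø tones are rendered as level [H] pitch, which is also an assumption, and combinations of /L/ register with a later /H/ tone are deliberately rejected rather than guessed at.

```python
def kyoto_surface(tones):
    """Map underlying Kyōto tones to surface pitches for a single word.

    `tones` is a string over {'L', 'H', 'Ø'}; 'L' (register) may only be
    word-initial. Returns a string over {'H', 'L'}.
    """
    if any(t not in ("L", "H", "Ø") for t in tones):
        raise ValueError("expected only 'L', 'H' and 'Ø' symbols")
    if "L" in tones[1:] or tones.count("H") > 1:
        raise ValueError("'L' must be word-initial; at most one 'H'")

    n = len(tones)
    if tones[:1] == "L":
        if "H" in tones or n < 2:
            raise ValueError("case not covered by this sketch")
        # /L/ register: [L] pitch with an automatic rise before the
        # final syllable.
        return "L" * (n - 1) + "H"

    if "H" in tones:
        h = tones.index("H")
        # No %L boundary tone: every Ø before the /H/ tone is [H].
        return "H" * (h + 1) + "L" * (n - h - 1)

    return "H" * n  # assumption: level contour rendered as [H]


print(kyoto_surface("ØØHØ"))  # HHHL
print(kyoto_surface("LØØ"))   # LLH
print(kyoto_surface("ØØØ"))   # HHH
```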
In Tōkyō the phonological distinction between monosyllabic nouns with /H/ and
Ø tone in isolation cannot be heard, but in many Kyōto type dialects the difference
between monosyllabic nouns with /L/, /H/ and Ø tone is audible even in isolation.
The pitches in isolation and with attached case particle in Kōchi are for instance :,
- for /H/, :, - for /L/ and :, - for Ø. In Kyōto they are :, :-,
:, :- and :, :-.
In most cases /L/ register in Kyōto corresponds to a pitch fall after the first
syllable in Tōkyō, as in (2).
2 The correspondence between initial /H/ tone in Tōkyō and /L/ register in Kyōto
Tōkyō Kyōto
ha'si, ' 'hasi, ' ‘chopsticks’
ha'si-ga '- 'hasi-ga '- ‘chopsticks (subj.)’
As the pitch fall in Kyōto regularly occurs one syllable earlier in the word than in
Tōkyō, /L/ register in Kyōto can be regarded as a pitch fall before the first syllable.
The correspondence between the two features in the two prosodic types is expressed
by using the same symbol (an apostrophe) for /L/ register as well as for the pitch fall
after /H/ tone, but in case of /L/ register the apostrophe is added before the initial
syllable.
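The apostrophe notation makes the regular correspondence between the two types easy to state: the pitch fall in Kyōto is expected one syllable earlier than in Tōkyō, and a fall before the first syllable is /L/ register. The sketch below expresses this as arithmetic on the position of the apostrophe. The function names render and kyoto_fall_from_tokyo are illustrative assumptions, and the fall position is counted as the number of syllables that precede the apostrophe. Since the text describes a regular correspondence rather than an exceptionless rule, the result is the expected Kyōto form, not a guaranteed one.

```python
def render(syllables, fall):
    """Write a word with an apostrophe at the pitch fall.

    fall: number of syllables before the fall; 0 places the apostrophe
    before the first syllable (/L/ register); None means no fall.
    """
    if fall is None:
        return "".join(syllables)
    return "".join(syllables[:fall]) + "'" + "".join(syllables[fall:])


def kyoto_fall_from_tokyo(tokyo_fall):
    """Expected Kyōto fall position: one syllable earlier than in Tōkyō."""
    if tokyo_fall is None:
        return None  # words with only Ø tones have no fall in either type
    if tokyo_fall < 1:
        raise ValueError("a Tōkyō pitch fall follows at least one syllable")
    return tokyo_fall - 1


hasi = ["ha", "si"]
print(render(hasi, 1), "->", render(hasi, kyoto_fall_from_tokyo(1)))
# ha'si -> 'hasi   (chopsticks)
print(render(hasi, 2), "->", render(hasi, kyoto_fall_from_tokyo(2)))
# hasi' -> ha'si   (bridge)
```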
It is not solely for comparative reasons that /L/ register in Kyōto is analyzed as a
pitch fall before the first syllable. The tone system of the Kyōto dialect itself also
contains arguments for such an analysis. When a word with /L/ register is modified
by a word with Ø tone like the demonstrative kono ‘this’ for instance, there is an
audible pitch fall before the word, as kono will have [H] pitch regardless of whether
the word to which it is added starts [H] or [L].
3 Audible pitch falls in Kyōto before words with /L/ register
kono 'hasi-ga ‘these chopsticks (subj.)’ /ØØ LØ-Ø/ '-
kono ha'si-ga ‘this bridge (subj.)’ /ØØ HØ-Ø/ '-
kono hasi-ga ‘this edge (subj.)’ /ØØ ØØ-Ø/ -
Another feature that distinguishes the Kyōto type tone system from the tone system
of Tōkyō, is that Kyōto has a distinction between tone classes 2.4 and 2.5 and
between tone classes 3.6 and 3.7 that almost all Tōkyō type dialects have lost.
However, not all Kyōto type dialects have this distinction (Ikehara, Ōse, Owase,
Aiga, Shimakatsu, Miura and Nigo for instance do not) and conversely, there are a
number of Tōkyō type dialects on Noto peninsula and Noto Island that do. The
location of the /H/ tone in the different dialects in relation to each other (Kyōto one
syllable earlier than Tōkyō) is therefore the only real criterion on which to decide
whether a dialect is of the Kyōto type or of the Tōkyō type.
Kyōto type tone occurs in a relatively limited area in the central part of Japan, in
and around the old capital of Kyōto and the cities of Ōsaka and Kōbe (the Kinki
area), including the cities of Ōtsu, Hikone, Himeji, Wakayama and Tsu. Across the
water from Ōsaka and Kōbe this type of tone system can also be found on a number
of small islands in the Seto Inland Sea and in the northeast of the island of Shikoku.9
The Kyōto type tone system that can be found around Kyōto and Ōsaka on the
island of Honshū is regarded as the typical Kyōto type, in which tone classes 2.2 and
2.3 have merged, but on Shikoku and on some small islands in the Seto Inland Sea a
9 Farthest removed from the rest of the Kyōto type dialects is the Kyōto type dialect of Sado
Island. Sado Island used to be an important area for silver mining, and intensive trade contacts
by sea with the Kyōto/Ōsaka area probably account for the Kyōto type tone system that can be
found on the island. In pre-modern times traffic over water was often easier than traffic over
land, so that in dialect geography one often sees that sea unites, while land divides. See also
how the greatest admixture of dialect forms typical of western Japan in Niigata prefecture does
not occur in western Niigata, closest to the western dialects, but in central Niigata along the
coast (Miller, 1993:162). The shipping route of the kitamae-bune which went all the way to
Hokkaidō to trade with the Ainu for instance, also followed along this stretch of the Japan Sea
coast. Sado Island happens to lie exactly off the coast of this area of Niigata, and many features
that are typical of the western Japanese dialects can be found both in the dialect of Sado Island
and in this area of Niigata prefecture: The past tense of consonant stem verbs originally ending
in -h like kahu ‘to buy’ and simahu ‘to finish doing’ is koota and simoota rather than the forms
katta and simatta typical of eastern Japan. The infinitive (ren’yōkei) form of adjectives like
siroi ‘white’ is siroo instead of siroku. The negative form ends in -n or -nu rather than -nai, and
the form ‘to be’ for living creatures is oru instead of iru. The imperative is formed by -yo rather
than -ro and ‘see!’ is thus miyo or mii instead of miro. For the copula both the forms ya
(western) and da (eastern) occur.
number of interesting subtypes can be found, in which these two tone classes are
kept separate:
The most famous is that of the island of Ibukijima. The Ibukijima type is the
only tone system in Japan that still distinguishes all of the five tone classes for
disyllabic nouns that were also distinguished in Middle Japanese. The tone system
of Ibukijima will be discussed in more detail in section 7.2.2.
Very close to this type is the tone system that can be found on the islands of
Manabe, Sanagi and Takami, three neighboring islands in the Seto Inland Sea.
(Uwano calls this the Manabe type.) In these dialects classes 2.2 and 2.3 are kept
separate (just as in the Ibukijima dialect) but classes 2.1 and 2.5 have merged. (In
Sanagi and Takami, tone classes 1.1 and 1.3 have merged as well.) Around these
three islands there is another small group of islands (Awashima, Honjima and
Teshima) in which 2.1, 2.3 and 2.5 have all merged to one tone class with Ø tone.
On the island of Shikoku the tone system of Kōchi is of the more typical Kyōto
type, but around Takamatsu, Marugame, Kan’onji, Niihama and Ikeda tone classes
2.1 and 2.3 have merged, just as in Awashima, Honjima and Teshima, and tone
classes 2.2 and 2.3 are thus also kept separate. (Uwano calls this tone system the
Sanuki type.) The East Sanuki dialect shows rightward shift of the /H/ tone blocked
by syllables with close vowels, similar to what can be seen in a number of Tōkyō
type dialects, although occurring in completely different tone classes. (See section
7.2.1.)
Finally, here and there on the outskirts of the Kyōto type dialect area, bordering
on the Tōkyō type dialects there are tone systems in which tone classes 1.1 and 1.3
and 2.1 and 2.4 have merged. (Uwano calls this type the Tarui type after the village
of Tarui in Gifu prefecture in which this type was first described.) In section 6.1, I
will explain how I think these tone systems developed.
1.1.3 The Kagoshima type tone systems
The Kagoshima type tone systems are named after the dialect of the city of
Kagoshima in the southwest of the island of Kyūshū. In Kagoshima it is not the
location in the word or phrase of a transition from [H] to [L] pitch which is
important but the difference between two distinct word-tones, which can be mapped
over words or phrases of different length.
The two Kagoshima word-tones have been named A and B (Hirayama, 1960)
and the division of the nouns over the two types shows a quite regular
correspondence with the Gairin tone system that can also be found on the island, in
the northeast around the cities of Ōita and Kita-Kyūshū. Tone classes that have all Ø
tones in the Gairin type dialects have word-tone A and tone classes that contain /H/
tone in the Gairin type dialects have word-tone B.10 For this reason it is possible to
identify word-tone A with Ø tone and word-tone B with /H/ tone.
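The correspondence between the Gairin tone classes and the two Kagoshima word-tones amounts to a one-line test: a class whose Gairin shape contains a /H/ tone has word-tone B, and all other classes have word-tone A. The sketch below applies that test to the Gairin shapes of the monosyllabic and disyllabic classes given earlier. The names GAIRIN_SHAPES and kagoshima_word_tone are illustrative assumptions, and the result is the regular correspondence only, so individual words may deviate.

```python
# Gairin tone shapes of the monosyllabic and disyllabic tone classes.
GAIRIN_SHAPES = {
    "1.1": "Ø", "1.2": "Ø", "1.3": "H",
    "2.1": "ØØ", "2.2": "ØØ", "2.3": "ØH", "2.4": "HØ", "2.5": "HØ",
}


def kagoshima_word_tone(tone_class):
    """Expected Kagoshima word-tone ('A' or 'B') for a tone class."""
    return "B" if "H" in GAIRIN_SHAPES[tone_class] else "A"


for tone_class in GAIRIN_SHAPES:
    print(tone_class, kagoshima_word_tone(tone_class))
```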
10 All the word-tone dialects of Kyūshū and the Ryūkyūs have merged tone classes 2.1 and 2.2
and 3.1 and 3.2, which is typical of the Gairin type tone systems. There is, however, one dialect
The word-tones of nouns up to three syllables with and without particle in the
dialect of Kagoshima are:
4 The word-tones of Kagoshima
A , - A , - A , -
B , - B , - B , -
The melodies of the word-tones can differ considerably from dialect to dialect. In
the dialect of Makurazaki on the Satsuma peninsula to the south of Kagoshima city
for instance (McCawley, 1978b:303), the melodies are almost the exact opposite of
the melodies in Kagoshima, but the division of the nouns over the two distinct tone
classes is the same:
5 The word-tones of Makurazaki
A , - A , - A
B , - B , - B
The Kagoshima dialect is often used to represent all dialects that have word-tones,
such as the dialects of the Ryūkyū archipelago of which the majority is also
characterized by distinct word-tones rather than a distinct location of a /H/ tone in
the word. However, the tone classes in the Ryūkyūs have merged in different ways,
and the word-tones are often more numerous. These dialects will be discussed
separately in chapter 9.
1.2 The distribution of the tone dots in Middle Japanese
The two tone dots that were most consistently used to indicate the tones of Middle
Japanese were the mark for the ping 平 tone (a dot at the lower left corner of a kana
graph) and the mark for the shang 上 tone (a dot at the upper left corner of a kana
graph). To a number of words, however, the mark for the qu 去 tone (a dot at the
upper right corner of a kana graph) was added, and to another group of words the
mark for the light ping tone (a dot at the middle left side of a kana graph) was added.
with word-tones, which developed from a Chūrin type tone system. This is the Chibu dialect on
Oki Island, off the coast of Matsue. In this dialect a merger between 2.1 and 2.4/5 has resulted
in a system of only two word-tones: 1.1/2 vs. 1.3 and 2.1/4/5 vs. 2.2/3. In the other dialects on
the island the tone class division, which is characterized by the location of the /H/ tone, is still
close to that of the Chūrin type tone system from which it evolved: For monosyllabic nouns it
is 1.1/2 vs. 1.3, and for disyllabic nouns it is 2.1 vs. 2.2/3 vs. 2.4/5, but there are only three
series of contrasts for words of any length, so that this dialect seems to be on its way to
developing a word-tone system with a three-way division.
As this dot is also called the tō-ten 東点 or ‘east dot’, it will be represented by the
character 東. 11
As to the qu tone, most markings concern those monosyllabic nouns that form
the separate tone class 1.3b. In addition some longer nouns have the qu tone mark on
the initial syllable.
From the following set of double attestations it becomes clear that the qu tone
can be analyzed as a sequence of a ping tone followed by a shang tone on one
syllable. The Tosho-ryō-bon of Ruiju myōgi-shō 図書寮本類聚名義抄 (1081) has
nu 去, but also nuu 平上 for ‘marsh’, goma 去上, but also ugoma 平上上12 for
‘sesame’, hagi 去平, but also haagi 平上平 for ‘shank’.13 Shinsen jikyō (892) has
hii 平上 for ‘shuttle’, which is also attested as hi 去 in Ruiju myōgi-shō.
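The analysis of the qu tone as a ping tone followed by a shang tone on one syllable can be illustrated by rewriting each qu mark in a tone pattern as a ping-shang sequence. The sketch below does only that. The function name expand_qu is an illustrative assumption, and the code operates on the tone marks alone, not on the kana spelling, which in the double attestations also gains an extra vowel sign.

```python
def expand_qu(marks):
    """Rewrite every qu mark (去) as a ping-shang sequence (平上)."""
    return marks.replace("去", "平上")


# Patterns from the double attestations cited above.
print(expand_qu("去"))    # 平上    (nu / nuu 'marsh')
print(expand_qu("去上"))  # 平上上  (goma / ugoma 'sesame')
print(expand_qu("去平"))  # 平上平  (hagi / haagi 'shank')
```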
The fact that the qu tone must have consisted of a ping-shang contour tone can
also be seen from a different type of entry in the Tosho-ryō-bon of Ruiju myōgi-shō:
Characters that have a one-kana pronunciation in Wa-on like i 伊 (イ) or hu 不
(フ) and a ping tone in Kan-on usually have a qu tone dot in the Wa-on entries of
Ruiju myōgi-shō, but characters that have a two-kana pronunciation in Wa-on like
佳 kuwe (クヱ) or kei (ケイ) and a ping tone in Kan-on will have a ping tone dot
followed by a shang tone dot added to the consecutive kana signs (Kindaichi,
1951:646-648).14
Furthermore, the Ruiju myōgi-shō tone patterns hardly ever show a shang-ping
sequence followed by a shang tone within the word, whereas a ping-shang sequence
followed by a shang tone within the word is very common. This makes it likely that
the qu tone in examples like goma 去上 or siwoni ‘aster’ 去上上 in Ruiju myōgi-shō
consisted of a ping-shang contour, but very unlikely that it consisted of a shang-ping
contour.
The qu tone dot was used to mark ping-shang tone contours in manuscripts such
as Wamyō ruiju-shō 和名類聚抄, Iroha-ji rui-shō 色葉字類抄 and the various
manuscripts of Ruiju myōgi-shō and Nihon shoki 日本書紀. In the many
manuscripts of Kokin waka-shū 古今和歌集, in Nihon shoki shi-ki 日本書紀私記
(1278-1293) and the Jōben-bon of Shūi waka-shū 浄弁本拾遺和歌集 (1333),
however, the qu tone mark is no longer used. It is thought that the ping-shang
contour tone disappeared from the language as a separate toneme in the 13th
century.15
11 The designation ‘east dot’ stems from the fact that the Sino-Japanese reading of the character
for ‘east’ 東 is a well-known example of a character with a light ping tone in Japan.
12 Chinese syllable final -N was in Japan generally replaced by a close vowel that is thought to
have been nasalized originally. Hayata (1973) therefore interprets the letter u here as a
grapheme indicating a syllabic nasal: Ngoma.
13 Haagi 平上平 is also attested in Dai-hannya-kyō ji-shō 大般若経字抄(1040).
14 Wa-on is an early form of Go-on. In Japan, one and the same Chinese character would often be
marked with a different tone dot depending on whether the character was read as Wa-on/Go-on
or as Kan-on. (See chapter 4 of part II.)
15 It is often claimed that the qu contour tone was replaced by the shang tone, because such a
As to the light ping tone dot, Komatsu Hideo (1959) discovered that some words
– in case of nouns most notably the second syllable of some of the disyllabic nouns
that in the Kyōto type dialects and a few of the Tōkyō type dialects form the
separate tone class 2.5 – were marked with the light ping tone dot in the Tosho-ryō-
bon manuscript of Ruiju myōgi-shō. A few examples can also be found in Konkōmyō
saishōō-kyō ongi 金光明最勝王経音義 of 1079 and a number of other works.
(There is evidence indicating that the light ping tone may have been in wider use in
earlier documents that have not survived. More evidence on the use of the light ping
tone dot to mark the pitches of Japanese words is presented in section 9.4 of part II.)
The light ping tone dot is placed to the left of a character or kana sign, slightly
above the normal location of the ping tone dot, and seems to have been in use only
in Japan. As far as is known to me, marking the light and heavy subtones of the four
basic Middle Chinese tones with a separate set of tone dots is a Japanese invention,
and did not occur in China or in Korea and Vietnam.
Moraic markings, of the kind that exist for the qu tone, provide little evidence
here.16 Based on other considerations, such as its influence on the pitch of attached
particles, the light ping tone in Middle Japanese is nevertheless thought to have
been the exact opposite of the qu tone, in other words, a combination of a shang tone
followed by a ping tone on one syllable.
In the earliest and most precise materials, such as the Tosho-ryō-bon of Ruiju
myōgi-shō, Konkōmyō saishōō-kyō ongi, the Iwasaki-bon of the Nihon shoki 岩崎本日本書紀
and other texts, the light ping tone dot was still used to mark the shang-ping
contour tones of Middle Japanese. In the marking system that came into use later,
however, a shang tone dot was used in place of the earlier light ping tone dot.17 The
light ping tone dot already fell out of use as a marker of the pitches of Japanese
in the 12th century.
development can indeed be seen in short (one kana) character readings. However, the character
readings involved were never adopted into the spoken language as loanwords, and the change
in tone dot markings was due to a development in scholarly tradition and not to phonological
sound change. (See sections 4.4 and 7.3.3.2 of part II.)
There are, on the other hand, also two examples of native Japanese words that were marked
with qu tone dots in the Tosho-ryō-bon of Ruiju myōgi-shō, but with shang tone dots in later
works such as the Jōben-bon of Shūi waka-shū. The two words involved (i.e. the nouns of class
1.3b su ‘nest’ and ya ‘building’) are, however, also attested with shang tone marks in the
Tosho-ryō-bon. The shang tone markings in Ruiju myōgi-shō indicate that a variant that had merged
with tone class 1.1 already existed in Middle Japanese, and not necessarily that words that had
a qu tone earlier shifted to shang tone later. As can be seen in section 1.3.1, su has merged
with tone class 1.1 instead of the more usual tone class 1.3 in modern Kyōto and Kōchi and in
the Tōkyō type dialects of western Japan, and ya has merged with class 1.1 in Kyōto (I have no
information on the tone of ya in Kōchi or western Japan).
16 There is one example of 1.2 kii ‘yellow’ marked as 上平 in Ruiju myōgi-shō.
17 There are also texts in which a ping tone dot is used instead of the light ping tone dot. This is
usually seen as an indication that in earlier copies of these texts the light ping tone was still in
use, as it is thought that the ping tone markings are the result of mistakes by later copyists who
mistook the only slightly raised light ping tone dot for the ordinary ping tone dot.
The common view is that the shang-ping contour tones did not disappear from
the language (and definitely not as a toneme), even though the light ping tone dot
fell out of use. (See for instance Suzuki Yutaka’s overview of the different marking
strategies in Akinaga et al. ed., 1998:580-581). Double attestations, the tone of
attached case particles (see the discussion of tone classes 1.2 and 2.5 in sections
1.3.1 and 1.3.2) and modern dialect reflexes all indicate that although the marking
system had changed, the underlying phonology had not. In other words, in the later
marking system the shang tone dot started to do double service, both as the indicator
of simple shang and of shang-ping contours, which had previously been marked
with the light ping tone dot.18
Summarizing the developments in the qu and light ping tones: The use of the
light ping tone dot to mark the pitches of Japanese words died out in the 12th century,
which is attributed to a change in the marking system. The use of the qu tone dot to
mark the pitches of Japanese words died out in the 13th century, but in this case, the
change in the markings is regarded as a sign that the ping-shang toneme had
disappeared from the language.
1.3 Modern Japanese and Middle Japanese compared
Tables 6 to 9 introduce the tone of the different tone classes in the modern Japanese
dialects, compared with their tone in Middle Japanese. As mentioned in the
introduction, the only subclass distinction included in the tables is between tone
class 1.3a and 1.3b and 3.5a and 3.5b as this distinction has been attested in the
Middle Japanese tone dot material. Other subclass distinctions will be discussed in
chapter 8. (Tone class 2.5 is not treated as a subclass of tone class 2.4, as 2.5 is well
attested as a separate class in the modern dialects.)
I will commence by providing some background information on the dialects that
I have chosen to represent the modern Japanese tone systems. (More information on
different types of tone dot attestations will be given at the end of this chapter.)
The modern Kyōto type tone system is represented by the dialect of Kōchi, as the
tone system of this dialect is more archaic than the dialect of the city of Kyōto itself.
Since the 17th century, the dialect of the city of Kyōto has shifted the /H/ tone that
originally fell on the second syllable of trisyllabic nouns one syllable to the left,
onto the first syllable.19 An older stage, in which this leftward shift had not yet
18 Double attestations are for instance abu ‘gadfly’ and hiru ‘leech’ 平上 in Ruiju myōgi-shō, 平
東 in Konkōmyō saishōō-kyō ongi, hitohe ‘single layer’, 平平東 as well as 平平上 in the
Tosho-ryō-bon of Ruiju myōgi-shō, tamaki ‘arm ornament’, 平平東 as well as 平平上 in the
Tosho-ryō-bon of Ruiju myōgi-shō. Double attestations of 東 and 上 are especially numerous in
case of the adjective suffixes -si (shūshi-kei) and -ki (rentai-kei).
19 This happened only in case of nouns that had a falling tone contour over the whole word, so
therefore not in tone classes that start with /L/ tone: ''. The tone pattern ' does
occur in Kyōto nouns (mostly in compound nouns) but is very rare. In verbs and adjectives it is
occurred has been attested in Bumō-ki 補忘記 (1687), and modern Kyōto type
dialects such as Wakayama and Hyōgo have preserved the Bumō-ki-type stage, just
like Kōchi.20
The Gairin type is represented by the dialect of Ōita, the Chūrin type by the
dialect of Tōkyō itself, and the Nairin type (only monosyllabic nouns are given) by
the dialect of Totsukawa. The dialect material is from Kobayashi (1975) for Kōchi
and Tōkyō, Hirayama (1979) for Totsukawa, and Hirayama (1960) for Kagoshima
and Ōita. The tables show the tone of nouns of one, two and three syllables with the
nominative case particle ga.
The correspondences for the trisyllabic nouns in the modern dialects can be
distressingly irregular. The decision of what can be regarded as the most common
reflex for a certain tone class in a certain dialect is based on a comparison of as
many words of a certain tone class as possible.
One reason for the irregularity of cross-dialect correspondences of longer nouns
is probably that many are compounds (cf. section 5.13), but some dialects are
more irregular than others, and some tone classes are more irregular than others.
Tone classes 3.1, 3.4 and 3.7 are usually quite regular. Very irregular tone
classes, on the other hand, are tone classes 3.2, 3.3 and 3.5.21 Tone class 3.3 has the
additional problem that the number of nouns for this class is so small, that for a
number of dialects, the decision of what to regard as the main reflex of this tone
class is rather arbitrary. To such reflexes I have added a question mark.
An interesting case is also tone class 3.6, which has a regular ' reflex in the
Gairin type dialects of Ōita and Hamamatsu, and in the Chūrin type dialects of part
of Nagano (Martin, 1987:182), but which has Ø tone in many cases in other Tōkyō
type dialects. (This issue is addressed in section 7.1.1.)
I have decided to include the trisyllabic nouns in the comparison whenever
possible, as they sometimes show unexpected, and therefore revealing
correspondences, which the shorter disyllabic nouns with their more limited tonal
possibilities do not. An example is the tone of nouns of class 3.2 in Kyōto and Kōchi,
discussed in sections 2.3.5, 3.1.6, 4.2.2 and 8.1.5.
1.3.1 Monosyllabic nouns
Tone class 1.2 is usually marked with a shang tone dot in the old manuscripts, just
as tone class 1.1. There are however, several reasons why it is assumed that these
more common. It probably still occurs in these inflected forms because there, it is the result of
productive morphophonemic processes.
20 Bumō-ki is a pronunciation guide from the Shingi Shingon school for the correct recitation of
the rongi ceremonies (formalized discussions on the Buddhist teachings). See also sections
13.1.1 and 14.4 of part II. The tone of Japanese words and Chinese loanwords is indicated by
means of fushihakase musical notation marks.
21 In each case however, it is possible to explain (at least partly) the reason behind the irregularity
of the present-day reflexes. (For tone class 3.2 cf. section 4.2.2, for tone class 3.3 cf. section 4.5,
for tone class 3.5 cf. section 8.2.5.)
two tone classes nevertheless had a different tone in Middle Japanese: The particle
no attached to tone class 1.1 with a shang tone, but to tone class 1.2 with a ping tone
(for instance hi-no ‘of the sun’ 上-平 in Ruiju myōgi-shō, ha-no ‘of the leaf’ and in
the Maeda-ke-bon of Nihon shoki 前田家本日本書紀; Hayata, 1983:35). The tonal
properties of the particle no were special in Middle Japanese, in that no always
copied the final pitch of the preceding word. This means that tone class 1.2 must
have had the tone contour of a shang tone followed by a ping tone on one syllable.22
There is also one example of 1.2 kii ‘yellow’ marked as 上平 in Ruiju myōgi-shō.
Furthermore, 1.2 e ‘inlet’23 is marked with the light ping tone dot in the Tosho-ryō-
bon of Ruiju myōgi-shō and also in Wamyō ruiju-shō, and 1.2 na ‘name’ is marked
with the light ping tone dot in Ruiju myōgi-shō (Mochizuki, 1974:668-669). The fact
that the tone of class 1.2 differed from that of class 1.1 is confirmed by the distinct
reflexes of the two classes in the modern dialects.
Tone class 1.3 is divided into two subclasses, 1.3a and 1.3b. This is because a
small number of nouns that have reflexes typical of tone class 1.3 in the modern
dialects are marked with the qu tone dot (=1.3b) in Ruiju myōgi-shō. (As mentioned
before, the qu tone in Japan is thought to represent a ping tone followed by a shang
tone on one syllable.)
22 In The Japanese language through time (1987), in Martin’s representation of the tone of the
different tone classes, the tone of the particle refers to the tone that will occur with the
particle no, and not to the tone that will occur with the other monosyllabic case particles. (In
the oldest materials these have uniform shang tone markings.)
Martin’s unusual representation is used as a tool to reveal information on the nature of the tone
of the final syllable of the noun. Martin’s representation of class 2.5 as /LH-L/ for instance,
reveals that the final syllable of class 2.5 had /F/ tone. Although Martin’s choice to use the
particle no is apparent from p.180 and p.170 of The Japanese language through time (and has
been confirmed to me by Martin in person) it is not mentioned on the pages where his
representations are given (pp.600–634).
A number of scholars seem to have based themselves on Martin’s representation, without
realizing that it refers to the tone of nouns + no, and not to the tone of nouns + particles like ha,
ga, ni and wo. As a result, in a number of publications, the tone of Middle Japanese nouns + no
is taken as a starting point, while the tone that occurs in the modern dialects in case of the other
monosyllabic case particles is presented as the outcome of the historical developments. (Cf.
Vovin, 1997:114, Takeuchi, 1999:54, Shimabukuro, 2002:337–338, 2007:297-298, 302-308,
Matsumori, 2008:107).
As the tone of nouns + no in the modern dialects is quite different from the tone of nouns + ha,
ga, ni and wo, the historical changes in these publications are misrepresented. For a comparison
of the historical developments in case of nouns + no and in case of nouns + the other
monosyllabic case particles see section 5.1 and subsections.
23 ‘Inlet’ is classed as ‘?1.1’ in Martin’s list, but the markings in the Tosho-ryō-bon and Wamyō
ruiju-shō indicate that it must belong to 1.2, which agrees with the idea that it is related to
‘branch’, just as Martin suspects (1987:392).
6 Monosyllabic nouns
                   Middle      Kōchi   Totsukawa   Tōkyō & Ōita       Kagoshima
                   Japanese            Nairin      Chūrin & Gairin
1.1  ko ‘child’    上-上       -       -           -                  A
1.2  na ‘name’     上-上24     '-      '-          -                  A
1.3a te ‘hand’     平-上       '-      '-          '-                 B
1.3b hi ‘fire’     平-上25     '-      '-          '-                 B
The number of nouns attested with a qu tone dot that have survived as independent
nouns in the modern dialects is small, and the reflexes in the Kyōto type dialects are
quite irregular:
7 Tone class 1.3b
                   Kyōto      Kōchi   Tōkyō
su ‘nest’          :-         -       '-, -26
ya ‘building’      :-, :'-    x       '-
hi ‘fire’          ':-        '-      '-
ni ‘load’          ':-        '-      '-, -27
e ‘bait’           :'-        '-      '-
ha ‘tooth’         :'-        '-      '-
hi ‘water pipe’    :-         x       '-
hi ‘cypress’       :-         x       x
me ‘female’        :-         x       x
As can be seen from the equal numbers of reflexes, the representation of the Kōchi
reflex in the table as '- could be replaced by '-. (My choice for '- is
based on the fact that the correspondences between Kōchi and Tōkyō tend to be
quite regular.)
The tone contour of the qu tone died out in the 13th century and left no clear
separate reflex in the modern dialects. (An explanation for the lack of a distinct
reflex in the modern dialects for this proto-Japanese tone class will be given in
section 8.2.1.) Unless stated otherwise, tone class 1.3 will refer to tone class 1.3a in
the following chapters.
There is evidence that – at least in central Japan – monosyllabic nouns in Middle
Japanese were automatically lengthened, just as they still are in many central
24 Also 東-上.
25 Also 去-上.
26 In the Tōkyō type dialects of western Honshū (Hiroshima, Okayama, Matsue, Izumo) su ‘nest’
has Ø tone.
27 The reflex in Aomori is also Ø.
Japanese dialects.28 The four possible tone combinations for these lengthened
monosyllabic nouns expressed in moras would look as follows: 1.1 上上, 1.2 上平,
1.3a 平平 and 1.3b 平上, but attestations of such moraic tone markings are
extremely rare. They are confined to two instances of 上平 markings (for 1.2 ya
‘spoke of a wheel’ and 1.2 ki ‘yellow’) and two instances of 平上 markings (for 1.3b
nu ‘marsh’ and 1.3b hi ‘shuttle’).
1.3.2 Disyllabic nouns
Although the final syllable of tone class 2.5 is most of the time marked with a shang
tone dot, just as in tone class 2.4, there are several reasons why it is assumed that
these two classes nevertheless had a different tone in Middle Japanese: The final
syllable of a number of nouns that belong to tone class 2.5 in the modern Kyōto type
dialects has been marked with the light ping tone dot in the Tosho-ryō-bon of Ruiju
myōgi-shō as well as a number of other works. In addition, just as was the case with
tone class 1.2, there is evidence from the particle no, which attached to tone class 2.5
with a ping tone, but to class 2.4 with a shang tone.29
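
The argument from the particle no, used for tone class 1.2 and again here for tone class 2.5, can be made explicit in a small sketch. It is an illustration only: the names `final_pitch` and `tone_of_no` are hypothetical, the contour values (東 as 上平, 去 as 平上) are the reconstructions discussed in section 1.2, and the markings entered for classes 1.2 and 2.5 are the reconstructed tones, not the shang dots that these classes usually carry in the manuscripts.

```python
# Assumed decomposition of the contour dots (reconstruction, see section 1.2):
# 東 (light ping) = 上 + 平 on one syllable, 去 (qu) = 平 + 上 on one syllable.
CONTOUR_COMPONENTS = {"東": "上平", "去": "平上"}


def final_pitch(noun_marking: str) -> str:
    """Return the last level tone of a noun, looking inside a final contour."""
    last_dot = noun_marking[-1]
    return CONTOUR_COMPONENTS.get(last_dot, last_dot)[-1]


def tone_of_no(noun_marking: str) -> str:
    """The particle no copies the final pitch of the preceding word."""
    return final_pitch(noun_marking)


# Reconstructed tones of the four classes compared in the text.
reconstructed = {"1.1": "上", "1.2": "東", "2.4": "平上", "2.5": "平東"}

for tone_class, marking in reconstructed.items():
    print(tone_class, marking, "+ no ->", tone_of_no(marking))

# no is shang after 1.1 and 2.4, but ping after 1.2 and 2.5.
assert tone_of_no(reconstructed["1.1"]) == "上"
assert tone_of_no(reconstructed["1.2"]) == "平"
assert tone_of_no(reconstructed["2.4"]) == "上"
assert tone_of_no(reconstructed["2.5"]) == "平"
```

The sketch only restates the inference: if no copies the final pitch, a ping tone on no after a noun whose last dot is written as shang requires that the final syllable ended low, which is what a shang-ping contour provides.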
Martin’s classification of kata ‘shoulder’, hune ‘boat’, mugi ‘wheat’, aha ‘millet’,
ine ‘rice’, kinu ‘garment’, uri ‘melon’, zeni ‘money’ and kari ‘goose’ as belonging
to class 2.5 for instance, is based on the fact that the particle no attached with a ping
tone to these nouns (Martin 1987:173). Ima ‘now’ and kibi ‘millet’ also belong to
the group to which the particle no attached with a ping tone, and in these cases
additional evidence for classification as belonging to this class can be seen in the
fact that they still belong to class 2.5 in some of the modern Kyōto type dialects
(other than Kyōto). In case of the majority of nouns that are regarded as members of
28 While Endō (1974:37) thinks that only nouns of class 1.3b, marked with the 去 tone were two
moras long, and that the rest of the monosyllabic nouns were one mora long, Martin (1987:71)
lists a number of early attestations of lengthening of monosyllables that include examples of
other tone classes as well: Kegon ongi shi-ki 華厳音義私記 (794) has both ka and kaa for
‘mosquito’ (1.1). Ruiju myōgi-shō (1081) has tii for ti ‘thatching grass’ (1.2?), waa for wa
‘wheel’ (1.3a), yaa marked with 上平 tone dots (1.2?) for ya ‘spoke of a wheel’ also attested
with a 平 tone dot (a copying mistake for an original tō-ten marking? Modern dialect data point
to 1.2 or 1.3), kii ‘yellow’ marked with 上平 tone dots (1.2). However, attestations of vowel
length for nouns of class 1.3b are particularly numerous: Honzō wamyō 本草和名 (918) has sei
and se ‘shaggy hoof’ (1.3b), attested with 去 tone in Wamyō ruiju-shō (934). Daijion-ji sanzō
hōshi-den 大慈恩寺三蔵法師伝 (1116) has the compound suu-ya and su ‘nest’ (1.3b), attested
with 去 tone in Ruiju myōgi-shō (Tsukishima 1969b:396). Ruiju myōgi-shō (1081) has nuu for
nu ‘marsh’ (1.3b), attested with 去 tone and 平上 tone. Shinsen jikyō 新撰字鏡 (892) has hii
with 平上 tone for hi ‘shuttle’ (1.3b), attested with 去 tone in Ruiju myōgi-shō and tii for ti
‘fish hook’. (No tone marks, no modern data.) Engi shiki 延喜式 (925) has wii for wi
‘rush’ (1.3b), also attested with 去 tone. (It should be noted that the tone dots in works like
Wamyō ruiju-shō, Shinsen jikyō and Engi shiki are of later date than the original date of
composition indicated.)
29 As will be discussed in section 1.4, there is material in which case particles other than no also
attach to nouns of class 2.5 with a ping tone, and based on this material, it is also possible to
identify members (or former members) of class 2.5.
class 2.5 the classification is based on the fact that they belong to class 2.5 in
modern Kyōto, or in some of the other modern Kyōto type dialects.
8 Disyllabic nouns
                      Middle       Kōchi   Totsukawa & Tōkyō   Ōita     Kagoshima
                      Japanese             Nairin & Chūrin     Gairin
2.1 tori ‘bird’       上上-上      -       -                   -        A
2.2 mura ‘village’    上平-上      '-      '-                  -        A
2.3 inu ‘dog’         平平-上      '-      '-                  '-       B
2.4 umi ‘sea’         平上-上      '-      '-                  '-       B
2.5 saru ‘monkey’     平上-上30    ''-     '-                  '-       B
1.3.3 Trisyllabic nouns
There is evidence of a subclass distinction for tone class 3.5 in the old tone dot
material, as some nouns are marked with 平平東, 31 instead of 平平上 tone dots
(Hattori, 1951).
As it happens, a distinction into two subclasses, 3.5a and 3.5b is also included in
Martin’s list of the different tone classes of nouns (1987). Martin’s subclass
distinction is based on a split in the reflexes of this tone class in Tōkyō: Tone class
3.5a has a ' reflex (or a Ø reflex) in Tōkyō, which shows the regular
correspondence with the ' reflex of the dialect of Kōchi. Tone class 3.5b on
the other hand, has an unexpected ' reflex in Tōkyō.
It is hard to prove that there is a link between the two different attestations in
Middle Japanese and the split in the reflexes in Tōkyō, as it is difficult to determine
the regular present-day Tōkyō reflex of the words marked in the old texts as 平平東:
The markings are rare, the example nouns are uncommon and several examples
consist of compound nouns, which often have irregular reflexes as it is.
I have therefore decided to leave the 平平東 markings and the split in the
reflexes in Tōkyō for what they are. (Unless stated otherwise, tone class 3.5 will
refer to tone class 3.5a in the following chapters.) In section 8.2.5 I will, however,
present a tentative explanation for the unexpected ' reflex of tone class 3.5b in
30 Also 平東-上.
31 Kindaichi (1964c:350) gives the examples of akidu ‘dragonfly’ (平平東 in the Yūryaku-ki 雄略
紀日本書紀 of Nihon shoki), awoto ‘blue grindstone’ (平平東 in the Tosho-ryō-bon of Ruiju
myōgi-shō, 平平上 in the Kanchi-in-bon), hirome ‘seaweed, konbu’ (平平東 in the Tosho-ryō-
bon, 平平上 in the Kanchi-in-bon), hitohe ‘single layer’ (平平東 and 平平上 in the Tosho-ryō-
bon of Ruiju myōgi-shō), tamaki ‘arm ornament’ (平平東 and 平平上 in the Tosho-ryō-bon of
Ruiju myōgi-shō). According to Martin (1987:540) also 平平東 in Konkōmyō saishōō-kyō ongi.
Tōkyō, in connection with the existence of a tone class with 平平東 markings in
Middle Japanese.
9 Trisyllabic nouns
                           Middle        Kōchi   Totsukawa & Tōkyō   Ōita     Kagoshima
                           Japanese              Nairin & Chūrin     Gairin
3.1  ‘shape’ katati        上上上-上     -       -                   -        A
3.2  ‘azuki’ aduki         上上平-上     ''-     '-                  -        A
3.3  ‘strength’ tikara     上平平-上     '-      '-?                 -?       A
3.4  ‘mirror’ kagami       平平平-上     '-      '-                  '-       B
3.5a ‘heart’ kokoro        平平上-上     '-      '-                  '-       B
3.5b ‘dragonfly’ akidu     平平東-?      ?       ?                   ?        ?
3.6  ‘sparrow’ suzume      平上上-上     '-      '-                  '-       B
3.7  ‘helmet’ kabuto       平上平-上     ''-     '-                  '-       B
1.4 Differences in the attestation of the tone
of the monosyllabic case particles
In the tables above, I have shown the tone of the monosyllabic case particles ha, ga,
wo, and ni in Middle Japanese as they are marked in the Tosho-ryō-bon of Ruiju
myōgi-shō and tone dot manuscripts of Nihon shoki. In these manuscripts they have
universal shang tone. I have chosen to show this type, because this type – in which
there is no influence yet of the tone of the noun on the tone of the attached case
particle (except in case of the particle no) – is most archaic. It also has the oldest
attestations. In some later materials on the other hand, it can be seen that the tone of
the case particles is influenced by the tone of the preceding noun.
The archaic type did not disappear after the more innovative types were attested.
The archaic type can, for instance, still be found in the 14th century, long after the
earliest attestations of more innovative types. For reasons which shall be explained
in chapter 3, I consider material in which the monosyllabic case particles (other than
no) are marked with shang tone dots throughout, as representing the tone system of
the city of Kyōto and the area which surrounds Kyōto, as well as the area which
nowadays has a Nairin type tone system. This tone system will be called the MJ
(Middle Japanese) ‘Nairin’ tone system.
In the Kanchi-in-bon of Ruiju myōgi-shō on the other hand, we see that the tone
of the attached case particles after words that ended with a shang-ping sequence on
one syllable is not shang but ping: 2.5 for instance, is marked as 平上-平. In the
Date-ke-bon of Kokin waka-shū 伊達家本古今和歌集 (1226), classes 2.2 and 3.2
have this effect on the tone of attached case particles as well: 2.2 is marked as 上平-平
and 3.2 is marked as 上上平-平.
We have just seen that tone class 1.2 and the final syllable of tone class 2.5 are
reconstructed with a shang-ping tone contour on one syllable, and so, what these
tone classes have in common is a shang-ping tone contour on the whole word or on
the final syllable: A shang-ping tone contour on the preceding noun could cause the
original shang tone on the attached case particle to change to a ping tone.
Materials in which such ping tone spreading only occurred after tone classes that
ended in a shang-ping contour on one syllable reflect – what I will call – the MJ
‘Chūrin’ tone system. (Not surprisingly, the tendency for the ping tone to spread
onto an attached case particle was apparently stronger when the shang-ping
sequence was squeezed onto a single syllable.) Materials in which the ping tone
spreading also occurred after tone classes 2.2 and 3.2 reflect – what I will call – the
MJ ‘Gairin’ tone system.32
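
The three subtypes differ only in the set of tone classes after which the particle tone changes, which can be laid out as a small lookup. The sketch below is illustrative: the names `PING_SPREADING_AFTER`, `particle_tone` and `mark_with_particle` are hypothetical, the particle no is left out because it behaves differently, and real manuscripts are less tidy than the lookup suggests (class 1.2, for instance, is not always attested with a particle).

```python
# Tone classes after which the shang tone of a monosyllabic case particle
# (ha, ga, wo, ni; not no) is replaced by a ping tone, per MJ subtype.
PING_SPREADING_AFTER = {
    "Nairin": frozenset(),
    "Chūrin": frozenset({"1.2", "2.5"}),
    "Gairin": frozenset({"1.2", "2.5", "2.2", "3.2"}),
}


def particle_tone(tone_class: str, system: str) -> str:
    """Tone dot expected on a case particle after a noun of the given class."""
    return "平" if tone_class in PING_SPREADING_AFTER[system] else "上"


def mark_with_particle(noun_marking: str, tone_class: str, system: str) -> str:
    """Join the noun's tone dots and the expected particle dot, e.g. 平上-平."""
    return f"{noun_marking}-{particle_tone(tone_class, system)}"


# Markings quoted in the text, reproduced from the lookup.
assert mark_with_particle("平上", "2.5", "Chūrin") == "平上-平"
assert mark_with_particle("上平", "2.2", "Gairin") == "上平-平"
assert mark_with_particle("上上平", "3.2", "Gairin") == "上上平-平"
# In the archaic type the particle keeps its shang tone throughout.
assert mark_with_particle("上平", "2.2", "Nairin") == "上平-上"
```

Keeping the subtypes as sets makes the relation between them visible: the ‘Chūrin’ set is contained in the ‘Gairin’ set, and the ‘Nairin’ set is empty.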
In (10) I have divided the tone dot material in groups, based on the degree of
tone spreading that can be seen in the tone of the case particles. The list is not
complete, and a problem in dating the different materials is that it is not always clear
whether the tone dots date from the same time as the date of copying indicated in the
manuscript. The division between the Nairin/Chūrin type on the one hand and the
Gairin type on the other hand is easy to make, as there are usually enough
attestations of tone classes 2.2 and 3.2 with a particle. Distinguishing the Nairin and
the Chūrin type from each other is harder, as the small tone class 1.2 is not always
attested with a particle (other than no).
In case of the Kanchi-in-bon the assignation to the Chūrin type is therefore based
on the tone of the particle that occurs after class 2.5: If assimilation of the tone of the
particle occurred after the shang-ping sequence on the final syllable of this class, I
have assumed that it also occurred after the shang-ping sequence of class 1.2, but it
is possible that assimilation did not (yet) take place after a lengthened monosyllable.
(In Nihon shoki shi-ki on the other hand, the particle after 1.2 is marked with the
ping tone.)
In my division I have relied on Sakurai’s article ‘Joshi akusento no shi-teki
kōsatsu’ (included in Sakurai 1975, 183-225) and Kindaichi (1955), and furthermore on
Mochizuki (1974), Akinaga (1974), Hayata (1984), Tsukishima (1951, 107-178)
and Wenck (1959).
The distinction between the different types is only meaningful when one takes
the non-standard reconstruction of the Middle Japanese tone system by S. R.
Ramsey as basis for the reconstruction of the historical developments. The
32 At the Middle Japanese stage represented by the tone dot material, these subtypes had not yet
merged any of the tone classes. As the true criterion for classing a dialect as Nairin, Chūrin or
Gairin is the merger patterns of the nouns, I have decided to put the classification as Nairin,
Chūrin or Gairin at the Middle Japanese stage in quotes.
distinction will therefore not be referred to until chapter 3, where Ramsey’s theory is
introduced.
10 The different types of Middle Japanese tone dot material
a. No ping tone spreading to particles (MJ ‘Nairin’ tone system): Iwasaki-bon of
Nihon shoki 岩崎本日本書紀 (1000), Tosho-ryō-bon of Ruiju myōgi-shō 図書寮
本類聚名義抄(±1080/1100), Tosho-ryō-bon of Nihon shoki (1142), Kitano-bon
北野本 of Nihon shoki (1150), Maeda-ke-bon of Nihon shoki (1150), Kamakura-
bon 鎌倉本 of Nihon shoki (1303), Maeda-ke-bon of (Jōben) Shūi waka-shū 前
田家本浄弁拾遺和歌集 (1333).33
b. Ping tone spreading to particles after tone classes 1.2 and 2.5 (MJ ‘Chūrin’ tone
system): Kanchi-in-bon 観智院本 of Ruiju myōgi-shō (1140-1150?), Mikanagi-
bon 御巫本 and Ōei-bon 応永本 of Nihon shoki shi-ki 日本書紀私記 (1278-
1293).
c. Ping tone spreading to particles after tone classes 1.2, 2.5, 2.2 and 3.2 (MJ
‘Gairin’ tone system):34 Myōgo-ki 名語記 (1268-1275), Kokin (waka-shū)
kunten-shō 古今訓点抄 (1305).
d. Ping tone spreading to particles after tone classes 1.2 and 2.5 and sometimes
after 2.2 and 3.2 (Mixed MJ ‘Chūrin’/‘Gairin’ tone system): Date-ke-bon of
Kokin waka-shū 伊達家本古今和歌集 (1226).
33 Although the particle in 1.2 ha-ni ‘leaf’ is attested with a 平 tone dot, so is the particle in 1.3
ne-ha ‘root’, so the 平 markings may be a mistake.
34 The Daiji-in-bon of Shiza kōshiki 大慈院本四座講式 which is fushihakase material (discussed
in chapter 14 of part II) also belongs to this type. According to Kindaichi (1964c:141–142) the
Daiji-in-bon is most likely a late 14th century copy of a 13th century original. (Shiza kōshiki was
originally composed in 1226.)
2 The standard reconstruction of the Middle Japanese
tone system
In this chapter, I will discuss the reconstruction of the tone system of Middle
Japanese by Kindaichi Haruhiko and the historical developments from proto-Japanese
to the modern dialects that Kindaichi proposes. Kindaichi’s reconstruction,
taken together with his account of the historical developments, is what is known as
the ‘standard theory’. The two key elements of the standard theory are Kindaichi’s
reconstruction of the Middle Japanese tone system and the idea that this tone system
was close to, or almost identical to, the tone system of proto-Japanese.
I will end this chapter with a short discussion of a number of alternative theories,
which adhere to Kindaichi’s reconstruction of the Middle Japanese tone system, but
deviate from the standard theory in that they hold a different view on the status of
Middle Japanese in the historical developments.
2.1 Kindaichi’s Middle Japanese tone system
resembles the tone system of Kyōto
In 1951, in an influential article on the reconstruction of the (Late)1 Middle Chinese
tones and the tonal value of the tone dots in Japan, Kindaichi Haruhiko interpreted
the tonal value of the tone dots in such a way that the tone pattern of Middle
Japanese coincided as much as possible with the tone pattern of the modern Kyōto
type dialects. One obvious reason for assuming that the Middle Japanese tone
system resembled that of modern Kyōto was the fact that Kyōto (Heian-kyō) had
become the capital of Japan in 794, and that the tone dot material could be expected
to reflect the dialect of the capital. Another reason was the fact that the Kyōto type
dialects have preserved certain tonal distinctions that have been lost in most other
dialects, so that the Kyōto type tone system appears to be archaic.
The clearest example can be found in the Kyōto type tone system of the island of
Ibukijima, the only modern dialect in which tone classes 2.1, 2.2 and 2.3 are each
kept separate from all of the other tone classes, so that this dialect distinguishes five
tone classes for disyllabic nouns. (See section 7.2.2.)
1 It was only later that a distinction between Early Middle Chinese and Late Middle Chinese was
introduced in Chinese linguistics, but the value of the Japanese tone dots was clearly based on
the Kan-on character reading tradition, and the Kan-on character reading tradition was based on
Late Middle Chinese. (See chapter 4 of part II.)
Another example is that of the three separate tone classes for the monosyllabic
nouns that have been preserved in the Kyōto type dialects, while most Tōkyō type
dialects distinguish only two classes. The distinction of three separate classes in the
Kyōto type dialects is definitely not an innovation. Although the evidence for the
existence of a separate tone class 1.2 in Middle Japanese is not very direct – it is for
the largest part based on the effect that this tone had on the tone of attached case
particles – there can be no doubt that this tone class was distinguished in the proto-
Japanese tone system: Most Tōkyō type dialects have only preserved two
monosyllabic tone classes, but in the Nairin dialects and in the Chūrin/Gairin
dialects the tone classes have merged in different ways, so that based on the Tōkyō
type dialects alone – independent of the old and modern Kyōto data – it is necessary
to reconstruct three separate monosyllabic tone classes in proto-Japanese.
Finally, tone classes 2.5 and 3.7 are distinguished in most Kyōto type dialects,
but only in very few Tōkyō type dialects.
In addition to a comparison with the modern Kyōto type dialects, Kindaichi also
considered descriptions of the Late Middle Chinese tones by Japanese Buddhist
monks.
Kindaichi reconstructed the Late Middle Chinese tones, and by extension the
value of the Japanese tone dots, as follows: The shang tone was /H/, the ping tone
was /L/, the qu tone was a contour tone /R/. (The three interpretations of ping, shang
and qu are interconnected, as we have seen that in Japanese the qu tone consists of a
ping tone followed by shang tone on one syllable.) Finally, the light ping tone was a
contour tone /F/.
In table (1) the tone of nouns + ga of one, two and three syllables in Middle
Japanese (as reconstructed by Kindaichi) is compared with the tone of these nouns in
the modern dialects. I have omitted the Kagoshima dialect, but the correspondence
with Kindaichi’s Middle Japanese is as follows: Word-tone A corresponds to words
starting with /H/ tone in Middle Japanese and word-tone B corresponds to words
starting with /L/ tone in Middle Japanese.
1 Comparison of Kindaichi’s Middle Japanese tone system with the tone systems
of the modern dialects
Class  Middle Japanese  Kōchi  Totsukawa  Tōkyō     Ōita
       (Kindaichi)             (Nairin)   (Chūrin)  (Gairin)
1.1 - - - - -
1.2 - '- '- - -
1.3a- '- '- '- '-
1.3b- '- '- '-
2.1 - - - (as Nairin) -
2.2 - '- '- -
2.3 - '- '- '-
2.4 - '- '- '-
2.5 - ''- '- '-
3.1 - - - (as Nairin) -
3.2 - ''- '- -
3.3 - '- '- ?
3.4 - '- '- '-
3.5 - '- '- '-
3.6 - '- '- '-
3.7 - ''- '- '-
If one ignores the tone of the case particles, the pitches of most of the one- and two-
syllable nouns in Middle Japanese – as well as some of the three-syllable nouns –
look similar to the pitches that these tone classes have in the modern Kyōto type
dialects.2
2.2 Historical developments according to Kindaichi
Because of the similarity between the tones of modern Kyōto and Middle Japanese,
Kindaichi assumed that the changes that the modern Kyōto type dialects went
through were relatively minor, and that at some point Tōkyō type tone developed
from Kyōto type tone by means of a rightward tone shift.
As can be seen in (1), the Tōkyō type dialects and the Kyōto type dialects have
/H/ tones in classes that lacked /H/ tones in Middle Japanese (tone classes 2.3 and
3.4), and in class 3.5, the /H/ tone in Middle Japanese is in a location in the word
(the final syllable) that does not agree with the location of this tone in any of the
modern Kyōto type dialects. Kindaichi therefore assumed that Tōkyō type tone
developed only after a number of changes had transformed the Middle Japanese tone
system into a tone system that had a much closer resemblance to the tone system of
modern Kyōto.
At some point, /H/ tones developed in classes 2.3 and 3.4. In class 3.5, the /H/
tone on the final syllable that was present in Middle Japanese was lost, while a new
2 The /H/ tone of the case particles is not regarded as an important difference by Kindaichi and
others that follow the standard theory, as it is thought that the particles lost their independent
tone in the course of history. The comparison is therefore usually restricted to the tones before
the word boundary.
/H/ tone developed in the required location (i.e. on the initial syllable). It is only
after these changes were completed that the development towards a Tōkyō type tone
system (i.e. rightward shift of the /H/ tone) could have started.
This change towards a modern Kyōto type tone system had definitely taken
shape by the 17th century, as such a more modern Kyōto type tone system is attested
in Bumō-ki 補忘記 (1687). The development from the tone system of Bumō-ki to the
tone system of Tōkyō is shown in (2).
2 The development from the tone system of Bumō-ki to the tone system of Tōkyō
Bumō-ki Tōkyō
2.1 - > -
2.2/3 '- > '-
2.4 '- > '-
2.5 ''- > '-
3.1 - > -
3.2/4 '- > '-
3.3/5 '- > '-
3.6 '- > '-
3.7 ''- > '-
As identical notations are used for real pitch falls (/H/ tone followed by Ø tone) as
well as /L/ register (which is a pitch fall only in an abstract morphophonemic sense),
the change looks as follows: After deleting all but the first pitch fall per word, a
single rightward shift of the remaining pitch falls results in a Tōkyō type tone
system.
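The notational rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding of a word as a list of fall positions, the function names `kyoto_to_tokyo` and `render`, and the sample patterns are hypothetical and are not taken from the tables in this chapter.

```python
from typing import Dict, List, Tuple

# Assumed encoding (hypothetical, for illustration only):
# a word is described by the positions of its pitch-fall marks.
#   0      = the abstract "fall" before the word (/L/ register)
#   k >= 1 = a real pitch fall after the k-th syllable


def kyoto_to_tokyo(falls: List[int]) -> List[int]:
    """Delete all but the first fall, then shift it one syllable rightward."""
    if not falls:
        return []  # no fall mark at all: nothing to shift
    return [min(falls) + 1]


def render(falls: List[int], syllables: int) -> str:
    """Write a word as 'o' syllables, with ' marking each fall position."""
    out = "'" if 0 in falls else ""
    for k in range(1, syllables + 1):
        out += "o"
        if k in falls:
            out += "'"
    return out


# Illustrative inputs (assumed patterns): (fall positions, syllable count)
examples: Dict[str, Tuple[List[int], int]] = {
    "no fall": ([], 2),
    "fall after syllable 1": ([1], 2),
    "/L/ register only": ([0], 2),
    "/L/ register plus later fall": ([0, 2], 2),
}

for label, (falls, n) in examples.items():
    print(f"{label}: {render(falls, n)} > {render(kyoto_to_tokyo(falls), n)}")
```

Note that the sketch treats position 0 exactly like a real pitch fall. That is the notational equivalence at issue here: the rule works on the marks, whether or not the two kinds of mark correspond to the same phonetic event.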
In reality however, the development of Tōkyō type tone from Kyōto type tone
involves two very different kinds of changes: In tone classes 2.2/3, 3.2/4 and 3.3/5
/H/ tone is shifted one syllable to the right. In classes 2.4, 2.5, 3.6 and 3.7 on the
other hand, /L/ register in Kyōto is transformed into /H/ tone on the initial syllable in
Tōkyō.
In 1954, based on the tone patterns that he observed in a number of dialects on
Noto peninsula and Noto Island (see section 6.2), Kindaichi proposed a number of
intermediate stages (3) in the change from the Bumō-ki type tone system to the
Tōkyō type tone system (1954:72).
The single rightward shift shown in (2) is therefore more a description of the end
result than of the actual process by which Kindaichi assumes the change from Kyōto
type tone to Tōkyō type tone took place. While the shift of /H/ tone one syllable to
the right already occurs at the second intermediate stage between the Bumō-ki type
tone system and the Tōkyō type tone system, the development of initial /H/ tone in
Tōkyō from /L/ register in Kyōto only occurs at the third intermediate stage.
3 Intermediate stages reconstructed by Kindaichi
Bumō-ki Tōkyō
2.2/3 - - > - - -
2.4 - > - > - > - -
2.5 - - > - > - > -
3.2/4 >
3.3/5 >
3.6 > > >
3.7 > > >
The development of /H/ tones in tone classes 2.3 and 3.4 in the Middle Japanese
tone system resulted in the merger of these tone classes with classes 2.2 and 3.2, a
merger which has indeed occurred in the Nairin and Chūrin type dialects and in
Kyōto (if we ignore the split in tone class 3.2 in Kyōto discussed in section 2.3.5 for
the moment). As shown in (4) however, in the Gairin type dialects tone classes 2.2
and 3.2 lack /H/ tone altogether, and have merged with tone classes 2.1 and 3.1.
4 Comparison of the merger patterns of Middle Japanese, Bumō-ki
and the Tōkyō type dialects
Class  Middle Japanese  Bumō-ki  Nairin/Chūrin  Gairin
2.1 - - - -
2.2 - '- '- -
2.3 - > '- '- '-
3.1 - - - -
3.2 - '- '- -
3.4 - > '- '- '-
This means that in the Gairin type dialects, before /H/ tones developed in tone
classes 2.3 and 3.4, the following changes must have occurred in tone classes 2.2
and 3.2, which made them merge with classes 2.1 and 3.1: 2.2 > , 3.2
> . By way of these changes, the Gairin type must have branched off
from proto-Japanese before the development of /H/ tones in classes 2.3 and 3.4
could cause these nouns to merge with classes 2.2 and 3.2.
In this way, it is in principle possible to derive the modern tone systems from the
standard reconstruction of the Middle Japanese tone system, but this derivation is
certainly not without problems.
2.3 Kindaichi’s reconstruction and the tone system of proto-Japanese
We have seen that in Kindaichi’s reconstruction of the Middle Japanese tone system
there is no transition from /H/ tone to /L/ tone in classes 2.3, 3.4 and 3.5. As shown
once more in (5), the modern dialects of Tōkyō and Kyōto all have such a transition
(/H/ tone followed by Ø tone), and in places that correspond to each other: The
Tōkyō type dialects have the /H/ tone one syllable later than the Kyōto type dialects.
5 Classes 2.3, 3.4 and 3.5 in Tōkyō, Kyōto and Middle Japanese
Class  Middle Japanese  Kōchi  Tōkyō (all subtypes)
2.3 - '- '-
3.4 - '- '-
3.5 - '- '-
One can only conclude – with Kindaichi – that /H/ tones must have been present at
the appropriate place (i.e. in the modern Kyōto type location of the word) in these
tone classes at the time when the difference in the location of the /H/ tone in the
Kyōto type and Tōkyō type tone systems developed. This is the only way in which
the regular correspondence between the location of the /H/ tone in the Kyōto type
dialects and the Tōkyō type dialects can be explained: In order for the Tōkyō type
dialects to shift the /H/ tones one syllable to the right, such /H/ tones must have been
present in the tone system of these dialects at the time of the shift. This is why
Kindaichi assumed that /H/ tones developed in these tone classes at some point, and
that the split into Kyōto type and Tōkyō type tone only occurred afterwards.
As Kindaichi’s Middle Japanese tone system forms an unsuitable starting point
for the development into the tone systems of the modern dialects, it cannot be
equated with proto-Japanese. The Middle Japanese tone system – in fact – seems to
predate the tone system of proto-Japanese. As /H/ tones in the Kyōto type location
in tone classes 2.3, 3.4 and 3.5 are only first attested in the late 13th to early 14th
century, this would mean that the tone system from which all modern dialects can be
derived only dates back to this time. It must have spread throughout Japan, most
likely from Kyōto, and completely replaced all previously existing, more Middle
Japanese-like tone systems, after the 14th century.
Such a complete replacement of all previously existing dialects by the dialect of
Kyōto, starting only after the 14th century, is unlikely, but the most fundamental
problem is formed by the fact that the Gairin type tone systems have survived: The
tone systems in the modern Gairin areas cannot have been replaced by a new tone
system radiating out from the Kyōto area, as the Gairin dialects do not show the
typical central Japanese merger patterns in the tone classes of the nouns.
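The merger argument can be made explicit with a small sketch. It assumes, as a simplification, that a regular merger cannot be undone, so that classes merged in an ancestral system must remain merged in every descendant. The function name `can_descend` and the encoding of a tone system as a partition of tone classes are hypothetical; the merger patterns are the ones described above for the disyllabic classes 2.1, 2.2 and 2.3.

```python
from typing import FrozenSet, Set

Partition = Set[FrozenSet[str]]


def can_descend(ancestor: Partition, descendant: Partition) -> bool:
    """True if every group merged in the ancestor is still merged
    (contained in a single group) in the descendant."""
    return all(any(group <= d for d in descendant) for group in ancestor)


# Merger patterns as described in the text (disyllabic classes only)
middle_japanese: Partition = {
    frozenset({"2.1"}), frozenset({"2.2"}), frozenset({"2.3"})
}
central: Partition = {  # Bumō-ki, Nairin/Chūrin: 2.2 and 2.3 merged
    frozenset({"2.1"}), frozenset({"2.2", "2.3"})
}
gairin: Partition = {   # Gairin: 2.1 and 2.2 merged
    frozenset({"2.1", "2.2"}), frozenset({"2.3"})
}

print(can_descend(middle_japanese, central))  # True
print(can_descend(middle_japanese, gairin))   # True
print(can_descend(central, gairin))           # False
```

Under this assumption, both merger patterns can be derived from a system that keeps all three classes apart, but the Gairin pattern cannot be derived from a system in which classes 2.2 and 2.3 have already merged.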
In light of these problems, Kindaichi does not assume such a late date for the
split of the modern tone systems from the tone system of proto-Japanese. He
maintains that the proto-Japanese tone system was older, and similar to, or almost
identical to, the tone system of Middle Japanese. According to Kindaichi there is a
different reason why /H/ tones are present in tone classes 2.3, 3.4 and 3.5 in identical
locations in the Tōkyō type dialects throughout Japan.
According to Kindaichi, the changes that resulted in the Gairin type tone systems
happened several times independently in the different areas with Gairin type tone.
These changes involve: the merger of classes 2.2 and 2.1, and the merger of classes
3.2 and 3.1; the development of /H/ tones in classes 2.3 and 3.4; the loss of the
final /H/ tone in class 3.5, followed by the development of a new /H/ tone on the
initial syllable; and the shift of all /H/ tones one syllable towards the right.
The Chūrin and Nairin type tone systems only went through the last three
changes, but again, as independent parallel developments in the geographically
widely separated areas with Nairin and Chūrin type tone. According to Kindaichi,
the repeated independent occurrence of all of these changes is not improbable
because they are natural, and thus likely to occur. The standard theory hinges on the
idea that all necessary changes from the Middle Japanese-like tone system of proto-
Japanese to the modern dialects are natural, and form part of a drift in which all
Japanese dialects, independently of each other, eventually take part.
2.3.1 How natural is the development of /H/ tones
in tone classes 2.3, 3.4 and 3.5?
According to Kindaichi, the development of /H/ tones in classes 2.3, 3.4 and 3.5 is a
natural development. In Kindaichi’s view, if a word starts with two or more syllables
with /L/ tone in a row, a /H/ tone will develop at the beginning of this word in due
course. According to Kindaichi, this ‘mechanism’ operated several times in the
history of Japanese, the first time being in the change from the Middle Japanese tone
system to the Bumō-ki type tone system.
6 Development of /H/ tones in the change from the Middle Japanese tone system
to the tone system of Bumō-ki
2.3 - > '-
3.4 - > '- (and after the 17th century > '-)
3.5 - > - > '-.
Next, the mechanism operated many times independently, all over Japan, in the
change from the Bumō-ki type tone system to the Tōkyō type tone systems.
7 Development of /H/ tones in the change from the tone system of Bumō-ki
to the Tōkyō type tone systems
2.4 - > - > - > -
2.5 - > - > - > -
3.6 - > - > - > - > -
3.7 - > - > - > -
What is the motivation behind the change from Middle Japanese to the Bumō-ki type
tone system (the first change) and from the Bumō-ki type tone system to the Tōkyō
type tone systems (the second change)? Why do /H/ tones develop in tone classes
that start with a sequence of /L/ tones?
I will limit myself in this section to a discussion of the first change, as this is the
change that all theories that adhere to the standard reconstruction of the Middle
Japanese tones must necessarily acknowledge.3
According to Kindaichi (1971), if a word starts with a sequence of /L/ tones, the
location of the word boundary is unclear, and so the inserted /H/ tones function as
word demarcators. A problem, however, is that in longer tone classes, such as 3.4
and 4.5, the inserted /H/ tones result in sequences of /H/ tones, and it remains
unclear why sequences of word-initial /L/ tones are problematic while sequences of
word-initial /H/ tones are not.
A somewhat similar solution has been proposed by Matsumori (1999), who
describes the process as ‘polarized tone insertion at the beginning of each tonal
phrase’, while Kisseberth (2001) draws a comparison with stress languages: “These
changes indicate that there is something non optimal about #LL, and that what is bad
about #LL can be repaired by converting the sequence to #HL (….) While it seems
to me that it is inappropriate to try to reduce tonal systems entirely to stress systems,
there is no doubt that the two systems share many features. It seems that the change
is simply the equivalent of the common requirement in stress languages that a stress
foot should be aligned with the left edge of a word.”4
Again, these solutions may work for the shorter nouns, but they cannot explain
why in longer nouns like 3.4 kagami ‘mirror’ (LLL in Middle Japanese) and 4.5
yorokobi ‘joy’ (LLLL in Middle Japanese) the change was to HHL and HHHL
and not to HLL and HLLL. In other words, one cannot really speak of polarized
3 The developments in the second change are also based on the assumption that word-initial
sequences of /L/ tone are somehow unacceptable, but not everyone accepts Kindaichi’s idea
that the development from modern Kyōto to Tōkyō was a gradual process, in which tone
classes 2.4, 2.5, 3.6 and 3.7 went through a stage in which they started with /LL/ tone. The
alternative view (i.e. a single rightward shift transforming /L/ register in Kyōto into initial /H/
tone in Tōkyō) will be discussed in section 2.3.3.
4 This comment is in reaction to an account of the historical developments in the Kyōto dialect
by Nakai Yukihiko (2001).
tone insertion at the beginning of each tonal phrase, or alignment with the left edge
of a word.
The explanation that agrees best with the data is by Kawakami Shin (1965/1995).
He hypothesized that in order to emphasize the transition from the /L/ of the final
syllable of a noun to the /H/ of the attached case particle, or to emphasize the
transition from /L/ to /H/ within a noun (as in the case of kokoro below), the last /L/
before /H/ became pronounced progressively lower, while the /L/ tones that
preceded the final /L/ became progressively higher (/L/ > /M/). The /M/ tones then
developed into /H/ tones, and as a last step the resulting tone sequences were
simplified by elimination of the final /H/.
8 Kawakami’s explanation for the development of /H/ tones
in tone classes that started with a sequence of /L/ tones
2.3 inu ‘dog’ - > - > - > -
3.4 kagami ‘mirror’ - > - > - > -
4.5 yorokobi ‘joy’ - > - > - > -
3.5 kokoro ‘heart’ - > - > - > -
The problem remains, however, that /H/ tones are present in nouns that started with
/LL/ tone in Middle Japanese in corresponding places in dialects all over Japan. For
this exact sequence of developments to have occurred many times over,
independently, in dialects that are widely separated geographically, it would have to
agree closely with widely attested universals of tone rules.
Kawakami’s proposed development, however, conflicts with the universal that a
[HL] interval is subject to F0 polarization, while a [LH] interval is subject to F0
compression. In other words, a [LH] interval has the tendency to level out to [LM]
or [MH] (Hyman, 2007), which is the exact opposite of the development proposed by
Kawakami.
Even Kawakami’s solution therefore falls short of explaining how this relatively
late change in the history of the Japanese tone system could have been so pervasive.
And this is, of course, not even all, as Kawakami’s change must have been followed
by rightward tone shifts that started independently many times over, leaving not a
single dialect behind. After all, the Kyōto type location of the /H/ tone has not been
preserved anywhere outside of the central Japanese area.
2.3.2 Hayata’s solution: Unrecorded /M/ tones in classes 2.3, 3.4 and 3.5
in Middle Japanese
In light of these problems, Hayata Teruhiro came up with a way to rationalize the
parallel but independent development of these /H/ tones in the Kyōto type and
Tōkyō type dialects. Hayata (1973) made use of an idea that had once been proposed
by Hattori Shirō (1951), namely that Middle Japanese included /M/ tones. Hattori
suggested that in the Middle Japanese tone system as recorded in Ruiju myōgi-shō
類聚名義抄, the word classes that would later on develop /H/ tones in the Kyōto
type dialects had the following tone: 2.3 , 3.4 and 3.5 . (To Hattori
the Middle Japanese tone system represented no more than an older stage in the
dialect of Kyōto).
Hayata adopted Hattori’s idea of /M/ tone in Middle Japanese, and incorporated
it in an accentual analysis (9). Hayata uses two different marks, one indicating initial
/L/ tone L, and one indicating a pitch fall ä.5
9 Hayata’s analysis of the Middle Japanese tone system
Middle Japanese
(Hayata)
2.1
2.2 ä
2.3 Lä
2.4 L
2.5 Lä
3.1
3.2 ä
3.3 ä
3.4 Lä
3.5 Lä
3.6 L
3.7 Lä
According to Hayata – to whom the Middle Japanese tone system represented the
tone system of proto-Japanese (1973:152) – the developments to the modern Tōkyō
type dialects were as follows: In the Nairin/Chūrin type dialects the slight lowering
of the pitch after the /M/ tones developed into a full pitch fall (just as in the Kyōto
type dialects: /M/ > /H/) and was then shifted to the right. In the Gairin type dialects
the slight lowering of the pitch after the /M/ tones developed into a full pitch fall as
well (/M/ > /H/) and shifted to the right. For some reason however, the full pitch fall
in tone classes 2.2 and 3.2 was lost (/H/ > Ø).
Hayata’s solution makes the Middle Japanese tone system closer to the tone
system of modern Kyōto, while at the same time avoiding the merger of tone classes
5 The location of the mark for the pitch fall in tone class 3.7, after the final syllable instead of
after the second syllable, is explained by the following rule: “A syllable with a falling pitch
preceded by a non-low pitched syllable (...) becomes low pitched.” In Hayata’s system the
pitch automatically rises from [L] to [H] if the word includes the initial /L/ tone mark L. (See
classes 2.4, 3.6 and 3.7.) This rule holds true, even when the initial /L/ tone mark is several
syllables away and never actually resulted in an initial low tone. (See the final [H] pitch in tone
class 3.5.)
2.2 and 2.3 and 3.2 and 3.4 that has occurred in modern Kyōto, as this would be
problematic in view of the Gairin type dialects.
Even though Hayata’s idea makes the development of the /H/ tones in the Kyōto
type dialects and in corresponding places in the Nairin/Chūrin, as well as in the
Gairin type dialects more plausible, it is still necessary to take many independent
coincidental developments in geographically widely separated dialects for granted,
and the problems that plague the standard theory are by no means solved.
Finally, if we follow Hayata’s solution, we have to reconstruct for Middle
Japanese a tone system with distinctive /H/, /L/ and /M/ tone. Despite the fact that
/M/ tone must have been distinctive – after all, it left reflexes in all modern dialects
– it was never recorded. The practice of using tone dots to mark the tones of Japanese
was invented in this period, and it is hard to imagine that one of the three tonemes of
the language would have been so completely overlooked. It is probably mainly
because of this last problem (the lack of attestation) that Hayata’s reconstruction of
/M/ tones in Middle Japanese has not been widely accepted.
2.3.3 How natural is the change of initial /L/ tone in Kyōto to initial /H/ tone
in Tōkyō in classes 2.4, 2.5, 3.6 and 3.7?
Kindaichi assumed that the second change, the change from a more modern Kyōto
type tone system to a Tōkyō type tone system, was a gradual process in which at
some point, sequences of /L/ tone developed at the beginning of certain tone classes,
which were then resolved by the insertion of /H/ tones, in the same way that
sequences of /L/ tones in Middle Japanese had been resolved. Other scholars assume
a direct development of a single initial /L/ tone in Kyōto to initial /H/ tone in Tōkyō,
at the same time, and as part of the same rightward shift that created the Tōkyō type
location of the /H/ tone in general. Such a development, without an intermediate
stage with sequences of /L/ tone has been shown in Table (2) in section 2.2. The
analysis of initial /L/ tone in Kyōto as a pitch fall before the word is essential to this
idea.
As has been shown in chapter 1, when modified, nouns with initial /L/ tone in
Kyōto can have an audible pitch fall before the word. However, initial /L/ in Kyōto
is only preceded by a pitch fall in an abstract morphophonemic sense.
The idea is nevertheless that this pitch fall was shifted to the right, at the same
time and in the same way that pitch falls within the word were shifted to the right, i.e.
the same rule that moves a /H/ tone one syllable to the right is used to turn initial /L/
tone in Kyōto into initial /H/ tone in Tōkyō.
As the same phonological mark ˈ is normally adopted for both /H/ tone and initial
/L/ tone, it looks as if this can be done, but this is an artefact created by the notation.
Tonal developments that have been observed in other languages show that initial /L/
tone, when shifted to the right, will simply spread onto the next syllable, and will not
normally turn into initial /H/ tone. We will see in chapter 7 that in dialects all over
Japan, initial /L/ tone or [L] pitch also spreads to the right, onto the next syllable,
and does not turn into initial /H/ tone. This happens not only in the Tōkyō type
dialects where initial [L] pitch is not distinctive, but also in the Kyōto type dialects,
where it is.6
A rightward tone shift – in other words – cannot explain why the distinct initial
/L/ tone, which is so typical of the Kyōto type dialects, developed into initial /H/
tone in Tōkyō. The development of initial /H/ tone in Tōkyō from /L/ register in
Kyōto is a change which requires a convincing and natural explanation, as it must
have operated throughout Japan; initial /H/ tone in classes 2.4, 2.5, 3.6 and 3.7, is
typical of all subtypes of the Tōkyō type dialects, and (as we shall see in chapter 9)
even of many word-tone dialects in the Ryūkyūs. We see, however, that the
supposed naturalness of this development (whether as a direct development or by
means of intermediate stages) is not confirmed by tonal developments in other
languages, nor by developments in the dialects of Japanese.
2.3.4 How natural is the shift of /H/ tone to the right in the Tōkyō type dialects?
A problem that has already been mentioned several times in the previous sections is
the fact that the evidence from dialect geography does not support Kindaichi’s
theory. Of all the changes required to derive the modern Tōkyō type dialects from
the tone system of Middle Japanese, only the rightward shift of /H/ tone can truly be
regarded as a natural development that has been attested in many tone languages.
However, as the Kyōto type location of the /H/ tone has not been preserved in Japan
anywhere outside of the central Japanese area, the shift of /H/ tone to the right in the
Tōkyō type dialects must have taken place independently in widely separated areas
in Japan, without leaving a single dialect behind.
Kyōto type tone can only be found in a relatively small central area around
Kyōto and Ōsaka, across the water of the Seto Inland Sea in the northeast of
Shikoku, on small islands off the coast of Shikoku, and due to intensive contact with
the Kyōto area, also on Sado Island.
Tōkyō type tone on the other hand, can be found in Honshū (and Hokkaidō), in
the southwest of Shikoku, in the northeast of Kyūshū, and (in a simplified form) on
islands off the coast of Kyūshū like Tsushima and Iki. It surrounds the area with
Kyōto type tone on all sides, and it can even be found in the middle of this area in
the Totsukawa region, where a number of isolated villages have a Tōkyō type tone
system.
Rightward tone spreading and shift are natural phenomena, and it is not impossible
for such developments to occur independently in a large number of dialects.
The total lack of any dialect preserving the original location of the /H/
6 See for instance the West Sanuki and East Sanuki dialects in section 7.2.1, and the dialect of
Ibukijima in section 7.2.2. The dialect of Ibukijima is the perfect case to test the assumptions of
standard theory, as in Ibukijima – a Kyōto type dialect – a rightward tone shift can be seen
under way. Whereas /H/ tones indeed shift to the Tōkyō type location, /L/ register does not
develop into initial /H/ tone. Instead, the initial /L/ tone simply spreads more and more to the
right.
tone outside of the central Japanese area, however, indicates that this theory is
stretching the imagination too far.
2.3.5 Problems concerning the tone of class 3.2 in the Kyōto type dialects
Another problem with the standard theory has to do with the tone of class 3.2 in
the Kyōto type dialects. When the development of Tōkyō type tone from Kyōto type
tone is shown, the Nairin/Chūrin type is used to represent the Tōkyō type, and the
tone attested in Bumō-ki is used to represent the Kyōto type.
This has to do with the fact that certain innovations, which took place in
trisyllabic nouns in Kyōto since the 17th century, make the present-day tone system
of Kyōto unfit to use as an example of what the tone system must have been like at
the time of the rightward shift in Tōkyō: As shown in (10), in modern Kyōto, tone
classes 3.2, 3.4 and 3.5 have merged, which is the case in none of the three Tōkyō
subtypes.
10 Comparison of the merger patterns of trisyllabic nouns in modern Kyōto,
Bumō-ki and the Tōkyō type dialects
Class  Modern Kyōto  17th-century Kyōto (Bumō-ki)  Tōkyō (Nairin/Chūrin)  Tōkyō (Gairin)
3.2 '- '- '- -
3.4 '- '- '- '-
3.5 '- '- '- '-
3.7 ''- ''- '- '-
The Kōchi type tone system, which I used earlier to represent the Kyōto type
dialects, cannot be used either, as in this dialect tone class 3.2 has merged with tone
class 3.7, which is not the case in any of the three Tōkyō subtypes. An additional
reason why the dialect of Kōchi cannot be used, has to do with the fact that the tone
of class 3.2 in Kōchi is hard to explain from the tone pattern of Middle Japanese. In
Kōchi class 3.2 has '' tone, although some ' and ' reflexes (the
last due to influence from Kyōto?) also occur. The /L/ register of this tone class is
hard to explain when Kindaichi’s reconstruction of the Middle Japanese tone system
is taken as the starting point.
If the development of an unexpected /L/ register in tone class 3.2 were limited to
the dialect of Kōchi, one could dismiss it as an unusual development in an isolated
dialect on Shikoku. As a matter of fact however, this development is far more
widespread and can be found in most or perhaps even all of the Kyōto type dialects
on Honshū as well.
The reflex in Kyōto for instance, is usually presented as '- (from earlier
'- as in Bumō-ki),7 but in reality it is predominantly ''-, just as in
Kōchi, mixed with '-. 8 The reflexes of class 3.2 in Wakayama city
(Hirayama, 1992) and Ōsaka (Martin, 1987) are also a mixture of ' and
''. (See section 8.1.5.) Hirayama (1988:47) furthermore reports that “the
merger between tone classes 3.2 and 3.7 is widely distributed, for instance in the
present-day Kyōto dialect, and in Kinki dialects such as Gojō, Tanabe, Arida and
Hongū.”
11 Comparison of the tone of class 3.2 and 3.7 in Middle Japanese, Kōchi
and the Tōkyō type dialects
Class  Middle Japanese (Kindaichi)  Kōchi  Tōkyō (Nairin/Chūrin)  Tōkyō (Gairin)
3.2 - ''- '- -
3.7 - ''- '- '-
Before we conclude that the tone system of Bumō-ki cannot have been the
ancestral tone system of any of the modern Kyōto type dialects, we should note that
even Bumō-ki has mixed reflexes: The usual representation of tone class 3.2 in
Bumō-ki as ' is in fact based on only two examples: higasi ‘east’ which is
marked with ' tone marks and midori ‘green’ which is marked with '' as
well as ' tone marks.
It turns out that in the Kyōto type dialects on Honshū, tone class 3.2 has split, a
split which – judging from the limited evidence from Bumō-ki – had already
occurred before the 17th century. For some reason part of tone class 3.2 merged with
tone class 3.7, and part merged with tone class 3.4.
The standard theory cannot explain the development of the '' tone pattern,
nor the reason behind the split in the reflexes of class 3.2 in the Kyōto type dialects.
Furthermore, if the tone systems of the Tōkyō type dialects developed from a
relatively modern Kyōto type tone system, such as the tone system of Bumō-ki, it is
strange that no split in tone class 3.2 or merger between tone classes 3.2 and 3.7 can
be found in the Tōkyō type dialects.
2.4 Historical background of the standard theory
At the beginning of this chapter, I briefly touched on some of the reasons why
the historical development of the Japanese tone system was reconstructed as it is by
7 See for instance Satō (et al. ed.), 1977:255 and Koku-go gakkai (ed.) 1980:8-9.
8 See both Hirayama (1960) and Kobayashi (1975).
64 2 The standard reconstruction of the Middle Japanese tone system
Kindaichi and others. These few remarks however, are not sufficient to explain the
development and the widespread acceptance of a theory with so many obvious
problems. In order to understand how Kindaichi developed his views on the process
and the direction of linguistic change, a sketch of the scientific climate of the time is
indispensable.
A good account of the history of the standard theory and the scientific climate in
which it developed is given in the article ‘Language change in Japan and the Odyssey
of a Teisetsu’ by Ramsey (1982), and my sketch below relies heavily on this article.9
2.4.1 The geographical dilemma
In the 1920s little was known about the tone systems of the Japanese dialects.
People were aware of the difference between the tone systems of Tōkyō and Kyōto,
but no one knew exactly how the tone patterns of the two dialects related to
each other, or what kind of tone system lay on the other side of Kyōto.
This situation was changed by Hattori Shirō who started to investigate the tone
systems of the Japanese dialects and proceeded to write a series of articles in which
he drew up a set of correspondences between the tone systems of a number of
eastern (Tōkyō type) and western (Kyōto type) dialects (1931, 1932, 1933). The two
systems stood in such a relationship to each other that one seemed to be derivative,
yet Hattori never ventured a definite answer to the question of which system was
older.
It is clear however, that from the beginning Hattori believed that the tone system
of Kyōto was older. For instance, he called the Kyōto type dialect the A type and the
Tōkyō type the B type. As has been mentioned, there was good reason to believe
that Kyōto was the older of the two types as the distinctions made in the tone
patterns recorded in old manuscripts from the capital could account not only for the
tone distinctions of the Kyōto type dialects, but for those of the Tōkyō type dialects
as well. This meant that the Middle Japanese tone system of the Kyōto area could
more or less be equated to that of proto-Japanese. But more importantly, this tone
system did not seem to be very different from that of modern Kyōto; the symbols
contained in the old texts appeared to correlate with the pitches heard in modern
Kyōto.10
9 Ramsey in turn expresses his reliance on an essay by Tokugawa (1977).
10 It was only in his famous article of 1951 that Kindaichi set out to argue explicitly what the tone
values of the Middle Chinese tones and of the tone dots in Japan had been.
There can be no doubt however, that from the very beginning, 20th century Japanese linguists
treated ping as /L/ and shang as /H/. Kindaichi himself for instance, already did so in his
graduation paper of 1937, in which he presented his discovery that the distinction that was
made in Middle Japanese between tone classes 2.2 (上平) and 2.3 (平平) had been preserved in
the Gairin dialects and the Kagoshima dialect. The fact that many of the old texts had been
written in Kyōto, of course, strongly suggested such an interpretation, as it resulted in a Middle
Japanese tone system that resembled the tone system of modern Kyōto, but the immediate
identification of ping with /L/ and shang with /H/ may also have been partly due to the names
of the tones (‘level’ and ‘rising’).
The written record was regarded as good proof of the antiquity of the Kyōto
system. Hattori hesitated to conclude that Kyōto was the older tonal type because
the prosodic type that lay on the other side of Kyōto could not easily be explained
by this historical interpretation.
The situation in the westernmost regions of the country had been unknown, and
when Hattori turned to the dialects spoken there, it quickly became apparent that
these western areas were not continuations of the Kyōto system, nor did they have a
completely new tone system: Hattori had discovered that Kyōto type tone was
surrounded on both sides by Tōkyō type tone.
If Kyōto type tone were original, precisely the same changes would have had to
occur independently in the east and the west. And this would have happened both
times at the expense of the original Kyōto type, which, as the tone system of the
centre of culture, should have been the object of imitation, not rejection.
This was a very unlikely chain of events, and Hattori therefore considered the
idea of a population migration. However, as the investigation of the Japanese
dialects progressed, Tōkyō type tone was also found in a pocket of isolated rural
villages in Totsukawa, within the region with Kyōto type tone itself. And although
the parts of Shikoku nearest to Kyōto had a Kyōto type tone system, in the
southwestern corner there was Tōkyō type tone again.
In the end Hattori concluded, very reluctantly, that Kyōto type tone was older,
but he added that from the point of view of geographical distribution the Tōkyō type
appeared to be older. (And he never ruled out the possibility that the Tōkyō type
tone system was older.)
As tentative as his conclusions had been, they quickly became established dogma
to the scholars who continued his work. The assumption that the Tōkyō type tone
systems had somehow developed from an original Kyōto type tone system became
the basis of all subsequent work.
Hattori tried to change his original conclusions in his article ‘Genshi Nihon-go
no akusento’ (1951). In this article he concluded that neither the Kyōto type, nor the
Tōkyō type tone system was older, but that the proto-Japanese system had been a
combination of elements of both. (See section 2.5. It is also this article in which
Hattori proposed the idea that Middle Japanese included /M/ tones, which became
the basis of Hayata’s reconstruction of the history of the Japanese tone system.) This
system then split into the tone system of the Middle Japanese dialect of Kyōto on the
one hand, and into a precursor of the Tōkyō type tone system on the other hand.
Hattori’s revised theory has been largely ignored, and it has to be said that it did
not constitute much of a solution, as the problem of identical changes occurring
independently in widely separated areas remained. The unresolved geographical
problem was simply ignored in the reconstruction of dialect history. The modern
evidence was adjusted in various ways to fit the accepted interpretation of the
written record. As more was learned about the dialects, and other islands of Tōkyō
type tone were discovered, the more improbable and strained the explanations had to
become.
Finally it was Kindaichi Haruhiko who, in the 1950’s, found a way to rationalise
the peculiar geographical distribution of the prosodic types. The development of his
ideas is closely related to a famous academic feud that took place in Japanese
linguistics at the time.
2.4.2 The dialect area theory and the circle theory
The factions in the dispute were two schools of dialectology with a very different
approach to linguistic change. One of the schools was that of the mainstream of
traditional Japanese language study, while the other was related to the field of
folklore.
The leader of the first school was Tōjō Misao (1884–1966), who had made a
general description of the dialect situation in Japan. He had classified the regional
varieties of speech into dialect areas and had called his model the ‘dialect area
theory’. Tōjō’s branching classification of large areas into increasingly smaller areas
could be equated to the kind of family tree commonly used in comparative
linguistics to describe historical relationships. The way the diagram branched
represented an idealization of how the dialects had actually separated historically.
Yanagita Kunio (1875–1962) on the other hand, was Japan’s foremost
ethnologist and folklorist. For him, the story of Japan’s cultural history was one of
fashion and innovation in the capital, which gradually spread into the countryside,
complicating an originally simple culture. The new drove the old out of the
mainstream, but the old was often preserved in rural communities far from the
capital, and it was there that the social and religious traditions of an earlier time
survived. The proof that Yanagita offered for this view came from his study of the
Japanese dialects. He introduced the idea that new words had radiated from around
the great urban and cultural centre of Japan, the old capital of Kyōto, in his study of
the geographical distribution of the dialect words for ‘snail’ (1927).
The distribution of the different words for ‘snail’ showed how older forms
tended to be preserved in remote places. Yanagita compared the situation to the
ripples that a stone creates when thrown into a pond. As each new word is coined in
the centre of culture it spreads out in a circle, gradually displacing the older forms
into ever-remoter areas.
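The ripple image can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not Yanagita's own formalisation: the places lie on a single line with the cultural centre at position 0, every word travels outward one step per time unit, and the three words and their coinage times are hypothetical labels. The function name `ripple_distribution` is likewise introduced here for illustration only.

```python
def ripple_distribution(coin_times, now, radius):
    """Return the word in use at each position from -radius to +radius.

    coin_times: list of (word, time coined in the centre), oldest first.
    Assumption: each word travels one step outward per time unit.
    """
    layout = []
    for position in range(-radius, radius + 1):
        distance = abs(position)
        current = None
        for word, coined in coin_times:
            # A word has reached this place once enough time has passed.
            if coined + distance <= now:
                current = word  # later coinages displace earlier ones
        layout.append(current)
    return layout


# Hypothetical coinages: three successive words, labelled only by age.
words = [("old", 0), ("middle", 2), ("new", 4)]
result = ripple_distribution(words, now=6, radius=5)
print(result)
# ['old', 'middle', 'middle', 'new', 'new', 'new', 'new', 'new',
#  'middle', 'middle', 'old']
```

Under these assumptions the newest word occupies the centre, while the same older word survives at both edges without any contact between them. This is the kind of distribution that Yanagita read as evidence for the antiquity of peripheral forms.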
Yanagita’s ideas were influenced by the work of French dialectologists to which
he had been exposed during his study in Geneva. Their ideas had developed out of
the work on the great French dialect atlas, which showed the distribution of the
dialect variants of single words or phrases throughout the French speaking area of
Europe. The distributions of these words had been produced by society as well as
geography and in this sense every word had its own history. They saw their
approach in part as a reaction against the ‘rigidity of the sound laws’ of comparative
linguistics.
Yanagita’s dialectology eventually came into conflict with Tōjō’s dialect area
hypothesis, which was based on the principles of comparative linguistics. Yanagita’s
main objection however, was against the idea that dialect areas seemed to imply
many separate spheres of cultural influence. That might be the case if Japan was
made up of a number of relatively independent cultural and linguistic bases, such as
the various countries of Europe, but anyone could plainly see that Japan was a single
entity.
Tōjō asserted that a dialect was first and foremost the linguistic structure of a
particular geographical area taken as a whole. Rather than looking only at individual
words as Yanagita had done, one had to look at the totality of the structure. The
branching of proto-Japanese into the various dialects through systematic structural
changes was the historical mechanism that had produced the variety of the modern
language.
Tōjō did propose that the two theories could coexist without inconsistencies, but
Yanagita’s model of change was only of peripheral importance; new words might
spread from Kyōto, but they would not produce structural changes, and they would
therefore be nothing more than dialect loans.
2.4.3 The resolution of the dispute
It was Kindaichi who popularised Tōjō’s ideas in a simple formula in which the
circle theory accounts for lexical changes, while the dialect area theory accounts for
changes in grammar and phonology (Kindaichi, 1954:70). Kindaichi’s formulation
remains today the accepted resolution of the dispute.
Yanagita’s idea that all linguistic change radiates from a single centre of culture
– which basically denies the existence of regional dialect splits – is far too simplistic.
On the other hand, Tōjō’s idea that linguistic radiation only applies to lexical items
and not to grammar or phonology, is simply not correct: All linguistic change,
structural or otherwise, originates as variants in the speech of a few. These features
then spread from class to class and from community to community, in ways and at
speeds that depend on the socio-cultural links between these groups. At the time
however, Tōjō’s arguments carried the day, and apart from that, Kindaichi had a
strong reason to endorse Tōjō’s view:
It was hard to miss how closely the distribution of the Kyōto type tone system
resembled the geographical areas where people used modern Kyōto lexical items,
which, everyone agreed, had originated as innovations in the capital.11 This would
11 The most famous example of this kind of circular distribution of lexical items is still Yanagita’s
study of the distribution of the different words for ‘snail’ in the Japanese dialects. The
similarity with the distribution of the different tone systems in Japan is striking. We can even
see that the modern Kyōto word for ‘snail’ denden musi, has spread all around the area on the
Kii peninsula where Kyōto type tone occurs, leaving an older word (katatumuri) spoken in the
island of Tōkyō type tone in Totsukawa (Ramsey, 1982: 117).
Another example is the distribution of the words aho and baka (‘stupid’) in the modern
Japanese dialects. In 1990, in a project on Japanese television that was prompted by a question
from a viewer, people from all over Japan were asked to phone in the local variants of the word
for ‘stupid’. This project resulted in the publication of a book and a map (Zenkoku aho-baka
bunpu-zu ‘Map of the countrywide distribution of ‘dumb’ and ‘stupid’’). I was delighted to see
that the distribution of the modern Kyōto word for ‘stupid’ aho, which has replaced the older
seem to support Yanagita’s position. However, Kindaichi could not take this parallel
at face value: The standard theory said that the tone system of Kyōto was the oldest
in Japan, and by the time that the dispute between the circle theory and the dialect
area theory took place this idea was so well established that the possibility of it
being wrong was not even considered. Kindaichi therefore drew the only conclusion
that could be made, and that had been suggested by Tōjō; namely that Yanagita’s
hypothesis only applied to lexical items and not to structure.
2.4.4 Turning Yanagita’s circles inside-out
This still left Kindaichi with the problem of how to account for the geographical
distribution of something as structural as the Japanese tone systems. For even if one
does deny the idea that structural change radiates out, one is still left with a
distribution of the different Japanese tone systems on a map of Japan that looks like
concentric circles, with the centre of the compass placed in the Kyōto area.
Instead of playing down the concentric distribution of the Japanese tone systems,
Kindaichi acknowledged it, and made it a central part of his view on the direction of
linguistic change in Japan. Kindaichi also draws circles, but he stresses that his
circles are exactly the reverse of Yanagita’s circles (1975: 25–26):
As opposed to Yanagita, I take the position that the conservative dialects in
Japan are the dialects of the inner circle, and that the dialects that take the
lead and have undergone major changes are the dialects of the outer circle.
Yanagita has said that in the outer circle, the peripheral region, dialects have
preserved the appearance of ancient times and are therefore highly valuable
for research on the history of the national language. But my position is that it
is in the speech of this fringe area that one feels a breath of freedom and
freshness. If one would like to know what direction Japanese would take if it
were left alone in a natural state, there are many things that can be learned
from the dialects of the outer circle.
The reason, according to Kindaichi, is as follows: While lexical items spread out as
novelties from the centre of culture, changes in grammar and pronunciation happen
first in the periphery because people outside of the cultural centre are not subject to
as many social restrictions, and are therefore more relaxed about careless speech. An
older Japan is therefore not to be found in distant villages, as Yanagita believed,
but in the heartland of Japanese civilization itself.
Kindaichi thinks for instance, that the dialects of the Ryūkyūs and southwest
Kyūshū, like the rest of Japan, originally had a Kyōto type tone system.
Subsequently they all passed through the intermediate Tōkyō type stage (preserved
on the island of Tokunoshima) on to their present-day word-tone systems. The
word baka, largely coincides with the area with Kyōto type tone, while the area where the word
baka has been preserved coincides with the area with Tōkyō type tone.
Tōkyō type dialects of northeast Kyūshū, however, by virtue of their relative
proximity to the inner circle (i.e. Kyōto), have only progressed so far as the
intermediate stage (Kindaichi, 1975:129–159).
The idea that Kyōto was immune to the development of Tōkyō type tone – which
happened spontaneously all over the rest of Japan – has been widely accepted,
coloring the interpretation of the dialect data. In the Gendai Nihon-go hōgen dai-
jiten (1992, 217–218) for instance, the pocket of Tōkyō type tone that can be found
in the Totsukawa region is treated as follows: “Lying so deep in the mountains that
until recent years the Japanese wolf reportedly still survived in this region,
transportation has always been difficult, and it is therefore thought that unique
developments took place in the language.”12
Even Kobayashi (1975), who explicitly stated that she intended to reconstruct the
proto-Japanese tone system solely on the basis of the present-day dialects, proposes
many coincidental independent developments in the Tōkyō type dialects. This is a
consequence of the fact that in reality her proto-Japanese reconstruction was heavily
based on the tone system of Middle Japanese in Kindaichi’s reconstruction.
Although Kindaichi claims that the central Japanese area is conservative in
general, it will be no surprise that in reality his reversed-circle-theory only applies to
the distribution of the Japanese tone systems for which it was devised. I will leave
the final characterisation of Kindaichi’s brand of dialectology to Ramsey (1982:
120–121):
Kindaichi has constructed a unique kind of dialectology in which everything
works backwards. For Kindaichi, changes do not spread out from a centre of
culture, they press in upon it from the hinterlands. Changes do not travel
along lines of communication, but in areas where communication is poorest.
Dialect islands, normally so precious to the linguist for establishing the
direction of change, are dismissed in Kindaichi’s dialectology as the sporadic
occurrences of innovation. And features that are shared by widely separated
dialects do not, in Kindaichi’s formulation, show what the language was like
in the past, but what it will be like in the future, once these innovations have
crept all the way to the capital.
2.5 Other theories that are based on the standard reconstruction
The tone system of proto-Japanese must have been similar enough to the Tōkyō type
tone systems to explain the predominance and non-contiguous distribution of this
tonal type in the Japanese islands. As we have seen however, the Middle Japanese
12 The irony is that Tōkyō type tone can hardly be labeled unique, as it is found all over Japan,
whereas the Japanese wolf is not. Formally one could even argue that there is more reason to
assume a unique development as the origin of the Japanese wolf in this region than of the tone
system.
tone system in the standard reconstruction does not resemble the tone system of
Tōkyō at all. (It is, in fact, even more unlike the tone system of Tōkyō than the tone
system of modern Kyōto, as it does not even contain /H/ tones in tone classes that
contain /H/ tones in both Tōkyō and Kyōto today.)
Kindaichi nonetheless chose to reconstruct a proto-Japanese tone system that is
similar to Middle Japanese, which compelled him to invent a theory that justified
ignoring the evidence from dialect geography.
Despite the wide acceptance of this theory, a number of scholars have tried to
formulate alternatives to it. Because the evidence from dialect
geography is so hard to reconcile with the standard reconstruction of the Middle
Japanese tone system, the idea that this tone system was close to, or almost identical
to, the tone system of proto-Japanese has been abandoned in these theories, even
though the tonal distinctions of Middle Japanese can account for practically all the
distinctions in the modern dialects.
I have already referred to Hattori’s reconstruction of a proto-Japanese tone
system that resembled the Kyōto-like tone system of Middle Japanese, but that
included elements of the Tōkyō type tone systems as well (Hattori, 1951). Hattori’s
goal was to reconstruct a proto-Japanese tone system that incorporated elements of
both, to the extent that each could have developed from it in a natural way. (In this
idea therefore, the Middle Japanese tone system represents an earlier stage in the
history of the Kyōto type dialects, and not proto-Japanese.)
12 Hattori’s reconstruction (1951) of the tone system of proto-Japanese
and the developments to the modern dialects
Kyōto   Middle Japanese   Proto-Japanese   Transitional period   Tōkyō
2.1 - < - < - > - > -
2.2 '- < - < - > - > '-
2.3 '- < - < - > - > '-
2.4 '- < - < -~ > - > '-
-
2.5 ''- < - < -~ > - > '-
-
Hattori’s proto-Japanese tone system in (12) is very similar to the tone system of
Middle Japanese, if we ignore the syllables with /F/ tone that he has added. 13 It
seems that he has taken the standard reconstruction of the Middle Japanese tone
system as a basis for his proto-Japanese tone system, and has tried to represent the
13 I am surprised by the fact that Hattori presents the tone of the particles ha and ga after tone
classes 2.2 and 2.5 in Middle Japanese (and therefore also in proto-Japanese) as /L/ instead of
/H/, as in the oldest material these particles are invariably marked with a shang tone dot.
location of the /H/ tone in the Tōkyō type dialects by adding a falling tone /F/ in the
appropriate location. Even in this reconstruction however, the tone system of proto-
Japanese is so unlike the Tōkyō type tone systems that many coincidental
developments have to be taken for granted, so that the geographical dilemma is not
solved.14
In other, more recent work on the history of Japanese tone (cf. Matsumori 1993,
1998a, 2001) the idea that the tone system of proto-Japanese was close to or almost
identical to the tone system of Middle Japanese has also been abandoned. The
Middle Japanese tone system represents an earlier stage in the history of the Kyōto
type dialects, and not proto-Japanese, and these works set up proto-Japanese tone
systems that are completely unlike Middle Japanese.
See for instance Matsumori’s reconstruction of the tones of monosyllabic and
disyllabic nouns in (13). (Proto-Mainland Japanese here means the ancestral tone
system of the Japanese dialects excluding the dialects of the Ryūkyūs.) This means
that the development from proto-Mainland Japanese to the Tōkyō type dialects must
have been as in (14).
13 Matsumori’s reconstruction of the tone system of proto-Japanese
and the developments to the modern Kyōto dialect
Proto-Japanese   Proto-Mainland Japanese   Middle Japanese   Kyōto
1.1 , - > , - > - > :-
1.2 , - > , - > - > :'-
1.3 , - > , - > - > ':-
2.1 , - > , - > - > -
2.2 > > - > '-
2.3 > > - > '-
2.4 > > - > '-
2.5 > > - > ''-
In these alternatives to Kindaichi’s theory, the tonal changes that must have occurred
in order to arrive at the tone systems of the various modern dialects are exceedingly
complex and – as far as I can see – unmotivated, and the problems that plague
Kindaichi’s theory are by no means solved. In all of these theories the development
of Tōkyō type tone must have happened many times independently, and none of the
reconstructed tone systems comes even close to resembling the tone system of
14 A later theory proposed by Hattori, in which vowel length distinctions in proto-Japanese play a
role in the historical development of Japanese tone is based for a large part on evidence from
the dialects of the Ryūkyūs. This theory is discussed in chapter 9, as are other theories that deal
specifically with the dialects of the Ryūkyūs (cf. Vovin 1993b, Shimabukuro 1997, 2002,
Matsumori, 1998b).
Tōkyō enough to explain why it is this particular tonal type that dominates the
Japanese islands in a non-contiguous geographical distribution.
14 The development from Matsumori’s proto-Mainland Japanese
to the dialect of Tōkyō
Proto-Mainland Japanese   Tōkyō
1.1 , - > -
1.2 , - > -
1.3 , - > '-
2.1 , - > -
2.2 > '-
2.3 > '-
2.4 > '-
2.5 > '-
Since there is no need to assume multiple independent parallel developments in the
dialects that surround Kyōto if Kyōto type tone is regarded as an innovation, two
scholars have proposed the obvious alternative to Kindaichi’s view, namely the idea
that Kyōto type tone, and not Tōkyō type tone, is an innovation. The first of these
was Ōhara Takamichi, whose theory will be introduced in the next section. The second
is Samuel Robert Ramsey, whose theory will be introduced in the next chapter.
2.5.1 Ōhara’s theory: Kyōto type tone as an innovation
In 1951, Ōhara Takamichi suggested that Japanese may have had a Tōkyō type tone
system at least up to the 8th century, and that a leftward tone shift had taken place in
Kyōto and surrounding areas after the 8th century, but before the time of the earliest
tone dot attestations in the 11th century. Ōhara’s idea was based on the fact that in
the Koji-ki 古事記 (712), Japan’s oldest national history, tone notes are added after
certain words. While these words are marked with ping tone dots in 11th century
Ruiju myōgi-shō, the tone notes in the Koji-ki consist of the characters 去 (which
occurs only once, and represents the Middle Chinese qu tone) and 上 (which
represents the Middle Chinese shang tone).
These words form part of the names of certain persons, gods and places in the
Koji-ki. Although the list in (15) is not exhaustive, such tone notes have only been
added to a small percentage of the names in the Koji-ki. Keichū (1640–1701) and
Motoori Norinaga (1730–1801) therefore explained them as indicating aberrant
pronunciations. Keichū and Motoori’s explanation is still the majority view, and
these tone notes are therefore not commonly regarded as evidence for a Tōkyō-like
tone system in central Japan in the Nara period (700–800).
15 Comparison of the tone notes in the Koji-ki with the tones
in Ruiju myōgi-shō and the modern dialects
                               Koji-ki   Ruiju myōgi-shō   Kyōto   Tōkyō
2.3   kumo ‘cloud’             上        平平              '       '
2.3   yama ‘mountain’          上        平平              '       '
2.3   wata ‘cotton’            上        平平              '       '
2.3   sima ‘island’            上        平平              '       '
2.3   asi ‘foot’               上        平平              '       '
2.3   mimi ‘ear’               上        平平              '       '
1.3   te ‘hand’                上        平                '       '
1.3   ni ‘earth, red clay’     去        平                x       x
An argument against Ōhara’s interpretation can be found in the research of
Takayama Michiaki (1981) on the writing system used in the Nihon shoki 日本書紀
(720). The Man’yōgana used in the Nihon shoki are based on the Late Middle
Chinese standard language on which the Kan-on character reading tradition and the
value of the tone dots are also based.15 According to Takayama the Chinese
characters used as Man’yōgana in some of the poems in the Nihon shoki represent not
only the vowels and consonants of Japanese, but also to a certain degree (and
depending on which part of the Nihon shoki) the Japanese tones. As far as can be
determined based on this material, the tone system of the Nihon shoki was not radically
different from the tone system of later works like Ruiju myōgi-shō. As the Nihon
shoki expresses the language of the same area (central Japan) of around the same
time (8th century) as the Koji-ki, this supports the idea that the tone notes added to
the Koji-ki indicate aberrant pronunciations.
Takayama’s discovery also means that if a major tone shift such as Ōhara
proposed took place in Kyōto, this cannot have happened after the 8th century but
must have happened earlier, before the Nara period. This idea, however, is
undermined by the observation that the tone of frequently used, thoroughly Japanized
15 In the Go-on type Man’yō-gana of the Koji-ki 古事記 and Man’yō-shū 万葉集 no such system
can be found. In the Koji-ki for instance, for every syllable only one Man’yō-gana was in use.
There is one reported example of a text in which phonograms that were read according to the
Go-on reading tradition differentiated between the tone of the Japanese syllables they
transcribed, just as in the poems of the Nihon shoki: Following up on a remark by Ōya Tōru
(1850-1928), Kindaichi (1947) discovered this unique spelling technique in the Man’yōgana
used in Konkōmyō saishōō-kyō ongi 金光明最勝王経音義 of 1079. The choice of the Man’yō-
gana seems to have been based on the Go-on tone value in these cases. (Apparently the tone
dots of this manuscript are not as old as the Man’yō-gana spellings.)
Chinese loanwords that were introduced in the Japanese language in the 6th, 7th and
8th centuries, splits up along the same dialectal lines as the tone of native Japanese
words.
Words that are not part of everyday speech, as well as many neologisms coined
from character readings, usually have [HL] pitch in all three major tonal types, i.e.
Tōkyō 2.4/5 ', Kyōto 2.2/3 ' Kagoshima A . Okumura (1963), who
discovered this correlation between the frequency with which a loanword is used
and its reflexes in the modern dialects, called this [HL] pitch the ‘basic’ tone pattern.
A number of more frequently used, thoroughly Japanized loanwords on the other
hand, have reflexes in the different dialects that correspond to one of the tone classes
of native Japanese nouns (16). Okumura suggested that the tone of this type of
loanword was inherited from proto-Japanese and that the tone of the loanword in
Middle Chinese must have determined the tone in proto-Japanese to a certain
degree.16
16 Comparison of the tone of Chinese loanwords in Middle Chinese
and the modern Japanese dialects
      Loanword                  MC tone   Tōkyō   Kyōto   Kagoshima
2.1   kyoku ‘melody’            ru                        A
2.1   teki ‘enemy’              ru                        A
2.1   boo ‘stick’               shang                     A
2.3   kiku ‘chrysanthemum’      ru        '       '       B
2.3   doku ‘poison’             ru        '       '       B
2.3   niku ‘meat’               ru        '       '       B
2.4   miso ‘bean paste’         qu-ping   '       '       B
2.4   dai ‘platform’            ping      '       '       B
2.4   kai ‘meeting’             qu        '       '       B
In Korean and Vietnamese, the Middle Chinese tones are still clearly reflected in the
present day tones of Chinese loanwords. In Japanese, due to the confusing difference
between the tones of the Go-on and the Kan-on character reading traditions (see
chapter 4 and section 11.1 of part II), there is no clear correlation between the
original tone class to which a loanword belonged in Middle Chinese and the tone
class to which it belongs in modern Japanese.17
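The two observations about the loanwords in (16) can be checked mechanically. The following is a minimal sketch that uses only the columns of (16) that are legible here (tone class, Middle Chinese tone and Kagoshima reflex); the helper `group` and the variable names are introduced for illustration and are not part of Okumura's analysis.

```python
from collections import defaultdict

# Rows copied from (16): (tone class, loanword, Middle Chinese tone, Kagoshima).
LOANWORDS = [
    ("2.1", "kyoku", "ru", "A"),
    ("2.1", "teki", "ru", "A"),
    ("2.1", "boo", "shang", "A"),
    ("2.3", "kiku", "ru", "B"),
    ("2.3", "doku", "ru", "B"),
    ("2.3", "niku", "ru", "B"),
    ("2.4", "miso", "qu-ping", "B"),
    ("2.4", "dai", "ping", "B"),
    ("2.4", "kai", "qu", "B"),
]


def group(rows, key_index, value_index):
    """Collect the distinct values found for each key, as sorted lists."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[key_index]].add(row[value_index])
    return {key: sorted(values) for key, values in groups.items()}


# Each Japanese tone class has a single Kagoshima reflex ...
kagoshima_by_class = group(LOANWORDS, 0, 3)
# ... but a single Middle Chinese tone may be spread over several classes.
classes_by_mc_tone = group(LOANWORDS, 2, 0)

print(kagoshima_by_class)  # {'2.1': ['A'], '2.3': ['B'], '2.4': ['B']}
print(classes_by_mc_tone["ru"])  # ['2.1', '2.3']
```

For these nine words the tone class predicts the Kagoshima reflex without exception, as it does for native nouns, whereas the ru tone is divided between classes 2.1 and 2.3.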
16 The first person to remark on this phenomenon had been E.D. Polivanov in the 1920’s
(Kindaichi, 1984:13) and in an article in Rōmaji sekai as early as 1943, Kindaichi had
suggested the same, namely that a number of loanwords that entered the Japanese language
early on, still reflected the Middle Chinese tones. Apart from this article, the phenomenon
seems to have been little explored at the time. After the appearance of Nihon shisei kogi (1951),
Okumura (1955a, 1963) and Kindaichi (1980, 1984) devoted more research to this topic.
17 In the examples of loanwords in (16) for instance, kyoku and teki are Kan-on readings of ru
tone characters, while kiku, doku and niku are Go-on readings of ru tone characters, and as a
Even though the correlation between the original Middle Chinese tone category
of the character and the present-day tone in Japanese in (16) is complex, the fact that
these loanwords from Middle Chinese show the same correspondences between the
different dialects as native Japanese words means that the split between Tōkyō type
tone and Kyōto type tone in Japan must date from after the adoption of these words
into the spoken language, i.e. after the 8th century. This means that the idea of a split
between Tōkyō type and Kyōto type tone before the Nara period (which is the only
way in which Ōhara’s theory can be saved in light of Takayama’s discovery) is out
of the question.
2.6 Conclusion
The standard reconstruction of the historical development of the Japanese tone
system by Kindaichi, and the alternative theories mentioned above, have been one-
sidedly based on what is regarded as an indisputable interpretation of the written
record. All these theories are founded on the shaky basis of a 20th century
interpretation of the way in which the tones of 7th to 8th century Chinese were
applied by the Japanese to their own language in the 11th century.
Kindaichi tried to accommodate the Middle Japanese tone system that results
from this interpretation with the idea that in the geographical distribution of lexical
items the periphery preserves the oldest forms, but that such a rule does not apply to
the spread of phonological features.
The alternative theories all adhere to the standard interpretation too, which
severely complicates the reconstruction of a proto-Japanese tone system that forms a
suitable starting point for the development towards the modern tone systems and the
tone system of Middle Japanese. The result is that the proposed tonal changes are
exceedingly complex and not borne out by what are known to be natural
developments based on the observation of other tone languages.
Finally, because of Ōhara’s adherence to the standard reconstruction, he is forced
to place his proposed leftward tone shift in the Kyōto area in the period before the
tone dot attestations. We have seen that this is not possible, and apart from this,
Ōhara’s idea would only explain the geographical distribution of the tone systems.
His theory does not make the relationship between the tone system of proto-
Japanese and the tone system of Middle Japanese any more transparent, as he does
not question the standard interpretation of the tone dot material.
result they show different reflexes in the dialects. Dai and kai on the other hand, have identical
reflexes in Japanese, while their original tone in Chinese differs. This is because kai is the Kan-
on reading of a qu tone character, while dai is the Go-on reading of a ping tone character, and
in Go-on ping tone characters are read ‘in the reverse’ with a qu tone.
In the next chapter I will introduce the theory of the only scholar who has not
only proposed the idea that Kyōto type tone is the result of an innovation, but also
challenged the standard interpretation of the written record.
3 Ramsey’s reconstruction of the Middle Japanese
tone system
In 1979 Samuel Robert Ramsey proposed a theory in which he assumed (like Ōhara)
that a leftward tone shift had taken place in the Kyōto area. Ramsey, however, did
not place this shift in the distant, unrecorded past. According to him this shift had
taken place sometime after the 11th century, most likely in the 14th century, when the
use of the tone dots became confused and was abandoned.
In one important respect, Ramsey’s view is closer to Kindaichi’s view than the
theories of Ōhara and the others discussed in the previous chapter. In these theories
the idea that the tone system of Middle Japanese resembled the tone system of
proto-Japanese had been abandoned because of the difficulties in relating the tone
system of Middle Japanese to the tone systems of the modern dialects. Both Ramsey and
Kindaichi, however, regard the tone system of Middle Japanese as very close to the
tone system of proto-Japanese.
It will be clear that Ramsey’s reconstruction of the Middle Japanese tone system
must be fundamentally different from the standard reconstruction, and it is:
According to Ramsey, the standard interpretation of the tone value of the tone dots
has to be exactly reversed.
As the ping and the shang tone dots are the ones that are most consistently used
in Japanese, Ramsey limited his theory to these two tones. He argued that, as we do
not know the exact value of the Late Middle Chinese tones, they mean no more than
would x or y when used to indicate the tones of Middle Japanese.
He noticed that the most regular correspondence between Middle Japanese and
any of the modern Japanese dialects was not between Middle Japanese and Kyōto
but between Middle Japanese and Tōkyō: A ping-shang sequence in Middle
Japanese corresponds regularly to a pitch fall in the Tōkyō type dialects, and he
concluded therefore that the ping tone dot must have expressed /H/ tone and the
shang tone dot /L/ tone.1
Ramsey first presented his theory in the article ‘The Old Kyōto dialect and the
historical development of Japanese accent’ (1979).2 A year later an extended version
1 It is important to note that this regular correspondence includes many ping-shang sequences in
which the ping tone falls on the last syllable of a word and the shang tone on the attached case
particle. The tone of the case particles in Middle Japanese forms an integral part of Ramsey’s
theory.
2 At the time, the periods into which the history of the Japanese language is divided were usually
named Old Japanese (700-800), Late Old Japanese (800-1200) and Middle Japanese (1200-
1600), which is why Ramsey used the term ‘Old Kyōto dialect’. Nowadays the names Old
Japanese (700-800), Early Middle Japanese (800-1200) and Late Middle Japanese (1200-1600)
of the original article was published in Japanese, translated by Tokugawa
Munemasa: ‘Nihon-go akusento no rekishiteki henka’ (1980). In 1982 Ramsey
published an article in which he concentrated on a sketch of the scientific climate in
which the standard theory had developed, ‘Language change in Japan and the
Odyssey of a Teisetsu’. (This article has been quoted at length in the previous
chapter.)
Although Ramsey did not address the question of the tone values of the
rarely used light ping and qu tone dots in Japan, it follows from his
reconstruction of /H/ for the ping tone and /L/ for the shang tone that the qu tone
must have expressed a falling tone contour /F/ and the slightly raised light ping dot
must have expressed a rising tone contour /R/. (As we have seen in sections 1.2,
1.3.1 and 1.3.2, the fact that the qu tone must have indicated a ping-shang sequence
and the light ping tone a shang-ping sequence can be inferred from such things as
double attestations, the tone of attached particles and the context in which these
tones occurred in the Middle Japanese material.)
When the Middle Japanese data are reinterpreted following Ramsey’s
interpretation of the value of the tone dots, the result is a tone pattern which, on a
crucial point (the location of a transition from /H/ to /L/), agrees with the modern
Tōkyō type dialects. This means that in this respect at least, the original tone pattern
of the Kyōto area must have been similar to that of the present-day Tōkyō type
dialects.
The idea that the present-day Kyōto type tone system developed as the result of
changes that took place some time after the 11th century offers a natural explanation
for the geographical distribution of the different tone systems in Japan, which is
such a problem if one adheres to the standard reconstruction of the Middle Japanese
tone system.
3.1 Arguments based on the comparative method
In the next couple of sections, I will present a number of arguments in favor of
Ramsey’s theory that are based on a comparison of the Middle Japanese tone system
with the modern dialect data. The first three arguments already formed part of
Ramsey’s original theory, but I have expanded on them in the following way:
First of all, I have included a comparison with data from the dialect of Nozaki.
Secondly, I have decided to include all nouns of one, two and three syllables, while
Ramsey reasoned from a more limited number of tone classes.3 (The developments
in the monosyllabic nouns will be treated in section 3.1.5.)
are more common.
3 As mentioned, Ramsey decided not to address the complications involved in the rarely used
light ping and qu tones, and because of this he excluded the special developments in tone
classes 1.2, 1.3b and 2.5 from the discussion. As to the longer nouns, in the first article in
which he presented his theory, published in English (1979) he only adduced the two largest and
3.1.1 Ramsey’s Middle Japanese tone system
resembles the tone system of Tōkyō
Apart from the geographical distribution of the different tone systems, the first
strong point of Ramsey’s theory is the fact that the correspondence between
Ramsey’s Middle Japanese and the tone systems of the Tōkyō type dialects is so
straightforward: The syllable that immediately precedes a drop in pitch in Middle
Japanese, is the syllable that has the /H/ tone in these dialects. (See the comparison
between the tones of Middle Japanese and the dialect of Tōkyō below.) In other
words, the Tōkyō type dialects have preserved the location of a transition from /H/
tone to /L/ tone in Middle Japanese.
The average present-day Tōkyō type dialect allows only one /H/ tone per word.
In the change from the Middle Japanese tone system to the tone systems of the
modern Tōkyō type dialects the number of /H/ tones per word was reduced: The /H/
tones of Middle Japanese only survived as /H/ (vs. Ø) tone when immediately
followed by /L/ tone in Middle Japanese. In tone classes that included two non-
consecutive /H/ tones in Middle Japanese (tone class 3.7) or /H/ followed by /R/
tone (tone class 2.5) only the first /H/ before /L/ survived.
There are, however, a number of villages in Japan where the two non-consecutive
/H/ tones, as well as the /R/ tone on the final syllable of tone class 2.5 have been
preserved: In the Tōkyō type dialect of Nozaki and a number of other villages on
Noto Island the tone pattern that is reconstructed for classes 2.5 and 3.7 in Ramsey’s
Middle Japanese can still be found.
The Nozaki data are from Kindaichi’s article of 1954, which will be discussed in
more detail in section 6.2. Although Kindaichi interprets the important Nozaki data
very differently, I consider the Nozaki dialect as having preserved one of the most
archaic tone systems of present-day Japan.
1 Comparison of Ramsey’s reconstruction
with the tone systems of the modern dialects
        Middle       Nozaki       Tōkyō
        Japanese     (Nairin)     (Chūrin)
2.1     /LL-L/       /ØØ-Ø/       /ØØ-Ø/
2.2     /LH-L/       /ØH-Ø/       /ØH-Ø/
2.3     /HH-L/       /ØH-Ø/       /ØH-Ø/
2.4     /HL-L/       /HØ-Ø/       /HØ-Ø/
2.5     /HR-L/⁴      /HR-Ø/       /HØ-Ø/
most regular of the trisyllabic tone classes (tone classes 3.1 and 3.4), while in the later Japanese
version (1980) tone classes 3.5 and 3.7 were also included. A discussion of the developments in
tone classes 3.2, 3.3 and 3.6 has been added by me.
4 Some varieties of Middle Japanese have - tone for tone class 2.5, which can be analyzed
        Middle       Nozaki       Tōkyō
        Japanese     (Nairin)     (Chūrin)
3.1     /LLL-L/      /ØØØ-Ø/      /ØØØ-Ø/
3.2     /LLH-L/      /ØØH-Ø/      /ØØH-Ø/
3.3⁵    /LHH-L/      /ØHØ-Ø/      /ØHØ-Ø/
3.4     /HHH-L/      /ØØH-Ø/      /ØØH-Ø/
3.5     /HHL-L/      /ØHØ-Ø/      /ØHØ-Ø/
3.6     /HLL-L/      /HØØ-Ø/      /HØØ-Ø/
3.7     /HLH-L/      /HØH-Ø/      /HØØ-Ø/
In the previous chapter, we have seen that in the standard theory the Tōkyō type
tone systems can only have developed from a relatively modern Kyōto type tone
system. In case of Ramsey’s theory the opposite is true: The Kyōto type tone
systems can only have developed from an archaic Tōkyō type tone system such as
that of Nozaki, in which tone classes 2.5 and 3.7 were still distinguished.
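The reduction from Ramsey’s Middle Japanese to the Tōkyō type pattern can be illustrated with a minimal Python sketch. It is an illustration only and not part of the theory itself: it applies the rule stated above (only the first /H/ that is immediately followed by a low tone survives as /H/, all other tones become Ø) to the Middle Japanese forms in (1) and compares the outcome with the Tōkyō column. The function name `reduce_to_tokyo`, the encoding with one letter per syllable, and the treatment of /R/ as a tone that begins low are assumptions made for this illustration. Tone class 3.3 is left out, as its developments are treated separately in section 4.5.

```python
# Ramsey's Middle Japanese values from (1); one letter per syllable.
MIDDLE_JAPANESE = {
    "2.1": "LL", "2.2": "LH", "2.3": "HH", "2.4": "HL", "2.5": "HR",
    "3.1": "LLL", "3.2": "LLH", "3.4": "HHH", "3.5": "HHL",
    "3.6": "HLL", "3.7": "HLH",
}

# Tōkyō (Chūrin) column of (1), noun plus monosyllabic case particle.
TOKYO = {
    "2.1": "ØØ-Ø", "2.2": "ØH-Ø", "2.3": "ØH-Ø", "2.4": "HØ-Ø", "2.5": "HØ-Ø",
    "3.1": "ØØØ-Ø", "3.2": "ØØH-Ø", "3.4": "ØØH-Ø", "3.5": "ØHØ-Ø",
    "3.6": "HØØ-Ø", "3.7": "HØØ-Ø",
}


def reduce_to_tokyo(noun: str, particle: str = "L") -> str:
    """Keep only the first /H/ that directly precedes a low tone.

    Assumption of this sketch: /R/ (rising) starts low, so an /H/ directly
    before /R/ counts as an /H/ before a low tone (needed for class 2.5).
    """
    tones = noun + particle
    out = ["Ø"] * len(tones)
    for i in range(len(noun)):
        if tones[i] == "H" and tones[i + 1] in ("L", "R"):
            out[i] = "H"
            break  # later /H/ tones are reduced to Ø
    return "".join(out[:len(noun)]) + "-" + "".join(out[len(noun):])


for tone_class, noun in MIDDLE_JAPANESE.items():
    predicted = reduce_to_tokyo(noun)
    status = "matches" if predicted == TOKYO[tone_class] else "differs from"
    print(f"{tone_class}: /{noun}-L/ > /{predicted}/ ({status} Tōkyō)")
```

Under these assumptions the sketch reproduces the Tōkyō column for every class it includes. The Nozaki column differs only in classes 2.5 and 3.7, where the second /H/ and the final /R/ have been preserved.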
3.1.2 Ramsey’s Middle Japanese tone system is a suitable ancestor
of the Kyōto type tone systems
The transition from /H/ to /L/ tone in Middle Japanese was still in the Tōkyō type
location, which means that the leftward tone shift in the Kyōto area must have taken
place some time afterwards. It should therefore be possible to predict the tone of the
modern Kyōto type dialects by shifting the tones of Middle Japanese as
reconstructed by Ramsey one syllable towards the left.
As it turns out, it is indeed possible to predict the tone of the modern Kyōto type
dialects, and much better than was the case with the traditional interpretation of the
Middle Japanese tone dots.
As shown in (2) for the dialect of Kōchi, /H/ before /L/ in Middle Japanese has
been preserved as /H/ tone, and shifted one syllable towards the left.
2 The development from Ramsey’s reconstruction to the tone system of Kōchi
Middle Japanese Kōchi
2.1 - > - /ØØ-Ø/
2.2 - > '- /HØ-Ø/
2.3 - > '- /HØ-Ø/
2.4 - > '- /LØ-Ø/
2.5 - > ''- /LH-Ø/
Middle Japanese Kōchi
as /LR-L/ where the final /R/ tone of class 2.5 is realized as a [L] tone with a floating [H] tone
on the attached case particle. The dialect of Nozaki has gone through a similar development.
5 The developments in tone class 3.3 are the most complicated and will be treated separately in
section 4.5.
3.1 - > - /ØØØ-Ø/
3.2 - > ''- /LHØ-Ø/
3.3 - > '- /HØØ-Ø/
3.4 - > '- /ØHØ-Ø/
3.5 - > '- /HØØ-Ø/
3.6 - > '- /LØØ-Ø/
3.7 - > ''- /LHØ-Ø/
3.1.3 /H/ tones in tone classes 2.3, 3.4 and 3.5 were already present
in Ramsey’s Middle Japanese
With Ramsey’s reconstruction it is not strange that tone classes 2.3, 3.4 and 3.5 have
/H/ tone in the modern dialects. All /H/ tones that have to be reconstructed for proto-
Japanese on the basis of a comparison of the modern dialects can be found in
Ramsey’s reconstruction of the Middle Japanese tone system. This means that one of
the most serious problems of the standard theory is solved.
3 The developments in classes 2.3, 3.4 and 3.5
Middle Tōkyō Kōchi
Japanese (all types)
2.3 - '- '-
3.4 - '- '-
3.5 - '- '-
3.1.4 /H/ tone spreading onto the particles after /LH/ tone:
Gairin type tone as a natural development
The interesting point is that Ramsey’s reconstruction not only explains the presence
of /H/ tone in tone classes 2.3, 3.4 and 3.5 in the modern dialects, but also the
absence of /H/ tone in tone classes 2.2 and 3.2 in the Gairin type dialects. With
Ramsey’s theory it is – for the first time – possible to derive both the Gairin type
dialects and the Nairin/Chūrin type dialects from Middle Japanese in a
plausible way.
It is well-known that in a LHL environment, H tone is more likely to spread to
the following L tone than in a HHL environment (Austen, 1974, Hyman, 1978). In
other words, LH (rising) tone contours have a stronger tendency to spread to the
right than level H tone contours. This difference lies at the root of the development
of the Gairin type tone system, as it explains why tone classes 2.2 and 3.2 have Ø
tone in the modern Gairin type dialects:
In the Gairin type dialects, /H/ tone spreading caused the /L/ tone of the
monosyllabic case particles to be lost after nouns that ended in /LH/ tone, so that
there was no longer a drop in pitch after the final /H/ tone of the noun. When the
number of /H/ tones per word became restricted, during the development towards the
modern restricted tone systems, only /H/ before /L/ was preserved as /H/ tone. (All
other occurrences of /H/ tone, as well as all /L/ tones were reduced to Ø tone.) In the
modern Gairin type dialects these nouns therefore lack /H/ tone. The drop to /L/ tone
on the particle after nouns with a level /H/ tone contour (tone classes 2.3 and 3.4) on
the other hand, was not lost in these dialects, and in the modern Gairin type dialects
these tone classes therefore have /H/ tone on the final syllable.
Not only does this development offer an explanation for the lack of /H/ tone in
tone classes 2.2 and 3.2 in the modern Gairin type dialects, just such a development
has been attested in manuscripts such as the Daiji-in-bon of Shiza kōshiki 大慈院本
四座講式 (±13th century?), the Date-ke-bon of Kokin waka-shū 伊達家本古今和歌
集 (1226), Myōgō-ki 名語記 (1268/75) and Kokin kunten-shō 古今訓点抄 (1305).
This is why I have labeled the Middle Japanese tone system attested in these
materials the MJ ‘Gairin’ tone system.
When in the Tōkyō type dialects only the last /H/ before /L/ tone was preserved
as the /H/ tone of the modern restricted tone systems, the Gairin type dialects
merged classes 2.1 and 2.2 and 3.1 and 3.2 as in (4).
4 The Gairin type merger pattern and the tone of the monosyllabic case particles
in Kokin kunten-shō
Kokin kunten-shō Gairin
2.1 - > - /ØØ-Ø/
2.2 - > - /ØØ-Ø/
2.3 - > '- /ØH-Ø/
3.1 - > - /ØØØ-Ø/
3.2 - > - /ØØØ-Ø/
3.4 - > '- /ØØH-Ø/
In the Nairin/Chūrin type dialects on the other hand, in which such /H/ tone
spreading had not taken place, the merger pattern was different. In these dialects a
merger occurred between tone classes 2.2 and 2.3 and tone classes 3.2 and 3.4, as
shown in (5).
5 The Nairin/Chūrin type merger pattern and the tone of the monosyllabic case
particles in Ruiju myōgi-shō
Ruiju myōgi-shō Nairin/Chūrin
2.1 - > - /ØØ-Ø/
2.2 - > '- /ØH-Ø/
2.3 - > '- /ØH-Ø/
Ruiju myōgi-shō Nairin/Chūrin
3.1 - > - /ØØØ-Ø/
3.2 - > '- /ØØH-Ø/
3.4 - > '- /ØØH-Ø/
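The two merger patterns in (4) and (5) can be derived mechanically from the same Middle Japanese input. The following Python sketch is a hedged illustration of that reasoning: it first decides whether the case particle keeps its /L/ tone, then keeps /H/ only where it directly precedes /L/, and finally groups the tone classes that end up with identical patterns. The Middle Japanese values are those of Ramsey’s reconstruction for the six classes shown in (4) and (5); the function names and the flag `spread_after_lh` are hypothetical and serve only this illustration.

```python
from collections import defaultdict

# Ramsey's Middle Japanese values for the classes shown in (4) and (5).
MIDDLE_JAPANESE = {"2.1": "LL", "2.2": "LH", "2.3": "HH",
                   "3.1": "LLL", "3.2": "LLH", "3.4": "HHH"}


def particle_after(noun: str, spread_after_lh: bool) -> str:
    """Tone of a monosyllabic case particle: /L/ unless /H/ has spread onto it."""
    if spread_after_lh and noun.endswith("LH"):
        return "H"  # the rise spreads and the particle loses its /L/
    return "L"


def restrict(noun: str, particle: str) -> str:
    """Keep /H/ only where it directly precedes /L/; reduce the rest to Ø."""
    tones = noun + particle
    kept = ["H" if tones[i] == "H" and tones[i + 1] == "L" else "Ø"
            for i in range(len(noun))]
    return "".join(kept) + "-Ø"


def mergers(spread_after_lh: bool) -> list[list[str]]:
    """Group the tone classes that end up with the same restricted pattern."""
    groups = defaultdict(list)
    for tone_class, noun in MIDDLE_JAPANESE.items():
        pattern = restrict(noun, particle_after(noun, spread_after_lh))
        groups[pattern].append(tone_class)
    return [group for group in groups.values() if len(group) > 1]


print("Gairin:       ", mergers(spread_after_lh=True))
print("Nairin/Chūrin:", mergers(spread_after_lh=False))
```

With spreading after /LH/ the sketch merges classes 2.1 with 2.2 and 3.1 with 3.2; without it, 2.2 merges with 2.3 and 3.2 with 3.4. The only difference between the two runs is the tone of the particle after a final /LH/ sequence.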
3.1.5 /H/ tone spreading onto the particles after /R/ tone:
Chūrin type tone as a natural development
With the reconstruction of the tone of tone class 1.2 as /R/, which follows from
Ramsey’s reinterpretation of the ping and shang tones, we now – for the first time –
have a likely explanation for the different ways in which this tone class has merged
in the Nairin dialects and in the Chūrin/Gairin dialects. It turns out that this split
within the Tōkyō type dialects goes back to differences in the degree of tone
spreading across the word boundary as well.
When the /R/ contour tone of class 1.2 was lost, the Gairin and Chūrin dialects,
in which the rise in pitch had spread onto the following particle, simplified tone
class 1.2 to [L]: 1.2 - > - > -. The Nairin dialects on the other hand,
which had not spread the rising tone contour of tone class 1.2 onto the particle,
simplified tone class 1.2 to [H]: 1.2 - > '-.6
The fact that the Chūrin type dialects lost the pitch fall after the noun in case of
tone class 1.2, but not in case of classes 2.2 and 3.2 is because a rising tone contour
on a single syllable has a stronger tendency to spread to the right, onto the following
syllable, than a rising tone contour that is spread out over two consecutive syllables.7
The Nairin dialect did not lose the drop to [L] pitch after any of the nouns with final
/R/ or /LH/ tone, and it seems that in this dialect the tone of the particles was most
resistant to /H/ tone spreading.
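The three-way difference between the Nairin, Chūrin and Gairin types can be summarized as a difference in the contexts that trigger /H/ tone spreading onto the particle. The Python sketch below is a minimal illustration under stated assumptions: nouns are encoded with one letter per syllable, a noun-final rise is classified as either /R/ (on one syllable) or /LH/ (over two syllables), and the names `SPREADING`, `final_rise` and `keeps_pitch_fall` are hypothetical. It reports only whether the pitch fall after the noun survives, not the later simplification of the contour tone.

```python
# Contexts in which /H/ spreads onto a monosyllabic case particle.
SPREADING = {
    "Nairin": set(),          # no spreading
    "Chūrin": {"R"},          # spreading after /R/ only
    "Gairin": {"R", "LH"},    # spreading after /R/ and after /LH/
}


def final_rise(noun: str):
    """Classify the end of the noun: 'R', 'LH', or None if there is no rise."""
    if noun.endswith("R"):
        return "R"
    if noun.endswith("LH"):
        return "LH"
    return None


def particle_tone(noun: str, dialect_type: str) -> str:
    """The particle is /L/ unless the preceding rise spreads onto it."""
    return "H" if final_rise(noun) in SPREADING[dialect_type] else "L"


def keeps_pitch_fall(noun: str, dialect_type: str) -> bool:
    """A fall after the noun needs a high ending followed by a /L/ particle."""
    return noun[-1] in "HR" and particle_tone(noun, dialect_type) == "L"


for label, noun in [("1.2", "R"), ("2.2", "LH"), ("2.3", "HH")]:
    outcome = {t: keeps_pitch_fall(noun, t) for t in SPREADING}
    print(label, noun, outcome)
```

In this sketch the fall after class 1.2 survives only in the Nairin type, the fall after class 2.2 survives in the Nairin and Chūrin types, and the fall after the level /H/ of class 2.3 survives everywhere.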
What is remarkable is that the conservative Nairin type, in which /H/ tone
spreading across the word boundary has not yet taken place, is confined to the more
central region, forming a small band around the area with Kyōto type tone in the
middle (except on Shikoku where Nairin type tone is absent). The more innovative
Chūrin type, which has a limited degree of tone spreading, forms a ring around the
Nairin type, which is the opposite of what we would expect.
This distribution appears to be related to the geographical distribution of
automatic vowel length in monosyllables (or, to be more precise, to the distribution
of vowel length in monosyllabic nouns that persists when a case particle is attached).
6 There are still many dialects that have contour tones on the phonetic level, but these do not
stand in opposition to level tones. Contour tones typically occur in such dialects as a means to
make a pitch fall after the final syllable audible even when no particle is attached: ', '-
/H-Ø/, or to make a rise in pitch after the final syllable audible, even when no particle is
attached ', '- /L-Ø/.
7 We see for instance, that in material such as the Kanchi-in-bon of Ruiju myōgi-shō and Nihon
shoki shi-ki 日本書紀私記, tone spreading onto the monosyllabic case particles occurs after the
final /R/ tone in class 2.5 (-) but not after the /LH/ tone on two consecutive syllables in
tone class 2.2 (-).
Even today, automatic vowel length in monosyllables is a regional feature of central
Japan, which can be found both in the Tōkyō type dialects and in the Kyōto type
dialects of central Honshū.
In proto-Japanese, as in many other languages, syllables with contour tones were
most likely automatically lengthened, whereas syllables with level tones were
automatically short.8 At some point however, the automatic vowel length in contour
tones was lost, and in most dialects this included the vowel length in the contour
tones of the monosyllabic nouns of class 1.2 and 1.3b. The central Japanese dialects
on the other hand, had generalized vowel length in monosyllables, so that in these
dialects the vowel length in tone classes 1.2 and 1.3b persisted.9
Even though monosyllables are automatically lengthened in isolation in central
Japan, there are differences as to whether the vowel length is maintained when a
particle is attached. On Honshū, the Nairin type dialects of Totsukawa and Noto for
instance, have vowel length in monosyllables in isolation as well as when a case
particle is attached, and so do the Kyōto type dialects of Kyōto, Wakayama and
Hyōgo. On Shikoku on the other hand, in the Chūrin type dialects as well as in the
Kyōto type dialects on the island, vowel length is found in monosyllables in
isolation only.
This difference between central Honshū and Shikoku, and the fact that in central
Honshū we find Nairin type tone, while on Shikoku we find Chūrin type tone, is
surely no coincidence. I assume that the rising tone contour on the short vowel of the
monosyllabic noun + particle of Shikoku spread onto the attached case particle,
whereas the rise on the long vowel of central Honshū did not. Accordingly, I
reconstruct the developments in the Nairin type dialects of Honshū as in (6).10
6 The developments in monosyllabic nouns in the Nairin dialects
MJ ‘Nairin’ Intermediate Nairin
stage (Totsukawa)
1.1 :, :- :, :- :, :-
1.2 :,:- > :, :- > :', :'-
1.3 :, :- > :, :- > :', :'-
8 Based on the difference between the tone of Wa-on and Kan-on loanwords, I will argue in
section 11.1.1 of part II, that this was still the case, at least until the 8th century, i. e. during the
Old Japanese period.
9 Not only central Japan, but also Okinawa generalized vowel length in monosyllables. As
automatic lengthening of monosyllables occurs in many languages (including Sakhalin Ainu), I
see the lengthening in central Japan and Okinawa as parallel independent developments.
10 It is not the case that all present day Nairin type dialects have still preserved vowel length in
monosyllables (in the Nairin type tone of Nagoya and other dialects in Aichi prefecture for
instance, monosyllables do not have vowel length today even in isolation) but it is my
assumption that they had at the time when the contrast between contour tones and level tones
disappeared.
The developments in the Chūrin type dialects of Shikoku – which do have vowel
length but only in monosyllabic nouns in isolation – and in Chūrin type dialects such
as Tōkyō – which do not have vowel length in monosyllabic nouns at all – were as
in (7).
7 The developments in monosyllabic nouns in the Chūrin dialects
MJ ‘Chūrin’ Chūrin MJ ‘Chūrin’ Chūrin
(Shikoku) (Shikoku) (Honshū) (Tōkyō)
1.1 :, - > :, - , - > , -
1.2 :, - > :, - :, - > , -
1.3 :, - :', '- , - ', '-
In other words, the difference between the present-day Nairin and Chūrin/Gairin
reflexes goes back to a difference in the tone of the case particles after tone class 1.2.
The Gairin type dialects had lost the [L] pitch of the case particle after any type
of rise on the preceding noun (/LH/ as well as /R/). In case of the Chūrin type
dialects on the other hand, the rise to [H] pitch of class 1.2 was most likely only
shifted onto the particle at the time when the automatic vowel length in contour
tones was lost. In case of the Chūrin type dialects therefore, the pitch of the case
particle was conditioned by the presence or absence of vowel length. As vowel
length in monosyllables is a regional feature of central Honshū, this explains the
central geographical distribution of the Nairin type merger pattern in monosyllabic
nouns.
On Honshū, Kyōto type tone developed from Nairin type tone. The development
from the MJ ‘Nairin’ type tone system to the tone system of modern Kyōto is shown
in (8).11
8 The developments in monosyllabic nouns in Kyōto
MJ ‘Nairin’ Intermediate Kyōto
(Honshū) stage
1.1 :, :- > :, :- > :, :-
1.2 :, :- > :, :- > :', :'-
1.3 :, :- > :, :- > ':, ':-
11 Because of changes in Japanese syllable structure that had taken place during the Early Middle
Japanese period (800-1200) as outlined in section 0.6.1.2, non-syllabic moras could now
function as independent timing and tone-bearing units, and it can be seen that in these two-
mora monosyllables, the tones shift from the second to the first mora. (See section 8.2.1 on the
[F] tone contour of class 1.3 in Nairin Japanese in the period before the shift.)
On Shikoku on the other hand, Kyōto type tone developed from Chūrin type tone.
The development from the MJ ‘Chūrin’ tone system to the tone system of Kōchi is
shown in (9). A comparison of the two shows that, although the starting point of the
developments in Kōchi and Kyōto was different, the end result was similar.
9 The developments in monosyllabic nouns in Kōchi
MJ ‘Chūrin’ Kōchi
(Shikoku)
1.1 :, - > :, -
1.2 :, - > :, '-
1.3 :, - > ':, '-
3.1.6 The /L/ register of tone class 3.2 in Kōchi
In Kindaichi’s reconstruction of the Middle Japanese tone system, the initial /L/ tone
that can be found in class 3.2 in Kōchi ('') could not be explained. If the tones
of class 3.2 in Ramsey’s reconstruction of the Middle Japanese tone system are
shifted one syllable to the left as in (10) however, it can be seen that /L/ register in
Kōchi has a straightforward origin: If Middle Japanese had /L/ tone on the second
syllable, this /L/ tone was shifted onto the first syllable in Kōchi.
10 The origin of /L/ register in Kōchi
Middle Japanese Kōchi
3.6 - > '-
3.7 - > ''-
3.2 - > ''-
This means that – at least in Kōchi – tone classes 3.2 and 3.4 had not merged at the
time of the shift. In the Kyōto type dialects on Honshū (Kyōto, Wakayama and
Ōsaka) on the other hand, the reflexes of tone class 3.2 are mixed, indicating that the
developments in these dialects were more complex. This issue will be discussed in
sections 4.2.2 and 8.1.5.
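The leftward shift and the origin of /L/ register in Kōchi can be sketched in Python as follows. This is a rough illustration and not a formal statement of the theory. Each syllable takes over the tone of its right-hand neighbour, the last syllable of the noun takes over the /L/ of the case particle, and /H/ survives only where it directly precedes /L/. Two points are assumptions of the sketch and not claims from the text: nouns that consist of /L/ tones only (classes 2.1 and 3.1) are exempted from /L/ register, because (2) shows them as Ø throughout, and tone classes 2.5 and 3.3 are left out, the former because its /R/ tone does not fit an encoding with one level tone per syllable, the latter because its developments are treated separately in section 4.5. The function name `shift_to_kochi` is hypothetical.

```python
# Ramsey's Middle Japanese values; one letter per syllable.
MIDDLE_JAPANESE = {
    "2.1": "LL", "2.2": "LH", "2.3": "HH", "2.4": "HL",
    "3.1": "LLL", "3.2": "LLH", "3.4": "HHH", "3.5": "HHL",
    "3.6": "HLL", "3.7": "HLH",
}

# Kōchi forms as given in (2), noun plus monosyllabic case particle.
KOCHI = {
    "2.1": "ØØ-Ø", "2.2": "HØ-Ø", "2.3": "HØ-Ø", "2.4": "LØ-Ø",
    "3.1": "ØØØ-Ø", "3.2": "LHØ-Ø", "3.4": "ØHØ-Ø", "3.5": "HØØ-Ø",
    "3.6": "LØØ-Ø", "3.7": "LHØ-Ø",
}


def shift_to_kochi(noun: str, particle: str = "L") -> str:
    """Shift all tones one syllable to the left and restrict the result."""
    # The first tone is lost; the particle's /L/ lands on the last syllable.
    shifted = noun[1:] + particle
    out = ["Ø"] * len(noun)
    # /H/ directly before /L/ survives as /H/, now one syllable further left.
    for i in range(len(shifted) - 1):
        if shifted[i] == "H" and shifted[i + 1] == "L":
            out[i] = "H"
            break
    # A /L/ shifted onto the first syllable becomes /L/ register.
    # Assumption: nouns without any /H/ tone do not develop /L/ register.
    if shifted[0] == "L" and "H" in noun:
        out[0] = "L"
    return "".join(out) + "-Ø"


for tone_class, noun in MIDDLE_JAPANESE.items():
    predicted = shift_to_kochi(noun)
    status = "matches" if predicted == KOCHI[tone_class] else "differs from"
    print(f"{tone_class}: /{noun}-L/ > /{predicted}/ ({status} Kōchi)")
```

For the classes it covers, the sketch yields the Kōchi forms in (2), including the initial /L/ of classes 3.2, 3.6 and 3.7 shown in (10).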
3.1.7 Restrictions to the location of the /H/ tone in the Kyōto type dialects
While in the Tōkyō type dialects it is possible to have a /H/ tone on any syllable of
the word, in the Kyōto type dialects there are restrictions on the location of the /H/
tone. In Kyōto, the /H/ tone will never fall on the final syllable in words of more
than two syllables, but to make up for this, the Kyōto dialect has distinctive initial
/L/ tone (/L/ register).12 All of the tone patterns that are allowed in nouns of three
12 The fact that /H/ tone on the final syllable is exceptional is only directly obvious in the longer
nouns, as final /H/ tone does occur regularly in shorter nouns, namely in tone classes 1.2 and
and four syllables in the modern dialects of Kyōto and Tōkyō (Hirayama, 1960) are
given in (11). A shift of the /H/ tone to the left in the Kyōto type dialect is the most
natural explanation for the lack of /H/ tone on the final syllable and the development
of the initial /L/ tone.
11 The location of the /H/ tone in longer nouns in Kyōto and Tōkyō
Tone of three and four-syllable Tone of three and four-syllable
nouns in Kyōto13 nouns in Tōkyō
' ' ' '
'' '' ' '
' '' ' '
' ' '
'
'
3.2 The attestation of the Nairin/Chūrin/Gairin split
in the old documents
When I first noticed that there were differences in the degree to which the tone of
attached case particles was influenced by the tone of preceding nouns in the old tone
dot material, I expected there to be a chronological development in the manuscripts.
In a syllable-tone language such as Middle Japanese, where each syllable had its
own tone, it makes sense to assume that the monosyllabic case particles originally
had independent /L/ tone, and that this independent /L/ tone was only gradually lost
through a growing influence of the tone of the preceding noun. Of the different types
of Middle Japanese tone systems, the MJ ‘Nairin’ tone system is therefore definitely
the most archaic, and closest to proto-Japanese. The MJ ‘Chūrin’ and MJ ‘Gairin’
tone systems represent innovations.
2.5. These final /H/ tones developed from final /R/ tone in Middle Japanese. Final /R/ appears
to have been preserved longer in shorter nouns. See section 8.2.5.
Apart from this there are a few cases of final /H/ tone in disyllabic nouns with three moras in
Kyōto. These all involve words with /L/ register and geminated consonants before the final
syllable, such as 'matti' ‘match’, 'gittyo' ‘left handed’, 'deppa' ‘protruding front teeth’ and
'noppo' ‘a tall gangly person’, 'minna' ‘all’, 'makka' ‘crimson’ (Martin 1987: 147), and I see
these cases as exceptions.
13 Instances of ', ' and ' tone in Kyōto are mostly compounds. In Kyōto,
the pitch fall in words of three syllables or more that do not have /L/ register nowadays usually
falls after the first syllable. Compound words, onomatopoeia and mimetic words can
sometimes violate tone rules. See for instance the Ainu dialect of Yakumo, where pitch-accent
normally shifts from the second to the third syllable if the second syllable is open, except in
case of compounds and onomatopoeia.
12 Differences in the tone of the monosyllabic case particles
in the Middle Japanese material
Tone of attached monosyllabic case particles
Tone of MJ ‘Chūrin’ MJ ‘Nairin’ MJ ‘Gairin’
preceding tone system: tone system: tone system:
noun /H/ tone No /H/ tone /H/ tone
spreading after spreading spreading after
/R/ /R/ & /LH/
1.1 -
1.2 - < >
1.3 -
2.1 -
2.2 - >
2.3 -
2.4 -
2.5 - < >
3.1 -
3.2 - >
3.4 -
3.5 -
3.6 -
3.7 - >
The oldest attestations of the Middle Japanese tone system that contain data on the
tone of the case particles, such as the Tosho-ryō-bon of Ruiju myōgi-shō 図書寮本
類聚名義抄 and certain manuscripts of Nihon shoki 日本書紀 indeed show no
influence of tone spreading onto the particles. The particles have /L/ tone, which
agrees with the idea of a gradual loss of the independent tone of the case particles.
As the different degrees of /H/ tone spreading can be linked to the
Nairin/Chūrin/Gairin split, however, and as these three tonal types still exist on
Honshū, we would – at the same time – expect the more archaic types to be still
attested even after the types with tone spreading developed, and this is indeed what
we find: Sakurai (1975:193) already noted that long after the earliest attestations of
tone spreading to the particles, tone systems with a more limited degree of tone
spreading or no tone spreading at all continue to be attested in much later
manuscripts.14
This indicates that the differences in the degree of tone spreading were dialectal.
Although the MJ ‘Gairin’ and MJ ‘Chūrin’ types were the result of innovations,
14 Cf. with tone spreading after /R/ and /LH/: the Date-ke-bon of Kokin waka-shū (1226), Shiza
kōshiki (± 1226?). With tone spreading only after /R/: the Mikanagi-bon and Ōei-bon of Nihon
shoki shi-ki (1278-1293). Without tone spreading: the Kamakura-bon of Nihon shoki (1303)
and the Maeda-ke-bon of (Jōben) Shūi waka-shū (1333).
3.2 The attestation of the Nairin/Chūrin/Gairin split in the old documents 89
these innovations never spread to all dialects. (If the loss of /L/ tone on the particles
after final /R/ and /LH/ had been without exception, there would only be Gairin type
tone systems in Japan today.)
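The three degrees of tone spreading distinguished in (12) amount to a simple conditional rule. The following Python sketch restates that rule under stated assumptions: the function name, the ASCII labels for the three types and the encoding of the end of the noun as "R" or "LH" are introduced here purely for illustration and do not occur in the source material.

```python
# Illustrative sketch only: the names and the string encoding are
# assumptions made for this example, not notation from the manuscripts.

# Endings of the preceding noun after which /H/ tone spreads onto an
# attached monosyllabic case particle, per Middle Japanese type.
SPREADING_CONTEXTS = {
    "Nairin": set(),          # no /H/ tone spreading
    "Churin": {"R"},          # spreading after /R/ only
    "Gairin": {"R", "LH"},    # spreading after /R/ and /LH/
}


def particle_tone(noun_ending: str, system: str) -> str:
    """Return the tone of a case particle attached to a noun.

    noun_ending is "R" for a noun ending in a rising tone, "LH" for a
    noun ending in a low-high sequence, and any other string otherwise.
    """
    if noun_ending in SPREADING_CONTEXTS[system]:
        return "H"  # /H/ tone has spread across the word boundary
    return "L"      # the particle keeps its independent /L/ tone
```

With this encoding, a noun ending in /LH/ takes an /L/ particle in the MJ ‘Nairin’ and MJ ‘Chūrin’ types, but an /H/ particle in the MJ ‘Gairin’ type.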
The fact that the difference is dialectal also means that it may very well be a
coincidence that the archaic ‘Nairin’ type has the oldest attestations. The early
attestation of this type may simply be the result of the fact that the old capital of
Kyōto was located within the Nairin dialect area. The other types – although the
result of innovations – may have been around for centuries, before the first tone dot
material reflecting a Nairin type tone system was recorded.
The Middle Japanese tone systems that had /H/ tone spreading across the word
boundary can be linked to modern dialect types that occur to the east and west of the
direct vicinity of Kyōto, but still in central Honshū. (To the east of Nagoya, for
instance, within a stretch of 50 km, one travels from areas with Nairin to Chūrin to
Gairin type tone.)
The area with Gairin type tone in eastern Aichi and western Shizuoka prefecture
comprises the old provinces of Mikawa, Tōtōmi and Shinano. Material with tone dot
markings of the MJ ‘Gairin’ type, in which the tone spreading was most extensive,
such as Kokin waka-shū 古今和歌集, may originate from these provinces, or may
have been marked by people who came from these provinces. We have to remember
that after 1185 the political power shifted to military rulers from eastern Japan, the
Minamoto, the Hōjō and the Ashikaga, who also possessed and ordered copies of the
famous literary works.15
Tone dots were often added to manuscripts by people other than the original
authors, and often at a much later time, and it is not unlikely that these notes
reflected local tone systems, just as nowadays people from the provinces, even when
attempting to speak standard Japanese, tend to maintain their regional tonal
distinctions.
This possibility grows even stronger when we consider that before the Kyōto
shift took place in the 14th century, the differences between the tone systems in this
area of Honshū were truly minor, as they mainly involved the tone of attached case
particles. It seems unlikely that a clear awareness of these differences, and a strong
pressure to use only central Japanese Nairin type tone patterns, would have
developed at such an early stage.
Even if a certain manuscript with MJ ‘Chūrin’ or MJ ‘Gairin’ type tone dot
markings can be shown to originate from a monastery in central Japan, it is still
possible that the tone of the particles in such material reflects a provincial pattern, as
15 I do not know enough of the history and provenance of the various materials in connection with
the differences in particle tone that they reflect (this is certainly a topic which needs further
exploration), but it may be no coincidence that the Jōben-bon of Shūi waka-shū (1333) owned
by the Maeda family – although being a late work – still shows a MJ ‘Nairin’ type tone system
without tone spreading on the particles, as the Maeda family stemmed from Owari province
(corresponding to the western part of modern Aichi prefecture), which for the larger part belongs
to the present-day Nairin area.
monks from all over Japan may have entered the famous monasteries, and – as
mentioned before – the preference for a MJ ‘Nairin’ type tone system was most
likely still weak.
From the viewpoint of the standard theory on the other hand, late manuscripts in
which the tone of the particles is not affected by tone spreading have to be explained
away, or they have to be regarded as attestations of dialects that lagged behind in the
developments.16 Eventually all dialects had to follow the trend towards loss of the
independent tone of the particles, because – if we follow the standard reconstruction
of the Middle Japanese tones – such a development is indispensable to get from the
Middle Japanese tone system to the modern Tōkyō and Kyōto type tone systems.
This has to do with the fact that there are no modern Kyōto type dialects, nor any
Tōkyō type dialects, that have preserved the /H/ particle tone that occurs after tone
classes 2.2, 2.3, 3.3, 3.4, 3.5 and 3.7 in the standard reconstruction of the Middle
Japanese tone system. The development in Kyōto for instance, must have been as
shown in (13).
13 The development of the tone of the monosyllabic case particles
in the standard theory
Middle Japanese     Intermediate     Kyōto
(Kindaichi)         stage
2.2 - > -
2.3 - > - > -
3.3 - > -
3.4 - > - > - > -
3.5 - > - > -
3.7 - > -
16 Kindaichi (1964:338) for instance, attributes the lack of tone spreading after 2.2 and 3.2 in
Nihon shoki shi-ki (1278-1293), which are commentaries to the Nihon shoki, to influence of the
tone dots added to old manuscripts of the Nihon shoki itself on the compilers of Nihon shoki
shi-ki. A problem is that Nihon shoki shi-ki does show tone spreading after classes 1.2 and 2.5,
while most manuscripts of Nihon shoki do not. Why would there be adherence to archaic norms
in case of classes 2.2 and 3.2 but not in case of classes 1.2 and 2.5? It is much more likely that
the people who added the tone dots to Nihon shoki shi-ki were speakers of a MJ ‘Chūrin’ type
dialect. Sakurai likewise attributes the lack of tone spreading in the Jōben-bon of Shūi waka-
shū (1333) (which belongs to the MJ ‘Nairin’ type) to adherence to archaic norms (Sakurai,
1976:392). However, Nihon shoki shi-ki and the Jōben-bon of Shūi waka-shū are both
innovative in the sense that they happen to be the very first works in which the use of the qu
tone dot to mark the tones of Japanese was abandoned, and Nihon shoki shi-ki is also
innovative in that the tone marks added to yodan verbs show that the shūshi-kei was already
being replaced by the rentai-kei (Sakurai, 1975:203). All of this argues against the idea that
adherence to archaic norms can explain the lack (or relative lack) of tone spreading to the
particles in these works.
3.3 Two more arguments from dialect geography 91
Although the mixed nature of the Middle Japanese data on the tone of the particles
suggests dialect diversity in the old material, it has been customary to regard all tone
dot material as reflecting the tone system of Kyōto. One reason for this is the fact
that Kyōto was the cultural center of Japan in the Middle Japanese period.
Furthermore, the use of tone dots (being related to the correct pronunciation of the
mantras and dhāraṇī) originated in the Shingon and Tendai monasteries of central
Japan, as did the much later habit of using tone dots to indicate the tones of Japanese
words. But the most important reason is probably the fact that in the standard
interpretation all tone dot material resembles the tone system of Kyōto.
Following Ramsey’s interpretation of the tone dots, it is no longer necessary to
regard all Middle Japanese tone dot material as stemming from Kyōto and its direct
vicinity. Does the fact that this was culturally the most important region truly mean
that all manuscripts should reflect tone systems that stem from this region? How can
this be reconciled with the fact that there are clearly observable dialectal differences
between the tone systems attested in different manuscripts?
3.3 Two more arguments from dialect geography
The present-day geographical distribution of Kyōto type and Tōkyō type tone is the
best-known argument in favor of Ramsey’s theory, but there are two more
arguments from dialect geography which are much less well known. The first has to do
with the fact that there is some historical information on the geographical base of the
tone system that is nowadays typical of Kyōto in earlier periods. The second has to
do with the fact that the isoglosses between Gairin type and Chūrin type tone are
more blurred than those between Tōkyō type and Kyōto type tone.
3.3.1 Reports on the geographical spread of Kyōto type tone in earlier periods
If we follow Ramsey’s theory we expect the area in which Kyōto type tone occurs to
get smaller and smaller as we go back in time. If we follow the standard theory on
the other hand, the development should be the exact opposite, as we expect the
geographical base of the Kyōto type tone system to have once encompassed the
whole of Japan.
The first text in which the split into the Kyōto type and the Tōkyō type tone
systems is mentioned in unequivocal terms is Mōtan shichin-shō 毛端私珍抄, part
of an unfinished work on Nō recitation by Konparu Zenpō 金春禅鳳 (1454-1532) of
which the exact date of compilation is unknown (Akinaga (ed.), 1998:30). This text
contains a comparison of the tone of the word inu ‘dog’ (class 2.3) in four regions of
Japan. The tone system attested in Nō plays is usually of the post-shift Kyōto type,
but provincial tone is occasionally used in the speech of figures introduced as
comic relief, which probably explains Konparu Zenpō’s interest in this topic. The
tones are indicated by means of so-called goma-ten. (See chapter 14 of part II.)
In Miyako-goe 京ごゑ (the language of the capital) the tone of inu is
(). In Tsukushi 筑紫 (northeast Kyūshū) and Bandō 坂東 (the provinces east of
the Ōsaka barrier in Ōmi)17 it is (). In Shikoku it is ().18
This means that around the year 1530, the area of Kyōto was already set off from
these other areas by a fundamentally different tone system.19 The designations
Tsukushi, Bandō and Shikoku refer to regions, and the term Miyako can likewise
refer to a region, namely that of the Go-kinai or ‘five home provinces’. These are a
number of relatively small provinces directly to the south of Kyōto: Yamashiro,
Yamato, Kawachi, Izumi and Settsu, equivalent to present-day Nara, Ōsaka and
southern Kyōto prefectures (Map 2).
Map 2: The area of the five Go-kinai provinces
It cannot be ruled out, however, that Miyako-goe is to be taken literally and
refers strictly to the dialect of Kyōto; it is equally possible that it refers to a dialect
type of a larger area, which is exemplified by the dialect of Kyōto.
Ramsey has pointed out that Konparu Zenpō’s Shikoku pitches seem to confirm
his reconstruction of the tones of the Middle Japanese tone class 2.3 as . I
think that this is correct, and that around 1530 the Kyōto tone shift had
probably not yet spread to Shikoku. We cannot, however, be absolutely certain of
this. As Ramsey (1980:75) also indicates, in the Sanuki subtype of the Kyōto type
tone system on Shikoku, tone class 2.3 has a level or tone (depending on
the analysis), as tone class 2.3 in this dialect has merged with tone class 2.1. (See
sections 7.2.1 and 4.2.5). The problem is that we do not know when the present-day
Sanuki type tone system developed, and so we cannot know for certain which
Shikoku dialect Konparu Zenpō was referring to.
17 I.e. Iga, Ise, Shima, Owari, Mikawa, Tōtōmi, Suruga, Kai, Izu, Sagami, Musashi, Awa, Kazusa,
Shimōsa, and Hitachi.
18 Konparu Zenpō also mentions a number of compounds in the dialect of Kyōto in which the
pitch of inu is suddenly just like that of Shikoku, namely . See also section 5.2.3.
19 As Mōtan shichin-shō is an unfinished work, I assume that it was written toward the end of
Zenpō’s life (1454-1532).
In the next work, Arte da Lingoa de Iapam (1604–1608), by the Portuguese
missionary João Rodrigues (who came to Japan in 1577), it is mentioned that Kyōto
type tone (the ‘correct and natural’ type as Rodrigues calls it, a qualification no
doubt adopted from his Japanese informants) could be found in the Go-kinai
provinces, but also in the provinces of Echizen, Wakasa, Tanba, Ōmi and Harima,
equivalent to present-day Fukui prefecture, the larger part of Hyōgo prefecture and
northern Kyōto prefecture (Map 3).
Map 3: The Spread of the Kyōto type tone system according to Rodrigues
This agrees rather closely with the present-day distribution of Kyōto type tone to the
west, north and northeast of Kyōto, but Rodrigues does not yet make mention of a
number of provinces more towards the south, which nowadays also have Kyōto type
tone, like Kii, Iga and Ise (present-day Wakayama and Mie prefectures). Nor does
he yet mention the present-day Kyōto type tone on Shikoku and the islands in the
Seto Inland Sea (Map 4).
Map 4: The present-day spread of the Kyōto type tone system
As Hattori (1942:124-125) has suggested, Rodrigues’ knowledge of the Japanese
dialects may simply not have extended that far, and these areas may have had Kyōto
type tone even though they are not mentioned. It is definitely true that these
descriptions are not entirely exact. The area with Tōkyō type tone that still exists
today in the inaccessible Totsukawa region (which formed part of the province of
Yamato, present-day Nara prefecture) is for instance not mentioned by Rodrigues,
nor does he mention the toneless area in the province of Echizen
(present-day Fukui prefecture), if that already existed in his day. It is therefore not
impossible to reason away the impression that Kyōto type tone has expanded at the
cost of the surrounding Tōkyō type tone systems. It will be clear, however, that the
information that we have on the geographical base of the Kyōto type tone system in
earlier periods does not support the notion that the area with Kyōto type tone was
once larger, and became smaller in the course of time, but rather suggests the
opposite.
3.3.2 The blurred division between the Gairin and Chūrin areas
as an indication that this is the oldest dialect split in Japan
As I have argued in section 3.2, the existence of different types of tone dot material
indicates that the ancestor dialects of the present-day Chūrin and Gairin type dialects
had definitely split by the 13th century, but it is likely that this actually happened
much earlier. (The Ryūkyūan tone systems developed from a Gairin type tone
system, which means that the Gairin tone system must have split from proto-
Japanese before the settlement of the Ryūkyūs.) The idea that this split within the
Tōkyō type dialects is very old is supported by an observation made by Uwano in
Wurm & Hattori (1981) that there usually is a transitional area where different tone
systems meet, but that the gradation is especially marked wherever the Gairin and
the Chūrin types meet:
To give the example of Niigata Prefecture, there is a line on the map between
Nagahama and Mushū-iwato Hamlets of Jōetsu city, but in Mushū-iwato only
half of the words in 2.2 have merged into 2.1, the other half merging with 2.3.
As we go up north from there, the number of words merging into 2.1
gradually increases, and isoglosses of the residual one-third of 2.2 appear as
bundles again between Sampoku town (Niigata Prefecture) and Nezugaseki
Hamlet (Atsumo Town, Yamagata prefecture). In fact only after entering
Yamagata prefecture do most of the words in 2.2 merge into 2.1, and we find
a clear Gairin type.
The same gradation is seen in the Tōkai, San’in and North Kyūshū districts,
where the Chūrin and Gairin types meet. In Niigata Prefecture there are two
areas where the transition occurs en bloc, which we have marked with a
special transitory area, but in other places the transition is more continuous.
Uwano’s conclusion is that these two tonal types have a long history of contact,
and that the Gairin type originally existed over a wider area. I support Uwano’s first
conclusion, but I am not certain about the second. Following the conventional
circular dispersion theory (and not Kindaichi’s reversed version of it) it would
indeed make sense to assume that the Chūrin type – being more central – is an
innovation that has expanded into the Gairin type, which – being peripheral – can be
expected to have preserved a more archaic tone system.
We have seen in this chapter, however, that the Gairin type tone system is the
result of an innovation, namely the loss of the independent /L/ tone of the
monosyllabic case particles. This means that Chūrin (and Nairin) type tone is more
conservative than Gairin type tone, even though – at first sight – the geographical
distribution seems to suggest the opposite. This issue will be addressed in more
detail in chapter 10. At this point, the only conclusion I would like to draw from the
fact that the isoglosses between the areas with Gairin and Chūrin type tone are more
blurred than those between the other tonal types, is that the former dialect division is
older.
In other words, the distinction between Gairin and Chūrin type tone is the oldest
tonal split in Japan, older than the split between Chūrin and Nairin type tone, and –
most importantly – older than the split between Kyōto and Tōkyō type tone. This
means that the distinction between Gairin and Chūrin type tone was already around
when Kyōto type tone still had to develop, which is in complete agreement with
Ramsey’s theory.
4 The development of the tone systems of Tōkyō, Kyōto
and Kagoshima
In the previous chapter we have seen that it is possible to derive the tone systems of
the modern Kyōto type dialects, as well as the tone systems of all Tōkyō type
dialects directly from a proto-Japanese tone system that is practically identical to the
tone system of Middle Japanese, as long as this is reconstructed in accordance with
Ramsey’s interpretation of the tone dots. It is, however, likely that the leftward tone
shift in the Kyōto type dialects occurred in a tone system that was already somewhat
closer to the restricted tone systems of the modern Tōkyō type dialects of central
Honshū: after all, we do not find the Middle Japanese tone system preserved on
either side of the area with Kyōto type tone.1
In 4.1 and subsections I reconstruct the developments that must have taken place
in central Honshū, leading to the kind of restricted Tōkyō type tone system in which
the Kyōto shift occurred. The reconstruction of the different stages is based on the
following types of evidence:
1. The tonal distinctions and the pitch assignment rules in the tone systems of the
Tōkyō type dialects that directly surround Kyōto.
2. The tonal distinctions preserved in the dialect of Kyōto itself.
3. Historical material from the 13th and 14th centuries, which reflects a tone system
that already differed from the tone system of Middle Japanese.2
In 4.2 and subsections I will discuss the changes that led to the development of
the Kyōto type tone systems of central Honshū, Shikoku and the Seto Inland Sea. As
we have seen in the previous chapter, the MJ ‘Gairin’ type tone system was
characterized by /H/ tone spreading across the word boundary onto attached case
particles after /LH/ or /R/ tone on a preceding noun. In section 4.3, I will discuss the
effect of the development towards restricted tone on such a tone system, and the
Gairin type mergers that are the result of it.
Finally, the Kagoshima type word-tone system on Kyūshū is often said to have
derived its division into two word-tones directly from the /H/ or /L/ tone of the
initial syllable in proto-Japanese. In section 4.4 I will argue that it is more likely that
1 The idea that the tone system had already gone through some changes since
the Middle Japanese stage before the Kyōto tone shift took place is also borne out by a
comparison of the tone of compound nouns with tone class 2.3 as the second element in Kyōto,
Tōkyō and Hiroshima. (See sections 5.6 and 5.9.)
2 In some of this material the tones are not indicated by means of tone dots, but by means of
musical notation marks. (See chapter 14 of part II.) I will also refer to the evidence contained in
the 14th century tonal spelling system proposed by Gyōa 行阿. (For this material see section
12.1.1 of part II.)
4.1 The developments in the Nairin and Chūrin type dialects 97
this division developed gradually from a restricted Gairin type tone system. (The
kind of tone system that can still be found in the northeast of Kyūshū.)
The tonal developments and intermediate stages that I reconstruct are also based
on a comparison with the principles of historical tonology and universals of tone
rules in Hyman (1978 and 2007), especially those that are based on the restricted
tone systems of a number of East African Bantu languages.3 The tonal developments
that have been observed in these languages closely resemble the developments that
have to be reconstructed for the three main tonal types of modern Japanese, based on
the three types of evidence mentioned above.
I will end this chapter with a discussion of a few complicated issues that are
better addressed separately, such as the special developments in class 3.3 (section
4.5), and the question of whether – as is often thought – the tone of the initial
syllable in Middle Japanese had a special status (section 4.6).
4.1 The developments in the Nairin and Chūrin type dialects
In Middle Japanese the tone of each syllable was distinctive, and a transition from
/L/ to /H/ was just as distinctive as a transition from /H/ to /L/. The location of the
drop from /H/ to /L/ for instance, distinguished tone class 2.4 - from tone
classes 2.2 - and 2.3 -, but the location of a rise from /L/ to /H/ in turn,
distinguished tone class 2.2 - from tone class 2.3 -. In the tone systems
of modern Kyōto and Tōkyō on the other hand, the last [H] before [L] is the only
remaining phonological /H/ tone in the word, and this /H/ tone is anticipated by
preceding Ø tones. Ø tones that precede /H/ tone (except the initial Ø tone in a tonal
phrase) have automatic [H] pitch, and so the location in the word of a transition from
[L] to [H] is not distinctive. In modern Tōkyō for instance, whether a noun occurs
with or with pitch is conditioned by the position of the word in the tonal
phrase. In phrase-internal position tone classes 2.2 and 2.3 both have pitch, and
in phrase-initial position they both have pitch.
The development from a relatively unrestricted /H/ vs. /L/ tone system, towards a
tone system in which only the location of a single remaining /H/ tone in the word is
distinctive was most likely a shared development in the tone systems of Kyōto and
Tōkyō. It is likely that the leftward tone shift in Kyōto took place in a tone system
that had already evolved since the Middle Japanese stage; a tone system that had for
3 In 1974 Larry M. Hyman and Russell Schuh first proposed a number of universals of tone rules.
The rules proposed in this paper were based on evidence from West Africa, a region where the
transition from (relatively) unrestricted tone to restricted tone that can be seen in a number of
East African languages (and that is most relevant to the developments in Japanese) does not
occur. In 1978, however, Hyman again presented a number of principles that govern historical tone
change, this time based on data from languages from East as well as West Africa. Hyman
(2007), which is an updated and expanded version of the universals of tone rules proposed in
Hyman & Schuh, also incorporates the crucial data from East Africa.
98 4 The development of the tone systems of Tōkyō, Kyōto and Kagoshima
instance, already developed the pre-eminence of a drop in pitch over a rise in pitch
that is typical of the modern tone systems of both Kyōto and Tōkyō.
If we accept the idea that Kyōto type tone is the result of an innovation that took
place in the middle of an area with a tone system that had evolved since Middle
Japanese, it makes sense to look to the Tōkyō type tone systems that directly
surround Kyōto for traces of what the tone system at the stage that immediately
preceded the Kyōto shift may have been like.
When we do this, it turns out that it is not only the typical Tōkyō type location of
the /H/ tone in the word (one syllable later than in Kyōto) that occurs in the dialects
surrounding Kyōto: on a more superficial level, too, the pitches of the
Tōkyō type dialects to the east and to the west of the Kyōto type area coincide.
As mentioned, in the modern Tōkyō type tone systems it is only the location of a drop
in pitch that is distinctive, and as it is not important where in the word the pitch
starts to rise, this can differ from dialect to dialect. In the dialect of Aomori in the
northeast of Honshū, for instance, in nouns with Ø tone, the final syllable will have
[H] pitch, and if a particle is attached, the final [H] pitch shifts onto the particle. In
other words, there is an automatic rise to [H] pitch in words or phrases with Ø tone,
but the [H] pitch is limited to the phrase-final syllable. In Akita on the other hand,
words or phrases with Ø tone will have [L] pitch throughout. If a word or phrase
contains /H/ tone, Aomori and Akita agree in that only the syllable that carries the
/H/ tone itself will have [H] pitch; all other syllables will have [L] pitch.
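The contrast between Aomori and Akita can be restated as a small pitch assignment function. This is a minimal sketch that covers only the rules just described; the function name, the 0-based syllable index and the string output are assumptions made for the illustration, and the phrase length is taken to include any attached particle.

```python
from typing import Optional


def northern_pitch(n_syllables: int, h_position: Optional[int], dialect: str) -> str:
    """Assign [H]/[L] pitch to a word or phrase in Aomori or Akita.

    h_position is the 0-based index of the syllable carrying /H/ tone,
    or None for a word or phrase with Ø tone (hypothetical encoding).
    """
    pitches = ["L"] * n_syllables
    if h_position is not None:
        # Both dialects: only the syllable carrying /H/ tone is [H].
        pitches[h_position] = "H"
    elif dialect == "Aomori":
        # Aomori, Ø tone: automatic [H] on the phrase-final syllable.
        pitches[-1] = "H"
    # Akita, Ø tone: [L] pitch throughout, so nothing changes.
    return "".join(pitches)
```

Adding a particle to a noun with Ø tone lengthens the phrase by one syllable, so in Aomori the automatic [H] pitch moves from the final syllable of the noun to the particle.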
In the Tōkyō type dialects that directly surround the area where Kyōto type tone
is found, however, the pitch assignment rules agree with each other. In these dialects
the /H/ tone in the word is anticipated. Due to a %L phrase boundary tone, only the
phrase-initial syllable is exempt from the /H/ tone anticipation. The phrase-initial
syllable will have automatic [L] pitch (unless, of course, this is the syllable that
carries the /H/ tone), and after the first syllable the pitch will be [H] until the
distinctive drop to [L] that occurs after the /H/ tone. In words or phrases with Ø tone,
in which the accent-like /H/ tone is lacking, the automatic rise in pitch after the first
syllable is often reported to be somewhat smaller (cf. Pierrehumbert & Beckman,
1988).
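The pitch assignment rules of these central Tōkyō type dialects can be sketched in the same way. The sketch below is an approximation under stated assumptions: the function name and the encoding of the /H/ tone as a 0-based syllable index are hypothetical, only a complete tonal phrase is modeled, and the smaller rise reported for phrases with Ø tone is not represented.

```python
from typing import Optional


def central_pitch(n_syllables: int, h_position: Optional[int]) -> str:
    """Assign [H]/[L] pitch to a tonal phrase in a central Tōkyō type dialect.

    h_position is the 0-based index of the syllable carrying /H/ tone,
    or None for a phrase with Ø tone (hypothetical encoding).
    """
    pitches = []
    for i in range(n_syllables):
        if h_position is not None and i > h_position:
            pitches.append("L")  # after the distinctive drop
        elif i == 0 and h_position != 0:
            pitches.append("L")  # %L phrase boundary tone
        else:
            pitches.append("H")  # the /H/ tone itself, or its anticipation
    return "".join(pitches)
```

For a three-syllable phrase this yields HLL, LHL and LHH for /H/ tone on the first, second and third syllable, and LHH again for a phrase with Ø tone, which illustrates why only the location of the drop is distinctive.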
I will mention a number of descriptions of Tōkyō type dialects from these areas
in which the pitch assignment rules are as I have just described. (In many other
descriptions only the distinctive location of the pitch fall is indicated, and more
detailed information on the pitch assignment rules is not available.)
Dialects to the west of the Kyōto tone system: Yamaguchi (Kobayashi, 1975),
Hiroshima (Hirayama, 1960). Dialects to the east of the Kyōto tone system:
Matsumoto (Hirayama, 1960), Kōfu (Kindaichi, 1954), Numazu (Hirayama, 1960)
and Tōkyō (Hirayama, 1960). An exception is the dialect of Nagoya (Uwano, 1977).
In Nagoya, in words of more than two syllables, the rise in pitch will not occur after
the first syllable, but will be delayed until after the second syllable.
The rule that there will be an automatic rise after the first syllable of the tonal
phrase (if the first syllable does not carry the /H/ tone) can also be found in areas
that are even more significant. As I have mentioned before, in the middle of the area
with Kyōto type tone, in the isolated Totsukawa region, a number of villages have
preserved a Tōkyō type tone system. In many of these villages the above-mentioned
rule can also be found, and phonetic pitch shapes in these villages consequently
coincide with those of the other central Tōkyō type dialects, even though they are
cut off from these dialects by the Kyōto type tone system that surrounds them. In the
villages with Tōkyō type tone on Noto Island, the pitch assignment rules are the
same in this respect as in the other central Tōkyō type dialects (Kindaichi, 1954).
It is highly unlikely that the tone systems that we can see both to the east and to
the west of the Kyōto type area, in the villages in the Totsukawa region and on Noto
Island, are the result of independent coincidental developments towards a Tōkyō
type tone system when they share with each other not only the location of the /H/
tone in the word, but also such low-level pitch assignment rules. The coincidence of
low-level pitch assignment rules in the modern Tōkyō type dialects of central Japan
most likely represents a remnant of the tone system that preceded the Kyōto shift. A
type of tone system that already incorporated similar rules (a %L phrase boundary
tone, and phrase-internal /H/ tone anticipation) must have existed in this large
central area before Kyōto type tone developed in the middle.
4.1.1 /H/ tone restriction
We have seen that in the Tōkyō type dialects to the east and to the west of Kyōto,
tone classes 2.2 and 2.3 have merged as in phrase-initial position (%) and
as in phrase-internal position, so that it is not immediately clear in what
direction the merger took place if we only look at the modern dialects: Did tone
class 2.2 lose the distinctive initial /L/ tone ( > ) in phrase-internal position,
or did tone class 2.3 lose the distinctive initial /H/ tone in phrase-initial position
( > )? Historical material from the 13th and 14th centuries shows that it was
tone class 2.3 in which the change took place, as from the 13th century on we start to
find attestations of tone class 2.3 with tone instead of the former tone.
(Attestations of tone class 2.2 with * tone on the other hand, are lacking.)
The earliest of such attestations of tone class 2.3 is in the Jakue-bon 寂恵本
(1278) of Kokin waka-shū 古今和歌集 where siho ‘tide’ and hana ‘flower’ (both
tone class 2.3) are marked with 上平 tone dots. (Also in the Fushimi miyake-bon
伏見宮家本 of Kokin waka-shū from the end of the 13th century.) A 14th century
example of similar markings can be found in Moji-han 文字反 (1331–1334) where
sima ‘island’ (class 2.3) is marked with 上平 instead of 平平 tone dots. Far more
numerous examples can be found in fushihakase material from the 14th century.
These materials (which I refer to as ‘old’ rongi, in order to distinguish them from
‘new’ rongi material such as Bumō-ki 補忘記), have mostly for tone class 2.3
and attestations of occur only very rarely. Tone class 3.5 is almost always
marked with tone, while markings with tone are very rare.
The old rongi data furthermore indicate that the reduction of the number of /H/
tones was quite radical: It seems that all /H/ tones that did not immediately precede
/L/ tone were lost. Tone class 3.4 for instance, still in Middle Japanese, is
marked with tone marks in the old rongi material.4
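Read literally, the restriction suggested by the old rongi data can be sketched as a rule over a string of /H/ and /L/ tones. The function name and the string encoding are assumptions made for this illustration, the input sequences in the usage note are schematic rather than attested class values, and the tone of a following particle has to be included in the string if a word-final /H/ tone is to count as preceding /L/.

```python
def restrict_h_tones(tones: str) -> str:
    """Keep /H/ only where it immediately precedes /L/.

    tones is a string of "H" and "L", one character per syllable
    (hypothetical encoding). Every other /H/ tone is lost, that is,
    replaced by "L".
    """
    result = []
    for i, tone in enumerate(tones):
        precedes_l = i + 1 < len(tones) and tones[i + 1] == "L"
        result.append("H" if tone == "H" and precedes_l else "L")
    return "".join(result)
```

Under this rule a schematic sequence HHL comes out as LHL and HHHL as LLHL, while LHL is left unchanged.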
These markings do not agree with the pitch assignment rules of the Tōkyō type
dialects that nowadays directly surround the Kyōto type dialects, where tone class
3.4 is realized with ' pitch, indicating that these rules developed later
(although before the Kyōto shift).
A more important problem with the old rongi data is that such a radical reduction
of all /H/ tones except the last one per word would have resulted in the merger of
tone class 3.4 with tone class 3.2, while in fact, in Kyōto itself as well as in many
other Kyōto type dialects, tone classes 3.2 and 3.4 are kept – at least partially –
separate.5
4.1.2 The development of [M] pitch
I think it is possible to reconcile this apparent contradiction between the material
from the 14th century and the merger pattern of the modern Kyōto type dialects if we
reconstruct a stage in which – temporarily – [M] pitch played a role.
Both phonetic and phonological studies have shown that a [HL] interval is
subject to F0 polarization: A /H/ tone will often be significantly greater in height
when followed by /L/ tone, and such a pre-L /H/ tone may be raised to a contrastive
super-/H/ toneme (Hyman, 2007:3). I assume that this is how the /H/ tone restriction
in Middle Japanese started out: Pre-L /H/ tones became greater in height than other
/H/ tones, which we can analyze as a lowering to [M] pitch of all /H/ tones that did
not precede /L/ tone.
How should the [M] pitch reconstructed at this stage of the language be
analyzed? There can be no doubt that it started out as a subphonemic variant of /H/
tone, but it is clear that at some point /H/ before /H/ was truly lost. Should [M] pitch
at this stage be analyzed as /H/, as a new toneme /M/, or as something else?
As I have explained in the introduction, it makes sense to treat the modern
Japanese Tōkyō type dialects as restricted tone systems with /H/ vs. Ø opposition. In
order for /H/ tone to become marked and for /L/ tone to develop into the default or Ø
tone, a logical first stage is the reduction of the number of /H/ tones per word, so that
the remaining syllables with /H/ tone become highlighted over others. This is also
how the historical development from the richer tonal oppositions of proto-Bantu to
4 The Sino-Japanese words contained in the old rongi material show a similar restriction of the
number of /H/ tones per word: > , > , > , >
, > (adapted from Sakurai, 1976:173).
5 From the history of the rongi ceremonies, it becomes clear that the rongi tradition did not
develop in Kyōto, but in monasteries on Mount Kōya and Mount Negoro, both located on the
Kii peninsula in Wakayama prefecture. It is of course possible that the development towards a
restricted tone system in this area differed somewhat from the development in Kyōto. However,
the merger of tone class 3.2 with class 3.7 instead of with class 3.4, can also be found in several
Kyōto type dialects on the Kii peninsula, such as Gojō, Tanabe, Arida and Hongū. (See section
2.3.5.)
4.1 The developments in the Nairin and Chūrin type dialects 101
the restricted or accent-like tone systems of a number of modern Bantu languages is
thought to have taken place, as the more the number of /H/ tones per word becomes
restricted, the more accent-like the remaining /H/ tones become.
The following is a description by Beckman (1986) of the process by which tone
sequences might be rephonologized into culminative accent patterns:
Suppose that a language has predominantly bisyllabic roots. There are only
four contrasting patterns that these roots may have, namely, HH, HL, LH and
LL. Then the language need only lose the HH stems (perhaps through a
sound change turning all *HH stems to HL stems) in order for the roots to be
subject to reanalysis as having first syllable accent (HL), second syllable
accent (LH) or no accent (LL). Since the language has affixes whose tones
are determined to a large extent by the tones of the roots, the language now
has culminative accent placement organizing the utterance into words or
larger sense-groups.
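The reanalysis Beckman describes can be made concrete with a small Python sketch. It is only an illustration of the quoted scenario: the function names `merge_hh` and `accent_position` are hypothetical, and the sound change *HH > HL is the one Beckman offers as a possibility, not an attested change.

```python
# Illustrative sketch of the reanalysis in Beckman's scenario.
# All names here are hypothetical and exist only for this example.

def merge_hh(pattern: str) -> str:
    """Apply the hypothetical sound change *HH > HL; other patterns are unchanged."""
    return "HL" if pattern == "HH" else pattern


def accent_position(pattern: str):
    """Read a bisyllabic tone pattern as an accent location.

    Returns the 1-based index of the syllable carrying H,
    or None when the pattern has no H (no accent).
    """
    if "H" not in pattern:
        return None
    return pattern.index("H") + 1


roots = ["HH", "HL", "LH", "LL"]                      # the four contrasting patterns
after_change = sorted({merge_hh(p) for p in roots})   # HH has merged into HL
analysis = {p: accent_position(p) for p in after_change}
print(analysis)  # {'HL': 1, 'LH': 2, 'LL': None}
```

Once HH is gone, every remaining pattern contains at most one H, which is what allows the position of that H to be read as an accent.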
Although those Bantu languages where /H/ tone has accent-like qualities are
nowadays more often analyzed in terms of restricted tone, the process which
Beckman describes remains the same, and the fact that the 14th century intermediate
stage between Middle Japanese and the modern dialects shows a radical restriction
of the number of /H/ tones per word, confirms the idea that the developments in
Japanese were similar to the developments in a number of Bantu languages.
When register tone systems have a default or Ø tone, this is usually [L] in a two-tone
system, so that there is /H/ vs. Ø opposition. There are, however, also languages
that are analyzed as having /H/ vs. /L/ vs. Ø opposition, such as Yoruba (Hyman,
2001:237), where Ø is realized as [M].
As I regard the lowering of /H/ before /H/ as the first step in the reduction of all
tones except /H/ before /L/ to Ø tone, I will analyze the three tone levels that I
reconstruct at the intermediate stage, namely [H], [L] and [M] as /H/, /L/ and Ø tone
respectively. The reduction of all /L/ tones to Ø tone as well (i.e. to tones whose
realization is governed by automatic pitch assignment rules), occurred at a later
stage.
In the old rongi materials, the [L] tone mark was selected to mark the newly
developed [M] pitch. It has to be remembered that [M] pitch had not played a role in
the tone system until then, and that there was no appropriate mark to accommodate it,
so that compromises had to be made.6
6 This situation is different from the situation described in section 2.3.2 concerning the
reconstruction of /M/ tone in Middle Japanese by Hayata. According to Hayata, these tonemes
formed part of the language from the very beginning, so already at the time when the method of
marking the tones of Japanese by means of tone dots was developed. It is hard to imagine that
one of the three tonemes of the language (/M/ tone must have been distinctive as it left traces in
all modern dialects) would have been so completely overlooked. The [M] tones that I
reconstruct in the transitional period of the 14th century on the other hand, had to be
accommodated by an already existing marking system.
In tables (1) to (4), I have presented the developments as representative of the
Nairin type tone system. This is because the old rongi material is the most extensive
material from the intermediate period, and this material (most likely) stems from the
MJ ‘Nairin’ area. (This cannot be ascertained as in the rongi materials tone class 1.2
has not been attested with a case particle.)
I assume that the developments in the Chūrin type dialects (apart from the tone
spreading on the particle after class 1.2) were largely the same. Stage 1 has, of
course, been attested in the tone dot material. Apart from the old rongi material, the
reconstruction of stage 2 is based on Gyōa’s spelling system and the merger pattern
of the Kyōto type dialects.
(1) /H/ tone restriction in the MJ ‘Nairin’ tone system

Pitch assignment rules (Stage 2):7 Ø = [M], /L/ after /R/ = [H], /L/ after [R:] = [L]

Class   MJ ‘Nairin’ (Stage 1)       Nairin (Stage 2)
2.1     /LL-L/                      /LL-L/
2.2     /LH-L/                      /LH-L/
2.3     /HH-L/                  >   /ØH-L/
2.4     /HL-L/                      /HL-L/
2.5     /HR-L/                  >   /HR-L/
3.1     /LLL-L/                     /LLL-L/
3.2     /LLH-L/                     /LLH-L/
3.4     /HHH-L/                 >   /ØØH-L/
3.5     /HHL-L/                 >   /ØHL-L/
3.6     /HLL-L/                     /HLL-L/
3.7     /HLH-L/                     /HLH-L/
In stage 2, /H/ tone has become marked in comparison to /L/ tone and Ø tone, but an
active /L/ tone still plays an important role in the tone system, preventing – among
other things – a merger of tone classes 3.2 and 3.4 and 2.2 and 2.3.
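The phonemic side of the change from stage 1 to stage 2 can be sketched as a single rule applied to the forms in (1). The Python below is a minimal illustration under stated assumptions, not a definitive formalization: the name `restrict_h` is hypothetical, forms are written as noun plus /L/ particle, and rising /R/ is assumed to begin low, so that an /H/ directly before /R/ counts as preceding /L/.

```python
# Sketch of the /H/ tone restriction: an /H/ survives only if the next tone
# begins low. Assumption (not stated in this form in the text): /R/ begins low.

def restrict_h(form: str) -> str:
    """Turn every /H/ that does not immediately precede /L/ (or /R/) into Ø.

    `form` is a phonemic string such as "HHH-L" (noun, hyphen, particle).
    A form-final /H/ with nothing after it would also become Ø here.
    """
    tones = [t for t in form if t != "-"]
    out = []
    for i, tone in enumerate(tones):
        following = tones[i + 1] if i + 1 < len(tones) else None
        if tone == "H" and following not in ("L", "R"):
            out.append("Ø")
        else:
            out.append(tone)
    if "-" not in form:
        return "".join(out)
    cut = form.index("-")  # one character per tone, so this is the noun length
    return "".join(out[:cut]) + "-" + "".join(out[cut:])


stage1 = {
    "2.1": "LL-L", "2.2": "LH-L", "2.3": "HH-L", "2.4": "HL-L", "2.5": "HR-L",
    "3.1": "LLL-L", "3.2": "LLH-L", "3.4": "HHH-L", "3.5": "HHL-L",
    "3.6": "HLL-L", "3.7": "HLH-L",
}
stage2 = {cls: restrict_h(form) for cls, form in stage1.items()}

print(stage2["3.4"])  # ØØH-L
print(stage2["3.2"])  # LLH-L
```

Because the rule leaves /L/ untouched, classes 3.2 and 3.4 (and 2.2 and 2.3) still differ after it has applied.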
7 Varieties of Middle Japanese that show tone spreading after tone class 2.5 also seem to have
such tone spreading after tone class 1.2 (but class 1.2 is not always attested with a particle).
However, it is possible that there were dialects in which tone spreading did take place after
class 2.5, while it did not take place after class 1.2: In central Japan, monosyllables were
automatically lengthened, so that loss of vowel length in contour tones may have affected the
final /R/ of tone class 2.5, while at the same time failing to affect the /R/ tone (= [R:]) of class
1.2. The tone system that formed the basis of the rongi material may have been of this type:
Tone spreading after class 2.5 is clearly attested, but as the rongi material most likely stems
from the old Nairin area, I assume that tone spreading did not occur after class 1.2. The old
rongi pitch assignment rules are therefore presented as follows: /L/ after [R] = [H], /L/ after
[R:] = [L].
4.1.3 The development of /H/ tone anticipation
and a %L phrase boundary tone
The occurrence of /H/ tone has become rare, causing the remaining /H/ tones to
become accent-like. Once accent-like /H/ tones have developed, the tone of syllables
with Ø tone may assimilate to the tone of syllables with /H/ tone. This usually means
that syllables preceding /H/ tone may anticipate that /H/ tone, a process called ‘tonal
anticipation’ or ‘high tone anticipation’ (HTA).
This is, for instance, the case in Luganda (Hyman, 1978:264), where all /H/ tones
but the last in a phrase are reduced, and then automatic [H] pitch is assigned to all
syllables (except the phrase-initial syllable) up to and including the /H/ tone, e.g.,
kìkópò (LHL) ‘cup’ and mùkázì (LHL) ‘woman’ combine and undergo tonal
reduction to yield intermediate kìkòpò kyàà mùkázì ‘the cup of the woman’, which
then becomes kìkópó kyáá múkázì [LHH HH HHL]. The generalization which
appears to hold across the Bantu languages is: “The more accent-like a /H/ tone is,
the more likely tonal anticipation will occur” (Hyman 1978:264).
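The two steps of the Luganda example (reduction of all /H/ tones but the last, followed by anticipation) can be traced with a short Python sketch. This is an illustration only: the function name `reduce_and_anticipate` is hypothetical, each character stands for one tone-bearing unit, and the underlying LL of the connective kyàà is taken from the intermediate form cited above.

```python
# Sketch of tonal reduction followed by high tone anticipation (HTA).
# One character (H or L) per tone-bearing unit; names are hypothetical.

def reduce_and_anticipate(words):
    """Return the surface tone string of each word in the phrase."""
    tones = list("".join(words))
    if "H" not in tones:
        return list(words)  # no /H/ in the phrase, nothing to anticipate
    last_h = len(tones) - 1 - tones[::-1].index("H")
    # Step 1: reduce every /H/ except the last one in the phrase.
    reduced = ["L" if (t == "H" and i != last_h) else t
               for i, t in enumerate(tones)]
    # Step 2: assign [H] to all units after the phrase-initial one,
    # up to and including the remaining /H/.
    surface = ["H" if 1 <= i <= last_h else t for i, t in enumerate(reduced)]
    # Split the phrase back into words.
    result, pos = [], 0
    for word in words:
        result.append("".join(surface[pos:pos + len(word)]))
        pos += len(word)
    return result


# kìkópò 'cup', kyàà (connective), mùkázì 'woman'
print(reduce_and_anticipate(["LHL", "LL", "LHL"]))  # ['LHH', 'HH', 'HHL']
```

The phrase-initial unit is left out of step 2, which reproduces the [LHH HH HHL] contour given for ‘the cup of the woman’.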
Another principle of historical tonology is the principle of pause as /L/ tone
(Hyman, 1978: 265).8 A pause boundary can at any time cause a lowering of an
adjacent [H] or other non-low (such as [M]) tone. When the lowering effect of pause
becomes a rule, the language has created a %L phrase boundary tone. As we have
seen, the pitch assignment rules of the Tōkyō type dialects that surround Kyōto all
include such a %L phrase boundary tone.
In stage 2 above, the Ø tones were realized with [M] pitch. In the next stage,
stage 3, the phonetic realization of Ø tone starts to be governed by different rules. In
phrase-internal position Ø tone preceding /H/ tone is now realized with [H] pitch,
but in phrase-initial position the /H/ tone anticipation is blocked by the %L phrase
boundary tone, and Ø tone is realized with [L] pitch.
Now that sequences of /H/ tones had disappeared from the language, there was
only one level tone class left. In Middle Japanese this tone class had been level /L/,
but after the contrasting level /H/ tone class had disappeared, this tone class was
characterized by its unique level tone contour rather than its relative tone height. It
can therefore be analyzed as having Ø tone instead of /L/ tone.
The difference in pitch between [H] and [L] in modern Japanese is small. With
only two pitch levels left, and with the distinction between level [L] and level [H]
tonal phrases gone, a large tonal difference between [L] and [H] became superfluous.
It is therefore possible that the difference in pitch height between [H] and [L] was
reduced at this stage.
Once a /H/ tone has become accent-like, the accent-like /H/ may change in order
to become more prominent. What this usually means is that a level tone will become
a contour tone (Hyman, 1978). Hyman gives an example from Haya (a Bantu
8 The lowering effect of pause could be seen in the Luganda example as well, as the phrase-
initial syllable is exempt from the anticipatory raising, which is similar to the situation in
modern Tōkyō type Japanese.
language with a restricted tone system) where historical *ómùkónò (*HLHL) ‘arm’
has become òmùkōnò (LLFL). It is possible that the accent-like /H/ tones of
Japanese at this stage likewise developed a [F] tone contour, especially in phrase-
final position. (In various modern Japanese dialects /H/ tone is still realized as [F] in
phrase-final position.) It is possible that the merger of tone classes 1.3a and 1.3b –
which appears to have occurred around this time – is connected to this development.
The reconstruction of stage 3 is based on universals of tone rules, the merger
pattern of the Kyōto type dialects, and the pitch assignment rules of the central
Japanese Tōkyō type dialects.
(2) /H/ tone anticipation and %L phrase boundary tone reduce most /L/ tones to Ø tone

Pitch assignment rules:
Nairin (Stage 2): Ø = [M], /L/ after /R/ = [H], /L/ after [R:] = [L]
Nairin (Stage 3): phrase-initial Ø = [L], Ø before /H/ and after /R/ = [H], all other Ø = [L]

Class   Nairin (Stage 2)            Nairin (Stage 3)
2.1     /LL-L/                      /ØØ-Ø/
2.2     /LH-L/                      /LH-Ø/
2.3     /ØH-L/                  >   /ØH-Ø/
2.4     /HL-L/                      /HØ-Ø/
2.5     /HR-L/                      /HR-Ø/
3.1     /LLL-L/                     /ØØØ-Ø/
3.2     /LLH-L/                     /ØLH-Ø/
3.4     /ØØH-L/                 >   /ØØH-Ø/
3.5     /ØHL-L/                 >   /ØHØ-Ø/
3.6     /HLL-L/                     /HØØ-Ø/
3.7     /HLH-L/                     /HØH-Ø/
4.1.4 Analogy in the tone classes that lack /H/ tone
At the next stage, stage 4, the automatic rise after the phrase-initial syllable in
words that contained /H/ tone (which had developed as the result of the %L phrase
boundary tone in combination with /H/ tone anticipation) was generalized and now
also applied to the tone classes that lacked /H/ tone.
Stage 4 is also the stage around which the leftward tone shift in Kyōto must have
taken place: The automatic rise in pitch after the phrase-initial syllable had taken
shape, explaining why all Tōkyō type dialects that surround Kyōto share such a rule.
In a small number of tone classes (2.2 /LH/, 3.2 /ØLH/ and 4.2 /ØØLH/) the few
remaining /L/ tones in the system still prevented the rise to [H] pitch from starting
after the phrase-initial syllable, but by now, lack of a rise in pitch after the phrase-
initial syllable had become exceptional.
(3) Development of analogical rise to [H] pitch in tone classes that lack /H/ tone

Pitch assignment rules:
Nairin (Stage 3): phrase-initial Ø = [L], Ø before /H/ and after /R/ = [H], all other Ø = [L]
Nairin (Stage 4): Ø after /H/ = [L], phrase-initial Ø = [L], all other Ø = [H]

Class   Nairin (Stage 3)            Nairin (Stage 4)
2.1     /ØØ-Ø/                  >   /ØØ-Ø/
2.2     /LH-Ø/                      /LH-Ø/
2.3     /ØH-Ø/                      /ØH-Ø/
2.4     /HØ-Ø/                      /HØ-Ø/
2.5     /HR-Ø/                      /HR-Ø/
3.1     /ØØØ-Ø/                 >   /ØØØ-Ø/
3.2     /ØLH-Ø/                     /ØLH-Ø/
3.4     /ØØH-Ø/                     /ØØH-Ø/
3.5     /ØHØ-Ø/                     /ØHØ-Ø/
3.6     /HØØ-Ø/                     /HØØ-Ø/
3.7     /HØH-Ø/                     /HØH-Ø/
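The stage 4 pitch assignment rules listed with (3) can be read as an ordered procedure. The following Python sketch shows one such reading and is not a definitive implementation: the name `stage4_pitch` is hypothetical, "Ø after /H/" is interpreted as any Ø later in the word than an /H/ (an assumption), and forms containing /R/ are left out because the rules given with (3) do not mention /R/.

```python
# Sketch of the stage 4 rules, applied in this order (an assumed reading):
#   1. Ø after /H/      -> [L]
#   2. phrase-initial Ø -> [L]
#   3. all other Ø      -> [H]
# /H/ and /L/ surface as [H] and [L]. Forms with /R/ are not handled.

def stage4_pitch(form: str, phrase_initial: bool = True) -> str:
    """Return one pitch letter per tone of `form` (the hyphen is dropped)."""
    tones = [t for t in form if t != "-"]
    pitches = []
    seen_h = False
    for i, tone in enumerate(tones):
        if tone in ("H", "L"):
            pitches.append(tone)
            seen_h = seen_h or tone == "H"
        elif tone == "Ø":
            if seen_h:
                pitches.append("L")          # rule 1
            elif i == 0 and phrase_initial:
                pitches.append("L")          # rule 2
            else:
                pitches.append("H")          # rule 3
        else:
            raise ValueError(f"tone {tone!r} is not covered by this sketch")
    return "".join(pitches)


print(stage4_pitch("ØØ-Ø"))                        # LHH
print(stage4_pitch("ØØ-Ø", phrase_initial=False))  # HHH
print(stage4_pitch("ØH-Ø"))                        # LHL
print(stage4_pitch("ØLH-Ø"))                       # LLHL
```

On this reading the classes without /H/ tone rise after the phrase-initial syllable, while the /L/ of class 3.2 still keeps the second syllable low.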
The cause behind the development of the automatic rise in pitch after the phrase-
initial syllable in tone classes with Ø tone, is most likely analogy: In dialects that do
not have /H/ tone anticipation blocked by a %L phrase boundary tone the rule is
lacking.
In Akita and Aomori for instance, where there is no /H/ tone anticipation so that
only the syllable with the /H/ tone itself has [H] pitch, there is no rise in pitch after
the phrase-initial syllable in words with Ø tone. In the Totsukawa dialects there is
/H/ tone anticipation, but in villages where the pitch of the initial syllable of words
that contain /H/ tone is not conditioned by the position in the phrase but in free
variation (i.e. 2.2/3 '-~'-, 3.5 '-~'-), the classes with Ø
tone have level pitch. (These villages are Oritachi, Hiratani, Shigesato, Komori and
Kamikuzukawa.)9
In the case of other villages, however, in which the tone classes that contain /H/ tone
do have automatic phrase-initial [L] pitch (i.e. 2.2/3 %'-, 3.5 %'-),
Ikuta (1951) indicates as the pitch shape of words with Ø tone (his pitch
indications are visual, consisting of rising and falling lines), while Yamana (1951)
describes their pitch as , and Hirayama (1979) as .10 (These villages are
9 In Yamana’s description (1951) the tone classes with Ø tone have [L] level pitch. In Ikuta’s
description (1951) they are level but it is not clear whether they have [H], [L] or [M] pitch, and
in Hirayama’s description (1979) they are not mentioned. In Yamana’s description both classes
2.1 and 2.2/3 and classes 3.1 and 3.4 are simply described as “level”, but because monosyllabic
case particles have a lower pitch when attached to tone classes 2.2/3 and 3.4, we can analyze
these classes as level [H], and classes 2.1 and 3.1 as level [L].
10 Hirayama’s analysis of the pitches of words with Ø tone as rather than may be partly
based on the assumption that the Totsukawa tone system developed from the Kyōto type,
where the pitch of the Ø tone classes is analyzed as level [H], so that the historical development
would have been > .
Kamiyunokawa, Uenoji, Kotsumoze and Kazaya, which can be found in the outer
ring of the Totsukawa area.)
4.1.5 /H/ tone anticipation affects the remaining /L/ tones
At the next stage, stage 5, the /H/ tone anticipation was generalized, affecting the
few remaining /L/ tones in the system as well, reducing them to Ø tone. This caused
tone classes 2.2 and 2.3 and 3.2 and 3.4 to merge. As this merger can be found in all
Tōkyō type dialects that surround Kyōto, the generalization of the /H/ tone
anticipation must have occurred independently in these dialects. (Stage 5 has been
preserved in Nozaki, except that in Nozaki segmental features later started to
influence the tone system. See 6.2 and subsections.)
Apart from the /R/ tone on the final syllable of tone classes 1.2 and 2.5, all other
occurrences of [H] and [L] were now determined by a tone system that included /H/
and Ø tone only.
Most Tōkyō type dialects eventually lost the final /R/ tone, as well as the
possibility of two non-consecutive /H/ tones per word (such as in class 3.7), but both
features have been preserved in the dialect of Nozaki.
(4) /H/ tone anticipation eliminates the remaining /L/ tones

Pitch assignment rules:
Nairin (Stage 4): Ø after /H/ = [L], phrase-initial Ø = [L], Ø after /R/ = [H], all other Ø = [H]
Nairin (Stage 5): Ø after /H/ = [L], phrase-initial Ø = [L], Ø after /R/ = [H], all other Ø = [H]

Class   Nairin (Stage 4)            Nairin (Stage 5)
2.1     /ØØ-Ø/                      /ØØ-Ø/
2.2     /LH-Ø/                  >   /ØH-Ø/
2.3     /ØH-Ø/                      /ØH-Ø/
2.4     /HØ-Ø/                      /HØ-Ø/
2.5     /HR-Ø/                      /HR-Ø/
3.1     /ØØØ-Ø/                     /ØØØ-Ø/
3.2     /ØLH-Ø/                 >   /ØØH-Ø/
3.4     /ØØH-Ø/                     /ØØH-Ø/
3.5     /ØHØ-Ø/                     /ØHØ-Ø/
3.6     /HØØ-Ø/                     /HØØ-Ø/
3.7     /HØH-Ø/                     /HØH-Ø/
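The mergers that follow from stage 5 can be checked mechanically from the phonemic forms in (4). The Python sketch below is only an illustration: the name `eliminate_l` is hypothetical, and the change is modelled simply as the replacement of every remaining /L/ by Ø.

```python
# Sketch: reduce the remaining /L/ tones to Ø and list the classes that
# end up with identical phonemic forms. Names are hypothetical.
from collections import defaultdict

stage4 = {
    "2.1": "ØØ-Ø", "2.2": "LH-Ø", "2.3": "ØH-Ø", "2.4": "HØ-Ø", "2.5": "HR-Ø",
    "3.1": "ØØØ-Ø", "3.2": "ØLH-Ø", "3.4": "ØØH-Ø", "3.5": "ØHØ-Ø",
    "3.6": "HØØ-Ø", "3.7": "HØH-Ø",
}


def eliminate_l(form: str) -> str:
    """Model stage 5: every remaining /L/ becomes Ø."""
    return form.replace("L", "Ø")


stage5 = {cls: eliminate_l(form) for cls, form in stage4.items()}

# Group tone classes by their stage 5 form to find the mergers.
by_form = defaultdict(list)
for cls, form in stage5.items():
    by_form[form].append(cls)
mergers = [classes for classes in by_form.values() if len(classes) > 1]

print(mergers)  # [['2.2', '2.3'], ['3.2', '3.4']]
```

Only the two pairs named in the text fall together. Classes 2.5 and 3.7 stay distinct at this stage because /R/ and the second /H/ are still present.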
4.2 The developments in the Kyōto type dialects
As has been mentioned in section 4.1.3, when a /H/ tone becomes accent-like it may
develop a falling tone contour in order to become more prominent. Such a [F] tone
contour, in combination with tonal anticipation, can lead to a leftward shift (or
retraction) of the /H/ tone: An example is the verb ‘to see’ from Kirundi and
Kinyarwanda, where anticipation regularly occurs. Hyman presents the historical
derivation as follows: *kùbónà (LHL) > kùbōnà (LFL) > kúbōnà (HFL) > kúbònà
(HLL) ‘to see’.
The changes involved are: (1) The /H/ tone becomes [F] tone, probably because
it is interpreted as an accent. (2) The [F] tone is then anticipated on the preceding
syllable which becomes [H]. (3) The [F] tone then becomes [L] after this [H] tone,
which Hyman (1978) describes as a ‘left absorption process’. The result is a leftward
shift of the /H/ tone of ‘to see’.
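The three changes can be traced step by step in a small Python sketch. It is an illustration of the derivation cited from Hyman, not a general model of tone shift: the function name `leftward_shift` is hypothetical, and the input is assumed to contain exactly one H that is not on the first syllable.

```python
# Sketch of the derivation LHL > LFL > HFL > HLL.
# One letter per syllable: H = high, L = low, F = falling.

def leftward_shift(tones):
    """Return the tone pattern at each stage of the leftward shift."""
    current = list(tones)
    i = current.index("H")
    if i == 0:
        raise ValueError("the H must have a preceding syllable to shift onto")
    stages = ["".join(current)]
    current[i] = "F"          # (1) the accent-like H becomes falling
    stages.append("".join(current))
    current[i - 1] = "H"      # (2) the fall is anticipated on the preceding syllable
    stages.append("".join(current))
    current[i] = "L"          # (3) left absorption: F becomes L after the new H
    stages.append("".join(current))
    return stages


print(leftward_shift("LHL"))  # ['LHL', 'LFL', 'HFL', 'HLL']
```

The output corresponds to *kùbónà, kùbōnà, kúbōnà and the final form with initial H.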
Schadeberg (1977), too, has shown with numerous examples from Bantu languages
that tonal anticipation by spreading, absorption, shifting and displacement occurs as
a natural process in many languages. Such processes are found in ‘restricted’ or
‘accentual’ /H/ vs. Ø tone systems, i.e. those tone systems where there has been
partial reduction of the proto tonal oppositions.
In general, however, perseverative tone assimilations are more natural than
anticipatory tone assimilations, and one of the conclusions of Hyman & Schuh was
that where tone spreading is anticipatory, more than the natural phonetic tendencies
must have been implicated (1974:103–105). In West African languages for instance,
which generally have more than one active tone, horizontal tone anticipation is
almost non-existent.
The fact that leftward tone shifts are relatively unusual has been one of the
reasons why Ramsey’s theory has failed to find acceptance. As an argument against
Ramsey’s theory, Matsumori (1993:43) for instance, mentions the fact that tone is
more likely to shift to the right than to the left, and based on this tendency she
regards Ramsey’s idea of a leftward tone shift in the Kyōto area as unnatural.11 But
as the examples from Bantu languages with restricted tone systems have shown, this
relative unnaturalness should not be elevated to the level of an absolute impossibility.
There is, moreover, at least one example from Japanese itself of a leftward
tone shift that is acknowledged even in the standard theory. This concerns the shift
of the tone of class 3.4 in the dialect of Kyōto from ' to ' sometime after
the 17th century. (See section 1.3.) Furthermore, a comparison between the Korean
dialects of South Hamkyeng and Kyengsang, combined with the Middle Korean data
(Ramsey, 1978:79, 82) shows that a leftward shift must have occurred in the
Kyengsang dialect. (As the reconstruction of a leftward shift in the Kyengsang
dialect does not conflict with the interpretation of the Middle Korean data, it is not
controversial.)12
11 As has been shown in section 2.5, Matsumori reconstructs a proto-Japanese tone system which
– as far as tone classes 1.2, 1.3, 2.2, 2.3 and 2.4 are concerned – coincides with Ramsey’s
reconstruction of the tone system of Middle Japanese. (Her reconstruction of the tones of the
trisyllabic nouns with their wider tonal possibilities is quite different.) In a footnote (1993:74)
Matsumori stresses that this similarity with Ramsey’s reconstruction is a coincidence, and
again rejects Ramsey’s theory because of the unnaturalness of a leftward tone shift.
12 It is not surprising that Ramsey, who analyzed the changes that took place in the Kyengsang
The leftward shift in the Kyōto area may not have developed in exactly the same
way as in the examples of Kirundi and Kinyarwanda above, but the examples from
Bantu, Korean and tone class 3.4 may nevertheless function as a reminder that the
unnaturalness of a leftward tone shift should not be exaggerated.
It also has to be remembered that the leftward tone shift in the Kyōto type
dialects is confined to a central area, meaning that it can be explained as a one-time
occurrence, which was subsequently able to spread to a larger area due to the high
status of the dialect of the capital.
4.2.1 How the leftward shift created the /L/ toneme in modern Kyōto
The tone system of the modern Kyōto type dialects can be analyzed in terms of /H/,
Ø and /L/ tone. The fact that Kyōto has a separate toneme /L/ (which is limited to
the initial syllable of the word), makes the tone system of the Kyōto type dialects
fundamentally different from the tone systems of the Tōkyō type dialects.
Usually, /L/ tone in Kyōto is marked by means of a mark similar to that used for the /H/ tone
in the word (except that it is added before the initial syllable). This is because in
most cases, initial /L/ tone in Kyōto corresponds to /H/ tone on the initial syllable in
the Tōkyō type dialects. In Tōkyō the /H/ tone is followed by a drop to [L] pitch on
the following Ø tone, but in Kyōto initial /L/ represents a pitch fall only in an
abstract morphophonemic sense. Unless, for instance, a demonstrative is added before
a word that starts with /L/ tone, there is no audible pitch fall.
But this is nevertheless one of the origins of the distinctive initial /L/ tone: It
developed from a pitch fall after the first syllable which was shifted to the left. There
was only a syllable available for the [L] pitch of the Ø tone that followed /H/. The
preceding /H/ tone that had conditioned this [L] pitch was eliminated; shifted off the
word. The [L] pitch of the Ø tone had been conditioned before the shift, but when it
landed on the initial syllable after the shift, it was no longer automatic. It was this
change in circumstances which transformed it into a /L/ toneme. The leftward shift,
in other words, provides precisely the kind of conditions that could make Ø tone
split into Ø tone and /L/ tone.
Not all cases of modern Kyōto /L/ tone, however, developed from Ø tone. As we
shall see in the next section, the /L/ register that tone class 3.2 has in many Kyōto
type dialects developed from the rare /L/ tones that had remained in the tone system
by stage 4, which were now shifted onto the initial syllable.
In these cases, the resulting /L/ tone developed from /L/ tone in Middle Japanese
and can be regarded as ‘old’. In the majority of cases however, the /L/ register is not
‘old’, as it did not derive from the /L/ tone of Middle Japanese, but from the Ø tone
of the modern restricted tone systems.
dialects in his dissertation, saw no objection to the reconstruction of a leftward shift in the
Kyōto area if this could solve the geographical dilemma, and explain the phonological
alternations in the Kyōto type dialects discussed in chapter 5.
The result of the leftward shift was that a tonal feature (namely /L/ tone) that was
disappearing from the language by the time of the shift, was recreated, but with the
important difference that it was now restricted to the initial syllable of the word.
The modern Kyōto tone system which distinguishes between /H/, Ø and /L/ tone
developed relatively recently. It is fundamentally different from the Middle Japanese
tone system, as Middle Japanese had two equally active tones /H/ and /L/, where /H/
tone was not more marked than /L/ tone, and /L/ tone was not restricted to the initial
syllable.
4.2.2 The origin of the mixed reflexes of tone class 3.2
in the Kyōto type dialects of central Honshū
I have already mentioned in section 3.1.6 that many Kyōto type dialects have
'' reflexes for tone class 3.2. Although some Kyōto type dialects on Honshū,
such as Ōsaka and Wakayama are said to have ' tone as the reflex of class 3.2,
in fact the reflexes of tone class 3.2 appear to be mixed in all Kyōto type dialects on
Honshū for which I have found data. The Kyōto type dialects that have mixed
reflexes for tone class 3.2 ('' as well as '), must have split off around the
transition from stage 4 to stage 5.
When a phonetic category splits, there often is a pattern in the distribution of the
variants after the split, for instance based on semantic or syntactic grounds.13 In case
of the two reflexes of tone class 3.2 however, there is no such pattern in the
distribution, and the most likely origin of the mixed reflexes in my opinion is
therefore dialect interference.14
Especially if tone class 3.2 had %~% tone in free variation at the
time of the shift, different communities may have developed different tonal
outcomes ( > ', > ''), even if they were in close proximity
to one another. Dialect interference may subsequently have led to irregularities in
the reflexes of class 3.2.
4.2.3 The realization of classes with all Ø tone as [H] in the Kyōto type dialects
In the pitch assignment rules applying to Ø tone in Kyōto, the automatic %L phrase
boundary tone is lacking. As we have seen, when a word starts with [L] pitch in
13 This appears to have happened in Kōchi, where nouns that had tone in Middle Japanese
tend to have '' tone, whereas adjectives and verbs that had tone in Middle
Japanese have ' tone.
14 Some scholars propose that the two different reflexes in the Kyōto type dialects go back to the
two distinct subclasses 3.2a and 3.2b of proto-Japanese (see section 8.1.5). This idea is highly
unlikely, as the different Kyōto type dialects among themselves do not agree on which words
have which reflex, which is an indication that the split in the reflexes must have a different
origin. It would also mean that the modern Kyōto type tone systems of Honshū did not evolve
from any of the attested varieties of Middle Japanese (a division in tone class 3.2 has not been
preserved in any of the known tone dot material) but from some unattested variety of Middle
Japanese, in which this proto-Japanese subclass division had been preserved.
Kyōto, this is distinctive. The loss of the %L phrase boundary tone – which can be
found in all Tōkyō type dialects that surround Kyōto – can be explained as a result
of the leftward tone shift: The automatic phrase-initial [L] tone was shifted off the
word: 2.1 %- > %-, 3.4 %- > %- etc. As a result, the
tone classes with all Ø tone are realized with [H] pitch in Kyōto.
Tone classes with initial /L/ tone followed by Ø tone in Kyōto have an automatic
phrase-final rise to [H] pitch. It is this rise which signals the fact that these classes
start with /L/ tone. The distinction between tone classes with initial /L/ tone
followed by Ø tone, and tone classes with all Ø tone therefore, concerns a difference
in contour: [L] followed by phrase-final [H] pitch, vs. level [H] pitch.15
4.2.4 The reason why the distinct tone classes 1.2, 2.5 and 3.7
were lost in Tōkyō but preserved in Kyōto
Many Kyōto type dialects have preserved tone classes 1.2, 2.5 and 3.7 as distinct
tone classes. In most Tōkyō type tone systems on the other hand, these tone classes
have merged with one or another of the other tone classes.
The fact that Kyōto has preserved more distinctions has played an important role
in the development of the idea that the Kyōto type tone systems are more archaic
than the Tōkyō type tone systems. If Ramsey’s reconstruction of the Middle
Japanese tone system is correct, it has to be explained why these classes were lost in
most Tōkyō type dialects, and why Kyōto – on the one hand an innovator – is
conservative as far as the preservation of the separate tone classes 1.2, 2.5 and 3.7 is
concerned.
We have seen that after the 13th century the only contour tone left in Middle
Japanese was /R/ tone, which occurred in class 1.2 and the final syllable of class 2.5
(as well as in verb and adjective forms). In the MJ ‘Gairin’ and MJ ‘Chūrin’ dialects,
the tone of the particles had become high after final /R/ tone on a preceding noun:
1.2 [L-H], 2.5 [HL-H].
When the tone reduction occurred, in which only /H/ before /L/ in Middle
Japanese was preserved as a phonological /H/ tone, class 1.2 developed /Ø/ tone (>
class 1.1), and class 2.5 developed /HØ/ tone (> class 2.4) in areas with Chūrin and
Gairin type tone.
The only dialects in which final /R/ tone could have left a trace are those in areas that had a
MJ ‘Nairin’ type tone system, where the drop to /L/ tone after the final /R/ had still
been preserved at the time when the tone reduction occurred. And we see indeed that
the Noto dialects, which have preserved the final /R/ tone of class 2.5, have a Nairin
type merger pattern in the monosyllabic nouns. Most Nairin type dialects however,
eventually lost the final /R/ tone, which is not surprising in tone systems that were
simplifying towards an ‘accentual’ /H/ vs. Ø opposition.16
15 In Kōchi the automatic rise in the tone classes with /L/ tone followed by Ø tone, will occur
after the initial syllable, in Kyōto at the phrase-final syllable.
16 See section 6.2.6 for an explanation as to why the /R/ toneme in class 2.5 in Nozaki was
Rather, it is the fact that tone classes 1.2 and 2.5 did not merge with other tone
classes in most Kyōto type dialects which requires an explanation. The explanation
is that the leftward tone shift in Kyōto (which occurred before the /R/ contour tone
was lost) transformed final /R/ tone into final /H/ tone. The preservation of classes
1.2 and 2.5 as separate tone classes in Kyōto was therefore an indirect result of the
shift.
As to the loss of the separate tone class 3.7 in most Tōkyō type dialects, the final
/H/ tone of this class in the Gairin type dialects was automatically lost at the time of
the /H/ tone restriction as in these dialects there was no longer a drop to [L] pitch
after the noun. In the Nairin and Chūrin Tōkyō type dialects the /H/ tone on the final
syllable of tone class 3.7 was most likely initially preserved. A consequence of the
fact that /H/ tone became accent-like is that eventually only one /H/ tone per word
was allowed, and in these dialects, eventually only the first /H/ tone in the word
remained. The only Tōkyō type dialect that has preserved the two non-consecutive
/H/ tones that were present in this tone class in Middle Japanese phonetically as well
as phonemically, is the dialect of Nozaki.17
(5) The elimination of /R/ tone and multiple /H/ tones from the tone system of Kyōto

Class   Stage 4                     Modern Kyōto
1.2     /R-Ø/                   >   /H-Ø/
2.5     /HR-Ø/                  >   /LH-Ø/
3.7     /HØH/                   >   /LHØ/
The preservation of class 3.7 in Kyōto, just as the preservation of classes 1.2 and 2.5,
was an indirect result of the shift. In the Kyōto type dialects, the leftward shift
occurred before the final /H/ tone was lost, and again, it was the shift that created the
conditions under which the distinct tone class 3.7 could be preserved: As a result of
the shift, the first /H/ tone of tone class 3.7 was replaced by /L/ tone. From then on
tone class 3.7 was characterized by /L/ register, followed by a single accent-like /H/
tone, which – being the only /H/ tone in the word – was preserved.
In dialects that did not go through the shift, classes 1.2, 2.5 and 3.7 contained
exceptional features: Classes 1.2 and 2.5 contained /R/ tone, and class 3.7 contained
more than one /H/ tone per word. With the disappearance of these special features
these tone classes merged with classes 1.1 or 1.3, and with classes 2.4 and 3.6. In the
Kyōto type dialects on the other hand, the shift had transformed these special
features into a single /H/ tone per word. As the presence of a single /H/ tone in the
word is not an exceptional tonal feature in the modern restricted tone systems, there
preserved, while it was lost in class 1.2.
17 There are a number of other villages on Noto Island that have the same tone system, but in
most Noto dialects, McCawley’s rule caused the first /H/ tone to be realized with [L] pitch.
See section 6.2.2.
was no cause for the occurrence of mergers, and the distinction between the different
tone classes was preserved.
The developments in Kyōto discussed so far are summarized in (6). The stage
exemplified by the tone marks in Bumō-ki has been preserved in Ōsaka and
Wakayama. In Kyōto itself, the following additional changes occurred after the 17th
century: Class 2.4 - > -, class 3.6 - > - > -,
class 3.4 - > -.
The last change also applied to those members of class 3.2 which – like class 3.4
– had - tone after the shift. (As shown in section 2.3.5, Bumō-ki too has
mixed reflexes for class 3.2, both - as well as -.)
6 From stage 4 or 5 to Bumō-ki
Nairin (Stage 4 to 5) Bumō-ki
Pitch assignment rules, Nairin (Stage 4 to 5): Ø after /H/ = [L], phrase initial Ø = [L], Ø after /R/ = [H], all other Ø = [H]
Pitch assignment rules, Bumō-ki: Ø after /H/ = [L], all other Ø = [H]
2.1 %-, - /ØØ-Ø/ > - /ØØ-Ø/
2.2 -~ %-, - /LH-Ø/~ /ØH-Ø/ > - /HØ-Ø/
2.3 %-, - /ØH-Ø/ > - /HØ-Ø/
2.4 - /HØ-Ø/ > - /LØ-Ø/
2.5 - /HR-Ø/ > - /LH-Ø/
3.1 %-, - /ØØØ-Ø/ > - /ØØØ-Ø/
3.2 -~ /ØLH-Ø/ ~ > -, /LHØ-Ø/,
%-, - /ØØH-Ø/ - /ØHØ-Ø/
3.4 %-, - /ØØH-Ø/ > - /ØHØ-Ø/
3.5 %-, - /ØHØ-Ø/ > - /HØØ-Ø/
3.6 - /HØØ-Ø/ > - /LØØ-Ø/
3.7 - /HØH-Ø/ > - /LHØ-Ø/
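The Bumō-ki pitch assignment rules in (6) (Ø after /H/ = [L], all other Ø = [H]) can be tried out with a small Python sketch. Several points are assumptions of the sketch rather than statements from the text: the function name `bumoki_pitch` is hypothetical, /H/ and /L/ are taken to surface as [H] and [L], and "Ø after /H/" is read as "any Ø later in the phrase than a /H/ tone". The /R/ toneme is not handled, since the Bumō-ki column of (6) does not contain it.

```python
def bumoki_pitch(tones: str) -> str:
    """Assign [H]/[L] pitch to a toneme string such as 'LHØ-Ø'.

    Rules modelled (Bumō-ki column of (6)):
      Ø after /H/ = [L], all other Ø = [H].
    Assumptions of this sketch: /H/ = [H], /L/ = [L], and 'after /H/'
    means anywhere later in the phrase. The hyphen marks the boundary
    between noun and particle and is copied unchanged.
    """
    pitches = []
    after_high = False
    for tone in tones:
        if tone == "-":
            pitches.append("-")
        elif tone == "H":
            pitches.append("H")
            after_high = True
        elif tone == "L":
            pitches.append("L")
        elif tone == "Ø":
            pitches.append("L" if after_high else "H")
        else:
            raise ValueError(f"unsupported tone symbol: {tone!r}")
    return "".join(pitches)
```

The strings this function returns are outputs of the sketch, not transcriptions taken from Bumō-ki. For example, `bumoki_pitch("ØHØ-Ø")` gives a Ø that anticipates the following /H/ tone with [H] pitch, while the Ø tones after the /H/ tone come out as [L].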
4.2.5 The developments in the Kyōto type dialects of Shikoku
and the Seto Inland Sea
Overall, the leftward tone shift affected the Kyōto type dialects of Shikoku and the
Seto Inland Sea in the same way as the Kyōto type dialects of Honshū. There is
however an interesting difference: There is not one Kyōto type dialect (nor any
Tōkyō type dialect) in central Honshū that maintains the distinction between tone
classes 2.2 and 2.3. There are however, a number of Kyōto type dialects on Shikoku
and islands in the Seto Inland Sea that do maintain a distinction between these
two tone classes.
This means that the leftward tone shift as such does not necessarily have to
obliterate the distinction between these two classes. The fact that the geographical
distribution of the dialects that have maintained the distinction is not random, may
indicate that some of the innovations that preceded the leftward tone shift in Kyōto
never spread to these areas. In this light it is interesting to recall the fact that in the
early 16th century Konparu Zenpō still gave the pitches of tone class 2.3 in Shikoku
as .
It is possible to see this as an indication that the /H/ tone restriction never spread
to Shikoku. Another indication may be the fact that in the Sanuki type dialects on
the island, not only are classes 2.2 and 2.3 still distinguished from each other, but,
more importantly, class 2.3 ( in Middle Japanese) has merged with class 2.1
( in Middle Japanese).
In section 4.1.3, I have argued that a large difference in pitch height between /H/
and /L/ became superfluous once the tone systems became restricted, and the
distinction between level /H/ and level /L/ tonal phrases was lost. The merger of tone
classes 2.3 and 2.1 in the Sanuki dialects may be the result of a decrease in the
difference in pitch height between /H/ and /L/ occurring in a dialect that was still at
the Middle Japanese stage: The merger of these two classes is more likely if they
both still had a level tone contour (i.e. if tone class 2.3 still had tone).
Classes 2.1 and 2.3 would have been distinguished by pitch height only, and
when the difference between /L/ and /H/ was reduced – which may have occurred
under the influence of adjacent dialects that had already developed a restricted tone
system – they could have easily merged. 18
In the more conservative West Sanuki dialect the tone of this merged tone class
is indeed level: -. The more innovative East Sanuki dialect on the other hand
has -~- in free variation. (See section 7.2.1.)
Another point is that the reflex of tone class 3.2 in the Kyōto type dialect of
Kōchi on Shikoku is a quite regular '' tone. This means that the leftward tone
shift in this dialect must have occurred at a stage that was definitely no later than
stage 4, the stage at which not all /L/ tones had yet been eliminated from the system.
But it could also mean that the leftward tone shift (which spread to Kōchi from
Kyōto) occurred in a still unrestricted Middle Japanese type tone system. If so, the
leftward tone shift itself would be the main cause behind the restriction of the
number of /H/ tones in the dialect of Kōchi.
7 Possible developments in Kōchi
Stage 1 Kōchi
3.2 /LLH/ > /LHØ/
3.4 /HHH/ > /ØHØ/
3.7 /HLH/ > /LHØ/
18 The Tōkyō type dialects of southwest Shikoku may have adopted the /H/ tone restriction from
the Gairin type dialects of Kyūshū or the Chūrin type dialects in west Honshū. In the Sukumo
dialect in the southwestern corner of Shikoku, the pitch of the merged tone class 2.2/3 is
conditioned (Ikuta, 1951). In isolation it has level pitch (i.e. the former tone of class 2.3), but
when a particle is attached, it has - pitch (i.e. the former tone of class 2.2).
The tone system of the famous dialect of Ibukijima may be the result of the leftward
shift reaching this dialect when it was still in stage 2. (See section 7.2.2.) Finally,
Matsumori (2001:100) reports a split in the reflexes of class 3.5 in Ibukijima,
Shishijima and Marugame (all islands in the Seto Inland Sea) in which part of class
3.5 has merged with class 3.4. It is possible that this split is related to the existence of
a subclass division (3.5a and 3.5b) in proto-Japanese. (See section 8.2.2.)
4.3 The developments in the Gairin Tōkyō type dialects
In the Gairin type tone systems pre-L /H/ tones started to be highlighted over other
/H/ tones in the word, just as in the Nairin/Chūrin type tone systems. The other /H/
tones started to be realized progressively lower in comparison, which resulted in the
reduction of the number of /H/ tones per word in classes 2.3, 3.4 and 3.5.
The universal behind this change has already been mentioned (a [HL] interval is
subject to F0 polarization). The reverse is true for a [LH] interval, which is subject
to F0 compression and has the tendency to level out to [LM] or [MH] (Hyman,
2007). This second universal caused the complete loss of /H/ tone in classes 2.2 and
3.2, which is what made these tone classes merge with classes 2.1 and 3.1. It also
caused the loss of the /R/ tone in classes 1.2 and 2.5 and the loss of the /H/ tone on
the final syllable of class 3.7. Consequently, there are no Gairin type dialects in
which these tone classes have been preserved as separate classes.
The tone dot manuscripts of Kokin waka-shū and other MJ ‘Gairin’ material
most likely represent the tone system of the area with Gairin type tone located in the
old prefectures of Mikawa, Tōtōmi and Shinano. The first sign of /H/ tone restriction
in these dialects is the attestation of siho ‘tide’ and hana ‘flower’ (class 2.3) as 上平
in the Jakue-bon and the Fushimi miyake-bon of Kokin waka-shū (both late 13th
century).
This indicates that the lowering of /H/ before /H/ in classes 2.3, 3.4 and 3.5
occurred earlier than the leveling out of the [LH] interval in classes 1.2, 2.2, 2.5, 3.2
and 3.7 (at least in this area). For simplicity’s sake however, I have presented the
two changes as occurring simultaneously in (8).
In case of the Nairin/Chūrin type discussed earlier, I reconstructed an
intermediate stage in the lowering of /H/ before /H/ that included a distinction
between /L/ tone, /H/ tone and Ø tone [M]. (As we have seen in section 4.1.2 the
reconstruction of this transitional stage is required to explain the old rongi data in
light of the merger pattern of class 3.2 in the Kyōto type dialects on Honshū.)
In the Gairin type dialects, the /H/ tone restriction was no doubt a gradual
process as well on the phonetic level [H > M > L], and it is likely that there were
intermediate stages with a /L/, /H/, Ø distinction. (Even stage 1, attested in the tone
dot material, could already be analyzed in such terms, if we see the tone spreading
on the case particles as a reason to analyze these particles as having Ø tone.)
I have, however, skipped such stages in the phonological representations in (8)
and (9): [M] pitch is analyzed as a subphonemic variant of /H/ tone (or – in case of
the particles – of /L/ tone). When it is lowered further to [L], it is analyzed as Ø tone
(stage 3).
8 [HL] polarization and [LH] compression in the MJ ‘Gairin’ tone system
MJ ‘Gairin’ (Stage 1) Gairin (Stage 2)
Pitch assignment rules, MJ ‘Gairin’ (Stage 1): /L/ after /R/ and /LH/ = [H]
Pitch assignment rules, Gairin (Stage 2): /L/ after /R/ and /LH/ = [M], /H/ = [M] unless followed by [L]
2.1 - /LL-L/ - /LL-L/
2.2 - /LH-L/ > - /LH-L/
2.3 - /HH-L/ > - /HH-L/
2.4 - /HL-L/ - /HL-L/
2.5 - /HR-L/ > - /HR-L/
3.1 - /LLL-L/ - /LLL-L/
3.2 - /LLH-L/ > - /LLH-L/
3.4 - /HHH-L/ > - /HHH-L/
3.5 - /HHL-L/ > - /HHL-L/
3.6 - /HLL-L/ - /HLL-L/
3.7 - /HLH-L/ > - /HLH-L/
The next stage (stage 3), in which [M] pitch was lowered further to [L], has been
attested in the fushihakase marks added to Butsuyuigyō-kyō 仏遺教経. (But this
material also includes many irregular markings. See section 14.5 of part II.)
As shown in (9), classes 2.1 and 2.2 have tone in this material, classes 3.1
and 3.2 have tone, and class 3.7 has tone. According to Kindaichi
(1955), the text stems from the mid to late 14th century.
As will be discussed in chapter 10, in the areas with Gairin type tone of western
Japan (Kyūshū, Shimane prefecture) the /H/ tone restriction must have taken place
much earlier. In western Japan these developments must date back to before the
settlement of the Ryūkyūs, as the Ryūkyūs were settled by speakers of a dialect in
which tone classes 2.2 and 3.2 had already merged with classes 2.1 and 3.1.
The developments in the Gairin type dialects after stage 3 are similar to those in
the Nairin/Chūrin type dialects. Modern Gairin type dialects such as Ōita (Hirayama,
1960), Izumo and Matsue (Kobayashi, 1975) have /H/ tone anticipation in
combination with the development of a %L phrase boundary tone, as well as an
analogical rise in pitch after the phrase initial syllable in tone classes with Ø tone.19
19 Except that in Matsue the rise to [H] pitch after the [L] pitch of the phrase-initial syllable will
be delayed one syllable if the second syllable contains a close vowel -i or -u: sakura ‘cherry
blossom’ .
Some dialects, like Akita, have reverted to (or preserved) the simple pitch
assignment rules of stage 3. The partial rightward shift of /H/ tone that occurred in
the Gairin dialects of type B will be discussed in section 7.1.1.
9 Reduction of the oppositions in the Gairin tone system to /H/ vs. Ø tone
Gairin (Stage 2) Gairin (Stage 3)
Pitch assignment rules, Gairin (Stage 2): /L/ after /R/ and /LH/ = [M], /H/ = [M] unless followed by [L]
Pitch assignment rules, Gairin (Stage 3): Ø = [L]
2.1 - /LL-L/ - /ØØ-Ø/
2.2 - /LH-L/ > - /ØØ-Ø/
2.3 - /HH-L/ > - /ØH-Ø/
2.4 - /HL-L/ - /HØ-Ø/
2.5 - /HR-L/ > - /HØ-Ø/
3.1 - /LLL-L/ - /ØØØ-Ø/
3.2 - /LLH-L/ > - /ØØØ-Ø/
3.4 - /HHH-L/ > - /ØØH-Ø/
3.5 - /HHL-L/ > - /ØHØ-Ø/
3.6 - /HLL-L/ - /HØØ-Ø/
3.7 - /HLH-L/ > - /HØØ-Ø/
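The mergers implied by the stage 3 column of (9) can be read off mechanically. The following Python sketch groups the tone classes by their stage 3 representation and applies the stage 3 pitch rule Ø = [L]. The names `GAIRIN_STAGE_3`, `merged_classes` and `stage3_pitch` are hypothetical, and the assumption that /H/ surfaces as [H] is mine; the representations themselves are copied from (9).

```python
from collections import defaultdict

# Stage 3 representations as listed in (9); the hyphen separates noun and particle.
GAIRIN_STAGE_3 = {
    "2.1": "ØØ-Ø", "2.2": "ØØ-Ø", "2.3": "ØH-Ø", "2.4": "HØ-Ø", "2.5": "HØ-Ø",
    "3.1": "ØØØ-Ø", "3.2": "ØØØ-Ø", "3.4": "ØØH-Ø", "3.5": "ØHØ-Ø",
    "3.6": "HØØ-Ø", "3.7": "HØØ-Ø",
}


def merged_classes(table):
    """Return only those representations that are shared by several classes."""
    groups = defaultdict(list)
    for tone_class, representation in table.items():
        groups[representation].append(tone_class)
    return {rep: classes for rep, classes in groups.items() if len(classes) > 1}


def stage3_pitch(representation):
    """Apply the stage 3 rule Ø = [L]; /H/ is assumed to surface as [H]."""
    return representation.replace("Ø", "L")
```

On these data `merged_classes(GAIRIN_STAGE_3)` pairs 2.1 with 2.2, 2.4 with 2.5, 3.1 with 3.2 and 3.6 with 3.7, which agrees with the statement that classes 2.2, 2.5, 3.2 and 3.7 have not been preserved as separate classes in any Gairin type dialect.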
4.4 The development of the two Kagoshima word-tones
Of the dialects that are characterized by distinct word-tones, the Kagoshima dialect
is most famous. The Kagoshima dialect has two tone classes, one with word-tone A,
which contains a pitch fall before the phrase-final syllable, and one with word-tone
B, which has a rise to [H] pitch on the phrase-final syllable.
10 The two Kagoshima word-tones
A , - A , - A , -
B , - B , - B , -
Other Kagoshima type dialects on Kyūshū, such as the dialect of Makurazaki (cf.
section 1.1.3) have word-tones that are almost the exact opposite of those of the
Kagoshima dialect proper.
There is a direct correspondence between the two Kagoshima word-tones and the
tone of the initial syllable in Middle Japanese: Words that have word-tone A had a
shang tone on the initial syllable in Middle Japanese, and words that have word-tone
B had a ping tone on the initial syllable in Middle Japanese.20 It is often thought that
the Kagoshima type dialects derived their two tonal categories directly from the
initial tone of Middle Japanese.
Uwano (1981) on the other hand, regards the development of the two tonal
categories in Kagoshima as a more gradual process. He sees the Kagoshima type
word-tone systems as the result of a simplification of the Gairin type tone system
that can also be found on Kyūshū. Word-tone A developed from the unaccented tone
classes (i.e. the tone classes with Ø tone) and word-tone B developed from the
accented tone classes (i.e. the tone classes that contain /H/ tone).
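The two accounts can be compared on the tone classes listed in (8) and (9). The Python sketch below derives a word-tone once from the initial tone of the stage 1 form and once, following Uwano, from the presence of /H/ tone in the Gairin stage 3 form. It is a sketch under stated assumptions: the names are hypothetical, only the noun part of each representation is used (the particle is left out), and an initial /L/ in the notation of (8) is taken to correspond to word-tone A. Class 3.3 does not appear in (8) and (9) and is therefore not included.

```python
# Noun part of the representations in (8), stage 1, and (9), stage 3.
STAGE_1 = {
    "2.1": "LL", "2.2": "LH", "2.3": "HH", "2.4": "HL", "2.5": "HR",
    "3.1": "LLL", "3.2": "LLH", "3.4": "HHH", "3.5": "HHL",
    "3.6": "HLL", "3.7": "HLH",
}
STAGE_3 = {
    "2.1": "ØØ", "2.2": "ØØ", "2.3": "ØH", "2.4": "HØ", "2.5": "HØ",
    "3.1": "ØØØ", "3.2": "ØØØ", "3.4": "ØØH", "3.5": "ØHØ",
    "3.6": "HØØ", "3.7": "HØØ",
}


def word_tone_from_initial(stage1_form):
    """Direct account: the word-tone follows the tone of the initial syllable."""
    return "A" if stage1_form[0] == "L" else "B"


def word_tone_from_gairin(stage3_form):
    """Uwano's account: classes with only Ø tone give A, classes with /H/ give B."""
    return "B" if "H" in stage3_form else "A"


# True for a class when both accounts predict the same word-tone.
AGREEMENT = {
    tone_class: word_tone_from_initial(STAGE_1[tone_class])
    == word_tone_from_gairin(STAGE_3[tone_class])
    for tone_class in STAGE_1
}
```

For every class in this table the two routes yield the same word-tone, so the correspondence with the initial syllable does not by itself decide between a direct derivation and a gradual development through a Gairin type stage.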
Uwano’s idea resembles the developments posited for Kirundi and Kinyarwanda,
where a word like ‘to see’ – historically /LHL/ – could optionally be realized in a
number of ways by the same speaker on different occasions: [LFL], [RFL], [HFL]
and even [HLL]. As Hyman (1978:264) describes:
What seems to be important here is not that there is a H tone on a specific
syllable (or syllables), but rather that there is a H tone with a drop to L
somewhere (anywhere?) in this word. Thus, the whole word stands in
opposition to a verb such as kùròrà ‘to look’, which lacks this drop from H to
L. It is as if Kirundi and Kinyarwanda speakers, on their way to the Luganda
situation, are no longer asking one by one whether a syllable has H or L tone,
but rather whether a word has an (accentual?) drop in it somewhere.
It is possible that in Kagoshima too, at some point the exact location of the drop
from [H] to [L] pitch was no longer considered important. Instead, words which
contained a drop in pitch somewhere (category B) now stood in opposition to words
which lacked such a drop (category A).
Assuming a gradual development by means of a Gairin type stage also explains
why in the Kagoshima type dialect in Nagasaki-ken mentioned by Kindaichi
(1954a/1983:31), the present-day word-tone system has preserved more than just the
initial tone of Middle Japanese. (See section 5.1.4.) In case of this dialect at least,
the idea that the two word-tones developed from nothing but the initial tone of
Middle Japanese has to be dismissed.
A gradual development from a Gairin type stage also seems more natural than a
sudden jump from syllable-tone to word-tone, based on the tone of the initial
syllable. The only point that seems to agree better with the idea of a radical
simplification in one step (taking the tone system of Middle Japanese as a starting
point) is the fact that tone class 3.3 has a quite regular word-tone A in the
Kagoshima type dialects, while the reflex of this tone class in the Gairin type tone
system of northeast Kyūshū is a very irregular mixture of forms with Ø tone and
forms that contain /H/ tone.
20 The development from Middle Japanese to the two different word-tones of Kagoshima is
simpler if one follows the standard reconstruction of shang as /H/ and ping as /L/, but in case of
the dialect of Makurazaki the development is simpler if one follows Ramsey’s reconstruction.
There is no telling which of the two types is older, and so the only conclusion that can be
drawn is that the tonal development of the melodies of the distinct word-tones appears to be
quite free, once the link of specific tones with specific segments has disappeared.
However, the irregularity in the correspondences between the two dialects in this
respect argues in the first place against derivation of the Kagoshima word-tones
from a late Gairin type stage, such as the modern tone system of northeast Kyūshū.
It does not preclude the possibility that the Kagoshima word-tones developed from a
more archaic Gairin tone system, in which the reflexes of class 3.3 were more
regular.
Another argument against Uwano’s view has been put forward by Kibe Nobuko
(2003), who reasons that the Kagoshima type tone system cannot have derived from
a Tōkyō type tone system as the rules that determine the tone of compound nouns in
Kagoshima cannot be derived from an earlier Tōkyō-like stage. The compound tone
rules of the Kagoshima dialect are simple: The word-tone of the compound (A or B)
is the same as the word-tone of the first element of the compound. (The word-tone
associated with the second element is deleted, and the tones of the first element are
spread over the second element.) While in Kagoshima the first element of the
compound determines the tone of the compound, in the Nairin and Chūrin Tōkyō
type tone systems, the second element of the compound determines the tone of the
compound.
An interesting complication however, is that in the Gairin type dialects of
northeast Honshū and Izumo the tone of the compound is determined by the first
element of the compound just as in Kagoshima (cf. section 5.7). If the first element
has Ø tone, the compound will have Ø tone (which corresponds to Kagoshima word-
tone A) and if the first element contains /H/ tone the compound will contain /H/ tone
(which corresponds to Kagoshima word-tone B).
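The compound rules just described are simple enough to state as code. The Python sketch below is illustrative only: all function names are hypothetical, Kagoshima word-tones are written "A" and "B", and Gairin elements are written as strings over H and Ø. For the Gairin type dialects it models only whether the compound contains /H/ tone, not where that tone falls.

```python
def kagoshima_compound_tone(first: str, second: str) -> str:
    """Word-tone of a Kagoshima compound: that of the first element.

    The word-tone of the second element is deleted, so it does not affect
    the result.
    """
    for word_tone in (first, second):
        if word_tone not in ("A", "B"):
            raise ValueError(f"unknown word-tone: {word_tone!r}")
    return first


def gairin_compound_has_high(first_element: str, second_element: str) -> bool:
    """Gairin rule as described: /H/ tone in the compound depends on the first element."""
    return "H" in first_element


def corresponding_word_tone(has_high: bool) -> str:
    """Ø tone corresponds to word-tone A, /H/ tone to word-tone B."""
    return "B" if has_high else "A"
```

With these definitions a Gairin compound whose first element has only Ø tone maps to word-tone A whatever the second element is, which mirrors the outcome of `kagoshima_compound_tone`.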
I do not have much information on the tone rules for compound nouns in the
Gairin type tone system of northeast Kyūshū, but the data that have been put at my
disposal (cf. section 5.7) suggest that northeast Kyūshū (Ōita) is like the other Gairin
type dialects in this respect. This means that the Kagoshima word-tone division
(including the one that occurs in compound nouns) can be derived from a Gairin
type tone system such as can still be found in northeast Kyūshū without problem.
4.5 The reconstruction of the tone of class 3.3
The most complex tone class in light of Ramsey’s theory is class 3.3. The reflexes in
the modern dialects of this small tone class are so irregular that for many dialects it
is not possible to determine the regular reflex. The irregularity of the reflexes even
made Kindaichi delete this class from the reconstructed proto-Japanese tone classes
in his later work (1974).
The Chūrin Tōkyō type dialects however, have a reasonable amount of '
reflexes and the dialects of Kōchi and Kyōto have a quite regular ' reflex. In
Kyōto this reflex does not contain information on the tone in proto-Japanese, as in
Kyōto all trisyllabic nouns (if they do not start with /L/ tone) have shifted the /H/
tone to the initial syllable after the 17th century, so that tone classes 3.3, 3.4 and 3.5
have all merged as '. The reflexes of Kōchi and the Tōkyō type dialects
combined however, indicate that the tone of this tone class in proto-Japanese must
be reconstructed as * and not , as attested (most of the time) in Ruiju
myōgi-shō 類聚名義抄.
11 Comparison of the tone of class 3.3 in Ruiju myōgi-shō, Tōkyō and Kōchi
Ruiju myōgi-shō Tōkyō Kōchi
3.3 - '- '-
Additional proof for the reconstruction of (at least part of) tone class 3.3 as having
* tone in proto-Japanese, is formed by a number of attestations of members of
class 3.3 as in the written record: In Ruiju myōgi-shō, the word kasiko ‘that
place’ is attested with 上 平 上 markings (the older form?) as well as 上 平 平
markings (the newer form?). In the Maeda-ke-bon of Nihon shoki 前田家本日本書
紀 (1150), the word tikara ‘strength’ is marked 上平上 (Ishizuka 1977: 127). In
Kokin waka-shū (Akinaga, 1974) mohara ‘exclusively’ has both 上平上 (the older
form?) and 上平平 (the newer form?) markings. 21 In the tonal spelling system
discovered by Takayama in some parts of the Nihon shoki, there are two attestations
of members of tone class 3.3. Ahabi ‘abalone’ appears as (or as the
tone of the first character is ambiguous) and Kasuga (a place name) appears as
.
Finally, some members of class 3.3 are clearly compounds and the tones of the
constituent parts also suggest derivation from earlier *, such as in case of
kogane ‘gold’ from 1.2 ko ‘yellow’ + 2.1 kane ‘metal’ (* > * >
), and komugi ‘wheat’ from 1.1 ko ‘little’ + 2.4 mugi ‘barley’ (*
> ).
It is not only * tone that is almost completely missing in Ruiju myōgi-shō,
in fact, all of the following tone sequences are missing: *, *,
*, *, *, *. It seems therefore that /L/ tone was no
longer allowed after /LH/ tone in general in the variety of Middle Japanese attested
in Ruiju myōgi-shō.
In the MJ ‘Gairin’ type tone system /H/ tone spread to the right across a
morpheme boundary onto a particle after a /LH/ tone contour. In the change from
proto-Japanese to the dialect of Ruiju myōgi-shō on the other hand, the spread was
limited to the domain within the word, as in Ruiju myōgi-shō assimilation of the tone
of the particle after tone classes 2.2 and 3.2 has not been attested.
21 Toyama (a place name), the only other member of tone class 3.3 attested in this material, is
marked with 上平平 tone dots. The fact that Toyama is not attested with *上平上 markings is
not surprising, as yama ‘mountain’ belongs to tone class 2.3, which makes it unlikely that
Toyama could have developed from earlier *.
We have seen that /H/ tone spreading within the word cannot have taken place in
all dialects, as the present-day dialect of Kōchi descends from a dialect with the
unassimilated form *. Because of the secondary shift of /H/ tone to the left in
trisyllabic words that occurred in Kyōto after the 17th century, it is impossible to see
whether the modern ' reflex of tone class 3.3 in Kyōto developed from
(as in Ruiju myōgi-shō) or from proto-Japanese *. I assume however, that
present-day Kyōto ' tone developed from the Ruiju myōgi-shō pattern in the
following way: > > .
In other Kyōto type dialects on Honshū the tone of class 3.3 may also have
developed from the form attested in Ruiju myōgi-shō. If these dialects did not
partake in the secondary shift of /H/ tone to the left in trisyllabic words, we would
expect to see the intermediate form ' preserved in such dialects.
Kindaichi indeed reports the tone pattern ' for tikara ‘strength’ (tone class
3.3) in the Kyōto type dialects of Nagahama and Akaho (Kindaichi, 1942:167),
while inoti ‘life’, kokoro ‘heart’ and hotaru ‘firefly’ (tone class 3.5) in these dialects
have ' tone.
12 The development of the tone of classes 3.3 and 3.5 in Nagahama and Akaho
Proto-Japanese   Middle Japanese   Nagahama, Akaho
3.3 tikara > > '
3.5 inoti, kokoro, hotaru = > '
At first sight the tone of class 3.3 in Ramsey’s theory seemed hard to reconcile with
the reflexes in the modern dialects. However, the change from proto-Japanese
* to Middle Japanese that has to be reconstructed fits into the more
general pattern of /H/ tone spreading that can be seen in Middle Japanese.
The fact that not all members of this class (as attested in Ruiju myōgi-shō) seem
to go back to the same class in proto-Japanese – some seem to be going back to
* and some to * – may offer an explanation for the mixed reflexes in
the modern dialects.
As a final point: /H/ tone spreading occurs in many languages that do not have
/L/ tone spreading, but very few languages have /L/ tone spreading without /H/ tone
spreading (Hyman, 2007). Following Ramsey’s reconstruction, Middle Japanese had
/H/ tone spreading (3.3 * > ) but had not yet developed /L/ tone
spreading (3.7 > *). This agrees with the hierarchy in the occurrence
of the two types. In the standard reconstruction the situation would be the reverse,
and the hierarchy in the occurrence of the two types would violate one of the
universals of tone rules.
4.6 Did the tone of the initial syllable have a special status
in Middle Japanese?
As mentioned in section 4.2.3, in the modern Kyōto type dialects, tone classes with
Ø tone are realized with [H] pitch. Furthermore, Ø tone which precedes /H/ tone in
the word anticipates the /H/ tone and is realized with [H] pitch as well.
When the tone system of Kyōto is analyzed in terms of register and pitch-accent,
words that start with /H/ tone and words that start with Ø tone are therefore both
regarded as belonging to the /H/ register group. In other words, the tonal opposition
between /H/ tone, /L/ tone, and Ø tone in the initial syllable is analyzed in terms of a
two-way distinction: /H/ register (/H/ tone and Ø tone on the initial syllable) vs. /L/
register (/L/ tone on the initial syllable).
The opposition between Ø tone and /H/ tone in the rest of the word is analyzed in
terms of pitch-accent: Syllables with Ø tone are regarded as unaccented and
syllables with /H/ tone as accented. The analysis developed for modern Kyōto
therefore distinguishes between ‘tone’ (‘register’) when the pitch of the initial
syllable is concerned, and ‘pitch-accent’ when the pitches of the other syllables in
the word are concerned.
As the standard reconstruction of the Middle Japanese tone system resembles the
tone system of modern Kyōto, a direct line is often drawn between the two tone
systems, so that the ‘tone’ vs. ‘pitch-accent’ analysis developed for modern Kyōto is
projected onto the tone system of Middle Japanese. (With the difference that
‘accent’ now refers to any change in pitch, from /L/ to /H/ as well as from /H/ to /L/.
See section 0.5.2.3 of the introduction.)
Hayata (1987, 1999) has furthermore drawn a connection between the ‘tone’ vs.
‘pitch-accent’ distinction reconstructed for Middle Japanese and certain features in
modern dialects other than Kyōto, which has strengthened the idea that this
distinction played a role in Middle Japanese. According to Hayata for instance, the
word-tone dialects of Kagoshima and the Ryūkyūs have preserved the tonal part of
the Middle Japanese tone system (reflected in the split into word-tones A and B),
whereas the Tōkyō type dialects have preserved the accentual part of the Middle
Japanese tone system (reflected by the fact that a transition from [H] to [L] is
distinctive). Only the Kyōto type dialects have preserved both aspects.22
22 Hayata includes a map that shows the accentual dialects stretching to the northeast, towards the
accentual languages of Siberia, while the tonal dialects stretch to the southwest, towards the
tonal languages of China. The two spheres meet in Japan, in the Kyōto area, as only there both
tonal and accentual features can be found in the language.
First of all, although in Kagoshima the division into word-tone A and word-tone
B corresponds to the tone of the initial syllable in Middle Japanese, this is not so in
the vast majority of the Ryūkyūan dialects. In many dialects the disyllabic tone
classes have merged as 2.1/2/3 vs. 2.4/5, a pattern that cannot be linked to the tone
of the initial syllable in Middle Japanese. In many other dialects, three tone classes
have been preserved, and the disyllabic nouns have merged as 2.1/2 vs. 2.3 vs. 2.4/5,
a division which – for obvious reasons – cannot be traced back to the simple two-
way /H/ vs. /L/ distinction in the tone of the initial syllable in Middle Japanese
either.23
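Whether a merger pattern can be linked to the initial tone of Middle Japanese can be checked by comparing it with the two-way split that the initial tone induces. The Python sketch below does this for the disyllabic classes. The names are hypothetical, and the initial tones are taken from the stage 1 forms in (8); using the standard reconstruction instead would swap the labels /H/ and /L/ but leave the split itself unchanged.

```python
# Initial tone of the disyllabic classes, read off the stage 1 forms in (8).
INITIAL_TONE = {"2.1": "L", "2.2": "L", "2.3": "H", "2.4": "H", "2.5": "H"}


def partition_by_initial_tone(initial):
    """Group tone classes by the tone of their initial syllable."""
    groups = {}
    for tone_class, tone in initial.items():
        groups.setdefault(tone, set()).add(tone_class)
    return {frozenset(group) for group in groups.values()}


def follows_initial_tone(merger, initial=INITIAL_TONE):
    """True if the merger pattern is exactly the split by initial tone."""
    return {frozenset(group) for group in merger} == partition_by_initial_tone(initial)


KAGOSHIMA = [{"2.1", "2.2"}, {"2.3", "2.4", "2.5"}]
RYUKYU_TWO_WAY = [{"2.1", "2.2", "2.3"}, {"2.4", "2.5"}]
RYUKYU_THREE_WAY = [{"2.1", "2.2"}, {"2.3"}, {"2.4", "2.5"}]
```

Only the Kagoshima pattern passes the check. The two-way Ryūkyūan pattern groups class 2.3 with classes that have a different initial tone, and the three-way pattern has more groups than a two-way distinction can provide.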
Secondly, as we have seen in section 4.4, the fact as such that there is a
correspondence between the tone of the initial syllable in Middle Japanese and the
division into two word-tones does not require us to assign a special status to the tone
of the initial syllable in Middle Japanese: The correspondence can be explained as
the result of a gradual development from a Gairin type tone system.
Finally, as I have argued in the introduction, the use of the term ‘pitch-accent’
for Middle Japanese – even when only applied to the pitches that occur in other than
the initial syllable of the word – is inappropriate, as it is not possible to point to a
specific syllable in the word that can be regarded as the bearer of accent. ‘Accent’ in
Middle Japanese therefore has very little in common with ‘accent’ in modern
Japanese.
All of this does not preclude the possibility that tone occurring on the initial
syllable in Middle Japanese had a different status than tone occurring in other
syllables of the word, such as is the case in the tone system of modern Kyōto. But
are there concrete examples from Middle Japanese that show evidence of the fact
that the tone of the initial syllable (/H/ or /L/) possessed special features or a special
importance as opposed to the tones of other syllables in the word?
One argument in favor of a distinction between ‘tone’ and ‘accent’ (in the sense
of ‘tone change after the initial syllable’) in Middle Japanese has been seen in the
fact that in Middle Japanese, the stems of verbs and adjectives were mostly divided
into only two tone classes; level /L/ (called class A) and level /H/ (called class B).
To these simple /H/ or /L/ stems verbal/adjectival suffixes were attached which had
their own intrinsic tones, so that the tone patterns of inflected verbs and adjectives
were just as complex as those of nouns. As the lack of tone change in the stems of
verbs and adjectives has been interpreted in terms of the ‘tone’ vs. ‘pitch-accent’
analysis, verbs and adjectives in Middle Japanese are said to have only ‘tone’,
whereas nouns are said to have ‘tone’ as well as ‘accent’.
One problem with this division is that there was actually a class of verbs with
disyllabic stems (like aruk.u ‘to walk’ and kakus.u ‘to hide something’) in Middle
Japanese, that did include tone change in the stem (/HL/), but this class was small.24
A different point is that it is not exceptional for the lexical tones of verb and
adjective stems to be simpler than the lexical tones of nouns. Just such a distinction
between nouns and the stems of verbs and adjectives can be found in the tone
systems of many Bantu languages. The tones of verb and adjective stems in
Japanese may have been simpler from the start, or they may have simplified, perhaps
as inflectional morphology became more complicated.
23 In the tone system of the dialect of Shuri, on which the Ryūkyūan standard language is based,
the division appears to agree with the Kagoshima type dialects (2.1/2 vs. 2.3/4/5), which may
be why Hayata lists the Ryūkyūan dialects among the tonal languages on his map. As can be
seen in section 9.3.2 however, the dialect of Shuri has in fact preserved a three-way distinction
in the disyllabic tone classes, so that the division in this dialect cannot be linked to the tone of
the initial syllable in Middle Japanese.
24 For a description of the tonal developments in Japanese verbs, see De Boer (2008).
Whatever the exact historical background of the relative simplicity of the tonal
distinctions in verb and adjective stems, it does not require us to assign a special
status to the tone of certain syllables in the word. The idea that the tone of the initial
syllable in Middle Japanese possessed special qualities is not based on the Middle
Japanese data, nor is it required in order to explain certain features in the modern
dialects. It stems from the habit of projecting characteristics of the tone system of
modern Kyōto back onto the tone system of Middle Japanese.
4.7 Conclusion
If we follow Ramsey’s reconstruction of the tones of Middle Japanese it is possible
to give a unified explanation for the split into the three different Tōkyō type tone
systems, namely, /H/ tone spreading onto the particles after /R/ and /LH/ tone in
some dialects, followed by /H/ tone restriction in all dialects.
The most archaic example of such a restricted tone system can be found in the
Nairin type dialect of Nozaki on Noto Island. In this dialect /R/ tone (which
distinguishes tone class 2.5) and multiple non-consecutive /H/ tones per word
(which distinguish tone class 3.7) have been preserved. In most other Tōkyō type
dialects /R/ tone and multiple /H/ tones per word were eliminated at some point, so
that class 2.5 merged with class 2.4 and class 3.7 merged with class 3.6.
The /H/ tone restriction created the conditions under which a leftward tone shift
could occur in Kyōto, namely the development towards an ‘accentual’ /H/ versus Ø
tone system. At the time of the shift, phonological /L/ tone had all but disappeared
from the language, but not yet completely: The Kyōto shift occurred at a stage of the
language that was more archaic than has been preserved in any of the modern Tōkyō
type dialects (even Nozaki), as /L/ tone still played a marginal role.
In the modern Kyōto type tone systems on the other hand, a /L/ toneme, which
can only occur on the initial syllable of the word, plays a prominent role. It is the
existence of this toneme which makes the Kyōto type tone systems fundamentally
different from the Tōkyō type tone systems. If Kyōto type tone developed from a
restricted Tōkyō type tone system, the prominent role of this extra toneme has to be
explained.
The explanation can be found in the leftward shift: In several tone classes [L]
pitch on the second syllable – which was conditioned before the shift – landed on the
initial syllable of the word where it became phonemic. Most instances of /L/ tone in
the modern Kyōto type dialects developed in this way. The leftward shift also
explains why the /L/ toneme is confined to the initial syllable of the word, which is
quite different from the situation in Middle Japanese where there were no such
limitations on the occurrence of /L/ tone.
Finally, the fact that the separate tone classes 2.5 and 3.7 have been much better
preserved in the Kyōto type dialects than in the Tōkyō type dialects is also a result
of the leftward tone shift. The leftward shift transformed the final /R/ toneme that
distinguished class 2.5 before the shift into final /H/ tone, without causing a merger
with class 2.4. The leftward tone shift also reduced the two non-consecutive /H/
tones that distinguished class 3.7 before the shift into a single /H/ tone, without
causing a merger with class 3.6. The presence of a single /H/ tone per word is an
acceptable feature in the restricted tone systems of the modern dialects. This is why
there was no further simplification in dialects that had gone through the leftward
shift, and therefore no merger with other tone classes, such as happened in most
Tōkyō type dialects.
5 Arguments in favor of Ramsey’s theory
based on internal reconstruction
The evidence presented in favor of Ramsey’s theory in chapter 3 was based on
comparative data and dialect geography. In this chapter I present arguments that are
based on internal reconstruction. The first has to do with the special effect that the
particle no has on the tone of a preceding noun in Kyōto, and the second with the
tone of compound nouns. It turns out that in these environments Kyōto has
preserved remnants of a Tōkyō type tone system. These are just the kind of remnants
we expect to find in a dialect that has gone through a fundamental change in its
phonology. (Section 5.12 discusses an issue in which the tone of compound nouns
and the effect of the particle no meet.)
I discuss the arguments from internal reconstruction only now as I sometimes
need to refer to developments that I assume took place in Middle Japanese before
the leftward shift in the Kyōto area, developments that have only been introduced in
the previous chapter.
5.1 The special tonal features of the particle no
In the next three sections I will discuss the arguments that are based on the special
features of the particle no as introduced in Ramsey’s work (especially in the
extended Japanese version of his article). I have, however, added data concerning
tone classes 2.2 and 3.2, and data from Ōita, Hiroshima, Kōchi and the MJ ‘Gairin’
tone system.
5.1.1 The particle no in the Tōkyō type dialects
We have seen that monosyllabic case particles such as ga, ha, ni, wo have Ø tone in
the modern Tōkyō type dialects and that they will attach with [L] pitch after nouns
that include /H/ tone. The particle no, however, is different. In many Tōkyō type
dialects, this particle copies the tone of the final syllable of the preceding noun.
(This tone copying does not occur in case of monosyllabic nouns, probably because
of the rule that the first and the second syllable of a tonal phrase will not have the
same pitch.)
After nouns with /H/ tone on the final syllable, no therefore attaches with [H]
pitch. Because these nouns then have the same tone as the nouns with Ø tone, the
particle no has the effect of cancelling final /H/ tone in a preceding noun.
1 Cancelling of final /H/ tone before the particle no in Hiroshima and Ōita
                         Hiroshima (Chūrin)        Ōita (Gairin)
                         noun + ga    noun + no    noun + ga    noun + no
2.1 tori ‘bird’          -            -            -            -
2.2 mura ‘village’       '-      →    -            -            -
2.3 inu ‘dog’            '-      →    -            '-      →    -
3.1 katati ‘shape’       -            -            -            -
3.2 hutatu ‘two’         '-      →    -            -            -
3.4 otoko ‘man’          '-      →    -            '-      →    -
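The pattern in (1) can be restated procedurally. The following Python sketch is only an illustration under simplifying assumptions, not a description of any particular dialect: a noun is reduced to its number of syllables and the position of its /H/ tone (or None for Ø tone), pitch is written as a string of L and H, and the surface pitch follows the usual Tōkyō type schema (low start unless the /H/ tone is initial, low after the /H/ tone). The function names noun_pitch, phrase_pitch and is_cancelled are hypothetical labels introduced for this example.

```python
from typing import Optional


def noun_pitch(n_syllables: int, accent: Optional[int]) -> str:
    """Surface pitch of a noun in a simplified Tokyo type system.

    accent is the 1-based syllable that carries /H/ tone, or None for Ø tone.
    """
    pitch = []
    for i in range(1, n_syllables + 1):
        if accent is None:
            # Ø tone: low first syllable, high from the second syllable onward
            pitch.append("L" if i == 1 else "H")
        elif i > accent:
            # every syllable after the /H/ tone is low
            pitch.append("L")
        elif i == 1 and accent != 1:
            # low start unless the /H/ tone is on the initial syllable
            pitch.append("L")
        else:
            pitch.append("H")
    return "".join(pitch)


def phrase_pitch(n_syllables: int, accent: Optional[int], particle: str) -> str:
    """Pitch of noun + monosyllabic particle ('ga' or 'no')."""
    noun = noun_pitch(n_syllables, accent)
    if particle == "no" and n_syllables > 1:
        # no copies the pitch of the final syllable of the noun;
        # assumed not to apply after monosyllabic nouns (see text)
        return noun + noun[-1]
    # ordinary Ø particle: [L] after a noun containing /H/ tone, else [H]
    return noun + ("L" if accent is not None else "H")


def is_cancelled(n_syllables: int, accent: Optional[int]) -> bool:
    """True if noun + no is indistinguishable from a Ø-tone noun + no."""
    return accent is not None and (
        phrase_pitch(n_syllables, accent, "no")
        == phrase_pitch(n_syllables, None, "no")
    )


if __name__ == "__main__":
    # final /H/ tone (as in Hiroshima 2.2, 2.3): distinct before ga, cancelled before no
    print(phrase_pitch(2, 2, "ga"), phrase_pitch(2, 2, "no"), is_cancelled(2, 2))
    # non-final /H/ tone in a trisyllabic noun: not cancelled
    print(phrase_pitch(3, 2, "ga"), phrase_pitch(3, 2, "no"), is_cancelled(3, 2))
```

In this sketch the cancellation is not a separate rule: it falls out of the copying behavior of no, because a noun with final /H/ tone followed by a [H] particle has the same surface pitch as a noun with Ø tone.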
The tone-copying of the particle no in modern Tōkyō type Japanese is a continuation
of the behavior of the particle no in Middle Japanese, where it likewise copied the
tone of the final syllable of the preceding noun. (For an overview of the tone of no in
the old material see Sakurai’s ‘Joshi no no akusento’, 1976:280-415.)
2 Behavior of the particle no in the MJ ‘Chūrin’ and MJ ‘Gairin’ type tone systems
MJ ‘Chūrin’ MJ ‘Gairin’
noun + ga noun + no noun + ga noun + no
2.1 tori ‘bird’ - - - -
2.2 mura ‘village’ - - - -
2.3 inu ‘dog’ - - - -
3.1 katati ‘shape’ - - - -
3.2 hutatu ‘two’ - - - -
3.4 otoko ‘man’ - - - -
In Middle Japanese no attached [H] after nouns ending in /H/ tone just as in the
modern Tōkyō type dialects, but the tone of the nouns themselves was not affected
in any way. Later, however, when only /H/ before /L/ was preserved as /H/ tone in
the modern dialects, the result was that no acquired the effect of cancelling /H/ tone
on the final syllable of a preceding noun. (If the /H/ tone is on a syllable other than
the final one, it is not cancelled, as in the case of 3.5 itoko-no ‘of the cousin’ '-.)
It is only possible to connect the loss of /H/ tone in the modern Tōkyō type
dialects with the behavior of the particle no in Middle Japanese when Ramsey’s
reconstruction of the Middle Japanese tones is taken as a starting point. (In the MJ
‘Chūrin’ dialect in Kindaichi’s interpretation for instance, the difference between 2.2
+ ga and 2.2 + no would have been - vs. -, which offers no explanation
for the loss of /H/ tone in case of 2.2 + no in the modern dialects.)
5.1.2 The particle no in Kyōto, Ōsaka and Kōchi
If we compare the tone of nouns with ga and no in the modern Tōkyō type and
Kyōto type dialects, we see that in the Kyōto type dialects the same tone classes lose
the /H/ tone when no is attached. According to the examples in Kobayashi
(1974:141-142) for instance, the following tone classes lose the /H/ tone in Kyōto:
3 Comparison of the behavior of the particle no in Kyōto and Hiroshima
                         Kyōto                     Hiroshima
                         noun + ga    noun + no    noun + ga    noun + no
2.1 tori ‘bird’          -            -            -            -
2.2 uti ‘house’          '-      →    -            '-      →    -
2.3 inu ‘dog’            '-      →    -            '-      →    -
For the trisyllabic nouns the data from Ōsaka (Okuda, 1975:26) are as in (4). The
longer nouns show that it is not initial /H/ tone as such that causes Ōsaka nouns to
lose the /H/ tone when the particle no is attached. Ōsaka – like Kyōto – has shifted
the /H/ tone of class 3.4 one more syllable to the left after the time of Bumō-ki 補忘記.
These nouns with /H/ tone on the initial syllable have /H/ tone loss before the
particle no. Nouns that already had /H/ tone on the initial syllable before this
secondary shift, and that do not correspond to nouns with final /H/ tone in the Tōkyō
type dialects, do not lose the /H/ tone:1
4 Comparison of the behavior of the particle no in Ōsaka and Hiroshima
                           Ōsaka                     Hiroshima
                           noun + ga    noun + no    noun + ga    noun + no
3.1 katati ‘shape’         -            -            -            -
3.4 takara ‘treasure’      '-      →    -            '-      →    -
3.5 inoti ‘life’           '-           '-           '-           '-
The cancelling of /H/ tone in the modern Kyōto type dialects can only be related to
the behavior of the particle no in Middle Japanese if we assume that in these dialects,
at an earlier stage, a similar situation to that of Tōkyō existed; a situation in which
the /H/ tone was still located on the final syllable.
1 Kobayashi (1974:142) seems to refer to the same phenomenon in Kyōto when she comments
that initially accented trisyllabic nouns lose their accent “less regularly” than initially accented
disyllabic nouns.
The dialect of Kōchi did not have the secondary shift of /H/ tone to the left in longer
nouns that occurred in Kyōto and Ōsaka. We therefore see that in Kōchi, it is a
combination of nouns with initial /H/ tone (in case of disyllabic nouns) and nouns
with medial /H/ tone (in case of trisyllabic nouns) that lose the /H/ tone when no is
attached. And again, this only occurs in nouns that have final /H/ tone in the Tōkyō
type dialects. (See the Kōchi data in Martin (1987:258-259) and Kobayashi
(1974:165-166).)
5 Comparison of the behavior of the particle no in Kōchi and Hiroshima
                           Kōchi                     Hiroshima
                           noun + ga    noun + no    noun + ga    noun + no
2.1 tori ‘bird’            -            -            -            -
2.2 uti ‘house’            '-      →    -            '-      →    -
2.3 inu ‘dog’              '-      →    -            '-      →    -
3.1 kuruma ‘vehicle’       -            -            -            -
3.4 atama ‘head’           '-      →    -            '-      →    -
3.5 kokoro ‘heart’         '-           '-           '-           '-
As was the case in Ōsaka and Kyōto, this /H/ tone loss in the dialect of Kōchi can
only be related to the behavior of the particle no in Middle Japanese, if we assume
that at an earlier stage Kōchi had a Tōkyō type location of the /H/ tone.
5.1.3 The loss of the special features of the particle no after monosyllabic nouns
The main difference between the behavior of no in Middle Japanese and in the
modern dialect of Tōkyō is that in Middle Japanese no also copied the final tone of
monosyllabic nouns. I assume – as Ramsey did – that the fact that the monosyllabic
nouns no longer show this behavior when no is attached must be the result of the
modern Tōkyō rule that the first and the second syllable in a phrase will not have the
same tone. (See the previous chapter.)
Neither Tōkyō nor Kyōto nor Kōchi shows the /H/ tone copying after
monosyllabic nouns, so that after monosyllabic nouns in all three dialects the tone of
no is the same as the tone of the other monosyllabic case particles. In the musical
notation marks added to recitation guides to the Heike monogatari 平家物語 on the
other hand (which express 18th century post-shift Kyōto type tone), a difference in
tone can be seen in case of tone class 1.3, depending on whether the particle ga or
the particle no is attached (Okumura, 1981: 439). The tone of 1.3 te ‘hand’ is '-
in case of te-ga but '- or - in case of te-no (and 1.3 me-no ‘of the eye’ is -).2
Furthermore, lexicalized forms of nouns of class 1.3 + no in modern Kyōto
show the same - tone as in the Heike monogatari musical notation marks.
2 It seems that in the dialect reflected in the Heike monogatari, the tone of many of the affected
tone classes had two forms. Okumura gives: 2.3 yama-no ‘of the mountain’ as -~-,
6 Lexicalized forms of tone class 1.3 + no in Kyōto
enogu (絵の具) ‘paints’, ‘colors’ '
hinoko (火の粉) ‘sparks’ '
kinoko (木の子) ‘mushroom’ '
kinome (木の芽) ‘tree bud’ '
Because of the secondary shift of the /H/ tone to the left (' > ') in
trisyllabic nouns in Kyōto, it is hard to say whether the - tone of e-no, hi-no and
ki-no goes back to - or -. However, these lexicalized forms do show that the
regularization of the tone of 1.3 nouns with no as '- in Kyōto was a late
development. The lexicalized forms also show that the tone of class 1.3 (which is /L/
in isolation in modern Kyōto) was once /H/ and not /L/, which is in accordance with
Ramsey’s theory.
Finally, in the Kōbe dialect (Kyōto type) the tone of class 1.3 is /L/ just as in the
modern Kyōto dialect, but the quality of no as a particle that mirrored the tone of the
preceding noun in Middle Japanese has been preserved: 3
7 The behavior of the particle no after tone class 1.3 in Kōbe
‘eye’ me:-ga ':- → me-no '-
‘picture’ e:-ga ':- → e-no '-
‘hand’ te:-ga ':- → te-no '-
‘vinegar’ su:-ga ':- → su-no '-
5.1.4 The distribution of the particle no /H/ tone cancellation
The distribution of the special tonal features of the particle no in the different
dialects is not always clear, as the tone that occurs with this particle is not always
mentioned. In case of the following dialects however, it is clear whether the particle
no behaves differently from the other monosyllabic case particles or not. Dialects
where the tone of no is special: Tsuruoka (Haraguchi, 2001), Hiroshima (Okuda,
1975), Narada (Okuda, 1975), Ōita (Kindaichi, 1954a/1983), Kōbe (see above),
Kyōto (Kobayashi, 1975), Ōsaka (Martin, 1987), Kōchi (Kobayashi, 1975),
Wakayama (Ramsey, 1979a), Chikuzen area of northeast Kyūshū (Kindaichi,
1954a/1983:30), Nagasaki (Kindaichi, 1954a/1983:31). Dialects where the tone of no
is the same as the tone of the other monosyllabic case particles: Izumo (Kobayashi,
1974), Aomori (Kobayashi, 1975), Yamaguchi (Kobayashi, 1975), Totsukawa
3.4 atama-no ‘of the head’ as -~- (1981:275), but 2.2 ie-no ‘of the house’ as
- and 3.2 kataki-no ‘of the enemy’ as - (1981:438).
3 The Kōbe data are based on the speech of one of my fellow students at the University of
Hokkaidō.
(Yamana, 1951), Ibukijima (Uwano, 1985), Kagoshima (Kobayashi, 1974), the
dialects of the Ryūkyūs.
It seems that the rule can be found in a large area in central Honshū, but that it is
lacking in the Nairin type Totsukawa dialect. It can be found on Shikoku4 and Kyūshū,
but it is lacking in the Gairin type tone system of Shimane prefecture (Matsue and
Izumo), and in the Chūrin type tone system of Yamaguchi. In the northeast of
Honshū, some dialects have the rule (Tsuruoka) but others do not (Aomori).
Although the rule is lacking in some of the Gairin type dialects, in the Gairin
type tone system of Ōita (Kindaichi, 1954a/1983:30) nouns of class 2.3 and 3.4 have
/H/ tone cancellation when no is attached. (In this Gairin type dialect, /H/ tone in
classes 2.2 and 3.2 is of course already absent.)
8 /H/ tone cancellation in Ōita
noun + ga noun + no
2.1/2 - -
2.3 '- → -
2.4/5 '- '-
In the Gairin type dialect of Tsuruoka (Haraguchi 2001), the /H/ tone is cancelled
when no attaches to nouns of class 2.3 (which have the /H/ tone on the final syllable
of the noun). In this dialect part of the merged class 2.4/5 shifts the /H/ tone from the
first to the second syllable if the second syllable contains an open vowel (see chapter
7). In these nouns, where the location of the /H/ tone on the final syllable does not
go back to proto-Japanese but is a later development, the final /H/ tone is not
cancelled.
In the dialect of Itoshima-gun in Fukuoka-ken, which belongs to the Hakata-
Fukuoka type that can be found in the Chikuzen area of northeast Kyūshū, class
2.1/2 has merged with class 2.3, and a dialect with two tone classes (2.1/2/3 vs.
2.4/5) has developed. (The merged class 2.1/2/3 has '- tone, but an interesting
feature is that when the first vowel is open and the second vowel close, the /H/ tone
shifts to the initial syllable (-).)
In this dialect tone class 2.3 is still kept separate when the particle no is attached,
in which case – just as in Ōita – the /H/ tone is cancelled (Kindaichi, 1954a/1983:30).
The fact that the /H/ tone is cancelled only in case of tone class 2.3 indicates that
tone class 2.1/2 did not contain /H/ tone before the merger with class 2.3, which
agrees with the idea that this tonal type developed from the adjacent Ōita type tone
system shown in (8).
4 In the Sanuki type dialects of Shikoku, in which class 2.3 has merged with class 2.1 (Ø tone) it
would be possible for the /H/ tone cancellation to occur with nouns of the separate class 2.2,
but I have found no mention of the rule in this area.
9 /H/ tone cancellation in Itoshima-gun
noun + ga noun + no
2.1/2 '- '-
2.3 '- → -
2.4/5 '- '-
Even in one Kagoshima type dialect in Nagasaki prefecture, the word-tone of nouns
with the particle no is special (Kindaichi 1954a/1983:31). The word-tones in this
dialect are close to those of the Kagoshima dialect proper: Class 2.1/2 contains a fall
and class 2.3/4/5 a rise. After class 2.3/4/5 however, the old characteristic of the
particle no of copying the tone of the preceding syllable has been preserved, just as
it still is in the Tōkyō type dialects.
10 The particle no in the Kagoshima type tone system of Nagasaki prefecture
noun + ga noun + no
2.1/2 A , - , -
2.3/4/5 B , - → , -
In this simple Kagoshima type tone system with no more than two word-tone
categories, which correspond to word-initial tone in Middle Japanese, it is
nevertheless more than just the word-initial tone of Middle Japanese that determines
the present-day tone. As mentioned in section 4.4, I see this as a strong indication
that the Kagoshima type tone system developed gradually from a Gairin type tone
system, and did not derive its two-way word-tone distinction directly from the initial
tone in proto-Japanese. If the Kagoshima type dialects of Kyūshū derived their two-
way word-tone distinction from nothing but the tone of the initial syllable in Middle
Japanese, it is hard to explain how the special tonal quality of an attached particle
like no (occurring at the end of the word) could be reflected in one of these dialects.
The /H/ tone cancellation rule seems to have disappeared independently in a
number of dialects, which probably had it at an earlier stage. The Gairin type
dialects of northeast Honshū and Shimane prefecture may have lost the rule very
early on, perhaps at the same time when the independent /L/ tone of the other case
particles was lost. (I have no information on the situation in the Gairin area around
Hamamatsu.) The Chūrin dialect of Yamaguchi prefecture almost certainly once had
the rule, as it is found both to the east of Yamaguchi and to the west, in
Kyūshū. Yamaguchi may have lost the rule under the influence of the
Gairin dialects of adjacent Shimane prefecture, but the Totsukawa dialects and
Ibukijima must have lost the rule independently.
This means that the distribution of this particular rule may not be able to tell us
much about the order in which the Japanese tone systems split off from proto-
Japanese, but it does tell us something that is possibly even more interesting: As the
origin of the rule can be traced back to a rather innocuous tonal feature of the
particle no in Middle Japanese, the tone system of the areas that do have the rule
must once have resembled that of Middle Japanese in astounding detail. It is this
observation that makes me believe that other more detailed features of Middle
Japanese, such as the distinction between tone classes 2.4 and 2.5 and 3.6 and 3.7,
may once have existed in these areas as well.
5.2 The tone of compound nouns in the modern dialects
The tone rules involved in noun compounding are among the most complicated
problems in Japanese phonology, and many studies have been devoted to clarifying
them, especially to the compound tone rules of the standard dialect of Tōkyō.
The compound noun tone rules are usually divided into two sets. One set deals
with compounds in which the second element of the compound is ‘long’, while the
other set deals with compounds in which the second element is ‘short’. If the second
element is ‘long’, i.e. three moras or more in length, the tone rules are relatively
simple and productive, but if the second element is ‘short’, i.e. less than three moras
in length, the tone that will occur when the noun is compounded depends on the tone
class of the noun that forms the second element of the compound. However, when the
same noun functions as the second element in a different compound, the resulting
compound may have a different tone. (It is this considerable
irregularity that makes the tone rules for compound nouns so complicated.)
This difference is partly related to the length of the first element of the
compound. Although in Tōkyō, the tone of compound nouns is determined by the
tone of the second element of the compound, the length of the first element has a
certain influence on the resulting tone. Compound tone with short second elements
and shorter first elements (less than three moras) is notoriously irregular, and seems
to be lexicalized, rather than governed by productive rules. Such compounds also
often have Ø tone. (See the list of examples of this phenomenon in section 5.2.2.)
It is therefore compounds with long first elements and short second elements that
have the more productive distinctions. A comparison of the tone of this type of
compound in Kyōto and Tōkyō has been used by Ramsey to argue for his
reconstruction of the Middle Japanese tones. Ramsey’s argument will be discussed
in section 5.2.3.
5.2.1 The tone of compound nouns with ‘long’ second elements
in Tōkyō and Kyōto
The relatively simple tone rules for compounds with ‘long’ second elements (i.e.
three moras or more in length) are as follows: /H/ tone is generally assigned to the
initial syllable of the second element, regardless of the original location of the /H/
tone and even in case of second elements that have Ø tone when occurring
independently. Although this rule is to a large extent productive, Tsujimura (1987)
accounts for a small number of exceptions by a rule that stipulates that second
elements that do not have the /H/ tone on one of the last two moras will preserve the
original location of the /H/ tone. Thus bi'zin ‘beautiful woman’ + konku'uru ‘contest’
becomes bizinkonku'uru ‘beauty contest’ and yama' ‘mountain’ + hototo'gisu ‘quail’
becomes yamahototo'gisu ‘mountain quail’.5
McCawley (1968) had analyzed the placement of /H/ tone (referred to as
‘accent’) on compound nouns in which the second element is long as follows: The
location of the /H/ tone of the second element predominates. If, however, the second
element has /H/ tone on the final syllable or has Ø tone, the /H/ tone will fall on the
initial syllable of the second element. Tsujimura’s treatment, however, accounts for
the exceptions in a more complete way. For instance: sato ‘village, hometown’ +
koko'ro ‘heart’ becomes satogo'koro ‘homesickness’, while in McCawley’s analysis
the /H/ tone in a compound like satogo'koro should not have shifted because it was
not on the final syllable. (I do not consider here compounds in which the second
element is already a compound.)
The rules that determine the placement of the /H/ tone in compounds with long
second elements in Tōkyō and Kyōto are the same: The /H/ tone will generally fall
on the initial syllable of the second element. But Kyōto has an additional rule: If the
first element starts with /L/ tone, the compound will start with /L/ tone, otherwise
the compound will start with Ø tone. (As /H/ tone always falls on the second
element of the compound, there are no compounds that start with /H/ tone.)
The fact that /H/ tone placement is the same in both dialects for compounds with
‘long’ second elements indicates that this compound rule predates the split between
Tōkyō type and Kyōto type tone. When distinctive /L/ tone, as opposed to /H/ tone
and Ø tone redeveloped in Kyōto as a result of the shift, this distinction was
superimposed on the older rule that governed the location of the /H/ tone.
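The placement rule for ‘long’ second elements, in the form attributed to Tsujimura above, can be illustrated with a short Python sketch. It rests on several assumptions that are mine and not part of the cited analyses: words are entered as hand-segmented lists of moras, the /H/ tone is given as a 1-based mora index (None for Ø tone), the voicing of the initial consonant of the second element is supplied by hand rather than derived, and the names long_compound_accent, render and kyoto_register are hypothetical. The apostrophe is written after the mora that carries the /H/ tone, as in the examples in the text.

```python
from typing import List, Optional


def long_compound_accent(first: List[str], second: List[str],
                         second_accent: Optional[int]) -> int:
    """Return the 1-based mora of the compound that carries /H/ tone.

    second_accent is the 1-based mora with /H/ tone in the second element
    when it occurs independently, or None if that element has Ø tone.
    """
    if len(second) < 3:
        raise ValueError("the rule covers second elements of three moras or more")
    # The original location survives only if the /H/ tone is NOT on one of
    # the last two moras of the second element.
    keeps_location = (
        second_accent is not None and second_accent <= len(second) - 2
    )
    # Otherwise the /H/ tone falls on the initial mora of the second element.
    accent_in_second = second_accent if keeps_location else 1
    return len(first) + accent_in_second


def render(first: List[str], second: List[str], accent: int) -> str:
    """Spell the compound with an apostrophe after the mora with /H/ tone."""
    moras = first + second
    return "".join(moras[:accent]) + "'" + "".join(moras[accent:])


def kyoto_register(first_starts_low: bool) -> str:
    """Additional Kyoto rule: the compound starts with /L/ tone only if the
    first element does; otherwise it starts with Ø tone."""
    return "L" if first_starts_low else "Ø"


if __name__ == "__main__":
    examples = [
        (["bi", "zi", "n"], ["ko", "n", "ku", "u", "ru"], 3),
        (["ya", "ma"], ["ho", "to", "to", "gi", "su"], 3),
        (["sa", "to"], ["go", "ko", "ro"], 2),   # voiced form entered by hand
        (["hu", "yu"], ["ge", "si", "ki"], 1),
    ]
    for first, second, acc in examples:
        print(render(first, second, long_compound_accent(first, second, acc)))
```

Keeping the Kyōto register in a separate function mirrors the point made above: the location of the /H/ tone is computed in the same way for both dialects, and the /L/ versus Ø distinction of Kyōto is added on top of it.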
5.2.2 The tone of compound nouns with ‘short’ second elements in Tōkyō
If the second element of a compound noun is no more than two moras long, each
second element has to be marked for the tone that will occur when the noun
functions as the second element in a compound.
As early as 1943, Wada Minoru noticed that there is a correlation between the
tone class to which the second element belonged in Middle Japanese, and the tone
that will occur in the dialect of Tōkyō when the noun is compounded. He established
the following correspondences for disyllabic nouns:
Tone classes 2.1 and 2.2 will be preceded by /H/ tone when functioning as the
second element in a compound (i.e. they will attach with ' pitch). Tone class 2.3
is split into two groups. The largest group will result in a compound with Ø tone (i.e.
5 This rule is only relevant in case of second elements that contain /H/ tone and that are more
than three moras in length, as shorter elements that contain /H/ tone already have the /H/ tone
on the initial syllable: huyu' ‘winter’ + ke'siki ‘view’ becomes huyuge'siki ‘winter view’.
in these cases 2.3 will attach with pitch.) The other group will be immediately
preceded by /H/ tone when functioning as the second element in a compound (i.e. in
these cases 2.3 will attach with ' pitch just as classes 2.1 and 2.2). Tone classes
2.4 and 2.5 will have /H/ tone on the initial syllable of the second element when
functioning as the second element in a compound (i.e. they will attach with '
pitch).6
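Wada’s correspondences for disyllabic second elements amount to a small lookup table, which the following Python sketch makes explicit. The labels PRE, INITIAL and FLAT, the function name short_compound_tone and the flag minority_2_3 are hypothetical names introduced for this illustration. The sketch assumes that the tone class of the second element is known, that membership of the smaller 2.3 group has to be marked lexically, and that no prediction is made when the first element is shorter than three moras, since such compounds are described above as largely lexicalized.

```python
from typing import Optional

PRE = "pre"          # /H/ tone immediately precedes the second element
INITIAL = "initial"  # /H/ tone on the initial syllable of the second element
FLAT = "flat"        # the compound has Ø tone

# Correspondences for disyllabic second elements, by Middle Japanese tone class
WADA = {
    "2.1": PRE,
    "2.2": PRE,
    "2.3": FLAT,      # the larger of the two 2.3 groups
    "2.4": INITIAL,
    "2.5": INITIAL,
}


def short_compound_tone(tone_class: str, first_moras: int,
                        minority_2_3: bool = False) -> Optional[str]:
    """Predicted Tokyo compound tone for a 'short' second element.

    Returns None when the first element has fewer than three moras,
    because those compounds are treated here as lexicalized.
    """
    if tone_class not in WADA:
        raise KeyError(f"unknown disyllabic tone class: {tone_class}")
    if first_moras < 3:
        return None
    if tone_class == "2.3" and minority_2_3:
        # the smaller 2.3 group patterns with classes 2.1 and 2.2
        return PRE
    return WADA[tone_class]


if __name__ == "__main__":
    print(short_compound_tone("2.1", 3))                     # first element of 3 moras
    print(short_compound_tone("2.5", 4))
    print(short_compound_tone("2.3", 4))
    print(short_compound_tone("2.3", 4, minority_2_3=True))
    print(short_compound_tone("2.4", 2))                     # short first element
```

The table is keyed on the historical tone class and not on the modern Tōkyō tone of the noun in isolation, because classes 2.2 and 2.3 have merged in isolation but remain distinct as second elements.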
As mentioned, these correspondences are most regular when the first element of
the compound contains three moras or more, as in case of shorter first elements the
resulting compound will often have Ø tone, regardless of what tone class the second
element belongs to. Matsumori (1993) gives an impressive list of examples of this
phenomenon from her own Tōkyō type dialect, which is shown in (11):7
The only nouns that do not show differences in tone depending on the length of
the first element are those members of tone class 2.3 that also result in a compound
with Ø tone when the first element of the compound is long. These are the nouns
that have preserved the distinction between tone classes 2.2 and 2.3, but only as the
second element in a compound.
11 Comparison of the tone of compound nouns with longer
and shorter first elements
2.1 kuti ‘mouth’: deiri'guti ‘entrance and exit’ but iriguti ‘entrance’
kaze ‘wind’: muka'ikaze ‘a head wind’ but kitakaze ‘north wind’
hako ‘box’: suzuri'bako ‘ink-stone case’ but getabako ‘geta box’
sake ‘alcohol’: tamago'zake ‘eggnog’ but nezake ‘a nightcap’
kama ‘pot’: suiha'nkama ‘rice cooking pot’ but tyagama ‘tea kettle’
musi ‘insect’: nanki'nmusi ‘bedbug’ but kemusi ‘hairy caterpillar’
take ‘bamboo’: sitiku'take ‘black bamboo’ but saodake ‘bamboo pole’
2.2 isi ‘stone’: seibutu'isi ‘inanimate (?) stone’ but toisi ‘whetstone’
uta ‘song’: komori'uta ‘lullaby’ but hanauta ‘humming’
kami ‘paper’: tutumi'gami ‘wrapping paper’ but katagami ‘(paper) dress pattern’
mati ‘town’: zyooka'mati ‘castle town’ but sitamati ‘downtown’
kaha ‘river’: Sumida'gawa ‘the Sumida river’ but Edogawa ‘the Edo river’
6 Chew (1964:86) and McCawley (1968, quoting Chew) address the tone that occurs in such
compounds from a completely synchronic perspective. When tone classes 2.2 and 2.3 (having
merged in modern Tōkyō) are lumped together, it is only possible to establish a very rough
correspondence between the tone of disyllabic nouns in isolation, and as second element in a
compound: Nouns that have ' tone independently, attach with ' tone in a compound.
Nouns that have ' tone independently will often result in a compound with Ø tone, and
nouns that have tone independently will often result in a compound with /H/ tone on the
final syllable of the first element.
7 Matsumori’s entries agree for the most part with the Tōkyō data from Hirayama’s dialect
dictionary (1960). In Kyōto, compounds with shorter first elements also usually lack /H/ tone,
but (as usual) the tone of the initial syllable of the compound is determined by the tone of the
initial syllable of the first element.
2.3 kutu ‘shoe’: undo'ogutu ‘sports shoe’ but nagagutu ‘boot’
kusa ‘grass’: hahako'gusa ‘cottonweed’ but ukikusa ‘duckweed’
kami ‘hair’: midare'gami8 ‘tangled hair’ but kurogami ‘black hair’
kumo ‘cloud’: nyuudo'ogumo ‘thunder cloud’ but amagumo ‘rain cloud’
uma ‘horse’: abare'uma ‘an unruly horse’ but taneuma ‘stallion’
kata ‘shoulder’: sizyu'ukata ‘age-related shoulder complaint’ but nadekata
‘sloping shoulders’
inu ‘dog’: Akita'inu ‘an Akita dog’ but norainu ‘stray dog’
2.4 ito ‘thread’: situke'ito ‘stitching thread’ but takoito ‘rope of a kite’
kasa ‘umbrella’: sandoga'sa ‘straw rain hat’ but okigasa ‘spare umbrella’
hasi ‘chopsticks’: uturiba'si ‘only eating the side-dishes’ but nuribasi ‘lacquered
chopsticks’
kasu ‘dregs, grounds’: siborika'su ‘pressed dregs’ but sakekasu ‘sake lees’
ato ‘trace’: yasikia'to ‘remains of a mansion’ but kizuato ‘scar’
hune ‘boat’: marutabu'ne ‘log canoe’ but sasabune ‘folded bamboo leaf boat’
hari ‘needle’: senninba'ri ‘thousand-stitch-belt’9 but kebari ‘fishhook’
2.5 koe ‘voice’: kasurego'e ‘hoarse voice’ but uragoe ‘falsetto’
saru ‘monkey’: Nihonza'ru ‘Japanese monkey’ but yamazaru ‘wild monkey’
mado ‘window’: garasuma'do ‘glass window’ but tenmado ‘a skylight’
kumo ‘spider’: asinagaku'mo ‘long legged spider’ but tutigumo ‘Earth Spider’
nabe ‘pan’: tyankona'be ‘Sumo wrestler’s stew’ but donabe ‘earthen pot’
2.3 tori ‘bird’: mikahatori ‘Mikawa bird’
iro ‘color’: midoriiro ‘green’
yama ‘mountain’: Asamayama ‘mount Asama’10
tama ‘ball’: kusudama ‘decorative paper ball’
mimi ‘ear’: digokumimi ‘ear-from-hell’11
hara ‘belly’: taikobara ‘pot belly’
heya ‘room’: benkyoobeya ‘study’
kata ‘person’: aitekata ‘other party’
koto ‘words’: hitorigoto ‘monologue’
saka ‘slope’: noborizaka ‘upward path’
8 But on the other hand also Nihongami ‘Japanese hairstyle’.
9 A soldier’s belt with a thousand stitches, of which each stitch is made by a different woman
who wishes the soldier good luck in war.
10 Interestingly, the 17th century work Bumō-ki (which reflects a post-shift Kyōto type tone
system) already contains the remark that the kana ‘ma’ of yama ‘mountain’ (class 2.3) is
pronounced hikusi 卑 (ヒクシ) ‘low’ when this word is used in isolation, but that in compounds such
as Nisiyama and Higasiyama the kana ‘ma’ is pronounced takasi 高シ ‘high’.
11 ‘Someone who always manages to overhear other people’s secrets.’ In Okuda’s Hiroshima
dialect this word is listed as digoku'mimi, so in Hiroshima mimi belongs to those 2.3 nouns that
attach with ' pitch.
5.2.3 The tone of compound nouns with ‘short’ second elements in Kyōto:
Wada’s discovery and its meaning for Ramsey’s theory
Apart from the fact that there was a relation between modern compound tone in the
dialect of Tōkyō and the historical tone class that a noun belonged to, Wada also
noticed that the location of the /H/ tone in these tone classes when they functioned as
the second element in compound nouns in Kyōto is the same as in Tōkyō. This is
despite the fact that they have a completely different tone in Kyōto when they are
used in isolation.12 (Just as was the case with compounds with ‘long’ second
elements, one difference between the tone of the compounds in Kyōto and Tōkyō is
that in Kyōto, if the first element starts with /L/ tone, the compound will start with
/L/ tone, otherwise with Ø tone.) The examples of the correspondence between
Kyōto and Tōkyō that Wada gave in his article are shown in (12). For each tone
class, first an example with Ø tone on the initial syllable in Kyōto is shown, and then
an example with /L/ tone on the initial syllable in Kyōto.13
First of all, Wada called attention to the fact that the compound noun tone rules
in Tōkyō and Kyōto must go back to a period when tone classes 2.2 and 2.3 had not
yet merged in these dialects. (And in the case of the Kyōto dialect, because of the
extant tone dot material from the old capital, Wada concluded that the compound
tone rules must therefore date from before the end of the Heian period.)
Secondly, Wada concluded that the correspondence between the location of the
/H/ tone between the Tōkyō dialect and the Kyōto dialect was too regular to be a
coincidence. The tone that these tone classes have as second element of a compound
had to be a remnant of the tone that they had before the split between the Kyōto type
and the Tōkyō type tone systems.
More than thirty years later, Ramsey (1979, 1980) quoted Wada’s conclusion as
evidence for his reconstruction of the Middle Japanese tones, as Wada’s discovery
offered independent confirmation of the tone system that Ramsey had reconstructed
for disyllabic nouns in Middle Japanese.14
12 Uwano (1997) remarks that he has often been surprised to find that many researchers are
unaware of the fact that – apart from generally corresponding to each other in a regular way –
the location of the /H/ tone in Tōkyō type tone and Kyōto type tone coincides when compound
noun tone is concerned. He speculates that to Wada, who was a native speaker of both the
dialects of Kōbe and Tōkyō, this fact may have been self-evident from the start. He also
mentions that in 1989 Sugitō Miyoko and Tawara Hiroshi, who do not refer to Wada’s article,
(re-)discovered a similar congruence in the location of the /H/ tone (of over 70%) between the
dialects of Tōkyō and Ōsaka. In addition, Okuda (1971/1975) also includes many examples of
a Tōkyō type location of the /H/ tone in compounds in Kyōto.
13 I usually avoid examples with moraic nasals or long vowels (see the introduction), but as these
are included in Wada’s examples I indicate the pitches of moras here, and not – as I normally
do – of syllables.
14 For an explanation as to why class 2.2 attaches with [] pitch and not with pitch, and
why class 2.5 attaches with pitch just as class 2.4, and not with pitch, even in Kyōto
– where the distinction between classes 2.4 and 2.5 in isolation has been preserved – see
section 5.11.
5.2 The tone of compound nouns in the modern dialects 137
12 Wada’s comparison of the tone of compound nouns in Tōkyō and Kyōto
2.1 Tōkyō Kyōto
Koobeusi ‘Kōbe cow’ ' '
madarausi ‘spotted cow’ ' ''
2.2
mikageisi ‘granite’ ' '
hiutiisi ‘flint’ ' ''
2.3
Nihoninu ‘Japanese dog’
Akitainu ‘Akita dog’ '15
2.4
sandogasa ‘straw rain hat’ ' '
Amidagasa ‘Amida umbrella’ ' ''
2.5
Sikokuzaru ‘Shikoku monkey’ ' '
tenagazaru ‘gibbon’ ' ''
The tone of disyllabic nouns as second element in a compound in Kyōto is the most
manifestly archaic in that it has preserved the pre-shift location of the /H/ tone, but
the tone in Tōkyō is archaic as well: As the second element in a compound, classes
2.2 and 2.3 have managed to (largely) preserve a difference in the tone of the initial
syllable that dates back to the Middle Japanese period.16
13 The tone of the second element in compound nouns
agrees with Ramsey’s reconstruction
Kyōto Middle Tōkyō
Japanese
isolation compound compound isolation
' 2.1 '
' ' 2.2 ' '
' 2.3 '
' ' 2.4 ' '
'' ' 2.5 ' '
15 Second elements that attach with Ø tone will have pitch in Tōkyō as well as Kyōto, if the
first element in Kyōto does not have /L/ tone. If the first element in Kyōto has /L/ tone the final
element will have pitch, or pitch if a particle is attached: Akitainu ',
Akitainu-ga '-.
16 The reason why part of tone class 2.3 now attaches as ' in both dialects will be discussed in
section 5.4.
As was the case with compounds with longer second elements, when distinctive
initial /L/ tone redeveloped in Kyōto as a result of the shift, this distinction was
superimposed on the older rules governing the location of the /H/ tone, which Tōkyō
and Kyōto shared.
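The layering of the two rules can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name and its parameters are hypothetical, the leading ' is taken to mark initial /L/ tone in Kyōto (as in the Kyōto forms cited in this chapter), and first elements that end in a heavy syllable are left out, because the position of the pitch fall then differs between the dialects.

```python
def compound_form(first, second, preaccenting, dialect, first_starts_low=False):
    """Return a romanized compound with ' marking tone, for a 'short' second element.

    preaccenting=True  -> the second element attaches with 'ØØ tone
    preaccenting=False -> the second element attaches with ØØ tone
    All names here are hypothetical and serve illustration only.
    """
    if dialect not in ("tokyo", "kyoto"):
        raise ValueError("dialect must be 'tokyo' or 'kyoto'")
    # Older, shared layer: the second element decides where /H/ tone falls.
    form = first + ("'" if preaccenting else "") + second
    # Newer, Kyōto-only layer: initial /L/ tone is taken over from the first element.
    if dialect == "kyoto" and first_starts_low:
        form = "'" + form
    return form


print(compound_form("kusuri", "yubi", True, "tokyo"))
print(compound_form("kusuri", "yubi", True, "kyoto", first_starts_low=True))
print(compound_form("soroban", "dama", False, "kyoto", first_starts_low=True))
```

With these inputs the sketch returns kusuri'yubi, 'kusuri'yubi and 'sorobandama. The flag first_starts_low is set by hand here so as to reproduce the Kyōto forms; the sketch does not derive it from the tone class of the first element.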
5.3 Incongruent register of compounds in the dialect of Kyōto
Although Wada’s discovery offers the strongest argument in favor of Ramsey’s
reconstruction of the Middle Japanese tones, additional evidence for Ramsey’s
reconstruction can be seen in the fact that the tone of the initial syllable of certain
compound nouns in Kyōto is inconsistent with the tone of the initial syllable of the
first element when this occurs in isolation. Martin (1987:221) for instance gives a
number of examples of nouns that do not start with /L/ tone in modern Kyōto, but
nevertheless yield compounds with initial /L/ tone when they function as the first
element: 2.1 midu ‘water’, 2.1 hana ‘nose’, 2.1 yaki ‘baking’, 17 2.2 isi ‘stone’.
Following Ramsey’s reconstruction the unexpected /L/ tone can be explained as
lexicalized remnants of the pre-shift tone system, as all of these examples started
with /L/ tone in Middle Japanese.
Conversely, the following nouns all start with /L/ tone in modern Kyōto (and
according to the standard theory started with /L/ tone in Middle Japanese), but
several compounds in which they appear as the first element have Ø tone ([H] pitch)
instead of /L/ tone on the initial syllable: 2.5 haru ‘spring’, 2.5 aki ‘autumn’, 2.5
ama- (allomorph of ame) ‘rain’, 2.5 ao ‘blue’, 2.4 kata ‘shoulder’. 18 Martin
(1987:303-307) also includes a long list of compounds with Ø tone that have the
following monosyllabic nouns with /L/ tone in modern Kyōto as the first element:
1.3 te ‘hand’, 1.3 me ‘eye’, 1.3 ki ‘tree’, 1.3 hi ‘fire’.
Ramsey’s reconstruction can again explain the lack of initial /L/ tone in these
compounds as lexicalized remnants of the pre-shift tone system, as all of these
examples started with /H/ tone in Middle Japanese.
5.4 The origin of the two types of reflexes of tone class 2.3 in Tōkyō
We have seen that in Tōkyō part of tone class 2.3 will attach with ØØ tone (=
pitch) and part will attach with 'ØØ tone (= Ø tone immediately preceded by /H/
tone and therefore pitch) when functioning as the second element in a
17 This word forms compounds with unexpected /L/ register in the Kyōto type dialect of
Wakayama as well: yakizakana '' ‘roasted fish’ (Uwano, 1997).
18 Examples (except kata cf. katami ‘upper body’ in Frellesvig, 1994:152) are from Martin
(1987:221–222 and 382).
compound. Matsumori (1993) has proposed two possible explanations for the two
different reflexes that occur when nouns of class 2.3 are compounded.
One explanation is that the productive rule for these nouns is in the process of
changing from attaching with ØØ tone to attaching with 'ØØ tone, and that the
number of nouns that attach with 'ØØ tone is growing. She remarks upon the fact
that in Wada’s description (1943) inu ‘dog’ attached with ØØ tone in Tōkyō (cf.
Nihoninu ‘Japanese dog’, Karahutoinu ‘Sakhalin dog’ and Akitainu ‘Akita dog’),
while in her own Tōkyō type speech it attaches with 'ØØ tone in Akita'inu. 19
According to Matsumori the weak point of this explanation is that it cannot account
for the fact that only tone class 2.3 is in this process of change, while tone class
2.4/5 is not affected.
I do not see a problem here, as the trigger of the change is no doubt the fact that,
as independent nouns, the members of class 2.3 have merged with class 2.2. The fact
that they now start to adopt the rules of class 2.2 is therefore not surprising. Class
2.4/5 on the other hand, has not merged with class 2.2 (or class 2.1) so that in case of
this class there is no cause for a change in the rules for compound nouns.20
Matsumori’s second explanation is that at the time of the formulation of these
compound tone rules, the tone of part of tone class 2.3 had already changed from
to , while another part of tone class 2.3 still preserved the original
tone.21 However, when a phonological class splits, such a split is usually based on
segmental or semantic grounds, which – as far as I can see – are absent in this case.
It is even possible for the same noun to attach with ØØ tone or with 'ØØ tone
depending on the compound. I therefore prefer Matsumori’s first explanation,
namely that the ØØ attachment rule of tone class 2.3 is no longer productive, and
that compounds in which class 2.3 attaches with ØØ tone are lexicalized remnants of
the older rule.22
19 Okuda (1971) also indicates Akita'inu as the tone in both Tōkyō and Hiroshima.
20 According to Uwano (personal communication) in the speech of the younger generation 2.4
and 2.5 nowadays often attach with pitch ('ØØ tone) as well. The number of nouns that
attach with pitch (i.e. classes 2.1 and 2.2, and a growing number of members of class 2.3)
is apparently becoming so large, that a new, generalized rule is developing in which all nouns
attach with 'ØØ tone.
21 Although Matsumori does not support Ramsey’s theory, we have seen in section 2.5 that her
reconstruction of the tone system of proto-Japanese is remarkably similar to Ramsey’s Middle
Japanese tone system, so that tone class 2.2 in Matsumori’s proto-Japanese has tone and
class 2.3 has tone. In Matsumori’s proto-Mainland Japanese tone system on the other hand,
which is a later development out of proto-Japanese, class 2.2 has tone and class 2.3 has
tone. In other words, the compounds in Kyōto and Tōkyō have preserved remnants of a
tone system that is older than the split between the Ryūkyūan dialects and the dialects of
mainland Japan.
22 An exceptional case are deverbal nouns that are formed by attaching the morpheme -mono,
which is a noun of class 2.3 meaning ‘thing’. This is the only type of compound involving a
noun of class 2.3 that occurs frequently enough to have developed a productive new rule in
Kyōto that is unrelated to the other rules for noun compounding in this dialect: It has ØØ tone
when attached to verbs of type A, but it has 'ØØ tone when attached to verbs of type B. There
If this is correct, we can expect compounds in which class 2.3 attaches with ØØ
tone to be older, but it is hard to test this assumption, as it is difficult to tell how old
a compound is. In (14) undo'ogutu ‘sports shoe’ is no doubt a new compound, and
class 2.3 in this word indeed attaches with 'ØØ tone, but ningenwaza ‘human skills’
and seizoomoto ‘original manufacturer’ in (15) also look like new compounds, and
in these cases class 2.3 attaches with ØØ tone. In the following example the same
second element attaches with a different tone in different compounds: midare'gami
‘tangled hair’ (where kami attaches with 'ØØ tone) and Nihongami ‘Japanese hair
style’ (where kami attaches with ØØ tone).
I expect that a compound like Nihongami is more likely to have been created
after Japan was again opened up to the rest of the world after the Meiji Revolution,
and that it may therefore be the more recent of the two compounds. If so, this
example does not confirm the idea that the more recently created compounds will
attach with 'ØØ tone either.23
It may be wrong to say that the ØØ attachment rule, although older, is no longer
productive. What may be the case is that the merger of classes 2.2 and 2.3 caused a
certain number of new compounds to be formed in which 2.3 attached with 'ØØ tone.
Compounds with tone class 2.3 as the second element now had two different rules to
choose from, and the choice between attaching with ØØ tone or with 'ØØ tone may
be decided by which is the most well-known compound in which a specific noun
occurs as the second element. The tone of the other compounds containing this word,
old or new, could then be (re)modeled after the most famous example. So
compounds that are clearly recent, in which nouns of class 2.3 attach with ØØ tone
could be explained as the result of analogy with a well-known older compound in
which the same second element was involved.
5.5 Compounds with tone class 2.3 in Hiroshima, Kyōto and Tōkyō
In 1971 Okuda published a study of the tone of compound nouns in his native
dialect of Hiroshima (Chūrin Tōkyō type), of which the concrete examples are listed
in appendices at the end of his book. He writes (1971: 244):
With respect to compounds with a ‘short’ final element, it should be pointed
out that the large majority of preaccenting morphemes listed in Appendix A
and deaccenting morphemes listed in Appendix B are also preaccenting and
deaccenting, respectively, when used as final element of a compound in the
Kyooto dialect; even though their tone, when used as independent words may
differ considerably from the Tookyoo dialect.
are exceptions, but these can be attributed to lexicalization of older rules (Frellesvig, 1999).
23 The tone of these compounds in the modern Kyōto dialect, Nihonga'mi and midarega'mi (and
also araiga'mi ‘washed hair’) seems to be based on the modern H'L tone of the word kami in
Kyōto. (Hiroshima, interestingly, also has Nihonga'mi, which may be a loan.)
The fact that tone classes 2.1 and 2.2 generally attach with 'ØØ tone is of course one
of the reasons for the agreement between morphemes that attach with ØØ tone and
morphemes that attach with 'ØØ tone in Tōkyō and Kyōto. It would be interesting to
see how much agreement there is if we exclude nouns of class 2.1 and 2.2 from the
comparison and consider only nouns of class 2.3. In (14) to (17), I have compared
Okuda’s Hiroshima data with data from Tōkyō and Kyōto.
Tōkyō and Kyōto data are mainly from Hirayama (1960) with a few examples
from Okuda. In case Hirayama and Okuda do not agree on the tone of a compound
or in case only one of the two includes a certain compound I indicate the source as O
(Okuda) or H (Hirayama). In case I could not find the Kyōto type tone of Okuda’s
examples in Hirayama’s dictionary, the Kyōto data are based on the entries in Nihon
koku-go dai-jiten (NKD). In these cases I indicate the source as (N).
Instances of irregular tone among the examples (i.e. compounds with tone other
than ØØ or 'ØØ) have been underlined.
14 Examples where the tone of class 2.3 in Hiroshima, Kyōto and Tōkyō coincides
and attaches with 'ØØ tone
Hiroshima Kyōto Tōkyō
sizimi'gai sizimi'gai (H) sizimi'gai ‘corbicula’
sizimiga'i (O)24
binbo'okuzi 'binboo'kuzi binbo'okuzi ‘losing lottery
ticket’25
kusuri'yubi 'kusuri'yubi kusuri'yubi ‘ring finger’
daiko'nasi daikon'asi daiko'nasi ‘piano leg’
daikonasi (H)
densyo'obato 'densyoo'bato (O) densyo'obato ‘carrier pigeon’
undo'ogutu 'undoo'gutu (H) undo'ogutu ‘sports shoe’
amiage'gutu 'amiage'gutu (H) amiage'gutu ‘laced boots’
nyuudo'ogumo 'nyuudoo'gumo (H) nyuudo'ogumo ‘thunder cloud’
nyuudoogu'mo (H)
doyo'onami doyoo'nami (H) doyo'onami ‘high waves in
summer’
awazi'sima awazi'sima (H) awazi'sima ‘Awaji Island’
hatizyo'ozima hatizyoo'zima hatizyo'ozima ‘Hachijō Island’
atari'dosi atari'dosi (H) atari'dosi ‘lucky year’
ataridosi (H)
kanna'duki 'kanna'duki kanna'duki ‘October’
yaguruma'giku yaguruma'giku (N) yaguruma'giku ‘cornflower’
24 NKD only gives sizimi'gai for Kyōto.
25 In Kyōto /H/ tone on a dependent mora is allowed, while in Tōkyō and Hiroshima it is not, so
that in Kyōto the pitch fall after the /H/ tone is heard after a heavy syllable while in Tōkyō and
Hiroshima it is heard in the middle of the heavy syllable.
hikari'goke hikari'goke (N) hikari'goke ‘luminous moss’
wasurena'gusa wasurena'gusa (N) wasurena'gusa ‘forget-me-not’
wasure'zimo wasure'zimo (N) wasure'zimo ‘late spring frost’
akatuki'yami akatuki'yami (N) akatuki'yami ‘dark dawn’
hotyu'uami hotyuu'ami (N) hotyu'uami ‘butterfly net’
yoosu'iike 'yoosui'ike (N) yoosu'iike ‘reservoir’
15 Examples where the tone in Hiroshima, Kyōto and Tōkyō coincides
and attaches with ØØ tone
Hiroshima Kyōto Tōkyō
sorobandama 'sorobandama sorobandama ‘abacus bead’
dosyoobone dosyoobone (H) dosyoobone ‘guts’
taikobara 'taikobara taikobara ‘potbelly’
ningenwaza 'ningenwaza ningenwaza ‘human skills’
seizoomoto 'seizoomoto seizoomoto ‘original
manufacturer’
aitekata aitekata aitekata ‘other party’
aitekata' (H)
hitorigoto 'hitorigoto hitorigoto ‘monologue’
niwatorigoya niwatorigoya niwatorigoya ‘chicken pen’
syoodikimono 'syoodikimono syoodikimono ‘honest person’
syoodikimono' (H)
simenawa simenawa simenawa ‘rope used for
sime'nawa (H) Shintō rituals’
kabetuti kabetuti kabetuti ‘adobe’
murasakiiro 'murasakiiro murasakiiro ‘purple’
yasinaioya yasinaioya yasinaioya ‘foster parent’
takarakura takarakura (N) takarakura, ‘treasury’
takara'kura
benkyoobeya benkyoobeya (N) benkyoobeya ‘study’26
In the following two sets of examples the location of the /H/ tone in Hiroshima,
Kyōto and Tōkyō does not coincide. In the first group the tone of Kyōto differs from
both Hiroshima and Tōkyō (or agrees only with one of the two) but still shows
reflexes that either have ØØ or 'ØØ tone. In the second group the tone of Hiroshima
and Tōkyō coincides, but the Kyōto tone seems to have been modeled after the
modern ' tone of class 2.3 in Kyōto.
26 See also kodomobeya ‘children’s room’ in both Kyōto and Tōkyō according to NKD.
16 Examples where the tone in Hiroshima, Kyōto and Tōkyō does not coincide (1)
Hiroshima Kyōto Tōkyō
sakura'bana sakurabana (H) sakura'bana ‘cherry blossom’
gomu'mari gomumari (H) gomu'mari ‘rubber ball’
abare'uma 'abareuma (H) abare'uma ‘an unruly horse’
satumaimo satuma'imo satumaimo ‘sweet potato’
hokkyoku'guma 'hokkyokuguma (N) hokkyoku'guma ‘polar bear’27
digoku'mimi digokumimi (H) digokumimi ‘ear-from-hell’
digoku'mimi (H)
nobori'zaka noborizaka noborizaka ‘upward path’
17 Examples where the tone in Hiroshima, Kyōto and Tōkyō does not coincide (2)
Hiroshima Kyōto Tōkyō
inge'nmame ingenma'me (H) inge'nmame ‘kidney-bean’
nigirimesi nigirime'si nigirimesi ‘rice ball’
nigiri'mesi (H)
itize'nmesi (mod.) itizenme'si itize'nmesi ‘one serving of
rice’
gomoku'zusi 'gomokuzu'si (N) gomoku'zusi ‘Gomoku sushi’28
5.6 What do the compounds in Hiroshima, Kyōto and Tōkyō tell us?
When Wada discovered that the distinction between tone class 2.2 and 2.3 had been
(partially) preserved in compounds in Tōkyō and Kyōto he concluded that the
compound noun tone rules in Tōkyō and Kyōto must date from a period when tone
classes 2.2 and 2.3 had not yet merged. The fact that there are nowadays compounds
with class 2.3 as a second member in which this tone class no longer attaches with
ØØ tone but with 'ØØ tone (just as tone class 2.2) in both dialects, is a development
that could only be expected in light of the fact that the two classes have merged
when occurring in isolation.29
27 NKD indicates that ‘polar bear’ can also occur with Ø tone in Tōkyō.
28 NKD indicates that ‘Gomoku sushi’ can occur as gomokuzu'si in Tōkyō. If the dish is originally
from the Kansai region this could be due to borrowing.
29 The merger apparently also resulted in a number of compounds where class 2.2 now attaches
with ØØ tone, just as class 2.3. In the following examples class 2.2 attaches with ØØ tone in all
three dialects.
Hiroshima Kyōto Tōkyō
heibangata heibangata heibangata ‘level pattern’
tamagogata 'tamagogata tamagogata ‘oval shape’
nusumiguse nusumiguse nusumiguse ‘kleptomania’
What is remarkable however, is that in the majority of cases Tōkyō, Kyōto and
Hiroshima agree on which compounds show the pre-merger attachment rule and
which compounds show the post-merger attachment rule: Despite the exceptions in
the list above, it is clear that the degree to which the dialects of Kyōto, Hiroshima
and Tōkyō agree on which nouns attach with 'ØØ tone and which nouns attach with
ØØ tone is much higher than can be explained as a result of mere coincidence.
The fact that the modern tone rules for compound nouns in central Japan are
very different from those of Middle Japanese (cf. 5.8 and subsections) shows that at
some point during the Middle Japanese period, new rules for compound nouns
developed in this area. At a certain point however, after the merger of classes 2.2
and 2.3, class 2.3 started to be confused with class 2.2, and a new type of compound,
in which class 2.3 attached with 'ØØ tone, developed.
In the development of the distribution pattern of the two types, processes similar
to the ones described in section 5.4 for the dialect of Tōkyō most likely played a
role: In some newly created compounds, nouns of class 2.3 now attached with 'ØØ
tone, and at a certain point the choice between 'ØØ or ØØ tone may have been
decided primarily by which was the most well known compound in which a certain
noun functioned as the second element.
In order for the surprising degree of agreement between the three dialects to have
developed, the period in which these developments were shared between them must
have lasted for a considerable time. This means that the new rules for compound
nouns developed some time before the occurrence of the leftward tone shift in the
Kyōto type dialects. It also means that the shift took place at a time when classes 2.2
and 2.3 were already being confused with each other, which would have been
around stage 3, or the transition from stage 3 to stage 4, outlined in the previous
chapter.
5.7 The tone rules for compound nouns in the Gairin type dialects
So far I have not come across published studies of the compound tone rules of the
Gairin type dialects of Hamamatsu and Ōita. My data on compound nouns in Ōita
are from a B.A. thesis by Okamoto Yasuhiro from the University of Kyūshū, kindly
provided to me by Matsuura Toshio from the same university. The compound noun
tone rules of Izumo in Shimane prefecture, Shizukuishi in Iwate prefecture and
Tsugaru in Aomori prefecture are from Hiroto & Ōhara (1953: 86-90), Uwano
(1997) and Kobayashi Yasuhide (1974) respectively.
The tone rules for compound nouns in these Gairin type dialects are
fundamentally different from the rules of central Japanese dialects such as
Hiroshima, Kyōto and Tōkyō:
In Izumo, if the first element has Ø tone the compound will have Ø tone,
irrespective of the length of the first or the second element.30 If the first element
contains /H/ tone, the compound will contain /H/ tone, but the rules which determine
the location of the /H/ tone are complex. I will not make an attempt at a
comprehensive description of the rules that determine the location of the /H/ tone in
Izumo, but the following observations show that in this respect too, the compound
tone rules of Izumo are fundamentally different from those of Hiroshima, Kyōto and
Tōkyō.
Although there is a similar kind of division into compounds with longer second
elements and compounds with shorter second elements, this is where the similarity
ends. In Izumo, in case of disyllabic second elements the /H/ tone will be on the
penultimate syllable of the compound if the final syllable contains a close vowel, cf.
2.3 + 2.2 iroga'mi ‘colored paper’ (Tōkyō iro'gami), 1.3a + 2.1 teku'bi ‘wrist’
(Tōkyō te'kubi), 1.3b + 2.1 hosa'ki ‘an ear of wheat’ (Tōkyō hosaki'), 2.5 + 2.3
amehu'ri ‘rainfall’, ? + 2.3 awada'ti ‘goose pimples’, 2.3 + 2.3 imoho'ri ‘potato
digging’ (Tōkyō idem), 2.3 + 2.3 kawagu'tu ‘leather shoes’ (Tōkyō Ø tone), 2.5 +
2.3 asao'ki ‘early rising’ (Tōkyō asa'oki), 1.3 + 2.3 seno'bi ‘stretching’ (Tōkyō
se'nobi), 3.4 + 2.1 otokobu'ri ‘handsomeness’ (Tōkyō Ø tone), 3.4 + 2.4 atamaka'zu
‘number of people’ (Tōkyō idem).
If the final syllable contains an open vowel the /H/ tone will be on the final
syllable, cf. 1.3a + 2.1 ehude' ‘paintbrush’ (Tōkyō e'hude), 1.3 + 2.3 nemoto' ‘root’
(Tōkyō nemoto'), 1.3 + 2.3 hamono' ‘knife’ (Tōkyō ha'mono), kinumono' ‘silk
goods’ (Tōkyō kinu'mono), yoake' ‘dawn’ (yoake'), nihuda' ‘baggage label’ (ni'huda),
kigire' ‘chip of wood’ (Tōkyō idem), aiiro' ‘indigo’ (Tōkyō Ø tone). As will be
discussed in more detail in chapter 7, such rightward shift of the /H/ tone blocked by
close vowels is a feature typical of the Gairin B type tone systems.
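The rule for disyllabic second elements in Izumo can be sketched as follows. This is an illustration only, not a full description of the Izumo rules: the function names are hypothetical, i and u are treated as close vowels and a, e and o as non-close, and the simple splitting into (C)V units does not handle forms that end in a moraic nasal.

```python
import re

CLOSE_VOWELS = {"i", "u"}  # assumption: a, e and o count as non-close ("open")


def syllables(word):
    """Split a romanized form into (C)V units; each vowel closes one unit."""
    return re.findall(r"[^aeiou]*[aeiou]", word)


def izumo_disyllabic(word, first_has_h=True):
    """Place ' after the syllable that carries /H/ tone (hypothetical helper)."""
    if not first_has_h:
        return word  # first element with Ø tone: the compound has Ø tone
    units = syllables(word)
    if units[-1][-1] in CLOSE_VOWELS:
        # Final close vowel: /H/ tone stays on the penultimate syllable.
        return "".join(units[:-1]) + "'" + units[-1]
    # Final non-close vowel: /H/ tone is on the final syllable.
    return word + "'"


for form in ("irogami", "tekubi", "ehude", "nemoto"):
    print(izumo_disyllabic(form))
```

For these four inputs the sketch gives iroga'mi, teku'bi, ehude' and nemoto', which are the Izumo forms cited above.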
In case of trisyllabic second elements, the length of the first element has
influence on the location of the /H/ tone in the compound. In case of monosyllabic
first elements, the /H/ tone will be on the initial syllable of the second element, cf.
ego'koro ‘talent for painting’, eha'gaki ‘picture postcard’, tego'koro ‘consideration’,
tezu'kuri ‘hand-made’, tebu'kuro ‘glove’, hidu'kuri ‘making fire’, hima'turi ‘fire
festival’. (In all of these examples Tōkyō has the /H/ tone in the same location as
Izumo.)
In case of disyllabic first elements, the /H/ tone will be on the penultimate
syllable of the second element cf. udekura'be ‘a trial of skill’, simanaga'si
‘banishment’, sumidawa'ra ‘charcoal sack’, yamanobo'ri ‘mountain climbing’,
akimatu'ri ‘autumn festival’, koinobo'ri ‘carp streamer’, haayasu'mi ‘spring holiday’,
haaisigo'to ‘needlework’. (Tōkyō on the other hand will have the /H/ tone on the
30 All of the following words for instance have Ø tone in Izumo, while in Tōkyō there is /H/ tone
on the initial syllable of the second element. Tōkyō: kigo'koro ‘disposition’, tobu'kuro ‘boxed
shutters’, tozi'mari ‘fastening doors’, tyaba'sira ‘having a tea stalk float upright in one’s tea’,
tyaba'take ‘tea plantation’.
first syllable of the second element: udeku'rabe, simana'gasi, sumida'wara,
yamano'bori, koino'bori, harisi'goto.)
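The contrast between Izumo and Tōkyō in compounds with trisyllabic second elements can be sketched in the same way. The sketch assumes that the first element contains /H/ tone, that the second element is given in its compounded form (for instance gokoro, dawara), and that every vowel counts as one syllable; all function names are hypothetical.

```python
import re


def syllables(word):
    """Split a romanized form into (C)V units; each vowel closes one unit."""
    return re.findall(r"[^aeiou]*[aeiou]", word)


def mark_after(units, index):
    """Join the units and put ' after units[index]."""
    return "".join(units[: index + 1]) + "'" + "".join(units[index + 1:])


def check_lengths(first_units, second_units):
    if len(second_units) != 3 or len(first_units) not in (1, 2):
        raise ValueError("expects a 1- or 2-syllable first and a 3-syllable second element")


def izumo_trisyllabic(first, second):
    f, s = syllables(first), syllables(second)
    check_lengths(f, s)
    if len(f) == 1:
        target = len(f)                # initial syllable of the second element
    else:
        target = len(f) + len(s) - 2   # penultimate syllable of the second element
    return mark_after(f + s, target)


def tokyo_trisyllabic(first, second):
    f, s = syllables(first), syllables(second)
    check_lengths(f, s)
    return mark_after(f + s, len(f))   # first syllable of the second element


print(izumo_trisyllabic("ude", "kurabe"), tokyo_trisyllabic("ude", "kurabe"))
print(izumo_trisyllabic("e", "gokoro"), tokyo_trisyllabic("e", "gokoro"))
```

The first line of output shows the two dialects diverging (udekura'be against udeku'rabe), while the second shows them coinciding (ego'koro in both), in line with the examples cited above.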
The fact that there is no /H/ tone shift blocked by close vowels in Izumo in these
cases suggests that the rules that decide the location of the /H/ tone in compounds
with longer second elements date from after the development of the /H/ tone shift.
In Ōita the following compounds have Ø tone, which shows that in Ōita too, first
elements with Ø tone generate compounds with Ø tone:31 2.1 + 2.1 osiire ‘a wall-
cupboard’ (Tōkyō Ø tone), 2.1 + 2.3 sakadati ‘inversion, handstand’ (Tōkyō Ø tone), 2.1
+ 2.3 torigoya ‘aviary, chicken coop’ (Tōkyō Ø tone), 2.1 + 2.3 turizao ‘fishing-rod’
(Tōkyō Ø tone), 3.1 + 2.1 kodomozure ‘bringing children along’, 4.1 + 2.2
moritukekata ‘way of serving’, 4.1 + 2.3 itazuramono ‘a mischief-maker’, 3.1 + 1.3
asobiba ‘playground’ (Tōkyō Ø tone).
If the first element in Ōita contains /H/ tone, the compound will also contain /H/
tone: 2.5 + 2.1 amami'zu ‘rain-water’ (Tōkyō ama'mizu), 2.5 + 2.1 asehuki' ‘wiping
away sweat’, 2.5 + 2.3 ao'nori ‘green laver’ (Tōkyō idem), 2.5 + 2.3 ase'kaki
‘breaking into sweat’, 2.5 + 2.3 nama'mono ‘raw/perishable goods’ (Tōkyō idem),
2.5 + 2.4 aozo'ra ‘blue-sky’ (Tōkyō idem), 2.5 + 2.4 amaga'sa ‘umbrella’ (Tōkyō
idem), 2.5 + 2.4 namaga'si ‘unbaked cake’ (Tōkyō idem), 2.3 + 2.1 kiriki'zu
‘cutting-wound’ (Tōkyō kiri'kizu), 2.3 + 2.1 kiriku'ti ‘incision’ (Tōkyō kiri'kuti), 2.3
+ 2.2 iroga'mi ‘colored paper’32 (Tōkyō iro'gami), 2.3 + 2.2 asio'to ‘footfall’ (Tōkyō
asioto'), 2.3 + 2.3 asimoto' ‘near, below the feet’ (Tōkyō asimoto'), 2.3 + 1.3 huroba'
‘bathroom’, 2.3 + 1.3 huro'ya ‘bathhouse’.
The description of compound tone in Shizukuishi by Uwano treats only
compound tone with longer second elements. Just as in Izumo and Ōita, the first
element determines whether the compound will contain /H/ tone or not: If the first
element has Ø tone the compound will have Ø tone. If the first element contains /H/
tone, the compound will contain /H/ tone, which will be located on the first or the
second syllable of the second element, depending on the segmental structure of the
second element.33
In the Tsugaru dialect (Kobayashi, 1974) the first element decides whether a
compound contains /H/ tone or not, in the same way as in the other Gairin type
dialects. If the first element contains /H/ tone the compound will contain /H/ tone,
but the location of the /H/ tone is determined by the tone class of the second element.
In addition, the segmental structure has influence. The location of the /H/ tone is –
among other things – influenced by the rightward shift of /H/ tone blocked by close
vowels that is typical of the Gairin B dialects of northeast Japan.
31 However, 3.6 + 2.3 hidariasi ‘left-foot’, 2.3 + 2.1 utidome ‘bringing something to a close’
and 2.4 (?) + 2.1 okuyuki ‘going into the back’ do not agree.
32 Example from Hirayama ed. (1992).
33 In the dialect of Shizukuishi (cf. section 1.1.1) the pitch assignment rules are different from the
rules of the familiar Gairin type tone systems that surround this dialect. The derivation of the
Shizukuishi tone system from this type is beyond dispute.
The tone rules for compound nouns in the ancestral tone systems of Tōkyō,
Kyōto and Hiroshima must have been very similar. The rules in the Gairin type
dialect of Shimane however, appear to be quite different from those in northeast
Kyūshū: In Izumo the location of the /H/ tone in the compound is determined to a
large extent by the length of the first and the second element, while in Ōita, the
location of the /H/ tone in the compound appears to be determined by the tone class
of the second element. (The information available to me on the compound tone rules
of the Tōhoku region is not sufficient to see if these dialects are more like Shimane
or more like Kyūshū.)
The most fundamental difference between Tōkyō, Kyōto and Hiroshima (and
according to Uwano (1997) also Kanazawa and Toyama) and these Gairin type
dialects is that in the former the tone of the second element determines whether a
compound will contain /H/ tone or not, as well as the location of the /H/ tone in the
compound. In the Gairin type dialects on the other hand, the tone of the first element
determines whether the compound will contain /H/ tone or not. The rules that
determine the location of the /H/ tone in these dialects are complex, and show much
less agreement among each other than the rules in Tōkyō, Kyōto and Hiroshima.
A comparison with the tone rules for compound nouns in Middle Japanese may
shed light on the question of which type is more archaic.
5.8 The tone rules for compound nouns in Middle Japanese
The tones of Middle Japanese compounds have been adopted from Martin
(1987:234-239) and are based on the entries in the Kanchi-in-bon of Ruiju myōgi-
shō (i.e. they are based on an MJ ‘Chūrin’ type tone system) but I have reversed
Martin’s tones.34
The tone rules for compound nouns in Middle Japanese were quite irregular.
They may already have been a mixture of productive rules and lexicalized older
rules. We can observe a similar split between the rules for compounds with longer
and shorter second elements as can be found in the modern dialects.
5.8.1 Compounds with ‘long’ second elements
If the second element of a compound is three syllables or longer, the initial tone in
Middle Japanese will be determined by the first element. If the initial tone of the
compound is /L/, there will be a change of tone at the final syllable of the second
element. If the initial tone of the compound is /H/, there will be a change of tone at
the next to last syllable of the second element. This is irrespective of the original
tone of the second element.
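The rule just stated can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the 0-based syllable indexing and the encoding of tones as the letters "H" and "L" are assumptions made for the example, not part of the source. The sketch returns only the initial tone and the position at which the tone changes, and it deliberately ignores the original tone of the second element, as the rule requires.

```python
def long_compound_tone_change(first_initial: str, n_first: int, n_second: int):
    """Return (initial tone, 0-based index of the syllable where the tone changes).

    first_initial: initial tone of the first element, "H" or "L" (assumed encoding).
    n_first, n_second: syllable counts of the first and second element.
    """
    if first_initial not in ("H", "L"):
        raise ValueError("tone must be 'H' or 'L'")
    if n_second < 3:
        raise ValueError("rule applies only to second elements of three or more syllables")
    total = n_first + n_second
    # /L/-initial compounds: change at the final syllable of the second element.
    # /H/-initial compounds: change at the next to last syllable of the second element.
    change_at = total - 1 if first_initial == "L" else total - 2
    return first_initial, change_at


# A 2-syllable first element plus a 3-syllable second element (5 syllables in all):
print(long_compound_tone_change("L", 2, 3))
print(long_compound_tone_change("H", 2, 3))
```

For a five-syllable compound the change falls on the fifth syllable when the compound starts with /L/, and on the fourth when it starts with /H/.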
34 I have also changed the tone class indication of kusi ‘skewer’ from 2.2 to 2.3, and of mura
‘bunch’ from 2.2 to 2.1, as these are the tone classes of these nouns that Martin himself
indicates in his vocabulary list.
148 5 Arguments in favor of Ramsey’s theory based on internal reconstruction
19 Compound nouns with ‘long’ second elements in Middle Japanese
2.1 kizu ‘wound’ + 3.1 tokoro ‘place’ → kizudokoro ‘wounded spot’
2.1 take ‘bamboo’ + 3.4 hakari ‘measure’ → takebakari ‘yardstick’
2.1 kaha ‘river’ + 3.1 yanagi ‘willow’ → kahayanagi ‘purple willow’
2.3 mimi ‘ear’ + 3.2 kusari ‘chain’ → mimigusari ‘ear pendant’
2.4 ine ‘rice plant’ + 3.4 turubi ‘mating’ → inaturubi ‘lightning’
2.4 wara ‘straw’ + 3.1 humide ‘brush’ → warahumide ‘straw brush’
If the Gairin type /H/ tone spreading, and the later tone reduction are applied to
this system, the result is the following: Compounds that started with /L/ tone will
develop Ø tone in the modern Gairin dialects, which is in agreement with the
modern Gairin rules. Compounds that started with /H/ tone will develop /H/ tone on
the first syllable of the second element, which agrees – at least partly – with the
modern Gairin rules.
The fact that the most basic part of the modern Gairin rules, namely that the first
element determines whether the compound contains /H/ tone or not, can be derived
from the rules of Middle Japanese in a straightforward manner means that the rules
of proto-Japanese must have resembled those of Middle Japanese.
The compound tone rules of the large central area that includes Tōkyō, Kyōto
and Hiroshima on the other hand, can only be related to the tone that occurred in
Middle Japanese with first elements that started with /H/ tone. When the tone
reduction is applied to such compounds, the single remaining /H/ tone would be on
the first syllable of the second element. This is what we find in these dialects
irrespective of the original tone class of the first element. This means that at some
point, in central Japan, the rules that applied in case of first elements that started
with /H/ tone, were generalized. (This must have happened after the compilation of
Ruiju myōgi-shō 類聚名義抄.)
The comparison of the tone of these compounds in the modern dialects and
Middle Japanese provides another strong argument for Ramsey’s reconstruction.
There is a clear connection between the rules of Middle Japanese in Ramsey’s
reconstruction and the rules of the modern dialects. It is possible to correctly predict
the presence or absence of /H/ tone in the Gairin dialects, as well as the location of
the /H/ tone in the word in the central Japanese dialects.
By contrast, it is impossible to establish any kind of connection between the
rules of the modern dialects and the rules of Middle Japanese in the standard
reconstruction: The central Japanese dialects for instance, which stem from the same
region as the Middle Japanese material, all have the /H/ tone on the first syllable of
the second element. In the standard reconstruction of the Middle Japanese tone
system the tone of these compounds is and , which cannot be
related to the location of the /H/ tone in central Japan at all. Nor does this
reconstruction offer a link with the presence or absence of /H/ tone in the Gairin
dialects.
5.8.2 Compounds with ‘short’ second elements
For compound nouns with ‘short’ second elements I will restrict myself to
compounds with disyllabic nouns as second element. The most frequently occurring
reflex is given first and regarded as regular and/or productive.
The initial tone of a compound will be the same as the initial tone of the first
element. Whether a compound starts with /H/ or /L/ tone can profoundly influence
the realization of the second element.
20 Compound nouns with class 2.1 as second element in Middle Japanese
2.1 + 2.1
+ → 2x kasabuta ‘scab’, nihatori ‘chicken’
2.2 + 2.1
+ → 3x kahatake ‘river bamboo’, humibako ‘box for letters’,
isigani ‘rock crab’
+ → 1x hitodomo ‘people’
2.3 + 2.1
+ → 4x kahamusi ‘caterpillar’, kusomusi ‘gold bug’,
tamakizu ‘gem flaw’, yamamomo ‘wild peach’
+ → 2x hamabisi ‘burnut’, nahasaba ‘dolphin’
+ → 1x yamasuge ‘wild sedge’
+ → 1x yumuhazu ‘bowstring notch’
2.4/5 + 2.1
+ → 2x (2.4) zenigasa ‘ringworm’, (2.5) asagaho ‘morning
glory’
+ → 2x (2.4) warabuta ‘straw lid’, (2.4) inamura ‘rick’
+ → 1x (2.4) uribahe ‘melon fly’
Originally, the level tone classes (2.1 and 2.3) seem to have leveled out all changes
in pitch when they attached as second element in a compound. (See also class 2.3.)
The reflex for 2.2 + 2.1 may therefore be an older lexicalized form, while
the newer rule + → most likely developed as follows: *
> * > .
The reflexes for compounds that start with /H/ tone may likewise be
older lexicalized forms, while the reflexes may be the result of a newer
productive rule. The newer rule seems to be modeled after the productive rule for
/H/ starting compounds with longer second elements that we have seen in the
previous section, as these compounds also ended in /LH/ tone.
21 Compound nouns with class 2.2 as second element in Middle Japanese
2.1 + 2.2
+ → 2x sakaduki ‘wine cup’, kanamari ‘metal bowl’
2.2 + 2.2
+ → 1x tabibito ‘traveller’
2.3 + 2.2
+ → 4x mamegara ‘bean pod’, yumiduru ‘bowstring’,
nahas/zemi ‘female cicada’, yumiduka ‘bow hilt’
+ → 3x hamaguri ‘clam’, yamanasi ‘wild pear’, siribone
‘tail bone’
+ → 1x tukigoro ‘the past few months’
2.4/5 + 2.2
+ → 2x (2.4) mugigara ‘barley husk’, (2.4) nakagoro
‘midway’
+ → 1x (2.5) tateisi ‘upright stone’
Originally, the rules for compound nouns with the non-level tone classes 2.2, and
2.4/5 as the second element seem to have been as follows: If the initial tone of the
first element coincided with the initial tone of the second element, the tones of the
second element remained unaltered.35 If the initial tone of the first element did not
coincide with the initial tone of the second element, the tones of the second element
were reversed. Thus + → but + → and +
→.
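The older rule described in this paragraph, under which the second element is either kept or reversed, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the encoding of tones as strings of "H" and "L" are hypothetical, and the patterns used in the usage lines are placeholders, not the attested tone of any particular tone class.

```python
FLIP = {"H": "L", "L": "H"}


def attach_non_level(first_initial: str, second_tones: str) -> str:
    """Tone of a non-level second element inside a compound (older rule).

    first_initial: initial tone of the first element, "H" or "L".
    second_tones: tones of the second element in isolation, e.g. "HL".
    """
    if first_initial not in FLIP or not second_tones or any(t not in FLIP for t in second_tones):
        raise ValueError("tones must be written with 'H' and 'L' only")
    if first_initial == second_tones[0]:
        # Initial tones coincide: the second element remains unaltered.
        return second_tones
    # Initial tones differ: every tone of the second element is reversed.
    return "".join(FLIP[t] for t in second_tones)


# Placeholder patterns, for illustration only:
print(attach_non_level("H", "HL"))  # initial tones coincide
print(attach_non_level("L", "HL"))  # initial tones differ
```

Only the initial tone of the first element is consulted; the rest of the first element plays no role in this sketch.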
While these reflexes in compounds that start with /H/ tone are probably
lexicalized remnants of an older rule, the more frequent reflexes most
likely represent a newer productive rule. We see again that a productive rule for
compound nouns that started with /H/ tone seems to have been spreading, replacing
older rules.36
22 Compound nouns with class 2.3 as second element in Middle Japanese
2.1 + 2.3
+ → 3x hatamono ‘loom’, kanakuso ‘slag’, toriami ‘bird net’
+ → 1x kubikasi ‘pillory’
2.2 + 2.3
+ → 3x kahagame ‘river tortoise’, isigame ‘terrapin’,
ihagoke ‘rock moss’
+ → 1x hatahoko ‘flagged spear’
35 The only example that does not agree is 2.2 + 2.2 , but as this type of compound is
represented by no more than one example (tabibito ‘traveler’) this reflex may be an exception.
36 The fact that in this newer system, the tones of 2.2 did not need to be reversed may have played
a role in the adoption of this rule as well.
2.3 + 2.3
+ → 8x mimikuso ‘earwax’, tutigura ‘cellar’, tutimuro
‘cellar’, hanagame ‘flowerpot’, hanabusa ‘calyx’,
kamebara ‘(an ailment)’, yamaguha ‘wild
mulberry’, tamagusi ‘sprig of the sakaki tree’
2.4/5 + 2.3
+ → 2x (2.4) muginaha ‘cruller’, (2.5) mayuzumi ‘eyebrow
paint’
+ → 1x (2.4) inaguki ‘rice stalk’
+ → 1x (2.4) kasugome ‘wine lees’
+ → 1x (2.4) waragutu ‘straw shoes’
The reflexes with level tone seem to indicate that tone class 2.3 originally leveled
out all changes in pitch as second element in a compound, just as class 2.1. The
frequent reflex of class 2.2 + 2.3 on the other hand, may be a newer rule
which developed as follows: * > . (See also the development in
compounds of class 2.2 + 2.1.)
As with class 2.2, the rules for the non-level tone class 2.4/5 were as follows: If
the initial tone of the first element coincided with the initial tone of the second
element, the tones of the second element remained unaltered. If the initial tone of the
first element did not coincide with the initial tone of the second element, the tones of
the second element were reversed. Thus + → and + →
but + → and + → .37
23 Compound nouns with class 2.4/5 as second element in Middle Japanese
2.1 + 2.4/5
+ → 7x (2.4) turibune ‘fishing boat’, (2.4) kanaduti
‘hammer’, (2.4) kanaduwe ‘metal staff’, (2.4)
kanabasi ‘metal chopsticks’, (2.4) kutibasi ‘beak (of
a bird)’38, (2.5) kananabe ‘metal pan’, (2.5) kutihibi
‘chapped lips’
37 The only example of tone reversal in a compound of which the initial tone of the first element
coincided with the initial tone of the second element is kuhamayu ‘silkworm’ .
Perhaps the productive rule for compound nouns that start with /H/ tone had started spreading
to this type of compound also.
38 I have added this example from Martin’s list to this group although Martin identifies the second
element as 2.1 hasi ‘edge’. I find an identification of the second element with 2.4 hasi
‘chopsticks’ more likely as this is in agreement with the tone of the compound. The Japanese
word for ‘chopsticks’ may very well have derived from ‘beak’. The beak of a bird (tori no hasi)
and a pair of chopsticks both consist of two hard oblong objects that squeeze together in order
to pick up items of food. The tweezer-like ‘folding chopsticks’ (the archaic type that was most
likely first introduced in Japan) furthermore have a much stronger resemblance to the beak of a
bird than the modern chopsticks which are made of two separate pieces. (See the picture in
Nakagawa, 2007:21.) Despite the fact that chopsticks were introduced in Japan from China or
2.2 + 2.4/5
+ → 1x (2.4) kamizeni ‘paper money’
2.3 + 2.4/5
+ → 2x (2.4) kahaginu ‘fur garment’, (2.4) haraobi ‘belly
band’
+ → 1x (2.5) tutinabe ‘earthenware pot’
+ → 1x (2.4) yamabiyu ‘mountain amaranthus’
2.3 + 2.4/5
+ → 1x (2.5) kuhamayu ‘silkworm’
2.4/5 + 2.4/5
+ → 4x (2.4 + 2.4) kinugasa ‘silk umbrella’, (2.4 + 2.4)
kinuita ‘fulling block’ (x), (2.4 + 2.4)
mugikasu ‘barley bran’, (2.5 + 2.4) amaginu
‘raincoat’
5.9 How old are the tone rules for compound nouns in central Japan?
We see that in Middle Japanese the tone of the initial element of the compound had
a profound influence on the tone of the second element of the compound, both in
case of compounds with short, and in case of compounds with long second elements.
In many cases the tone of the first element could cause a reversal of the tones of the
second element.
The modern rules for compounds with long second elements in Kyōto, Tōkyō
and Hiroshima are fundamentally different, as the tone of the first element has no
influence on the tone of the second element. Even in Kyōto, where the initial tone of
the first element is adopted by the compound, this has no bearing on the location of
the /H/ tone which is determined by the second element alone.
It seems to be the case therefore, that a quite important change occurred in the
tone rules for compound nouns in central Japan after the compilation of the work
that forms the origin of the Ruiju myōgi-shō lineage of which the Kanchi-in-bon
観智院本 forms part. (This was around 1100, or at the latest around 1180; Satō ed.
1977:521-522.)39
The modern rules, which Tōkyō, Kyōto and Hiroshima all inherited from their
ancestral dialect, must have developed sometime after the 12th century, but before
Korea, a purely Japanese etymology is likely, as the Japanese word is not related to Chinese or
Korean.
39 The alternative solution, namely that the modern rules are ‘older’ than the rules of Middle
Japanese would mean that the modern Kyōto dialect is not the direct descendant of Middle
Japanese, and that Middle Japanese died out in the Kyōto area, and was replaced by another
dialect, that had preserved older compound tone rules. I find this hard to imagine: After the 11th
century, the dialect of Kyōto has always remained prestigious, which makes it unlikely that it
was replaced by a different dialect.
the Tōkyō type and the Kyōto type tone systems split (i.e. before the leftward tone
shift in Kyōto).
The rules of the modern dialects can only be related to the rules of Middle
Japanese in case of compounds that started with /H/ tone in Middle Japanese. (As we
have seen this was also the case with compounds with longer second elements.)
The productive rule for class 2.1 for instance, generated compounds with
tone. When the /H/ tone restriction is applied to such compounds, the result
is a compound in which class 2.1 attaches with pitch (with a pitch fall before
the second element), which agrees with the modern productive rule. The older
lexicalized rule which generated compounds with tone, could account for
the irregularity of the reflexes in case of compounds with shorter first elements. (The
tone of compounds with short second elements and short first elements is
notoriously irregular, and seems to be lexicalized, rather than governed by
productive rules.)
The productive rule for class 2.2 also generated compounds with tone,
and after the application of the /H/ tone restriction, the resulting tone again agrees
with the modern rules for class 2.2. (The older lexicalized rule which resulted in
compounds with tone could again account for the irregularity of the
reflexes in case of compounds with shorter first elements.)
The rules for class 2.3 yielded compounds with tone. The modern rules
agree with this insofar as class 2.3 indeed attaches without a pitch fall before the
second element (i.e. with pitch) but it is unclear why the expected pitch fall
after the final syllable of the compound in the modern dialects is suppressed.
The rules for class 2.4/5 yielded compounds with tone. The modern
rules after the /H/ tone restriction agree with this as class 2.4/5 attaches with
pitch.
It seems therefore, that a generalization of the rules that applied when the first
element started with /H/ tone in Middle Japanese formed an important part of the
development towards the modern rules. The main development from the tone system
of Middle Japanese to tone systems of the modern Tōkyō type dialects involves the
reduction of the /H/ vs. /L/ tone system to a /H/ vs. Ø tone system. It is natural to
look for a link between the two developments. This link is there in the circumstance
that phonological /L/ tone was disappearing from the tone system in this period: The
reason why the rules that applied in case of first elements that started with /H/ tone
were generalized must be because /L/ tone was being eliminated from the system.
After this generalization had taken place, there was no longer any influence of
the tone of the first element on the resulting compound. From then on, the tone of
the compound was determined by the tone of the second element only, which
attached according to the old rules that had applied in case of compounds with initial
/H/ tone.
As will be explained in more detail in part II, I date the development towards a
restricted tone system (at least in central Japan) to around 1250 or somewhat earlier,
and I date the leftward shift in Kyōto to the mid to late 14th century. This means that
the new compound tone rules had a period of 100 to 150 years to spread in the area
of Tōkyō, Kyōto and Hiroshima. When the leftward tone shift in Kyōto took place,
this separated the tone systems of Tōkyō and Hiroshima from each other. Such a
development would explain the agreement in the tone of compounds in the different
dialects seen in sections 5.2.3 and 5.5.
When phonological /L/ tone (i.e. the /L/ tone that is limited to the initial syllable)
redeveloped in Kyōto as a result of the tone shift, this new distinction was
superimposed on the already existing compound rules. Just as in Tōkyō, Hiroshima,
Kanazawa and Toyama, the location of the /H/ tone in the word is determined by the
second element of the compound, but in addition, the first element of the compound
determines whether the initial syllable of the compound will have /L/ or Ø tone. The
Kyōto compound tone rules are in origin identical to the rules of Tōkyō, Hiroshima,
Kanazawa and Toyama, but with the /L/ vs. Ø distinction of the first element added
later on.
5.10 How old are the rules for compound nouns
in the Gairin type dialects?
The rules for compound nouns in the different Gairin type tone systems can be
related to the rules of Middle Japanese in the following way: If the first element in
Middle Japanese started with /L/ tone, the resulting compound either had /L/ tone
throughout, or ended in /LH/ tone. (I disregard here the newer rule for nouns of class
2.1 and 2.3 that attach after class 2.2. I take the older rule, in which all changes in
pitch were leveled out when the level tone classes attached as the starting point.)
In each case the modern reflex in the Gairin type dialects would be a compound
with all Ø tone, which agrees with the modern rules. If the first element started with
/H/ tone in Middle Japanese, the resulting compound would contain a transition
from /H/ to /L/ somewhere in the word, and the compound in the Gairin type dialects
contains /H/ tone.
The most basic part of the modern Gairin rules, namely that the first element
determines whether the compound contains /H/ tone or not, can thus be related to the
tone system of Middle Japanese, which means that the tone rules for compound
nouns in proto-Japanese were not unlike the (lexicalized, older set of) rules that can
be found in Middle Japanese.40
The question of where in the compound the /H/ tone occurs, when it is present,
is more complicated, however. The rules that determine the location of the /H/ tone in
the compound in Izumo and Ōita do not show a direct link with each other, or with
Middle Japanese, and appear to be the result of independent developments. These
40 Middle Japanese stems from a different region, and is too late to be the direct ancestor dialect
of Izumo and Ōita (the MJ ‘Gairin’ material most likely reflects the tone system of the Gairin
type area around Hamamatsu), so that the similarity must be traced back to proto-Japanese.
independent developments probably occurred at the moment when each of the
dialects went through the process of /H/ tone restriction that changed the Japanese
tone systems so fundamentally. As I will argue in section 10.7, the /H/ tone
restriction in western Japan must have taken place sometime before the 10th century.
Summarizing we can say that the regular reflexes in the central Japanese dialects
have preserved the tone that the second element had (when the first element of the
compound started with /H/ tone) in the newer, productive rules of Middle Japanese.
The irregular reflexes may have developed from the lexicalized older rules that can
also be seen in Middle Japanese.
The Gairin reflexes on the other hand, have preserved the distinction of the
initial tone of the first element in proto-Japanese. To what extent, and in what way, the
location of the /H/ tone in the different Gairin dialects reflects proto-Japanese is
unclear, and is a subject for further investigation.
5.11 Noun compounding and the tone class divisions
of proto-Japanese
In the tone rules for compound nouns in central Japan, tone classes 2.1 and 2.2 and
also 2.4 and 2.5 are not distinguished from each other. The merger pattern (2.1/2 vs.
2.3 vs. 2.4/5) is typical of the Gairin type dialects, and can be reconstructed for
proto-Ryūkyūan. (The interesting iki/ita split in the merged class 2.4/5 in the
Ryūkyūs is most likely a Ryūkyūan innovation. See section 9.6.3.)
Because of this, Kida (1979) argued that the Gairin type division in tone classes
is the oldest type in Japan, and that the split between tone classes 2.1 and 2.2 and 2.4
and 2.5 is an innovation. As will be clear from the previous sections however, the
tone rules for compound nouns in central Japan do not go back very far, and seem to
have developed around the time of the /H/ tone restriction. The merger pattern of
disyllabic nouns in compound nouns in this area must agree with that of the Gairin
dialects and the Ryūkyūan dialects for other reasons.
When we look for such reasons, we see that the lack of the distinction between
tone classes 2.1 and 2.2 as the second element in a compound in the central Japanese
dialects is the result of the fact that at a certain point – in the Kyōto type dialects as
well as the Tōkyō type dialects – no more than one /H/ tone per word was allowed.
If tone class 2.2 had preserved the tone that it has in isolation as the second element
of a compound, a second /H/ tone would have occurred on the final syllable of
compounds with tone class 2.2 as the second element. The only area where it would
be remotely possible for the distinction between 2.1 and 2.2 as second element in a
compound to have been preserved, would be the area with Noto type tone.41
41 Kindaichi’s investigation of these dialects (cf. 6.2 and subsections) makes no mention of the
tone of compound nouns.
The reason why the distinction between tone classes 2.4 and 2.5 is obliterated in
compounds may have to do with the fact that the final /R/ tone of class 2.5 was most
likely the result of a suffix with /H/ tone that merged into the word stem. (See
section 8.3.) Noun compounding may have prohibited the use of this suffix. (No
final /R/ tone is attested after all in Middle Japanese compounds with class 2.5 as
second element.) Such a prohibition would explain the absence of the distinction
between class 2.4 and class 2.5 in compounds.
5.12 The relation between sequential voicing
and lack of /H/ tone in compounds
It can be seen that in many of the compounds listed in this chapter, sequential
voicing can be observed. In modern Japanese the voicing is not predictable and it
was already unpredictable in the oldest sources that we have of the Japanese
language. (According to Unger (2000) the occurrence of sequential voicing is also
completely randomly distributed over the different tone classes.) As voiced
obstruents in Old Japanese are thought to have been prenasalized,42 it has been
suggested that the voicing originated in some kind of nasal element that occurred
between the two compounded words. The obvious candidates for such a nasal
element are the particles no and (to a lesser extent) ni that must have lost the vowel
after the nasal. Similar vowel deletion after a nasal is for instance suggested by
Hashimoto Shinkichi (1932:5) for the etymology of yuge ‘bow whittling’ from yumi
‘bow’ + ke < keduru ‘whittling’ (Vance, 1987:135). Another example, mentioned by
Unger (1993) is murazi ‘village chief’ from mura ‘village’ + nusi ‘owner’.
The lack of sequential voicing in verb + verb compounds in both modern and
Old Japanese could be regarded as a corroboration of this hypothesis of the origin of
sequential voicing, as there is no reason to suppose that a genitive particle like no or
a dative particle like ni ever appeared between these kinds of verbal compounds.
Vance (1987:136) explains the irregularity that can already be seen in Old
Japanese in the following way: “There is good reason to believe that not all Old
Japanese noun + noun compounds derived from phrases of the form noun + /no/ +
noun, and that not all such phrases underwent vowel deletion when the second noun
began with a voiceless obstruent.” Vance proceeds by giving three examples that are
all attested in Old Japanese. Each contains the first element huna, an allomorph of
hune ‘boat’: hunahasi ‘pontoon bridge’ from huna + hasi ‘bridge’, hunanohe ‘bow
of a boat’ from huna + he ‘bow’, hunagi ‘wood for boat building’ from huna + ki
‘wood’.
The first example derived from simple juxtaposition of the two nouns and
therefore did not show sequential voicing. The second example retained the genitive
42 The latest argument in favor of such a reconstruction can be found in an article by Hamano
Shōko (2000), who studied the occurrence of sequential voicing in mimetic words.
particle and therefore did not show sequential voicing either, while the third example
presumably derived from an earlier phrase like huna no ki, by vowel deletion. Vance
therefore suggests that the irregularity of sequential voicing in Old Japanese is due
to the fact that noun + noun compounds did not all develop in the same way.
Even though the origin of sequential voicing may be linked to the former
presence of a particle, in Old Japanese the phenomenon had already developed
independent features, such as the fact that sequential voicing was blocked if the
second element of a compound contained a voiced obstruent. In other words,
whether sequential voicing occurred or not, had already become determined by
factors that were unrelated to the possible origin of the voicing.
Even so, I would like to mention a curious link between the occurrence of
sequential voicing in compounding and loss of the /H/ tone in the resulting
compound in modern Japanese. The strange fact is that the phenomenon only occurs
in compounds of which the second element is a deverbal noun. Could this
nevertheless be a remnant of an original link between sequential voicing and the
particle no, which, as we have seen, had special tonal qualities in Middle Japanese?
(If so, the special qualities of the particle no must go back at least to the period in
which sequential voicing developed.) I have relied heavily on the summary that
Vance (1987: 145-146) presents of this issue.
Vance quotes Okumura (1955) and Sakurai (1966) who point out that in
compounds that consist of a noun + a deverbal noun, sequential voicing is less likely
to occur when the noun is grammatically the direct object of the verb than when it is
an adverbial modifier, which suggests a possible connection between sequential
voicing and the earlier presence of the particle no: yane wo huku > yanehuki ‘to
cover a roof’ but kahara no huki > kaharabuki ‘a covering of tiles’.
Akinaga (1966:53) makes a similar claim, but restricts it to cases where the
deverbal noun is one or two moras long. He says that if the noun functions
grammatically as the direct object in such a case, sequential voicing does not occur
and the compound has /H/ tone on the final syllable of the first element. If, on the
other hand, the noun functions as an adverbial modifier sequential voicing occurs,
and the compound has Ø tone.
Okuda (1971:201) on the other hand, lists ten counterexamples to Akinaga’s
generalization. In eight of these, sequential voicing occurs and the compound has Ø
tone, even though the noun functions as a direct object, and in two examples
sequential voicing does not occur and the compound has /H/ tone on the final
syllable of the first element, even though the noun does not function as a direct
object.
These counterexamples suggest that there may now simply be a correlation
between sequential voicing and lack of /H/ tone in compounds of this type. The
grammatical context that may originally have been involved in the process could
have been blurred or lost. Neither Akinaga nor Okuda observes any parallel
phenomenon in compounds with longer deverbal nouns as second elements.
In Middle Japanese, verbs belonging to tone class A yielded deverbal nouns that
had level /L/ tone, while verbs belonging to tone class B yielded deverbal nouns that
had level /H/ tone. Disyllabic deverbal nouns in the modern Japanese dialects
therefore belong either to tone class 2.1 (< verbs of class A) or to tone class 2.3 (<
verbs of class B).
In the Hiroshima dialect described by Okuda (1971), tone classes 2.2 and 2.3
have merged when used independently, just as in Tōkyō and Kyōto, but just as in
Tōkyō and Kyōto part of class 2.3 is still distinguished as the second element of a
compound by the fact that it attaches with ØØ tone. Deverbal nouns of the historical
tone class 2.3 on the other hand have merged with nouns of class 2.2 completely. All
disyllabic deverbal nouns attach with 'ØØ tone as the second element of compounds,
irrespective of their historical tone class.
However, when sequential voicing occurs, all deverbal nouns attach with ØØ
tone (Okuda, 1971:198): 2.1 hari ‘cover’ posuta'ahari ‘billposter’, but tairubari
‘tiling’, 2.1 hiki ‘coating/pulling’ kuruma'hiki ‘rickshaw man’ but hooroobiki
‘enamelled ware’, 2.3 kiri ‘cutting’ garasu'kiri ‘glass cutter’ but mizingiri ‘mincing’,
2.3 turi ‘fishing’ sakana'turi ‘fishing’ but ipponduri ‘fishing with a pole’.
The list of compounds with deverbal nouns as second elements that Okumura
gives (1973:288-289 and 306-310) contains only a few examples that do not agree
with this rule. (For instance 2.1 kari ‘hunting’ in kinoko'gari ‘mushroom gathering’,
2.1 kasi ‘loan’ in koori'gasi ‘usurer’, 2.3 kui ‘eating’ in hatumono'gui, 2.3 tuki
‘accompanied’ in hoosyootuki ‘guaranteed’, 2.1 kae ‘change’ in tukurikae
‘rebuilding’ but on the other hand also occurring with sequential voicing in
koromogae ‘seasonal change of clothes’.) The tone of disyllabic nouns as the second
element in compound nouns can be summarized as in (24).
24 The influence of sequential voicing on the tone of deverbal compound nouns in Hiroshima

Class   Compound noun   Deverbal compound noun        Deverbal compound noun
                        without sequential voicing    with sequential voicing
2.1     'ØØ             'ØØ                           ØØ
2.3     ØØ/'ØØ          'ØØ                           ØØ
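The contents of (24) can also be written as a small lookup table, which makes the contrast between the three columns easy to query. This is only a sketch: the dictionary name, the function name and the column keys ("ordinary", "deverbal_unvoiced", "deverbal_voiced") are labels invented for the example, while the attachment patterns themselves are those given in (24).

```python
# Attachment patterns of disyllabic second elements in Hiroshima, after (24).
# The key names are hypothetical labels for the three columns of the table.
TABLE_24 = {
    "2.1": {
        "ordinary": ["'ØØ"],
        "deverbal_unvoiced": ["'ØØ"],
        "deverbal_voiced": ["ØØ"],
    },
    "2.3": {
        "ordinary": ["ØØ", "'ØØ"],  # part of class 2.3 attaches with ØØ tone
        "deverbal_unvoiced": ["'ØØ"],
        "deverbal_voiced": ["ØØ"],
    },
}


def attachment(tone_class: str, kind: str) -> list:
    """Return the attachment pattern(s) listed in (24) for a class and column."""
    return TABLE_24[tone_class][kind]


# Deverbal nouns lose the /H/ tone under sequential voicing in both classes:
for cls in ("2.1", "2.3"):
    print(cls, attachment(cls, "deverbal_unvoiced"), attachment(cls, "deverbal_voiced"))
```

The lookup makes visible that the two historical classes differ only in the column for ordinary compound nouns.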
I do not know what to make of the difference in this respect between deverbal nouns
and ordinary nouns as the second element of a compound. The few examples of
ordinary nouns of class 2.1 and 2.2 in Hiroshima that attach with ØØ tone instead of
the 'ØØ that Okuda indicates as regular, have sequential voicing. It is tempting to
explain their lack of a /H/ tone on the final syllable of the first element as an effect
of the sequential voicing. If we look however, at the examples of ordinary nouns of
class 2.1, 2.2 and 2.3 that attach with 'ØØ tone, we see that most of them have
sequential voicing as well, and still do not cancel the /H/ tone on the final syllable of
the first element.
On the other hand, if we still assume that there is a connection between loss of
/H/ tone and the former presence of the particle no, it is not surprising that a similar
loss of /H/ tone in case of sequential voicing cannot be observed in compounds with
longer deverbal nouns as second elements:
As we have already seen, the compound tone rules for compounds with longer
second elements are quite different from those of compounds with shorter second
elements. The /H/ tone in compounds with longer second elements does not occur on
the final syllable of the first element, and if the loss of /H/ tone in compounds with
sequential voicing is indeed connected with the former presence of the particle no,
only a /H/ tone on the final syllable of the first element would have been cancelled.
5.13 The origin of the irregular cross-dialect correspondences
of longer nouns
Finally, one more consequence of the complications involved in compound nouns
stemming from different periods and made up of first and second elements of
different length is the following observation made by Martin (1987:219):
A major reason for the irregularity of cross-dialect correspondence of many
nouns that are three or more syllables in length is that they are compounds. Some of
the compounds are heavily lexicalized and quite old, so that in some or all of the
dialects the accent is inherited in those reflexes appropriate to simple nouns. But
others are new creations, and still others are old compounds that have been
remodeled to conform to the accentuation of the new compounds that are freely
made up with the modern rules for each dialect.
6 A new look at dialect tone
In this chapter I will address some of the implications that Ramsey’s theory has for
the way in which we view the tone systems of a number of dialects. In section 3.3.2, I
have mentioned the transitional areas that can be found between the Gairin type and
the Chūrin type tone systems. Such transitional areas are characterized by the fact
that the reflexes of class 2.2 are mixed, with some members merging with class 2.1
and other members merging with class 2.3.
Between the Kyōto type and Tōkyō type tone systems there also is a transitional
area, albeit much smaller. In this area the reflexes of the tone classes are not mixed.
These dialects are transitional in a different sense: The tone classes in these dialects
show mergers that cannot be found in the more typical Kyōto type tone systems, nor
in the Tōkyō type tone systems. These mergers can, however, be explained as the
result of an incomplete adoption of the Kyōto tone shift.
In section 6.1 below I will show how the /H/ tone retraction that took place in the
Kyōto type dialects was adopted in these transitional areas as well, but with one
difference: If the /H/ tone was already on the initial syllable before the shift, the
leftward tone shift was cancelled (or some other solution was found). These dialects
are Kyōto-like, except for the fact that they did not develop the /L/ toneme that is
characteristic of the more typical Kyōto type dialects.
The next subject in this chapter is the special tonal type that can be found on the
Noto peninsula and Noto Island. In section 3.1.1, I have presented the tone system of
Nozaki on Noto Island as one of the most archaic in Japan, a tone system that is still
very close to the tone system of Middle Japanese in Ramsey’s reconstruction. The
same dialect has, however, also been used by Kindaichi to argue for the standard
theory. In 6.2 and subsections below, I will investigate the Noto tone system from
both viewpoints.
I will end this chapter with a brief discussion of the tone system of Toyama. In
Hirayama’s dictionary (1960) this tone system is treated as belonging to the Kyōto
type, but I will argue that this tonal type may be closer to the Noto subtype of the
Nairin Tōkyō type tone systems.
6.1 Transitional or ‘Tarui type’ dialects
On Honshū, tone systems that are midway between a Kyōto type and a Tōkyō type
occur in a narrow strip (25 kilometers on average) to the east and the west of the
area with Kyōto type tone, forming a kind of ‘buffer zone’ between the pure Kyōto
type and the pure Tōkyō type tone systems. (Dialect data are from Ikuta (1951) and
Uwano (1981). Uwano calls these dialects the ‘Tarui type’ after the village of Tarui
in Gifu prefecture where such a tone system was first described.)
Similar tone systems can be found in a number of villages surrounding the
Totsukawa dialect island. In Shikoku, transitional tone systems occur in the western
corner of the island. Nowadays these tone systems no longer border on the Kyōto
type tone system that can be found on the island as well, as a one-pattern area
without lexical tone has developed in-between. I assume, however, that both the
transitional type and the one-pattern type are the result of a former meeting of the
Tōkyō type tone system of southwest Shikoku with the Kyōto type tone system of
eastern Shikoku in this area.¹
In these transitional tone systems, some tone classes went along with the
leftward tone shift of the Kyōto type dialects, while other classes did not, or
developed Ø tone. The classes that show these deviant developments are typically
those that would have developed initial /L/ tone in a full-fledged Kyōto type tone
system.²
In the first group that I will discuss, those tone classes that would not have to
develop /L/ tone (i.e. the tone classes that did not have /H/ tone on the first syllable)
went along with the Kyōto shift. The tone classes with /H/ tone on the initial syllable
(1.3, 2.4/5), on the other hand, did not. The tone of these tone classes therefore
agrees with the Tōkyō type dialects. As a result, tone classes 2.2 and 2.3 merged with
tone classes 2.4 and 2.5, and tone class 1.2 merged with tone class 1.3.³
This tonal type can be found in Sakamoto and Nakatani (Tenkawa village) at the
northern border of the Totsukawa dialect island, and in Matsuyama and Ohaka.
According to Uwano, it can also be found in Ōmi, Maibara, Torahime and Fujihashi, in
the area where Kyōto type and Nairin type tone meet to the east of Lake Biwa. In Hyōgo
prefecture, at the western border between the Kyōto type and the Nairin type, this tone
system can be found in Fukura and Suegane (Sayō village). According to Uwano it
can furthermore be found in Yoshida and Nishi-Tosa on Shikoku.⁴ As an example I
give the tone system of Nakatani (Tenkawa village):
1 In other areas as well, on Kyūshū and Honshū, we see that one-pattern tone systems developed
where formerly two different tone systems met.
2 In the Korean accent system something similar can be observed in the Kyengsang dialects,
which shifted the proto-Korean accent one syllable to the left. In most dialects, words that
originally had accent on the first syllable became pre-accented, but in North and South
Kyengsang province a number of dialects have compensated in various ways to regularize pre-
accent (Ramsey 1978:81).
3 The merger pattern of the monosyllabic nouns thus coincides with that of the Nairin dialects. I
have come across no examples of this type of dialect in which tone classes 2.4 and 2.5 are kept
separate. It is therefore possible that this tone system derived from a modern Nairin type tone
system, rather than from an older Nozaki-like stage that still preserved more distinctions.
4 In the case of these dialects, a development from the Chūrin type tone system that can also be
found on the island of Shikoku is impossible, as in this tone system tone classes 1.1 and 1.2
have merged.
1 Tarui type group 1
Nakatani
1.1 -
1.2/3 '-
2.1 -
2.2/3/4/5 '-
In the next group of dialects, tone classes 1.2 and 2.2/2.3 again go along with the
leftward tone shift, just as in Nakatani, but the tone classes that had /H/ tone on the
initial syllable – having no syllable to the left to shift the /H/ tone to – lose the /H/
tone altogether and merge with the tone classes with Ø tone. Within this /H/ tone
losing type, there are actually three subtypes, depending on the reflex of tone class
2.5, which had /H/ tone on the initial syllable and /R/ tone on the second syllable
before the shift.
In group 2a tone class 2.5 has Ø tone and there is no distinction between tone
classes 2.4 and 2.5. Examples are the dialects of Imasu to the east of Lake Biwa and
Yawatahama in the west of Shikoku.
2 Tarui type group 2a
Yawatahama
1.1/3 -
1.2 '-
2.1/4/5 -
2.2/3 '-
In group 2b tone class 2.5 has lost the /H/ tone on the initial syllable, allowing the
/R/ tone on the second syllable to develop into /H/ tone ( > > '). It
appears in a number of different forms in the different dialects, but has not merged
with any other class:
3 Class 2.5 in Tarui type group 2b
','-
','-
','-
','-
This type can be found in the transitional area to the east of the area with Kyōto type
tone on Honshū (Takahama, Obama, Ōura, Kinomoto, Kamikusa-no-mura and
Kashiwabara). To the west of the area with Kyōto type tone it can be found in
Tenwa. On the southern border of the Totsukawa area it can be found in Misato,
Kushitōge, Kinomoto, Ichiki, Shingū, Kumano and Atawa.⁵
In central Shikoku this tone system can be found in Higashi-Iyayama, Kitō,
Kubokawa and Nanokawa. From the fact that this tone system preserves the
distinction between tone classes 2.4 and 2.5 and 3.6 and 3.7, it is clear that it cannot
have developed from the modern Tōkyō type dialects in these areas. It must have
derived from an older stage that still preserved this distinction. Kindaichi (1942:163-167)
presents the reflexes of disyllabic and trisyllabic nouns in the dialect of Akaho
(type 2b) as in (4). The /H/ tone on the initial syllable of class 3.7 was lost, but the
/H/ tone on the final syllable of this class was shifted onto the second syllable,
causing a merger with classes 3.2, 3.3 and 3.4.⁶
4 Disyllabic and trisyllabic nouns of Tarui type group 2b
Akaho
2.1/4
2.2/3 '
2.5 '
3.1/6
3.2/3/4/7 '
3.5 '
In group 2c tone class 2.5 has lost the /H/ tone on the initial syllable (just as tone
class 2.4) but the /R/ tone on the second syllable was shifted onto the first syllable as
/H/ tone, and caused a merger with tone class 2.2/3. Along the northern border of the
Totsukawa area, such a tone system can be found in Nishi-Hiura. At the western
border of the area with Kyōto type tone it can be found in Chikusa, Ochiyama, Kuroi,
Ayabe and Maizuru, and at the eastern border in Tsuruga, Nagahama, Sekigahara
and Tarui. (On Shikoku this type can be found in Nakatsu.) The pitches of
Nagahama (Kindaichi, 1942) are as in (5).
5 Tarui type group 2c
Nagahama
2.1/4
2.2/3 '
5 Uwano mentions that in Atawa tone class 2.4 has Ø tone (cf. kata ‘shoulder’), but when it is
modified by kono ‘this’ it will have /H/ tone on the initial syllable just as in Tōkyō (cf. kono
ka'ta ‘this shoulder’).
6 Note that the reflex of tone class 3.3 in this dialect (and the next dialect) appears to go back to
the assimilated form attested in Ruiju myōgi-shō 類聚名義抄, rather than to the earlier
tone, that is still occasionally attested in the oldest tone dot material. The standard
reconstruction cannot explain the tone of class 3.3 in these dialects. Cf. section 4.5.
Nagahama
3.1/6
3.2/3/4/7 '
3.5 '
In (6) I give an overview of the different solutions to avoid creating the extra toneme
/L/. I have added the dialect of Kyōto for comparison:
6 Overview of the strategies to avoid /L/ tone in the Tarui type dialects
          Kyōto   Nakatani   Yawatahama   Akaho      Nagahama
                  Group 1    Group 2a     Group 2b   Group 2c
2.2/2.3   '-      '-         '-           '-         '-
2.4       '-      '-         -            -          -
2.5       ''-     '-         -            '-         '-
It can be seen that those tone classes that had nowhere to shift the /H/ tone to, either
did not go along with the leftward shift and preserved the original Tōkyō type
location of the /H/ tone, or they lost the /H/ tone altogether and merged with the
classes with Ø tone. From the viewpoint of Ramsey’s theory, these developments
are easy to understand and not unexpected. From the viewpoint of the standard
theory, on the other hand, the developments in tone classes 2.4 and 2.5 lack a proper
explanation.
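The mergers summarized in (6) can be reproduced with a small sketch. It is an illustration only and not part of the original analysis: the string encoding of the tonemic shapes (one character per syllable, `Ø`, `H` or `R`), the function names and the restriction to disyllabic nouns are all assumptions made for the example. The pre-shift shapes /ØØ/, /ØH/, /HØ/ and /HR/ are those used in this chapter for classes 2.1, 2.2/3, 2.4 and 2.5.

```python
# Illustrative sketch (assumed encoding): one character per syllable,
# "Ø" = no tone, "H" = /H/ tone, "R" = /R/ tone.
PRE_SHIFT = {"2.1": "ØØ", "2.2/3": "ØH", "2.4": "HØ", "2.5": "HR"}


def retract(shape):
    """Leftward shift: an /H/ on the second syllable moves to the first."""
    return "HØ" if shape == "ØH" else shape


def group_1(shape):
    # Initial /H/ stays where it is (Tōkyō-type location); the final /R/ of
    # class 2.5 is dropped, so 2.4 and 2.5 join the shifted 2.2/3.
    return "HØ" if shape[0] == "H" else retract(shape)


def group_2a(shape):
    # Initial /H/ is lost altogether; 2.4 and 2.5 both end up with Ø tone.
    return "ØØ" if shape[0] == "H" else retract(shape)


def group_2b(shape):
    # As 2a, but the /R/ of class 2.5 survives as /H/ on the second syllable.
    return "ØH" if shape == "HR" else group_2a(shape)


def group_2c(shape):
    # As 2a, but the /R/ of class 2.5 moves to the first syllable as /H/.
    return "HØ" if shape == "HR" else group_2a(shape)


def mergers(strategy):
    """Group the tone classes by the shape they have after the shift."""
    result = {}
    for tone_class, shape in PRE_SHIFT.items():
        result.setdefault(strategy(shape), []).append(tone_class)
    return result


for name, strategy in [("1", group_1), ("2a", group_2a),
                       ("2b", group_2b), ("2c", group_2c)]:
    print(name, mergers(strategy))
```

Under these assumptions group 1 yields 2.1 versus 2.2/3/4/5, group 2a yields 2.1/4/5 versus 2.2/3, group 2b keeps class 2.5 apart from all other classes, and group 2c merges class 2.5 with 2.2/3, in agreement with the descriptions of Nakatani, Yawatahama, Akaho and Nagahama above.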
Finally, there are a number of Kyōto type dialects that do not have the distinction
between classes 2.4 and 2.5. They are the dialects of Ikehara, Ōse, Owase, Aiga,
Shimakatsu, Miura and Nigo at the eastern border of the Totsukawa dialect island.
The Ikehara data in (7) are from Uwano (1983).
An explanation for the lack of the distinction in this area could be that the shift
reached this area rather late, so that the /R/ tone on the final syllable of class 2.5 was
already lost, just as in the nearby Totsukawa dialects. (It is interesting that the
merged class 2.4/5 adopted the tone that usually occurs with class 2.5 in dialects that
do have the distinction between tone classes 2.4 and 2.5.)
7 Kyōto type dialects that merged classes 2.4 and 2.5
Ikehara
1.1 -
1.2 '-
1.3 ','-
2.1 -
2.2/3 '-
2.4/5 ''-
The next example, Imajō from the northeastern border of the Kyōto type dialect area,
is unusual in that tone classes 1.1 and 1.2 and 2.1, 2.2 and 2.3 have all merged (Ikuta,
1951). In this dialect, it seems that the pre-shift - and pitches of classes 1.1
/Ø-Ø/ and 2.1 /ØØ/ (which are the result of the automatic rise in pitch after the
phrase-initial syllable in words with Ø tone) shifted to the left, along with the -
and pitches of classes 1.2 /R-Ø/ and 2.2/3 /ØH/. If we analyze the tone classes
with level pitch as [L] instead of [H], we could say that this dialect does have /L/
tone.⁷
8 The Imajō type
Imajō
1.1/2 ','-
1.3 -
2.1/2/3 '-
2.4/5 -
6.2 The Noto dialects
The tone systems of the dialects of the Noto peninsula and Noto Island have played
an important role in the development of Kindaichi’s theory. I introduce these
dialects as Kindaichi describes them in his article of 1954. (The article has been
reprinted with additional notes in 1975 and 1983. I use the edition of 1983.) These
dialects are especially important as Kindaichi developed his ideas on the
intermediate stages that a Kyōto type tone system had to go through on the way to a
Tōkyō type tone system on the basis of these dialects.
At the beginning of the article Kindaichi compares the tone systems of modern
Tōkyō and modern Kyōto. He shows that a shift from a modern Tōkyō type tone
system to a modern Kyōto type tone system is not possible, as tone classes 3.6 and
3.7, for instance, are distinguished from each other in Kyōto, but have merged in the
Tōkyō type dialects.
For the alternative, a shift from a Kyōto type tone system to a Tōkyō type tone
system, Kindaichi proposes a number of intermediate stages, based on what he
found in the Noto dialects: According to Kindaichi, the intermediate stages have
been preserved in these dialects. In other words, in the tone systems of these dialects
a change from a Kyōto type tone system to a Tōkyō type tone system can be seen
under way.
7 Only looking at the shorter nouns, it is even possible to describe this tone system in terms of
two word-tones. As I have no data on the longer nouns, I cannot tell whether a distinctive
location of the /H/ tone has to be recognized in this dialect.
In (9) I adduce Kindaichi’s examples for the most complicated tone classes,
namely classes 2.4 and 2.5 and 3.6 and 3.7, which are distinguished in Kyōto but
have merged in Tōkyō. With the Kyōto type tone system of Wakayama as a starting
point Kindaichi takes us via the stages represented by the general Noto type, the
Ishizaki type and the Nozaki type to the Tōkyō type tone system of the village of
Kōda on Noto Island.⁸ The sign in Kindaichi’s representation indicates a syllable
that in ‘careful pronunciation’ is [L], but in ‘casual pronunciation’ is [H]. (It is in
this chapter only, that the symbol will be used in this way, as in all other chapters
it indicates [M] pitch.)
9 Kindaichi’s view on the historical developments in the Noto dialects
Wakayama Noto peninsula Ishizaki Nozaki Kōda
(general)
2.4 - > - > - > - -
2.5 - > - - > - > -
3.6 > > >
3.7 > > >
The Ishizaki type can be found in a number of villages on the east coast of the Noto
peninsula, but also in three villages on Noto Island proper. The villages of Nozaki
and Kōda are also both located on Noto Island. There are two more villages on the
island that have the same tone system as Nozaki, and one more village that has the
same tone system as Kōda. The general Noto peninsula type can also be found on
the island.⁹ Kindaichi comments:
It was discovered that while the Noto accent in general may be regarded as a
variant of the Kyōto-Osaka type accent, in the village of Kōda the accent was
just like that of Tōkyō; and in the area around Kōda there was either an
accent midway in the change from the Kyōto-Osaka type accent to the Tōkyō
type accent or an accent that one would be inclined to say is but one step
removed from the change to the Tōkyō type accents.
8 I have followed Uwano (1981) in calling this village ‘Kōda’, although I have also seen the
reading ‘Mukōda’.
9 On Hirayama Teruo’s dialect maps (1960, 1980, 1992) an area on the northern tip of Noto
peninsula (the area of Tenchi and Ama belonging to Wajima city) is also marked as having
Tōkyō type tone. (See also Hirayama, 1956).
[Map 5: The geographical distribution of the tone systems in the Noto area
(Adapted from Iitoyo, 1983:345). Legend: General Noto type and Ishizaki type;
Confused Noto types; No lexical tone; Kōda type; Nozaki type.]
6.2.1 Kindaichi’s data
Kindaichi’s table groups together tonal phrases of the same length and tone (with a
maximum of three syllables) but of different composition. Tonal phrases of three
syllables in Kindaichi’s table may represent trisyllabic nouns or trisyllabic
verbs, but also disyllabic nouns + particle, as long as the tone pattern is identical.
The following classes are, for instance, joined: class 2.1 + particle with class 3.1,
class 2.2/3 + particle with class 3.3/5, class 2.4 + particle with class 3.6, class 2.5 +
particle with class 3.7. I have separated Kindaichi’s entries into distinct tone classes
again. (Monosyllabic nouns will be discussed at the end of this chapter.)
In a number of these dialects a syllable with /H/ tone is realized with [F] pitch in
phrase-final position (in other dialects only when also preceded by [H] pitch).
Kindaichi has added the mark ' to such syllables ('). As I found this confusing – it
suggests that only these syllables have phonological /H/ tone, and therefore that only
these syllables are followed by a drop to [L] pitch – I have chosen to represent the
syllables with [F] pitch by means of the symbol .¹⁰ Apart from this, Kindaichi
adds no other phonological marks, and although I have earlier represented the /H/
tone and the /R/ tone that have to be recognized in the dialect of Nozaki by means of
the marks ' and '', I will adopt Kindaichi’s representation unaltered in this chapter.
10 Because of the arrangement of Kindaichi’s table, tone class 2.5 + particle and tone class 3.7 are
together presented as ' in Hakui and Ogi. Because I have separated the tone of 2.5 +
particle from the tone of 3.7, it now looks as though the falling pitch can occur on the particle
after tone class 2.5, but I suspect that in reality, it only occurs in class 3.7 in phrase-final
position.
10 The tone of disyllabic nouns in the Noto dialects
2.1 2.2/3 2.4 2.5
Ishizaki - - - , -
Nozaki - - - -¹¹
Kōda - - - as 2.4
Hakui - - - , -
Takahama - - - as 2.1
Tatsuruhama - - - , -
Nanao¹² - - - , -
Han-no-ura - - - , -
Ukawa - - - as 2.2/3
Ogi - - - , -
Iida - - - , -
Kōfu (Tōkyō type) - '- '- as 2.4
Wakayama (Kyōto type) - '- ', '- ''-
11 The tone of trisyllabic nouns in the Noto dialects
3.1 3.2/4 3.3/5 3.6 3.7
Ishizaki
Nozaki
Kōda as 3.6
Hakui
Takahama as 3.1
Tatsuruhama
Nanao
Han-no-ura
Ukawa as 3.3/5
Ogi
Iida
Kōfu (Tōkyō type) ' ' ' as 3.6
Wakayama (Kyōto type) ' ' ' ''
In his notes (1975:62), Kindaichi indicates that if the second syllable of a word or
phrase consists of a close vowel and a voiced consonant (or in some dialects also if
11 Kindaichi indicates that in isolation the pitch of tone class 2.5 is level, which he analyses as
low level. I think, however, that this level pitch would be better analyzed as /HH/: In isolation
the rise of the /R/ tone on the final syllable could not be shifted onto the case particle, and /HR/
may thus have been simplified to /HH/. After all, the former /R/ tone of class 1.2 also
developed into /H/ tone in this dialect.
12 In his article of 1954 Kindaichi indicated 2.5 ,- and 3.7 as the tone of Nanao.
In two later articles (Kindaichi 1975:170 and Kindaichi 1964:16) the tone of Nanao is given as
2.5 , - and 3.7 . I have adopted these later corrections in the tables.
the second syllable consists of a close vowel and a voiceless consonant) the
realization in certain tone classes is different. As the pitches that occur when such a
special second syllable is not present can be regarded as basic, Kindaichi did not
include these variants in his table, and neither have I in tables (10) and (11). I have
followed Kindaichi in adding the corresponding reflexes in the dialects of
Wakayama (Kyōto type) and Kōfu (Tōkyō type) for comparison.
6.2.2 McCawley’s view
In a review (1966) of the English translation of Kindaichi’s article (1964b),
McCawley analyzes Kindaichi’s Noto dialect data as follows:
None of the eleven Noto dialects from which he cites data displays what I
would regard as ‘a variant of the Kyōto-Ōsaka-type accent’; indeed, if the
data are represented in terms of accent marks rather than high and low
pitched moras all eleven dialects come to look remarkably like Tōkyō
Japanese. I surmise that Kindaichi was misled by a phenomenon common to
eight of the eleven dialects, the fact that the first mora of a phrase can only be
low pitched.
According to McCawley, the entirely [L] pitched phrases fill the hole left by the
impossibility of [H] pitch on the first mora. He suggests that these phrases be
represented with an accent mark after the first mora and concludes that the only
thing that distinguishes these dialects from the dialect of Tōkyō is that whereas
Tōkyō has the rule ‘the first mora of a phrase becomes low pitched if the second is
high pitched’, in eight of the Noto dialects this rule changed to ‘the first mora of a
phrase becomes low pitched’. Since this rule is no more than a generalization of the
first rule, the eight dialects in question could perfectly well have developed from a
Tōkyō type tone system.
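McCawley’s point that the Noto rule is a generalization of the Tōkyō rule can be made concrete in a short sketch. The representation is an assumption made purely for illustration: a phrase is a string of `H` and `L`, one character per mora, as it stands before the initial-lowering rule applies. The function names are hypothetical, and the rules themselves are taken from the prose formulation quoted above.

```python
def tokyo_initial_lowering(pitches):
    """First mora becomes low pitched if the second mora is high pitched."""
    if len(pitches) >= 2 and pitches[1] == "H":
        return "L" + pitches[1:]
    return pitches


def noto_initial_lowering(pitches):
    """Generalized rule: the first mora of a phrase always becomes low pitched."""
    return "L" + pitches[1:] if pitches else pitches


for phrase in ["HHH", "HLL", "HHL"]:
    print(phrase, tokyo_initial_lowering(phrase), noto_initial_lowering(phrase))
```

In this sketch a phrase such as `HLL` is left unchanged by the Tōkyō rule but becomes entirely low (`LLL`) under the generalized rule. This corresponds to the entirely [L] pitched phrases that, according to McCawley, fill the hole left by the impossibility of [H] pitch on the first mora. Wherever the Tōkyō rule changes a phrase, the generalized rule produces the same result.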
I agree with McCawley that the realization of initial /H/ tone as [L] in the eight
dialects that lack initial [H] pitch altogether is an innovation. It is much more likely
that the tone systems of the villages on the west coast of the peninsula and on Noto
Island that still do have initial [H] pitch are a remnant that is in the process of
disappearing, than that they are a new type that is spreading. It surely is no
coincidence that the [H] pitch on the initial syllable is best preserved in the villages
on Noto Island, which are most isolated from the other dialects on the peninsula. I
see the fact that even there, the first syllable tends to be lowered in careful
pronunciation as the result of a growing influence of the dominant tonal type of the
region. Kindaichi, on the other hand, saw the fact that the [H] pitch only appears in
casual speech as indicative of his idea that a Tōkyō type tone system somehow
develops naturally, as soon as people speak in a relaxed and careless way.
There is no confirmation from other tone languages for the idea that a change
from tone to tone (so, from simple to more complicated) is a linguistic
universal. I therefore do not find Kindaichi’s idea on the direction of change
convincing. It is more likely that the dominant tonal type of the area is being
adopted when consciously trying to speak ‘correctly’. In this way, sociolinguistic
factors may explain why the simple variant occurs in deliberate speech, and the
more complicated variant in casual speech, when at first sight, the opposite would
seem to be more natural.
The simplicity of McCawley’s generalization rule, which derives the tonal type
of Noto peninsula from the Tōkyō type, is appealing, as the derivation from Kyōto
type tone is so complicated in comparison. There is one point, however, which
McCawley has overlooked, and this is that it is not possible to derive the Noto tone
patterns of tone classes 2.5 and 3.7 from the common Tōkyō type, as in this tone
system classes 2.5 and 3.7 have merged with classes 2.4 and 3.6.
6.2.3 Noto type tone and Ramsey’s Middle Japanese tone system
Although it is indeed not possible to derive the Noto tone systems from the common
modern Tōkyō type, they can be derived from a more archaic Tōkyō type tone
system, such as the tone system of Nozaki, or the ‘Nairin’ type tone system of
Middle Japanese. As I have mentioned before, I regard the tone system of Nozaki as
probably the most archaic of all modern dialects in Japan, as it has most closely
preserved the tone pattern of Middle Japanese (albeit not all the tone classes, as the
Kyōto type dialect on the island of Ibukijima has).
12 A comparison of the tones of Nozaki and Middle Japanese
MJ Nairin Nozaki
2.4 - /HL-L/ - /HØ-Ø/
2.5 -, - /HR-L/ - /HR-Ø/
3.6 /HLL/ /HØØ/
3.7 /HLH/ /HØH/
In simplifying the /HR/ and /HØH/ tone patterns of classes 2.5 and 3.7, Kōda and
Ishizaki have each chosen a different alternative. In Kōda the /R/ tone and the /H/
tone on the final syllable were eliminated, just as happened in most Tōkyō type
dialects, while in Ishizaki the /H/ tone on the initial syllable was eliminated just as in
the other Noto dialects. In Nozaki it can be seen that the /H/ tone on the initial
syllable is in the process of disappearing:
13 The historical developments in the Noto dialects
Noto Ishizaki Nozaki Kōda
2.4 - < - < - -
2.5 - - < - > -
3.6 < <
3.7 < >
It is probably thanks to the automatic lowering of the initial /H/ tone that the /H/
tone on the final syllable in tone classes 2.5 and 3.7 has been preserved in the Noto
dialects. Keeping tone classes 2.4 and 2.5 and 3.6 and 3.7 separate is often thought
of as a typical attribute, and therefore also as a possible innovation, of the Kyōto
type dialects (cf. Tokugawa, 1962). With this analysis of the Noto dialects as
belonging to the Tōkyō type, we now have proof from modern dialects that these
tone classes are not an innovation of the Kyōto type dialects. They have to be
reconstructed for the proto-language of (at least) central Japan. Ramsey’s
reconstruction of the Middle Japanese tone system (which had this distinction) as
close to the modern Tōkyō type already implied this, and the fact that we now have
found Tōkyō type dialects that still preserve the distinction serves as confirmation.
6.2.4 The conditioned variants as remnants of earlier Kyōto type tone?
If one only looks at the realization that occurs when there is no ‘special’ second
syllable present in the Noto dialects, there is no reason to assume that the tone
system of these dialects has developed from an earlier Kyōto type tone system.
However, even though Kindaichi regards the tone that occurs when there are no
‘special’ second syllables as ‘basic’, he regards the tone that occurs when such
syllables are present as having preserved the original Kyōto type tone pattern: “I
think one can say that in these regions part of the vocabulary is transmitting the old
shape” (Kindaichi 1985:65). Furthermore, after giving the basic tone together with
the conditioned variants he comments: “When we look thus at the Noto accent
system it looks very much like that of the Kyōto-Osaka type accent. It is merely a
matter of there being a somewhat large number of shapes” (Kindaichi 1964:16).
In (14) and (15) I have therefore added the tone that occurs in these cases (when
the second syllable of words and phrases consists of a voiced consonant followed by
a close vowel -i or -u) below the basic tone that I have given earlier in (10) and (11).
I indicate this variant in the table with the code GI. (G representing the voiced
consonants and I representing the close vowels.)
In three of the dialects, an extra variant occurs when the second syllable consists
of a voiceless consonant followed by the close vowels -i or -u. I have indicated
these extra variants in the table with the code KI. (K representing the voiceless
consonants and I representing the close vowels.)¹³
As the extra KI variants have no bearing on Kindaichi’s claim, I will not discuss
them any further. I will, however, consider the possibility suggested by Kindaichi,
that the variants marked GI developed as the result of a change from a Kyōto type
tone system to a Tōkyō type tone system, which left a special group of words
untouched. If so, the non-basic variant should show the original Kyōto type location
of the /H/ tone.
13 In Ogi and Iida [H] pitch will shift away from such a syllable to the next syllable if preceded by
[L] pitch. In Takahama /H/ tone will shift away from such a syllable to the next syllable if
preceded by [L] pitch. On similar shifts in other dialects, see chapter 7.
14 The influence of segmental features on the Noto tones (disyllabic nouns)
2.1 2.2/3 2.4 2.5
Ishizaki - - - , -
- GI - GI
Nozaki - - - , -
- GI - GI
Kōda - - - as 2.4
- GI
Hakui - - - , -
- GI - GI
Takahama - - - as 2.1
- GI - GI
, -KI
Tatsuruhama - - - , -
- GI - GI
Nanao - - - , -
- GI - GI
Han-no-ura - - - , -
- GI - GI
Ukawa - - - as 2.2/3
- GI - GI
Ogi - - - , -
- GI - GI
, - KI , - KI
Iida - - - , -
- GI - GI - GI
, - KI , - KI
15 The influence of segmental features on the Noto tones (trisyllabic nouns)
3.1 3.2/4 3.3/5 3.6 3.7¹⁴
Ishizaki
GI GI GI
Nozaki
GI GI GI
Kōda as 3.6
GI
Hakui
GI GI GI
14 Kindaichi does not mention variants for tone class 3.7. If correct, this would mean that this tone
class has not completely merged with class 3.1 in Takahama and has not completely merged
with class 3.3/3.5 in Ukawa.
3.1 3.2/4 3.3/5 3.6 3.7
Takahama as 3.1
GI GI GI
KI
Tatsuruhama
GI GI GI
Nanao
GI GI GI
Han-no-ura
GI GI GI
Ukawa as 3.3/5
GI GI GI
Ogi
GI GI GI
KI KI KI
Iida
GI GI GI
KI KI KI
Although a number of the conditioned variants do resemble Kyōto type tone, the
tone of class 3.2/4 shows that the variants do not represent remnants of an older,
Kyōto-like stage: In Wakayama this tone class has ' tone, but the conditioned
variant in the Noto dialects has tone, and Kindaichi mentions that this is the
same as the tone of class 3.1. For some reason the /H/ tone in the conditioned variant
of class 3.2/4 was eliminated. This development cannot be related to the tone of this
class in the Kyōto type dialects.
In tone class 2.5 in Iida as well, we see that the conditioned variant has -
tone, which does not agree with the tone of this class in the Kyōto type dialects.¹⁵
16 Comparison of the conditioned variants with the Kyōto type tone system
Noto Noto Wakayama
(basic tone) (conditioned variant) (Kyōto type)
2.1 - - -
2.2/3 '- '- '-
3.1
3.2/4 ' '
3.3/5 ' ' '
15 I have not included the variants that can be seen in Iida for tone class 2.5 in table (16), as the
other dialects do not have variants in this class. (The fact that Takahama and Ukawa do is the
result of the merger of class 2.5 with class 2.1 in Takahama, and with class 2.2/2.3 in Ukawa;
the variants were adopted from these classes.)
6.2.5 The origin of the variants in the Noto dialects
The variants that we see in the Noto dialects are clearly not remnants of Kyōto type
tone. These phonological alternations must have been caused by the special quality
of the syllables in second position. The changes in tone that can be observed should
be predictable; under similar circumstances similar changes should occur.
If we now look for resemblances in the effect of the special second syllables on
the realization of the tones in the different tone classes, we see that the special
second syllables affect the tone of the words involved in two ways: If the special
syllable has /H/ tone, the /H/ tone is shifted away to the preceding syllable. If the
special syllable has [H] pitch, the pitch of the preceding syllable is raised.
The special syllables consist of typical depressor consonants with the close
vowels i or u, which are shorter than the other Japanese vowels, and will avoid [H]
pitch in many Japanese dialects. Shifting the /H/ tone away from these syllables to
the preceding syllable as happens in tone classes 2.2/3 and 3.3/5 is therefore a
strategy to avoid [H] pitch on the depressor syllables.
In tone classes 2.1, 3.1 and 3.2/4 on the other hand, the pitch of the depressor
syllable itself is not lowered, but the pitch of the preceding syllable is raised. This
development could be seen as a strategy to avoid higher pitch on the depressor
syllable than on surrounding syllables. I suspect however, that the depressor syllable
simply eliminated the automatic rise to [H] pitch after the initial syllable.
This would explain why the special second syllable in tone class 3.2/4 not only
raises the pitch of the preceding syllable, but also eliminates the pitch fall after the
noun, even though this is several syllables away. Both effects can be explained if we
assume that the automatic rise to [H] pitch was cancelled. As a result, the entire
tonal phrase became [L] pitched and the pitch fall after the noun was eliminated. If
this analysis is correct, the effect of the depressor syllables is the same in all
environments, namely the avoidance of [H] pitch.
What appears to be in contradiction with this analysis is the fact that the
conditioned variants of tone classes 2.1, 3.1 and 3.2/4 have to be analyzed as having
[H] pitch. This is because they contrast with nouns of classes 3.6 and 2.4, which are
represented as having level [L] pitch. In reality however, classes 3.6 and 2.4 appear
to be characterized by a rise to [H] pitch at the end of the tonal phrase, so that the
contrast is not really between phrases with level [H] and level [L] pitch. 16 This
means that the conditioned variants of tone classes 2.1, 3.1 and 3.2/4 do not have to
be realized with actual [H] pitch, as they are distinguished by their level tone
contour.17
16 In Okuda’s (1971) description of the tone system of the Noto dialect of Suzu (located on the
northeastern point of the Noto peninsula) for instance, such a rise is indicated, although Okuda
does not include it in his phonological representation of the tone system.
17 The differences in pitch in modern Japanese are extremely small. (According to McCawley
(1970:529), the amount by which pitch drops after the /H/ tone is no more than a major or
minor second.) This is probably due to the fact that larger differences are superfluous as modern
Japanese lacks the contrast between words and phrases with level /L/ tone and level /H/ tone
6.2 The Noto dialects 175
The initial /H/ tones that are the result of the shift of /H/ tone away from the
depressor syllables to the preceding syllable are exempt from the initial lowering
rule proposed by McCawley. This means that the initial lowering rule is older than
the shifting away of the /H/ tone to the initial syllable caused by the depressor
syllables. (The fact that the dialects of Nozaki and Kōda do have the conditioned
variants, but have not lowered /H/ tone on the initial syllable, means that
McCawley’s rule never reached these dialects, whereas the conditioned variants
did.)
It is impossible to link the conditioned variants to a stage in the history of the
Noto dialects in which they had a Kyōto type tone system. 18 The conditioned
variants could only have developed in a tone system similar to that of the majority of
the present-day Noto dialects, or in a slightly older stage that was still similar to the
present-day tone system of Nozaki.
6.2.6 The tone of monosyllabic nouns in the Noto dialects
Kindaichi does not show the tone of monosyllables in the Noto dialects in his table,
but refers to the monosyllables in a number of footnotes. As Kindaichi lists phrases
that usually have similar tone patterns together in his table, class 2.1 and class 1.1 +
particle, class 2.2/3 and class 1.2 + particle and class 2.4 and class 1.3 + particle are
not separated. Because of this arrangement it appears as though the tone of classes
1.1 and 1.2 + particle in all Noto dialects is :-, while the tone of class 1.3 is :-
, except in the dialects of Ishizaki, Nozaki and Kōda where it is :-.19 Although
these dialects, like so many Tōkyō type dialects, have the rule that there will be an
automatic rise in pitch after the initial syllable in words with Ø tone, this rule does
not apply to monosyllabic nouns with an attached case particle. As the realization of
that still existed in Middle Japanese. It is for instance the automatic rise to [H] pitch at the end
of a tonal phrase that starts with /L/ tone in Kyōto that gives away that the initial syllable is /L/.
It is this rise that contrasts such phrases with phrases with Ø tone, which are not audibly higher
but lack the rise. As there is no contrast between level [L] and level [H] phrases, the decision to
analyze a level phrase as [H] or [L] depends on the analysis of the tone system of the dialect as
a whole. Different linguists sometimes choose to analyze the level phrases in certain dialects in
different ways: Tone class 2.1/2 + particle in Taketomi is - according to Thorpe (1983),
but - according to Akinaga (1960). The pitch of nouns with Ø tone in Toyama is level
[L] according to Hirayama (1960) but level [H] according to Okuda (1971) and Uwano (1981).
18 It is impossible to take a Kyōto type tone system as the starting point in trying to explain the
conditioned variants in the Noto dialects as no rules obtain: The effect of the special second
syllable could be either to preserve an originally [H] preceding syllable (in all the affected tone
classes) or to raise an originally [L] following syllable (in tone class 3.2/3.4). Furthermore it
cannot be explained why tone classes 2.4, 2.5, 3.6 and 3.7 do not show the same change as tone
class 3.2/3.4.
19 Kindaichi does not mark vowel length for monosyllables in the Noto dialects, but other
descriptions of these dialects indicate that the Noto peninsula belongs to the area in which
vowel length in monosyllables is automatic.
the tone of monosyllabic nouns does not conform to the normal rules, Kindaichi
added the correct pitches of the monosyllabic nouns in notes under the table.20
17 The tone of monosyllabic nouns in the Noto dialects
Kōda, Nozaki, Noto
Ishizaki (general)
1.1 :- :-
1.2 :- :-
1.3 :- :-
I prefer to analyze the pitch of tone class 1.1 in Kōda, Ishizaki and Nozaki as :-
instead of :-, as in these dialects, all other tone classes with Ø tone start with [L]
pitch as well, but in the general Noto type dialects, the initial [L] pitch of tone class
1.3 compels us to analyze tone class 1.1 as level [H].21
Although the tone systems of Ishizaki and Nozaki are more archaic than the tone
systems of the other Noto dialects, these dialects have preserved a smaller number of
phonological contrasts in monosyllabic nouns. My explanation is that at the time
when McCawley’s lowering rule came into effect, tone class 1.2 still had /R/ tone
(realized with an attached case particle as :-). Because of its already low onset,
20 In the article accompanying the Japanese dialect map in Wurm and Hattori (1983) Uwano
states that the Tōkyō type dialect of Kōda distinguishes between tone classes 1.1, 1.2 and 1.3. If
this were correct, the dialect of Kōda would be the only Tōkyō type dialect in Japan to have
preserved three classes for the monosyllabic nouns. The only description of the tone of the
monosyllables in Kōda that I have found so far is in the footnotes contained in Kindaichi’s
article. According to these notes, tone classes 1.2 and 1.3 in Kōda have merged: “Vocabulary
of group (1). Everywhere the shape of monosyllabic nouns + particle (example 1.1 ka-ga) is
. Except in Kōda, words with a second mora consisting of a voiced consonant + a close
vowel (example huru) are . Vocabulary of group (2). Everywhere monosyllables of class
1.2 + particle (example ha-ga) are .” These notes are printed underneath Kindaichi’s table,
which spreads out over two pages, and because of this, the notes are also spread out over two
pages.
Page 62:(1)の語彙 各地とも1拍名詞+助詞の形(例蚊が )は 型。向田を除き、
第2拍 Page 63: が有声子音+狭母音の語(例振る)は 型。(2)の語彙 各地
とも第2類1拍名詞 Page 62: +助詞の形(例葉が)は 型。
This unusual way of printing may have caused confusion as the text that appears together on
page 62 could be mistakenly read as: “Vocabulary of group (1). Everywhere the shape of
monosyllabic nouns + particle (example 1.1 ka-ga) is . Except in Kōda ... monosyllables of
class 1.2 + particle (example ha-ga) are .” Because it appears in the table as though tone
class 1.2 has tone in all Noto dialects, the tone of the monosyllables in Kōda can now
(mistakenly) be interpreted as: 1.1 :-, 1.2 :-, 1.3 :-.
21 Okuda (1971) mentions that in the tone system of Suzu tone class 1.3 is characterized by a rise
in pitch at the end of the tonal phrase (cf. 1.3 tee, tee-ga ‘hand’ :, :-) but Okuda does not
include this rise in his phonological representation. In Suzu all monosyllabic nouns are
automatically lengthened: 1.1 kaa-ga ‘mosquito’ :-, 1.2 naa-ga ‘name’ :-.
this tone was not affected by McCawley’s rule. Later, in all Noto dialects the /R/
toneme of tone class 1.2 was simplified to /H/.22
18 The origin of the different merger patterns of monosyllabic nouns
in the Noto dialects
Kōda, Nozaki, MJ ‘Nairin’ lowering Noto
Ishizaki (general)
1.1 :- :- > :- :-
1.2 :- < :- :- > :-
1.3 :- :- > :- :-
6.2.7 The tone system of Toyama
The tone system of the dialect of Toyama is included in Hirayama Teruo’s dialect
dictionary. The dialect of Toyama is not located on the Noto peninsula or on Noto
Island, but in Toyama prefecture directly to the south of the Noto peninsula. The
tone system of this dialect differs from that of the Noto dialects proper, especially
where the influence of segmental features on the placement of the /H/ tone is
concerned.
The location of the /H/ tone in the word varies, depending on the quality of the
vowel in the final or the second syllable of the word. As shown in table (19), if the
second or final syllable contains one of the close vowels i or u, the variant marked (I)
will occur, the /H/ tone being in the Kyōto type location. If the final or the second
syllable contains one of the open vowels e, a, or o, the variant marked (A) will occur,
the /H/ tone being in
the Tōkyō type location. Hirayama analyzed the tone system of this dialect as
belonging to the Kyōto type.
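The conditioning described above can be restated as a small Python sketch. It is an illustration only: the function and dictionary names are hypothetical, words are entered as hand-segmented syllable lists, and the example words are borrowed from tables elsewhere in this study merely to show the mechanics. The outputs are what the stated rule predicts, not attested Toyama forms.

```python
CLOSE_VOWELS = set("iu")  # close vowels; e, a and o count as open

# Which syllable conditions the variation in each merged tone class,
# following table (19): index -1 = final syllable, index 1 = second syllable.
CONDITIONING_SYLLABLE = {
    "2.2/3/5": -1,
    "3.2/4/7": -1,
    "3.3/5": 1,
}

def toyama_variant(tone_class, syllables):
    """Return 'I' (Kyōto type location) or 'A' (Tōkyō type location).

    Classes without conditioned variants return None.
    """
    position = CONDITIONING_SYLLABLE.get(tone_class)
    if position is None:
        return None
    vowel = syllables[position][-1]
    return "I" if vowel in CLOSE_VOWELS else "A"

print(toyama_variant("2.2/3/5", ["ya", "ma"]))        # yama 'mountain'
print(toyama_variant("3.2/4/7", ["a", "du", "ki"]))   # aduki 'red bean'
print(toyama_variant("3.2/4/7", ["ka", "bu", "to"]))  # kabuto 'helmet'
print(toyama_variant("3.3/5", ["ko", "ko", "ro"]))    # kokoro 'heart'
```

The sketch deliberately returns only the variant label, because which of the two locations is original is exactly the question left open in the discussion.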
In the Noto dialects it was clear that the conditioned variants that occurred with
the depressor syllables did not represent the original location of the /H/ tone. In the case
of Toyama on the other hand, it is hard to choose which of the two variants – the one
that occurs with open vowels, or the one that occurs with close vowels – is original.
The decision to classify this tonal type as belonging to the Tōkyō type or to the
Kyōto type depends on this choice.
The conditions for the variation in the location of the /H/ tone in this dialect are
different from those in the Noto dialects, and are more like those found in the Gairin
B dialects, in the Chūrin dialects on the Bōsō peninsula, and in the Kyōto type
dialect of East-Sanuki. In these dialects (which will be discussed in the next chapter),
a rightward tone shift occurred, which was blocked by close vowels. There can be
no doubt that in these dialects the location of the /H/ tone that occurs with close
22 The /R/ tone on the final syllable of tone class 2.5 on the other hand, was preserved, although
not on the phonetic level. The vowel length on this contour tone was not protected by the
automatic vowel lengthening of monosyllables. As the vowel shortened, the rise to [H] pitch
was shifted onto the attached case particle, a situation that has been preserved to this day.
vowels is original: With close vowels more distinctions have been preserved, and
neighboring dialects that are more conservative help to determine where the /H/ tone
was originally located.
If – in Toyama as well – the location of the /H/ tone that occurs with close
vowels is original, Toyama would be a dialect that went along with the leftward tone
shift, but avoided creating a /L/ toneme by merging tone classes 1.3, 2.4 and 3.6
with the tone classes with Ø tone. This would make Toyama in origin a Tarui-type
dialect. Later, rightward tone shift blocked by close vowels brought the location of
the /H/ tone in part of the vocabulary back to the Tōkyō type location.23 (There is at
least one example of a dialect that went through a similar development, namely the
dialect of East Sanuki in section 7.2.1.)
19 The tone system of Toyama
Toyama
1.1/3 -
1.2 '-
2.1/4 -
2.2/3/5 '- I in final syllable
'- A in final syllable
3.1/6 -
3.2/4/7 '- I in final syllable
'- A in final syllable
3.3/5 '- I in 2nd syllable
'- A in 2nd syllable
It is also possible to regard the dialect of Toyama as closely related to the Noto
dialects. Instead of being influenced by the Kyōto shift, the tone system of Toyama
could have developed as follows: First, McCawley’s initial lowering rule caused a
loss of the initial /H/ tone in classes 1.3, 2.4, 2.5, 3.6 and 3.7.24 Next, /H/ tone
23 If so, the developments in class 2.5 must have been as follows. First the final /R/ tone
developed into /H/ tone, just as the /R/ tone did in class 1.2 in the Nairin type dialects: -
(MJ ‘Nairin’ type) > -. Class 2.5 thus merged with class 2.3. Later, /H/ tone restriction
and the Kyōto-type shift resulted in /H/ tone on the initial syllable - > '- > '-.
Still later, in part of the vocabulary, rightward tone shift blocked by close vowels caused the
/H/ tone to shift to the second syllable: '- > '-. In class 3.7 the creation of the /L/
toneme was avoided during the leftward tone shift (just as in classes 1.3, 2.4 and 3.6) '' >
'. This made class 3.7 merge with classes 3.2 and 3.4. Later, in part of the vocabulary,
the rightward tone shift blocked by close vowels caused the /H/ tone to shift to the final syllable
3.2/4/7 ' > '.
24 According to Hirayama’s representation in the dialect dictionary (1960), the classes with Ø
tone in Toyama have level [L] pitch. Shibatani (1990:190) on the other hand, reported an
automatic rise after the initial syllable in the tone classes with Ø tone in Toyama. Uwano
would shift away to the preceding syllable in case it was originally located on a
depressor syllable.
In the dialects on the Noto peninsula, these were syllables that contained a close
vowel + a voiced consonant. If the definition of ‘depressor syllable’ was broadened
in Toyama, to include all syllables that contained close vowels, this would explain
why in Toyama /H/ tone was shifted away to the preceding syllable in a larger part
of the vocabulary than on the Noto peninsula.
The variants marked (A) would then have preserved the original location of the
/H/ tone. This would make Toyama a Noto type dialect: An archaic (Nairin) Tōkyō
type dialect (preserving tone classes 2.5 and 3.7), which went through McCawley’s
initial lowering rule, and later shifted the /H/ tone to the left in part of the vocabulary.
As I am not sure which of the two solutions is best, I have not classified the tone
system of this dialect as belonging to the Tōkyō type or the Kyōto type, but have
labeled it ‘Toyama type’ on dialect Map 1.
(1981) and Okuda (1971) analyze Hirayama’s level class as [H], but the rise reported by
Shibatani does not agree well with such an analysis.
7 Rightward spreading and tone shift
in the Japanese dialects
The standard theory has argued that rightward tone shift is such a natural
phenomenon that it happened many times over, independently, in geographically
widely separated dialects, transforming Kyōto type tone systems into Tōkyō type
tone systems throughout Japan. Rightward tone shift is indeed widely attested in
Japan, as well as in many other languages of the world.
In Japan, the fact that rightward tone shift has occurred in a certain dialect is
usually evident from two things: First, the tone shift has only affected those tone
classes that did not already have the /H/ tone on the final syllable. Secondly, even
within the affected tone classes, only part of the vocabulary has shifted the /H/ tone
to the right, as under certain conditions the original location of the /H/ tone has been
preserved.
In the previous chapter we have seen how in the Noto dialects certain segmental
features (a combination of close vowels and depressor consonants) had influence on
tone, and could suppress the /H/ tone, or cause the /H/ tone to shift to the left. In
many Japanese dialects that have rightward tone shift, segmental features also have
influence on tone, in that vowel quality can facilitate or block rightward tone shift.
The influence of vowel quality on tone in these dialects is more limited than the
influence of the special syllables in the Noto dialects, as vowel quality only blocks
or facilitates the natural tendency for tone to spread to the right, but does not
suppress /H/ tone or cause /H/ tone to shift to the left.
The rightward shift of the /H/ tone can be blocked for instance, if the syllable to
which the /H/ tone would be shifted contains a close vowel. In a number of Japanese
dialects we even see that two conditions have to be met before /H/ tone will shift:
Hiroto (1961:165) reports that in Nogi-gun to the east of Matsue, in addition to the
second syllable containing an open vowel the first syllable has to contain a close
vowel: 2.4/5 hune' ‘ship’, huna' ‘carp’ but ka'sa ‘umbrella’, ma'do ‘window’.
Likewise, in Mutsu city on the Shimokita peninsula: 2.4/5 ido' ‘well’ but a'ki
‘autumn’, a'me ‘rain’. Kindaichi (1975a (1983): 141) reports the same for the
dialects of Hachinohe in Aomori and Morioka in Iwate.
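The two-condition rule reported for Nogi-gun and Mutsu city can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the words are segmented into syllables by hand, and the /H/ tone of class 2.4/5 nouns is assumed to start on the initial syllable. As in the examples above, the mark ' follows the syllable that carries the /H/ tone.

```python
CLOSE_VOWELS = set("iu")   # close vowels
OPEN_VOWELS = set("aeo")   # open vowels

def shifts_right(syllables):
    """Return True if /H/ moves from the first to the second syllable.

    Assumes a disyllabic class 2.4/5 noun with /H/ on the initial syllable.
    Both conditions must hold: close vowel in the first syllable AND
    open vowel in the second syllable.
    """
    first, second = syllables
    return first[-1] in CLOSE_VOWELS and second[-1] in OPEN_VOWELS

def mark_tone(syllables):
    """Write the word with ' after the syllable that carries /H/."""
    h_index = 1 if shifts_right(syllables) else 0
    return "".join(s + "'" if i == h_index else s
                   for i, s in enumerate(syllables))

words = [["hu", "ne"], ["hu", "na"], ["ka", "sa"], ["ma", "do"],
         ["i", "do"], ["a", "ki"], ["a", "me"]]
for word in words:
    print(mark_tone(word))
```

Run on the forms cited above, the sketch reproduces hune', huna', ka'sa, ma'do, ido', a'ki and a'me.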
In many modern Japanese dialects, [L] pitch will spread to the right, such as the
phrase initial [L] pitch that is the result of the %L phrase boundary tone in Aomori
(2.1/2 , -) or the [L] pitch that is the result of the /L/ tone in Kyōto (2.4
', '-). In some dialects (see Matsue in section 7.1.1) this spread is
7.1 Rightward tone shift in the Tōkyō type dialects 181
facilitated if the syllable to which the [L] pitch spreads contains a close vowel, as
this will allow [H] pitch to shift away from syllables with close vowels.1
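The distinction between 'spreading' and 'shift' drawn in footnote 1 can be made explicit with a short Python sketch. It is a simplification that tracks only the positions of H in two tone strings of equal length; the function names and the labels 'no change' and 'other' are hypothetical additions for illustration.

```python
def h_positions(pattern):
    """Return the set of indices that carry H in a string such as 'LHL'."""
    return {i for i, tone in enumerate(pattern) if tone == "H"}

def classify(before, after):
    """Classify a change between two tone strings.

    'spreading': H gains new positions but stays linked to every old one.
    'shift':     H is delinked from an old position and linked to a new one.
    """
    if len(before) != len(after):
        raise ValueError("patterns must have the same length")
    old, new = h_positions(before), h_positions(after)
    if old == new:
        return "no change"
    if old < new:                      # old is a proper subset of new
        return "spreading"
    if old - new and new - old:        # something lost and something gained
        return "shift"
    return "other"

for before, after in [("LHL", "LHH"), ("HLL", "HHL"),
                      ("LHL", "LLH"), ("HLL", "LHL")]:
    print(before, ">", after, ":", classify(before, after))
```

The four pairs are the ones given in footnote 1: the first two come out as spreading, the last two as shift.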
The reason why close vowels tend to avoid high pitch seems to be connected to
the fact that close vowels are shorter than open vowels, and that in Japanese this
natural difference in length has been exaggerated (Vance, 1987: 49).
Lehiste (1970:18-19) says that “other factors being equal, a high vowel is
shorter than a low vowel.” The figures Lehiste quotes from Elert (1964) show
that for short allophones in Swedish, if the average duration of high vowels is
taken as 1.00, that of mid vowels is 1.08 and that of low vowels is 1.17. The
figures from Han (1962a: 67) show much greater differences for Japanese
vowels. Taking the average duration of /u/ as 1.00, the averages for the other
vowels are as follows: /i/ 1.17, /o/ 1.26, /e/ 1.37, /a/ 1.44. Lehiste says, “It is
quite probable that the differences in vowel length according to degree of
opening are physiologically conditioned and thus constitute a phonetic
universal.” Han’s figures suggest that the physiologically conditioned
differences have been exaggerated in Japanese. This kind of exaggeration is
sometimes known as PHONOLOGIZATION (Hyman 1975:171-173).2
This chapter will examine a number of dialects in which rightward tone shift is
conditioned, as well as dialects in which rightward tone shift has occurred
unconditionally and is not related to vowel quality. In each case however, it turns
out that these dialects offer no proof for the two basic assumptions of the standard
theory; namely that Kyōto type tone was once more widespread, and that a Kyōto
type tone system will develop into a Tōkyō type tone system when subjected to the
influence of rightward tone shift.
7.1 Rightward tone shift in the Tōkyō type dialects
In some Tōkyō type dialects the rightward tone shift is conditioned, while in others
the tone shift has occurred unconditionally. In both cases it can be seen that the
1 The term ‘spreading’ applies when tone spreads to adjacent syllables or moras but remains
linked to the original location in the word as in /LHL/ > /LHH/ or /HLL/ > /HHL/. The term
‘shift’ applies when a tone is delinked from the original location in the word as in /LHL/ >
/LLH/ or /HLL/ > /LHL/. The anticipation of the accent-like /H/ tone on syllables with Ø tone
that precede the /H/ tone is therefore a form of spreading, as is the spreading of initial /L/ tone
in Kyōto onto syllables with Ø tone that follow the /L/ tone: Class 2.4 but -. In both
cases [L] or [H] pitch spreads over syllables with Ø tone, but the phonemic tone remains linked
to the original location. In the modern Japanese dialects with /H/ vs. Ø tone systems, spreading
usually occurs over syllables with Ø tone. When the location of the single /H/ tone in the word
changes however, this should be referred to as ‘shift’, as the tone delinks from the original
location.
2 In order to avoid confusion with tone height I prefer to use the terms ‘open’ and ‘close’ rather
than ‘high’ and ‘low’.
182 7 Rightward spreading and tone shift in the Japanese dialects
location of the /H/ tone before the rightward shift was in the Tōkyō type location.
These cases of rightward shift therefore contain no indication that Kyōto type tone
once existed in these areas.
7.1.1 Rightward tone shift conditioned by vowel height
Variation in the location of the /H/ tone conditioned by vowel height is especially
common in the Gairin type tone systems. So common in fact, that the Gairin type
tone system can be divided into two subtypes, type B, which has such variation, and
type A, which does not.
Type B can be found in part of the area with Gairin type tone in Shimane
prefecture (such as in the dialects of Matsue and Izumo), it can be found in a large
area in northeast Honshū (such as in the dialects of Akita and Aomori) and in
Hokkaidō. Within the large Gairin area in northeast Honshū there are pockets where
the more archaic Gairin A type has been preserved such as on the Shimokita
peninsula in Aomori, the area of Hachinohe, Morioka, Miyako and Kamaishi in Iwate
prefecture and in Nezugaseki in Yamagata. (As mentioned above, in Mutsu city on
the Shimokita peninsula, Hachinohe in Aomori, and Morioka in Iwate, rightward
shift of the /H/ tone may occur if the first syllable has a close vowel in addition to
the second syllable having an open vowel.)
In the Chūrin type dialects, variation in the location of the /H/ tone conditioned
by vowel height only occurs in the dialects of the Bōsō peninsula.
The tone classes that are affected by the rightward tone shift are those classes
that do not already have the /H/ tone on the final syllable. In the Tōkyō type dialects
it is therefore typically tone classes 2.4/5, 3.3/5 and 3.6/7 that show variation in the
location of the /H/ tone.
As an example of the Gairin B type, I give the dialect of Aomori (Kobayashi,
1975) with the Gairin A type of Ōita (Hirayama, 1960) for comparison in (1). The
first form in Aomori occurs when the rightward shift of the /H/ tone is blocked by a
syllable containing i or u, represented by I, and the second form occurs when the /H/
tone is shifted one syllable to the right onto a syllable containing the vowels a, e or o,
represented by A.3
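The conditioning in Aomori can be sketched in Python as a single rule: the /H/ tone moves one syllable to the right unless the syllable it would land on contains i or u. This is an illustration under assumptions, not a description taken from Kobayashi: the function name is hypothetical, syllables are segmented by hand, and the starting position of the /H/ tone (initial syllable for classes 2.4/5 and 3.6/7, second syllable for class 3.5) is supplied as an input.

```python
CLOSE_VOWELS = set("iu")  # 'I' in table (1); a, e and o are 'A'

def shift_h(syllables, h_index):
    """Return the index of the /H/ syllable after conditioned rightward shift."""
    target = h_index + 1
    if target >= len(syllables):
        return h_index                 # /H/ is already on the final syllable
    if syllables[target][-1] in CLOSE_VOWELS:
        return h_index                 # shift blocked by a close vowel
    return target                      # shift onto an open-vowel syllable

examples = [
    (["u", "mi"], 0),          # 2.4/5 umi 'sea'
    (["a", "to"], 0),          # 2.4/5 ato 'trace'
    (["i", "no", "ti"], 1),    # 3.5 inoti 'life'
    (["ko", "ko", "ro"], 1),   # 3.5 kokoro 'heart'
    (["su", "zu", "me"], 0),   # 3.6/7 suzume 'sparrow'
    (["u", "sa", "gi"], 0),    # 3.6/7 usagi 'rabbit'
]
for syllables, h_index in examples:
    print("".join(syllables), h_index, "->", shift_h(syllables, h_index))
```

With these inputs the /H/ tone stays in place in umi, inoti and suzume and moves one syllable to the right in ato, kokoro and usagi, which matches the division into I and A forms.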
The comparison with Ōita shows that the tone that occurs in the forms with the
close vowels is original. 4 Other indications for this are the fact that in disyllabic
3 The vocabulary of class 3.3 in Aomori is too small and has too many irregular reflexes to draw
any real conclusions. The examples in Kobayashi (1975) are: sara'sa ‘cloth’, kera'e ‘vassal’,
komuNi ‘wheat’, awa' Úbi ‘abalone’ and tugara ‘power’.
4 Although my data are from Kobayashi, my interpretation is somewhat different. Kobayashi
considers the location of the /H/ tone that occurs with close vowels as original in case of tone
class 2.4/5, but in case of class 3.5 she regards the location of the /H/ tone that occurs with
open vowels as original. This is because in Kobayashi’s table class 3.5 is presented as merged
with class 3.4, and class 3.4 typically has the /H/ tone on the final syllable. It is more correct
however, to see the lack of variation in the location of the /H/ tone in class 3.4 as a feature that
distinguishes this class from class 3.5. This means that the location of the /H/ tone that occurs
nouns the tone that occurs with close vowels has preserved more distinctions (in
case of open vowels tone classes 2.3, 2.4 and 2.5 have merged) and the fact that /H/
tone placement in disyllabic nouns with close vowels in the final syllable is free
whereas nouns ending in open vowels will always have the /H/ tone on the second
syllable.
Because most dialect descriptions concentrate on disyllabic nouns, it may appear
as though the location of the /H/ tone only varies depending on the quality of the
vowel in the final syllable, but the dialects of Aomori (Kobayashi, 1975) and Akita
(Hirayama, 1960) indicate that rightward shift is also blocked in tone class 3.6/7 if
the second syllable contains a close vowel.5 Okuda (1971) reports a similar situation
in Matsue: 3.6 ne'zumi ‘mouse’ but 3.6 usa'gi ‘rabbit’, 3.7 ka'buto ‘helmet’ but 3.7
tuba'ki ‘camellia’.
1 A comparison of Gairin A and Gairin B
Ōita Aomori
(Gairin A) (Gairin B)
2.1/2 tori ‘bird’, mura ‘village’ - ,-
2.3 yama ‘mountain’ '- '-
2.4/5 umi ‘sea’, saru ‘monkey’ '- '- I
ato ‘trace’, kumo ‘spider’ '- A
3.1/2 kuruma ‘vehicle’, aduki ‘red bean’ - ,-
3.4 otoko ‘man’ '- '-
3.5 inoti ‘life’ '- '- I
kokoro ‘heart’ '- A
3.6/7 suzume ‘sparrow’, kabuto ‘helmet’ '- '- I
usagi ‘rabbit’, tayori ‘news’ '- A
In Matsue [L] pitch spreads to the right over syllables with Ø tone. The automatic
rise to [H] pitch after the %L phrase boundary tone for instance, is postponed if the
second syllable contains a close vowel, such as in mugibata'ke ‘wheat field’
'.6 What is more remarkable is that in Matsue the rise to [H] pitch is also
postponed in case of /H/ tone, as can be seen from the tone of class 2.3 (Kindaichi,
1981).
with a close vowel in the final syllable in class 3.5 can be regarded as original, just as in case of
class 2.4/5.
5 Although I represent class 3.6 as merged with class 3.7 in Aomori here, it is not certain that class
3.6 shows the same alternations as class 3.7, as all Kobayashi’s examples of vocabulary of
class 3.6 happen to have open vowels in the second syllable. (All therefore have ' tone.)
Akita has 3.6 usa'gi ‘rabbit’, una'gi ‘eel’ etc., but ki'tune ‘fox’, su'zume ‘sparrow’.
6 According to Kobayashi (1975:75) this is also the case in Izumo.
As a result, members of class 2.3 merge with class 2.1/2 if the second syllable
contains a close vowel, but members of class 2.4/5 merge with class 2.3 if the
second syllable contains an open vowel.
2 Rightward tone shift in Matsue
Matsue
2.1/2 - I
- A
2.3 - I
'- A
2.4/5 '- I
'- A
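The mergers that result in Matsue can be summarized in a short Python sketch. It is only a restatement of the two conditions given in the text; the function name is hypothetical, and the vowel of the second syllable is passed in directly rather than derived from a word.

```python
CLOSE_VOWELS = set("iu")

def matsue_group(tone_class, second_vowel):
    """Return the tone group in which a disyllabic noun ends up in Matsue."""
    close = second_vowel in CLOSE_VOWELS
    if tone_class == "2.3" and close:
        return "2.1/2"   # rise postponed past the close vowel
    if tone_class == "2.4/5" and not close:
        return "2.3"     # /H/ shifted onto the open second syllable
    return tone_class    # no merger under the remaining conditions

for tone_class in ("2.1/2", "2.3", "2.4/5"):
    for vowel in ("i", "a"):
        print(tone_class, vowel, "->", matsue_group(tone_class, vowel))
```

The sketch models the mergers only. It does not represent the difference in pitch between the I and A variants within class 2.1/2.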
The dialect of Ichihara is one of the Chūrin dialects on the Bōsō peninsula. This is
the only region where Chūrin dialects show rightward shift of /H/ tone blocked by
close vowels and rightward spread of [L] pitch facilitated by close vowels (Uwano,
1981). In most dialects the monosyllables are exempt from rightward tone shift, and
the variation in the location of the /H/ tone therefore only occurs in words of two
syllables and longer. Here, in Ichihara however, we see that nouns of class 1.3 that
contain a close vowel will spread the /H/ tone onto the attached case particle.
3 Rightward tone shift in the Chūrin type tone system of the Bōsō peninsula
Ichihara
1.1/2 -
1.3 -' I
'- A
2.1 , - I
- A
2.2/3 , - I
'- A
2.4/5 '- I
'- A
Kobayashi (1975:78) reports that in Izumo, monosyllabic nouns of class 1.3 that
contain a close vowel will shift the /H/ tone to the particle: 1.3 ki-ga ‘tree’ -'.
It is self-evident that it is harder for rightward tone shift to occur in
monosyllables than in words of two or more syllables. However, the following
dialect from the Shimokita peninsula in Aomori prefecture (Ikegami, 1970) shows
that there is even a difference in the ease with which tone shift occurs between
disyllabic and trisyllabic nouns. 7 This dialect has rightward tone shift blocked by
close vowels in trisyllabic nouns, but not in disyllabic nouns, and appears to be in-
between type A and type B.8
4 Rightward tone shift on the Shimokita peninsula
Shimokita
1.1/2 -
1.3 ', '-
2.1/2 -
2.3 ', '-
2.4/5 '-
3.1/2 -
3.4 ', '-
3.59 '- I
', '- A
3.6/710 '- I (in 2nd syllable)
'- A (in 2nd syllable)
There is a phenomenon that may be related to this issue. This concerns the frequent Ø
tone reflex of class 3.6 in many Tōkyō type dialects. The expected reflex for this
class in the Tōkyō type dialects is ', just as for tone class 3.7, and there are
indeed areas where this reflex predominates: The Gairin A type dialect of Ōita
(Hirayama, 1960) has a very regular ' reflex for both classes. This is also the
case in the area with Gairin A type tone around Hamamatsu (Iitoyo, 1983:157).
Many Chūrin Tōkyō type dialects however, such as Matsumoto, Numazu and
Hiroshima (Hirayama, 1960), have ' tone in the majority of nouns of class 3.7,
but Ø tone in the majority of nouns of class 3.6. Kobayashi’s (1975) data for Tōkyō
and Yamaguchi also contain many Ø tone reflexes for 3.6 nouns, while class 3.7
generally has ' tone.
7 I see no clear pattern in the reflexes of tone class 3.3: koNa'ni ‘gold’, komu'Ni ‘wheat’, ha Úda'zi
‘twenty years old’ but cikara ‘power’, awabi ‘abalone’, sazjE ‘turban shell’.
8 According to Uwano (personal communication) there are more dialects in northeast Japan that
show such a difference between disyllabic and trisyllabic nouns as far as rightward tone shift is
concerned.
9 The examples from Ikegami are: asa'fi ‘morning sun’, ezi'ci ‘five’, eno'zi ‘life’, kiu'ri ‘cucumber’,
nisi'gi ‘brocade’, fi Úba'si ‘tongs’, mana'gu ‘eye’, kogoro' ‘heart’, a Úbura' ‘oil’, hasira' ‘pillar’,
magura' ‘pillow’. (But 3.5 suNa'da ‘figure’ does not fit into the pattern.)
10 The examples for 3.6 are: ki'zine ‘fox’, su' zÚ ime ‘sparrow’, usa'Ni ‘rabbit’, una'Ni ‘eel’, kara'si
‘crow’, hena'ga ‘back’, taga'sa ‘height’, fi bÚ a'ri ‘sky lark’, joNo'mi ‘mugwort’. (But ne zÚ i'mi ‘mouse’
does not fit into the pattern.)
The examples for 3.7 are: e'ziNo ‘strawberry’, u'siro ‘behind’, ka' Úbudo ‘helmet’, ku' zÚ ira ‘whale’
and tajo'ri ‘news’, tarE' (<*tara'i) ‘tub’, jamE' (<*yama'i) ‘illness’. (But ka'rasi ‘mustard’ and kusi'ri
‘medicine’ do not fit into the pattern.)
The explanation for the different reflexes of classes 3.6 and 3.7 in many Tōkyō
type dialects may be as follows: Just as on the Shimokita peninsula, rightward shift
of the /H/ tone was limited to the longer nouns. The /H/ tone that originally fell on
the initial syllable in tone class 3.6 shifted to the second syllable, but the /H/ tone on
the initial syllable of class 2.4 was not affected.
The final /H/ tone in class 3.7 prevented the initial /H/ tone in this class from
shifting to the right. (This is a constraint known from Bantu as the OCP, the Obligatory Contour
Principle.) Tone class 3.6 now had ' tone, and – after the final /H/ tone of
class 3.7 was lost – tone class 3.7 had ' tone. Finally, the medial /H/ tone in
class 3.6 disappeared, due to the tendency observed by Yoshida (1997) for medial
/H/ tone in Tōkyō to shift to Ø.11
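The proposed chain of developments can be traced step by step in a Python sketch. It rests on several assumptions made for illustration: each word is represented as the set of syllable indices that carry /H/ tone, the starting shapes (initial /H/ for classes 3.6 and 2.4, initial plus final /H/ for class 3.7) follow the description in the preceding paragraphs, and all function names are hypothetical.

```python
def shift_right(h, length):
    """Stage 1: initial /H/ moves to the second syllable in nouns of three
    or more syllables, unless the word has another /H/ (OCP-style blocking)."""
    if length >= 3 and h == {0}:
        return {1}
    return set(h)

def drop_final_h(h, length):
    """Stage 2: a final /H/ is lost when an earlier /H/ is present."""
    if len(h) > 1:
        return h - {length - 1}
    return set(h)

def drop_medial_h(h, length):
    """Stage 3: medial /H/ is lost, leaving Ø tone."""
    return {i for i in h if i in (0, length - 1)}

def derive(h, length):
    """Apply the three stages in order and return the resulting /H/ set."""
    for stage in (shift_right, drop_final_h, drop_medial_h):
        h = stage(h, length)
    return h

print("3.6:", sorted(derive({0}, 3)))      # empty list = Ø tone
print("3.7:", sorted(derive({0, 2}, 3)))   # /H/ on the initial syllable
print("2.4:", sorted(derive({0}, 2)))      # unaffected disyllable
```

Under these assumptions class 3.6 ends up with Ø tone, class 3.7 keeps its initial /H/ tone, and class 2.4 is untouched, which is the distribution the explanation is meant to account for.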
Concluding this section on /H/ tone shift and [L] pitch spreading conditioned by
vowel height in the Tōkyō type dialects, I give an overview of the mergers in the
dialects of the Gairin B type that result from such processes. For comparison, I have
added the reflexes of the Gairin A type dialect of Ōita in (5), in which such mergers
have not occurred.12
5 Overview of mergers in the Gairin type tone systems
Ōita Akita Niigata Matsue/ Matsue/
Izumo Izumo
(1953) (1975, 1981)
2.1/2 - - - - - A
- I
2.3 '- '- '- - I - I
' A '- A
2.4/5 '- '- A '- A ' A '- A
'- I '- I '- I '- I
In Shimane prefecture, the Gairin B type dialects that have rightward tone shift such
as Matsue and Izumo share a relatively small area with the more archaic Gairin A
type dialects that do not have rightward tone shift. In addition, the conditions for
rightward tone shift in this region vary, such as for example in the dialect of Nogi-
gun mentioned above.
In the Tōhoku region as well, the Gairin A type and the Gairin B type tone
systems meet, and here too we see that there are dialects where two conditions have
11 Based on a comparison of the data in Hirayama (1957), NHK (1985) and her own research she
concludes that in trisyllabic nouns in Tōkyō, medial (and even final) /H/ tone is disappearing,
leaving initial /H/ as the default location of /H/ tone assignment in Tōkyō Japanese.
12 Ōita and Akita are based on Hirayama (1960), Niigata is based on Kindaichi (1981),
Matsue/Izumo (1953) is based on Hiroto & Ōhara (1953), Matsue/Izumo (1975, 1981) is based
on Kindaichi (1981) for Matsue and on Kobayashi (1975) for Izumo.
to be met before /H/ tone shifts to the right (cf. the Shimokita peninsula, Hachinohe
(Aomori) and Morioka (Iwate)).
The limited form of rightward tone shift (where the first syllable has to contain a
close vowel and the next syllable an open vowel before the /H/ tone will shift) must
represent a more archaic, incipient variety.
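The difference between the limited and the unconditional variety of the shift can be made explicit in a small sketch. It is an illustration only: the function names, the romanized input format, and the treatment of a, e and o as open vowels are assumptions made for the example, and the test words are illustrative syllable strings rather than dialect data.

```python
CLOSE_VOWELS = set("iu")   # assumption: i and u count as close vowels
OPEN_VOWELS = set("aeo")   # assumption: all other vowels count as open


def last_vowel(syllable):
    """Return the last vowel letter of a romanized syllable, or None."""
    for ch in reversed(syllable):
        if ch in CLOSE_VOWELS or ch in OPEN_VOWELS:
            return ch
    return None


def shift_initial_h(syllables, h_index, conditioned=True):
    """Return the position of /H/ after one step of rightward shift.

    syllables:   list of romanized syllables, e.g. ["i", "ta"]
    h_index:     0-based index of the syllable carrying /H/, or None for Ø tone
    conditioned: if True, /H/ only shifts when the first syllable contains a
                 close vowel and the second syllable an open vowel
    """
    if h_index != 0 or len(syllables) < 2:
        return h_index          # only initial /H/ is affected in this sketch
    if not conditioned:
        return 1                # unconditional shift
    first = last_vowel(syllables[0])
    second = last_vowel(syllables[1])
    if first in CLOSE_VOWELS and second in OPEN_VOWELS:
        return 1
    return h_index


print(shift_initial_h(["i", "ta"], 0))                      # 1
print(shift_initial_h(["u", "mi"], 0))                      # 0
print(shift_initial_h(["u", "mi"], 0, conditioned=False))   # 1
```

With `conditioned=True` only part of a tone class is affected, which is what produces the split reflexes; with `conditioned=False` the whole class moves together.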
7.1.2 Unconditional rightward tone shift
Uwano (1981) reports that in Arai near Lake Hamana in the area with Gairin type
tone around Hamamatsu, the merger of class 2.4/5 with class 2.3 is complete,
and is not conditioned by the quality of the vowel in the final syllable. (We can
perhaps call this tone system the Gairin B' subtype.) The division of the disyllabic
nouns into distinct tone classes in this area is therefore 2.1/2 - vs. 2.3/4/5
'-. In northern Miyagi prefecture (from Ishinomaki northward to Ichinoseki)
the merger of class 2.4/5 with class 2.3 is also complete. The division of the
disyllabic nouns into distinct tone classes in this area is 2.1/2 - vs. 2.3/4/5
'-. (I have no information on similar shifts and mergers in the longer nouns.)
The fact that rightward tone shift in a Gairin type tone system results in a merger
pattern similar to the merger pattern in the tone classes of the Kagoshima type tone
systems and a number of dialects in the Ryūkyūs, is most likely no coincidence.
Rightward tone shift may have been a key factor in the development of the Japanese
word-tone systems. (See section 9.1.)
7.2 Rightward tone shift in the Kyōto type dialects
Rightward tone shift also occurs in two of the Kyōto type dialects. In the first
example below the shift is conditioned, so that only part of the vocabulary of the
affected tone class develops a Tōkyō-like location of the /H/ tone. In the second
example below, the shift is unconditioned, but even here the result is not a Tōkyō
type tone system.
Moreover, as both these dialects are squarely located within the area with Kyōto
type tone, they give no indication of an earlier spread of the Kyōto type
tone system beyond the present-day area with Kyōto type tone.
7.2.1 Rightward tone shift conditioned by vowel height
On the island of Shikoku the tone of Kōchi is of the more typical Kyōto type, but in
the northeastern part of Shikoku in the area of Takamatsu, Marugame, Kan’onji,
Niihama, Ikeda and islands such as Shōdo a different type can be found which
Uwano (Hattori & Wurm, 1981) has called the Sanuki subtype of the Kyōto type
tone system.
In these dialects tone classes 2.1 and 2.3 have merged, but for the rest the tone of
the disyllabic nouns is of the typical Kyōto type. Uwano further divides the dialects
into the East Sanuki type and the West Sanuki type. In the East Sanuki type
variation in the location of the /H/ tone occurs, depending on the type of vowel in
the final syllable. (It may be that there are also variants in the location of the /H/
tone based on the vowel height of non-final syllables, as in Matsue and Aomori, but
for this we need information on the reflexes of trisyllabic nouns, which is not
included in Uwano’s discussion.)
Just as in the Tōkyō type dialects, it is the tone class with /H/ tone on the initial
syllable that is susceptible to the rightward tone shift, and in this Kyōto type dialect
where tone class 2.3 has merged with class 2.1, this concerns tone class 2.2:13
6 Rightward tone shift in Takamatsu
West Sanuki East Sanuki
(Marugame) (Takamatsu)
2.1/3 - -~-
2.2 '- '- I
', '-~'- A
2.4 ', '- ', '-
2.5 '', ''- '', ''-
A comparison with the closely related West Sanuki dialect shows that the tone that
occurs with close vowels is original. The tone that occurs with open vowels in the
final syllable is almost identical to the tone of class 2.5. The only difference appears
to be that the pitch of the first syllable of nouns of class 2.2 ending in open vowels is
in free variation, just as in class 2.1/3. The variation is probably the result of the
recent development of a %L phrase boundary tone that is still optional. (On the
merger between classes 2.1 and 2.3 in these dialects, see section 4.2.5.)
In conclusion, we can say that rightward shift of /H/ tone blocked by close vowels
developed at least two times independently on the main Japanese islands, once along
the Sea of Japan coast and once in the East Sanuki dialect of northeast Shikoku.
There can be no doubt that the influence of vowel height on tone in East Sanuki is a
recent development, as it can only have developed after the occurrence of the
leftward tone shift, and as it cannot be found in the closely related dialect of West
Sanuki.
The similar development on the Bōsō peninsula may have been the result of
influence by the former proximity of dialects with a Gairin B type tone system,
when the Bōsō peninsula was not yet cut off from these dialects by the large toneless
area that now exists in between, and by the spread of the dialect type of Tōkyō.14
13 See section 4.2.5 for a possible explanation for the merger between classes 2.1 and 2.3.
14 According to map 6 in the Gendai Nihon-go hōgen dai-jiten (Hirayama, Teruo ed., 1992), the
dialect of the Bōsō peninsula is also the only Chūrin type dialect in Japan that includes
Tōhoku-type centralized vowels.
7.2.2 Unconditional rightward tone shift in Ibukijima
An interesting case of rightward tone shift can be seen in the Kyōto type dialect of
Ibukijima in the Seto Inland Sea. This dialect is famous because it has preserved five
tone classes for disyllabic nouns, the same number that was distinguished in Middle
Japanese. In addition, this dialect has recently undergone a rightward tone shift that
is both well-documented and unconditional. This makes this dialect the ideal case to
test one of the central assumptions of the standard theory, namely, that a Kyōto type
tone system – when left to the influence of rightward tone shift – will develop into a
Tōkyō type tone system.
The tone system of Ibukijima has been researched a number of times. In (7) I
compare the descriptions contained in two publications (Wada, 1966 and Uwano,
1985), which both include data from people of different ages. I have arranged the
data in such a way that a development over time can be seen.
7 Rightward tone shift in Ibukijima
Wada (1966) Uwano (1985) Uwano (1985) Wada (1966)
Old man Old woman Middle aged Middle School
woman student
2.1 - - - -~-
2.2 - - , - ~,
-~-
2.3 , - , - , - -
~-
2.4 , - , - , - ,-
2.5 ~, - - , - , -
3.1 - - -
3.2 x - , - x
3.3 - - -
3.4 - - x
3.5 x -, -, x
- -
3.6 , - , - -
3.7 - , - -
Wada analyzed the tone of class 2.3 as /HM/, but mentioned that an analysis as /ML/
was equally possible. Most descriptions agree on the fact that between the first and
the second syllable there is a fall in pitch that is significantly smaller than the fall
from [H] to [L] in tone class 2.2.15
15 An exception is Kindaichi (1970), who analyzed the pitch of class 2.3 as . Kindaichi also
reported a difference between nouns of class 3.4 and 3.5: In isolation they are both pronounced
as , but when the copula is added 3.4 has - tone and 3.5 has - tone. He
therefore concluded that the tone of class 3.5 may have to be analyzed as . Uwano
A comparison of the different descriptions shows that in this dialect, /H/ tone is
progressively shifting to the right. (In the stages before the Middle School student,
no delinking had taken place yet, so in these cases it would be more correct to speak
of tone spreading.) The tone classes that include /H/ tone are therefore indeed
developing a Tōkyō-like location of the /H/ tone.16
A look at the tone classes that start with /L/ tone however, shows that we cannot
present the dialect of Ibukijima as an example of a Kyōto type tone system that turns
into a Tōkyō type tone system. The tone classes that start with /L/ tone behave just
as we have come to expect, looking at the developments in other Japanese dialects
and from what are known to be natural tonal developments in other languages: The
[L] pitch spreads to the right over syllables with Ø tone. There is no question of
initial /L/ tone transforming into initial /H/ tone, and consequently no development
into a real Tōkyō type tone system.
The variation in the location of the /H/ tone that can be found in many Japanese
dialects shows that Kindaichi is right in claiming that rightward tone shift occurred
independently in many dialects in Japan. However, even when such rightward shifts
are unconditional – such as in Ibukijima – a Kyōto type tone system will not change
into a Tōkyō type tone system, as this would require unnatural developments in the
tone classes that start with /L/ tone.
The distinctive /M/ tone in classes 2.3 and 3.4 may be a remnant of the [M] pitch
that has to be reconstructed in these tone classes in the transitional period (cf. section
4.1.2). If the leftward shift affected the dialect of Ibukijima when it was still at this
stage, the pitches of classes 2.3 and 3.4 would have been [MH] and [MMH]
respectively. As a result of the shift, these pitches may have been reversed.
8 Possible origin of /M/ tone in Ibukijima
Nairin Phonological Ibukijima Phonological
(Stage 2) analysis analysis
2.2 /LH/ > /HØ/
2.3 /ØH/ or /MH/ > /MØ/
3.4 /ØØH/ or /MMH/ > /ØMØ/
(1985) on the other hand, reports that he did not find such a difference. Uwano’s description is
the only one that includes a large sample of trisyllabic and longer nouns. (The data on
trisyllabic nouns of Kindaichi and Wada are based on no more than one or two example words
per class.)
16 Because of the archaic nature of the tone classes in Ibukijima, Ramsey (1980:68) argued that
the Tōkyō type location of the /H/ tone in Ibukijima may be old. The comparison in (7)
however, shows that the Tōkyō type location of the /H/ tone in Ibukijima is definitely a recent
development.
8 Subclass divisions in proto-Japanese
In my discussion of the tonal developments in the Japanese nouns, I have so far
ignored the distinction between the a and b subclasses which is sometimes posited
for tone classes 1.3, 2.2, 3.2, 3.5 and 3.7. The subclass division posited for tone
classes 2.2, 3.2 and 3.7 is based on dialectal reflexes. The subclass division posited
for tone classes 1.3 and 3.5 on the other hand, is based on unusual tone dot
attestations.
8.1 Subclass divisions based on dialectal reflexes
The subclass division in tone classes 2.2, 3.2 and 3.7 was proposed by Hayata
(1973), based on the fact that some members of these classes have an unexpected Ø
tone reflex in Tōkyō. (Dialects like Numazu, Matsumoto and Hiroshima have reflexes
similar to those of Tōkyō.)
Nouns that have a reflex with Ø tone in these dialects are assigned to subclasses
2.2a, 3.2a and 3.7a, and nouns that have a reflex which includes /H/ tone, are
assigned to subclasses 2.2b, 3.2b and 3.7b.
The reason why this division has not been attested in the Middle Japanese
material is that the distinction is thought to have disappeared from the attested
forms of Middle Japanese before tone dot markings started to be used. In other
words, it is thought that the smaller hypothetical classes 2.2a, 3.2a and 3.7a merged
with the larger classes 2.2b, 3.2b and 3.7b in the attested forms of Middle Japanese
before the late 11th century.
8.1.2 The subclasses 2.2a and 2.2b in Martin’s classification
In Samuel E. Martin’s classification of the tone classes (1987:376-599), the
classification of a noun as belonging to class 3.2a is based on a Ø tone reflex in
Tōkyō. In the case of class 2.2a, however, many nouns that do not have Ø tone in Tōkyō
are nevertheless assigned to this class. This happens when they have Ø tone in the
dialect of Aomori.
In Kobayashi’s description (1975), Aomori is a typical Gairin type dialect, which
has very regularly merged class 2.2 with class 2.1, and on Uwano’s detailed map
(1981) no Chūrin type dialects are indicated north of Yamagata. If Aomori is a
typical Gairin type dialect, a Ø tone reflex is the regular correspondence of class 2.2
and cannot be used to classify a noun as belonging to class 2.2a, but Martin appears
to regard Aomori as a Chūrin type dialect: On p.256 for instance, Martin presents
Aomori, together with Yamaguchi and Tōkyō as a dialect that merges class 2.2 with
class 2.3.
Although there is indeed a small number of nouns of class 2.2 that do not have
the expected Ø tone reflex in Aomori (there are such examples in the Gairin type
dialects of Matsue and Izumo also)1 the dialect of Aomori is clearly a Gairin type
dialect.
The number of nouns that conform to the correspondences that Hayata posited
for class 2.2a is therefore much smaller than would appear from Martin’s
classification. The examples that conform to the correspondences posited by Hayata
are: hito ‘person’, kita ‘north’, tuta ‘ivy’ Tōkyō Ø/', mata ‘again’ Tōkyō Ø/',
are ‘that’, semi ‘cicada’ Tōkyō Ø. All examples have been attested with tone in
Middle Japanese (class 2.2), they have ' tone in Kyōto and all have word-tone A
in Kagoshima.2
8.1.3 The subclasses 2.2a, 3.2a and 3.7a in the standard theory
Hayata analyzes the tone system of Middle Japanese as a pitch-accent system, and
points out that the Ruiju myōgi-shō 類聚名義抄 dialect, like the modern Kyōto type
dialects, had no nouns with accent on the final syllable *SS' or *SSS' (S representing
the syllable). Such final accented nouns, when subject to the rightward shift that
according to the standard theory took place in the Tōkyō type dialects, would have
shifted the accent off the word and yielded unaccented reflexes in the Tōkyō type
dialects. Hayata therefore reconstructs final accent in tone classes 2.2a, 3.2a, 3.5a
and 3.7a in proto-Japanese, realized as [F] pitch on the final syllable.3 (In (1) I have
added Hayata’s reconstruction of the actual pitches after his phonological
representation.)
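The logic of Hayata's proposal, in which a rightward shift moves a final accent off the word, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, only words with a single accent are handled (so class 3.7, which has two accents in Hayata's representation, is left out), and the shift is modelled as moving exactly one syllable to the right.

```python
def shift_accent_right(n_syllables, accent):
    """Shift an accent one syllable to the right.

    n_syllables: number of syllables in the noun
    accent:      1-based index of the accented syllable, or None if unaccented
    Returns the new accent position, or None when the accent moves off the word.
    """
    if accent is None:
        return None
    shifted = accent + 1
    return shifted if shifted <= n_syllables else None


def render(n_syllables, accent):
    """Write a word shape in the S/S' notation used in the text."""
    return "".join(
        "S" + ("'" if position == accent else "")
        for position in range(1, n_syllables + 1)
    )


# Proto-Japanese shapes as given for the single-accent classes.
PROTO = {"2.2a": (2, 2), "2.2b": (2, 1), "3.2a": (3, 3), "3.2b": (3, 2)}

for tone_class, (length, accent) in PROTO.items():
    before = render(length, accent)
    after = render(length, shift_accent_right(length, accent))
    print(tone_class, "*" + before, ">", after)
```

Under these assumptions the a subclasses come out unaccented and the b subclasses come out with final accent, which is the split that the reconstruction is meant to derive.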
At first sight it seems impossible to reconcile this idea with the standard theory:
If the Tōkyō type dialects reflect traits from proto-Japanese that have been lost in the
Kyōto type dialects – traits moreover that were already lost in these dialects by the
time of Ruiju myōgi-shō – this must mean that the Tōkyō type tone system branched
off from the Kyōto type tone system before the 11th century. That is, before the
development of pitch falls in tone classes 2.3, 3.4 and 3.5 (2.3 > , 3.4
> , 3.5 > ). As all Tōkyō type tone systems contain pitch
falls in these tone classes, a derivation from such an early stage is impossible.
1 The examples are as follows (they have all been attested with tone in Middle Japanese, i.e.
class 2.2, and except for kura ‘saddle’, all have word-tone A in Kagoshima): Adi ‘saurel’ (fish),
hime ‘lady, princess’, humi ‘writings’ have ' tone in Aomori. Kura ‘seat, saddle’, kata
‘direction, person’ have ' tone in Aomori. Uta ‘song’, waza ‘trick’, tuka ‘mound’ have
' tone in Matsue and Izumo. Tuta ‘ivy’ has Ø/' tone in Matsue and Izumo. Kare ‘he,
that one’, tugi ‘next’ have ' tone in Matsue and Izumo.
2 The following examples have not been attested in Middle Japanese: miki ‘wine’ Ø/1, nire
‘yew’ Ø/1. Sita ‘below’ 2/Ø has Ø tone in Kyōto instead of the expected ' tone.
3 Hayata does not discuss the a/b distinction for tone class 3.5.
The only solution to the problem that there were no pitch falls yet to be shifted to
the right in Tōkyō at that time, is to support Hattori’s theory about the existence of
/M/ tones in Middle Japanese. (See section 2.3.2.) It is therefore no coincidence that
Hayata adopted Hattori’s idea of /M/ tone in these classes in the same article in
which he proposed these proto-Japanese tonal subclasses.
1 The subclasses 2.2a, 3.2a and 3.7a in the standard theory
Kyōto Proto-Japanese Tōkyō
2.2a hito ‘person’ ' < *SS' ['] >
2.2b hasi ‘bridge’ ' < *S'S ['] > '
3.2a tokage ‘lizard’ '' < *SSS' ['] >
3.2b4 azuki ‘azuki bean’ ' < *SS'S ['] > '
3.7a kusuri ‘medicine’ '' < *'SSS' [''] >
3.7b kabuto ‘helmet’ '' < *'SS'S [''] > '
8.1.4 Subclasses 2.2a, 3.2a and 3.7a in Ramsey’s theory:
Final /R/ tone preceded by /L/ tone in proto-Japanese
What does the a/b distinction in proto-Japanese look like if we follow the reversed
reconstruction of the Middle Japanese tones? We have seen how in the Chūrin type
dialects the /R/ tone of class 1.2 resulted in a Ø tone reflex in the modern dialects.
The obvious way therefore, to explain the loss of the /H/ tone in the type a
subclasses in a number of Chūrin dialects, is to reconstruct these tone classes with
/R/ tone on the final syllable in proto-Japanese. The type a subclasses of tone classes
2.2, 3.2 and 3.7 have in common with each other that the final /R/ tone was preceded
by /L/ tone in proto-Japanese.
In the Chūrin type dialects the /L/ tone of the monosyllabic case particles was
lost after the final /R/ tone, so that there was no longer a drop in pitch after the noun.
This led to the modern Ø tone reflex of classes 2.2a and 3.2a. After tone classes 2.2b
and 3.2b on the other hand, which had a /LH/ tone sequence on the final two
syllables but not final /R/ tone, the drop in pitch after the noun was preserved, which
resulted in the modern reflex with /H/ tone on the final syllable.
In the Gairin type dialects tone spreading occurred after final /R/ tone as well as
after a /LH/ tone sequence on the final two syllables, so that the complete tone
classes 2.2 and 3.2 later developed Ø tone.5 (In (2) class 1.2 has been added for
comparison.)
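The two developments described above can be summarized as a small decision rule. The sketch below is only an illustration of the argument: the function and constant names are hypothetical, the input is limited to the final one or two tones of the proto-Japanese noun as stated in the text, and classes with /H/ tone on the initial syllable (3.7a and 3.7b) are deliberately not covered.

```python
CHURIN = "Chūrin"
GAIRIN = "Gairin"
ZERO = "Ø"
FINAL_H = "final /H/"


def modern_reflex(final_tones, dialect_type):
    """Return the expected modern reflex for a noun without initial /H/ tone.

    final_tones:  the last one or two proto-Japanese tones, e.g. ("L", "R")
    dialect_type: CHURIN or GAIRIN
    """
    if dialect_type not in (CHURIN, GAIRIN):
        raise ValueError("unknown dialect type")
    final_tones = tuple(final_tones)
    if final_tones[-1] == "R":
        # No drop in pitch remains after the noun in either dialect type.
        return ZERO
    if final_tones[-2:] == ("L", "H"):
        # Chūrin keeps the drop after the noun; Gairin spreads the tone.
        return FINAL_H if dialect_type == CHURIN else ZERO
    raise ValueError("tone shape not covered by this sketch")


for label, tones in [("1.2", ("R",)), ("2.2a", ("L", "R")), ("2.2b", ("L", "H"))]:
    print(label, modern_reflex(tones, CHURIN), modern_reflex(tones, GAIRIN))
```

The rule returns Ø for every shape in the Gairin type and separates the a and b subclasses only in the Chūrin type.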
4 The dialect of Kyōto itself has shifted ' to ' (aduki for instance is ' in Kyōto),
so Hayata most likely refers to somewhat more archaic Kyōto type dialects such as the dialects
of Wakayama or Ōsaka.
5 As I have no data on the tone of the a subclasses in the Nairin type dialects, these dialects have
been excluded from the overview. We do know what the developments in the Nairin type
dialects were like for tone class 1.2: There was no tone spreading onto the particle after tone
2 Final /R/ tone preceded by /L/ tone in proto-Japanese
Kyōto MJ ‘Nairin’ Proto-Japanese Tōkyō Tōkyō
Chūrin Gairin
'- - < 1.2 - > Ø Ø
'- - < 2.2a - > Ø Ø
'- - < 2.2b - > '- Ø
' - < 3.2a - > Ø Ø
' - < 3.2b - > '- Ø
' - < 3.7a > Ø '
' - < 3.7b > ' '
The final /R/ tone in subclass 3.7a had the same effect as the other final /R/ tones
that were preceded by /L/ tone, i.e. the pitch fall after the word was lost in the
Chūrin and the Gairin type tone systems. The final /R/ tone in class 3.7a had in
common with the first group that it was preceded by /L/ tone, but an important
difference is that there was also /H/ tone on the initial syllable in this tone class. This
/H/ tone was preserved in the Gairin type dialects, and class 3.7a thus merged with
class 3.6.
The Chūrin type dialects of Tōkyō, Numazu, Matsumoto and Hiroshima on the
other hand, frequently have Ø tone reflexes for this class, just as they have in case of
many members of class 3.6. This means that the final /R/ tone of class 3.7a was
indeed lost in the Chūrin type dialects, leading to a merger with class 3.6. (A
possible explanation for the frequent Ø tone reflex of class 3.6 in the Chūrin type
dialects has been suggested in section 7.1.1.)
8.1.5 Is the distinction between tone classes 3.2a and 3.2b
reflected in the Kyōto type dialects?
I have doubts about the way in which the reflexes of tone classes 3.2a and 3.2b are
presented in Hayata (1973) and Martin (1987) as it is suggested that the distinction
between subclasses 3.2a and 3.2b is also reflected in the Kyōto type dialects. (See
Hayata’s representation in (1).)
I do think that the lack of /H/ tone in a number of Chūrin type dialects for tone
class 3.2a may go back to the former presence of final /R/ tone. However, it is very
unlikely that the split in the reflexes in the Chūrin type dialects can be tied to the
split in the reflexes in the Kyōto type dialects. After all, we do not find the 3.2a/b
subclasses reflected in the attested forms of Middle Japanese (which certainly
class 1.2, and when the /R/ tone was later simplified to /H/, class 1.2 merged with class 1.3.
Although it is possible that in these dialects the final /R/ tone in the a subclasses developed into
final /H/ tone as well, we cannot be certain, as the preservation of the [L] pitch of the case
particle in case of class 1.2 is also related to the fact that monosyllabic nouns have automatic
vowel length in central Japan.
include materials from the old capital). It is therefore hardly plausible that the
present-day mixed reflexes in the Kyōto type dialects represent an a/b subclass
distinction that had already disappeared from the dialect of Kyōto before the 11th
century. As I have argued in section 4.2.2, the mixed reflexes for tone class 3.2 in
the Kyōto type dialects are of a completely unrelated and much later origin.
To illustrate this point I have compared the reflexes of the a and b subclasses of
tone class 3.2 in the Kyōto type dialects of Kyōto, Wakayama, Hyōgo and Kōchi
with the reflexes in the dialect of Tōkyō. The Kyōto and Tōkyō data are from Martin
(1987), the Wakayama data are from fieldwork notes that were collected by S. R.
Ramsey in 1977 and kindly put at my disposal, and the Hyōgo and Kōchi data are
from Hirayama’s dialect dictionary (1960).
I have only included examples for which 上上平 tone dot markings have been
attested in Middle Japanese. The classification of the examples as 3.2a, 3.2b or
3.2a/b (in case of evidence for both a and b) has been adopted unaltered from Martin,
and has been added before each example word, although it can be seen that my own
classification (as evident from the headings) is sometimes different. I have based my
own classification on the reflexes in the dialect of Tōkyō only. Irregular reflexes
have been underlined, such as for instance ' in Wakayama, Hyōgo and Kōchi
(the expected reflex in these dialects is ').
3 The reflexes of 3.2a in the Kyōto type dialects
Kyōto Waka- Hyōgo Kōchi Tōkyō
yama
3.2a mukade ‘centipede’ '' '' '' '' Ø
3.2a tokage ‘lizard’ '' ' '' '' Ø
3.2b turube ‘bucket’ '' x '' '' Ø
3.2b tobira ‘door panel’ ', Ø '' ' ' Ø
3.2a ibara ‘thorn’ '' x x x Ø
3.2a kasiwa ‘oak leaf’ '' x x x Ø
3.2a kibisu ‘heel’ ' x x x Ø
3.2b kadura ‘vine’ ' x x x Ø
3.2a/b asika ‘sea lion’ Ø x x x Ø
3.2a sakura ‘cherry blossom’ Ø Ø Ø Ø Ø6
3.2a akaza ‘chenopodium '' x x x Ø
album’
6 A number of other nouns listed as 3.2a in Martin’s list also have Ø tone in Tōkyō and Kyōto
(just as sakura). They have been omitted from the table as there is no data from other dialects.
The examples are: nikoge ‘downy hair’, odoro ‘thicket’, okera ‘Atractylodes japonica’, tobari
‘curtain’, toboso ‘pivot’, kigawa ‘orange peel’, kohone ‘Nuphar japonicum’, uwami ‘upper
garment’, enoki ‘hackberry’, hanagi ‘nose ring’, huyuge ‘winter fur’.
4 The reflexes of 3.2b in the Kyōto type dialects
Kyōto Waka- Hyōgo Kōchi Tōkyō
yama
3.2b aduki ‘red bean’ ' ' ' '' '
3.2b onna ‘woman’ ', ' x x '
'
3.2b nedoko ‘alcove’ ', x x ''7 '
'
3.2a/b hutari ‘two (people)’ '' '' x x '
3.2a/b hutae ‘double’ ', x x x ',
'' '
3.2b musume ‘daughter’ ' x x x '
3.2b aida ‘interval’ Ø Ø Ø Ø Ø < '8
5 The reflexes of 3.2a/b in the Kyōto type dialects
Kyōto Waka- Hyōgo Kōchi Tōkyō
yama
3.2b kenuki ‘tweezers’ ', ' ' ' ', Ø
'
3.2a/b hutatu ‘two (things)’ ', '' ' '' ', Ø
'
3.2b magusa ‘forage’ ', x x x ', Ø
''
3.2b higasi ‘east’ ' x x x ', Ø
3.2a/b sikimi ‘star anise’ '' x x x ', Ø
3.2b kubiki ‘part of wagon’ Ø x x x ', Ø
3.2b yorube ‘something to Ø x x x ', Ø
depend on’
6 Members of 3.2 with irregular reflexes in Tōkyō
Kyōto Waka- Hyōgo Kōchi Tōkyō
yama
3.2a midori ‘green’ '' '' '' '' '
3.2a tubasa ‘wing’ '' '' '' ' ', Ø
3.2b ekubo ‘dimple’ ' ' ' ' '9
7 This entry is from Kobayashi (1975).
8 Martin (1987:378) adduces kono aida' in Tōkyō as evidence for an earlier ' reflex in
Tōkyō.
9 The Tōkyō type dialects of Sapporo, Aomori, Shizukuishi, Matsumoto, Numazu, Hamada and
Kyōto Waka- Hyōgo Kōchi Tōkyō
yama
3.2b oroka ‘stupid’ ' x x x '
3.2b mikosi ‘palanquin’ ' x x x ', Ø
3.2b atari ‘vicinity’ ' x x x '
3.2b arame ‘seaweed’ ' x x x '', Ø
3.2b huguri ‘testicles’ Ø x x x '
3.2a hogeta ‘sailyard’ Ø x x x ', Ø
If the mixed reflexes of class 3.2 in the Kyōto type dialects really were to go back to
a subclass distinction in proto-Japanese, we would expect the different Kyōto type
dialects to agree with each other as to which nouns of class 3.2 have which reflex. A
comparison of (7) and (8) below however, shows that the reflexes are too irregular to
claim that such an agreement exists.
7 Examples where the reflexes of class 3.2 in the Kyōto type dialects
agree with each other
Kyōto Wakayama Hyōgo Kōchi
3.2a mukade ‘centipede’ '' '' '' ''
3.2a midori ‘green’ '' '' '' ''
3.2b turube ‘bucket’ '' x '' ''
3.2b onna ‘woman’ ', ' ' x x
3.2b kenuki ‘tweezers’ ', '' ' ' '
8 Examples where the reflexes of class 3.2 in the Kyōto type dialects
do not agree with each other
Kyōto Wakayama Hyōgo Kōchi
3.2a tokage ‘lizard’ '' ' '' ''
3.2a/b hutatu ‘two’ ', '' '' ' ''
3.2b aduki ‘red bean’ ' ' ' ''
3.2b tobira ‘door panel’ ', Ø '' ' '
3.2b nedoko ‘alcove’ ', ' x x ''
3.2a tubasa ‘wing’ '' '' '' '
For the split in Kyōto to be related to the hypothetical a/b subclass distinction that is
thought to have left a mark in the Tōkyō type dialects, there also needs to be a
convincing measure of agreement between the two reflexes in Kyōto and an
unaccented or an accented reflex in Tōkyō:
Hiroshima all also have ' tone for this noun. Kagoshima has B instead of the expected A.
We see that '' tone in Kyōto has the expected Ø tone reflex in Tōkyō in 6
cases (mukade, ibara, kasiwa, turube, akaza, tokage). In 4 cases there is a mixed
reflex, namely Ø tone as well as /H/ tone (tubasa, kenuki, magusa, sikimi). In 2 cases
there is a reflex with /H/ tone (midori (irregular accent) and hutari). In other words,
the number of examples that agree is 6 and the number of examples that do not agree
is also 6.
Furthermore, ' (>') tone in Kyōto has the expected reflex with /H/
tone on the final syllable in Tōkyō in 4 cases (aduki, onna, musume, nedoko). In 3
cases there is an irregular reflex with /H/ tone (ekubo, oroka, atari). In 3 cases there
is a mixed reflex, namely Ø tone as well as /H/ tone (mikosi (irregular), arame
(irregular), higasi). In 3 cases there is a Ø tone reflex (kadura, kibisu, tobira). Even
if we count the irregular reflexes with /H/ tone as examples where the
correspondences between Kyōto and Tōkyō agree, the number of examples that
agree is only 7 and the number of examples that do not agree is 6.
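The two tallies can be checked mechanically against the word lists given in the two preceding paragraphs. In the sketch below the words are copied from the text; the variable names and dictionary keys are labels chosen for the illustration, not terminology from the sources.

```python
# Kyōto reflex for which Ø tone is the expected correspondence in Tōkyō,
# grouped by the reflex actually found in Tōkyō.
KYOTO_GROUP_EXPECTING_ZERO = {
    "zero": ["mukade", "ibara", "kasiwa", "turube", "akaza", "tokage"],
    "mixed": ["tubasa", "kenuki", "magusa", "sikimi"],
    "high": ["midori", "hutari"],
}

# Kyōto reflex for which final /H/ tone is the expected correspondence.
KYOTO_GROUP_EXPECTING_FINAL_HIGH = {
    "final_high": ["aduki", "onna", "musume", "nedoko"],
    "irregular_high": ["ekubo", "oroka", "atari"],
    "mixed": ["mikosi", "arame", "higasi"],
    "zero": ["kadura", "kibisu", "tobira"],
}


def tally(group, agreeing_keys):
    """Return (agreeing, non-agreeing) word counts for one Kyōto reflex."""
    agree = sum(len(words) for key, words in group.items() if key in agreeing_keys)
    total = sum(len(words) for words in group.values())
    return agree, total - agree


zero_counts = tally(KYOTO_GROUP_EXPECTING_ZERO, {"zero"})
high_counts = tally(
    KYOTO_GROUP_EXPECTING_FINAL_HIGH, {"final_high", "irregular_high"}
)
print(zero_counts)  # (6, 6)
print(high_counts)  # (7, 6)
```

Counting only the regular final /H/ reflexes as agreeing, by passing `{"final_high"}` as the second argument, lowers the second result to 4 agreeing against 9 non-agreeing examples.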
Even without this comparison, on purely theoretical grounds, the idea that the
two proto-Japanese subclasses 3.2a and 3.2b could have been reflected in the Kyōto
type dialects was unlikely. The conclusion therefore, that there is no clear
correlation between '' in Kyōto and Ø tone in Tōkyō, or between '
(>') in Kyōto and final /H/ tone in Tōkyō, does not come as a surprise.
8.2 Subclass divisions based on tone dot attestations
The acknowledgement of a subclass division in tone classes 1.3 and 3.5 is based on
a number of unusual tone dot attestations: There is one large class (1.3a) which is
marked with the ping tone dot, and one small class (1.3b) which is marked with the
qu tone dot. Similarly there is one large class (3.5a) in which the final syllable is
marked with the shang tone dot, and one small class (3.5b) in which the final
syllable is marked with the light ping tone dot. There are no clear reflexes in the
modern dialects to support the a/b subclass division in tone class 1.3. (It can be seen
in section 1.3.1 though, that the reflexes of 1.3b in Kyōto and Kōchi are rather
mixed and do not coincide with the reflexes of class 1.3a in all cases.) As for the a/b
subclass division in tone class 3.5, the possible existence of a reflex of this division
in the modern dialects will be discussed in sections 8.2.2 to 8.2.5.
8.2.1 Tone class 1.3b: /F/ tone in Middle Japanese
Class 1.3b was distinguished from tone class 1.3a until the 13th century, at least in
central Japan, and was marked with the falling qu tone, while members of class 1.3a
were marked with the ping tone. Judging from the tone dot material these two
classes merged in the 13th century.
According to Hyman’s universals of tone rules (2007), when a contour tone is
followed by a like tone (such as when class 1.3b was followed by a case particle
with /L/ tone), progressive absorption is likely to apply, leading to the loss of the /F/
contour toneme: /F-L/ > /H-L/. In case of class 1.3b, this would have led to a merger
with class 1.3a.
If we compare this with the developments in class 1.2, an important difference is
that the /R/ contour tone of this class was not followed by a like tone when a case
particle was attached. This would have delayed the disappearance of this tone. And
even when the rise to [H] pitch was shifted onto the attached case particle, the
realization of the noun itself may have been [L], but the phonological distinction
with class 1.1 was preserved. This development can be seen in the MJ ‘Chūrin’ tone
dot material, where tone classes 1.1 and 1.2 now contrasted in the following way:
1.1 - /L/, 1.2 - /R/. The contour tone could be resolved on the phonetic level,
without resulting in a merger with class 1.1, which played an important role in the
fact that class 1.2 has been preserved as a distinct tone class in a number of modern
Japanese dialects.
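The contrast between the two contour tones can be illustrated with a minimal sketch of progressive absorption. The decomposition of /F/ into H plus L and of /R/ into L plus H is the usual one; the function name and the single-letter encoding are assumptions made for the example, and the sketch models only the absorption rule, not the later shift of the rise onto the particle.

```python
# Contour tones written as (first part, last part).
CONTOURS = {"R": ("L", "H"), "F": ("H", "L")}


def absorb(tone, following):
    """Apply progressive absorption to one tone before a following tone.

    A contour tone loses its last part when the following tone is identical
    to that part; level tones and other sequences are left unchanged.
    """
    if tone in CONTOURS:
        first_part, last_part = CONTOURS[tone]
        if last_part == following:
            return first_part
    return tone


print(absorb("F", "L"))  # H   (class 1.3b before a particle with /L/ tone)
print(absorb("R", "L"))  # R   (class 1.2 before a particle with /L/ tone)
```

Because the case particles have /L/ tone, only the falling contour meets the condition for absorption, while the rising contour is left intact.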
For the falling tone contour of class 1.3b it was not possible to have an effect on
the pitch of the monosyllabic case particles, as these already had /L/ tone. The main
reason then, why class 1.3b left no (clear) trace in the modern dialects, while class
1.2 did, is that the monosyllabic case particles in Middle Japanese happened to
have /L/ tone.
9 The reason why class 1.3b left no trace in the modern dialects
MJ ‘Nairin’ Totsukawa MJ ‘Chūrin’/ Tōkyō Ōita
Nairin MJ ‘Gairin’ Chūrin Gairin
1.1 :- > :- - > - -
1.2 :- > :'- - > - -
1.3a :- > :'- - > '- '-
1.3b :- > :'- - > '- '-
In principle, the few monosyllabic particles with /H/ tone such as mo ‘also’, zo
(emphasis), to (quotation) could have been influenced by the /F/ contour of tone
class 1.3b, but in practice these /H/ tones seem to have been more resistant to
assimilation. This fits into the more general Middle Japanese pattern, in which there
is /H/ tone spreading, but not yet /L/ tone spreading: Within the word boundary for
instance, we have also seen that developed into quite early on, while
the opposite development of to had still not taken place by the Middle
Japanese period. This hierarchy in the occurrence of the two types agrees with one
of the universals of tone rules, as /H/ tone spreading occurs in many languages that
do not have /L/ tone spreading, while the opposite is very rare. (Cf. section 4.5.)
In the central Japanese dialects where all monosyllables were automatically
lengthened, tone classes 1.3a and 1.3b most likely merged as [F] instead of [H],
forming a symmetrical contrast with class 1.2. The fact that the disappearance of the
/F/ vs. /H/ distinction coincided with the development of accent-like /H/ tones in the
13th century is also an argument in favor of a merger as [F], as in restricted tone
systems, the accent-like /H/ tone will often change to a contour tone [F] in order to
become more prominent. (In many modern Japanese dialects this is still the case,
especially in phrase-final position.) The main argument, however, for a merger as [F]
in the Nairin type dialects is found in the merger pattern of those Kyōto type dialects
that developed from a Nairin type tone system.
The fact that tone class 1.2 developed /H/ tone and not /L/ tone after the shift
suggests that the tone shift occurred at the moraic level in these dialects. I
reconstruct the merged class 1.3a/b at the intermediate stage with [F] pitch, as this is
the easiest way to explain the /L/ register of this class after the shift. The
developments (represented in moras) are shown in (10).
10 The development of the monosyllabic nouns in Kyōto
        MJ ‘Nairin’       Intermediate stage       Kyōto
1.1     -                 -  /Ø-Ø/              >  -  /Ø-Ø/
1.2     -                 -  /R-Ø/              >  -  /H-Ø/
1.3a    -              >  -  /H-Ø/              >  '- /L-Ø/
1.3b    -                 -  /H-Ø/              >  '- /L-Ø/
8.2.2 Tone class 3.5b (and tone class 2.5):
Final /R/ tone preceded by /H/ tone in Middle Japanese
The distinction between two subclasses for tone class 3.5 was proposed by Hattori
(1951), based on the existence of 平平東 as well as 平平上 tone dot markings in
Middle Japanese. Kindaichi (1964c:350) gives the following examples: akidu
‘dragonfly’ 平平東 in the Yūryaku-ki of Nihon shoki 雄略紀日本書紀, awoto ‘blue
grindstone’ 平平東 in the Tosho-ryō-bon of Ruiju myōgi-shō 図書寮本類聚名義抄,
(marked as 平平上 in the Kanchi-in-bon 観智院本), hirome ‘seaweed’ 平平東 in
the Tosho-ryō-bon of Ruiju myōgi-shō (marked as 平平上 in the Kanchi-in-bon),
hitohe ‘single layer’, marked as both 平平東 and 平平上 in the Tosho-ryō-bon of
Ruiju myōgi-shō, himidu ‘ice water’ 平平東 in the Tosho-ryō-bon of Ruiju myōgi-
shō, tamaki ‘arm ornament’, marked as both 平平東 and 平平上 in the Tosho-ryō-
bon of Ruiju myōgi-shō and as 平平東 (Martin 1987:540) in Konkōmyō saishōō-kyō
ongi 金光明最勝王経音義.10
We can add to this list a number of nouns of class 3.5 which have been attested
with 平平上 tone dots, but where the particle no attaches with a ping tone. Hayata
(1983:34-41) lists the following examples: The place names Kahara-no, Asano-no,
Yamato-no, Takatsu-no, Ahumi-no, nisiki-no ‘brocade’, ikuha-no ‘archery target’,
inoti-no ‘life’, wosiro-no ‘white tailed’.
10 We see that when there are double attestations of these examples, they always concern 平平上
(3.5) markings and never 平平平 (3.4) markings. It therefore makes sense to treat the 平平東
examples as a subclass of tone class 3.5. I therefore do not understand why Martin (1987:189)
assigns these examples to a separate tone class which he labeled 3.4a. (Although hirome, hitohe
(1987:628) and akidu (1987:630) are listed by Martin as 3.5b as well as 3.4a.)
The markings of this tone class have two points in common with those of tone
class 2.5; the markings are rare and the /R/ tone on the final syllable is preceded by
/H/ tone. An important difference, however, is that the distinction of a separate tone
class 2.5 is beyond doubt, despite the paucity of tone dot attestations, as it is based
on modern dialect reflexes as well as old tone dot attestations.
This makes it doubly interesting that a distinction into two subclasses, 3.5a and
3.5b is included in Martin’s list of tone classes for nouns, based on a split in the
reflexes of this tone class in Tōkyō: Tone class 3.5a has a ' reflex or Ø tone in
Tōkyō, 11 while tone class 3.5b has an unexpected ' reflex in Tōkyō. (The
' tone is unexpected because, unlike the ' reflex of tone class 3.5a, it
does not correspond regularly to the reflexes in the Kyōto type dialects. The Kyōto
type dialect of Kōchi for instance, has ' for both subclasses.)
This means that there is the possibility that the distinction between and
markings in Middle Japanese is still reflected in the modern Tōkyō type
dialects and that the existence of a separate tone class in Middle Japanese can
be confirmed by modern dialect data, just as the existence of a tone class (2.5)
is confirmed by modern dialect data.12
A difference here is that the link between the markings in Middle Japanese
and tone class 2.5 of the modern Kyōto type and Noto type dialects is clear, while
the link between the markings in Middle Japanese and an unexpected '
reflex in Tōkyō is much less clear. This is because it is difficult to determine the
present-day Tōkyō reflexes of the words marked as 平平東 (or 平平上-平 with the
particle no): The markings are rare, the example nouns are often uncommon and
several examples are compound nouns, which often have irregular reflexes in any
case.
Examples in which the special Middle Japanese markings and a ' reflex in
Tōkyō nevertheless coincide are: tamaki ‘arm ornament’ (Tōkyō ', Kyōto
'), Yamato (Tōkyō, ', Kyōto, ') nisiki ‘brocade’ (Tōkyō,
Matsumoto, Numazu, Ōita, ' Aomori, Akita ', Kyōto ') and inoti
‘life’ (Tōkyō ', Kyōto ', Aomori, Akita, Numazu, Matsue, Izumo,
Hiroshima, Yamaguchi, Ōita, ').13
11 The number of reflexes with Ø tone in the Tōkyō type dialects for tone class 3.5a is large, but
not larger than that for many nouns of tone class 3.6 and I consider the ' reflex in the
Tōkyō type dialects as regular here.
12 Matsumori (2001:100) notices a split in the reflexes of tone class 3.5 in a number of dialects in
or along the Seto Inland Sea (such as Ibukijima, Shishijima and Marugame), which may be
connected.
13 Examples that do not agree are: akidu ‘dragonfly’ (Tōkyō Ø, Kyōto '), awoto ‘blue
grindstone’ (Tōkyō, Kyōto Ø), Kahara (Tōkyō, Kyōto Ø) and hitohe ‘single layer’ (Tōkyō,
Aomori, Matsue, Izumo ' (possibly ' > ' due to devoiced vowel in the first
If there truly is a connection between this small group of nouns with special
markings in Middle Japanese and the unexpected ' reflex of tone class 3.5b in
Tōkyō, how did this connection come about?
8.2.3 The reason why /R/ tone preceded by /H/ tone
is still attested in Middle Japanese
The final /R/ tone in classes 2.2a, 3.2a and 3.7a is thought to have disappeared
before the Middle Japanese period. The final /R/ tones in classes 3.5b and 2.5 on the
other hand are still attested in the 11th century. An important difference between
subclasses 3.5b and 2.5 and subclasses 2.2a, 3.2a and 3.7a, is that the final /R/ tone
in classes 2.2a, 3.2a and 3.7a was preceded by a like tone, so that regressive
absorption applied: * > (2.2a), * > (3.2a), and * >
(3.7a). In subclasses 3.5b and 2.5 on the other hand, final /R/ tone was preceded by
/H/ tone. As unlike preceding tones provide protection against absorption of the
onset of the contour tone, the preceding /H/ tone had the effect of delaying the
disappearance of the final /R/ tone in classes 2.5 and 3.5b.
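The contrast between the two groups of classes can be stated as one conditioned rule. The Python sketch below is only an illustration of that statement: the function name and the one-letter-per-syllable tone strings are assumptions made for the example, and the input patterns are schematic, not reconstructed forms. It encodes nothing more than the claim that a final /R/ loses its low onset after /L/ (giving /H/), while a preceding /H/ protects it.

```python
def absorb_final_rise(tones: str) -> str:
    """Apply regressive absorption to a word-final /R/ tone.

    Tones are written one letter per syllable (L, H, R). A final R loses
    its low onset after L (a like tone) and becomes H; after H it is kept.
    """
    if len(tones) >= 2 and tones.endswith("R") and tones[-2] == "L":
        return tones[:-1] + "H"
    return tones


# Schematic patterns only (assumed for illustration, not attested forms).
for pattern in ("LR", "LLR", "HR", "HHR"):
    print(pattern, ">", absorb_final_rise(pattern))
```

On this sketch, patterns ending in /L R/ simplify, whereas /H R/ and /H H R/ (the shapes assumed here for classes 2.5 and 3.5b) are left unchanged.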
8.2.4 Were tone classes 3.5b and 2.5 larger than the small number
of attestations in Middle Japanese would make us believe?
Although the use of the light ping tone dot was limited to a small number of
relatively early texts, this does not mean that the final /R/ tone of classes 2.5 and
3.5b had disappeared from Middle Japanese as a phoneme.14 In later texts, the effect
of the /R/ tone on the tone of attached particles can still be discerned (cf. sections
1.3.2 and 1.3.3) and the distinct tone class 2.5 that was marked with this tone still
survives to this day in a number of the modern dialects. There are indications that
tone classes 2.5 and 3.5b may, in fact, have been larger than the small number of
surviving light ping tone dot attestations would make us believe:
As we have seen, in 11th century Kyōto the particle no generally attached [L]
after /L/ and [H] after /H/. But according to Sakurai (1976:302, 305) when no was
attached to native nouns of the pattern 平上 (2.4) or 平平上 (3.5a) it was marked
with the ping tone (cf. -, -). In order to explain this behavior he
proposed the following rule (for which I have reversed Sakurai’s /H/ and /L/): In a
word that starts with /H/ tone the particle no mirrors the tone of the initial syllable,
unless at least two /L/ tones intervene. (This condition was added in order to explain
why the particle no was not [H] after tone class 3.6: -.)
Sakurai’s rule raises a number of questions: Has the particle no developed a
whole new attribute here, and can it suddenly not only mirror the tone of the final
syllable, but that of the initial syllable as well? But only if that syllable has /H/ tone,
and only if the word is a native noun?
syllable), Kyōto '). Examples that are not attested in Tōkyō are: hirome ‘seaweed’
(Tōkyō x, Kyōto Ø/') and himidu ‘ice water’ (Tōkyō x, Kyōto Ø). For the following
examples I have found no modern attestations: ikuha ‘archery target’, wosiro ‘white tailed’ and
the place names Asano, Takatsu and Ahumi.
14 See section 9.4.2 of part II for a possible explanation for the abandonment of the use of the
light ping tone dot to mark the tones of Japanese.
Hyman (1978) stresses that in all tonally induced change, the motivating tone
must be adjacent to the affected tone: It is not possible for a /L/ tone to lower any
/H/ tone two syllables later. Even in cases where /LHHH/ becomes /LLLH/, it is
necessary to postulate an intermediate historical stage /LLHH/, in conformity with
the adjacency principle.
A change converting /LHHH/ to /LHLH/ is ruled out, which also rules out
Sakurai’s explanation of why the tone of the particle no after tone classes 2.4 and
3.5a was [H]. As far as the different tone of the particle no after tone class 3.6 is
concerned, according to Hyman’s principle of adjacency, the presence of two
intervening /L/ tones or only one would have made no difference. In both cases the
assimilation of the tone of the particle no to that of the initial syllable would have
been prevented, as the two would not have been adjacent.15
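The adjacency argument can be made concrete with a small search over tone strings. The following Python sketch is a simplification under stated assumptions: it models only assimilatory changes in which one syllable takes over the tone of an immediately adjacent syllable, and the function names are hypothetical. It is not an implementation of Hyman's typology as a whole.

```python
from collections import deque


def one_step_changes(tones):
    """Yield every pattern in which one syllable assimilates to a neighbour."""
    for i, tone in enumerate(tones):
        neighbours = set()
        if i > 0:
            neighbours.add(tones[i - 1])
        if i < len(tones) - 1:
            neighbours.add(tones[i + 1])
        for new in neighbours - {tone}:
            yield tones[:i] + new + tones[i + 1:]


def reachable(source, target):
    """True if target follows from source by a chain of adjacent assimilations."""
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        for nxt in one_step_changes(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Under these assumptions /LHHH/ cannot become /LLLH/ in a single step, but it can do so through the intermediate stage /LLHH/, whereas /LHLH/ cannot be derived from /LHHH/ at all, because the lowered syllable never has an adjacent /L/ at the moment it would have to change.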
Another problem with Sakurai’s idea is that when the particle no attaches to
Chinese loanwords with 平上 or 平平上 markings, it displays the normal behavior
of the particle no and mirrors the tone of the final syllable. (Cf. Seu-no ‘a type of pan
flute’, niu-no ‘milk’ 平上-上 and rakuda-no ‘camel’ 平平上-上).16
The most likely explanation is therefore the idea put forward by Martin
(1987:173-174), namely that tone class 2.5 may once have been larger than it is
nowadays. When no attached with a ping tone in Middle Japanese after nouns that
are regarded as members of class 2.4 based on the modern reflexes (the examples are
zeni ‘money’, kari ‘wild goose’, kata ‘shoulder’, hune ‘boat’, mugi ‘wheat’, ine
‘rice’, uri ‘melon’, yado ‘shelter’, aha ‘millet’, ima ‘now’, kibi ‘millet’, kinu ‘silk
garment’) he regards this as an indication that these nouns still belonged to tone
class 2.5 at the time. Some of these examples (like the last five: yado, aha, ima, kibi,
kinu) indeed belong to class 2.5 in a number of modern Kyōto type dialects. Some of
the others (kari, mugi, ine, uri) fit well into the profile of class 2.5, which consists
for the largest part of the names of small animals and plants.
Extending the same reasoning to Sakurai’s examples of nouns of class 3.5 to
which the particle no attached with [H] pitch, means that Sakurai’s examples must
have belonged to subclass 3.5b ( tone).17
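The two competing accounts of the tone of the particle no can be compared side by side. The Python sketch below is illustrative only. It assumes that ping is read as /H/ and shang as /L/, that a rising tone /R/ ends on high pitch, and that class 3.6 has the shape /HLL/ (inferred from the wording of Sakurai's condition); the function names and the schematic tone strings are likewise assumptions, not forms from the source.

```python
def mirror_final(tones):
    """Particle no copies the pitch on which the final syllable ends.

    Tones are written one letter per syllable; a rising tone R ends high.
    """
    return "H" if tones[-1] in ("H", "R") else "L"


def sakurai_rule(tones):
    """Sakurai's rule as paraphrased in the text (with /H/ and /L/ reversed)."""
    if tones[0] == "H" and tones[1:].count("L") < 2:
        return "H"   # no mirrors the initial /H/
    return mirror_final(tones)


# Native nouns marked ping-shang / ping-ping-shang: no is attested with [H].
print(sakurai_rule("HL"), sakurai_rule("HHL"))   # read as final /L/
print(mirror_final("HR"), mirror_final("HHR"))   # read as final /R/
# Loanwords with the same markings: no is attested with [L].
print(sakurai_rule("HL"), mirror_final("HL"))
```

Both accounts yield [H] for the native nouns, but only the plain mirroring rule also yields [L] for the loanwords, provided the native nouns are taken to end in /R/ rather than /L/.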
15 In the modern Kyōto dialects the no /H/ tone cancellation rule does work across syllables, but
the tone of the particle no cannot be said to be the motivating tone as no has no tone of itself.
Historically it was the other way around: the motivating tone (on the final syllable) was
adjacent to the affected tone (the tone of the particle no).
16 Kindaichi (1964) suggests (surprisingly enough) that this means Chinese loanwords were better
integrated into the Japanese language than native Japanese words. However, when no attaches
to the Japanese words 2.4 kai ‘paddle’ in Kokin waka-shū and 3.5a Asuka (a place name) in
Kokin kunten-shō, the markings are 平上-上 and 平平上-上, casting doubt on the idea that the
difference was based on a distinction between native words and loanwords.
17 Sakurai’s examples (1976:287-289), taken from manuscripts of Nihon shoki 日本書紀 and
from the Ōei-bon of Nihon shoki shi-ki 応永本日本書紀私記, are: yamato ‘Yamato’ (Tōkyō
ya'mato), kokoro ‘heart’ (Tōkyō koko'ro), awano ‘millet field’ (no dialect data) and tamade
8.2.5 The developments in class 3.5b in Tōkyō
The reflex of tone class 3.5b in Tōkyō (/H/ tone on the first syllable instead of the
expected second syllable) can now be explained: In tone class 3.5b the low onset of
the final /R/ tone could be shifted backwards onto the ‘spare’ second syllable:
> . Tone class 3.5b thus merged with tone class 3.7 as . Later
the final /H/ tone in the merged tone class 3.5/7 was lost: > .
The difference between classes 2.5 and 3.5b is that in tone class 2.5 there was no
room to shift the low onset of the /R/ tone backwards onto the preceding syllable,
and so the final rise in the Tōkyō type dialects was lost without leaving a trace. The
final /R/ tone of class 3.5b may therefore have left a mark in the Tōkyō type dialects
(in the unexpected location of the /H/ tone), whereas the final /R/ tone in class 2.5
has not.
The developments from proto-Japanese to modern Tōkyō are shown in (11). (As
tone class 2.4 relates to tone class 2.5 in the same way that tone class 3.5a relates to
tone class 3.5b, I have added class 2.4 for comparison.)
11 The development of initial /H/ tone in class 3.5b in Tōkyō
Proto-Japanese Intermediate Tōkyō
stage
2.4 - '-
2.5 - > - > '-
3.5a - > '-/Ø18
3.5b - > - > '-
I can only explain the fact that 2.5 did not disappear as a distinct tone class in central
Japan (i.e. in the Kyōto type and the Noto type dialects) whereas tone class 3.5b did,
by assuming that the vowel length that was required to support the word-final
contour tone disappeared earlier in longer nouns.
Take the example of Nozaki: In this dialect tone class 3.5b has not been
preserved as a separate class. The early loss of vowel length in the final syllable led
to the loss of the final contour tone. Class 2.5, in which the loss of vowel length in
the final syllable occurred much later, has still been preserved. When the final vowel
in this class eventually shortened as well, the rise in pitch of the final /R/ tone spread
onto the particle. The final /R/ tone is now realized with [L] pitch, followed by a
floating [H] tone on the particle. (It is likely that at an earlier stage, this was also the
realization of the final /R/ tone in class 3.5b in this dialect.) In the case of tone class 1.2
on the other hand, the vowel length was preserved, and tone spreading to the particle
did not occur. (See section 6.2.6.)
‘beautiful hand’ (no dialect data).
18 /H/ tone on the second syllable in trisyllabic nouns has the tendency to shift to Ø in Tōkyō.
(See section 7.1.1).
This suggests that there is indeed a hierarchy in the occurrence of tone spreading
after a final contour tone, depending on the number of syllables of the noun. The
relation with word length strongly suggests that loss of vowel length in the final
syllable was one of the factors that caused the tone spreading.
8.3 Were the final /R/ tones an innovation of central Japan?
There has been a debate as to whether the distinct tone class 2.5 is an innovation of
the central Japanese dialects, as Tokugawa (1962) has argued, or already
formed part of proto-Japanese. Ramsey’s theory has no bearing on this issue: The
leftward tone shift in Kyōto is an innovation, but this does not mean that the Kyōto
type dialects are innovative in general. It is true that the distinction of a separate
class 2.5 is particularly common among the Kyōto type dialects, but as I have shown
in section 4.2.4, this is because the leftward tone shift facilitated the preservation of
this tone class. (Just as it facilitated the preservation of the distinct tone classes 1.2
and 3.7.)
The distinction as such was clearly not an innovation of the Kyōto type dialects,
as the separate class already existed in Middle Japanese, before the leftward tone
shift created the Kyōto type tone systems. The preservation of the distinction in the
Tōkyō type dialects of the Noto peninsula also shows that the distinction is not just
an innovation of the Kyōto type dialects. Although it is possible that tone class 2.5 is
the result of an innovation that spread to a larger area than the present-day spread of
the Kyōto type tone system, I do not find this likely: The MJ ‘Gairin’ type tone dot
material after all includes this distinction, which makes me believe that it must have
formed part of proto-Japanese.19
In the majority of cases, the final rise in tone class 2.5 was most likely the result
of a suffix with /H/ tone that merged into the word stem. Akinaga (1972:5) has
remarked upon the large number of animal names included in class 2.5, and this
observation has been connected to the former existence of a suffix with /H/ tone
which was attached to animal names (cf. Kortlandt, 1993:60). The large number of
names of (small) animals is indeed striking, but 2.5 includes a remarkable number of
names of plants as well. The suffix may have had a diminutive meaning, and may
have played a role in name giving.20
19 The MJ ‘Gairin’ type tone system most likely reflects the 13th century tone system of the area
with Gairin type tone in southern Chūbu (the old provinces of Mikawa, Tōtōmi and Shinano).
20 In some cases, such as hebi ‘snake’ the /HR/ tone pattern most likely originated from loss of a
medial vowel (cf. Kortlandt, 1997: 60). In other cases such as wosa ‘elder’, toga ‘blame,
offence’, tuto ‘early in the morning’ and haya ‘quickly’ a nominalizing suffix may have been
involved (cf. Vovin, 2008:146-147). Vovin reconstructs this suffix as *-m. If the suffix
originally included a vowel (*-mV), and had /H/ tone, it could have generated /R/ tone on the
final syllable of these examples.
When attached after nouns that ended in /H/ tone, this suffix would have left no
trace in the tone system.21 When attached after nouns that ended in /L/ tone, such as
tone classes 2.1 and 3.1, it would have yielded , i.e. class 2.2a (nire ‘yew’, tuta
‘ivy’, semi ‘cicada’) and , i.e. class 3.2a (mukade ‘centipede’, tokage ‘lizard’,
ibara ‘thorn’, kasiwa ‘oak leaf’, kadura ‘vine, creeper’, sakura ‘cherry blossom’,
akaza ‘chenopodium album’). As outlined in section 8.2.3, this tonal context (/R/
preceded by /L/) facilitated an early loss of the final /R/ tone.
When the suffix attached to monosyllabic nouns with /L/ tone such as class 1.1
(yielding class 1.2) the /R/ tone survived long enough to leave a trace in the Nairin
type dialects and in the Kyōto type dialects. Mi ‘snake’, ne ‘rat’, i ‘boar’, u
‘cormorant’, ha ‘leaf’, e ‘branch’ and ha ‘feather’, which are members (or suspected
members) of tone class 1.2, may be the result of the same suffix merging into nouns
of class 1.1.
When the /R/ contour tone was preceded by /H/ tone, such as when the suffix
attached to nouns of class 2.4 (yielding class 2.5), it survived to be still attested in the
Noto dialects, and to leave a trace (in the shape of word-final /H/ tone) in the Kyōto
type dialects.
In section 8.2.5 I have mentioned the possibility that the unexpected location of
the /H/ tone of class 3.5b (on the first syllable instead of on the second) in Tōkyō
may be related to the former presence of /R/ tone on the final syllable. If this
connection is valid, this would be another indication that class 2.5 was not an
innovation limited to the dialects of central Japan, as the same suffix with /H/ tone
was no doubt involved in the formation of members of class 3.5b like akidu
‘dragonfly’ and hirome ‘seaweed’.
Summarizing, we can say that the final /R/ tones of proto-Japanese were most
likely the result of the merger of some suffix with /H/ tone into the word stem. The
syllables with /R/ tone were probably originally all lengthened in order to
accommodate the contour tone.
If they were preceded by /L/ tone they simplified to /H/ tone before the Middle
Japanese period, at least in the attested forms of Middle Japanese. It is possible that
in other dialects they caused the tone of attached case particles to become [H], which
could explain the reflexes with Ø tone that can be found in some Chūrin Tōkyō type
dialects for subclasses 2.2a and 3.2a.
When preceded by /H/ tone, /R/ tone was eventually also simplified, but this
happened later; late enough for tone class 2.5 (but also tone class 3.5b) to be still
attested with the light ping tone dot on the final syllable in Middle Japanese.
In some dialects /R/ was preserved as a phoneme in this context, but realized
with [L] pitch on the final syllable, and [H] pitch on the attached particle. The
simplifications most likely started with the longer nouns. Shortening of long vowels
often occurs first in longer words, and loss of the syllabic support of the /R/ contour
tones most likely formed the trigger for the loss of the /R/ toneme.
21 There is a possibility that it left a trace in the Ryūkyūs in the shape of vowel length. See section
9.7.1.3.
In the Tōkyō type dialects (except the Noto dialects) the floating [H] tone on the
particle was eventually lost, and with it the /R/ toneme. Before this happened, the
leftward tone shift in the Kyōto type dialects transformed final /R/ tone to final /H/
tone, ensuring the survival of the distinct class 2.5 in the Kyōto type dialects to this
day.
In some dialects the vowel shortening never affected the monosyllabic nouns: In
the Nairin dialects for instance vowel length in monosyllabic nouns was preserved
and prevented the /R/ tone of class 1.2 from being realized as a [L] tone with a
floating [H] tone on the particle. When the /R/ toneme was finally lost in these
dialects as well, the pitch fall after the noun had still been preserved. This made
class 1.2 develop /H/ tone, so that it merged with class 1.3.
8.4 Restrictions to the location of /F/ and /R/ in Middle Japanese
The two Middle Japanese contour tones /R/ and /F/ were rare, and most likely the
result of contractions. In the Middle Japanese tone system /F/ was almost completely
limited to the initial syllable and to monosyllables, and /R/ was almost completely
limited to the final syllable and to monosyllables. In proto-Japanese, the occurrence
of these contour tones may not have had these restrictions. There are some
universals of tone rules concerning contour tones that may explain the elimination of
/F/ in other than the initial syllable and the elimination of /R/ in other than the final
syllable.
According to the overview of tonal contexts that induce loss of contour tones in
Hyman (2007), progressive absorption is likely to occur when /R/ tone is followed
by /H/ tone: /RH/ > /LH/. In this tonal context therefore, the likelihood of /R/ being
lost is high. Furthermore, we have seen that by the Middle Japanese period, /L/ after
/LH/ was no longer allowed within the word. The same prohibition would have
caused /R/ tone followed by /L/ tone to develop into /RH/, after which progressive
absorption would have eliminated the /R/ tone: /RL/ > /RH/ > /LH/. These
developments taken together explain the lack of initial /R/ tone in Middle Japanese.
In the case of /H/ followed by /F/, regressive absorption is likely to occur: /HF/ >
/HL/. In this tonal context therefore, the likelihood of /F/ being lost is high.
Furthermore, /L/ after /LH/ was no longer allowed within the word, and the same
prohibition would have caused /L/ followed by /F/ to develop into /LH/. (Once the
pitch has risen to [H] a return to [L] pitch is prohibited: /LHL/ > /LHH/, but also
/LF/ > /LH/.) These developments taken together therefore explain the lack of final
/F/ tone in Middle Japanese.
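The changes named in this section can be collected into a small set of rewrite rules and applied until nothing changes. The Python sketch below is a minimal illustration under assumptions: the rule list contains only the five changes spelled out above, tones are written one letter per syllable, and the ordering of the rules and the helper name derive are choices made for the example rather than claims about relative chronology.

```python
# Rewrite rules taken from the text; order and naming are assumptions.
RULES = [
    ("RH", "LH"),    # progressive absorption
    ("RL", "RH"),    # no /L/ after a rise within the word
    ("HF", "HL"),    # regressive absorption
    ("LF", "LH"),    # no return to [L] once the pitch has risen
    ("LHL", "LHH"),  # the same prohibition for level tones
]


def derive(tones):
    """Return the successive stages until no rule applies any more."""
    stages = [tones]
    while True:
        current = stages[-1]
        for old, new in RULES:
            if old in current:
                stages.append(current.replace(old, new))
                break
        else:
            # No rule matched: the derivation is complete.
            return stages


for start in ("RH", "RL", "HF", "LF", "LHL"):
    print(" > ".join(derive(start)))
```

In this sketch a non-final /R/ and a non-initial /F/ are both eliminated, while a pattern such as /HR/, with final /R/ after /H/, is left untouched by the rules.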
9 The tone systems of the Ryūkyūs
In most dialects in the Ryūkyūs, it is not the location of a specific tone in the word
that distinguishes the different tone classes from each other. Instead, these dialects
differentiate between distinct word-tones that can be mapped over words or phrases
of different length. As the tonal distinctions are not linked to specific syllables in the
word, but to the word or tonal phrase as a whole, the Ryūkyūan dialects can be
analyzed as having word-tone systems rather than syllable-tone systems. In this
respect they are similar to the Kagoshima type dialects.
The Kagoshima type dialects have only two distinct word-tones, while many
word-tone dialects in the Ryūkyūs distinguish between three. In the different dialects,
the tone classes have merged in different ways. Even in dialects that have only two
distinct word-tones, the distribution of the tone classes over the two types is not
necessarily the same as in Kagoshima.
The merger patterns in disyllabic nouns in the dialects of the Ryūkyūs are as
follows. Dialects with three word-tones have merged the tone classes in the
following way: 2.1/2 vs. 2.3 vs. 2.4/5.1 (There is, however, an additional split in class
2.4/5 (and in class 3.4/5) which is unique to the Ryūkyūs. This split and its possible
origin will be discussed in sections 9.3 and 9.6 and subsections.) This merger pattern
shows an overall correspondence to the merger pattern of the Gairin type dialects.
As in the Gairin type tone systems the location of the /H/ tone in the word is
distinctive, the number of tonal distinctions increases with the number of syllables.
The difference with the Ryūkyūan dialects, where the word-tones can be mapped
over words with a different number of syllables or moras, becomes clear when the
longer nouns are compared: The Gairin type dialects have preserved four tone
classes for trisyllabic nouns (one with Ø tone and – because of the possibility of /H/
tone on each syllable – three classes that contain a /H/ tone), but the Ryūkyūan
dialects only three.
Dialects with two word-tones have either the pattern 2.1/2 vs. 2.3/4/5 (which
agrees with Kagoshima)2 or the pattern 2.1/2/3 vs. 2.4/5.3 There are also dialects in
which the tone classes have merged completely: 2.1/2/3/4/5.4
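The four merger patterns for disyllabic nouns can be written down as a small lookup table. The Python sketch below is illustrative only: the dictionary keys and function names are hypothetical labels, not terms used in the source, and the additional Ryūkyūan split in class 2.4/5 is deliberately ignored.

```python
# Merger patterns for disyllabic nouns as listed in the text.
# The keys are hypothetical labels introduced for this example.
MERGER_PATTERNS = {
    "three_word_tones": [{"2.1", "2.2"}, {"2.3"}, {"2.4", "2.5"}],
    "two_word_tones_kagoshima_like": [{"2.1", "2.2"}, {"2.3", "2.4", "2.5"}],
    "two_word_tones_other": [{"2.1", "2.2", "2.3"}, {"2.4", "2.5"}],
    "complete_merger": [{"2.1", "2.2", "2.3", "2.4", "2.5"}],
}


def word_tone_index(pattern, tone_class):
    """Return the position of the merged group that contains tone_class."""
    for index, group in enumerate(MERGER_PATTERNS[pattern]):
        if tone_class in group:
            return index
    raise ValueError(f"unknown tone class: {tone_class}")


def distinguished(pattern, class_a, class_b):
    """True if the two tone classes are kept apart under this pattern."""
    return word_tone_index(pattern, class_a) != word_tone_index(pattern, class_b)


print(distinguished("three_word_tones", "2.3", "2.4"))
print(distinguished("two_word_tones_kagoshima_like", "2.3", "2.4"))
```

A table of this kind makes it easy to see that the two two-way systems differ only in where class 2.3 ends up.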
1 The dialects that belong to this group are Tokunoshima dialects such as Asama, Okinoerabu
dialects such as Wadomari, the Mugiya Nishi dialect of Yoron Island, the Sonai dialect of
Yonaguni, some northern Okinawan dialects such as the Yonamine dialect of Nakijin-son, and
some central and southern Okinawan dialects along the west coast of the island, including part
of the old Naha city area (Uemura, 2003:73).
2 The dialects that belong to this group are the Onotsu dialect of Kikaijima, Shodon and other
southern Amami-Ōshima dialects. Part of the northern Okinawan dialects, including Kushi,
most central and southern Okinawan dialects, including Shuri. Most Yaeyama dialects,
9.1 Rightward tone shift and the shift from syllable-tone to word-tone
There is an enormous variation in the phonetic realization of the word-tones in the
Ryūkyūs. In many dialects however (especially the more conservative types that
have preserved a three-way distinction) the different word-tones are realized with
melodies that resemble the pitches of a Gairin Tōkyō type tone system. (See the
comparison in (1) adopted from Kindaichi, 1975a.) In other words, not only the
mergers between the tone classes, but also the actual pitches of the word-tones are
remarkably similar to those of the Gairin type dialects.5
1 Similarity between the Gairin pitches and the Ryūkyūan word-tones
             Inokawa           Ōgimi         Ōita
             (Tokunoshima)     (Okinawa)     (Kyūshū)
2.1/2 (A)    -                 -             -
2.3 (B)      -                 -             '-
2.4/5 (C)    -                 -             '-
In a number of the Ryūkyūan dialects, the resemblance with a Gairin type tone
system seems to go even further than just a resemblance in the realization of the
surface forms. Matsumori, for instance, argues that in the dialects of Maeno on
Tokunoshima, in Masana and Wadomari on Okinoerabu (2001:103-105) and in
Tarama on Taramajima (2001:106-109) the opposition between the different tone
classes can only be captured if a link is acknowledged between certain pitch or
vowel length distinctions and a specific accented syllable in the word. In
Matsumori’s analysis of these dialects, the location of the accented syllable agrees
with the syllable that carries the /H/ tone (or accent) in the Gairin Tōkyō type
dialects.
The data in Thorpe (1983:134) furthermore show that in the dialect of San on
Tokunoshima, the tone classes are still differentiated from each other by a distinct
location of a /H/ tone in the word, at least in the longer nouns. Although it is
possible to analyze the shorter nouns in San as having word-tones (2.1/2 rising, 2.3
rising-falling, 2.4/5 falling), a look at the longer nouns in (2) shows that it is still
necessary to mark a specific location of the /H/ tone in the word (the third syllable in
the case of class 3.6/7, and the fourth syllable in the case of class 3.4/5).
including central Ishigaki city and Shiraho
3 The dialects that belong to this group are most Kikaijima dialects, including Aden, most
northern Amami-Ōshima dialects including Sani and Naze, and most dialects of Yoron Island,
including Chabana (Uemura, 2003:73).
4 All merger patterns except the last have been included in the overview in section 9.4.1. The
complete merger occurs in most dialects in Yamato-son and Sumiyō-son in Amami Ōshima,
the old Itoman city area and the Minatogawa dialect of Tamagusuku-son in southern Okinawa
Island, and most of the dialects in the Miyako islands, including Hirara (Uemura, 2003:73).
5 Martin (1987) indicates the different word-tone classes in the Ryūkyūan dialects by means of
different letters. In the dialects that have three word-tones: A (2.1/2), B (2.3) and C (2.4/5). In
the dialects in which the merger pattern is like Kagoshima, the two word-tones are A (2.1/2)
and B (2.3/4/5). The third merger type A/B (2.1/2/3) vs. C (2.4/5) is not represented in Martin’s
overview of the dialect material.
The development from a syllable-tone system to a word-tone system can be
illustrated by means of a comparison of the tone systems of San, Matsubara,
Yonaguni and Aomori. (The Aomori reflexes in (2) are those that occur when
rightward shift of the /H/ tone is not blocked by close vowels.)6
2 Rightward tone shift resulting in the development of word-tone systems
Aomori San Matsubara Yonaguni
3.1/2 - A - - -
3.4/5 '- A '- ,- -
3.6/7 '- A '- - , -
In San tone classes 3.4/5 and 3.6/7 are distinguished from each other by the location
of the /H/ tone, but classes 3.4 and 3.5 have merged. Significantly, the location of
the /H/ tone in classes 3.5 and 3.6/7 is each time one more syllable to the right
than is standard in the Tōkyō type dialects. It coincides with the location of the /H/
tone that occurs with open vowels in Aomori.
The closely related dialect of Matsubara on the same island can be regarded as
representing a next stage in the development: The [H] tone in class 3.6/7 has shifted
one more syllable to the right. A merger with class 3.4/5 did not take place as the
[H] tone in this tone class shifted away from the final syllable onto the attached case
particle. The tonal distinctions in this dialect are thus no longer linked to specific
syllables, but to the domain of the tonal phrase as a whole. (The lowering of all [H]
pitches except the phrase-final in the word-tone of class 3.4/5 most likely functioned
to maximize contrast with class 3.6/7.)
In tone systems in which the location of the /H/ tone in the word is distinctive,
the freedom in phonetic realization is relatively limited. In tone systems in which a
limited number of different word-tones have the whole word or phrase as their
domain, this freedom is far greater. In Yonaguni for instance, the Gairin-like tone
pattern from which the melodies of the three word-tones must have developed is
hardly recognizable anymore.
If all /H/ tones in Aomori and San (that are not yet on the final syllable) were to
shift one more syllable to the right, the trisyllabic nouns could merge in the pattern
3.1/2 vs. 3.4/5/6/7. This merger pattern is typical of the Kagoshima type dialects,
6 The data on San and Matsubara are from Thorpe (1983:134), Yonaguni data are from Hirayama
(1988), Aomori data are from Kobayashi (1975). The additional split in class 3.4/5 will be
discussed in 9.6.3.
and can also be found in many dialects with word-tone systems in the Ryūkyūs.
(This has, however, not been the development in Matsubara, as there the /H/ tone on
the final syllable of the noun also shifted to the right.)
The developments in these Ryūkyūan dialects have a close parallel in the
developments in a number of Gairin type dialects on Honshū, such as in the dialect
of Arai near Lake Hamana and the area from Ishinomaki to Ichinoseki (Uwano,
1981). The merger of class 2.4/5 with class 2.3 that was caused by the rightward
shift of the /H/ tone in these dialects was complete, and no longer conditioned by the
quality of the vowel in the final syllable. As can be seen in (3), in these dialects, the
division of the disyllabic nouns into distinct tone classes is 2.1/2 vs. 2.3/4/5, just as
in the Ryūkyūan dialect of Hatoma, which has been added for comparison.
3 Similarities between developments in Gairin type dialects on Honshū
and the dialects of the Ryūkyūs
Hatoma Arai Ishinomaki/Ichinoseki
2.1/2 , - - -
2.3/4/5 , - '- '-
In a Gairin type dialect where all /H/ tones that are not already on the final syllable
of the word shift one syllable to the right, the merger pattern in disyllabic nouns
does not necessarily have to be 2.1/2 vs. 2.3/4/5. At the end of section 7.1.1, in the
overview of the merger patterns in the Gairin type dialects, it can be seen that the
result can even be that class 2.3 will merge with class 2.1/2. In Matsue and Izumo
this happened only in case of nouns ending in close vowels, but we have seen that in
many Ryūkyūan dialects class 2.3 merged with class 2.1/2 completely. We see again
that the merger patterns that can be found in the Ryūkyūs have close parallels in
Gairin type dialects that have gone through rightward /H/ tone shift and rightward
spreading of [L] pitch.
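The merger-pattern notation used in this chapter (for instance 2.1/2 vs. 2.3/4/5) can be derived mechanically from a mapping of tone classes to word-tones. The following Python sketch is only an illustration: the function name merger_pattern and the three example dictionaries are assumptions made for this example, and the assignments merely restate merger types mentioned in the text, not data from any particular dialect.

```python
def merger_pattern(assignment):
    """Render a {tone class: word-tone} mapping as 'X vs. Y' notation."""
    # Collect the tone classes that share a word-tone.
    groups = {}
    for tone_class, word_tone in assignment.items():
        groups.setdefault(word_tone, []).append(tone_class)

    parts = []
    # Order the merged groups by their lowest-numbered member.
    for members in sorted(groups.values(), key=min):
        members = sorted(members)
        # "2.1", "2.2" -> "2.1/2": write the length prefix only once.
        prefix = members[0].split(".")[0]
        suffixes = "/".join(m.split(".")[1] for m in members)
        parts.append(prefix + "." + suffixes)
    return " vs. ".join(parts)


# Illustrative assignments (labels A, B, C as used in the text).
THREE_WAY = {"2.1": "A", "2.2": "A", "2.3": "B", "2.4": "C", "2.5": "C"}
KAGOSHIMA_LIKE = {"2.1": "A", "2.2": "A", "2.3": "B", "2.4": "B", "2.5": "B"}
CLASS_2_3_JOINS_A = {"2.1": "A", "2.2": "A", "2.3": "A", "2.4": "C", "2.5": "C"}

for name, assignment in [("three-way", THREE_WAY),
                         ("Kagoshima-like", KAGOSHIMA_LIKE),
                         ("2.3 joins 2.1/2", CLASS_2_3_JOINS_A)]:
    print(name + ":", merger_pattern(assignment))
```

Two dialects share a merger pattern in this sense when the function returns the same string for both, regardless of how the word-tones themselves are realized phonetically.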
Summarizing, we can say that rightward tone shift tends to start in longer nouns.
(See the dialect of San, and the dialect of the Shimokita peninsula in section 7.1.1.)
When rightward tone shift occurs in trisyllabic nouns in a Gairin type dialect, the
mergers that result are those that are typical of the Ryūkyūan tone systems that
distinguish between three word-tones.
Consecutive rightward tone shifts result in the merger of more and more tone
classes, and eventually lead to a system with a limited number of tonal distinctions
that are linked to the word or tonal phrase as a whole, and no longer to specific
syllables in the word. The remaining tone classes may therefore be rephonemicized
in terms of a word-tone distinction. It seems to me that in this way the development
from syllable-tone to word-tone in the Ryūkyūs may be explained as a result of
consecutive rightward tone shifts.
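The mechanism described here, in which repeated rightward shifts reduce the number of tonal distinctions, can be made concrete with a toy model. The sketch below is a deliberately abstract illustration and not a reconstruction of any dialect: the class names W, X, Y, Z, the /H/ positions assigned to them, and the function names are all hypothetical. An /H/ tone is represented by the number of the syllable it is linked to (None for a class without /H/), and a tone that has reached the last syllable of the domain does not move further.

```python
def shift_right(h_pos, domain_len):
    """Move an /H/ tone one syllable to the right, unless the class has
    no /H/ tone or the tone is already on the last syllable."""
    if h_pos is None or h_pos >= domain_len:
        return h_pos
    return h_pos + 1


def classes_after_shifts(h_positions, domain_len, rounds):
    """Apply `rounds` consecutive shifts and return the resulting
    groups of classes that can no longer be told apart."""
    positions = dict(h_positions)
    for _ in range(rounds):
        positions = {name: shift_right(pos, domain_len)
                     for name, pos in positions.items()}
    merged = {}
    for name, pos in positions.items():
        merged.setdefault(pos, []).append(name)
    return sorted(merged.values())


# Hypothetical starting point: four classes in a three-syllable domain.
TOY_CLASSES = {"W": None, "X": 1, "Y": 2, "Z": 3}

for rounds in range(3):
    print(rounds, classes_after_shifts(TOY_CLASSES, 3, rounds))
```

In this toy setting four distinct classes are reduced to three after one shift and to two after a second shift, at which point the remaining contrast (presence versus absence of /H/) no longer depends on a specific syllable.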
Once a tone system has changed from a system where the tones are linked to
specific syllables (or moras) in the word to a word-tone system, the melodies of the
word-tones have a great freedom to change. We have seen for instance, how the
melodies of the word-tones in the Kagoshima type dialects of Kagoshima proper and
Makurazaki (section 1.1.3) are almost exactly each other’s opposite. This freedom in
phonetic realization in word-tone systems may explain the great diversity in the
realization of the word-tones in the Ryūkyūs.
The fact that in northern Miyagi prefecture (from Ishinomaki northward to
Ichinoseki) a pitch fall () has developed on the initial syllable of class 2.1/2 may
therefore be an indication that this dialect has already developed a word-tone system.
The division into tone classes in this area (2.1/2 - vs. 2.3/4/5 '-) could
be analyzed as a division into word-tone A (falling) vs. word-tone B (rising-falling).
The pitch fall in class 2.1/2 in this system could have developed in order to
maximize the contrast with class 2.3/4/5.7
The word-tones in many Ryūkyūan dialects are nevertheless still remarkably
Tōkyō-like, which agrees well with Ramsey’s theory. When one adheres to the
standard theory on the other hand, this resemblance must be the result of
independent parallel developments. (This is indeed what Kindaichi assumes.)8
Hattori on the other hand, who rejected Kindaichi’s reversed circle theory and
always remained dissatisfied with the lack of an explanation for the geographical
distribution of the different tone systems in Japan, also observed the resemblance
between the word-tones of the Ryūkyūan dialects and the tone patterns of the Tōkyō
type dialects. At the same time, he noticed the occurrence of long vowels in a
number of Ryūkyūan dialects. Combining the two observations, Hattori (1979)
presented a whole new theory on the historical development of the Japanese tone
system.
9.2 Hattori’s later reconstruction of the proto-Japanese tone system
First of all, in his series of articles on proto-Japanese (1979), Hattori claimed that the
merger pattern 2.1/2 vs. 2.3 vs. 2.4/5 could in fact not be found anywhere in the
Ryūkyūs. According to Hattori tone classes 2.4/5 and 2.3 had already merged in
proto-Ryūkyūan, and instead he divides the merged class 2.3/4/5 into two groups.
Group 1 contains a long vowel in the initial syllable in a number of Ryūkyūan
7 A look at the tone system of the longer nouns would be needed to determine for sure whether
the location of the /H/ tone in the word is still distinctive in this area or not.
8 Although Kindaichi thinks that the proto-Ryūkyūan tone system was similar to the tone system
of Ōita, he does not think the Ryūkyūan dialects started out with a Gairin type tone system.
Like all other Japanese dialects, the Ryūkyūan dialects originally had a tone system that was
like Middle Japanese in the standard reconstruction. When in Kyūshū the Ōita (Gairin type)
tone system developed, similar changes took place independently in the Ryūkyūs. This
intermediate Gairin type stage, which the Ryūkyūan dialects passed through on their way to
their present-day word-tone systems, was preserved on the island of Tokunoshima. The dialect
of Ōita also preserved the intermediate stage, because of its relative proximity to the inner
circle (Kindaichi, 1975a).
dialects, and a pitch fall after the initial syllable in many other Ryūkyūan dialects,
while group 2 does not.
Hattori thinks that the long vowel in group 1 is old, and that the two groups were
distinguished in proto-Ryūkyūan by the length of the vowel in the initial syllable.
The long vowel was later shortened in most dialects, and it is this shortening of the
originally long vowel that raised the tone of the initial syllable, and caused a /H/
tone to appear on the initial syllable.
According to Hattori, the vowel length that can be found in part of the Ryūkyūs even
goes back to proto-Japanese, and can explain the development of the pitch fall in
dialects all over Japan in the Tōkyō-type location of the word. In this way the vowel
length functions to reconcile the standard interpretation of the Middle Japanese tone
dot material with the dialect geographical data. In the following passage (1979:110)
Hattori explains his ideas on how proto-Japanese vowel length caused Tōkyō type
tone (which he calls ‘B type accent’) to develop in geographically widely separated
areas in Japan:9
Class 2.1 and 2.2 started with H pitch, and class 2.3, 2.4 and 2.5 must have
started with L pitch. Because the vowel in the initial syllable of the whole of
group 1 (class 2.4/5) and part of the words of group 2 of the same class, and
also the second syllable of most words of class 2.3 were long and started with
L pitch, every dialect had the possibility of developing a pitch fall there.
Because of this, it is not unnatural that this change occurred independently in
the dialects with B type accent that surround the A type accent.
According to Hattori (1979:110-111), in the Tōkyō and Kyōto type dialects,
words of class 2.3 that belonged to group 1 merged with group 2 (the ‘general
class’), and words of class 2.4/5 that belonged to group 2 merged with words of
group 1, because the tone patterns of the two groups were similar. Because class 2.1
and 2.2 had in common that they started with /H/ tone, the two classes merged in the
Gairin dialects. Because the second syllable of class 2.2 had a long vowel, in the
Nairin and Chūrin Tōkyō type dialects, there developed a /H/ tone (‘accent
mountain’) there. The Kyōto type dialects did not develop such /H/ tones when the
proto-Japanese long vowels were shortened, and thus preserved the proto-Japanese
tone pattern most closely. (In this respect Hattori now agrees with Kindaichi.)
Hattori explains the development of the /H/ tone in class 2.3 in Kyōto in the
following way: “In the A type dialect the part before the long vowel became H.”
Hattori mentions that there were five tone classes for disyllabic nouns in proto-
Japanese (which means that he distinguishes tone classes 2.4 and 2.5) but he does
not mention their tone directly. Hattori’s ideas have been summarized in (4).
9 Hattori uses the word akusento throughout, which has a wider range of meaning than English
‘accent’. Akusento may refer to ‘pitch-accent’, ‘word-tone’, ‘pitch’ and even ‘tone’. I have
translated ‘accent mountain’ in this passage as ‘pitch fall’.
4 Hattori’s later ideas on the development of Japanese tone
Kyōto Proto-Japanese Tōkyō
< 2.1 >
< 2.2 : > '
< 2.3 : group 1 > '
< 2.3 :: group 2 (general) > '
< 2.4 : group 1 > '
< 2.4 group 2 > '
< 2.5 : group 1 > '
< 2.5 group 2 > '
Hattori’s earlier proto-Japanese tone system was a combination of the Middle
Japanese tone system in the standard reconstruction with the location of the /H/ tone
in Tōkyō. (The latter being represented by a syllable with a falling tone.) Hattori’s
later system is a combination of the Middle Japanese tone system in the standard
reconstruction, with the location of the /H/ tone in Tōkyō this time represented by a
syllable with a long vowel. (A problem is that the /H/ tones in Tōkyō on the initial
syllable of class 2.4/5 can only be explained in case of group 1. It remains unclear
why group 2 of this class developed a /H/ tone.)
Most of the instances of vowel length reconstructed by Hattori are only
necessary if one reasons from the standard theory. In the standard theory the /H/ tone
on the initial syllable of tone class 2.3 in Kyōto, and on the second syllable in Tōkyō,
and the /H/ tone on the initial syllable of class 2.4/5 in Tōkyō need to be explained.
Ramsey’s theory offers a simpler solution, as in his reconstruction the presence of
these /H/ tones and their location in the word in the different dialects are exactly as
expected. It is not necessary to reconstruct vowel length in order to explain them.
What remains to be explained, however, also in Ramsey’s theory, is the split of
classes 2.3 and 2.4/5 into two groups in the Ryūkyūs, and the interesting connection
between the vowel length in the initial syllable that can be found in group 1 in some
Ryūkyūan dialects, and the - word-tone that can be found in group 1 in many
other Ryūkyūan dialects.
Uwano (1996), Matsumori (1998b) and Shimabukuro (2007) support Hattori’s
idea that the - word-tone that can be found in group 1 in many Ryūkyūan
dialects developed historically from vowel length in the initial syllable. And so, for
this group of nouns (i.e. group 1 of classes 2.3, 2.4 and 2.5), they too reconstruct
vowel length in proto-Ryūkyūan.10
10 Earlier (2003) Shimabukuro reconstructed this vowel length also in proto-Japanese.
9.3 The split in classes 2.3 and 2.4/5 examined
In order to find out more about the origin of the connection between long vowels in
some Ryūkyūan dialects and a pitch fall after the initial syllable in others, we first
need to examine Hattori’s claims about the splits and mergers of the tone classes in
proto-Ryūkyūan in more detail. Next we need to compare the occurrence of vowel
length and the realization of the word-tones in the different dialects.
9.3.1 Was there no distinct tone class 2.3 in proto-Ryūkyūan?
According to Hattori, classes 2.3, 2.4 and 2.5 had all merged in proto-Ryūkyūan,
and were together divided into two groups.
In order to obtain a better understanding of how the tone classes have split, and
how regular the reflexes among the different dialects are, I have selected a number
of Ryūkyūan dialects that have not merged tone classes 2.3 and 2.4/5 completely,
and compared the distribution of the different lexical items over Hattori’s two
groups. From this comparison (which is presented in the next section) it is possible
to draw the following conclusions:
Hattori is correct in pointing out that tone class 2.4/5 is almost evenly split
between a group that has vowel length or [H] pitch in the first syllable (1) and a
group that has not (2). I do not agree however, with Hattori’s idea that a distinct tone
class 2.3 cannot be recognized in the Ryūkyūs.
While tone class 2.4/5 is almost evenly split between the two groups, nouns of
class 2.3 fall overwhelmingly into group 2. Only a tiny percentage of this large tone
class belongs to group 1.11 The examples for which most dialects agree are mari
‘ball’, kame ‘jar’, nomi ‘flea’, hama ‘beach’ and hone ‘bone’. (I will call this group
the mari-group from now on.) Apart from this small group, all other 31 examples of
nouns of class 2.3 in Hattori’s list belong to group 2.12
The following members of class 2.3 belong to group 1 in an occasional dialect:
For instance, yama ‘mountain’ in Hateruma, yumi ‘bow’ in Onna, hati ‘pot’, yubi
‘finger’ and hagi ‘shin’ in Ashikebu. Furthermore mame ‘bean’, kabi ‘mold’, hato
‘pigeon’, kaki ‘bet’, kame ‘turtle’ and ono ‘axe’ in Shuri (but not in other Okinawan
dialects; see section 9.7.1.3). Nowhere does Hattori offer an explanation for the
discrepancy between the number of nouns of class 2.3 that belong to group 1 (very
few) and the number that belongs to group 2 (the overwhelming majority).
11 The two level tone classes 2.3 and 2.1 were the largest tone classes in proto-Japanese. In
Martin’s comprehensive list (1987) for instance, class 2.3 is made up of approximately 300
examples, while class 2.4 is only made up of approximately 140 examples, and class 2.5 of
approximately 100 examples.
12 The word kusi ‘comb’ or ‘skewer’ which Hattori lists as a member of class 2.2, is regarded as a
member of class 2.3 by Martin (1987). If we include this example in the small group of nouns
of class 2.3 that have joined class 2.4/5, this group now makes up 6 of Hattori’s 37 examples of
nouns of class 2.3.
There are many examples all over Japan of nouns that have slipped out of their
class in one or several dialects. Of the examples of class 2.3 that have merged with
class 2.4/5 in one or more of the Ryūkyūan dialects mentioned above, the following
examples have, for instance, also merged with class 2.4/5 in Matsue: mari ‘ball’,
nomi ‘flea’, hato ‘pigeon’ and kame ‘turtle’. I therefore prefer to explain the fact that
the Ryūkyūan dialects agree among themselves (at least as far as the small mari-
group is concerned) as to which nouns have merged with class 2.4/5, by assuming
that – for some reason – these nouns had slipped out of their class and merged with
class 2.4/5 in proto-Ryūkyūan. (In their tone pattern and the occurrence of vowel
length these nouns do not stand out in any way from the other members of group 1
of class 2.4/5.)13
In my opinion therefore, it would be an exaggeration to say that there is no
separate class 2.3 in the Ryūkyūs. It is more correct to say that half of class 2.4/5 has
merged with class 2.3, while a tiny part of class 2.3 has merged with class 2.4/5.
Matsumori (1998b), who supports Hattori’s division, calls group 1 of classes 2.3
and 2.4/5 the iki ‘breath’ group (containing the initial long vowel or the pitch fall
after the first syllable) and group 2 of classes 2.3 and 2.4/5 the ita ‘board’ group. I
will adopt these names, but with the specification that I use them to refer to the split
in tone class 2.4/5 only: The term iki-group refers to those nouns that still form a
separate tone class 2.4/5, and the term ita-group refers to those nouns of class 2.4/5
that have merged with class 2.3. As said, I use the term mari-group to refer to the
small group of members of class 2.3 that have merged with class 2.4/5 in more than
an occasional isolated dialect.
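The terminology adopted above amounts to a small decision procedure for assigning a disyllabic noun to a word-tone. The following Python sketch restates it under explicit assumptions: the function name word_tone and its parameters are hypothetical, the labels A, B and C follow the usage A (2.1/2), B (2.3), C (2.4/5), and the optional flag models the merger type in which class 2.3 has joined class 2.1/2. Using the label C for the remaining separate class in that second type is a choice made for this example only.

```python
def word_tone(tone_class, group=None, merges_2_3_with_2_1_2=False):
    """Return the word-tone label ('A', 'B' or 'C') of a disyllabic noun.

    tone_class: '2.1' ... '2.5' (the historical tone class)
    group:      'iki' or 'ita' for classes 2.4/2.5, 'mari' for the small
                group of class 2.3 nouns that joined class 2.4/5
    merges_2_3_with_2_1_2: True for dialects of the type 2.1/2/3 vs. 2.4/5
    """
    # Label of class 2.3 (and of everything that merged with it).
    label_of_2_3 = "A" if merges_2_3_with_2_1_2 else "B"

    if tone_class in ("2.1", "2.2"):
        return "A"
    if tone_class == "2.3":
        # The mari-group has left class 2.3 and behaves like the iki-group.
        return "C" if group == "mari" else label_of_2_3
    if tone_class in ("2.4", "2.5"):
        if group == "iki":
            return "C"            # still a separate class 2.4/5
        if group == "ita":
            return label_of_2_3   # merged with class 2.3
        raise ValueError("class 2.4/5 nouns need group 'iki' or 'ita'")
    raise ValueError("unknown tone class: " + str(tone_class))


print(word_tone("2.4", "iki"))    # iki 'breath'
print(word_tone("2.4", "ita"))    # ita 'board'
print(word_tone("2.3", "mari"))   # mari 'ball'
```

The sketch makes visible that the iki/ita distinction only ever applies to class 2.4/5, which is the restricted sense in which the two terms are used here.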
9.3.2 A comparison of the iki-, ita- and mari-groups in 12 dialects
The first six dialects are from the northern part of the Ryūkyūs. In the dialects of
Aden, Naze and Ashikebu class 2.3 has merged with class 2.1/2. Nouns belonging to
the ita-group in these dialects have therefore merged with class 2.1/2/3, so that only
the nouns of the iki-group still constitute a separate class. The iki-group in Asama
has two reflexes that I have marked 2.4/5α and 2.4/5β. Reflex α occurs when the
initial consonants of the dialect are glottalized, and reflex β when they are not. At
the top of the column of each dialect I have indicated the word-tones that are typical
for the word class in that dialect. In the columns underneath, irregular reflexes are
marked in bold print. Aden, Naze and Asama data are from Hattori (1979),
Ashikebu data are from Uwano (1996), Tokuwase data are from Matsumori (1998b)
and Wadomari data are from Kuno (1991).
13 In Hattori’s list of 127 disyllabic nouns in 10 Ryūkyūan dialects (1979) the main example of
irregular vowel length is in case of the 2.1 noun kiba ‘fang’, which has both a unique tone and
occurrence pattern of vowel length in all the dialects that Hattori quotes. This indicates that this
word is treated as a compound in the Ryūkyūs: ki ‘?’ + ba ‘tooth’. Hattori’s list contains the
following additional exceptions: The vowel length and tone in the word ono ‘axe’ 2.3 in
Yonamine wuunuu are unique to this word (at least among the words included in Hattori’s list).
See also section 9.7.1.
The next six dialects are from Okinawa and the southern Ryūkyūs. Apart from
the dialect of Sarahama (where class 2.3 has merged with class 2.1/2 just as in Aden,
Naze and Ashikebu), all these dialects distinguish between three different word-
tones: A (2.1/2), B (2.3 and the ita-group of 2.4/5) and C (the iki-group of 2.4/5).14
Yonamine, Onna and Shuri data are from Hattori (1979), but Hattori’s Shuri data
are in turn from Okinawa-go jiten (1963). In Onna not only has tone class 2.4/5 split
in two, but in tone class 2.3 – apart from the merger of a small part of this class with
tone class 2.4/5 (the mari-group) – a second split has occurred. The word-tones that
occur with the different tone classes in this dialect are 2.1/2 , 2.3α :: (wata
‘intestines’, ana ‘hole’, nami ‘wave’, ami ‘net’, nuka ‘rice bran’, inu ‘dog’, haji
‘shame’, sumi ‘ink’, hana ‘flower’, tama ‘ball’, mame ‘bean’, tuno ‘horn’, mimi
‘ear’, yume ‘dream’), 2.3β :~15 (iro ‘color’, kuso ‘feces’, tuna ‘rope’, tura
‘surface’, tosi ‘year’, ura ‘back side’, kusa ‘grass’, kumo ‘cloud’, sima ‘island’, haka
‘grave’, mono ‘thing’, yama ‘mountain’, kimo ‘liver’), 2.4/5 (iki-group and
mari-group) :, ita-group :: (= 2.3α), except siru ‘soup’ :~ (= 2.3β).
Sarahama data are from Matsumori (1998b), Hateruma and Yonaguni data are
from Hirayama (1988). Additional Yonaguni data from Martin (1987) have been
marked (M), but Martin’s data are in turn from Hirayama 1964 and 1967b.
The classification of kata ‘shoulder’, hune ‘boat’, mugi ‘wheat’, ine ‘rice’, and uri
‘melon’ as belonging to class 2.5 instead of class 2.4 is based on the fact that the
particle no attached with a ping tone to these nouns (Martin 1987:173). This is also
the case with yado ‘shelter’ and aha ‘millet’, but these nouns also still belong to class
2.5 in some of the modern Kyōto type dialects. Oku ‘interior’ and sudi ‘sinew’ are
marked with 平平 tone dots (i.e. as class 2.3) in the Kanchi-in-bon of Ruiju myōgi-
shō, but the reflexes in the modern dialects point to class 2.4. Perhaps the 平平
markings are the result of a copying mistake of earlier 平東, which could mean
that these nouns are former members of class 2.5.
As can be seen from the data in tables (5) to (14) below, all Ryūkyūan dialects
that have not obliterated the iki/ita split by merging tone classes 2.3 and 2.4/5
completely, have the split in tone class 2.4/5. In about half of the words in the two
groups, the reflexes do not agree in each and every dialect, but aberrant reflexes are
limited to isolated items in isolated dialects. This means that the iki/ita split in class
14 The dialect of Shuri is usually seen as a dialect where the tone classes have merged in the
pattern A 2.1/2 vs. B 2.3/4/5, as there is no difference between the word-tones of class 2.3 and
2.4/5. However, these two tone classes have not merged, as the vowel length in the initial
syllable of nouns of class 2.4/5 (iki-group) distinguishes this class (C) from class B (2.3).
15 The word-tones : and appear to occur in free variation. For the following nouns Hattori
indicates that both word-tones : and ‘have been recorded’: iro ‘color’, kuso ‘feces’,
tuna ‘rope’, tura ‘surface’, tosi ‘year’. For ura ‘back side’, kusa ‘grass’, kumo ‘cloud’, sima
‘island’, haka ‘grave’, mono ‘thing’, yama ‘mountain’ he indicates only , and for kimo
‘liver’ only : This can still mean that both variants are allowed in case of these nouns as
well, but that they have not been recorded.
2.4/5 must have formed part of proto-Ryūkyūan, with approximately the same
distribution of the membership as shown in the tables.16
5 The iki-group of class 2.4 in the northern Ryūkyūs
2.4 Kikai Ōshima Tokunoshima Okierabu
Aden Ashikebu Naze Asama Tokuwase Wadomari
: α
: β
iki ‘breath’ 2.4/5 2.4/5 2.4/5 2.4/5α 2.4/5 2.4/5
usu ‘mortar’ x 2.4/5 x 2.4/5α 2.4/5 2.4/5
umi ‘sea’ 2.4/5 2.4/5 2.4/5 2.4/5α 2.3 2.4/5
naka ‘inside’ 2.4/5 2.4/5 2.4/5 2.4/5α 2.4/5 x
hasi ‘chopsticks’ 2.4/5 2.4/5 2.4/5 2.4/5α x 2.4/5
hari ‘needle’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 x
hera ‘spatula’ x 2.4/5 x x 2.4/5 x
matu ‘pine tree’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 x
nusi ‘owner’ x 2.4/5 x 2.4/5β x x
kazu ‘number’ x x x x x 2.4/5
obi ‘girdle’ x 2.4/5 x 2.3 2.4/5
ito ‘thread’ x 2.1/2/3 x x 2.3 2.4/5
oku ‘interior’ 2.4/5 x 2.1/2/3 x x x
sudi ‘sinew’ 2.4/5 2.4/5 2.4/5 2.4/5β x x
6 The iki-group of class 2.5 in the northern Ryūkyūs
2.5 Kikai Ōshima Tokunoshima Okierabu
Aden Ashikebu Naze Asama Tokuwase Wadomari
: α
: β
hune ‘boat’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 2.4/5
kage ‘shadow’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 2.4/5
kumo ‘spider’ x 2.4/5 2.4/5 x x x
koe ‘voice’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 2.4/5
16 It would be possible to argue that dialects in which class 2.4/5 merged with class 2.3
completely, never had this split, but considering the geographical distribution of the dialects
that still show the split, this seems unlikely. It is safe to assume that the split originally formed
part of all Ryūkyūan dialects, and was only later obliterated in some. This was most likely
caused by a shift of the initial [H] pitch of the iki-group to the right, resulting in a merger of the
iki-group with class 2.3.
Aden Ashikebu Naze Asama Tokuwase Wadomari
saru ‘monkey’ irr.17 2.4/5 2.4/5 2.4/5β 2.4/5 2.4/5
tabi ‘socks’ x 2.1/2/3 x 2.4/5β 2.4/5 x
tuyu ‘dew’ 2.4/5 2.4/5 2.4/5 2.4/5α 2.4/5 x
nabe ‘pot’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 x
muko ‘groom’ x 2.4/5 2.4/5 2.4/5α x 2.4/5
oke ‘tub’ 2.4/5 2.4/5 2.4/5 2.4/5α 2.4/5 x
yado ‘shelter’ x 2.1/2/3 x 2.4/5β x x
7 The ita-group of class 2.4 in the northern Ryūkyūs
2.4 Kikai Ōshima Tokunoshima Okierabu
Aden Ashikebu Naze Asama Tokuwase Wadomari
: :
ita ‘board’ x 2.1/2/3 x 2.3 2.3 x
kasa ‘umbrella’ 2.1/2/3 2.1/2/3 2.1/2/3 2.3 2.3 2.3
siru ‘soup’ 2.1/2/3 2.1/2/3 2.1/2/3 2.3 2.3 x
wara ‘straw’ x 2.1/2/3 x 2.3 2.3 x
tane ‘seed’ x 2.1/2/3 x x 2.3 2.3
miso ‘beanpaste’ x x x x 2.3 x
kado ‘corner’ x x x x 2.3 x
kasu ‘dregs’ 2.1/2/3 2.1/2/3 2.1/2/3 x x x
nomi ‘chisel’ 2.1/2/3 2.1/2/3 2.3 2.3 x x
8 The ita-group of class 2.5 in the northern Ryūkyūs
2.5 Kikai Ōshima Tokunoshima Okierabu
Aden Ashikebu Naze Asama Tokuwase Wadomari
: :
kata ‘shoulder’ 2.1/2/3 2.1/2/3 2.1/2/3 2.3 2.3 2.3
ase ‘sweat’ 2.1/2/3 2.1/2/3 2.1/2/3 2.3 2.3 2.4/5
ame ‘rain’ 2.1/2/3 2.1/2/3 2.1/2/3 2.3 2.3 2.3
momo ‘thigh’ x 2.1/2/3 x 2.3 2.3 x
yoru ‘night’ 2.1/2/3 2.1/2/3 2.1/2/3 2.3 2.3 2.3
mugi ‘wheat’ x 2.1/2/3 x 2.3 2.3 2.3
17 In Hattori’s word list the word ‘monkey’ has a unique tone pattern in Aden: saruu. See also
9.7.1 and subsections.
Aden Ashikebu Naze Asama Tokuwase Wadomari
uri ‘melon’ x 2.1/2/3 x x 2.3 x
aha ‘millet’ 2.1/2/3 2.1/2/3 2.1/2/3 2.3 x x
ine ‘rice plant’ x 2.1/2/3 x x x 2.3
9 The mari-group of class 2.3 in the northern Ryūkyūs
2.3 Kikai Ōshima Tokunoshima Okierabu
Aden Ashikebu Naze Asama Tokuwase Wadomari
: α
: β
mari ‘ball’ x 2.4/5 x 2.4/5β 2.4/5 x
kame ‘jar’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 x
nomi ‘flea’ 2.4/5 2.4/5 2.4/5 2.4/5β 2.4/5 2.4/5
hama ‘beach’ x 2.4/5 x 2.4/5β 2.4/5 x
hone ‘bone’ 2.4/5 x 2.4/5 2.4/5β 2.4/5 2.4/5
10 The iki-group of class 2.4 in the southern Ryūkyūs
2.4 Okinawa Irabu Hateruma Yonaguni
Yonamine Onna Shuri Sarahama
: : ~
:
iki ‘breath’ 2.4/5 2.4/5 2.4/5 2.4/5 x 2.4/5 (M)
usu ‘mortar’ 2.4/5 2.4/5 2.4/5 2.4/5 2.3 2.4/5
umi ‘sea’ 2.4/5 2.4/5 2.3 2.1/2/3 x 2.4/5 (M)
naka ‘inside’ 2.4/5 2.4/5 2.4/5 2.4/5 x 2.4/5 (M)
hasi ‘chopsticks’ 2.1/2 x 2.4/5 x 2.3 2.4/5
hari ‘needle’ 2.4/5 2.4/5 2.4/5 2.4/5 2.3 2.4/5
hera ‘spatula’ 2.4/5 2.4/5 2.4/5 x x 2.4/5 (M)
matu ‘pine tree’ 2.4/5 2.4/5 2.4/5 x x 2.4/5 (M)
nusi ‘owner’ 2.4/5 2.4/5 2.4/5 x x x
kazu ‘number’ 2.4/5 x 2.3 x x 2.4/5
obi ‘girdle’ x x 2.4 x x x
ito ‘thread’ 2.4 x 2.4/5 x x 2.4/5
oku ‘interior’ 2.4/5 x 2.4/5 x x 2.3 (M)
sudi ‘sinew’ 2.4/5 x 2.3 x x x
11 The iki-group of class 2.5 in the southern Ryūkyūs
2.5 Okinawa Irabu Hateruma Yonaguni
Yonamine Onna Shuri Sarahama
: : ~
:
hune ‘boat’ 2.4/5 2.4/5 2.3 2.4/5 x 2.4/5 (M)
kage ‘shadow’ 2.4/5 2.4/5 2.4/5 2.4/5 2.4/5 2.4/5 (M)
kumo ‘spider’ 2.4/5 suff. suff.18 x x 2.4/5, 2.3 (M)
koe ‘voice’ 2.4/5 2.4/5 2.3 2.4/5 2.3 2.4/5
saru ‘monkey’ 2.4/519 2.4/5 2.4/5 2.4/5 2.4/5 2.4/5
tabi ‘socks’ 2.4/5 2.4/5 2.4/5 2.4/5 x x
tuyu ‘dew’ 2.4/5 2.4/5 2.3 2.4/5 x 2.4/5
nabe ‘pot’ 2.4/5 2.4/5 2.4/5 2.4/5 x 2.4/5 (M)
muko ‘groom’ 2.4/5 2.4/5 2.4/5 2.4/5 2.4/5 2.4/5
oke ‘tub’ 2.4/5 2.4/5 2.4/5 2.4/5 x 2.4/5 (M)
yado ‘shelter’ 2.4/5 2.4/5 2.4/5 x x x
12 The ita-group of class 2.4 in the southern Ryūkyūs
2.4 Okinawa Irabu Hateruma Yonaguni
Yonamine Onna Shuri Sarahama
: :: -
ita ‘board’ 2.3 2.3α 2.3 2.1/2/3 2.3 2.3
kasa ‘umbrella’ 2.3 2.3α 2.3 2.1/2/3 2.3 2.4/5, 2.3 (M)
siru ‘soup’ 2.3 2.3β20 2.3 x x 2.3 (M)
wara ‘straw’ 2.3 2.3α 2.3 x x x
tane ‘seed’ x x 2.3 x x 2.3
miso ‘beanpaste’ x x 2.3 2.3 x x
kado ‘corner’ x x 2.3 x x 2.3
kasu ‘dregs’ 2.3 2.3 2.3 x x 2.3 (M)
nomi ‘chisel’ 2.3 2.3α 2.3 x x x
18 The words for ‘spider’ in Shuri (kuubaa) and Onna (k’uuba a) probably contain a suffix -a(a). The
vowel length in the second syllable (and the word-tone in Onna) are therefore unusual. For
comparison, Yonamine has hubu, and Naze has k’ubu. See also 9.7.1 and subsections.
19 The word ‘monkey’ in Yonamine has vowel length in the first as well as the second syllable:
saaruu. See also 9.7.1 and subsections.
20 The word-tone of 2.3 β is :~ (siruu~siru).
13 The ita-group of class 2.5 in the southern Ryūkyūs
2.5 Okinawa Irabu Hateruma Yonaguni
Yonamine Onna Shuri Sarahama
: :: -
kata ‘shoulder’ 2.3 2.3α 2.3 2.1/2/3 2.3 2.3 (M)
ase ‘sweat’ 2.3 2.3α 2.3 2.1/2/3 2.3 2.3
ame ‘rain’ 2.3 2.3α 2.3 2.1/2/3 2.3 2.3
momo ‘thigh’ 2.3 2.3α 2.3 x x 2.3 (M)
yoru ‘night’ 2.3 2.3α 2.3 x 2.4/5 2.3
mugi ‘wheat’ 2.3 2.3α 2.3 x 2.4/5 2.3
uri ‘melon’ x x 2.3 2.3 x 2.3
aha ‘millet’ 2.3 2.3α 2.3 x x 2.3
ine ‘rice plant’ x x 2.3 x x 2.3
14 The mari-group of class 2.3 in the southern Ryūkyūs
2.3 Okinawa Irabu Hateruma Yonaguni
Yonamine Onna Shuri Sarahama
: : ~:
mari ‘ball’ x 2.4/5 2.4/5 2.4/5 x x
kame ‘jar’ 2.4/5 2.4/5 2.4/5 2.4/5 x x
nomi ‘flea’ 2.4/5 2.4/5 2.3 x x 2.4/5
hama ‘beach’ 2.4/5 2.4/5 2.3 x x x
hone ‘bone’ 2.4/5 2.4/5 2.3 2.4/5 2.4/5 2.4/5
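The claim that aberrant reflexes are limited to isolated items in isolated dialects can be checked row by row. The Python sketch below tallies three rows copied from table (5); it is an illustration only. The function name tally and the category names are hypothetical, and it is assumed here that the entry x means that no reflex is listed for that dialect.

```python
DIALECTS = ["Aden", "Ashikebu", "Naze", "Asama", "Tokuwase", "Wadomari"]

# Three rows of table (5), iki-group of class 2.4, copied as printed.
ROWS = {
    "iki 'breath'": ["2.4/5", "2.4/5", "2.4/5", "2.4/5α", "2.4/5", "2.4/5"],
    "umi 'sea'":    ["2.4/5", "2.4/5", "2.4/5", "2.4/5α", "2.3", "2.4/5"],
    "ito 'thread'": ["x", "2.1/2/3", "x", "x", "2.3", "2.4/5"],
}


def tally(reflexes, expected="2.4/5"):
    """Count regular, aberrant and missing reflexes in one table row."""
    counts = {"regular": 0, "aberrant": 0, "no data": 0}
    for reflex in reflexes:
        if reflex == "x":                    # assumption: x = not listed
            counts["no data"] += 1
        elif reflex.startswith(expected):    # also matches 2.4/5α and 2.4/5β
            counts["regular"] += 1
        else:
            counts["aberrant"] += 1
    return counts


for noun, reflexes in ROWS.items():
    print(noun, tally(reflexes))
```

For the ita-group tables the same function can be called with a different expected reflex for each dialect (2.1/2/3 or 2.3), since the regular outcome there depends on whether class 2.3 has merged with class 2.1/2.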
9.4 From vowel length to [H] pitch or from [H] pitch to vowel length?
Hattori and others assume that the vowel length in the initial syllable of the iki-group
is original, and that the [HL] word-tone that can be found in many Ryūkyūan
dialects developed as the long vowels were shortened. Kindaichi (1975a) on the
other hand, assumed that the development was the other way around. In order to
examine which of these two possibilities is more likely, I have included an overview
of the realization of the word-tones and the occurrence of vowel length in disyllabic
nouns in the dialects of the Ryūkyūs.
9.4.1 Overview of word-tones and vowel length in disyllabic nouns
Most data have been adopted from Kindaichi 1975a (reprinted in 1983), who
consulted several sources: Hattori (1959), Uemura (1959), Hirayama (Hōgen; 7-6
and 7-10). I have added data from the following publications: Thorpe (1983), Hattori
(1979), Matsumori (1998, 2001), Kōza Nihon-go 11 (Nakamura, Yukio ed. 1977),
Uwano (1996), Akinaga (1960), Hirayama (1967a and 1988), Kuno (1991) and
Shimabukuro (2002).
Kindaichi only indicates the word-tones of the iki-group, as he regards these as
the regular reflex of tone class 2.4/5. This becomes clear if we compare Kindaichi’s
presentation of the reflexes of the dialects of Aden, Naze and Yonamine (of which
the data stem from Hattori), with Hattori’s own presentation of these dialects in
‘Nihon sogo ni tsuite’.21 It can be seen that Thorpe has taken the same approach as
Kindaichi. Akinaga, Matsumori, Kuno, Hirayama, Uwano and Shimabukuro on the
other hand, list both the reflexes of the iki-group and the ita-group, just like Hattori.
2.1/2 2.3 2.4/5
Aden (Kind.) ,- as 2.1/2 -
Aden (Hattori) as 2.1/2 iki
ita (as 2.1/2/3)
Naze (Kind.) ,- as 2.1/2 -
Naze (Hattori) as 2.1/2 iki
ita (as 2.1/2/3)
Yonamine (Kind.) :- :,:- -
Yonamine (Hattori) : : iki
: ita (as 2.3)
Yonaguni (Thorpe) - - ,-
Yonaguni (Hirayama) - - ,- iki
- ita (as 2.3)
I have added the note (iki) to the entries adopted from Kindaichi and Thorpe, as I
would like to avoid the false impression of a complete divide between tone classes
2.3 and 2.4/5 in certain dialects in the list. In all dialects that still have a distinct tone
class 2.4/5, the ita-group of this class has merged with class 2.3. Only if there is a
difference in vowel length between the reflex of the iki-group and the ita-group do
Kindaichi and Thorpe list both reflexes. Although they do not indicate which word-
tone occurs with which group, we can assume that the word-tone that is identical to
the word-tone of tone class 2.3 belongs to the ita-group, and this is how I have
represented the reflexes in the list:
2.1/2 2.3 2.4/5
Shuri (Kind.) ,- - :- iki
- ita (as 2.3)
Shuri (Thorpe) ,- - :- iki
- ita (as 2.3)
I have arranged the dialects per island, starting in the northeast of the Ryūkyūs
and ending in the southwest. Within each island the dialects are again arranged as
21 The material presented in ‘Nihon sogo ni tsuite’ was mostly collected by Hattori himself, but
his Shuri data are from Okinawa-go jiten (1963) and his Asama data are from Uwano (1977).
much as possible in the order in which they appear moving from the northeast to the
southwest. (The mark : indicates vowel length, and the mark . half length.)
15 Word-tones and vowel length in the Amami Archipelago
Kikai 2.1/2 2.3 2.4/5
Onotsu (Kind.) - - as 2.3
Aden (Kind.) ,- as 2.1/2 - (iki)
Aden (Thorpe) ,- as 2.1/2 - (iki)
Aden (Hattori) as 2.1/2 iki
ita (as 2.1/2/3)
Takutsuku (Kind.) ,- as 2.1/2 ,- (iki)
Amami-Ōshima
Ashikebu (Uwano) ,- as 2.1/2 - iki
,- ita (as 2.1/2/3)
Naze (Kind.) ,- as 2.1/2 - (iki)
Naze (Kōza; 11) ,- as 2.1/2 - (iki)
Naze (Hattori) as 2.1/2 iki
ita (as 2.1/2/3)
Koniya (Kind.) ,- ,- as 2.3
Uken (Kind.) : as 2.3
Kakeroma
Shodon (Kind.) ,- ,- as 2.3
Shodon (Thorpe) .,.- .,.- as 2.3
Shodon (Hattori) : . as 2.3
Tokunoshima
San (Thorpe) - - - (iki)
Inokawa (Kind.) - - - (iki)
Kametsu (Kōza; 11) - ,- - (iki)
Kametsu (Hattori) - - - iki
- ita (as 2.3)
Tokuwase - - - iki
(Matsumori ’98) - ita (as 2.3)
Ketoku (Kind.) - - - (iki)
Kanami (Kind.) - - :~:- (iki)
Matsubara (Thorpe) :- :,:- :- (iki)
Matsubara Nishi-ku :- :- :- (iki)
(Kind.)
Maeno :- :,:- :- iki
(Matsumori ’01) :,:- ita (as 2.3)
Asama (Kind.) :- :- :- (iki)
Asama (Hattori) : : : α, : β iki
: ita (as 2.3)
Bane (Kind.) - - - (iki)
Okierabu
Kunigami (Kind.) ,- as 2.1/2 - (iki)
Wadomari (Kuno) ,- :,- - iki
:,- ita (as 2.3)
Ōgusuku (Kind.) ,- ,- - (iki)
Kamishiro - ,- ,- iki
(Shimabukuro) ,- ita (as 2.3)
Yoron22
Chabana (Thorpe) ,- as 2.1/2 - (iki)
Ritchō (Kind.) - as 2.1/2 - (iki)
16 Word-tones and vowel length in Okinawa and neighboring islands
Okinawa 2.1/2 2.3 2.4/5
Oku (Kind.) - ,- - (iki)
Hentona (Kōza; 11) - ,- - (iki)
Hanchi (Kind.) - - - (iki)
Ōgimi (Kind.) ,- - - (iki)
Kawakami (Kind.) ,- ,- - (iki)
Yonamine (Kind.) :- :,:- - (iki)
Yonamine (Hattori) : : iki
: ita (as 2.3)
Nakijin (Nakasone) : : (:) iki23
: ita (as 2.3)
Sakimotobu (Thorpe) - ,- - (iki)
Awa (Kind.) :- :,:- - (iki)
Nago (Thorpe) - ,- ,- (iki)
Kushi (Hattori) :~ as 2.3
Onna (Hattori) :: α : iki
:~ β :: ita (as 2.3α)
Ishikawa (Kind.) - - :,:- (iki)
Ōgi (Kind.) - ,- :,:- (iki)
Yagihara (Kind.) - - :- (iki)
Shuri (Kind.) ,- - :- iki
- ita (as 2.3)
Shuri (Thorpe) ,- - :- iki
- ita (as 2.3)
22 According to Uwano (1999b) part of Yoron Island has a three-way distinction.
23 When the final vowel is -a the vowel is lengthened. Disyllabic nouns that have monosyllabified
will have the following tone: 2.1/2 :, 2.3 :, 2.4/5 iki : ita : (as 2.3).
Shuri (Hattori) : iki
ita (as 2.3)
Arazato (Kind.) - - :- iki
- ita (as 2.3)
Nakandakari ,- - :- iki
(Kind.) - ita (as 2.3)
Higashi Kuchinda - - :,:- iki
(Kind.) - ita (as 2.3)
Itoman (Kind.) - as 2.1/2 :- iki
- ita (as 2.1/2/3)
Kume
Nakazato (Kind.) - ,- :,:- (iki)
17 Word-tones and vowel length in the Miyako Island Group
of the Sakishima Archipelago
Miyako 2.1/2 2.3 2.4/5
Miyako (Kōza; 11) - as 2.1/2 ,- (iki)
Ōra (Thorpe) - as 2.1/2 - (iki)
Irabu
Sarahama - as 2.1/2 -~:- iki
(Matsumori ’98) - ita (as 2.1/2/3)
Tarama
Tarama - ,- ,- iki
(Matsumori ’01) ,- ita (as 2.3)
18 Word-tones and vowel length in the Yaeyama Island Group
of the Sakishima Archipelago
Ishigaki 2.1/2 2.3 2.4/5
Ishigaki (Kōza; 11) - - as 2.3
Ishigaki (Thorpe) ,- - as 2.3
Ishigaki (Akinaga) - - as 2.3
Ōgawa (Kind.) ,- - as 2.3
Maezato (Akinaga) - - as 2.3
Taketomi
Taketomi (Kōza; 11) - - as 2.3
Taketomi (Thorpe) - - as 2.3
Taketomi (Akinaga) - - as 2.3
Kuroshima
Kuroshima (Akinaga) - as 2.1/2 - iki
- ita (as 2.1/2/3)
Kobama
Kobama (Kind.) - - as 2.3
Hatoma
Hatoma - ,- as 2.3
(Hirayama, 1988)
Hatoma (Akinaga) - ,- as 2.3
Iriomote
Komi (Kind.) - - as 2.3
Sonai - ,-24 as 2.3
(Hirayama, 1967a)
Hateruma
Hateruma - ,- ,- iki
(Hirayama, 1988) ,- ita (as 2.3)
Hateruma (Akinaga) - - as 2.3
Yonaguni
Sonai (Kind.) - - ,- (iki)
Sonai (Kōza; 11) -~- - -~- (iki)
Sonai (Akinaga) - - - iki
- ita (as 2.3)
Hikawa - - as 2.3
Yonaguni (Thorpe) - - ,- (iki)
Yonaguni - - ,- iki25
(Hirayama, 1988) - ita (as 2.3)
9.4.2 The geographical distribution of vowel length in the Ryūkyūs
Lengthened vowels can be found on Kakeroma, Tokunoshima and Okinawa. In the
rest of the Ryūkyūs, vowel length is absent. Within the area where vowel length can
be found, it is in most cases subphonemic. When vowel length occurs in the second
syllable for instance, it is never distinctive. Vowel length in the second syllable only
occurs in classes 2.1/2 and 2.3, and usually in both classes at the same time. (In such
dialects, in other words, all tone classes except the iki-group have automatic iambic
lengthening.)
When vowel length occurs in the first syllable it is always in the iki-group, and
not in other tone classes.26 There are two areas where vowel length in the first
syllable can be found, namely in part of Tokunoshima and in part of Okinawa.
24 But - in case the second syllable has developed into a dependent mora or in case of a
devoiced vowel in the second syllable.
25 Words belonging to the iki-group that have monosyllabified, such as ‘needle’ hai and ‘voice’
kui have a falling tone contour.
26 A small number of exceptions will be discussed in 9.7.1 and subsections.
On Tokunoshima, if a dialect has vowel length in the first syllable of the iki-
group, it will also have vowel length in the second syllable of classes 2.1/2 and 2.3,
and vice versa (i.e. iambic lengthening in all tone classes other than the iki-group).
On Okinawa the situation is different: In southern Okinawa, vowel length can be
found in the first syllable of the iki-group but not in the second syllable of class
2.1/2 and 2.3. In northern Okinawa on the other hand, vowel length can be found in
the second syllable of classes 2.1/2 and 2.3 (again iambic lengthening in all tone
classes other than the iki-group), but not in the first syllable of the iki-group.
In southern Okinawa, there are dialects where the iki-group and class 2.3 have
the same word-tone. (From Ōgi to Higashi Kuchinda.) In this area therefore, the
vowel length in the first syllable of the iki-group is distinctive. It is the presence of
this vowel length which distinguishes the iki-group from class 2.3.27
Vowel length in the first syllable of the iki-group is also distinctive in the
Matsubara Nishi-ku and Maeno dialects on Tokunoshima. The
area on Tokunoshima where such vowel length can be found is larger, but in these
two dialects the iki-group and class 2.1/2 have the same word-tone, so that the
presence of vowel length in the iki-group is what distinguishes this class from class
2.1/2.
In all other dialects that have vowel length – in the first syllable of the iki-group,
in the second syllable of classes 2.1/2 and 2.3 or in both – the vowel length is
subphonemic. This type of vowel length automatically accompanies certain word-
tones, so that word-tone and vowel length are tied together and cannot be separated.
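The criterion used in this section can be stated compactly: vowel length in the first syllable of the iki-group is distinctive only where the iki-group shares its word-tone with another class, so that length alone keeps the two apart. The following Python sketch is merely an illustration of that criterion. The function name and the tone labels T1, T2 and T3 are hypothetical placeholders, not transcriptions of actual pitch patterns.

```python
def length_is_distinctive(word_tones, iki_has_initial_length):
    """Return True if initial vowel length in the iki-group is distinctive.

    word_tones maps the labels '2.1/2', '2.3' and 'iki' to a word-tone label.
    Length counts as distinctive only if the iki-group has a long first
    syllable AND its word-tone coincides with that of another class.
    """
    if not iki_has_initial_length:
        return False
    others = (word_tones["2.1/2"], word_tones["2.3"])
    return word_tones["iki"] in others


# Hypothetical systems (labels T1..T3 are placeholders, not real pitch shapes).
# Southern Okinawa type: iki-group and class 2.3 share a word-tone.
southern_okinawa_type = {"2.1/2": "T1", "2.3": "T2", "iki": "T2"}
# Matsubara Nishi-ku / Maeno type: iki-group and class 2.1/2 share a word-tone.
tokunoshima_type = {"2.1/2": "T1", "2.3": "T2", "iki": "T1"}
# All three word-tones differ: length merely accompanies the iki word-tone.
subphonemic_type = {"2.1/2": "T1", "2.3": "T2", "iki": "T3"}

for name, system in [("southern Okinawa type", southern_okinawa_type),
                     ("Tokunoshima type", tokunoshima_type),
                     ("subphonemic type", subphonemic_type)]:
    print(name, length_is_distinctive(system, iki_has_initial_length=True))
```

In the first two systems the function returns True, in the third False, which mirrors the division made in the text between distinctive and subphonemic vowel length.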
9.4.3 Arguments against the idea that vowel length in the initial syllable
is original
One thing that becomes clear from tables 15 to 18 is that a [HL] word-tone for
the iki-group is the most widely attested reflex of this tone class throughout the
Ryūkyūs. This reflex is represented from the Amami Archipelago all the way to the
Miyako Island Group, and even Yonaguni.
This makes the idea that this tone class once started with [L] pitch and that the
present-day initial [H] pitch is the result of shortening of an originally long vowel
unlikely, as the exact same development must have occurred several times
independently. (One time in the area from Kikai and Amami-Ōshima to the northern
part of Tokunoshima, one time in the northern part of Okinawa, one time in Miyako
and Tarama and one time in Yonaguni.) The idea that the iki-group once started with
27 This is also the case in Nakazato on Kume Island, but the village of Nakazato is a so-called
yadori (ja:dui) settlement established by poor and/or unemployed samurai who emigrated from
the capital districts of Shuri and Naha from the middle of the 18th century on. The distinctive
vowel length on Kume Island can therefore be regarded as an offshoot of the Okinawan group.
Large areas of Okinawa itself are dotted with these kinds of settlements as well, and Hattori
(Wurm and Hattori 1981) stresses that one should be aware of the location of these settlements
because of the influence that they exerted on the local dialects.
[L] pitch is clearly inspired by the standard reconstruction of the Middle Japanese
tone system. The dialect geography of the Ryūkyūs itself argues against it.
Another problem with the idea that vowel length in the initial syllable of the iki-
group is old, and that [H] pitch on the initial syllable developed from it, has to do
with the fact that vowel length also occurs in a number of Chinese (Go-on)
loanwords that belong to the iki-group. (In Shuri for instance, maaku ‘curtain’,
hyaaku ‘hundred’ and haaci ‘begging bowl’.) It is interesting that these loanwords,
which in the dialects of mainland Japan belong to class 2.3, belong to class 2.4/5 in
the Ryūkyūs. (If their reflexes throughout the Ryūkyūs are regular enough they
should be added to the mari-group.)
Whatever the cause behind this shift may have been, the main point is that the
presence of vowel length in these loanwords casts doubt on the idea that vowel
length in the Ryūkyūs is old and goes back to proto-Ryūkyūan or proto-Japanese.
The fact that Yonaguni has /d-/ in loanwords that started with /y-/ in Middle
Chinese (cf. dasai ‘vegetables’) is a strong indication that /d-/ in Yonaguni does not
go back to proto-Japanese but is the result of an innovation. In a similar manner,
these examples of vowel length in loanwords from Early Middle Chinese suggest
that the Ryūkyūan vowel length is not original, but is the result of an innovation that
took place within the Ryūkyūs. (The unlikely alternative would be to reconstruct
vowel length in these words in Early Middle Chinese, based on the Ryūkyūan
examples.)
In this light, it is interesting to look at Kindaichi’s idea that vowel length in the
initial syllable in the Ryūkyūs developed from earlier tonal distinctions.
9.4.4 Kindaichi’s ideas on the origin of vowel length in the Ryūkyūs
Kindaichi (1975a) argued that the vowel length that can be found in the initial
syllable of the iki-group in part of Tokunoshima and Okinawa developed due to a
rightward shift of the [H] pitch on the initial syllable. This resulted in a rising
contour tone on the initial syllable, which caused the vowel to lengthen, such as in
the Kanami dialect on Tokunoshima. Kindaichi points to the dialect of Ninohe in
Iwate prefecture for comparison, as in this dialect a similar development can be
seen: Class 2.4 in this dialect has : pitch. (Kindaichi, 1975a (1983):141).
In order to accommodate a contour tone, the syllabic support is often lengthened.
As Odden (1999:209) points out: “Very many languages exhibit a one-tone-per-
mora restriction, so that contour tones can only appear on long vowels. A corollary
of that restriction is that if a language disallows short contours, but also for some
reason wants a contour tone in some position, then vowel lengthening may be
required to support this contour.” This observation agrees well with Kindaichi’s idea
of contour tones as the origin of the vowel length, especially as it can be seen from
the table that contour tones in the Ryūkyūs are indeed usually automatically
lengthened.
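Odden's one-tone-per-mora restriction can be illustrated with a small Python sketch. It is only a toy model: the class name Syllable, the function support_contours and the representation of tones as strings are assumptions made for the illustration, and the [L] pitch assigned to the second syllable in the example is likewise assumed, since the text specifies only the rising contour on the first syllable.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    tones: str      # "H", "L", or a contour such as "LH" (rising)
    moras: int = 1  # 1 = short vowel, 2 = long vowel


def support_contours(word):
    """Lengthen every syllable that carries more tones than it has moras.

    This enforces a one-tone-per-mora restriction: a contour (two tones)
    can only be realized on a syllable with at least two moras.
    """
    return [Syllable(s.tones, max(s.moras, len(s.tones))) for s in word]


# Hypothetical iki-group noun after a partial rightward shift of [H]:
# the first syllable carries a rising contour on a short vowel.
before = [Syllable("LH"), Syllable("L")]
after = support_contours(before)

print(after[0].moras)  # the first syllable is now long
print(after[1].moras)  # the second syllable is unchanged
```

Syllables that carry a single level tone are left untouched, so in this model lengthening appears only where a contour tone needs support.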
When the [H] pitch on the initial syllable continues to shift to the right, this can
result in [L] pitch on the first syllable and [H] pitch on the second syllable, while the
first syllable remains lengthened. This seems to have happened in a number of
dialects on Tokunoshima (Matsubara Nishi-ku, Maeno, Asama) and Okinawa (Onna,
Ishikawa, Ōgi, Itoman). It can be illustrated with an example of a similar
development in the dialect of Kunohe in Iwate. According to Uwano (1996:39) the
dialect of Kunohe is similar to Ninohe mentioned earlier, but Kunohe has an extra
form in free variation, in which the /H/ tone has shifted to the second syllable while
the vowel length on the initial syllable has remained: 2.4 matu ‘pine tree’
:~:.
Rightward shift could also result in identical pitch on the first and the second
syllable (with the first syllable remaining lengthened), such as we see on
Tokunoshima (Matsubara) and Okinawa (Yagihara, Shuri, Arazato, Nakandakari
and Higashi Kuchinda).
The development of vowel length in the initial syllable of the iki-group must
have happened in two areas in the Ryūkyūs independently: In part of Tokunoshima
and in part of Okinawa. In both areas there are dialects where the vowel length
became distinctive. In Tokunoshima this was because in some dialects, the iki-group
and class 2.1/2 developed the same word-tone, and in southern Okinawa this was
because in some dialects the iki-group and class 2.3 developed the same word-tone.
All things considered, the case for a development of vowel length from [H] pitch
in the Ryūkyūs is stronger than for a development of [H] pitch from vowel length:
The development of vowel length from [H] pitch need only have taken place twice
in the Ryūkyūs. Also, the fact that such developments do really occur in the
Japanese dialects is illustrated by means of indisputable examples from dialects in
Iwate.
As tables 15 to 18 show, if we follow the idea that [H] pitch developed from
vowel length in the Ryūkyūs, we have to assume that this happened independently a
number of times over. In the standard theory, the development of /H/ tone in tone
classes that earlier started with sequences of /L/ tone (whether with the help of
vowel length or not) is regarded as a very common occurrence. This is not because
there are many indisputable examples of such a development in the Japanese dialects.
The only reason why this development is regarded as common is that it must
have been so, given the standard reconstruction of the Middle Japanese tone system.
This tone system (and by extension the tone system of proto-Japanese) lacks /H/
tones in all the required places (cf. section 2.3.1) so that one is forced to assume that
these /H/ tones developed independently in many different dialects in Japan.
9.5 Rightward tone shift conditioned by vowel height
and the split in class 2.4/5
As the iki/ita split affects tone class 2.4/5 in the sense that half of this class merges
with class 2.3, it is natural to be reminded of those dialects in which vowel height
has influence on the location of the /H/ tone, namely the Gairin B dialects. In these
dialects it is also tone class 2.4/5 that has split, whereby – just as in the Ryūkyūs –
part of this class merged with class 2.3.
We have seen in chapter 7 how /H/ tone in these dialects shifts to the right, and
how this rightward shift is blocked when the second syllable contains a close vowel.
Although the word-tones of the different tone classes in the Ryūkyūs nowadays
differ from dialect to dialect, the split in class 2.4/5 could be explained if we assume
a situation in proto-Ryūkyūan where – just as in the Gairin B type dialects – class
2.4/5 had - tone, but where members of this class with open vowels in the
final syllable had merged with class 2.3 by shifting the /H/ tone from the first to the
second syllable (- > -).
This idea originated with Hirayama (Hirayama, Ōshima & Nakamoto eds,
1966:14-15) and was supported by Kindaichi (1975a (1983):138-142). Hirayama
and Kindaichi regarded the similarity between the Gairin B type dialects and the
Ryūkyūan dialects as independent parallel developments.
Tokugawa (1990:256) later compared the tone system of Yonaguni in the
extreme southwest of the Japanese language zone with the tone system of Akita in
the northeast. (See table 19.)28 Not just in the case of Akita, but in the case of Yonaguni as
well, Tokugawa regarded the split of class 2.4/5 as related to the quality of the vowel
in the final syllable. Unlike Kindaichi however, Tokugawa suggested that the
similarity between the tone systems of the two dialects was not the result of
coincidence, but had to be ‘traced back far into the history of the Japanese language’.
If the similarity of the split in class 2.4/5 is not the result of coincidence, this split
must have formed part of the language of the settlers that brought the Japanese
language to the Ryūkyūs. Could the split in class 2.4/5 be traced back to the fact that
at least part of the people that settled in the Ryūkyūs were speakers of a Gairin B
type dialect?
19 Comparison of the iki/ita split and the Gairin B tone system
Akita Tokuwase Yonaguni
2.1/2 - - -
2.3 '- - -
2.4/5 '- A (> 2.3) - ita (> 2.3) - ita (> 2.3)
'- I - iki , - iki
Uwano (1996:32) on the other hand criticized the assumption that the split in class
2.4/5 in the Ryūkyūs was related to vowel quality. According to Uwano, it is at most
possible to speak of a tendency and certainly not of a phonological rule as there are
far too many exceptions.
28 I have added data from the dialect of Tokuwase on Tokunoshima (Matsumori, 1998b) to
Tokugawa’s comparison.
The tables in section 9.3.2 confirm Uwano’s assertion that the split between the
iki-group and the ita-group is not clearly based on the type of vowel in the second
syllable in any of the Ryūkyūan dialects. Of the 23 examples of nouns with an open
vowel (a, e, o) in the second syllable in Middle Japanese for instance, approximately
half have merged with class 2.3, while the other half have not. The merged items are ita
‘board’, kasa ‘umbrella’, wara ‘straw’, tane ‘seed’, miso ‘beanpaste’, kado ‘corner’,
kata ‘shoulder’, ase ‘sweat’, ame ‘rain’, momo ‘thigh’, aha ‘millet’, ine ‘rice plant’
(12 examples). The unmerged items are naka ‘inside’, hera ‘spatula’, hune ‘boat’,
kage ‘shadow’, kumo ‘spider’, koe ‘voice’, nabe ‘pot’, muko ‘bridegroom’, oke ‘tub’,
yado ‘shelter’, ito ‘thread’ (11 examples).
The reflexes of the 20 examples of nouns with a close vowel (i or u) in the
second syllable in Middle Japanese on the other hand, do not appear to be
completely random. Only 6 examples have merged with class 2.3, whereas 14 have
stayed in class 2.4/5 (70%). The merged items are siru ‘soup’, kasu ‘dregs’, nomi
‘chisel’, yoru ‘night’, mugi ‘wheat’, uri ‘melon’. The unmerged items are iki
‘breath’, usu ‘mortar’, umi ‘sea’, oku ‘interior’, hasi ‘chopsticks’, hari ‘needle’, matu
‘pine tree’, saru ‘monkey’, tabi ‘socks’, tuyu ‘dew’, sudi ‘sinew’, obi ‘girdle’, kazu
‘number’, nusi ‘owner’. It is therefore not impossible that the presence of a close
vowel in the second syllable played some role in preventing the merger of members
of class 2.4/5 with class 2.3.29
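The counts given in the last two paragraphs can be checked with a short tally. The Python sketch below is only an illustration: the word lists are copied from the text, while the function names and the classification by the last letter of the romanized form are assumptions made for the example.

```python
from collections import Counter

# Class 2.4/5 nouns as listed in the text (Middle Japanese forms).
MERGED = [  # merged with class 2.3 (ita-group)
    "ita", "kasa", "wara", "tane", "miso", "kado", "kata", "ase",
    "ame", "momo", "aha", "ine",
    "siru", "kasu", "nomi", "yoru", "mugi", "uri",
]
UNMERGED = [  # remained in class 2.4/5 (iki-group)
    "naka", "hera", "hune", "kage", "kumo", "koe", "nabe", "muko",
    "oke", "yado", "ito",
    "iki", "usu", "umi", "oku", "hasi", "hari", "matu", "saru",
    "tabi", "tuyu", "sudi", "obi", "kazu", "nusi",
]


def height(word):
    """Classify a noun by the height of its final vowel (i, u = close)."""
    return "close" if word[-1] in "iu" else "open"


def tally(merged, unmerged):
    """Count nouns per (vowel height, outcome) combination."""
    counts = Counter()
    for word in merged:
        counts[(height(word), "merged")] += 1
    for word in unmerged:
        counts[(height(word), "unmerged")] += 1
    return counts


def retention_rate(counts, vowel_height):
    """Share of nouns with this vowel height that stayed in class 2.4/5."""
    kept = counts[(vowel_height, "unmerged")]
    return kept / (kept + counts[(vowel_height, "merged")])


counts = tally(MERGED, UNMERGED)
for h in ("open", "close"):
    total = counts[(h, "merged")] + counts[(h, "unmerged")]
    print(f"{h}: {counts[(h, 'unmerged')]} of {total} retained "
          f"({retention_rate(counts, h):.0%})")
# open: 11 of 23 retained (48%)
# close: 14 of 20 retained (70%)
```

The tally reproduces the figures in the text: the open-vowel nouns divide almost evenly, whereas the close-vowel nouns show a clear majority in favor of retention.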
However, as Matsumori (2008) has pointed out, there is a very similar split in
class 3.4/5 in the Ryūkyūs: Half of the nouns of this class have merged with class
3.6/7. It is impossible to link this split to a Gairin B type tone system, as in such tone
systems, it is class 3.6/7 that has split (and 3.5 as well), but not class 3.4.
9.6 Possible explanations for the iki/ita split compared
If the split in tone class 2.4/5 in the Ryūkyūs were regularly based on vowel height,
as it is in the Gairin B type dialects (although in these dialects too, there are many
irregularities, especially in Shimane prefecture), its origin could either be explained
as an independent parallel development (such as Kindaichi does) or it could be
explained as the result of a genetic relationship with the Gairin B type dialects.
In the Ryūkyūs however, the phonological basis for the split in class 2.4/5 – if
recognizable at all – is at best severely blurred. A further problem is that this idea
29 The vowels e and o of mainland Japanese have raised, and merged with i and u in the Ryūkyūs.
The percentage of nouns that have remained in class 2.4/5 differs depending on whether i or u
in the final syllable in the Ryūkyūs goes back to e or o or to i and u: 14 out of 20 examples in
which i and u go back to i and u have remained in class 2.4/5 (70%), while 9 out of 16 words in
which i and u go back to e and o have remained in class 2.4/5 (56%). The fact that there is such
a difference indicates that – if the merger of part of class 2.4/5 with class 2.3 was conditioned
by vowel quality – this merger must have occurred before the Ryūkyūan vowel raising.
fails to explain the very similar split in class 3.4/5. With the lack of a phonological
basis for the division, what possible explanations for the iki/ita split are left?
9.6.1 Extra tone classes in proto-Japanese
If one chooses to reconstruct extra tone classes in proto-Japanese to account for the
iki/ita split, the only way in which this can be done is by the inclusion of contour
tonemes in half of class 2.4/5 (and in a very small part of class 2.3) and in half of
class 3.4/5. The possibilities in case of disyllabic nouns are for instance: iki * >
but ita * > , or iki * but ita * > * > .
From the distribution and frequency of the different tonemes in Middle Japanese,
it is clear that the basic tonal opposition was between the two level tones /H/ and /L/.
The contour tones appear to be the result of contractions. There is nothing in the
oppositions of Middle Japanese that would lead us to reconstruct primary contour
tones in proto-Japanese. This means that if we want to account for the iki/ita split in
the Ryūkyūs by reconstructing contour tones in either the iki-group or the ita-group
of class 2.4/5, we would have to reconstruct at least 25 cases of
contractions in proto-Japanese, based on the examples that are included in tables 5 to
14 in this chapter alone.
In light of this, reconstructing the ita-group with * tone would probably be
the best option, as in that case the contour tone could be attributed to a suffix. But
what could the semantic load of this suffix have been? As mentioned in section 8.3,
a large part of class 2.5 (which may incorporate a former suffix), consists of the
names of small plants and animals. Between the iki group and the ita group however,
it is hard to find a semantic distinction.
9.6.2 Vowel length distinctions in proto-Japanese
Following Ramsey’s theory, the - word-tone that the iki-group of class 2.4/5
has in many Ryūkyūan dialects agrees closely with the reconstructed tone system of
proto-Japanese, and it is not necessary to reconstruct vowel length in order to
explain it. It is, however, possible to combine Ramsey’s theory with Hattori’s idea
that the vowel length that can be found in this part of the vocabulary in some
Ryūkyūan dialects goes back to proto-Ryūkyūan and proto-Japanese.
One could assume for instance that the partial rightward tone shift in class 2.4/5
was conditioned by the presence or absence of vowel length in the initial syllable.
Vowel length in the initial syllable of the iki-group could have had the effect of
preventing the rightward shift of [H] pitch from the first to the second syllable. This
would have prevented the merger of the iki-group of class 2.4/5 with class 2.3. In
trisyllables, vowel length in the second syllable could have prevented the merger of
class 3.4/5 with class 3.6/7.
The proto-Japanese vowel length would then have been lost in the dialects of
mainland Japan, as well as in most Ryūkyūan dialects. In some Ryūkyūan dialects
on the other hand, the development was different: In these dialects the vowel length
was preserved, while the original location of the /H/ tone was lost.
A problem with this reconstruction is that there are hardly any dialects that have
preserved the iki reflex as :, as tables 15 to 18 show. The vowel length is
typically found in dialects that have shifted the [H] pitch away from the first syllable.
Because of the strong correlation between rightward shift and vowel length that
appears from the tables, I find Kindaichi’s idea that vowel length is a secondary
development more convincing.
Another problem is that the distribution of the vowel length in proto-Japanese
remains problematic: Why did it only occur on the penultimate syllable, and why
only on syllables with /H/ tone? It cannot have been related to stress, as stress-accent
is usually obligatory, and would not be limited to a small part of the vocabulary only.
20 Vowel length distinctions as the origin of the iki/ita split
Proto- Proto-Gairin/ Proto-
Japanese Ryūkyūan Ryūkyūan
(mergers) (word-tones)
2.1 A
2.2 > A
2.3 > B
: > : > : C
2.4 > B
: : : C
2.5 > > B
: > : : C
3.1 A
3.2 > A
3.4 > B
: > : > : C
3.5 > > B
: > : : C
3.6 > C
3.7 > > C
9.6.3 Dialect interference in the development of proto-Ryūkyūan
There is one more possibility to consider, and that is that the split in tone classes
2.4/5 and 3.4/5 in proto-Ryūkyūan is the result of contact between dialects with a
different merger pattern of the tone classes. Although all Ryūkyūan dialects do seem
to go back to a single proto-language, this does not preclude the possibility that this
proto-language itself was the result of dialect mixing.
Matsumori (1997:64) has shown how in the dialect of Wakimachi on Shikoku
tone class 2.3 has split. In this dialect, approximately half of class 2.3 has merged
with class 2.1 and the other half has merged with class 2.2, without it being possible
to discern a phonological or semantic basis for the split. As Wakimachi is located in-
between an area with a Kyōto type tone system in which tone class 2.3 has merged
with class 2.2, and an area with a Sanuki type tone system, in which tone class 2.3
has merged with class 2.1, the most likely explanation for the split in this tone class
is dialect interference. Thanks to the fact that the two tonal types that influenced
Wakimachi have survived in the region, it is possible to see that the split in the
reflexes of class 2.3 in Wakimachi is the result of dialect contact, and does not
require the reconstruction of vowel length distinctions or an additional tone class in
proto-Japanese.
Similar dialect mixing is reported by Uwano (1981). In Mushū-iwato in Niigata
prefecture, which is located in-between areas with a Chūrin type tone system to the
south and areas with a Gairin type tone system to the north, half of the nouns in class
2.2 have merged with class 2.1, while the other half has merged with class 2.3.
Moving northward from there towards the areas with a pure Gairin type tone system,
the number of nouns of class 2.2 that merge with class 2.1 gradually increases. (See
section 3.3.2.)
It is possible that the atypical division of the nouns over the different tone classes
in proto-Ryūkyūan can likewise be attributed to interference between two different
tone systems that were once located in adjacent areas. I will assume that both types
were once spoken in southwest Kyūshū, the most likely starting point of migration
to the Ryūkyūs.
As I have argued in section 9.1, rightward tone shift forms a likely intermediate
stage in the shift from syllable-tone to word-tone. The tone system that formed the
starting point of the formation of the tone system of proto-Ryūkyūan (Component 1)
is therefore reconstructed as a Gairin type tone system that has gone through
rightward tone shift in trisyllabic nouns. This reconstruction is based on two things:
We have seen in section 7.1.1 that rightward tone shift has a tendency to start in
longer words. Furthermore, in the dialect of San on Tokunoshima (cf. section 9.1),
just such a development can be seen: Rightward tone shift has occurred in trisyllabic
nouns but shorter nouns have not yet been affected.
The location of the /H/ tone in Component 1 is therefore based on San.30 I regard
this dialect as an archaic Ryūkyūan type in which the shift to word-tone has not been
completed, as the tone patterns of longer words show that the location of the /H/
tone in the word is still distinctive.
The original division into tone classes of Component 1 was interfered with by a
Kagoshima type dialect with only two word-tones (Component 2). The word-tones
of Component 2 have been modeled after the Kagoshima type tone system of
Makurazaki from the southwestern tip of Kyūshū, which contains a pitch fall in
word-tone B. The merger pattern of this tone system was A 2.1/2 vs. B 2.3/4/5 in
disyllabic nouns, and A 3.1/2 vs. B 3.4/5/6/7 in trisyllabic nouns.
30 A difference is that Component 1 did not yet have the split in classes 2.4/5 and 3.4/5, while
San does.
The division into tone classes of Component 2 interfered with adjacent
Component 1 in the following way: Nouns with word-tone B from Component 2
were adopted into classes 2.3 and 3.6/7 of Component 1. This is because word-tone
B was phonetically similar to the tone of these classes.
Members of tone-class B were not adopted into classes 2.4/5 or 3.4/5 of
Component 1, as in this case there was no phonetic similarity between the tone
classes. As shown in (21), the interference resulted in a tone system with the merger
patterns that are typical of proto-Ryūkyūan: Classes 2.4/5 and 3.4/5 have split, while
classes 2.3 and 3.6/7 have not.
21 Dialect interference as the origin of the iki/ita split
Component 2 Component 1 Proto-Ryūkyūan
(Makurazaki type) (Gairin with rightward (Tokunoshima type)
tone shift in trisyllables)
2.1/2 - A → 2.1/2 - > 2.1/2 -
2.3/4/5 - B → 2.3 - > 2.3/4/5 -
2.4/5 - > 2.4/5 -
3.1/2 - A → 3.1/2 - > 3.1/2 -
3.4/5 - > 3.4/5 -
3.4/5/6/7 - B → 3.6/7 - > 3.4/5/6/7 -
The main point is that there must have been a phonetic similarity between certain
tone classes in the two components, but not between others.
In concrete terms: The interference of Component 2, in which the word-tone of
class 2.4/5 was phonetically identical to the tone of class 2.3 in Component 1 may
account for the merger of half of class 2.4/5 with class 2.3 (ita ‘board’, kasa
‘umbrella’, wara ‘straw’, tane ‘seed’, miso ‘beanpaste’, kado ‘corner’, kata
‘shoulder’, ase ‘sweat’, ame ‘rain’, momo ‘thigh’, aha ‘millet’ and ine ‘rice plant’).
However, the divisions of Component 2 were not adopted completely (at least not
yet), and half of class 2.4/5 remained in class 2.4/5 (kumo ‘spider’, muko
‘bridegroom’, hune ‘boat’, koe ‘voice’, yado ‘shelter’, nabe ‘pot’, kage ‘shadow’,
oke ‘tub’, naka ‘inside’ and hera ‘spatula’). A similar process led to the split of class
3.4/5.
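The mechanism described here can be sketched for disyllabic nouns in a few lines of Python. This is a toy model under stated assumptions, not a reconstruction procedure: the names COMPONENT2_TONE, ADOPTED_AS and proto_ryukyuan_class are hypothetical, and which nouns took over the Component 2 classification is simply read off the word lists in the text, since the hypothesis itself does not predict it.

```python
# Component 2 (Kagoshima type): two word-tones, A = 2.1/2 and B = 2.3/4/5.
COMPONENT2_TONE = {"2.1/2": "A", "2.3": "B", "2.4/5": "B"}

# Component 1 class into which a noun with that word-tone is adopted,
# on the basis of phonetic similarity (B resembles class 2.3).
ADOPTED_AS = {"A": "2.1/2", "B": "2.3"}


def proto_ryukyuan_class(component1_class, diffused):
    """Class of a noun after interference.

    diffused is True if the noun's classification was taken over from
    Component 2, and False if the Component 1 classification was kept.
    """
    if not diffused:
        return component1_class
    return ADOPTED_AS[COMPONENT2_TONE[component1_class]]


# Class 2.4/5 nouns from the text: those that merged with class 2.3 ...
ITA_GROUP = ["ita", "kasa", "wara", "tane", "miso", "kado",
             "kata", "ase", "ame", "momo", "aha", "ine"]
# ... and those listed in this paragraph as having remained in class 2.4/5.
RETAINED = ["kumo", "muko", "hune", "koe", "yado",
            "nabe", "kage", "oke", "naka", "hera"]

lexicon = {w: proto_ryukyuan_class("2.4/5", diffused=True) for w in ITA_GROUP}
lexicon.update(
    {w: proto_ryukyuan_class("2.4/5", diffused=False) for w in RETAINED}
)

print(lexicon["ita"])   # 2.3
print(lexicon["kumo"])  # 2.4/5
# Classes 2.1/2 and 2.3 map onto themselves, so they cannot split:
print(proto_ryukyuan_class("2.3", diffused=True))    # 2.3
print(proto_ryukyuan_class("2.1/2", diffused=True))  # 2.1/2
```

The asymmetry follows directly from the mapping: only class 2.4/5 has a different outcome depending on whether a noun was affected by the interference, which is why this class splits while classes 2.1/2 and 2.3 remain intact.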
The resulting proto-Ryūkyūan tone system has no initial /H/ tone in words longer
than 2 syllables. This may explain why vowel length in longer nouns is missing in
the Ryūkyūs, even in those dialects that do have vowel length in the initial syllable
of the iki-group. (Vowel length in the second syllable of trisyllabic nouns on the
other hand, does occur, although not phonemically.)
In the case of Wakimachi and Mushū-iwato, the tone systems that contributed to the
dialect mixing – as well as the intermediate tone system that resulted – are still
represented in the respective areas. The preservation of all varieties in these
instances of dialect contact may be due to the fact that none of the tone systems
involved had a simpler set of distinctions than the others. For one of the systems to
replace the others it would have been necessary for speakers of the other varieties to
master a new set of correspondences that was as complex as their own.
This was not the case in Kyūshū, as there the Kagoshima type two-way
distinction of Component 2 was clearly the simplest. There was no need to
master a completely novel set of reflexes if this system was adopted. All that
speakers of Component 1 had to do was to erase previously existing distinctions (i.e.
the distinction between 2.4/5 and 2.3, and between 3.4/5 and 3.6/7), not master new
ones.
I therefore think that this is what eventually happened: In the end, all of class
2.4/5 merged with class 2.3, and all of class 3.4/5 merged with class 3.6/7, so that
the shift to a Kagoshima type word-tone system was complete.
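The idea that adopting the Kagoshima type system only required erasing distinctions can be made explicit with a small sketch. The mapping below follows the A/B assignments of Component 2 in (21); the names `KAGOSHIMA_TYPE` and `erased_distinctions` are hypothetical and serve only as illustration.

```python
# Word-tone (A or B) that each Component 1 class receives once the
# Kagoshima type two-way system of Component 2 has been adopted, as in (21).
KAGOSHIMA_TYPE = {
    "2.1/2": "A", "2.3": "B", "2.4/5": "B",
    "3.1/2": "A", "3.4/5": "B", "3.6/7": "B",
}


def erased_distinctions(mapping):
    """Pairs of classes of equal syllable count that end up with the same word-tone."""
    classes = sorted(mapping)
    return [
        (a, b)
        for i, a in enumerate(classes)
        for b in classes[i + 1:]
        if a[0] == b[0] and mapping[a] == mapping[b]
    ]


print(erased_distinctions(KAGOSHIMA_TYPE))
```

The two pairs returned are the distinctions named in the text: 2.3 versus 2.4/5, and 3.4/5 versus 3.6/7. No class has to be divided, so no new correspondences have to be learned.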
Before this process was completed, however, the dialect was exported to the
Ryūkyūs while still at the intermediate stage. The formation of the proto-Ryūkyūan
tone system could therefore be described as a process of lexical diffusion through
dialect contact that was frozen before it reached completion.
The intermediate variety (proto-Ryūkyūan) survived in the Ryūkyūs but not on
Kyūshū because of the presence on Kyūshū of the Kagoshima type tone system.
Because of its simple two-way distinction, this tone system
eventually absorbed the intermediate variety. (And this tone system is, of course,
still typical of southwestern Kyūshū.)31
The advantage of the dialect contact hypothesis is that it is possible to explain
the distribution of the lexical items over the different tone classes in the Ryūkyūs
without having to reconstruct extra tone classes or vowel length distinctions in
proto-Japanese, solutions which are each in their own way problematic (cf. sections
9.6.1 and 9.6.2).
9.7 Martin’s idea of /L/ tone as a concomitant of vowel length
in proto-Japanese
Hattori reconstructed both distinctive vowel length and distinctive tone in proto-
Japanese. Part of his reconstructed vowel length functioned to explain why – at
some point – /H/ tone developed in certain tone classes that did not yet contain /H/
tone in proto-Japanese according to the standard theory. Another part of his
reconstructed vowel length functioned to explain why – throughout most of Japan –
/H/ tone developed in the Tōkyō type location of the word. Other scholars, like
Uwano, Matsumori and Shimabukuro, reconstruct vowel length in proto-Ryūkyūan
(and sometimes also in proto-Japanese) only in case of the first syllable of the iki-
group and the mari-group. (I.e. they adopt the first part of Hattori’s reconstructed
vowel length but not the second.)
31 It is possible that the intermediate variety did not develop on Kyūshū itself, but that Component
1 and Component 2 both made their way from Kyūshū to the Ryūkyūs. The mixed Ryūkyūan
proto-dialect could subsequently have evolved on the island of Okinawa, from where
permanent settlement of the other islands may have begun. (In that case however, the chances
are that Component 2 would have absorbed Component 1 on Okinawa completely.) A thorough
comparison of the merger patterns in the Ryūkyūs and their geographical distribution may be
able to shed light on the question of where the mixed tone system of proto-Ryūkyūan evolved.
Martin (1987) on the other hand, sees a much closer link between tone and vowel
length in proto-Japanese than Hattori and the others. Martin proposed the idea that
the primary phonetic manifestation of initial /L/ tone in proto-Japanese may have
been vowel length. Martin reconstructs as vowel length not only initial /L/ tone, but
also the reversion to /L/ tone in nouns of class 2.5 and 3.7 (in the standard
reconstruction of the Middle Japanese tone system). In other words, his
reconstruction of vowel length is completely based on the tone system, and is not
reconstructed independently.
This hypothesis was primarily inspired by the long vowels that can be found in
the initial syllable in Shuri and other Ryūkyūan dialects in the iki-group of class
2.4/5 and the mari-group of class 2.3. (Martin does not address the iki/ita split, and
so he reconstructed vowel length in the initial syllable of the entire classes 2.3 and
2.4/5.) As additional evidence for a link between /L/ tone and vowel length, Martin
(1987:248) mentions an interesting phenomenon from the southern part of the Noto
peninsula:
According to Iwai Ryūsei (in Tōjō, 1961:3: 94-98) the dialect of Oshimizu in
Hakui-gun of Ishikawa prefecture has a long first syllable in almost all words
of type 2.4 (such as kama ‘sickle’ and umi ‘sea’) and quite a few in type 2.5
(aki ‘autumn’ and huna ‘crucian carp’) with the pattern L:L or L:H in free
variation; there are also some examples of L:LL such as karasu ‘crow’ (type
3.6). (As in Tōkyō type 2.3 has merged into 2.2.) This would seem to be
excellent independent evidence of the antiquity of vowel length for the ‘low’
pitch.
As we have seen in section 6.2 on the tone systems of the Noto dialects however,
these dialects originally had /H/ tone on the initial syllable in these tone classes, a
/H/ tone that was later lost. The vowel length may be the result of compensatory
lengthening due to the loss of the /H/ tone. (If anything, vowel length may actually
have been a concomitant of /H/ tone in these dialects.)
Martin himself also expresses reservations as to how clear an indication for the
antiquity of vowel length this is, as he continues to say: “It is disturbing that the
Komatsu dialect has vowel length in the initial syllable of those words of type 2.2
(hasi ‘bridge’, mati ‘town’, hiru ‘daytime’) and type 2.3 (asi ‘foot’, kutu ‘shoe’,
mimi ‘ear’) that end in a high vowel, the pattern being H:L.” Not only does the
vowel length here occur in the wrong tone class (2.2), it is also again related to /H/
tone.32
32 The Komatsu dialect is not a Noto type dialect. According to Uwano’s map the Komatsu
dialect is located in the area with Kaga type tone. In this tonal type class 2.2/3 has merged with
class 2.1, which does not seem to be true (or at least is not mentioned by Martin) in case of the
Komatsu dialect. I assume that the tone pattern of the nouns with a final open vowel in these
dialects is different from that of nouns with a final close vowel, and that the difference is not
only in the vowel length. In the Gairin B type dialects close vowels like i and u block rightward
shift of /H/ tone; they do not usually cause /H/ tone to shift to the left. Kaga however, is
located close to the Noto and Toyama dialects where close vowels do sometimes cause /H/ tone
to shift to the left.
Based on Ramsey’s reconstruction, it would be possible to reverse the
connection proposed by Martin and regard vowel length as the primary phonetic
manifestation of /H/ tone. As I have pointed out in this chapter however, even in the
Ryūkyūan dialects, the evidence for proto-Ryūkyūan vowel length is not strong. It
can be explained as a regional innovation that does not have to be projected back
onto proto-Ryūkyūan.
The reconstruction of vowel length in proto-Japanese in the locations that Martin
indicates, is a means to simplify the tone system. But the tone system of Middle
Japanese is not overly complicated for a register tone language.
Martin too, remarks upon the fact that this merely shifts the problem to
explaining the peculiar distribution of the long vowels and the question of their
source. Another problem is that the reconstruction of proto-Japanese vowel length in
the initial syllable of the entire classes 2.3, 2.4 and 2.5 ignores the split that can be
seen in these classes in the Ryūkyūs.
9.7.1 Vovin’s evidence for vowel length in proto-Okinawan
Vovin (1993b) supports Martin’s reconstruction of proto-Japanese vowel length in
the initial syllable of classes that started with /L/ tone in proto-Japanese, such as 2.3
and 2.4/5, and in the second syllable of class 2.5. Like Martin, Vovin argues that
remnants of this proto-Japanese vowel length have been preserved in Okinawa,
based on a comparison of a number of Okinawan dialects, such as Shuri, Nakijin,
Onna and Ō(mu).33 While Martin based his reconstruction on the reflexes of the iki-
and mari-groups, Vovin also includes examples that fall outside of these groups. (In
the following sections the proto-Japanese tone indicated is based on the standard
reconstruction of the Middle Japanese tone system, to which Vovin adheres.)
9.7.1.1 Vovin’s examples
Vovin presents 14 examples of vowel length in the initial syllable, of which 6
belong to the iki-group of class 2.4/5.34 (As we have seen, such vowel length is the
regular reflex of this class in part of Okinawa.) However, apart from the examples
that show the regular iki-group reflex, Vovin also includes examples of class 2.5 that
have vowel length not only in the first, but also in the second syllable. The nouns of
class 2.3 with vowel length in the initial syllable that Vovin includes are also special,
as they do not belong to the mari-group in most Ryūkyūan dialects (‘pigeon’, ‘bet’,
‘turtle’, ‘axe’).
Vovin furthermore includes a number of examples in which the initial vowel
length occurs in nouns of other tone classes, such as class 2.1 (‘mold’, ‘heron’) and
2.2 (‘rainbow’). In these cases the occurrence of vowel length is not automatically
part of the realization of the word-tones. In addition ‘heron’, ‘turtle’ and ‘pigeon’
also have vowel length in the second syllable, just as part of class 2.5. This is not
automatically part of the realization of the word-tones either.
Vovin does not address the fact that the reconstruction of vowel length in the
initial syllable of classes 2.1 and 2.2 does not agree with Martin’s ideas, as these
classes started with /H/ tone and not with /L/ tone in Middle Japanese.35
9.7.1.2 Amendments to Vovin’s examples
Vovin uses the notations A (2.1/2), B (2.3 and the ita-group of class 2.4/5) and C
(the iki-group of class 2.4/5) to indicate the different word-tone classes in the
dialects. In case of Shuri, the word-tone of class 2.3 as well as class 2.4/5 is level,
and Vovin’s division is therefore A 2.1/2 vs. B 2.3/4/5. As we have seen however,
level tone with vowel length in the initial syllable is the regular reflex in Shuri of the
iki-group, i.e. class C. I have therefore indicated C as the tone class of these
examples in (22).
Shimabukuro (1997) analyzes the tone of ‘mold’ haabui and ‘shadow’
k’aagaa in the Nakijin material as C and not A, and so does Martin (1987). The
same is true for ‘bet’, which is also listed with word-tone A instead of C in Vovin’s
list.36 In (22) I have therefore indicated C as the tone class of these examples.
33 Vovin’s data for Shuri are from Okinawa-go jiten (1963), for Nakijin from Nakasone (1983),
for Onna from Hattori (1978) and for Ōmu from Hirayama (1967b). (According to
Shimabukuro the last dialect should be called Ō.)
34 I do not adopt the word niibi from Vovin’s list, which occurs in Shuri and Nakijin. In Shuri it
means ‘hard red soil’ and in Nakijin ‘sandstone’. It has no clear cognate in Japanese and may
be a compound. I have also removed the word for ‘carp’ kuu/juu from Vovin’s list, as it is
probably not related to Japanese koi (< kohi < *kwopi, class 2.5), but instead may be a
compound of the words ‘small’ and ‘fish’ (Martin, 1987:454).
35 Shimabukuro (1997) therefore criticizes Vovin’s approach, saying that he does not take initial
/L/ tone in proto-Japanese into account. In his own article Shimabukuro therefore only includes
examples of Ryūkyūan vowel length in nouns of classes 2.3 and 2.4/5.
36 In Nakijin, the realization of the word-tones, and the occurrence of vowel length is influenced
by vowel quality and the length in feet. (A foot can consist of a single heavy syllable, two light
syllables or a light syllable followed by a heavy syllable. Monosyllables are automatically
lengthened.) Unless indicated otherwise, the vowel length occurring in the description of the
word-tones below is automatic:
In word-tone A, the syllable containing the second mora has [H] pitch, all other moras have [L]
pitch (but the last complete foot has [HL] pitch):
: 1.2 paa ‘leaf’, 2.1 tui (Jap. tori) ‘bird’ : 2.1 hazii ‘wind’, 2.2 /isii ‘stone’, 3.2 kazai (Jap.
kazari) ‘ornament’ : 3.1 kataaci ‘shape’, 3.2 sakuura ‘cherry blossom’, 3.3 cikaara
‘strength’. The word saazai ‘heron’, which has underlying vowel length in the first and the
second syllable is realized with :: pitch.
In word-tone B, the final mora of the first foot has [H] pitch. If the word is longer than one foot
the first foot will have [L] pitch, while subsequent feet or extrametrical syllables have [H]
pitch:
: 1.3 kii ‘tree’, 2.3 naa (Jap. nawa) ‘rope’ : 2.3 panaa ‘flower’ : when the second
vowel is a as in 3.4 takaara ‘treasure’ and : when the second vowel is u or i as in 3.4
tanumii ‘request’. The words pootuu ‘pigeon’, haamii ‘turtle’, and uunuu ‘axe’, which have
underlying vowel length in the first syllable are realized with :: pitch.
In word-tone C, the last complete foot has [HL] pitch. If the word is longer than one foot the
first foot will have [L] pitch, while subsequent extrametrical syllables have [H] pitch:
: 2.4 pai (Jap. hari) ‘needle’, 2.5 kui ‘voice’ in case the final vowel is i or u, as in 2.4
/ici ‘breath’, 2.5 muhu ‘bridegroom’ but : in case the final vowel is a as in 2.4 nahaa
‘inside’. : 3.6 unaazi ‘eel’, 3.4 kutuuba ‘word’. The following words have underlying
vowel length in the first as well as the second syllable, and are realized with :: pitch: ‘rice
cake’ muucii (<*motipi), ‘bet’ k’aakii, ‘mold’ haabui, ‘reflection’ k’aagaa. The variant
k’aagi, but also for instance 3.4 haara ‘roof tile’ (Jap. kahara) and 3.3 aabi (Jap. ahabi) have
underlying vowel length in the initial syllable only and are realized with : pitch.
I have also adopted Shimabukuro’s proto-Okinawan reconstruction
*kaaboori/*kaabuuri instead of Vovin’s *kaabui. (Shimabukuro reconstructs
*kaaboori/*kaabuuri based on an (unpublished) etymology by Serafim, 1993.)
As ‘bet’ has word-tone C in both Shuri and Nakijin, I have changed Vovin’s
reconstructed proto-Okinawan tone class from B to C. Furthermore, the word-tone
of the words ‘shadow’ and ‘monkey’, which is : in the dialect of Onna in fact
belongs to type C and not type B, as Vovin indicates. (See section 9.3.2 on the
tone system of Onna.)
For ‘pigeon’ I have adopted Shimabukuro’s proto-Okinawan reconstruction
*pootu, instead of Vovin’s *paatu. The word ‘pigeon’ is given as hooto in Shuri and
Ō(mu) by Vovin. However, according to Okinawa-go jiten it is hootu in Shuri and
according to Shimabukuro, it is also hootu in Ō(mu). The vowel -aa- in Vovin’s
reconstruction of ‘pigeon’ *paatu is clearly based on the Japanese reflex hato, as
Okinawa only has forms with -oo- (hootu and p’ootuu).37
In case of ‘heron’, ‘shadow’ and ‘monkey’, the Shuri data of Okinawa-go jiten
include forms with vowel length only in the first syllable, but also forms with vowel
length both in the first and in the second syllable. In case of ‘shadow’ and ‘monkey’
Vovin only adduces the examples with vowel length in both syllables. I have added
all of the attested forms of ‘shadow’, ‘monkey’ and ‘heron’ to the list in (22), as a
comparison of the two forms shows that the form with the long vowel in the second
syllable was most likely derived from the regular form by the addition of a suffix.
(In some cases the meaning of the exceptional form appears to be derived.)38
37 There are a number of inconsistencies in Vovin’s proto-Okinawan reconstructions that I have
left as they are: As ‘shadow’ is reconstructed by Vovin as *kaagai, based on the forms kaagi
and kaagaa in Okinawa, ‘heron’ should likewise have been reconstructed as *saazai, and not as
*saazi, because of the forms saazi and saazaa in Shuri (and even saazai in Nakijin). Also, as
Vovin reconstructs ‘spider’ as *kuubu/koobu it would have been more consistent to reconstruct
‘rainbow’ as *nuuzi/noozi, because both words have been attested with -oo- in Ō(mu).
38 While saaru means ‘monkey’, the definition of saaruu in Okinawa-go jiten is: 口のとがった
者。猿に似た者の意。 ‘Someone who has a sharp mouth (complains a lot). Someone who
resembles a monkey.’ Kaagi means ‘shadow’, but kaagaa means ‘reflection’.
Finally, I have removed examples of the iki-group in which vowel length only
occurs in the initial syllable in Shuri and Onna and Ō(mu) from the list in (22), as
this is the regular reflex in these dialects and is no indication of underlying vowel
length.
22 Vovin’s reconstruction of vowel length in proto-Okinawan

                  Shuri       Ō(mu)     Nakijin               Onna        Proto-Okinawan
2.1    ‘mold’     kaabui C    x         haabui C              x           *kaaboori/*kaabuuri C
       ‘heron’    saazi A     x         saazai A              x           *saazi A
                  saazaa A
2.2    ‘rainbow’  nuuzi A     noozi A   t’iNtoo-nooziri 39 C  x           *nuuzi A
2.3    ‘pigeon’   hootu C     hootu B   p’ootuu B             x           *pootu B
       ‘bet’      kaakii C    x         k’aakii C             x           *kaakii C
       ‘turtle’   kaamii C    kaamii B  haamii B              x           *kaamii B
       ‘axe’      uuN C       uuN B     uunuu B               x           *wuunu B
2.5    ‘shadow’   kaagi C     kaagi B   k’aagi C              k’aagi C    *kaagai C
                  kaagaa C              k’aagaa C
       ‘spider’   kubu B40    koobu B   hubu C                k’uubaa B   *kuubu/*koobu C
                  kuubaa C
       ‘monkey’   saaru C     saaru B   saaruu C              saaru C     *saaru C
                  saaruu C
                  saru B41

9.7.1.3 The examples of underlying vowel length
I will start by discussing the cases of vowel length in the initial syllable in the
examples in (22). The examples 2.1 ‘heron’ and 2.2 ‘rainbow’ have vowel length in
the initial syllable in Shuri and Nakijin, even though this does not form part of
word-tone A in these dialects.42 However, as to ‘rainbow’, the historical development of
the word in Japanese is unclear. The reflexes in the dialects are very diverse, and it
is possible that the long vowel in the initial syllable is the result of a contraction
(Martin 1987:498-499).
39 ‘Rainbow’ in Nakijin is t’iNtoo-nooziri, a compound consisting of t’iNtoo (Japanese: tendoo 天
道) and noozi ‘rainbow’ plus some kind of suffix -ri (?). The fact that the compound has word-
tone C does not mean that the morpheme noozi would have had word-tone C when used in
isolation, as it is not the first element of the compound.
40 Kubu is a literary form.
41 Saru is a literary form.
42 According to W. P. Lawrence (personal communication) in the dialects of Asama and Masana
on Tokunoshima ‘heron’ is saagi, even though vowel length in the initial syllable does not
form part of the regular reflex of class A in these dialects.
We would have expected the examples of class 2.3 (‘mold’, ‘pigeon’, ‘bet’,
‘turtle’ and ‘axe’) to belong to class B in Shuri. We have seen however, that vowel
length in the initial syllable in Shuri is the regular reflex of the iki-group (class C),
and so the long vowels that can be found in the initial syllable of these examples
make them part of class C in Shuri.
The mari-group also consists of nouns of class 2.3 that have joined class C, but
in case of the mari-group the shift to class C occurred in more than just an isolated
dialect, indicating that this shift already took place in proto-Ryūkyūan.43 ‘Pigeon’,
‘turtle’ and ‘axe’ on the other hand, belong to class C only in Shuri, while in other
dialects on Okinawa, such as Nakijin and Ō(mu), they belong to the expected class
B. In these dialects however, they have vowel length in the initial syllable as well,
even though this does not form part of the word-tone of class B in these dialects.44
Maybe then, these words belonged to class B, and had underlying vowel length in
the initial syllable in proto-Okinawan. When Shuri later developed vowel length in
the initial syllable of class C and leveled out the distinction in pitch between class B
and class C, this special group of nouns of class B became indistinguishable from
class C.
‘Mold’ and ‘bet’ on the other hand, belong to class C in Shuri as well as in
Nakijin, so they may have shifted to class C in proto-Okinawan. If so, the vowel
length in the initial syllable in Shuri would stem from the shift from initial [H] pitch
to vowel length that took place in this dialect. (See section 9.4.3 and 9.4.4.) The
vowel length in the initial syllable of nouns of class 2.5 in Shuri (and Ō(mu) and
Onna) is no more than the regular reflex of class C and has the same origin. (The
examples with short vowels are literary forms.)
In Nakijin on the other hand, vowel length in the initial syllable does not form
part of the regular reflex of any of the word-tones, so that in all examples of vowel
length in the initial syllable in Nakijin in (22), the vowel length is underlying.
This means that there is a small number of words that have underlying vowel
length in the initial syllable (i.e. not related to the word-tone) in part, and sometimes
even most of Okinawa, and occasionally in Tokunoshima. In some cases this vowel
length may be the result of sporadic lengthening in proto-Okinawan or proto-
Northern Ryūkyūan, while in other cases lengthening seems to have occurred
independently.45 In yet other cases the vowel length may have originated as the
regular reflex of the iki-group in the standard dialect of Shuri, and spread from there
to other dialects in loanwords.46
Martin (1987:253) too, considered the various lengthenings and shortenings of
the Shodon dialect as secondary, and commented: “It should be borne in mind that
each of the morphemes (including the monosyllables) that appears with a long vowel
also has an allomorph with the short vowel, as found in many compounds; and this
is true whether the length is automatic (as in monosyllables) or distinctive. It is true
for all the dialects that seem to offer evidence for earlier vowel-length distinctions.”
The main point is however, that the occurrence of vowel length in the Ryūkyūs
cannot be linked to initial /L/ tone in proto-Japanese. In some cases where it has to
be reconstructed in proto-Okinawan, it corresponds to /H/ tone in the initial syllable
in proto-Japanese, but more importantly, in the vast majority of cases, the
correspondence between initial vowel length in part of the Ryūkyūs and initial /L/
tone in proto-Japanese is limited to nouns of class 2.4/5. It is therefore not proto-
Japanese /L/ tone as such which yields initial vowel length in Okinawa and
Tokunoshima, but proto-Japanese /L/ tone in this particular class.
As I have already discussed my ideas on the origin of this correspondence at
length in this chapter, I will now discuss the vowel length in the second syllable in
Vovin’s examples. As we have seen, the diphthong -ui in the second syllable of
‘mold’ appears to go back to a contraction of two syllables. Other cases of vowel
length in the second syllable may be due to suffixation. Hattori (1979:107) for
instance states that “changes can occur to the shape of the word when a suffix -a(a)
has been attached or when the final vowel has been lengthened in order to impart a
special meaning.” He clearly does not regard these long vowels as part of the
original word stem, and in his list he puts examples like 2.5 kuubaa ‘spider’ in Shuri
and Onna and 2.5 saaruu ‘someone who resembles a monkey’ in Yonamine (which
is another designation for the dialect of Nakijin) in brackets.
In case of saaruu and kaamii ‘turtle’ the final vowel has been lengthened,47 and
in case of kuubaa Hattori suspects that the long final vowel goes back to a suffix
-a(a), as other Okinawan dialects have forms with final -u such as koobu or hubu.
The forms saazaa ‘heron’ (another animal name), and kaagaa ‘reflection’ probably
go back to the same suffix, as forms like saazi and kaagi are also attested.
The presence alone of forms with the regular CV:CV shape in case of ‘heron’,
‘shadow’, ‘spider’ and ‘monkey’, and the likelihood that the final -aa in ‘spider’ and
‘heron’ goes back to a suffix, suggest that the vowel length in the final syllable that
can be seen in these examples originally did not form part of the word stem. (An
exception remains ‘bet’ however, in which the vowel length in the second syllable
does not seem to be the result of a contraction or a suffix.)
43 As mentioned in section 9.3.1, the examples 2.3 ‘pigeon’ and 2.3 ‘turtle’ have reflexes
characteristic of class 2.4/5 in Matsue, and ‘turtle’ belongs to class 2.5 in Kyōto and Osaka as
well. Whether a coincidence or not, it does show that nouns can slip out of their tone class.
44 According to Lawrence, ‘pigeon’ has vowel length in the initial syllable in all Okinawan
dialects (but not in Tokunoshima) while ‘turtle’ has vowel length in the initial syllable in many
of the Okinawan dialects.
45 The membership of class 2.3α in Onna for instance, which also has vowel length in the initial
syllable that is not automatically linked to the word-tone of this class (as it is lacking in reflex
2.3β), does not coincide with these other Okinawan cases of unexplained vowel length.
46 For instance in case of ‘bet’ and ‘shadow’ in Nakijin, as the expected reflexes would have
included h- instead of k-.
47 According to Lawrence, the final syllable of ‘turtle’ is lengthened in all of southern Okinawa.
In section 8.3, I have suggested that there may have been a diminutive suffix
with /H/ tone in proto-Japanese that was frequently added to names of animals and
plants. It is possible that the long final vowels in ‘spider’, ‘monkey’, ‘turtle’, ‘heron’,
‘pigeon’ (in Nakijin) and ‘reflection’ (as opposed to kaagi ‘shadow’) go back to this
proto-Japanese suffix. It is clear, at least, that in Vovin’s examples the vowel length
is not limited to tone class 2.5, and can therefore not be linked to a reversion to /L/
tone on the second syllable of this class in proto-Japanese, as was Martin’s idea.
9.7.1.4 Vovin’s proto-Ainu evidence for vowel length in proto-Japanese
Vovin reconstructed vowel length in proto-Okinawan, but considered the evidence
from other Ryūkyūan dialects insufficient for the reconstruction of vowel length in
proto-Ryūkyūan. He does however, believe that the vowel length goes back to proto-
Japanese (it just left no traces in other dialects in the Ryūkyūs). His reconstruction of
vowel length in proto-Japanese is based on the fact that a number of words have
vowel length in the initial syllable as loanwords in Vovin’s reconstruction of proto-
Ainu.48
23 Vovin’s reconstruction of proto-Japanese vowel length based on Ainu

Vovin’s proto-Ainu   Vovin’s proto-Japanese
*kaani /HHH/         *kaana(=Ci) /HHH/        ‘metal’ (class 2.1)
*paakari /HLLL/      *paaka(=ra=) /LLL/       ‘to measure’, ‘to weigh’ (class 3.4)
*tuuki /HLL/         *tuuki /HLL/             ‘sake cup’ (class 2.2)
*tuuti /LLH/         *tuutu(=Ci) /LLH/        ‘large wooden hammer’ (class 2.4)

The original reason why Vovin looked for traces of vowel length in the Ryūkyūs was
because he supported Martin’s reconstruction of vowel length in the initial syllable
of words that started with /L/ tone in proto-Japanese (and in syllables that reverted to
/L/ tone later on in the word). But in fact, the vowel length that Vovin reconstructed
in proto-Okinawan was not limited to syllables with /L/ tone in proto-Japanese.
With the Ainu examples in (23) the link between vowel length and the tone of the
initial syllable is disappearing even further from sight, as only half of the examples
from Ainu have /L/ tone in Middle Japanese (in the standard reconstruction), while
the other half has /H/ tone.
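The even division between initial /L/ and initial /H/ tone can be read off the examples in (23). The sketch below is purely illustrative: the names `AINU_LOAN_SOURCES` and `initial_tone_counts` are assumptions introduced here, and the data are limited to the four proto-Japanese forms listed in (23), with tones in the standard reconstruction.

```python
from collections import Counter

# (Vovin's proto-Japanese form, tone in the standard reconstruction, tone class)
AINU_LOAN_SOURCES = [
    ("*kaana(=Ci)", "HHH", "2.1"),
    ("*paaka(=ra=)", "LLL", "3.4"),
    ("*tuuki", "HLL", "2.2"),
    ("*tuutu(=Ci)", "LLH", "2.4"),
]


def initial_tone_counts(entries):
    """Count how many forms begin with /H/ tone and how many with /L/ tone."""
    return Counter(tone[0] for _form, tone, _tone_class in entries)


counts = initial_tone_counts(AINU_LOAN_SOURCES)
print(counts)
```

If vowel length were a concomitant of initial /L/ tone, all four forms would be expected to begin with /L/; only two of them do.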
Nevertheless, if vowel length can indeed be reconstructed in proto-Japanese
loanwords in proto-Ainu, and if this vowel length can only be explained if it was
already there in Japanese, this would constitute independent proof of vowel length in
proto-Japanese (although without a connection to initial /L/ tone).
48 The proto-Japanese tones shown here, are based on the standard reconstruction, to which Vovin
adheres.
Other possibilities however, have to be ruled out first, such as that the tone in
proto-Japanese caused these words to have long vowels in the initial syllable in
proto-Ainu. Everything therefore depends on whether the reconstruction of both
vowel length and accent in proto-Ainu is correct.
Both proto-Ainu vowel length and proto-Ainu accent have been reconstructed by
Vovin in his monograph on proto-Ainu (1993a) and he comments (1993b:130):
One can doubt whether PJ (proto-Japanese) loanwords in PA (proto-Ainu)
would preserve original vowel length. However, the examples above also
show that PA accent of these loanwords coincides with PJ accent as
reconstructed in Martin (1987). It would be a linguistic miracle if both vowel
length and pitch-accent in PJ loanwords in PA happened to be the same as in
PJ due to simple coincidence.
It has to be realized however, that the proto-Japanese vowel length that Vovin
reconstructed based on the reflexes in Okinawa, occurs in a different set of words:
Not a single one of Vovin’s examples of Japanese loanwords with vowel length in
proto-Ainu has vowel length in Okinawa.49
As the examples for which Vovin reconstructs long vowels in proto-Japanese
(based on Ainu), have short vowels in Okinawa, there is in fact no agreement of
“both vowel length and pitch-accent” in proto-Japanese loanwords in proto-Ainu.50
This means that we do not have to believe in linguistic miracles if we do not
follow Vovin in his reconstruction of distinctive vowel length in proto-Japanese.
Nevertheless, even apart from the issue of proto-Japanese vowel length, if
Vovin’s reconstruction of proto-Ainu tone is correct, and if Ainu has truly preserved
the tones of proto-Japanese in these loanwords, this would constitute an argument
against Ramsey’s reconstruction of the Middle Japanese tone system.
This is especially so, as the examples above are not the only examples of
Japanese loanwords in proto-Ainu that Vovin presents: In a later article (1997)
Vovin argues explicitly against Ramsey’s reconstruction of the Middle Japanese
tone system based on his reconstruction of the tone of proto-Japanese loanwords in
Ainu. Everything therefore depends on whether Vovin’s reconstruction of the
prosodic system of proto-Ainu is correct. As I will argue in chapter 11 however,
Hattori’s far simpler reconstruction of the prosodic system of proto-Ainu (1967) has
to be preferred. Secondly, the prosodic shape of Japanese loanwords in proto-Ainu is
at best neutral as to the reconstruction of the Middle Japanese tone system.
49 According to the Okinawa-go jiten (1963) ‘metal’, ‘sake cup’ and ‘large wooden hammer’ all
have short vowels. I have not found a cognate of *paakari (hakari in modern Japanese) in the
dictionary, but vowel length in the initial syllable is absent in words of more than two syllables
in Okinawa.
50 Apart from this, the tone of proto-Ainu *paakari (/HLLL/ in Vovin’s reconstruction) does not
agree with proto-Japanese *paaka=ra= /LLL/.
10 Conclusion: The order and timing of the dialect splits
There are not many languages in the world whose tone systems have been as
well researched and documented as those of Japanese. In addition to the modern
dialect data, there is a written record which contains a wealth of information on tonal
distinctions from the Middle Japanese period on.
Despite these advantages, it has proven difficult to reconstruct the proto-
Japanese tone system in such a way that a satisfactory account can be given of the
historical developments that led to the tone system of Middle Japanese and the tone
systems of the modern dialects.
The single most important cause of these difficulties is the standard
interpretation of the written record: The standard interpretation of the value of the
tone dots results in a Middle Japanese tone system that does not fit in with the
modern dialect data. In this study I have therefore adopted the radically different
interpretation of the value of the Middle Japanese tone dots by S. R. Ramsey.
In the preceding chapters I have shown what the developments from the tone
system of proto-Japanese to the tone systems of the modern dialects look like, when
the reconstruction of the tone system of proto-Japanese is based on Ramsey’s ideas.
In this chapter, I will go over the main conclusions reached in the preceding
chapters, while concentrating on the most likely timing of the developments, and the
possible causes behind the present-day geographical distribution of the different tone
systems in Japan.
10.1 Minor developments
The most archaic type of the attested Middle Japanese tone systems (the MJ ‘Nairin’
tone system), is closest to the tone system of proto-Japanese. This tone system had
two basic tones /L/ and /H/ and two derived contour tones /R/ and /F/. (The later
disappearance of /F/ tone, and the reason why it left no trace in the modern dialects,
are discussed in section 8.2.1.)
The Middle Japanese material only preserved final /R/ tone in monosyllables,
and when preceded by /H/ tone, but dialect comparison suggests that proto-Japanese
included final /R/ tones preceded by /L/ tone as well. Tone classes that included this
final /R/ preceded by /L/ are the subclasses 2.2a, 3.2a and 3.7a, which have been
discussed in 8.1 and subsections. These final /R/ tones were lost before the Middle
Japanese period – at least in the attested variants of Middle Japanese – but they may
have left a trace in some of the Tōkyō type dialects in the form of an unexpected Ø
tone reflex. Although I regard the existence of these final /R/ tones preceded by /L/
tone as likely, the present-day dialect reflexes that may be connected to these
subclasses are too vague to base clear dialect splits on. This possible subdivision is
therefore not included in my overview of the dialect splits below.
Another difference between the MJ ‘Nairin’ tone system and the tone system of
proto-Japanese is that in the MJ ‘Nairin’ tone system /L/ tone was no longer allowed
after sequences of /LH/ tone within the word boundary. Dialect comparison shows
that proto-Japanese knew no such restriction, as the modern reflexes in Tōkyō
(') and Kōchi (') indicate that part of the nouns attested with
tone in Middle Japanese (class 3.3) must go back to * tone in proto-Japanese.
(See section 4.5.) Moreover, there are still some rare attestations of these nouns with
tone in the written record.
Due to the small size of tone class 3.3, it is hard to determine the regular reflexes
of this class in many dialects, and this development is therefore not included in the
overview of the dialect splits below either.
10.2 The new dialect-geographical paradox
With Ramsey’s reversal of the value of the Middle Japanese tones, the tone system
of proto-Japanese is so close to the tone systems of the Tōkyō type dialects that the
predominance and peripheral distribution of the Tōkyō type tone systems in relation
to the Kyōto type tone systems is no longer problematic. With the Kyōto type tone
system as a late development the attention shifts to the difference between the three
Tōkyō subtypes (Nairin, Chūrin and Gairin) and their geographical distribution.
With the exception of the special Nairin subtype preserved on Noto Island, none
of these three Tōkyō subtypes has preserved more distinctions than the other; just
different ones. At first sight, it is therefore hard to tell which of these three types is
more archaic.
The standard reconstruction of the proto-Japanese tone system provides no clue
to answering this question. The geographical distribution however, would suggest
that the Gairin type represents an archaic stage, followed by the Chūrin type, with
the Nairin type in the center as the most innovative subtype.
When the proto-Japanese tone system is reconstructed in accordance with
Ramsey’s theory, it becomes clear that the merger patterns of the Chūrin and Gairin
subtypes are the result of innovations. In the case of the Chūrin type, the innovation was
the loss of /L/ tone on the monosyllabic case particles after /R/ tone. At the time of
the later tone reduction, this led to the merger of classes 1.1 and 1.2. In the case of the
Gairin type, the innovation was the loss of /L/ tone on the case particles after /R/ and
/LH/ tone. At the time of the later tone reduction this led to the merger of classes 1.1
and 1.2 but also to the merger of classes 2.1 and 2.2, and classes 3.1 and 3.2.
It turns out that the merger pattern of the Nairin type developed from the most
conservative variant, in which no tone spreading across the word boundary had
taken place. Although the difference with the Chūrin type that immediately
surrounds it is minor (the only difference is in the merger pattern of class 1.2), the
difference is nevertheless the result of an innovation in the non-central Chūrin type,
which is contrary to expectation.
The Nairin and Chūrin type tone systems are closely related, but the difference
between these two and the Gairin type tone system is much more profound. The
Gairin tone system also provides the largest surprise. This type, which is distributed
in four blocks in the periphery, developed from the ancestral tone system that had
gone through the most extensive form of tone spreading: the ancestral type, therefore,
that was the most innovative of all.
10.3 The conditioned split between Nairin type and Chūrin type
The Nairin/Chūrin split appears to have been conditioned by the presence or absence
of automatic vowel length in monosyllables. (See section 3.1.5.) In the central
Japanese dialects, where monosyllables are automatically lengthened (even when a
case particle is attached), the pitch of the case particles remained [L]. Later, when
the contour tone of class 1.2 was simplified (:- > :-) these dialects
developed a Nairin type merger pattern in the monosyllabic nouns (class 1.2 merged
with class 1.3). Dialects that did not have automatic vowel length in monosyllables
spread the rise to [H] pitch of tone class 1.2 onto the attached case particle (- >
- or -), which later led to a Chūrin type merger pattern in the monosyllabic
nouns (class 1.2 merged with class 1.1). The Nairin and Chūrin merger patterns can
therefore be regarded as conditioned variants within a common tonal type.
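As a minimal sketch, the merger patterns named in sections 10.2 and 10.3 can be written down as a lookup table. The Python names used here (`MERGERS`, `same_reflex`) and the ASCII spellings of the subtype names are illustrative assumptions and not part of the analysis itself. Only the mergers explicitly mentioned in these two sections are included, so the table is not a complete description of any dialect.

```python
# Illustrative lookup table of the class mergers named in sections 10.2-10.3.
# Keys are ASCII spellings of the subtype names (an assumption for convenience).
MERGERS = {
    "nairin": [{"1.2", "1.3"}],
    "churin": [{"1.1", "1.2"}],
    "gairin": [{"1.1", "1.2"}, {"2.1", "2.2"}, {"3.1", "3.2"}],
}


def same_reflex(subtype, class_a, class_b):
    """Return True if the two tone classes are listed as merged in the subtype."""
    if class_a == class_b:
        return True
    return any(
        class_a in group and class_b in group for group in MERGERS[subtype]
    )


if __name__ == "__main__":
    for subtype in MERGERS:
        print(subtype, same_reflex(subtype, "1.1", "1.2"))
```

Read this way, the Nairin and Chūrin types differ only in where class 1.2 ends up, whereas the Gairin type adds the mergers in the disyllabic and trisyllabic nouns.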
As automatic vowel length in monosyllables is a feature of central Japan, this
explains the fact that the Nairin type, in which tone spreading after monosyllabic
contour tones did not take place, occupies a central position in relation to the Chūrin
type. Before the occurrence of the Nairin/Chūrin split, the present-day areas with
Chūrin type tone on both sides of the Nairin type, had a Nairin type tone system,
without tone spreading across the word boundary.
The generalization of vowel length in monosyllables was most likely an
innovation of central Japan, as in proto-Japanese contour tones were most likely
lengthened while level tones were short. This innovation resulted in an archaic tonal
enclave in central Japan, in which – due to the presence of automatic vowel length –
/H/ tone spreading onto the case particles failed to take place after monosyllables
with /R/ tone.
As to the time of origin of the Nairin/Chūrin subgrouping: the shortening of
contour tones must have taken place after the introduction of the Go-on and Kan-on
reading traditions in Japan (see section 11.1.1 of part II). This means that the tone
spreading outside of the central area most likely dates from after the 8th century. As
the oldest attestation of an MJ ‘Chūrin’ type tone system dates from the mid 12th
century, the tone spreading must have taken place sometime during the Heian period.
10.4 The oldest split from proto-Japanese:
The Gairin type tone system and its geographical distribution
The earliest split from proto-Japanese that left a clear reflex in the modern dialects
occurred when the Gairin dialects lost the /L/ tone of the monosyllabic case particles
after /R/ and /LH/ tone. The split of the Gairin tone system from the proto-Japanese
Nairin type was not related to the presence or absence of vowel length. In the Gairin
type dialects after all, /H/ tone spreading not only occurs after the contour tone /R/
but also after a sequence of /LH/ tone on two consecutive syllables. The peripheral
distribution of the four areas with innovative Gairin type tone therefore requires a
different explanation.
One possibility is that they are the result of independent parallel developments.
As the Gairin tone spreading agrees with tonal developments that are commonly
observed in tone languages, this is not impossible. However, it is now widely
believed that the Japanese language was brought to the Japanese islands by Yayoi
immigrants from the Korean peninsula from approximately the 5th century B.C. on.
From the initial arrival point in northeast Kyūshū, the spread of the Japanese
language coincided with the eastward spread of Yayoi immigrants. As it is known
that population migration played a crucial role in the spread of the Japanese
language through the Japanese islands, it is not farfetched to look for a connection
between the geographically widely separated blocks of Gairin type tone and the fact
that Japan was settled by speakers of Japanese by means of west-to-east migrations.
I assume that the type of Japanese that first arrived in north Kyūshū in
approximately 500 B.C. was Nairin type proto-Japanese. During the Early Yayoi
period, this language spread to western Honshū (Chūgoku, Kibi), Shikoku and the
Kinki region by way of the Seto Inland Sea.
When the groups that had started to move up the Inland Sea got separated from
the groups that remained in Kyūshū, the proto-Japanese tone system split. In north
Kyūshū, the Gairin /H/ tone spreading created a new tonal type.
As expansion from northeast Kyūshū to the south and west was blocked by the
Kumaso/Hayato, all further migration of the Yayoi population in Kyūshū was
necessarily towards the northeast. (This situation may have lasted more or less until
the Hayato were defeated in 720 A.D.)1 One more Early Yayoi group from Kyūshū
may have moved up the Sea of Japan coast and settled in the coastal plain of Izumo
after the Gairin innovation had taken place. (Izumo is separated from the rest of the
Chūgoku area by mountain ranges. This means that it may have been sheltered from
settlement by the earlier groups of immigrants that had spread along the Inland
Sea.)2
1 I also assume that the Hayato would only have started to give up their language(s) sometime
afterward.
2 It is also conceivable that the innovative type developed on the Korean peninsula after the first
Yayoi migrations to Kyūshū. After part of the initial group of migrants had moved up the Seto
Inland Sea, it was then brought to Kyūshū and Shimane by a second migration from the Korean
peninsula.
By the end of the Early Yayoi period, the immigrant population had come as far
as present-day Nagoya (Hudson 1994: 245), at which point it is thought that the
Yayoi expansion temporarily slowed down. The fact that there is another Gairin area
just to the east of the border of the Yayoi expansion at the end of the Early Yayoi
period, suggests that during the Middle Yayoi period there was one final migration
from Kyūshū.
By this time, the areas with the best agricultural land along the Seto Inland Sea
had most likely already been settled by speakers of Nairin Japanese. It is possible
that a final group of migrants from Kyūshū avoided these areas, and continued to
sail around the coast until they reached the frontier of the Yayoi expansion to the east
of Nagoya.
After the establishment of the Gairin area in the Tōkai region, the expansion of
Nairin dialect speakers from central Japan continued, spreading towards the Kantō
region. Immigrant type skeletons make their appearance in the Kantō region in the
late Middle Yayoi period, and by the beginning of the Kofun period (5th c.) the
southern part of Tōhoku had been reached (Hudson 1999:66, 142). By the 7th
century, the immigrant type population in these regions had reached a high level, but
further settlement of the Tōhoku region cannot have proceeded through a gradual
spread of these Nairin speakers to the northeast.
The similarities between the Gairin type dialects of northeast Japan and the
Gairin type dialects of Shimane are too numerous to ignore, and strongly suggest
that north Tōhoku was settled via the Sea of Japan coast.
10.5 Similarities between the dialects of Izumo and Tōhoku
The tone systems of the area in Shimane prefecture with Gairin type tone are divided
into Gairin type A, and Gairin type B. Gairin B developed from Gairin A by shifting
/H/ tone that was not located on the final syllable one syllable to the right. This
rightward shift was blocked if the syllable to which the /H/ tone would be shifted
contained the close vowels /i/ or /u/. In disyllabic nouns, the result of this
development was that approximately half of the membership of class 2.4/5 (nouns
that contained /a/, /e/, or /o/ in the second syllable) merged with class 2.3. The fact
that Gairin B is an innovation is clearly illustrated by the geographical distribution
of the two types in Shimane prefecture: The conservative type A has been preserved
on both sides of the area with innovative type B, which occupies the centre. (See
Map 1.)
The cases of onbin vowel loss, which start to be observed in the written record of
central Japan in the early 9th century, involve loss of the vowels /i/ and /u/. This
indicates that in this area the close vowels were most likely pronounced shorter and
weaker than their open counterparts, just as they are in many modern Japanese
dialects. (Devoicing of vowels between voiceless consonants in modern Japanese for
instance, also primarily affects the vowels /i/ and /u/.)
The avoidance of /H/ tone on syllables that contain /i/ and /u/ in the Gairin B
tone system also points to a short and weak pronunciation of these vowels, as short
and weak vowels are less fit to function as bearers of /H/ tone. The relatively weak
nature of /i/ and /u/ as compared to the other vowels therefore, appears to have been
a feature of both central Japan and Izumo, and most likely goes back to proto-
Japanese.
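The rightward shift that turned Gairin A into Gairin B, together with its blocking condition, can be sketched as a small function. This is a simplified illustration under stated assumptions: each word is taken to carry at most one /H/ tone, every syllable is an open syllable written in romanization so that its last letter is its vowel, and the function name `gairin_b_shift` as well as the sample syllable strings are hypothetical, chosen only for their vowels and not taken from dialect data.

```python
# Simplified model of the Gairin A > Gairin B shift described in this section.
# Assumptions: at most one /H/ per word; open syllables in romanization.
CLOSE_VOWELS = {"i", "u"}


def gairin_b_shift(syllables, h_index):
    """Return the index of the /H/-bearing syllable after the shift.

    syllables: list of romanized syllables, e.g. ["so", "ra"]
    h_index:   0-based index of the syllable bearing /H/, or None if no /H/
    """
    if h_index is None:
        return None
    last = len(syllables) - 1
    if h_index >= last:
        # /H/ on the final syllable is not affected by the shift.
        return h_index
    target = syllables[h_index + 1]
    if target[-1] in CLOSE_VOWELS:
        # The shift is blocked when the target syllable contains /i/ or /u/.
        return h_index
    return h_index + 1


if __name__ == "__main__":
    print(gairin_b_shift(["so", "ra"], 0))  # second syllable has /a/: /H/ moves
    print(gairin_b_shift(["ha", "si"], 0))  # second syllable has /i/: blocked
```

In this model a disyllable with initial /H/ and an open vowel in the second syllable ends up with final /H/, which corresponds to the partial merger of class 2.4/5 with class 2.3 described above.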
As to segmental phonology, in Izumo /u/ and /i/ both have a more centralized
realization than in standard Japanese. After dental stops and sibilants (/t/, /d/ and /s/)
furthermore, /u/ is realized as a central unrounded vowel. In part of Shimane
(roughly the area with innovative Gairin B tone) this has led to a merger of /u/ with
/i/ after /t/, /d/, /s/, but in other parts (roughly the area with Gairin A tone) the two
vowels are still being distinguished. (See the map in Kamei, Kōno & Chino eds,
1989:1760).
There also is a considerable phonetic overlap between /i/ and /e/, and between /u/
and /o/, as the pronunciation of /e/ and /o/ is higher than in standard Japanese. In the
initial syllable /i/ is lowered and merges with /e/, and depending on the preceding
consonant, /u/ is also lowered in the initial syllable and merges with /o/. (See the
Izumo data in Kobayashi, 1975).
In all of these respects, the dialects of the Tōhoku region show a strong similarity
with the dialects of Shimane. In the Gairin B dialects of the Tōhoku region, half of
class 2.4/5 (the part that has an open vowel in the final syllable) has also merged
with class 2.3. Just as in Shimane, /u/ is a central unrounded vowel after /t/, /d/ and
/s/, and on the Sea of Japan side of Tōhoku this has led to a merger with /i/, just
as in part of Shimane. And just as in Shimane, /i/ in the initial syllable has merged
with /e/, and there are areas where /u/ merges with /o/ in similar environments as in
Shimane (Ikegami, 1970:534). Another similarity between Izumo and Tōhoku –
mentioned by Fujiwara Yoichi (1951:184) – is the palatalization of /k/ before /i/.
10.6 The settlement of the Tōhoku region
Based on the many similarities between the dialects of Izumo and Tōhoku, Fujiwara
proposed a division of the Japanese dialects into a Sea of Japan coast dialect group
and a Pacific coast dialect group. He regarded the first as a group of remnant dialects
that became separated by the later spread of linguistic influence from Kyōto and
Tōkyō.
At the time when Fujiwara proposed this idea, it was still commonly thought that
Japanese had been spoken on the Japanese islands ever since the Jōmon period. To
regard the widely separated Izumo and Tōhoku dialect areas as remnants of an
earlier, far larger spread of this dialect type, was therefore natural.
With the now predominant view that the Japanese language only came to Japan
in the Yayoi period, is it still possible to posit the existence of a once large Gairin-
speaking zone, which subsequently disappeared due to the later spread of linguistic
influence from central Japan?
One would have to assume that Yayoi migrants spread the Gairin subtype all
along the Sea of Japan coast to the Tōhoku area before Yamato became predominant.
The later dominance of Nairin-speaking Yamato would then have caused the Nairin
subtype to expand, so that the western and the eastern extremes of the Gairin zone
got separated from each other.
Because of similarities in pottery styles, it is indeed thought that some Yayoi
groups from the coastal areas of Shimane and Kyōto had already “leap-frogged” up
the Sea of Japan to Akita and Aomori in the Early Yayoi period (Hudson 1999:136).
This does not mean however, that the entire coastal area along the Sea of Japan was
once Gairin-speaking territory.3 There are, for instance, no remnant Gairin dialect
areas to be found anywhere between Shimane and Niigata.
Also, despite the early Yayoi settlements in the northeast, it is likely that the
decisive spread of the Japanese language to the Tōhoku region took place only much
later. The dialect diversification in the dialects of the northeast is smaller than in
other parts of Japan (Inoue, 1992), indicating a relatively late date of the spread of
the Japanese language in the Tōhoku region, rather than an early one.
Furthermore, even as late as the 8th century, several campaigns were needed to
subdue the non-Japanese (Ainu?) speaking Emishi in Mutsu and Sendai, and in 774
there were still Emishi raids on the Kantō plain. It seems unlikely that such a
situation would have invited large-scale migration to the Tōhoku region. It was only
in 811 that the Emishi were defeated, truly opening the way for settlement of the
northeast, which agrees with the idea of a relatively late spread of the Japanese
language to the Tōhoku region.
Although the Gairin A type may have been introduced from Shimane to Akita
and Aomori in the Early Yayoi period, the much larger spread of the Gairin B tone
system must have taken place later, as the Gairin B type represents an innovation.
For the reasons I have just mentioned, this spread should probably be dated no
earlier than the 9th century.
The fact that the Gairin B type, although arriving later, was able to spread to
such an extensive area in the Tōhoku region, probably means that the Yayoi
immigrant population in the area at the time of the introduction of Gairin B was still
low. All of these things argue against the early existence of a large Gairin dialect
zone along the Sea of Japan coast that included the Tōhoku region.
The two remnant areas in the Tōhoku region where the Gairin A tone system has
been preserved (the Shimokita peninsula and part of Iwate) are the areas that are
3 Leap-frogging migrations are characterized by the desire to get to the best, or more important, land
quickly, without stopping at intermediate points. In the Japanese case, skill in sailing would
have been a great advantage in this type of migration.
farthest removed from the Sea of Japan coast. This makes it likely that it was from
the Sea of Japan coast that the Gairin B type started its spread in the Tōhoku region.
The only other area in Japan where the innovative Gairin B subtype can be found
is in Izumo. Izumo was settled early on by a Yayoi immigrant population, and
became one of the most powerful kingdoms in Japan. Although its importance
diminished after Yamato became predominant, it nevertheless seems unlikely for the
Gairin B tone system to have developed in the Tōhoku region, and to have been
exported from there to far-away Izumo: If Gairin B spread from Tōhoku to Izumo
for instance, an area with an already long settled immigrant population, it is strange
that it was not exported to many other areas along the Sea of Japan coast between
Niigata and Shimane, which had also long ago been settled by Yayoi farmers.
As it was the Tōhoku region, not Izumo, which had a relatively late date of
settlement by a Japanese speaking immigrant population, a spread of the Gairin B
tone system (including all the other typical features of this dialect type) from Izumo
to Tōhoku in the shape of trade contacts or migration – and probably both – seems
the obvious solution. In this scenario it is not strange that Gairin B was only
exported to the Tōhoku region, and not to many other places along the Sea of Japan
coast in-between Izumo and Tōhoku: The areas in-between were already settled by
Yayoi farmers. The Tōhoku region on the other hand, had only recently been opened
up to large-scale immigration. The amount of arable land on the Izumo plain is
limited, and pre-existing trade contacts along the coast may have paved the way for
farmers from Izumo in search of new agricultural land on the newly opened frontier.
All of this would mean, of course, that the Gairin B tone system developed in
Shimane sometime before the 9th century.
Finally, Uwano (1981) has drawn attention to the fact that the dividing line
between the Gairin type and the Chūrin type tone systems in the southern Tōhoku
region is extremely blurred, as the transitional area between them stretches for more
than 200 kilometers. The mixture of Chūrin and Gairin reflexes in this area may be
due to 9th century immigrants with two different tone systems arriving from two
different areas: Chūrin speakers who proceeded north from the Central Highlands,
and Gairin B speakers from Izumo, who entered from the Sea of Japan coast.
10.7 The starting point of the /H/ tone restriction
While proto-Japanese was a register tone language with two tones, /L/ and /H/, (as
well as two derived contour tones /R/ and /F/), at some point the number of /H/ tones
started to become restricted to the extent that the remaining /H/ tones became
marked or accent-like.
In central Japan, the first signs of the /H/ tone restriction appear in the written
record in the late 13th century. (Nouns of class 2.3 for instance start to be marked
with 上平 instead of earlier 平平 tone.) Although this development only reached
central Japan in the 13th century, it must have started much earlier in western Japan,
most likely on Kyūshū.
The reason why the tone reduction must have taken place prior to the 13th
century in western Japan, is because the Ryūkyūs were settled from there by Gairin
speakers in whose dialect tone classes 2.1 and 2.2 and 3.1 and 3.2 had already
merged.
The original Gairin innovation had caused the pitch fall after tone classes 2.2 and
3.2 to disappear, but did not cause these two classes to merge with tone classes 2.1
and 3.1. (This stage has been attested in the MJ ‘Gairin’ material.) The mergers only
took place later as a result of the /H/ tone restriction.4
The fact that there is not a single dialect in the Ryūkyūs that keeps tone classes
2.1 and 2.2 and 3.1 and 3.2 separate, indicates that proto-Ryūkyūan must have split
off from a western Japanese Gairin type dialect that had already passed through the
tone reduction. If the modern word-tone systems had developed from an unrestricted
Gairin type tone system (such as attested in the MJ ‘Gairin’ material), or directly
from proto-Japanese (as is often assumed) it is likely that classes 2.1 and 2.2 and
classes 3.2 and 3.1 would have developed distinct word-tones in at least some of the
many – and highly diverse – Ryūkyūan word-tone systems.
Based among other things on the fact that wet-rice cultivation is first attested in
the Ryūkyūs in the 10th century, Asato and Doi (1999) have argued that the spread of
the Japanese language to the Ryūkyūs may only date from the Heian period.
The idea that the Ryūkyūan tone systems developed from an evolved, relatively
late Gairin type tone system fits in well with Asato and Doi’s findings. Kindaichi
(1984) has furthermore pointed out that most Chinese loanwords in Ryūkyūan (the
later Kan-on as well as the earlier Go-on), are split up in the Ryūkyūan dialects in
the same way among the different tone classes as they are in mainland Japanese.
This indicates that these words already formed part of proto-Ryūkyūan, and were not
introduced to the islands later on. This agrees with the idea of a relatively late
movement of speakers of Japanese to the Ryūkyūs.
As the spread of the Japanese language to the Ryūkyūs most likely took place
around the 10th century, the /H/ tone restriction in western Japan must date from
before that period. From western Japan, the /H/ tone restriction then spread across
old dialect borders from one dialect to the next, affecting the different tone systems
one after the other, until it finally reached central Japan in the 13th century.
The modern tone rules for compound nouns in the different dialects probably
developed their present form at the time when /H/ tone restriction fundamentally
changed the original tone systems of these dialects. In northeast Kyūshū and
Shimane the results were not identical, even though both dialects had an MJ ‘Gairin’
type tone system as a starting point. On one important point however, the rules in
the two dialects agree: If the first element has Ø tone the compound will have Ø tone.
4 It is also the tone reduction which caused the merger of class 2.5 with class 2.4, and the merger
of class 3.7 with class 3.6 in the Gairin dialects.
If the first element contains /H/ tone, the compound will contain /H/ tone.5 The rules
that determine the location of the /H/ tone in the two dialects however, are quite
different.
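The one point on which the compound rules of northeast Kyūshū and Shimane agree can be stated as a one-line predicate. The sketch below is illustrative only: tones are written as strings over "H" and "0", where "0" is an assumed ASCII stand-in for Ø, and the function name is hypothetical. The location of the /H/ tone within the compound is deliberately not modeled, since the text states that those rules differ between the two dialects.

```python
# Shared rule of the two Gairin dialects: the compound contains /H/ tone
# exactly when the first element does. "0" stands in for Ø (an assumption).
def gairin_compound_contains_h(first_element_tone: str) -> bool:
    """Return True if a compound with this first element contains /H/ tone."""
    return "H" in first_element_tone


if __name__ == "__main__":
    print(gairin_compound_contains_h("00"))  # first element has Ø tone
    print(gairin_compound_contains_h("0H"))  # first element contains /H/
```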
This confirms that the /H/ tone restriction spread across older dialect boundaries
from one dialect to another. In each dialect, the onset of the restriction was triggered
by a neighboring dialect, but once the restriction process had started, the
developments in the individual dialects were relatively autonomous.
In the case of the Gairin B type tone system in the Tōhoku region however, I suspect
that the /H/ tone restriction already formed part of the dialect that spread to this
region from Izumo, so that at a certain point the Chūrin/Nairin areas were
surrounded by restricted tone systems on both sides. Certainty on this point could be
obtained through a detailed study and comparison of the tone rules for compound
nouns in Izumo and Tōhoku. If the tone rules in Izumo and Tōhoku coincide, it is
likely that the tone reduction already occurred in Izumo. If the tone rules are
different, it is likely that the tone reduction occurred independently in these dialects.
In the Chūrin/Nairin dialects (in an area that stretches at least from Hiroshima to
Tōkyō, and northward to Kanazawa and Toyama), the /H/ tone restriction also
triggered the development of new tone rules for compound nouns. In Middle
Japanese, the /H/ or /L/ tone with which the first element started had influenced the
tone of the second element of the compound. With the reduction of Middle Japanese
/L/ tone to Ø, the rules that obtained when the first element started with /H/ tone
were generalized, and applied to all compounds. In the Chūrin/Nairin dialects
therefore, the tone of the compound is determined by the second element only.
10.8 Final developments
As I will argue in chapter 12 of part II, the leftward tone shift in the Kyōto type
dialects most likely started in Kyōto around the mid to late 14th century. The
Shikoku pitches recorded in Mōtan shichin-shō 毛端私珍抄 of around 1530,
suggest that the /H/ tone restriction had not yet reached Shikoku by that time. The
tone shift in Shikoku (which in that case must have taken place after 1530) may
therefore have had an unrestricted MJ ‘Chūrin’ type tone system as a starting point.
It is clear that the shift in Kyōto must have taken place at a time when /R/ tone,
as well as the possibility to have two non-consecutive /H/ tones per word, had still
been preserved. Kyōto after all, preserves the distinct tone classes 1.2, 2.5 and 3.7,
which had /R/ tone, /HR/ tone and /HØH/ tone respectively before the shift.
After the Kyōto type dialects had split off, these features were lost in most Nairin
and Chūrin type dialects, and in these dialects therefore the distinct tone classes 1.2,
5 In section 5.10, I have argued that these rules developed from proto-Japanese rules that must
have been similar to those attested in the Kanchi-in-bon of Ruiju myōgi-shō 観智院本類聚名義抄.
2.5 and 3.7 disappeared. (The only exceptions are the Tōkyō type dialects of the Noto
peninsula and Noto Island.)
In most Kyōto type dialects on the other hand, these tone classes were preserved.
In these dialects the leftward tone shift eliminated /R/ tone and multiple non-
consecutive /H/ tones from the system, without causing tone class mergers. (As a
result of the shift in the Kyōto type dialects, class 1.2 now has /H/ tone, class 2.5 has
/LH/ tone, and class 3.7 has /LHØ/ tone.)
The tone shift in the Kyōto type dialects also caused the tone rules for compound
nouns to be modified. The pre-shift Nairin/Chūrin rule in which the tone of the
compound was determined by the second element was maintained, and even the
location of the /H/ tone in the compound was not affected. The newly developed
register distinction however, was superimposed on the older tone rules. From then
on, the register of the compound was determined by the register of the first element.
An overview of the developments summarized in this chapter is given in Figure
1. Tone systems represented with bold outlines are at the unrestricted stage; all other
tone systems went through the /H/ tone restriction.
[Figure 1: tree diagram. Nodes: Proto-Japanese (Nairin); Gairin, Gairin A, Gairin B, Makurazaki, Proto-Ryūkyūan, Ryūkyūan; Nairin, Chūrin; Kyōto, Kōchi. Branch labels: /H/ tone spreading onto particles after /R/ and /LH/; /H/ tone spreading onto particles after /R/; rightward shift of /H/ tone blocked by close vowels; shift to word-tone (twice); leftward shift of /H/ tone (twice).]
Figure 1: Overview of the splits in the Japanese tone systems
10.9 Hattori’s ideas on the relation between dialect boundaries
based on tonal distinctions and Japanese history
In the early years of the investigation of the Japanese tone systems, Hattori (1930)
argued that dialect boundaries based on tonal (accentual) distinctions could be
expected to be more archaic than dialect boundaries based on other features. The
sheer complexity of the tone systems, which include the tonal distinctions of
thousands of words as well as rules for their phonetic realization, would have made
them more resistant to borrowing and imitation than features of grammar or
segmental phonology. Because of this, Hattori suspected that dialect boundaries
based on tonal distinctions could hold important information on the cultural and
political history of Japan.
In the years since Hattori started his investigation, the painstaking research of
Japanese scholarship has resulted in a detailed knowledge of the tone systems of the
Japanese dialects and their geographical distribution.
In this chapter I have looked at the geographical distribution of the Japanese tone
systems from the viewpoint of Ramsey’s theory. In particular, I have addressed the
problem of the relative antiquity of the different types that follows from Ramsey’s
theory, in relation to their geographical spread.
In Ramsey’s theory the relative antiquity of the Kyōto type tone systems and the
Tōkyō type tone systems is reversed, which explains the central distribution of the
innovative Kyōto type. The reversed reconstruction of the Middle Japanese material
however, also has implications for the antiquity of the Tōkyō type tone systems vis-
à-vis each other. The central distribution of the conservative Nairin type in the
middle of the more innovative Chūrin type is unexpected, but it can be explained by
the presence of automatic vowel length in monosyllables in the central Japanese area.
The distribution of the innovative Gairin type however, is harder to explain.
There is a serious possibility that the distribution of this type in four widely
separated blocks is related to the order in which the Japanese islands were settled by
speakers of Japanese. The Gairin areas in the Izumo and Tōkai regions could be
related to migrations of settlers from the Gairin area in northeast Kyūshū. The large
Gairin area in the Tōhoku region could be the result of a much later migration from
Izumo. If so, Hattori was correct in suspecting that the geographical distribution of
the Japanese tone systems holds important information on the history of Japan.
11 The accent of Japanese loanwords in Ainu
In Chapter 9, I have discussed how Alexander Vovin (1993b) argued that proto-
Japanese loanwords in proto-Ainu confirm his reconstruction of distinctive vowel
length in proto-Japanese. As we have seen however, the Okinawan dialects on which
Vovin based his reconstruction of proto-Japanese vowel length do not agree with the
Ainu dialects on the lexical items in which vowel length is found. In a later article
(1997), Vovin adduces the accent of Japanese loanwords in Ainu to argue against
Ramsey’s reconstruction of the Middle Japanese tone system.
It is impossible to assess the evidence from Japanese loanwords in Ainu
concerning vowel length or the reconstruction of the proto-Japanese tone system
without a considerable knowledge of Ainu phonology and of the history of Ainu
pitch-accent and syllable structure. I will therefore begin with a general introduction
to the differences between the Ainu dialects, and to the reconstruction of proto-Ainu
syllable structure and accent by Hattori Shirō (1967).
I will then discuss Vovin’s reconstruction of the pitch-accent system of proto-
Ainu, as well as his arguments regarding the Japanese loanwords. Finally, I will
make a brief excursion into some wider issues regarding Vovin’s reconstruction of
proto-Ainu.
11.1 The basis of the Ainu dialect comparison
The basis for the dialect comparison of Ainu is the Ainu dialect dictionary published
in 1964, edited by Hattori Shirō. The dictionary includes material from Raichishka
on Sakhalin and from the following Hokkaidō Ainu dialects: Yakumo, Horobetsu,
Saru, Asahikawa, Obihiro, Bihoro, Nayoro and Sōya. At the time when the material
was being gathered, mainly from 1955 to 1956, the condition of the Ainu language
was already so bad that for several dialects in the dictionary (i.e. Yakumo, Bihoro,
Nayoro and Sōya) the informant whose speech formed the basis of the material was
the last surviving speaker of the dialect. The area of Nibutani and Biratori near the
Saru river has been the last foothold of the Ainu language, and the so-called Saru
dialect of Ainu is therefore the dialect that has been most extensively studied. The
dialect dictionary also includes the Kuril Ainu material of Torii Ryūzō, even though
Hattori remarks that it is not known how reliable Torii’s data are.1
1 Torii Ryūzō’s material was collected in 1899 on the island of Shikotan from an informant who
had come from the island of Shumshu near Kamchatka. Murayama (1971:III) has pointed out
that the material collected by the Polish doctor Dybowski, who was on Kamchatka from 1879
On the basis of this material and some older written sources (especially sources
on Kuril Ainu, which are introduced in this chapter in 11.8.5 and subsections), the
first large-scale attempt to reconstruct proto-Ainu was made in 1993 by Alexander
Vovin.
In 1960 Hattori Shirō and Chiri Mashiho published a lexicostatistical study of
the Ainu dialects. (As Kuril Ainu had already died out by that time, Kuril Ainu
dialects were not included.) As could be expected, the greatest gap could be found
between the Sakhalin dialects and the Hokkaidō dialects. In the Hokkaidō dialects
85% to 95% of the 200 items in Swadesh’s basic word list agreed with each other.
Between the Hokkaidō dialects and the Sakhalin dialects this was about 70% to 75%.
The Sakhalin dialects agreed with each other in about 90% of the items. The
Hokkaidō dialect that is closest to the Sakhalin dialects is the dialect of Sōya, on the
northwestern point of Hokkaidō. (The Sōya informant described the Sakhalin Ainu
language as “a little different from the language here, but I can understand it. The
pronunciation is a little different”.)
Within Hokkaidō there is a division between eastern and western dialects. In the
east, initial h- has the tendency to drop before unaccented syllables, while in the
west it is usually retained: ‘child’ hekáci in Horobetsu, Saru and Sōya, ekáci in
Obihiro, Asahikawa, Nayoro and Bihoro. (As Bihoro has lost all accentual
distinctions the Bihoro reflex should actually be written as ekaci. In Yakumo accent
is shifted to the third syllable if the second syllable is open: hekací.)
While there are no overt accentual distinctions in Bihoro, accent did leave a trace
in Bihoro, in the retention or loss of initial h-, and in a related phenomenon: In this
dialect, the rule that initial h- is dropped before formerly unaccented syllables has
been regularized into a rule in which a prothetic h- appears before formerly accented
syllables (personal communication by Satō Tomomi).2
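The Bihoro development described above can be sketched in a few lines of Python. This is only an idealized illustration: it assumes that the accented forms of the other Hokkaidō dialects can stand in for the former Bihoro accent, it treats the loss of h- as exceptionless although the text describes it as a tendency, and the function names are hypothetical.

```python
import unicodedata

VOWELS = "aeiou"
ACUTE = "\u0301"  # combining acute accent, used here to mark the accented vowel


def strip_accent(word: str) -> str:
    """Remove the acute accent marks from a transcribed form."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if ch != ACUTE)


def first_syllable_accented(word: str) -> bool:
    """True if the first vowel of the word carries the acute accent."""
    decomposed = unicodedata.normalize("NFD", word)
    for i, ch in enumerate(decomposed):
        if ch in VOWELS:
            return decomposed[i + 1 : i + 2] == ACUTE
    return False


def bihoro_reflex(accented_form: str) -> str:
    """Derive an accentless Bihoro-style form from an accented Hokkaidō form."""
    initial_accent = first_syllable_accented(accented_form)
    plain = strip_accent(accented_form)
    if plain[0] in VOWELS and initial_accent:
        return "h" + plain   # prothetic h- before a formerly accented syllable
    if plain[0] == "h" and not initial_accent:
        return plain[1:]     # h- dropped before a formerly unaccented syllable
    return plain


for form in ["hekáci", "ám", "áspa", "ése", "úrar", "óntaro"]:
    print(form, "->", bihoro_reflex(form))
```

The sketch yields ekaci, ham, haspa, hese, hurar and hontaro, matching the Bihoro forms cited in the text. It does not account for Bihoro hupsi ‘lacquer’, whose consonant cluster differs from that of the other dialects.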
to 1883 and also collected material from an informant from Shumshu, must have been from the
same dialect as Torii’s material, and as it also stems from around the same time, these two
sources can be checked against each other.
2 In the dialect of Bihoro a prothetic h- appears in the following words that originally started
with a vowel, and that all have accent on the first syllable in the other Hokkaidō Ainu dialects:
‘claw’ Bihoro ham, Yakumo, Horobetsu, Saru, Obihiro ám, am-í, Asahikawa,
Sōya ám, am-íhi, Nayoro ám, Sakhalin am, am-ihi
‘deaf’ Bihoro haspa, Yakumo, Horobetsu, Saru, Obihiro, Nayoro, Sōya áspa
‘to consent to’ Bihoro hese, Saru, Obihiro, Asahikawa ése, Horobetsu ko/ése
‘fog’ Bihoro hurar, Yakumo, Saru, Obihiro, Asahikawa, Nayoro, Sōya úrar,
Sakhalin uurara
‘barrel’ (< Jap. taru ‘barrel’) Bihoro hontaro, Horobetsu, Saru, Obihiro,
Asahikawa, Nayoro, Sōya óntaro, Sakhalin ontoro
‘lacquer’ (< Jap. urusi ‘lacquer’) Bihoro hupsi, Yakumo, Horobetsu, Saru,
Asahikawa, Nayoro, Sōya ússi, Obihiro úsi, Sakhalin usi
I would like to give the reflexes of the word ‘nail’ for comparison with the first example:
‘nail’ Bihoro am, am-i, Yakumo, Horobetsu, Saru, Obihiro, Nayoro, ám, am-í
Asahikawa, Sōya ám, am-íhi, Sakhalin am, am-ihi
We are clearly dealing with a single word here, as is obvious from the Japanese
Within Sakhalin, on the west coast, the dialects of Raichishka, Maoka and
Tarantomani were spoken. On the east coast there were the dialects of Nairo,
Shiraura and Ochiho. The only Sakhalin Ainu dialect represented in the dialect
dictionary is the dialect of Raichishka, based on the speech of Fujiyama Haru (Ainu
name: Esohrankemah) who was ‘repatriated’ to Japan after World War II, and who
spoke the dialect fluently. For the rest of the Sakhalin dialects the only available data
are the items on Swadesh’s word list investigated by Hattori and Chiri in their
lexicostatistical study of 1960.
Kuril Ainu, which was most closely related to Hokkaidō Ainu, died out during
the 19th century. Sakhalin Ainu died out in 1994 with the death of Mrs. Fujiyama,
and Hokkaidō Ainu has only a few speakers left.
11.2 Phonological differences between Sakhalin Ainu
and Hokkaidō Ainu
An important difference between the Sakhalin and the Hokkaidō Ainu dialects is the
fact that the Hokkaidō dialects have preserved syllable-final -p, -t, -k, while in
several dialects on Sakhalin (i.e. Ochiho, Maoka, Shiraura, and Raichishka), these
consonants have shifted to -s (realized as ʃ) after the vowel i, and to -h after all other
vowels.
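As an illustration, the shift of syllable-final stops can be written as a small Python function. The sketch assumes that a stop counts as syllable-final when it follows a vowel and stands before another consonant or at the end of the word; it ignores vowel length and all other differences between the dialects, and the function name is hypothetical.

```python
VOWELS = set("aeiou")
STOPS = set("ptk")


def sakhalin_coda_shift(word: str) -> str:
    """Shift syllable-final -p, -t, -k to -s after i and to -h after other vowels."""
    out = []
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # Assumption: syllable-final = after a vowel, and not followed by a vowel.
        syllable_final = nxt not in VOWELS
        if ch in STOPS and prev in VOWELS and syllable_final:
            out.append("s" if prev == "i" else "h")
        else:
            out.append(ch)
    return "".join(out)


print(sakhalin_coda_shift("sekutke"))  # sekuhke
print(sakhalin_coda_shift("inep"))     # ineh
```

Both outputs agree with forms cited in this chapter (sekuhke ‘to swell’, and the final consonant of Sakhalin iineh ‘four’ beside Saru ínep).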
Asai (1976) writes: “In Ainu, final stops are usually uttered without explosion
and are generally described as ‘implosive stops’. Sometimes in northern dialects,
however they are followed by a release of air which may be either weak or fairly
glosses (動物の爪 ‘the nail of an animal’ and 爪 ‘a nail’). Why then, is there a difference in the
reflex in Bihoro? I assume that this is related to the fact that Bihoro (and also Nayoro) have a
possessive form when the word refers to a human nail, but no possessive form when the word
refers to an animal nail. The existence of a possessive form, in which the accent – also in
Bihoro – once fell on the second syllable instead of the first, prevented the appearance of the
prothetic h-. In case of ‘claw’ on the other hand, the lack of a possessive form meant that the
accent was always on ám, and in this case the prothetic h- did appear.
The first syllable in ontaro may go back to the Japanese polite prefix o-, or possibly oo- ‘big’.
The shape on- instead of o- can only be explained if we assume the word was a loan from a
Tōhoku type dialect, where the intervocalic consonant -t- would have been voiced. As will be
discussed in section 11.11.1, intervocalic voiced consonants in loanwords from Japanese are
adopted as prenasalized voiceless stops in Ainu. If so, this word can count as proof of the fact
that the h- in Bihoro is not original. The clearest example showing that the Bihoro h- is not
original is the last one: The second u in urusi must have been interpreted as the vowel copy that
normally appears after -r, and at a later stage must have assimilated *ursi > ussi. (The -ps-
cluster in Bihoro is unusual. I don’t know if there are more examples of -ps- clusters in Bihoro
(or other dialects) that are the reflex of an original -rs- cluster, or a geminated s.)
Vovin mentions the possibility that the h- in Bihoro is prothetic, but nevertheless bases the
reconstruction of a separate proto-Ainu consonant, a voiced fricative *H- on the Bihoro data:
*Ham ‘claw’, *Haspa ‘deaf’, *HE(=)sE ‘to consent to’, *Hon=tarO ‘barrel’, *Huurar ‘fog’
and even *Hupsi ‘lacquer’ (Vovin, 1993:94).
strong. Thus /’asiknep/ (five; with a numerical ending) may occasionally be
pronounced as [’asiknep], [’asiki̥nep], [’asiki̥neph] and so on.” This description of
the Ainu dialects of northeast Hokkaidō shows a situation that can be seen as a first
step in the direction of the development in Sakhalin.
Syllable-final -r is not allowed in the Sakhalin dialects, and where Hokkaidō has
CVr Sakhalin will have CVrV, the final vowel usually – but not always – being a
copy of the vowel in the preceding syllable. Here again the situation in Hokkaidō is
not that different from the situation in the Sakhalin dialects: The consonant /r/ in
Ainu is a single flap and the preceding vowel will automatically resound after r in
final position in Hokkaidō as well, so that a word like /nukar/ ‘to see’ will actually
sound the same as in Sakhalin [nukara].3 Only when particles are attached does it
become clear that the final vowel in Hokkaidō is not a phoneme, and that on the
phonological level the word ends in /-r/:4 As Ainu has the rule /r/ → /n/ before /r/
we can see that the phonetic form [nukara] for ‘to see’ is really /nukar/ because of
the change: ‘to want to see’ /nukar rusuy/→/nukan rusuy/ (Asai, 1976).
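The two processes at work here, the automatic vowel echo after final /r/ and the change of /r/ to /n/ before /r/, can be sketched as follows. This is a minimal illustration under the assumption that the echo is always a full copy of the preceding vowel (footnote 3 shows that this is not always so); the function names are hypothetical.

```python
VOWELS = "aeiou"


def phonetic_citation(phonemic: str) -> str:
    """Echo the preceding vowel after a word-final /r/: /nukar/ -> [nukara]."""
    if len(phonemic) >= 2 and phonemic.endswith("r") and phonemic[-2] in VOWELS:
        return phonemic + phonemic[-2]
    return phonemic


def join_with_sandhi(first: str, second: str) -> str:
    """Apply /r/ -> /n/ when the following word begins with /r/."""
    if first.endswith("r") and second.startswith("r"):
        first = first[:-1] + "n"
    return f"{first} {second}"


print(phonetic_citation("nukar"))          # nukara
print(join_with_sandhi("nukar", "rusuy"))  # nukan rusuy
```

The sandhi rule operates on the phonemic form /nukar/, not on the phonetic form [nukara]. That is exactly why the alternation shows that the final vowel is not a phoneme.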
But even here, Asai (1976:193) remarks that kikír ‘insect’ “phonetically may be
described as [kikiri] in general. Phonemically however some idiolectal/dialectal
variation is indicated by the existence of two forms, [kikin rabu] and [kikiri rabu],
both meaning ‘insect wings’.” The process that Asai describes, where the automatic
vowel copy that appears after -r is starting to be reanalyzed as a phoneme in the
Hokkaidō Ainu dialects, is no doubt the origin of the situation in Sakhalin.5
From the examples in (1) (Hattori, 1967), it is clear that Hokkaidō has preserved
the older form.6
3 The vowel is not always a complete vowel copy in Hokkaidō either. Asai notes that /citarpe/ ‘a
mat made of cattail’ may be phonetically realized as [citaru̥be] rather than [citara•be]. “The
vowel that is heard phonetically after the final /-r/ is somehow neutralized.” In other cases the
reason why the vowel that can be heard after the final /-r/ is not a copy of the preceding vowel
may be influence from vowels later on in the word as in /arki/ → [ariki] ‘to come’. This again,
is very close to the situation in Sakhalin where we likewise find:
Raichishka Hokkaidō
‘to come’ ariki árki
‘left’ hariki hárki
‘grease’ kirupu kírpu
4 According to Kindaichi Kyōsuke, Chiri Mashiho’s sister Yukie, who died when she was only
nineteen, a year before her famous book Ainu shinyō-shū “A collection of Ainu epics of the
gods” (1923) came out, was actually the first to insist to him that the final vowel after -r was
only a kind of reverberation and not a real vowel (Asai 1976:193).
5 I expect this process to occur less frequently in case of monosyllables like /kór/ ‘to have’. The
fact that it is realized as [kóro], with accent on the first vowel, reveals that the automatic vowel
copy is not a phoneme.
6 If the final vowel in Raichishka is not a vowel copy, Hattori regards the final vowel as original,
and the vowelless Hokkaidō form as an innovation:
Raichishka Hokkaidō
‘eyebrow’ raru rár (poss. rarú)
‘ear’ kisaru kisár (poss. kisára)
(1) The loss of the distinction between /-r/ and /-rV/ in Sakhalin
                  Sakhalin   Hokkaidō
‘to submerge’     rara       rár
‘to have’         koro       kór
‘to see’          nukara     nukár
‘broad’           para       pará
‘big’             poro       poró
‘to look for’     hunara     hunára
The Kuril Ainu materials, both the older and the more recent (cf. section 11.8.5 and
subsections) show that Kuril Ainu was similar to Hokkaidō Ainu:
– Krasheninnikov: setùr ‘back’, Klaproth/Steller: ŝēdǔr ‘back’ (Hokkaidō setúr, -u7,
Sakhalin seturu, -hu).
– Krasheninnikov: xar (= kisar) ‘ear’, Klaproth/Steller: gsāhr (= kisar) ‘ear’
(Hokkaidō kisár, -a, Sakhalin kisaru, -hu, kisara, -ha).
– Krasheninnikov: uuràr ‘cloud’, Klaproth/Steller: ûrăr, Dybowski: urar ‘fog’
(Hokkaidō úrar, Sakhalin uurara ‘fog’).
– Klaproth/Steller: rahr ‘eyebrow’, Dybowski: rar ‘eyebrow’ (Hokkaidō rár, -ú,
Sakhalin raru, -hu).
A problem is that correspondences of the following type then become hard to explain:
Raichishka Hokkaidō
‘lightning’ imeru iméru
A comparison of the Raichishka forms in the dialect dictionary with those in Murasaki’s
wordlist of the Raichishka dialect shows that the regular correspondences do exist in
Raichishka, but that there is a certain alternation, possibly due to confusion with the possessive
forms in -u/-uhu. (See the explanation of how the possessive is formed later on in this section.)
The word ‘ear’ is also included in Hattori & Chiri’s investigation of Swadesh’s basic word list
in a number of Sakhalin Ainu dialects (1960), and while Ochiho, Maoka, Shiraura and Nairo all
have the form kisara, only Tarantomani and Raichishka have kisaru. It therefore seems correct
to regard the Hokkaidō form as original in case of these examples as well, and reconstruct
*kisar and *rar, especially as these are also the shapes in which they are attested in Kuril Ainu.
             Raichishka          Hokkaidō         Proto-Ainu
‘eyebrow’    rara ~ raru         rár              *rar
             (poss. raruhu)      (poss. rarú)
‘ear’        kisara ~ kisaru     kisár            *kisar
             (poss. kisar-uhu)   (poss. kisára)
As for the possessive kisara instead of expected kisaru in Hokkaidō, the vowel of the
possessive suffix after kisár may have been reduced to a vowel copy because it was unaccented
(as accent will remain on the second syllable). The vowel quality of the possessive suffix after
rár on the other hand was preserved, because it was accented.
7 Sakhalin has two possessive forms: seturuhu and seturihi. The last form has the annotation
‘old’. Horobetsu also has a possessive form in -i: setúri. The possessive forms in Nayoro
(setúru, -hu) and Sōya (setúr, -uhu) indicate that in these dialects this word is being reanalyzed
as ending in a vowel.
– Krasheninnikov: ur ‘clothing’, Dybowski: ur ‘a coat made of deer skin’,
(Hokkaidō ur ‘fur coat’, Sakhalin x).
– Klaproth/Steller: bāikǎr ‘spring’, Voznesenskij: pajgar ‘spring’ (Hokkaidō
páykar, Sakhalin paykara).
– Klaproth/Steller: rērăr ‘chest’ (Hokkaidō rerár, -u, Sakhalin reraru, -hu).
– Klaproth/Steller: kîhgĭr ‘insect’, (Hokkaidō kikír, Sakhalin kikiri).
According to Murayama (1971) Kuril Ainu should be regarded as a branch of
northern Hokkaidō Ainu, as in other respects as well (both phonological and lexical)
Kuril Ainu has more in common with the Hokkaidō Ainu dialects than with the
dialects of Sakhalin. (Contact between the three areas was never completely severed,
as there were extensive trade links between the Ainu of Hokkaidō, Sakhalin and the
Kuril islands, as well as with northern China and, from the 17th century on, with
Russian traders.)
Another case of an automatic vowel copy that may not be a phoneme involves
the possessive suffix. The number of words to which the possessive suffix can be
attached is limited, as it mainly attaches to words of inalienable possession (such as
body parts), culturally important nouns and locational nouns. The possessive suffix
has two different shapes:
After nouns that end in a consonant, a suffix -i or -u is attached.8 In case of
monosyllabic nouns, Hokkaidō will have the accent on the second syllable of the
resulting CVCV sequence, which is the preferred location of the accent for words of
this segmental shape in Hokkaidō. (See section 11.4 for the relation between
segmental shape and accent placement in Hokkaidō, and the relation between accent
placement in Hokkaidō and vowel length in Sakhalin.)
After nouns that end in a vowel, a suffix -hV is attached. 9 The vowel is an
automatic copy of the vowel that precedes the suffix. It is possible to add this suffix
to the suffix -i/-u (mainly after monosyllabic words) without an apparent difference
in meaning. This is especially common in Sakhalin Ainu. (It is not possible to add
the suffix -hV once more after itself.)
In case of monosyllabic nouns, Hokkaidō will have accent on the first syllable of
the resulting CVCV sequence, which is not the preferred location of the accent for
sequences of this segmental shape. Accent on the first syllable is, however, the
regular correspondence in Hokkaidō of the vowel length in the initial syllable that
we find in Sakhalin in these cases: In Sakhalin ‘child’ is poo, poo-ho, in Hokkaidō it
is po, pó-ho.
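The formation of the -hV suffix is simple enough to state as a short function. The sketch below covers only vowel-final nouns: the choice between -i and -u after consonant-final nouns depends on a vowel harmony that is not spelled out here, so that case is deliberately left unimplemented. Accent marks are omitted and the function name is hypothetical.

```python
VOWELS = "aeiou"


def possessive_hv(noun: str) -> str:
    """Attach -hV to a vowel-final noun, with V copying the noun-final vowel."""
    if not noun or noun[-1] not in VOWELS:
        # Consonant-final nouns take -i or -u instead; not modelled in this sketch.
        raise ValueError("-hV attaches only to vowel-final nouns in this sketch")
    return f"{noun}-h{noun[-1]}"


for noun in ["poo", "apa", "ure", "kema"]:
    print(noun, "->", possessive_hv(noun))
```

This gives poo-ho, apa-ha, ure-he and kema-ha, the segmental shapes of the possessive forms cited in this section.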
Final -h does not occur at all in Hokkaidō Ainu, and is always followed by a
vowel copy. It is therefore possible that the vowel copy that appears after -h is not a
8 The variation between -i and -u is due to a rather limited type of vowel harmony in Ainu.
9 When a possessive suffix attaches, a number of nouns with a seemingly open syllable structure
reveal that, underlyingly, they end in semivowels. In syllable-final position, the combinations
iy and uw are not allowed anymore in Ainu, but when the possessive suffix -i is attached the
semivowels reappear: (C)Vyé, (C)Vwé. (The -e is a lowered -i, as the combinations yi and wi are
not allowed anymore either.)
phonological vowel, and that this is the explanation for the lack of accent on this
vowel in the resulting CVCV sequence.10 However, in Yakumo, where accent on the
second syllable is automatically shifted onto the third syllable if the second syllable
is open,11 we see that it is possible for the automatic vowel copy to carry the accent
in these cases: Yakumo ‘child’ po, pó-ho etc. but ‘doorway’ apá, apa-há, ‘foot’ uré,
ure-hé, ‘leg’ kemá, kema-há etc. In Yakumo therefore, the vowel copy definitely has
to be analyzed as a phoneme, but before the innovative accent shift, Yakumo must
have been similar to the other Hokkaidō Ainu dialects.
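The Yakumo accent shift can be sketched as a mapping from the accent position of the other Hokkaidō dialects to the Yakumo position. Words are given as hand-divided syllable lists, which is an assumption of this sketch, as are the function names. The exceptions mentioned in footnote 16 (compounds, reduplications and onomatopoeia) are not modelled.

```python
import unicodedata

VOWELS = "aeiou"


def is_open(syllable: str) -> bool:
    """A syllable is open if it ends in a vowel."""
    return syllable[-1] in VOWELS


def yakumo_accent(syllables: list[str], accent: int) -> int:
    """Map a general Hokkaidō accent position (1-based) to the Yakumo position."""
    if accent == 2 and len(syllables) >= 3 and is_open(syllables[1]):
        return 3  # accent moves from an open second syllable to the third
    return accent


def mark_accent(syllables: list[str], accent: int) -> str:
    """Write the word with an acute accent on the vowel of the accented syllable."""
    parts = []
    for number, syllable in enumerate(syllables, start=1):
        if number == accent:
            syllable = "".join(
                ch + "\u0301" if ch in VOWELS else ch for ch in syllable
            )
        parts.append(syllable)
    return unicodedata.normalize("NFC", "".join(parts))


for word in (["he", "ka", "ci"], ["ho", "sip", "pa"], ["a", "pa", "ha"]):
    print(mark_accent(word, 2), "->", mark_accent(word, yakumo_accent(word, 2)))
```

For hekáci and apáha the accent moves to the third syllable (hekací, apahá), while hosíppa keeps its accent because its second syllable is closed.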
The vowel copy after the possessive suffix can be found in Sakhalin Ainu as well,
and here it is clear too that the vowel copy – if it was originally not a phoneme – has
been reevaluated as a phoneme, parallel to what happened in case of the vowel copy
after -r. In Sakhalin after all, final -h (from -p, -t, -k) does occur, and contrasts with
the possessive suffix /-hV/.12
Finally, in the Sakhalin Ainu dialect of Nairo (Hattori & Chiri 1960) initial r- has
shifted to t-. (Nairo is not the only Sakhalin dialect in which this shift occurred; in
Taraika, for instance, the same rule applies (Takahashi, 1997). Nairo is, however, the
only t-dialect included in the lexicostatistical study.)
The fact that the r- > t- shift only occurs in initial position can be clearly seen
from the complementary distribution of t- and -r- in tayki ‘to kill’ and anrayke ‘I
kill’. I regard the form of the verbs ‘to come down’ and ‘to smell’ that appear in the
Nairo word list as ran and rak as examples of the same kind of complementary
distribution, as in the examples in the basic word list they appear after vowels, in the
phrases atto ran ‘rain falls’ and huraha rak ‘to smell a smell’. I expect that they
would appear with initial t- if there had been examples included in which they
appeared in utterance initial position (or perhaps after words ending in voiceless
consonants).
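The hypothesis that the shift is limited to utterance-initial position can be made explicit in a short sketch. Restricting the shift to the first word of an utterance is the assumption argued for above, not an established rule; the possible further context after voiceless consonants is left out, and the function names are hypothetical.

```python
def nairo_initial_shift(word: str) -> str:
    """Shift word-initial r- to t-; medial -r- is left untouched."""
    return "t" + word[1:] if word.startswith("r") else word


def nairo_utterance(words: list[str]) -> list[str]:
    """Apply the shift to the utterance-initial word only (hypothesis)."""
    return [
        nairo_initial_shift(word) if position == 0 else word
        for position, word in enumerate(words)
    ]


print(nairo_initial_shift("rayki"))      # tayki
print(nairo_initial_shift("anrayke"))    # anrayke
print(nairo_utterance(["atto", "ran"]))  # ['atto', 'ran']
```

Under this hypothesis ran and rak keep their r- in atto ran and huraha rak because they are not utterance-initial.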
However, there are two exceptions to this rule in the Nairo dialect included in the
basic word list: ranka ‘breast’ and ramoro ‘guts’. I regard these forms as the result
of dialect mixing. The fact that even forms that are attested in the word list with
initial t- occasionally appear with initial r- can be seen as proof that a certain
admixture of forms from other dialects had occurred in the speech of the Nairo
dialect informant.13 This also appears to be the case with regard to other features:
According to Hattori, the Nairo dialect informant, Takayama Yoshi, was born in
Taraika and moved to Nairo at the age of eight and from there to Niitoi after she
10 Compare the similar case of the vowel copy that automatically appears after -r in the Hokkaidō
Ainu dialects, which at closer examination turned out not to be a phoneme. The word /kór/ ‘to
have’ for instance, is realized as [kóro] in Hokkaidō, with accent on the first vowel, and not on
the automatic vowel copy.
11 See sections 11.4 and 11.8.2 for a number of exceptions to this rule.
12 This means that the development of final -h from -p, -t, -k in Sakhalin Ainu must have
happened after the possessive suffix developed its present -hV shape.
13 According to Hattori’s introduction, in the Nairo dialect the forms ray and rayki, as well as tay
and tayki occurred for ‘to die’ and ‘to kill’.
reached the age of twenty. Her husband with whom she had lived for 30 years had
been from Kashiho and had influenced her speech somewhat: while Taraika, Nairo
and Niitoi all have preserved final -p, -t, -k, nevertheless in her speech, forms like
‘woman’ mahnekuh (matnekur) and ‘to swell’ sekuhke (sekutke) occurred. I
therefore do not consider it farfetched to regard the occasional occurrence of initial
r- in her speech likewise as the result of influence of the speech of her husband.14
Vovin (1993) on the other hand, bases the reconstruction of a distinct proto-Ainu
phoneme on the Nairo data. When the Nairo dialect has t-, Vovin reconstructs proto-
Ainu *d- and when the Nairo dialect has r-, Vovin reconstructs proto-Ainu *r-.15
I do not think this is correct as the dialects of Nairo and Taraika do not agree as
to which lexemes appear with t- and which lexemes appear with r-. There are cases
where Nairo has t- but Taraika (which normally also has r- > t-) has r- (cf. Nairo
tekut, Taraika rekut ‘neck’, Takahashi, 1997). In other instances, words that Vovin
reconstructs with initial *r- (because of r- in Nairo) have t- in Taraika (cf. Taraika
ramuhu~tamuhu ‘breast, heart’, which Vovin connects with Nairo ranka, and
tamtam ‘fish scales’, a reduplication of ram > tam). Vovin includes the words ‘to
come down’ (atto) ran and ‘to smell’ (huraha) rak in his examples of proto-Ainu *r-,
but as I have mentioned before, I think that ranka ‘breast’ and ramoro ‘guts’ are the
only real examples of initial r- in the Nairo dialect. And if ranka truly is a cognate of
Taraika ramu-hu~tamu-hu ‘breast, heart’ even of those two examples one has been
attested with initial t-.
11.3 Distinctive vowel length in Sakhalin
The Sakhalin dialects, and the Raichishka dialect that represents the Sakhalin
dialects in the dialect dictionary, have an opposition between long and short vowels
in words of more than one syllable. The opposition only exists in open syllables, as
the vowels in closed syllables are all short.
While the opposition clearly exists in the first syllable, it can only sporadically
be found in the second syllable, typically in cases where prefixes precede a stem
with a long vowel.
14 Takahashi Yasushige (personal communication), likewise explains the cases of initial r- in the
speech of his informant – who grew up on Sakhalin but now lives in Abashiri – as the result of
dialect mixing. The pre-war Japanese population policy in Sakhalin encouraged the Ainu
population from different areas to settle in a number of larger coastal villages and dialect
mixing was thus likely to occur.
15 Vovin explains the lack of the distinction in medial and final position in the following way: As
Ainu consonants are unreleased in final position, an early merger of *-d and *-t is likely, and
*-d thus merged with -t. In medial position on the other hand, lenition caused *-d- to shift to -r-,
and so medial *-d- merged with -r-.
(2) Examples of vowel length in the second syllable in Sakhalin
mii ‘to put on clothes’ → imii, imiyehe (possessive) ‘clothing’
nuu ‘to listen’         → konuu ‘to ask someone’
paa ‘year’              → kupaaha ‘my age’
miina ‘to laugh’        → emiina ‘to laugh at’, seemiinayara ‘to be funny, to make people laugh’
Long vowels in the second syllable therefore seem to be secondary. Minimal pairs
only exist for the opposition in the first syllable in polysyllabic words. The
opposition does not exist in monosyllabic words, as there the vowels in all open
syllables are long, while the vowels in all closed syllables are short.
In the examples above, I have indicated the pitch of the first and the second
syllable. This pitch is completely automatic: If the first syllable is closed or has a
long vowel it will always have [H] pitch, even if the second syllable is also closed or
consists of two moras. In all other cases the second syllable has [H] pitch. After the
second syllable, pitch can vary and often takes a [LHLM] or [HLML] shape.
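Because the pitch is automatic, it can be computed from the syllable structure alone. The sketch below takes hand-divided syllables as input and writes long vowels as doubled letters; both conventions, like the function names, are assumptions of the illustration. For monosyllables, which have no second syllable, the function simply returns 1.

```python
VOWELS = "aeiou"


def is_heavy(syllable: str) -> bool:
    """Heavy = closed (ends in a consonant) or containing a long (doubled) vowel."""
    closed = syllable[-1] not in VOWELS
    long_vowel = any(
        a == b and a in VOWELS for a, b in zip(syllable, syllable[1:])
    )
    return closed or long_vowel


def high_pitch_syllable(syllables: list[str]) -> int:
    """Return the 1-based index of the syllable that carries [H] pitch."""
    if len(syllables) == 1 or is_heavy(syllables[0]):
        return 1
    return 2


print(high_pitch_syllable(["sii", "sam"]))        # 1: long vowel in first syllable
print(high_pitch_syllable(["pay", "ka", "ra"]))   # 1: closed first syllable
print(high_pitch_syllable(["nu", "ka", "ra"]))    # 2: light first syllable
```

Note that the weight of the second syllable plays no role: konuu, with a long vowel in the second syllable, is treated like any other word with a light first syllable.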
11.4 Distinctive accent in Hokkaidō
Hokkaidō Ainu does not have an opposition between long and short vowels, but
instead in these dialects accent placement is distinctive. (The dialect of Bihoro is the
only dialect that has lost all accentual distinctions.) Accent can fall on the first or on
the second syllable.16 As it is the rise in pitch that is distinctive and not the fall
(unlike in Japanese) the pitch of syllables after the accented syllable is free, just as in
16 The only exception is the southernmost dialect of Yakumo. In words of three or more syllables
that do not have the accent on the first syllable, the accent will fall on the third syllable instead
of the second, unless the second syllable is closed. In that case the accent will fall on the
second syllable just as in the other Hokkaidō Ainu dialects:
                            Hokkaidō Ainu   Yakumo
‘head’                      sapá            sapá
‘her/his head’              sapáha          sapahá
‘my head’                   kusápaha        kusapáha
‘spatula/chopsticks’        pasúy           pasúy
‘chopsticks (for eating)’   ipépasuy        ipepásuy
‘to go home (once)’         hosípi          hosipí
‘to go home (repeatedly)’   hosíppa         hosíppa
‘to run (once)’             hoyúpu          hoyupú
‘to run (repeatedly)’       hoyúppa         hoyúppa
‘child’                     hekáci          hekací
‘children’                  hekáttar        hekáttar
There are exceptions to this rule, i.e. cases where accent will remain on the second syllable
even though this syllable is not closed, but these concern compounds, reduplications (often
mimetic words) and onomatopoeia. One special class of exceptions will be discussed in section
11.8.2.
Sakhalin, and may be low or high according to the sentence intonation. Although the
position of the accent is distinctive it is nevertheless to a large extent determined by
segmental features:
If the first syllable is closed (this includes syllables ending in the semivowels -y
and -w) it will be accented.
If the first syllable is open, the favored position of the accent is on the second
syllable. If such a word takes a prefix with an open syllable, the accent will shift in
order to remain on the second syllable: sitóma ‘to fear’ (transitive), isítoma ‘to be
afraid’ (intransitive). There are however, words in which the accent will fall on the
first syllable even though this syllable is not closed. Because of this, accent in
Hokkaidō can be said to be unpredictable and distinctive. Minimal pairs of this kind
in the Saru dialect are kéra ‘taste’ vs. kerá ‘straw raincoat’ and nína ‘to collect
firewood’ vs. niná ‘to knead’.
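The interplay between the default rule and lexically marked initial accent can be sketched as follows. Words are again given as hand-divided syllable lists, and the flag lexical_initial stands for the unpredictable initial accent that has to be learned word by word; the flag and the function name are hypothetical devices of this illustration.

```python
VOWELS = "aeiou"


def accent_position(syllables: list[str], lexical_initial: bool = False) -> int:
    """Return the 1-based accented syllable of a Hokkaidō Ainu word."""
    # Syllables ending in the semivowels -y and -w count as closed.
    first_closed = syllables[0][-1] not in VOWELS
    if len(syllables) == 1 or first_closed or lexical_initial:
        return 1
    return 2


print(accent_position(["si", "to", "ma"]))                # 2, as in sitóma
print(accent_position(["i", "si", "to", "ma"]))           # 2, as in isítoma
print(accent_position(["ke", "ra"], lexical_initial=True))  # 1, as in kéra
print(accent_position(["ke", "ra"]))                      # 2, as in kerá
```

Because the position is counted from the beginning of the word, adding an open-syllable prefix automatically moves the accent one syllable to the left within the stem, as in sitóma and isítoma.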
Hattori Shirō (1967) noticed that accent on the initial syllable in Hokkaidō
usually corresponds to vowel length in the initial syllable in Sakhalin (Raichishka).
In other words, accent in Hokkaidō falls on the second syllable, unless the first
syllable is closed, or unless the first syllable has a long vowel in Sakhalin.
(3) The relation between initial vowel length in Sakhalin
    and initial accent in Hokkaidō
                      Sakhalin    Saru
‘Japanese person’17   siisam      sísam
‘early, quick’        tuunas      túnas
‘four’                iineh       ínep
‘to fall’             haaciri     hácir
‘cold’                meerayki    mérayke
‘to teach’            caakasno    cákasnu
‘thin’                aane        áne
‘to breathe’          heese       hése
‘red’                 huure       húre
‘to laugh’            miina       mína
‘yesterday’           nuuman      núman
‘epic poem’           yuukara     yúkar
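The correspondence in (3) can be checked mechanically. The sketch below copies the twelve pairs from (3) and tests, for each pair, whether a long initial vowel in Sakhalin goes together with initial accent in Saru. Long vowels are taken to be written as doubled letters and accent as an acute mark; the helper names are hypothetical.

```python
import unicodedata

VOWELS = "aeiou"

PAIRS = [  # (Sakhalin, Saru) forms copied from (3)
    ("siisam", "sísam"), ("tuunas", "túnas"), ("iineh", "ínep"),
    ("haaciri", "hácir"), ("meerayki", "mérayke"), ("caakasno", "cákasnu"),
    ("aane", "áne"), ("heese", "hése"), ("huure", "húre"),
    ("miina", "mína"), ("nuuman", "núman"), ("yuukara", "yúkar"),
]


def first_vowel_index(word: str) -> int:
    """Index of the first plain vowel letter in the word."""
    return next(i for i, ch in enumerate(word) if ch in VOWELS)


def long_initial_vowel(sakhalin: str) -> bool:
    """True if the first vowel is immediately followed by the same vowel."""
    i = first_vowel_index(sakhalin)
    return sakhalin[i + 1 : i + 2] == sakhalin[i]


def initial_accent(hokkaido: str) -> bool:
    """True if the first vowel carries the acute accent."""
    decomposed = unicodedata.normalize("NFD", hokkaido)
    i = first_vowel_index(decomposed)
    return decomposed[i + 1 : i + 2] == "\u0301"


mismatches = [
    (s, h) for s, h in PAIRS if long_initial_vowel(s) != initial_accent(h)
]
print(mismatches)  # [] for the pairs in (3)
```

Pairs such as nukara and nukár, with a short initial vowel and accent on the second syllable, pass the same check from the other side.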
11.5 Similarities between the two systems
Just as Sakhalin Ainu has non-distinctive differences in accent placement depending
on the syllable structure of the language, the Hokkaidō Ainu dialects have non-
distinctive differences in vowel length depending on the accentual distinctions of the
17 Sísam literally means ‘neighbor’.
language. In Hokkaidō Ainu, vowels in open syllables that are accented are
pronounced longer (Asai, 1972, 1976). Asai, who studied a number of eastern
Hokkaidō Ainu dialects, indicates for instance that words like kéra an (tasteful), mé
an (cold) and tópen (sweet) have long initial vowels. In his description of the Hokkaidō
Ainu accent rules (1960:25), Kindaichi Kyōsuke even seems to turn the analysis
around: He states that as a rule, accent will fall on the second syllable, unless the
first syllable is a closed syllable or “a syllable that is pronounced a bit stretched
out.” (This is an interesting remark in light of Hattori’s later theory that historically,
initial accent in Hokkaidō developed out of vowel length in the initial syllable.)
The difference between the two systems is therefore not directly obvious, and it
was only in 1955 that Hattori Shirō discovered that the Sakhalin Ainu material
(which he had analyzed as having distinctive pitch-accent until then, just like
Hokkaidō Ainu) could be better analyzed as showing a distinctive difference in
vowel length, with the pitches depending on the syllable structure.
11.6 Hattori’s reconstruction of proto-Ainu phonological structure
and accent
In 1967, Hattori Shirō proposed the idea that proto-Ainu had a distinction between
long and short vowels like the Sakhalin dialects, but no distinctive location of the
accent. This reconstruction was based on a comparison of the Sakhalin Ainu dialect
of Raichishka and the Hokkaidō Ainu dialect of Saru. According to Hattori, vowel
length in the initial syllable (of polysyllabic words) in proto-Ainu has been
preserved in Sakhalin Ainu, while it has shifted to distinctive high pitch in Hokkaidō.
Hattori indicates initial glottal stop in case of Ainu words that start with a vowel.
In his reconstruction of proto-Ainu phonological structure therefore, Hattori
indicates all Ainu syllables as starting with a consonant (C). I have adopted Hattori’s
representation of Ainu syllable structure as uniformly starting with C in (4), but I
follow Vovin in only indicating the glottal stop when it occurs between two vowels
within a word.
4 Hattori’s reconstruction of proto-Ainu syllable structure and accent
Sakhalin Proto-Ainu Hokkaidō
CVCV < *CVCV > CVCV́
CVCVCV < *CVCVCV > CVCV́CV
CVCVVCV < *CVCVVCV > CVCV́CV
CVCVC < *CVCVC > CVCV́C
CVCVS < *CVCVS > CVCV́S
CVCCV(CV) < *CVCCV(CV) > CV́CCV(CV)
CVSCV(CV) < *CVSCV(CV) > CV́SCV(CV)
CVVCV < *CVVCV > CV́CV
CVVCVC < *CVVCVC > CV́CVC
CVVCV < *CVVCVV > CVCV́
CVVCVS < *CVVCVS > CVCV́S
As can be seen from the table, one of the differences that Hattori posits between
proto-Ainu and Sakhalin Ainu is the fact that proto-Ainu always had [H] pitch on
the second mora of initial long vowels, whereas in Sakhalin Ainu the entire initial
syllable has [H] pitch. (In Hattori & Chiri (1960), the reconstructed proto-Ainu
pitches were similar to those of modern Sakhalin in this respect.)
In (5) I present Hattori’s correspondences in an adapted form. I have simplified
Hattori’s representation in the following way: Hattori distinguishes between closed
syllables that end in a semivowel -y, -w (CVS) and closed syllables that end in other
consonants (CVC). As there is no difference in the accentual reflexes of the two
kinds of syllables except between the last and the third to last correspondences, I
have omitted the distinction in all other cases. (As I find the use of the symbol S
confusing, I represent -y/-w with the symbol W.) I have also numbered the sets of
correspondences.
Correspondence nr. 4 has been added by me.18 The example words (except for
correspondence 4) have been adopted from Hattori’s article.
5 Adaptation of Hattori’s reconstruction
Sakhalin Proto-Ainu Hokkaidō
1 CVCV sapa *CVCV CVCV́ sapá ‘head’
2 CVCVC etoh *CVCVC CVCV́C etók ‘tip, edge’
3 CVCCV suhki *CVCCV CV́CCV súpki ‘reed’
4 CVCCVC sahteh *CVCCVC CV́CCVC sáttek ‘to get skinny’
5 CVVCV heese *CVVCV CV́CV hése ‘to breathe’
6 CVVCVC seeseh *CVVCVC CV́CVC sések ‘to grow hot’
7 CVVCV kiiki *CVVCVV CVCV́ kikí ‘to scratch’
8 CVVCVW noociw *CVVCVW CVCV́W nocíw ‘star’
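The syllable shapes used in these correspondences can be derived mechanically from the Sakhalin forms. The following Python sketch is an illustration only: the function name is hypothetical, the input is assumed to be an unaccented lower-case romanization, and vowel-initial words are given an initial C, following Hattori’s representation of the initial glottal stop. Unlike the adapted table, the sketch writes W for every syllable-final -y or -w, not only where the distinction matters for the accent.

```python
VOWELS = set("aeiou")
GLIDES = set("yw")


def skeleton(form):
    """Map a romanized form to a C/V/W shape such as 'CVVCVW'."""
    shape = []
    if form and form[0] in VOWELS:
        shape.append("C")  # vowel-initial words count as starting with a glottal stop
    for k, ch in enumerate(form):
        if ch in VOWELS:
            shape.append("V")
        elif ch in GLIDES and (k + 1 == len(form) or form[k + 1] not in VOWELS):
            shape.append("W")  # syllable-final -y / -w
        else:
            shape.append("C")
    return "".join(shape)
```

Applied to the Sakhalin example words, `skeleton("heese")` returns `"CVVCV"` and `skeleton("noociw")` returns `"CVVCVW"`.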
18 There are many examples of correspondence 4, also in the Saru dialect that Hattori used. Just to
show a few: Sakhalin ahkas, Hokkaidō ápkas ‘to walk’, Sakhalin sahkes, Hokkaidō sákkes
‘November’, Sakhalin muhkun, Hokkaidō múkkur ‘flute’, Sakhalin sisrah, Hokkaidō síkrap
‘eyelashes’, Sakhalin tehkuh, Hokkaidō tékkup ‘wing’, Sakhalin sinris, Hokkaidō sínrit ‘root’,
Sakhalin aykah, Hokkaidō áykap ‘to be unskillful (intrans.)’, Sakhalin kewtum, Hokkaidō
kéwtum or kéytum (Bihoro) ‘mind, heart, soul’ etc.
Examples of correspondences 1 to 6 are numerous. Correspondences 7 and 8 are
based on a small number of cases where Hokkaidō has accent on the second syllable
instead of the first. I will first discuss the examples of correspondence 8.
6 Hattori’s examples of correspondence 8
Sakhalin Hokkaidō
noociw nocíw ‘star’
kaasiw kasúy ‘to help’ (transitive)19
niisew niséw ‘nut, acorn’
In case of ‘to help’ Yakumo kásiw and Chitose kásuy (personal communication by
Takahashi) show the regular correspondence with Sakhalin, while Saru, Horobetsu,
Obihiro and Asahikawa (kasúy) do not.20 (Nayoro and Sōya have u-kásuy and i-
kásuyke respectively, but because of the verbal prefix the original location of the
accent cannot be established.)
Examples of words with this segmental shape (CVVCVW in Sakhalin) that
have accent on the first syllable in Hokkaidō Ainu are in fact more numerous:
Sakhalin kuukew ‘shoulder’, uusey ‘warm water’, niitay(usi) ‘forest’ and meerayki
‘cold’ are kúkew, úsey, nítay and mérayke respectively in Hokkaidō.
However, Hattori dismisses these examples on the following grounds: ‘shoulder’
is reconstructed as *kuwkew, because Torii Ryūzō’s Kuril Ainu material has
kupkeu. 21 According to Murayama (1971:221) ‘shoulder’ is a compound of ku-
‘bow’ and -kew bone. The underlying form of ‘bow’ can indeed be reconstructed as
*kuw based on the possessive form kuwé.
19 There are more cases of forms that end in -y in one dialect and in -w in others. As Yakumo –
the southernmost dialect of Hokkaidō – agrees with Sakhalin on the segmental shape and as the
form kasí also exists in Hokkaidō Ainu, the proto-Ainu form was probably *kaasiw. (The only
dialect in Hokkaidō that regularly has -uy instead of -iw is Bihoro. In the other dialects both -uy
and -iw occur and the distribution of the forms differs widely per word so that it is not possible
to see a clear pattern.) As *-iy is not allowed in Ainu, the vowel u in the -uy forms is probably
the result of rounding of the original vowel. The development could have been -iw > -uw > -uy.
20 There are a number of loanwords from Japanese that have vowel length in the initial syllable in
Sakhalin. In case of only one of them the Hokkaidō Ainu dialects show divergent accent: 2.4
tuti ‘large wooden hammer’, Sakhalin tuuci, Saru, Sōya túci, Horobetsu tútci, but Yakumo and
Obihiro tucí. It is interesting to see that this word is reconstructed by Martin (1987:557) as *tutiy
or *tutui in proto-Japanese (even though there is no kō/otsu distinction after t- in Old Japanese).
If this reconstruction is correct, the proto-Ainu form may have been *tuutuy (i.e.
correspondence 8), which could explain the accent on the second syllable in Yakumo and
Obihiro.
21 Dybowski’s Shumshu material has a different word for shoulder, tapko and Krasheninnikov’s
material has tapsùt so we cannot confirm Torii’s entry. The morpheme -kew, in kukew (and
probably also -ko in tapko) has the meaning ‘bone’ but only occurs in compounds. It may have been
the original Ainu word for ‘bone’ until it was replaced by poné from Japanese hone ‘bone’
(Murayama 1971:221).
The next example is tentatively reconstructed as proto-Ainu *uwsey22 by
Hattori, and the last two examples are dismissed as compounds of the independent
words ni (Sakhalin nii) ‘tree’ and me (Sakhalin mee) ‘cold’ to which the morphemes
tay and rayke have been added. ‘Cold’ is clearly a compound, as the morpheme ráyke
occurs independently (in the meaning ‘to kill’), so this example may indeed be set aside.
But even though there can be no doubt that nítay contains the morpheme ‘tree’, -tay cannot
occur separately, and I do not think it can be dismissed from the examples.
Based on the possessive form niyé ‘wood’ in Horobetsu we may reconstruct
‘tree’ as *niy. The word ‘nut, acorn’ no doubt contains the same morpheme. (The
second component (-sew) cannot occur independently.) This leaves only two
examples of correspondence 8.
If we rearrange the examples based on their reconstructed syllable structure,
correspondence 4 now has the following new members, of which only ‘nut, acorn’
has an irregular location of the accent in Hokkaidō.
7 Additions to correspondence 4
Sakhalin Hokkaidō Proto-Ainu
‘shoulder’ kuukew kúkew *kuwkew
‘warm water’ uusey úsey *uwsey
‘forest’ niitay(usi) nítay *niytay
‘nut, acorn’ niisew niséw *niysew
As to the examples of correspondence 7: kiiki and siine can be found in Raichishka,
and the Hokkaidō reflexes kikí and siní can be found in most Hokkaidō dialects,
always with accent on the second syllable. I have not been able to find niina ‘to
mash (with a ladle)’ for Raichishka in the Ainu dialect dictionary, nor in Murasaki’s
wordlist, but the example may stem from fieldwork by Hattori himself.23 I have also
22 Hattori’s reconstruction is based on the following correspondences:
Sakhalin Nayoro Saru Yakumo Proto-Ainu
nuyna núyna núyna núyna *nuyna ‘to hide’ (transitive)
uyna úyna úyna úyna *uyna ‘to take (many things)’
uuna úna úna úyna *uYna ‘ash’
uusey úsey úsey úsey *uwsey ‘warm water’
However, the issue is even more complicated as there is also the following correspondence:
Sakhalin Nayoro Saru Yakumo Proto-Ainu
uteh úytek úytek úytek ? ‘to use (a person)’
23 I did, however, find niina ‘to get firewood’ in Murasaki’s wordlist. This example shows a
regular correspondence with the minimal pairs that Hattori gave for accentual distinctions in
the Saru dialect: nína (<*niyna) ‘to collect firewood’ (also in the Chitose dialect of Oda Ito, p.c.
by Anna Bugaeva) vs. niná ‘to knead/crush’, and we would have expected ‘to mash (with a
ladle)’ to have a short vowel in Sakhalin.
been unable to find kaamanpa for Raichishka, but the word kamá can be found in
Saru, Yakumo and Asahikawa. The Chitose dialect on the other hand has káma
(personal communication by Takahashi Yasushige), which corresponds regularly to
the long vowel in Raichishka reported by Hattori.
8 Examples of correspondence 7
Raichishka Hokkaidō
kiiki kikí ‘to scratch’
siine siní ‘to rest’
niina ‘to mash (with a ladle)’ niná ‘to knead/crush’
kaamanpa kamá ‘to go over, to cross over’
11.6.1 Exceptions to Hattori’s correspondences
I suspect that it is easier for accent to shift to the right, to the favored second syllable
in a number of Hokkaidō Ainu dialects, than for original vowel length in the first
syllable to be lost in Sakhalin. In the accent systems of the Japanese dialects,
irregular reflexes in individual dialects are not uncommon, especially in case of
longer nouns. Rightward shift of accent – often conditioned by vowel height – is
very common and can be quite irregular, even in disyllabic nouns.
In case of Ainu it can be seen that a closed second syllable can induce accent to
shift to the right in some dialects, even when the initial syllable was long in proto-
Ainu: sések ‘to grow hot’ (seeseh in Sakhalin) has initial accent in all Hokkaidō Ainu
dialects, while in sésekka ‘to boil (water)’ (seesehka in Sakhalin), which is clearly a
cognate word, accent has shifted to the right in Asahikawa and Sōya (sesékka). I
think the irregular reflexes in Sakhalin niisew, Hokkaidō niséw (< *niysew) ‘nut,
acorn’ and Sakhalin kaaris, Hokkaidō karíp may therefore be due to the closed
second syllable.24 Hattori’s correspondence 7 may have to be viewed in the same
light, namely as a limited set of cases where accent shifted to the favored second
syllable in Hokkaidō, despite the fact that there was vowel length in the initial
syllable in proto-Ainu.
Examples where Hokkaidō has accent on the first syllable although the first
vowel in Sakhalin is short are exceedingly rare. The clearest example of this type,
where all the Hokkaidō Ainu dialects except Sōya have accent on the first syllable,
despite the fact that Sakhalin has a short vowel in the first syllable, is ‘smell’:
Raichishka hura, Hokkaidō húra. I think that in this case it is best to reconstruct a
Raichishka Hokkaidō
(Hattori) (Murasaki)
‘mash, crush, knead’ niina x niná
‘collect firewood’ niina niina nína
24 Hattori, on the other hand, reconstructs *kaariip ‘ring, wheel’, based on the correspondence of
Raichishka kaaris with Hokkaidō karíp, even though this is the only case where a long vowel
in a closed syllable has to be reconstructed in proto-Ainu.
long initial vowel in Raichishka, which was later lost, as it is hard to imagine that all
the Hokkaidō dialects independently shifted the accent to the marked position. (This
reconstruction is confirmed by the fact that the dialect of Ochiho on Sakhalin has
huuraha rah ‘to smell (a smell)’ according to Hattori and Chiri’s lexicostatistical
study.) Only in Sōya, the dialect in the northwestern point of Hokkaidō that is
relatively close to the dialects of Sakhalin, does the accent correspond regularly to that of
Raichishka: hurá.
The only two other examples are ekasi ‘old man’ and hoku ‘husband’. In case of
ekasi half of the dialects in Hokkaidō show the regular correspondence with
Sakhalin while the other half does not. In case of hoku only Asahikawa fails to show
the regular correspondence with Sakhalin.
9 Exceptions to Hattori’s correspondences
Sakhalin Hokkaidō
‘smell’ hura, huuraha húra Sōya hurá
‘old man’ ekasi ékasi Obihiro, Asahikawa, Nayoro
ekási Horobetsu, Saru, Sōya25
‘husband’ hoku hóku Asahikawa
hokú Yakumo, Horobetsu, Saru, Sōya
okú Obihiro, Nayoro, (Bihoro)26
11.7 The lack of pitch and vowel length distinctions in monosyllables
As will be shown in 11.8.3 and subsections, older materials confirm that vowel
length distinctions similar to the ones in Sakhalin Ainu once existed in Hokkaidō
Ainu in polysyllabic words, but proof of vowel length distinctions in monosyllables
is inconclusive.
This makes it only more surprising that there is one Hokkaidō Ainu dialect
description from as late as 1938, in which a vowel length distinction is reported for
monosyllabic words.
11.7.1 Yamamoto Tasuke’s description
In 1938 Yamamoto Tasuke, who had no linguistic training, made a description of his
own eastern Hokkaidō Ainu (Kushiro) dialect (published in 1959). In his description
he made no mention of vowel length contrasts in polysyllabic words but in case of
monosyllables Yamamoto gave a list of words which formed minimal pairs in his
25 As accent on the second syllable shifts to the third syllable if the second syllable is open in
Yakumo, this dialect has ekasí.
26 The dialect of Bihoro, in which initial h- is dropped when it precedes an unaccented syllable,
has oku, indicating that in this dialect as well, accent originally fell on the second syllable.
own dialect, depending on whether they had a long or a short vowel: ma ‘to roast’,
maa ‘to swim’, ka ‘surface, top’, kaa ‘thread’, ki ‘louse, flea’, kii ‘reed’, pa ‘head,
year’, paa ‘to find’, ri ‘to peel off, tear off’, rii ‘high’, to ‘swamp, lake’, too ‘today,
day’, pe ‘water, liquid’, pee ‘thing’, nu ‘hot spring’, nuu ‘to listen’, ru ‘road’, ruu ‘to
melt’, ta ‘harvest’, taa ‘this’, pu ‘storage house’, puu ‘to rise up’, pi ‘to pull’, pii ‘to
drip’, o ‘end, bottom’, oo ‘to ride’, ku ‘bow’, kuu ‘to drink’.
As Yamamoto’s indication of vowel length distinctions in monosyllables was
unique in the description of Ainu, and could not be confirmed by any other data,27
Ainu linguists have been hesitant to conclude that Yamamoto’s dialect preserved an
original vowel length distinction in monosyllabic words inherited from proto-Ainu
(cf. Hattori & Chiri, 1960, Itō, 1978).
I find it hard to dismiss Yamamoto’s list of minimal pairs. In most cases the two
distinct meanings that he mentions do exist in other Ainu dialects as homophones (in
Sakhalin always with long vowels, and in Hokkaidō always with short vowels), and
it is hard to imagine him making up such a list of minimal pairs if they did not truly
exist in his own dialect.
In the case of five of his examples of monosyllabic words with short vowels, I have
found indications that they go back to closed syllables, namely syllables ending in
semivowels in proto-Ainu:
– pu ‘storage house’ vs. puu ‘to rise up’
‘storage house’ Hokkaidō pú. Yakumo has the possessive form púhu, but
Sakhalin puu, puwehe indicates that this word should be reconstructed as *puw.28
– ri ‘to peel off, tear off’ vs. rii ‘high’
‘to peel off, tear off’ rí in Hokkaidō Ainu means ‘to skin, to strip’. In Sakhalin
the form riye indicates that this word should be reconstructed as *riy.
– ku ‘bow’ vs. kuu ‘to drink’
‘bow’ Hokkaidō kú, Sakhalin kuu. The possessive forms kuwé in Yakumo, kuyé in
Obihiro and kuwé (but also kúhu) in Saru indicate that this word should be
reconstructed as *kuw.
– ka ‘surface, top’ vs. kaa ‘thread’
Torii has the entry kau for ‘surface, top’, which indicates that this word should
be reconstructed as *kaw.
27 Asai (1972, 1976) mentions that ma ‘to roast’ and ma ‘to swim’ often differ in tone in
context in the Hokkaidō Ainu dialects that he studied (Ishikari and Tokachi):
‘a fish is swimming here’ taánta cép má kor an (with a rising tone on ma)
‘that man is roasting a fish’ taánkur cép má kor an (with a falling tone on ma)
This example happens to agree with one of the examples that Yamamoto Tasuke gave of
differences in vowel length in the description of his own dialect, but as F. Kortlandt has pointed
out to me, the difference in intonation in Asai’s sentences may be due to differences in the
placement of sentence stress in case of transitive versus intransitive verbs, and cannot be
counted as a confirmation of Yamamoto’s distinction. See English: ‘He is frying a FISH’ with
no stress on ‘frying’, but ‘The fish is SWIMMING’ with sentence stress on ‘swimming’.
28 See section 11.2 for an explanation of the way in which the possessive is formed.
– ru ‘road’ vs. ruu ‘to melt’
In the meaning of ‘road’ rú has no possessive form, but in the meaning of ‘line’ and
‘tracks, traces, footmarks’ the possessive form is ruwé, which indicates that this
word should be reconstructed as *ruw.
My conclusion from these examples is that Yamamoto Tasuke’s description
should be taken seriously, but that it has no relation to a possible vowel length
contrast in monosyllables in proto-Ainu. What appears to have happened is that
certain vowel + semivowel sequences were no longer allowed and were simplified.
The simplification of the vowel + semivowel sequences in the examples above
created open syllables, but this happened after the vowels of all originally open
syllables had been lengthened, just as in Sakhalin. As closed syllables were exempt
from the vowel lengthening, the semivowels left a trace in Yamamoto’s dialect in
the shape of vowel shortness, even though they were later lost.
This analysis of Yamamoto Tasuke’s data would also mean that the small set of
words for which it was possible to reconstruct a final semivowel because this
semivowel reappeared in the possessive form, can now be enlarged with the
following examples: ki ‘louse, flea’ (< *kiy), pi ‘to pull’ (<*piy), pe ‘water, liquid’
(< *pew or *pey?), nu ‘hot spring’ (< *nuw), to ‘swamp, lake’ (< *tow), o ‘end,
bottom’ (<*ow), ma ‘to roast’ (< *maw), pa ‘head, year’ (<*paw), ta ‘harvest’
(<*taw).29
11.7.2 Asai’s findings
There is another Hokkaidō Ainu dialect in which the lost semivowels appear to have
left a trace in the form of vowel length. Asai (1976) records differences in vowel
length in Hokkaidō Ainu (in what he calls the ‘northern dialects’, Asahikawa and
Tokachi, among the east Hokkaidō Ainu dialect group). As an example he gives the
following two sentences:
– Tanto e-nína kusun kim ta e-oman nankon na. ‘You should go to the forest to get
firewood.’
– Tanpe e-nína kunip ne na. ‘This is what you should knead.’
According to Asai, the vowel i in e-nína (from nína ‘to collect firewood’) is
pronounced much longer than the vowel i in e-nína (from niná ‘to knead’). This is
despite the fact that they are both accented, because niná ‘to knead’ becomes e-nína
when prefixed, with the accent remaining on the favored second syllable. Asai
furthermore reports that the vowel -u in ku-kéwe ‘my height’ (= ku ‘my’+ kewé
‘body’) is pronounced very short, while the u’s in kú kewé ‘the grip of a bow’ and
kúkewe ‘shoulder’ (poss.) are pronounced very long. In contrast with this, the length
of -e in the syllable ke in these examples only differs slightly, although one is
accented and the others are not (1976:200).
29 It is unclear why *-aw has sometimes been preserved in Sakhalin, the Kurils and some of the
older Ainu sources (as in Hokkaidō rá ‘down’, rá-ta ‘downward’ but Sakhalin raw ‘down’,
rawta ‘downward’, Ezo kotoba irohabiki (1848) rauta ‘bottom’ and Dybowski’s Kuril Ainu
ravda) and sometimes not, as in ma ‘to roast’, pa ‘head, year’ and ta ‘harvest’.
The significant point here is that accent placement makes only a slight difference,
while two morphemes that have to be reconstructed as ending in semivowels in
proto-Ainu (based on other grounds), namely *niy ‘tree’ in ‘to collect firewood’ and
*kuw ‘bow’ in ‘the grip of a bow’ and ‘shoulder’ (‘bow bone’ kupkeu in Torii
Ryūzō’s Kuril Ainu material) are pronounced with markedly lengthened vowels.
Compared with Yamamoto’s dialect, the vowel length occurs in the opposite set
of words. This means that open monosyllables in the dialects described by Asai were
not lengthened. Instead, the semivowels in -uw and -iy were replaced by vowel
length. Both dialects, however, have preserved traces of a similar distinction in
proto-Ainu.
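The two developments proposed in this section can be contrasted in a small Python sketch. It is an illustration under my own simplifying assumptions, not a complete account: the function names are hypothetical, the input is a reconstructed monosyllabic morpheme written without the asterisk, and originally open monosyllables are assumed to have had a short vowel in proto-Ainu. The first function orders lengthening of open syllables before loss of the final semivowel (Yamamoto’s dialect); the second leaves open syllables alone and turns -uw and -iy into long vowels (the dialects described by Asai, whose examples come from morphemes inside longer words).

```python
VOWELS = set("aeiou")
GLIDES = set("yw")


def yamamoto_reflex(proto):
    """Open syllables lengthen first; a final semivowel is lost afterwards."""
    if proto[-1] in VOWELS:
        return proto + proto[-1]   # open syllable: vowel lengthened
    if proto[-1] in GLIDES:
        return proto[:-1]          # closed by a semivowel: stays short, glide lost
    return proto                   # other closed syllables are left unchanged


def asai_reflex(proto):
    """Open syllables stay short; -uw and -iy are replaced by vowel length."""
    if proto.endswith(("uw", "iy")):
        return proto[:-1] + proto[-2]  # semivowel replaced by a copy of the vowel
    return proto
```

With these assumptions, `yamamoto_reflex("kuw")` yields `"ku"` and `yamamoto_reflex("ku")` yields `"kuu"`, whereas `asai_reflex("kuw")` yields `"kuu"` and `asai_reflex("ku")` yields `"ku"`, so that length ends up in opposite sets of words.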
11.8 Evidence for the direction of change
Hattori’s idea was that the pitch distinctions in Hokkaidō Ainu go back to original
differences in vowel length in proto-Ainu, which have only been partly preserved in
Sakhalin Ainu. Hattori’s reconstruction was based on the regular correspondence
between vowel length in the first syllable in Sakhalin and accent on the first syllable
in Hokkaidō. But how do we know that it was the pitch distinction that developed
out of the vowel length distinction and not the other way around?
11.8.1 The Hokkaidō Ainu system as a simplification
of the Sakhalin Ainu system
A strong indication that Hattori’s idea on the direction of change is correct is the
fact that in compounds and prefixed forms the Sakhalin Ainu system based on vowel
length has preserved more distinctions than the Hokkaidō Ainu system based on
pitch. In Sakhalin for instance, the initial closed syllable in sisseeseh ‘the weather
is hot’ (a compound of sir ‘weather’ and seeseh ‘to grow hot’) has automatic [H]
pitch, but the distinctive long vowel of seeseh has been preserved. In Hokkaidō
sírsesek ‘the weather is hot’ on the other hand, the initial closed syllable is
automatically accented, which has caused the distinctive initial accent of sések to be
lost. Compare also the cases in (10), where the Hokkaidō system has preserved fewer
distinctions than the Sakhalin system.
10 The Hokkaidō Ainu system as a simplification of the Sakhalin Ainu system
Sakhalin Hokkaidō
‘to hear’ CVCVV inuu CVCV́ inú
‘head’ CVCV sapa CVCV́ sapá
‘to laugh at’ CVCVVCV emiina CVCV́CV emína
‘gift’ CVCVCV imoka CVCV́CV imóka
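The merger shown in (10) can be made explicit with a short Python sketch. It is an illustration only: the function name is hypothetical, the input is a Sakhalin C/V shape, and the sketch is valid only for forms whose first syllable is short and open, as in the four examples above. Under these assumptions, vowel length is removed and the accent (written here with the combining acute accent U+0301) is placed on the second syllable, so that two different Sakhalin shapes yield the same Hokkaidō shape.

```python
ACUTE = "\u0301"  # combining acute accent, placed after the accented V


def hokkaido_shape(sakhalin_shape):
    """Derive the Hokkaido shape from a Sakhalin shape with a light first syllable."""
    short = sakhalin_shape.replace("VV", "V")  # vowel length is not preserved
    first = short.index("V")
    second = short.index("V", first + 1)       # vowel of the second syllable
    return short[:second + 1] + ACUTE + short[second + 1:]


# Shapes of 'to hear', 'head', 'to laugh at' and 'gift' in Sakhalin
shapes = ["CVCVV", "CVCV", "CVCVVCV", "CVCVCV"]
distinct_in_hokkaido = {hokkaido_shape(s) for s in shapes}
```

The four Sakhalin shapes collapse into two Hokkaidō shapes, which is the loss of distinctions referred to in the text.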
11.8.2 The relation between retention of accent on the second syllable
in Yakumo and vowel length in Sakhalin
There is a small set of words of more than two syllables in Yakumo that will not
shift the accent from the second to the third syllable, even though the second syllable
is open, and even though the words involved are not onomatopoeia or reduplications.
As we have seen, long vowels can only be found sporadically in the second syllable
in Sakhalin, but it is the existence of a distinction between long and short vowels in
the second syllable that makes the phonological system of the Sakhalin Ainu dialects
fundamentally different from the Hokkaidō Ainu dialects. It is therefore remarkable
that Yakumo, which is the southernmost Ainu dialect in Hokkaidō, and thus farthest
removed from Sakhalin, appears to maintain accent on the second syllable, when the
second syllable has a long vowel in Sakhalin.30 The last three examples, which do
not have long vowels in Sakhalin, have been added for comparison.
11 Retention of accent on the second syllable in Yakumo
and vowel length in Sakhalin
Sakhalin Yakumo
‘to bring to, to deliver’ koruura korúra
‘overcoat’ ikaakuspe ikákuspe
‘last night’ onuuman onúman
‘to give birth’ upookoro upókor
‘teacher’ icaakasnokur ipákasnokur31
‘red’ huure ehúre ‘to be bald’32
‘to make fun of someone’ raara irára iták ‘to joke’
‘to weave’ isitayki isitáyki (cf. Hokkaidō isítayki)
‘basket’ saranis saraníp (cf. Hokkaidō saránip)
‘to spread it out’ pirasa pirasá (cf. Hokkaidō pirása)
A synchronic approach, which takes the contemporary dialect of Yakumo as a
starting point, is not impossible: The words rúra, núman, pókor and pákasno also
occur in the Yakumo dialect, without prefixes, with the accent on the first syllable.
We know this is an exceptional accent shape in Ainu and the markedness of this
accent shape may have been so strong, that accent was maintained on the second
syllable in Yakumo even when a prefix was attached.
30 There are, however, also two examples where Yakumo has shifted the accent to the right even
though Sakhalin has a long vowel: Sakhalin ekaari, Yakumo ekarí ‘to go out to meet’, Sakhalin
esiina, Yakumo esiná ‘to hide something’.
31 See section 11.13 on the c/p correspondence between the Sakhalin Ainu form and the Yakumo
form.
32 Prefix e- + húre ‘red’. Compare also Yakumo sapahure (‘head is red’) ‘bald’.
In case of ikákuspe on the other hand, an initially accented form (i.e. one without
a prefix) does not exist, and yet accent is maintained on the second syllable in
Yakumo. These instances of accent on the second syllable in Yakumo are therefore
best explained as remnants in Hokkaidō of the vowel length distinctions in the
second syllable that we still find in Sakhalin: At the time when the rightward accent
shift occurred in Yakumo, vowel length still existed in the same location as in
Sakhalin, and accent was maintained on the second syllable if this syllable was
heavy, just as it was maintained on the second syllable if this syllable was closed.33
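The explanation given here for Yakumo can be sketched as a small Python function. This is an illustrative sketch under assumptions of my own: the function names are hypothetical, the input is a list of syllables in the Sakhalin form (unaccented romanization, long vowels written double), and the syllable division is supplied by hand. The sketch does not cover onomatopoeia and reduplications, nor the exceptions mentioned in the footnotes.

```python
VOWELS = set("aeiou")


def is_heavy(syllable):
    """A syllable counts as heavy if it is closed or has a long (doubled) vowel."""
    closed = syllable[-1] not in VOWELS
    long_vowel = any(
        a == b and a in VOWELS for a, b in zip(syllable, syllable[1:])
    )
    return closed or long_vowel


def yakumo_accent(syllables):
    """Predict the accented syllable (counting from 1) in Yakumo."""
    if is_heavy(syllables[0]):
        return 1                      # heavy first syllable attracts the accent
    if len(syllables) < 3 or is_heavy(syllables[1]):
        return 2                      # accent stays on a heavy second syllable
    return 3                          # open, short second syllable: shift to the right
```

For instance, `yakumo_accent(["ko", "ruu", "ra"])` gives 2 (korúra), while `yakumo_accent(["pi", "ra", "sa"])` gives 3 (pirasá).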
11.8.3 Vowel length in older Japanese sources of Hokkaidō Ainu
There are a number of older dictionaries and wordlists containing Ainu vocabulary
and sentences that have been compiled in Japan in the Edo period. If vowel length
distinctions did exist in older Hokkaidō Ainu, it is likely that the Japanese compilers
of these word lists would have recorded these distinctions, as Japanese has a
distinction between long and short vowels in syllables that have [H] pitch, as
well as in syllables that have [L] pitch. In this older Ainu material, vowel length can
indeed be found in certain Hokkaidō Ainu words. I will introduce material from the
oldest vocabularies (Matsumae no kotoba, Moshiogusa and Ezo kotoba irohabiki) in
the following sections.
11.8.3.1 Matsumae no kotoba (1626/1627)
The oldest vocabulary is Matsumae no kotoba (Satō 1998, 1999), containing 117
words. The author and exact date of compilation are not certain, but a date of around
1626 or 1627 is assumed. I have selected entries that appear to indicate vowel length
from this vocabulary, as well as one example where vowel length would have been
expected based on the modern Sakhalin and Hokkaidō reflexes.
12 Vowel length in Matsumae no kotoba
Entry Hokkaidō Ainu Sakhalin Ainu
reira (= reera) réra reera ‘wind’
riiko (= riikop) rikop34 x ‘star’
teita (= teeta) téta teeta ‘here’
tuukiu (= tuuki) túki tuuki ‘cup’ (<Japanese)
inetufu (= inep) ínep (Asahikawa inép) iineh ‘four’
33 In two of the cases above the retention of accent on the second syllable in Yakumo may also be
due to the fact that the second syllable was still closed at the time of the accent shift: korúra
goes back to *koruwra as it contains the word rú (*ruw) ‘road’ and ikákuspe goes back to
*ikawkuspe as it contains the word ka (*kaw) ‘surface, top’. It is also possible that the vowel +
semivowel sequences developed into long vowels before the accent shift in Yakumo took place.
34 This word (‘a high thing’) can only be found in Bihoro, where all accentual distinctions have
been lost. In all other dialects ‘star’ is nocíw. As the final consonant -p in Ainu is unreleased it
could occasionally be hard for a speaker of Japanese to hear.
rii35 rí rii ‘high’
tii36 (= cii) cí cii ‘to cook’
seu (= suw) sú suu, suwehe ‘pan’
The only case where we would have expected a long vowel in the first syllable but
do not find one is in inetufu ‘four’. We see, however, that there is also one Hokkaidō
Ainu dialect that has shifted the accent to the second syllable.
The spelling seu, which would be read as [So:] if the word were Sino-Japanese,
is surprising as the modern Hokkaidō Ainu word is su (in many dialects [Su]). It can
be seen from the possessive form in Sakhalin, that su should be reconstructed as
*suw. I suspect therefore that せう(seu) is a way to spell suw (seu being used instead
of suu in order to ensure pronunciation of u as a semivowel). This example is
therefore an attestation in Hokkaidō Ainu of a form that already had to be
reconstructed on the basis of the modern Sakhalin data.
As wen ‘bad’ is written as うゑん 37 (uwen) in Matsumae no kotoba (Satō,
1999:5), there is another example of the kana u being used to express w in this
dictionary, but I have not found a syllable-final example. (Unless we interpret tuukiu
‘cup’ as tuukiw.) In Moshiogusa on the other hand, Hokkaidō Ainu kéwtum ‘feelings,
mind, heart’ (Bihoro kéytum, Sakhalin kewtum) is written as ケウトモ (keutomo),
and Hokkaidō Ainu nocíw ‘star’ is written as ノチウ (notiu). (But síw ‘bitter’ is
written as シユウ siyuu.)
From the examples above we can see that monosyllables in this material are
automatically lengthened, and that in general vowel length in the first syllable
coincides with Sakhalin (except in case of ‘four’). There are, however, also a number
of unexpected cases of vowel length.
13 Unexpected cases of vowel length in Matsumae no kotoba
Entry Hokkaidō Ainu Sakhalin Ainu
atuhei (= appee) apé x ‘fire’
etuu etú (Sōya étu) etu ‘nose’
taguu (‘husband’) tán kúr x ‘that person’
The indicated vowel length in the last example is probably a mistake. According to
Satō tán kúr ‘that person’ is a way to refer to one’s husband in the Chitose dialect.
The final -u (う) of taguu is therefore probably a mistake for る. (Satō remarks on the
35 This example is the result of the following analysis of atairiiha ‘expensive’ by Satō: ataye
(poss. of atay < Jap. atai ‘price’) rii (‘high’) wa (stress).
36 This example is the result of the following analysis of tiiamamo ‘food’ by Satō: cii (‘to cook)
amam (‘rice’).
37 In Matsumae no kotoba the kana we ゑ and o お are used to express e and o, in Moshiogusa the
kana we ヱ and wo ヲ are used.
unusual use of nigori in the transcription of the Ainu word. This may have been a
way to express /nk/ in Ainu, as in northeast Japan intervocalic voiced stops are
pronounced as prenasalized voiced stops.)
The examples of ‘nose’ and ‘fire’ seem to indicate that whenever Hokkaidō has
[H] pitch, the dialect of Matsumae no kotoba automatically has vowel length. Even
today, accented vowels in open syllables are pronounced slightly longer in Hokkaidō
Ainu, and if the indicated vowel length simply coincides with [H] pitch in the
modern Hokkaidō Ainu dialects, this would be an indication that vowel length
distinctions had already disappeared: The length indicated in these older sources
would be no more than an automatic concomitant of [H] pitch in Ainu, which was
nevertheless recorded because the Japanese compiler(s) of the word list knew vowel
length differences independent of pitch distinctions in their own language. If,
however, we find vowel length only in those places that also have vowel length in
Sakhalin, this would mean that vowel length was still distinctive at the time when
this material was collected.
Because of etuu and appee, vowel length in Matsumae no kotoba appears to be
automatic in [H] pitched syllables, but the following examples show that this is not
the case.
14 [H] pitched syllables with short vowels in Matsumae no kotoba
Entry Hokkaidō Ainu Sakhalin Ainu
sake saké sake (‘new’) ‘sake’ (<Japanese)
siyaha38 sapá sapa ‘head’
This means that there has to be another reason for the vowel length indicated in the
forms for ‘nose’ and ‘fire’: Although all the Hokkaidō Ainu dialects have etúhu as
possessive form for ‘nose’, and Sakhalin has etuhu as well in the dialect dictionary,
according to Murasaki the possessive form in Sakhalin is etuyehe. This means that
etu probably has to be reconstructed as *etuy (or *etuw), and the long vowel
indicated in Matsumae no kotoba could indicate a semivowel -w.
As for ‘fire’, one possible explanation for the vowel length indicated in this form
has to do with the use of the kana tu in the spelling of this word: Usually the kana tu
is added before kana that start with h- in order to indicate syllable final -p. Examples
of this use of tu in Matsumae no kotoba are tiyetufu ‘fish’ (cép), titufu ‘boat’ (cíp),
retufu ‘three’ (rép), inetufu ‘four’ (inép) etc. Word-internally however, p is simply
written with a kana that starts with h- (see siyaha for sapa ‘head’), so the use of tu
38 In Moshiogusa as well, most cases of Ainu sa are written as シヤ sha. (Ainu s is palatalized to
different degrees in the different dialects.) Because the compiler of the wordlist no doubt
recognized sake as a Japanese word, the spelling in this case is サ instead of シヤ. (But in
Moshiogusa it is シヤケ.)
282 11 The accent of Japanese loanwords in Ainu
here is somewhat surprising. Was the kana i after tuhe added in order to prevent this
word from being pronounced as ap?
Another explanation could lie in the forms that Torii Ryūzō gives as the Kuril
Ainu forms, ape, abe and apoi. The last form suggests that the long vowel in
Matsumae no kotoba could be the spelling for original *apey. (Sakhalin has
unrelated unci, unci-hi for ‘fire’.)
Because of the lack of examples of vowel length in the second syllable in this
small collection of words, we cannot be absolutely certain, but the material in
Matsumae no kotoba appears to confirm Hattori’s theory that vowel length
distinctions like those that still exist in Sakhalin Ainu once existed in Hokkaidō
Ainu as well.
11.8.3.2 Moshiogusa (1792)
Moshiogusa (Narita, 1977) by Uehara Kumajirō dates from 1792. This is the most
famous and most extensive collection of older Hokkaidō Ainu material.
From this large dictionary I have selected only a small number of examples that
indicate vowel length, limiting myself to examples for which modern equivalents could
be found in the Ainu dialect dictionary.
15 Vowel length in Moshiogusa
Entry Hokkaidō Sakhalin
uuse (‘to heat up’) úsey, úsew uusey ‘warm water’
uurari úrar uurara ‘fog’
keera kéra keera ‘taste’
niirus nírus niirus ‘the gums’
tuuki túki tuuki ‘cup’ (<Japanese)
teire tére teere ‘to wait for’
siiratupu (‘eagle’) sírap x ‘hawk’
inuu inú inuu ‘to listen’ (intransitive)
uguu ukú x ‘to blow’
kunuu kunú kunuu ‘I listen’
nuu nú nuu ‘to hear’ (intransitive)
kaa ká kaa ‘thread’
kii kí x ‘louse’
siyuu sú suu ‘pot’
siyoo só eso ‘waterfall’
taa tá x ‘draw (water)’
tii cí, ciyé cii, ciyehe ‘penis’
too tó too ‘lake’
toobekeru (‘daybreak’)39 tó too ‘day’
39 Toobekeru can be analyzed as tó ‘day’ + pekér ‘become clear’.
The only examples of monosyllables with a short vowel that I have found are gu
‘bow’ (which also occurs with a long vowel in guuka ‘bowstring’) and ci ‘wasabi’,
and so I think we can say that in general monosyllables in this material are
automatically lengthened.40
As for the other examples of vowel length in this material, the places where
vowel length occurs appear to agree with the dialect of Sakhalin, just as in the material
of Matsumae no kotoba, although there are a few cases of vowel length where we
would not have expected it.
16 Unexpected cases of vowel length in Moshiogusa
Entry Hokkaidō Sakhalin
siyooya soyá x ‘bee’
tiyaasi cás (Bihoro) cas ‘to run’
There are also two cases where we would have expected to find vowel length but do
not find it.
17 Unexpected occurrence of short vowels in Moshiogusa
Entry Hokkaidō Sakhalin
pakari pákari41 paakari ‘to measure’ (< Japanese)
nisiyuu nísu niisu ‘mortar’
In the case of pakari, the compiler of the dictionary was probably aware of the fact that
this was a loanword from Japanese, and he may have been influenced by the fact
that Japanese has no vowel length in this word.
As for ‘mortar’, instead of in the first syllable where we would have expected it,
the vowel length in this word can be found in the second syllable. I have no
explanation for the short vowel in the first syllable, but if nisu is a compound of ni
‘tree, wooden’ and su (<*suw) ‘pot, container’ the long second vowel should
probably be read as -uw.
Finally, there is one example where the vowel length seems to go back to a
contraction, namely uusite ‘to pass along, inform’, which is not attested in
Sakhalin, but which is u/uste in the Saru dialect.
40 Two other examples of monosyllables with short vowels in this material are ta ‘in, at’ and ku
‘I’, but as these forms in practice never occur in isolation, they are probably not valid examples.
41 Nayoro is the only modern dialect that has pakári.
11.8.3.3 Ezo kotoba irohabiki (1848)
The last vocabulary is Ezo kotoba irohabiki (Satō, 1995), which was probably
compiled around 1848. The manuscript was originally owned by Itoya Kizaemon,
head of a fisherman’s cooperative in Otaru who died in 1904.
It is not impossible that Itoya himself collected the material in the dictionary, as
the many anecdotes that are still told about him among his descendants suggest that
he was a man of many interests who was on friendly terms with the Ainu (Satō,
1995:371).
18 Vowel length in Ezo kotoba irohabiki
Entry Hokkaidō Sakhalin
keeran (=keera an) kéra an keera/an ‘to be tasty’
huura húra hura ‘smell’
huure húre huure ‘red’
naahun náhun (Yakumo) x ‘just now’
tuuri tóri (Saru, Yakumo) x ‘to stay overnight’
paahau páhaw x ‘gossip’
numan núman nuuman ‘yesterday’
urari úrar uurara ‘mist’
nitae nítay niitayusi ‘forest’
arura42 rúra ruura ‘to carry’
emina emína emiina ‘to laugh at’
esina esína esiina ‘to hide’
onuman onúman onuuman ‘last night’
etoro etóro etooro ‘to snore’
tii cí, ciyé cii, ciyehe ‘penis’
bii pí, piyé~piyéhe pii ‘seed’
buu pú, púhu puu, puwehe ‘storehouse’
rei ré, réhe ree ‘name’
ruu rú, ruwé ruu ‘path, road’
pou pó, óho poo, pooho ‘child’
rii rí orii ‘high’
shii sí sii, siyehe ‘dung’
nii ní nii ‘to sip’
bei pé, péhe x ‘water’
guu kú, kuwé~kúhu kuu ‘bow’
The last eleven examples show that monosyllables in this material are automatically
lengthened. 43 (The only exception is pa ‘year’.) As for vowel length in the first
42 Arura consists of an indefinite personal prefix a- + rura.
43 The spellings ei and ou indicate e: and o: in Japanese and most likely do not mean that these
syllable of polysyllabic words, I have found six examples (‘to be tasty’, ‘smell’,44
‘red’, ‘now’, ‘to stay overnight’, ‘gossip’) where vowel length is indicated in places
where Sakhalin has a long vowel and Hokkaidō has accent on the first syllable.
There are three examples (‘yesterday’, ‘mist’ and ‘forest’) where no vowel
length is indicated even though it would be expected.45
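The comparison with Sakhalin can also be carried out mechanically. The following is a minimal sketch, not part of the original analysis: it assumes the romanized forms as given in (18), treats a doubled vowel letter as the only marker of length, and is restricted to five entries for which a Sakhalin form is listed. The function names are hypothetical.

```python
import re

def first_vowel_is_long(form):
    """Return True if the first vowel of a romanized form is written double (e.g. 'huure')."""
    run = re.search(r"[aeiou]+", form)
    if run is None:
        return False
    vowels = run.group()
    return len(vowels) >= 2 and vowels[0] == vowels[1]

# (entry in Ezo kotoba irohabiki, Sakhalin form), copied from (18)
PAIRS = [
    ("keeran", "keera"),
    ("huure", "huure"),
    ("numan", "nuuman"),
    ("urari", "uurara"),
    ("nitae", "niitayusi"),
]

def classify(entry, sakhalin):
    """Compare first-syllable length marking in an entry with the Sakhalin form."""
    if not first_vowel_is_long(sakhalin):
        return "no length expected"
    return "length written" if first_vowel_is_long(entry) else "length not written"

for entry, sakhalin in PAIRS:
    print(entry, sakhalin, classify(entry, sakhalin))
```

For this subset, the entries for ‘to be tasty’ and ‘red’ come out as "length written", and those for ‘yesterday’, ‘mist’ and ‘forest’ as "length not written".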
11.8.4 The development of distinctive pitch-accent in Hokkaidō Ainu
In older Hokkaidō Ainu materials we still find the kind of contrast in vowel length
that has been preserved in Sakhalin, and these materials therefore corroborate
Hattori’s theory. By the time of Ezo kotoba irohabiki however, the distinction had
disappeared, and the vowel length markings that we find in this material are
consistent with the modern Hokkaidō Ainu system.
Just as in Sakhalin Ainu, older Hokkaidō Ainu had a vowel length distinction in
the first and the second syllable of polysyllabic words only, where accent would fall
on the second syllable unless the first syllable contained a long vowel or was closed.
In such a language, the shift to a pitch-accent system is truly minor, as all that is
needed is for the vowel length distinction in the second syllable to be lost.
In a system where accent will fall on the second syllable, unless the first syllable
has a long vowel or is closed, and where vowel length no longer occurs outside of
the first syllable, the occurrence of vowel length is from then on determined entirely
by the location of the accent. If the first syllable is open, accent can fall either on the
first or on the second syllable, whereby accent falling on the first syllable is
accompanied by automatic lengthening of the vowel. (The historical development of
the Hokkaidō Ainu pitch-accent distinction from an original vowel length distinction
may be one of the reasons why accent on the first open syllable is still accompanied
by automatic lengthening of the vowel.)
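The accent rule described above is simple enough to state as a procedure. The sketch below is an illustration only: syllable boundaries are supplied by hand, long vowels are assumed to be written as doubled letters (as in the Sakhalin forms), and the function names are hypothetical.

```python
VOWELS = set("aeiou")

def is_heavy(syllable):
    """A syllable is heavy if it is closed or contains a long (doubled) vowel."""
    closed = syllable[-1] not in VOWELS
    long_vowel = any(a == b and a in VOWELS for a, b in zip(syllable, syllable[1:]))
    return closed or long_vowel

def accent_index(syllables):
    """0-based index of the accented syllable under automatic accent placement."""
    if len(syllables) == 1 or is_heavy(syllables[0]):
        return 0
    return 1

def shorten_noninitial(syllables):
    """Model the loss of vowel length outside the first syllable."""
    def shorten(s):
        # Drop a vowel letter that merely repeats the preceding vowel letter.
        return "".join(ch for i, ch in enumerate(s)
                       if not (i > 0 and ch in VOWELS and s[i - 1] == ch))
    return [syllables[0]] + [shorten(s) for s in syllables[1:]]

for word in (["nuu", "man"], ["sa", "pa"], ["kew", "tum"], ["o", "nuu", "man"]):
    later = shorten_noninitial(word)
    print("-".join(word), "->", "-".join(later),
          "accent on syllable", accent_index(word) + 1)
```

Because the first syllable is left untouched, the accent position is the same before and after shortening. What changes is that vowel length is now confined to accented initial syllables, so that it can be reinterpreted as a concomitant of the accent.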
Even though an accented vowel in the first syllable remains audibly longer (as is
the case in modern Hokkaidō Ainu and apparently also in the dialect of Ezo kotoba
irohabiki) this vowel length is a redundant feature, and in the modern Hokkaidō
Ainu spelling it is not indicated. As Japanese has vowel length distinctions
independent of pitch height, it was nevertheless picked up and recorded in most cases
by the Japanese compiler of Ezo kotoba irohabiki.
11.8.5 Vowel length in older sources of Kuril Ainu
A number of wordlists containing Kuril Ainu vocabulary and sentences have been
collected from the 18th century on. I will introduce material from these vocabularies
in the following sections.
syllables ended in semivowels.
44 The dialect of Ochiho on Sakhalin has huura according to Hattori and Chiri’s lexicostatistical
study.
45 In the case of ‘star’ (nociyu), Sakhalin (noociw) and Hokkaidō (nocíw) do not agree, and the entry in
Ezo kotoba irohabiki follows the Hokkaidō Ainu pattern.
11.8.5.1 Krasheninnikov (1738)
The oldest Kuril Ainu vocabulary is by Krasheninnikov, who took part in the
Second Kamchatka Expedition and collected material on Kamchatka from two
informants who came from the islands of Shumshu and Poromushir in July 1738.46
(These are the first and the second island off the coast of Kamchatka.)
The circumstances surrounding the compilation of this glossary remained
unclear for a long time, and this collection of words was therefore long
regarded as Kamchatka Ainu (Murayama, 1968).
19 Vowel length in the vocabulary of Krasheninnikov
Entry Hokkaidō Sakhalin
uuràr (‘cloud’) úrar uurara ‘fog’
keerà (= reerà) réra reera ‘wind’
kaanì káni kaani ‘metal’ (<Japanese)
nuuman núman nuuman ‘yesterday’
toopì tópe toope ‘milk’
áapu47 hápo x ‘mother’
onuumàn onúman onuuman ‘evening’
pı pé x ‘water’
ru rú ruu ‘road’
to tó too ‘lake’
to tó too ‘day’
sju sú suu, suwehe (poss.) ‘pot’
trivia48 rí orii ‘high’
Long vowels are unambiguously indicated by means of double vowel signs in
Krasheninnikov’s material, but there is rather an abundance of accent marks. The
many grave accents fall (almost without exception) on the last syllable (cf. ainù
‘person’, tapsùt ‘shoulder’). Acute accents are rare and when they occur they fall on
the first syllable, again almost without exception.
It is likely that acute accents indicate high pitch and grave accents low pitch. In
the following two examples, it appears as though acute and grave accents are used as
two different options to indicate [HL] pitch: kóntschi ‘hat’, kittschì ‘utensils made of
wood or leather’. It is not clear how reliable the accent markings are, as the
46 During the first half of the 18th century, Peter the Great and Catherine I sent several expeditions
into the Kurils, partly on the mistaken assumption that the islands contained precious ores
(Sargent, 1976:216).
47 Murayama has argued that Kuril Ainu had not lost initial h- but that Krasheninnikov failed to
hear the softer Ainu h- because of interference from Russian х-. See for instance the example of
‘fish eggs’: Klaproth/Steller hōmǎ, Dybowski xoma, Torii homa.
48 Trivia consists of trii ‘high’ + va stress.
following examples all have accent ([H] pitch) on the second syllable in the modern
Ainu dialects: etù ‘nose’, otà ‘sand’, kotàn ‘earth’, pasùi ‘spoon’ and kamùi ‘god’.49
Krasheninnikov’s material has vowel length in the first syllable of polysyllabic
words when Sakhalin has vowel length. Monosyllables and words with accent on the
second syllable in Hokkaidō but no vowel length in Sakhalin (apı ‘fire’, etù ‘nose’,
otà ‘sand’) are not written with long vowels. As there is one example of vowel
length in the second syllable in a location where Sakhalin also has vowel length
(onuumàn ‘evening’), it appears that there was a vowel length contrast in this
position, similar to the kind of contrast that can still be found in Sakhalin Ainu, but
this conclusion is based on no more than a single attestation.
Finally, there is the possibility that ı (as opposed to i) in this material indicates a
syllable ending in -y, as pı may have to be reconstructed as *pey based on
Yamamoto Tasuke’s material, and nı ‘tree’ (‘forest’) is attested as niyé in the
meaning of ‘wood’ in Horobetsu, indicating *niy. (And toopì ‘milk’ is a compound
of too ‘breast’ and pe (*pey) ‘water, sap, liquid’.)
11.8.5.2 Klaproth/Steller (1823/1743)
This material was published as ‘Kamchatka Ainu’ by Klaproth in Asia Polyglotta
(1823). The provenance of this ‘Kamchatka Ainu’ material (372 words) has been
uncertain. It is definitely not the same as the material collected by Krasheninnikov.
Murayama (1971) has argued that it is most likely material collected by the German
biologist Georg Wilhelm Steller when he visited Cape Lopatka and the northern
Kuril Islands in the months of May or June 1743.50
20 Vowel length in the vocabulary of Klaproth/Steller
Entry Hokkaidō Sakhalin
gânäh káni kaani ‘metal’ (<Japanese)
hūrăh húra hura51 ‘smell’
hūräh52 húre huure ‘red’
ŷhnäp ínep iineh ‘four’
rähra réra reera ‘wind’
ûrăr úrar uurara ‘fog’
49 Based on the fact that Krasheninnikov’s material contains indications of vowel length as well
as accent marks, Vovin (1993:66) reconstructs distinctive vowel length as well as distinctive
pitch-accent in Kuril Ainu. Such a conclusion is premature as Krasheninnikov’s accent
markings may very well indicate the kind of automatic accent placement that characterizes the
dialect of Sakhalin.
50 Klaproth’s Sakhalin Ainu material (280 words) was adopted from Davidov (1812). Asia
Polyglotta also includes the Hokkaidō (Ezo) dialect (80 words).
51 According to the lexicostatistical study the dialect of Ochiho on Sakhalin has huuraha rah ‘to
smell (a smell)’.
52 As in: hūräh-gāhnäh ‘copper (red metal)’.
Entry Hokkaidō Sakhalin
dōpĕh tópe toope ‘milk’
pōrŭh póru x ‘cave’
dōhnŭ (‘judge’) tonó tono ‘government official’ (< Jap.)53
ōhnūmă onúman onuuman ‘evening’
ipākar (‘will, volition’) x ipaakari ‘to think’
sūh sú suu, suwehe ‘pot’
pāh pá, páha paa ‘year’
rūh rú ruu ‘road’
dōh tó too ‘day’
ruh (‘body hair’) x saparuwe ‘hair of the head’
do tó too ‘breasts’
nyh ní54 nii ‘tree’
pĕh pé x ‘water, sap, liquid’
kùh kú, kuwé kuu ‘bow’
Klaproth uses Roman script except for the signs ч and ш (represented by Murayama
as č and š). Murayama thinks that the reason for this is that these sounds would have
been cumbersome to write in his native German (tsch and sch). He may even have
replaced tsch and sch in Steller’s original manuscript with these Cyrillic letters
(Murayama 1971:46).
At first it may appear as though a vowel length distinction in the first and second
syllable of polysyllabic words has been preserved in this material, and that there is a
vowel length contrast in monosyllables. The word ruh ‘body hair’ (in Sakhalin and
Hokkaidō only used in compounds) even seems to form a minimal pair with rūh
‘road’, but in rūhtŭh ‘hair’ it suddenly appears with a long vowel. Similarly, do ‘the
breasts’ seems to form a minimal pair of sorts with dōh ‘day’, but in dōpĕh
‘milk’ it suddenly appears with a long vowel.
It is probably useless to ponder these differences too much. Where
Krasheninnikov’s material was at least unambiguous in the marking of vowel length,
Steller/Klaproth’s material is drowned in an abundance of diacritics (a ā â ă, y ŷ ў î ï
ĭ, u ū ŭ û ù, ä ē ĕ, o ō ŏ) which all suggest differences in vowel length or accent.
(The letter h may also have been meant to indicate vowel length in some cases like
rähra ‘wind’, or perhaps was even used to mark remnants of former semivowels.)
These diacritics and other spelling devices appear to have been applied without
much of an underlying system: nôhk ‘egg’ is the same word as nōk ‘testicles’, mohs
‘flea’ is no doubt the same word as mōhs ‘mosquito’ and pōhnĕ ‘fin (of a fish)’ is
probably the same word as pŏŏnh ‘bone’, especially as all modern Ainu dialects
53 Although the accent of this word in isolation in Ainu is tonó, I will argue in section 11.11.3 that
originally, this form had initial accent in Hokkaidō and a long vowel in the initial syllable in
Sakhalin.
54 With the possessive niyé in Horobetsu.
have a completely unrelated word for ‘fin’: Hokkaidō mókrap, Sakhalin mohrah.
Closed syllables are sometimes written with long vowels and sometimes with short
vowels: dēk ‘hand’, čār ‘mouth’, čep ‘fish’, čip ‘boat’, but it would be unwise to
posit a vowel length distinction in closed syllables in Kuril Ainu on the basis of such
unsystematic material.
My conclusion is that the vowel length attestations in this material serve to
confirm the fact that such distinctions existed in Kuril Ainu in the 18th century, as is
evident from the more reliable material of Krasheninnikov. It would be dangerous
however, to identify specific cases of vowel length in Kuril Ainu based on this
material. In the examples in (21) for instance, the indicated vowel length does not
agree with any of the modern Ainu dialects.
21 Examples where vowel length in the vocabulary of Klaproth/Steller does not
agree with the modern dialects
Entry Hokkaidō Sakhalin
aïnūh áynu aynu ‘person’
ōmōmpĕh omunpe oponpe ‘trousers’55
pŏŏnh poné poni ‘bone’ (<Japanese)
sākў saké sake ‘liquor’ (<Japanese)
ŝēdǔr setúr seturu ‘back’
gsāhr kisár kisaru, kisara ‘ear’
bāikǎr páykar paykara ‘spring’
bōrŭ (‘big’) poró poro ‘big’
kîhgĭr kikír kikiri ‘insect’
šîpŭnŭă (‘salty’) síppo sispo ‘salt’ (<Japanese)
āpĕh apé x ‘fire’
rēkŭt rekút rekuh ‘neck’
rērăr rerár reraru ‘chest’
pŏŏnh, pōhnĕ poné poni ‘bone’ (<Japanese)
11.8.5.3 Nineteenth century sources of Kuril Ainu
There are three sources of Kuril Ainu stemming from the 19th century. The (perhaps)
oldest is a source of Kuril Ainu material by an unknown author preserved in the I.
Voznesenskij collection in the archives of the Academy of Sciences in Leningrad,
provided by Vovin. Vovin suspects that this glossary was compiled by a
Cossack officer sometime before 1843, but it is not known which dialect (or
dialects?) is represented.
55 Ainu omunpe is definitely not a loan from Japanese as it can be analyzed as om-un-pe
‘something (worn around) the thighs’. (The initial vowel has been reinterpreted as the polite
prefix o- in Japanese.) The form omonpe which can also be found in Hokkaidō Ainu is
probably influenced in turn by Japanese.
According to Vovin, this material shows evidence of vowel length distinctions
(1993:66), but as there is no more evidence for vowel length in the Voznesenskij
material than a single attestation of ‘tree’ as nij, I cannot agree.56
Next there is material collected by the Polish doctor Dybowski from an
informant from Shumshu, when he was on Kamchatka from 1879 to 1883.
Dybowski’s material was published by Radlinski (1892). According to Vovin, this
material too, preserved vowel length distinctions. It is true that Dybowski’s large
collection of Shumshu dialect material contains more attestations of vowel length
than the Voznesenskij material, but I do not think that a mere dozen cases (of which
a number can be explained as contractions or vowel + semivowel sequences) in a
dictionary of almost 2000 entries can count as sufficient proof for the existence of
distinctive vowel length.57
Finally there is the material of Torii Ryūzō included in the Ainu dialect
dictionary, which contains no indications of vowel length. The material agrees with
Dybowski’s material in this respect, which was to be expected, as Murayama has
pointed out that Torii and Dybowski collected material from the same dialect in
roughly the same period.
The following list may serve as an illustration of how the vowel length
distinctions that could still be found in the material of Krasheninnikov had
disappeared from Kuril Ainu by the time the material of Dybowski, Torii and the
Voznesenskij collection was collected:
– ‘mother’: Krasheninnikov aapu, Dybowski aapu but also apu
– ‘fog’: Sakhalin uurara, Krasheninnikov uurar, but Torii urarube and Dybowski
urar
– ‘metal’: Sakhalin kaani, Krasheninnikov kaani but Dybowski kani, Voznesenskij
kane58
– ‘wind’: Sakhalin reera, Krasheninnikov keera (=reera), but Dybowski rer, Torii
re’ra, reara, Voznesenskij rera
56 This single attestation may reflect a semivowel rather than vowel length: In Horobetsu, where
the word ní in the meaning of ‘wood’ has a possessive form, this form is niyé, indicating that
‘tree’ should be reconstructed as *niy.
57 The complete list of examples is: suu ‘to boil’ (Hokkaidō suwé and suyé ‘to cook’ indicates
*suw. Cf. ‘pan, cooking pot’ *suw), mii ‘to wear’ (Sakhalin imii, imiyehe (poss.) ‘clothing’
indicates *miy), piip ‘to be fat’ (this attestation may be related to the fact that Raichishka has
piye ‘to grow fat’, indicating *piy) and poo ‘son’. Furthermore aayni ‘to sit down’, aana ‘a
kind of duck which dives in the water’, aapu ‘mother’, aatkari ‘to knot together’, nisaatno
‘early in the morning’, kioo ‘dirty, lousy’ (probably from ki ‘louse’ +?). The following vowel
length attestations have been analyzed by Murayama and can be shown to go back to
contractions: niikiri ‘trees’ = ni ‘tree’ + ikiri ‘many’ (or even niy-ikiri?), ikasooduk ‘to be
overgrown (with plants)’ = i-ko-soy-otuk. The same may be true for yoopunu ‘to harvest’ and
yootraski ‘to add’.
58 I have not adopted Vovin’s practice of transcribing Russian e as ie (as opposed to Russian э,
which is transcribed as e), because the letter э is never used in Voznesenskij’s material.
– ‘yesterday’: Sakhalin nuuman, Krasheninnikov nuuman, but Torii and Dybowski
numan
– ‘evening’: Sakhalin onuuman, Krasheninnikov onuuman, but Voznesenskij
onuman ‘evening’, Dybowski onumonan (= onuman an ‘in the evening’
according to Murayama)
– ‘high’: Sakhalin o-rii, Krasheninnikov trii-va, but Dybowski ribi, Torii ri,
Voznesenskij trichingi ‘higher’
The evidence for Kuril Ainu is not overwhelming but sufficient to make it plausible
that the Kuril Ainu dialects had a vowel length distinction in the 18th century, which,
just as in the Hokkaidō Ainu dialects, was lost in the 19th century.
11.8.6 Influence from Japanese
The influence that the Japanese phonological system exerted on the Ainu language
must have been quite strong: Even without much prior knowledge of the Ainu
language it is possible for a person with a background in Japanese to transcribe a
recording of Hokkaidō (or Sakhalin) Ainu without too many mistakes, whereas an
Uilta recording from Sakhalin sounds almost completely unintelligible.
In Hokkaidō Ainu it is mainly the unreleased final consonants that sound exotic,
and in Sakhalin the syllable final -h, as the vowel system and even details like [h] >
[ɸ] / _[u] and [t] > [tʃ] / _[i] in both dialects agree with Japanese. The fact that the
sequences *yi and *wi are not allowed (although they were in earlier stages of the
language) also agrees with Japanese.
Although the phonology of Sakhalin Ainu and Hokkaidō Ainu both resemble the
phonology of Japanese, only Hokkaidō Ainu, which had much more direct contact
with Japan, developed distinctive pitch-accent which replaced the earlier vowel
length distinction. The process by which the shift from distinctive vowel length to
distinctive accent placement took place (loss of vowel length in the second syllable)
can be linked to influence from Japanese, as in Japanese vowel length distinctions
are strongly reduced outside of the first syllable. The period in which this change
occurred in Hokkaidō (the early 19th century) also suggests that it was the result of a
growing influence of Japanese on the Ainu language.
The Matsumae-han, the feudal domain that had ruled over the southern part of
Hokkaidō from the 15th century on, had restricted contact of its subjects with the
Ainu, but a growing influence of Japanese settlers had already set in by the late 17th
century. The threat that this posed to the livelihood of the Ainu had led to the
Shakushain rebellion of 1669, which was eventually put down in 1672, a blow from
which the Ainu never recovered.
In 1799 however, the presence of Russian merchants and explorers off the
northern shores of Japan, as well as a major rebellion among the Kuril Ainu, led the
Tokugawa government to take control of Hokkaidō (including the Kuril islands of
Kunashir and Iturup) from the jurisdiction of the Matsumae-han.
From this moment on, Hokkaidō was under direct rule of the Tokugawa
government, which dispatched large numbers of samurai and government officials.
The colonization of Hokkaidō by the central government therefore began well
before the Meiji Revolution of 1868. It is this policy that resulted in the most
extensive influence of the Japanese language on Hokkaidō Ainu.
It is uncertain whether the disappearance of vowel length distinctions from the
northern Kuril dialect of Shumshu at the end of the 19th century is linked to
influence from Japanese.59 Even though the northern Kurils came under Japanese
rule only in 1875, Dybowski’s material, which was collected from 1879 to 1883,
already shows loss of vowel length distinctions.60 It has to be borne in mind however,
that the vowel length distinction in Ainu was already quite marginal, and it is
possible that it was lost independently in Hokkaidō and the Kurils.
11.9 Vovin’s reconstruction of proto-Ainu phonological structure
and tone
Hattori proposed the idea that proto-Ainu had a distinction between long and short
vowels, but that accent placement was automatic. Accent fell on the second syllable
unless the first syllable was closed or contained a long vowel. This reconstruction
was based on the correspondence between the occurrence of vowel length in the
initial syllable in Sakhalin and accent on the initial syllable in Hokkaidō, and on the
fact that in both dialects the initial syllable is automatically accented if closed.
Vovin on the other hand, dismisses any systematic relation between syllable
structure and accent placement:
Hattori Shirō proposed that PA had distinction between long and short
vowels but no pitch-accent, that is SAKH (Sakhalin) dialects reflect PA
almost completely, while pitch-accent in HKD (Hokkaidō) dialects is an
innovation (Hattori 1967). Hattori’s conclusion is based on the comparison of
only two dialects – SA (Saru) from HKD and RA (Raichishka) from SAKH
and on the assumption that high pitch in SA mostly corresponds to long
vowel in RA. However, there are strong limitations to the distribution of long
vowels in RA – they can occur only in open syllables; and in the
overwhelming majority of cases only in the first syllable of the word, while
high pitch may characterize both open and closed syllables. In addition high
pitch in HKD may correspond not only to long vowels in RA, but also to
short ones in open syllables, e.g. HKD ya H, RA yaa ‘net’, but HKD ya H,
RA ya=qunsiri ‘dry land’, ‘shore’. 61 Similarly, long vowels in RA may
59 It seems that at that time, the Ainu language had already died out in the southern Kurils. The
southern Kuril islands had been under Japanese rule since 1799, and in 1808 over 1000 troops
from the Sendai-han were stationed on Kunashir and Iturup.
60 Unfortunately, the date of collection of the material included in the Voznesenskij collection is
unclear.
61 According to Murasaki on the other hand (1976: 230), ‘dry land’ is yaa in Raichishka. The
correspond not only to a high pitch in HKD, but also to a low one, e.g. Y
(Yakumo) tuki HL, RA tuuki ‘sake cup’, but Y tuci LH, RA tuuci ‘large
wooden hammer’.62 (Vovin, 1993:65)
As a matter of fact however, Hattori never claimed that [H] pitch in Hokkaidō
‘mostly corresponds to long vowels in Sakhalin’ (a claim which could indeed easily
be refuted). Hattori’s claim was much more specific: [H] pitch on an initial open
syllable in Hokkaidō corresponds to a long vowel in Sakhalin. Apart from the rare
exceptions mentioned in section 11.6.1, namely ‘smell’, ‘old man’, and ‘husband’ –
and even these have all been attested with regular correspondences as well – accent
will be on the second syllable in Hokkaidō as long as there is no closed initial
syllable, or no vowel length in the initial syllable in Sakhalin.
Based on a single attestation, Vovin also calls into question the rule that the first
syllable is automatically accented if closed, so that any connection between
segmental shape and accent placement is dismissed. 63 As a consequence Vovin
reconstructs distinctive vowel length and distinctive tone independently from each
other in proto-Ainu.
Vovin uses the term ‘pitch-accent’ for his reconstruction of the prosodic system
of proto-Ainu, but as he reconstructs an opposition between /H/ and /L/ in
monosyllables, and an opposition between /HH/, /HL/, /LL/ and /LH/ in disyllables, his
reconstruction should definitely be referred to as a (register) tone system.
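The inventory that Vovin reconstructs is exactly what one obtains if every syllable freely bears either /H/ or /L/. A brief sketch of that enumeration follows; it illustrates the argument and is not Vovin's own procedure, and the function name is hypothetical.

```python
from itertools import product

def free_tone_patterns(n_syllables):
    """All patterns that arise if each syllable independently bears H or L."""
    return ["".join(p) for p in product("HL", repeat=n_syllables)]

print(free_tone_patterns(1))  # ['H', 'L']
print(free_tone_patterns(2))  # ['HH', 'HL', 'LH', 'LL']
```

Under Hattori's automatic accent placement, by contrast, the pitch pattern of a word follows from its segmental shape and adds no lexical contrast of its own.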
dialect dictionary also lists Raichishka yaa-ta ‘to the shore’ with the locative postposition ta (cf.
Hokkaidō yá-ta) which Vovin does not quote. As for ya/unsiri, when followed by a vowel in a
compound, monosyllables in Raichishka are often shortened. See yee ‘pus’, ye/oo ‘to fester’,
poo ‘child’, po/utarikehe ‘descendants’. Open monosyllables that go back to syllables ending in
a semivowel in proto-Ainu, on the other hand, seem to retain the vowel length in compounds,
even when followed by another vowel: nii/ay ‘thorn’ (<*niy ‘tree’), suu/ohpe ‘pot
hanger’ (<*suw ‘pot’). When followed by a consonant in a compound, vowel length is
preserved, regardless of whether the monosyllable was originally open or ended in a
semivowel: nuupe ‘tear’ (< nuu ‘to well up’ (<*nuw, based on Yamamoto Tasuke’s minimal
pairs)), niisew ‘nut, acorn’ (< *niy ‘tree’), kuukew ‘shoulder’ (<*kuw ‘bow’), kiinuh ‘grass plain’
(Murasaki 1975:167) (< kii ‘grass’), yeenuu ‘pus oozes’ (< yee ‘pus’), niirus ‘the gums’ (< nii
‘to knead’ + rus ‘leather’), pookor ‘to give birth’ (< poo ‘child’). But there are exceptions:
Sakhalin kina ‘edible weeds’ (<kii ‘grass’). N.B: Hokkaidō kiná ‘grass’ and kínup ‘grass plain’
reflect the presence or lack of vowel length in Sakhalin.
62 Yakumo and Obihiro both have tutí. As mentioned in section 11.6.1, there are examples where
some of the Hokkaidō Ainu dialects have moved the accent to the unmarked position (second
syllable) despite the fact that Sakhalin has vowel length in the initial syllable. Moreover, the
word may have to be reconstructed as *tuutuy in proto-Ainu, with a closed second syllable (cf.
section 11.6). Vovin does not mention the fact that Horobetsu (tútci), Saru and Sōya (túci) all
have the regular correspondence of accent on the first syllable. (N.B. both ‘sake cup’ and ‘large
wooden hammer’ are loanwords from Japanese. See also section 11.11.3.)
63 The example is kakká ‘vulva’ (with accent on the second syllable) which has only been attested
in the Saru dialect. The informant actually dismissed the word as ‘dialect’, which could have
been out of a form of shyness. (The same informant used cikappo ‘little bird’ for ci ‘penis’, and
tamanko (< Japanese tamago) for nok ‘egg’, as nok may also refer to ‘testicles’.)
294 11 The accent of Japanese loanwords in Ainu
In the following sections I will discuss Vovin’s reconstruction of the proto-Ainu
tone system. The main question that has to be answered is whether the Hokkaidō
Ainu dialects show differences in accent placement that are systematic enough to
justify the reconstruction of the much richer set of tonal oppositions that Vovin
proposes for proto-Ainu.
Vovin divides his reconstructed proto-Ainu tone classes into groups based on
their length in moras. One result of this is that examples that have accent on the first
syllable in Hokkaidō, but happen not to be attested in Sakhalin, are automatically
separated from examples that I would see as belonging to the same type, but which
happen to have been attested in Sakhalin with long initial vowels. We can discuss
Vovin’s reconstructions just as well when we rearrange them based on their length
in syllables instead of moras. I have therefore rearranged Vovin’s reconstructions in
groups based on their length in syllables in order to avoid separating disyllabic
words into a two-mora and a three-mora group, before agreeing on whether this
division is justified or not.
11.9.1 Monosyllables
Vovin reconstructs two tone classes for monosyllabic words. Although in isolation
the monosyllabic stems in Sakhalin and Hokkaidō are always accented (i.e. have [H]
pitch), Vovin’s two different classes are based on the fact that some words have [H]
pitch, while others have [L] pitch in compounds, or when a possessive suffix is
attached.
11.9.1.1 Proto-Ainu */H/
All the examples of the proto-Ainu */H/ tone class that Vovin reconstructs based on
compounds have to be dismissed, as the accent placement in all his examples is
determined by the segmental structure: When accent placement is not free, it cannot
be used to reconstruct */H/ tone as opposed to */L/ tone in proto-Ainu.64
64 The examples are nis ‘sky’, pis ‘seashore’ (which do not have possessive forms) and ray ‘to
die’, because these words have [H] pitch in the following compounds: nískoton ‘sky’, the
transitive ráyke ‘to kill’ and pís-ke (pis + ke, a suffix which sometimes attaches after words
with a locational meaning without an apparent difference in meaning) and the form e-pís-un
(lit: ‘to the beach’) in Obihiro. However, closed syllables like nis in nískoton, ray in ráyke and
pis in píske are automatically accented. In e-pís-un the accent also falls automatically on pis,
as pis in this word happens to be the second syllable of a word with an open initial syllable.
(See also the earlier example of sitóma ‘to fear’ and isítoma ‘to be afraid’.) One example is
more complicated: ra ‘down’ is reconstructed with */H/ tone by Vovin because this word has
[H-L] pitch in Hokkaidō combined with the locative postposition ta in combinations like rá-ta
‘downward’, and [H-L] pitch in Sōya with the adverbial suffix -wa in rá-wa. In this case, the
initial open syllable remains [H] even when a second syllable is attached because historically
this word ended in a semivowel, which has been preserved in Sakhalin: raw ‘down’, rawta
‘downward’. In addition, the word has been attested as rauta ‘bottom’ in Ezo kotoba irohabiki
(1848) and as ravda in Dybowski’s Kuril Ainu material. Compare this to the earlier example of
Hokkaidō yá-ta, Raichishka yaa-ta ‘to the shore’. In this case, the [H-L] pitch in Hokkaidō was
11.9 Vovin’s reconstruction of proto-Ainu phonological structure and tone 295
The single remaining member of Vovin’s proto-Ainu monosyllabic */H/ tone
class (*ra ‘liver’) happens not to have been attested in Sakhalin, and – on the basis
of this coincidence – is reconstructed with a short vowel by Vovin. There are many
more examples with identical accentual reflexes and syllable structure in Hokkaidō,
but as these happen to have been attested in Sakhalin (where monosyllables are
automatically lengthened) they are reconstructed with a long vowel by Vovin.
For unknown reasons – after all, there is no monosyllabic two mora */HH/ class
to contrast with it – this last group is reconstructed with */HL/ instead of */HH/ tone
by Vovin. It is clear however, that these examples belong in one group with ‘liver’,
which is how I present them in (22).
22 Open monosyllables are reconstructed with */H/ tone
Vovin’s proto-Ainu Modern accentuation
*ra (*da?) /H/ ‘liver’ rá, rá-ha
*dEE65 /HL/ ‘name’ ré, ré-he (cf. Sakhalin ree, Nairo tee)
*pOO /HL/ ‘child’ pó, pó-ho (cf. Sakhalin poo-ho)
*puu /HL/ ‘storehouse’ pú, pú-hu (cf. Sakhalin puu-wehe)
Vovin’s reconstruction of */H/ tone (or */HL/ tone) as opposed to */L/ tone in these
examples is based on the fact that the accent remains on the first syllable in
Hokkaidō when the possessive suffix is added, even though the nouns now have a
CVCV shape. As Hattori has pointed out however, this is the regular reflex in
Hokkaidō of vowel length in Sakhalin. It does not constitute a basis for the
reconstruction of */H/ tone as opposed to */L/ tone in proto-Ainu.
As has been discussed in section 11.2, it is even possible that the possessive form
of these monosyllables in Hokkaidō Ainu should be analyzed as /CVh/, as the
automatic vowel copy after /h/ may not be a phonological vowel. As monosyllables
are automatically accented, this is another reason why these forms cannot be used to
reconstruct a */H/ tone class in proto-Ainu.
11.9.1.2 Proto-Ainu */L/
Vovin’s */L/ tone class consists entirely of nouns with closed syllables, to which the
possessive suffix -i or -u is added, resulting in a disyllabic possessive form with a
CVCV shape, so that the first syllable is now open. Words of this shape have accent
on the second syllable (unless, of course, there was vowel length in the initial
syllable in proto-Ainu, preserved in Sakhalin, which is not the case). As accent on
the second syllable in Hokkaidō is the expected reflex, the [L] pitch of the initial
the regular reflex of vowel length in Sakhalin.
65 In connection with the variation between -i and -u in the possessive suffix, which is due to a
rather limited type of vowel harmony, Vovin reconstructs two types of vowels for o (o and O),
e (e and E) and a (a and A) in the proto-language.
syllable does not constitute a basis for the reconstruction of distinctive */L/ tone in
proto-Ainu.
23 Closed monosyllables are reconstructed with */L/ tone
Vovin’s proto-Ainu Modern accentuation
*trAp /L/ ‘feather’ ráp, rap-ú (cf. Sakhalin rah, rap-uhu)
*gum /L/ ‘noise’ húm, hum-í, Sōya hum-íhi (cf. Sakhalin hum-ihi)
*tEk /L/ ‘hand’ ték, tek-é (cf. Sakhalin teh, tek-ihi)
*nit /L/ ‘handle’ nít, nit-ú (cf. Sakhalin nis, nic-ihi)
*nAn /L/ ‘face’ nán, nan-ú, Sōya nan-ú/-úhu (cf. Sakhalin nan-uhu)
There are two examples of words with a (seemingly) open syllable structure that
Vovin includes in his */L/ (two mora) group: ci ‘penis’ and ku ‘bow’, which have
[LH] pitch in the possessive forms: ciyé and kuwé. The possessive forms however
reveal that the underlying segmental shape of these nouns is *ciy and *kuw, i.e. that
they have closed syllables, just as the other examples.
In syllable-final position, the combinations iy and uw are not allowed anymore in
Ainu, but -y and -w reappear when the possessive suffix -i is attached: ciyé and kuwé.
(The -e is a lowered -i as the combinations yi and wi are not allowed anymore
either.)66
Vovin sees the accent of cikáppo ‘penis’ in the Saru dialect as a confirmation of
his reconstruction of proto-Ainu */L/ tone for ci. Cikáppo however, is not a cognate
of ci ‘penis’ but a diminutive of cikáp ‘bird’. Its use for ‘penis’ in the Saru dialect is
a euphemism or nickname. (An association between ‘bird’ and ‘penis’ is also known
from other languages.)
In the Saru dialect the form kú appears to have been reevaluated. It is now
sometimes treated as a normal open-syllable stem, so that in this dialect the
possessive forms kúhu as well as kuwé can be found.67
There are more members of this small group of words which ended in
semivowels in proto-Ainu, and a few have already been mentioned:
– ru ‘line, traces, footmarks’ (not included in Vovin’s */L/ group) has the
possessive form ruwé in Hokkaidō and ruu-wehe in Sakhalin which means that
this word has to be reconstructed as *ruw.
– pi ‘seed’ (not included in Vovin’s */L/ group) has the possessive forms piyé,
piyé-he in Hokkaidō, which means that this word has to be reconstructed as *piy.
– pu ‘storehouse’ is pú-hu in Hokkaidō (and was therefore included in Vovin’s
*/H/ group), but it is puu-wehe in Raichishka. (The reduplication suffix -hV has
been attached after the lowered *i > e of the possessive suffix.) The Raichishka
66 See Nakagawa (1983:198) on the analysis of -e after y- and w- as lowered -i.
67 It is significant that in this case kuwé (with the lowered -i suffix) is [L-H] while kúhu (with
the -hV suffix) is [H-L], offering strong support for the /-h/ analysis.
form shows that this word has to be reconstructed as *puw. It has however been
reanalyzed as an open syllable stem in Hokkaidō, just as kú, kú-hu ‘bow’ in the
Saru dialect.
– su ‘cooking pot’, which has no possessive form in Hokkaidō (and is therefore not
included in Vovin’s */H/ group), has the possessive form suu-wehe in
Raichishka, showing that this noun has to be reconstructed as *suw.
– ni ‘tree’ (not included in Vovin’s */L/ group), has a possessive form niyé in
Horobetsu, indicating that ‘tree’ should be reconstructed as *niy.
– mi ‘clothing’ is imii with the possessive form imiyehe in Sakhalin, indicating that
this noun (which is included in Vovin’s */L/ group) has to be reconstructed as
*miy.
Finally, the words attested with short vowels in Yamamoto Tasuke’s dialect may
also be added to this group. The conclusion from the list above can only be that the
location of the accent in the possessive forms of monosyllabic nouns is determined
by their (underlying) segmental shape. The resulting pitch pattern can therefore not
be used as a basis for the reconstruction of distinctive */H/ or */L/ tone in
monosyllables in proto-Ainu.
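The argument of this section can be checked mechanically against the forms in (22), (23) and the semivowel-final stems listed above. The sketch below is illustrative only: the function names are hypothetical, the underlying stems follow the analysis argued for here (not Vovin's), and pu ‘storehouse’ is left out because its Hokkaidō form has been reanalyzed.

```python
import unicodedata

PLAIN = "aeiou"
ACUTE = "áéíóú"


def accented_syllable(form: str) -> int:
    """1-based index of the syllable carrying the acute accent.

    Assumption: every vowel letter in `form` is its own syllable nucleus,
    which holds for the Hokkaidō forms used below.
    """
    count = 0
    for ch in unicodedata.normalize("NFC", form):
        if ch in PLAIN or ch in ACUTE:
            count += 1
            if ch in ACUTE:
                return count
    raise ValueError(f"no accent mark in {form!r}")


def predicted_possessive_accent(underlying_stem: str) -> int:
    """Open stems keep initial accent; consonant- or semivowel-final stems
    yield a CVCV possessive with accent on the second syllable."""
    return 1 if underlying_stem[-1] in PLAIN else 2


# (underlying stem, attested Hokkaidō possessive form)
DATA = [
    ("ra", "rá-ha"), ("re", "ré-he"), ("po", "pó-ho"),
    ("rap", "rap-ú"), ("hum", "hum-í"), ("tek", "tek-é"),
    ("nit", "nit-ú"), ("nan", "nan-ú"),
    ("ciy", "ciyé"), ("kuw", "kuwé"), ("ruw", "ruwé"), ("piy", "piyé"),
]

mismatches = [
    (stem, poss)
    for stem, poss in DATA
    if predicted_possessive_accent(stem) != accented_syllable(poss)
]
print(mismatches)
```

An empty list of mismatches means that, for these forms, the segmental shape alone is enough to predict the accent of the possessive.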
11.9.2 Disyllables
11.9.2.1 Proto-Ainu */HH/
Vovin reconstructs this proto-Ainu tone class based on the following accentual
correspondences between the dialects: “PA two-mora high prototonic class HH has
fallen together with oxytonic class LH in SO (Sōya) and with low prototonic class
HL in Y (Yakumo) SA (Saru) and N (Nayoro). In the other dialects, words of this
class can belong either to low prototonic (HL) or to oxytonic (LH) class.”
24 The basis for the reconstruction of */HH/ tone according to Vovin
Proto-Ainu Sōya Yakumo Saru Nayoro Other
*/HH/ [LH] [HL] [HL] [HL] [HL] or [LH]
The reconstruction of this class therefore depends on the [LH] reflex in Sōya, in
contrast with the [HL] reflexes in Yakumo, Saru and Nayoro, as the expected reflex
in the other dialects is kept very vague. The examples are as in (25).
25 Disyllables reconstructed with */HH/ tone
Vovin’s proto-Ainu Modern accentuation
*Erum /HH/ ‘mouse’ érum in general, but Saru érmu, Obihiro and
Sōya erúmun (cf. Sakhalin erumu)
*gura /HH/ ‘smell’ húra in general, hurá in Sōya (cf. Sakhalin
hura)
*gaa(=)pO /HH(-)L/ ‘mother’ hápo in general, hapó in Sōya
*opsOr /HH/ ‘bosom’ úpsor in general, osór in Sōya
*puri /HH/ ‘custom’ (< Jap. huri 2.1) púri in general but Horobetsu, Asahikawa
and Sōya purí
*kaani /HHH/ ‘metal’ (< Jap. kane 2.1) káni in Yakumo, Horobetsu, Asahikawa,
Nayoro, káne in Saru, but kaní in Obihiro,
Sōya (cf. Sakhalin kaani)
I will discuss the examples in the order in which they are given in (25).
‘Mouse’: It is unclear how the forms érum, érmu, erúmun and erumu relate to
each other. The fact that one form (érum) always has the accent on the first syllable,
while the next form (erúmun) always has accent on the second syllable, and never
the other way around, makes me doubt whether we can simply equate the two forms.
Krasheninnikov’s Kuril Ainu wordlist (1738) with ermù and Steller’s Kuril Ainu
wordlist (1774) with ärmǔh (Murayama, 1971) both agree with the Saru dialect. If
érmu is original and developed into érum through reanalysis as a phoneme (and
metathesis) of the vowel that automatically appears after final -r (Sakhalin for
instance also has erumu), the initial [H] pitch in this form could be explained as a
remnant of an originally closed syllable. (Just as the [H] pitch of ra (< raw) ‘down’
in rá-ta ‘downward’.)
‘Smell’: This word has already been mentioned in section 11.6.1 as the clearest
example of a case where Hokkaidō unexpectedly has accent on the first syllable,
even though the vowel in the first syllable in Sakhalin is not long (except in the
dialect of Ochiho). The [LH] pitch in the dialect of Sōya can be explained as the
regular reflex of the segmental shape in Sakhalin, to which the dialect of Sōya is
closest (geographically and linguistically) of all the Hokkaidō Ainu dialects.
‘Mother’: This word is not attested in Sakhalin, but the Kuril Ainu wordlist of
Krasheninnikov (1738) has áapu and Dybowski (1879-1883) has apu~aapu
(Murayama, 1971). (Vovin reconstructs a long initial vowel on the basis of the Kuril
data, but the word is still listed with his two-mora group because he regards -po as a
suffix.)
‘Bosom’: The first problem here is that osór is not a cognate of úpsor, as is
evident from the entries in the dialect dictionary shown in (26).
26 The mix-up of osor and upsor in Sōya
Sōya Other dialects
‘bosom’ osór úpsor or ússor
‘buttocks’ osór osór
In all Hokkaidō dialects except Sōya ‘bosom’ is úpsor (with the assimilated form
ússor occurring in Asahikawa, Nayoro and Bihoro). Only Sōya has the entry osór.
Although osór apparently can have the meaning ‘bosom’ in Sōya, its basic meaning
(in Sōya as well as in the other dialects) is ‘buttocks’ (Hattori, 1964:14). The Sōya
informant was the only speaker left of her dialect, and her vocabulary was limited.
Perhaps she just substituted the word osór for ‘bosom’, because she did not have the
term úpsor ready anymore. In any case, it is clear that osór in Sōya is not a cognate
of úpsor, and does not belong in the list of reflexes.
Apart from this, there is no use in contrasting the pitch of a word in which accent
placement is free (accent in a word of the shape of osór can fall on the first or on the
second syllable) with the pitch of a word in which accent placement is not free
(úpsor with its closed initial syllable will automatically have the accent on the first
syllable), as the correspondence would be meaningless in any case.
‘Custom’: The Sōya entry actually is: わからない; purí (?). “I don’t know/The
informant doesn’t know”; purí (?). We cannot rule out the possibility that the choice
of a form with accent on the second syllable by the Sōya informant has to do with
the fact that the informant was relatively unfamiliar with the word. (N.B: Although
Sakhalin has puuri, for some reason Vovin does not reconstruct proto-Ainu (or
proto-Japanese) vowel length for this noun.)
‘Metal’: Like puri this example is attested with a long initial vowel in Sakhalin
and in this case is reconstructed by Vovin with an initial long vowel. (In Vovin’s
division therefore this word belongs to the proto-Ainu three mora group.) Yakumo,
Horobetsu, Asahikawa, Nayoro káni, Saru káne, Obihiro, Sōya kaní, Sakhalin kaani.
In general the Hokkaidō accent (on the first syllable) relates regularly to the long
initial vowel in Sakhalin. In two dialects (Obihiro and Sōya) accent has shifted to the
favored second syllable.
Modern Hokkaidō Ainu has only two possibilities for placing the accent in
disyllabic words: on the first syllable or on the second syllable. If there were a
large-scale systematic difference in the accentual reflexes between the different
dialects, it would be possible to reconstruct a more complex system of tone classes
for proto-Ainu, similar to what can be done in the case of proto-Japanese.
The reflexes of the five examples in (27) however (from which I would exclude
érum, as I suspect the initial [H] pitch goes back to an originally closed first
syllable), are not nearly numerous and systematic enough to justify the
reconstruction of */HH/ tone as opposed to */HL/ tone.
The second syllable is the favored locus of the accent in Ainu, and an occasional
regularization into that shape, such as has happened in case of these five examples in
Sōya, is not surprising. Moreover, in three of the five examples, accent on the
second syllable can be found in one or two dialects other than Sōya as well, but this
does not happen each time in the same dialects. Even within the scope of this small
remaining group of examples therefore, the situation cannot be compared to the
much more systematic division into tone classes that we find in the Japanese dialects.
27 The correspondences on which Vovin’s reconstruction of */HH/ tone is based
        Sakhalin          Sōya            Yakumo   Saru    Nayoro   Other
hapo    x⁶⁸               [LH]            [HL]     [HL]    [HL]     [HL]
erum    - vowel length    [LH] (erúmun)   [HL]     x⁶⁹     [HL]     [HL], [LH]⁷⁰
hura    - vowel length    [LH]            [HL]     [HL]    [HL]     [HL]
puri    + vowel length    [LH] (?)        [HL]     [HL]    [HL]     [HL], [LH]⁷¹
kani    + vowel length    [LH]            [HL]     [HL]    [HL]     [HL], [LH]⁷²
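The claim that second-syllable accent outside Sōya turns up in a few of these words, but not consistently in the same dialects, can be tallied directly. This is a minimal sketch; the dictionary below is my own transcription of the reflexes given in (25), (27) and the accompanying notes, and the variable names are hypothetical.

```python
# Dialects other than Sōya that show accent on the second syllable ([LH]).
LH_OUTSIDE_SOYA = {
    "hapo": set(),
    "erum": {"Obihiro"},
    "hura": set(),
    "puri": {"Horobetsu", "Asahikawa"},
    "kani": {"Obihiro"},
}

# Words for which at least one other dialect agrees with Sōya.
words_with_lh_elsewhere = sorted(w for w, d in LH_OUTSIDE_SOYA.items() if d)

# Dialects that agree with Sōya in every one of those words.
shared = set.intersection(*(LH_OUTSIDE_SOYA[w] for w in words_with_lh_elsewhere))

print(words_with_lh_elsewhere)
print(shared)
```

No dialect other than Sōya shows [LH] across all of the affected words, which is the lack of systematicity at issue here.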
Notice too, that only two of the examples with (predominantly) [HL] reflexes in
Hokkaidō have been attested with a short vowel in Sakhalin Ainu, namely érum and
húra (although, as mentioned before, I have my doubts about the validity of the first
example) indicating that – contrary to Vovin’s claim – Hattori’s rule that initial
accent corresponds to vowel length in Sakhalin is hardly ever violated.
Finally, as for the Japanese loanwords involved, Vovin does not address the
problem of why the majority of loanwords from Japanese class 2.1 (/HH/ tone in
proto-Japanese in the standard reconstruction) do not show his proposed
correspondence for words with proto-Ainu */HH/ tone at all. In the examples in (28),
the Sōya reflex is mostly missing, but the general Hokkaidō Ainu reflex is clear:
Accent falls on the second syllable, and never on the first.
28 The reflexes of Japanese loanwords of class 2.1
Class   Gloss            Japanese   Hokkaidō Ainu   Sōya
2.1     ‘shark’          same       samé            x
2.1     ‘sake’           sake       saké            x
2.1     ‘pot’            kama       kamá            x
2.1     ‘a rush mat’     toma       tomá            x
2.1     ‘cover, lid’     huta       putá            putá
2.1     ‘hatchet’        nata       natá            nata⁷³
11.9.2.2 Proto-Ainu */HL/
All examples of Vovin’s */HL/ tone class start with a closed syllable. As accent
placement on initial closed syllables is automatic, the pitch of these examples is
determined entirely by their segmental shape, and forms no basis for the
reconstruction of a distinctive */HL/ tone class.
68 Not attested in Sakhalin, but with long initial vowel in Kuril Ainu.
69 In Saru accent placement is not free because of the closed first syllable in érmu.
70 Erúmun in Obihiro.
71 Purí in Horobetsu and Asahikawa.
72 Kaní in Obihiro.
73 The accent mark is missing in the Ainu dialect dictionary.
29 Disyllables with closed initial syllables are reconstructed with */HL/ tone
Vovin’s proto-Ainu Modern accentuation
*gAnku /HL/ ‘navel’ hánku
*ihka /HL/ ‘to steal’ íkka
*takne /HL/ ‘short’ tákne
*Haspa /HL/ ‘deaf’ áspa
*hdak=ka /HL/ ‘water’ wákka
The examples in (29) have automatic accent on the first syllable in the modern
Hokkaidō dialects because the initial syllable is closed. The other group of words
that has accent on the initial syllable in Hokkaidō, are those words that show the
regular correspondence between initial vowel length in Sakhalin and initial accent in
Hokkaidō established by Hattori. Hattori’s proto-Ainu reconstruction of these words
was *CVVCV.
Vovin however, bases the reconstruction of another distinctive proto-Ainu tone
class */HLL/ on this correspondence. (Because of the vowel length in Sakhalin, the
examples involved belong to the proto-Ainu three-mora group in Vovin’s division.)
30 Disyllables with vowel length in the initial syllable in Sakhalin
are reconstructed with */HLL/ tone
Vovin’s proto-Ainu Modern accentuation
*tOOpEn /HLL/ ‘sweet’ tópen (cf. Sakhalin o-toopen)
*dEEra /HLL/ ‘wind’ réra (cf. Sakhalin reera, Nairo teera)
*daarak /HLL/ ‘smooth’ rárak (cf. Sakhalin raarah, Nairo taarak)
*kEEra /HLL/ ‘taste’ kéra, kéra-ha (cf. Sakhalin keera)
*gEEsE /HLL/ ‘to breathe’ hése (cf. Sakhalin heese)
Vovin reconstructs */HLL/ and not */HHL/ here because he
reconstructs */HHL/ in case of a different pattern of correspondences, namely when
“RA three-mora word corresponds to HKD two-syllable word, which belongs in Y
to prototonic class (HL) and in other dialects to oxytonic (LH) class.” There is only
one word that shows this type of correspondence, which is *kaasi=/u /HHL-L/ ‘to
help’.
The modern reflexes of this word are: Yakumo kásiw; Horobetsu, Saru, Obihiro
kasúy; Asahikawa kasúy, i-kásuy; Nayoro u-kásuy; Sōya i-kásuyke; Sakhalin kaasiw,
i-kaasiw.
The only example of initial accent in the dialect dictionary is from Yakumo in
the south of Hokkaidō (kásiw), but the accent in the dialect of Chitose (not included
in the dialect dictionary, personal communication by Anna Bugaeva) is also on the
first syllable: kásuy. (The accentuation in Asahikawa, Nayoro and Sōya only shows
the mechanism, discussed earlier, of keeping the accent on the second syllable when
verbal prefixes are added: i-kásuy, u-kásuy etc.) As has already been discussed in
section 11.6.1, the fact that the second syllable of this example was heavy may have
caused the accent to fall on the second syllable instead of the first in a number of
Hokkaidō dialects. The fact that Yakumo relates regularly to the Sakhalin form
(from which it is geographically farthest removed) both in accent and segmental
shape would make me reconstruct *kaasiw or *kaasiw.
11.9.2.3 Proto-Ainu */LL/
The reconstruction of the */LL/ proto-Ainu tone class is based on [LL-H] reflexes in
Yakumo when a possessive suffix attaches to the word. It has to be remembered
however, that in Yakumo the accent regularly shifts onto the third syllable if the
second syllable is open, and if the second syllable does not contain a long vowel in
Sakhalin.
For the reconstruction of proto-Ainu */LL/ tone (as opposed to proto-Ainu */LH/
tone) to be valid, there ought to have been a contrast between Yakumo forms
with [LL-H] pitch and Yakumo forms with *[LH-L] pitch, but such a contrast does
not exist.
31 Disyllables that have a possessive form are reconstructed with */LL/ tone
Vovin’s proto-Ainu Modern accentuation
Hokkaidō Yakumo
*EtOp /LL/ ‘hair’ otóp, otópi etóp, etopí
*ti/Ep /LL/ ‘fish’ ci/ép, cép74 (no possessive) ci/ép, ci/epí
*kisAr /LL/ ‘ear’ kisár, kisára kisár, kisará
*dE=kut /LL/ ‘neck’ rekút, rekúci rekút, rekucí
*nupEk /LL/ ‘light’ nupék, nupéki nupék, nupekí
There is one example with a long initial vowel in Sakhalin, namely ‘wheel, ring’
kaaris, kaarip-ihi, which is karíp in Hokkaidō and karíp, karipí in Yakumo. It is
reconstructed as *kaari=p */LLL/ by Vovin. My objection to the reconstruction of a
proto-Ainu */LLL/ tone class (with – apparently – only a single member) is the same
as my objection to the reconstruction of a */LL/ tone class: I see no reason for the
reconstruction of a distinct proto-Ainu level /L/ tone class based on the Yakumo
reflex, if that reflex can easily be explained as an innovation, and can be predicted
based on the segmental shape of the words involved. As for the discrepancy between
the long initial vowel in Sakhalin and the accent on the second syllable in Hokkaidō
in the case of ‘wheel, ring’, Hattori reconstructed a long vowel in the closed second
syllable in proto-Ainu to account for the unusual accent placement in Hokkaidō:
*kaariip (see section 11.6.1).
74 In most dialects this word ci/ép ‘fish’ (lit. ‘what we eat’, ci- ‘we’, -e- ‘eat’, -p ‘thing’) has
monosyllabified to cép.
11.9.2.4 Proto-Ainu */LH/
The following group of words is reconstructed with proto-Ainu */LH/ tone by Vovin.
32 Disyllables that do not have a possessive form are reconstructed with */LH/ tone
Vovin’s proto-Ainu Modern accentuation
*apE /LH/ ‘fire’ apé (no possessive)
*kina /LH/ ‘grass’ kiná (no possessive)
*turEp /LH/ ‘sweet potato’ turép (no possessive)
*mOyO /LH/ ‘few, little’ moyó
*Oman /LH/ ‘to go’ omán
These examples have accent on the second syllable in all Hokkaidō dialects, and
have short initial vowels in Sakhalin, showing the regular reflex established by
Hattori. Vovin reconstructs these examples with */LH/ instead of */LL/ tone,
because they lack a [LL-H] attestation in Yakumo. They lack such an attestation
because they happen not to have a possessive (or other derivative form) in which
they are lengthened to three syllables.
All disyllabic words in Yakumo with accent on the second syllable that do have
a possessive form have [LL-H] pitch in the possessive form. Whether the words
have consonant-ending or vowel-ending stems makes no difference in this respect,
cf. ‘shin’ nisáp, nisapí, ‘hair’ etóp, etopí, ‘light’ nupék, nupekí, ‘doorway’ apá,
apahá, ‘foot’ uré, urehé, ‘leg’ kemá, kemahá etc.
As I have mentioned before, Vovin’s reconstruction of a proto-Ainu */LL/ tone
class separate from a proto-Ainu */LH/ tone class would have been convincing, if
there had been a contrast between *[LH-L] attestations in Yakumo for the proto-
Ainu */LH/ class, and [LL-H] attestations in Yakumo for the proto-Ainu */LL/ class,
but such a contrast does not exist.
11.9.3 Trisyllables
Vovin reconstructs only two trisyllabic tone classes, */LHL/ and */LLH/.
11.9.3.1 Proto-Ainu */LHL/
Shibatani (1990) notes that Ainu has a strong tendency to avoid vowel sequences
and that a number of phonological processes operate to effect this tendency: In
diphthongs such as ai and ui, the second vowels are devocalized and pronounced as
[aj] and [uj], and the semivowels w and y are inserted when close vowels are
followed by other vowels; w is inserted following u and y after i.
It must have been with these kinds of processes in mind that Vovin reconstructed
proto-Ainu vowel sequences in the following examples: We see that modern Ainu
iw is reconstructed as *i/u, and oy as *O/i. As a result, the following examples have
three syllables in proto-Ainu in Vovin’s reconstruction, while they have only two
syllables in modern Ainu. The result is the reconstruction of a proto-Ainu trisyllabic
tone class */LHL/.
I do not know if this reconstruction is correct. My main concern is with the
reconstruction of the proto-Ainu prosodic system, and the Hokkaidō Ainu accent
patterns in these examples are unexceptional.
33 Trisyllables reconstructed with */LHL/ tone
Vovin’s proto-Ainu Modern accentuation
*kapi/u /LHL/ ‘seagull’ Yakumo, Horobetsu, Obihiro, Asahikawa, Sōya kapíw.
*gO/inu /LHL/ ‘marten’ Yakumo, Saru, Obihiro, Asahikawa, Nayoro hóynu,
Horobetsu ho/ínu, Bihoro oynuy, Sakhalin hoynu.
11.9.3.2 Proto-Ainu */LLH/
Finally, Vovin reconstructs a trisyllabic proto-Ainu tone class */LLH/.
34 Trisyllables reconstructed with */LLH/ tone
Vovin’s proto-Ainu Modern accentuation
*makiri /LLH/ ‘knife’75 makíri, Yakumo makirí
*hErOki /LLH/ ‘herring’ heróki, eróki, Yakumo herokí
*isEpO /LLH/ ‘hare’ isépo, Yakumo isepó
*sipuya /LLH/ ‘smoke’ sipúya, Yakumo sipuyá
*pisaku /LLH/ ‘dipper’ (< Japanese)76 pisáku, Yakumo pisakú
In words of three syllables, there are only two accentual possibilities in almost all
Hokkaidō Ainu dialects: accent on the first or on the second syllable. The only
exception is the dialect of Yakumo. In Yakumo, in words of three or more syllables
that do not have the accent on the first syllable, the accent will shift from the second
syllable to the third if the second syllable is open and does not contain a long vowel
in Sakhalin.77
75 According to Kokugo-gaku Dai-jiten this is a loanword from Ainu into Japanese, meaning
‘small sword’. (Tōkyō: ma'kiri.)
76 From hisyaku 3.6 or hisak/go 3.6 (an older form). Both have Ø tone in Tōkyō.
77 (See section 11.8.2.) As I have mentioned before, there are exceptions to this rule, such as
onomatopoeia and reduplications (which are often mimetic words) like heráhera ‘to limp’,
reyéreye ‘to creep, to crawl’. Often reduplications occur with the suffix -se: horárayse ‘to slide,
to slip’, kiyárarse ‘to yell, to shout’, rayáyayse ‘to cry, to weep’. There are however, also
reduplications in Yakumo that do have the accent on the third syllable, like nuyanúya ‘to
shatter it, to crush it’, suyesúye ‘to wave, to shake’, sikaríkari ‘to turn it around, to revolve it’,
parupáru ‘to fan’. I do not know the reason behind this kind of variation, whether it may for
instance be the result of influence from other dialects, or a special feature of mimetic words. (It
is not possible to compare the accent in these examples with records of any other speaker of the
Yakumo dialect as none exist.)
11.10 Vovin’s evidence from Japanese loanwords in Ainu 305
In these cases we have the correspondence: Hokkaidō [LHL], Yakumo [LLH].
Vovin reconstructs proto-Ainu */LLH/ tone for this correspondence because he
does not acknowledge a relationship between accent placement in Yakumo and
factors like the segmental structure or vowel length in
proto-Ainu (as evidenced by the vowel length in Sakhalin).
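The Yakumo shift can be stated as a rule over syllabified forms, which makes it easy to see that both the trisyllables in (34) and the possessive forms in (31) follow from it. The sketch below is a simplified illustration: the function name, the list-of-syllables input and the `second_long_in_sakhalin` flag are my own assumptions, and the exceptions among reduplicated and mimetic words are not modelled.

```python
VOWELS = set("aeiou")


def yakumo_accent(syllables, hokkaido_accent, second_long_in_sakhalin=False):
    """Return the accented syllable (1-based) expected in Yakumo.

    `syllables` is a list such as ["ma", "ki", "ri"]; `hokkaido_accent`
    is the accented syllable in the other Hokkaidō dialects.
    """
    if (
        len(syllables) >= 3
        and hokkaido_accent == 2
        and syllables[1][-1] in VOWELS      # second syllable is open
        and not second_long_in_sakhalin     # and not long in Sakhalin
    ):
        return 3
    return hokkaido_accent


# Forms from (34) and possessive forms from (31), all with accent on the
# second syllable in the other Hokkaidō dialects.
WORDS = [
    ["ma", "ki", "ri"], ["he", "ro", "ki"], ["i", "se", "po"],
    ["si", "pu", "ya"], ["pi", "sa", "ku"],
    ["e", "to", "pi"], ["ki", "sa", "ra"], ["re", "ku", "ci"], ["nu", "pe", "ki"],
]

for syllables in WORDS:
    print("".join(syllables), yakumo_accent(syllables, hokkaido_accent=2))
```

Every form comes out with accent on the third syllable, as attested in Yakumo (makirí, etopí and so on), without any appeal to a separate tone class.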
11.10 Vovin’s evidence from Japanese loanwords in Ainu
for the standard reconstruction of proto-Japanese tone
Vovin presents the following list of comparisons of the tone of proto-Japanese
loanwords in proto-Ainu as corroboration for the standard reconstruction of the
proto-Japanese tone system.
35 Vovin’s comparison of proto-Japanese with proto-Japanese loanwords
in proto-Ainu
Vovin’s proto-Japanese Vovin’s proto-Ainu
‘metal’ *kaana(=Ci) /HHH/ (2.1) *kaani /HHH/
‘paper’ *kanpi /HL/ (2.2a) *ka[n]pi /HL/
‘cup’ *tuuki /HLL/ (2.2b) *tuuki /HLL/
‘board’ *ita /LH/ (2.4) *ita /LH/
‘saw’ *noko /LF/ (2.5) *noko /LH/
‘bone’ *pone /LL/ (2.3) *pone /LL/
‘skin’ *kapa /LL/ (2.3) *kap /L/
‘hammer’ *tuutu(=Ci) /LLH/ (2.4) *tuuti /LLH/
‘to measure’ *paaka(=ra=) /LLL/ (3.4) *paakari /HLLL/
‘bag’ *pukurwo /LLL/ (3.4) *pukuru /LLH/
‘ladle’ *pisaku /LHH/ (3.6) *pisaku /LLH/
‘medicine’ *kusuri /LHL/ (3.7a) *kusuri /LHL/
As is clear from my discussion of Vovin’s reconstructed proto-Ainu tone classes, I
do not believe that proto-Ainu had phonological tone. Even in modern Hokkaidō
Ainu, where pitch-accent has become distinctive, accent placement is still for the
most part determined by segmental features. It is only in polysyllabic words with
an open initial syllable that accent placement is free. The correlation between initial
accent in this environment and vowel length in Sakhalin is so strong that it is hard
to escape the conclusion that initial accent in Hokkaidō can be traced back
historically to vowel length in the initial syllable in proto-Ainu.
Unfortunately, therefore, the Japanese loanwords in Ainu do not contain the
detailed information on proto-Japanese tone and vowel length that Vovin (1993b and
1997) believes he finds in them.78 I will discuss Vovin’s examples in the order in
which they are given in (35).
‘Metal’: The Hokkaidō Ainu dialects in general have [HL] pitch. The vowel
length is based on Sakhalin kaani. As the Sōya reflex is [LH], Vovin reconstructs
*/HH/ tone. We have seen that there is a small group of words in which accent
placement in Sōya does not agree with (most) other Hokkaidō Ainu dialects, but the
examples are not numerous, and the reflexes not systematic enough to justify the
reconstruction of this proto-Ainu tone class. (Moreover, this correspondence does
not occur at all in case of the majority of Japanese loanwords of class 2.1 in Ainu.)
Following Hattori, the reconstruction of this type of word in proto-Ainu would be
*CVVCV, with non-distinctive [H] pitch on the second mora.
‘Paper’: The -np- cluster in this word was probably adopted unaltered from the
Japanese original. (See section 11.11.1.) As an initial closed syllable in Ainu will
always have the accent, there is only one accentual possibility for a word of this
segmental shape in Ainu. If the resulting [HL] pitch in such a case agrees with the
original tone of the Japanese word, this is pure coincidence.
‘Cup’: This word has the regular correspondence of initial vowel length in
Sakhalin with accent on the first syllable in Hokkaidō. Following Hattori, the
reconstruction of this type of word in proto-Ainu would be *CVVCV, with non-
distinctive [H] pitch on the second mora.
‘Board’ and ‘saw’: We have to be aware that in the modern Tōhoku type dialects
words like ita and noko (nouns belonging to class 2.4 or 2.5 that have an open final
vowel) will have [LH] pitch, and that the Ainu accent could have been inherited
from these dialects. Furthermore, a certain number of loanwords with Kyōto type
tone in Ainu can be expected as the trade contacts of the Ainu were mainly with
traders from Ōsaka (see section 11.11.4).
‘Bone’: This word has the same accentuation (poné) as itá and nokó in the
Hokkaidō Ainu dialects, but */LL/ tone is reconstructed by Vovin because Yakumo
has [LL-H] pitch when the possessive suffix attaches. This word is therefore only
reconstructed as */LL/ and not as */LH/ like itá and nokó because – unlike itá and
nokó – it happens to have a possessive form. As the shift of [H] pitch from short
open syllables in second position to the third syllable in Yakumo is automatic (in
words that do not have vowel length in the second syllable in Sakhalin) accent on
the third syllable does not justify the reconstruction of a proto-Ainu */LL/ tone class.
‘Skin’: As we have seen, a word like káp (poss. kap-ú), consisting of a single
closed syllable, can only have [L-H] pitch when a possessive suffix attaches. Even if
this word were a loanword from Japanese (which I do not think is the case, as I fail to
understand the loss of the final vowel), an agreement in the pitches would be
coincidental.
78 In his article of 1993, proto-Japanese vowel length was reconstructed for ‘metal’ *kaana=Ci
/HHH/, ‘cup’ *tuuki /HLL/, ‘hammer’ *tuutu=Ci /LLH/ and ‘to measure’ *paaka=ra= /LLL/.
Except for *paaka=ra= /LLL/, all these examples were presented again in the article of 1997,
but this time the reconstructed proto-Japanese vowel length was left out. I will discuss all
Vovin’s examples together here, with the examples from the 1993 article presented with
reconstructed vowel length.
‘Hammer’: A disyllabic, three-mora */LLH/ proto-Ainu tone class is actually
not reconstructed by Vovin in A reconstruction of Proto-Ainu, so this Japanese
loanword is the only representative of this class. With reflexes like Horobetsu tútci,
Saru, Sōya túci, Yakumo, Obihiro tucí, Sakhalin tuuci, I would assume that Yakumo
and Obihiro have moved the accent to the unmarked position. Following Hattori, the
proto-Ainu reconstruction would be *CVVCV.
‘To measure’: With pákari in all Hokkaidō dialects (except for pakári in Nayoro,
where the accent has moved to the favored second syllable) and paakari in Sakhalin,
Hattori would reconstruct proto-Ainu *CVVCVCV, with non-distinctive [H] pitch
on the second mora. (Note that even in Vovin’s reconstruction this example cannot
count as a confirmation of the standard reconstruction of the proto-Japanese tone
system as the tone of the initial syllable does not agree with Middle Japanese.)
‘Bag’ and ‘ladle’: These two examples have the accent on the second syllable
(pukúru, pisáku) in all Hokkaidō Ainu dialects except Yakumo, where accent
automatically shifts to the third syllable (pukurú and pisakú). Vovin nevertheless
reconstructs the proto-Ainu tone as */LLH/ and not */LHL/ on this basis. Vovin’s
next example, however, is reconstructed as */LHL/.
‘Medicine’: In the compound ‘ring finger’ (lit. ‘medicine finger’, as in eastern
Japan) this example has accent on the second syllable in Yakumo: kusúri/aspeket,
kusúri/aspekec-i (poss.). This is what leads Vovin to reconstruct proto-Ainu */LHL/
tone for this noun, which agrees wonderfully well with the standard reconstruction
of the Middle Japanese tones. Vovin therefore adduces this example as proof for the
idea that a distinct Japanese tone class 3.7 (with /LHL/ tone) once existed in
northern Japan as well, and not only in the central Japanese area. Vovin, however,
somehow fails to quote the accent of this loanword in Yakumo when it is not part of
a compound, which is kusurí, as was to be expected in Yakumo for words of this
segmental shape. See: Tán kusurí tasúm nópirka ‘This medicine works’ (Hattori
1967:31).
In a similar way, Yakumo ekasí ‘grandfather’ has preserved accent on the second
syllable in the compound ekásirakpopo ‘descendants’. The fact that the original
location of the accent has been preserved in the compounded form is a phenomenon
that can also be observed in case of Japanese in the modern Kyōto type dialects. As
the accentuation of ‘medicine’ is the same as the accentuation of ‘bag’ and ‘ladle’,
this example cannot count as proof that Ainu has preserved traces of the Middle
Japanese tones of class 3.7 in the standard reconstruction.
As I will explain further in section 11.11.2, I see accent on the second syllable in
Ainu in general as an unreliable indication for the original tone of the Japanese
loanword, and as the divergent accentuation in Yakumo is related to the fact that the
second syllable in these words is open, this type of correspondence contains no
reliable clue as to the original tone of the Japanese words involved.
If this kind of correspondence really means that we have to reconstruct proto-
Ainu */LLH/ tone, then what about ‘car’ kurúma, but Yakumo kurumá (< tone class
3.1), ‘port’ tomári, but Yakumo tomarí (< tone class 3.1), ‘mark’ sirósi, but Yakumo
sirosí (< sirusi tone class 3.1), ‘price’ atáy or atáye, but Yakumo atayé (< atai <
atafyi 3.1), ‘to rest’ yasúmi, but Yakumo yasumí (< tone class 3.4)? And is potokí in
Yakumo enough reason to reconstruct 3.4 hotoke as */LLH/?
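The automatic character of the Yakumo pattern can be made explicit in a small Python sketch. It is only an illustration under simplifying assumptions: the words are entered already divided into syllables, the function name yakumo_accent and the parameter long_second_in_sakhalin are hypothetical, and the rule encoded is the one described above (accent on a short open second syllable moves to the third syllable). The forms are those cited in this chapter, including those listed in (37).

```python
VOWELS = set("aeiou")


def yakumo_accent(syllables, hokkaido_accent, long_second_in_sakhalin=False):
    """Predict the accented syllable (1-based) in Yakumo from the accent
    found in the other Hokkaidō dialects.

    Assumed rule: accent on a short open second syllable shifts to the
    third syllable; every other accent stays where it is.
    """
    has_third = len(syllables) >= 3
    second_open = len(syllables) >= 2 and syllables[1][-1] in VOWELS
    if (hokkaido_accent == 2 and has_third and second_open
            and not long_second_in_sakhalin):
        return 3
    return hokkaido_accent


# label: (syllables, accent elsewhere in Hokkaidō, accent attested in Yakumo)
EXAMPLES = {
    "kurúma ~ kurumá 'car'": (["ku", "ru", "ma"], 2, 3),
    "tomári ~ tomarí 'port'": (["to", "ma", "ri"], 2, 3),
    "atáye ~ atayé 'price'": (["a", "ta", "ye"], 2, 3),
    "yasúmi ~ yasumí 'rest'": (["ya", "su", "mi"], 2, 3),
    "pukúru ~ pukurú 'bag'": (["pu", "ku", "ru"], 2, 3),
    "pisáku ~ pisakú 'ladle'": (["pi", "sa", "ku"], 2, 3),
    "kusúri ~ kusurí 'medicine'": (["ku", "su", "ri"], 2, 3),
    "sirókani ~ sirokáni 'silver'": (["si", "ro", "ka", "ni"], 2, 3),
    # closed second syllable: no shift expected
    "kosónte ~ kosónte 'kosode'": (["ko", "son", "te"], 2, 2),
    # initial accent: no shift expected
    "pákari ~ pákari 'to measure'": (["pa", "ka", "ri"], 1, 1),
}

for label, (syllables, hokkaido, yakumo) in EXAMPLES.items():
    predicted = yakumo_accent(syllables, hokkaido)
    print(f"{label}: predicted {predicted}, attested {yakumo}")
```

For these forms the position of the accent in Yakumo follows from the segmental shape alone, which is the point made above: no separate */LLH/ tone class is needed to derive it. The sketch does not cover the reduplicated forms with divergent accent that were mentioned earlier.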
11.11 What can the Japanese loanwords really tell us?
Although the information that these Ainu loanwords can offer on the original tone of
these words in Japanese is limited, this does not mean that the accent of Japanese
loanwords in Ainu is not worth looking at, as it still may contain information,
although perhaps on other matters.
Before discussing the accent of the Japanese loanwords in Ainu, it is important
to distinguish between words of which it is likely that the accent can tell us
something about the original Japanese tones and words of which this is unlikely or
unclear.
11.11.1 Loanwords that include voiced consonants in the second syllable
in Japanese
There is one Japanese loanword in Ainu that has a nasal + consonant cluster that
may go straight back to a Japanese nasal + consonant cluster, i.e. Ainu kánpi from
Jap. kami (< *kamyi/kabyi < *kanpi 2.2a) ‘paper’. (Interestingly, Krasheninnikov’s
Kuril Ainu has unrelated tat, while Dybowski later has kambiy.) The vast majority of
nasal + consonant clusters in Japanese loanwords in Ainu, however, go back to
Japanese voiced consonants.
These voiced consonants in Japanese are in turn thought to go back to nasal +
consonant clusters in Old Japanese, but the point is that it is not necessary to go back
so far for an explanation of the nasal + consonant clusters in these loanwords in
Ainu. They could simply be the way in which the Ainu language deals with
intervocalic voiced consonants in loanwords. Besides, in the dialects of northeastern
Japan, the intervocalic voiceless stops of standard Japanese are voiced, while the
intervocalic voiced stops of standard Japanese are prenasalized. So there are three
equally possible explanations for these clusters in Ainu: adopted straight from Old
or proto-Japanese, adopted from the northern dialects, or simply the way in which
the Ainu language deals with intervocalic voiced consonants in loanwords.
(Japanese -m- on the other hand is usually taken over as -m- in Ainu, so the case for
Japanese kami < *kanpi ‘paper’ is quite strong.)
(36) Loanwords that include voiced consonants in the second syllable in Japanese
                              Japanese           Hokkaidō Ainu
2.1        ‘basket’           kago               kánko
2.1        ‘nail’             kugi               kúnki
2.2a       ‘rudder’           kadi               kánci
2.2a       ‘paper’            kami (< *kanpi)    kánpi
2.4        ‘wheat’            mugi               múnki
3.1        ‘gold’             kogane             kónkani (Saru kónkane)
(3.1)79    ‘tobacco’          tabako             támpaku (Obihiro tapáko, Nayoro tapáku)
3.2        ‘azuki bean’       aduki              ántuki
3.4        ‘mirror’           kagami             kánkami
(3.5)80    ‘funnel’           zyoogo             cónko
(3.7)81    ‘priest’           boozu              pónci
B          ‘to worship’       ogamu              ónkami
2.5/3.7b   ‘sea weed’         kobu/konbu         kónpu
A comparison of the tone classes of the Japanese words with the accent of the words
in Ainu clearly shows that the accent in Ainu will automatically fall on the closed
syllables that are the result of the nasal + consonant clusters, irrespective of the tone
class of the words in Japanese.82
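The observation can be checked mechanically against the forms in (36). The following Python sketch is a minimal illustration, assuming a simplified (C)V(C) syllabification of the romanized forms (a single consonant between vowels is an onset; the first consonant of a cluster closes the preceding syllable). The function names are hypothetical, and the default of second-syllable accent for words with an open initial syllable is only the favored position, not a rule.

```python
import unicodedata

VOWELS = set("aeiou")
ACUTE = "\u0301"  # combining acute accent marking the accented vowel


def plain(word):
    """Lowercase the word and drop all combining marks (accents)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def syllabify(word):
    """Split a romanized form into (C)V(C) syllables (simplified)."""
    letters = plain(word)
    vowel_positions = [i for i, ch in enumerate(letters) if ch in VOWELS]
    syllables, start = [], 0
    for n, pos in enumerate(vowel_positions):
        if n + 1 < len(vowel_positions):
            gap = vowel_positions[n + 1] - pos - 1  # consonants before next vowel
            end = pos + 1 + (1 if gap >= 2 else 0)  # a cluster lends one coda
        else:
            end = len(letters)  # the last syllable takes whatever remains
        syllables.append(letters[start:end])
        start = end
    return syllables


def attested_accent(word):
    """Return the 1-based number of the syllable carrying the acute accent."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    if ACUTE not in decomposed:
        return None
    before = decomposed[: decomposed.index(ACUTE)]
    accented_letter = sum(1 for ch in before if not unicodedata.combining(ch))
    consumed = 0
    for number, syllable in enumerate(syllabify(word), start=1):
        consumed += len(syllable)
        if accented_letter <= consumed:
            return number
    return None


def expected_accent(word):
    """Closed initial syllable: accent on syllable 1; else favored syllable 2."""
    syllables = syllabify(word)
    closed_initial = syllables[0][-1] not in VOWELS
    return 1 if closed_initial or len(syllables) == 1 else 2


# Hokkaidō Ainu forms from (36)
TABLE_36 = ["kánko", "kúnki", "kánci", "kánpi", "múnki", "kónkani", "támpaku",
            "ántuki", "kánkami", "cónko", "pónci", "ónkami", "kónpu"]

for form in TABLE_36:
    print(form, syllabify(form), expected_accent(form), attested_accent(form))
```

Every form in (36) has a closed initial syllable, so the expected and the attested accent coincide on the first syllable whatever the Japanese tone class, whereas a form such as samé, with an open initial syllable, falls under the second-syllable default.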
11.11.2 Loanwords that have the accent on the second syllable in Ainu
Whenever the accent of Japanese loanwords in Ainu fell on the first syllable, Chiri
(1956:156) regarded the Ainu accentuation as having preserved initial /H/ tone in
Japanese. As the second syllable is the favored syllable to carry the accent in Ainu, I
agree with Chiri that accent on the first syllable in Japanese loanwords in Ainu is
much more significant than accent on the second syllable: If accent falls on the first
syllable in Ainu, the chances are very high that such a word started with [H] pitch in
Japanese, but if accent falls on the second syllable, this may be due to the Ainu
preference, and not to the original pitches of the word in Japanese:
Japanese loanwords with a CVCV structure and /LL/ or /LH/ tone were almost
certainly adopted as CVCV́ in Ainu, i.e. with the accent on the second syllable, as
this is the preferred location of the accent in words of this segmental shape. It is
highly unlikely that words of this type would have been adopted with accent on the
first syllable, unless the tone of the donor word had been /HL/. Japanese loanwords
with original /HH/ tone in Japanese are a less clear case. They could have been
adopted as CVCV́, like the /LL/ and /LH/ loanwords, or the initial /H/ tone may
have had some other result. (This case will be discussed later.)
79 Modern Tōkyō and Kyōto have Ø tone.
80 Both Kyōto and Tōkyō have zyo'ogo.
81 Kyōto has 'boo'zu, Tōkyō has bo'ozu. The segmental shape in Aomori is: bonzi, bonzu.
82 Only occasionally do other factors cause Japanese loanwords in Ainu to have closed syllables.
The vowel sequence in Japanese nui ito (縫い糸) ‘sewing thread’, for instance, becomes núyto
in Ainu, with automatic accent on the first syllable.
We simply cannot maintain that the Ainu preference for accent on the second
syllable played no role when we look at the long list of loanwords below which
shows that words from all tone classes in Japanese end up with accent on the second
syllable in Ainu. I see this as a strong indication in favor of Hattori’s idea that pitch
patterns in proto-Ainu adapted themselves to the segmental shape of the words and
were automatic in origin.
On the other hand, if we want to maintain that proto-Ainu was a tone language
and that proto-Ainu faithfully took over the Japanese tones of the loanwords with
/LL/ and /HH/ tone that entered the language, we will have to find convincing and
meaningful sets of divergent accent placement in the dialects for all cases that did
not have /LH/ tone in Japanese to begin with. As we have already seen, such
convincing and meaningful sets of divergent accent do not exist.
Of course, these examples may have had /LH/ tone in the type of Japanese that
they were borrowed from to start with, but the point is that there is no way in which
we can be certain of this. For the time being, I do not regard the accent in Ainu of
the following examples as reliable evidence for the original Japanese tones of the
words involved, except that we can say that it is unlikely that they had /HL/ tone.
(37) Loanwords that have the accent on the second syllable in Ainu
                              Japanese           Hokkaidō Ainu
2.1        ‘shark’            same               samé
2.1        ‘cover, lid’       huta               putá
2.1        ‘hatchet’          nata               natá
2.1        ‘sake’             sake               saké
2.1        ‘pot’              kama               kamá
2.1        ‘a rush mat’       toma               tomá83
2.1        ‘pig’              buta               putá
2.2        ‘flag’             hata               hatá
2.2        ‘saddle’           kura               kurá
2.2        ‘person’           hito               pitó84
2.3        ‘bean’             mame               mamé
2.3        ‘potato’           imo                imó (Saru emó)
2.3        ‘ball’             tama               tamá
2.3        ‘bone’             hone               poné (Sakhalin poni)85
2.3        ‘cooked rice’      mesi               mesí
2.3        ‘color’            iro                iró
2.3        ‘god’              kami               kamúy
2.4        ‘chopsticks’       hasi               pasúy
2.4        ‘rice paddle’      hera               perá
2.4        ‘seed’             tane               tané
2.4        ‘board’            ita                itá
2.5        ‘saw’              noko               nokó
2.5?       ‘cat’              neko               nekó, mekó
3.1        ‘car’              kuruma             kurúma (Yakumo kurumá)
3.1        ‘port’             tomari             tomári (Yakumo tomarí)
3.1        ‘mark’             sirusi             sirósi (Yakumo sirusí)
3.1        ‘price’            atai (< atahi)     atáy, atáye (Yakumo atayé)
3.4        ‘rest’             yasumi             yasúmi (Yakumo yasumí)
3.4        ‘scissors’         hasami             hasámi (not attested in Yakumo)
3.4        ‘bag’              hukuro             pukúru (Yakumo pukurú)
3.6        ‘ladle’            hisyaku            pisáku86 (Yakumo pisakú)
3.7        ‘medicine’         kusuri             kusúri (Yakumo kusurí)
3.5/3.7a   ‘egg’              tamago             tamánko87 (only in Saru)
3.6?       ‘kosode’88         kosode             kosónte (Yakumo kosónte)
4.7        ‘silver’           sirokane           sirókani (Yakumo sirokáni, sirókane in Saru)
83 From Nakagawa (1989), not included in the dialect dictionary. This word sounds very archaic
to speakers of standard Japanese, but according to Nakagawa it is still in daily use in the
countryside of Kantō and Tōhoku.
84 In Hokkaidō Ainu this word is mainly used for ‘man’ as opposed to ‘god’, while in Sakhalin it
can also be used in the more ordinary meaning of ‘person’ (Nakagawa 1989).
85 In the Hokkaidō Ainu dialects as well, final -e, -o and -u of Japanese loanwords are sometimes
raised: kane > káni, boozu > pónci, hukuro > pukúru, hotoke > potokí (Yakumo), tabako >
támpaku. The verbs ogamu > ónkami, yasumu > yasúmi and hakaru > pákari, on the other hand,
may have been loaned in the nominal form. Raising of a final vowel that is accented is rare.
86 In addition, the form pisákku occurs in Saru and pisákko in Bihoro.
87 The usual word for ‘egg’ in the Ainu dialects is nók, but the Ainu word is sometimes avoided
because it also means ‘testicles’.
88 A short-sleeved kimono. Both Tōkyō and Kyōto have ko'sode. (Sakhalin has kosonto ‘one’s best
clothes’.)
11.11.3 Loanwords that have the accent on the initial syllable in Ainu
Compared to the number of Japanese loanwords in Ainu that have accent on the
second syllable, the number of examples with accent on the initial syllable is truly
small.
(38) Japanese loanwords that have accent on the initial syllable in Ainu
                              Japanese     Hokkaidō Ainu                Sakhalin Ainu
2.1     ‘metal’               kane         káni, kaní89                 kaani
2.1     ‘custom’              huri         púri, purí90                 puuri
2.1     ‘large lidded box’    hitu         pítu91                       x
2.2     ‘cup’                 tuki         túki                         tuuki
2.3     ‘salt’                sio92        síppo                        sispo
2.3     ‘pot’                 hati         pátci                        pahci
2.3     ‘whip’                muti         mútci93                      muhci
2.3     ‘horse’               uma          úma, mma, úmma, umá          uuma94
2.3     ‘lord’                tono         *tóno (‘Japanese person’)    *toono95
2.4/5   ‘yarn guide’          wosa         wósa                         x96
2.4/5   ‘hammer’              tuti         túci, tucí, tútci97          tuuci
2.4/5   ‘millet’              kibi/kimi    kími (‘maize’)               x98
3.4     ‘to measure’          hakari       pákari, pakári (Nayoro)      paakari
89 Kaní in Sōya and Obihiro.
90 Purí in Horobetsu and Asahikawa, and perhaps in Sōya.
91 This example stems from Chiri (1956). As it is not included in the Ainu dialect dictionary or
Murasaki’s word list, I have no Sakhalin data. There is a chance that this word has a geminated
consonant (i.e. píttu) in Hokkaidō Ainu, which has not been acknowledged by Chiri: The
examples ‘salt’ and ‘pot’ are indicated as síppo and pátci in all Ainu dialects in the dialect
dictionary, but in Chiri (1956) they are given as sípo and páci. We cannot rule out the
possibility that Chiri described a different form of Ainu, but it is more likely that Chiri’s
awareness of the absence of geminated consonants in Japanese in these examples prevented
him from acknowledging the geminated consonants in Ainu.
92 (< sifwo < *sipwo) The spelling wo here expresses the kō-o of Old Japanese, and not a
semivowel. Intervocalic -ɸ- (spelled as -f- in Martin’s transcription) was replaced by -w- in
central Japan before the end of the Heian period. As early as 1206 we find the spelling hawa
(ɸawa) for haha ‘mother’ (Martin 1987:10). As the sequence wo is still allowed in Ainu (see
wosa), the word is probably a pre-Heian period borrowing.
93 I have chosen to represent ‘whip’ with the geminated form mútci that occurs in the Saru dialect,
instead of with the ungeminated form múci that occurs in Obihiro and Bihoro. This is because
in these two dialects the note (‘Japanese’) has been added in the dialect dictionary, indicating
that the informants themselves regarded the word in this form as Japanese rather than as Ainu.
94 Yakumo, Horobetsu, Saru, Obihiro, Bihoro umma, Asahikawa úma, mma, Sōya umá.
According to Nakagawa Hiroshi (1989), Dobrotvorskij’s Sakhalin Ainu material (collected
between 1867 and 1872) has: “umá, in Japanese: mma.” Pilsudski’s material (collected in
1903-1904) has: úma ‘a horse’, umá ‘also’. (As the accentual differences in Sakhalin have,
since the publication of Pilsudski’s material, been shown to be determined by the syllable
structure, Nakagawa analyses this úma as uuma.) The Raichishka entry has the note that this
word stems from the Japanese period. Earlier on a Russian word may have been used.
Analyzing Japanese loanwords in Sakhalin Ainu is particularly complicated as there are three
possibilities of derivation: (a) before the split between Sakhalin Ainu and Hokkaidō Ainu,
directly from Japanese; (b) after the split between Sakhalin Ainu and Hokkaidō Ainu, from
Hokkaidō Ainu; (c) after the split between Sakhalin Ainu and Hokkaidō Ainu (during the time
of the Japanese rule over the southern half of Sakhalin from 1905 to 1945), directly from
Japanese.
95 Although the accent of this word in isolation in Ainu is tonó, it has the accent tóno- in the
compound tónoto ‘unfiltered sake’ in Yakumo, Saru, Obihiro, Nayoro and Asahikawa. (In
Horobetsu and Sōya, on the other hand, it is tonóto.) Although the dialect dictionary has the
form tonoto for Sakhalin, according to Murasaki (1975:218) the word is toonoto in the dialect
of Raichishka. This word is thought to come from Japanese tono ‘lord’ (2.3), a respectful term
for ‘Japanese person’ in Ainu, and -to ‘milk’. (Unfiltered sake has a milky white color.) Of
course, for a compound such as tónoto to have been coined, the word *tóno must already have
been well established as a loanword in Ainu, and the accent in this word must have been shifted
to the preferred second syllable after the compound tónoto was created. I therefore reconstruct
the original accent of this loanword as *tóno in Hokkaidō Ainu. Krasheninnikov has tonò and
Klaproth/Steller has dōhnŭ.
96 This example stems from Chiri (1956). As it is not included in the Ainu dialect dictionary, I
have no Sakhalin data.
97 Saru, Sōya túci, Horobetsu tútci, Yakumo, Obihiro tucí. The correspondence between the
widely separated dialects of Saru, Sōya and Sakhalin justifies the reconstruction of a long
vowel in the initial syllable. See also section 11.6.
98 Although the Ainu word kími goes back to Japanese kimi/kibi ‘millet’, from the meaning in
Ainu it is clear that it refers to tookimi/tookibi ‘maize’ (‘Chinese (=foreign) millet’). Yakumo
has mamekími (lit. ‘bean millet’), Horobetsu and Saru have kími, Obihiro has tókimi. Sakhalin
has tookipi with intervocalic -b- rendered as -p-, which may be more like an attempt to
pronounce Japanese (the informant grew up at the time when the southern half of Sakhalin
was under Japanese administration) than a real loanword. (It has the annotation ‘new’.)
Before we address the problem of which of the two reconstructions of the Middle
Japanese tone system agrees best with the accent that these loanwords have in Ainu,
there is one more factor that we have to consider.
11.11.4 Traders from Ōsaka
In 1956 Chiri Mashiho connected the Kyōto type location of the accent in some
Japanese loanwords in Ainu with the fact that the monopoly of trade with the Ainu
was held by the so-called Ōmi shōnin, the Ōmi merchants (named after Ōmi near
Lake Biwa), who traded from Ōsaka. These speakers of a dialect with a Kyōto type
tone system conducted the trade with the Ainu with their ships, the kitamae bune,
from the 17th century on, following the Japan Sea coast on their way from Ōsaka to
Hokkaidō.
The Matsumae-han, the feudal domain that ruled over the south of Hokkaidō,
restricted contact between their subjects (who entered Hokkaidō from the
Muromachi period on, and who must have been speakers of a precursor of the
present-day Tōhoku dialect) and the Ainu. (Some Tōhoku dialect influence is
nevertheless seen in the fact that Ainu sometimes does not distinguish between i and
e in Japanese loanwords (like imó or emó ‘potato’), as is also the case in the Tōhoku
dialects.)
The linguistic influence of these trade contacts with the Kinki region should
therefore not be underestimated, as these may have formed the main source of
contact with the Japanese language for a period of almost two centuries.
11.12 Evaluating the evidence
As will appear from the present section, the complication that a significant part
of the Japanese vocabulary in Ainu may stem from traders from faraway Ōsaka,
speakers of a Kyōto type dialect, means that many conclusions that we would
otherwise have been able to draw from the accent of the Japanese loanwords in Ainu
are up in the air again.
11.12.1 The loanwords and the standard reconstruction
Of the loanwords that have accent on the second syllable in Ainu, we can say that it
is unlikely (but not impossible), that they had /HL/ tone in the type of Japanese from
which they were borrowed into Ainu. If we follow the standard theory, the fact that
nouns of class 2.2 have accent on the second syllable is therefore hard to explain,
and the accent on the second syllable of nouns of class 2.3 can only be explained if
they were early loans, i.e. loans from before the period in which (according to the
standard theory), the /L/ level tone classes developed pitch falls at the beginning of
the word. (See section 2.3.1.)
Of the loanwords that have accent on the first syllable in Ainu, we can say that it
is extremely unlikely that they had /LH/ tone in the type of Japanese from which
they were borrowed into Ainu. The accent on the initial syllable that appears in Ainu
for words of the Japanese classes 2.3 /LL/ and 3.4 /LLL/ is also hard to explain, as
words which lacked initial /H/ tone would almost certainly have resulted in accent
on the second syllable instead of the first syllable in Ainu. But words of these
classes could have been loans from a late Kyōto type dialect (like that of the Ōmi
shōnin) in which the /L/ level classes had already developed /H/ tones at the
beginning of the word. It appears therefore that classes 2.4 /LH-H/ and 2.5 /LH-L/
are the main problem: These words cannot be loans from a dialect with a tone
system like that of Middle Japanese in the standard reconstruction, nor can they be
loans from a more modern Kyōto type tone system like that of the Ōmi shōnin.
11.12.2 The loanwords and Ramsey’s reconstruction
As mentioned, of the loanwords that have accent on the second syllable in Ainu, we
can say that it is unlikely (but not impossible), that they had /HL/ tone in the type of
Japanese from which they were borrowed into Ainu. If we follow Ramsey’s theory,
the fact that nouns of class 2.4/5 and class 3.6/7 have accent on the second syllable
therefore seems hard to explain. The examples of tone classes 2.4/5 and 3.6/7 (apart
from kusuri ‘medicine’), however, have open vowels in the second syllable, and as
has been discussed in section 7.1.1, in the Gairin dialects of north-east Japan (and
Izumo) words with this segmental shape have shifted the /H/ tone one syllable
towards the right. Class 2.4/5 with an open vowel in the second syllable thus has
/ØH/ tone, and class 3.6/7 with an open vowel in the second syllable has /ØHØ/ tone.
This means that these particular examples of Kyōto-like accentuation in Ainu for
Japanese nouns of classes 2.4/5 and 3.6/7 may very well go back to the dialects of
northern Japan. (Another possibility is of course that these words were introduced by
the Ōmi shōnin.)
Of the loanwords that have accent on the first syllable in Ainu, we can say that it
is extremely unlikely that they had /LH/ tone in the type of Japanese from which
they were borrowed into Ainu. It appears therefore that if we follow Ramsey’s
theory the accent that can be found on the first syllable of words of class 2.2 /LH/ is
the main problem, but the accent on the initial syllable that appears in Ainu for
words of class 2.1 /LL/ is also hard to explain. (Just as I have done in the previous
section, I assume that a word with a level /L/ tone in the Middle Japanese tone
system would most likely have resulted in accent on the second syllable in Ainu.) In
the modern Gairin Tōkyō type dialects of northern Japan, the initial pitch of these
tone classes is still low.
Vovin (1997:117) has argued that a Kyōto type tone system must once have
existed in the dialects of the north of Japan in order to account for the Kyōto type
location of the accent in a number of Japanese loanwords in Ainu, and if it had not
been for the Ōmi shōnin he would have had a point. As it is, however, the contacts of
the Ainu with traders from the Kinki region make the conclusion that a Kyōto type
tone system must once have existed in the north of Honshū premature.
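The reasoning of sections 11.12.1 and 11.12.2 can be summarized in a short Python sketch. It is a simplification under stated assumptions: only the disyllabic noun classes are included, the tone values are those cited in this chapter for the two reconstructions, the fall of class 2.5 and the tone of following particles are ignored, and class 2.5 is treated like class 2.4 in Ramsey’s reconstruction because the two are grouped as 2.4/5 above. The names STANDARD, RAMSEY, hard_to_explain and problem_classes are hypothetical.

```python
# Word tone per class, simplified as described in the lead-in.
STANDARD = {"2.1": "HH", "2.2": "HL", "2.3": "LL", "2.4": "LH", "2.5": "LH"}
RAMSEY = {"2.1": "LL", "2.2": "LH", "2.3": "HH", "2.4": "HL", "2.5": "HL"}


def hard_to_explain(tone, ainu_accent):
    """Apply the two rules of thumb used in the text.

    Accent on the first syllable in Ainu presupposes initial /H/ in the
    donor word; accent on the second syllable is unlikely for /HL/.
    """
    if ainu_accent == 1:
        return not tone.startswith("H")
    return tone == "HL"


def problem_classes(reconstruction, ainu_accent):
    """List the tone classes whose Ainu accent the reconstruction cannot
    readily account for."""
    return sorted(cls for cls, tone in reconstruction.items()
                  if hard_to_explain(tone, ainu_accent))


for name, reconstruction in (("standard", STANDARD), ("Ramsey", RAMSEY)):
    for accent in (1, 2):
        print(name, "accent on syllable", accent, "->",
              problem_classes(reconstruction, accent))
```

The sketch deliberately leaves out the mitigating factors discussed above (late loans from a Kyōto type dialect such as that of the Ōmi shōnin, and the rightward shift of /H/ tone in the Gairin dialects), so it lists candidate problems rather than settled ones.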
11.12.3 Attempts to date the examples
Ideally, it should be possible to sift out which vocabulary stems from which period
and therefore from which source, and in this way to evaluate the information that the
Ainu data contain on the earlier tone pattern of Japanese, but unfortunately there is
no unequivocal way to decide which part of the vocabulary stems from which source.
Although Nakagawa Hiroshi has been able to find out a lot about the time of
borrowing and the probable source dialects for the loanword ‘horse’, we cannot hope
to be so lucky for many other examples.99
The initial p- in pítu ‘large lidded box’, perá ‘rice paddle’, pátci ‘pot’, pitó
‘person’, poné ‘bone’, pisáku ‘ladle’ and pákari ‘to measure’ tells us that these
words must have been earlier loans than hatá ‘flag’ and hasámi ‘scissors’. But Ainu
p- can go back to p- as well as ɸ- (which is why púri ‘custom’, pukúru ‘bag’ and
putá ‘lid’ can easily be modern loans).
99 The horse was introduced in the area of Hokkaidō that was controlled by the Matsumae-han
in 1615, but it was only in 1789 that horses were introduced in the Ainu (Ezo) area of east
Hokkaidō. In 1807 they were introduced in west Hokkaidō. (This late introduction is
illustrated by the fact that, unlike other animals, the horse never figures as a god in the Ainu
epics.) Nakagawa thinks that the difference in accentuation between the eastern dialects
(úmma, mma and úma) and the western dialects (represented by Sōya alone with umá) can
be explained by the fact that the bakufu took over direct control of Hokkaidō from the
Matsumae-han in 1799. They delegated the defense to the feudal domains of Nambu and
Tsugaru, and from that time on the Ainu had more contact with bakufu officials and samurai
from northeast Honshū than with traders from the Kinki area. In other words: According to
Nakagawa, in east Hokkaidō the Ainu took over the word for ‘horse’ from speakers of a
Kyōto type dialect (úmma), while in west Hokkaidō the Ainu took over the word for ‘horse’
from speakers of Tōkyō type dialects (Tōkyō: uma', Tōhoku: mma'). The form úma in
Asahikawa would be the result of contamination.
The labiality of ɸ- is thought to have begun eroding at the end of the Muromachi
period (± 1600). The Japanese-Portuguese dictionary of 1603 wrote f- in front of all
vowels, and in 1632, in Portuguese transcriptions of Japanese, h- is still spelled as f-
at least before -e, -i and -ya (Martin 1987:11). We therefore cannot simply conclude
that loanwords with initial p- before other vowels than -u in Ainu are too old to stem
from the Ōmi shōnin. (Besides, ɸa- and ɸe- reportedly still occur in dialects in
northern Honshū.)
Wósa (2.4), which at first sight strikes us as archaic, could stem from the period
of the Ōmi shōnin as well, as far as the segmental shape is concerned, as o and wo
initially fell together as wo (the delabialized o of today probably developed only in
the 18th century). But wósa may of course be much older, and its accent is a strong
indication that it does not stem from the Ōmi shōnin. Perhaps this word can count as
confirmation of Ramsey’s reconstruction of /HL/ tone for tone class 2.4 in Middle
Japanese. The alternative possibility would be that it is a loan from the areas with a
Gairin A tone system in the Tōhoku region. (In the Gairin B tone system, words of
this segmental shape (with open vowels in the second syllable) have the /H/ tone on
the second syllable.)
Finally, there are a number of loanwords that have been attested in the dialect of
the island of Shumshu in the early 18th century. Even these loanwords, however, may
stem from the Ōmi shōnin: Retainers of the Matsumae-han were given rights to trade
with the Kuril Ainu in the 17th century and in due course these trading rights were
subcontracted to the Ōmi shōnin.
(39) Examples of Japanese loanwords in Kuril Ainu
                                   Krasheninnikov    Klaproth/Steller
2.1    sake ‘sake’            >    x                 sākў
2.1    kane ‘metal’           >    kaanì             gânäh
2.3    sifwo/sipwo ‘salt’     >    sippù             šîpŭnŭă (‘salty’)
2.3    tono ‘lord’            >    tonò              dōhnŭ (‘judge’)
2.3    pone ‘bone’            >    x                 pŏŏnh (and maybe pōhnĕ)
2.4    ita ‘board’            >    x                 ita
The only example that we can be absolutely certain is very old, too old to
stem from the Ōmi shōnin, on the grounds of its segmental shape in Ainu (and
perhaps also on the grounds of its meaning), is ‘salt’. The accent of this word in
Ainu agrees with Ramsey’s reconstruction of the Middle Japanese tones as /HH/. In
the standard reconstruction, on the other hand, the tone of this word would have been
/LL/. (It cannot have been a loan from the period when – according to the standard
theory – tone class 2.3 developed /H/ tone at the beginning of the word, /LL/ > /HL/.
This is thought to have happened only in the Muromachi period.) This leaves me
with exactly one truly valid example in favor of Ramsey’s theory.
The discouraging conclusion at the end of this lengthy investigation has to be
that the Japanese loanwords in Ainu can tell us close to nothing about the original
tones of Japanese. But can they tell us something else?
11.12.4 The origin of the two different segmental shapes for loanwords
with accent on the initial syllable
Loanwords with accent on the initial syllable in Ainu can have two segmental
shapes: They either have long vowels in the initial syllable or the initial syllable is
closed. To me the most interesting aspect of the initially accented loanwords in Ainu
is perhaps not so much their pitch as such, but the choice between these two
available options, i.e. the choice between geminated consonants or long vowels.
One thing we know about the group of Japanese loanwords in Ainu that have
accent on the first syllable is that the tone in the Japanese dialect from which they
were adopted was almost certainly /HL/ or /HH/. (Moreover, for words with /HH/
tone to have been adopted with initial accent in Ainu, the tones must have been
audibly [H], i.e. these loans must stem from the time when there still was a contrast
between /HH/ and /LL/ in Japanese.) A thorough examination, taking the many
possibilities of origin and time of borrowing into account, is complicated. As far as I
have been able to ascertain, such a comparison shows no connection between
possible differences in tone in the Japanese donor word (i.e. the difference between
[HH] and [HL] pitch) and a preference for one of the two possible segmental shapes.
The idea that the long vowels may go back to long vowels in proto-Japanese (and
the geminates to short vowels, for instance) is of course interesting, but will remain
no more than a wild guess, as there is no way to prove it. As we have seen at the end
of chapter 9, Vovin was not able to obtain confirmation of the vowel length that he
reconstructed in Japanese loanwords in Ainu with cognate examples of vowel length
in Okinawa.
Another possibility is that the outcome was at least partly determined by whether
the initial consonant of the second syllable was a stop or a continuant. There
definitely seems to be a tendency for stops to develop into geminates (sippo, patci,
mutci, tutci), while continuants do not (kaani, puuri, wosa, kimi, uuma, toono).
There are, however, examples of stops that did not develop into geminates, such as
paakari, tuuki, tuuci (which occurs as both túci and tútci in Hokkaidō) and possibly
pitu.100
If the lack of geminated continuants in the examples above is not a coincidence –
the number of examples of initially accented loanwords is after all limited – this
would mean that Japanese words with /HH/ or /HL/ tone, in which the initial
consonant of the second syllable was a stop had two alternative segmental shapes
open to them in Ainu, while words in which the initial consonant of the second
syllable was a continuant could take only one segmental shape. I have no
explanation for this distribution; the geminated continuants -mm-, -nn- and -ss-101
are not uncommon in native Ainu words, so why would they have been avoided in
loanwords from Japanese?
100 I will not adduce umma as a counterexample to the tendency, as this word may very well
have adopted the geminated -m- straight from the Tōhoku dialect mma ‘horse’.
In cases where the initial consonant of the second syllable was a stop, the split
into two different segmental shapes must be due to the fact that at the time of
borrowing of these loanwords, there was no perfect fit between the available
Japanese and Ainu prosodic and segmental shapes.
11.12.5 The CVCCV shaped loanwords as evidence for Hattori’s reconstruction
of proto-Ainu vowel length
In modern Hokkaidō Ainu, there is no reason to adopt a Japanese loanword which
has /H/ tone on the initial syllable as anything other than CV́CV, at least as long as
the intervocalic consonant is voiceless. What does it mean then, when we see that
Ainu changed such words into CV́CCV in the case of sippo ‘salt’, patci ‘pot’, mútci
‘whip’ and tútci (but also túci/tuuci) ‘hammer’?
As I have mentioned, I think this means that at the time of borrowing no perfect
fit for Japanese loanwords with initial /H/ tone and a CVCV structure existed in
Ainu. The most likely cause for this is something that had to be assumed on other
grounds as well, namely the idea that distinctive vowel length in the first syllable
(accompanied by automatic [H] pitch, similar to what we still see in Sakhalin Ainu)
once existed in Hokkaidō Ainu.102
If such a system existed at the time when these words were adopted into the
language, speakers of Hokkaidō Ainu had to choose between a CVVCV or a
CVCCV structure if they wanted the word to have initial [H] pitch, neither
of which is an exact fit for the Japanese CVCV shape.
For loanwords with initial /H/ tone in Japanese that were introduced after the
shift from distinctive vowel length to distinctive pitch-accent in Hokkaidō, it is
unlikely that the form CV́CCV would have been chosen, as the modern Hokkaidō
CV́CV shape is a perfect fit.
Nakagawa (1998) argues exactly the other way around. He thinks that the forms
without geminated consonants are older, and must date from the time when long
vowels in the initial syllable had not yet changed to initial accent in Hokkaidō. In
Nakagawa’s view, it was only after the long vowels had disappeared that geminated
consonants had to be created in order to keep accent on the first syllable.
101 As Ainu has the rule /r/ → /n/ before /r/, *-rr- does not occur.
102 The fact that no perfect fit for Japanese loanwords with initial /H/ tone and a CVCV structure
existed in Ainu can also mean something else: Although Hattori initially assumed that the
pitch of CVVCV shaped words in proto-Ainu had been *CVVCV (1960), he later changed
his reconstruction to *CVVCV (1967). It will be clear that a Japanese loanword with initial
/H/ tone would not have taken Hattori’s reconstructed *CVVCV shape, but would have
taken a *CVCCV segmental shape instead. Loanwords with a CVVCV segmental shape
would be loans from after the change of *CVVCV to *CVVCV (the form we still find in
Sakhalin). I do not prefer this explanation, as Hattori’s *CVVCV reconstruction is
hypothetical and cannot be confirmed.
I do not agree with Nakagawa’s idea, as I do not see why a Japanese loanword
with /HL/ tone could not have been adopted as CV́CV in Hokkaidō Ainu. I suppose
Nakagawa is influenced by the fact that if one follows the standard theory, nouns of
tone class 2.3, like ‘salt’, ‘pot’ and ‘whip’ that have initial accent and geminated
consonants in Ainu, cannot be old loans. After all, according to the standard theory,
the /H/ tone on the initial syllable of these words in Japanese only developed in the
Muromachi period.
We have seen from the material in Moshiogusa (1792) that distinctive vowel
length could still be found in Hokkaidō Ainu until the end of the 18th century, which
means that we cannot say that loanwords with geminated consonants must date from
before the time of the Ōmi shōnin.103 This is truly unfortunate, as it again leaves us
with empty hands if we try to use these loanwords as an instrument for the
determination of the history of the Japanese tone system.
Although I think that CV́CCV shaped loanwords are most likely relatively old
(meaning that they probably date from before the end of the 18th century), this does
not necessarily mean that CV́CV shaped loanwords with accent on the first syllable
are relatively new. They can be, but they can just as well have been borrowed as
CVVCV, and only later developed into the modern CV́CV shape. (And a number
of these originally CVVCV > CV́CV loans may at some point have shifted the
accent to the preferred second syllable, as I assume happened in the case of tonó
‘Japanese person/government official’.)
I have no ready explanation for the fact that some loanwords ended up with one
segmental shape and other loanwords with the other. The forms with the geminated
consonants, however, serve to confirm Hattori’s reconstruction of distinctive vowel
length rather than distinctive pitch-accent in proto-Ainu.
11.12.6 The special case of pasúy, kamúy and múy
In Hokkaidō Ainu the word pasúy can be found in the compounds ipépasuy
‘chopsticks’ (‘pasúy for eating’) and ikúpasuy ‘pasúy for drinking alcohol’ (or
kamúynomi pasúy ‘pasúy for festivals/rites’). When used independently the word
pasuy refers to the ikúpasuy, a spatula shaped utensil used in ceremonies to scoop up
and scatter small amounts of sake as a libation.104
In Kuril Ainu the word pasuy referred to the libation wand as well, and not to
chopsticks. ‘Spoon’ is for instance given as the translation for pasùi by
Krasheninnikov (1738), for pāŝuig by Klaproth/Steller (1743/1823), for pashui by
Torii and for pasiu by Dybowski. (Dybowski’s pasiu may be a miscopy of pasui. In
the Ainu dialects, however, the endings -uy and -iw often occur for one and the same
word, so that the interpretation pasiw is also a possibility.)
103 The same is true for Kuril Ainu where vowel length is clearly attested in Krasheninnikov’s
material as late as 1739.
104 The ikúpasuy is apparently also used to lift up the moustache of the men when the remaining
liquor is consumed, which is why the implement has sometimes been called a ‘moustache
lifter’. This word is nowadays considered politically incorrect, and the term ‘libation
wand’ is used instead (Nakagawa, 2007:14).
Torii Ryūzō’s Kuril Ainu material includes a word for ‘chopsticks’ as well,
euturumbe, which can be analyzed as ‘a pair of opposing things’. In Sakhalin the
word pasuy does not occur at all. Instead, the word for ‘chopsticks’ is the unrelated sahka, which has
cognates in languages such as Nivkh, Hezhen and Udege (Nakagawa, 2007).
Nakagawa therefore argues that chopsticks were not introduced to the Sakhalin Ainu
from Japan but by way of the trade contacts that they maintained with the continent.
For ‘libation wand’ the term ikuunis is used, which means ‘stick for drinking
alcohol’.
Nakagawa argues for derivation of the Ainu word from Japanese during the
period of the Satsumon culture (8th to 14th century) in Hokkaidō. According to
Nakagawa, the Ainu ritual in which the libation wand was used originated in similar
libation rituals used by the peoples of northeast Siberia and Sakhalin, although the
use of a libation wand is unique to the Ainu. He considers the use of the wand in
Ainu rituals as older than the introduction of the Japanese term for the object in
question. Based on Sakhalin Ainu ikuunis, he reconstructs the word for libation wand
in proto-Ainu as *ikuunit.
Nakagawa further points out that during the Satsumon period a whole range of
lacquerware implements were imported from Japan to find their exclusive use as
sacred objects in Ainu rituals. With the incorporation of these foreign imports, the
Ainu rituals were fundamentally rearranged. Nakagawa suggests that the new term
pasuy may have replaced earlier *ikuunit, partly because *ikuunit was now
associated with rituals in the older style.105 The term must have been transmitted
with the other implements and the rituals to the Kuril islands after this time.
The replacement of the term *ikuunit with pasuy or *ikuupasuy resulted in the
split of the term pasúy into modern ikúpasuy and ipépasuy in Hokkaidō Ainu. (An
ordinary ‘spoon’ in Hokkaidō Ainu is now parápasuy or ‘broad pasúy’ and in
Yakumo perapásuy, a compound of Japanese hera ‘rice paddle’ and pasúy.)
I regard the derivation of Ainu ‘libation wand’ from Japanese ‘chopsticks’
(rather than the other way around) as likely, as there appears to be a Japanese
etymology for ‘chopsticks’, whereas there is no internal etymology for ‘libation
wand’ in Ainu.106
105 Another reason for the adoption of the Japanese term for ‘chopsticks’ for the Ainu libation
wand has been seen in the fact that the chopsticks introduced in Japan were most likely of the
archaic folding type (Nakagawa, 2007:21). When such chopsticks are opened up, they
resemble Ainu libation wands, but Nakagawa points out that the type of chopsticks that have
been excavated from various archaeological sites from the Satsumon period in Hokkaidō are
all of the separated type.
106 I consider the Japanese word for ‘beak’ (cf. kutibasi ‘beak’ and tori no hasi ‘beak of a bird’)
a probable etymology for ‘chopsticks’. See section 5.8.2 (footnote 38).
In the case of libation wand/chopsticks, the Ainu diphthong -uy corresponds to
Japanese i. There are two other examples that show a similar correspondence,
namely Ainu kamúy and Japanese kami ‘god’ (tone class 2.3) and Ainu múy and
Japanese mi ‘winnow’ (tone class 1.3).
In the 8th century Man’yōgana spelling system, both of these words were spelled
with the so-called otsu i (iy), and so it seems that the correspondence can be
narrowed down to a correspondence between the Ainu diphthong -uy and Japanese
otsu i (iy).
The 8th century dialect of central Japan no longer preserved the kō/otsu
distinction after s, but it is by no means certain that this was also the case in the
dialect from which the word pasuy was adopted into Ainu. Based on these two
correspondences, hasi ‘chopsticks’ may therefore have to be reconstructed as *pasiy
in proto-Japanese.
The fact that Ainu pasúy is most likely a loanword from Japanese, does not
automatically imply that kamúy and múy are also loanwords from Japanese. It is for
instance possible that Ainu múy was adopted into Japanese to indicate a specific
indigenous type of winnow. (Archaeological evidence indicates that agriculture was
already practiced in Japan in the Jōmon period.)
As to kamúy, however, we have seen that in words of this segmental structure the
second syllable was most likely automatically accented in proto-Ainu. It could be
argued that if this word were a loanword from Ainu into Japanese, it would have
been adopted with the tone of class 2.2 instead of 2.3. Seeing also that the Ainu
word for ‘man’ as opposed to ‘god’ is pitó (from Japanese hito ‘person’), it seems
more likely that ‘man’ and ‘god’ were adopted from Japanese into Ainu as a pair, as
terms used in a religious context.
11.13 Vovin’s reconstruction of proto-Ainu consonant clusters
As I have outlined above, even between Hokkaidō and Sakhalin, the Ainu dialects
do not differ profoundly from each other, and unfortunately it is not possible to
reconstruct many new phonemes based on the modern dialect correspondences when
there is so little divergence. (I do not agree for instance with Vovin’s reconstruction
of proto-Ainu *H-.) Vovin nevertheless reconstructs a number of proto-Ainu
consonant clusters based on the unusual consonantism that can be found sporadically
in older sources like Klaproth’s Asia Polyglotta (1823) or the Kuril Ainu data
collected by 18th and 19th century travelers. A problem, however, is that it is not clear
how reliable these vocabularies are.
Vovin’s *hd cluster for instance is based on no more than the following entries
from Klaproth (1823) and Krasheninnikov’s Kuril Ainu vocabularies (1738):
– *hdan ‘ten’ is based on îhgŭœn ‘six’ (= ‘four-ten’) in Klaproth (= [igwan]?).
Hokkaidō Ainu: iwán-pe, Krasheninnikov ivàn.
– *ihdagu ‘sulphur’ is based on ĭgŭăkh in Klaproth (= [igwax]?). Hokkaidō Ainu
(Batchelor’s dictionary): iwau.
– *hdEn ‘bad’ is based on sirugèn ‘rain’ (= ‘weather is bad’) in Krasheninnikov.
Hokkaidō Ainu: sír wén. Torii Ryūzō’s Kuril Ainu material has shiriwin and
Klaproth has šŷrǔўhn (= [sírwin]?).
In all other sources the reflex is w. Even in Krasheninnikov’s own wordlist, only
five words above the entry ‘rain’ on which Vovin’s reconstruction of *hd for ‘bad’ is
based, we find uín-kamuj ‘devil’ (= bad god).107 Hokkaidō Ainu: wénkamuy. This is a clear
attestation of ‘bad’ with initial w- in Krasheninnikov’s material. Vovin, however,
does not quote this example from Krasheninnikov, although he does acknowledge
the word ‘devil’ as an attestation of ‘bad’, since he quotes vyn-kamuj from Dybowski
and Voznesenskij.
40 Comparison of the occurrence of w and gw in the older Kuril Ainu materials
Gloss      Krasheninnikov        Klaproth/Steller    Hokkaidō
‘ten’      ivàn                  îhgŭœn              iwán-pe
‘sulphur’  x                     ĭgŭăkh              iwau
‘bad’      sirugèn (‘rain’)      šŷrǔўhn (‘rain’)    sír wén (‘rain’)
           uín-kamuj (‘devil’)   x                   wénkamuy (‘devil’)
There is no reason why w and gw could not be cognates (as for instance in war and
guerra), and so I see no reason to reconstruct different phonemes in the proto
language, especially as in Klaproth/Steller’s material [gw] appears to be no more
than an intervocalic allophone of /w/. In Krasheninnikov’s material, fluctuation
between g and w is attested in one and the same word (‘bad’) and if it had not been
for ivàn (which distorts the pattern), we would have been able to conclude that [g]
was no more than an intervocalic allophone of /w/ in this material as well.
These entries in Krasheninnikov and Klaproth/Steller are a good illustration of
how uncertain the reliability and the correct interpretation of these vocabularies are.
107 According to Murayama (1971), Krasheninnikov first created his Latin-Kuril vocabulary, in
which the Ainu language is transcribed with Roman letters, which lay hidden in the archives
of the Academy of Sciences in Leningrad until 1968. (All my quotations are from this
vocabulary.) The collection of Kuril Ainu words included in his Opisanie Zemli Kamchatki
(St. Petersburg 1755-1756) on the other hand (written in Cyrillic script), is famous, but this
collection of words has been regarded as Kamchatka Ainu for a long time. Murayama argues
that the version in Roman script must be the original version and the Cyrillic version a copy,
because both versions contain a mistake: keerà instead of reerà ‘wind’, which can be
better explained as a miscopy from notes made in Roman script (K < R) than in Cyrillic script
(К<Р). The version in Roman script has sirugèn and uìn-kamui. The version in Cyrillic script,
which Vovin used, has г with a dash over it instead of g, й instead of i, and acute accents
instead of grave accents.
There are many unusual spellings of Ainu words in old material, but one has to be
careful not to jump to conclusions as to their interpretation.108
Based on the already mentioned attestations of ihguœn ‘four-ten’ and sirugèn
‘weather is bad’, Vovin eliminates the phoneme /w/ from proto-Ainu altogether. Of
the five initial *hd- examples in Vovin’s wordlist (p.91) only ‘ten’ and ‘bad’ have
attestations that include consonants other than w-. In the case of *hdak=ka ‘water’,
*hdatara ‘stone’ and *hdOO ‘span of the thumb and first finger’ all attestations
without exception are with initial w-.
Although Dobrotvorskij and Piłsudski indicated that tr- was a variant of initial r-
in the pronunciation of Sakhalin, Vovin’s proto-Ainu *tr- cluster in ‘road’, ‘feather’,
‘beard’ and ‘high’ is nevertheless based on t- reflexes in Nairo combined with tr-
and/or r- reflexes in Dobrotvorskij’s Sakhalin Ainu (1875), Voznesenskij’s Kuril
Ainu or Krasheninnikov’s Kuril Ainu. I think that these variants (t- or tr-) in the
pronunciation of initial r- in Sakhalin and Kuril Ainu all go back to an original r-, as
there is no discernible pattern in the distribution of the different forms, and variation
occurs even within these dialects. (See also the earlier discussion of initial t- reflexes
in Sakhalin in section 11.2.) The consonant r- apparently has considerable variation
in its phonetic realization: Even in Hokkaidō Ainu in older material like Moshiogusa
and Ezo-kotoba irohabiki, before the vowel -e we sometimes find initial t- or d-
where we would expect r- (Satō 1995:11-12). Vovin, however, bases the
reconstruction of three separate proto-Ainu initials, *r-, *tr- and *d- on these
variants.
One of Vovin’s consonant clusters, however, the proto-Ainu initial cluster *pr-, is
based on modern dialect data, i.e. on the apparent sound correspondence between
dialects with initial c- and dialects with initial p-.
In Ainu, there are words that have initial c- in all dialects such as cari ‘to scatter’
and there are words that have initial p- in all dialects such as para ‘wide’, but there
is a difference in the reflexes of the word ‘mouth’ between western dialects such as
Yakumo, Horobetsu, Saru and Asahikawa that have par, and eastern dialects such as
Obihiro, Bihoro, Nayoro, and Sōya that have car. (Sakhalin and Kuril Ainu also
have c-.)
108 As an illustration of how complicated the interpretation of these kinds of materials can be, I
give the example of Ezo-go (1850) by Matsuura Takeshirō. In this material we find the
spellings seroke for heróki ‘herring’, sekati for hekáci ‘child’ and semui for hemóy ‘trout’.
The use of s- instead of h- before -e was explained by Kindaichi Kyōsuke (1938) as influence
from the dialect of Aomori on the compiler of the glossary, as in the dialect of Aomori, both
/se/ and /he/ are pronounced as [˛e]. Satō (1990:159) on the other hand, finds this theory
questionable, as Matsuura was born in Mie-ken, and lived there until he was a young man. He
therefore thinks that Matsuura took these three words over from some other source. Most of
the vocabulary in Ezo-go is clearly based on the famous dictionary Moshiogusa by Uehara
Kumajirō but it also includes material collected by Matsuura himself, which shows many
characteristics of the dialects of northern Hokkaidō and Sakhalin, where he traveled. Satō
therefore also mentions the comment of Tamura Suzuko, who pointed out that when accent in
Sakhalin falls on the second syllable, the vowels in the first syllable are devoiced, and that
due to this devoicing it is, according to her, quite suitable to record word-initial he with the
kana セ se instead of ヘ he (Satō, 1990:160).
Kirikae Hideo (1994) has argued that the correspondence of p- and c- is limited
to this word (and words that are derived from the word ‘mouth’ such as ‘to smile’,
‘to mutter’, ‘glutton’, ‘to be astringent’, ‘to tell, to teach’, ‘light, tasteless’ etc.),109
and that the two forms go back to two different competing words in the proto
language. Three of Vovin’s six examples (‘mouth’, ‘to tell, to teach’ and ‘light,
tasteless’) are clearly based on ‘mouth’ and therefore should be treated as only one
example. Furthermore I do not see pok and corpok ‘under, below’ (Vovin’s only
example of proto-Ainu *pr- before a vowel other than -a) as a meaningful set of
cognates. Most dialects have the form pok as well as corpok and would therefore be
c- and p- dialects at the same time, but most importantly: The p- of pok remains a p-
in corpok and does not change to c-. The only attestation of chok is in Batchelor’s
(1889/1938) dictionary, but the reliability of this dictionary is disputed110 and the
existence of this form cannot be confirmed in any of the dialects included in the
Ainu dialect dictionary.
The example ‘to run’ in Vovin’s list (Horobetsu pas, Bihoro cas, Asahikawa
ikaopas ‘to run to the rescue’, Raichishka cas, Kuril chasi, chase, Batchelor chash,
pash) is the only ca-/pa- correspondence that – as far as I can tell – is not related
etymologically to ‘mouth’.
It is possible that par and car are cognate words and not competing words in the
proto language. (The example where Yakumo has preserved accent on the second
syllable because of earlier vowel length distinctions in ‘teacher’ (Sakhalin:
icaakasnokur, Yakumo: ipákasnokur) does look remarkably like a cognate
relationship.) But the problem remains that only two serious examples of pa-/ca-
correspondence (‘mouth’ and its derivatives and ‘to run’) are a shaky basis for a
sound correspondence, making Vovin’s reconstruction of a consonant cluster *pr- in
the proto language questionable.
Finally, the initial consonant clusters in proto-Ainu must have formed part of a
phonological system that allowed such clusters. As the reconstruction of one
consonant cluster in proto-Ainu is in dispute because we cannot be certain whether
we are dealing with a sound correspondence or something else, and as the other
reconstructions are based on no more than one unusual consonant attestation (or
even none) per example, in material of which the reliability and the interpretation are
uncertain, the reconstruction of initial consonant clusters as such in proto-Ainu is
questionable.
109 The only word derived from ‘mouth’ that only has a p- reflex is parunpe ‘tongue’ (from par
‘mouth’ un ‘to exist in something’ pe ‘thing’). According to Hattori (1960:64 and 1964:27) it
is a newly coined word from the p- area that replaced an older form that must have resembled
aw, awé-he (Sōya and Sakhalin) or aukH (Torii’s Kuril Ainu), possibly due to some taboo.
(Krasheninnikov’s Kuril Ainu has áchu, Klaproth/Steller has aūch, Dybowski has au.)
110 See H. A. Dettmer (1985). Chiri Mashiho’s judgment as well is surprisingly harsh: “Contrary
to the trust that is generally put into it, I have never seen a dictionary with so many flaws.
Rather than to say that it has many flaws, it would be closer to the truth to say that it is
entirely made up of flaws” (1956:237).
11.14 Conclusion
All things considered, my impression is that the differences between the Ainu
dialects are only minor, and that it is not possible to base a reconstruction of proto-
Ainu on them that differs fundamentally from the modern Ainu dialects. In this
respect my position differs profoundly from that which is taken by Vovin. The most
promising tool for a reconstruction of proto-Ainu turns out to be internal
reconstruction.
I find no evidence, for instance, in the accentual correspondences between the
different modern Hokkaidō Ainu dialects for the reconstruction of proto-Ainu as a
tone language, or even as a language in which the location of (pitch) accent was
distinctive.
In my opinion therefore, the accent of Japanese loanwords in Ainu contains
hardly any information at all on the pitches these words must have had in the type of
Japanese from which they were borrowed. The accent of Japanese loanwords in
Ainu cannot be used to confirm or refute the standard reconstruction of the Middle
Japanese tone system. Fortunately, however, the loanwords do contain clues as to
historical developments in Ainu.
II The introduction and adaptation
of the Middle Chinese tones in Japan
Introduction
0.1 Ramsey’s theory and the evidence from the modern dialects
In part I of this study, I have discussed a number of theories on the historical
development of Japanese tone. I have concentrated on the two theories that stand in
the most direct opposition to each other: The standard theory, proposed by Kindaichi
Haruhiko, and the theory proposed by S. R. Ramsey. Ramsey’s theory constitutes
the most fundamental challenge to the standard theory, as it questions the accepted
interpretation of the written record. The two theories are based on a fundamentally
different reconstruction of the value of the tone dots that were used in Middle
Japanese manuscripts to mark the pitches of Japanese.
My conclusion has been that the evidence from the modern dialects supports
Ramsey’s theory in every instance. To name just a few of the points discussed in the
previous chapters: The leftward tone shift that Ramsey reconstructed in the Kyōto
type dialects is able to explain the geographical distribution of the Tōkyō type and
Kyōto type tone systems vis-à-vis each other, and the fact that remnants of a Tōkyō
type location of the /H/ tone have been preserved in the Kyōto type dialects in
morphologically complex environments. The proto-Japanese tone system that
follows from Ramsey’s interpretation of the Middle Japanese tone dots offers a
unified explanation (related to the tone of enclitic case particles) for the merger
patterns that later developed in the three subtypes of the Tōkyō type tone system
(Nairin, Chūrin and Gairin) and for the /H/ tone loss that is found in certain tone
classes before the particle no in both the Tōkyō type and the Kyōto type dialects.
Ramsey’s reconstruction furthermore explains why the word-tones in many dialects
of the Ryūkyūs still show a remarkable resemblance to the pitches of a (Gairin)
Tōkyō type tone system.
0.2 Ramsey’s theory and Late Middle Chinese tone,
Japanese philology and the Buddhist shōmyō tradition
Rejection of Ramsey’s theory is to a large extent based on other considerations.
It is, for instance, thought that the use of the shang or ‘rising’ tone to mark Japanese
/L/ tone, and the use of the ping or ‘level’ tone to mark Japanese /H/ tone, are
unnatural and in contradiction with the (reconstructed) tone system of Late Middle
Chinese.
Even more importantly, it is thought that Ramsey’s interpretation of the value of
the tone dots is contradicted by historical descriptions of the Late Middle Chinese
tones in Japan. In other words, the idea is that Ramsey’s theory cannot be brought
into agreement with Japanese philology.
Also, the value that Ramsey reconstructed for the Late Middle Chinese tones in
Japan does not agree with the value that the tones have in contemporary Japanese
Buddhist chant, which is thought to go back in an uninterrupted tradition to the 9th
century. Finally, it is often thought that the musical notation systems used in
Buddhist vocal chant – even in the earliest times – marked the tones of Japanese in
an unambiguous way, and that old material of this type contradicts Ramsey’s theory.
The second part of this study therefore examines the nature of the Late Middle
Chinese tone system, the circumstances surrounding its adoption in Japan, and the
way in which it has been described and discussed in Japan in different historical
periods by (mainly) Buddhist scholars. This last issue especially, had a direct impact
on the way in which the tones came to be viewed in the Buddhist chanting tradition,
and on the way in which musical notation marks were used to mark the pitches of
Japanese.
1 The history of Middle Chinese
1.1 The different varieties of speech that functioned
as the Chinese standard language
In order to understand the origin of the different character reading traditions that
developed in Japan, it is necessary to look at the language varieties that functioned
as standard language in China in the period when the main character reading systems
were transmitted to Japan. This period stretches from the 6th to about the 9th century.
The language of this period in China is called Middle Chinese. (Sporadic borrowing
of later forms did occur, but not on a scale large enough to result in new character
reading traditions.) Middle Chinese is the language of the Sui 隋 (581–618), Tang
唐 (618–907) and early Song 宋 (960–1279) dynasties and it is divided into two
distinct stages, Early Middle Chinese and Late Middle Chinese.
1.1.1 Early Middle Chinese
Early Middle Chinese (often abbreviated as EMC) is the language of the Qieyun 切
韻 rhyme dictionary (yunshu 韻書) of 601. The Qieyun was compiled shortly after
the founding of the Sui dynasty (581–618), which reunited the country after the
Nanbeichao period of division. According to the preface, the Qieyun presents the
results of a series of discussions on phonology started some twenty years earlier by a
group of scholars who gathered at the house of Lu Fayan 陸法言, the final compiler.
There were many editions, the final one being the Guangyun 広韻 of the year 1008.
Chang’an 長安 (now Xi’an 西安) was the capital of the Sui and Tang dynasties,
and since the Qieyun was written in the Sui dynasty, it may seem logical that the
Qieyun authors would have taken this dialect as their standard. However, the Sui
dynasty actually reunited China only in 589, after the time when – according to the
Qieyun preface – the Qieyun authors were beginning their phonological discussions.
The dialect of Chang’an may have enjoyed less prestige at the time than the
dialects of other major cultural centers like Luoyang 洛陽 in the north and Jinling 金
陵 (modern Nanjing) in the south.
The literary standard was predominantly established by the southern and eastern
literati and aristocracy, who had moved to the new capital after the Sui reunification,
while the local dialect continued to be spoken by the majority of the inhabitants. The
preface to the Qieyun strongly suggests that the intention of the authors was to
establish a standard of correct speech common to the educated classes of both north
and south; a compromise between the literary pronunciation of the two regions in the
6th century. The most important component was that of the southern dynastic capital
of Jinling, which (until the reunification of China under the Sui dynasty) was the
undisputed cultural centre of China.
There is evidence that at least to the end of the 7th century a somewhat evolved
form of Early Middle Chinese, and not the Chang’an dialect, remained dominant at
the Tang court. This can be seen from the rhyming of court poets and from the
survival of pre-Tang norms in Buddhist transcription practice.
1.1.2 Varieties of Early Middle Chinese
Yan Zhitui 顔之推, one of the scholars who cooperated with Lu Fayan, describes
some of the differences between the two regions. Other evidence that confirms and
supplements his remarks can be found in Buddhist transcriptions from the north and
the south, and from fanqie emanating from the south in the 6th century like those of
the original Yupian 玉篇 compiled in 543 (Pulleyblank, 1984:131).1
The original Yupian went through various abridgements and revisions, which
often altered the original fanqie spellings. Of the original version only fragments
remain and the currently available version of the Yupian is not a reliable guide to
Early Middle Chinese phonology. However, Kūkai 空海 (774–835) the founder of
the Shingon school, who went to China in 804, used the original Yupian as the basis
for his character dictionary Tenrei banshō myōgi 篆隷萬象名義 and through his
work the original fanqie of the Yupian can be recovered.
1.1.3 Late Middle Chinese
Late Middle Chinese (often abbreviated as LMC) is the standard language of the late
Tang (618–907) and the early Song (960–1279) dynasty, based on the dialect of the
Sui-Tang capital Chang’an. It begins to appear around the 7th century and was well
established in the 8th century.
Dictionaries incorporating the new standard were compiled, but none has
survived. Nevertheless, there is even better evidence for its phonological categories
than for Early Middle Chinese. One source of information is formed by the fanqie
spellings based on such dictionaries (like the Yunying 韻英 of around 750), that can
be found in the great compendium of glosses on the Buddhist canon, Yiqiejing yinyi
一切経音義. This work by the monk Huilin 慧琳 (completed at around 755), shows
a pattern of distinctions that is quite different from the Qieyun, and essentially the
same as that of the so-called rhyme tables, which developed in Buddhist circles in
late Tang times.
1 In works like the Qieyun and Yupian a system called fanqie (反切), was used as a method of
‘spelling’ the pronunciation of the characters. One character of which the pronunciation was
well known was used to represent the initial consonant (shengmu 声母), while another well-
known character was used to represent the rest of the syllable and its tone (yunmu 韻母). The
origin of the fanqie method is connected with Chinese Buddhist study of Sanskrit. The division
into an initial and a final was inspired by the distinction between the taimon 体文 (consonants)
and mada 摩多 (vowels) of the Sanskrit Siddham script.
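The fanqie principle described in this note can be illustrated with a minimal Python sketch. It uses the well-known Guangyun spelling 東 = 德紅切. The romanized readings in the toy table are schematic placeholders chosen for illustration only (an assumption, not reconstructions taken from this study), and the names Reading, TOY_READINGS and fanqie are hypothetical.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    initial: str  # shengmu: the initial consonant
    final: str    # yunmu: the rest of the syllable
    tone: str     # tone category, carried by the final


# Hypothetical, schematic readings for illustration only.
TOY_READINGS = {
    "德": Reading("t", "ok", "ru"),
    "紅": Reading("h", "ung", "ping"),
}


def fanqie(upper: str, lower: str, readings=TOY_READINGS) -> Reading:
    """Combine the initial of the upper speller with the final and tone of the lower speller."""
    return Reading(
        initial=readings[upper].initial,
        final=readings[lower].final,
        tone=readings[lower].tone,
    )


# 東 is spelled 德紅切: initial of 德 plus final and tone of 紅.
print(fanqie("德", "紅"))
```

The sketch only shows the division of labor between the two spelling characters; it says nothing about how the actual sound values should be reconstructed.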
These rhyme tables (dengyuntu 等韻図) form the most important source of
information on Late Middle Chinese. Rhyme tables like the Yunjing 韻鏡 (originally
compiled around 750) and the Qiyinlue 七音略 developed in Buddhist circles under
influence of knowledge of the Sanskrit alphabet. (The Yunjing is the earliest extant
complete table.) The term dengyun, which is often used to designate the phonology
of the rhyme tables, means ‘classified rhymes’. In essence, dengyuntu refers to an
attempt to classify and systematize the phonology of the Qieyun, using concepts
borrowed from Indian phonological theory.
At the time when this was done however, the standard literary pronunciation had
already changed considerably; not only had almost two centuries elapsed since the
compilation of the Qieyun, but the geographic base of the standard had changed
from Jinling along the Lower Yangtze to the region around Chang’an, the Tang
capital. Although these were closely related dialects, there were nevertheless
important differences. Norman (1988) stresses that one must assume that the
categories of dengyun phonology actually refer to the later Tang standard and not to
the Qieyun language itself, since the dengyun phonologists would have had no way
of knowing how the standard language of Jinling two centuries earlier had been
pronounced.
The rhyme tables are still very useful in reconstructing Early Middle Chinese,
and much of their terminology is applicable to the Early Middle Chinese stage.
1.1.4 Wu pronunciation and Qin pronunciation
In the 8th century, for example in the preface to Huilin’s work and in Xitanziji 悉曇
字記 (written by the Chinese monk Zhiguang 智広 between 780 and 804), the new
standard dialect was called Qin yin 秦音 (Qin pronunciation, or Qin sounds), while
the old standard of the Qieyun was called Wu yin 呉音 (Wu pronunciation or Wu
sounds). Qin refers to the region around Chang’an, while Wu refers to the capital
region of the old Southern Dynasties, the present-day Nanjing and surrounding
territory, which had been the most prestigious cultural centre during the time of
division, and where the old Early Middle Chinese standard language was most
persistent.
The south preserved the old standard longer; therefore, in the new capital, the
impression could arise that the Qieyun standard had been based on southern Chinese,
and the old standard language of the Qieyun was now regarded as provincial and
substandard. However, the idea that the old Qieyun standard had been based on the
original Wu dialects was not correct.
A good deal of the basis for the Qieyun standard language came from the speech
of former northern families, which had settled in the vicinity of present-day Nanjing.
Before this northern influence, the local language is thought to have been more like
the present-day Min 閩 dialects (Norman, 1988:186).
The term Wu pronunciation in this sense is not old: In older Chinese material it
is only used to indicate the Wu dialects, but from the 8th century on, ‘Wu
pronunciation’ no longer designates a dialect, but an older stage of the standard
language that had been preserved locally.
1.2 The relationship between Early Middle Chinese
and Late Middle Chinese
Late Middle Chinese was for the first time reconstructed separately by Pulleyblank
in 1991. It differs from Early Middle Chinese in having far fewer distinctions.
As the shift from Early Middle Chinese to Late Middle Chinese was not merely a
matter of historical evolution but represented a major shift of dialect base, one
cannot – strictly speaking – consider Late Middle Chinese to have evolved from
Early Middle Chinese.
At the same time however, the phonological categories of Early Middle Chinese
and Late Middle Chinese are very largely commensurate. This makes clear that Late
Middle Chinese must go back to an earlier form of language that made the same
distinctions as those found in Early Middle Chinese, and it makes sense to treat it as
if it had evolved from Early Middle Chinese.
However, some of the differences between Early Middle Chinese and Late
Middle Chinese, particularly in matters of phonetic realization as opposed to
categorical distinctions, were probably not the result of straight-line evolution but
were inherited from earlier dialectal variation. This is clear – among other things –
from the radical changes that occurred in the method of transcribing Sanskrit as the
Chang’an dialect replaced the earlier form of standard Chinese from the end of the
7th century onward.
The characteristic Tang dynasty pronunciation of initial nasal phonemes as
prenasalized stops, which was not found in Early Middle Chinese and which
disappears again in standard Mandarin, was very likely an old regional characteristic,
rather than a Tang innovation.2
2 Late Middle Chinese nasal initials were pronounced as prenasalized stops, which were now
used to transcribe Sanskrit voiced stops. Syllables ending in a final nasal did not realize the
initial nasal as a prenasalized stop, and these syllables could therefore be used to represent
Sanskrit nasals. According to Arisaka Hideyo the first example of such new Sanskrit
transcription can be found in the translation into Chinese of the Mahāvairocana sutra 大日経
of 724 (Wenck, 1957: 18). The conditioned variation in the pronunciation of the initial nasal
can also be seen in transcriptions of Chinese in the Tibetan hPhagspa script and in the Sino-
Japanese readings of the Shōsō-in 正倉院 manuscript (early Kamakura period) of the Mōgyū
蒙求, an early source of Kan-on readings (Arisaka, 1936). Another difference between Early
Middle Chinese and Late Middle Chinese transcription is the fact that Sanskrit long vowels
were now sometimes transcribed by Chinese qu tone characters with the annotation yin 引
‘drawn out’, which indicates some degree of vowel length for the qu tone. A change from the
final voiceless -h of the qu tone to voiced -ɦ would account for the fact that the qu tone had
become somewhat longer (Pulleyblank, 1978).
1.2.1 Late Middle Chinese as the ancestor of the modern dialects
It is the Tang standard language that spread to the whole country and became the
ancestor of the modern dialects. Karlgren had identified the Tang standard language
with the Qieyun sound glosses. However, the common standard language of the
Tang period, which underlies the modern dialects, was not the language of the
Qieyun, but that of the later rhyme tables. Although the older standard language had
been preserved longer in the area around what is now Nanjing, the Chang’an
standard finally spread to this area too, and there is no particularly close
relationship between Early Middle Chinese and the present-day Wu dialects. There
is very little in the modern dialects of the north and the south (excepting Min)3,
which cannot be comprehended with the rhyme table categories, as most of the
distinctions found in the modern Chinese dialects cannot be traced further back than
Late Middle Chinese.
Nevertheless, many dialects retain a few distinctions from an earlier stage. One
of the things that go back to a stage of the language earlier than Late Middle Chinese
is the fact that some southern dialects, for example the colloquial layers of
Cantonese, Chaozhou (潮州, southern Min) and Wenzhou (温州, Wu)4, do not show
the shift of shang to qu tone after Late Middle Chinese voiced obstruents. In other
cases however, this shift has spread even to the colloquial level, as in Suzhou (蘇州,
Wu), Fuzhou (福州, northern Min) and Amoy (廈門, southern Min).
The literary layers of all southern dialects are clearly derived from Late Middle
Chinese (Pulleyblank 1984:149). In its literary form – that is, in its role as the
standard way in which texts were read – it profoundly influenced all the local
dialects of China; this influence was so great that the reading pronunciations of
characters go back to this Tang standard in all Chinese dialects, with only a few
scattered survivals of an earlier standard.
3 An exception has to be made for the Min dialect group, which shows distinctions that predate
Early Middle Chinese.
4 Wenzhou is also one of the dialects that still retain a final glottal stop in the shang tone.
2 The origin of tone in Middle Chinese
2.1 From consonantal distinctions to tonal distinctions
Tones in Chinese, as in many other languages, are thought to have arisen through the
loss of consonantal distinctions. Typically, tones develop from pitch differences that
begin as predictable concomitants of consonantal distinctions. For example, initial
voiced consonants may be accompanied by lowered pitch, and final glottal stops by
raised pitch. If these consonantal distinctions are lost, the associated features of pitch
may become distinctive.
The so-called register distinction for instance, that divided Chinese syllables into
a higher (yin 陰) and a lower (yang 陽) register, developed from an original
difference between voiced and voiceless initials that was later lost. This split into a
higher and a lower register is a relatively late development in the history of Chinese,
but the Qieyun distinction into four tones, ping 平, shang 上, qu 去 (and ru 入), is
much older.
In the case of Chinese, it is thought that shang developed from syllables with a
final glottal stop and that qu developed from syllables with final -h from earlier *-s.
Haudricourt was the first to suggest this idea in 1954, based on an analogy with
Vietnamese:
Haudricourt demonstrated that Vietnamese was not related to the Tai languages
with which it shares a tonal system on the Chinese model,1 but rather to the non-
tonal Austro-Asiatic languages. He showed that the Vietnamese tones which
correspond to the Middle Chinese shang and qu tones correspond to final -ʔ and -h,
from earlier -s, in other Austro-Asiatic languages like Mon, Muong, Khmu and
Riang (Norman, 1988:55-56).
Haudricourt suggested that glottal stop -ʔ and -h (from earlier -s) had also been
the origin of the shang and qu tones respectively in Old Chinese. As in the earliest
layer of Chinese loanwords in Vietnamese the Chinese shang tone corresponds to
the Vietnamese rising tone and the qu tone corresponds to the Vietnamese falling
tone, Haudricourt suggested that at the time of these borrowings these words still
ended in glottal stop and -s in Chinese.
In the case of Vietnamese the origin of the Vietnamese rising and falling tones
from final glottal stop and -s respectively has been proven, as the genetically related,
non-tonal Mon-Khmer languages still have final glottal stop and -s or -h in cognate
words.
1 The Vietnamese tonal categories A, B and C correspond to the Chinese categories ping, shang
and qu respectively. Category D, like the Chinese ru tone, consists of all syllables that ended in
a stop. The upper register developed from voiceless initials, and the lower register developed
from voiced initials.
In the case of Chinese on the other hand, the theory was at first only based on a
correspondence with Vietnamese, but later it was considerably strengthened by Mei
and Pulleyblank: Mei (1970) found convincing evidence for a final glottal stop in the
shang tone, such as the survival of a final glottal stop in words which had the shang
tone in Middle Chinese in several modern Chinese dialects that are spoken in
non-adjacent areas.2 Furthermore, evidence from the Tang period shows that the shang
tone was considered most appropriate to represent vowel shortness in transcribing
Sanskrit with Chinese characters.
Pulleyblank strengthened the final -s hypothesis considerably by citing evidence
for final -s in early Chinese transcriptions of foreign words. The following examples
from Pulleyblank (1984) and Baxter (1992) show that -s survived in certain qu tone
rhymes (yun 韻) as late as the beginning of the 6th century AD.3
1 The transcription of foreign words as evidence for the survival of -s
in certain qu tone rhymes
波羅奈 Vārānasī The city of Benares in India
都頼 Talas The Talas river in Central Asia
対馬 Tusima The Japanese island of Tsushima
阿貝摩羅 Apasmara The Sanskrit term for ‘blindness, ignorance’
2.2 The effect of glottal stop and -h on the pitch of preceding syllables
Direct evidence for the phonetic value of the tone categories of Middle Chinese is
meager, but the effect of glottal stop and -h on the pitch of preceding vowels is such,
that a glottal stop will cause a rise in pitch, while -h will cause a fall in pitch. Figure
1 for instance, shows the average fundamental frequency values (in Hz) of vowels
preceding [ʔ] (curves with positive slope) and [h] (curves with negative slope) in
Arabic (four subjects).
When the glottal stop was lost, it was therefore most likely replaced with a rising
tone, as final glottal stop is accompanied by an automatic rise in pitch. Conversely,
when the final -h was lost it was most likely replaced with a falling tone, as final -h
is accompanied by an automatic fall in pitch.
2 The dialects that Mei mentions are Wenzhou 温州, belonging to the Wu group, and four Min 閩
dialects: Pucheng 浦城, Jianyang 建陽, Ding’an 定安 and Wenchang 文昌 on Hainan island.
3 The final -s seems to have survived longest in Qieyun rhymes in -j (Pulleyblank, 1984: 224).
Analogy with other languages suggests that in other rhymes it had become -h earlier.
Palatalization apparently forms a favorable condition for the preservation of final -s. (In
Sakhalin Ainu on the other hand, palatalization even appears to have been responsible for
generating final -s codas: The syllable-final -p, -t, -k of Hokkaido Ainu has shifted to -s after
the vowel i, while it has shifted to -h after all other vowels.)
[Figure 1: four panels, one per subject (S1, S2, S3, S4)]
Figure 1: Fundamental frequency values in four speakers of Arabic
Source: Hombert (1978:93)
But final -ʔ and -h are still used as markers of the shang and qu tones in Early
Middle Chinese in Pulleyblank’s lexicon of reconstructed pronunciation in Early
Middle Chinese, Late Middle Chinese and Early Mandarin (1991). “No doubt there
were already associated features of pitch and contour which helped to make up the
total acoustic effect of the ‘tone’, but if one insists that ‘tone’ must mean primarily
pitch and contour, then one should really speak of EMC as having an incipient or
quasi-tonal system rather than fully developed tones” (Pulleyblank, 1978:175). A
more or less identical quasi-tonal system existed in Burmese and still exists in Mon-
Khmer.
An essential difference between the ping and the oblique tones was that the ping
tone was unchecked (CV), while the shang (CVʔ), qu (CVh) and ru tones (CVp,
CVt, CVk) were checked. This reconstruction of the Early Middle Chinese tones is
consistent with the transcription of Sanskrit into Chinese: The shang tone was the
favored indicator of short vowels, but there are also many examples of the qu tone
being used, while the ping tone was unquestionably the favored indicator of vowel
length. In Late Middle Chinese the glottalized and aspirated pronunciations are
thought to have been replaced by rising and falling tone, although still pronounced
short.
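The reconstruction summarized above can be restated as a small Python sketch: the tone category of an Early Middle Chinese syllable follows from its final segment, and in Late Middle Chinese the glottalized and aspirated finals are replaced by pitch contours. The glottal stop is written ʔ here. The function name, the example syllables and the treatment of every syllable without a final ʔ, h, p, t or k as ping are simplifying assumptions made for illustration only.

```python
def emc_tone_category(syllable: str) -> str:
    """Return the tone category implied by the final segment of a schematic syllable."""
    if syllable.endswith("ʔ"):
        return "shang"  # CVʔ: final glottal stop
    if syllable.endswith("h"):
        return "qu"     # CVh: final -h (from earlier -s)
    if syllable.endswith(("p", "t", "k")):
        return "ru"     # CVp, CVt, CVk: final stop
    return "ping"       # unchecked syllable


# Contours thought to have replaced the final consonants in Late Middle Chinese.
LMC_CONTOUR = {"shang": "rising", "qu": "falling"}

for syllable in ["ka", "kaʔ", "kah", "kat"]:  # hypothetical syllables
    category = emc_tone_category(syllable)
    print(syllable, category, LMC_CONTOUR.get(category, "not specified here"))
```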
2.3 Chinese descriptions of the four tones
By the middle of the Tang dynasty when the new standard language Late Middle
Chinese was already replacing the old standard of the Qieyun, there is the first rather
vague Chinese description of the tones. It originates from the work Yuanhe yunpu 元
和韻譜 (806-820) which is now lost (Pulleyblank, 1978:177). The description was
also transmitted to Japan where it was frequently quoted in works on Chinese
phonology.
2 Chinese description of the Late Middle Chinese tones (mid Tang period)
平声者哀而安 The ping tone is sad and calm
上声勵而挙 The shang tone is fierce and rises
去声清而遠 The qu tone is clear and distant
入声直而促 The ru tone is straight and abrupt
The only information on the realization of the tones in this poem is contained in the
sentence: “The shang tone is fierce and rises”.4 It is therefore tempting to look at the
names of the Middle Chinese tones for a clue as to their original tonal value. The
names ping (‘level’) and shang (‘rising’) suggest that ping had a level tone contour
while shang had a rising tone contour, at least at the time when these names were
invented.
According to Hashimoto (1978: 270) these names date back to the 5th and 6th
centuries and are therefore far too old to describe the tones of Late Middle Chinese.
Hashimoto also points out that each of the four names was itself an
example of the tone that it represented. The names could therefore mean no
more than to say ‘a tone just like that of the word ping’, ‘a tone just like that of the
word shang’ etc. (Hashimoto 1979:390).
Nevertheless, it has to be noted that the description of the shang tone in this
poem – from a time when the old standard language from the south had already been
replaced by the new standard language from Chang’an – agrees with the meaning of
the name of the shang tone.
Kindaichi (1951:640) cites another poem as the next oldest description of the
tones in China. It should be mentioned that the correct dating of the poem is
problematic.5
4 “The qu tone is clear and distant”, if interpreted as “The qu tone is clear and peters out in the
distance” could refer to a tone ending in voiceless aspiration.
5 According to Mei Tsu-lin this poem stems from the work Zhuyaochi gejue 主鑰匙歌訣 and
was written by the monk Chu-zhong 處忠 of the Ming dynasty (1368-1644), which would
make this description far too late to be relevant for a discussion of the value of the tone dots in
Japan. Kindaichi, on the other hand, quotes Liu Fu (1924), who suspects that it dates from the
end of the Tang and the beginning of the Song dynasty (10th century), based on its literary style.
3 Chinese description of the Late Middle Chinese tones (possibly late Tang period)
平声平道莫低昂 Ping is said level without low/fall or high/rise
上声高呼猛烈強 Shang is called out high/loud, fierce and strong
去声分明哀遠道 Qu is said clear, sad and distant
入声短促急収蔵 Ru is short and quick and suddenly stored up
The character 低 can mean ‘low’ or ‘falling’ and the character 昂 can mean ‘high’ or
‘rising’. The character 高 in the second line can mean both ‘high’ and ‘loud’. The
meaning ‘loud’ would agree well with the following ‘fierce and strong’ but if we
assume the meaning ‘high’ is more appropriate here, it would mean that the shang
tone had high pitch. In light of the first line, where the fact that the ping tone was
level and had no rises and falls is explicitly mentioned as something special, an
interpretation of shang as a tone with a pitch that rose up high is probably most
appropriate.
The tonal information in the first line again agrees with the meaning of the name
of the ping tone. Kindaichi however, stresses that it is possible that the writer of the
poem was influenced by the names of the tones, and concludes that without
corroboration from other material, it is impossible to draw any real conclusions from
these descriptions.
These are not the only tone descriptions that stem from China itself, but all of the
others date from after the Ming dynasty and are therefore much too late to be
relevant to the question of what the tones in China were like at the
time of their introduction in Japan.
Finally, once consonantal features have been replaced by features of pitch, the
pitch can continue to change; the third tone of Mandarin (the main reflex of the
shang tone) nowadays is actually a low tone, and low-rising in pre-pausal position.
3 Character reading traditions in Japan
3.1 Early Sino-Japanese
The earliest contact of Japan with the Chinese language and script was probably via
the southwestern Korean state of Paekche in the 5th and 6th centuries. Even earlier
there had been relations between the kingdoms of Kyūshū and China, but as yet no
traces of involvement with the Chinese script have been found.
This raises the possibility that Chinese character readings reached Japan in a
Koreanized form, and that there is a Korean component in the earlier Sino-Japanese.
(The term Sino-Japanese refers to the Japanized pronunciation of the Chinese
characters that is used in Japan when reading Chinese texts.)
The most likely circles where Sino-Japanese could develop were the Buddhist
monasteries, as what was important here was a uniform fixed sound for recitation.
This called for a regular set of pronunciations that could be learned, and that did not
necessarily have to be understandable by native speakers of Chinese. Buddhism, like
the Chinese language and script, was introduced in Japan from Korea. This makes a
certain Korean component in the older Sino-Japanese even more likely.
Unfortunately, the oldest material available on Sino-Korean is from the 15th
century, and mainly based on the dialect of the state of Shilla that had unified Korea
in the 7th century, whereas the Japanese had contact with Paekche.
The term Go-on 呉音 for the older Sino-Japanese only appears in the Heian
period, which makes clear that we are dealing with a term that was applied afterwards
to something of which the origin lay three to four hundred years back. Earlier, the
terms Wa-on 和音 ‘Japanized pronunciation’ and Tsushima-on 対馬音 were used.
The first mention of Tsushima-on is in Tsushima kōgin-ki 対馬貢銀記 of Ōe
Koretoki 大江維時 (888-963). According to this document, at the time of Emperor
Kinmei 欽明 (when Buddhism first came to Japan) there had been a Paekche nun on
the island of Tsushima who taught Buddhism using Wu pronunciation, which was
the reason why in Japan Buddhist sutras and other scriptures usually used this
pronunciation, and why it was called ‘Tsushima pronunciation’ (Wenck, 1953:312).
3.1.1 Go-on and southern Early Middle Chinese
As outlined in chapter 1, there existed a northern and a southern variant of Early
Middle Chinese. A comparison with the characteristics that are mentioned by Lu
Fayan and that can be inferred from the Yupian 玉篇 dictionary shows that some
northern characteristics can be found in both Go-on and Kan-on, while some
remnants of the southern characteristics can only be found in Go-on.
Wenck therefore concludes that while it is not possible to classify Go-on as
southern Early Middle Chinese, it would be possible to classify it with the northern
variant if one assumes a stage that is somewhat older than the Qieyun 切韻. The pre-
Qieyun characteristics of the older Sino-Japanese can be considerably increased if
one takes into account its oldest source, the Man’yōgana (Wenck, 1953:322).
According to Wenck, Go-on is therefore based on a Chinese standard language
that is about 100 years older than the Qieyun, for which there is no reason to assume
a southern base. The reason why it was nevertheless referred to as ‘Wu
pronunciation’ has been explained in section 1.1.4: In older Chinese written sources,
the term Wu-yin 呉音 (Go-on) was used to indicate the Wu dialects. But in Chinese
works from the 8th century on (like Xitanziji 悉曇字記, which was very influential in
Japan)1 the term ‘Wu pronunciation’ referred to the old standard language of the
Qieyun, and was opposed to the Qin 秦 or Han 漢 pronunciation, which referred to
the new standard language of Chang’an.
This is also the sense in which the term was at first used in the Heian period in
Japan; compared to the new Han pronunciation propagated at the time, the Wu
pronunciation represented an older type of foreign Chinese that was closer to the
language represented by the Qieyun.
From Annen’s comments in Shittan-zō 悉曇蔵 for instance, it is clear that Go-on
did not indicate a form of Sino-Japanese, but a foreign Chinese. It was only later that
the term Go-on supplanted the term Wa-on as the designation for the older form of
Sino-Japanese. (Cf. section 3.3.)
3.2 Direct contacts with China
The earliest Japanese mission mentioned in the Chinese records came to China in 57
AD. The next Japanese embassy visited the Later Han in 107, and in the first half of
the 3rd century several more embassies are reported to have reached the northern
dynasty of Wei, in a China that was then divided, and in the 5th century even more
came to Nanjing, the capital of the southern dynasties of the period.
After a lapse of more than a century, Japanese embassies to China were renewed
in the early 7th century, but now on a more permanent basis, and with far more
significant results than before. After almost four centuries of political division,
China was once again a unified empire under the Tang.
1 Because Annen praised Xitanziji (Japanese: Shittan jiki) in Shittan-zō (cf. chapter 6) as the best
work on the Siddham script, it became the focus of Siddham studies in Japan. See section 3.7
for an explanation of Siddham, and the background of Japanese Siddham studies.
Regular embassies were sent to China in the 7th and 8th centuries, and in 804 the
embassy that included Saichō 最澄 and Kūkai 空海 (who founded the Tendai and
the Shingon schools respectively in Japan) went to China. The next embassy, which
reached China in 838, was the one to which Ennin 円仁 (who laid the basis of
Tendai shōmyō)2 was appointed. Ennin came back to Japan in 847.
It was the last mission to be dispatched abroad by the imperial court of Japan
until the 19th century. Another embassy was proposed in 894, but the idea was
eventually abandoned. This loss of interest in contact with China is usually
explained as a redirection of energy from the borrowing of new institutions and
learning, to the assimilation of the acquired knowledge into the indigenous culture.
Another factor may have been that the Tang empire went into decline in the 9th
century.
3.2.1 Introduction of new character readings
The direct and intensive contact between Japan and China in the 7th and 8th centuries
was the means by which a new system of character readings reached Japan, which
was called Han pronunciation 漢音. The term refers to Chinese as a foreign
language, and not yet as a new form of Sino-Japanese. The official nature of the
contact makes it unlikely that anything other than the standard language of the
educated class would have been introduced.
The discrepancy between the new Chinese pronunciation and the older one that
was already deeply rooted in Japan may have given rise to the employment of
pronunciation teachers (on-hakase 音博士 or koe no hakase) at the Daigaku-ryō 大
学寮 (‘Bureau of Higher Learning’). The first appointment of Chinese on-hakase is
mentioned in 691. When official contact with China was severed in the middle of the
9th century, no more new on-hakase from China were appointed, but they seem to
have been around until the end of the 9th century.
3.2.2 The introduction of the tone dots
The tone dots (shōten 声点 in Japanese) developed in China from so-called poyin 破
音 marks that were used to distinguish the original and the derivative meaning of a
character in Chinese. In the beginning, the centre or sometimes the right side of the
character was marked with a dot of red ink. Later, a system developed in which one
of the four corners of the character was marked with a dot, indicating one of the four
tones. This happened in cases where the meaning of the character or the function of
the character in the sentence had to be clarified, as this could depend on the tone of
the character.
At first, the ping 平, shang 上, qu 去 and ru 入 tone marks were added around
the character in that order, beginning with ping at the upper-right corner, shang at
the lower right corner, qu at the lower left corner and ru at the upper left corner
(Ishizuka, 1993, 1995). Considering that Chinese is written in lines from top
to bottom, starting at the right-hand side, this system seems natural.
2 For an explanation of the term shōmyō and the history of Tendai and Shingon in Japan, see
chapter 5.
Later, however, the tone dots started to rotate, a process that can be seen in the
Dunhuang 燉煌 manuscripts.3 In the second half of the 7th century, in some texts, the
marks would start at the bottom-right corner, in others at the bottom-left or upper-
left corner. The system eventually settled down with the tone marks starting at the
bottom-left corner, i.e. bottom-left ping, top-left shang, top-right qu, bottom-right ru.
This is the system that was introduced in Japan.4
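The two arrangements described above can be written down as a small lookup, which also makes the rotation explicit. This is only an illustrative sketch: the names `CORNERS`, `layout` and `read_dot` are hypothetical, and the code assumes that the four tones always follow one another clockwise from the corner that carries the ping dot, as they do in both arrangements mentioned in the text.

```python
# Corners of a character, listed clockwise starting at the upper right.
CORNERS = ["upper-right", "lower-right", "lower-left", "upper-left"]
# The four tones in their fixed traditional order.
TONES = ["ping", "shang", "qu", "ru"]


def layout(start_corner):
    """Return a corner -> tone mapping whose ping dot sits at start_corner.

    Assumption: the remaining tones follow clockwise from that corner.
    """
    start = CORNERS.index(start_corner)
    return {CORNERS[(start + i) % 4]: tone for i, tone in enumerate(TONES)}


early = layout("upper-right")   # the earliest arrangement
settled = layout("lower-left")  # the arrangement that was introduced in Japan


def read_dot(corner, system=settled):
    """Name the tone indicated by a dot at the given corner."""
    return system[corner]
```

Under these assumptions a dot at the upper-left corner reads as shang in the settled system but as ru in the earliest one, which shows why the corner by itself cannot carry a fixed tonal value.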
From the material preserved at Dunhuang, it appears that until the mid 8th
century these dots were hardly ever used to mark simply the tones, but always
functioned to clarify meaning. Ink marks that were placed at the four corners of the
character in order to indicate simply the tone developed in China in the period from
the end of the 8th century to the first half of the 9th century.
The use of shōten in Japan had its origin in shōten that were used in dhāraṇī 陀
羅尼 transcriptions at the end of the 9th century. (Tendai monks were the first to use
shōten in dhāraṇī texts.)5 The oldest extant example dates from the year 889. The
oldest example of tone dots added to the kana of a Japanese word is said to be in the
Kongō-kai giki 金剛界儀軌 (987-989 or 1028-1037), and another example can be
found in the Konryū mandara goma giki 建立曼荼羅護摩儀軌 (1040).
3.2.3 The government promotes foreign Chinese (Han pronunciation)
Around the beginning of the Heian period (794-1185), one can find many
exhortations by the government to use the correct Han pronunciation
and not the wrong Wu pronunciation, and there can therefore be no question that the
term Kan-on goes back to the beginning of the Heian period.
These injunctions to use Han pronunciation at the end of the 8th century should
be seen in the context of the move of the capital from Nara to Heian-kyō, in an
attempt to free the government from the influence of the Nara Buddhist clergy.
3 It is not known why this happened, but the same kind of rotation can be seen in the wokoto-ten,
the dots that were arranged around a character to indicate the grammatical function of the
character in a sentence read in Japanese. See Tsukishima’s table in the supplement to Kokugo-
gaku dai-jiten (1980). In the mid Heian period there was as yet no fixed system for the
placement of the wokoto-ten, but at the end of the Heian period one system started to spread.
4 As to the tonal value of the marks in Japan, it has been suggested (Martin, 1987:167, Vovin,
1997:116) that the position of the ping tone mark at the bottom-left corner of the character
somehow naturally expressed /L/ tone, while the shang tone mark at the top-left corner of the
character naturally expressed /H/ tone. However, the tonal value of the marks in Japan was (at
least ostensibly) based on the tonal value in Late Middle Chinese, and considering the fact that
the tone marks changed position a number of times in China before settling down in the
positions that finally became the norm, no such ‘natural’ connection of ‘top left’ with /H/
tone and of ‘bottom left’ with /L/ tone can be established.
5 In esoteric Buddhism, short mystic verses are called mantras (shingon or ‘true words’ in
Japanese) while long ones are called dhāraṇī. A dhāraṇī or mantra is regarded as the
quintessence of a sutra. It is thought that a mystical power is embodied in the syllables of these
verses. As these mantras and dhāraṇī very often have no literal meaning, they were not
translated but taken over in the Siddham script or transcribed by means of phonographically
used Chinese characters.
Confucianism was contrasted to Buddhism, and the secular study of Chinese was
contrasted to the religious study of Chinese. One example of this reaction against
tradition was the use of ‘foreign’ Chinese (Han pronunciation), which was now
promoted against the Sino-Japanese of the Buddhist rituals (Wa-on). In 793, by
imperial edict, it was even attempted to force the new pronunciation on the Buddhist
clergy, as those who had not studied the new pronunciation would not be allowed to
enter the clergy.
The new Shingon and Tendai esoteric schools spread due to the reaction of the
state against the older Nara Buddhism, and it is against this background that they –
and especially the Tendai school – partly used rites in foreign Chinese.
3.2.4 The development of a new standard of Sino-Japanese
It is unlikely that a new Sino-Japanese – next to the ‘foreign’ Chinese that was being
introduced based on the new Chinese standard language – could have developed
while there was still an active cultural exchange with China, as there would have
been no usage limited to a Japanese context around which such a standard could
have developed.
Until the cessation of contacts with China at the end of the 9th century, the only
setting in which Sino-Japanese was used completely within a Japanese context, and
therefore independently of foreign Chinese, was the Buddhist rites, but there the
older Sino-Japanese continued to be used: in the late 8th and early 9th century, when
Japanese priests like Saichō went to China to study Esoteric Buddhism, they had to
use either the written language or an interpreter to communicate (Wenck, 1953:305).
From Ennin’s diary we see that there were three Shilla interpreters, or Shiragi
wosa (the South Korean state of Shilla had unified the whole of the Korean
peninsula almost two centuries earlier in 676) accompanying his embassy
(Reischauer, 1955:50).
The earliest signs of a developing new Sino-Japanese are found in the late 9th
century. Earlier in the 9th century, in the newly established Shingon and Tendai
schools, some kind of Kan-on was occasionally used, but it is not clear whether one
can already think of the new Sino-Japanese pronunciation system that we now
associate with the term. Character readings that differ from the old established Sino-
Japanese were for instance used in the reading of the Rishu-kyō 理趣経 in the
Shingon school, but more important is the so-called Tendai Kan-on. Tendai was
initially introduced in Japan on a Kan-on base (called ‘Chinese recitations’ or
Kara-goe yomi 唐声読), but as contact with China ceased, there was a reversion to
Go-on.
Although the Tendai Kan-on readings are different from the older Sino-Japanese,
they are also different from what we now know as Kan-on, so it is unlikely that these
readings were the starting point of the new Sino-Japanese standard.6
6 Although Tendai Kan-on is attested earlier, it is based on a later form of Middle Chinese than
standard Kan-on. Tendai Kan-on was introduced in Japan by Ennin, the founder of Tendai
3.3 Confusion and overlapping of terms
Initially, the new Sino-Japanese did not yet have an established form, and neither did
the terms which were used to refer to it. Because the word Han pronunciation
already meant foreign Chinese, the new term Sei-on 正音 ‘correct pronunciation’
was coined for this new type of Sino-Japanese. The first instance in which the two terms
are clearly used in opposition to each other, and thus the first clear proof of a newly
developing Sino-Japanese, is found in Annen’s Shittan-zō (880): 呉音似和音、漢如
正音 “The Wu pronunciation is close to the Wa-on and the Han pronunciation is like
the Sei-on.”
From this and other passages in Shittan-zō it is also clear that 呉音 and 漢音 did
not indicate forms of Sino-Japanese, but different forms of foreign Chinese.
As there was no longer a possibility of comparison with real foreign Chinese, the
terms Sei-on and Kan-on were not always separated. Sometimes Kan-on was simply
called Kara-goe or ‘Chinese pronunciation’.7 Wenck (1953:308) lists a number of
examples of ways in which the different terms were used and contrasted with each
other:
In Fujiwara Kintō’s 藤原公任 (966-1041) Dai-hannya-kyō ji-shō 大般若経字抄
the word Go-on is for the first time used to refer to a way of pronouncing characters
in Japan.8
For Myōgaku 明覚 in Shittan yōketsu 悉曇要決 (1101) the terms Sei-on and
Kan-on seem to have merged, and he no longer uses the term Sei-on. In light of this,
it is likely that Go-on, too, no longer referred to a form of foreign Chinese, and when
shōmyō, around the middle of the 9th century. (It was later also called Shin Kan-on ‘New Kan-
on’.) It never became commonly used, as its usage remained limited to specific texts. Iida
(1955) made a study of the Kan-on introduced by Ennin. Examples of Tendai Kan-on texts that
Iida mentions are Hokke senpō 法華懺法, recited every morning, and Reiji sahō 例時作法,
recited in the evening. (However, Iida notes that both texts also contain Go-on parts. Of the
first, the parts Ku-jō shakujō 九条錫杖 and of the latter, the parts Ekō 廻向 Dai-zange 大懺悔
and Go-nemmon 五念門 are read in Go-on.)
Some of the characteristics of Tendai Kan-on mentioned by Iida (1955: 80-83) are the fact that
one can already observe the weakening or disappearance of the final consonants in the ru tone,
for instance in the Tendai Kan-on spelling of the characters 八 and 白 as ハ (i.e. ha instead of
hati and haku) and 北 as ホ (i.e. ho instead of hoku). On the other hand, in the Tendai Kan-on
texts that are still chanted today (like Hokke senpō and Reiji sahō), all Chinese initial nasals –
also those in syllables that end in a nasal – are read as voiced stops, and although originally
Tendai Kan-on must have been based on the Late Middle Chinese standard language, it does
not show this characteristic feature of the dialect of Chang’an (cf. section 1.2). It is, however,
not clear to what extent these shōmyō texts in their present-day shape go back to Ennin, as the
oldest preserved texts date from the Kamakura period.
7 Kara-goe is written with the same characters as Tō-in 唐音. However, the term Tō-in, as it is
used nowadays, usually refers to the character readings that were introduced from the 12th
century onward by merchants and Zen monks. (Although the official pronunciation of the term
is Tō-on, I have adopted the more customary pronunciation Tō-in.)
8 This work (from the year 1032) is also sometimes called Dai-hannya-kyō ongi 大般若経音義.
he compares Go-on and Wa-on and says that they for a large part agree with each
other, he probably sees Go-on as a more correct Sino-Japanese used in Buddhist
rites, while Wa-on was a more Japanized form used in daily life.
In Ruiju myōgi-shō 類聚名義抄 (±1100) the terms Wa-on and Sei-on are
opposed to each other to indicate the old and the new Sino-Japanese, but in Chūyū-ki
中右記 (1087-1138) by the courtier Fujiwara Munetada 藤原宗忠, Wa-on and
Kan-on are used. Go-kyōgoku sesshō-ki 後京極摂政記 (1200) expresses the same
distinction by the terms Tsushima-on and Tō-on/Kara-goe 唐音 and Kōke shidai 江
家次第 (±1100) uses the present-day terms Go-on and Kan-on.
3.4 Confucianist and Buddhist reading practice
In Japan different ways of reading Chinese texts were used in different circles. The
official government school, the Daigaku-ryō, and families holding the hereditary
post of professional scholar (the hakase-ke 博士家), both of whom dealt with the
Chinese classics, adopted a different approach from the temples, which dealt
with Buddhist scriptures.
In the Daigaku-ryō there were the teachers of Chinese pronunciation (on-hakase),
and the Chinese classics were read in accordance with Sei-on, which was as close as
possible to the pronunciation of Chinese proper. Kundoku 訓読 was, however, also
practiced at the Daigaku-ryō. This is a way of reading a Chinese text in which the
text is converted into Japanese while reading.
To facilitate the conversion into Japanese, reading aids such as pronunciation
notes, notes on Japanese grammatical particles (wokoto-ten 乎古止点) and word
order started to be added in the margin of the text.9 In materials dealing with the
secular study of Chinese such reading notes appear in the beginning of the 10th
century. In the 9th century one does not yet find such materials with added notes, and
it is thought that the habit was taken over from Buddhist texts.
The use of wokoto-ten is thought to have originated at the end of the Nara period
in Nara-Buddhist schools such as Hossō and Sanron. Later their use is thought to
have spread to the Tendai and Shingon schools and the hakase-ke. The 10th century
is also the time when the scholarly traditions of the hakase-ke (which were based on
the teaching tradition of the teachers of the Daigaku-ryō) were born. In each family
the reading traditions and manuscripts with added notes were handed down from
generation to generation.
Surviving material of this kind is sparse compared with the wealth of kunten
material from Buddhist circles. This is probably the result of greater losses by fire
9 Such diacritical reading aids are called kunten 訓点. They include the mark レ (the kaeri-ten 返
り点), which indicated that the word order of two characters should be reversed and the
wokoto-ten (also 乎己止点 or 遠古登点), small dots around the character which indicated the
appropriate particles that were to be used when the text was read in Japanese.
and other disasters of this material, which was passed on in the world outside the
monasteries.
In the different hakase-ke, different secret traditions existed, but these eventually
mixed, especially when in the Kamakura period Buddhist monks began to copy the
Chinese classics, often combining the notes of more than one hakase-ke tradition.
The Kiyohara 清原 family preserved their own tradition the longest, but the vast
majority of material was preserved by other parties.
3.4.1 Buddhist reading methods
Buddhist texts may be broadly divided into scriptures and commentaries. From an
early period on, both scriptures and commentaries were read by means of the
kundoku method (i.e. while reading they were translated into Japanese). Even after
the establishment of the kundoku method, however, the custom of reading all or
certain parts of a small number of important scriptures and a limited group of
commentaries by means of the ondoku 音読 method (reading a Chinese text in
Chinese word order using Sino-Japanese character readings) was continued.
For such scriptures of central importance to the Shingon and affiliated schools
(like the Hoke-kyō 法華経 and the Konkōmyō saishōō-kyō 金光明最勝王経), ongi
音義 were compiled, which are pronunciation guides for a particular text.
When reading non-Buddhist texts by means of the kundoku method, the monks
too used Chinese loanwords pronounced in accordance with Kan-on. In other words,
they differentiated between Wa-on and Go-on readings on the one hand and Kan-on
readings on the other, depending upon the nature of the text in question.
The fact that the Chinese pronunciation introduced in the Heian period was
labeled ‘correct pronunciation’, and that its use by monks was encouraged with the
backing of the authorities, does not mean that Go-on readings were completely
rejected as incorrect. From a Buddhist standpoint, Wa-on and Go-on represented the
pronunciation to be used when reciting the sacred scriptures, and it may not even
have been considered proper to use them unreservedly for reading non-Buddhist
texts.
As Komatsu explains (1993:22), Kan-on type Chinese loanwords used in the
kundoku reading method would have represented orthodox character readings for
teachers of the Chinese classics, but for monks they would have been secular
readings. The auditory impression that Go-on produced differed markedly from
Kan-on, and in its function as the pronunciation used in reciting the scriptures it was
thereby able to create an atmosphere transcending everyday reality. In this sense Go-
on was able to preserve its distinctiveness while presupposing the existence of Kan-
on.
For the different schools, distinguishing different kinds of Sino-Japanese was
also a way to distinguish themselves from each other: While the shōmyō of the
Shingon school held almost exclusively on to the older Sino-Japanese, the Tendai
school used the special Tendai Kan-on, and the Zen schools later took over Tō-in as
the latest form of Sino-Japanese.10
3.5 Reorganization of Go-on
If the correspondence between Kan-on and Go-on had been perfectly regular, then
by mastering one of these two systems and also familiarizing oneself with the rules
of correspondence between the two systems one would have been able to infer the
pronunciation of one system on the basis of the other. But the traditional method of
pronunciation used in reciting the scriptures included irregular forms, and it is to be
surmised that Go-on was reorganized on the basis of Chinese phonology because
these irregularities proved to be an obstacle to the practical preservation of the
contrast between the two systems of pronunciation.
In many cases the normative Sino-Japanese reading was deduced from the
categories of initials and finals of which the Chinese syllable is composed on the
basis of the fanqie 反切 sound glosses used in Chinese character dictionaries and
rhyme dictionaries. When the character reading thus determined did not conform
with the traditional pronunciation preserved in the recitation of Buddhist scriptures,
either the rules for deduction were re-examined or else the traditional pronunciation
was modified so that it accorded with the norms of Sino-Japanese.
3.6 Buddhist Kan-on study
Although in Buddhist rites it was mostly Go-on readings that were used, Kan-on
readings were used in certain cases, especially in the Tendai school, and according to
Komatsu (1993:23-24) conscious efforts were made to master Kan-on as a way of
correctly learning the Go-on pronunciation:
As an example he gives the ondoku reading of the Mōgyū 蒙求, an elementary
textbook of Chinese history written in verse, which constitutes an important source
of early Kan-on pronunciation, as in the Chōjō 長承 manuscript detailed sound
glosses and tone marks were added by monks in the second half of the 10th century
(in red ink) and in the early 12th century (in black ink). The Mōgyū was frequently
read by monks in accordance with the ondoku method for the purpose of mastering
10 The Tō-in used in the Zen schools has characteristics that distinguish it from Go-on and Kan-on,
as it was borrowed around the 12th century from the Wu dialect area. Komatsu (1993:22)
explains that, because the Go-on system had by this time been simplified, it would have been
easier for the many monks who had no experience of studying in China to use only Go-on
readings. Since the content of the scriptures remained the same regardless of the system of
pronunciation by which they were recited, the greatest significance of reciting the scriptures in
accordance with Tō-in lay ultimately in affirming the identity of a new form of Buddhism and
its group of adherents.
Kan-on readings. Komatsu suspects that their goal was not simply to master Kan-on
readings in addition to Go-on readings, but that in order to preserve the correct Go-
on readings it was necessary for them to gain a practical and systematic knowledge
of Kan-on, which constituted the basis of Go-on.
Before undertaking the study of the Mōgyū the monks would already have
mastered Go-on readings. As in many respects, including the tones, Kan-on and Go-
on stood in contrast to one another, reciting the Mōgyū would have been a most
effective means of grasping the correspondences between the two.
This is also how Komatsu explains why the tone marks added to the Mōgyū
differentiate eight tones. Judging from the sound glosses given in katakana, the
majority of phonological contrasts in Chinese were not distinguished on a segmental
level. It would therefore be somewhat incongruous for there to be as many as eight
tones.
However, according to the principles of Chinese phonology prevailing at the
time, the subtone to which a syllable belonged was determined by the initial element
of the first of the two characters used in the fanqie spelling method while the four
tones were determined by the final element of the second character. Each tone would
have to be divided into subtones depending upon the category of the initial
consonant. In other words, it was possible to specify the category of the consonant
on the basis of its subtone. Therefore, in order to preserve a clear-cut contrast
between Kan-on and Go-on readings, it would have been necessary to gain a sound
grasp of the four tones and their subtones in the Kan-on system that underlay Go-on.
The differentiation of eight tone marks thus does not indicate that eight different
tones were actually distinguished.
As the difference between voiceless and voiced stop initials was lost in Kan-on,
the main reason to indicate the subtone was the contrast with Go-on, in
which the initial with the lower subtone would be voiced.11
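The way a fanqie spelling yields one of eight tone marks can be sketched as follows. This is a minimal illustration and not a reconstruction of any historical procedure: the `Speller` record and both function names are hypothetical, and the code assumes that the lower subtone is the one called ‘heavy’ and goes with a voiced initial, while ‘light’ goes with a voiceless one.

```python
from dataclasses import dataclass

TONES = ("ping", "shang", "qu", "ru")


@dataclass(frozen=True)
class Speller:
    """Hypothetical record for one of the two fanqie speller characters."""
    initial_voiced: bool  # category of the initial consonant
    tone: str             # one of TONES


def tone_mark(upper: Speller, lower: Speller) -> str:
    """Combine the two halves of a fanqie spelling into one of eight marks.

    The first (upper) speller supplies the initial and hence the subtone;
    the second (lower) speller supplies the final and hence the tone.
    """
    if lower.tone not in TONES:
        raise ValueError(f"unknown tone: {lower.tone}")
    subtone = "heavy" if upper.initial_voiced else "light"
    return f"{subtone} {lower.tone}"


def initial_is_voiced(mark: str) -> bool:
    """Recover the category of the initial from the subtone of a mark."""
    return mark.split()[0] == "heavy"
```

For the well-known rhyme dictionary spelling 東 = 德紅切, the upper speller 德 has a voiceless initial and the lower speller 紅 is in the ping tone, so `tone_mark(Speller(False, "ru"), Speller(True, "ping"))` returns `"light ping"`. The tone of the upper speller and the initial of the lower speller play no role, and `initial_is_voiced` shows how the mark by itself identifies the category of the initial.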
3.7 Buddhist study of Chinese phonology
The strongest force in the development and preservation of Sino-Japanese was
Buddhism, especially the esoteric Buddhism of the Shingon school. In the Shingon
school the study of character readings did not form part of the comprehensive study
of the Chinese language, but constituted an independent field of learning virtually
unrelated to the meaning and usage of individual characters. The individual sounds
themselves were thought to contain a magical meaning (the so-called ongi-setsu 音
義説) and character readings per se were endowed with an inherent value.
This was because the various scriptures on which particular value was set in
esoteric Buddhism contained mystic formulae called dhāraṇī 陀羅尼 or mantras
11 Komatsu nevertheless assumes that in case of the ping and the ru tone, the difference between
light and heavy was characterized by an actual difference in pitch.
(shingon 真言) that had been transcribed from Sanskrit by using Chinese characters
phonetically, and there were also similar formulae written in the Sanskrit Siddham
script.
The Indian script that came to Japan together with Buddhism was the so-called
Siddham script, a medieval style of Sanskrit orthography that has fallen out of use in
India.12 Although the Buddhist scriptures were read in Chinese and not in Sanskrit,
many old Japanese Buddhist scriptures contain parts that are written in the Siddham
script (particularly the mantras and dhāraṇī), and Siddham writing is still used in
Japan in esoteric rituals. According to Van Gulik (1953), Siddham calligraphy
reached great heights in Japan, even though the beautifully calligraphed texts often
contain the most basic grammatical mistakes in Sanskrit, indicating that the Japanese
were mainly interested in the putative magical qualities of the letters and not so
much in a practical knowledge and understanding of Sanskrit.
The power that was attributed to correctly pronounced dhāraṇī was of central
importance.13 Furthermore, in analogy with the Chinese writing system an inherent
meaning was ascribed to the written Siddham syllables. This is the main reason why
the monks occupied themselves with the study of Sanskrit pronunciation and
Siddham calligraphy, while no interest was shown in grammar or semantics.
The value that was put on correctly pronouncing the dhāraṇī is also the reason
why so much attention was given to Kan-on and the Kan-on tones, although in the
reading of the scriptures (even in the Tendai school, which was originally introduced
in Japan on a Kan-on basis), more Go-on than Kan-on was used.
The knowledge of Sanskrit in Japan came through China, where the Sanskrit
sounds had been transcribed into Chinese. As mentioned before, the work Xitanziji
by Zhi Guang 智広, a Chinese monk of the second half of the 8th century, had a great
influence on Japanese Sanskrit studies. The sound correspondences set up in China
were adopted without change in Japan, and their correct pronunciation presupposed
a thorough knowledge of Chinese phonology. That is why in Japan it was the
Buddhist Sanskrit (or Siddham) scholars who were the specialists in Chinese
phonology.
12 This script received its name because of the custom of starting each writing lesson by writing
down the word for ‘success’, siddhām. Siddham thus became a colloquial equivalent of the
literary word for ‘script’ lipi. Siddham (or Shittan 悉曇 in Japanese), has forty-seven letters:
twelve vowels (mada 摩多) and thirty-five consonants (taimon 体文). In addition there are four
semivowels.
13 This idea is illustrated by the following two esoteric Buddhist tenets: shōmyō jōbutsu 声明成仏
‘recitation equals attaining Buddhahood’ and onjō soku myōjō 音声即妙乗 ‘the sound/voice
itself is the wonderful vehicle’ (Wenck, 1953:207).
3.8 Different types of historical material
The material for the older history of Sino-Japanese consists mainly of kana reading
glosses in dictionaries and Buddhist commentaries. Nowhere, however, are glosses
added throughout a whole manuscript. In the dictionaries the glosses often concern
pronunciations that differed from the standard Sino-Japanese of the time, and in the
Buddhist texts it is often only rarely used characters to which a reading gloss is
added. The sound glosses in the ongi are arranged in the order in which the
characters appear in the text. The sound glosses that are given in character
dictionaries are arranged by radicals. These last are normative readings that are not
governed by context.
3.9 Present day Go-on and Kan-on pronunciations
The present day Go-on and Kan-on pronunciation of characters, especially their
historical spelling or jion kana-zukai 字音仮名遣, is the result of an even later,
thorough reworking and rearrangement on the basis of the Chinese rhyme
dictionaries by Monnō 文雄 in Makō inkyō 磨光韻鏡 (1744) 14 and Motoori
Norinaga in Jion kariji yōkaku 字音仮字用格 (1776).
Numoto (1993) shows that, as they used a deductive method based on the
distinctions of Early Middle Chinese instead of those of Late Middle Chinese, many
reconstructed Kan-on pronunciations have no historical basis. As these reconstructed
Kan-on readings are nowadays nevertheless regarded as standard, original Kan-on
readings attested in old manuscripts which differ from them have been designated
Kan’yō-on (habitual pronunciation) or Zoku-on (popular pronunciation).
3.10 Summary of terms relating to Sino-Japanese
– Sino-Japanese 日本漢字音: A term that refers not so much to the word forms of
Chinese loanwords in Japanese, but to the pronunciation of characters that is
used when reading the Chinese classics or reciting Buddhist scriptures.
– Wa-on 和音: A multistratified unsystematic older form of Sino-Japanese,
probably introduced in Japan via Korea. As it evolved naturally in the course of
scriptural transmission without being influenced or modified in accordance with
phonological theory it did not constitute an orderly system. In the late Heian
period Wa-on was systematically reorganized on the basis of Chinese phonology,
and this corresponds to Go-on in the modern sense of the word.
14 Monnō’s motivation for embarking on this task was, as usual, connected to the correct
pronunciation of the dhāraṇī.
– Tsushima-on 対馬音: One form of older Sino-Japanese, possibly introduced in
Japan by Koreans on the island of Tsushima between Korea and Kyūshū. It is
very close to, or the same as, Wa-on.
– Go-on 呉音: Older form of Sino-Japanese that was systematically reorganized on
the basis of Chinese phonology in the late Heian period. In the early
Heian period ‘Go-on’ referred to the ‘foreign Chinese’ older standard language
of Tang China.
– Wu pronunciation 呉音: ‘Foreign Chinese’ older standard language of Tang
China that had been preserved in the old capital Jinling (present-day Nanjing,
therefore the name ‘Wu’), and that later became regarded as ‘provincial’. (Early
Middle Chinese.)
– Kan-on 漢音: Newer form of Sino-Japanese, based, through direct contact with
China, on the new standard language of the Tang dynasty. In the early Heian
period ‘Kan-on’ referred to the ‘foreign Chinese’ new standard language of Tang
China (Han pronunciation).
– Han pronunciation 漢音: ‘Foreign Chinese’ new standard language of Tang
China, based on the dialect of the capital Chang’an. (Late Middle Chinese.)
– Sei-on 正音: After contacts with China were severed around the middle of the 9th
century a new form of Sino-Japanese developed on the basis of the Han
pronunciation. This pronunciation was initially called sei-on (‘correct
pronunciation’). Later this word was supplanted by the term Kan-on which by
that time had lost its meaning of ‘foreign Chinese’.
– Tendai Kan-on/Shin Kan-on 天台漢音/新漢音: A way of reading certain
Buddhist texts in the Tendai school, based on a late 9th century form of Tang
Chinese.
– Tō-in 唐音: From the 12th century on, Zen monks and merchants introduced
character readings based on the Wu dialects. Tō-in can especially be found in
Kamakura-period annotated texts from the Rinzai school. They are called Tō-in
(also Tō-on) or ‘Tang pronunciation’ despite the fact that they were introduced
during the Song 宋 (960-1279) dynasty.
– Kara-goe 唐音: Another term for Kan-on.
– Kan’yō-on 慣用音: ‘Habitual pronunciation’. This is the present-day designation
for traditional character pronunciations that survived despite the fact that they
could not be explained by the phonology of the rhyme tables. Some of these go
back to old pronunciations from before the rearrangement of Sino-Japanese that
had already established themselves in the spoken language and were thus
protected from later developments and rules. Therefore among them some old
Wa-on and Tsushima-on may have been preserved.
– Zoku-on 俗音: Popular pronunciation. Another term for Kan’yō-on.
4 The difference between the tones of Go-on and Kan-on
As has been briefly mentioned in section 3.6, a complication having to do with the
different Sino-Japanese reading traditions in Japan is the fact that there is a
difference between the tones of Early Middle Chinese-based Go-on and Late Middle
Chinese-based Kan-on. The possible origin behind this difference will be addressed
in section 11.1.1. In this chapter I will introduce a number of comparisons of the
earlier Sino-Japanese and the later Sino-Japanese tones that can be found in
Buddhist works.
4.1 The Go-on tones and the Kan-on tones are contrasted to each other
The earliest comparison is in Hoke-kyō shakumon 法華経釈文 (976) by the Hossō
monk Chūzan 仲算.
1 The comparison of the Go-on tones and the Kan-on tones by Chūzan
平声字都司馬音渡上去音、 In the pronunciation of Tsushima ping tone characters change into shang and qu tone,
上去字対馬音渡平音 and shang and qu tone characters change into ping tone in the pronunciation of Tsushima.
In this text the shang and qu tones together are opposed to the ping tone, and no
distinction is being made between a light and a heavy ping tone. This, and the fact
that the term Tsushima-on is being used, makes it clear that it is the early,
unregularized form of older Sino-Japanese that is here contrasted with Kan-on, and
not the regularized Go-on of the late Heian period.
A proper theory on the Go-on tones only developed later, when Go-on became
regularized in the latter half of the Heian period. Comparisons between Go-on and
Kan-on tones from that time, such as in Shittan kuden 悉曇口伝 (1180) by the
Shingon monk Shinren 心蓮, are much more detailed.
Shinren’s comparison in (2), for instance, includes a distinction between a Go-on
shang tone and a Go-on qu tone, and between a Go-on light ping tone and a Go-on
heavy ping tone. These distinctions, which originally did not exist in
Wa-on/Tsushima-on, stem from the Kan-on tone system. They were projected onto
Go-on in order to form a regular correspondence with Kan-on (Mabuchi, 1996:330). The
difference that Shinren mentions between the Go-on ru tone and the Kan-on ru tone,
however, was definitely real, as it is confirmed by the fact that in the modern
dialects Go-on and Kan-on ru tone loanwords belong to different tone classes (cf.
section 11.1.2).
2 The comparison of the Go-on tones and the Kan-on tones by Shinren
呉音平重成漢上声 The heavy ping tone of Go-on
becomes shang tone in Kan-on
呉音上成漢平 The shang tone of Go-on
becomes ping tone in Kan-on
呉音去成漢平軽 The qu tone of Go-on
becomes light ping tone in Kan-on
呉平軽成漢去 The light ping tone of Go-on
becomes qu tone in Kan-on
呉音入則漢入也 The ru tone of Go-on is ru tone in Kan-on
於漢無入重於呉無入軽也 but Kan-on has no heavy ru tone
and Go-on has no light ru tone
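The correspondences in (2) can also be read as a simple lookup table. The following Python sketch is an illustration only: the English tone labels, the name SHINREN_GO_TO_KAN and the function kan_on_tone are assumptions of this sketch and do not occur in Shinren's text. The entries are copied from (2) as they stand, so the shang tone of Go-on maps to an unspecified "ping", and the remark on the heavy and light ru tones is kept as a comment only.

```python
# Shinren's Go-on to Kan-on correspondences as listed in (2).
# Keys are Go-on tones, values are the Kan-on tones they "become".
# The English labels are shorthand assumed for this sketch.
SHINREN_GO_TO_KAN = {
    "heavy ping": "shang",
    "shang": "ping",        # (2) does not say light or heavy here
    "qu": "light ping",
    "light ping": "qu",
    "ru": "ru",             # Kan-on has no heavy ru, Go-on has no light ru
}


def kan_on_tone(go_on_tone: str) -> str:
    """Return the Kan-on tone that (2) pairs with the given Go-on tone."""
    return SHINREN_GO_TO_KAN[go_on_tone]


if __name__ == "__main__":
    for go_on, kan_on in SHINREN_GO_TO_KAN.items():
        print(f"Go-on {go_on} -> Kan-on {kan_on}")
```

A table of this kind only restates the scholarly scheme; it should not be taken to predict the tone dots actually found in Go-on texts.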
A century later, Ryōson 了尊 (another Shingon scholar) included the following
comparison of the Go-on and Kan-on tones in his work Shittan rinryaku-zu-shō
悉曇輪略図抄 (1287), which is a further elaboration of the correspondence between
the two tone systems (Mabuchi, 1962:622-623).
3 The comparison of the Go-on tones and the Kan-on tones by Ryōson
次明二呉漢音声一者、私頌云、 Next I will explain my own recitation
of the Go-on and Kan-on tones.
呉漢音声互相博1。 The tones of Go-on and Kan-on go over into
each other.
平声重与二上声軽一、 Heavy ping goes over to light shang
平声軽与二去声重一、 light ping goes over to heavy qu
上声重与二去声軽一、 heavy shang goes over to light qu
入声軽与二同声重一。 light ru goes over to heavy ru
In addition, Ryōson provides a visual representation of the relation between the
tones in the two traditions.2
1 This character is a mistake for 溥 ‘to completely go over to’.
2 The only difference between Ryōson and Shinren is that Shinren has left out a comparison
between the Go-on and Kan-on heavy shang and light qu tones. This is because in the Shingon
school the heavy shang and light qu tones were not used in practice. (According to the Shingon
school the heavy shang tone had merged with the qu tone and the light qu tone had merged
with the shang tone. The background of these mergers in the Shingon tone theory is discussed
in section 8.1.2.)
Figure 1: The correspondences between the Go-on and Kan-on tones
as represented by Ryōson in Shittan rinryaku-zu-shō
Source: Konishi (1948:502)
4.2 Characters in the Go-on pronunciation are marked
with ‘reversed’ tone dots
The opposition between the tones of Go-on and Kan-on is not merely found in tone
treatises like these, but is also reflected in the way in which the characters are marked
with tone dots: In sources that give Wa-on readings, such as Hoke-kyō ongi
法華経音義 (1365) by Shinkū 心空 and Ruiju myōgi-shō 類聚名義抄, ‘reversed’ tone dots
(i.e. tone dots that do not agree with the tone category of the character in the rhyme
books) are added to these traditional readings.
In Ruiju myōgi-shō, below each character entry, first the so-called Sei-on
(Kan-on) is given with tone dots, and then the Wa-on (Go-on) pronunciation is given,
again with tone dots. One-kana Wa-on characters, for instance, can be divided into
two groups: One group has a ping tone dot and one group has a qu tone dot. The
characters that are marked with a ping tone dot usually have shang or qu tone in the
Sei-on pronunciation, while the characters that are marked with a qu tone dot usually
have ping tone in the Sei-on pronunciation.
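The regularity just described for one-kana Wa-on readings in Ruiju myōgi-shō can be stated as a small check. The Python sketch below is hypothetical: the names USUAL_SEI_ON and follows_reversal and the English tone labels are assumptions made for illustration. It encodes only the usual pairing stated above (a ping dot with Sei-on shang or qu, a qu dot with Sei-on ping) and makes no claim about the many irregular markings.

```python
# Usual pairing of Wa-on tone dots with Sei-on (Kan-on) tones for
# one-kana characters, as described in the preceding paragraph.
USUAL_SEI_ON = {
    "ping": {"shang", "qu"},  # ping dot in Wa-on: usually shang or qu in Sei-on
    "qu": {"ping"},           # qu dot in Wa-on: usually ping in Sei-on
}


def follows_reversal(wa_on_dot: str, sei_on_tone: str) -> bool:
    """True if the Wa-on dot and the Sei-on tone show the usual reversed pairing."""
    return sei_on_tone in USUAL_SEI_ON.get(wa_on_dot, set())


if __name__ == "__main__":
    print(follows_reversal("ping", "shang"))  # usual pairing
    print(follows_reversal("qu", "qu"))       # not the usual pairing
```

Any marking for which the function returns False would simply count as one of the exceptions that the word "usually" allows for.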
This shows that the ping vs. shang/qu opposition in Wa-on stands in a relation of
some regularity with the tones of the later Kan-on. The Kan-on tones in turn agree
with the traditional tonal division of the characters in the Chinese rhyme dictionaries.
This means that the Wa-on opposition between ping and shang/qu – although rather
irregular – is not merely random, but must go back to a tonal opposition adopted
from Early Middle Chinese.
Reversed tone marks were not only added to Go-on readings in dictionaries and
pronunciation guides to important sutras (like the Hoke-kyō ongi), but often also in
the Buddhist texts themselves, so that texts that were read according to the Go-on
reading tradition were marked with reversed tone dots. The reversed tone dots added
to these texts, however, are often extremely irregular: An examination of the tone dot
markings in those cases where there is a clear segmental difference between the
Go-on and the Kan-on pronunciation, so that it is possible to identify a reading
positively as Go-on, shows a considerable irregularity in the Go-on tone dot
markings.
Some tone marks agree with the tone category of the character in the rhyme
dictionaries while others are reversed, and if they are reversed, no such precise
system (forming a regular contrast with the Kan-on tones) as described by Shinkū or
Ryōson can be found. The reversed tone marks tend to follow a four-tone system in
which there is no distinction between light ping and heavy ping, and the Go-on
shang and qu tones together are opposed to the ping tone. In very late material such as
Bumō-ki 補忘記 (1687) – in which the Go-on tone markings are extremely irregular
– occasional light ping and light ru markings occur with Go-on readings. (As a
Buddhist recitation guide to the rongi ceremonies, Bumō-ki contains mostly Go-on
readings.)
As a result of all of this, the relation between the tone that a character had in
China and the tone that such a character has ended up with in Japan is obscure. This
is a consequence of the uniquely Japanese situation where character readings from
different periods continued to be used simultaneously. The fact that the different
character reading traditions also had different tones, and the fact that in the case of many
characters the Go-on reading and the Kan-on reading are identical on the segmental
level, caused such confusion that it is very hard to discern any connection at all
between the original tone class of a character in China and the tone class it belongs
to in Japanese.3
What is clear, however, is that the relationship between the markings of these
loanwords and their present-day tonal reflexes in the dialects is the same for both
reading traditions, and the same as for native Japanese words: Whether it is for
instance a monosyllabic Go-on word that is marked with a ping, shang or qu tone
dot, or whether it is a monosyllabic Kan-on word that is marked with a ping, shang
3 This confusion already occurred quite early on. Okumura writes for instance: “What are
referred to as Go-on materials or Kan-on materials are often actually a mixture of both. For
example, the Wa-on 和音 readings of the Ruiju myōgi-shō (including the ‘Shin iwaku 真云’
readings found in the Bureau of Books and Drawings manuscript 図書寮本), which is
frequently used as a source of Go-on readings, contains examples such as “嫌 Wa-on: nai, dei,”
while in the case of the Hoke-kyō tanji 法華経単字 (Single Characters of the Lotus Sūtra), the
initial section gives Kan-on forms, but the tone marks (especially those in red) are frequently
what would appear to be Go-on tone marks. By way of contrast, in the Daigo-ji 醍醐寺
manuscript of the Yu-hsien-k’u 遊仙窟 (Disporting in the Cave of the Immortals), which is
generally regarded as a source of Kan-on readings, some Go-on forms are also to be found.”
Okumura also mentions that “in some texts one will occasionally find that the pronunciation of
a certain character is given in the Go-on form while the tone mark indicates the Kan-on form”
(Okumura, 1993:60).
or qu tone dot; they will generally show the same modern dialect reflexes as a
monosyllabic native Japanese word that was marked with one of these tone dots.4
4.3 The tonal value of the tone dots is based
on the Kan-on tone tradition
The tone dots added to characters read as Kan-on agree with the tone category of the
characters in the rhyme dictionaries, but the tone dots added to characters that were
read as Go-on do not (at least most of the time). This is because only Kan-on was
introduced in Japan in a systematic way, and tone dots were introduced in Japan
together with Kan-on. As a real awareness of the Chinese tones only developed with
the introduction of Kan-on, the Japanese tone dots are automatically based on the
value of the Kan-on tones. The already established Go-on readings were marked
with reverse tone dots, in retrospect, based on the tone value of the Kan-on tones.
An illustration of this situation can be seen in the Shōryaku-bon 承暦本 of the
pronunciation guide Konkōmyō saishōō-kyō ongi 金光明最勝王経音義 compiled in
1079 by a monk of the Hossō school. In this work the pronunciation of the
characters is based on Wa-on, while in the introduction a Kan-on tone dot chart is
given that includes the tōten 東点 for the light ping tone and the tokuten 徳点 for the
light ru tone even though these tones were not distinguished in Wa-on.5
Mabuchi (1996:329) explains that – as no such thing as a Go-on tone chart
existed – one had to select those Kan-on tone dots that were suitable and use them to
express the Go-on tones, which meant adding them ‘in the reverse’.
4.4 The shift from qu tone markings in Wa-on
to shang tone markings in Go-on
In early Wa-on sources the shang tone is hardly ever used. It is only later that a
number of single-kana words (having mostly ping tones in the rhyme books) which
had been marked with the qu tone in earlier works, started to be marked with shang
4 In case of the ru tone however, the situation is more complicated. (See section 11.1.2.)
5 Komatsu (1993:21) explains: ‘According to the introduction to this work, it was compiled
because young monks were learning only the systematized pronunciation deduced by means of
phonological theory and were unfamiliar with traditional Wa-on readings, and the compiler
accordingly gathered together examples of Wa-on readings that could not be fully explained by
theory alone so that they might be handed down to posterity.’ The compiler of the Konkōmyō
saishōō-kyō ongi accepted unconditionally the traditional Wa-on readings and attempted to
pass them on to the younger generation of monks. But according to Komatsu there is clear
evidence of interference of Sei-on in these Wa-on readings, and the prevailing view at the time
was that in order to preserve the purity of Wa-on it would be more effective to reorganize it on
the basis of its regular correspondences with Sei-on.
tone dots. Kindaichi (1951:20-21), for instance, lists a number of characters that have
a qu tone mark in the Wa-on reading in Ruiju myōgi-shō, but a shang (and
sometimes qu) tone mark in the 14th century pronunciation guide to the Lotus Sūtra
Hoke-kyō ongi 法華経音義 (1365-1370) by the Tendai monk Shinkū 心空. (See
section 7.3.3.3.)
I see the fact that in later Buddhist Go-on material (like Hoke-kyō ongi)
single-kana characters that had a qu tone in Wa-on now often have a shang tone in Go-on
as a development in scholarly tradition, and not as a phonological development:
These characters do not occur independently as loanwords in spoken Japanese, and
the change therefore involves mere character readings and not true loanwords. If this
were a true phonological development, one would expect native Japanese words of
class 1.3b that were equally marked with the qu tone dot in older material (such as e
‘bait’, ha ‘tooth’, ni ‘load’, hi ‘shuttle’, etc.) to show a similar shift in tone markings,
but this is not the case.
In the anonymous work Shosha-san shōmyō-shō 書写山声明抄 from the Tendai
school, we find the remark: “According to my teacher, the qu tone does not occur
with single kana. This started from the time of Kōya-san’s Rinrin Hōin” (cf. section
7.3.3.2).6 I see this as confirmation of the idea that this change in tone dot markings
stems from a change in scholarly tradition, and is not based on historical sound
change.
As it is clear that shortness is the main condition for taking part in the change,
the most likely reason for the change in tone dot markings is the fact that in Kan-on
the qu tone was regarded as the longest tone (cf. section 11.1.1). The long qu tone
was no longer felt appropriate as a marker for short single-kana readings.
4.5 Tone descriptions from the Tendai and Shingon schools concern
the Kan-on tones
Descriptions of the tones always concern the Kan-on tones. The Go-on tones are
only described in terms of their relation to the Kan-on tones, such as in the examples
above.
Almost all of these Kan-on tone descriptions, beginning with the oldest and
most famous description by the Tendai monk Annen in his work Shittan-zō, stem
from monks of the esoteric Tendai and Shingon schools. As the correct
pronunciation of the mantras and dhāraṇī was of essential importance in these
schools, and as the magical formulae had been transcribed by means of Chinese
characters, monks from these circles took a special interest in the Kan-on tones.
Because of this, I will introduce the Tendai and the Shingon schools and their
chanting traditions in the next chapter.
6 Hōin is one of the grades of the priesthood in Japan.
5 The shōmyō traditions of the Tendai and Shingon schools
5.1 Varieties of shōmyō
Buddhist rites have been of great importance in the development of Sino-Japanese
and in its preservation, as for the performance of the Buddhist rites a uniform sound
for recitation was needed. For this purpose the texts were also annotated with
different types of pitch indications. Buddhist vocal music or shōmyō 声明 therefore
constitutes an important source of historical material on the Japanese tones.
The term shōmyō goes back to a translation into Chinese of the Sanskrit šabda
vidyā or ‘science of sounds’, one of the five studies in India (the others being arts,
medicine, logic and philosophy). Shōmyō originally meant the study of the script and
grammar of Sanskrit, but in China and Japan it gradually took on the meaning of
hymns or verses chanted before the Buddha.
Shōmyō are sung in every Buddhist school, but the Tendai 天台 and Shingon
真言 schools have an especially rich tradition.
There are many ways to categorize shōmyō. One way is on the basis of the
language that is used: Sanskrit, Chinese or Japanese. Hymns of praise in Sanskrit are
called Bon-san 梵讚, for instance; if they are in Chinese they are called Kan-san
漢讚, and if they are in Japanese they are called Wa-san 和讚. The shōmyō melodies
can furthermore be divided into three scales: ritsu 律, ryo 呂 and chūkyoku 中曲.1
Examples of different genres are: kada 伽陀 (Sanskrit gāthā, or hymns
describing Buddhist doctrines and virtues), santan 賛嘆 (texts in praise of venerated
monks), rongi 論議 (debate/catechism), saimon 祭文 (prayers), hyōhyaku 表白
(expressing the dedication of the service), goeika 御詠歌 (pilgrim’s hymns), zukyō
誦経 (recitation of sutras), nembutsu 念仏 (formula of invocation to a Buddha),
kōshiki 講式 (lecture, or exposition of the Buddhist teachings in prose) and kyōke
教化 (very similar to kōshiki but with a different structure). Finally, the shōmyō can be
divided into several types, based on the way in which they are recited:
– utau shōmyō; hymns that are sung, such as sandan, wasan and kyōke
1 The earliest Japanese shōmyō theorist was Annen, a pupil of Ennin. (See chapter 6.) His
Shittan-zō 悉曇蔵 is concerned with the correct pronunciation and chanting of Sanskrit texts,
in terms of a pentatonic system of the type later known as ritsu. In the 12th century shōmyō also
came under the influence of the heptatonic ryo system of gagaku, and a system of nine notes
was devised. This system, afterwards known as chūkyoku, is a combination of ryo and ritsu.
Later theorists such as Tanchi (Tendai) and Genkei (Shingon) in the 13th century, discussed
shōmyō in terms of these three systems, but many points are obscure, and practice then and
since does not necessarily agree with theory.
– yomu shōmyō; hymns that are recited, such as hyōhyaku and saimon
– kataru shōmyō; hymns that are narrated such as kōshiki and rongi
In the case of the sung hymns the stress is on the melody. The voice often stretches
in long musical flourishes (melismas), and the singing of a single syllable can take
as long as one and a half minutes. In the case of yomu shōmyō and kataru shōmyō the
style is very different: The syllables are not lengthened and each word is clearly
pronounced. Here the stress is on the content of the words, and especially so in the case
of kataru shōmyō such as kōshiki and rongi, although even these can contain long
decorative flourishes.
The melody of the chant consists of sequences of vocal formulae (setsu 節
‘sections’, kyoku-setsu 曲節 ‘melodic sections’ or senritsu-kei 旋律形 ‘melodic
patterns’), which make up the actual chant and connect the central tones. These
vocal formulae all have names, which in some cases agree among the different
shōmyō schools and other vocal music traditions such as Heikyoku 平曲 (the
recitation of the Heike monogatari 平家物語) and Nō 能. A few examples of these
vocal formulae are: yuri ‘tremolo’, yuro-ori ‘tremolo with descent’, sori ‘curve’,
sugu ‘straight’, modori ‘return’ (which can either be the same note repeated three
times, while cutting off the voice in between, or the same with the middle note
raised a second), iro-modori ‘colored return’ (with a tremolo on the middle note),
atari ‘sound that hits the mark’ (to make the voice jump up, and after cutting off the
voice for a moment, continue on a lower pitch), furi ‘wave’ (a tremolo followed by a
descent), suteru ‘let go’ (to let the voice fall until it dies out as in the exclamation
“Ahh!!”), hana wo ireru ‘nasal voice’, and so on.
The greater part of the shōmyō repertoire consists of pieces in free rhythm, without a
fixed meter. Generally speaking, pitch is indicated either by angled bars next to
the Chinese characters, as in the goin hakase 五音博士 system, or by contour
lines following the undulations of the voice, as in the meyasu hakase 目安博士
system. (See chapter 14 for an introduction to the musical notation systems used in
shōmyō.) Verbal directions are often added, for these signs are seldom exact and
more often mnemonic, the actual intonation being transmitted by oral tradition and
gradually changing from generation to generation.
5.2 Nara shōmyō
Buddhism was introduced in Japan from the 6th to the 9th century, initially via the
Korean kingdom of Paekche and later through direct contact with China. When
Buddhist scriptures were introduced in Japan they were at first not translated but
taken over in Chinese.
In the 7th and 8th centuries, the centre of Buddhism in Japan was Nara, but with
the shift of the capital to Heian-kyō, Nara Buddhism went into decline, and the new
esoteric Tendai and Shingon schools became important. It is likely that between
these two traditions and the Nara shōmyō tradition there initially existed extensive
contacts, and that the difference between them was not fundamental. According to
the Eigaku yō-ki 叡岳要記 2 for instance, in 794 Saichō 最澄 and monks from
several temples in Nara co-operated in a rongi ceremony, and in 834 Kūkai 空海 of
the Shingon school and others from the Tō-ji 東寺 temple co-operated with Ennin
円仁 in the performance of a rongi ceremony.3 The last time that such co-operation
between the Shingon and the Tendai school occurred was in 980, and one can
surmise that from that time on, the recitation traditions grew too divergent and could
not be combined any longer.
Because of these political and religious changes many new developments
occurred in Buddhist music. Of the Nara period Buddhist music almost nothing
remains.4 With the geographical shift in political power and the emergence of new
religious groups it disappeared or was absorbed into the new schools before the
Kamakura period (1185-1338), so that it is the Tendai and Shingon shōmyō that
form the basis of the tradition that has survived in Japan to this day, and not the Nara
shōmyō.
5.3 Heian shōmyō: The introduction of Tendai and Shingon
The two Buddhist schools that were dominant in the Heian period were the Tendai
and the Shingon school, which were the leading schools of the Tang court. Tendai
was introduced to Japan by Saichō 最澄 or Dengyō Daishi 伝教大師 (767-822) who
went to Tang China to study for eight and a half months from 804 to 805. The centre
of this school was the Enryaku-ji 延暦寺 on Mount Hiei near Kyōto.
Shingon was introduced to Japan by Kūkai 空海 or Kōbō Daishi 弘法大師 (774-
835), who went to China to study for two and a half years from 804 to 806. Amongst
the things he brought back with him to Japan were 142 translations in 247 volumes
of texts of the new esoteric Buddhism, chiefly those of Amoghavajra (705-774).
Whereas some schools saw esotericism (mikkyō 密教) as a complementary
practice, for the Shingon school it was central (the word shingon itself means ‘true
word’ or mantra). Shingon’s esotericism is called Tō-mitsu 東密 and takes its name
2 The Eigaku yō-ki is a temple chronicle from the end of the Heian period describing the history
of the Enryaku-ji 延暦寺.
3 A rongi is a debate on Buddhist doctrine, which in later ages developed a fixed shape. The fact
that this kind of co-operation was possible must also mean that the Nara schools (who used the
Go-on reading of Chinese characters) and the Tendai and Shingon schools used a similar
pronunciation of Chinese characters. Although Shingon and Tendai were introduced on a
Kan-on basis, Shingon almost completely reverted to Go-on (using the Kan-on pronunciation
only to pronounce the dhāraṇī), while in the Tendai school the special Tendai Kan-on or Shin
Kan-on (cf. section 3.2.4) was restricted to the recitation of certain texts and never became
generally used.
4 The Nigatsu-sen 二月懺 (O-mizutori お水取り) chanted at the Nigatsu-dō 二月堂 is often
mentioned as the sole example of a ceremony that can be traced back to the old Nara shōmyō
tradition.
from the Tō-ji temple, the school’s main temple in Kyōto, but the school’s present-
day headquarters is at the Kongōbu-ji 金剛峰寺 on Mount Kōya (Kōya-san) in
Wakayama prefecture.5
In the Shingon school esotericism was important from the start, but it was the
Tendai monk Ennin 円仁 or Jikaku Daishi 慈覚大師 (794-864) who studied in
China for nine years from 838 to 847, who truly gave esoteric Buddhism a place in
the Tendai school.
As for the number of hymns and ceremonies that each of the aforementioned
figures brought back, judging from the materials it seems that the foundation of
Shingon shōmyō was laid by Kūkai in the Tō-ji, while the foundation of Tendai
shōmyō was laid by Ennin in the Enryaku-ji. Therefore, although it was Saichō who
founded the Tendai school in 806 and built the centre of the school, the Enryaku-ji
on Mount Hiei north-east of Kyōto, it should be said that the founder of Tendai
shōmyō is Ennin.
The Tendai tradition went through some tribulations in the Heian period: In 858,
another Tendai monk, Enchin 円珍 (814-891) likewise went to China to study, and
gathered a group of followers upon his return. In the 10th century a split occurred
between the followers of Ennin and Enchin, and Enchin’s followers moved to Onjō-
ji 園城寺 (or Mii-dera 三井寺). In 1081 the monks of the Enryaku-ji attacked and
destroyed the Mii-dera. Furthermore, at the end of the 10th century, Jie 慈恵
simplified and systematized Tendai shōmyō which had become confused, owing to
its variety. (The invention of the rongi (ceremonially chanted questions and answers
on Tendai doctrine), which would in time replace the original examinations, is
traditionally also attributed to Jie.)
By the time of emperor Ichijō 一条 (986-1011) new shōmyō melodies were
being composed in Japan. At the end of the Heian period, a narrative style of shōmyō
became popular, and the yomu shōmyō and the kataru shōmyō are the expression of
this trend.
5.4 A period of change
The 12th to 13th century was a period of great change in Buddhist music in Japan.
The Kōfuku-ji 興福寺 in Nara had been destroyed during the civil wars of the 12th
century, and its rebuilding stimulated a revival of Nara Buddhism. The Nara shōmyō
tradition however, was now based on one of the Shingon schools, Nanzan Shin-ryū
南山進流, combined with one of the Tendai schools, Myōnon-in-ryū 妙音院流. In
5 The Shingon school is also known by such names as Shingon-darani 真言陀羅尼, Mandara
曼荼羅 or Dainichi 大日. The basic sutras of Shingon (Dainichi-kyō 大日経 and Kongōchō-kyō
金剛頂経) stem from Dainichi (Mahāvairocana). Because Mahāvairocana means Great Sun,
there developed a strong fusion between Shingon and Shintō (through association with
Amaterasu, the sun goddess), which resulted in syncretistic doctrines like Ryōbu shintō
両部神道 and Shugen-dō 修験道.
the Kamakura period the Tendai and Shingon schools, which by now had existed for
about 400 years in Japan, had taken shape as established religious groups, and
various kinds of shōmyō traditions had developed.
With the spread of Nara Buddhism to the Kantō region, a shōmyō school was
founded by Kenna 剣阿 (combining Shin-ryū and Myōnon-in-ryū) in Kanazawa
金沢 in the Shōmyō-ji 称名寺. This school later died out, but a number of
early examples of Kakui’s goin hakase musical notation system – which was at first
not readily accepted in Kakui’s own Shingon school – have been preserved at this
temple.
Furthermore, from dogmas that originally belonged to the Tendai school,
Jōdo-shū 淨土宗 (founded by Hōnen 法然 1133-1212), Jōdo shin-shū 淨土真宗 (founded
by Shinran 親鸞 1173-1262) and Nichiren-shū 日蓮宗 (founded by Nichiren 日蓮
1222-1282) developed. The growth in popularity of the Jōdo teachings, which held
that it is possible to attain Buddhahood by being reborn in the Pure Land (Jōdo
淨土) of Amida, resulted in the development of the Japanese shōmyō genre of kōshiki.6
The first kōshiki is attributed to the Tendai monk Genshin 源信 (also called
Eshin 慧心, 942-1017), who was a pupil of Jie. The most well-known kōshiki
composer, however, is Myōe 明慧 (1173-1232), a priest of the Kegon school.
Although kōshiki did not belong to the established shōmyō tradition, it transcended
religious boundaries and became popular in every religious school.
The cultural contacts with China, which had been interrupted, were renewed and
Zen 禅 was introduced as a new school from Song China. The three schools of Zen
Buddhism (Rinzai 臨済, Sōtō 曹洞 and Ōbaku 黄檗) imported the ceremonial music
of the Song period, which had some influence, in the first place on Nara shōmyō, but
also on the other shōmyō traditions. A lively exchange in ceremonial music
developed between the old shōmyō traditions of Nara, Tendai and Shingon, and
these new schools, both from inside and from outside the country. The shōmyō of
every school underwent an enormous transformation: In 1145 for instance, a
conference was held at the Ninna-ji 仁和寺, to solve the confusion that existed
between the different schools of Shingon shōmyō. In the same period, in the Tendai
school, there was the conflict between the shin-ryū 新流 and ko-ryū 古流 groups at
Ōhara. (For an introduction to these and other subschools, see the next sections.)
The new religious schools all tried to create ceremonial music influenced by folk
music, and some of this has survived, but none of them could form a tradition strong
enough to withstand the traditions of Tendai and Shingon. As for the newly
established groups, apart from Jōdo shinshū, they lost their own characteristic
ceremonies after a while, and started to take over shōmyō of other groups, especially
the Tendai Ōhara shōmyō. It is only in the wasan and goeika genre that folksongs
continued to be used from the Kamakura period on until the Edo period.
6 A kōshiki is a chanted exposition of the Buddhist teachings which promotes devotion to Amida.
It is written in Chinese (Kanbun), but read according to the Kanbun kundoku method, i.e., it is
translated into Japanese while reading, with the help of reading aids.
5.5 Shōmyō traditions within the Tendai school
Ennin had divided the hymns he had learned in China into different groups for his
followers to pass on, which resulted in a split into many different shōmyō schools.
The tradition was restored by Ryōnin 良忍 (1071-1132) who reunited the schools,
and has since been revered as the restorer of musical chanting in the Tendai school.
In 1109 he founded a temple, the Raigō-in 来迎院, in Ōhara near Kyōto, and made it
into the centre of the Tendai school. Since then his school has been called Ōhara-ryū 大原流
or sometimes Gyosan-ryū 魚山流, and it has been the leading school of Tendai
shōmyō since the middle of the 12th century, although not without further upheavals
and disruptions.
According to tradition, Kekan 家寛, one of Ryōnin’s pupils, compiled the most
commonly used shōmyō in two volumes under the title Gyosan shōmyō rokkan-jō 魚
山声明六巻抄 in 1173, but the actual history of the origin of this collection is quite
complicated. The oldest dated copy is from 1481 (Giesen, 1977: 12-16). It was
reprinted many times and is still in use in this school.
Around 1200 Ōhara-ryū split into two groups, one innovative (shin-ryū 新流)
and one conservative (ko-ryū 古流). The leader of the first group was Tanchi 湛智
(1163-±1240), the author of Shōmyō yōjin-shū 声明用心集 (1232) who studied the
work Shittan-zō by Annen intensively, and introduced many revolutionary practices
in shōmyō and in the hakase notation. He established new modi and rhythm and a
fixed order in which to sing the hymns. His opponent was Jōshin 成親, who
preserved the traditional way of performing introduced by Ryōnin and whose school,
because of this conservative character, was called ko-ryū.7
A pupil of Tanchi, Shūkai 宗快, compiled Gyosan mokuroku 魚山目録 (1235-
1237), the shōmyō collection used in the Tendai Ōhara school.
Another Tendai shōmyō school that developed in this period was the Myōnon-in-
ryū 妙音院流. This school was founded by Fujiwara Moronaga 藤原師長 (1137-
1192) who is also known under his Buddhist name (Myōnon-in) Chōen 長円. He
wrote important books on Gagaku 雅楽 music and used a 13-stringed zither or sō 筝
while teaching shōmyō. In time both this tradition and the ko-ryū tradition were lost.
Tanchi’s innovative school on the other hand, lived on under the name Ōhara-ryū,
and Tanchi’s theory was elaborated by several of his pupils who wrote instruction
books. In 1352 however, after a period of decline, the Raigō-in, which until then had
been the seat of the Ōhara school, was taken over by other schools, and the tradition
was interrupted. It was only restored in the early 15th century by Ryōyū (1351-1421)
7 Especially among their later followers a vehement polemic arose. In Nomori no kagami
野守鏡 (1295) Tanchi is accused of causing great confusion in the Ōhara school, as a result of
which many admirable shōmyō of Ryōnin were lost. The author goes as far as to deny Tanchi
legitimacy, as he claims that Tanchi had not even received the secret oral transmission or kuden
口伝 of Ryōnin, and based his recitation purely on the hakase (Giesen, 1977:87). The
authorship of Nomori no kagami has not been definitely established. Although often attributed
to the poet Roku-jō Arifusa 六条有房, it may stem from some unidentified Tendai priest.
who was originally from Hiei-zan, in the Jōrenge-in 浄蓮華院, which has been the
legitimate seat of the school since then (Giesen, 1977:23).
After the sixteenth century, much of the tradition was again lost. In 1842, a book
attempting to explain many unclear points in the melodies was written by Shūen 宗淵
(1786-1859) together with a large compilation of hymns, the Gyosan sō-sho 魚山叢書.
Nowadays in the liturgy of the Tendai school only Ōhara-ryū hymns are used.
5.6 Shōmyō traditions within the Shingon school
After Kūkai established Shingon shōmyō the tradition survived for almost two
centuries, after which it fell into decay and split into a great number of subschools.
Two groups that already existed in the mid-Heian period were for instance the
Ninna-ji-ryū 仁和寺流 and the Daigo-ryū 醍醐流. Another, somewhat later group
was the Shin-ryū, founded around 1100 by a monk called Shūkan 宗観 or
Daishin 大進, after whom the school was named. (Shin-ryū 進流 is an abbreviation
of Daishin shōnin-ryū 大進上人流.)
In 1145, at the conference in Kyōto at the Ninna-ji temple (cf. section 5.6), the
chaotically divided shōmyō schools of the Shingon school were organized into four
new schools, so that the official shōmyō schools within Shingon were initially as
follows: the Ninna-ji Sōō-in or Hon-Sōō-in-ryū 本相応院流 (‘original’ Sōō-in-ryū)
and the Shin-Sōō-in-ryū 新相応院流 (‘new’ Sōō-in-ryū), the Daigo-ryū 醍醐流 at
the Daigo-ji 醍醐寺 temple, and finally the Shin-ryū 進流 school. The original seat
of the Shin-ryū had been in Nakanogawa near Nara, but it was later (1235) moved by
one of Shūkan’s followers to Kōya-san. Thus the Kongōbu-ji on this mountain
became the centre of the school, and Shin-ryū is therefore better known under its
later name Nanzan Shin-ryū 南山進流 (Nanzan = Kōya-san).
The Hon-Sōō-in and Shin-Sōō-in schools were reunited, but again split into two
branches at the beginning of the 13th century, namely the Bodai-in 菩提院 branch
(founded by Gyōhen 行遍) and the Saihō-in 西方院 branch (founded by Sonpen 尊遍,
Gyōhen’s elder brother).
Between 1140 and 1299 a major doctrinal division occurred
within Shingon Buddhism, into the Kogi 古義 ‘traditional doctrine’ group and the
Shingi 新義 ‘new doctrine’ group. (Those who held that the Buddha preaches in his
primordial body were called the Kogi branch, while those who held that the Buddha
preaches in his manifest form were called the Shingi branch.) The Kogi group
included the above-mentioned shōmyō schools, while the Shingi group created its
own shōmyō tradition by combining the shōmyō traditions of different schools.
5.6.1 Kogi Shingon
Kogi Shingon has many subsects: Tō-ji 東寺, Daigo 醍醐, Daikaku-ji 大覚寺,
Omuro 御室, Sennyū-ji 泉涌寺, Yamashina 山階 and Zentsū-ji 善通寺. As to
shōmyō schools within Kogi Shingon, there were of course the aforementioned
Sōō-in-ryū, Daigo-ryū and Nanzan Shin-ryū, of which the last became most famous.
Kindaichi (1972) describes the Muromachi period (14th-16th c.) as a dark age in
the history of Shingon shōmyō, as many theoretical works from this period are so
confused that it is hard to make any sense of them. (He mentions, for instance, that
the habit of chanting the kaku tone at the same tone height as the chi tone, which
can still be found in modern Nanzan Shin-ryū shōmyō, probably stems from this
period.)
During the Ōnin war (1467-1477) Kōya-san was spared most of the unrest, and at
the end of the 15th century the first shōmyō collections, such as Gyosan taigai-shū
魚山蠆芥集 (1496), were printed. In this work, in which Kakui’s goin hakase system is
used, the most commonly used hymns were collected by Chōe 長恵 (1458-1524). It
went through many modified reprint editions; the oldest extant edition is from 1646.
The musical scores that are used today in Nanzan Shin-ryū Shingon go back to
the late 16th century, when Chōi 朝意 (1518-1599) undertook a revision of all the
important shōmyō texts of the Nanzan Shin school, and these improved versions
were printed in the first half of the 17th century.
From the beginning of the Tokugawa period (the 17th century), however, the
Nanzan Shin-ryū tradition was in severe decay. The other two schools, Sōō-in-ryū
and Daigo-ryū, suffered the same fate.
Only the Nanzan Shin-ryū school has survived to this day, as the tradition of this
school was revived in the second half of the 18th century.
However, according to Iwahara Taishin (1932), who is an authority on the
shōmyō of the Nanzan Shin-ryū school, in the present-day shōmyō of this school
melodies that belong to the ryo scale have been deformed so strongly that the
melody has become almost flat, and (as mentioned) the melodies in the ritsu and
chūkyoku scales also differ considerably from what is written in the score. 8
After Ninna-ji and Daigo lost their tradition, they took over Shin-ryū shōmyō, so
that nowadays, in almost all temples that belong to the Kogi Shingon school, the
liturgy is conducted only with Shin-ryū hymns.
8 In Nihon shisei kogi (1951) for instance, Kindaichi describes how he went up to Kōya-san,
hoping to find the Middle Chinese tones still reflected in the chanting of the ceremonies. He
discovered that in modern Nanzan Shin-ryū shōmyō, there no longer existed a relation between
the tones of the characters and their tone in the melody of the chants. It also turned out that the
shōmyō were recited in completely different ways by different authorities on Nanzan Shin-ryū
recitation. He therefore had to rely on the fushihakase marks added to texts with tone dot
markings, as the ‘oral material’ that he had been hoping for was not available.
5.6.2 Shingi Shingon
Shingi Shingon was founded at the end of the Heian period by Kakuban 覚鑁
(1095-1143) a follower of Kūkai, who opened a centre on Kōya-san. (This was
before the Shin-ryū shōmyō tradition moved there.) There was conflict with the
traditional centre Kongōbu-ji, and in 1140, the Mitsugon-in 密厳院, where Kakuban
had retired, was destroyed by angry Mount Kōya priests. Kakuban and his followers
escaped to Negoro-san (Mount Negoro), 25 km to the north-west, where Kakuban
died in 1143. Some 140 years later, under Raiyu 頼瑜 (1226-1304), an official split
occurred: in 1299 Raiyu founded the Negoro-ji 根来寺, the headquarters of Shingi
Shingon on Negoro-san.
Much less is known about the origin of Shingi Shingon shōmyō than about the
origins of Nanzan Shin-ryū shōmyō. Kakuban is the ancestor of the Shingi Shingon
teachings, and not of the shōmyō. 9 The shōmyō of the Shingi group were made by
monks of the Negoro-ji after the split under Raiyu. According to tradition they
revived the Daigo tradition, which had fallen into decline shortly after its official
foundation in 1145, and combined it with innovations taken over from the Shin-ryū
tradition. As they are thus said to incorporate the Daigo tradition they are by some
considered to have an older origin than the Shin-ryū shōmyō of the Kogi Shingon
school. Kindaichi (1964) on the other hand, is skeptical about a supposed Daigo-ryū
component in Shingi Shingon shōmyō.
For hundreds of years, the main temple of this group was the Negoro-ji, until it
was destroyed by Toyotomi Hideyoshi in 1585. The Shingi Shingon school first fled
to Kōya-san, which they left after two years. Afterwards one group established itself
at the Hase-dera 長谷寺 in Buzan near Kyōto, which became famous as a study
centre for the Shingon school, while another group moved to Higashiyama in Kyōto
where they established themselves in the Chishaku-in 智積院. Since then the two
groups have been opposed to each other, and today Shingi Shingon still has the Buzan-ha
豊山派 and the Chizan-ha 智山派 branches.
The period in which Shingi Shingon shōmyō achieved its greatest success is after
the destruction of the Negoro-ji. In its present form, the shōmyō collection used in
this school, Gyosan-shū 魚山集 (1682), goes back to Raishō 頼正. The period of the
great flourishing of Shingi Shingon is counted from the publication of Gyosan-shū
to the early 18th century. In this period Shingi Shingon shōmyō clearly surpassed the
Nanzan Shin-ryū tradition, which was in severe decay.
The shōmyō tradition of the Shingi Shingon school is among those that
have survived to this day; the Buzan-ha in particular remains strong.
9 It is not too clear what shōmyō Kakuban may have used, as the Shin-ryū shōmyō tradition had
not yet moved to Kōya-san. According to tradition however, the Kōya-san shōmyō tradition had
originally been imported from the Sanbō-in 三宝院 (which later became the head temple of the
Daigo school in Kyōto) to Kōya-san at the beginning of the Kamakura period.
5.7 The antiquity of the shōmyō traditions that have survived to this day
In the Tendai school, the first reform that is mentioned is the one by Jie at the end of
the 10th century, which was apparently needed because Tendai shōmyō had become
confused. These reforms may not have been very radical, but a subsequent reform
has become famous: The Tendai shōmyō tradition fell into decay sometime in or
before the 11th century, and was restored in the early 12th century by Ryōnin in his
shōmyō restoration.
Around 1200 the school split into a conservative and an innovative group of
which only the innovative group led by Tanchi survived under the name Ōhara-ryū.
In the 14th century however, there was a breach in the tradition, and the tradition was
only revived in the early 15th century. After the 16th century the tradition again went
into decline, but was restored again in the 19th century. The oldest extant shōmyō
collection (Gyosan shōmyō rokkan-jō) dates from 1481, but the present-day shōmyō
repertoire (collected in Gyosan sō-sho) goes back to 19th century editions.
The Kogi Shingon shōmyō tradition first fell into decay in the 10th century, but
after a period of chaotic division it was restored and reorganized in the 12th century.
The only Kogi Shingon school that has survived is the Nanzan Shin school. The
extremely confused tone description in Shishō shiki (1409) shows that the tradition
of this school must have been virtually lost during the 14th century. (See section
12.2.1.) The musical scores that are used today in Shin-ryū Shingon only go back to
revisions made in the late 16th century. In the 17th century the Kogi Shingon school
was in severe disarray and would only be restored in the latter half of the 18th
century. It is only since this time that the shōmyō tradition of Kōya-san developed
into the representative school of Shingon shōmyō. The oldest extant edition of the
shōmyō collection Gyosan taigai-shū is from the year 1646.
The origin of the Shingi Shingon shōmyō tradition is not clear, and there is not
much information on the state of the Shingi Shingon shōmyō tradition in the Middle
Ages. The rongi tradition apparently died out at the Negoro-ji in the 15th century, but
was revived in the late 16th century. In its present form, the shōmyō collection that is
used today in this school goes back to late 17th century editions.
The history of the different shōmyō schools in Japan seems to indicate that the
present-day shōmyō traditions of the three surviving schools are old, but do not go
back to the time of the introduction of Tendai and Shingon in Japan in the 9th
century. Neither do they go back to the 11th to late 13th centuries, when the use of
tone dots to mark the tones of Middle Japanese proliferated. The important period of
decline in the shōmyō tradition in the 14th century, which affected both the Tendai
and the Shingon school, may have been the result of a combination of causes. There
were, of course, the political and economic upheavals of the Nanboku-chō period,
but the leftward tone shift in Kyōto proposed by Ramsey, which disrupted the
traditional tone system of the standard language, would probably have been even
more important.
The present-day recitation practice, and the present-day views on the value of the
Middle Chinese tones (Go-on as well as Kan-on), cannot be traced back directly (i.e.
in an uninterrupted tradition) further than the late 16th century at the earliest. This
means that as far as the reconstruction of the tone system of Middle Japanese is
concerned, the modern Tendai and Shingon traditions are of little value.
There are, however, historical descriptions of the tones from Tendai and Shingon
monks, and manuscripts with fushihakase musical notation marks from earlier
periods. If interpreted correctly, these form a valuable source of historical
information on the Japanese tones.
6 The earliest tone description in Japan: Shittan-zō
The tone descriptions contained in the work Shittan-zō 悉曇蔵 (‘Siddham treasury’)
written by the Tendai monk Annen 安然 in the year 880 are the earliest tone
descriptions that originated in Japan. Annen was a disciple of Ennin and the first
Japanese shōmyō theorist. Shōmyō theory from after Annen’s time is invariably
based on his work. A shōmyō reformer like Tanchi 湛智 of the Tendai school for
instance (who wrote Shōmyō yōjin-shū 声明用心集 in 1232), studied Shittan-zō
intensively (Yoshida, 1954). Annen’s work deals both with musical theory and
Siddham phonology.
Annen’s description is in many ways unique: It is the only work that dates from a
time when direct contact with China was still relatively recent (official contact with
China had been severed around the middle of the 9th century), and the only work that
compares several different tone traditions that were all based on Late Middle
Chinese. It is also the only work that stems from the period when the new Chinese
standard language from Chang’an was still in the process of settling into a form of
Sino-Japanese (Kan-on).
6.1 Annen’s four traditions
In volume V of his work, 1 Annen mentions four important teachers of Chinese
pronunciation in Japan by their abbreviated names: Biao 表 (Japanese: Hyō) and Jin
金 (Japanese: Kin), who came first but for whom no dates are given, and the Japanese
Tendai monks Sei 正 and Sō 聡 who had studied in China and returned to Japan in
847 and 877 respectively. The tone systems of the latter two may have been passed
on to Annen directly, but the tone systems of the first two were probably handed
down to him by kuden 口伝, or oral tradition.
1 I will use the version of Annen’s text presented by Endō (1988). Endō has compared three
manuscripts of the text, one from 942, one from 1085 and one from the end of the Kamakura
period (1185-1338), and two printed versions from 1672 and 1789 respectively. Endō follows
the text of the oldest, most valuable manuscript of the year 942, which is only 60 years later
than Annen’s original, but indicates when other texts deviate from this text. (The main instance
in which Endō’s text differs from the one in Taishō shinshū dai-zō 大正新修大蔵 vol. 84 used
by Mei (1970), is in line 10.)
6.1.1 Biao and Jin
The first two are identified as Biao Xingong 表信公 (Japanese: Hyō Shinkō) and Jin
Lixin 金礼信 (Japanese: Kin Reishin) in Iida’s study (1955:70). This is because in
the work Shittan sammitsu-shō 悉曇三密鈔 (1682), Jōgon 淨厳 mentions all of the
four teachers together, but this time Biao and Jin are given full names.2
Jōgon further claims that, first of all, Go-on was taught by Jin Lixin on the island
of Tsushima, and this pronunciation was therefore also called Tsushima-on. And
after that, in Hakata in Kyūshū, Kan-on was taught by Biao Xingong. (If Jin was
really connected to Tsushima he may very well have been Korean.)
Jōgon bases his story on ‘a certain comment’ (或抄) which has so far not
been identified (Wenck, 1953:312). Wenck therefore assumes that Jōgon
merely tried to make a logical connection between two of the different names used
for the older Sino-Japanese of the Heian era: Go-on and Tsushima-on.
A Jin Lixin is, however, also mentioned in the Heian period work Sandai jitsu-roku
三代実録 (901). He is presented as a naturalized Chinese scholar (kika-jin 帰化人)
of the Tang era, but in this case together with a certain Yuan Jinqing
(Japanese: En Shinkei) 袁晋卿 of around the year 770, and Motoori Norinaga 本居宣長
therefore explained Annen’s Biao as the result of a copying mistake in the
manuscript (Wenck, 1953:312). (In the manuscript of the year 1085 that Endō studied,
there is indeed one instance (in line 2) where the character 袁 is used instead of 表,
but this is again corrected to 表 in the margin.) In imperial records of the 8th century
a Yuan Jinqing is also mentioned, who is identified as a Chinese on-hakase who
came to Japan in 735.
Mabuchi (1962:340) thinks that Jin Lixin may have been a Korean from Shilla
(Korean: Kim Ye-sin). In the Nihon shoki 日本書紀 for instance, it is mentioned
that the name Jin was very common in Shilla but unusual in Paekche and
Mimana. 3 I think, however, that without Jōgon’s ‘certain comment’ the indication of
a Korean background for Jin is weak. In Sandai jitsu-roku (901) after all, which is so
much older than Jōgon’s work, Jin is identified as a naturalized Chinese. (Although
nowadays the family name 金 is more common in Korea than in China, this has not
always been the case.)
As for Biao, without the ‘comment’ the identification of his full name as Biao
Xingong (Hyō Shinkō) is solely based on Jōgon’s Edo period work. The connection
of Biao with Yuan Jinqing, on the other hand, is based on two things: the fact that
the latter is mentioned together with Jin in Sandai jitsu-roku, and the use of the
character 袁 in the Shittan-zō manuscript of 1085. Although this makes me lean
towards an identification of Biao with Yuan Jinqing, I will indicate his name
throughout as Biao.
2 Jōgon (1639-1702) was a scholar of the Siddham script and Chinese and the teacher of Keichū
契沖 (1640-1701), who established the official Japanese orthography that was in use until 1945
in his work Waji shōran-shō 和字正濫抄 (1693).
3 Mimana is the Japanese name for the Korean state of Kaya.
The traditions concerning these two teachers of Chinese are complex, but more is
known about the people who brought the later two traditions to Japan.
6.1.2 Isei and Chisō
These later two teachers have been identified by Hashimoto Shinkichi (1920) as the
monks Isei 惟正 and Chisō 智聡. Annen says that they explained both the Wu and
the Han pronunciation (此両法師共説呉音漢音) but the descriptions in Shittan-zō
are no doubt of the latter, most likely the dialect spoken in Chang’an. About Isei,
Annen says that he studied first in Luoyang, and later in Chang’an and that he
returned in 847 to Japan. His itinerary and the date of his return coincide with those
of Ennin and it is therefore thought that he belonged to the same mission. This
makes it likely that the tone system transmitted by Isei (and probably also the one
transmitted by Chisō) was close to the original tone system of Tendai Kan-on.
The monk Chisō lived for a long time in Chang’an but also traveled around in
the south and in the north and finally returned to Japan in 877. Annen also mentions
that according to Chisō the tone systems of the other three did not exist in Tang
China (但聡和上説云、前三家音巨唐無矣). Annen probably knew Chisō
personally (they were contemporaries belonging to the same Buddhist school, and in
the text Annen praises Chisō’s thorough knowledge of the dialects), and from the
above remark, it seems that he held his tone description in high esteem as the most
recent to come from China.
As for Chisō’s remark, Mabuchi (1962:342) argues that in case of the earlier two
traditions one has to bear in mind that their descriptions are of another time than that
of Chisō, and also of another type of Chinese. Jin and Biao – as official teachers –
most likely transmitted the early Tang standard reading pronunciation (dokusho-on
読書音) of the characters, while the latter two monks most likely described the
spoken language of Chang’an. That the tone system of Isei, who had after all lived in
Chang’an for a long time, did not exist is hard to believe. Perhaps his tone
description was somewhat influenced by his stay in Luoyang, or perhaps there was
some rivalry between two people with rare and prestigious knowledge from abroad.
Summarizing we can probably say that the first two traditions go back to
standard reading pronunciations taught by on-hakase from China, while the second
two traditions go back to reports on the spoken language of Chang’an by monks
returning from study in China.
6.2 Annen’s text
Biao:
1 我日本国元伝二音 Originally two sounds were transmitted to Japan
2 表則平直低有軽有重 According to Biao ping was straight and low/falling and has light and heavy
3 上声直昂有軽無重 Shang is straight and high/rising and has light but no heavy
4 去声稍引無軽無重 Qu is a little drawn out and has no (distinction between) light and heavy
5 入声径止無内無外 Ru stopped abruptly, and has no inner and outer (the syllable final consonants have lenited)
6 平中怒声与重無別 In the ping tone the nu-sounds are not distinguished from the heavy
7 上中重音与去不分 In the shang tone the heavy are not divided from the qu tone
Jin:
8 金則声勢低昂与表不殊 The tones according to Jin did not differ from those of Biao with respect to pitch and/or contour 4
9 但以上声之重 But Jin’s heavy shang tone
  稍似相合平声軽重 was somewhat like a combination of the light and the heavy ping tone
  始重終軽呼之為異 enunciating it beginning heavy and ending light is what makes it different
10 唇舌之間亦有差些舛 5 There is also a difference in the articulation
Isei:
11 承和之末正法師来 At the end of the Jōwa period (834-848) the monk Sei came
12 初習洛陽中聴大原 having first learned the Luoyang dialect, and then listened to the Taiyuan dialect
13 終学長安声勢太奇 and finally studied the Chang’an dialect, the tones have become quite strange
14 四声之中各有軽重 Each of the four tones has light and heavy
15 平有軽重軽亦軽重 and the light of the ping tone is again divided into light and heavy
16 軽之重者金怒声也 The heavy of the light are Jin’s nu-sounds
17 上有軽重 The shang tone has the light and heavy
18 軽似相合金声平軽上軽 the light is like combining the light ping and the light shang tone of Jin
19 始平終上呼之 beginning with the ping tone and ending with the shang tone
20 重似金声上重不突呼之 The heavy is like Jin’s heavy shang tone but without the abrupt pronunciation
21 去有軽重重長軽短 Qu tone has light and heavy. Heavy is long and light is short
22 入有軽重重低軽昂 Ru tone has light and heavy. Heavy is low and light is high
Chisō:
23 元慶之初聡法師来 At the beginning of the Gangyō period (877-884) the monk Sō came
24 久住長安委捜進士 having stayed long in Chang’an, where he made wide acquaintance with men of learning
25 亦遊南北熟知風音 and also through his travels north and south he became familiar with the various dialects
26 四声皆有軽重著力 All the four tones have heavy and light and enunciatory strength (voiced stop initials)
27 平入軽重同正和上 The light and heavy of the ping and ru are the same as those of Sei.
28 上声之軽似正和上上声之重 The light shang resembles the monk Sei’s heavy shang
29 上声之重似正和上平軽之重 The heavy shang resembles the monk Sei’s heavy-light of the ping tone
30 平軽之重金怒音也 The heavy-light of the ping tone are Jin’s nu-sounds
31 但呼著力為今別也 but they are now pronounced with enunciatory strength (as voiced stops)
32 去之軽重似自上重 The light and heavy of the qu tone resemble the heavy shang tone itself
33 但以角引為去声也 but are drawn out on a middle pitch and become qu tone
34 音響之終妙有軽重 At the end of the sound there is a slight difference between the heavy and the light
35 直止為軽稍昂為重 If it stops directly it is light. If it rises slightly it is heavy
36 此中著力亦怒声也 In this the sounds with enunciatory strength (the voiced stops) are again the nu-sounds.
4 Literally: ‘low/falling and high/rising’
5 The character 舛 used here has the meaning ‘to differ’. The manuscript of 1085 and the one
from the end of the Kamakura period have a character here that consists of a reduplication of
牙 ‘tusk, fang’. This is a very rare character of which the meaning is unclear. Taishō-zō has 升
‘to rise, climb’, and thus Mei translates as: “In the process of articulating there also is a
differential rise”. This translation however, appears to be in contradiction with the first line of
Jin’s tone description.
In the interpretation of Annen’s text, I have relied heavily on the following two
excellent studies: Edwin Pulleyblank (1978) made extensive use of Annen’s text in
his article on Middle Chinese tone, and his groundbreaking study of this work has
clarified many previously obscure points. Furthermore, Endō Mitsuaki’s study
(1988) is essential, as Endō succeeded in explaining the meaning of many terms that
are hard to interpret, by looking at the way in which they are used in other parts of
Shittan-zō.
6.2.1 Heavy and light 重軽
The terms heavy and light originally indicated the presence or absence of aspiration.
The word heavy was used for Sanskrit voiceless aspirated consonants and voiced
aspirated consonants (ph, bɦ), and the term light was used for unaspirated
consonants, both voiced and voiceless (p, b, m). However, the way in which the
Sanskrit consonants were transcribed by means of Chinese characters in the Early
Middle Chinese-based transcription method had as a result that heavy ended up
being associated specifically with voiced aspiration:
As Chinese had voiceless aspirated obstruents these could be used to indicate the
same sound in Sanskrit, and no additional note indicating that these sounds should
be aspirated was required. The Sanskrit voiced aspirated sounds on the other hand,
had no equivalent in Early Middle Chinese, and so Sanskrit voiced and voiced
aspirated sounds were both transcribed by Chinese voiced obstruent initials. In order
to indicate the voiced aspiration of the latter, in the most precise texts the note heavy
was added to the Chinese transcription.6
1 Transcription of the Sanskrit initials by means of Chinese
Sanskrit    EMC
p           p
ph          ph
b           b
bɦ          b with the note 重 (= bɦ)
m           m
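The transcription practice summarized in the table above can be restated as a small lookup. The following Python sketch is illustrative only: the romanization `bh` for the Sanskrit voiced aspirate, the dictionary name and the two helper functions are assumptions of the example, not notation taken from the sources.

```python
# Hypothetical lookup mirroring the transcription table above.
# Keys are Sanskrit initials (romanized; "bh" is an assumed spelling for the
# voiced aspirate); values are (EMC initial, added note or None).
SANSKRIT_TO_EMC = {
    "p": ("p", None),
    "ph": ("ph", None),   # EMC had its own voiceless aspirate, so no note is needed
    "b": ("b", None),
    "bh": ("b", "重"),    # no EMC equivalent: plain voiced initial plus the note 'heavy'
    "m": ("m", None),
}


def transcribe(initial: str) -> str:
    """Return the EMC transcription of a Sanskrit initial, with its note if any."""
    emc, note = SANSKRIT_TO_EMC[initial]
    return emc if note is None else f"{emc} + note {note}"


def needs_note(initial: str) -> bool:
    """True if the transcription requires an explicit 'heavy' note."""
    return SANSKRIT_TO_EMC[initial][1] is not None
```

In this toy table only the voiced aspirate requires the note, which is the asymmetry that led heavy to be associated specifically with voiced aspiration.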
The term heavy in Shittan-zō therefore referred to voiced aspiration, and the
difference between light and heavy was one of voice quality, and was directly
related to the fact that in Late Middle Chinese the voiced stop initials of Early
Middle Chinese had developed voiced aspiration. 7 In Pulleyblank’s description:
“they were pronounced much as in the majority of modern Wu dialects, that is, with
at least partially devoiced onset followed by voiced aspiration spreading through the
syllable” (1978:179).
6 Karlgren postulated three obstruent series for the language of the Qieyun (EMC): voiced
aspirate, voiceless non-aspirate, and voiceless aspirate. As proof of the validity of the voiced
aspirates, he adduced the voicing of Go-on and Wu dialects, and the aspiration of Mandarin and
Hakka. It seems however, that the voiced obstruent series in Early Middle Chinese was
unaspirated. Maspéro (1920) already noted that the note heavy added to the transcription of
Sanskrit voiced aspirated consonants implies that the voiced stops as such were not voiced
aspirates. If they had been, Sanskrit unaspirated voiced obstruents would have been transcribed
by Chinese voiced stop initials with the added note light.
7 As Pulleyblank points out, in phonological systems which have obstruent series of the shape /b,
p, ph/, the voiced obstruent /b/ often seems to have a concomitant ‘breathy register offset’,
explaining the development from Early Middle Chinese to Late Middle Chinese.
It is because of the voiced aspirated nature of the voiced stop initials in Late
Middle Chinese that Biao’s heavy shang tone merged with the qu tone: The heavy
shang tone, being voiced aspirated with a final glottal stop, assimilated this glottal
stop to the voiced aspiration (pɦVʔ > pɦVɦ) and in this way merged with the qu
tone. 8 So while Early Middle Chinese voiced obstruents did not yet have voiced
aspiration, Pulleyblank’s claim that Late Middle Chinese did (i.e. EMC bV > LMC
pɦV) is crucial in explaining the merger of the heavy shang tone with the qu tone.
(It is highly significant that the shift from shang tone to qu tone did not take place
after sonorant initials, which are equally voiced, but lacked voiced aspiration.)
In the description of Jin’s tones the heavy shang tone starts heavy but ends light,
and has not merged with the qu tone. This indicates that in Jin’s tone system the
final glottal stop of the shang tone had not yet changed to voiced aspiration under
influence of the heavy initial. (We see how Annen uses the ping tone, in which the
heavy/light voice quality distinction was most clear, to describe the change from
breathy voice quality to clear voice quality within the syllable in Jin’s unmerged
heavy shang tone.) There is also an interesting allusion in line 10 to a difference in
the articulation of Jin’s heavy shang tone as compared to Biao’s. This difference
most likely also refers to Jin’s preservation of the syllable final glottal stop in this
tone.
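The chain of changes described above (an EMC voiced stop developing voiced aspiration, followed by assimilation of the final glottal stop of the shang tone) can be traced mechanically. The sketch below is a toy model under stated assumptions: voiced aspiration is written ɦ and the glottal stop ʔ, the qu tone is given a voiced final -ɦ (as footnote 8 supposes), only a handful of initials are listed, and the ru tone is left out. The `assimilate` switch is hypothetical; it separates a Biao-type system from a Jin-type system in which the heavy shang tone still ends light.

```python
# Toy model of the developments discussed above; all names are illustrative.
VOICED_OBSTRUENTS = {"b": "p", "d": "t", "g": "k"}   # EMC voiced stop -> devoiced onset
TONE_FINALS = {"ping": "", "shang": "ʔ", "qu": "ɦ"}  # assumed LMC syllable endings


def emc_to_lmc(initial: str, vowel: str, tone: str, assimilate: bool = True) -> str:
    """Return a schematic LMC syllable for an EMC initial, vowel and tone."""
    heavy = initial in VOICED_OBSTRUENTS
    if heavy:
        # EMC bV > LMC pɦV: the voiced stop develops voiced aspiration.
        initial = VOICED_OBSTRUENTS[initial] + "ɦ"
    final = TONE_FINALS[tone]
    if heavy and final == "ʔ" and assimilate:
        # pɦVʔ > pɦVɦ: the glottal stop assimilates to the voiced aspiration.
        final = "ɦ"
    return initial + vowel + final
```

Under these assumptions `emc_to_lmc("b", "a", "shang")` and `emc_to_lmc("b", "a", "qu")` come out identical, a sonorant initial such as `m` keeps its glottal final, and with `assimilate=False` the heavy shang syllable remains distinct from the qu tone, as in the description of Jin's system.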
One thing that appears to disagree with Pulleyblank’s interpretation is the fact
that according to Biao’s description, in the ping tone the nu-sounds (nasals, or
possibly all sonorant initials, see section 6.2.5) belong to the heavy category. This
seems to indicate that the term heavy referred in the first place to voice as such, and
not necessarily to voiced aspiration (although the origin of the term and the way in
which it was used in the transcription of Sanskrit support Pulleyblank’s
interpretation). It has to be kept in mind however, that the reason why Annen
mentions this state of affairs in Biao’s ping tone is because it was an exception. In
the other tones apparently, the division of the initials over the categories light and
heavy does agree with Pulleyblank’s interpretation of heavy as voiced aspirated.
The explanation that Pulleyblank suggests for this exceptional situation in the
ping tone – as opposed to the other tones – is that in Biao’s dialect the sonorant
initials in this tone had become voiced aspirated by analogy with the other voiced
sounds. (In many Wu dialects the sonorant initials are also characterized by breathy
aspiration.) The voiceless syllable-final consonants of the ru tone, and the glottal
stop of the shang tone must have prevented this from happening in the other tones.
The voiced aspiration of the heavy initials would only spread through the entire
syllable – giving it a breathy voice quality – when there was no final devoicing to
inhibit it, such as in the ru tone, which ended in voiceless oral stops. The fact that no
heavy/light distinction is mentioned for the ru tone in Biao’s system seems to
indicate that a syllable was only considered heavy if the breathy voice quality had
spread through the entire syllable, and not in case the syllable merely had a voiced
aspirated initial. 9
8 Pulleyblank remarks that for such assimilation to result in a merger with the qu tone, one must
assume that the final aspiration in the latter was also voiced. It is quite likely that this was the
case in Late Middle Chinese, as a change from voiceless -h to voiced -ɦ would account for the
fact that the qu tone had become somewhat longer at this point. (Cf. “Qu is a little drawn out”
in line 4 of Annen’s text.)
The qu tone would always end heavy (in voiced aspiration), and the voiced
aspiration may have spread through the preceding vowel. This would make the qu
tone inherently heavy, which would explain why no heavy/light distinction is
mentioned for this tone.
6.2.2 Low/falling and high/rising 低昂
There are only a few expressions in Annen’s text that directly refer to tone height.
The characters that are used are 低 and 昂. They can mean ‘low’ and ‘high’ but also
‘falling’ and ‘rising’ in Chinese, and context must decide which meaning is
appropriate. In lines 2 and 3, I think it is not possible to determine which meaning
should be chosen on the basis of the text.
The expression 直低 could either mean ‘falling with an even slope’ or ‘straight
and low’. Similarly, the expression 直昂 could either mean ‘rising with an even
slope’ or ‘straight and high’. The fact that Annen uses such parallel expressions
when he describes the ping and the shang tones in Biao’s tradition, suggests that he
either regarded them both as level tones, or both as contour tones. (The question of
how realistic this is as far as the tone system of Late Middle Chinese is concerned, is
discussed in section 9.4.3.)
Pulleyblank argues that in Annen’s two later 9th century tone descriptions, the
terms heavy and light have become ambiguous, and can now also refer to a
difference in tone height, but that in the case of the two older tone traditions (Biao and
Jin’s) interpreting heavy as [L] and light as [H] would be incorrect. “It is very
doubtful however, whether pitch as such was from the outset the pertinent feature of
the register distinction, any more than it was for the four tones themselves” (1978:
179). Pulleyblank, who interprets Annen’s ping tone as [L], sees the following
9 Pulleyblank (1978:182) adduces evidence from the Song period which seems to indicate that
voiced aspiration eventually did spread to the sonorant initials in the ru tone, although this was
still prevented by the syllable final voiceless stops in the early period represented by the
descriptions of Biao and Jin: In Shao Yung’s phonetic tables (of around 1030) the nasals and
laterals are described as muddy in the ping, qu and ru tones but as clear in the shang tone. In
addition, in the Wu dialects, the sonorant initials are phonetically muddy (with voiced
aspiration spreading through the syllable) in lower register tones, but clear in upper register
tones. In some northern Wu dialects this means that sonorant initials are muddy in the ping, qu
and ru tones (as the sonorant initials join the yangping, yangqu and yangru tones), but clear in
the shang tone (as in the shang tone they belong to the yinshang category in these dialects).
Perhaps Zhengzhang’s idea that the syllable-final glottal stop in the shang tone caused the
sonorant initials to become glottalized can explain the resistance of the nasals and laterals in
this tone to voiced aspiration (1995, unpublished ms. quoted by Sagart 1999).
6.2 Annen’s text 379
indication for this in Annen’s text (line 2): “the ping tone is called inherently low,
and yet it has the light and heavy distinction.”
By Isei and Chisō’s time however, the terms heavy and light had begun to refer
not only to a difference in voice quality but also to a lower and a higher register
respectively. Taking the heavy ru tone as an example: The voiced initials of the
heavy category would have lowered the onset of the syllable, making a low tone
possible but certainly not a falling one. In line 22: “Ru tone has light and heavy.
Heavy is low and light is high,” I have therefore chosen the translations ‘low’ and
‘high’ instead of ‘falling’ and ‘rising’.
Conversely in line 35: “If it stops directly it is light. If it rises slightly it is heavy”
the character 昂 must certainly mean ‘rising’. It is unlikely that the heavy qu tone
was higher than the light qu tone, but a lower onset of the tone due to the influence
of a heavy initial could very well have resulted in a ‘slightly rising’ tone.
Chisō’s description in line 35 does not disagree with the view that qu was
originally a falling tone. Under the influence of the voiced initials the heavy qu tone
had now developed a lower onset and as a result of this was slightly rising. However,
the light qu tone – which was not influenced in this way – most likely preserved a
falling tone contour.
6.2.3 Inner and outer 内外
The terms ‘inner and outer’ are only used in Biao’s tone description, in line 5. In the
Cantonese ru tone there is a distinction between short vowels (‘inner’) and long
vowels (‘outer’) but there are no examples of the use of ‘inner and outer’ as
indicators of vowel length in Shittan-zō. Pulleyblank wonders whether ‘inner and
outer’ could be a different way of referring to the heavy/light distinction again, but
concludes that the meaning of the terms is just not clear. Endō, on the other hand,
found instances in which these terms referred to the five articulation points of the
Sanskrit consonants, ‘inner’ meaning the back of the mouth and ‘outer’ the front.
He wonders whether this could mean that the distinction between the final -p, -t,
-k of the ru tone had already disappeared and been replaced by a glottal stop. This
would have been very early; Biao’s tone system probably dates from the 8th century.
(It would also mean that this dialect cannot have belonged to the type that formed
the basis of mainstream Kan-on.)
I find it more likely that Endō’s observation has to do with the lenition of the
final consonants in the ru tone that had occurred in Late Middle Chinese.10 About
the final consonants of the ru tone in Late Middle Chinese Pulleyblank (1978:176-
177) writes:
Another important point that emerges from transcriptions, and not only those
of Sanskrit, is that the final oral stops of the entering tone had evidently
10 One of the characteristics of Tendai Kan-on (which was introduced in the 9th century)
mentioned by Iida (1955: 80-83) is that one can already observe the weakening or
disappearance of the final consonants in the ru tone.
undergone some degree of lenition. It is well known that at this period
Chinese final -t is transcribed as -r in foreign scripts, including Tibetan,
Uighur and Khotanese Brahmi. Sino-Korean represents it as -l. Furthermore,
as Maspéro pointed out, Chinese -k was frequently used at this time to
represent Sanskrit visarga -h and is sometimes transcribed in Uighur script as
-ɣ. Evidence about final -p is less abundant but we do occasionally find it
transcribed as -v, instead of the usual -b, in Uighur script. (...) As Maspéro
points out (p. 44), the entering tone finals seem to have still been pronounced
as stops when in close juncture with a following occlusive. (...) The best
solution seems to be to postulate continuants accompanied by a glottal stop
alternating with oral stops: v//p, r//t, ɣ//k.
6.2.4 The tones 声勢
The expression seisei 声勢 or ‘tone’ is used twice: in line 8 and in line 13. Endō
argues from its use in Shittan-zō that this word is a synonym of yunmu 韻母, i.e. that it
refers to the final of the character, the traditional domain of the four tones (the
second character of the fanqie). Endō – who associates the terms heavy and light
with [L] and [H] pitch throughout his article – argues that the heavy/light distinction
belonged to the domain of the initial, the shengmu 声母 (the first character of the
fanqie) and that even if tones were different with respect to heavy and light, this
would not be considered to affect the ‘tone’ in this sense.
In this way, Endō tries to explain why the tones of Biao and Jin are called the
same with respect to pitch and contour in line 8, although they do differ
with regard to heavy and light, and at the same time maintain that heavy and light
refer to differences in tone height. It is hard to imagine however, that in a natural
language a difference in the tone height of the onset of a tone could fail to affect the
tone of the syllable as a whole.11
I therefore find Pulleyblank’s explanation of the apparent contradiction in lines 2 and 8
more convincing and natural: The light and heavy distinction in Biao and
Jin’s tone system was still not primarily related to pitch. It is true that in the later
two tone descriptions, which had gone through the split into a higher and a lower
register, Annen uses the terms in a broader sense, but the following passage from
Shittan-zō shows that Annen still adhered to the traditional meaning of the terms
which had related to voice quality and not to tone height:
低昂依下、軽重依上 pitch and/or contour depend on the second
character and light and heavy on the first.
11 Endō’s solution is reminiscent of the solution later developed by Myōgaku 明覚, who also
defined the heavy/light distinction purely in terms of tone height. By this time however, the
Japanese tone theories were no longer based on a form of spoken Chinese and had become
increasingly unnatural and theoretical.
6.2.5 Nu-sounds 怒声
In line 6 the term nu-sounds ‘angry voice’ is used. According to Arisaka Hideyo
(1936) Annen used this term to indicate the Chinese ci zhuo 次濁 second muddy
initials (sonorants, i.e. nasals, laterals, the lax fricative V and the ‘zero initial’ or K).
Originally the word was used to indicate Sanskrit aspirated and unaspirated
voiced obstruents, as opposed to voiceless ones (the so-called ‘gentle voice’ 柔声).
Nasals were called ‘not gentle or angry’ 非柔怒声. These expressions are also used
in this way in Shittan-zō (Konishi, 1948:528). The term ‘enunciatory strength’ 著力
is used as an extra note to indicate the voiced obstruents.
(2) Chinese terminology for the manner of articulation of the Sanskrit consonants
The five articulation points of the Sanskrit consonants    Name of class
k     c     ṭ     t     p                                   柔声
kh    ch    ṭh    th    ph                                  柔声
g     j     ḍ     d     b                                   怒声 (著力)
gɦ    jɦ    ḍɦ    dɦ    bɦ                                  怒声 (著力)
ŋ     ñ     ṇ     n     m                                   非柔怒声
In Late Middle Chinese the nasal initials were pronounced as prenasalized voiced
stops, except in syllables that also ended in a nasal.12 In the Late Middle Chinese
transcription practice developed by Amoghavajra in the beginning of the 8th century,
the Sanskrit unaspirated voiced stops were transcribed by means of Chinese nasals.
As shown in (3), the nasals belonged to the so-called second muddy class of Chinese
initials.13
12 Through Amoghavajra this peculiarity of the Chang’an dialect has left a trace in the Korean
method of transcribing Sanskrit with hangul graphs: the hangul syllable ma for instance is used
to transcribe Sanskrit ba, while the hangul syllable maŋ is used to transcribe ma. The same
method is applied to distinguish the other nasal consonants from the homorganic stops.
Amoghavajra’s new transcription method quickly replaced the systems that had existed before
him. Part of the reason for the popularity of his transcription is that his syllabary formed part of
a small sutra on the mystical value of letters transcribed by him. This is one of the sutras that
Kūkai brought home (and published under the title Bonji shittan-jibo narabi ni shakugi 梵字悉
曇字母并釈義).
13 The term qing 清 clear meant ‘voiceless’ while zhuo 濁 muddy meant ‘voiced’ (which in
practice in Late Middle Chinese meant ‘with voiced aspiration’ or ‘murmured’). The term
second muddy is a rather late development, paralleling the term second clear, which is probably
derived from Sanskrit. One of the older designations for the second muddy class is 不清不濁
buqing buzhuo ‘not clear not muddy’ or 清濁 qingzhuo ‘clear-muddy’. The Chinese rhyme
tables, from which these two later terms stem, developed in Buddhist circles in late Tang times.
(3) Chinese terminology for the manner of articulation of the Chinese consonants
Sanskrit    LMC          Name of class
p           p            清 clear
ph          ph           次清 second clear
b (怒)      mb           次濁 second muddy
bɦ (怒)     (bɦ >) pɦ    濁 muddy
m           mVm/n/ŋ      次濁 second muddy
At the same time, in Late Middle Chinese, the Chinese muddy sounds had started to
become devoiced (bɦ > pɦ). Because of this, in the transcriptions by Baoyue 宝月
(Jap. Hōgetsu) and Shuei 宗叡14 in the beginning of the 9th century, even Sanskrit
voiced aspirated sounds were now sometimes transcribed by means of Chinese
nasals. The muddy sounds had become voiceless to the extent that they were no
longer felt to be appropriate to transcribe voiced stops (although Amoghavajra’s
transcription system was not completely abandoned).
In this way, both Sanskrit voiced stops (nu-sounds) and voiced aspirated
stops (also nu-sounds) were now transcribed by means of Chinese nasals, so that the
term nu-sounds came to be associated more and more with the Chinese nasal
category of initials. (The fact that the initial of the nu character itself happened to be
a nasal sound may have strengthened this connection, as many traditional
phonological terms, such as the names of the tones, exemplify the categories to
which they refer.)
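
The shift in transcription practice described in the preceding paragraphs can be restated as a small lookup. The Python sketch below is only an illustration under stated assumptions: the names `AMOGHAVAJRA`, `NU_SERIES`, `candidate_classes` and `shared_class`, as well as the period labels, are hypothetical and do not come from the sources. The mapping merely restates table (3) and the remark that voiced aspirates were later sometimes transcribed with nasals.

```python
# Illustrative sketch only: all names and period labels are hypothetical.
# The mapping restates table (3): Sanskrit stop series -> class of Chinese
# initial used to transcribe it in Amoghavajra's system.
AMOGHAVAJRA = {
    "voiceless": "clear",
    "voiceless aspirated": "second clear",
    "voiced": "second muddy",       # Chinese nasal initials (prenasalized stops)
    "voiced aspirated": "muddy",
    "nasal": "second muddy",        # nasal initial in a syllable with a nasal final
}

# The two Sanskrit series originally called nu-sounds.
NU_SERIES = ("voiced", "voiced aspirated")


def candidate_classes(series: str, period: str) -> set[str]:
    """Return the classes of Chinese initials used for a Sanskrit series."""
    classes = {AMOGHAVAJRA[series]}
    # Once the muddy initials were devoicing, voiced aspirates were sometimes
    # transcribed with nasals as well; the older practice was not abandoned.
    if period == "early 9th century" and series == "voiced aspirated":
        classes.add("second muddy")
    return classes


def shared_class(period: str) -> set[str]:
    """Classes that can transcribe every nu-sound series in a given period."""
    sets = [candidate_classes(series, period) for series in NU_SERIES]
    return set.intersection(*sets)
```

Under these assumptions, `shared_class("early 9th century")` contains only the second muddy (nasal) class, which is one way of picturing why the term nu-sounds came to be associated with the Chinese nasal initials.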
According to Arisaka, the term nu-sounds referred not just to the nasal initials,
but by extension to all sonorant initials, and was thus used as a means to refer to the
whole class of second muddy initials.15
When the term is used in line 6 of Biao’s tradition, it is for instance thought to
include all of the second muddy initials. It is definitely the case that later generations
of Siddham scholars in Japan interpreted the term nu-sounds in this passage in this
way (cf. section 8.1.1). In some passages however, the term may refer to the nasal
initials only.
Line 6 for instance, is the passage that describes how in the ping tone, the nu-
sounds had acquired a breathy register. As mentioned, Pulleyblank thought this
14 Ennin studied the pronunciation of Chinese and the Siddham script with both Shuei (in 838)
and Baoyue (in 840), who was a native of South India. Direct contact with Sanskrit therefore,
was still relatively recent in Annen’s time.
15 One may wonder why Annen – if he aimed at indicating all sonorant initials – did not use the
unambiguous terms buqing buzhuo 不清不濁 or 清濁 qingzhuo in which not only the nasals
but also the other sonorants are clearly included. It has to be remembered though, that although
these terms had already developed in the rhyme tables in China in the middle of the 8th century,
the rhyme tables only came to be known in Japan in the mid 13th century. (The first use of
terminology from the rhyme tables can be found in the works of the Shingon monk Shinpan 信
範 in the mid 13th century.)
meant that the sonorant initials had become voiced by analogy with the other voiced
sounds. But as it was only the nasal initials that had acquired a (prenasalized) voiced
stop pronunciation [mb], it is perhaps more likely that the development of the
breathy register was caused by analogy with the voiced stops, and only affected the
nasal initials.
When the word nu-sounds is used in lines 16 and 30, each time there is a
reference to Jin’s nu-sounds. If Annen’s aim had been to indicate the kind of initials
that made up the ‘heavy of the light’ tonal category in Isei and Chisō’s ping tone, he
could have said: “The heavy of the light are the nu-sounds.” Why then is there each
time a reference to Jin’s nu-sounds?
This is even more puzzling, as the nu-sounds are never mentioned in Jin’s tone
description. Instead, it is Biao’s tone description in which they feature, where it is
mentioned (as an exception) that in the ping tone they were heavy (voiced aspirated).
The references to Jin’s nu-sounds are therefore probably meant to stress that the
nu-sounds in Isei’s ping tone (line 16) and in Chisō’s ping tone (line 30) were not heavy
like those of Biao, but light (i.e. had a clear voice) like those of Jin.16 As we shall
see in section 6.3 this is in perfect agreement with Pulleyblank’s analysis of the
interesting three-way split in the ping tone mentioned in the tone descriptions of Isei
and Chisō. (An interpretation of the nu-sounds as including all sonorant initials
makes most sense in these passages.)
In line 36 it is mentioned that in Chisō’s qu tone, or (which is more likely) in his
tone system as a whole, the voiced stops are the nu-sounds. Unless this means that
all sonorant initials had developed a voiced stop pronunciation, nu-sounds in this
passage most likely refers to the nasal initials only. (See also the next section.)
6.2.6 Enunciatory strength 著力
This term is only used in Chisō’s tone description, in lines 26, 31 and 36. I have
adopted the expression that Pulleyblank uses to translate this term in Annen’s text as
it is close to the meaning of the characters. Pulleyblank suspects that it is a way to
describe voice quality as an element of the tones, and in his translation line 26 is:
“The four tones all have light and heavy enunciatory strengths.”17
Endō found other places in Shittan-zō in which the expression was used. Annen,
for instance, presents the system of Sanskrit vowels and consonants of Jiguang 智広,
and indicates the rows with voiced unaspirated and voiced aspirated sounds with
‘enunciatory strength’. Those were also the original ‘nu-sounds’ in Sanskrit, but as
we have seen above, the meaning of that word changed to indicating the Chinese
second muddy initials (or only the nasals among them). Enunciatory strength
16 See also the fact that they are considered to be light, as they are called ‘heavy (of the) light’ and
not ‘light (of the) heavy’.
17 Mei (1970) has: “All four tones have the variants distinguished by light versus heavy and
forceful versus non-forceful.”
therefore, is a term that is used to indicate a pronunciation as a voiced stop
(aspirated or unaspirated). Lines 26, 30, 31 and 36 now become:
26 All of the four tones have heavy and light, and voiced stop (mb) initials
30 The heavier light of the ping tone are Jin’s nu-sounds
31 but they are now pronounced as voiced stops
36 In this (the qu tone) the voiced stops are again the nu-sounds
To start with the last line: as Endō has pointed out, line 36 most likely means that
in Chisō’s tone system the Late Middle Chinese muddy initials had become
completely devoiced, so that the only voiced stops that remained in the system were
the nu-sounds. Line 26 poses no particular problems, but line 31 seems to indicate
that Jin’s nu-sounds – in contrast to those of Chisō – were pronounced as pure nasals,
and not as prenasalized voiced stops. In section 6.4 I will discuss what this probably
means as to the type of Chinese that Jin was transmitting.
6.3 The tone systems of Isei and Chisō
First of all, reading Annen’s text one gets the impression that the split into a heavy
and a light register affected more and more tones as time progressed. In Biao’s tone
system the only distinction was in the ping tone, in Jin’s tone system the shang tone
had light and heavy as well, while in the last two tone systems all the tones had the
distinction.
But we have already seen that it is in fact Jin’s tone system, in which the heavy
shang tone had not yet merged with the qu tone, that represents an older stage than
the tone system of Biao. As Isei and Chisō also have a heavy shang tone, at first
sight they seem more archaic than Biao. Pulleyblank however, argues that in their
tone systems also, the heavy shang tone had already merged with the qu tone.
This must be so, as Isei and Chisō almost certainly describe the dialect of
Chang’an. And Pulleyblank stresses that it was this dialect that was the Tang
standard language that spread through all of China. Not only do all the northern
Chinese dialects show this merger, many southern dialects do so as well, and all
dialects have this merger in their character reading pronunciation (which goes back
to the Tang standard language).
In Sino-Vietnamese, which is also based on this language, the heavy shang tone
has merged with the qu tone as well. The merger of heavy shang syllables with the
qu tone that is typical of the Tang standard language can also be found in Japanese
Kan-on readings and finally, as Rai (1951) has shown, the special Kan-on of the
Tendai school, which – at least in its original form – was probably very close to the
dialect learned by Isei and Chisō, also shows this merger.18
18 Isei and Ennin 円仁 probably belonged to the same mission. As Tendai Kan-on was introduced
by Ennin, the tone system of Tendai Kan-on and the tone system described by Isei must have
If so, how can the shang tone in the tone system of Isei and Chisō still have a
distinction between light and heavy? Pulleyblank’s explanation is as follows: In the
first two tone descriptions light and heavy were not directly connected to [H] and [L]
pitch, but by the time of Isei and Chisō this had become different. “In these 9th
century dialects we seem to have a situation in which a distinction, presumably of
pitch, based simply on voicing has appeared and been superimposed on the
distinction between clear and breathy voice quality which continues to exist
independently.”
Pulleyblank’s analysis is shown in (4). Vowels in which the voiced aspiration
has spread through the entire vowel, giving them a breathy voice quality, are
represented in bold print.
(4) Pulleyblank’s analysis of the tone systems of Isei and Chisō
     Ping                 Shang           Qu              Ru
H    pV (light)           pVʔ (light)     pVɦ (light)     pVp (light)
L    mbV (heavy-light)    mbVʔ (heavy)    mbVɦ (heavy)    mbVp (heavy)
L    pɦV (heavy)          →               pɦVɦ (heavy)    pɦVp (heavy)
The effect of the presence of two sets of distinctions can be seen most clearly in the
ping tone, which has split into three different categories. A double system of tone
height and voice quality is the best way to explain this split. The three categories in
the ping tone are light ([H], clear voice), heavy ([L], breathy voice) and ‘heavy of
the light’ ([L], clear voice). The ‘heavy of the light’ category is made up of the
second muddy initials.
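
The double system of tone height and voice quality in the ping tone can be made explicit in a few lines of code. The following Python sketch is only an illustration of the analysis summarized above, not a reconstruction tool: the function name `ping_category` and the string labels are hypothetical, and the rules simply restate the three categories as described in the text (pitch follows voicing of the initial, breathy voice is limited to the muddy initials).

```python
# Illustrative sketch only: function name and string labels are hypothetical.
VOICELESS = ("clear", "second clear")
INITIAL_CLASSES = VOICELESS + ("second muddy", "muddy")


def ping_category(initial_class: str) -> dict:
    """Return pitch, voice quality and label for a ping tone syllable."""
    if initial_class not in INITIAL_CLASSES:
        raise ValueError(f"unknown class of initial: {initial_class}")
    # Register split based on voicing: voiceless initials give [H], others [L].
    pitch = "H" if initial_class in VOICELESS else "L"
    # Older distinction: only the muddy initials give breathy voice.
    voice = "breathy" if initial_class == "muddy" else "clear"
    if pitch == "H":
        label = "light"
    elif voice == "breathy":
        label = "heavy"
    else:
        label = "heavy of the light"
    return {"pitch": pitch, "voice": voice, "label": label}
```

In this sketch the second muddy initials are the only class for which the two distinctions disagree ([L] pitch but clear voice), which is what produces the third category.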
In Biao’s description, the syllables with nu-initials had a breathy voice quality
just as the syllables with muddy initials. Because of the typical merger of the heavy
shang tone with the qu tone that is mentioned in this tone description, Biao’s tone
system is usually associated with Chang’an based Late Middle Chinese.
In order for the ping tone to split into three categories in the 9th century Chang’an
dialect as described by Isei and Chisō however, the sonorant initials must have had a
clear voice in this dialect. Apart from the fact that the breathy voice quality of
Biao’s nu-sounds is mentioned as an exception, we have also seen that Annen goes
to some effort to make clear, through his references to Jin’s nu-sounds, that the nu-sounds in
Isei and Chisō’s ‘heavy of the light’ category of the ping tone are light. The breathy
voice quality of the nu-sounds in the ping tone must have been a temporary
phenomenon in Chang’an, which had disappeared by the 9th century.
We see that the meaning of the terms heavy and light has become ambiguous
when they are applied to the three categories of the ping tone. Sometimes they refer
been very close.
to voice quality, sometimes to pitch and sometimes to both. In the other tones, not
all of these distinctions are represented, and the terms heavy and light are now used
to refer to tone height:
In the shang tone, syllables with muddy initials had merged with the qu tone, and
the remaining shang tone syllables all had clear voice. In the traditional meaning of
the terms heavy and light therefore, the shang tone would have been regarded as
light. However, the tonal split based on voicing had divided the shang tone into a
category with [H] pitch, made up of the clear and second clear initials, and a
category with [L] pitch, made up of the second muddy initials, and the terms light
[H] and heavy [L] were now used to refer to this difference.
In the qu tone, all vowels had a breathy voice quality and in the traditional
meaning of the term, the qu tone would have been regarded as heavy. (See also
section 8.1.2.) However, the tonal split based on voicing had divided the qu tone into
a [L] and a [H] register. When the qu tone is said to have a distinction between light
and heavy in Isei and Chisō’s descriptions therefore, this also refers to a difference
in tone height.
In the ru tone, the syllable-final voiceless stop prevented the voiced aspiration of
the muddy initials from spreading through the syllable, and so, in the traditional meaning
of the term, the ru tone would have been regarded as light, but here too, the terms
heavy and light were now used to describe the division into a [L] and a [H]
register.19
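
The treatment of the shang, qu and ru tones in the three preceding paragraphs can be summarized in the same way. The Python sketch below is a hedged illustration only: the function name `describe_syllable` and the string labels are hypothetical, and the rules restate the description above (merger of muddy-initial shang syllables with qu, clear voice in shang and ru, breathy voice throughout qu, and light/heavy used for [H]/[L]).

```python
# Illustrative sketch only: function name and string labels are hypothetical.
VOICELESS = ("clear", "second clear")
INITIAL_CLASSES = VOICELESS + ("second muddy", "muddy")


def describe_syllable(tone: str, initial_class: str) -> tuple:
    """Return (tone, pitch, voice quality, label) for shang, qu or ru."""
    if tone not in ("shang", "qu", "ru"):
        raise ValueError(f"tone not covered by this sketch: {tone}")
    if initial_class not in INITIAL_CLASSES:
        raise ValueError(f"unknown class of initial: {initial_class}")
    # Shang syllables with muddy initials had merged with the qu tone.
    if tone == "shang" and initial_class == "muddy":
        tone = "qu"
    # Register split based on voicing of the initial.
    pitch = "H" if initial_class in VOICELESS else "L"
    # Voice quality no longer distinguishes syllables within these tones:
    # qu is breathy throughout, shang and ru have clear voice.
    voice = "breathy" if tone == "qu" else "clear"
    # Outside the ping tone, light and heavy now refer to tone height.
    label = "light" if pitch == "H" else "heavy"
    return (tone, pitch, voice, label)
```

For example, a shang syllable with a second muddy initial comes out as `("shang", "L", "clear", "heavy")`: heavy in the newer sense of low pitch, although it would have counted as light in the traditional sense of voice quality.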
Later Japanese tone theories often indulge in theoretical speculation, but there
can be hardly any doubt that what Isei and Chisō were trying to describe, was the
tone system of the spoken language of Chang’an. Attempts to reconstruct the
distinctions between the many phonological categories mentioned by Isei and Chisō
purely in terms of tone height, result in tone systems in which the differences in
pitch and contour are so subtle, that it is hard to believe they could have been
distinctive. By contrast, Pulleyblank’s reconstruction of a [H] and a [L] register that
was superimposed on a pre-existing voice quality distinction – which continued to
exist independently – clarifies the phonological distinctions considerably. In his
reconstruction they look like a set of distinctions that could have formed part of a
natural language.
19 Although Annen does not mention a three-way split based on the three classes of initials in the
ru tone, the three-way split must have spread to the ru tone eventually, in order to explain the
later development to Early Mandarin (Pulleyblank 1978:185, 194-195). This happened most
likely at the time when the syllable-final voiceless stops – which prevented the voiced
aspiration of the muddy initials from spreading through the vowel – disappeared.
6.4 Which of Annen’s tone systems represents
the LMC standard language?
An important issue is the question of which tone system among the four most likely
represents the tone system of the Late Middle Chinese standard language, the
language that later developed into Kan-on. What type of Chinese do the four tone
systems introduced by Annen most likely represent?
Annen’s two later tone traditions are unequalled both in Japan and in China in
the sense that we know almost exactly which dialect (Chang’an) from which period
(±840-880) is being described. In addition we are fortunate enough to stumble across
two detailed (but not always easy to interpret) descriptions of how the yin-yang tone
split was transforming the tone system of Late Middle Chinese. It is clear however,
that these later two descriptions are too late and too modern to represent the Late
Middle Chinese that developed into the Japanese Kan-on character reading tradition.
It is true that Kan-on was taking shape around the same time, but it developed on
the basis of the standard reading tradition that had been taught at the official
government school since the late 7th century, not on descriptions of the spoken
language of Chang’an by 9th century students returning from China.
Jin’s tone system, in which the heavy shang tone had not yet merged with the qu
tone, cannot have represented the dialect that formed the main basis of the Kan-on
reading tradition either. We also see how line 31 indicates that the nu-sounds in Jin’s
tradition were pronounced as pure nasals, which was a characteristic of the old
Qieyun standard language (and Wa-on/Tsushima-on), but not of Late Middle
Chinese.
The Korean background that is sometimes attributed to Jin would seem to point
to Wa-on/Tsushima-on, but the idea that Jin was a Korean appears to be a late
development. Most scholars agree that the tones of Wa-on/Tsushima-on were not
introduced in a systematic way, which makes it unlikely that Jin’s tone system
represents the tone system of these older multistratified character reading traditions.
It is, on the other hand, not certain that Jin was transmitting standard Early Middle
Chinese either.
Some of the teachers of Late Middle Chinese who first came to Japan, such as
Jin, may have preserved the pure nasal initial as they were after all teaching a
standard reading pronunciation that could be a little archaic. This would explain why
there are many Sino-Japanese readings that combine a nasal (and therefore Go-on
type) initial with a Kan-on type vowel.
Komatsu (1971:463) has pointed out that the heavy shang tone characters among
the Sei-on in the Tosho-ryō-bon of Ruiju myōgi-shō 図書寮本類聚名義抄 are
marked with the shang tone dot, and not with the qu tone dot. According to Komatsu,
this may mean that the compilers respected the tone categories of the rhyme books,
but that in practice, heavy shang tone characters were nevertheless pronounced as qu.
It may also mean however, that in the transmission of the character reading
traditions to Japan, the line between Early Middle Chinese and Late Middle Chinese
is not always easy to draw.20
There is evidence that at least up to the end of the 7th century a somewhat
evolved form of Early Middle Chinese, and not the Chang’an dialect, remained
dominant at the Tang court. It may have been this type of Chinese that Jin was
transmitting, rather than standard Early Middle Chinese. It is after all mentioned that
his heavy shang tone starts heavy (with voiced aspiration), while voiced aspiration is
not thought to have formed part of the Early Middle Chinese muddy initials.
I find it unlikely that someone writing in the 9th century would have failed to
mention the tone tradition of the official new foreign Chinese standard language that
had been actively promoted by the government. (The edict stipulating that from now
on the new foreign Han pronunciation was the correct way of reading the characters
dates from as early as 793.) I therefore think that Biao’s tone system represents the tone
system of Late Middle Chinese, the foreign Chinese standard language of Chang’an
of the 7th and 8th centuries.
I see the fact that Annen starts with Biao’s tone description and uses it as the
basis for his description of Jin’s tone system as support for this idea. It does not
seem to be the case that Biao’s tradition is mentioned first because Biao was the first
to arrive. (At least Annen does not say so, and it would seem unlikely – although not
impossible – as Jin’s tone system is more archaic.) The later habit of defining the
Go-on tones on the basis of the Kan-on tones was most likely inspired by Annen’s
example in Shittan-zō, as Jin’s tone system later came to be identified with Go-on.
We also see that the tonal categories mentioned later for standard Kan-on by
Shinren 心蓮 (in 1180) in his comparison of the Go-on and Kan-on tones (cf.
section 4.1) agree with the five tones mentioned by Biao, namely light ping, heavy
ping, shang, qu and ru.
The tones of Wa-on/Tsushima-on differed from those of the later Kan-on, but
Annen does not mention tonal differences between the traditions of Biao and Jin.
Wa-on/Tsushima-on had been based on the old Qieyun standard, just like Early
Middle Chinese, so the fact that Annen does not mention differences in pitch
between the tone systems of Biao and Jin could be seen as another argument against
the idea that Jin’s tone system represented the old Qieyun standard. However, the
20 As Okumura (1993:60) writes: “What are referred to as Go-on materials or Kan-on materials
are often actually a mixture of both. For example, the Wa-on 和音 readings of the Ruiju myōgi-
shō (including the ‘Shin iwaku 真云’ readings found in the Bureau of Books and Drawings
manuscript 図書寮本), which is frequently used as a source of Go-on readings, contain
examples such as “嫌 Wa-on: nai, dei,” while in the case of the Hoke-kyō tanji 法華経単字
(Single Characters of the Lotus Sutra), the initial section gives Kan-on forms, but the tone
marks (especially those in red) are frequently what would appear to be Go-on tone marks. By
way of contrast, in the Daigo-ji manuscript of the Yu-hsien-k’u 遊仙窟 (Disporting in the Cave
of the Immortals), which is generally regarded as a source of Kan-on readings, some Go-on
forms are also to be found.” Okumura also mentions that “in some texts one will occasionally
find that the pronunciation of a certain character is given in the Go-on form while the tone
mark indicates the Kan-on form.”
difference between the Wa-on/Tsushima-on tones and the Kan-on tones does not
have to go back to a fundamental difference between the pitches of Early Middle
Chinese and Late Middle Chinese for two reasons.
First of all, although Wa-on/Tsushima-on must have been closer to Early Middle
Chinese, it is not at all certain that they were quite the same. The introduction of
Wa-on/Tsushima-on was indirect and unsystematic, and influence of the prosodic
system of the language of Paekche on Wa-on/Tsushima-on cannot be ruled out. This
could have given rise to the occurrence of considerable tonal differences between
the Early Middle Chinese standard language in China and the older Sino-Japanese
character reading tradition in Japan.
Secondly, as will be discussed in section 11.1.1, the tonal contrast between the
Go-on and the Kan-on character reading traditions in Japan may have developed
from real or imagined differences in vowel length between Early Middle Chinese
and Late Middle Chinese, rather than from differences in pitch.
6.5 Remaining problems
In the preface to Shittan-zō Annen added some remarks about the four tone
traditions that are very hard to interpret. About the first two tone traditions that came
to Japan, Annen commented:
或無上去之軽重 Sometimes there is no distinction between heavy and light
in the shang and qu tones
或無平去之軽重 Sometimes there is no distinction between heavy and light
in the ping and qu tones
The first remark fits Biao’s description without too many problems: The heavy
shang tone had merged with the qu tone, which made the shang tone all light. The
qu tone was all heavy, as all syllables ended in -ɦ, and the breathy voice quality had
spread backwards through the syllable. Interestingly enough, the ru tone is not
mentioned here, even though no heavy/light distinction is mentioned for the ru tone
in Biao’s description.21
The second remark must refer to Jin’s tone system. It is thought that voiced
aspiration did not yet form part of the Early Middle Chinese voiced initials, and if
Jin transmitted the old standard language it could be said that all syllables in his tone
system had a clear voice and were thus light. The reference to Jin’s light ping tone in
line 18, on the other hand, does suggest that there was also a heavy ping tone in Jin’s
tone system. Furthermore, a heavy shang tone is explicitly mentioned in Jin’s tone
description, which also states that this tone started with voiced aspiration.
21 Later Japanese scholars tried to bring the descriptions of Biao and Isei and Chisō in the greatest
possible agreement with each other, and Annen’s failure to mention the ru tone here gave them
reason to include a heavy/light distinction in the ru tone in Kan-on. (See section 10.2.)
If the muddy initials were voiced aspirated in the shang tone (where voiced
aspiration could have been inhibited by the final glottal stop), it is even more likely
that they were voiced aspirated in the unchecked ping tone.
Perhaps Jin was transmitting a form of Early Middle Chinese that had already
been influenced by the Chang’an dialect. His muddy initials may have been voiced
aspirated, but the voiced aspiration may not yet have spread through the vowel,
giving the whole syllable a breathy voice quality. If so, all syllables in Jin’s tone
system would have been regarded as light by Annen, as they had a clear voice. The
reason why a ‘heavy’ shang tone is nevertheless mentioned by Annen in Jin’s
system (which seems contradictory) would then be that it was necessary to
mention that there was no merger with the qu tone in the case of shang tone syllables
with heavy initials. Again, there is no mention of the ru tone, even though it is
unlikely that Jin’s archaic tone system included a heavy/light distinction in this tone.
About the two new tone descriptions Annen commented:
或上去軽重稍近 Sometimes the light and heavy of the shang and qu tones
are somewhat similar22
或平上平去相渉 Sometimes the ping and the shang tone, and the ping and
the qu tone go over into each other
These last two comments seem to refer to a resemblance of the tones to each other
within a tone system. It is hard to say which of the remarks refers to which tone
system. One would expect the first remark to refer to Isei’s tone system and the
second remark to Chisō’s tone system, but their tone systems do not contain features
that make it possible to link either one of these remarks positively to one of the
remaining two tone systems.
The first remark may have a connection with something mentioned by
Pulleyblank (1978:186): In Late Middle Chinese ci 辞 and other kinds of poetry,
which followed the norms of current speech rather than those of the rhyme book
tradition, the shang and qu tones inter-rhymed freely in contrast to the ping tone.
(This is quite independent of the shift of heavy shang tone to qu tone.) Evidently the
final closures of the two oblique tones had changed in some way, which allowed
them to rhyme together even though they remained phonologically distinct. This
observation may in turn be connected with the reversal in the tone values of the
Chinese qu and shang tones between early Chinese loanwords in Vietnamese and
the later Sino-Vietnamese:
If in late 9th century Late Middle Chinese the final glottal stop of the shang tone
was starting to weaken to -h, and the final voiced -ɦ of the qu tone was starting to
disappear, it would have been very hard to distinguish these two tones from Annen’s
traditional point of view.
22 This sentence could perhaps be read as: “Sometimes, the shang/qu distinction has some
resemblance to the light/heavy distinction” meaning that shang tone syllables had clear voice,
while qu tone syllables had breathy voice.
Line 20 of Isei’s description mentions a shang tone that has lost its ‘abrupt’
pronunciation, which could refer to a weakening of the glottal stop. Line 29 of
Chisō’s description mentions a shang tone that resembles a ping tone in Isei’s tone
system, which could refer to the same, while line 32 of Chisō’s description mentions
a qu tone that resembles a shang tone in Isei’s tone system.
The Late Middle Chinese shang and qu tones could now have been reflected in
Sino-Vietnamese as shown in (5).
5 The tonal category of the LMC shang and qu tones in Sino-Vietnamese
       LMC               Vietnamese
shang  CVʔ > CVh    →    CVɦ  (= hỏi/ngã, the earlier equivalents
                              of the yin qu and yang qu tone)
qu     CVɦ > CV     →    CVʔ  (= sắc/nặng, the earlier equivalents
                              of the yin shang and yang shang tone)
Annen’s last remark only makes sense if we assume that he is talking about two
different ping tones, a light ping tone and a heavy ping tone here, as otherwise we
would have to assume that ping, shang and qu all “went over into each other”. In the
two later tone traditions, which had gone through the tonal split, the light ping tone
may have had a falling tone contour, which (apart from the difference in voice
quality) would have made it resemble the qu tone, assuming that the qu tone was
falling.
The heavy ping tone in this tradition may have had a rising tone contour, which
(apart from the difference in voice quality) would have made it resemble the shang
tone, assuming that the shang tone was rising.
An alternative explanation would be that the light ping tone resembled the shang
tone, because they both had clear voice, and that the heavy ping tone resembled the
qu tone because they both had breathy voice.
Again, these two possible explanations work best if we assume that the final
closures of the shang and qu tones had started to weaken or disappear.
7 Later Japanese tone theories
The descriptions of the Chinese tones included in the work of the Tendai monk
Annen 安然 are the earliest in Japan, and they are especially interesting as they date
from a time when contact with on-hakase from China (the first two tone traditions)
and the spoken language of Chang’an (the last two tone traditions) was still fresh. In
this sense they are unique, as all other descriptions stem from a much later time,
when Chinese on-hakase were no longer around and direct contact with China had
long ceased.
From the title of Annen’s work it is clear that Annen belonged to the so-called
Shittan gakusha 悉曇学者, or Siddham scholars, and the later Japanese tone
descriptions that will be introduced in this chapter stem from the same circles.
The Siddham scholars were Shingon or Tendai monks who concerned
themselves with the correct pronunciation of the Chinese characters that had been
used to transcribe the Sanskrit mantras and dhāraṇī. (This had happened in Tang
China from the 6th to the 9th century.) A correct pronunciation of the magical
formulae was of the utmost importance to adherents of these esoteric schools, and
what they were trying to do was to give a meticulous description of the most correct
way in which to pronounce the Chinese characters. A tremendous amount of
attention was given to the Chinese tones, and generations of scholars appear to have
been completely absorbed by tone theories that became ever more theoretical and
complicated.
On the one hand, these later tone systems are less valuable than Annen’s work,
as we cannot regard them as reliable sources on the tone system of Late Middle
Chinese. In another respect though, they are more valuable, as they – unlike Annen’s
descriptions – are contemporary with the production of the tone dot material on
which our knowledge of the Middle Japanese tone system is based, and originate
from the same circles. They can provide us with valuable information on how the
Chinese tones came to be viewed in Japan after Annen’s time. The tonal value of the
dots that were used to mark the tones of Middle Japanese is after all based on these
later theories and not directly on the tones of Late Middle Chinese.
I will introduce a number of tone theories from the 11th to the 14th century. I
concentrate in this chapter on the translation of the texts and the clarification of
passages that are especially obscure. For this purpose I have found it indispensable
to start with some background information on the different types of tone dot
markings in Japan, and on Myōgaku, the most influential Siddham scholar after
Annen.
An analysis of the tone systems described in these works, their interrelationship
and the way in which they relate to the tone system of Japanese will be presented
separately in chapters 8 and 9.
7.1 Overview of the kinds of tone dots used in Japan
Many different systems of tone dot markings have been in use among different
groups in Japan. Some systems included as many as eight, twelve, or even sixteen
tone dots. It is questionable whether in such systems the dots all expressed
differences in pitch. The four basic tones were no doubt distinguished, but in some
of the earlier materials, the heavy and light subtones may have functioned to
distinguish the different types of Chinese initials from each other.
7.1.1 Tone systems in which not all distinctions
may have been based on pitch
In Wamyō ruiju-shō 和名類聚抄 (936) and the Mōgyū 蒙求 manuscript from the
second half of the 10th century mentioned by Komatsu (see section 3.6), eight tone
dots are used, but concrete descriptions of the pitches of the tones in such early
periods are lacking. The dots were arranged around the character in the following
manner:
上軽 去軽
上重 去重
字
平軽 入軽
平重 入重
Figure 1: The eight-tone marking system
As we have seen, Komatsu argued that this type of eight-tone distinction was
maintained in order to clarify the regular relationship between the initials in the Go-
on and Kan-on pronunciations and did not express differences in pitch.
Numoto’s study of the early Heian period Kujaku-kyō 孔雀経 manuscript
(Numoto, 1974) also suggests that in these early materials the tone dots functioned
to distinguish the different types of initials from each other and did not indicate
differences in pitch: The Kujaku-kyō manuscript distinguishes no less than sixteen
tone dots. Each of the four tones has four different locations of the tone dots,
depending on whether the character had a clear, second clear, muddy or second
muddy initial.
The systems mentioned above date from before the period when the subtones
started to be clearly defined in terms of pitch. From a much later period, however,
there is a twelve-tone system, namely in Shinkū’s 心空 Hoke-kyō ongi 法華経音義
(1386). This work includes a tone dot chart in which all of the four basic tones have
light, heavy and not-light-not-heavy (非軽非重) variants, which seems to be based on
Annen’s mentioning of the category heavy of the light in the ping tone in the tone
systems of Isei 惟正 and Chisō 智聡 (except that this extra distinction is now
applied to all of the four tones).
In this case too, however, it is questionable whether all these distinctions
represented differences in pitch, as the tone theories in Shinkū’s time were in
considerable disarray. (See also section 7.3.3.3.)
7.1.2 Tone systems in which the distinctions were based on pitch
Tone dot systems of which we can be fairly certain that the dots truly marked tonal
distinctions are the following: The earliest Kan-on tone system introduced in Japan
is thought to have been a four-tone system. Although a four-tone system can still be
found much later, in such cases it is normally only used in Go-on material.1
上 去
字
平 入
Figure 2: The four-tone marking system
In Hoke-kyō shakumon 法華経釈文 (976) by Chūzan 仲算 (Hossō school) and in a
tone dot chart in Konkōmyō saishōō-kyō ongi 金光明最勝王経音義 (1079, also of
the Hossō school) we find tone dot systems that distinguish light and heavy variants
of the ping and ru tones, bringing the number of tones to six. In the same period
however, a system with five tone dots (including light and heavy variants for the
ping tone but not yet for the ru tone) also appears to have been in use. See for
instance the five-tone system that was still used for the Sei-on in the Tosho-ryō-bon
of Ruiju myōgi-shō 図書寮本類聚名義抄 (which is also thought to have originated
from the Hossō school), and the five tones mentioned by Shinren 心蓮 (Shingon
school) in Shittan kuden 悉曇口伝 (1180).
1 In late Go-on material such as Bumō-ki and many other works, heavy and light variants of the
ping and ru tones are occasionally used, which is no doubt the result of Kan-on influence
(many supposedly Go-on forms are actually Kan-on or a mixture of both) and possibly also
born from the desire to obtain a regular correspondence with Kan-on.
As the five-tone system in the Tosho-ryō-bon of Ruiju myōgi-shō was also used
to mark the tones of Japanese (including the light ping tone dot), the distinctions
were clearly tonal, but concrete descriptions of the actual tonal value of the tones
from this period are lacking.2
Concrete descriptions of the tones in terms of pitch – including the subtones –
start with the Tendai monk Myōgaku 明覚 (1056-1122). (As we have seen, the tone
descriptions in Shittan-zō 悉曇蔵 contain precious little information on the actual
pitches of the tones, and in some of the systems included in Shittan-zō the subtones
were most likely differentiated from each other in the first place by voice quality,
and not by pitch.)
Myōgaku set out to reconstruct the eight-tone system of Isei and Chisō
mentioned in Shittan-zō. Although his reconstruction of the eight tones is presented
as his own personal opinion (shian 私案), and was not based on oral tradition, he
became the most influential figure among the Japanese Siddham scholars after
Annen. (All distinctions in Myōgaku’s tone system are clearly tonal.)
His tone system also includes a light/heavy distinction in the ru tone, and we see
that soon after, all Kan-on tone systems, both in the Shingon school and in the
Tendai school, include a heavy/light distinction in the ru tone as well as the ping
tone, so that they all have at least six tones.
The six-tone system came to be regarded as typical of the Shingon school, and
includes a light ping tone and a light ru tone, but no light or heavy variants of the
other tones. (The slightly elevated light ping and light ru tone dots seem to be a
Japanese invention, as these tone dots cannot be found in Chinese or Korean
material.) The tone dots could be arranged around the character in several ways:
2 As isolated character readings usually do not develop into loanwords, the best way to prove that
the tone dots expressed real tonal differences is to look at the modern dialect reflexes of native
Japanese words that were marked with light and heavy tone dots. There are only two examples
of ru-tone dots added to Japanese words in the Tosho-ryō-bon of Ruiju myōgi-shō (ノ. トル
nottoru ‘to follow, imitate’ and ウ. タウ uttau ‘to appeal, complain of’). In the Kanchi-in-bon
観智院本 they are marked as ノトル 上平上 and ウタウ 平平 x, so it seems that in the
Tosho-ryō-bon the wish to express a closed syllable overruled tonal considerations. In the case of
the light and heavy ping tone dots, however, we see that the modern reflexes of disyllabic
Japanese words marked 平平 (class 2.3) still differ from those marked 平東 (class 2.5).
The fact that the light/heavy distinction in the tone dots of Ruiju myōgi-shō was based on a
difference in tone may also explain why the Sei-on in the Tosho-ryō-bon of Ruiju myōgi-shō
only show a light and heavy distinction in the ping tone, even though they are based on those of
Wamyō ruiju-shō. The Sei-on in the tone dot manuscript of Wamyō ruiju-shō itself had a
distinction between light and heavy not only in the ping and ru tones but also in the shang and
qu tones. As mentioned, these dots were most likely used to distinguish the different categories
of Chinese initials from each other. In the Tosho-ryō-bon of Ruiju myōgi-shō, in which the tone
dots were only used to express real differences in pitch, these extra dots were therefore
discarded.
上      去              上      去
    字              平軽    字    入軽
平軽    入軽
平重    入重          平重          入重
Figure 3: The six-tone marking systems
This is the tone system that was used in Ruiju myōgi-shō as well as many other
works, to indicate the tones of Japanese. An interesting aspect of the Shingon six-
tone system is that even though it includes only six tones, it is in fact based on an
underlying eight-tone system: The heavy shang tone in this tone system has merged
with the qu tone, a merger that – as we have seen – goes back to a real historical
development in Chinese. From the beginning of the 12th century on however, we
find descriptions from the Shingon school which mention that not only has the heavy
shang tone gone over to the qu tone, but also that the light qu tone has gone over to
the shang tone. This is a merger that can only be found in Japan. (The background of
this merger will be discussed in section 8.1.2.) As a result, the shang tone includes
only characters starting with clear, second clear and second muddy initials while the
qu tone only includes characters with muddy initials.
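
The two mergers just described can be summarised as a small lookup. The following Python sketch is illustrative only: the function name and the English labels are hypothetical, and it assumes that in the shang and qu tones "heavy" corresponds to a muddy initial and "light" to any other initial type, which is the reading that yields the outcome stated above (shang: clear, second clear and second muddy initials; qu: muddy initials only).

```python
# Initial types as named in the text; the English labels are illustrative.
INITIAL_TYPES = ("clear", "second clear", "muddy", "second muddy")


def shingon_shang_qu_dot(chinese_tone, initial):
    """Return the tone dot ('shang' or 'qu') for an original shang or qu character.

    Assumption (not a quotation from any source): in these two tones,
    'heavy' means a muddy initial and 'light' means any other initial type.
    """
    if chinese_tone not in ("shang", "qu"):
        raise ValueError("only the shang and qu tones are covered here")
    if initial not in INITIAL_TYPES:
        raise ValueError("unknown initial type: " + initial)

    subtone = "heavy" if initial == "muddy" else "light"

    # Merger 1 (historical Chinese development): heavy shang goes over to qu.
    if chinese_tone == "shang" and subtone == "heavy":
        return "qu"
    # Merger 2 (found only in Japan): light qu goes over to shang.
    if chinese_tone == "qu" and subtone == "light":
        return "shang"
    # Light shang and heavy qu keep their own dot.
    return chinese_tone


# List the outcome for every combination of tone and initial type.
for tone in ("shang", "qu"):
    for initial in INITIAL_TYPES:
        print(tone, initial, "->", shingon_shang_qu_dot(tone, initial))
```

Under these assumptions the original tone category no longer affects the result: once both mergers apply, the choice between the shang dot and the qu dot depends on the initial alone.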
The eight-tone system is considered typical of the Tendai school. In the Tendai
school there are two types of eight-tone systems. In the first eight-tone system, all
four Chinese tones have a light and a heavy variant. This is the tone system that was
reconstructed by Myōgaku on the basis of a thorough study of Annen’s text,
especially of the last two tone traditions quoted by Annen.
Although Myōgaku’s ideas were very influential, his eight-tone system was
never truly adopted, as – before long – a very different type of eight-tone system
came to be used in his own Tendai school. This tone system developed out of the
first, but did not truly distinguish eight different tones.
7.1.3 The quasi eight-tone system of the Tendai school
This later type of eight-tone system, which became widely used, and is truly
representative of the Tendai school, is a tone system that includes the so-called fu-
nisshō-ten (フ入声点 or 不入声点) and the bifura-ten 毘富羅点.
The fu-nisshō-ten (the ‘fu ru tone mark’ or ‘non-ru tone mark’) is a mark below
the character between the ping and the ru tone. As the original final -p of Chinese
shifted from Japanese -fu to -u, what had originally been disyllabic character
readings had now become syllables ending in long vowels, and the normal ru tone
mark was no longer thought appropriate. In the Tendai school (and also in the Shingi
Shingon school) therefore, this special mark was used.3
The bifura-ten has to do with the complications involving the heavy and light
subtones of the shang and the qu tones. It is a mark in the middle above the
character, between the marks for the shang and the qu tone.
The word bifura comes from the Sanskrit vipūra meaning ‘broad’, indicating that
this mark had a flexibility that made it possible to pronounce it as either shang or qu.
(This name was only given to the mark in the Muromachi period. Originally it was
called the jōkyo nin’i-ten 上去任意点 ‘shang-qu optional mark’ or chōjō-ten 頂上
点 ‘top mark’.)
The first mention of the bifura and the fu-nisshō marks is in the Kujō-ke-bon 九
条家本 of Hoke-kyō on 法華経音.4 About the bifura mark it says: 上去両声任意
“both the shang tone and the qu tone are optional” and about the fu-nisshō mark it
says: 本入声ナルヲ平声呼ブ “pronounce what is originally a ru tone as a ping
tone” (Konishi, 1948: 478).
上 毘富羅 去
平軽 字 入軽
平 フ入 入
Figure 4: The quasi eight-tone marking system of the Tendai school
In the work Dokkyō kuden myōkyō-shū 読経口伝明鏡集 5 (1284) by the Tendai
scholar Nōyo 能誉 the bifura-ten – which is still called the chōjō-ten 頂上点 or ‘top
mark’ in this work – is described as in (1).
1 Dokkyō kuden myōkyō-shū on the bifura-ten
本声雖レ為二去声字一、 When an original qu tone
3 In the Shingi Shingon school the fu-nisshō mark was later adopted, but not the bifura mark
(Konishi 1948: 514).
4 Hoke-kyō on is often thought to have been edited by Myōgaku himself. The exact date of the
Kujō-bon of Hoke-kyō on, in which these tones are mentioned for the first time, is not known,
but Konishi places it somewhere between the reigns of emperor Toba (1107-1123) and emperor
Takakura (1168-1180), which means that Konishi places the development of the bifura-ten and
the fu-nisshō-ten somewhere between 1107 and 1180, which may still have been during
Myōgaku’s lifetime (Myōgaku died in 1122).
5 This is one of only a few Tendai works on the fanqie method from this period, when the
Shingon school was much more active in this field. The work does not contain concrete
descriptions of the tones.
被レ引二上字ニ一被レ読二成上声一 under the influence of the first character
(of the fanqie) is read as a shang tone
之時ノ注指之点也。 this dot is used as a mark.
Modern comments are not very clear on how the bifura-ten actually worked.
Kindaichi’s description in Kokugo-gaku dai-jiten (1980:547) is as follows:
In the eight-tone theory, which was mainly transmitted in the Tendai school,
at first heavy and light subtones for all the four tones were acknowledged, but
by the end of the Heian period the heavy shang tone got mixed up with the qu
tone, and the light qu tone got mixed up with the shang tone so that they both
disappeared. Instead of the original eight-tone system, an eight-tone system
that included the bifura-ten and the fu-nisshō-ten came in use.
This description does not tell us which type of characters would be marked with
the bifura-ten, but the suggestion is that it was the heavy shang and the light qu
characters, meaning that the ordinary shang tone dot in the Tendai school represents
light shang and the ordinary qu tone dot represents heavy qu, just as in the Shingon
school. According to Iida (1953:67) on the other hand, the bifura-ten is used for a
“light sound derived from the shang and the qu tones, used for characters that have a
broken-off pronunciation”. This suggests that the bifura-ten represents the light qu
and light shang tones, meaning that the ordinary shang and qu tone dots in the
Tendai school indicate heavy shang and heavy qu.
Finally, according to Ōyama Kōjun (1989:367), an authority on modern Tendai
shōmyō practice, what is marked with the bifura-ten in the Tendai school is usually
marked as shang in the Shingon school, suggesting that only light qu characters
(which would after all have merged with the shang tone in the Shingon school) were
marked with the bifura-ten. Heavy shang tone characters would presumably have
merged with the qu tone, and would have been marked with a qu tone dot just as in
the Shingon school.
When I tried to confirm this however, it turned out that Ōyama’s examples were
not light qu tone characters at all, at least not in Kan-on. Of the three example
characters, two had a ping tone in Middle Chinese (諸 and 阿), while one (唯) had a
ping tone or a shang tone in Middle Chinese. It appears therefore, that merger rules
that developed in the Kan-on tone theories in Japan were later applied to Go-on as
well.
In any case, from the description in Dokkyō kuden myōkyō-shū it appears that
Ōyama’s description is correct and that the bifura-ten represents light qu tone
characters in the Tendai eight-tone system. In the Tendai eight-tone system therefore,
the ‘normal’ qu tone represents heavy qu, and the ‘normal’ shang tone represents
light shang. As Konishi already pointed out (1948:493), this tone system is in fact
very close to the Shingon six-tone system.
7.2 Myōgaku and the state of Siddham study in Myōgaku’s time
The most important figure in the development of the later Japanese tone theories is
no doubt the Tendai scholar Myōgaku 明覚 (1056-1122). I will therefore provide
some background information on Myōgaku and the state of Siddham study when he
developed his theories.
Initially it had been mainly monks belonging to the Tendai school such as Annen
that undertook the Sino-Japanese studies required for Siddham and shōmyō practice,
but the study of Sino-Japanese also started to flourish in the Shingon school, and
studies by Shingon monks such as Kanchi 寛智 (1046-1111) and Shinren 心蓮 (?-
1181) were especially outstanding.
In the mid Heian period however, the Siddham studies that had achieved so
much success through Annen were in serious decay. Especially in the Tendai school
there appears to have been almost no one to continue Annen’s studies. Only in the
Shingon school at the Ninna-ji 仁和寺 were Siddham studies continued. Many
works from this time remain, but unfortunately, these works do not contain concrete
descriptions of the tones (Mabuchi, 1963:402).
It is in these circumstances that the Tendai monk Myōgaku made his appearance
to promote Japanese Siddham studies and to leave a decisive influence on the
generations after him. He did not receive oral instruction from a teacher, and his
works are the result of self-study. From his own preface in Han’on sahō 反音作法,
it appears that it was not only he who lacked oral instruction in the traditional
scholarship, but that the tradition of phonological study introduced by Annen some
200 years earlier had in fact ceased to exist. There may of course be an element of
self-promotion in these statements, but Mabuchi regards the fact that all theoretical
works on the fanqie method (called hansetsu in Japan) after Myōgaku followed his
method, as an indication that the study of fanqie and Siddham had indeed been in
serious decay when Myōgaku introduced his new theories.
As a result, his works show a thorough dependence on earlier written sources,
which he interpreted in his own way. He used the extensive library of the Tendai
monastery on Hiei-zan near Kyōto for his research. Mabuchi (1962: 426-428) shows
a list of hundreds of titles of books that Myōgaku quotes, showing the extent of
Myōgaku’s study and his reliance on written sources. It may be precisely because
the oral tradition had been interrupted that Myōgaku is the first Siddham scholar
after Annen to attempt to describe the pitches of the Chinese tones explicitly, and to
mold them into a logical system.6
6 The disarray in the tones mentioned by Myōgaku, but as we shall see also by a contemporary
like Fujiwara Munetada 藤原宗忠, may have been the direct result of the introduction of the
yin/yang tone split by the Shingon and Tendai monks who had studied in China. This upset a
simpler tone system that had already been introduced centuries earlier. The confusion is after
all always associated with the distinction between the heavy and the light subtones. The tones
had indeed “become quite strange” as Annen put it.
The important point is that Myōgaku’s work was extremely influential in the
same period, and in the same circles that produced most of the extant tone dot
material. It is this material on which our knowledge of the tonal distinctions of
Middle Japanese is based. Myōgaku’s tone descriptions, and the tone descriptions by
others that were influenced by Myōgaku’s work, are therefore important sources as
to the kind of theories that formed the background of the marking system used in
works like Ruiju myōgi-shō.
7.3 The descriptions of the tones
I will introduce the descriptions and – as far as possible – limit myself to the issue of
their translation. In the case of certain passages, however, a broader discussion was
unavoidable. The tone descriptions from different periods have been divided
according to the traditional classification of Japanese historical periods. I have used
the works of Konishi (1948), Kindaichi (1951), Wenck (1953, 1957) and Mabuchi
(1962, 1963).
7.3.1 Heian period (794-1185)
7.3.1.1 Chūzan 仲算 (Hossō school)
The following text in the Daigo-ji Sanbō-in-bon 醍醐寺三宝院本 of Hoke-kyō
shakumon 法華経釈文 (976) by the Hossō monk Chūzan 仲算 (935-976) forms the
most elaborate description of the tones that can be found from the period after
Annen and before Myōgaku, although even here, concrete descriptions of the tonal
value of the tones are lacking (Mabuchi 1996:307, 313, 1963:1067). The ‘first’ and
the ‘second’ character refer to the first and second character of the fanqie.
2 Hoke-kyō shakumon on the tones and the difference between heavy and light
平上去入依下字 Ping, shang, qu, ru depend on the second character
軽重清濁依上字 Light, heavy, clear, muddy depend on the first character
濁平声字軽重 Ping tone characters with muddy initials are light-heavy
濁上入声字重軽 Shang and ru tone characters with muddy initials
are heavy-light
従本清字但随出来 Characters with clear initials however,
are of course pronounced as they come out.
上声字重短軽長 Heavy shang tone characters are short
while light ones are long
去声字重長軽短 heavy qu tone characters are long
while light ones are short
After this, in red ink, there is the comparison of the tones of Kan-on and Tsushima-
on that has been quoted in section 4.1.
Mabuchi has pointed out that in the 7th century the Hossō school contained a
large group of monks who had all studied in Chang’an for 10 to 20 years, so that the
phonological tradition of the Hossō school must have started out very strong
(Mabuchi 1996:250). 7 I had hoped that the tone description above might be a
remnant of this tradition (although three centuries is a long time), but the reality is
that this tone description appears to be strongly based on Annen’s text in Shittan-zō.
The first indication can be found in the way in which the tone dots have been
added to the characters in Hoke-kyō shakumon. Mabuchi (1996:309) shows that the
division into light and heavy in the ping and ru tones is as follows: in the ping tone
second muddy as well as muddy characters belong to the heavy category, while in the
ru tone only muddy characters belong to the heavy category.8 (The reason why such
a division is unnatural and most likely goes back to Annen’s text will be explained
in section 8.1.1.)
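
The asymmetry between the two tones is easy to overlook, so it may help to write the division out explicitly. The Python sketch below only encodes the division reported above from Mabuchi (1996:309); the function name and the English labels are hypothetical, and treating every initial type outside the heavy set as light is an assumption made for the illustration.

```python
# Initial types that take the heavy dot in each tone, following the division
# reported from Mabuchi (1996:309). Shang and qu are left out because the
# tone dots in this manuscript do not mark heavy and light in those tones.
HEAVY_INITIALS = {
    "ping": {"muddy", "second muddy"},
    "ru": {"muddy"},
}


def shakumon_subtone(tone, initial):
    """Return 'heavy' or 'light' for a ping or ru character.

    Assumption for this sketch: every initial type that is not listed
    as heavy for the given tone counts as light.
    """
    if tone not in HEAVY_INITIALS:
        raise ValueError("heavy/light dots are only distinguished for ping and ru")
    return "heavy" if initial in HEAVY_INITIALS[tone] else "light"


# Second muddy initials pattern differently in the two tones.
print(shakumon_subtone("ping", "second muddy"))
print(shakumon_subtone("ru", "second muddy"))
```

The two calls at the end show the point at issue: the same initial type falls in the heavy category in the ping tone but in the light category in the ru tone.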
What appears to have happened in this tone description is that certain statements
by Annen have been made symmetrical. The clearest case can be seen in the last two
lines, which appear to stem from Isei’s description of the qu tone in Annen’s text,
but then applied to the shang tone in the reverse:
Isei (line 21): “The qu tone has light and heavy. Heavy is long and light is short”.
(Chisō’s description (line 35) can easily be brought into agreement with this: “If it
stops directly it is light. If it rises slightly it is heavy.”) Chūzan apparently had no
idea what the yin/yang register split in Chinese really entailed, and grasped at the
only description that made sense to him, extending it (in the reverse) to the shang
tone.
The second case where I think Chūzan made certain statements by Annen
symmetrical is in the passage about light-heavy in the ping tone and heavy-light in
the shang and ru tones. Mabuchi admits that he is puzzled by the clear
contradictions contained in these statements, such as when muddy ping characters
are said to be light as well as heavy, while they should of course be simply heavy. I
can only make sense of Chūzan’s remarks if they go back to line 15 in Annen’s text.
In order to understand this we have to remember that – confusingly enough –
Chinese second muddy (jidaku 次濁) characters are often referred to as muddy (daku
濁) in Japan.9
7 The Hossō school 法相宗 was one of the eight Japanese Buddhist schools in the Heian period.
It was transmitted to Japan several times in the 7th and 8th century and was supported by the
powerful Fujiwara family.
8 Despite the fact that a light and heavy distinction in the shang and qu tones is mentioned by
Chūzan, the tone dots do not distinguish between heavy and light in these tones.
9 This is not only because in Kan-on the Chinese nasals developed into voiced stops, but
according to Mabuchi (1996:313) also because the rhyme tables, from which the term second
muddy or jidaku 次濁 stems, were not yet known in Chūzan’s time. (The terms qing/sei 清
clear and zhuo/daku 濁 muddy were already in use in Annen’s time (Iida (1955:68) quotes a
402 7 Later Japanese tone theories
In Annen’s text ping tone syllables that started with nu-sounds (which referred to
the nasals or to all the second muddy initials)10 were called 軽重. (Line 15: 平有軽
重軽亦軽重.) In a reversal similar to what we have seen above between the shang
and qu tones, the second muddy (jidaku 次濁) sounds in the shang and ru tones are
now called 重軽.
The remark that originally clear characters remain as they are (mama 随 ‘as it
is’), may mean no more than that characters with clear and second clear initials do
not involve such complications: Unlike the characters with second muddy initials,
they are always light, and do not join different categories (i.e. light or heavy)
depending on whether they are ping tone characters or shang/ru tone characters.
These remarks, whose formulation was based on Annen, are in my opinion an
obscure way of referring to the fact that in Kan-on, characters with second muddy
initials in the ping tone belong to the heavy register (light-heavy = heavy), while in
the shang and ru tones they belong to the light register (heavy-light = light).11
This feature can be traced back to Biao’s tone system in Annen’s text, while the
distinction of heavy and light variants of the shang and qu tones stems from the
descriptions by Isei and Chisō. Chūzan’s tone theory appears to be a pick-and-
choose of features derived from Shittan-zō. As far as I can see, the only thing that is
clear from this early tone description is that the phonological tradition in the time
just before Myōgaku had become quite obscure.
7.3.1.2 Myōgaku 明覚 (Tendai school)
Of Myōgaku’s earliest known work, Shittan daitei 悉曇大底 (1084), there exists a
handwritten copy of the year 1160. It contains hardly any information on the
Chinese tones.12 In order to understand Myōgaku’s later, and much more extensive
text in which Ennin 円仁 equates clear with light and muddy with heavy), but not the terms
ciqing/jisei 次清 and qing-zhuo/jidaku 清濁). Although the rhyme tables had already
developed in China in the middle of the 8th century, they only became known in Japan in the
mid 13th century. The first use of terminology from the rhyme tables can be found in the works
of the Shingon monk Shinpan 信範.
10 The closest term that Annen had been able to provide to indicate this group at a time when the
term qing-zhuo/jidaku 清濁 was not yet known had been the designation nu-sounds.
11 Because of the remark that originally clear characters remain as they are, Mabuchi on the other
hand, (albeit very tentatively) proposes the idea that these remarks may somehow refer to the
difference in Sino-Japanese between characters whose initial becomes voiced as the result of
sequential voicing (rendaku 連濁) when they appear as the second character of a character
compound, and characters that are already voiced in themselves (i.e. the difference between
hondaku 本濁 (‘originally voiced’) and shindaku 新濁 (‘newly voiced’) characters) (Mabuchi,
1996:313). The difference between voiceless and voiced initials as such is not marked in Hoke-
kyō shakumon. The first example of so-called daku-ten (two paired circles that functioned to
mark the tone as well as the fact that the initial of a character was voiced) being added to a
Japanese text only dates from around the year 1030. In connection with this explanation he
interprets the passage 従本清字但随出来 as “characters with muddy initials that were not
originally clear (本より清ではない濁), remain as they are”.
12 There is one short remark stating that the beginning of the heavy shang tone is as ping and the
writings on the tones, there are a number of issues that need to be explained. These
issues have to do with Myōgaku’s interpretation of the reason behind historical
changes in the transcription of Sanskrit.
From Annen’s text, Myōgaku concluded that the tone system in which the heavy
shang tone had merged with the qu tone (Biao’s system) was an older system, and
that in the newer eight-tone system of Isei and Chisō the heavy shang tone was still
kept separate.
It is clear that Myōgaku saw the later system, which did not include the merger
of heavy shang with qu, as superior and more advanced, as it is this system which he
set out to revive. And this is understandable, as it was the eight-tone system that was
closely connected to the period and the people that had founded the Tendai school in
Japan.
Myōgaku noticed that certain Sanskrit syllables (those with short vowels) were
transcribed by means of qu tone characters (or marked with the note ‘qu’) in the
works of Amoghavajra and Zhiguang 智広 from the second half of the 8th century,
whereas in the works of later scholars like Baoyue 宝月 and Shuei 宗叡 from the
first half of the 9th century, such syllables were transcribed by means of shang tone
characters (or marked with the note ‘shang’).13
Myōgaku explains Amoghavajra’s use of the qu tone as a result of the fact that
Amoghavajra still adhered to the older six-tone theory: In the six-tone theory after
all, the heavy shang tone had merged with the qu tone. Baoyue and Shuei on the
other hand, adhered to the more advanced eight-tone system, in which the heavy
shang tone was still kept separate, and in Myōgaku’s opinion, this explains their use
of the shang tone.
Finally, Myōgaku identifies the older theory (the six-tone theory) with the
Shingon school (Kūkai) and the more advanced theory (the eight-tone theory) with
ending as shang, and that the qu tone is similar to this (Mabuchi, 1962:407):
ト キハ ハ ハ ニ フ ヲ タルカ フ シカ ヵ
今云レ去者、上声之重 初 平後 上 呼 レ之 。音似 二去声一故云 レ 然歟
13 At first the practice was to use characters with the appropriate tones. In several Buddhist texts
from the 7th to the 9th centuries however, other ways were recommended to represent the
Sanskrit contrast in vowel length. In Shittan-zō for instance Annen quotes a rule originally
formulated by Yijing 義淨 (end of the 7th century) saying that characters used to transcribe
short vowels had to be read in the shang tone, irrespective of the original tone of the character
(Mei, 1976:90). In several Buddhist texts from the Late Middle Chinese period (from the early
8th century on), both members of a pair of syllables involving length contrast are represented by
the same character, with the length contrast indicated by some other means. Amoghavajra and
Zhiguang (both second half of the 8th century) for instance use the annotations 上
(Amoghavajra) or 上声短呼 (Zhiguang) for syllables with short vowels, and the annotations 引
去 (Amoghavajra) or 依声長呼 (Zhiguang) for syllables with long vowels. But in Xitanziji 悉
曇字記 Zhiguang still occasionally uses the annotation 去声 for Siddham syllables that are
described by him as ‘short’ 短. (Xitanziji was widely studied by Siddham scholars in Japan
because Annen had praised it as the best work on the Siddham script.)
his own Tendai school. This is because Baoyue and Shuei were associated with the
founders of Tendai in Japan.14
In other words, qu tone markings were associated by Myōgaku with an older six-
tone system to which the Shingon school adhered, while shang tone markings were
associated with a more advanced eight-tone system which he preferred for his own
Tendai school.15
Mabuchi (1962:429) accepts these theories, and believes that the differences in
transcription practice indeed go back to the use of an earlier system in which the
heavy shang tone had merged with the qu tone and a later system in which the heavy
shang tone was still kept separate. He also accepts the idea that in the conservative
religious circles in Japan each of the two theories was preserved by a different
school; the Tendai school still adheres to the eight-tone system while the Shingon
school adheres to the six-tone system.16
I do not think this is correct. I have not come across references to a change in
transcription practice as a result of the yin/yang (high/low) register split in the 9th
century. Such a change would also have been extremely unlikely: both in the Chinese
standard language before the yin/yang split and in the Chinese tone system
after the yin/yang split (the eight-tone system of 9th-century Chang’an), the heavy
shang tone had merged with the qu tone, and the difference between these two tone
systems can therefore not have been the reason behind the fluctuation between qu
tone markings and shang tone markings that puzzled Myōgaku.17
14 Ennin studied the pronunciation of Chinese and the Siddham script with both Shuei (in 838)
and Baoyue (in 840).
15 As we shall see in this chapter, the merger of heavy shang with qu in the six-tone system is a
problem for Myōgaku, as it goes against his method of analyzing the tones: Such a merger
would mean that the heavy or light quality of the initial can influence the domain of the tone.
We have seen in the first quotation from Chūzan that the idea that heavy and light and clear
and muddy depend on the first character of the fanqie while the tones depend on the second
character of the fanqie (an idea which goes back to Annen) had already been well established in
Japan. Apart from this, Sanskrit included syllables in which voiced aspirated consonants
(marked with the note ‘heavy’) were followed by short vowels (transcribed by means of
Chinese shang tone characters, or marked with a shang tone dot). In Han’on sahō, Myōgaku
also quotes the fact that “in Siddham, one combines characters with a heavy sound with a shang
tone dot” as an argument against adopting a tone system in which heavy shang had merged
with qu.
16 Konishi’s approach is slightly different (1948:493). He sees these passages merely as the origin
of the tradition in which the six-tone system is associated with the Shingon school and the
eight-tone theory with the Tendai school. In reality – as we have seen, and as Konishi also
points out – both schools would eventually use a very similar six-tone system, as the tone
system used in the Tendai school does not truly distinguish eight tones.
17 As mentioned in section 6.3, Pulleyblank has stressed that this merger must have taken place in
the eight-tone system described in Annen’s texts, as Isei and Chisō almost certainly describe
the dialect of Chang’an. It was this dialect that was the Tang standard language that spread
through all of China, and not only do all the northern Chinese dialects show this merger, many
southern dialects do as well, and all dialects (even Min), have this merger in their character
reading pronunciation, which goes back to the Tang standard language.
The connection that Myōgaku saw between the different methods to transcribe
Sanskrit by means of Chinese characters, and the existence of a six-tone system as
well as an eight-tone system was a mistake: The difference that Myōgaku sought to
explain was most likely the difference between the transcription practice that was
based on Early Middle Chinese (or remnants of it) and the transcription practice that
was based on Late Middle Chinese.18
By the middle of the Tang dynasty, when the new standard language based on the
dialect of Chang’an was replacing the old standard of the Qieyun, there was a
change in the usage in transcribing Sanskrit. The shang tone remained the preferred
indicator of short vowels but for long vowels the annotation qu yin 去引 ‘qu drawn
out’ was now also used. Although the addition of the word yin presumably means
that the qu tone by itself was not felt to be quite appropriate for indicating vowel
length, there must obviously have been some degree of length involved. This is
evidently a point of difference between Early and Late Middle Chinese.
(Pulleyblank, 1978). (See also sections 1.2, 6.2.1 and 11.1.1.)
In other words, in the Early Middle Chinese-based transcription practice the qu
tone was still used to transcribe short vowels. (The shang tone was the favored
indicator of short Sanskrit vowels and the ping tone was the favored indicator of
long Sanskrit vowels, but as the qu tone was checked (ending in -h) just like the
shang and ru tones, it could still be used to mark vowel shortness.) In the Late
Middle Chinese-based transcription practice on the other hand, the qu tone started to
be used to indicate long vowels instead of short vowels, and syllables with short
vowels that had previously been indicated by means of qu tones were now
indicated by means of shang tones. The change in transcription method therefore
stemmed from differences in vowel length between Early and Late Middle Chinese,
and not from the replacement of an older six-tone theory by a more correct
eight-tone theory.
The earliest – more extensive – tone description by Myōgaku stems from
Shittan-hi 悉曇秘 (1090), also called Shittan go-on-shō 悉曇五音鈔. I have adopted
this description from Konishi (1948:371 and 490). A copy of this work dating from
the year 1115 was originally kept at one of the libraries on Kōya-san but was lost
before the end of the Taishō period (so before 1926). According to a description by
the scholar Ōya Tōru (1850-1928) the first half of the work was by Myōgaku and the
second half by another unknown author.
The colophon added to this copy of the year 1115, stated that the original work
was a handwritten copy, finished by Myōgaku on the fifth of May 1090, and that
Myōgaku had added a number of 私案 or ‘personal opinions’. Another, much later
18 In the Chinese works that Myōgaku studied, both systems could be found. He mentions for
instance that characters with muddy initials like dha 駄 and bha 婆 have the annotation heavy
in some of the works he quotes, which is something that was typical of the Early Middle
Chinese-based transcription method (cf. section 6.2.1). Although the origin of this transcription
lies in the consonant system of Early Middle Chinese, the note ‘heavy’ added to syllables with
voiced aspirate initials was still used by Zhiguang in Xitanziji (8th century) as well.
copy of this work from the year 1570, has the title Shittan go-on-shō 悉曇五音鈔,
and has also now been lost, so that Konishi had to rely on a handwritten copy made
before Shittan-hi was lost.19
The work contains a tone chart that shows eight tone dots. Next to each dot a
character representative of that tone is written, with a description. There are a
number of truly mystifying aspects to this tone chart, which may have to do with the
fact that this is an early work by Myōgaku, from the period in which he had only just
started to develop his theories:
Why are there no descriptions added to the light shang and light qu tone dots?
Why are the light and heavy shang and qu tones represented by the same character?
Why do the example characters often belong to completely different tonal categories
than the tones they are supposed to represent?20
What remains however, is that the descriptions added to the tone dots give an
idea of how Myōgaku saw the tonal value of the Chinese tones.
3 Comments to the tone chart in Shittan-hi
東 Light ping
字ノ初ハアガリ、 The beginning of the character rises
終ハサガレルナリ。 and the end is falling
等 Heavy ping
字ノ音ノ初モ終モ Both the beginning and the end
サガレルナリ。 of the sound of the character are falling
洞 Light shang (no description)
洞 Heavy shang
字ノ初ハサガリ、 The beginning of the character falls
終ハアガル。 and the end rises
灯 Light qu (no description)
19 The description is however, also quoted by Kenpō in Shittan-jiki shōgaku-shō 悉曇字記創学
抄 (1380). See section 7.3.3.1.
20 The tone that the example characters actually had in Middle Chinese is as follows: 東 was a
ping tone character with a clear initial (= light ping). 徳 was a ru tone character with a clear
initial (= light ru). The characters 東 and 徳 are therefore appropriate examples of characters
with a light ping and light ru tone. However, 等 did not have a heavy ping tone. In reality it is a
shang tone character with a clear initial (= light shang). 洞 was not a character with a light or a
heavy shang tone. In reality it was a qu tone character with a muddy initial (= heavy qu). 灯
was not a character with a light or a heavy qu tone. In reality it was a ping tone character with a
clear initial (= light ping). 得 was not a character with a heavy ru tone. In reality it was a ru
tone character with a clear initial (= light ru). Even the (rather unlikely) idea that Myōgaku
might have chosen these characters as examples because their ‘reversed’ Go-on tone agreed
with the tone categories that he sought to exemplify does not help: The correspondence pattern
between Go-on and Kan-on (cf. chapter 4) does not agree with his choice of characters.
灯 Heavy qu
字ノ初ハサガリ、 The beginning of the character falls
後ハユガメルナリ。 and the end is bent/warped
徳 Light ru
字ノ終ニ、 At the end of the character
フツクチキ五文字アルガ、 are the five graphs hu tu ku ti ki,
初モ終モアガレル也。 but both the beginning and the end are rising
得 Heavy ru
字ノ終ニ、 At the end of the character
フツクチキ五文字アルガ、 are the five graphs hu tu ku ti ki,
初モ終モサガレル也。 but both the beginning and the end are falling
The next part of the tone theory in Shittan-hi is also quoted by Mabuchi (1996:311),
but with slightly different reading aids than in Konishi’s version. Unfortunately
Mabuchi does not indicate from which copy of Shittan-hi his quotation stems.
4 Shittan-hi on the use of shang or qu in the transcription of Sanskrit
上声ノ重ハ世人皆云二去声一。 Most people call the heavy shang tone ‘qu tone’
故用二六声一ノ時不レ云レ之也。 because of this they do not mention it when they
use the six-tone system
平上去入ハ依二下字一、 Ping, shang, qu, ru depend on the second
character,
故依二下響一定二低昂一也。 and falling and rising are therefore determined
depending on the second sound.
軽重清濁ハ依二上字一、 Light and heavy and clear and muddy depend on
the first character;
故依二上字ノ低昂一 because of this, light and heavy are determined
以定二軽重一也。 depending on whether the first character is falling
or rising.
梵字ノ中ノ体文卅五字ハ皆是 The 35 consonants of the Sanskrit script
上声字也。 are all shang tone characters.21
然 dha 駄 bha 婆等字、 However, syllables (Siddham graphs) like dha
駄22 and bha 婆23
皆注二去声一、 are annotated by everyone with the qu tone,
以レ此得レ意、 and now you will understand why they do
this:
依三世人用二六声一 because most people use the six-tone (system)
21 Siddham consonant graphs automatically include the short (and therefore ‘shang’) vowel -a.
22 Dha 駄 Kan-on ta (Go-on da) had a ping or qu tone and a muddy initial.
23 Bha 婆 Kan-on ha (Go-on ba) had a ping tone and a muddy initial.
故或注レ去、 it can happen that these (syllables) are
annotated with the note qu,
或依二八声一為二上声重一 but other times, based on the eight-tone
system, these syllables are regarded as heavy
shang,
故注二重音一也。 and they are therefore annotated as heavy
sounds,
実ニハ初ハ平、 In fact however, they are called (heavy) because
their tone is in the beginning as ping
後ハ上ノ勢ニ所レ呼レ之也。 and in the end as shang.
tam 擔 ham 撼等ノ字ハ、 Syllables like tam 擔24 and ham 撼25
皆是短声ノ字ニ加二空点一 are all syllables with short tones to which ‘heaven-
dots’26 have been added,
故依レ理上声ニ可レ呼レ之。 It would therefore have been logical, if they had
been called shang tones,
而モ文皆注云レ去、 but in the text they are all annotated with the note
qu,
或注レ上、皆是其義歟。 although they are sometimes again annotated with
the note shang. This must all be because of the
same reason.
若二 bhām 字一注二去声一、 The syllable bhām is annotated with the qu tone,
実初サガリ終ハユカムデ in fact however, the sound is at first falling and in
the end bent/warped,
去声ノ重ニ可レ呼レ之。 and should therefore be called a heavy qu tone.
此等皆是明覚私案也。 (All this is the personal opinion of Myōgaku.
不可為本也。 One cannot rely on it.)
The note at the end of the passage indicates that the chart, and the explanations
added to it, belong to Myōgaku’s personal opinions, i.e. that the ideas expressed in
them were not traditional, but innovations that can be attributed to Myōgaku himself.
The next work by Myōgaku is Han’on Sahō 反音作法 (1093). Of this work there
exists a handwritten copy of the year 1095 which is a direct copy of the original, and
also many later handwritten copies as well as printed versions. Han’on sahō is a
collection of ‘correct’ character readings expressed by means of fanqie/hansetsu, or
24 Tam 擔, Kan-on tan (< tamu) had a qu tone and a clear initial.
25 Ham 撼 Kan-on kan (< kamu) had a shang tone in Early Middle Chinese but merged with the
qu tone in Late Middle Chinese because of the muddy initial.
26 The kū-ten 空点 or ‘heaven-dot/hollow-dot’ is a dot used in the Siddham script, which
indicates that the syllable ends in -m. In Japan, the composite Siddham graphs were thought to
contain meaning in themselves, and the different strokes that made up the Siddham graphs were
all given names like ‘heaven’, ‘wind’, ‘fire’, ‘water’, ‘earth’ and so on (Van Gulik, 1953:126).
han’on as they are called here,27 and arranged according to the gojūon-zu 五十音図
(‘table of fifty sounds’). In Han’on sahō Myōgaku developed a Japanese fanqie
method that is at variance with the traditional Chinese practice. (See section 8.3.1.)
In the following passages from Han’on sahō Myōgaku explains in what way the
Chinese tones are expressed by the first and the second character of the fanqie. He
calls the sound expressed by the second character of the fanqie the jishū-on 字終音
‘final sound of the character’ and the sound expressed by the first character of the
fanqie the jisho-sei 字初声 ‘initial tone of the character’.
I have translated the characters 音 and 声 here as literally as possible as ‘sound’
and ‘tone’, and the characters 昂 and 低 as ‘rising’ and ‘falling’. Furigana or okurigana
are not added to every instance of these characters in the text, but where they are,
these kana notes indicate that 低 was read as taru,28 which means ‘to hang, droop,
go down’, and 昂 was read as agaru, which means ‘to rise up’.
While in Annen’s tone description the meaning of these two characters had to be
determined by context, the furigana and okurigana notes added to the characters in
this text leave only one reading open.
The reason why Myōgaku uses the verb taru here, instead of his earlier sagaru to
express a falling tone contour, is probably because he wanted to adhere to the use of
the characters 昂 and 低 that were also used in Annen’s influential text. (The
character 低 in Myōgaku’s time had the reading taru, but not sagaru.)
Myōgaku’s description of the fanqie spelling method is presented in the mondō
問答 ‘question and answer’ style (Mabuchi, 1962:431, 1963:185).
5 Han’on Sahō on the fanqie spelling method
問。 Question:
平上去入ハ何故依二下字一耶。 Why do ping, shang, qu and ru depend
on the second character (of the fanqie)?
答。 Answer:
字終音低タレルヲハ云二平声ト一、 When the final sound of the character is falling
it is called the ping tone,
終音昂云二上声一。 when the final sound is rising
it is called the shang tone.
終音ユガムヲ云二去声ト一。 When the final sound is bent/warped
it is called the qu tone.
27 These words are usually regarded as synonyms, but in his translation of Bunkyō hifu-ron 文鏡
秘府論 (Kūkai’s explanation of the Chinese poetry rules), Richard Bodman (1978) translates
han’on as the ‘analyzing of the sound’ into an initial and a final that precedes the actual
spelling of the sound by means of fanqie characters.
28 The form tareru consists of a contraction of tari (the ren’yōkei of the intransitive verb taru) +
aru (the rentaikei of the verb ari ‘to be/to exist’). See also p. 416 on the meaning of this form.
終ニ有二フツクチキ一 When the final sound is fu, tu, ku, ti, ki
云二入声一。 it is called the ru tone.
故四声ハ依二下字一也。 Therefore the four tones
depend on the second character.
問。 Question:
軽重清濁依二上字一者、 Light and heavy, clear and muddy
depend on the first character.
先何故清濁依二上字一耶。 First of all, why do clear and muddy
depend on the first character?
答。 Answer:
清濁依二字初声ニ一、 Clear and muddy depend on the initial tone
of the character,
不レ依終音ニ一。 and not on the final sound.
故二字相合成二一字音一時、 Therefore when the two characters are combined
to form the sound of one character,
清濁依二上字一也。 clear and muddy depend on the first character.
源字玄字魚字居字 The characters 源 (gwen), 玄 (kwen),
魚 (gyo) and 居 (kyo)
皆依二上仮名(カナ)ニ一分二 all divide into clear and muddy
清濁一也。 depending on the first kana.29
問。 Question:
軽重何故依二上字一耶。 Why do light and heavy
depend on the first character?
答。 Answer:
渉二四声一 Concerning all of the four tones,
29 What Myōgaku defines as muddy characters (源 gwen and 魚 gyo in Kan-on) were ping tone
characters with a second muddy initial in Middle Chinese. Of the characters that Myōgaku
defines as clear (玄 kwen and 居 kyo in Kan-on) the first was a ping tone character with a
muddy initial and the second was a ping tone character with a clear initial in Middle Chinese.
This means that the determination of a character as either clear or muddy is decided on the
basis of the Japanese Kan-on reading by Myōgaku, and not on the nature of the initials in
Middle Chinese. In Japanese, voiced consonants are called muddy (daku), and voiceless
consonants are called clear (sei). Many of the voiceless (clear) initials in Kan-on go back to
voiced (muddy) initials in Late Middle Chinese. The voiced (muddy) Kan-on consonants on the
other hand have developed from Late Middle Chinese second muddy initials:
EMC              Go-on            LMC              Kan-on
p   clear        p clear (sei)    p   clear        p clear (sei)
ph  2nd clear    p clear (sei)    ph  2nd clear    p clear (sei)
b   muddy        b muddy (daku)   p˙  muddy        p clear (sei)
m   2nd muddy    m muddy (daku)   mb  2nd muddy    b muddy (daku)
字ノ初ノ声ノ昂ルヲ名レ軽。 if the initial tone of the character is rising
it is called light.
初声低ルヲ名レ重。 If the initial tone is falling it is called heavy.
故二字相合成二一字音一時、 Therefore when the two characters are combined
to form the sound of one character,
軽重猶依二上字一也。 light and heavy still depend on the first character.
問。 Question:
若軽重依二上字一者、 If light and heavy depend on the first character,
動字切韻云従総反、 and in the Qieyun the character 動
has the fanqie spelling 従総,
総字上声故動字可二上声一。 and the character 総 has the shang tone,
then it follows that 動 has the shang tone.
而従字重故、此動字初平、 And because the character 従 is heavy,
the character 動 is pronounced at the beginning ping
後上可レ呼レ之。 and at the end shang.
此初平後上声即是去声。 In the beginning ping and in the end shang
in other words is the qu tone.
若去声総而既上声也。 If it is qu tone, but the character 総
already had the shang tone,
四声依二下字一義壊レヌ耶。 wouldn’t the rule that the four tones depend
on the second character be broken?
故和軽重ハ直可レ依二下字一。 Therefore when light and heavy are combined
(the outcome) can directly depend on
the second character (of the fanqie).30
答。 Answer:
於レ呼レ声六声八声家分。 In pronouncing the tones, there is a difference
between adherents of the six-tone system
and adherents of the eight-tone system.
今六声家、 It is in the six-tone system
上声之重即渉二去声一。 that the heavy shang tone goes over to the qu tone.
故所レ難難レ会。 It is therefore hard for me to agree
with your criticism.
八声家意、初平後上之声 In the eight-tone system
in the beginning ping and in the end shang
即是上声之重也。 is the heavy shang tone.
30 The character 動 is at the beginning pronounced as ping and at the end as shang. As Myōgaku
associated the tone contour of the ping tone (falling) with heavy and the tone contour of the
shang tone (rising) with light, he regarded this example as a case where light and heavy were
combined.
今動字従総反 Now, the fanqie spelling of 動 is 従総,
総字上声故動字成二上声一。 and as 総 has the shang tone,
the character 動 has the shang tone.
従字重故動字成レ重。 As the character 従 is heavy,
the character 動 is heavy.
軽重依二上字一事、 Heavy and light depend on the first character,
上東同字反中既定了。 and have already been determined by the light
(東 tou) or heavy (同 tou)31 character
in the fanqie spelling of the first character.
今不二重論一。 I will not go into this once again.
悉曇中重音字 In Siddham, one combines characters
with a heavy sound
与二上声点一相合反レ之、 with a shang tone dot.
即呼二上声重音一。 When one spells this, this is called a shang tone
with a heavy sound.
誠知軽重専依二上字一也。 In this way one truly knows that light and heavy
depend exclusively on the first character.
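The fanqie reasoning in this passage can be restated as a small procedure. The following Python sketch is only an illustration and is not part of Myōgaku’s text: the function name fanqie_tone and the labels "six" and "eight" are hypothetical, and the only merger it models is the one stated in the passage, namely that heavy shang goes over to qu in the six-tone system. For every other combination it simply returns the register-plus-tone label, since the passage says nothing further about them.

```python
REGISTERS = {"light", "heavy"}
TONES = {"ping", "shang", "qu", "ru"}


def fanqie_tone(first_register: str, second_tone: str, system: str = "eight") -> str:
    """Combine the two fanqie spellers as described in the passage.

    first_register: light/heavy quality taken from the FIRST speller.
    second_tone:    ping/shang/qu/ru taken from the SECOND speller.
    system:         "six" or "eight" (hypothetical labels for the two schools).
    """
    if first_register not in REGISTERS:
        raise ValueError(f"unknown register: {first_register}")
    if second_tone not in TONES:
        raise ValueError(f"unknown tone: {second_tone}")
    if system not in {"six", "eight"}:
        raise ValueError(f"unknown system: {system}")

    # Six-tone system: heavy shang "goes over to" the qu tone.
    if system == "six" and first_register == "heavy" and second_tone == "shang":
        return "qu"

    # Otherwise the register comes from the first speller and the
    # tone from the second, without any interaction between the two.
    return f"{first_register} {second_tone}"


# 動 is spelled 従総: 従 is heavy, 総 has the shang tone.
print(fanqie_tone("heavy", "shang", system="eight"))
print(fanqie_tone("heavy", "shang", system="six"))
```

For 動, the heavy quality of 従 and the shang tone of 総 give heavy shang under the eight-tone reading and qu under the six-tone reading, which is the contrast on which the answer turns.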
The remarks in Myōgaku’s text that are most directly related to the realization of the
tones are the following:
The final sound of ping is falling
The final sound of shang is rising
The final sound of qu is bent/warped
The final sound of ru is fu, tu, ku, ti, ki
When the initial tone is rising it is called light
When the initial tone is falling it is called heavy
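These six remarks amount to a simple two-part classification, which can be sketched as follows. The Python code is a hypothetical restatement for illustration only: the function name classify and the string labels are my own assumptions, and the sketch encodes nothing beyond the six statements listed above.

```python
# Syllable endings that Myōgaku gives as the mark of the ru tone.
RU_ENDINGS = ("fu", "tu", "ku", "ti", "ki")

# The initial tone decides the register.
REGISTER_BY_INITIAL = {"rising": "light", "falling": "heavy"}

# The final sound decides the tone (unless the syllable has a ru ending).
TONE_BY_FINAL = {"falling": "ping", "rising": "shang", "bent": "qu"}


def classify(initial: str, final: str, ending: str = "") -> str:
    """Return a label such as 'light ping' from the contour description.

    initial: 'rising' or 'falling' (the initial tone of the character).
    final:   'falling', 'rising' or 'bent' (the final sound of the character).
    ending:  the closing kana, if any; fu/tu/ku/ti/ki marks the ru tone.
    """
    register = REGISTER_BY_INITIAL[initial]
    if ending in RU_ENDINGS:
        tone = "ru"
    else:
        tone = TONE_BY_FINAL[final]
    return f"{register} {tone}"


print(classify("rising", "falling"))       # beginning rises, end falls
print(classify("falling", "rising"))       # beginning falls, end rises
print(classify("rising", "rising", "ku"))  # rising throughout, ends in ku
```

Applied to the descriptions in the Shittan-hi tone chart, the sketch returns the labels that the chart itself uses: a rising beginning with a falling end comes out as light ping, and a falling beginning with a rising end as heavy shang.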
Myōgaku’s next work, Shittan yōketsu 悉曇要決 (1101), is quoted by Kindaichi
(1951:62), Mabuchi (1962: 428-431) and Konishi (1948:492). Although the original
text is a complicated and lengthy discourse in the question and answer style, I have
followed Kindaichi (1951: 62-63) in only selecting the following five passages from
the answer section that are directly related to tone. The corrections to the text added
in the footnotes are from Mabuchi (1962: 428-431).
31 In Shittan-hi Myōgaku still used the completely inappropriate light shang tone character 等 as
an example of the heavy ping tone, but here the character 同 has been chosen, which is indeed a
ping tone character with a muddy initial in Middle Chinese.
6 Selected passages from Shittan yōketsu
1st passage
六声家之去声与二八声字32去声一不レ同。 The qu tone of adherents of the six-tone
system is not the same as the qu tone of
adherents of the eight-tone system.
今云二去声一音33可二是六声家之去声一。 The qu tone being discussed now is the
qu tone of the six-tone system.
実是八声家之上声重音也。 This is in fact the heavy sound of the
shang tone in the eight-tone system.
何者初平後上之音六声家為二去一、 As to what it is like; a sound that is in the
beginning ping and in the end shang is
the qu tone of the six-tone system,
八声家為二上声重音一。 in the eight-tone system this is the sound
of the heavy shang tone.
2nd passage
ai, o 二字可二初平後去呼一レ之。 Ai, o, these two characters (Siddham
graphs) should in the beginning be called
ping and in the end qu.
即是八声家去声也。 This is the qu tone of the eight-tone
system.
故雖二同去声一軽重有レ異歟。 Although they are the same qu tone,
there must be a difference between heavy
and light.
初平後上之字及初平後去之字、 Adherents of the six-tone system regard
characters that are in the beginning ping
and in the end shang, and characters that
are in the beginning ping
六声家同為二去声一。 and in the end qu both as qu tones.34
故不空所訳皆云二去引一歟。 This must be why Amoghavajra in his
transliteration calls them all ‘qu pulled
out’.
32 The character 字 is probably a mistake for 家.
33 The character 音 is probably a mistake for 者.
34 In other words: heavy shang characters (characters with the beginning ping and the end shang
in Myōgaku’s words) and heavy qu characters (characters with the beginning ping and the end
qu in Myōgaku’s words) have merged as the same qu tone in the six-tone system.
3rd passage
宝月宗叡意用二八声一、 It must be because Baoyue and Shuei 35
intentionally use the eight-tone system,
故五句第四字36皆云二上声重音一、 that Sanskrit syllables starting with the
consonants gh, jh, ḍh, dh, bh are all
called heavy shang,
e, o, am 三字亦云二上声一歟。 and that the three graphs e, o, am are in
turn called shang.
弘法家用二六声一、 It must be because the followers of Kōbō
Daishi37 use the six-tone system,
故此等字皆云二去声一歟。 that they call all these characters qu.38
4th passage
但重音者、去声上声之軽重、 As to heavy sounds, qu and shang have
heavy and light.
知人既少。 Not many people know this anymore.
今私案レ之。 Now, these are my personal opinions:
初昂後低為二平声之軽一。 Beginning rising and later falling is the
light ping tone.
初後倶低為二平声之重一。 Beginning and ending both falling is the
heavy ping tone.
初後倶昂為二入声之軽一。 Beginning and ending both rising is the
light ru tone.
初後倶低為二入声之重一。 Beginning and ending both falling is the
heavy ru tone.
当レ知重音者初低音也。 It will be obvious that a heavy sound is a
sound that begins falling.
35 These were two Siddham scholars from the 9th century. Baoyue was a native of South India and
had taught Ennin (one of the founders of Tendai in Japan) the Siddham script and the
pronunciation of Chinese characters. Shuei was one of the Japanese monks who studied
together with Ennin in China. He returned from China in 866. The reference by Myōgaku to the
transcription practices of these two is straight from Shittan-zō: Right after his tone descriptions
Annen quotes several transcription traditions among which those by Baoyue and Shuei (Endō,
1988:47).
36 The term 五句 refers to the 5 articulation points of Sanskrit (g j ḍ d b) and the term 第四字 to
the fourth voicing type, which is voiced aspiration.
37 This refers to the Shingon school (as opposed to the Tendai school). In this passage the six-tone
system is attributed to Kōbō Daishi’s followers (the Shingon school) and the eight-tone theory
to Baoyue and Shuei, who are connected to Ennin, one of the founders of Tendai.
38 Officially the vowels e and o and the a in am in the Siddham script are short. In Korean dhāraṇī
transcriptions however, these syllables are all marked with the qu tone dot as well. This means
that in Korea these vowels were regarded as long, and according to some, e and o were in fact
long in Sanskrit (Rosen 1974:131).
初後倶昂名為二上声一。 When beginning and ending are both
rising this is called the shang tone.
是六声之家義也。 This is according to the six-tone theory.
初低終昂之音可レ為二上声の重一。 Beginning falling and ending rising is the
heavy shang tone.
5th passage
故知,去声者即今重音也。 We therefore know that the qu tone
corresponds to the heavy sound here.
初低後昂之音六声之家以為二去声一也。 The beginning falling and later rising is
the qu tone of the six-tone theory.
Finally, Konishi (1948:493) adds another quotation from Shittan yōketsu in which
the light qu tone is described as 初上後去之声, i.e. a tone that is in the beginning as
shang and in the end as qu.
The tone descriptions in Shittan yōketsu are not presented in a very accessible
way, but Myōgaku’s statements can be summarized as follows:
7 Myōgaku’s six-tone system
light ping tone beginning rising and ending falling
heavy ping tone beginning and ending both falling
shang tone beginning and ending both rising
qu tone beginning falling and ending rising (ping + shang)
light ru tone beginning and ending both rising
heavy ru tone beginning and ending both falling
8 Myōgaku’s eight-tone system
light ping tone beginning rising and ending falling
heavy ping tone beginning and ending both falling
light shang tone beginning and ending both rising
heavy shang tone beginning falling and ending rising (ping + shang)
light qu tone shang + qu
heavy qu tone ping + qu
light ru tone beginning and ending both rising
heavy ru tone beginning and ending both falling
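The two summaries above follow a single compositional rule, which can be made explicit in a short sketch. The Python below is only an illustration of one reading of (8), not a notation found in any of the sources: the labels `rising`, `falling` and `bent`, the function names, and the treatment of the ru tone (whose second half simply repeats the first, as the two ru rows in (8) suggest) are assumptions made for the example. The opening parts 'shang' and 'ping' of the two qu tones are rendered here as 'rising' and 'falling' respectively.

```python
# Hypothetical encoding of summary (8); all labels are this sketch's own.
# First half of the contour: determined by the light/heavy nature of the initial.
INITIAL = {"light": "rising", "heavy": "falling"}
# Second half of the contour: determined by the traditional tone category.
FINAL = {"ping": "falling", "shang": "rising", "qu": "bent"}


def contour(register, tone):
    """Return (first half, second half) for a register and a tone category."""
    first = INITIAL[register]
    # Assumption: in (8) both ru tones have identical halves,
    # so for ru the second half copies the first.
    second = first if tone == "ru" else FINAL[tone]
    return (first, second)


def eight_tone_table():
    """Build all eight combinations in the order used in (8)."""
    return {
        (register, tone): contour(register, tone)
        for tone in ("ping", "shang", "qu", "ru")
        for register in ("light", "heavy")
    }


if __name__ == "__main__":
    for (register, tone), (first, second) in eight_tone_table().items():
        print(f"{register} {tone}: beginning {first}, ending {second}")
```

Under these assumptions the sketch reproduces each row of (8); for instance, the heavy shang tone comes out as beginning falling and ending rising.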
I can only assume that the difference between the ping tone and the qu tone in
Myōgaku’s theory was a difference in length. Although no furigana or okurigana
notes are added to this text in the version quoted by Konishi and Mabuchi, I have
decided to translate the characters 低 and 昂 again as ‘falling’ and ‘rising’ and not as
‘low’ and ‘high’. It seems unlikely to me that Myōgaku would have changed his
accustomed reading of these characters in this text only. Moreover, Kindaichi, who
quotes Shittan yōketsu in Nihon shisei kogi, used a woodblock print version which
does have kana notes added to the text. These indicate in each instance that 低 and
昂 should be read as taru and agaru.
Kindaichi nevertheless argues that 低 and 昂 should be read as ‘low’ and ‘high’.
Kindaichi’s first argument for his interpretation is that – according to him – to use
the expression ‘rising’ for a high level tone and ‘falling’ for a low level tone is still
not uncommon in Japanese. What Kindaichi refers to is the fact that in modern
Japanese – when used attributively – the past tense can express the perfect aspect (as
in あがった調子 ‘a raised (= high) tone’), but this is not the way in which these
verbs are used in the Buddhist tone descriptions. In most cases they appear in the
unmarked form, which expresses the imperfect aspect. They also appear, however,
with the suffix -eri (< i-ari), which can express the imperfect, progressive or iterative
aspect, but also the perfect aspect. (The latter occurs with verbs that express a
momentary or resultative action, especially in the attributive function; cf. Lewin,
1975: 165-171.) Both forms, however, are often used interchangeably within one and
the same text. Moreover, one and the same form will be used one time to describe
the first part of the tone, and the next time to describe the second part of the tone. It
is therefore likely that they were used with identical meaning, which in this case can
only mean that the -eri forms expressed the imperfect aspect.
Kindaichi’s next argument is that the kana notes indicating the readings taru and
agaru may have been added by later generations, and that Myōgaku himself may
very well have read these characters as hikusi and takasi. (Even if this were true, it
does not change the fact that all later generations of scholars, including those who were
contemporary with the production of the tone dot material, consistently read these
characters in Myōgaku’s texts as ‘falling’ and ‘rising’.)
The older handwritten copy used by Mabuchi and Konishi does not have reading
notes. Does this mean that Kindaichi could be right, and that Myōgaku himself may
have read 低 and 昂 as hikusi and takasi? The first problem with this idea is that
there are other texts in which Myōgaku used the verbs taru (or sagaru) and agaru in
the exact same context. Secondly, the contemporary Ruiju myōgi-shō dictionary
does not include the readings hikisi39 and takasi for the characters 低 and 昂. Neither
have I come across any versions of the old texts in which these characters have been
attested with the okurigana シ, indicating that the possibility to read them as the
adjectives takasi or hikisi was even there.40
The readings that are indicated for these characters in Ruiju myōgi-shō are the
following: The two characters 低 and 昂 occur together in Ruiju myōgi-shō as 夷昂
(the character 低 is usually listed in its variant shape 夷). The reading of this
compound (which in the tone descriptions is used to describe the tone contour of the
39 The form hikusi instead of hikisi apparently only developed in the Muromachi period. The
oldest attested form is hiki- in the compound hikihito 比木比止 (ヒキヒト) ‘a person of small stature’
(Jidai-betsu kokugo dai-jiten, Jōdai-hen, 1967:606).
40 One such possible attestation will be discussed later on in this chapter.
heavy shang tone) is given as tari-agaru, which can only mean ‘to fall and rise’ and
not ‘low-high’.
In isolation the character 低 has the readings taru ‘to droop, to hang down’,
katabuku ‘incline, lean, slant, tilt, go down’, which both point to a (falling) contour
tone instead of a level tone, and furthermore mizikasi ‘short’, tosi ‘sharp, swift,
early’ and tahiraka ni ‘flat, level’. (I see the meanings of ‘short’ and ‘sharp, swift,
early’ (‘a short time’) as related.) Although tahiraka ni opens the possibility to read
this character as ‘level’, this is of course not yet the same as Kindaichi’s ‘low’.
The character 昂 in isolation has the readings agaru ‘to rise up’, ahugu (modern
Japanese: aogu) ‘to look up to (from a humble position), to revere’, nozomu ‘hope,
expect’, and saka ‘slope’, which all point to a contour tone.
I have found no indication for a meaning of the word taru (which seems to have
been the most standard reading for the character 低 or 夷 at the time) that comes
close to ‘low’, despite the fact that the character with which it was written nowadays
has the reading hikui ‘low’. Everything indicates that it meant ‘falling’ or ‘hanging
down’. The word tarumi 垂水 for instance (albeit written with a different character)
means ‘waterfall’, and not something like ‘low lying pool’, which would have been
possible if taru really did have the meaning ‘low’.
Finally, it appears that the usual word for ‘low’, hikisi in Heian/Kamakura period
Japanese, was written with the characters 卑 or 短 (see for instance hikiyama 短山
‘low mountain’ as opposed to takayama 高山 ‘high mountain’) and not with the
character 低. In much later tone descriptions from the Edo period as well – in which
the terms takasi and hikusi are used for the first time instead of the terms agaru and
taru/sagaru – the katakana reading note ヒクシ is added to the character 卑 and not
to the character 低. Furthermore, the term takasi is written as 高シ, employing the
character 高 and not 昂. (See section 13.1.2.)
Other arguments that Kindaichi uses against reading these characters as ‘falling’
and ‘rising’ are that this would result in a tone system that includes no level tones at
all, which would be strange, and that the expressions ‘beginning and ending both
falling’ and ‘beginning and ending both rising’ make no sense; it would have been
more logical to simply write ‘falling’ or ‘rising’ instead.
I think the first argument is premature, as the idea that a particular tone system is
strange depends on how one sees the origin and function of this tone system. As to
the last argument: The expressions ‘beginning and ending both falling’ and
‘beginning and ending both rising’ are only strange if Myōgaku’s method of
dividing the tones into two parts (where the tone of the first part depends on the
heavy or light nature of the initial, and the tone of the second part depends on the
traditional tone category of the character) is not taken into account.
In Chinese, the characters 低 and 昂 can be read as adjectives or verbs, the former
yielding ‘low’ and ‘high’, and the latter ‘falling’ and ‘rising’. Because of the
readings in Ruiju myōgi-shō, I regard ‘falling’ and ‘rising’ as the standard meaning
of these characters in Japan at the time. Unless the texts include the kana reading aid
シ, indicating that it was possible to read these characters as the adjectives hikisi or
takasi, ‘falling’ and ‘rising’ are the appropriate translations.
So far, I see no reason to change the normal meaning of these characters in that
period, which clearly indicates contour tones, to ‘level low’ and ‘level high’ as
Kindaichi does.
7.3.1.3 Fujiwara Munetada 藤原宗忠
The next text is not from a Buddhist work, but from Sakumon daitai 作文大体
(1108), a manual for the composition of Chinese poetry by the courtier Fujiwara
Munetada (1062-1141). According to Konishi, the most complete text can be found
in the Tōzan gyo-bunko otsu-bon 東山御文庫乙本. This text is quoted by Konishi
(1948:525) and Mabuchi (1962:434). According to Mabuchi, Sakumon daitai has so
many points in common with Myōgaku (the characters that Munetada chooses as his
examples for instance coincide exactly with Myōgaku’s examples in Han’on sahō)
that Mabuchi assumes the author must have been familiar with Myōgaku’s work.
9 Sakumon daitai on the fanqie method
凡文字者、有二反音一。 All characters can be spelled.
反音義与レ飜同 (Han-on means the same as ‘transcription’)41
反音必有二二字一。 There are always two characters in a spelling
故略頌云、 To give a short explanation,
平上去入者依二下字一、 ping, shang, qu and ru depend
on the second character
軽重清濁者依二上字一。 light, heavy, clear and muddy depend
on the first character
謂二平声軽一者東、重者同 An example of the light ping tone is 東 (tou),
the heavy (ping tone) is 同 (tou)
入声軽者徳,重者独、 the light ru tone is 徳 (toku), the heavy (ru tone)
is 独 (toku)42
皆依二飜音一、 The meaning is that in all cases,
based on the spelling,
上字得二其軽重清濁之義一也。 one derives light and heavy, clear and muddy
from the first character.
爰只挙二平入声一者、 The reason why I give only the ping and ru tones
here,
上声重渉二於去声一、 is because the heavy shang tone goes over
to the qu tone
41 I have put the translation of sentences written in smaller characters between brackets.
42 These four example characters indeed belong to the light ping and heavy ping, and the light ru
and heavy ru categories respectively.
々々之軽渉二於上声一、 and the light qu tone goes over to the shang tone,
逓難二分別一之故。 and they are hard to distinguish from each other.
去声軽渉上声 (As to the light qu tone
going over to the shang tone
以未知其意云々。 I do not yet understand the meaning of this.)43
二字之音能難二反得一 It is difficult to express the sound of these two
characters in a fanqie
以二悉曇一可レ知レ之云々。 but based on Siddham one should be able to
understand it, etc.
平声入声軽重 The light and heavy of the ping and ru tones
或不三必依二上字一。 does not always depend on the first character.
濁字多如レ之。 (Most muddy characters are like this.
依三平声無二軽音一、 It is because the ping tone has no light sounds
入声無二重音一也。 and the ru tone has no heavy sounds.
清字又有二如レ此之類一。 To this group the clear characters also belong.)44
又如二賢弦有支等一者 And just as there is for instance a division
between ken and ken,
雖レ用二同字一所レ読各異。 although the same character is used,
when one reads them they each differ.
未レ得二其意一。 I do not yet understand this.
就中、上声之重、 Above all, the heavy shang tone
去声軽重三声非二弁得一。 and the light and heavy qu tones, these three
cannot be distinguished.
世俗所レ読無レ有二差別一。 In general, when reading them
there is no difference.
但略頌云、 But to give a short explanation,
上短下長去声軽、 The first tone short and the second long
gives a light qu tone.
43 I adduce the text as given by Mabuchi except in this sentence, where I follow Konishi, as
Mabuchi seems to have reversed the correct word order. (Mabuchi has: 以未其知意云々。)
44 If – as we have seen before – the term muddy refers to the Chinese second muddy category here
(as Mabuchi (1996:308) assumes), this is another reference to the fact that in Japan all
characters with second muddy initials belonged to the heavy subtone in the ping tone but to the
light subtone in the ru tone. (“The ping tone has no light (second muddy) sounds and the ru
tone has no heavy (second muddy) sounds.”) So indeed, whether characters with second muddy
initials belonged to the light or the heavy category in the ping and ru tones did not depend on
the first character (of the fanqie) but on the second. When the second character was ping, they
were heavy; when it was ru, they were light. In the ru tone it is even the case that all characters
were light. (See for instance the five-tone marking system used in the Tosho-ryō-bon of Ruiju
myōgi-shō.) The remark that in the ru tone even clear characters belong to the light category
probably refers to this characteristic of the five-tone system, as – confusingly enough – Late
Middle Chinese muddy (daku) characters were clear (sei) in Kan-on.
上長下短去声重。 the first tone long and the second short
gives a heavy qu tone.
謂二去声重一者上平声、而長、 In the heavy qu tone the first tone is as ping and
long,
下去声而短。 and the second tone is as qu and short.
軽者其音不二相殊一。 The light is not different from this.
但上短下長、 However, the first tone is short and the second
long,
尋常所レ読去声是也。 and this is how the qu tone is normally read.
上声重者、 As to the heavy shang tone,
上軽平声、下是上声。 the first tone is like the light ping tone
and the second is a shang tone.
其間直折不レ同二去声一云々。 This tone is different from the qu tone in that
there is a sharp bend/breach between the tones
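The basic rule stated at the opening of the passage above (the tone category comes from the second character of the spelling, light/heavy from the first) can be sketched as a simple lookup. The Python below is a minimal illustration under stated assumptions: the mini-lexicon contains only the four example characters that Munetada himself names, the function name `fanqie_tone` is hypothetical, and the character pairings used to call it are invented to show the mechanism and are not attested fanqie spellings. As the passage itself notes, the rule does not hold in every case, and the sketch ignores those exceptions.

```python
# Hypothetical mini-lexicon: only the four example characters named in the
# passage, each with its (light/heavy, tone category) classification.
LEXICON = {
    "東": ("light", "ping"),
    "同": ("heavy", "ping"),
    "徳": ("light", "ru"),
    "独": ("heavy", "ru"),
}


def fanqie_tone(first, second, lexicon=LEXICON):
    """Derive (light/heavy, tone category) of a character from its spelling.

    Light/heavy is taken from the first spelling character and the tone
    category from the second, following the rule quoted in the passage.
    """
    register, _ = lexicon[first]
    _, tone = lexicon[second]
    return (register, tone)


if __name__ == "__main__":
    # Invented pairing, for illustration only: a heavy first character
    # combined with a ru-tone second character gives a heavy ru result.
    print(fanqie_tone("同", "徳"))
```

The lookup keeps the two halves of the classification strictly separate, which is exactly the point Munetada then goes on to qualify for the ping and ru tones.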
Fujiwara Munetada may have been acquainted with Myōgaku’s new theories, but he
appears to oppose Myōgaku’s ideas rather than to endorse them. His description has
more points in common with the first tone description quoted in this chapter, in
Hoke-kyō shakumon by the monk Chūzan from the Hossō school:
They both include remarks on complications in the division into light and heavy
in the ping and ru tones (to which Chūzan also adds the shang tone), and on how
differences in length play a role in the distinction of the light and heavy of the shang
and qu tones. Furthermore, Fujiwara Munetada also adheres to the five-tone system
that was typical of the Hossō school. (This is probably no coincidence as the
Fujiwara family had close ties with the Hossō school.)
The points of agreement between Hoke-kyō shakumon and Sakumon daitai
probably indicate that these texts reflect an older interpretation of Annen’s text, in
which tone length distinctions played a role.
The fact that the descriptions do not agree with each other on the other hand,
may indicate that this theory had already become confused. Another indication for
this is the fact that shortly after Sakumon daitai was written, this view of the tones
was abandoned and replaced by Myōgaku’s new interpretation, in which the tones
were defined purely in terms of tone height.
Fujiwara Munetada seems to oppose Myōgaku’s new definition of heavy and
light as ‘beginning falling’ and ‘beginning rising’. In Myōgaku’s description, the
light qu tone was 上去 and the heavy qu tone was 平去, but according to Fujiwara
Munetada they were both 平去. In other words, Fujiwara Munetada rejects the idea
that a light tone should begin rising.
In Myōgaku’s description the heavy shang tone was 平上, but here it is
described as 平軽上, so Fujiwara even seems to disagree with the idea that a heavy
tone should start heavy. (Strangely enough, Fujiwara Munetada’s description of the
heavy shang tone agrees exactly with Isei’s description of the light shang tone,
which makes me wonder about the accuracy of this text.)
I will leave a further discussion of the description of the heavy shang and light
and heavy qu tones, and the origin of the merger of light qu with shang to chapter 8.
7.3.1.4 Eijū 恵什 (Shingon school)
As the compiler of the work Kunshū shittan shii yōketsu-shō 捃拾悉曇思惟要決鈔
(1140-1150?) both Ninkai and Eijū of the Shingon school are mentioned. Konishi
(1948:485) thinks that it must date from the middle of the Heian period, before the
Insei period (1086-1192). Because of this he considers Ninkai 仁海 (951-1046) as
its compiler.
I will, however, follow Mabuchi, who regards this work as strongly influenced by
Myōgaku’s theories. Mabuchi therefore concludes that the author must have been
Eijū who was active at the Ninna-ji from approximately 1140 to 1150 (1962: 390,
401). This is about 50 years after the time of Saisen 済暹, Kanchi 寛智 and
Myōgaku. The work therefore most likely belongs to the Siddham tradition of the
Ninna-ji temple in the line of Saisen and Kanchi.
10 Kunshū shittan shii yōketsu-shō on the difference between qu and shang
去声軽始上終去、 The light qu tone is beginning shang, ending qu
是響頗聊相近。 These sounds are rather close to each other.
上声軽始終上声、 The light shang tone is beginning and ending
shang tone
又去声重始平終去、 and the heavy qu tone is beginning ping
and ending qu
此響聊相似。 they are somewhat similar to each other.
上声重始平終上、 The heavy shang tone is beginning ping
and ending shang,
難二分判一。 they are difficult to distinguish.
故、有書、挙二平入二声軽重一、 In a book the light and heavy of ping and ru
are given,
竟云、 and in the end it says:
爰亦挙二平入二声一者、 The reason why I give only ping and ru,
these two tones,
上声重渉二於去声一、 is because the heavy shang tone
goes over to the qu tone
去声軽渉二於上声一、 and the light qu tone goes over to the shang tone
逓難二分別一故也。文 and they are hard to distinguish from each other.
(End of quotation.)
The quotation from ‘a book’ appears to be from Fujiwara Munetada’s Sakumon
daitai, which can be regarded as corroboration for Mabuchi’s dating of this work.
(Konishi (1948:486), however, assumes that Ninkai and Fujiwara Munetada must
have used the same unknown older source.)
Finally, Kunshū shittan shii yōketsu-shō includes a remark about the ping tone
(Mabuchi, 1962:397).
11 Kunshū shittan shii yōketsu-shō on the difference between qu and ping
平声重始去終平。 The heavy ping tone is beginning qu
and ending ping.
故平声響相二近去声一。 Therefore the sound of the ping tone
is somewhat close to that of the qu tone.
依レ之難二分別一。 Because of this, they are hard to distinguish.
故初挙二平声一 Therefore, I first mention the ping tone,
或亦云レ近二去声一也。 and say again that it is close to the qu tone.
故検二諸文一、 Therefore if you study various writings,
平去声逓渉。 (you will see that) the ping and the qu tones
go over into each other.
The remarks concerning a similarity between shang and qu appear to have been
adopted from Annen, while the influence of Myōgaku in this text can be seen in the
division of the tones into an initial part that is determined by whether the initial is heavy
or light and a final part that is determined by the traditional tonal category of the
character.
Descriptions of the light ping tone and the ru tones are missing, but the
descriptions that are included mostly agree with Myōgaku. An exception is the
heavy ping tone, which is described as ‘first qu and then ping’ instead of ‘beginning
and ending both ping’ as is usual in Myōgaku’s tone works. The reasons Eijū gives
for his description of the heavy ping tone are interesting: “the sound of the ping tone
is somewhat like that of the qu tone”, “they are hard to distinguish”, and in the
various writings “the ping and the qu tones go over into each other”.
Myōgaku connected the merger of heavy shang with qu with changes that he
observed in the transcription of Sanskrit over time: Sanskrit short vowels that had
earlier been transcribed by means of qu tone characters were later transcribed by
means of shang tone characters.
As I have explained, in reality this change in transcription practice was caused by
the fact that the qu tone in Late Middle Chinese had lengthened. In later
transcriptions, therefore, the qu tone was no longer used to transcribe short vowels,
and was replaced by the shang tone. The shang tone came to be regarded as the only tone
appropriate for the transcription of Sanskrit short vowels.
Eijū appears to be doing something similar here, but this time with regard to the
ping and qu tones: He claims that they must be similar because Sanskrit long vowels
that were earlier marked with a ping tone are later marked with a qu tone (so they
“go over into each other”).
In reality, however, this change in transcription goes back to the same development
that had made the qu tone unfit for the transcription of short vowels: The qu tone
had lengthened, and was now occasionally used to transcribe long vowels.
7.3.1.5 (Kōmyō-san) Jūyo (光明山) 重誉 (Tendai school)
The work Shittan jiki-shō 悉曇字記抄 (1142) by Jūyo is about 50 years later than
Myōgaku’s Han’on sahō. Although this work reflects the typical theory of Myōgaku
which results in eight tones, the Japanization of the fanqie method appears to have
progressed even further, as the first character of the fanqie has by now developed
into the first kana of the first character of the fanqie, and the second character of the
fanqie has developed into the second kana of the second character of the fanqie
(Mabuchi, 1962:483).
12 Shittan jiki-shō on the fanqie method
夫、平上去入ハ者依ル二下字之下ノ仮名ニ一。 Ping, shang, qu, ru depend on the
second kana of the second character.
謂字ノ終ノ音低タルルヲ45云二平声ト一、 When the final sound of the character
is falling it is called ping,
終昂アカルヲハ云二上声ト一、 when it is rising it is called shang,
終ノ音曲マカレルヲハ云二去声ト一矣。 when it is bent/warped it is called qu.
入声ハ終音不レ通レ余、其心顕也。 In the ru tone the final sound does not
continue, its meaning is clear.
次軽重依二上字之上ノ仮名ニ一。 Next, light and heavy depend
on the first kana of the first character.
謂初ノ音昂ヲハ云レ軽、初音低云レ重。 When the initial sound is rising
it is called light, when the initial
sound is falling it is called heavy.
45 Originally, the intransitive verb taru – which has appeared in the texts a number of times so far
– belonged to the yodan conjugation, while the transitive verb taru belonged to the shimo-nidan
conjugation. During the Heian period, the intransitive verb adopted the conjugation of
the transitive verb. (The modern verb tareru (shimo-ichidan conjugation) still has intransitive
as well as transitive meanings.) The form taruru in this text is therefore the rentaikei of the
intransitive verb. (See also tarete in Hoke-kyō onkun 法華経音訓.)
7.3.1.6 Shinren 心蓮 (Shingon school)
Shittan kuden 悉曇口伝 (1180) is the work by the Shingon monk Shinren 心蓮 (?-1181)
quoted in section 4.1. Apart from the detailed comparison of the Go-on and
Kan-on tones, this work does not contain concrete descriptions of the tones
(Mabuchi, 1962:492, 507; Konishi, 1948:494).
7.3.2 Kamakura period (1185-1338)
7.3.2.1 Dōhan 道範 (Shingon school)
The work Shittan-jiki kikigaki 悉曇字記聴書 (1241) by Dōhan (1178-1252), is also
known as Dōhan-ki 道範記. The author Dōhan lived at Kōya-san, and was the
teacher of the famous scholar Shinpan 信範. (See section 7.3.2.2.)
Shittan-jiki kikigaki contains an eight-tone chart with tone dots, the names of the
tones, and example characters. In addition, the reading of the example characters is
written out in kana to which fushihakase marks have been added. It is clearly visible
that each mark still has a tone dot as its starting point, and that the angle and location
of the marks around the kana graph is determined by the tone dot. (As will be shown
in chapter 14, this type of notation (shōten hakase or ‘tone dot’ hakase), in which the
hakase marks were extensions of the tone dots, and expressed the tonal value of the
dots, is the oldest type of fushihakase notation in Japan.)
Figure 5: Tone dot chart with hakase marks in Shittan-jiki kikigaki
Source: Mabuchi (1962:568)
To ping tone dots a horizontal hakase mark is added, to shang tone dots a diagonal
(backslash) or vertical hakase mark is added. Qu tone dots are marked with a
diagonal mark in the opposite direction of the mark added to the shang tone
(forward slash), or with a hakase mark that has a kink in it and appears to consist of
a horizontal mark followed by a diagonal (forward slash) mark. (The hakase marks
are drawn from the character outward, with the tone dot as the starting point.)
Although no concrete values for the tones are mentioned, it is clear that this tone
system belongs to the tradition of Myōgaku: Each of the two separate kana graphs
has been allotted a proper tone of its own, which has to be combined to form the
tone contour of the tone as a whole. We also see that light tones start with a shang
tone dot added to the first kana (and a diagonal or vertical hakase mark added to this
tone dot) and that heavy tones start with a ping tone dot added to the first kana (and
a horizontal hakase mark added to this tone dot), which is typical of Myōgaku’s
concept of light and heavy.
The exact tonal value of such early examples of fushihakase is hard to determine
with certainty. Although outwardly this type of fushihakase resembles the later
Shingon goin hakase, which was a hakase system based on the distinction of
absolute tone, the hakase shown in the chart in Shittan-jiki kikigaki date from before
the invention of the goin hakase system.46
7.3.2.2 Shinpan 信範 (Shingon school)
According to Konishi, Shinpan, who wrote no less than twelve works on Siddham
phonology, and who was the first to make use of the Chinese rhyme tables, is one of
the most important figures in the history of Japanese phonological study. Shinpan
(1223-1296) was the best pupil of Jōchō 承澄 in the line of Shinren 心蓮.
(Interestingly, Jōchō was by origin a Tendai monk, who nevertheless received
instruction in Siddham studies from Shingon teachers. Konishi explains this by
pointing out that in this period the level of Siddham studies was much higher in the
Shingon school.)
One of the texts used by Kindaichi in Nihon shisei kogi to establish the tonal
value of the tone dots in the standard theory is Shinpan’s work Shittan hiden-ki
悉曇秘伝記 (1286). Shittan hiden-ki is a kind of final summary of Shinpan’s Siddham
studies.
None of the other studies I have used quote this text, and the only version
available to me is the text as presented by Kindaichi (1951:691). Kindaichi
introduces this text as follows: “According to Iida Rigyō, Shittan hiden-ki by
Shinpan contains the following description of the ‘eight-tone theory’.” Kindaichi,
however, does not mention in which article or book by Iida this text can be found.47
This is unfortunate, as the text presented by Kindaichi contains the only example of
the okurigana シ being added to the character 低 in the works of the Siddham
46 According to tradition, the goin-hakase system was introduced as a new notation around 1270,
by the Shingon monk Kakui 覚意 (1237-?), but it was not readily accepted and did not come
into use in the Shingon school until the 14th century. (See chapter 14.)
47 It may have been personal communication by Iida, as the passage is not included in Iida’s
‘Nihon ni zanson-seru Shina koin no kenkyū’ (1941), and a number of other works by his hand.
scholars. (The okurigana シ added to the character 低 indicates that this character
should be read as hikisi/hikusi ‘low’.)
I strongly suspect that this kana note is not attested in the original version of the
text of Shittan hiden-ki: The reading hikisi/hikusi for this character is not listed in the
contemporary Ruiju myōgi-shō, and Iida has the habit of adding kana reading aids
himself to quotations from Chinese hakubun 白文 texts (i.e. texts that contain no
reading aids in the original).48 Iida’s rendering of this text was probably influenced
by the fact that the character 低 has acquired the reading hikui ‘low’ in modern
Japanese, while its counterpart in the text, the character 昂 has not acquired the
reading takai ‘high’ in modern Japanese. (It truly seems out of the question to me
that the annotations added to this text would have indicated that 低 should be read as
an adjective, but 昂 as a verb.)
Although I have not been able to verify the presence or absence of kana reading
aids in the original text,49 I have decided to translate the character 低 as ‘falling’ in
(13), even though this is in contradiction with the reading notes, which I have
adopted unaltered from the text as presented by Kindaichi in Nihon shisei kogi.
I am convinced that the appropriate translation here is ‘falling’, in agreement
with all previous texts: The description of the light ping tone for instance, also
shows that a translation of 低 as ‘low’ in combination with a translation of 昂 as
‘rising’ does not make sense. The light ping tone in my translation is therefore
‘beginning rising and ending falling’ and not ‘beginning rising and ending low’.
13 Shittan hiden-ki on the light and heavy of the four tones
先ヅ四声軽重ヲ明ラカニセバ、 If I first clarify the light and heavy
of the four tones
私頌ニ曰く、 I say as my personal explanation:
平声ノ重ハ初後倶ニ低シ。 the heavy ping tone is beginning and ending
both falling;
48 For instance, in his study ‘Nihon ni zanson-seru Chūgoku kinsei-on no kenkyū’ (1955) the
(light and heavy) ping tones of Biao in Annen’s text are described as 直ぐに低し sugu ni
hikusi ‘immediately low’ (p. 71), and the shang tone of Biao in the same text is described as 直
ぐに昂し sugu ni takasi ‘immediately high’. Isei’s light and heavy ru tones are described as 昂
く呼ぶ takaku yobu ‘to pronounce high’ and 低く呼ぶ hikuku yobu ‘to pronounce low’
respectively (p. 74).
49 The Koku-sho sō-mokuroku lists the following manuscripts of this work: One manuscript from
1321 kept at the Shinpuku-ji 真福寺, one manuscript from 1340, kept at the Kanchi-in Kongō-
zō 観智院金剛蔵, one manuscript from 1571, kept at the Kōya-san Ryūkō-in 高野山竜光院,
and one manuscript from the end of the Muromachi period (16th century) kept at the Hō-bodai-
in 宝菩提院. According to the Koku-sho sō-mokuroku the only printed edition of Shittan
hiden-ki can be found in the Buddhist canon Taishō shinshū daizō-kyō 大正新修大蔵経 vol.
84. However, the version of Shittan hiden-ki that is included in this series (which is based on
the manuscript from the year 1571), does not include the passage under discussion.
7.3 The descriptions of the tones 427
平声ノ軽ハ初メ昂リ後低シ。 the light ping tone is beginning rising
and ending falling;
上声ノ重ハ初メ低ク後昂ル。 the heavy shang tone is beginning falling
and ending rising;
上声ノ軽ハ初後倶ニ昂ル。 the light shang tone is beginning and ending
both rising;
去声ノ重ハ初メ低ク後チ (sic) 偃ス。 the heavy qu tone is beginning falling
and ending bending down;
去声ノ軽ハ初メ昂リ後偃ス。 the light qu tone is beginning rising and
ending bending down;
入声ノ重ハ初後倶ニ低シ。 the heavy ru tone is beginning and ending
both falling;
入声ノ軽ハ初後倶ニ昂ル。 the light ru tone is beginning and ending
both rising.
The most interesting feature of this tone description is the use of the character 偃 to
describe the qu tone. In my opinion the use of this character is an indication that the
standard theory’s reconstruction of the value of the qu tone is not correct. (In order
to understand the next passages a look at the overview of the different tone systems
at the end of this chapter may be helpful.)
The character 偃 has a number of readings in the Ruiju myōgi-shō dictionary, the
most standard being フス ‘bend, bow down, bend one’s head, stoop’. (This is also
the reading indicated by Iida’s furigana ス.) Furthermore there are タフル ‘fall,
come down’, ハイフス ‘to crouch down (on hands and knees)’, ヤスム ‘rest, sleep’
(< lie down?) and ノケサマ ‘(to fall down) on one’s back’. The readings カクル ‘hide’
(< to crouch down?) and アラス ‘to ruin, damage, disturb’ (< to make tumble
down?) can probably be regarded as related. There are also a number of readings
related to a bending, flapping or averting movement: ナヒカス ‘to make something
bend, to let something flutter’, アフク ‘fan, fan a fire’,50 ソラス ‘bend, warp,
avert, divert’, ノク ‘to get out of the way, step aside’, ニカス ‘let escape, let
go’, フセク ‘defend against, ward off’. Furthermore there is ヲノツカラ
‘spontaneously, of itself’. Looking at the many readings above that
indicate a falling contour, there can be no doubt that this character should be
translated as ‘bending down’, as I have done above, or as ‘falling down’.
Mabuchi (1962:437) and Kindaichi (1951:691) on the other hand, have chosen to
translate this character as ‘rising up’.
50 There are two verbs アフク (mod. Jap. aogu). One means ‘to fan’ and the other means ‘to look
up at (from a humble position), to revere, to respect’. I think the first verb is probably meant
here because ‘to fan’ agrees with ナヒカス ‘to flutter’. On the other hand, although ‘to look up
at’ appears to be in contradiction with meanings like ‘bend one’s head, stoop’, the possibility
cannot be ruled out that this meaning developed from ‘bend, bow down, crouch down’ because
of the humble position from which one looks up.
Mabuchi gives no explanation for his choice of translation, but Kindaichi’s
argumentation is as follows: He thinks that the key to the interpretation of Shinpan’s
use of 偃ス husu to describe the ‘final sound’ of the qu tone in this text is
Myōgaku’s Shittan yōketsu (1101).51 In the 2nd passage of this text the heavy qu tone
in the eight-tone theory is described as 初平後去 “in the beginning ping and in the
end qu”, and Kindaichi concludes (as do I) that Shinpan’s 偃ス husu and Myōgaku’s
去 qu must have indicated one and the same tone contour.
However, instead of applying Shinpan’s rather concrete description as 偃ス to
Myōgaku’s qu tone in the eight-tone theory, Kindaichi switches to Myōgaku’s
description of the qu tone in the six-tone theory. This description – which is 初低後
昂 “beginning low and ending high” in Kindaichi’s interpretation – is now imposed
on 偃ス. The result is that the character 偃, despite all the readings pointing to a
falling tone contour listed above, is interpreted as ‘rising up’ instead of ‘falling
down’ or ‘bending down’.52
It has to be borne in mind that the value that Myōgaku gave to the qu tone in the
six-tone theory (which he attributed to the Shingon school) is fundamentally
different from the value that he gave to the qu tone in his own eight-tone theory. (As
will be explained in more detail in section 8.3.4, Myōgaku’s systematic method
made him apply the tone contour that he had reconstructed for the heavy shang tone
also to the qu tone, but only to the qu tone of the six-tone theory, as it was only in
this tone system that the qu tone and the heavy shang tone had merged.)
Shinpan’s ‘bending/falling down’ can only correctly be compared with
Myōgaku’s description of the qu tone in his eight-tone theory, which was ユガム
yugamu ‘to bend/warp’ in his works Shittan hi (1090) and Han’on sahō (1093), but
these are works which Kindaichi does not include.
There is no need to translate 偃ス as ‘rising up’ now. Rather, the use of the
character 偃 to describe the qu tone in Shinpan’s eight-tone theory is a strong
indication that the term yugamu ‘to bend/twist/warp’ in Myōgaku’s eight-tone
theory should be interpreted as ‘bending down’. (Such an interpretation would also
bring the Japanese descriptions of the Middle Chinese qu tone into agreement with
the view of this tone prevalent in Sinologist circles.)
We can summarize the adaptations Kindaichi made to the translations as follows:
First, he used Iida’s reading hikusi for 低 in Shittan hiden-ki to argue for a
translation of 低 as ‘low’ in Shittan yōketsu, despite the fact that the kana notes
added to the version of Shittan yōketsu that he himself used indicated the reading
taru ‘falling’. Next, he used the six-tone theory of Shittan yōketsu (1101) to argue
for a reversal of the meaning of the character 偃 in the eight-tone theory of Shittan
hiden-ki.
51 The texts that Kindaichi used in Nihon shisei kogi to establish the standard theory are: Annen’s
Shittan zō, Myōgaku’s Shittan yōketsu, Shinpan’s Shittan hiden-ki, Moji-han and Shinkū’s
Hoke-kyō onkun, which are all included in this chapter.
52 For the real reason behind this remarkable reversal of the indicated tone value, see section 9.1.
In the first case, a character reading that is based on a questionable attestation of
a kana note overrules a character reading that has been amply attested, both by
means of kana notes in the tone descriptions themselves, and as a reading in the
Ruiju myōgi-shō dictionary. In my opinion, the second argument is based on a mix-
up of two fundamentally different quantities, namely the six-tone theory and the
eight-tone theory.
7.3.2.3 Ryōson 了尊 (Shingon school)
In Shittan rinryaku-zu-shō 悉曇輪略図抄 (1287) Ryōson 了尊, a Shingon scholar
and pupil of Shinpan, describes the typical eight-tone system in which all four tones
have a light and a heavy variant. (Cf. Mabuchi, 1962: 436-437 and Konishi, 1948:
502-503).
Ryōson’s ideas – and the text itself as well – are clearly identical to the ones
expressed by his teacher Shinpan, except that in addition, Ryōson explicitly
mentions that in practice only six tones are used. I see the fact that Ryōson’s
description, which must have been based on the tone description of his teacher
Shinpan, does not contain kana reading aids after 低 (or anywhere else for that
matter) as an indication that Shinpan’s original text probably lacked such reading
aids as well.
The text also includes the chart comparing the Go-on and Kan-on tones shown in
section 4.1. Following directly after the chart, which includes light and heavy tone
dots for all of the four tones, Ryōson continues: “I will give my personal explanation
of the light and heavy of the four tones shown earlier (in the chart) above” (右先明
二四声軽重一者、私頌云). His explanation (Konishi, 1948: 502-503) is as in (14).
14 The tone theory in Shittan rinryaku-zu-shō
平声重初後倶低、 The heavy ping tone is beginning and ending both falling;
平声軽初昂後低、 the light ping tone is beginning rising and ending falling;
上声重初低後昂、 the heavy shang tone is beginning falling
and ending rising;
上声軽初後倶昂、 the light shang tone is beginning and ending both rising;
去声重初低後偃、 the heavy qu tone is beginning falling
and ending bending down;
去声軽初昂後偃、 the light qu tone is beginning rising
and ending bending down;
入声重初後倶低、 the heavy ru tone is beginning and ending both falling;
入声軽初後倶昂。 the light ru tone is beginning and ending both rising.
但入、久津布千鬼。 However, the ru tone ends in ku, tu, hu, ti or ki.
重通レ平、軽通二上声一。 Heavy corresponds to ping and light corresponds to shang.
四声各軽重八声。 All the four tones have heavy and light,
which makes eight tones.
上重摂二去声之重一、 The heavy qu tone acts in place of the heavy shang tone,
去軽摂二上声之軽一。 and the light shang tone acts in place of the light qu tone.
除二上重去軽一六声。 Removing the heavy shang tone and the light qu tone
leaves six tones.
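The reduction from eight to six tones stated at the end of (14) can be sketched as a small lookup. This is only an illustration under stated assumptions: the dictionary layout, the contour labels and the function names are hypothetical conveniences, not part of Ryōson’s text; only the contours and the two mergers are taken from (14).

```python
# Contours of the eight tones as translated in (14).
# Keys and labels are hypothetical encodings chosen for this sketch.
EIGHT_TONES = {
    ("ping", "heavy"): "falling + falling",
    ("ping", "light"): "rising + falling",
    ("shang", "heavy"): "falling + rising",
    ("shang", "light"): "rising + rising",
    ("qu", "heavy"): "falling + bending down",
    ("qu", "light"): "rising + bending down",
    ("ru", "heavy"): "falling + falling",
    ("ru", "light"): "rising + rising",
}

# Mergers named in (14): merged tone -> tone that acts in its place.
MERGERS = {
    ("shang", "heavy"): ("qu", "heavy"),
    ("qu", "light"): ("shang", "light"),
}


def six_tone_system(eight_tones, mergers):
    """Remove the merged tones, leaving the tones used in practice."""
    return {tone: contour for tone, contour in eight_tones.items()
            if tone not in mergers}


def tone_in_practice(tone, mergers):
    """Return the tone that is actually used for a given theoretical tone."""
    return mergers.get(tone, tone)


six = six_tone_system(EIGHT_TONES, MERGERS)
for tone, contour in six.items():
    print(tone, contour)
```

Under this encoding, a character with a theoretical heavy shang tone is simply looked up under heavy qu, which mirrors the statement that the one tone acts in place of the other.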
7.3.2.4 Anonymous (Tendai school)
Moji-han 文字反, written in the Genkō era (1331-1334) by an unknown author, but
clearly part of the Tendai tradition, was originally kept at the library of the Kōzan-ji
高山寺 temple. It is an elaboration on Myōgaku’s work Han’on sahō.
The first quotation (Kindaichi, 1951:694) contains a description of the four basic
tones:
15 Moji-han on the four tones
平声タヒラカナルコヱ The ping tone is a level tone
上声アガルコヱ The shang tone is a rising tone
去声サルコヱ The qu tone is a going tone
入声イリテサガルコヱ The ru tone is an entering and falling tone
This description appears to be so strongly influenced by the names of the four tones
that it hardly contains any concrete information. The description of the ru tone as
falling however, is interesting. It would agree with the value posited for the heavy ru
tone in Myōgaku’s system. As ping and ru usually have the same pitch, it is perhaps
possible that in practice the ping tone had a falling tone contour as well.
The second quotation deals with the complications in distinguishing the light and
heavy of the shang and qu tones (Konishi, 1948: 496).
16 Moji-han on the difference between shang and qu
上声去声軽重者、安然云、 As to the light and heavy of the shang and qu tones,
according to Annen,
上声字、重ヲバ短声呼、 in case of shang tone characters,
the heavy is pronounced short,
軽ヲバ長呼。 and the light is pronounced long.
去声者,此反セリ云々。 In case of qu tone characters, this is the opposite.
This remark seems to confirm my assumption that the idea of a difference in length
as the distinguishing feature of light and heavy in the shang and qu tones in Chūzan
and Fujiwara Munetada’s tone descriptions goes back to Annen’s text.
Moji-han is often quoted because it contains one of the first indications that a
change in the Japanese tone system was under way: In Moji-han the Japanese word
sima ‘island’, which belongs to tone class 2.3 and which would normally be marked
with 平平 tone dots, occurs marked as 上平 (Ōno, 1950). Kindaichi therefore thinks
that the historical change in which – according to the standard theory – words that
started with sequences of /L/ tone developed /H/ tones at the beginning of the word
(/LL/ > /HL/, /LLL/ > /HHL/ etc.) had already begun in the Kamakura period.
Following Ramsey’s theory on the other hand, a 上平 marking can only be
interpreted as /LH/. In Ramsey’s theory therefore the change is /HH/ > /LH/, and
reflects the development towards a restricted tone language. (Cf. chapter 4 of part I.)
7.3.3 The early Muromachi or Nanboku-chō period (1338-1392)
7.3.3.1 Kenpō 賢宝 (Kogi Shingon school)
Shittan-jiki shōgaku-shō 悉曇字記創学抄 (1380) is a work by the Shingon monk
Kenpō who worked at the Tō-ji 東寺 temple of the Kogi Shingon school in Kyōto.
(As mentioned in section 5.6, the Shingon school had split into two branches, Kogi
Shingon and Shingi Shingon, over a dispute on doctrine in 1299.) In all of his works,
Kenpō devoted himself to committing the teachings of his teacher Gōhō 杲宝 to
writing.
This work contains a tone dot chart (Fig. 6) that is introduced with the remark:
心覚抄云 ‘an excerpt from Shinkaku says’. The chart, as well as the example
characters and the descriptions of the tones, coincide exactly with those of Myōgaku
in Shittan-hi, and in addition there is the following remark, which can also be found
in Shittan-hi:
此等皆是明覚私案也、All this is the personal opinion of Myōgaku,
不 レ可 レ為 レ本 one cannot rely on it.
Konishi (1948:505) therefore concludes that the character 心 must have been a
miscopy of the character 明 written in the cursive script. It appears therefore that it
is the early tone theory of Myōgaku that is being transmitted here in the Shingon
school. (It is mentioned however, in Shittan-jiki shōgaku-shō that in practice most
people used only six tones.)
What I find truly surprising, is that one suddenly relies here on Myōgaku’s early
tone theory with the ill-chosen example characters, when earlier on in the Shingon
school, we have seen beautifully worked out tone systems like those of Shinpan and
Ryōson. I see this as an indication that the great flourishing of Shingon Siddham
scholarship had come to an end.
Figure 6: Tone dot chart with descriptions of the tones in Shittan-jiki shōgaku-shō
Source: Mabuchi (1962:657)
Another work by Kenpō, Shittan shogaku-shō 悉曇初学抄 (date unknown),
includes a small fushihakase chart. The way in which the fushihakase are added to
the consecutive kana signs shows a resemblance with the hakase chart in Dōhan’s
Shittan-jiki kikigaki: Although the marks no longer have a tone dot as their starting
point, the side to which they are added is still determined by the original location of
the tone dots, and it is still easy to see that shang was expressed by a diagonal mark
(backslash), ping by a horizontal mark, and qu by a z-shaped mark. In addition, light
is marked with a diagonal mark (i.e. shang) and heavy with a horizontal mark (i.e.
ping), except in case of the heavy shang tone, where the first kana is marked with a
diagonal mark (forward slash).
The fact that the side of the kana to which the marks are added is still determined
by the former location of the tone dots, shows that this hakase type is still very close
to shōten hakase, and does not belong to the goin hakase type that would later
become typical of the Shingon school.
Figure 7: Fushihakase marks expressing the tones in Shittan shogaku-shō
Source: Mabuchi (1962:659)
7.3.3.2 Anonymous (Tendai school)
According to Konishi, the work Shosha-san shōmyō-shō 書写山声明抄 belongs to
the same tradition as works from the Tendai school like Hoke-kyō on (12th century)
and Dokkyō kuden myōkyō-shū (1284), both mentioned in section 7.1.3. The original
title of this work is missing and the present title was formulated by Konishi based on
a number of remarks in the colophon.
Konishi divides his material into ‘Heian period’, ‘Middle Ages’ and ‘Edo period’,
and this anonymous work is placed in the Middle Ages, which could mean any time
between 1200 and 1600. Some time around the 14th century seems most likely, as
Shosha-san shōmyō-shō contains the following remark (Konishi, 1948:489):
師云、仮名ノ一字ニ去声無レ 之、高野林臨法印ト云時ヨリ始レリ.
According to my teacher, the qu tone does not occur with single kana. This
started from the time of Kōya-san’s Rinrin Hōin.53
This remark in Shosha-san shōmyō-shō refers to the development discussed in
section 4.4. Single-kana characters that had a qu tone dot in the Wa-on readings of
the Tosho-ryō-bon of Ruiju myōgi-shō were later marked with a shang tone dot in
Shinkū’s 14th century pronunciation guide to the Lotus Sutra Hoke-kyō ongi. I have
placed Shosha-san shōmyō-shō just before Shinkū’s work, but it is also possible that
it dates from after Shinkū’s time.
Next there is a description of all the tones (Konishi, 1948:501).
53 Hōin is one of the grades of the priesthood in Japan. I have not been able to identify Rinrin
Hōin.
17 Shosha-san shōmyō-shō on the quasi eight-tone system
平声ハソノ声始ヨリ終マデサガレリ。 The ping tone is falling, from the
beginning of the tone to the end.
妙法之妙是也。 The meu (myoo)54 of meuhohu
(myoohoo) is like this.
上声ハソノ声始ヨリ終マデアガレリ。 The shang tone is rising, from the
beginning of the tone to the end.
蓮花ノ花是也。 The kwe (ke)55 of rengwe (renge) is like
this.
去声ハソノ声始ハサガリテ The qu tone is in the beginning falling
終ハアガレリ。蓮花ノ蓮是也。 and in the end rising. The ren 56 of
rengwe (renge) is like this.
入声ハソノ声サガリタル事ハ In being falling the ru tone
平声ノ様ナレドモ、 is like the ping tone,
終ノカナニフツクチキノ五ノカナ but it is ru tone because as the last kana
有ニヨリテ入声トス。 it has (one of ) the five kana hu tu ku ti
ki.
平声ノ軽音始ハアガリテ然サガレリ。 The light ping tone is first rising and
then falling.57
其字如何。一切ノ切是也。 What kind of character is this? The
sai58 of issai is like this.
入声ノ軽ハ始ヨリ終マデ In being rising from the beginning to
the end,
アガリタル事ハ上声ノ様ナレドモ、 the light ru tone is like the shang tone
入声ノフツクチキノ五ノ字ノアル故ニ but because it has one of the five ru
tone characters hu tu ku ti ki
入声ノ軽トス。 it becomes light ru tone.
他国ノ国、奇特ノ特等、是也。 The koku of takoku and the doku of
kidoku are like this.59
54 In reality this is a qu tone character with a second muddy initial in Middle Chinese, so the Go-
on reading meu is appropriate.
55 In reality this is a ping character with a clear initial in Middle Chinese, so the Go-on reading
kwe (クヱ) is appropriate.
56 In reality this is a ping tone character with a second muddy initial in Middle Chinese. Ren can
be the Go-on or the Kan-on reading.
57 Elsewhere in the text (after some examples of characters with a light ping tone), the following
note is added: 或人、平声ノ軽ハ初ノカナハ上リ後ノカナハ下ル。According to some, the
first kana of the light ping tone is rising, and the second kana is falling.
58 In reality the character sai in issai is a qu tone character with a clear initial in Middle Chinese,
and so the Go-on reading sai is appropriate.
59 Koku is a ru tone character with a clear initial in Middle Chinese (the Go-on and the Kan-on
reading are the same), and doku is a ru tone character with a muddy initial in Middle Chinese. I
have adopted the Go-on reading doku, as all previous examples in this text appeared to refer to
上ノ中ニアル声デハ通声ト名ク。 The tone in the middle above is called a
‘common/general’ tone.
去声ニヨマントモ上声ニヨマントモ Because it can be read as qu tone or
shang tone
心ニ任テヨム故ニ任意ノ声トモ云也。 as you like it, it is also called ‘optional’.
下ノ中ニアル声ヲバ半音ト云。 The tone in the middle below is called a
‘half sound’.
是ハ入声ノフツクチキノ中ノフ Especially characters that end in hu from
among the hu tu ku ti ki of the ru tone,
ノ文字ヲトリワケテ半音ト是ヲ云。 are called ‘half sounds’.
其字如何。法、入、葉等、是也。 Which characters are like this? Hohu (>
hoo), nihu (> nyuu), ehu (> yoo) and
such are like this.
中央ノ声ヲバ唐音ト名ク、 A ‘central tone’ is called Kara-goe
(Kan-on).60
是ヲヨム時ハ其声上声ノ様ナレドモ When reading it, the tone is like the
shang tone,
入声ノフツクチキノフ仮名ヲトリ but especially the kana hu from among
ワケテ、 the hu tu ku ti ki of the ru tone,
サセキラタメテ61半音ト云。 is called (...) a ‘half sound’.
其声ヲ軽クヨム時ハ When this tone is read lightly
唯ダ上リニアガリタルコトハ the fact that it only rises up,
上声ノ様ナレドモ、 is like the shang tone
半音の字ナレバ是ヲ半音ノ軽キ声 but because it is a ‘half sound’ character,
トス。 I consider it the light ‘half sound’.
其ノ声如何トナレバ、無復、 As to what tones are like this, not huu,
枝葉、是。 but siehu (> siyoo)62 is like this.
Go-on readings as well.
60 Although 唐音 nowadays refers to Tō-in, in early usage it was read as Kara-goe and was
another term for Kan-on. (See section 3.10.)
61 This passage may contain a scribal error, as I have not been able to find a translation for this
string of kana signs.
62 A translation of this sentence as “(the huu of) muhuu and (the yoo of) siyoo are like this” seems
obvious, but can only be correct if the writer was under the impression that the spelling of 復
was huhu, just as yoo was spelled as ehu. I have chosen a different translation for the following
reasons: The character yoo 葉 in the well-known compound siyoo (‘branches and leaves’ or
‘side issues, minor details’) was originally a ru tone character ending in -hu (< -p), and is
therefore an appropriate example of a fu-nisshō. The character combination 無復 on the other
hand, is not a well-known compound, but more importantly, I do not see how the character 復
could have been quoted as an example of the fu-nisshō: This character had two readings in
Middle Chinese, one (‘again’) had a qu tone and one (‘resume’, ‘restore’) had a ru tone ending
in -k, and not in -p. The result of the fact that this character had two readings in Middle Chinese,
is that in Japan this character has the Kan-on readings huu (< qu tone) as well as huku (< ru
tone), and the Go-on readings bu (< qu tone) as well as buku (< ru tone). As far as I understand
this passage, it is stressed here that the long vowel in the Kan-on reading huu did not develop
The example characters show that this work deals with the correct pronunciation of
Go-on type loanwords in Japanese texts, and not with the Kan-on tone system used
for the correct recitation of the dhāraṇī.
Although the complicated tones of the Siddham scholars originally applied to
Kan-on only, we see that they were later freely applied (in the reverse) to the Go-on
readings to which the Tendai and Shingon schools had reverted in almost all other
realms of usage. (This even included typical Kan-on features such as the
representation of light qu by means of the bifura dot, and the inclusion of a
heavy/light distinction in the ping and ru tones.)
The passages on the fu-nisshō are rather obscure, but seem to refer to a
difference in the tone of the fu-nisshō depending on whether a character was read
according to Go-on or Kan-on, just as there was such a difference in tone between
the Go-on ru tone and the Kan-on ru tone. (See section 11.1.2.)
7.3.3.3 Shinkū 心空 (Tendai school)
Shinkū (1319-1401) – also known as 真空 – devoted himself to producing all kinds of
pronunciation guides for the Lotus Sutra (Hoke-kyō): Guides in which the characters
are given in the order in which they appear in the Hoke-kyō, guides with the
characters arranged according to the Japanese syllabary, and guides with the
characters arranged according to their radical. His works include Hoke-kyō ongi
法華経音義 (1365-1370), Hoke-kyō onkun 法華経音訓 (1386) and Waten hoke-kyō
倭点法華経.
Some of Shinkū’s tone systems appear to be purely theoretical. The first part of
Hoke-kyō ongi (written around 1365) for instance, includes tone charts with four and
six tones, but also a tone chart with as many as twelve tones. (See section 7.1.1.) To
the four-tone chart, only the briefest possible of descriptions is added (Konishi,
1948:504).
18 Hoke-kyō ongi on the four basic tones
平進 Ping: advancing
上昂 Shang: rising
去初低後昂 Qu: beginning falling ending rising
入フツクチキ Ru: hu, tu, ku, ti, ki
The notes added to the shang, qu and ru tones are conventional, but the character 進
‘to advance’ that has been added to the ping tone appears for the first time. There
can be little doubt that it expresses a level tone contour; ‘to go on as it is’. (In
modern Japanese a level tone contour is still described as heishin 平進.)
out of the ru tone, and can therefore not be compared to a case like ehu > yoo.
To the six-tone chart the following rather mysterious remarks have been added
(Konishi, 1948:504).
19 Unusual features of the six-tone system in Hoke-kyō ongi
上ノ重ハ去ニワタリ Heavy shang goes over to qu
去ノ重ハ平ノ軽ニワタル Heavy qu goes over to light ping
平濁ニ軽音 There are light sounds among the muddy ping
Although the first line is conventional, the next two lines are not. The remark about
heavy qu going over to light ping seems to have been adopted from descriptions of
the relation between Go-on and Kan-on (section 4.1), which is out of place here, as
the first line clearly refers to a merger within Kan-on. The last line seems to echo
Chūzan’s remark in Hoke-kyō shakumon: “Ping tone characters with muddy initials
are light-heavy” (section 7.3.1.1).
These remarks, which are out of place and seem to have been adopted rather
arbitrarily from earlier works, as well as Shinkū’s unusual twelve-tone system
indicate that in the Tendai school as well, the golden age of Siddham studies had
come to an end.
As a final work by Shinkū, I introduce Hoke-kyō onkun 法華経音訓 (1386). The
tone descriptions in Hoke-kyō onkun ostensibly still conform to tradition. The
descriptions of the tones are short (Wenck, 1953:218, Kindaichi, 1951:693) but in
line with what we have come to expect from tone descriptions in the Tendai school.
20 Hoke-kyō onkun on the four basic tones
清声ハ一点 A clear tone has one dot
濁声ハ二点 A muddy tone has two dots
平声ハタル The ping tone falls
上声ハアガル The shang tone rises
去声ハハジメタレテ The qu tone falls in the beginning
ノチニアガル and later rises up
入声ハフツクチキニトドマル The ru tone ends in hu tu ku ti or ki
呉漢ノ声ヲノヲノ異ナリ The Go-on and the Kan-on tones
each differ from each other
With these tone descriptions by Shinkū, I conclude the list of descriptions that can
be regarded as contemporary with the production of the tone dot material on which
our knowledge of the tone system of Middle Japanese is based. In the Muromachi
period the tone descriptions become increasingly hard to follow. The tone dots fall
into disuse, and are replaced by a number of different fushihakase musical notation
systems.
7.4 Overview of the tone descriptions
The tone descriptions in this chapter can be divided into two major types:
Descriptions that concentrate on differences in length, and descriptions that
concentrate on differences in pitch.
7.4.1 Descriptions that concentrate on differences in length
between light and heavy in the shang and qu tones
In the first type, the distinction between heavy and light in the shang and qu tones is
defined in terms of length. (This happens in Hoke-kyō shakumon, Sakumon daitai
and Moji-han.) These descriptions tend to deal with the shang and qu tones only, as
these were considered problematic. The ping tone, which is so important in marking
the tones of Japanese, is not even mentioned, and concrete descriptions of the pitch
of the shang tone, which is equally important in marking the tones of Japanese, are
lacking as well.
I see the descriptions in Hoke-kyō shakumon by Chūzan of the Hossō school and
Sakumon daitai by the courtier Fujiwara Munetada as remnants of an older
interpretation of Annen’s text, which was replaced by Myōgaku’s new way of
defining all the subtones in terms of pitch. (The anonymous work Moji-han from the
Tendai school does not belong to this group, as it is of much later date, but may have
been inspired by it.) Fujiwara Munetada stays closest to Annen’s text, as he only
mentions a length distinction in the qu tone and not the reverse distinction in the
shang tone, as is the case in Hoke-kyō shakumon and Moji-han.
21 The description of shang and qu in Hoke-kyō shakumon and Sakumon daitai
              Hoke-kyō shakumon (976)   Sakumon daitai (1108)
ping          x                         x
light shang   long                      x
heavy shang   short                     平軽 + 上 > qu
light qu      short                     平 (short) + 去 (long) > shang
heavy qu      long                      平 (long) + 去 (short)
ru            x                         x
7.4.2 Descriptions that concentrate on differences in pitch
In the next type, the distinction between the different tones – including the
difference between heavy and light in the shang and qu tones – is defined in terms of
pitch. This type can be divided into descriptions of the eight-tone system and
descriptions of the six-tone system. I present these in two different tables (22 and
23). It is this system which is most relevant to the question of what the tone value of
the tone dots was like when they were used to mark the tones of Japanese.
22 Descriptions of the tones in the eight-tone theory
        Shittan-hi    Shittan yōketsu   Kunshū shittan shii    Shittan jiki-shō   Shittan rinryaku-zu-shō
        (1090)        (1101)            yōketsu-shō (±1140)    (1142)             (1287)
        Myōgaku       Myōgaku           Shingon                Tendai             Shingon
平軽    RF            RF                n.m.                   RF                 RF
平重    FF            FF                去平                   FF                 FF
上軽    n.m.          RR                上上                   RR                 RR
上重    FR (平上)     FR (平上)         平上                   FR                 FR
去軽    n.m.          上去              上去                   R + ‘bent’         R + ‘bending down’
去重    F + ‘bent’    平去              平去                   F + ‘bent’         F + ‘bending down’
入軽    RR            RR                n.m.                   n.m.               RR
入重    FF            FF                n.m.                   n.m.               FF
23 Descriptions of the tones in the six-tone theory
           Shittan-hi    Shittan yōketsu   Shittan rinryaku-zu-shō   Shosha-san shōmyō-shō
           (1090)        (1101)            (1287)                    (Middle Ages)
           Myōgaku       Myōgaku           Shingon                   Tendai
平軽       RF            RF                RF                        RF
平重       FF            FF                FF                        F
上         n.m.          RR                RR                        R
(上重)     ‘as 去’       ‘as 去’           ‘as 去’                   n.m. (as 去?)
去軽       n.m.          n.m.              ‘as 上’                   bifura 63
去         FR (平上)     FR (平上)         F + ‘bending down’        FR
入軽       RR            RR                RR                        R
入重       FF            FF                FF                        F
(フ入声)   n.m.          n.m.              n.m.                      フ入声
In the tables, F stands for a falling tone contour and R stands for a rising tone
contour. Sequences of FF and RR should be interpreted as simple F and R, as the
repetition is no more than a formality stemming from Myōgaku’s transcription
method. If a tone does not exist in a certain tone description, this is indicated by
means of x in the appropriate slot. If the tone exists but is not mentioned, this is
indicated by means of n.m. (‘not mentioned’).
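The reading conventions for the tables can be restated as a short sketch. The function names and the example column are assumptions made for illustration only; the rules themselves (FF and RR count as simple F and R; x marks a tone that does not exist; n.m. marks a tone that exists but is not mentioned) are the ones given above.

```python
def simplify_contour(code):
    """Reduce the purely formal doublings FF and RR to simple F and R.

    All other codes (RF, FR, x, n.m., and so on) are returned unchanged.
    """
    if code in ("FF", "RR"):
        return code[0]
    return code


def is_described(code):
    """True if the slot contains an actual tone description."""
    return code not in ("x", "n.m.")


# Hypothetical example: the Shittan yōketsu column of table 23,
# limited to the slots that use the F/R notation or n.m.
column = {"平軽": "RF", "平重": "FF", "上": "RR", "去軽": "n.m.",
          "入軽": "RR", "入重": "FF"}

simplified = {tone: simplify_contour(code)
              for tone, code in column.items() if is_described(code)}
print(simplified)
```

Read in this way, the light ping tone is the only tone in the example column with a genuinely complex contour.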
63 According to Dokkyō kuden myōkyō-shū (1284) the bifura-ten indicated that light qu was read
as shang. (See section 7.1.3.)
I have not included descriptions that contain only marginal information on the
tones such as Shittan daitai, Shittan kuden and Dokkyō kuden myōkyō-shū, or
descriptions that are identical to the ones that have already been included in the
tables, such as Han’on sahō (repeating the eight-tone system typical of Myōgaku),
Shittan-jiki shōgaku-shō (repeating Myōgaku’s early eight-tone system), Hoke-kyō
onkun (repeating the tone system typical of the later Tendai school) and Shinpan’s
Shittan hiden-ki (in which the eight-tone description is identical to the one in Shittan
rinryaku-zu-shō).
8 Background and analysis of the tone theories
of the Siddham scholars
For an adequate interpretation of the tone theories of the Siddham scholars, and for
an assessment of their value in the reconstruction of the tone system of Middle
Japanese, it is necessary to place these theories in a correct context. I therefore want
to start by dispelling two common misunderstandings concerning the texts
introduced in the previous chapter. The first misunderstanding is that what the
Siddham scholars describe is the tone system of a natural form of Chinese (Late
Middle Chinese). The second misunderstanding is that the tones that were selected
from these tone systems to mark the tones of Middle Japanese were completely
identical in tonal value to the Middle Japanese tones that they were marking.
8.1 The tones of the Siddham scholars do not represent
the tones of LMC
In Annen’s 安然 time (9th century), direct contact with spoken Chinese was still
relatively recent. The tone theories that are contemporary with the period from
which the bulk of the tone dot material stems, however, are strongly theoretical, as
official contact with China had been severed in the mid 9th century. Although
unofficial contact with China through trade was restored not long after, and monks from Zen
schools continued to visit China, the Chinese tones as they were used in the
recitation of the dhāraṇī in the esoteric schools can be shown to go back to
meticulous study of the tone descriptions that had been recorded by Annen. Shittan-zō
悉曇蔵, the classic textbook of Siddham studies in Japan, contained the most
authoritative overview of the tone systems that had reached Japan.
The tone system of Late Middle Chinese was recorded by Annen in an older
form (表 Biao) and in a late 9th century, most likely Chang’an based form (Isei 惟正
and Chisō 智聡). These descriptions by Annen, rather than the oral transmission
started by returnees from China, shaped the tone systems that were passed on from
one generation to the next in the Shingon and Tendai schools.
The interpretations of Annen’s text by later generations of scholars cannot be
regarded as directly reflecting the tones of Late Middle Chinese. The tone systems
of the Siddham scholars were part of an idealized form of Chinese, used in the
context of religious chanting. The complicated sequences of rises and falls that we
find in this very specific realm of usage have little or nothing to do with the standard
language of Tang China.
The idea that the tone systems of the later Siddham scholars form a faithful
representation of the tones of Late Middle Chinese has, however, played an
important role in the interpretation and translation of these tone descriptions by
Kindaichi and others. Kindaichi (1951) argued, for instance, that a tone system that
did not include even one level tone was ‘strange’, and he therefore decided to
translate taru and agaru in the tone descriptions as ‘low’ and ‘high’ instead of
‘falling’ and ‘rising’.
In other words, the mistaken notion that these tone systems represent the tones of
some form of natural spoken Chinese has made modern Japanese scholarship
reluctant to accept the tone descriptions at face value.
I think, however, that there is enough evidence to conclude that the Siddham
scholars’ views do not go back to a faithfully preserved oral transmission of the
pronunciation of Late Middle Chinese. In my opinion it is therefore pointless to try to make
these tone systems resemble the tone system of a natural spoken language. The
Siddham scholars developed their theories on the basis of interpretations and
reinterpretations of the tone descriptions by Annen, as well as on all kinds of
inferences based on the way in which Chinese characters had been used to transcribe
Sanskrit in works on the Siddham script from China.
Although it has often been pointed out that all shōmyō theories after Annen were
based on Shittan-zō, for a long time I did not realize how literally this fact can be
taken. During my study of the Japanese tone theories, I have only gradually come to
the conclusion (much to my own surprise) that there is no evidence for
the existence of multiple, truly independent traditions among the different schools.
There are differences, but all tone systems appear to be based on Annen’s record.
The differences are no more than differences in the interpretation of ambiguities in
Annen’s text. (For instance, heavy and light in the shang and qu tones were interpreted
as differences in length by Chūzan 仲算 and Fujiwara Munetada 藤原宗忠, and as differences in
tone height by Myōgaku and those after him.)
Ideally, in esoteric Buddhism, the mysteries of the faith are transmitted orally
and not written down. The relationship between master and pupil is therefore of the
greatest importance. (Often a master would divulge all his knowledge of the secret
teachings to only one pupil.) But I think that we should not exaggerate the reliability
of orally transmitted tradition. Orally transmitted tradition is only as reliable as its
least accomplished teachers and students; the failure of a single generation of
teachers or students is enough to break the chain of transmission.
There can be no doubt that as the centuries passed, the Shingon and Tendai tone
theories came to rely more and more on Annen’s text. The inclination to check a
deficient oral tradition against a famous and authoritative written record – when such
a thing is available – is apparently quite strong.
The fact that they all looked for guidance in Shittan-zō is understandable when
we consider that Annen’s text contains the only known Japanese descriptions of the
Chinese tones from this period. Except for Yuanhe yunpu, no other text from this
period is ever quoted in the many later Siddham studies, so it is unlikely that there
were others. (In Moji-han 文字反, for instance, almost 500 years after Annen wrote
Shittan-zō, Annen is still quoted as an authority, and the idea that the light shang
tone was long while the heavy shang tone was short, and vice versa in the qu tone, is
still attributed to him.)
Evidence for this reliance on Shittan-zō can be seen in the fact that the Siddham
scholars’ tone systems contain elements that are highly unlikely to
go back to Late Middle Chinese. The realization that all tone theories after Annen
were based on Shittan-zō opens the way to explaining these puzzling characteristics
of the Japanese tone systems as stemming from a reading of Annen’s text by later
generations of scholars who lacked a full understanding of what Annen had
originally meant.
8.1.1 Features that go back to a misinterpretation of Annen’s text
The first indication that the Siddham tone theories go back to later reinterpretations
of Annen’s text can be seen in the way in which the characters in Kan-on are divided
into light and heavy: In Kan-on, the ping tone is the only tone in which the sonorant
(jidaku/second muddy) initials belong to the heavy category.
In most Chinese dialects that have a split into a higher yin and a lower yang
register, syllables with sonorant initials belong to the lower yang group in the ping,
qu and ru tones. In the shang tone this can differ depending on the dialect: in some
dialects they belong to the higher yin group (such as in Mandarin and some northern
Wu dialects), and in others (such as in Cantonese and other Wu dialects) they belong
to the lower yang group. In Japanese Kan-on, on the other hand, characters with
second muddy initials only belong to the heavy (yang) group in the ping tone. In all
other tones, they belong to the light (yin) group.1
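The contrast just described can be laid out as a small lookup. The following Python sketch is only an illustration, not part of the source material: the function names are hypothetical, the treatment of muddy initials as heavy/yang and of all remaining initials as light/yin is an assumption that follows the usual voicing-based split, and `None` marks the shang tone, where the dialects differ.

```python
TONES = ("ping", "shang", "qu", "ru")

def kan_on_register(tone, initial):
    """Light/heavy class in Japanese Kan-on, as summarized in the text."""
    if initial == "muddy":
        return "heavy"
    # Second muddy (sonorant) initials are heavy in the ping tone only.
    if initial == "second muddy" and tone == "ping":
        return "heavy"
    return "light"

def dialect_register(tone, initial):
    """Yin/yang class in most Chinese dialects with a register split.

    Returns None where the outcome depends on the dialect (assumption:
    only second muddy initials in the shang tone vary).
    """
    if initial == "muddy":
        return "yang"
    if initial == "second muddy":
        return None if tone == "shang" else "yang"
    return "yin"

EQUIVALENT = {"heavy": "yang", "light": "yin"}

def second_muddy_mismatches():
    """Tones in which Kan-on departs from the usual Chinese pattern."""
    result = []
    for tone in TONES:
        expected = dialect_register(tone, "second muddy")
        actual = EQUIVALENT[kan_on_register(tone, "second muddy")]
        if expected is not None and expected != actual:
            result.append(tone)
    return result
```

Under these assumptions, `second_muddy_mismatches()` singles out the qu and ru tones as the places where the Kan-on division is unexpected.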
According to Endō, the difference in the division between ping and the other
tones in Japan is an accurate reflection of the way in which these tones were affected
by the yin/yang register split in China: Endō thinks that this is the way in which the
tones split in the dialect of Chang’an as described by Isei and Chisō. He sees
confirmation in the fact that in the modern dialect of the area of Chang’an (Xian),
former ru tone syllables with sonorant initials have merged with the yinping tone,
and not with the yangping tone, as former ru tone syllables with muddy initials
did. (Syllables with sonorant initials in the ping tone have joined the yangping tone.
Cf. Zavjalova, 1983.)
In Isei and Chisō’s time, the merger of ru with ping in this area had of course not
yet taken place. If the later merger of ru tone syllables with sonorant initials with the
yinping tone truly means that in this area, syllables with sonorant initials belonged to
the yin register in the 9th century, then knowledge of this difference between ping
and ru as to the division into the tonal registers must have been preserved through
oral tradition in Japan: It is not mentioned in Isei and Chisō’s descriptions.
1 In Hoke-kyō shakumon 法華経釈文 (976) by the Hossō monk Chūzan for instance, in the ping
tone heavy consists of jidaku and muddy characters, while in the ru tone heavy consists of
muddy characters only.
I find it hard to imagine that syllables with sonorant initials in the ru tone would
have joined the higher register in a tonal split based on voicing, such as the one described by
Isei and Chisō. I therefore assume that the merger pattern of syllables with sonorant
initials in the ru tone in Xian is of a later date, stemming from the period of the
merger of the ru tone with the ping tone. I find it even harder to imagine that such a
relatively insignificant feature of the ru tone would have been faithfully preserved
through oral tradition, when the far more notable fact that the heavy shang tone had
merged with the qu tone in Chang’an (and that Isei and Chisō’s heavy shang
category therefore consisted of second muddy initials) was overlooked. I therefore
think that the unusual division of the characters over the two registers in Japanese
Kan-on has a different origin:
The most likely explanation is that this division goes back to a misinterpretation
of Annen’s text: The fact that in Biao’s ping tone the nu-sounds were heavy (voiced
aspirated) is so clearly presented as something exceptional, that the Siddham
scholars must have automatically assumed that the second muddy initials in all the
other tones were light. (And they were indeed light in Biao’s time – in the sense of
having no breathy voice quality – but not in the tonal sense that the later Siddham
scholars assumed.) The agreement between what is suggested in Annen’s text, and
the actual division that can be seen in Japanese Kan-on is too striking to be a
coincidence, especially as we know that the circles that introduced the heavy/light
distinction in Japan were also avid students of Annen’s work. It is more likely that
the similarity between the division into heavy and light in the ru tone in Kan-on and
the modern dialect of Xian is a coincidence, the result of later developments in the
dialect of Chang’an.
Heavy and light as a tonal distinction did not yet form part of the Late Middle
Chinese standard language that was brought to Japan in the 7th and 8th centuries. No
difference in tone height between heavy and light is mentioned by Biao, and the
distinction is lacking in the modern dialect reflexes of the numerous Kan-on ru tone
words that were adopted as loanwords in the spoken language (cf. section 11.1.2).
This means that the clearly tonal light/heavy distinction in the ru and ping tones in
the standard Kan-on character reading tradition was the result of later developments
in the Japanese tone theories that took place in Japan. The distinction was most
likely introduced on the basis of 9th century reports (brought back by students like Isei and
Chisō) of the register split that had meanwhile taken place in Chang’an.
In other words, the new definition of heavy and light as a tonal distinction was
applied, not only to Annen’s two later tone descriptions (in which the meaning of
the terms had indeed been mostly tonal), but also to Biao’s tone description in which
the term had still referred to voice quality. The nature of the 9th century split ([H]
versus [L]) was projected back onto the division into voice quality categories of the
earlier system mentioned by Biao. Biao’s tradition, which was associated with the
Kan-on reading standard, was now being read as if it had already described the 9th
century tonal split.2
The second misunderstanding of Annen’s text is also related to the division into
heavy and light. As Pulleyblank has pointed out, the dialect that Isei and Chisō
describe is almost certainly the dialect of Chang’an, the capital of Tang China where
they both had lived. The dialect of Chang’an represented the standard dialect that
was adopted throughout all of China, and no one – including these two students
from abroad – would have been interested in recording the tone system of a
provincial dialect. The merger of the heavy shang tone with the qu tone was typical
of the Tang standard language, and we must therefore assume that in the tone system
described by Isei and Chisō as well, shang tone syllables with voiced aspirated
(muddy) initials had merged with the qu tone. As explained in section 6.3, the
separate ‘heavy shang’ category in Isei and Chisō’s tone system can therefore only
have consisted of syllables with second muddy initials.3
Myōgaku’s eight-tone system, and the eight-tone systems after him, were based
on study of Isei and Chisō’s descriptions in Shittan-zō, but as said, in the Siddham
scholars’ tone theories, shang tone characters with second muddy initials are
categorized as belonging to the light group. Only shang tone characters with muddy
initials are regarded as heavy, and they have not merged with the qu tone. This
means that any awareness of the way in which these terms had been applied in Isei
and Chisō’s tone systems, and any awareness of the fact that in the eight-tone system
shang tone syllables with muddy initials had already merged with the qu tone had
been lost.
It was not only Myōgaku who was no longer aware of these things; no one appears to
have preserved this knowledge through oral tradition. As we will see in section 8.1.2,
when we examine the six-tone theory of the Shingon school, it turns out that this
tradition suffers from a similar misunderstanding.
We have to remember that even much of the terminology that Annen used was
not yet clearly defined at the time. The terminology to distinguish between muddy
and second muddy initials for instance, had not yet reached Japan, and without this
distinction it is very hard to arrive at a correct understanding of Annen’s text.
Pulleyblank could check the merger patterns of the modern Chinese dialects that
developed from the Tang standard language, and conclude that (despite Annen’s
misleading description), the heavy shang tone had already merged with the qu tone
in Isei and Chisō’s tone systems. For Myōgaku and scholars like him this was not
possible.
2 In some early Kan-on material in which the tones definitely expressed tonal distinctions (such
as the Sei-on in the Tosho-ryō-bon of Ruiju myōgi-shō 図書寮本類聚名義抄) a register
distinction in the ru tone is still missing (Komatsu, 1971:510, 519-520). This is probably because
Biao’s tradition in Annen’s text mentions a voice quality distinction for the ping tone, but not
for the ru tone. Later however, a heavy/light distinction was acknowledged for the Kan-on ru
tone as well, most likely based on the comments by Annen discussed in section 6.5. See also
section 10.2.
3 Assuming that the merger of shang tone syllables with muddy initials with the qu tone had
already taken place, as Pulleyblank does, also explains why the shang tone, unlike the
ping tone, had only two tonal categories in the ‘double’ tone system of the 9th century.
Another example which shows that the later Japanese tone theorists based
themselves on interpretations of Annen’s text and not on a natural form of Late
Middle Chinese, is the fact that the tonal split into a higher and a lower register
shows up in such a strangely distorted form: We would have expected the heavy
(yang) register to be lower or at least to have a lower onset of the tone, as it derived
from voiced initials, which are known to have a lowering effect on pitch. But instead,
the heavy register is described as starting with a falling tone contour, so it in fact
starts on a higher pitch. The light (yin) register which derived from voiceless initials
is described as starting with a rising tone contour meaning that this register, which
we would have expected to be high, in fact has a low onset of the tone.
This strange feature can be traced back to Myōgaku’s 11th century interpretation
of Annen’s text. It is this feature that had the most decisive influence on the tone
theories of the Siddham scholars. It will therefore be discussed separately in 8.3.2
and subsections.
The fact that Myōgaku’s tone system, and the tone systems after him, were based
on Annen’s record cannot come as a surprise, as Myōgaku’s theories – by his own
admission – were based on thorough research into earlier works on Siddham
phonology (among which Shittan-zō played a central role), and not on oral tradition.
But even before Myōgaku, the descriptions in Shittan-zō seem to have had more
influence than oral tradition.
The only tone description that we have from the period before Myōgaku is
contained in Hoke-kyō shakumon (976) by Chūzan of the Hossō school, the same
school from which the Tosho-ryō-bon manuscript of Ruiju myōgi-shō originated.
Although in the 7th century the Hossō school contained a large group of monks who
had all studied in Chang’an for many years, three hundred years had passed since
then, and Chūzan’s work is in fact deeply influenced by study of Annen’s text: The
division between heavy and light in the ping and ru tones that is visible from the
way in which the tone dots have been added to Chinese characters in Hoke-kyō
shakumon shows the unnatural deviation from what is expected in languages that
have gone through a register split based on voicing.
Secondly, Chūzan’s tone description mentions a heavy/light distinction in the
shang and qu tones that did not yet exist in the 7th century, and the way in which this
difference is defined in terms of length can be traced back directly to Annen’s
description. The yin/yang split in Chinese was in the first place a split between a
higher and a lower register. As a result, some tone contours may have become more
complicated, and differences in length may have developed (this seems to be what
we can infer from the description of the qu tone by Isei and Chisō), but such a
difference in length would have been secondary. To define the difference between
heavy and light in this tone in the first place as a difference in length, most likely
stems from a later reading of Annen’s text, in which far too much weight is given to
(especially) Isei’s description of the qu tone.
Although originally based on the description of the qu tone in Annen’s text, this
definition of heavy and light in terms of length was then also applied (in the reverse)
to the shang tone. (See section 7.3.1.1.)
Another text that still seems to reflect notions that date from before Myōgaku’s
time – even though it was written by a contemporary of Myōgaku – is Sakumon
daitai 作文大体 (1108) by Fujiwara Munetada. In this text too, the difference
between heavy and light in the qu tone is defined in terms of length.
In summary, the following characteristics of the later Kan-on
tone descriptions can be attributed to influence by Annen. First, in all tones other
than the ping tone that have a division into heavy and light, characters with second muddy
initials belong to the light category. As far as I know, all Japanese Kan-on materials
in which the tone dots distinguish between heavy and light show this type of division.
This means that all eight-tone and all six-tone theories in Japan are based on
Annen’s record. (Even the tone dot markings in Hoke-kyō shakumon, which contains
the oldest tone description after Annen, already show this unnatural division in the
ru tone.)
Secondly, the shang tone has a distinction between heavy and light. Characters
with muddy initials in the shang tone have not merged with the qu tone, even though
this merger is characteristic of Late Middle Chinese. This distinction is based on the
fact that Isei and Chisō still mention a heavy shang category. The fact that this
merger is lacking can only go back to a misunderstanding of Annen’s text.
Next, there is the fact that the split into a higher and a lower register shows up in
a distorted form, the origin of which will be discussed in more detail later on in this
chapter.
Finally, before Myōgaku’s purely tonal interpretation of the difference between
heavy and light became standard, heavy and light in the qu (and shang) tones was
defined in terms of length.
It will be clear from the above that I see both the tone systems before Myōgaku
and the tone systems after Myōgaku as being based on Annen’s text. Just as
Myōgaku mentions, the oral tradition had become severely confused in the period
before him, and it seems that all kinds of new interpretations – as long as they were
ostensibly based on Annen’s record – were readily accepted.
8.1.2 The merger of light qu with shang is a Japanese invention
The idea that the light qu tone merged with the light shang tone (just as the heavy
shang tone had merged with the heavy qu tone) is another notion that can only be
found in Japan. Although this merger was not derived directly from Annen’s text,
there are lines in Annen’s text that aided the acceptance of this idea.
This merger is first mentioned by Fujiwara Munetada, who says that he does not
understand it. It cannot be found in Myōgaku’s works.4 Because Fujiwara Munetada
mentions it (even though he does not understand it) and because this merger is
typical of the Shingon school, I assume that the idea originated in the Shingon
and/or Hossō school, and was already around when Myōgaku developed his ideas.
This merger probably developed in these circles in order to bring Biao’s tone
description and Isei and Chisō’s tone descriptions into agreement with each other.
Although light and heavy subtones of shang and qu are explicitly mentioned in
Annen’s last two tone systems, according to Biao, the shang tone was entirely light
as the heavy shang tone had merged with the qu tone. In addition Biao mentions that
the qu tone had no distinction between heavy and light.
In the Chinese works on the Siddham script that the Siddham scholars used, it is
sometimes mentioned that the qu tone was heavy. Originally, this referred to the
breathy voice quality of the qu tone.5 To the scholars who studied the Chinese tones
with the later developed notion that heavy and light referred simply to tone height,
and who had no awareness of the double meaning that these terms still had in
Annen’s work, these remarks must have been hard to understand: A split into light
qu and heavy qu is after all clearly mentioned in the last two 9th century tone systems
that had gone through the yin/yang register split.
In order to solve this apparent contradiction, the merger of heavy shang with qu
developed a logical pendant in the merger of light qu with shang. What now
remained were a completely ‘light’ shang tone (as heavy shang had merged with qu)
and a completely ‘heavy’ qu tone (as light qu had merged with shang).6
This last merger is clearly something that developed in Buddhist circles in Japan,
as the result of a reinterpretation of the last two tone systems described in Annen’s
Shittan-zō (and other material on the Siddham script) by the Siddham scholars. It
does not have any basis in the phonology of Late Middle Chinese.7
4 Myōgaku attempted to reconstruct the eight-tone system of Isei and Chisō because it was this
tone system that was associated with the founders of his own Tendai school in Japan.
5 In Korean dhāraṇī (chinŏn 真言) collections published in the 18th and 19th centuries, Siddham
syllables with voiced aspirated initials like gha, jha, ḍha, dha and bha are also still marked
with the qu tone dot when transcribed into hangul, even though the use of tone marks for
Korean had already died out at the end of the 16th century. In addition, the qu tone dot is used
to transcribe syllables with long vowels, while syllables with short vowels are marked with the
shang tone dot (Rosen, 1974: 129-130, 131). These practices have their roots in the same
Chinese systems of transcribing Sanskrit by means of Chinese characters that the Japanese
scholars were studying.
6 It is therefore probably no coincidence that when the tone dots fell into disuse, the daku-ten
(muddy dot) ended up in the upper-right corner, the former location of the qu tone. The daku-
ten were two paired circles that originally functioned to mark the tone of a syllable as well as
the fact that the initial consonant of the syllable was voiced. (When the tone dots fell into
disuse the first function was lost.)
7 In the dialect of Shanghai, which belongs to the southern Wu dialects, the yangshang tone has
merged with the yangqu tone, and the yinqu tone has merged with the yinshang tone, which is
reminiscent of the Siddham scholars’ tone system, but the division of the initials is different: In
Shanghai, the clear and second clear initials belong to the yinqu category and the muddy and
second muddy initials belong to the yangqu category. In Japan, the clear, second clear and
second muddy initials belong to the yinqu category and only the muddy initials belong to the
yangqu category.
Annen’s rather cryptic remark about the two later tone traditions (或上去軽重稍近),
when read as “Sometimes the heavy and light of the shang and the qu tones is
somewhat similar”, can also be brought into agreement with the idea of a parallel
merger of (heavy) shang with qu and (light) qu with shang.
The Shingon school now had six tones, but these developed in a theoretical
framework that acknowledged light and heavy subtones in all the tones. In other
words, the Shingon six-tone theory is a product of developments that date from after
the 9th century register split in Late Middle Chinese, and was based on the
descriptions of this split by Isei and Chisō in Annen’s text. (In the Shingon six-tone
theory as well, shang tone characters with second muddy initials are light, which
betrays that the Shingon scholars too had no idea of what the yin/yang tonal split in
the 9th century had really been like.) This means that Myōgaku was mistaken when
he associated the Shingon six-tone theory with the period of Amoghavajra.8
I see this example as one more indication that the tone systems of the Siddham
scholars are the result of attempts to bring as many aspects of Biao’s, Isei and
Chisō’s descriptions of the Chinese tones as possible into agreement with each other.
They are not the result of a faithful transmission of one particular natural Chinese
tone system to Japan.
8 See section 7.3.1.2 on Myōgaku’s interpretation of historical changes in the transcription of
Sanskrit.
8.2 The tones of the Siddham scholars do not represent
the tones of Middle Japanese
I have just argued that the tone theories of the Siddham scholars are too Japanized to
reflect the tone system of Late Middle Chinese. This, however, does not mean that
the tones of the Siddham scholars represent the tones of Middle Japanese. This is
nevertheless an idea that seems to have played a role in the interpretation of the tone
dot material from the start.
Everything we know of the Middle Japanese tone system indicates that the basic
tonal oppositions of the language were those of a register tone language, and not
those of a contour tone language like Chinese. Although there were contour tones,
these were rare and most likely the result of contractions (and most likely
lengthened: [R:] and [F:]).
I think it cannot be denied that Kindaichi and Mabuchi were both influenced by
their awareness of this fact, when they decided to translate taru and agaru as ‘low’
and ‘high’ instead of ‘falling’ and ‘rising’. (And as a consequence, they were forced
to translate 偃 as ‘rising up’. Cf. section 9.1.) I think it is also clear that they had the
standard reconstruction of the Middle Japanese tone system – which had been
around at least since the early nineteen thirties – at the back of their minds.
It has to be remembered however, that the Buddhist tone systems were devised
for the correct recitation of the magical formulae in a highly ritualized context, and
not for the purpose of marking the tones of Japanese. That was a by-product that
developed later, but because the tone dot material is so important in the study of the
history of the Japanese language, it is easy to lose sight of this fact.
When the tone descriptions of the Siddham scholars – including the reading notes
that are added to them – are translated literally, and not adapted to the standard
theory in advance, it seems that the Kan-on tones used in the recitation of the
mantras and dhāraṇī from the 11th to the 14th century consisted of rises and falls and
combinations of these. Looking at the Tendai and Shingon tone descriptions
introduced in chapter 7, one gets the impression that the use of tone contours in
these circles was actually exaggerated. As the Siddham scholars’ interest in the Middle
Chinese tones sprang from their interest in a correct pronunciation
of the mantras and dhāraṇī in the context of religious chanting, it is not surprising
that their tone systems were complicated; in fact, it could almost be expected.
The complicated tones developed by the Siddham scholars do not represent the
tones of Late Middle Chinese, but they do not directly reflect the tones of Middle
Japanese either. We should take the descriptions of the tones introduced in the
previous chapter literally, and should not try to adapt them to what we know to be
realistic from the viewpoint of Chinese or Japanese historical phonology.
8.3 The influence of Myōgaku’s innovations
In order to understand how the complicated sequences of rises and falls that we find
in the Siddham tone descriptions developed, we have to discuss the innovations that
are the result of Myōgaku’s work.
Because official contact with China had long ceased in his time, Myōgaku’s
theories have a strong Japanese flavor, often differing profoundly from the original
Chinese practice. This tendency for Japanized theories and systems, which is the
distinctive trait of Myōgaku’s work, is no doubt related to his lack of kuden 口伝 or
oral instruction.
At the root of many of Myōgaku’s new ideas and solutions is the fact that by his
time any awareness that heavy and light had originally referred to
differences in voice quality had been lost. This – in turn – is most likely related to
the decay of the oral transmission in the period preceding Myōgaku: Japanese has no
opposition between aspirated and unaspirated sounds, and without the terminology
developed in modern articulatory phonetics, it is unlikely that the concept of voiced
aspiration could have been adequately conveyed without direct oral instruction by a
teacher.
Myōgaku and the scholars before him consequently interpreted the heavy/light
opposition purely in terms of tone height. The basis for this interpretation was found
in the passages in Shittan-zō that describe the tonal split in the tone system of the 9th
century, where Annen indeed uses the terms in a broader sense. The differences in
pitch that Annen mentions in connection with the terms heavy and light in the two
last tone descriptions included in Shittan-zō came to determine the meaning of these
terms in Japan.
Mabuchi (1963:402) stresses the fact that among the Siddham scholars after
Myōgaku’s time there are none who were not deeply influenced by his work, and
that it is through him that Siddham studies changed completely and became
thoroughly Japanized.
8.3.1 Myōgaku’s fanqie theory
Myōgaku was interested in a correct pronunciation of the dhāraṇī written in the
Siddham script (which is apparent from the titles of his works and his background), and a
central part of this task was to bring the writing systems of Sanskrit, Chinese and
Japanese into the greatest possible agreement with each other. Just as Sanskrit
(Siddham) had been transcribed in China by means of Chinese characters, the
Chinese characters themselves could be transcribed by means of the fanqie method,
and Myōgaku in turn devised a system by which the Chinese fanqie could be
transcribed with Japanese kana graphs. All theoretical works on the fanqie method
in Japan after Myōgaku followed his method. (As I have mentioned earlier, Mabuchi
sees this as an indication that the study of fanqie and Siddham had indeed been in
serious decay when Myōgaku introduced his new theories.)
As there was no direct contact with India, for a correct pronunciation he had to
rely on the Chinese characters with which the dhāraṇī had been transcribed in China.
In order to read the Chinese characters correctly he had to establish a correct reading
of the fanqie spelling method by which the Chinese characters were in turn
transcribed. Although he studied earlier works on Siddham phonology from the
Tendai library, many aspects of his fanqie reading method are original.9
Myōgaku’s transcription of the fanqie was based on the Japanese syllabary and
operated as follows (Mabuchi, 1963:166-171): The character 東 is spelled in the
fanqie method as 徳紅. In the Chinese fanqie spelling method these characters
would have been combined as follows: The initial t- of Middle Chinese 徳 tək
would have been combined with the syllable minus initial -əwŋ of Middle Chinese
紅 ɣəwŋ to spell Middle Chinese 東 təwŋ.
9 In China, the division into an initial and a final was already considered quite complicated. In
四声譜 Sishengpu, which transmitted the fanqie method of the Six-dynasties period (4th - 6th
centuries) and which was quoted in Kūkai’s 空海 Bunkyō hifu-ron 文鏡秘府論 and Annen’s
Shittan-zō, two different methods were introduced as equally possible for the division of
syllables that included semivowels, the chūsei-han 紐声反 (or ‘rhyming spelling’) and the
sōsei-han 双声反 (or ‘alliterating spelling’), without giving a preference. Myōgaku’s method is
at variance with both the chūsei-han and the sōsei-han method.
452 8 Background and analysis of the tone theories of the Siddham scholars
The Kan-on reading of these two fanqie characters is 徳 toku and 紅 kou. As
consonants cannot be separated from vowels in the Japanese kana script, Myōgaku
transcribed the first character of the fanqie spelling with the kana ト(to-) from Kan-
on 徳 toku, combined with a final kana ウ (-u) from Kan-on 紅 kou in order to spell
Kan-on 東 tou. In this example the vowel in the first kana of the first character and
the first vowel after the initial in the second character were the same (o), but in the
next example this is different:
In the case of the fanqie 多動 (タ ta and トウ tou) for the character 董, for instance,
Myōgaku chooses as initial kana the kana ト (to) and not タ (ta) because of the
vowel of the initial kana of the second character. He combines this ト (to) with the
final kana ウ (u) from 動 トウ (tou) in order to spell 董 トウ (tou). In other words,
to transcribe the first character of the fanqie, it is necessary to choose from the five
kana that include the correct initial consonant (in this case タ ta, チ ti, ツ tu, テ te,
ト to) the one that also includes the appropriate vowel, which depends on the second
character.
In the case of the fanqie 陟為 (チヨク tyoku and ヰ wi) for the character 追,
Myōgaku’s method results in the single-kana character reading チ (ti).
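The selection rule illustrated by these three examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Myōgaku's own formulation: readings are given in the romanization used in this section (ti, tu, tyoku, wi), the first letter of the first reading is taken as its initial consonant, and everything from the first vowel of the second reading onward is taken as the final. The names `myogaku_fanqie`, `initial_consonant`, `split_at_first_vowel` and the small `T_ROW` table are hypothetical helpers introduced here, and cases not discussed in the text (for instance vowel-initial first characters) are handled only in the simplest way.

```python
VOWELS = "aiueo"

# Hypothetical lookup for the five t-row kana listed in the text.
T_ROW = {"a": "タ", "i": "チ", "u": "ツ", "e": "テ", "o": "ト"}


def initial_consonant(reading: str) -> str:
    """Return the leading consonant of a romanized reading ('' if vowel-initial)."""
    return "" if reading[0] in VOWELS else reading[0]


def split_at_first_vowel(reading: str) -> tuple[str, str]:
    """Return (first vowel, everything after it) of a romanized reading."""
    for i, ch in enumerate(reading):
        if ch in VOWELS:
            return ch, reading[i + 1:]
    raise ValueError(f"no vowel in reading {reading!r}")


def myogaku_fanqie(first: str, second: str) -> str:
    """Combine two romanized Kan-on readings: the consonant comes from the
    first fanqie character, the vowel and the remainder from the second."""
    consonant = initial_consonant(first)
    vowel, rest = split_at_first_vowel(second)
    return consonant + vowel + rest


print(myogaku_fanqie("toku", "kou"))    # 徳 + 紅, spelling 東
print(myogaku_fanqie("ta", "tou"))      # 多 + 動, spelling 董
print(myogaku_fanqie("tyoku", "wi"))    # 陟 + 為, spelling 追

# The initial kana chosen for 多動 depends on the vowel of 動 (tou), not of 多 (ta).
print(T_ROW[split_at_first_vowel("tou")[0]])
```

The point of the sketch is that the vowel of the first fanqie character never survives: only its consonant row matters, and the second character decides which kana of that row is used.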
8.3.2 Myōgaku divides the tones in two parts
The single most important innovation in the tone systems that can be traced back to
Myōgaku is the fact that he divided the tone of a character into a first part (the jisho-
sei 字初声 or ‘initial tone of the character’) which was determined by the first
character of the fanqie, and a second part (the jishū-on 字終音 or ‘final sound of the
character’), which was determined by the second character of the fanqie.
Just as the two characters of the fanqie have to be combined to form the reading
of the character as a whole, Myōgaku saw the tones as made up of two separable
components that had to be combined in order to make up the tone contour of the
tone as a whole. Each of the two parts had its own tone contour.
The initial tone contour was determined by whether the initial of the first
character of the fanqie was light or heavy. Myōgaku abstracted the concept of light
and heavy from the tone as a whole. While voicing in a natural tone system would
influence the tone height or tone contour of the entire tone, according to Myōgaku
the categories light and heavy only determined whether the tone contour of the ‘initial
tone’ was rising or falling. 10 The second tone contour was determined by the
traditional tonal category of the character.
10 We have seen that in order to understand Annen’s descriptions of the 9th century dialect of
Chang’an – with which contact in his time had still been fresh – it was necessary to take all
kinds of modern linguistic knowledge into consideration. For instance, that the merger patterns
of the tones in the modern Chinese dialects indicate that heavy shang had already merged with
the qu tone in 9th century Chang’an, or knowledge of what the influence of voiced consonants
on the pitch of a following syllable is like. In order to understand Myōgaku’s interpretation of
Annen’s text, it is necessary to consciously disregard this kind of knowledge.
The idea that the tones could be divided into a beginning and an ending part is
suggested by a number of passages in Annen’s text. 11 Annen already used the
technique of describing a tone as a combination of two other tones, and Myōgaku
applied this technique in his own way.
8.3.2.1 The tone contour of the ‘initial tone’
The tone contour that Myōgaku allots to the ‘initial tone’ (which is determined by
whether the initial of the first character of the fanqie was light or heavy) goes back
to his interpretation of Annen’s text. A key passage in the development of the
concepts of light and heavy in Japan is line 22 of Annen’s description. From what
we know of the effect of voiced and voiceless consonants on the pitch of a following
syllable, Annen’s description of the ru tone in 9th century Chang’an can only be read
as: 入有軽重重低軽昂 “Ru tone has light and heavy. Heavy is low and light is
high.”
We can be quite certain that Annen, who still described a natural form of
Chinese, must have intended these characters to be read or understood in this
passage as ‘low’ and ‘high’. But we also know what kind of readings for these
characters were recorded in Ruiju myōgi-shō, and what kind of furigana and
okurigana were later added to these characters by the Siddham scholars, and that –
no matter how Annen originally meant the use of these characters in this context –
they came to be interpreted in Japan as ‘falling’ and ‘rising’: “Ru tone has light and
heavy. Heavy is falling, light is rising.”
It is clear that at some point, this passage was generalized to mean that heavy as
such indicated a falling tone contour, and that light as such indicated a rising tone
contour, not only in the case of the ru tone, but in general. And this is understandable:
After all, the light and heavy ru tones can be said to show the ‘pure’ light and heavy
tone contour, as in Chinese the ru tone is characterized by the fact that it has a
closed syllable, and not by a tone contour of its own. (Neither the Chinese tone
descriptions, such as the one in Yuanhe yunpu, nor the Japanese tone descriptions,
such as those by Annen, ever give any information on a possible tone height or tone
contour for the ru tone as such.)
Moreover, Annen’s remark 或上去軽重稍近, when read as “sometimes, the
shang/qu distinction has some resemblance to the light/heavy distinction” also fits in
well with the idea that light was rising (just as the shang tone) while heavy was
falling (just as the qu tone).
The habit of reading 昂 as ‘rising’ and 低 as ‘falling’ – which is apparent from
the readings of these characters in Ruiju myōgi-shō – appears to be older than
Myōgaku. (See section 9.4.3.) The idea however, that light means ‘beginning rising’
and that heavy means ‘beginning falling’ can definitely be traced back to Myōgaku:
Myōgaku’s works are the first in which these ideas, which would become typical of
the Japanese tone theories, can be found.
11 In Sei’s description for instance: 17 上有軽重 The shang tone has ‘light’ and ‘heavy’ 18 軽似
相合金声平軽上軽 ‘light’ is like combining the ‘light’ ping and the ‘light’ shang tone of Jin
19 始平終上呼之 beginning with the ping tone and ending with the shang tone.
An important reason behind the development of the idea that the tone of a
character can be cut into two parts must have been the need to solve a disturbing
contradiction in Annen’s text in line 2: 表則平直低有軽有重 “According to Biao
ping was straight and low/falling and has light and heavy.”
Biao’s description is in keeping with Pulleyblank’s idea that in Biao’s time, the
distinction had been based on voice quality rather than tone height, but this passage
must have puzzled scholars ever since the awareness that light and heavy originally
referred to voice quality had disappeared.
As is clear from the texts in chapter 7, Myōgaku read the character 低 in all
contexts as taru ‘to droop, to fall’. How then could the ping tone be described as
inherently falling but at the same time have a light (= rising) and a heavy (= falling)
variety? Myōgaku’s idea that light and heavy only influenced the initial part of the
tone, and that the second part of the tone was not affected, solved this contradiction.
As a result of this new view, characters with heavy initials in Japan were thought
to have an initial falling tone contour, while characters with light initials were
thought to have an initial rising tone contour, which in a natural language would
have to be considered extremely unusual. Syllables that originally started with
voiced initials would normally be expected to start on a lower pitch than syllables
that originally started with voiceless initials.
The Siddham scholars’ tone descriptions make no sense when read with a natural
Chinese tone system in mind. Because of this, scholars like Kindaichi (1951) and
Mabuchi (1968) decided to read the character 低 (‘falling’) as ‘low’, and to read the
character 昂 (‘rising’) as ‘high’, despite the fact that the readings in contemporary
dictionaries say otherwise, and despite the fact that when reading notes are added to
these characters, these indicate ‘falling’ and ‘rising’ and not ‘low’ and ‘high’.
It is true that adjusting the translations of these characters makes the Buddhist
tone systems look much more like a form of natural Chinese, but however tempting
this may be, the fact remains that this is not how these characters were read by the
monks themselves in the period when these tone theories proliferated.
The tone systems of the Siddham scholars, and the rules along which they
evolved, have little or nothing to do with the tone systems and rules of natural,
living languages. The highly theoretical nature of these tone systems has been
acknowledged by previous scholarship, but even so, when it comes to interpreting
the tone values described in the Buddhist tone systems, normal linguistic rules are
invoked: Too many contour tones are unnatural, and syllables with heavy initials
must necessarily have a lower onset than syllables with light initials, as if – despite
everything – these highly theoretical tone systems should be subject to normal
linguistic rules.12
12 It is worth quoting the opinion of Günther Wenck on the sound mysticism of the esoteric
schools (1953:208): “It is true that these notions have phonetic facts as their basis; but it is
exactly the distinguishing feature of magical and mystical thinking, that it moves from its basis
in reality in leaps and bounds, and that its leaps leave out the required intermediate stations. It
seems therefore a rather hopeless endeavor from the start, to want to translate the products of a
magical-mystical phonology into objective phonetics.”
8.3.2.2 The tone contour of the ‘final sound’
The way in which Myōgaku describes the tone contour of the second kana with
which a character is transcribed, can be regarded as a description of the traditional
four tones: ping is falling, shang is rising, qu is ‘bent’ and ru ends in hu, tu, ku, ti or
ki.
This definition of the traditional four tones may have been handed down to him,
or it may have been based on his own interpretation of Annen’s text: Because light
(ru) is described by the same character as shang (昂) and heavy (ru) with the same
character as ping (低) in Annen’s text, the tone values of light and heavy and shang
and ping are interconnected. We see for instance that Myōgaku indeed identifies
light directly with shang and heavy directly with ping: Light tones are described by
Myōgaku as ‘beginning as shang’ and heavy tones are described as ‘beginning as
ping’.
The way in which these characters were read in Japan (as contour tones) had
implications for the perceived tone value of all of these concepts together: The idea
that light meant ‘rising’ and heavy meant ‘falling’, and that shang meant ‘rising’ and
ping meant ‘falling’ can be traced back directly to Annen’s text.
Because the qu tone was described as ‘slightly drawn out’, by contrast, the
character 直 ‘straight, immediate’ in the description of the ping and the shang tones
(直低 and 直昂) may have been interpreted as expressing shortness.
8.3.3 Myōgaku’s eight-tone theory: A tone system that had no historical basis
The two tone contours (the beginning one and the ending one) that were derived
from the two characters with which a character was spelled in the fanqie spelling
method were intoned consecutively, and could be each other’s opposite. In case the
first and the second tone contour were the same, I assume that these two tone
contours were contracted to one single falling or rising contour ([F] or [R]).
1 Myōgaku’s eight-tone theory
ping   (light)  R + F
       (heavy)  F + F (= F)
shang  (light)  R + R (= R)
       (heavy)  F + R
qu     (light)  R + ‘bent’
       (heavy)  F + ‘bent’
ru     (light)  R + -hu, -tu, -ku, -ti, -ki
       (heavy)  F + -hu, -tu, -ku, -ti, -ki
The four traditional tones were thus divided into two: One set starting with a rising
tone contour, and one set starting with a falling tone contour, resulting in eight tones.
At first sight, this looks like a reconstruction of the eight-tone system of Isei and
Chisō, but this is deceptive.
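The way the eight tones in the table above arise from two independent choices can be made explicit with a small Python sketch. It is meant only as an illustration: the dictionary and function names are hypothetical, the labels follow the table, and the contraction of two identical contours into one (R + R = R, F + F = F) is the assumption stated in the text, not something attested in Myōgaku's own wording.

```python
# Contour of the 'initial tone', set by the register of the fanqie initial.
INITIAL_CONTOUR = {"light": "R", "heavy": "F"}

# Contour (or ending) of the 'final sound', set by the traditional tone category.
FINAL_CONTOUR = {
    "ping": "F",
    "shang": "R",
    "qu": "bent",
    "ru": "-hu, -tu, -ku, -ti, -ki",
}


def eight_tone(category: str, register: str) -> str:
    """Combine the initial and final contour for one of the eight tones."""
    first = INITIAL_CONTOUR[register]
    second = FINAL_CONTOUR[category]
    if first == second:
        # Assumed contraction: identical contours merge into a single one.
        return first
    return f"{first} + {second}"


for category in FINAL_CONTOUR:
    for register in INITIAL_CONTOUR:
        print(f"{category:5} ({register})  {eight_tone(category, register)}")
```

Seen this way, the eight-tone theory is simply the product of two registers and four categories, which is why it resembles a register split on the surface even though the contours it yields are not those of a natural tone system.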
Since the introduction of the new Late Middle Chinese standard language in the
7th and 8th centuries, the idea that heavy shang had merged with qu had been a given,
and Annen must surely have been aware of the fact that shang tone characters with
muddy initials in Isei and Chisō’s description had merged with the qu tone. There
may initially have been an oral tradition concerning the correct interpretation of Isei
and Chisō’s tone descriptions in Annen’s text, but this tradition had apparently died
out, as Myōgaku was clearly not aware of it. Isei and Chisō’s descriptions must
therefore have been as confusing to Myōgaku as they would have been to us without
Pulleyblank’s analysis: If Isei and Chisō still mention the category heavy shang, it is
a natural mistake to conclude that in their tone systems heavy shang had not merged
with qu, and Myōgaku indeed made this mistake: He concluded that there were tone
systems in which heavy shang and qu had merged, but also tone systems in which
they had not, and that the eight-tone system belonged to the latter group. He thus
reconstructed a tone system which had no historical basis, and which would soon be
simplified again by mergers that made it resemble the simpler six-tone system.
We can therefore say that the complications concerning heavy and light in the
shang and qu tones that can be seen in the Japanese Buddhist tone descriptions stem
directly from Annen’s ambiguous use of the terms heavy and light.
8.3.4 Myōgaku adapts the tone contour of the qu tone in the six-tone theory
In Myōgaku’s descriptions of the eight-tone system the qu tone is ‘bent’ (see
Shittan-hi 悉曇秘 and Han’on sahō 反音作法). In my opinion, the term ‘bent’ for
the jishū-on of the qu tone must have indicated a falling tone contour. It may have
been a traditional notion that Myōgaku took over, or a term introduced by himself,
but in either case, I think that the origins of this designation lie in Annen’s tone
descriptions. In Biao’s tone description for instance, the qu tone is the only tone that
does not have the attributive 直 ‘straight’ attached to it. 13 Furthermore, line 33
(Chisō’s description) mentions that the qu tone is 角引. As the character 角 in
Annen’s text may refer to the third tone of the pentatonic scale, I have adopted the
translation ‘drawn out on a middle pitch’, but the fact that the character 角 itself
means ‘angle, hook’, can also have contributed to the idea in Japan that the qu tone
was ‘bent’.
13 In the tone-length theories on the other hand, ‘straight’ was interpreted as ‘short’, an
interpretation that agreed well with the fact that the qu tone lacked this qualification, and was
described by Annen as ‘slightly drawn out’.
I think there can be no doubt that in Myōgaku’s eight-tone system the qu tone
was falling. After all, Myōgaku calls the qu tone heavy, 14 and we know that he
associated heavy with a falling tone contour.
In Myōgaku’s description of the six-tone theory in Shittan yōketsu 悉曇要決
(and implicitly already in Shittan-hi) however, the tone contour of the (heavy) qu
tone has been adapted, and is suddenly described as falling-rising. To Myōgaku this
was the logical consequence of the fact that in the six-tone system the heavy shang
tone and the qu tone had merged:
2 Myōgaku’s six-tone theory
ping   (light)  R + F
       (heavy)  F + F (= F)
shang           R + R (= R)
qu              F + R
ru     (light)  R + -hu, -tu, -ku, -ti, -ki
       (heavy)  F + -hu, -tu, -ku, -ti, -ki
In Myōgaku’s tone theory – in which the tones were cut into two parts – the heavy
shang tone had a falling-rising tone contour. As this tone was not distinguished from
the qu tone in the six-tone system, it made sense to reconstruct the qu tone in this
system as having a falling-rising tone contour as well. This reconstruction served to
explain why the two tones had merged in the six-tone system, and it was even
possible to find confirmation for the idea in line 35 of Annen’s text, which describes
the qu tone according to Chisō as follows: 直止為軽稍昂為重 “If it stops directly it
is light. If it rises slightly it is heavy.”
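The reasoning by which the qu tone acquires a falling-rising contour in the six-tone theory can be sketched as follows. This is a hedged illustration in Python with hypothetical names: the contour of heavy shang is first computed from the two-part rule (heavy initial gives F, shang final gives R), and that value is then simply assigned to qu because the two tones are not distinguished in the six-tone system. The label "stop ending" is shorthand introduced here for the endings -hu, -tu, -ku, -ti, -ki.

```python
INITIAL = {"light": "R", "heavy": "F"}   # contour of the 'initial tone'
FINAL = {"ping": "F", "shang": "R"}      # contour of the 'final sound'


def contour(category: str, register: str) -> str:
    """Two-part contour, contracted when both halves are the same."""
    first, second = INITIAL[register], FINAL[category]
    return first if first == second else first + second


def six_tone_system() -> dict[str, str]:
    """Derive the six-tone values, letting qu inherit the heavy shang contour."""
    heavy_shang = contour("shang", "heavy")
    return {
        "light ping": contour("ping", "light"),
        "heavy ping": contour("ping", "heavy"),
        "shang": contour("shang", "light"),
        "qu": heavy_shang,  # merged with heavy shang, so it takes over its contour
        "light ru": INITIAL["light"] + " + stop ending",
        "heavy ru": INITIAL["heavy"] + " + stop ending",
    }


for name, value in six_tone_system().items():
    print(f"{name:10} {value}")
```

The sketch makes visible that the falling-rising value of qu is a by-product of the theory itself: nothing in the derivation refers to how the qu tone was actually pronounced.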
We have to remember that this tone system is attributed by Myōgaku to the
Shingon school and to the ‘general public’ (sejin 世人) while he himself preferred
the eight-tone system. It is only much later, when the Tendai school had given up on
the overly complicated eight-tone system devised by Myōgaku, that this tone contour
for the qu tone came into use in the Tendai school.
The fact that Myōgaku attributed the six-tone system with the falling-rising tone
contour for the qu tone to the Shingon school has led to the idea that this was the
tone system that formed the basis of tone dot material from this school. A look at the
tone descriptions of the Shingon scholars themselves however, shows that the
falling-rising contour is never mentioned there. Ironically enough, it was the Tendai
school, and not the Shingon school, that ended up adopting it. (The description in
Shinkū’s 心空 Hoke-kyō onkun 法華経音訓 (1386) confirms that this value for the
qu tone became accepted in the Tendai school.)
14 See for instance the 5th passage from Shittan yōketsu: 故知,去声者即今重音也。“We
therefore know that the qu tone corresponds to the heavy sound here.”
This later Tendai tone system, which included the bifura-ten and the fu-nisshō-
ten, is the quasi eight-tone system that became truly typical of the Tendai school.15
The light qu tone characters in this system are marked with the bifura-ten, and the
heavy shang tone characters have merged with the qu tone.
8.4 Myōgaku’s influence on the Shingon tone theories
According to Mabuchi, there are no Siddham scholars after Myōgaku who were not
influenced by his work, and as far as his fanqie method and his idea of cutting
the tones into two parts are concerned, this is true.
In the Shingon school as well as in the Tendai school for instance, the light ping
tone is described as having a rising-falling tone contour, in agreement with
Myōgaku’s idea of two separate tone contours for the two elements that make up the
tone, the jisho-sei and the jishū-on, which can even be each other’s opposite. The
Shingon school accepted the logic of Myōgaku’s new idea that light meant
‘beginning rising’ and heavy meant ‘beginning falling’. After all, good arguments
for Myōgaku’s analysis could be found in Annen’s text itself: As we have seen, it
provided a solution to the puzzling fact that Biao described the light and heavy ping
tones both as 直低.
Except for the rising-falling light ping tone however, Myōgaku’s influence did
not lead to complicated sequences of contours in the Shingon school. In the Shingon
school after all, light qu characters had merged with the shang tone, just as heavy
shang characters had merged with the qu tone. This means they had a completely
‘light’ shang tone with a rising tone contour and a completely ‘heavy’ qu tone with a
falling tone contour.
The idea of a parallel merger of light qu with shang (which agreed well with
remarks in works on the Siddham script that defined the qu tone as heavy) is already
mentioned in the work of Fujiwara Munetada, and seems to date back to before
Myōgaku’s time. 16 The values of the shang and qu tones in the Shingon school
therefore, did not clash with Myōgaku’s new idea that light tones were ‘beginning
rising’ and heavy tones were ‘beginning falling’.
However, the Shingon school was not influenced by Myōgaku’s idea that in the
six-tone system the qu tone had a falling-rising tone contour. Myōgaku’s six-tone
theory, in which the falling-rising tone contour of the heavy shang tone was
exported to the qu tone with which it had merged, was never accepted in the
15 I have placed the Tendai eight-tone system in the table of the six-tone systems in section 7.4.2,
as this is where it belongs as far as mergers are concerned.
16 Mabuchi (1996:312) thinks that the idea that light qu merged with light shang, parallel to the
merger of heavy shang with heavy qu developed at the end of the Heian period as a logical
consequence of Myōgaku’s ideas. The tendency to make the tone systems more symmetrical
however, could already be seen earlier, in Chūzan’s tone description.
8.5 Summary 459
Shingon school. (As we have seen, it was Myōgaku’s own Tendai school in which
this aspect of his six-tone theory eventually became the norm.)
In the Shingon school (as is for instance clear from Shinpan 信範 and Ryōson’s
了尊 descriptions), the qu tone continued to be described as a falling tone.
(Ryōson’s Shittan rinryaku-zu-shō 悉曇輪略図抄 is from the year 1287, so this
value for the qu tone persisted in the Shingon school at least until the end of the 13th
century.)
Eventually, the Shingon tone theories changed dramatically, but this happened
sometime after the end of the 13th century. As I will argue in chapter 12, this was
most likely the result of the upheaval caused by the leftward shift of the /H/ tone in
the Kyōto type dialects proposed by Ramsey.
8.5 Summary
The eight-tone system is not a direct descendant, or a continuation of the tone
system attributed to Isei and Chisō by Annen. Myōgaku based his eight-tone theory
on the description of the tone systems of Isei and Chisō in Shittan-zō, but as we have
seen, ambiguities in Annen’s text caused him to reconstruct a system that was
fundamentally different from the system Isei and Chisō must have described.
Neither is the six-tone system a direct descendant, or a continuation of the tone
system attributed to Biao by Annen. In the six-tone theory, a heavy shang and a light
qu tone are distinguished in theory, even if they are not used, so that on closer
examination it becomes clear that the six-tone theory presupposes an underlying
eight-tone distinction. The fact that the register distinction is defined in terms of tone
height rather than voice quality also betrays influence of the 9th century tonal split.
The six-tone system as well as the eight-tone system can be traced back to the
tone descriptions of Isei and Chisō in Shittan-zō. In the Shingon school the eight-
tone system was simplified by merging not only heavy shang with qu but also light
qu with shang. In the Tendai school the eight-tone system was simplified by
merging heavy shang with qu and by inventing the bifura-ten. (Although in effect,
the use of the bifura-ten equals a merger of light qu with shang, the Tendai theorists
avoided positing such a merger directly. This may have been out of deference to the
Tendai scholars Myōgaku and Annen, in whose works a merger of heavy shang with
qu is mentioned, but not a merger of light qu with shang.)
The reduction of the number of tones was not only practical. More important,
probably, is that by means of these mergers as many of Annen’s remarks as possible
could be brought into agreement with each other: On the one hand, the Siddham
scholars posited eight tones, as a register split in all the tones is clearly mentioned in
the descriptions of Isei and Chisō, but on the other hand, they reduced this number
again to six, as this brought their tone systems into agreement with the description by
Biao.17
Because the Kan-on tones were preserved artificially, it has been thought that
they were exposed to changes to a lesser degree than would have been the case in a
natural language. The many unnatural features that form part of the Kan-on tone
system however, show that this assumption is not correct.
17 As said (cf. footnote 2), Biao in fact only mentions five tones, but some of Annen’s remarks
later led to the idea that Biao’s tone system had included a heavy and a light variant of the ru
tone.
9 Which reconstruction agrees better with the tones
of the Siddham scholars?
The descriptions in chapter 7 show that the tones of the Siddham scholars consisted
of contour tones and sequences of contour tones. We saw earlier that the Siddham
scholars’ tone systems made no sense when read with a natural Chinese tone system
in mind, but the same is true when they are read with the Middle Japanese tone
system in mind: The three tone dots that were primarily used to mark the tones of
Japanese, namely (heavy) ping [F], (light) shang [R] and (heavy) qu [F:], were the
simpler contour tones, but they were nevertheless contour tones. The basic tonal
opposition of Middle Japanese on the other hand, was between the two level tones
/H/ and /L/.
The tones that were used to mark Middle Japanese do not directly agree with the
Middle Japanese tones in Ramsey’s reconstruction, nor do they agree with the
Middle Japanese tones in the standard reconstruction. For both theories – Ramsey’s
as well as Kindaichi’s – a certain amount of adaptation of the contour tones of the
Siddham scholars to the basically level tone system of Middle Japanese is in order.
The question is now, for which of the two reconstructions of the Middle Japanese
tone system this adaptation can be made in the most convincing way.
9.1 The tones of the Siddham scholars and the standard reconstruction
The solution chosen by Kindaichi, Mabuchi and others, has been to adapt the
translation of the Siddham scholars’ tone descriptions in order to make them include
level tones. In case the descriptions are in Chinese, the characters 低 and 昂 that
appear in many of these texts to describe the ping and the shang tones and heavy and
light, have been translated as ‘low’ and ‘high’, despite the fact that the accepted
readings of these characters at the time point to contour tones, and despite the fact
that the reading notes added to them indicate that they were read as taru ‘to droop, to
fall’ and agaru ‘to go up, to rise’.
Other descriptions are directly in Japanese, and thus less influenced by the
traditional use of the characters 低 and 昂 (which goes back to Annen). In these
descriptions, instead of taru the verb sagaru ‘to go down, to fall’ is often used to
describe the ping tone, which Kindaichi and Mabuchi also translate as ‘low’.
To interpret a term like ‘falling’ as ‘low’ or ‘rising’ as ‘high’ may not seem too
radical a decision. (This is at least how it is presented.) However, the interpretation
of the character 偃 that is used in the Shingon school to describe the qu tone as
‘rising up’ (see Kindaichi 1951:691 and Mabuchi 1962:437, who has the translation
ゆるやかにあがる ‘to rise up gently’) amounts to a complete reversal of the
indicated tone value: As shown in section 7.3.2.2, the readings of this character in
Ruiju myōgi-shō 類聚名義抄 can be translated as ‘to bow down’, ‘to stoop’, ‘to
fall’, ‘to crouch down’ and the like.
This much more fundamental intervention in the indicated tone value in case of
the qu tone was the unavoidable consequence of the earlier decision to interpret the
ping tone as [L] and the shang tone as [H]. As the following examples (which have
already been introduced in section 1.2 of part I) make clear, the qu tone can only be
analyzed as a sequence of a ping tone followed by a shang tone on one syllable:
Ruiju myōgi-shō has nu 去, but also nuu 平上 for ‘marsh’, goma 去上, but also
ugoma 平上上 for ‘sesame’, hagi 去平, but also haagi 平上平 for ‘shank’ (haagi
平上平 is also attested in Dai-hannya-kyō ji-shō 大般若経字抄.) Furthermore,
Shinsen ji-kyō 新撰字鏡 has hii 平上 for ‘shuttle’, which is also attested as hi 去 in
Ruiju myōgi-shō.
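The force of these doublets can be checked mechanically. The following Python sketch, offered only as an illustration, lists the four pairs cited above and tests whether replacing each 去 mark by a two-mark sequence reproduces the attested longer spelling. The names `DOUBLETS`, `expand_qu` and `consistent` are hypothetical, and the data are limited to the attestations mentioned in this paragraph.

```python
# (gloss, marking with the qu dot, marking of the attested longer variant)
DOUBLETS = [
    ("nu / nuu 'marsh'", "去", "平上"),
    ("goma / ugoma 'sesame'", "去上", "平上上"),
    ("hagi / haagi 'shank'", "去平", "平上平"),
    ("hi / hii 'shuttle'", "去", "平上"),
]


def expand_qu(marks: str, contour: str) -> str:
    """Rewrite every qu dot as the given sequence of two tone dots."""
    return marks.replace("去", contour)


def consistent(contour: str) -> bool:
    """True if expanding qu as `contour` yields every attested longer variant."""
    return all(expand_qu(short, contour) == long for _, short, long in DOUBLETS)


print("qu = ping + shang:", consistent("平上"))
print("qu = shang + ping:", consistent("上平"))
```

Only the expansion 去 = 平上 matches all four pairs; the reverse order 上平 matches none of them, which is the distributional point the doublets are cited for.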
The fact that the 去 tone must have consisted of a 平上 contour tone is also
evident from a different type of entry in Ruiju myōgi-shō: Characters that have a
one-kana pronunciation in Wa-on like i 伊 (イ) or hu 不 (フ) and a ping tone in
Kan-on, usually have a qu tone dot in the Wa-on entries of Ruiju myōgi-shō, but
characters that have a two-kana pronunciation in Wa-on like kue or kei 佳 (クヱ, ケ
イ) and a ping tone in Kan-on will have a ping tone dot followed by a shang tone
dot (平上) added to the two consecutive kana signs (Kindaichi 1951:646-648).
Furthermore, the Ruiju myōgi-shō patterns hardly ever show a shang-ping
sequence followed by a shang tone within the word (*上平上)1, whereas a ping-
shang sequence followed by a shang tone within the word (平上上) is very common.
This makes it likely that the qu tone in examples like goma 去上 or siwoni ‘aster’ 去
上上 in Ruiju myōgi-shō consisted of a 平上 contour, but very unlikely that it
consisted of a 上平 contour.
The result of these adaptations is a tone system in which the value of the tones
coincides exactly with the value of the tone used to mark the tones of Middle
Japanese according to Kindaichi’s reconstruction.
It is possible to show however, that the way in which the standard theory
reconciled the contour tones of the Siddham scholars with the level tone system of
Middle Japanese is illogical and cannot be correct. We have already seen that the
reconstruction of the qu tone as [R] in the standard theory goes directly against the
description of this tone by scholars from the Shingon school. Other tones pose
similar problems, as the way in which the tone dots were used to mark contour tones
in general, is in contradiction with the descriptions of the tones by the Siddham
scholars.
1 As has been discussed in part I, in most cases 上平上 tone sequences (/LHL/ in Ramsey’s
reconstruction) had already developed into 平上上 (/LHH/ in Ramsey’s reconstruction) within
the word stem.
1 Kindaichi’s interpretation of the Shingon and Tendai six-tone systems
            Shingon            Tendai            Interpretation
            six-tone system    six-tone system   by Kindaichi
light ping  RF                 RF                HL = [F]
heavy ping  F                  F                 [L]
shang       R                  R                 [H]
qu          F + ‘bent down’    FR                LR/LH2 = [R]
light ru    R                  R                 [H]
heavy ru    F                  F                 [L]
The shang tone for instance, did double service. Tone class 1.2 (which had /F/ tone
in the standard theory) is almost exclusively marked with a 上 tone dot.3 After the
light ping tone dot had fallen out of use, tone class 2.5, which had /F/ tone on the
final syllable in the standard theory, is marked 平上 instead of 平東.
The /F/ toneme did not disappear from the language at that stage. (See section
9.4.1.) Although it is thought that – when followed by a particle – class 1.2 may
have been realized with [H-L] instead of [F-L] pitch, and that class 2.5 may have
been realized with [LH-L] instead of [LF-L] pitch, in isolation the /F/ toneme in
these words is still thought to have been realized with [F:] pitch.4
This means that according to the standard theory, the shang tone dot was used to
mark [F:] as well as [H] pitch. In other words, if you assume (as proponents of the
standard theory do) that the tone described as ‘rising’ by the Siddham scholars
(shang) expressed /H/ tone when it was used to mark the level tones of Japanese,
you are faced with the problem that this tone suddenly expressed a falling tone
contour when it was used to mark the contour tones of Japanese.
Attestations like hi ‘cypress’ and me ‘female’ with 去 as well as 平 tone marks in
the Tosho-ryō-bon of Ruiju myōgi-shō, siwoni ‘aster’ as 平上上 in the Fushimi-
Miyake-bon 伏見宮家本 of Kokin waka-shū, but as 去上上 in Ruiju myōgi-shō, and
eyami ‘epidemic’ as 平平平 but also as 去平平 in Ruiju myōgi-shō, indicate that the
ping tone too occasionally did double service. Although it is clear that the qu tone
was preferred for this purpose, the ping tone dot could be used to mark /R/ tone.
Again, if you assume that the tone described as ‘falling’ by the Siddham scholars
(ping) expressed /L/ tone when it was used to mark the level tones of Japanese, you
2 [LR] represents Kindaichi’s interpretation of the Shingon qu tone (which is interpreted as L +
‘rising up’) and [LH] represents Kindaichi’s interpretation of the Tendai qu tone.
3 Examples of tone class 1.2 being marked with a light ping tone are few but there is one
example of a noun of class 1.2 that has been marked as 上平 in Ruiju myōgi-shō, namely 1.2 kii
‘yellow’. Furthermore, 1.2 e ‘inlet’ is marked with the light ping tone dot in the Tosho-ryō-bon
of Ruiju myōgi-shō and in Wamyō ruiju-shō, and 1.2 na ‘name’ is marked with the light ping
tone dot in the Kanchi-in-bon of Ruiju myōgi-shō (Mochizuki, 1974:668-669).
4 See for instance Suzuki Yutaka’s overview of the different marking strategies in Akinaga et al.
(1998:580-581).
are faced with the problem that this tone suddenly expressed a rising tone contour
when it was used to mark the contour tones of Japanese. Finally, as has already been
pointed out, a tone that is described as ‘bending down’ (qu) is also used to mark a
rising tone contour. It will be clear that something is fundamentally wrong with the
standard interpretation of the Japanese tone dot material.
The way in which the contour tones were marked clearly shows that the Buddhist
scholars associated [R] with [L] and [F] with [H]. This means that Mabuchi and
Kindaichi’s association of ‘falling’ with ‘low’ and ‘rising’ with ‘high’ goes against
the instincts of the very people who marked the Japanese texts with the tone dots.
An important reason why Ramsey’s theory has not been accepted is that the old
Buddhist tone descriptions are thought to rule out his reconstruction of the
value of the Middle Chinese tones in Japan. As we have just seen, however, it takes
quite a bit of adaptation to make these tone descriptions fit into the mould of the
standard theory: It is all too clear that the Siddham tone descriptions have been
squeezed into preconceived ideas of what the tones must have been like.
9.2 The tones of the Siddham scholars and Ramsey’s reconstruction
In order to reconcile the tones of the Siddham scholars with Ramsey’s
reconstruction, the idea that the tones of Myōgaku and the other Siddham scholars
were identical to the tones of Middle Japanese has to be abandoned. The Buddhist
tone systems were devised for the correct recitation of the magical formulae and not
for the purpose of marking the tones of Japanese. Even after the practice of selecting
some of the dots to mark the tones of Middle Japanese had developed, to the monks,
marking the tones of the magical formulae remained the primary function of the tone
dots.
This means that it is not necessary to twist the translations of these tone
descriptions in order to arrive at something that is likely from the viewpoint of
Japanese historical phonology. With this in mind, it is no longer a problem to
translate sagaru or taru simply as ‘falling’ and agaru as ‘rising’. When the Siddham
scholars themselves unequivocally used these terms and not the terms takasi ‘high’
and hikisi/hikusi ‘low’, which were also at their disposal, we should take their
descriptions at face value: This is apparently the way in which these scholars saw
the tones of Chinese, a language with which they had been out of direct contact for
more than 200 years. The complicated tone contours of the Siddham scholars were
appropriate for the recitation of the magical formulae. How these tone contours were
applied to the Japanese language by means of the tone dots, is an entirely different
question.
Kindaichi and Mabuchi approached this question from the viewpoint of meaning.
They judged the meaning of ‘falling’ closest to that of ‘low’ and the meaning of
‘rising’ closest to that of ‘high’.5 The 11th century Japanese scholars on the other
hand, were concerned with what these contour tones sounded like, and with what
they most reminded them of in the basically level tone system of their own language.
To my ears a short falling tone sounds high and a short rising tone sounds low,
and this impression has been confirmed to me by people from different linguistic
backgrounds. It may be that a rise stresses the low onset of the tone, and that a fall
stresses the high onset of the tone. In any case, it is apparently the onset of a contour
tone that leaves the strongest auditory impression, so that an [R] will be more easily
associated with [L] than with [H], and an [F] will be more easily associated with
[H] than with [L].
We have seen that in many of the materials that are contemporary with the
Siddham scholars’ works, the shang tone dot did double service, both as the
indicator of simple shang and of a shang-ping sequence (light ping), and even the
ping tone dot occasionally did double service, both as the indicator of simple ping
and of a ping-shang sequence (qu). This shows that the Siddham scholars too,
associated a contour tone primarily with its onset.
When some of the simpler contour tones of the Siddham scholars were selected,
and adapted to mark the basically level tones of Middle Japanese, I therefore think
this happened in the following way: The heavy ping tone [F] was used to mark /H/,
and the light shang tone [R] was used to mark /L/, as well as the occasional /R/ tone
of Middle Japanese. The ‘drawn out’ heavy qu tone [F:] was used to mark the
occasional /F/ contour tone of Middle Japanese, while the use of the heavy ping tone
for this purpose remained rare.
This means that the tone described as ‘falling’ by the Siddham scholars (ping)
expressed /H/ tone when it was used to mark a level tone, and /F/ tone when it was
used to mark a contour tone. Likewise the tone described as ‘rising’ by the Siddham
scholars (shang) expressed /L/ tone when it was used to mark a level tone, and /R/
tone when it was used to mark a contour tone. The tone described as ‘bending down’
by the Siddham scholars (qu) expressed /F/ tone. There is no question here of the
kind of contradictions that are inherent in the interpretation of Kindaichi and
Mabuchi.
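The comparison between the two interpretations can be made explicit with a small check. The following is only an illustrative sketch: it assumes, as argued above, that a contour tone is associated primarily with its onset, and it tests whether each tone dot marks only tonemes whose onset matches the onset of the contour described by the Siddham scholars. The names ONSET, DESCRIBED, STANDARD, PROPOSED and onset_mismatches are hypothetical labels introduced for this sketch, and the light ping dot is left out for simplicity.

```python
# Onset pitch of each level or contour tone, using the symbols of the text.
ONSET = {"H": "H", "L": "L", "F": "H", "R": "L"}

# Contours as described by the Siddham scholars (Shingon tradition for qu):
# ping 'falling', shang 'rising', qu 'bending down'.
DESCRIBED = {"ping": "F", "shang": "R", "qu": "F"}

# Tonemes that each dot is taken to mark under the two interpretations.
STANDARD = {"ping": ["L", "R"], "shang": ["H", "F"], "qu": ["R"]}
PROPOSED = {"ping": ["H", "F"], "shang": ["L", "R"], "qu": ["F"]}


def onset_mismatches(usage):
    """Return (dot, toneme) pairs whose onset differs from the onset
    of the contour described for that dot."""
    return [
        (dot, toneme)
        for dot, tonemes in usage.items()
        for toneme in tonemes
        if ONSET[toneme] != ONSET[DESCRIBED[dot]]
    ]


if __name__ == "__main__":
    print("standard:", onset_mismatches(STANDARD))
    print("proposed:", onset_mismatches(PROPOSED))
```

Under these assumptions every use of a dot in the standard interpretation conflicts with the onset of its described contour, whereas the interpretation proposed here yields no conflicts. The sketch only restates the argument in the text; it is not independent evidence.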
Mabuchi and Kindaichi’s way to solve the discrepancy between the contour
tones of the Siddham scholars and the basically level tone system of Japanese was to
adapt the translation of the tone descriptions. My solution to this same discrepancy
is to make the adaptation one step later, at the stage of the application of the tones to
the Japanese language. As we have seen, the difference between the outcomes of
these two strategies is fundamental.
5 The fact that they had been biased towards such an interpretation from the start, as such an
interpretation would result in a Middle Japanese tone pattern that resembled that of modern
Kyōto, no doubt played a large role, but they may also have been influenced by the use of the
character 低 for the ping tone, which in modern Japanese has lost the reading tareru (which is
now expressed by the character 垂) and is used exclusively to express hikui ‘low’.
9.3 The Shingon qu tone and the background of Ruiju myōgi-shō
The description of the qu tone (a ping-shang sequence) as ‘bending down’ in the
Shingon tradition is probably one of the most direct arguments from philology for
Ramsey’s theory and against the standard theory. If the qu tone in the tone system of
the Shingon scholars – at least until the end of the 13th century – expressed /F/ tone
([F:] pitch), it inevitably follows that in works associated with this school, ping was
used to mark /H/ tone and shang was used to mark /L/ tone.
A point that now has to be resolved is whether the use of the tone dots in the
most important source on the tone system of Middle Japanese, the dictionary Ruiju
myōgi-shō, was based on Myōgaku’s six-tone system, or on the Shingon six-tone
system (which was less directly influenced by Myōgaku and had preserved the
falling tone contour of the qu tone). To answer this question some information on
the background of Ruiju myōgi-shō is required.
There are two major lineages in manuscripts of Ruiju myōgi-shō that have such
different characteristics that the first one, that of the Tosho-ryō-bon 図書寮本 is often
seen as a separate work. The Tosho-ryō-bon is considered to be (close to) the
original work. This lineage is called the genbon-kei 元本系 or ‘original lineage’, and
is thought to have been the work of a monk of the Hossō 法相 school from around
1080 to 1100 (Tsukishima 1969). Even though the Hossō school belonged to the old
Nara Buddhism (it was founded and supported by the Fujiwara family) it had close
ties with the Shingon school (Komatsu, 1993:24).6
The dictionary received its name by combining the ruiju 類聚 of Minamoto
Shitagō’s 源順 Wamyō ruiju-shō 和名類聚抄 (930) and the myōgi 名義 of Kūkai’s
(774-835) Tenrei banshō myōgi 篆隷萬象名義.
Minamoto Shitagō was an authority on the Chinese classics, while Kūkai was the
foremost scholastic authority in the Shingon school. The title Ruiju myōgi-shō
symbolizes the fact that this dictionary was a work that brought together the fruits of
two fields of learning between which there had so far been no direct contacts. The
aspects of the secular Chinese scholarship that the Buddhist scholars valued and
adopted can be ascertained on the basis of the quotations appearing in the dictionary,
and these indicate that it was the native Japanese readings of Chinese characters
and the sound glosses for Sei-on readings on which the greatest value was set.
In addition, Wa-on readings are also quoted from Dai-hannya-kyō ongi 大般若経音義
(also known as Dai-hannya-kyō ji-shō 大般若経字抄), compiled by
Fujiwara Kintō 藤原公任 (although the ongi of a Buddhist scripture, this was not
6 The Hossō school, although it did not trace its origin back to the 9th century, was nevertheless
also deeply influenced by Annen’s description of the tonal split in 9th century Chang-an. The
tone description by Chūzan in Hoke-kyō shakumon (in section 7.3.1.1) shows that his view of
the difference between light and heavy in the shang and qu tones (which was defined in terms
of length) can probably be traced back to Annen. The way in which the ping and ru tone
characters in Hoke-kyō shakumon are divided into light and heavy also stems from Annen.
the work of a monk), and Dai-hannya-kyō onkun 大般若経音訓 by Shingō 真興
(935-1009), another monk of the Hossō school.
Unfortunately, only the first part of the 法 Hō section has been preserved. It
contains many quotations and character compounds from other famous Buddhist
dictionaries such as Yiqiejing yinyi 一切経音義, Hoke-kyō ongi 法華経音義 and
Dai-hannya-kyō ongi. The Tosho-ryō-bon is famous for its use of the light ping dot
to mark the tone of certain syllables in native Japanese words.
The other manuscripts together form the other lineage, in which the Buddhist
character of the original work has almost completely disappeared. The work that is
considered to have formed the origin of the other lineage is called the kōeki-bon 公
益本 or ‘public use manuscript’. The manuscripts of this lineage concentrate on
single-character entries. The Chinese and Man’yō-gana notes have been replaced by
katakana notes, while notes in Japanese have been multiplied fourfold. The work
that forms the origin of this lineage is thought to have been a revised edition of the
original work made by a monk of the Shingon school around the year 1100, or at the
latest around 1180 (Satō ed. 1977:521-522).
The kōeki-lineage of Ruiju myōgi-shō therefore, is thought to have originated
from the Shingon school. (The Shingon school appears to have been more
instrumental in general in the development of standard Kan-on than the Tendai
school, as the special Tendai Kan-on never became the norm.) Whether the light
ping dot was still used in the original manuscript of the kōeki-lineage is a matter of
debate (cf. section 9.4.1); the qu tone however, is still used to mark syllables with a
falling tone such as nouns of class 1.3b.
The association of Ruiju myōgi-shō with the Shingon school leaves only one
conclusion open: As the qu tone in the tone system of the Shingon scholars
expressed a falling tone contour, the ping tone dot in Ruiju myōgi-shō was used to
mark /H/ tone, and the shang tone dot was used to mark /L/ tone. This is the only
reconstruction that is able to solve the many contradictions that otherwise form part
of the Middle Japanese system to mark the tones. The evidence from philology
confirms the values that Ramsey reconstructed for the ping and the shang tones in
Middle Japanese.
9.4 The light ping tone dot
So far I have passed over a discussion of the light ping tone dot or tō-ten 東点. In
most tone dot material that is contemporary with the Siddham scholars’ works, it is
no longer used, and the complicated rising-falling tone contour of the light ping tone
in the Siddham scholars’ theories after Myōgaku is hard to reconcile with the much
simpler [R] contour that the light ping tone dot must have expressed if we follow
Ramsey’s theory.
9.4.1 The abandonment of the use of the light ping tone to mark Japanese words
The practice of using the light ping tone dot to mark the tones of Japanese appears to
have been quite widespread in the 11th century, but seems to have died out in the
course of the 12th century. If we look only at clearly attested cases of the use of the
light ping tone dot for Japanese words, such as in the Tosho-ryō-bon of Ruiju myōgi-
shō, the use of this tone seems to have been truly marginal:
The use of the light ping dot to mark the tone of certain syllables in Japanese
words was first discovered in the Tosho-ryō-bon of Ruiju myōgi-shō (1081-1100) by
Komatsu Hideo (1959) and later also in Konkōmyō saishōō-kyō ongi 金光明最勝王
経音義 (1079). The light ping tone was used most consistently in the case of the
adjective suffixes -si (shūshi-kei) and -ki (rentai-kei), and the shūshi-kei of the verb
su ‘to do’.
In addition, there are some instances of use of the light ping tone in manuscript
versions of Wamyō ruiju-shō 和名類聚抄 (936) and in the Iwasaki-bon 岩崎本
(1000) and the Maeda-ke-bon 前田家本 (1150) of Nihon shoki 日本書紀. However,
the use of the light ping tone for Japanese words need not always have been the
marginal phenomenon that it appears to be if we only look at these rare attestations.
Komatsu (1971) looked at the way in which the rentai-kei adjective suffix -si had
been marked in several manuscripts of Ruiju myōgi-shō and discovered the
following pattern: In the Tosho-ryō-bon of Ruiju myōgi-shō about 50% had been
marked with the light ping tone, and 50% had been marked with the shang tone. In
the Kanchi-in-bon 観智院本 (1241, 1251) on the other hand, where the light ping
dot is hardly used, 80% was marked with the shang tone and 20% with the ping tone
instead.7 In the Kōzan-ji-bon 高山寺本 85% was marked with the shang tone and
15% with the ping tone.
Kindaichi explains the fluctuation between the shang and the ping tone dots by
arguing that the light ping distinction probably existed in the original manuscripts of
these materials, and in other material as well (for instance in several of the
manuscripts of Nihon shoki that are marked with tone dots), but that later scribes
mistook the only slightly raised light ping tone dot for the normal ping tone and that
the distinction was thus obliterated. In other words, inconsistencies in the use of the
ping and the shang tone can point to an original light ping marking in older
manuscripts that have now been lost (Kindaichi, 1960).
Similar reasoning applies to the original but no longer extant manuscripts of
works like Iroha-ji rui-shō 色葉字類抄 (1180) and Wamyō ruiju-shō 和名類聚抄
(934). (The 2.5 noun mado ‘window’ for instance has 平平 markings instead of the
expected 平上 in the Kōzan-ji-bon of Wamyō ruiju-shō, which could be another
clerical error pointing to an original 平東 marking.)
7 Although tone dots in the location of the light ping dot can be found in the Kanchi-in-bon, the
Kanchi-in-bon includes many instances of tone dots that have strayed a little bit from their
standard place and so it is hard to prove that these dots were consciously placed in the location
of the light ping dot.
This explanation for unexpected occurrences of the ping tone is supported by the
fact that in Ruiju myōgi-shō and annotated manuscripts of Nihon shoki nouns of
class 1.2 can occur with a ping tone dot in isolation (平), while their tone in
combination with a particle will be 上-平. This shows that the original marking of
the nouns in isolation must have been with a light ping tone dot instead of the heavy
ping tone dot that remains today (Wenck, 1959:414).
In later material, like the many Kokin-shū manuscripts marked with tone dots,
these nouns are uniformly marked with a shang tone dot, although the tone of the
particle (which attaches with a ping tone) still shows that the contour tone of this
tone class had not disappeared on the phonological level. In Kokin kunten-shō 古今
訓点抄 for instance, 1.2 na ‘name’ and 1.2 ha ‘leaf’ are marked as na-wo 上-平 and
ha-ni 上-平.8 By contrast, after tone class 1.1, which was also marked with a shang
tone dot, the tone of the particle is shang.
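The diagnostic described here can be written down as a simple rule. The sketch below is a hypothetical illustration only: it assumes that a monosyllabic noun written with a shang dot can be assigned to class 1.2 or class 1.1 purely from the dot on the attached particle, as in the Kokin kunten-shō examples just cited. The function names classify_shang_marked_noun and split_marking are invented for this sketch, and real manuscripts will of course show more variation than this rule allows for.

```python
def split_marking(marking):
    """Split a marking such as '上-平' into (noun dots, particle dot)."""
    noun_dots, particle_dot = marking.split("-")
    return noun_dots, particle_dot


def classify_shang_marked_noun(particle_dot):
    """Classify a monosyllabic noun written with a shang dot (上)
    by the tone dot on the particle that follows it."""
    if particle_dot == "平":
        return "1.2"  # particle attaches with ping: contour-tone class
    if particle_dot == "上":
        return "1.1"  # particle attaches with shang: level-tone class
    return None  # any other dot falls outside the pattern described in the text


if __name__ == "__main__":
    # Markings quoted in the text for Kokin kunten-shō.
    examples = {"na-wo": "上-平", "ha-ni": "上-平"}
    for form, marking in examples.items():
        noun_dots, particle_dot = split_marking(marking)
        if noun_dots == "上":
            print(form, "->", classify_shang_marked_noun(particle_dot))
```

The rule only covers monosyllabic nouns; the similar treatment of class 2.5 would need the final dot of the noun to be inspected instead.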
Komatsu (1993) supports the idea that the mixed shang and ping markings in the
Kanchi-in-bon of the final kana of adjectives ending in -si and -ki indicate that the
light ping distinction probably existed in the original manuscript but was corrupted
by later scribes. Komatsu regards it as proven that the original manuscript of the
Kanchi-in-bon of Ruiju myōgi-shō had the heavy and light distinction in the ping tone
for Japanese words.
As the original manuscript of the kōeki-bon of Ruiju myōgi-shō and the original
manuscript of Iroha-ji rui-shō are both dated at the latest around 1180, we can
perhaps point to the end of the 12th century as the period in which the light ping tone
dot fell out of use as a marker of the tones of Japanese.
While the qu tone dot continued to be used to mark syllables with a falling tone
contour in Japanese – for some reason – the more sophisticated system of tone
markings, in which there had been no ambiguity between /L/ and /R/ tone, was
abandoned. This is despite the fact that the distinction between /L/ and /R/ did not
disappear from the Japanese language at that stage: The presence of final /R/ tone is
still visible when the particle no is attached, and in many materials from this period
(MJ Chūrin and MJ Gairin material) the influence of final /R/ tone can be seen on
the tone of other attached particles as well.
It is not unlikely that in such an environment /R/ tone was realized as [L] tone
with the rise to [H] pitch shifted onto the particle, but the realization without
attached particle was no doubt still [R]. Moreover, in MJ Nairin material, where
there was no tone spreading onto the particles, the realization must have remained
[R], as /R/ tone definitely survived, and did not merge with /L/ tone in these dialects.
(It left a trace in the Kyōto type dialects when the leftward tone shift took place, as
in Kyōto after all, tone classes 1.2 and 2.5 are still distinguished.)
The difference between voiced and voiceless consonants in Japanese – which had
been distinguished in the Man’yō-gana writing system – was no longer expressed in
8 Tone class 2.5 is treated in a similar way: 2.5 tuyu ‘dew’ for instance, is marked as tuyu-wo 平
上-平.
the later kana writing system, although the distinction had not disappeared. This is
evident from the fact that the daku-ten 濁点, which was later invented to express
voiced consonants, was employed in the same syllables and in the same words that
had earlier been written with voiced Man’yō-gana graphs. For some reason it was
not felt necessary to mark the difference for a period of time.
In a similar way, it is possible that the abandonment of the earlier marking
system, in which there had been no ambiguity between /R/ and /L/, came about
because it was no longer felt necessary to mark the difference between the level
tones and the contour tones of the language.
If that were the case however, one would have expected to see a simultaneous
disappearance of the distinction between /H/ and /F/ tone from the marking system.
The qu tone however, continued to be used, and when the use of this tone was
eventually abandoned (some time during the 13th century) it is thought that the
disappearance of the qu tone mark reflects the actual disappearance of the distinction
between /H/ and /F/ tone from the language at that period.
9.4.2 Was the use of the light ping tone abandoned because of Myōgaku’s theories?
The Siddham scholars’ tone systems after Myōgaku are mostly contemporary with
the later system of tone dot markings, in which the use of the light ping dot to mark
the tones of Japanese words had been abandoned. The Tosho-ryō-bon of Ruiju
myōgi-shō however (in which the light ping tone was still used for this purpose), is
contemporary with Myōgaku, but was still uninfluenced by Myōgaku’s new theories.
(The Tosho-ryō-bon is thought to have been written between 1080 and 1100 by a
monk of the Hossō school, in the same period in which Myōgaku had just begun to
develop his theories, which was between 1090 and 1101.)
It may be no coincidence that the use of the light ping tone dot to mark the tones
of Middle Japanese was discontinued as Myōgaku’s theories became accepted. It is
clear that the light ping tone dot was used to mark a contour tone in the tone system
of Middle Japanese, and following Ramsey’s theory this contour tone has to be
reconstructed as /R/. The replacement of the light ping tone dot as a marker of /R/
tone in Japanese by the shang tone may be related to Myōgaku’s new theories, in
which the tones were divided into a beginning and an ending part.
According to Myōgaku and the Siddham scholars after him the light ping tone
had a rising-falling tone contour. There were more such combinations of contour
tones in the Siddham scholars’ tone systems, but these were not selected as markers
of the tones of Middle Japanese. If the use of the light ping tone had continued, it
would have formed the only exception.
It is true that at some point, the qu tone, which – unlike the light ping tone –
continued to be used, developed a falling-rising tone contour in the Tendai tone
theories. In the Shingon school however – from which the important tone dot
material in which the qu tone is used stems – the original falling tone contour of this
tone was preserved. In the case of the light ping tone, on the other hand, the rising-falling
contour reconstructed by Myōgaku was accepted in the traditions of both schools.
If it was the rising-falling tone contour of the light ping tone which made the
light ping tone dot unfit as a marker of syllables with /R/ tone in Middle Japanese,
and if it was for this reason that the shang tone came to be preferred for this purpose,
we must assume that the tone contour of the light ping tone before Myōgaku’s time
was simpler: That it was not only used to mark /R/ tone in Middle Japanese, but that
it really had [R] tone in the Siddham scholars’ tone systems.
The idea that the tone contour of the light ping tone must have been simpler
before Myōgaku makes sense, as the theory that the tones consisted of two parts, a
beginning and an ending, each with its own tone contour, was introduced by him and
cannot be found in earlier works. Also, the remark contained in Shosha-san shōmyō-shō
書写山声明抄, of which the date is uncertain (“According to some, the first
kana of the light ping tone is rising and the second kana is falling”), may be a
reservation stemming from the awareness that the tonal value of light ping had
previously been different.
However, we do not have concrete descriptions of the tonal value of this tone
from the period before Myōgaku. There are of course the descriptions of the Kan-on
tones by Chūzan (976) and Fujiwara Munetada (1108), who were probably still
transmitting notions that were common before Myōgaku’s novel interpretation of
Shittan-zō, but these texts concentrate on complications involving differences in
length in the shang and qu tones, and do not contain descriptions of any of the tones
in terms of pitch.
9.4.3 The origin of the rising contour of the light ping tone and the falling contour of the heavy ping tone in Japan
The rising tone contour of the light ping tone in the period before Myōgaku is hard
to explain if we see the distinction between light and heavy ping in Japan as the
faithful transmission of a distinction adopted directly from a form of Late Middle
Chinese that had gone through the register split. In that case we would have
expected the light ping tone to start on a higher pitch than the heavy ping tone.
The inclusion of heavy/light as a tonal distinction in the Japanese tone theories
was complex. As I have described in section 8.1.1, Isei and Chisō’s report of the
tonal split in 9th century Chang’an was applied to a division into tonal categories that
belonged to an earlier period (Biao’s), in which light and heavy had still referred to
voice quality. The description of the heavy/light distinction in the ping tone by Isei
and Chisō for instance, actually involved a three-way split, but this is not what we
find in the Japanese tone theories.
As soon as the oral tradition explaining these complications in the ping tone had
died out, one had to rely on Annen’s text, and so the definition of light and heavy
was based on the only concrete description of this distinction included in the text,
which was in Isei’s remarks on light and heavy in the ru tone (“light is high/rising,
heavy is low/falling”). In a similar way, the division into categories was based on the
only concrete information on this division contained in the text, which was in Biao’s
remark on light and heavy in the ping tone.9
The distinction between light and heavy ping was now treated as a distinction in
pitch, but it was adopted in a distorted shape: The rising tone contour of the light
ping tone can be traced back to the idea that light meant rising, and the falling tone
contour of the heavy ping tone can be traced back to the idea that heavy meant
falling, which – in both cases – is the opposite of what one would expect in a natural
tone system. In both cases, it can be traced back to a mistaken reading as contour
tones of the characters 昂 and 低.
In Chinese, these characters can be read as ‘high’ and ‘low’ as well as ‘rising’
and ‘falling’. If – especially – the later two tone traditions in Annen’s text are to be
read as descriptions of a natural form of Chinese (which is likely as they stem from
his own time), a mixture of both readings is required. In the description of the light
and heavy ru tones in line 22, they should definitely be read as ‘low’ and ‘high’. In
line 35 however, the character 昂 can only be read as ‘rising’. The reading that
Annen intended when he used these characters to describe the ping and the shang
tones (in Biao’s tradition, lines 2 and 3) remains uncertain, as the text itself – in this
case – contains no clues as to how they must have been read. It seems logical to
assume however, that ping and shang should either both be read as contour tones or
both as level tones, as Annen used parallel expressions for both tones (直低 and 直
昂).
The interpretation of the concepts of heavy and light in Japan goes back to the
description of light as 昂 and heavy as 低. These are the same characters that were
used to describe the shang tone and the ping tone in Biao’s description. As a result,
to the Siddham scholars, the concept of light was connected to the shang tone, and
the concept of heavy was connected to the ping tone: Light and shang either both
meant ‘rising’ or they both meant ‘high’. Likewise, heavy and ping either both
meant ‘falling’ or they both meant ‘low’. The later Japanese tone descriptions show
that for both sets of concepts an interpretation as a contour tone was chosen.
Because the idea that light and heavy mean ‘rising’ and ‘falling’ is clearly
unnatural, the reading of 昂 and 低 as contour tones most likely goes back to their
function as descriptors of shang and ping. The best way to explain the
unnatural interpretation of the concepts of light and heavy in Japan is to assume that it
stems from the fact that light and heavy happened to be described by means of the same
characters as the shang and ping tones in Annen’s text. Although Annen must have
intended these characters to be read differently depending on context, the reading as
contour tones was later generalized to include instances in which these characters
were used to describe light and heavy.
9 The first time since Annen that the three-way split in the ping tone mentioned by
Isei and Chisō was interpreted correctly again is in Pulleyblank’s analysis of Annen’s text
(1978).
The idea that the shang tone was rising already forms part of the Sinologist view
of the Middle Chinese tones, but the idea that the ping tone was falling does not.
Ping is usually reconstructed as a level tone: In a tone system that is assumed to
have had a rising shang tone and a falling qu tone, it makes sense to reconstruct ping
as a level tone.
If the ping tone had level (perhaps [M:]) pitch, it may have been adopted as a
contour tone in Japan simply because it was long: The Japanese tone system was
basically a register tone system. Syllables with a rising or a falling tone contour were
the result of contractions, and most likely lengthened. If the contour tones of the
language were long, while the level tones were short, it is not impossible for the
long ping tone to have been adopted as a contour for that reason alone.
It has to be remembered that Biao’s tones are transmitted to us by Annen, who
lived approximately a century later. The pitch or contour of Biao’s tones in Shittan-
zō is most likely the impression that Late Middle Chinese made on the Japanese ear,
rather than a description of the Chinese tones by a native speaker, which means that
some measure of Japanization may have to be taken into account even in the case of an
early tradition such as the one by Biao.
On the other hand, the reconstruction of ping as a level tone in Middle Chinese is
based on the earliest development of the tones, at the stage when final glottal stop
was being replaced by a rising tone contour and final aspiration by a falling tone
contour.
As mentioned in section 2.3, the names of the tones seem to suggest that shang
was rising and ping was level, but these names were conceived in a different time
and based on a regionally different type of Chinese. Moreover – as pointed out by
Hashimoto Mantarō – these names may mean no more than “a tone just like the tone
of the following example word”.
In other words, we simply do not know what the phonetic realization of the tones
was like at the Late Middle Chinese stage, and we definitely cannot rule out the
possibility that the rising contour of the shang tone and the falling contour of the
ping tone in Japan really go back to the Late Middle Chinese reading standard
introduced by teachers like Biao.10 It is assumed, after all, that the qu tone ended in
aspiration (initially voiceless but later voiced), which means that ping as well as qu
could have been falling (involving most likely an additional difference in length) in
Early and/or Late Middle Chinese, while remaining phonologically distinct.
Many aspects of the Siddham scholars’ tone theories are unnatural, in particular
the developments in the concepts of heavy and light. It may be, however,
that the most basic part of their tone systems, namely the contours of ping, shang
and qu, was adopted faithfully from Middle Chinese.
10 An interesting point in this context, is the fact that the ping tone also had a falling tone contour
in the earlier Wa-on character reading tradition, as ping tone characters – when read in the Wa-
on pronunciation – were later marked ‘in the reverse’ with the qu tone [F:]. (See also section
11.1.1.)
As to the rising tone contour of the light ping tone in the period before Myōgaku:
Even before Myōgaku, in the interpretation of the concepts of light and heavy in
Japan, one relied more on Annen’s written record than on oral tradition. By his own
admission Myōgaku developed his theories without having been introduced to the
oral tradition as – according to him – this oral tradition had ceased to exist by his
time.11 As we have seen, there is plenty of evidence indicating that this was indeed
the case, and that the tone theories in the monasteries deviated strongly from what
the tone system of Late Middle Chinese had really been like. (Cf. the lack of the
merger of heavy shang with qu, the strange division of the initials into light and
heavy and – in some circles – the interpretation of light and heavy in the shang and
qu tones as a difference in length.) To this list can now be added the unnatural
interpretation of light and heavy as rising and falling, and – connected to this – the
interpretation of the light ping tone as rising.
Myōgaku’s innovation was that he interpreted light and heavy as ‘beginning
rising’ and ‘beginning falling’. The logic of Myōgaku’s idea of dividing the tones into
two parts was persuasive, as it resolved the contradiction contained in Biao’s description,
in which the light and heavy ping tones had both been described as ‘low/falling’.
The tone contour of the light ping tone, which had been rising up until then, was
changed accordingly to rising-falling.
11 In chapter 5 on the introduction of Tendai and Shingon in Japan and the development of the
different shōmyō traditions, there is frequent mention of splitting up into chaotically divided
shōmyō schools, reforms and revivals, most notably also before Myōgaku’s time. As the
Siddham scholars’ tone theories and a correct shōmyō practice go hand in hand, it is likely that
similar upheavals and reforms took place in the Siddham scholars’ theories. Some of the
disputes among the shōmyō schools actually may have gone back to disagreements on the
correct recitation of the Late Middle Chinese tones.
10 Stages in the adaptation of the tones of
Late Middle Chinese in Japan
In chapters 8 and 9, I have mentioned different stages in the development of the tone
systems that were based on Late Middle Chinese in Japan. I will now give an
overview of these stages in chronological order.
10.1 The first stage: The tone system of the Han pronunciation
The type of Late Middle Chinese that was introduced in Japan in the 7th and 8th
centuries later developed into the mainstream Sino-Japanese Kan-on reading
pronunciation. This process seems to have started towards the end of the 9th century
(cf. section 3.2.4). The tone system of Late Middle Chinese in Japan before it
developed into a form of Sino-Japanese was not necessarily identical to the tone
system of the later Kan-on, as we have seen that from an early time on, the
development of the latter was influenced by all manner of theoretical considerations.
The tone system of Late Middle Chinese in Japan has to be recognized as a
separate stage, as it is this tone system which must have formed the basis of the
tonal spelling system used in certain parts of the Nihon shoki 日本書紀.1 It may also
have left traces in the tone class that loanwords from Late Middle Chinese belong to
in the modern Japanese dialects. In order to distinguish it from the tone system of
Kan-on, I will refer to this separate stage as ‘the tone system of the Han
pronunciation’. Biao’s 表 tone system in Shittan-zō 悉曇蔵 definitely refers to this
type of ‘foreign Chinese’, and not yet to Kan-on.
1 The Man’yō-gana in the poetry parts of the Nihon shoki are sometimes referred to as ‘Kan-on’,
as they show influence of Late Middle Chinese. Kan-on as a system of Sino-Japanese character
readings however, had not yet developed in the 8th century. This is illustrated by the fact that
the Man’yō-gana that are used to write syllables that include the Old Japanese otsu-e in the
poetry of the Nihon shoki have the vowel /e/ in Middle Japanese, while the readings of the
same characters in standard Kan-on in the Middle Japanese period have the diphthong /ai/. This
means that the otsu-e (reconstructed by Miyake (1999) as [əy], based on the readings that the
Man’yō-gana have in Sino-Korean and Sino-Vietnamese) and the kō-e of Old Japanese had
already merged when Kan-on developed. According to Martin (1987:79) the merger between
kō-e and otsu-e occurred sometime between 800 and 850. As standard Kan-on must have
developed after this time, the Man’yō-gana of the Nihon shoki cannot be equated with standard
Kan-on, and this applies – of course – to the suprasegmental level as well as the segmental
level.
According to Biao’s description, the muddy initials were accompanied by voiced
aspiration and in the ping tone the sonorant (second muddy) initials were included in
this voiced aspirated group. It was also in the ping tone that the voiced aspiration
spread through the entire syllable (probably facilitated by the fact that ping tone
syllables were unchecked) creating a distinction between ping tone syllables that had
a clear voice quality (with clear and second clear initials) and ping tone syllables
that had a breathy voice quality (with muddy and second muddy initials). It is not
entirely certain from Biao’s description whether this voice quality distinction had
also spread to the ru tone.
The select few who learned Chinese directly from the on-hakase at the Daigaku-
ryō in the 7th and 8th centuries were no doubt initiated in the secrets of the
heavy/light voice quality distinction, but this distinction was so alien to the Japanese
phonological system that it was ignored in loanwords that were adopted into the
spoken language in the 7th and 8th centuries. (In terms of differences in pitch, the
Han pronunciation had only four tones.) The tone that these words acquired in
Japanese was most likely based directly on the impression that the pronunciation of
native speakers of Late Middle Chinese left on the Japanese ear. We have no
certainty as to the exact pitches or tone contours of the Han pronunciation, or how
close they were to the tones of the later Kan-on, as Biao’s description leaves many
things unclear.
I have tried to ascertain to what extent the tones of the foreign Chinese Han
pronunciation resembled the tones of the later Kan-on, by looking at the Japanese
tone class that loanwords that may date from this period belong to. If, for instance,
ping tone loanwords and shang tone loanwords from this period belong to the same
Japanese tone class as native Japanese words that were later marked with the ping or
shang tone dots, this would mean that the tones of the Han pronunciation were
relatively close to the values of the Kan-on tones that were later used to mark the
tones of Japanese.
Apart from the fact that it is hard to be sure when a particular word was
borrowed (does it date from the period of the Han pronunciation, or from the later
period of Kan-on?), it turns out that valid data of this type are hard to obtain for
other reasons as well. The number of loanwords with tonal reflexes in the different
dialects that correspond to one of the tone classes of native Japanese nouns is small,
as the tone of most loanwords in the modern dialects shows no connection with the
tone category to which the word belonged in Late Middle Chinese. (This is certainly
the case with neologisms coined from character readings.) Apart from this, as
Okumura (1963) pointed out, disyllabic loanwords that are not part of everyday
speech usually have [HL] tone in all three major tone systems, i.e. Tōkyō 2.4/5,
Kyōto 2.2/3 and Kagoshima (word-tone A). Okumura called this pattern
the ‘basic’ tone.
Furthermore, in the group of frequently used loanwords that do have reflexes that
correspond to one of the tone classes of native Japanese nouns, there is no clear link
between the tone class in Middle Chinese and the tone class in Japanese.
In part, this is due to the fact that the tones of the two main Sino-Japanese
character reading traditions that came to Japan (Go-on and Kan-on) were different:
Words that belonged to the ping tone class in Middle Chinese for instance, may end
up belonging to different Japanese tone classes, depending on whether they were
introduced in Japan from the Go-on or the Kan-on tradition.
Evidence for the tonal values and oppositions of the Han pronunciation can only
be found in loanwords of which the pronunciation unambiguously belongs to the
Late Middle Chinese layer. As on a segmental level, the Go-on and Kan-on
pronunciations are often the same, it is in most cases not possible to separate the two
layers in order to see what the correspondences are like within the separate layers.
As the already limited examples of frequently used Sino-Japanese loanwords
with good cross-dialect correspondences in Kindaichi (1980, 1984), Martin (1987)
and Okumura (1963:49) have to be separated into a Go-on and a Kan-on layer
(whenever possible), the number of valid examples that can be used as evidence for
the tones of the Han pronunciation is further reduced.
Moreover, it turns out that after this sifting is done, the distribution pattern over
the different native Japanese tone classes of the small group of Kan-on loanwords
that remains is too irregular to draw conclusions about the correspondence between
the tone class in Chinese and the tone class in Japanese. And this is also true for the
Go-on layer.
The irregularity of the Go-on reflexes is perhaps not unexpected, as Go-on is
well-known to consist of different layers. The irregularity of the Kan-on reflexes on
the other hand, is more surprising, and may mean that there was indeed a difference
between the way in which the tones of the Han pronunciation were heard and
adopted into Japanese (in the shape of loanwords), and the tones of the later Kan-on.
Loanwords from the period of the Han pronunciation may therefore have a different
tone than loanwords from the period of Kan-on, even though (based on their
segmental shape) they now both count as belonging to the Kan-on layer and cannot
be separated from each other.
An exception is the Middle Chinese ru tone. (See also section 11.1.2.) The tonal
reflexes of Kan-on ru tone words in the different dialects are the same as native tone
class 2.1 with hardly any exceptions. Native words of class 2.1 are marked with 上
上 tone in the tone dot material, which is reconstructed as /LL/ in Middle Japanese
following Ramsey’s theory. 2 There is no trace of a division into heavy and light,
even though standard Kan-on does have such a division in the ru tone. I therefore
conclude that the ru tone in the Han pronunciation had [L] pitch, and the regularity
of the reflexes indicates that it had [L] pitch in the later Kan-on also. Based on the
fact that early Kan-on material does not show a heavy/light distinction in the ru tone
either (cf. section 10.2), we can conclude that the inclusion of this distinction in
Kan-on was a relatively late development.
2 The reflexes of the ru tone in the Go-on layer are the same as native class 2.3, which was
marked 平平 in the tone dot material. The Go-on ru tone therefore, must have been [H].
As mentioned, the tone system of the Han pronunciation is the system that must
have formed the background of the tonal use of the Man’yō-gana in the poetry parts
of Nihon shoki (720). According to Takayama, in the tonal spelling system used in
parts of the Nihon shoki the Late Middle Chinese tones were divided into two groups,
ping versus shang, qu and ru. As Takayama indicates, such a division of the tones
over the two groups was most likely adopted from the rules of Chinese poetry, in
which the ping tone was opposed to all the other tones together. (The level/oblique
opposition used in Chinese poetry is often referred to as hyō-ta 平他 ‘the opposition
between ping and the other tones’.) The writing system of the Nihon shoki is after
all said to have been a novelty; an experiment in which the newest knowledge from
China was being incorporated. This writing system was later abandoned, as the
Man’yō-shū 万葉集 reverted to the spelling system that had also been used in the
Koji-ki 古事記, which was based on the already established Wa-on Sino-Japanese
readings.
The representation of the tones in this tonal spelling system is very irregular, but
the resemblances to the later tone dot markings that nevertheless exist allow
the cautious conclusion that (similar to the later tone dot marking system) the ping
tone of the Han pronunciation was used to mark Old Japanese /H/ tone, while the
shang tone of the Han pronunciation was used to mark Old Japanese /L/ tone.3
We can draw the following conclusions concerning the tones of the Han
pronunciation: The ru tone was realized with [L] pitch. The ping tone was heard as
‘straight/direct’ and ‘falling/low’ in Japan (cf. Biao in Shittan-zō), and used to mark
Old Japanese /H/ tone. The shang tone of the Han pronunciation was heard as
‘straight/direct’ and ‘rising/high’ in Japan (cf. Biao in Shittan-zō) and used to mark
Old Japanese /L/ tone.
10.2 The second stage: The Early Kan-on tone system
After contact with China had been severed in the mid-9th century, the Han
pronunciation started to develop into an indigenous system of Sino-Japanese
readings, the standardized form of Sino-Japanese that later became known as Kan-
on.
From Shittan-zō it is clear that in the 9th century, a split into a [H] and a [L]
register had occurred in the standard dialect of Chang’an in all of the four tones. The
division was between syllables with voiceless (clear and second clear) initials on the
one hand ([H] register), and syllables with voiced (muddy and second muddy) initials
3 The use of the ru tone is rare, and when I looked at the way in which the qu tone had been used
in the material included in Takayama’s paper of 1983, it seemed that in about half of the cases
the qu tone was used for syllables that were later marked with a ping tone dot, and in about half
of the cases for syllables that were later marked with a shang tone dot. As said, the use of the
other tones shows a considerable deviation from the later tone dot markings too.
on the other ([L] register). Shittan-zō, as the classic textbook on Siddham studies in
Japan, was intensively studied. The fact that reports of the tonal split by 9th-century
returnees from China were included in this work resulted in incorporation of the
tonal split in the tone theories of the esoteric schools.
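The register assignment described in this paragraph can be restated as a simple rule. The following is only a minimal sketch in Python: the function name `register` and the English labels for the four classes of initials are illustrative assumptions, not terminology taken from Shittan-zō, and the sketch says nothing about the actual pitch values of the tones.

```python
# Classes of initials as named in the text (labels are illustrative).
VOICELESS_INITIALS = {"clear", "second clear"}
VOICED_INITIALS = {"muddy", "second muddy"}


def register(initial_class: str) -> str:
    """Return the register ("H" or "L") that a syllable falls into
    after the 9th-century split, based only on the class of its initial.
    According to the text, the split applied in all four tones, so the
    tone category itself is not needed as an argument."""
    if initial_class in VOICELESS_INITIALS:
        return "H"
    if initial_class in VOICED_INITIALS:
        return "L"
    raise ValueError(f"unknown class of initial: {initial_class!r}")


if __name__ == "__main__":
    for cls in ("clear", "second clear", "muddy", "second muddy"):
        print(cls, "->", register(cls))
```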
The new register split at first only influenced Tendai and Shingon circles. Tendai
and Shingon were introduced in the 9th century, when the tone split in Chang’an had
already occurred. They were not up against an already established Late Middle
Chinese-based reading tradition, as Buddhist circles until that time still largely held
on to Wa-on. The returning students introduced, along with their new doctrine, the
tone system of the language that they had learned during their study abroad.
A complex and new manner of pronunciation and chanting of the religious texts
probably also functioned to affirm their status as new schools with the latest
knowledge from China, but the most important factor was the importance of a
correct pronunciation of the Chinese characters in the magical formulae. It must
have been this concern that was the main cause behind the development of a
distinction between light and heavy tone dots in Japan (and as far as I know only in
Japan) in Buddhist circles.
By the 11th century, tone dots start to be used to mark the tones of Japanese, and
the use of these tones indicates that a light/heavy distinction in the ping tone that was
definitely tonal, had now been integrated in the tone system of standard Kan-on. I
have called this stage the Early Kan-on tone system.
The introduction of the new tonal interpretation of heavy and light into the Kan-
on tone system seems to have started with the ping tone, as according to Komatsu
(1971:510, 519-520) ru tone characters among the Sei-on in the Tosho-ryō-bon of
Ruiju myōgi-shō 図書寮本類聚名義抄 (late 11th century) and other materials from
the same period were still uniformly marked with the light ru tone dot, irrespective
of the initial. (This means that before the introduction of the tonal split in the ru tone,
this tone had the same pitch as the shang tone.)
The fact that the light/heavy tonal distinction was at first limited to the ping tone
is understandable if Biao’s description (which does not mention light and heavy ru
tones) was still regarded as the standard description of the new pronunciation
standard from China. The distinction between light and heavy in the ru tone must
have been added sometime afterwards.4 (As shown in section 6.5, certain remarks in
Annen’s text give room for the inclusion of this distinction in Biao’s ru tone.)
The voice quality distinction in the ping tone mentioned by Biao was now treated
as a tonal distinction: The light ping tone received the same rising tone contour as
the shang tone, as ‘light’ in Isei’s tone system had been described with the same
character as Biao’s shang tone 昂. The falling tone contour of the heavy ping tone
4 A tone chart included in Konkōmyō saishōō-kyō ongi 金光明最勝王経音義 of 1079 on the
other hand, already shows a light ping and light ru tone dot, so the precise history of the
introduction of the distinction in the ru tone is complex. There seems to have been an Early
Kan-on tone system with five tones and an Early Kan-on tone system with six tones.
remained unchanged, as ‘heavy’ in Isei’s description had been described with the
character 低, which was the same character that was also used to describe the ping
tone in Biao’s tone system.
When the additional heavy/light distinction for the ru tone was introduced into
the Early Kan-on tone system, heavy ru likewise received the same tone as ping and
light ru received the same tone as shang, because Isei’s light ru was described with
the same character as Biao’s shang 昂 and Isei’s heavy ru was described with the
same character as Biao’s ping 低.5
The Early Kan-on tone system is the first tone system that was applied to the
tones of Middle Japanese in the shape of tone dots. This marking system was
characterized by the use of the light ping tone dot to mark syllables with a rising
tone contour in Japanese words. It is this system that formed the basis of early tone
dot material such as the Tosho-ryō-bon of Ruiju myōgi-shō and other works.
The tones were applied in two different ways, depending on the context: One
way was appropriate for Chinese characters used in the transcription of the dhāraṇī,
while the other way was used for marking the tones of Middle Japanese. There can
be no doubt that the second way was also used for Sino-Japanese loanwords that had
become part of the spoken language.
1 The use of the tone dots in the Early Kan-on tone system

              Value of the tone dots           Value of the tone dots
              applied to dhāraṇī               applied to Middle Japanese
light ping    [R] (or [R:])                    [R:]
heavy ping    [F]                              [H]
shang         [R]                              [L]
qu            [F:]                             [F:]
light ru      [R] (-hu, -tu, -ku, -ti, -ki)    [L] (only in Sino-Japanese loanwords)
heavy ru      [F] (-hu, -tu, -ku, -ti, -ki)    [H] (only in Sino-Japanese loanwords)

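For readers who want to work with these values computationally, table (1) can be written as a small lookup structure. This is merely a sketch in Python: the dictionary layout, the context keys `"dharani"` and `"japanese"` and the function names are assumptions made for illustration. Where the table gives an alternative value, both values are listed.

```python
# Values transcribed from table (1). R = rising, F = falling, H = high,
# L = low; ":" marks a long contour. Alternatives are listed in order.
EARLY_KAN_ON = {
    "light ping": {"dharani": ["R", "R:"], "japanese": ["R:"]},
    "heavy ping": {"dharani": ["F"], "japanese": ["H"]},
    "shang": {"dharani": ["R"], "japanese": ["L"]},
    "qu": {"dharani": ["F:"], "japanese": ["F:"]},
    # The ru dots were applied to Japanese only in Sino-Japanese loanwords.
    "light ru": {"dharani": ["R"], "japanese": ["L"]},
    "heavy ru": {"dharani": ["F"], "japanese": ["H"]},
}


def tone_values(dot: str, context: str) -> list[str]:
    """Look up the value(s) of a tone dot in one of the two contexts."""
    return EARLY_KAN_ON[dot][context]


def dots_with_identical_values() -> list[str]:
    """List the dots whose value is the same in both contexts."""
    return [
        dot
        for dot, values in EARLY_KAN_ON.items()
        if values["dharani"] == values["japanese"]
    ]


if __name__ == "__main__":
    print(tone_values("heavy ping", "japanese"))
    print(dots_with_identical_values())
```

In this encoding only the qu dot has exactly the same value in both contexts, which matches the table: the other dots represent contour tones in dhāraṇī recitation but (with the exception of light ping) level tones when applied to Middle Japanese.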
In the Early Kan-on tone system (at least in certain circles) an additional light and
heavy distinction was acknowledged in the shang and qu tones. This distinction was
not defined in terms of pitch but in terms of length, which probably goes back to the
tone descriptions of Isei and Chisō. The clearest description can be found in
Chūzan’s 仲算 Hoke-kyō shakumon 法華経釈文 (976).6 According to Chūzan, light
5 See the remark by the Siddham scholar Nōyo 能誉 (Tendai school) in Dokkyō kuden myōkyō-
shū 読経口伝明鏡集 (1284): 唇内入声重者如平 “Heavy ru-tones ending in a bilabial (i.e. the
category that originally ended in -p in China, but was now developing into an open syllable in
Sino-Japanese) are like the ping tone” (Wenck, 1957:110).
6 Fujiwara Munetada’s 藤原宗忠 description in Sakumon daitai 作文大体(1108) is so hard to
follow that I can only assume it reflects the disarray in the tone theories mentioned by
Myōgaku: Heavy shang is 平軽 + 上 (= [RR]?) and merges with the qu tone. (“This tone is
different from the qu tone in that there is a sharp bend/breach between the tones.”) Light qu is
shang and heavy qu were long, whereas heavy shang and light qu were short. If we
add these distinctions (which surely only applied to characters used in dhāraṇī
transcription) to the Early Kan-on tone system, the additional distinctions look as
follows: light shang [R:], heavy shang [R], light qu [F], heavy qu [F:].
10.3 The third stage: The Later Kan-on tone system
At the end of the 11th century, Myōgaku set out to reconstruct the eight-tone system
that was associated with the founders of the Tendai school in Japan. He based
himself on the descriptions of this tone system by Isei and Chisō, as well as on the
other traditions mentioned by Annen. This tone system is characterized by
Myōgaku’s idea of cutting the tones into two separate parts, each with its own tone
contour. I have called this third stage the Later Kan-on tone system, which can be
divided into a Shingon and a Tendai type, based on the difference that later
developed between the two schools in the tone value of the qu tone. I give the
Shingon system in (2), as the tone dot material on which our knowledge of the tonal
distinctions of Middle Japanese is based stems from this school.
The use of the light ping tone dot to mark /R/ tone in Japanese words was
abandoned, but the qu tone continued to be used to mark /F/ tone. The double tonal
values indicated for the ping and the shang tones when the Later Kan-on tone
system was applied to Middle Japanese are based on the fact that the ping tone dot
and the shang tone dot did double service. (In case they were used to express the two
contour tones of Japanese one can say that they were applied in the same way as
when used for dhāraṇī recitation.)
2 The use of the tone dots in the Later Kan-on tone system

              Value of the tone dots           Value of the tone dots
              applied to dhāraṇī               applied to Middle Japanese
light ping    [RF]                             x
heavy ping    [F]                              [H/F:]
shang         [R]                              [L/R:]
qu            [F:]                             [F:]
light ru      [R] (-hu, -tu, -ku, -ti, -ki)    [L] (only in Sino-Japanese loanwords)
heavy ru      [F] (-hu, -tu, -ku, -ti, -ki)    [H] (only in Sino-Japanese loanwords)

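Because the ping and shang dots did double service in this system, a sequence of tone dots on a Japanese word is in principle ambiguous. The sketch below, in Python, enumerates the readings that table (2) allows for a given sequence of dots. The function name `possible_readings` and the example sequence are hypothetical and do not represent an attested word; in practice the ambiguity has to be resolved on other grounds, such as the reflexes in the modern dialects.

```python
from itertools import product

# Values applied to Middle Japanese, transcribed from table (2).
# The light ping dot is marked "x" there (no longer applied to Japanese
# words), so it is deliberately absent from this mapping.
LATER_KAN_ON_JAPANESE = {
    "heavy ping": ("H", "F:"),
    "shang": ("L", "R:"),
    "qu": ("F:",),
    "light ru": ("L",),  # only in Sino-Japanese loanwords
    "heavy ru": ("H",),  # only in Sino-Japanese loanwords
}


def possible_readings(dots: list[str]) -> list[str]:
    """Enumerate every tone sequence compatible with the given dots."""
    for dot in dots:
        if dot not in LATER_KAN_ON_JAPANESE:
            raise ValueError(f"dot not applied to Japanese words: {dot!r}")
    options = [LATER_KAN_ON_JAPANESE[dot] for dot in dots]
    return [" ".join(combination) for combination in product(*options)]


if __name__ == "__main__":
    # Hypothetical two-syllable marking: shang followed by (heavy) ping.
    for reading in possible_readings(["shang", "heavy ping"]):
        print(reading)
```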
The use of the qu tone dot was influenced to a lesser extent by Myōgaku’s theories
than the use of the light ping dot. The falling nature of the qu tone was after all
preserved in the Shingon school. In works associated with this school, such as the
平 short + 去 long (= [HF:]?) and merges with the shang tone. Heavy qu is 平 long + 去 short
(= [H:F]?). (Cf. section 7.3.1.3.)
kōeki-lineage of Ruiju myōgi-shō (cf. the Kanchi-in-bon 観智院本) the qu tone dot
was still used to mark syllables that had a falling tone contour in Middle Japanese.
10.4 The tone systems used outside the monasteries
It is likely that in secular circles, the four-tone system continued to be used for a
longer time. (See also Hashimoto Mantarō (1978:267) on this point.) The official
government school most likely held on to the dokusho-on 読書音, the established
Late Middle Chinese reading tradition, in which the distinction between heavy and light was not a matter of
pitch. The four-tone system, after all, already had a venerable tradition at the
government school, and it probably never occurred to the officials at the Daigaku-
ryō to introduce a new tone system. This would not have contributed anything new
to the understanding of written Chinese texts or the correct composition of Chinese
poetry and in this context would have been meaningless.
The Confucianist hakase-ke 博士家 were families of hereditary scholars who
continued the tradition of secular study of the Chinese classics which was started by
the official government school. In these families the official Kan-on reading
tradition was the norm, and according to Iida (1955) a system of four tones was in
use in these circles.
According to Konishi on the other hand, eight- and six-tone systems can also be
found in the hakase-ke (1948:494). As, in due course, the scholarly traditions of the
hakase-ke were strongly influenced by the reading practices developed in Buddhist
circles (see for instance the introduction of furigana notes and wokoto-ten), it is
probably safe to assume that the tonal split in the ping and ru tones was adopted in
the hakase-ke under Buddhist influence. (The light and heavy tone dots are after all
definitely a Buddhist invention.)
In the manuscripts of the Kiyohara 清原 and Nakahara 中原 families no tone
charts can be found, but Chūyū-ki 中右記, the diary of Fujiwara Munetada, includes
a tone chart from the Sugawara 菅原 family with six tone dots. (Ping, shang, qu, ru
and two dots above the ping and the ru marks which are not named, but which
clearly indicate a light ping and a light ru tone dot.) This also agrees with the kind of
dots in use in the Ōe 大江 family that are included in the work Wokoto-ten-fu 遠古
登点譜 in possession of the Seika-dō 静嘉堂 library (Konishi, 1948:477).
To the monks of the esoteric schools, the contour tones represented the primary
value of the tones, the value that was appropriate for the recitation of the magical
formulae. In the secular world however, the complicated tone contours of the
Siddham scholars were most likely never used. These circles most likely adopted the
simpler tone value that the tones had when the monks applied them to Middle
Japanese.
11 Miscellaneous issues
This chapter addresses a number of separate issues. The first two have to do with the
difference in tone between Wa-on and Kan-on. The next two issues deal with the
relation between the reconstruction of the tones of Middle Japanese and the tone
systems of Middle Chinese and Sino-Korean. The last section addresses the evidence
contained in the tone of a number of Paekche loanwords in Old Japanese (as attested
in Middle Japanese).
11.1 The Wa-on tones
We have seen in chapter 4, that characters are marked with a different tone,
depending on whether they are read in the Go-on or in the Kan-on pronunciation.
This kind of reversal of the tones between two different character reading traditions
is unique to Japan, as Japan is also the only country in which successive character
reading traditions have been preserved in different realms of usage: Kan-on became
mainstream, but Go-on survived in Buddhist usage in general, while Tendai Kan-on
survived in the Tendai school and Tō-in in Zen.1
The correspondence rules between the tones of Go-on and Kan-on shown in
chapter 4 are the result of a later generalization, and do not give a reliable picture of
what the tonal oppositions of Wa-on were really like. In original Wa-on material the
relationship between the tonal category that a character belongs to in Kan-on and the
reversed Wa-on tone marking is far from regular. This indicates that the Chinese
tones were not introduced in a systematic manner at the time when the Wa-on
pronunciation came to Japan. 2 It is nevertheless possible to make the following
generalization: In genuine Wa-on tone markings the shang tone is hardly ever used:
In general, ping tone characters are marked with the qu tone, and shang and qu tone
characters are both marked with the ping tone.
In the period of the introduction of Wa-on in Japan (the 5th and 6th centuries),
distortions based on theoretical considerations surely played no role yet, and so these
1 As has been discussed in section 6.5, however, there is a difference between the tones of the oldest
loanwords from Chinese in Vietnamese and those of the Sino-Vietnamese character reading
tradition (which is based on Late Middle Chinese). This concerns a reversal of the shang and
qu tones and not, as in Japanese, of the shang/qu tone and the ping tone.
2 This is one more reason why an association between Wa-on and the tradition attributed to Jin in
Annen’s 安然 text is most likely not correct, as it is unlikely that the Wa-on tones go back to a
tradition that can be identified with one particular teacher.
markings may be able to tell us something about the realization of the tones in the
form(s) of Early Middle Chinese that were transmitted to Japan.
It is possible to dismiss the difference between the Wa-on and the later Kan-on
tones by saying that it must stem from Paekche interference, and does not really go
back to a difference between Early and Late Middle Chinese. The Wa-on readings
after all, did not come to Japan directly, but were transmitted for an important part
via the Korean kingdom of Paekche. As information on what kind of influence (if
any) the language of Paekche may have had on the tones of Wa-on is not available,
the story would end there.
It is also possible however, to make use of what we know of differences in the
realization of the tones between Early and Late Middle Chinese and try to make
sense of the reversal in the markings between Wa-on and Kan-on. The available
information is primarily related to differences in vowel length.
11.1.1 Differences in vowel length as the origin
of the reversed Wa-on tone markings?
The parallel way in which Annen describes the ping and shang tones suggests that –
even before the development of a new Sino-Japanese standard – the Late Middle
Chinese ping tone was heard as falling, and the shang tone as rising in Japan (cf.
section 9.4.3).
In the earlier Wa-on, the ping tone had had a falling tone contour as well: in Wa-on,
after all, ping tone characters are marked ‘in the reverse’ with the qu tone dot.
The value of the tone dots is based on the Kan-on tones of the Siddham scholars,
which means that in Wa-on, the ping tone had a long falling tone contour [F:].
The Japanese tone system was a register tone system, in which syllables with a
rising or a falling tone contour were the result of contractions, and most likely
lengthened. (Apart from this, in many languages the syllabic support of contour
tones is automatically lengthened, irrespective of whether they are the result of
contractions or not.) As I have argued in section 9.4.3, if the contour tones of the
language were long, while the level tones were short, it is not impossible for the
long ping tone (in either one or both of the varieties of Middle Chinese) to have been
adopted as a contour in Japan, for that reason alone.
Based on changes in the transcription of Sanskrit by means of Chinese characters,
it is assumed that there were differences in vowel length between the tones of Early
Middle Chinese and Late Middle Chinese. In the Early Middle Chinese-based
Sanskrit transcription system ping tone characters were the favored indicator of
vowel length. The shang tone was the favored indicator of short vowels, but there
are also plenty of examples in the Early Middle Chinese period of the qu tone being
used for this same purpose. By the middle of the Tang dynasty, when the new Late
Middle Chinese standard language (based on the dialect of Chang’an) was replacing
the old standard of the Qieyun 切韻, there was a change in usage in transcribing
Sanskrit. The shang tone remained the preferred indicator of short vowels but for
long vowels the annotation qu yin 去引 ‘qu drawn out’ was now also used.
11.1 The Wa-on tones 485
Although the addition of the word yin presumably means that the qu tone by
itself was not felt to be quite appropriate for indicating vowel length, there must
obviously have been some degree of length involved. 3
To the later Siddham scholars, the description of the qu tone in Biao’s tone
system would probably have been enough to give them the idea that in Kan-on the
qu tone was the longest tone, as only the qu tone had the annotation that it was ‘a
little drawn out’. In combination with the annotation qu yin 去引, which the
Siddham scholars regularly encountered in Chinese works on the Siddham script, the
Kan-on qu tone in the Siddham scholars’ tone systems developed a strong association
with vowel length. 4 This was no different in Korea, where syllables with long
vowels are likewise marked with qu tone dots, and syllables with short vowels are
marked with shang tone dots when Sanskrit dhāraṇī are transcribed into Korean
(Rosen, 1974:131).
As the Siddham scholars were under the impression that the qu tone was the
longest tone, and as the Japanese contour tones were most likely long while the level
tones were short, it is not surprising that qu was used to mark [F:], while ping was
used to mark [H].
In Early Middle Chinese, on the other hand, the ping tone had been long, while
the qu tone had been short. It is therefore not surprising that in Wa-on, the ping tone
was taken over as a contour tone [F:] (later marked with the qu tone dot), while the
qu tone was taken over as [H] (later marked with the ping tone dot).
In other words, if we assume that there was an automatic difference in length
between level tones and contour tones in Old Japanese and Early Middle Japanese,
the reversed markings of the ping and qu tones in Wa-on and Kan-on could stem
from real (and partly perceived) differences in vowel length in certain tones between
Early and Late Middle Chinese.
3 A change from the final voiceless [-h] of the qu tone in Early Middle Chinese to voiced [-ɦ] in
Late Middle Chinese (which has to be assumed to account for the merger of heavy shang with
qu as well) would account for the fact that the qu tone had become somewhat longer
(Pulleyblank, 1978).
4 In addition there are other passages in Shittan-zō that imply vowel length for the qu tone. Endō
(1988:46) for instance, quotes passages from Shittan-zō in which Annen shows transliterations
of Siddham graphs by Kūkai, with additional explanatory notes (cf. ‘Biao’s light shang tone’
and ‘Biao’s qu tone, pronounced long’) added by himself.
a 阿 上声、表上之軽 a shang tone, Biao’s light shang tone
ā 阿 去声、表去、長呼 ā qu tone, Biao’s qu tone, pronounced long
The change from qu tone markings to shang tone markings of single-kana characters in the
work of the Tendai monk Shinkū (section 4.4) is probably due to the strong association with
vowel-length that the qu tone developed, as this made the qu tone unfit as a marker for these
short character readings. The remark in the Tendai work Shosha-san shōmyō-shō, 書写山声明
抄 that the qu tone does not occur with single kana (section 7.3.3.2) refers to the same change
in the practice of adding tone dots to character readings.
As to the shang tone in Wa-on: just like the qu tone, the Wa-on shang tone was
marked with the ping tone dot [H]. We may perhaps assume that it was [H] in
Early Middle Chinese, which is not unlikely for a tone ending in a glottal stop.
In the next section we will see that the ru tone in Wa-on had [H] pitch, and on
this basis, we can probably reconstruct it with [H] pitch in Early Middle Chinese.
For the [H] pitch of the ru tone, there is rather strong evidence from Japanese, but
the rest of the Early Middle Chinese reconstructions in (1) are tentative.
1 Possible relation between the tones of EMC and Wa-on
          EMC              Wa-on
ping      [F:] or [M:]     [F:]
shang     [H]              [H]
qu        [F]              [H]
ru        [H]              [H]
11.1.2 The ru-tone in Wa-on and Kan-on
Because classification of a character as belonging to the ru tone is determined by
segmental features, the ru tone remains marked as ru tone in both Wa-on and Kan-
on. The pitch of the ru tone in both traditions was nevertheless different, which
probably goes back to a different realization in the two varieties of Middle Chinese.
Chūzan 仲算 makes no mention of a difference between the Wa-on (‘Tsushima-
on’) ru tone and the Kan-on ru tone in Hoke-kyō shakumon 法華経釈文 (976), but
according to Shinren 心蓮 in Shittan kuden 悉曇口伝 (1180), the Kan-on ru tone
was always light while the Go-on ru tone was always heavy: 於漢無入重於呉無入
軽也 “Kan-on has no heavy ru tone and Go-on has no light ru tone.” The fact that in
the Sei-on readings in the Tosho-ryō-bon, ru tone characters with heavy as well as
light initials are marked with the light ru-tone dot is in agreement with this
observation (Komatsu, 1971:510).
In the modern Japanese dialects, ru-tone loanwords are divided into two groups,
those that have the tone of class 2.1 and those that have the tone of class 2.3. This
division is based on whether they belong to the Go-on or to the Kan-on layer of
Sino-Japanese. The division is not based on the initial of the character: Kan-on ru-
tone loanwords mostly belong to class 2.1, irrespective of the initial of the character.
Go-on ru-tone loanwords, on the other hand, mostly belong to tone class 2.3,
likewise irrespective of the initial of the character.
Shinren’s remark was therefore clearly based on the observation of a real tonal
difference between the ru tone in Wa-on/Go-on and Kan-on in his time. (As far as I
know, Okumura (1963:53) was the first modern linguist to draw attention to the fact
that the ru tone in Go-on and the ru tone in Kan-on also have different reflexes in the
modern dialects.)
The examples in (2) of the reflexes of Kan-on ru tone loanwords have been
adopted from Okumura (1961, 1963) and Kindaichi (1984). The examples of the
reflexes of Go-on ru tone loanwords in (3) have been adopted from Okumura (1961,
1963), Kindaichi (1980, 1984) and Martin (1987).
2 The tone of Kan-on ru tone loanwords in the Japanese dialects
Kan-on ru tone loanword Tōkyō Kyōto Kagoshima Type of initial
2.1 kaku ‘rank’ 格 A clear
2.1 kyuu ‘emergency’ 急 A clear
2.1 kotu ‘skeleton’ 骨 ' A clear
2.1 setu ‘theory’ 説 , ' ' A clear
2.1 syoku ‘a job’ 職 , ' A clear
2.1 kyoku ‘melody’ 曲 , ' A second clear
2.1 eki ‘fortune telling’ 易 , ' A second muddy
2.1 riku ‘land’ 陸 , ' A second muddy
2.1 syoku ‘a meal’ 食 A muddy
2.1 seki ‘seat’ 席 , ' A muddy
2.1 taku ‘house’ 宅 A muddy
2.1 teki ‘enemy’ 敵 A muddy
2.3 huku ‘clothes’ 服 ' ' B muddy
3 The tone of Go-on ru tone loanwords in the Japanese dialects
Go-on ru tone loanword Tōkyō Kyōto Kagoshima Type of initial
2.3 bati5 ‘plectrum’ 撥 ' ' B clear
2.3 hati ‘eight’ 八 ' ' B clear
2.3 hati ‘begging bowl’ 鉢 ' ' B clear
2.3 hyaku ‘hundred’ 百 ' ' A6 clear
2.3 iti ‘one’ 一 ' ' B clear
2.3 siti ‘pawn’ 質 ' ' B clear
2.3 yaku ‘misfortune’ 厄 ' ' B clear
2.3 siti ‘seven’ 七 ' ' B second clear
2.3 syaku ‘a foot (length)’ 尺 ' ' B second clear
2.3 maku ‘curtain’ 幕 ' ' B second muddy
2.3 mitu ‘honey’ 蜜 ' ' B second muddy
2.3 myaku ‘pulse’ 脈 ' ' B second muddy
2.3 netu ‘heat’ 熱 ' ' B second muddy
2.3 niku ‘meat’ 肉 ' ' B second muddy
2.3 niku7 ‘blanket’ 縟 x ' x second muddy
5 The regular Go-on reading of this character is hati, the Kan-on reading is hatu. The reading
bati is a Kan’yō-on. The rendering of the clear initial as b- is surprising.
6 Nagasaki has B and A.
Go-on ru tone loanword Tōkyō Kyōto Kagoshima Type of initial
2.3 roku ‘six’ 六 ' ' B second muddy
2.3 yaku ‘role’ 役 ' ' B second muddy
2.3 bati ‘punishment’ 罰 ' ' B muddy
2.3 doku ‘poison’ 毒 ' ' B muddy
2.3 ziku ‘axle’ 軸 ' ' B muddy
2.3 zitu ‘truth’ 実8 ' ' B muddy
2.3 zyutu ‘technique’ 術9 ' ' B muddy
2.3 goo ‘karma’ 業 '10 ' B second muddy
2.1 kyaku ‘guest’ 客 A second clear
2.1 ziki ‘soon’ 直 A muddy
2.1 zoku ‘vulgarity’ 俗 A muddy
2.1 zoku ‘thief’ 賊 ' A muddy
In the previous chapters, I have argued that in Japan light was understood as ‘rising’,
and that rising tone contours in dhāraṇī recitation were used to express /L/ tone in
Middle Japanese. I have also argued that heavy was understood as ‘falling’ and that
falling tone contours in dhāraṇī recitation were used to express /H/ tone in Middle
Japanese. When Shinren therefore says that the Kan-on ru tone was light, this amounts to
saying that it was [L], and when he says that the Go-on ru tone was heavy, this
amounts to saying that it was [H].
The Siddham scholars’ use of the terms heavy and light goes completely against
what one would normally expect the meaning of these terms to be, but evidence
from the modern dialects confirms that this is nonetheless the sense in which
Shinren used these terms: We have seen that Kan-on ru tone loanwords belong to
tone class 2.1, and that Go-on ru tone loanwords belong to tone class 2.3. If light
indeed means ‘low’ and heavy means ‘high’ in Shinren’s description, class 2.1
should be reconstructed with /LL/ tone and class 2.3 with /HH/ tone.
Both in the Tōkyō type dialects of central Japan and in the Kyōto type dialects,
class 2.3 attaches with -[HH] pitch while class 2.1 attaches with -[LL] pitch as
second element of a compound. (See section 5.2.3 of part I.) As the pitches in the
Tōkyō type dialects and the Kyōto type dialects agree, they must reflect the tone that
these classes had before the Tōkyō type and the Kyōto type tone systems split. In
other words: Class 2.1 had /LL/ tone, and this is the class to which the light Kan-on
ru tone belongs. Class 2.3 had /HH/ tone and this is the class to which the heavy Go-
on ru tone belongs. The tones that have been preserved in compound nouns
7 The dialect data for this word are few, but (just as almost all of these examples) it has been
attested with 平平 tone dots.
8 Zitu is a Kan’yō-on. The official Go-on reading is ziti, the Kan-on reading is situ.
9 Zyutu is a Kan’yō-on. The official Go-on reading is zyuti, the Kan-on reading is syutu.
10 This example has the /H/ tone on the first mora (which developed from the initial syllable) in
Tōkyō because /H/ tone is not allowed on the dependent second mora.
therefore confirm that Shinren’s term ‘light’ referred to [L] pitch and that his term
‘heavy’ referred to [H] pitch.
In the Wa-on tone system, therefore, the pitch of the ru tone was similar to the
pitch of the merged shang/qu tone. When Wa-on/Go-on character readings are
marked ‘in the reverse’ by means of Kan-on-based tone dots, however, the shang/qu
tone is marked with the ping tone dot, but the marking of ru tone characters (even
though the pitch was different) was based on segmental considerations, and was not
changed. As a result the grouping of the Wa-on tones is different, depending on
whether we look at the original tone of the characters in the rhyme books, or at the
tone dots with which they were later marked in Japan.
4 Different groupings of the Wa-on ru tone
Wa-on pitches    Tone of character    Reversed tone dot marking
[F:]             ping                 qu
[H]              shang/qu, ru         ping, ru
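
The two-step reasoning behind the reversed markings can be restated as a small lookup. The sketch below is only an illustration of the argument in this section: the names WA_ON_PITCH, DOT_FOR_PITCH and reversed_dot are hypothetical, the Wa-on pitches are the tentative values proposed in (1), and the dot values are those assumed here for the Siddham scholars' Kan-on-based usage.

```python
# Tentative Wa-on pitch for each Chinese tone category, as proposed in (1).
WA_ON_PITCH = {"ping": "F:", "shang": "H", "qu": "H", "ru": "H"}

# Pitch that each Kan-on-based tone dot is assumed to have marked
# (qu dot = long falling [F:], ping dot = high [H]).
DOT_FOR_PITCH = {"F:": "qu", "H": "ping"}


def reversed_dot(tone_category: str) -> str:
    """Return the tone dot expected on a Wa-on reading of this category."""
    if tone_category == "ru":
        # The ru tone is identified by segmental features,
        # so its marking does not follow the pitch.
        return "ru"
    return DOT_FOR_PITCH[WA_ON_PITCH[tone_category]]


for category in ("ping", "shang", "qu", "ru"):
    print(category, "->", reversed_dot(category))
```

Under these assumptions the lookup returns the same pairs as the right-hand column of (4): the ping category receives the qu dot, shang and qu both receive the ping dot, and ru keeps its own dot even though its pitch groups with shang/qu.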
In works that deal with Go-on readings, we therefore find remarks pointing out that
in Go-on, the pitch of the ru tone was the same as the pitch of the ping tone, whereas
within the Wa-on tone system itself, the ru tone had the same pitch as the shang/qu
tone. The first example is in the Kujō-ke-bon 九條家本 of Hoke-kyō-on 法華経音
from the end of the Heian period. Hoke-kyō-on is a collection of traditional
character readings of Wa-on and Kan’yō-on type by an unknown Tendai monk.11
According to Myōgaku, the Hoke-kyō should be read according to Wa-on (Mabuchi,
1963:1069) and the tone markings in Hoke-kyō-on are clearly of the reversed type.
In this work we find the following remark about the fu-nisshō-ten: 本入声ナル
ヲ平声呼ブ “pronounce what is originally a ru tone as a ping tone” (Konishi, 1948:
478). This shows that the pitches of the ru tone and the ping tone were considered
identical, as the two were indistinguishable as soon as the final consonant of the ru
tone was lost.
The next example is from the much later Edo-period work Bumō-ki 補忘記,
written by a monk of the Shingi Shingon school. Bumō-ki deals with the correct
recitation of the rongi ceremonies. These ceremonies – although conducted in
Japanese – contain many Sino-Japanese loanwords, which overwhelmingly belong
to the Go-on type. The observations in Bumō-ki as to the behavior of the tones in
11 It is, however, very close to Myōgaku’s 明覚 school, and it is even thought that it may have
been written by Myōgaku himself. This is because of the following link: the fanqie in
Hoke-kyō-on are often identical to those in Hoke-kyō tanji 法華経単字. As Konishi (1948: 361)
explains, in these two works the many Kan’yō-on in the recitation of the Lotus sutra are spelled
by means of fanqie, and so the fanqie contained in both works have nothing to do with the
traditional fanqie that are based on the rhyme books. Furthermore, of Hoke-kyō tanji there
exists a copy from the year 1136 to which at the end a gojūon-zu table is added that closely
resembles Myōgaku’s table in Han’on sahō 反音作法.
Chinese loanwords (the so-called ideai rules, which will be discussed in section
13.1.2) state for instance: 平入ハ同シ様ニ出合スル也 “The ping and the ru tones meet
in the same way.” This, again, confirms that in Go-on, the ru tone had the same
pitch as characters that were marked with the ping tone dot.12 (But these characters
belong for the largest part to the shang and qu categories in the rhyme books.)
11.2 The Sinologist view of the shang and qu tones
In Kindaichi’s opinion, the Sinologist view of the Middle Chinese tones hampers
rather than helps the interpretation of the Middle Japanese material (1951:4). The
view of the Middle Chinese tones that was standard – not only among Sinologists,
but at the time when Kindaichi wrote Nihon shisei kogi also still among most
Japanologists – was that the ping tone was level, the shang tone was rising, and the
qu tone was falling. The ru tone was distinguished from the other tones because it
ended in -p, -t, -k. As a result of Kindaichi’s work, however, the opinion in
Japanologist circles changed, and nowadays the Japanologist view of the Middle
Chinese tones and the Sinologist view of the Middle Chinese tones differ considerably,
especially as far as the tone contour of the qu tone is concerned: in the Japanologist
view, the Late Middle Chinese qu tone was [R], and the shang and ping tones were
[H] and [L] respectively.
These values are based for the most part on the way in which the tones are used
to mark the tones of Middle Japanese in the standard reconstruction, but Kindaichi
sees corroboration for the reconstruction of shang as [H] and qu as [R] in the well-
known merger of the heavy shang tone with the qu tone: The voiced aspirated
initials of the heavy shang tone lowered the onset of the originally level high shang
tone, which developed a rising contour tone as a result, and merged with the qu tone.
According to Kindaichi therefore, the yin/yang register split is what caused the
heavy shang tone to merge with the qu tone, and so the register split and the merger
of heavy shang with qu occurred simultaneously. Pulleyblank, on the other hand,
explained the merger of heavy shang with qu as the result of assimilation of the final
glottal stop of the shang tone to the voiced aspiration of the muddy initials in Late
Middle Chinese. According to Pulleyblank, this merger predated the split into a yin
(high) and yang (low) register in the 9th century.
As we have seen, Pulleyblank’s analysis agrees better with the fact that in Biao’s
tone system the heavy shang tone had already merged with the qu tone, even though
in his tone system a split into two registers of different tone height is still lacking.
Kindaichi’s theory also fails to explain why shang tone syllables with sonorant
initials (which were equally voiced) did not merge with the qu tone.
12 Another example from Bumō-ki can be seen in the tone chart included in this work (cf. section
13.1.1) where the fu-nisshō dot is marked with the same hakase marks as the heavy ping and
heavy ru tones.
I have argued that the tone systems of the Siddham scholars after Annen have
little or nothing to do with Late Middle Chinese, as they have been severely
distorted by Myōgaku’s division of the tones into two separate parts, and by the
unnatural interpretation of the heavy/light distinction. As I have mentioned in
section 9.4.3, however, it is possible that the most basic element of these tone
systems (namely the contours of ping, shang and qu, which would form the second
part of the tone in Japan), was adopted faithfully from Late Middle Chinese.
These possibly reliable elements in the Japanese tone systems do not agree with
Kindaichi’s reconstruction of the Late Middle Chinese tones and his ideas on the
origin of the merger of heavy shang with qu: The shang tone is described as ‘rising’
by the Siddham scholars and not as ‘high’ and the qu tone is described as ‘bending
down’, and not as ‘rising’. As it turns out, the descriptions – in fact – agree rather
well with the Sinologist view of the shang and qu tones.
11.3 The shang and qu tones in Sino-Korean
Sino-Korean is the 15th and 16th century Middle Korean character reading tradition,
which – like Japanese Kan-on – is based on a form of Late Middle Chinese.13 In
Sino-Korean, the shang and the qu tones had merged, mostly as shang [R]. In Japan,
this merger is sometimes seen as confirmation of the Japanese reconstruction of the
Late Middle Chinese qu tone as [R]. However, as I will argue below, based on the
Sino-Korean data, a reconstruction of the Late Middle Chinese qu tone as [F] makes
more sense.
In contrast to the situation in Japanese linguistics, there is no controversy
surrounding the tonal value of the tone dots that were added to Middle Korean texts.
The pitches of Korean words were marked by zero dots = ping [L], one dot = qu [H]
or two dots = shang [R], placed to the left side of the Chinese characters or hangul
graphs.
5 The marking system used to indicate the pitches of Middle Korean
Tone     Value    Marked by means of
ping     [L]      zero
shang    [R]      two dots
qu       [H]      one dot
13 According to Arisaka, 15th century Sino-Korean was based on the dialect of Kaifeng (the
capital of the northern Song) of the 10th century. The relationship of Sino-Korean with Kan-on
is unclear, in the sense that in some respects it appears to be more archaic while in others it
appears to be younger.
Although the tonal value of ping, shang and qu must have been based on the form of
Late Middle Chinese that was introduced in Korea, the marking of Chinese
characters read in Sino-Korean does not agree completely with the value of the ping,
shang and qu tones indicated in (5).
Apart from a small percentage of irregular markings, ping tone characters were
indeed marked as ping (= [L]). In the case of shang and qu, however, there is a
difference: in the majority of cases (81.1%) shang tone characters were marked as
shang = [R], but in the majority of cases (80.4%) qu tone characters were also
marked as shang = [R]. A small percentage of shang tone characters (15.2%) and a
small percentage of qu tone characters (14.5%) were marked as qu. Finally, ru tone
characters were marked as qu = [H] (Kim Yengman, 1967). In Sino-Korean, in other
words, shang and qu had merged as shang [R], and to a lesser extent as qu [H].
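
The near-identical distribution of the two categories can be checked with simple arithmetic. The sketch below uses only the four percentages quoted above; the names MARKED, remainder and gap are hypothetical, and it assumes that the two figures for each category are shares of the same total. The text does not say how the remaining characters were marked, so the remainder is left uninterpreted.

```python
# Percentages quoted in the text (after Kim Yengman, 1967).
# "R" = marked as shang [R], "H" = marked as qu [H].
MARKED = {
    "shang": {"R": 81.1, "H": 15.2},
    "qu": {"R": 80.4, "H": 14.5},
}


def remainder(category: str) -> float:
    """Share (in %) not covered by the two reported markings."""
    return round(100.0 - sum(MARKED[category].values()), 1)


def gap(marking: str) -> float:
    """Difference (in percentage points) between shang and qu for one marking."""
    return round(abs(MARKED["shang"][marking] - MARKED["qu"][marking]), 1)


for category in MARKED:
    print(category, "unaccounted for:", remainder(category))
for marking in ("R", "H"):
    print("shang vs. qu gap for", marking, ":", gap(marking))
```

On these figures the two categories differ by less than one percentage point for either marking, which is what justifies speaking of a merger rather than of two partially overlapping distributions.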
I think this merger can be explained as a result of developments within Korean,
and that it does not mean that the qu tone in Late Middle Chinese had a rising tone
contour, as Kindaichi assumed. It is possible to understand the eventual merger
of shang and qu in Sino-Korean if we look at the nature of the Middle Korean pitch-
accent system.
Middle Korean had a pitch-accent system in which the first [H] pitch was
distinctive, while the following pitches were highly unstable as they were governed
by automatic rules of prosody (Kadowaki, 1976). In the Middle Korean pitch-accent
system the falling qu tone of Late Middle Chinese could therefore not be taken over
as [F], as only the first rise to [H] pitch in the word was distinctive, while a
subsequent drop to [L] pitch was not. (In this respect the Middle Korean accent
system was similar to that of modern Hokkaidō Ainu.) It was therefore impossible to
have a stable falling pitch contour, much less a distinctive one (Ramsey 1978:120).
The Late Middle Chinese qu tone, therefore, could only be realized as [H], [L] or
[R]. If the Late Middle Chinese qu tone had been [R], the qu tone would no doubt
have been taken over as [R]. There are indications, however, that the qu tone in
Sino-Korean was initially taken over as [H] and not as [R].
The qu tone dot was, after all, used to mark [H] pitch when used to mark Middle
Korean words. In addition there is a remark in the explanatory notes to Sohak ônhae
小學諺解 (1587): “Nowadays in the sounds of the vulgar (pronunciation of the
Chinese characters) the shang and the qu tone are confused with each other” (Rosen,
1974:115), which suggests that originally, this was not the case.
A reconstruction of the Late Middle Chinese qu tone as [F] therefore agrees
much better with the Middle Korean data: As this tone started with [H] pitch, and as
[H] (at least the first [H] in the word) in the pitch-accent system of Middle Korean
was distinctive, a realization as [H] would have been natural.
So why did the originally [H] qu tone later merge with the [R] shang tone in
Sino-Korean? I think that a combination of different factors may have played a role:
In the Middle Korean pitch-accent system, the rising tone (which was lengthened)
only occurred on the initial syllable of the word. In Sino-Korean words, it was only
possible for the long rising tone to occur in non-initial position if it was not preceded
by [H] pitch within the word.
As the second character in a Sino-Korean compound – in the majority of cases –
the shang tone was therefore not distinguished from the qu tone from the start. In
addition there was the usual overlap between shang and qu due to the well-known
merger of shang tone characters with muddy initials with the qu tone. At first, the
situation was as shown in (6).
6 The pre-15th century Sino-Korean tones
         Initial syllable                    Other syllables
shang    [R] (muddy initials > qu = [H])     mostly [H]
qu       [H]                                 mostly [H]
Later, however, segmental influences started to play a role: in native Korean words,
[R] pitch was rare for words that started with an aspirated initial (Ito, 2005:6) and
this tendency appears to have influenced Sino-Korean, as relatively many Sino-
Korean morphemes derived from the shang and qu tones that have second clear
initials have [H] pitch. There is also a tendency for Sino-Korean morphemes derived
from the shang and qu tones that have muddy and second muddy initials to have [R]
rather than [H] pitch.14
Even though these developments were not absolute, the result was nevertheless
that even in initial position there now was a considerable mixture of [H] and [R]
reflexes for syllables derived from the Late Middle Chinese shang and qu tones. It is
these developments – together with the already considerable overlap between the
two tones in non-initial position – that caused the qu and the shang tones to merge
completely in Sino-Korean.15
14 According to Ito (2005), Sino-Korean shang and qu tone morphemes with aspirated initials
(second clear initials in Chinese) had most occurrences of [H] pitch (shang 27.12 and qu 21.23),
next were morphemes with clear initials (shang 20.73 and qu 20.89) and the smallest
percentage of occurrences of [H] pitch were with muddy and second muddy initials (shang
16.03 and 17.18 respectively and qu 20.41 and 16.90 respectively). These segmental influences
on the pitch of the initial syllable in pre-15th century Sino-Korean can be represented as
follows:
Tone     Initial         Realization
shang    clear           [R]
         second clear    [R] > [H]
         second muddy    [R]
         muddy           > qu
qu       clear           [H]
         second clear    [R] > [H]
         second muddy    [H] > [R]
         muddy           [H] > [R]
15 Eventually a system evolved in which the division was based on a new criterion (Ito 2005:5-6):
Syllables of a certain segmental type – whether they originally derived from the shang tone or
11.4 Paekche loanwords in Old Japanese
The tone of a number of early Korean loanwords in Old Japanese (as attested in
Middle Japanese) has been cited as evidence for the standard reconstruction of the
Middle Japanese tone system. 16 The examples that are usually mentioned can be
divided into two types, as Miyake (1997) identified a number of them as loanwords
from Early Middle Chinese that ended up in Japan via Korea.
7 Comparison of the tone of Paekche loanwords in Old Japanese
in Middle Japanese and Middle Korean
                                      Middle Japanese tone    Middle Korean tone
‘hatchet’                             nata17   /LL/           nat      /H/
‘district’                            kohori   /HHH/          koWol    /LL/
‘temple’ (< EMC *chraat)              tera     /LH/           tyel     /H/
‘begging bowl’ (< EMC *pat)           hati     /HH/           pali     /LL/
‘Buddha’ (< EMC *but-daa/buu-doo      hotoke   /HHH/          pwuthye  /LL/
  or *but-lay/buu-daa)
The loanwords from Early Middle Chinese among the examples in (7) almost
certainly came to Japan from Paekche in the 5th to 6th century, as Buddhism (to
which all three words are related) and the first character readings (Wa-on) were
introduced in Japan from Paekche.18 The first two words (‘hatchet’ and ‘district’)
from the qu tone – would always have [H] pitch (a small group), and syllables of a different
segmental type would always have [R] pitch (a large group). Yet others could have either [H]
or [R] pitch.
16 These words have been discussed in this context by Murayama (1990), Starostin (1991),
Kortlandt (1993) and Vovin (1997).
17 The reconstruction of the tone of nata is based on the modern dialect reflexes, as nata is not
attested in (Old and) Middle Japanese texts. Starostin and Miller have therefore expressed their
doubts as to the possibility that this word is an early loan from Korean nas/ nat- ‘sickle’
(Robbeets, 2003).
18 Both ‘temple’ and ‘bowl’ and perhaps also the first syllable of ‘Buddha’ belonged to the ru tone
in Chinese, but the tone of ‘temple’ in Middle Korean and Old Japanese does not agree with the
tone of the other two examples. (In the period when these words were introduced in Japan a
register split in the ru-tone definitely played no role yet.) As we have seen, the older layer (Wa-
on/Go-on) of ru tone loanwords in Japanese tends to have [H] tone, and two of the three
examples above (‘begging bowl’ and ‘Buddha’) follow this trend. In standard Sino-Korean,
which was based on some form of Late Middle Chinese, ru tone loanwords normally have [H]
pitch, and so the tone of the examples above does not fit into the usual Sino-Korean pattern. In
Middle Korean therefore, these words probably also belonged to an old layer of Buddhist
related loanwords from Early Middle Chinese that had been introduced through some other
Korean peninsular language. The Late Middle Chinese ru tone was taken over as [H] in Korean,
but as [L] in Japan. The origin of the difference may lie in the different prosodic systems of the
two receiving languages, or in a difference in the variety of Late Middle Chinese that was
however, can probably be regarded as loanwords from the 5th to 6th century language
of Paekche itself.
As information on the prosodic system of Paekche is not available, we can only
compare the tone that these words have in Middle Japanese with the tones of 15th to
16th century Middle Korean. A complicating factor is that Middle Korean is the
direct descendant of the language of Shilla and not of the language of Paekche. In
spite of the paucity of Paekche material among the toponyms of the Three Kingdoms,
the consensus of scholarly opinion seems to be that the language of Paekche was
probably close to the language of Shilla, but that the two were almost surely distinct
and separate languages.
Ramsey (1991) has argued that the tonal oppositions of Middle Korean do not go
back to proto-Korean, but developed later, from segmental elements.
Proto-Korean was most likely characterized by a non-distinctive prosodic system in
which the last (or only) syllable of a morpheme was automatically accented. We can
therefore wonder whether the 5th to 6th century language of Paekche already had
tonal oppositions. But let us assume – for argument’s sake – that these tonal
oppositions were there, and that the Middle Japanese tones faithfully reflect the
pitches that these words had (whether they were Chinese in origin or not) in the
language of Paekche some 500 years earlier.
If we follow Ramsey’s reconstruction, we have to accept the fact that the pitches
of 5th to 6th century Paekche did not agree phonetically with the pitches of Shilla-
based Middle Korean, a full millennium later. This does not seem much of a
problem, but – as mentioned – it has been cited as a serious argument against
Ramsey’s theory.19
We know from Japanese that the realization of the tonal oppositions of even
closely related dialects can be radically different: Kagoshima and nearby
Makurazaki for instance, share a very similar word-tone system, in which the
vocabulary is divided in the same manner over two word-tones, A and B. The
phonetic realization of the word-tones in the two dialects, however, is almost exactly
each other’s opposite.
In my opinion, it would be stretching the limits of comparison to compare the
tone that 5th to 6th century Paekche loanwords in Old Japanese had in the Japanese
language of the 12th century, with 15th to 16th century Middle Korean – which was
the descendant of another (though related) language – especially when we draw
conclusions about the phonetic values of the two.
introduced in the two countries.
19 “One would expect that early Korean loanwords in Old Japanese will either preserve original
Korean pitches or will be pretty close to them. It would be absolutely inconceivable if the
pitches flip-flopped” (Vovin, 1997:116).
12 Determining the time of the tone shift in Kyōto
In part I, I have argued that there are three major phases in the developments that
generated the tone systems of the modern Tōkyō type and Kyōto type dialects:
First, /H/ tone spreading onto the particles after words ending in /LH/ tone and/or
/R/ tone in certain dialects. Secondly, a reduction of the number of /H/ tones per
word, in that only /H/ before /L/ was preserved as a phonological /H/ tone. As a
result of this reduction, the remaining /H/ tones became accent-like. From then on,
syllables with /H/ tone were highlighted over other syllables in the word to the
extent that the pitch of all other syllables could be predicted based on the location of
the /H/ tone in the word. This means that all other syllables in the word can be
analyzed as having Ø tone. Finally, in Kyōto and a large surrounding area, there was
a leftward tone shift. This shift not only resulted in the typical location of the /H/
tone in Kyōto – one more syllable towards the left than in Tōkyō – but also re-
created a distinctive /L/ toneme, which is limited to the initial syllable of the word.
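The second phase described above can be illustrated with a minimal Python sketch. It assumes that a word (plus any following particle) is written as a string of H and L tones only, leaves out /R/ tones and the leftward shift, and uses '0' as a stand-in for Ø. The function name restrict_h and the sample inputs are hypothetical and are not taken from the source.

```python
def restrict_h(tones: str) -> str:
    """Apply the /H/ tone restriction to a string of 'H' and 'L' tones.

    Assumption: only an H immediately followed by L survives as /H/;
    every other syllable is written '0' (standing in for Ø tone).
    """
    if any(t not in "HL" for t in tones):
        raise ValueError("this sketch only handles 'H' and 'L'")
    result = []
    for i, tone in enumerate(tones):
        before_l = i + 1 < len(tones) and tones[i + 1] == "L"
        result.append("H" if tone == "H" and before_l else "0")
    return "".join(result)


# Hypothetical inputs: in a run of H tones, only the H next to the L is kept.
print(restrict_h("HHL"))
print(restrict_h("HLL"))
```

In this sketch the position of the one surviving H is enough to identify the whole pattern, which is the sense in which the text calls the remaining /H/ tones accent-like.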
The leftward tone shift was such a radical change that it would have upset the
traditional view of the tones completely. The /H/ tone restriction that preceded it, on
the other hand, was less invasive and probably occurred only gradually. This change
would not have caused a complete disruption of the traditional tone system, at least
not initially. We would, however, expect to see it reflected in the documents in the
shape of occasional ‘mistakes’: cases where words that traditionally contained
sequences of /H/ tones now show a reduced number of /H/ tones per word.
The earliest indication of /H/ tone restriction can be seen in the Jakue-bon 寂恵
本 (1278) of Kokin waka-shū 古今和歌集 where siho ‘tide’ and hana ‘flower’ (both
tone class 2.3) are marked with 上平 instead of 平平 tone dots. (Also in the
Fushimi-miyake-bon 伏見宮家本 of Kokin waka-shū from the end of the 13th
century.)
Such early examples are rare, but the fact that such mistakes started to be made is
significant. Especially in copied texts, the tone dots would not easily be altered. As
there was probably some time between the start of the /H/ tone restriction and the
appearance of the first ‘mistakes’ in the texts, we can perhaps point to around 1250
as the time in which the /H/ tone restriction reached central Honshū. (I assume that
the written record reflects the Nairin, Chūrin and Gairin dialects from the Kinki and
Tōkai regions.)1
A similar example can be found in Moji-han 文字反 (1331-1334) where sima
‘island’ (class 2.3) is marked with 上平, instead of the traditional 平平 tone dots.
1 In western Japan on the other hand, the /H/ tone restriction must have started before the 10th
century. (See section 10.7 of part I.)
Moji-han is a late example of the use of tone dots, as the habit of marking the tones
of Japanese by means of tone dots was abandoned in the early 14th century.
It is clear that the Kyōto tone shift had already taken place by 1530, as around
that time Konparu Zenpō 金春禅鳳 (1454-1532) mentions the typical difference
between Kyōto type tone and Tōkyō type tone in unmistakable terms in Mōtan
shichin-shō 毛端私珍抄. (See section 3.3.1 of part I.) As the disruption of the
traditional tone system by the leftward tone shift would explain the abandonment of
the tone dot marking system, Ramsey pointed to the 14th century as the time in
which the leftward tone shift occurred.
In the following sections I will discuss a number of alternative sources of
information on the tone system of the 14th century, in an attempt to determine the
time of the tone shift in Kyōto more precisely.
12.1 Evidence from the 14th century
The most detailed information on the tone system of the 14th century is contained in
fushihakase material from this period. This material – which I refer to as ‘old’ rongi
material in order to distinguish it from ‘new’ rongi material such as Bumō-ki 補忘記
(which stems from the 17th century period of rongi revival) – consists of musical
scores of the rongi ceremonies. Rongi ceremonies are formalized discussions on the
Buddhist teachings that occurred in the Shingon and the Tendai schools. These
ceremonies developed a fixed shape in the early 14th century, and the fushihakase in
the old rongi materials are thought to reflect the tone system of this period.
This material and the history of the rongi ceremonies are discussed in more
detail in chapter 14, but it is clear that the fushihakase reflect a tone system in which
the number of /H/ tones is being restricted, not yet a post-shift tone system. The
markings for classes 2.3, 3.4 and 3.5 for instance, are predominantly ,
and . Markings as , and also still occur, but only very
rarely. As explained in section 4.1.2 of part I, there are reasons to assume that the
/H/ tone restriction was a gradual process in phonetic terms ([H] > [M] > [L]), and
that – temporarily – [M] pitch played a role in the tone system of this period.
Although the old rongi material is the most detailed material from the 14th
century, some information can also be found in the tonal spelling system devised by
Gyōa 行阿.
12.1.1 Gyōa’s emendations to Fujiwara Teika’s spelling system
In the 13th century a spelling system had been devised for Japanese by the poet
Fujiwara Teika 藤原定家 (1162-1241). Teika took the spellings in old manuscripts
as his example, in order to determine the correct spelling of words that included
intervocalic は and わ, the kana signs え, へ and ゑ and the signs い, ひ and ゐ
that had fallen together in contemporary pronunciation. In case of the signs お and
を on the other hand, the decision of which sign to use in which words was based on
the tone of the syllable in question: Ōno Susumu (1950) discovered that when Ruiju
myōgi-shō 類聚名義抄 and other Heian period manuscripts indicated a ping tone,
Teika used the sign お, and when the Heian period manuscripts indicated a shang
tone Teika used を.
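The tonal principle for お and を amounts to a two-way lookup, sketched here in Python. The table O_KANA_BY_TONE and the function o_kana_for are hypothetical names introduced for illustration, and the sketch covers only the ping and shang cases that the text mentions.

```python
# Tone dot found in the Heian period manuscripts -> kana used by Teika.
O_KANA_BY_TONE = {"平": "お", "上": "を"}


def o_kana_for(tone_dot: str) -> str:
    """Return the kana for an o syllable carrying the given tone dot."""
    if tone_dot not in O_KANA_BY_TONE:
        raise ValueError(f"no rule stated for tone dot {tone_dot!r}")
    return O_KANA_BY_TONE[tone_dot]


print(o_kana_for("平"))  # ping tone
print(o_kana_for("上"))  # shang tone
```

Because the choice depends only on the tone of the syllable itself, any later change in that tone would show up directly as a change of kana.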
By the mid 14th century, the kana orthography proposed by Teika was being used
extensively when writing Japanese poetry. Soon thereafter however, a system of
kana spellings that had an indirect link with Teika’s spellings was proposed by
Minamoto Tomoyuki 源知行 (1290-1370) in his work Kana moji-zukai 仮名文字遣
sometime after he entered the priesthood in 1363 and took the name Gyōa 行阿.
This work contained the recommended kana spellings for over a thousand entries.
The criteria appear to be essentially the same as those employed by Teika, i.e. usage
in old documents is followed for all problematic kana other than お and を, for
which the tonal principle applied.
Gyōa amended Teika’s usage of お and を in a number of cases, which indicates
that the Japanese tone system had gone through some changes. 2 Among Gyōa’s
emendations there are 35 examples of お changing to を (such as in 2.3 oni をに
‘demon’, 2.3 ono をの ‘axe’, 2.3 oya をや ‘parent’) indicating a change from 平平
to 上平. Changes from を to お on the other hand, do not occur, which means that
there were no examples of shang tones changing to ping tone.3
The fact that Gyōa only criticized Teika’s use of the kana お and を in words
that started with sequences of ping tones indicates that the change did not upset his
entire perception of the tones. Except in case of certain syllables in the word, the
2 Based on Gyōa’s spelling system, Sakurai (1976:404) argued that this change (which I identify
as the /H/ tone restriction, but which Sakurai – who adheres to the standard theory – sees as the
elimination of sequences of /L/ tone) must have started between 1241 (the death of Fujiwara
Teika) and 1293, the birth of Gyōa. As we have just seen however, there are examples of /H/
tone restriction that date from before 1293.
3 Ōno also found a few examples where Gyōa had not changed Teika’s spelling, even though this
would have been expected, such as in 3.4 otoko おとこ ‘man’ and 4.5 otouto おとうと
‘younger brother’. He also says that there are no instances of tonal change to be seen yet in the
initial syllable of nouns of class 3.5. Unfortunately Ōno does not present the examples on
which he bases his remark. Later on in his paper (1950:16), in another context, there is the
example of 2.3 oya をや (which points to 上平 tone) ‘parent’ vs. oyako おやこ ‘parent and
child’. Ōno probably reconstructed 平平上 tone (class 3.5) for oyako, based on 2.3 oya + 1.1
ko. According to Martin’s listing however, Heian period attestations of the tone of oyako are
lacking and the modern reflexes indicate tone class 3.6 (平上上 in Middle Japanese) instead of
3.5. There is however, another example: The shūshi-kei of the verb osoru ‘to fear’ (tone class
B) normally has the same tone in Middle Japanese as tone class 3.5 (平平上). This word is
spelled as おそる, showing no change of tone in the first syllable. (Otherwise the spelling
would have been をそる.) If Ōno is right, the idea that [M] pitch played a role in the tone
system of the period may explain the lack of /H/ tone restriction in sequences of /H/ tone in
class 3.5: It could be that what we see here is a difference in marking strategy depending on the
tonal context. It may have felt appropriate to Gyōa to annotate [M] tone followed by [H] tone
with a [L] tone mark, as it was lower in comparison. Conversely, it may have felt appropriate to
annotate [M] tone followed by a [L] tone later on in the word with a [H] tone mark, as it was
higher in comparison.
traditional tones still agreed with those of Gyōa himself, which again points to /H/
tone restriction, and not to the leftward tone shift.
12.1.2 Emperor Chōkei on the ping, shang and qu tones
In the postscript to his work Sengen-shō 仙源抄 (1381), emperor Chōkei 長慶 (who
was in exile in Yoshino), likewise sought to explain usage of the two kana signs お
and を in texts in Teika’s own hand in terms of differences in tone. Just as in Gyōa’s
case, difficulty was caused by the fact that his frame of reference was the tone
system of his own day. Chōkei was further confused because he looked to
differences in tone as the possible basis for distinction in usage of kana other than
just お and を (Seeley, 2000).
An interesting feature of emperor Chōkei’s postscript is that he
elucidates the tones by giving examples from Japanese.
1 Emperor Chōkei’s examples of the ping, shang and qu tones
ping 2.3 kami ‘god’ (marked 平平)
shang 2.4 kami ‘upper, above’ (marked 去上)
qu 2.2 kami ‘paper’ (marked 上去)
The use of the qu tone to mark [H] tone as opposed to [L] tone in this marking
system is interesting. The ping tone seems to have been preferred to mark [H] (or
[M]) tone that did not occur with [L] tone within the same word.
Interpreted according to Ramsey’s theory, tone class 2.3 would have had or
tone, tone class 2.4 would have had tone and tone class 2.2 would have
had tone. The strange result is then, that the ping tone according to Chōkei was
[H], which is no problem, but that the shang tone was [F] and the qu tone [R], which
is the opposite of the value of these tones in Ramsey’s theory.
Interpreted according to the standard theory, tone class 2.3 would have had
tone, tone class 2.4 would have had tone, and tone class 2.2 would have had
tone. In that case, the strange result is that the ping tone would have been [L],
which agrees with the value of this tone in the standard theory, but that the shang
tone was [R] instead of [H], and that the qu tone was [F] instead of [R].
I do not think that it is necessary to dismiss Chōkei’s description as irrational, if
we regard his determination of the tonal category of these example words as
influenced by Myōgaku’s ideas. I think that emperor Chōkei started by applying
tone dots to the first and the second kana of his example words. As Myōgaku’s
notion that the tone of the second kana is the tone that represents the tonal category
of the word must have been known to him, he then regarded 2.3 平平, as a ping tone
word, 2.4 去上 as a shang tone word, and 2.2 上去 as a qu tone word.
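The classification that this interpretation attributes to Chōkei can be written as a short Python sketch: the tone dot on the second kana names the category of the whole word. The names CATEGORY_BY_DOT and word_category are hypothetical, and the three inputs are the markings listed in (1).

```python
CATEGORY_BY_DOT = {"平": "ping", "上": "shang", "去": "qu"}


def word_category(dots: str) -> str:
    """Classify a two-kana word by the tone dot on its second kana."""
    if len(dots) != 2:
        raise ValueError("expected exactly two tone dots")
    return CATEGORY_BY_DOT[dots[1]]


# The three kami examples from (1), with their markings.
for gloss, dots in [("god", "平平"), ("upper, above", "去上"), ("paper", "上去")]:
    print(gloss, dots, word_category(dots))
```

Under this reading the first tone dot plays no part in naming the category, which is why 去上 can count as a shang tone word and 上去 as a qu tone word.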
I see the fact that Chōkei still marks tone class 2.3 with 平平 tone dots as a sign
that in his own speech this class was still being distinguished from class 2.2.
Although it is, of course, possible that he distinguished this class merely out of
deference to the traditional markings, such adherence to traditional norms in case of
this class would be hard to reconcile with his unorthodox marking of classes 2.4 and
2.2. I therefore assume that class 2.3 still had or tone in Chōkei’s own
speech.
The differences that we see between the representations of Gyōa, Chōkei and the
14th century fushihakase material may represent different approaches to the
representation of [M] pitch, and not necessarily different tone systems.
12.2 Confusion in the 15th century
The dramatic transformation of the tone system that results from the leftward tone
shift would have fundamentally upset the traditional concept of the tones. This is not
what we see yet in the material from the 14th century discussed in the previous
sections: What we see in these materials points to /H/ tone restriction, and not yet to
the leftward tone shift.
The clearest sign that the leftward tone shift had occurred is when we find
evidence of a complete disruption of the traditional tone system. Such evidence is
only first seen in the early 15th century.
12.2.1 Yūkai 宥快 (Kogi Shingon school)
In Shishō shiki 四声私記 (1409) the oral tradition of Yūkai 宥快, a Siddham scholar
of the Hōshō-in 宝性院 temple on Kōya-san is recorded. (The Hōshō-in had been
the centre of the Kogi Shingon rongi tradition.) The tone descriptions in Shishō shiki
show that by that time, all the traditional tonal distinctions had been obliterated and
that the tones were defined on the basis of completely new terms, of which the
meaning is not always clear.
Yūkai’s tradition represents the tone theory that was prevalent at the time in the
Kogi Shingon school on Kōya-san (Konishi, 1948:506).4 As Kōya-san had been the
centre of Siddham study of the Kogi Shingon school, we must conclude that the
traditional tone theories had become extinct in the Kogi Shingon school by this time.
With the tradition so severely in disarray it is surprising that any theoretical work on
the tones was written at all.
2 Shishō shiki on the four basic tones and light and heavy
去声上声出気、 In the qu and the shang tone the breath is expelled,
入声平声入気也矣。in the ru and the ping tone the breath is inhaled.
是分二四声一。 In this way the four tones are divided.
4 I have adopted the description in Shishō shiki from Konishi, but I have left out the well-known
quotation from Yuanhe yunpu and the usual description of how the qu tone in Go-on becomes
ping tone in Kan-on and how the ping tone in Go-on becomes qu tone or shang tone in Kan-on.
軽短、重長。 Light is short and heavy is long.
是亘二四声一。 This is true for all of the four tones.
凡四声,一音発、 In general, as regards the four tones, if one sound is emitted,
必四声在レ之。 the four tones are definitely in it.
最初声上声。 The first tone is the shang tone.
上声長呼平声、 When the shang tone is long, it is called the ping tone,
遠去声、 when it peters out, it is the qu tone,
急留入声也。 when it stops suddenly, it is the ru tone.
As the word 気 in this text can only refer to breath, it appears as though the qu and
shang tones were regarded as aspirated, and the ping and ru tones as unaspirated. It
is however, highly unlikely that the concept of aspiration would suddenly have been
introduced in the 15th century as the distinction between the Chinese clear (voiceless
unaspirated) and second clear (voiceless aspirated) initials was never adopted in
Japan. I assume that the remarks are strongly inspired by the name of the ru tone
(‘entering’) and do not contain much actual meaning.
As to the descriptions of the individual tones, it appears that the idea that each
tone had a different tone contour has been abandoned: The conventional description
of the qu tone as ‘distant’ (translated here as ‘petering out’, as the context relates to
differences in length), does not contain information on the tone contour or pitch
height of this tone, but the rest of the tones now all seem to have the same tone
height, and differences in length have become the main criterion for distinguishing
between them.5
The equation of light with shortness and heavy with length is reminiscent of the
tone length theories expressed in Hoke-kyō shakumon 法華経釈文, Sakumon daitai
作文大体 and Moji-han. These theories could be traced back to a generalization of a
number of remarks in Shittan-zō 悉曇蔵, but here the generalization has gone even
further: Distinctions in length have now replaced distinctions in tone height entirely.
As I assume that this happened because the traditional tones no longer
conformed to contemporary pronunciation, I see this text as positive evidence that
by 1409 the Kyōto tone shift had already taken place.
12.3 Reanalysis at the end of the 15th century
The traditional tone system had collapsed because the shift had upset the one-on-one
relation between the tone of a syllable and the tone dot with which this syllable had
traditionally been marked. The traditional value of the tone dots that had been added
to native Japanese words no longer agreed with the tone that these words had
5 The habit of chanting the kaku tone on the same tone height as the chi tone, which can still be
found in the modern Nanzan Shin-ryū tradition on Kōya-san probably stems from this period.
(Cf. section 5.6.1.)
acquired in the spoken language. Likewise, the traditional tones of the Chinese
characters with which the Kan-on loanwords were written, no longer agreed with the
pitches that such loanwords had acquired in the spoken language.
In this situation there were two possible ways to reconcile the markings in the
old documents once more with the tones of the spoken language: One option would
be to place the pitch fall each time one syllable earlier than indicated by the tone
dots. This option would require an awareness of the fact that a leftward tone shift
had taken place, and that this was the cause of the divergence between the
traditional markings and the spoken language. This choice of solution is highly
unlikely in a time before the development of modern linguistic theory and research.
The other option was therefore chosen: The tones were reinterpreted in such a
way that they agreed with the now current pronunciation as closely as possible,
which meant interpreting them exactly in the reverse: Although the cause of the
change was a shift of the /H/ tone to the left, in a large number of tone classes the
pitches that are the result of this shift look like an exact reversal of the tones before
the shift. (This is for instance so in classes 1.1, 1.2, 1.3, 2.1, 2.2, 2.4 and 3.1, 3.3, 3.6,
3.7, but not in classes 2.3, 3.2, 3.4, and 3.5.) Reversing the value of the tones thus
results in a considerable agreement between the tone dots added to the old
documents and the post-shift spoken language, albeit not in all of the tone classes.6
A reversed interpretation of the traditional markings would have been acceptable
only if there was no longer a living tradition to contradict it. In other words, there
must have been a period in which the traditional tone systems had completely
collapsed before such a radical change in the tradition could have occurred. From
the description by Yūkai in Shishō shiki it can be seen that in the Kogi Shingon
school there indeed was such a period of confusion, and that it had definitely set in
by the late 14th to early 15th century.
At the end of the 15th century we find the first description in which the tones start
to be defined on the basis of tone height again. This is in the work Siddhām jūhas-
shō kikigaki by In’yū 印融. This time however, the differentiation of the tones on
the basis of tone height appears in a new way, which made it possible to reconcile
the old tone dot markings with the post-shift tone system. Like Yūkai, In’yū
belonged to the Kogi Shingon school on Kōya-san, but he was active about seventy
years later.
12.3.1 In’yū 印融 (Kogi Shingon school)
The following tone chart of the six tones (六声分別図) is included in the work
Siddhām jūhas-shō kikigaki 十八章聞書7 (1482), by the monk In’yū 印融 (1435-
1519), who belonged to the line of Shinpan. The manuscript is in In’yū’s own
handwriting. The tone chart has the note: 以下私図 (‘Now follows my personal
6 In order to solve the discrepancies in classes 2.3, 3.2, 3.4, and 3.5, the ataru device and later
the ideai rules were developed. (See sections 12.4 and 13.1.2.)
7 The word ‘Siddhaṃ’ is here written in the Siddham script.
chart’), which may indicate that the renewed introduction of tone height differences
in this tone theory was In’yū’s own initiative.
Figure 1: The six-tone chart in Siddhām jūhas-shō kikigaki
Source: Mabuchi, 1962:682
I have kept the translation of the descriptions of the tones as neutral as possible in
(3), by translating the characters 平 and 上 as ping and shang, even though I believe
that in several instances in this text these characters most likely mean ‘(low) level’
and ‘high’ respectively. In other words, I believe that (at least part of) the tones had
already been reinterpreted into their ‘modern’ shape by this time. In these cases I
have added an alternative translation between brackets. It is only in the description
of the light ping tone that I have translated 上 directly as ‘high’ as this translation is
strongly suggested by the juxtaposition of the characters 上 and 下.
The use of okurigana (cf. 時キ) and particles in this text is unusual. The use of
the particle wo in 平平声ヲ成ル and 上声ヲ成ル is even ungrammatical. (I also
wonder if there are special Shingon writing habits in play here such as ナ而 as
shorthand for なって and 而ル二字 as an abbreviation of 而ルニ二字?) As a
consequence the translation of this text is sometimes tentative.
3 Siddhām jūhas-shō kikigaki on the six tones
平声ハ初後共ニ平ニナ而モ Although the ping tone is beginning
and ending both ping (level)
初メ少シ上リ終リ少シ夷レリ in the beginning it is slightly rising
and at the end slightly falling
軽平声ハ初後下デ中カ上ナル声也 The light ping tone is a tone
that is beginning and ending low,
and in the middle high
而モ二字連声之時キ平平声ヲ成ル but when two graphs (kana signs?)
follow each other it becomes a ping+ping
(level+level) tone
上声ハ初後共ニ上昇ヘ8ル声也 The shang tone is a tone that rises up
both in the beginning and the end
去声ハ初ハ平ニナ而 The qu tone is a tone
that is in the beginning ping (level)
絳9リユカミ上ル声也 and bends upwards at the end
入声ハスクム10声也 The ru tone is a short tone
惣テ入声ハ定レル声只文字ノ終リニ All ru tones are determined tones,
but the graphs that come at the end
of the word are
備ルヲ二フツクチキノ五字ヲ一云 the five fu, tu, ku, ti, ki (?)
軽入声ハ初後共ニ上但シスクミ声也 The light ru tone is shang (high)
both at the beginning and the end,
but it is a short tone
8 I suspect that ヘ is a copying mistake for セ.
9 The character 絳 must be a mistake for 終.
10 Sukumu has the meanings: ‘crouch’, ‘cower’, ‘shrink at’.
而ル二字連声ノ時キ However, if two graphs (kana signs?)
follow each other,
上得テ上声ヲ成ル也 shang/the first graph (?) becomes
shang tone
Although the formulation stays as close as possible to the traditional descriptions,
we can see how the tones were starting to be redefined into a form in which the new
pronunciation could be reconciled with the traditional tone dot markings. The
clearest indication for this can be seen in the completely new rising tone contour that
is attributed to the qu tone. These changes confirm that the Kyōto tone shift had
indeed transformed the traditional tone system, as was already evidenced by the loss
of all tonal distinctions in Shishō shiki.
Not all deviations from what was traditional in the text above are easy to
interpret: The descriptions of the ping tone and the light ping tone look identical to
me, and I do not know what to make of the pronunciation in case of words written
with two kana graphs, or – as a possible alternative interpretation of these passages –
of Sino-Japanese words made up of two characters.
12.4 The annotation ‘ataru’ in the 16th century
Other evidence for the fact that a reinterpretation of the traditional tone systems was
taking place can be found in the ‘ataru’ annotations. When the leftward tone shift
occurred, the tone of Chinese loanwords shifted along. As a result, the tones of the
characters with which these words were written no longer always fitted the actual
pronunciation in the spoken language. Those tone classes in which the post-shift
pitch did not agree with the new interpretation of the tones most conspicuously
included the tone classes that had started with sequences of ping tones.
The tone that the characters of such Sino-Japanese loanwords had (in Kan-on
and in Go-on) was actively studied by the monks and could not easily be dismissed.
It was therefore for Sino-Japanese readings that a special annotation was devised to
solve the discrepancy between the traditional tone markings and the post-shift tones.
In these cases the pitches were now adjusted by adding the note ataru.
Such markings can for instance be found in Daisho hyaku-jō dai-san-jū yomikuse
大疏百条第三重読曲 (1563), an early example of ‘new’ rongi material from the
period of rongi revival at the Negoro-ji. (The ‘old’ rongi tradition at the Negoro-ji
had died out during the 15th century). This pronunciation guide contains a collection
of yomikuse or ‘special pronunciations’ of Sino-Japanese words from the text
Daisho hyaku-jō dai-san-jū 大疏百条第三重, which originally stems from the
Negoro-ji in the 14th century (Sakurai, 1976:402). To a number of Go-on loanwords
that had originally been marked only with (‘reversed’ Go-on style) tone dots, the
annotation ataru was now added.11
Ataru (literally ‘to hit (the mark)’, ‘to strike’) is a musical term used in shōmyō
and also in the recitation of the Heike monogatari 平家物語. In the Heike
monogatari it can nowadays apparently function to express low pitch (Okumura
1981:298), but according to Okumura (1981:294-295) it originally expressed a
raised pitch, followed by a lowered pitch. According to the dictionary of Buddhist
music (Amano et al. ed., 1995), ataru or atari is a melodic shape used mainly in
Tendai shōmyō. It means making the voice jump up strongly and sharply, and – after
cutting off the voice for a moment – either continuing at the original tone height or
changing the tone height, usually to a lower pitch.
Konishi (1948:520) interprets the term as ‘beginning high’ and according to
Kindaichi (1943a, 1951), ataru means: “pronounce this high and the following
syllables low.” Whatever the exact evolution of this term may have been, it seems to
have expressed a change from [H] to [L] pitch, stressing or ‘hitting’ the syllable with
the [H] pitch. According to Sakurai (1976:402) the readjustments to the
contemporary pronunciation look as in (4). The part with the annotation ataru has
been marked with an apostrophe.12
4 The change in tone indicated by the note ‘ataru’
taking the standard reconstruction as a starting point
Tone of Tone in standard Realization
Character reconstruction indicated by ‘ataru’
平去 ' >
平去 or 入去 ' >
平上 or 入上 ' >
If we reverse the tones in the left-side column in accordance with Ramsey’s
reconstruction as in (5), we see how the post-shift tones developed as the result of
the leftward tone shift. The tonal changes indicated by means of the ataru markings,
which seemed to lack any motivation in (4), now show that they have a
straightforward origin.
11 Works that deal with the pronunciation of the Siddham script concentrate on the Kan-on tones.
This work on the other hand, deals with the pronunciation of Sino-Japanese loanwords in
Buddhist texts. These were predominantly of the Go-on type. The ru tone for instance, has the
same tone as the ping tone, which is typical of the reversed marking system used for Go-on
readings. (In Kan-on, the ru tone had the same tone as the shang tone.) See section 11.1.2.
12 The dots in (4) indicate moras. In the first example the reading of the initial character has one
mora. In the other examples, all of the character readings have two moras.
5 The change in tone indicated by the note ‘ataru’
taking Ramsey’s reconstruction as a starting point
Tone of Tone in Ramsey’s Realization
Character reconstruction indicated by ‘ataru’
平去 >
平去 or 入去 >
平上 or 入上 >
12.5 Summary
The collapse of the traditional tone theories, followed by a reinterpretation of the
traditional tones, and the employment of the ataru device all point to the disruption
of the tone system of the standard language of Kyōto by the leftward tone shift.
On the basis of Shishō shiki from Kōya-san, one can probably say that the shift
must have occurred at least some time before the end of the 14th century: Yūkai’s
ideas must have had a period of time to develop after the shift made the traditional
tones unworkable. (Perhaps 1390, at the latest, is a reasonable date.) The last tone
descriptions from the Tendai school that express a traditional view of the tones in
chapter 7 also date from the late 14th century.
However, this does not mean that the change in Kyōto could not have occurred
earlier. The fact that emperor Chōkei still adheres to a pre-shift tone system in
Sengen-shō (1381) may be due to the fact that he had grown up
outside of Kyōto, among people who had left the capital even earlier: Emperor Go-
Daigo was exiled from the capital and fled to the mountains of Yoshino in 1336,
where emperor Chōkei was born in 1342. It does suggest however, that the shift had
not yet taken place in Kyōto in 1336, when emperor Go-Daigo and his entourage
fled to Yoshino.
Gyōa, in turn, was of advanced age when he wrote Kana moji-zukai sometime
after 1363. If the new Kyōto style pronunciation developed after 1336, he would
have been too old to acquire it as he grew up.13
The rongi collection Rongi-shō 論議抄, which was published on Kōya-san in
1376, seems to represent one of the last records of the traditional performance of
13 Apart from this, I have not been able to find out where Gyōa grew up. Gyōa’s grandfather,
Minamoto Chikayuki, of whom more biographical data are available, served the military
government in Kamakura for many years, so all we can say is that it is possible that Gyōa’s
father grew up in Kamakura and spoke the dialect of that area. In later ages, people would
probably have made a conscious effort to use a Kyōto type tone system, even if they originally
came from different parts of the country. In case of Gyōa however, the Kyōto tone shift may
have been too recent to have already given rise to an awareness of the distinctive difference
between the tone system of Kyōto and that of other areas, including a clear preference for the
former.
these ceremonies, as according to Sakurai, by 1407 the rongi ceremonies on Kōya-
san had died out. These materials still reflect a restricted, pre-shift tone system, but
again, this does not have to mean that the shift had not yet taken place in Kyōto
itself by that time. Kōya-san was quite far removed from Kyōto, and the monastic
chanting traditions could have survived for decades even after changes had occurred
in the spoken language of the capital.
All of this makes it hard to pinpoint the time of the Kyōto tone shift. Considering
the relevant dates, which are summarized in (6), a reasonable assumption is probably
that the /H/ tone restriction reached central Honshū in the mid-13th century, and that
the leftward tone shift in Kyōto followed in the mid to late 14th century.
6 Chronological list of materials and events relevant to the dating
of the major tonal changes in central Japanese
11th c.  1080-1100  Compilation of the Tosho-ryō-bon of Ruiju myōgi-shō
12th c.  1100-1180  Compilation of the kōeki lineage of Ruiju myōgi-shō
13th c.  1278       /H/ tone restriction in manuscripts of Kokin waka-shū
         1293       Birth of Gyōa
14th c.  1333-1334  /H/ tone restriction in Moji-han
         1336       Emperor Go-Daigo goes into exile
         1342       Birth of Emperor Chōkei at Yoshino
         1363-1370  /H/ tone restriction in Gyōa’s Kana moji-zukai
         1376       /H/ tone restriction in Rongi-shō collection published at Kōya-san
         1381       Emperor Chōkei (in exile) finishes Sengen-shō, which is still based on a pre-shift tone system
15th c.  1407       Kogi Shingon rongi tradition on Kōya-san lost
         1409       Evidence of a complete collapse of the traditional tone system in Shishō shiki (Kogi Shingon)
         ±1450      Shingi Shingon rongi tradition at the Negoro-ji lost
         1482       Reinterpretation of the tones in Siddhām jūhas-shō kikigaki (Kogi Shingon)
16th c.  ±1530      Evidence for the spread of post-shift Kyōto type tone in Mōtan shichin-shō
         1563       First ataru annotation used in Daisho hyaku-jō dai-san-jū yomikuse (Shingi Shingon school at the Negoro-ji)
13 The Japanese tone theories after the shift
13.1 The shōmyō revival in the 17th century
After the period of decline, which started in the 14th century, there was a remarkable
revival of both shōmyō and Siddham studies in the Edo period. This revival started
towards the end of the 16th century, as is evident from the publication of a work like
Daisho hyaku-jō dai-san-jū yomikuse 大疏百条第三重読曲 in 1563. It was
especially the Shingon school that flourished.
An important figure in the revival of Siddham studies in this school was Jōgon
淨厳 (1639-1702).1 Patronized by the Shogun’s court and by the rich merchants, its
ritual was conducted with great pomp and circumstance and mantras and dhāraṇī
figured conspicuously in all its ceremonies (Van Gulik, 1953:122). In this period the
tone descriptions crystallized into their final shape.
13.1.1 Kannō 観応 (Shingi Shingon school)
The time of shōmyō revival saw the publication of many reference guides to the
correct recitation of the rongi ceremonies. These guides are called rongi-sho 論議書
or ‘rongi books’. One of the most famous of these guides is Bumō-ki 補忘記
(1687/1695).
Bumō-ki is famous because it contains a vocabulary of Chinese loanwords
(mostly of the Go-on type) as well as Japanese words with tone markings, arranged
in the order of the Japanese syllabary. The tones are marked by means of
fushihakase musical notation marks.
Two editions of this work exist: The first edition (1687) consists of two parts,
‘upper’ 上 (30 chō) and ‘lower’ 下 (18 chō), and a postscript (1 chō).2 The second
edition (1695) falls into three parts, of which the first two parts ‘heaven’ 天
(consisting of 38 chō plus a 2 chō introduction) and ‘earth’ 地 (consisting of 38 chō)
correspond to the ‘upper’ part of the first edition. The third part ‘man’ 人 (consisting
of 16 chō plus a postscript of 1 chō) corresponds to the ‘lower’ part of the first
edition.
Although the ‘upper’ part (1687) or ‘heaven’ and ‘earth’ parts (1695) contain
vocabulary words (myōmoku 名目) in the order of the Japanese syllabary, the
1 His work Shittan sammitsu-shō 悉曇三密鈔 (1682), has been mentioned in section 6.1.1 as the
source of the claim that Biao and Jin can be identified as Biao Xingong 表信公 and Jin Lixin
金礼信.
2 A chō 丁 corresponds to the front and the back of a western book page.
‘lower’ (1687) or ‘man’ (1695) part contains whole sentences and phrases quoted
from the rongi ceremonies themselves and uses a different kind of fushihakase.3
In the vocabulary part, the fushihakase signs used are: ∖, −, ┐ and └, the last two
being used in Sino-Japanese words only. Although this is not explicitly mentioned in
Bumō-ki, there is no reason to doubt that the fushihakase are of the goin hakase 五音
博士 type and that the marks express the following tones of the Chinese pentatonic
scale: The sign ∖ expresses the tone chi 徴 [H], the sign − expresses the tone kaku 角
[L], the sign ┐ expresses a sequence of chi-kaku 徴角 [F] and the sign └ expresses a
sequence of kaku-chi 角徴 [R].4 (The goin hakase system was the preferred musical
notation system in the Shingon school in the 17th century.)
From the tone chart (Figure 1) that is included in Bumō-ki, it can be seen that ∖
[H] was used to express the shang tone, − [L] to express the ping tone, ┐ [F] to
express the light ping tone and └ [R] to express the qu tone.5 In practice, however,
these correspondences are often disrupted by the ideai rules (see section 13.1.2).
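The correspondences between marks, notes and tones described above can be restated compactly. The following Python sketch is only an illustration: the names GOIN_SCALE, FUSHIHAKASE and contour are invented here, and the reduction of each mark to a pitch label simply follows the interpretation given in the text (chi as the higher note, kaku as the lower). It is not a description of how Bumō-ki itself presents the system.

```python
# Illustrative sketch only; all names are hypothetical, the data restate the text.

# The five notes of the pentatonic scale, from low to high (see footnote 4).
GOIN_SCALE = ["kyū", "shō", "kaku", "chi", "u"]

# Fushihakase marks in the vocabulary part of Bumō-ki: the note(s) each mark
# is taken to express, and the tone it stands for in the tone chart.
FUSHIHAKASE = {
    "∖": (("chi",), "shang"),
    "−": (("kaku",), "ping"),
    "┐": (("chi", "kaku"), "light ping"),
    "└": (("kaku", "chi"), "qu"),
}


def contour(notes):
    """Reduce a sequence of one or two notes to H, L, F or R.

    Assumption: within this notation chi counts as high and kaku as low,
    so only these two notes are handled.
    """
    level = {"chi": "H", "kaku": "L"}
    if len(notes) == 1:
        return level[notes[0]]
    first, second = (GOIN_SCALE.index(n) for n in notes)
    return "F" if first > second else "R"


for mark, (notes, tone) in FUSHIHAKASE.items():
    print(mark, "-".join(notes), contour(notes), tone)
```

Keeping the scale order separate from the table of marks makes it easy to see that [F] and [R] are derived from the direction of movement between two notes and are not stipulated independently.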
Figure 1: Tone chart with fushihakase marks in Bumō-ki
Source: Facsimile edition of Bumō-ki Genroku-ban (Hakutei-sha, 1962)
3 According to Hattori (1942:137-138), the fushihakase used in the quotation part are of a
‘completely different kind’ and have to be studied separately. An interesting aspect of Bumō-ki,
which has been pointed out by Sakurai (1976: 381), is that the tone system reflected in the
quotation part is of a different type than that of the well-known vocabulary part. (See sections
14.4 and 14.5.)
4 Chi and kaku are two tones of the Chinese pentatonic scale. The names of the tones from low to
high are: 1 kyū 宮, 2 shō 商, 3 kaku 角, 4 chi 徴, 5 u 羽.
5 The fact that for Japanese words, only ∖ and − are used, agrees with the following remark in
Bumō-ki where it is said that words written by means of kana (which I interpret as referring to
Japanese words, as Sino-Japanese words are written by means of characters) only have the
ニハ テ ノ ノミ ノ
tones shang and ping: 仮名 有二上平二声 一 無二 入去 声一 也 “Kana only have the two
tones shang and ping, and no ru or qu tones.”
We see that the bifura-ten and the fu-nisshō-ten are also included in this chart. About
the bifura-ten it says that this tone is used in the Tendai and Hossō schools, and it is
described as follows: 是同二上声一高唱レ之 “Chant this high just as the shang tone.”
The tone system that forms the background of Bumō-ki is clearly of the new style
that started to develop at the end of the 15th century, after the traditional tone system
had been disrupted by the Kyōto shift.
The fu-nisshō-ten does not have the note that it is only used in other schools, and
fu-nisshō-ten are indeed used in Bumō-ki. Although the Shingon school is usually
said to use a six-tone system, Konishi concludes that the Shingi Shingon school
adhered to a seven-tone (or rather, quasi seven-tone) system. Konishi suggests that
the desire to distinguish itself from the Kogi Shingon school may have played a role
in the adoption of the fu-nisshō-ten in the Shingi Shingon school (1948:514).
In the chart, the hakase marks are arranged around the character, next to the
location of the tone dots. In the actual text they are added to the left side only, and
the marks should be read moving outward from the point closest to the character.
The tone system recorded in the vocabulary part of Bumō-ki was initially
regarded as the Kyōto type tone system of the early Edo period (Hattori, 1942), as
Bumō-ki was written by Kannō 観応 (1650-1710), who belonged to the Chizan
branch 智山派 of Shingi Shingon and lived at the Chishaku-in 智積院, the head
temple of the school in Kyōto.6
Although Mabuchi (1958) later argued that Bumō-ki reflects the tone system of
Wakayama and Nara prefecture of the Muromachi period, rather than the tone
system of Kyōto of the early Edo period,7 the difference between the tone system
recorded in the vocabulary part of Bumō-ki and the tone system of the contemporary
population of Kyōto seems to have been negligible: In one case it is mentioned as
something peculiar in Bumō-ki that ‘lay people’ pronounced a certain word with a
different pitch than the users of Bumō-ki themselves, which probably means that in
general the pitches used by the monks and by the 17th century lay population of
Kyōto agreed.
Bumō-ki also contains the remark (already quoted in section 5.2.2 of part I) that
the kana ‘ma’ of yama ‘mountain’ is pronounced hikusi ‘low’, when this word is
used in isolation, while in compounds like nisiyama ‘western mountain’ and
higasiyama ‘eastern mountain’ the kana ma is pronounced takasi ‘high’. This is the
first instance of the use of the adjectives hikusi and takasi in a Japanese tone
description.
In the case of the word hikusi, the katakana reading notes ヒクシ have been added
to the character 卑 and not to the character 低. (The character 卑 is nowadays
usually read as iyasii ‘humble’, but as we have seen at the end of section 7.3.1.2, in
6 The Chizan branch of Shingi Shingon developed in the Chishaku-in in Kyōto, after the
traditional centre of the school, the Negoro-ji on Mount Negoro in Wakayama prefecture, had
been destroyed by Toyotomi Hideyoshi in 1585.
7 Mabuchi’s theory will be addressed in more detail in section 14.4.3.
the Heian period, this was already one of the characters chosen to write Japanese
hikisi ‘low’.) The word takasi is written with the character that is still used today,
namely 高, with the kana adjective ending シ (高シ) and not with the character 昂.8
We see that as late as the 17th century, the characters 低 and 昂 were not
employed to express ‘low’ and ‘high’ in Japan. A different set of expressions is used
here for the first time to describe the tones, written by means of a different set of
characters. There clearly is a new approach in the tone theories visible here,
consistent with the idea of a reinterpretation of the traditional tone systems after the
Kyōto tone shift.
13.1.2 Pitch readjustment rules after the shift: ideai 出合
As mentioned in section 12.4 on the ataru annotations, when the leftward tone shift
occurred, the tone of Chinese loanwords shifted along. As a result, the tones of the
characters (and the tone dots added to them) of Chinese loanwords no longer fitted
the pronunciation of the spoken language. These discrepancies were initially
resolved on an ad hoc basis by the ataru notes.
As the tonal changes that took place in the Japanese language are regular, the
divergence between the tone dots and the post-shift tone of Chinese loanwords is
predictable. During the revival of shōmyō and Siddham studies in the 17th century,
the readjustments that were needed to bring the old markings in agreement with the
contemporary tone system were systematized into a set of rules.
With these rules the Buddhist monks tried to capture the regular relationship
between the tone of the characters of these Sino-Japanese loanwords and the actual
realization. As the fact that certain tones ‘met’ (i.e. were combined) was seen as the
cause of the changes (adjustments were after all only needed in case of certain tone
sequences), the rules were called ideai 出合 or ‘the meeting (of tones)’.
The best-known set of ideai rules is the one formulated by Kannō in Bumō-
ki.9 It is clear that Kannō’s view of the tones agrees with the tone system that is
typical of the Shingon school in the Edo period (the tone system that is regarded as
traditional in the standard theory). The ideai rules in Bumō-ki for instance include
the following passages:
1 Excerpts from the ideai rules in Bumō-ki
入声軽者同二上声一自然高出声也 [glosses: ト ク ニ ト ク ル]
    Just like the shang tone, the light ru tone is a tone that naturally comes out high
平入同様出合也 [glosses: ハ シ ニ スル]
    The ping and the ru tones meet in the same way
平上移時 [glosses: ヨリ ニ ル]
    When a ping tone is followed by a shang tone
平声字二字仮名字 [glosses: ノ ノ ナラハ]
    if the character with the ping tone is a two-kana word
則平声字押上声字平唱 [glosses: ノ ヲ シテ ノ ヲ ラニ フ]
    press the ping character down10 and chant the shang character as ping
若又平声字一字仮名字 [glosses: ノ ノ ナラハ]
    But if it is a single-kana ping character
則本声任用レ之 [glosses: ノ ニ ュ]
    use the original tone
上平下高言 [glosses: ミヲ ラニ モヲ ク フ]
    and pronounce the first character as ping/level and the second character high
平去移 [glosses: ヨリ ニ リ]
    When a ping tone is followed by a qu tone
入去移者 [glosses: ヨリ ニ ルハ]
    and when a ru tone is followed by a qu tone
供同上高下卑言 [glosses: ニ ク ミヲ ク モヲ キク フ]
    pronounce them both with the first character high and the second character low
8 The iroha syllabary with tone dot markings in Bumō-ki is introduced as 伊呂波高下声 [glosses: ノ ノ].
9 See Konishi for other descriptions of ideai rules from the Shingon school (1948:515). Konishi
also includes an example of ideai rules from the Tendai school (1948:507) but the dating of this
text is uncertain.
The remark that the ping and the ru tones behave in the same way as far as the ideai
rules are concerned, has already been quoted in section 11.1.2. It shows that the
ideai rules were developed for Go-on loanwords in Buddhist texts. (The fact that the
fu-nisshō dot in the tone chart in Bumō-ki is marked with the same fushihakase as
the heavy ping and heavy ru tones indicates the same.)
There is no awareness that the ideai rules were needed because of an earlier tone
shift that had destroyed the one-on-one relationship between the tones of the
characters and their actual realization. The rules however, still betray the cause
behind the tonal change: The need to specify the number of kana of the first
character for instance, is best explained by a leftward tone shift: A two-kana ping
tone character followed by a shang tone character ( in Middle Japanese in
Ramsey’s reconstruction) becomes after the shift, but a one-kana ping tone
character followed by a shang tone character ( in Middle Japanese in
Ramsey’s reconstruction) becomes after the shift.
It is telling that when ping or ru are followed by a qu tone, there is no need to
specify the number of kana of the first character. This has to do with the fact that the
tone contour of the qu tone, following behind the ping tone, guaranteed that
these examples started with at least two [H] moras in a row in Middle Japanese,
10 The character 押 with which the verb osu is written here expresses a downward movement, as
opposed to the character 推 which expresses a forward movement. Konishi (1948:520)
therefore interprets osu (‘to push down, to press down’) as ‘ending low’, i.e. .
even if the ping tone was only one mora in length: + or +. As a result,
even words with a one-kana first character would still start with [H] pitch after the
leftward tone shift, and no rule as to the number of kana of the first character needed
to be formulated.
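The conditional structure of the excerpts in (1) can be made explicit in a small sketch. What follows is a paraphrase in Python, not a reconstruction of the full set of ideai rules: the function name, the argument names and the descriptive labels are all invented for illustration. Treating ru exactly like ping in every case is an assumption based on the remark that the two tones "meet in the same way"; the excerpts state the shang rule for ping only.

```python
# Illustrative sketch only: a paraphrase of the excerpts in (1), not a full
# model of the ideai rules. Function, argument and label names are hypothetical.


def ideai(first_tone, second_tone, first_kana):
    """Return (reading of first character, reading of second character).

    first_kana is the number of kana used to write the first character.
    None means the quoted excerpts do not cover the combination.
    """
    # Assumption: ru is treated like ping throughout, following the remark
    # that the ping and the ru tones "meet in the same way".
    if first_tone == "ru":
        first_tone = "ping"
    if first_tone != "ping":
        return None

    if second_tone == "shang":
        if first_kana == 2:
            return ("pressed down", "chanted as ping")
        if first_kana == 1:
            return ("ping/level", "high")
        return None

    if second_tone == "qu":
        # The excerpts specify no kana count for this combination.
        return ("high", "low")

    return None


print(ideai("ping", "shang", 2))
print(ideai("ping", "shang", 1))
print(ideai("ru", "qu", 1))
```

Written this way, the asymmetry discussed above is visible in the code itself: the kana count of the first character is consulted only when a shang tone follows, never when a qu tone follows.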
13.1.3 Keichū 契沖 (Kogi Shingon school)
The Shingon monk Keichū (1640-1701) was a pupil of Jōgon, one of the people who
revived Siddham studies after the decline in the Muromachi period. In his work Waji
shōran-shō 和字正濫抄 (1693) Keichū established the spelling of Japanese that was
in use until 1945 (Rekishi-teki kanazukai 歴史的仮名遣). Keichū, who went to
Kōya-san at the age of 13, belonged to the Kogi Shingon tradition just as Yūkai and
In’yū (Konishi, 1948:520).
2 Waji shōran-shō on the six-tone system
平声は声の本末あがらずさがらず
    The ping tone is without rise or fall from the beginning to the end,
一文字のごとくして長し。
    and is as long as one (kana) graph
上声は短くしてすぐにのぼる。
    The shang tone is short and rises immediately
去声はなまるやうに声をまはす。
    In the qu tone stretch the voice and make it sound dull
入声ハフツクチキの音ありて切直なり
    The ru tone includes the sounds fu, tu, ku, ti or ki and is abruptly cut off
平声と入声とに軽あり。
    The ping tone and the ru tone have a light variant.
当りて居るなり。
    They are raised and then put down.
In his description of the ping tone Keichū seems to refute some of the notions
expressed by his predecessor In’yū in Siddhām jūhas-shō kikigaki. Viewed as a
whole, In’yū’s tone description strongly suggests that he regarded the ping tone as
level, but In’yū nevertheless adhered to the use of traditional terms like agaru 上ル
and taru 夷ル in his description of this tone.11 Keichū on the other hand, leaves no
doubt as to the tone contour of the ping tone: I see the remark that the ping tone is
‘as long as one (kana) graph’ as a means to stress the opposite of the rising-falling
description by In’yū, as such a complicated tone contour would have to be spread
out over more than one mora.
The addition of the expression sugu ni ‘immediately’ in his description of the
shang tone goes back to Annen’s description of this tone: Sugu ni is a common
reading for the character 直 , which is used in line 3 of Annen’s text in the
description of the shang tone: 上声直昂. I think Keichū’s use of this term sheds
11 In’yū’s description is in fact a mixture of the terminology that was traditionally used for the
ping tone and the light ping tone.
light on how the reinterpretation of this tone as [H] instead of [R] was rationalized:
A ‘straight/short rising tone’ becomes a tone that ‘rises immediately’, a tone – in
other words – that can be regarded as [H], which agreed with the contemporary
pronunciation, and at the same time did not disagree with Annen’s record.
Koe wo mawasu (literally ‘making the voice turn (about)’) in the description of
the qu tone is a term that is used in singing which involves lengthening the vowel of
a syllable. In the expression namarigoe the word namaru, which is also used in the
description of the qu tone, has the meaning ‘dull’, ‘indistinct’, ‘rough’.
The expressions namaru and koe wo mawasu as such do not contain concrete
information on the tone contour of the qu tone in Keichū’s description and in similar
descriptions in the later ideai rules. 12 There can be no doubt however, that in
Keichū’s tone system the qu tone had a rising tone contour. This can be seen, for
instance, in the examples from Waji shōran-shō in (3).
The terminology used in the description of the light ping and light ru tones is
complicated: literally suuru means ‘to fix’, ‘to set down’, ‘to put (down)’, ‘to place’.
In the musical terminology used in the recitation of the Heike monogatari 平家物語
the term sue 居 (すゑ) nowadays functions to express lowered pitch (Okumura
1981:298). 13 From the context in which the different terms are used, Konishi
(1948:519) concludes that suuru ‘to put down’ refers to a [HL] contour, in contrast
to mawasu ‘to turn about’ which refers to a [LH] contour.
Ataru, as we have seen in the previous chapter, is a musical term used in shōmyō
and also in the recitation of the Heike monogatari. It is used to express a change
from [H] to [L] pitch, stressing the syllable with the [H] pitch. I interpret the
description of the light ping and light ru tones (atarite suuru nari) as ‘raise and then
put down’, which amounts to a falling tone contour.
Although Keichū’s tone description is not overly explicit, the conclusion can
probably be drawn that the ping tone was [L], the light ping tone was [F], the shang
tone was [H] and the qu tone was [R:]. The ru tone was short and ended in hu tu ku
ti ki, and the light ru tone was [F] just as the light ping tone. (The falling tone
contour of the light ru tone is the only point in which Keichū’s tone system deviates
from Kindaichi’s view of the Middle Chinese tones.)
Such an interpretation agrees with examples from Waji shōran-shō where
Keichū applied the Chinese tones to a number of words in the Kyōto type tone
system (Kindaichi 1973:198).
12 Inoue (1928:77) for instance (who wrote before the standard theory on the nature of the Middle
Chinese tones in Japan had developed) interpreted this passage as 高より低へ急降してナマ
ルやうに呼ぶ ‘pronouncing it using a dull voice and a sudden fall from high to low’.
13 The recitation of the Heike monogatari is thought to have developed under strong influence of
shōmyō recitation. (See section 14.7.2.)
3 Japanese words used as examples of the Chinese tones in Waji shōran-shō
Class  Example                   Modern Kyōto pitches   Tone indicated by Keichū
1.1    樋 hi ‘water pipe’        :                      shang
1.2    日 hi ‘sun’               :                      ping
1.3    火 hi ‘fire’              :                      qu
B      蹴 ke ‘kick’              ?                      shang
1.2    毛 ke ‘hair’              :                      ping
1.3    食 ke ‘food’              ?                      qu
2.1    端 hasi ‘edge’                                   shang
2.2    橋 hasi ‘bridge’                                 ping
2.4    箸 hasi ‘chopsticks’                             qu
2.1    釣 turu ‘to fish’                                shang
2.2    弦 turu ‘string’                                 ping
2.5    鶴 turu ‘crane’           :, -                   qu
The 17th century tone system of the Kyōto area was already very close to the modern
Kyōto tone system.14 A comparison with the modern Kyōto pitches of these nouns in
the right-most column shows that Keichū’s view of the Middle Chinese tones, as
expressed by means of the Japanese example words, agrees with that posited by
Kindaichi if we regard Keichū’s ping tone as referring to the light ping tone.
In material from this period the difference between ping and light ping is usually
not maintained and the ping tone can express [F] tone as well as [L] tone. In this
way, words that started with sequences of ping tones in the old tone dot material
could be read in a way that agreed with the contemporary pronunciation: For
instance class 2.3 平平 as instead of , and class 3.4 平平平 as
instead of .15
Keichū uses two different systems of adding tone dots to Japanese words. For
some words the tone dots express the pitch of each separate kana sign, just as in the
14 Kindaichi (1951:653), for instance, has compared the modern Kyōto pitches of a number of
Sino-Japanese loanwords with the fushihakase marks added to them in Bumō-ki. The tone in
modern Kyōto is close to the 17th century Kyōto type tone expressed by the fushihakase, but it
has evolved somewhat. Earlier ' (still to be found in Bumō-ki) shifted to modern Kyōto
' in case of examples like Kannon ‘goddess of mercy’ and yuuzu ‘merger, flexibility’.
A similar development can be seen in tone classes 2.4 '- > '- and 3.6 '- >
', '-. (The dialect of Kōchi, which has preserved a more archaic stage, has
retained the Bumō-ki type pitches in these cases.)
15 This situation is fundamentally different from the situation in pre-shift materials. In these
materials, at some point, the shang tone started to be used to express both shang and light ping.
At the same time, in some texts original light ping tone dot markings were mistaken for
ordinary ping tone dots by later scribes. These scribal errors however, do not mean that the
ping tone dot functioned as a marker of both ordinary ping and light ping in the contemporary
marking system. (See section 9.4.1.)
old tone dot material. Other words are marked with ‘tone dots in the new style’ or 新
式声点. In this system the tone dot added to the first kana of words that are written
with more than one kana expressed the tone contour of the word as a whole, while
the tone dot added to the other kana expressed the tone of that syllable only, so that
was marked as 上上, as 平平, as 去上 and as 去平.
13.1.4 Anonymous (Kogi Shingon school)
The following work, Go-on shishō kaigō hi-shō 呉音四声開合秘抄, stems from the
Kogi Shingon school in the Edo period (Konishi, 1948:481, 516). The author and
date of compilation are unknown, but it must date from after the end of the 17th
century, as it contains quotations from Bumō-ki.
The title of the work may give the impression that the indicated value of the
tones pertains to the Go-on tones, but this work also contains the remark that the
ping, shang, qu and ru tones used in Go-on are determined on the basis of the Kan-
on tones (Mabuchi, 1996). In other words, the indicated pitches still pertain to the
Kan-on tones. The tones are indicated by means of goin hakase marks.
4 Go-on shishō kaigō hi-shō
light ping ┐ 徴角 chi-kaku [F]
heavy ping − 角 kaku [L]
shang ∖ 徴 chi [H]
qu └ 角徴 kaku-chi [R]
light ru ∖ 徴 chi [H]
heavy ru ⁄ 商 shō extra [L]
According to Konishi (1948), in the most widely used system in the Shingon school,
the pitch of the ru tone was kaku (just as that of the heavy ping tone) instead of shō.16
13.2 Diversity in the tone theories of the 18th century
In the 18th century there was a considerable diversity in the tone descriptions.
Among these descriptions we find, for the first time, tone systems that agree with the
modern Sinologist view of the Middle Chinese tones: The ping tone is level, the
shang tone is rising and the qu tone is falling. (This is close to, but not the same as
the tone system of the Siddham scholars until the 14th century, as in their system the
ping tone had been falling.)
According to Kindaichi, the Sinologist view developed in the 18th century,
independently from older traditions, and rapidly gained authority in the 19th century.
16 The fushihakase chart in Bumō-ki (cf. section 13.1.1) also adheres to such a system.
Wenck (1953) thinks that this may have happened under the influence of Qing
phonology. The diversity in this period may therefore be partly due to the influx of
new ideas from China. Another factor is that people from outside the clergy now
also showed an interest in the tones.
13.2.1 Monnō 文雄 (1700-1763, Jōdo school)
Monnō (or Bun’yū), who belonged to the Jōdo school, became a monk at the age of
14. He applied his knowledge of modern Chinese to the study of the rhyme tables.
(As usual, Monnō took to the study of Chinese with an eye to the correct
pronunciation of the dhāraṇī.) In Waji taikan-shō 和字大観抄 (1754) Monnō
dismisses Keichū’s description as a mistake and lists as ping the group that Keichū
gave as shang, and as shang the group that Keichū gave as ping. There is no
difference in case of the qu tone. Taking the pitches of the Kyōto dialect again as the
starting point, this would mean that in Monnō’s view, ping was level, shang was
falling and qu was rising.
Kindaichi explains Monnō’s view as developed under the influence of his study
of Chinese (more precisely, the Chinese dialect of Hangzhou 杭州), which he
regarded as the standard language. Monnō was one of the first to acknowledge that
of the modern Chinese dialects the dialect of Hangzhou agreed best with the
categories of the rhyme tables, even though it was this pronunciation that had always
been designated with the pejorative term ‘Wu-pronunciation’ (Wenck, 1953:242,
261).
13.2.2 Ise Sadatake 伊勢貞丈 (1715-1784)
Ise Sadatake in Ansai zuihitsu 安斎随筆 lists Keichū’s ping group as qu, his qu
group as shang and his shang group as ping. Ise Sadatake’s view therefore agrees
with that of the Sinologists: ping is level, shang is rising and qu is falling. According
to Kindaichi (1951), this new concept of the tones had already developed somewhat
earlier (it can for instance also be found in the work Gengo kuninamari 言語国訛 of
1698 or 1758), but Ise Sadatake was the first to apply a tone system of this kind to
the tones of Japanese.
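The relabellings attributed to Monnō and Ise Sadatake can be checked mechanically. The sketch below is an illustration only, with invented variable and function names. It takes as its starting point the contours of Keichū's three example groups in the Kyōto dialect as implied by the discussion above (Keichū's shang group level, his ping group falling, his qu group rising), applies each scholar's exchange of labels, and prints the resulting value of each tone.

```python
# Illustrative sketch only; variable and function names are hypothetical.

# Contours of Keichū's three example groups in the Kyōto dialect, as implied
# by the discussion above (Keichū's ping taken as the light ping tone).
KEICHU_GROUPS = {"shang": "level", "ping": "falling", "qu": "rising"}

# Each scholar relabels Keichū's groups: Keichū's label -> new label.
MONNO = {"shang": "ping", "ping": "shang", "qu": "qu"}
ISE_SADATAKE = {"ping": "qu", "qu": "shang", "shang": "ping"}


def relabel(groups, mapping):
    """Return the contour attached to each tone name after relabelling."""
    return {mapping[old]: contour for old, contour in groups.items()}


print("Monnō:", relabel(KEICHU_GROUPS, MONNO))
print("Ise Sadatake:", relabel(KEICHU_GROUPS, ISE_SADATAKE))
```

Under these assumptions the two mappings yield the values stated in the text: level, falling and rising for ping, shang and qu in Monnō's case, and level, rising and falling in Ise Sadatake's.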
13.2.3 Motoori Norinaga 本居宣長 (1730-1801)
According to Kindaichi, Motoori Norinaga still expresses the same view of the tones
as Keichū, in which ping is low level or falling (in case of the light ping tone), shang
is high level and qu is rising. Ōhara (1951:11), on the other hand, includes a quotation
from Motoori’s work, which shows that Motoori’s position was more complex.
Even though Motoori adopts most of Keichū’s examples, he has replaced 食 with
気. Furthermore, his interpretation of the qu tone in particular in fact agrees with
the Sinologist view, and not with Keichū’s view. Keichū’s description of the qu
tone as ‘namaru yō ni koe wo mawasu’ is ambiguous enough to leave room for an
interpretation of the qu tone as falling instead of rising, but Keichū’s example words
all clearly have rising tone contours in the post-shift Kyōto tone system.
5 Motoori Norinaga’s comments on Keichū’s examples
契沖が云ハく、
    According to Keichū,
平上去の三声を一音言にていはゞ、
    if you express the tones ping, shang and qu in words of one syllable,
日(ヒ)は平,樋(ヒ)は上,火(ヒ)は去なり、
    hi: ‘sun’ is ping, hi: ‘water pipe (gutter)’ is shang and hi: ‘fire’ is qu,
毛(ケ)は平,蹴(ケ)は上,気(ケ)は去なり、
    ke: ‘hair’ is ping, ke: ‘kick’ is shang and ke: ‘indication, trace’ is qu.
二音の言は橋(ハシ)は平、端(ハシ)は上、箸(ハシ)は去なり、
    As to words of two syllables, hasi ‘bridge’ is ping, hasi ‘edge’ is shang, hasi ‘chopsticks’ is qu,
弦(ツル)は平,釣(ツル)は上,鶴(ツル)は去なり。
    turu ‘a string’ is ping, turu ‘to fish’ is shang and turu ‘crane’ is qu.
此(の)説の如くにて、
    According to this theory,
平は上(アガ)らず下(サガ)らず平(タヒラカ)なる声,
    ping is a level tone that neither rises nor falls,
上は上(アガ)る声,去は下(サガ)る声なり。
    shang is a rising tone, and qu is a falling tone.
It is known that Motoori regarded the dialect of Kyōto as the standard language, and
it is puzzling that his interpretation of this tone is in contradiction with the actual
pitches of the dialect of Kyōto. Ōhara has suggested that Motoori was perhaps
trapped into this contradiction, influenced by the meaning of the characters 平, 上
and 去. Apart from the possibility that Motoori may just have quoted two current
theories without noticing the contradiction between the two, I am unable to provide
a better explanation.
13.3 The Buddhist tone theories in the 19th century
The Confucianist scholars Ise Sadatake and Motoori Norinaga supported the
Sinologist view, and the Jōdo scholar Monnō was influenced by his knowledge of
Chinese dialects. Despite the fact that (especially in secular circles) a new view of
the tones had developed, 19th century materials from the esoteric schools show no
signs of being influenced by these ideas.
13.3.1 Anonymous (Shingi Shingon)
The following work, Shisei narabi ni ideai dokushō shiki 四声并出合読誦私記
(1844) continues to adhere to the tone system that had been established in the 17th
century.
At first it may seem as if this anonymous work stems from the Tendai tradition,
as it mentions the bifura and fu-nisshō tones. As we have seen however, these are
also mentioned in Bumō-ki, which stems from the Shingi Shingon school (Konishi,
1948:518). In Bumō-ki the bifura tone is mentioned, but it is said that it is only used
by the Tendai and Hossō schools. In Shisei narabi ni ideai dokushō shiki too, the
tone is mentioned, but it is said that it is not normally used. This makes it more
likely that the text stems from the Shingi Shingon tradition.
6 Shisei narabi ni ideai dokushō shiki on the quasi eight/seven-tone system
平声    ヒキクユルヤカニス          Ping tone: make it low and gentle
平声軽  必ズスユル                  Light ping tone: definitely put it down
上声    高シ                        Shang tone: high
毘富羅  上声ト同。常ニハ用ヰズ      Bifura: the same as the shang tone. It is usually not used
        上声ト通ズル故ナリ          as it coincides with the shang tone
去声    マワス                      Qu tone: make it turn about
入声    ヒキク急ニス                Ru tone: make it low and hurried
入声軽  上声ト同                    Light ru tone: the same as the shang tone
不入声  入声ノ重ニ同                Fu-nisshō: the same as the heavy ru tone
        但、引カナヲ入声ニツカフ    but use a drawn-out kana for the ru tone17
The fact that the fu-nisshō is equated with the heavy ru tone here indicates that this
work’s main concern was with the correct pronunciation of Go-on loanwords in
Buddhist texts, as in Go-on, the fu-nisshō was regarded as a heavy tone (cf. sections
11.1.2 and 13.1.1). The reference to ideai in the title (which deals with the tone of
Chinese loanwords in Buddhist texts) points to the same.
13.3.2 The Tendai tone system after the shift: Rai Tsutomu’s study
of the Kan-on shōmyō of the Tendai school
The descriptions in this and the previous chapter mostly stem from the Shingon
school. After a breach in the tradition in the 14th century, the Tendai tradition was
restored in the 15th century, but during the 17th century much of the tradition was
again lost. The Tendai tradition was only restored again in the 19th century.
17 Although I have not come across the word 引カナ (hikikana?) elsewhere, the frequent use of
the character 引 in works on the Siddham script to indicate vowel length (cf. 去引) suggests
that it refers to the long vowel that was the result of the loss of intervocalic -h- in this tone.
In order to give an idea of the way in which the tones came to be regarded in
Tendai circles after the shift, I will introduce here Rai Tsutomu’s study of the
melody of a Kan-on shōmyō text (Kaihon 戒品) from the 19th century Tendai Ōhara-
ryū collection Gyosan sō-sho 魚山叢書.
Rai (1951/1989) compared the tone value of the hakase marks with the tone
classes of the characters to which the hakase marks had been added. The results
were presented by Rai as in Figure 2. (The hakase marks have been given codes; f5 =
falling, e4 = low level, e5 = high level, r4 = rising.)
Figure 2: Rai’s statistics on the fushihakase in Kaihon
Source: Rai (1951/1989:398)
The shōmyō text in question is said to have been introduced in Japan by Ennin in the
9th century. Rai therefore regards the melody of this text as stemming from Ennin’s
time; a faithful transmission of a melody that goes back a thousand years. Based on
the statistics above, Rai reconstructed the Late Middle Chinese tones as they must
have been introduced in Japan in the 9th century as in Figure 3:
Figure 3: Rai’s reconstruction of the Late Middle Chinese tones
based on the fushihakase in Kaihon
Source: Rai (1951/1989:402)
Rai’s tone system comes across as a plausible form of Late Middle Chinese, and is
said to be based on a melody that goes back a thousand years. There are, however, a
number of points that need to be considered before relying too much on Rai’s
reconstruction.
First of all, it is highly unlikely that the melody of this text goes back to
Ennin’s time: Rai’s study is based on a text from the Ōhara branch, the only
surviving Tendai shōmyō school. The Ōhara branch went through several reforms
and breaches in the tradition. The present-day tradition is the result of a restoration
in the 19th century, and – as mentioned – the text that Rai uses stems from a 19th
century edition.18
Another problematic point is that the tones as reconstructed by Rai cannot be
found in this form anywhere in the melody of the text. The above reconstruction can
only be arrived at if one allows Rai’s method, which is as follows:
If a tone is marked in more than one way, Rai reasons that the original tonal
value of this tone must have been approximately the average of the different
markings combined. For instance: The light ping tone is marked with [F] tone
contours most of the time, but is also marked with [L], and a few times with [H]
tone marks. The heavy ping tone on the other hand, is marked overwhelmingly with
[L] tone marks, with only a few [F] tone markings. From this, Rai concludes that
both tones were originally falling, but that the heavy ping tone was lower and less
abruptly falling than the light ping tone.
In a similar way, Rai concludes that the light shang tone was higher and less
abruptly rising than the qu tone, because the qu tone is almost exclusively marked
with [R] tone marks, while the light shang tone is marked with [R] as well as with
[H] tone marks. The light ru tone is marked almost exclusively with [H] tone, but is
reconstructed as slightly falling by Rai because there are a few markings with [F]
tone (even though there are just as many markings with [L] tone). Finally, the heavy
ru tone is marked 50 times with [H] tone, 20 times with [L] tone and 4 times with
[R] tone, from which Rai concludes that this tone must have been lower than the
light ru tone, and possibly slightly rising.
However, based on the majority of the markings in Rai’s statistics, it is also
possible to conclude that the tone system prevalent in the Tendai school in the 19th
century was as in (7). These tones do not differ much from the way in which the
tones were reanalyzed in the Shingon school after the shift. This is not surprising as
it is very likely that the scholars who revived Tendai shōmyō in the 19th century
were influenced by the contemporary chanting tradition of the Shingon school.
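The difference between the two readings of Rai’s statistics can be illustrated with a small Python sketch. It uses only the counts for the heavy ru tone quoted above (50 [H], 20 [L], 4 [R]). The numerical heights assigned to the marks, and the reduction of a contour to its midpoint, are assumptions made purely for the sake of illustration; they are not taken from Rai’s paper, and the function blended_height is only a rough stand-in for his averaging procedure.

```python
from collections import Counter

# Counts for the heavy ru tone as quoted in the text above.
HEAVY_RU = Counter({"H": 50, "L": 20, "R": 4})

# ASSUMPTION: hypothetical mean heights on a 0-1 scale.
# Contour marks ([R], [F]) are reduced to their midpoint.
ASSUMED_HEIGHT = {"L": 0.0, "H": 1.0, "R": 0.5, "F": 0.5}


def majority_mark(counts):
    """Majority reading: return the most frequent mark."""
    return counts.most_common(1)[0][0]


def blended_height(counts, heights=ASSUMED_HEIGHT):
    """Averaging reading: frequency-weighted mean of the assumed heights."""
    total = sum(counts.values())
    return sum(n * heights[mark] for mark, n in counts.items()) / total


if __name__ == "__main__":
    print("majority mark:", majority_mark(HEAVY_RU))
    print("blended height:", round(blended_height(HEAVY_RU), 2))
```

Under these assumptions the majority reading simply yields [H], whereas the averaging reading yields a value somewhat below a full [H], which is the kind of intermediate value that leads to a reconstruction not attested as such anywhere in the melody.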
An interesting difference from the old Tendai tone systems shown in chapter 7,
and the Kogi Shingon and Shingi Shingon tone systems shown earlier on in this
chapter, is that ru tones tend to be marked with [H] tone, irrespective of the initial.
But according to the Buddhist encyclopedia Hōbō girin (Demiéville, 1930), in the
Shingon school as well, the relation between the tone of a character and the way in
which it is chanted has become vague, and there is no (longer) a distinction between
18 If the melody of this text really went back to Ennin, we would expect the tone system to be
very close or almost identical to the tone systems of Isei and Chisō. The way in which the
characters are divided into the light (yin) and heavy (yang) registers depending on their initials
however, does not agree with Isei and Chisō at all, who had a light and heavy distinction in all
of the four tones.
light and heavy ru. (The ru tone is most often chanted with a short, falling tone
contour, which agrees with the value that Keichū (Kogi Shingon) gave to the light ru
tone; cf. section 13.1.3.)19
7 Alternative interpretation of Rai’s statistics on the fushihakase in Kaihon
light ping [F]
heavy ping (includes the jidaku category) [L]
light shang (includes the jidaku category) [R] or [H]
heavy shang merged with qu
light and heavy qu [R]
light ru (includes the jidaku category) [H]
heavy ru mostly [H], sometimes [L]
13.4 The Edo period tone theories and modern scholarship
In the previous chapter (section 12.3.1) we have seen how in Siddhām jūhas-shō
kikigaki 十八章聞書, a text from the end of the 15th century, the tones started to
be reinterpreted. The tone descriptions in this work are still relatively obscure, but
the fact that the ataru annotation is used in Daisho hyaku-jō dai-san-jū yomikuse 大
疏百条第三重読曲 of 1563 is positive evidence that this work already assumes a
reversed value of the traditional tones.
In the 17th century, Buddhist phonological study flourished again. In this period
we find – for the first time – tone descriptions that are detailed enough to conclude
that the tone system that can be regarded as typical of the Shingon school in the Edo
period had taken shape: ping = [L], light ping = [F], shang = [H], qu = [R], light ru
= [H], heavy ru = [L].
In the article on the shōmyō of the Shingon school in the Buddhist encyclopedia
Hōbō girin (Demiéville, 1930), a similar type of modern Shingon tone system is
described, although the distinctions appear to have been simplified somewhat.
This article influenced Mei’s reconstruction of the Middle Chinese tones (1970).
Although Mei mentions that the faithfulness of the transmission of the Middle
Chinese tones in the Japanese shōmyō tradition has not been established, and that
evidence from this tradition should be handled with care, his reconstruction
nevertheless relies heavily on the Shingon tone theory of the Edo period.
It is because of the description in Hōbō girin, for instance, that he reconstructs the
Middle Chinese shang tone as [H], and the Middle Chinese qu tone as [R].
19 This treatment of the ru tone may be based on old remarks (such as by Shinren), stating that the
Kan-on ru tone was always light. Also, Kan-on ru tone loanwords are the only Kan-on
loanwords that have a relatively clear reflex in the modern dialects: They tend to belong to
class 2.1, irrespective of the initial (cf. section 11.1.2), and this class has [HH] pitch in the
post-shift Kyōto type tone system.
Pulleyblank (1978) on the other hand, who did not trust the reliability of the shōmyō
tradition either, decided not to use the evidence from the Hōbō girin, and in his
reconstruction, the Middle Chinese shang tone is [R] and the qu tone is [F].
Mei’s reconstruction, the article in Hōbō girin, and Rai’s paper (1951/1989) on
the shōmyō recitation of the Tendai school have been cited by Vovin (1997:116) as
arguments against Ramsey’s theory. In addition, Starostin (1991:137) dismisses
Ramsey’s theory because of a later paper by Rai (1976) on Tendai shōmyō recitation.
As we have seen however, the Tendai tone system probably goes back no further
than the 19th century period of shōmyō restoration in this school. What is more, it is
likely that the Tendai scholars at the time took the tone system of the Shingon
school as their example.
The Shingon tone system of the Edo period agrees in every detail with the
standard reconstruction of the Middle Japanese tones. This has been regarded as
confirmation of the historicity of the reconstruction. In my opinion however, the
similarity is the result of something different:
The Middle Japanese tone system of modern scholarship, and the Edo period
tone system of the Shingon school are both reconstructions. The 20th century
scholars who set out to reconstruct the Middle Japanese tone system had one central
assumption in common with their Edo period predecessors. This was the idea that
the tone system reflected by the tone dots must have resembled the contemporary
tone system of Kyōto. The Edo period scholars were not concerned with historical
tone change, of course. But because of this common starting point, the reasoning of
both followed parallel paths towards similar outcomes.
The core idea of the standard theory, namely that the Middle Japanese tone
system must have resembled the tone system of Kyōto, can thus be traced back to
the 17th century. The reconstructed tone system that the standard theory has
produced derives from this assumption.
14 Fushihakase material
Kindaichi’s theory and Ramsey’s theory both concentrated on the reconstructed
tonal value of the tone dots. Although tone dot material is no doubt the most
important source on the history of the Japanese tone system, the story does not end
there. There is also another body of historical material on the Japanese tones, and
this is material in which the pitches of Japanese have been marked by means of
fushihakase 節博士 musical notation marks.1 The most famous example of this type
of material is the ‘accent dictionary’ contained in the vocabulary part of Bumō-ki
補忘記, but other examples are for instance the musical scores of the Heike
monogatari 平家物語 and Nō music (yōkyoku 謡曲).
There is no controversy surrounding the interpretation of late fushihakase
material such as contained in the vocabulary part of Bumō-ki, and the relatively
modern Kyōto type tone expressed in it can be explained by Ramsey’s theory and
agrees well with it: The patterns expressed by the fushihakase marks in these works
agree with the patterns that result if we take Ramsey’s reconstruction of the Middle
Japanese tone system as the starting point and then shift the tones one syllable to the
left.2
14.1 The interpretation of older fushihakase material is uncertain
There is, however, also older historical fushihakase material with a quite different
tone pattern, which in its current interpretation cannot be reconciled with Ramsey’s
theory: If one accepts Ramsey’s theory this means that some early fushihakase
material, such as the Daiji-in-bon of Shiza kōshiki 大慈院本四座講式 and a number
of musical scores of the rongi ceremonies from the Middle Ages, will have to be
reinterpreted.
The current interpretation however, is not beyond doubt, as not much is known
about many of the older types of fushihakase. All but the two most recent types of
fushihakase (for the Tendai school this is the so-called meyasu hakase 目安博士,
1 The origin of the term fushihakase is not completely clear. The word fushi means ‘melody’, and
the word hakase means ‘doctor’, ‘learned person’. The use of the term hakase may be related to
the fact that the teachers of Chinese phonology in Japan were called on-hakase or ‘doctors of
sound’.
2 Bumō-ki represents a stage that is somewhat more archaic than the tone system of the modern
dialect of Kyōto. The later change in class 3.4 for instance, in which the /H/ tone was shifted one
more syllable to the left, onto the initial syllable (' > ') had not yet occurred. (For
examples of Bumō-ki type tone, see section 14.5 and section 2.2 of part I.)
and for the Shingon school this is the goin hakase 五音博士) had been long extinct
when 20th century linguists started to be interested in the historical material on the
tones contained in the old musical scores.
In Japanese traditional music in general, one does not find detailed notations. In
Japan, notation was merely a memory aid. The concentration was on aural and
technical skills with as few visual distractions as possible.
The rote teaching tradition in Japan made a detailed notation superfluous and
sometimes even undesirable: Because there has always been a strong guild system in
Japanese music, and a tradition of ‘secret’ pieces, notation systems were fostered
which would preserve compositions for future generations, but this happened in an
outline form, so that only the initiated could translate the notation into actual sounds.
A discussion of Japanese musical notation systems up to the late 19th century must
deal with numerous systems, which were different not only for each genre or
musical instrument, but also for the various guilds of performers within each
tradition.
As a result, the study of fushihakase material is an extremely complicated field,
which requires knowledge of what kinds of fushihakase existed, by which group
they were used in which period, and which material was written by which group and
when.
In the case of the tone dots, it is at least clear which tone is meant, even if the tonal
value of the tone is uncertain. In the case of early examples of fushihakase, on the other
hand, it is often not even certain what kind of system the marks represent: A system
based on absolute tone height distinctions, or a system that visually represented the
movements of the voice. And even if we know that the system was based on tone
height, in different texts, the different angles of the strokes can represent different
tones. (In some systems therefore a ‘key-figure’ or hakase chart had to be attached
to the piece, which showed which stroke represented which tone.)
As a consequence, there is the risk that the reconstructed tonal value of the marks
is influenced by already existing ideas on the history of the Japanese tone system
that the modern researcher has. (This is true for proponents of the standard theory as
well as for me.) In order to illustrate the fact that the correct interpretation of the
earlier fushihakase material is far from clear, and that there is room for
interpretations of this material that are in agreement with Ramsey’s theory, I will
start with a general overview of the development of the different types of
fushihakase marking systems.
14.2 The historical development of the fushihakase
Shōmyō music is usually notated by means of fushihakase. In origin this is a means
of visually indicating a melody, using a combination of straight and/or curved lines.
Although much is still uncertain about the development of these notation marks,
they seem to have originated from very simple visual representations of the
movement of the melody that were used in conjunction with the tone dots. The
oldest extant example of such early hakase notation dates from the year 1034
(Numoto, 1991).
Scholars agree on the fact that these lines did not indicate absolute tone, as
would a real music notation system, but were neumatic, i.e. they indicated the rising
or falling of a melody, but not absolute tone height or duration.3 They were a
memory aid to a melody learned directly from a teacher. In most shōmyō,
fushihakase notation represents the outline of the melody, and does not represent
meter in terms of a standard temporal unit. The duration of the tones is only vaguely
indicated by the relative lengths of sections of the fushihakase lines.
Kataoka Gidō (1972) explains that there are actually two separate origins to the
fushihakase systems in use in Japan: One of those origins was the neumatic script
mentioned above. It is usually assumed that the earliest forms of neumatic vocal
music notation in Japan developed in Buddhist music.4 Giesen (1977) however,
stresses that the lack of old documents makes it impossible to determine whether this
oldest type of vocal music notation developed first in shōmyō or in the vocal genres
of court music (gagaku 雅楽). (Old manuscripts are usually not well preserved in
the world outside the monasteries.) What is certain, is that the neumatic marks used
in the two traditions were very similar (see section 14.2.7). The second origin was a
script that was devised much later, as a real musical notation system from the start,
which expressed absolute tone.
Notation systems belonging to the first category (shōten hakase 声点博士, tada
hakase 只博士, fu-hakase 譜博士, zu-hakase 図博士 and meyasu hakase)
concentrate on a visual representation of the melodic patterns (senritsu-kei 旋律形)
that connect the central tones; they are descriptive in nature. Notations belonging to
the second category (Tanchi’s goin hakase and Kakui’s goin hakase or hon-bakase
本博士) concentrate on expressing the height of the central tones, and are more
prescriptive in nature.
Later versions of the two different systems, such as the modern meyasu script of
the Tendai school – which is still in use – tried to combine the strong points of the
two systems by visually representing the tone contours of the melodic shapes, but
also adding a mark indicating on which precise tone height the tone started. In the
3 The term is derived from Greek νεῦμα ‘hint, nod’.
4 Shōmyō has a number of points in common with Gregorian chant: the use of unaccompanied
men’s voices with a special intonation, a preference for free rhythm, two distinct types of chant
(one marching with the text, the other using long melismas), and notation in neumes.
According to Kaufmann (1967, 1974) these resemblances are not accidental as there may have
been contact in ancient Central Asia between Judaism, Nestorianism and Buddhism. The
Japanese fushihakase marks however, developed gradually in Japan itself, originally as
extensions of the tone dots. Even though Tibetan Buddhist chant also uses neumes, it is
unlikely that the two systems have a common origin: The Chinese sources (which form the
origin of Japanese shōmyō) do not contain hakase notations, and neither do the oldest Japanese
sources.
systems that concentrated on absolute tone, such as the well-known goin hakase
system of the Shingon school, additional symbols were used in order to indicate the
inflections of the voice in the senritsu-kei. (These are usually abbreviations of the
names of the different vocal formulae.)
Below, I will introduce the two different fushihakase systems and their many
sub-types in the general order in which they developed.
14.2.1 Ko-hakase (‘old hakase’): Shōten hakase, tada hakase, fu-hakase
The oldest extant examples of hakase notation can be found in the Kongō-kai giki
金剛界儀軌 of 1034 (Arai, 1996:5). These hakase notations were used in
conjunction with shōten: The sounds of Sanskrit in dhāraṇī, mantras and hymns were
transcribed by means of Chinese characters to which shōten were added in order to
indicate the tone of the characters. Very occasionally simple neumatic notations
were added to the shōten in order to indicate a certain melodic movement. This can
be regarded as the earliest form of shōten hakase 声点博士 or ‘tone dot’ hakase.
The development of fushihakase is related to the correct pronunciation of the
magical formulae and thus closely connected to the Siddham tone theories.
From the first half of the 11th century, hakase notation appears to have been used
as a type of memorandum. By the first half of the 12th century, single pieces were
transmitted from teacher to student in written form with full hakase notation, and
later these materials were compiled into collections of shōmyō pieces, or shōmyō-
shū 声明集.
Shōten hakase is said to have been used first in Tendai shōmyō, and later in
Shingon shōmyō. It fell out of use in Tendai shōmyō from the mid 12th century. In
Shingon shōmyō its use continued for a longer time. It was for instance used when
Shukaku 守覚 of the Ninna-ji 仁和寺 in Kyōto compiled a two-volume collection of
shōmyō pieces entitled Hossoku-shū 法則集 (1180).5
Figure 1 shows a 13th century copy of this work, owned by the Ninna-ji.6 Shōten
hakase continued to be used in the Ninna-ji Sōō-in school 仁和寺相応院流 until
the 14th century.
5 A shōmyō collection in which only pieces that are used in one ceremony are included and
presented in the order in which they occur in the ceremony is called a hossoku-shū 法則集.
Other shōmyō collections are called shōmyō-shū.
6 It can be seen from this example that the hakase marks and the tone dots do not always agree.
During the Kamakura period the agreement between the tone dots and the hakase marks
gradually disappeared in most shōmyō genres, as hakase marks started to be added for purely
musical demands. (See section 14.2.6.)
Figure 1: Shōten hakase from Hossoku-shū (1180)
Source: Arai (1996:5)
Another example of shōten hakase from the Shingon school has been shown in
section 7.3.2.1, in the eight-tone chart with tone dots as well as hakase marks
included in Shittan-jiki kikigaki 悉曇字記聴書, by Dōhan (1178-1252), who was the
teacher of the famous scholar Shinpan. It can be seen that a horizontal hakase mark has
been added to ping tone dots, and a diagonal (backslash) or vertical hakase mark to
shang tone dots. Qu tone dots are marked with a diagonal mark
in the opposite direction of the mark added to the shang tone (forward slash), or with
a hakase mark that has a kink in it and appears to consist of a horizontal mark
followed by a diagonal (forward slash) mark.
The next oldest system is known as tada hakase 只博士.7 In this system, the
marks, which were normally added to the left-hand side of the text, no longer took
the tone dots as their starting point. The name tada hakase or ‘simple’ hakase
referred to the fact that only the general movement of the melody was indicated by
simple lines. This type of notation still corresponds closely to shōten hakase: It
indicated the movement of the voice, and lacked a standard criterion for indicating
pitch, but it was already more complex and more precise.
An example of hakase notation that belongs to this type has been shown in section
7.3.3.1, in Shittan shogaku-shō 悉曇初学抄 (late 14th century) by Kenpō 賢宝 of the
Kogi Shingon school in Kyōto. The ping tone is expressed by means of a horizontal
mark, the shang tone by means of a diagonal mark (backslash), and the qu tone by
means of a z-shaped mark. Although the marks no longer have a tone dot as their
7 Both shōten hakase and tada hakase are sometimes called ko-hakase, but the term ko-hakase is
also often used as a kind of generic term for all kinds of types of hakase notations that later
became obsolete.
starting point, this hakase type is still very close to shōten hakase. The mark for the
qu tone for instance, is still on the right-hand side of the character, as this is the side
on which the qu tone dot was located.
Figure 2 is an example of such notations (referred to as goma-fu 胡麻譜 or
‘sesame notation’ because of the shape of the strokes) added to a kōshiki text. Figure
3 is an example of Tendai tada hakase:
Figure 2: Tada hakase (goma-fu)
Source: Malm (1959:263)
Figure 3: Tendai tada hakase
Source: Arai (1986:16, adopted from Yoshida 1972:278)
In addition, the height of the central tones could be indicated by means of numbers
that were added to the tada hakase marks. These numbers corresponded to the
strings of the zither (sō no koto 箏の琴) or the finger positions of the flute.8 This
system is also called fu-hakase 譜博士 (‘musical score’ hakase), although the term
fu-hakase is also sometimes used as an equivalent of tada hakase. It continued to be
used in the Tendai school until the end of the 14th century. In the Shingon school it
started to be used later and was replaced by zu-hakase.
Examples of Shingon fu-hakase (left) and of Tendai fu-hakase (right) are given
in Figure 4.
Figure 4: Examples of Shingon and Tendai fu-hakase
Source: Hōbō girin, Demiéville, P. (1932:107)
Shōmyō reached its peak from the 12th century on, and in the 13th century extensive
revision of music theory, notational systems and collections of hakase manuscripts
was undertaken.
14.2.2 Zu-hakase
Zu-hakase 図博士 (‘graph’ or ‘chart’ hakase) is a type of notation in which a hakase
chart or ‘key figure’ is used as a standard model to indicate the tone height of the
initial tone or shutton 出音 by identifying it with the position and the angle from
which the hakase line leaves the character to which it has been added. Although it
was possible to indicate the first tone height of a character in the text, for the
following tone heights the system followed the same neumatic principle as the tada
8 According to Hōbō girin (Demiéville, P. ed., 1930) this is the musical notation system that was
in use in the Shingon school after the big conference at the Ninna-ji of 1145-1150.
hakase system. (Like fu-hakase, zu-hakase can be regarded as an improved form of
tada hakase.)
Lines pointing vertically up or down, horizontally to the left or right, or at 45°
angles to the horizontal-vertical axes were added at three points (top, middle
and bottom) on either the right or the left of the character (but usually on the left side), and
pitches were assigned to these lines in such a way that the pitch range from L to H
ran from the bottom to the top of the character.
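The way in which a key figure works can be sketched as a simple lookup table. The Python fragment below enumerates the stroke positions described above (three points of attachment, eight directions at 45° intervals) and reads the initial tone from a chart. The encoding of the directions as clockwise bearings, and the three chart entries in EXAMPLE_KEY, are hypothetical and serve only as an illustration; they do not reproduce the chart of any actual school.

```python
from itertools import product

# Points of attachment on one side of the character, from low to high.
ANCHORS = ("bottom", "middle", "top")

# ASSUMPTION: directions encoded as clockwise bearings, 0 = straight up.
DIRECTIONS_DEG = tuple(range(0, 360, 45))


def all_slots():
    """All (anchor, direction) positions a stroke can take on one side."""
    return list(product(ANCHORS, DIRECTIONS_DEG))


def read_shutton(key_figure, anchor, direction_deg):
    """Look up the initial tone for a stroke; None if the chart is silent."""
    if (anchor, direction_deg) not in all_slots():
        raise ValueError("not a possible stroke position")
    return key_figure.get((anchor, direction_deg))


# HYPOTHETICAL chart, for illustration only (not a historical key figure).
EXAMPLE_KEY = {
    ("bottom", 270): "low",
    ("middle", 270): "mid",
    ("top", 270): "high",
}

if __name__ == "__main__":
    print(len(all_slots()), "possible positions per side")
    print(read_shutton(EXAMPLE_KEY, "top", 270))
```

The sketch also makes clear why the same stroke can stand for different tones in different texts: the stroke positions are fixed, but each chart assigns its own values to them.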
Zu-hakase was widely used in the Shingon school. The first example of a key
figure below is of the Daigo school 醍醐流 (13th century), the second and third
examples are of the Ninna-ji Sōō-in school (Saihō-in 西方院 branch) from Sonpen
尊遍 (13th century) and Senga 宣雅 (14th century), the fourth and fifth examples are
hakase charts for the ryo and the ritsu scale from the Saidai-ji Sōō-in school 西大
寺相応院流 (14th century).9
Figure 5: Examples of Shingon zu-hakase charts
Source: Arai (1986: 18-19)
The examples in Figure 6 are hakase charts for different shōmyō pieces in use in the
Nanzan-Shin 南山進 school (as shown in Chūga no ki 忠我記 of 1495).
Although the initial tone (shutton) of each character in the text is thus quite clear,
the following pitches in melismatic phrases are not, and for this reason, at first,
instrumental tablature signs (especially those of the transverse flute used in
gagaku)10 and later the names of the five tones of the pentatonic scale (from low to
9 The Nara temple Saidai-ji was re-established in the 13th century as a temple of the Shingon
school. In the 14th century the shōmyō of the Saihō-in branch of the Ninna-ji Sōō-in school
were introduced in this temple, where a new school was formed, producing shōmyō collections
with a distinct notational style. It can still be seen that the chart for the ritsu scale at the Saidai-
ji (nr. 5) has been based on the hakase chart designed by Senga (nr. 3) for the Ninna-ji Sōō-in
school, except that the numbers of the strings of the sō no koto (6 六, 7 七, 8 八, 9 九, 10 十,
11 斗, 12 為) have been added to the names of the five tones.
10 The Nanzan-Shin school and the Daigo school used flute tablature, and the Ninna-ji Sōō-in
school used the numbers of the strings of the sō no koto.
high: 1 kyū 宮, 2 shō 商, 3 kaku 角, 4 chi 徴, 5 u 羽) still had to be added as
supplementary signs.
Figure 6: Examples of zu-hakase charts from the Nanzan-Shin school
Source: Arai (1986:19)
Following the establishment of a music theory for shōmyō by Tanchi 湛智 (1163-
±1240) of the Ōhara school, who introduced many revolutionary practices in Tendai
shōmyō, there was a shift away from the use of flute tablature signs towards the use
of the names of the tones of the five-tone scale in the notation of both Tendai and
Shingon, and the goin hakase or ‘five-tone’ hakase notation was conceived.
14.2.3 The early goin hakase system of Tanchi
In the Tendai school an early form of goin hakase 五音博士 invented by Tanchi
himself appears to have been used, in which the pitches of the melodic line were
indicated precisely by means of the direction and angle of the hakase lines. The tone
indication by means of instrumental tablature signs of zu-hakase, was replaced by a
notation using the five tones of the pentatonic scale. These tones were represented
by the direction and the slant of the notational line with reference to the horizontal
and vertical axes. This notation made it possible to read the pitch of the hakase
notation automatically, without having to resort to the types of supplementary signs
mentioned above. Goin hakase was thus devised as a true musical notation system,
indicating absolute tone. The lines that indicate the five tones are arranged in
clockwise order from low to high pitch around the character. Figure 7 shows an
example of Tanchi’s hakase chart for the ritsu scale, as well as (to the right) an
example of what the system looks like when used in practice.
Tanchi’s notation was hard to read; the angles of the lines were so close together
that the notation looked crowded, and at times hakase lines indicating the same pitch
were written in different directions because of their relation to the pitches of
adjacent tones. Perhaps for this reason this notation never became popular,11 and
seems to have disappeared completely by the end of the 14th century (Arai, 1996:
23). It was replaced by the meyasu hakase notation which came into use in the
Tendai school at the beginning of the 15th century.
Figure 7: Tanchi’s goin hakase
Source: Arai (1986:20)
14.2.4 Meyasu hakase
The system of neumes placed on the left-hand side of the text that is used in the Tendai
school is called meyasu hakase 目安博士 (‘easy-on-the-eyes’ hakase). Meyasu
hakase is an improved, more explicit version of tada hakase, and can be classified as
belonging to the same category in terms of the method by which the notation
indicates pitch. The early example shown in Figure 8 for instance, still shows a
strong resemblance to the tada hakase system used by Kenpō of the Kogi Shingon
school.
Just as in tada hakase, the marks represent the tone contours visually, and do not
indicate absolute tone. An important difference is that meyasu hakase also tries to
visually express the melodic patterns such as yuri (tremolo) and other special
11 Tanchi’s own pupil Shūkai 宗快 of the Tendai Ōhara school 大原流 already tried to revive the
earlier tada hakase system, by replacing the old flute tablature signs with the five tone names in
his notation used in Gyosan mokuroku 魚山目録 (1238).
inflections of the voice. Meyasu hakase derives its name (‘easy-on-the-eyes’) from
this visual aspect of the marks.
The development of meyasu hakase is traditionally attributed to Ryōnin 良忍
(1071-1132), but the only known example of an autograph in Ryōnin’s hand is a
single sheet with a hakase type that seems to be an isolated example. (Interestingly it
is added to the right-hand side of the text instead of the left-hand side that became
standard in meyasu hakase.)
An early form of meyasu hakase, which can be regarded as the ancestor of the
present-day system of the Ōhara school, can be found in Kakuen’s 覚淵 Shōmyō-shū
nikan-shō 声明集二巻抄 of the mid 13th century (included in the Gyosan sō-sho 魚
山叢書, copy of 1853). The marks still occasionally occur on the right-hand side of
the characters (cf. Figure 8).
Figure 8: Early meyasu hakase
Source: Arai (1996:7)
Although the 12th century is mentioned as the time of origin of meyasu hakase, it
was actually only in the early 15th century that meyasu hakase replaced the earlier
tada hakase and the goin hakase invented by Tanchi as a hakase for common use in
the Tendai school.
In later forms of the meyasu script, the angle of the first hakase stroke, closest to
the character, and the location from which it leaves the character indicate the initial
tone or shutton, a principle adopted from zu-hakase. The following strokes in the
hakase line remained neumatic. Figure 9 shows the modern (early 19th century)
meyasu hakase used in the Tendai school, including the attached key-figure that is
needed to determine the shutton.
536 14 Fushihakase material
Figure 9: Modern meyasu hakase
Source: Arai (1986:18)
14.2.5 The goin hakase or hon-bakase system of Kakui
In 1270, Kakui (1237-?) of the Nanzan Shin school invented a new goin hakase style
of notation, which is also known as hon-bakase 本博士 (‘essential notation’).
Figure 10: Kakui’s goin hakase
Source: Arai (1996:19)
Figure 10 shows how the lines that indicate the five tones are arranged in clockwise
order from low to high pitch around the character, with the center of the vocal
register aligned to the left side. The directions of the lines for adjacent pitches
differ from each other by 45°.
The Tendai goin hakase system had failed to gain popularity because the angles
between adjacent pitches were narrower and the lines of the notation were arranged
around the character in a complicated fashion. Kakui’s hakase notation on the other
hand, was revolutionary, in that it could be read easily, no matter how long the
notational lines were. The examples in Figure 11 are from the Shingon shōmyō
collection Gyosan taigai-shū 魚山條芥集 (printed in 1496 at Kōya-san).
Figure 11: Texts using Kakui’s goin hakase
Source: Hōbō girin, Demiéville, P. (1932:109)
Kakui’s notation was not readily accepted at Kōya-san, and he moved to Kamakura
and passed it on to Kenna 剣阿 of the Shōmyō-ji 称名寺 and others. Kakui’s
notation finally gained acceptance at Kōya-san in the 14th century.12
It is assumed that at the same time the musical structure also changed rather
strongly, although the extent of this change cannot be ascertained. It is often thought
that the entire shōmyō repertoire of the Nanzan Shin school – which until that time
12 In Shōketsu-sho 声決書 of 1396, which – according to Kindaichi (1972) – is an extremely
confused work on shōmyō musical theory, it is stated that the hakase notation used in the
Nanzan Shin school before the development of Kakui’s goin-hakase, was called konpon no
hakase (‘basic’ hakase, a form of tada hakase?) and that it resembled the hakase of the Sōō-in
school, but that it was no longer used at the time, i.e. in the late 14th century (Arai, 1996).
had been transmitted by means of different types of notation – was transposed into
Kakui’s goin hakase system by Ryūnen 隆然 in the mid 14th century. According to
Kindaichi (1972:111-114) however, this is only true for a small part of the shōmyō
repertoire, and for instance not for the genre of kōshiki. It was only in the mid 18th
century that the goin hakase system truly became standard at Kōya-san, and other
notation systems were no longer used at all.
This notation was also adopted by other schools as it was possible to indicate
tone height more precisely than before. It is still used today in the Nanzan Shin
school, as well as in the Buzan 豊山 and the Chizan 智山 branches of the Shingi
Shingon school.
The concept of the three ‘layers’ or octaves arranged around the character, shojū 初重,
nijū 二重 and sanjū 三重 (the lower, middle and upper octaves of the vocal register),
is a later development, since it cannot be seen in
materials from Kakui’s time. This system, called the go-on-sanjū 五音三重 system
(‘five notes-three layers’) divides a range of fifteen notes into three layers of five
notes each. In actual practice only eleven notes were used, so that the bottom layer
actually consisted of only two notes, the middle layer of five notes and the top layer
of four notes.
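The arithmetic of the go-on-sanjū system can be checked with a small sketch. The layer names and the stroke numbering (1 to 15, five per layer) follow the description in the text. The identity of the eleven notes used in practice is not stated here, so the sketch assumes they are strokes 4 to 14, which is the only contiguous run of eleven strokes that yields two, five and four notes per layer. The function names are purely illustrative.

```python
# Sketch of the go-on-sanjū (five notes, three layers) numbering.
# ASSUMPTION: the eleven notes used in practice are strokes 4-14; the text
# only gives the per-layer counts (2, 5, 4), not the notes themselves.

LAYERS = ("shojū", "nijū", "sanjū")  # lower, middle and upper layer
NOTES_PER_LAYER = 5


def layer_of(stroke):
    """Return the layer name for a stroke numbered 1-15."""
    if not 1 <= stroke <= len(LAYERS) * NOTES_PER_LAYER:
        raise ValueError(f"stroke number out of range: {stroke}")
    return LAYERS[(stroke - 1) // NOTES_PER_LAYER]


USED_STROKES = range(4, 15)  # hypothetical: strokes 4..14 inclusive


def used_per_layer():
    """Count how many of the assumed used strokes fall in each layer."""
    counts = {name: 0 for name in LAYERS}
    for stroke in USED_STROKES:
        counts[layer_of(stroke)] += 1
    return counts


print(used_per_layer())  # {'shojū': 2, 'nijū': 5, 'sanjū': 4}
```

Under this assumption the fifteen nominal positions reduce to the 2 + 5 + 4 = 11 notes mentioned above.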
In Kakui’s time the notes ‘upper’ and ‘lower’ were added for tones outside of the
central octave. The representation of the five tones in three octaves seems to have
developed in the 15th century. The oldest known graphical representation dates from
1443.
Figure 12: Representation of the five tones in three octaves
Source: Arai (1996:19)
Expressed by means of western staff notation the value of the goin hakase marks is
as in Figure 13. The ‘white’ strokes below mark the notes that are not used in
practice. The first layer is marked by means of the strokes numbered 1 to 5, the
second layer by means of the strokes numbered 6 to 10 and the top layer by means
of the strokes numbered 11 to 15. The dots or circles added to the hakase marks in
this system functioned as a way to distinguish certain tones (for instance 4 and 8, 5
and 9, 6 and 10) from each other.
Figure 13: The tonal value of the goin hakase marks in staff notation
Source: Malm (1959:262)
Because there is no connection between the direction (angle) of a hakase mark and
the direction of the voice, this system has the disadvantage of being
counterintuitive, and in the Tendai school neumatic notations continued to be preferred.
14.2.6 Shōmyō genres that contain historical information on the Japanese tones
Kataoka Gidō (1972) mentions that in shōmyō texts from the beginning of the
Kamakura period 96% of the ko-hakase agreed with the tone dots, that in a text of
1205 only 71% agreed, and that in the mid Kamakura period almost none of the
ko-hakase marks agree with the tone dots and the tone dots disappear from the text.
He thinks that this development, in which the tone of the character was ignored and
fushihakase were added for purely musical reasons, had spread to the majority of
shōmyō genres by the mid Kamakura period.
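The kind of figure Kataoka reports can be illustrated with a minimal sketch: the share of syllables on which the hakase mark and the tone dot name the same tone. The function name and the toy sequences below are hypothetical and are not taken from any manuscript. How Kataoka himself counted agreement is not stated in the text.

```python
# Minimal sketch of an agreement percentage between hakase marks and tone dots.
# ASSUMPTION: both annotations are already aligned syllable by syllable and
# reduced to tone names; the data below are invented for illustration only.

def agreement_rate(hakase, tone_dots):
    """Return the percentage of positions where both annotations agree."""
    if len(hakase) != len(tone_dots):
        raise ValueError("annotations must be aligned syllable by syllable")
    if not hakase:
        raise ValueError("no syllables to compare")
    matches = sum(h == d for h, d in zip(hakase, tone_dots))
    return 100 * matches / len(hakase)


# Hypothetical five-syllable passage
dots = ["ping", "shang", "ping", "ping", "shang"]
hakase = ["ping", "shang", "ping", "shang", "shang"]

print(agreement_rate(hakase, dots))  # 80.0
```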
For this reason, only a very limited type of shōmyō texts can be used as historical
material on the tone system of Japanese. Types of shōmyō that tend to reflect the
tones of the spoken language (including the many Sino-Japanese character-based
loanwords included in these texts) usually belong to the yomu shōmyō genre (hymns
that are recited) such as hyōhyaku 表白 and saimon 祭文, and to the kataru shōmyō
genre (hymns that are narrated), such as kōshiki 講式 and rongi 論議. Furthermore,
Butsuyuigyō-kyō 仏遺教経 also reflects the pitches of the spoken language, which is
unusual for a sutra.
14.2.7 Neumes versus absolute tone
Summarizing the overview of the hakase types, we can say that there are two main
types of fushihakase: One type developed out of the tone dots, while a second type
was devised as a true musical notation system from the start. This second type is the
goin hakase system invented by Kakui. (Although the principle was first invented by
Tanchi, we have seen that his system never became popular. Therefore, when the
term goin hakase is used, it normally refers to Kakui’s system.) We know that the
earliest fushihakase type originated from the tone dots, and that in this system a
horizontal line represented the ping tone, a diagonal line (usually a ‘backslash’)
represented the shang tone, while the qu tone was marked in different ways (a hook,
a z-shape, a forward slash). We also know that the goin hakase system was of later
invention and only truly started to be used in the 14th century in the Kogi Shingon
school on Kōya-san.
As the oldest hakase marks date from the Kamakura period, before Kakui’s goin
hakase system became popular, and as they agree with the tone dots, it makes sense
to assume that the value of the strokes agreed with the value of the tone dots that
they represented: Level stroke [H]/[F] (ping), diagonal stroke (backslash) [L]/[R]
(shang), and hook/forward slash, z-shape etc. [F] (qu).13
A system in which the two basic tones used to mark the tones of Japanese (ping
and shang) were marked by means of a horizontal line and a diagonal line
(backslash) respectively, was definitely still in use in the Shingon school on Kōya-
san in the 13th century (see Dōhan’s Shittan-jiki kikigaki of 1241, section 7.3.2.1)
and – judging from Kenpō’s Shittan shogaku-shō (section 7.3.3.1) – continued to be
in use in the Kogi Shingon school at least until late into the 14th century.
Dōhan’s Shittan-jiki kikigaki dates from before the invention of Kakui’s goin
hakase system in 1270, so there can be no doubt that in these early shōten hakase
and tada hakase/fu-hakase marking systems, a horizontal stroke expressed the ping
tone and not kaku, and a diagonal stroke expressed the shang tone and not chi.
A similar notation system was used in court music (gagaku) genres such as mi-
kagura 御神楽, saibara 催馬楽 and rōei 朗詠 (banquet songs). This type of
fushihakase notation is called goma-fu (‘sesame notation’) because the shape of the
marks resembles that of sesame seeds (Malm, 1959:262). Figure 14 shows the
Bun’ei-bon of Rōei yōshū 文永本朗詠要集 (1309), a collection of 41 Kamakura-
period banquet songs. The tones are marked by means of goma-fu, which are added
to the left-hand side of the text.14
We see that a word like 2.3 toki ‘time’ (平平 in Middle Japanese) has been
marked with two horizontal hakase marks (− −), while 2.2 hito-ni ‘to the
person’ (marked as 上平-平 in MJ Gairin type texts like Kokin kunten-shō, i.e. with
/H/ tone spreading onto the particles) is marked ∖ − − in Rōei yōshū.
Elsewhere, Kindaichi (1973) mentions that in Rōei yōshū the word 2.5 mayu
‘eyebrow’ ( or (-) in Middle Japanese is marked (平上).
13 The rising shang tone of the Siddham scholars was represented by means of a diagonal stroke,
but the falling ping tone on the other hand, was represented by means of a horizontal stroke.
Two factors may have played a role in the origin of this asymmetry: In the first place, it is
practical, as it is much easier on the eyes to distinguish between a horizontal stroke and a
diagonal stroke than between two diagonal strokes poised at different angles. Secondly,
choosing a horizontal stroke for the ping tone and a diagonal stroke for the shang tone provided
a visual representation of the names of the tones (ping = ‘level’, shang = ‘rising’).
14 These songs were probably recorded sometime between 1232 and 1267 (Kindaichi, 1973:287).
Figure 14: Goma-fu, the notation system used in court vocal music
Source: Kindaichi (1974) (Figure nr. 4 at the beginning of the book)
According to Malm (1959:262) the notation used in rongi, kōshiki and other such
shorter, more rhythmical forms of shōmyō which stayed close to the spoken
language, was derived from this simple style of neumatic vocal music notation used
in court music. Most scholars however, assume that the derivation was the other way
around, and that the system used in court music developed from the system used in
shōmyō, as the development of fushihakase is closely related to dhāraṇī recitation.
At any rate, it is clear that the systems were very similar, and the sesame-shaped
form of the marks also occurs in Buddhist tada hakase notation (cf. Figure 2).
The hakase notations used in Nō music (yōkyoku 謡曲) and in the recitation of
the Heike monogatari 平家物語 are also regarded as offshoots of Buddhist vocal
music notation, and as can be seen in sections 14.7.1 and 14.7.2, in these systems as
well, a horizontal stroke expressed [H] pitch and a diagonal stroke (backslash)
expressed [L] pitch.
Scholars who have looked at early Buddhist fushihakase material from the
viewpoint of the standard theory on the other hand, invariably regard a horizontal
stroke as expressing [L] pitch and a diagonal stroke as expressing [H] pitch. This
would be correct if the material concerned were based on Kakui’s goin hakase,
where a horizontal stroke represents the kaku tone and a diagonal stroke the chi tone.
However, if the material belongs to the older type, in which the horizontal stroke
represents ping [F]/[H] and the diagonal stroke represents shang [R]/[L], such a
reading would result in an exact reversal of the recorded pitches.15
Just as I think that in the standard theory the Edo period value of the tones has
been projected backwards onto the tone dot markings of the Middle Japanese period,
I suspect that in these cases the value of the hakase marks in the goin hakase system
that was so dominant in the Edo period, has been projected backwards onto samples
of fushihakase of much earlier periods.
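The reversal at issue can be made concrete with a short sketch that reads the same strokes under both systems. The two stroke values per system are those given in the text; the stroke labels, dictionary names and functions are illustrative assumptions. For simplicity the contour variants ([F] for ping, [R] for shang) are reduced to the level values [H] and [L].

```python
# Two competing readings of the same two strokes (values as stated in the text).
# NEUMATIC: the strokes continue the tone dots (ping / shang).
# GOIN: the strokes are read as the goin hakase tones kaku / chi.
# ASSUMPTION: contour variants are ignored; the names are illustrative only.

NEUMATIC = {"horizontal": ("ping", "H"), "diagonal": ("shang", "L")}
GOIN = {"horizontal": ("kaku", "L"), "diagonal": ("chi", "H")}


def read_pitches(strokes, system):
    """Map a sequence of stroke labels to a pitch string under one reading."""
    return "".join(system[stroke][1] for stroke in strokes)


def flip(pitches):
    """Swap H and L in a pitch string."""
    return pitches.translate(str.maketrans("HL", "LH"))


toki = ["horizontal", "horizontal"]  # 2.3 toki 'time' in Rōei yōshū
mayu = ["horizontal", "diagonal"]    # 2.5 mayu 'eyebrow', marked 平上

for word in (toki, mayu):
    neumatic = read_pitches(word, NEUMATIC)
    goin = read_pitches(word, GOIN)
    assert goin == flip(neumatic)  # the readings are exact mirror images
    print(neumatic, goin)
# prints "HH LL" and then "HL LH"
```

Whichever system is assumed, every recorded pitch comes out as the opposite of the other reading, so the choice between the two systems decides the direction of the whole reconstruction.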
Kataoka Gidō for instance remarks that although ko-hakase (with which he refers
to the neumatic tada hakase/fu-hakase system described under section 14.2.1) is
older than goin hakase, the oldest extant examples of goin hakase appear to be older
than the oldest examples of ko-hakase. He also writes that “as far as goin hakase is
concerned, there are in fact all manner of different kinds, and it happens that
amongst these there are types that inadvertently appear to be annotated with ko-
hakase, but that are without doubt fundamentally goin hakase. Therefore, if
considerable care is not taken, there is a substantial risk of a mix-up” (1972:484).
Kataoka does not go into further detail and I do not know on what his decision to
regard the oldest material as belonging to the goin hakase type instead of the
neumatic ko-hakase type is based. But the fact that ko-hakase and goin hakase can
be so hard to distinguish – even to a specialist – means that linguists studying these
older fushihakase materials may easily have mistaken the old neumatic marks that
grew out of the tone dots, for goin hakase marks that indicate absolute tone.
14.3 Fushihakase material that has to be reinterpreted
in view of Ramsey’s theory
14.3.1 Fushihakase material that reflects a Middle Japanese tone system
I have already mentioned the Kamakura period rōei collection Rōei yōshū, where I
regard the horizontal stroke as expressing the ping tone (and not kaku) and the
diagonal stroke as expressing the shang tone (and not chi). This results in a
reversal of the value attributed to the hakase marks in the standard theory.
15 When one follows the standard theory on the value of the Middle Chinese tones in Japan, it is
not important whether the horizontal stroke expressed the ping tone or kaku, or whether a
diagonal stroke expressed the shang tone or chi, as in either case the horizontal stroke would
represent low pitch (as ping = [L]) and the diagonal stroke would represent high pitch (as shang
= [H]).
A more famous example of a work in which the value of the hakase marks in the
standard theory has to be reversed if we follow Ramsey’s theory, is that of the well-
known Daiji-in-bon of Shiza kōshiki 大慈院本四座講式.16
This is the only manuscript of Shiza kōshiki that still shows a Middle Japanese
(MJ Gairin type) tone pattern. All other manuscripts or printed versions of Shiza
kōshiki show a later type of tone system, in which tone classes 2.2 and 2.3 for
instance, have already merged.17
14.3.1.1 The Daiji-in-bon of Shiza kōshiki
Kōshiki is a shōmyō genre that developed in the Shingon school. The work Shiza
kōshiki 四座講式 (‘Four-lecture sermon’) was composed in 1216 by the Kegon
and/or Shingon priest Kōben or Myōe 明慧 (1173-1232).18 The language used in
Shiza kōshiki is thought to have been close to the spoken language of the time, and
the hakase marks in Shiza kōshiki manuscripts are also thought to closely reflect the
tones of the spoken language.
From the second half of the 13th century Shiza kōshiki was performed on Kōya-
san (Giesen, 1977:300). The Daiji-in-bon manuscript, which, according to Kindaichi
(1964:154), stems from the (Kogi) Shingon Ninna-ji Sōō-in-ryū shōmyō tradition
(the Ninna-ji temple is the head temple of the Kogi Shingon school in Kyōto), is
regarded as the oldest extant version of Shiza kōshiki, and probably very close to the
original. The Daiji-in-bon only contains the first part, Nehan kōshiki 涅槃講式, of
the original four parts of Shiza kōshiki.
16 As the Daiji-in-bon and Kindaichi’s study of it (1964c) are famous, Ramsey already addressed
the problem of the interpretation of the hakase marks in the Daiji-in-bon in the Japanese
language version of his article (1980:73). When interpreted in Kindaichi’s way this material
contradicts Ramsey’s theory, and Ramsey therefore suggested that the tonal value of the
diagonal mark and the horizontal mark should be reversed, because they are added towards the
unusual right-hand side. This idea is based on the fact that in the Tendai goin-hakase system
(see section 14.2.3) as well as in the Shingon goin-hakase system (see section 14.2.5) the
hakase marks are arranged in clockwise order from low to high around the character. As a
result, from the lower left-hand side to the upper left-hand side the tones get progressively
higher, but from the lower right-hand side to the upper right-hand side the tones get
progressively lower. Keeping in mind however, that in the Shingon school the habit of adding
the hakase marks all the way around a character apparently only developed in the 15th century,
this explanation cannot be correct.
17 I have not looked into this material sufficiently, but from what I have seen, the tone pattern of
some of this material (when reversed) looks very similar to the old rongi material discussed in
section 14.5. According to Kindaichi (cf. section 14.2.5) older notation systems than Kakui’s
goin-hakase continued to be used for the genre of kōshiki even after Kakui’s notation system
became accepted at Kōya-san in the 14th century, which could mean that the value of the
hakase marks used in these later versions of Shiza kōshiki should indeed be reversed.
18 Myōe was originally ordained in the Shingon school, although at the end of his career he served
as abbot of the Kōzan-ji 高山寺 temple of the Kegon school. In Medieval Japan it was not
uncommon for monks to be ordained in multiple schools, and Myōe signed his treatises and
correspondence as a monk of either school throughout much of his career.
According to Kindaichi (1964c:141-142) the paper seems to indicate that it dates
at the earliest from the end of the Nanboku-chō period (1338-1392), but some of the
characteristics in the writing suggest that it dates from the Kamakura period (1189-
1333). In the end he concludes that the manuscript is either from the Kamakura
period, or a very faithful copy of a Kamakura period manuscript from the Nanboku-
chō period.
In the first few lines of this work, the marks have been added to the left-hand
side of characters read in Sino-Japanese (on-yomi), and consist of horizontal and
diagonal strokes and hooks. In the rest of the work, two very simple hakase marks,
one horizontal and one diagonal ( / forward slash) have been added to the right-hand
side of the kana graphs that are added to the text. The fact that the hakase marks
have been added to the right-hand side, is unusual.
It is clear however, that syllables that are marked with the ping tone in the tone
dot material are marked with the horizontal stroke in this text, and syllables that are
marked with the shang tone in the tone dot material are marked with the diagonal
stroke in this text. The circumstance that the Daiji-in-bon is material from the very
beginning of the development of the Ninna-ji shōmyō tradition may be one of the
reasons why the marks have been added in an unusual way compared with what later
became the norm. A problem is that not much is known about the early notation
used at the Ninna-ji temple.
According to Kindaichi the hakase marks added to the Daiji-in-bon “expressed
intonation, and not a musical melody” (1964c:161). Based on the fact that in a few
cases the numbers 10 (十) and 11 (斗) have been added to the horizontal and the
diagonal marks respectively, Kindaichi argues that in the Daiji-in-bon manuscript
the mark − referred to the 10th string (十) of the sō 箏 and that the mark / referred to
the 11th string (斗) of the sō. This analysis is based on the zu-hakase chart for the
ritsu scale that was used in the Saidai-ji, which was derived from Senga’s hakase
chart. (See Figure 5, section 14.2.2.) After the first few lines however, the notation
was added to the right-hand side instead of to the normal left-hand side, and the
angle of the diagonal mark was reversed, so that Senga’s mark ∖ became /. Kindaichi
argues that this was for practical reasons; there was more space to add the hakase
marks to the right-hand side of the kana graphs (1964c:144).
The sō or sō no koto is a zither with thirteen strings, of which the 1st string has
the lowest pitch, and the 13th string has the highest pitch. Although the marking
system does not belong to the goin hakase type, Kindaichi’s analysis therefore
results in a higher pitch for the mark / than for the mark − .
According to Arai however, although Senga was the legitimate heir to the Saihō-
in branch of the Sōō-in school, he moved from the Ninna-ji to the Daikaku-ji. Kenjū
兼什 of the Ninna-ji Saihō-in school (who was a contemporary of Senga) criticized
Senga’s hakase chart for ignoring the connection between hakase marks and tone
dots and did not employ it at the Ninna-ji (Arai, 1996:18). Instead, Senga’s notation
was transmitted at the Tō-ji 東寺 temple and the Saidai-ji in Nara.
In the Ninna-ji Sōō-in school, one continued to employ the fu-hakase system,
and later took over Nanzan Shin-ryū shōmyō. The Ninna-ji Sōō-in school died out at
the end of the Tokugawa period, and the Daiji-in-bon, as well as other shōmyō
scores from the Ninna-ji temple were later kept at the Saidai-ji temple in Nara.
If the Daiji-in-bon of Shiza kōshiki (although later kept at the Saidai-ji)
originated from the Ninna-ji, as Kindaichi claims, it is not correct to connect
its marking system with Senga’s hakase chart.19
Another point which argues against the idea that the hakase marks of the Daiji-
in-bon are based on Senga’s hakase chart is the following: As mentioned in section
14.2.5, even after Kakui’s goin hakase notation system became accepted at Kōya-
san in the 14th century, in the genre of kōshiki a neumatic fushihakase notation
system continued to be used. Kōshiki was after all a simple type of recitation that
closely followed the pitches of the spoken language, in which the distinction of
absolute tone was not required.20 This makes it seem strange that in case of the Daiji-
in-bon, after all a kōshiki text, the neumatic system preferred for simple recitation
would have been replaced by a system which was designed to mark the (absolute)
tone of the shutton in true musical compositions.21
As Arai mentions that the Ninna-ji continued to use the neumatic fu-hakase
system, it makes sense to assume that the tone value of the horizontal and the
diagonal stroke in the Daiji-in-bon was still based on the tone value of the ping and
the shang tones, the horizontal stroke expressing ping [H]/[F] and the diagonal
stroke expressing shang [L]/[R].22
(1) The value of the horizontal and diagonal strokes in the neumatic marking systems
Tone                 Tada hakase, fu-hakase, goma-fu    Daiji-in-bon
ping = [H]/[F] >     − 字  [H]                          字 −  [H]
shang = [L]/[R] >    ∖ 字  [L]/[R]                      字 /  [L]/[R]
19 According to Sakurai (1976:7-109), the hakase marks added to Hasso saimon 八祖祭文
(1409) from the Sōō-in-ryū at the Ninna-ji temple (which are added to the left-hand side) are
also based on the tablature of the 13-stringed sō no koto and he therefore identifies the mark −
with the 10th string (十) of the sō and the mark ∖ with the 11th string of the sō (斗). As it is
doubtful however, whether such a system was ever used at the Ninna-ji temple, this calls
Sakurai’s analysis of the hakase marks in Hasso saimon into question.
20 It is likely that this was also the case in other shōmyō genres that were recited in a way that
closely reflected the tones of the spoken language, such as rongi, hyōhyaku and saimon. If the
hakase marks in Hasso saimon (1409) for instance, are read as ko-hakase this material
expresses a tone system similar to the restricted tone system of the old rongi material shown in
section 14.5.
21 It is perhaps possible that the numbers 10 and 11 with which some of the hakase marks are
annotated were added later, when the manuscript was kept at the Saidai-ji.
22 The placement of the marks on the right-hand side and the angle of the diagonal
mark ( / instead of ∖) are unusual, but as we have seen, Kindaichi’s analysis in terms
of Senga’s hakase chart has the same problem.
According to Kindaichi, the diagonal stroke in the Daiji-in-bon, which marks
syllables that have a shang tone dot in the tone dot material, expressed [H] pitch, but
occasionally it was used to express [F] pitch, such as when it marked tone class
1.2 and the final syllable of tone class 2.5, when these nouns occurred without a
particle.
It should be noted that the association between [H] and [F], which Kindaichi
acknowledges here, contradicts the association of ‘falling’ with ‘low’ (and ‘rising’
with ‘high’) on which his reconstruction of the value of the tone dots is based. (See
section 9.1.) Also, the fact that one and the same hakase mark could be used to
express [F] as well as [H] points to a link with the old neumatic marking system.
The hakase marks in this system after all, originated as extensions of the tone dots,
which expressed the contour tones of the Siddham scholars, and were later adapted
to mark the level tones of Middle Japanese as well. (In the goin hakase system by
contrast, which is not neumatic in origin, contour tones are expressed by means of
composite marks that express sequences of kaku-chi or chi-kaku.)
The diagonal stroke, which corresponds to the shang tone dot, was thus
occasionally used to express [F] pitch according to Kindaichi (just as in the standard
theory the shang tone dot was occasionally used to express [F] pitch in the tone dot
material). In tone descriptions that stem from the same circles (Shingon) and the
same period (13th century) as the Daiji-in-bon however, such as the ones by Shinpan
and Ryōson, the shang tone is described as ‘rising’ and not as ‘falling’. Also, the
first kana of double-kana qu tone character readings in the Daiji-in-bon is marked −,
and the second kana /. As can be seen in sections 7.3.2.2 and 7.3.2.3, there can be
little doubt that the qu tone in Shinpan 信範 and Ryōson’s 了尊 tone systems had a
falling tone contour, as it is described by means of the character 偃. This, again,
indicates that the horizontal stroke in the Daiji-in-bon expressed [H] pitch and the
diagonal stroke [L] pitch, which is the exact opposite of Kindaichi’s reconstruction.
Another argument for a reversal of the tonal value of the hakase marks in the
Daiji-in-bon has to do with the influence of segmental features on tone in this
manuscript (Ramsey, 1980).
In MJ ‘Nairin’ or MJ ‘Chūrin’ material such as Ruiju myōgi-shō, the tone of the
ren’yōkei with and without the verbal suffix -te was as in (2). (Verbs of type A
started with /L/ tone, and verbs of type B started with /H/ tone.)
(2) The tone of the ren’yōkei with and without -te in Ruiju myōgi-shō
     Ren’yōkei                    Ren’yōkei + -te
A    ‘to roll’  maki  上平        ‘to put’      oki-te   上平-上
B    ‘to hold’  moti  平上        ‘to stretch’  nobi-te  平上-上
In the Daiji-in-bon, which had an MJ ‘Gairin’ type tone system, tone spreading had
caused the /L/ tone of the particle -te (when it was attached after verbs of type A) to
change to /H/ after the /LH/ tone contour of the preceding ren’yōkei verb form.23
(3) The tone of the ren’yōkei with and without -te in the Daiji-in-bon
     Ren’yōkei                    Ren’yōkei + -te
A    ‘to stop’  yame  / −         ‘to lift up’  age-te   / − −
B    ‘to wait’  mati  − /         ‘to throw’    nage-te  − / /
The influence of segmentals on tone in the Daiji-in-bon can be seen in verb forms
that included voiceless geminated consonants: In the Kamakura period, ren’yōkei
forms ending in -i with the verbal suffix -te had contracted (cf. kari-te > katte, uti-te
> utte), leading to the development of geminated consonants. The text of Shiza
kōshiki – being close to the spoken language – reflected these contractions, and it
can be seen from the hakase marks in the Daiji-in-bon that the presence of geminated
consonants influenced the tone of these verb forms.
The geminated consonants that had developed because of the loss of the vowel
would either not be marked at all, or they would be marked with the hakase mark /,
the equivalent of the shang tone dot in Ruiju myōgi-shō. A form like (kari-te >) katte
for instance, which belongs to tone class A, would no longer be marked with / − −
marks but with / / − marks. The markings of (uti-te >) utte, on the other hand, which
belongs to tone class B, remained − / /.
(4) The influence of segmentals on tone in the Daiji-in-bon (Ramsey’s interpretation)
     Ren’yōkei + -te (uncontracted)          Ren’yōkei + -te (contracted)
A    ‘to lift up’  age-te   / − −  LHH       ‘to urge on’  (kari-te >) katte  / / −  LLH
B    ‘to throw’    nage-te  − / /  HLL       ‘to strike’   (uti-te >) utte    − / /  HLL
The comparison of the tone of uncontracted and contracted ren’yōkei + -te verb
forms in the Daiji-in-bon in (4) shows that the influence of the change in segmental
shape was as follows: Moras that lost the vowel segment and developed voiceless
geminated consonants shifted away /H/ tone. In case of ‘to urge on’ kari-te > katte
23 The tone system of Shiza kōshiki agrees with the tone system of tone dot material such as the
Dateke-bon of Kokin waka-shū 伊達家本古今和歌集 (1226) and Kokin kunten-shō 古今訓点
抄 (1305). This is the MJ ‘Gairin’ type tone system in which /H/ tone spreading onto attached
monosyllabic case particles had occurred, the type of tone system that generated the later
merger pattern of the Gairin type dialects. (See section 3.1.4 of part I.)
According to Kindaichi (1955) the same type of tone system is found in Myōgo-ki 名語記
(1268/1275), a vocabulary list with tone markings by means of tone dots by the priest Kyōson
経尊.
for instance, the /H/ tone is shifted away from the lost vowel segment onto the next
syllable. We have seen in part I (such as in section 0.6.1.5) that loss of voicing is
usually associated with a loss (or shifting away) of [H] pitch. When the mora that
lost the vowel segment had /L/ tone, such as in case of ‘to strike’, on the other hand,
there was no need to shift /H/ tone away, and so no change in the tones is observed.
When the tonal value of the hakase marks is reconstructed in accordance with
Ramsey’s theory, the developments that can be seen in the Daiji-in-bon are therefore
completely natural.
Following Kindaichi’s interpretation of the hakase marks, the influence of
voiceless geminated consonants on tone in Shiza kōshiki would have been as in (5).
(5) The influence of segmentals on tone in the Daiji-in-bon (Kindaichi’s interpretation)
     Ren’yōkei + -te (uncontracted)          Ren’yōkei + -te (contracted)
A    ‘to lift up’  age-te   / − −  HLL       ‘to urge on’  (kari-te >) katte  / / −  HHL
B    ‘to throw’    nage-te  − / /  LHH       ‘to strike’   (uti-te >) utte    − / /  LHH
We have seen in chapter 7 of part I, that the preference is for /H/ tone to be based on
a strong vowel segment. The development seen here, where a mora that had /L/ tone
historically, all of a sudden acquires /H/ tone when it loses the vowel segment and
develops a voiceless geminated consonant (as in katte) is therefore extremely
unlikely.
If the development in the two forms had been parallel (i.e. HLL > HHL as well as
LHH > LLH) one could have concluded that the loss of the vowel segment had as a
result that the tone of the previous syllable was continued without change, but we
see that the tone sequence LHH remains unaltered. Even though it is possible for a
voiceless geminated consonant following after a syllable with /H/ tone to have been
perceived as continuing the preceding /H/ tone (cf. katte), it is inconceivable that it
could have been perceived as /H/ (even though the acoustic realization in this
environment is silence) when preceded by /L/ tone (as in utte). It is simply not
possible to make sense of the influence of segmentals on tone in the Daiji-in-bon,
when Kindaichi’s interpretation of the value of the fushihakase marks is adopted.
The influence of geminated consonants on tone that can be seen in Shiza kōshiki
forms an important argument against the standard interpretation of the fushihakase
marks in this text. The argument has not drawn much attention since it was first put
forward by Ramsey in 1980, but it is too important to be overlooked.
14.3.2 Fushihakase material that reflects a restricted tone language
Kindaichi did not regard the fushihakase of the Daiji-in-bon of Shiza kōshiki as goin
hakase material. Other early hakase material that expresses a tone system from before
the Kyōto tone shift, however, is usually treated as goin hakase, even when this
material dates from before the period in which the goin hakase system came into use,
and even when it dates from before the official date of invention of the system in the
year 1270 by Kakui: A horizontal stroke is invariably regarded as representing the
tone kaku from the goin hakase system (角 = 3 = [L]), and a diagonal stroke as
representing the tone chi from the goin hakase system (徴 = 4 = [H]). What are
regarded as early examples of the goin hakase system however, could just as well be
a continuation of the older type of neumatic notation in which the horizontal stroke
represented ping and the diagonal mark represented shang.
The tone system reflected in these materials (which stem from the 13th and 14th
centuries) differs from the tone system of Middle Japanese, but is also different from
the tone system of the modern Kyōto type dialects. When the marks in this type of
material are read in terms of the older neumatic notation system, they reflect a
Tōkyō type tone system that is in the process of restricting the number of /H/ tones
per word. (See section 14.5.) In my opinion therefore, the tone system of this
transitional period shows how the development of the modern Japanese restricted
tone system took place.
14.3.2.1 Butsuyuigyō-kyō
The first example of such a text is Butsuyuigyō-kyō 仏遺教経, which was first
introduced in an article by Kindaichi in 1955. Butsuyuigyō-kyō is a sutra chanted in
Japanese by adherents of Zen. Zen was first introduced in Japan in the early 13th
century, and Sakurai (1975) therefore thinks that the tone system reflected in this
document also dates from the early 13th century.24 According to Kindaichi on the
other hand, the material stems from the mid to late 14th century.
If Sakurai is right, this material predates the invention of the goin hakase system,
but according to Sakurai the melody was transmitted by memory until Kakui devised
his new musical notation system. If Kindaichi is right, it is still unlikely that this
new system would have already been used in this Zen text: According to Giesen
(1977) and Arai (1996), when goin hakase finally became popular in the 14th century
this happened first in the Shingon school.
The tone markings in Butsuyuigyō-kyō are very irregular. As the text is a printed
version from the Edo period, the irregularity of the markings is usually explained as
the result of corruption by later scribes, who were influenced by tonal changes that
had occurred in later periods: One and the same word can have tone markings that
agree with those in Ruiju myōgi-shō as well as tone markings that appear to stem
from later periods. It is among the tone markings that appear to stem from later
periods that we find tone patterns that reflect a restricted tone language.
Kindaichi was the first to reconstruct a separate stage in the history of the
Japanese tone system based on the unusual markings in Butsuyuigyō-kyō in his
article of 1955. Although an exact dating of the deviant tone pattern that can be
24 In a diary entry, for instance, Myōe (1173-1232) already mentions a performance of
Butsuyuigyō-kyō.
found in part of Butsuyuigyō-kyō is difficult, Kindaichi located this stage in-between
what he called the ‘Gyōa accent type’ and the ‘Bumō-ki accent type’.25
Because of the mixed nature of the markings and the uncertainties about their
dating, Günther Wenck (1959:411) criticised the use of this material (which contains
only 67 different words) as a basis for the reconstruction of a separate stage in the
history of the Japanese tone system. Since 1955 however, material that shows
certain similarities with the Butsuyuigyō-kyō material has been found. This type of
material can mainly be found in musical scores of the rongi 論議 ceremonies, that
originally stem from the early 14th century.
14.3.2.2 The old rongi material and the quotation part of Bumō-ki
Rongi ceremonies are formalised discussions of the religious teachings in
question-and-answer style. The rongi books or rongi-sho 論議書 were reference guides to the
correct recitation of the rongi ceremonies. (See also 14.4 and subsections.)
According to Sakurai (1976), who studied many of these materials, the hakase marks
in these works were not used as indicators of absolute tone, but merely to indicate
the simple [H] and [L] of the Japanese tone system. He identifies the two basic
marks as chi [H] and kaku [L].26
The special type of tone system attested in the rongi musical scores can also be
found in the second part of Bumō-ki 補忘記, the most famous example of a rongi
book. When Bumō-ki is mentioned in connection with the history of the Japanese
tone system however, it almost always concerns the material contained in the first
part, or ‘vocabulary part’ of Bumō-ki, in which the words are arranged in iroha order.
As we have seen, this part of Bumō-ki has traditionally been regarded as
representing the Kyōto type tone system of the early Edo period, as it was written by
Kannō (1650-1710), who resided at the Chishaku-in in Kyōto. (This temple had
become the head temple of the Shingi Shingon school, after Toyotomi Hideyoshi
had destroyed the Negoro-ji 根来寺 near Kōya-san in Wakayama prefecture.) The
tone system reflected in the vocabulary part is clearly of the post-shift Kyōto type.
The second part or ‘quotation part’ of Bumō-ki however, contains a number of
sentences and phrases that are direct quotations from the rongi ceremonies.
According to Sakurai the history of the development of the rongi and the tone
system reflected in the rongi books is not very clear, and different parts of Bumō-ki
reflect tone systems from different periods. Sakurai’s study of the quotation part of
Bumō-ki (1976: 381) shows that the tone system reflected in this part must indeed
stem from a different period than that of the vocabulary part. Furthermore, Hattori –
who conducted the first study of Bumō-ki – has pointed out that the fushihakase used
25 For a discussion of Gyōa’s tone system, see section 12.1.1.
26 Sakurai used Hōdan rongi yōshū 法談論議要集 (Mizuhara Mukō (ed.), 1938), a collection of
rongi books from the Shingon school. Nine of the sixteen rongi books in this work, of which
the earliest extant editions do not go back further than the Edo period, can be used as historical
material on the Japanese tone system (Sakurai, 1976: 139, 404), the rest deal with other aspects
of the rongi ceremonies and do not contain fushihakase.
in the two different parts are “of a completely different type that would require a
separate study” (1942:138). The hakase chart contained in Bumō-ki for instance (cf.
section 13.1.1), only agrees with the hakase added to the vocabulary part. It is not
possible to read the hakase marks in the quotation part of Bumō-ki based on this
chart, as the quotation part contains hakase marks with shapes and angles that do not
occur in the hakase chart.27
The question of how material that reflects two different stages in the history of
the Japanese language and two different types of fushihakase ended up in the same
work is related to the complicated history of the rongi ceremonies and the rongi
books.
In the next section I will provide more background information on the history
and nature of the rongi material. The special type of tone system that is reflected in
the rongi material and Butsuyuigyō-kyō will be presented later, in section 14.5,
where I will also explain why this tone system can only be reconciled with
Ramsey’s theory if the current interpretation of the value of the hakase marks is
reversed.
14.4 The history of the rongi ceremonies and the rongi books
14.4.1 The rongi ceremonies
Rongi 論議 is the name of a certain genre of shōmyō. Already since the time of
Kūkai, rongi (‘discussions’ or ‘debates on doctrine’) occurred both at Tō-ji, the head
temple of the Tō-ji branch of the Shingon school in Kyōto, and on Kōya-san where
Kūkai had founded the Kongōbu-ji 金剛峰寺.
Although the rongi ceremonies as such are quite old, rongi in the modern sense
does not go back further than the early Muromachi period, when a number of these
discussions were recorded and formalized into ceremonies while others disappeared.
What originally had been discussions or examinations of the Buddhist teachings,
developed into ritualized ceremonies with fixed questions and answers. This appears
to have happened first in the Shingi Shingon school on Mount Negoro in the
beginning of the 14th century, and what is nowadays usually referred to as rongi is
this last type of fixed ceremony.
Depending on the style, there is one person who has the role of asking the
questions (jusha 竪者), one person who has the role of answering the questions
(shōja 精者) and a kind of judge (hanza or hanja 判者) who has the role of
pronouncing a final conclusion.
It is usually mentioned that there are two rongi traditions, which developed
independently from each other, one at the Negoro-ji (Shingi Shingon) and one at
27 These hakase marks include for instance, a mark that is slanted in the opposite direction of the
chi mark ⁄ (forward slash), which is mostly used towards the end of sentences and phrases, as
well as the marks ┘ and 「, the meaning of which is unclear.
Kōya-san (Kogi Shingon).28 Sakurai (1976:135) stresses the fact that not much is
known about the history of the rongi ceremonies and the rongi books.
As mentioned, rongi in the modern sense first developed in the Negoro-ji, but
from the publication of the rongi collection Rongi-shō 論議抄 in 1376 at Kōya-san,
it is clear that by that time the Kogi Shingon rongi tradition had also been well
established. This tradition however, was soon afterward lost and the rongi
ceremonies were reintroduced at Kōya-san around 1407 from the Kōfuku-ji 興福寺,
the head temple of the Hossō school in Nara.29 Although rongi ceremonies still
occur on Kōya-san, the traditional way of performing the ceremonies has died out.
As for the rongi tradition at the Negoro-ji, Sakurai (1984) mentions that there
was a rongi restoration in the late 16th century, which occurred after a gap of more
than a century in which the rongi tradition of the Negoro-ji had ceased to exist. This
means that the original Shingi Shingon rongi tradition died out at Mount Negoro
sometime during the 15th century.
14.4.2 The rongi books
The rongi books were guides that were produced for the purpose of properly
conducting the rongi ceremonies after these had developed into ritualised
discussions. A number of these works contain vocabulary lists (myōmoku-shō 名目抄)
with tone markings. Such lists contain words with special pronunciations
(yomikuse 読曲, with -kuse written with the character for ‘melody’) as well as
Buddhist idiom. It is interesting to note that the rongi guides that contain such
vocabulary lists all stem from the time of the rongi restoration, and not from the 14th
century, the first period in which rongi flourished.
The oldest example dates from the mid 15th century, and is simply entitled
Myōmoku-shō 名目抄 (Sakurai, 1976:404). Myōmoku-shō is a small dictionary of
about 600 words, divided into different categories. The author, (Fujiwara) Tōin
Sanehiro 洞院実熙 (1409-1457), was a monk of the Shingi Shingon school.
The copy in Sanehiro’s own handwriting does not yet contain tone dots. It is
thought that the tone dots that can be found in other versions of the text were added
later by different people, from the late Muromachi period to the beginning of the
Edo period. The oldest extant version with tone dots dates from 1519 (Akinaga (ed.)
1998:25). These tone dots are dots in the so-called ‘new style’ (新式声点).30
28 Although the Hossō school also conducted rongi ceremonies, works from this rongi tradition
do not play a role in the study of the history of the Japanese tone system.
29 It is hard to say what the recitation of the rongi discussions reintroduced at this time was like.
We can see from Shishō shiki 四声私記 (1409) that – as far as the Chinese tones are concerned
– the shōmyō tradition on Kōya-san was still in severe disarray.
30 The tone dots are added in the new style that would later also be used by Keichū, but the
system used in Myōmoku-shō was not yet as fixed as in Keichū’s time. Apparently, /HL/ and
/HLL/ tone could be marked 平平 and 平平平, but also 上平 and 上平平, and /LHL/ tone
could be marked 平上平, 去上平 or 去平平. According to Kindaichi, who studied this material,
the tone dots in Myōmoku-shō express a tone system that agrees closely with the tone system of
Another example is Daisho hyaku-jō dai-san-jū yomikuse 大疏百条第三重読曲
(1563) from the Shingi Shingon school. This work represents an incipient stage, as
the vocabulary is still small and not yet arranged in the order of the Japanese
syllabary.
The anonymous work Kaigō myōmoku-shō 開合名目抄 also stems from the
period of rongi revival. It was most likely written by a Shingi Shingon monk of the
Negoro-ji in the late 16th to early 17th century, in an effort to collect and pass on the
myōmoku of the rongi tradition at the Negoro-ji, which had died out more than a
century earlier (Sakurai, 1984). The present-day copy (with fushihakase marks) is
from 1813. The vocabulary, which is already arranged in iroha order, is less
extensive than that of Bumō-ki, but the tone system coincides with the tone system
expressed in the vocabulary part of Bumō-ki.
Bumō-ki is without doubt the most famous example of a rongi book because of
its size and the reliability of the tone markings. While the first part of Bumō-ki
contains the vocabulary list in which the tone of both Japanese words and Chinese
loanwords has been marked, the second part of Bumō-ki contains a hakase chart, and
explains the difference between hondaku and shindaku,31 the ideai rules, the way in
which to conduct a rongi ceremony, and finally, a number of example sentences
quoted from the rongi ceremonies. This last part, the quotation part, is the part that
in my opinion reflects a restricted tone language.
14.4.3 Why is the vocabulary part of Bumō-ki regarded as yomikuse?
The tone system reflected in the vocabulary part of Bumō-ki is usually regarded as
the Kyōto type tone system of the early Edo period. Mabuchi however (1958:167),
pointed out that Bumō-ki is a collection of special or traditional pronunciations of
words that were used in the recitation of rongi at the Negoro-ji, the old centre of the
Shingi Shingon school. The vocabulary part of Bumō-ki is after all introduced as 根
来寺名目集 ‘collection of myōmoku from the Negoro-ji’. As the rongi tradition had
already been established by the Muromachi period, Mabuchi argues that Bumō-ki
reflects the tone system of Wa-shū 和州 or Ki-shū 紀州 (present-day Wakayama
and Nara prefecture) of the Muromachi period, rather than the tone system of Kyōto
of the early Edo period.
In a review (1978, reprinted in 1996) of Sakurai’s study of Bumō-ki (Sakurai,
1977) Mabuchi again wonders why the vocabulary in Bumō-ki would be regarded as
yomikuse if the fushihakase marks added to this vocabulary represented the
contemporary pronunciation of Kyōto. It is, after all, much more likely that a
yomikuse represents a pronunciation that is not current, for instance a traditional but
now obsolete pronunciation. Mabuchi therefore again stressed that the tone system
reflected in the vocabulary part of Bumō-ki must stem from an earlier period (the
the vocabulary part of Bumō-ki.
31 Hondaku syllables are syllables that start with originally voiced consonants, while shindaku
syllables are syllables that start with voiced consonants that are the result of sequential voicing.
Muromachi period instead of the Edo period), with a different regional basis
(Wakayama prefecture instead of Kyōto).
A problem with this idea is the fact that in one case it is mentioned as something
peculiar in Bumō-ki that ‘lay people’ pronounced a certain word with a different tone
than the users of Bumō-ki, which suggests that in general the tones used by the
monks and by the lay population were identical. If there was no fundamental
difference between the contemporary tone system of Kyōto, and the tone system
expressed in Bumō-ki, why would it have been necessary to create a collection of
special pronunciations?
I think the answer to this question may be found in the interesting fact that there
are two different tone systems represented in Bumō-ki: Sakurai’s idea that the
quotations from the rongi ceremonies in Bumō-ki represent the tones of an earlier
period, while the yomikuse in the vocabulary part represent a more modern
pronunciation, can be combined with Mabuchi’s idea that yomikuse should represent
pronunciations that are in some way special.
The rongi tradition at the Negoro-ji became extinct sometime during the 15th
century, but it was revived again in the second half of the 16th century, as evidenced
by the publication of the pronunciation guide Daisho hyaku-jō dai-san-jū yomikuse
in 1563. As said, the rongi guides that contain vocabulary lists with yomikuse
(indications of the tone) stem from the period in which one tried to restore the Shingi
Shingon rongi recitation. (The title of Bumō-ki for instance ‘Record for the
restoration of things forgotten’ refers to this effort.) This means that they do not
necessarily reflect the tone system of the early Muromachi period (14th century), the
first period in which rongi flourished.
We also see that the term yomikuse is not only used in reference to the vocabulary
part of Bumō-ki, but already in Daisho hyaku-jō dai-san-jū yomikuse, which is much
older (16th century) and still stems from the Negoro-ji. This shows that there was a
need for these guides with tone markings at the time of rongi restoration, even
before the destruction of the Negoro-ji and the move of the Shingi Shingon school
from Wakayama to Kyōto.
There can be no doubt that when the monks attempted to reintroduce the extinct
rongi ceremonies, they based themselves on the old musical scores from the 14th
century. (The rongi books from the period of revival after all contain quotations
from these earlier works.) In the meantime, however, the tone system of the standard
language had changed, as the leftward tone shift had occurred in Kyōto in the
intervening period.
The term yomikuse may therefore refer to the post-shift Kyōto type tone system
that had become the new standard, as opposed to the pre-shift tones reflected in the
old rongi musical scores. This would explain why these lists of special
pronunciations were necessary in the first place, even before the Shingi Shingon
school changed its regional base. Another reason for the production of vocabularies
with tone markings in this period, may have been the need to educate monks who
came from outside the Kyōto dialect area in the use of the correct tone system.
The reason why the vocabulary of Bumō-ki was referred to as yomikuse then,
would not be because it deviated fundamentally from the 17th century dialect of
Kyōto, but because this had been the traditional designation of these vocabulary lists
in the Shingi Shingon school ever since the 16th century. The reason why these
vocabulary lists with tone markings were created and were referred to as yomikuse in
the 16th century would be, because they indicated a tone system that deviated from
the old rongi scores, and/or because they indicated a prestigious pronunciation that
had to be taught to monks who came from outside the Kyōto dialect area.
Summarizing, we can say that the tone system reflected in the quotation part of
Bumō-ki most likely represents the tone system prevalent in the 14th century, when
the rongi ceremonies first developed a fixed shape. After all, the quotations included
in Bumō-ki are annotated with hakase marks that are not of the goin hakase type,
and therefore appear to have been adopted unaltered from older musical scores.
As to the vocabulary part, there are two possibilities: If the yomikuse listed in the
vocabulary part do not really go back to the Negoro-ji, they most likely represent the
tone system of the 17th century dialect of Kyōto as recorded by Kannō.
If the yomikuse listed in the vocabulary part are faithful transmissions from the
period when the Shingi Shingon school was still based on Mount Negoro, they
would most likely represent the tone system of the late 16th century standard
language of Kyōto, and not the tone system of Wakayama prefecture:
Even now, both Mount Kōya and Mount Negoro lie on the border of the area on
the Kii peninsula where the Tōkyō type tone system has been preserved. (The
Negoro-ji was situated 25 km. to the north-west of Kōya-san.) In the Muromachi
period, when Kyōto type tone had just started to spread, the new post-shift tone
system had most likely not yet reached the Kii peninsula. (And even if it had already
spread that far to the south, the area with Tōkyō type tone in the Totsukawa region
would probably still have been larger than it is today, and would still have included
the area of Mount Kōya and Mount Negoro.)
This makes it highly unlikely that the post-shift Kyōto type tone system adopted
at the Negoro-ji when the rongi ceremonies were revived (as evidenced by the
vocabulary part of Bumō-ki) was based on the spoken language of Wakayama
prefecture.
It is not surprising that the prestigious dialect of Kyōto was preferred over the
dialect of the (sparsely populated) surrounding area, which was still characterized by
a pre-shift tone system. What is more, the elite of the monasteries – among whom
we should probably look for the writers of these pronunciation guides – may have
stemmed from families from the Kyōto area.
I therefore do not think that the vocabulary part of Bumō-ki represents the tone
system of Wakayama prefecture, but it is possible that the tone system dates back to
the standard language of the Muromachi period.
14.5 The tone system reflected in the old rongi material
and Butsuyuigyō-kyō
As I follow Sakurai’s idea that the tone system reflected in the quotation part of
Bumō-ki is older than the tone system reflected in the vocabulary part, I refer to
rongi material reflecting the tone system of the first period as ‘old’ rongi material,
while I refer to material reflecting the tone system of the second period as ‘new’
rongi material. The quotations contained in this part of Bumō-ki are regarded as
‘old’ rongi material.
At first sight, the old rongi material appears to show a tone system that is in
accordance with the Kyōto type tone system of after the leftward shift, a tone system
which poses no problem from the viewpoint of Ramsey’s theory. We see for
instance that the tone of the following classes agrees with the tones in the modern
Kyōto type dialects: 2.3 , 3.4 , 3.7 . At closer inspection however,
it turns out that it is impossible to reconcile this tone system with Ramsey’s theory,
because the tone of the particles is problematic. In addition, the tone of class 3.5,
which has developed the conspicuous tone pattern , is hard to explain.
5 Unlikely developments from the MJ Nairin tone system to the old rongi tone
system in the standard interpretation
MJ Nairin Old rongi New rongi
2.1 - - -
2.2 - - -
2.3 - - (also -) -
2.4 - - -
2.5 - - -
3.1 - - -
3.2 - - - and -
3.3 - attested? -
3.4 - - -
3.5 - - (also -) -
3.6 - attested? -
3.7 - - -
The tones reflected in the old rongi material presented in (5), are based on Sakurai’s
study of this material (1976:150-158, 377-381 and 411-412). The two alternative
tonal shapes added between brackets (one for tone class 2.3, and one for tone class
3.5) have been attested in the quotation part of Bumō-ki, but not in the other rongi
material that Sakurai examined. The tone of the particle after tone class 3.5 also
stems from the quotation part of Bumō-ki. In the quotation part of Bumō-ki on the
other hand, the tone of class 3.2 + particle and the tone of the particle after class 2.2
have not been attested.
Finally, I have not adopted the reflexes that Sakurai indicates for tone classes 3.3
and 3.6. As far as I can see, these tone classes have not actually been attested in the
material that Sakurai used.32
I have added the tone system of the MJ ‘Nairin’ material and the tone system of
the vocabulary part of Bumō-ki (called ‘new rongi material’), in order to show why
the tone system of the old rongi material cannot be reconciled with Ramsey’s theory.
As said, the problem with this material is in the tone of the particles and the
unusual tone of class 3.5. (Verb and adjective forms that have the same tone as tone
class 3.5 in Middle Japanese, also show this unusual tone pattern in the old
rongi material.) It is hard to find a motivation behind a change from (Middle
Japanese) to (old rongi).33 As for the tone of the particles, even though the
tone of the nouns themselves is very similar to the vocabulary part of Bumō-ki, a
development from the Middle Japanese tone system in Ramsey’s reconstruction to
the tone system of the old rongi material is not possible:
In the vocabulary part of Bumō-ki, the pitch of the case particles is only [H] after
nouns that end in [LH], which can easily be explained as a result of tone spreading.
But the sudden occurrence of [H] pitch on the particle after nouns with a falling tone
contour over the word in the old rongi material is hard to explain. (Cf. tone classes
2.2, 2.3, 3.2, 3.3, 3.4 and 3.7.) While the clearly post-shift tones of the vocabulary
part of Bumō-ki make sense, the tones of the old rongi material do not. The only way
in which the tone pattern of this type of material can be explained is if one reverses
the interpretation of the fushihakase of the old rongi material, as in (6).
6 The developments from the MJ Nairin tone system to the old rongi tone system
in the reversed interpretation
MJ Nairin Old rongi New rongi
2.1 - - -
2.2 - - -
2.3 - - (also -)34 -
2.4 - - -
2.5 - - -
32 Compare for instance Sakurai’s examples (1976:150-158 and 378-379) with the tone classes
indicated for these nouns in Martin (1987).
33 According to the standard theory on the other hand, the tone of class 3.5 in the rongi books is
an intermediate stage between the tone system of Middle Japanese and modern Kyōto:
(the tone of class 3.5 in Middle Japanese in the standard reconstruction) > (intermediate
stage) > (modern Kyōto). See section 2.3.1 of part I.
34 The most common reflex for this tone class + particle in the quotation part of Bumō-ki as well
as in other old rongi material is -, but in the quotation part of Bumō-ki, ‘time’ (tone class
2.3) has been attested once as toki-wo - (Sakurai, 1976:377-379).
MJ Nairin Old rongi New rongi
3.1 - - -
3.2 - - - and -
3.3 - attested? -
3.4 - - -
3.5 - - (also -)35 -
3.6 - attested? -
3.7 - - -
When this old rongi tone system is compared to the MJ Nairin tone system, it can be
seen that in the old rongi material the number of /H/ tones per word has been
radically reduced: Only /H/ before /L/ in the MJ Nairin material has been preserved
as /H/ tone in the old rongi material.
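
The reduction just described can be made concrete with a small sketch. It is only an illustration of the rule as stated in the text (an /H/ survives only when it stands immediately before an /L/), under two assumptions that are mine and not taken from the sources: that the tone string covers the word together with any following particle, and that every /H/ that is not preserved is realized as /L/. The function name and the input strings are hypothetical and are not attested tone classes.

```python
def restrict_tones(tones: str) -> str:
    """Keep /H/ only where it is immediately followed by /L/.

    `tones` is a string of 'H' and 'L', one symbol per mora
    (hypothetical representation, assumed to include the particle).
    Every other /H/ is assumed to become /L/.
    """
    result = []
    for i, tone in enumerate(tones):
        # An /H/ is preserved only if the next mora exists and is /L/.
        followed_by_low = i + 1 < len(tones) and tones[i + 1] == "L"
        result.append("H" if tone == "H" and followed_by_low else "L")
    return "".join(result)


# Made-up input patterns, chosen only to exercise the rule.
for pattern in ["HHL", "LHL", "HLL", "HHH", "LLH"]:
    print(pattern, "->", restrict_tones(pattern))
```

In this sketch a run of several /H/ tones is reduced to a single /H/ at the position of the fall, and a word without any fall ends up without /H/ tone at all, which is the sense in which the number of /H/ tones per word is restricted.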
The Sino-Japanese words contained in the old rongi material show a similar
restriction of the number of H tones per word: > , > ,
> , > , > .36
When the tones of Butsuyuigyō-kyō are likewise reversed as in (7), the
developments look similar to what can be observed in the old rongi material. I have
made a division of the Butsuyuigyō-kyō markings into markings that still reflect a
Middle Japanese tone system, markings that reflect the later restricted tone system,
and markings that are simply irregular or reflect a modern Kyōto type tone system.
7 The Butsuyuigyō-kyō markings
MJ type Transitional type Irregular/Other
2.1 47x 21x37
2.2 7x 16x 2x, 1x
2.3 3x 8x
2.4 23x 2x
3.1 13x 3x
3.2 2x 5x 2x
3.4 1x
3.5 11x 22x 1x
3.7 2x 1x
35 The most common reflex for this tone class + particle in the quotation part of Bumō-ki as well
as in other old rongi material is -, but in the quotation part of Bumō-ki, ‘heart’ (tone
class 3.5) has been attested once as kokoro-ga - (Sakurai, 1976:378-379).
36 Represented in moras, adapted from Sakurai, 1976:173.
37 Out of the 21 examples of unexpected tone for class 2.1, 18 occur with the demonstratives
kore, kono and sono. Wenck (1959:411) explains the level high tone in these cases by assuming
that the demonstratives lost their independent tone.
The developments in the restricted tone language stage agree with the old rongi
material in that the number of /H/ tones per word is reduced in a similar fashion. We
see however, that at this stage tone class 2.2 ( in Middle Japanese) merges with
class 2.1 as , and that tone class 3.2 ( in Middle Japanese) merges with
class 3.1 as .
Although the development towards a restricted tone system was similar, the tone
system of Butsuyuigyō-kyō must have had an MJ ‘Gairin’ type tone system as a
starting point, as the mergers that result from the tone reduction are of the Gairin
type. I do not know if anything is known of the regional base of the Butsuyuigyō-kyō
tone system, but as it seems to represent a Gairin A type tone system, it probably
represents the tone system of the Gairin area in the Tōkai region.
The question of how this transitional material (rongi and Butsuyuigyō-kyō) fits
into the history of the Japanese tone systems on Honshū has been treated in extenso
in chapter 4 of part I.
14.6 The reading of the ko-hakase materials in the 16th century
and later
In chapter 12, I have shown how in the late 15th century, after a period of profound
confusion in the tone theories, the tones were adapted to the new (post-shift) tone
system of the standard dialect of Kyōto. A similar adaptation can be seen in the way
in which older hakase material was later read in the Edo period.
The rongi musical scores published by Mizuhara in Hōdan rongi yōshū 法談論
議要集 (1938) for instance, go back to Edo period editions. According to Sakurai
(1976:377, 141), the fushihakase used in the texts did not indicate absolute tone and
the hakase mark ∖ could be used for both [H] and [F], and the mark − for both [L]
and [R]. These points still betray the neumatic origin of the marking system.
Although Sakurai mentions that the original tonal value of a number of the
strokes is not clear, the names of the different strokes in Hōdan rongi yōshū are
apparently based on the goin hakase system, so it is reasonable to assume that in the
Edo period this material was read as goin hakase. It seems therefore, that at some
point a reinterpretation of earlier hakase material as goin hakase took place.
Such a reinterpretation could only have happened if the original recitation
practice had died out, so that there would have been nothing to stand in the way of a
‘modern’ reading of the old hakase marks. In the case of the Shingi Shingon school, it is
said that the rongi tradition of the Negoro-ji died out sometime during the 15th
century (Sakurai 1984).
When the rongi tradition was revived in the second half of the 16th century (the
period when Daisho hyakujō dai-san-jū yomikuse and Kaigō myōmoku-shō
appeared) the old musical scores from the 14th and 15th century were used again.
Naturally, these musical scores would now have been read in a way that agreed as
closely as possible with the contemporary tone system.
As it happens, a reading of this material as if it were goin hakase made eminent
sense, as a reversal of the tones resulted in an even closer resemblance to the post-
shift tone system of Kyōto than a reversal of the tones of Middle Japanese. In the case of
the latter, the ideai rules and special notational devices such as ataru were needed to
adjust the tone of a number of the larger tone classes (such as 2.3 and 3.4) to the
post-shift Kyōto type tone system. In the case of the tone system reflected in the old
rongi scores, a reversed interpretation of the hakase marks resulted in a tone system
that agreed with the contemporary pronunciation almost completely, the exception
being the tone of the particles and the tone of class 3.5.
Before the shift, at a time when the tada hakase/fu-hakase system was still
widely used, the fact that the 45° angles of Kakui’s goin hakase marks
coincided exactly with those of the old neumatic marks while they had a completely
different tonal value, may have been one of the reasons why Kakui’s system failed
to win popularity in the Shingon school. At the time of the restoration of the rongi
tradition after the occurrence of the Kyōto tone shift however, this was an
advantage: By adopting a goin hakase interpretation of the old neumatic marks it
was possible to make sense of the tone markings of a previous era again.38
The complicated history of the rongi books in the Shingi Shingon school, and the
fact that a reinterpretation of the older hakase material made it resemble a post-shift
Kyōto type tone system, may explain how in a work like Bumō-ki, two entirely
different kinds of hakase could co-occur: A younger one (used in the vocabulary
list), based on the goin hakase system, and an older one (used in quotations), which
was of the neumatic type, and had been faithfully copied from earlier works.
14.7 The musical notation systems of Nō and the Heike monogatari
It is said that rongi had a tremendous influence on Japanese literature of all genres.
There were rongi on the topic of poetry, and in 1270 a rongi was even held on
the topic of the Genji monogatari 源氏物語. Especially the music of the major
classical drama form Nō is said to have been deeply influenced by the style and
terminology of the rongi ceremonies: The question and answer style exchanges
between the shite (シテ the protagonist, the leading role) and the waki (ワキ the
‘feeder’, the supporting role, often a priest) are also called rongi (ロンギ), and are
clearly influenced by the terminology and structure of these Buddhist debates.
Malm (1959:262) claimed that the notation of the music of rongi, kōshiki and
other such shorter, more rhythmical forms that stayed close to the spoken language,
derived from a simple style of neumatic vocal music notation called goma-fu
(‘sesame notation’) that was also used in court music such as mi-kagura, saibara
and rōei. It is more likely that the derivation was the other way around, but what is
clear is that the systems were indeed very similar.
38 See also Kataoka Gidō’s remark (in section 14.2.7) on the similarity in appearance between
goin-hakase and the older neumatic types of notation.
Both in shōmyō (see Dōhan’s Shittan-jiki kikigaki and Kenpō’s Shittan shogaku-
shō) and in court music (see the rōei example in section 14.2.7) the two most basic
marks were a horizontal stroke, which was used to mark the ping tone, and a
diagonal stroke (‘backslash’) which was used to mark the shang tone.
If rongi had such a strong influence on Nō recitation, we would expect the
fushihakase notation systems of the two musical traditions to have points in common,
and according to Piggot (1954:538) the origin of the notation system used in Nō, but
also that used in the epic tradition of the Tale of Heike, was indeed this same goma-
fu notation system. In contrast to what is usual in shōmyō musical notation however,
the marks were now placed at the right-hand side of the text. Adopted in the
recitation of the Heike monogatari this vocal notation system was called sumi-fu
(‘ink notation’), while the musical notation marks used in Nō are called goma-ten
(‘sesame dots’).
14.7.1 Yōkyoku
The musical notation of Nō 能 music (yōkyoku 謡曲) developed in the late 14th
century, and the shapes and the names of the musical notation marks, as well as the
musical terminology have a very close resemblance to those still in use in Buddhist
circles. In Hana kagami 華鏡 (1424) for instance, the Nō writer Zeami 世阿弥
(1363-1444) uses phrases like 軽重清濁は上によるといへり ‘it is said that light
and heavy, clear and muddy depend on what comes first’ (Konishi 1948:479), which
shows that Zeami was influenced by the tone theories of shōmyō.39
As the musical notation of Nō reflects the tones of the spoken language, Nō
music is an important historical source on the Japanese tone system, but most extant
pieces were composed rather late, and the tone system represented in Nō plays is
therefore of the post-shift Kyōto type.
In general the horizontal mark represents [H] pitch, and the diagonal (‘backslash’)
mark represents [L] pitch, but a third goma-ten, a diagonal (‘forward slash’)
poised in the opposite direction of the [L] mark, expresses [extra-H] pitch. When the
horizontal sign is used in combination with the forward slash, the horizontal sign
expresses [L] pitch, and so the value of the horizontal goma-ten is determined to a
certain degree by context. (See also section 3.3.1 of part I.)
The marks are added to the right-hand side of the characters or kana graphs, and
Ueno Kazuaki (Akinaga (ed.), 1998:23-24) points out that there is an example of
early shōmyō notation (i.e. the Daiji-in-bon of Shiza kōshiki) where the hakase
marks are likewise added to the right-hand side of the katakana readings.40 Ueno
therefore thinks that the musical notation system of Nō may have developed from
such an early type of notation, but remarks upon the fact that in Nō the tonal value of
the horizontal and the diagonal mark appears to have been reversed.
39 According to Konishi, Zeami misinterpreted the original meaning, which referred to the first
character of the fanqie, for ‘a character that comes first in a sentence’ (i.e. it appears that Zeami
is referring to the phenomenon of ideai), but Konishi also points out that a similar
misinterpretation can already be found in an earlier Buddhist work like Shishō no shiki 四声之
私記, so the mistake may not be of Zeami’s own making (1948:480).
14.7.2 Heikyoku
Another example of a musical notation system that is said to have developed out of
shōmyō notation is the notation system used in musical scores of the Heike
monogatari 平家物語 (Tale of the Heike), an epic poem recounting the battle
between the Taira and the Minamoto clans, that was set to music. (This also
happened to other military tales, but by the 13th century they had all been eclipsed by
the Heike monogatari.) In this type of recitation, which is called Heikyoku 平曲,
episodes from the Heike monogatari are chanted to the accompaniment of the biwa
(lute). Heikyoku recitation contains elements of court music, Buddhist chant and
mōsō-biwa (blind monk’s lute).
Just like the goma-ten of Nō music, the fushihakase marks used in Heikyoku
recitation are added to the right-hand side of the text. An example of the musical
notation of the Heike monogatari can be found in the work Gengo kuninamari 言語
国訛 (1698/1758). Gengo kuninamari is an anonymous work, compiled by someone
who was well versed in Heikyoku recitation. It consists of a list of about four
hundred words selected from the Heike monogatari to which the post-shift Kyōto
type tones have been marked by means of simple horizontal and diagonal hakase
marks: /, − and ∖.
The mark ∖ expresses [L] pitch, the mark / expresses [H] pitch, and the pitch of
the mark − depends on context (Ueno Kazuaki in Akinaga (ed.), 1998:45). The word
2.4 kasa ‘bamboo hat’ for instance, which has [LH] pitch in the modern Kyōto type
dialects, occurs marked as ∖ / , but also as ∖ −. The word 2.2 hasi ‘bridge’,
which has [HL] pitch in the modern Kyōto type dialects, occurs marked as / ∖, but
also as − ∖. The word san ‘three’ (2.4), which has [LH] pitch in the modern
Kyōto type dialects, occurs marked as − / but also as ∖ −. It can be argued however,
that the basic meaning of the mark − is [H], as instances of − in combination with
the mark / are rare, and sequences of [H] pitch are always indicated by means of the
horizontal mark ( − −, − − − etc).41
40 The autograph in Ryōnin’s hand (see section 14.2.4) is another early example, which shows
that in Tendai shōmyō too, the hakase were once added to the right-hand side of the text.
41 Gengo kuninamari also contains a short description of the tones, expressing the Sinologist view
on the value of the Middle Chinese tones that had become popular in Japan by that time:
文字ノ声ノアゲサゲハ The rising and falling of the tones of written characters
平上去入ノ四声ニ分テ is divided into the four tones ping, shang, qu and ru.
上下タイラカナル声 A tone that is level in the beginning and the end
下カラアガル声 A tone that rises up from down below
上カラ去ル声 A tone that passes (down) from up high
ツメテ入ル声 A tone that is cut off and entering
14.7.3 The value of the marks in Nō and Heikyoku: reversal or preservation?
We see that in the recitation of the Heike monogatari as well as in Nō, the horizontal
stroke expressed [H] pitch, the diagonal ‘backslash’ stroke expressed [L] pitch, and
the diagonal ‘forward slash’ stroke expressed extra [H] pitch. Konishi assumed that
the hakase marks used in Nō and Heikyoku recitation had developed from goin
hakase, but as the value that these strokes have in the goin hakase notation system is
the exact opposite, such a derivation is unlikely.42
If they developed from the older neumatic notation system instead, we can
assume that the horizontal goma-ten derived from − (ping or [H]), the ‘backslash’
goma-ten derived from ∖ (shang or [L]), and the ‘forward slash’ goma-ten derived
from / (qu or [F]).43
This would mean that the Nō and Heikyoku traditions preserved the tonal value of
the old neumatic marks that were in use before goin hakase became popular.
8 Similarity in the value of the horizontal and diagonal marks
in the old neumatic marking systems and in Nō and Heikyoku

Tone               Tada hakase, fu-hakase etc.    Nō                  Heikyoku
qu = [F]         > / 字   [F]                   > 字   extra [H]      字 /   extra [H]
ping = [H]/[F]   > − 字   [H]                   > 字   [H]            字 −   [H]
shang = [L]/[R]  > ∖ 字   [L]/[R]               > 字   [L]            字 ∖   [L]

Ueno Kazuaki remarked upon the fact that in the musical notation system of Nō, the
tonal value of the hakase marks appeared to have been reversed, compared to the
value that the marks have in the standard interpretation of the Daiji-in-bon. I prefer
to turn this observation around. I do not think that the value of the marks in Nō and
Heikyoku has been reversed: I see the fact that the value of the marks in these
traditions does not agree with the standard interpretation of the Daiji-in-bon as
another indication that the tonal value of the marks in this text should be reversed.
42 According to Konishi, the oldest manuscript marked with goin-hakase that he encountered on
Kōya-san, was a manuscript of the Rishu-kyō 理趣経 from the mid Kamakura period (± mid
13th century) kept at the Sanbō-in, to which the hakase marks were probably added not long
afterwards. Konishi mentions their remarkable resemblance to the goma-ten of Nō music
(1948:481-484). Because he took the material to be goin hakase, he assumed that Nō notation
(and also the very similar hakase notation of the banquet songs) developed from goin-hakase
notation. (He assumed for instance that the goma-ten developed from chi, but was
subsequently added to the right-hand side instead of to the left.)
Apart from the problem that the [H] and [L] of the marks in Nō and goin hakase are each
other’s opposite, there are other objections: The text of the Rishu-kyō seems too early for it to
have already been marked with goin hakase, which only became popular on Kōya-san in the
14th century.
Secondly, Nō notation developed in the mid Kamakura period, which seems too early for it to
have developed from goin-hakase, which was only invented in 1270. It is much more likely
that it developed from the much older and well-established neumatic marking system,
especially as Nō was strongly influenced by rongi, one of the shōmyō genres that stayed close to
the spoken language and maintained the old neumatic marking system for the longest time.
43 Cf. the hakase marks of Dōhan in section 7.3.2.1. See also section 4.1.2 of part I on how [H]
before [L] (which is how we can analyze [F]), has a tendency to develop into super [H] or extra
[H] pitch.
14.8 Summary
The oldest fushihakase marks used in shōmyō recitation were simple neumatic marks
that had the tone dots as their starting point. A horizontal stroke expressed the ping
tone, a diagonal stroke (backslash) expressed the shang tone and the qu tone was
marked in different ways (forward slash, hook, z-shape). In recitation genres that
stayed close to the spoken language, the horizontal and diagonal strokes were later
used to express the tones of Japanese in the same way that the tone dots had been
used: The horizontal stroke (ping) expressed /H/ tone, the diagonal ‘backslash’
stroke (shang) expressed /L/ or /R/ tone, and the mark used for the qu tone
expressed /F/ tone.
In the course of the 14th century, a new marking system, which was based on
absolute tone distinctions (the goin hakase system) came into use, starting with the
Kogi Shingon school on Kōya-san. In this system, a horizontal stroke expressed the
tone kaku [L], while a diagonal stroke (backslash) expressed the tone chi [H]. Even
on Kōya-san however, the older neumatic marking systems continued to be used in
shōmyō genres that were recited in a way that stayed close to the spoken language. It
is exactly these genres that form a source of historical information on the Japanese
tone system. Only in the 18th century did goin hakase finally supplant all other
marking systems.
Musical genres such as Nō and Heikyoku use notation systems that are regarded
as offshoots of an early type of Buddhist fushihakase. The fact that in these
traditions the tonal value of the horizontal mark is [H], while the tonal value of the
diagonal mark is [L], indicates that they developed from the older neumatic marking
system, and not from the later goin hakase. (The stroke that developed from the
hakase mark for the qu tone expresses [extra-H] tone in these systems.)
Around the early 15th century, the standard dialect of Kyōto went through a
leftward tone shift that fundamentally changed its tone system. Many tone classes
even acquired pitches that were the exact opposite of what they had previously been,
which resulted in a period of profound confusion in the tone theories.
In the Buddhist monasteries with their extensive libraries, examples of old texts
that had been annotated with tone dots and fushihakase were at hand, and these were
studied when the shōmyō traditions were revived in the 16th and 17th centuries. As
the chanting practice in the esoteric schools aimed at adhering to the most correct
recitation practice from the past, it was essential to reconcile the markings in the old
texts with the new tone system of the standard dialect. In this period therefore, the
tone theories were adapted to the new post-shift Kyōto type tone system and
changed radically. (See chapter 12.)
As part of this process, older musical scores started to be read as if the neumatic
hakase marks with which they had been annotated, were a form of goin hakase. The
horizontal stroke was now read as kaku 角 = [L], the backslash stroke was read as
chi 徴 = [H], and the forward slash (when used) was read as shō 商 = extra [L].
Especially in the case of material dating from the restricted tone language phase, such a
reading resulted in a tone system of which the resemblance with the post-shift tone
system of the standard language of Kyōto was almost perfect.
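
The contrast between the two readings summarized above can be set out as a small lookup table. The following Python sketch is only an illustration: the stroke labels, the dictionary names and the two-stroke example are assumptions made for the purpose of the sketch and are not taken from any manuscript, and the shang stroke is simplified to [L] although it could also mark /R/.

```python
# Illustrative labels for the three stroke shapes (hypothetical names).
HORIZONTAL, BACKSLASH, FORWARD_SLASH = "horizontal", "backslash", "forward slash"

# Older neumatic reading as summarized in this chapter:
# ping = /H/, shang = /L/ (simplified; also /R/), qu = /F/.
NEUMATIC = {HORIZONTAL: "H", BACKSLASH: "L", FORWARD_SLASH: "F"}

# Later goin hakase reading of the same strokes:
# kaku = [L], chi = [H], shō = extra [L].
GOIN_HAKASE = {HORIZONTAL: "L", BACKSLASH: "H", FORWARD_SLASH: "extra-L"}


def read_marks(marks, system):
    """Return the pitch value of each stroke under the given reading."""
    return [system[mark] for mark in marks]


# A made-up sequence of two strokes, used only to show the reversal.
example = [HORIZONTAL, BACKSLASH]
print(read_marks(example, NEUMATIC))
print(read_marks(example, GOIN_HAKASE))
```

Under these assumptions the same pair of strokes is read with opposite [H] and [L] values in the two systems, which is the reversal described in the paragraph above.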
In Buddhist circles, the new goin hakase reading of the old neumatic marks
formed an integral part of the larger process in which the tone markings of previous
eras were made to fit into the theoretical framework that had developed after the
shift. The Nō and Heikyoku traditions on the other hand, were less concerned with
such historical and theoretical matters. In these traditions therefore, the original tonal
value of the old neumatic marks was preserved.
15 Conclusion
Reservations towards Ramsey’s theory have been based for an important part on the
idea that his theory cannot be reconciled with commonly accepted reconstructions of
the tone system of Late Middle Chinese, and historical descriptions of the tones by
Japanese Buddhist monks. It is also often thought that the Tendai and Shingon
chanting traditions, as well as old Buddhist musical scores, confirm the standard
reconstruction of the Middle Japanese tone system.
In the second part of this study, I have therefore examined what is known of the
tone system of Late Middle Chinese, and the way in which this was adopted in Japan,
judging from the way in which the tones are described and analyzed by Buddhist
scholars in different historical periods. In addition, I have looked at the history of the
Buddhist chanting traditions and the historical development of the Buddhist notation
systems for vocal music.
The two variants of Middle Chinese that were transmitted to Japan were Early
Middle Chinese, which would form the basis of the Go-on Sino-Japanese character
reading tradition, and Late Middle Chinese, which would form the basis of the Kan-
on Sino-Japanese character reading tradition. The consonantal distinctions from
which the Chinese shang and qu tones are thought to have developed suggest that
they had a rising and a falling tone contour respectively (at least at the time when the
tones originally developed), while the ping tone most likely had a level tone contour
(cf. chapters 1 and 2). 1
Political developments had an impact on the development of two distinct types
of Sino-Japanese. In different circles and depending on context, different character
reading traditions were fostered (cf. chapter 3). Although Buddhists largely held on
to the older Go-on reading tradition, scholars from the two esoteric branches of
Buddhism in Japan (Tendai and Shingon) were nevertheless at the forefront of the
phonological study of Kan-on. The study of Kan-on and the use of tone dots in these
circles can be traced back to the vital importance of a correct recitation of the
dhāraṇī, as these were transcribed by means of Chinese characters that were to be
read in the Kan-on pronunciation.
An interesting feature of the Japanese character reading traditions is the fact that
different tone dots were added to a character, depending on whether it was read in
Go-on or in Kan-on (cf. chapter 4). It is clear that the value of the dots was based on
the value that the tones had in Kan-on: Tone dots were not yet used at the time of the
introduction of the older Go-on reading tradition, and the Go-on readings were
marked with tone dots in retrospect, in later centuries. The result is that the dots
added to Go-on readings often do not agree with the tonal category that a character
belonged to in Chinese.
Some of the tonal differences between Go-on and Kan-on were probably the
result of later attempts to regularize the opposition between the two reading
traditions, but the reversal of the value of the ping and qu tones especially, appears
to go back to a real difference between the tones of early Go-on (Wa-on) and Kan-
on.
The use of tone dots was closely connected to the tradition of Buddhist chant in
the Tendai and Shingon schools. In the long history of shōmyō in Japan there were
many periods of upheaval, so that the shōmyō traditions that have survived to this
day go back in an uninterrupted line to the late 16th century, but not further (cf.
chapter 5). The idea that the tones of Late Middle Chinese were faithfully preserved
in the chanting traditions of the esoteric schools has nevertheless played an
important role in the rejection of Ramsey’s theory.
As the modern Tendai and Shingon traditions are of little value for the
reconstruction of the tone system of Middle Japanese, one has to rely on historical
material from these schools, such as descriptions of the tones and manuscripts with
fushihakase musical notation marks from earlier periods.
The earliest description of the Middle Chinese tones in Japan is included in
Shittan-zō (880) by the Tendai monk Annen, which is the oldest work on shōmyō
recitation written in Japan. Annen’s description is the only work on the tones that
was written at a time when direct contact with spoken Chinese was still relatively
recent. Annen’s descriptions are hard to interpret because of his ambiguous use of
phonological terminology, but important progress in clarifying the meaning of this
difficult text has been made by Edwin Pulleyblank and Endō Mitsuaki (cf. chapter 6).
When their insights are combined, it is possible to read the text as containing two
detailed and realistic descriptions of the contemporary (9th century) tone system of
Chang’an, as well as two older tone traditions which had been handed down to
Annen. Especially in the case of the two older traditions it is hard to draw conclusions
as to the phonetic realization of the tones. All other Japanese writings on the tones
date from a much later period.
It is these later Buddhist tone theories that are contemporary with the production
of the tone dot material in the 11th to late 13th centuries. When these theories are
examined (cf. chapter 7), it turns out that in previous studies the reading notes that
are often added to the texts have been ignored. The implications that this has had on
their translation are far-reaching. The most conspicuous example is that it led to a
complete reversal – ‘falling’ turning into ‘rising’ – of the tone contour of the qu tone.
Part of the reason why the translations were adapted is that – when translated
literally – the tone systems that result do not look like tone systems that could have
formed part of a natural language: They do not contain, for instance, even one level
tone, and many of the tones even consist of sequences of rises and falls. There is
however, ample evidence from these later writings on the tones that indicates that
these tone systems all go back to interpretations and reinterpretations of the much
earlier, authoritative text by Annen. This is apparent from the fact that – without
exception – they include features of which it is highly unlikely that they truly
formed part of the tone system of Late Middle Chinese (cf. chapter 8). These
unusual features can be explained however, if they are regarded as
misunderstandings that grew out of the ambiguous use of phonological terminology
in Shittan-zō.
The tone theories that are contemporary with the production of the Japanese tone
dot material do not describe the tone system of Late Middle Chinese: They are
theoretical constructs, based on interpretations of Annen’s much earlier and
notoriously ambiguous text. As they are clearly indigenous Japanese creations,
intended for the highly ritualized context of religious chanting, it is possible to
accept these later tone theories for what they are, without trying to change them into
something that is likely from the viewpoint of Chinese historical phonology.
This means that a comparison should be made between the tones that result from
a literal translation of the Buddhist tone theories (which includes the reading notes)
and the value of the tone dots in the standard theory and in Ramsey’s theory. In
the case of both theories, a measure of adaptation of the complicated contour tones of
the Buddhist scholars to the basically level tone system of Middle Japanese is
required.
It turns out however, that the adaptations that are needed to reconcile the
Buddhist tone systems with Ramsey’s theory are fairly simple and straightforward.
(The ping tone [F] was used to mark /H/, and the shang tone [R] was used to mark
/L/, as well as the occasional /R/ contour tone of Middle Japanese. The ‘drawn out’
qu tone [F:] was used to mark the occasional /F/ contour tone of Middle Japanese,
while the use of the ping tone for this purpose remained rare.)
In the case of the standard theory on the other hand, the adaptations are far more
complicated and contain conspicuous contradictions in the system (cf. chapter 9).
Contrary to what has been thought therefore, it turns out that the old Buddhist tone
descriptions support Ramsey’s reconstruction of the value of the tone dots, and not
the standard reconstruction.
As there were developments in the tone theories over time, and changes in the
tone system of Japanese as well, there were also changes in the way in which the
dots were applied to mark the tones of Japanese, and it is possible to distinguish
different phases in the history of the Japanese marking system (cf. chapter 10).
Chapter 11 addresses a number of separate issues from the viewpoint of Ramsey’s
reconstruction: The fact that the value of the ping and the qu tones in Wa-on/Go-on
and Kan-on is reversed can be explained as resulting from real (and partly
perceived) differences in vowel length in certain tones between Early and Late
Middle Chinese.
The reconstruction of the shang and qu tones of Late Middle Chinese that
follows from the standard theory is in contradiction with the way in which these
tones are usually reconstructed in Chinese historical linguistics. However, the fact
that in Sino-Korean, which – like Japanese Kan-on – was based on some form of
Late Middle Chinese, the shang and qu tones have merged, could be seen as support
for the standard reconstruction: It has been argued that this means the two tones
were phonetically similar, and that – like the shang tone – the qu tone in Late
Middle Chinese must have been rising. It turns out however that the merger between
the two tones in Sino-Korean can be explained as the result of developments within
the prosodic system of Middle Korean itself, and that there is no reason to assume a
rising tone contour for the qu tone in Late Middle Chinese.
Finally, as an argument against Ramsey’s theory, a comparison has often been
made between the tone of certain words (a number of Paekche loanwords in Old
Japanese) in Middle Japanese and Middle Korean. I argue however, that the
relevance of this comparison is highly questionable: It is unreasonable to expect the
pitches of 5th to 6th century Paekche loanwords in Old Japanese (as evidenced by the
tones of Middle Japanese), to be identical with the pitches of these words in Shilla-
based Middle Korean a full millennium later.
As to the time at which the most radical change in the tone system of the Middle
Japanese standard language (i.e. the leftward tone shift in Kyōto proposed by
Ramsey) took place: Ramsey assumed that it must have occurred when the shōmyō
traditions went into decline in the 14th century, and the use of tone dots was
abandoned. The clearest sign of a fundamental change in the tone system of the
standard language is a complete collapse of the traditional tone theories. The fact
that such a collapse is indeed visible in the early 15th century confirms that the tone
shift must have taken place sometime during the 14th century. Based on various
historical records it is possible to narrow the period in which this change started to
spread down somewhat, to the mid to late 14th century (cf. chapter 12).
At the end of the 15th century there are the first attempts to reinterpret the
traditional tone theories in such a way that they could be reconciled with the new
post-shift tone system of Kyōto. The end of the 16th century saw the beginning of a
revival of the Buddhist tone theories and the shōmyō tradition in the Shingon school,
which once more experienced a time of great flourishing in the Edo period.
In the 17th century, certain rules were set up, which made it possible to read the
old tone markings in such a way that they agreed with the new tone system of the
standard language. The 17th century is also the time when the modern chanting
tradition of the Shingon school developed into its final shape. (The Tendai tradition
was revived again only in the 19th century, and the recitation practice may have been
based on the Shingon example.)
In the 18th century there was a certain diversity in the tone theories, as people
from outside of the clergy too, developed an interest in the Chinese tones, and there
was an influx of new ideas from China. The Tendai and Shingon chanting traditions
however, were not influenced by these new ideas: Two 19th century works from the
Shingi Shingon and Tendai schools show that the chanting traditions in these
schools continued to adhere to the type of tone system that had developed in the
Shingon school in the 17th century (cf. chapter 13).
Finally, as to the tone system reflected in the fushihakase musical notation marks
added to a number of old Buddhist musical scores: Only a very limited number of
shōmyō genres reflect the tonal distinctions of the spoken language. In these genres,
in which the recitation was relatively simple and stayed close to the spoken language,
an equally simple notation system was used, which consisted of horizontal and
diagonal strokes.
This is the oldest type of vocal music notation in Japan, and the earliest examples
of this notation clearly show that the strokes originated as extensions of the tone dots
(cf. chapter 14). Each hakase mark can therefore be identified with a particular tone;
the horizontal stroke expressed ping, the diagonal stroke (usually backslash)
expressed shang, while qu was expressed in a number of ways (hook, forward slash,
z-shape).
When the hakase marks in the old style are given the same tonal value
(according to Ramsey’s reconstruction) as the tone with which they can be identified
– i.e. ping = [F], used to mark /H/ in Middle Japanese, shang = [R], used to mark /L/
and occasionally /R/ in Middle Japanese, and qu = [F:], used to mark /F/ in Middle
Japanese – the oldest and most famous of these musical scores, the Daiji-in-bon of
Shiza kōshiki reflects an unrestricted Middle Japanese (‘Gairin’) type tone system,
similar to the tone system expressed by the tone dots in manuscripts such as Kokin
kunten-shō.
The other examples of Buddhist musical scores that are annotated with the old
marking system are chanting guides that date from the early 14th century. The tonal
distinctions reflected in these materials indicate that the Middle Japanese tone
system went through a process of /H/ tone restriction in the period preceding the
leftward tone shift in Kyōto. As outlined in part I, the restricted nature of the tone
systems of the modern Japanese dialects – compared to the far richer tone system of
Middle Japanese – already indicated that there must have been a stage in the history
of Japanese in which the tonal contrasts of the language were severely reduced.
The fushihakase systems in modern use in the Tendai and Shingon schools are of
a different type, in which the pitch of the hakase marks is based on the five tones of
the pentatonic scale. In the goin hakase system, for instance, which supplanted all
other marking systems in the Shingon school in the mid-18th century, the horizontal
stroke signifies the tone kaku 角 = [L] and the diagonal stroke signifies the tone chi
徴 = [H]. Proponents of the standard theory usually apply this value for the
horizontal and diagonal marks also to fushihakase material of earlier periods.
However, the notation systems used in Nō and the recitation of the Heike
monogatari – which developed out of the older style of shōmyō notation – show that
it is not correct to apply such a reading to the earlier type of fushihakase: In Nō and
Heikyoku notation the horizontal hakase mark still expresses [H] pitch, and the
diagonal (backslash) mark still expresses [L] pitch, confirming the tonal value that
has to be reconstructed for the old hakase marks in accordance with Ramsey’s
theory.
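The contrast between the two readings can be set out as a small lookup table. The following is only an illustrative sketch in Python: the names OLD_STYLE, GOIN and read_marks are hypothetical, the stroke labels are simplified from the description above, and the occasional use of shang for /R/ is left out.

```python
# Illustrative sketch only; all names here are hypothetical.

# Old-style hakase read in accordance with Ramsey's theory:
# stroke shape -> (Chinese tone label, Middle Japanese tone marked).
OLD_STYLE = {
    "horizontal": ("ping", "H"),
    "diagonal": ("shang", "L"),      # usually a backslash
    "hook": ("qu", "F"),
    "forward slash": ("qu", "F"),
    "z-shape": ("qu", "F"),
}

# Goin hakase: stroke shape -> (pentatonic tone, pitch).
GOIN = {
    "horizontal": ("kaku", "L"),
    "diagonal": ("chi", "H"),
}

def read_marks(marks, system):
    """Return the pitch value of each stroke under the given system."""
    return [system[mark][1] for mark in marks]

score = ["horizontal", "diagonal", "horizontal"]
print(read_marks(score, OLD_STYLE))  # ['H', 'L', 'H']
print(read_marks(score, GOIN))       # ['L', 'H', 'L']
```

The same sequence of strokes thus yields opposite pitch patterns, depending on whether the old values or the goin hakase values are applied to it.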
My conclusion from the issues researched in Part II is therefore similar to my
conclusion from the investigation of the tone systems of the modern dialects in Part I,
namely that the evidence supports Ramsey’s theory in every instance.
References
Akinaga, Kazue ‘Yaeyama-hōgen ichi-ni-onsetsu meishi no akusento no keikō’,
Kokugo-gaku; 41, 1960, 121-125
– ‘Kyōtsū-go no akusento’, (Nihon hōsō-kyōkai ed) Nihon-go hatsuon akusento
jiten, Tōkyō, 1966a, 45-90
– ‘Sanagi-akusento no teiki suru mono’, (Waseda-daigaku) Kokubungaku-kenkyū;
33, 1966b, 74-88
– Kokin waka-shū shōten-bon no kenkyū, kenkyū-hen, Tōkyō, 1972, Sakuin-hen,
Tōkyō, 1974
Akinaga, Kazue (et al. ed) Nihon-go akusento-shi sōgō shiryō, kenkyū-hen, Tōkyō-
dō Shuppan, Tōkyō, 1998
Amano, Denchū (et al. ed) Bukkyō ongaku jiten, Hōzō-kan, 1995
Arai, Kōjun Musik und Zeichen, –Notationen buddhistischer Gesänge Japans,
Schriftquellen des 11.-19. Jahrhunderts– , (Goepper, Roger & Günther, Robert
ed) (Katalog zur Ausstellung des Museums für Ostasiatische Kunst der Stadt
Köln), Kleine Monographien; 4, Köln, 1986
– ‘Shōmyō no kifu-hō no hensen, –hakase-zu wo chūshin ni–’, (Ueno gakuen
Nihon ongaku shiryō-shitsu ed), Nihon ongaku-shi kenkyū; 1, 1996, 3-32
(English translation by Steven G. Nelson on pp. VII-XXXIX)
Arisaka, Hideyo ‘Shittan-zō shoden no shisei ni tsuite’, Onseigaku-kyōkai kaihō;
41, 1936 (Reprinted in: Koku-go on’in-shi no kenkyū, Sansei-dō, 1957, 591-599)
Asai, Tōru ‘Ainu-go no bunpō’, Ainu minzoku-shi, (Ainu-bunka hozon taisaku
kyōgi-kai ed), Tōkyō, 1969, 771-800
– ‘Ainu-go ni-san no onsei genshō ni tsuite, (Redundant feature to shite no)’,
Gengo-kenkyū; 61, 1972, 84-85
– ‘A note on Ainu ‘r’’, Hoppō-bunka kenkyū; 10, 1976, 191-205
Asato, Susumu & Doi, Naomi Okinawa-jin wa doko kara kita ka: Ryūkyū,
Okinawa-jin no kigen to seiritsu, Naha, Bōdā inku, 1999
Austen, C. L. ‘Anatomy of the tonal system of a Bantu language’, Papers from the
5th Conference on African linguistics. Studies in African linguistics; Supp. 5,
1974, 21-33
Batchelor, J. An Ainu-English-Japanese Dictionary, Tōkyō, 1938
Baxter, William A handbook of Old Chinese phonology, Berlin, 1992
Beckman, Mary E. Stress and non-stress accent, Dordrecht, 1986
Bodman, Richard, R. Poetics and prosody in Early Mediaeval China. A study and
translation of Kūkai’s Bunkyō Hifuron, Cornell University dissertation, 1978
De Boer, Elisabeth M. ‘The origin of alternations in initial pitch in the verbal
paradigms of the central Japanese (Kyōto type) accent systems’, Evidence and
Counter-evidence, Festschrift Frederik Kortlandt; 2, SSGL 33, Amsterdam-New
York, Rodopi, 2008, 35-50
– ‘The Middle Chinese tones through Japanese eyes’, Chinese linguistics in
Leipzig (CLÉ 2), (Djamouri, Redouane and Sybesma, Rint, eds), EHESS-
CRLAO, Paris, 2008, 71-86
Bugaeva, Anna Grammar and folklore texts of the Chitose dialect of Ainu, (Idiolect
of Ito Oda), Endangered languages of the Pacific Rim (A2-045), Kyōto, 2004
Chew, John ‘Accent in Japanese compounds’, Papers of the CIC Far Eastern
language institute, 1963, 85-87
– ‘Various short papers’ in: Papers of the CIC Far Eastern language institute, Ann
Arbor, 1964
Chiri, Mashiho Ainu-go nyūmon, –toku ni chimei kenkyū-sha no tame ni–, 1956
(reprinted in 1997)
Davidov, G. Slovar’ narechij narodov, obitayushchix na yuzhnoj okonechnosti
poluostrova Sakhalina, St. Petersburg, 1812
Demiéville, Paul (ed) Hōbōgirin, dictionnaire encyclopédique du Bouddhisme
d’après les sources Chinoises et Japonaises, deuxième fascicule: Bombai-
Bussokuseki, Tōkyō, 1930
Dehnhardt, Annette ‘Geschichte des Akzents in den japanischen Dialekten’,
Bochumer Jahrbuch zur Ostasienforschung; 17, 1993, 103-130
Dettmer, Hans A. ‘Rev. John Batchelor –a preliminary report on his method of
working–’, Proceedings of the international symposium on B. Piłsudski’s
phonographic records and the Ainu culture, (Executive committee of the
international Symposium ed), Hokkaido University, Sapporo, 1985, 117-122
Dobrotvorskij, M. Ainsko-russkij slovar’, Kazan’, 1875
Donohue, Mark ‘Tone in New Guinea’, Linguistic Typology; 1-3, 1997, 347-386
Elert, C. Phonologic studies of quantity in Swedish, Stockholm, 1964
Endō, Kunimoto ‘Kyoshō-ten to dakuon’, Kokugo-kokubun; 475, 1974, 35-48
Endō, Mitsuaki ‘Shittan-zō no Chūgoku-go seichō’, Kango-shi no sho-mondai,
Kyōto-daigaku jimbun-kagaku kenkyū-sho, (Ozaki, Yujiro ed), 1988, 39-53
Eom, Ik-sang A comparative phonology of Chinese and Sino-Paekche Korean,
Indiana University dissertation, 1991
Frellesvig, Bjarke ‘Morphemic tone and word tone in central Japanese’, Acta
Linguistica Hafniensia; 27, 1994, 147-161
– ‘The prosodic shape of mono derivatives in Kyoto Japanese’, Acta Linguistica
Hafniensia; 31, 1999, 91-124
Fujiwara, Yoichi ‘Ura-Nihon-chihō no kotoba no hatsuon’, Onsei no kenkyū; 7,
1951, 177-190
Giesen, Walter Zur Geschichte des buddhistischen Ritualgesangs in Japan. Traktate
des 9. bis 14. Jahrhunderts zum shōmyō der Tendai-Sekte, Bärenreiter, 1977
Gulik, van, Robert Siddham, New Delhi, 1953 (Repr. in: Śata-Piṭaka Series; 247,
New Delhi, 1980)
Hagers, Steven ‘The attributive and conclusive forms of modern Japanese and
Ryukyuan dialects in a historical perspective’, Studia etymologica Cracoviensia;
5, 2000, 13-42
Hamano, Shōko ‘Voicing of obstruents in Old Japanese: evidence from the sound-
symbolic stratum’, Journal of East-Asian linguistics; 9-3, 2000, 207-225
Han, M.S. ‘The feature of duration in Japanese’, Onsei no kenkyū; 10, 1962a, 65-80
– ‘Unvoicing of vowels in Japanese’, Onsei no kenkyū; 10, 1962b, 81-100
Haraguchi, Shōsuke The tone pattern of Japanese. An autosegmental theory of
tonology, Tōkyō, 1977
– ‘The accent of Tsuruoka Japanese reconsidered’, Issues in Japanese phonology
and morphology (Van de Weijer, Jeroen & Nishihara, Tetsuo eds.), Berlin, 2001,
47-65
Hashimoto, Shinkichi ‘Nyūtō-sō Chisō to Shittan-zō no Sō-hōshi’, 1920 (Reprinted
in Denki tenseki kenkyū, Iwanami Shoten, 1972, 123-133)
– ‘Koku-go ni okeru biboin’, Hōgen; 2, 1932 (Reprinted in Koku-go on’in no
kenkyū, Tōkyō, 1950)
Hashimoto, Mantarō Phonology of Ancient Chinese, (2 vols.), Institute for the Study
of Languages and cultures of Asia & Africa, Tōkyō, 1978-1979
– ‘Ajia no naka no Nihon-go’, Gengo seikatsu; 322, 1978, 18-27
Hashimoto, Mantarō (ed) Genetic relationship, diffusion and typological
similarities of East and Southeast Asian languages, Tōkyō, 1976
Hattori, Shirō ‘Kinki-akusento to Tōhō-akusento to no kyōkai-sen’, Onsei no kenkyū;
3, 1930, 131-144
– ‘Koku-go sho-hōgen no akusento gaikan’, Hōgen; 1, 1931, 11-33 (I), 170-180
(II), 245-261 (III), Hōgen; 2, 1932, 77-88 (IV), 148-156 (V), Hōgen; 3, 1933,
406-419 (VI)
– ‘Genshi Nihon-go no ni-onsetsu meishi no akusento’, Hōgen; 7, 1937
– ‘Bumō-ki no kenkyū, –Edo jidai shoki no Kinki-akusento shiryō to shite–’,
Nihon-go no akusento, (Nihon hōgen-gakkai ed), 1942, 125-159
– ‘Genshi Nihon-go no akusento’, Koku-go akusento ronsō, (Terakawa et al. ed)
Hōsei-daigaku shuppan-kyoku, 1951
– Nihon-go no keitō, Tōkyō, 1959
– ‘Ainu-go no on’in-kōzō to akusento, –Ainu sogo saikō no kokoromi–’, Onsei no
kenkyū; 13, 1967, 207-223
– ‘Nihon-sogo ni tsuite’ (21-22), Gengo, 1979
Hattori, Shirō (ed) Ainu-go hōgen-jiten, An Ainu dialect dictionary with Ainu,
Japanese and English indexes, Tōkyō, Iwanami Shoten, 1964
Hattori, Shirō & Chiri, Mashiho ‘Ainu-go sho-hōgen no kiso-goi tōkeigaku-teki
kenkyū’, Minzokugaku-kenkyū; 24-4, 1960, 307-342
Haudricourt, André ‘Comment reconstruire le Chinois archaïque’, Word; 10, 1954a
– ‘De l’origine des tons en Vietnamien’, Journal Asiatique; 242, 1954b
Hayata, Teruhiro ‘Accent in Old Kyoto and some modern Japanese dialects’,
Gengo no kagaku; 4, 1973
– ‘Heian makki Kyō-Ki-hōgen no shōten to sono onka. –Ramuzei-setsu no kiketsu
suru tokoro– ’, Kyūdai gengo-gaku kenkyū-shitsu hōkoku; 1, 1980, 3-11
– ‘Ko-shahon Nihon shoki no joshi no akusento’, Kyūdai gengo-gaku kenkyū-
shitsu hōkoku; 5, 1984, 15-49
– ‘Akusento bunpu ni mieru Nihon-go no ko-sō’, Gengo; 16-7, 1987, 158-166
– ‘Accent and tone: Towards a general theory of prosody’, Cross-linguistic studies
of tonal phenomena, tonogenesis, typology and related topics, (Kaji, Shigeki ed),
Institute for the study of languages and cultures of Asia and Africa (ILCAA),
Tōkyō, 1999, 221-234
Higurashi, Yoshiko The accent of extended word structures in Tōkyō standard
Japanese, Tōkyō, 1983
Hino, Sukenari ‘Nihon-sogo no boin taikei –Jōdai Azuma-hōgen shiryō ni yoru
saikō–’, Nihon-go keitō-ron no genzai –Perspectives on the origins of the
Japanese language–, (Osada Toshiki & Vovin, Alexander eds), International
Research Center for Japanese Studies, Kyōto, 2003, 187-206
Hirayama, Teruo ‘Noto-hōgen ni okeru Tōkyō-go-shiki onchō ni tsuite, ‘onchō no
shima’ no kaishaku-rei’, Koku-gakuin zasshi; 57-1, 1956
– Nihon-go onchō no kenkyū, Tōkyō, Meiji Shoin, 1957
– Zenkoku akusento jiten, Tōkyō, 1960
– Ryūkyū Yonaguni-hōgen no kenkyū, Tōkyō-dō, 1964
– ‘Ryūkyū Iriomote Sonai-hōgen no akusento taikei’, Onsei no kenkyū; 13, 1967a
– Ryūkyū Sakishima-hōgen no sōgō-teki kenkyū, Tōkyō, 1967b
– ‘Gengo-tō Nara-ken Totsukawa-hōgen no seikaku’, Gengo-kenkyū; 76, 1979, 29-
73
– Minami-Ryūkyū no hōgen kiso-goi, Tōkyō, 1988
Hirayama Teruo, Ōshima Ichirō & Nakamoto Masachie (eds) Ryūkyū-hōgen no
sōgō-teki kenkyū, Meiji Shoin, Tōkyō, 1966
Hirayama, Teruo (ed) Gendai Nihon-go hōgen dai-jiten; 1, Tōkyō, 1992
Hiroto, Atsushi ‘Chūgoku chihō no akusento’, Onsei no kenkyū; 9, Tōkyō, 1961,
155-168
Hiroto, Atsushi & Ōhara Takamichi San’in chihō no akusento, Matsue, Hōkō-sha,
1953
Hombert, Jean-Marie ‘Consonant types, vowel quality and tone’, Tone, a linguistic
survey, (Fromkin, Victoria A. ed), New York/San Francisco/London, 1978, 77-
111
Hudson, Mark J. ‘The linguistic prehistory of Japan: Some archeological
speculations’, Anthropological science; 102-3, 1994, 231-255
– Ruins of identity: Ethnogenesis in the Japanese islands, Honolulu, University of
Hawai’i Press, 1999
Hyman, Larry, M. Phonology: Theory and analysis, New York, 1975
– ‘Historical tonology’, Tone, a linguistic survey (Fromkin, Victoria A. ed), 1978,
257-269
– ‘Privative tone in Bantu’, Cross-linguistic studies of tonal phenomena:
Tonogenesis, Japanese accentology, and other topics, (Kaji, Shigeki ed),
Institute for the study of languages and cultures of Asia and Africa (ILCAA),
Tōkyō, 2001, 237-257
– ‘Word-prosodic typology’, (Remijsen, Bert & Van Heuven, Vincent J. eds),
Between stress and tone, special thematic issue of Phonology; 23, 2006, 225-257
– ‘Universals of tone rules: 30 years later’, Tones and tunes; 1, Typological studies
in word and sentence prosody, (Gussenhoven, Carlos & Riad, Tomas eds) Berlin,
Mouton de Gruyter, 2007, 1-34
Hyman, Larry, M. & Schuh, Russell G. ‘Universals of tone rules: Evidence from
West Africa’, Linguistic Inquiry; 5, 1974, 81-115
Iida, Rigyō Nihon ni zanson-seru Chūgoku kinsei-on no kenkyū, Tōkyō, 1955
Iitoyo, Kiichi (ed) Kōza hōgen-gaku; 6, –Chūbu chihō no hōgen–, Koku-sho
Kankō-kai, 1983
– Kōza hōgen-gaku; 10, –Okinawa, Amami chihō no hōgen– , Koku-sho Kankō-
kai, 1984
Ikegami, Jirō ‘Shimokita hantō no on’in, akusento’, Shimokita, shizen, bunka,
shakai, 1970
Ikuta, Sanae ‘Kinki-akusento-ken henkyō-chiku no sho-akusento’, Koku-go
akusento ronsō, (Terakawa et al. ed) Hōsei-daigaku shuppan-kyoku, 1951
Inoue, Fumio ‘Hōgen no tayōsei to Nihon bunka no nagare’, Nihon-go-gaku; 11,
1992, 57-67
Inoue, Okumoto ‘Nihon-go-chō genri yoron’, Kokugaku-in zasshi; 22, 1916
– ‘Nihon-go chō-gaku shoshi’, Onsei no kenkyū; 2, 1928 (reprinted in 1966), 75-91
Ishizuka, Harumichi ‘Maeda-bon Nihon shoki Insei-ki-ten (honbun henpo)’,
Hokkaidō-daigaku bungaku-bu kiyō; 26:1, 1977, 69-136
– ‘The origins of the ssŭ-shêng marks’, Acta Asiatica; 65, (Tōhō Gakkai ed),
Tōkyō, 1993, 30-50
– ‘Shōten no kigen’, Nihon kanjion-shi ronshū, (Tsukishima Hiroshi ed), Tōkyō,
1995, 39-64
Ito, Chiyuki Chōsen kanjion-kenkyū, Tōkyō University dissertation, 2002
– ‘On the functions of the Sino-Korean accents’, ILCAA (Institute for the Study of
Languages and Cultures of Asia & Africa), chyuki@aa.tufs.ac.jp, 2005
Itō, Seichi ‘Yamamoto Tasuke-shi no Ainugo’, Hokkaidō-shi kenkyū; 14, 1978, 85-95
Iwahara, Taishin Nanzan Shin-ryū shōmyō no kenkyū, Kyōto, 1932
Jaxontov, Sergei Drevnekitajskij jazyk (Jazyki narodov Azii i Afriki), Moskva, 1965
Kadowaki, Seiichi ‘Chūki Chōsen-go ni okeru seichō kōtai ni tsuite’, Chōsen
gakuhō; 79, 1976, 17-54
Kamei, Takashi, Kōno, Rokurō & Chino, Eiichi (eds), Gengo-gaku dai-jiten; 2,
Sekai gengo-hen (chū), Sansei-dō, 1989
Kannō Bumō-ki, Facsimile edition of Genroku-ban (1695), Hakutei-sha, 1962
Kataoka, Gidō ‘Shōmyō-fu no ni keitō ni tsuite’, Bukkyō no ongaku, 1972, 481-489
Kaufmann, Walter ‘The mudrās in sāmavedic chant and their probable relationship
to the go-on hakase of the shōmyō of Japan’, Ethnomusicology; XI-2, 1967, 161-
169
– ‘Parallel trends of musical liturgies and notations in Eastern and Western Asia’,
Orbis Musicae; II 3-4, 1974, 97-119
Kawakami, Shin ‘Heian-akusento to Bumō-ki no akusento’, Kokugo-kokubun;
34:2, 1965 (reprinted in Kawakami, 1995, 414-433)
– Nihon-go akusento ronshū, Kyūko Shoin, Tōkyō, 1995
Kibe, Nobuko ‘Historical development of tone in the southwest Kyūshū dialects of
Japanese’, Proceedings of the symposium cross-linguistic studies of tonal
phenomena: Historical development, phonetics of tone, and descriptive studies,
(Kaji Shigeki ed), Institute for the study of languages and cultures of Asia and
Africa (ILCAA), Tokyo, 2003, 125-142
Kida, Akiyoshi ‘Rendaku to akusento’, Kokugo-kokubun; 48-3, 1979, 51-64
– ‘Iwayuru akusento no kaku wo megutte, –Ramuzei-setsu no imi suru mono–’,
Kokugo-kokubun; 54-9, 1985, 1-20
Kim, Yeng-man ‘Pangcemuy ponciley tayhan kochal’, Kwuk-e kwukmun hak; 36,
1967, 71-87
Kimenyi, A. ‘Tone anticipation in Kinyarwanda’, Studies in Bantu tonology,
Southern California occasional papers in linguistics; 3, 1976, 167-181
Kindaichi, Haruhiko ‘Sho-hōgen no hikaku kara mita Heian-chō no akusento’,
Hōgen; 7-6, 1937
– ‘Bumō-ki no kenkyū, zoku-chō’, Nihon-go no akusento (Nihon hōgen-gakkai ed),
1942
– ‘Koku-go akusento dansō’, Romazi Sekai; 33-1, 1943, 23-31
– ‘Keichū no kanazukai-sho shosai ni mieru koku-go no akusento’, Koku-go to
koku-bungaku; 18-4, 1943a
– ‘Ruiju myōgi-shō no wakun ni hodokosaretaru shōfu ni tsuite’, Kokugo-gaku
ronshū, 1944
– ‘Konkōmyō saishōō-kyō ongi ni mieru isshu no Man’yōgana-zukai ni tsuite’,
Koku-go to koku-bungaku; 24, 1947, 33-48
– ‘Nihon shisei kogi’, Koku-go akusento ronsō, (Terakawa et al. ed) Hōsei-daigaku
shuppan-kyoku, 1951, 629-708
– ‘Tōzai ryō-akusento no chigai ga dekiru made’, Bungaku; 22:8, 1954, 63-84
(Reprinted in Nihon no hōgen, akusento hensen to sono jissō, Kyōiku Shuppan,
1975, 1983, 49-81)
– ‘Tsushima, Iki no akusento no chii’, (Kyū-gakkai rengō-hen) Tsushima no shizen
to bunka, 1954a (Reprinted in Nihon no hōgen, akusento hensen to sono jissō,
Kyōiku Shuppan, 1975, 1983, 27-48)
– ‘Kodai-akusento kara kindai-akusento e’, Kokugo-gaku; 22, 1955, 15-29
– ‘Hyōshō-karu no ten ni tsuite’, Kokugo-gaku; 41, 1960, 115-121
– ‘Watakushi no hōgen kukaku’, Nihon no hōgen kukaku, Tōkyō, 1964
– ‘Tōzai ryō-akusento hassei no mondai-ten. Tsutake, Yamaguchi ryō-shi no sho-
ron wo yonde’, Kokugo-gaku; 58, 1964a, 10-22
– ‘Until there arose a difference between Eastern and Western accent systems in
Japan’ (translation by Bailey, Don C.) Papers of the CIC Far Eastern language
institute, Ann Arbor, 1964b
– Shiza kōshiki no kenkyū, Sansei-dō, 1964c
– ‘On’in henka kara akusento henka e’, Kindaichi hakushi beiju kinen ronshū,
Sansei-dō, 1971
– ‘Shingon shōmyō’, Bukkyō ongaku, Tōyō ongaku gakkai (ed), 1972, 82-170
– Koku-go akusento no shi-teki kenkyū, –Genri to hōhō– , Tōkyō, Hanawa Shobō,
1974
– ‘Akusento kara mita Ryūkyū sho-hōgen no keitō’, Nihon no hōgen, akusento
hensen to sono jissō, Kyōiku Shuppan, 1975a, 1983, 129-159
– ‘Sado-akusento no keitō’, Nihon no hōgen, akusento hensen to sono jissō,
Kyōiku Shuppan, 1975b, 1983, 160-175
– ‘Akusento bunpu-zu’, Iwanami kōza Nihon-go; 11, Nihon no hōgen, (Nakamura,
Yukio ed), Iwanami Shoten, 1977
– ‘Miso yori wa atarashiku cha yori wa furui. Akusento kara mita Nihon-sogo to
jion-go’, Gengo; 9-4, 1980, 88-98
– ‘Nihon-go sogo no akusento to Ryūkyū-hōgen’, Sophia linguistica; 19, Tōkyō
(Sophia University), 1984, 3-25
Kindaichi, Haruhiko (ed) Meikai Nihon-go akusento jiten, 1958, revised edition
1981
Kindaichi, Kyōsuke ‘Shiiboruto-sensei to Ainugo-gaku’, Shiiboruto kenkyū, Tōkyō,
1938, 353-426
– Ainu-go kenkyū, Kindaichi Kyōsuke zenshū I, Sanseidō, 1960
Kirikae, Hideo ‘Pa/ca correspondence between Ainu dialects, A linguistic-
geographical study’, Proceedings of the 8th international Abashiri symposium,
Peoples and cultures of the Boreal forest, Abashiri, Hokkaidō-ritsu Hoppō
Minzoku Hakubutsukan, 1994, 99-113
Kisseberth, Charles ‘Comments on ‘The history of Kyōto accent’ by Nakai Yukihiko’,
Cross-linguistic studies of tonal phenomena: Tonogenesis, Japanese
accentology, and other topics, (Kaji, Shigeki ed), Institute for the study of
languages and cultures of Asia and Africa (ILCAA), Tōkyō, 2001, 145-151
Klaproth, Julius Asia polyglotta, Paris, 1823
Kobayashi, Chieko Japanese dialects: Phonology and reconstruction of the proto-
accentual system, Cornell University dissertation, 1975
Kobayashi, Yasuhide ‘The accent of compound nouns in the Tsugaru dialect of
Japanese’, Gengo-kenkyū; 66, 1974, 45-58
Koku-go gakkai (ed.) Kokugo-gaku dai-jiten, Tōkyō-dō Shuppan, 1980
Koku-ritsu koku-go kenkyū-jo (ed) Okinawa-go jiten, Koku-ritsu koku-go kenkyū-
jo shiryō-shū; 5, Ōkura-shō insatsu-kyoku, 1963
Komatsu, Hideo ‘Heian makki Kinai-hōgen no onchō taikei’, Kokugo-gaku; 39,
1959
– Nihon seichō-shi ronkō, Kazama Shoin, 1971
– ‘Sino-Japanese systems in use’, Acta Asiatica; 65, (Tōhō Gakkai ed), Tōkyō,
1993, 11-29
Konishi, Jin’ichi Bunkyō hifuron-kō, Kenkyū-hen (jō), Tōkyō, 1948
Kortlandt, Frederik ‘The origin of the Japanese and Korean accent systems’, Acta
linguistica Hafniensia; 26, 1993, 57-65
Krasheninnikov, Stepan P. Opisanie zemli Kamchatki, St. Petersburg, 1755/1756
Kuno, Mariko ‘Okierabu-tō Wadomari-hōgen no akusento: Nagasa ga akusento no
katachi wo shimesu hōgen’, Nihon-go ronkō (Ōshima Ichirō ed.), Ōfū-sha, 1991,
145-164
Lawrence, Wayne P. Nakijin phonology –Feet and extrametricality in a Japanese
dialect–, University of Tsukuba dissertation, 1990
– ‘Tone and/or accent in Ryukyuan dialects’, Cross-linguistic studies of tonal
phenomena: Tonogenesis, Japanese accentology, and other topics, (Kaji,
Shigeki ed), Institute for the study of languages and cultures of Asia and Africa
(ILCAA), Tōkyō, 2001, 195-204
Leben, William R. ‘The representation of tone’, Tone, a linguistic survey, (Fromkin,
Victoria A. ed), New York/San Francisco/London, 1978, 177-219
Lehiste, Ilse Suprasegmentals, Cambridge, 1970
Lewin, Bruno Abriss der Japanischen Grammatik, 2nd edition, Wiesbaden, 1975
Liu, Fu Sisheng shiyanlu, Shanghai, Qunyi shushe, 1924
Mabuchi, Kazuo ‘Koku-go no on’in no hensen’, Koku-go kyōiku no tame no
koku-go kōza 2, Onsei no riron to kyōiku, 1958a, 105-175
– ‘Teika-kanazukai to Keichū-kanazukai’, Zoku-Nihon bunpō kōza; 2, Meiji Shoin,
1958, 110-145
– Nihon ingaku-shi no kenkyū I & II, Nihon gakujutsu shin-kōkai, 1962 & 1963
– Wamyō ruiju-shō ko-shahon shōten-bon honbun oyobi sakuin, Kazama Shobō,
1973
– ‘Go-on no shisei-setsu’, Kokugo-shi sōkō, Kasama Shoin, 1996, 316-332
– ‘Jōdai Nihon Kanjion-shi kenkyū no mondai-ten’, Kokugo-shi sōkō, Kasama
Shoin, 1996, 239-251
Malm, William P. Japanese music and musical instruments, Rutland and Tōkyō,
1959
Martin, Samuel E. Morphophonemics of standard colloquial Japanese, (Linguistic
society of America ed), Baltimore, 1952
– ‘Shodon: A dialect of the northern Ryukyus’, Journal of the American Oriental
Society; 90-1, 1970, 97-139
– ‘The nature of accentual distinctions in earlier Japanese’, Proceedings from the
first Nordic symposium in Japanology, Occasional papers; 3, East Asian Institute
University of Oslo, 1981, 84-105
– The Japanese language through time, New Haven/London, 1987
Masamune, Atsuo (ed) Ruiju myōgi-shō (hensan, kōtei), Kazama Shobō, 1970
Mase, Yoshio ‘Kiso-hōgen no akusento’, Onsei-gakkai kaihō; 105, 1960
Maspéro, Henri ‘Le dialecte de Tch’ang-ngan sous les T’ang’, Bulletin de l’Ecole
Française d’Extrême-Orient; XX-2, 1920, 1-124
Matsumori, Akiko ‘Nihon-go akusento no so-taikei saiken no kokoromi, iwayuru
‘kakō-shiki akusento’ no seiritsu ni kan-suru kōsatsu wo moto ni shite’, Gengo-
kenkyū; 103, 1993, 37-91
– ‘Tokushima-ken Wakimachi to Mikamomachi no akusento to Hondo sogo no
akusento taikei’, Kokugo-gaku; 189, 1997, 68-55
– ‘Nihon-sogo no akusento’, Nihongo-gaku tokushū: akusento-kenkyū no genzai; 3,
1998a, 34-44
– ‘Ryūkyū-akusento no rekishi-teki keisei katei, –ruibetsu goi 2-haku-go no tokui
na gōryū no shikata wo tegakari ni– ’, Gengo-kenkyū; 114, 1998b, 85-114
– ‘Historical tonology of Japanese dialects’, Cross-linguistic studies of tonal
phenomena: Tonogenesis, Japanese accentology, and other topics, (Kaji,
Shigeki ed), Institute for the study of languages and cultures of Asia and Africa
(ILCAA), Tōkyō, 2001, 93-122
– ‘On the reconstruction of the proto-accentual system of Japanese’, Proto
Japanese: Issues and Prospects, (Frellesvig, Bjarke & Whitman, John eds), 2008,
103-124
McCawley, James ‘What is a tone language’, Paper presented to the summer
meeting of the linguistic society of America, 1964
– Review article of Papers of the CIC Far Eastern language institute, Language;
42, 1966, 170-175
– The phonological component of a grammar of Japanese, The Hague-Paris, 1968
– ‘Some tonal systems that come close to being pitch-accent systems but don’t
quite make it’, Papers from the sixth regional meeting, Chicago linguistics
society, Chicago, 1970, 526-532
– ‘Accent in Japanese’, Studies in stress and accent, (Hyman, Larry ed) SCOPIL; 4,
1977, 26-302
– ‘What is a tone language?’, Tone, a linguistic survey, (Fromkin, Victoria A. ed),
New York/San Francisco/London, 1978a, 113-131
– ‘Notes on the history of accent in Japanese’, Recent developments in historical
phonology, Trends in linguistics; 4, (Fisiak ed), 1978b, 287-307
Meeussen, A. E. Essai de grammaire Rundi, Tervuren, 1959
– ‘Japanese accentuation as a restricted tone system’, Papers in Japanese
linguistics; 1, 1972, 267-270
Mei, Tsu-lin ‘Tones and prosody in Middle Chinese and the origin of the rising
tone’, Harvard Journal of Asiatic Studies; 30, 1970, 86-110
Miller, Roy A. Die Japanische Sprache, Geschichte und Struktur, Monographien
aus dem Deutschen Institut für Japanstudien der Philipp-Franz von Siebold-
Stiftung, Band 4, München, 1993
Miyake, Marc H. ‘Pre-Sino-Korean and Pre-Sino-Japanese: Re-examining an old
problem from a modern perspective’, Japanese/Korean Linguistics; 6, (Sohn,
Ho-min & Haig, John ed), Stanford, 1997, 179-211
– The phonology of eighth century Japanese revisited: Another reconstruction
based upon written records, University of Hawai’i dissertation, 1999
Mochizuki, Ikuko ‘Jōshō-chō ichi-onsetsu no joshi no heichō-ka: joshi ni no gochō
no saikentō’, Tokiwa joshi tanki-daigaku kiyō; 6, 1973
– Ruiju myōgi-shō yonshu shōten-tsuki wakun shūsei, Tōkyō, Kasama Shoin, 1974
Murasaki, Kyōko Karafuto Ainu-go (Sakhalin Rayciska Ainu dialect), Tōkyō,
Koku-sho Kankō-kai, 1976
Murayama, Shichirō ‘Kurasheninnikofu oyobi Steller no Kita-Chishima Ainu-go
shiryō’, Kyūshū-daigaku bungaku-bu inbun kiyō; 12, 1968, 63-71
– Kita-Chishima Ainu-go, Tōkyō, Yoshikawa Kōbun-kan, 1971
– ‘Über den musikalischen Wortakzent des Japanischen und des
Mittelkoreanischen’, Rocznik Orientalistyczny; XLVI-2, 1990, 99-105
Nakagawa Hiroshi ‘Ainu-go no basho hyōgen to meishi no shozoku-kei’, Gengo-
kenkyū (Journal of the linguistic society of Japan); 84, 1983, 197-198
– ‘Nihon-go to Ainu-go to no sōji goi’, Yamatai-koku; 38, 1989, 190-204
– ‘A historical relationship of hashi ‘chopsticks’ and Ainu pasuy’, Language
contact: Examined from viewpoints of language teaching, comparative
linguistics and historical linguistics, Reports on the research projects; 141,
(Ogura, Michiko ed), Graduate school of humanities and social sciences, Chiba
University, 2007, 12-25
Nakamoto, Masachie & Shinozaki, Koichi ‘Okinawa hontō Shuri to Onna no
akusento’, Ryūkyū no hōgen; 13, 1-61, Hōsei-daigaku Okinawa-bunka kenkyū-
jo, 1988
Nakasone, Seizen Okinawa Nakijin-hōgen jiten, –Nakijin-hōgen no kenkyū. Goi-
hen–, Tōkyō, 1983
Nakamura, Yukio (ed) Iwanami kōza Nihon-go; 11, Nihon no hōgen, Iwanami
Shoten, 1977
Narita, Shūichi Kinsei no Ezo-goi –Moshiogusa-hen–, (unpublished ms. from the
library of Hokkaidō University), 1977
Nihon Hōsō Kyōkai Nihon-go hatsuon akusento jiten, NHK, Tōkyō, 1985
Norman, Jerry Chinese, Cambridge, 1988
Numoto, Katsuaki ‘Ninnaji-zō jūbun Kujaku-kyō jion-ten, Kan-on seichō shiryō to
shite no ichi-zuke’, Kunten-go to kunten shiryō; 55, 1974, 29-47
– ‘Go-on no seichō taikei ni tsuite’, Kokugo-gaku; 107, 1976
– ‘Heian jidai ni okeru nichijō Kan-go no akusento’, Kokugo-kokubun; 48-6, 1979,
25-40
– ‘Kunten shiryō ni okeru fushihakase ni tsuite, –fushihakase no hassei to hattatsu–’,
Kunten-go to kunten shiryō; 86, 1991, 45-74
– ‘Sino-Japanese kana usage’, Acta Asiatica (Bulletin of the institute of Eastern
culture); 65, 1993, 65-130
Odden, David ‘Typological issues in tone and stress in Bantu’, Cross-linguistic
studies of tonal phenomena, tonogenesis, typology and related topics, (Kaji,
Shigeki ed), Institute for the study of languages and cultures of Asia and Africa
(ILCAA), Tōkyō, 1999, 187-215
Ōhara, Takamichi ‘Ruiju myōgi-shō no akusento to sho-hōgen akusento no taiō
kankei, –shu to shite san-onsetsu meishi ni tsuite–’, Nihon-go no akusento,
(Nihon hōgen-gakkai ed), 1942, 50-122
– ‘Koji-ki ni chūki sareta jō-kyo no akusento ni tsuite no shiken’ (jō & ge), Onsei-
gakkai kaihō; 77 & 78, 1951
Okada, Shōko ‘Nihon shoki ko-shahon no akusento to Kokin kunten-shō no
akusento ni tsuite’, Joshi-dai bungaku; 8, 1956, 51-69
Okamoto, Yasuhiro ‘Ōita-ken Asaji-machi hōgen ni okeru meishi akusento ni
tsuite’, ms. Kyūshū University, 1990
Okuda, Kunio Accentual systems in the Japanese dialects, UCLA dissertation, 1971
(Published: Tōkyō, 1975)
Okumura, Mitsuo ‘Tōzai akusento bunri no jiki, –gairai-go no akusento–’, Kokugo-
kokubun; 24, 1955a, 34-44
– ‘Rendaku’, Kokugo-gaku jiten, (Kokugo-gakkai ed), Tōkyō, 1955b, 961-961
– ‘Kan-go no akusento’, Kokugo-kokubun; 30-1, 1961, 1-16
– ‘Kan-go no akusento, –akusento kara goi-ron e–’, Nihongo-gaku; 55, 1963, 36-
53
– Heikyoku fuhon no kenkyū, Ōfū-sha, 1981
– ‘Koku-go akusento no ichi-mondai: Izumo-hōgen no akusento wo chūshin ni’,
Fujiwara Yoichi sensei koki kinen ronshū, Hōgen-gaku ronsō; 2, Hōgen-kenkyū
no shatei, Sansei-dō, 1981, 165-176
– ‘The tones of Japanese Go’on’, Acta Asiatica; 65, (Tōhō Gakkai ed), Tōkyō,
1993, 51-64
– Nihon-go akusento-shi kenkyū. Jōdai-go wo chūshin ni, Kazama Shobō, 1995
Ōno, Kazutoshi ‘Sei-daku: More than a voicing difference. –For a better
understanding of the rendaku phenomenon–’, Ohnok@u.arizona.edu, June, 2002
Ōno, Susumu ‘Kana-zukai no kigen ni tsuite’, Koku-go to koku-bungaku; 27:12,
1950, 1-20
Ōyama, Kōjun Bukkyō ongaku to shōmyō, Ōsaka, 1989 (2nd ed 1992)
Pierrehumbert, Janet B. & Beckman, Mary E. Japanese tone structure, Linguistic
inquiry monographs; 15, MIT press, Cambridge (Mass.) and London, 1988
Piggott, Sir Francis & Fox-Strangways, A.H. Grove’s dictionary of music and
musicians, 1954
Piłsudski, Bronisław Materials for the study of the Ainu language and folklore,
Cracow, 1912
Pulleyblank, Edwin ‘Late Middle Chinese’, Asia Major, a British journal of Far
Eastern studies; 15-2, 1970, 197-239
– ‘The nature of the Middle Chinese tones and their development to Early
Mandarin’, Journal of Chinese linguistics; 6, 1978, 73-203
– ‘Stages in the transcription of Indian words in Chinese from Han to Tang’,
Sprachen des Buddhismus in Zentralasien, (Röhrborn, K. ed), 1983
– Middle Chinese: a study in historical phonology, Vancouver, 1984
– Lexicon of reconstructed pronunciation in Early Middle Chinese, Late Middle
Chinese, and Early Mandarin, Vancouver, 1991
Radlinski, I. Slownik narzecza Ainów, zamieszkujacych wyspe Szumszu w lancuch
Kurylskim przy Kamczatce. Ze zbiorów prof. B. Dybowskiego. Rozprawy
Akademii Umejetnosci, Wydzial Filologiczny. Serya; II, Tom I, Kraków, 1892
Rai, Tsutomu ‘Kan-on no shōmyō to sono seichō’, 1951 (Reprinted in Chūgoku
on’in ron-shū, Rai Tsutomu chosaku-shū I, 1989, 383-427)
– ‘The shōmyō hymns of Japanese Buddhism and the history of Chinese tones’,
Genetic relationship, diffusion and typological similarities of East and Southeast
Asian languages, (Hashimoto, Mantarō J. ed) Tōkyō, 1976
– ‘Chūgoku-go seichō-shi shiryō to shite no bukkyō ongaku’, Chūgoku on’in ron-
shū, Rai Tsutomu chosaku-shū I, 1989, 484-497
Ramsey, Samuel R. Accent and Morphology in Korean dialects, The society of
Korean linguistics, 1978
– ‘The Old Kyōto dialect and the historical development of Japanese accent’,
Harvard Journal of Asiatic Studies; 39, 1979, 157-175
– ‘Nihon-go akusento no rekishi-teki henka’, Gengo; 9-2, 1980, 64-76
– ‘Language change in Japan and the Odyssey of a Teisetsu’, Journal of Japanese
studies, 1982, 97-131
– ‘Proto Korean and the origin of Korean accent’, Studies in the historical
phonology of Asian languages, (Amsterdam studies in the theory and history of
linguistic science, series IV; Current issues in linguistic theory), (Boltz, William
G. & Shapiro Michael C. eds), Amsterdam-Philadelphia, 1991, 215-238
Reischauer, E.O. Ennin’s travels in T’ang China, New York, 1955
Robbeets, Martine I. Is Japanese related to the Altaic languages?, Leiden University
dissertation, 2003
Rosen, Staffan A study on tone and tone marks in Middle Korean, Stockholm,
1974
Russel, Keri ‘Contractions and monophthongisations in Old Japanese’, Nihon-go
keitō-ron no genzai –Perspectives on the origins of the Japanese language–,
(Osada Toshiki & Vovin, Alexander eds), International Research Center for
Japanese Studies, Kyōto, 2003, 511-538
Sagart, Laurent ‘The origin of Chinese tones’, Cross-linguistic studies of tonal
phenomena, tonogenesis, typology and related topics, (Kaji, Shigeki ed),
Institute for the study of languages and cultures of Asia and Africa (ILCAA),
Tōkyō, 1999, 91-103
Sakurai, Shigeharu ‘Akusento-shiryō to shite no shōmyō, Shingon-shū shoden no
hyōhyaku wo chūshin to shite’ (1, 2), Kokugo-gaku; 44, 34-51, 45; 29-37, 1961
– ‘Rongi no senritsu ni han’ei shita Muromachi jidai shoki no akusento-taikei’,
Kokugo-kokubun; 32-5, 1963 (Reprinted in Sakurai 1976)
– ‘Kyōtsū-go no hatsuon de chūi subeki kotogara’, (Nihon hōsō kyōkai ed) Nihon-
go hatsuon akusento jiten, Tōkyō, 31-43, 1966
– Kodai koku-go akusento-shi ronkō, Ōfū-sha, 1975
– Chūsei koku-go akusento-shi ronkō, Ōfū-sha, 1976
– Shingi Shingon-shu-den ‘Bumō-ki’ no kokugogaku-teki kenkyū, Ōfū-sha, 1977
– Nihon-go no senritsu, Sōbun-sha, Tōkyō, 1978
– Chūsei Kyōto-akusento no shi-teki kenkyū, Ōfū-sha, 1984
– Nihon-go on’in akusento-shi ron, Ōfū-sha, 1994
Sargent, John Review of The Kuril Islands: Russo-Japanese frontier in the Pacific
by John J. Stephan, Bulletin of the School of Oriental and African Studies,
University of London; 39-1, 1976, 215-217
Satō, Kiyoji (et al. ed.), Kokugo-gaku kenkyū jiten, Meiji Shoin, Tōkyō, 1977
Satō, Tomomi ‘Takeshiro no Ainugo-gaku, –Kaei sannen no ‘Ezo-go’ ni tsuite–’,
Shinpojiamu ‘Matsuura Takeshiro’, Kita e no shikaku, Sapporo, 1990, 148-181
– ‘Ezo-kotoba irohabiki’ no kenkyū. Kaisetsu to sakuin, Hokudai gengogaku-
kenkyū hōkoku; 8, Hokkaidō-daigaku bungaku-bu gengo-gaku kenkyū-shitsu,
Sapporo, 1995
– ‘Tenri-daigaku fuzoku Tenri-toshokan shozō ‘Matsumae no kotoba’ ni tsuite (1)’,
Hokudai bungaku-bu kiyō; 93, 1998, 41-64
– ‘Tenri-daigaku fuzoku Tenri-toshokan shozō ‘Matsumae no kotoba’ ni tsuite (2)’,
Hokudai bungaku-bu kiyō; 97, 1999, 53-88
Schadeberg, Thilo C. ‘Anticipation of tone: Evidence from Bantu’, Language and
linguistic problems in Africa, Proceedings of the VII conference on African
linguistics (Kotey, Paul F.A. & Der-Houssikian, Haig eds), Hornbeam Press,
South Carolina, 1977, 195-204
Seeley, Christopher A history of writing in Japan, University of Hawai’i press,
Honolulu, 2000 (First published by Brill, Leiden, the Netherlands, 1991)
Serafim, Leon A. ‘The importance of Vovin’s Proto-Japanese etymology for
‘mold’’, unpublished ms., 1993
– ‘When and from where did the Japonic language enter the Ryūkyūs? A critical
comparison of language, archaeology and history’, Nihon-go keitō-ron no genzai
–Perspectives on the origins of the Japanese language–, (Osada Toshiki &
Vovin, Alexander eds), International Research Center for Japanese Studies,
Kyōto, 2003, 463-475
– ‘The uses of Ryukyuan in understanding Japanese language history’, Proto
Japanese: Issues and Prospects, (Frellesvig, Bjarke & Whitman, John eds), 2008,
79–99
Shibata, Takeshi ‘Shirabīmu hōgen to mōra hōgen’, Hōgen-gaku gaisetsu, (Kokugo-
gakkai ed), Tōkyō, Musashino Shoin, 1962, 137-161
Shibatani, Masayoshi The Languages of Japan, Cambridge University press,
Cambridge, 1990
Shimabukuro, Moriyo ‘Word-initial low register in Proto-Japanese’,
Japanese/Korean Linguistics; 6, (Sohn, Ho-min & Haig, John ed), Stanford,
1997, 135-141
– A reconstruction of the accentual history of the Japanese and Ryukyuan
languages, Hawai’i University dissertation, 2002
– The accentual history of the Japanese and Ryukyuan languages –A
reconstruction– (The languages of Asia series, Vovin, Alexander ed), Folkestone,
Global Oriental, 2007
Starostin, Sergej Altajskaja problema i proisxozhdenie japonskogo jazyka,
Moskva, 1991
Stephan, John J. The Kuril Islands: Russo-Japanese frontier in the Pacific, Oxford,
Clarendon Press, 1974
Stevick, Earl W. ‘Tone in Bantu’, International Journal of American Linguistics;
35-4, 1969, 330-341
Sugitō, Miyoko & Tawara, Hiroshi ‘Tōkei-teki kanten kara mita Ōsaka-akusento’,
Onsei-gengo; 3, Ōsaka-Tōkyō, 1989, 143-165
Takahashi, Yasushige ‘Ainu-go Taraika hōgen to Hokkaidō hōgen no aida ni
mirareru /r/ to /t/ no taiō to sono reigai ni tsuite’, Nihon gengo-gakkai, dai 113-
kai taikai (kaiba: Hokkaidō-daigaku) yo-kōshū, 1996, 80-83
Takayama, Michiaki ‘Gen’on seichō kara mita Nihon shoki on-gana hyōki shiron’,
(Kyūshū-daigaku) Go-bun-kenkyū; 51, 1981
– ‘Shoki kayō on-gana to gen’on seichō’, (Kyūshū-daigaku) Bunken tankyū;
10, 1982
– ‘Shoki kayō ni-onsetsu meishi no hyōki ni tsuite –akusento gorui no kanren wo
megutte–’, (Kyūshū-daigaku) Bunken tankyū; 12, 1983
Takeuchi, Lone The structure and history of Japanese: From Yamatokotoba to
Nihongo, Longman linguistics library, London and New York, 1999
Thorpe, Maner L. Ryūkyūan language history, University of Southern California
dissertation, 1983
Tōjō, Misao (ed) Hōgen-gaku kōza, 4 vols., 1961
Tokugawa, Munemasa ‘Nihon sho-hōgen akusento no keifu shiron’, Gakushūin-
daigaku, Kokugo-kokubun gakkai-shi; 6, 1962, 1-19 (Translated by James
McCawley: ‘Towards a family tree for accent in Japanese dialects’, Papers in
Japanese linguistics; 1:2, 1972, 301-320)
– ‘Hōgen-kenkyū no rekishi’, Iwanami kōza Nihon-go; 11, Nihon no hōgen,
(Nakamura, Yukio ed), Iwanami Shoten, 1977
– Nihon-go no sekai; 8, Kotoba, nishi to higashi, Chūō Kōron-sha, 1981
– ‘Nihon no hōgen –Nihon-go no keisei to no kakawari–’, Nihon-go no keisei –The
formation of the Japanese language–, (Sakiyama, Osamu ed), Sansei-dō, 1990
Tsujimura, Natsuko & Davis, S. ‘The accent of long nominal compounding in
Tokyo Japanese’, Studies in language; 11-1, 1987, 199-217
Tsukishima, Hiroshi ‘Jōben-bon Shūi waka-shū shosai no akusento ni tsuite’,
Koku-go akusento ronsō, (Terakawa et al. ed), Hōsei-daigaku shuppan-kyoku,
1951, 107-178
– ‘Koku-go shiryō to shite no Tosho-ryō-bon Ruiju myōgi-shō’, Tosho-ryō-bon
Ruiju myōgi-shō, Bensei-sha, 1969
– Heian-jidai-go shinron, Tōkyō-daigaku, 1969b
– ‘Sino-Japanese studies: in retrospect and future prospects’, Acta Asiatica
(Bulletin of the institute of Eastern culture); 65, 1993, 1-10
Unger, J. Marshall Studies in early Japanese morphophonemics, (2nd ed),
Bloomington, 1993 (1977)
– ‘Rendaku and Proto-Japanese accent classes’, Japanese-Korean linguistics; 9,
(Nakayama, M. & Quinn, C.J. eds), Center for the study of language and
information, Stanford, 2000, 17-30
Uemura, Yukio ‘Ryūkyū sho-hōgen ni okeru ichi-ni onsetsu meishi no akusento
gaikan’, (Koku-ritsu koku-go kenkyū-jo ed), Kotoba no kenkyū, 1959, 121-140
– The Ryukyuan language, (translation by Lawrence, Wayne P.), Endangered
languages of the pacific rim (A4-018), 2003
Uwano, Zendō ‘Narada no akusento-so no shozoku goi’, Hirosaki-daigaku jinbun
gaku-bu bun-kei ronsō; 11-3, 1976, 1-32
– ‘Nihon-go no akusento’, Iwanami kōza Nihon-go; 5, On’in, 1977, 283-321
– Article accompanying the dialect map in: Language atlas of the Pacific area,
(Wurm, Stephen A. & Hattori Shirō ed), Canberra, 1981
– ‘Kagawa-ken Ibukijima no akusento’, Nihon gakushi-in kiyō; 40, 1985
– ‘Nase-shi Ashikebu-Arira-hōgen no meishi no akusento-taikei’, Tōkyō-daigaku
gengo-gaku ronshū; 15, 1996, 3-68
– ‘Fukugō meishi kara mita Nihon sho-hōgen no akusento’, Nihon-go onsei; 2,
Akusento, intonêshon, rizumu to pōzu, Sansei-dō, Tōkyō, 1997, 231-270
– ‘Classification of Japanese accent systems’, Cross-linguistic studies of tonal
phenomena, tonogenesis, typology and related topics, (Kaji, Shigeki ed),
Institute for the study of languages and cultures of Asia and Africa (ILCAA),
Tōkyō, 1999a, 151-178
– ‘Yoron-tō higashi-ku-hōgen no takei akusento-taikei’, Kokugo-gaku; 199, 1999b,
1-15
– ‘Nihon-go akusento no saiken’, Gengo-kenkyū; 130, 2006, 1-42
Vance, Timothy An introduction to Japanese phonology, New York, 1987
Vovin, Alexander A reconstruction of Proto-Ainu, Leiden/New York/Köln, Brill,
1993a
– ‘Long vowels in Proto-Japanese’, Journal of East Asian linguistics; 2-2, (Huang
J. & Kuroda S. ed), 1993b, 125-134
– ‘The origin of register in Japanese and the Altaic theory’, Japanese/Korean
linguistics; 6, (Sohn, Ho-min & Haig, John ed), Stanford, 1997, 113-133
– A reference grammar of classical Japanese prose, London and New York,
Routledge-Curzon, 2003
– ‘Proto-Japanese beyond the accent system’, Proto-Japanese: Issues and Prospects,
(Frellesvig, Bjarke & Whitman, John eds), 2008, 141–156
Wada, Minoru ‘Kinki-akusento ni okeru meishi no fukugō keitai’, Onsei-gakkai
kaihō; 71, 1942, 10-13
– ‘Fukugō-go akusento no kōbu seiso to shite mita ni-onsetsu meishi’, Hōgen-
kenkyū; 7, 1943, 1-26
– ‘Dai-ichiji akusento no hakken: Ibukijima’, (Kokugakuin-daigaku) Kokugo-
kenkyū; 22, 1966, 24-28
Wenck, Günther Japanische Phonetik 1, Wiesbaden, 1953 (Vol. 2, Wiesbaden,
1954, Vol. 3, Wiesbaden, 1957, Vol. 4, Wiesbaden, 1959)
– ‘Zum Problem der nasalisierten Verschlusslaute im Japanischen’, Asiatica,
(Festschrift Friedrich Weller), 1954
Whitman, John The phonological basis for a comparison of Japanese and Korean,
Harvard University dissertation, 1985
Wurm, Stephen A. & Hattori Shirō (ed) Language atlas of the Pacific area,
Canberra, 1981
Yamada, Bimyōsai ‘Nihon onchō-ron’, Nihon dai-jisho, Nihon dai-jisho hakkō-sho,
Tōkyō, 1892
Yamagiwa, Joseph K. (ed) Papers of the CIC Far Eastern language institute, Ann
Arbor, 1964
Yamamoto, Tasuke ‘Ainu itak sinrit puwe’, Ainu mosiri; 8, 1959
Yamana, Kunio ‘Totsukawa onchō’, Onsei no kenkyū; 7, 1951, 191-201
Yanagita, Kunio ‘Kagyū-kō’, Jinrui zasshi; 42, 1927
Yoshida, Tsunezō ‘Tendai shōmyō-gaku gairon’, Bukkyō ongaku no kenkyū, 12-13,
1954 (reprinted in: Bukkyō ongaku, (Tōyō ongaku gakkai ed), Tōkyō, 1972)
Yoshida, Yuko ‘Lexical accent assignment in standard Japanese –The benefit of a
single pitch analysis–’, Japan and Korea contemporary studies, (Frellesvig B. &
Starrs E, ed.), Aarhus University Press, 1997, 92-98
Zavjalova, Olga I. ‘A linguistic boundary within the Guanhua area’, Computational
analyses of Asian & African languages; 21, (Project on lexicological analysis,
national inter university research institute of Asian & African languages &
cultures), Tōkyō, 1983
Index
accent (stress): vs. pitch-accent and tone, 11-14
Aden (dialect), 216-220
Ainu: close vowels, 303; dialect interference, 265; internal reconstruction, 271-272, 275-279, 325; Japanese loanwords in, 9, 245-246, 305-317, 313-319; OJ otsu i & Ainu -uy, 319-321; possessive suffix, 264-265, 295; proto-Ainu consonant clusters (Vovin), 321-325; proto-Ainu phonology and prosody (Hattori), 269-274; proto-Ainu phonology and prosody (Vovin), 292-305; reliability of Batchelor’s dictionary, 324; vowel harmony, 264, 295
Ainu dialects: Asahikawa, 276; Bihoro, 260-261; comparison of, 261-269; Hokkaidō, 259-269, 279-285; Kurils, 259-264, 285-291, 316; Kushiro, 274; Nairo, 265-266; Raichishka, 261; Sakhalin, 259-269; Saru, 259, 260, 293, 298-300; Taraika, 265-266; Tokachi, 276; Yakumo, 265, 267, 278-279, 307
Ainu pitch-accent: influence of Japanese, 285, 291-292; from earlier vowel length distinctions, 269-274, 277-292
Ainu vowel length: in Japanese loanwords, 245-246; in Sakhalin (vs. accent in Hokkaidō), 266-268; loss of distinctions, 291-292; in monosyllables, 266, 274-276; in older sources, 279-291
Akita: settlement by speakers of Japanese, 253
Akita (dialect), 28; pitch assignment rules, 10, 20, 29-30, 34, 98, 105, 116; rightward H tone shift blocked by close vowels, 182-183, 186, 231
akusento, 14
Amoghavajra, 362, 381-382, 403, 413, 449
analogy, 104-105
Annen, 7, 359, 360, 371-373, 392
Ansai zuihitsu: tone system in, 518
Aomori (dialect), 28, 30, 98, 105; tone of particle no, 129-130; rightward H tone shift blocked by close vowels, 182-183
Arai (dialect), 187, 211
Arte da Lingoa de Iapam, 93-94
Asama (dialect), 216-220, 230, 242
Ashikebu (dialect), 215-220
‘ataru’ notations, 505-507
Azuma uta, 22
Bantu, 14, 18, 97, 100-108, 123
Baoyue (Hōgetsu), 382, 403-404, 414
Biao Xingong (Hyō Shinkō), 372
bifura-ten, 396-398; in Bumō-ki, 511
Bōsō peninsula (dialect), 34, 177, 182, 184; tone system influenced by Gairin B? 188
Bumō-ki, 509-514, 510; ideai rules in, 489-490, 512-514
Bumō-ki (quotation part): fushihakase, 550-551, 560; tone system, 556-558, 550-551, 553-555; see also old rongi material
Bumō-ki (vocabulary part): fushihakase, 510-511, 525; tone system, 57-58, 112; Go-on tone markings, 357, 394, 490; tone of class 3.2, 62-63; tone of compound nouns in, 135, 511; tone system of, 42, 53-54, 516, 525, 553-555, 555; see also new rongi material
Bunkyō hifu-ron, 409, 451
Butsuyuigyō-kyō, 539, 549-550; tone marks reflect Gairin type tone system, 115; H tone restriction in, 558-559; regional base, 559; tone of class 3.7 in, 115
Buzan-ha, 368, 538
case particles (monosyllabic): tone in MJ, 81-91
Chang’an (dialect), 384-386, 331-335, 342, 392, 404-405; prenasalised stops in, 334
Chinese loanwords: tone in modern Japanese dialects, 486-490, 73-75
Chisō, 371, 373-375, 384-386
Chizan-ha, 368, 511, 538
chōjō-ten, see bifura-ten
Chōkei, 499-500
Chūga no ki, 532
chūkyoku (scale), 360, 367
Chūrin type tone system, 31-33
Chūyū-ki, 347, 482
Chūzan, 354, 394, 400-402, 420, 430, 437-438, 442-443, 446, 458, 466, 471, 480-481, 486
ci zhuo, see second muddy
circle theory, 66-67
clear, 378, 381-382
close vowels: avoidance of H pitch, 27, 171-175, 177-179, 180-188; devoicing of, 28; in onbin changes, 23; short/weak in Japanese, 251-252, 181
compound nouns: tone rules, 132-159; in Gairin dialects, 118, 144-147; in MJ, 147-152; in Nairin/Chūrin dialects, 132-135, 138-144; antiquity of rules in Gairin dialects, 148, 154-155, 255-256; antiquity of rules in Nairin/Chūrin dialects, 152-154, 256-257; relation with sequential voicing, 156-159; relevance to reconstruction of proto-Japanese tone classes, 155-156
contour tones: automatic lengthening of, 22; associated with onset of tone, 465
contour tone system, 7, 449
culminativity, 12-13, 17-18, 101
Daigaku-ryō, 343, 347
Daigo-ryū (Kogi) Shingon shōmyō, 366-368, 532
Dai-hannya-kyō ji-shō, 346, 466
Dai-hannya-kyō ongi, see Dai-hannya-kyō ji-shō
Dai-hannya-kyō onkun, 467
Daijion-ji sanzō hōshi-den, 45
Daisho hyaku-jō dai-san-jū, 505
Daisho hyaku-jō dai-san-jū yomikuse, 505, 509, 523, 553-554
daku, 410; may refer to jidaku initials in Japan, 401-402, 419, 445; see also muddy
daku-ten, 402, 448, 470
dengyuntu (rhyme tables), 332-333
depressor consonants, 174
depressor syllables, 175-179
devoicing: as cause of H tone shift, 28, 201, 227, 548
dhāranī, 5, 344, 350-351; in MK, 414, 448, 485; see also mantra
dialect area theory, 66-67
dialect contact, 94-95, 109, 234-237
dialect diversification, 253
dialect geography, 56, 58, 60-62, 65, 70, 78, 227-228, 251, 253-254; evidence for antiquity of Gairin/Chūrin split, 94-95
dialect-geographical paradox, 248-249
Dōhan: fushihakase marks, 424-425, 432, 529, 540, 561, 563
Dokkyō kuden myōkyō-shū, 397-398, 433, 439, 480
Dunhuang manuscripts, 344
Early Kan-on tone system, 478-481
Early Middle Chinese (EMC), 331-332, 334; EMC-based transcription method of Sanskrit, 376-378, 405, 484-485; EMC vs. LMC, 334; mixed up with LMC in Sino-Japanese, 387-389; realization of tones, 338, 484; vowel length in, 389, 484-485
Early Middle Japanese (EMJ), 3
Early Yayoi period, 250-251, 253
Eigaku yō-ki, 362
eight-tone system: of hakase-ke, 482; of Myōgaku, 481, 395, 403-404, 415, 428-429, 448, 455; of Siddham scholars (compared), 438-439; see also tone dots used to distinguish Chinese initials
Eijū: tone system, 421-423; interpretation of changes in transcription of Sanskrit, 423
Ekō, 346
EMC, see Early Middle Chinese
Emishi, 253
EMJ, see Early Middle Japanese
Enchin, 363
Engi shiki, 45
Ennin, 343, 345-346, 360, 363, 365, 371, 373, 382, 384, 402, 404, 414, 521-522
enunciatory strength, 383-384
Ezo kotoba irohabiki, 284-285
F tone: in MJ (standard theory), 51; in MJ (Ramsey’s theory), 78, 198-200; in proto-Japanese (Hattori’s theory), 70; in proto-Japanese (Matsumori’s theory), 71; in proto-Japanese (Ramsey’s theory), 207
F0 compression/polarization, 58, 114
fanqie, 332, 349, 350, 380, 397, 423, 561; Myōgaku’s method, 399-400, 404, 408-409, 451-455, 489
five-tone system, 477, 479; in Go-on, 354-355; of Hossō school, 354, 394-395, 420; of Shingon school, 394, 354-355
four-tone system: in earliest Kan-on, 394; in Go-on, 357, 394; of hakase-ke, 482; in Han pronunciation, 476
fu-hakase, 531
Fujiwara Kintō, 346, 466
Fujiwara Moronaga (Myōnon-in Chōen), 365
Fujiwara Munetada, 347, 399, 480-482; tone system of, 418-421; opposition to Myōgaku’s ideas, 420
Fujiwara Sanehiro (Tōin), 552
Fujiwara Teika, 5, 497-499
fu-nisshō-ten, 396-397, 489; adopted in Shingi Shingon school, 397; in Bumō-ki, 511
fushihakase: historical development of, 527-539; origin of term, 525
Gairin A type tone system, 31-33
Gairin B type tone system, 33-34; date of development, 254
Gengo kuninamari, 562; tone system, 518
Genshin, 364
goin hakase, 425, 432-433, 510, 517, 526, 536-539; early version by Tanchi, 533-534
Go-kyōgoku sesshō-ki, 347
goma-fu, 530, 540-541
goma-ten, 91-92, 541, 560-565, 570
Go-nemmon, 346
Go-on, 353, 372; relation with EMC, 341-342; reorganization of, 349, 354-359; tones, see four-, five-tone systems
Go-on shishō kaigō hi-shō, 517
Gyōa: tonal spelling system, 6, 96, 102, 497-499, 507, 550
Gyosan mokuroku, 365, 534
Gyosan shōmyō rokkan-jō, 365, 369
Gyosan sō-sho, 366, 369, 535; tone marks in Kaihon text, 521-523
Gyosan taigai-shū, 367, 369, 537
Gyosan-ryū, see Ōhara-ryū
H tone: accent-like, 101; [F] in phrase-final position, 104, 167, 200; in initial syllable in Ryūkyūs, 229-230
H tone anticipation (HTA), 14, 19-20, 29-30, 34-35, 97-99, 103-107, 115, 121, 181
H tone-bearing unit, 21, 24-27, 33
H tone loss: in class 3.6, 42, 185-186
H tone restriction: date and starting point, 254-256; earliest attestations, 99, 431, 496; attestation in fushihakase material, 99-100, 556-559; never reached Shikoku? 92, 112-114, 256
H vs. Ø tone system, 100-101, 14, 107, 116; see also tone: restricted
hakase-ke, 347-348; tone systems adopted from Buddhist scholars, 482
Hamamatsu (dialect): relation to MJ ‘Gairin’ tone system, 154, 205; tone of class 3.6, 42, 185; tone of compound nouns, 144; tone of particle no, 131
Han pronunciation, 342-346, 353, 373; tone system of, 475-478
Han’on sahō, 399, 404, 408-412, 418, 423, 428, 430, 456, 489, 489
Hana kagami, 561
han’on, see fanqie
hansetsu, see fanqie
hasi ‘chopsticks’: etymology of, 151-152; as source of Ainu pasuy, 319-321
Hasso saimon, 545
Hatoma (dialect), 211
Haya: tone system, 103
Hayato, 250
heavy ru tone: not distinguished in Kan-on, 355, 486-490
heavy shang tone: merger with qu tone in LMC, 335
heavy, see light vs. heavy
Heike monogatari: recitation, 6, 128, 361, 506, 515, 562
Heikyoku, see Heike monogatari: recitation
Hiei-zan, see Mount Hiei temple complex
Hiroshima (dialect), 98; tone of class 3.6, 185; tone of class 3.7a, 194; tone of particle no, 126-129; tone rules for compound nouns, 140-144, 147-148; antiquity of tone rules for compound nouns, 152-154; influence of sequential voicing on tone of compound, 158
Hōdan rongi yōshū, 550, 559
Hoke-kyō, 348
Hoke-kyō on (Kujō-ke-bon), 397, 433, 489
Hoke-kyō ongi, 356, 359, 467; reversed Go-on tone dots in, 356-357; shang tone replaces qu tone in Go-on (single-kana character readings), 485, 344, 359; tone system of Shinkū in, 436-437; twelve-tone system in, 394
Hoke-kyō onkun, 436-437, 457
Hoke-kyō shakumon, 354, 394, 400-402, 420, 437-438, 443, 446-447, 466, 480-481, 486; light vs. heavy in shang and qu defined in terms of length, 400-401, 480-481; unnatural division of initials over light and heavy, 401
Hoke-kyō tanji, 357, 489
Hokekyō-on, 489
Hokke senpō, 346
hon-bakase, see goin hakase
Hon-Sōō-in-ryū, see Ninna-ji Sōō-in-ryū
Honzō wamyō, 45
Hossoku-shū, 528
Huilin, 332
Hyōgo (dialect): monosyllables automatically lengthened, 84; preserves archaic (Bumō-ki-type) tone system, 42; tone of class 3.2, 195-197
Ibukijima (dialect), 37, 51, 61, 114, 130; rightward H tone shift in, 189-190; possible origin of M tone in, 190
Ichinoseki (dialect), 187, 211-212
ideai rules, 512-514, 489-490
iki-group (Ryūkyūs), 216-227
In’yū, 514
Inokawa (dialect), 209
internal reconstruction, 7, 125-152; Ainu, 271-272, 275-279, 325
Iroha-ji rui-shō, 39, 468-469
Ise Sadatake: tone system of, 518
Isei, 371, 373-375, 384-386
Ishinomaki (dialect), 187, 211-212
isogloss: between Tōkyō type and Kyōto type tone systems, 160-161; between Gairin type and Chūrin type tone systems, 94-95
ita-group (Ryūkyūs), 216-227
Izumo (dialect), 28, 192; pitch assignment rules, 115; rightward H tone shift, 146, 182-186, 211, 252; tone of class 3.5b, 201; tone rules for compound nouns, 118, 145-147; tone of particle no, 129-130
ji-amari, 23
jidaku, 401-402, 410, 419; see also second muddy
Jie, 363-364, 369
Jiguang, 383
Jin Lixin (Kin Reishin), 372
Jion kariji yōkaku, 352
jisei, 410; see also second clear
jisho-sei, 409-412, 452-454, 458
jishū-on, 409-412, 452, 455, 458
Jōgon, 372, 509, 514
jōkyo nin’i-ten, see bifura-ten
Jōmon period, 252
Jōshin, 365
Jūyo (Kōmyō-san): tone system of, 423
Kagoshima (dialect), 44-47, 74, 116-118, 487-488, 495; tone rules for compound nouns, 118
Kagoshima type tone system, 17, 37-38, 131, 187
Kaigō myōmoku-shō, 553, 559
Kakuban, 367-368
Kakui, 364, 367, 425, 527, 536-540, 545, 549, 560
Kan’yō-on, 352-353, 489
Kana moji-zukai, 498, 507
Kanchi, 399, 421
Kannō, 509, 511-512, 550
Kan-on, 51, 345, 353; relation with LMC, 343, 345-347; tones, see four-, five-, quasi seven-, eight-, quasi eight-tone systems
kataru shōmyō, 361, 363, 539
Kegon ongi shi-ki, 45
Keichū, 72, 372, 514-519, 523, 552
Kekan, 365
Kenna, 364, 537
Kenpō: fushihakase marks, 433, 529, 534, 540, 561; tone system of, 406, 431-432
key figure/hakase chart, 526; see also zu-hakase
Kindaichi’s theory, see standard theory
Kinyarwanda: tone system, 107-108, 117
Kirundi: tone system, 107-108, 117
kitamae bune, 36, 313
Kōbe (dialect): tone of particle no, 129
Kōchi: origin of place name, 22
Kōchi (dialect), 35, 37, 44-47, 80-81; cause of H tone restriction, 113; developments in monosyllabic nouns, 86; preserves archaic (Bumō-ki-type) tone system, 41-42, 516; tone of class 3.2, 62-63, 86, 109, 195-197, 248; tone of class 3.3, 118-120; tone of particle no, 127-129
Kōfu (dialect), 98, 168-169
Kofun period, 251
Kogi Shingon school, 366-370; for shōmyō tradition & tone systems see Shingon school
ko-hakase, 528-531; reinterpretation in Edo period, 559-560
Koji-ki: spelling system, 73, 478; tone notes in, 4-5, 72-73
Kōke shidai, 347
Kokin (waka-shū) kunten-shō: as MJ ‘Gairin’ material, 49, 82; tone of class 1.2, 469
Kokin waka-shū, 5; Date-ke-bon as MJ ‘Chūrin’/‘Gairin’ material, 49; H tone restriction attested in Fushimi miyake-bon & Jakue-bon, 99, 496
Komatsu (dialect): relation between vowel length and tone class, 238
Kongō-kai giki, 344, 528
Konkōmyō saishōō-kyō, 348
Konkōmyō saishōō-kyō ongi, 5, 73, 358, 394, 468; six-tone system in, 479; use of light ping tone in, 40, 468; Wa-on readings in, 358
Konparu Zenpō, 91-92, 497
Konryū mandara goma giki, 344
ko-ryū (‘traditional school’) Tendai shōmyō, 364-365
kōshiki, 360-361, 364, 530, 538-539, 543, 545, 560
Kōya-san, see Mount Kōya temple complex
kuden (oral tradition): reliability of, 442
Kujaku-kyō, 393
Ku-jō shakujō, 346
Kūkai, 332, 342, 362-363, 366, 381, 403, 409, 451, 466, 485, 551
Kumaso, 250
kundoku, 348
Kunshū shittan shii yōketsu-shō: tone system of Eijū in, 421-423, 439
Kyengsang, 107
Kyōto (dialect): archaic features of, 50-51, 64, 110-111; as witness to proto-Japanese tone system, 4; geographical distribution in earlier periods, 91-94; tone rules for compound nouns, 132-133, 136-138; tone of particle no, 127
Kyōto type tone system, 34-37, 108-109
L tone: development in Kyōto type tone systems, 108-109; elimination from MJ tone system, 106; failure to develop in certain Kyōto type dialects, 161-165; related to vowel length in proto-Japanese? 237-246
L tone spreading (rightward): facilitated by close vowels, 180, 183-184; in Kyōto, 112
Late Middle Chinese (LMC), 332-333, 334-335; as ancestor of modern dialects, 335; differences between EMC & LMC, 334; merger of heavy shang tone with qu tone, 335, 377, 384-385; origin of merger of heavy shang tone with qu tone (Pulleyblank vs. Kindaichi), 490; mixed up with EMC in Sino-Japanese, 387-389; realization of tones, 339-340, 371-391, 384-386, 389-391; LMC-based transcription method of Sanskrit, 381-382, 484-485; vowel length in, 334, 377, 389, 484-485
Late Middle Japanese (LMJ), 3
Later Kan-on tone system, 481-482
lexical diffusion, 237
light ping tone: as shang-ping sequence in MJ, 40; in Mōgyū, 350; in Go-on, 354-359; not distinguished in Wa-on, 354, 358
light ping tone dot, 39; abandoned as marker of MJ tones, 41, 468-471; see also tone dots: distribution over MJ lexicon
light ru tone: in Mōgyū, 350; not distinguished in Wa-on, 355, 486-490
light vs. heavy: defined in terms of length, 438, 400-401, 419-420, 430, 446-447, 480-481, 500-501; distorted in Siddham scholars’ tone systems, 446, 452-454, 472; in EMC/LMC, 376-380, 385-386; interpretation by Myōgaku, 450-451, 453-454, 474, 470-471; tone of compound nouns as evidence for nature of distinction in Japan, 488-489; unnatural division of initials over categories in Kan-on, 443-445, 401
LMC, see Late Middle Chinese
LMJ, see Late Middle Japanese
locus (of accent), 15-16, 299
Lu Fayan, 331
Luganda: tone system, 14, 103, 117
M tone, 58-60, 65, 100-104, 190
Maeno (dialect), 209, 228, 230
Mahāvairocana sutra, 334
Makō inkyō, 352
Makurazaki (dialect), 38, 116-117, 212, 235-237, 495
Man’yō-gana, 73, 469; tonal spelling system in Nihon shoki, 475, 478
Man’yō-shū, 22; spelling system, 73, 478
Manabe type tone system, 37
mantra, 344, 350-351
mari-group (Ryūkyūs), 216-222
Masana (dialect), 18, 209, 242
Matsubara (dialect), 210-211, 228, 230
Matsue (dialect), 192, 115, 216, 243; devoicing of close vowels, 28; L tone spreading facilitated by close vowels, 180, 183-184; rightward H tone shift blocked by close vowels, 33, 168, 180, 182-184; tone of particle no, 130
Matsumae no kotoba, 279-282
Matsumoto (dialect), 98; tone of class 3.6, 185; tone of class 3.7a, 191, 194
meyasu hakase, 525, 534-536
Middle Chinese, see Early Middle Chinese/Late Middle Chinese
Middle Japanese (MJ), 3; dialectal distinctions in tone system of, 87-91
Middle Korean (MK): dhāranī recitation, 414, 448, 485; merger of shang and qu in Sino-Korean, 491-493; pitches compared with MJ, 495; tonal value of tone dots, 491
Middle Yayoi period, 251
Min dialects, 333, 335
Minamoto Shitagō, 466
Minamoto Tomoyuki, see Gyōa
MJ ‘Chūrin’ material: geographical base, 83-86, 88-91
MJ ‘Gairin’ material: geographical base, 88-91, 114, 154, 205
MJ ‘Nairin’ material: geographical base, 83-86, 88-91
MJ, see Middle Japanese
MK, see Middle Korean
Modern Japanese, 3
Mōgyū, 334, 349-350, 393
Moji-han, 430-431; H tone restriction in, 99, 431, 496-497, 508; light vs. heavy in shang and qu defined in terms of length, 438; influence of Shittan-zō, 443, 501
Monnō, 352; tone system of, 518
mora (vs. syllable), 20-27
moraic tone markings, 45
Moshiogusa, 282-283
Mōtan shichin-shō, 91-92, 497
Motoori Norinaga, 23, 72, 352, 372; tone system of, 518-519
Mount Hiei temple complex, 362-363, 366, 399
Mount Kōya temple complex, 100, 359, 363, 366-369, 405, 433, 500, 502, 507-508, 514, 540, 545, 551-552, 555, 563-564
Mount Negoro temple complex, 100, 368-369, 505, 508, 511, 550-555, 559
muddy, 378, 381-382
Myōe, 364, 543, 549
Myōgaku, 346, 380, 395-397, 431; background of, 399-400; influence on Siddham scholars’ tone theories, 418, 420-423, 422, 425, 431, 438, 450-459; interpretation of changes in transcription of Sanskrit, 402-405, 422; misinterpretations of Shittan-zō, 445-446; tone systems of, 406-418, 439-440; use of terminology (Wa-on/Go-on/Kan-on), 346
Myōgo-ki: as MJ ‘Gairin’ material, 49, 82, 547
myōmoku, 509
Myōmoku-shō, 552
Myōnon-in-ryū Tendai shōmyō, 363-365
Nagoya: area (in Middle Yayoi period), 251
Nagoya (dialect), 10, 20, 30, 32, 84, 89, 98, 251
Nairin type tone system, 31-32
Nakijin (dialect), 239-245
Nan-zan, see Kōya-san
Nanzan Shin-ryū (Kogi) Shingon shōmyō, 363, 366-369, 501, 532-533, 536-538, 545
Nara shōmyō, 361-362
Narada (dialect), 34; tone of particle no, 129
Naze (dialect), 216-220
Negoro-san, see Mount Negoro temple complex
neuma/neumatic, 527, 539-542
new rongi material, 99, 497, 505, 509
new two-step analysis, 30
Nihon shoki, 5, 73, 372, 468-469, 475, 478; as MJ ‘Nairin’ material, 49; use of light ping tone in (Iwasaki-bon, Maeda-ke-bon), 40, 468
Nihon shoki shi-ki: as MJ ‘Chūrin’ material, 49; use of qu tone abandoned (Ōei-bon), 39, 90
Ninna-ji Sōō-in-ryū (Kogi) Shingon shōmyō, 366, 528, 532, 544-545; use of shōten hakase, 528
Ninna-ji-ryū, see Ninna-ji Sōō-in-ryū
Nō: recitation, 6, 91, 361, 561-562
Nomori no kagami, 365
Noto type tone system, 165-177; influence of segments on tone, 171-175; Kindaichi’s view, 165-166; McCawley’s view, 169-170
Nōyo, 397, 480
Nozaki (dialect): as archaic Tōkyō type tone system, 17-20, 79-80, 106, 111, 123, 165-177
Nubi: pitch-accent, 13
Numazu (dialect), 98; tone of class 3.6, 185; tone of class 3.7a, 191, 194
nu-sounds, 381-383
Ō(mu), 239-243
Obligatory contour principle (OCP), 186
obligatoryness, 12-13, 15, 17, 234
Ōe Koretoki, 341
Ōgimi (dialect), 209
Ōhara-ryū Tendai shōmyō, 365-366, 369, 521-522, 533-535
Ōhara’s theory, 72-75
Ōita (dialect), 44, 46-47, 115; tone rules for compound nouns, 118, 144, 146-147, 154; tone of particle no, 126, 129-130; tone of class 3.6, 42, 185
OJ, see Old Japanese
Okinawa (dialects): 208, 217-222, 227, 229-230, 237, 239-246, 259, 317; monosyllables automatically lengthened, 84
Old Japanese (OJ), 3; tone system of, 72-75
old rongi material, 99-102, 114, 497, 507, 510, 550-560; H tone restriction in, 556-558
Ōmi shōnin, 313-316
onbin, 23-24, 251
ondoku, 348
ongi, 348, 352
on-hakase, 343, 347, 372, 392
Onna (dialect), 215, 230, 239, 241-244
open vowels: preferred as H tone-bearing unit, 27
Ōsaka (dialect), 109, 112, 193; tone of particle no, 127-129
Oshimizu (dialect): relation between vowel length and tone class, 238
Pacific coast dialect group, 252
Paekche, 341, 361, 372; influence on Wa-on tones? 389, 484; loanwords in OJ, 494-495
parallel developments: in Chūrin and Gairin dialects, 249-250; in Gairin B and East Sanuki dialects, 37, 188; in standard theory, 56, 58-62, 68-69, 71-72
particle no (special tone): in modern dialects, 125-132; in Kagoshima type tone system, 131; in MJ, 81-91, 126
pause: as L tone, 19-20, 103
phrase boundary tone, 10, 29, 34-35, 98-99, 103-105, 109-110, 115, 181, 183, 188
ping tone: in Wa-on/Go-on, 354-359; realization in LMC, 473; vs. oblique tones in Middle Chinese, 338; see also tone dots: distribution over MJ lexicon
pitch assignment rules, 10, 29, 34, 98-116, 146
pitch: default, 15
pitch-accent: in Ainu, 13, 267-268, 285, 291-292; in MJ? 14-17, 122; in the modern Japanese dialects, 17-20, 121; in Nubi, 13; vs. (stress)-accent and tone, 11-14
proto language, 4
proto-Japanese tone system: in Hattori’s theory, 65, 70-71; in Hayata’s theory, 58-60; in Matsumori’s theory, 71-72; in Ramsey’s theory, 77; in standard theory, 50; evidence from Japanese loanwords in Ainu? 246, 305-317
Qieyun, 331-333, 335-337, 339, 342, 376, 387-389, 405, 411, 484
Qin pronunciation, 333-334, 342; see also Han pronunciation
qing, see clear
Qiyinlue, 333
qu tone: abandonment as marker of MJ tones, 39-41, 90, 470; as ping-shang sequence in MJ, 39, 462, 466-467; in Myōgaku’s six-tone system, 428, 456-458; in Wa-on/Go-on, 354-359; not distinguished from shang tone in Wa-on, 354; origin in Chinese, 336-339; realization in LMC, 334; regarded as longest tone in Kan-on, 359; replaced by shang tone in Go-on (single-kana character readings), 354-359, 359, 433, 485; value in Kan-on (Shingon school), 427-429, 459, 461-462; value in Kan-on (Tendai school), 457; see also tone dots: distribution over MJ lexicon
quasi eight-tone system (Tendai school), 396-398, 457-458
quasi seven-tone system (Shingi Shingon school), 397, 510-512, 520
R tone: in MJ (Kindaichi’s theory), 51; in MJ (Ramsey’s theory), 78-79, 83-86, 88, 200-207; in proto-Japanese (Matsumori’s theory), 71; in proto-Japanese (Ramsey’s theory), 193-194, 207, 247
Raiyu, 368
Ramsey’s theory: development of Nairin type tone system, 79; development of Nozaki tone system, 79; development of Kyōto type tone system, 80-81; development of Gairin type tone system, 81-82; development of Chūrin type tone system, 82-83; developments in monosyllabic nouns, 84-86; origin of L register in class 3.2 in Kyōto type tone systems, 86; relation with Siddham scholars’ tone systems, 464-467; value of tone dots in, 77-78
reduction (of differences in pitch height), 103, 113
‘register’: in Kyōto type tone systems, 10, 15, 35-36, 53, 86-87, 108-109, 121; incongruent in compounds in Kyōto, 138
register (yin/yang, light/heavy): in EMC/LMC, 336, 378-380, 382-387, 389-391, 404, 443-444, 478-479, 490; distorted in Siddham scholars’ tone systems, 446, 452-454, 472
register tone system, 7, 16-17, 101, 254, 293, 449, 473, 484
Reiji sahō, 346
‘reversed’ circle theory, 68-69
rhyme dictionary (yunshu), 331, 390
rhyme tables (dengyuntu), 381-382
Rishu-kyō, 345, 563
ritsu (scale), 360-361, 367, 532, 534, 544
Rodrigues, João, 93-94
rōei (banquet songs), 540-542, 561
Rōei yōshū (Bun’ei-bon), 540-542
Roku-jō Arifusa, 365
Romanization: Kunrei, 10-11; Hepburn, 10-11; of OJ, 11
rongi, 357, 360, 362-363, 489, 497, 550-552; extinction in (Kogi) Shingon school, 502, 508, 552; extinction in Shingi Shingon school, 369, 508, 552, 554; restoration, 554, 556, 562
rongi books, 42, 357, 509, 550-553; see also old/new rongi material
Rongi-shō, 507, 552
rongi-sho, see rongi book
ru tone: in EMC/LMC, 336, 338, 386, 389-390; in Kan-on, 324-359, 486-490, 525;
in Wa-on/Go-on, 354-359, 486-490, 525; weakening of final closure in 9th century LMC, 378-380; weakening of final closure attested in Tendai Kan-on, 346
Ruiju myōgi-shō: background of, 466-467; reversed Wa-on tone dots in, 356; use of terminology (Wa-on/Sei-on) in, 347
Ruiju myōgi-shō (Kanchi-in-bon): as main source on MJ tone system, 4-5; as MJ ‘Chūrin’ material, 49; use of light ping tone in, 468-469
Ruiju myōgi-shō (Tosho-ryō-bon): as MJ ‘Nairin’ material, 49; no merger of heavy shang with qu in Sei-on, 387; ru tone always light in Sei-on (five-tone system), 394-395, 445, 479, 486; use of light ping tone in, 40, 468, 480
ryō (scale), 360, 367, 532
Ryōnin, 365, 369, 535, 562
Ryōson, 355-356; comparison of Go-on and Kan-on tones by, 355-357; tone system of, 429-430; value of qu tone, 459, 546
Ryōyū, 365
Ryūkyūs: settlement by speakers of Japanese, 255
Ryūkyūs (dialects), 208-246; no separate tone class 2.3 in Ryūkyūs? 212, 215-216; special tone class distinctions, 121-122, 208; possible origin of special tone class distinctions, 212-214, 232-237
Saichō, 342, 345, 362-363
Saidai-ji Sōō-in school, see Saihō-in school
Saihō-in school (Kogi) Shingon, 366, 532, 544
Sakimori uta, 22
Sakumon daitai, 418-420, 422, 438, 447, 480-481, 501
San (dialect), 18, 209-210, 235
Sandai jitsu-roku, 372
Sanskrit (transcription of), 332, 334, 392; vowel length distinctions in, 334, 338-339, 484-485
Sanuki (dialect): rightward H tone shift blocked by close vowels, 37, 188
Sanuki type tone system, 37, 61, 92, 113, 130, 178, 187-188, 235
Sea of Japan coast dialect group, 252
second clear, 381-382
second muddy, 381-382
segments: influence on tone in Japanese, 25-28, 34, 37, 171-175, 177-179, 180-188, 201, 227, 230-232, 239, 546-548
Sei, see Isei
sei, 401-402, 410, 419; see also clear
Sei-on, 346, 353, 356
Senga, 532, 544-545
Sengen-shō, 499-500, 507
senritsu-kei (melodic pattern/vocal formula), 361, 527-528
seven-tone system, see quasi seven-tone system
shang tone: function as shang-ping sequence in MJ, 463; not distinguished from qu tone in Wa-on, 354; origin in Chinese, 336-339; realization in LMC, 473; see also tone dots: distribution over MJ lexicon
Shilla, 340, 345, 372, 495
Shimokita peninsula (dialect), 33, 253; rightward H tone shift blocked by close vowels limited to longer words, 180, 182, 185-187, 211
Shingi Shingon school, 367-370; use of ko-hakase, 550-551, 560; adoption of goin hakase, 510-511, 525, 538; reanalysis of tone system, 505-507; rongi tradition, 42, 369, 508, 550-555, 559-560; tone system after shift, 397, 510-512, 520
Shingō, 467
Shingon school, 350-351, 359, 360-370; rongi tradition, 552; tone system, 425-430, 546; disruption of tone system, 500-501; tone system after shift, 500-502,
514-517; reanalysis of tone system after shift, 501-505; use of ko-hakase, 424, 433, 528-531, 534; use of zu-hakase, 531-533; adoption of goin hakase, 536-539, 564, 570
Shinkū, 356, 359, 394, 436-437, 485
Shinpan, 382, 424, 502, 529; tone system, 425-429, 431, 546; use of rhyme tables, 382, 402; value of qu tone, 427-429, 459
Shinren, 354-355, 388, 394, 399, 424-425, 486, 488-489, 523
Shin-ryū (Daishin shōnin-ryū), see Nanzan Shin-ryū
shin-ryū (‘innovative school’) Tendai shōmyō, 364-365
Shinsen jikyō, 39, 45
Shin-Sōō-in-ryū (Kogi) Shingon shōmyō, 366
Shisei narabi ni ideai dokushō shiki, 520
Shishō shiki, 369, 500-502, 505, 507-508, 552, 561; tones defined in terms of length, 500-501
Shittan, see Siddham
Shittan go-on-shō, see Shittan-hi
Shittan-hi, 404-408, 412, 431, 439, 456-457
Shittan hiden-ki, 425-429
Shittan-jiki kikigaki/Dōhan-ki: fushihakase marks in, 424-425, 432, 529, 540, 561, 563
Shittan jiki-shō, 423, 439
Shittan-jiki shōgaku-shō, 406, 431-432
Shittan kuden, 354-355, 394
Shittan rinryaku-zu-shō, 355-356, 429-430, 459; comparison of Go-on and Kan-on tones in, 355-357
Shittan sammitsu-shō, 372, 509
Shittan shogaku-shō: fushihakase marks in, 433, 529, 540, 561
Shittan yōketsu, 346, 412-418, 428, 439, 457
Shittan-zō, 7, 342, 346, 360, 371-391; as basis of the Siddham scholars’ tone systems, 394, 396, 401-404, 414, 420, 422, 430, 438, 441-449, 451, 453-460, 466, 471-472, 474, 478-479, 481, 485, 514-515, 567-568; misinterpretations of text, 443-449
Shiza kōshiki (Daiji-in-bon), 24-25, 526, 561, 570; as MJ ‘Gairin’ material, 49, 82, 547; fushihakase marks in, 543-548; influence of segments on tone in, 546-548
Shiza kōshiki (other manuscripts), 543
Shizukuishi (dialect), 34, 144; tone rules for compound nouns, 146
Shōketsu-sho, 537
shōmyō, 360-370, 567-569; fushihakase musical notation systems, 525-565; influence of Shittan-zō, 371, 441-442; Rai’s study of Tendai shōmyō, 520-523; revival in 17th century, 509-517; styles of recitation that reflect MJ tones, 360-361, 539, 541, 545
Shōmyō yōjin-shū, 365, 371
shōmyō-shū, 528
Shōmyō-shū nikan-shō, 535
Shosha-san shōmyō-shō, 359, 471; adoption of Kan-on distinctions in Go-on tones, 436; tone system of, 433-436, 439
shōten hakase, 424, 432-433, 528-529
shōten, see tone dots
Shuei, 382, 403-404, 414
Shūen, 366
Shūi waka-shū (Jōben-bon/Maedake-bon): as MJ ‘Nairin’ material, 49, 89; use of qu tone abandoned, 39-40, 90
Shūkai, 365, 534
Shuri (dialect), 122, 215, 217, 220-222, 230, 238-244; vowel length in Chinese loanwords, 229
shutton (initial tone of melodic pattern/vocal formula), 531-532, 535, 545
Siddhām jūhas-shō kikigaki, 514, 523
Siddham scholars, 392
Siddham script, 351
Sino-Japanese, 341-359
Sino-Korean: tones, 491-493
Sino-Vietnamese: tones, 390-391; merger of heavy shang tone with qu tone, 384
sixteen-tone system, see tone dots: used to distinguish Chinese initials
six-tone system: in Kindaichi’s interpretation, 463; of hakase-ke, 482; of Hossō school, 394, 479; of Myōgaku, 403-406, 415, 428-429, 457; of Shingon school, 354-355, 395-396, 403-404, 414, 431, 436, 439, 445, 447-449, 457-459, 461-463, 466-467, 470-471, 481, 511; of the Siddham scholars compared, 438-439; merger of light qu with shang in Shingon/Hossō school, 396, 447-449
six-tone system (after shift): of Shingon school, 502-505, 512, 514-517, 520, 523-524; of Tendai school, 520-523
Sō, see Chisō
Sonpen, 366, 532
South Hamkyeng, 107
standard theory: development of Gairin type tone system, 54-56; development of MJ tone system, 52-53; development of Nairin/Chūrin type tone system, 53-54; explanations for development of H tones, 56-58; MJ tone system predates the tone system of proto-Japanese, 55; proto-Japanese tone system, 50; relation with Siddham scholars’ tone systems, 461-464; unexpected L register in class 3.2, 62-63; value of tone dots in, 51
sumi-fu, 525, 541, 560-565, 570
Swadesh’s (basic word list), 260
syllable: vs. mora, 20-27; syllable (heavy): exclusion of, 27-28
syllable-tone, 16, 29; in the Ryūkyūs, 209-210; vs. word-tone, 19, 17
tada hakase, 529-530
Tanchi, 360, 365, 369, 371, 527, 533-535, 539; early goin hakase, 533-534
Tarama (dialect), 19, 209, 228
Tarui type tone system, 29-34, 37, 160-165
Tendai Kan-on, 345-346, 353, 379, 384, 467; weakening of final closure in ru tone, 346
Tendai school, 359, 360-370, 437; meyasu hakase, 525-536; modern meyasu hakase, 570; Rai’s study of Tendai Kan-on tones, 520-523; tone system after shift, 520-523
Tenrei banshō myōgi, 332, 466
three-step analysis, 30
Tōhoku: settlement by speakers of Japanese, 251-254; southern Tōhoku: mixed dialect reflexes in, 254
Tō-in, 346, 349, 353
Tokunoshima (dialect), 18, 68, 209, 212, 227-231, 235-236, 243-244
Tokuwase (dialect), 216-220, 231
Tōkyō (dialect), 44-47; tone rules for compound nouns, 132-135, 138-144, 132-138
Tōkyō type tone system, 29-34; see also Nairin, Chūrin, Gairin A, Gairin B type tone system
tonal spelling system: Fujiwara Teika, 5, 497-499; Gyōa, 6, 96, 102, 497-499, 507-508, 550; in Konkōmyō saishōō-kyō ongi, 5, 73; in Nihon shoki, 5, 73, 475, 478
tone: active, 14, 17, 35, 102, 107, 109; default, 14, 17, 100-101; distinctions in verbs and adjectives, 3, 41, 109, 122-123; restricted, 13-14, 18-20; see also H vs. Ø tone system
tone assimilation: perseverative, 107; anticipatory, see H tone anticipation
tone classes, 4, 9, 31, 37-38, 122-123, 116; merger patterns, 31-32, 186; possible origin of class 2.5, 205-207, 245; see also Ryūkyūs: special tone class distinctions
tone classes (subclasses), 6-7, 9, 41, 43, 46,
109, 114, 191-207, 247; based on modern dialect reflexes, 191-198; based on reflexes in the Ryūkyūs, 233; based on tone dot attestations, 198-207; class 2.2a too large in Martin’s classification? 191-192; class 3.2a/b attested in Kyōto type dialects? 194-198; class 3.5a/b attested in Tōkyō type dialects? 204-205; possible origin of class 3.5b, 205-207; reasons for lack of reflex for class 1.3b in modern dialects, 198-200
tone dots, 5; changes in location of, 343-344; distribution over MJ lexicon, 38-49; in the ‘new style’, 517, 552; introduction of, 343-344; reversed Wa-on/Go-on type, 356-358; selected to mark the tones of MJ, 461, 470; tone value in Ramsey’s theory, 51, 77-78; types of, 393-398; used to distinguish Chinese initials, 350, 393; value based on Kan-on, 358, 392; value when added to dhāranī, 480; value when added to MJ, 480-481; value in MK, 491
tone shift (leftward): in Ōhara’s theory, 72; in Ramsey’s theory, 77; acknowledged in standard theory, 41-42, 107; in Kyōto, 106-109; more common in restricted tone systems, 107-108; time of occurrence in Kyōto, 256, 496-508
tone shift (rightward): affected tone classes, 180, 182, 188; H tone shift blocked by close vowels, 34, 37, 145-146, 178, 180-188, 210, 230-232, 239, 251-252, 257; relation to development of word-tone, 187; does not result in Tōkyō type tone system, 187, 189-190; in standard theory, 61; in the Ryūkyūs, 209-212, 235; starts in longer words, 184-186, 235, 211
tone spreading: vs. shift, 181; leftward, see H tone anticipation (HTA)
tone spreading (rightward), 60-61; in MJ, 14-15, 48-49, 81-83, 88-90, 96, 120, 123, 193, 199, 204-205, 248-250, 557; in the modern dialects, 180-181, 186, 190, 211
tone: vs. (stress)-accent and pitch-accent, 11-14
tonogenesis: in Chinese, 336-339; in Japanese, 4; in Vietnamese, 336-337
tō-ten, see light ping tone dot
Totsukawa (dialect), 32, 44, 46-47; pitch assignment rules, 105-106; tone of particle no, 129-131; as dialect island, 61, 65, 69, 94, 99, 161, 555; developments in monosyllabic nouns, 84
Toyama (dialect), 177-179
Tsugaru (dialect): tone rules for compound nouns, 146
Tsuruoka (dialect): tone of particle no, 129-130
Tsushima kōgin-ki, 341
Tsushima-on, 341, 353-354, 372; see also Wa-on
twelve-tone system (of Shinkū), 436, 394, 437
two-step analysis, 30
utau shōmyō, 360
vowel length: as origin of H tone in initial syllable in the Ryūkyūs? 222-229; as origin of split in classes 2.3 and 2.4/5 in the Ryūkyūs? 212-214; as origin of Tōkyō type location of H tone? 213-214; attestations in Ruiju myōgi-shō, 22; automatic in monosyllables, 23, 44, 83-84, 249; geographical distribution in the Ryūkyūs, 222-228; in Chinese loanwords in the Ryūkyūs, 229; in proto-Japanese? 233-234; in Japanese loanwords in Ainu, 245-246, 259; relation with L tone in proto-Japanese? 237-246
Wadomari (dialect), 18, 209, 216, 218-220
Waji shōran-shō, 372, 514-517
Waji taikan-shō (tone system), 518
Wakayama (dialect), 511, 553-555; monosyllables automatically lengthened,
84; preserved archaic (Bumō-ki-type) tone system, 42, 112, 193; tone of class 3.2, 63, 86, 109, 195-197; tone of particle no, 129
Wamyō ruiju-shō, 5, 39, 393, 395, 466, 468; use of light ping tone in, 43, 463, 468
Wa-on, 344-345, 346-347, 352, 356; Paekche component in? 341; nasal initial in, 387; reversed tone markings, 356-358; possible origin of reversed tone markings, 483-490
weak vowels, 174; in proto-Japanese, 252
wokoto-ten, 344, 347
Wokoto-ten-fu, 482
word-tone: Chibu dialect, 38; development of, 187, 116-118, 209-212, 235-237; in Kagoshima dialect, 37-38; in Makurazaki dialect, 38; realization in the Ryūkyūs, 209-212; vs. syllable tone, 17, 19
Wu dialects, 377-378
Wu pronunciation, 333-334, 342, 373
Xitan, see Siddham
Xitanziji, 333, 342
Yamaguchi (dialect), 98, 130, 185
Yamato: province, 92; kingdom, 253-254
Yang Zhitui, 332
Yayoi migration: order of settlement, 250-251, 253
yin/yang tonal split, see register: in EMC/LMC
Yiqiejing yinyi, 332, 467
Yōkyoku, see Nō: recitation
yomikuse, 505, 552-555
yomu shōmyō, 361, 363, 539
Yonaguni (dialect), 210, 220-222, 228; tone classes similar to Akita? 231; d- not original, 229
Yuan Jinqing (En Shinkei), 372
Yuanhe yunpu, 339
Yūkai, 500, 502, 507, 514
Yunjing, 333
Yunying, 332
Yupian, 332, 341
Zeami, 561
Zhiguang, 333, 403, 405
zhuo, see muddy
Zoku-on, 352-353
zu-hakase, 531-533