/usr/share/doc/latex-cjk-common/pyhyphen.txt is in latex-cjk-common 4.8.3+git20120914-2ubuntu1.

This file is owned by root:root, with mode 0o644.

This is the file pyhyphen.txt of the CJK macro package ver. 4.8.3
(07-May-2012).

Hyphenation patterns for unaccented pinyin syllables
----------------------------------------------------

Sometimes it makes sense to use unaccented pinyin syllables for common names
and phrases which are repeated frequently; sometimes you are in an
environment which doesn't allow accented pinyin syllables at all. In such
cases it is desirable to have correct hyphenation without manually adding
hints such as `\-' between the syllables.

Fortunately, due to the limited number of Chinese pinyin syllables (407 for
Mandarin), it is easy to create hyphenation patterns. The logical
consequence is to add a new `language' to the Babel package, and exactly
this can be found in the directory utils/pyhyphen.


Installation
------------

This is fairly straightforward. Move the Babel language definition file
pinyin.ldf to a place found by TeX. If, for example, you maintain a local
TEXMF tree, a good place would be $TEXMFLOCAL/tex/generic/babel/pinyin.ldf.
Similarly, move the pinyin hyphenation pattern file pyhyph.tex into your
(local) TEXMF tree; the analogous place would be
$TEXMFLOCAL/tex/generic/hyphen/pyhyph.tex.

Now run texconfig (or a similar tool) to add pyhyph.tex to the hyphenation
patterns in use. In the usual case you have to add a line saying

  pinyin    pyhyph.tex

to the hyphenation configuration file language.dat. Finally, build a new
format file (usually with the command `initex latex.ltx'); in most cases
this happens automatically.

Using Babel ensures that it works both with LaTeX and Plain TeX.


Usage
-----

Do something like this:

  \documentclass[...]{...}

  \usepackage[T1]{fontenc}
  \usepackage[pinyin,german,english]{babel}
  ...

  \begin{document}
  ...
  \foreignlanguage{pinyin}{some pinyin syllables}
  ...
  \end{document}


Note 1: pinyin.ldf is intentionally very minimal. Don't expect, for example,
        that \chapter yields a pinyin version of the Chinese word for
        `chapter'. It might be useful to define a shorthand macro like the
        following:

          \newcommand{\py}[1]{\foreignlanguage{pinyin}{#1}}

        Now you can simply say

          \py{Beijing}

Note 2: The hyphenation patterns use `umlaut u' with code position 0xFC
        (its position in both Latin-1 and the T1 encoding). You can also use
        OT1 encoding, but then the patterns containing `umlaut u' won't
        work. Additionally, the quote character `'' is used as a letter,
        which is needed to resolve ambiguities like this:

          Xi'an <-> Xian

        If a syllable not at the beginning of a word starts with a vowel
        (i.e., `a', `e', or `o'), you must precede it with a quote
        character. Example:

          Tian'anmen

        The hyphenation patterns correctly treat it as Tian'-an-men.

        The shorthand `"u' (as used in German) is available to input
        `umlaut u'.

Note 3: Most Babel language support files also define a `<language>.sty'
        file. This is not true for pinyin! pinyin.sty is used for accented
        pinyin syllables, which don't need special hyphenation support.
        (pinyin.sty also works with Plain TeX.)
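
The quote rule from Note 2 is easy to get wrong when names are assembled by
a script. The following Python sketch shows one way to apply it before the
text is handed to \foreignlanguage. The helper name `join_pinyin' is
hypothetical and not part of the CJK package; the sketch assumes the input
is a list of non-empty, unaccented syllables.

```python
# Sketch only: join_pinyin is a hypothetical helper, not part of the CJK package.
QUOTE_VOWELS = "aeo"  # the vowels named in Note 2


def join_pinyin(syllables):
    """Join unaccented pinyin syllables into one word, inserting the quote
    character before every non-initial syllable that starts with a, e, or o."""
    word = ""
    for index, syllable in enumerate(syllables):
        if index > 0 and syllable[0].lower() in QUOTE_VOWELS:
            word += "'"
        word += syllable
    return word


print(join_pinyin(["Tian", "an", "men"]))  # Tian'anmen
print(join_pinyin(["Xi", "an"]))           # Xi'an
print(join_pinyin(["Bei", "jing"]))        # Beijing
```

The result can then be used as the argument of \foreignlanguage{pinyin}{...}.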


Technical details
-----------------

The dictionary used to construct the hyphenation patterns was created with
the small C program `pinyin.c', which simply combines all existing Chinese
syllable pairs, inserting quote characters where needed. Then `patgen' was
run on the dictionary; `pinyin.tr' defines the character set used.
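
The idea behind the dictionary can be sketched in a few lines of Python. This
is not a port of `pinyin.c': the syllable list is a tiny illustrative sample
rather than the full Mandarin set, the function name is hypothetical, and
marking the syllable boundary with `-' is an assumed dictionary convention.

```python
from itertools import product

# Tiny illustrative sample; the real dictionary covers all Mandarin syllables.
SAMPLE_SYLLABLES = ["xi", "an", "men"]


def build_dictionary(syllables):
    """Return every ordered syllable pair as one word. The boundary is marked
    with "-", and a quote is inserted first when the second syllable starts
    with a, e, or o."""
    words = []
    for first, second in product(syllables, repeat=2):
        separator = "'-" if second[0] in "aeo" else "-"
        words.append(first + separator + second)
    return words


for word in build_dictionary(SAMPLE_SYLLABLES):
    print(word)
```

With n syllables this yields n*n entries, so the sample of three produces
nine words, among them xi'-an and an-men.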

Due to the regularity of the word combinations, only two-letter patterns of
the first level are needed to find all possible breaks without a single
error or omission.
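
To illustrate how short first-level patterns locate breaks, here is a minimal
Python sketch of pattern matching in the style TeX uses. The three patterns
are assumptions chosen to reproduce the examples in this file; they are not
copied from pyhyph.tex. The sketch also ignores TeX's word-boundary markers
and minimum fragment lengths.

```python
def parse_pattern(pattern):
    """Split a pattern such as "1me" into its letters ("me") and a list of
    levels, one for each gap before, between, and after the letters."""
    letters = ""
    levels = [0]
    for char in pattern:
        if char.isdigit():
            levels[-1] = int(char)
        else:
            letters += char
            levels.append(0)
    return letters, levels


def hyphenate(word, patterns):
    """Mark a break at every gap whose highest matching level is odd."""
    lowered = word.lower()
    gaps = [0] * (len(word) + 1)
    for letters, levels in map(parse_pattern, patterns):
        start = lowered.find(letters)
        while start != -1:
            for offset, level in enumerate(levels):
                gaps[start + offset] = max(gaps[start + offset], level)
            start = lowered.find(letters, start + 1)
    pieces, last = [], 0
    for position in range(1, len(word)):
        if gaps[position] % 2 == 1:
            pieces.append(word[last:position])
            last = position
    pieces.append(word[last:])
    return "-".join(pieces)


# Hypothetical two-letter, first-level patterns (not taken from pyhyph.tex).
TOY_PATTERNS = ["'1a", "1me", "1ji"]

print(hyphenate("Tian'anmen", TOY_PATTERNS))  # Tian'-an-men
print(hyphenate("Beijing", TOY_PATTERNS))     # Bei-jing
```

Because every pattern here has level 1, no pattern ever suppresses a break
proposed by another one; a match simply marks a permitted break.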

---End of pyhyphen.txt---