\input texinfo @c  -*-texinfo-*-

@c %**start of header
@setfilename mule-ucs.info
@settitle Mule-UCS Manual
@setchapternewpage odd
@c %**end of header

@c This is *so* much nicer :)
@footnotestyle end

@c Version values, for easy modification
@set VERSION $Revision$
@set UPDATED 25 January 2002


@c Entries for @command{install-info} to use
@direntry
* Mule-UCS::                Lisp-based Unicode support for Emacsen.
@end direntry

@c Copying permissions, et al
@ifinfo
This file documents the XEmacs package distribution of Mule-UCS, a
package providing efficient Lisp-based coding support (specifically,
Unicode) for Emacs and XEmacs.
     
Copyright @copyright{} 1997 MIYASHITA Hisashi
Copyright @copyright{} 2001, 2002 Free Software Foundation, Inc.
     
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
     
@ignore 
Permission is granted to process this file through TeX and print the
results, provided the printed document carries a copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).
   
@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
sections entitled ``Copying'' and ``GNU General Public License'' are
included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.
     
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation
approved by the Free Software Foundation.
@end ifinfo

@tex

@titlepage
@title Mule-UCS User Manual
@subtitle Last updated @value{UPDATED}

@author by Stephen J. Turnbull
@author including documentation by MIYASHITA Hisashi
@page

@vskip 0pt plus 1filll
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
     
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
sections entitled ``Copying'' and ``GNU General Public License'' are
included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.
     
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation
approved by the Free Software Foundation.

@end titlepage
@page

@end tex

@ifnottex
@node Top, Copying, (dir), (dir)
@top Mule-UCS User Manual

Mule-UCS is a character code translator.  It provides functions to
translate from any character set to any other, and construct new coding systems
easily.  It requires the MUltiLingual extensions to Emacs (MULE),
including extended CCL facilities.  These functions are provided by
XEmacs (versions 21.2.36 and later), GNU Emacs (versions 20.3 and
later), Emacs patched to use Mule 3.0, and Meadow.

Mule-UCS was designed and implemented by Miyashita Hisashi (HIMI)
@email{himi@@bird.scphys.kyoto-u.ac.jp}.

This is version @value{VERSION} of the Mule-UCS manual, last updated on
@value{UPDATED}.  It documents the XEmacs package distribution of
Mule-UCS.  It should be applicable to other versions of Mule-UCS with
slight changes.  Please report errors and variations among platforms to
@email{stephen@@xemacs.org,Stephen Turnbull}, for incorporation in
future versions of this manual.

@c You can find the latest version of this document on the web at
@c @uref{http://www.xemacs.org/}.

@ifhtml
@c This manual is also available as a @uref{mule-ucs_ja.html, a Japanese
@c translation}.

The latest release of Mule-UCS is available for
@uref{ftp://ftp.xemacs.org/pub/xemacs/packages/,
download}, or you may see @ref{Obtaining Mule-UCS} for more details,
including the CVS server details.
@end ifhtml

Mule-UCS is discussed on the mailing lists for Mule at @samp{m17n.org}.

@end ifnottex

@c Yeah the menu is incomplete.  Go right ahead and fix it!!
@menu
* Copying::                     Mule-UCS Copying conditions.
* Overview::                    What Mule-UCS can and cannot do.

For the end user:
* Obtaining Mule-UCS::          How to obtain Mule-UCS.
* History::                     History of Mule-UCS
* Installation::                Installing Mule-UCS with your (X)Emacs.
* Configuration::               Configuring Mule-UCS for use.
* Design of Mule-UCS::          How it works.
@c * Usage::                       An overview of the operation of Mule-UCS.
@c * Bug Reports::                 Reporting Bugs and Problems
@c * Frequently Asked Questions::  Questions and answers from the mailing list.

For the developer:

@c @detailmenu
@c  --- The Detailed Node Listing ---
@c 
@c Configuring Mule-UCS for use
@c 
@c Using Mule-UCS
@c @end detailmenu
@end menu

@node Copying, Overview, Top, Top
@chapter Mule-UCS Copying conditions

Copyright (C) 1998, 1999, 2000 Free Software Foundation, Inc.

This file is part of the XEmacs distribution of Mule-UCS.

Mule-UCS is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.

Mule-UCS is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.

You should have received a copy of the GNU General Public License along
with XEmacs; see the file COPYING. If not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
USA.


@node Overview, Obtaining Mule-UCS, Copying, Top
@chapter An overview of Mule-UCS

After the installation of Mule-UCS into your Emacs, you will be able to
access Unicode files transparently.  All that is needed is to load the
@file{un-define} library.  Mule-UCS implements rather low-level
functions, and once loaded, the user should never notice that coding
systems implemented via Mule-UCS are any different from those
implemented in C or CCL.

Mule-UCS contains large tables, and takes about 4 seconds to load on a
450MHz Pentium III notebook.  Thus if your use of Unicode is at all
regular, it is recommended that the Mule-UCS Unicode coding systems be
loaded by including

@example
(require 'un-define)
@end example

@noindent
in your init file.  Otherwise, you must load @file{un-define} by hand,
using @code{load-library}.  Also, by default XEmacs does not autodetect
Unicode.  For the most common case, UTF-8, include

@example
(set-coding-priority-list '(utf-8))
(set-coding-category-system 'utf-8 'utf-8)
@end example

@noindent
in your init file.  UTF-8 has a very characteristic signature; false
negatives and positives should be very rare.

Autodetecting 16-bit wide-char versions of Unicode is not currently
implemented in XEmacs itself.  Mule-UCS provides some utilities in the
@file{un-tools} library, but these are of unknown reliability.

Since Mule-UCS uses regular Mule code internally, and does not create an
internal Mule charset for UCS, your normal input methods, whether native
(Wnn), Lisp + backend (new Tamago), all in Lisp (Quail), or XIM-based
(kinput2) should work with Unicode files without any change in your
setup or habits.  Input methods supported by terminals (cxterm,
localized keyboards) should also work (if they work on the native
Chinese!) as long as the terminal coding system is set properly by
@samp{set-terminal-coding-system}.
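
As a minimal sketch, assuming a terminal that uses EUC-JP (substitute
whatever coding system your terminal actually uses), the terminal coding
system could be set from your init file like this:

@example
;; Assumption: the terminal sends and displays EUC-JP.
(set-terminal-coding-system 'euc-jp)
@end example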

Mule-UCS was written by a Japanese developer and thus gives priority to
Japanese by default.  This means that Unicode characters that are unified
from various Asian character sets (e.g., the single horizontal stroke
meaning ``one'' is present in all of them) will be presented in the Mule
buffer as Japanese characters, and displayed with a Japanese font.
@emph{No information will be lost or corrupted} as long as you @emph{save
back to Unicode}.  (That's what ``unification'' means.)

However, if you wish to use Mule-UCS to translate Unicode to national
subsets other than ASCII, Latin-1, and Japanese, you must change the
priorities.  This also allows you to satisfy cultural preferences for
glyph styles by defaulting to an appropriate font.  Use
@samp{un-define-change-charset-order}.  For the common case of the Latin
character sets, where by international standard as well as common
practice characters common to more than one character set are considered
identical (not ``unified'' as for the Han characters in Unicode), the
@file{latin-unity} package may be of use.
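
The following is only a sketch of how the priorities might be changed.
It assumes that @samp{un-define-change-charset-order} takes a single
list of Mule charset symbols, highest priority first.  That calling
convention is an assumption, not something this manual states, so check
the function's documentation string (@kbd{C-h f}) before relying on it.
The particular order shown is merely an illustration for a user who
prefers Simplified Chinese glyphs.

@example
(require 'un-define)
;; Assumed calling convention: one list, highest priority first.
(un-define-change-charset-order
 '(ascii
   latin-iso8859-1
   chinese-gb2312
   japanese-jisx0208
   korean-ksc5601))
@end example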

@c #### need examples of un-define-change-charset-order usage

(Mule-UCS does not understand Plane 14 tags.  Therefore attempts to
translate multilingual texts into non-Unicode encodings such as ISO 2022
will have to be done by hand.)

That is all that most users of Mule-UCS need to know.

Mule-UCS is still under development and any problems you encounter,
trivial or major, should be reported to the Mule-UCS developers.  Use
the standard package bug address @email{mule-ucs-bugs@@xemacs.org}.
@c #### @xref{Bug Reports}.

@subsubheading Behind the scenes

This section tries to explain what goes on behind the scenes when you
visit a file encoded in Unicode with Mule-UCS.

#### to be written

@c For the end user
@node Obtaining Mule-UCS, History, Overview, Top
@chapter Obtaining Mule-UCS

Mule-UCS is freely available on the Internet and the latest release may
be downloaded from @uref{ftp://ftp.m17n.org/pub/mule/Mule-UCS/}. This
release includes the full documentation and code for Mule-UCS, suitable
for installation.  The current version is 0.84 @samp{KOUGETSUDAI}, and
is in the file @file{Mule-UCS-0.84.tar.gz}.

For the especially brave, Mule-UCS is available from CVS. The CVS version
is the latest version of the code and may contain incomplete features or
new issues. Use these versions at your own risk.

Follow the example session below:

@example
$ @kbd{cvs -d:pserver:anonymous@@cvs.meadowy.org:/cvsroot login}
(Logging in to anonymous@@cvs.meadowy.org)
CVS password: @key{RET}
@dots{}

$ @kbd{cvs -z3 -d:pserver:anonymous@@cvs.meadowy.org:/cvsroot co mule-ucs}
@end example

You should now have a directory @file{mule-ucs} containing the latest
version of Mule-UCS. You can fetch the latest updates from the repository
by issuing the command:

@example
$ @kbd{cd mule-ucs}
$ @kbd{cvs update -d}
@end example

@c #### Document XEmacs packages here.

Mule-UCS is also available as an XEmacs package.  @xref{Packages,,,xemacs}.


@node History, Installation, Obtaining Mule-UCS, Top
@chapter History of Mule-UCS

Development was started in late 1997.  The earliest net releases were
done in about July 1999.


@node Installation, Configuration, History, Top
@chapter Installing Mule-UCS into Emacs or XEmacs

  Since Mule-UCS is only an Emacs Lisp library, you need only
byte-compile the @file{*.el} files and install them in a directory listed
in @code{load-path}.

  You can use @file{mucs-comp.el} in the top-level directory to compile
the files.  Enter the following command line:

@example
emacs(xemacs) -q --no-site-file -batch -l mucs-comp.el
@end example

If you use Meadow, enter the following:

@example
Meadow95(NT) -q --no-site-file -batch -l mucs-comp.el
@end example

This produces the byte-compiled Emacs Lisp files.
Finally, install the files from the @file{lisp} directory into your
@file{site-lisp} directory.
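
If you cannot write to @file{site-lisp}, one possible alternative is to
keep the compiled files in a private directory and add that directory to
@code{load-path} in your init file.  The directory name below is
hypothetical; use wherever you actually copied the files.

@example
;; Hypothetical location of the byte-compiled Mule-UCS files.
(setq load-path
      (cons (expand-file-name "~/elisp/mule-ucs") load-path))
(require 'un-define)
@end example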

@c #### document build and install of big5conv and JIS X 0213 support.

@c #### document creation and formatting of Info docs.


@node Configuration, Design of Mule-UCS, Installation, Top
@chapter Configuring Mule-UCS for use

If your use of Unicode is at all
regular, it is recommended that the Mule-UCS Unicode coding systems be
loaded by including

@example
(require 'un-define)
@end example

@noindent
in your init file.  Otherwise, you must load @file{un-define} by hand,
using @code{load-library}.  Also, by default XEmacs does not autodetect
Unicode.  For the most common case, UTF-8, include

@example
(set-coding-priority-list '(utf-8))
(set-coding-category-system 'utf-8 'utf-8)
@end example

@noindent
in your init file.  UTF-8 has a very characteristic signature; false
negatives and positives should be very rare.
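
A slightly more defensive variant of the two fragments above is sketched
here.  It assumes that you share one init file between XEmacs and other
Emacsen, possibly on machines where Mule-UCS is not installed; whether
you want that behavior is a matter of taste.

@example
;; Load Mule-UCS, but keep starting up if it is not installed.
(condition-case err
    (require 'un-define)
  (error (message "Could not load un-define: %s" err)))

;; The priority settings shown above are applied only on XEmacs,
;; and only if un-define was actually loaded.
(when (and (featurep 'xemacs) (featurep 'un-define))
  (set-coding-priority-list '(utf-8))
  (set-coding-category-system 'utf-8 'utf-8))
@end example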

Autodetecting 16-bit wide-char versions of Unicode is not currently
implemented in XEmacs itself.  Mule-UCS provides some utilities in the
@file{un-tools} library, but these are of unknown reliability.

That is all that most users of Mule-UCS need to know.  The rest of this
section documents various advanced features which allow Mule-UCS to be
tuned to resolve ambiguities (such as the unification of the Han
characters across several languages) more appropriately.

@c #### FIXME!
Well, it will once it's written.  @code{:-P}


@node Design of Mule-UCS, , Configuration, Top
@chapter Design goal

MULE-UCS is a character code translator system.
I set the goals of this system as follows.

@table @emph
@item Map character codepoints.
MULE-UCS must map character codepoints quickly, and provide a flexible
way to change the mapping policy.

@item Utilize character code tables.
MULE-UCS can handle multiple codepoint tables, and can thereby reorganize
many character sets.

@item Generate coding systems.
MULE-UCS can generate coding systems from your own translation rules,
including, of course, a CCL program to convert font codepoints.
@end table

MULE-UCS has the following supplementary features.

@itemize @bullet
@item A very biased (@code{:-P}) translator between MULE-INTERNAL and
ISO-10646, and an ISO-10646 coding system.

@item A converter for tables, from a text representation to a
MULE-UCS-aware Emacs Lisp representation.
@end itemize

MULE-UCS overview.

MULE-UCS consists mainly of the following modules.

@enumerate
@item Association compiler.
@item Table organizer.
@item CCL generator.
@end enumerate

@table @emph
@item Association compiler.
In MULE-UCS, a codepoint mapping rule is described by an association
list (alist).  The association compiler generates a set of tables for
encoding and decoding from an association list.  The association
compiler also optimizes the tables.

@item Table organizer.
Table Organizer can 
@end table
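
To illustrate the idea behind the association compiler, here is a sketch
in plain Emacs Lisp.  It does @emph{not} use the Mule-UCS API: the names
@code{my-sample-rule}, @code{my-sample-decode}, and
@code{my-sample-encode} are hypothetical, and the three mapping pairs
are sample data only.  It shows how a single association list can serve
both the decoding and the encoding direction.

@example
;; Hypothetical rule: each pair is (LEGACY-CODE . UCS-CODE).
(defvar my-sample-rule
  '((161 . 161)      ; 0xA1 -> U+00A1
    (164 . 8364)     ; 0xA4 -> U+20AC
    (166 . 352)))    ; 0xA6 -> U+0160

(defun my-sample-decode (code)
  "Return the UCS code for legacy CODE, or nil if unmapped."
  (cdr (assq code my-sample-rule)))

(defun my-sample-encode (ucs)
  "Return the legacy code for UCS, or nil if unmapped."
  (car (rassq ucs my-sample-rule)))

(my-sample-decode 164)   ; 8364
(my-sample-encode 352)   ; 166
@end example

This sketch searches the list on every lookup.  As described above, the
real association compiler instead generates optimized tables from the
list.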