\input texinfo @c  -*-texinfo-*-

@c %**start of header
@setfilename mule-ucs.info
@settitle Mule-UCS Manual
@setchapternewpage odd
@c %**end of header

@c This is *so* much nicer :)
@footnotestyle end

@c Version values, for easy modification
@set VERSION $Revision$
@set UPDATED Thursday, 13 December, 2001


@c Entries for @command{install-info} to use
@direntry
* Mule-UCS::                Lisp-based Unicode support for Emacsen.
@end direntry

@c Copying permissions, et al
@ifinfo
This file documents Mule-UCS, a package providing efficient Lisp-based
coding support (specifically, Unicode) for Emacs and XEmacs.
     
Copyright @copyright{} 1997 MIYASHITA Hisashi
Copyright @copyright{} 2001 Stephen J. Turnbull
     
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
     
@ignore 
Permission is granted to process this file through TeX and print the
results, provided the printed document carries a copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).
   
@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
sections entitled ``Copying'' and ``GNU General Public License'' are
included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.
     
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation
approved by the Free Software Foundation.
@end ifinfo

@tex

@titlepage
@title Mule-UCS User Manual
@subtitle Last updated @value{UPDATED}

@author by Stephen J. Turnbull
@author including documentation by MIYASHITA Hisashi
@page

@vskip 0pt plus 1filll
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
     
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
sections entitled ``Copying'' and ``GNU General Public License'' are
included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.
     
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation
approved by the Free Software Foundation.

@end titlepage
@page

@end tex

@ifnottex
@node Top, Copying, (dir), (dir)
@top Mule-UCS User Manual

Mule-UCS is a character code translator.  It provides functions to
translate from any character set to any other, and to construct new
coding systems easily.  It requires the MUltiLingual extensions to Emacs (MULE),
including extended CCL facilities.  These functions are provided by
XEmacs (versions 21.2.36 and later), GNU Emacs (versions 20.3 and
later), Emacs patched to use Mule 3.0, and Meadow.

Mule-UCS was designed and implemented by Miyashita Hisashi (HIMI)
@email{himi@@bird.scphys.kyoto-u.ac.jp}.

This is version @value{VERSION} of the Mule-UCS manual, last updated on
@value{UPDATED}.

@c You can find the latest version of this document on the web at
@c @uref{http://www.xemacs.org/}.

@ifhtml
@c This manual is also available as a @uref{mule-ucs_ja.html, a Japanese
@c translation}.

The latest release of Mule-UCS is available for
@uref{ftp://ftp.xemacs.org/pub/xemacs/packages/,
download}, or you may see @ref{Obtaining Mule-UCS} for more details,
including the CVS server details.
@end ifhtml

Mule-UCS is discussed on the mailing lists for Mule at @samp{m17n.org}.

@end ifnottex

@c Yeah the menu is incomplete.  Go right ahead and fix it!!
@menu
* Copying::                     Mule-UCS Copying conditions.
* Overview::                    What Mule-UCS can and cannot do.

For the end user:
* Obtaining Mule-UCS::          How to obtain Mule-UCS.
* History::                     History of Mule-UCS.
* Installation::                Installing Mule-UCS with your (X)Emacs.
* Configuration::               Configuring Mule-UCS for use.
* Design of Mule-UCS::          How it works.
@c * Usage::                       An overview of the operation of Mule-UCS.
@c * Bug Reports::                 Reporting Bugs and Problems
@c * Frequently Asked Questions::  Questions and answers from the mailing list.

For the developer:

@c @detailmenu
@c  --- The Detailed Node Listing ---
@c 
@c Configuring Mule-UCS for use
@c 
@c Using Mule-UCS
@c @end detailmenu
@end menu

@node Copying, Overview, Top, Top
@chapter Mule-UCS Copying conditions

Copyright (C) 1998, 1999, 2000 Free Software Foundation, Inc.

Mule-UCS is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.

Mule-UCS is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.

You should have received a copy of the GNU General Public License along
with GNU Emacs; see the file COPYING. If not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
USA.


@node Overview, Obtaining Mule-UCS, Copying, Top
@chapter An overview of Mule-UCS

After the installation of Mule-UCS into your Emacs, you will be able to
access Unicode files transparently.  All that is needed is to load the
@file{un-define} library.  Mule-UCS implements rather low-level
functions, and once loaded, the user should never notice that coding
systems implemented via Mule-UCS are any different from those
implemented in C or CCL.

Mule-UCS contains large tables, and takes about 4 seconds to load on a
450MHz Pentium III notebook.  Thus if your use of Unicode is at all
regular, it is recommended that the Mule-UCS Unicode coding systems be
loaded by including

@example
(require 'un-define)
@end example

@noindent
in your init file.  Otherwise, you must load @file{un-define} by hand,
using @code{load-library}.  Also, by default XEmacs does not autodetect
Unicode.  For the most common case, UTF-8, include

@example
(set-coding-priority-list '(utf-8))
(set-coding-category-system 'utf-8 'utf-8)
@end example

@noindent
in your init file.  UTF-8 has a very characteristic signature; false
negatives and positives should be very rare.
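
The two init-file fragments above can be combined into one guarded
block.  This is only a sketch: it assumes that @file{un-define} is
somewhere on @code{load-path}, and it uses the conventional
@code{(featurep 'xemacs)} test so that the detection settings, which the
text above introduces for XEmacs, are not evaluated by other Emacsen.

@example
;; Load the Unicode coding systems only if the library is installed.
(when (locate-library "un-define")
  (require 'un-define)
  ;; XEmacs only: give UTF-8 priority when autodetecting.
  ;; Both arguments of `set-coding-category-system' are quoted symbols.
  (when (featurep 'xemacs)
    (set-coding-priority-list '(utf-8))
    (set-coding-category-system 'utf-8 'utf-8)))
@end example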

Autodetecting 16-bit wide-char versions of Unicode is not currently
implemented in XEmacs itself.  Mule-UCS provides some utilities in the
@file{un-tools} library, but these are of unknown reliability.

That is all that most users of Mule-UCS need to know.

Mule-UCS is still under development and any problems you encounter,
trivial or major, should be reported to the Mule-UCS developers.
@c #### @xref{Bug Reports}.

@subsubheading Behind the scenes

This section tries to explain what goes on behind the scenes when you
visit a file encoded in Unicode with Mule-UCS.

@c For the end user
@node Obtaining Mule-UCS, History, Overview, Top
@chapter Obtaining Mule-UCS

Mule-UCS is freely available on the Internet and the latest release may
be downloaded from @uref{ftp://ftp.m17n.org/pub/mule/Mule-UCS/}. This
release includes the full documentation and code for Mule-UCS, suitable
for installation.  The current version is 0.84 @samp{KOUGETSUDAI}, and
is in the file @file{Mule-UCS-0.84.tar.gz}.

For the especially brave, Mule-UCS is available from CVS.  The CVS version
is the latest version of the code and may contain incomplete features or
new issues.  Use it at your own risk.

Follow the example session below:

@example
$ @kbd{cvs -d:pserver:anonymous@@cvs.meadowy.org:/cvsroot login}
(Logging in to anonymous@@cvs.meadowy.org)
CVS password: @key{RET}
@dots{}

$ @kbd{cvs -z3 -d:pserver:anonymous@@cvs.meadowy.org:/cvsroot co mule-ucs}
@end example

You should now have a directory @file{mule-ucs} containing the latest
version of Mule-UCS. You can fetch the latest updates from the repository
by issuing the command:

@example
$ @kbd{cd mule-ucs}
$ @kbd{cvs update -d}
@end example

@c #### Document XEmacs packages here.

Mule-UCS is also available as an XEmacs package.  @xref{Packages,,,xemacs}.


@node History, Installation, Obtaining Mule-UCS, Top
@chapter History of Mule-UCS

Development started in late 1997.  The earliest releases on the net were
made around July 1999.


@node Installation, Configuration, History, Top
@chapter Installing Mule-UCS into Emacs or XEmacs

  Since Mule-UCS is purely an Emacs Lisp library, you only need to
byte-compile the @file{*.el} files and install them in a directory listed in
@code{load-path}.

  To byte-compile them, you can use @file{mucs-comp.el} in the top-level
directory.  Enter the following command line:

@example
emacs(xemacs) -q --no-site-file -batch -l mucs-comp.el
@end example

If you use Meadow, enter the following:

@example
Meadow95(NT) -q --no-site-file -batch -l mucs-comp.el
@end example

This produces byte-compiled Emacs Lisp files.
Finally, install the files from the @file{lisp} directory into your
@file{site-lisp} directory.
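
If the compiled files are installed somewhere other than
@file{site-lisp}, that directory has to be added to @code{load-path}
before @file{un-define} is loaded.  A minimal sketch follows; the
directory name is a placeholder, not a path used by Mule-UCS.

@example
;; Hypothetical install location; replace it with your own directory.
(add-to-list 'load-path (expand-file-name "~/elisp/mule-ucs"))
(require 'un-define)
@end example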

@c #### document build and install of big5conv and JIS X 0213 support.

@c #### document creation and formatting of Info docs.


@node Configuration, Design of Mule-UCS, Installation, Top
@chapter Configuring Mule-UCS for use

If your use of Unicode is at all
regular, it is recommended that the Mule-UCS Unicode coding systems be
loaded by including

@example
(require 'un-define)
@end example

@noindent
in your init file.  Otherwise, you must load @file{un-define} by hand,
using @code{load-library}.  Also, by default XEmacs does not autodetect
Unicode.  For the most common case, UTF-8, include

@example
(set-coding-priority-list '(utf-8))
(set-coding-category-system 'utf-8 'utf-8)
@end example

@noindent
in your init file.  UTF-8 has a very characteristic signature; false
negatives and positives should be very rare.

Autodetecting 16-bit wide-char versions of Unicode is not currently
implemented in XEmacs itself.  Mule-UCS provides some utilities in the
@file{un-tools} library, but these are of unknown reliability.
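
When autodetection is not available or not trusted, the coding system
can instead be named explicitly for a single operation.  The sketch
below binds the standard variable @code{coding-system-for-read} around
@code{find-file}.  The function name and the choice of @code{utf-8} are
illustrative assumptions; another coding system defined by
@file{un-define} may be substituted.

@example
;; Hypothetical helper, not part of Mule-UCS.
(defun my-find-file-as-utf-8 (file)
  "Visit FILE, decoding it as UTF-8 without autodetection."
  (interactive "fFind file (UTF-8): ")
  (let ((coding-system-for-read 'utf-8))
    (find-file file)))
@end example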

That is all that most users of Mule-UCS need to know.  The rest of this
section documents various advanced features which allow Mule-UCS to be
tuned to resolve ambiguities (such as the unification of the Han
characters across several languages) more appropriately.

@c #### FIXME!
Well, it will once it's written.  @code{:-P}


@node Design of Mule-UCS, , Configuration, Top
@chapter Design goal

MULE-UCS is a character code translation system.
I set the following goals for this system.

@table @emph
@item Map character codepoints.
MULE-UCS has to map character codepoints quickly, and provide a flexible
way to change the mapping policy.

@item Utilize character code tables.
MULE-UCS can handle multiple codepoint tables, and thereby reorganize
many character sets.

@item Generate coding systems.
MULE-UCS can generate coding systems from your own translation rules,
including, of course, a CCL program to convert font codepoints.
@end table

MULE-UCS has the following supplementary features.

@itemize @bullet
@item A very biased (@code{:-P}) translator between MULE-INTERNAL and
ISO-10646, and ISO-10646 coding systems.

@item A converter for tables, from a text representation to an Emacs
Lisp representation that MULE-UCS can use.
@end itemize

@subsubheading MULE-UCS overview

MULE-UCS consists mainly of the following modules.

@enumerate
@item Association compiler.
@item Table organizer.
@item CCL generator.
@end enumerate

@table @emph
@item Association compiler.
In MULE-UCS, codepoint mapping rules are described by association
lists (alists).  The association compiler generates a set of tables for
encoding and decoding from an association list.  It also optimizes the
tables.

@item Table organizer.
Table Organizer can 
@end table
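
As a rough illustration of the association-list idea described above,
the following plain Emacs Lisp shows a mapping rule written as an alist
and a lookup in each direction.  It does not use the Mule-UCS API or its
actual rule syntax; the variable name is made up, and the two pairs map
ISO-8859-15 byte values to their ISO-10646 codepoints.

@example
;; Hypothetical rule: (SOURCE-CODEPOINT . UCS-CODEPOINT) pairs.
(defvar my-sample-rule
  '((#x41 . #x0041)     ; LATIN CAPITAL LETTER A
    (#xA4 . #x20AC))    ; EURO SIGN
  "Sample codepoint mapping rule, for illustration only.")

;; One direction: look up by source codepoint.
(cdr (assq #xA4 my-sample-rule))      ; => 8364 (#x20AC)

;; The other direction: look up by UCS codepoint.
(car (rassq #x20AC my-sample-rule))   ; => 164 (#xA4)
@end example

@noindent
A linear search like this is slow for large character sets, which is
presumably why the association compiler turns such lists into tables.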