$Header: /cvsroot/unac/unac/README,v 1.5 2002/09/02 10:40:09 loic Exp $

What is it ?
------------

unac is a C library that removes accents from characters, regardless
of the character set (ISO-8859-15, ISO-CELTIC, KOI8-RU...), as long as
iconv(3) is able to convert it into UTF-16 (Unicode). For instance,
the string été will become ete. It also provides a command line
interface (unaccent) that removes accents from an input stream or from
a string given as an argument. When using the library function or the
command, the charset of the input must be specified. The input is
converted to UTF-16 using iconv(3), accents are removed, and the
result is converted back to the original charset. On GNU/Linux, the
iconv -l command lists all supported charsets.
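
The three steps described above (convert to UTF-16, remove accents,
convert back) can be sketched with iconv(3) alone. This is only an
illustration, not unac's implementation: the names sketch_strip,
sketch_convert and sketch_unaccent are hypothetical, and the tiny
mapping table is an assumption standing in for the tables unac builds
from the Unicode data files.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>

/* Illustrative mapping only: a handful of Latin-1 letters.
 * This is NOT the table used by unac. */
static unsigned int sketch_strip(unsigned int c)
{
    switch (c) {
    case 0x00E0: case 0x00E1: case 0x00E2: case 0x00E4: return 'a';
    case 0x00E7: return 'c';
    case 0x00E8: case 0x00E9: case 0x00EA: case 0x00EB: return 'e';
    case 0x00EE: case 0x00EF: return 'i';
    case 0x00F4: case 0x00F6: return 'o';
    case 0x00F9: case 0x00FB: case 0x00FC: return 'u';
    default: return c;
    }
}

/* Convert in_len bytes from charset `from` to charset `to`.
 * Returns a malloc'd buffer (caller frees) or NULL on error. */
static char *sketch_convert(const char *from, const char *to,
                            const char *in, size_t in_len, size_t *out_len)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return NULL;                    /* charset unknown to iconv */

    size_t cap = in_len * 4 + 4;        /* generous upper bound */
    char *out = malloc(cap);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *ip = (char *)in;              /* iconv takes a non-const pointer */
    char *op = out;
    size_t ileft = in_len, oleft = cap;
    size_t rc = iconv(cd, &ip, &ileft, &op, &oleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &op, &oleft);  /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        free(out);
        return NULL;
    }
    *out_len = cap - oleft;
    return out;
}

/* charset -> UTF-16BE -> strip accents -> charset.
 * Returns a malloc'd buffer (caller frees) or NULL on error. */
char *sketch_unaccent(const char *charset, const char *in, size_t in_len,
                      size_t *out_len)
{
    assert(charset != NULL && in != NULL && out_len != NULL);

    size_t u16_len = 0;
    char *u16 = sketch_convert(charset, "UTF-16BE", in, in_len, &u16_len);
    if (u16 == NULL)
        return NULL;

    /* Walk the buffer one big-endian 16-bit code unit at a time. */
    for (size_t i = 0; i + 1 < u16_len; i += 2) {
        unsigned int c = ((unsigned int)(unsigned char)u16[i] << 8)
                       | (unsigned int)(unsigned char)u16[i + 1];
        c = sketch_strip(c);
        u16[i] = (char)(c >> 8);
        u16[i + 1] = (char)(c & 0xFF);
    }

    char *out = sketch_convert("UTF-16BE", charset, u16, u16_len, out_len);
    free(u16);
    return out;
}
```

The sketch asks iconv for UTF-16BE rather than plain UTF-16 so that the
byte order is fixed and no byte order mark has to be skipped. Calling
sketch_unaccent("ISO-8859-15", "\xE9t\xE9", 3, &n) is expected to yield
the three bytes ete; an unsupported charset name makes it return NULL.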

Where is the documentation ?
----------------------------

The manual page of the unaccent command : man unaccent.
The manual page of the unac library : man unac.

How to install it ?
-------------------

On operating systems other than GNU/Linux, we recommend using the
iconv library provided by Bruno Haible <haible@ilog.fr> at
ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz.

./configure [--with-iconv=/my/local]

make all

make check

make install

How to link with unac ?
-----------------------

Assuming you've installed unac in the /usr/local directory, use
something similar to the following:

In the sources:
...
#include <unac.h>
...

On the command line:

cc -I/usr/local/include -o prog prog.c -L/usr/local/lib -lunac

Where can I download it ?
-------------------------
The main distribution site is http://www.senga.org/unac/.

What is the license ?
---------------------
unac is distributed under the GNU GPL, as found at 
http://www.gnu.org/licenses/gpl.txt. Unicode data files are
under the following license, which is compatible with the 
GNU GPL:

http://www.unicode.org/Public/3.2-Update/UnicodeData-3.2.0.html#UCD_Terms
UCD Terms of Use

Disclaimer

The Unicode Character  Database is provided as is  by Unicode, Inc. No
claims  are  made  as  to  fitness  for  any  particular  purpose.  No
warranties of any kind are  expressed or implied. The recipient agrees
to determine  applicability of information provided. If  this file has
been purchased  on magnetic or  optical media from Unicode,  Inc., the
sole remedy for  any claim will be exchange  of defective media within
90 days of receipt.

This disclaimer  is applicable for  all other data  files accompanying
the Unicode  Character Database, some  of which have been  compiled by
the Unicode Consortium, and some  of which have been supplied by other
sources.

Limitations on Rights to Redistribute This Data

Recipient is granted the right to make copies in any form for internal
distribution  and  to  freely  use  the information  supplied  in  the
creation of  products supporting the Unicode(TM)  Standard.  The files
in  the  Unicode Character  Database  can  be  redistributed to  third
parties or other organizations (whether  for profit or not) as long as
this notice  and the disclaimer notice are  retained.  Information can
be extracted from  these files and used in  documentation or programs,
as long as there is an accompanying notice indicating the source.

Loic Dachary
loic@senga.org
http://www.senga.org/