Source

emacs / lispref / hash.texi

Full commit
@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
@c Copyright (C) 1999 Free Software Foundation, Inc. 
@c See the file elisp.texi for copying conditions.
@setfilename ../info/hash
@node Hash Tables, Symbols, Sequences Arrays Vectors, Top
@chapter Hash Tables
@cindex hash tables

  A hash table is a very fast kind of lookup table, somewhat like
an alist in that it maps keys to corresponding values.  It differs
from an alist in these ways:

@itemize @bullet
@item
Lookup in a hash table is extremely fast for large tables---in fact, the
time required is essentially @emph{independent} of how many elements are
stored in the table.  For smaller tables (a few tens of elements)
alists may still be faster because hash tables have a more-or-less
constant overhead.

@item
The correspondences in a hash table are in no particular order.

@item
There is no way to share structure between two hash tables,
the way two alists can share a common tail.
@end itemize

  Emacs Lisp (starting with Emacs 21) provides a general-purpose hash
table data type, along with a series of functions for operating on them.
Hash tables have no read syntax, and print in hash notation, like this:

@example
(make-hash-table)
     @result{} #<hash-table 'eql nil 0/65 0x83af980>
@end example

@noindent
(The term ``hash notation'' refers to the initial @samp{#}
character---@pxref{Printed Representation}---and has nothing to do with
the term ``hash table.'')

  Obarrays are also a kind of hash table, but they are a different type
of object and are used only for recording interned symbols
(@pxref{Creating Symbols}).

@menu
* Creating Hash::
* Hash Access::
* Defining Hash::
* Other Hash::
@end menu

@node Creating Hash
@section Creating Hash Tables

  The principal function for creating a hash table is
@code{make-hash-table}.

@tindex make-hash-table
@defun make-hash-table &rest keyword-args
This function creates a new hash table according to the specified
arguments.  The arguments should consist of alternating keywords
(particular symbols recognized specially) and values corresponding to
them.

Several keywords make sense in @code{make-hash-table}, but the only two
that you really need to know about are @code{:test} and @code{:weakness}.

@table @code
@item :test @var{test}
This specifies the method of key lookup for this hash table.  The
default is @code{eql}; @code{eq} and @code{equal} are other
alternatives:

@table @code
@item eql
Keys which are numbers are ``the same'' if they are equal in value;
otherwise, two distinct objects are never ``the same''.

@item eq
Any two distinct Lisp objects are ``different'' as keys.

@item equal
Two Lisp objects are ``the same'', as keys, if they are equal
according to @code{equal}.
@end table

You can use @code{define-hash-table-test} (@pxref{Defining Hash}) to
define additional possibilities for @var{test}.

@item :weakness @var{weak}
The weakness of a hash table specifies whether the presence of a key or
value in the hash table preserves it from garbage collection.

The value, @var{weak}, must be one of @code{nil}, @code{key},
@code{value}, @code{key-or-value}, @code{key-and-value}, or @code{t}
which is an alias for @code{key-and-value}.  If @var{weak} is @code{key}
then the hash table does not prevent its keys from being collected as
garbage (if they are not referenced anywhere else); if a particular key
does get collected, the corresponding association is removed from the
hash table.

If @var{weak} is @code{value}, then the hash table does not prevent
values from being collected as garbage (if they are not referenced
anywhere else); if a particular value does get collected, the
corresponding association is removed from the hash table.

If @var{weak} is @code{key-or-value}, associations are removed from the
hash table when either their key or their value part would be collected
as garbage, not counting references to the key and value from weak hash
tables.  Likewise, if @var{weak} is @code{key-and-value}, associations
are removed from the hash table when both their key and value would be
collected as garbage, again not considering references to the key and
value from weak hash tables.

The default for @var{weak} is @code{nil}, so that all keys and values
referenced in the hash table are preserved from garbage collection.  If
@var{weak} is @code{t}, neither keys nor values are protected (that is,
both are weak).

@item :size @var{size}
This specifies a hint for how many associations you plan to store in the
hash table.  If you know the approximate number, you can make things a
little more efficient by specifying it this way.  If you specify too
small a size, the hash table will grow automatically when necessary, but
doing that takes some extra time.

The default size is 65.

@item :rehash-size @var{rehash-size}
When you add an association to a hash table and the table is ``full,''
it grows automatically.  This value specifies how to make the hash table
larger, at that time.

If @var{rehash-size} is an integer, it should be positive, and the hash
table grows by adding that much to the nominal size.  If
@var{rehash-size} is a floating point number, it had better be greater
than 1, and the hash table grows by multiplying the old size by that
number.

The default value is 1.5.

@item :rehash-threshold @var{threshold}
This specifies the criterion for when the hash table is ``full.''  The
value, @var{threshold}, should be a positive floating point number, no
greater than 1.  The hash table is ``full'' whenever the actual number of
entries exceeds this fraction of the nominal size.  The default for
@var{threshold} is 0.8.
@end table
@end defun

@tindex makehash
@defun makehash &optional test
This is equivalent to @code{make-hash-table}, but with a different style
argument list.  The argument @var{test} specifies the method
of key lookup.

If you want to specify other parameters, you should use
@code{make-hash-table}.
@end defun

@node Hash Access
@section Hash Table Access

  This section describes the functions for accessing and storing
associations in a hash table.

@tindex gethash
@defun gethash key table &optional default
This function looks up @var{key} in @var{table}, and returns its
associated @var{value}---or @var{default}, if @var{key} has no
association in @var{table}.
@end defun

@tindex puthash
@defun puthash key value table 
This function enters an association for @var{key} in @var{table}, with
value @var{value}.  If @var{key} already has an association in
@var{table}, @var{value} replaces the old associated value.
@end defun

@tindex remhash
@defun remhash key table
This function removes the association for @var{key} from @var{table}, if
there is one.  If @var{key} has no association, @code{remhash} does
nothing.
@end defun

@tindex clrhash
@defun clrhash table
This function removes all the associations from hash table @var{table},
so that it becomes empty.  This is also called @dfn{clearing} the hash
table.
@end defun

@tindex maphash
@defun maphash function table
This function calls @var{function} once for each of the associations in
@var{table}.  The function @var{function} should accept two
arguments---a @var{key} listed in @var{table}, and its associated
@var{value}.
@end defun

@node Defining Hash
@section Defining Hash Comparisons
@cindex hash code

  You can define new methods of key lookup by means of
@code{define-hash-table-test}.  In order to use this feature, you need
to understand how hash tables work, and what a @dfn{hash code} means.

  You can think of a hash table conceptually as a large array of many
slots, each capable of holding one association.  To look up a key,
@code{gethash} first computes an integer, the hash code, from the key.
It reduces this integer modulo the length of the array, to produce an
index in the array.  Then it looks in that slot, and if necessary in
other nearby slots, to see if it has found the key being sought.

  Thus, to define a new method of key lookup, you need to specify both a
function to compute the hash code from a key, and a function to compare
two keys directly.

@tindex define-hash-table-test
@defun define-hash-table-test name test-fn hash-fn
This function defines a new hash table test, named @var{name}.

After defining @var{name} in this way, you can use it as the @var{test}
argument in @code{make-hash-table}.  When you do that, the hash table
will use @var{test-fn} to compare key values, and @var{hash-fn} to compute
a ``hash code'' from a key value.

The function @var{test-fn} should accept two arguments, two keys, and
return non-@code{nil} if they are considered ``the same.''

The function @var{hash-fn} should accept one argument, a key, and return
an integer that is the ``hash code'' of that key.  For good results, the
function should use the whole range of integer values for hash codes,
including negative integers.

The specified functions are stored in the property list of @var{name}
under the property @code{hash-table-test}; the property value's form is
@code{(@var{test-fn} @var{hash-fn})}.

This example creates a hash table whose keys are strings that are
compared case-insensitively.

@example
(defun case-fold-string= (a b)
  (compare-strings a nil nil b nil nil t))

(defun case-fold-string-hash (a)
  (sxhash (upcase a)))

(define-hash-table-test 'case-fold 'case-fold-string= 
                        'case-fold-string-hash))

(make-hash-table :test 'case-fold)
@end example
@end defun

@tindex sxhash
@defun sxhash obj
This function returns a hash code for Lisp object @var{obj}.
This is an integer which reflects the contents of @var{obj}
and the other Lisp objects it points to.

If two objects @var{obj1} and @var{obj2} are equal, then @code{(sxhash
@var{obj1})} and @code{(sxhash @var{obj2})} are the same integer.

If the two objects are not equal, the values returned by @code{sxhash}
are usually different, but not always; but once in a rare while, by
luck, you will encounter two distinct-looking objects that give the same
result from @code{sxhash}.
@end defun

@node Other Hash
@section Other Hash Table Functions

  Here are some other functions for working with hash tables.

@tindex hash-table-p
@defun hash-table-p table
This returns non-@code{nil} if @var{table} is a hash table object.
@end defun

@tindex copy-hash-table
@defun copy-hash-table table
This function creates and returns a copy of @var{table}.  Only the table
itself is copied---the keys and values are shared.
@end defun

@tindex hash-table-count
@defun hash-table-count table
This function returns the actual number of entries in @var{table}.
@end defun

@tindex hash-table-test
@defun hash-table-test table
This returns the @var{test} value that was given when @var{table} was
created, to specify how to hash and compare keys.  See
@code{make-hash-table} (@pxref{Creating Hash}).
@end defun

@tindex hash-table-weakness
@defun hash-table-weakness table
This function returns the @var{weak} value that was specified for hash
table @var{table}.
@end defun

@tindex hash-table-rehash-size
@defun hash-table-rehash-size table
This returns the rehash size of @var{table}.
@end defun

@tindex hash-table-rehash-threshold
@defun hash-table-rehash-threshold table
This returns the rehash threshold of @var{table}.
@end defun

@tindex hash-table-size
@defun hash-table-size table
This returns the current nominal size of @var{table}.
@end defun