==========================
 Akismet - The Python API
==========================
------------------------------------
 Stopping Comment Spam with Akismet
------------------------------------

:Author: `Michael Foord`_
:Contact: fuzzyman@voidspace.org.uk
:Version: 0.1.5
:Date: 2007/02/05
:License: `BSD License`_ [#]_
:Online Version: `akismet.py online`_

.. _`akismet.py online`: http://www.voidspace.org.uk/python/akismet_python.html
.. _BSD License: BSD-LICENSE.txt

.. contents:: The Akismet API


Introduction
============

`Akismet <http://www.akismet.com>`_ is a web service for recognising spam
comments. It promises to be almost 100% effective at catching comment spam. 
They say that currently 81% of all comments submitted to them are spam.

It's designed to work with the `Wordpress Blog Tool <http://wordpress.org/>`_,
but it's not restricted to that - so this is a Python interface to the 
`Akismet API <http://akismet.com/development/api/>`_.

You'll need a `Wordpress Key <http://wordpress.com>`_ to use it. This script 
will allow you to plug akismet into any CGI script or web application, and 
there are full docs in the code. It's extremely easy to use, because the folks 
at akismet have implemented a nice and straightforward 
{acro;REST;REpresentational State Transfer} 
{acro;API;Application Programming Interface}.

.. note::

    If possible you should build into your program the ability to inform
    Akismet of false positives and false negatives.
    
    Informing Akismet helps make the service more reliable. {sm;:-)}
    
    To do this, use the `submit-spam`_ and `submit-ham`_ functionality.

Most of the work is done by the `comment-check`_ function.

Downloading
-----------

You can download **akismet.py** from :

* `akismet.zip (48k) <http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=akismet.zip>`_

This contains the docs, **akismet.py**, and a test {acro;CGI} called 
*test_akismet.py*.


apikey.txt
==========

The easiest way to use *akismet.py* is to provide the wordpress key and blog
{acro;URL} in a text file called ``apikey.txt``.

This should be in the current directory when you create your ``Akismet``
instance (or call setAPIKey_ with no arguments).

The format for ``apikey.txt`` is simple (see the example one in the
distribution).

Lines that start with a ``#`` are comments. The first non-blank, non-comment
line should be the API key. The second line should be the blog URL (or 
application URL) to use.

::

    # Lines starting with '#' are comments
    # The first non-blank, non-comment, line should be your api key
    # The second your blog URL
    #
    # You can get a wordpress API key from http://wordpress.com/
    some_key
    some_blog_url
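
The format rules above can be sketched in a few lines of Python. This is only an illustration of the file format, not the code *akismet.py* itself uses, and the helper name ``parse_apikey`` is made up for this example.

```python
def parse_apikey(text):
    """Return (key, blog_url) from apikey.txt-style text (hypothetical helper)."""
    # Keep only the lines that are neither blank nor comments.
    values = [line.strip() for line in text.splitlines()
              if line.strip() and not line.strip().startswith('#')]
    if len(values) < 2:
        raise ValueError('expected an API key line followed by a blog URL line')
    # First remaining line is the key, the second is the blog URL.
    return values[0], values[1]


sample = """\
# Lines starting with '#' are comments
some_key
some_blog_url
"""
key, blog_url = parse_apikey(sample)
print(key)
print(blog_url)
```

Raising ``ValueError`` for an incomplete file is a choice made for this sketch; the module reports an invalid *apikey.txt* with its own ``APIKeyError``.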


The Akismet Class
=================

The akismet_ API provides four functions. *akismet.py* gives you access to
all of these through a single class.

The four akismet functions are :

* `verify-key`_     - the ``verify_key`` method
* `comment-check`_  - the ``comment_check`` method
* `submit-spam`_   - the ``submit_spam`` method
* `submit-ham`_   - the ``submit_ham`` method

In addition to these, the Akismet class has the following user methods and
attributes :

* setAPIKey_    - method
* key           - attribute
* blog_url      - attribute


Creating an Instance
--------------------

::

    Akismet(key=None, blog_url=None, agent=None)

To use the akismet web service you *need* an API key. There are three ways of
telling the ``Akismet`` class what this is.

1) When you create a new ``Akismet`` instance you can pass in the API key and
   blog url.
2) If you don't pass in a key, it will automatically look for apikey.txt_ in
   the current directory, and attempt to load it.
3) You can set the ``key`` and ``blog_url`` attributes manually, after creating
   your instance.


User-Agent
~~~~~~~~~~

As well as setting your key, you *ought* to pass in a string for ``Akismet`` to
create a User-Agent header with. This is the ``agent`` argument.

According to the `API docs <http://akismet.com/development/api/>`_, this ought
to be in the form : ::

    Program Name/Version

*akismet.py* adds its version number to this, to create a User-Agent in the
form liked by akismet.

The default User-Agent (if you don't pass in a value to ``agent``) is : ::

    Python Interface by Fuzzyman/0.1.5 | akismet.py/0.1.5
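
Judging from the default shown above, the final header appears to be the ``agent`` string followed by ``| akismet.py/<version>``. A minimal sketch of that composition, assuming this layout holds for any ``agent`` value (``build_user_agent`` is a hypothetical helper, not part of *akismet.py*):

```python
def build_user_agent(agent, module_version):
    """Join a 'Program Name/Version' string with the module's own tag."""
    # The ' | akismet.py/' separator is inferred from the documented default.
    return "%s | akismet.py/%s" % (agent, module_version)


print(build_user_agent("Example/0.1", "0.1.5"))
```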


Example 1
~~~~~~~~~

.. raw:: html

    {+coloring}
    
    #example 1
    api = Akismet(api_key, url, agent='Example/0.1') 
    
    {-coloring}

Example 2
~~~~~~~~~

.. raw:: html

    {+coloring}
    
    #example 2
    if os.path.isfile('apikey.txt'):
        api = Akismet(agent='Example/0.2') 
    
    # The key and URL are loaded from
    # 'apikey.txt'
    
    {-coloring}

Example 3
~~~~~~~~~

.. raw:: html

    {+coloring}
    
    #example 3
    url = 'http://www.voidspace.org.uk/cgi-bin/voidspace/guestbook.py'
    api_key = '0acdfg1fr'
    if not os.path.isfile('apikey.txt'):
        api = Akismet(agent='Example/0.3') 
        api.key = api_key
        api.blog_url = url
    
    # The key and URL are set manually
    
    {-coloring}


setAPIKey
---------

::

    setAPIKey(key=None, blog_url=None)

Set the wordpress API key for all transactions.

If you don't specify an explicit API ``key`` and ``blog_url`` it will
attempt to load them from a file called ``apikey.txt`` in the current
directory.

This method is *usually* called automatically when you create a new ``Akismet``
instance.


Akismet Methods
---------------

These four methods equate to the four functions of the `Akismet API <http://akismet.com/development/api/>`_.

verify-key
~~~~~~~~~~

::

    verify_key()

This equates to the ``verify-key`` call against the akismet API.

It returns ``True`` if the key is valid.

The docs state that your program *ought* to call this at the start of the
transaction.

It raises ``APIKeyError`` if you have not yet set an API key.

If the connection to akismet fails, an ``AkismetError`` is raised. Since
version 0.1.4, *akismet.py* traps the underlying ``HTTPError`` or ``URLError``
from `urllib2 <http://docs.python.org/lib/module-urllib2.html>`_.


comment-check
~~~~~~~~~~~~~

::

    comment_check(comment, data=None, build_data=True, DEBUG=False)

This is the main function in the Akismet API. It checks comments.

It returns ``True`` for spam and ``False`` for ham.

If you set ``DEBUG=True`` then it will return the text of the response,
instead of the ``True`` or ``False`` object.

It raises ``APIKeyError`` if you have not yet set an API key.

If the connection to Akismet fails, an ``AkismetError`` is raised (since
version 0.1.4 the ``HTTPError`` or ``URLError`` is no longer propagated).

As a minimum it requires the body of the comment. This is the
``comment`` argument.

Akismet requires some other arguments, and allows some optional ones.
The more information you give it, the more likely it is to be able to
make an accurate diagnosis.

You supply these values using a mapping object (dictionary) as the
``data`` argument.

If ``build_data`` is ``True`` (the default), then *akismet.py* will
attempt to fill in as much information as possible, using default
values where necessary. This is particularly useful for programs
running in a {acro;CGI} environment. A lot of useful information
can be supplied from environment variables (``os.environ``). See below.

You need *only* supply the values for which you don't want defaults
filled in. All values must be strings.

There are a few required values. If they are not supplied, and
defaults can't be worked out, then an ``AkismetError`` is raised.

If you set ``build_data=False`` and a required value is missing an
``AkismetError`` will also be raised.

The normal values (and defaults) are as follows :

*    'user_ip':          ``os.environ['REMOTE_ADDR']``       (*)
*    'user_agent':       ``os.environ['HTTP_USER_AGENT']``   (*)
*    'referrer':         ``os.environ.get('HTTP_REFERER', 'unknown')`` [#]_
*    'permalink':        ''
*    'comment_type':     'comment' [#]_
*    'comment_author':   ''
*    'comment_author_email': ''
*    'comment_author_url': ''
*    'SERVER_ADDR':      ``os.environ.get('SERVER_ADDR', '')``
*    'SERVER_ADMIN':     ``os.environ.get('SERVER_ADMIN', '')``
*    'SERVER_NAME':      ``os.environ.get('SERVER_NAME', '')``
*    'SERVER_PORT':      ``os.environ.get('SERVER_PORT', '')``
*    'SERVER_SIGNATURE': ``os.environ.get('SERVER_SIGNATURE', '')``
*    'SERVER_SOFTWARE':  ``os.environ.get('SERVER_SOFTWARE', '')``
*    'HTTP_ACCEPT':      ``os.environ.get('HTTP_ACCEPT', '')``

(*) Required values

You may supply as many additional **'HTTP_*'** type values as you wish.
These should correspond to the http headers sent with the request.
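
The defaults listed above could be filled in along the following lines. This is a standalone sketch, not the module's own code: ``fill_defaults`` is a hypothetical name, and it raises ``ValueError`` where *akismet.py* raises ``AkismetError``.

```python
import os

# (data key, environment variable) pairs with no fallback: both are required.
REQUIRED = (('user_ip', 'REMOTE_ADDR'), ('user_agent', 'HTTP_USER_AGENT'))
# Server details copied from the environment, defaulting to ''.
PASSED_THROUGH = ('SERVER_ADDR', 'SERVER_ADMIN', 'SERVER_NAME', 'SERVER_PORT',
                  'SERVER_SIGNATURE', 'SERVER_SOFTWARE', 'HTTP_ACCEPT')


def fill_defaults(data=None, environ=None):
    """Return a copy of data with missing values filled in (hypothetical helper)."""
    filled = dict(data or {})
    environ = os.environ if environ is None else environ
    for name, env_name in REQUIRED:
        if name not in filled:
            if env_name not in environ:
                raise ValueError('no value or default for %r' % name)
            filled[name] = environ[env_name]
    # Optional values: anything the caller supplied wins over the default.
    filled.setdefault('referrer', environ.get('HTTP_REFERER', 'unknown'))
    filled.setdefault('permalink', '')
    filled.setdefault('comment_type', 'comment')
    for name in ('comment_author', 'comment_author_email', 'comment_author_url'):
        filled.setdefault(name, '')
    for name in PASSED_THROUGH:
        filled.setdefault(name, environ.get(name, ''))
    return filled


# A pretend CGI environment, so the sketch also runs outside a web server.
cgi_env = {'REMOTE_ADDR': '127.0.0.1', 'HTTP_USER_AGENT': 'ExampleBrowser/1.0'}
data = fill_defaults({'comment_author': 'Alice'}, cgi_env)
print(data['user_ip'])
print(data['referrer'])
```

Passing the environment in as an argument is a choice made for this sketch so that it can be tried without a web server; the module reads ``os.environ`` directly.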



submit-spam
~~~~~~~~~~~

::

    submit_spam(comment, data=None, build_data=True)

This function is used to tell akismet that a comment it marked as ham
is really spam.

It takes all the same arguments as ``comment_check``, except for
*DEBUG*.


submit-ham
~~~~~~~~~~

::

    submit_ham(comment, data=None, build_data=True)

This function is used to tell akismet that a comment it marked as spam
is really ham.

It takes all the same arguments as ``comment_check``, except for
*DEBUG*.
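
One way to wire the two feedback methods into a moderation screen is sketched below. The helper ``report_correction`` and the ``RecordingAPI`` stand-in are hypothetical and exist only so the example runs without a network connection; in real use you would pass an ``Akismet`` instance.

```python
def report_correction(api, comment, data, akismet_said_spam, really_spam):
    """Tell akismet about a wrong verdict; return the method used, or None."""
    if akismet_said_spam == really_spam:
        return None  # akismet was right, nothing to report
    if really_spam:
        # False negative: marked as ham, but a human says it is spam.
        api.submit_spam(comment, data)
        return 'submit_spam'
    # False positive: marked as spam, but a human says it is ham.
    api.submit_ham(comment, data)
    return 'submit_ham'


class RecordingAPI(object):
    """Stand-in for an Akismet instance that just records the calls made."""

    def __init__(self):
        self.calls = []

    def submit_spam(self, comment, data=None, build_data=True):
        self.calls.append(('spam', comment))

    def submit_ham(self, comment, data=None, build_data=True):
        self.calls.append(('ham', comment))


api = RecordingAPI()
report_correction(api, 'Buy pills now', {}, akismet_said_spam=False, really_spam=True)
report_correction(api, 'Nice article!', {}, akismet_said_spam=True, really_spam=False)
print(api.calls)
```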


Error Classes
=============

In the course of using *akismet.py*, there are two possible errors you could
see.

AkismetError
------------

This is for general Akismet errors. For example, if you didn't supply some of
the required information.

This error is a subclass of ``Exception``.

This error is also raised if there is a network connection error. This can happen when the Akismet
service or domain goes down temporarily.

Your code should trap this and handle it appropriately (either let the comment through or push it
onto a moderation queue).
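
A minimal sketch of that advice, choosing the moderation queue as the fallback. The ``AkismetError`` class here is a local stand-in so the example runs on its own (in real code you would import it from *akismet.py*), and ``classify`` and ``broken_check`` are hypothetical names.

```python
class AkismetError(Exception):
    """Stand-in for the module's AkismetError, defined here for the sketch."""


def classify(check, comment, data=None):
    """Return 'spam', 'ham' or 'moderate' for a comment.

    check is any callable with the shape of comment_check(comment, data).
    """
    try:
        is_spam = check(comment, data)
    except AkismetError:
        # The service could not be reached or the data was incomplete:
        # hold the comment for a human instead of guessing.
        return 'moderate'
    return 'spam' if is_spam else 'ham'


def broken_check(comment, data=None):
    """Simulates the akismet service being down."""
    raise AkismetError('connection failed')


print(classify(broken_check, 'Hello'))
print(classify(lambda comment, data=None: True, 'Hello'))
```

Because ``APIKeyError`` is a subclass of ``AkismetError``, a handler like this would also catch a missing key; catch ``APIKeyError`` first if you want to treat that case differently.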


APIKeyError
-----------

If *apikey.txt* is invalid, or you attempt to call one of the `akismet methods`_
without setting a key, you will get an ``APIKeyError``.

This error is a subclass of ``AkismetError``.


Usage Example
=============

A simple example that loads the key automatically, verifies the key, and then
checks a comment.

.. raw:: html

    {+coloring}
    
    api = Akismet(agent='Test Script')
    # if apikey.txt is in place,
    # the key will automatically be set
    # or you can call ``api.setAPIKey()``
    #
    if api.key is None:
        print "No 'apikey.txt' file."
    elif not api.verify_key():
        print "The API key is invalid."
    else:
        # data should be a dictionary of values
        # They can all be filled in with defaults
        # from a CGI environment
        if api.comment_check(comment, data):
            print 'This comment is spam.'
        else:
            print 'This comment is ham.'
    
    {-coloring}


Akismet Test CGI
================

Included in the distribution is a file called ``test_akismet.py``.

This is a simple test CGI. It needs `cgiutils <http://www.voidspace.org.uk/python/recipebook.shtml#util>`_
to run.

When activated, it allows you to put a comment in and test it with akismet. It
will tell you if the comment is marked as *ham*, or *spam*.

To confirm that your setup is working, note that any post with
**viagra-test-123** as the name should be marked as spam.

Obviously you will need an API key for this to work.

You can try this online at :

    `Akismet Example CGI <http://www.voidspace.org.uk/cgi-bin/akismet/test_akismet.py>`_


---------------------


TODO
====

Make the timeout adjustable ?

Should we fill in a default value for permalink ?

What about automatically filling in the 'HTTP_*' values from os.environ ?

CHANGELOG
=========

2007/02/05      Version 0.1.5
-----------------------------

Fixed a typo/bug in ``submit_ham``. Thanks to Ian Ozsvald for pointing this out.

2006/12/13      Version 0.1.4
-----------------------------

Akismet now traps errors in connections. If there is a network error it raises an ``AkismetError``.

This can happen when the Akismet service or domain goes down temporarily.

Your code should trap this and handle it appropriately (either let the comment through or push it onto a moderation
queue).

2006/07/18      Version 0.1.3
-----------------------------

Add the blog url to the data. Bugfix thanks to James Bennett.

2005/12/04      Version 0.1.2
-----------------------------

Added the ``build_data`` argument to ``comment_check``, ``submit_spam``, and
``submit_ham``.

2005/12/02      Version 0.1.1
-----------------------------

Corrected so that ham and spam are the right way round {sm;:-)}

2005/12/01      Version 0.1.0
-----------------------------

Test version.


Footnotes
=========

.. [#] Online at http://www.voidspace.org.uk/python/license.shtml
.. [#] Note the spelling "referrer". This is a required value by the
    akismet api - however, referrer information is not always
    supplied by the browser or server. In fact the HTTP protocol
    forbids relying on referrer information for functionality in 
    programs.
.. [#] The `API docs <http://akismet.com/development/api/>`_ state that this value
    can be " *blank, comment, trackback, pingback, or a made up value*
    *like 'registration'* ".

.. _Michael Foord: http://www.voidspace.org.uk/python/index.shtml
.. _let me know: fuzzyman@voidspace.org.uk


.. note::

    Rendering this document with docutils also needs the
    textmacros module and the **PySrc** CSS stuff. See
    http://www.voidspace.org.uk/python/firedrop2/textmacros.shtml

.. raw:: html

    <div align="center">
        <a href="http://www.python.org">
            <img src="images/powered_by_python.jpg" width="602" height="186" border="0" />
        </a>
        <a href="http://www.opensource.org">
            <img src="images/osi-certified-120x100.gif" width="120" height="100" border="0" />
            <br /><strong>Certified Open Source</strong>
        </a>
    <script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
    </script>
    <script type="text/javascript">
    _uacct = "UA-203625-1";
    urchinTracker();
    </script>
    </div>