<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>findBadAddresses Module &mdash; &#39;smtp-error-analysis&#39; &#39;0.1.0&#39; documentation</title>
    
    <link rel="stylesheet" href="_static/default.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '',
        VERSION:     '&#39;0.1.0&#39;',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="top" title="&#39;smtp-error-analysis&#39; &#39;0.1.0&#39; documentation" href="index.html" />
    <link rel="next" title="regexEmailTester Module" href="regexEmailTester.html" />
    <link rel="prev" title="Welcome to ‘smtp-error-analysis’’s documentation!" href="index.html" /> 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="regexEmailTester.html" title="regexEmailTester Module"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="index.html" title="Welcome to ‘smtp-error-analysis’’s documentation!"
             accesskey="P">previous</a> |</li>
        <li><a href="index.html">&#39;smtp-error-analysis&#39; &#39;0.1.0&#39; documentation</a> &raquo;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="module-findBadAddresses">
<span id="findbadaddresses-module"></span><h1>findBadAddresses Module<a class="headerlink" href="#module-findBadAddresses" title="Permalink to this headline"></a></h1>
<p>Parses a directory of email messages for &#8216;bounce messages&#8217;
and parses those &#8216;bounce messages&#8217; for the details needed to 
analyse the delivery problems.</p>
<p>The particular focus is on emails which bounced because the sender used
an invalid address.</p>
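<p>As a rough illustration of the directory scan described above, the sketch
below treats a file as a &#8216;bounce message&#8217; when its top-level content
type is <tt class="docutils literal"><span class="pre">multipart/report</span></tt> with a
<tt class="docutils literal"><span class="pre">report-type</span></tt> of
<tt class="docutils literal"><span class="pre">delivery-status</span></tt>. That test, and the names
<tt class="docutils literal"><span class="pre">is_bounce_message</span></tt> and
<tt class="docutils literal"><span class="pre">find_bounce_files</span></tt>, are assumptions made for
this example and may differ from what the module does. It is written for Python 3.</p>
<div class="highlight-python"><pre>
```python
import email
import os


def is_bounce_message(msg):
    """Assumed test: a multipart/report message of type delivery-status."""
    report_type = msg.get_param("report-type", "")
    return (msg.get_content_type() == "multipart/report"
            and str(report_type).lower() == "delivery-status")


def find_bounce_files(directory):
    """Return the sorted names of files in directory that look like bounces."""
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        with open(path, "rb") as handle:
            msg = email.message_from_binary_file(handle)
        if is_bounce_message(msg):
            found.append(name)
    return found
```
</pre>
</div>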
<dl class="exception">
<dt id="findBadAddresses.FindBadAddExcptn">
<em class="property">exception </em><tt class="descclassname">findBadAddresses.</tt><tt class="descname">FindBadAddExcptn</tt><big>(</big><em>value</em><big>)</big><a class="reference internal" href="_modules/findBadAddresses.html#FindBadAddExcptn"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#findBadAddresses.FindBadAddExcptn" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <tt class="xref py py-class docutils literal"><span class="pre">exceptions.Exception</span></tt></p>
<p>Base class for errors in this script.</p>
</dd></dl>

<dl class="function">
<dt id="findBadAddresses.build_ignore_list">
<tt class="descclassname">findBadAddresses.</tt><tt class="descname">build_ignore_list</tt><big>(</big><big>)</big><a class="reference internal" href="_modules/findBadAddresses.html#build_ignore_list"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#findBadAddresses.build_ignore_list" title="Permalink to this definition"></a></dt>
<dd><p>Returns a hard-coded list of file names which will be ignored
in subsequent processing.</p>
<p>This is not currently used, but it is left in place because it supports
the &#8216;ignore me&#8217; structure which is in place.</p>
</dd></dl>

<dl class="function">
<dt id="findBadAddresses.find_email">
<tt class="descclassname">findBadAddresses.</tt><tt class="descname">find_email</tt><big>(</big><em>instr</em><big>)</big><a class="reference internal" href="_modules/findBadAddresses.html#find_email"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#findBadAddresses.find_email" title="Permalink to this definition"></a></dt>
<dd><p>Given a string, searches for all email addresses contained
within the string. We assume:</p>
<ul class="simple">
<li>At least one email address will be found</li>
<li>All addresses found will be identical</li>
</ul>
<p>If both assumptions hold, the email address found is returned.
If either does not hold, an error is raised.</p>
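<p>A minimal sketch of the behaviour described above, written for Python 3.
The regular expression is deliberately simple, and the names
<tt class="docutils literal"><span class="pre">EMAIL_PATTERN</span></tt>,
<tt class="docutils literal"><span class="pre">EmailSearchError</span></tt> and
<tt class="docutils literal"><span class="pre">find_single_email</span></tt> are hypothetical; the
module&#8217;s own pattern and exception types may differ.</p>
<div class="highlight-python"><pre>
```python
import re

# Deliberately simple pattern; an assumption, not the module's own regex.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailSearchError(Exception):
    """Hypothetical error type standing in for the module's own exceptions."""


def find_single_email(instr):
    """Return the one distinct email address found in instr."""
    matches = EMAIL_PATTERN.findall(instr)
    if not matches:
        # Assumption 1 violated: nothing that looks like an address.
        raise EmailSearchError("no email address found in %r" % instr)
    distinct = set(matches)
    if len(distinct) != 1:
        # Assumption 2 violated: the addresses found are not all identical.
        raise EmailSearchError("conflicting addresses: %s" % sorted(distinct))
    return matches[0]


print(find_single_email("Final-Recipient: rfc822; john.smith@e.web"))
```
</pre>
</div>
<p>Raising on both failure cases, rather than returning an empty value, keeps
a malformed bounce message from silently producing an incomplete CSV row.</p>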
</dd></dl>

<dl class="function">
<dt id="findBadAddresses.main">
<tt class="descclassname">findBadAddresses.</tt><tt class="descname">main</tt><big>(</big><big>)</big><a class="reference internal" href="_modules/findBadAddresses.html#main"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#findBadAddresses.main" title="Permalink to this definition"></a></dt>
<dd><p>The main() function.</p>
<p>Needs work so that the location of the email files to be parsed
and the location of the output files can be specified via command
line parameters.</p>
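<p>One way the command line handling mentioned above might look, sketched
with the standard <tt class="docutils literal"><span class="pre">argparse</span></tt> module. The
argument names <tt class="docutils literal"><span class="pre">email_dir</span></tt> and
<tt class="docutils literal"><span class="pre">--output</span></tt>, and the default output file name,
are hypothetical and not part of the current script.</p>
<div class="highlight-python"><pre>
```python
import argparse


def parse_args(argv=None):
    """Parse the input and output locations (argument names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Scan a directory of email files for bounce messages.")
    parser.add_argument("email_dir",
                        help="directory holding the email files to parse")
    parser.add_argument("--output", default="bad_addresses.csv",
                        help="path of the CSV file to write")
    # argv=None makes argparse read sys.argv[1:], the usual script behaviour.
    return parser.parse_args(argv)


args = parse_args(["mail", "--output", "out.csv"])
print(args.email_dir, args.output)
```
</pre>
</div>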
</dd></dl>

<dl class="function">
<dt id="findBadAddresses.parse_email_for_del_stat_part">
<tt class="descclassname">findBadAddresses.</tt><tt class="descname">parse_email_for_del_stat_part</tt><big>(</big><em>file_name</em>, <em>path_em_file</em>, <em>csv_dict_wrtr</em><big>)</big><a class="reference internal" href="_modules/findBadAddresses.html#parse_email_for_del_stat_part"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#findBadAddresses.parse_email_for_del_stat_part" title="Permalink to this definition"></a></dt>
<dd><p>Given the text of an SMTP &#8216;bounce message&#8217;, writes a CSV row 
to match the headers in the global variable HDR_OUTPUT_COLS.</p>
<p>It does this by finding the &#8216;message/delivery-status&#8217; part of 
the entire email and parsing the headers.</p>
<p>A &#8216;message/delivery-status&#8217; part of a &#8216;bounce email&#8217; looks a 
little like this:</p>
<div class="highlight-python"><pre>Content-Description: Delivery report
Content-Type: message/delivery-status

Reporting-MTA: dns; a.b.web              
X-Postfix-Queue-ID: 808F17F8080
X-Postfix-Sender: rfc822; someone@c.d.web
Arrival-Date: Tue,  8 May 2012 16:30:12 -0700 (PDT)

Final-Recipient: rfc822; john.smith@e.web
Original-Recipient: rfc822;john.smith@e.web
Action: failed
Status: 5.0.0
Remote-MTA: dns; smtp.e.web
Diagnostic-Code: smtp; 550 &lt;john.smith@e.web&gt;, Recipient unknown</pre>
</div>
<p>NB: All sorts of assumptions are made about the structure of the 
bounce message. These seem to hold true for the large sample I have 
used in testing, but it seems likely that somewhere there are &#8216;bounce
messages&#8217; which follow different conventions. In particular, I suspect
there might be problems if the original email message were something
other than a two-part multipart email message.</p>
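<p>The header parsing described above can be sketched with the standard
<tt class="docutils literal"><span class="pre">email</span></tt> package (Python 3), which exposes each
blank-line-separated block of a &#8216;message/delivery-status&#8217; part as a nested
message object. The function name
<tt class="docutils literal"><span class="pre">delivery_status_fields</span></tt> is hypothetical, and
merging the per-message fields into every recipient row is a choice made for
this example, not necessarily what the module does.</p>
<div class="highlight-python"><pre>
```python
import email


def delivery_status_fields(raw_message):
    """Return one dict of delivery-status fields per recipient block."""
    msg = email.message_from_string(raw_message)
    rows = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        blocks = part.get_payload()
        if not isinstance(blocks, list):
            continue  # body was not parsed into header blocks; skip it
        per_message = {}
        for block in blocks:
            fields = dict(block.items())
            if "Final-Recipient" in block:
                # Recipient block: combine with the per-message fields.
                row = dict(per_message)
                row.update(fields)
                rows.append(row)
            else:
                # Block without a recipient, e.g. Reporting-MTA details.
                per_message.update(fields)
    return rows
```
</pre>
</div>
<p>Each returned dict could then be passed to a
<tt class="docutils literal"><span class="pre">csv.DictWriter</span></tt> created with
<tt class="docutils literal"><span class="pre">extrasaction="ignore"</span></tt>, so that only the columns
named in the writer&#8217;s field list are written.</p>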
</dd></dl>

<dl class="function">
<dt id="findBadAddresses.remove_rfc_notation">
<tt class="descclassname">findBadAddresses.</tt><tt class="descname">remove_rfc_notation</tt><big>(</big><em>email_to_be_cleaned</em><big>)</big><a class="reference internal" href="_modules/findBadAddresses.html#remove_rfc_notation"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#findBadAddresses.remove_rfc_notation" title="Permalink to this definition"></a></dt>
<dd><p>Given a string which contains an email address in one of the two 
following formats</p>
<blockquote>
<div><ul class="simple">
<li><tt class="docutils literal"><span class="pre">a&#64;foo.bar</span></tt></li>
<li><tt class="docutils literal"><span class="pre">rfc:a&#64;foo.bar</span></tt></li>
</ul>
</div></blockquote>
<p>this function will return <tt class="docutils literal"><span class="pre">a&#64;foo.bar</span></tt></p>
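<p>A possible shape for this clean-up, as a sketch only. It strips a leading
address-type prefix that starts with <tt class="docutils literal"><span class="pre">rfc</span></tt>
and ends in a colon, as in the formats above. Also accepting the
<tt class="docutils literal"><span class="pre">rfc822;</span></tt> form seen in delivery-status headers
is an assumption of this example, as is the name
<tt class="docutils literal"><span class="pre">strip_address_type</span></tt>.</p>
<div class="highlight-python"><pre>
```python
def strip_address_type(address):
    """Drop a leading 'rfc...:' or 'rfc...;' prefix from an address string."""
    cleaned = address.strip()
    for separator in (";", ":"):
        prefix, found, rest = cleaned.partition(separator)
        # Only treat the text before the separator as a prefix when it
        # looks like an address type such as 'rfc' or 'rfc822'.
        if found and prefix.lower().startswith("rfc"):
            return rest.strip()
    return cleaned


for sample in ("a@foo.bar", "rfc:a@foo.bar", "rfc822; a@foo.bar"):
    print(strip_address_type(sample))
```
</pre>
</div>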
</dd></dl>

<dl class="function">
<dt id="findBadAddresses.strip_line_feeds">
<tt class="descclassname">findBadAddresses.</tt><tt class="descname">strip_line_feeds</tt><big>(</big><em>string</em><big>)</big><a class="reference internal" href="_modules/findBadAddresses.html#strip_line_feeds"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#findBadAddresses.strip_line_feeds" title="Permalink to this definition"></a></dt>
<dd><p>Return the input string with CRLF
characters removed</p>
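<p>A one-line sketch of this behaviour, assuming that &#8216;CRLF characters&#8217;
means every carriage return and every line feed, whether or not they occur
as a pair. The name <tt class="docutils literal"><span class="pre">without_line_breaks</span></tt>
is hypothetical.</p>
<div class="highlight-python"><pre>
```python
def without_line_breaks(string):
    """Remove every carriage return and line feed from string."""
    return string.replace("\r", "").replace("\n", "")
```
</pre>
</div>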
</dd></dl>

</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
  <h4>Previous topic</h4>
  <p class="topless"><a href="index.html"
                        title="previous chapter">Welcome to &#8216;smtp-error-analysis&#8217;&#8217;s documentation!</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="regexEmailTester.html"
                        title="next chapter">regexEmailTester Module</a></p>
  <h3>This Page</h3>
  <ul class="this-page-menu">
    <li><a href="_sources/findBadAddresses.txt"
           rel="nofollow">Show Source</a></li>
  </ul>
<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="regexEmailTester.html" title="regexEmailTester Module"
             >next</a> |</li>
        <li class="right" >
          <a href="index.html" title="Welcome to ‘smtp-error-analysis’’s documentation!"
             >previous</a> |</li>
        <li><a href="index.html">&#39;smtp-error-analysis&#39; &#39;0.1.0&#39; documentation</a> &raquo;</li> 
      </ul>
    </div>
    <div class="footer">
        &copy; Copyright 2012, Richard Shea.
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
    </div>
  </body>
</html>