Commits

Anonymous committed 9fff42a

Initial commit.

Comments (0)

Files changed (98)

+This project is meant to archive the mysqlreport script (and it's documentation). From the original author:
+
+"I have decided to officially retire Hack MySQL, which is to say that the web site (pages, tools, etc.) will no longer be maintained. The MySQL universe has changed significantly since 2004 when I began writing tools. Similarly, my professional and personal lives have changed, too. I dont think Hack MySQL is needed or relevant any longer, and I certainly do not have the time to maintain it.
+
+The web site will remain online for perhaps another year (since Media Temple has always provided free hosting for this sitenks!), then I will remove it or archive it somewhere else.
+
+For further MySQL resources, see Planet MySQL. That aggregate is, imho, the pulse of the MySQL community."
+-- Daniel Nichter, 19 November 2010 (http://hackmysql.com/blog/2010/11/19/hack-mysql-is-longer-maintained/)
+
+Thus, the mysqlreport script could be lost, even though it might still be useful.

hackmysql.com/MySQL-Log-ParseFilter.html

+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<title>MySQL::Log::ParseFilter - Parse and filter MySQL slow, general and binary logs</title>
+<link rev="made" href="mailto:root@localhost" />
+</head>
+
+<body style="background-color: white">
+
+<p><a name="__index__"></a></p>
+<!-- INDEX BEGIN -->
+
+<ul>
+
+	<li><a href="#name">NAME</a></li>
+	<li><a href="#synopsis">SYNOPSIS</a></li>
+	<li><a href="#description">DESCRIPTION</a></li>
+	<ul>
+
+		<li><a href="#functions">Functions</a></li>
+	</ul>
+
+	<li><a href="#metaproperties">META-PROPERTIES</a></li>
+	<ul>
+
+		<li><a href="#naming_scheme">Naming Scheme</a></li>
+		<li><a href="#slow_logs">Slow Logs</a></li>
+		<li><a href="#general_logs">General Logs</a></li>
+		<li><a href="#binary_logs">Binary Logs</a></li>
+	</ul>
+
+	<li><a href="#setting_filters">SETTING FILTERS</a></li>
+	<ul>
+
+		<li><a href="#metaproperty">Meta-Property</a></li>
+		<li><a href="#sql_statement">SQL Statement</a></li>
+	</ul>
+
+	<li><a href="#examples">EXAMPLES</a></li>
+	<ul>
+
+		<li><a href="#parsing_general_logs">Parsing General Logs</a></li>
+		<li><a href="#parsing_slow_logs">Parsing Slow Logs</a></li>
+		<li><a href="#parsing_binary_logs">Parsing Binary Logs</a></li>
+		<li><a href="#complete_miniscript__dump_type_">Complete Mini-Script (dump_type)</a></li>
+	</ul>
+
+	<li><a href="#options">OPTIONS</a></li>
+	<li><a href="#userdefined_logs">USER-DEFINED LOGS</a></li>
+	<li><a href="#hacks">HACKS</a></li>
+	<li><a href="#debugging">DEBUGGING</a></li>
+	<li><a href="#bugs">BUGS</a></li>
+	<li><a href="#author">AUTHOR</a></li>
+	<li><a href="#see_also">SEE ALSO</a></li>
+	<li><a href="#version">VERSION</a></li>
+	<li><a href="#copyright_and_license">COPYRIGHT AND LICENSE</a></li>
+</ul>
+<!-- INDEX END -->
+
+<hr />
+<p>
+</p>
+<h1><a name="name">NAME</a></h1>
+<p>MySQL::Log::ParseFilter - Parse and filter MySQL slow, general and binary logs</p>
+<p>
+</p>
+<hr />
+<h1><a name="synopsis">SYNOPSIS</a></h1>
+<pre>
+    use MySQL::Log::ParseFilter;</pre>
+<pre>
+    # Parse all unique queries from logs given on command line</pre>
+<pre>
+    %params = (logs    =&gt; \@ARGV,
+               queries =&gt; \%queries);</pre>
+<pre>
+    parse_slow_logs(%params);
+    parse_general_logs(%params);
+    parse_binary_logs(%params);</pre>
+<pre>
+    calc_final_values(%params);</pre>
+<p>
+</p>
+<hr />
+<h1><a name="description">DESCRIPTION</a></h1>
+<p>MySQL::Log::ParseFilter is a Perl module for parsing and filtering MySQL
+slow, general and binary logs. MySQL::Log::ParseFilter also parses and filters
+user-defined logs: logs with variable headers and SQL statement meta-properties.</p>
+<p>Each MySQL log is formatted differently and poses many problems to accurate
+parsing. From the range of MySQL server versions to the vast extent of
+SQL syntax, parsing a MySQL log file is rarely a trivial task if done well.</p>
+<p>MySQL::Log::ParseFilter handles all the heavy log chopping, hacking and filtering,
+allowing a script to simply extract the data that it wants.</p>
+<p>
+</p>
+<h2><a name="functions">Functions</a></h2>
+<p>The following functions are exported by default:</p>
+<dl>
+<dt><strong><a name="item_set_meta_filter"><code>set_meta_filter($filter)</code></a></strong><br />
+</dt>
+<dd>
+Set meta-property filter to <code>$filter</code>. <code>$filter</code> is a scalar containing a single
+string of meta-property filter conditions. See <a href="#metaproperty">Meta-Property</a>.
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_statement_filter"><code>set_statement_filter($filter)</code></a></strong><br />
+</dt>
+<dd>
+Set SQL statement filter to <code>$filter</code>. <code>$filter</code> is a scalar containing a single
+string of allowed or disallowed SQL statement types. See <a href="#sql_statement">SQL Statement</a>.
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_grep"><code>set_grep($pattern)</code></a></strong><br />
+</dt>
+<dd>
+Set the grep pattern against which SQL statements must match. (This could
+be called the ``grep filter.'') <code>$pattern</code> is a scalar containing a single
+Perl regex pattern without m// or similar. For example: <code>&quot;^SELECT foo FROM (?:this|that)&quot;</code>.
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_parse_binary_logs"><code>parse_binary_logs(%params)</code></a></strong><br />
+</dt>
+<dd>
+Parse output of mysqlbinlog.
+</dd>
+<dd>
+<p><code>%params</code> may contain the following key/values:</p>
+</dd>
+<dd>
+<p><code>logs</code>        =&gt; ref to array having log file names to parse (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>queries</code>     =&gt; ref to hash in which to save unique queries (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>all_queries</code> =&gt; ref to array in which to save all queries   (optional)</p>
+</dd>
+<dd>
+<p><strong>NOTE</strong>: MySQL binary logs are, as the name suggests, binary--they are not
+text files. It is necessary to first ``decode'' a binary log with the
+MySQL-provided program mysqlbinlog. The log files given to
+<a href="#item_parse_binary_logs"><code>parse_binary_logs()</code></a> <em>must</em> be the text output from mysqlbinlog ran first
+on the binary log files (without the --short-form option).</p>
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_parse_general_logs"><code>parse_general_logs(%params)</code></a></strong><br />
+</dt>
+<dd>
+Parse MySQL general logs.
+</dd>
+<dd>
+<p><code>%params</code> may contain the following key/values:</p>
+</dd>
+<dd>
+<p><code>logs</code>        =&gt; ref to array having log file names to parse (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>queries</code>     =&gt; ref to hash in which to save unique queries (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>all_queries</code> =&gt; ref to array in which to save all queries   (optional)</p>
+</dd>
+<dd>
+<p><code>users</code>       =&gt; ref to hash in which to save unique users   (optional)</p>
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_parse_slow_logs"><code>parse_slow_logs(%params)</code></a></strong><br />
+</dt>
+<dd>
+Parse MySQL slow and microslow logs.
+</dd>
+<dd>
+<p><code>%params</code> may contain the following key/values:</p>
+</dd>
+<dd>
+<p><code>logs</code>        =&gt; ref to array having log file names to parse (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>queries</code>     =&gt; ref to hash in which to save unique queries (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>all_queries</code> =&gt; ref to array in which to save all queries   (optional)</p>
+</dd>
+<dd>
+<p><code>users</code>       =&gt; ref to hash in which to save unique users   (optional)</p>
+</dd>
+<dd>
+<p><code>microslow</code>   =&gt; 0 = regular slow log, 1 = microslow log     (optional)</p>
+</dd>
+<dd>
+<p>0 is default for <code>microslow</code>.</p>
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_udl_format"><code>set_udl_format($format_file)</code></a></strong><br />
+</dt>
+<dd>
+Set the user-defined log format defined in <code>$format_file</code>. <code>$format_file</code> is a
+scalar containing a single file name. See <a href="#userdefined_logs">USER-DEFINED LOGS</a>.
+</dd>
+<dd>
+<p>This function should be called before calling <a href="#item_parse_udl_logs"><code>parse_udl_logs()</code></a>.</p>
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_parse_udl_logs"><code>parse_udl_logs(%params)</code></a></strong><br />
+</dt>
+<dd>
+Parse user-defined logs.
+</dd>
+<dd>
+<p><code>%params</code> may contain the following key/values:</p>
+</dd>
+<dd>
+<p><code>logs</code>        =&gt; ref to array having log file names to parse (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>queries</code>     =&gt; ref to hash in which to save unique queries (REQUIRED)</p>
+</dd>
+<dd>
+<p><code>all_queries</code> =&gt; ref to array in which to save all queries   (optional)</p>
+</dd>
+<dd>
+<p><code>users</code>       =&gt; ref to hash in which to save unique users   (optional)</p>
+</dd>
+<dd>
+<p>Returns nothing.</p>
+</dd>
+<dd>
+<p>See <a href="#userdefined_logs">USER-DEFINED LOGS</a>.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_calc_final_values"><code>calc_final_values(%params, $grand_totals)</code></a></strong><br />
+</dt>
+<dd>
+Calculate final meta-property and grand total values after parsing logs.
+This function should only be called after calling one of the parse_ functions.
+</dd>
+<dd>
+<p>This function calculates: number of unique queries, c_sum_p, averages,
+grand total sums, percent true for true/false (or yes/no) meta-properties,
+per-meta-property percentages of grand total sums, per-user percentages of
+all users, and number of unique users.</p>
+</dd>
+<dd>
+<p><code>%params</code> is the same hashed passed earlier to one of the parse_ functions.</p>
+</dd>
+<dd>
+<p><code>$grand_totals</code> is a reference to a hash in which to save the grand total sums.
+Pass 0 or <code>undef</code> if you do not want grand total sums.</p>
+</dd>
+<dd>
+<p>Returns total number of queries.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_apply_final_meta_filters"><code>apply_final_meta_filters(%params, $total_queries)</code></a></strong><br />
+</dt>
+<dd>
+Apply final meta-propertry filters for min, max, average, percent and sum
+meta-property values. This function should be called after calling
+<a href="#item_calc_final_values"><code>calc_final_values()</code></a>.
+</dd>
+<dd>
+<p><code>%params</code> is the same hashed passed earlier to <a href="#item_calc_final_values"><code>calc_final_values()</code></a>
+or to one of the parse_ functions if you did not calculate final
+values for some reason.</p>
+</dd>
+<dd>
+<p><code>$total_queries</code> is a reference to a scalar having the total number of queries.
+Usually, this is obtained from the return value of <a href="#item_calc_final_values"><code>calc_final_values()</code></a>.
+<code>$total_queries</code> will be adjusted to account for queries which were removed
+by a filter. Pass 0 or <code>undef</code> if you do not want an adjusted total queries.</p>
+</dd>
+<dd>
+<p>Returns total number of queries removed.</p>
+</dd>
+<p></p></dl>
+<p>
+</p>
+<hr />
+<h1><a name="metaproperties">META-PROPERTIES</a></h1>
+<p>Every SQL statement has many meta-properties. These are values about
+the SQL statement such as its execution time, how many rows it examines, the
+MySQL connection ID it is associated with, etc. It is by these values that a
+MySQL log is filtered (and usually sorted).</p>
+<p>Every type of MySQL log provides different meta-properties. The list of all
+meta-properties is very long so only the basics are given here. For the full
+list visit <a href="http://hackmysql.com/mysqlsla_filters">http://hackmysql.com/mysqlsla_filters</a>.</p>
+<p>But first, it is important to understand the naming scheme that 
+MySQL::Log::ParseFilter uses for meta-properties.</p>
+<p>
+</p>
+<h2><a name="naming_scheme">Naming Scheme</a></h2>
+<p>Meta-properties are either numeric or string. For strings, the naming scheme
+does not change: <code>db</code> is always just <code>db</code>. For numeric values, however,
+several additional meta-properties are created and identified by consistent
+extensions to the base meta-property name.</p>
+<p>Take for example <code>t</code> from slow logs. In addition to this base
+meta-property, MySQL::Log::ParseFilter also creates: <code>t_min</code>,
+<code>t_max</code>, <code>t_avg</code>, <code>t_sum</code> and <code>t_sum_p</code> (unless <a href="#item_set_save_meta_values"><code>set_save_meta_values()</code></a>
+was disabled; see <a href="#options">OPTIONS</a>).</p>
+<p>These additional meta-properties are identified by their extensions: <code>_min</code>,
+<code>_max</code>, <code>_avg</code>, <code>_sum</code>, <code>_sum_p</code>. These extensions are consistent and
+form the naming scheme for <em>most</em> numeric meta-properties. (<code>cid</code> is a
+notable exception.) They tell you as well as MySQL::Log::ParseFilter what
+the additional meta-property value represents: the minimum, maximum, average
+and sum value of their base meta-property.</p>
+<p><code>_sum_p</code> means percentage that the base meta-property constitutes of the
+grand total sum for all those base meta-properties (if grand total sum were
+calculated when calling <a href="#item_calc_final_values"><code>calc_final_values()</code></a>).</p>
+<p>There is another extension for true/false (yes/no) meta-properties: <code>_t</code>
+and <code>_t_p</code>. Currently, this type of meta-property is only found in microslow logs.</p>
+<p>This naming scheme is very important when working with user-defined logs because
+it allows you to know in advance the names the of additional meta-properties that
+MySQL::Log::ParseFilter will create from the given bases meta-properties.</p>
+<p>
+</p>
+<h2><a name="slow_logs">Slow Logs</a></h2>
+<p><code>c_sum</code> : Total number of times SQL statement appears in log</p>
+<p><code>host</code> : Host name of MySQL connection</p>
+<p><code>ip</code> : IP address of MySQL connection</p>
+<p><code>l</code> : Time spent acquiring lock</p>
+<p><code>l_sum</code> : Total time spent acquiring lock</p>
+<p><code>re</code> : Number of rows examined</p>
+<p><code>rs</code> : Number of rows sent</p>
+<p><code>t</code> : Execution time</p>
+<p><code>t_sum</code> : Total execution time</p>
+<p><code>user</code> : User of MySQL connection</p>
+<p>
+</p>
+<h2><a name="general_logs">General Logs</a></h2>
+<p><code>c_sum</code> : Total number of times SQL statement appears in log</p>
+<p><code>cid</code> : Connection ID of MySQL connection</p>
+<p><code>host</code> : Host name of MySQL connection</p>
+<p><code>user</code> : User of MySQL connection</p>
+<p>
+</p>
+<h2><a name="binary_logs">Binary Logs</a></h2>
+<p><code>c_sum</code> : Total number of times SQL statement appears in log</p>
+<p><code>cid</code> : Connection ID of MySQL connection</p>
+<p><code>ext</code> : Execution time</p>
+<p><code>err</code> : Error code (if any) caused by SQL statement</p>
+<p><code>sid</code> : Server ID of MySQL server</p>
+<p>
+</p>
+<hr />
+<h1><a name="setting_filters">SETTING FILTERS</a></h1>
+<p>All filters are inclusive: every condition for every filter must pass for the
+statement to be saved.</p>
+<p>
+</p>
+<h2><a name="metaproperty">Meta-Property</a></h2>
+<p>The format of <code>$filter</code> when calling <a href="#item_set_meta_filter"><code>set_meta_filter($filter)</code></a> is
+<code>[CONDITION],[CONDITION]...</code> where each <code>[CONDITION]</code> is <code>[meta][op][val]</code>.</p>
+<p><code>[meta]</code> is a meta-property name (listed above or from the full list
+at <a href="http://hackmysql.com/mysqlsla_filters">http://hackmysql.com/mysqlsla_filters</a>). <code>[op]</code> is either &gt;, &lt;, or =.
+And <code>[val]</code> is the value against which <code>[meta]</code> from the log must
+pass according to <code>[op]</code>. <code>[val]</code> is numeric or string according to
+<code>[meta]</code>. For string values, only = is valid for <code>[op]</code>.</p>
+<p>
+</p>
+<h2><a name="sql_statement">SQL Statement</a></h2>
+<p>The format of <code>$filter</code> when calling <a href="#item_set_statement_filter"><code>set_statement_filter($filter)</code></a> is
+<code>[+-][TYPE],[TYPE]...</code>.</p>
+<p>[+-] is give only once at the first start of the filter string. + means that
+the filter is positive: allow only the given <code>[TYPE]</code>s. - means that the filter
+is negative: remove the given <code>[TYPE]</code>s. <code>[TYPE]</code> is a SQL statement type:
+SELECT, UPDATE, INSERT, DO, SET, CREATE, DROP, ALTER, etc.</p>
+<p>
+</p>
+<hr />
+<h1><a name="examples">EXAMPLES</a></h1>
+<p>
+</p>
+<h2><a name="parsing_general_logs">Parsing General Logs</a></h2>
+<pre>
+    # Parse general logs given on command line extracting only SELECT queries
+    # using database foo and calculate grand total sums</pre>
+<pre>
+    my %queries;
+    my %grand_totals;</pre>
+<pre>
+    my %params = (
+        logs    =&gt; \@ARGV,
+        queries =&gt; \%queries,
+    );</pre>
+<pre>
+    set_meta_filter(&quot;db=foo&quot;);
+    set_statement_filter(&quot;+SELECT&quot;);</pre>
+<pre>
+    parse_general_logs(%params);</pre>
+<pre>
+    calc_final_values(%params, \%grand_totals);</pre>
+<p>
+</p>
+<h2><a name="parsing_slow_logs">Parsing Slow Logs</a></h2>
+<pre>
+    # Parse slow logs given on command line removing SET statements and
+    # extracting only queries which took longer than 5 seconds to execute</pre>
+<pre>
+    my %queries;</pre>
+<pre>
+    my %params = (
+        logs    =&gt; \@ARGV,
+        queries =&gt; \%queries,
+    );</pre>
+<pre>
+    set_meta_filter(&quot;t&gt;5&quot;);
+    set_statement_filter(&quot;-SET&quot;);</pre>
+<pre>
+    parse_slow_logs(%params);</pre>
+<pre>
+    calc_final_values(%params, 0);</pre>
+<p>
+</p>
+<h2><a name="parsing_binary_logs">Parsing Binary Logs</a></h2>
+<pre>
+    # Parse output files from mysqlbinlog given on command line extracting
+    # only INSERT and UPDATE queries which account for more than 75% of all
+    # INSERT and UPDATE queries extracted</pre>
+<pre>
+    my %queries;
+    my %grand_totals;</pre>
+<pre>
+    my %params = (
+        logs    =&gt; \@ARGV,
+        queries =&gt; \%queries,
+    );</pre>
+<pre>
+    set_meta_filter(&quot;c_sum_p&gt;75&quot;);
+    set_statement_filter(&quot;+INSERT,UPDATE&quot;);</pre>
+<pre>
+    parse_binary_logs(%params);</pre>
+<pre>
+    calc_final_values(%params, \%grand_totals);</pre>
+<pre>
+    apply_final_meta_filters(%params, 0);</pre>
+<p>
+</p>
+<h2><a name="complete_miniscript__dump_type_">Complete Mini-Script (dump_type)</a></h2>
+<pre>
+    #!/usr/bin/perl -w</pre>
+<pre>
+    use strict;
+    use MySQL::Log::ParseFilter;</pre>
+<pre>
+    my %queries;</pre>
+<pre>
+    if(@ARGV != 2) {
+       print &quot;dump_type dumps a unique sample of all statements of TYPE from general LOG.\n&quot;;
+       print &quot;Usage: dump_type TYPE LOG\n&quot;;
+       exit;
+    }</pre>
+<pre>
+    set_statement_filter(&quot;+$ARGV[0]&quot;);</pre>
+<pre>
+    parse_general_logs( (logs =&gt; [ $ARGV[1] ], queries =&gt; \%queries) );</pre>
+<pre>
+    foreach(keys %queries) { print &quot;$queries{$_}-&gt;{sample}\n&quot;; }</pre>
+<pre>
+    exit;</pre>
+<p>
+</p>
+<hr />
+<h1><a name="options">OPTIONS</a></h1>
+<p>MySQL::Log::ParseFilter has six functions to set special options which can be
+imported with the <code>:options</code> tag (<code>use MySQL::Log::ParseFilter qw(:DEFAULT :options)</code>).</p>
+<dl>
+<dt><strong><a name="item_set_save_meta_values"><code>set_save_meta_values($val)</code></a></strong><br />
+</dt>
+<dd>
+Save extra meta-property values. Default 1 (enabled). Can be set to 0 (disabled)
+which will result in only the following meta-properties being saved: <code>sample</code>,
+<code>db</code>, <code>cid</code>.
+</dd>
+<dd>
+<p>Any meta-property value not check in <a href="#item_apply_final_meta_filters"><code>apply_final_meta_filters()</code></a>
+can still be used (<code>t</code>, <code>l</code>, <code>host</code>, <code>cid</code>, etc.)</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_save_all_values"><code>set_save_all_values($val)</code></a></strong><br />
+</dt>
+<dd>
+Save ``all values'': arrays of every single value for certain meta-properties
+(<code>meta_all</code>). Default 0 (disabled). Can be set to 1 (enabled). This does not
+affect user-defined logs which has a seperate mechanism for saving all values
+(type nf).
+</dd>
+<dd>
+<p>At present, enabling this option causes the following all values to be saved:
+for microslow (msl) patched slow log with InnoDB values: <code>iorops_all</code>,
+<code>iorbytes_all</code>, <code>iorwait_all</code>, <code>reclwait_all</code>, <code>qwait_all</code>, <code>pages_all</code>;
+for binary logs: <code>ext_all</code>.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_in_abstraction"><code>set_IN_abstraction($val)</code></a></strong><br />
+</dt>
+<dd>
+Abstract IN () clauses further by grouping in groups of $val. Default 0 (disabled).
+</dd>
+<dd>
+<p>This is an experimental option. Normally, all IN clauses are condensed from
+<code>IN (N, N, N)</code> to <code>IN (N3)</code>. This option furthers this abstraction by grouping
+the condensed IN clauses in groups of $val where $val is the ``dividing line.''</p>
+</dd>
+<dd>
+<p>Example: with $val=10  <code>IN (N3)</code> becomes <code>IN (N0-9)</code>. Therefore, any IN clause
+with 0 to 9 values will be condensed and then further abstracted to <code>IN (N0-9)</code>.
+Likewise, any IN clauses with 10 to 19 values will be condensed and further
+abstracted to <code>IN (10-19)</code>.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_values_abstraction"><code>set_VALUES_abstraction($val)</code></a></strong><br />
+</dt>
+<dd>
+Abstract VALUES () clauses further by removing the number of condensed value sets.
+Default 0 (disabled). Can be set to 1 (enabled).
+</dd>
+<dd>
+<p>This is an experimental option. Normally, all VALUES clauses are condensed from
+<code>VALUES (NULL, 'foo'), (NULL, 'bar')</code> to <code>VALUES (NULL, 'S')2</code>. This option
+furthers this abstractiong by removing that number of condensed value sets: 2.</p>
+</dd>
+<dd>
+<p>Example: two queries <code>INSERT INTO table VALUES ('S')</code> and
+<code>INSERT INTO table VALUES ('S'), ('S'), ('S')</code> are first condensed to
+<code>INSERT ... VALUES ('S')1</code> and <code>INSERT ... VALUES ('S')3</code> then further abstracted
+to one single query: <code>INSERT INTO table VALUES ('S')</code>.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_atomic_statements"><code>set_atomic_statements($val)</code></a></strong><br />
+</dt>
+<dd>
+Treat multi-statement groups atomically when filtering. Default 0 (disabled).
+Can be set to 1 (enabled).
+</dd>
+<dd>
+<p>This is an experimental option. Normally, each statement in a multi-statement
+group is filtered individually: only those which fail a filter are removed
+and those which pass are kept. With this option enabled, if any one statement
+in a group fails, the entire group of statements is removed.</p>
+</dd>
+<dd>
+<p>This option does not apply to general logs because general logs never group statements.</p>
+</dd>
+<p></p>
+<dt><strong><a name="item_set_db_inheritance"><code>set_db_inheritance($val)</code></a></strong><br />
+</dt>
+<dd>
+Allow queries to inherit the last database specified in the log. Default 0 (disabled).
+Can be set to 1 (enabled).
+</dd>
+<dd>
+<p>Normally, the log must explicitly specify the database for each statement. Or, in the
+case of general logs, the current database is tracked by other means. Sometimes, however,
+logs only specify the database explicitly once. If this option is enabled, all statements
+following an explicit database specification inherit that database.</p>
+</dd>
+<p></p></dl>
+<p>
+</p>
+<hr />
+<h1><a name="userdefined_logs">USER-DEFINED LOGS</a></h1>
+<p>MySQL::Log::ParseFilter can parse user-defined logs which have variable
+headers and meta-property values. Accomplishing this is not a trivial task.
+Therefore the subject is not covered here but at <a href="http://hackmysql.com/udl">http://hackmysql.com/udl</a>.</p>
+<p>
+</p>
+<hr />
+<h1><a name="hacks">HACKS</a></h1>
+<p>The following four functions can be imported with the <code>:hacks</code> tag.</p>
+<dl>
+<dt><strong><a name="item_get_meta_filter"><code>get_meta_filter()</code></a></strong><br />
+</dt>
+<dd>
+Returns hash ref to internal meta filter hash which is structured:
+<code>meta =&gt; [ op, value ]</code>.
+</dd>
+<p></p>
+<dt><strong><a name="item_get_statement_filter"><code>get_statement_filter()</code></a></strong><br />
+</dt>
+<dd>
+Returns hash ref to internal statement filter hash which is structured:
+<code>type =&gt; 0</code>. Also has <code>pos_neg =&gt; 1 (positive) or 0 (negative)</code>.
+</dd>
+<p></p>
+<dt><strong><a name="item_passes_meta_filter"><code>passes_meta_filter($meta, $val, $type)</code></a></strong><br />
+</dt>
+<dd>
+<code>$meta</code> is a meta-property name. <code>$val</code> is the log value. <code>$type</code> is
+'n' (numeric) or 's' (string). Returns 1 on pass, 0 on fail.
+</dd>
+<p></p>
+<dt><strong><a name="item_passes_statement_filter"><code>passes_statement_filter($type)</code></a></strong><br />
+</dt>
+<dd>
+<code>$type</code> is a SQL statement type (SELECT, CREATE, DROP, etc.), case
+insensitive. Returns  1 on pass, 0 on fail.
+</dd>
+<p></p></dl>
+<p>
+</p>
+<hr />
+<h1><a name="debugging">DEBUGGING</a></h1>
+<p>Calling <code>MySQL::Log::ParseFilter::set_debug(1)</code> will enable debugging and cause
+MySQL::Log::ParseFilter to print a <em>flood</em> of debugging information to STDOUT.
+This may be necessary if you feel a function is not working correctly because,
+although they do not return errors, they print debugging messages.</p>
+<p>
+</p>
+<hr />
+<h1><a name="bugs">BUGS</a></h1>
+<p>There are no known bugs. Please contact me if you find one. Expect that I will
+ask for at least a portion of your log because that makes finding and fixing
+the bug easier.</p>
+<p>
+</p>
+<hr />
+<h1><a name="author">AUTHOR</a></h1>
+<p>Daniel Nichter &lt;<a href="mailto:perl@hackmysql.com">perl@hackmysql.com</a>&gt;</p>
+<p><a href="http://hackmysql.com/">http://hackmysql.com/</a></p>
+<p>
+</p>
+<hr />
+<h1><a name="see_also">SEE ALSO</a></h1>
+<dl>
+<dt><strong><a name="item_http_3a_2f_2fhackmysql_2ecom_2fmlp"><a href="http://hackmysql.com/mlp">http://hackmysql.com/mlp</a></a></strong><br />
+</dt>
+<dd>
+MySQL::Log::ParseFilter home page
+</dd>
+<p></p>
+<dt><strong><a name="item_http_3a_2f_2fhackmysql_2ecom_2fudl"><a href="http://hackmysql.com/udl">http://hackmysql.com/udl</a></a></strong><br />
+</dt>
+<dd>
+Document describing how to make user-defined logs
+</dd>
+<p></p>
+<dt><strong><a name="item_http_3a_2f_2fhackmysql_2ecom_2fmysqlsla"><a href="http://hackmysql.com/mysqlsla">http://hackmysql.com/mysqlsla</a></a></strong><br />
+</dt>
+<dd>
+mysqlsla uses every part of MySQL::Log::ParseFilter to analyze, sort and report
+data from MySQL logs. To study MySQL::Log::ParseFilter in all its glory, study
+mysqlsla. In fact, MySQL::Log::ParseFilter was born from mysqlsla.
+</dd>
+<p></p>
+<dt><strong><a name="item_http_3a_2f_2fhackmysql_2ecom_2fmicrosecond_slow_lo"><a href="http://hackmysql.com/microsecond_slow_logs">http://hackmysql.com/microsecond_slow_logs</a></a></strong><br />
+</dt>
+<dd>
+Document summarizing microsecond resolution support for MySQL slow logs
+</dd>
+<p></p>
+<dt><strong><a name="item_http_3a_2f_2fdev_2emysql_2ecom_2fdoc_2frefman_2f5_"><a href="http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html">http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html</a></a></strong><br />
+</dt>
+<dd>
+Official MySQL slow query log documentation.
+</dd>
+<p></p>
+<dt><strong><a href="http://dev.mysql.com/doc/refman/5.0/en/query-log.html">http://dev.mysql.com/doc/refman/5.0/en/query-log.html</a></strong><br />
+</dt>
+<dd>
+Official MySQL general query log documentation.
+</dd>
+<p></p>
+<dt><strong><a href="http://dev.mysql.com/doc/refman/5.0/en/binary-log.html">http://dev.mysql.com/doc/refman/5.0/en/binary-log.html</a></strong><br />
+</dt>
+<dd>
+Official MySQL binary query log documentation.
+</dd>
+<p></p>
+<dt><strong><a href="http://dev.mysql.com/doc/refman/5.0/en/mysqlbinlog.html">http://dev.mysql.com/doc/refman/5.0/en/mysqlbinlog.html</a></strong><br />
+</dt>
+<dd>
+Official MySQL documentation for mysqlbinlog
+</dd>
+<p></p></dl>
+<p>
+</p>
+<hr />
+<h1><a name="version">VERSION</a></h1>
+<p>v1.00</p>
+<p>
+</p>
+<hr />
+<h1><a name="copyright_and_license">COPYRIGHT AND LICENSE</a></h1>
+<p>Copyright 2008 Daniel Nichter</p>
+<p>This program is free software; you can redistribute it and/or modify it under
+the same terms as Perl itself.</p>
+
+</body>
+
+</html>

hackmysql.com/as-of

+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head>
+<title>MySQL Features Timeline</title>
+<meta name="keywords" content="mysql, timeline, features, added, removed, changed, version" />
+<meta name="description" content="A timeline showing as of what version MySQL features were added, removed or changed." />
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+<link rel="stylesheet" href="http://hackmysql.com/styles.css" type="text/css" />
+<style type="text/css">
+.him {border-bottom:2px solid #f07f00;}
+</style>
+
+<style type="text/css">
+table { border-spacing: 0px }
+td { border-bottom: 1px solid #999; }
+td.v { width:15% }
+td.a { width:40% }
+td.c { width:40% }
+ul { margin: 5px; padding-left: 5px; }
+code.rm { color:red; }
+#td.v54 { background-color:#fcc; }
+#td.v51 { background-color:#ff8; }
+#td.v50 { background-color:#bf8; }
+#td.v41 { background-color:#cfc; }
+#td.v40 { background-color:#ccc; }
+#td.v3  { background-color:#999; }
+</style>
+
+</head>
+
+<body>
+<div id="Menu">
+   <div id="bt">
+   <a href="http://hackmysql.com">Hack MySQL</a> &lsaquo; <a class="him" style="color:#f07f00;" href="as-of">MySQL Features Timeline</a>   </div>
+   <div id="info">
+      <a class="not_him" href="blog/">Blog</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="contact">Contact</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="sitemap">Sitemap</a>
+      <a name="__top"></a>
+   </div>
+   <div style="clear:both; font-size:10pt; text-align:center; width:100%; padding:10px 0px 5px 0px;">
+      <b>This web site is no longer maintained.&nbsp;&nbsp;<a href="http://hackmysql.com/blog/2010/11/19/hack-mysql-is-longer-maintained/">Read why</a>.</b>
+   </div>
+</div>
+
+
+<div id="Content">
+<div id="c1"> 
+   <h1>MySQL Features Timeline</h1>
+   <p>This timeline shows as of what version MySQL features were added, removed and changed.  This is a work
+   in progress so many features are missing.  The <a href="http://dev.mysql.com/doc/">MySQL manual</a> is the
+   authoritative source for all feature additions, removals and changes.  To add a feature, <a href="contact">contact</a> me.</p>
+
+   <table style="width:100%">
+      <tr>
+         <th>MySQL version</th><th colspan="3" style="text-align:center">Features</th>
+      </tr>
+      <tr>
+         <th>&nbsp;</th><th>Added</th><th>Changed</th>
+      </tr>
+
+      <tr>
+         <td class="v v54">5.4.2</td>
+         <td class="a v54">
+         </td>
+         <td class="c v54">
+            <ul>
+               <li><a href="http://www.innodb.com/products/innodb_plugin/plugin-documentation/"><code>InnoDB plugin</code></a>
+                   replaced <a href="http://dev.mysql.com/doc/refman/5.4/en/innodb.html"><code class="rm">built-in InnoDB</code></a> storage engine</li>
+            </ul>
+         </td>
+      </tr>
+<!----------------------------- 5.4 ----------------------------->
+      <tr>
+         <td class="v v51">5.1.38</td>
+         <td class="a v51">
+            <ul>
+                <li><a href="http://www.innodb.com/products/innodb_plugin/plugin-documentation/"><code>InnoDB plugin</code></a></li>
+            </ul>
+         </td>
+         <td class="c v51">&nbsp;</td>
+      </tr>
+
+      <tr>
+         <td class="v v51">5.1.33</td>
+         <td class="a v51">
+            <ul>
+                <li><a href="http://dev.mysql.com/doc/refman/5.1/en/se-db2.html"><code>IBMDB2I</code></a> storage engine</li>
+            </ul>
+         </td>
+         <td class="c v51">&nbsp;</td>
+      </tr>
+      
+      <tr>
+         <td class="v v51">5.1.31</td>
+         <td class="a v51">
+            <ul>
+                <li><a href="http://dev.mysql.com/doc/refman/5.1/en/server-status-variables.html"><code>Queries</code></a> to <code>SHOW STATUS</code></li>
+            </ul>
+         </td>
+         <td class="c v51">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html"><code>Questions</code></a> in <code>SHOW STATUS</code>
+               includes only statements sent to the server by clients and no longer includes statements executed within stored programs, unlike the
+               Queries  variable</li>
+            </ul>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v51">5.1.24</td>
+         <td class="a v51">
+            <ul>
+                <li><a href="http://dev.mysql.com/doc/refman/5.1/en/show-profiles.html"><code>SHOW PROFILES</code></a> (enterprise branch)</li>
+            </ul>
+         </td>
+         <td class="c v51">&nbsp;</td>
+      </tr>
+
+      <tr>
+         <td class="v v51">5.1.12</td>
+         <td class="a v51">
+            <ul>
+               <li><code>--add-user</code></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/query-log.html"><code>--general_log</code></a></li>
+            </ul>
+         </td>
+         <td class="c v51">
+            <ul>
+               <li>Can enable/disable <a href="http://dev.mysql.com/doc/refman/5.1/en/query-log.html">general query log</a> at runtime</li>
+            </ul>
+         </td>
+      </tr>
+      
+      <tr>
+         <td class="v v51">5.1.8</td>
+         <td class="a v51">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/replication-formats.html">Mixed replication</code></a></li>
+            </ul>
+         </td>
+         <td class="c v51">&nbsp;</td>
+      </tr>
+      
+      <tr>
+         <td class="v v51">5.1.7</td>
+         <td class="a v51">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/mysql-upgrade.html">mysql_upgrade</code></a> program</li>
+            </ul>
+         </td>
+         <td class="c v51">&nbsp;</td>
+      </tr>
+
+      <tr>
+         <td class="v v51">5.1.6</td>
+         <td class="a v51">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/partitioning-pruning.html">Partition pruning</li>
+            </ul>
+         </td>
+         <td class="c v51">&nbsp;</td>
+      </tr>
+
+      <tr>
+         <td class="v v51">5.1.5</td>
+         <td class="a v51">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/replication-formats.html">Row-based replication</code></a></li>
+            </ul>
+         </td>
+         <td class="c v51">&nbsp;</td>
+      </tr>
+      
+      <tr>
+         <td class="v v51">5.1</td>
+         <td class="a v51">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/plugin-api.html">Plugin API</code></a></li>
+            </ul>
+         </td>
+         <td class="c v51">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/bdb-storage-engine.html"><code class="rm">BDB</code></a> storage engine removed</li>
+            </ul>
+         </td>
+      </tr>
+<!----------------------------- 5.1 ----------------------------->
+      <tr>
+         <td class="v v50">5.0.76</td>
+         <td class="a v50">
+            <ul>
+                <li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html"><code>Queries</code></a> to <code>SHOW STATUS</code></li>
+            </ul>
+         </td>
+         <td class="c v50">&nbsp;</td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0.72</td>
+         <td class="a v50">
+         </td>
+         <td class="c v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html"><code>Questions</code></a> in <code>SHOW STATUS</code>
+               includes only statements sent to the server by clients and no longer includes statements executed within stored programs, unlike the
+               Queries  variable</li>
+            </ul>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0.32</td>
+         <td class="a v50">
+            <ul>
+                <li><a href="http://dev.mysql.com/doc/refman/5.0/en/show-profiles.html"><code>SHOW PROFILES</code></a> (community branch)</li>
+            </ul>
+         </td>
+         <td class="c v50">&nbsp;</td>
+      </tr>
+      
+      <tr>
+         <td class="v v50">5.0.19</td>
+         <td class="a v50">
+            <ul>
+                <li><a href="http://dev.mysql.com/doc/refman/5.1/en/mysql-upgrade.html">mysql_upgrade</code></a> program</li>              
+            </ul>
+         </td>
+         <td class="c v50">&nbsp;</td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0.13</td>
+         <td class="a v50">
+         </td>
+         <td class="c v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html"><code>GREATEST()</code></a>
+                   returns <code>NULL</code> if any argument is <code>NULL</code> instead of if all arguments are <code>NULL</code></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html"><code>LEAST()</code></a>
+                   returns <code>NULL</code> if any argument is <code>NULL</code> instead of if all arguments are <code>NULL</code></li>
+            </ul>
+         </td>
+      </tr>
+      
+      <tr>
+         <td class="v v50">5.0.7</td>
+         <td class="a v50">
+            <ul>
+               <li>Prepared statement <a href="http://dev.mysql.com/doc/refman/5.0/en/select.html">placeholders for <code>LIMIT</code></a></li>
+            </ul>
+         </td>
+         <td class="c v50">
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0.5</td>
+         <td class="a v50">
+         </td>
+         <td class="c v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html"><code>BIT data type</code></a>
+                   support added to <code>MEMORY</code>, <code>InnoDB</code>, <code>BDB</code>, and <code>NDBCLUSTER</code> storage engines</li>
+            </ul>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0.3</td>
+         <td class="a v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/federated-storage-engine.html">FEDERATED</a> storage engine</li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html"><code>STDDEV_POP()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html"><code>STDDEV_SAMP()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html"><code>VAR_POP()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html"><code>VAR_SAMP()</code></a></li>
+            </ul>
+         </td>
+         <td class="c v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html"><code>BIT data type</code></a>
+                   stores bit-field values, no longer interpreted as <code>TINYINT (1)</code></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html"><code>DECIMAL</code> and <code>NUMERIC</code> data types</a>
+                   stored in binary format instead of as strings</li>
+            </ul>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0.2</td>
+         <td class="a v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/triggers.html">Triggers</a></li>
+            </ul>
+         </td>
+         <td class="c v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/select.html"><code>HAVING</code></a> can refer to columns in the
+                   <code>SELECT</code> list and outer subqueries, in addition to columns in GROUP BY and aggregate functions (SQL standard)</li>
+            </ul>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0.1</td>
+         <td class="a v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-views.html">Views</a></li>
+            </ul>
+         </td>
+         <td class="c v50">
+            <ul>
+               <li><code>SHOW COLUMNS</code> shows views</li>
+            </ul>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v50">5.0</td>
+         <td class="a v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/cursors.html">Cursors</a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/archive-storage-engine.html"><code>ARCHIVE</code></a> storage engine</li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/information-schema.html"><code>INFORMATION_SCHEMA</code></a></li>
+            </ul>
+         </td>
+         <td class="c v50">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/isam-storage-engine.html"><code class="rm">ISAM</code></a> storage engine removed</li>
+            </ul>
+         </td>
+      </tr>
+<!----------------------------- 5.0 ----------------------------->
+      <tr>
+         <td class="v v41">4.1.11</td>
+         <td class="a v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/blackhole-storage-engine.html"><code>BLACKHOLE</code></a> storage engine</li>
+            </ul>
+         </td>
+         <td class="c v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>ANSI</code> SQL mode</a>
+                   no longer includes <code>ONLY_FULL_GROUP_BY</code></li>
+            </ul>
+         </td>
+      </tr>
+      
+      <tr>
+         <td class="v v41">4.1.8</td>
+         <td class="a v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/commit.html"><code>WITH CONSISTENT SNAPSHOT</code></a>
+                  clause to <code>START TRANSACTION</code></li>
+            </ul>
+         </td>
+         <td class="c v41">&nbsp;</td>
+      </tr>
+      
+      <tr>
+         <td class="v v41">4.1.4</td>
+         <td class="a v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/csv-storage-engine.html"><code>CVS</code></a> storage engine</li>
+            </ul> 
+         </td>
+         <td class="c v41">
+               <li><code>SHOW PROCESSLIST</code> displays <code>Rolling back</code> in the <code>State</code> column</li>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v41">4.1.2</td>
+         <td class="a v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/create-table.html"><code>ENGINE</code></a> to <code>CREATE TABLE</code>,
+                   synonym for <code>TYPE</code></li>
+            </ul>
+         </td>
+         <td class="c v41">
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v41">4.1.1</td>
+         <td class="a v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/checksum-table.html"><code>CHECKSUM TABLE</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html"><code>BIT_XOR()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>NO_AUTO_VALUE_ON_ZERO</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>NO_FIELD_OPTIONS</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>NO_KEY_OPTIONS</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>NO_TABLE_OPTIONS</code></a></li>
+            </ul> 
+         </td>
+         <td class="c v41">&nbsp;</td>
+      </tr>
+
+      <tr>
+         <td class="v v41">4.1</td>
+         <td class="a v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/subqueries.html">Subqueries</a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/show-warnings.html"><code>SHOW WARNINGS</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/create-table.html"><code>LIKE</code></a> to <code>CREATE TABLE</code></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html"><code>CRC32()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html"><code>GROUP_CONCAT()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html"><code>VARIANCE()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html">Unicode</a> support</li>
+            </ul> 
+         </td>
+         <td class="c v41">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/create-table.html"><code>COMMENT</code></a> for <code>CREATE TABLE</code> made operational
+                   (allowed but ignored in earlier versions)</li>
+               <li><a href="http://dev.mysql.com/doc/refman/5.0/en/password-hashing.html"><code>PASSWORD()</code></a> hash value length increased</a>
+                   from 16 to 41 bytes</li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/string-type-overview.html">Character column lengths</a> interpreted in character
+                    units instead of bytes</li>
+               <li><code>HEAP</code> storage engine renamed <code>MEMORY</code>, but <code>HEAP</code> still works</li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/show-columns.html"><code>SHOW FULL COLUMNS</code></a> displays any per-column
+                   collation and comments</li>
+            </ul>
+         </td>
+      </tr>
+<!----------------------------- 4.1 ----------------------------->
+      <tr>
+         <td class="v v40">4.0.18</td>
+         <td class="a v40">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/create-table.html"><code>ENGINE</code></a> to <code>CREATE TABLE</code>,
+                   synonym for <code>TYPE</code></li>
+            </ul>
+         </td>
+         <td class="c v40">
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v40">4.0.15</td>
+         <td class="a v40">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>NO_DIR_IN_CREATE</code></a></li>
+            </ul>
+         </td>
+         <td class="c v40">
+         </td>
+      </tr>
+      
+      <tr>
+         <td class="v v40">4.0.14</td>
+         <td class="a v40">
+         </td>
+         <td class="c v40">
+            <ul>
+               <li><code>InnoDB</code> storage engine <a href="http://dev.mysql.com/doc/refman/4.1/en/create-table.html">supports indexing on BLOB and TEXT columns</a></li>
+            </ul>
+         </td>
+      </tr>
+      
+      <tr>
+         <td class="v v40">4.0.2</td>
+         <td class="a v40">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/5.1/en/encryption-functions.html"><code>SHA1()</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>NO_UNSIGNED_SUBTRACTION</code></a></li>
+            </ul>
+         </td>
+         <td class="c v40">
+            <ul>
+               <li><code>MEMORY</code> storage engine <a href="http://dev.mysql.com/doc/refman/4.1/en/create-table.html">supports indexes on columns that can have NULL values</a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/string-type-overview.html"><code>VARCHAR</code> column length range</a>
+               changed from 1-255 to 0-255</li>
+            </ul>
+         </td>
+      </tr>
+      
+      <tr>
+         <td class="v v40">4.0.1</td>
+         <td class="a v40">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/query-cache.html">Query cache</a></li>
+            </ul>
+         </td>
+         <td class="c v40">
+            <ul>
+               <li><code>MEMORY</code> storage engine <a href="http://dev.mysql.com/doc/refman/4.1/en/create-table.html">supports indexes on columns that can have NULL values</a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/string-type-overview.html"><code>VARCHAR</code> column length range</a>
+               changed from 1-255 to 0-255</li>
+            </ul>
+         </td>
+      </tr>
+
+      <tr>
+         <td class="v v40">4.0</td>
+         <td>
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/select.html"><code>UNION</code></a> to <code>SELECT</code></li>
+            </ul>
+         </td>
+         <td class="a v40">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>ANSI_QUOTES</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>IGNORE_SPACE</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>ONLY_FULL_GROUP_BY</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>PIPES_AS_CONCAT</code></a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/server-sql-mode.html"><code>REAL_AS_FLOAT</code></a></li>
+            </ul>
+         </td>
+         <td class="c v40">
+         </td>
+      </tr>
+<!----------------------------- 4.0 ----------------------------->
+      <tr>
+         <td class="v v3">3.23.23</td>
+         <td class="a v3">
+            <ul>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html">Full-text indexing and searching</a></li>
+               <li><a href="http://dev.mysql.com/doc/refman/4.1/en/show-columns.html"><code>FULL</code></a> to <code>SHOW COLUMNS</code></li>
+            </ul>
+         </td>
+         <td class="c v3">
+         </td>
+      </tr>
+
+   </table>
+ </div>
+</div>
+
+<div id="footer">
+This page was last updated June 22, 2010.
+</div>
+
+</body>
+</html>

hackmysql.com/case1

+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head>
+<title>Indexing Basic MySQL Queries</title>
+<meta name="keywords" content="MySQL, basic, index, indexes, indexing, queries, query" />
+<meta name="description" content="Indexing basic MySQL queries." />
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+<link rel="stylesheet" href="http://hackmysql.com/styles.css" type="text/css" />
+<style type="text/css">
+.him {border-bottom:2px solid #f07f00;}
+</style>
+
+</head>
+
+<body>
+<div id="Menu">
+   <div id="bt">
+   <a href="http://hackmysql.com">Hack MySQL</a> &lsaquo; <a class="him" style="color:#f07f00;" href="case1.php">Indexing Basic MySQL Queries</a>   </div>
+   <div id="info">
+      <a class="not_him" href="blog/">Blog</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="contact">Contact</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="sitemap">Sitemap</a>
+      <a name="__top"></a>
+   </div>
+   <div style="clear:both; font-size:10pt; text-align:center; width:100%; padding:10px 0px 5px 0px;">
+      <b>This web site is no longer maintained.&nbsp;&nbsp;<a href="http://hackmysql.com/blog/2010/11/19/hack-mysql-is-longer-maintained/">Read why</a>.</b>
+   </div>
+</div>
+
+
+<div id="Content">
+<div id="c1">
+   <h1>Indexing Basic MySQL Queries</h1>
+   "Why is the server so slow?" That's what one customer was asking in regards to her dual 2.8 GHz Xeon with 3 Gigs
+   of RAM server. Her primary website is a bulletin board with over 25,000 registered users and 151,000 posts total.
+   Not a large site by these metrics, but extremely high bandwith due to 62,000 files totaling 110 Gigs which are
+   all uploaded and accessed with <a href="http://www.opentools.de/">attachment mod</a> for
+   <a href="http://www.phpbb.com/">phpBB</a>.<br /><br />
+
+   When I first started working this server a load of 200 was not uncommon and MySQL was always the culprit. The
+   basics all checked out: Plenty of server capacity, normal MySQL and Apache configs, no hardware or network 
+   problems. One thing always kept showing up though: A slow query. Fortunately another engineer wiser than I had
+   turned on slow query logging in my.cnf:
+   <div style="padding: 7px">
+   log-slow-queries<br />
+   long_query_time = 5
+   </div>
+   The slow queries log file reached 2 Gigs in size after just a few days and almost every query was like:
+   <div style="padding: 7px">
+   # Query_time: 5&nbsp;&nbsp;Lock_time: 0&nbsp;&nbsp;Rows_sent: 1&nbsp;&nbsp;Rows_examined: 61043<br />
+   SELECT attach_id as total_attachment FROM phpbb_attachments<br />
+   WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   </div>
+   Some queries had literally hundreds of post_id specified for IN; this a mild example of the horrors within. I say
+   "horrors" because this is short and sweet query correct? Oh no, look: "Rows_sent: 1&nbsp;&nbsp;Rows_examined: 61043".
+   Examining 61,043 to return just 1 is not short or sweet. Multiply this times 200 visitors and MySQL is examining
+   12,208,600 rows! I know it's difficult to believe but it gets worse later when we see where this slow query is
+   coming from and how often it runs.<br /><br />
+
+   At first I thought: What can I do about some else's query? I did everything I could for the server and besides,
+   can one query really bring down such a powerful machine? Well there's only one way to find out: Hack it.<br /><br />
+
+   <h2>Hacking Some Else's Slow Query</h2>
+   First we must understand why the query is slow in the mind of MySQL and the only way to do that is to have MySQL
+   <a href="http://dev.mysql.com/doc/refman/5.0/en/explain.html">EXPLAIN</a> it to us:
+   <pre>
+   mysql> EXPLAIN
+       -> SELECT attach_id as total_attachment FROM phpbb_attachments
+       -> WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   +-------------------+-------+---------------+-------------------+---------+------+-------+--------------------------+
+   | table             | type  | possible_keys | key               | key_len | ref  | rows  | Extra                    |
+   +-------------------+-------+---------------+-------------------+---------+------+-------+--------------------------+
+   | phpbb_attachments | index | NULL          | attach_id_post_id |       6 | NULL | 61834 | Using where; Using index |
+   +-------------------+-------+---------------+-------------------+---------+------+-------+--------------------------+
+   </pre>
+   It's a thing of beauty to see inside the mind of the machine, but in this case perhaps not. Why is MySQL telling us
+   there's no possible keys but it's using key attach_id_post_id? And if it's using a key then why does it suspect it
+   will have to examine 61,834 rows (by performing a full index scan, denoted by "type: index")? It seems there's a
+   problem with the keys so we must now understand them:
+   <pre>
+   mysql> DESCRIBE phpbb_attachments;
+   +-------------+-----------------------+------+-----+---------+-------+
+   | Field       | Type                  | Null | Key | Default | Extra |
+   +-------------+-----------------------+------+-----+---------+-------+
+   | attach_id   | mediumint(8) unsigned |      | MUL | 0       |       |
+   | post_id     | mediumint(8) unsigned |      |     | 0       |       |
+   | privmsgs_id | mediumint(8) unsigned |      |     | 0       |       |
+   | user_id_1   | mediumint(8)          |      |     | 0       |       |
+   | user_id_2   | mediumint(8)          |      |     | 0       |       |
+   +-------------+-----------------------+------+-----+---------+-------+
+
+   mysql> SHOW INDEX FROM phpbb_attachments;
+   +-------------------+------------+-----------------------+--------------+-------------+-----------+-------------+
+   | Table             | Non_unique | Key_name              | Seq_in_index | Column_name | Collation | Cardinality |
+   +-------------------+------------+-----------------------+--------------+-------------+-----------+-------------+
+   | phpbb_attachments |          1 | attach_id_post_id     |            1 | attach_id   | A         |       61834 |
+   | phpbb_attachments |          1 | attach_id_post_id     |            2 | post_id     | A         |       61834 |
+   | phpbb_attachments |          1 | attach_id_privmsgs_id |            1 | attach_id   | A         |       61834 |
+   | phpbb_attachments |          1 | attach_id_privmsgs_id |            2 | privmsgs_id | A         |       61834 |
+   +-------------------+------------+-----------------------+--------------+-------------+-----------+-------------+
+   </pre>
+   Understanding indexes (or keys if you prefer) is two part: Understanding the structure of the table <i>then</i>
+   understanding the indexes. You can't just slap an index on a table and think everything will be wonderful. In this
+   example it looks like everything should be wonderful with key attach_id_post_id. Given that the SELECT statement is
+   selecting attach_id and post_id and that's just what this key indexes, so why isn't it working? It is working, just not
+   how we're intending; it's working for MySQL which is why in EXPLAIN it says "Using index." When MySQL says this
+   in "Extra" is means "The column information is retrieved from the table using only information in the index tree 
+   without having to do an additional seek to read the actual row." In other words: It finds and returns matching
+   columns from the index in memory not the table on disk, which is a good thing, unless it's doing this 12 million
+   times for 1 matching column.<br /><br />
+
+   How very annoying: MySQL is using the index but still in effect examing every row of the table. The reason why in
+   this example deals with how MySQL uses multiple column indexes. From DESCRIBE we see "MUL" for multi-column
+   index, and from SHOW INDEX we see attach_id_post_id twice, first for attach_id second for post_id. A multiple
+   column index acts like a single column index if the columns were put end-to-end in the order specified by
+   "Seq_in_index" from SHOW INDEX. In this example if attach_id were 100 and post_id were 200 this is indexed as
+   "100 200". Painfully simple correct? Throw this in the mix: MySQL will only use a multi-column index if a value
+   is specified for the first column in the index. In this example the first column in the index is attach_id and
+   we're not specifying a value for this column which is why MySQL won't use the index like we want it to. What
+   MySQL does do, and why it's able to use the index at all,  is use any value for attach_id and the values we gave
+   it for post_id. In effect it looks for '* 163246', '* 164224', '* 164894', '* 165146', and '* 167931'. Since
+   attach_id is unique MySQL really does have to look at every single one, all 61,00+. While doing that if it
+   comes across one with a matching post_id lucky for us. I hope you see the obvious and simple solution: Swap
+   the order of columns in the key, post_id then attach_id. Later we'll do this but first it's good learning
+   to examine another possibility.<br /><br />
+
+   <h2>The Little Index That Could</h2>
+   Here's an idea: If post_id is what we're matching rows on, just index post_id. The command would be "CREATE INDEX
+   test_index ON phpbb_attachments (post_id);". Then EXPLAIN the exact same slow query again:
+   <pre>
+   mysql> EXPLAIN
+       -> SELECT attach_id as total_attachment FROM phpbb_attachments
+       -> WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   +-------------------+-------+---------------+------------+---------+------+------+-------------+
+   | table             | type  | possible_keys | key        | key_len | ref  | rows | Extra       |
+   +-------------------+-------+---------------+------------+---------+------+------+-------------+
+   | phpbb_attachments | range | test_index    | test_index |       3 | NULL |    5 | Using where |
+   +-------------------+-------+---------------+------------+---------+------+------+-------------+
+   </pre>
+   That output speaks volumes: 5 rows to examine. This is the index that saved the server. I have not seen the server's
+   load go above 4 and the website is running faster with more users at once. The difference was night and day. But
+   why stop there? The query is no longer slow but it could be better. Notice how MySQL is not "Using index" anymore.
+   This is because attach_id is not in the index its using. Whereas it's in the attach_id_post_id index MySQL has
+   wisely chosen to examine fewer rows at the cost of doing a few disk seeks. The solution is a multiple column index
+   on both post_id and attach_id, with post_id first.<br /><br />
+
+   <h2>The Solution</h2>
+   <pre>
+   mysql> CREATE INDEX post_id_attach_id ON phpbb_attachments (post_id, attach_id);
+   Query OK, 61834 rows affected (0.53 sec)
+   Records: 61834  Duplicates: 0  Warnings: 0
+
+   mysql> EXPLAIN
+       -> SELECT attach_id as total_attachment FROM phpbb_attachments
+       -> WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   +-------------------+-------+------------------------------+-------------------+---------+------+------+--------------------------+
+   | table             | type  | possible_keys                | key               | key_len | ref  | rows | Extra                    |
+   +-------------------+-------+------------------------------+-------------------+---------+------+------+--------------------------+
+   | phpbb_attachments | range | test_index,post_id_attach_id | post_id_attach_id |       3 | NULL |    5 | Using where; Using index |
+   +-------------------+-------+------------------------------+-------------------+---------+------+------+--------------------------+
+   </pre>
+   As we can see MySQL still considers the test_index key but chooses post_id_attach_id because doing so will allow
+   it to get matching attach_id from the index instead of the disk. A simple swap of column orders in the index made
+   all the difference. As the saying goes, it takes one tree to make a thousand matches and one match to burn a
+   thousand trees down.<br /><br />
+
+   <h2>Hacking the Source</h2>
+   Let us be clear about cause and effect: The slow query we just fixed was only the effect, some code is causing it
+   to run. Finding the source code is easy: grep for the SQL statement. In this example the source code is attachment
+   mod's addition to phpBB's index.php and viewforum.php scripts (one slight difference in each script):
+   <pre>
+   // Display Total Attachments MOD Start
+   $sql = "SELECT post_id
+           FROM " . POSTS_TABLE . "
+           WHERE topic_id = $topic_id
+           GROUP BY post_id";
+   
+   if ( !($result = $db->sql_query($sql)) )
+   {
+      message_die(GENERAL_ERROR, 'Could not query post information', '', __LINE__, __FILE__, $sql);
+   }
+   
+   $post_ids = '';
+   while ( $row = $db->sql_fetchrow($result) )
+   {
+      $post_ids .= ($post_ids == '') ? $row['post_id'] : ', ' . $row['post_id'];
+   }
+   
+   if ( $post_ids != '' )
+   {
+      $sql = "SELECT attach_id as total_attachment
+              FROM " . ATTACHMENTS_TABLE . "
+              WHERE post_id IN (" . $post_ids . ")";
+      
+      if ( !($result2 = $db->sql_query($sql)) )
+      {
+         message_die(GENERAL_ERROR, 'Could not query attachment information', '', __LINE__, __FILE__, $sql);
+      }
+      
+      $attach_num = $db->sql_numrows($result2);
+   }
+   else
+   {
+      $attach_num = 0;
+   }
+   // Display Total Attachments MOD End
+   </pre>
+   This query runs every time a vistor views the forums index or views the topics in a forum. If, before we fixed
+   the slow query, 200 hundred vistors navigated from the index to a topic MySQL would have examined about 24,417,200
+   rows. Now it may examine a few thousand at worse. What's more interesting is the purpose this query serves: It
+   counts the number of attachments for a given forum or topic. Instead of using the SQL
+   <a href="http://dev.mysql.com/doc/refman/5.0/en/counting-rows.html">COUNT function</a> to return one 
+   row with the sum it litterally gets all matching rows and then counts them afterwards. In some cases this means
+   it matches and counts hundreds of rows when MySQL could do this internally. Ultimately the PHP function mysql_num_rows
+   is called to count the rows returned so perhaps this function is optimized to know ahead of time how many rows
+   were returned without actually counting them. Still it seems more logical just to use the SQL COUNT function.<br /><br />
+
+   Let's go even deeper and look at the first query in this mod:
+   <div style="padding: 7px">
+   $sql = "SELECT post_id
+           FROM " . POSTS_TABLE . "
+           WHERE topic_id = $topic_id
+           GROUP BY post_id";
+   </div>
+   MySQL EXPLAIN says:
+   <pre>
+   mysql> EXPLAIN SELECT post_id FROM phpbb_posts WHERE topic_id=30 GROUP BY post_id;
+   +-------------+------+---------------+----------+---------+-------+------+----------------------------------------------+
+   | table       | type | possible_keys | key      | key_len | ref   | rows | Extra                                        |
+   +-------------+------+---------------+----------+---------+-------+------+----------------------------------------------+
+   | phpbb_posts | ref  | topic_id      | topic_id |       3 | const |    3 | Using where; Using temporary; Using filesort |
+   +-------------+------+---------------+----------+---------+-------+------+----------------------------------------------+
+   </pre>
+   Type "ref," using a key, returns 1 row, that's all great but what the hell: "Using temporary; Using filesort"?
+   "Using temporary" means MySQL creates a temporary table (hopefully in memory) to hold the results. According to
+   the manual "This typically happens if the query contains GROUP BY and ORDER BY clauses that list columns
+   differently." A filesort is not desirable either because it means "MySQL will need to do an extra pass to find
+   out how to retrieve the rows in sorted order." Seems overkill when only 1 row is expected to be examined. To
+   optimize this query we need to know a little about the structure of the table and the current indexes:
+   <pre>
+   mysql> DESCRIBE phpbb_posts;
+   +-----------------+-----------------------+------+-----+---------+----------------+
+   | Field           | Type                  | Null | Key | Default | Extra          |
+   +-----------------+-----------------------+------+-----+---------+----------------+
+   | post_id         | mediumint(8) unsigned |      | PRI | NULL    | auto_increment |
+   | topic_id        | mediumint(8) unsigned |      | MUL | 0       |                |
+   | forum_id        | smallint(5) unsigned  |      | MUL | 0       |                |
+   ... (We don't need the remaining columns)
+
+   mysql> SHOW INDEX FROM phpbb_posts;
+   +-------------+------------+-----------+--------------+-------------+-----------+-------------+
+   | Table       | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality |
+   +-------------+------------+-----------+--------------+-------------+-----------+-------------+
+   | phpbb_posts |          0 | PRIMARY   |            1 | post_id     | A         |      151962 |
+   | phpbb_posts |          1 | forum_id  |            1 | forum_id    | A         |          23 |
+   | phpbb_posts |          1 | topic_id  |            1 | topic_id    | A         |       16884 |
+   ... (We don't need the remaining indexes)
+   </pre>
+   There's a single column index on each column basically. MySQL uses the topic_id index for the WHERE condition but
+   then it must disk seek for post_id. Since rows on the disk can be in any order and we're grouping by a column
+   we have to read from rows on the disk (post_id) MySQL creates a temporary table and filesorts. First MySQL has
+   to scan the matching rows to determine how to retrieve them in sorted order, this is the extra filesort pass.
+   Then it retrieves the rows in sorted order putting them into a temporary table because where else can it put
+   them? The rows aren't sorted on the disk and there's no sorted index. If we had a sorted index then none of this
+   would be necessary because all the values (for topic_id and post_id) would already exist in memory in sorted
+   order. All we have to do is create an appropriate index:
+   <pre>
+   mysql> CREATE INDEX topic_id_post_id ON phpbb_posts (topic_id, post_id);
+   Query OK, 151999 rows affected (3.74 sec)
+   Records: 151999  Duplicates: 0  Warnings: 0
+   
+   mysql> EXPLAIN SELECT post_id FROM phpbb_posts WHERE topic_id=30 GROUP BY post_id;
+   +-------------+------+---------------------------+------------------+---------+-------+------+--------------------------+
+   | table       | type | possible_keys             | key              | key_len | ref   | rows | Extra                    |
+   +-------------+------+---------------------------+------------------+---------+-------+------+--------------------------+
+   | phpbb_posts | ref  | topic_id,topic_id_post_id | topic_id_post_id |       3 | const |    2 | Using where; Using index |
+   +-------------+------+---------------------------+------------------+---------+-------+------+--------------------------+
+   </pre> 
+   Much better: No temporary table or file sort and getting matching post_id from the index instead of the disk.<br /><br />
+
+   <h2>Final Thoughts</h2>
+   Notice how the queries we worked with in this example were painfully simple? Yet their power to bring a rather
+   strong server down was amazing. Likewise, the solutions were equally simple and powerful. I should metion
+   this was a special example of attachment mod and the current version does not have this slow query. If
+   you're wondering the version of MySQL used was 4.0 and all the examples are real from a production server.
+</div>
+</div>
+
+<div id="footer">
+This page was last updated on May 16, 2010.
+</div>
+
+</body>
+</html>

hackmysql.com/case1.php

+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head>
+<title>Indexing Basic MySQL Queries</title>
+<meta name="keywords" content="MySQL, basic, index, indexes, indexing, queries, query" />
+<meta name="description" content="Indexing basic MySQL queries." />
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+<link rel="stylesheet" href="http://hackmysql.com/styles.css" type="text/css" />
+<style type="text/css">
+.him {border-bottom:2px solid #f07f00;}
+</style>
+
+</head>
+
+<body>
+<div id="Menu">
+   <div id="bt">
+   <a href="http://hackmysql.com">Hack MySQL</a> &lsaquo; <a class="him" style="color:#f07f00;" href="case1.php">Indexing Basic MySQL Queries</a>   </div>
+   <div id="info">
+      <a class="not_him" href="blog/">Blog</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="contact">Contact</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="sitemap">Sitemap</a>
+      <a name="__top"></a>
+   </div>
+   <div style="clear:both; font-size:10pt; text-align:center; width:100%; padding:10px 0px 5px 0px;">
+      <b>This web site is no longer maintained.&nbsp;&nbsp;<a href="http://hackmysql.com/blog/2010/11/19/hack-mysql-is-longer-maintained/">Read why</a>.</b>
+   </div>
+</div>
+
+
+<div id="Content">
+<div id="c1">
+   <h1>Indexing Basic MySQL Queries</h1>
+   "Why is the server so slow?" That's what one customer was asking in regards to her dual 2.8 GHz Xeon with 3 Gigs
+   of RAM server. Her primary website is a bulletin board with over 25,000 registered users and 151,000 posts total.
+   Not a large site by these metrics, but extremely high bandwith due to 62,000 files totaling 110 Gigs which are
+   all uploaded and accessed with <a href="http://www.opentools.de/">attachment mod</a> for
+   <a href="http://www.phpbb.com/">phpBB</a>.<br /><br />
+
+   When I first started working this server a load of 200 was not uncommon and MySQL was always the culprit. The
+   basics all checked out: Plenty of server capacity, normal MySQL and Apache configs, no hardware or network 
+   problems. One thing always kept showing up though: A slow query. Fortunately another engineer wiser than I had
+   turned on slow query logging in my.cnf:
+   <div style="padding: 7px">
+   log-slow-queries<br />
+   long_query_time = 5
+   </div>
+   The slow queries log file reached 2 Gigs in size after just a few days and almost every query was like:
+   <div style="padding: 7px">
+   # Query_time: 5&nbsp;&nbsp;Lock_time: 0&nbsp;&nbsp;Rows_sent: 1&nbsp;&nbsp;Rows_examined: 61043<br />
+   SELECT attach_id as total_attachment FROM phpbb_attachments<br />
+   WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   </div>
+   Some queries had literally hundreds of post_id specified for IN; this a mild example of the horrors within. I say
+   "horrors" because this is short and sweet query correct? Oh no, look: "Rows_sent: 1&nbsp;&nbsp;Rows_examined: 61043".
+   Examining 61,043 to return just 1 is not short or sweet. Multiply this times 200 visitors and MySQL is examining
+   12,208,600 rows! I know it's difficult to believe but it gets worse later when we see where this slow query is
+   coming from and how often it runs.<br /><br />
+
+   At first I thought: What can I do about some else's query? I did everything I could for the server and besides,
+   can one query really bring down such a powerful machine? Well there's only one way to find out: Hack it.<br /><br />
+
+   <h2>Hacking Some Else's Slow Query</h2>
+   First we must understand why the query is slow in the mind of MySQL and the only way to do that is to have MySQL
+   <a href="http://dev.mysql.com/doc/refman/5.0/en/explain.html">EXPLAIN</a> it to us:
+   <pre>
+   mysql> EXPLAIN
+       -> SELECT attach_id as total_attachment FROM phpbb_attachments
+       -> WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   +-------------------+-------+---------------+-------------------+---------+------+-------+--------------------------+
+   | table             | type  | possible_keys | key               | key_len | ref  | rows  | Extra                    |
+   +-------------------+-------+---------------+-------------------+---------+------+-------+--------------------------+
+   | phpbb_attachments | index | NULL          | attach_id_post_id |       6 | NULL | 61834 | Using where; Using index |
+   +-------------------+-------+---------------+-------------------+---------+------+-------+--------------------------+
+   </pre>
+   It's a thing of beauty to see inside the mind of the machine, but in this case perhaps not. Why is MySQL telling us
+   there's no possible keys but it's using key attach_id_post_id? And if it's using a key then why does it suspect it
+   will have to examine 61,834 rows (by performing a full index scan, denoted by "type: index")? It seems there's a
+   problem with the keys so we must now understand them:
+   <pre>
+   mysql> DESCRIBE phpbb_attachments;
+   +-------------+-----------------------+------+-----+---------+-------+
+   | Field       | Type                  | Null | Key | Default | Extra |
+   +-------------+-----------------------+------+-----+---------+-------+
+   | attach_id   | mediumint(8) unsigned |      | MUL | 0       |       |
+   | post_id     | mediumint(8) unsigned |      |     | 0       |       |
+   | privmsgs_id | mediumint(8) unsigned |      |     | 0       |       |
+   | user_id_1   | mediumint(8)          |      |     | 0       |       |
+   | user_id_2   | mediumint(8)          |      |     | 0       |       |
+   +-------------+-----------------------+------+-----+---------+-------+
+
+   mysql> SHOW INDEX FROM phpbb_attachments;
+   +-------------------+------------+-----------------------+--------------+-------------+-----------+-------------+
+   | Table             | Non_unique | Key_name              | Seq_in_index | Column_name | Collation | Cardinality |
+   +-------------------+------------+-----------------------+--------------+-------------+-----------+-------------+
+   | phpbb_attachments |          1 | attach_id_post_id     |            1 | attach_id   | A         |       61834 |
+   | phpbb_attachments |          1 | attach_id_post_id     |            2 | post_id     | A         |       61834 |
+   | phpbb_attachments |          1 | attach_id_privmsgs_id |            1 | attach_id   | A         |       61834 |
+   | phpbb_attachments |          1 | attach_id_privmsgs_id |            2 | privmsgs_id | A         |       61834 |
+   +-------------------+------------+-----------------------+--------------+-------------+-----------+-------------+
+   </pre>
+   Understanding indexes (or keys if you prefer) is two part: Understanding the structure of the table <i>then</i>
+   understanding the indexes. You can't just slap an index on a table and think everything will be wonderful. In this
+   example it looks like everything should be wonderful with key attach_id_post_id. Given that the SELECT statement is
+   selecting attach_id and post_id and that's just what this key indexes, so why isn't it working? It is working, just not
+   how we're intending; it's working for MySQL which is why in EXPLAIN it says "Using index." When MySQL says this
+   in "Extra" is means "The column information is retrieved from the table using only information in the index tree 
+   without having to do an additional seek to read the actual row." In other words: It finds and returns matching
+   columns from the index in memory not the table on disk, which is a good thing, unless it's doing this 12 million
+   times for 1 matching column.<br /><br />
+
+   How very annoying: MySQL is using the index but still in effect examing every row of the table. The reason why in
+   this example deals with how MySQL uses multiple column indexes. From DESCRIBE we see "MUL" for multi-column
+   index, and from SHOW INDEX we see attach_id_post_id twice, first for attach_id second for post_id. A multiple
+   column index acts like a single column index if the columns were put end-to-end in the order specified by
+   "Seq_in_index" from SHOW INDEX. In this example if attach_id were 100 and post_id were 200 this is indexed as
+   "100 200". Painfully simple correct? Throw this in the mix: MySQL will only use a multi-column index if a value
+   is specified for the first column in the index. In this example the first column in the index is attach_id and
+   we're not specifying a value for this column which is why MySQL won't use the index like we want it to. What
+   MySQL does do, and why it's able to use the index at all,  is use any value for attach_id and the values we gave
+   it for post_id. In effect it looks for '* 163246', '* 164224', '* 164894', '* 165146', and '* 167931'. Since
+   attach_id is unique MySQL really does have to look at every single one, all 61,00+. While doing that if it
+   comes across one with a matching post_id lucky for us. I hope you see the obvious and simple solution: Swap
+   the order of columns in the key, post_id then attach_id. Later we'll do this but first it's good learning
+   to examine another possibility.<br /><br />
+
+   <h2>The Little Index That Could</h2>
+   Here's an idea: If post_id is what we're matching rows on, just index post_id. The command would be "CREATE INDEX
+   test_index ON phpbb_attachments (post_id);". Then EXPLAIN the exact same slow query again:
+   <pre>
+   mysql> EXPLAIN
+       -> SELECT attach_id as total_attachment FROM phpbb_attachments
+       -> WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   +-------------------+-------+---------------+------------+---------+------+------+-------------+
+   | table             | type  | possible_keys | key        | key_len | ref  | rows | Extra       |
+   +-------------------+-------+---------------+------------+---------+------+------+-------------+
+   | phpbb_attachments | range | test_index    | test_index |       3 | NULL |    5 | Using where |
+   +-------------------+-------+---------------+------------+---------+------+------+-------------+
+   </pre>
+   That output speaks volumes: 5 rows to examine. This is the index that saved the server. I have not seen the server's
+   load go above 4 and the website is running faster with more users at once. The difference was night and day. But
+   why stop there? The query is no longer slow but it could be better. Notice how MySQL is not "Using index" anymore.
+   This is because attach_id is not in the index its using. Whereas it's in the attach_id_post_id index MySQL has
+   wisely chosen to examine fewer rows at the cost of doing a few disk seeks. The solution is a multiple column index
+   on both post_id and attach_id, with post_id first.<br /><br />
+
+   <h2>The Solution</h2>
+   <pre>
+   mysql> CREATE INDEX post_id_attach_id ON phpbb_attachments (post_id, attach_id);
+   Query OK, 61834 rows affected (0.53 sec)
+   Records: 61834  Duplicates: 0  Warnings: 0
+
+   mysql> EXPLAIN
+       -> SELECT attach_id as total_attachment FROM phpbb_attachments
+       -> WHERE post_id IN (163246, 164224, 164894, 165146, 167931);
+   +-------------------+-------+------------------------------+-------------------+---------+------+------+--------------------------+
+   | table             | type  | possible_keys                | key               | key_len | ref  | rows | Extra                    |
+   +-------------------+-------+------------------------------+-------------------+---------+------+------+--------------------------+
+   | phpbb_attachments | range | test_index,post_id_attach_id | post_id_attach_id |       3 | NULL |    5 | Using where; Using index |
+   +-------------------+-------+------------------------------+-------------------+---------+------+------+--------------------------+
+   </pre>
+   As we can see MySQL still considers the test_index key but chooses post_id_attach_id because doing so will allow
+   it to get matching attach_id from the index instead of the disk. A simple swap of column orders in the index made
+   all the difference. As the saying goes, it takes one tree to make a thousand matches and one match to burn a
+   thousand trees down.<br /><br />
+
+   <h2>Hacking the Source</h2>
+   Let us be clear about cause and effect: The slow query we just fixed was only the effect, some code is causing it
+   to run. Finding the source code is easy: grep for the SQL statement. In this example the source code is attachment
+   mod's addition to phpBB's index.php and viewforum.php scripts (one slight difference in each script):
+   <pre>
+   // Display Total Attachments MOD Start
+   $sql = "SELECT post_id
+           FROM " . POSTS_TABLE . "
+           WHERE topic_id = $topic_id
+           GROUP BY post_id";
+   
+   if ( !($result = $db->sql_query($sql)) )
+   {
+      message_die(GENERAL_ERROR, 'Could not query post information', '', __LINE__, __FILE__, $sql);
+   }
+   
+   $post_ids = '';
+   while ( $row = $db->sql_fetchrow($result) )
+   {
+      $post_ids .= ($post_ids == '') ? $row['post_id'] : ', ' . $row['post_id'];
+   }
+   
+   if ( $post_ids != '' )
+   {
+      $sql = "SELECT attach_id as total_attachment
+              FROM " . ATTACHMENTS_TABLE . "
+              WHERE post_id IN (" . $post_ids . ")";
+      
+      if ( !($result2 = $db->sql_query($sql)) )
+      {
+         message_die(GENERAL_ERROR, 'Could not query attachment information', '', __LINE__, __FILE__, $sql);
+      }
+      
+      $attach_num = $db->sql_numrows($result2);
+   }
+   else
+   {
+      $attach_num = 0;
+   }
+   // Display Total Attachments MOD End
+   </pre>
+   This query runs every time a vistor views the forums index or views the topics in a forum. If, before we fixed
+   the slow query, 200 hundred vistors navigated from the index to a topic MySQL would have examined about 24,417,200
+   rows. Now it may examine a few thousand at worse. What's more interesting is the purpose this query serves: It
+   counts the number of attachments for a given forum or topic. Instead of using the SQL
+   <a href="http://dev.mysql.com/doc/refman/5.0/en/counting-rows.html">COUNT function</a> to return one 
+   row with the sum it litterally gets all matching rows and then counts them afterwards. In some cases this means
+   it matches and counts hundreds of rows when MySQL could do this internally. Ultimately the PHP function mysql_num_rows
+   is called to count the rows returned so perhaps this function is optimized to know ahead of time how many rows
+   were returned without actually counting them. Still it seems more logical just to use the SQL COUNT function.<br /><br />
+
+   Let's go even deeper and look at the first query in this mod:
+   <div style="padding: 7px">
+   $sql = "SELECT post_id
+           FROM " . POSTS_TABLE . "
+           WHERE topic_id = $topic_id
+           GROUP BY post_id";
+   </div>
+   MySQL EXPLAIN says:
+   <pre>
+   mysql> EXPLAIN SELECT post_id FROM phpbb_posts WHERE topic_id=30 GROUP BY post_id;
+   +-------------+------+---------------+----------+---------+-------+------+----------------------------------------------+
+   | table       | type | possible_keys | key      | key_len | ref   | rows | Extra                                        |
+   +-------------+------+---------------+----------+---------+-------+------+----------------------------------------------+
+   | phpbb_posts | ref  | topic_id      | topic_id |       3 | const |    3 | Using where; Using temporary; Using filesort |
+   +-------------+------+---------------+----------+---------+-------+------+----------------------------------------------+
+   </pre>
+   Type "ref," using a key, returns 1 row, that's all great but what the hell: "Using temporary; Using filesort"?
+   "Using temporary" means MySQL creates a temporary table (hopefully in memory) to hold the results. According to
+   the manual "This typically happens if the query contains GROUP BY and ORDER BY clauses that list columns
+   differently." A filesort is not desirable either because it means "MySQL will need to do an extra pass to find
+   out how to retrieve the rows in sorted order." Seems overkill when only 1 row is expected to be examined. To
+   optimize this query we need to know a little about the structure of the table and the current indexes:
+   <pre>
+   mysql> DESCRIBE phpbb_posts;
+   +-----------------+-----------------------+------+-----+---------+----------------+
+   | Field           | Type                  | Null | Key | Default | Extra          |
+   +-----------------+-----------------------+------+-----+---------+----------------+
+   | post_id         | mediumint(8) unsigned |      | PRI | NULL    | auto_increment |
+   | topic_id        | mediumint(8) unsigned |      | MUL | 0       |                |
+   | forum_id        | smallint(5) unsigned  |      | MUL | 0       |                |
+   ... (We don't need the remaining columns)
+
+   mysql> SHOW INDEX FROM phpbb_posts;
+   +-------------+------------+-----------+--------------+-------------+-----------+-------------+
+   | Table       | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality |
+   +-------------+------------+-----------+--------------+-------------+-----------+-------------+
+   | phpbb_posts |          0 | PRIMARY   |            1 | post_id     | A         |      151962 |
+   | phpbb_posts |          1 | forum_id  |            1 | forum_id    | A         |          23 |
+   | phpbb_posts |          1 | topic_id  |            1 | topic_id    | A         |       16884 |
+   ... (We don't need the remaining indexes)
+   </pre>
+   There's a single column index on each column basically. MySQL uses the topic_id index for the WHERE condition but
+   then it must disk seek for post_id. Since rows on the disk can be in any order and we're grouping by a column
+   we have to read from rows on the disk (post_id) MySQL creates a temporary table and filesorts. First MySQL has
+   to scan the matching rows to determine how to retrieve them in sorted order, this is the extra filesort pass.
+   Then it retrieves the rows in sorted order putting them into a temporary table because where else can it put
+   them? The rows aren't sorted on the disk and there's no sorted index. If we had a sorted index then none of this
+   would be necessary because all the values (for topic_id and post_id) would already exist in memory in sorted
+   order. All we have to do is create an appropriate index:
+   <pre>
+   mysql> CREATE INDEX topic_id_post_id ON phpbb_posts (topic_id, post_id);
+   Query OK, 151999 rows affected (3.74 sec)
+   Records: 151999  Duplicates: 0  Warnings: 0
+   
+   mysql> EXPLAIN SELECT post_id FROM phpbb_posts WHERE topic_id=30 GROUP BY post_id;
+   +-------------+------+---------------------------+------------------+---------+-------+------+--------------------------+
+   | table       | type | possible_keys             | key              | key_len | ref   | rows | Extra                    |
+   +-------------+------+---------------------------+------------------+---------+-------+------+--------------------------+
+   | phpbb_posts | ref  | topic_id,topic_id_post_id | topic_id_post_id |       3 | const |    2 | Using where; Using index |
+   +-------------+------+---------------------------+------------------+---------+-------+------+--------------------------+
+   </pre> 
+   Much better: No temporary table or file sort and getting matching post_id from the index instead of the disk.<br /><br />
+
+   <h2>Final Thoughts</h2>
+   Notice how the queries we worked with in this example were painfully simple? Yet their power to bring a rather
+   strong server down was amazing. Likewise, the solutions were equally simple and powerful. I should metion
+   this was a special example of attachment mod and the current version does not have this slow query. If
+   you're wondering the version of MySQL used was 4.0 and all the examples are real from a production server.
+</div>
+</div>
+
+<div id="footer">
+This page was last updated on May 16, 2010.
+</div>
+
+</body>
+</html>

hackmysql.com/case2

+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head>
+<title>Table Design and MySQL Index Details</title>
+<meta name="keywords" content="MySQL, table, design, index, details, indexes" />
+<meta name="description" content="Table design and MySQL index details." />
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+<link rel="stylesheet" href="http://hackmysql.com/styles.css" type="text/css" />
+<style type="text/css">
+.him {border-bottom:2px solid #f07f00;}
+</style>
+
+</head>
+
+<body>
+<div id="Menu">
+   <div id="bt">
+   <a href="http://hackmysql.com">Hack MySQL</a> &lsaquo; <a class="him" style="color:#f07f00;" href="case2.php">Table Design and MySQL Index Details</a>   </div>
+   <div id="info">
+      <a class="not_him" href="blog/">Blog</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="contact">Contact</a>&nbsp;&nbsp;&nbsp;
+      <a class="not_him" href="sitemap">Sitemap</a>
+      <a name="__top"></a>
+   </div>
+   <div style="clear:both; font-size:10pt; text-align:center; width:100%; padding:10px 0px 5px 0px;">
+      <b>This web site is no longer maintained.&nbsp;&nbsp;<a href="http://hackmysql.com/blog/2010/11/19/hack-mysql-is-longer-maintained/">Read why</a>.</b>
+   </div>
+</div>
+
+
+<div id="Content">
+<div id="c1">
+   <h1>Table Design and MySQL Index Details</h1>
+   It's not often I get to work with a true homemade database design. In this case the customer (or their
+   developer to be accurate) had designed the whole database backend of their website from scratch. However,
+   problems arose as usual: An overly powerful server seeing sustained high loads and CPU usage by MySQL&mdash;which
+   is actually an understatement given that the load would spike to 50+.
+   After <a href="http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html">enabling slow query logging</a> and
+   examining the log a day later one query stood out in terms of frequency:
+   <div class="inset">
+      # Query_time: 7  Lock_time: 1  Rows_sent: 2  Rows_examined: 45023<br />
+      SELECT * FROM hints WHERE game_id = 374 ORDER BY date DESC;
+   </div>
+   Seven seconds is bad for a query this simple, as is examining 45,000 rows to return 2. Furthermore, this
+   query was logged over 4,000 times in one day, meaning MySQL spent about 28,000 seconds a day on this query,
+   or 7 hours. Let's first consider design and later we'll see how it effects the details of indexes. (You may
+   notice the numbers bet