Issue #1282 open

Cherrypy logger doesn't escape special characters

Will Dormann
created an issue
>>> import cherrypy
>>> newline = 'Line1\nLine2'
>>> 
>>> cherrypy.log(newline)
08/Jan/2014:11:45:05  INFO Line1
Line2
>>> 

Comments (11)

  1. Jason R. Coombs

    That's right. I don't see any reason why it should escape newlines. I'm going to close this ticket as wontfix, but do please elaborate on what behavior you would expect in this instance, what impact the existing behavior has on your operations, and why your expected behavior is superior to the existing behavior.

  2. Branko Vukelic

    To throw in my 2c, I used to maintain a logger library for NodeJS and one of the feature requests was something similar: escaping special characters. There are two things that can be prevented. One is corruption of the logs by some of the non-priting characters. Another is a log poisoning attack. Personally I never had to deal with either myself, so I'm not familiar with how it all works, but I imagine there could be situations where such things can happen for one reason or another.

    I think this can be solved by adding an escaping function which can be specified as a global configuration option as well as a parameter to the log method. The escaping function could be a no-op or pass-through by default, so user could choose an implementation that suits their needs.

  3. Jason R. Coombs

    I believe that because cherrypy leverages the Python logging framework that it already exposes ways to route logs through custom handlers for handling the output in custom ways. It would require disabling the default handler if printing those characters to stdout is problem case. Otherwise, all that's necessary is to write the custom handler and bind it to the root handler.

  4. Will Dormann reporter

    Sorry. Perhaps I should have been more detailed in my original report. 1) Cherrypy indicates that log entries should be escaped: http://svn.cherrypy.org/trunk/cherrypy/_cplogging.py Like Apache started doing in 2.0.46, non-printable and other special characters in %r (and we expand that to all parts) are escaped using \xhh sequences, where hh stands for the hexadecimal representation of the raw byte. Exceptions from this rule are " and \, which are escaped by prepending a backslash, and all whitespace characters, which are written in their C-style notation (\n, \t, etc). This does not happen. 2) Because this does not happen, this allows for log entry spoofing. The reason why Apache escapes log entries is to prevent this very thing. As an attacker, I could just create a loggable event that includes a newline (or other special character), and at that point the log entries cannot be trusted. This is CWE-117 http://cwe.mitre.org/data/definitions/117.html

    I believe that this bug should be reopened.

  5. Branko Vukelic

    Will Dormann (1) seems to apply to access() calls, not log() calls. This suggest that, while it's not possible to mangle your log output by embedding special characters in URLs and such, it's possible to do this if you use unsanitized user input in your logs when calling cherrypy.log() call. Since the latter is an application-level issue, I think the solution mentioned by Jason R. Coombs is adequate.

  6. Jason R. Coombs

    Hi Will. Thanks for the clarification and the reference. Here is the link to that code in the current repository.

    I note that the comment is written in the access method, which logs cherrypy access requests, which is different from error logs (in which the serving application is given broad control over what is output and how). Therefore, the comment isn't relevant to error logs.

    Should this same principle be applied to the error logs? My initial reaction is no, but because what you're describing is a potential security issue, let's consider further.

    I know from personal experience that many applications depend on being able to output tabs and newlines in the error logs. The default behavior of allowing logging of any content is congruent with user expectation and with the Python logging infrastructure. Because the error logs are free-form, there's no prescribed syntax to which a potential attacker could conform to spoof logs. Such an attacker would have to know specifics about the application to spoof those application logs. And indeed, if that behavior is a security issue, that same issue applies to any Python application that leverages the logging framework.

    So I'm not yet convinced there's a security issue here, and furthermore, changing the behavior of the error log to escape non-printable characters would have broad, backward-incompatible effects. Additionally, the logging framework provides hooks for an individual hosting provider or application developer to customize the formatting of logging output, allowing a particular CherryPy application to mitigate such attacks.

    Perhaps CherryPy could provide a secure logger option, an opt-in feature which explicitly encodes all logs for line-parsed safety. Someone would have to step up and volunteer the implementation.

    Otherwise, I believe the existing behavior is as-designed and CherryPy makes no guarantees about the structure of the error logs.

  7. Will Dormann reporter

    I think that at the very least, the documentation should be updated to clearly indicate that it's the application developer's responsibility to manually escape data before using cherrypy's log() functionality, or they may create an application that is vulnerable to CWE-117. I see now the difference between access() and error(). However, if I made the mistake of assuming that cherrypy should be escaping log data, then I assume that other developers using cherrypy may be making the same mistake.

  8. Jason R. Coombs

    It doesn't seem unreasonable to have additional documentation to present this potential vulnerability to the user for consideration. I'm still not entirely convinced the CWE applies, as the examples in the CWE assume structured logs, but the CherryPy error logs are largely unstructured (until structure is applied by the client). Nevertheless, I'm willing to entertain pull requests for documentation updates to elucidate the behavior and set the expectations.

  9. Log in to comment