Issue #48 resolved

Memory DoS caused by very long URL


When cp2 receives a very long URL (thousands of path elements), memory usage rises quickly and, eventually, a {{{ RuntimeError: maximum recursion depth exceeded }}} occurs, compromising the stability of the cp2 application and of the system.
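The failure mode is easy to reproduce in isolation with a toy right-to-left recursive lookup (a sketch of the general pattern, not CherryPy's actual code; `Node` and `resolve_recursive` are invented for the demo):

```python
import sys

class Node:
    pass

def resolve_recursive(obj, pathList):
    # Toy model of a right-to-left recursive lookup: it recurses over the
    # whole path before checking anything, so each path element costs a
    # stack frame (and a list copy) even when the URL is nonsense.
    if not pathList:
        return obj
    parent = resolve_recursive(obj, pathList[:-1])
    if parent is None:
        return None
    return getattr(parent, pathList[-1], None)

root = Node()
root.a = Node()

# A "URL" with more path elements than the interpreter's recursion limit
longPath = ["a"] * (sys.getrecursionlimit() * 2)
try:
    resolve_recursive(root, longPath)
except RecursionError:  # surfaced as a RuntimeError on Pythons of that era
    print("maximum recursion depth exceeded")
```

Because every path element is held on the call stack at once, memory grows with URL length before any validity check happens.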

The problem seems to be the '''cherrypy._cphttptools.getObjFromPath()''' recursive function, and the order it handles path elements.

An ''iterative'' version that handles URLs in the ''opposite direction'' (i.e., from left to right) is proposed below. It will hopefully help solve this issue.


------------------ PATCH STARTS AFTER THIS LINE --------------------------------


def getObjFromPathIterative(objPathList, objCache):
    """ For a given objectPathList (like ['root', 'a', 'b', 'index']),
        return the object (or None if it doesn't exist).
        Also keep a cache for maximum efficiency.

        *** Notes about this implementation ***

        [1] It avoids the
                RuntimeError: maximum recursion depth exceeded
            raised by the recursive implementation upon being requested
            a (possibly malicious) very long URL.
        [2] It scans the URL (here represented by the objPathList) from
            left to right (from 'root' to 'index' instead of the opposite).
            It stops once an object is not valid.
        [3] It (hopefully) avoids a main-memory DoS vulnerability suffered
            by the recursive implementation (related to [1]).
        [4] It passes all (current) unit tests.
    """
    # Let cpg be the first valid object.
    validObjects = ["cpg"]
    # Scan the objPathList in order, from left to right
    for index, obj in enumerate(objPathList):
        # currentObjStr holds something like 'cpg.root.something.else'
        currentObjStr = ".".join(validObjects)

        # Cache check

        # Generate a cacheKey from the first 'index' + 1 elements of objPathList
        cacheKey = tuple(objPathList[:index + 1])
        # Is this cacheKey in the objCache?
        if cacheKey in objCache:
            # And is its value not None?
            if objCache[cacheKey]:
                # Yes, then add it to the list of validObjects
                validObjects.append(obj)
                # OK, go to the next iteration
                continue
            # Its value is None, so we stop
            # (this means it is not a valid object)
            break

        # Attribute check

        if getattr(eval(currentObjStr), obj, None):
            # obj is a valid attribute of the current object
            validObjects.append(obj)
            # Store it in the cache
            objCache[cacheKey] = eval(".".join(validObjects))
        else:
            # obj is not a valid attribute
            # Store None in the cache
            objCache[cacheKey] = None
            # Stop, we won't process the remaining objPathList
            break

    # Return the last cached object (even if it's None)
    return objCache[cacheKey]

If the patch is accepted, please remove the following line as well; it is only there so the test suite exercises the iterative version:

getObjFromPath = getObjFromPathIterative

------------------ PATCH ENDS BEFORE THIS LINE --------------------------------
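For a quick sanity check of the left-to-right walk and the prefix cache, here is a self-contained toy driver (a condensed restatement of the patch for illustration, not the patch itself; the `N` namespace class and the miniature `cpg` tree are invented for the demo):

```python
def get_obj(objPathList, objCache, root):
    # Condensed restatement of getObjFromPathIterative: walk the path
    # left to right, caching the object found at each prefix (or None).
    node, cacheKey = root, None
    for index, obj in enumerate(objPathList):
        cacheKey = tuple(objPathList[:index + 1])
        if cacheKey in objCache:
            node = objCache[cacheKey]
            if node is None:
                break
            continue
        node = getattr(node, obj, None)
        objCache[cacheKey] = node
        if node is None:
            break
    return objCache[cacheKey] if cacheKey else root

class N:
    pass  # bare namespace node

cpg = N()
cpg.root = N()
cpg.root.index = lambda: "index page"

cache = {}
print(get_obj(["root", "index"], cache, cpg)())   # -> index page
print(get_obj(["root", "missing"], cache, cpg))   # -> None
print(sorted(cache))  # every scanned prefix is now cached
```

Note that an invalid element stops the scan immediately, so a hostile URL with thousands of bogus elements costs one failed `getattr` instead of thousands of stack frames.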


Comments (4)

  1. Sylvain Hellegouarch

It would be interesting to see how it performs under average load. If it is better than, or at least as good as, the current algorithm we use, I think there is no reason why we shouldn't replace it with your code. If not, it might be wise to simply find a way to use it when we know we have a great number of elements in the URL.

  2. Anonymous

I have a working version on the experimental branch that is fully iterative and passes all tests. It uses a slightly different algorithm: it scans the URL from left to right, checking not only the path items but also the slashes. This helps to disambiguate cases such as the trailing slash, and also any kind of partial URL match.
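Without having seen the experimental-branch code, a slash-aware left-to-right scan might look roughly like this (an assumption-laden sketch: `map_path` and the trailing-slash-to-`index` rule are guesses at the behavior described above):

```python
def map_path(root, path):
    # Iterative, left-to-right mapping of a URL to an object, examining
    # the slashes as well as the path segments (sketch only).
    node = root
    for segment in [s for s in path.split("/") if s]:
        node = getattr(node, segment, None)
        if node is None:
            return None  # stop at the first invalid segment
    if path.endswith("/"):
        # Trailing-slash disambiguation: '/dir/' maps to dir.index
        node = getattr(node, "index", None)
    return node

class Sub:
    def index(self):
        return "sub index"

class Root:
    sub = Sub()

root = Root()
print(map_path(root, "/sub/")())       # -> sub index
print(map_path(root, "/sub/missing"))  # -> None
```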
