Strange behaviour of the POST service (BB-8245)

Fabian Lenz created an issue

Some things I noticed about the POST service that I consider bugs:

  • Actual POST data is in the format
payload=%7B%22repository%22%3A+%7B%22website%22%3A+%22%22%2C+%22fork%22%3A+false...

which has two issues: a) the client has to strip off the "payload=" part of the message; b) the actual content is URL-encoded, which makes no sense for POST data

  • The triggered message from the POST service will be sent twice to the client (one duplicate after around 20 seconds), see screenshots!

I hope you can resolve these issues, because I have built a small update server for my live system that relies on the POST service. :)

Kind regards

Comments (27)

  1. Dylan Etkin

    Hi Fabian,

    I am not sure how you are evaluating this. I have just tried to reproduce what you describe and am not seeing your issue.

    I received one post that looked like this:

    Time: Mon, 20 Aug 12 13:36:46 -0700
    Source ip: 63.246.22.222
    
    Headers (Some may be inserted by server)
    UNIQUE_ID = UDKf3tBx6hIAAKS7ikQAAAAA
    HTTP_HOST = posttestserver.com
    CONTENT_LENGTH = 1010
    CONTENT_TYPE = application/x-www-form-urlencoded
    HTTP_ACCEPT_ENCODING = gzip, deflate
    HTTP_USER_AGENT = Bitbucket.org
    REMOTE_ADDR = 63.246.22.222
    REMOTE_PORT = 18473
    GATEWAY_INTERFACE = CGI/1.1
    REQUEST_METHOD = POST
    QUERY_STRING = dump&html&dir=dylan
    REQUEST_URI = /post.php?dump&html&dir=dylan
    HTTP_CONNECTION = close
    REQUEST_TIME = 1345495006
    
    Post Params:
    key: 'payload' value: '{"repository": {"website": "", "fork": false, "name": "coffee-script", "scm": "git", "absolute_url": "/detkin/coffee-script/", "owner": "detkin", "slug": "coffee-script", "is_private": false}, "commits": [{"node": "325475d680f3", "files": [{"type": "modified", "file": "README"}], "branch": "master", "utctimestamp": "2012-08-20 20:36:42+00:00", "author": "Dylan Etkin", "timestamp": "2012-08-20 22:36:42", "raw_node": "325475d680f398fa027ef2ef1adc1dfd0993d19b", "parents": ["2c2e016e53ff"], "raw_author": "Dylan Etkin <detkin@dude.staff.sf.atlassian.com>", "message": "blah\n", "size": -1, "revision": null}], "canon_url": "https://bitbucket.org", "user": "detkin"}'
    Empty post body.
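The dump above shows the body arriving as `application/x-www-form-urlencoded` with a single `payload` field, so a standard form parser removes both the `payload=` prefix and the URL encoding in one step. A minimal sketch, assuming Python on the receiving side; the function name `decode_payload` and the shortened sample data are made up for illustration:

```python
import json
from urllib.parse import parse_qs, urlencode


def decode_payload(body: bytes) -> dict:
    """Decode a form-encoded POST body and return the JSON in its 'payload' field."""
    # parse_qs reverses the URL encoding, including '+' for spaces.
    fields = parse_qs(body.decode("utf-8"), strict_parsing=True)
    # Each form field maps to a list of values; only the first 'payload' is used.
    return json.loads(fields["payload"][0])


# Build a sample body the way a form-encoding client would (shortened, made-up data).
sample_body = urlencode(
    {"payload": json.dumps({"repository": {"slug": "coffee-script", "fork": False}, "user": "detkin"})}
).encode("ascii")

event = decode_payload(sample_body)
print(event["repository"]["slug"], event["user"])
```

Most web frameworks do this decoding already and expose the result as a form field, so manual string stripping should rarely be needed.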
    
  2. suwat ch

    Hi, we also ran into the same issue: two POST requests, about 20s apart. It usually happens with POST endpoints that take more than 20s to respond.

  3. CuleX

    It appears, at least for me, that the double pushing only happens if the POST is not acknowledged with a valid HTTP 1.1 200 return code.

  4. Bruce Bjørkhaug

    So to summarize, the POST service retries the request after 20 seconds if it hasn't received the HTTP 1.1 200 return code, and the latter is in these cases not sent until the receiving server has finished processing the request - which in my case tends to take a little longer than 20 seconds during large jobs.

    I've added the following lines to the top of my script, and so far it seems to solve the problem:

    header("HTTP/1.1 200 OK");
    flush();
    
  5. Michael Kimsal

    This is a problem that needs to be fixed on Bitbucket's end, OR... document on the services POST page that a retry will happen.

    FWIW, the header() trick didn't seem to work on my end - added it, but still seeing a double POST, actually this time in < 20 seconds.

  6. Michael Del Tito

    +1 for adding this to the docs for the POST service. Not only is it unintuitive that a retry attempt is made in this scenario, but this useful feature is lost on those that could benefit from it without documentation.

  7. davidebbo

    I'm sorry, I take that back. I was testing with a deployment that takes less than 20 seconds, so it was not a good test.

    When my handler takes 60 seconds, I see a second request coming in after 20 seconds, so as far as I can see, the behavior is unchanged.

  8. Erik van Zijst
    • changed status to open

    That's bad. I have not seen this myself, so I'll reopen this.

    Not sure it'll get the highest priority though, now that we know that the chance of it happening can be minimized by the remote end responding quickly.

  9. Erik van Zijst

    davidebbo, I had another look and I can confirm that when the remote server fails to respond within 20 seconds, the broker will retry the request. It cannot wait for the remote server to respond indefinitely, as that would exhaust our broker resources. Instead, it treats the POST as failed and retries once.
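The behaviour described here (give up after 20 seconds, treat the POST as failed, retry once) can be modelled roughly as follows. This is only an illustration of the logic, not Bitbucket's actual broker code; `deliver`, `send`, `timeout` and `max_retries` are hypothetical names, and the network call is replaced by an injectable function so the sketch runs on its own:

```python
from typing import Callable


def deliver(send: Callable[[bytes, float], int], body: bytes,
            timeout: float = 20.0, max_retries: int = 1) -> bool:
    """Try to deliver body; on timeout or non-200, retry up to max_retries times."""
    for _attempt in range(1 + max_retries):
        try:
            status = send(body, timeout)
        except TimeoutError:
            # No response in time: the sender cannot tell a slow handler
            # from a lost request, so the attempt counts as failed.
            continue
        if status == 200:
            return True
    return False


calls = []


def slow_receiver(body: bytes, timeout: float) -> int:
    """Stand-in for a handler that never answers within the timeout."""
    calls.append(body)
    raise TimeoutError("handler took longer than the timeout")


delivered = deliver(slow_receiver, b"payload=%7B%7D")
print(delivered, len(calls))
```

With a receiver that always times out, the same body is sent twice, which matches the duplicate request reported in this thread.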

  10. davidebbo

    In our case, the fact that it can take more than 20 seconds is by design, as it's doing expensive things (building, deployment, ...).

  11. Michael Del Tito

    Is there any reason these parameters (retry_timeout, max_retries, etc) could not be made configurable? I think each webhook consumer will have different needs, and it may be a poor choice to make any assumptions about these parameters.

    On the other hand, perhaps a better approach on the consumer side would be to separate the response to the webhook from the actual processing. This way, long running tasks (> 20 seconds) will not cause BB to send the automatic retry.

  12. Erik van Zijst

    "On the other hand, perhaps a better approach on the consumer side would be to separate the response to the webhook from the actual processing. This way, long running tasks (> 20 seconds) will not cause BB to send the automatic retry."

    Yes, that's what I would like to argue for.

    The problem with slow responses is that we simply can't afford to wait that long as it ties up resources (processes and sockets) on our end.

  13. Erik van Zijst

    "In our case, the fact that it can take more than 20 seconds is by design, as it's doing expensive things (building, deployment, ...)."

    One issue with that is that when the response inevitably times out, Bitbucket has no way of distinguishing between, say, a network issue and your server legitimately needing more time to respond.

  14. Michael Del Tito

    There is also no reason to wait until processing finishes to send the response back from the consumer. Since the Bitbucket webhook doesn't care about the success or failure of the processing, it makes sense to simply acknowledge receipt with a 200 and then move on.
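One way to separate the acknowledgement from the processing, as suggested above, is to put the request body on a queue and answer 200 straight away, leaving the slow work to a background thread. A minimal sketch using only the Python standard library; `accept_hook`, `start_worker`, `serve` and the `processed` list are hypothetical names, and the actual build or deployment step is reduced to a placeholder:

```python
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

jobs: "queue.Queue[bytes]" = queue.Queue()
processed = []  # stands in for the results of the slow work


def accept_hook(body: bytes) -> int:
    """Queue the request body for later processing and acknowledge at once."""
    jobs.put(body)
    return 200


def worker() -> None:
    """Process queued bodies one at a time, outside the request/response cycle."""
    while True:
        body = jobs.get()
        processed.append(body)  # placeholder for the build or deployment
        jobs.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = accept_hook(self.rfile.read(length))
        # The response goes out before any of the slow work has run.
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()


def serve(port: int) -> None:
    """Start the worker and serve hook requests until interrupted."""
    start_worker()
    HTTPServer(("", port), HookHandler).serve_forever()
```

Calling `serve(8000)` (the port is arbitrary) would start the listener. An in-memory queue loses pending jobs if the process restarts, so a real deployment would probably want a persistent job queue instead.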

  15. Erik van Zijst

    "In our case, we would be fine if Bitbucket just did a 'fire and forget' on the hook request."

    Yes, I personally think so too and I actually thought I had made it do that, but my fix clearly isn't working.

    I have opened an internal issue for this, but I really do want to stress again that remote servers should not do any heavy lifting in their callback, as there will just be no way for us to tell whether we were successful in delivering the post (and it will show up as false network failures in our logs -- drowning out true network issues).

  16. Michael Del Tito

    I was thinking more about this, and I really think removing the retry is the wrong approach. The retry is there to harden the service in case network transmission fails. I would call this a feature! Removing it unnecessarily just to resolve this (non-)issue seems shortsighted.

    If you need fire-and-forget functionality, it should be handled by properly separating the response from the processing in your application, and always sending a 200 response immediately.
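If the retry stays, a consumer may still receive the same push twice, for example when its 200 response is lost on the way back. A minimal sketch of one way to make the handler tolerate that, keyed on the `raw_node` commit hashes that appear in the payload shown earlier in this thread. It assumes a retry carries the same commits as the first attempt, which the thread does not confirm; `RecentDeliveries` and its TTL are hypothetical:

```python
import time
from typing import Dict, Optional, Tuple


class RecentDeliveries:
    """Remember recently seen deliveries so a retried POST can be skipped."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._seen: Dict[Tuple[str, ...], float] = {}

    def is_duplicate(self, payload: dict, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget entries older than the TTL so the table stays small.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.ttl_seconds}
        # Assumption: a retry carries the same commits, so their hashes identify it.
        key = tuple(commit["raw_node"] for commit in payload.get("commits", []))
        if key in self._seen:
            return True
        self._seen[key] = now
        return False


recent = RecentDeliveries(ttl_seconds=300.0)
push = {"commits": [{"raw_node": "325475d680f398fa027ef2ef1adc1dfd0993d19b"}]}
first = recent.is_duplicate(push, now=0.0)     # first delivery
retry = recent.is_duplicate(push, now=20.0)    # retry about 20 seconds later
later = recent.is_duplicate(push, now=400.0)   # after the TTL has passed
print(first, retry, later)
```

Pushes without any commits all share the same empty key here, so a real handler would likely add the repository slug or another field to the key.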
