Mixed-up IDs in responses

Issue #22 resolved
Matt Broadstone created an issue

from the mailing list:

Hi, 

I found a problem using invokeRemoteMethodBlocking over TCP connections (QJsonRpcTcpServer and client).

It seems that when two calls overlap (coming from two distinct clients), the IDs of the responses get mixed up:

On the client I see:

Call1, socket1   (send id 10)    (a bit long call)
after some time....
Call2, socket2   (send id 11)     (very fast call)
Response2, socket2 (receive id11) 
Response1, socket1 (receive id11) 

The response to call1 takes the id of the second call.

On the server the logs show a similar situation, but the responses seem to be sent in a different order:
Call1, socket1   (receive id 10)    (a bit long call)
after some time....
Call2, socket2   (receive id 11)     (very fast call)
Response1, socket1 (send id11) 
Response2, socket2 (send id11) 


Here are the real logs:
On the client:

sending( QJsonRpcSocket(0x5e4520) ):  "{"id":188,"jsonrpc":"2.0","method":"ICBox.startRequest","params":[1]}"
......
sending( QJsonRpcSocket(0x5f7158) ):  "{"id":192,"jsonrpc":"2.0","method":"ICBox.keepAliveRequest"}" 
received( QJsonRpcSocket(0x5f7158) ):  "{"id":192,"jsonrpc":"2.0","result":1}"
received( QJsonRpcSocket(0x5e4520) ):  "{"id":192,"jsonrpc":"2.0","result":1}"

Note that two different sockets are used, due to a QTcpSocket limitation in a multithreaded environment.

On the server:

received( QJsonRpcSocket(0x504fb0) ):  "{"id":188,"jsonrpc":"2.0","method":"ICBox.startRequest","params":[1]}"
received( QJsonRpcSocket(0x505030) ):  "{"id":192,"jsonrpc":"2.0","method":"ICBox.keepAliveRequest"}"
sending( QJsonRpcSocket(0x504fb0) ):  "{"id":192,"jsonrpc":"2.0","result":1}" 
sending( QJsonRpcSocket(0x505030) ):  "{"id":192,"jsonrpc":"2.0","result":1}"

Do you have any idea?

best regards

Comments (9)

  1. Matt Broadstone reporter

    @markvime I've created a test case here that should fully emulate the scenario you described in the email. Please take a look and let me know if you disagree that it does what your code does.

    Having said that, I cannot get this to fail.

  2. Marco Vimercati

    Matt, this code seems to work because the two requests are serialized, probably by the event loop. They share the same thread. You can see this here:

    QDEBUG : TestIssue22::testIssue21() start fast method QThread(0x6ac4f8)
    QDEBUG : TestIssue22::testIssue21() returning:  "fast fast"
    QDEBUG : TestIssue22::testIssue21() sending( QJsonRpcSocket(0x6d0860) ):  "{\"id\":1,\"jsonrpc\":\"2.0\",\"result\":\"fast fast\"}"
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x6d1db8) ):  "{\"id\":2,\"jsonrpc\":\"2.0\",\"method\":\"service.slowMethod\",\"params\":[\"slow\"]}"
    QDEBUG : TestIssue22::testIssue21() start slow method QThread(0x6ac4f8)
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x285f8a0) ):  "{\"id\":1,\"jsonrpc\":\"2.0\",\"result\":\"fast fast\"}"
    QDEBUG : TestIssue22::testIssue21() returning:  "slow slow"
    

    You can see that the slow method starts only after the fast method returns. I added some qDebug output before the delay loop.

    You can force this code to fail systematically by adding a shorter delay in the fast method too:

        QString fastMethod(const QString &name)
        {
            qDebug() << "start fast method" << thread();

            // Block in a nested event loop for 500 ms to simulate work.
            QEventLoop loop;
            QTimer::singleShot(500, &loop, SLOT(quit()));
            loop.exec();

            qDebug() << "returning: " << QString("fast %1").arg(name);
            return QString("fast %1").arg(name);
        }
    

    This reproduces the bug and produces the following output (ending in a timeout):

    QDEBUG : TestIssue22::testIssue21() start fast method QThread(0x61c4f8)
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x639128) ):  "{\"id\":2,\"jsonrpc\":\"2.0\",\"method\":\"service.slowMethod\",\"params\":[\"slow\"]}"
    QDEBUG : TestIssue22::testIssue21() start slow method QThread(0x61c4f8)
    QDEBUG : TestIssue22::testIssue21() returning:  "slow slow"
    QDEBUG : TestIssue22::testIssue21() sending( QJsonRpcSocket(0x636ab8) ):  "{\"id\":2,\"jsonrpc\":\"2.0\",\"result\":\"slow slow\"}"
    QDEBUG : TestIssue22::testIssue21() sending( QJsonRpcSocket(0x639128) ):  "{\"id\":2,\"jsonrpc\":\"2.0\",\"result\":\"slow slow\"}"
    QDEBUG : TestIssue22::testIssue21() returning:  "fast fast"
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x27cf7ec) ):  "{\"id\":2,\"jsonrpc\":\"2.0\",\"result\":\"slow slow\"}"
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x2a2f830) ):  "{\"id\":2,\"jsonrpc\":\"2.0\",\"result\":\"slow slow\"}"
    FAIL!  : TestIssue22::testIssue21() 'fastClientSpy.wait(30000)' returned FALSE. ()
    ..\..\..\..\qjsonrpc\tests\auto\issue22\tst_issue22.cpp(168) : failure location
    
  3. Matt Broadstone reporter

    @markvime the requests are indeed processed serially; that is a limitation of qjsonrpc's design. If you log a message just before each request is sent, you'll see that the requests go out at exactly the same time but are processed in the order they are received by the server. If you want to handle multiple requests concurrently for potentially long-running operations, you will need to use the delayedResponse strategy.
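    To make the delayedResponse idea concrete, here is a minimal plain-C++ sketch of the general pattern only; it is not the qjsonrpc API, and `Dispatcher`, `handleSlowRequest`, and `completePending` are invented names for illustration. The point is that the handler parks a reply callback and returns at once, so a slow request never blocks the dispatch of other requests:

    ```cpp
    #include <cassert>
    #include <functional>
    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    // Hypothetical delayed-response dispatcher (NOT qjsonrpc classes):
    // the handler defers the reply instead of computing it inline.
    using Reply = std::function<void(const std::string &)>;

    struct Dispatcher {
        std::vector<std::pair<int, Reply>> pending; // deferred replies by id

        // Park the callback instead of blocking until the result is ready.
        void handleSlowRequest(int id, Reply reply) {
            pending.emplace_back(id, std::move(reply));
        }

        // Later (timer, worker thread, I/O completion) the results go out.
        void completePending() {
            for (auto &p : pending)
                p.second("slow result for id " + std::to_string(p.first));
            pending.clear();
        }
    };

    int main() {
        Dispatcher d;
        std::map<int, std::string> replies; // what each "client" received

        d.handleSlowRequest(1, [&](const std::string &r) { replies[1] = r; });
        replies[2] = "fast result"; // a fast request is answered immediately
        d.completePending();        // the slow reply only arrives afterwards

        // The fast request was never blocked by the slow one, and each
        // reply went to its own requester.
        assert(replies.at(2) == "fast result");
        assert(replies.at(1) == "slow result for id 1");
        return 0;
    }
    ```

    The design point is simply inversion of control: the server returns to its event loop immediately and completes the request later, which is what lets a fast request on another socket overtake a slow one.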

    Where are you placing those received qDebugs? Just in QJsonRpcSocket directly? Duh, you are using QJSONRPC_DEBUG, man it's been a while :)

  4. Matt Broadstone reporter

    Okay, thanks for bearing with me getting a working test case for this. I'll see if I can get a fix in tonight!

  5. Matt Broadstone reporter

    @markvime hey, okay so I pushed some code to the direct-dispatch branch. Can you please check that out and see if that fixes your problem?

  6. Marco Vimercati

    @mbroadstone I tried the direct-dispatch branch and it seems to fix the problem. I understand that you removed the signal/slot mechanism used to deliver the RPC responses. In my pull request #44 I work around the problem in a different way, by passing the requester socket through the signal/slot and filtering the writeData calls in the notify slot. Your solution is probably cleaner, if you are sure you can do without signals/slots.

    Anyway, playing with the test case I noticed this behaviour when the fast method starts first and the slow method request is received while the fast one is waiting in its delay loop: the server waits for the slow method to finish, and only then sends the fast response.

    ********* Start testing of TestIssue22 *********
    Config: Using QtTest library 5.5.1, Qt 5.5.1 (i386-little_endian-ilp32 shared (dynamic) debug build; by MSVC 2013)
    PASS   : TestIssue22::initTestCase()
    QDEBUG : TestIssue22::testIssue21() sending( QJsonRpcSocket(0x26ff994) ):  {"id":2,"jsonrpc":"2.0","method":"service.fastMethod","params":["fast"]}
    QDEBUG : TestIssue22::testIssue21() sending( QJsonRpcSocket(0x2b2f814) ):  {"id":1,"jsonrpc":"2.0","method":"service.slowMethod","params":["slow"]}
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x5709a8) ):  {"id":2,"jsonrpc":"2.0","method":"service.fastMethod","params":["fast"]}
    QDEBUG : TestIssue22::testIssue21() start fast
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x571870) ):  {"id":1,"jsonrpc":"2.0","method":"service.slowMethod","params":["slow"]}
    QDEBUG : TestIssue22::testIssue21() start slow
    QDEBUG : TestIssue22::testIssue21() end slow
    QDEBUG : TestIssue22::testIssue21() sending( QJsonRpcSocket(0x571870) ):  {"id":1,"jsonrpc":"2.0","result":"slow slow"}
    QDEBUG : TestIssue22::testIssue21() end fast
    QDEBUG : TestIssue22::testIssue21() sending( QJsonRpcSocket(0x5709a8) ):  {"id":2,"jsonrpc":"2.0","result":"fast fast"}
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x26ff994) ):  {"id":2,"jsonrpc":"2.0","result":"fast fast"}
    QDEBUG : TestIssue22::testIssue21() received( QJsonRpcSocket(0x2b2f814) ):  {"id":1,"jsonrpc":"2.0","result":"slow slow"}
    

    I understand that this is a borderline case; a fast method certainly should not contain a delay loop, so it should normally return before the slow method starts. But I would like to know your impression: with both calls waiting in a QEventLoop, why does the fast call not finish before the slower one?

  7. Matt Broadstone reporter

    @markvime the fix in direct-dispatch is the way the code used to work. I originally thought that solving this with signals/slots would integrate better with the event loop, but it's really just wasteful in this case (as well as buggy, as you saw). Even your approach was incomplete. What was happening was that the service was connecting (an expensive operation) to the reply signal every single time a request came in, without ever disconnecting that slot. This is why you received a reply on a socket that had already received its data and was left open. The new way should be more performant, and is cleaner; please let me know if you have trouble with it, as I'd like to integrate it while I've still got some time.
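    A toy model may make that failure mode concrete. This is plain C++, not the real qjsonrpc internals; `Socket` and `ReplySignal` are invented stand-ins for a socket and for a reply signal whose connections are never removed:

    ```cpp
    #include <cassert>
    #include <string>
    #include <vector>

    // Invented names modelling the described bug, not qjsonrpc classes.
    struct Socket { std::vector<std::string> received; };

    struct ReplySignal {
        std::vector<Socket *> connections; // connected once per request,
                                           // never disconnected
        void connect(Socket *s) { connections.push_back(s); }
        void emitReply(const std::string &json) {
            for (Socket *s : connections)  // broadcast to every connection
                s->received.push_back(json);
        }
    };

    int main() {
        Socket s1, s2;
        ReplySignal reply;
        reply.connect(&s1); // a request arrives on socket 1
        reply.connect(&s2); // a second request arrives on socket 2

        reply.emitReply("{\"id\":192,\"result\":1}"); // meant for s2 only...

        // ...but s1 is still connected, so it receives id 192 too --
        // the cross-socket mixup visible in the logs above.
        assert(s1.received == s2.received);
        assert(s1.received.front() == "{\"id\":192,\"result\":1}");
        return 0;
    }
    ```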

    As for the ordering, my best guess is that the requests are coming in at the same instant (from the thread pool; you can see that everything is on a different thread now). So both handlers end up on the same call stack, but each contains a loop that the server side must block on. If you want a solution that plays better with the server-side event loop you must use delayed responses; that is more performant as well.
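    The stacking behaviour can be modelled without Qt at all. Since QEventLoop::exec() is an ordinary blocking call, nested loops live on the call stack, and the loop entered last must quit first. A minimal sketch of that idea follows; `exec` here is an invented stand-in for a nested event loop, not Qt's API:

    ```cpp
    #include <cassert>
    #include <functional>
    #include <queue>
    #include <string>
    #include <vector>

    std::vector<std::string> finished;          // completion order
    std::queue<std::function<void()>> incoming; // pending "requests"

    // Stand-in for a nested blocking event loop: while "waiting", it
    // dispatches an incoming event, which may start a further nested loop.
    void exec(const std::string &name) {
        if (!incoming.empty()) {
            auto event = incoming.front();
            incoming.pop();
            event(); // blocks this frame until the nested loop returns
        }
        finished.push_back(name); // this loop's quit() finally fires
    }

    int main() {
        // fastMethod enters its delay loop first; while it waits, the
        // slow request arrives and starts a nested loop on the SAME stack.
        incoming.push([] { exec("slow"); });
        exec("fast");

        // The inner (slow) loop must unwind before the outer (fast) one
        // can resume, so "slow" finishes first, as in the log above.
        assert(finished == (std::vector<std::string>{"slow", "fast"}));
        return 0;
    }
    ```

    In other words, the fast call cannot finish before the slower one because its loop.exec() frame is buried under the slow call's frame on the stack.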
