Issue #15 resolved

Serious performance problems

Dmitry Akinin
created an issue

In my project we have serious performance problems with Postman queries. We have only ~2k messages in DB, but queries is too slow to use Postman...

Here is some example queries.

Message list generation (page generation took more than 3 seconds): {{{ Time 1.700 SELECT COUNT(*) FROM postman_message LEFT OUTER JOIN postman_message T3 ON (postman_message.thread_id = T3.id) WHERE (postman_message.id IN (SELECT MAX(U0.id) AS pk__max FROM postman_message U0 WHERE (U0.thread_id IS NOT NULL AND (U0.recipient_archived = False AND U0.moderation_status = a AND U0.recipient_id = 201 AND U0.recipient_deleted_at IS NULL)) GROUP BY U0.thread_id ORDER BY NULL) OR ((postman_message.recipient_archived = False AND postman_message.moderation_status = a AND postman_message.recipient_id = 201 AND postman_message.recipient_deleted_at IS NULL) AND T3.id IS NULL))

Time 1.619 SELECT (SELECT COUNT(*) FROM postman_message T WHERE T.thread_id = postman_message.thread_id ) AS count, postman_message.id, postman_message.subject, postman_message.body, postman_message.sender_id, postman_message.recipient_id, postman_message.email, postman_message.parent_id, postman_message.thread_id, postman_message.sent_at, postman_message.read_at, postman_message.replied_at, postman_message.sender_archived, postman_message.recipient_archived, postman_message.sender_deleted_at, postman_message.recipient_deleted_at, postman_message.moderation_status, postman_message.moderation_by_id, postman_message.moderation_date, postman_message.moderation_reason, T4.id, T4.username, T4.first_name, T4.last_name, T4.email, T4.password, T4.is_staff, T4.is_active, T4.is_superuser, T4.last_login, T4.date_joined FROM postman_message LEFT OUTER JOIN postman_message T3 ON (postman_message.thread_id = T3.id) LEFT OUTER JOIN auth_user T4 ON (postman_message.sender_id = T4.id) WHERE (postman_message.id IN (SELECT MAX(U0.id) AS pk__max FROM postman_message U0 WHERE (U0.thread_id IS NOT NULL AND (U0.recipient_archived = False AND U0.moderation_status = a AND U0.recipient_id = 201 AND U0.recipient_deleted_at IS NULL)) GROUP BY U0.thread_id ORDER BY NULL) OR ((postman_message.recipient_archived = False AND postman_message.moderation_status = a AND postman_message.recipient_id = 201 AND postman_message.recipient_deleted_at IS NULL) AND T3.id IS NULL)) ORDER BY postman_message.sent_at DESC, postman_message.id DESC LIMIT 8 }}}

Archive page gereration took more than 30 seconds: {{{ Time 30.079 SELECT (SELECT COUNT(*) FROM postman_message T WHERE T.thread_id = postman_message.thread_id ) AS count, postman_message.id, postman_message.subject, postman_message.body, postman_message.sender_id, postman_message.recipient_id, postman_message.email, postman_message.parent_id, postman_message.thread_id, postman_message.sent_at, postman_message.read_at, postman_message.replied_at, postman_message.sender_archived, postman_message.recipient_archived, postman_message.sender_deleted_at, postman_message.recipient_deleted_at, postman_message.moderation_status, postman_message.moderation_by_id, postman_message.moderation_date, postman_message.moderation_reason, T3.id, T3.username, T3.first_name, T3.last_name, T3.email, T3.password, T3.is_staff, T3.is_active, T3.is_superuser, T3.last_login, T3.date_joined, auth_user.id, auth_user.username, auth_user.first_name, auth_user.last_name, auth_user.email, auth_user.password, auth_user.is_staff, auth_user.is_active, auth_user.is_superuser, auth_user.last_login, auth_user.date_joined FROM postman_message LEFT OUTER JOIN auth_user ON (postman_message.recipient_id = auth_user.id) LEFT OUTER JOIN auth_user T3 ON (postman_message.sender_id = T3.id) LEFT OUTER JOIN postman_message T4 ON (postman_message.thread_id = T4.id) WHERE (postman_message.id IN (SELECT MAX(U0.id) AS pk__max FROM postman_message U0 WHERE (U0.thread_id IS NOT NULL AND ((U0.recipient_archived = True AND U0.moderation_status = a AND U0.recipient_id = 14028 AND U0.recipient_deleted_at IS NULL) OR (U0.sender_archived = True AND U0.sender_deleted_at IS NULL AND U0.sender_id = 14028 ))) GROUP BY U0.thread_id ORDER BY NULL) OR (((postman_message.recipient_archived = True AND postman_message.moderation_status = a AND postman_message.recipient_id = 14028 AND postman_message.recipient_deleted_at IS NULL) OR (postman_message.sender_archived = True AND postman_message.sender_deleted_at IS NULL AND postman_message.sender_id = 14028 )) AND T4.id IS NULL)) ORDER BY postman_message.sent_at DESC, postman_message.id DESC LIMIT 1 }}}

Сan you give me some advice how to configure Postman or (maybe) Mysql to avoid this perfimance promlem? Thanx!

Comments (35)

  1. Dmitry Akinin reporter

    Here they are:

    1	PRIMARY	postman_message	ALL	postman_message_32f69dc	NULL	NULL	NULL	1906	Using where
    1	PRIMARY	T3	eq_ref	PRIMARY	PRIMARY	4	proactions.postman_message.thread_id	1	Using where; Using index
    2	DEPENDENT SUBQUERY	U0	range	postman_message_32f69dc,postman_message_65912a8a	postman_message_65912a8a	5	NULL	1717	Using where
    
    1	PRIMARY	postman_message	ALL	postman_message_6fe0a617,postman_message_32f69dc	NULL	NULL	NULL	1906	Using where; Using filesort
    1	PRIMARY	auth_user	eq_ref	PRIMARY	PRIMARY	4	proactions.postman_message.recipient_id	1	
    1	PRIMARY	T3	eq_ref	PRIMARY	PRIMARY	4	proactions.postman_message.sender_id	1	
    1	PRIMARY	T4	eq_ref	PRIMARY	PRIMARY	4	proactions.postman_message.thread_id	1	Using where; Using index
    3	DEPENDENT SUBQUERY	U0	range	postman_message_6fe0a617,postman_message_32f69dc,postman_message_65912a8a	postman_message_65912a8a	5	NULL	1717	Using where
    2	DEPENDENT SUBQUERY	T	ref	postman_message_65912a8a	postman_message_65912a8a	5	proactions.postman_message.thread_id	222	Using where; Using index
    
  2. Patrick Samson repo owner

    I don't see any obvious reason why your performance is so bad. I can't believe it's related to the amount of records. 2000 is a very low figure for a DBMS, the query planner doesn't even use an index but performs a sequencial scan. I don't have a mysql at hand, but on my postgresql it's the same: a "Seq Scan", but anyway it's expected because I have a very small set of records. If I force it to do an "Index Scan", it can do it fine as well.

    I would suggest other ways to do: - try temporarily to mount a postgresql server (or even a sqlite) and see the difference - ask a DBA to look at the mysql configuration, statistics, etc.

  3. Dmitry Akinin reporter

    Thanks, but we switched on uMessages app from Userena contrib, this application is better for our needs. Thank you very much for your answers and help!

  4. yomguy

    Hi!

    I'm reopening this issue because I also have some performance pb when displaying more than 250 messages in the "per conversation" view.. On a an excellent server, it takes more than 15s for example and I see mysqld working a LOT...

    But, this pb disappears when using the "per messages" view. So, don't you think we could optimize the "per conversation" process to avoid, maybe, some recursive DB lookups ?

    Thanks for this great tool and your support! Yomguy

  5. yomguy

    Of course! Where do you want me exactly to do the "EXPLAIN SELECT ..." and "ANALYSE TABLE postman_message" queries ? I only use connection.queries, devserver with the shell to debug..

  6. Patrick Samson repo owner

    Use any sql client, for instance the "MySQL Query Browser" from "MySQL Tools". But don't expect much from the EXPLAIN, I guess it won't be different from the ones of Dmitry. You're better off trying the suggestions I gave him.

  7. yomguy

    The pb:

    mysql> SELECT (SELECT COUNT(*) FROM `postman_message` T WHERE T.`thread_id` = `postman_message`.`thread_id` ) AS `count`, `postman_message`.`id`, `postman_message`.`subject`, `postman_message`.`body`, `postman_message`.`sender_id`, `postman_message`.`recipient_id`, `postman_message`.`email`, `postman_message`.`parent_id`, `postman_message`.`thread_id`, `postman_message`.`sent_at`, `postman_message`.`read_at`, `postman_message`.`replied_at`, `postman_message`.`sender_archived`, `postman_message`.`recipient_archived`, `postman_message`.`sender_deleted_at`, `postman_message`.`recipient_deleted_at`, `postman_message`.`moderation_status`, `postman_message`.`moderation_by_id`, `postman_message`.`moderation_date`, `postman_message`.`moderation_reason`, T3.`id`, T3.`username`, T3.`first_name`, T3.`last_name`, T3.`email`, T3.`password`, T3.`is_staff`, T3.`is_active`, T3.`is_superuser`, T3.`last_login`, T3.`date_joined`, `auth_user`.`id`, `auth_user`.`username`, `auth_user`.`first_name`, `auth_user`.`last_name`, `auth_user`.`email`, `auth_user`.`password`, `auth_user`.`is_staff`, `auth_user`.`is_active`, `auth_user`.`is_superuser`, `auth_user`.`last_login`, `auth_user`.`date_joined` FROM `postman_message` LEFT OUTER JOIN `auth_user` ON (`postman_message`.`recipient_id` = `auth_user`.`id`) LEFT OUTER JOIN `auth_user` T3 ON (`postman_message`.`sender_id` = T3.`id`) LEFT OUTER JOIN `postman_message` T4 ON (`postman_message`.`thread_id` = T4.`id`) WHERE (`postman_message`.`id` IN (SELECT MAX(U0.`id`) AS `pk__max` FROM `postman_message` U0 WHERE (U0.`thread_id` IS NOT NULL AND ((U0.`recipient_archived` = True AND U0.`moderation_status` = "a" AND U0.`recipient_id` = 14028 AND U0.`recipient_deleted_at` IS NULL) OR (U0.`sender_archived` = True AND U0.`sender_deleted_at` IS NULL AND U0.`sender_id` = 14028 ))) GROUP BY U0.`thread_id` ORDER BY NULL) OR (((`postman_message`.`recipient_archived` = True AND `postman_message`.`moderation_status` = "a" AND `postman_message`.`recipient_id` = 14028 AND `postman_message`.`recipient_deleted_at` IS NULL) OR (`postman_message`.`sender_archived` = True AND `postman_message`.`sender_deleted_at` IS NULL AND `postman_message`.`sender_id` = 14028 )) AND T4.`id` IS NULL)) ORDER BY `postman_message`.`sent_at` DESC, `postman_message`.`id` DESC LIMIT 1
        -> 
        -> ;
    Empty set (31.93 sec)
    
    
  8. yomguy

    Hi Patrick, Did you manage to see where the exponential query could happen? Can you confirm there is no such bug with postgresql and more than 2000 messages? Thanks, Bye, Yomguy

  9. Patrick Samson repo owner

    We need to do some experimentations. Please post here the output of: EXPLAIN SELECT MAX(U0.`id`) AS `pkmax` FROM `postman_message` U0 WHERE (U0.`thread_id` IS NOT NULL AND ((U0.`recipient_archived` = True AND U0.`moderation_status` = "a" AND U0.`recipient_id` = 14028 AND U0.`recipient_deleted_at` IS NULL) OR (U0.`sender_archived` = True AND U0.`sender_deleted_at` IS NULL AND U0.`sender_id` = 14028 ))) GROUP BY U0.`thread_id`

  10. yomguy
    mysql> EXPLAIN SELECT MAX(U0.`id`) AS `pkmax` FROM `postman_message` U0 WHERE (U0.`thread_id` IS NOT NULL AND ((U0.`recipient_archived` = True AND U0.`moderation_status` = "a" AND U0.`recipient_id` = 14028 AND U0.`recipient_deleted_at` IS NULL) OR (U0.`sender_archived` = True AND U0.`sender_deleted_at` IS NULL AND U0.`sender_id` = 14028 ))) GROUP BY U0.`thread_id`;
    +----+-------------+-------+------+----------------------------------------------------------------------------+------+---------+------+------+----------------------------------------------+
    | id | select_type | table | type | possible_keys                                                              | key  | key_len | ref  | rows | Extra                                        |
    +----+-------------+-------+------+----------------------------------------------------------------------------+------+---------+------+------+----------------------------------------------+
    |  1 | SIMPLE      | U0    | ALL  | postman_message_901f59e9,postman_message_fcd09624,postman_message_9a6ed576 | NULL | NULL    | NULL | 3634 | Using where; Using temporary; Using filesort |
    +----+-------------+-------+------+----------------------------------------------------------------------------+------+---------+------+------+----------------------------------------------+
    1 row in set (0.00 sec)
    
  11. Patrick Samson repo owner

    I confirm there is a serious problem with some queries, in particular those related to the conversation mode. The returned result is correct, but the way queries are built doesn't scale. And it has nothing to do with what variant of DBMS is used.

    It is not a simple fix, the ORM queries have to be reworked. I'm working on it, but sorry, it needs some time, please be patient.

  12. yomguy

    I will!

    Unfortunately, I think the query scheme for the conversation mode overpasses my django query building skills (many Qs!). But if you need some tests against your dev branch and my DB, I'm in ;)

    Thanks Patrick

  13. Patrick Samson repo owner

    Here is a procedure to collect some results I need: 1) try to spot a User with a decent number of archived messages, some in thread and some out. If you don't know how, just run the commands below with various ids until you get a handful of rows in the result. Use this id for each "sender_id = " and "recipient_id = " in the commands to run. 2) run the old command (the one in #comment-1988969), without the final "LIMIT 1", note the computing time. 3) run this new command (maybe adjust the syntax for MySql): select * from ( select postman_message.*, t2.c from postman_message join (select max(id) as mx, count(*) as c from postman_message where thread_id is not null and ((recipient_archived = True AND moderation_status = "a" AND recipient_id = 2 AND recipient_deleted_at IS NULL) or (sender_archived = True AND sender_deleted_at IS NULL AND sender_id = 2 )) group by thread_id) t2 on postman_message.id = t2.mx union select *, 0 from postman_message where thread_id is null and ((recipient_archived = True AND moderation_status = "a" AND recipient_id = 2 AND recipient_deleted_at IS NULL) or (sender_archived = True AND sender_deleted_at IS NULL AND sender_id = 2 )) ) t3 order by sent_at desc, id desc verify that it gives the same results as in step 2. 4) post the EXPLAIN of the command in step 3, and the two computing times.

  14. yomguy

    Hi Patrick,

    Here are the results:

    1. my user_id is 1

    2. 31 rows in set (39.22 sec)

    3. 31 rows in set (0.00 sec)

    Same nb of rows but difficult to compare all text message results

    4.

    '''mysql> EXPLAIN select * from ( select postman_message.*, t2.c from postman_message join (select max(id) as mx, count(*) as c from postman_message where thread_id is not null and ((recipient_archived = True AND moderation_status = "a" AND recipient_id = 1 AND recipient_deleted_at IS NULL) or (sender_archived = True AND sender_deleted_at IS NULL AND sender_id = 1 )) group by thread_id) t2 on postman_message.id = t2.mx union select *, 0 from postman_message where thread_id is null and ((recipient_archived = True AND moderation_status = "a" AND recipient_id = 1 AND recipient_deleted_at IS NULL) or (sender_archived = True AND sender_deleted_at IS NULL AND sender_id = 1 )) ) t3 order by sent_at desc, id desc; +----+--------------+-----------------+--------+----------------------------------------------------------------------------+--------------------------+---------+-------+------+----------------------------------------------+

    idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

    +----+--------------+-----------------+--------+----------------------------------------------------------------------------+--------------------------+---------+-------+------+----------------------------------------------+

    1PRIMARY<derived2>ALLNULLNULLNULLNULL31Using filesort
    2DERIVED<derived3>ALLNULLNULLNULLNULL26
    2DERIVEDpostman_messageeq_refPRIMARYPRIMARY4t2.mx1
    3DERIVEDpostman_messageALLpostman_message_901f59e9,postman_message_fcd09624,postman_message_9a6ed576NULLNULLNULL3689Using where; Using temporary; Using filesort
    4UNIONpostman_messagerefpostman_message_901f59e9,postman_message_fcd09624,postman_message_9a6ed576postman_message_9a6ed5765289Using where
    NULLUNION RESULT<union2,4>ALLNULLNULLNULLNULLNULL

    +----+--------------+-----------------+--------+----------------------------------------------------------------------------+--------------------------+---------+-------+------+----------------------------------------------+ 6 rows in set (0.00 sec) '''

  15. Patrick Samson repo owner

    These are excellent results! Falling from 39.22 s to 0.00 s is very nice, indeed. At least it confirms my thoughts and it gives the way to go, but I guess it won't be easy to make the ORM generates such a Sql... there is still work to do.

    I'm thinking at an other boost of performance by creating some indexes, but wondering if it's worth because if we can't measure a difference between 0.00 and 0.00 s, it's going to be difficult :) Do you believe you're able to find a use case with a time > 0 ?

  16. yomguy

    If we get 0.00 s with a codable query with indexes, I think it's worth, whatever the microseconds ;)

    But, yes, maybe we need some fixtures - corresponding to some use cases - to test the load on a very big volume of messages so that the query processing time can be above 0.00. For example, it's seems easy using factory_boy: https://github.com/dnerdy/factory_boy

  17. BigJim

    Any update on the perf issues? I am looking to incorporate this app but I need to know what I'm getting myself into. Thanks, and please let me know if I can help out w/ this in any way!

  18. Alexei Gousev

    I'm also having serious performance issues with ~2500 messages. If getting the ORM to produce the desired SQL isn't easy, would adding a few indexes significantly help to alleviate the problem? Which indexes would you add?

  19. Patrick Samson repo owner

    Unfortunately, it's not a simple matter of dropping some indexes or tuning the QuerySet. It's a complete redesign of the querying and the ORM doesn't have natively what is needed. There are some paths to explore, without certainty which one will succeed. Only one thing is sure: it's a big amount of work, many days in a row. But all my present time is dedicated to projects to make my living,

    Now the good news: someone is ready to support the project with a financial contribution (don't know if I can name him publicly here). It would be a help towards a solution.

    May be there are some other people who think that django-postman is a great application which deserves support to keep following the Django.upgrades and to stay at the highest quality. Please contact me (not in this issue, but by email) if you are one of them. Thanks.

  20. Patrick Samson repo owner

    The problem is solved. The solution is currently evaluated by the sponsor. As a courtesy for his support, the fix will stay available only to him for some time, before to be integrated in the repository.

  21. stard0g101

    Hey there guys,

    Is there any news on this item? I am also considering integrating postman, however would not be doing so if this problem cannot be solved.

    Judging by the queries the LEFT OUTER JOINS are killing the time. Left outers are notorious as they need to traverse the entire set... per record looking for nulls. Better to switch to INNER and.. prefetch_related and select_related will help here.

    I'm happy trying to optimize performance is still required?

  22. Log in to comment