RSPAMD not learning spam

Issue #974 resolved
Greg created an issue

I keep junking the same email, but RSPAMD keeps letting new copies of it through. It marks each one as “BAYES_HAM”, giving it -3 points, even though I’ve junked this email probably 10 times now.

Why does it not learn? I’m using Poste 2.3.10 (which is running RSPAMD 3.4). According to the stats in the dashboard, RSPAMD appears to be learning. What gives?

Comments (11)

  1. Greg reporter

    This BAYES_HAM is happening with many different spam messages. Not all, but many. There are several emails that consistently get through and are marked BAYES_HAM no matter how many times I junk them. It is really frustrating.

  2. SH repo owner

    Do you have any idea why this is happening? The last thing I've done is update the script to make sure that moving mail to or from the spam folder is logged. Please check the logs and see whether any message appears there.

  3. Greg reporter

    No, I have no idea why. How do I check the logs?

    Which version did you add the logging in btw? Is it in “Version 2.3.11 FREE # 2085”?

  4. Greg reporter

    Is it the file /var/log/rspamd/rspamd.log? I can keep an eye on that; again, I’m using version 2.3.11. Next time one of these messages comes through I’ll try to see what that log says when I junk it.

    If it’s some other log file let me know!

  5. Greg reporter

    Here’s what appeared in that log when I junked one of the emails that RSPAMD refuses to junk:

    2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_controller_check_password: allow unauthorized connection from a trusted IP 127.0.0.1
    2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_message_parse: loaded message; id: <4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in>; queue-id: <undef>; size: 7268; checksum: <4da82cb707fc7da7f5af0816c99ccb72>
    2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_redis_connected: skip obtaining bayes tokens for BAYES_HAM of classifier bayes: not enough learns 3; 200 required
    2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_stat_classifiers_process: skip statistics as HAM class is missing
    2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_controller_learn_fin_task: <127.0.0.1> learned message as spam: 4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_controller_check_password: allow unauthorized connection from a trusted IP 127.0.0.1
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_message_parse: loaded message; id: <4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in>; queue-id: <undef>; size: 7268; checksum: <4da82cb707fc7da7f5af0816c99ccb72>
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_redis_connected: skip obtaining bayes tokens for BAYES_HAM of classifier bayes: not enough learns 3; 200 required
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_stat_classifiers_process: skip statistics as HAM class is missing
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_stat_cache_redis_get: <4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in> has been already learned as spam, ignore it
    2023-04-08 13:35:43 #4480(controller) <cfaeb5>; csession; rspamd_task_process: skip learning: <4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in> has been already learned as spam, ignore it
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_controller_check_password: allow unauthorized connection from a trusted IP 127.0.0.1
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_message_parse: loaded message; id: <4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in>; queue-id: <undef>; size: 7268; checksum: <4da82cb707fc7da7f5af0816c99ccb72>
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_redis_connected: skip obtaining bayes tokens for BAYES_HAM of classifier bayes: not enough learns 3; 200 required
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_stat_classifiers_process: skip statistics as HAM class is missing
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_stat_cache_redis_get: <4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in> has been already learned as spam, ignore it
    2023-04-08 13:35:43 #4480(controller) <9715c5>; csession; rspamd_task_process: skip learning: <4P3ZABRWMJU4.9TITQWGH0YLA3@magic.sohelrana.in> has been already learned as spam, ignore it
    2023-04-08 13:35:44 #4378(main) <d73987>; main; rspamd_control_handler: accepted control connection from /var/lib/rspamd/rspamd.sock
    2023-04-08 13:35:44 #4378(main) <d73987>; main; rspamd_control_connection_close: finished connection from /var/lib/rspamd/rspamd.sock
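
    The log above says the BAYES_HAM class has 3 learns where 200 are required, and that statistics are skipped because the HAM class is missing. A minimal sketch for pulling those numbers out of rspamd.log follows; the function name `learn_shortfalls` and the regular expression are illustrative assumptions based only on the line format shown in this excerpt, not on any rspamd tooling.

```python
import re

# Assumed pattern, written to match the "not enough learns" lines
# exactly as they appear in the log excerpt above.
NOT_ENOUGH = re.compile(
    r"skip obtaining bayes tokens for (?P<symbol>\w+) of classifier (?P<classifier>\w+): "
    r"not enough learns (?P<have>\d+); (?P<need>\d+) required"
)


def learn_shortfalls(log_text):
    """Return {symbol: (learns_so_far, learns_required)} for under-trained classes."""
    shortfalls = {}
    for line in log_text.splitlines():
        match = NOT_ENOUGH.search(line)
        if match:
            shortfalls[match["symbol"]] = (int(match["have"]), int(match["need"]))
    return shortfalls


# Two lines copied from the excerpt above.
SAMPLE_LOG = """\
2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_redis_connected: skip obtaining bayes tokens for BAYES_HAM of classifier bayes: not enough learns 3; 200 required
2023-04-08 13:35:42 #4480(controller) <24cfd0>; csession; rspamd_stat_classifiers_process: skip statistics as HAM class is missing
"""

if __name__ == "__main__":
    for symbol, (have, need) in learn_shortfalls(SAMPLE_LOG).items():
        print(f"{symbol}: {have} of {need} learns")
```

    On the two sample lines this reports BAYES_HAM at 3 of 200 learns, which suggests the ham side of the classifier, not the spam side, is what is short on training.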
    

  6. Greg reporter

    If it would help, I can open an issue on the rspamd github with these logs. I think they want to know what the configuration is though, and I don’t know where to find that because I’ve never installed rspamd (except by proxy via this project).

    Any help greatly appreciated! Help me help you! :)

    (This spam is driving me nuts!)

  7. Dimitar Tanev

    Definitely something is not right, because I have exactly the same experience.

    I’ve moved probably 100+ messages to Junk and yet I still receive them not marked as spam.

  8. Greg reporter

    I updated to 2.3.14 as I noticed that it contained more debug logging for this issue. I junked another one of those messages and saw this in /var/log/rspamd/rspamd.log:

    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_controller_check_password: allow unauthorized connection from a trusted IP 127.0.0.1
    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_message_parse: loaded message; id: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local>; queue-id: <undef>; size: 24621; checksum: <754c522e997fd375fda9d835b3a15fe2>
    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_redis_connected: skip obtaining bayes tokens for BAYES_HAM of classifier bayes: not enough learns 3; 200 required
    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_stat_classifiers_process: skip statistics as HAM class is missing
    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_stat_cache_redis_get: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
    2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_task_process: skip learning: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_controller_check_password: allow unauthorized connection from a trusted IP 127.0.0.1
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_message_parse: loaded message; id: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local>; queue-id: <undef>; size: 24621; checksum: <754c522e997fd375fda9d835b3a15fe2>
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_redis_connected: skip obtaining bayes tokens for BAYES_HAM of classifier bayes: not enough learns 3; 200 required
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_stat_classifiers_process: skip statistics as HAM class is missing
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_stat_cache_redis_get: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
    2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_task_process: skip learning: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_controller_check_password: allow unauthorized connection from a trusted IP 127.0.0.1
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_message_parse: loaded message; id: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local>; queue-id: <undef>; size: 24621; checksum: <754c522e997fd375fda9d835b3a15fe2>
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_mime_part_detect_language: detected part language: en
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_redis_connected: skip obtaining bayes tokens for BAYES_HAM of classifier bayes: not enough learns 3; 200 required
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_stat_classifiers_process: skip statistics as HAM class is missing
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_stat_cache_redis_get: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
    2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_task_process: skip learning: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
    2023-10-18 12:34:33 #4784(main) <e51534>; main; rspamd_control_handler: accepted control connection from /var/lib/rspamd/rspamd.sock
    2023-10-18 12:34:33 #4784(main) <e51534>; main; rspamd_control_connection_close: finished connection from /var/lib/rspamd/rspamd.sock
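
    In this excerpt the same message ID is submitted in three controller sessions within about a second, and each one is skipped as already learned. A rough sketch for tallying that per message ID is below; `learn_outcomes` and both patterns are hypothetical helpers written against the log lines shown in this thread, so other rspamd versions may word these lines differently.

```python
import re
from collections import Counter, defaultdict

# Assumed patterns, modelled on the log lines quoted in this thread.
LEARNED = re.compile(
    r"rspamd_controller_learn_fin_task: <[^>]*> learned message as (?:spam|ham): (?P<id>\S+)"
)
SKIPPED = re.compile(
    r"rspamd_task_process: skip learning: <(?P<id>[^>]+)> has been already learned as (?:spam|ham)"
)


def learn_outcomes(log_text):
    """Count, per message ID, how many learn requests were applied or skipped."""
    outcomes = defaultdict(Counter)
    for line in log_text.splitlines():
        learned = LEARNED.search(line)
        if learned:
            outcomes[learned["id"]]["learned"] += 1
            continue
        skipped = SKIPPED.search(line)
        if skipped:
            outcomes[skipped["id"]]["skipped"] += 1
    return {message_id: dict(counts) for message_id, counts in outcomes.items()}


# The three "skip learning" lines from the excerpt above, one per session.
SAMPLE_LOG = """\
2023-10-18 12:34:30 #4973(controller) <5c4a8d>; csession; rspamd_task_process: skip learning: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
2023-10-18 12:34:31 #4973(controller) <f119f8>; csession; rspamd_task_process: skip learning: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
2023-10-18 12:34:31 #4973(controller) <88aa36>; csession; rspamd_task_process: skip learning: <354a214c-e4fc-4401-83d7-a64de2c7aca6@atl1s11mta404.xt.local> has been already learned as spam, ignore it
"""

if __name__ == "__main__":
    for message_id, counts in learn_outcomes(SAMPLE_LOG).items():
        print(message_id, counts)
```

    Only the `rspamd_task_process` line is counted for skips, because the `rspamd_stat_cache_redis_get` line reports the same event and counting both would double the totals.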
    
