Used memory size detection

Issue #385 duplicate
Former user created an issue

Using a Proxmox 4 container (LXC):

In /var/log/monit.log:

[CEST Jun  2 18:03:13] error    : 'test-host' mem usage of 429496729600.0% matches resource limit [mem usage>1.0%]

free -m outputs:

             total       used       free     shared    buffers     cached
Mem:          4096        263       3832        446          0        166
-/+ buffers/cache:         97       3998
Swap:         1024        114        909

Swap usage percent is OK.

Comments (13)

  1. Tildeslash repo owner

    Can you please post the output of the following commands?

    monit -V
    monit status | head -1
    file <path>/monit
    uname -m
    
  2. Richard Bergoin

    Sure:

    monit -V

    This is Monit version 5.17.1
    Built with ssl, with pam and with large files
    Copyright (C) 2001-2016 Tildeslash Ltd. All Rights Reserved.

    monit status | head -1

    The Monit daemon 5.17.1 uptime: 4h 13m

    which monit

    /usr/bin/monit

    file /usr/bin/monit

    /usr/bin/monit: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=41e68b09fffd89c31acafba270e81d20fb931bf1, stripped

    I've compiled (using commands given in issue #368).

  3. Tildeslash repo owner

    Thanks for the data.

    Please also send the output of the following command:

    cat /proc/meminfo
    

    And send your monit log to support@mmonit.com (the logging can be enabled using the "set logfile" statement).

  4. Richard Bergoin
    # cat /proc/meminfo
    MemTotal:        4194304 kB
    MemFree:         4003972 kB
    MemAvailable:    4003972 kB
    Buffers:               0 kB
    Cached:           137920 kB
    SwapCached:            0 kB
    Active:         14047320 kB
    Inactive:       14762316 kB
    Active(anon):     705900 kB
    Inactive(anon):   763796 kB
    Active(file):   13341420 kB
    Inactive(file): 13998520 kB
    Unevictable:        3524 kB
    Mlocked:            3524 kB
    SwapTotal:       1048576 kB
    SwapFree:         931808 kB
    Dirty:               168 kB
    Writeback:             0 kB
    AnonPages:        978368 kB
    Mapped:           316744 kB
    Shmem:            442228 kB
    Slab:            1105440 kB
    SReclaimable:    1021612 kB
    SUnreclaim:        83828 kB
    KernelStack:       13184 kB
    PageTables:        57328 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
    CommitLimit:    20613692 kB
    Committed_AS:    8551720 kB
    VmallocTotal:   34359738367 kB
    VmallocUsed:      398312 kB
    VmallocChunk:   34358947836 kB
    HardwareCorrupted:     0 kB
    AnonHugePages:         0 kB
    CmaTotal:              0 kB
    CmaFree:               0 kB
    HugePages_Total:       0
    HugePages_Free:        0
    HugePages_Rsvd:        0
    HugePages_Surp:        0
    Hugepagesize:       2048 kB
    DirectMap4k:      339800 kB
    DirectMap2M:    21575680 kB
    DirectMap1G:    13631488 kB
    
  5. Tildeslash repo owner

    Thanks for the data.

    It seems that the LXC container has a bug: the free memory as presented in /proc/meminfo exceeds the physical memory size. LXC probably masks the real memory size and presents just the memory assigned to the container as MemTotal, but passes other system-wide statistics through unmodified.

    Monit calculates the free memory this way: MemFree + Buffers + Cached + SReclaimable

    Snip from your /proc/meminfo:

    MemTotal:        4194304 kB
    MemFree:         4003972 kB
    Buffers:               0 kB
    Cached:           137920 kB
    SReclaimable:    1021612 kB
    

    which gives:

    MemFree + Buffers + Cached + SReclaimable = 4003972 + 0 + 137920 + 1021612 = 5163504 free
    (i.e. there is more free memory than the memory size: MemTotal = 4194304, of which 5163504 is "free")
    

    The "free" utility doesn't count the SReclaimable value as free memory (it represents slab memory that can be freed under high memory pressure and thus shouldn't be counted as used memory), which is why its output doesn't show the anomaly.

    I searched to see whether the LXC project is aware of the problem, and it seems there is a fix for it: https://bugzilla.redhat.com/show_bug.cgi?id=1300781

  6. Richard Bergoin

    I have an update on this, still with Proxmox (now v5), but now using ZFS: the ZFS ARC size of the container's host is exposed to the containers, so I am hitting this percentage issue again.

    container $ cat /proc/meminfo |grep -E "MemTotal|MemFree|Buffers|Cached|SReclaimable"
    MemTotal:        2097152 kB
    MemFree:         1624392 kB
    Buffers:               0 kB
    Cached:            84424 kB
    SwapCached:            0 kB
    SReclaimable:          0 kB
    
    container $ cat /proc/spl/kstat/zfs/arcstats |grep ^size
    size                            4    33602994120
    
    host $ cat /proc/spl/kstat/zfs/arcstats |grep ^size
    size                            4    33601860040
    

    How can I prevent the ZFS ARC size from being counted as part of the free memory?

  7. Tildeslash repo owner

    I'm sorry, it's again a third-party bug (the same type as the SReclaimable problem): the container sees the system-wide value, which, compared with the physical memory reported inside the container, gives a wrong result.

    Please report this along with /proc/meminfo to Proxmox/LXC. There is no easy way to detect from the Monit side which memory statistic is wrong (short of fragile, platform-version-specific tests to work around the third-party bug).

  8. Richard Bergoin

    As both the host and the container are using ZFS, I don't know whether it is possible to bind /proc in a way that removes the ARC size...

    So I patched Monit to add a monitrc setting to ignore it: https://bitbucket.org/kenji_21/monit/commits/830a07c102024f31580eb656cc0eb9f184a0c1b5 (edit: fixed a warning caused by missing parentheses around a bitmask)

    It may also be useful to people who want to keep 15% of memory free (with ARC cache usage configured accordingly) and be alerted when memory usage goes above 90%.

    Are these changes "eligible" for a merge request?
