filesystem statistic error with zfs-2.1.1 and kernel 5.13.19
As of zfs-2.1.1 and Linux 5.13.19-2-pve #1 SMP PVE 5.13.19-4 (Mon, 29 Nov 2021 12:10:09 +0100), the /proc/spl/kstat/zfs/<poolname>/io files are no longer available. As a result, Monit cannot read information about filesystems and complains with:
[2021-12-05T12:51:55-0500] error : filesystem statistic error: cannot read /proc/spl/kstat/zfs/rpool/io -- No such file or directory
[2021-12-05T12:51:55-0500] error : Filesystem '/' not mounted
[2021-12-05T12:51:55-0500] error : 'root' unable to read filesystem '/' state
[2021-12-05T12:51:55-0500] info : 'root' trying to restart
The monit summary and status commands report that the filesystem "Does not exist".
The usual filesystem operations, like "ls -l /" or "df -h", work normally.
A listing of /proc/spl/kstat/zfs/rpool/ reveals that the "io" file is indeed no longer present there.
Comments (16)
-
-
reporter Hi Lutz Mader! Are you saying I should try the HEAD version from this repo and see if I still have the problem?
-
here is a sample output of:
# cat /proc/spl/kstat/zfs/rpool/iostats
25 1 0x01 18 4896 34810932099 312628448789680
name type data
trim_extents_written 4 0
trim_bytes_written 4 0
trim_extents_skipped 4 0
trim_bytes_skipped 4 0
trim_extents_failed 4 0
trim_bytes_failed 4 0
autotrim_extents_written 4 0
autotrim_bytes_written 4 0
autotrim_extents_skipped 4 0
autotrim_bytes_skipped 4 0
autotrim_extents_failed 4 0
autotrim_bytes_failed 4 0
simple_trim_extents_written 4 0
simple_trim_bytes_written 4 0
simple_trim_extents_skipped 4 0
simple_trim_bytes_skipped 4 0
simple_trim_extents_failed 4 0
simple_trim_bytes_failed 4 0
-
HEAD won't work; it is still looking for "/io":
} else if (IS(mnt->mnt_type, "zfs")) {
        // ZFS
        inf->filesystem->object.getDiskActivity = _getZfsDiskActivity;
        // Need base zpool name for /proc/spl/kstat/zfs/<NAME>/io lookup:
        snprintf(inf->filesystem->object.key, sizeof(inf->filesystem->object.key), "%s", inf->filesystem->object.device);
        Str_replaceChar(inf->filesystem->object.key, '/', 0);
-
Hello Val,
a "ls" or "cat" of "/proc/spl/kstat/zfs/<POOL>/iostats" and "/proc/spl/kstat/zfs/<POOL>" should be enough.
With regards,
Lutz -
reporter
root@pve1:~# ls -l /proc/spl/kstat/zfs/rpool/
total 0
-rw-r--r-- 1 root root 0 Dec  5 14:57 dmu_tx_assign
-rw-r--r-- 1 root root 0 Dec  5 14:57 iostats
-rw-r--r-- 1 root root 0 Dec  5 14:57 multihost
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x112
-rw-r--r-- 1 root root 0 Dec  7 08:53 objset-0x12
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x129
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x183
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x189
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x36
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x394
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x41c
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x483
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x4a
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x58
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x58c
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x707
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x86
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x8b2
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x8c2
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x909
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0x94
-rw-r--r-- 1 root root 0 Dec  5 14:57 objset-0xab
-rw------- 1 root root 0 Dec  5 14:57 reads
-rw-r--r-- 1 root root 0 Dec  5 14:57 state
-rw-r--r-- 1 root root 0 Dec  5 14:57 txgs
root@pve1:~# cat /proc/spl/kstat/zfs/rpool/iostats
25 1 0x01 18 4896 3219827410 158782342259233
name type data
trim_extents_written 4 0
trim_bytes_written 4 0
trim_extents_skipped 4 0
trim_bytes_skipped 4 0
trim_extents_failed 4 0
trim_bytes_failed 4 0
autotrim_extents_written 4 0
autotrim_bytes_written 4 0
autotrim_extents_skipped 4 0
autotrim_bytes_skipped 4 0
autotrim_extents_failed 4 0
autotrim_bytes_failed 4 0
simple_trim_extents_written 4 0
simple_trim_bytes_written 4 0
simple_trim_extents_skipped 4 0
simple_trim_bytes_skipped 4 0
simple_trim_extents_failed 4 0
simple_trim_bytes_failed 4 0
I have realised just now that the information above is not relevant to the filesystem of my LXC container. My instance of Monit runs within an LXC container, which itself runs within a Proxmox VE hypervisor.
The following is the relevant information about the filesystems of the '102' container:
root@pve1:~# df -h | grep subvol-102
rpool/data/subvol-102-disk-0    40G  9.3G   31G  24% /rpool/data/subvol-102-disk-0
local-zfs-1/subvol-102-disk-0   40G  6.1G   34G  16% /local-zfs-1/subvol-102-disk-0
rpool/data/subvol-102-disk-1    20G  562M   20G   3% /rpool/data/subvol-102-disk-1
-
Here is a strace of what `df -h /` does:
stat("/", {st_mode=S_IFDIR|0755, st_size=24, ...}) = 0
uname({sysname="Linux", nodename="mmonit", ...}) = 0
statfs("/", {f_type=ZFS_SUPER_MAGIC, f_bsize=131072, f_blocks=131072, f_bfree=107218, f_bavail=107218, f_files=27501798, f_ffree=27447840, f_fsid={val=[960432164, 13924911]}, f_namelen=255, f_frsize=131072, f_flags=ST_VALID|ST_NOATIME}) = 0
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26402, ...}) = 0
mmap(NULL, 26402, PROT_READ, MAP_SHARED, 3, 0) = 0x7f4ba3133000
close(3) = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0x2), ...}) = 0
write(1, "Filesystem Si"..., 63Filesystem Size Used Avail Use% Mounted on
) = 63
write(1, "rpool/data/subvol-999-disk-0 1"..., 54rpool/data/subvol-999-disk-0 16G 3.0G 14G 19% /
) = 54
-
Just saw that 5.30.0 is out. I'd love to contribute to fixing this issue for 5.30.1 (or 5.31.0); maybe a more complete strace will be useful:
# df -h /
Filesystem                    Size  Used Avail Use% Mounted on
rpool/data/subvol-999-disk-0   16G  3.1G   13G  19% /
The full strace:
# strace df -h /
execve("/bin/df", ["df", "-h", "/"], 0x7ffd1af0a5c0 /* 22 vars */) = 0
brk(NULL) = 0x5640c163f000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=35243, ...}) = 0
mmap(NULL, 35243, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f88295b0000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260A\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1824496, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f88295ae000
mmap(NULL, 1837056, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f88293ed000
mprotect(0x7f882940f000, 1658880, PROT_NONE) = 0
mmap(0x7f882940f000, 1343488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7f882940f000
mmap(0x7f8829557000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16a000) = 0x7f8829557000
mmap(0x7f88295a4000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7f88295a4000
mmap(0x7f88295aa000, 14336, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f88295aa000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7f88295af540) = 0
mprotect(0x7f88295a4000, 16384, PROT_READ) = 0
mprotect(0x5640bfd9d000, 4096, PROT_READ) = 0
mprotect(0x7f88295e0000, 4096, PROT_READ) = 0
munmap(0x7f88295b0000, 35243) = 0
brk(NULL) = 0x5640c163f000
brk(0x5640c1660000) = 0x5640c1660000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=3036208, ...}) = 0
mmap(NULL, 3036208, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8829107000
close(3) = 0
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2995, ...}) = 0
read(3, "# Locale name alias data base.\n#"..., 3072) = 2995
read(3, "", 3072) = 0
close(3) = 0
openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/", {st_mode=S_IFDIR|0755, st_size=24, ...}) = 0
openat(AT_FDCWD, "/", O_RDONLY|O_NOCTTY) = 3
close(3) = 0
openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "2973 2705 0:52 / / rw,noatime ma"..., 1024) = 1024
read(3, "974 0:287 / /dev/.lxc/sys rw,rel"..., 1024) = 1024
read(3, "- fuse.lxcfs lxcfs rw,user_id=0,"..., 1024) = 1024
read(3, "osuid,noexec,relatime master:3 -"..., 1024) = 1024
read(3, "dev/mqueue rw,relatime - mqueue "..., 1024) = 167
read(3, "", 1024) = 0
lseek(3, 0, SEEK_CUR) = 4263
close(3) = 0
stat("/", {st_mode=S_IFDIR|0755, st_size=24, ...}) = 0
uname({sysname="Linux", nodename="mmonit", ...}) = 0
statfs("/", {f_type=ZFS_SUPER_MAGIC, f_bsize=131072, f_blocks=131072, f_bfree=106395, f_bavail=106395, f_files=27291221, f_ffree=27237152, f_fsid={val=[960432164, 13924911]}, f_namelen=255, f_frsize=131072, f_flags=ST_VALID|ST_NOATIME}) = 0
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26402, ...}) = 0
mmap(NULL, 26402, PROT_READ, MAP_SHARED, 3, 0) = 0x7f88295b2000
close(3) = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0x2), ...}) = 0
write(1, "Filesystem Si"..., 63Filesystem Size Used Avail Use% Mounted on
) = 63
write(1, "rpool/data/subvol-999-disk-0 1"..., 54rpool/data/subvol-999-disk-0 16G 3.1G 13G 19% /
) = 54
close(1) = 0
close(2) = 0
exit_group(0) = ?
+++ exited with 0 +++
-
repo owner The problem is a little bit tricky; I didn't have time to look at it in detail yet.
The ZFS I/O statistics are now available in
/proc/spl/kstat/zfs/<pool>/objset-<hex>
The <hex> is unique for each filesystem in that pool. It is not yet clear to me whether it is possible to get the <hex> for a given filesystem without parsing all objset-* files.
Example for tank zpool:
alpine-arm64:~# cat /proc/spl/kstat/zfs/tank/objset-0x36
40 1 0x01 7 2160 83309813285 113514182925
name type data
dataset_name 7 tank
writes 4 0
nwritten 4 0
reads 4 0
nread 4 0
nunlinks 4 0
nunlinked 4 0
alpine-arm64:~# cat /proc/spl/kstat/zfs/tank/objset-0x10
43 1 0x01 7 2160 83314151160 132322538725
name type data
dataset_name 7 tank/test1
writes 4 0
nwritten 4 0
reads 4 0
nread 4 0
nunlinks 4 0
nunlinked 4 0
I'll look into it.
-
OK, let me give some help/clues. There is a command in the zfs repository to get a path from an id:
fprintf(stderr, "Usage: zfs_ids_to_path [-v] <pool> <objset id> "
    "<object id>\n");
It calls the function zpool_obj_to_path, defined here: https://github.com/openzfs/zfs/blob/f291fa658efd146540b03ce386133632bde237bf/lib/libzfs/libzfs_pool.c#L4737
Then I think I got a nice clue:
/*
 * Convert from a dataset to a objset id. Note that
 * we grab the object number from the inode number.
 */
static int
object_from_path(const char *dataset, uint64_t object,
    zinject_record_t *record)
{
	zfs_handle_t *zhp;

	if ((zhp = zfs_open(g_zfs, dataset, ZFS_TYPE_DATASET)) == NULL)
		return (-1);

	record->zi_objset = zfs_prop_get_int(zhp, ZFS_PROP_OBJSETID);
	record->zi_object = object;

	zfs_close(zhp);

	return (0);
}
-
repo owner Thanks for the information. Using libzfs would probably be problematic though:
1.) we want to limit Monit's dependencies on 3rd-party libraries to the minimum (I'd prefer not to link with libzfs)
2.) the CDDL license may not be fully compatible with Monit's AGPLv3 license
In the worst case, we can use a 'brute force' scan of the objset-<hex> files as mentioned in the previous post.
-
Assuming df is not linked against libzfs, the interesting line in the strace log is the statfs syscall:
statfs("/", {f_type=ZFS_SUPER_MAGIC, f_bsize=131072, f_blocks=131072, f_bfree=106395, f_bavail=106395, f_files=27291221, f_ffree=27237152, f_fsid={val=[960432164, 13924911]}, f_namelen=255, f_frsize=131072, f_flags=ST_VALID|ST_NOATIME}) = 0
-
Oh, sorry, it's the activity, not the usage, that fails.
-
# zfs get objsetid rpool/data/subvol-999-disk-0
NAME                          PROPERTY  VALUE   SOURCE
rpool/data/subvol-999-disk-0  objsetid  105740  -

105740 => 0x19d0c

so:
# cat /proc/spl/kstat/zfs/rpool/objset-0x19d0c
43 1 0x01 7 2160 55671716180 3540084614242925
name type data
dataset_name 7 rpool/data/subvol-999-disk-0
writes 4 35128423
nwritten 4 842467604105
reads 4 8558897
nread 4 41588065809
nunlinks 4 33062
nunlinked 4 32983
-
repo owner - changed status to resolved

Fix Issue #1021: Add support for Linux OpenZFS 2.x I/O statistics → <<cset 52a5fa6e0f53>>
-
👍🏻
-
Hello,
the status information used by Monit in
"/proc/spl/kstat/zfs/<POOL>/io"
was moved to
"/proc/spl/kstat/zfs/<POOL>/iostats"
I think. Could you check this, please?
I have no access to a proper Linux system at the moment.
With regards,
Lutz