EA6700 VPN “e” build completely buggy/unstable/boot-looping since the repo changes after 07.01.2021
Yesterday I updated from a relatively well-working trunk build from 07.01.2021 to the current trunk from 25.02.2021, which immediately ended in a boot loop; only an NVRAM clear recovered the device. The only change I made was compiling in WireGuard this time.
After that I used a 2nd router for testing and started fresh with a complete NVRAM clear. While making the basic configuration by hand via the web interface across multiple menus, the router rebooted sporadically without any reason. My first idea was that WireGuard was the issue, so I compiled a new version without WireGuard, and I could still reproduce all the random reboots and issues in many different menus. Basically, simple things like changing the IP address of br1 result in crashes, resets, or a new boot loop.
Enabling IPv6 later, after most of the other settings have been made, triggers the sporadic resets even more often. Many times I land in the CFE recovery interface after such crashes.
Currently I can reproduce these issues on 3 identical EA6700 units. I have no serial cable attached at the moment, so I had no chance to get more details about the crashes. I assume this also holds for the EA6900 and maybe other models, too.
My assumption is that the cause is the added WireGuard support/kernel changes and the NEW QoS implementation?
Possibly the latest Wi-Fi driver changes for SDK6?
The interesting thing is that the sporadic reboots and crashes mostly start 10-40 seconds after the last change was made in the web interface. It looks like the crash starts when some things are reloaded with a delay via “rc” in a chain.
- When coming from a 07.01.2021 trunk build and just updating to the latest one from 25.01.2021, the existing config should normally be preserved. This is not possible: after upgrading, the router goes into an endless boot loop and the entire pre-existing config is lost, ending with a completely cleared NVRAM and the default IP 192.168.1.1. I can reproduce this on 3 identical EA6700 units. Having WireGuard active or not makes no difference. Simple changes like configuring WLAN settings in the basic config, or changing or creating a virtual WLAN, have the same effect: sporadic reboots, mostly after 20-30 seconds.
A successful step-by-step full configuration via the web interface, until all settings are back at the previously used working config, is ultimately no longer possible. Every attempt to import a previously saved backup results in the same boot loops as when just upgrading from the web interface.
Steps to reproduce:
NVRAM clean → initial setup:
Basically, just create a 2nd bridge and set its IPs, change the WAN protocol to PPPoE and fill in all data.
In the VLAN setup, just create a br1 VLAN, give it a free tag and change some ports to handle br0/br1 as usual.
Enable IPv6 for the PPPoE WAN and include support for br1, too.
In the admin interface, enable remote access, set an access list and finally configure and enable SNMP.
Last but not least, define 2 UDP forward rules so that some more internal SNMP devices are reachable from external sources.
In the Advanced section, just change the country and country revision and make the usual tunings for the radios.
Finally, create a guest network on br1 and set its own encryption.
The trick is: just try to get to this point WITHOUT any crashes in the middle of the configuration.
The only known “legal” reboots should occur when changing the WLAN country/region AND when changing the VLAN table, but not when simply configuring firewall settings, defining a DDNS service or just changing a MAC address for one of the virtual wireless interfaces. Normally this results in a short outage of 1-3 seconds in the local LAN while services are reloaded, but NOT in a complete unclean reset/crash that lands immediately in the CFE rescue interface!
Comments (11)
-
reporter I had the chance to plug a serial console terminal into one of the EA6700 units, and now I have some info:
Beforehand I compiled a version checked out at “9825ea0”, right before the WLAN driver was updated, to check whether the issue comes from the newer driver.
That was not the case here. Here is the crash log from right after I enabled IPv6 and waited a few seconds after saving the changed config:
TxBeamforming not supported for eth1
wlconf_pre(0x0099): set vhtmode 0 for eth1
TxBeamforming supported for eth2 - corerev: 42
wlconf_pre(0x007c): txbf_bfr_cap for eth2 = 1
wlconf_pre(0x007d): txbf_bfe_cap for eth2 = 1
wlconf_pre(0x0084): set vhtmode 1 for eth2
vlan1: cmInternal error: Oops: 5 [#1] PREEMPT SMP
last sysfs file: /sys/class/net/br0/bridge/stp_state
module: xt_length bf679000 769
module: nf_conntrack_ipv6 bf664000 9463
module: ebtable_filter bf5d7000 1061
module: ebtables bf5ce000 15643
module: ip6table_mangle bf4f0000 934
module: ip6table_filter bf4ea000 750
module: wl bf09d000 4201685
module: dpsta bf094000 12782
module: ehci_hcd bf086000 32414
module: usbcore bf062000 103389
module: nf_nat_pptp bf05c000 1602
module: nf_conntrack_pptp bf056000 3355
module: nf_nat_proto_gre bf050000 887
module: nf_conntrack_proto_gre bf04a000 3308
module: nf_nat_ftp bf044000 1144
module: nf_conntrack_ftp bf03d000 4909
module: nf_nat_h323 bf036000 4761
module: nf_conntrack_h323 bf028000 33807
module: et bf011000 65158
module: igs bf009000 11927
module: emf bf000000 15397
Modules linked in: xt_length nf_conntrack_ipv6 ebtable_filter ebtables ip6table_mangle ip6table_filter wl(P) dpsta(P) ehci_hcd usbcore nf_nat_pptp nf_conntrack_pptp nf_nat_proto_gre nf_conntrack_proto_gre nf_nat_ftp nf_conntrack_ftp nf_nat_h323 nf_conntrack_h323 et(P) igs(P) emf(P) [last unloaded: ip6t_REJECT]
CPU: 0 Tainted: P (2.6.36.4brcmarm #1)
PC is at ipv6_add_addr+0xd0/0x36c
LR is at ipv6_add_addr+0xa8/0x36c
pc : [<c029f9e4>] lr : [<c029f9bc>] psr: 60000013
sp : cf825d80 ip : 00000080 fp : cdda6800
r10: 000080fe r9 : 00000000 r8 : ffb3f84a
r7 : 00000020 r6 : 00000040 r5 : cddff800 r4 : cf825e18
r3 : 8af7e24a r2 : c0440bd8 r1 : c0440bd8 r0 : 75449afe
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c53c7d Table: 9f95c04a DAC: 00000015
Process preinit (pid: 1, stack limit = 0xcf824270)
Stack: (0xcf825d80 to 0xcf826000)
5d80: cddff800 00000001 cdda6800 cdda6800 cddff800 cf825e18 cddff800 00000001
5da0: 00000000 00000014 cf1a550c c02a2850 00000080 c029f068 cdda6800 cdda6800
5dc0: cddff800 c02a42bc 00000000 00000000 00000001 c03f3588 00000001 cf825e2f
5de0: cf825e30 c017e6d0 cf1a5500 00000000 00000000 cdda0000 cdda6800 c043ca9c
5e00: 00000000 00000000 cdda6800 00000001 00000014 c017e7e8 000080fe 00000000
5e20: ffb3f84a 75449afe cf825e28 fffffff1 c04022c0 00000000 cdda6800 00000001
5e40: 00000000 00000014 cf1a550c c0081424 cdda6800 00000201 00000000 00008914
5e60: 00000000 c008152c 00000000 00000201 00000000 c01feed4 cdda6800 00001002
5e80: 00000000 c01fef50 bee38b90 cf824000 cf825eb0 c026f814 cf84b000 0000000a
5ea0: cf1a5500 cf825ec0 cdda6800 cf1a550c 6e616c76 00000031 bee38b80 00000000
5ec0: 00001243 00000000 00000000 401210e5 00001243 00000000 00000000 401210e5
5ee0: 00000000 00008914 bee38b90 bee38b90 00000005 c003ec68 cf824000 00000000
5f00: 00001243 c01ec24c cf1afbe0 bee38b90 bee38b90 c00d9844 00000020 cf50c4c8
5f20: cf825f58 00000003 00000005 c00cc860 00000000 cf52b040 00000000 00000005
5f40: cf825f84 c01ec6ec cf52b040 00000000 00000000 c037b66c cf8073c0 00000005
5f60: cf816000 c00c99bc cf1afbe0 bee38b90 00008914 00000005 c003ec68 cf824000
5f80: 00000000 c00d9dcc 00000003 00000000 401e5b94 bee38b90 00008914 00000005
5fa0: 00000036 c003eac0 bee38b90 00008914 00000005 00008914 bee38b90 00000000
5fc0: bee38b90 00008914 00000005 00000036 bee38c68 00000000 00000000 00001243
5fe0: 0005e308 bee38b68 0001b594 401f3aec 60000010 00000005 f7ffffff ffffffff
[<c029f9e4>] (PC is at ipv6_add_addr+0xd0/0x36c)
[<c029f9e4>] (ipv6_add_addr+0xd0/0x36c) from [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8)
[<c02a2850>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a42bc>] (addrconf_notify+0x470/0x7f8)
[<c02a42bc>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
[<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
[<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01feed4>] (__dev_notify_flags+0x2c/0x78)
[<c01feed4>] (__dev_notify_flags+0x2c/0x78) from [<c01fef50>] (dev_change_flags+0x30/0x48)
[<c01fef50>] (dev_change_flags+0x30/0x48) from [<c026f814>] (devinet_ioctl+0x69c/0x754)
[<c026f814>] (devinet_ioctl+0x69c/0x754) from [<c01ec24c>] (sock_ioctl+0x5c/0x250)
[<c01ec24c>] (sock_ioctl+0x5c/0x250) from [<c00d9844>] (do_vfs_ioctl+0x80/0x5d0)
[<c00d9844>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9dcc>] (sys_ioctl+0x38/0x60)
[<c00d9dcc>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
Code: e029300a e595b000 e0233008 e0233000 (e7921103)
d=1---[ end trace e07b33bb9ea4104e ]---
4: Kernel panic - not syncing: Fatal exception in interrupt
Ope[<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c02e65a4>] (panic+0x74/0x1a0)
rat[<c02e65a4>] (panic+0x74/0x1a0) from [<c00426f8>] (die+0x1ac/0x1dc)
ion[<c00426f8>] (die+0x1ac/0x1dc) from [<c0046154>] (__do_kernel_fault+0x64/0x84)
no[<c0046154>] (__do_kernel_fault+0x64/0x84) from [<c0046440>] (do_translation_fault+0x70/0xa8)
t s[<c0046440>] (do_translation_fault+0x70/0xa8) from [<c003e3a4>] (do_DataAbort+0x30/0x9c)
upp[<c003e3a4>] (do_DataAbort+0x30/0x9c) from [<c03a384c>] (__dabt_svc+0x4c/0x60)
ortException stack(0xcf825d38 to 0xcf825d80)
5d20: 75449afe c0440bd8
5d40: c0440bd8 8af7e24a cf825e18 cddff800 00000040 00000020 ffb3f84a 00000000
5d60: 000080fe cdda6800 00000080 cf825d80 c029f9bc c029f9e4 60000013 ffffffff
[<c03a384c>] (__dabt_svc+0x4c/0x60) from [<c029f9e4>] (ipv6_add_addr+0xd0/0x36c)
[<c029f9e4>] (ipv6_add_addr+0xd0/0x36c) from [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8)
[<c02a2850>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a42bc>] (addrconf_notify+0x470/0x7f8)
[<c02a42bc>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
[<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
[<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01feed4>] (__dev_notify_flags+0x2c/0x78)
[<c01feed4>] (__dev_notify_flags+0x2c/0x78) from [<c01fef50>] (dev_change_flags+0x30/0x48)
[<c01fef50>] (dev_change_flags+0x30/0x48) from [<c026f814>] (devinet_ioctl+0x69c/0x754)
[<c026f814>] (devinet_ioctl+0x69c/0x754) from [<c01ec24c>] (sock_ioctl+0x5c/0x250)
[<c01ec24c>] (sock_ioctl+0x5c/0x250) from [<c00d9844>] (do_vfs_ioctl+0x80/0x5d0)
[<c00d9844>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9dcc>] (sys_ioctl+0x38/0x60)
[<c00d9dcc>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
CPU1: stopping
[<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c003e334>] (do_IPI+0x114/0x154)
[<c003e334>] (do_IPI+0x114/0x154) from [<c03a38a8>] (__irq_svc+0x48/0xe8)
Exception stack(0xcf843f98 to 0xcf843fe0)
3f80: c8225fa0 cf8faa00
3fa0: cf843fe0 00000000 cf842000 c04050a8 c03ecb80 c0405214 00000000 413fc090
3fc0: 0000001f 00000000 00000000 cf843fe0 c003fbe4 c003fbe8 60000013 ffffffff
[<c03a38a8>] (__irq_svc+0x48/0xe8) from [<c003fbe8>] (default_idle+0x24/0x28)
[<c003fbe8>] (default_idle+0x24/0x28) from [<c003fd88>] (cpu_idle+0x70/0xa4)
[<c003fd88>] (cpu_idle+0x70/0xa4) from [<00008084>] (0x8084)
Rebooting in 3 seconds..Digital core power voltage set to 0.9375V
Decompressing...done
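When several crash logs like the one above need to be compared, it can help to reduce each one to its call chain first. The following is only a minimal Python sketch under the assumption that the console output was saved as plain text; `FRAME_RE` and `call_chain` are hypothetical helper names, not part of any firmware tooling.

```python
import re

# Matches one backtrace frame of the form seen in the log above, e.g.
# [<c029f9e4>] (ipv6_add_addr+0xd0/0x36c) from [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]{8}>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\) from "
    r"\[<[0-9a-f]{8}>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)

def call_chain(log_text):
    """Return the function names of the first backtrace, innermost frame first."""
    chain = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            if chain:
                break      # the first contiguous backtrace has ended
            continue       # still before the first backtrace
        callee, caller = match.groups()
        if not chain:
            chain.append(callee)
        chain.append(caller)
    return chain

# Three lines copied from the crash log above, plus the line that follows them.
SAMPLE = """PC is at ipv6_add_addr+0xd0/0x36c
[<c029f9e4>] (ipv6_add_addr+0xd0/0x36c) from [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8)
[<c02a2850>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a42bc>] (addrconf_notify+0x470/0x7f8)
Code: e029300a e595b000 e0233008 e0233000 (e7921103)
"""

print(" <- ".join(call_chain(SAMPLE)))
```

Two logs whose chains are identical, even when the absolute addresses differ between builds, most likely hit the same fault site.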
-
Hello, can you compile https://bitbucket.org/pedro311/freshtomato-arm/commits/114022d1f062ecc2cf45a07f6134e2c4eb546d2a with CRASH_LOG=y TOMATO_EXPERIMENTAL=1, see if it crashes, and capture the messages if it does? This is before cake and WireGuard.
Then try with the commit after it.
I’m not sure if your device is SDK6 or SDK7, though. OK, it seems it is SDK6. Can you explain how you enable IPv6 so I can attempt to reproduce the problem with my R7000?
-
reporter Hi. I think you should check this with SDK6 on a typical EA6700-compatible device.
What I always noticed was the SMP error in the crash log. I wonder why it is SMP, because /proc/cpuinfo shows just 1 active CPU, and I am not sure: is SMP really active on this SDK6 device?
I last went back to 455578a, so 1 commit BEFORE WireGuard was implemented. I got the same crashes at the moment when I enabled IPv6 as the last step: set DHCP with prefix delegation, selected /56 and Request PD only, and set 2 static DNS servers.
Accept RA from WAN and enable the IPv6 subnet for br1. Then the router crashes a few seconds later with the above log again. So it is not a WireGuard issue so far…
I noted that reconfiguration of all kinds up to this point is extremely slow. Often the web interface goes offline and the router needs more than 2 minutes to become responsive again.
On the serial console I saw that a huge amount of time is needed for any kind of simple configuration to become active, like adding a virtual wireless interface (took 7 minutes and one extra sporadic reboot). I got RTNETLINK errors for vlan1 multiple times at various reloading stages. The adblock process hung for 5 minutes in “stop” and was not killed; the same happened with dnsmasq “restart”, which also hung, and after a while the device became responsive again.
This is the situation directly after a cold boot now:
Hit ENTER for console...
emf: module license 'Proprietary' taints kernel.
Disabling lock debugging due to kernel taint
et_module_init: passivemode set to 0x0
et_module_init: txworkq set to 0x0
et_module_init: et_txq_thresh set to 0xce4
eth0: Broadcom BCM47XX 10/100/1000 Mbps Ethernet Controller 6.37.14.126 (r561982)
Restoring wireless vars ...
Restoring wireless vars - in progress ...
Restoring wireless vars - in progress ...
Restoring wireless vars - done ...
/ # TxBeamforming not supported for eth1
wlconf_pre(0x0099): set vhtmode 0 for eth1
TxBeamforming supported for eth2 - corerev: 42
wlconf_pre(0x007c): txbf_bfr_cap for eth2 = 1
wlconf_pre(0x007d): txbf_bfe_cap for eth2 = 1
wlconf_pre(0x0084): set vhtmode 1 for eth2
vlan1: cmdInternal error: Oops: 5 [#1] PREEMPT SMP
=14last sysfs file: /sys/class/net/br0/bridge/stp_state
Omodule: wl bf09d000 4201685
permodule: dpsta bf094000 12782
atimodule: ehci_hcd bf086000 32414
on module: usbcore bf062000 103389
notmodule: nf_nat_pptp bf05c000 1602
sumodule: nf_conntrack_pptp bf056000 3355
ppomodule: nf_nat_proto_gre bf050000 887
rtemodule: nf_conntrack_proto_gre bf04a000 3308
d
module: nf_nat_ftp bf044000 1144
module: nf_conntrack_ftp bf03d000 4909
module: nf_nat_h323 bf036000 4761
module: nf_conntrack_h323 bf028000 33807
module: et bf011000 65158
module: igs bf009000 11927
module: emf bf000000 15397
Modules linked in: wl(P) dpsta(P) ehci_hcd usbcore nf_nat_pptp nf_conntrack_pptp nf_nat_proto_gre nf_conntrack_proto_gre nf_nat_ftp nf_conntrack_ftp nf_nat_h323 nf_conntrack_h323 et(P) igs(P) emf(P)
CPU: 1 Tainted: P (2.6.36.4brcmarm #1)
PC is at ipv6_add_addr+0xd0/0x36c
LR is at ipv6_add_addr+0xa8/0x36c
pc : [<c029f848>] lr : [<c029f820>] psr: 60000013
sp : cf825d80 ip : 00000080 fp : cf84f800
r10: 000080fe r9 : 00000000 r8 : ffb3f84a
r7 : 00000020 r6 : 00000040 r5 : cf961e00 r4 : cf825e18
r3 : 8af7e24a r2 : c0440bd8 r1 : c0440bd8 r0 : 75449afe
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c53c7d Table: 9fa5804a DAC: 00000015
Process preinit (pid: 1, stack limit = 0xcf824270)
Stack: (0xcf825d80 to 0xcf826000)
5d80: cf961e00 00000001 cf84f800 cf84f800 cf961e00 cf825e18 cf961e00 00000001
5da0: 00000000 00000014 cf0ab10c c02a26b4 00000080 c029eecc cf84f800 cf84f800
5dc0: cf961e00 c02a4120 00000000 00000000 00000001 c03f3588 00000001 cf825e2f
5de0: cf825e30 c017e534 cf0ab100 00000000 00000000 cf840000 cf84f800 c043ca9c
5e00: 00000000 00000000 cf84f800 00000001 00000014 c017e64c 000080fe 00000000
5e20: ffb3f84a 75449afe cf825e28 fffffff1 c04022c0 00000000 cf84f800 00000001
5e40: 00000000 00000014 cf0ab10c c0081424 cf84f800 00000201 00000000 00008914
5e60: 00000000 c008152c 00000000 00000201 00000000 c01fed38 cf84f800 00001002
5e80: 00000000 c01fedb4 be9e7b90 cf824000 cf825eb0 c026f678 c81c4f00 c0394768
5ea0: cf0ab100 cf825ec0 cf84f800 cf0ab10c 6e616c76 00000031 be9e7b80 00000000
5ec0: 00001243 00000000 00000000 40175fcd 00001243 00000000 00000000 40175fcd
5ee0: 00000000 00008914 be9e7b90 be9e7b90 00000005 c003ec68 cf824000 00000000
5f00: 00001243 c01ec0b0 cfb8b320 be9e7b90 be9e7b90 c00d96d4 00000020 cf4a7660
5f20: cf825f58 00000003 00000005 c00cc6f0 00000000 cf416b60 00000000 00000005
5f40: cf825f84 c01ec550 cf416b60 00000000 00000000 c037b37c cf8073c0 00000005
5f60: cf816000 c00c984c cfb8b320 be9e7b90 00008914 00000005 c003ec68 cf824000
5f80: 00000000 c00d9c5c 00000003 00000000 401b2b94 be9e7b90 00008914 00000005
5fa0: 00000036 c003eac0 be9e7b90 00008914 00000005 00008914 be9e7b90 00000000
5fc0: be9e7b90 00008914 00000005 00000036 be9e7c68 00000000 00000000 00001243
5fe0: 0005d148 be9e7b68 0001a6ec 401c0aec 60000010 00000005 f7f9feff f77dfdff
[<c029f848>] (PC is at ipv6_add_addr+0xd0/0x36c)
[<c029f848>] (ipv6_add_addr+0xd0/0x36c) from [<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8)
[<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a4120>] (addrconf_notify+0x470/0x7f8)
[<c02a4120>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
[<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
[<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01fed38>] (__dev_notify_flags+0x2c/0x78)
[<c01fed38>] (__dev_notify_flags+0x2c/0x78) from [<c01fedb4>] (dev_change_flags+0x30/0x48)
[<c01fedb4>] (dev_change_flags+0x30/0x48) from [<c026f678>] (devinet_ioctl+0x69c/0x754)
[<c026f678>] (devinet_ioctl+0x69c/0x754) from [<c01ec0b0>] (sock_ioctl+0x5c/0x250)
[<c01ec0b0>] (sock_ioctl+0x5c/0x250) from [<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0)
[<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9c5c>] (sys_ioctl+0x38/0x60)
[<c00d9c5c>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
Code: e029300a e595b000 e0233008 e0233000 (e7921103)
---[ end trace f51550813c9668a1 ]---
Kernel panic - not syncing: Fatal exception in interrupt
[<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c02e6408>] (panic+0x74/0x1a0)
[<c02e6408>] (panic+0x74/0x1a0) from [<c00426f8>] (die+0x1ac/0x1dc)
[<c00426f8>] (die+0x1ac/0x1dc) from [<c0046154>] (__do_kernel_fault+0x64/0x84)
[<c0046154>] (__do_kernel_fault+0x64/0x84) from [<c0046440>] (do_translation_fault+0x70/0xa8)
[<c0046440>] (do_translation_fault+0x70/0xa8) from [<c003e3a4>] (do_DataAbort+0x30/0x9c)
[<c003e3a4>] (do_DataAbort+0x30/0x9c) from [<c03a354c>] (__dabt_svc+0x4c/0x60)
Exception stack(0xcf825d38 to 0xcf825d80)
5d20: 75449afe c0440bd8
5d40: c0440bd8 8af7e24a cf825e18 cf961e00 00000040 00000020 ffb3f84a 00000000
5d60: 000080fe cf84f800 00000080 cf825d80 c029f820 c029f848 60000013 ffffffff
[<c03a354c>] (__dabt_svc+0x4c/0x60) from [<c029f848>] (ipv6_add_addr+0xd0/0x36c)
[<c029f848>] (ipv6_add_addr+0xd0/0x36c) from [<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8)
[<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a4120>] (addrconf_notify+0x470/0x7f8)
[<c02a4120>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
[<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
[<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01fed38>] (__dev_notify_flags+0x2c/0x78)
[<c01fed38>] (__dev_notify_flags+0x2c/0x78) from [<c01fedb4>] (dev_change_flags+0x30/0x48)
[<c01fedb4>] (dev_change_flags+0x30/0x48) from [<c026f678>] (devinet_ioctl+0x69c/0x754)
[<c026f678>] (devinet_ioctl+0x69c/0x754) from [<c01ec0b0>] (sock_ioctl+0x5c/0x250)
[<c01ec0b0>] (sock_ioctl+0x5c/0x250) from [<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0)
[<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9c5c>] (sys_ioctl+0x38/0x60)
[<c00d9c5c>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
CPU0: stopping
[<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c003e334>] (do_IPI+0x114/0x154)
[<c003e334>] (do_IPI+0x114/0x154) from [<c03a35a8>] (__irq_svc+0x48/0xe8)
Exception stack(0xc03e1f78 to 0xc03e1fc0)
1f60: 00000000 cf9d5b00
1f80: c03e1fc0 00000000 c03e0000 c04050a8 c03ecb80 c03ecb78 00026ab0 413fc090
1fa0: 0000001f 00000000 00000000 c03e1fc0 c003fbe4 c003fbe8 60000013 ffffffff
[<c03a35a8>] (__irq_svc+0x48/0xe8) from [<c003fbe8>] (default_idle+0x24/0x28)
[<c003fbe8>] (default_idle+0x24/0x28) from [<c003fd88>] (cpu_idle+0x70/0xa4)
[<c003fd88>] (cpu_idle+0x70/0xa4) from [<c0008ca0>] (start_kernel+0x338/0x394)
[<c0008ca0>] (start_kernel+0x338/0x394) from [<00008084>] (0x8084)
Rebooting in 3 seconds..Digital core power voltage set to 0.9375V
The problem starts here, right at the initial stage: “vlan1: cmdInternal error: Oops: 5 [#1] PREEMPT SMP”
By the way, before IPv6 was enabled I got a “vlan1: cmd=14: Operation not Supported” error at every config change.
I am now compiling “42dca15” and will recheck from that point, so next I am excluding all the QoS changes that were made between 07.01.2021 and 13.01.2021.
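Some of the raw words in the two dumps can be decoded by hand. The following is a minimal Python sketch, assuming the SoC runs little-endian (the decoded interface name supports that assumption); `le_word_bytes` is a hypothetical helper name. The values are copied from the dumps above: the two words `6e616c76 00000031` from stack row `5ea0`, register `r10`, and the value `00008914` that recurs on the stack.

```python
import struct

def le_word_bytes(word):
    """Return the 4 bytes of a 32-bit word in little-endian memory order."""
    return struct.pack("<I", word)

# Stack row 5ea0 holds the two words 6e616c76 00000031 in both dumps.
raw_name = le_word_bytes(0x6E616C76) + le_word_bytes(0x00000031)
ifname = raw_name.rstrip(b"\x00").decode("ascii")
print(ifname)  # vlan1

# r10 at the time of the fault is 000080fe in both dumps.
prefix = le_word_bytes(0x000080FE).hex()
print(prefix)  # fe800000, the start of an IPv6 link-local address

# 0x8914 is SIOCSIFFLAGS in linux/sockios.h (set interface flags).
SIOCSIFFLAGS = 0x8914
print(hex(SIOCSIFFLAGS))
```

Read together with the backtrace (sys_ioctl → devinet_ioctl → dev_change_flags → addrconf_notify → addrconf_add_linklocal), this suggests that the fault occurs while an interface-flags ioctl on vlan1 makes the IPv6 stack add a link-local address. It is an interpretation of the dump, not a confirmed root cause.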
-
reporter One step closer:
Starting from commit “0c8c353” (qos new...), the segfaulting begins when IPv6 is enabled! From my findings this should have to do with the “vlan1: cmd=14: Operation not Supported” error, which shows up continuously. With IPv4 only it is ignored, but not when IPv6 is enabled, too.
But there is even more: when setting up virtual wireless interfaces, the newly created devices do not show up in the web interface after saving. So pegging them to br1 is not possible straight away, only after 1-2 minutes have passed AND a new reboot was made. I found the same issue when changing VLANs. At one point the entire VLAN page was empty until I made a 2nd cold reboot after the regular VLAN-settings reboot that follows saving changes.
The next blocking problem is the unbelievably long response time for reloading/reconfiguring services: at least 60-120 seconds for every little change.
I am now going further back, up to the last working version from 07.01.2021, where everything worked without any problems.
-
reporter The next one:
Commits starting from “cd4e11d” still take ages. An example: reconfigure VLANs, and when the usual “reboot” question pops up, it takes 4 minutes before the device starts rebooting. Until then the web server is shut down and the device is only pingable, nothing else. Any other config change is the same: for 3-4 minutes nothing happens, so nothing works on the fly any longer as expected.
-
reporter FINALLY. I now have a conclusion and a new finding:
In my test scenario I had a PPPoE session configured, but no VDSL connection at the testbed. THIS is the reason for the entire stalling of the unit! As long as PPPoE gets no IP address information from outside, a usable configuration with immediate reloads is impossible. After I temporarily switched back from PPPoE → DHCP, the device works as before: every change is committed immediately. Changes to WLAN and VLAN trigger actions without any delay, and reboots are performed the moment I commit them!
Commit cd4e11d is currently the last reliably working version on SDK6 devices for me!
Starting from commit “0c8c353” (qos new...), the segfaulting starts immediately when IPv6 is enabled and results in endless boot loops, so the IPv6 issues come from the new QoS commits made between 08.01.2021 and 11.01.2021.
To reproduce this error, WAN should be configured with PPPoE and IPv6 should be enabled. Furthermore, create a br1 “Guestnet” and configure a 2nd VLAN for br1.
Then enable IPv6 DHCP with prefix delegation, select /56 and Request PD only, and set 2 static DNS servers. Accept RA from WAN and enable the IPv6 subnet for br1.
While these parts are being configured in the web interface, every single step has massive delays as long as there is no active PPPoE connection from the outside, so be patient at every configuration step!
As long as there is no active PPPoE session, “top” shows hanging processes for “adblock stop” and “dnsmasq restart” for many minutes. Maybe some more daemons hang the same way. All tasks are killed/restarted immediately when PPPoE comes up or the WAN interface is switched to DHCP, so this is a blocker when PPPoE is offline!
By the way, the error “vlan1: cmd=14: Operation not Supported” still appears with every commit up to today. I am not sure whether this error has something to do with the IPv6 crash issues, but vlan1 is the default VLAN right after an NVRAM clear, and on a running router this error shows up every time a config change is made and some parts are reloaded.
There is one more recurring error at configuration reloads: “# sed: /etc/tinc/tinc-fw.sh: No such file or directory”. The reason is that I have Tinc compiled in, but completely disabled/unconfigured.
Hope this info helps…
-
Let’s take this step by step please. Thanks for your help, but can you test the following commits one by one to properly narrow down exactly which change is causing this?
- https://bitbucket.org/pedro311/freshtomato-arm/commits/5b55bfbdab440ba609eae67e12b4ce8bb3a618e0 This commit is before all kernel changes. Please confirm that this one works fine with regard to the crashing. The slow response is a different problem, so let’s not confuse things with it now; we will look at it afterwards.
- https://bitbucket.org/pedro311/freshtomato-arm/commits/423a070cd1eab365a29e019ef89f754d7e74f0a6 First kernel change. See whether it crashes here or not.
- https://bitbucket.org/pedro311/freshtomato-arm/commits/ed6758ae65a15e15685ed5815562cb09a9156943 2nd kernel change. See whether it crashes here or not.
- https://bitbucket.org/pedro311/freshtomato-arm/commits/42dca15b495ab7a0c8ecf3f7913c329e6c79ffcb 3rd kernel change. See whether it crashes here or not.
The problem is that commit https://bitbucket.org/pedro311/freshtomato-arm/commits/0c8c35358b7504703600d3e9415d9e486ba345c3 does not contain any kernel changes that could cause this crash. Do you have QoS enabled? If you have QoS disabled, then the code in that commit should not even be executed.
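The four commits above are meant to be tested one by one. If the candidate list were longer, the same narrowing could be done as a bisection. The following is only a sketch in Python, assuming the commits are ordered oldest to newest and that every commit after the first crashing one also crashes. `first_bad` and the `crashes` callback are hypothetical; in practice `crashes` would mean building, flashing, enabling IPv6 and watching the serial console by hand.

```python
def first_bad(commits, crashes):
    """Return the first commit for which crashes(commit) is true, or None.

    commits: list ordered oldest to newest.
    crashes: callable returning True if that build crashes.
    Assumes that once a commit crashes, all later ones crash too.
    """
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if crashes(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo] if lo < len(commits) else None

# Short hashes of the four commits listed above, oldest first.
COMMITS = ["5b55bfb", "423a070", "ed6758a", "42dca15"]

# Purely illustrative outcome, NOT a test result from this issue:
# pretend that everything from the 2nd kernel change onwards crashes.
pretend_bad = {"ed6758a", "42dca15"}
print(first_bad(COMMITS, lambda c: c in pretend_bad))
```

With only four candidates, testing them in order costs about the same; bisection pays off when the range between the last good and first bad build is larger.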
-
reporter Just for now: I don't use BW and QoS, so they are always unconfigured in all my test scenarios.
I will go through the single steps tomorrow and report back. In the meantime: can you check the “SMP” things in the codebase in general? Descriptions etc. say the SDK6 device is “single-core”, and /proc/cpuinfo likewise shows just one CPU, but the web interface states dual-core, and even the core dump/segfault mentions “cpu1” and a “PREEMPT SMP” error. Is there possibly a definition problem with SMP in the WLAN driver while the system itself runs on just 1 CPU? Is it possible that one userland daemon from the “e:” image is using a 2nd core which is actually disabled by the basic configuration?
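On the SMP question, the crash logs themselves give a hint. As far as I know, “PREEMPT SMP” in the oops banner only reflects how the kernel was built and is not an error name, while the “CPU: n” and “CPUn: stopping” lines name CPUs that were actually running. A minimal Python sketch that collects those ids from a saved log; `CPU_RE` and `cpus_seen` are hypothetical names:

```python
import re

# Matches both "CPU: 0 Tainted: ..." and "CPU1: stopping".
CPU_RE = re.compile(r"\bCPU:?\s*(\d+)")

def cpus_seen(log_text):
    """Return the sorted CPU ids mentioned in an oops log."""
    return sorted({int(n) for n in CPU_RE.findall(log_text)})

# Two lines copied from the first crash log in this thread.
SAMPLE = "CPU: 0 Tainted: P (2.6.36.4brcmarm)#1\nCPU1: stopping\n"
print(cpus_seen(SAMPLE))  # [0, 1]
```

Both logs in this thread mention CPU 0 and CPU 1, which suggests that two cores were online when the crash happened.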
-
Can confirm this. The router crashes completely on my side; recovery needed.
Luckily my main router was still running 2020-8 final.
Problem/issue found, see correction:
https://bitbucket.org/M_ars/freshtomato-arm/commits/126315700d27e799aa4b4a8701943cbc4a35d5a3
This issue can be closed, I think.
BR
-
repo owner - changed status to resolved
I forgot to mention that yesterday I also compiled the latest trunk for MIPS/E3000, just without the WireGuard parts, as an “n60o” image.
→ latest network drivers for ARM??? → QoS issues after the latest changes??? → more typos in the java/web interface parts for ARM after the latest changes?