EA6700 vpn e build completely buggy/unstable/boot-looping since the 07.01.2021 repo changes

Issue #93 resolved
TheHiman created an issue

Yesterday I updated from a relatively well-working trunk build from 07.01.2021 to the current trunk from 25.02.2021, which immediately ended in a boot loop; only an NVRAM clear recovered the device. The only change I made was compiling in WireGuard this time.

After that I used a second router for testing and started fresh with a complete NVRAM clear. While doing the basic configuration by hand via the web interface, across multiple menus, the router reboots sporadically for no apparent reason. My first idea was that WireGuard was the issue, so I compiled a new version without WireGuard, and I could reproduce all the random reboots and issues in many different menus. Basically, simple things like changing the IP address of br1 result in crashes, resets, or a new boot loop.

Enabling IPv6 later, after most of the other settings have been made, triggers the sporadic resets even more often. Many times I land in the CFE recovery interface after such crashes.

At the moment I can reproduce these issues on 3 identical EA6700 units. I currently have no serial cable attached, so I had no chance to get more details about the crashes. I assume this also applies to the EA6900 and maybe other models.

My assumption is that the added WireGuard support/kernel changes and the new QoS implementation are responsible?
Or possibly the latest Wi-Fi driver changes for SDK6?

The interesting thing is that the sporadic reboots and crashes mostly start 10-40 seconds after the last change
was made in the web interface. It looks like the crash happens when some things are reloaded, with a delay, via “rc” in a chain.

  • When coming from the 07.01.2021 trunk and just updating to the latest one from 25.01.2021, the existing config should normally be preserved. This is not possible: after upgrading, the router goes into an endless boot loop and the entire existing config is lost, which ends in a completely cleared NVRAM with the default IP 192.168.1.1. I can reproduce this on 3 identical EA6700 units. Having WireGuard active or not makes no difference. Simple changes such as configuring WLAN settings in the basic config, or changing or creating a virtual WLAN, have the same effect: sporadic reboots, mostly after 20-30 seconds.

As a result, it is no longer possible to complete a full step-by-step web-based configuration that brings all settings back to the last working config. Every attempt to import a previously saved backup results in the same boot loops as a plain upgrade from the web interface.

Steps to reproduce:

NVRAM clean → initial setup:

Basically, just create a second bridge and set its IPs, then change the WAN protocol to PPPoE and fill in all data.
In the VLAN setup, just create a br1 VLAN, give it a free tag, and change some ports so that br0/br1 are handled as usual.
Enable IPv6 for the PPPoE WAN and include support for br1, too.
In the admin interface, enable remote access, set an access list, and finally configure and enable SNMP.
Last but not least, define 2 UDP forwarding rules so that some more internal SNMP devices are reachable from external sources.
In the Advanced section, just change the country and country revision and make the usual tunings for the radios.
Finally, create a guest network on br1 and set its own encryption.

The trick is: just try to get to this point WITHOUT any crashes in the middle of the configuration.

The only known “legal” reboots should occur when changing the WLAN country/region AND when changing
the VLAN table, but not when simply configuring firewall settings, defining a DDNS service, or just changing
the MAC address of one of the virtual wireless interfaces. Normally this results in a short outage of 1-3 seconds on the local LAN while services are reloaded, but NOT a complete unclean reset/crash that lands straight in the CFE rescue interface!

Comments (11)

  1. TheHiman reporter

    I forgot to mention that yesterday I also compiled the latest trunk for MIPS/E3000, just without WireGuard, as an “n60o” image.

    • From what I have seen, the MIPS build still works as expected with the latest changes from 25.01.2021 so far. No sporadic reboots/crashes or other surprises here.

    → latest network drivers for ARM? → QoS issues after the latest changes? → more typos in the java/web interface parts for ARM after the latest changes?

  2. TheHiman reporter

    I had the chance to plug a serial console into one of the EA6700s, and now I have some information:

    Before that, I compiled a version checked out at “9825ea0”, right before the WLAN driver was updated, to check whether the issue comes from the newer driver.
    That was not the case.

    Here is the crash log from right after I enabled IPv6 and waited a few seconds after saving the changed config:

    TxBeamforming not supported for eth1
    wlconf_pre(0x0099): set vhtmode 0 for eth1
    TxBeamforming supported for eth2 - corerev: 42
    wlconf_pre(0x007c): txbf_bfr_cap for eth2 = 1
    wlconf_pre(0x007d): txbf_bfe_cap for eth2 = 1
    wlconf_pre(0x0084): set vhtmode 1 for eth2
    vlan1: cmInternal error: Oops: 5 [#1] PREEMPT SMP
    last sysfs file: /sys/class/net/br0/bridge/stp_state
    module: xt_length bf679000 769
    module: nf_conntrack_ipv6 bf664000 9463
    module: ebtable_filter bf5d7000 1061
    module: ebtables bf5ce000 15643
    module: ip6table_mangle bf4f0000 934
    module: ip6table_filter bf4ea000 750
    module: wl bf09d000 4201685
    module: dpsta bf094000 12782
    module: ehci_hcd bf086000 32414
    module: usbcore bf062000 103389
    module: nf_nat_pptp bf05c000 1602
    module: nf_conntrack_pptp bf056000 3355
    module: nf_nat_proto_gre bf050000 887
    module: nf_conntrack_proto_gre bf04a000 3308
    module: nf_nat_ftp bf044000 1144
    module: nf_conntrack_ftp bf03d000 4909
    module: nf_nat_h323 bf036000 4761
    module: nf_conntrack_h323 bf028000 33807
    module: et bf011000 65158
    module: igs bf009000 11927
    module: emf bf000000 15397
    Modules linked in: xt_length nf_conntrack_ipv6 ebtable_filter ebtables ip6table_mangle ip6table_filter wl(P) dpsta(P) ehci_hcd usbcore nf_nat_pptp nf_conntrack_pptp nf_nat_proto_gre nf_conntrack_proto_gre nf_nat_ftp nf_conntrack_ftp nf_nat_h323 nf_conntrack_h323 et(P) igs(P) emf(P) [last unloaded: ip6t_REJECT]
    CPU: 0 Tainted: P (2.6.36.4brcmarm #1)
    PC is at ipv6_add_addr+0xd0/0x36c
    LR is at ipv6_add_addr+0xa8/0x36c
    pc : [<c029f9e4>] lr : [<c029f9bc>] psr: 60000013
    sp : cf825d80 ip : 00000080 fp : cdda6800
    r10: 000080fe r9 : 00000000 r8 : ffb3f84a
    r7 : 00000020 r6 : 00000040 r5 : cddff800 r4 : cf825e18
    r3 : 8af7e24a r2 : c0440bd8 r1 : c0440bd8 r0 : 75449afe
    Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
    Control: 10c53c7d Table: 9f95c04a DAC: 00000015
    Process preinit (pid: 1, stack limit = 0xcf824270)
    Stack: (0xcf825d80 to 0xcf826000)
    5d80: cddff800 00000001 cdda6800 cdda6800 cddff800 cf825e18 cddff800 00000001
    5da0: 00000000 00000014 cf1a550c c02a2850 00000080 c029f068 cdda6800 cdda6800
    5dc0: cddff800 c02a42bc 00000000 00000000 00000001 c03f3588 00000001 cf825e2f
    5de0: cf825e30 c017e6d0 cf1a5500 00000000 00000000 cdda0000 cdda6800 c043ca9c
    5e00: 00000000 00000000 cdda6800 00000001 00000014 c017e7e8 000080fe 00000000
    5e20: ffb3f84a 75449afe cf825e28 fffffff1 c04022c0 00000000 cdda6800 00000001
    5e40: 00000000 00000014 cf1a550c c0081424 cdda6800 00000201 00000000 00008914
    5e60: 00000000 c008152c 00000000 00000201 00000000 c01feed4 cdda6800 00001002
    5e80: 00000000 c01fef50 bee38b90 cf824000 cf825eb0 c026f814 cf84b000 0000000a
    5ea0: cf1a5500 cf825ec0 cdda6800 cf1a550c 6e616c76 00000031 bee38b80 00000000
    5ec0: 00001243 00000000 00000000 401210e5 00001243 00000000 00000000 401210e5
    5ee0: 00000000 00008914 bee38b90 bee38b90 00000005 c003ec68 cf824000 00000000
    5f00: 00001243 c01ec24c cf1afbe0 bee38b90 bee38b90 c00d9844 00000020 cf50c4c8
    5f20: cf825f58 00000003 00000005 c00cc860 00000000 cf52b040 00000000 00000005
    5f40: cf825f84 c01ec6ec cf52b040 00000000 00000000 c037b66c cf8073c0 00000005
    5f60: cf816000 c00c99bc cf1afbe0 bee38b90 00008914 00000005 c003ec68 cf824000
    5f80: 00000000 c00d9dcc 00000003 00000000 401e5b94 bee38b90 00008914 00000005
    5fa0: 00000036 c003eac0 bee38b90 00008914 00000005 00008914 bee38b90 00000000
    5fc0: bee38b90 00008914 00000005 00000036 bee38c68 00000000 00000000 00001243
    5fe0: 0005e308 bee38b68 0001b594 401f3aec 60000010 00000005 f7ffffff ffffffff
    [<c029f9e4>] (PC is at ipv6_add_addr+0xd0/0x36c)
    [<c029f9e4>] (ipv6_add_addr+0xd0/0x36c) from [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8)
    [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a42bc>] (addrconf_notify+0x470/0x7f8)
    [<c02a42bc>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
    [<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
    [<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01feed4>] (__dev_notify_flags+0x2c/0x78)
    [<c01feed4>] (__dev_notify_flags+0x2c/0x78) from [<c01fef50>] (dev_change_flags+0x30/0x48)
    [<c01fef50>] (dev_change_flags+0x30/0x48) from [<c026f814>] (devinet_ioctl+0x69c/0x754)
    [<c026f814>] (devinet_ioctl+0x69c/0x754) from [<c01ec24c>] (sock_ioctl+0x5c/0x250)
    [<c01ec24c>] (sock_ioctl+0x5c/0x250) from [<c00d9844>] (do_vfs_ioctl+0x80/0x5d0)
    [<c00d9844>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9dcc>] (sys_ioctl+0x38/0x60)
    [<c00d9dcc>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
    Code: e029300a e595b000 e0233008 e0233000 (e7921103)
    d=1---[ end trace e07b33bb9ea4104e ]---
    4: Kernel panic - not syncing: Fatal exception in interrupt
    Ope[<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c02e65a4>] (panic+0x74/0x1a0)
    rat[<c02e65a4>] (panic+0x74/0x1a0) from [<c00426f8>] (die+0x1ac/0x1dc)
    ion[<c00426f8>] (die+0x1ac/0x1dc) from [<c0046154>] (__do_kernel_fault+0x64/0x84)
    no[<c0046154>] (__do_kernel_fault+0x64/0x84) from [<c0046440>] (do_translation_fault+0x70/0xa8)
    t s[<c0046440>] (do_translation_fault+0x70/0xa8) from [<c003e3a4>] (do_DataAbort+0x30/0x9c)
    upp[<c003e3a4>] (do_DataAbort+0x30/0x9c) from [<c03a384c>] (__dabt_svc+0x4c/0x60)
    ortException stack(0xcf825d38 to 0xcf825d80)
    5d20: 75449afe c0440bd8

    5d40: c0440bd8 8af7e24a cf825e18 cddff800 00000040 00000020 ffb3f84a 00000000
    5d60: 000080fe cdda6800 00000080 cf825d80 c029f9bc c029f9e4 60000013 ffffffff
    [<c03a384c>] (__dabt_svc+0x4c/0x60) from [<c029f9e4>] (ipv6_add_addr+0xd0/0x36c)
    [<c029f9e4>] (ipv6_add_addr+0xd0/0x36c) from [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8)
    [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a42bc>] (addrconf_notify+0x470/0x7f8)
    [<c02a42bc>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
    [<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
    [<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01feed4>] (__dev_notify_flags+0x2c/0x78)
    [<c01feed4>] (__dev_notify_flags+0x2c/0x78) from [<c01fef50>] (dev_change_flags+0x30/0x48)
    [<c01fef50>] (dev_change_flags+0x30/0x48) from [<c026f814>] (devinet_ioctl+0x69c/0x754)
    [<c026f814>] (devinet_ioctl+0x69c/0x754) from [<c01ec24c>] (sock_ioctl+0x5c/0x250)
    [<c01ec24c>] (sock_ioctl+0x5c/0x250) from [<c00d9844>] (do_vfs_ioctl+0x80/0x5d0)
    [<c00d9844>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9dcc>] (sys_ioctl+0x38/0x60)
    [<c00d9dcc>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
    CPU1: stopping
    [<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c003e334>] (do_IPI+0x114/0x154)
    [<c003e334>] (do_IPI+0x114/0x154) from [<c03a38a8>] (__irq_svc+0x48/0xe8)
    Exception stack(0xcf843f98 to 0xcf843fe0)
    3f80: c8225fa0 cf8faa00
    3fa0: cf843fe0 00000000 cf842000 c04050a8 c03ecb80 c0405214 00000000 413fc090
    3fc0: 0000001f 00000000 00000000 cf843fe0 c003fbe4 c003fbe8 60000013 ffffffff
    [<c03a38a8>] (__irq_svc+0x48/0xe8) from [<c003fbe8>] (default_idle+0x24/0x28)
    [<c003fbe8>] (default_idle+0x24/0x28) from [<c003fd88>] (cpu_idle+0x70/0xa4)
    [<c003fd88>] (cpu_idle+0x70/0xa4) from [<00008084>] (0x8084)
    Rebooting in 3 seconds..Digital core power voltage set to 0.9375V
    Decompressing...done
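
A minimal sketch, not part of the original report, of how the call chain can be pulled out of an oops log like the one above. The regular expression and the helper name `call_chain` are assumptions made for illustration; they only target the `[<addr>] (func+0xoff/0xsize) from [<addr>] (func+0xoff/0xsize)` line shape that this log uses.

```python
import re

# Matches one unwinder line such as:
#   [<c029f9e4>] (ipv6_add_addr+0xd0/0x36c) from [<c02a2850>] (addrconf_add_linklocal+0x48/0xb8)
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\] \((\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+\) from "
    r"\[<([0-9a-f]{8})>\] \((\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+\)"
)


def call_chain(log_text):
    """Return the function names of the first backtrace, innermost first."""
    chain = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if not m:
            continue
        callee, caller = m.group(2), m.group(5)
        if not chain:
            chain.append(callee)
        elif chain[-1] != callee:
            break  # a second, unrelated backtrace has started
        chain.append(caller)
    return chain
```

Fed the log above, the chain starts at `ipv6_add_addr` and ends at `ret_fast_syscall`, passing through `addrconf_notify`, `dev_change_flags` and `sys_ioctl`. The faulting path is therefore an interface-flags ioctl issued from user space, not an interrupt handler.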

  3. TheHiman reporter

    Hi. I think you should check this with SDK6 on a typical EA6700-compatible device.

    What I always noticed in the crash log was the SMP error. I wonder why it says SMP, because /proc/cpuinfo shows just 1 active CPU, and I am not sure:
    is SMP really active on this SDK6 device?

    Most recently I went back to 455578a, so 1 commit BEFORE WireGuard was implemented.

    I got the same crashes at the moment I enabled IPv6 as the last step. I set DHCP with prefix delegation, selected /56 and Request PD only, and set 2 static DNS servers.
    I also accepted RA from WAN and enabled the v6 subnet for br1. The router then crashes a few seconds later with the same log as above. So it is not a WireGuard issue so far…

    I noticed that reconfiguration of any kind up to this point is extremely slow. Often the web interface goes offline and the router needs more than 2 minutes to become responsive again.

    In the serial console I saw that even simple configuration changes need a huge amount of time to become active, like adding a virtual wireless interface (which took 7 minutes and one extra sporadic reboot). I got RTNETLINK errors for vlan1 multiple times at various reloading stages. The adblock process hung for 5 minutes in “stop” and was not killed, and the same happened with dnsmasq “restart”; after a while the device became responsive again.

    This is the situation directly after a cold boot now:

    Hit ENTER for console...

    emf: module license 'Proprietary' taints kernel.
    Disabling lock debugging due to kernel taint
    et_module_init: passivemode set to 0x0
    et_module_init: txworkq set to 0x0
    et_module_init: et_txq_thresh set to 0xce4
    eth0: Broadcom BCM47XX 10/100/1000 Mbps Ethernet Controller 6.37.14.126 (r561982)
    Restoring wireless vars ...
    Restoring wireless vars - in progress ...
    Restoring wireless vars - in progress ...
    Restoring wireless vars - done ...
    / # TxBeamforming not supported for eth1
    wlconf_pre(0x0099): set vhtmode 0 for eth1
    TxBeamforming supported for eth2 - corerev: 42
    wlconf_pre(0x007c): txbf_bfr_cap for eth2 = 1
    wlconf_pre(0x007d): txbf_bfe_cap for eth2 = 1
    wlconf_pre(0x0084): set vhtmode 1 for eth2
    vlan1: cmdInternal error: Oops: 5 [#1] PREEMPT SMP
    =14last sysfs file: /sys/class/net/br0/bridge/stp_state
    Omodule: wl bf09d000 4201685
    permodule: dpsta bf094000 12782
    atimodule: ehci_hcd bf086000 32414
    on module: usbcore bf062000 103389
    notmodule: nf_nat_pptp bf05c000 1602
    sumodule: nf_conntrack_pptp bf056000 3355
    ppomodule: nf_nat_proto_gre bf050000 887
    rtemodule: nf_conntrack_proto_gre bf04a000 3308
    d
    module: nf_nat_ftp bf044000 1144
    module: nf_conntrack_ftp bf03d000 4909
    module: nf_nat_h323 bf036000 4761
    module: nf_conntrack_h323 bf028000 33807
    module: et bf011000 65158
    module: igs bf009000 11927
    module: emf bf000000 15397
    Modules linked in: wl(P) dpsta(P) ehci_hcd usbcore nf_nat_pptp nf_conntrack_pptp nf_nat_proto_gre nf_conntrack_proto_gre nf_nat_ftp nf_conntrack_ftp nf_nat_h323 nf_conntrack_h323 et(P) igs(P) emf(P)
    CPU: 1 Tainted: P (2.6.36.4brcmarm #1)
    PC is at ipv6_add_addr+0xd0/0x36c
    LR is at ipv6_add_addr+0xa8/0x36c
    pc : [<c029f848>] lr : [<c029f820>] psr: 60000013
    sp : cf825d80 ip : 00000080 fp : cf84f800
    r10: 000080fe r9 : 00000000 r8 : ffb3f84a
    r7 : 00000020 r6 : 00000040 r5 : cf961e00 r4 : cf825e18
    r3 : 8af7e24a r2 : c0440bd8 r1 : c0440bd8 r0 : 75449afe
    Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
    Control: 10c53c7d Table: 9fa5804a DAC: 00000015
    Process preinit (pid: 1, stack limit = 0xcf824270)
    Stack: (0xcf825d80 to 0xcf826000)
    5d80: cf961e00 00000001 cf84f800 cf84f800 cf961e00 cf825e18 cf961e00 00000001
    5da0: 00000000 00000014 cf0ab10c c02a26b4 00000080 c029eecc cf84f800 cf84f800
    5dc0: cf961e00 c02a4120 00000000 00000000 00000001 c03f3588 00000001 cf825e2f
    5de0: cf825e30 c017e534 cf0ab100 00000000 00000000 cf840000 cf84f800 c043ca9c
    5e00: 00000000 00000000 cf84f800 00000001 00000014 c017e64c 000080fe 00000000
    5e20: ffb3f84a 75449afe cf825e28 fffffff1 c04022c0 00000000 cf84f800 00000001
    5e40: 00000000 00000014 cf0ab10c c0081424 cf84f800 00000201 00000000 00008914
    5e60: 00000000 c008152c 00000000 00000201 00000000 c01fed38 cf84f800 00001002
    5e80: 00000000 c01fedb4 be9e7b90 cf824000 cf825eb0 c026f678 c81c4f00 c0394768
    5ea0: cf0ab100 cf825ec0 cf84f800 cf0ab10c 6e616c76 00000031 be9e7b80 00000000
    5ec0: 00001243 00000000 00000000 40175fcd 00001243 00000000 00000000 40175fcd
    5ee0: 00000000 00008914 be9e7b90 be9e7b90 00000005 c003ec68 cf824000 00000000
    5f00: 00001243 c01ec0b0 cfb8b320 be9e7b90 be9e7b90 c00d96d4 00000020 cf4a7660
    5f20: cf825f58 00000003 00000005 c00cc6f0 00000000 cf416b60 00000000 00000005
    5f40: cf825f84 c01ec550 cf416b60 00000000 00000000 c037b37c cf8073c0 00000005
    5f60: cf816000 c00c984c cfb8b320 be9e7b90 00008914 00000005 c003ec68 cf824000
    5f80: 00000000 c00d9c5c 00000003 00000000 401b2b94 be9e7b90 00008914 00000005
    5fa0: 00000036 c003eac0 be9e7b90 00008914 00000005 00008914 be9e7b90 00000000
    5fc0: be9e7b90 00008914 00000005 00000036 be9e7c68 00000000 00000000 00001243
    5fe0: 0005d148 be9e7b68 0001a6ec 401c0aec 60000010 00000005 f7f9feff f77dfdff
    [<c029f848>] (PC is at ipv6_add_addr+0xd0/0x36c)
    [<c029f848>] (ipv6_add_addr+0xd0/0x36c) from [<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8)
    [<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a4120>] (addrconf_notify+0x470/0x7f8)
    [<c02a4120>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
    [<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
    [<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01fed38>] (__dev_notify_flags+0x2c/0x78)
    [<c01fed38>] (__dev_notify_flags+0x2c/0x78) from [<c01fedb4>] (dev_change_flags+0x30/0x48)
    [<c01fedb4>] (dev_change_flags+0x30/0x48) from [<c026f678>] (devinet_ioctl+0x69c/0x754)
    [<c026f678>] (devinet_ioctl+0x69c/0x754) from [<c01ec0b0>] (sock_ioctl+0x5c/0x250)
    [<c01ec0b0>] (sock_ioctl+0x5c/0x250) from [<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0)
    [<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9c5c>] (sys_ioctl+0x38/0x60)
    [<c00d9c5c>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
    Code: e029300a e595b000 e0233008 e0233000 (e7921103)
    ---[ end trace f51550813c9668a1 ]---
    Kernel panic - not syncing: Fatal exception in interrupt
    [<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c02e6408>] (panic+0x74/0x1a0)
    [<c02e6408>] (panic+0x74/0x1a0) from [<c00426f8>] (die+0x1ac/0x1dc)
    [<c00426f8>] (die+0x1ac/0x1dc) from [<c0046154>] (__do_kernel_fault+0x64/0x84)
    [<c0046154>] (__do_kernel_fault+0x64/0x84) from [<c0046440>] (do_translation_fault+0x70/0xa8)
    [<c0046440>] (do_translation_fault+0x70/0xa8) from [<c003e3a4>] (do_DataAbort+0x30/0x9c)
    [<c003e3a4>] (do_DataAbort+0x30/0x9c) from [<c03a354c>] (__dabt_svc+0x4c/0x60)
    Exception stack(0xcf825d38 to 0xcf825d80)
    5d20: 75449afe c0440bd8
    5d40: c0440bd8 8af7e24a cf825e18 cf961e00 00000040 00000020 ffb3f84a 00000000
    5d60: 000080fe cf84f800 00000080 cf825d80 c029f820 c029f848 60000013 ffffffff
    [<c03a354c>] (__dabt_svc+0x4c/0x60) from [<c029f848>] (ipv6_add_addr+0xd0/0x36c)
    [<c029f848>] (ipv6_add_addr+0xd0/0x36c) from [<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8)
    [<c02a26b4>] (addrconf_add_linklocal+0x48/0xb8) from [<c02a4120>] (addrconf_notify+0x470/0x7f8)
    [<c02a4120>] (addrconf_notify+0x470/0x7f8) from [<c0081424>] (notifier_call_chain+0x44/0x84)
    [<c0081424>] (notifier_call_chain+0x44/0x84) from [<c008152c>] (raw_notifier_call_chain+0x18/0x20)
    [<c008152c>] (raw_notifier_call_chain+0x18/0x20) from [<c01fed38>] (__dev_notify_flags+0x2c/0x78)
    [<c01fed38>] (__dev_notify_flags+0x2c/0x78) from [<c01fedb4>] (dev_change_flags+0x30/0x48)
    [<c01fedb4>] (dev_change_flags+0x30/0x48) from [<c026f678>] (devinet_ioctl+0x69c/0x754)
    [<c026f678>] (devinet_ioctl+0x69c/0x754) from [<c01ec0b0>] (sock_ioctl+0x5c/0x250)
    [<c01ec0b0>] (sock_ioctl+0x5c/0x250) from [<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0)
    [<c00d96d4>] (do_vfs_ioctl+0x80/0x5d0) from [<c00d9c5c>] (sys_ioctl+0x38/0x60)
    [<c00d9c5c>] (sys_ioctl+0x38/0x60) from [<c003eac0>] (ret_fast_syscall+0x0/0x30)
    CPU0: stopping
    [<c0045000>] (unwind_backtrace+0x0/0xf8) from [<c003e334>] (do_IPI+0x114/0x154)
    [<c003e334>] (do_IPI+0x114/0x154) from [<c03a35a8>] (__irq_svc+0x48/0xe8)
    Exception stack(0xc03e1f78 to 0xc03e1fc0)
    1f60: 00000000 cf9d5b00
    1f80: c03e1fc0 00000000 c03e0000 c04050a8 c03ecb80 c03ecb78 00026ab0 413fc090
    1fa0: 0000001f 00000000 00000000 c03e1fc0 c003fbe4 c003fbe8 60000013 ffffffff
    [<c03a35a8>] (__irq_svc+0x48/0xe8) from [<c003fbe8>] (default_idle+0x24/0x28)
    [<c003fbe8>] (default_idle+0x24/0x28) from [<c003fd88>] (cpu_idle+0x70/0xa4)
    [<c003fd88>] (cpu_idle+0x70/0xa4) from [<c0008ca0>] (start_kernel+0x338/0x394)
    [<c0008ca0>] (start_kernel+0x338/0x394) from [<00008084>] (0x8084)
    Rebooting in 3 seconds..Digital core power voltage set to 0.9375V

    The problem starts here, right at the initial stage: “vlan1: cmdInternal error: Oops: 5 [#1] PREEMPT SMP”
    By the way, before v6 was enabled I got a “vlan1: cmd=14: Operation not Supported error” at every config change.

    I am now compiling “42dca15” and will recheck from this point, so next I exclude all the QoS changes that were made between 7.1.2021 and 13.1.2021.
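
A small sketch, not from the original report, of how a few raw words in the two stack dumps can be read. It assumes the dump prints 32-bit little-endian words; the helper names are hypothetical. The words `6e616c76 00000031` appear in row `5ea0` of both dumps, `000080fe` is the value of `r10` in both register dumps, and `00008914` recurs in the stack and equals the Linux ioctl number `SIOCSIFFLAGS`.

```python
import struct

SIOCSIFFLAGS = 0x8914  # Linux ioctl number for "set interface flags"


def words_to_bytes(words):
    """Pack 32-bit words from a stack dump back into little-endian bytes."""
    return struct.pack("<%dI" % len(words), *words)


def words_to_text(words):
    """Decode stack words as a NUL-terminated ASCII string."""
    return words_to_bytes(words).split(b"\x00", 1)[0].decode("ascii")


# Words copied from row 5ea0 of the stack dumps in this thread.
ifname = words_to_text([0x6E616C76, 0x00000031])   # "vlan1"
# r10 from the register dumps; its first two bytes read as an IPv6 prefix.
prefix = words_to_bytes([0x000080FE])[:2].hex()    # "fe80"
```

Under that assumption the dumps are consistent with a set-flags ioctl on `vlan1` that triggers link-local (`fe80::`) address setup, which matches the `dev_change_flags` and `addrconf_add_linklocal` frames in the backtrace. This only narrows down where the kernel faults, not which commit introduced the fault.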

  4. TheHiman reporter

    One step closer:

    Starting from commit “0c8c353” (qos new...), the segfaulting begins when IPv6 is enabled! From my findings this should have to do
    with the “vlan1: cmd=14: Operation not Supported error”, which shows up continuously. With v4 only it is ignored, but not when v6 is enabled, too.

    But there is even more: when setting up virtual wireless interfaces, the newly created devices don't show up in the web interface after saving. So assigning them to br1
    is not possible straight away, only after 1-2 minutes have passed AND another reboot was made. I saw the same issues when changing VLANs. At one point the entire
    VLAN page was empty until I did a second cold reboot after the regular reboot that follows saving VLAN changes.

    The next blocking problem is the unbelievably long response time for reloading/reconfiguring services: at least 60-120 seconds for every little change.
    I am now going further back, up to the last working version from 7.1.2021, where everything worked without any problems.

  5. TheHiman reporter

    The next one:

    Commits starting from “cd4e11d” still take ages. An example: reconfigure VLANs and wait for the usual “reboot” question to pop up. It took 4 minutes before the device
    started rebooting. Until then the web server was shut down and the device was only pingable, nothing else. Any other config change behaves the same: for 3-4 minutes nothing happens, so nothing works on the fly as expected any longer.

  6. TheHiman reporter

    FINALLY, I have a conclusion and a new finding:

    In my test scenario I had a PPPoE session configured, but no VDSL connection at the testbed. THIS is the reason for the stalling of the entire unit! As long as PPPoE gets no IP address information from outside, a usable configuration with immediate reloads is impossible. After I temporarily switched back from PPPoE → DHCP, the device works as before: every change is committed immediately. Changes to WLAN and VLAN take effect without any delay, and reboots are performed the moment I commit them!

    Commit cd4e11d is currently the last reliably working version on SDK6 devices for me!

    Starting from commit “0c8c353” (qos new...), the segfaulting starts as soon as IPv6 is enabled and results in endless boot loops.

    • So the IPv6 issues actually come from the new QoS commits made between 08.01.2021 and 11.01.2021.

    To reproduce this error, WAN should be configured with PPPoE and IPv6 should be enabled. Furthermore, create a br1 “Guestnet” and configure a second VLAN for br1.
    Then enable IPv6 DHCP with prefix delegation, select /56 and Request PD only, and set 2 static DNS servers. Accept RA from WAN and enable the v6 subnet for br1.

    While these parts are being configured in the web interface, every single step has massive delays as long as there is no active PPPoE session from outside, so be patient
    at every configuration step!

    As long as there is no active PPPoE session, “top” shows hanging processes for “adblock stop” and “dnsmasq restart” for many minutes. Maybe some more daemons hang the same way. All tasks are killed/restarted immediately when PPPoE comes up or the WAN interface is switched to DHCP, so this is a blocker when PPPoE is offline!

    By the way, the error “vlan1: cmd=14: Operation not Supported error” still appears with every commit up to today. I am not sure whether this error has something to do with the v6 crash issues, but VLAN1 is the default VLAN right after an NVRAM clear, and on a running router this error shows up every time a config change is made and some parts are reloaded.

    There is one more recurring error at configuration reloads: “# sed: /etc/tinc/tinc-fw.sh: No such file or directory”. The reason is that I have Tinc compiled in, but completely disabled/unconfigured.

    Hope this info helps…
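
The stalls described in this comment, where “adblock stop” and “dnsmasq restart” hang for minutes while PPPoE is down, are the kind of problem a bounded wait guards against. The sketch below only illustrates that idea. It is not how the firmware's rc handles services (the thread does not show that code), and `run_with_timeout` is a hypothetical helper.

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run a command; return its exit code, or None if it had to be killed."""
    try:
        return subprocess.run(argv, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing lingers.
        return None
```

A caller could treat `None` as "the step did not finish in time" and carry on with the next reload step, so that one stuck stop or restart does not block the whole chain.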

  7. Not Sure

    Let’s take this step by step, please. Thanks for your help, but can you test the following commits one by one to narrow down exactly which change is causing this?

    1. https://bitbucket.org/pedro311/freshtomato-arm/commits/5b55bfbdab440ba609eae67e12b4ce8bb3a618e0 This commit is before all kernel changes. Please confirm that this one works fine with regard to the crashing. The slow response is a different problem; let’s not mix it in now, we will look at it afterwards.
    2. https://bitbucket.org/pedro311/freshtomato-arm/commits/423a070cd1eab365a29e019ef89f754d7e74f0a6 First kernel change. See whether it crashes here or not.
    3. https://bitbucket.org/pedro311/freshtomato-arm/commits/ed6758ae65a15e15685ed5815562cb09a9156943 2nd kernel change. See whether it crashes here or not.
    4. https://bitbucket.org/pedro311/freshtomato-arm/commits/42dca15b495ab7a0c8ecf3f7913c329e6c79ffcb 3rd kernel change. See whether it crashes here or not.

    The problem is that commit https://bitbucket.org/pedro311/freshtomato-arm/commits/0c8c35358b7504703600d3e9415d9e486ba345c3 does not contain any kernel changes that can cause this crash. Do you have QoS enabled? If you have QoS disabled, then the code in that commit should not even be executed.
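
The narrowing-down requested above can also be done as a binary search rather than a one-by-one pass, sketched here under stated assumptions. The commit list is taken from the numbered items in this comment, abbreviated and assumed to be ordered oldest first. `crashes` is a hypothetical callback standing for "build this commit, flash it, enable IPv6, watch the serial console". The search only works if the crash, once introduced, stays present in all later commits.

```python
# Commits in the order listed in the comment above, oldest first (abbreviated hashes).
CANDIDATES = ["5b55bfb", "423a070", "ed6758a", "42dca15"]


def first_bad(commits, crashes):
    """Binary-search the first commit for which crashes(commit) is True.

    Assumes commits are ordered oldest to newest and that the crash, once
    introduced, is present in every later commit. Returns None if none crash.
    """
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if crashes(commits[mid]):
            hi = mid
        else:
            lo = mid + 1
    return commits[lo] if lo < len(commits) else None
```

With only four candidates the saving is small, but each test here means a full build, flash and reconfiguration, so skipping even one round is worthwhile. `git bisect` automates the same search directly on the repository history.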

  8. TheHiman reporter

    Just for now… I don't use BW and QoS, so they are always unconfigured in all my test scenarios.
    Tomorrow I will go through the individual steps and report back.

    In the meantime: can you check the “SMP” handling in the codebase in general? Descriptions etc. say the SDK6 device is “single-core”, and /proc/cpuinfo likewise shows just one CPU, but the web interface states dual-core, and even the coredump/segfault mentions “cpu1” and a “SMP preempt error”. Is there possibly a definition problem with SMP in the WLAN driver while the system itself runs on just 1 CPU? Is it possible that a userland daemon from the “e:” image is using a second core that is actually disabled in the basic configuration?
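
On the SMP question: the `PREEMPT SMP` tag in the oops banner reflects how the kernel was built, not how many cores are online. The logs in this thread do give a hint about online cores, though. The crashing CPU is named in the `CPU: N` line, and the panic path prints `CPUN: stopping` for each other CPU it halts. The sketch below collects those numbers and, for comparison, counts `processor` entries in the text of `/proc/cpuinfo`. The function names are hypothetical, and the `/proc/cpuinfo` layout is assumed to be one `processor : N` line per online CPU.

```python
import re


def cpus_seen(log_text):
    """Collect the CPU numbers an oops log mentions.

    Looks for the "CPU: N" header line and for "CPUN: stopping" lines, which
    the panic path prints for the other online CPUs.
    """
    ids = set()
    for m in re.finditer(r"\bCPU: (\d+)\b|\bCPU(\d+): stopping", log_text):
        ids.add(int(m.group(1) or m.group(2)))
    return sorted(ids)


def processors_in_cpuinfo(cpuinfo_text):
    """Count "processor : N" entries in the text of /proc/cpuinfo."""
    return len(re.findall(r"^processor\s*:\s*\d+", cpuinfo_text, re.MULTILINE))
```

Both crash logs in this thread mention CPU 0 and CPU 1 (`CPU: 0` with `CPU1: stopping` in the first, `CPU: 1` with `CPU0: stopping` in the second). That suggests two cores were running when the kernel panicked, whatever `/proc/cpuinfo` reported.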
