Current trunk: failover between WAN port (wan0) and LAN1 (wan1) is not working, still load balancing

TheHiman reporter

I forgot to mention that I have CTF enabled for higher throughput.

Furthermore, I am not sure whether this is still true: I read in the documentation that failover should ONLY work with DHCP & others, but NOT when both interfaces are statically
configured. But here is the problem: even when the router in front hands out a usable DHCP lease (instead of a static config with the same interface values), there is still no information
about whether the front-end router itself has an outside connection, because the received IP address belongs to a “transfer net”, not to the direct outside connection.
There will never be a “0.0.0.0” or “front-end router is down” condition to indicate a dead interface. In other words, the transfer network is always there, no matter whether the address is
received via DHCP (internal) or configured statically, including the next-hop gateway for each WAN port.

Maybe the failover routine could be reworked so that it is also useful for a fully static configuration: ignore the static IP configured on the interface, just do source-based routing through this network, and try to ping/traceroute an outside destination via this interface?
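
The idea above could be sketched roughly as follows. This is only an illustration, not the firmware's actual failover routine: the function names, the interface names `vlan2`/`vlan12` and the injectable `runner` are hypothetical, and it assumes a Linux-style `ping` that accepts `-c`, `-W` and `-I <interface>`.

```python
import subprocess
from typing import Callable, Sequence


def build_ping_cmd(iface: str, target: str, timeout_s: int = 2) -> list[str]:
    """Build a command that sends one echo request to `target` bound to `iface`."""
    return ["ping", "-c", "1", "-W", str(timeout_s), "-I", iface, target]


def run_cmd(cmd: Sequence[str]) -> int:
    """Run the command quietly and return its exit code (0 means a reply arrived)."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def link_has_outside_connectivity(
    iface: str,
    targets: Sequence[str],
    runner: Callable[[Sequence[str]], int] = run_cmd,
) -> bool:
    """Treat the link as alive if at least one outside target answers via `iface`.

    The address configured on the interface is never inspected, so a transfer
    network that is always present cannot mask a dead upstream connection.
    """
    return any(runner(build_ping_cmd(iface, t)) == 0 for t in targets)
```

A call such as `link_has_outside_connectivity("vlan12", ["8.8.8.8", "1.1.1.1"])` would then report on the outside connection rather than on the transfer network. Whether the probe really leaves through the intended next-hop gateway still depends on the routing rules configured on the device.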

2023-10-01T07:44:14+00:00

pedro repo owner

changed status to new

2023-10-06T13:40:46+00:00

TheHiman reporter

Here is some further info on how my test config is set up:

Regular WAN: static 192.168.11.2/24, Gateway: 1 (VLAN 2, untagged, default)
WAN1: uses LAN port 1, static 192.168.12.2/24, Gateway: 1 (VLAN 12, untagged, only this port is added)

WAN(0) = Carrier-Grade NAT for IPv4, IPv6 fully routed from ISP (FTTH)
WAN(1) = Dual Stack v4/v6 from ISP (VDSL2)

Using traceroute as the WAN check is not really reliable for WAN0/CGN, so I switched to PING for further testing.
To rule out possible DNS issues, I use the IP addresses 8.8.8.8 and 1.1.1.1 for checking.

CTF is enabled because of the fast 1GE WAN connection on the WAN0 port!

The weight on both WAN ports is “0”, which should disable any type of load balancing, but it is still
not really working. Balancing is always active. Most traffic goes straight to WAN1, the slow second ISP connection (for whatever reason…).

The Info/Overview page shows both links as fully working. So there must be a problem with the routing table or the default route in use?
WAN1 (VDSL2) should ONLY be used when the v4/CGN connection is really broken or the ISP has routing issues, but
2-3 minutes after a cold boot the traffic goes out only via WAN1. The WAN0 link is still working and still being checked, but it is no longer used as
the primary default gateway.
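
The expected behaviour described above can be written down as a small sketch. It is not the firmware's logic, only an illustration of strict failover without load balancing; `select_active_wan` and the data layout are hypothetical.

```python
from typing import Optional


def select_active_wan(priority: list[str], link_up: dict[str, bool]) -> Optional[str]:
    """Return the first WAN in priority order whose check currently passes.

    With strict failover, a lower-priority link is only chosen while every
    higher-priority link is down; traffic is never split between links.
    """
    for wan in priority:
        if link_up.get(wan, False):
            return wan
    return None  # no link is usable
```

With `priority = ["wan0", "wan1"]` and both checks passing, this picks `wan0`; `wan1` is returned only while `wan0` is reported down, which is the opposite of what is observed here after 2-3 minutes.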

2023-10-07T19:56:52+00:00

TheHiman reporter

Experimenting with different check times, such as 30/60/120 seconds, doesn't change anything.
Manually pinging 8.8.8.8 and 1.1.1.1 from the shell to test both ports clearly shows that
both WAN ports are working, and manual testing reveals no issues.

Perhaps the watchdog has issues when one of the links is a CGN instead of a real v4 connection at the front-end router?
Traceroute on a CGN connection does work when no DNS names are resolved, but it is
often blocked on the first hops on the ISP side, so the ping method is much more reliable in such a mixed scenario.
Usually, when traceroute is repeated often here, some of the first ISP routers delay or block the entire trace,
and a false positive regarding reaching the destination can occur. PING, in contrast, always works.

2023-10-07T20:07:38+00:00
