HS110 Goes Zombie - A Deeper Dive

Broke this out into its own topic.

@ccook,
Here’s the other thing that’s so befuddling… I have created a HA dashboard that now matches up the Ubiquiti presence monitor for select smartplugs (Home/Away) with their Sense binary_sensor companions and have caught my Furnace Up HS110 going rogue, but Ubiquiti shows it as connected (Home). You can also see the dropouts in the Furnace Up waveform below.

When I try to ping that HS110 I get mixed results. A ping from my router (UDM) or from my video server (Mac Mini, left), both wired, is successful. The same ping from my MacBook Pro (right) is unsuccessful.

I’m fairly certain something that HS110 is doing is screwing up its accessibility on my network (simple - no VLANs, but multiple SSIDs right now). Not surprisingly, the ARP table for the MacBook Pro does not show an entry for the IP address of that HS110.
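For anyone who wants to repeat the ARP check without eyeballing the table, here is a minimal Python sketch. It assumes macOS/BSD-style `arp -a` output (lines like `? (ip) at mac on en0 ...`), and the sample text and MAC address are made up for illustration, not taken from my network.

```python
import re

def arp_has_entry(arp_output: str, ip: str) -> bool:
    """Return True if `ip` has a resolved MAC in `arp -a` style output."""
    for line in arp_output.splitlines():
        # Assumed macOS/BSD format: "? (192.168.1.7) at aa:bb:cc:dd:ee:ff on en0 ..."
        match = re.search(r"\(([\d.]+)\) at (\S+)", line)
        if match and match.group(1) == ip:
            # An unresolved neighbour is reported as "(incomplete)".
            return match.group(2) != "(incomplete)"
    return False

# Hypothetical sample output; the MAC below is a placeholder.
SAMPLE_ARP = """\
? (192.168.1.1) at aa:bb:cc:dd:ee:ff on en0 ifscope [ethernet]
? (192.168.1.7) at (incomplete) on en0 ifscope [ethernet]
"""

print(arp_has_entry(SAMPLE_ARP, "192.168.1.1"))
print(arp_has_entry(SAMPLE_ARP, "192.168.1.7"))
```

On a real machine the input could come from `subprocess.run(["arp", "-a"], capture_output=True, text=True).stdout`; other operating systems format the table differently, so the regex would need adjusting.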

And just to make it more interesting, the offending HS110 is visible in the Kasa app and does produce power info.


But that HS110 does not respond to pings sent from the same iPhone as the Kasa app.

ps: This is a different problem than I initially posted but continues the endless story of data dropouts. Time to reboot this bugger in the attic.

UPDATE: And of course, a power cycle of the HS110 cleans up the ping problems. And I’m betting that the data dropout will disappear, at least for a bit.

@kevin1 Do you have all your access points set to 100%? And an AP map?

On Sun I decided to turn on Ubiquiti AP radio optimization. Here’s what I’m seeing (below). I manually had all the 2.4G WiFi channels spread between 1, 6 and 11, but the system has decided to put them all on 6. I live on a 1/2 acre plot but my hardware can see at least 100 other APs all broadcasting away, so it may have made the right decision.
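As a quick reference on why 1, 6 and 11 is the usual manual spread: 2.4 GHz channels 1 through 13 are centered at 2407 + 5n MHz, so neighbouring channel numbers sit only 5 MHz apart. The sketch below is a simplified model that assumes 20 MHz wide channels and treats any shared spectrum as overlap; it says nothing about how well APs on the same channel share airtime.

```python
def center_mhz(channel: int) -> int:
    """Center frequency in MHz of 2.4 GHz Wi-Fi channels 1-13."""
    if not 1 <= channel <= 13:
        raise ValueError("only channels 1-13 follow the 5 MHz spacing")
    return 2407 + 5 * channel

def channels_overlap(a: int, b: int, width_mhz: int = 20) -> bool:
    """True if two channels of the assumed width share any spectrum."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz

for pair in [(1, 6), (6, 11), (6, 6), (1, 3)]:
    print(pair, channels_overlap(*pair))
```

Under this model 1/6 and 6/11 come out clear of each other, while every AP parked on channel 6 overlaps fully with the others.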

As far as maps, are you looking for a physical one with AP positioning? I have one.

I was wondering if you had two that were too close. The UDM won’t prevent the RF interaction between the UniFi radios. The 1, 6, 11 staggering is always good. For the second-closest AP to your thermostat (Furnace Up / Ecobee?), turn the transmission power to medium or even low for an hour and see if your thermostat stabilizes. Then, in the UniFi portal under client devices, see if that HS110 and the Furnace Up are also dropping in the 24-hour overview map. If it’s farther back than 24 hours on the HS110, you can use the HA logbook. When you find the exact time that the hiccup occurs with the HS110, pull the firewall logs from the UDM over ssh with cat /var/log/messages.

Strange part about the Ecobee… I’ve seen another one do the same thing. They had a combo modem/wifi/router that was less than 15 ft away. The closest house to them was 1/2 mile away. Then I hooked up another Ecobee that was on a different floor and the opposite side of the house, in the middle of downtown… works like it’s plugged in.

Here are the maps with estimated 2.4GHz signal strength. Most are fairly well spaced out at the periphery of the house. The pair that are the closest laterally are separated by essentially 2 full 10’ stories plus two exterior walls, though the upper one, closest to the HS110 in question, is a U6-LR (hence the larger radius). Numbers reflect the original channel assignments I’m going back to.


I’m going to have to double check the logs, because the power cycle didn’t fix the data issue, but something mysteriously did around 8AM this morning.

Here’s the sum of the log messages for that span of time - nothing major stands out except maybe the 8:55AM DHCP lease renewal (but there was a similar one 24 hours earlier).

Feb  8 03:00:33 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPREQUEST(br0) 192.168.1.7 HS110MAC 
Feb  8 03:00:33 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPACK(br0) 192.168.1.7 HS110MAC HS110
Feb  8 03:01:10 Menlo-UDM user.notice dpi-flow-stats: ubnt-dpi-util: ignoring trace from wireless client HS110MAC due to stale wifi info | 172845 seconds | FP ML on UAPs not supported
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPDISCOVER(br0) HS110MAC 
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPOFFER(br0) 192.168.1.7 HS110MAC 
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPREQUEST(br0) 192.168.1.7 HS110MAC 
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPACK(br0) 192.168.1.7 HS110MAC HS110
Feb  8 12:09:31 Menlo-UDM user.notice dpi-flow-stats: ubnt-dpi-util: ignoring trace from wireless client HS110MAC due to stale wifi info | 205745 seconds | FP ML on UAPs not supported
Feb  9 00:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPREQUEST(br0) 192.168.1.7 HS110MAC 
Feb  9 00:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPACK(br0) 192.168.1.7 HS110MAC HS110
Feb  9 00:09:30 Menlo-UDM user.notice dpi-flow-stats: ubnt-dpi-util: ignoring trace from wireless client HS110MAC due to stale wifi info | 248945 seconds | FP ML on UAPs not supported
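To make the DHCP pattern in these lines easier to read, here is a rough Python sketch that pulls out the dnsmasq-dhcp events, lists any DHCPDISCOVER (a client starting lease acquisition from scratch, which typically follows a reboot or rejoin, rather than a plain renewal), and computes the gaps between DHCPACKs. The syslog lines carry no year, so the `year=2022` default is an assumption, and the function names are my own.

```python
import re
from datetime import datetime

# Matches the dnsmasq-dhcp lines as they appear in the UDM log above.
DHCP_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ \S+ dnsmasq-dhcp\[\d+\]: (DHCP[A-Z]+)\("
)

def parse_dhcp_events(log_text, year=2022):
    """Return (timestamp, message_type) pairs; `year` is an assumption."""
    events = []
    for line in log_text.splitlines():
        m = DHCP_RE.match(line)
        if m:
            stamp_text = " ".join(m.group(1).split())  # collapse "Feb  8" padding
            stamp = datetime.strptime(f"{year} {stamp_text}", "%Y %b %d %H:%M:%S")
            events.append((stamp, m.group(2)))
    return events

def summarize(events):
    """Return the DHCPDISCOVER times and the gaps between successive DHCPACKs."""
    discovers = [t for t, kind in events if kind == "DHCPDISCOVER"]
    acks = [t for t, kind in events if kind == "DHCPACK"]
    gaps = [later - earlier for earlier, later in zip(acks, acks[1:])]
    return discovers, gaps

SAMPLE_LOG = """\
Feb  8 03:00:33 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPREQUEST(br0) 192.168.1.7 HS110MAC
Feb  8 03:00:33 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPACK(br0) 192.168.1.7 HS110MAC HS110
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPDISCOVER(br0) HS110MAC
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPOFFER(br0) 192.168.1.7 HS110MAC
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPREQUEST(br0) 192.168.1.7 HS110MAC
Feb  8 12:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPACK(br0) 192.168.1.7 HS110MAC HS110
Feb  9 00:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPREQUEST(br0) 192.168.1.7 HS110MAC
Feb  9 00:08:55 Menlo-UDM daemon.info dnsmasq-dhcp[24692]: DHCPACK(br0) 192.168.1.7 HS110MAC HS110
"""

discovers, gaps = summarize(parse_dhcp_events(SAMPLE_LOG))
print("DISCOVER at:", discovers)
print("ACK gaps:", [str(g) for g in gaps])
```

The one DISCOVER/OFFER/REQUEST/ACK sequence at 12:08:55 on Feb 8 is the only full handshake in the span; the other two entries are plain REQUEST/ACK renewals. The exact 12 hour gap after it would fit a renewal at the halfway point of a 24 hour lease, though the lease length isn’t shown anywhere in these logs.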

Also seeing the following two messages on the AP that that HS110 is connected to, but the date on that AP is off for some reason. If I compare against the current time on that AP, these events happened a little before 10:30AM today.

Sat Jan 29 03:54:25 2022 daemon.info stahtd: stahtd[3104]: [STA-TRACKER].stahtd_dump_event(): {"auth_ts":"0.0","message_type":"STA_ASSOC_TRACKER","vap":"ra1","mac":"HS110MAC","event_type":"fixup","assoc_status":"0","event_id":"2","dns_resp_seen":"yes"}
Sat Jan 29 04:54:27 2022 daemon.info stahtd: stahtd[3104]: [STA-TRACKER].stahtd_dump_event(): {"auth_ts":"0.0","message_type":"STA_ASSOC_TRACKER","vap":"ra1","mac":"HS110MAC","event_type":"fixup","assoc_status":"0","event_id":"2","dns_resp_seen":"yes"}
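Since the AP clock is off, the comparison against the AP’s current time can be done mechanically. The sketch below parses one of the stahtd lines (timestamp plus the JSON payload) and shifts the timestamp by the difference between the AP’s idea of "now" and the real time. The `ap_now` and `real_now` values are hypothetical placeholders, not readings from my AP, and the parsing assumes an English locale for the day and month names.

```python
import json
from datetime import datetime

def parse_stahtd_line(line):
    """Split an AP log line into (AP timestamp, decoded JSON payload)."""
    stamp_text, _, rest = line.partition(" daemon.info ")
    stamp = datetime.strptime(stamp_text, "%a %b %d %H:%M:%S %Y")
    payload = json.loads(rest[rest.index("{"):])
    return stamp, payload

def to_wall_clock(ap_stamp, ap_now, real_now):
    """Shift an AP timestamp by the AP's clock error (real_now - ap_now)."""
    return ap_stamp + (real_now - ap_now)

LINE = (
    'Sat Jan 29 03:54:25 2022 daemon.info stahtd: stahtd[3104]: '
    '[STA-TRACKER].stahtd_dump_event(): {"auth_ts":"0.0",'
    '"message_type":"STA_ASSOC_TRACKER","vap":"ra1","mac":"HS110MAC",'
    '"event_type":"fixup","assoc_status":"0","event_id":"2","dns_resp_seen":"yes"}'
)

stamp, payload = parse_stahtd_line(LINE)
# Hypothetical clock readings, for illustration only.
ap_now = datetime(2022, 1, 29, 5, 0, 0)
real_now = datetime(2022, 2, 9, 10, 35, 0)
print(payload["event_type"], to_wall_clock(stamp, ap_now, real_now))
```

With corrected times in hand, the two "fixup" events can be lined up against the UDM log and the HA logbook.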

One more datapoint - here’s a view of an entire month for the Furnace Up HS110 that seems most prone to going full on “zombie mode” and dropping data to the Sense monitor. The problem is that it works reliably for a long period (in this case 22 days), then goes rogue again.

Let’s try… Temporarily disable the downstairs AP located between the bonus room and family room. Turn the TX power of both upstairs APs to low for a few hours and see if your thermostat WLAN stabilizes. You might have to make an SSID just for your upstairs thermostat and then an SSID for the HS110 you’re having issues with. I’m thinking Ecobee doesn’t do well with mesh, or with APs sharing the same SSID. The Kasa plugs might be the same.

In the logs… Just wondering if, in that 15 min, it was using 192.168.0.1.

@ccook, thanks for the guidance. A few points:

  • I’ll try your approach to disabling the playroom (Bonus Room) AP, and adjusting the other APs. I’ll have to be a little circumspect in doing so since that room is my wife’s active office, and that AP also has wired connections that are in perpetual use.
  • Wondering how I might actually see an HS110 going into DHCP server mode and using a 192.168.0.1 IP address? That IP won’t be visible to my UDM, since that HS110 will have disassociated itself from my APs/SSIDs when it switched over to that mode. If I’m not mistaken, in that mode it’s broadcasting its own SSID. I guess I could see it as a neighboring network?
  • I’m not so worried about my Ecobees - they seem to be good with data reliability despite their weird ping response time ramp-ups. My Upstairs Ecobee has been providing rock solid data over time (bottom chart).

While attached to your network via WLAN, not in setup mode, it just re-assigned itself 192.168.0.1. If it were in setup mode, it would have to release from your network and then broadcast its own SSID. On mine, that’s how the KP125 was still connected to my network: I noticed 192.168.0.1 listed in the MAC table within my Ubiquiti switch GUI (it showed up on the port of the AP). Then I went to my firewall logging (Sophos XP) and searched for the IP and the MAC. You should be able to search for the MAC in your UDM; I’m assuming that any 192.168.0.x/24 would be invalid traffic in your home network.
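The "invalid traffic" test is easy to script once you have MAC/IP pairs from a switch or firewall export. This is a minimal sketch using Python’s standard `ipaddress` module; the 192.168.1.0/24 home subnet is an assumption based on the 192.168.1.7 lease in the logs, and the table entries are placeholders.

```python
import ipaddress

# Assumed home subnet (the logs show 192.168.1.7; the /24 mask is a guess).
HOME_NET = ipaddress.ip_network("192.168.1.0/24")

def is_foreign(ip: str, home=HOME_NET) -> bool:
    """True if `ip` falls outside the home subnet."""
    return ipaddress.ip_address(ip) not in home

def foreign_entries(table, home=HOME_NET):
    """Filter (mac, ip) pairs down to those using an out-of-subnet address."""
    return [(mac, ip) for mac, ip in table if is_foreign(ip, home)]

# Placeholder entries for illustration only.
MAC_TABLE = [("HS110MAC", "192.168.1.7"), ("KP125MAC", "192.168.0.1")]
print(foreign_entries(MAC_TABLE))
```

Anything this flags would be a device that is still associated but has handed itself an address the router never issued.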

Assuming you have a UI PoE switch of some sort, you might even just look in your switch GUI / MAC tables and see if the HS110 is showing up on multiple APs or even showing an invalid IP. Same with the upstairs Ecobee.

@ccook, thanks again for the guidance. I tried to find commands that reveal the routing tables and associated info on my UI PoE switch, but I think the “consumer” line doesn’t have the Cisco compatible commands (I can’t telnet to get to those like in the instructional videos for the Edge line).

When I look at the UI, there’s nothing that traps the illegal 192.168.0.1 on my network because it is already off the network by the time it claims the new illegal IP address. I do see the following when I emulate using a soft reset on my goof-around HS110.

I definitely deal with the Edge/MikroTik/Cisco switches a lot more. In UI Network (browser), select the switch > Settings > Manage > Download Device Info. It should download a JSON file. Open it and use Ctrl+F to find 192.168.0.1, then see if it has the MAC 50:CF:BF xxxx within that entry. It’s unfortunate that the managed switches don’t also have a web GUI. That and firewall logging are the first things I pull to figure out what’s going on within the network.
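If the file is large, the Ctrl+F step can also be done with a short script that walks the JSON and reports where a string turns up. The sketch below makes no assumption about the real device-info schema; the `SAMPLE` structure and its field names are invented purely to exercise the search, and the full MAC is a placeholder built on the 50:CF:BF prefix.

```python
import json

def find_paths(node, needle, path="$"):
    """Yield JSON paths whose string value contains `needle` (case-insensitive)."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_paths(value, needle, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_paths(value, needle, f"{path}[{index}]")
    elif isinstance(node, str) and needle.lower() in node.lower():
        # Substring match: "192.168.0.1" would also hit "192.168.0.10".
        yield path

# Invented structure for illustration; the real export will look different.
SAMPLE = json.loads("""
{"mac_table": [
  {"mac": "AA:BB:CC:00:00:01", "ip": "192.168.1.7", "port": 3},
  {"mac": "50:CF:BF:00:00:00", "ip": "192.168.0.1", "port": 5}
]}
""")

print(list(find_paths(SAMPLE, "192.168.0.1")))
print(list(find_paths(SAMPLE, "50:cf:bf")))
```

For a downloaded file, `json.load(open(path))` would replace the inline sample, and matching paths that share a parent entry point at the same device.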

Thanks again. Searched the JSON for both my main switch and for the UDM (which includes the routing). Neither contained the 192.168.0.1 IP. In Ubiquiti-land, I only have one unified UI view for the home network. I can push down into any device and get a reasonable picture of what’s going on. I’m pretty convinced based on what I have seen that the only inkling I’ll get of the HS110 going rogue is it going “gray” on the client listing.