Ticket #180 (reopened bug)

Opened 4 months ago

Last modified 7 weeks ago

Cloudtrax "Map of Users per Node" and SSID#1 client list inaccurate ng354

Reported by: jmatwyko
Owned by: dfl-owner
Priority: major
Milestone: ng-beta
Component: dashboard
Keywords: OM2P
Cc: john@…
Network name: Westerly

Description

Occurred after OM2P upgrade from 329 to 354. I have many nodes showing "users" greater than 10 in the Map and yet in the SSID#1 client list there are several nodes with no one connected (using any data). As a test I connected with my Android phone to AP12 and did some speedtests and some surfing, and indeed the "Map of Users per Node" number went up by one (note: it has never gone back down). I then checked the SSID#1 list after the next check-in of this node and my phone/MAC was listed with correct data usage. I was the only one listed on that node in the last 24 hours that actually moved any data. Yet Map of Users per Node is still showing 9?? Yes, I understand that many devices connect to an open network automatically but don't actually use the connection. Why are they even listed???

And after several hours I checked back into cloudtrax and the node in question (AP12) still shows 9 users, but in the SSID#1 list my phone and its data usage on that node are not displayed anymore; in fact there is no 24 hr data usage or clients listed for this node.

So what bothers me is that the Map of Users per Node seems to increment with every connection (doesn't matter if data is used, maybe??) and doesn't seem to decrement when a device leaves. And if that device does use data and at some point shows up in the SSID#1 client list, sometime later no data on that device is in the SSID#1 user list.

I have removed that particular node's antenna for testing purposes (shouldn't have any drive-by connections, so to speak) and will try this same process again overnight. Will update with the result.

Change History

  Changed 4 months ago by mike

Actually, this hasn't changed recently. You are correct: The usage graph shows all users who connected at all, regardless of whether they used any data. The actual list of users shows only the users whose traffic reached at least dial-up rates at least once.

This is known and will be changed so that the user count on the daily graph matches the users shown below.

  Changed 4 months ago by jmatwyko

I just rebooted this node; on the next check-in it reported zero connections in Map of Users per Node. I now have one node showing 22 connections in the map (wow) and the users list has only 2 with any data usage. Something has changed, as reporting on cloudtrax shows many more users than normal (pre ng354). I've got an empty hotel tonight and the user count is the highest I've seen. And where is the record of data usage on AP12 by my phone that was listed at one point and now doesn't show? It was only this afternoon that I was testing this.

  Changed 4 months ago by mike

What is the network? It doesn't seem to be the one on the ticket.

  Changed 4 months ago by jmatwyko

Sorry, maybe it's case sensitive. Try westerly, all lowercase.

  Changed 4 months ago by jmatwyko

I connected again with my phone to AP12 after rebooting it and the map went from 0 to 1 user connected. Did some more speedtests/surfing and the map shows 25 MB of data usage by 1 user (me). Went into SSID#1 clients and guess what? My history of connecting from 1:00 pm PST was now available. It wasn't there 10 minutes earlier, before I re-connected to the same node. Something is amiss.

  Changed 4 months ago by mike

What are you using to view this? I haven't been able to see any of the issues you reported. Sounds like your browser cached data?

  Changed 4 months ago by jmatwyko

2 different machines. My 2008 server and from my linux laptop. IE on w2k8 box and chrome on linux, browser cache dumped on both before submitting ticket.

  Changed 4 months ago by mike

Then perhaps I don't understand what you have been saying. I see several nodes with many users. The user list shows 36 users, a few of which are showing present usage.

Let's try email. mike@….

  Changed 4 months ago by marek

While I was tunneled into AP10 I checked the connected users list. It seems Mike is right: although 9 users were connected, only a few of them actually generate traffic. Here is the list:

mac: 00:21:e8:6b:54:3c, state: Authenticated, down: 2313, up: 449
mac: 00:26:08:22:a3:43, state: Authenticated, down: 569, up: 124
mac: 00:23:df:0e:8d:c8, state: Authenticated, down: 0, up: 1
mac: 98:03:d8:7e:6d:fa, state: Authenticated, down: 182, up: 64
mac: f0:cb:a1:2a:eb:ec, state: Authenticated, down: 31, up: 19
mac: 60:c5:47:94:f3:2a, state: Authenticated, down: 5, up: 5

Those 3 are connected without generating any traffic at all:

78:d6:f0:3e:4a:21
f0:cb:a1:d1:31:06
40:6a:ab:61:d4:bf

I got this list with the following command (in case you want to check yourself):

ndsctl clients | awk -F'=' '{if (NR > 2) {row=(NR - 2) % 13;} if (row == 3) mac=$2; if (row == 8) state=$2; if (row == 9) down=$2; if (row == 11) up=$2; if (row == 12) printf("mac: %s, state: %s, down: %s, up: %s\n", mac, state, down, up);}'
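For anyone who would rather not count rows, here is a minimal Python sketch of the same idea: read key=value client blocks and separate the clients that moved data from the idle ones. The field names (`mac`, `state`, `downloaded`, `uploaded`) and the blank-line-separated layout are assumptions about what `ndsctl clients` prints, and the sample text and MAC addresses are made up for illustration.

```python
# Hypothetical sample; the field names and block layout are assumptions
# about what `ndsctl clients` prints, and the MACs are made up.
SAMPLE = """\
3

client_id=0
mac=00:00:5e:00:53:01
state=Authenticated
downloaded=2313
uploaded=449

client_id=1
mac=00:00:5e:00:53:02
state=Authenticated
downloaded=0
uploaded=0

client_id=2
mac=00:00:5e:00:53:03
state=Authenticated
downloaded=5
uploaded=5
"""


def parse_clients(text):
    """Return one dict per client block (blocks are separated by blank lines)."""
    clients = []
    for block in text.strip().split("\n\n"):
        fields = dict(
            line.split("=", 1) for line in block.splitlines() if "=" in line
        )
        if "mac" in fields:  # skips the leading client-count line
            clients.append(fields)
    return clients


def moved_data(client):
    """True if the client has any downloaded or uploaded traffic recorded."""
    return int(client["downloaded"]) + int(client["uploaded"]) > 0


clients = parse_clients(SAMPLE)
idle = [c["mac"] for c in clients if not moved_data(c)]
print(f"{len(clients)} connected, {len(idle)} idle: {idle}")
```

On a node, the real output could be saved to a file and passed to `parse_clients` in place of `SAMPLE`, provided the layout matches the assumption above.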

  Changed 4 months ago by jmatwyko

I've finally had the time to read up on ndsctl. I think I understand now how the current users # in the nodes in network part of the dashboard is being generated. ndsctl /status cleared it up for me. "Current users" is all devices that have authenticated at this AP. Active duration is time based on active association time. Added duration is the time after the association has ended, up to the "time limit set" for the splash page (or in my case URL redirection) before it is displayed again. Default 24 hrs. A client is not removed from the dashboard until active time plus added time = 24 hrs, except when the device moves to another AP.

Thanks for the command line in previous post. I saw awk and laughed, had to pull out my UNIX system administration handbook 2nd edition and look at the pub date. OMG, 1995 when I took that course. Man that was a long time ago :-)

thanks, Jeff

  Changed 4 months ago by marek

Yeah, awk definitely is old school but a real time saver. :-)

Where does it leave us with your bug report ?

  Changed 4 months ago by jmatwyko

"Where does it leave us with your bug report ?"

If the change mentioned in the second post is in the works then it's your call; you can close if you want. As long as we're on the same page, and if we are, then is this correct?

The "Active users" on the right hand side of the 24 hr. graph window reflects total associations in the network with usage greater than (X number of bytes) over a sliding 5 minute window?

The one that interests me most is the users per node. If the number is accurate in near real time terms then it is a very useful piece of information, but on the other hand if it is just the total of associations seen in the last 24 hours, then not so much.

The term "current users" in the node details implies (at least to me) that they have a current active association regardless of data usage. New device = +1 user; when a device stops having an active association and the counter on the "Added duration time" starts, then the "current user" count should go down by 1.

Looking at the map right now, of 17 nodes only 3 are displaying fewer than 10 users (the rest show 10+), which is not really useful for real-time (or near real-time) current users.

In the interim, if I need to get an "idea" of actual active current users per node I will give the node a reboot; then at least for a short period of time it will be close enough.

thanks, Jeff

  Changed 4 months ago by jmatwyko

Another issue is if I look at the nodes in network up/down totals for a particular building they don't add up to just one user's download total in 24 hr period. Example as follows:

AP15 3rd floor mountain side wing ac:86:74:02:18:67 Apple (iPhone) -50dbm 43mbps MCS4 17,195,420 131,624

AP15 total for 24 hr period. 1415.8/33.7 uptime 5d:15h:9m

Now it is possible that this device at some point was connected to 1 of the other 5 nodes in the building and continued downloading, but the total for the other 5 nodes is only about 11GB down over the 24 hr period. Note: 1 node (AP17) has only been up for 0d:12h:40m as of now, but it is on the other side of the building one floor up, so it is not likely the one the device was using.

Not really a big deal but interesting nonetheless. I will monitor this daily for a bit and see if it happens again.

thanks, Jeff

  Changed 4 months ago by mike

You reported the node mac, not the user mac. What user are you referring to?

  Changed 4 months ago by jmatwyko

68:a8:6d:26:89:7c

  Changed 4 months ago by mike

We looked at this and I agree it is somewhat confusing currently. This will change very soon to use consistent data, but here is what is happening now:

1. The # of users on the nodes list is coming from the wifi driver on the nodes. It is showing users that have associated (may not have any usage) and have not "timed out" or disconnected. The usage that is reported here for gateways shows the usage that the captive portal thinks is current. The problem with this is that all users get their own sliding window based upon your timeouts on the cloud controller. When a user "times out", their data is discarded by the captive portal. So this data tends to be lower than you'd expect, compared to what is shown on the users page.

2. The total users count on the top graph on the overview page (also the top graph on the users page) is showing users that have authenticated and been added to cloudtrax database (again, may or may not have data). It also tends to be high compared to the list of users shown on the users page.

3. The users that are listed on the users page show the subset of users that have had at least an average rate equal to dial-up over at least one 5 minute checkin period. This is done to remove very tiny users from the list (it is designed to be more of a "top users" list).

In the case of the heavy user you reported, he had timed out on the node, so the node list shows much smaller amounts, as ALL his heavy usage from 20 hours before had been discarded (his current usage is very much less). This gets shown on the node list. But cloudtrax doesn't forget this data for users and correctly shows it on the users page.

What is being done is to change how all this data is reported on the nodes list page to make everything consistent with the users page. The total users at the top will also reflect the users shown. It isn't "inaccurate", it is just not reporting the same data for the same time periods and with the same filters for removing very tiny usage.
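To make the three numbers above easier to compare, here is a small Python sketch that applies each filter to the same set of users. It is only an illustration of the description above, not cloudtrax's actual code: the record layout, the MAC addresses, and the reading of "dial-up" as 56 kbit/s are all assumptions.

```python
# Assumption: "dial-up" is taken to mean 56 kbit/s; the thread gives no figure.
DIALUP_BYTES_PER_SEC = 56_000 / 8
CHECKIN_SECONDS = 5 * 60
# Bytes a user must move in one check-in period to average dial-up speed.
THRESHOLD_BYTES = DIALUP_BYTES_PER_SEC * CHECKIN_SECONDS

# Hypothetical records: bytes moved in each 5-minute check-in period.
users = [
    {"mac": "00:00:5e:00:53:01", "associated": True, "authenticated": False,
     "checkin_bytes": []},
    {"mac": "00:00:5e:00:53:02", "associated": True, "authenticated": True,
     "checkin_bytes": [0, 0, 0]},
    {"mac": "00:00:5e:00:53:03", "associated": True, "authenticated": True,
     "checkin_bytes": [12_000, 800]},
    {"mac": "00:00:5e:00:53:04", "associated": True, "authenticated": True,
     "checkin_bytes": [5_000_000, 40_000]},
]


def nodes_list_count(users):
    """Item 1: everything the wifi driver still sees as associated."""
    return sum(1 for u in users if u["associated"])


def top_graph_count(users):
    """Item 2: users that authenticated, with or without data."""
    return sum(1 for u in users if u["authenticated"])


def users_page(users):
    """Item 3: users that averaged at least dial-up over one check-in."""
    return [u["mac"] for u in users
            if any(b >= THRESHOLD_BYTES for b in u["checkin_bytes"])]


print(nodes_list_count(users), top_graph_count(users), len(users_page(users)))
```

With this sample data the three views report 4, 3 and 1 users for the same node, which is the kind of mismatch described in this ticket.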

  Changed 3 months ago by jmatwyko

Could you add one more item to the to-do list? The sorting for KB Down and KB Up in the SSID#1 Clients is based on the number before the comma, smallest to largest (or vice versa), so it ends up with a sort result like this in KB if I choose smallest to largest:

4,749
4,969
4,346
5,032,249
5,446
7,287
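The order above is what you would get if only the digits before the first comma were compared. A short Python sketch of that behaviour next to a full numeric sort follows; the actual cause in the dashboard is not confirmed anywhere in this ticket, so the "prefix" key is an assumption used only to reproduce the symptom.

```python
# The KB values exactly as listed in the report above.
values = ["4,749", "4,969", "4,346", "5,032,249", "5,446", "7,287"]

# Assumed faulty behaviour: compare only the number before the first comma.
by_prefix = sorted(values, key=lambda s: int(s.split(",")[0]))

# Numeric sort: drop the thousands separators before comparing.
numeric = sorted(values, key=lambda s: int(s.replace(",", "")))

print(by_prefix)
print(numeric)
```

Because Python's sort is stable, `by_prefix` keeps the reported order unchanged, while `numeric` moves 5,032,249 to the end where it belongs.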

  Changed 3 months ago by deniz

Replying to jmatwyko:

Could you add one more item to the to-do list? The sorting for KB Down and KB Up in the SSID#1 Clients is based on the number before the comma, smallest to largest (or vice versa), so it ends up with a sort result like this in KB if I choose smallest to largest: {{{ 4,749 4,969 4,346 5,032,249 5,446 7,287 }}}

Is that a joke?

  Changed 3 months ago by jmatwyko

  • status changed from new to closed
  • resolution set to worksforme

Nope, not a joke, or at least I didn't intend it to be, and I didn't want to start a ticket just for it. It's just a sorting issue, a cosmetic fix. How far down on the to-do list it goes doesn't really matter as long as it gets on the list. Mike and marek have explained clearly what is happening and why, and that a change is in the works.

  Changed 3 months ago by deniz

Replying to jmatwyko:

Nope, not a joke, or at least I didn't intend it to be, and I didn't want to start a ticket just for it. It's just a sorting issue, a cosmetic fix.

Sorry, my bad. I misread your feature request. I thought you were requesting sorting by "number before the comma", but that's the current state and you want it fixed...

Yeah, I agree numeric sorting makes more sense for numbers.

  Changed 3 months ago by marek

Replying to jmatwyko:

Could you add one more item to the to-do list? The sorting for KB Down and KB Up in the SSID#1 Clients is based on the number before the comma, smallest to largest (or vice versa), so it ends up with a sort result like this in KB if I choose smallest to largest: {{{ 4,749 4,969 4,346 5,032,249 5,446 7,287 }}}

This sorting problem was fixed recently. Can you confirm it is working for you ?

Thanks

  Changed 3 months ago by jmatwyko

Yes, sort order is working correctly. Thanks for fixing that. However, I still see anomalous user counts being reported in the "Map of Users per Node" when I mouse over a node. Say it says 20 users 0/data: it appears to be still reporting all associations without regard to data movement, and/or it is still including those with the "added duration time" counter ticking in the last 24 hrs.

I've been requested to do an event this spring that brings in 20k people over the four day event window. Part of the reporting back TTPTB is showing the number of actual associated clients who really did move some data, and on which node, at any given time.

Maybe an "active users per node" in addition to the "24hr users" that is already shown could be implemented in the " # Nodes in this Network" details.

Sorry to be such a pain in the ass, but having a real time number in the "Map of Users per Node" just seems to make so much more sense than what is currently being displayed. Is this reporting refinement still in the works?

thanks, Jeff

  Changed 3 months ago by mike

  • status changed from closed to reopened
  • resolution worksforme deleted

The current plan is to update the map to be consistent with the nodes list (24-hour usage), and to add a link (in the node list) to a detail node page showing more stats/graphs, including a selectable time period graph for the node's users/usage.

  Changed 7 weeks ago by wombo

  • cc john@… added

I'd like to see the "Current" number of users on the maps as well. Current being active in the last 15 minutes (or a selectable amount.)
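A rough Python sketch of what such a "current users" number could look like: count, per node, the devices whose last traffic falls inside a selectable window. The record layout, node names, device labels and timestamps are hypothetical; nothing here reflects how cloudtrax stores or reports its data.

```python
from collections import Counter
from datetime import datetime, timedelta


def current_users_per_node(last_seen, now, window=timedelta(minutes=15)):
    """Count devices per node whose last traffic is within `window` of `now`.

    last_seen: iterable of (node, mac, timestamp_of_last_traffic) tuples.
    """
    active = {(node, mac) for node, mac, ts in last_seen if now - ts <= window}
    return Counter(node for node, _ in active)


# Hypothetical sample data.
now = datetime(2013, 1, 1, 12, 0)
last_seen = [
    ("AP12", "phone-a", datetime(2013, 1, 1, 11, 55)),
    ("AP12", "laptop-b", datetime(2013, 1, 1, 9, 0)),   # idle for 3 hours
    ("AP10", "phone-c", datetime(2013, 1, 1, 11, 50)),
    ("AP10", "tablet-d", datetime(2013, 1, 1, 11, 46)),
]

counts = current_users_per_node(last_seen, now)
narrow = current_users_per_node(last_seen, now, window=timedelta(minutes=5))
print(dict(counts), dict(narrow))
```

Making the window a parameter covers the "selectable amount" part of the request: the same records give different counts for a 15 minute and a 5 minute window.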
