Excessive queue length after upgrade.

We upgraded our MobiControl environment from 13.3.3064 to 13.4.4822, and since then we have experienced queue length issues from time to time.
We have changed the update schedule and test message frequency and added memory to our SQL DB, but we are still experiencing excessive queue length, especially on one of our DS.

None of the servers are under high load and we are using 2 load balancing DS with about 30000 enrolled devices.

Any advice on what to check and try? We experienced no problems before the upgrade.

  • 03 September 2018
  • SOTI MobiControl
  • 13 Answers
  • 1 Upvote
  • 4 Followers
  • 1.4K Views

Raymond, Chan | posted this 03 September 2018

What device platforms do your 30K devices belong to? Do you use a hardware load balancer or just MobiControl's built-in software load balancing in your two-DS system?

As the v13.4.4822 server patch has been out for more than 3 months, I guess your upgrade was also done 2-3 months ago. How often have the issues occurred since the upgrade? If the load spikes only last for a very short period of time (e.g. a couple of minutes at most, every few days), it might not be a real issue.

If you have configured alert rule(s) for high queue length, check the alert logs for alert timestamps and look for patterns in what could be causing the queue length. The intervals between alerts, or the times of the alerts themselves, can often help narrow down which rule, profile or advanced setting actually causes the load spike.
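
One way to look for such patterns is to compute the gaps between consecutive alerts and count alerts per hour of day. The Python sketch below is only an illustration: the timestamps are made up, and the helper names `alert_intervals` and `alerts_by_hour` are hypothetical, not part of MobiControl. In practice the timestamps would come from your own alert log.

```python
from collections import Counter
from datetime import datetime


def alert_intervals(timestamps):
    """Return the gaps between consecutive alerts, in time order."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def alerts_by_hour(timestamps):
    """Count alerts per hour of day to expose daily patterns."""
    return Counter(ts.hour for ts in timestamps)


# Hypothetical alert times, for illustration only (not taken from the thread).
alerts = [
    datetime(2018, 8, 27, 8, 5),
    datetime(2018, 8, 27, 12, 4),
    datetime(2018, 8, 27, 16, 6),
    datetime(2018, 8, 28, 8, 3),
]

for gap in alert_intervals(alerts):
    print(gap)
print(alerts_by_hour(alerts).most_common())
```

Gaps that cluster around a configured interval (for example the update schedule), or alerts that recur at the same hour, would suggest which rule or schedule to inspect first.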

Michael Stjernborg | posted this 03 September 2018

We have about 3000 Windows Mobile/CE and 27000 Android+ devices, running on-prem and only using MobiControl load balancing.

Yes, we upgraded back in June and the issue occurs a couple of times a week. Sometimes the queue length reaches over 30000 and it takes several hours before everything gets back to normal. Usually only one of the DS experiences heavy queue length, but the devices are split about 50/50 between the two DS.

We have used Alert rules to try to find patterns but so far we have found none. 

Raymond, Chan | posted this 03 September 2018

Did your 27K Android devices need an agent upgrade, or have their agents already been upgraded?

What queue length threshold did you set in your alert rule? What is the average queue length reported in the web console (in the "Servers" tab for "All-Platforms") when the deployment servers are in the low-load state?

What connection mode, update schedule interval and test message frequency did you set?

Was it the primary or the secondary deployment server that had the exceptionally high load spikes?

Michael Stjernborg | posted this 04 September 2018

Yes, all Android devices had to upgrade to agent version 13.5.0.1496. There might be a few left on the old agent version, but I would say it's fewer than 100.

The queue length threshold was set to 1000 for the alert rule. Average queue length is about 0-30 on both servers in the low-load state.

Connection mode: Persistent

Update schedule: Every 4h (it was every 2h but changed it about a month ago to 4h)

Test message frequency: 300s (5min).

Both DS are set at priority 1, but it's "07P" that has the high spikes in queue length (see image).
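
For a rough sense of scale, the settings above can be turned into average message rates. This is a back-of-envelope sketch in Python, assuming devices are spread evenly over each interval and that every device sends one message per interval; neither assumption is confirmed in the thread, and `per_second_rate` is a hypothetical helper.

```python
def per_second_rate(device_count, interval_seconds):
    """Average events per second if devices are spread evenly over the interval."""
    return device_count / interval_seconds


DEVICES = 30_000               # enrolled devices, from the thread
TEST_MESSAGE_S = 300           # test message frequency, from the thread
UPDATE_SCHEDULE_S = 4 * 3600   # update schedule of 4 hours, from the thread

test_rate = per_second_rate(DEVICES, TEST_MESSAGE_S)
checkin_rate = per_second_rate(DEVICES, UPDATE_SCHEDULE_S)

print(f"test messages: {test_rate:.1f}/s across both DS")
print(f"scheduled check-ins: {checkin_rate:.2f}/s across both DS")
```

These are averages only. If a schedule fires at the same wall-clock time on every device, the same work would arrive as a burst rather than a steady trickle.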

Raymond, Chan | posted this 04 September 2018

How long has your HA system been running?

Was your HA solution installed by yourselves, by the Soti Professional Service team, or by your local Soti reseller's engineering team?

Do you have deployment server priority settings similar to those in the screenshot for all the devices/device groups in your whole system?

Michael Stjernborg | posted this 04 September 2018

What do you mean by HA? We do have a high-availability system, if that's what you mean.
We installed it about 1.5 years ago on version 13.3.3064 and we never had this kind of problem until after we upgraded (late June this year) to 13.4.4822.

It's an on-prem installation and we did it ourselves with the support of both our Soti reseller and Soti themselves.

It's the same DS priority for the whole system.

Any thoughts or ideas on what to try next?

Raymond, Chan | posted this 04 September 2018

By HA, I did mean High-Availability.

What is currently set as the "device management address" of the system?

What is the average CPU load percentage reported by the Microsoft resource meter tool for both deployment servers during the long periods (over an hour?) when the queue count exceeds 1000?

Do you have the timestamps of the alerts received for queue length exceeding your threshold of 1000 in the last 1-2 weeks?

Michael Stjernborg | posted this 05 September 2018

We use the "02p" server (FQDN) as the device management address. Average CPU is about 20% when the queue exceeds 1000.

Unfortunately we haven't had the alert rule running for more than a week. Another observation is that even when the queue length is low or zero, the system seems slow. For example, if we request a device check-in, it sometimes takes several minutes before the device actually checks in. This is just one example of many things that seem to be affected by our perceived lack of performance in MobiControl.

Raymond, Chan | posted this 05 September 2018

The long delay in getting devices to check in and your attached timestamps have already given me some more tell-tale signs of potential problems. Based on all your screenshots and descriptions in the previous posts, I believe some configurations of your HA servers and some policy settings were improper, or at least non-optimal, for HA implementations even before your server upgrade.

The problem may get amplified when you upgrade, because the installation program may not be able to fix something that was previously configured incorrectly. The way you upgraded your HA servers to v13.4.4822 may also have worsened the situation further.

If I continue asking you more questions and go through a debug flow here, I don't know how many posts there will be in this thread. Officially, you are not my customer, and I can't spend too much time on somebody's case in another part of the world. Also, some sensitive information may not be suitable for an open forum, but without it, debugging will be difficult.

The problem(s) should not have been there if your HA implementation had been done directly by the Soti Professional Service team. Based on some problems, big and small, that I've already spotted, I guess your team also did not get any basic training on how to tune your system and policies for HA operations. As you have 30K device licenses and should be considered a rather big customer, I think you'd better open an official support case with the Soti support team to check whether anything in your implementation is improperly configured.

Support Staff | posted this 12 October 2018

Hi Michael,

When the upgrade was completed, can you confirm that all of your servers were upgraded and are all on the same version?

Also, have you created a case yet? If so, please message me with the case number and I can follow up to see the status and assist in any way we can to ensure this is resolved properly.

Our Professional Services department would have been a great place to start with your upgrade, but as it is a paid service I understand if your organization decided to tackle this on its own. Rest assured they are here if we need them to get involved. If possible, perhaps I can still discuss this with them for any input they may have on this topic.

Regards,

Technical Support | SOTI Inc. | 1.905.624.9828 | support@soti.net | www.soti.net |

Víctor Márquez | posted this 25 October 2018

Hi Michael, 

We had the same issue performing the same upgrade. In our case, when agents started to upgrade, devices started to report an error about a Wi-Fi profile.

We just had to gradually push a new dummy Wi-Fi profile so that the new agents stop reporting the Wi-Fi profile error.

It seems that the new agents use a different key to encrypt and decrypt Wi-Fi passwords, and once you push the dummy profile all profiles are regenerated.

I hope this helps.

Regards, 

Víctor.

AJMOD@SOTI | posted this 08 November 2018

Hi Victor,

Just to confirm, did pushing the newly created Wi-Fi profiles to the devices reduce the queue length following the upgrade to MobiControl?

Technical Support | SOTI Inc. | 1.905.624.9828 | support@soti.net | www.soti.net |

Víctor Márquez | posted this 05 December 2018

Hi, 

The queue length was progressively reduced from 50k to 0 over a couple of days just by pushing that dummy Wi-Fi profile.

We still have this profile active so that devices that have been offline for a long time can regenerate their Wi-Fi profiles with the new encryption key.

Sorry for the delay.

Best regards, 

Víctor.
