How to stagger datasource instance collection?

Question

I'm working with custom Meraki API datasources and have an issue where the collector can get into a state in which all the instances attempt to collect simultaneously, or at least closely enough together that it triggers Meraki's rate limiting, and even my backoff/retry isn't doing the job.

I would think that if the collector script for my custom datasource is set to sleep for a rand() number of seconds, I should be able to avoid this. Any thoughts about what I might be doing wrong, or is there a "LogicMonitor way" I should be handling this?

Below is what I'm doing to try to account for 429 Rate Limit responses. (Not saying this is 'correct' in any way, but I thought the retry logic should've worked.)

import groovy.json.JsonSlurper

def getAPIQueryOutput(String api_uri)
{
  def url = 'https://api.meraki.com' + api_uri
  def req = getHTTPResponse(url)

  // 429 received due to rate limiting: back off and try again, up to 3 retries.
  def count = 1
  while(req.responseCode == 429 && count < 4){
    // Wait for an increasing amount of time.
    sleep(getBackOffMs(count))
    req = getHTTPResponse(url)
    count++
  }

  // Only a 200 carries a JSON body worth parsing.
  if(req.responseCode == 200){
    return new JsonSlurper().parseText(req.inputStream.getText('UTF-8'))
  }

  // 204 (no data), 400 (bad request), 404 (not found), a 429 that outlasted
  // the retries, or any other status: return null instead of parsing an error body.
  return null
}

def getBackOffMs(count){
  // Get a random int between 1 - 3 inclusive.
  backoff_seed = new Random().nextInt(3) + 1
  return backoff_seed * 1000 * count
}
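
One possible refinement to the backoff above, offered as a sketch rather than a tested fix: if the 429 response carries a Retry-After header (Meraki's rate-limit documentation describes one, but treat that as an assumption to verify against your own responses), wait for that hint, and otherwise fall back to exponential backoff with full jitter so that retries from many instances spread out instead of clustering. The name getBackoffMs with these parameters, the 1-second base, and the 30-second cap are all illustrative choices.

```groovy
import java.util.concurrent.ThreadLocalRandom

// Hypothetical helper: choose how long to wait before retry number `attempt` (1-based).
// retryAfterHeader is the raw Retry-After value from the 429 response, or null if absent.
// baseMs and capMs are illustrative defaults, not values taken from Meraki or LogicMonitor.
long getBackoffMs(int attempt, String retryAfterHeader, long baseMs = 1000L, long capMs = 30000L) {
    // Prefer the server's hint when it is a plain number of seconds,
    // capped so one wait cannot exceed capMs.
    if (retryAfterHeader?.trim()?.isLong()) {
        return Math.min(capMs, retryAfterHeader.trim().toLong() * 1000L)
    }
    // Otherwise: exponential backoff with full jitter,
    // a uniform random wait in [0, min(capMs, baseMs * 2^attempt)).
    long ceiling = Math.min(capMs, baseMs * (1L << Math.min(attempt, 20)))
    return ThreadLocalRandom.current().nextLong(ceiling)
}
```

If getHTTPResponse returns an HttpURLConnection, as the responseCode and inputStream properties suggest, the header could be read with req.getHeaderField('Retry-After') and passed in as the second argument. The total wait across all retries still has to fit inside the script's timeout.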


Answer

There is a more LogicalMonitor way to do it. (See what I did there?)

I believe we're working on getting information ready to present, but I think we're shifting the way LM monitors Meraki to use the API instead of SNMP (like you have done). I think it breaks things up so that each network is represented as a separate device in LM, which splits the monitoring across parallel tasks. That should skirt the 429 problem.

How urgent is this for you?

boomi · Answer

Ah, already ahead of you there. I came to the same conclusion after attempting a 'single device' model, which just couldn't do it. (If I weren't trying to do API switchport monitoring, it probably would've been fine.)

I use the MX at each location as the 'anchor' for all of the Meraki monitoring at that location.

But they somehow still manage to sync up. I had the AMP and IPS checks at 4-hour intervals, and after restarting the collector, I started getting buckets of 429s.

And now what's really got me confused: I set both of those to 5-minute collection intervals about 30 minutes ago, and now everything's fine!
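
For the sync-up after a collector restart, one idea (a sketch only, not something LogicMonitor documents as its own method) would be to replace the random sleep with a delay derived from a stable key such as the device hostname. Each device then always waits the same amount, and different devices land at different points in the window even when their tasks are all scheduled at the same moment. The function name staggerOffsetMs, the 60-second window, and the hostnames below are made up for illustration.

```groovy
import java.util.zip.CRC32

// Hypothetical helper: map a stable key (for example the device hostname) to a fixed
// delay in [0, windowMs), so the same device always waits the same amount of time.
long staggerOffsetMs(String key, long windowMs) {
    CRC32 crc = new CRC32()
    crc.update(key.getBytes('UTF-8'))
    // CRC32.getValue() is a non-negative long, so the remainder is never negative.
    return crc.getValue() % windowMs
}

// Usage sketch with made-up hostnames and a 60-second window.
['mx-site-a', 'mx-site-b', 'mx-site-c'].each { name ->
    println "${name}: wait ${staggerOffsetMs(name, 60000L)} ms"
}
```

In a collection script the key might come from hostProps.get('system.hostname'), followed by sleep(staggerOffsetMs(key, windowMs)) before the first API call. The window has to stay well under the script timeout, and two devices can still hash to nearby offsets, so this complements the 429 retry logic rather than replacing it.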
