Can you please provide the LM instance auto-balance collector optimization tool in the customer portal, so that we can calculate and assign instances per collector to balance our collector loads?
Regards, Amit

LM Instance Auto Balance Collector Tool

8 Replies

Joe_Williams

Professor

4 months ago

We use some different tooling to present the results in a table, but here is an example of what we use to check on our collectors. The commented-out block in the loop lets you pull other settings from the collector's own config. We have used that to check, for example, thread counts to verify that our 2XL collectors are set up above the bare minimums.

import os
import requests
import json
import time
import io


def lm_rest(httpVerb, resourcePath, data, queryParams):
    """Used to easily interact with LogicMonitor v3 API
    Args:
        httpVerb (str): HTTP Method - GET, PUT, POST, PATCH or DELETE
        resourcePath (str): Resource path for API
        data (str): json formatted string for PUT, POST or PATCH
        queryParams (str): filter and field parameters
    Returns:
        list: For multiple devices/groups return
        dict: For single device/group return
    """
    
    # Which HTTP Methods are allowed
    validverbs = {'GET', 'PUT', 'POST', 'PATCH', 'DELETE'}
    # Which HTTP Methods require the data variable
    dataverbs = {'PUT', 'POST', 'PATCH'}
    # IF the HTTP Method supplied isn't in the list of allowed
    if httpVerb not in validverbs:
        raise ValueError("httpVerb must be one of %r." % validverbs)

    # Get environment variables needed (.get() returns None instead of raising KeyError)
    company = os.environ.get("LM_COMPANY")
    lm_bearer = os.environ.get("LM_BEARER_TOKEN")

    # Raise exception if not set
    if not company:
        raise ValueError("Environment Variable: LM_COMPANY must be set!")
    if not lm_bearer:
        raise ValueError("Environment Variable: LM_BEARER_TOKEN must be set!")
    # Add the scheme prefix only after the token has been validated
    lm_bearer = "Bearer " + lm_bearer
    # Initialize variables
    count = 0
    done = 0
    allitems = []

    # If queryParams doesn't already start with "?", add it in front
    if not queryParams.startswith("?"):
        queryParams = "?" + queryParams
    while done == 0:
        data = str(data)
        # Use offset to paginate results
        queryParamsPagination = '&offset='+str(count)+'&size=1000'

        url = 'https://' + company + '.logicmonitor.com/santaba/rest' + resourcePath + queryParams + queryParamsPagination

        # Build Headers
        headers = {'Content-Type': 'application/json', 'Authorization': lm_bearer, 'X-Version': '3'}
        # Make request and check for errors
        # requests.request() dispatches on the verb, so one call covers every method
        try:
            response = requests.request(httpVerb, url, data=data, headers=headers)
            response.raise_for_status()
        except requests.exceptions.HTTPError as err:
            raise SystemExit(err)


        if httpVerb != 'GET':
            parsed = json.loads(response.content)
            lm_return = parsed
            break
        else:
            # If a GET parse content get totals
            parsed = json.loads(response.content)
            total = parsed.get('total', 0)
            if total != 0:
                items = parsed['items']
            else:
                lm_return = parsed
                done = 1
                break
            allitems = allitems + items
            numitems = len(items)
            count += numitems
            if count == total:
                done = 1
                lm_return = allitems
            else:
                # Loop and check if we are rate limited
                returned_headers = response.headers
                api_limit = int(returned_headers['x-rate-limit-limit'])
                api_left = int(returned_headers['x-rate-limit-remaining'])
                api_threshold = api_limit - api_left
                if api_threshold > 5:
                    time.sleep(int(returned_headers['x-rate-limit-window']))

    return lm_return

def main():

    # lm_rest() appends offset/size paging itself, so no query parameters are passed here
    collectors = lm_rest('GET', '/setting/collector/collectors', '', '')
    collectorconfigs = []
    for collector in collectors:
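        # Initialize instance counts and status fields.
        # If the collector doesn't monitor or use a collection type,
        # it doesn't exist in the collector status, so default everything here
        # to avoid a NameError (or a stale value from the previous collector) below.
        instance_batchscript_count = 0
        instance_snmp_count = 0
        instance_script_count = 0
        instance_configsourcescript_count = 0
        instance_wmi_count = 0
        collector_role = None
        collector_isadmin = None
        collector_username = None
        collector_instances_total = None
        collector_wrapper_minmem = None
        collector_wrapper_maxmem = None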

        collector_description = collector.get('description')
        collector_id = collector.get('id')
        collector_group = collector.get('collectorGroupName')
        collector_numberOfHosts = collector.get('numberOfHosts')
        collector_size = collector.get('collectorSize')
        collector_ea = collector.get('ea')
        collector_status = collector.get('status')
        collector_build = collector.get('build')
        collector_canDowngrade = collector.get('canDowngrade')
        collector_platform = collector.get('platform')
        collector_isDown = collector.get('isDown')
        collector_escalatingChainId = collector.get('escalatingChainId')
        # Fall back to an empty string in case wrapperConf is missing
        collector_wrapper_conf_raw = collector.get('wrapperConf') or ''
        collector_wrapper_conf = collector_wrapper_conf_raw.replace('\\r\\n', '\r\n')

        collector_statuschecks = lm_rest('GET', f'/setting/collector/collectors/{collector_id}/services/getStatusCheck', '', '?')
        for collector_statuscheck in collector_statuschecks['checkPoints']:
            if collector_statuscheck['name'] == "basic.Role":
                collector_role = collector_statuscheck.get('detail')
            if collector_statuscheck['name'] == "isAdmin":
                collector_isadmin = collector_statuscheck.get('detail')
            if collector_statuscheck['name'] == 'basic.User':
                collector_username = collector_statuscheck.get('detail')
            if collector_statuscheck['name'] == 'biz.InstanceCount':
                collector_instances_total = collector_statuscheck.get('detail')
                if collector_instances_total is not None:
                    collector_instances_total = collector_instances_total.replace('The collector has ','').replace(' instances','')
            if collector_statuscheck['name'] == 'cogs.QueueStats':
                for stat in collector_statuscheck['detail']:
                    if stat['collectorName'] == 'batchscript':
                        instance_batchscript_count = stat.get('instanceCount', 0)
                    if stat['collectorName'] == 'snmp':
                        instance_snmp_count = stat.get('instanceCount', 0)
                    if stat['collectorName'] == 'script':
                        instance_script_count = stat.get('instanceCount', 0)
                    if stat['collectorName'] == 'configsource.script':
                        instance_configsourcescript_count = stat.get('instanceCount', 0)
                    if stat['collectorName'] == 'wmi':
                        instance_wmi_count = stat.get('instanceCount', 0)
        
        for line in io.StringIO(collector_wrapper_conf):
            # Split on the first "=" only and strip any trailing line ending
            if line.startswith('wrapper.java.initmemory'):
                collector_wrapper_minmem = line.split("=", 1)[1].strip()
            if line.startswith('wrapper.java.maxmemory'):
                collector_wrapper_maxmem = line.split("=", 1)[1].strip()
        


        # collector_conf_raw = collector.get('collectorConf')
        # collector_conf = collector_conf_raw.replace('\\r\\n', '\r\n')
        # for line in io.StringIO(collector_conf):
        #     if line.startswith('groovy.script.runner'):
        #         collector_groovyscript_runner = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.script.asynchronous'):
        #         collector_script_asynchronous = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.mongo.threadpool'):
        #         collector_mongo_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.dns.threadpool'):
        #         collector_dns_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.ping.threadpool'):
        #         collector_ping_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.snmp.threadpool'):
        #         collector_snmp_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.script.threadpool'):
        #         collector_script_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.sdkscript.threadpool'):
        #         collector_sdkscript_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.batchscript.threadpool'):
        #         collector_batchscript_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.jdbc.threadpool'):
        #         collector_jdbc_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.webpage.threadpool'):
        #         collector_webpage_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.openmetrics.threadpool'):
        #         collector_openmetrics_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.perfmon.threadpool'):
        #         collector_perfmon_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.wmi.threadpool'):
        #         collector_wmi_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.netapp.threadpool'):
        #         collector_netapp_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.memcached.threadpool'):
        #         collector_memcached_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.jmx.threadpool'):
        #         collector_jmx_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.datapump.threadpool'):
        #         collector_datapump_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.esx.threadpool'):
        #         collector_esx_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.xen.threadpool'):
        #         collector_xen_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.udp.threadpool'):
        #         collector_udp_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.tcp.threadpool'):
        #         collector_tcp_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('collector.cim.threadpool'):
        #         collector_cim_threadpool = line.replace('\r\n', '').split("=")[1]
        #     if line.startswith('configcollector.script.threadpool'):
        #         collector_configcollector_script_threadpool = line.replace('\r\n', '').split("=")[1]

        collectorconfigs.append({'group': collector_group, 
                                    'description': collector_description, 
                                    'numberOfHosts': collector_numberOfHosts, 
                                    'collector_id': collector_id,
                                    'size': collector_size,
                                    'escalatingChainId': collector_escalatingChainId,
                                    'maxmem': collector_wrapper_maxmem, 
                                    'minmem': collector_wrapper_minmem, 
                                    'ea': collector_ea,
                                    'status': collector_status, 
                                    'build': collector_build, 
                                    'canDowngrade': collector_canDowngrade, 
                                    'platform': collector_platform, 
                                    'isDown': collector_isDown,
                                    'collectorRole': collector_role,
                                    'isAdmin': collector_isadmin,
                                    'username': collector_username,
                                    'instances': collector_instances_total,
                                    'instanceBatchscriptCount': instance_batchscript_count,
                                    'instanceSnmpCount': instance_snmp_count,
                                    'instanceScriptCount': instance_script_count,
                                    'instanceConfigsourceScriptCount': instance_configsourcescript_count,
                                    'instanceWmiCount': instance_wmi_count
                                    # 'script_asynchronous': collector_script_asynchronous, 
                                    # 'groovy_script_runner': collector_groovyscript_runner,
                                    # 'mongo_threadpool': collector_mongo_threadpool, 
                                    # 'collector_configcollector_script_threadpool': collector_configcollector_script_threadpool,
                                    # 'dns_threadpool': collector_dns_threadpool, 
                                    # 'ping_threadpool': collector_ping_threadpool, 
                                    # 'snmp_threadpool': collector_snmp_threadpool, 
                                    # 'script_threadpool': collector_script_threadpool, 
                                    # 'sdkscript_threadpool': collector_sdkscript_threadpool, 
                                    # 'batchscript_threadpool': collector_batchscript_threadpool, 
                                    # 'jdbc_threadpool': collector_jdbc_threadpool, 
                                    # 'webpage_threadpool': collector_webpage_threadpool, 
                                    # 'openmetrics_threadpool': collector_openmetrics_threadpool, 
                                    # 'perfmon_threadpool': collector_perfmon_threadpool, 
                                    # 'wmi_threadpool': collector_wmi_threadpool, 
                                    # 'netapp_threadpool': collector_netapp_threadpool, 
                                    # 'memcached_threadpool': collector_memcached_threadpool, 
                                    # 'jmx_threadpool': collector_jmx_threadpool, 
                                    # 'datapump_threadpool': collector_datapump_threadpool, 
                                    # 'esx_threadpool': collector_esx_threadpool, 
                                    # 'xen_threadpool': collector_xen_threadpool, 
                                    # 'udp_threadpool': collector_udp_threadpool, 
                                    # 'tcp_threadpool': collector_tcp_threadpool, 
                                    # 'cim_threadpool': collector_cim_threadpool 
                                })



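    # return value is converted to JSON
    return {"collectors": collectorconfigs}


if __name__ == "__main__":
    # Print the result as JSON when the script is run directly
    print(json.dumps(main(), indent=2))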
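To judge whether a group needs rebalancing, the per-collector output of the script above can be rolled up by collector group. The following is only a rough sketch: it assumes the 'group', 'description' and 'instances' keys produced by the script, and that 'instances' is a plain number stored as a string (or None). The group names, collector names and counts in the sample are made up for illustration.

```python
from collections import defaultdict


def to_int(value):
    """Convert the script's 'instances' value to an int, treating bad or missing data as 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def summarize_groups(collectors):
    """Roll up instance counts per collector group."""
    groups = defaultdict(list)
    for c in collectors:
        groups[c.get("group")].append((c.get("description"), to_int(c.get("instances"))))

    summary = {}
    for name, members in groups.items():
        counts = [count for _, count in members]
        summary[name] = {
            "total": sum(counts),
            # (description, count) of the most heavily loaded collector
            "busiest": max(members, key=lambda m: m[1]),
            # gap between the most and least loaded collector in the group
            "spread": max(counts) - min(counts),
        }
    return summary


# Hypothetical sample shaped like the script's return value
sample = {"collectors": [
    {"group": "ABCG-East", "description": "col-01", "instances": "9200"},
    {"group": "ABCG-East", "description": "col-02", "instances": "4100"},
    {"group": "Default", "description": "col-03", "instances": None},
]}
result = summarize_groups(sample["collectors"])
for group, stats in result.items():
    print(group, stats)
```

A large "spread" within one group is a hint that a forced rebalance or a different threshold may be worth trying.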

Cole_McDonald
Professor
5 months ago
Here's the script I ended up creating and using a few jobs ago. I often use PropertySources as maintenance routines, since they can run once a day against a dedicated scripting collector. This one helps preen device loads across any ABCG (auto-balanced collector group) in the portal indicated in the script: Add device to Collector with least devices in rest api | LogicMonitor - 5975
DinoReinadi
Neophyte
5 months ago
I created this and run it from time to time...
befuddled
Neophyte
12 months ago
I have a simple (read:basic) Excel spreadsheet that does this.

You could do the same, or I'll happily share the spreadsheet if you'd like.

It's based on this page:
https://www.logicmonitor.com/support/collectors/collector-groups/auto-balanced-collector-groups
And specifically, this equation from that page:
Number of instances = (Target_Collector_mem / Medium_mem)^(1/2) * Medium_Threshold

The spreadsheet (which was only created for me, so like I said, it's basic) requires you to enter the current number of instances on each collector and the collector memory size, and it calculates the rebalance threshold.
In my testing, it works well.
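
For anyone who would rather script this than use a spreadsheet, the equation quoted above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet itself: the 4 GB medium memory size, the 10000 medium threshold and the list of memory sizes are assumed values, and truncating to a whole number is a choice made here, so substitute the figures from your own portal.

```python
import math


def scaled_threshold(target_mem_gb, medium_mem_gb, medium_threshold):
    """Instance threshold for a collector, scaled from the medium-collector threshold.

    Implements: (Target_Collector_mem / Medium_mem)^(1/2) * Medium_Threshold
    """
    ratio = target_mem_gb / medium_mem_gb
    # Truncate to a whole number of instances (an assumption of this sketch)
    return int(math.sqrt(ratio) * medium_threshold)


# Assumed values, for illustration only
MEDIUM_MEM_GB = 4
MEDIUM_THRESHOLD = 10000

for mem_gb in (2, 4, 8, 16):
    print(mem_gb, "GB ->", scaled_threshold(mem_gb, MEDIUM_MEM_GB, MEDIUM_THRESHOLD))
```

Because of the square root, doubling a collector's memory raises its threshold by a factor of about 1.41 rather than 2.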
- SilasRibas
  Neophyte
  9 months ago
  Can you share the Excel, please?
Mike_Moniz
Professor
12 months ago
What do you mean by an Auto Balance Collector Tool?
- Amit_Ojha
  Neophyte
  12 months ago
  For example, if we need to calculate the instance limit per collector, how and where do we calculate it? There is a tool available, but it is only accessible to LM support techs; it's not available to customers.
  - Joe_Williams
    Professor
    12 months ago
    When 2 or more collectors are in an auto balance group, you can see the number of instances the collector is monitoring in the UI. You can then edit the group itself, expand the advanced options and put in a new auto balance threshold.
    Once you do that, you can force a rebalance.
    
    If the collectors are already in an auto balance group, you can also get the number of instances via the API.
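
As a rough sketch of the API route: the script posted elsewhere in this thread reads the instance count from the "biz.InstanceCount" check point of a collector's getStatusCheck response, where the detail is a sentence such as "The collector has N instances". The helper below only parses such a response. The payload shape is assumed from that script, and the sample values are made up.

```python
import re


def instance_count_from_status(status_check):
    """Return the count reported by the biz.InstanceCount check point, or None."""
    for check in status_check.get("checkPoints", []):
        if check.get("name") == "biz.InstanceCount":
            # Pull the first run of digits out of the detail sentence
            match = re.search(r"(\d+)", str(check.get("detail", "")))
            return int(match.group(1)) if match else None
    return None


# Hypothetical payload, trimmed to the fields the script in this thread reads
sample = {"checkPoints": [
    {"name": "basic.Role", "detail": "active"},
    {"name": "biz.InstanceCount", "detail": "The collector has 8421 instances"},
]}
print(instance_count_from_status(sample))
```

Returning None rather than 0 when the check point is absent keeps "no data" distinguishable from a collector that really has no instances.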
