Forum Discussion

Amit_Ojha
8 months ago

LM Instance Auto Balance Collector Tool

Can you please provide the LM instance auto-balance collector optimization tool in the customer portal, so that we can calculate and assign instances per collector to balance our collector loads?

Regards Amit

8 Replies

  • We use separate tooling to render the results as a table, but here is an example of the script we use to check on our collectors. The commented-out section in the loop lets you pull other settings from the collector's own config. We have used that to check, for example, thread counts to verify that our 2XL collectors are set up above the bare minimums.

    import os
    import requests
    import json
    import time
    import io
    
    
    def lm_rest(httpVerb, resourcePath, data, queryParams):
        """Used to easily interact with LogicMonitor v3 API
        Args:
            httpVerb (str): HTTP Method - GET, PUT, POST, PATCH or DELETE
            resourcePath (str): Resource path for API
            data (str): json formatted string for PUT, POST or PATCH
            queryParams (str): filter and field parameters
        Returns:
            list: For multiple devices/groups return
            dict: For single device/group return
        """
        
        # Which HTTP Methods are allowed
        validverbs = {'GET', 'PUT', 'POST', 'PATCH', 'DELETE'}
        # Which HTTP Methods require the data variable
        dataverbs = {'PUT', 'POST', 'PATCH'}
        # IF the HTTP Method supplied isn't in the list of allowed
        if httpVerb not in validverbs:
            raise ValueError("httpVerb must be one of %r." % validverbs)
    
        # Get environment variables needed (.get returns None if unset,
        # so the checks below can actually fire instead of a KeyError)
        company = os.environ.get("LM_COMPANY")
        lm_bearer = os.environ.get("LM_BEARER_TOKEN")
    
        # Raise exception if not set
        if not company:
            raise ValueError("Environment Variable: LM_COMPANY must be set!")
        if not lm_bearer:
            raise ValueError("Environment Variable: LM_BEARER_TOKEN must be set!")
        lm_bearer = "Bearer " + lm_bearer
        # Initialize variables
        count = 0
        done = 0
        allitems = []
    
        # If queryParams doesn't already start with "?", add it in front
        if not queryParams.startswith("?"):
            queryParams = "?" + queryParams
        while done == 0:
            data = str(data)
            # Use offset to paginate results
            queryParamsPagination = '&offset='+str(count)+'&size=1000'
    
            url = 'https://' + company + '.logicmonitor.com/santaba/rest' + resourcePath + queryParams + queryParamsPagination
    
            # Build Headers
            headers = {'Content-Type': 'application/json', 'Authorization': lm_bearer, 'X-Version': '3'}
            # Make request and check for errors
            # (requests.request dispatches on the verb, so one call covers every method)
            try:
                response = requests.request(httpVerb, url, data=data, headers=headers)
                response.raise_for_status()
            except requests.exceptions.HTTPError as err:
                raise SystemExit(err)
    
    
            if httpVerb != 'GET':
                parsed = json.loads(response.content)
                lm_return = parsed
                break
            else:
                # If a GET parse content get totals
                parsed = json.loads(response.content)
                total = parsed.get('total', 0)
                if total != 0:
                    items = parsed['items']
                else:
                    lm_return = parsed
                    done = 1
                    break
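                allitems = allitems + items
                numitems = len(items)
                count += numitems
                # Stop once everything is fetched, or if a page comes back empty
                # (otherwise a short result set would loop forever)
                if count >= total or numitems == 0:
                    done = 1
                    lm_return = allitems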
                else:
                    # Before fetching the next page, throttle if more than 5 requests
                    # have already been used in the current rate-limit window
                    returned_headers = response.headers
                    api_limit = int(returned_headers['x-rate-limit-limit'])
                    api_left = int(returned_headers['x-rate-limit-remaining'])
                    api_used = api_limit - api_left
                    if api_used > 5:
                        time.sleep(int(returned_headers['x-rate-limit-window']))
    
        return lm_return
    
    def main():
    
        # lm_rest appends offset and size itself, so no size parameter is needed here
        collectorgroups = lm_rest('GET', '/setting/collector/collectors', '', '?')
        collectorconfigs = []
        for collector in collectorgroups:
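            # Initialize instance counts and status-check fields.
            # If the collector doesn't monitor or use a collection type,
            # it doesn't exist in the collector status, so set defaults here
            # to avoid a NameError or a stale value from the previous collector.
            instance_batchscript_count = 0
            instance_snmp_count = 0
            instance_script_count = 0
            instance_configsourcescript_count = 0
            instance_wmi_count = 0
            collector_role = None
            collector_isadmin = None
            collector_username = None
            collector_instances_total = None
            collector_wrapper_minmem = None
            collector_wrapper_maxmem = None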
    
            collector_description = collector.get('description')
            collector_id = collector.get('id')
            collector_group = collector.get('collectorGroupName')
            collector_numberOfHosts = collector.get('numberOfHosts')
            collector_size = collector.get('collectorSize')
            collector_ea = collector.get('ea')
            collector_status = collector.get('status')
            collector_build = collector.get('build')
            collector_canDowngrade = collector.get('canDowngrade')
            collector_platform = collector.get('platform')
            collector_isDown = collector.get('isDown')
            collector_escalatingChainId = collector.get('escalatingChainId')
            # Fall back to an empty string if wrapperConf is missing
            collector_wrapper_conf_raw = collector.get('wrapperConf') or ''
            collector_wrapper_conf = collector_wrapper_conf_raw.replace('\\r\\n', '\r\n')
    
            collector_statuschecks = lm_rest('GET', f'/setting/collector/collectors/{collector_id}/services/getStatusCheck', '', '?')
            for collector_statuscheck in collector_statuschecks['checkPoints']:
                if collector_statuscheck['name'] == "basic.Role":
                    collector_role = collector_statuscheck.get('detail')
                if collector_statuscheck['name'] == "isAdmin":
                    collector_isadmin = collector_statuscheck.get('detail')
                if collector_statuscheck['name'] == 'basic.User':
                    collector_username = collector_statuscheck.get('detail')
                if collector_statuscheck['name'] == 'biz.InstanceCount':
                    collector_instances_total = collector_statuscheck.get('detail')
                    if collector_instances_total is not None:
                        collector_instances_total = collector_instances_total.replace('The collector has ','').replace(' instances','')
                if collector_statuscheck['name'] == 'cogs.QueueStats':
                    for stat in collector_statuscheck['detail']:
                        if stat['collectorName'] == 'batchscript':
                            instance_batchscript_count = stat.get('instanceCount', 0)
                        if stat['collectorName'] == 'snmp':
                            instance_snmp_count = stat.get('instanceCount', 0)
                        if stat['collectorName'] == 'script':
                            instance_script_count = stat.get('instanceCount', 0)
                        if stat['collectorName'] == 'configsource.script':
                            instance_configsourcescript_count = stat.get('instanceCount', 0)
                        if stat['collectorName'] == 'wmi':
                            instance_wmi_count = stat.get('instanceCount', 0)
            
            for line in io.StringIO(collector_wrapper_conf):
                # strip() handles both \r\n and \n endings; maxsplit keeps any '=' in the value
                if line.startswith('wrapper.java.initmemory'):
                    collector_wrapper_minmem = line.strip().split("=", 1)[1]
                if line.startswith('wrapper.java.maxmemory'):
                    collector_wrapper_maxmem = line.strip().split("=", 1)[1]
            
    
    
            # collector_conf_raw = collector.get('collectorConf')
            # collector_conf = collector_conf_raw.replace('\\r\\n', '\r\n')
            # for line in io.StringIO(collector_conf):
            #     if line.startswith('groovy.script.runner'):
            #         collector_groovyscript_runner = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.script.asynchronous'):
            #         collector_script_asynchronous = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.mongo.threadpool'):
            #         collector_mongo_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.dns.threadpool'):
            #         collector_dns_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.ping.threadpool'):
            #         collector_ping_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.snmp.threadpool'):
            #         collector_snmp_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.script.threadpool'):
            #         collector_script_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.sdkscript.threadpool'):
            #         collector_sdkscript_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.batchscript.threadpool'):
            #         collector_batchscript_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.jdbc.threadpool'):
            #         collector_jdbc_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.webpage.threadpool'):
            #         collector_webpage_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.openmetrics.threadpool'):
            #         collector_openmetrics_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.perfmon.threadpool'):
            #         collector_perfmon_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.wmi.threadpool'):
            #         collector_wmi_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.netapp.threadpool'):
            #         collector_netapp_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.memcached.threadpool'):
            #         collector_memcached_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.jmx.threadpool'):
            #         collector_jmx_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.datapump.threadpool'):
            #         collector_datapump_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.esx.threadpool'):
            #         collector_esx_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.xen.threadpool'):
            #         collector_xen_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.udp.threadpool'):
            #         collector_udp_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.tcp.threadpool'):
            #         collector_tcp_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('collector.cim.threadpool'):
            #         collector_cim_threadpool = line.replace('\r\n', '').split("=")[1]
            #     if line.startswith('configcollector.script.threadpool'):
            #         collector_configcollector_script_threadpool = line.replace('\r\n', '').split("=")[1]
    
            collectorconfigs.append({'group': collector_group, 
                                        'description': collector_description, 
                                        'numberOfHosts': collector_numberOfHosts, 
                                        'collector_id': collector_id,
                                        'size': collector_size,
                                        'escalatingChainId': collector_escalatingChainId,
                                        'maxmem': collector_wrapper_maxmem, 
                                        'minmem': collector_wrapper_minmem, 
                                        'ea': collector_ea,
                                        'status': collector_status, 
                                        'build': collector_build, 
                                        'canDowngrade': collector_canDowngrade, 
                                        'platform': collector_platform, 
                                        'isDown': collector_isDown,
                                        'collectorRole': collector_role,
                                        'isAdmin': collector_isadmin,
                                        'username': collector_username,
                                        'instances': collector_instances_total,
                                        'instanceBatchscriptCount': instance_batchscript_count,
                                        'instanceSnmpCount': instance_snmp_count,
                                        'instanceScriptCount': instance_script_count,
                                        'instanceConfigsourceScriptCount': instance_configsourcescript_count,
                                        'instanceWmiCount': instance_wmi_count
                                        # 'script_asynchronous': collector_script_asynchronous, 
                                        # 'groovy_script_runner': collector_groovyscript_runner,
                                        # 'mongo_threadpool': collector_mongo_threadpool, 
                                        # 'collector_configcollector_script_threadpool': collector_configcollector_script_threadpool,
                                        # 'dns_threadpool': collector_dns_threadpool, 
                                        # 'ping_threadpool': collector_ping_threadpool, 
                                        # 'snmp_threadpool': collector_snmp_threadpool, 
                                        # 'script_threadpool': collector_script_threadpool, 
                                        # 'sdkscript_threadpool': collector_sdkscript_threadpool, 
                                        # 'batchscript_threadpool': collector_batchscript_threadpool, 
                                        # 'jdbc_threadpool': collector_jdbc_threadpool, 
                                        # 'webpage_threadpool': collector_webpage_threadpool, 
                                        # 'openmetrics_threadpool': collector_openmetrics_threadpool, 
                                        # 'perfmon_threadpool': collector_perfmon_threadpool, 
                                        # 'wmi_threadpool': collector_wmi_threadpool, 
                                        # 'netapp_threadpool': collector_netapp_threadpool, 
                                        # 'memcached_threadpool': collector_memcached_threadpool, 
                                        # 'jmx_threadpool': collector_jmx_threadpool, 
                                        # 'datapump_threadpool': collector_datapump_threadpool, 
                                        # 'esx_threadpool': collector_esx_threadpool, 
                                        # 'xen_threadpool': collector_xen_threadpool, 
                                        # 'udp_threadpool': collector_udp_threadpool, 
                                        # 'tcp_threadpool': collector_tcp_threadpool, 
                                        # 'cim_threadpool': collector_cim_threadpool 
                                    })
    
    
    
        # return value is converted to JSON
        return {"collectors": collectorconfigs}

     
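The dictionary returned by main() can be turned into a quick plain-text table for eyeballing load per collector. This is a minimal sketch, assuming the {"collectors": [...]} shape returned above; the format_collector_table helper and the sample rows are hypothetical and are not the tooling mentioned in the post.

```python
def format_collector_table(result, columns=("collector_id", "description", "size", "instances")):
    """Render the {'collectors': [...]} dict as a plain-text table, busiest collector first."""
    def instance_count(row):
        # 'instances' is a string parsed from the status check and may be None
        try:
            return int(row.get("instances") or 0)
        except ValueError:
            return 0

    rows = sorted(result["collectors"], key=instance_count, reverse=True)
    cells = [[str(row.get(col, "")) for col in columns] for row in rows]
    # Column width is the longest of the header and every cell in that column
    widths = [max([len(col)] + [len(r[i]) for r in cells]) for i, col in enumerate(columns)]

    lines = ["  ".join(col.ljust(w) for col, w in zip(columns, widths))]
    for r in cells:
        lines.append("  ".join(c.ljust(w) for c, w in zip(r, widths)))
    return "\n".join(lines)


# Hypothetical sample data, shaped like the output of main()
sample = {"collectors": [
    {"collector_id": 1, "description": "col-a", "size": "medium", "instances": "5200"},
    {"collector_id": 2, "description": "col-b", "size": "large", "instances": "11840"},
]}
print(format_collector_table(sample))
```

Sorting by instance count puts the most heavily loaded collectors at the top, which is usually where rebalancing decisions start.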

  • I have a simple (read: basic) Excel spreadsheet that does this.

    You could do the same, or I'll happily share the spreadsheet if you'd like.

    It's based on this page:
    https://www.logicmonitor.com/support/collectors/collector-groups/auto-balanced-collector-groups
    And specifically, this equation from that page:
    Number of instances = (Target_Collector_mem / Medium_mem)^(1/2) * Medium_Threshold

    The spreadsheet (which was only created for me, so like I said, it's basic) requires you to enter the current number of instances on each collector and the collector memory size, and it calculates the rebalance threshold. 

    In my testing, it works well. 
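For anyone who would rather script it than use a spreadsheet, the equation above can be sketched in a few lines of Python. The memory sizes per collector size and the medium threshold of 10000 are assumptions for illustration (check them against your own collectors), the rebalance_threshold name is hypothetical, and rounding down is my choice rather than something the equation specifies.

```python
import math


def rebalance_threshold(target_mem_gb, medium_mem_gb, medium_threshold):
    """Number of instances = (Target_Collector_mem / Medium_mem)^(1/2) * Medium_Threshold."""
    return int(math.sqrt(target_mem_gb / medium_mem_gb) * medium_threshold)


# Assumed memory per collector size, in GB; verify against your own collectors
ASSUMED_MEM_GB = {"small": 2, "medium": 4, "large": 8, "xl": 16, "xxl": 32}
MEDIUM_THRESHOLD = 10000  # example value only

for size, mem in ASSUMED_MEM_GB.items():
    print(size, rebalance_threshold(mem, ASSUMED_MEM_GB["medium"], MEDIUM_THRESHOLD))
```

Because the scaling uses a square root, doubling a collector's memory raises its threshold by about 41%, not 100%.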

    • Amit_Ojha

      For example, if we need to calculate the instance limit per collector, how and where do we calculate it? There is a tool for this, but it is only accessible to LM support techs; it's not available to customers.

      • Joe_Williams

        When 2 or more collectors are in an auto balance group, you can see the number of instances the collector is monitoring in the UI. You can then edit the group itself, expand the advanced options and put in a new auto balance threshold.
        Once you do that, you can force a rebalance.

        If the collectors are already in an auto balance group, you can also get the number of instances via the API.
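To choose the new threshold to enter in the group's advanced options, one approach is to invert the scaling equation quoted earlier in the thread: divide the group's total instance count by the summed scaling factors of its collectors, then add some headroom. This is only a sketch. The suggest_medium_threshold helper, the 4 GB medium memory figure, the 20% headroom, and the sample numbers are all assumptions, not LogicMonitor guidance.

```python
import math


def suggest_medium_threshold(collector_mem_gb, total_instances, medium_mem_gb=4, headroom=0.2):
    """Medium threshold at which the group's combined capacity covers total_instances plus headroom."""
    # Each collector's capacity is sqrt(mem / medium_mem) * medium_threshold,
    # so the group's capacity is medium_threshold times the sum of those factors
    scale = sum(math.sqrt(mem / medium_mem_gb) for mem in collector_mem_gb)
    return math.ceil(total_instances * (1 + headroom) / scale)


# Hypothetical group: two 8 GB collectors and one 16 GB collector sharing 30,000 instances
print(suggest_medium_threshold([8, 8, 16], 30000))
```

The instance counts fed into this can come from the UI or from the API, as described above.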