Fixing misconfigured Auto-Balanced Collector assignments

mray
10 months ago

I’ve seen this issue pop up a lot in support so I figured this post may help some folks out. I just came across a ticket the other day so it’s fresh on my mind! 

In order for Auto-Balanced Collector Groups (ABCG) to work properly, i.e. balance and failover, you have to make sure that the Collector Group is set to the ABCG and (and this is the important part) the Preferred Collector is set to “Auto Balance”. If it is set to an actual Collector ID, then it won’t get the benefits of the ABCG. 

You want Preferred Collector set to “Auto Balance”, not to a specific Collector ID.

Ok, so that’s cool but now the real question is how do you fix this?

There’s not really a good way to surface in the portal all devices where this is misconfigured. It’s not a system property so a report or AppliesTo query won’t help here… 

Fortunately, not all hope is lost! You can use the ✨API✨

When you GET a Resource/device, you get back some JSON, and what you want is for the autoBalancedCollectorGroupId field to equal the preferredCollectorGroupId field. If “Preferred Collector” is set to a specific Collector ID instead of “Auto Balance”, then autoBalancedCollectorGroupId will be 0.
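
As a minimal sketch of that check, assuming `device` is the parsed JSON for a single Resource and `abcg_ids` is a set of ABCG IDs you have already looked up (the helper name `is_misconfigured` and the sample payloads are made up for illustration):

```python
def is_misconfigured(device: dict, abcg_ids: set) -> bool:
    """Return True when the device belongs to an ABCG but is pinned to a
    specific Collector, i.e. autoBalancedCollectorGroupId came back as 0."""
    in_abcg = device.get("preferredCollectorGroupId") in abcg_ids
    pinned = device.get("autoBalancedCollectorGroupId", 0) == 0
    return in_abcg and pinned


# Hypothetical device payloads, trimmed to the two relevant fields.
pinned_device = {"id": 101, "preferredCollectorGroupId": 7, "autoBalancedCollectorGroupId": 0}
balanced_device = {"id": 102, "preferredCollectorGroupId": 7, "autoBalancedCollectorGroupId": 7}
other_group_device = {"id": 103, "preferredCollectorGroupId": 2, "autoBalancedCollectorGroupId": 0}
```

Passing the ABCG IDs in explicitly matters because devices in ordinary (non-auto-balanced) Collector Groups would presumably also report 0 and should not be flagged.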

Breaking it down step by step:

  1. First, get a list of all ABCG IDs
  2. Then, with any given ABCG ID, you can filter a device list for all devices where there’s this mismatch
  3. And now for each device returned, make a PATCH so that autoBalancedCollectorGroupId is now set to preferredCollectorGroupId
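
The steps above can be sketched as a dry run that only builds the filter and the PATCH payload, so you can review what would change before touching anything. The filter string and the field names are the ones used in the scripts in the comments; `plan_fixes` and the `list_devices` callable are hypothetical and would need to be wired to whichever API client you use.

```python
from typing import Callable, Iterable, List, Tuple


def device_filter(abcg_id: int) -> str:
    """Step 2: filter for devices in this ABCG that are pinned to a Collector."""
    return f"autoBalancedCollectorGroupId:0,preferredCollectorGroupId:{abcg_id}"


def patch_body(abcg_id: int) -> dict:
    """Step 3: payload that points the device back at the ABCG."""
    return {"autoBalancedCollectorGroupId": abcg_id}


def plan_fixes(abcg_ids: Iterable[int],
               list_devices: Callable[[str], List[dict]]) -> List[Tuple[int, dict]]:
    """Return (device_id, patch_body) pairs without changing anything.

    `list_devices` is a hypothetical callable that takes a filter string
    and returns the matching device dicts.
    """
    plan = []
    for abcg_id in abcg_ids:
        for device in list_devices(device_filter(abcg_id)):
            plan.append((device["id"], patch_body(abcg_id)))
    return plan
```

Printing the plan first gives you a chance to spot devices you would rather leave alone before any PATCH is sent.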

Here’s a link to the full script, written in Python for you to check out. I’ll also add it below in a comment since this is already getting long.

Do you have a better, easier, or more efficient way of doing this? I’d love to hear about it! 

  • Anonymous

    This uses my lmwrapper to make authentication and setup of the api object easy. It also uses pagination and skips collectors (in my environment due to naming standard):

    from lm import lm

    try:
        # Page through every Auto-Balanced Collector Group (ABCG)
        collector_groups = []
        end_found = False
        offset = 0
        size = 1000
        while not end_found:
            # size and offset must be passed, otherwise only the first page is ever fetched
            current = lm.get_collector_group_list(size=size, offset=offset, filter='autoBalance:true').items
            collector_groups += current
            offset += len(current)
            end_found = len(current) != size
        for abcg_id, abcg_name in [(x.id, x.name) for x in collector_groups]:
            print(f"Fixing ABCG {abcg_name} ({abcg_id})...")
            try:
                # Page through devices in this ABCG that are pinned to a specific Collector
                devices = []
                end_found = False
                offset = 0
                size = 1000
                while not end_found:
                    current = lm.get_device_list(size=size, offset=offset, filter=f"autoBalancedCollectorGroupId:0,preferredCollectorGroupId:{abcg_id}").items
                    devices += current
                    offset += len(current)
                    end_found = len(current) != size
                if len([x for x in devices if x.display_name[1] != "c"]) > 0:  # this skips collectors in my environment; ymmv
                    for device in devices:
                        if device.display_name[1] != "c":  # this skips collectors in my environment; ymmv
                            print(f"\tFixing device {device.display_name} ({device.id})")
                            try:
                                response = lm.patch_device(id=device.id, body={"autoBalancedCollectorGroupId": abcg_id}, op_type="replace")
                            except Exception as e:
                                print(f"There was an error patching {device.display_name}: {e}")
                else:
                    print(f"\tAll devices in {abcg_name} are assigned to the ABCGroup")
            except Exception as e:
                print(f"There was an error fetching the list of devices from \"{abcg_name}\" that needed fixing: {e}")
    except Exception as e:
        print(f"There was an error fetching the list of ABCGroups: {e}")
  • mray

    The call to the collector group and the call to fetch devices also needs pagination.

    Yep yep. Definitely more of an example script than anything production ready. It still needs pagination, a check for Collector Resources, and a flag to take in a subset of Collector IDs you want to fix (and not just all). Probably some other stuff too that I can’t think of!

  • mray

    There should probably be something in there that avoids messing with the assignment of the collector to itself.

    I agree, just prevent this altogether. Have the preferred Collector be just greyed out, or the dropdown not list anything… but then that might lead to issues because (and this is something I neglected to mention above) you actually do want the Collector hard set for the Collector host Resources. I should prob update my script to skip over Collector Resources… but I think my main point still comes across -- really just wanting to highlight how to spot these types of things.

    There may also be some cases (mostly troubleshooting I think) where you might want to force a reassignment. So maybe the ‘reset’ button is the way to go 🤷

  • Anonymous

    The call to the collector group and the call to fetch devices also needs pagination.

  • Anonymous

    There should probably be something in there that avoids messing with the assignment of the collector to itself.

  • mray

    Everybody please submit feedback asking for a method on the ABCGroup to “reset” devices assigned to the collectors in the group to the ABCGroup.

    Yes! The more feedback, the better! 🚀

  • Anonymous

    Everybody please submit feedback asking for a method on the ABCGroup to “reset” devices assigned to the collectors in the group to the ABCGroup. We should not have to build this functionality ourselves through Python/API. The method should open up a modal showing the list of all devices in a box on the left-hand side. You move devices from the left side to the right side to include them in resetting their assignment. Typical filters should exist allowing easy selection of all, by collector, by hostname, etc.

  • mray

    Using the LM SDK makes the code much cleaner:

    from __future__ import print_function
    import logicmonitor_sdk
    from logicmonitor_sdk.rest import ApiException


    # Configure API authorization: Bearer token
    configuration = logicmonitor_sdk.Configuration()
    configuration.company = ''
    configuration.auth_type = 'Bearer'
    configuration.bearer_token = ''

    # Create an instance of the API class
    api_instance = logicmonitor_sdk.LMApi(
        logicmonitor_sdk.ApiClient(configuration))


    def main():
        # Get list of auto-balanced collector group ids
        abcg_id_list = get_abcg_ids()

        if abcg_id_list:
            # For each ABCG id, get the list of devices in that group
            # For each device, set autoBalancedCollectorGroupId to preferredCollectorGroupId
            for abcg_id in abcg_id_list:
                print("Fixing ABCG " + str(abcg_id))
                fix_resource_collectors(abcg_id)
        else:
            print("No ABCG IDs found.")


    def get_abcg_ids():
        # Returns a list of ABCG ids, or None if the API call fails
        try:
            response = api_instance.get_collector_group_list(
                filter='autoBalance:true')
            id_list = [item.id for item in response.items]
            return id_list
        except ApiException as e:
            print("Exception when calling CollectorGroupsApi->getCollectorGroupList: %s\n" % e)


    def fix_resource_collectors(abcg_id):
        try:
            # Devices in this ABCG whose Preferred Collector is a specific Collector
            device_filter = 'autoBalancedCollectorGroupId:0,preferredCollectorGroupId:' + \
                str(abcg_id)
            response = api_instance.get_device_list(filter=device_filter)
            for device in response.items:
                device_id = device.id
                print("Updating deviceID " + str(device_id))
                body = {"autoBalancedCollectorGroupId": abcg_id}
                api_instance.patch_device(id=device_id, body=body)
        except ApiException as e:
            print(
                "Exception when calling DevicesApi: %s\n" % e)


    if __name__ == "__main__":
        main()

    Link to code on github

  • mray

    Full Python script:

    #!/usr/bin/env python3

    import requests

    # Account Info
    portal = ''
    bearer = ''
    auth = 'bearer ' + bearer
    headers = {'Content-Type': 'application/json',
               'X-version': '3', 'Authorization': auth}


    def main():
        # Get list of auto-balanced collector group ids
        path = '/setting/collector/groups'
        params = '?filter=autoBalance:true'
        abcg_id_list = get_abcg_ids(path, params)

        if abcg_id_list:
            # For each ABCG id, get the list of devices in that group
            # For each device, set autoBalancedCollectorGroupId to preferredCollectorGroupId
            path = '/device/devices'
            params = '?filter=autoBalancedCollectorGroupId:0,preferredCollectorGroupId:'
            for abcg_id in abcg_id_list:
                print("Fixing ABCG " + str(abcg_id))
                fix_resource_collectors(path, params, abcg_id)
        else:
            print("No ABCG IDs found.")


    def get_abcg_ids(path, params):
        # Returns a list of ABCG ids, or None if the request fails
        try:
            url = 'https://' + portal + '.logicmonitor.com/santaba/rest' + path + params
            response = requests.get(url, headers=headers, verify=True)
            # Raises stored HTTPError, if one occurred.
            response.raise_for_status()
            response_json = response.json()
            id_list = [item['id'] for item in response_json['items']]
            return id_list
        except requests.HTTPError as http_err:
            print(f'HTTP error occurred: {http_err}')
        except Exception as err:
            print(f'Other error occurred: {err}')


    def fix_resource_collectors(path, params, abcg_id):
        try:
            abcg_id = str(abcg_id)
            url = 'https://' + portal + '.logicmonitor.com/santaba/rest' + \
                path + params + abcg_id
            response = requests.get(url, headers=headers, verify=True)
            # Raises stored HTTPError, if one occurred.
            response.raise_for_status()
            response_json = response.json()
            for device in response_json['items']:
                device_id = str(device['id'])
                print("Updating deviceID " + device_id)
                url = 'https://' + portal + \
                    '.logicmonitor.com/santaba/rest' + path + '/' + device_id
                data = '{"autoBalancedCollectorGroupId":' + abcg_id + '}'
                patch_response = requests.patch(url, data=data, headers=headers, verify=True)
                # Report a failed PATCH instead of silently ignoring it
                if not patch_response.ok:
                    print('PATCH failed for deviceID ' + device_id +
                          ': HTTP ' + str(patch_response.status_code))
        except requests.HTTPError as http_err:
            print(f'HTTP error occurred: {http_err}')
        except Exception as err:
            print(f'Other error occurred: {err}')


    if __name__ == "__main__":
        main()
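
Two gaps raised in the comments, pagination and leaving Collector host Resources alone, can be sketched as small helpers. This is a rough sketch rather than a tested fix: `fetch_all`, `fetch_page`, `is_collector_resource`, and `devices_to_fix` are hypothetical names, and the Collector check assumes the device JSON exposes a `systemProperties` list in which Collector hosts carry `system.collector` set to `"true"`. Confirm that against your own portal before relying on it.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int, int], List[dict]], size: int = 1000) -> List[dict]:
    """Collect every item by requesting pages until a short page comes back.

    `fetch_page(size, offset)` is a hypothetical callable wrapping one API
    call that passes `size` and `offset` and returns the page's items.
    """
    items: List[dict] = []
    offset = 0
    while True:
        page = fetch_page(size, offset)
        items.extend(page)
        if len(page) < size:  # a short (or empty) page means we reached the end
            return items
        offset += len(page)


def is_collector_resource(device: dict) -> bool:
    """True when the device's system properties mark it as a Collector host.

    Assumption: `systemProperties` is a list of {"name": ..., "value": ...}
    entries and Collector hosts have system.collector == "true".
    """
    for prop in device.get("systemProperties") or []:
        if prop.get("name") == "system.collector":
            return str(prop.get("value", "")).lower() == "true"
    return False


def devices_to_fix(devices: List[dict]) -> List[dict]:
    """Drop Collector hosts, which should stay pinned to their own Collector."""
    return [device for device in devices if not is_collector_resource(device)]
```

In either script, the device list for each ABCG would go through `fetch_all` first and `devices_to_fix` second, so only non-Collector devices ever reach the PATCH call.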