Forum Discussion

Charlie_Gough
5 years ago

Azure Site Recovery Monitoring

Hi All, 

I am currently working on a number of projects to start using LogicMonitor to monitor both on-premises and cloud environments. One area I am looking into is how to get more detail out of Azure and into LogicMonitor. The area I want to explore, and am struggling to find information on, is as follows:

 - Site Recovery: pulling in per-instance details on Replication Health, Agent Health, RPO, and Status

I understand there is the Azure Replication Jobs status, but this doesn't present a friendly view on the current health of the replicating machines.  Is there a way to pull in what is available in terms of Datapoints for the Azure Insights Collector?

  • I'd also be interested in this. I haven't really looked into it yet, so I will have a play and see whether it's possible and how.

  • Upvoted. We are working with support because we found that the cloud-based Azure Backup Job and Replication Job datasources generate alerts on errors, but those alerts cannot be remediated at the instance level because instances (backup job IDs) cannot be rerun. The datapoints do not seem to represent the overall result of all backup attempts during a cycle: secondary and tertiary attempts, which execute automatically if the first one fails, have their own unique backup job IDs and create new instances. It would be great if agent health could be isolated from overall job success for replication jobs, but I understand there may be a limitation on the data available through the API.

  • Upvoted. We have also been working with support on the issue Garry Gearhart describes.

    We receive alerts on the actual jobs (the backup / replication jobs themselves) and can look at remediating those. However, at the moment there is no feature that I know of within LogicMonitor for looking at the replication status of virtual machines within recovery services vaults in Azure. This applies both to machines running through an Azure Site Replication appliance and to configuration servers.

    We monitor multiple recovery services vaults, and they show attempted backup / replication jobs and other types of errors, but nothing shows the actual replication status of a vault as a whole.

    If one recovery services vault lists 13 servers under its site replication items, we want to be alerted on any machine that is not reporting as replicating successfully, regardless of "job" status.

    Can this become a feature?

    - Site Recovery: pulling in per-instance details on Replication Health, Agent Health, RPO, and Status

    I understand there is the Azure Replication Jobs status, but this doesn't present a friendly view on the current health of the replicating machines, or a view at all for that matter...

    Is there a way to pull in what is available in terms of Datapoints for the Azure Insights Collector?

    Is there any way at all to monitor replication health, agent health, RPO, status, etc. for all items within each individual recovery services vault in Azure?

    These features are GREATLY desired as this is arguably more important than the job status itself in any case.

    -Jonathan
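
As a possible starting point for the per-machine view asked about in this thread, here is a minimal sketch. It assumes the Azure Resource Manager "replication protected items" list endpoint for a Recovery Services vault, and that each item exposes `friendlyName`, `replicationHealth`, and `protectionState` under `properties`, with `rpoInSeconds` under `providerSpecificDetails` for some replication providers. The API version, the health value `"Normal"`, the 900-second RPO threshold, and the sample payload are all assumptions for illustration and should be checked against the Azure REST documentation. Agent details, where exposed, differ by replication provider and are left out. Fetching the URL needs an Azure AD bearer token, which is also out of scope here. This is not a LogicMonitor feature or datasource, only an outline of what a custom script could do.

```python
from urllib.parse import quote

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-08-01"  # assumption: verify against the Site Recovery REST docs


def protected_items_url(subscription_id, resource_group, vault_name,
                        api_version=API_VERSION):
    """Build the list URL for replication protected items in one vault."""
    return (
        f"{ARM_ENDPOINT}/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        f"/providers/Microsoft.RecoveryServices/vaults/{quote(vault_name, safe='')}"
        f"/replicationProtectedItems?api-version={api_version}"
    )


def summarize(payload):
    """Flatten a list response into one row per replicated machine."""
    rows = []
    for item in payload.get("value", []):
        props = item.get("properties") or {}
        # rpoInSeconds is provider-specific and may be absent.
        details = props.get("providerSpecificDetails") or {}
        rows.append({
            "name": props.get("friendlyName") or item.get("name"),
            "replication_health": props.get("replicationHealth"),
            "protection_state": props.get("protectionState"),
            "rpo_seconds": details.get("rpoInSeconds"),
        })
    return rows


def unhealthy(rows, max_rpo_seconds=900):
    """Return machines whose health is not "Normal" or whose RPO is too high.

    This ignores job status entirely, matching the "regardless of job
    status" requirement. The threshold is an arbitrary example value.
    """
    flagged = []
    for row in rows:
        bad_health = row["replication_health"] != "Normal"
        rpo = row["rpo_seconds"]
        bad_rpo = rpo is not None and rpo > max_rpo_seconds
        if bad_health or bad_rpo:
            flagged.append(row)
    return flagged


# Hypothetical response body, shaped like the assumed API output.
SAMPLE = {
    "value": [
        {"name": "item-1", "properties": {
            "friendlyName": "vm-01", "replicationHealth": "Normal",
            "protectionState": "Protected",
            "providerSpecificDetails": {"rpoInSeconds": 120}}},
        {"name": "item-2", "properties": {
            "friendlyName": "vm-02", "replicationHealth": "Critical",
            "protectionState": "Protected",
            "providerSpecificDetails": {"rpoInSeconds": 7200}}},
        {"name": "item-3", "properties": {
            "friendlyName": "vm-03", "replicationHealth": "Normal",
            "protectionState": "Protected",
            "providerSpecificDetails": {"rpoInSeconds": 1800}}},
    ]
}

if __name__ == "__main__":
    for row in unhealthy(summarize(SAMPLE)):
        print(row["name"], row["replication_health"], row["rpo_seconds"])
```

Each row could then be mapped to an instance with its own datapoints (health, protection state, RPO), so that alerting follows the machine rather than individual job IDs.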