Monitoring HAProxy?


I've got a trio of load balancers running HAProxy that I need to monitor, and I found the HAProxy module and installed it. I verified using the 'Test AppliesTo' button and it found all 3 servers, so I assume that means it's been associated correctly? It's been a few days and the resources aren't displaying any HAProxy-related info, nor do I have a dropdown (I don't know the correct term) under the resource itself like there is for CPU, Disks, etc.

Second question: reading the description here (https://www.logicmonitor.com/integrations/ha-proxy), am I correct in assuming that the only stat this module will report is sessions? If so, it's missing a ton of important stats...

 

Thanks!!



Yes, the only datapoint that DataSource tracks is sessions. It would seem the active discovery is not returning anything in your case. Navigate to the DataSource (the same place you tested the AppliesTo) and click the "Test Active Discovery" button. If you see no results there, that's your problem. 

This DS uses the HTTP discovery method, meaning that discovery involves pulling up a web page and scraping it for the instances (that's the word you were looking for). In this case, it's looking at https://[hostname/ip]/haproxy?stats and scraping for anything matching the RegEx <th colspan=2 class=.pxname.>(.*?)</th>. I would start by hitting the stats-enabled frontend on one of your HAProxy boxes to see if the page loads. If it does not, you probably need to add that frontend to your haproxy config.
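
If you want to sanity-check that scrape outside of LM, a quick sketch along these lines should do it (the hostname and scheme are placeholders for your setup):

import re
import requests

# Fetch the stats page and apply the same RegEx the DS uses for discovery.
# "myhaproxy" is a placeholder hostname; adjust scheme/port to match yours.
page = requests.get("http://myhaproxy/haproxy?stats").text
instances = re.findall(r"<th colspan=2 class=.pxname.>(.*?)</th>", page)
print(instances)  # should list your frontends/backends if discovery can work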

It's possible that this used to be enabled out of the box in older versions of HAProxy and that newer versions require you to explicitly configure it.

Once you get the page loading in your browser, you might need to make some changes to the DS to get the discovery to pull the page correctly. Once that's working, it looks like it shouldn't be hard to get the other stats from the table on that page. You'll just have to get real familiar with RegEx.

I just got HAProxy up and running in Docker and I'll take a look today during any free time I have to see what can be done to pull some of the other stats. Did you manually add the haproxy category to your servers or was it discovered? I'm not aware of a PropertySource that auto-discovers HAProxy installed on devices, but it wouldn't be the first time there's a PropertySource I'm unaware of.

Ok, I think I have something for you. Using this haproxy.cfg file:

frontend stats
    bind :8404
    mode            http
    log             global
    maxconn 10

    timeout client  100s
    timeout server  100s
    timeout connect 100s
    timeout queue   100s

    stats enable
    stats hide-version
    stats refresh 30s
    stats show-node
    stats uri  /haproxy?stats

frontend mysite
frontend hissite
frontend theothersite
frontend google.com

 

I was able to write a DS to pull in 56 different datapoints for each frontend. Your mileage may vary. My /haproxy?stats is running on port 80, not 8404 (it's running inside a container, where the container runtime remaps host port 80 to container port 8404). Either way, you can add a property to the host called "haproxy.port" to specify a port other than 80 that your stats page is running on. I'll be publishing this to the Exchange shortly, where it will need to undergo code review, but here it is in the meantime: https://github.com/sweenig/lmcommunity/tree/master/haproxy_2_4

FYI, instead of scraping the HTML like the old version did, I dove into the JSON version of the data. I don't know if this just wasn't available in previous versions of HAProxy, or if someone thought it was easier to scrape the HTML. Either way, it necessitates a new DS, since the collection method changes from WEBPAGE to BATCHSCRIPT. You should be able to import it into your portal without changing the existing HAProxy DS. Once you get it working, you can delete the existing HAProxy DS.
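
For the curious, here's a rough sketch of what pulling the JSON stats looks like (this assumes HAProxy 1.8+, where appending ";json" to the stats URI returns JSON; the host and port are placeholders):

import json
import requests

# The JSON variant of the stats page lives at the stats URI with ";json"
# appended. Each element is a list of field objects for one proxy/server.
resp = requests.get("http://myhaproxy:8404/haproxy?stats;json")
for obj in json.loads(resp.text):
    fields = {f["field"]["name"]: f["value"].get("value") for f in obj}
    print(fields.get("pxname"), fields.get("svname"), fields.get("scur"))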

Stuart, do you happen to have a CSV version of this DataSource?

What do you mean a CSV version?

So, since I run multiple HAProxy processes, each has its own stats page. With the JSON format I was getting weird values, as stats would pull from different processes, so my graph would be bouncing around. I finally figured out that there is a Lua script that can pull stats from all the processes and aggregate them to CSV.

So my main HAProxy process page shows this output

And I'm wanting to parse it for the metrics I want.

<from my Lua stats page>

http://myserver:8888

pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,agg,
stats-aggregate,FRONTEND,,,0,4,800000,7,4505,44273,0,0,0,,,,,OPEN/OPEN/OPEN/OPEN,,,,,,,,,4/2/2/2,8/8/8/8,,,,,0,0,0,4,,,,0,13,0,0,0,0,,0,8,13,,,0,0,0,0,,,,,,,,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
BILLY_front,FRONTEND,,,5,21,800000,379,821472,7749620993,0,0,0,,,,,OPEN/OPEN/OPEN/OPEN,,,,,,,,,4/2/2/2,2/2/2/2,,,,,0,0,0,21,,,,0,0,0,0,0,0,,0,0,0,,,0,0,0,0,,,,,,,,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
stats-2,FRONTEND,,,1,6,600000,1350,80940,4664196,0,0,0,,,,,OPEN/OPEN/OPEN,,,,,,,,,2/2/2,5/5/5,,,,,0,3,0,9,,,,0,1349,0,0,0,0,,3,9,1350,,,0,0,0,0,,,,,,,,statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
stats-4,FRONTEND,,,1,2,200000,1103,66120,3879247,0,0,0,,,,,OPEN,,,,,,,,,4,7,,,,,0,1,0,4,,,,0,1102,0,0,0,0,,1,4,1103,,,0,0,0,0,,,,,,,,statistics-cpu-4,
stats-aggregate,BACKEND,0,0,0,0,80000,0,4505,44273,0,0,,0,0,0,0,UP/UP/UP/UP,0,0,0,,0,3417.0,0,,4/2/2/2,8/8/8/8,,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,725.0,,,0.0,0.0,0.0,4.25,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
BILLY_back,BACKEND,0,0,5,21,80000,379,821472,7749620993,0,0,,146,0,48,0,UP/UP/UP/UP,16,16,0,,4,1706.75,48.25,,4/2/2/2,3/3/3/3,,,249,,1,0,,21,,,,0,0,0,0,0,0,,,,,4,0,0,0,0,0,1321.25,,,0.0,0.0,0.0,131623.25,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
BILLY_back,FOX1,0,0,1,6,0,53,198554,2433438274,,0,,0,0,0,0,UP/UP/UP/UP,4,4,0,12,4,1700.0,56.0,0,4/2/2/2,3/3/3/3,2/2/2/2,0,53,0,2,0,,4,L4OK/L4OK/L4OK/L4OK,,0/0/0/0,0,0,0,0,0,0,,,,,0,0,,,,,2569.5,,,0.0,0.0,0.0,1252990.75,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
BILLY_back,FOX0,0,0,2,6,0,39,97291,355396905,,0,,6,0,18,0,UP/UP/UP/UP,4,4,0,12,4,1706.75,48.5,0,4/2/2/2,3/3/3/3,1/1/1/1,0,21,0,2,0,,5,L4OK/L4OK/L4OK/L4OK,,0/0/0/0,0,0,0,0,0,0,,,,,0,0,,,,,1740.5,,,0.0,0.0,0.0,461095.0,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
BILLY_back,FOX3,0,0,1,6,0,123,196964,2033573869,,0,,7,0,21,0,UP/UP/UP/UP,4,4,0,12,4,1691.0,64.0,0,4/2/2/2,3/3/3/3,4/4/4/4,0,102,0,2,0,,4,L4OK/L4OK/L4OK/L4OK,,0/0/0/0,0,0,0,0,0,0,,,,,1,0,,,,,1319.0,,,0.0,0.0,0.0,4631.0,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
BILLY_back,FOX2,0,0,1,6,0,82,328663,2927211945,,0,,3,0,9,0,UP/UP/UP/UP,4,4,0,12,4,1695.75,60.0,0,4/2/2/2,3/3/3/3,3/3/3/3,0,73,0,2,0,,4,L4OK/L4OK/L4OK/L4OK,,0/0/0/0,0,0,0,0,0,0,,,,,3,0,,,,,1327.75,,,0.0,0.0,0.0,76763.25,statistics-cpu-4/statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
stats-2,BACKEND,0,0,0,0,60000,0,80940,4664196,0,0,,0,0,0,0,UP/UP/UP,0,0,0,,0,3417.0,0,,2/2/2,5/5/5,,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,0.0,,,0.0,0.0,1.0,1.0,statistics-cpu-1/statistics-cpu-3/statistics-cpu-2,
stats-4,BACKEND,0,0,0,0,20000,0,66120,3879247,0,0,,0,0,0,0,UP,0,0,0,,0,3417.0,0,,4,7,,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,0.0,,,0.0,0.0,0.0,1.0,statistics-cpu-4,



i.e., I want to pull the scur value for BILLY_back / FOX1.
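
Something like this rough sketch is what I'm after, using the header row so the scur column can be grabbed by name (the URL is my Lua page from above, as an example):

import csv
import io
import requests

# Use the header row so columns can be addressed by name instead of index.
r = requests.get("http://myserver:8888/")
reader = csv.DictReader(io.StringIO(r.text))
for row in reader:
    if row["pxname"] == "BILLY_back" and row["svname"] == "FOX1":
        print(row["scur"])  # current sessions for that server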

I may just do this with Python and SNMP; it seems like a much simpler approach, but it requires code on the servers.

 

15 hours ago, danp said:

So, since I run multiple HAProxy processes, each has its own stats page

Is each page at its own address (port)? If so, we should be able to easily modify the discovery and collection scripts to pull from each one.

14 hours ago, danp said:

I may just do this with Python and SNMP; it seems like a much simpler approach, but it requires code on the servers

If that's an option, you can try it. If it's pure SNMP, you might try the no-code option of building an SNMP DataSource in LM. 

Yes, each process will require its own stats page.

What we found is that when we ran multiple processes, the master HAProxy stats page would pull randomly from one of the running processes. Thus our stats would look like 32 current_sessions, then a second later would read 15 current_sessions, so the graph was skewed, when we really needed 32+15 for total sessions.

We used Lua to aggregate the stats as shown here: https://discourse.haproxy.org/t/lua-solution-for-stats-aggregation-and-centralization/27

Thus it creates a master aggregate page (ours is on port 8880) which dumps the CSV.

I ended up just doing a simple Python script to pull the stats back as key-value pairs and extending SNMP to pull them (it was the easy solution):

 

import requests
import io
import csv

# Pull the aggregated CSV from the Lua stats page (running locally on 8880).
r = requests.get('http://127.0.0.1:8880/')
f = io.StringIO(r.text)
reader = csv.reader(f, delimiter=',')
for row in reader:
    # For each server under the BILLY_back proxy, emit current sessions
    # (scur) and status as KEY=VALUE lines for the SNMP extension to return.
    if row[0] == 'BILLY_back':
        print(f"{row[1]}_SessCur={row[4]}\n{row[1]}_Status={row[22]}")


This returns clean key-value pairs, which can easily be exposed via an SNMP extension and pulled into a very simple DataSource:
 

BACKEND_SessCur=60
BACKEND_Status=4
FOX1_SessCur=15
FOX1_Status=4
FOX0_SessCur=15
FOX0_Status=4
FOX3_SessCur=15
FOX3_Status=4
FOX2_SessCur=15
FOX2_Status=4
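
For reference, the SNMP extension side is just a net-snmp "extend" line in snmpd.conf pointing at that script; something like this (the script path here is an example, not my real one):

# /etc/snmp/snmpd.conf
extend haproxy /usr/bin/python3 /usr/local/bin/haproxy_stats.py

If I remember right, the script's stdout then shows up under NET-SNMP-EXTEND-MIB::nsExtendOutLine, one line per key-value pair, which a simple SNMP DataSource can walk.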

 

Cool that it's working. I think it would be pretty easy to modify the existing DS to pull from separate pages; then LM can aggregate if you need it while also showing the individual stats. Is there a programmatic way to discover the addresses of all the pages?

I know all the addresses; they would be 8881-4, with the master aggregated page on 8880.
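
If we did want per-process collection, I imagine the discovery script would just probe those ports; a rough sketch, with a placeholder hostname and assuming each per-process page serves at its root like the aggregate does:

import requests

# Probe the known per-process stats ports and emit LogicMonitor-style
# "wildvalue##wildalias" instance lines for Active Discovery.
for port in range(8881, 8885):
    try:
        r = requests.get(f"http://myserver:{port}/", timeout=2)
        if r.ok:
            print(f"{port}##haproxy-proc-{port}")
    except requests.RequestException:
        pass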

I can see how we would be able to use the JSON slurper to pull the individual pages, but that just seems like wasted processing when we can just parse the aggregate page with some sort of web CSV slurper.

Like I said, if it's working, you're done, especially if you're not interested in per-process stats.

@Stuart Weenig Hey Stuart, coming back to this due to some internal changes. Your old DS worked great here; I never thanked you :shame: but thank you for it. So I went looking for it again today, and your lmcommunity repo seems to be gone?

Also checked the LM DS repo within the app and can't find any 2_4. Does it still exist? If so, can you point me towards it?


I left LM, lots of things changed. New repo with stuff I'm rebuilding is here.

Reply