Forum Discussion

Rodger_Keesee
11 years ago

How to monitor network health?

Hey guys,

We need to measure network health from our clients' LANs to our datacenter and vice versa. We'd like to gather QoS, UDP, and TCP stats (jitter, RTT, loss, etc.). The network health monitoring tools I've tested duplicate a lot of what LogicMonitor already does for us (alerting, collecting, reporting, etc.). What do you guys use to monitor network health that also leverages the power of LogicMonitor? Software that we could install on a desktop would be great.

8 Replies

  • Jeff_Behl
    Former Employee

    Hi Rodger -

    While certainly not providing you with all you're looking for, you might get some use out of the Ping (Multi) datasource. It will allow you to ping multiple IPs/hosts from a collector to see latency and packet loss. From its description:

    Ping multiple locations. Note: even though this can be applied to an arbitrary host, the source of the ping will always be the collector. This being the case, it makes the most sense to add instances to the host the collector is installed on. Where I've used it in the past is for monitoring both the internal and public IP address of a VPN endpoint from one data center to another. Having both of these addresses monitored allows me to quickly determine if there is an issue with the VPN tunnel itself, or with the ISP. I've also added the default route that the client router uses in order to see if the first hop from the client router (the ISP's router, presumably) is the issue. Add an instance for each destination hostname/IP. The Name and Wildcard fields should probably be the same, though the collector will try to ping the value of the Wildcard field, which can be a hostname or an IP address. DNS resolution will be from the perspective of the collector machine, so pinging internal host names is possible.

    Finally, do note that we support NetFlow, so if the equipment you are utilizing supports it, it can help. Our NetFlow will receive some very nice new features in the next few releases, so stay tuned...

    Jeff
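To make the latency/loss reading and the tunnel-versus-ISP reasoning above concrete, here is a minimal Python sketch. It is not how the Ping (Multi) datasource is implemented; the function names, the `None`-for-no-reply convention, and the 50% threshold are all assumptions made for illustration.

```python
from statistics import mean
from typing import Optional, Sequence


def summarize_pings(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize one polling cycle; None marks a probe that got no reply."""
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    return {
        "sent": sent,
        "received": len(replies),
        "loss_pct": loss_pct,
        "min_ms": min(replies) if replies else None,
        "avg_ms": mean(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }


def triage_vpn(internal_loss_pct: float, public_loss_pct: float,
               threshold_pct: float = 50.0) -> str:
    """Hypothetical rule of thumb comparing the two monitored addresses."""
    if public_loss_pct >= threshold_pct:
        return "public endpoint unreachable: suspect the ISP or the path to it"
    if internal_loss_pct >= threshold_pct:
        return "public endpoint fine, internal address failing: suspect the VPN tunnel"
    return "both reachable"


if __name__ == "__main__":
    print(summarize_pings([10.0, None, 30.0, 20.0]))
    print(triage_vpn(internal_loss_pct=100.0, public_loss_pct=0.0))
```

Adding the first hop past the client router as a third input would narrow the ISP case further, in the same spirit as the default-route instance described above.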

  • So one option (if you happen to have Cisco gear at your clients' sites) is to use the Cisco IP SLA responder features (i.e. have the Ciscos do synthetic transactions, voice calls, etc.). Then the loss, latency, jitter, MOS scores, etc. will all show up in LogicMonitor.
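For readers wondering what the jitter and MOS figures mean, the sketch below shows two widely used definitions: the smoothed interarrival jitter estimator from RFC 3550 and the R-factor to MOS mapping from ITU-T G.107. This is an illustration only; it is an assumption that these match what a given Cisco IP SLA operation reports, and the function names are made up for the example.

```python
from typing import Iterable


def interarrival_jitter(transit_ms: Iterable[float]) -> float:
    """RFC 3550 style estimator: J += (|D| - J) / 16 for each new packet,
    where D is the change in one-way transit time between packets."""
    jitter = 0.0
    prev = None
    for transit in transit_ms:
        if prev is not None:
            delta = abs(transit - prev)
            jitter += (delta - jitter) / 16.0
        prev = transit
    return jitter


def r_to_mos(r: float) -> float:
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


if __name__ == "__main__":
    print(interarrival_jitter([20.0, 22.5, 19.0, 35.0, 21.0]))
    print(round(r_to_mos(93.2), 2))
```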

  • Hi Rodger,

    We do support sFlow, as it is listed as a supported protocol with NetFlow, here: http://help.logicmonitor.com/monitoring-with-logicmonitor/notes-for-monitoring-specific-types-of-hosts/netflow-sflow/ . We support sFlow v5 with the flow sample data format set at enterprise=0 and format=1.

    Since jFlow, from my understanding, is more of a Juniper tweak to NetFlow, I would suspect it would work as well. I would suggest starting with jFlow v5 if you want to give it a try. I was told last night by one of my support colleagues that we do not support jFlow, but it turns out we do. If you configure jFlow using the following v5 method: http://kb.juniper.net/InfoCenter/index?page=content&id=KB16677, it should work. That is, we have other customers doing exactly this and pulling v5 jFlow data.

    Configuration example for J-Flow versions 5 and 8:
    The following procedure provides an example of the J-Flow configuration for versions 5 and 8 (this procedure should also work with NetFlow versions 5 and 8):

    1. Enable sampling on one or more interfaces and specify the direction:

       user@host set interfaces ge-0/0/0 unit 0 family inet sampling input
       user@host set interfaces ge-0/0/0 unit 0 family inet sampling output

    2. Specify the sampling rate:

       Caution: Activation of flow collection can have a significant impact on the performance of the SRX Series device. The smaller the sample rate, the bigger the impact. It is recommended to not use a sampling input rate of 1.

       user@host set forwarding-options sampling input rate 100

    3. Specify the UDP port number of the host that is collecting cflowd packets:

       user@host set forwarding-options sampling family inet output flow-server 10.10.10.1 port 2056

    4. Specify the version format: 5, 8, or 500 (ASN 500):

       user@host set forwarding-options sampling family inet output flow-server 10.10.10.1 version 5

    Thanks,
    Michael
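If you want to sanity-check that v5 datagrams are actually arriving on the flow-server port before pointing them at a collector, the following Python sketch decodes the 24-byte NetFlow v5 header. It is a debugging aid under assumptions, not LogicMonitor's collector code: the names `parse_v5_header` and `expected_length` are hypothetical, and whether a given exporter populates the sampling field is also an assumption.

```python
import struct
from typing import NamedTuple

# NetFlow v5 header: 24 bytes, network byte order.
V5_HEADER = struct.Struct("!HHIIIIBBH")
V5_RECORD_SIZE = 48  # each flow record that follows the header


class V5Header(NamedTuple):
    version: int
    count: int             # number of flow records in this datagram
    sys_uptime_ms: int
    unix_secs: int
    unix_nsecs: int
    flow_sequence: int
    engine_type: int
    engine_id: int
    sampling_mode: int      # top 2 bits of the last header field
    sampling_interval: int  # low 14 bits of the last header field


def parse_v5_header(datagram: bytes) -> V5Header:
    """Decode the header and reject anything that is not version 5."""
    if len(datagram) < V5_HEADER.size:
        raise ValueError("datagram shorter than a NetFlow v5 header")
    (version, count, uptime, secs, nsecs,
     seq, etype, eid, sampling) = V5_HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError(f"not NetFlow v5 (version field is {version})")
    return V5Header(version, count, uptime, secs, nsecs, seq,
                    etype, eid, sampling >> 14, sampling & 0x3FFF)


def expected_length(header: V5Header) -> int:
    """Total datagram size implied by the record count."""
    return V5_HEADER.size + header.count * V5_RECORD_SIZE


if __name__ == "__main__":
    # Synthetic header: 2 records, sampling mode 1, interval 100.
    sample = V5_HEADER.pack(5, 2, 1000, 1700000000, 0, 42, 0, 0, (1 << 14) | 100)
    hdr = parse_v5_header(sample)
    print(hdr, expected_length(hdr))
```

To use it against live traffic, you could bind a UDP socket to the port configured in step 3 (2056 in the example) and pass each received datagram to `parse_v5_header`.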
  • Thanks Jeff and Steve. We use ping now (and will add ping-multi). But I don't have to tell you guys what a blunt tool ping is. We have Meraki (a Cisco product) at most clients, but that doesn't provide NetFlow stats. Does LM support other standards like sFlow, OpenFlow, IPFIX, and jFlow?

  • Thanks for this info, everyone. At the moment we are stuck with installing PingPlotter on a desktop at each client's LAN. Obviously this solution sucks and isn't scalable to the 100 networks we need to monitor. We have demoed AppNeta, but at a minimum of $7000 per year for only 3 sites it isn't a good ROI for our limited need. I assume everyone in IT has the need to measure and alert on network health, so I feel like I'm missing something. Here are the only choices I see:

    1) buy and install Cisco or Juniper equipment and use LM to gather NetFlow data

    2) use Ping datasource to measure ICMP latency as the only indicator of network health

    3) purchase enterprise-grade network health software completely outside of LM

    4) invent network health monitor software that runs as a virtual machine appliance and reports NetFlow and SLA data (ideal, but non-existent)

  • Jeff_Behl
    Former Employee

    Hi Rodger -

    By no means a robust solution, but perhaps of interest since you mentioned option 4) above: it is possible to generate NetFlow data from pretty much any Linux distro using a tool called fprobe. We used it in some of our testing. It simply captures whatever traffic it sees flowing over a wire and exports NetFlow data based on it.

    It looks like nProbe is also an option, but I'm not familiar with it. It looks like it has a lot more documentation associated with it.

    Jeff

  • I agree that configuring Cisco SLAs is the best way of monitoring network health. For example, I have WAN links that I monitor. If I set up a couple of SLAs from one site to another (say between a switch at each end), and I configure one to mimic TCP Citrix traffic and another for VoIP, I can get a very accurate indication of performance that takes into account the QoS etc.

    If I just ping from the LM collector to something at the other end, I'll probably only get an indication of performance for the best-effort QoS class, and general server performance issues will also impact the measurements. Sometimes I don't have the luxury of Cisco kit, so I have often searched around to see if someone has ever implemented a Linux (or Windows) app that can do SLAs. Then I could set up a couple of NUCs at each end running the SLAs and have them monitored by LM. I haven't found anything yet, but if someone does, it would be great to know.

    Simon
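As a rough idea of what a do-it-yourself SLA probe on a pair of small Linux or Windows boxes could look like, here is a minimal Python sketch of a UDP echo responder and a prober that reports loss, average RTT, and a simple jitter figure. Everything in it is an assumption for illustration: it is not an existing product, it is not compatible with the Cisco IP SLA responder protocol, the jitter here is just the mean absolute change between consecutive RTTs, and whether the DSCP marking is honored depends on the OS and the network.

```python
import socket
import struct
import threading
import time


def run_responder(sock: socket.socket, stop: threading.Event) -> None:
    """Echo every datagram back to its sender until stop is set."""
    sock.settimeout(0.2)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def probe(target, count: int = 10, timeout: float = 0.5, dscp: int = 0) -> dict:
    """Send numbered UDP probes to target (host, port) and time the echoes."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        if dscp:
            # DSCP occupies the upper six bits of the IP TOS byte.
            s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
        for seq in range(count):
            sent = time.perf_counter()
            s.sendto(struct.pack("!I", seq), target)
            try:
                while True:  # skip late echoes of earlier probes
                    data, _ = s.recvfrom(2048)
                    if len(data) >= 4 and struct.unpack("!I", data[:4])[0] == seq:
                        break
            except socket.timeout:
                continue  # no echo for this probe: counted as lost
            rtts.append((time.perf_counter() - sent) * 1000.0)
    diffs = [abs(b - a) for a, b in zip(rtts, rtts[1:])]
    return {
        "sent": count,
        "received": len(rtts),
        "loss_pct": 100.0 * (count - len(rtts)) / count if count else 0.0,
        "avg_rtt_ms": sum(rtts) / len(rtts) if rtts else None,
        "jitter_ms": sum(diffs) / len(diffs) if diffs else 0.0,
    }


if __name__ == "__main__":
    # Loopback demo: responder and prober in one process.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    stop = threading.Event()
    worker = threading.Thread(target=run_responder, args=(server, stop))
    worker.start()
    print(probe(server.getsockname(), count=5))
    stop.set()
    worker.join()
    server.close()
```

In a real deployment the responder would run at one site and the prober at the other, with one probe per QoS class (for example a voice-like DSCP value and a best-effort one), and the resulting numbers would be handed to whatever monitoring system you already use.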

  • Keep in mind that poor DTMF quality BADLY affects your communications. I have a real story for you: I was calling one of my banks, and the rep transferred me to a robot to enter my PIN. I entered it, and the robot couldn't understand the PIN. The robot said to enter it again, and after 3 tries it gave up. It does not understand your commands because the line is not carrying DTMF properly. You can troubleshoot DTMF problems with your VoIP connection through route-test.com. They offer free testing credits, so you can find the solution quickly. #VoIP