uptime

6 Topics

Convert SMTP uptime to days?
Hey all, messing around with a quick uptime dashboard for a product team. They have a bunch of Linux servers that I can pull SNMP uptime from, but it needs to be converted from the wacky format to days. I’m just using the table widget on the dashboard; is there any way to do that conversion? It would be nice if I could just pull the value in from the top-level Resource page for the device, where it already shows the uptime. Wish you could just hijack that underlying code. PS... can't edit the typo in the title. That's lame.
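For anyone doing the conversion outside the widget, here is a minimal Python sketch. It assumes the "wacky format" is raw SNMP TimeTicks (hundredths of a second, which is how sysUpTime is defined); the function names are made up for illustration and are not part of any LogicMonitor API.

```python
TICKS_PER_SECOND = 100      # SNMP TimeTicks are hundredths of a second
SECONDS_PER_DAY = 86_400    # 60 * 60 * 24


def timeticks_to_days(ticks: int) -> float:
    """Convert raw TimeTicks to fractional days."""
    return ticks / TICKS_PER_SECOND / SECONDS_PER_DAY


def timeticks_to_whole_days(ticks: int) -> int:
    """Convert raw TimeTicks to completed days (rounded down)."""
    return ticks // (TICKS_PER_SECOND * SECONDS_PER_DAY)


if __name__ == "__main__":
    sample = 20_736_000  # hypothetical reading, equal to 2.4 days
    print(timeticks_to_days(sample), timeticks_to_whole_days(sample))
```

The same arithmetic (divide by 100, then by 86400) should carry over to a datapoint expression if the table widget can show a calculated value.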
Solved
derek_haneman
2 years ago Place Product Discussions
407Views
15likes
5Comments
497 days and counting........
You might have received an alert saying your Linux-based device has just rebooted, but you know that it has been up a long time. A switch might have just sent an alert for every interface flapping when they have all been solidly up. The important question to ask here is: how long has the device been up? If it's been up for 497 days, 994 days, 1491 days, or any other multiple of 497, then you are seeing the 497-day bug, which hits almost every Linux-based device that stays up long enough. Anything using a kernel older than 2.6 computes the system uptime from the internal jiffies counter, which counts the time since boot in units of 10 milliseconds, or jiffies. This is a 32-bit counter, so it can take 2^32, or 4,294,967,296, distinct values. When the counter reaches its maximum (after 497 days, 2 hours, 27 minutes, and 53 seconds, or approximately 16 months), it wraps back around to zero and continues to increment. This can result in alerts about reboots that didn’t happen and cause switches to report a flap on all interfaces. Systems that use a 2.6 kernel and properly supply a 64-bit counter will still alert incorrectly when the 64-bit counter wraps. A 32-bit counter can hold 4,294,967,295 (4,294,967,295 / 8,640,000 ticks per day = 497.1 days). A 64-bit counter can hold 18,446,744,073,709,551,615 (18,446,744,073,709,551,615 / 8,640,000 = 2,135,039,823,346 days, or 5,849,424,173 years). Though I expect in 6,000 million years we will all have other things to worry over.
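The wrap arithmetic above can be checked with a short Python sketch. The `looks_like_wrap` helper is a hypothetical illustration of how a drop in uptime could be told apart from a real reboot; its 60,000-tick slack (10 minutes at 100 ticks per second) is an assumed polling margin, not something any monitoring product is known to use.

```python
SECONDS_PER_DAY = 86_400


def wrap_days(bits: int, ticks_per_second: int = 100) -> float:
    """Days until an unsigned counter of `bits` bits wraps at the given tick rate."""
    return 2**bits / (ticks_per_second * SECONDS_PER_DAY)


def looks_like_wrap(previous_ticks: int, current_ticks: int,
                    bits: int = 32, slack_ticks: int = 60_000) -> bool:
    """True if uptime dropped while the previous sample sat near the counter maximum.

    slack_ticks is an assumed margin covering one polling interval.
    """
    near_max = previous_ticks >= 2**bits - slack_ticks
    return current_ticks < previous_ticks and near_max


if __name__ == "__main__":
    print(f"32-bit wrap: {wrap_days(32):.1f} days")
    print(f"64-bit wrap: {wrap_days(64) / 365:.3e} years")
```

A drop from a value near 2^32 to a small value is far more likely to be a wrap than a reboot, while a drop from a mid-range value points to a genuine restart.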
David_Lee
7 years ago Place Archive
192Views
0likes
5Comments
Export Datetimes for Downtime in a Website Overview Report
For context, I'm a consumer of LogicMonitor csv/excel report exports only. I am required to aggregate my exports outside of LogicMonitor so I can use the availability data from our website endpoint checks to build Tableau reports. Thus, it would be incredibly helpful to my organization if I could obtain an export (csv/excel) from LogicMonitor that contained all webservices downtime (NOT aggregated and reported as ##h ##m ##s) with datetimes for each period of missed polls (downtime). For example, each line in the spreadsheet would contain the endpoint details and the start/stop time. For periods of flapping, there would be multiple lines for the same endpoint, each with its own start and end times. Knowing when downtime is occurring AND knowing whether it occurred during a scheduled maintenance period are essential for advancing availability reporting for our cloud applications and ensuring we maintain our SLA (which discounts application downtime for planned maintenance). I currently pull my data daily via a website overview report that excludes SDT from the reported downtime in a csv export.
CBU_BA_DUDE
7 years ago Place Archive
21Views
0likes
0Comments
SystemSTARTUpTimeGreaterthan28days
I feel foolish asking this, but I need to get this off my plate. I've been asked to set up a monitor that alerts on Windows systems that have been up for more than 28 days. I thought that I could clone the "WinSystemUptime" DataSource, which reports when a system has been rebooted, and modify it to report when a system has not been rebooted instead. Any help would be appreciated.
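The check itself is just a comparison against 28 days expressed in seconds (28 * 86400 = 2,419,200). A rough Python sketch of that logic follows; it assumes the uptime value is available in seconds, and `overdue_for_reboot` is a made-up name for illustration only.

```python
SECONDS_PER_DAY = 86_400


def overdue_for_reboot(uptime_seconds: float, max_days: int = 28) -> bool:
    """True when the system has been up longer than max_days."""
    return uptime_seconds > max_days * SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical readings: 27 days and 29 days of uptime.
    for days in (27, 29):
        print(days, overdue_for_reboot(days * SECONDS_PER_DAY))
```

In a cloned DataSource the equivalent would presumably be a "greater than 2419200" threshold on whichever datapoint carries uptime in seconds, rather than the existing "recently rebooted" condition.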
wanabeninja
8 years ago Place Archive
15Views
0likes
3Comments
Uptime - Bug Report
I’d like to re-initiate this bug report: the Uptime counters resetting at 497 days, or 469 days (historical). I just had a similar false alarm telling me that my devices rebooted when they did not. Please have the DEV team review this specific monitor and determine how the system can display 497+ days of “uptime”.

SEP 11, 2015 | 01:56PM CDT
Original message
________________ wrote:

Support team at LogicMonitor,

Is it possible to request an adjustment to the "Uptime" data source monitor so that it does not alert when the counter resets from 11111111111111111111111111111111 to 00000000000000000000000000000001? The developer was aware enough of the cause of the event to code an explanation into the system alert: could the alert be altered to not alert when the counter resets?

- ________________

From: ________________
Sent: Friday, September 11, 2015 2:44 PM
To:
Subject: SC# Error: 6348 ________________ is reporting it has only been up for 0.43 minutes

Hello ________________,

We have received the following monitoring alert and a ticket #6348 has been created to track your issue. An engineer is assigned and is working to resolve this issue. Thank you. We are investigating whether the VM really did reboot or whether this alert is coming up for a different reason:

________________ is reporting it has only been up for 0.43 minutes, as of 2015-09-11 14:28:48 EDT. If this was an unexpected reboot, please investigate the system logs. NOTE: if ________________ has been up for 469 days without a reboot, this alert will trigger due to a counter wrap in the host. In this case, you may disregard this alert. (But the host is probably due for an OS update.)

For any inquiries please contact our NOC at support@highpoint.com or call 1-855-485-8324 (TECH).

Regards,
________________
NOC Support Engineer
Mahlon_Greene
8 years ago Place Archive
44Views
0likes
1Comment
Time after (Up)time
It’s one of those days when an unsung hero gets his chance to make a mark on the world. And who might this be? The floor function, also called the greatest integer function or integer value (Spanier and Oldham, 1987), gives the largest integer less than or equal to x.

The floor function is not commonly used in LM and is often brushed off, but it got its day when we had a user on chat who was looking to create a dashboard showing server uptime in days instead of seconds. So I tried devising a Big Number Widget that would include a virtual datapoint with the following calculation to convert seconds into days:

Fig 1. Initial UptimeDays setup without the floor() function.

UptimeSeconds/60/60/24

Here UptimeSeconds references a pre-calculated Complex Datapoint from the WinSystemUptime DataSource. Alas, this was not presenting days in a helpful manner: it was displaying days with a decimal point. Quoting from the chat: “that uptime display widget you created would be grand if it showed the days. it says 0,4 at the minute, does that meant 0.4 days?”

So we needed to figure out another method. A colleague of ours, a wizard from the magical realm of complex datapoints, pointed out something powerful that could display seconds, minutes and even days. And, with this, the second hand unwinds...

By wrapping the original expression in the floor() function, we could calculate the number of days rounded down to the nearest integer:

floor(UptimeSeconds/60/60/24)

For example, if the UptimeSeconds/60/60/24 calculation gave 2.4 days, floor() presents the result as 2, the largest integer that does not exceed it.

Fig 2. Complex Datapoints for Days, Hours and Minutes with the floor function.

On the Hours Datapoint:

floor((UptimeSeconds-(days*86400))/3600)

Here 86400 is the number of seconds in a day (60 * 60 * 24 = 86400) and 3600 is the number of seconds in an hour (60 * 60 = 3600). This calculates the number of whole hours, excluding the time already apportioned to the Days metric, and again the floor function rounds the number of hours down to the nearest integer. For example, with 2.4 days, the remaining 0.4 days is carried over to the hours, which equates to 9.6 hours. With the floor function applied it shows 9 hours, leaving a balance of 0.6 hours to be calculated on the Minutes Datapoint.

And lastly, on the Minutes Datapoint:

floor((UptimeSeconds-(days*86400)-(hours*3600))/60)

Here 86400 is the number of seconds in a day, 3600 is the number of seconds in an hour, and 60 is the number of seconds in a minute. This does the same again: it takes the total number of minutes, excluding those already accounted for by the days (days*86400) and the hours (hours*3600), rounded down to the nearest integer.

The floor() function has therefore been very useful in this case for capturing the Days, Hours and Minutes and presenting them precisely on the Big Number Widget, which the team could display on the dashboard to wait on, Time after Time.

Fig 3. Uptime Days, Hours and Minutes on the Big Number Widget.

References:
https://www.logicmonitor.com/support/datasources/creating-managing-datasources/datapoint-expressions/
http://mathworld.wolfram.com/FloorFunction.html

Credits to David Lee for pointing out the powerful capabilities of the floor function.
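To sanity-check the three datapoint expressions outside LogicMonitor, here is a small Python sketch that mirrors them one for one using math.floor. The function name `split_uptime` is made up for this illustration; only the arithmetic comes from the post.

```python
import math

SECONDS_PER_DAY = 86_400   # 60 * 60 * 24
SECONDS_PER_HOUR = 3_600   # 60 * 60
SECONDS_PER_MINUTE = 60


def split_uptime(uptime_seconds: float) -> tuple[int, int, int]:
    """Split uptime in seconds into whole days, hours and minutes."""
    # Mirrors floor(UptimeSeconds/60/60/24)
    days = math.floor(uptime_seconds / 60 / 60 / 24)
    # Mirrors floor((UptimeSeconds-(days*86400))/3600)
    hours = math.floor((uptime_seconds - days * SECONDS_PER_DAY) / SECONDS_PER_HOUR)
    # Mirrors floor((UptimeSeconds-(days*86400)-(hours*3600))/60)
    minutes = math.floor(
        (uptime_seconds - days * SECONDS_PER_DAY - hours * SECONDS_PER_HOUR)
        / SECONDS_PER_MINUTE
    )
    return days, hours, minutes


if __name__ == "__main__":
    # 2.4 days of uptime, the example used in the post.
    print(split_uptime(2.4 * SECONDS_PER_DAY))
```

For the 2.4-day example this gives 2 days and 9 hours as described, with the leftover 0.6 hours showing up as 36 minutes.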
Haniz
9 years ago Place Archive
16Views
0likes
0Comments