bug

3 Topics

497 days and counting........
You might have received an alert saying that your Linux-based device has just rebooted, even though you know it has been up for a long time. A switch might have just sent an alert for every interface flapping when they have all been solidly up. The important question to ask is: how long has the device been up? If it has been up for 497 days, 994 days, 1491 days, or any other multiple of 497, you are seeing the 497-day bug, which hits almost every Linux-based device that stays up long enough.

Anything using a kernel older than 2.6 computes the system uptime from the internal jiffies counter, which counts the time since boot in units of 10 milliseconds, or jiffies. This is a 32-bit counter, so it can take 2^32 (4,294,967,296) distinct values, with a maximum of 4,294,967,295. When the counter passes that maximum (after 497 days, 2 hours, 27 minutes, and 53 seconds, or approximately 16 months), it wraps back around to zero and continues to increment. This can result in alerts about reboots that didn't happen and can cause switches to report a flap on all interfaces. Systems that use a 2.6 kernel and properly supply a 64-bit counter will still alert incorrectly when the 64-bit counter wraps, but that takes far longer.

At 100 jiffies per second there are 8,640,000 jiffies in a day. A 32-bit counter can hold 4,294,967,295 (4,294,967,295 / 8,640,000 = 497.1 days). A 64-bit counter can hold 18,446,744,073,709,551,615 (18,446,744,073,709,551,615 / 8,640,000 = 2,135,039,823,346 days, or 5,849,424,173 years). Though I expect that in 6,000 million years we will all have other things to worry about.
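The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `wrap_period` is made up for illustration, and the 100 ticks per second figure is taken from the post's statement that one jiffy is 10 milliseconds.

```python
JIFFIES_PER_SECOND = 100  # assumption from the post: one jiffy = 10 ms


def wrap_period(bits, hz=JIFFIES_PER_SECOND):
    """Return (days, hours, minutes, seconds) until a counter of `bits` width wraps.

    Uses integer arithmetic only, so the 64-bit case stays exact.
    Fractions of a second are truncated, not rounded.
    """
    total_seconds = (2 ** bits) // hz
    days, rem = divmod(total_seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return days, hours, minutes, seconds


if __name__ == "__main__":
    for width in (32, 64):
        d, h, m, s = wrap_period(width)
        print(f"{width}-bit counter wraps after {d:,} days {h}h {m}m {s}s")
```

For the 32-bit case this gives 497 days, 2 hours, 27 minutes and 52 whole seconds; the 53 seconds quoted above comes from rounding the remaining 0.96 of a second up.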
David_Lee
8 years ago Place Archive
Uptime - Bug Report
I’d like to re-initiate this bug report: the Uptime counter resets at 497 days (or 469 days, historically). I just had a similar false alarm telling me that my devices had rebooted when they had not. Please have the dev team review this specific monitor and determine how the system can display uptime beyond 497 days.

---------------------------------
SEP 11, 2015 | 01:56PM CDT
Original message
________________ wrote:

Support team at LogicMonitor,

Is it possible to request an adjustment to the "Uptime" data source monitor so that it does not alert when the counter resets from 11111111111111111111111111111111 to 00000000000000000000000000000001? The developer was aware enough of the cause to code an explanation into the system alert; could the alert be altered so that it does not fire when the counter resets?

- ________________

From: ________________
Sent: Friday, September 11, 2015 2:44 PM
To:
Subject: SC# Error: 6348 ________________ is reporting it has only been up for 0.43 minutes

Hello ________________,

We have received the following monitoring alert and a ticket #6348 has been created to track your issue. An engineer is assigned and is working to resolve this issue. Thank you. We are investigating if the VM really did reboot or if this alert is coming up for a different reason:

________________ is reporting it has only been up for 0.43 minutes, as of 2015-09-11 14:28:48 EDT. If this was an unexpected reboot, please investigate the system logs. NOTE: if ________________ has been up for 469 days without a reboot, this alert will trigger due to a counter wrap in the host. In this case, you may disregard this alert. (But the host is probably due for an OS update.)

For any inquiries please contact our NOC at support@highpoint.com or call 1-855-485-8324 (TECH).

Regards,
________________
NOC Support Engineer
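One way the requested behaviour could work is to compare each uptime sample with the previous one and check whether a drop lands where a 32-bit wrap would put it. The sketch below is an illustration only, not how the Uptime datasource is implemented. The function name, the tick unit (hundredths of a second), and the default tolerance are all assumptions.

```python
COUNTER_MODULUS = 2 ** 32  # assumption: uptime arrives as a 32-bit tick counter


def classify_uptime_sample(prev_ticks, curr_ticks, elapsed_ticks, tolerance_ticks=6_000):
    """Classify a new uptime sample as 'running', 'wrap', or 'reboot'.

    prev_ticks / curr_ticks: previous and current counter readings.
    elapsed_ticks: wall-clock time between the two polls, in the same unit.
    tolerance_ticks: allowed polling jitter (hypothetical default: 60 s at 100 ticks/s).
    """
    if curr_ticks >= prev_ticks:
        return "running"
    # Where the counter would be now if the host had simply kept running.
    expected = prev_ticks + elapsed_ticks
    if expected >= COUNTER_MODULUS:
        wrapped = expected - COUNTER_MODULUS
        if abs(curr_ticks - wrapped) <= tolerance_ticks:
            return "wrap"
    return "reboot"


if __name__ == "__main__":
    # Counter was 10 s short of its maximum one minute ago: a wrap is expected.
    print(classify_uptime_sample(COUNTER_MODULUS - 1_000, 5_000, 6_000))
    # Counter was nowhere near the maximum and now reads 0.43 minutes (2,580 ticks).
    print(classify_uptime_sample(1_000_000, 2_580, 6_000))
```

A real reboot that happens just as the counter is about to wrap would be misread as a wrap, so a check like this can only suppress the common false alarm. It does not replace looking at the system logs.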
Mahlon_Greene
9 years ago Place Archive
Cloned JDBC Datasource doesn't preserve DB credentials
Hello,

The title says it all really. I cloned a custom JDBC datasource which already had credentials filled in, but the credentials weren't preserved. I wasn't told about this and had to go digging in the collector logs:

[06-03 07:19:34.455 UTC] [MSG] [DEBUG] [pool-29-thread-29::collector.jdbc:Task:155185:db-server.some.where:PostgresServer-5432:jdbc:166] [JDBCTask._collect:223] Create JDBC connection, CONTEXT=timeout=10s, url=jdbc:postgresql://db-server.some.where:5432/, username=, passwordLength=0
[06-03 07:19:34.463 UTC] [MSG] [ERROR] [pool-29-thread-29::collector.jdbc:Task:155185:db-server.some.where:PostgresServer-5432:jdbc:166] [JDBCTask._collect:231] Failed to create JDBC connection, CONTEXT=timeout=10s, url=jdbc:postgresql://db-server.some.where:5432/, username=, passwordLength=0, EXCEPTION=FATAL: no PostgreSQL user name specified in startup packet
org.postgresql.util.PSQLException: FATAL: no PostgreSQL user name specified in startup packet
    at org.postgresql.Driver$ConnectThread.getResult(Driver.java:341)
    at org.postgresql.Driver.connect(Driver.java:264)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at com.santaba.agent.collector3.jdbc.JDBCTask._collect(JDBCTask.java:228)
    at com.santaba.agent.collector3.DataCollectingTask.execute(DataCollectingTask.java:123)
    at com.santaba.agent.collector3.CollectingTaskExecutionRunable.run(CollectingTaskExecutionRunable.java:20)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I've mentioned this already (https://communities.logicmonitor.com/topic/751-debugging-information/), but it would be pretty useful to have the above lines in a popup window or error log right in the web UI, rather than digging into the collector logs.
Clement_Law
10 years ago Place Archive