New Badge Alert: AIOps Adoption Is Now Live!
We’re excited to introduce the next badge in the LM Badge and Certification Program: AIOps Adoption, now available in LM Academy! This badge is designed to help you build a solid foundation in AI-driven operations and understand how AIOps enhances observability and incident management within LogicMonitor.

By earning this badge, you’ll learn how to:

Understand the evolution of AI in IT operations, including how it finds meaningful patterns in your data, reduces incidents, and improves resolution times.
Identify how the AIOps capabilities within LM Envision help you enhance observability by proactively preventing issues and troubleshooting faster.
Discover the value Edwin AI delivers by surfacing insights, enhancing workflows, and improving decision-making, all while learning some strategic approaches to adopting Edwin AI.

Whether you’re just starting to explore AIOps or looking to level up how you manage incidents, this badge gives you the tools and knowledge to make smarter, faster decisions in your LM environment.

As always, the badge is:
✔ Free
✔ On-demand
✔ Self-paced

Once you complete the badge exam, you’ll earn a verified digital badge delivered straight to your inbox. Be sure to check your Spam folder if you don’t see it. This third-party-verified badge is perfect for sharing with your team or showcasing on LinkedIn.

🎉 Don’t forget to tag #logicmonitor when you post. We love cheering on your progress!

Note: We’re currently building an integration to sync badges to your LM Community profile. In the meantime, our team will manually upload earned badges on a monthly basis.

July Product Power Hour Recap: Monitoring Your AI Workloads with LM
Overview
In this edition of Product Power Hour, the LM team explored how LogicMonitor can be used to effectively monitor AI workloads across modern environments. The session walked through best practices for monitoring key components of AI systems (including GPU metrics, model latency, and infrastructure dependencies) using LogicMonitor’s platform. Attendees gained insights into real-world AI observability challenges and how LogicMonitor enables end-to-end visibility into the health of AI services.

Key Highlights
⭐ AI Workload Dashboards: Demonstrated how to build dashboards tailored to AI-specific metrics, including GPU utilization, job runtimes, and inference latency.
⭐ Dynamic Thresholds: Discussed using anomaly detection to set smarter thresholds for variable workloads like training jobs and inference endpoints, helping reduce alert fatigue and improve model reliability by adapting to fluctuating usage patterns.
⭐ Unified Monitoring: Emphasized LM’s ability to consolidate data across cloud, on-prem, and edge environments, which is critical for hybrid AI infrastructure.
⭐ Alert Routing + Suppression: Demonstrated how to avoid alert fatigue by using alert tuning and dynamic suppression during scheduled AI retraining windows.

Q&A
Q: Can LogicMonitor monitor GPU metrics out-of-the-box?
A: Yes, LM has native collectors and integrations to pull in GPU metrics from platforms like NVIDIA and cloud providers.
Q: Is LM useful for model observability?
A: While LM focuses on infrastructure-level monitoring, it provides context crucial to understanding model performance issues (e.g., degraded latency tied to resource constraints).
Q: How does alert suppression work during model retraining?
A: You can set up dynamic suppression rules based on job schedules or metadata to avoid false positives during known high-usage periods.
Q: Does LM integrate with tools like PagerDuty or Slack?
A: Yes. These integrations are supported and were demoed live during the session.
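
To picture the schedule-based suppression mentioned in the Q&A, here is a minimal Python sketch of the general idea. It is not LogicMonitor's API or configuration format: the `SuppressionWindow` class, the `should_suppress` function, the `job` and `raised_at` fields, and the dates are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SuppressionWindow:
    """A hypothetical scheduled window, e.g. a nightly retraining job."""
    job: str
    start: datetime
    duration: timedelta

    def covers(self, moment: datetime) -> bool:
        # The start is inclusive and the end is exclusive.
        return self.start <= moment < self.start + self.duration


def should_suppress(alert: dict, windows: list[SuppressionWindow]) -> bool:
    """Suppress an alert only if it is tagged with a job whose window is active."""
    return any(
        w.job == alert.get("job") and w.covers(alert["raised_at"])
        for w in windows
    )


# Illustrative data only: one two-hour retraining window starting at 01:00.
windows = [
    SuppressionWindow("nightly-retrain", datetime(2025, 7, 22, 1, 0), timedelta(hours=2))
]

during = {"job": "nightly-retrain", "raised_at": datetime(2025, 7, 22, 1, 30)}
after = {"job": "nightly-retrain", "raised_at": datetime(2025, 7, 22, 4, 0)}
other = {"job": "inference-api", "raised_at": datetime(2025, 7, 22, 1, 30)}

print(should_suppress(during, windows))  # alert raised inside the window
print(should_suppress(after, windows))   # alert raised after the window closed
print(should_suppress(other, windows))   # alert from an unrelated job
```

Matching on both the job tag and the time window keeps unrelated alerts flowing while the retraining job runs, which is the point of suppressing by schedule or metadata rather than muting everything.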
Customer Call-outs
🌟 “I can now see infrastructure issues that were hard to diagnose before.”
🌟 “LM’s GPU monitoring capabilities have been helpful for managing cloud costs and performance.”

What’s Next

📚 Badges and Certifications
We’ve launched our new LogicMonitor Badges and Certifications program in LM Academy. Earn free, on-demand, digital badges that validate your product knowledge and platform skills.
Available badges:
🛡️ Getting Started
🛡️ Collectors
🛡️ Logs
Launching July 31:
🛡️ AIOps Adoption

🏕️ Camp LogicMonitor: An Observability Adventure
Join us starting August 18th for this 4-week virtual learning experience designed for LogicMonitor users of all levels. Each week features self-paced lessons, community discussions, and live Campfire Chats with product experts. Earn badges, grow your skills, and score exclusive LogicMonitor swag!
👉 Register now to reserve your spot!

🪵 Logs for Lunch
August 12: Network Troubleshooting & Getting Started with Logs

⚡ Product Power Hour
August 19: Edwin AI In Action
Want to check out previous Product Power Hours? Explore the Product Power Hour Hub in LM Community!

👥 User Groups
Connect in person with other LM users in your city over dinner and real talk. Share wins, swap stories, and grow your network.
RSVP today:
Salt Lake City - September 9
Denver - September 10
Stay tuned in our LM Community User Group Hub for upcoming virtual sessions.
Note: As we finalize our speakers, these dates and times may change, but be sure to register for your respective regions above so we can keep you informed!

Review
If you missed any part of the session or want to revisit the content, we’ve got you covered:
Review the slide deck here
Want to see the full session? Watch the recording below ⬇️

July 22 - Product Power Hour: Monitoring your AI Workloads with LM
Product Power Hour: Monitoring Your AI Workloads with LM
Date: Tuesday, July 22 at 10 AM CT
🔗 Register Here

Join us for July’s edition of Product Power Hour, your monthly deep dive into the latest and greatest from LogicMonitor! Hosted by the LM Community, Product team, and Training & Enablement, this session will focus on AI Monitoring: how LogicMonitor empowers you to monitor, optimize, and gain insights from your AI workloads with confidence.

Featuring Guest Speakers:
David Femino, Principal Product Manager
Richard Brooke, Technical Trainer

What You’ll Learn
🧠 Purpose-Built AI Monitoring: Understand how LogicMonitor helps you keep mission-critical AI systems observable, performant, and cost-efficient.
📊 Key Metrics & Dashboards: Explore pre-built dashboards and curated insights tailored for LLMs, inference jobs, GPUs, and more.
🚀 End-to-End Visibility: Learn how to integrate AI monitoring seamlessly into your broader infrastructure monitoring strategy.
🎯 Real-World Use Cases: See how customers are using LogicMonitor to manage and scale their AI initiatives.
🤝 Expert Q&A: Bring your toughest questions. Our experts are here to help you make the most of LM’s AI monitoring capabilities.

New UI Impact Series - Topology Node Grouping
Next up in our series is Topology Node Grouping. This new feature allows users to dynamically group nodes in saved topology maps based on up to three levels of property metadata. By leveraging tags and labels stored as LogicMonitor properties, you can now organize your complex network maps into intuitive, property-based clusters. The groups are automatically color-coded according to alert status, providing an instant visual indicator of potential issues within specific node groups.

So, how does this help you troubleshoot more efficiently? In complex network environments, identifying the severity level and location of issues can be like finding a needle in a haystack. Topology Node Grouping transforms this process, allowing you to quickly assess the 'blast radius' of any network problem. For instance, in a map of virtual machines, grouping by location could instantly reveal that all alerts originate from a specific data center. This level of clarity, which would have required extensive zooming and manual inspection in the past, is now available at a glance.

By speeding up the identification of affected areas, Topology Node Grouping enables IT professionals to respond more swiftly and effectively to network issues, potentially reducing downtime and improving overall network performance.

Want to know more about Topology Node Grouping? Check out these articles on Node Grouping and Searching for Nodes.
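
For readers who think in code, the grouping behavior described above can be pictured with a minimal Python sketch. It only illustrates the concept (group nodes by up to three property names, then give each group the status of its worst alert) and does not reflect how LogicMonitor implements the feature. The `group_nodes` function, the `SEVERITY_RANK` table, the node dictionary layout, and the sample property names are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical severity ranking; not LogicMonitor's internal representation.
SEVERITY_RANK = {"none": 0, "warning": 1, "error": 2, "critical": 3}


def group_nodes(nodes, group_by):
    """Group nodes by up to three property names and track each group's worst alert."""
    if not 1 <= len(group_by) <= 3:
        raise ValueError("group_by must list between one and three property names")

    members = defaultdict(list)
    worst = defaultdict(lambda: "none")
    for node in nodes:
        # Nodes missing a property fall into an "unknown" bucket at that level.
        key = tuple(node["properties"].get(name, "unknown") for name in group_by)
        members[key].append(node["name"])
        if SEVERITY_RANK[node["alert"]] > SEVERITY_RANK[worst[key]]:
            worst[key] = node["alert"]
    return {key: {"nodes": names, "status": worst[key]} for key, names in members.items()}


# Illustrative data only: three virtual machines across two locations.
nodes = [
    {"name": "vm-01", "alert": "critical", "properties": {"location": "dc-east", "env": "prod"}},
    {"name": "vm-02", "alert": "warning", "properties": {"location": "dc-east", "env": "prod"}},
    {"name": "vm-03", "alert": "none", "properties": {"location": "dc-west", "env": "prod"}},
]

groups = group_nodes(nodes, ["location"])
for key, group in groups.items():
    print(key, group["status"], group["nodes"])
```

Grouping by `location` here puts both alerting machines in the same `dc-east` group, which mirrors the data center example above: the affected area shows up from the group status alone, without inspecting each node.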