Amazon DevOps Guru
Late last year, Amazon Web Services (AWS) introduced Amazon DevOps Guru, one of several new machine learning-driven services. DevOps Guru detects operational issues, generates reports and notifications, and offers insights and recommendations on how to take action.
DevOps Guru is a fully managed service that is trained to analyze logs, metrics, and events across 25 AWS resources. The service looks for behavior that deviates from normal patterns, which are established from operational history at Amazon and AWS. Users configure DevOps Guru with a list of resources to monitor. When the service identifies an anomalous situation, such as a code release that leads to abnormal behavior or a resource utilization pattern that may lead to depletion, it alerts users to the problem or potential issue. DevOps Guru delivers insights that include details about the impact of problems, as well as how to remediate them.
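The configuration step can also be scripted. The sketch below assumes the resources to monitor are grouped by AWS CloudFormation stack, which is one way to define coverage, and uses the boto3 `devops-guru` client's `update_resource_collection` call. The helper names, stack names, and region are illustrative and not taken from the announcement.

```python
def build_resource_collection(stack_names):
    """Build a ResourceCollection payload from CloudFormation stack names."""
    names = [name.strip() for name in stack_names if name and name.strip()]
    if not names:
        raise ValueError("at least one stack name is required")
    return {"CloudFormation": {"StackNames": names}}


def add_stacks_to_devops_guru(stack_names, region_name="us-east-1"):
    """Ask DevOps Guru to start analyzing the given stacks.

    Needs boto3 and AWS credentials; the region is an assumed default.
    """
    import boto3  # imported lazily so the helper above works without the SDK

    client = boto3.client("devops-guru", region_name=region_name)
    return client.update_resource_collection(
        Action="ADD",
        ResourceCollection=build_resource_collection(stack_names),
    )
```

Calling `add_stacks_to_devops_guru(["web-frontend", "orders-api"])` with hypothetical stack names would add both stacks to the monitored set.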
DevOps Guru provides users with an integrated dashboard whose Insights page displays the anomalies the service has discovered. Each insight comes with contextual information and recommendations on how to address it. Insights are either reactive, highlighting existing issues, or proactive, identifying problems that may occur in the future. For example, a reactive insight would alert developers to a sudden increase in latency in a Lambda function. A proactive insight would alert developers to an anticipated increase in latency due to increased memory utilization in the same function.
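The same reactive/proactive split is visible through the API. As a rough sketch, the boto3 `list_insights` call accepts a status filter with an insight type, and the response groups results under `ReactiveInsights` and `ProactiveInsights`. The helper names and region below are assumptions, and the sketch reads only the first page of results.

```python
VALID_TYPES = ("REACTIVE", "PROACTIVE")


def ongoing_filter(insight_type):
    """Build a StatusFilter that selects ongoing insights of one type."""
    if insight_type not in VALID_TYPES:
        raise ValueError(f"insight_type must be one of {VALID_TYPES}")
    return {"Ongoing": {"Type": insight_type}}


def list_ongoing_insights(insight_type, region_name="us-east-1"):
    """Return (name, severity) pairs for ongoing insights of one type.

    Needs boto3 and AWS credentials; pagination is left out for brevity.
    """
    import boto3  # imported lazily so the filter helper works without the SDK

    client = boto3.client("devops-guru", region_name=region_name)
    response = client.list_insights(StatusFilter=ongoing_filter(insight_type))
    key = "ReactiveInsights" if insight_type == "REACTIVE" else "ProactiveInsights"
    return [(item.get("Name"), item.get("Severity")) for item in response.get(key, [])]
```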
The service delivers insights via SNS events and can already deliver alerts through PagerDuty and Atlassian’s Opsgenie. DevOps Guru also integrates with AWS Systems Manager to create new OpsItems in OpsCenter and generates CloudWatch Events.
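Teams that want to route these notifications themselves could subscribe an AWS Lambda function to the SNS topic. The handler below is a minimal sketch that relies only on the standard SNS-to-Lambda event envelope (`Records[].Sns.Message`). It makes no assumption about the fields inside a DevOps Guru message and simply decodes whatever JSON arrives. The function names are illustrative.

```python
import json


def extract_sns_messages(event):
    """Decode the Message body of every SNS record in a Lambda event."""
    messages = []
    for record in event.get("Records", []):
        body = record.get("Sns", {}).get("Message", "")
        try:
            messages.append(json.loads(body))
        except (TypeError, ValueError):
            # Not JSON: keep the raw text so nothing is silently dropped.
            messages.append({"raw": body})
    return messages


def handler(event, context):
    """Lambda entry point: log each decoded notification and count them."""
    messages = extract_sns_messages(event)
    for message in messages:
        print(json.dumps(message))
    return {"received": len(messages)}
```

From here, the decoded payload could be forwarded to a chat channel or ticketing system of your choice.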
There are no monthly or service-level fees for DevOps Guru. Amazon charges for AWS resource analysis and API calls. The fees are billed by the hour per active resource. A resource is active if it generates events, log entries, or metrics within an hour.
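To see how the hourly model plays out, the sketch below counts active resource-hours from per-hour activity counts and multiplies them by a rate. The rate is a placeholder argument, not an actual AWS price, and API call charges are left out. All names and numbers are illustrative.

```python
def active_resource_hours(activity_by_resource):
    """Count the hours in which each resource produced any activity.

    activity_by_resource maps a resource name to a list with one entry per
    hour: the number of events, log entries, or metrics seen in that hour.
    """
    return sum(
        sum(1 for count in hourly_counts if count > 0)
        for hourly_counts in activity_by_resource.values()
    )


def estimate_analysis_cost(activity_by_resource, rate_per_resource_hour):
    """Estimate the resource analysis charge for the observed hours."""
    return active_resource_hours(activity_by_resource) * rate_per_resource_hour


# Three hypothetical resources observed over three hours.
activity = {
    "fn-a": [3, 0, 1],     # active in 2 of 3 hours
    "queue-b": [0, 0, 0],  # idle, so never billed
    "db-c": [5, 5, 5],     # active in all 3 hours
}
print(estimate_analysis_cost(activity, rate_per_resource_hour=0.5))
```

An idle resource contributes nothing to the total, which is why the estimate depends on activity rather than on the number of resources configured.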
With tools like Amazon DevOps Guru, DevOps teams at Southlights can prevent operational incidents before they occur. Contact us if you want to improve operational performance and availability, and manage resources proactively in your organization.
Sources: aws.amazon.com, infoq.com