AppKeeper is a service built on Amazon EC2 (EC2) that automates troubleshooting of your system. Sign-up is simple, and you can start using AppKeeper immediately afterward.
AppKeeper monitors services on the operating system running on EC2 instances, and restarts services or instances when it detects a service abnormality. The instances and services to monitor can be set flexibly from the GUI administration screen. Depending on the importance of each instance or service, you can exclude it from monitoring, or have AppKeeper monitor it without performing recovery.
AppKeeper can take over most of the troubleshooting activities that engineers previously handled based on information from operation monitoring tools. This greatly contributes to reducing system operation costs.
Features
This section describes features of AppKeeper.
- Automatic configuration
AppKeeper automatically detects the EC2 instances to monitor, and then starts monitoring the services running on them with only simple configuration. By default, all the EC2 instances that you are running and their services are monitored. AppKeeper can monitor both Windows and Linux environments. (Please refer to Supported Environment described later for details.)
- Specifying monitoring targets
You can specify the EC2 instances and services to monitor.
- Monitoring (check)
AppKeeper checks the status of the monitored services. Only services that are set to start automatically within the OS are monitored.
- Recovery (recover)
When AppKeeper detects that a monitored service has stopped, it restarts the service. If the service still does not recover, AppKeeper restarts the instance.
- Notification
AppKeeper can send a notification to the designated email address when a failure is detected and when recovery completes.
- Fault Logs
AppKeeper collects system information at the time of service failure and recovery.
- Publishing API (to be implemented in the future)
Each function provided by AppKeeper will be published as an API.
How AppKeeper Works
AppKeeper uses the API provided by AWS to automatically detect the EC2 instances to monitor. Then, using the RoleName set by a user and Run Command, AppKeeper performs automatic service detection, failure monitoring and recovery for the instances where the AWS Systems Manager agent (SSM agent) is installed.
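As a rough illustration of the flow above, the following Python sketch shows how instance detection and Run Command could be combined using boto3-style clients. It is not AppKeeper's actual implementation. The function names, the choice of the `AWS-RunShellScript` document, and the `systemctl` command used to list services are assumptions made for this example. The clients are passed in as parameters, so the sketch does not import boto3 itself.

```python
from typing import Any, List, Set


def running_instance_ids(ec2_client: Any) -> List[str]:
    """List the IDs of running EC2 instances (DescribeInstances API)."""
    paginator = ec2_client.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    return [
        instance["InstanceId"]
        for page in pages
        for reservation in page["Reservations"]
        for instance in reservation["Instances"]
    ]


def ssm_managed_ids(ssm_client: Any) -> Set[str]:
    """Collect IDs of instances whose SSM agent is registered with Systems Manager."""
    paginator = ssm_client.get_paginator("describe_instance_information")
    return {
        info["InstanceId"]
        for page in paginator.paginate()
        for info in page["InstanceInformationList"]
    }


def monitorable_instances(ec2_client: Any, ssm_client: Any) -> List[str]:
    """Running instances that can be reached through Run Command."""
    managed = ssm_managed_ids(ssm_client)
    return [iid for iid in running_instance_ids(ec2_client) if iid in managed]


def request_service_list(ssm_client: Any, instance_id: str) -> str:
    """Send a Run Command request and return its command ID.

    Assumption: a systemd-based Linux instance; the command lists services
    that are enabled for automatic startup.
    """
    response = ssm_client.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={
            "commands": ["systemctl list-unit-files --type=service --state=enabled"]
        },
    )
    return response["Command"]["CommandId"]
```

With real credentials, the clients would be created with `boto3.client("ec2")` and `boto3.client("ssm")` under a role that permits these calls.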
AppKeeper’s Behavior
1. Resources Discovery
Automatic resource detection runs every 30 minutes, which is how AppKeeper recognizes newly added services. If you choose the Resources Discovery button, AppKeeper recognizes them immediately.
2. Basic operation of monitoring and recovery
AppKeeper monitors the state of the services in the EC2 instances, and attempts recovery if there is any failure. Service recovery is performed in the following two steps.
- Restarting services with commands
- Restarting instances
AppKeeper first tries to recover the service with commands. If that fails, it restarts the instance. If recovery cannot be confirmed after the instance restarts, AppKeeper restarts the service once more. If the service still cannot be recovered, AppKeeper determines that restarts cannot recover it, sends a notification email to the administrator, and excludes the service from the monitoring targets.
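The escalation order described above can be summarized in a short Python sketch. This is a simplified model for illustration, not AppKeeper's code. The function name `recover` and the callback parameters are hypothetical. Each callback is assumed to return `True` when the service is confirmed healthy after the action.

```python
from typing import Callable, List


def recover(
    restart_service: Callable[[], bool],
    restart_instance: Callable[[], bool],
    log: List[str],
) -> str:
    """Run the escalation sequence and return 'recovered' or 'excluded'.

    Every action taken is appended to `log` so the order can be inspected.
    """
    # Step 1: try to recover the service with a restart command.
    log.append("restart_service")
    if restart_service():
        return "recovered"

    # Step 2: the command did not help, so restart the whole instance.
    log.append("restart_instance")
    if restart_instance():
        return "recovered"

    # Step 3: recovery not confirmed after the reboot, restart the service once more.
    log.append("restart_service")
    if restart_service():
        return "recovered"

    # Restarts cannot recover the service: notify the administrator and stop monitoring it.
    log.append("notify_admin")
    log.append("exclude_from_monitoring")
    return "excluded"
```

For example, if every attempt fails, `log` ends with `notify_admin` and `exclude_from_monitoring`, and the function returns `"excluded"`.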
3. Monitoring and recovery of services
How the service status is checked and how the service is recovered depend on the monitored OS.
The table below summarizes the monitoring (service status checking) and recovery (restart) methods for each OS.
OS | Service management | Monitoring (status checking) | Recovery (restart)
RHEL6/CentOS6/Amazon Linux | SysVinit | service status command | service --full-restart command
RHEL7/CentOS7/Amazon Linux 2 | systemd | systemctl is-active command | systemctl restart command
Windows Server 2012 R2/2016 | Windows services | Get-Service cmdlet | Start-Service cmdlet
If a monitoring (service status checking) or recovery (restart) command takes more than 30 seconds, it times out. A timeout during recovery (restart) is treated as a recovery failure.
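The per-OS commands and the 30-second timeout can be sketched in Python as follows. This is an illustrative model only. The `COMMANDS` mapping, the function names, the position of the service name in each command line, and the way PowerShell is invoked are assumptions, not AppKeeper's documented behavior.

```python
import subprocess
from typing import List, Optional

TIMEOUT_SECONDS = 30  # limit stated for both monitoring and recovery commands

# Command templates per service manager; "{name}" is replaced by the service name.
COMMANDS = {
    "sysvinit": {
        "check": ["service", "{name}", "status"],
        "recover": ["service", "{name}", "--full-restart"],
    },
    "systemd": {
        "check": ["systemctl", "is-active", "{name}"],
        "recover": ["systemctl", "restart", "{name}"],
    },
    "windows": {
        "check": ["powershell", "-Command", "Get-Service -Name {name}"],
        "recover": ["powershell", "-Command", "Start-Service -Name {name}"],
    },
}


def build_command(manager: str, action: str, service: str) -> List[str]:
    """Fill the service name into the template for the given manager and action."""
    return [part.format(name=service) for part in COMMANDS[manager][action]]


def run_with_timeout(
    argv: List[str], timeout: float = TIMEOUT_SECONDS
) -> Optional[subprocess.CompletedProcess]:
    """Run a command; return None if it exceeds the timeout.

    A caller performing recovery would treat None as a recovery failure.
    """
    try:
        return subprocess.run(argv, timeout=timeout, capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        return None
```

For instance, `build_command("systemd", "check", "nginx")` yields `["systemctl", "is-active", "nginx"]`. How the command output is interpreted as healthy or failed is left out of this sketch.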
Using AppKeeper with Auto Recovery
Auto Recovery is an AWS feature that automatically recovers EC2 instances when a hardware failure, or another failure that requires repair by AWS, occurs on them.
We strongly recommend using Auto Recovery for the monitored instances. AppKeeper and Auto Recovery each monitor and automatically recover from a different kind of failure:
AppKeeper: Failure of services running on EC2 instances
Auto Recovery: Hardware failure and system failure of AWS
In this way, AppKeeper and Auto Recovery monitor different layers. If a monitored instance becomes unavailable due to a hardware failure, AppKeeper cannot recover it.
For this reason, we recommend using AppKeeper and Auto Recovery together so that failures at each layer of the monitored targets are properly detected and recovered from.
No special AppKeeper settings are needed to use it together with Auto Recovery. If Auto Recovery is already set for the instances you want to monitor, you can continue to use it as it is. You can also set up Auto Recovery after setting up AppKeeper.