Changes in the number of instances and Coati’s behavior
Coati automatically detects resources every 30 minutes. This allows Coati to adapt to the environment when the number of instances to monitor changes.
If the number of instances increases, Coati automatically adds the new instances to the monitoring targets, provided they meet the support requirements. Default monitoring is applied to them: all active services are monitored, services and instances are restarted on failure, and failure, recovery-succeeded, and recovery-failed notifications are sent for all active instances.
To add new instances to Coati's monitoring targets immediately, click the "Detect Resources" button on the left pane.
If the number of instances decreases, Coati stops monitoring the instances that no longer exist.
If an instance is deleted while Coati is recovering it from a failure, an event notification email may be sent or the event may remain in the history, but this does not affect Coati's operation. To avoid this, remove the instance from Coati's monitoring targets before deleting it. The instance leaves the monitoring scope at the next monitoring cycle, so wait 3 minutes after removing it before you delete the instance.
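The detection behavior described in this section can be pictured as a reconciliation between the instances found in the environment and the current monitoring targets. The following is a minimal sketch for illustration only, not Coati's actual implementation. The names MonitoringTargets, reconcile, and excluded are hypothetical, and the sketch assumes that instances you removed from monitoring manually are not re-added automatically.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set, Tuple


@dataclass
class MonitoringTargets:
    """Hypothetical model of a monitoring-target list (not Coati's real data model)."""
    monitored: Set[str] = field(default_factory=set)
    # Assumption: instances removed by the user stay out until re-added manually.
    excluded: Set[str] = field(default_factory=set)


def reconcile(targets: MonitoringTargets,
              discovered: Iterable[str],
              supported: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Run one detection pass and return (added, dropped) instance IDs."""
    existing = set(discovered)
    # New instances are added only if they meet the support requirements.
    added = {
        i for i in existing
        if i in supported
        and i not in targets.monitored
        and i not in targets.excluded
    }
    # Instances that no longer exist are dropped from monitoring.
    dropped = targets.monitored - existing
    targets.monitored = (targets.monitored | added) - dropped
    return added, dropped
```

In this sketch, clicking "Detect Resources" would correspond to calling reconcile once on demand instead of waiting for the next 30-minute pass.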
Adding a service to monitor
Services to monitor are selected on the "Service Settings" screen, as described above. However, when a new service is added to the OS, it does not appear on the "Service Settings" screen immediately.
Coati recognizes the added service at the next automatic resource detection, which runs every 30 minutes, and you can then choose whether to monitor it. Because a newly detected service is treated as a monitoring target by default, deselect it on the "Service Settings" screen if you do not want to monitor it. To make Coati recognize the added service immediately, click the "Detect Resources" button on the left pane.
How to operate the monitored instances
No special operation is required to stop instances. Stopped instances are automatically excluded from the monitoring targets. However, if an instance is stopped while it is being monitored, a recovery may start and end incompletely. This does not cause any problem, but to avoid it entirely, remove the instance from the monitoring targets first and stop it a few minutes later.
As with stopping, no special operation is required to restart instances. However, if you want to avoid monitoring during the restart, remove the instances from the monitoring targets before restarting them. To resume monitoring after the restart, you must add the instances to the monitoring targets manually.
No special operation is required to delete instances. However, if you want to avoid monitoring during the deletion, remove the instances from the monitoring targets before deleting them.
When Coati’s recovery attempt fails
If Coati fails to recover a service automatically, it sets the status of the instance to "Recovery Failed" and stops checking the service status. In this state, the "Monitor" and "Recovery" boxes are unchecked on the "Monitoring setting" screen. To resume monitoring, check the "Monitor" and "Recovery" checkboxes on the "Individual Settings for Monitoring Target" screen.
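The "Recovery Failed" behavior can be summarized as a small state change on the monitoring target. The following sketch is illustrative only and assumes a hypothetical TargetSettings record. The "Monitoring" status label and the function names are assumptions, and only the "Recovery Failed" status and the two checkboxes come from this manual.

```python
from dataclasses import dataclass


@dataclass
class TargetSettings:
    """Hypothetical per-instance settings mirroring the two checkboxes."""
    monitor: bool = True
    recovery: bool = True
    status: str = "Monitoring"  # assumed label for the normal state


def record_recovery_result(settings: TargetSettings, succeeded: bool) -> TargetSettings:
    """Apply the outcome of one automatic recovery attempt."""
    if succeeded:
        settings.status = "Monitoring"
    else:
        # A failed recovery stops further checks until the user re-enables them.
        settings.status = "Recovery Failed"
        settings.monitor = False
        settings.recovery = False
    return settings


def resume_monitoring(settings: TargetSettings) -> TargetSettings:
    """Equivalent of re-checking "Monitor" and "Recovery" by hand."""
    settings.monitor = True
    settings.recovery = True
    settings.status = "Monitoring"
    return settings
```

The point of the sketch is that monitoring does not resume by itself after a failed recovery. It resumes only after the manual step.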
Connection failure to target instances
Coati uses SSM Agent (AWS Systems Manager Agent) to communicate with instances.
If SSM is unavailable, Coati cannot continue monitoring and recovery, so it sends an "SSM Inactive" detection email.
If you receive this email, check the SSM connection of the instance. Coati automatically resumes monitoring as soon as the SSM connection is confirmed.
The detection email is also sent when you stop the instance or SSM yourself. In that case, no action is required. If you do not need "SSM Inactive" detection emails, you can turn them off by unchecking the "SSM Inactive" checkbox on the "Notification Settings" screen.
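One way to check the SSM connection of an instance yourself is to query AWS Systems Manager for the agent's ping status. The following is a minimal sketch that uses the boto3 describe_instance_information call. It assumes that boto3 is installed and that your credentials allow ssm:DescribeInstanceInformation. The helper names are hypothetical, and this is not necessarily how Coati itself performs the check.

```python
from typing import Optional


def ssm_ping_status(response: dict, instance_id: str) -> Optional[str]:
    """Return the PingStatus of instance_id from a DescribeInstanceInformation response.

    Returns None when the instance is not registered with Systems Manager.
    """
    for info in response.get("InstanceInformationList", []):
        if info.get("InstanceId") == instance_id:
            return info.get("PingStatus")
    return None


def check_ssm_connection(instance_id: str, region_name: Optional[str] = None) -> bool:
    """Return True if Systems Manager reports the agent on the instance as Online."""
    import boto3  # imported here so ssm_ping_status can be used without the AWS SDK

    ssm = boto3.client("ssm", region_name=region_name)
    response = ssm.describe_instance_information(
        Filters=[{"Key": "InstanceIds", "Values": [instance_id]}]
    )
    # PingStatus is "Online", "ConnectionLost", or "Inactive".
    return ssm_ping_status(response, instance_id) == "Online"
```

For example, check_ssm_connection("i-0123456789abcdef0", region_name="us-east-1") would return False for a stopped instance. The instance ID and region shown here are placeholders.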
State of “Command Unavailable”
If Coati's monitoring, recovery, and detection commands cannot be executed normally even though SSM is available, the instance is regarded as being in the "Command Unavailable" state. In this state, Coati cannot continue monitoring and recovery, so it sends a "Command Unavailable" email. If you receive this email, check your environment. If you do not need these emails, you can turn them off by unchecking the "Command Unavailable" checkbox on the "Notification Settings" screen.
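The relationship between the "SSM Inactive" and "Command Unavailable" states can be sketched as a simple decision order. This is an illustration under stated assumptions, not Coati's actual logic. The function names and the "Monitoring" label are hypothetical, and the sketch assumes that each notification is enabled unless its checkbox has been unchecked.

```python
from typing import Dict


def classify_instance(ssm_available: bool, commands_ok: bool) -> str:
    """Classify an instance from two health signals (hypothetical inputs)."""
    if not ssm_available:
        return "SSM Inactive"
    # "Command Unavailable" applies only when SSM itself is available.
    if not commands_ok:
        return "Command Unavailable"
    return "Monitoring"  # assumed label for the normal state


def should_notify(state: str, notification_settings: Dict[str, bool]) -> bool:
    """Decide whether an email is sent for the given state."""
    if state == "Monitoring":
        return False
    # Assumption: a notification is sent unless its checkbox is unchecked.
    return notification_settings.get(state, True)
```

In this sketch, an instance whose SSM connection is down is always reported as "SSM Inactive", whether or not commands would otherwise run.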
The following figure shows the possible states of an instance.