Saturday, January 20, 2018

How to fix "Service Monitor" and "Host Monitor" failure during Cloudera CDH5 cluster restart

Symptoms:

Request to the Service Monitor failed. This may cause slow page responses. View the status of the Service Monitor.
Request to the Host Monitor failed. This may cause slow page responses. View the status of the Host Monitor.

How to Fix:

In Cloudera Manager, go to Cloudera Management Service -> Configuration and change the two settings below.

Descriptor Fetch Tries Interval -> Increase from the default value of 2 seconds to 5 seconds

The interval between fetch tries for SCM descriptor when Cloudera Management Service roles are starting.

Descriptor Fetch Max Tries -> Increase from the default value of 5 to 60


Maximum number of tries to fetch SCM descriptor when Cloudera Management Service roles are starting. If the roles are not able to get the descriptor in these many tries, then they exit.
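To see why these two values matter together, here is a rough sketch of the arithmetic. It assumes a role sleeps one full interval after each failed try, so the time a role keeps retrying is roughly interval times max tries. The function name `retry_window` is made up for this illustration and is not part of Cloudera Manager.

```shell
#!/usr/bin/env bash
# Rough upper bound, in seconds, on how long a Cloudera Management Service
# role keeps trying to fetch the SCM descriptor before it exits.
# Assumption: one full sleep interval follows each failed try.
retry_window() {
  local interval_secs=$1 max_tries=$2
  echo $(( interval_secs * max_tries ))
}

echo "default window: $(retry_window 2 5) secs"    # 2 x 5  = 10
echo "tuned window:   $(retry_window 5 60) secs"   # 5 x 60 = 300
```

With the defaults, a role gives up after about 10 seconds, which is easily too short if the Cloudera Manager server is still starting during a cluster restart. The new values allow about 5 minutes.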



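Reference:

With the new settings, the Service Monitor log shows 26 failed fetch tries at 5-second intervals (about two minutes) before the warnings stop, far more than the default limit of 5 tries would allow: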

[root@cdh-vm cloudera-scm-firehose]# grep "5 sec" mgmt-cmf-mgmt-SERVICEMONITOR-cdh-vm.dbaglobe.com.log.out
2018-01-20 09:54:10,687 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 1 tries, sleeping for 5 secs
2018-01-20 09:54:15,722 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 2 tries, sleeping for 5 secs
2018-01-20 09:54:20,733 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 3 tries, sleeping for 5 secs
2018-01-20 09:54:25,743 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 4 tries, sleeping for 5 secs
2018-01-20 09:54:30,744 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 5 tries, sleeping for 5 secs
2018-01-20 09:54:35,745 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 6 tries, sleeping for 5 secs
2018-01-20 09:54:40,753 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 7 tries, sleeping for 5 secs
2018-01-20 09:54:45,761 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 8 tries, sleeping for 5 secs
2018-01-20 09:54:50,764 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 9 tries, sleeping for 5 secs
2018-01-20 09:54:55,767 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 10 tries, sleeping for 5 secs
2018-01-20 09:55:00,773 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 11 tries, sleeping for 5 secs
2018-01-20 09:55:05,789 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 12 tries, sleeping for 5 secs
2018-01-20 09:55:10,792 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 13 tries, sleeping for 5 secs
2018-01-20 09:55:15,796 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 14 tries, sleeping for 5 secs
2018-01-20 09:55:20,799 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 15 tries, sleeping for 5 secs
2018-01-20 09:55:25,802 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 16 tries, sleeping for 5 secs
2018-01-20 09:55:30,805 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 17 tries, sleeping for 5 secs
2018-01-20 09:55:35,813 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 18 tries, sleeping for 5 secs
2018-01-20 09:55:40,818 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 19 tries, sleeping for 5 secs
2018-01-20 09:55:45,824 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 20 tries, sleeping for 5 secs
2018-01-20 09:55:50,825 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 21 tries, sleeping for 5 secs
2018-01-20 09:55:55,828 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 22 tries, sleeping for 5 secs
2018-01-20 09:56:00,836 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 23 tries, sleeping for 5 secs
2018-01-20 09:56:05,837 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 24 tries, sleeping for 5 secs
2018-01-20 09:56:10,846 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 25 tries, sleeping for 5 secs
2018-01-20 09:56:15,852 WARN com.cloudera.cmon.firehose.Main: No descriptor fetched from http://cdh-vm.dbaglobe.com:7180 on after 26 tries, sleeping for 5 secs
[root@cdh-vm cloudera-scm-firehose]# grep "5 sec" mgmt-cmf-mgmt-SERVICEMONITOR-cdh-vm.dbaglobe.com.log.out | wc -l
26
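The same check can be wrapped in a small helper that also reports when the retries started and stopped. This is only a sketch: the function name `summarize_descriptor_retries` is hypothetical, and it assumes the log line layout shown in the excerpt above (date in the first field, time in the second).

```shell
#!/usr/bin/env bash
# Count "No descriptor fetched" warnings in a role log and print the
# time of the first and last one. Assumes the time is the second
# whitespace-separated field, as in the log excerpt above.
summarize_descriptor_retries() {
  local log_file=$1
  awk '/No descriptor fetched/ { n++; if (n == 1) first = $2; last = $2 }
       END { printf "tries=%d first=%s last=%s\n", n, first, last }' "$log_file"
}

# Example usage (run from the cloudera-scm-firehose log directory):
# summarize_descriptor_retries mgmt-cmf-mgmt-SERVICEMONITOR-cdh-vm.dbaglobe.com.log.out
```

If the number of tries reported is close to your Descriptor Fetch Max Tries setting, consider raising the limit further.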