yarn-site.xml: enable ResourceManager HA and use ZooKeeper to manage the ResourceManager state
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>slave01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>slave01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>master:2181,slave01:2181,slave02:2181</value>
  </property>
  <!-- Site specific YARN configuration properties -->
</configuration>
Sync the configuration to the master and slave01 nodes, then start the ResourceManager on both:
hadoop-2.7.2/sbin/yarn-daemon.sh start resourcemanager
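The sync-and-start step can be scripted. The following is only a minimal sketch under several assumptions: passwordless SSH between the nodes, the same hadoop-2.7.2 directory in the home directory on every host, and the edited file living at hadoop-2.7.2/etc/hadoop/yarn-site.xml. The names sync_and_start_rm, run, RM_HOSTS, LOCAL_HOST and DRY_RUN are hypothetical and not part of Hadoop. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: copy yarn-site.xml to each ResourceManager host
# and start the ResourceManager there.
# Assumes passwordless SSH and an identical hadoop-2.7.2 layout on every host.

HADOOP_DIR="${HADOOP_DIR:-hadoop-2.7.2}"
RM_HOSTS="${RM_HOSTS:-master slave01}"      # hosts of rm1 and rm2 in yarn-site.xml
LOCAL_HOST="${LOCAL_HOST:-$(uname -n)}"     # node where the file was edited

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

sync_and_start_rm() {
    for host in $RM_HOSTS; do
        # Do not copy the file onto itself on the node where it was edited.
        if [ "$host" != "$LOCAL_HOST" ]; then
            run scp "$HADOOP_DIR/etc/hadoop/yarn-site.xml" "$host:$HADOOP_DIR/etc/hadoop/"
        fi
        run ssh "$host" "$HADOOP_DIR/sbin/yarn-daemon.sh start resourcemanager"
    done
}
```

Calling sync_and_start_rm prints the planned scp and ssh commands; setting DRY_RUN=0 before the call executes them.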
Check the state of each ResourceManager:
hadoop-2.7.2/bin/yarn rmadmin -getServiceState rm1
hadoop-2.7.2/bin/yarn rmadmin -getServiceState rm2
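The two commands above can be wrapped in a loop over the configured ids. This is a rough sketch, not an official tool: rm_states, YARN_BIN and RM_IDS are hypothetical names, and the script assumes that a reachable ResourceManager prints exactly "active" or "standby" on standard output, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print the HA state of every ResourceManager id.

YARN_BIN="${YARN_BIN:-hadoop-2.7.2/bin/yarn}"
RM_IDS="${RM_IDS:-rm1 rm2}"    # should match yarn.resourcemanager.ha.rm-ids

rm_states() {
    for id in $RM_IDS; do
        # Discard stderr (retry and error messages) and keep only the state word.
        state="$("$YARN_BIN" rmadmin -getServiceState "$id" 2>/dev/null || true)"
        case "$state" in
            active|standby) echo "$id: $state" ;;
            *)              echo "$id: unavailable" ;;
        esac
    done
}
```

Running rm_states on any node that has the Hadoop client configured prints one line per id.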
Example session on slave01: rm2 is active at first; after the ResourceManager on slave01 is stopped, rm1 takes over as active and rm2 refuses connections.
[jxlgzwh@slave01 ~]$ hadoop-2.7.2/sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/jxlgzwh/hadoop-2.7.2/logs/yarn-jxlgzwh-resourcemanager-slave01.out
[jxlgzwh@slave01 ~]$ jps
4160 DataNode
3986 NameNode
4853 ResourceManager
4885 Jps
3851 JournalNode
3502 QuorumPeerMain
4542 DFSZKFailoverController
[jxlgzwh@slave01 ~]$ hadoop-2.7.2/bin/yarn rmadmin -getServiceState rm1
standby
[jxlgzwh@slave01 ~]$ hadoop-2.7.2/bin/yarn rmadmin -getServiceState rm2
active
[jxlgzwh@slave01 ~]$ jps
4160 DataNode
5425 Jps
3986 NameNode
4853 ResourceManager
3851 JournalNode
3502 QuorumPeerMain
4542 DFSZKFailoverController
[jxlgzwh@slave01 ~]$ hadoop-2.7.2/sbin/yarn-daemon.sh stop resourcemanager
stopping resourcemanager
[jxlgzwh@slave01 ~]$ hadoop-2.7.2/bin/yarn rmadmin -getServiceState rm1
active
[jxlgzwh@slave01 ~]$ hadoop-2.7.2/bin/yarn rmadmin -getServiceState rm2
16/11/27 23:44:42 INFO ipc.Client: Retrying connect to server: slave01/192.168.31.130:8033. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From slave01/192.168.31.130 to slave01:8033 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[jxlgzwh@slave01 ~]$
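When repeating this failover test, it can help to poll until the surviving ResourceManager reports active instead of checking by hand. The sketch below is one possible way to do that; wait_for_active, YARN_BIN and its retry parameters are hypothetical, and how long the takeover actually takes depends on the cluster.

```shell
#!/bin/sh
# Hypothetical helper: after stopping the active ResourceManager,
# poll until the given id reports "active".

YARN_BIN="${YARN_BIN:-hadoop-2.7.2/bin/yarn}"

# usage: wait_for_active <rm-id> [max_tries] [sleep_seconds]
wait_for_active() {
    id="$1"
    tries="${2:-10}"
    pause="${3:-2}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        state="$("$YARN_BIN" rmadmin -getServiceState "$id" 2>/dev/null || true)"
        if [ "$state" = "active" ]; then
            echo "$id became active after $i retries"
            return 0
        fi
        i=$((i + 1))
        sleep "$pause"
    done
    echo "$id did not become active within $tries attempts" >&2
    return 1
}
```

For the session above, one would stop the ResourceManager on slave01 and then run wait_for_active rm1.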
Reference documentation
http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html#RM_Failover