JBoss Dynamic Migration of Distributed HA Singletons among Nodes

Document Version 1.0
Copyright © 2012-2013 beijing.beijing.012@gmail.com


Keywords:
JBoss HA Service, HA Singleton, JBoss Cluster, JBoss load balancing, load migration, load distribution

A JBoss HA singleton is a cluster-wide singleton, which runs on only one node of a JBoss cluster. When the node on which the singleton runs fails, another node (the master node) automatically starts the singleton service. When there are several different HA singleton services, ALL of them are activated on the master node. When the master node fails, another node takes over the role of master node and starts ALL the singleton services there. This is the default HA singleton behavior of JBoss.

However, this default behavior can be a problem.
If the singleton services, or some of them, are resource intensive, we will run into performance problems when they all run simultaneously on one node while the other nodes are relatively idle.

We will now introduce a solution that dynamically distributes HA singleton services across preferred nodes at cluster startup. In case of server failures, the services are migrated and redistributed to the remaining nodes, based on the number of nodes still alive and on the preferred node for each service.

With an example, we will show how 3 HA singleton services can be dynamically migrated among the members of a 3-node JBoss cluster.

Rules for distributing services on nodes are:

if alive node count 3 -> TestHASvcA on node[0]
                         TestHASvcB on node[1]
                         TestHASvcC on node[2]

if alive node count 2 -> TestHASvcA on node[0]
                         TestHASvcB on node[0]
                         TestHASvcC on node[1]

if alive node count 1 -> TestHASvcA on node[0]
                         TestHASvcB on node[0]
                         TestHASvcC on node[0]
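Read as a table, the rules above map (alive node count, policy id) to a node index. The following is a minimal, hypothetical sketch of that mapping (the ids SS_A, SS_B and SS_C are the ones used later by the TestP election policy); the real implementation used in this post is the TestP class shown further below:

// Hypothetical helper illustrating the distribution rules above; it is not part of the example project.
public class PreferredNodeTable {

    // Maps (alive node count, policy id) to the index of the preferred node.
    public static int preferredIndex(int aliveNodeCount, String serviceId) {
        if (aliveNodeCount >= 3) {
            if ("SS_A".equals(serviceId)) return 0;
            if ("SS_B".equals(serviceId)) return 1;
            return 2; // SS_C
        }
        if (aliveNodeCount == 2) {
            // SS_A and SS_B share node[0], SS_C runs on node[1]
            return "SS_C".equals(serviceId) ? 1 : 0;
        }
        return 0; // a single surviving node runs everything
    }
}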

Explanation of the node sequence:
In a JBoss cluster, all nodes share the same sorted list of alive nodes. This list is sorted by the time at which a node joins the cluster. For a 3-node cluster with node1, node2 and node3, if the nodes are started one after another, the ordered list will be:
                                  node1 --> [0]
                                  node2 --> [1]
                                  node3 --> [2]

In case node1 fails, node2 and node3 will update the list to:
                                   node2 --> [0]
                                   node3 --> [1]

When node1 joins the cluster again (restarted), the shared list of the 3 nodes is changed to:
                                  node2 --> [0]
                                  node3 --> [1]
                                  node1 --> [2]
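The same evolution of the view list can be sketched in plain Java (the node names are placeholders; this only illustrates the ordering and is not JBoss API):

import java.util.ArrayList;
import java.util.List;

public class ViewOrderDemo {
    public static void main(String[] args) {
        // Nodes are appended in the order they join the cluster.
        List<String> view = new ArrayList<String>();
        view.add("node1");
        view.add("node2");
        view.add("node3");
        System.out.println(view); // [node1, node2, node3]

        // node1 fails: it is removed and the remaining nodes shift left.
        view.remove("node1");
        System.out.println(view); // [node2, node3]

        // node1 rejoins: it is appended at the end, not restored to its old position.
        view.add("node1");
        System.out.println(view); // [node2, node3, node1]
    }
}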


Expected HA service migration behavior of our example:

step1. start node1
TestHASvcA on node1
TestHASvcB on node1
TestHASvcC on node1
step2. start node2
TestHASvcA still on node1
TestHASvcB still on node1
TestHASvcC migrates to node2
step3. start node3
TestHASvcA still on node1
TestHASvcB migrates to node2
TestHASvcC migrates to node3
step4. node1 fails
TestHASvcA migrates to node2
TestHASvcB still on node2
TestHASvcC still on node3
step5. node2 fails
TestHASvcA migrates to node3
TestHASvcB migrates to node3
TestHASvcC still on node3
step6. node1 recovers
TestHASvcA still on node3
TestHASvcB still on node3
TestHASvcC migrates to node1
step7. node2 recovers
TestHASvcA still on node3
TestHASvcB migrates to node1
TestHASvcC migrates to node2


Configure and run a 3-node JBoss cluster

Please refer to the post HA Singleton, Cluster Wide Singleton as MBean in JBoss 5, part 1/3, and create a JBoss cluster with 3 nodes:
  • node1
  • node2
  • node3
Create the 3 MBeans TestHASvcA, TestHASvcB, TestHASvcC.
Use your favorite IDE to create a simple Java project "HAServiceDynamicLoadMigration".
Please refer to the post HA Singleton, Cluster Wide Singleton as MBean in JBoss 5, part 2/3, and create the 3 JBoss MBeans TestHASvcA, TestHASvcB and TestHASvcC as follows.


 TestHASvcA.java



package test.ha;

import org.jboss.system.ServiceMBeanSupport;

/**
 * Make sure to use the naming standard for MBeans (If the
 * source class is named Serious, then the interface must be named
 * SeriousMBean).
 *
 * @author ws
 */
public class TestHASvcA extends ServiceMBeanSupport implements TestHASvcAMBean {

    public void startHAService() {
        System.out.println("# Starting HA service " + TestHASvcA.class.toString());
    }

    public void stopHAService() {
        System.out.println("# Stopping HA service " + TestHASvcA.class.toString());
    }
}
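Note that startHAService() and stopHAService() are not declared in the TestHASvcAMBean interface: they are not meant to be called through the regular JMX management interface, but by the HASingletonController, which is pointed at them via the TargetStartMethod and TargetStopMethod attributes in the jboss-service.xml shown below.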



TestHASvcAMBean.java



package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestHASvcAMBean extends ServiceMBean {
}




TestHASvcB.java



package test.ha;

import org.jboss.system.ServiceMBeanSupport;

/**
 * Make sure to use the naming standard for MBeans (If the
 * source class is named Serious, then the interface must be named
 * SeriousMBean).
 *
 * @author ws
 */
public class TestHASvcB extends ServiceMBeanSupport implements TestHASvcBMBean {

    public void startHAService() {
        System.out.println("#Starting HA service " + TestHASvcB.class.toString());
    }

    public void stopHAService() {
        System.out.println("#Stopping HA service " + TestHASvcB.class.toString());
    }
}



TestHASvcBMBean.java



package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestHASvcBMBean extends ServiceMBean {
}




TestHASvcC.java



package test.ha;

import org.jboss.system.ServiceMBeanSupport;

/**
 * Make sure to use the naming standard for MBeans (If the
 * source class is named Serious, then the interface must be named
 * SeriousMBean).
 *
 * @author ws
 */
public class TestHASvcC extends ServiceMBeanSupport implements TestHASvcCMBean {

    public void startHAService() {
        System.out.println("#Starting HA service " + TestHASvcC.class.toString());
    }

    public void stopHAService() {
        System.out.println("#Stopping HA service " + TestHASvcC.class.toString());
    }
}



TestHASvcCMBean.java


package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestHASvcCMBean extends ServiceMBean {
}



Customize the JBoss HA Singleton Service Selection Policy
The key to dynamic HA service distribution and migration is the customized activation of HA singleton services on a selected node at runtime. JBoss uses an "HASingletonElectionPolicy" to decide on which node a given HA singleton service is activated. The default policy is "HASingletonElectionPolicySimple", which always activates the HA service on the "0th" node.
To change this default behavior, we provide JBoss with a customized election policy, called the "TestP" policy hereafter:
  • write a "TestPMBean" interface for the "TestP" MBean.
  • write a "TestP" MBean, which implements "HASingletonElectionPolicy" and "TestPMBean".
  • configure JBoss in "jboss-service.xml" to apply this election policy to the singleton service MBeans.


TestPMBean.java



package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestPMBean extends ServiceMBean {

    void setId(String singletonId);

    String getId();
}
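The "Id" attribute configured in jboss-service.xml below (for example <attribute name="Id">SS_A</attribute>) is bound to this getId/setId pair, following the standard JMX naming convention for MBean attributes.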






TestP.java



package test.ha;

import java.util.List;

import org.jboss.ha.framework.interfaces.ClusterNode;
import org.jboss.ha.framework.interfaces.HASingletonElectionPolicy;
import org.jboss.system.ServiceMBeanSupport;

public class TestP extends ServiceMBeanSupport implements
        HASingletonElectionPolicy, TestPMBean {

    /** Identifies the singleton service this policy instance elects for (e.g. "SS_A"). */
    private String id;

    @Override
    public void setId(String singletonId) {
        this.id = singletonId;
    }

    @Override
    public String getId() {
        return this.id;
    }

    public ClusterNode elect(List<ClusterNode> arg0) {
        System.out.println(" ### list all nodes before doing selection ploicy name: ");

        for (ClusterNode tmpNode : arg0) {
            System.out.println(" ## node name: " + tmpNode.getName());
        }

        ClusterNode selectedNode = getNodeSeq(arg0);
        System.out.println(" ## selected node, policy name: " + selectedNode.getName());

        return selectedNode;
    }

    /*
     * This method replaces the default selection behavior with our
     * customized selection logic.
     */
    private ClusterNode getNodeSeq(List<ClusterNode> activeNodes) {

        System.out.println(" ## getNodeSeq ,  policy name " + id);

        // 3 logical services (SS_A, SS_B, SS_C) distributed over up to 3 nodes.
        int nodeCount = activeNodes.size();

        if (nodeCount == 3) {
            if (this.id.equals("SS_A")) {
                return activeNodes.get(0);
            }
            if (this.id.equals("SS_B")) {
                return activeNodes.get(1);
            }
            if (this.id.equals("SS_C")) {
                return activeNodes.get(2);
            }
        }

        if (nodeCount == 2) {
            if (this.id.equals("SS_A")) {
                return activeNodes.get(0);
            }
            if (this.id.equals("SS_B")) {
                return activeNodes.get(0);
            }
            if (this.id.equals("SS_C")) {
                return activeNodes.get(1);
            }
        }

        if (nodeCount == 1) {
            if (this.id.equals("SS_A")) {
                return activeNodes.get(0);
            }
            if (this.id.equals("SS_B")) {
                return activeNodes.get(0);
            }
            if (this.id.equals("SS_C")) {
                return activeNodes.get(0);
            }
        }

        // Default: activate the service on the 0th node.
        return activeNodes.get(0);
    }
}
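As the console output in the following steps will show, every cluster topology change causes each controller on each node to re-run its election policy. Because all nodes share the same join-ordered member list, they all agree on the same elected node for each service.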



jboss-service.xml

Create a new folder "META-INF" directly in the base folder of the "HAServiceDynamicLoadMigration" project. Create a new "jboss-service.xml" in "META-INF" with the following content:



<?xml version="1.0" encoding="UTF-8"?>
<server>

   <mbean code="test.ha.TestHASvcA" name="myexample:service=TestHASvcA"/>

   <mbean code="test.ha.TestP"
          name="myexample:service=SingletonServiceControllerA,type=ElectionPolicy">
      <attribute name="Id">SS_A</attribute>
   </mbean>

   <mbean code="org.jboss.ha.singleton.HASingletonController"
          name="myexample:service=SingletonServiceControllerA">
      <attribute name="HAPartition">
         <inject bean="HAPartition"/>
      </attribute>
      <attribute name="ElectionPolicy">
         <inject bean="myexample:service=SingletonServiceControllerA,type=ElectionPolicy"/>
      </attribute>
      <attribute name="Target">
         <inject bean="myexample:service=TestHASvcA"/>
      </attribute>
      <attribute name="TargetStartMethod">startHAService</attribute>
      <attribute name="TargetStopMethod">stopHAService</attribute>
   </mbean>

   <mbean code="test.ha.TestHASvcB" name="myexample:service=TestHASvcB"/>

   <mbean code="test.ha.TestP"
          name="myexample:service=SingletonServiceControllerB,type=ElectionPolicy">
      <attribute name="Id">SS_B</attribute>
   </mbean>

   <mbean code="org.jboss.ha.singleton.HASingletonController"
          name="myexample:service=SingletonServiceControllerB">
      <attribute name="HAPartition">
         <inject bean="HAPartition"/>
      </attribute>
      <attribute name="ElectionPolicy">
         <inject bean="myexample:service=SingletonServiceControllerB,type=ElectionPolicy"/>
      </attribute>
      <attribute name="Target">
         <inject bean="myexample:service=TestHASvcB"/>
      </attribute>
      <attribute name="TargetStartMethod">startHAService</attribute>
      <attribute name="TargetStopMethod">stopHAService</attribute>
   </mbean>

   <mbean code="test.ha.TestHASvcC" name="myexample:service=TestHASvcC"/>

   <mbean code="test.ha.TestP"
          name="myexample:service=SingletonServiceControllerC,type=ElectionPolicy">
      <attribute name="Id">SS_C</attribute>
   </mbean>

   <mbean code="org.jboss.ha.singleton.HASingletonController"
          name="myexample:service=SingletonServiceControllerC">
      <attribute name="HAPartition">
         <inject bean="HAPartition"/>
      </attribute>
      <attribute name="ElectionPolicy">
         <inject bean="myexample:service=SingletonServiceControllerC,type=ElectionPolicy"/>
      </attribute>
      <attribute name="Target">
         <inject bean="myexample:service=TestHASvcC"/>
      </attribute>
      <attribute name="TargetStartMethod">startHAService</attribute>
      <attribute name="TargetStopMethod">stopHAService</attribute>
   </mbean>

</server>
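The pattern is the same for each service: a plain target MBean (TestHASvcA/B/C), a TestP election policy MBean carrying a unique Id (SS_A, SS_B, SS_C), and an HASingletonController that ties the two together via its ElectionPolicy and Target attributes. Because each controller gets its own policy instance with its own Id, each service can be elected to a different node.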



Now that the necessary coding is done, we are ready to deploy and test the HA singleton services.

Deployment

Use your IDE to export the "HAServiceDynamicLoadMigration" project as a "HAServiceDynamicLoadMigration.sar" archive into the "farm" folder of node1.
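Assuming default export settings, the deployed archive should roughly have the following layout (file names as created above):

HAServiceDynamicLoadMigration.sar
   META-INF/
      jboss-service.xml
   test/ha/
      TestHASvcA.class, TestHASvcAMBean.class
      TestHASvcB.class, TestHASvcBMBean.class
      TestHASvcC.class, TestHASvcCMBean.class
      TestP.class, TestPMBean.class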

Step by step we will now simulate the process of starting all the nodes in the cluster, shutting down nodes one by one until only one node is left alive, and then recovering the cluster by restarting the failed servers one after another. In the meantime we will keep an eye on the console output of each node to check whether the HA singleton services migrate among the nodes as expected, i.e. as programmed in the customized election policy.

Step1. start node1

Now start node1 and check its console output; you will see output like the following:
----

12:33:39,351 INFO  [GroupMember] I am (127.0.0.1:32902)
12:33:39,352 INFO  [GroupMember] New Members : 1 ([127.0.0.1:32902])
12:33:39,353 INFO  [GroupMember] All Members : 1 ([127.0.0.1:32902])
12:33:39,382 INFO  [STDOUT] 
....
12:33:56,251 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcA
12:33:56,309 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
12:33:56,337 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
----
The above output shows:
  • JBoss is started in clustered mode, but currently with only one node
  • As expected, all 3 HA services are started on node1

Step2. start node2

Now start node2 and check the console output of node1 and node2.

node1 console:
---
12:43:12,454 INFO  [GroupMember] org.jboss.messaging.core.impl.postoffice.GroupMember$ControlMembershipListener@1a701a8 got new view [127.0.0.1:50418|1] [127.0.0.1:50418, 127.0.0.1:53538], old view is [127.0.0.1:50418|0] [127.0.0.1:50418]                                                                                                                                      
12:43:12,470 INFO  [GroupMember] I am (127.0.0.1:50418)                                                                                                                                   
12:43:12,471 INFO  [GroupMember] New Members : 1 ([127.0.0.1:53538])                                                                                                                      
12:43:12,471 INFO  [GroupMember] All Members : 2 ([127.0.0.1:50418, 127.0.0.1:53538])                                                                                                     
...
12:43:32,899 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,977 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,977 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,977 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
12:43:32,977 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:32,978 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,978 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,978 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,978 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
12:43:32,986 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:33,020 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:33,021 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:33,021 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:33,021 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
12:43:33,021 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
12:43:33,022 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC

---
Node1 has detected that a new member has joined the cluster. It updates the sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]

Since the cluster structure has changed, the JBoss HA controller updates the HA singleton status with the help of the configured election policy.
"TestHASvcA" is configured with the election policy instance "SS_A", which activates "TestHASvcA" on the 0th node, i.e. node1. Since "TestHASvcA" is already active on node1, no further action is taken.

"TestHASvcB" is configured with the election policy instance "SS_B", which activates "TestHASvcB" on the 0th node, i.e. node1. Since "TestHASvcB" is already active on node1, no further action is taken.

"TestHASvcC" is configured with the election policy instance "SS_C", which activates "TestHASvcC" on the 1st node, i.e. node2. So "TestHASvcC" is stopped here on node1.

node2 console:
---
12:43:12,481 INFO  [GroupMember] I am (127.0.0.1:53538)
12:43:12,482 INFO  [GroupMember] New Members : 2 ([127.0.0.1:50418, 127.0.0.1:53538])
12:43:12,482 INFO  [GroupMember] All Members : 2 ([127.0.0.1:50418, 127.0.0.1:53538])
12:43:12,642 INFO  [STDOUT] 
...
12:43:32,901 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,902 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,903 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,904 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
12:43:32,905 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:32,967 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,968 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,969 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,969 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
12:43:32,970 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:33,023 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:33,023 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:33,023 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:33,023 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
12:43:33,023 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
12:43:33,027 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC

---
Node2 joins the cluster and detects that there is already a member in the cluster. It builds the sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]
Since the cluster structure has changed, the JBoss HA controller updates the HA singleton status.
According to the configured election policy, "TestHASvcA" and "TestHASvcB" should stay on the 0th node, i.e. node1, so nothing is done for these services on node2.
"TestHASvcC" is configured to run on the 1st node, i.e. node2, so node2 starts "TestHASvcC".

Step3. start node3

Now start node3 and check the console output of node1, node2 and node3.

node1 console:
---
14:52:24,349 INFO  [GroupMember] I am (127.0.0.1:36472)
14:52:24,349 INFO  [GroupMember] New Members : 1 ([127.0.0.1:42881])
14:52:24,350 INFO  [GroupMember] All Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
...
14:52:43,676 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,705 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,706 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,706 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,706 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:52:43,706 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
14:52:43,752 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,752 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,752 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,753 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,753 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:52:43,753 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:52:43,757 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcB
14:52:43,823 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,825 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,825 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,826 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,826 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
14:52:43,827 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
---

Node3 joins the cluster. Node1 updates the sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]
                node3 -->[2]
Since the cluster structure has changed, the JBoss HA controller updates the HA singleton status.
According to the configured election policy, "TestHASvcA" should stay on the 0th node, i.e. node1. "TestHASvcB" should be migrated to the 1st node, i.e. node2, so this service is stopped on node1.

node2 console:
---
[127.0.0.1:36472|1] [127.0.0.1:36472, 127.0.0.1:44741]
14:52:24,298 INFO  [GroupMember] I am (127.0.0.1:44741)
14:52:24,298 INFO  [GroupMember] New Members : 1 ([127.0.0.1:42881])
14:52:24,298 INFO  [GroupMember] All Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
14:52:43,675 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,676 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,676 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,676 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,676 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:52:43,676 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
14:52:43,748 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,749 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,749 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,749 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,749 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:52:43,749 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:52:43,751 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
14:52:43,822 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,822 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,823 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,823 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,823 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
14:52:43,823 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
14:52:43,824 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC
---
Node2 updates the sorted list of currently alive nodes too:
                node1 -->[0]
                node2 -->[1]
                node3 -->[2]
According to the configured election policy, "TestHASvcB" should be migrated to the 1st node, i.e. node2, so this service was stopped on node1 (as shown in the node1 console) and is started on node2.
"TestHASvcC" no longer belongs to node2: it is stopped on node2 and will be started on node3.

node3 console:
---
14:52:24,367 INFO  [GroupMember] I am (127.0.0.1:42881)
14:52:24,368 INFO  [GroupMember] New Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
14:52:24,369 INFO  [GroupMember] All Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
14:52:24,875 INFO  [STDOUT] 
....
14:52:43,679 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,680 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,681 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,681 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,682 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:52:43,682 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
14:52:43,755 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,756 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,756 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,756 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,756 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:52:43,757 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:52:43,826 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,827 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,827 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,827 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,827 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
14:52:43,827 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
14:52:43,831 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
---
Node3 joins the cluster and detects that there are already 2 members in the cluster. It creates the sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]
                node3 -->[2]
Since the cluster structure has changed, the JBoss HA controller updates the HA singleton status.
According to the configured election policy, "TestHASvcC" is taken over by node3 from node2.

Step4. node1 fails

Shut down node1 and check the console output of node2 and node3.

console node2:
---
 14:55:37,727 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:37,810 INFO  [STDOUT]  ## node name: 127.0.0.1:1299                                                                                                                                       
14:55:37,810 INFO  [STDOUT]  ## node name: 127.0.0.1:1399                                                                                                                                       
14:55:37,810 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C                                                                                                                                  
14:55:37,810 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399                                                                                                                      
14:55:38,000 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name:                                                                                                             
14:55:38,000 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,000 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:55:38,000 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:55:38,001 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:55:38,031 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:38,039 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,039 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:55:38,039 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:55:38,040 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:55:38,043 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcA
...
14:55:46,615 INFO  [DefaultPartition] I am (127.0.0.1:1299) received membershipChanged event:
14:55:46,615 INFO  [DefaultPartition] Dead members: 1 ([127.0.0.1:1199])
14:55:46,616 INFO  [DefaultPartition] New Members : 0 ([])
14:55:46,616 INFO  [DefaultPartition] All Members : 2 ([127.0.0.1:1299, 127.0.0.1:1399])
---
When node1 is shut down, node2 updates the sorted list of currently alive nodes:
                node2 -->[0]
                node3 -->[1]

With 2 alive nodes, "TestHASvcA" should run on the 0th node. According to the updated list, the 0th node is now node2, so node2 starts "TestHASvcA". "TestHASvcB" was already running on node2 and stays there.

console node3:
---
14:55:37,880 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:38,016 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,016 INFO  [STDOUT]  ## node name: 127.0.0.1:1399                                                                         
14:55:38,016 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C                                                                    
14:55:38,017 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399                                                        
14:55:38,017 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name:                                               
14:55:38,017 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,017 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:55:38,017 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:55:38,017 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:55:38,032 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:38,033 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,033 INFO  [STDOUT]  ## node name: 127.0.0.1:1399                                                                         
14:55:38,033 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A                                                                    
14:55:38,033 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
...
14:55:46,625 INFO  [DefaultPartition] I am (127.0.0.1:1399) received membershipChanged event:
14:55:46,625 INFO  [DefaultPartition] Dead members: 1 ([127.0.0.1:1199])
14:55:46,625 INFO  [DefaultPartition] New Members : 0 ([])
14:55:46,625 INFO  [DefaultPartition] All Members : 2 ([127.0.0.1:1299, 127.0.0.1:1399])
---

Node3 just updated the alive node list. "TestHASvcC" stays on node3.

Step5. node2 fails

When node2 fails now, node3 is left as the only alive node. As shown in node3's console,
the alive node list now contains only one node, i.e. node3. The services that were running on node2 are migrated to node3.

console node3:
---
14:58:10,737 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
14:58:10,800 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcA
14:58:12,491 INFO  [GroupMember] ....
14:58:14,078 INFO  [DefaultPartition] I am (127.0.0.1:1399) received membershipChanged event:
14:58:14,115 INFO  [DefaultPartition] Dead members: 1 ([127.0.0.1:1299])
14:58:14,115 INFO  [DefaultPartition] New Members : 0 ([])
14:58:14,115 INFO  [DefaultPartition] All Members : 1 ([127.0.0.1:1399])
---

Step6. node1 recovers

Start node1 again and check the console output of node1 and node3.

console node1:
---
15:01:35,189 INFO  [GroupMember] I am (127.0.0.1:45635)
15:01:35,190 INFO  [GroupMember] New Members : 2 ([127.0.0.1:42881, 127.0.0.1:45635])
15:01:35,190 INFO  [GroupMember] All Members : 2 ([127.0.0.1:42881, 127.0.0.1:45635])
15:01:35,354 INFO  [STDOUT] 
---------------------------------------------------------
GMS: address is 127.0.0.1:7900 (cluster=MessagingPostOffice-DATA)
---------------------------------------------------------

15:01:49,131 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,132 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,132 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,132 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:01:49,133 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,207 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,207 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,207 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,208 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:01:49,208 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,265 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,266 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,266 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,266 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:01:49,266 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:01:49,272 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
15:01:49,366 INFO  [Http11Protocol] Starting Coyote HTTP/1.1 on http-127.0.0.1-8180
15:01:49,526 INFO  [AjpProtocol] Starting Coyote AJP/1.3 on ajp-127.0.0.1-8109
15:01:49,576 INFO  [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221634)] Started in 1m:51s:721ms
---
When node1 recovers, the alive node list becomes:
                node3 -->[0]
                node1 -->[1]
With an alive node count of 2, "TestHASvcA" and "TestHASvcB" should run on the 0th node, so they stay on node3. "TestHASvcC" is migrated to the 1st node, i.e. node1. Accordingly, the node1 console shows "TestHASvcC" being started.

console node3:
---
15:01:18,345 INFO  [DefaultPartition] I am (127.0.0.1:1399) received membershipChanged event:
15:01:18,345 INFO  [DefaultPartition] Dead members: 0 ([])
15:01:18,345 INFO  [DefaultPartition] New Members : 1 ([127.0.0.1:1199])
15:01:18,345 INFO  [DefaultPartition] All Members : 2 ([127.0.0.1:1399, 127.0.0.1:1199])
...
15:01:49,126 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,127 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,127 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,127 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:01:49,127 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,206 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,206 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,206 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,206 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:01:49,207 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,263 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,270 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC
15:01:49,280 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,281 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,281 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:01:49,281 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
---

Node3 is still the 0th node in the cluster, but "TestHASvcC" is now expected to run on the 1st node, so this service no longer belongs to node3 and is shut down there.

Step7. node2 recovers

Start node2 again.

console node1:
---
15:04:44,024 INFO  [GroupMember] I am (127.0.0.1:45635)
15:04:44,024 INFO  [GroupMember] New Members : 1 ([127.0.0.1:39250])
15:04:44,025 INFO  [GroupMember] All Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 127.0.0.1:39250])
...
15:05:00,621 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,687 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,688 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:05:00,688 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:05:00,688 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,701 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,701 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:05:00,701 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:05:00,860 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
15:05:00,860 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,861 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,861 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,861 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,861 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:05:00,861 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
15:05:00,864 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC
---
When node2 rejoins the cluster, there are 3 nodes in the cluster again, but the alive node list differs from the one before the failures. It is now sorted as:

                node3 -->[0]
                node1 -->[1]
                node2 -->[2]
Before node2 rejoined, "TestHASvcC" was running on node1. Now, with the alive node count back to 3, node1 as the 1st node should only run "TestHASvcB". Therefore "TestHASvcC" is stopped and "TestHASvcB" is started on node1.

console node2:
---
15:04:44,038 INFO  [GroupMember] I am (127.0.0.1:39250)
15:04:44,038 INFO  [GroupMember] New Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 127.0.0.1:39250])
15:04:44,039 INFO  [GroupMember] All Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 
...
15:05:00,624 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,625 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,625 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,625 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,625 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:05:00,625 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:05:00,732 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,733 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,733 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,733 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,733 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:05:00,733 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:05:00,863 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,863 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,863 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,863 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,863 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:05:00,863 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
15:05:00,868 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
---
Node2 joins the cluster as element [2] of the alive node list, and therefore only has "TestHASvcC" started on it.

console node3:
---
[127.0.0.1:42881, 127.0.0.1:45635]
15:04:44,019 INFO  [GroupMember] I am (127.0.0.1:42881)
15:04:44,019 INFO  [GroupMember] New Members : 1 ([127.0.0.1:39250])
15:04:44,019 INFO  [GroupMember] All Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 127.0.0.1:39250])
...
15:05:00,621 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,689 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,689 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,689 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,689 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:05:00,689 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:05:00,704 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcB
15:05:00,731 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,732 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,788 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,788 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,788 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:05:00,789 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:05:00,789 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,789 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,789 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,789 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,789 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:05:00,789 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
---
Before node2 rejoined the cluster, node3 was the 0th node and had "TestHASvcA" and "TestHASvcB" running on it.

Now that the cluster has 3 nodes again, node3 only keeps "TestHASvcA". "TestHASvcB" is stopped on node3 and taken over by node1, as shown in the node1 console.

CONGRATULATIONS!
We are now done with our experiment. The HA singleton services are once again distributed across the 3 nodes.

Dynamic HA singleton service distribution and migration is useful when you have several services where:
  • each service should run only once in the cluster
  • each service should be highly available, i.e. survive node failures
  • the services should not all run on one node, but be distributed among all nodes, for example for performance reasons.
You might ask: in a client-server system, what if a client is connected to one node, but the required service runs on another node? We will come back to this topic later...