website/reference/basic/subcollectives.md

   1 ---
   2 layout: default
   3 title: Subcollectives
   4 ---
   5 [ActiveMQClustering]: /mcollective/reference/integration/activemq_clusters.html
   6 [MessageFlow]: messageflow.html
   7
   8 # Overview
   9
  10 By default all servers are part of a single broadcast domain.  If you have an
  11 agent on all machines in your network and you send a message directed to
  12 machines with that agent, they will all receive it regardless of filters.
  13
  14 This works well for the common use case but can present problems in the
  15 following scenarios:
  16
  17  * You have a very big and busy network.  With thousands of machines responding
  18    to requests every 10 seconds, very large numbers of messages will be created
  19    and you might want to partition the traffic.
  20  * You have multiple data centers on different continents, where the latency
  21    and volume of traffic will be too big.  You'd want to duplicate monitoring and
  22    management automations in each datacenter, but as it's all one broadcast
  23    domain you still see a large amount of traffic on the slow link.
  24  * You can't just run multiple separate installs because you still wish to
  25    retain the central management feature of MCollective.
  26  * Securing a flat network can be problematic.  SimpleRPC has a security
  27    framework that is aware of users and network topology but the core network
  28    doesn't.
  29
  30 We've introduced the concept of subcollectives, which let you define broadcast
  31 domains and configure an MCollective server to belong to one or many of these domains.
  32
  33 # Partitioning Approaches
  34
  35 Determining how to partition your network can be a very complex subject and
  36 requires an understanding of your message flow, where requestors sit and also
  37 the topology of your middleware clusters.
  38
  39 Most middleware solutions will only send traffic where they know there is an
  40 interest in this traffic.  Therefore, if you had an agent on only 10 of 1000
  41 machines, only those 10 machines will receive the associated traffic.  This is an
  42 important distinction to keep in mind.
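
The interest-based delivery described above can be illustrated with a toy model.  This is only a sketch: `ToyBroker` is a made-up class, not the ActiveMQ API, and the topic names are hypothetical placeholders rather than the names MCollective necessarily uses.

```python
from collections import defaultdict


class ToyBroker:
    """Toy stand-in for a middleware broker; not the ActiveMQ API."""

    def __init__(self):
        # topic name -> set of node names that registered interest in it
        self.subscriptions = defaultdict(set)

    def subscribe(self, node, topic):
        self.subscriptions[topic].add(node)

    def publish(self, topic):
        """Return the nodes that would receive a message sent to `topic`."""
        return sorted(self.subscriptions.get(topic, set()))


broker = ToyBroker()
nodes = [f"node{i:04d}" for i in range(1000)]

# Hypothetical topic names: every node runs a "discovery" agent,
# but only the first 10 nodes run an "nrpe" agent.
for node in nodes:
    broker.subscribe(node, "mcollective.discovery.command")
for node in nodes[:10]:
    broker.subscribe(node, "mcollective.nrpe.command")

nrpe_receivers = broker.publish("mcollective.nrpe.command")
discovery_receivers = broker.publish("mcollective.discovery.command")
print(len(nrpe_receivers), "of", len(nodes), "nodes receive nrpe traffic")
print(len(discovery_receivers), "of", len(nodes), "nodes receive discovery traffic")
```

Only the nodes that subscribed to a topic receive its messages, which is why an agent that is installed everywhere is the case where partitioning matters most.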
  43
  44 ![ActiveMQ Cluster](../../images/subcollectives-multiple-middleware.png)
  45
  46 We'll be working with the small 52 node collective that you can see above.  The
  47 collective has machines in many data centers spread over 4 countries.  There are
  48 3 ActiveMQ servers connected in a mesh.
  49
  50 Alongside each ActiveMQ node there is also a Puppet Master, a Nagios instance and
  51 other shared infrastructure components.
  52
  53 An ideal setup for this network would be:
  54
  55  * MCollective NRPE and Puppetd Agent on each of the 52 servers
  56  * Puppet Commander on each of the 3 ActiveMQ locations
  57  * Nagios in each of the locations monitoring the machines in its region
  58  * Regional traffic will be isolated and contained to the region as much as
  59    possible
  60  * Systems Administrators and Registration data retain the ability to target the
  61    whole collective
  62
  63 The problem with a single flat collective is that each of the 3 Puppet
  64 Commanders will get a copy of all the traffic, even traffic they did not request,
  65 and will simply ignore the unwanted traffic.  The links between Europe and US will
  66 see a large number of messages traveling over them.  In a small 52 node collective
  67 this is manageable, but if each of the 4 locations had thousands of nodes the
  68 situation would rapidly become untenable.
  69
  70 It seems natural then to create a number of broadcast domains - subcollectives:
  71
  72  * A global collective that each machine belongs to
  73  * UK, DE, ZA and US collectives that contain just machines in those regions
  74  * An EU collective that has UK, DE and ZA machines
  75
  76 Visually this arrangement might look like the diagram below:
  77
  78 ![Subcollectives](../../images/subcollectives-collectives.png)
  79
  80 Notice how subcollectives can span broker boundaries - our EU collective has nodes
  81 that would connect to both the UK and DE brokers.
  82
  83 We can now configure our Nagios and Puppet Commanders to communicate only with
  84 the subcollectives and the traffic for these collectives will be contained
  85 regionally.
  86
  87 The graph below shows the impact of doing this.  It is the US ActiveMQ instance
  88 showing traffic before and after partitioning.  You can see that even on a small
  89 network this can have a big impact.
  90
  91 ![Subcollectives](../../images/subcollectives-impact.png)
  92
  93 # Configuring MCollective
  94
  95 Configuring the partitioned collective above is fairly simple.  We'll look at
  96 one of the DE nodes for reference:
  97
  98 {% highlight ini %}
  99 collectives = mcollective,de_collective,eu_collective
 100 main_collective = mcollective
 101 {% endhighlight %}
 102
 103 The _collectives_ directive tells the node all the collectives it should belong
 104 to and the _main`_`collective_ directive instructs Registration where to direct
 105 its messages.
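
If you generate configuration from a template, the same two settings can be derived from a node's region.  The following is a minimal sketch assuming the example layout from this page (UK, DE and ZA machines also join the EU collective); the helper names `collectives_for` and `server_cfg_lines` are hypothetical and not part of MCollective.

```python
MAIN_COLLECTIVE = "mcollective"
# Regions that also belong to the EU collective in the example layout.
EU_REGIONS = {"uk", "de", "za"}


def collectives_for(region):
    """Return the collectives a server in `region` should join, in order."""
    names = [MAIN_COLLECTIVE, f"{region}_collective"]
    if region in EU_REGIONS:
        names.append("eu_collective")
    return names


def server_cfg_lines(region):
    """Render the two settings discussed above for a server in `region`."""
    return [
        "collectives = " + ",".join(collectives_for(region)),
        f"main_collective = {MAIN_COLLECTIVE}",
    ]


for region in ("de", "us"):
    print(f"# {region} node")
    print("\n".join(server_cfg_lines(region)))
```

For a DE node this reproduces the two lines shown above, while a US node joins only the global collective and its own regional one.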
 106
 107 # Partitioning for Security
 108
 109 Another possible advantage of subcollectives is security.  While the SimpleRPC
 110 framework has a security model that is aware of the topology, the core network
 111 layer does not.  Even if you only give someone access to run SimpleRPC requests
 112 against some machines they can still use _mc ping_ to discover other nodes on
 113 your network.
 114
 115 By creating a subcollective of just their nodes and restricting them on the
 116 middleware level to just that collective you can effectively and completely
 117 create a secure isolated zone that overlays your existing network.
 118
 119 # Testing
 120
 121 Testing that it works is pretty simple.  First we need a _client.cfg_ that
 122 configures the client to talk to all the subcollectives:
 123
 124 {% highlight ini %}
 125 collectives = mcollective,uk_collective,us_collective,de_collective,eu_collective,za_collective
 126 main_collective = mcollective
 127 {% endhighlight %}
 128
 129 You can now test with _mc ping_:
 130
 131 {% highlight console %}
 132 $ mc ping -T us_collective
 133 host1.us.my.net         time=200.67 ms
 134 host2.us.my.net         time=241.30 ms
 135 host3.us.my.net         time=245.24 ms
 136 host4.us.my.net         time=275.42 ms
 137 host5.us.my.net         time=279.90 ms
 138 host6.us.my.net         time=283.61 ms
 139 host7.us.my.net         time=370.56 ms
 140
 141
 142 ---- ping statistics ----
 143 7 replies max: 370.56 min: 200.67 avg: 270.96
 144 {% endhighlight %}
 145
 146 By specifying other collectives in the -T argument you should see the nodes of
 147 those subcollectives, and if you do not specify anything you should see all machines.
 148
 149 Clients don't need to know about all collectives, only the ones they intend
 150 to communicate with.
 151
 152 You can discover the list of known collectives and how many nodes are in each
 153 using the _inventory_ application:
 154
 155 {% highlight console %}
 156 $ mc inventory --list-collectives
 157
 158  * [ ==================================== ] 52 / 52
 159
 160    Collective                     Nodes
 161    ==========                     =====
 162    za_collective                  2
 163    us_collective                  7
 164    uk_collective                  19
 165    de_collective                  24
 166    eu_collective                  45
 167    mcollective                    52
 168
 169                      Total nodes: 52
 170
 171 {% endhighlight %}
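
The node counts give a quick sanity check on the partitioning: the EU collective should be the sum of its regions, and the global collective should be EU plus US.  The sketch below parses a copy of the sample output shown above; the parsing is an assumption based only on that sample layout, not on a documented output format.

```python
import re

# Table body copied from the sample inventory output above.
SAMPLE = """\
   Collective                     Nodes
   ==========                     =====
   za_collective                  2
   us_collective                  7
   uk_collective                  19
   de_collective                  24
   eu_collective                  45
   mcollective                    52
"""


def parse_counts(text):
    """Map collective name -> node count for lines shaped like 'name  N'."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\w+)\s+(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_counts(SAMPLE)
eu_expected = sum(counts[f"{r}_collective"] for r in ("uk", "de", "za"))
global_expected = eu_expected + counts["us_collective"]

print("eu_collective:", counts["eu_collective"], "expected", eu_expected)
print("mcollective:", counts["mcollective"], "expected", global_expected)
```

In the sample, 19 + 24 + 2 = 45 EU nodes and 45 + 7 = 52 nodes overall, which matches the reported totals.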
 172
 173 # ActiveMQ Filters
 174
 175 The above setup should just work in most cases but you might want to go one step
 176 further and actively prevent subcollective traffic from propagating across the
 177 network.
 178
 179 In your ActiveMQ broker setup you will already have a section defining your
 180 network connections, something like:
 181
 182 {% highlight xml %}
 183 <networkConnectors>
 184   <networkConnector
 185      name="us-uk"
 186      uri="static:(tcp://stomp1.uk.my.net:6166)"
 187      userName="amq"
 188      password="secret"
 189      duplex="true" />
 190 </networkConnectors>
 191 {% endhighlight %}
 192
 193 You can add filters here to restrict traffic.  In this case the US<->UK connection
 194 should never transmit traffic for _us`_`collective_ or the other regional collectives, so let's restrict that:
 195
 196 {% highlight xml %}
 197 <networkConnectors>
 198   <networkConnector
 199      name="us-uk"
 200      uri="static:(tcp://stomp1.uk.my.net:6166)"
 201      userName="amq"
 202      password="secret"
 203      duplex="true">
 204      <excludedDestinations>
 205        <topic physicalName="us_collective.>" />
 206        <topic physicalName="uk_collective.>" />
 207        <topic physicalName="de_collective.>" />
 208        <topic physicalName="za_collective.>" />
 209        <topic physicalName="eu_collective.>" />
 210      </excludedDestinations>
 211   </networkConnector>
 212 </networkConnectors>
 213 {% endhighlight %}
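
To reason about which destinations such a filter would hold back, it can help to model the wildcard matching.  The sketch below is a simplified model of ActiveMQ destination wildcards (`.` separates names, `*` matches a single name, `>` matches everything that follows), not the broker's own implementation.  The destination names passed to it are hypothetical examples, and the case of `>` matching zero remaining names is deliberately not modelled.

```python
EXCLUDED = [
    "us_collective.>",
    "uk_collective.>",
    "de_collective.>",
    "za_collective.>",
    "eu_collective.>",
]


def matches(pattern, destination):
    """Simplified wildcard match; assumes '>' needs at least one more name."""
    p_parts = pattern.split(".")
    d_parts = destination.split(".")
    for i, part in enumerate(p_parts):
        if part == ">":
            # '>' swallows all remaining names of the destination.
            return len(d_parts) > i
        if i >= len(d_parts):
            return False
        if part != "*" and part != d_parts[i]:
            return False
    return len(p_parts) == len(d_parts)


def crosses_link(destination):
    """True if no exclusion pattern matches, so the link may forward it."""
    return not any(matches(pattern, destination) for pattern in EXCLUDED)


# Hypothetical destination names, for illustration only.
for name in ("us_collective.nrpe.command", "mcollective.registration.command"):
    print(name, "->", "forwarded" if crosses_link(name) else "blocked")
```

Under these assumptions only traffic for the global _mcollective_ collective is left to cross the US<->UK connection.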