Puppet: Mastering Infrastructure Automation

Establishing relationships among containers

Puppet's classes bear little or no similarity to classes that you find in object-oriented programming languages such as Java or Ruby. There are no methods or attributes. There are no distinct instances of any class. You cannot create interfaces or abstract base classes.

One of the few shared characteristics is encapsulation. Just like classes in OOP, Puppet's classes hide implementation details. To get Puppet to start managing a subsystem, you just need to include the appropriate class.

Passing events between classes and defined types

By sorting all resources into classes, you make it unnecessary (for your co-workers or other collaborators) to know about each single resource. This is beneficial. You can think of the collection of classes and defined types as your interface. You would not want to read all manifests that anyone on your project ever wrote.

However, the encapsulation is inconvenient for passing resource events. Say you have some daemon that creates live statistics from your Apache logfiles. It should subscribe to Apache's configuration files so that it can restart if there are any changes (which might be of consequence to this daemon's operation). In another scenario, you might have Puppet manage some external data for a self-compiled Apache module. If Puppet updates such data, you will want to trigger a restart of the Apache service to reload everything.

Armed with the knowledge that there is a service, Service['apache2'], defined somewhere in the apache class, you can just go ahead and have your module data files notify that resource. It would work—Puppet does not apply any sort of protection to resources that are declared in foreign classes. However, it would pose a minor maintainability issue.

The reference to the resource is located far from the resource itself. When maintaining the manifest later, you or a coworker might wish to look at the resource when encountering the reference. In the case of Apache, it's not difficult to figure out where to look, but in other scenarios, the location of the reference target can be less obvious.

Tip

Looking up a targeted resource is usually not necessary, but it can be important to find out what that resource actually does. It gets especially important during debugging, if after a change to the manifest, the referenced resource is no longer found.

Besides, this approach will not work for the other scenario, in which your daemon needs to subscribe to configuration changes. You could blindly subscribe to the central apache2.conf file, of course. However, this would not yield the desired results if the responsible class opted to do most of the configuration work inside snippets in /etc/apache2/conf.d.

Both scenarios can be addressed cleanly and elegantly by directing the notify or subscribe parameters at the whole class that is managing the entity in question:

file { '/var/lib/apache2/sample-module/data01.bin':
  source => '...',
  notify => Class['apache'],
}
service { 'apache-logwatch': 
  enable    => true, 
  subscribe => Class['apache'], 
}

Of course, the signals are now sent (or received) indiscriminately—the file not only notifies Service['apache2'], but also every other resource in the apache class. This is usually acceptable, because most resources ignore events.

As for the logwatch daemon, it might refresh itself needlessly if some resource in the apache class needs a sync action. The odds for this occurrence depend on the implementation of the class. For ideal results, it might be sensible to relocate the configuration file resources into their own class so that the daemon can subscribe to that instead.
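
As a minimal sketch of that split, assume the configuration files move into a class of their own. The class name apache::config, the file paths, and the puppet:/// source URLs below are illustrative assumptions, not taken from an actual Apache module:

```puppet
# Hypothetical class that holds only the Apache configuration files.
class apache::config {
  file { '/etc/apache2/apache2.conf':
    source => 'puppet:///modules/apache/apache2.conf',
  }
  file { '/etc/apache2/conf.d':
    ensure  => directory,
    recurse => true,
    source  => 'puppet:///modules/apache/conf.d',
  }
}

include apache::config

# The daemon is refreshed only when one of the files above is synchronized.
service { 'apache-logwatch':
  enable    => true,
  subscribe => Class['apache::config'],
}
```

With this layout, a sync action on some unrelated resource, such as the Apache package, no longer restarts the logwatch daemon.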

With your defined types, you can apply the same rules: subscribe to and notify them as required. Doing so feels quite natural, because they are declared like native resources anyway. This is how you subscribe to several instances of the defined type symlink:

$active_countries = [ 'England', 'Ireland', 'Germany' ]
service { 'example-app': 
  enable    => true, 
  subscribe => Symlink[$active_countries], 
}

Granted, this very example is a bit awkward, because it requires all symlink resource titles to be available in an array variable. In this case, it would be more natural to make the defined type instances notify the service instead:

symlink { [ 'England', 'Ireland', 'Germany' ]:
  notify => Service['example-app'],
}

This notation passes a metaparameter to a defined type. The result is that this parameter value is applied to all resources declared inside the define.
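
To make this concrete, here is one way the symlink defined type might look. This is only a sketch: the text does not show its implementation, so the target parameter and both paths are assumptions.

```puppet
# Hypothetical implementation of the symlink defined type.
define symlink ($target = "/opt/example-app/countries/${title}") {
  # A notify => Service['example-app'] that is passed to a symlink
  # instance also takes effect on this file resource.
  file { "/etc/example-app/enabled/${title}":
    ensure => link,
    target => $target,
  }
}
```

If Puppet creates or changes the link for 'Ireland', the service receives a refresh event, even though the file resource itself carries no notify parameter.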

If a defined type wraps or contains a service or exec type resource, it can also be desirable to notify an instance of that define to refresh the contained resource. The following example assumes that the service type is wrapped by a defined type called protected_service:

file { '/etc/example_app/main.conf': 
  source => '...', 
  notify => Protected_service['example-app'], 
}
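
For illustration, a bare-bones protected_service could look as follows. The text does not say what the wrapper adds around the service, so this sketch assumes nothing beyond the wrapped resource itself, and the ensure parameter is a made-up convenience:

```puppet
# Hypothetical wrapper; whatever makes the service "protected" is left out.
define protected_service ($ensure = 'running') {
  service { $title:
    ensure => $ensure,
    enable => true,
  }
}

protected_service { 'example-app': }
```

An event sent to Protected_service['example-app'] reaches the Service['example-app'] resource inside the define, which restarts in response.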

Ordering containers

The notify and subscribe metaparameters are not the only ones that you can direct at classes and instances of defined types—the same holds true for their siblings, before and require. These allow you to define an order for your resources relative to classes, order instances of your defined types, and even order classes among themselves.

The latter works by virtue of the chaining operator:

include firewall
include loadbalancing
Class['firewall'] -> Class['loadbalancing']

The effect of this code is that all resources from the firewall class will be synchronized before any resource from the loadbalancing class, and failure of any resource in the former class will prevent all resources in the latter from being synchronized.

Note

Note that the chaining arrow cannot just be placed in between the include statements. It works only between resources or references.

Because of these ordering semantics, it is actually quite wholesome to require a whole class. You effectively mark the resource in question as being dependent on the class. As a result, it will only be synchronized if the entire subsystem that the class models is successfully synchronized first.
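
A short sketch of such a dependency, reusing the firewall class from the example above. The haproxy package is a hypothetical dependent resource chosen for illustration:

```puppet
include firewall

# Hypothetical dependent resource: it is synchronized only after the
# resources of the firewall class have been synchronized successfully.
package { 'haproxy':
  ensure  => installed,
  require => Class['firewall'],
}
```

The same relationship can be written with the chaining operator as Class['firewall'] -> Package['haproxy'].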

Limitations

Sadly, there is a rather substantial issue with both the ordering of containers and the distribution of refresh events: neither transcends the include statements of further classes. Consider the following example:

class apache {
  include apache::service
  include apache::package
  include apache::config
}
file { '/etc/apache2/conf.d/passwords.conf':
  source  => '...', 
  require => Class['apache'], 
}

I have often mentioned how the comprehensive apache class models everything about the Apache server subsystem, and in the previous section, I went on to explain that directing a require parameter at such a class will make sure that Puppet only touches the dependent resource if the subsystem has been successfully configured.

This is mostly true, but due to the limitation concerning class boundaries, it doesn't achieve the desired effect in this scenario. The dependent configuration file should actually require the Package['apache'] package, declared in class apache::package. However, the relationship does not span multiple class inclusions, so this particular dependency will not be part of the resulting catalog at all.

Similarly, any refresh events sent to the apache class will have no effect—they are distributed to resources declared in the class's body (of which there are none), but are not passed on to included classes. Subscribing to the class will make no sense either, because any resource events generated inside the included classes will not be forwarded by the apache class.

The bottom line is that relationships to classes cannot be built in utter ignorance of their implementation. If in doubt, you need to make sure that the resources that are of interest are actually declared directly inside the class you are targeting.

Note

The discussion revolved around the example of the include statements in classes, but since it is common to use them in defined types as well, the same limitation applies in this case too.

There is a bright side to this as well. A more correct implementation of the Apache configuration file from the preceding example would depend on the package, but would also synchronize itself before the service, and perhaps even notify it (so that Apache restarts if necessary). When all resources are part of the apache class and you want to adhere to the pattern of interacting with the container only, it would lead to the following declaration:

file { '/etc/apache2/conf.d/passwords.conf': 
  source  => '...', 
  require => Class['apache'], 
  notify  => Class['apache'], 
}

This forms an instant dependency circle: the file resource requires all parts of the apache class to be synchronized before it gets processed, but to notify them, they must all be put after the file resource in the order graph. This cannot work. With the knowledge of the inner structure of the apache class, the user can pick metaparameter values that actually work:

file { '/etc/apache2/conf.d/passwords.conf':
  source  => '...', 
  require => Class['apache::package'], 
  notify  => Class['apache::service'], 
}

For the curious, the inner classes referenced in the preceding code are simple: roughly speaking, each one declares the resources that its name suggests.
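
A rough sketch of two of those inner classes might look like this. The resource titles follow the references used earlier in the text (Package['apache'] and Service['apache2']); the attribute values are assumptions:

```puppet
# Hypothetical inner classes of the apache class.
class apache::package {
  package { 'apache':
    ensure => installed,
  }
}

class apache::service {
  service { 'apache2':
    ensure => running,
    enable => true,
  }
}
```

Because each resource is declared directly in the class body, relationships aimed at Class['apache::package'] or Class['apache::service'] reach it without crossing an include statement.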

The other good news is that invoking defined types does not pose the same kind of issue that an include statement of a class does. Events are passed to resources inside defined types just fine, transcending an arbitrary number of stacked invocations. Ordering also works just as expected. Let's keep the example brief:

class apache { 
  virtual_host { 'example.net': ... }
  ... 
}

This apache class also creates a virtual host using the defined type, virtual_host. A resource that requires this class will implicitly require all resources from within this virtual_host instance. A subscriber to the class will receive events from those resources, and events directed at the class will reach the resources of this virtual_host.

Tip

There is actually a good reason to make the include statements behave differently in this regard. As classes can be included very generously (thanks to their singleton aspect), it is common for classes to build a vast network of includes. By adding a single include statement to a manifest, you might unknowingly pull hundreds of classes into this manifest.

Assume, for a moment, that relationships and events transcend this whole network. All manner of unintended effects will be the consequence. Dependency circles will be nearly inevitable. The whole construct will become utterly unmanageable. The cost of such relationships will also grow steeply, as the next section explains.

The performance implications of container relationships

There is another aspect that you should keep in mind whenever you are referencing a container type to build a relationship to it. The Puppet agent will have to build a dependency graph from this. This graph contains all resources as nodes and all relationships as edges. Classes and defined types get expanded to all their declared resources. All relationships to the container are expanded to relationships to each resource.

This is mostly harmless if the other end of the relationship is a native resource. A file that requires a class with five declared resources leads to five dependencies. That does not hurt. It gets more interesting if the same class is required by an instance of a defined type that comprises three resources. Each of these builds a relationship to each of the class's resources, so you end up with 3 × 5 = 15 edges in the graph.
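
The arithmetic can be reproduced with a toy manifest. All names here are hypothetical, and notify resources merely stand in for real ones:

```puppet
# Hypothetical class with five resources.
class five_resources {
  notify { ['a', 'b', 'c', 'd', 'e']: }
}

# Hypothetical defined type with three resources per instance.
define three_resources {
  notify { ["${title}-1", "${title}-2", "${title}-3"]: }
}

include five_resources

# Following the expansion described above, each of the three inner
# resources depends on each of the five class resources: 15 edges.
three_resources { 'demo':
  require => Class['five_resources'],
}
```

Declaring a second three_resources instance with the same require parameter would add another 15 edges, which is how container relationships become costly in larger manifests.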

It gets even more expensive when a container invokes complex defined types, perhaps even recursively.

A more complex graph means more work for the Puppet agent, and its runs will take longer. This is especially annoying when running agents interactively during the debugging or development of your manifest. To avoid the unnecessary effort, consider your relationship declarations carefully, and use them only when they are really appropriate.

Mitigating the limitations

The architects of the Puppet language have devised two alternative approaches to solve the ordering issues. We will consider both, because you might encounter them in existing manifests. In new setups, you should always choose the latter variant.

The anchor pattern

The anchor pattern is the classic workaround for the problem with ordering and signaling in the context of recursive class include statements. It can be illustrated by the following example class:

class example_app { 
  anchor { 'example_app::begin': 
    notify => Class['example_app_config'], 
  } 
  include example_app_config 
  anchor { 'example_app::end': 
    require => Class['example_app_config'],
  } 
}

Consider a resource that is placed before => Class['example_app']. It ends up in the chain before each anchor, and therefore, also before any resource in example_app_config, despite the include limitation. This is because the Anchor['example_app::begin'] pseudo-resource notifies the included class and is therefore ordered before all of its resources. A similar effect works for objects that require the class, by virtue of the example_app::end anchor.

The anchor resource type was created for this express purpose. It is not part of the Puppet core, but has been made available through the stdlib module instead (the next chapter will familiarize you with modules). Since it also forwards refresh events, it is even possible to notify and subscribe to this anchored class, and events will propagate into and out of the included example_app_config class.
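
From the outside, the anchored class is used like any other container. The following sketch assumes that the stdlib module is installed; the two file resources, their paths, and the source URL are hypothetical:

```puppet
include example_app

# Hypothetical resource that is ordered before the whole anchored class,
# including the resources of example_app_config.
file { '/etc/example_app':
  ensure => directory,
  before => Class['example_app'],
}

# Hypothetical resource whose changes are signaled to the class; the
# begin anchor passes the event on to example_app_config.
file { '/etc/example_app/extra.conf':
  source => 'puppet:///modules/example_app/extra.conf',
  notify => Class['example_app'],
}
```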

The stdlib module is available on the Puppet Forge; the next chapter covers it in more detail. A descriptive document for the anchor pattern can also be found online, in Puppet Labs' Redmine issue tracker (now obsolete) at http://projects.puppetlabs.com/projects/puppet/wiki/Anchor_Pattern. It is somewhat dated, seeing as the anchor pattern itself has been supplanted by Puppet's ability to contain a class in a container.

The contain function

To make composite classes directly work around the limitations of the include statement, you can take advantage of the contain function found in Puppet version 3.4.x or newer.

If the earlier apache example had been written like the following one, there would have been no issues concerning ordering and refresh events:

class apache {
  contain apache::service
  contain apache::package
  contain apache::config
}

The official documentation describes the behavior as follows:

"A contained class will not be applied before the containing class is begun, and will be finished before the containing class is finished."

This might read like the panacea for the class ordering issues presented here. Should you just use contain in place of include from here on out and never worry about class ordering again? Of course not. This would introduce lots of unnecessary ordering constraints and lead you into unfixable dependency circles very quickly. Do contain classes, but make sure that it makes sense. The contained class should really form a vital part of what the containing class is modeling.
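
One point worth keeping in mind is that contain only ties the contained classes to their container. It does not order them among themselves. A sketch of the apache class with an explicit internal order could look like this (the chaining line is an addition made for illustration, not part of the original example):

```puppet
class apache {
  contain apache::package
  contain apache::config
  contain apache::service

  # Assumed internal order: install first, then configure, and refresh
  # the service whenever the configuration changes.
  Class['apache::package'] -> Class['apache::config'] ~> Class['apache::service']
}
```

A resource that declares require => Class['apache'] is then synchronized after all three contained classes.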

Tip

The quoted documentation refers to classes only, but classes can be contained in defined types just as well. The effect of containment is not limited to ordering aspects either. Refresh events are also correctly propagated.