Let "Arbiter" be a service that provides change risk assessment information for a given node at the current time. Further, let Arbiter be assumed to consider information from ServiceNow—node ownership, escalation status, maintenance window, peak business hours, other data points—and produce a simple matrix of change risk-to-permitted decisions. Low-risk change: permitted. Medium-risk change: not permitted. And so forth.
This module provides Puppet code patterns and constructs that let developers declare their code with change risk information. The constructs will allow Puppet to selectively and automatically tag and/or no-op configuration elements according to the current change risk tolerance permitted by Arbiter.
For testing or semi-permanent configuration, the permitted risk levels can be set statically in Hiera YAML.
    change_risk::permitted_risk:
      high: false
      medium: true
      low: true
      unknown: true
See setup for more ways to source this information.
To mark a class with an assessed change risk, call the change_risk() function at the top of the class.
    class profile::dangerous {
      change_risk('high')
      # ...
    }
To mark a non-class block of code with an assessed change risk, call the change_risk() function with a block.
    change_risk('medium') || {
      file { '/etc/postfix/main.cf':
        source => 'puppet:///modules/postfix/main.cf',
      }
      service { 'postfix':
        ensure    => running,
        subscribe => File['/etc/postfix/main.cf'],
      }
    }
When Puppet evaluates a change_risk() class or block, it will tag all contained resources with a "change_risk:<risk>" tag. It will then check to see if the specified change risk is permissible. If the risk is permissible, Puppet will proceed. If the risk is not permissible, Puppet will set all contained resources to no-op.
PQL queries can return information about resources in node catalogs and their assessed change-risk levels. Resources will have been tagged with a tag of the form "change_risk:<risk>". E.g. "change_risk:high" or "change_risk:low".
    puppet query 'resources { certname = "my-node" and tags = "change_risk:high" }'
The change_risk() function relies on the trlinkin-noop module to implement its no-op directives when a change risk is not permitted. When using change_risk(), or indeed any variant of trlinkin-noop's noop() function, the following "rule-of-thumb" best practices should be applied.
- Don't call change_risk() inside a profile class unless you are using the block form.
- Inside change_risk() classes or blocks, strongly prefer resource-style class declaration over include().
- Don't declare "includable" classes (classes you expect to be included from multiple other places in code) inside a change_risk() class or block.
Note: for the purpose of these cautions, let include() refer equally to ANY of:
- include()
- require()
- contain()
The root consideration behind all of these cautions is the behavior of Puppet's scoping: specifically, Puppet's dynamic scoping and how it affects class declaration. The functional effects of noop() and the tagging effects of change_risk() propagate from parent scopes downward into child scopes.
From the docs:
- The parent of [most classes or resources] is the first scope [emphasis added] in which [the class or resource] was declared
- Because classes can be declared multiple times with the include function, the contents of a given scope are evaluation-order dependent
What this means, in short, is that a class include()'d inside a change_risk() block is subject to that block's no-op capability IF AND ONLY IF it has not already been include()'d somewhere else in Puppet code.
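The evaluation-order dependence described above can be sketched as follows. This is only an illustration: the ntp class and both profile names are hypothetical, and the comments restate the behavior described in this section rather than verified output.

```puppet
# Hypothetical classes, for illustration only.
class profile::base {
  # If this is evaluated first, profile::base becomes the parent scope of ntp.
  include ntp
}

class profile::risky {
  change_risk('high') || {
    # ntp was already declared above, so this include() has no further
    # effect: ntp is neither tagged change_risk:high nor no-op'd by this block.
    include ntp
  }
}

# Evaluation order decides the outcome. Swapping these two lines would
# instead place ntp under the change_risk('high') block.
include profile::base
include profile::risky
```

Nothing in the code of profile::risky reveals which outcome applies, which is why the cautions above steer away from include() inside change_risk() blocks.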
Detailed breakdown for each rule-of-thumb element follows.
- Don't call change_risk() inside a profile class unless you are using the block form.
Reasoning: Profile classes are designed to be "includable", meaning it is expected that profiles should be safe to include from other profiles, potentially many times. Further, profile classes are expected to freely include other profiles. Because of this, the first place a profile class is included from is indeterminate and cannot be guaranteed. To avoid non-deterministic application of change-risk tagging, don't call the change_risk() class function directly inside a profile. Be more judicious and use the block form of change_risk() in profiles instead, and avoid using include() inside change_risk() blocks.
- Inside change_risk() classes or blocks, strongly prefer resource-style class declaration over include().
Reasoning: Resource-style class declaration has a desirable side effect when we care deeply about what a class's parent scope will be: if the class's parent scope won't be the scope we declare it in (because it has been included somewhere else already), Puppet will raise a duplicate resource declaration error and fail the catalog. To help ensure deterministic scoping results for code inside change_risk() blocks, use resource-style class declaration whenever you can instead of include(), require(), or contain(). Note that for contain specifically, it is safe to call contain() for a class right after declaring it, if you also need contain's special containment semantics to be applied.
- Don't declare "includable" classes (classes you expect to be included from multiple other places in code) inside a change_risk() class or block.
Reasoning: As a corollary to the above point, if you would like to be able to include() a class elsewhere, you shouldn't declare it resource-style in your change_risk() block. If you find yourself at an impasse, struggling to adhere to both rule-of-thumb points 1 and 2 because they seem in conflict, it may be advisable to refactor your code, perhaps by creating a new, includable profile, which itself can safely declare the class in question resource-style in its own internal change_risk() block, after which the new profile can be included elsewhere.
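The refactoring suggested in the last point might look like the following sketch. All class names here (profile::postfix_config, postfix::config, profile::mailserver, profile::relay) are hypothetical and stand in for whatever classes are in conflict.

```puppet
# Hypothetical wrapper profile: the only place postfix::config is declared.
class profile::postfix_config {
  change_risk('medium') || {
    # Resource-style declaration. Per the reasoning above, compilation
    # fails if postfix::config has already been declared somewhere else,
    # so its parent scope is never left to chance.
    class { 'postfix::config': }
  }
}

# Other profiles include the wrapper, never postfix::config itself.
class profile::mailserver {
  include profile::postfix_config
}

class profile::relay {
  include profile::postfix_config
}
```

The wrapper stays freely includable, while the class that must be tagged and no-op'd has exactly one declaration site.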
Be aware that if resource A is a dependency of resource B and A is no-op, Puppet will always consider B's dependency to be satisfied, even if resource A is detected to be out-of-sync. Because resource A is in no-op mode it cannot "fail", and so Puppet will never skip resource B due to a dependency failure.
Depending on your dependency chains this could cause problems, when, for example, Puppet cannot actually successfully configure resource B unless or until resource A is in-sync.
Rule-of-thumb to avoid problems with this: don't depend on a higher-risk resource.
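As an illustrative sketch of the pattern to avoid (the profile and the myapp package and service are hypothetical), the low-risk service below depends on a high-risk package:

```puppet
# Hypothetical profile showing a dependency on a higher-risk resource.
class profile::myapp {
  change_risk('high') || {
    package { 'myapp':
      ensure => installed,
    }
  }

  change_risk('low') || {
    service { 'myapp':
      ensure  => running,
      # Depends on a resource with a higher assessed risk.
      require => Package['myapp'],
    }
  }
}
```

If high-risk change is not permitted while low-risk change is, the package is no-op'd but Puppet still tries to manage the service, which may fail on a node where the package is not yet installed. Giving the service the same risk level as the package, or a higher one, avoids the problem.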
The behavior of the change_risk() function is controlled through a configuration class. The configuration can be set by providing the appropriate settings using Hiera data (preferred), or by declaring the class resource-style in site.pp (only recommended for testing purposes).
The $permitted_risk configuration parameter could be set statically in a Hiera yaml file, or it could be supplied dynamically using either a Puppet function call to query a service such as Arbiter, or by querying Arbiter data using a trusted_external_command integration.
In the example below the risk tolerance data shown is static, and does not change unless the Hiera yaml file changes, or some condition causes Hiera to consult a different file.
    change_risk::permitted_risk:
      high: false
      medium: true
      low: true
      unknown: true
    change_risk::risk_not_found_action: fail
    change_risk::ignore_permitted_risk: false
    change_risk::disable_mechanism: flag
    change_risk::respect_noop_class_interface: true
Note that the only required parameter is change_risk::permitted_risk. The remaining parameters have acceptable defaults. For more information on each of these parameters and what they affect, see the reference section.
If change risk data is coming from a system like Arbiter, it can be consumed in Puppet either by:
- Using the trusted_external_command feature
- Supplying the data through an ENC, as a top-scope variable
- Supplying the data through a custom function, and saving it to a top-scope variable in site.pp
The trusted_external_command feature allows a script to be run to query data from an external source, and make it available to Puppet in the $trusted variable. Specifically, data will be available under trusted.external.
Assuming that the full path to the data to use for change_risk::permitted_risk is trusted.external.arbiter.permitted_risk, set the following key in your Hiera data to configure change_risk() appropriately.
    change_risk::permitted_risk: "%{trusted.external.arbiter.permitted_risk}"
A variable can be set in top-scope and used similarly to the way the built-in $trusted variable can be.
Suppose this variable is called $arbiter, and is a hash with a permitted_risk key. If an ENC supplies $arbiter, it may be referenced directly in Hiera without additional work.
    change_risk::permitted_risk: "%{arbiter.permitted_risk}"
If the variable will be assigned a value based on calling a Puppet function, it must be set and called in site.pp, before any resources or classes are evaluated.
    # site.pp
    $arbiter = arbiter::fetch_data(getvar('trusted.certname'))
Once the variable is assigned a value in site.pp, it can be referenced in Hiera the same way an ENC-provided variable would be.
The change_risk class can be declared directly to supply the necessary configuration data. This method of configuring change_risk is recommended only for testing purposes.
    # site.pp
    class { 'change_risk':
      permitted_risk => {
        'high'    => false,
        'medium'  => true,
        'low'     => true,
        'unknown' => true,
      },
    }
A normal Puppet agent run will use change_risk::permitted_risk information to automatically no-op classes and code blocks based on permitted risk. When performing manual Puppet agent runs, there are several mechanisms available to override the automatic no-op decisions.
change_risk() can be configured to ignore permissible change risks and allow all changes when the --no-noop flag is passed to Puppet on the command line.
    puppet agent -t --no-noop
The --no-noop flag may be combined with the --tags flag for a limited ability to target specific changes. Note that the usual limitations and characteristics of the --tags flag apply.
    puppet agent -t --no-noop --tags profile::postfix
The --no-noop flag is available when using the orchestrator to perform Puppet agent runs remotely.
The --no-noop flag can be used to disable permissible change checks when change_risk::disable_mechanism is set to "flag" or to "both".
change_risk() can be configured to consult the value of a special fact, ignore_permitted_risk, to decide whether or not to respect permissible change risks. If this fact is set to Boolean true or the string "true", then change_risk() will ignore permissible change risks and allow all changes.
Note that besides using a custom fact in the facts.d directory, facts can be set when running Puppet on the command line using environment variables. For example, to run Puppet once with ignore_permitted_risk=true, the following command can be used.
    FACTER_ignore_permitted_risk=true puppet agent -t
The ignore_permitted_risk Facter fact can be used to disable permissible change checks when change_risk::disable_mechanism is set to "fact" or to "both".
If the change_risk::ignore_permitted_risk class parameter is set to true (either through class declaration or through Hiera data), then change_risk() will ignore permissible change risks and allow all changes.
    change_risk::ignore_permitted_risk: true
If no-op class interface $class_noop parameters are being used, per-class Hiera data may be set to override the main change_risk() check for any class so instrumented. Such an override will not, however, affect any nested change risk blocks inside the class.
To override the class no-op setting for profile::postfix and force it to run in op mode, set the following Hiera data parameter:
    profile::postfix::class_noop: false
The no-op class interface mechanism can be used on classes which support it when change_risk::respect_noop_class_interface is set to true (default).
The change risk class function and block forms can be used together, if needed. The following example shows a class implemented with the change_risk() function called at the class level, but also containing a nested code block of resources with a different risk level specified.
    class profile::postfix (
      # Various normal class parameters
      String         $alias_maps = 'hash:/etc/aliases',
      Optional[Hash] $configs    = {},
    ) {
      change_risk('low')
      # Normal configuration management code from this point forward
      # ...
      # A block of high-risk changes
      change_risk('high') || {
        file { '/etc/postfix/main.cf':
          ensure  => file,
          replace => true,
          source  => 'puppet:///modules/postfix/main.cf',
        }
        service { 'postfix':
          ensure    => running,
          subscribe => File['/etc/postfix/main.cf'],
        }
      }
      # More normal configuration management code
      anchor { 'postfix::begin': }
      -> class { '::postfix::packages': }
      -> class { '::postfix::files': }
      ~> class { '::postfix::service': }
      -> anchor { 'postfix::end': }
    }
In this pattern, configuration elements are disabled by switching them to no-op. The trlinkin-noop module is used to do this.
The no-op class interface pattern can be used to provide a no-op switch at a class level. An advantage of providing this switch at a class level is that it can be overridden on a per-class basis using Hiera data parameters. If the appropriate supporting controls are in place, this can allow for on-demand switching on or off of specific classes for controlled Puppet runs.
The change_risk() function can be used in conjunction with the no-op class interface to allow developers to indicate the evaluated risk level of their class, but also respect a $class_noop parameter, if supplied to the class. That is, by default, change_risk() will also implement noop::class_interface().
The following example shows how to implement the no-op class interface in conjunction with the change_risk() function.
    class profile::postfix (
      # Various normal class parameters
      String            $alias_maps = 'hash:/etc/aliases',
      Optional[Hash]    $configs    = {},
      # No-op class interface parameter.
      Optional[Boolean] $class_noop = undef,
    ) {
      change_risk('low')
      # Because $class_noop exists:
      #   - If $class_noop == true, change_risk() will invoke the noop() function
      #     for the class, even if the change would otherwise be permitted.
      #   - If $class_noop == false, change_risk() will NOT no-op the class, even
      #     if change would normally not be permitted.
      # Normal configuration management code from this point forward
      # ...
    }
If a $class_noop Boolean value is provided, that value is decisive: full control is passed to noop::class_interface(), and permitted risk is ignored. When $class_noop is undefined (set to undef), change_risk() behaves normally and consults the permitted risk hash to decide whether or not to no-op the class.
The change_risk class provides a way to configure the behavior of change_risk() function calls.
The permitted_risk parameter is a hash of change risk values ("low", "medium", "high", etc.) to Booleans. True indicates the change risk is permissible, while false indicates the change risk is not permissible, and any resources marked with that change risk should be no-op'd.
Example:
    {
      'low'    => true,
      'medium' => false,
      'high'   => false,
    }
To support Hiera references to other variables, this parameter will also accept a String representation of a Puppet language Hash[String, Boolean].
The risk_not_found_action parameter accepts one of the following values. These values define what Puppet will do if a change_risk() function call is evaluated which uses a risk that is not present in the permitted_risk hash.
- fail – Fail the catalog
- none – Don't no-op the resources in the block
- noop – No-op the resources in the block
When the ignore_permitted_risk parameter is set to true, this effectively disables the ability of change_risk() to no-op resources. Defaults to false.
The disable_mechanism parameter defines how to temporarily disable the ability of change_risk() to no-op resources. There are three possible configurations.
- flag – Use the --no-noop flag on the command line
- fact – Use facts.ignore_permitted_risk=true
- both – Use EITHER the --no-noop flag OR the value of facts.ignore_permitted_risk
The respect_noop_class_interface parameter enables or disables compatibility with trlinkin-noop's noop::class_interface() pattern. If set to true (default), then class parameters $class_noop and $class_noop_override can be used for individual classes which call change_risk() internally, and those parameters will take precedence over change_risk()'s determination of whether or not to no-op the class based on permitted risk.
Setting $class_noop to true or false will switch over to noop::class_interface()'s semantics. Leave $class_noop undefined, or don't provide it on a class, to leave change_risk()'s semantics in effect.
- true – Respect the values of $class_noop or $class_noop_override, if present
- false – Ignore the values of $class_noop and $class_noop_override
The change_risk() function is implemented in Ruby. For the block variant, it creates a new scope from the containing scope in which to evaluate code, and calls the noop() function in that scope if the risk permitted indicates that the code should be disabled. This means these code blocks are subject to the same variable scope consideration that always applies when using the noop() function.