Commit 2707de8

doc: Add barclamp development exercises
A hands-on workshop was given a couple of times covering these exercises. This commit adds the exercises as documentation so that others can benefit from it non-interactively.
1 parent 584a6e3 commit 2707de8

2 files changed

+357
-0
lines changed

doc/barclamp.md

+2
@@ -5,6 +5,8 @@ through all the steps neccessary to create a minimal (without UI elements)
 Barclamp for Crowbar. It will then guide you through all the testing, CI and
 review steps until the point where it gets merged into Crowbar.

+See also the [advanced development exercises](barclamp_development_exercises.md).
+
 ## What is a Barclamp?

 A barclamp is a plugin for Crowbar that configures a particular service or

doc/barclamp_development_exercises.md

+355
@@ -0,0 +1,355 @@
# Barclamp Development Exercises

This guide provides explanations and hands-on exercises to help with
understanding some of the more useful or unintuitive parts of Crowbar
development.

The exercises here will have you modify the files on-disk on the Crowbar admin
node. Your development workflow may involve synchronizing development changes
from your local workstation to the admin node in some other way; this guide
does not cover that.

For the sake of practice, these exercises only concern the keystone
barclamp. This way you only need to deploy up through the keystone barclamp,
which reduces the amount of time it takes to deploy changes.

## Override Config Files

Crowbar leverages the ``oslo.config`` library's support for reading
configuration from a directory, ``/etc/<service>/<service>.conf.d``, in addition
to the main config file. The service package installs the default config file in
``/etc/<service>/<service>.conf`` and a small set of overrides in
``/etc/<service>/<service>.conf.d/010-<service>.conf``. Crowbar then applies its
service configuration in ``/etc/<service>/<service>.conf.d/100-<service>.conf``,
which overrides all configuration in the default config file or the package's
override config file. Because Chef is a traditional-style configuration
management tool, it attempts to converge a server's configuration by reapplying
recipes regularly, about every 15 minutes. This means that any local changes
you make to the ``100-<service>.conf`` files that Crowbar manages will
eventually be overwritten. Putting your changes in a separate override config
file is a useful way of working around this, since Crowbar does not manage such
files and will not overwrite them. Override config files are also useful for
handling any parameters not yet supported by Crowbar, or for temporarily
applying a setting without going through a barclamp proposal.
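
The layering described above can be modeled with a few lines of Ruby. This is only a simplified sketch of the override behavior, assuming the files in the ``.conf.d`` directory are applied in sorted name order after the main config file. It is not how ``oslo.config`` is implemented, and the file contents and the ``parse_ini`` and ``effective_config`` helpers are made up for illustration.

```ruby
# Simplified model of layered config files; this is not oslo.config itself.

# Parse "[section]" headers and "key = value" lines into a nested hash.
def parse_ini(text)
  section = nil
  text.each_line.with_object({}) do |raw, conf|
    line = raw.strip
    next if line.empty? || line.start_with?("#")
    if (m = line.match(/\A\[(.+)\]\z/))
      section = m[1]
      conf[section] ||= {}
    elsif section && (m = line.match(/\A([^=]+?)\s*=\s*(.*)\z/))
      conf[section][m[1]] = m[2]
    end
  end
end

# Apply the main file first, then the conf.d files in sorted name order,
# so that later files override earlier ones key by key.
def effective_config(main_text, conf_d)
  layers = [main_text] + conf_d.sort.map { |_name, text| text }
  layers.each_with_object({}) do |text, merged|
    parse_ini(text).each { |sec, opts| (merged[sec] ||= {}).merge!(opts) }
  end
end

# Hypothetical file contents, for illustration only.
main = "[DEFAULT]\ndebug = false\nlog_dir = /var/log/keystone\n"
conf_d = {
  "200-debug.conf"    => "[DEFAULT]\ndebug = true\ninsecure_debug = true\n",
  "100-keystone.conf" => "[DEFAULT]\ndebug = false\n",
  "010-keystone.conf" => "[DEFAULT]\nlog_dir = /var/log/keystone\n"
}

config = effective_config(main, conf_d)
puts config["DEFAULT"]["debug"]          # value taken from 200-debug.conf
puts config["DEFAULT"]["insecure_debug"] # only set in 200-debug.conf
```

Because ``200-debug.conf`` sorts after ``100-keystone.conf``, its values win, and a later Chef run that rewrites the ``100-`` file does not touch it.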

### Exercise

Create a file ``/etc/keystone/keystone.conf.d/200-debug.conf``.

In the file, add the content:

```
[DEFAULT]
debug = true
insecure_debug = true
```

Restart the keystone service, which in this deployment means restarting
``apache2``:

```
# systemctl restart apache2
```

The ``insecure_debug`` option, which Crowbar doesn't support, is now enabled in
keystone. This option is useful for debugging authentication requests on
non-production systems. You can see it in action by requesting tokens from
keystone using invalid credentials or incorrect scopes.

## Adding a Parameter to a Barclamp

Adding a new parameter involves making changes to several components, but is
ultimately a matter of copying and pasting similar examples. This exercise will
add the ``insecure_debug`` parameter to the keystone barclamp. In reality we
should never support setting this option through Crowbar because it is insecure
for production use, but it is useful as an exercise.

### Exercise

This is done in three stages.

#### Add the parameter to the data bag

Edit the keystone data bag schema in
``/opt/dell/chef/data_bags/crowbar/template-keystone.schema``. Add a new
parameter ``insecure_debug`` under the existing ``debug`` parameter:

```
{
  "type": "map", "required": true,
  "mapping": {
    "id": { "type": "str", "required": true, "pattern": "/^keystone-|^template-keystone$/" },
    "description": { "type": "str", "required": true },
    "attributes": { "type": "map", "required": true,
      "mapping": {
        "keystone": { "type": "map", "required": true,
          "mapping": {
            "debug": { "type": "bool", "required": true },
            "insecure_debug": { "type": "bool", "required": true },
            ...
```

Edit the default data bag in
``/opt/dell/chef/data_bags/crowbar/template-keystone.json``. Add the new
parameter there as well. Also increment the ``schema-revision`` number in the
``deployment`` section.

```
{
  "id": "template-keystone",
  "description": "Centralized authentication and authorization service for OpenStack",
  "attributes": {
    "keystone": {
      "debug": false,
      "insecure_debug": false,
      ...
  },
  "deployment": {
    "keystone": {
      "crowbar-revision": 0,
      "crowbar-applied": false,
      "schema-revision": 302,
```
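
Since the schema and the default data bag must stay in step, a quick consistency check can be handy while developing. The following is a minimal sketch, assuming both files are JSON with the nesting shown above. The inline documents are trimmed-down stand-ins for the real files, ``missing_defaults`` is a hypothetical helper, and this is not how Crowbar itself validates proposals.

```ruby
require "json"

# Trimmed-down, hypothetical stand-ins for the schema and template files.
schema = JSON.parse(<<~SCHEMA)
  { "type": "map", "mapping": {
      "attributes": { "type": "map", "mapping": {
        "keystone": { "type": "map", "mapping": {
          "debug": { "type": "bool", "required": true },
          "insecure_debug": { "type": "bool", "required": true }
        } } } } } }
SCHEMA

template = JSON.parse(<<~TEMPLATE)
  { "attributes": { "keystone": { "debug": false, "insecure_debug": false } },
    "deployment": { "keystone": { "schema-revision": 302 } } }
TEMPLATE

# Return the required schema keys that have no default in the template.
def missing_defaults(schema, template)
  spec = schema.dig("mapping", "attributes", "mapping", "keystone", "mapping")
  defaults = template.dig("attributes", "keystone")
  spec.select { |_key, rule| rule["required"] }.keys - defaults.keys
end

missing = missing_defaults(schema, template)
puts missing.empty? ? "template covers all required keys" : "missing: #{missing.join(', ')}"
puts "schema-revision: #{template.dig('deployment', 'keystone', 'schema-revision')}"
```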

Add a schema migration in ``/opt/dell/chef/data_bags/migrate/keystone/``. The
``300_noop.rb`` migration can be copied for the function prototypes. The file
name must start with the new schema revision number that you set in the default
data bag and should describe the migration action. For example, if the new
revision number is 302, the schema migration should go in
``/opt/dell/chef/data_bags/migrate/keystone/302_add_insecure_debug.rb``. The
upgrade migration needs to copy the new parameter from the template attributes
hash into the attributes hash, and the downgrade needs to do the opposite. It
will look something like this:

```
# Copy the default from the template unless the proposal already has a value.
def upgrade(template_attrs, template_deployment, attrs, deployment)
  attrs["insecure_debug"] = template_attrs["insecure_debug"] unless attrs.key?("insecure_debug")
  return attrs, deployment
end

# Remove the parameter again when the older template does not know about it.
def downgrade(template_attrs, template_deployment, attrs, deployment)
  attrs.delete("insecure_debug") unless template_attrs.key?("insecure_debug")
  return attrs, deployment
end
```
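
To get a feel for what the two functions do, they can be called by hand on small hashes. The sketch below repeats the functions so it is self-contained. The template and proposal hashes are invented for illustration and are far smaller than real data bag contents.

```ruby
# The two migration functions from the exercise, repeated so that this
# sketch runs on its own.
def upgrade(template_attrs, template_deployment, attrs, deployment)
  attrs["insecure_debug"] = template_attrs["insecure_debug"] unless attrs.key?("insecure_debug")
  return attrs, deployment
end

def downgrade(template_attrs, template_deployment, attrs, deployment)
  attrs.delete("insecure_debug") unless template_attrs.key?("insecure_debug")
  return attrs, deployment
end

# Hypothetical data: the new template (revision 302) and the one before it.
new_template = { "debug" => false, "insecure_debug" => false }
old_template = { "debug" => false }

# An existing proposal that was created before the parameter existed.
proposal = { "debug" => true }

# Upgrading fills in the default from the new template.
upgraded, _ = upgrade(new_template, {}, proposal.dup, {})
puts upgraded.inspect

# Running the upgrade again must not clobber a value the user has changed.
upgraded["insecure_debug"] = true
again, _ = upgrade(new_template, {}, upgraded.dup, {})
puts again.inspect

# Downgrading against the old template removes the parameter again.
downgraded, _ = downgrade(old_template, {}, again.dup, {})
puts downgraded.inspect
```

The ``unless attrs.key?`` guard is what makes the upgrade safe to run more than once.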

Run the migration. There are two ways to do this. The first is:

```
# barclamp_install.rb /opt/dell/crowbar_framework/barclamps
```

This reinstalls all the barclamps, including running all the migrations. Don't
be confused by the ``.rb`` file suffix; the command should be an executable
already in your PATH. Alternatively, you can use the built-in rake commands to
do the migration:

```
# cd /opt/dell/crowbar_framework
# RAILS_ENV=production rake crowbar:schema_migrate
# RAILS_ENV=production rake crowbar:schema_migrate_status # optional
```

``crowbar:schema_migrate`` does the migration, and
``crowbar:schema_migrate_status`` shows the revision number for each barclamp.
You can use ``rake -T`` while in a directory containing a ``Rakefile`` to see
all the available commands. The default Rails environment is ``development``,
but we never configure a development environment, so you need to specify
``production``.

Once the migration has run, check the raw view of the barclamp in the Crowbar
UI to make sure the parameter appears.

#### Add the parameter to the cookbook

The cookbook contains the Ruby code that creates and applies changes to the
deployment.

Edit the keystone server recipe in
``/opt/dell/chef/cookbooks/keystone/recipes/server.rb``. Find the template
resource that creates the keystone config file and add the new parameter to the
list of variables it passes to the template.

Edit the keystone config template in
``/opt/dell/chef/cookbooks/keystone/templates/default/keystone.conf.erb`` to use
the new parameter.

Chef does not pick up changes to cookbooks automatically; they need to be
uploaded to the Chef server. The ``barclamp_install.rb`` tool will take care of
this:

```
# barclamp_install.rb /opt/dell/crowbar_framework/barclamps
```

Alternatively, you can use the Chef tooling directly to upload the cookbook:

```
# knife cookbook upload keystone -o /opt/dell/chef/cookbooks
```

Now you should be able to apply the change to the deployment. In the Crowbar UI,
click "Apply". The new parameter should appear in the keystone config file after
the proposal has finished running. If you change the value in the raw view of
the proposal and reapply, the config file will change accordingly.
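
To see how a value passed from the recipe ends up in the rendered file, the following sketch uses Ruby's standard ``erb`` library directly. Chef's ``template`` resource exposes the entries of its ``variables`` hash to the template as instance variables, which the ``TemplateContext`` class below imitates. The template fragment, the variable names and the attribute values are assumptions for illustration; the real ``keystone.conf.erb`` and ``server.rb`` may name things differently.

```ruby
require "erb"

# Hypothetical fragment of a keystone.conf.erb template.
TEMPLATE = <<~ERB_TEMPLATE
  [DEFAULT]
  debug = <%= @debug %>
  insecure_debug = <%= @insecure_debug %>
ERB_TEMPLATE

# Imitates how template variables become instance variables in the template.
class TemplateContext
  def initialize(variables)
    variables.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

# Stand-in for the node attributes produced by the applied proposal.
node_attrs = { "keystone" => { "debug" => false, "insecure_debug" => true } }

# This hash plays the role of the "variables" passed by the recipe.
rendered = TemplateContext.new(
  debug: node_attrs["keystone"]["debug"],
  insecure_debug: node_attrs["keystone"]["insecure_debug"]
).render(TEMPLATE)

puts rendered
```

If a variable is missing from the hash, the template renders an empty value for it, which is a common symptom of forgetting the recipe half of the change.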

#### Add the parameter to the UI

In many cases this is as far as we go, but for some common operations it is good
to expose the parameter in the "Custom" view of the Crowbar UI.

Edit the locales file at
``/opt/dell/crowbar_framework/config/locales/keystone/en.yml``. Add the new
parameter to the YAML file under ``edit_attributes``. The name must match the
name given in the data bag. Add descriptive text to the attribute; this is what
will appear in the web UI to the user.

Edit the barclamp view in
``/opt/dell/crowbar_framework/app/views/barclamp/keystone/_edit_attributes.html.haml``
and add a new ``boolean_field`` for the parameter.

When finished changing the Rails code, restart Crowbar:

```
# systemctl restart crowbar
```

Refresh the custom view of the barclamp to see the new dropdown menu appear.

## Synchronizing Chef in HA Deployments

When applying a proposal, Crowbar runs Chef in parallel as much as it can. This
means we don't have to wait for each node to complete its configuration in
serial, but it has implications for interdependent resources. For example, when
an OpenStack service is deployed, a database synchronization must be run to set
up tables for the service. This should only be done once, by one actor. If
multiple nodes try to run the db sync at the same time, a race condition will
occur and one will fail because the database transaction is already in progress.
Moreover, the systemd services depend on that db sync happening before they try
to start, so they need some way of being notified that it is done.

This is implemented in Crowbar with sync marks. One node in a cluster is chosen
to be the "founder" node that behaves as the instigator for actions that can't
be run in parallel. Other nodes are notified to wait for it to complete. In this
exercise you will implement sync marks to prevent race conditions when the
``keystone-manage mapping_populate`` command is run.

See the [training on the HA implementation in
Crowbar](https://w3.suse.de/~aspiers/cloud/HA-training/) for more information.
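
The founder and wait behavior can be illustrated with plain Ruby threads standing in for nodes. This is a conceptual model only: the ``SyncMarks`` class, the ``converge`` helper and the node names are invented for this sketch and do not reflect how Crowbar implements sync marks.

```ruby
# Conceptual model of sync marks, with threads in place of nodes.
# This is not Crowbar's implementation.
class SyncMarks
  def initialize
    @marks = {}
    @lock = Mutex.new
    @changed = ConditionVariable.new
  end

  # Called by the founder once its exclusive action has finished.
  def create(name)
    @lock.synchronize do
      @marks[name] = true
      @changed.broadcast
    end
  end

  # Called by every other node; blocks until the founder created the mark.
  def wait(name)
    @lock.synchronize do
      @changed.wait(@lock) until @marks[name]
    end
  end
end

def converge(node, founder, marks, log)
  if node == founder
    log << "#{node}: mapping_populate"   # the action that must run only once
    marks.create("keystone_mapping_populate")
  else
    marks.wait("keystone_mapping_populate")
  end
  log << "#{node}: start services"
end

nodes = %w(controller1 controller2 controller3)
marks = SyncMarks.new
log = Queue.new # thread-safe record of what each node did

nodes.map { |n| Thread.new { converge(n, "controller1", marks, log) } }.each(&:join)

events = Array.new(log.size) { log.pop }
puts events
```

Only the founder performs the exclusive action, and no node starts its services before that action has completed.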

### Exercise

For this exercise, Crowbar should be configured in HA mode with at least two
controllers, and LDAP should be configured using ``want_ldap=1`` in your
``mkcloud`` script.

The ``keystone-manage mapping_populate`` command attempts to enhance keystone
runtime performance by generating IDs for all LDAP users in advance. The end
result is that the ``id_mapping`` table in the keystone database is populated.
We would not want to have multiple controllers all running this command because
it could cause a race condition or fail if another controller started the
command first, so we want to ensure that only the founder node runs it.

Edit the keystone server recipe in
``/opt/dell/chef/cookbooks/keystone/recipes/server.rb``. Add an ``execute``
resource that runs ``keystone-manage mapping_populate --domain ldap_users``.
Ensure the resource is wrapped between a ``wait`` sync mark and a ``create``
sync mark. You can look at the ``keystone-manage db_sync`` resource as an
example; we'll address the ``ruby_block`` in the next exercise. Upload the
keystone cookbook and apply the keystone proposal to ensure it runs correctly.
Examine the ``id_mapping`` table of the keystone database to ensure it has some
LDAP users populated in it.

## Understanding Chef Compile and Converge Phases

The Chef DSL looks just like Ruby, but recipes should not be treated the same as
regular Ruby scripts. Chef operates in a two-pass model: the Chef recipes are
first compiled into a resource collection, and then that collection is applied
on the node. This is called
["Compile and Converge"](https://coderanger.net/two-pass/). Essentially,
everything in the recipe is handled in two passes: during the compile phase,
plain Ruby code runs immediately, and each resource block is expanded and its
expressions are evaluated, freezing them in time as the resource is inserted
into the resource collection. Then those expanded resources are executed in the
converge phase, in some cases triggering execution of other resources via Chef's
[notification mechanism](https://docs.chef.io/resource_common.html#notifications).
Misunderstanding this sequence can lead to bugs in which values are evaluated
incorrectly and resources are misapplied. This exercise will demonstrate when
different types of code are executed and how to avoid common pitfalls.
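
The difference between the two phases can be shown with a toy model in plain Ruby. ``MiniChef`` and its ``resource`` method are invented for this sketch and are much simpler than Chef. The only behavior they share with Chef is that declaring a resource merely records it, while its guard and action run later.

```ruby
# Toy model of Chef's two phases; this is not Chef's real implementation.
class MiniChef
  attr_reader :node

  def initialize
    @node = {}      # stands in for the node attributes
    @resources = [] # stands in for the resource collection
  end

  # Declaring a resource only records it; nothing is executed yet.
  def resource(name, only_if: -> { true }, &action)
    @resources << [name, only_if, action]
  end

  def run(&recipe)
    log = []
    instance_eval(&recipe) # compile phase: plain Ruby in the recipe runs now
    @resources.each do |name, guard, action| # converge phase
      if guard.call
        action.call
        log << "ran #{name}"
      else
        log << "skipped #{name}"
      end
    end
    log
  end
end

# Plain Ruby between resources runs at compile time, so the guard
# already sees the flag by the time the resource is converged.
naive = MiniChef.new.run do
  resource("populate", only_if: -> { !node[:populated] }) { node[:rows] = 3 }
  node[:populated] = true
end

# Wrapping the assignment in a resource defers it to the converge phase.
deferred = MiniChef.new.run do
  resource("populate", only_if: -> { !node[:populated] }) { node[:rows] = 3 }
  resource("set flag") { node[:populated] = true }
end

puts naive.inspect
puts deferred.inspect
```

In the first recipe the resource is skipped because the flag was set before anything was converged. In the second, both resources run in the order they were declared.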

### Exercise

In some cases, commands executed by Chef cookbooks are not idempotent, or are
idempotent but are time consuming, and therefore it's best to only run them on
the first deployment. Moreover, in HA deployments, we want all nodes in a
cluster to wait for the founder to complete an action, and then not try to
repeat the action. Chef allows us to manage this by setting persistent flags
on the node that we can evaluate before deciding to include a resource in the
Chef run.

To demonstrate the importance of knowing the difference between the compile and
converge phases, let's first do this incorrectly.

First, undo the result of the last successful ``mapping_populate`` run by
emptying the table:

```
# mysql keystone -e 'delete from id_mapping'
```

Also, change the command in the ``execute`` resource you added in the previous
exercise to return an error, for example by changing the command to ``false``.
This will illustrate what happens when the command fails to execute.

After the ``execute`` resource in the recipe, set a flag on the node object to
declare that the mapping population has been run, for example:

```
node.set[:keystone][:mapping_populated] = true
node.save
```

Also add an ``only_if`` parameter to the ``execute`` resource so that it will
only run if that flag is not set. The (naive) idea is that the ``execute``
resource should run once, then the flag should be set, and then on the next Chef
run it won't be executed. Moreover, if the resource fails, we would want
execution to stop so that the problem can be corrected, and then we would want
the next run to try the mapping population again.

Upload the cookbook and reapply the proposal. Notice the proposal does not fail
on the invalid command in the ``execute`` resource. Also check the database on
the controller and notice that nothing happened.

Fix the command in the ``execute`` resource to use the correct mapping populate
command. Upload the cookbook again and reapply the proposal. This should again
be successful, but the command still will not have populated the database.

Why not? Because the code to set the ``:mapping_populated`` flag was run in
the *compile* phase, long before the ``execute`` resource was run. When it came
time to run the resource, Chef saw that the flag on the node was already set and
decided not to run the resource. You can examine the node attributes with knife:

```
# knife node list
# ...
# knife node edit <controller1>
```

Go ahead and reset the node by deleting that ``"mapping_populated"`` flag in the
node JSON in knife. Do the same on all other controllers.

Now let's do it correctly. In the recipe, wrap the node attribute setting in a
``ruby_block``:

```
ruby_block "mark node for keystone mapping_populate" do
  block do
    node.set[:keystone][:mapping_populated] = true
    node.save
  end
  action :nothing
  subscribes :create, "execute[keystone-manage mapping_populate]", :immediately
end
```

Make sure the ``ruby_block`` comes before the closing sync mark. The
``ruby_block`` is a Chef resource, which means the action it defines won't be
executed until the *converge* phase. Now when you upload the cookbook and apply
the proposal, the mapping population should happen correctly. It should also
occur on only one of the controllers, and it won't occur on any following Chef
runs.
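
The intended behavior across several Chef runs, including a failed one, can be summarized in a small simulation. ``chef_run`` is a hypothetical stand-in that folds the guard, the command and the subscribed ``ruby_block`` into one function; it is a sketch of the sequence, not of Chef itself.

```ruby
# Toy model of one Chef run for the guarded resource and its subscriber.
# Not Chef's real implementation.
def chef_run(node, command)
  # The only_if guard, evaluated at converge time.
  return "skipped" if node[:mapping_populated]

  command.call                    # raises on failure, which aborts the run
  node[:mapping_populated] = true # the subscribed ruby_block, run on success only
  "populated"
end

node = {}
failing = -> { raise "mapping_populate failed" }
working = -> { :ok }

results = []
begin
  results << chef_run(node, failing)
rescue RuntimeError => e
  results << "failed: #{e.message}"
end
results << chef_run(node, working) # retried, because the flag was never set
results << chef_run(node, working) # skipped, because the flag is now set

puts results
```

Because the flag is only set after the command succeeds, a failed run leaves the node ready to retry, and a successful run prevents any repetition.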
