Thursday, July 4, 2024

Cloud-init and the curse of the YAML (and a potential solution! or at least idea)

I recently took up cloud-init professionally as part of exploring modern(?) alternatives to kickstart for an on-premises private cloud. It all works fine and is perfectly viable, once I actually figured out where to put the config entries...

Cloud-init has a couple of config files:

metadata:

#cloud-config
instance-id: test
hostname: test.localhost
network:
  version: 2
  ethernets:
    eth0:
      match:
        name: eth*
etc. etc. etc.

user-data:

users:
  - name: cloud-user
    ssh_authorized_keys:
      - ssh-rsa [our public key] root@our-bastion.localhost
    sudo: ALL=(ALL)     NOPASSWD:ALL
etc. etc. etc.

All looks good and reasonable. The challenge in getting this far was that, reading the docs, it was not apparent which config entries go in which file. For example, when I put the network settings in the user-data file, the YAML parsed correctly and I could see it loaded into the various Python dictionaries in the code, but later on it was simply ignored. The docs note in vague ways that the metadata is for the environment and the user-data is the specifics for this particular system, which is what got me into trouble and led me to put the network information in the user-data in the first place.

Looking at the code, using YAML for configuration, with the easy way it is parsed and loaded into a dictionary, is a very powerful way to add configuration to your software. The downside is that if you name something slightly wrong, or put something in not quite the right location, it will just be silently ignored.
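
To make that failure mode concrete, here is a minimal sketch that uses a plain dictionary in place of the parsed YAML. The configure_network() function and the key names are hypothetical stand-ins for illustration, not cloud-init's actual code:

```python
# Hypothetical stand-in for a parsed YAML file; note the typo "netwrok".
config = {
    "hostname": "test.localhost",
    "netwrok": {"version": 2},
}


def configure_network(cfg):
    """Hypothetical consumer: acts only on the key it knows about."""
    network = cfg.get("network")
    if network is None:
        # Nothing found under the expected name, so quietly do nothing.
        return "no network config applied"
    return f"applied network config version {network['version']}"


result = configure_network(config)
print(result)
```

The misspelled key loads without complaint, and the consumer never notices it exists.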

There are a couple of schools of thought here, and I've seen both professionally. One is that you should code things explicitly to look for the commands / statements, and flag anything not explicitly recognized as an error. This style of code is a bit of a challenge to modify: in one particular case I needed to extend multiple code locations so that a new argument/statement was a) parsed correctly (in one module), b) consumed correctly (in a different module), and then c) acted upon in a third module. That is a lot of work just to add a similar statement, but it does mean that if something is not quite right, we notice right away, at the expense of making the code harder to work with and extend.
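
A rough sketch of that first, strict style might look like the following. The KNOWN_KEYS set and validate_strict() are names made up for this example, and a real parser would of course do much more than check top-level keys:

```python
# Hypothetical set of statements this parser explicitly recognizes.
KNOWN_KEYS = {"hostname", "network", "users"}


def validate_strict(cfg):
    """Raise ValueError if cfg contains any key we do not recognize."""
    unknown = sorted(set(cfg) - KNOWN_KEYS)
    if unknown:
        raise ValueError(f"unrecognized config keys: {', '.join(unknown)}")
    return cfg


# A correctly spelled config passes straight through.
validate_strict({"hostname": "test.localhost", "network": {"version": 2}})

# A typo is caught immediately instead of being ignored later.
try:
    validate_strict({"hostname": "test.localhost", "netwrok": {"version": 2}})
except ValueError as err:
    message = str(err)
    print(message)
```

The cost shows up in KNOWN_KEYS: every new statement has to be registered there before any consumer is allowed to see it.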

The second school of thought is that as configuration information is consumed, you only look for and act upon the things you know about. This code is trivial to extend: you just add something to the "act upon" section and consume whatever it is you're looking for. The downside is that if you add some configuration that's not quite right, it will likely just be ignored, since nothing is looking for it.

This is the approach that cloud-init takes, and for good reason: there are multiple modularized consumers all looking at the same dictionary, so any extension should be able to just consume whatever configuration it thinks it needs.

So this left me thinking: is there a way to have a central YAML file that is loaded into a Python dictionary, and have some overseer look at it at the end and make sure that everything you specified was actually used? I think YES, based on a quick subclass of dict:

class NewDict(dict):
    """A dict that also tracks whether each key has been used/consumed."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Every key present at construction time starts out unused.
        self._used_flags = {key: False for key in self.keys()}

    def set_used(self, key, used_arg=True):
        # Only track keys that actually exist in the dictionary.
        if key in self:
            self._used_flags[key] = used_arg

    def is_used(self, key):
        # Keys added after construction have no flag yet; treat them as unused.
        return self._used_flags.get(key, False)

This NewDict provides all the same stuff as dict, but adds set_used() and is_used() methods with which you can indicate whether a dictionary config item has been used/consumed or not.

To use it, you would do something like:

import copy

import yaml

with open('stuff.yaml', 'r') as file:
    # safe_load is enough for plain configuration data
    data = yaml.safe_load(file)

nd = NewDict()
# Copy the yaml into our NewDict
for key in data:
    nd[key] = copy.deepcopy(data[key])

# Indicate some keys are being used.
nd.set_used('Stuff1')
nd.set_used('Stuff3')

# Report on keys not being used
for key in nd:
    if not nd.is_used(key):
        print(f"WARNING, {key} is actually unused you know.")

This way various modules can go through and consume (or re-consume!) whatever configuration is needed, and then at the end something could go through the dictionary and note any key that had not actually been used. Bonus points if cloud-init also named the modules that were called, so you get a sense of which modules you're actually using and which config entries are actually needed.
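
As a sketch of that "bonus points" idea, the dictionary could record who consumed each key at the moment it is read, rather than relying on a separate set_used() call. Everything here (TrackingDict, consume(), and the consumer names passed in) is hypothetical and only meant to illustrate the shape of it, not how cloud-init works today:

```python
class TrackingDict(dict):
    """A dict that records which consumer read each top-level key."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._consumers = {}  # key -> list of consumer names

    def consume(self, key, consumer, default=None):
        """Return the value for key and note that `consumer` used it."""
        if key in self:
            self._consumers.setdefault(key, []).append(consumer)
        return self.get(key, default)

    def unused_keys(self):
        """Keys that are present but that nobody ever consumed."""
        return [key for key in self if key not in self._consumers]

    def consumers_of(self, key):
        """Names of the consumers that read this key, in call order."""
        return list(self._consumers.get(key, []))


# Hypothetical config with a typo, and hypothetical consumer names.
cfg = TrackingDict({"hostname": "test.localhost", "users": [], "netwrok": {}})
cfg.consume("hostname", "set_hostname")
cfg.consume("users", "users_groups")
cfg.consume("network", "networking")  # correctly spelled key is absent

for key in cfg.unused_keys():
    print(f"WARNING, {key} is actually unused you know.")
```

Tying the bookkeeping to the read itself means a consumer cannot forget to mark a key as used, at the price of asking every consumer to go through consume() instead of plain indexing.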

