Cloud control and the SLES problem

As some of you may have read, I've been investing some time in making use of cloud control as a CMDB (or something CMDB-like at least) - everyone has a CMDB, right...?

Well, anyway, we have been arranging with our OS support teams to patch SLES 10 and SLES 11 up to the latest patchset across the whole estate. I wanted to use cloud control to validate what actually needed to be done, and I thought this information would already be gathered - and it is... just not completely. The version information is too long and gets truncated - see the screenshot below.

I had hoped this was just a screen display thing and that the actual data stored was 'OK' - but it wasn't. This is the collected value, which is of no use to me...

So what to do?

A metric extension is called for.

Now I've covered all the basics in a lot of detail before - see here - so I'm just amending that slightly to gather the OS version info.

The key screen to share is this one - the rest I'm sure you can work out for yourselves.

The key elements are:

1) Command set to "ksh"
2) Script set to version.ksh
3) The contents of version.ksh are displayed in the popup. It simply takes the 3 lines present in /etc/SuSE-release and joins them into a single line using the xargs trick. This tiny script is deployed with the metric - nothing manual to do here.
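Since the popup contents aren't reproduced in the text, here is a minimal sketch of what a script like version.ksh could look like. It is not necessarily the exact script from the screenshot: the function name join_release and the optional file argument are my own additions for illustration, and it assumes the metric only needs the single joined line on standard output.

```shell
#!/bin/ksh
# Hypothetical stand-in for version.ksh (not the exact script from the popup).
# Assumption: the release file defaults to /etc/SuSE-release, but a different
# path can be passed as the first argument for testing.
release_file="${1:-/etc/SuSE-release}"

# join_release is an illustrative helper name.
# xargs with no command runs echo on everything it reads, so the
# separate lines of the file come back as one space-separated line.
join_release() {
    xargs < "$1"
}

# Only print when the file is readable, so the script stays quiet on non-SLES hosts.
if [ -r "$release_file" ]; then
    join_release "$release_file"
fi
```

On a SLES host the three lines of /etc/SuSE-release come back as one line, which is what lets the full version string be stored as a single metric value.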

So now the metric will collect the exact version information and store it in the OMS repo once I publish the metric extension to all my SLES hosts.

So I do that, and a short while later the info is queryable using the SQL I previously created for other metrics. I'm not sure of the exact timings of the background aggregation jobs that cloud control runs, but the data can be seen in real time by browsing metric extensions for the host directly in the web page.

In my case I called the metric sles-version in the definition, so that is what I need to query on:

SELECT target_name, value, last_collection_time
   AND METRIC_COLUMN_NAME = 'sles-version'

This produces results similar to the below (I hid the target names in this example).

Note also the use of the last_collection_time value. This is important as it proves the data is current - it would be easy to collect the info once, have something break, and not realize the data was badly out of date.

I guess this little truncation issue will be fixed in the main code line at some point - but the tool is so powerful that I could just fix it and collect what I wanted with a metric extension in a few minutes.

This illustrates again how powerful metric extensions are.

