Preventing Cloud Control alerts for controlled restarts



We have hooked Cloud Control into the Request Tracker ticketing system (and by hooked in I mean that Cloud Control just sends an email notification, which is then picked up by Request Tracker to generate a ticket - nice and simple).

This means we get a ticket for every alert we configure in Cloud Control. It's also the same system that users raise service requests in - again, they just send an email and it generates a ticket. This way every piece of work we do has a ticket for management reporting, and the overhead of getting those tickets created is minimal. This is all great.

However....

The nature of development and test work means that databases are very often restarted to enable some new feature, fix a memory leak, flashback etc - all the usual stuff that happens day to day. This is a problem though: every time the database is restarted (even if you do it really quickly) Cloud Control notices and sends an alert, which generates an email, which generates a ticket. This we don't want. The ticket is a consequence of some activity (that itself was a ticket) and we shouldn't be getting an extra ticket for that. The extra tickets skew the reporting and, as we have subcontracted out some of the support, can have cost implications, since we are billed per ticket to some degree.

So how do we resolve this? I want a controlled restart of the database to not generate a ticket.

Our initial thought was to set a blackout every time a restart was required. The problem with this is that you have to remember to do it, and most people spend their life in a sqlplus session on the server and have worked a certain way for many years. Changing now to run an emctl command before (and after) the restart just wasn't working.
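
For reference, the blackout approach boils down to wrapping the restart in a start/stop pair. Below is a rough Python sketch of that idea, not something we actually ran. The blackout name, the target name DEVDB and the restart function are made up for illustration, and the emctl arguments follow the usual "emctl start blackout <name> <target>:<type> [-d <duration>]" form, so check them against your own agent version before relying on them.

```python
import subprocess


def blackout_commands(name, target, target_type="oracle_database", duration=None):
    """Return the (start, stop) emctl command lines for a named blackout.

    `name` and `target` are hypothetical values chosen by the caller.
    `duration` is passed straight to emctl's -d option if given.
    """
    start = ["emctl", "start", "blackout", name, f"{target}:{target_type}"]
    if duration is not None:
        start += ["-d", duration]
    stop = ["emctl", "stop", "blackout", name]
    return start, stop


def restart_with_blackout(name, target, restart, run=subprocess.check_call):
    """Start a blackout, run the caller's restart step, then always stop the blackout."""
    start, stop = blackout_commands(name, target)
    run(start)
    try:
        restart()  # e.g. a function that drives sqlplus; supplied by the caller
    finally:
        run(stop)  # end the blackout even if the restart step raised
```

The try/finally is the whole point: it is exactly the "remember to do it before and after" step that people kept forgetting when doing it by hand.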

So what could we do here?

After lots more thought we came up with a solution which, after the fact, actually seemed pretty obvious; we just hadn't considered it up front. All we had to do was add a delay in the alerting via an option in Cloud Control - see the screenshot below.


So here we can say only send the alert if the database has been down for 5 minutes. In almost all of our cases the restart only takes a few seconds, so the alert will never happen. This of course means that a genuine database down alert is delayed by 5 minutes. Those are so infrequent though that for us it's not really an issue. Server crashes are far more common (for us at least), and these would still be alerted straight away.
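
To make the effect of the delay concrete, here is a tiny Python sketch of the rule as we understand it: alert only once the target has been continuously down for the configured time. This is just an illustration of the logic, not how Cloud Control implements it internally, and the function name and timestamps are invented for the example.

```python
from datetime import datetime, timedelta


def should_alert(down_since, now, delay=timedelta(minutes=5)):
    """Return True once a target has been continuously down for at least `delay`.

    `down_since` is when the target was first seen down, or None if it is up.
    """
    if down_since is None:  # target is up, nothing to alert on
        return False
    return now - down_since >= delay


# A controlled restart that takes 40 seconds never reaches the threshold.
went_down = datetime(2024, 1, 1, 12, 0, 0)
quick_restart = should_alert(went_down, went_down + timedelta(seconds=40))

# A genuine outage still alerts, just 5 minutes later than before.
real_outage = should_alert(went_down, went_down + timedelta(minutes=6))
```

Here quick_restart is False and real_outage is True, which is the trade-off described above: no ticket for a quick bounce, and a delayed ticket for a real outage.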

In the end a very simple fix.....

1 comment:

  1. Good workaround! Thanks for this.
    Unfortunately, the "geniuses" who decide on features for cloud control haven't got the foggiest what it's like to run a real life IT department and all its associated conditionals.
    Hence why a simple and fast workaround is not available.
    Something like a quick PL/SQL procedure call to enable or disable an alert?
    Narh, too complicated: their heads might explode...
