Sunday 25 May 2014

Python: Restarting and monitoring specific threads

Hi All,

Anyone who has used Python before knows how painful threading can be, especially when we have a multithreaded program in which specific threads must "always" be alive.
Having hacked around for a while, I came up with the following solution:

import threading
import time

class ThreadRestartable(threading.Thread):
    def __init__(self, theName):
        threading.Thread.__init__(self, name=theName)

    def run(self):
        print("In ThreadRestartable\n")
        time.sleep(10)

thd = ThreadRestartable("WORKER")
thd.start()

while True:
    # Look for a live thread named WORKER.
    found = False
    for t in threading.enumerate():
        if t.name == "WORKER":  # compare by value, not identity
            found = True
    print(threading.enumerate())
    if not found:
        # A finished Thread object cannot be started again, so build a new one.
        thd = ThreadRestartable("WORKER")
        thd.start()
    time.sleep(5)

We can see it running here:
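python run.py
In ThreadRestartable

[<_MainThread(MainThread, started 139833484474176)>, <ThreadRestartable(WORKER, started ...)>]
[<_MainThread(MainThread, started 139833484474176)>, <ThreadRestartable(WORKER, started ...)>]
[<_MainThread(MainThread, started 139833484474176)>, <ThreadRestartable(WORKER, started ...)>]
[<_MainThread(MainThread, started 139833484474176)>, <ThreadRestartable(WORKER, started ...)>]
[<_MainThread(MainThread, started 139833484474176)>, <ThreadRestartable(WORKER, started ...)>]
[<_MainThread(MainThread, started 139833484474176)>]
[<_MainThread(MainThread, started 139833484474176)>]
In ThreadRestartable

[<_MainThread(MainThread, started 139833484474176)>, <ThreadRestartable(WORKER, started ...)>]
[<_MainThread(MainThread, started 139833484474176)>, <ThreadRestartable(WORKER, started ...)>]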

As we can see, we give the thread a name (WORKER) and then use threading.enumerate() to look for a live thread with that name. If it is not there, we create a new one and start it. This would normally be used where we have a long-running thread that should never end. If you instead create the thread the standard way, watch it with something like if not thread.isAlive() and then call start() on the same object again, you will find Python raises a RuntimeError (very old versions raised an assertion error). This is because a thread object can only be started once, so it has to be recreated.
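
For anyone who wants to see that last point in isolation, here is a minimal sketch. The helper names make_worker and ensure_alive are made up for this example and are not part of the threading module; the worker just sleeps briefly so it finishes quickly. It shows the failure when a finished thread object is started again, and one possible way to wrap the "recreate it" step in a function.

```python
import threading
import time


def make_worker():
    # Hypothetical factory: builds a fresh, unstarted thread each time.
    return threading.Thread(target=time.sleep, args=(0.1,), name="WORKER")


def ensure_alive(thread, factory):
    """Return a running thread: the given one if alive, else a new one."""
    if thread.is_alive():
        return thread
    replacement = factory()
    replacement.start()
    return replacement


worker = make_worker()
worker.start()
worker.join()  # the worker has now finished

try:
    worker.start()  # reusing the finished object fails
    restart_error = None
except RuntimeError as exc:
    restart_error = exc

# Recreating the thread object is what actually works.
new_worker = ensure_alive(worker, make_worker)
```

In a monitoring loop you would call ensure_alive(thd, make_worker) every few seconds and keep whatever it returns, instead of searching threading.enumerate() by name. Which of the two is nicer is a matter of taste; the by-name lookup is handy when you do not have a reference to the thread object.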
Hope this helps someone out there :)

Thursday 22 May 2014

SCOM 2012 R2: Linux Monitoring...

Hi Guys,

It has been a really long time since I posted on here, but yes, I am still around heh :)

Anyway, I have been working with a client lately who wants to check out the Linux monitoring in SCOM, which I have personally stayed away from (I have a unix/linux background, so I am paranoid about putting anything MS on something like RHEL). Since the client wants to get rid of the other monitoring platforms in place if possible and have all monitoring come from one system, I had no choice but to check it out.

At first, most of the issues with deploying the agent turned out to be with the RHEL box one of the unix guys gave me: basically some firewall fun and adding the SCOM IPs to /etc/hosts.allow. After this I could discover the box, but the install was failing at the certificate assignment. It turns out I had forgotten to set the certificate profile run as account back in SCOM. After setting this the agent installed fine... Next, I was not getting any performance monitoring in the console, and then realized I had forgotten to assign the run as profile for the unprivileged and privileged users, woohoo..

So I eventually figured it out, but then couldn't be bothered waiting for it to pull data, so I went home..

Will update you all on whether you should even bother with the Linux-based monitoring agents. I'm guessing not :)

6/6/14 UPDATE:
So far the servers have not died, which is a good thing I guess. The base RHEL pack is obviously pulling in all the std happy crap like disk usage etc.; other than that, nothing special. I did try a bunch of management packs from an "un-named" company for MySQL and Apache (httpd). I don't really have very good things to say about them. I may name drop them in future :D