Monday 8 February 2010

SIGALRM and the GIL

had some weird lag issues in nethogs2. i was firing setitimer SIGALRMs to keep things in check, every 0.25s. but strangely some alarms would arrive 6s or more apart...

DEBUG:nethogs2(134):Alarm triggered at 1265561284.348955
INFO:proctab.py(108):Unknown process: Connection(local=('192.168.0.1', 520), remote=('192.168.1.255', 520), family=2, protocol=17)
DEBUG:nethogs2(134):Alarm triggered at 1265561293.166447

which is HIGHLY bizarre, as the OS does this shit, and signals date back so far that you don't fuck with them. they arrive, on time, no excuses.

so i ran python profiler to see wtf is going on:

matt@stanley:/media/data/source/anacrolix/projects/nethogs2$ sudo python -m cProfile -s cumulative ./nethogs2

and i get this

347158 function calls (347141 primitive calls) in 26.807 CPU seconds

all dandy until i see this in the printout:

12 24.737 2.061 24.743 2.062 {_pcap.pcapObject_dispatch}

a C function is using up 90% of my time, LOCKING OUT the signal handling!!

it turns out the call to pcap_dispatch is blocking, and i suspect it was taking the GIL with it. i'd also guess that signals are delivered to your python callback, only if the GIL can be claimed. changed the timeout from blocking to 1ms... problem fixed.

No comments: