
Suggestions for "message pump less" process freeze detection

Posted: Mon Jun 26, 2017 2:53 pm
by obones
Hello,

We are running unattended integration tests at night and some are interacting with Excel or Word via OLE. Those two programs are notorious for popping up dialogs even if the appropriate options have been set.
When that happens, the calling program is stuck, waiting for a nonexistent user to click on OK.
In our launch system, we have a watcher that kills a process that takes more than 3 hours to execute but when it does so, there is no stack trace printed.
I have read about madTraceProcess32/64 and it could help, except that I could not find a way to call it from the command line.
Now, I could write code to specifically test for that kind of situation, but I'm not sure how to proceed considering that there is no message pump. One idea that comes to mind is to create a bug report periodically and if 10 times in a row I get the same one, trigger the "anti freeze" detection.
But that idea needs quite a bit of filtering because there are timestamps at various locations.
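
To make the filtering idea concrete, here is a minimal sketch, written in Python purely for illustration. The timestamp patterns are assumptions (not the actual madExcept report layout), and `normalize` and `RepeatCounter` are hypothetical names. It replaces anything that looks like a date or time with a placeholder, then counts how many consecutive reports are identical.

```python
import re

# Assumed timestamp shapes; adjust to whatever the real reports contain.
_TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}"                        # e.g. 2017-06-26
    r"|\d{1,2}:\d{2}(?::\d{2})?(?:[.,]\d+)?"    # e.g. 14:53, 14:53:07, 14:53:07.123
)


def normalize(report: str) -> str:
    """Replace timestamp-like tokens with a fixed placeholder."""
    return _TIMESTAMP.sub("<ts>", report)


class RepeatCounter:
    """Counts how many consecutive reports were identical after filtering."""

    def __init__(self, threshold: int = 10) -> None:
        self.threshold = threshold
        self._last = None
        self._count = 0

    def feed(self, report: str) -> bool:
        """Return True once `threshold` identical reports arrived in a row."""
        key = normalize(report)
        if key == self._last:
            self._count += 1
        else:
            # A different report restarts the streak at one.
            self._last = key
            self._count = 1
        return self._count >= self.threshold
```

Feeding each periodic report to `feed` and reacting when it returns True would give the "10 times in a row" trigger; anything else that varies between reports (counters, memory figures) would need its own pattern.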

Any other ideas?

Re: Suggestions for "message pump less" process freeze detec

Posted: Mon Jun 26, 2017 3:02 pm
by madshi
I'm slightly confused. I see 2 problems that need solving here:

1) Detect the frozen state.
2) Create a freeze report.

You seem to already have "solved" problem 1), by simply considering all processes that take more than 3 hours to execute as frozen. So why would you periodically create bug reports? That makes no sense to me. Are you not happy with the 3 hours detection logic?

Creating a freeze report is easy enough. Are we talking about a process creating a freeze report for itself, or do you want to get a freeze report about a different process than your own?

Re: Suggestions for "message pump less" process freeze detec

Posted: Mon Jun 26, 2017 3:07 pm
by obones
The 3 hours maximum run time is too crude to be of interest and because it's a "taskkill" external call, it's of no use from within our process itself.
Basically, it's a safety trigger, but I want something that reacts faster.

So basically, I'm trying to solve 1 because 2 is documented just fine.
And to solve 1, my initial idea was to create a complete call stack dump periodically (every minute or so) and see whether I get the same one, say, 10 times in a row.
But this does not seem "clever" and I'm wondering if you have suggestions.

Re: Suggestions for "message pump less" process freeze detec

Posted: Mon Jun 26, 2017 4:57 pm
by madshi
Do you want to detect a freeze of your own process or of another process?

Re: Suggestions for "message pump less" process freeze detec

Posted: Tue Jun 27, 2017 7:26 am
by obones
Ah sorry, I was not clear enough, it's my own process.
So basically, I want a watchdog thread that dumps a bug report to stdout when it considers that all other threads have been doing nothing for a given duration.
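
The shape of such a watchdog can be sketched as follows. This is a Python stand-in and not madExcept code: `snapshot_other_threads` and `Watchdog` are hypothetical names, and in the real program the snapshot step would be replaced by whatever stack trace call the library offers. The thread polls at a fixed interval and calls a callback once the stacks of all other threads have stayed identical for `limit` polls in a row.

```python
import sys
import threading
import traceback


def snapshot_other_threads() -> dict:
    """Map thread id -> tuple of (file, line, function) for every other thread."""
    me = threading.get_ident()
    return {
        tid: tuple((f.filename, f.lineno, f.name)
                   for f in traceback.extract_stack(frame))
        for tid, frame in sys._current_frames().items()
        if tid != me
    }


class Watchdog(threading.Thread):
    """Calls on_freeze(snapshot) once stacks stayed unchanged for `limit` polls."""

    def __init__(self, interval: float, limit: int, on_freeze) -> None:
        super().__init__(daemon=True)
        self.interval = interval
        self.limit = limit
        self.on_freeze = on_freeze
        self._stop_event = threading.Event()

    def stop(self) -> None:
        self._stop_event.set()

    def run(self) -> None:
        previous, unchanged = None, 0
        # Event.wait returns False on timeout and True once stop() was called.
        while not self._stop_event.wait(self.interval):
            current = snapshot_other_threads()
            unchanged = unchanged + 1 if current == previous else 1
            previous = current
            if unchanged >= self.limit:
                self.on_freeze(current)
                return
```

With `interval=60` and `limit=10` this matches the "same dump 10 times, one minute apart" idea; the callback is where the bug report would be written to stdout.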

Re: Suggestions for "message pump less" process freeze detec

Posted: Tue Jun 27, 2017 8:57 am
by madshi
Shouldn't your own process know by itself when it's frozen and when it's not? I think it would be easier and more reliable if you added dedicated code for freeze detection.

Of course it's possible to use the following API to get a callstack for a specific thread:

http://help.madshi.net/madExceptUnit.ht ... StackTrace

But stack tracing (especially in 32bit) is not an exact science, so there's no guarantee (although it's somewhat likely) that the callstack will be perfectly identical even if the thread is stuck.
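
If the traces do turn out to differ slightly between samples, one possible workaround is to compare them with a tolerance instead of requiring exact equality. A minimal sketch in Python, for illustration only: `stacks_similar` is a hypothetical helper and the 0.9 threshold is an arbitrary assumption that would need tuning against real traces.

```python
from difflib import SequenceMatcher


def stacks_similar(a, b, min_ratio: float = 0.9) -> bool:
    """True if two sequences of frame descriptions are identical or nearly so."""
    if a == b:
        return True
    # ratio() is 2*M/T, with M matching elements and T the total length of both.
    return SequenceMatcher(None, a, b, autojunk=False).ratio() >= min_ratio
```

Here `a` and `b` would be lists with one string per stack frame, so a single spurious frame in a 20-frame trace still counts as "the same" stack.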

Re: Suggestions for "message pump less" process freeze detec

Posted: Tue Jun 27, 2017 12:45 pm
by obones
The thing is, I have no idea which code is waiting for an answer. So, I can write a watchdog thread just fine, but the only way I can think of to detect stalling is to compare stack traces from time to time.
I know it can be imprecise, but if you have no other suggestion, I'll go this way.

Thanks

Re: Suggestions for "message pump less" process freeze detec

Posted: Tue Jun 27, 2017 12:59 pm
by madshi
You can give it a try, of course.