Poor performance of madExcept under x64?

delphi package - automated exception handling

Re: madExcept performance under x64 is very poor

Post by madshi »

Nope. x64 does have a certain fixed frame structure, but using a dedicated register as a frame pointer is optional. It can be done, but it doesn't have to be done. Furthermore, if you do it, it doesn't have to be RBP; it could be a different register, too, IIRC.

Post by davidheffernan »

OK, I clearly don't understand these low-level details. What exactly makes you think that CaptureStackBackTrace will result in lower quality traces? My simple experiments seem good. I've done a lot of web searching of late, and none of it suggests that CaptureStackBackTrace results in lower quality traces than StackWalk64.

Post by madshi »

The documentation of the API looks like it would simply browse through stack frames. If that's what it does, it will fail if a function is not using a stack frame, as explained before.

Now could we please let this thread rest for a while? It's hard for me to do any real work if I have to reply to a dozen emails and forum threads every couple of minutes... :wink: This thread is on my to do list and I'll get to it sooner or later. Until then please have a bit of patience.

Post by davidheffernan »

No worries. I certainly didn't mean to pressure and hassle you and I'm sorry about that. But this is all I've been thinking about for the past couple of days and at least we now have a record of those thoughts.

Post by madshi »

It's fine. I was aware of CaptureStackBackTrace, but I didn't know about SYMOPT_DEFERRED_LOADS, so I'll keep that in mind. Yesterday I was just buried in support mails and I barely got to do any serious development. Today it looks better so far. I'll let you know when I've found some time to look into improving x64 performance.

Post by madshi »

I've found a way to delay the StackWalk64 call until it's clear whether a madExcept bug report is needed or not. This should bring performance closer to x86. However, I do have to store the whole stack content of the crashing thread to make this work, so a small performance loss is still to be expected whenever an exception is raised. But it should be *much* lower than before. And the changes I made should (hopefully) not impact stack tracing quality. Here's the latest beta build with these changes:

http://madshi.net/madCollectionBeta.exe (2.7.1.19)

Post by davidheffernan »

Thanks a lot. I'll try this out and get back to you with my verdict. Might take me a little while to clear a path to being able to do so!

Post by madshi »

I'm planning to release a new official build next weekend. It would be great if you could test it before then, thanks!

Post by davidheffernan »

Now you are rushing me! ;-) Jokes aside, I'm on it right now. I will report today or tomorrow.

Post by davidheffernan »

I've started trying this out now. First impressions are good. The runtime performance of throwing and swallowing an exception is around 5 times faster than my hack based on CaptureStackBackTrace. Still slower than the old x86 but that's expected and on that score things are good.

However, there does appear to be a little problem. I have a particularly intensive calculation test suite that I have been running to stress test the new madExcept 4 code. I ran that with your new code and observed it grind to a halt. I looked in Process Explorer and saw that the memory usage (e.g. private bytes) had jumped from around 400MB to 11GB. The machine only has 8GB of memory and so presumably was thrashing. I found it hard to reproduce this behaviour but managed to do so once. I'll keep trying and see if I can come up with anything tangible to report.

My initial suspicion is that this line

Code:

SetString(result, PAnsiChar(context.rsp), GetStackTop - context.rsp)

is somehow leading to a huge memory block being allocated.

Update: Nope, GetStackTop - context.rsp is always a sensibly sized value.
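
For context, the quoted line copies the raw bytes between the thread's stack pointer and the top of its stack into a string. Below is a minimal sketch of that copy pattern. It is not madExcept's code: `SnapshotRange` and the `MaxSnapshotBytes` limit are hypothetical names introduced here for illustration, and an ordinary local buffer stands in for the real stack range.

```pascal
program StackSnapshotSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Hypothetical sanity limit, chosen for this sketch only.
  MaxSnapshotBytes = 1024 * 1024;

// Copies the bytes in [Bottom, Top) into a string. Returns '' when the
// range is empty, inverted, or larger than the sanity limit.
function SnapshotRange(Bottom, Top: NativeUInt): AnsiString;
begin
  Result := '';
  if (Top > Bottom) and (Top - Bottom <= MaxSnapshotBytes) then
    SetString(Result, PAnsiChar(Pointer(Bottom)), Integer(Top - Bottom));
end;

var
  Buffer: array[0..63] of Byte;
  Snapshot: AnsiString;
begin
  // A local buffer stands in for the stack range of a crashing thread.
  FillChar(Buffer, SizeOf(Buffer), $AB);
  Snapshot := SnapshotRange(NativeUInt(@Buffer[0]),
                            NativeUInt(@Buffer[0]) + SizeOf(Buffer));
  Writeln('Bytes captured: ', Length(Snapshot));
end.
```

Logging the requested length next to a check like this is one way to confirm or rule out the suspicion that the copy itself is responsible for a large allocation.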

Post by madshi »

Hmmmm... That doesn't sound so good. How many exceptions do you have in your program? Is it a few dozen? Or a few thousand? Maybe you can count the number of exceptions that occur and log it somewhere? Then you could check whether that number is roughly proportional to memory consumption. But 11GB is *gigantic*!! I really wonder how that could have happened... :o

Post by madshi »

Hmmmm... I've double checked my code and I can't see anything which would explain such monstrous memory usage, especially if that "SetString" call you mentioned works fine (it should). I guess there's no chance you could reproduce this in a small test app, is there? Does the same problem occur if you compile your program as x86 (if that is possible)? You could try using FastMM4's memory leak checking to see if that can somehow find out who's allocating those 11GB. Of course that makes sense only if you can still gracefully shut down your application in that situation. Otherwise FastMM4 won't be able to report anything, I guess.

Post by davidheffernan »

I'll try to get a small test app but I don't hold out much hope. I think I may have more luck instrumenting the running code. I'll check in x86 and see if I can reproduce the behaviour.

I do wonder if your code that fakes the state of the stack when calling StackWalk64 could be leading to the dbghelp stack walking code doing crazy allocations. But then the bad behaviour occurs without my app showing a bug report. I presume the stack walk only happens when the code asks for a bug report. Is that true? Or does it happen as soon as there is an unhandled exception? That's important to know because my code treats certain unhandled exceptions as not being worthy of bug reports, for example EAbort.

I looked under VMMap and the memory was consumed by either 1 or 2 giant blocks of virtual memory so it's not a progressive leak.

Do you have any tips as to how I might try to instrument the code to find out what is provoking the behaviour? Which parts of your code have been modified? I think my first line of attack is to try to work out which parts of your new modified code are actually running.

Post by madshi »

You could add a MessageBox or something like that to the StackWalk64 function. The code is supposed to be executed only for exceptions which are really fully handled by madExcept. If you filter some exceptions, it depends on which "phase" you filter them in. For exceptions filtered out by a handler registered with "RegisterExceptionHandler(..., epQuickFiltering)", a callstack should not be calculated. If you filter exceptions out in later phases, callstack calculation has already been initiated (in a background thread).

You could simply download the official madCollection version, temporarily install it, copy the madExcept.pas and madStackTrace.pas files and then reinstall the latest beta. This way you could compare madExcept.pas and madStackTrace.pas, e.g. by using Beyond Compare or a similar tool. That should show you the exact source code changes. It's not that much.

You could also try replacing that "SetString" you mentioned earlier with a simple "result := ''" to check if that "fixes" the problem. The SetString itself might always have reasonable sizes, but if it's anything other than "", it may result in the tricky StackWalk64 code being executed. If you set it to "", the StackWalk64 hack should not be used.
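
A rough sketch of what filtering EAbort in the quick filtering phase might look like. It assumes madExcept 4 is installed and that the handler signature and the `stDontSync` argument match the madExcept documentation; the unit name `AbortFilter` and the handler name `FilterAbort` are made up for this example, and whether `exceptObject` is populated in this early phase should be checked against the documentation.

```pascal
unit AbortFilter;

interface

implementation

uses
  SysUtils, madExcept;

// Hypothetical handler: marks EAbort as handled so that madExcept
// does not go on to process it any further.
procedure FilterAbort(const exceptIntf: IMEException; var handled: Boolean);
begin
  if exceptIntf.exceptObject is EAbort then
    handled := True;
end;

initialization
  // Register for the earliest phase, where no callstack should have
  // been calculated yet.
  RegisterExceptionHandler(FilterAbort, stDontSync, epQuickFiltering);

end.
```

Registering the same handler without a phase argument would filter later, by which point the callstack calculation has already been started.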

Post by davidheffernan »

I've worked out what is happening. I still don't understand why, but I know what.

The routine that allocates all the memory is one of mine. It looks like this:

Code:

  while True do begin
    f := Min(f, fMax);
    Abscissas.Add;
    Abscissas[Abscissas.Count-1] := f;
    if f = fMax then begin
      Break;  // reached the upper limit, so the loop ends here
    end;
    try
      df := Power(K/CalcSDotDot(f), OneThird);
    except
      // only reached while floating point exceptions are unmasked
      on EZeroDivide do begin
        df := K/CalcS(f);
      end;
    end;
    f := f + df;
  end;

Now, when this code fails, it turns out that the SSE floating point control and status register, MXCSR, has been modified to mask exceptions. My program runs with floating point exceptions unmasked, which is of course the normal behaviour for Delphi programs. With the exceptions masked, the code inside the try/except does not raise an exception and df is set to 0. The loop then never terminates, and eventually the calls to Abscissas.Add result in the huge memory allocations.
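
A minimal sketch of two checks that could help catch this failure mode early: one that reports whether division by zero is currently unmasked, and one that rejects a step that cannot advance the loop. `ZeroDivideIsUnmasked` and `IsValidStep` are hypothetical helpers, not part of the original program, and the sketch assumes a step must be a strictly positive number for the loop to make progress.

```pascal
program FpMaskGuard;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Math;

// True when a floating point division by zero would raise EZeroDivide,
// i.e. when exZeroDivide is not in the current exception mask.
function ZeroDivideIsUnmasked: Boolean;
begin
  Result := not (exZeroDivide in GetExceptionMask);
end;

// Assumption for this sketch: a usable step is not NaN and is strictly
// positive. IsNan is tested first so that a NaN is never compared.
function IsValidStep(const df: Double): Boolean;
begin
  Result := (not IsNan(df)) and (df > 0);
end;

begin
  Writeln('Zero divide unmasked: ', ZeroDivideIsUnmasked);
  Writeln('Step 0.0 valid: ', IsValidStep(0.0));
  Writeln('Step 0.25 valid: ', IsValidStep(0.25));
end.
```

Calling something like `IsValidStep(df)` before `f := f + df` and raising when it fails would turn the silent infinite loop into an immediate error, and logging `ZeroDivideIsUnmasked` at thread start and after calls into third-party code could help narrow down where the mask changes.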

I've done some more digging and I think I now need to concentrate my efforts on understanding why the FP control state is being changed. That's not going to be easy, but I have now built my program without madExcept and can see that the FP control state is still being changed. In other words I suspect that your recent madExcept changes only served to, somehow, make it more likely that I saw this intermittent issue. Perhaps the performance improvement changed the order (relative to the other threads) in which the threads in my app (lots of threads doing FP calcs) did their work.

I do wonder if the problem could be related to a bug I found in the XE2 64-bit compiler that is as yet unfixed. The QC report is here: http://qc.embarcadero.com/wc/qcmain.aspx?d=105584

Anyway, it's clear now that this particular problem is not in your code and so I will continue plugging on my own. Thanks for your help and sorry if I've wasted your time at all.