Public Report
Report From: Delphi-BCB/RTL/Delphi/Other RTL
Report #:  8399   Status: Open
System initialization differences are allowed to affect FPU Control Word
Project:  Delphi Build #:  8.1
Version:    10.0 Submitted By:   John Herbster
Report Type:  Basic functionality failure Date Reported:  6/12/2004 6:11:02 AM
Severity:    Infrequently encountered problem Last Updated: 3/20/2012 2:24:39 AM
Platform:    All versions Internal Tracking #:   238092
Resolution: None Resolved in Build: None
Duplicate of:  None
Voting and Rating
Overall Rating: (11 Total Ratings)
4.91 out of 5
Total Votes: 6
Description

The FPU (Floating-Point Unit) Control Word is being initialized to different precisions on different computers.

The problem comes up every now and then in the forums when programmers claim that the same programs will give different results on different machines.

On one of my machines, any Delphi project starts user code with an FPU precision of "double" while on most machines Delphi projects start user code with "extended" precision.

See the Steps for instructions on how to check for the problem.  (No user code is required.)

There is an easy fix to make all machines start user code with "extended" precision described in the Workaround tab.

This problem was originally described in QC report #5928, which was, I believe, mistakenly closed with the resolution "Test Case Error".

In QC #5928, I tracked the problem down to two pieces of what I call bad code:
(1) one in D7's Forms.pas; and
(2) the other in Windows.pas.

In Forms.pas, procedure TApplication.CreateHandle has the code
    ...
    FHandle := CreateWindow(...);
    ...
    if NewStyleControls then
    begin
      SendMessage(FHandle, WM_SETICON, 1, GetIconHandle);
      SetClassLong(FHandle, GCL_HICON, GetIconHandle);
    end;
    ...
where SendMessage is imported from User32.dll.  On some machines, or under some conditions, this SendMessage call can change the FPU's control word (in my case, from extended to double precision).

In Windows.pas, function CreateWindow(...) and
similarly CreateWindowEx(...) have the code
  begin
    FPUCW := Get8087CW;
    Result := _CreateWindowEx(...);
    Set8087CW(FPUCW);
  end;

Notice that the code reads FPUCW from the FPU's actual CW rather than from Default8087CW (which can be different), and that on exit Set8087CW() sets *both* the FPU's actual CW and Default8087CW.  (This is an example of why we try to avoid having redundant variables in our programs.)

If the CreateWindow and CreateWindowEx code is going to read the FPU's actual CW on entry, then it should restore only the FPU's CW on exit and *not* also set the Default8087CW variable.  Alternatively, if on entry it reads Default8087CW, then it should restore Default8087CW on exit.

I think that the problem could also be fixed in Forms.pas by having procedure TApplication.CreateHandle restore the FPU's CW on return from the User32.dll SendMessage call.
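
A minimal sketch of that last idea, not the actual VCL fix. SendMessagePreservingCW is a hypothetical helper name, and sending WM_NULL to the desktop window is only a harmless stand-in for the WM_SETICON call. It saves both the live control word and Default8087CW before the User32 call and puts both back afterwards. Unit names follow D7; later versions may need Winapi.Windows and Winapi.Messages.

```pascal
program PreserveCWSketch;
{$APPTYPE CONSOLE}

uses
  Windows, Messages, SysUtils;

// Hypothetical helper, not part of the VCL or RTL.
function SendMessagePreservingCW(Wnd: HWND; Msg: UINT;
  AWParam: WPARAM; ALParam: LPARAM): LRESULT;
var
  SavedCW, SavedDefault: Word;
begin
  SavedCW := Get8087CW;            // live FPU control word
  SavedDefault := Default8087CW;   // the RTL's stored default
  try
    Result := SendMessage(Wnd, Msg, AWParam, ALParam);
  finally
    Set8087CW(SavedCW);            // restores the FPU, but also overwrites Default8087CW
    Default8087CW := SavedDefault; // so the saved default is put back explicitly
  end;
end;

begin
  // WM_NULL is used here only because it has no side effects.
  SendMessagePreservingCW(GetDesktopWindow, WM_NULL, 0, 0);
  Writeln('CW after call: $', IntToHex(Get8087CW, 4));
end.
```

Restoring Default8087CW separately is what keeps the redundant variable from drifting, which is the concern raised about the Windows.pas wrappers.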

--JohnH
Steps to Reproduce:

(1) Create the default "New Application".

(2) Press F8 to compile and stop at the "begin" line in the DPR file.

(3) Examine the PC (precision control) field of the FPU's control word.

If PC=3, then the FPU is running extended precision.
If PC=2, then the FPU is running double precision.
If PC=0, then the FPU is running single precision.

The precision should be "extended", but on some machines it has been observed to be "double".
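
The same check can be made in code instead of in the debugger. The following is only a sketch: the names PrecisionControl and PrecisionName are made up for illustration, and it assumes a 32-bit x87 target, where the PC field occupies bits 8 and 9 of the control word.

```pascal
program ShowPrecisionControl;
{$APPTYPE CONSOLE}

// Extracts the 2-bit precision-control (PC) field, bits 8..9 of the control word.
function PrecisionControl(CW: Word): Integer;
begin
  Result := (CW shr 8) and $3;
end;

// Maps a PC value to the names used in the steps above.
function PrecisionName(PC: Integer): string;
begin
  case PC of
    0: Result := 'single';
    2: Result := 'double';
    3: Result := 'extended';
  else
    Result := 'reserved';
  end;
end;

begin
  Writeln('PC = ', PrecisionControl(Get8087CW), ' (',
    PrecisionName(PrecisionControl(Get8087CW)), ')');
end.
```

A console program does not pull in Forms.pas, so this only demonstrates the decoding. To check the scenario in this report, the two functions would have to be called from the DPR of a default VCL application, right after "begin".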


Workarounds
Insert the statement
  Set8087CW($1332);
immediately after the "begin" in the DPR file, as in the following:

  ...
begin
  Set8087CW($1332);
  Application.Initialize;
  Application.CreateForm(TForm1, Form1);
  Application.Run;
end.
Attachment
None
Comments

Kristofer Skaug at 10/2/2004 7:20:29 AM -
I only now bumped my head against this problem, noticing that a unit test (using some floating point math) would pass on one machine and fail on another one. The problem appears to be the same with Delphi 6.2 and Delphi 7.1: Watching the FPU PC register, I see this, compiling/executing the same test suite:

Exec:   A6   A7   B6   B7
x1      2    2    2    2
x2      2    2    3    3
x3      3    3    3    3

A6: D6.2, Machine-A (P4/1700 Workstation) -> tests fail
A7: D7.1, Machine-A (P4/1700 Workstation) -> tests fail
B6: D6.2, Machine-B (P4/1700 Dell laptop) -> tests pass
B7: D7.1, Machine-B (P4/1700 Dell laptop) -> tests pass

Code points (Exec):
x1 = Break at the 'begin' statement of the .dpr unit (before system initialization)
x2 = Break at first test case constructor
x3 = Break after loading a DLL built with BCB6

With PC=3 (extended precision) my tests pass, but with PC=2 they fail (due to the reduced precision, I guess).

I've stepped through with debug DCUs and found that on Machine B (the laptop), the following statement of D6's System.pas would cause the FPU PC value to jump from 2 to 3:

procedure       _FpuInit;
asm
        FNINIT
   ....

The same obviously does not happen on my other machine.
I think this reeks of inconsistency.
It is also frightening that loading DLLs built with other compilers can affect your executable's global FPU precision! In this case, it even came from a 'friendly' tool from the same vendor, i.e. BCB.
This led to the weird situation that tests I used to see passing with flying colors would now fail. Then, if I re-ran the tests without restarting the test executable (DUnit), they would suddenly pass, because another test branch further downstream would load a BCB-produced DLL and this would reset the FPU to extended precision.

Kristofer Skaug at 10/5/2004 4:10:07 PM -
>        FNINIT
>   ....
> The same obviously does not happen on my other machine.
> I think this reeks of inconsistency.

I'm sorry, this comment was unfair, and only due to my incomplete understanding of the problem.
FNINIT *does* set extended precision on all (my) machines. The problem is, as the report also points out, what happens *after* this statement and *before* the first line of "user code".

The fact that your FPU CW is still not safe after that, depending on what your app does (and what it calls), is clear but that's your own problem/responsibility to deal with. What this report asks is for Borland to guarantee a certain value of the FPU CW at the point where user code takes over. I think that's a fair request, even if it really will only buy you *some* respite.

Kristofer

Kristofer Skaug at 10/2/2004 7:31:15 AM -
... I should add that I haven't rated this report yet, because I'm unable to discern what should be done.
Apparently the FPU precision is updated in the context of System.pas' initialization clause, and so it is odd that it should be affected again later with Forms.pas code...

I find it unsettling the way it is, but am worried about the ramifications of change.
It would seem favorable to suggest:

- Delphi and BCB having the same default settings (enforced)
- The default setting being extended precision, regardless of the machine
- Delphi executables not letting DLLs affect their FPU precision setting

I hope Danny et al could come over and discuss some of these points a bit further (yes I also read Danny's comments to QC5928).

Kristofer

John Herbster at 10/2/2004 11:03:21 AM -

Thanks for the comment.  May I presume that you saw my workaround?

Kristofer Skaug at 10/5/2004 3:59:51 PM -
Hi John,

yes, that's noted. This was a real eye-opener for me, and in the past few days I've been doing a lot of thinking on how to mitigate this problem in general.

- Explicitly asserting a fixed default value for the 8087CW in my apps and code libraries.
- Monitoring 8087CW value regularly during runtime - logging errors when a change is detected
  (e.g. due to external DLL calls).
- Wrapping all DLL loading in SafeLoadLibrary(), as Danny pointed out
- Introducing 8087CW integrity testing as a mandatory part of integration activities with 3rd party libraries (be it code libs, packages or DLLs).
- Commenting in code (e.g. unit headers) the environmental precondition of having 8087CW = $1332
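
Two of the items above, the runtime monitoring and the SafeLoadLibrary() loading, might be sketched roughly as follows. The unit name FPUGuardSketch and the routines CheckFPUCW and LoadLibraryChecked are hypothetical; only Get8087CW, SafeLoadLibrary, Format and OutputDebugString are existing RTL/Windows calls. Unit names follow D7.

```pascal
unit FPUGuardSketch;

interface

uses
  Windows, SysUtils;

const
  ExpectedCW = $1332; // the documented precondition from the list above

// Returns True if the live control word still equals ExpectedCW;
// otherwise reports the difference to the debugger output.
function CheckFPUCW(const Where: string): Boolean;

// Loads a DLL through SafeLoadLibrary, then checks the control word.
function LoadLibraryChecked(const FileName: string): HMODULE;

implementation

function CheckFPUCW(const Where: string): Boolean;
var
  CW: Word;
begin
  CW := Get8087CW;
  Result := CW = ExpectedCW;
  if not Result then
    OutputDebugString(PChar(Format('%s: 8087CW is $%.4x, expected $%.4x',
      [Where, CW, ExpectedCW])));
end;

function LoadLibraryChecked(const FileName: string): HMODULE;
begin
  // SafeLoadLibrary saves and restores the control word around the load.
  Result := SafeLoadLibrary(FileName);
  CheckFPUCW('after loading ' + FileName);
end;

end.
```

This only covers the moment of loading. Calls made into a DLL later can still change the control word, so CheckFPUCW would also have to be called at other points, for instance from unit tests or after calls into third-party code.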

In a way, this issue is a nastier variety of the "global format settings problem" in SysUtils.pas, which take on different initial values depending on the user's settings/locale, and can also be changed during runtime if the user is messing around in Control Panel / Regional settings.

In fact, it has led me to a new awareness of a whole category of "preconditions" that I've been blithely ignoring for years. Now's the time to take stock and get it all written up, so I can introduce this in unit tests etc.

Kristofer

Prime Vision Software Engineer at 1/7/2005 12:15:32 AM -
This is a major bug for applications that have to rely on the mathematics part of Delphi.

Our company performs a lot of calculations in Delphi that should give the same results on different machines. It cost us a lot of time to discover that the calculation problems were related to differences in FPU initialization. We got different results when our calculation code was embedded in a dynamically loaded DLL instead of an executable (on the same machine with the same Delphi compiler).
After DLL initialization the FPU precision had changed from 3 (extended) to 2 (double). The change occurred in the call to _fpuinit in the DLL initialization code.

We can restore the precision mode as a workaround, but knowing that the problem was already reported in 2003 (maybe earlier), I think Borland should have solved this.

John Herbster at 1/9/2005 4:44:34 PM -

Is this bug fixed in D2005/Win32?
--JohnH

John Herbster at 1/25/2006 4:23:01 AM -
With D7, the problem has *also* been observed to leave the FPU Control Word with its mask bit $0001 (invalid operation) erroneously set, which masks the exception for the likes of 0/0.

--JohnH

Ref Message-ID: <43d4722e@newsgroups.borland.com> from "peter" <peter@omc.com.hk> in borland.public.delphi.language.delphi.general (about Mon, 23 Jan 2006 14:08:26 +0800).
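
A small sketch of testing for that condition in code. The constant name InvalidOpMaskBit is made up here; bit $0001 is the invalid-operation mask named in the comment above, and it is clear in the default $1332.

```pascal
program ShowInvalidOpMask;
{$APPTYPE CONSOLE}

const
  InvalidOpMaskBit = $0001; // invalid-operation mask bit of the x87 control word

begin
  if (Get8087CW and InvalidOpMaskBit) <> 0 then
    Writeln('Invalid-operation exceptions are masked; 0/0 would quietly yield a NaN')
  else
    Writeln('Invalid-operation exceptions are unmasked (as in $1332)');
end.
```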

Ralf Stocker at 2/22/2006 1:17:35 PM -
http://codecentral.borland.com/Item.aspx?id=19421

John Herbster at 11/12/2006 4:26:35 AM -
Above URL is for  
  ExactFloatToStr_JH0 -- Exact Float to String Routines

Ralf Stocker at 11/12/2006 3:53:47 AM -
PC went from 2 to 3 in the first line after the begin of .dpr. Is this correct now?

Version: Turbo Delphi 10.0.2288.42451




John Herbster at 11/12/2006 4:31:01 AM -
The FPU's precision control should be 3 (for extended) when the initialization code in your application units executes.  (It is not clear to me from your comment when you are seeing it change from 2 to 3.)
