Invalid floating point optimization?

Hello,

I have this piece of code

asm("nop"); asm("nop"); asm("nop"); asm("nop");
double const c = a - b;
double const d = -c;
double const f = d * e;
data->field += f;
asm("nop"); asm("nop"); asm("nop"); asm("nop");
#if !defined(BAD)
printf ("#### new field at %d: %f/%f/%llx\n", __LINE__,
        data->field, data->field, *(unsigned long long *)&data->field);
printf ("#### %f/%g/%llx %f/%g/%llx %f/%g/%llx %f/%g/%llx\n",
       data->field, data->field, *(unsigned long long *)&data->field,
       a, a, *(unsigned long long *)&a, b, b, *(unsigned long long *)&b,
       e, e, *(unsigned long long *)&e);
#endif

So d is basically b-a. However, if b-a is not exactly representable, my application requires d to be an upper bound on the exact value. Therefore the whole code is executed with the floating-point rounding mode FE_DOWNWARD: a-b is rounded down, so d = -(a-b) is an upper bound on b-a. It is not OK to compute d = b-a directly (which would give a lower bound on the exact value).

If BAD is not defined then I get this assembly code:

  610141:       90                      nop
  610142:       90                      nop
  610143:       90                      nop
  610144:       90                      nop
  610145:       f2 0f 10 44 24 40       movsd  0x40(%rsp),%xmm0
  61014b:       f2 0f 5c 44 24 48       subsd  0x48(%rsp),%xmm0
  610151:       0f 57 05 08 63 97 00    xorps  0x976308(%rip),%xmm0        # f86460 <.L_2il0floatpacket.43+0x80>
  610158:       f2 0f 59 44 24 10       mulsd  0x10(%rsp),%xmm0
  61015e:       f2 41 0f 58 44 24 20    addsd  0x20(%r12),%xmm0
  610165:       f2 41 0f 11 44 24 20    movsd  %xmm0,0x20(%r12)
  61016c:       90                      nop
  61016d:       90                      nop
  61016e:       90                      nop
  61016f:       90                      nop

This is more or less a literal translation of the C code, and my application works correctly in this case. If I define BAD, then I get this assembly code instead:

  610086:       90                      nop
  610087:       90                      nop
  610088:       90                      nop
  610089:       90                      nop
  61008a:       f2 0f 10 44 24 38       movsd  0x38(%rsp),%xmm0
  610090:       f2 0f 5c 44 24 40       subsd  0x40(%rsp),%xmm0
  610096:       0f 57 05 b3 61 97 00    xorps  0x9761b3(%rip),%xmm0        # f86250 <.L_2il0floatpacket.43+0x80>
  61009d:       0f 57 05 ac 61 97 00    xorps  0x9761ac(%rip),%xmm0        # f86250 <.L_2il0floatpacket.43+0x80>
  6100a4:       f2 0f 59 04 24          mulsd  (%rsp),%xmm0
  6100a9:       f2 41 0f 58 44 24 20    addsd  0x20(%r12),%xmm0
  6100b0:       f2 41 0f 11 44 24 20    movsd  %xmm0,0x20(%r12)
  6100b7:       90                      nop
  6100b8:       90                      nop
  6100b9:       90                      nop
  6100ba:       90                      nop

and my application does not behave as expected (it computes wrong results). The two xorps instructions already look suspicious to me. As far as I understand, they XOR xmm0 with the same value twice, hence together they are essentially a no-op with respect to the result in xmm0. Single-stepping through the code in gdb, I can see that in the case with BAD not defined:

(gdb) info registers rsp
rsp            0x7fffffffa010	0x7fffffffa010
(gdb) print *(double *)(0x7fffffffa010 + 0x40)
$1 = 5000000
(gdb) print *(double *)(0x7fffffffa010 + 0x48)
$2 = 0

while with BAD defined I get

(gdb) info registers rsp
rsp            0x7fffffffa020	0x7fffffffa020
(gdb) print *(double *)(0x7fffffffa020 + 0x38)
$2 = 0
(gdb) print *(double *)(0x7fffffffa020 + 0x40)
$3 = 5000000

Thus, without BAD the code computes d as stated in the C code (d = -(a-b)), while with BAD the operands of the subtraction are swapped and the code computes d directly as d = b-a. The latter rounds inexact values in the wrong direction, which in turn produces incorrect results in my application.

I am using

icc (ICC) 12.1.5 20120612
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

I compile with

-O -fno-builtin-strlen -fno-builtin-strcat -fno-builtin-strcmp -fno-builtin-strcpy -fno-builtin-strncat -fno-builtin-strncmp -fno-builtin-strrchr -m64 -fPIC -fno-strict-aliasing -diag-disable 1419 -w1 -Wcheck -Wall -Wmissing-declarations -Wmissing-prototypes -Wshadow -vec-report0 -fp-model strict

I have

#pragma fenv_access(on)

at the top-level of my source code.

Am I missing anything here or is this indeed an invalid optimization?

Thanks,

Daniel
