Channel: Intel® C++ Compiler
Viewing all 1665 articles
Browse latest View live

Intel® Parallel Studio XE 2015 Update 4 Composer Edition for Windows*

Intel® Parallel Studio XE 2015 Update 4 Composer Edition for Windows* includes Intel's latest Fortran and C/C++ compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® Visual Fortran Compiler 15.0.4, Intel® C++ Compiler XE Version 15.0.4, Intel® Math Kernel Library (Intel® MKL) Version 11.2 Update 3, Intel® Integrated Performance Primitives (Intel® IPP) Version 8.2 Update 2, Intel® Threading Building Blocks (Intel® TBB) Version 4.3 Update 5, Intel® Debugger Extension for Intel® Many Integrated Core Architecture (Intel® MIC Architecture) Version 7.7-8.0

New in this release:

Note: For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File: w_compxe_online_2015.4.221.exe
Online installer

File: w_compxe_2015.4.221.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, English version)

File: w_compxe_all_jp_2015.4.221.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, Japanese version)

File:  w_ccompxe_redist_msi_2015.4.221.zip
C++ Redistributable Libraries for 32-bit and 64-bit msi files

File:  w_fcompxe_redist_msi_2015.4.221.zip
Fortran Redistributable Libraries for 32-bit and 64-bit msi files

File:  get-ipp-8.2-crypto-library.htm
Cryptography Library

  • Developers
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • C/C++
  • Fortran
  • Intel® Parallel Studio XE Composer Edition
  • Intel® C++ Compiler
  • Intel® Fortran Compiler
  • Intel® Debugger
  • Intel® Math Kernel Library
  • Intel® Threading Building Blocks
  • Intel® Integrated Performance Primitives
  • Intel® Fortran Composer XE
  • Intel® Composer XE
  • Intel® C++ Composer XE
  • Intel® Visual Fortran Composer XE
  • URL

  • Intel® Parallel Studio XE 2015 Update 4 Composer Edition for C++ Windows*

    Intel® Parallel Studio XE 2015 Update 4 Composer Edition for C++ Windows* includes the latest Intel C/C++ compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® C++ Compiler XE Version 15.0.4, Intel® Math Kernel Library (Intel® MKL) Version 11.2 Update 3, Intel® Integrated Performance Primitives (Intel® IPP) Version 8.2 Update 2, Intel® Threading Building Blocks (Intel® TBB) Version 4.3 Update 5, Intel® Debugger Extension for Intel® Many Integrated Core Architecture (Intel® MIC Architecture) Version 7.7-8.0

    New in this release:

    Note: For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

    Resources

    Contents
    File: w_ccompxe_online_2015.4.221.exe
    Online installer

    File: w_ccompxe_2015.4.221.exe
    Product for developing 32-bit and 64-bit applications

    File:  w_ccompxe_redist_msi_2015.4.221.zip
    Redistributable Libraries for 32-bit and 64-bit msi files

    File:  get-ipp-8.2-crypto-library.htm
    Cryptography Library

  • Developers
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • C/C++
  • Intel® Parallel Studio XE Composer Edition
  • Intel® C++ Compiler
  • Intel® Math Kernel Library
  • Intel® Threading Building Blocks
  • Intel® Integrated Performance Primitives
  • Intel® Composer XE
  • Intel® C++ Composer XE
  • URL
  • Unresolved references in MSVCRT.lib with Visual Studio 2015 RC

    I installed Intel Parallel Studio XE 2016 Beta Update 1 with Visual Studio Community 2015 RC and I'm getting unresolved references in MSVCRT.lib when I try to build a default Win32 console project in x64 mode:

    1>ipo: : warning #11021: unresolved __vcrt_initialize
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __vcrt_uninitialize
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __vcrt_uninitialize_critical
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __vcrt_thread_attach
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __vcrt_thread_detach
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved _is_c_termination_complete
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __acrt_initialize
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __acrt_uninitialize
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __acrt_uninitialize_critical
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __acrt_thread_attach
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : warning #11021: unresolved __acrt_thread_detach
    1>          Referenced in MSVCRT.lib(utility.obj)
    1>ipo: : error #11023: Not all components required for linking are present on command line

    All project settings are defaults set by project wizard. I installed only 64-bit target tools.

     

    _mm_unpackhi_epi8 and _mm_unpacklo_epi8 to convert 16 signed chars into 2 signed short vectors

    I am using _mm_unpacklo_epi16 and _mm_unpackhi_epi16 with a second argument vector of zeros to convert signed/unsigned short vectors into 2 signed/unsigned integer vectors, i.e.:

    __m128i lowVec  = _mm_unpacklo_epi16(vecA, vec0);
    __m128i highVec = _mm_unpackhi_epi16(vecA, vec0);

    This works fine when converting a vector of 16 unsigned chars into 2 unsigned short vectors using _mm_unpacklo_epi8 and _mm_unpackhi_epi8, yet when the input vector holds 16 signed chars, the short values in the result vectors are all 127 + the original values.

    I found a way to overcome this using an add operation with 127 and, immediately after the unpack, subtracting the 127 again, yet this is very inelegant.

    Another way was to use _mm_cvtepi8_epi16 and shift operations to get the wanted values, but this was less elegant than the previous add/sub and the performance was worse.

    According to the documentation of _mm_unpacklo_epi8 and _mm_unpackhi_epi8, there was not supposed to be any problem with signed chars...
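
For reference, a commonly used SSE2 approach to widening signed bytes with the unpack intrinsics is to interleave with a per-byte sign mask instead of a vector of zeros. The sketch below assumes an x86 target with SSE2; the helper name widen_i8_to_i16 is hypothetical and is not taken from the post.

```cpp
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: widens 16 signed bytes to 16 signed 16-bit values.
void widen_i8_to_i16(const std::int8_t* in, std::int16_t* out) {
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    // sign holds 0xFF for every negative byte of v and 0x00 otherwise.
    const __m128i sign = _mm_cmpgt_epi8(_mm_setzero_si128(), v);
    // Interleaving v with its sign mask places the sign byte in the high
    // half of each 16-bit lane, which is exactly sign extension.
    const __m128i lo = _mm_unpacklo_epi8(v, sign);
    const __m128i hi = _mm_unpackhi_epi8(v, sign);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), lo);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out + 8), hi);
}
```

Interleaving with zeros is zero extension, which is why it only suits unsigned input; the sign mask costs one extra compare per input vector.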

     

    Parallelization of dyadic product

    Hi,

    I have two vectors (they may refer to the same vector) and I need to compute the product x[i]*y[j] with i,j=1..n.

    What is the best way to perform this operation in parallel? I've tried

    cilk_for(h=0;h<n*n;h++)r[h]=x[h/n]*y[h%n];

    but I guess it is only a naive attempt. Indeed, vec-report says it is inefficient.

    Thanks.

    Fabio
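
One alternative to the flattened loop, sketched here as an assumption rather than a measured recommendation, is to parallelize over rows and keep a unit-stride inner loop, which avoids the division and modulo per element. The function name outer_product and the row-major layout are hypothetical choices; the OpenMP pragma is simply ignored when OpenMP is not enabled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns r with r[i*m + j] = x[i] * y[j] (row-major).
std::vector<double> outer_product(const std::vector<double>& x,
                                  const std::vector<double>& y) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(y.size());
    std::vector<double> r(x.size() * y.size());
    const double* xp = x.data();
    const double* yp = y.data();
    double* rp = r.data();
    // Each row is independent, so rows can be distributed across threads.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        const double xi = xp[i];   // loop-invariant factor for this row
        double* row = rp + i * m;  // contiguous output row
        for (std::ptrdiff_t j = 0; j < m; ++j) {
            row[j] = xi * yp[j];   // unit stride, no h/n or h%n needed
        }
    }
    return r;
}
```

The same loop structure should carry over to cilk_for on the outer loop; whether it actually vectorizes on a given compiler would need to be checked with the vectorization report.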

     

    Memory leak caused or worsened by /Qipo?

    I've made a DLL that I compile with /Qipo (Intel C++ Composer XE 2015). If I call the constructor and destructor of the main class in it, the memory doesn't get released, and after a few calls (32-bit mode) I'm out of memory. However, if I disable /Qipo, there doesn't seem to be a problem at all (I will run it for a longer period tonight, but I let it construct and destroy the object 1024 times earlier tonight and I didn't notice an increase in memory usage).

    If I use /Qip mode, the leak is 8 MB per call. With /Qipo it's about 300 MB.

    I have checked a .EXE version of my software with Inspector XE2015 and it doesn't report any leaks, and neither does the debugger.

    Any clues to help me find the cause are helpful. If I have a memory leak somewhere in my code, could that somehow cause - when using IPO - the whole memory to leak???

    Possible compiler bug

    There's a possible bug in the icc installed with Composer XE 2013 SP1 Update 5 (2013.1.5.239). The compiler compiles the code, but the result leads to a run-time crash.

    Here is a C++ program that reproduces the crash:

    If executed it leads to this error:

    "Run-Time Check Failure #2 - Stack around the variable 'os_.1016' was corrupted."

    //------------------------------------------------
    #include <new>
    
    struct base
    {
      virtual ~base() {}
    };
    
    struct test_type : virtual base
    {
    };
    
    template<class T>
    struct opt
    {
      bool init_;
      char buffer_[sizeof(T)];
    
      opt() : init_(false) {}
    
      void recreate()
      {
        clear();
        construct();
      }
    
      void clear()
      {
        if (init_)
        {
          destroy();
        }
      }
    
      ~opt()
      {
        clear();
      }
    
      T& operator*()
      {
        return *address();
      }
    
    private:
      void* raw_storage()
      {
        return &buffer_;
      }
    
      T* address()
      {
        return static_cast<T*>(raw_storage());
      }
    
      void construct()
      {
        ::new (raw_storage()) T();
        init_ = true;
      }
    
      void destroy()
      {
        address()->T::~T();
        init_ = false;
      }
    };
    
    void test()
    {
      opt<test_type>os_;
      os_.recreate();
      os_.clear();
    }
    
    int main()
    {
      test();
    }
    //------------------------------------------------
    

     

    The program compiles and runs correctly with gcc, clang, and vc110 (tried at http://melpon.org/wandbox and in VS2012).

     

    proc_bind(spread) does not seem to be honored

    Hello Folks,

    I have a program that is decomposed into two parts:
    One loop that allocates data: it does 4 iterations, one for each socket.
    One loop that does computation on the data: it does 48 iterations (each thread should work on a slice of data, hopefully a slice of data that is on the local socket).

    My machine is a 4-socket Xeon machine with 12 cores per processor. I'm using ICC 15.0.1 20141023.

    To get good scalability, I need to allocate data evenly across my processors.
    To that end, I have found that "KMP_AFFINITY=scatter" does exactly what I need.
    The problem is that this does not work well with my second loop, which does the computations. I'd like computations to occur on the socket where the data is allocated (kind of like the "owner computes" rule).

    I thought that using OpenMP 4.0's proc_bind(spread) for the first loop, then proc_bind(close) for the second, would allow me to have threads where I need them to be. In my experience, though, checking with "sched_getcpu()" to see where a thread is running, using "proc_bind(spread)" or not doesn't make any difference, while "KMP_AFFINITY=scatter" does exactly what I need.

    Questions:
    Am I right to assume that proc_bind(spread) should do the same thing as "KMP_AFFINITY=scatter"?

    Do you think I'm going the wrong way trying to use OpenMP to pin my threads in a custom manner for my two loop nests?

    I hope you can help me with this,

    Best regards,

    Guillaume
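
As a point of comparison, the OpenMP 4.0 proc_bind policies are defined relative to a place list (for example one set through OMP_PLACES), so a sketch of the owner-computes pattern usually nests a proc_bind(close) team inside a proc_bind(spread) loop. The version below is only a sketch under those assumptions: the function name sum_by_socket and its parameters are hypothetical, it merges allocation and computation into one nested region rather than two separate loops, and whether it reproduces the KMP_AFFINITY=scatter placement on a given runtime is not something it can confirm.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical sketch: one outer iteration per socket, one inner team per socket.
// Intended to be run with a place list defined, e.g. OMP_PLACES=sockets or cores.
double sum_by_socket(int sockets, int cores_per_socket, int slice_len) {
    std::vector<std::vector<double> > data(static_cast<std::size_t>(sockets));
    double total = 0.0;
#ifdef _OPENMP
    omp_set_max_active_levels(2);  // allow the nested parallel region
#endif
    // Outer level: spread one thread per socket across the place list.
    #pragma omp parallel for num_threads(sockets) proc_bind(spread) schedule(static) reduction(+:total)
    for (int s = 0; s < sockets; ++s) {
        // The block is allocated and first touched by the thread handling socket s.
        const int len = cores_per_socket * slice_len;
        data[static_cast<std::size_t>(s)].assign(static_cast<std::size_t>(len), 1.0);
        const double* block = data[static_cast<std::size_t>(s)].data();
        double local = 0.0;
        // Inner level: keep the worker team close to the thread that owns the data.
        #pragma omp parallel for num_threads(cores_per_socket) proc_bind(close) reduction(+:local)
        for (int i = 0; i < len; ++i) {
            local += block[i];
        }
        total += local;
    }
    return total;
}
```

Printing sched_getcpu() inside the inner loop, as described in the post, would be the way to check where the threads actually land.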


    CPU2006 compile issues with MSVC 2013, ICC XE 2015 rev 3, windows server 2012

     

    ICL Version 12.0.3.208

     

    I have compile errors with 2 cpu2006 benchmarks.

     

    483.xalancbmk dies in execution if I compile with -O3 -ipo; it works fine with -O2

    453.povray says:

    file defaultrenderfrontend.cpp

    error "<mathimf.h> is incompatible with system <math.h>!"

     

    Debug symbols always appear in the binary without debugging enabled

    Hello, 

    My ICC version is intel_parallel_studio_xe_2015_update1, trial version.

    I used the following commands to compile:

    icc  -w -fpermissive -fPIE -I. -DMKL_ILP64 -DLINUX  -std=c++11 -g0 -O3  -c xx.cpp -o /tmp/xx.o
    xiar rcs /tmp/xx.a /tmp/xx.o

    and used the following command to link:

    icc  /tmp/release/*.o -o release/yy -L/opt/intel/composer_xe_2015.3.187/lib/intel64 -L/opt/intel/composer_xe_2015.3.187/mkl/lib/intel64 -Wl,--start-group /opt/intel/composer_xe_2015.3.187/mkl/lib/intel64/libmkl_core.a /opt/intel/composer_xe_2015.3.187/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/composer_xe_2015.3.187/mkl/lib/intel64/libmkl_sequential.a /opt/intel/composer_xe_2015.3.187/compiler/lib/intel64/libiomp5.a /opt/intel/composer_xe_2015.3.187/compiler/lib/intel64/libirc.a -Wl,--end-group -L.  -Bdynamic -lm  -Bstatic -pthread -lstdc++ -ldb -lsqlite3 

    When I get the binary yy as above and run objdump -t yy|grep debug, it shows:

    0000000000000000 l    d  .debug_aranges    0000000000000000              .debug_aranges
    0000000000000000 l    d  .debug_info    0000000000000000              .debug_info
    0000000000000000 l    d  .debug_abbrev    0000000000000000              .debug_abbrev
    0000000000000000 l    d  .debug_line    0000000000000000              .debug_line
    0000000000000000 l    d  .debug_str    0000000000000000              .debug_str
    0000000000000000 l    d  .debug_loc    0000000000000000              .debug_loc
    0000000000000000 l    d  .debug_ranges    0000000000000000              .debug_ranges

     

     

    This makes me think it is a debug build, but I actually compiled a release version.

    My gcc is 4.9.2, compiled from source and used instead of the system default gcc (4.4); the OS is CentOS 6.6.

    2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

     

    gcc -v 

    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.9.2/lto-wrapper
    Target: x86_64-redhat-linux
    Configured with: ../gcc-4.9.2/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++ --disable-dssi --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux --disable-multilib
    Thread model: posix
    gcc version 4.9.2 (GCC)

     

    When I use gcc to compile, there is no such issue. The binary from gcc is about 5.2M, and the one from icc is 11M.

    Could you please advise a solution?

     

    Thanks, 

    yixuan

     

     

    Compiler bug in XE 2015: error : no instance of function template "..." matches the argument list

    Hi,

    the following code:

    #include <tuple>
    
    struct Foo {
    	std::tuple<int> inner;
    	template <unsigned Idx>
    	auto get() const -> decltype(std::get<Idx>(inner)) { return std::get<Idx>(inner); }
    };
    
    int main()
    {
    	Foo f;
    	f.get<0>();
    }

    produces the following error:

    1>main.cpp(12): error : no instance of function template "Foo::get" matches the argument list
    1>              object type is: Foo
    1>    	f.get<0>();
    1>    	  ^

    It seems as if the compiler is not able to deduce the trailing return type.

    Best regards,
    Manuel Pöter
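
One possible way to sidestep the trailing decltype, not verified against the Intel compiler, is to spell the return type with std::tuple_element instead. In this sketch the template parameter is changed to std::size_t to match std::get; that change and the explicit const reference return type are assumptions of the sketch, not part of the original report.

```cpp
#include <cstddef>
#include <tuple>

struct Foo {
    std::tuple<int> inner;

    // Return type written out via std::tuple_element rather than deduced
    // with decltype in a trailing return type.
    template <std::size_t Idx>
    const typename std::tuple_element<Idx, std::tuple<int> >::type& get() const {
        return std::get<Idx>(inner);
    }
};
```

This yields the same const int& that the decltype expression produces in a const member function.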

    Memory latency numbers

    Hi,

    I've been working on a talk about cache effects. My main reference is "What every programmer should know about memory". I've been working on a small benchmark that shows the different levels of cache: walking an array of objects in linear/randomized order. The 3 levels of cache are quite obvious when walking in randomized order, and I measure about one cache line every 300 clock cycles on a Xeon 2xxx once the program has to go to main memory.

    The paper "What every programmer should know about memory" states that this is more than the main memory latency because the hardware needs to translate virtual addresses to physical addresses and you might be slowed down by the TLB.

    What surprises me is that some people claim very accurate numbers for L1/L2/L3/main memory access. So here are my questions:

    - What does the memory latency depend on? CPU / motherboard / RAM?

    - Does Intel give measurements for memory latency?

    - How do people measure this latency? If the author of "What every programmer should know about memory" claims that 300 cycles is more than the memory latency, he must have a way of knowing this latency. Unfortunately, he does not give any hint on how to get/measure it.
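
A common way to approach this kind of measurement is a pointer-chasing loop, where every load depends on the previous one so that loads cannot overlap. The sketch below is one such benchmark and is not the code from the post: the names make_random_cycle, chase and ns_per_load are hypothetical, the random single-cycle permutation is built with Sattolo's algorithm, and the result is an average time per dependent load that still includes any TLB misses. Elements are 8-byte indices here, so several share a cache line; padding each element to a full line would be a further refinement.

```cpp
#include <chrono>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Builds a random permutation consisting of a single cycle (Sattolo's
// algorithm), so chasing it visits every element before repeating.
std::vector<std::size_t> make_random_cycle(std::size_t n, unsigned seed) {
    std::vector<std::size_t> next(n);
    std::iota(next.begin(), next.end(), std::size_t{0});
    std::mt19937 rng(seed);
    for (std::size_t i = n; i-- > 1;) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(next[i], next[pick(rng)]);
    }
    return next;
}

// Follows the chain for `steps` dependent loads and returns the final index.
std::size_t chase(const std::vector<std::size_t>& next, std::size_t start,
                  std::size_t steps) {
    std::size_t pos = start;
    for (std::size_t s = 0; s < steps; ++s) {
        pos = next[pos];  // each load needs the result of the previous one
    }
    return pos;
}

// Average wall-clock nanoseconds per dependent load for an n-element chain.
double ns_per_load(std::size_t n, std::size_t steps) {
    const std::vector<std::size_t> next = make_random_cycle(n, 12345u);
    const auto t0 = std::chrono::steady_clock::now();
    volatile std::size_t sink = chase(next, 0, steps);  // keep the result live
    const auto t1 = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(t1 - t0).count() /
           static_cast<double>(steps);
}
```

Running ns_per_load over a range of sizes, from a few kilobytes up to well beyond the last-level cache, should show the plateaus described above; converting to cycles requires knowing the core clock.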

     

    Which parallelization library to use for realtime processing?

    I'm developing a realtime audio processing software. There may be several (for example even 100) processors at each moment, in several parallel chains. I cannot let the processors cooperate and must assume any possible sequence of processing. Each of them receives a block of data usually 256-1024 values and needs to process them as quickly as possible, so that the results may be passed to the next item in chain. If the data is not delivered in time, bad things happen... But in many cases just a few processors may be used and the goal is to keep general CPU usage minimal then. The algorithms in each processor vary a lot, so it is hard to predict anything.

    The "host" for all these processors is unknown and usually implements some kind of parallelization as well, but in my huge test project it was reporting "near trouble" CPU usage, while the system task manager reported just about 14% CPU usage on my 8-core Xeon E5, so evidently there's a lot of spare processing power.

    From what I know these are the choices:

    1) TBB - this one looks harder to use.

    2) CILK

    3) OpenMP - I actually tested this one via MSVC and sadly it seemed to keep its threads actively waiting, which means that the CPU was at 100% despite a pretty small improvement in performance.

    I'd prefer if the solution could be linked statically. All of the processor implementations will be present in a single DLL (Windows) / dylib (OSX).

    Any recommendations?

    integer constant as template parameter problem

    Compiling pcl-1.7.2 with Intel C++ compiler 15.0.4.221 Build 20150407 on Windows failed with errors related to Eigen-3.2.4.
    I was able to reduce the problem to the following code snippet. It uses an integer constant as a parameter of a template type. This type is used as the return type of a class template's member function. The Intel C++ compiler reports a mismatch between the declaration and the definition of this member function. But the Microsoft compiler, gcc and I agree that there is no mismatch. Is this an Intel C++ compiler problem?
    (For those interested: I found a workaround by defining the integer constant with enum { ... } instead of int const ...).

    namespace MyNamespace {
      int const MyConstant=42; // this leads to error in line 19 with intel compiler
      // enum {MyConstant=42};      // ok with intel compiler
     
      template<typename A, int B> class MyRetVal
      {
      };
    
    
      template<typename A>
      class MyClass
      {
      public:
        MyRetVal<A,MyConstant> func();
      };
     
    
      template<typename A>
      MyRetVal<A,MyConstant> MyClass<A>::func() // <-- error with intel compiler
      {
        MyRetVal<A,MyConstant> m;
        return m;
      }
    }
    
    
    int main()
    {
      MyNamespace::MyClass<float> a;
      a.func();
      return 0;
    }

     

    Here is the compiler error message:

      Intel(R) C++ Compiler XE for applications running on IA-32, Version 15.0.4.221 Build 20150407
      Copyright (C) 1985-2015 Intel Corporation.  All rights reserved.

      intel_template.cpp
      intel_template.cpp(19): error: declaration is incompatible with "MyNamespace::MyRetVal<A, MyNamespace::MyConstant> MyNamespace::MyClass<A>::func()" (declared at line 14)
          MyRetVal<A,MyConstant> MyClass<A>::func() // <-- error with intel compiler

    compilation aborted for intel_template.cpp (code 2)

    Intel® Parallel Studio XE 2015 Update 4 Professional Edition for Windows*

    Intel® Parallel Studio XE 2015 Update 4 Professional Edition parallel software development suite combines Intel's C/C++ compiler and Fortran compiler; performance and parallel libraries; error checking, code robustness, and performance profiling tools into a single suite offering.  This new product release includes:

    • Intel® Parallel Studio XE 2015 Update 4 Composer Edition - includes Intel® Visual Fortran Compiler, Intel® C++ Compiler, Intel® Integrated Performance Primitives (Intel® IPP), Intel® Threading Building Blocks (Intel® TBB) and Intel® Math Kernel Library (Intel® MKL)
    • Intel® Advisor XE 2015 Update 1
    • Intel® Inspector XE 2015 Update 1
    • Intel® VTune™ Amplifier XE 2015 Update 4
    • Sample programs
    • Documentation

    New in this release:

    • Components updated to current versions

    Note:  For more information on the changes listed above, please read the individual component release notes.

    See the previous release's ReadMe to see what was new in that release.

    Resources

    Contents 
    File:  parallel_studio_xe_2015_update4_online_setup.exe
    Online installer

    File:  parallel_studio_xe_2015_update4_setup.exe
    Product for developing 32-bit and 64-bit applications

  • Developers
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • C/C++
  • Fortran
  • Intel® Parallel Studio XE Composer Edition
  • Intel® Parallel Studio XE Professional Edition
  • Intel® VTune™ Amplifier XE
  • Intel® C++ Compiler
  • Intel® Inspector XE
  • Intel® Advisor XE
  • Intel® Fortran Compiler
  • Intel® Math Kernel Library
  • Intel® Threading Building Blocks
  • Intel® Integrated Performance Primitives
  • Intel® Fortran Composer XE
  • Intel® Composer XE
  • Intel® C++ Composer XE
  • Intel® C++ Studio XE
  • Intel® Fortran Studio XE
  • Intel® Visual Fortran Composer XE
  • URL

  • Intel® Parallel Studio XE 2015 Update 4 Professional Edition for C++ Windows*

    Intel® Parallel Studio XE 2015 Update 4 Professional Edition for C++ parallel software development suite combines Intel's C/C++ compiler; performance and parallel libraries; error checking, code robustness, and performance profiling tools into a single suite offering.  This new product release includes:

    • Intel® Parallel Studio XE 2015 Update 4 Composer Edition for C++ - includes Intel® C++ Compiler, Intel® Integrated Performance Primitives (Intel® IPP), Intel® Threading Building Blocks (Intel® TBB) and Intel® Math Kernel Library (Intel® MKL)
    • Intel® Advisor XE 2015 Update 1
    • Intel® Inspector XE 2015 Update 1
    • Intel® VTune™ Amplifier XE 2015 Update 4
    • Sample programs
    • Documentation

    New in this release:

    • Components updated to current versions

    Note:  For more information on the changes listed above, please read the individual component release notes.

    See the previous release's ReadMe to see what was new in that release.

    Resources

    Contents 
    File:  parallel_studio_xe_2015_update4_online_setup.exe
    Online installer

    File:  parallel_studio_xe_2015_update4_setup.exe
    Product for developing 32-bit and 64-bit applications

  • Developers
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • C/C++
  • Intel® Parallel Studio XE Composer Edition
  • Intel® Parallel Studio XE Professional Edition
  • Intel® VTune™ Amplifier XE
  • Intel® C++ Compiler
  • Intel® Inspector XE
  • Intel® Advisor XE
  • Intel® Math Kernel Library
  • Intel® Threading Building Blocks
  • Intel® Integrated Performance Primitives
  • Intel® Composer XE
  • Intel® C++ Composer XE
  • Intel® Cluster Studio
  • URL
  • Empty icl 2015 update 4 fix list

    icc16: openmp declare reduction : internal error: 20000_0

    Hey all,

    I tried to use the OpenMP declare reduction pragma for an implementation of an array reduction. The code compiles fine using g++ 4.9 and produces correct results. However, using icpc-16.0.038 gives an internal error 20000_0. I attached a small code example.

    #include <iostream>
    #include <vector>
    #include <complex>
    
    
    template <const int n>
    struct complex_array {
    
        complex_array() : element( n, std::complex<float>(0.0f,0.0f) ) {
    
        }
    
        std::vector< std::complex<float> > element;
    };
    
    template <const int n>
    static inline void add_complex_struct( complex_array<n> &x, complex_array<n> const &y ) {
    
        for ( int i = 0; i < n; ++i ) {
    
            x.element[i] += y.element[i];
        }
    }
    
    
    #pragma omp declare reduction( complex_array_reduction : complex_array<10> : add_complex_struct<10>(omp_out, omp_in) )
    
    
    int main() {
    
        complex_array<10> a_array, b_array;
    
    
        for ( int i = 0; i < 10; ++i ) {
    
            std::complex<float> cf( 1.0f+i, 1.5f+i );
    
            a_array.element[i] =  cf;
        }
    
    
        #pragma omp parallel for schedule(static) reduction( complex_array_reduction:b_array )
        for ( int i = 0; i < 100; ++i ) {
    
            add_complex_struct<10>( b_array, a_array );
        }
    
    
        for ( int i = 0; i < 10; ++i ) {
    
            std::cout << b_array.element[i] << std::endl;
        }
    }

    I used g++ -fopenmp and icpc -qopenmp to compile the code.

    Thanks,

    Patrick
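
Until the internal error is resolved, one possible fallback that avoids declare reduction entirely is to accumulate into a thread-private array and merge the partial results in a critical section. This is only a sketch and has not been tried with icpc 16; the typedef cvec and the function accumulate_manually are hypothetical names, and the code also builds as plain serial C++ when OpenMP is disabled.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

typedef std::vector<std::complex<float> > cvec;

// Hypothetical workaround sketch: adds `a` element-wise `iterations` times,
// using per-thread partial sums instead of a user-declared reduction.
cvec accumulate_manually(const cvec& a, int iterations) {
    cvec result(a.size(), std::complex<float>(0.0f, 0.0f));
    #pragma omp parallel
    {
        // Each thread owns its partial array, so the loop body is race-free.
        cvec partial(a.size(), std::complex<float>(0.0f, 0.0f));
        #pragma omp for schedule(static) nowait
        for (int it = 0; it < iterations; ++it) {
            for (std::size_t i = 0; i < a.size(); ++i) {
                partial[i] += a[i];
            }
        }
        // Merge one thread at a time into the shared result.
        #pragma omp critical
        {
            for (std::size_t i = 0; i < a.size(); ++i) {
                result[i] += partial[i];
            }
        }
    }
    return result;
}
```

With the inputs from the example above (ten elements, 100 iterations), this should produce the same sums as the declared reduction does under g++.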

    In-class initializer bug, destructors incorrectly called, compiler 2015

    I've got a test case where I have a class with 3 subobjects (A, B and C), and the 2nd subobject B throws an exception during construction. As I understand C++, the compiler should rewind the construction of the big class and destroy the 1st object A, but not the 2nd (B) or 3rd (C) objects.

    What I see is that if I use "In-class initialization" of the first object A, then instead of the first object A getting destroyed, the 3rd object C gets destroyed. Of course it is VERY BAD to destroy an object that has not been constructed! If, for example, C was a std::unique_ptr<T>, it will probably signal a segmentation violation when it tries to free a garbage pointer.

    If I use old school "member initialization", then this problem doesn't happen.

    I don't see this with gcc 4.8

    Here's the code. The class D exposes the bug. The class E should have identical function, but it does not expose the bug.

    #include <iostream>
    #include <string>
    
    using namespace std;
    
    
    struct A {
        A(const string& x) : x_(x) { cout << "A::A()"<< (void*)this <<endl; }
        ~A() { cout << "A::~A() "<< (void*)this<< endl;}
        string x_;
    };
    
    struct B {
        B(const A& a)  { cout << "B::B()"<< endl; throw "dead"; }
        ~B() { cout << "B::~B()"<< endl;}
    };
    
    struct C {
        C()  { cout << "C::C()"<< endl; }
        ~C() { cout << "C::~C()"<< endl;}
    };
    
    
    struct D  {
        A a{"foo"}; // "new school In-class initialization"
        B b{a};
        C c;
        D() { cout <<"D::D()"<< endl; }
        ~D() { cout <<"D::~D()"<< endl; }
    };
    
    struct E {
        A a;
        B b;
        C c;
        E()
            :a{"foo"}  // "old school member initialization"
            ,b(a)
            { cout <<"E::E()"<< endl; }
        ~E() { cout <<"E::~E()"<< endl; }
    };
    
    int main()
    {
       try {
           D d;
       }
       catch(...)
       {
           cout << "got exception"<< endl;
       }
    
       try {
           E e;
       }
       catch(...)
       {
           cout << "got exception"<< endl;
       }
    
       return 0;
    }

    Here is the output. I expect to see A constructed, B partially constructed before it throws, then A destroyed, but that is not what I see for the D case.

    $ icpc -std=c++11 test.cpp
    $ ./a.out
    A::A()0x7fffe0a5ee90
    B::B()
    C::~C()
    got exception
    
    A::A()0x7fffe0a5eea0
    B::B()
    A::~A() 0x7fffe0a5eea0
    got exception

    -- update --

    The section of the standard that describes what should happen is 15.2.3

    For an object of class type of any storage duration whose initialization or destruction is terminated by an exception, the destructor is invoked for each of the object’s fully constructed subobjects, that is, for each subobject for which the principal constructor (12.6.2) has completed execution and the destructor has not yet begun execution, except that in the case of destruction, the variant members of a union-like class are not destroyed. The subobjects are destroyed in the reverse order of the completion of their construction. Such destruction is sequenced before entering a handler of the function-try-block of the constructor or destructor, if any.
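
The quoted rule can also be turned into a self-checking variant of the test, which makes a regression easy to spot without reading console output. The sketch below is an illustration under the assumption of a conforming C++11 compiler; the names event_log, First, Second, Third, Holder and run_case are hypothetical and mirror A, B, C and D from the test case above.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Shared log of constructor/destructor events, in the order they happen.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

struct First {
    First() { event_log().push_back("First()"); }
    ~First() { event_log().push_back("~First()"); }
};

struct Second {
    // Logs, then throws, so this subobject is never fully constructed.
    explicit Second(const First&) {
        event_log().push_back("Second()");
        throw std::runtime_error("dead");
    }
    ~Second() { event_log().push_back("~Second()"); }
};

struct Third {
    Third() { event_log().push_back("Third()"); }
    ~Third() { event_log().push_back("~Third()"); }
};

struct Holder {
    First a{};    // in-class initializers, as in struct D
    Second b{a};
    Third c;
    Holder() {}
};

// Constructs a Holder, catches the exception, and returns the event sequence.
std::vector<std::string> run_case() {
    event_log().clear();
    try {
        Holder h;
    } catch (const std::runtime_error&) {
        event_log().push_back("caught");
    }
    return event_log();
}
```

Per the rule quoted above, only the fully constructed First member should be destroyed, so the expected sequence is First(), Second(), ~First(), caught; a ~Third() entry would indicate the behavior described in this report.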

    cv-qualifier bug

    The following code compiles fine with clang and g++ (-std=c++14) but fails with icc 2016:

     

    #include <iostream>
    
    template<class Derived>
    struct ConstBase {
      template<bool T = true>
      int f() const {
        return 3;
      }
    };
    
    template<class Derived>
    struct Base : ConstBase<Derived> {
      using ConstBase<Derived>::f;
      template<bool T = true>
      int f() {
        const Derived& derived = static_cast<const Derived&>(*this);
        return derived.f();
      }
    };
    
    struct A : Base<A> {
    };
    
    int main() {
      A a;
      std::cout << a.f() << "\n";
      return 0;
    }

     
