Friday, July 20, 2012

-save-temps=obj, -save-temps=cwd, & other optimizations from GCC 4.5 General Optimizer Improvements

GCC 4.5 Release Series — Changes, New Features, and Fixes - GNU Project - Free Software Foundation (FSF):


General Optimizer Improvements

  • The -save-temps option now takes an optional argument. The -save-temps and -save-temps=cwd switches write the temporary files in the current working directory, with names based on the original source file. The -save-temps=obj switch writes files into the directory specified with the -o option, and the intermediate filenames are based on the output file. This lets the user keep the compiler's intermediate files during parallel builds without two builds of the same filename, located in different directories, interfering with each other.
  • Debugging dumps are now created in the same directory as the object file rather than in the current working directory. This allows the user to get debugging dumps when doing parallel builds without two builds of the same filename interfering with each other.
  • GCC has been integrated with the MPC library. This allows GCC to evaluate complex arithmetic at compile time more accurately. It also allows GCC to evaluate calls to complex built-in math functions having constant arguments and replace them at compile time with their mathematically equivalent results. In doing so, GCC can generate correct results regardless of the math library implementation or floating point precision of the host platform. This also allows GCC to generate identical results regardless of whether one compiles in native or cross-compile configurations to a particular target. The following built-in functions take advantage of this new capability: cacos, cacosh, casin, casinh, catan, catanh, ccos, ccosh, cexp, clog, cpow, csin, csinh, csqrt, ctan, and ctanh. The float and long double variants of these functions (e.g. csinf and csinl) are also handled.
  • A new link-time optimizer has been added (-flto). When this option is used, GCC generates a bytecode representation of each input file and writes it to specially-named sections in each object file. When the object files are linked together, all the function bodies are read from these named sections and instantiated as if they had been part of the same translation unit. This enables interprocedural optimizations to work across different files (and even different languages), potentially improving the performance of the generated code. To use the link-time optimizer, -flto needs to be specified at compile time and during the final link. If the program does not require any symbols to be exported, it is possible to combine -flto and the experimental -fwhopr with -fwhole-program to allow the interprocedural optimizers to use more aggressive assumptions.
  • The automatic parallelization pass was enhanced to support parallelization of outer loops.
  • Automatic parallelization can be enabled as part of Graphite. In addition to -ftree-parallelize-loops=, specify -floop-parallelize-all to enable the Graphite-based optimization.
  • The infrastructure for optimizing based on restrict-qualified pointers has been rewritten and should result in code generation improvements. Optimizations based on restrict-qualified pointers are now also available when using -fno-strict-aliasing.
  • There is a new optimization pass that attempts to change the prototypes of functions to avoid unused parameters, pass only the relevant parts of structures, and turn arguments passed by reference into arguments passed by value when possible. It is enabled by -O2 and above as well as -Os, and can be manually invoked using the new command-line switch -fipa-sra.
  • GCC now optimizes exception handling code. In particular, cleanup regions that are proved to have no effect are optimized out.
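
Two of the items above, the restrict-qualified pointer work and the -fipa-sra pass, are easier to picture with a small C file. The following is only a sketch: the names struct config, scale_into, apply_scale and scaled_sum are made up for illustration and do not come from the release notes, and whether GCC actually transforms apply_scale depends on the compiler version and flags.

```c
#include <stddef.h>

/* Hypothetical example type: the helper below reads only `scale`. */
struct config {
    int scale;
    int unused_a;
    int unused_b;
};

/* `restrict` tells the compiler that dst and src never overlap, so a
   store to dst[i] cannot change any src element it still has to read. */
void scale_into(int *restrict dst, const int *restrict src,
                size_t n, int factor)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i] * factor;
}

/* A static helper with the shape the -fipa-sra bullet describes:
   it takes a structure by reference but reads a single field, and
   it has a parameter (debug_level) that is never used. */
static int apply_scale(const struct config *cfg, int value, int debug_level)
{
    (void)debug_level;          /* deliberately unused */
    return value * cfg->scale;
}

/* Sums values[0..n-1], each multiplied by cfg->scale. */
int scaled_sum(const struct config *cfg, const int *values, size_t n)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += apply_scale(cfg, values[i], 0);
    return total;
}
```

The restrict keyword needs C99 or later (for example -std=c99). Compiling this at -O2 with -save-temps=obj keeps the preprocessed source and assembly next to the object file, which is one way to check what the compiler did with apply_scale.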
