Monday, November 12, 2012

Why a gcc build creates so many temporary files [LinuxForums]

Originally posted by hazel:
I've just done the first deep update of my system. To my surprise and satisfaction, it recompiled specifically for my machine quite a few packages that had originally come with the stage 3 tarball, including gcc. I was also impressed by the messages that portage gave me, explaining exactly what tidying up operations would be necessary afterwards.

While gcc was compiling, I monitored the number of files the build was creating by using df -i periodically (I already knew it was a large number). The number of inodes consumed was about 55 thousand! That is grotesque! What kind of a process needs 55,000 files? I know that the compiler must create an object file for each source file, but were there really 27,000 source files in this package? Just what is going on here?

gcc is a huge piece of software. It is not just a C compiler: it understands a number of languages, it can be built as a cross compiler, and it can compile for a huge number of architectures.

The source package contains more than 10,000 .c files and over 60,000 files in total. Add the intermediate files to that and you start to realize the magnitudes we are talking about.

I found it interesting...  Cheers, Connie

$ tar xf /var/portage/distfiles/gcc-4.3.3.tar.bz2
$ find gcc-4.3.3/ -name '*.c' | wc -l
$ find gcc-4.3.3/ -type f | wc -l
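If you want to see where the bulk of those files comes from, something like the following might help. It is only a rough sketch: the function names count_files and count_by_ext are made up for illustration, and it assumes file names without embedded newlines. It restricts the search to regular files so that directories are not counted.

```shell
# count_files DIR: number of regular files under DIR (directories excluded).
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# count_by_ext DIR: the ten most common file extensions under DIR,
# most frequent first. Files without a dot in their name are skipped.
count_by_ext() {
    find "$1" -type f -name '*.*' \
        | sed 's/.*\.//' \
        | sort | uniq -c | sort -rn | head -n 10
}
```

For example, count_by_ext gcc-4.3.3/ on the unpacked tarball would list the extensions that dominate the tree, which shows how much of it is test suite and non-C sources.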
Makefiles are created when the configure script runs and, as you say, a lot of intermediate object files are created while compiling. Oh, and don't forget that gcc automatically bootstraps itself. That means that each time you emerge gcc, this happens:

  • first, gcc is compiled using whatever compiler you have on your system
  • second, gcc is re-compiled using the gcc version that you just compiled
  • third, gcc is recompiled again using the gcc produced in the second step; the second and third builds are then compared to check that they are identical, which ensures that nothing went wrong.
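The comparison in the last step can be pictured roughly like this. This is a simplified sketch, not the check gcc's own build system runs: compare_stages is a hypothetical helper, and the two directories you pass it are assumed to hold the object files of the second and third builds under the same relative paths.

```shell
# compare_stages A B: compare every .o file under A with the file at the
# same relative path under B. Prints each file that differs or is missing
# in B, and returns non-zero if there was any mismatch.
compare_stages() {
    a=$1
    b=$2
    status=0
    [ -d "$a" ] && [ -d "$b" ] || return 2
    # Write the file list to a temp file so the loop runs in the current
    # shell and the status variable survives it.
    list=$(mktemp)
    (cd "$a" && find . -type f -name '*.o') > "$list"
    while IFS= read -r f; do
        if ! cmp -s "$a/$f" "$b/$f"; then
            echo "differs: $f"
            status=1
        fi
    done < "$list"
    rm -f "$list"
    return $status
}
```

Note that this also means each stage leaves behind its own full set of object files, which is part of why the inode count climbs so fast.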

Now you can start to see why the number of intermediate files is so huge.

And don't forget that on amd64, if you have multilib enabled, two entire compilers are created, one for 32 bits and one for 64 bits. That means that gcc will be compiled 6 times, which doubles the total number of files once again.

Why does a build create so many temporary files?
