Announcement

**Asher** · September 9, 2010, 01:00

Too lazy to check, but I think this is the thread where I railed on gcc.

Was geeking out tonight and found some humorous quotes from the main developer of the x264 encoder pertaining to gcc.

http://mirror05.x264.nl/Dark/loren.html#devel

<Dark_Shikari> are there intrinsics one can give gcc to cause it not to do retarded things?
<pengvado> __attribute__((no_******))

<pengvado> I got -fwhole-program working, but it doesn't give any measurable speedup
<saintdev> do you have -fomg-fast-speed working yet?
<pengvado> -fomit-instructions?

<pengvado> gcc fails to optimize it because gcc has always sucked at arrays
<pengvado> that's *the* benefit of fortran

<pengvado> will eventually need a struct mv_t to simplify int16_t[2] manipulations
<pengvado> hmm, no it fails
<pengvado> simple assignment doesn't work either, that also compiles to assignment of member fields
<pengvado> ok, gcc sucks at structs just as much as it sucks at arrays. scratch that idea.
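
For anyone wondering what the mv_t idea is getting at, here's a rough sketch in C. The names `mv_t`, `mv_copy_fields` and `mv_copy_packed` are made up for illustration and are not taken from x264. The point is only that a motion vector is two int16_t values that fit in one 32-bit word, so a copy could in principle be a single 32-bit move instead of two 16-bit ones. Whether a given compiler actually emits that is exactly what the quote is complaining about.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper type: two 16-bit components in one 4-byte struct. */
typedef struct { int16_t v[2]; } mv_t;

/* Element-wise copy: two separate 16-bit assignments, which is roughly
   what the log says plain struct assignment compiled to. */
static void mv_copy_fields(mv_t *dst, const mv_t *src)
{
    dst->v[0] = src->v[0];
    dst->v[1] = src->v[1];
}

/* Packed copy: move both components as one 32-bit value.
   memcpy sidesteps aliasing problems; compilers usually (not always)
   turn a fixed 4-byte memcpy into a single load and store. */
static void mv_copy_packed(mv_t *dst, const mv_t *src)
{
    uint32_t tmp;
    memcpy(&tmp, src, sizeof tmp);
    memcpy(dst, &tmp, sizeof tmp);
}
```

Both functions produce the same result; the only difference is how many moves you hope to see in the asm, so check the compiler output before assuming either one is faster.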

<pengvado> all the standard tools suck. I need to fix copy+paste in uuterm so I can switch from xterm, and I need to fix wildcards in psh so I can switch from bash, and I need to write a C compiler so I can switch from gcc ...

<Dark_Shikari> so why am I losing so much speed?
<Dark_Shikari> wouldn't it compile to the exact same thing?!
<Dark_Shikari> given that its inlined
<pengvado> gcc sucks?

<pengvado> ICE not crash. that is, it reports an error in what really shouldn't be an error condition, then refuses to compile your code.
<pengvado> this case is slightly believable, as -freorder-blocks-and-partition and -fprofile-use both modify sections (hot vs cold functions), and neither is implied by any of the -O settings so it's quite possible that they were never tested together
<pengvado> oops, this doesn't depend on -Os. just those 2
<pengvado> oh well. typical oss courtesy says I should file a bugreport, but I'm really not interested in dealing with gcc people, so I'll just file it under "don't do that"

<Dark_Shikari> WHAT THE HELL I change the flat to zeroes and it segfaults on startup on my machine?!?!!
<Dark_Shikari> In fact I've found this everywhere now, anywhere I use a static array it crashes.
<pengvado> x264 must have been added to the gcc testsuite, so they have a pristine copy to compare against and know when to crash

<Dark_Shikari> did you hear they're adding the ability to inline function pointers in gcc?
<pengvado> of course. because "interesting" optimizations are preferred over optimizations that make programs fast

<Dark_Shikari> is there a measurable cost, in general to always-correctly-predicted branches in inner loops?
<pengvado> depends which side of the loop is inlined
<pengvado> iirc k8 has a minimum cost of 2 cycles for a jump of any kind but not for a non-taken branch
<Dark_Shikari> so in profiled mode, it'll inline the non-lossless. in non-profiled, how do you know which side it will?
<pengvado> flip a coin?

<pengvado> as you noticed, gcc doesn't store array elements in registers
<Dark_Shikari> even extremely simple arrays, like x[2]?
<pengvado> oh, gcc does, but only when it doesn't help
<Dark_Shikari> Is that like Murphy's Law of gcc--it only does useful things when they aren't useful?
<pengvado> e.g. struct mv { int16_t x[2]; } does store x[2] in registers, thus preventing write combining

<Dark_Shikari> is there any good reason why the makefile has -O4 in it?
<pengvado> because 4 is bigger than 3, duh

<Dark_Shikari> it's much slower on gcc 3.4
<Dark_Shikari> 100 -> 130 cycles
<pengvado> anything obvious in the asm?
<pengvado> because all this is really just rerolling the gcc random code generator

<Dark_Shikari> What happened there is exactly equivalent to the following:
<Dark_Shikari> last_nonb = i; i--; cur_nonb = i;
<Dark_Shikari> assert( last_nonb != cur_nonb );
<Dark_Shikari> That assert failed. This is, of course, completely impossible.
<pengvado> We're talking gcc here. It does the impossible every morning before breakfast.
