Hi,
While measuring the performance hit of my patches to TEBC, I found a
few specific benchmarks exhibiting a very strong dependency on the
relative addresses of functions in libtcl.so.
One of these outliers is in string.bench and is named
    STR match, recurse2 (fail)
and basically amounts to
    string match *a*z*cba* $longString
(there is much backtracking here because the first three stars can
match in many ways, but "cba" is nowhere in the string).
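To make the backtracking concrete, here is a minimal sketch of a recursive glob matcher that counts how often it re-enters itself. It is not Tcl's actual matcher: glob_match and its calls counter are hypothetical names, and only '*' and literal characters are handled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified matcher (NOT the Tcl implementation):
 * supports only '*' and literal characters. *calls is incremented on
 * every invocation so the amount of backtracking can be observed.
 */
int glob_match(const char *pat, const char *str, unsigned long *calls)
{
    assert(pat != NULL && str != NULL && calls != NULL);
    ++*calls;

    for (; *pat != '\0'; ++pat, ++str) {
        if (*pat == '*') {
            /* Collapse a run of stars into one. */
            while (*pat == '*') {
                ++pat;
            }
            if (*pat == '\0') {
                return 1;           /* trailing star matches the rest */
            }
            /* Try every position where the next literal could start. */
            for (; *str != '\0'; ++str) {
                if (*str == *pat && glob_match(pat, str, calls)) {
                    return 1;
                }
            }
            return 0;               /* every alternative failed */
        }
        if (*str != *pat) {
            return 0;               /* also covers running out of input */
        }
    }
    return *str == '\0';
}
```

With a pattern like *a*z*cba* and a string that never contains "cba", each star retries every candidate position before giving up, so the call count grows quickly with the number of 'a', 'z' and 'c' characters in the string.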
What's striking with this test is that, starting from 8.6 HEAD, you
can slow it down by nearly TWENTY PERCENT, by adding a single empty
and unused (non-static) function to the end of tclNamesp.c:
void foo(void){}
The timing for 1000 iterations goes from 11 to 13 seconds with this
change, reliably.
Looking at the 'nm -n' output, the only visible change is that all
subsequent functions are shifted by a mere 16 bytes (x86, gcc). How can
this have such an effect?
(a) the displacement of some relative jumps/calls crossing the boundary
grows by 16 bytes, pushing some of them beyond the maximum relative
offset.
(b) the shift makes some heavily used code slightly exceed the
instruction cache span.
(c) something else?
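One candidate for "something else" is plain alignment: the same hot code can span a different number of cache lines once it moves by 16 bytes. The sketch below only illustrates that arithmetic. The 64-byte line size and the 60-byte code span are assumptions picked for the example, not measurements from libtcl.so, and lines_touched is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of cache lines covered by the byte range
 * [addr, addr + len). line_size is an assumption supplied by the caller
 * (64 bytes is common on x86, but check the actual CPU).
 */
size_t lines_touched(uintptr_t addr, size_t len, size_t line_size)
{
    assert(len > 0 && line_size > 0);

    uintptr_t first = addr / line_size;             /* line of first byte */
    uintptr_t last  = (addr + len - 1) / line_size; /* line of last byte  */
    return (size_t)(last - first + 1);
}
```

For example, lines_touched(0, 60, 64) covers one line, while the same 60 bytes moved up by 16, lines_touched(16, 60, 64), straddle two. Whether this is what actually happens in the benchmark remains to be checked against the real addresses from 'nm -n'.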
Once the explanation is found, the next question will of course be: how
do we fight such spurious effects? Otherwise, speed-conscious changes
will become as easy as keeping coherence in quantum physics...
-Alex
_______________________________________________
Tcl-Core mailing list
Tcl-...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tcl-core