On Fri, Jul 3, 2009 at 7:54 AM, Godmar Back <godmar@...> wrote:
Robert is correct that there's a common vs. uncommon question when at least some of the state is accessed most of the time, but clever design of data structures might be able to ameliorate that. It's clear that what Pin currently does - inlining dozens of instructions to pass the state to an outlined function - is very expensive and led to the 3x slowdown. ("complex inlinable functions" notwithstanding.)
Typed too fast here: meant "is very expensive and led to the 3 orders of magnitude slowdown". We disassembled what Pin was emitting at the time.
Thinking about this some more: all you'd need to do is keep track of a map from "original reg" to "reg used in emitted code." Whenever the mapping changes, create a new map. There will be only so many distinct maps, so intern them. Then break your trace into sections, each section corresponding to one map, and store a simple array of pointers to the maps, one per section. Then all you need to pass to the instrumentation code is an index into that array. The cost is constant, no matter how much state is accessed, and registers can be read with 2-3 loads. You could use pseudo-intrinsics, such as _get_eax(), in the instrumentation code.
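
To make the idea concrete, here is a rough sketch in plain C of what I have in mind. None of this is Pin's actual API: RegMap, intern_map(), section_map, get_reg() and get_eax() are names I made up for illustration, I pretend there are only four registers, and "slots" stands in for wherever the emitted code keeps the application's register values at that point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register set, kept tiny for illustration. */
enum { REG_EAX, REG_EBX, REG_ECX, REG_EDX, NUM_REGS };
enum { MAX_MAPS = 16, MAX_SECTIONS = 64 };

/* One map: original register -> slot that holds its value in emitted code. */
typedef struct {
    uint8_t slot[NUM_REGS];
} RegMap;

/* Intern table: each distinct map is stored exactly once. */
static RegMap interned[MAX_MAPS];
static int num_interned;

/* Return the canonical copy of *m, adding it to the table if it is new. */
static const RegMap *intern_map(const RegMap *m)
{
    for (int i = 0; i < num_interned; i++) {
        if (memcmp(&interned[i], m, sizeof *m) == 0)
            return &interned[i];
    }
    assert(num_interned < MAX_MAPS);   /* sketch only: fixed capacity */
    interned[num_interned] = *m;
    return &interned[num_interned++];
}

/* Per-trace array: section index -> interned map for that section. */
static const RegMap *section_map[MAX_SECTIONS];

/*
 * Read an original register given only the section index.
 * Three dependent loads: the map pointer, the slot number, the value.
 */
static uint32_t get_reg(const uint32_t *slots, int section, int orig_reg)
{
    const RegMap *m = section_map[section];
    return slots[m->slot[orig_reg]];
}

/* Pseudo-intrinsic in the spirit of _get_eax(). */
static uint32_t get_eax(const uint32_t *slots, int section)
{
    return get_reg(slots, section, REG_EAX);
}
```

The instrumentation call then only needs the section index (plus a base pointer to the saved values, however that would be obtained), regardless of how many registers the analysis code ends up reading. The linear search in intern_map() only runs at instrumentation time, not at analysis time, so I didn't bother with a hash table here.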
- Godmar