> To me, logic dictates that whatever information you have when you emit the
> instrumented code is information you can store and make accessible later. Thus,
> with some space overhead, Pin could provide information on demand with zero
> time overhead for the common case, and with an overhead that's proportional to
> the amount of information needed if information is in fact accessed.
At the point of the analysis call, we know where all the
application registers are. Some values will be in physical registers, and
others will be in memory. I believe your point is that we should leave
everything where it is and just record the locations so that we can find it
later if a tool asks for a register. The problem is that when we enter the
analysis function, it will save some incoming registers on the stack and
overwrite their values. For example, at the point of the analysis call, assume
the application EBP value is held in the actual EBP register. We enter the
analysis routine, and it saves EBP on the stack and overwrites the register.
Later, the analysis routine asks for the application EBP value. It is saved
somewhere on the stack, but the C compiler generates the analysis routine, not
Pin, so we don’t know that EBP was saved on the stack or where. You used C++
exceptions as an analogy, but in this case it applies directly, because
unwinding would let us restore EBP to its value at entry to the analysis
routine. But the effort required versus the payoff doesn’t look good to me.
The thing I described earlier was halfway between what we have
and what you are proposing: if something is already in memory, we leave it
where it is and remember its location. If something is in a register, we save
it to memory. Today we copy everything, including values that were already
saved, to a new location.
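
A rough sketch of the difference between the two schemes, written as a toy Python model rather than anything that exists in Pin. The `Loc` type, the slot names such as `ctx[...]` and `spill[0]`, and both function names are assumptions made up for illustration; the only thing it tries to show is how many stores each scheme needs at the analysis call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loc:
    kind: str   # "reg" (live in a physical register) or "mem" (already in memory)
    where: str  # register name or a symbolic memory slot (hypothetical names)


def copy_everything(app_state):
    """Scheme described as today's behavior: every value gets a fresh save slot."""
    saved = {reg: Loc("mem", f"ctx[{reg}]") for reg in app_state}
    return saved, len(app_state)  # one store per application register


def save_only_registers(app_state):
    """Halfway scheme: spill register-resident values, remember the rest."""
    saved, stores = {}, 0
    for reg, loc in app_state.items():
        if loc.kind == "reg":
            saved[reg] = Loc("mem", f"ctx[{reg}]")  # must be saved before the call
            stores += 1
        else:
            saved[reg] = loc  # already in memory: just record where it is
    return saved, stores


# Hypothetical state at the analysis call: two values spilled, two in registers.
app_state = {
    "eax": Loc("mem", "spill[0]"),
    "ebx": Loc("mem", "spill[1]"),
    "ebp": Loc("reg", "ebp"),
    "esi": Loc("reg", "esi"),
}

full_map, full_stores = copy_everything(app_state)
half_map, half_stores = save_only_registers(app_state)
```

In this made-up state the halfway scheme needs two stores instead of four, and every application register still has a known memory location afterwards, so nothing depends on what the compiler-generated analysis routine does with the physical registers.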
From: pinheads@yahoogroups.com [mailto:pinheads@yahoogroups.com] On Behalf Of Godmar Back
Sent: Friday, July 03, 2009 7:54 AM
To: pinheads@yahoogroups.com
Subject: Re: [pinheads] Re: DynamoRIO tool platform now open source under BSD license
On Fri, Jul 3, 2009 at 5:30 AM, Cownie, James H <james.h.cownie@...> wrote:
> If you really don’t know even dynamically what you need, then I
> don’t think there is a solution. That case seems to me to be the result
> of bad design of your interface to your clients. You’ve promised them
> something which is too expensive to deliver…
It's expensive only if you think within the current design constraints of Pin.
To me, logic dictates that whatever information you have when you emit the
instrumented code is information you can store and make accessible later. Thus,
with some space overhead, Pin could provide information on demand with zero
time overhead for the common case, and with an overhead that's proportional to
the amount of information needed if information is in fact accessed. (I
would tolerate some inaccuracy - for instance, I wouldn't insist that you
report the values of registers that are dead in the original code and which the
instrumented code thus doesn't maintain, etc.)
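
The on-demand idea in the paragraph above could be modeled roughly as follows. This is a toy Python sketch, not Pin's API: `LazyContext`, the table layout, and the dictionaries standing in for the register file and spill slots are all assumptions for illustration. It also sidesteps the question of registers that the analysis routine itself overwrites on entry.

```python
class LazyContext:
    """Resolves application registers on demand from an emit-time table."""

    def __init__(self, location_table, registers, memory):
        self._table = location_table  # register name -> ("reg" | "mem", key)
        self._registers = registers   # stand-in for the physical register file
        self._memory = memory         # stand-in for spill slots
        self.lookups = 0              # work done so far, counted for illustration

    def get(self, name):
        kind, key = self._table[name]
        self.lookups += 1
        source = self._registers if kind == "reg" else self._memory
        return source[key]


# Table recorded once, when the instrumented code for this call site is emitted.
call_site_table = {
    "eax": ("mem", "spill0"),
    "ebp": ("reg", "ebp"),
    "esi": ("reg", "esi"),
}
registers = {"ebp": 0xBFFF0010, "esi": 7}  # made-up values
memory = {"spill0": 42}

ctx = LazyContext(call_site_table, registers, memory)
idle_cost = ctx.lookups     # nothing requested yet, so no work has been done
eax_value = ctx.get("eax")  # one lookup for the one register that was asked for
```

The table is the space overhead; the run-time cost in this model is one lookup per register actually requested, and zero if the tool never asks.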
Robert is correct that there's a common vs. uncommon question when at least
some of the state is accessed most of the time, but clever design of data
structures might be able to ameliorate that. It's clear that what Pin currently
does - inlining dozens of instructions to pass the state to an outlined
function - is very expensive and led to the 3x slowdown. ("complex
inlinable functions" notwithstanding.)
On a related note, some tasks are extremely difficult, or even impossible, to
implement in Pin because you don't have access to the compiler: a simple
gen/kill analysis for each trace, for instance. And doing it dynamically at
runtime kills you for the reasons listed above. That at least should not be an
issue with DynamoRIO because it provides source for its instrumentation
framework.
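
For reference, the kind of gen/kill analysis meant here can be sketched in a few lines. The version below is a backward liveness pass in Python over a hypothetical straight-line trace, with each instruction reduced to the registers it uses (gen) and defines (kill). The function name, the trace encoding, and the example instructions are assumptions for illustration, and real traces with side exits would need more care.

```python
def live_before_each(trace, live_at_exit):
    """Backward gen/kill pass over a straight-line trace.

    trace: list of (uses, defs) pairs, one per instruction, in program order.
    Returns the set of registers live immediately before each instruction.
    """
    live = set(live_at_exit)
    result = [None] * len(trace)
    for i in range(len(trace) - 1, -1, -1):
        uses, defs = trace[i]
        # live_in = gen | (live_out - kill)
        live = (live - set(defs)) | set(uses)
        result[i] = frozenset(live)
    return result


# Hypothetical three-instruction trace:
#   mov ebx, eax    uses eax, defines ebx
#   add ebx, ecx    uses ebx and ecx, defines ebx
#   mov eax, 0      uses nothing, defines eax
trace = [
    ({"eax"}, {"ebx"}),
    ({"ebx", "ecx"}, {"ebx"}),
    (set(), {"eax"}),
]
live_sets = live_before_each(trace, live_at_exit={"eax", "ebx"})
```

In this example `eax` is dead just before the last instruction because that instruction overwrites it, which is the kind of register a tool would not need to have reported.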
- Godmar