One of the more difficult aspects of life is converting experience
into skill. Without this essential step, life is a series of
unconnected experiences to look back on, possibly with regret. Kids
often feel this way, and unfortunately, many adults do, too. I spend
a lot of time thinking of ways to create a loop between my experience
and skill. Often, there are very simple ways to close the loop.
Tests (a.k.a. benchmarks, performance measures, etc.) close the loop on
converting experience to skill. With a test, you have a way of
monitoring the system, and adapting it. Without a test, you might as
well be a monkey throwing darts at a dartboard. Monkeys don't have
the capacity to understand why one part of the dartboard is important
to aim at over another. More than likely, the monkey would rather
throw the darts at another monkey or even you, if it felt at all
threatened. A monkey's valuation system is based on primitive
principles. Humans have much more complex valuation systems.
Expected value is how we choose what to do. If the expected value is
high, we take the risk. If it's low, we don't. Our computation of
expected-value is, unfortunately, extremely complex. Our brain
applies lots of skills. However, there's a time-space trade-off: we
can't compute forever. There's often a time component, which
complicates both physical and human systems, i.e., we reach for the
gold before someone else does. There's also a risk in avoiding an
action, which often figures implicitly into our computation of the
expected value of taking it. If I don't buy that stock today, I'll
lose out on tomorrow's gain.
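The trade-off can be sketched in a few lines; the probabilities and payoffs below are made-up illustration numbers, not a real model:

```python
# Toy expected-value comparison: acting now vs. waiting.
# All probabilities and payoffs are invented for illustration.

def expected_value(outcomes):
    """outcomes: (probability, payoff) pairs whose probabilities sum to 1."""
    return sum(p * v for p, v in outcomes)

# Buying today: 60% chance of a $100 gain, 40% chance of a $50 loss.
ev_act = expected_value([(0.6, 100), (0.4, -50)])

# Waiting feels risk-free, but the chance the price runs away from us
# is the implicit cost of inaction.
ev_wait = expected_value([(0.3, 0), (0.7, -30)])

print(ev_act, ev_wait)  # acting wins on expected value here
```

The point isn't the numbers; it's that the cost of inaction has to appear somewhere in the sum, or the comparison is rigged toward doing nothing.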
Unit and acceptance tests are very useful. However, there's a cost to
running them. Sometimes I run the test suite in a window, and forget
that I've run it, because I go on to something else. I can't afford
to sit and watch the test suite run. Yes, there are all sorts of
technological solutions to this problem, so please don't send them to
me. Each has its own costs, and by my calculations the expected value
of the status quo is good enough. Spending more time on automated
notifications doesn't solve my context-switching problem, and the cost
of ignoring a test suite run before I check in is very low, since we
run the suite automatically every night anyway.
In general, though, running existing tests has a high expected-value.
I was just pointing out that even this seemingly simple expected-value
computation is complicated. Writing tests is even more complicated.
You'd like to have the tests written, yet there's this fear of
spending time where it has a low expected-value: my software doesn't
have bugs. OK, I know I make mistakes, but I'm probably afraid to
find out. I'm afraid that even if I write the test, I'll be testing
the obvious, not what's going to break. Etc.
Fear messes up many (most?) expected-value computations. When the
amygdala goes into high gear and the orbitofrontal cortex (OFC) can't
control it, we overvalue the risks and undervalue the rewards. That's
what often happens when we don't write or even run tests. The reason
XP is so nice is that it codifies a lot of experience into simple
rules, so our OFC doesn't have to fight our amygdala. It's the same
reason pilots are trained to trust their instruments, not their
senses.
Back to experience. One of the problems in XP is planning. It's not
easy to close the loop. Velocity doesn't cut it for me. Velocity is
not a valid measure, because every story is different. Indeed, with
XP, every story better be different, or we're repeating ourselves, and
that's a bad thing. One week I might estimate a story on implementing
a search engine in bOP. The next week I won't be implementing another
search engine; I'll have some other completely unrelated problem to
solve, e.g., upgrading a server. Yes, they may both involve
programming in Perl, but the unknowns are impossible to quantify. My
personal skill set doesn't include quantifying the unknown to any
useful degree of accuracy.
I've had a fair bit of experience trying to quantify the unknown. I
spent about 10 years writing financial forecasting and modeling
software. I have been unable to convert those experiences into the
skill of quantifying the unknown accurately. In the world of
financial modeling, we validate this skill with *ex post facto*
testing. There's no better test for a financial model than putting it
out in the wild and watching it run. Unfortunately, there's no more
costly test, either.
Despite this experience, I still play the game. My current strategy
is to use covered calls. From 1/1/04 to 1/1/06, I had a 10% return.
From 1/1/06 to 7/27/06, I had a -14% return. The return calculation I
use is called the Annualized Internal Rate of Return (AIRR). It's a
very complicated computation, and indeed, most accounting software
gets it wrong. There are many ways to calculate an investment's
performance, but the AIRR is the easiest to use.
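To see why the computation is tricky: the AIRR is the annualized rate that zeroes the net present value of every dated cash flow in and out of the portfolio, which has no closed form and must be found numerically. Here is a minimal toy solver (my own sketch, not bivio's implementation, with invented cash flows):

```python
from datetime import date

def airr(cash_flows):
    """Annualized internal rate of return for dated cash flows.

    cash_flows: list of (date, amount); negative = money invested,
    positive = money withdrawn (or the final portfolio value).
    Solves NPV(r) = sum(cf / (1 + r)**years) = 0 by bisection.
    """
    t0 = min(d for d, _ in cash_flows)

    def npv(r):
        return sum(cf / (1.0 + r) ** ((d - t0).days / 365.25)
                   for d, cf in cash_flows)

    lo, hi = -0.99, 10.0          # bracket assumed to contain the root
    for _ in range(200):          # bisection: robust, if slow
        mid = (lo + hi) / 2
        if npv(lo) * npv(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# $1,000 invested, worth $1,100 one year later: AIRR close to 10%.
flows = [(date(2004, 1, 1), -1000.0), (date(2005, 1, 1), 1100.0)]
print(round(airr(flows), 4))
```

Even this toy hides judgment calls (day-count convention, root bracketing, multiple roots for sign-changing flows), which is exactly where accounting software tends to go wrong.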
The AIRR is a great performance measure (test) for investors. You may
find AIRR computations in accounting programs, but at bivio.com,
you'll find one that allows you to compare your portfolio's AIRR to
another instrument's AIRR using exactly the same cash flows. In other
words, it allows you to do a "What If" calculation in the past. Or,
perhaps, you could call this the "coulda-woulda-shoulda calculation".
We call it the Performance Benchmark.
The 10% and -14% are not comparable. They had different cash flows
over different time spans. To know if 10% is good, you have to
compare it something that had the same cash flow over the same period.
The Performance Benchmark report tells me that if I had invested in
the Vanguard 500 Index Fund (VFINX), my AIRR would have been 8%.
If I had invested in the VFINX from 1/1/06 on, I would have had a 5%
AIRR, not -14%. Overall, I had a 2% AIRR over 2.5y. The VFINX had a
7% AIRR with the same cash flow.
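The "what if" mechanics are simple in outline. Suppose (hypothetical prices and deposits below) each deposit had bought benchmark shares at that day's price; valuing the shares at the end gives the alternative cash-flow stream whose AIRR you can compute the same way:

```python
# Hypothetical benchmark closing prices on the portfolio's cash-flow dates.
prices = {"2004-01-02": 111.5, "2005-06-01": 119.0, "2006-07-27": 126.8}

# The same deposits the real portfolio received, redirected to the benchmark.
deposits = [("2004-01-02", 10_000.0), ("2005-06-01", 5_000.0)]

# Each deposit buys shares at that day's price; value them at the end.
shares = sum(amount / prices[day] for day, amount in deposits)
final_value = shares * prices["2006-07-27"]
print(round(shares, 2), round(final_value, 2))
```

Feeding the deposits (as outflows) plus `final_value` (as an inflow) into any AIRR routine, e.g. a spreadsheet's XIRR, yields the benchmark's AIRR under identical cash flows, which is what makes the two numbers comparable.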
I've been reading a lot of Nigel Balchin lately. Here's a nice quote
from Mine Own Executioner:
"Well, you speak as an expert in these matters, Doctor Garsten.
But perhaps you'd agree that to the plain man who uses his common
sense this might appear to be a clear case of an error of
judgement."
Garsten leant forward slightly. "To the plain man who uses his
common sense," he said silkily, "everything is *always* clear --
after the event."
There's the rub, as they say. I didn't put my money in VFINX on
1/1/06; I pursued what I perceived as my successful covered-call
strategy in '06, and lost 14% annualized, or about 7% in absolute
terms, since the beginning of the year.
AIRR is only part of what you need to create a closed loop system for
your investing strategy. You also need to consider your time. Every
time I check my stocks, I'm spending time that could be spent
elsewhere. That's where expected-value figures in. If I had $1M, you
might guess the expected-value of managing my investments is probably
high. 10% of $1M is $100K. Unfortunately, it's not the absolute
return that matters but the delta from my benchmark, which isn't the
proverbial mattress. Rather, it's probably the S&P 500, which VFINX
and, more importantly, the exchange-traded fund SPY track.
The marginal return during my successful two years was 2%, that is,
10% minus 8%. 2% of $1M is about $20K/year. And, more practically
for this list, the marginal return of managing $100K vs. my investing
in SPY was $2K/year. That's a few days of programming at bivio. If I
had to choose a bumper sticker, it *wouldn't* be "I'd rather be
investing". As a side note, I've been preaching to "novice" investors
for years to buy SPY and not to pick stocks. It's pretty clear I'm
not drinking my own Kool-Aid, and I've paid a big price for it.
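The marginal-return arithmetic above, using this section's own figures, fits in a few lines:

```python
my_airr, benchmark_airr = 0.10, 0.08   # the successful two-year period

# The edge over the benchmark, not the absolute return, is what the
# time spent managing investments actually buys.
edge_1m = round((my_airr - benchmark_airr) * 1_000_000)
edge_100k = round((my_airr - benchmark_airr) * 100_000)
print(edge_1m, edge_100k)  # $20,000/year on $1M, $2,000/year on $100K
```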
Yet, I do like thinking about investing, because it's a closed-loop
system where the biggest factors are people. The stock market is very
much like planning a programming project. The buyers are customers,
and the sellers are programmers. The difference between a
programming project and the stock market is that the stock market is
a lot simpler to model.
to model. There have been numerous performance measures over the
years, and none of them has been adopted as a standard.
In XP, we use velocity. To me, velocity is like someone saying: I had
a 20% return on my investments last year. You don't have any idea
what computation they used, and it probably was more of a guess than
anything else. I've been with extremely experienced and high-level
traders who have claimed with a straight face that their *junior*
traders get 5% return a *month*. If this were true, their financial
institution would own all the world's money in just a few years. The
stock market is a zero sum game. These traders didn't have a clue.
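A quick sanity check on that claim: 5% a month compounds to nearly 80% a year, and to roughly 350x in a decade, a growth rate no institution has ever sustained:

```python
monthly = 0.05
annual = (1 + monthly) ** 12 - 1     # one year of monthly compounding
decade = (1 + monthly) ** 120        # ten years of monthly compounding
print(round(annual, 3), round(decade))
```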
The same thing is true about many programmers. If I believed every "I
programmed project X in N hours" statement, robots would be
programming everything by now, and we could spend all of our time
playing with the software they invented.
So what do we use at bivio instead of Velocity? First, let me note I
also wouldn't choose the bumper sticker "I'd rather be planning."
Indeed, I consider planning mostly a diplomatic exercise, and I'm not
very diplomatic, in case you hadn't noticed.
At bivio, we close the loop with Revenue, which is like AIRR, and
involves almost no planning. It's been the best way we have found to
keep programmers focused on producing value for the customer.
Programmers at bivio get paid when they do something of value for a
customer. We try to structure all projects based on fixed bids. If
you get your work done in less time, you make more per hour than
someone who's less efficient. Nothing closes the loop faster than
knowing your paycheck depends on your estimate.
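The incentive is easy to see with a hypothetical bid amount:

```python
fixed_bid = 5_000                      # hypothetical agreed project price
for hours in (80, 50, 40):
    print(hours, fixed_bid / hours)    # effective hourly rate
```

Finish in 40 hours instead of 80 and your effective rate doubles; overrun the estimate and it's your time, not the customer's money.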
In the financial industry, people are paid on their performance.
However, this is an asymmetric relationship. They usually get paid no
matter what, and if they do well, they get paid more. If you know a
money manager who has been putting his entire salary and savings in
his own fund for the last ten years, please let me know; I'll be happy
to invest my money with him. Warren Buffett falls into this category.
To this end, I trust my financial accounting to bivio software, and I
would trust any safety critical system that was built by a bivio team
with my life. We don't build safety-critical systems now, but if/when
we do, we'll have even more skill at closing the loop than we do now.
And, I'll have even more reason to trust the systems we build.