
Updating a Dataset Always Takes Longer Than Expected

Happy 4th of July
Now we're getting ready to grill some critters, followed by fireworks. In the meantime, here are some links. They're from a previous year's post, but they're worth repeating (after all, at Financial Rounds, we're all about the efficiency thing):
The Declaration of Independence - most people have never read it through, so take a few minutes and do so before going about your day.
Our Sacred Honor - a piece that recounts what happened to the signers of the Declaration.
The Pledge of Allegiance - 'nuff said.
The Star Spangled Banner - the words to our national anthem and a video of Whitney Houston singing it.
Now go grill some meat, light some fireworks, and have a happy 4th of July.
A Quarter Million Hits
In all seriousness, thanks for reading. I'm humbled.
Profile of Paul Wilmott
Imagine an aeronautics engineer designing a state-of-the-art jumbo jet. In order for it to fly, the engineer has to rely on the same aerodynamics equation devised by physicists 150 years ago, which is based on Newton's second law of motion: force equals mass times acceleration. Problem is, the engineer can't reconcile his elegant design with the equation. The plane has too much mass and not enough force. But rather than tweak the design to fit the equation, imagine if the engineer does the opposite, and tweaks the equation to fit the design. The plane still looks awesome, and on paper, it flies. The engineer gets paid, the plane gets built, and soon thousands just like it are packed full of people and sent out onto runways. They fly for a while, but eventually, because of that fatal tweak, they all end up crashing.
In a way, this is what's happened in quantitative finance. The planes are the complex derivatives, like collateralized debt obligations, that now lie smoldering on the balance sheets of banks. The engineers are the "quants": those math and science Ph.D.s who flocked to Wall Street over the past decade and used mathematical models to build these new investment products. These are the people Warren Buffett was talking about when he said, "Beware of geeks bearing formulas" in his letter to shareholders this year. The quants aren't entirely to blame for the financial meltdown; there's plenty of guilt to be shared by regulators, top executives and the investors who bought the instruments the quants created. Yet while aeronautical engineers who willfully designed a faulty plane might be on trial for criminal negligence, Wall Street's math gurus are, for the most part, still employed. Strangely, the banks need quants more than ever right now. If anyone's going to figure out how to price these toxic assets, it's them. Quantitative finance isn't going away, but it is in desperate need of reform.
And one man, a math geek himself, thinks he knows where to start.
Paul Wilmott is a 49-year-old Oxford-trained mathematician and arguably the most influential quant today, the brightest star in their insular, nerdy universe. The Financial Times calls him a "cult derivatives lecturer."
Read the whole thing here. It's long, but well worth it.
Arrrrggghh! SAS is Evil!
Caution: SAS Geekspeak ahead
One of the data sets is pretty large (it was about 70 gigabytes, but with the updates and indexing I've done, it's almost 100 gig). So, adding the new data and checking it took quite a while (no matter how efficiently you code things, SAS simply takes a long time to read a 70 gigabyte file). I thought I had everything done except for the final step. Unfortunately, the program kept crashing due to "insufficient resources."
For the uninitiated, when manipulating data (sorting, intermediate steps on SQL select statements, etc.) SAS sets up temporary ("scratch") files. They're supposed to be released when SAS terminates, but unfortunately, my system wasn't doing that. So, I had over 180 gigabytes of temporary files clogging up my hard drive. This meant that there wasn't enough disk space left on my 250 gigabyte drive for SAS to manipulate the large files I'm using.
Of course, I only realized this when my program crashed AFTER EIGHT HOURS OF RUNNING! TWICE!
I've now manually deleted all the temporary files, and I'm running the program overnight to see if this fixes the problem.
Ah well - if it was easy, anyone could do it.
update (next morning): Phew! It ran - it seems the unreleased temporary files were the issue. On to the next problem.
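For anyone who'd rather not find this out eight hours into a run, here's a minimal sketch (in Python, not SAS) of a pre-flight check that reports free disk space and lists leftover scratch directories by size. The directory-name patterns (`_TD*` and `SAS_work*`) and the function names are my assumptions for illustration, not anything SAS guarantees, so check what your own scratch location actually contains before relying on them.

```python
import shutil
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Total size of all regular files under path (symlinks are skipped)."""
    total = 0
    for p in path.rglob("*"):
        try:
            if p.is_file() and not p.is_symlink():
                total += p.stat().st_size
        except OSError:
            # A file may vanish or be locked while we scan; ignore it.
            pass
    return total


def leftover_scratch_dirs(scratch_root: Path, patterns=("_TD*", "SAS_work*")):
    """Return (directory, size_in_bytes) pairs, largest first.

    The default patterns are assumptions about how scratch directories
    are named; adjust them to match your installation.
    """
    found = []
    for pattern in patterns:
        for d in scratch_root.glob(pattern):
            if d.is_dir():
                found.append((d, dir_size_bytes(d)))
    return sorted(found, key=lambda item: item[1], reverse=True)


def report(scratch_root: Path) -> None:
    """Print free space on the drive and the size of each leftover directory."""
    usage = shutil.disk_usage(scratch_root)
    print(f"free: {usage.free / 1e9:.1f} GB of {usage.total / 1e9:.1f} GB")
    for d, size in leftover_scratch_dirs(scratch_root):
        print(f"{size / 1e9:8.2f} GB  {d}")
```

Calling `report(Path(...))` on your scratch location only lists what's there; it deliberately doesn't delete anything. Deleting is safest done by hand, and only when no SAS session is running, since a live session's scratch files look just like the orphaned ones.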
A Simple (and Impressive) New Three Factor Return Model
In 1993, Fama and French showed that a three-factor model (the CAPM market factor plus a size factor and a value/growth factor) did a much better job of explaining cross-sectional returns when compared to the "plain vanilla" CAPM.
Since the FF model became popular, a number of studies have come out that identify other factors that seem to be associated with subsequent returns, such as momentum (Jegadeesh and Titman, 1993), distress (Campbell, Hilscher, and Szilagyi, 2008), stock issues (Fama and French, 2008) and asset growth (Cooper, Gulen, and Schill, 2008).
Now, on to the meat of this post - another factor model. This one is based on q-theory (i.e. on the marginal productivity of a firm's investments). Long Chen and Lu Zhang (from Washington University and Michigan, respectively) recently published a paper, "A Better Three-Factor Model That Explains More Anomalies", in the Journal of Finance. They propose a three-factor model, with the three factors being the aggregate returns on the market, the firm's asset-scaled investments, and the firm's return on assets. Their model significantly outperforms the Fama-French (FF) model in explaining stock returns, does a better job (relative to FF) at explaining the size, momentum, and financial distress effects (i.e. you don't need to add additional factors for these effects), and does about as well as FF in capturing the Value (i.e. Book/Market) effect. Here's a taste of their results:
- The average return to the investment factor (i.e. the difference between the returns to low and high investment firms) is 0.43% per month over the 1972-2006 sample period. When measured only among small firms, the return difference between low and high investment firms is about 26% annually.
- The average return to the ROA factor (the difference between returns to the firms with the lowest and highest ROA) is 0.96% per month over the sample period (with a high/low spread of about 26% for the smallest firms).
- The differences in high vs. low portfolios persist (albeit in smaller magnitudes) after controlling for Fama-French and momentum factors.
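Since the bullets above mix monthly figures (0.43% and 0.96%) with annual ones (about 26%), it can help to put the monthly premiums on an annual footing. Here's a minimal sketch of that arithmetic; the function name is mine, and whether to compound or simply multiply by 12 is an assumption on my part, since the post doesn't say which convention the paper uses.

```python
def annualize_monthly(monthly_pct: float, compound: bool = True) -> float:
    """Convert a monthly percentage return to an annual percentage return.

    compound=True  -> (1 + m)^12 - 1   (geometric compounding)
    compound=False -> 12 * m           (simple scaling)
    Which convention the paper uses is an assumption, not stated in the post.
    """
    m = monthly_pct / 100.0
    annual = (1.0 + m) ** 12 - 1.0 if compound else 12.0 * m
    return annual * 100.0


# Monthly factor premiums quoted in the bullets above.
factors = {"investment factor": 0.43, "ROA factor": 0.96}

for name, monthly in factors.items():
    compounded = annualize_monthly(monthly)
    simple = annualize_monthly(monthly, compound=False)
    print(f"{name}: {monthly:.2f}%/month -> "
          f"{compounded:.2f}%/year compounded, {simple:.2f}%/year simple")
```

Either way, the full-sample premiums come out to a bit over 5% a year for the investment factor and roughly 12% a year for the ROA factor, which puts the 26% spreads among the smallest firms in perspective.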
HT: CXO Advisory Group
What's going on in the Unknown Family
- The Unknown Daughter is now done with school for the summer. I get a feeling that she'll be doing a lot of social activities this summer - since Friday (the last day of school), she's already had one sleepover (at a cousin's house), and has her second one (at our house) this Wednesday. In addition, she's just finished Neil Gaiman's The Graveyard Book. She gets a real kick out of describing the plot to others - it's an unusual tale where the humans are largely the evil characters, and the supernatural ones (ghosts, vampires, werewolves, etc.) are the good guys. Next on the agenda is Madeleine L'Engle's A Wrinkle in Time (she just finished the 6th Harry Potter book, and is looking for some variety before tackling the 7th). All in all, not bad for someone who just finished 2nd grade.
- The Unknown Baby is now starting to make sounds - a LOT of sounds. So, he's become the star of any family gathering (he was anyway, but now that he's "talking" and laughing, we pretty much get a complete pass on having to take care of him at any event with others around -- our rule is that to hold him, you have to do the associated diaper change). More importantly, he's now sleeping more than 6 hours at a pop at night. So, we give him his last feeding between 11 and 12, and don't hear from him until about 6.
- We've been doing a lot of social things - my sister-in-law and brother run a golf tournament to raise funds for cancer research (they lost their son to Neuroblastoma a couple of years ago). While we missed the tournament this year, we did make it up for the kids' mini-golf event this past weekend. In addition, the Unknown Wife is hosting her women's bible study group this morning, so the house will shortly be filled with about ten women. Finally, we'll be traveling to West Virginia shortly for a family reunion over the 4th of July weekend.
As for the 500 pound gorilla in the room - we're coping fairly well with the whole grieving process. Of course, it's different for the Unknown Wife and me (hey - the process is different for any two people). For me, it's toughest at night (in the time just before sleep) - for the majority of the day, I don't think about it much, except for the odd moments when I see/hear something that reminds me of the little guy. For her, it occurs more frequently, mostly because she's at home most of the day, where she sees more reminders. We each have different "therapies" - for me, I think about Jonathan on bike rides (almost 17 miles yesterday), and for the Unknown Wife, it involves talking about him with friends and family.
But, the sharpest part of the grief has largely passed. For now, it only hits us on occasion, and even then not too badly. It's a lot easier knowing that (1) this is not the end of Jonathan's life (but only the end of this part of his life), and (2) he's not only pain-free, but extremely happy now.
But the human psyche has an amazing capacity for dealing with pain and grief - we're an amazing species that way.
As I said, the Unknown Wife is hosting her ladies' group shortly -- as I write, three women have already arrived. So, before I succumb to estrogen overdose (and start watching the Lifetime channel), I'd better bolt.