Updating a Dataset Always Takes Longer Than Expected

I'm still working on updating my data. As usual, what I thought would be a "simple" three- to four-day job has stretched out to almost two weeks of work. At least I'm not this person.

Happy 4th of July

We just got back from a trip to Wal-Mart (what could be more American?) to buy some clothes for the Unknown Daughter (she's grown enough that her old bathing suit no longer fits). The Unknown Wife and Unknown Daughter tried on clothes, while I wheeled the Unknown Baby Boy around the store until he went to sleep. Not surprisingly, the large-screen flat-panel TVs did the trick (based on initial indications, he's definitely a boy-child to the core).

Now we're getting ready to grill some critters, followed by fireworks. In the meantime, here are some links. They're from a previous year's post, but they're worth repeating (after all, at Financial Rounds, we're all about the efficiency thing):
The Declaration of Independence - most people have never read it through. So take a few minutes and do so before going about your day.

Our Sacred Honor - a piece that recounts what happened to the signers of the Declaration

The Pledge of Allegiance - 'nuff said.

The Star Spangled Banner - the words to our national anthem and a video of Whitney Houston singing it.
Now go grill some meat, light some fireworks, and have a happy 4th of July.

A Quarter Million Hits

Shortly after 2 today, Financial Rounds had its 250,000th visitor. Man - I never imagined there were so many people out there with Internet connections and this much free time on their hands.

In all seriousness, thanks for reading. I'm humbled.

Profile of Paul Wilmott

Newsweek recently ran a profile of Paul Wilmott, one of the biggest names in the "quant" world. Here are the opening paragraphs:
Imagine an aeronautics engineer designing a state-of-the-art jumbo jet. In order for it to fly, the engineer has to rely on the same aerodynamics equation devised by physicists 150 years ago, which is based on Newton's second law of motion: force equals mass times acceleration. Problem is, the engineer can't reconcile his elegant design with the equation. The plane has too much mass and not enough force. But rather than tweak the design to fit the equation, imagine if the engineer does the opposite, and tweaks the equation to fit the design. The plane still looks awesome, and on paper, it flies. The engineer gets paid, the plane gets built, and soon thousands just like it are packed full of people and sent out onto runways. They fly for a while, but eventually, because of that fatal tweak, they all end up crashing.

In a way, this is what's happened in quantitative finance. The planes are the complex derivatives—like collateralized debt obligations—that now lie smoldering on the balance sheets of banks. The engineers are the "quants": those math and science Ph.D.s who flocked to Wall Street over the past decade and used mathematical models to build these new investment products. These are the people Warren Buffett was talking about when he said, "Beware of geeks bearing formulas" in his letter to shareholders this year. The quants aren't entirely to blame for the financial meltdown; there's plenty of guilt to be shared by regulators, top executives and the investors who bought the instruments the quants created. Yet while aeronautical engineers who willfully designed a faulty plane might be on trial for criminal negligence, Wall Street's math gurus are, for the most part, still employed. Strangely, the banks need quants more than ever right now. If anyone's going to figure out how to price these toxic assets, it's them. Quantitative finance isn't going away, but it is in desperate need of reform. And one man—a math geek himself—thinks he knows where to start.

Paul Wilmott is a 49-year-old Oxford-trained mathematician and arguably the most influential quant today, the brightest star in their insular, nerdy universe. The Financial Times calls him a "cult derivatives lecturer."

Read the whole thing here. It's long, but well worth it.

Arrrrggghh! SAS is Evil!

Just another day (or two) of torturing data. As I mentioned a couple of days ago, a week back I decided to update a data set to include the last year or so of data (the data sources I use were recently updated). Like most "simple" jobs, it's turned out to be much more of a hairball than I expected. Although the program I used was fairly simple to rewrite, I realized that I had to update not one, not two, but THREE datasets in order to bring everything up to the present.

Caution: SAS Geekspeak ahead

One of the data sets is pretty large (it was about 70 gigabytes, but with the updates and indexing I've done, it's almost 100 gig). So, adding the new data and checking it took quite a while (no matter how efficiently you code things, SAS simply takes a long time to read a 70 gigabyte file). I thought I had everything done except for the final step. Unfortunately, the program kept crashing due to "insufficient resources."

For the uninitiated, when manipulating data (sorting, intermediate steps on SQL select statements, etc.) SAS sets up temporary ("scratch") files. They're supposed to be released when SAS terminates, but unfortunately, my system wasn't doing that. So, I had over 180 gigabytes of temporary files clogging up my hard drive. This meant that there wasn't enough disk space on my 250 gigabyte drive for SAS to manipulate the large files I'm using.
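For anyone who wants to check for the same problem before an eight-hour run, here's a minimal sketch (in Python rather than SAS) that adds up the size of leftover scratch folders. The folder-name prefixes below are an assumption about how the work folders are named on a given install, and the folder you point it at is whatever your own SAS work location happens to be; check both before trusting the totals. It only reports sizes and deletes nothing.

```python
import os


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def leftover_scratch(work_root, prefixes=("_TD", "SAS_work")):
    """Map each scratch-looking subfolder of work_root to its size in bytes.

    The default prefixes are assumptions, not a guarantee of how any
    particular SAS installation names its temporary folders.
    """
    prefixes = tuple(prefixes)
    found = {}
    for entry in os.scandir(work_root):
        if entry.is_dir() and entry.name.startswith(prefixes):
            found[entry.name] = dir_size_bytes(entry.path)
    return found
```

Calling `leftover_scratch` on the work location and summing the values gives the total space tied up in old scratch folders, which is the number I wish I had looked at before the first crash.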

Of course, I only realized this when my program crashed AFTER EIGHT HOURS OF RUNNING! TWICE!

I've now manually deleted all the temporary files, and I'm running the program overnight to see if this fixes the problem.

Ah well - if it was easy, anyone could do it.

update (next morning): Phew! It ran - it seems the unreleased temporary files were the issue. On to the next problem.

A Simple (and Impressive) New Three Factor Return Model

First, a little background on "factor models": The CAPM model for estimating expected returns is the oldest and most widely known of all finance models. In it, exposure to systematic risk (i.e. beta) is the only factor that gets "priced" (i.e. that's related to expected returns).

In 1993, Fama and French showed that a three factor model (the CAPM market factor plus a size factor and a value/growth factor) did a much better job of explaining cross-sectional returns when compared to the "plain vanilla" CAPM.
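To make the "priced factor" idea concrete, here's a rough sketch of how a factor model turns exposures into an expected excess return: multiply each beta by that factor's premium and add them up. The CAPM is just the one-factor case. The function name, factor labels, and every number below are made up for illustration; none of them are estimates from Fama and French or anyone else.

```python
def expected_excess_return(betas, premia):
    """Sum of beta * premium over the factors listed in betas.

    Both arguments are dicts keyed by factor name; premia are in
    percent per year, so the result is too.
    """
    return sum(betas[name] * premia[name] for name in betas)


# Illustrative inputs only (hypothetical, percent per year).
premia = {"market": 6.0, "size": 2.0, "value": 4.0}

# CAPM: only the market beta is priced.
capm_betas = {"market": 1.1}

# Three-factor model: the same stock also loads on size and value.
three_factor_betas = {"market": 1.1, "size": 0.5, "value": -0.2}

capm_estimate = expected_excess_return(capm_betas, premia)
three_factor_estimate = expected_excess_return(three_factor_betas, premia)

print(round(capm_estimate, 2), round(three_factor_estimate, 2))
```

With these made-up inputs, the two models disagree only because the stock has nonzero size and value loadings, which is the whole point of adding factors.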

Since the FF model became popular, a number of studies have come out that identify other factors that seem to be associated with subsequent returns, such as momentum (Jegadeesh and Titman, 1993), distress (Campbell, Hilscher, and Szilagyi, 2008), stock issues (Fama and French, 2008) and asset growth (Cooper, Gulen, and Schill, 2008).

Now, on to the meat of this post - another factor model. This one is based on q-theory (i.e. on the marginal productivity of a firm's investments). Long Chen and Lu Zhang (from Washington University and Michigan, respectively) recently published a paper, "A Better Three-Factor Model That Explains More Anomalies", in the Journal of Finance. They propose a three-factor model, with the three factors being the aggregate returns on the market, the firm's asset-scaled investments, and the firm's return on assets. Their model significantly outperforms the Fama-French (FF) model in explaining stock returns, does a better job (relative to FF) at explaining the size, momentum, and financial distress effects (i.e. you don't need to add additional factors for these effects), and does about as well as FF in capturing the Value (i.e. Book/Market) effect. Here's a taste of their results:
  • The average return to the investment factor (i.e. the difference between the low and high investment firms) is 0.43% per month over the 1972-2006 sample period. When measured only among small firms, the return difference between low and high investment firms is about 26% annually.
  • The average return to the ROA factor (the difference between returns to the firms with the lowest and highest ROA) is 0.96% per month over the sample period (with a high/low spread of about 26% for the smallest firms).
  • The differences in high vs. low portfolios persist (albeit in smaller magnitudes) after controlling for Fama-French and momentum factors.
It's definitely worth a read (in fact, it'll be on the reading list for my student-managed fund class). You can find an ungated version of the paper on SSRN here.
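If you want to put the monthly factor returns quoted above on an annual footing, a quick back-of-the-envelope sketch follows. It's plain compounding arithmetic, not a calculation from the paper, and the function name is my own.

```python
def annualize_monthly(pct_per_month):
    """Compound a monthly percentage return over 12 months."""
    return ((1 + pct_per_month / 100) ** 12 - 1) * 100


# Monthly factor returns quoted in the bullets above (percent per month).
investment_factor = annualize_monthly(0.43)
roa_factor = annualize_monthly(0.96)

print(f"investment factor: {investment_factor:.1f}% per year")
print(f"ROA factor: {roa_factor:.1f}% per year")
```

Compounding matters only a little at these magnitudes: simply multiplying by 12 gives nearly the same answer.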

HT: CXO Advisory Group

What's going on in the Unknown Family

Here's a short update on goings-on in the Unknown Household:
  • The Unknown Daughter is now done with school for the summer. I get a feeling that she'll be doing a lot of social activities this summer - since Friday (the last day of school), she's already had one sleepover (at a cousin's house), and has her second one (at our house) this Wednesday. In addition, she's just finished Neil Gaiman's The Graveyard Book. She gets a real kick out of describing the plot to others - it's an unusual tale where the humans are largely the evil characters, and the supernatural ones (ghosts, vampires, werewolves, etc.) are the good guys. Next on the agenda is Madeleine L'Engle's A Wrinkle in Time (she just finished the 6th Harry Potter book, and is looking for some variety before tackling the 7th). All in all, not bad for someone who just finished 2nd grade.
  • The Unknown Baby is now starting to make sounds - a LOT of sounds. So, he's become the star of any family gathering (he was anyway, but now that he's "talking" and laughing, we pretty much get a complete pass on having to take care of him at any event with others around -- our rule is that to hold him, you have to do the associated diaper change). More importantly, he's now sleeping more than 6 hours at a pop at night. So, we give him his last feeding between 11 and 12, and don't hear from him until about 6.
  • We've been doing a lot of social things - my sister-in-law and brother run a golf tournament to raise funds for cancer research (they lost their son to neuroblastoma a couple of years ago). While we missed the tournament this year, we did make it up for the kids' mini-golf event this past weekend. In addition, the Unknown Wife is hosting her women's bible study group this morning, so the house will shortly be filled with about ten women. Finally, we'll be traveling to West Virginia shortly for a family reunion over the 4th of July weekend.
I continue to work on various research projects. I am finishing up one paper, and thought I only had a bit more to do. Then I decided to update our sample to include the most recent year of data. Unfortunately, as I worked through the program I'd written, I realized that I would also have to update two other datasets that the paper uses. One week, many lines of SAS code, and much profanity later, I'm almost done with the "simple" job that I thought would take only a day or two. So hang on, my coauthors - I'm almost done. Once this part is done, I move on to another piece, which is also in the final stages - once I finish my small part on this paper, we can put all the work we've already done since the initial version together and do the rewrite. My guess is that it'll be at the submittable stage before long. And then, once a third coauthor sends me the data set I need, I can do a bit of analysis on a third paper, which is also nearing completion.

As for the 500-pound gorilla in the room - we're coping fairly well with the whole grieving process. Of course, it's different for the Unknown Wife and me (hey - the process is different for any two people). For me, it's toughest at night (in the time just before sleep) - for the majority of the day, I don't think about it much, except for the odd moments when I see/hear something that reminds me of the little guy. For her, it occurs more frequently, mostly because she's at home most of the day, where she sees more reminders. We each have different "therapies" - for me, I think about Jonathan on bike rides (almost 17 miles yesterday), and for the Unknown Wife, it involves talking about him with friends and family.

But the sharpest part of the grief has largely passed. For now, it only hits us on occasion, and even then not too badly. It's a lot easier knowing that (1) this is not the end of Jonathan's life (but only the end of this part of his life), and (2) he's not only pain-free, but extremely happy now.

But the human psyche has an amazing capacity for dealing with pain and grief - we're an amazing species that way.

As I said, the Unknown Wife is hosting her ladies group shortly -- as I write, three women have already arrived. So, before I succumb to estrogen overdose (and start watching the Lifetime channel), I'd better bolt.