
CRSP



Project 1. CRSP Mutual Fund Database: Solving the Mutual Fund Shareclass Problem

The CRSP Mutual Fund database stores data for each mutual fund shareclass separately, and each shareclass is assigned a unique identifier. Several shareclasses can correspond to the same underlying portfolio of securities. For instance, in the dataset provided below, "Columbia Acorn Fund" is a portfolio with four shareclasses distinguished by the letters "A", "B", "C", and "Z". For research purposes, one needs to work with portfolios, not shareclasses. In particular, the portfolio return has to be calculated as the average of its shareclass returns weighted by their Total Net Assets (TNA).
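The TNA weighting described above can be sketched in a few lines of Python. This is only an illustration, not the project's SAS code: the function name `portfolio_returns`, the tuple layout, the "ACORN" label, and all numbers are hypothetical, and the source does not say whether same-period or lagged TNA should serve as the weight, so the sketch simply uses whatever TNA value is supplied with each row.

```python
from collections import defaultdict

def portfolio_returns(rows):
    """Aggregate shareclass returns into TNA-weighted portfolio returns.

    rows: iterable of (portfolio_id, date, shareclass_return, tna).
    Rows with a missing return or a missing / non-positive TNA are skipped
    (an assumption of this sketch, not a rule taken from the source).
    """
    weighted_sum = defaultdict(float)
    total_tna = defaultdict(float)
    for port, date, ret, tna in rows:
        if ret is None or tna is None or tna <= 0:
            continue
        weighted_sum[(port, date)] += ret * tna
        total_tna[(port, date)] += tna
    return {key: weighted_sum[key] / total_tna[key] for key in weighted_sum}

# Made-up numbers for one portfolio with four shareclasses in one month.
rows = [
    ("ACORN", "2005-01", 0.010, 100.0),  # class A
    ("ACORN", "2005-01", 0.008, 50.0),   # class B
    ("ACORN", "2005-01", 0.008, 50.0),   # class C
    ("ACORN", "2005-01", 0.012, 800.0),  # class Z
]
result = portfolio_returns(rows)
```

With these inputs the largest shareclass dominates: the weighted return is (1.0 + 0.4 + 0.4 + 9.6) / 1000 = 0.0114, compared with a simple average of 0.0095.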

Unfortunately, the CRSP Mutual Fund database provides the necessary portfolio identifier variable only from 2003 onward, which is too short a period for a long historical study. The CRSP MFLINKS database does contain the required information, but its price is prohibitive even for educational institutions such as Purdue University. Accordingly, the goal of this project is to create an algorithm that generates a portfolio identifier from the available CRSP Mutual Fund database alone, saving a few thousand dollars.

The algorithm is implemented in SAS. Since the portfolio identifier variable, "port_code", is available for 2003-2007, the algorithm can be tested against it. On the test dataset of 29,471 shareclass-years it produces only 51 errors (shareclasses assigned to the wrong portfolio), an error rate of 0.17%. A subset of that dataset with 1,175 shareclass-years is posted below, and its error rate is zero.
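One possible way to score a generated identifier against "port_code" is sketched below in Python. The source does not state how the 51 errors were counted, so the majority-vote rule here is an assumption, and the names `count_misassigned`, `error_rate`, and the toy records are hypothetical. The last line only re-derives the reported rate from the two figures quoted above.

```python
from collections import Counter, defaultdict

def count_misassigned(records):
    """Count shareclass-years that disagree with their group's majority port_code.

    records: iterable of (generated_id, port_code) pairs, one per shareclass-year.
    Within each generated group, the most frequent port_code is treated as the
    correct portfolio; every other record in the group counts as one error.
    """
    by_group = defaultdict(Counter)
    for generated_id, port_code in records:
        by_group[generated_id][port_code] += 1
    errors = 0
    for counts in by_group.values():
        majority_size = counts.most_common(1)[0][1]
        errors += sum(counts.values()) - majority_size
    return errors

def error_rate(errors, total):
    """Fraction of shareclass-years assigned to a wrong portfolio."""
    return errors / total

# Toy data: group g1 contains one record whose port_code differs from the rest.
toy = [("g1", 10), ("g1", 10), ("g1", 10), ("g1", 11), ("g2", 12), ("g2", 12)]
toy_errors = count_misassigned(toy)

# Figures quoted in the text: 51 errors out of 29,471 shareclass-years.
reported = error_rate(51, 29471)
```

The reported figures give 51 / 29471 ≈ 0.00173, which matches the 0.17% stated above. Note that this rule only detects different portfolios merged into one generated group; a portfolio split across two generated groups would need a second check keyed on "port_code".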

The CRSP Mutual Fund database was re-engineered on April 21, 2008. The algorithm uses the old variable names, but it can easily be adapted to the new format. The correspondence between old and new names (old / new) is as follows: icdi / crsp_fundno, caldt / caldt, fund_name / fund_name, port_code / crsp_portno. More information is provided in the comments inside the SAS code below.
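For readers working outside SAS, the name correspondence can be captured as a lookup table. The Python sketch below is not part of the project's SAS code; the helper `rename_columns` and the sample record (including its values and the fund-name format) are hypothetical, and only the four name pairs come from the text above.

```python
# Old-format variable name -> new-format variable name (post April 21, 2008).
OLD_TO_NEW = {
    "icdi": "crsp_fundno",
    "caldt": "caldt",
    "fund_name": "fund_name",
    "port_code": "crsp_portno",
}

# Reverse lookup, for feeding new-format data to code written for old names.
NEW_TO_OLD = {new: old for old, new in OLD_TO_NEW.items()}

def rename_columns(record, mapping):
    """Return a copy of a record (dict) with keys renamed via mapping.

    Keys absent from the mapping are passed through unchanged.
    """
    return {mapping.get(key, key): value for key, value in record.items()}

# Hypothetical old-format record; the values are made up for illustration.
old_record = {
    "icdi": 1,
    "caldt": "2007-12-31",
    "fund_name": "Columbia Acorn Fund; Class A",
    "port_code": 100,
}
new_record = rename_columns(old_record, OLD_TO_NEW)
round_trip = rename_columns(new_record, NEW_TO_OLD)
```

Because the mapping is one-to-one, converting to the new names and back returns the original record, so the same table serves in either direction.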

SAS code

Sample dataset


