136
CRSP CRSPAccess97 Database Format - Programmers Guide For the CRSP NYSE, AMEX, Nasdaq Daily and Monthly Price and Total Return Database and CRSP US Stock, Treasury Indices and Portfolio Assignments Database Updated Monthly and Annually 1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business

CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSP

CRSPAccess97 Database Format - Programmers Guide

For the

CRSP NYSE, AMEX, Nasdaq Daily and Monthly Price andTotal Return Databaseand

CRSP US Stock, Treasury Indices and Portfolio AssignmentsDatabase

Updated Monthly and Annually1925-1998

Center for Research in Security PricesThe University of Chicago Graduate School of Business

Page 2: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97
Page 3: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSP.com

CRSPAccess97 Database Format - Programmers Guide

For the

CRSP NYSE, AMEX, Nasdaq Daily and Monthly Priceand Total Return Databaseand

CRSP US Stock, Treasury Indices and PortfolioAssignments DatabaseCovering over 22,000 Stocks, Updated Monthly and Annually1925-1998

Center for Research in Security PricesThe University of Chicago Graduate School of Business

Page 4: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPTM

The University of Chicago, Graduate School of Business725 S. Wells StreetSuite 800Chicago, IL 60607

Phone: 773.702.7467Fax: 773.702.3036

e-mail: [email protected]://www.crsp.com

Copyright © 1999Center for Research in Security PricesTM

University of ChicagoVersion CAP_98.01

Page 5: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

and the, herebyicense days of

SB, 725

licensedained by

RSP dataust be in

and theed into in one

usiness,aterials

of of the

rch in

sclaimsAND

'SS

and shallsity ofamages),r other

ify CRSP

CRSP Data License

CRSP DATA LICENSE

This is a legal agreement between you (either an individual or an entity) hereby referred to as the “user” University of Chicago, Graduate School of Business on behalf of the Center for Research in Security Pricesreferred to as “CRSP”. By opening this product, you are agreeing to be bound by the terms of the following LAgreement. If you do not wish to accept these terms, return the unused, unopened data to CRSP within 30receipt with any written materials to the Products and Services Department, CRSP, University of Chicago, GS. Wells Street, Suite 800, Chicago, IL 60607. Opening this data without a signed agreement binds the user torestrictions of use in CRSP’s standard subscription/contract terms for the product.

The accompanying media contains data that are the property of CRSP and its information providers and arefor use only, by you as the original licensee. Title to such media, data and documentation is expressly retCRSP.

Data and documentation are provided with restricted rights. CRSP grants the user the right to access the Cand the CRSP Product Database Guide(s) only for internal or academic research use. Usage of the data maccordance with the terms detailed in the Subscription Agreement, Contract or Agreement between CRSPuser. The data is “in use” on a computer when it is loaded into the temporary memory (i.e. RAM) or is installthe permanent memory (e.g. hard disk, CD ROM or any other storage device) of that computer or networklocation. Once the data has been updated, Subscribers must promptly return the previous release of data on itsoriginal medium to CRSP or destroy it.

COPYRIGHT NOTICE

The documents and data are copyrighted materials of The University of Chicago, Graduate School of BCenter for Research in Security Prices (CRSP) and its information providers. Reproduction or storage of mretrieved from these are subject to the U.S. Copyright Act of 1976, Title 17 U.S.C.

The Center for Research in Security Prices, CRSP, CRSP Total Return Index Series, CRSPAccess97, CRSP Cap-Based Portfolio Series, PERMNO, PERMCO and CRSPID have been registered for trademark and other formsproprietary rights. The Contents are owned or controlled by CRSP or the party credited as the providerContents.

PROPRIETARY RIGHTS

PERMNO, PERMCO and CRSPID are symbols representing data, which is proprietary to the Center for ReseaSecurity Prices.

DISCLAIMER

CRSP will endeavor to obtain information appearing on its Data Files from sources it considers reliable, but diany and all liability for the truth, accuracy or completeness of the information conveyed. THE UNIVERSITY CRSP MAKE NO WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, WITH RESPECT TO THEMERCHANTABILITY, FITNESS, CONDITION, USE OR APPROPRIATENESS FOR SUBSCRIBERPURPOSES OF THE DATA FILES AND DATA FURNISHED TO THE SUBSCRIBER UNDER THISUBSCRIPTION, OR ANY OTHER MATTER AND ALL SUCH DATA FILES WILL BE SUPPLIED ON AN "ASIS" BASIS. CRSP will endeavor to meet the projected dates for updates, but makes no guarantee thereof, not have any liability for delays, breakdowns or interruption of the subscription. In no event shall the UniverChicago be liable for any consequential damages (even if they have been advised of the possibility of such dfor damages arising out of third party suppliers terminating agreements to supply information to CRSP, or focauses beyond its reasonable control.

In the event that the Subscriber discovers an error in the Data Files, Subscriber's sole remedy shall be to notand CRSP will use its best efforts to correct same and deliver corrections with the next update.

v

Page 6: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSP SFA DATABASE FORMAT GUIDE

expressubject to

oprietaryay bewhich it

es andunder thists.

of this

se Guide

PERMISSION TO USE CRSP DATA

CRSP permits the use of its data in scholarly papers. CRSP data may also be used in client newsletters, marketing,and education materials, company reports, books, and other published materials. If you intend to use our data oranything derived from our data, like graphs, in your publications, we require that you receive written permission fromCRSP. Please call us for a Data Permission Request at 773.702.7467.

CITATION

The following citation must be used when displaying the results of any analysis that uses CRSP data.

Source: CRSP, Center for Research in Security Prices. Graduate School of Business, The University ofChicago [year]. Used with permission. All rights reserved. crsp.com

USE OF CRSP DATABASES

PROHIBITED

Employees of academic and commercial subscribers to CRSP databases have from time to time requested the use ofCRSP data for the individual’s outside consulting project. CRSP databases are to be used exclusively for thesubscriber’s own internal educational or business processes, as stated in the Statement of Use in the subscriptionagreement. The University does not permit any third party to use its databases. If you wish to use our data in yourconsulting projects, your client may either subscribe to a CRSP database or request a custom research project.

APPROVED

The CRSP data files are proprietary and should be used only for research purposes by the faculty, students, oremployees of the subscribing institution. The Subscription Agreement, signed by each subscribing institution states:

Subscriber acknowledges that the data files to which it is subscribing contain factual material selected, arranged andprocessed by CRSP and others through research applications and methods involving much time, study, and expense.

The Subscriber agrees that it will not transfer, sell, publish, or release in any way any of the data files or the datacontained therein to any individual or third party who is not an employee or student of the Subscriber, and that thedata provided to the Subscriber by CRSP is solely for the Subscriber’s use.

The Subscriber may not copy the data or documentation in any form onto any device or medium without thewritten consent of CRSP, except solely to create back-up copies of the CRSP data files for its internal use, sthe terms of this Agreement.

Subscriber agrees and warrants that it will take all necessary and appropriate steps to protect CRSP’s prrights and copyright in the data supplied (including, but not limited to, any and all specific steps which mexpressly required by CRSP), and that the Subscriber will protect the data in no less than the manner in would protect its own confidential or proprietary information.

The Subscriber will inform all users and potential users of CRSP data of CRSP’s proprietary rights in its fildata by giving each user a copy of this paragraph and any other specific requirements CRSP may mandate paragraph and by requiring each such user to comply with this paragraph and all such additional requiremen

The Subscriber agrees that its obligation under the Subscription Agreement shall survive the terminationAgreement for any reason.

DATABASE GUIDES

Additional copies of this database guide are available on the CRSP CD-ROM and on-line through the Databalink found at http://www.crsp.com

vi

Page 7: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

TABLE OF CONTENTS

CHAPTER 1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 About CRSP 1

CRSP Working Papers 1CRSP Board of Directors 2CRSP Historical Data Products 2Sample Data Sets 4

1.2 Overview of CRSPAccess97 Databases 51.3 Format and Programming Changes 6

CRSPAccess97 6Using Existing FORTRAN Programs with CRSPAccess97 71998 Programming Changes 8

1.4 Notational Conventions 9

CHAPTER 2. ACCESSING DATA IN FORTRAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.1 CRSPAccess97 FORTRAN Stock Data Structure 112.2 FORTRAN Stock Sample Programs and Subroutines 17

Sample Programs — USSAMP*.FOR 17Include Files — Parameters, Subscripts, Common Blocks, and Functions 19Stock Access Subprograms — Subroutines For Retrieving CRSP Data 20CRSP Access Subroutines for Nasdaq Data 26FORTRAN Access Subroutines for Indices Data 27Stock FORTRAN Utility Subprograms — Utility Functions and Subroutines 29

2.3 Suggestions For FORTRAN Programs Using CRSP Stock Data 352.4 FORTRAN Access on Supported Systems 37

Windows Systems 37Unix Systems 38OpenVMS Systems 40

CHAPTER 3. ACCESSING DATA IN C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413.1 CRSPAccess97 C Data Structures 41

Data Organization for C Programming 41Data Objects 41Set Structures and Usage 44C Language Data Objects for CRSP Stock Data 45C Language Data Structure for CRSP Stock Data 46C Language Data Objects for CRSP Indices 50C Language Data Structure for CRSP Indices Data 51

3.2 C Sample Programs 54C Header Files and Data Structures 55

3.3 CRSPAccess97 C Library 56Stock Access Functions 57Index Access Functions 64General Access Functions 68General Utility Functions 74Data Utility Functions 87

3.4 C Access on Supported Systems 112Windows Systems 112Unix Systems 114OpenVMS Systems 116

vii

Page 8: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

viii

Page 9: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

OVERVIEWABOUT THIS GUIDE

This book is intended for users who want to process CRSPAccess97 (CA97) US stock and indices data with C andFORTRAN programs. The data is part of the CRSP NYSE, AMEX, Nasdaq Daily and Monthly Price and TotalReturn Databases and the CRSP US Stock, Treasury Indices and Portfolio Assignments Database. They are updatedboth monthly and annually.

These Databases were developed at the University of Chicago, Graduate School of Business, Center for Research inSecurity Prices (CRSP). Professor Lawrence Fisher, currently at Rutgers University, originated the basic design ofthe Monthly Stock Database. The Databases are comprehensive, providing end of day and month-end data on morethan 22,000 Common Stocks traded on the NYSE, the AMEX and the Nasdaq Stock Market. Monthly NYSE databegins December 1925, and daily NYSE begins July 1962. AMEX daily and monthly data begins July 1962. Monthlyand daily Nasdaq data begins December 1972.

CRSP will concentrate development on enhancements to subroutines, utilities, and data for our CA97 DatabaseFormat only. We will continue to support the SFA Database format through the binary and character conversionprograms described in the CRSP SFA Database Format Guide.

INSIDE

Chapter One: Introduction contains a brief description of CRSP, our products, an overview of the stock and indicesdata, and an overview of the CRSPAccess97 Database format and notational conventions.

Chapter Two: Accessing Data in FORTRAN contains instructions and examples for accessing the data in theCRSPAccess97 format using FORTRAN77.

Chapter Three: Accessing Data in C contains instructions and examples for use in accessing the data in theCRSPAccess97 format using C.

Index: provides an alphabetical reference to locate definitions for the sample programs, subroutines, and include files

Other Documentation

See the CRSPAccess97 Installation Guide for instructions on installing data, programs, and code libraries on theCRSPAccess97 CD ROMs.

See the CRSP Data Definitions and Coding Schemes Guide for the variable definitions, coding schemes, dataorganization, data derivations, and index methodologies.

See the CRSPAccess97 Database Format - Utilities Guide for a description of utilities available to retrieve datawithout programming.

See the CRSP SFA Database Format Guide for information on using the SFA database format. (The database formatdelivered prior to 1996.)

Contact us for Technical SupportTelephone: 773.834.1025e-mail: [email protected]

Page 10: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97
Page 11: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Introduction

CHAPTER ONE: INTRODUCTION

OVERVIEW

This chapter provides a brief description of CRSP and our products, an overview of the stock and indices data, anoverview of the CRSPAccess97 Database format and notational conventions.

INSIDE1.1 About CRSP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 1

CRSP Working Papers CRSP Board of Directors CRSP Historical Data Products

1.2 Overview of CRSPAccess97 Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 51.3 Format and Programming Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 6

CRSPAccess97 Using Existing FORTRAN Programs with CRSPAccess97 1998 Programming Changes

1.4 Notational Conventions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 9

Page 12: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97
Page 13: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 1. Introduction

Introduction

ers.

ll stocksn StockUS stock

lly, wegers and return

e Schoolated into

chers andvailable

CHAPTER 1. INTRODUCTION

1.1 About CRSP

Back in 1959, Professor James Lorie fielded a call from Louis Engel, a Vice President at Merrill Lynch, Pierce,Fenner & Smith. The firm wanted to advertise how well people had done investing in common stocks, but Engelneeded some solid data. Could the University of Chicago Graduate School of Business help?

That was the start of the Center for Research in Security Prices. Forty years ago, computer technology was in itsinfancy and no machine-readable data existed.

Professor Lorie and Professor Lawrence Fisher, a colleague on the finance faculty, set out to build a database ofhistorical and current securities data that answered Merrill Lynch’s question and, since then, many, many oth

The professors compiled the first machine-readable file. It contained month-end prices and total returns on alisted on the New York Stock Exchange between 1926 and 1960. Over time, CRSP added the AmericaExchange, the NASDAQ stock exchange, and end-of-day as well as month-end prices. Now CRSP updates data in two frequencies; either once a year or once a month.

In 1999, CRSP is justly considered the best provider by far of US corporate actions information. Specificadiligently track name changes and name identifiers, and distributions of shares, cash, rights, spin-offs, merliquidation payments. As a result, the history and quality of CRSP capital return, income return and totalnumbers are unsurpassed.

CRSP Working Papers

From its founding, the University of Chicago set the highest standards of research excellence. The Graduatof Business helped to spawn the modern revolution in finance, and research done here has been incorporCRSP Data Files. Among them:

Risk/Return Analysis by Harry MoskovitzThe Sharpe-Lintner Capital Asset Pricing ModelThe Efficient Market HypothesisBlack-Scholes Option Pricing ModelSmall Stock Effect

The comprehensiveness and quality of CRSP data has made it the premier source for academic researquantitative analysts for thirty-five years. We have the latest research on a wide variety of finance topics aonline.

World Wide Web: crsp.com, CRSP Working Papers

1

Page 14: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

CRSP Board of DirectorsWe are fortunate to have the guidance of world-renowned faculty.

Chairman Eugene F. Fama, Robert R. McCormick Distinguished Service Professor of Finance.Douglas W. Diamond, Theodore O. Yntema Professor of Finance.Steven Neil Kaplan, Leon Carroll Marshall Professor of Finance.Robert W. Vishny, Eric J. Gleacher Distinguished Service Professor of Finance.John Huizinga, Walter David “Bud” Fackler Professor of Economics, Deputy Dean for the Faculty.Mark E. Zmijewski, Professor of Accounting, Deputy Dean for the Full-Time M.B.A. Programs.

CRSP Historical Data Products

CRSP NYSE, AMEX, Nasdaq Daily and Monthly Price and Total Return Databases

CRSP provides monthly or annual updates of end-of-day and month-end prices on all listed NYSE, AMEX andNASDAQ common stocks with basic market indices. CRSP provides the most comprehensive distributioninformation available, for the most accurate total return calculations.

Important facts regarding CRSP US Stock Data.

Y Annual Update: Ready in April.

Y Monthly Updates: Ready by the 15th day of the following month.

Y Daily and Month-End Data: NYSE/AMEX: High, low, bid, ask, closing price, trading volume, sharesoutstanding, capital appreciation, income appreciation, total return, year-end capitalization, and year-endcapitalization portfolio. NASDAQ data also includes: closing bid, ask, number of trades, historical traitsinformation, market maker count, trading status, and NASD classification.

Y History: NYSE daily data begins July 1962. Monthly data begins December 1925. AMEX daily and monthlydata begins July 1962. NASDAQ daily and monthly data begins December, 14 1972.

Y Identifying Information: Complete Name History for each security; all historical: CUSIPs, exchange codes,ticker symbols, SIC codes, share classes, share codes, and security delisting information. These items maychange over time. CRSP has developed a unique permanent issue identification number, PERMNO, and a uniquepermanent company identification number, PERMCO. These enable the user to track the issue over time,performing extremely accurate time-series data analysis.

Y Distribution Information: descriptions of all distributions, dividend amounts, factors to adjust price and shares,declarations, ex-distributions, record and payment dates, and security and company linking information.

2

Page 15: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 1. Introduction

Introduction

CRSP US Stock, Treasury Indices and Portfolio Assignments Database

A companion database, the CRSP US Stock, Treasury Indices and Portfolio Assignments Database, provides marketindices on a daily, monthly, quarterly and annual frequency. This database provides additional market and securitylevel portfolio statistics and decile portfolio assignment data. Four types of indices provide the following information.

Y The CRSP Stock File Indices includes Value- and Equal-Weighted Indices, with or without dividends, the S&P500 Composite Index and returns, NASDAQ Composite Index and return and security data needed to link stocksto the CRSP US Market Cap-Based Portfolios. The indices files also contain the US Government ConsumerPrice Index, US Government Bond Fixed Term Index Series, and the CRSP Risk-Free Rates File.

Y Track micro-, small-, mid- and large-cap stocks with CRSP US Market Cap-Based Portfolios. CRSP ranks allNYSE companies by market capitalization and divides them into 10 equally populated portfolios. AMEX andNASDAQ National Market stocks are then placed into deciles according to their respective capitalizations,determined by the NYSE breakpoints. CRSP Portfolios 1-2 represent large caps, Portfolios 3, 4, 5 represent mid-caps, Portfolios 6, 7, 8 represent small caps, and Portfolios 9-10 benchmark micro-caps.

Among the monthly data provided are the number of companies in the portfolio at the start of the quarter,portfolio weight at the start of the quarter, total return and index level, capital apprecation return and index level,and income return and index level.

Y CRSP Indices for the S&P 500 Universe are daily and monthly files which include value- and equal-weightedreturns, with and without dividends.

Y CRSP US Treasury and Inflation Series are monthly files containing returns and index levels on US Treasuriesand the US Government Consumer Price Index and index level.

CRSP US Government Bills, Notes and Bonds End-of-Day and Month-End Databases

CRSP provides 1.5 million end-of-day price observations for 3,244 US Treasury bills, notes and bonds since 1961.The monthly database contains 101,986 prices for 5,136 issues since 1925. Monthly supplemental files, developedby Professor Eugene F. Fama, Professor of Finance, are described below. They are updated annually.

Important facts regarding CRSP US Treasury data.

Y Annual Updates: Ready in April.

Y Daily Data: Daily quote dates, delivery dates, 1-, 3-, and 6-month CD rates, 30-, 60-, and 90-day commercialpaper rates, and Federal funds effective rate. Monthly Data: Monthly quote dates and delivery dates. Julian,linear, and other date information to facilitate date arithmetic.

Y History: Daily data begins on June 14, 1961. Monthly data begins on December, 31 1925.

Y Identifying Information: CRSP Identifier (CRSPID), CUSIP, maturity date, coupon rate, among other items,sorted by CRSPID.

Y Quote Data: Bid, ask, and source. Performance Data: Accrued interest, yield, return and duration.

Y Debt Data: Debt outstanding, total and publicly held.

Y Fixed Term Indices Files: Performance of single Treasury issues at fixed maturity horizons.

Y Supplemental Files: The CRSP US Treasury Securities Month-End Database contains files designed byEugene F. Fama, Professor of Finance, The University of Chicago Graduate School of Business. These filesextract term structures and risk-free rates. There are four groups of files. The Treasury Bill Term Structure Files,The Fama-Bliss Discount Bond Files, The Risk-Free Rates File and The Maturity Portfolio Returns File. Thedata in these files begin in 1952 with the exception of the Risk-Free Rates File, where the data begins in 1925.

3

Page 16: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

elues anduarterly,

chool of

quidationt cases isilable to

CRSP Survivor-Bias Free US Mutual Fund Database based on the Standard & Poor’s® Micropal® Database

In estimating the performance on an equal-weighted index of equity mutual funds, Mr. Carhart found that,“Using only surviving funds biases these (performance) measures upward by about one percent per year.”

Recently introduced, the CRSP Survivor-Bias Free US Mutual Fund Database records each mutual fund’s namand organizational history. CRSP tracks monthly returns, monthly total net assets, monthly net asset vamonthly distributions for open-ended mutual funds from January 1, 1962, to December 31, 1997. Updated qthe database uses Microsoft Access 97 database software.

Mark M. Carhart developed this unique database for his 1995 dissertation submitted to the Graduate SBusiness entitled, Survivor Bias and Persistence in Mutual Fund Performance. In it he noted that the explosion innew mutual funds has been “accompanied by a steady disappearance of many other funds through merger, liand other means. …this data is not reported by mutual fund data services or financial periodicals and in mos(electronically) purged from current databases. This imposes a selection bias on the mutual fund data avaresearchers: only survivors are included.”

Sample Data Sets

Sample data sets for all CRSP products are available on the Getting Started CD-ROM.

4

Page 17: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 1. Introduction

Introduction

1.2 Overview of CRSPAccess97 Databases

A CRSPAccess97 database is a customized financial database system supporting time series, event, and header datafor various financial data structures. A single database is a set of defined configuration and module files in adirectory. Configuration files track the location of data in the module files.

The basic levels of a CRSPAccess97 database are the database, set type, set id, module, object, and array. They aredefined as follows:

Y Database (CRSPDB) is the directory containing the database files. A CRSPDB is identified by the database path.

Y Set Type is a predefined type of financial data. Each set type has its own defined set of data structures,specialized access functions, and keys. CRSPAccess97 databases support stock (STK) and index (IND) settypes. A CRSPDB can support more than one set type.

Y Set Identifier (SETID) is a defined subset of a set type. SETIDs of the same set type use the same accessfunctions, structures, and keys, but have different characteristics within those structures. For example, dailystock sets use the same data structure as monthly stock sets, but time series are associated with differentcalendars. Multiple SETIDs of the same set type can be present in one CRSPDB.

Y Modules are the groupings of data found in the data files in a CRSPDB. Multiple data items can be present in amodule. Data is retrieved at a module level, and access functions retrieve data items for keys based on selectedmodules. A module corresponds to one physical data file.

Y Objects are the fundamental data types defined for each set type. There are three fundamental object types, timeseries (CRSP_TIMESERIES), event arrays (CRSP_ARRAY), and headers (CRSP_ROW). Objects containheader information such as counts, ranges, or associated calendars plus arrays of data for zero or moreobservations. Some set types allow arrays of objects of one type. In this case, the number of available objects isdetermined by the SETID, and each of the objects in the list has independent counts, ranges, or associatedcalendars.

Y Arrays are attached to each object. The array contains the set of observations and is the basic level ofprogramming access. An observation can be a simple data type such as an integer for an array of volumes, or acomplex structure such as for a name history. When there is an array of objects, there is a corresponding array ofarrays with the data.

Configuration Files contain information about supported sets and modules in the CRSPDB, a list of keys, addressesof data for each key in different data modules, a set of shared calendars, a set of secondary indexes, and a list of freespace within the module files. Module files contain the data for groups of objects for keys.

A C application programming interface supports access to defined set structures, such as stock security or index data.A FORTRAN programming interface on top of the C interface is also built into the system. The FORTRAN accessutilizes random access by module and by key, but otherwise maps structures to previously supported CRSP commonblocks.

5

Page 18: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

1.3 Format and Programming Changes

CRSPAccess97

CRSPAccess97 provides the following features for programming:

Y Binary data, programming libraries, and sample programs are provided for target platforms. The installation

programs on the Getting Started CD-ROM will load program files and help setup environment variables needed

to use the programs.

Y Utilities are provided to dump data and perform name list searches without programming.

Y FORTRAN77 data structures previously provided in CRSP SFA format sample programs are supported.

Additional data items and random access subroutines are now available.

Y C sample programs and libraries are available. C access supports random access and extended data utility

functions.

Tables are provided in this document with FORTRAN and C structure layouts and comparisons of the structures.

6

Page 19: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 1. Introduction

Introduction

s

Using Existing FORTRAN Programs with CRSPAccess97

CRSPAccess97 is written to be backward-compatible with SFA FORTRAN 77 common blocks provided before 1996

and on tapes. Programs based on earlier releases of these programs can still be used with the new data with minor

adjustments.

CRSPAccess97 data is delivered in a binary format. CRSPAccess97 installation loads the data, include files and

compiled libraries. There is no need to convert character files.

See “Chapter 2. Accessing Data in FORTRAN” on page 11, for information about new FORTRAN acces

subroutines that allow more control when reading data, including random access by PERMNO, PERMCO, CUSIP and

historical CUSIP. CRSPAccess97 also supports existing programs derived from BISTK, BISTKP, or NAMLST.

These can be adapted using the following guidelines:

- INCLUDE statements should be in the form INCLUDE ‘ALLINCLS.TXT’ if not already in that form.

Include files are in a standard directory that is indicated in the compiler command. ALLINCLS.TXT is

equivalent to US.TXT described in Chapter 2.

- Erase any OPEN , REWIND, or CLOSE statements for data or calendar/indices files in the program. All data

files will be opened automatically by the BICAL subroutine.

- The BICAL subroutine parameters must be changed. The first parameter becomes an integer code for

database frequency (1 = monthly, 2 = daily) and a new second parameter is an integer file exchange code

3=NYSE/AMEX/NASDAQ file). CALL BICAL(1,3) is equivalent to CALL

UMOPEN(NYSEAMEX+NASDAQ) and CALL BICAL(2,3) is equivalent to CALL

UDOPEN(NYSEAMEX+NASDAQ). See UMOPEN and UDOPEN Page 20.

CALL BIGET(JUNIT,*900) is equivalent to CALL USGETPER(1,INEXT,EVERY,*900). USGETPER is

described in Chapter 2.

BICAL and BIGET cannot be used to access beta or standard deviation excess returns or portfolio data. To access

these data items, either the program must be converted to use new CRSPAccess97 FORTRAN access subroutines or

the data must be converted back to SFA format with conversion programs.

It is also possible to convert the data back to the SFA format and use the existing programs directly. The

stk_dump_bin , ind_dump_bin, stk_dump_char, and ind_dump_char utilities, described in the CRSP

SFA Database Format Guide can also be used to convert CRSPAccess97 databases to the binary or character files

previously supported for prior FORTRAN or SAS Proc Datasource access.

7

Page 20: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

ith C++

le on the

1998 Programming Changes

Y The default value for DATEFFLAG in CRSPAccess97 FORTRAN is now 1 instead of 0. All date fields in

FORTRAN will be loaded with 4-digit years. A program can set DATEFFLAG to 0 to load dates with 2-digit

years.

Y FORTRAN output functions OUTDEL and OUTDIS were modified to write full 8-digit dates in the formatted

output.

Y C Access and Utility Functions were improved. New general functions and date clearing function were added.

Changes were also made to adjustment functions to handle unusual data cases. See “3.3 CRSPAccess97 C

Library” on page 56.

Y Modifications were made to CRSP header files to make the structure more standard and compatible w

compilers. The index structure element typename was changed to indname.

Y Program and header file listings are no longer provided in the manuals. The files themselves are availab

CRSP Getting Started CD-ROM.

Y CRSPAccess97 libraries are fully supported for IBM AIX Systems. See “FORTRAN Unix Systems Support” on

page 38, and “CRSP C Supported Unix Systems” on page 115.

8

Page 21: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 1. Introduction

Introduction

sing aoutiners to theifier.using ad

n

fer. For

ANrange to as

s,

1.4 Notational Conventions

Y All names that occur within CRSP’s FORTRAN and C sample programs and include files are printed uconstant-width, courier, font. These names include variable names, parameter names, subrnames, subprogram names, function names, library names, and keywords. For example, CUSIP refeCUSIP Agency identifier, while CUSIP refers to the variable that the programs use to store this identCRSP’s variable mnemonics, used as names and in the descriptions, are displayed capitalized CONSTANT-WIDTH font. C is typically displayed in lower case, excepting constants, which are displayein UPPER CASE, while FORTRAN is displayed in UPPER CASE.

Y All names that refer to the CRSP data utilities, sample programs or include file titles are printed using aitalichelvetica font (LWDOLF�OXFLQGD�FRQVROH in the html versions).

Y Names of FORTRAN common blocks are delimited by slashes(/ /).

Y Names with a similar format are sometimes referenced collectively, using three x's where the names difexample, the FORTRAN variables BEGVOL, BEGRET, BEGPRC, etc. are sometimes referred to as BEGxxx.

Y In the variable definitions section, the variable I is sometimes used in referencing a variable in a FORTRarray. In this case, I refers to a possible range of valid data in this array for this company, where the valid is determined by the number of header variables. For example, the names date is referredNAMES(NAMEDT,I). Here I is an integer between 1 and the header variable NUMNAM, which represents thenumber of name structures that exist for any specified issue in the CRSP Stock Database.

Y In C, all CRSP-defined data types have names in all capitals beginning with CRSP_.

Y The text of this document is in Times New Roman. Italics and bold styles are used to emphasize headingnames, definitions and related functions.

9

Page 22: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

10

Page 23: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

FO

RT

RA

N P

rogramm

ing

CHAPTER TWO: ACCESSING DATA IN FORTRANOVERVIEW

This section provides the technical information needed to access CRSPAccess97 Stock and Indices data using theFORTRAN programming language.

The CRSPAccess97 Files are implemented as CRSPDB databases. The CRSPDB format supports random access withmultiple keys, and allows selective access of available data items. FORTRAN programs can be linked to CRSPlibrary functions and use common blocks defined in CRSP include files. Sample programs are provided on theGetting Started CD.

The system can be configured to have one daily and one monthly stock database available. Programs and commonblocks support either daily or monthly data.

See the CRSP Data Definitions and Coding Schemes Guide for data item definitions, the CRSPAccess97 InstallationGuide for installation instructions, and the CRSPAccess97 Database Format - Utilities Guide and for data toolsavailable with the databases.

INSIDE2.1 CRSPAccess97 FORTRAN Stock Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Page 112.2 FORTRAN Stock Sample Programs and Subroutines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 17

Sample Programs — USSAMP*.FOR Include Files — Parameters, Subscripts, Common Blocks, and Functions CRSP Access Subprograms — Subroutines For Retrieving CRSP Data Open and Close Subroutines CRSP GET Routines Stock FORTRAN Utility Subprograms — Utility Functions and Subroutines General Utilities Stock FORTRAN Print Utility Subroutines

2.3 Suggestions For FORTRAN Programs Using CRSP Stock Data . . . . . . . . . . . . . . . . . . . . . . . . . Page 352.4 FORTRAN Access on Supported Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 37

Windows Systems Unix Systems OpenVMS Systems

Page 24: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97
Page 25: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

11

Range

I between 1 and NDAYS

I between 1 and NDAYS, date range between CALDT(1) and CALDT(NDAYS)I between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYS

I between 1 and NDAYS

n/an/an/an/an/a see PERMCOn/an/agreater than or equal to 1greater than or equal to 0greater than or equal to 0

always equal to 1greater than or equal to 0greater than or equal to 0greater than or equal to 0greater than or equal to 0greater than or equal to 0greater than or equal to 0

FORTRAN Programming

CHAPTER 2. ACCESSING DATA IN FORTRAN

2.1 CRSPAccess97 FORTRAN Stock Data Structure

See the CRSP Data Definitions and Coding Schemes Guide for variable definitions.

Stock FORTRAN Variable UsageMnemonic Name Data Type UsageNDAYS Number of Calendar Days INTEGER NDAYS

/CAL/ Calendar/Indices Arrays COMMON BLOCKCALDT Calendar Trading Date INTEGER CALDT(I)

VWRETDReturn (Including all Distributions) on Value-Weighted Index REAL

VWRETD(I) is the value-weighted return on CALDT(I)

VWRETX Return (Excluding Dividends) on Value-Weighted Index REAL VWRETX(I)

EWRETDReturn (Including all Distributions) on Equal-Weighted Index REAL EWRETD(I)

EWRETX Return (Excluding Dividends) on Equal-Weighted Index REAL EWRETX(I)TOTVAL Total Value of Market REAL TOTVAL(I)TOTCNT Total Count of Market INTEGER TOTCNT(I)USDVAL Market Value of Securities Used REAL USDVAL(I)USDCNT Count of Securities Used INTEGER USDCNT(I)SPINDX or S&P 500 Composite Index Level or REAL SPINDX(I)NCINDX Index Level on Nasdaq Composite REAL NCINDX(I)SPRTRN or S&P 500 Composite Index Return or REAL SPRTRN(I)NCRTRN Return on Nasdaq Composite Index REAL NRCTRN(I)

/HEADER/ Stock Identification and Data Range Variables COMMON BLOCKCUSIP CUSIP Identifier, Header CHARACTER*8 CUSIPPERMNO PERMNO INTEGER PERMNOPERMCO PERMCO INTEGER PERMCOISSUNO Nasdaq Issue Number INTEGER ISSUNOCOMPNO Nasdaq Company Number INTEGER COMPNOHEXCD Exchange Code, Header INTEGER HEXCDHSICCD Standard Industrial Classification (SIC) Code, Header INTEGER HSICCDNUMNAM Number of Name Structures INTEGER NAMES(*,NUMNAM)NUMDIS Number of Distribution Structures INTEGER DISTS(*,NUMDIS)NUMSHR Number of Shares Structures INTEGER SHARES(*,NUMSHR)

NUMDEL Number of Delisting Structures INTEGERDELIST(*,NUMDEL) or RDELIS(*,NUMDEL)

NUMNDI Number of Nasdaq Information Structures INTEGER NASDIN(*,NUMNDI)BEGDAT Begin Index of Stock Data INTEGER CALDT(BEGDAT)ENDDAT End Index of Stock Data INTEGER CALDT(ENDDAT)BEGPRC Begin Index of Price Data INTEGER CALDT(BEGPRC)ENDPRC End Index of Price Data INTEGER CALDT(ENDPRC)BEGSP Begin Index of Secondary Price Data INTEGER CALDT(BEGSP)

Page 26: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

greater than or equal to 0greater than or equal to 0greater than or equal to 0greater than or equal to 0greater than or equal to 0greater than or equal to 0greater than or equal to 0

greater than or equal to 0

greater than or equal to 0

greater than or equal to 0

greater than or equal to 0

there are NUMNAM name structures

I between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAMI between 1 and NUMNAM

there are NUMDIS distribution events

I between 1 and NUMDISI between 1 and NUMDISI between 1 and NUMDISI between 1 and NUMDISI between 1 and NUMDIS

Range

12

ENDSP End Index of Secondary Price Data INTEGER CALDT(ENDSP)BEGVOL Begin Index of Volume Data INTEGER CALDT(BEGVOL)ENDVOL End Index of Volume Data INTEGER CALDT(ENDVOL)BEGRET Begin Index of Return Data INTEGER CALDT(BEGRET)ENDRET End Index of Return Data INTEGER CALDT(ENDRET)BEGYR Begin Index of Portfolio Data INTEGER BEGYR+1924ENDYR End Index of Portfolio Data INTEGER ENDYR+1924BEGRTX or Begin Index of Return without Dividends Data or

INTEGERCALDT(BEGRTX) or

BEGSXS Begin Index of Optional Time Series 1 Data or

CALDT(BEGSXS)Begin Index of Standard Deviation Excess Return Data

ENDRTX or End Index of Return Without Dividends Data or INTEGER CALDT(ENDRTX) or

ENDSXS End Index of Optional Time Series 1 Data or

INTEGER CALDT(ENDSXS)End Index of Standard Deviation Excess Return Data

BEGPR2 or Begin Index of Spread between Bid and Ask Data or INTEGER CALDT(BEGPR2) or

BEGBXS Begin Index of Optional Time Series 2 Data or

INTEGER CALDT(BEGBXS)Begin Index of Beta Excess Return Data

ENDPR2 or End Index of Spread between Bid and Ask Data or INTEGER CALDT(ENDPR2) or

ENDBXSEnd Index of Optional Time Series 2 Data or

INTEGER CALDT(ENDBXS)End Index of Beta Excess Return Data

/INFO/Name, Distribution, Share, Delisting, and Nasdaq Information Arrays COMMON BLOCK

NAMES Security Name History INTEGERNAMES is equivalenced to CNAMES

CNAMES Security Name History, character fields CHARACTER*4NAMEDT Name Effective Date INTEGER NAMES(NAMEDT, I)NCUSIP CUSIP 2 CHARACTER*4 CNAMES(NCUSIP,I)CUSP CUSIP Function CHARACTER*8 CUSP(I)TICKER Ticker Symbol 2 CHARACTER*4 CNAMES(TICKER,I)TICK Ticker Symbol function CHARACTER*8 TICK(I)COMNAM Company Name 8 CHARACTER*4 CNAMES(COMNAM,I)COMPNM Company Name Function CHARACTER*32 COMPNM(I)SHRCLS Share Class CHARACTER*4 CNAMES(SHRCLS,I)SHRCD Share Code INTEGER NAMES(SHRCD,I)EXCHCD Exchange Code INTEGER NAMES(EXCHCD,I)SICCD Standard Industrial Classification (SIC) Code INTEGER NAMES(SICCD,I)

DISTS Distribution Event Array INTEGER

DISTS is equivalenced to RDISTS

RDISTS Distribution History Array, real fields REAL, see DISTSDISTCD Distribution Code INTEGER DISTS(DISTCD,I)DIVAMT Dividend Cash Amount REAL RDISTS(DIVAMT,I)FACPR Factor to Adjust Price REAL RDISTS(FACPR,I)FACSHR Factor to Adjust Shares Outstanding REAL RDISTS(FACSHR,I)DCLRDT Distribution Declaration Date INTEGER DISTS(DCLRDT,I)

Mnemonic Name Data Type Usage

Page 27: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

13

I between 1 and NUMDISI between 1 and NUMDISI between 1 and NUMDISI between 1 and NUMDISI between 1 and NUMDIS

there are NUMSHR shares observationsI between 1 and NUMSHR,J between BEGPRC and ENDPRC, CALDT(J) is the date of the observationI between 1 and NUMSHRI between 1 and NUMSHR

there are NUMDEL delist events

I between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDELI between 1 and NUMDEL

there are NUMNDI information structuresI between 1 and NUMNDII between 1 and NUMNDII between 1 and NUMNDII between 1 and NUMNDII between 1 and NUMNDI

I between BEGSP and ENDSPI between BEGSP and ENDSPI between BEGPRC and ENDPRCI between BEGVOL and ENDVOLI between BEGRET and ENDRET

I between BEGSXS and ENDSXSI between BEGRTX and ENDRTXI between BEGBXS and ENDBXSI between BEGPR2 and ENDPR2

I between BEGYR and ENDYR

Range

FORTRAN Programming

EXDT Ex-Distribution Date INTEGER DISTS(EXDT,I)RCRDDT Record Date INTEGER DISTS(RCRDDT,I)PAYDT Payment Date INTEGER DISTS(PAYDT,I)ACPERM Acquiring PERMNO INTEGER DISTS(ACPERM,I)ACCOMP Acquiring PERMCO INTEGER DISTS(ACCOMP,I)

SHARES Shares Outstanding Observation Array INTEGERSHROUT Shares Outstanding INTEGER SHARES(SHROUT,I)

CURSHR Shares Outstanding Function INTEGER CURSHR(J)SHRSDT Shares Observation Date INTEGER SHARES(SHRSDT,I)SHRFLG Shares Outstanding Observation Flag INTEGER SHARES(SHRFLG,I)

DELIST Delisting History Array INTEGERDELIST is equivalenced to RDELIS

RDELIS Delisting Structure Array REALDLSTDT Delisting Date INTEGER DELIST(DLSTDT,I)DLSTCD Delisting Code INTEGER DELIST(DLSTCD,I)NWPERM New PERMNO INTEGER DELIST(NWPERM,I)NWCOMP New PERMCO INTEGER DELIST(NWCOMP,I)NEXTDT Delisting Date of Next Available Information INTEGER DELIST(NEXTDT,I)DLAMT Amount After Delisting REAL RDELIS(DLAMT,I)DLRETX Delisting Return without Dividends REAL RDELIS(DLRETX,I)DLPRC Delisting Price REAL RDELIS(DLPRC,I)DLPDT Delisting Payment Date INTEGER DELIST(DLPDT,I)DLRET Delisting Return REAL RDELIS(DLRET,I)

NASDIN Nasdaq Information Array INTEGERTRTSDT Nasdaq Traits Date INTEGER NASDIN(TRTSDT,I)TRTSCD Nasdaq Traits Code INTEGER NASDIN(TRTSCD,I)NMSIND Nasdaq National Market Indicator INTEGER NASDIN(NMSIND,I)MMCNT Market Maker Count INTEGER NASDIN(MMCNT,I)NSDINX NASD Index Code INTEGER NASDIN(NSDINX,I)

/DDATA/ Price, Volume, and Return Arrays COMMON BLOCKBIDLO Bid or Low Price REAL BIDLO(I)ASKHI Ask or High Price REAL ASKHI(I)PRC Price or Bid/Ask Average REAL PRC(I)VOL Volume Traded INTEGER VOL(I)RET Holding Period Total Return REAL RET(I)

/ADATA/ Secondary Prices and Returns and Yearly Data COMMON BLOCKSXRET or Standard Deviation Excess Return or

REALSXRET(I) or

RETX Return Without Dividends RETX(I)BXRET or Beta Excess Return or

REALBXRET(I) or

PRC2 Spread between Bid and Ask PRC2(I)

PRTNUM Portfolio Assignment Array INTEGERsee individual components for usage.

PRTNUM(1) orPortfolio Assignment for First Portfolio* INTEGER

PRTNUM(1,I) orPRTNUM(CAP) PRTNUM(CAP,I)

Mnemonic Name Data Type Usage

Page 28: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

I between BEGYR and ENDYR

I between BEGYR and ENDYR

I between BEGYR and ENDYR

I between BEGYR and ENDYR

I between BEGYR and ENDYR

I between BEGNMS and ENDNMSI between BEGNMS and ENDNMS

I between BEGNMS and ENDNMSI between BEGNMS and ENDNMSI between BEGNMS and ENDNMSI between BEGNMS and ENDNMS

Range

14

PRTNUM(2) orPortfolio Assignment for Second Portfolio* INTEGER

PRTNUM(2,I) orPRTNUM(SDEV) or PRTNUM(SDEV,I) orPRTNUM(CAPN) PRTNUM(CAPN,I)PRTNUM(3) or

Portfolio Assignment for Third Portfolio INTEGERPRTNUM(3,I) or

PRTNUM(BETA) or PRTNUM(BETA,I) orPRTNUM(CAPQ) PRTNUM(CAPQ,I)

YRVAL Portfolio Statistic Array REALsee individual components for usage.

YRVAL(1) orPortfolio Statistic for First Portfolio‡ REAL

YRVAL(1,I) orYRVAL(CAP) YRVAL(CAP,I)YRVAL(2) or

Portfolio Statistic for Second Portfolio‡ REALYRVAL(2,I) or

YRVAL(SDEV) or YRVAL(SDEV,I) orYRVAL(CAPN) YRVAL(CAPN,I)YRVAL(3) or

Portfolio Statistic for Third Portfolio‡ REALYRVAL(3,I) or

YRVAL(BETA) or YRVAL(BETA,I) orYRVAL(CAPQ) YRVAL(CAPQ,I)

NMSHDR Nasdaq Header COMMON BLOCKBEGNMS Begin Index of Nasdaq Data INTEGER CALDT(BEGNMS)ENDNSM End Index of Nasdaq Data INTEGER CALDT(ENDNMS)CUSIPN CUSIP, Nasdaq CHARACTER *8 CUSIPNPERMN Nasdaq Permanent Issue Identification Number INTEGER PERMN

NMSDAT Nasdaq Bid, Ask and Number of Trades Arrays COMMON BLOCKNMSBID Bid REAL NMSBID(I)NMSASK Ask REAL NMSASK(I)NMSTRD Number of Trades, Nasdaq INTEGER NMSTRD(I)NMSTRD Price Alternate Date INTEGER NMSTRD(I)* Portfolio Assignment for Capitalizations = Portfolio Assignment for First Portfolio,

Portfolio Assignment for NYSE/AMEX Capitalizations, and Portfolio Assignment for Standard Deviations = Portfolio Assignments for Second Portfolio, Portfolio Assignment for Nasdaq Capitalizations, and Portfolio Assignment for Betas = Portfolio Assignment for Third Portfolio

‡ Portfolio Statistic for Capitalizations = Portfolio Statistic for First Portfolio, Portfolio Assignment for NYSE/AMEX Capitalizations, and Portfolio Statistic for Standard Deviations = Portfolio Statistic for Second Portfolio, Portfolio Statistic for NASDAQ Capitalizations, and Portfolio Statistic for Betas = Portfolio Statistic for Third Portfolio

Mnemonic Name Data Type Usage

Page 29: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

15

FORTRAN Range

I between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYS

I between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYS, J between 1 and 10I between 1 and NDAYS, J between 1 and 10I between 1 and NDAYS, J between 1 and 10

I between 1 and NDAYSI between 1 and 17 and J between 1 and NDAYSI between 1 and 17 and J between 1 and NDAYSII between 1 and 17 and J between 1 and NDAYSI between 1 and 17 and J between 1 and NDAYSI between 1 and 17 and J between 1 and NDAYS

FORTRAN Programming

Indices FORTRAN Variable UsageMnemonic Name FORTRAN Data Type FORTRAN Usage

CRSP Stock File Indices FORTRAN Variables - loaded by UMCAL and UDCAL subroutines to /CAL/ common block

CALDT Calendar Trading Date INTEGER CALDT(I)VWRETD Return (Including all Distributions) on Value-Weighted Index REAL VWRETD(I)

VWINDDIndex Level Associated with the Return (Including all Distributions) on Value-Weighted Index REAL VWINDD(I)

VWRETX Return (Excluding Dividends) on Value-Weighted Index REAL VWRETX(I)

VWINDXIndex Level Associated with the Return (Excluding Dividends) on Value-Weighted Index REAL VWINDX(I)

EWRETD Return (Including all Distributions) on Equal-Weighted Index REAL EWRETD(I)

EWINDDIndex Level Associated with the Return (Including all Distributions) on Equal-Weighted Index REAL EWINDD(I)

EWRETX Return (Excluding Dividends) on Equal-Weighted Index REAL EWRETX(I)

EWINDXIndex Level Associated with the Return (Excluding Dividends) on Equal-Weighted Index REAL EWINDX(I)

SPRTRN S&P 500 Composite Index Return REAL SPRTRN(I)SPINDX S&P 500 Composite Index Level REAL SPINDX(I)NCRTRN Return on Nasdaq Composite Index REAL NCRTRN(I)NCINDX Index Level on Nasdaq Composite REAL NCINDX(I)TOTVAL Total Value of Market REAL TOTVAL(I)TOTCNT Total Count of Market INTEGER TOTCNT(I)USDVAL Market Value of Securities Used REAL USDVAL(I)USDCNT Count of Securities Used INTEGER USDCNT(I)

CRSP Stock File Decile Indices FORTRAN Variables - loaded by UMDEC and UDDEC subroutines to /DECILE/ common block

VWINDDIndex Level Associated with the Return (Including all Distributions) on Value-Weighted Index REAL VWINDD(I)

VWINDXIndex Level Associated with the Return (Excluding Dividends) on Value-Weighted Index REAL VWINDX(I)

EWINDDIndex Level Associated with the Return (Including all Distributions) on Equal-Weighted Index REAL EWINDD(I)

EWRETX Return (Excluding Dividends) on Equal-Weighted Index REAL EWRETX(I)

EWINDXIndex Level Associated with the Return (Excluding Dividends) on Equal-Weighted Index REAL EWINDX(I)

DECRET Return on Decile REAL DECRET(J,I)

DECIND Index Level Associated with the Return on Decile REAL DECIND(J,I)

DECRET Return on Decile REAL DECRET(J,I)

CRSP Cap-Based Reports (Monthly) History Variables - loaded by UMCAP subroutines to /CAPBAS/ and /CAPWGT/ common blocksPRTNAM Portfolio Sequence Number CHARACTER*4 PRTNAM(I)

PRTCNT Portfolio Issue Count INTEGER PRTCNT(I,J)

PRTWGT Portfolio Weight REAL*8 PRTWGT(I,J)

TOTRET Return on Portfolio REAL TOTRET(I,J)

TOTIND Index Level Associated with Total Return on Portfolio REAL TOTIND(I,J)

CAPRET Capital Appreciation on Portfolio REAL CAPRET(I,J)

Page 30: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

I between 1 and 17 and J between 1 and NDAYSI between 1 and 17 and J between 1 and NDAYSI between 1 and 17 and J between 1 and NDAYS

I between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYSI between 1 and NDAYS

I between 1 and NDAYS

16

CAPIND Index Level Associated with the Capital Appreciation on Portfolio REAL CAPIND(I,J)

INCRET Return on Income Portfolio REAL INCRET(I,J)

INCIND Index Level Associated with the Income Return on Portfolio REAL INCIND(I,J)

CRSP Indices on the S&P500 Universe FORTRAN Variables same layout as CRSP Stock File Indices

CTI Index FORTRAN Variables - loaded by UMCTI subroutines to the /CTIS/ common blockCALDT Calendar Trading Date INTEGER CALDT(I)B30RET Return on 30 Year Bonds REAL B30RET(I)B30IND Index Level Associated with the 30 Year Bond Returns REAL B30IND(I)B20RET Return on 20 Year Bonds REAL B20RET(I)B20IND Index Level Associated with the 20 Year Bond Returns REAL B20IND(I)B10RET Return on 10 Year Bonds REAL B10RET(I)B10IND Index Level Associated with the 10 Year Bond Returns REAL B10IND(I)B7RET Return on 7 Year Bonds REAL B7RET(I)B7IND Index Level Associated with the 7 Year Bond Returns REAL B7IND(I)B5RET Return on 5 Year Bonds REAL B5RET(I)B5IND Index Level Associated with the 5 Year Bond Returns REAL B5IND(I)B2RET Return on 2 Year Bonds REAL B2RET(I)B2IND Index Level Associated with the 2 Year Bond Returns REAL B2IND(I)B1RET Return on 1 Year Bonds REAL B1RET(I)B1IND Index Level Associated with the 1 Year Bond Returns REAL B1IND(I)T90RET Return on 90-Day Bills REAL T90RET(I)T90IND Index Level Associated with the 90 Day Bill Returns REAL T90IND(I)T30RET Return on 30-Day Bills REAL T30RET(I)T30IND Index Level Associated with the 30 Day Bill Returns REAL T30IND(I)CPIRET Consumer Price Index Rate of Change REAL CPIRET(I)

CPIINDIndex Level Associated with the Rate of Change in Consumer Price Index REAL CPIIND(I)

Page 31: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

d

ata, for

ted

cre-

ce

ingSICinal

g the

tion

hat

s

a are

2.2 FORTRAN Stock Sample Programs and Subroutines

Sample Programs — USSAMP*.FOR

The FORTRAN sample programs give short examples of how to access the CRSPAccess97 Stock File daily ormonthly data with the universal stock access routines. The ten stock sample programs give basic examples of theCRSP access routines, and illustrate possible tasks while using the access and utility routines. To use a sampleprogram, copy it to your directory from the CRSP sample directory. See “2.4 FORTRAN Access on SupporteSystems” on page 37. Edit the program to meet your needs and compile, link, and run.

The sample programs are written to use either daily or monthly data. To switch between daily and monthly dexample, change UDOPENs and UDCLOSEs to UMOPENs and UMCLOSEs.

USSAMP1.FOR USSAMP1.FOR makes a sequential pass through the daily file in PERMNO order,retrieves /INFO/ data, and creates a namelist of current names. The output is prininto a file called DCNAMES.DAT.

USSAMP2.FOR USSAMP2.FOR reads an historical CUSIP list from a file called HCUSIPS.DAT. Itoutputs a partial namelist to the terminal including all historical CUSIPs found in themonthly file and their corresponding current CUSIPs, names, and last prices. It alsoates a file with the current CUSIPs. USSAMP2.FOR is particularly suited for updatingCUSIP lists after some of the CUSIPs have changed.

USSAMP3.FOR USSAMP3.FOR reads from a file called PERMS.DAT for daily data containingPERMNOs. It looks for each of the PERMNOs in the indicated databases. Return and /INFO/ data are� retrieved for each PERMNOs in the input file that exactly matches asecurity on the file. The PERMNO, name, and returns data for the last date, date of priafter delisting, and delisting return are printed for each stock found.

USSAMP4.FOR USSAMP4.FOR makes a partial sequential pass through the monthly file by processall stocks with a current SIC Code between 2000 and 2100. This range of current codes can easily be changed to select different industry groups. It prints to the terma partial namelist including the final prices for each stock found.

USSAMP5.FOR USSAMP5.FOR reads an input file of historical CUSIPS and beginning and endindates, and writes a file containing the compounded return of each security betweendates given, followed by the daily returns for each date in the range. The funcINDXDT is used to find the indices for the dates and COMPND is used to compound thereturns over the indicated range.

USSAMP6.FOR USSAMP6.FOR makes a sequential pass through the monthly file in PERMNO orderand writes year-end capitalizations and decile portfolio assignments for all firms ttraded the last three years. USSAMP6.FOR uses the subroutine USDATE to find theyear-end dates.

USSAMP7.FOR USSAMP7.FOR writes selected NYSE, AMEX, and Nasdaq daily market indicebetween a selected date range to a file. It calls UDOPEN three times to retrieve each ofthe indices. This program can only be used if the CRSP US Stock, Treasury Indices andPortfolio Assignments Database is integrated.

USSAMP8.FOR USSAMP8.FOR finds securities in the daily file that delisted due to a merger duringselected date range. For all securities found, the delisting date and delisting returnprinted as well as the PERMNO of the security merged into, if one is available.

17

Page 32: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

USSAMP9.FOR USSAMP9.FOR finds all spin-offs in the daily file between a selected date range. Foreach spinoff found, the company CUSIP and name at the time of the spinoff are printedas well as the declaration date and amount of the spin-off and the capitalization portfolioof the security during the year the spin-off occurred. USSAMP9.FOR uses the func-tions CURNAM, COMPNM, and CUSP.

USSAMP10.FOR USSAMP10.FOR reads the Nasdaq bid, ask and number of trades in the daily file insequential order for all issues.

18

Page 33: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

Include Files — Parameters, Subscripts, Common Blocks, and Functions

The FORTRAN include files are used as a convenient way to replace long, frequently used blocks of code with singlestatements. All declarations needed to use the CRSP stock data with FORTRAN programs are automatically made byadding the line;

include ’us.txt’

to the beginning of any main programs or subprograms that will use CRSP data or CRSP access or utility routines.Include files can be found in the CRSP_INCLUDE directory in the Getting Started CD-ROM. They are text files thatcan be printed or browsed in a text editor. Common blocks, variables, and subscripts are listed in the table at thebeginning of this section (Page 11) and described in CRSP Data Definitions and Coding Schemes Guide.

us.txt The us.txt include file declares all the common blocks, variables, arrays, and CRSP functions thatcan be used by a FORTRAN program processing CRSP stock data. This includes usparm.txtautomatically.

usparm.txt The usparm.txt include file declares and defines all parameters used by the CRSP subroutines.There are four types of parameters that are used. The first type are maximums that are used whendeclaring arrays. The second type are subscript names used to name the different fields in the /INFO/ arrays and the portfolio data arrays. The third type define keywords used by the CRSPopen and get routines. The fourth type are useful constants describing the data.

nms.txt The nms.txt include file declares variables used for Nasdaq bid, ask, and number of trades dataitems. It must be included following us.txt to use the NMSOPEN or GETNMS Nasdaq accesssubroutines.

usdec.txt The usdec.txt include file declares the /DECILES/ common block used to store index level anddecile results. It must be loaded following us.txt to use the UMDEC or UDDEC Indices Accesssubroutines.

uscap.txt The uscap.txt include file declares the /CAPBAS/ and /CAPWGT/ common blocks used to storeCap-Based Portfolio result data. It must be loaded following us.txt to use the UMCAP indicesaccess subroutines.

uscti.txt The uscti.txt include file declares the /CTIS/ common block used to store CRSP Treasury andInflation data. It must be loaded following us.txt to use the UMCTI indices access subroutines.

allind.txt The allind.txt include file is equivalent to us.txt, but processes CRSP indices data.

19

Page 34: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

Stock Access Subprograms — Subroutines For Retrieving CRSP Data

CRSP access subprograms are divided into open and close subroutines and get routines. The subroutines are used byFORTRAN programs to actually retrieve CRSP data for processing. An open subroutine is used to open daily ormonthly files and read in the CRSP calendar and indices. Get routines read all or selected data for one security at atime. A close subroutine is used when CRSP processing is done to close the files and reset the program.

The access subroutines may print warning or descriptive messages to the screen. A program can disable thesemessages if it executes the statement MESSAGEFLAG = 1. The messages can be re-enabled by settingMESSAGEFLAG = 0.

Date variables are stored internally and used in the access subroutines in the YYYYMMDD date format. A programcan use YYMMDD format by executing the statement DATEFFLAG = 0, and reset back to YYYYMMDD bysetting DATEFFLAG = 1.

Open and Close Subroutines

Open subroutines are used to open CRSP stock data files, read in the CRSP trading calendar, and read the set ofindices for a file. An open subroutine must be called before using any of the CRSP Get Routines to retrieve actualdata. A corresponding close subroutine closes all input files and resets the program to its beginning state. An opensubroutine is usually placed at the beginning of a program immediately after the variable declarations and a closesubroutine is called just before exiting the program. Please note that daily and monthly data cannot be loadedsimultaneously in FORTRAN.

The syntax of the open subroutines is:

CALL UxOPEN(FILE)

UDOPEN UDOPEN initializes a program for using CRSP daily stock data. It loads the /CAL/common block with the daily calendar and the market indices corresponding to FILE andopens all needed data files. FILE must be set to NYSEAMEX+NASDAQ (=3) for userswith NYSE, AMEX, and Nasdaq stock data.

Users with the CRSP US Stock, Treasury Indices and Portfolio Assignments Databasehave related options for overriding the default index and portfolio type data that areloaded. The function UDCAL (Page 27) can be called after the call to UDOPEN to loadmarket indices based on different exchange combinations. The variables PORTTYPE1,PORTTYPE2, and PORTTYPE3 can be assigned before the call to UDOPEN to selectalternate portfolio types to load into the YRVAL and PRTNUM arrays. These variables arefurther described in the WANTED parameter of the CRSP GET Routines.

UMOPEN UMOPEN initializes a program for using CRSP monthly stock data. It loads the /CAL/common block with the monthly calendar and the market indices corresponding to FILEand opens all needed data files. FILE must be set to NYSEAMEX+NASDAQ (=3) forusers with NYSE, AMEX, and Nasdaq stock data.

Users with the CRSP US Stock, Treasury Indices and Portfolio Assignments Databasehave related options for overriding the default index and portfolio type data that areloaded. The function UMCAL (Page 27) can be called after the call to UMOPEN to loadmarket indices based on different exchange combinations. The variables PORTTYPE1,PORTTYPE2, and PORTTYPE3 can be assigned before the call to UMOPEN to selectalternate portfolio types to load into the YRVAL and PRTNUM arrays. These variables arefurther described in the WANTED parameter of the CRSP GET Routines.

20

Page 35: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

The Syntax of the Close Subroutines is:

CALL UxCLOSE

UDCLOSE UDCLOSE closes all data files opened by UDOPEN and resets the program as if no call to UDOPENhas been made.

UMCLOSE UMCLOSE closes all data files opened by UMOPEN and resets the program as if no call to UMOPENhas been made.

21

Page 36: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

CRSP GET Routines

CRSP stock files are indexed on five key identifiers for FORTRAN random or sequential access:

1. CRSP PERMNO (Permanent Issue Identification Number),

2. CRSP PERMCO (Permanent Company Identification Number)

3. CUSIP Identifier, Header (CUSIP),

4. Historical CUSIP (HCUSIP) derived from the CUSIP and CUSIP Identifier, Header (NAMES(NCUSIP,*) andCUSIP),

5. Historical Standard Industrial Classification (SIC) Code (NAMES(SICCD,*))

Each identifier has an associated get routine that can be used to directly access the CRSP data by that identifier.CUSIP, HCUSIP, and PERMNO are unique identifiers. PERMCO and historical SIC code can identify more than oneissue. PERMNO, PERMCO, and HCUSIP permanently identify the same security, while CUSIP may change overtime. Multiple HCUSIPs can identify the same security.

There are five USGETxxx routines that can be used to read a security’s data. They will only work in a program orsubprogram that contains the statement;

include ’us.txt’

and has previously called UDOPEN or UMOPEN to open the appropriate data file. The five subprograms are identicalexcept for the type of identification they use to retrieve the security. All have four parameters, which can be selectedto determine exactly what is wanted: file, stock identifier, data wanted, and a return address.

The syntax of the get routines is:

CALL USGETxxx (FILE,ID,WANTED,*RETURN)

FILE FILE is included for internal backward compatibility and is not used. FILE must be either a 1 or2.

ID ID is the identifier that determines which security to load. USGETCUS and USGETHIS expecteight-character strings and USGETPER, USGETPCO, and USGETSIC expect integers. In allfunctions other than USGETPER, if a specific ID is indicated, the subroutine will find data for thestock with the next identifier equal to or greater than the given one. If the requested identifier isgreater than any in the file, the greatest is returned. If there is not an exact match, a message isprinted to the terminal. You must check for unmatched identifiers within a program by checkingthe identifier passed against the corresponding identifier of the security loaded. USGETPER doesnot load new data if the PERMNO is not found. The common blocks are left with the previouscontents. Success can be checked by comparing PERMNO against the requested key.

The get routines can also access securities sequentially by identifier or reload data for the samesecurity. To do this, one of two keywords can be used instead of a stock identifier for thisparameter. The subroutines with character strings for identifiers use NEXT or SAME, while theroutines expecting integers use INEXT or ISAME.

NEXT or INEXT

NEXT or INEXT finds the next highest security in the file, using the last direct call from thatsubroutine as the point of reference and moving in increasing order according to the identifierassociated with the subroutine. If there is no current security or the current security was readusing a different identification, the first security in the file will be retrieved. If the end of thefile is reached in this way, the subroutine will return to the line number specified in the fourthparameter.

22

Page 37: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

SAME or ISAME

SAME or ISAME will load data for the current security. It can be used for reading in additionaldata for the same security.

WANTED WANTED is an integer expression that determines what data to load. Header variables such asidentifiers and data range variables are always loaded. All other sets of variables are only loaded iftheir associated keyword is passed to get routine. More than one set can be retrieved at once byadding keywords. For example, if WANTED is INFOS + RETURNS (=5), the get routine willload all names, distributions, shares, delistings, Nasdaq information and returns for the security itfinds. The keywords, integer values of the keywords, and the associated variables in the sets aredescribed in the table below.

Users with the CRSP US Stock, Treasury Indices and Portfolio Assignments Database can furtherselect the portfolio data and Optional Time Series Data loaded using FORTRAN variableparameters.

If WANTED includes SDEVS or BETAS or EVERYXS, the SXRET and/or BXRET arrays will beloaded with excess returns calculated using the issue’s return and the portfolio assignmentsspecified in the PORTTYPE2 and PORTTYPE3 variables respectively. If WANTED includesYEARLY, the Portfolio Assignment and Portfolio Statistic Arrays, PRTNUM and YRVALrespectively, will be loaded based on Portfolio Types set in the PORTTYPE1, PORTTYPE2 andPORTTYPE3 variables. CAP arrays are loaded according to PORTTYPE1, SDEV arrays are loadedaccording to PORTTYPE2, and BETA arrays are loaded according to PORTTYPE3. See the CRSPData Definitions and Coding Schemes Guide for index methodologies, portfolio type numbers, andassociated index groups of available market segment series.

Only one of SDEVS or BETAS can be loaded into the Optional Time Series 1 Data slot, and RETXSor PRICE2S can be loaded into the Optional Time Series 2 Data slot. If EVERY or EVERYXS isused, no other keywords are allowed.

Keyword Value Variable(s) Loaded

NONE 0/HEADER/ common block; PERMNO, CUSIP, NUMNAM, BEGDAT, etc.(always loaded)

INFOS 1 /INFO/ common block; NAMES, DISTS, SHARES, DELIST, NASDINPRICES 2 PRC array in /DDATA/ common blockRETURNS 4 RET array in /DDATA/ common blockBIDS 8 BIDLO array in /DDATA/ common blockASKS 16 ASKHI array in /DDATA/ common blockVOLUMES 32 VOL array in /DDATA/ common blockSDEVS 64 SXRET array in /ADATA/ common blockBETAS 128 BXRET array in /ADATA/common blockYEARLY 256 YRVAL and PRTNUM yearly arrays in /ADATA/ common blockRETXS 512 RETX array in /ADATA/ common blockPRICE2S 1024 PRC2 array in /ADATA/ common block ALL 7 short for INFOS+PRICES+RETURNSEVERYXS 511 short for ALL+SXRET+BXRET+BIDS+ASKS+VOLUMES+YEARLYEVERY 1855 short for ALL+BIDS+ASKS+VOLUMES+RETXS+PRICE2S+YEARLY

23

Page 38: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

For example, the FORTRAN code

PORTTYPE2=8PORTTYPE3=6CALL USGETPER (1, INEXT, INFOS+YEARLY+BETAS+SDEVS,*900)

will load NYSE/AMEX beta portfolio data to PRTNUM(3,I) and YRVAL(3,I), where I is betweenBEGYR and ENDYR, and NYSE/AMEX Beta Excess Returns to BXRET(J), where J is between BEGBXSand ENDBXS. It will also load Nasdaq beta portfolio data to PRTNUM(2,I) and YRVAL(2,I), where Iis between BEGYR and ENDYR, and Nasdaq Beta Excess Returns to SXRET(J), where J is betweenBEGSXS and ENDSXS. PRTNUM(1,I) and YRVAL(1,I) are loaded with NYSE/AMEX/NasdaqCapitalization Decile Portfolio Information.

24

Page 39: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

*RETURN RETURN is the line number the program returns to if the end of the file is reached on a sequential call. Forinstance, *900 will return to the line labeled 900.

The five USGETxxx get subroutines are differentiated only by the identifier (ID) they use to access the file. Theyare:

USGETPER(FILE,PERMNO,WANTED,*RETURN)

USGETPER loads the header and selected data for a security chosen according to a security’s PERMNO.

USGETPCO(FILE,PERMCO,WANTED,*RETURN)

USGETPCO loads the header and selected data for a security chosen according to a security’s PERMCO.PERMCOs are integers common to all issues of the same company and are not affected by name changes.To find all issues with a certain PERMCO, read the first directly and then make sequential reads withINEXT until a security with a different PERMCO is found.

USGETCUS(FILE,CUSIP,WANTED,*RETURN)

USGETCUS loads the header and selected data for a security chosen by its CUSIP where CUSIP is theactive CUSIP of the security of interest. CUSIPs are eight character strings that roughly alphabetize thesecurity names in the files.

USGETHIS(FILE,HISTORICAL CUSIP,WANTED,*RETURN)

USGETHIS loads the header and selected data for a security chosen by historical CUSIP. All historicalCUSIPs are preserved with their corresponding name structures in an issue’s name history. Data for anissue can be found by calling USGETHIS with either an old CUSIP from its name history or its currentCUSIP. If no CUSIPs were assigned by the CUSIP bureau, the issue can be accessed by the dummyheader CUSIP assigned by CRSP. The FORTRAN variable HCUSIP will contain the historical CUSIPfound.

USGETSIC(FILE,SIC CODE,WANTED,*RETURN)

USGETSIC loads the header and selected data for a security chosen according to a security’s historicalStandard Industrial Classification (SIC) Code (SICCD). SIC codes classify stocks according to industry.INEXT and ISAME get the next and same stocks respectively. Since there are duplicate SIC codes,INEXT must be used to find all stocks with a certain code. All direct calls will find the first stock inPERMNO order with that code. To read all stocks with a certain SIC code, read the first directly and thenmake sequential reads with INEXT until a security with a different SICCD is found. It is possible to finda security under more than one SIC code.

25

Page 40: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

CRSP Access Subroutines for Nasdaq Data

There are two access subroutines that are used to load Supplemental Nasdaq data arrays for a Nasdaq security afterother stock data has been loaded. NMSOPEN is used with the command CALL NMSOPEN immediately following acall to UDOPEN or UMOPEN in a program. GETNMS is used with the command CALL GETNMS(PERMNO)immediately following a call to USGETxxx in a program. The program must have the include statement:

include ’nms.txt’

following the include statement for us.txt in the program.

NMSOPENNMSOPEN opens Nasdaq data files containing supplemental Nasdaq Bid, Ask, and Number ofTrades data variables. The security stock data must be opened previously with a call to UDOPEN orUMOPEN.

GETNMS(PERMNO)GETNMS retrieve the header variables Begin and End Index of Nasdaq Data (BEGNMS andENDNMS) and loads the Bid (NMSBID), Ask (NMSASK) and Number of Trades, Nasdaq(NMSTRD) arrays for the Nasdaq security by PERMNO. The security stock data must have beenloaded previously with a call to one of the stock access routines USGETxxx.

26

Page 41: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

le. Eachponding

s ofameters

. The

ed to

FORTRAN Access Subroutines for Indices Data

There are six indices access subroutines available — UDCAL,UMCAL,UDDEC,UMDEC,UMCAP, and UMCTI. Theindices access subroutines use parameters to specify a certain set of data if more than one set is availabsubroutine loads one of the indices common blocks, and overwrites and data previously loaded in the correscommon block. The subroutines are used in a program with the syntax CALL subname(parameterlist).These files are contained in the CRSP US Stock, Treasury Indices and Portfolio Assignments Database.

UDCAL (Keyword)UMCAL (Keyword)

UDCAL loads daily data into the /CAL/ common block, and UMCAL loads monthly data in to the/CAL/ common block. All three of the individual exchanges and the two combinationexchanges are supported. The keywords or their integer equivalents can be used as par

A call to UDCAL and UMCAL will overwrite what is currently in the /CAL/ common block,even if it was loaded by a call to UDOPEN or UMOPEN.

UDDEC (Exchange,Type)

UDDEC loads all daily index levels and the ten market segment returns into the /DECILE/common block. The keywords or their integer equivalents can be used as parametersfollowing eight combinations of the arguments are supported for daily data.

A call to UDDEC reinitializes and loads large arrays, and programs should be organizminimize the number of calls made.

Keyword Value Data Loaded

NYSEAMEX 1. NYSE and AMEX CombinedNASDAQ 2 Nasdaq onlyNYSEAMEX+NASDAQ 3 NYSE/AMEX and Nasdaq CombinedNYSE 4 NYSE onlyAMEX 5 AMEX onlySP500 6 Indexes on the S&P 500 Universe

Exchange TypesKeyword Value Keyword Value Data Loaded

NYSEAMEX 1 CAP 1 NYSE/AMEX Capitalization DecilesNYSEAMEX 1 BETA 3 NYSE/AMEX Beta DecilesNYSEAMEX 1 SDEV 2 NYSE/AMEX Std. Deviation DecilesNASDAQ 2 CAP 1 Nasdaq Capitalization DecilesNASDAQ 2 BETA 3 Nasdaq Beta DecilesNASDAQ 2 SDEV 2 Nasdaq Std. Deviation DecilesNYSEAMEX +NASDAQ 3 CAP 1 Combined Capitalization DecilesNYSE 4 CAP 1 NYSE-only Capitalization DecilesAMEX 5 CAP 1 AMEX-only Capitalization Deciles

27

Page 42: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

UMDEC (Exchange,Type)

UMDEC loads all monthly index levels and the ten market segment returns into the /DECILE/common block. The keywords or their integer equivalents can be used as parameters. Thefollowing four combinations of the arguments are supported for monthly data.

A call to UMDEC reinitializes and loads large arrays, and programs should be organized tominimize the number of calls made.

UMCAP (Keyword)

UMCAP loads all monthly CRSP Cap-Based counts, returns and associated index levels into the/CAPBAS/ common block. These keywords or their integer equivalents can be used asparameters. No daily Cap-Based data are available.

A call to UMCAP reinitializes and loads large arrays, and programs should be organized tominimize the number of calls made.

UMCTI

UMCTI loads all monthly CTI returns and associated index levels into the /CTIS/ commonblock. There are no arguments to this subroutine. No daily CTI data are available.

Exchange TypeKeyword Value Keyword Value Data Loaded

NYSEAMEX 1 CAP 1 NYSE/AMEX Capitalization DecilesNASDAQ 2 CAP 1 Nasdaq Capitalization DecilesNYSEAMEX+ NASDAQ 3 CAP 1 Combined Capitalization DecilesNYSE 4 CAP 1 NYSE-only Capitalization DecilesAMEX 5 CAP 1 AMEX-only Capitalization Deciles

Keyword Value Data Loaded

NYSEAMEX 1 NYSE and AMEX CombinedNYSEAMEX+NASDAQ 3 NYSE/AMEX and Nasdaq National Market CombinedNYSE 4 NYSE only

28

Page 43: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

Stock FORTRAN Utility Subprograms — Utility Functions and Subroutines

There are two types of FORTRAN utility subprograms, general and print utilities, which can be used to assist in theprocessing of CRSP data. Utility functions provide simpler access to several variables or allow convenient methodsof performing common operations. The functions have return values and can be nested and used with each other. Forexample, the expression CUSP(CURNAM(19750512)) can be used to find the CUSIP that was valid on May 12,1975 for a security, and CURSHR(INDXDT(19750512)) can be used to find the shares outstanding on that samedate.

Subroutines are used in programs with the syntax

CALL subname(parameterlist)

Functions do not use the CALL syntax, but act the same as variables and can be used in any statement as if they arearrays. In the following descriptions, functions are indicated with a return value type at the right of the functionname.

The utility subroutines include a subroutine to find the subscript into the trading calendar of a date and varioussubroutines to print a type of stock data for a security. The first parameter to the print subroutines is the unit numberof the file to which the output is written and can be set to print the output to the terminal. Other parameters may limitthe range of data that is printed. Date fields in subroutines and functions can be affected by the DATEFFLAGvariable. If it is set to 1, internal dates are YYYYMMDD format. Otherwise, the flag is set to 0 and the dates are inthe YYMMDD format.

General Utilities

COMPND (RETV, BEGIND, ENDIND) REAL

The function COMPND compounds the return in array RETV between period BEGIND and periodENDIND and returns the value to the calling program. Any REAL*4 array can be passed toCOMPND. Therefore, returns for a security compounded over its entire trading range can becalculated with COMPND(RET, BEGRET, ENDRET) and the value-weighted market returnwith dividends for periods 1 to 100 can be calculated with COMPND (VWRETD, 1, 100).

The function skips any missing returns ( < -1.0) and returns -99 for a period with only missingreturns. It returns -99 if an error condition occurs.

COMPNM (I) CHARACTER*32

The function COMPNM returns the company name from the Ith name structure in CHARACTER*32format. In this format it can more easily be printed and compared than it can when stored in thename structure in its eight component parts.

CURDIS (DISTDT, BEGDT, ENDDT) INTEGER

The function CURDIS can be used to find all distribution structures where a distribution date fallsbetween the given date range. The DISTDT argument is the date type of interest; it can be eitherDCLRDT (declare date), EXDT (ex-distribution date), RDRDDT (record date), or PAYDT (paymentdate). BEGDT and ENDDT are the beginning and ending dates respectively and are coded asYYYYMMDD (or YYMMDD if DATEFFLAG is set to 0). All distributions within the specifieddate range can be found by repeatedly calling CURDIS until a value of 0 is returned.

A return value of -1 indicates an error in the arguments passed to the function.

29

Page 44: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

CURNAM (DATE) INTEGER

The function CURNAM returns the subscript of the name structure that is in effect on the passeddate. If the date passed is before the first name structure, it will return 1, and if the date passed isafter the last name structure, it will return the Number of Name Structures (NUMNAM).

The DATE passed is an integer in YYYYMMDD format (or YYMMDD if DATEFFLAG is set to0). It is left to the user to check the validity of the argument.

CURNDI (BDATE) INTEGER

The function CURNDI returns the Nasdaq information structure associated with the passed dateBDATE for the current security. If the date is outside the information structure range a 0 is returned.If an exact match is not found the index of the previous valid date is returned.

The BDATE passed is an integer in YYYYMMDD format (or YYMMDD format if DATEFFLAG isset to 0). It is left to the user to check the validity of the argument.

CURSHR (SIND) INTEGER

The function CURSHR returns a shares outstanding figure at the end of the period indicated for thecurrent security. It uses all shares outstanding numbers in the shares structure, including declaredshares and shares computed from distributions. If SIND is outside of the trading range of thecompany CURSHR returns zero. This function assumes that the shares outstanding values during aperiod where the security is off the exchange are equal to the value on the last date before it left theexchange. The exchange code and dates in the NAMES array can be used to find these ranges.

SIND is the index to date desired. Therefore the capitalization in 1000’s for a security on day I isABS(PRC(I)) * CURSHR(I), and the date is CALDT(I).

CUSP (I) CHARACTER*8

The function CUSP returns the (historical) CUSIP from the Ith name structure in CHARACTER*8format. In this format it can be compared in programs with the CUSIP Identifier, Header.Normally this CUSIP is stored in memory as two different CHARACTER*4 values,CNAMES(NCUSIP,I) and CNAMES(NCUSIP+1,I). The function reports an error if I is notbetween 1 and NUMNAM.

EXRDAT (EXCODE, BEGIND, ENDIND, RARRAY, MISVAL)

The subroutine EXRDAT restricts a REAL data array, RARRAY, by placing a passed REAL*4missing value, MISVAL, into the array on dates while on an exchange not specified by EXCODE,and between the passed index range, BEGIND and ENDIND. The beginning and ending indices arealso adjusted if necessary.

EXCODE is a binary code where 1 = NYSE, 2 = AMEX, 4 = Nasdaq, and combinations ofexchanges are represented by the sum of these codes. It calls the logical function VALEXC todetermine whether the exchange code of a name structure is valid according to the given exchangecode.

30

Page 45: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

EXRINF (EXCODE)

The subroutine EXRINF restricts the distributions, shares, delistings, and Nasdaq informationevent arrays in the /INFO/ common block by erasing all structures pertaining to dates not on adesired exchange or exchanges. The desired exchanges are specified by EXCODE, which is definedabove in EXRDAT.

EXRINT (EXCODE, BEGIND, ENDIND, IARRAY, MISVAL)

The subroutine EXRINT is analogous to EXRDAT, but operates on an INTEGER array, IARRAY,and is passed an INTEGER missing value, MISVAL.

FINDDT (DATE, DATOPT) INTEGER

Function FINDDT is used to search the calendar for the index of a given date. DATE is the desireddate, an integer in YYYYMMDD format (or YYMMDD if DATEFFLAG is set to 0). FINDDTreturns the index of DATE in the calendar. If DATOPT is 0 and DATE is not a trading date, FINDDTreturns 0.

If DATOPT is -1 and DATE is not found, FINDDT will return the index of the first trading date lessthan DATE. If DATE is higher than any dates in the calendar FINDDT will return NDAYS and ifDATE is less than any dates in the calendar FINDDT will return 0.

If DATOPT is 1 and DATE is not found, FINDDT will return the index of the next day in thecalendar. If DATE is less than any dates in the calendar FINDDT returns 1. If DATE is greater thanany dates in the calendar, FINDDT returns 0.

INDXDT (DATE) INTEGER

Function INDXDT is used to search the calendar for the index of a given date. DATE is the desireddate, an integer in YYYYMMDD format (or YYMMDD if DATEFFLAG is set to 0). INDXDTreturns the index of DATE in the calendar. If DATE does not appear in the calendar, the index to thenext higher value will be returned. For example, if a Sunday is passed to the function, it will returnthe index to Monday. Check

CALDT (INDXDT(DATE)).EQ.DATE

to see if the date was found. If DATE is outside the range of the calendar, INDXDT will return zero.

LOADBA The subroutine LOADBA loads only bid, ask and bid-ask average price data into the price arraysPRC, BIDLO, and ASKHI. The Nasdaq bids and asks are also loaded by LOADBA into the BIDLOand AKSHI arrays if they are available. Other prices that represent actual highs, lows and close arereplaced with missing values. This subroutine uses no parameters, and expects the regular pricearrays from the stock master file and the Nasdaq data arrays to be already loaded.

LOADHL The subroutine LOADHL restricts the price arrays PRC, BIDLO and ASKHI to only trade priceinformation. If the only price information and date is a bid and ask price, LOADHL sets PRC,BIDLO and ASKHI on that date to missing values. This subroutine uses no parameters, andexpects the regular price arrays from the stock master file to be already loaded.

31

Page 46: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

NAMRNG (IND, BIND, EIND)

The subroutine NAMRNG determines the index ranges in the trading calendar corresponding to aname structure. IND is the number of the name structure. The beginning and end of the range areloaded into BIND and EIND by the subroutine. If the name structure given is the last one, ENDDATis used for the end of the range. Otherwise, the first trading date preceding the start of the nextname structure is used. If the name structure precedes the calendar or is outside the range of thesecurity’s data, BIND and EIND are set to zero.

TICK (I) CHARACTER*8

The function TICK returns the ticker from the Ith name structure in CHARACTER*8 format byconcatenating the variables CNAMES(TICKER,I) and CNAMES(TICKER+1,I). NYSE andAMEX tickers are always less than 4 characters, so the variable NAMES(TICKER,I) is sufficientfor securities on those exchanges, where as Nasdaq tickers may have 5 characters, and if so, requireconcatenation.

USDATE (DATEIN, DATEOUT, DATEINDEX)

Subroutine USDATE looks for a date in the trading calendar and finds the subscript of that date. Ifthe date does not match a date in the calendar, the next higher date is found. If the date is greaterthan any in the calendar, the latest date in the calendar is found. DATEIN is the YYYYMMDD (orYYMMDD if DATEFFLAG is set to 0) date to be found. DATEOUT is the YYYYMMDD orYYMMDD date found by the subroutine. DATEINDEX is an integer such thatCALDT(DATEINDEX) = DATEOUT. If DATEIN does not equal DATEOUT, DATEIN wasnot a trading date. The variables or constants used for the three parameters must be integers.

VALEXC (EXHAVE, EXWANT) LOGICAL

The VALEXC returns TRUE if the exchange is valid given the binary exchange code, otherwise itreturns FALSE. EXHAVE is a CRSP exchange code from a name structure,NAMES(EXCHCD,I). EXWANT is a binary code where 1 = NYSE, 2 = AMEX, 4 = Nasdaq,and combinations of exchanges are represented by the sum of these codes.

32

Page 47: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

Stock FORTRAN Print Utility Subroutines

OUTHDR (JUNIT)

Subroutine OUTHDR writes out the values of all the variables in the /HEADER/ common block. Itwrites the output to the file described by the argument JUNIT.

OUTNAM (JUNIT, FIRST, LAST)

Subroutine OUTNAM writes out the name structures found in the /INFO/ common block. It writesall the fields for all the name structures, from the FIRST to the LAST, to the file described by theargument JUNIT. If FIRST is less than one or LAST is greater than NUMNAM or FIRST is greaterthan LAST, nothing will be printed. DATEFFLAG must be set to 1 for OUTNAM to work properly.

OUTDIS (JUNIT, FIRST, LAST)

Subroutine OUTDIS is similar to OUTNAM except that it writes distribution structures found in the /INFO/ common block with possible ranges from 1 to NUMDIS. DATEFFLAG must be set to 1 forOUTDIS to work properly.

OUTSHR (JUNIT, FIRST, LAST)

Subroutine OUTSHR is similar to OUTNAM except that it writes share structures found in the /INFO/common block with possible ranges from 1 to NUMSHR. DATEFFLAG must be set to 1 forOUTSHR to work properly.

OUTDEL (JUNIT, FIRST, LAST)

Subroutine OUTDEL is similar to OUTNAM except that it writes delisting structures found in the /INFO/ common block with possible ranges from 1 to NUMDEL. DATEFFLAG must be set to 1 forOUTDEL to work properly.

OUTNDI (JUNIT, FIRST, LAST)

Subroutine OUTNDI is similar to OUTNAM except that it writes Nasdaq information structuresfound in the /INFO/ common block with possible ranges from 1 to NUMNDI. DATEFFLAG mustbe set to 1 for OUTNDI to work properly.

OUTNMS (JUNIT, BIND, EIND, STEP)

Subroutine OUTNMS writes values of the daily Nasdaq data arrays in the /NMSDAT/ common block.Each line has a date followed by various data values for that date. The order of the columns isclosing bid, closing ask and the number of trades. It writes from index BIND to index EIND, sooutput will range from the beginning date CALDT(BIND) to the ending date CALDT(EIND).The STEP parameter determines how often the values are written. If STEP is 1, all dates in therange will be printed. DATEFFLAG must be set to 1 for OUTNMS to work properly.

33

Page 48: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

OUTDAT (JUNIT, BIND, EIND, STEP)

Subroutine OUTDAT writes values of the daily or monthly data arrays in the /DDATA/ block. Eachline has a data followed by various data values for that date. The order of the columns is ClosingPrice or Bid/Ask Average, Bid or Low Price, Ask or High Price, Volume Traded, and Returns. Itwrites from index BIND to index EIND, so output will range from the beginning dateCALDT(BIND) to the ending date CALDT(EIND). The STEP parameter determines how oftenthe values are written. If STEP is 1, all dates in the range will be printed. DATEFFLAG must beset to 1 for OUTDAT to work properly.

OUTINT (JUNIT, TITLE, IARRAY, BIND, EIND, STEP)

Subroutine OUTINT writes out the date and the data in the passed IARRAY from BIND to EIND,

printing every STEPth one. The array passed must be an INTEGER array. If the TITLE argumentis left blank (’’) then no header line is printed. It writes the output to the file described by theargument JUNIT. DATEFFLAG must be set to 1 for OUTINT to work properly.

OUTONE (JUNIT, TITLE, ARRAY, BIND, EIND, STEP)

Subroutine OUTONE writes out the date and the data in the passed ARRAY from BIND to EIND,

printing every STEPth one. The array passed must be a REAL array. If the TITLE argument is leftblank (’’) then no header line is printed. It writes the output to the file described by theargument JUNIT. DATEFFLAG must be set to 1 for OUTONE to work properly.

OUTYR (JUNIT, BIND, IND)

Subroutine OUTYR writes out the year and the yearly data in the PRTNUM and YRVAL arrays.BIND and EIND must be in the range of BEGYR to ENDYR. BIND + FSTYR (=1924) is theactual beginning YYYY year of the range that will be printed, and EIND + FSTYR is the actualending YYYY year of the range. OUTYR prints headers based on capitalization portfolios loaded inthe Portfolio Assignment and Statistic Arrays (PRTNUM and YRVAL).

OUTPRM Subroutine OUTPRM writes out the parameters specifying array sizes that are declared in theinclude file parmsfl.txt. This subroutine uses the FORTRAN PRINT statement.

OUTSUB Subroutine OUTSUB writes out the subscript parameters to the information structure arrays andauxiliary data arrays declared in the include file subsfl.txt. This subroutine uses the FORTRANPRINT statement.

34

Page 49: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

2.3 Suggestions For FORTRAN Programs Using CRSP Stock Data

CRSP stock data files total over a gigabyte of data. Different daily data account for most of this total. Because of thelarge amounts of data, care should be taken to make programs as efficient as possible so that only the data actuallyneeded is read by the program, and is only loaded once.

Some simple rules can be followed in programs that use CRSP stock access routines that may greatly improve theperformance of your programs:

Y The wanted parameter should be used to get only the information that you need. If you are only going to usethe name of the company and the returns, use INFOS+RETURNS. Only use ALL or EVERY if necessary.

Y SAME or ISAME can be used if some header information is checked before using the data for a company.For example, to get prices and distributions for all NYSE/AMEX companies that traded after a certain date,use the statements

CALL USGETPER (1,INEXT,NONE,*900)IF (CALDT(BEGDAT) .GE. YOURSTART) thenCALL USGETPER (1,ISAME,INFOS+PRICES,*900)...END IF

The first statement reads the header information for the next security. The price and distributions areretrieved and the security processed only if the security traded during the desired range.

Y The files are organized by security, so try to do all processing for one security before going on to another ifpossible. A program should never have to read the same data for the same company more than once. If astudy is acting on many securities cross-sectionally by date, one program can dump data for other programsto sort and continue processing.

Y There should be as few calls to UDOPEN or UMOPEN as possible in a program. A call to UDOPEN at thebeginning of a program is sufficient to open all the daily data files. Any other calls to UDOPEN only switchthe default market indices. A program that needs to switch back and forth between different file’s marketindices should copy the needed market indices to local arrays.

Y The data are physically ordered by PERMNO, so a sequential pass is much more efficient if done in PERMNOorder using USGETPER.

Y The utility routines can be added to programs to make manipulation of the data easier. They includefunctions that make it easier to retrieve data or perform a common operation on the data and outputsubroutines that print formatted output for the different data types. For example, the CURSHR function canbe used to easily calculate capitalizations. Check the utility subprogram subsection for a listing of thefunctions and routines available.

35

Page 50: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

There are certain variables or parameters defined as useful constants that may make programs easier to use.

NDAYS A variable containing the number of days in the current calendar. CALDT(NDAYS) is the lastdate in the calendar.

NASBEG The first subscript of the calendar that holds Nasdaq daily data (=2610).

MNASBEG The first subscript of the calendar that holds Nasdaq monthly data (=565).

MAMEXBEG- The first subscript of the calendar that holds AMEX monthly data (=440).

FSTYR Used to find the corresponding YYYY year for portfolio data (=1924). YRVAL(CAP,I)corresponds to capitalization for the year FSTYR + I.

FSTYR2 Used to find the corresponding YY year for portfolio data (=24). YRVAL(CAP,I)corresponds to capitalization for the year FSTYR + I + 1900.

MESSAGEFLAG A flag set to 0 by default that can be reset to 1 to suppress access function messages.

DATEFFLAG A flag set to 1 by default is in the date format YYYYMMDD. It can be reset to 0 to changedate variables to the YYMMDD date format.

TONLYFLAG A flag set to 0 by default that can be reset to 1 to use trade-only prices for excess returncalculations. (Use only for daily NYSE/AMEX data).

PORTTYPE1 Sets the portfolio type to be loaded into the CAP or First Portfolio dimension of the PortfolioAssignment and Statistic Arrays (PRTNUM and YRVAL). The default is 1, the NYSE/AMEX/Nasdaq Market Capitalization Deciles. See Index Methodologies section in the CRSP DataDefinitions and Coding Schemes Guide for a list of available portfolio types and associatedindex groups by product. See the WANTED parameter of the GET routines for usage with dataload subroutines.

PORTTYPE2 Sets the portfolio type to be loaded into the SDEV or Second Portfolio dimension of thePortfolio Assignment and Statistic Arrays (PRTNUM and YRVAL). If excess returns areloaded, this determines the market segment index group used in calculations for loading theSXRET array. This is set to 0 by default.

PORTTYPE3 Sets the portfolio type to be loaded into the BETA or Third Portfolio dimension of the PortfolioAssignment and Statistic Arrays (PRTNUM and YRVAL). If excess returns are loaded, thisdetermines the market segment index group used in calculations for loading the BXRET array.This is set to 0 by default.

36

Page 51: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

2.4 FORTRAN Access on Supported Systems

Windows Systems

CRSP supports FORTRAN on Windows NT, Windows 98, and Windows 95. All FORTRAN was compiled andtested using the Digital Visual FORTRAN 5.0 and 6.0 compilers. Ordinary sample FORTRAN usage links to theobject library, but does not require using or compiling C code. Linking to FORTRAN 5.0 does require that the VisualC++ compiler is installed on the system. Linking to FORTRAN 6.0 does not require that the Visual C++ compiler isinstalled on the system. The LIB environment variable must have the C library before the FORTRAN library, forexample:

LIB=C:\Program Files\DevStudio\VC\LIB;C:\Program Files\DevStudio\DF\LIB

CRSP access depends on environment variables set during installation. Environment variables can be used onWindows systems with the name enclosed with % characters (%name%). See the CRSPAccess97 Installation Guidefor instructions on installation and a full list of environment variables, and how to load data and to configure thesystem. The set command can be used in a command prompt window to find available environment variables.

Important CRSP files or folders can be found with the following names:

%crsp_bin%- executable programs and batch files. This folder is in the PATH so programs can be run from anydirectory. Executable versions of the sample programs can be found in this folder.

%crsp_lib%- folder containing CRSP object library and internal files.

%crsp_lib%\crsp_lib.lib- CRSP object library

%crsp_include%- folder containing CRSP FORTRAN header files referred to by INCLUDE statements

%crsp_sample%- folder containing CRSP sample programs

%crsp_mstk%- folder containing monthly CRSP stock and indices databases

%crsp_dstk%- folder containing daily CRSP stock and indices databases

Following is an example of modifying a sample FORTRAN program using Digital Visual FORTRAN andcommand prompt commands:

Copy the file to a local directory using the Windows Explorer utility or the copy command, or use DigitalVisual FORTRAN Developers Studio to open the file and save to a new location with Save As.

Edit the file with the Digital Visual FORTRAN Developers Studio editor and compile and run commands ina command prompt window:

Y df /I %crsp_include% ussampl.for %crsp_lib%\crsp_lib.lib

Y ussampl

Sample programs can be compiled and linked directly using Digital Visual FORTRAN . Make sure the includedirectory and library options are set in the project tools options.

Sample programs can also be compiled and linked at the command prompt using the nmake utility. A sampledescription file, for_samp.mak, exists in the %crsp_sample% directory. To use the sample description file, copyit to your program directory, modify it to include your program, and run with the command

nmake for_samp.mak

37

Page 52: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

Unix Systems

CRSP currently supports four versions of Unix.

FORTRAN Unix Systems Support

Sun FORTRAN was compiled and tested using the above compilers. FORTRAN library functions interface with Cfunctions in the CRSP object library. Ordinary sample FORTRAN usage links to the object library, but does notrequire compiling C programs.

CRSP access depends on environment variables set during installation. Environment variables can be used on Unixwith the name preceded by the $ symbol. All file names and environmental variable names are case sensitive onUnix systems. The env command can be used in a terminal window to find available environment variables.

Important CRSP files or directories can be found with the following names:

$CRSP_BIN- directory containing executable programs and shellscripts files. This directory is in the PATH soprograms can be run from any directory. Executable versions of the sample programs can be found in this directory

$CRSP_LIB- directory containing CRSP object library and internal files.

$CRSP_LIB/crsplib.a- CRSP object library

$CRSP_INCLUDE- directory containing CRSP FORTRAN header files referred to by INCLUDE statements

$CRSP_SAMPLE- directory containing CRSP sample programs

$CRSP_MSTK- directory containing monthly CRSP stock and indices databases

$CRSP_DSTK- directory containing daily CRSP stock and indices databases

Following is an example of modifying and running a sample FORTRAN program with Sun Solaris andCompaq Tru64 Unix (formerly Digital Unix):

Y cp $CRSP_SAMPLE/ussampl.for (copy the program to a local directory)

Y chmod 660 ussampl.for (make sure you have write access to modify the program)

Y vi ussampl.for (or other available editor, to insert alternate processing)

Y f77 ussampl.for -o ussampl -I$CRSP_INCLUDE $CRSP_LIB/crsplib.a

Y ussampl to run the program. Make sure any input files exist and output files do not already exist.

Following is an example of modifying and running a sample FORTRAN program with HP/UX:

Y cp $CRSP_SAMPLE/ussampl.for ussampl.f (copy the program to a local directory andchange the file extension to *.f)

Y chmod 660 ussampl.f (make sure you have write access to modify the program)

Y vi ussampl.f (or other available editor, to insert alternate processing)

Y f90 ussampl.f -o ussampl -I$CRSP_INCLUDE $CRSP_LIB/crsplib.a

Y ussampl to run the program. Make sure any input files exist and output files do not already exist.

Sun Solaris Compaq Tru64 HP/UX AIXCPU Sun Sparc Alpha PARISC Power PCOperating System Solaris 2.5 Compaq Tru64 HP/UX AIX 4.1.4FORTRAN Compiler SparcCompiler FORTRAN 4.0 DEC FORTRAN f90 f77C Compiler SparcCompiler C 4.0 DEC C 6.0-001 gnu C Compiler cc

38

Page 53: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 2. Accessing Data in FORTRAN

FO

RT

RA

N P

rogramm

ing

Following is an example of modifying and running a sample FORTRAN program with AIX:

Y cp $CRSP_SAMPLE/ussampl.for ussampl.f (copy the program to a local directory andchange the file extension to *.f)

Y chmod 660 ussampl.f (make sure you have write access to modify the program)

Y vi ussampl.f (or other available editor, to insert alternate processing)

Y f77 -w -I$CRSP_INCLUDE $CRSP_LIB/crsplib.a -o ussamp1 ussamp1.f -lm

Y ussampl to run the program. Make sure any input files exist and output files do not already exist.

Sample programs can also be compiled and linked using the make utility. The sample program directory$CRSP_SAMPLE contains sample make description files for Sun Solaris, Compaq Tru64 (formerly Digital Unix),HP/UX, and AIX in for_samp.mk, for_samp.mk2, for_samp.mk3, and for_samp.mk4 respectively. To usemake, copy the relevant description file to your program directory, edit it to support the program(s) of interest andcreate local executables, and run with the command

make -f filename

See installation instructions for Unix in the CRSPAccess97 Installation Guide to load data and configure the systemto support CRSP data.

39

Page 54: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

OpenVMS Systems

CRSP supports OpenVMS on Digital Alpha machines. FORTRAN was compiled and tested using DEC FORTRAN7.1. FORTRAN library functions interface with C functions in the CRSP object library. C Functions were compiledwith DEC C 6.0-001. Ordinary sample FORTRAN usage links to the object library, but does not require compiling Cprograms.

CRSP access depends on logicals set during installation. The show logical crsp* command can be used in aterminal window to find available CRSP logicals.

Important CRSP files or directories can be found with the following names:

CRSP_BIN:- directory containing executable programs and batch files.

CRSP_LIB:- directory containing CRSP object library and internal files.

CRSP_LIB:libcfortran.olb and CRSP_LIB:libcrsp. olb - CRSP object libraries

CRSP_INCLUDE:- directory containing CRSP header files referred to by INCLUDE statements

CRSP_SAMPLE:- directory containing CRSP sample programs

CRSP_MSTK:- directory containing monthly CRSP stock and indices databases

CRSP_DSTK:- directory containing daily CRSP stock and indices databases

Following is an example of modifying and running a sample FORTRAN program with OpenVMS:

- copy crsp_sample:ussampl.for [] (copy the program to a local directory)- eve ussampl.for (or other available editor, to insert alternate processing)- fortran ussampl /include=CRSP_INCLUDE: /float=IEEE_FLOAT- link ussampl + crsp_lib:libcfortran/lib + crsp_lib:libcrsp/lib +

- crsp_lib:deccopts.opt/opt- run ussampl to run the program. Make sure any needed input files exist

Sample programs can also be compiled and linked using the MMS layered product. A sample description file,FOR_SAMP.MMS, exists in the CRSP_SAMPLE: directory. To use the sample description file, copy it to yourprogram directory, modify it to include your program, and run with the command

mms / description=for_samp

See installation instructions for OpenVMS in the CRSPAccess97 Installation Guide to load data and configure thesystem to support CRSP data.

40

Page 55: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

C P

rogramm

ing

Chapter Three: Accessing Data in COVERVIEW

This section provides the technical information needed to access CRSPAccess97 Stock and Indices data using the Cprogramming language.

The CRSPAccess97 Files are implemented as CRSPDB databases. The CRSPDB format supports random access withmultiple key identifiers, and allows selective access into the available data items. C programs can be linked to CRSPlibrary functions and structures defined in CRSP header files. Sample programs are provided as examples ortemplates for processing CRSP data.

In C programming all index series share a common structure and are accessed by indno, a unique permanent indexidentification number assigned by CRSP.

See the CRSP Data Definitions and Coding Schemes Guide for data item definitions, the CRSPAccess97 InstallationGuide for installation instructions, and the CRSPAccess97 Database Format - Utilities Guide for data tools availablewith the databases.

Inside

3.1 CRSPAccess97 C Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 41Data Organization for C Programming Data Objects Set Structures and Usage C Language Data Objects for CRSP Stock DataC Language Data Structure for CRSP Stock Data C Language Data Objects for CRSP Indices C Language Data Structure for CRSP Indices Data

3.2 C Sample Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 54C Header Files and Data Structures

3.3 CRSPAccess97 C Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Page 56Stock Access Functions Index Access Functions General Access Functions General Utility Functions Data Utility Functions

3.4 C Access on Supported Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Page 112Windows Systems Unix Systems OpenVMS Systems

Page 56: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97
Page 57: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ingin con-

CHAPTER 3. ACCESSING DATA IN C

3.1 CRSPAccess97 C Data Structures

C Programming allows complete support for CRSP databases, including random access on PERMNO, CUSIP andother header variables, and full support of all data items. There are sample programs, header files, and an objectlibrary available.

Data Organization for C Programming

The basic levels of a CRSPAccess97 database are the database, set type, set id, module, object, and array. They aredefined as follows:

Y Database (CRSPDB) is the directory containing the database files. A CRSPDB is identified by the database path.

Y Set Type is a predefined type of financial data. Each set type has its own defined set of data structures,specialized access functions, and keys. CRSPAccess97 stock databases support stock (STK) and index (IND)set types. A CRSPDB can include more than one set type.

Y Set Identifier (SETID) is a defined subset of a set type. SETIDs of the same set type use the same accessfunctions, structures, and keys, but have different characteristics within those structures. For example, dailystock sets use the same data structure and monthly stock sets, but time series are associated with differentcalendars. Multiple SETIDs of the same set type can be present in one CRSPDB.

Y Modules are the groupings of data found in the data files in a CRSPDB. Multiple data items can be present in amodule. Data is retrieved at a module level, and access functions retrieve data items for keys based on selectedmodules. Modules correspond to the physical data files.

Y Objects are the fundamental data types defined for each set type. There are three fundamental object types, timeseries (CRSP_TIMESERIES), event arrays (CRSP_ARRAY), and headers (CRSP_ROW). Objects containheader information such as counts, ranges, or associated calendars (CRSP_CAL) plus arrays of data for zero ormore observations. Some set types allow arrays of objects of one type. In this case, the number of availableobjects is determined by the SETID, and each of the objects in the list has independent counts, ranges, orassociated calendars.

Y Arrays are attached to each object. The array contains the set of observations and is the basic level ofprogramming access. An observation can be a simple data type such as an integer for an array of volumes, or acomplex structure such as for a name history. When there is an array of objects, there is a corresponding array ofarrays with the data.

Data Objects

There are four basic types of information stored in CRSP databases. Each is associated with a CRSP object structure.

1 Header Information. These are identifiers with no implied time component.

2 Event Arrays. Arrays can represent status changes, random events, or observations. The time of the event andrelevant information is stored for each observation. There is a count of the number of observations for each typeof event data.

3 Time Series Arrays. An observation is available for each period in an associated calendar. A beginning andending point of valid data are available for each type of time series data. Data is stored for each period in therange – missing values are stored as placeholders if information is not available for a period.

4 Calendar Arrays. Each time series is tied to an array of relevant time periods. This calendar is used junction with the time series arrays to attach times to the observations.

41

Page 58: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

An observation can be a simple value or contain multiple components such as codes and amounts. Time series,except Portfolios, are based on calendars which share the frequency of the database. In a monthly database, the timeseries are based on a month-end trading date calendar. In a daily database, the time series are based on a daily tradingdate calendar excluding market holdings. Portfolio calendars are dependent on the rebalancing methodology of thespecific portfolio type. All calendars are attached automatically to each wanted time series object when the databaseis opened.

There are four base CRSPAccess97 C structures called objects used in CRSPDBs. The following table contains eachof the objects in all caps, followed by the components, lower case and indented, that each object type contains. Alldata items are defined in terms of the following objects:

C Object Structures Table

OBJECT or Field Usage DATA TYPE

CRSP_ARRAY Structure for storing event-type dataobjtype object type code identifies the structure as a CRSP_ARRAY, always = 3 int

arrtype

array type code defines the structure in the array. Base C types or CRSP-defined structures each have associated codes defined in the constants header file int

subtypedata subtype code defines a subcategory of array data. Subtypes further differentiate arrays with common array type fields. int

size_of_array_width number of bytes in each array element int

maxarr maximum number of array elements containing valid data int

num number of array elements containing valid data int

dummy data secondary subtype code int

arr

object array is a pointer to the array containing the actual data. The array can be a base C data type or a CRSP-defined structure. Its size and type are determined by arrtype, size_of_array_width, and maxarr void *

CRSP_ROW Structure for storing header data

objtype object type code identifies the structure as a CRSP_ROW, always = 5 int

arrtype

array type code defines the structure in the array. Base C types or CRSP-defined structures each have associated codes defined in the constants header file int

subtypedata subtype code defines a subcategory of array data. Subtypes further differentiate arrays with common array type fields. int

size_of_array_width array structure size in bytes int

arr

object array is a pointer to the array containing the actual data. The array can be a base C data type or a CRSP-defined structure. Its size and type are determined by arrtype and size_of_array_width. The array size is always 1. void *

CRSP_TIMESERIES Structure for storing time series data

objtypeobject type code identifies the structure as a CRSP_TIMESERIES, always = 2 int

arrtype

array type code defines the structure in the array. Base C types or CRSP-defined structures each have associated codes defined in the constants header file int

subtypedata subtype code defines a subcategory of array data. Subtypes further differentiate arrays with common array type fields. int

size_of_array_width array structure size in bytes int

maxarr maximum number of array elements int

begfirst array index with valid data for the current record, or 0 if no valid range int

endlast array index with valid data for the current record, or 0 if no valid range int

caltype

calendar time period description code describes the type of time periods. Calendar Type (caltype) is always 2 indicating time periods are described in the Calendar Trading Date (caldt) array by the last trading date in the period. int

42

Page 59: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

cal

calendar associated with time series is a pointer to the calendar associated with the time series array. The calendar includes the matching period-ending dates for each array index. CRSP_CAL *

arr

object array is a pointer to the array containing the actual data. The array can be a base C data type or a CRSP-defined structure. Its size and type are determined by arrtype, size_of_array_width, and maxarr. void *

CRSP_CAL Structure for storing calendar period data

objtype object type code identifies the structure as a CRSP_CAL, always = 1 int

calidcalendar identification number is an identifier assigned to each specific calendar by CRSP int

typegeneric group code of calendar, ie. daily or monthly. All current time series use 2 for calendar trading date (caldt) only. int

loadflagcalendar type availability flag is a code indicating the types of calendar arrays loaded. Currently = 2 for calendar trading date (caldt) only int

ndays number of days is the index of the last calendar period int

name calendar name is the calendar name in text char[80]

callistcalendar period grouping identifiers reserved for array of alternate grouping identifiers for calendar periods int *

caldt

calendar trading date is an array of calendar period ending dates, stored in YYYYMMDD. Calendars start at element 1 and end at element number of days (ndays) int *

calmapused to store array of first and last calendar period array elements in a linked calendar to elements in this calendar CRSP_CAL_MAP *

basecal used to point to a calendar linked in calmap CRSP_CAL *

OBJECT or Field Usage DATA TYPE

43

Page 60: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

ck

fix. It is

and is

r. These

ocated.

Set Structures and Usage

Stock and indices access functions initialize and load data to C top-level defined set structures. Top-level structuresare built from general object and array structure definitions and contain object and array pointers that have memoryallocated to them by access open functions.

Two set types and six set identifiers are currently supported for stock and indices data. The identifier must bespecified when opening or accessing data from the set.

Data Set Types and Identifiers

Each set structure has three types of pointer definitions.

— Module pointers point to CRSP_OBJECT_ELEMENT linked lists and are only needed internally to keep traof the objects in a module. These have the suffix _obj and can be ignored by ordinary programming.

— Object pointers define a CRSP_ARRAY, CRSP_ROW, or CRSP_TIMESERIES object type. A suffix, _arr, _ts or _row is appended to the variable name. Range variables num, beg, and end are accessed from these variables.

— Array pointers define the data item array. The array has the same rank as the object but without the sufa pointer to the array element of the object and is used for general access of the data item.

If a module has multiple types of objects, a group structure is created with definitions for those objectsincluded in the main structure.

If a module has a variable number of objects of one type, an integer variable keeps track of the actual numbevariables end with the suffix types and are based on the set type.

Each of the top-level structures contains three standard elements:

— PERMNO - the actual key loaded

— loadflag, a binary flag matching the set wanted parameters indicating which pointers have been allSee the open function for the set for more information about wanted parameters.

— setcode, a constant identifying the type of set (1=STK, 3=IND)

For example, a Stock Structure has CRSP_TIMESERIES object called prc_ts containing an array called prc.

Data Set TypeSet

Identifiers

CRSP Stock Data STK 10 Daily20 Monthly

CRSP Indices Data IND 400 Monthly Groups (in IX product only)420 Monthly Series440 Daily Groups (in IX product only)460 Daily Series

44

Page 61: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

45

e CRSP data structures as well as the data

r. The 0th element of a time series array is

nment data for one portfolio type. Each

nge Ele-nts on an

dex BasisElements on a Set Basis Array Name

ne none stk.headerm maxarr stk.events.namesm maxarr stk.events.distsm maxarr stk.events.sharesm maxarr stk.events.delistm maxarr stk.events.nasdin

g and endmaxarr, cal stk.bidlo

g and endmaxarr, cal stk.askhi

g and endmaxarr, cal stk.prc

g and endmaxarr, cal stk.ret

g and endmaxarr, cal stk.vol

g and endmaxarr, cal stk.bid

g and endmaxarr, cal stk.ask

g and endmaxarr, cal stk.retx

g and endmaxarr, cal stk.spread

g and end maxarr, cal stk.numtrdg and end maxarr, cal stk.altprc

g and end

maxarr, cal, stk.port-types

stk.port[j], j from 0 to stk.porttypes-1

C Programming

C Language Data Objects for CRSP Stock Data

Each stock structure contains a fixed set of possible objects. Objects contain the header information needed to use tharrays. Data elements are described in the C Data Structure Table under the array name.

Time series beg and end are both equal to 0 if there are no data. Otherwise beg > 0, beg <= end, and end < maxarreserved for the missing value for that data type

The stock structure contains an array of portfolio time series. Each member contains the portfolio statistic and assigmember can have a different range and calendar. The count of Portfolio Types is found in the port types variable.

Stock File C Objects Table

Object Name Object Type Array Type Data Subtype

Array Structure

Size

RameIn

header_row Stock Header Structure CRSP_ROW CRSP_STK_HEADER_NUM = 50 0 172 nonames_arr Security Name History CRSP_ARRAY CRSP_STK_NAME_NUM = 51 0 160 nudists_arr Distribution History Array CRSP_ARRAY CRSP_STK_DIST_NUM = 52 0 40 nushares_arr Shares Structure Array CRSP_ARRAY CRSP_STK_SHARE_NUM = 53 CRSP_SHARES_IMP_NUM = 0 16 nudelist_arr Delisting Structure Array CRSP_ARRAY CRSP_STK_DELIST_NUM = 54 0 40 nunasdin_arr Nasdaq Structure Array CRSP_ARRAY CRSP_STK_NASDIN_NUM = 55 0 24 nu

bidlo_ts Bid or Low CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_PRICE_NUM = 1 4 be

askhi_ts Ask or High CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_PRICE_NUM = 1 8 be

prc_tsClosing Price or Bid/Ask Average CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_PRICE_NUM = 1 4 be

ret_ts Holding Period Return CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_RETURN_NUM = 2 4 be

vol_ts Share Volume CRSP_TIMESERIES CRSP_INTEGER_NUM = 2 CRSP_VOLUME_NUM = 6 4 be

bid_ts Nasdaq Closing Bid CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_PRICE_NUM = 1 4 be

ask_ts Nasdaq Closing Ask CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_PRICE_NUM = 1 4 be

retx_ts Return Without Dividends CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_RETURN_NUM = 2 4 be

spread_ts Month End Bid/Ask Spread CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_PRICE_NUM = 1 4 be

numtrd_ts

Nasdaq Number of Trades (daily) or Alternate Price Date (monthly) CRSP_TIMESERIES CRSP_INTEGER_NUM = 2

CRSP_COUNT_NUM = 7, or CRSP_DATE_NUM = 26 4 be

altprc_ts Alternate Price CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_PRICE_NUM = 1 4 be

port_ts Array of Portfolio time series CRSP_TIMESERIES CRSP_STK_PORT_NUM = 56

Each portfolio time series in the array has subtype equal to the Per-manent Index Identification Num-ber of the associated group index. 8 be

Page 62: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

ed by the definitions in the next level of

levels indicated by the indentation in thein all access functions. The second level

m string length allowed. Actual maximums mayn be used, and multiple CRSP_STK_STRUCTs can

age Object Usage

stk.header_row

ffective from vents.names[I].name

vents.names[I].namestk.events.names_arr

tion effective on vents.dists[I].exdt stk.events.dists_arr

46

C Language Data Structure for CRSP Stock Data

All CRSP-defined data type structures have names in all capitals beginning with CRSP_ and are immediately followindentation

Index and Date Ranges for all elements in a structure are the same as for the structure itself. There are four structuremnemonic field. Pointers at any level can be used in a program. The top level contains all other items and is used indicates data grouped in modules. See the CRSPAccess97 Stock Users Guide for data item definitions.

All character strings, indicated by char[#], are NULL terminated. The number of characters – 1 is the maximube lower. The top level stk structure is an example used by CRSP Stock sample programs. Other names cabe declared in a program.

Stock File C Variable Usage TableMnemonic Name Data Type Data Usage Index Range Date Us

stk Master Stock Structure CRSP_STK_STRUCT stk

header Stock Header Structurehcusip CUSIP Identifier, Header char[16] stk.header->hcusippermno PERMNO int stk.header->permnopermco PERMCO int stk.header->permcocompno Nasdaq Company Number int stk.header->compnoissuno Nasdaq Issue Number int stk.header->issunohexcd Exchange Code, Header int stk.header->hexcdhshrcd Share Code, Header int stk.header->hshrcdhsiccd Standard Industrial Classification (SIC) Code, Header int stk.header->hsiccdbegdt Begin of Stock Data int stk.header->begdtenddt End of Stock Data int stk.header->enddtdlstcd Delisting Code, Header int stk.header->dlstcdhtick Ticker Code, Header char[16] stk.header->htickhcomnam Company Name, Header char[64] stk.header->hcomnam

events Master Stock Event StructureCRSP_STKEVENT_STRUCT stk.events

names Security Name History

i between 0 and stk.events.names_arr->num-1

name estk.edt to stk.eenddt

namedt Name Effective Date int stk.events.names[i].namedtnameenddt Last Date of Name int stk.events.names[i].nameenddtncusip CUSIP char[16] stk.events.names[i].ncusipticker Ticker Symbol char[16] stk.events.names[i].tickercomnam Company Name char[64] stk.events.names[i].comnamshrcls Share Class char[4] stk.events.names[i].shrclsshrcd Share Code int stk.events.names[i].shrcdexchcd Exchange Code int stk.events.names[i].exchcdsiccd Standard Industrial Classification (SIC) Code int stk.events.names[i].siccd

dists Distribution History Array

i between 0 and stk.events.dists_arr->num-1

distribustk.e

distcd Distribution Code int stk.events.dists[i].distcddivamt Dividend Cash Amount float stk.events.dists[i].divamtfacpr Factor to Adjust Price float stk.events.dists[i].facprfacshr Factor to Adjust Shares Outstanding float stk.events.dists[i].facshrdclrdt Distribution Declaration Date int stk.events.dists[i].dclrdtexdt Ex-Distribution Date int stk.events.dists[i].exdtrcrddt Record Date int stk.events.dists[i].rcrddt

Page 63: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

47

bservation effective from vents.shares[i].shr

vents.shares[i].shrt stk.events.shares_arr

bservation on vents.delist[i].dls

stk.events.delist_arr

status effective from vents.nasdin[i].trt vents.nasdin[i].trtt stk.events.nasdin_arr

date stk.bidlo_ts->caldt[i] stk.bidlo_ts

date stk.askhi_ts->caldt[i] stk.askhi_ts

date stk.prc_ts->caldt[i] stk.prc_ts

date stk.ret_ts->caldt[i] stk.ret_ts

date stk.vol_ts->caldt[i] stk.vol_ts

date stk.bid_ts->caldt[i] stk.bid_ts

date stk.ask_ts->caldt[i] stk.ask_ts

date stk.retx_ts->caldt[i] stk.retx_ts date stk.spread_ts-

dt[i] stk.spread_ts

age Object Usage

C Programming

paydt Payment Date int stk.events.dists[i].paydtacperm Acquiring PERMNO int stk.events.dists[i].acpermaccomp Acquiring PERMCO int stk.events.dists[i].accomp

shares Shares Structure Array

i between 0 and stk.events.shares_arr->num-1

shares ostk.esdt to stk.esendd

shrout Shares Outstanding int stk.events.shares[i].shroutshrsdt Shares Observation Date int stk.events.shares[i].shrsdtshrsenddt Shares Observation End Date int stk.events.shares[i].shrsenddtshrflg Shares Outstanding Observation Flag int stk.events.shares[i].shrflg

delist Delisting Structure Array

i between 0 and stk.events.delist_arr->num-1

delist ostk.etdt

dlstdt Delisting Date int stk.events.delist[i].dlstdtdlstcd Delisting Code int stk.events.delist[i].dlstcdnwperm New PERMNO int stk.events.delist[i].nwpermnwcomp New PERMCO int stk.events.delist[i].nwcompnextdt Delisting Date of Next Available Information int stk.events.delist[i].nextdtdlamt Amount After Delisting float stk.events.delist[i].dlamtdlretx Delisting Return without Dividends float stk.events.delist[i].dlretxdlprc Delisting Price float stk.events.delist[i].dlprcdlpdt Delisting Payment Date int stk.events.delist[i].dlpdtdlret Delisting Return float stk.events.delist[i].dlret

nasdin Nasdaq Structure Array

i between 0 and stk.events.nasdin_arr->num-1

Nasdaqstk.esdt tostk.esendd

trtsdt Nasdaq Traits Date int stk.events.nasdin[i].trtsdttrtsenddt Nasdaq Traits End Date int stk.events.nasdin[i].trtsenddttrtscd Nasdaq Traits Code int stk.events.nasdin[i].trtscdnmsind Nasdaq National Market Indicator int stk.events.nasdin[i].nmsindmmcnt Market Maker Count int stk.events.nasdin[i].mmcntnsdinx NASD Index Code int stk.events.nasdin[i].nsdinx

Time Series Data Arrays

bidlo Bid or Low Price float * stk.bidlo[i]

i between stk.bidlo_ts->beg and stk.bidlo_ts->end

value on->cal

askhi Ask or High Price float * stk.askhi[i]

i between stk.askhi_ts->beg and stk.askhi_ts->end

value on->cal

prc Price or Bid/Ask Average float * stk.prc[i]

i between stk.prc_ts->beg and stk.prc_ts->end

value on->cal

ret Holding Period Total Return float * stk.ret[i]

i between stk.ret_ts->beg and stk.ret_ts->end

value on->cal

vol Volume Traded int * stk.vol[i]

i between stk.vol_ts->beg and stk.vol_ts->end

value on->cal

bid Bid float * stk.bid[i]

i between stk.bid_ts->beg and stk.bid_ts->end

value on->cal

ask Ask float * stk.ask[i]

i between stk.ask_ts->beg and stk.ask_ts->end

value on->cal

retx Return Without Dividends float * stk.retx[i]

i between stk.retx_ts->beg and stk.retx_ts->end

value on->cal

spread Spread Between Bid and Ask float * stk.spread[i]

i between stk.spread_ts->beg and stk.spread_ts->end

value on>cal->cal

Mnemonic Name Data Type Data Usage Index Range Date Us

Page 64: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

date stk.numtrd_ts->caldt[i] stk.numtrd_ts

date stk.numtrd_ts->caldt[i] stk.numtrd_ts

date stk.altprc_ts->caldt[i] stk.altprc_ts

r period ending ort_ts[j]->caldt[i] array of stk.port_ts

age Object Usage

48

numtrd Number of Trades, Nasdaq int * stk.numtrd[i]

i between stk.numtrd_ts->beg and stk.numtrd_ts->end

value on->cal

numtrd Price Alternate Date int * stk.numtrd[i]

i between stk.numtrd_ts->beg and stk.numtrd_ts->end

value on->cal

altprc Price Alternate float * stk.altprc[i]

i between stk.altprc_ts->beg and stk.altprc_ts->end

value on->cal

port Portfolio Statistics and Assignments

j between 0 and stk.porttypes-1, i between stk.port_ts[j]->beg and stk.port_ts[j]->end

value fostk.p->cal

port Portfolio Assignment Number int stk.port[j][i].portstat Portfolio Statistic Value double stk.port[j][i].stat

Mnemonic Name Data Type Data Usage Index Range Date Us

Page 65: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

Examples of C Variable Usage for CRSP Stock Data

These assume a variable stk of type CRSP_STK_STRUCT.

CRSP_ROW/Header Data

Object Variable: stk.header_row

Data Structure: stk.header

Sample Print Statement: printf ("%d %8d-%8d\n", stk.header->permno, stk.header->begdt, stk.header->enddt);

CRSP Array/Distributions

Object Variable: stk.events.dists_arr

Data Array: stk.events.dists

Sample Print Statement: This sample loop prints all distribution codes and ex-distribution dates.

for (i = 0; i < stk.events.dists_arr->num; ++i)

printf ("%4d %8d\n", stk.events.dists[i].distcd, stk.events.dists[i].exdt);

Time Series/Prices

Object Variable: stk.prc_ts

Data Array: stk.prc

Sample Print Statement: This sample loop prints all prices and dates in the issue’s range.

for(i = stk.prc_ts->beg; i <= stk.prc_ts->end; ++i)

printf("%11.5f %8d\n", stk.prc[i], stk.prc_ts->cal->caldt[i]);

Array of Time Series/Portfolios

Object Variable: stk.port_ts[j]

Data Array: stk.port[j]

(There are stk.porttypes portfolios available; j above is between 0 and stk.porttypes -1)

Sample Print Statement:

This prints the associated indno and the sample loop prints the date and assignment for each year in the issue’s rangefor porttype=0.

printf ("indno = %d\n", stk.port_ts[0].subtype);

for (i = stk.port_ts[0]->beg; i <= stk.port_ts[0]->end; ++i)

printf ("%8d %2d\n", stk.port_ts[0]->cal->caldt[i], stk.port[0][i].port);

49

Page 66: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

eries or portfolio group. In CRSP NYSE,l series and groups are available when you data for one series or group and includes

e CRSP data structures as well as the data

r. The 0th element of a time series array is

ndar. In a SERIES SETID, the multiple

nts asis Elements on a Set Basis Array Name

none ind.indhdr

eriesmaxarr, ind.rebal-types

ind.rebal[j], j from 0 to ind.rebaltypes

eriesmaxarr, ind.list-types

ind.list[j], j from 0 to ind.listtypes

maxarr, cal, ind.indtypes

ind.usdcnt[j], j from 0 to ind.ind-types

maxarr, cal, ind.indtypes

ind.totcnt[j], j from 0 to ind.ind-types

for maxarr, cal, ind.indtypes

ind.usdval[j], j from 0 to ind.ind-types

for maxarr, cal, ind.indtypes

ind.totval[j], j from 0 to ind.ind-types

for maxarr, cal, ind.indtypes

ind.tret[j], j from 0 to ind.indtypes

for maxarr, cal, ind.indtypes

ind.aret[j], j from 0 to ind.indtypes

for maxarr, cal, ind.indtypes

ind.iret[j], j from 0 to ind.indtypes

for maxarr, cal, ind.indtypes

ind.tind[j], j from 0 to ind.indtypes

for maxarr, cal, ind.indtypes

ind.aind[j], j from 0 to ind.indtypes

for maxarr, cal, ind.indtypes

ind.iind[j], j from 0 to ind.indtypes

50

C Language Data Objects for CRSP Indices

CRSP assigns the Permanent Index Identification Number (INDNO) to access the indices data in the C for individual sAMEX, Nasdaq Daily and Monthly Price and Total Return Database, a subset of market series is available. Additionasubscribe to the CRSP US Stock, Treasury Indices and Portfolio Assignments Database. The index structure supportsheader, rebalancing, and result information for the index.

Each index structure contains a fixed set of possible objects. Objects contain the header information needed to use tharrays. Data elements are described in the C Data Structure Table under the array name.

Time series beg and end are both equal to 0 if there are no data. Otherwise beg > 0, beg <= end, and end < maxarreserved for the missing value for that data type.

Multiple series in the index structure refers to portfolio subgroups. Each of these will have the same beg, end, and caleseries has a count of 1. In a GROUP SETID, the count of series is found in the corresponding xxxtypes variable.

Index File C Objects TableC Object Mnemonic Array Name Object Type Array Type Subtype

Size of

Array Range Elemeon an Index B

indhdr_row Indices Header Object CRSP_ROW CRSP_IND_HEADER_NUM = 200 0 300 none

rebal_arr[ ] Array of Rebalancing Arrays CRSP_ARRAY CRSP_IND_REBAL_NUM = 201 0 64 num for each s

list_arr[ ] Array of List Arrays CRSP_ARRAY CRSP_IND_LIST_NUM = 202 0 24 num for each s

usdcnt_ts[ ] Array of Used Count Time Series CRSP_TIMESERIES CRSP_INTEGER_NUM = 2 CRSP_COUNT_NUM = 7 4beg and end for each series

totcnt_ts[ ] Array of Total Count Time Series CRSP_TIMESERIES CRSP_INTEGER_NUM = 2 CRSP_COUNT_NUM = 7 4beg and end or each series

usdval_ts[ ] Array of Used Value Time Series CRSP_TIMESERIES CRSP_DOUBLE_NUM = 4 CRSP_WEIGHT_NUM = 4 8beg and end each series

totval_ts[ ] Array of Total Value Time Series CRSP_TIMESERIES CRSP_DOUBLE_NUM = 4 CRSP_WEIGHT_NUM = 4 8beg and end each series

tret_ts[ ] Array of Total Return Time Series CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_RETURN_NUM = 2 4beg and end each series

aret_ts[ ]Array of Capital Appreciation Time Series CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_RETURN_NUM = 2 4

beg and end each series

iret_ts[ ] Array of Income Return Time Series CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_RETURN_NUM = 2 4beg and end each series

tind_ts[ ]Array of Total Return Index Level Time Series CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_LEVEL_NUM = 3 4

beg and end each series

aind_ts[ ]Array of Capital Appreciation Index Level Time Series CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_LEVEL_NUM = 3 4

beg and end each series

iind_ts[ ]Array of Income Return Index Level Time Series CRSP_TIMESERIES CRSP_FLOAT_NUM = 1 CRSP_LEVEL_NUM = 3 4

beg and end each series

Page 67: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

51

finitions in the next level of indentation.

levels indicated by the indentation in thein all access functions. The second level

ength allowed. Actual maximums may bed multiple CRSP_IND_STRUCTs may be

C Date Usage C Object Usage

ind.indhdr_row

C Programming

C Language Data Structure for CRSP Indices Data

All CRSP-defined data types have names in all capitals beginning with CRSP_ and are immediately followed by the de

Index and Date Ranges for all elements in a structure are the same as for the structure itself. There are four structureMnemonic field. Pointers at any level can be used in a program. The top level contains all other items and is used indicates data grouped in modules. See the CRSPAccess97 Stock Users Guide for data item definitions.

All character strings, indicated by char[#], are null terminated. The number of characters - 1 is the maximum string llower. The top level ind structure is an example used by CRSP Indices sample programs. Other names can be used, andeclared in a program.

Index File C Variable Usage TableMnemonic name C Data Type C Data Usage C Index Range

ind Master Indices Structure CRSP_IND_STRUCT ind

indhdr Indices Header Objectindno Permanent Index Identification Number int ind.indhdr->indnoindco Permanent Index Group Identification Number int ind.indhdr->indcoprimflag Index Primary Link int ind.indhdr->primflagportnum Portfolio Number if Subset Series int ind.indhdr->portnumindname Index Name char[80] ind.indhdr->indnametypename Index Group Name char[80] ind.indhdr->typename

method Index Methodology Description Structure CRSP_IND_METHOD ind.indhdr->methodmethcode Index Method Type Code int ind.indhdr->method.methcodeprimtype Index Primary Methodology Type int ind.indhdr->method.primtypesubtype Index Secondary Methodology Group int ind.indhdr->method.subtypewgttype Index Reweighting Type Flag int ind.indhdr->method.wgttypewgtflag Index Reweighting Timing Flag int ind.indhdr->method.wgtflag

flags Index Exception Handling Flags CRSP_IND_FLAGS ind.indhdr->flagsflagcode Index of Basic Exception Types Code int ind.indhdr->flags.flagcodeaddflag Index New Issues Flag int ind.indhdr->flags.addflagdelflag Index Ineligible Issues Flag int ind.indhdr->flags.delflagdelretflag Return of Delisted Issues Flag int ind.indhdr->flags.delretflagmissflag Index Missing Data Flag int ind.indhdr->flags.missflag

partuniv Index Subset Screening Structure CRSP_UNIV_PARAM ind.indhdr->partuniv

partunivcodeUniverse Subset Types Code in a Partition Restriction int ind.indhdr->partuniv.univcode

begdt Index Restriction Beginning Date int ind.indhdr->induniv.begdtenddt Index Restriction End Date int ind.indhdr->induniv.enddt

wantexchValid Exchange Codes in the Universe in a Partition Restriction int ind.indhdr->partuniv.wantexch

wantnmsValid Nasdaq Market Groups in the Universe in a Partition Restriction int ind.indhdr->partuniv.wantnms

wantwiValid When-Issued Securities in the Universe in a Partition Restriction int ind.indhdr->partuniv.wantwi

wantincValid Incorporation of Securities in the Universe in a Partition Restriction int ind.indhdr->partuniv.wantinc

shrcd Share Code Screen Structure CRSP_UNIV_SHRCD ind.indhdr->partuniv.shrcdsccode Share Code Groupings for Subsets int ind.indhdr->partuniv.shrcd.sccodefstdig Valid First Digit of Share Code int ind.indhdr->partuniv.shrcd.fstdig

secdigValid Second Digit of Share Code in a Partition Restriction int ind.indhdr->partuniv.shrcd.secdig

Page 68: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

data valid fromind.rebal[j][i].rbbegdt to ind.rebal[j][i].rbenddt

array of ind.rebal_arr

data valid from ind.rebal[j][i].rbbegdt to ind.rebal[j][i].rbenddt

array of ind.rebal_arr

valid from ind.list[j][i].beg to ind.list[j][i].enddt

array of ind.list_arr

C Date Usage C Object Usage

52

induniv Partition Subset Screening Structure CRSP_UNIV_PARAM ind.indhdr->induniv

indunivcodeUniverse Subset Types Code in an Index Restriction int ind.indhdr->induniv.univcode

begdt Index Restriction Beginning Date int ind.indhdr->induniv.begdtenddt Index Restriction End Date int ind.indhdr->induniv.enddt

wantexchValid Exchange Codes in the Universe in an Index Restriction int ind.indhdr->induniv.wantexch

wantnmsValid Nasdaq Market Groups in the Universe in an Index Restriction int ind.indhdr->induniv.wantnms

wantwiValid When-Issued Securities in the Universe in an Index Restriction int ind.indhdr->induniv.wantwi

wantincValid Incorporation of Securities in the Universe in an Index Restriction int ind.indhdr->induniv.wantinc

shrcd Share Code Screen Structure CRSP_UNIV_SHRCD ind.indhdr->induniv.shrcdsccode Share Code Groupings for Subsets int ind.indhdr->induniv.shrcd.sccodefstdig Valid First Digit of Share Code int ind.indhdr->induniv.shrcd.fstdig

secdigValid Second Digit of Share Code in an Index Restriction int ind.indhdr->induniv.shrcd.secdig

rules Portfolio Building Rules Structure CRSP_IND_RULES ind.indhdr->rulesrulecode Index of Basic Rule Types Code int ind.indhdr->rules.rulecodebuyfnct Index Function Code for Buy Rules int ind.indhdr->rules.buyfnctsellfnct Index Function Code for Sell Rules int ind.indhdr->rules.sellfnctstatfnct Index Function Code for Generating Statistics int ind.indhdr->rules.statfnctgroupflag Index Statistic Grouping Code int ind.indhdr->rules.groupflag

assign Related Assignment Information CRSP_IND_ASSIGN ind.indhdr->assignassigncode Index of Basic Assignment Types Code int ind.indhdr->assign.assigncode

aspermPermanent Index Identification Number of Associated Index int ind.indhdr->assign.asperm

asport Portfolio Number in Associated Index int ind.indhdr->assign.asport

rebalcalCalendar Identification Number of Rebalancing Calendar int ind.indhdr->assign.rebalcal

assigncalCalendar Identification Number of Assignment Calendar int ind.indhdr->assign.assigncal

calccalCalendar Identification Number of Calculations Calendar int ind.indhdr->assign.calccal

rebal Array of Rebalancing Arrays int ind.rebal[j][i].rbbegdt

j between 0 and ind.rebaltypes, i between 0 and ind.rebal_arr[j]->num-1

rbbegdt Index Rebalancing Beginning Date int ind.rebal[j][i].rbbegdt

j between 0 and ind.rebaltypes, i between 0 and ind.rebal_arr[j]->num-1

rbenddt Index Rebalancing Ending Date int ind.rebal[j][i].rbenddtusdcnt Count Used as of Rebalancing int ind.rebal[j][i].usdcntmaxcnt Maximum Count During Period int ind.rebal[j][i].maxcnttotcnt Count Available as of Rebalancing int ind.rebal[j][i].totcntendcnt Count at End of Rebalancing Period int ind.rebal[j][i].endcntminid Statistic Minimum Identifier int ind.rebal[j][i].minidmaxid Statistic Maximum Identifier int ind.rebal[j][i].maxidminstat Statistic Minimum in Period double ind.rebal[j][i].minstatmaxstat Statistic Maximum in Period double ind.rebal[j][i].maxstatmedstat Statistic Median in Period double ind.rebal[j][i].medstatavgstat Average Statistic in Period double ind.rebal[j][i].avgstat

list Array of List Arrays int ind.list[j][i].permno

j between 0 and ind.listtypes, i between 0 and ind.list_arr[j]->num-1

Mnemonic name C Data Type C Data Usage C Index Range

Page 69: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

53

valid from ind.list[j][i].beg to ind.list[j][i].enddt

array of ind.list_arr

value on date ind.usdcnt_ts[j]->cal->caldt[i]

array of ind.usdcnt_ts

value on date ind.totcnt_ts[j]->cal->caldt[i]

array of ind.totcnt_ts

value on date ind.usdval_ts[j]->cal->caldt[i]

array of ind.usdval_ts

value on date ind.totval_ts[j]->cal->caldt[i]

array of ind.totval_ts

value on date ind.tret_ts[j]->cal->caldt[i]

array of ind.tret_ts

value on date ind.aret_ts[j]->cal->caldt[i]

array of ind.aret_ts

value on date ind.iret_ts[j]->cal->caldt[i]

array of ind.iret_ts

value on date ind.tind_ts[j]->cal->caldt[i]

array of ind.tind_ts

value on dateind.aind_ts[j]->cal->caldt[i] array of ind.aind_ts

value on date ind.iind_ts[j]->cal->caldt[i]

array of ind.iind_ts

C Date Usage C Object Usage

C Programming

permno Permanent Number of Securities in Index List int ind.list[j][i].permno

j between 0 and ind.listtypes, i between 0 and ind.list_arr[j]->num-1

begdt First Date Included in List int ind.list[j][i].begdtenddt Last Date Included in a List int ind.list[j][i].enddtsubind Index Subcategory Code int ind.list[j][i].subindweight Weight of an Issue double ind.list[j][i].weight

Time Series Data Arrays

usdcnt Array of Used Count Time Series

j between 0 and indtypes-1, i between ind.usdcnt_ts[j]->beg and ind.usdcnt_ts[j]

->end

totcnt Array of Total Count Time Series

j between 0 and indtypes-1, i between ind.totcnt_ts[j]->beg and ind.totcnt_ts[j]->end

usdval Array of Used Value Time Series

j between 0 and indtypes-1, i between ind.usdval_ts[j]->beg and ind.usdval_ts[j]->end

totval Array of Total Value Time Series

j between 0 and indtypes-1, i between ind.totval_ts[j]->beg and ind.totval_ts[j]->end

tret Array of Total Return Time Series

j between 0 and indtypes-1, i between ind.tret_ts[j]->beg and ind.tret_ts[j]->end

aret Array of Capital Appreciation Time Series

j between 0 and indtypes-1, i between ind.aret_ts[j]->beg and ind.aret_ts[j]->end

iret Array of Income Return Time Series

j between 0 and indtypes-1, i between ind.iret_ts[j]->beg and ind.iret_ts[j]->end

tind Array of Total Return Index Level Time Series

j between 0 and indtypes-1, i between ind.tind_ts[j]->beg and ind.tind_ts[j]->end

aindArray of Capital Appreciation Index Level Time Series

j between 0 and indtypes-1, i between ind.aind_ts[j]->beg and ind.aind_ts[j]->end

iind Array of Income Return Index Level Time Series

j between 0 and indtypes-1, i between ind.iind_ts[j]->beg and ind.iind_ts[j]->end

Mnemonic name C Data Type C Data Usage C Index Range

Page 70: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

3.2 C Sample Programs

There are two sample programs provided that can process the CRSP Stock Database using C. These programs canload stock and indices data structures for processing. The sample program code contains additional commentinformation. See system-dependent C programming instructions at the end of this section for instructions to runsample programs on supported systems. See C usage tables in 3.1 and variables descriptions in the CRSP DataDefinitions and Coding Schemes Guide for possible data usage. Sample programs are available on your CD or can bedownloaded from crsp.com.

stk_samp1.c stk_samp1.c creates a namelist of current names by reading a stock database sequentially inPERMNO order. It loads one index series before processing the stock data. Output is one line ofheader information per security. stk_samp1.c accepts parameters for database directory, stock setidentifier, indices set identifier, Permanent Index Identification Number (indno), and output filename.

stk_samp2.c stk_samp2.c reads a stock database using an input file of PERMNOs. It loads one set of indicesbefore processing the input list. Output is one line of header information per security.stk_samp2.c accepts parameters for database directory, stock set identifier, indices set identifier,Permanent Index Identification Number, input file name, and output file name.

54

Page 71: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

C Header Files and Data Structures

Header files contain all needed structure definitions, constants, and function prototypes. Two C header files aresufficient to define all CRSP structures, constants, and functions.

1. crsp.h defines all structures and constants used by the CRSP C access and utility functions, and the functiondefinitions. crsp.h includes several other header files. The primary definitions needed for stock databases are incrsp_objects.h, crsp_const.h, crsp_stk_objects.h, and crsp_stk_const.h. The primary definitions neededfor the indices data are in crsp_objects.h, crsp_const.h, crsp_ind_objects.h, and crsp_ind_const.h.

2. crsp_init.h declares internal variables needed to store initialization and error information. This should only beincluded in the main program and not in any function modules.

The following list is a more complete summary of individual stock and indices header files that are included bycrsp.h. All header files are kept in the CRSP_INCLUDE directory.

Header File Description

crsp_stk.h top level stock header file includes all needed header files for CRSP Stock accesscrsp_stk_objects.h defines top level CRSP_STK_STRUCT structure for Stock Datacrsp_objects.h defines all object structures and data array structures for all supported typescrsp_stk_const.h defines stock constants and wanted parameterscrsp_const.h defines generic crsp constantscrsp_access_stk.h defines stock access function prototypescrsp_util_stk.h defines stock utility function prototypescrsp_ind.h top level indices header file includes all needed header files for CRSP Indices accesscrsp_ind_objects.h defines top level CRSP_IND_STRUCT structure for Indices Datacrsp_ind_const.h define indices constants and wanted parameterscrsp_access_ind.h defines index access function prototypescrsp_util_ind.h defines index utility function prototypescrsp_sysio.h defines system-specific constantscrsp_maint.h defines internal data structures

55

Page 72: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

3.3 CRSPAccess97 C Library

The CRSPAccess97 C Library contains the Application Programming Interface used to access and to process thedata. The library is broken into sections based on the type of operations. The following major groups are available.Each can be further subdivided into subgroups. Functions within subgroups are alphabetical. Each function includesa function prototype, description, list of arguments, return values, side effects and preconditions for use.

C Library Category Description

Stock Access Functions Functions used to load stock data from the database into structuresIndex Access Functions Functions used to load index data from the database into structuresGeneral Access Functions General calendar and access functionsData Utility Functions Functions used to manipulate stock or indices dataGeneral Utility Functions Functions utility to process base CRSPAccess97 structures

56

Page 73: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

r

in a

in the

Stock Access Functions

The following tables list the available functions to access CRSPAccess97 Stock Data. Standard usage is to use anopen function, followed by successive reads and a close. Different databases and sets can be processedsimultaneously if there is a matching structure defined for each one.

C Stock Access Functions

crsp_stk_clear Loads Missing Value Arrays in a Stock Set Structure

crsp_stk_close Closes a Stock Set

Function Description

crsp_stk_clear Loads Missing Values to Arrays in a Stock Set Structurecrsp_stk_close Closes a Stock Setcrsp_stk_free Deallocates Memory and Reinitialize a Stock Set Structurecrsp_stk_init Initializes a CRSPAccess97 Database for Stock Accesscrsp_stk_open Opens a Stock Set in a CRSPAccess97 Databasecrsp_stk_read Loads Wanted Stock Data For a PERMNOcrsp_stk_read_cus Loads Wanted Stock Data Using Header CUSIP Identifier, Header as the Keycrsp_stk_read_permco Loads Wanted Stock Data Using PERMCO as the Keycrsp_stk_read_hcus Loads Wanted Stock Data Using Historical CUSIP as the Keycrsp_stk_read_siccd Loads Wanted Stock Data Using Historical SIC Code as the Keycrsp_stk_read_ticker Loads Wanted Stock Data Using Ticker Symbol, Header as the Key

Prototype: int crsp_stk_clear (CRSP_STK_STRUCT *stk, int clearflag)

Description:

load defined missing values to all allocated objects in a stock set structure. It is assumed that the pointers are eitherNULL or have been allocated by a set open function. The function allows clearing on a range level, range and arraylevel, or array level.

Arguments: CRSP_STK_STRUCT *stk – pointer to a stock structure pointer to be cleared.int clearflag – constant identifying the level of initialization. Supported values are:

CRSP_CLEAR_INIT – only reset num for CRSP_ARRAYs and beg and end for CRSP_TIMESERIES, and setstructure to missing values for CRSP_ROWs.CRSP_CLEAR_ALL – set ranges to missing and set missing values foall elements in the object arraysCRSP_CLEAR_RANGE – set missing values for all elements in the object arrays within the range between beg andend in a CRSP_TIMESERIES or between 0 and num–1 in a CRSP_ARRAY, or the single element in a CRSP_ROW.CRSP_CLEAR_SET – set ranges in the 0’th element of a CRSP_TIMESERIES array or the maxarr-1’th elementof a CRSP_ARRAY to missing values specific to the array type, or sets missing values to the single elementCRSP_ROW.

Return Values: CRSP_SUCCESS: if successCRSP_FAIL: if bad parameters

Side Effects: The stock structure pointer has all allocated fields initialized according to the clearflag

Preconditions: The stock structure must either have object fields set to NULL or allocated with a set open function.Call Sequence: Can be called after crsp_stk_uper and before each crsp_stk_read call.

Prototype: int crsp_stk_close (int crspnum, int setid, CRSP_STK_STRUCT *stkptr)

Description: closes a stock setArguments: int crspnum - identifier of crsp database, as returned by open

int setid - identifier of the stock set code to closeCRSP_STK_STRUCT *stkptr - pointer to stock structure to be deallocated; if NULL nothing is deallocated

Return Values: CRSP_SUCCESS: if successfully closed stock setCRSP_FAIL: if error closing a file or illegal parameter

Side Effects:All stock module files are closed, memory allocated by them are freed. If these are the last modules open

database, the root is also closed. The stock structure associated with the set is deallocated if stkptr is not NULL.Preconditions: The crspnum and setid must be taken from a previous crsp_stk_open call

Call Sequence:Called by external programs, must be preceded by call to crsp_stk_open. calls crsp_closeroot,

crsp_closemod.

57

Page 74: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

t

s in

crsp_stk_free Deallocates Memory and Reinitializes a Stock Set Structure

crsp_stk_init Initializes a CRSPAccess97 Database for Stock Access

Prototype: int crsp_stk_free (int crspnum, int setid, CRSP_STK_STRUCT *stkptr)

Description: deallocates memory and reinitialize a stock set structureArguments: int crspnum – identifier of crsp database, as returned by open

int setid – identifier of the stock set code to closeCRSP_STK_STRUCT *stkptr - pointer to stock structure

Return Values: CRSP_SUCCESS – if successfully deallocated and reset stock structure, or stk structure is NULLCRSP_FAIL – if error deallocating memory, error in parameters

Side Effects:The stock structures are reset so all pointers are NULL and all settings are 0. All memory allocated to existing objec

element lists is freed.

Preconditions:The crspnum must be known from a previous crsp_stk_open or crsp_openroot call. The setcode is an

installation-defined code for the set.

Call Sequence:Called by external programs or by crsp_stk_close must be preceded by call to crsp_stk_alloc calls

crsp_freemod.

Prototype: int crsp_stk_init( CRSP_STK_STRUCT *stkptr )

Description:initializes internal access for stock CRSPDBs and sets stock structure pointers to NULL. See crsp_stk_clear to

clear data from a stock structure.

Arguments:CRSP_STK_STRUCT *stkptr - pointer to a stock structure to initialize. This argument can be NULL to initialize a

stock database without resetting the structure.Return Values: CRSP_SUCCESS: if stock internals successfully initialized

CRSP_FAIL: if error opening or reading initialization file

Side Effects:

Internal structures will be initialized, including the array of known stock sets. They will be stored in static structurethis module and used by other stk functions. All of the pointers in the stock structure stkptr will be set to NULL.If a structure is already initialized with crsp_stk_open, crsp_stk_free should be used or memory will belost.

Preconditions: None; crsp_stk_init is called by crsp_stk_open

58

Page 75: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

s.

et ructure

and g the ars

crsp_stk_open Opens an Existing Stock Set in a CRSPAccess97 Database

Prototype:int crsp_stk_open (char *root, int setid, CRSP_STK_STRUCT *stkptr, int wanted,

char *mode, int bufferflag)

Description: opens an existing stock set in a CRSPAccess97 Database

Arguments:char *root - path of root directory. If the root is NULL the CRSP_DSTK or CRSP_MSTK environment variables are

used.int setid – the set identifier

10 Daily CRSP Stock Database20 Monthly CRSP Stock Database

CRSP_STK_STRUCT *stkptr – pointer within stock structure to be associated with this database. If wanted objects in stkptr are NULL then space for objects where the structure is allocated by this function.

int wanted – mask indicating which modules will be used. The list below shows the wanted values for the stock modules. The wanted values can be summed or summary wanted values can be used to open multiple moduleOnly modules that are selected in the wanted parameter have memory allocated in the stock structure and only those modules can be accessed in further access functions to the database.

Individual modules:STK_HEAD 1 header structureSTK_EVENTS 2 names, dists, shares, delists, nasdinSTK_LOWS 4 lowsSTK_HIGHS 8 highsSTK_PRICES 16 close or bid/ask averageSTK_RETURNS 32 total returnsSTK_VOLUMES 64 volumesSTK_PORTS 128 portfoliosSTK_BIDS 256 bidsSTK_ASKS 512 asksSTK_RETXS 1024 returns without dividendsSTK_SPREADS 2048 spreadsSTK_TRADES 4096 number of tradesSTK_ALTPRCS 8192 alternate pricesSTK_GROUPS 16384 groupsGroup of modules:STK_INFOS 3 header and event dataSTK_DDATA 124 price, high, low, volume and returns time seriesSTK_SDATA 4864 bids, asks, and number of trades time seriesSTK_STD 5119 header, events, prices, high, low, volume, returns, and portsSTK_ALL 32767 all modules

char *mode – usage while open (r=read, rw=read/write)int bufferflag - level of buffering: 0 : no buffering, 1 : use default, n : use factor of default

Return Values: crspnum: (integer) if opened successfully. This crspnum is used in further access functions to the database.CRSP_FAIL: (integer) if error opening or loading files, if bad parameters, root already opened exclusively, stock s

already opened rw, wanted not a subset of set’s modules, set does not exist in root, set already opened and stallocated, error allocating memory for internal or stock structures.

Side Effects:

This will load root and stock initialization files if needed, open the root including loading the configuration structure index structures to memory, opening the address file, and if necessary allocating memory to file buffers, loadinfree list, and logging information to the log file. Files will be opened for all wanted modules. Associated calendwill be loaded if necessary. Wanted stock structures will be allocated.

Preconditions: None. The root may already be open under a different set in r mode.

59

Page 76: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

al

e

Use

e

crsp_stk_read Loads Wanted Stock Data for a security by PERMNO

crsp_stk_read_cus Loads Wanted Stock Data Using CUSIP Identifier, Header as the Key

Prototype:int crsp_stk_read (int crspnum, int setid, int *key, int keyflag, CRSP_STK_STRUCT *stkptr, int wanted)

Description: loads wanted stock data for a PERMNOArguments: int crspnum – crspdb root identifier returned by crsp_stk_open

int setid – the set identifier used in crsp_stk_openint *key – specific PERMNO of data to load, or pointer to integer that will be loaded with the key found if a position

keyflag is used.int keyflag – CRSP_EXACT constant to search for the permno in *key, or positional constant:

CRSP_FIRST - the first key in the databaseCRSP_PREV - the previous keyCRSP_LAST - the last key in the databaseCRSP_SAME - the same keyCRSP_NEXT - the next key

CRSP_STK_STRUCT *stkptr - structure to load dataint wanted - mask of flags indicating which module data to load. See crsp_stk_open for module codes.

Return Values: CRSP_SUCCESS: if data loaded successfullyCRSP_EOF: if next or previous key at end or beginning of fileCRSP_FOUND_OTHER: if key found in root, but not for this setidCRSP_NOT_FOUND: if key not found in rootCRSP_FAIL: if error with bad parameters, invalid or unopened crspnum error in read, impossible wanted

Side Effects:

Data from the wanted modules will be loaded to the proper location in the stock structure. The position used for thnext positional read is reset based on the key found. If keyflag is a positional qualifier, the actual PERMNO found is loaded to *key. Data is only loaded to wanted data structures within the range of valid data for the security. stock clear functions to erase previously loaded data.

Preconditions:

The stock set must be previously opened. The crspnum must be returned from a previous crsp_stk_open call. stkptr must have been passed to a previous crsp_stk_open call. wanted must be a subset of the wanted parameter passed to the crsp_stk_open function.

Prototype:int crsp_stk_read_cus (int crspnum, int setid, char *cusip, int keyflag, CRSP_STK_STRUCT *stkptr, int wanted)

Description: loads wanted stock data for a security using the CUSIP Identifier, Header (hcusip) as the keyArguments: int crspnum – crspdb root identifier returned by crsp_stk_open

int setid – the set identifier used in crsp_stk_openchar *cusip – CUSIP Identifier, Header to load, or pointer to string that will be loaded with the key found if a

positional keyflag is used.int keyflag – qualify conditions of key searches:CRSP_EXACT - only accept an exact matchCRSP_BACK - find last previous key if no exact matchCRSP_FORWARD – find the first following key if no exact matchor positional constant:CRSP_FIRST - the first key in the databaseCRSP_PREV - the previous keyCRSP_LAST - the last key in the databaseCRSP_SAME - the same keyCRSP_NEXT - the next key

CRSP_STK_STRUCT *stkptr - structure to load dataint wanted – mask of flags indicating which module data to load. See crsp_stk_open for module codes.

Return Values: CRSP_SUCCESS: if data loaded successfullyCRSP_EOF: if next or previous key at end or beginning of fileCRSP_NOT_FOUND; if key not foundCRSP_FAIL: if error with bad parameters, invalid or unopened crspnum and stknum, error in read, impossible

wanted, invalid CUSIP index

Side Effects:

Data from the wanted modules will be loaded to the proper location in the stock structure. The position used for thnext positional read is reset based on the key found. If keyflag is a positional qualifier, the actual CUSIP Identifier, Header found is loaded to *cusip. Data is only loaded to wanted data structures within the range of valid data for the security. Use stock clear functions to erase previously loaded data

Preconditions:

The stock set must be previously opened. The crspnum must be returned from a previous crsp_stk_open call. stkptr must have been passed to a previous crsp_stk_open call. wanted must be a subset of the wanted parameter passed to the crsp_stk_open function.

60

Page 77: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

e

ity.

crsp_stk_read_permco Loads Wanted Stock Data Using PERMCO as the Key

Prototype:int crsp_stk_read_permco (int crspnum, int setid, int *permco, int keyflag, CRSP_STK_STRUCT *stkptr, int wanted)

Description: loads wanted stock data for a security using PERMCO as the keyArguments: int crspnum – crspdb root identifier returned by crsp_stk_open

int setid – the set identifier used in crsp_stk_openint *permco – PERMCO to load, or pointer to an integer that will be loaded with the key found if a positional

keyflag is used.int keyflag – positional qualifier or match qualifier – see crsp_stk_read_cusCRSP_STK_STRUCT *stkptr – structure to load dataint wanted – mask of flags indicating which module data to load. See crsp_stk_open for module codes.

Return Values: CRSP_SUCCESS: if data loaded successfullyCRSP_EOF: if next or previous key at end or beginning of fileCRSP_NOT_FOUND: if key not foundCRSP_FAIL: if error with bad parameters, invalid or unopened crspnum and stknum, error in read, impossible

wanted, invalid PERMCO index

Side Effects:

Data from the wanted modules will be loaded to the proper location in the stock structure. The position used for thnext positional read is reset based on the key found. If keyflag is a positional qualifier, the actual PERMCO found is loaded to *PERMCO. Data is only loaded to wanted data structures within the range of valid data for the securUse stock clear functions to erase previously loaded data

Preconditions:

The stock set must be previously opened. The crspnum must be returned from a previous crsp_stk_open call. stkptr must have been passed to a previous crsp_stk_open call. wanted must be a subset of the wanted parameter passed to the crsp_stk_open function.

61

Page 78: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

e

or

e

ity.

crsp_stk_read_hcus Loads Wanted Stock Data Using Historical CUSIP as the Key

crsp_stk_read_siccd Loads Wanted Stock Data Using Historical SIC Code as the Key

Prototype:int crsp_stk_read_hcus (int crspnum, int setid, char *cusip, int keyflag, CRSP_STK_STRUCT *stkptr, int wanted)

Description: loads wanted stock data for a security using historical CUSIP as the keyArguments: int crspnum – crspdb root identifier returned by crsp_stk_open

int setid – the set identifier used in crsp_stk_openchar *cusip – historical CUSIP to load, or pointer to string that will be loaded with the key found if a positional

keyflag is used.int keyflag – positional qualifier or nomatch qualifier– see crsp_stk_read_cusCRSP_STK_STRUCT * stkptr – structure to load dataint wanted – mask of flags indicating which module data to load. See crsp_stk_open for module codes.

Return Values: CRSP_SUCCESS: if data loaded successfullyCRSP_EOF: if next or previous key at end or beginning of fileCRSP_NOT_FOUND: if key not foundCRSP_FAIL: if error with bad parameters, invalid or unopened crspnum, error in read, impossible wanted, invalid

CUSIP index

Side Effects:

Data from the wanted modules will be loaded to the proper location in the stock structure. The position used for thnext positional read is reset based on the key found. If keyflag is a positional qualifier, the actual historical CUSIP found is loaded to *cusip. Data is only loaded to wanted data structures within the range of valid data fthe security. Use stock clear functions to erase previously loaded data

Preconditions:

The stock set must be previously opened. The crspnum must be returned from a previous crsp_stk_open call. stkptr must have been passed to a previous crsp_stk_open call. wanted must be a subset of the wanted parameter passed to the crsp_stk_open function.

Prototype:int crsp_stk_read_siccd (int crspnum, int setid, int *siccd, int keyflag,

CRSP_STK_STRUCT *stkptr, int wanted)

Description: loads wanted stock data for a security using Standard Industrial Classification (SIC) Code (siccd) as the keyArguments: int crspnum – crspdb root identifier returned by crsp_stk_open

int setid - the set identifier used in crsp_stk_openint *siccd – siccd to load, or pointer to integer that will be loaded with the key found if a positional keyflag is

used.int keyflag – positional qualifier or no match qualifier– see crsp_stk_read_cusCRSP_STK_STRUCT *stkptr - structure to load dataint wanted - mask of flags indicating which module data to load. See crsp_stk_open for module codes.

Return Values: CRSP_SUCCESS: if data loaded successfullyCRSP_EOF: if next or previous key at end or beginning of fileCRSP_NOT_FOUND: if key not foundCRSP_FAIL: if error with bad parameters, invalid or unopened crspnum and stknum, error in read, impossible

wanted, invalid siccd index

Side Effects:

Data from the wanted modules will be loaded to the proper location in the stock structure. The position used for thnext positional read is reset based on the key found. If keyflag is a positional qualifier, the actual SIC Code found is loaded to *siccd Data is only loaded to wanted data structures within the range of valid data for the securUse stock clear functions to erase previously loaded data.

Preconditions:

The stock set must be previously opened. The crspnum must be returned from a previous crsp_stk_open call. stkptr must have been passed to a previous crsp_stk_open call. wanted must be a subset of the wanted parameter passed to the crsp_stk_open function.

62

Page 79: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

e

crsp_stk_read_ticker Loads the Wanted Stock Data Using Ticker, Header As The Key

Prototype:int crsp_stk_read_ticker (int crspnum, int setid, char *ticker, int keyflag,

CRSP_STK_STRUCT *stkptr, int wanted)

Description: loads wanted stock data for a security using Ticker, Header as the keyArguments: int crspnum – crspdb root identifier returned by crsp_stk_open

int setid - the set identifier used in crsp_stk_openchar *ticker – pointer to header ticker to load, or pointer to string that will be loaded with the key found if a

positional keyflag is used.int keyflag – positional qualifier or no match qualifier– see crsp_stk_read_cusCRSP_STK_STRUCT *stkptr - structure to load dataint wanted – mask of flags indicating which module data to load. See crsp_stk_open for module codes.

Return Values: CRSP_SUCCESS: if data loaded successfullyCRSP_EOF: if next or previous key at end or beginning of fileCRSP_NOT_FOUND: if ticker not foundCRSP_FAIL: if error with bad parameters, invalid or unopened crspnum, error in read, impossible wanted, invalid

ticker index

Side Effects:

Data from the wanted modules will be loaded to the proper location in the stock structure. The position used for thnext positional read is reset based on the key found. If keyflag is a positional qualifier, the actual header ticker found is loaded to *ticker. Data is only loaded to wanted data structures within the range of valid data for thesecurity. Use stock clear functions to erase previously loaded data.

Preconditions:

The stock set must be previously opened. The crspnum must be returned from a previous crsp_stk_open call. stkptr must have been passed to a previous crsp_stk_open call. wanted must be a subset of the wanted parameter passed to the crsp_stk_open function.

63

Page 80: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

r

in a

st

Index Access Functions

The following tables list the available functions to access CRSPAccess97 index data. Standard usage is to use an open function, followed by successive reads and a close. Different databases and sets can be processed simultaneously if there is a matching structure defined for each one.

C Indices Access Functions

crsp_ind_clear Loads Missing Value Arrays in an Index Set Structure

crsp_ind_close Closes an Index Set

crsp_ind_free Deallocates Memory and Reinitializes an Index Set Structure

Access Function Description

crsp_ind_clear Loads Missing Value Arrays in an Indices Set Structurecrsp_ind_close Closes an Indices Setcrsp_ind_free Deallocates Memory And Reinitializes An Indices Set Structurecrsp_ind_init Initializes a CRSPAccess97 database for Indices Access crsp_ind_open Opens An Indices Set In a CRSPAccess97 Databasecrsp_ind_read Loads Wanted Data For an Index

Prototype: int crsp_ind_clear (CRSP_IND_STRUCT *ind, int clearflag)

Description:

load defined missing values to all allocated objects in an index set structure. It is assumed that the pointers are eitherNULL or have been allocated by a set open function. The function allows clearing on a range level, range and arraylevel, or array level.

Arguments: CRSP_IND_STRUCT *ind – pointer to an index structure pointer to be cleared.int clearflag – constant identifying the level of initialization. Supported values are:

CRSP_CLEAR_INIT – only reset num for CRSP_ARRAYs and beg and end for CRSP_TIMESERIES, and setstructure to missing values for CRSP_ROWs.CRSP_CLEAR_ALL – set ranges to missing and set missing values foall elements in the object arraysCRSP_CLEAR_RANGE – set missing values for all elements in the object arrays within the range between beg andend in a CRSP_TIMESERIES or between 0 and num–1 in a CRSP_ARRAY, or the single element in a CRSP_ROW.CRSP_CLEAR_SET – set ranges in the 0’th element of a CRSP_TIMESERIES array or the maxarr-1’th elementof a CRSP_ARRAY to missing values specific to the array type, or sets missing values to the single elementCRSP_ROW.

Return Values: CRSP_SUCCESS: if successCRSP_FAIL: if bad parameters

Side Effects: The index structure pointer has all allocated fields initialized according to the clearflag Preconditions: The index structure must either have object fields set to NULL or allocated with a set open function.Call Sequence: call after crsp_ind_open and before each crsp_ind_read.

Prototype: int crsp_ind_close (int crspnum, int setid, CRSP_IND_STRUCT *indptr)

Description: close an index setArguments: int crspnum – identifier of the CRSP database, as returned by crsp_ind_open

int setid – identifier of the index set code to close, as used in the openCRSP_IND_STRUCT *indptr – pointer to index structure to deallocate; if NULL, no deallocation occurs

Return Values: CRSP_SUCCESS: if successfully closed index setCRSP_FAIL: if error closing a file or illegal parameter

Side Effects:All index module files are closed, and memory allocated by them in the index structure is freed. If these are the la

modules open in the database, the root is also closed. If indptr is NULL, no structure memory deallocation occurs.Preconditions: The crspnum and setid must be taken from a previous crsp_ind_open call.

Prototype: int crsp_ind_free (int crspnum, int setid, CRSP_IND_STRUCT *indptr)

Description: deallocates memory and reinitializes an index set structureArguments: int crspnum - identifier of CRSPDB database, as returned by open

int setid - identifier of the index set code to closeCRSP_IND_STRUCT *indptr - pointer to index structure

Return Values: CRSP_SUCCESS - if successfully deallocated and reset index structure, or index structure is NULLCRSP_FAIL - if error deallocating memory, error in parameters

Side Effects:The index structures are reset so all pointers are NULL and all settings are 0. All memory allocated to existing objects is

freed. There is no effect if indptr is NULL.

Preconditions:The crspnum must be known from a previous crsp_ind_open call. The setid is a predefined identifier for the

index daily or monthly series or group set of index data previously opened with crsp_ind_open.

64

Page 81: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

crsp_ind_init Initializes a CRSPAccess97 Database for Indices Access

Prototype: int crsp_ind_init (CRSP_IND_STRUCT *indptr)

Description:initializes an index structure by setting all pointers to NULL and all counts to zero. Initializes CRSP internal structures if

no previous initialization has been done.

Arguments:CRSP_IND_STRUCT *indptr - pointer to the index structure to be initialized. This argument can be NULL to

initialize a CRSP internal database without resetting an existing structure.Return Values: CRSP_SUCCESS: if index internals successfully initialized

CRSP_FAIL: if error opening or reading initialization file

Side Effects:

Internal structures will be initialized, including the array of known sets. They will be stored in internal structures in this module and used by other CRSP functions. All the pointers in the index structure indptr will be set to null. If a structure is already initialized with crsp_ind_open, crsp_ind_free should be used or memory will be lost.

Preconditions: None

65

Page 82: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

x les. those

and the ars

crsp_ind_open Opens an Existing Index Set in an Existing CRSPDB

Prototype:int crsp_ind_open (char *root, int setid, CRSP_IND_STRUCT *indptr, int wanted,

char *mode, int bufferflag)

Description:opens an existing index set in an existing crspdb. This opens database files, allocates needed memory to a structure,

and initializes internal structures to index data can be used. See crsp_ind_)clear for clearing data

Arguments:char *root - path of root directory. If root is NULL the CRSP_DSTK or CRSP_MSTK environment variables are

used.int setid – the set identifier

400 = monthly index groups420 = monthly index series440 = daily index groups460 = daily index series

CRSP_IND_STRUCT *indptr – pointer to index structure to be associated with this database. If indptr is not NULL then space for a CRSP_IND_STRUCT is allocated by this function.

int wanted – mask indicating which modules will be used. The list below shows the wanted values for the indemodules. The wanted values can be summed or summary wanted values can be used to open multiple moduOnly modules that are selected in the wanted parameter have memory allocated in the index structure and onlymodules can be accessed in further access functions to the database.

IND_HEAD 1 header structure and index description IND_REBALS 2 rebalancing information for index groups IND_LISTS 4 issue lists IND_USDCNTS 8 portfolio used counts IND_TOTCNTS 16 portfolio total eligible counts IND_USDVALS 32 portfolio used weights IND_TOTVALS 64 portfolio eligible weights IND_TRETURNS 128 total returns IND_ARETURNS 256 capital appreciation returns IND_IRETURNS 512 income returns IND_TLEVELS 1024 total return index levels IND_ALEVELS 2048 capital appreciation index levels IND_ILEVELS 4096 income return index levels

Symbols are available for common groups of modules. IND_ALL selects all the index data. IND_INFO = IND_HEAD + IND_REBALS + IND_LISTS IND_RETURNS = IND_TRETURNS + IND_ARETURNS + IND_IRETURNS IND_LEVELS = IND_TLEVELS + IND_ALEVELS + IND_ILEVELS IND_COUNTS = IND_USDCNTS + IND_TOTCNTS + IND_USDVALS + IND_TOTVALS IND_RESULTS = IND_HEAD + IND_USDCNTS + IND_USDVALS + IND_TRETURNS IND_ARESULTS= IND_HEAD + IND_USDCNTS + IND_USDVALS + IND_ARETURNS IND_IRESULTS= IND_HEAD + IND_USDCNTS + IND_USDVALS + IND_IRETURNS IND_STD = IND_HEAD + IND_COUNTS + IND_TRETURNS + IND_ARETURNS IND_ALL = IND_INFO + IND_RETURNS + IND_LEVELS + IND_COUNTS

char *mode – usage while open. Possible string values are“r” = read, “rw” = read/write

int bufferflag - level of buffering: 0 : no buffering, 1 : use default, n : use factor of defaultReturn Values: int crspnum : if opened successfully. This crspnum is used in further access functions to the database.

int CRSP_FAIL : if error opening or loading files, if bad parameters, root already opened exclusively, index set already opened “rw”, wanted not a subset of set’s modules, set does not exist in root, set already opened andstructure allocated, error allocating memory for internal or stock structures.

Side Effects:

This will load root and index initialization files if needed, open the root including loading the configuration structure index structures to memory, opening the address file, and if necessary allocating memory to file buffers, loadingfree list, and logging information to the log file. Files will be opened for all wanted modules. Associated calendwill be loaded if necessary. Wanted index structures will be allocated.

Preconditions: None; the root may already be open. If a new index structure is passed additional fields may be allocated.

66

Page 83: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

ext

ar

crsp_ind_read Loads Wanted Index Data For an INDNO

Prototype:int crsp_ind_read (int crspnum, int setid, int *key, int keyflag, CRSP_IND_STRUCT

*indptr, int wanted)

Description: loads wanted index data based on Permanent Index Identification Number (indno)Arguments: int crspnum – crspdb root identifier returned by crsp_ind_open

int setid - the set identifier used in crsp_ind_openint *key – specific indno of data to loadint keyflag – CRSP_EXACT constant to search for the indno in *key, or positional constant:

CRSP_FIRST - the first key in the databaseCRSP_PREV - the previous keyCRSP_LAST - the last key in the databaseCRSP_SAME - the same keyCRSP_NEXT - the next key

CRSP_STK_STRUCT *indptr – structure to load dataint wanted - mask of flags indicating which module data to load. See crsp_ind_open for module codes.

Return Values: CRSP_SUCCESS: if data loaded successfullyCRSP_EOF: if next or previous key at end or beginning of file CRSP_FOUND_OTHER: if key found in root, but not for this set CRSP_NOT_FOUND: if key not found in databaseCRSP_FAIL:if error with bad parameters, invalid or unopened crspnum and setid, error in read, impossible

wanted

Side Effects:

Data from the wanted modules will be loaded to the proper location in the index structure. The position used for the npositional read is reset based on the key found. If keyflag is a positional qualifier, the actual indno found is loaded to *key. Data is only loaded to wanted data structures within the range of valid data for the index. Use index clefunctions to erase previously loaded data.

Preconditions:

The index set must be previously opened. The crspnum must be returned from a previous crsp_stk_open call. indptr must have been passed to a previous crsp_stk_open call. wanted must be a subset of the wanted parameter passed to the crsp_ind_open function.

67

Page 84: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

General Access Functions

The CRSPAccess97 general access functions include error functions and portable file operation functions.

Function Group Description

Error-handling The CRSPAccess97 function for handling errors produced by CRSP functionsFile Operations Portable I/O functions based on C run-time library functions

68

Page 85: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

The

riable e files ith tring s are rs must

n the

Error-Handling Function

crsp_errprintf Error Handler for CRSP Functions

Function Description

crsp_errprintf error handler for CRSPAccess97 library functions

Prototype: int crsp_errprintf (va_list)

Description: Builds error messages using an installation-wide file of messages, and supports basic handling of the resultsEach error message, including a mnemonic name and text description, is stored in a file in the CRSP initialization

directory. A unique integer error message number is assigned to each message. The function is passed a message number, an error number, flags for type of error and handling output, plus optionally arguments on where the results are sent and variables to modify the error messages. The message description is built into an output error message. If the error is from a system call, system error messages and error numbers can be appended to this message. The message is then written to the location specified in the print flag.

There is a generic message available to users wishing to use the crsp_errprintf functionality and there are two environment variables users can set to change the behavior of the error message function.

CRSP_TRACE – can be used to modify the output behavior of the function. The default behavior is to add only theformatted string to the message output, and use the CRSP_NULL printflag option. If CRSP_TRACE is defined it must have one or more of the following one-letter codes in a string. Each code present changes the output.possible codes are:m = CRSP_MSGNUMBER – add the message number to message outpute = CRSP_ERRNUMBER – add the error number to message outputn = CRSP_MSGNAME – add the message header name to message outputf = CRSP_MSGFORMAT – do not add formatted string to message outputs = CRSP_SEVERITY – add severity name to outputt = CRSP_ERRTYPE – add error type to outputc = CRSP_CALLTRACE – print call error type messageso = CRSP_NULLOUT – overrides CRSP_NULL printflag option in library functions. Print message directly

to standard output and clear errmsgr = CRSP_NULLERR – overrides CRSP_NULL printflag option in library functions. Print message directly

to standard error and clear errmsgw = CRSP_NOWARN – warning messages are ignoredi = CRSP_NOINFO – informational messages are ignoreda = CRSP_NOFATAL – fatal messages are ignored

CRSP_MSGFILE – can be used to use messages from one or more alternate message files. If this environment vais set to a comma-delimited set of files, the function will search these files in order for messages. The messagmust have a leading line with the lowest and highest message numbers allowed in the file, followed by lines wthree text fields delimited by pipes ( | ), containing in order a message number, a header name, and a format scompatible with the C printf function. The message lines must be sorted by message number. Header namelimited to 20 characters, and the format cannot produce a string of more than 500 characters. Message numbebe 100,000 or higher to avoid possible conflict with standard CRSPAccess97 messages

The standard CRSP message file is named crsp_error_msg.dat. It is found in the directory set by environment variables $CRSP_LIB on Unix, %crsp_lib% on Windows NT, and CRSP_LIB: on OpenVMS. It can contain message numbers in the range of 1 to 99,999. If an alternate message file contains a message number also iCRSP message file, the alternate definition is used, and therefore must have a compatible format.

Arguments: variable argument list, including:int errnum – number assigned to the specific errorint msgnum – number assigned to the specific message, must be defined in the error message fileint errorflag – flag for the type and severity of the error. It must be a constant of the following form:

CRSP_severity_typewhere severity is one of:INFO – informationalWARN – warningFATAL – fataland type is one of:USER – error in user argument or usageSYS – error returned by a system or external functionCALL – function call returned an errorPRINT – print global error messages

int printflag – flag for the method of handling the output. It must be one of the following constants:CRSP_ERROUT_STDERR – write messages directly to standard errorCRSP_ERROUT_STDOUT – write messages directly to standard outputCRSP_ERROUT_FILE – write messages to a file pointer (previously opened with fopen) given in the next argument

69

Page 86: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

ce

CRSP_ERROUT_STRING – write messages to string given in the next argument. The string must have enough spaallocated to store the message.

CRSP_NULL – append messages to a global string err_msg. If messages extend past the length of the string, previously stored messages are printed to the screen. This is used by all CRSP library functions.

FILE * errfileptr – optional argument present only if printflag is CRSP_ERROUT_FILE. If present, error messages are written to this file. Errfileptr is the file handler returned by fopen.

char * msgstring – optional argument present only if printflag is CRSP_ERROUT_STRING. If present, error messages are copied to the string

int msgstringlen – optional argument present only if printflag is CRSP_ERROUT_STRING. If present, msgstringlen is the length allocated to msgstring.

… - list of variables to be embedded in the error message. There can be zero or more variables. There must be a one to one correspondence between the number and types of variables in this list and the format string in the CRSP error message file for the specified msgnum.

Return Values: CRSP_SUCCESS: if error handled successfullyCRSP_FAIL: if error in parameters or in opening or reading the error message file

Side Effects:The crsp_init initialization function is called to initialize access. The crsp_error_msg.dat file in the

initialization directory is opened the first time the function is called and closed when the program exits.

Preconditions:

CRSP functions always place errors in a global string named err_msg . CRSP environment variables must be set properly so the file crsp_error_msg.dat is found in the CRSP_LIB directory. See the description for optional environment variables that affect the results.

70

Page 87: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

C Portable File Functions

These functions call standard I/O functions in the C run time library, but can be used on Windows, Unix, and OpenVMS systems without changes and incorporate the CRSP error handling function.

crsp_file_append Generic File Append for Multiple Platforms

crsp_file_close Generic File Close for Multiple Platforms

crsp_file_fopen Generic File fopen for Multiple Platforms

Function Description

crsp_file_append generic file append for multiple platformscrsp_file_close generic file close for multiple platformscrsp_file_fopen generic file fopen for multiple platformscrsp_file_lseek generic file lseek for multiple platformscrsp_file_open generic file open for multiple platformscrsp_file_read generic file read for multiple platformscrsp_file_remove generic file delete for multiple platformscrsp_file_rename generic file rename for multiple platformscrsp_file_search generic check for existence of a file on multiple platformscrsp_file_stamp generic generation of a unique file name for multiple platformscrsp_file_write generic file write for multiple platformscrsp_free generic memory free for multiple platforms

Prototype: int crsp_file_append (char *origfile, char *newfile)

Description: append a file by adding the data from the second file at the end of the first fileArguments: char *origfile – pointer to the original file

char *newfile – pointer to the new file to be appended to the first fileReturn Values: CRSP_SUCCESS: if appended successfully

CRSP_FAIL: error in parameters or error in open or write operationSide Effects: both files are opened with fopen, data from the second file is copied to the first, and then both files are closed.

Preconditions: both files must exist and contain character data with no records 500 characters or longer.

Prototype: int crsp_file_close (int file_desc)

Description: calls the C close functionArguments: int file_desc – file handler of file to close, as returned from open function

Return Values: CRSP_SUCCESS: if closed successfullyCRSP_FAIL: file not opened or error in close

Side Effects: file described by file_desc is closedPreconditions: file must be previously opened, with file_desc returned from open

Prototype: (FILE *)crsp_file_fopen (va_alist)

Description: platform-independent version of fopen with support for extra OpenVMS optionsArguments: variable argument list:

char *path – mandatory argument containing path of file to openchar *mode – mandatory argument containing mode passed to fopen, “r” to open read-only, “rw” to read and write. See fopen for all options.0-6 char *rmsflags – up to 6 optional RMS flags passed to fopen on OpenVMS systems, and ignored on other systems

Return Values: File pointer on successNULL if error opening file

Side Effects:the file specified in the first parameter is opened. The default setting for OpenVMS systems is “mbc=127” unless

overridden by one of the rmsflags options.Preconditions: see fopen for options, and OpenVMS documentation for all RMS options.

71

Page 88: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

nly

crsp_file_lseek Generic C lseek for Multiple Platforms

crsp_file_open Generic C Open for Multiple Platforms

crsp_file_read Generic C Read for Multiple Platforms

crsp_file_remove Generic File Delete for Multiple Platforms

Prototype: int crsp_file_lseek (int file_desc, int offset, int direction)

Description:

Generic file lseek for multiple platforms. This function positions a file to an arbitrary byte position and returns the new position. Parameters are passed directly to the C lseek function. See documentation on this function for more details.

Arguments: int file_desc – the file descriptor of the current file returned by C openint offset – the offset specified in bytes int direction – an integer measuring whether the offset is to be measured:

forward from the beginning of the file (direction = SEEK_SET)forward from the current position (direction = SEEK_CUR)forward from the end of the file (direction = SEEK_END)

Return Values: the new file position if successfulCRSP_FAIL: error if file descriptor unidentified, or a seek was attempted before the beginning of the file.

Side Effects: the current position in the file is set for further operationsPreconditions: file must be previously opened with the open function

Prototype:int crsp_file_open (char *file_spec, int flags, unsigned int mode, int platflags, int pmode, int allocate, int mbc, int extend)

Description:

Generic file open for multiple platforms. Parameters for OpenVMS, Unix, and Windows versions are passed, and othe ones needed for the current platform are passed to the C open function. See C documentation on this function for more details.

Arguments: char *file_spec – character string containing a valid file specification of a file to be opened.int flags – flags for permitted usage of opened file int mode – the file protection of a new fileint platflags – additional flags bitwise or’ed with flags if Windows, ignored if another systemint pmode – additional protection modes or’ed with mode if Windows, ignored if another systemint allocate – blocks to allocate for a new file on OpenVMS, ignored if another systemint mbc – block count per I/O on OpenVMS, ignored if another systemint extend – blocks to allocate if additional space is needed on OpenVMS, ignored if another system

Return Values: file descriptor if opened successfully, to be used in other file operations with this fileCRSP_FAIL: error if file could not be opened

Side Effects: the file is opened and the file pointer is returned for further accessPreconditions: the file existence and protections must agree with flags and modes passed to open

Prototype: int crsp_file_read (int file_desc, void *buffer, int nbytes)

Description: Generic file read for multiple platformsArguments: int file_desc – the file descriptor of the current file returned by C open

void *buffer – address of contiguous storage where data will be loadedint nbytes – the maximum number of bytes to read

Return Values:the number of bytes read. The return value does not necessarily equal nbytes since the function does not read beyond

the end of the file or input terminal lineCRSP_FAIL: if error in parameters or read

Side Effects: the current position in the file is set to the end of the readPreconditions: file must be previously opened with the open function

Prototype: int crsp_file_remove (char *file_spec)

Description: calls the C remove function to delete a fileArguments: char *file_spec – specification of file to remove

Return Values: CRSP_SUCCESS: if removed successfullyCRSP_FAIL: error in parameters or error in remove operation

Side Effects: file is removedPreconditions: file must exist and user must have delete permissions

72

Page 89: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

stem

he C

crsp_file_rename Generic File Rename for Multiple Platforms

crsp_file_search Generic Check for the Existence of a File

crsp_file_stamp Create a Unique File Name

crsp_file_write Generic C Write for Multiple Platforms

crsp_free Generic Memory Free for Multiple Platforms

Prototype: int crsp_file_rename (char *old_file_spec, char *new_file_spec)

Description: calls the C rename function to change the name of a fileArguments: char *old_file_spec – specification of the file to rename

char *new_file_spec – new specification of the fileReturn Values: CRSP_SUCCESS: if renamed

CRSP_FAIL: error in parameters or error in file operation or permissionsSide Effects: the file is renamed

Preconditions:the old file must exist, the second must be a valid specification, and the rename operation must be valid on the sy

between the two files.

Prototype: int crsp_file_search (char *file_spec)

Description: Checks for the existence of a fileArguments: char *file_spec – specification of file to check

Return Values: CRSP_SUCCESS: if the file existsCRSP_FAIL: if the file does not exist or cannot be opened for read access

Side Effects: file is opened and closedPreconditions: file must have read permissions

Prototype: char *crsp_file_stamp ( )

Description:

Creates a string that can be built into a unique file name based on system time and user ID. The string contains tprocess ID returned from the C getpid function, an underscore, and the system time in seconds returned from thetime function.

Arguments: noneReturn Values: pointer to a string with a file specification if successful

NULL: if failure getting system time or user IDSide Effects: memory is allocated up to 80 characters to store the new file name

Preconditions: none

Prototype: int crsp_file_write (int file_desc, void *buffer, int nbytes)

Description: Generic file write for multiple platformsArguments: int file_desc – the file descriptor of the current file returned by C open

void *buffer – address of contiguous storage where data will be retrievedint nbytes – the maximum number of bytes to write

Return Values: the number of bytes written.CRSP_FAIL: if error in parameters or write

Side Effects: the current position in the file is set to the end of the written dataPreconditions: file must be previously opened with the open function

Prototype: int crsp_free (void *ptr)

Description: calls the C free functionArguments: void *ptr – pointer to the memory to be freed

Return Values: CRSP_SUCCESS: if successfully freed. Always true on Windows and Unix where C free function is void.CRSP_FAIL: error freeing memory on OpenVMS systems

Side Effects: memory pointed to by ptr is deallocatedPreconditions: none

73

Page 90: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

General Utility Functions

The utility functions operate on the base CRSPAccess97 data structures and are not specific to a type of data. Theyinclude operations on calendars CRSP object structures and general utilities.

Function Group Description

Calendar Utility Functions Functions Used to Manipulate CRSP CalendarsCompare Function Functions Used to Compare Data in Two Structures

Database Information Functions Find CRSPAccess97 Database InformationObject Clear Functions Functions Used to Clear CRSP StructuresObject Utility Functions Functions Used to Manipulate CRSP Object StructuresString Functions Functions Used to Manipulate Character StringsClear Functions Functions Used to Clear CRSP Structures

74

Page 91: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

Calendar Utility Functions

These functions are used to manipulate calendar data in CRSPAccess97 databases.

Calendar Utility Functions

crsp_cal_datecmp CALDT Date Search

crsp_cal_dt2lin Transforms YYYYMMDD Format into Linear Date

crsp_cal_dt2parts Separates the YYYYMMDD Format into Year, Month and Day

crsp_cal_lin2dt Transfers Linear Dates into YYYYMMDD Format

Function Description

crsp_cal_datecmp CALDT Date Searchcrsp_cal_dt2lin Transforms YYYYMMDD Format into Linear Datecrsp_cal_dt2parts Separates the YYYYMMDD Format into Year, Month and Daycrsp_cal_lin2dt Transfers Linear Date into YYYYMMDD Formatcrsp_cal_middt Finds the Mid-Point Date of a Rangecrsp_cal_diffdays Finds the Number of Calendar Days Between two Datescrsp_cal_link Creates a Link to Map Periods of two Calendarscrsp_cal_search Generic Calendar Date Search

Prototype: int crsp_cal_datecmp (int *calelem, int *caldates, int ndays, int flag)

Description: searches for an array of caldates to find the matching date for calelem and return the array index .Arguments: int *calelem - pointer to the date in YYYYMMDD format

int *caldates - pointer to the array of calendar dates, usually the caldt pointer in a CRSP_CAL structureint ndays – the number of days in the caldates , usually ndays in a CRSP_CAL structureint flag - flag for handling inexact matches (see crsp_cal_search )

Return Values: index of date found if a date found according to flagCRSP_FAIL: if not acceptable match according to flag

Side Effects: none

Prototype: int crsp_cal_dt2lin (int idate)

Description: transforms the YYYYMMDD format of date into a linear date (number of days since 19000101)Arguments: int idate - the date to be transformed

Return Values: linear dateCRSP_FAIL: if error

Side Effects: none

Prototype: void crsp_cal_dt2parts (int idate, int *year, int *month, int *day)

Description: separates the yyyymmdd formatted date into year, month, day.Arguments: int idate – date need to be separated

int *year – pointer to be loaded with YYYY yearint *month – pointer to be loaded with MM monthint *day – pointer to be loaded with DD day

Return Values: none

Prototype: int crsp_cal_lin2dt (int linear_date)

Description: transfers the linear date (number of days since 19000101) into YYYYMMDD format date.Arguments: int linear_date - the date in linear format

Return Values: translated YYYYMMDD dateCRSP_FAIL: if error

Side Effects: None

75

Page 92: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

crsp_cal_middt Finds the Mid-Point Date of a Range

crsp_cal_diffdays Finds the Number of Calendar Days Between Two Dates

crsp_cal_link Maps from One Calendar to Another

Prototype: int crsp_cal_middt (int idate1, int idate2)

Description: finds a date in the middle of first date and second dateArguments: int idate1 - first date, in YYYYMMDD format

int idate2 - second date, in YYYYMMDD formatReturn Values: middt: middle date between idate1 and idate2

CRSP_FAIL: if errorSide Effects: none

Prototype: int crsp_cal_diffdays (int idate1, int idate2)

Description: finds the number of days between two YYYYMMDD datesArguments: int idate1 - the first date

int idate2 - the end dateReturn Values: number of days

CRSP_FAIL: if errorSide Effects: none

Prototype: int crsp_cal_link (CRSP_CAL *calbase, CRSP_CAL *calsub, int wanted, int flag)

Description:Finds mapping between a base and subset calendar. The map will have each period in the subset calendar in terms of

period index ranges of the base.Arguments: CRSP_CAL *calbase - pointer to base calendar structure

CRSP_CAL *calsub – pointer to subset calendar structureint wanted - the type of calendar period identification to link in the source calendar. Possible values are:

CAL_TYPE_ID – wanted calllist CAL_TYPE_DATE - wanted caldt CAL_TYPE_DATERANGE - wanted date rangeCAL_TYPE_TIME - wanted date + time CAL_TYPE_TIMERANGE - wanted date range + time

int flags - flags for mapping when subset date/range is not applicable to the base. Possible values are:CRSP_CAL_EXACT – (= 0) non-exact matches are not mappedCRSP_CAL_BACK – (= -1) if not found use previousCRSP_CAL_NEXT – (=-1) if not found use next

Return Values: CRSP_SUCCESS: calmap successfully loadedCRSP_FAIL: error

Side Effects:

This function allocates space for the calmap CRSP_CAL structure element. It loops through calsub and for eachdate, finds the cal index at calbase for the same date and stores them in calsub ->calmap. The calsubbasecal pointer is set to basecal .

76

Page 93: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

crsp_cal_search Date Range Search

Prototype:int crsp_cal_search (CRSP_CAL *cal, int wanted, void *calelem, int flag, int

rangeflag)

Description:

Finds the relevant calendar index number for a given calendar period. The element type may be any of the typessupported by CRSP_CAL. crsp_cal_search will use only the calendar type that matches the element type. Theflag is used depending on type to handle inexact matches.

Arguments: CRSP_CAL *cal - pointer to the calendar structure to searchint wanted - type of calendar element that will be located, one of:

CAL_TYPE_ID (1) = callistCAL_TYPE_DATE (2) = caldtCAL_TYPE_DATERANGE (4) = date range CAL_TYPE_TIME (8) = date and timeCAL_TYPE_TIMERANGE (16) = date+time range

int calelem - calendar element to find. This is a pointer to a structure that must agree with the wanted parameter, either an int for CAL_TYPE_ID or CAL_TYPE_DATE or, a CRSP_CAL_TIME, a CRSP_CAL_DATE_RANGE, or a CRSP_CAL_TIMERANGE structure.

int flag - flag for handling inexact matches –CRSP_CAL_EXACT (0) - only exact matches are acceptableCRSP_CAL_BACK (-1) - if not found use previousCRSP_CAL_NEXT (1) - if not found use next

int rangeflag - option if calendar type and elements are date or time ranges0 not applicable1 use beginning of ranges2 use end of range3 use middle of beginning and end

Return Values: index of date if a date found according to flagCRSP_NOMATCH if no acceptable match according to flagCRSP_FAIL if invalid flag or data variable

Side Effects: none

77

Page 94: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

Comparison Functions

These functions are used to compare data.

crsp_cmp_int Compares Two Integers

crsp_cmp_string Compares Two Strings

Function Description

crsp_cmp_int Compares Two Integerscrsp_cmp_string Compares Two Strings

Prototype: int crsp_cmp_int(const void *elem1, const void *elem2)

Description: Compares two integers. Can be used as input functions to C search and sort functions.Arguments: const void* - elem1 – pointer to the first element

const void* - elem2 – pointer to the second element

Return Values:int: <0 if elem1 < elem2, 0 if elem1 = elem2,>1 if elem1 > elem2. Based on standard int

comparisons

Prototype: int crsp_cmp_string(const void *elem1, const void *elem2)

Description: Compares two strings. Can be used as input functions to C search and sort functions.Arguments: const void * -

elem1 - pointer to the first terminated stringconst void * - elem2 - pointer to the second terminated string

Return Values:int: <0 if elem1 < elem2, 0 if elem1 = elem2, >1 if elem1 > elem2. Based on standard

string comparisons

78

Page 95: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

CRSP Object Functions

These functions are used to manipulate base CRSPAccess97 object structures.

crsp_obj_verify_ts Verifies a CRSP Time Series Object

crsp_obj_verify_arr Verifies a CRSP Array Object

Function Description

crsp_obj_verify_ts Verifies a CRSP Time Series Objectcrsp_obj_verify_arr Verifies a CRSP Array Objectcrsp_obj_verify_row Verifies a CRSP Row Objectcrsp_obj_init_ts Initializes a CRSP Time Series Objectcrsp_obj_init_arr Initializes a CRSP Array Objectcrsp_obj_init_row Initializes a CRSP Row Objectcrsp_obj_comp_ts Compares two CRSP Time Series Objectscrsp_obj_comp_arr Compares two CRSP Array Objectscrsp_obj_comp_row Compares two CRSP Row Objectscrsp_obj_free_ts Frees a CRSP Time Series Objectcrsp_obj_free_arr Frees a CRSP Array Objectcrsp_obj_free_row Frees a CRSP Row Object

Prototype:int crsp_obj_verify_ts(CRSP_TIMESERIES *ptr, int arrtype, int subtype, int

maxarr, int caltypes)

Description: Verifies a time series object, by comparing array type, size, calendar, and data characteristics against expected valuesArguments: CRSP_TIMESERIES *ptr – pointer to a CRSP time series object

int arrtype – constant for the structure type of the array in the object arr. Constants are defined in crsp_const.hint subtype – constant for the subcategory of data in the array. Constants are defined in crsp_const.hint maxarr – maximum elements in the arrayint caltype – expected calendar type in the time series

Return Values: CRSP_SUCCESS: if verification is correct1251: object type does not verify in CRSP_TIMESERIES structures 1252: array type does not verify in time series 1253: subtype does not verify in time series 1254: maxarr does not verify in time series 1255: caltype does not verify in time series 1256: beg and end do not verify in time series 1257: end can not be greater then maxarr in time series 1258: cal pointer can not be NULL in time series 1259: ndays cannot be > than maxarr in time series 1260: arr pointer cannot be NULL in time series

Prototype:int crsp_obj_verify_arr (CRSP_ARRAY *crsp_array_ptr, int arrtype, int subtype,

int maxarr)

Description: Verifies a CRSP array object, by comparing array type, size, and data characteristics against expected valuesArguments: CRSP_ARRAY *crsp_array_ptr – pointer to a CRSP array object

int arrtype – constant for the structure type of the array in the object arr. Constants are defined in crsp_const.hint subtype – constant for the subcategory of data in the array. Constants are defined in crsp_const.hint maxarr – maximum elements in the array

Return Values: CRSP_SUCCESS: if verification is correct1271:object type does not verify in CRSP_ARRAY 1272: array type does not verify in CRSP_ARRAY 1273:subtype does not verify in CRSP_ARRAY 1274:maxarr does not verify in CRSP_ARRAY 1275: num can not be greater than maxarr in CRSP_ARRAY 1276: arr pointer cannot be null in CRSP_ARRAY

79

Page 96: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

eaderith the

are set

crsp_obj_verify_row Verifies a Crsp Row Object

crsp_obj_init_ts Initializes a CRSP Time Series Object

crsp_obj_init_arr Initializes a CRSP Array Object

Prototype: int crsp_obj_verify_row (CRSP_ROW *crsp_row_ptr, int arrtype, int subtype)

Description: Verifies a CRSP row object, by comparing array type and data characteristics against expected valuesArguments: CRSP_ROW *crsp_row_ptr – pointer to a CRSP row object to verify

int arrtype – constant for the structure type of the array in the object arr. Constants are defined in crsp_const.hint subtype – constant for the subcategory of data in the array. Constants are defined in crsp_const.h

Return Values: CRSP_SUCCESS – if verification is correct1282: object type does not verify in CRSP_ROW 1283: array type does not verify in CRSP_ROW 1284: subtype does not verify in CRSP_ROW1285: arr pointer cannot be null in CRSP_ROW

Prototype:

int crsp_obj_init_ts (CRSP_TIMESERIES **crsp_timser_ptr, int arrtype, int subtype, int maxarr, int caltype, int size_of_array, CRSP_CAL *calptr, void *init_ptr )

Description:

Initializes a time series object. If the crsp_timeser_ptr pointer passed is NULL, the function allocates space forthe object. If the array within the object is not allocated the function allocates space for the array. Object hvalues are set and the calendar is attached to the time series. Each element in the object’s array is initialized wvalue in init_ptr.

Arguments: CRSP_TIMESERIES **crsp_timser_ptr – pointer to a CRSP time series pointerint arrtype – constant for the structure type of the array in the object arr. Constants are defined in crsp_const.hint subtype – constant for the subcategory of data in the array. Constants are defined in crsp_const.hint maxarr – maximum elements in the arrayint caltype – calendar type, to allocate, =2 for caldtsint size_of_array – size of the structure for each array element CRSP_CAL *calptr – pointer to a calendar that will be attached to the time series objectvoid *init_ptr - a pointer to a structure of size size_of_array with missing values to load to each element in

the array. Can be NULL.Return Values: CRSP_SUCCESS:- if successfully initialized and space allocated

CRSP_FAIL: if error allocating memory, error in parameters

Prototype:int crsp_obj_init_arr( CRSP_ARRAY **crsp_array_ptr, int arrtype, int subtype, int

maxarr, int size_of_array, void *init_ptr)

Description:

Initializes an array object. If the crsp_array_ptr pointer passed is NULL, the function allocates space for the object.If the array within the object is not allocated the function allocates space for the array. Object header values and each element in the object’s array is initialized with the value in init_ptr.

Arguments: CRSP_ARRAY **crsp_array_ptr – pointer to a CRSP array structure pointerint arrtype – constant for the structure type of the array in the object arr. Constants are defined in crsp_const.hint subtype - constant for the subcategory of data in the array. Constants are defined in crsp_const.hint maxarr – maximum elements in the arrayint size_of_array – size of the structure for each array element void *init_ptr - a pointer to a structure of size size_of_array with missing values to load to each element in

the array. Can be NULL.Return Values: CRSP_SUCCESS: if successfully initialized and space allocated

CRSP_FAIL: if error allocating memory, error in parameters

80

Page 97: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

endars

crsp_obj_init_row Initializes a CRSP Row Object

crsp_obj_comp_ts Compares Two CRSP Time Series Objects

crsp_obj_comp_arr Compares Two CRSP Array Objects

crsp_obj_comp_row Compares Two CRSP Row Objects

Prototype:int crsp_obj_init_row(CRSP_ROW **crsp_row_ptr, int arrtype, int subtype, int

size_of_array, void *init_ptr)

Description:

Initializes a row object. If the crsp_row_ptr pointer passed is NULL, the function allocates space for the object. Ifthe array within the object is not allocated the function allocates space for the array. Object header values are set andthe object’s array element is initialized with the value in init_ptr.

Arguments: CRSP_ROW **crsp_row_ptr – pointer to a CRSP row pointerint arrtype – constant for the structure type of the array in the object arr. Constants are defined in crsp_const.hint subtype - constant for the subcategory of data in the array. Constants are defined in crsp_const.hint size_of_array – size of the structure for the array elementvoid *init_ptr - a pointer to a structure of size size_of_array with missing values to load to the row. Can be

NULL.Return Values: CRSP_SUCCESS: if successfully initialized and allocate space

CRSP_FAIL: if error allocating memory, error in parameters

Prototype:int crsp_obj_comp_ts( CRSP_TIMESERIES *crsp_timser_ptr1, CRSP_TIMESERIES

*crsp_timser_ptr2 )

Description: Compares two time series objects, by comparing array types, data characteristics, array sizes, and associated calArguments: CRSP_TIMESERIES *crsp_timser_ptr1

CRSP_TIMESERIES *crsp_timser_ptr2 Return Values: CRSP_SUCCESS: if comparison is correct

1261: the two CRSP_TIMESERIES have different object types1262: the two CRSP_TIMESERIES have different array types1263: the two CRSP_TIMESERIES have different subtypes1264: the two CRSP_TIMESERIES have different array widths1265: the two CRSP_TIMESERIES have different maximum arrays1266: the two CRSP_TIMESERIES have different calendar types1267: the two CRSP_TIMESERIES calendar pointers do not compare

Prototype: int crsp_obj_comp_arr( CRSP_ARRAY *crsp_array_ptr1, CRSP_ARRAY *crsp_array_ptr2)

Description: Compares two CRSP_ARRAY objects, by comparing data array type, size, and data characteristicsArguments: CRSP_ARRAY *crsp_array_ptr1

CRSP_ARRAY *crsp_array_ptr2Return Values: CRSP_SUCCESS if array objects match

1277: the two CRSP_ARRAYs have different object types1278: the two CRSP_ARRAYs have different array types1279: the two CRSP_ARRAYs have different subtypes1280: the two CRSP_ARRAYs have different array widths1281: the two CRSP_ARRAYs have different maximum array types

Prototype: int crsp_obj_comp_row( CRSP_ROW *crsp_row_ptr1, CRSP_ROW *crsp_row_ptr2 )

Description: Compares two CRSP row objects, by comparing data array type and data characteristicsArguments: CRSP_ROW *crsp_row_ptr1

CRSP_ROW *crsp_row_ptr2 Return Values: CRSP_SUCCESS: if row objects match

1286: The two CRSP_ROW objects have different object types1287: The two CRSP_ROW objects have different array types1288: The two CRSP_ROW objects have different subtypes1289: The two CRSP_ROW objects have different array widths

81

Page 98: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

crsp_obj_free_ts Frees a CRSP Time Series Object

crsp_obj_free_arr Frees a CRSP Array Object

crsp_obj_free_row Frees a CRSP Row Object

Prototype: int crsp_obj_free_ts (CRSP_TIMESERIES **crsp timeser_ptr, int free_flag)

Description: Frees a CRSP time series object by deallocating memory for just the data array or the entire objectArguments: CRSP_TIMESERIES **crsp_timser_ptr – points to a CRSP time series object pointer

int free_flag - frees only the arr part or all. Valid values to be freed are: CRSP_FREE_ARR_ONLY, orCRSP_FREE_OBJ_ALL

Return Values: CRSP_SUCCESS - if free is successfulCRSP_FAIL – if error freeing memory or bad pointer or flag

Side Effects: Frees part or whole of the CRSP_TIMESERIES, depending on the free_flag set.

Prototype: int crsp_obj_free_arr(CRSP_ARRAY **crsp_array_ptr, int free_flag)

Description: Frees a CRSP array object by deallocating memory for just the data array or the entire objectArguments: CRSP_ARRAY **crsp_array_ptr – pointer to a CRSP array object pointer

int free_flag - free only the arr part or all valid values to be freed are: CRSP_FREE_ARR_ONLY, orCRSP_FREE_OBJ_ALL

Return Values: CRSP_SUCCESS: if free is successfulCRSP_FAIL: if error freeing memory or bad pointer or flag

Side Effects: Frees part or whole of the CRSP_ARRAY, depending on the free_flag set.

Prototype: int crsp_obj_free_row( CRSP_ROW **crsp_row_ptr, int free_flag )

Description: Frees a CRSP row object by deallocating memory for just the data array or the entire objectArguments: CRSP_ROW **crsp_row_ptr – pointer to a CRSP row object pointer

int free_flag - frees only the arr part or all valid values to be freed are: CRSP_FREE_ARR_ONLY, orCRSP_FREE_OBJ_ALL

Return Values: CRSP_SUCCESS: if free is successfulCRSP_FAIL: if error freeing memory or bad pointer or flag

Side Effects: Frees part or whole of the CRSP_ROW, depending on the free_flag set.

82

Page 99: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

s and

String Utilities

These functions can be used to manipulate strings.

crsp_util_convtype Converts CRSP Constant Names to Integers

crsp_util_lowercase Converts Strings to All Lowercase Letters

crsp_util_strtrim Removes Trailing Blanks From Strings

crsp_util_uppercase Converts Strings to All Uppercase Letters

crsp_util_squeeze Removes White Space from Character Strings

Function Descriptioncrsp_util_convtype Converts CRSP Constant Names to Integerscrsp_util_lowercase Converts Strings to All Lowercase Letterscrsp_util_strtrim Removes Trailing Blanks from Stringscrsp_util_uppercase Converts Strings to All Uppercase Letterscrsp_util_squeeze Removes White Space from Character Strings

Prototype: int crsp_util_convtype (char *typestring)

Description:converts a CRSP constant name string to an integer. All CRSP #defined _NUM constants defined in crsp_const.h

are supported.Arguments: char * typename – string to convert

Return Values: integer code foundCRSP_FAIL: if string not supported

Side Effects: nonePreconditions: none

Prototype: void crsp_util_lowercase (char *string)

Description: converts a string to all lowercase letters.Arguments: char *string – string to convert

Return Values: noneSide Effects: string may be changed. If the string is a string of spaces, the routine leaves one leading space.

Preconditions: string must be a null-terminated character string

Prototype: void crsp_util_strtrim (char *string)

Description: converts a string by moving the string termination to after the last nonblank character.Arguments: char *string – string to convert

Return Values: noneSide Effects: string may be changed

Preconditions: string must be a null-terminated character string

Prototype: void crsp_util_uppercase (char *string)

Description: converts a string to all uppercase letters.Arguments: char *string – string to convert

Return Values: noneSide Effects: string may be changed

Preconditions: string must be a null-terminated character string

Prototype: int crsp_stk_clear (CRSP_STK_STRUCT *stk, int clearflag)

Description:converts a string by removeing white space. All leading and trailing tabs or spaces are removed, and multiple tabspaces are replaced with a single space.

Arguments: char *string – string to convertint clearflag – constant identifying the level of clearing. Supported values are:

CRSP_CLEAR_INIT – only reset numCRSP_CLEAR_ALL – set num to 0 and set missing values for all elements in the object arraysCRSP_CLEAR_RANGE – set missing values for all array elements between 0 and num-1CRSP_CLEAR_SET – set ranges in the maxarr-1’th element of the CRSP_ARRAY to missing values specific tothe array type.

Return Values: noneSide Effects: string may be changed

Preconditions: string must be a null-terminated character stringArguments: CRSP_STK_STRUCT *stk – pointer to a stock structure pointer to be cleared

83

Page 100: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

C Structure Generic Clear Functions

These functions are used to load missing data to CRSPAccess97 object structures. CRSPAccess97 access functionsmay be used when the set type is not known ahead of time.

crsp_util_clear_arr Load Missing Values to an Array

crsp_util_clear_elem Load Missing Values to One Array Element or Structure

crsp_util_clear_row Load Missing Values to a Row

Function Description

crsp_util_clear_arr Sets a CRSP_ARRAY to missing valuescrsp_util_clear_elem Sets one structure to missing valuescrsp_util_clear_row Sets a CRSP_ROW to missing valuescrsp_util_clear_ts Sets a CRSP_TIMESERIES to missing values

Prototype: int crsp_util_clear_arr (CRSP_ARRAY *arr, int clearflag)

Description: Loads missing values into a CRSP_ARRAY on a range level or array level.Arguments: CRSP_ARRAY *arr – pointer to a array to be cleared.

int clearflag – constant identifying the level of clearing. Supported values are:CRSP_CLEAR_INIT – only reset numCRSP_CLEAR_ALL – set num to 0 and set missing values for all elements in the object arraysCRSP_CLEAR_RANGE – set missing values for all array elements between 0 and num-1CRSP_CLEAR_SET – set ranges in the maxarr-1’th element of the CRSP_ARRAY to missing values specific tothe array type.

Return Values: CRSP_SUCCESS: if successCRSP_FAIL: if bad parameters

Side Effects:

The array pointer has all allocated fields initialized according to the clearflag. If clearflag isCRSP_CLEAR_RANGE or CRSP_CLEAR_ALL, a missing value is copied to the maxarr-1’th element of theCRSP_ARRAY.

Preconditions: The array pointer must be NULL or initialized with a valid arrtype and subtype.

Prototype: int crsp_util_clear_elem (void *elem, int arrtype, int subtype)

Description: Loads missing values into one structure identified by array type and subtype.Arguments: void *elem – pointer to structure to be loaded with missing values

int arrtype – integer code identifying the structure or simple data type of the elementint subtype – integer code identifying the subcategory of data loaded in the element

Return Values: CRSP_SUCCESS: if successCRSP_FAIL: if bad parameters or unknown arrtype or subtype

Side Effects: The proper missing values are loaded to the elementPreconditions: arrtype and subtype must be valid

Prototype: int crsp_util_clear_row (CRSP_ROW *row, int clearflag)

Description: Loads missing values into a CRSP_ROWArguments: CRSP_ROW *row – pointer to a row to be cleared.

int clearflag – constant identifying the level of clearing. Supported values are:CRSP_CLEAR_INIT – no changeCRSP_CLEAR_ALL – set missing values for the array elementCRSP_CLEAR_RANGE – set missing values for the array elementCRSP_CLEAR_SET – set missing values for the array element

Return Values: CRSP_SUCCESS: if successfully cleared or NULL row or row arrayCRSP_FAIL: if unknown arrtype or subtype

Side Effects: The row pointer has all allocated fields initialized according to the clearflagPreconditions: The row pointer must be NULL or initialized with a valid arrtype and subtype.

84

Page 101: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

crsp_util_clear_ts Load Missing Values to a Time Series

Prototype: int crsp_util_clear_ts (CRSP_TIMESERIES *ts, int clearflag)

Description: Loads missing values into a time series on a range level or array level.Arguments: CRSP_TIMESERIES *ts – pointer to a time series to be loaded.

int clearflag – constant identifying the level of clearing. Supported values are:CRSP_CLEAR_INIT – only reset beg and end for CRSP_TIMESERIESCRSP_CLEAR_ALL – set ranges to missing and set missing values for all elements in the time seriesCRSP_CLEAR_RANGE – set missing values to all array elements between beg and endCRSP_CLEAR_SET – set the 0’th element of a CRSP_TIMESERIES array to missing values specific to the arraytype.

Return Values: CRSP_SUCCESS: if successfully cleared or time series pointer is NULLCRSP_FAIL: if bad parameters or unknown arrtype or subtype

Side Effects:The time series pointer has all allocated fields initialized according to the clearflag. If clearflag is

CRSP_CLEAR_RANGE, the 0’th element of the time series array is set to a missing value.Preconditions: The time series pointer must be NULL or initialized with a valid arrtype and subtype.

85

Page 102: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

DD

DD

CRSPAccess97 C Database Information Function

This function is used to retrieve information about a database.

Database Information Function

crsp_root_info_get Load CRSPAccess97 Database Information

Function Description

crsp_root_info_get Load database information from a database

Prototype: int crsp_root_info_get (int crspnum, CRSP_ROOT_INFO *info)

Description:Loads database information from a CRSPAccess97 database into a structure. CRSP_ROOT_INFO is defined in

crsp_objects.h. The following fields are available:crt_date – 25-character string containing the time the database was created, in the format “Dow Mon HH:MM:SS YYYY”mod_date – 25-character string containing the time the database was last modified, in the format “Dow MonHH:MM:SS YYYY”cut_date – 25-character string containing the last date of data in the database, currently loaded as YYYYMMbinary_type – L if IEEE Little-Endian, and B if IEEE Big-Endiancode version – 19-character string containing the CRSPAccess97 version used to create the databaseproduct_code – 11-character CRSP Product Codeproduct_name – 47-character Product name of the databaseversion – integer version number of the databasesettypes – an array of up to eight integer settypes available in the databasesetids – an array of up to eight integer setids available in the databasesetnames – an array of up to eight names of the sets in the databasenumsets – the number of data sets in the databasecalids – an array of up to eight integer calids of calendars available in the databasecalavail – an array of up to eight integer caltypes of the calendars available in the databasecalnames – an array of up to eight names of the calendars in the databasenumcals – the number of calendars in the database

Arguments: int crspnum – database identifier returned by a CRSPAccess97 database open functionCRSP_ROOT_INFO *info – structure that will be loaded with database information

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if database is not open or error loading information structure

Side Effects: nonePreconditions: the database must be opened with one of the CRSPAccess97 open functions.

86

Page 103: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

Data Utility Functions

The CRSP library contains several groups of data functions described in the following table. Subsections in thissection contain the descriptions of the individual functions within each of the function groups.

Data Utility Function GroupsFunctions Group Description

Adjust Functions Functions to Adjust Prices or Other DataExcess Returns Functions Functions to Make Excess Returns CalculationsName Mapping Functions Functions to Map Name History Fields to Time SeriesNasdaq Information Mapping Functions Functions to Map Nasdaq Information Elements to Time SeriesReturns Functions Functions to Calculate ReturnsShares Functions Functions to Manipulate Shares DataStock Print Functions Functions to Print Specialized Stock DataTranslation Functions Functions to Translate Data to New Time Series

87

Page 104: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

Adjust Functions

These functions adjust prices, dividends, volumes, and shares for splits or other price factors.

crsp_adj_load Builds a Price Adjustment Structure Array

crsp_adj_map_ts Adjusts a Source CRSP_TIMESERIES According to an Adjustment Array

Function Description

crsp_adj_load Builds a Price Adjustment Structure Arraycrsp_adj_map_ts Adjusts a Source CRSP_TIMESERIES According to an Adjustment Arraycrsp_adj_map_arr Adjusts a Source CRSP_ARRAY According to an Adjustment Arraycrsp_adj_stk Adjusts all relevant fields in a Source Stock Structure

Prototype:int crsp_adj_load (CRSP_STK_STRUCT *stk, CRSP_ARRAY *adj_arr, int adjdt, int

factyp, int gapflg, int knowexch)

Description: loads a crsp_array of crsp_adj_struct structures with cumulative adjustment factors and effective dates.Arguments: CRSP_STK_STRUCT *stk - stk structure with at least events and prices loaded.

CRSP_ARRAY *adj_arr - adj array that will be loaded. It must exist with enough space to store completed array ofadjustment events

int adjdt - base anchor date guaranteed to have 1.0 factorint factyp - code of adjustment type: 0 = stock splits and dividends only 1 = all dists with facpr int gapflg -

0 carry adjustments over a gap/1 adjustments stop when trading on unknown exchange

int knowexch - unused, always set to 0.Return Values: CRSP_SUCCESS: if adjustment structure successfully loaded

CRSP_FAIL: if error in parameters or structuresSide Effects: The adj_arr will be loaded. The subtype in the adj_arr is set to the adjust base date.

Preconditions:It is assumed that events and prices have been loaded. The adj_arr must have arrtype

CRSP_ADJ_STRUCT_NUM

Prototype:int crsp_adj_map_ts(CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, CRSP_ARRAY

*adj_arr, int begrng, int endrng, int endflg, int direct)

Description:adjusts a source time series according to an adjust array and put the results in a target time series. The adjust array must

exist.Arguments: CRSP_TIMESERIES *src_ts – pointer to source time series, already loaded by crsp_adj_load

CRSP_TIMESERIES *trg_ts – pointer to preexisting target time series, emptyCRSP_ARRAY *adj_arr – pointer to adjustment array already loadedint begrng - begin date index of the date range adjustmentint endrng - end date index of the date range adjustmentint endflg - deterines whether adjustments can be made after the last day of prices. If set to 1 missing values are set

for the range after the end date. Otherwise the last adjustment value is usedint direct - direction flag multiply or divide with adj factor 1 = multiply with adjustment factor -1 = divide with

adjustment factorReturn Values: CRSP_SUCCESS: (integer) if successfully adjusted

CRSP_FAIL: if error in parameters or adjustment

Side Effects:The target time series is loaded with the adjusted data items from the source time series, for the date range specified by

begrng and endrng .

Preconditions:

The src_ts, trg_ts and adj_arr must exist. src_ts and trg_ts must have the same arrtype andsubtype and the same calendar. The src_ts subtype can not be any of these: CRSP_RETURN_NUM orCRSP_PRICE_ADJ_NUM or CRSP_VOLUME_ADJ_NUM. The wanted date range must be subset of the data daterange and the adj_arr date range.

88

Page 105: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

es are

crsp_adj_map_arr Adjusts a Source CRSP_ARRAY According to an Adjustment Array

crsp_adj_stk Adjusts All Relevant Fields in a Source Stock Structure

Prototype:int crsp_adj_map_arr (CRSP_ARRAY *src_arr, CRSP_ARRAY *trg_arr, CRSP_ARRAY

*adj_arr, int begdt, int enddt, int endflg, int direct)

Description:adjusts a source CRSP_ARRAY according to an adjust array and put the results in a target CRSP_ARRAY. The adjust

array must exist.Arguments: CRSP_ARRAY *src_arr – pointer to source time series, already loaded

CRSP_ARRAY *trg_arr – pointer to preexisting target time series, emptyCRSP_ARRAY *adj_arr – pointer to adjustment array already loaded by crsp_adj_loadint begdt– begin date index of the date range adjustmentint enddt – end date index of the date range adjustmentint endflg – determines whether adjustments can be made after the last day of prices. If set to 1 missing valu

set for the range after the end date. Otherwise the last adjustment value is used.int direct – direction flag multiply or divide with adj factor 1 = multiply by adjustment factor (prices)-1 =

divide by adjustment factor (shares and values)Return Values: CRSP_SUCCESS: if successfully adjusted

CRSP_FAIL: if error in parameters or adjustment

Side Effects:The target CRSP_ARRAY is loaded with the adjusted data items from the source CRSP_ARRAY, for the date range

specified by begdt and enddt .

Preconditions:

The src_arr, trg_arr and adj_arr must exist. src_arr and trg_arr must have the same arrtype and subtype. The src_arr subtype can not be any of these: CRSP_SHARES_ADJ_NUM or CRSP_DISTS_ADJ_NUM or CRSP_DELIST_ADJ_NUM. The wanted date range must be a subset of the data date range and the adj_arr date range.

Prototype:int crsp_adj_stk(CRSP_STK_STRUCT *src_stk, CRSP_STK_STRUCT *trg_stk, int adjdt,

int factyp, int gapflg, int endflg, int knownexch)

Description:adjusts a source stk structure according to an adjust array and put the results in a target stk structure. The adjust array

is initialized and loaded inside this function.Arguments: CRSP_STK_STRUCT *src_stk - pointer to source stk structure

CRSP_STK_STRUCT *trg_stk - pointer to target stk structureint adjdt - base anchor adjustment date guaranteed to have 1.0 factorint factyp - code of adjustment type: 0 = stock splits and dividends only 1 = all dists with facprint gapflg - take into account gaps in the date range or not (values: 1,0) and set the adjfac accordingly if adjdt

< begin of gap and gapflg is set then zero out all adjfac after the gap if adjdt > end of gap and gapflg is set then zero out all adjfac before the gap if adjdt between the gap and gapflg is set then zero out all adjfac

int endflg - deterines whether adjustments can be made after the last day of prices. If set to 1 missing values are set for the range after the end date. Otherwise the last adjustment value is used.

int knownexch - unused. Always set to 0, no restrictionReturn Values: CRSP_SUCCESS: if successfully adjusted

CRSP_FAIL: if error in parameters or adjustment

Side Effects:

The target stk struct is loaded with the adjusted data items from the source stk struct . The subtypes of objects loaded with adjusted data are changed to reflect the adjusted data. See crsp_const for *_NUM subtype constants.

Preconditions:

The source src_stk must be already loaded with all the modules wanted to be adjusted. The target trg_stk must already be initialized. Use crsp_stk_open to initialize a new structure. If an object subtype indicates adjusted data is already loaded, no adjustment will be made. Use the crsp_stk_clear function to reset stock structures to unadjusted subtypes.

89

Page 106: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

xcess

passedtside of

time

(see

Excess Return Functions

CRSP excess returns compare two returns time series, and produce a series of returns with the amounts a source timeseries is in excess of a base time series.

Stock Excess Return Functions

crsp_xs_calc CRSP Stock Excess Returns Calculation

crsp_xs_port Builds Portfolio Returns into One Series

Function Description

crsp_xs_calc Builds Excess Returns Time Seriescrsp_xs_port Builds Associated Portfolio Returns into a Single Time Series for Excess Returns

Prototype:int crsp_xs_calc (CRSP_TIMESERIES *bas_ts, CRSP_TIMESERIES *ind_ts,

CRSP_TIMESERIES *trg_ts, int beg, int end, int missflag)

Description:general CRSP stock excess returns calculation given a base return series, a reference return series, and a date range,

loads excess returns for each date in the series.Arguments: CRSP_TIMESERIES *bas_ts - time series of issue returns

CRSP_TIMESERIES *ind_ts - time series of index returnsCRSP_TIMESERIES *trg_ts – target output of excess returnsint beg, end - index range to calculate excess returnsint missflag - flag for handling missing returns

CRSP_KEEP - base missing returns are copied to target, index returns are compounded over gapCRSP_SMOOTH – first return after gap is geometrically averaged so entire gap has the same amountCRSP_IGNORE – missing returns are treated as 0's missing returns in index always generate a missing ereturn

It is assumed that targ, base, and ind have been allocated and have the same calendar. 0 < start <=end < maxarr must be true for each time series

Return Values: CRSP_SUCCESS: if returns successfully loadedCRSP_FAIL: if error in parameters or structures

Side Effects:

The target time series object is loaded with excess returns data. The range is set to the min of current beg andstart, and max of current end and passed end. Any excess returns already loaded are kept only if they are oustart/end. If there is a gap between existing range and new range the returns are loaded with missing values.

Preconditions: The subtype of bas_st and ind_ts is CRSP_RETURN_NUMThe subtype of trg_ts is CRSP_RETURNS_XS_NUM or CRSP_RETURN_CUM_NUM

Prototype:int crsp_xs_port (CRSP_TIMESERIES **ind_ts, int indtypes, CRSP_TIMESERIES

port_ts, int porttype, CRSP_TIMESERIES *trg_ts)

Description:builds a time series of index returns by mapping from an array of index returns time series based on a portfolio

seriesArguments: CRSP_TIMESERIES **ind_ts - pointer to indices returns time series

int indtypes - total number of indices typesCRSP_TIMESERIES **port_ts – time series array of portfolio assignmentsint porttype - portfolio type index of interestCRSP_TIMESERIES *trg_ts – target index based on portfoliotarg, and all indices must be allocated and have the same calendar

Return Values: CRSP_SUCCESS: if returns successfully loadedCRSP_FAIL: if error in parameters or structures

Side Effects: The targ time series object is loaded with index data by mapping to an index based on a portfolio time series.

Preconditions:The target time series and all the indices time series must exist prior calling the function and must all verify

crsp_obj_verify_ts) and have the same calendar

90

Page 107: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

lso a

re

lso a

Name Mapping Functions

These functions map elements in the names event array to time series.

C Stock Names Array Functions

crsp_map_shrcd Map Share Codes to a Time Series

crsp_map_exchcd Map Exchange Codes to a Time Series

Function Description

crsp_map_shrcd Map name history share codes to a time seriescrsp_map_exchcd Map name history exchange codes to a time seriescrsp_map_siccd Map name history siccd codes to a time seriescrsp_map_ncusip Map name history name CUSIPs to a time seriescrsp_map_ticker Map name history tickers to a time seriescrsp_map_comnam Map name history company names to a time seriescrsp_map_shrcls Map name history share classes to a time series

Prototype: int crsp_map_shrcd (CRSP_ARRAY *names_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source crsp_array by copying the share type code of the stock events names

structure over each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *names_arr - source CRSP_ARRAY stock events names histories

CRSP_TIMESERIES *trg_ts – target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events name histories data. The target time series must exist. A

calendar must be associated with the target time seriesThe names_arr arrtype must be CRSP_STK_NAME_NUM The trg_ts subtype must be CRSP_SUB_SHRCD_NUM

Prototype: int crsp_map_exchcd (CRSP_ARRAY *names_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the exchange code of the stock events names structu

over each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *names_arr - source CRSP_ARRAY stock events names histories

CRSP_TIMESERIES *trg_ts - target time seriesint flags – flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events name histories data. The target time series must exist. A

calendar must be associated with the target time series.The names_arr arrtype must be CRSP_STK_NAME_NUM The trg_ts arrtype must be CRSP_SUB_INTEGER_NUM and subtype must be

CRSP_SUB_EXCHCD_NUM

91

Page 108: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

lso a

r

crsp_map_siccd Map SIC Codes to a Time Series

crsp_map_ncusip Map CUSIPs to a Time Series

crsp_map_ticker Map Tickers to a Time Series

Prototype: int crsp_map_siccd (CRSP_ARRAY *names_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the SIC code of the stock events names structure over

each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *names_arr - source CRSP_ARRAY stock events names histories

CRSP_TIMESERIES *trg_ts - target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events name histories data. The target time series must exist. Also a

calendar must be associated with the target time seriesThe names_arr arrtype must be CRSP_STK_NAME_NUM The trg_ts subtype must be CRSP_SUB_SICCD_NUM

Prototype: int crsp_map_ncusip (CRSP_ARRAY *names_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the cusip of the stock events names structure over

each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *names_arr – source CRSP_ARRAY stock events name histories

CRSP_TIMESERIES *trg_ts – target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events name histories data. The target time series must exist. A

calendar must be associated with the target time seriesThe names_arr arrtype must be CRSP_STK_NAME_NUM The trg_ts subtype must be CRSP_SUB_NCUSIP_NUM

Prototype: int crsp_map_ticker (CRSP_ARRAY *names_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the ticker of the stock events names structure ove

each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *names_arr – source CRSP_ARRAY stock events name histories

CRSP_TIMESERIES *trg_ts – target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source crsp_arra y must be loaded with stock events name histories data. The target time series must exist. Also a

calendar must be associated with the target time seriesThe names_arr arrtype must be CRSP_STK_NAME_NUM The trg_ts subtype must be CRSP_SUB_TICKER_NUM

92

Page 109: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

lso a

re

crsp_map_comnam Map Company Names to a Time Series

crsp_map_shrcls Map Share Classes to a Time Series

Prototype: int crsp_map_comnam (CRSP_ARRAY *names_arr, CRSP_TIMESERIES *trg_ts, int flag)

Description:loads a target time series from a source CRSP_ARRAY by copying the company name of the stock events names

structure over each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *names_arr - source CRSP_ARRAY stock events names histories

CRSP_TIMESERIES *trg_ts – target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events name histories data. The target time series must exist. A

calendar must be associated with the target time seriesThe names_arr arrtype must be CRSP_STK_NAME_NUM The trg_ts subtype must be CRSP_SUB_COMNAM_NUM

Prototype: int crsp_map_shrcls (CRSP_ARRAY *names_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the shares class of the stock events names structu

over each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *names_arr - source CRSP_ARRAY stock events name histories

CRSP_TIMESERIES *trg_ts - target time seriesint flags – flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events name histories data. The target time series must exist. Also a

calendar must be associated with the target time seriesThe names_arr arrtype must be CRSP_STK_NAME_NUM The trg_ts subtype must be CRSP_SUB_SHRCLS_NUM

93

Page 110: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

must

must

Nasdaq Information Mapping Functions

These functions map data in the Nasdaq Information event arrays to time series.

Nasdaq Information Mapping Functions

crsp_map_trtscd Map Nasdaq Status Codes to a Time Series

crsp_map_nmsind Map Nasdaq National Market Indicator to a Time Series

Function Description

crsp_map_trtscd Map Nasdaq status codes to a time seriescrsp_map_nmsind Map Nasdaq National Market indicator to a time seriescrsp_map_mmcnt Map Nasdaq Market Maker count to a time seriescrsp_map_nsdinx Map Nasdaq index code to a time series

Prototype: int crsp_map_trtscd (CRSP_ARRAY *nasdin_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the nasdaq status code of the stk events nasdin

structure over each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *nasdin_arr – source CRSP_ARRAY stock events nasdin Nasdaq information history

CRSP_TIMESERIES *trg_ts – target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events Nasdaq information histories data. The target time series

exist. Also a calendar must be associated with the target time seriesThe nasdin_arr arrtype must be CRSP_STK_NASDIN_NUM The trg_ts subtype must be CRSP_SUB_TRTSCD_NUM

Prototype: int crsp_map_nmsind (CRSP_ARRAY *nasdin_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the National Market indicator of the stk events

nasdin stucture over each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *nasdin_arr - source CRSP_ARRAY stk events nasdin Nasdaq information history

CRSP_TIMESERIES *trg_ts – target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] = f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events Nasdaq information histories data. The target time series

exist. Also a calendar must be associated with the target time seriesThe nasdin_arr arrtype must be CRSP_STK_NASDIN_NUM The trg_ts subtype must be CRSP_SUB_NMSIND_NUM

94

Page 111: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

must

crsp_map_mmcnt Map Nasdaq Market Maker Count to a Time Series

crsp_map_nsdinx Map Nasdaq index code to a time series

Prototype: int crsp_map_mmcnt (CRSP_ARRAY *nasdin_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the market maker count of the stk events nasdin

structure over each restricted period according to the target calendar file.

Arguments:CRSP_ARRAY *nasdin_arr – source CRSP_ARRAY stk events nasdin Nasdaq information

historyCRSP_TIMESERIES *trg_ts – target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] = f(src[i])

CRSP_NLAST means last data from the source is moved to all periods on targetReturn Values: CRSP_SUCCESS: if successfully loaded

CRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events Nasdaq information histories data. The target time series

exist. Also a calendar must be associated with the target time seriesThe nasdin_arr arrtype must be CRSP_STK_NASDIN_NUM The trg_ts subtype must be CRSP_SUB_MMCNT_NUM

Prototype: int crsp_map_nsdinx (CRSP_ARRAY *nasdin_arr, CRSP_TIMESERIES *trg_ts, int flags)

Description:loads a target time series from a source CRSP_ARRAY by copying the Nasdaq index code of the stk events nasdin

structure over each restricted period according to the target calendar file.Arguments: CRSP_ARRAY *nasdin_arr – source CRSP_ARRAY stk events nasdin Nasdaq information history

CRSP_TIMESERIES *trg_ts target time seriesint flags - flags passed to the function. One of:

CRSP_ACTUAL means data from the source is moved to the same period on target trg[i] = f(src[i])CRSP_EFFECTIVE means data from the source is moved to the next period on target trg[i+1] =f(src[i])CRSP_NLAST means last data from the source is moved to all periods on target

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:Source CRSP_ARRAY must be loaded with stock events Nasdaq information histories data. The target time series must

exist. Also a calendar must be associated with the target time seriesThe nasdin_arr arrtype must be CRSP_STK_NASDIN_NUM The trg_ts subtype must be CRSP_SUB_NSDINX_NUM

95

Page 112: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

e, so returns

f the

Returns Functions

These functions make various CRSP returns calculations.

Returns Functions

crsp_ret_calc Stock Returns Calculations

crsp_ret_calc_del CRSP Delisting Returns Calculations

Function Description

crsp_ret_calc Stock Returns Calculationscrsp_ret_calc_del CRSP Delisting Returns Calculationscrsp_ret_calc_one Returns Calculation for One Returncrsp_ret_off_exch Marks Returns when it is Not Traded on the Exchangecrsp_ret_ordinary Determines if a Distribution Is Considered Ordinarycrsp_ret_payments Calculates Price Factor and Cash Dividend Amounts

Prototype:

int crsp_ret_calc (CRSP_STK_STRUCT *stk, CRSP_TIMESERIES *p1, CRSP_TIMESERIES *p2, CRSP_TIMESERIES *r, CRSP_TIMESERIES *rn, int start, int end, int gapwindow, int validexch)

Description:general CRSP stock returns calculations, with and without dividends, allowing one or two price series for before/after,

options on gap limits before considered missing, and valid exchanges.Arguments: CRSP_STK_STRUCT *stk - stock structure with names, distributions, and price data loaded

CRSP_TIMESERIES *p1 - time series of primary pricesCRSP_TIMESERIES *p2 - time series of secondary prices (NULL if unused)CRSP_TIMESERIES *r – time series to load total returnsCRSP_TIMESERIES *rn - time series to load returns without dividendsint start, end - index range to calculate returnsint gapwindow – gap in periods before considered missing, use 0 for default (10 periods)int validexch - binary code for valid exchange codes 1=nyse, 2=amex, 4=nasd, 0 = no restriction

Return Values: CRSP_SUCCESS: if returns successfully loadedCRSP_FAIL: if error in parameters or structures

Side Effects:

Return time series objects are loaded with returns data. The subtype of these time series will be set to CRSP_RETURN_NUM. The beg and end ranges will be set according to start and end parameters and price rangall previous returns ranges and data loaded will be erased. If start and end are outside of price ranges missing will be generated for the range outside of prices.

Preconditions:It is assumed that r and rn have been allocated and have the same calendar as the price time series. One can be NULL if

that type is not wanted. Prices, names and distribution histories must be loaded.

Prototype:int crsp_ret_calc_del (CRSP_STK_STRUCT *stk, float *delret, float *delretx,

float *effnewprc, int *effnewdt, int gapwindow, int crspnum, int setnum)

Description:

CRSP stock delisting returns calculations. The delisting return is the return between the last price and the value ostock after delisting, either based on the value given for the stock or the price on a new exchange.

returns are calculated with these steps:

1. find if sufficient delisting information exists to calculate a return. if not, use the correct missing value2. find a payment date and payment amount. The amount will either be the dlprc, the sum of final distributions, or the sum of both. The date will be the delist date + 1 period, or the nextdt if one is available.3. calculate a normal CRSP return between endprc and payment date, using lastprc and payment, using all distributions.

Arguments: CRSP_STK_STRUCT *stk - stock structure float *delret – delisting returnfloat *delretx – delisting return without dividendsfloat *effnewprc – value after delistingint *effnewdt – date of value after delistingint gapwindow - gap in periods before considered missingint crspnum, setnum - database and set identifiers to load prices, these can be set to -99 if prices are loaded

Return Values: CRSP_SUCCESS: if returns successfully loaded, CRSP_FAIL: if error in parameters or structures

Side Effects:

A delisting return with dividends is placed in dlret. A delisting return without dividends is placed in dlretx. An effective last payment is placed in effnewprc. The effective date of the last payment is placed in effnewdt. This will load the prices time series if prices are needed and they are not already loaded.

Exception Codes:

Exception codes (in order of precedence): STK_RMISSR - issue still active, STK_RMISSD - no sources to establish value after delist, STK_RMISSG - no acceptable previous price to calculate return, STK_RMISSP - trades on new exchange, but no price available.

96

Page 113: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

y

crsp_ret_calc_one Returns Calculation for One Period

crsp_ret_off_exch Marks Returns when Security is Not Traded on Valid Exchange

crsp_ret_ordinary Determines if a Distribution Is Considered Ordinary

crsp_ret_payments Calculates Price Factor and Cash Dividend Amounts

Prototype:float crsp_ret_calc_one (CRSP_ARRAY *di, float p1, float p2, float *rn, int

start, int end)

Description:General CRSP stock returns calculation for one period given two prices, the dates of the two prices, and a distributions

array. Total return is returned, return without dividends can be loaded by reference.Arguments: CRSP_ARRAY *di - stock distributions structure

float p1 - previous pricefloat p2 - current pricefloat *rn - place to load returns without dividends (NULL if unwanted)int start, end - actual YYYYMMDD dates of p1 and p2int gapwindow - gap in periods before considered missing

Return Values: Total returnCRSP_FAIL: if error in parameters or structures

Side Effects: Returns without Dividends is loaded to rn if not NULL

Prototype:int crsp_ret_off_exch (CRSP_ARRAY *nam, CRSP_TIMESERIES *r1, CRSP_TIMESERIES

*r2, int start, int end, int validexch)

Description:

uses the names history to mark returns from a time period when not on the desired exchange. Returns are marked as off exchange: during the effective range of a name structure that overlaps the returns range when the exchange code of that name structure is:

0 (unknown)1 known but not one of CRSP-supported exchanges (NYSE, AMEX, Nasdaq)2 On one of these valid exchanges but not one part of the validexch binary code

Arguments: CRSP_ARRAY *nam – names array CRSP_TIMESERIES *r1, *r2 – returns and returns without dividendsint start, end – effective range of returns to checkint validexch – binary code of valid exchanges: 1 = NYSE, 2 = AMEX, 4 = Nasdaq, sum for combinations

Return Values: CRSP_SUCCESS:CRSP_FAIL: if bad or missing parameters

Preconditions: The two returns time series must be loaded or set to NULL.

Prototype: int crsp_ret_ordinary (int code, float facpr)

Description:uses the distribution code and price factor to determine whether a distribution is considered ordinary for the purposes of

the returns without dividends calculationArguments: int code - 4-digit CRSP distribution code

float facpr - CRSP distribution price factorReturn Values: 1 if ordinary

0 if non-ordinary2 to use factor

Side Effects: None

Prototype:int crsp_ret_payments (double *t_fp, double *t_odiv, double *t_ndiv, CRSP_ARRAY

*di, int dp, int date)

Description:

calculates price factor & cash dividend amounts for a period using the distribution events array. It is passed adistribution array, a current event, and an ending date of the period. It cumulates information for all distributions inthe period and returns the number of the distribution after the period.

Arguments: double *t_fp – price factor for period*t_odiv – ordinary cash dividends for period*t_ndiv – non-ordinary cash dividends for periodCRSP_ARRAY *di – distributions arrayint dp – current distribution event in arrayint date - ending calendar date of periodthe first three parameters are passed as pointers so they can be loaded with the result values

Return Values: integer: current location in distributions array, this will be the first distribution after date

Side Effects:the parameters t_fp, t_odiv, and t_ndiv are set with period price factor, ordinary amount, and non-ordinar

amountCall Sequence: Assumes exdt, distcd order

97

Page 114: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

crsp_ret_map_payments Maps Adjustment Factors and Payments to a Time Series

Prototype:int crsp_ret_map_payments(CRSP_STK_STRUCT *stk, CRSP_TIMESERIES *fp_ts,

CRSP_TIMESERIES *odiv_ts, CRSP_TIMESERIES *ndiv_ts)

Description: calculate payments over range based on distribution events arrayArguments: CRSP_STK_STRUCT *stk - stock structure with events loaded

CRSP_TIMESERIES *fp_ts - target ts of factor of adjust pricesCRSP_TIMESERIES *odiv_ts - target ts of total ordinary dividend amntCRSP_TIMESERIES *ndiv_ts - target ts of total dividend amountIt is assumed that at least one of the three target is not NULL and if more than one target exist then they have the same

calendarReturn Values: CRSP_SUCCESS: (integer) if returns successfully loaded

CRSP_FAIL: if error in parameters or structures

98

Page 115: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

date events

py

ion date

ge

Shares Outstanding Functions

Shares Outstanding Functions

crsp_shr_imp Converts Raw Shares to Imputed Shares

crsp_shr_num Returns Shares Outstanding on a Given Date

Function Description

crsp_shr_imp Converts Raw Shares to Imputed Sharescrsp_shr_num Returns Shares Outstanding on a Given Datecrsp_shr_map Maps The Imputed Shares Array to a Time Seriescrsp_shr_raw Converts Imputed Shares to Raw Shares

Prototype:int crsp_shr_imp (CRSP_STK_STRUCT *stk, CRSP_ARRAY *impshrs,int uniqflag, int

skipflag, int firstflag)

Description:

general imputed crsp stock shares function: given a standard stock structure with header and events structures, a pre-initialized CRSP_ARRAY is loaded with shares observations, including those imputed from distribution events. CRSPAccess97 stock databases are delivered with imputed shares already loaded.

There are two options: the first is a flag that supports collapsing duplicate events so there is only one share observation on a given date; the second supports screening of certain types of distributions such as rights from affecting the shares outstanding results. This only uses ex-date of distributions.

Arguments: CRSP_STK_STRUCT *stk – source data must have EVENTS loadedCRSP_ARRAY *impshrs - array that will be loaded. It must exist with enough space to store completed array of

share eventsint uniqflag - flag for dates with multiple observations 0 - collapse structure so only the last observation on a

is left in the structure. Raw shares observations take precedence over derived ones 1 - allow multiple shares on the same day. The last will be used by crsp_shr_map

int skipflag - flag for skipping certain types of dists 0 - ignore facshr from rights 1 - use all facshrsint firstflag - flag for creating a dummy first observation 0 - do not create a dummy first observation 1 – co

first share structure up to begdt if available.Return Values: CRSP_SUCCESS: if shares array successfully loaded,

CRSP_FAIL: if error in parameters or structures

Side Effects:

the impshrs array is loaded with imputed shares structures and num is set to the number of shares observations found. shrflg is set with the following conventions: > 0 distribution event # (index-1 into dists array of facshr) 0 raw shares observation -1 implied 1st shrs observation (if dist with facshr precedes all raw shares observations, the first shrflg is –1 and the second > 0. -2 implied leading shares observation, where second is copied forward and shrsdt set to begdt.

Preconditions: The impshrs array must have arrtype CRP_STK_SHARE_NUM and subtype STK_SHARES_IMP

Prototype:int crsp_shr_num (CRSP_STK_STRUCT *stk, int date, int skipflag, CRSP_STK_SHARE

*share)

Description:

returns the shares outstanding on a given date. There is an optional parameter that can return the actual observatof the shares outstanding result. Uses crsp_shr_imp to build a static array of imputed shares. If the permno is the same, the array is not rebuilt.

Arguments: CRSP_STK_STRUCT *stk – source data must have EVENTS and HEADER loadedint date - yymmdd or yyyymmdd date to find shares outint skipflag - flag for skipping certain types of dists 0 - ignore facshr from rights 1 - use all facshrsCRSP_STK_SHARE *shares_obs - shares info of actual observation used if set to null will not be loaded

Return Values: CRSP_SUCCESS: number of shares outstanding effective on date, 0 if no shares structures or date out of data ranCRSP_FAIL if error in parameters or structures

Side Effects: If the fourth parameter is passed, it is loaded with the information from the effective shares event.

99

Page 116: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

crsp_shr_map Maps The Imputed Shares Array to a Time Series

crsp_shr_raw Converts Imputed Shares Into Raw Shares Observations

Prototype:int crsp_shr_map (CRSP_STK_STRUCT *stk, CRSP_TIMESERIES *shr_ts, int begind, int

endind, int skipflag)

Description:maps the imputed shares to a time series. Uses crsp_shr_imp to load an imputed shares events array if necessary,

then maps the observations by finding the effective shares outstanding for each date in the calendar.Arguments: CRSP_STK_STRUCT *stk - stock structure loaded with HEADER and EVENTS

CRSP_TIMESERIES *shr_ts - preinitialized time series that will be loaded. Must have array allocated at least up to endind and calendar set.

int begind, endind - range of indexes into calendar that will be loaded to shrsint skipflag - flag for skipping certain types of dists 0 - ignore facshr from rights 1 - use facshr loaded

with sharesReturn Values: CRSP_SUCCESS: (integer) if shares successfully loaded

CRSP_FAIL: if error in parameters or structures

Side Effects:shrs time series is loaded. Arr is filled with shares outstanding values and beg and end are set. If there are no shares

structures beg and end are set to 0, otherwise they inherit parameters begind and endind.Preconditions: The shr_ts time series must have arrtype CRSP_INTEGER_NUM and subtype CRSP_SHARES_IMP_NUM

Prototype: int crsp_shr_raw (CRSP_ARRAY *shr_arr)

Description:converts imputed shares outstanding events in raw shares. Imputed shares are observations directly derived from CRSP

distribution events. These are removed from the shares outstanding observation array.Arguments: CRSP_ARRAY *shr_arr - CRSP_ARRAY of imputed shares already loaded

Return Values: CRSP_SUCCESS: if shares successfully modifiedCRSP_FAIL: if error in parameters or structures

Side Effects: Imputed shares array is converted to raw shares, from subtype STK_SHARES_IMP to STK_SHARES_RAW

100

Page 117: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

rmat.

stk_print Output Functions

These output functions are used to output specific stock data. Some functions expect values set in the STKFLAGSstructure, defined in the crsp_stk_flags.h header file. C printf format statements are defined in thecrsp_stk_format.h header file. The output functions are used by the stk_print utility and are currently not writtenfor general output functionality.

C stk_print Functions

crsp_stkwrt_dx Writes Daily Data Information With Shares

crsp_stkwrt_dr Writes Return Information

Function Description

crsp_stkwrt_dx Writes Daily Data Information with Sharescrsp_stkwrt_dr Writes Return Informationcrsp_stkwrt_r Writes Return Arrayscrsp_stkwrt_dd Writes Daily Data In Formatted Outputcrsp_stkwrt_ds Writes Daily Data Info with Index Levelscrsp_stkwrt_nms Writes Nasdaq National Market Data in Formatted Outputcrsp_stkwrt_sh Writes a Share Structure into a Filecrsp_stkwrt_hdr1 Writes a Stock Header Structure without a Date Rangecrsp_stkwrt_hdr Writes a Stock Header Structure with Date Rangescrsp_stkwrt_name Writes a Stock Name Structurecrsp_stkwrt_di Writes a Distribution Structurecrsp_stkwrt_de Writes Delisting Informationcrsp_stkwrt_ni Writes Nasdaq Informationcrsp_stkwrt_it Writes A CRSPDB Recordcrsp_stkwrt_dtrstr Sets Up Date Limits for Different Kinds of Stock Datacrsp_stkwrt_portf Writes Formatted Output for Portfolioscrsp_stkwrt_pdate Finds the Earliest and the Latest Dates for Available Datacrsp_stkwrt_popts Writes a Single Time Series Array

Prototype:int crsp_stkwrt_dx (FILE *file, CRSP_STK_STRUCT *dd, CRSP_TIMESERIES *sh_ts, int

format, int beg, int end)

Description: writes daily data information with shares into the file pointer passed as an argument, using given formatArguments: FILE *file – pointer to output file

CRSP_STK_STRUCT *dd - pointer to stock structureCRSP_TIMESERIES *sh_ts - pointer to time series structureint format – wanted formatint beg – wanted beginning date of dataint end – wanted ending date of data

Return Values: CRSP_SUCCESS: if data has been printed okCRSP_FAIL: if error in format or dates or type of calendar

Side Effects: none

Prototype:int crsp_stkwrt_dr (FILE *file, CRSP_STK_STRUCT *dd, int format, int beg, int

end)

Description: writes return information from daily data structure into the file. File pointer passed as an argument, using given foArguments: FILE *file – pointer to output file

CRSP_STK_STRUCT *dd - pointer to stock structureint format - wanted format for outputint beg - wanted beginning date of dataint end - wanted ending date of data

Return Values: CRSP_SUCCESS: if data has been printed okCRSP_FAIL: if error in format or dates or type of calendar

Side Effects: none

101

Page 118: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

crsp_stkwrt_r Writes Return Arrays

crsp_stkwrt_dd Writes Daily Data In Formatted Output

crsp_stkwrt_ds Writes Daily Data with Index Levels

crsp_stkwrt_nms Writes Nasdaq National Market Data in Formatted Output

Prototype: int crsp_stkwrt_r (FILE *file, CRSP_STK_STRUCT *dd, int format, int beg, int end)

Description: writes returns array into file pointer passed as an argument, using given formatArguments: FILE *file - pointer to output file

CRSP_STK_STRUCT *dd - pointer to stock data structureint format - wanted format for output dataint beg - wanted beginning date for output dataint end - wanted ending date for output data

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates or type of calendar

Side Effects: none

Prototype: int crsp_stkwrt_dd (FILE *file, CRSP_STK_STRUCT *dd, int format, int beg,int end)

Description: writes daily data in formatted outputArguments: FILE *file - pointer to output file

CRSP_STK_STRUCT *dd - pointer to the whole stock structureint format - wanted format for outputint beg - wanted beginning date of output dataint end - wanted ending date of output data

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates or type of calendar or array

Side Effects: none

Prototype:int crsp_stkwrt_ds (FILE *file, CRSP_STK_STRUCT *dd, int format, int beg, int

end, float baseamt, int basedt)

Description: writes daily data info with index levels into the file pointer passed as an argument, using given format.Arguments: FILE *file – pointer to output file

CRSP_STK_STRUCT *dd - pointer to stock structureint format – wanted formatint beg - wanted beginning date of dataint end - wanted ending date of datafloat baseamt – base valueint basedt - base date

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates or type of calendar

Side Effects: none

Prototype:int crsp_stkwrt_nms (FILE *file, CRSP_STK_STRUCT *dd, int format, int beg,int

end)

Description: writes nms data in formatted outputArguments: FILE *file – pointer to output file

CRSP_STK_STRUCT *dd - pointer to the whole stock structureint format – wanted format for outputint beg - wanted beginning date of output dataint end - wanted ending date of output data

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates or type of calendar or array

Side Effects: none

102

Page 119: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

crsp_stkwrt_sh Writes a Share Structure into a File

crsp_stkwrt_hdr1 Writes a Stock Header Structure without a Date Range

crsp_stkwrt_hdr Writes a Stock Header Structure With Date Ranges

crsp_stkwrt_name Writes a Stock Name Structure

crsp_stkwrt_di Writes a Distribution Structure

Prototype: int crsp_stkwrt_sh (FILE *file, CRSP_STK_SHARE *sh, int format)

Description: writes share structure into file pointer passed as an argument, using given formatArguments: FILE *file – pointer to output file

CRSP_STK_SHARE *sh - pointer to share structureint format – wanted format for output data

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates

Side Effects: none

Prototype:int crsp_stkwrt_hdr1 (FILE *file, CRSP_CAL_DATERANGE *r, CRSP_STK_STRUCT *h, int

format)

Description: writes stock header structure without date range to given streamArguments: FILE *file – pointer to output file

CRSP_CAL_DATERANGE *r - pointer to calendar dates range structureCRSP_STK_STRUCT *h - pointer to stock structureint format – wanted format for output

Return Values: CRSP_SUCCESS: if data has been printed CRSP_FAIL: if error in format or dates

Side Effects: none

Prototype:int crsp_stkwrt_hdr (FILE *file, CRSP_CAL_DATERANGE *r, CRSP_STK_STRUCT *h, int

format)

Description: writes stock header structure with date ranges to given formatArguments: FILE *file – pointer to output file

CRSP_CAL_DATERANGE *r - pointer to calendar dates range structureCRSP_STK_STRUCT *h - pointer to stock structureint format – wanted format for output

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates

Side Effects: none

Prototype: int crsp_stkwrt_name (FILE *file, CRSP_STK_NAME *n, int format)

Description: writes stock name structure into file pointer passed as an argument, in given formatArguments: FILE *file – pointer to output file

CRSP_STK_NAME *n - pointer to stock name structureint format – wanted format for output

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates

Side Effects: none

Prototype: int crsp_stkwrt_di (FILE *file, CRSP_STK_DIST *di, int format)

Description: writes distribution structure into the file pointer passed as an argument, using the given formatArguments: FILE *file – pointer to output file

CRSP_STK_DIST *di - pointer to distribution structureint format – wanted format for output

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates

Side Effects: none

103

Page 120: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

crsp_stkwrt_de Writes Delisting Information

crsp_stkwrt_ni Writes Nasdaq Information

crsp_stkwrt_it Writes A CRSPDB Record

crsp_stkwrt_dtrstr Sets Up Date Limits For Different Kind of Stock Data

crsp_stkwrt_portf Writes Formatted Output for Portfolios

crsp_stkwrt_pdate Finds the Earliest and the Latest Dates for Available Data

Prototype: int crsp_stkwrt_de (FILE *file, CRSP_STK_DELIST *de, int format)

Description: writes delisting information into the file pointer passed as an argument, using the given formatArguments: FILE *file – pointer to output file

CRSP_STK_DELIST *de - pointer to stock delisting structureint format – wanted format for output

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates

Side Effects: none

Prototype: int crsp_stkwrt_ni (FILE *file, CRSP_STK_NASDIN *n, int format)

Description: writes Nasdaq info into file pointer passed as argument, using given formatArguments: FILE *file – pointer to output file

CRSP_STK_NASDIN *n - pointer to Nasdaq structure to printint format – wanted format for output

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates

Side Effects: none

Prototype: int crsp_stkwrt_it (CRSP_STK_STRUCT *r, struct STKFLAGS *flags)

Description: writes a dbstruct database record according to flags given. The record is modified for writing.Arguments: CRSP_STK_STRUCT *r - pointer to the whole stock structure

struct STKFLAGS *flags - pointer to STKFLAGS structureReturn Values: CRSP_SUCCESS: if data has been printed according to flags

CRSP_FAIL: if an error occurred during writing dataSide Effects: none

Preconditions: There should be no fprintfs in this function. The output should be done in the called functions

Prototype: void crsp_stkwrt_dtrstr (CRSP_STK_STRUCT *r, struct STKFLAGS *f)

Description: sets up date limits for different kind of stock data according to dates/indices in STKFLAGS structureArguments: CRSP_STK_STRUCT *r - pointer to stock structure

struct STKFLAGS *f - pointer to STKFLAGS structureReturn Values: none

Side Effects: beginning and ending dates in some stock structures are set

Prototype:int crsp_stkwrt_portf (FILE *file, CRSP_STK_STRUCT *stk, OPT_PORTFSTRUCT

*portlist, struct STKFLAGS *flags)

Description: writes formatted output for portfoliosArguments: FILE *file - pointer to output file

CRSP_STK_STRUCT *stk - pointer to stock structureOPT_PORTFSTRUCT *portlist - pointer to linked list of desired portfolio structuresstruct_STKFLAGS *flags - pointer to STKFLAGS structure

Return Values: CRSP_FAIL: if error in format occursCRSP_SUCCESS: if everything is ok

Side Effects: none

Prototype:int crsp_stkwrt_pdate (CRSP_STK_STRUCT *stkptr, struct STKFLAGS *flags, int

*start, int *finish)

Description:finds the earliest and the latest dates for available data, compares them to flags.begin_date and

flags.end_date and sets up the limits accordinglyArguments: CRSP_STK_STRUCT *stkptr - pointer to stock structure

struct STKFLAGS *flags - pointer to STKFLAGS structureint *start, *finish - values have to be set

Return Values: CRSP_SUCCESS: (integer) no errorsCRSP_FAIL: (integer) wrong parameters sent

Side Effects: none

104

Page 121: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

crsp_stkwrt_popts Writes a Single Time Series Array

Prototype:int crsp_stkwrt_popts (FILE *file, CRSP_STK_STRUCT *dd, int format, int beg, int

end, struct STKFLAGS *flags)

Description: writes a single time series array into file pointer passed as an argument, using given formatArguments: FILE *file – pointer to output file

CRSP_STK_STRUCT *dd - pointer to stock structureint format – wanted format for outputint beg - wanted beginning date for output dataint end - wanted ending date for output datastruct_STKFLAGS *flags - pinter to STKFLAGS structure

Return Values: CRSP_SUCCESS: if data has been printedCRSP_FAIL: if error in format or dates

Side Effects: none

105

Page 122: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

ningues at

ning es at

Translation Functions

These functions translate stock data in one or more time series to another. The different time series can be based ondifferent calendars.

Translation Functions

crsp_trans_comp_returns Compound Returns From One Time Series to Another

crsp_trans_last Translate Time Series Based on Last Value in Range

Function Description

crsp_trans_comp_returns Compound returns from one time series to anothercrsp_trans_last Translate Time Series Based On Last Value In Rangecrsp_trans_first Translate Time Series Based On First Value In Rangecrsp_trans_max Translate Time Series Based On Maximum Value In Rangecrsp_trans_min Translate Time Series Based On Minimum Value In Rangecrsp_trans_average Translate Time Series Based On Average Value In Rangecrsp_trans_median Translate Time Series Based On Median Value In Rangecrsp_trans_total Translate Time Series Based On Total Value In Rangecrsp_trans_last_closest Translate Time Series Based On Closest Nonmissing Value In Rangecrsp_trans_last_previous Translate Time Series Based On Last Nonmissing Value In Rangecrsp_trans_level Loads A Target Time Series With Index Level Pricescrsp_trans_cumret Loads A Target Time Series With Cumulative Returnscrsp_trans_port Map Portfolio Assignments To A New Time Seriescrsp_trans_stat Map Portfolio Statistics To A New Time Seriescrsp_trans_cap Loads A Target Timeseries With Capitalization Datacrsp_trans_gen_prc General Translation Price Function

Prototype:int crsp_trans_comp_returns(CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts,

int par_flag)

Description:loads a target time series from a source time series by copying the compounded returns over each restricted period

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loaded and space allocatedCRSP_FAIL: if error in parameters or loading process

Preconditions: src_ts and trg_ts arrtype is CRSP_FLOAT_NUM and subtype is CRSP_RETURN_NUM

Prototype:int crsp_trans_last(CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the last price or volume over each restricted period

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valuending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions:If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUM or

CRSP_PRICE_ADJ_NUM or CRSP_LEVEL_NUMIf the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM or

CRSP_VOLUME_ADJ_NUM or CRSP_COUNT_NUMIf the src_ts arrtype is CRSP_DOUBLE_NUM the subtype must be CRSP_WEIGHT_NUM or

CRSP_CAP_NUMThe src_ts and trg_ts must have the same arrtype and subtype

106

Page 123: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

ningues at

period

ningues at

period

ningues at

crsp_trans_first Translate Time Series Based on First Value in Range

crsp_trans_max Translate Time Series Based on Maximum Value in Range

crsp_trans_min Translate Time Series Based on Minimum Value in Range

Prototype:int crsp_trans_first (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the last price or volume over each restricted period

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUMIf the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM The src_ts and trg_ts must have the same arrtype and subtype

Prototype:int crsp_trans_max (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:Loads a target time series from a source time series by copying the maximum price or volume over each restricted

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditons: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUMIf the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM The src_ts and trg_ts must have the same arrtype and subtype

Prototype:int crsp_trans_min (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the minimum price or volume over each restricted

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditons: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUMIf the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM The src_ts and trg_ts must have the same arrtype and subtype

107

Page 124: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

ningues at

period

ningues at

ding to

ningues at

crsp_trans_average Translate Time Series Based on Average Value in Range

crsp_trans_median Translate Time Series Based on Median Value in Range

crsp_trans_total Translate Time Series Based on Total Value in Range

Prototype:int crsp_trans_average (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the average price or volume over each restricted period

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUMIf the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM The src_ts and trg_ts must have the same arrtype and subtype

Prototype:int crsp_trans_median (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the median price or volume over each restricted

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUMIf the src_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM The src_ts and trg_ts must have the same arrtype and subtype

Side Effects: Possible performance hit if large time series

Prototype:int crsp_trans_total (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the total volume over each restricted period accor

the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUMIf the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM or

CRSP_COUNT_NUM or CRSP_VOLUME_ADJ_NUMThe src_ts and trg_ts must have the same arrtype and subtype

108

Page 125: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

ningues at

stricted

d-

ningues at

nd base

crsp_trans_last_closest Translate Time Series Based on Closest Nonmissing Value

crsp_trans_last_previous Translate Time Series Based on Last Nonmissing Value in Range

crsp_trans_level Loads a Target Time Series with Index Levels

Prototype:int crsp_trans_last_closest (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts,

CRSP_ARRAY *dists, int dlylim, int par_flag)

Description:

crsp_trans_last_closest -Loads a target time series from a source time series by copying the closest non-missing to last price (price over each restricted period according to the calendar file). This adjusts the price if thereare any distributions between it and month-end so that the return will be calculated properly. Passed a limit of daysto use before giving up. If ties, preceding data gets precedence. Will not go outside of current or next period. Alsosets the begin and end.

Arguments: CRSP_TIMESERIES *src_ts – source time seriesCRSP_TIMESERIES *trg_ts - target time seriesCRSP_ARRAY *dists - CRSP_ARRAY of distributionsint dlylim - limit of date periods to use before giving upint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Side Effects: Target, price flag, and last date time series are loaded and their ranges are set.Preconditions: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUM

If the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM The src_ts and trg_ts must have the same arrtype and subtype

Prototype:int crsp_trans_last_previous (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts,

CRSP_TIMESERIES *prcflag_ts, CRSP_TIMESERIES *lastdt_ts, int par_flag)

Description:

loads a target time series from a source time series by copying the previous non-missing to last price over each reperiod according to the calendar file. Also loads the prcflag and lastdt time series according to the case andthe date of the non-missing value found. Will not go outside of current period.

Arguments: CRSP_TIMESERIES *src_ts - source time seriesCRSP_TIMESERIES *trg_ts - target time seriesCRSP_TIMESERIES *prcflag_ts - price flag time series. Each period is set to -1 if no non-zero price if a perio

end price is found, and 1 if an earlier price in the period is found. CRSP_TIMESERIES *lastdt_ts - last date time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: If the src_ts arrtype is CRSP_FLOAT_NUM the subtype must be CRSP_PRICE_NUMIf the stc_ts arrtype is CRSP_INTEGER_NUM the subtype must be CRSP_VOLUME_NUM The src_ts and trg_ts must have the same arrtype and subtype

Prototype:int crsp_trans_level (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

basedt, float baseamt)

Description:loads a target time series with index level prices from a source time series with returns based on a base date a

amount for that date.Arguments: CRSP_TIMESERIES *src_ts - source time series of returns

CRSP_TIMESERIES *trg_ts - target time series of pricesint basedt – base date, YYYYMMDD date where levels are anchored. Level on this date is set to baseamt and

other levels are set by successively compounding returns fom the starting point.float baseamt - base amount, if = 0 the baseamt = source on basedt. Target time series will contain baseamt

on basedt.Return Values: CRSP_SUCCESS: if successfully loaded

CRSP_FAIL: if error in parameters or loading processSide Effects: target time series is loaded with levels. Subtype of target is set to CRSP_LEVEL_NUM

Preconditions:src can be same as trg. Normally target subtype must be CRSP_LEVEL_NUM but if src=trg must be

CRSP_RETURN_NUM. src_ts and trg_ts must have the same calendar

109

Page 126: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

period

ningues at

tricted

ningues at

shares

dt

crsp_trans_cumret Loads a Target Timeseries with Cumulative Returns

crsp_trans_port Map Portfolio Asssignments to a New Time Series

crsp_trans_stat Map Portfolio Statistics to a New Time Series

crsp_trans_cap Loads a Target Time Series with Capitalization Data

Prototype:int crsp_trans_cumret (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

basedt)

Description: loads a target time series with cumulative returns from a source time series with level prices based on a base dateArguments: CRSP_TIMESERIES *src_ts - source time series of reruns

CRSP_TIMESERIES *trg_ts - target time series of pricesint basedt – base date

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: If the src_ts arrtype is CRSP_RETURN_NUM the subtype must be CRSP_RETURN_CUM_NUMIf the stc_ts arrtype is CRSP_LEVEL_NUM the subtype must be CRSP_RETURN_NUM

Prototype:int crsp_trans_port (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the last portfolio number over each restricted

according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: The src_ts arrtype is CRSP_STK_PORT_NUM The trg_ts arrtype is CRSP_INTEGER_NUM and the subtype must be CRSP_PORT_PORT_NUM

Prototype:int crsp_trans_stat (CRSP_TIMESERIES *src_ts, CRSP_TIMESERIES *trg_ts, int

par_flag)

Description:loads a target time series from a source time series by copying the last portfolio statistic number over each res

period according to the calendar fileArguments: CRSP_TIMESERIES *src_ts - source time series

CRSP_TIMESERIES *trg_ts - target time seriesint par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: The src_ts arrtype is CRSP_STK_PORT_NUM The trg_ts arrtype is CRSP_DOUBLE_NUM and the subtype must be CRSP_PORT_STAT_NUM

Prototype:int crsp_trans_cap (CRSP_TIMESERIES *prc_ts, CRSP_TIMESERIES *shr_ts,

CRSP_TIMESERIES *cap_ts, int flags)

Description:loads a target time series with capitalization data from two sources time series one with prices and the other with

by multiplying the two values over each period according to the calendar fileArguments: CRSP_TIMESERIES *prc_ts – input prices time series

CRSP_TIMESERIES *shr_ts – input shares time seriesCRSP_TIMESERIES *cap_ts – output capitalization time seriesint flags - flags passed to the function. CRSP_ACTUAL means cap from the source is moved to the same perio

on target cap[i] = prc[i] * shr[i] CRSP_EFFECTIVE means cap from the source is moved to the nexperiod on target cap[i+1] = prc[i] * shr[i]

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

Preconditions: If the prc_ts subtype is CRSP_PRICE_ADJ_NUM the shr_ts subtype must be CRSP_SHARES_ADJ_NUMIf the prc_ts subtype is CRSP_PRICE_NUM the shr_ts subtype must be CRSP_SHARES_IMP_NUMThe cap_ts arrtype is CRSP_DOUBLE_NUM and the subtype is CRSP_CAP_NUMThe prc_ts shr_ts and cap_ts must have the same calendar

110

Page 127: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

ningues at

crsp_trans_gen_prc General Translation Price Function

Prototype:int crsp_trans_gen_prc (CRSP_TIMESERIES *srcprc_ts, CRSP_TIMESERIES *trgprc_ts,

CRSP_STK_STRUCT *stkptr, int case_flag, int adj_flag, int par_flag)

Description:

loads a target time series from a source time series by linking between the two calendars and copying the price values tothe target time series. Uses other translation functions to adjust, use last or last nonmissing price in range. Generaltranslation price function, used for prc, ask, bid, askhi, bidlo, adjprc, adjask, adjbid,adjaskhi, adjbidlo

Arguments: CRSP_TIMESERIES *srcprc_ts - source price time seriesCRSP_TIMESERIES *trgprc_ts - target price time seriesCRSP_STK_STRUCT *stkptr - stock structure pointerint case_flag - last value or previous nonmissing price in range (0 , 1)int adj_flag - adjust or not (1 , 0)int par_flag - determines how missing values affect beg and end of target: 0 – allow missing values at begin

and ending of target range, 1 - not allow missing values at beginning of target range, 2 - not allow missing valending of target range, 3 - not allow missing values at beginning and ending of target range

Return Values: CRSP_SUCCESS: if successfully loadedCRSP_FAIL: if error in parameters or loading process

111

Page 128: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

3.4 C Access on Supported Systems

Windows Systems

CRSP supports Windows NT 4.0 on Intel x86 and Alpha machines, and Windows 95/98 on Intel x86 machines. All Clibraries and sample programs were compiled and tested using the Microsoft Visual C++ 5.0 compiler.

CRSP access depends on environment variables set during installation of CRSPAccess97 Stock and Indicesdatabases. The environment variables can be set in the Control Panel/System menu or in the autoexec.bat file.Environment variables can be used in command prompt windows with the name enclosed in % characters. The setcommand can be used in a command prompt window to show available environment variables. (e.g. >set | find "crsp"). The CRSPAccess97 Installation Guide has information on installing theCRSPAccess97 data and programs.

Important CRSP files and directories have the following names.

%crsp_bin%- folder containing executable sample programs and batch files. This folder should be in the PATH soprograms can be run from any folder.

%crsp_lib%- folder containing CRSP Object Library and Internal Files

%crsp_lib%\crsp_lib.lib- CRSP Object Library

%crsp_include%- location of CRSP C Header Files Referred to by INCLUDE Statements

%crsp_sample%- folder containing CRSP Sample Programs

%crsp_mstk%- folder containing Monthly CRSP Stock and Indices Databases

%crsp_dstk%- folder containing Daily CRSP Stock and Indices Databases

Following is an example of modifying a sample C program using Microsoft Visual C++.

Copy the sample program to a local directory using the Windows Explorer utility or the command prompt copycommand, or use the Developer Studio to open the file and save to a new location with Save As.

Sample programs can be found in the %crsp_sample% directory. The command echo %crsp_sample% can beused to get the explicit directory needed. The explicit paths for %crsp_include% and %crsp_lib% may beneeded to set up projects in the Visual C++ Developers Studio,

To use Visual C++ Developer Studio.

Y Create new project by choosing the New option from the File Menu.

Y Choose Project from New window.

Y Choose Win32 Console Application, from New Project window.

use browse to choose a location which would be the location of your new project. Type the project name inthe Name box. Choose create. This creates a project shell.

Choose Configurations from the Build Menu.

Y Choose Add in the Configurations window. From Add Project Configuration window type a new target name inthe Configuration box. Choose OK, then choose Close to close the Configurations window.

On the project menu, point to Add to project, then click File. Type the path and filename of the sampleprogram in the File Name box and choose OK.

Repeat Step 3, but type the explicit path of %crsp_lib% followed by crsp_lib.lib in the File NameBox instead of the sample program path.

112

Page 129: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

Choose Settings from the Project Menu. In Project Settings window:

Y In C/C++ tab, choose General in the Category box,

* type ",WINNT=2" at the end of line in the Preprocessor definitions box,

* Choose None in Debug Info box,

* Choose Default in Optimizations box.

Y In C/C++ tab, choose Precompiled Headers in the Category box,

* Choose not using precompiled header.

Y In C/C++ tab, choose preprocessor,

* Use explicit path of %crsp_include% to add include directory in the Additional Include directories.

Y In C/C++ tab, choose all other choices, one at a time, disabling default options.

Y In Link tab,

* You can set an executable name as desired in Output File Name box,

* Disable all other options.

Y To Compile, choose Rebuild All from the Build Menu.

Alternately, use the command prompt from your local directory.

The location of C run-time libraries must be set in the LIB environment variable.

Y cl /I %crsp_include% /D WINNT=2 stk_sampl.c %crsp_lib%crsp_lib.lib

Sample programs can also be compiled and linked using the nmake utility. The file c_samp.mak in the%crsp_sample% directory is a description file to maintain the two stock sample programs. To run, copy the file toyour program directory and run the utility with the command

Y nmake c_samp.mak

To run your sample program using arguments from a command prompt,

Y Make sure any input files exist and output files do not already exist. If no arguments it will print a usagemessage.

Y stk_sampl %crsp_mstk% 20 420 1000080 file.out (sample uses monthly data and loadsCRSP NYSE/AMEX/Nasdaq value-weighted with dividends index data, creating a header row for eachpermno in file.out).

See Microsoft documentation on the Visual C++, NMAKE, CL, and LINK commands.

See CRSPAccess97 Installation Guide for installation instructions for Windows NT to load data and configure thesystem to support CRSP data.

113

Page 130: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

Unix Systems

CRSP supports C programming on four different Unix platforms.

CRSP C Supported Unix Systems

CRSP access depends on environment variables set during installation. Environment variables can be used on Unixwith the name preceded by $. All file names and environmental variable names are case-sensitive on Unix systems.The env command can be used in a terminal window to find available environment variables.

Important CRSP files or directories can be found with the following names.

$CRSP_BIN- directory containing Executable Sample Programs and Batch Files. This directory is in the PATH soprograms can be run from any directory.

$CRSP_LIB- directory containing CRSP Object Library and Internal Files

$CRSP_LIB/crsplib.a- CRSP Object Library

$CRSP_INCLUDE- directory containing CRSP header files referred to by #INCLUDE statements

$CRSP_SAMPLE- directory containing CRSP Sample Programs

$CRSP_MSTK- directory containing monthly CRSP Stock and Indices Databases

$CRSP_DSTK- directory containing daily CRSP Stock and Indices Databases

Following is an example of how to modify and to run a sample C program with Sun Solaris andCompaq Tru64 Unix (formerly Digital Unix).

Y cp $CRSP_SAMPLE/stk_sampl.c (copy the program to a local directory)

Y chmod 660 stk_sampl.c . (make sure you have write access to modify the program)

Y vi stk_sampl.c (or other available editor, to insert alternate processing)

Y cc stk_sampl.c -o stk_sampl -DUNIX=l -I$CRSP_INCLUDE $CRSP_LIB/crsplib.a to compile the program.

Y stk_sampl $CRSP_MSTK 20 420 1000080 file.out to run the program (sample uses monthlydata and loads CRSP value-weighted with dividends data, creating a header row for each PERMNO infile.out). Make sure any input files exist and output files do not already exist. If no arguments theprogram will print a usage message.

Following is an example of how to modify and to run a sample C program with HP/UX.Y cp $CRSP_SAMPLE/stk_sampl.c (copy the program to a local directory)

Y chmod 660 stk_sampl.c (make sure you have write access to modify the program)

Y vi stk_sampl.c (or other available editor, to insert alternate processing)

Y gcc stk_sampl.c -o stk_sampl -DUNIX=l -DUNIX2=1 -I$CRSP_INCLUDE $CRSP_LIB/crsplib.a

to compile the program.

Y stk_sampl $CRSP_MSTK 20 420 1000080 file.out to run the program (sample uses monthlydata and loads CRSP value-weighted with dividends data, creating a header row for each PERMNO infile.out). Make sure any input files exist and output files do not already exist. If no arguments theprogram will print a usage message.

Sun Solaris Compaq Tru64 HP/UX AIXCPU Sun Sparc Alpha PARISC PowerPCOperating System Solaris 2.5 Compaq Tru64 HP/UX AIX 4.1.4C Compiler SparcCompiler C 4.0 DEC C 6.0-001 gnu C Compiler cc

114

Page 131: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

Chapter 3. Accessing Data in C

C P

rogramm

ing

Following is an example of how to modify and to compile a sample C program with AIX.Y copy the sample C program to a local directory

chmod 660 ussampl.f (make sure you have write access to modify the program)

Y edit stk_sampl.c (to insert alternate processing)gcc -DUNIX=1 -DUNIX2=1 -I$CRSP_INCLUDE -w -o stk_samp1 stk_samp1.c $CRSP_LIB/crsplib.a -lm

Y to compile and link the program.

Y run the program.

Sample programs can also be compiled and linked using the make utility. The directory $CRSP_SAMPLE containssample make description files for Sun Solaris, Compaq Tru64 (formerly Digital Unix), HP/UX, and AIX namedc_samp.mk, c_samp.mk2, c_samp.mk3, and c_samp.mk4 respectively. To use make, copy the relevantdescription file to your program directory, edit it to support the program(s) of interest and create local executables,and run with the command

Y make -f filename

See installation instructions in the CRSPAccess97 Installation Guide for Unix to load data and configure the systemto support CRSP data.

115

Page 132: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

OpenVMS Systems

CRSP supports OpenVMS on Digital Alpha machines. C Functions were compiled with DEC C 6.0-001.

CRSP access depends on logicals set during installation. The show logical crsp* command can be used in aterminal window to find available CRSP logicals.

Important CRSP files or directories can be found with the following names.

CRSP_BIN: - directory containing Executable Programs and Batch Files

CRSP_LIB: - directory containing CRSP Object Library and Internal Files

CRSP_LIB:libcrsp.olb - CRSP Object Library

CRSP_INCLUDE: - directory containing CRSP header files referred to by INCLUDE Statements

CRSP_SAMPLE: - directory containing CRSP Sample Programs

CRSP_MSTK: - directory containing Monthly CRSP Stock and Indices database

CRSP_DSTK: - directory containing Daily CRSP Stock and Indices database

Following is an example of modifying and running a sample C program with OpenVMS.

Y copy crsp_sample:stk_sampl.c [] (copy the program to a local directory)

Y eve stk_sampl.c (or other available editor, to insert alternate processing)

Y cc stk_sampl /include=CRSP_INCLUDE:/define=(VMS=l)/float=IEEE_FLOAT

Y link stk_sampl + crsp_lib:libcrsp/lib + sys$1ibrary:vaxcrtlt.olb/lib

Y stk_sampl :== "$curdir:stk_sampl.exe" (curdir is full path to location of executable,needed to run the program with arguments)

Y stk_sampl crsp_mstk: 20 420 1000080 file.out (sample arguments use monthly data andloads CRSP Value-Weighted with Dividend Index data, creating a header row for each PERMNO infile.out. A usage message will be displayed if the program is run without arguments.)

Sample programs can also be compiled and linked using the MMS utility. A sample description file, C_SAMP.MMS,exists in the CRSP_SAMPLE: directory. To use the sample description file, copy it to your program directory,modify it to include your program, and run with the command

Y mms / description=c_samp

See installation instructions in the CRSPAccess97 Installation Guide for OpenVMS to load data and configure thesystem to support CRSP data.

116

Page 133: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

INDEX

CCOMPND

description 29COMPNM

description 29crsp.h

description 55crsp_adj_load

description 88crsp_adj_map_arr

description 89crsp_adj_map_ts

description 88crsp_adj_stk

description 89crsp_cal_datecmp

description 75crsp_cal_diffdays

description 76crsp_cal_dt2lin

description 75crsp_cal_dt2parts

description 75crsp_cal_lin2dt

description 75crsp_cal_link

description 76crsp_cal_middt

description 76crsp_cal_search

description 77crsp_cmp_int

description 78crsp_cmp_string

description 78crsp_errprintf

description 69crsp_file_append

description 71crsp_file_close

description 71crsp_file_fopen

description 71crsp_file_getenv

description 72crsp_file_lseek

description 72crsp_file_open

description 72crsp_file_read

description 72

crsp_file_remove

description 72crsp_file_rename

description 73crsp_file_search

description 73crsp_file_stamp

description 73crsp_file_write

description 73crsp_free

description 73crsp_init.h

description 55crsp_map_comnam

description 93crsp_map_exchcd

description 91crsp_map_mmcnt

description 95crsp_map_ncusip

description 92crsp_map_nmsind

description 94crsp_map_nsdinx

description 95crsp_map_shrcd

description 91crsp_map_shrcls

description 93crsp_map_siccd

description 92crsp_map_ticker

description 92crsp_map_trtscd

description 94crsp_obj_comp_arr

description 81crsp_obj_comp_row

description 81crsp_obj_comp_ts

description 81crsp_obj_free_arr

description 82crsp_obj_free_row

description 82crsp_obj_free_ts

description 82crsp_obj_init_arr

description 80crsp_obj_init_row

description 81crsp_obj_init_ts

description 80crsp_obj_verify_arr

description 79crsp_obj_verify_row

description 80crsp_obj_verify_ts 79description 79, 90crsp_ret_calc

description 96crsp_ret_calc_del

description 96crsp_ret_calc_one

description 97crsp_ret_map_payments

description 98crsp_ret_off_exch

description 97crsp_ret_ordinary

description 97crsp_ret_payments

description 97crsp_root_info.c

description 86crsp_root_info_get

description 86crsp_root_info_put

description 86crsp_shr_imp

description 99crsp_shr_map

description 100crsp_shr_num

description 99crsp_shr_raw

description 100crsp_stk_clear 57, 64crsp_stk_close

description 57crsp_stk_free

description 58crsp_stk_init

description 58crsp_stk_open

description 59crsp_stk_read

description 60crsp_stk_read_cus

description 60crsp_stk_read_hcus

description 62crsp_stk_read_permco

description 61crsp_stk_read_siccd

description 62crsp_stkwrt_dd

description 102crsp_stkwrt_de

description 104crsp_stkwrt_di

description 103crsp_stkwrt_dr

description 101crsp_stkwrt_ds

description 102crsp_stkwrt_dtrstr

description 104crsp_stkwrt_dx

description 101crsp_stkwrt_hdr

description 103crsp_stkwrt_hdr1

description 103crsp_stkwrt_it

description 104crsp_stkwrt_name

description 103crsp_stkwrt_ni

description 104crsp_stkwrt_nms

description 102crsp_stkwrt_pdate

description 104crsp_stkwrt_popts

description 105crsp_stkwrt_portf

description 104crsp_stkwrt_r

description 102crsp_stkwrt_sh

description 103crsp_trans_cap

description 110crsp_trans_comp_returns

description 106crsp_trans_cumret

description 110crsp_trans_first

description 107crsp_trans_gen_prc

description 111crsp_trans_last

117

Page 134: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

description 106crsp_trans_last_closest

description 109crsp_trans_last_previous

description 109crsp_trans_level

description 109crsp_trans_max

description 107crsp_trans_port

description 110crsp_trans_stat

description 110crsp_trans_total

description 108crsp_util_cntl_findfield 86crsp_util_convtype

description 79crsp_util_lowercase

description 83crsp_util_trim

description 83crsp_util_uppercase

description 83crsp_xs_calc

description 90crsp_xs_port

description 90CURDIS

description 29CURNAM

description 30CURNDI

description 30CURSHR

description 30CUSP

description 30

DDATEFFLAG

description 36

EEXRDAT

description 30EXRINF

description 31EXRINT

description 31

FFILE

description 22FINDDT

description 31FSTYR

description 36FSTYR2

description 36

GGETNMS(

description 26

IID

description 22INDXDT

description 31INEXT

description 22ISAME

description 23

LLOADBA

description 31LOADHL

description 31

MMAMEXBEG

description 36MESSAGEFLAG

description 36MNASBEG

description 36

NNAMRNG

description 32NASBEG

description 36NDAYS

description 36NEXT

description 22nms.txt

description 19

OOUTDAT (JUNIT

description 34OUTDEL

description 33OUTDIS

description 33OUTHDR

description 33OUTINT

description 34OUTNAM

description 33OUTNDI

description 33OUTNMS

description 33OUTONE

description 34OUTPRM

description 34OUTSHR

description 33OUTSUB

description 34OUTYRZ

description 34

RRETURN

description 25

SSAME

description 23stk_samp1.c

description 54stk_samp2.c

description 54

TTICK

description 32

UUDCAL

description 27udclose

description 21UDDEC

description 27UDOPEN

description 20UMCAL

description 27UMCAP

description 28UMCLOSE

description 21UMCTI

description 28UMDEC

description 28UMOPEN 20us.txt

description 19USDATE

description 32USGETCUS

description 25USGETHIS

description 25USGETPCO

description 25USGETPER

description 25USGETSIC

description 25usparm.txt

description 19USSAMP1.FOR

description 17USSAMP10.FOR

description 18USSAMP2.FOR

description 17USSAMP3.FOR

description 17USSAMP4.FOR

description 17USSAMP5.FOR

description 17USSAMP6.FOR

description 17USSAMP7.FOR

description 17USSAMP8.FOR

description 17USSAMP9.FOR

description 18

118

Page 135: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

INDEX

VVALEXC

description 32

WWANTED

description 23

119

Page 136: CRSPterpconnect.umd.edu/~wermers/ftpsite/fnce7200/cap52599.pdf1925-1998 Center for Research in Security Prices The University of Chicago Graduate School of Business CRSP.com CRSPAccess97

CRSPACCESS97 DATABASE FORMAT - PROGRAMMERS GUIDE

120