SAS® Macros for Constraining Arrays of Numbers

Charles D. Coleman
Economic Statistical Methods Division
US Census Bureau



Why Constrain Data
• Make estimates consistent with external totals.
• Disclosure avoidance
• Make data sum to data of higher precision
• Presentation
• Reduce data anomalies


Raking: “Multiplicative” Controlling
• One dimension (i.e., individual variables), a.k.a. “scaling”
  – Nonzero data, control of same sign
    • Ordinary rake
      – Multiply by ratio of control to sum.
      – Monotonic: All data move in same direction as control.
      – Preserves order: Order of data in output vector same as in input vector.
      – Preserves ratios: Ratios of pairs of elements of output vector same as in input vector.
      – %RAKE
    • With BY-groups
      – %RAKEBY
  – Data of mixed sign
    • Generalized rake
      – Weighted sum of ordinary rake and difference between control and original sum.
      – Monotonic
      – Preserves order
      – Does not preserve ratios
      – Nonunique solution
      – %GENRAKE
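The ordinary rake above is simple enough to sketch in a few lines. The following is a minimal Python illustration of the arithmetic only, not the %RAKE macro; the function name `rake` and the sample figures are assumptions made for the example.

```python
def rake(data, control):
    """Ordinary one-dimensional rake: scale data so that it sums to control."""
    total = sum(data)
    if total == 0:
        raise ValueError("cannot rake: the data sum to zero")
    factor = control / total  # ratio of control to sum
    return [x * factor for x in data]


# Hypothetical estimates that sum to 100, raked to a control total of 120.
estimates = [10.0, 20.0, 30.0, 40.0]
raked = rake(estimates, 120.0)
print(raked, sum(raked))
```

Because every element is multiplied by the same factor, all elements move in the same direction, their order is unchanged, and the ratio of any two elements is the same before and after.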


Raking: “Multiplicative” Controlling
• Two dimensions (matrices)
  – Nonnegative data, fixed positive marginals (controls)
    • Two-way rake, a.k.a. RAS algorithm, matrix scaling…
      – Alternately multiply columns and rows by the ratio of each control to the corresponding sum
      – Iterate until convergence
      – Always converges when feasible solution exists
      – %RAKE2WAYS
  – Nonnegative data, marginals in intervals
    • Range-RAS algorithm
      – %RRAS
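The two-way rake with fixed marginals can be sketched as follows. This is a minimal Python illustration of the alternating column and row scaling, not the %RAKE2WAYS macro; the function name `rake_two_ways`, the tolerance, the iteration cap, and the sample table are all assumptions made for the example. It does not cover the range-RAS case.

```python
def rake_two_ways(matrix, row_controls, col_controls, tol=1e-9, max_iter=1000):
    """Two-way rake (RAS) of a nonnegative matrix to fixed positive marginals."""
    if abs(sum(row_controls) - sum(col_controls)) > tol:
        raise ValueError("row and column controls must have the same total")
    cells = [list(row) for row in matrix]  # work on a copy
    n_cols = len(col_controls)
    for _ in range(max_iter):
        # Column step: multiply each column by control / column sum.
        for j in range(n_cols):
            col_sum = sum(row[j] for row in cells)
            if col_sum > 0:
                for row in cells:
                    row[j] *= col_controls[j] / col_sum
        # Row step: multiply each row by control / row sum.
        for i, row in enumerate(cells):
            row_sum = sum(row)
            if row_sum > 0:
                cells[i] = [x * row_controls[i] / row_sum for x in row]
        # Rows now match their controls; stop once the columns do too.
        worst = max(
            abs(sum(row[j] for row in cells) - col_controls[j])
            for j in range(n_cols)
        )
        if worst < tol:
            return cells
    raise RuntimeError("no convergence; the controls may be infeasible")


# Hypothetical 2 x 2 table raked to row totals 40, 60 and column totals 35, 65.
table = [[1.0, 2.0], [3.0, 4.0]]
fitted = rake_two_ways(table, row_controls=[40.0, 60.0], col_controls=[35.0, 65.0])
for row in fitted:
    print([round(x, 4) for x in row])
```

The sketch stops with an error after `max_iter` passes rather than looping forever, since convergence is only expected when a feasible solution exists.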


Controlled Rounding
• Round data to preserve original sum.
  – Requires unconventional “wrong-way” rounding.
• Two Dimensions
  – Cox-Ernst Algorithm minimizes total cost of rounding.
    • Sands (2003) NESUG paper covers the method.
    • Macro based on transportation problem example in SAS documentation.
      – Requires SAS 9.2+.
      – %CONTROLROUND
• One Dimension
  – Greatest Mantissa Algorithm
    • Simplification of Cox-Ernst Algorithm
    • Assign output data the integral parts of input data
    • Round up mantissas in decreasing order until sum of output data equals sum of input data
    • %GMROUND
  – With BY-group processing
    • %GMROUNDBY
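The greatest mantissa steps above can be sketched as follows. This is a minimal Python illustration, not the %GMROUND macro; the function name `gm_round` is hypothetical, and the sketch assumes nonnegative data whose sum is an integer, reads "mantissa" as the fractional part, and breaks ties in favor of the earlier element, which is a choice made here rather than anything the slides specify.

```python
import math


def gm_round(data, tol=1e-9):
    """Round data to integers while preserving the (integer) sum of the data."""
    total = round(sum(data))
    if abs(sum(data) - total) > tol:
        raise ValueError("the data must sum to an integer")
    # Step 1: start from the integral parts.
    floors = [math.floor(x) for x in data]
    fractions = [x - f for x, f in zip(data, floors)]
    # Step 2: count how many elements must be rounded up to restore the sum.
    shortfall = total - sum(floors)
    # Step 3: round up the elements with the largest fractional parts.
    by_fraction = sorted(range(len(data)), key=lambda i: fractions[i], reverse=True)
    rounded = list(floors)
    for i in by_fraction[:shortfall]:
        rounded[i] += 1
    return rounded


# Hypothetical data summing to 4. Conventional rounding gives 1 + 1 + 1 = 3.
values = [1.4, 1.4, 1.2]
rounded = gm_round(values)
print(rounded, sum(rounded))
```

In this example, conventional rounding would lose one unit of the total, so one of the 1.4 values is rounded up to 2. That is the "wrong-way" rounding the slide refers to.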


Raking and Rounding in 1 Macro
• One dimension
  – %RAKEANDGMROUND
• One dimension with BY-groups
  – %RAKEANDGMROUNDBY
• Two dimensions
  – %RAKEANDROUND2WAYS
• Two dimensions with BY-groups
  – %RAKEANDROUND2WAYSBY


Files Location
http://sourceforge.net/p/constrainingarrays/code/ci/master/tree/
– Also in paper.