Frequently Asked Questions
by Paul von Hippel
Statistician
Department of Sociology and Initiative in Population Research
Ohio State University
Office hours M-F 3-5, Bricker 375A
Social researchers often have questions on the following topics.
Statistics questions
-
Odds ratios:
I'm fitting a logistic regression model, and the coefficients are hard to interpret.
How can I convert the coefficients to odds ratios, and are odds ratios easier to interpret?
-
Initial status and change:
I observe the same variable at two different times. I suspect that the change
in the variable may be related to its initial value. For example, I suspect that a
student's kindergarten achievement score may help to predict the change in that student's
achievement from kindergarten to first grade. Should I regress change on initial status?
-
Non-linear effects:
"My model includes a regressor X (for example age) that is measured
in quantitative units (for example years). Results using this regressor
are confusing. When I break X into artificial categories (20-29, 30-39,
etc.), it seems to affect the response variable. But when I use X as originally
coded, the effect goes away."
-
Missing values (webpage, powerpoint):
"My data set has missing values. For example, I am using X to model
Y but for some cases I don't have a value for X or Y. What should I do
about this?"
-
Multiple imputation (powerpoint):
"Okay, I'm going to use multiple imputation for missing values. Can you tell me more about it?
Can you answer common questions?"
-
Imputing categories:
"I'm using SAS PROC MI (description, documentation) for multiple imputation.
The software assumes that imputed values have a conditional normal distribution,
but some of the variables I'm imputing represent categories. What can I do about this?"
-
Different slopes in different groups:
"I am fitting two or more groups with the same generalized linear model.
For example, I am fitting the same logistic regression model to two samples
-- one from the US, one from Sweden. It looks to me as though the coefficients
for the US are different from those for Sweden. But how can I formally test this?"
-
Normalization:
"I'm using a model that assumes normality but my variable doesn't satisfy this assumption.
What can I do?"
-
Comparing small samples:
"I want to compare two samples, but the samples are small and I'm not
prepared to assume normality."
-
Singularity?!
"I'm using some new procedure (for example, SAS PROC MI for multiple imputation),
and I'm getting an error message indicating that the program can't converge
because of a 'singular covariance matrix' (or a covariance matrix that is 'not positive definite').
What can I do about this?"
-
Collinearity diagnostics and remedies:
"Some of my collinearity diagnostics have large values, or small values, or whatever
they're not supposed to have. Is this bad? If so, what can I do about it?"
-
Complex samples:
"I'm using data from a survey that uses weights, clusters, and/or strata.
How can I account for this, and what are the implications?"
-
Sample weights:
"I'm fitting a regression model to a survey that includes sampling weights.
Should I use the weights in my analysis?"
-
Interactions:
"My model includes X1 and an interaction between X1 and X2.
Do I have to include X2 as well?"
-
Heteroscedasticity:
"I'm fitting an OLS regression model, which assumes that all cases have
equal error variance (homoscedasticity).
I suspect my data may violate this assumption. Is this a problem, and if so, what can
I do about it?"
Software questions
Stata
SAS
-
Learning SAS:
"Do you have any tips on learning SAS?"
-
Merging data sets:
"Where can I find documentation on merging data sets in Stata and SAS?"
-
Using the Output Delivery System (ODS):
"I want to put the results of my analysis into a format that I can fiddle with.
How can I do this in SAS?"
-
Creating a summary data set in SAS:
"I have a data set that provides the characteristics of individuals
from several different countries (or counties or schools...). I want to
generate a new data set that contains the average characteristics of individuals
from each country."
-
Converting from SPSS to SAS:
"I need to get a data set from SPSS into SAS (or Stata to SAS, Excel to SAS, ...)."
-
Using PROC MI and PROC MIANALYZE:
I'm learning SAS PROC MI and SAS PROC MIANALYZE for producing and analyzing
multiply imputed data sets. What are the basics on using these, and what are some
common difficulties and workarounds?
-
Combining estimates from imputed data sets:
I want to use multiple imputation in SAS, but I'd rather not use PROC MIANALYZE.
-
Exporting results from SAS:
I have produced results in SAS, which I would like to paste into a spreadsheet for formatting.
Do I really have to paste the numbers, one by one, from the SAS output window?
Excel