A dummy variable is a numeric variable that represents categorical data, such as gender, race, political affiliation, etc.

Technically, dummy variables are dichotomous, quantitative variables. Their range of values is small; they can take on only two quantitative values. As a practical matter, regression results are easiest to interpret when dummy variables are limited to two specific values, 1 or 0. Typically, 1 represents the presence of a qualitative attribute, and 0 represents the absence.

For illustration, suppose we're interested in political affiliation, a categorical variable that might assume three values: Republican, Democrat, or Independent. We could represent political affiliation with two dummy variables:

X1 = 1, if Republican; X1 = 0, else.

X2 = 1, if Democrat; X2 = 0, else.

In this illustration, notice that we do not have to create a dummy variable to represent the "Independent" category of political affiliation. If X1 equals zero and X2 equals zero, we know the voter is neither Republican nor Democrat. Therefore, the voter must be Independent.
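The coding scheme above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `encode_affiliation` is hypothetical and not part of any library.

```python
# Hypothetical helper: encode political affiliation with two dummy variables.
# "Independent" is the category without a dummy variable of its own.
def encode_affiliation(affiliation):
    x1 = 1 if affiliation == "Republican" else 0  # X1 = 1 if Republican, else 0
    x2 = 1 if affiliation == "Democrat" else 0    # X2 = 1 if Democrat, else 0
    return x1, x2

voters = ["Republican", "Democrat", "Independent"]
encoded = {v: encode_affiliation(v) for v in voters}
for voter, (x1, x2) in encoded.items():
    print(f"{voter}: X1={x1}, X2={x2}")
```

Each of the three categories maps to a distinct pair (X1, X2), so no third variable is needed to tell them apart.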

When a categorical variable can take on k values, only k - 1 dummy variables are needed. A kth dummy variable is redundant; it carries no new information, and it creates a severe multicollinearity problem for the analysis. Using k dummy variables when only k - 1 dummy variables are needed is known as the dummy variable trap. Avoid this trap!
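One way to see the redundancy is to count the linearly independent columns of the design matrix. The sketch below assumes the regression includes an intercept column of ones; the helper names `matrix_rank` and `design_row` and the sample of voters are made up for illustration.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

CATEGORIES = ["Republican", "Democrat", "Independent"]

def design_row(affiliation, n_dummies):
    # Intercept (always 1) followed by dummies for the first n_dummies categories.
    return [1] + [1 if affiliation == c else 0 for c in CATEGORIES[:n_dummies]]

sample = ["Republican", "Democrat", "Independent", "Democrat", "Republican"]
trap = [design_row(a, 3) for a in sample]  # k = 3 dummies: 4 columns
safe = [design_row(a, 2) for a in sample]  # k - 1 = 2 dummies: 3 columns

trap_rank = matrix_rank(trap)
safe_rank = matrix_rank(safe)
print(trap_rank, "independent columns out of", len(trap[0]), "(k dummies)")
print(safe_rank, "independent columns out of", len(safe[0]), "(k - 1 dummies)")
```

With all three dummies, the three dummy columns sum to the intercept column in every row, so the matrix has one fewer independent column than it has columns. That exact linear dependence is the multicollinearity behind the trap.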

For illustration, suppose we wanted to assess the relationship between household income and political affiliation (i.e., Republican, Democrat, or Independent). The regression equation might be

Income = b0 + b1X1 + b2X2

where b0, b1, and b2 are regression coefficients. X1 and X2 are dummy variables defined as

X1 = 1, if Republican; X1 = 0, else.

X2 = 1, if Democrat; X2 = 0, else.

The value of the categorical variable that isn't represented explicitly by a dummy variable is called the reference group. In this illustration, the reference group consists of Independent voters.

In analysis, each dummy variable is compared with the reference group. In this illustration, a positive regression coefficient means that income is higher for the group represented by the dummy variable than for the reference group; a negative regression coefficient means that income is lower. If the regression coefficient is statistically significant, the income difference from the reference group is also statistically significant.
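A small numeric sketch may make the comparison with the reference group concrete. It assumes the model contains only the intercept and the two dummy variables; in that case the least-squares intercept equals the mean income of the reference group, and each dummy coefficient equals the difference between that group's mean and the reference mean. The incomes below are invented purely for illustration, and the sketch does not address statistical significance.

```python
from statistics import mean

# Invented household incomes (in thousands), for illustration only.
data = [
    ("Republican", 62), ("Republican", 58),
    ("Democrat", 51), ("Democrat", 55),
    ("Independent", 50), ("Independent", 54),
]

def group_mean(group):
    return mean([income for aff, income in data if aff == group])

b0 = group_mean("Independent")      # intercept: mean of the reference group
b1 = group_mean("Republican") - b0  # Republican mean minus reference mean
b2 = group_mean("Democrat") - b0    # Democrat mean minus reference mean

def predict(aff):
    x1 = 1 if aff == "Republican" else 0
    x2 = 1 if aff == "Democrat" else 0
    return b0 + b1 * x1 + b2 * x2

# Least-squares check: residuals must sum to zero against each column
# of the design matrix (intercept, X1, X2).
residuals = [(aff, income - predict(aff)) for aff, income in data]
normal_eqs = [
    sum(r for _, r in residuals),
    sum(r for aff, r in residuals if aff == "Republican"),
    sum(r for aff, r in residuals if aff == "Democrat"),
]
print("b0 =", b0, "b1 =", b1, "b2 =", b2)
print("normal-equation residual sums:", normal_eqs)
```

In this made-up sample both b1 and b2 are positive, so both groups have higher mean income than the Independent reference group.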