11.2 Why consider non-linear relationships?

The models we consider below will generally use this extension (and a few other calculus tools) to transform our variables and capture specific non-linear relationships. We do this to get at the true relationship between the dependent and independent variables. If the relationship is in fact linear, then none of these more sophisticated models is necessary. If the relationship is not linear, then a linear model is misspecified and will deliver misleading results. The models in this section are therefore used only when the data call for them. Nobody wants to make a model more complicated than it needs to be.

The sole reason for introducing a sophisticated functional form into an otherwise straightforward regression model is that the relationship between a dependent and an independent variable is not linear in the data. Recall that a linear relationship is one where the slope is constant. If the slope is constant (say, \(\beta\)), then a one-unit increase in \(X\) will deliver a \(\beta\)-unit increase in \(Y\) on average, no matter where in the range of \(X\) you are. There are many instances in the real world where this doesn’t make sense. Take, for example, the effect of an apartment's size on its rental price. One can imagine that if an apartment is small (e.g., 50 sq ft), then one might be willing to pay a lot more for a slight increase in size. If an apartment is ridiculously huge (e.g., 5000 sq ft), then one might not be willing to pay anything for an increase in size. This means that the relationship between apartment rental price and size is conceptually non-linear: the slope depends on the actual size of the apartment. However, if your data cover only a small range (e.g., between 300 and 400 sq ft), then you might never need to consider a non-linear relationship, because a linear one does a good job of approximating the relationship that you observe. This chapter deals with situations where you observe a non-linear relationship in your actual data, so it needs to be modeled.
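The point about the range of the data can be checked with a small numerical sketch. The rent function below is purely hypothetical (a logarithmic curve with made-up coefficients, chosen only because its slope shrinks as size grows), and `linear_fit_r2` is an illustrative helper name, not something from the text. The sketch fits a straight line to that curve over a narrow range of sizes and over a wide one, then compares how well the line fits in each case.

```python
import numpy as np

def rent(size_sqft):
    # Hypothetical relationship (an assumption for illustration only):
    # rent rises with size, but at a diminishing rate.
    return 500.0 + 400.0 * np.log(size_sqft)

def linear_fit_r2(x, y):
    """Fit y = a + b*x by least squares; return (slope b, R-squared)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    fitted = intercept + slope * x
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return slope, 1.0 - ss_res / ss_tot

# Narrow range of apartment sizes versus a very wide range.
narrow = np.linspace(300, 400, 200)
wide = np.linspace(50, 5000, 200)

slope_narrow, r2_narrow = linear_fit_r2(narrow, rent(narrow))
slope_wide, r2_wide = linear_fit_r2(wide, rent(wide))

print(f"narrow range: slope = {slope_narrow:.3f}, R^2 = {r2_narrow:.4f}")
print(f"wide range:   slope = {slope_wide:.3f}, R^2 = {r2_wide:.4f}")
```

Under these assumptions the straight line is almost indistinguishable from the curve between 300 and 400 sq ft, while over 50 to 5000 sq ft it fits noticeably worse and reports a much flatter average slope. The underlying curve is the same in both cases; only the observed range differs.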

Once we have established that a relationship is non-linear, we next need to take a stand on what type of non-linear relationship we are attempting to uncover. The answer depends on how you think the slope is going to behave.

Is the slope not constant in unit changes, but constant in percentage changes?
Does the slope qualitatively change? In other words, is there a relationship between a dependent and independent variable that starts out as positive (or negative) and eventually turns negative (or positive)?
Does the slope start out positive or negative and eventually die out (i.e., go to zero)?
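To make the three questions concrete, here is a minimal sketch that pairs each one with a commonly used functional form: a logarithm of \(X\), a quadratic in \(X\), and a reciprocal of \(X\). These pairings and all coefficient values are assumptions made for illustration; they are not necessarily the exact specifications used later in the chapter.

```python
import numpy as np

# Evaluation points for X (arbitrary illustrative values).
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# All coefficients below are hypothetical.
def log_form(x):
    # Y = 2 + 3*ln(X): the effect of a given percentage change in X is constant.
    return 2.0 + 3.0 * np.log(x)

def quadratic_form(x):
    # Y = 1 + 8*X - X^2: the slope starts positive and turns negative.
    return 1.0 + 8.0 * x - x**2

def reciprocal_form(x):
    # Y = 5 - 4/X: the slope is positive but shrinks toward zero.
    return 5.0 - 4.0 / x

# Scenario 1: doubling X (a 100% increase) changes Y by the same amount everywhere.
doubling_effect = log_form(2 * x) - log_form(x)

# Scenario 2: analytic slope of the quadratic, dY/dX = 8 - 2X.
quadratic_slope = 8.0 - 2.0 * x

# Scenario 3: analytic slope of the reciprocal form, dY/dX = 4 / X^2.
reciprocal_slope = 4.0 / x**2

print("effect of doubling X (log form):", doubling_effect)
print("slope of quadratic form:        ", quadratic_slope)
print("slope of reciprocal form:       ", reciprocal_slope)
```

In this sketch the doubling effect is identical at every value of \(X\), the quadratic slope crosses zero at \(X = 4\), and the reciprocal slope falls steadily toward zero as \(X\) grows.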

This chapter details three types of non-linear transformations, each designed to address one of these three scenarios. Non-linear transformations are not one-size-fits-all, so having a good idea of how to handle each type of relationship is essential.