The demand for coffee in the UK is driven by two variables, the price of coffee and real disposable income, both of which influence consumer behavior. In this paper, I analyze a time series of coffee demand, prices and income levels. The paper consists of two parts. Part A uses two simple linear regression models to describe the relationship between demand and price and between demand and income. Part B evaluates the relationship between all three variables using a multiple regression model. The paper aims to describe the nature of the relationship among the variables and to assess whether the models have explanatory power.
Section (A) Simple Linear Regression Model
Scatter Diagrams
Graph 1: Scatter Diagram showing the relationship between demand and price
From the scatter diagram above, it can be established that the demand for coffee in the U.K. has a negative association with the price; that is, an increase in the price results in a decrease in demand. However, the negative relationship is weak because the data points are not tightly clustered around a straight line (Bluman, 2000).
From the scatter diagram of demand against income, it can be established that there is a positive association between the demand for coffee in the U.K. and consumer income; that is, demand increases as income rises. The positive relationship is strong because the data points are tightly clustered around a straight line (Bluman, 2000). A possible explanation for this pattern is that at higher incomes, consumers have more disposable income to spend on products such as coffee.
Regression between demand and price
The estimated linear relationship between the demand for coffee and the price is Y = 3366.055 - 11.3635X1. The intercept, 3366.055, is the predicted level of demand when the price is zero. The coefficient for price is -11.3635 with a standard error of 4.7388, indicating that a one-pound increase in the price reduces demand by 11.3635 units. The multiple correlation coefficient is 0.4820; combined with the negative slope, this indicates that the relationship between demand and price is negative and relatively weak (Cohen, Cohen, West, & Aiken, 2013). The standard error of the regression is 157.9299, which provides an estimate of the variation of the observed demand for coffee about the regression line (Schielzeth, 2010). It means that the typical distance of the data points from the fitted line is about 157.93 units of demand, an indication that many of the observed values lie far from the regression line (McHugh, 2008). The F-statistic is significant because the p-value of 0.0269 is less than the alpha of 0.05, meaning that the estimated relationship is unlikely to be due to chance alone.
Appropriate test
The coefficient of determination is R2 = 143418.0638/617313.1429 = 0.2323, which means that 23.23% of the variation in the dependent variable (demand) around its mean is explained by the independent variable (price). The R-squared is low, so the linear model is not a good fit to the data, but the p-value of 0.0269 indicates that the relationship is nevertheless statistically significant.
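The fit statistics reported for the price model can be cross-checked with a few lines of Python. This is a minimal sketch rather than the original analysis: the two sums of squares are the ones quoted above, while the sample size of 21 is an assumption inferred from the n-2 = 19 degrees of freedom implied by the critical value of 2.0930 quoted in this section. The function name fit_summary is hypothetical.

```python
import math

# Sums of squares quoted in the text for the demand-on-price regression.
SSR = 143418.0638   # regression (explained) sum of squares
SST = 617313.1429   # total sum of squares
N_OBS = 21          # ASSUMPTION: inferred from n - 2 = 19 degrees of freedom


def fit_summary(ssr, sst, n, k=1):
    """Return R-squared, multiple R, standard error and F for k regressors."""
    sse = sst - ssr                      # residual (unexplained) sum of squares
    df_resid = n - k - 1                 # residual degrees of freedom
    r_squared = ssr / sst
    return {
        "r_squared": r_squared,
        "multiple_r": math.sqrt(r_squared),
        "std_error": math.sqrt(sse / df_resid),
        "f_stat": (ssr / k) / (sse / df_resid),
    }


summary = fit_summary(SSR, SST, N_OBS)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Under the assumed sample size, the computed standard error agrees with the reported 157.9299, which supports reading that figure in units of demand rather than as a percentage.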
The first step in evaluating the explanatory power of the model is to establish whether a linear relationship exists. If there is a linear relationship, β≠0, and if there is no linear relationship, β=0. In this case, the estimated slope is -11.3635, which suggests the existence of a linear relationship. The second step is to test whether, at α=0.05, there is sufficient evidence of a linear relationship between demand and price.
H0:β1=0
Hα: β1≠0, and α=0.05
Therefore, we use a two-tailed t-test with n-2 degrees of freedom. The critical t-values are +2.0930 and -2.0930. The model's sample t-statistic is -2.3979, and its absolute value is 2.3979. Since 2.3979>2.0930, we reject the null hypothesis in favor of the alternative hypothesis. Therefore, there is sufficient evidence at the 5% significance level that β≠0, and hence there is a linear relationship between demand and price.
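The decision rule used in this test can be sketched in Python as follows. The coefficient, standard error and critical value are taken from the text; the function name slope_t_test is hypothetical, and the critical value is supplied as an input rather than computed, so no statistical library is assumed.

```python
def slope_t_test(coef, std_err, t_critical):
    """Two-tailed t-test of H0: beta = 0 against Ha: beta != 0."""
    t_stat = coef / std_err                 # sample t-statistic
    reject_h0 = abs(t_stat) > t_critical    # compare |t| with the critical value
    return t_stat, reject_h0


# Values quoted in the text for the demand-on-price regression.
T_CRITICAL = 2.0930   # two-tailed critical value at alpha = 0.05, as quoted
t_stat, reject_h0 = slope_t_test(coef=-11.3635, std_err=4.7388,
                                 t_critical=T_CRITICAL)
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```

The ratio of the rounded coefficient to its rounded standard error reproduces the reported t-statistic up to rounding in the last decimal place.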
Regression between demand and income
The estimated linear relationship between the demand for coffee and income is Y = 168.6861 + 6.2826X2. The coefficient of income is 6.2826, which means that a one-pound increase in income increases the demand for coffee by 6.2826 units. The standard error of the coefficient is 0.4987, and the p-value is 1.1327E-10, indicating that the relationship is significant. The R-squared is 0.8913, meaning that 89.13% of the variation in the dependent variable is explained by the independent variable (Cohen et al., 2013). The standard error of the regression is 58.9823, which means that the typical distance of the data points from the fitted line is 58.9823 units of demand.
Appropriate test
The first step in evaluating the explanatory power of the model is to establish whether a linear relationship exists. If there is a linear relationship, β≠0, and if there is no linear relationship, β=0. In this case, the estimated slope is 6.2826, which suggests the existence of a linear relationship. The second step is to test whether, at α=0.05, there is sufficient evidence of a linear relationship between demand and income.
H0:β1=0
Hα: β1≠0, and α=0.05
Therefore, we use a two-tailed t-test with n-2 degrees of freedom. The critical t-values are +2.0930 and -2.0930. The model's sample t-statistic is 12.5993, and since 12.5993>2.0930, we reject the null hypothesis in favor of the alternative hypothesis. Therefore, there is sufficient evidence at the 5% significance level that β≠0, and hence there is a linear relationship between demand and income.
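An equivalent way to read this test is through a 95% confidence interval for the slope: the null hypothesis is rejected exactly when the interval excludes zero. The sketch below is illustrative only. It uses the coefficient, standard error and critical value quoted in the text, and the function name slope_confidence_interval is hypothetical.

```python
def slope_confidence_interval(coef, std_err, t_critical):
    """Return the (lower, upper) bounds of coef +/- t_critical * std_err."""
    margin = t_critical * std_err
    return coef - margin, coef + margin


# Values quoted in the text for the demand-on-income regression.
lower, upper = slope_confidence_interval(coef=6.2826, std_err=0.4987,
                                         t_critical=2.0930)
excludes_zero = not (lower <= 0.0 <= upper)
print(f"95% interval for the income slope: ({lower:.4f}, {upper:.4f})")
print(f"Interval excludes zero: {excludes_zero}")
```

Because both bounds are positive, the interval leads to the same conclusion as the t-test.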
Section (B) Multiple Regression Analysis
The multiple linear regression model estimated from the data set is Y = 1616.506 - 6.8558X1 + 5.8635X2, where X1 is the price and X2 is income. The model is significant because the p-value of 6.03E-15 is less than 0.05. The individual relationships between demand and price (p-value = 6.96E-7) and between demand and income (p-value = 1.23E-14) are also statistically significant.
In Equation 1, the coefficient of the real price of coffee was -11.3635, but in the multiple regression it has changed to -6.8558, which is smaller in absolute value. In a regression model, a parameter estimate changes if the variable added to the model is correlated both with the response variable, Y, and with the variable that was already in the model. According to Wang & Jain (2003), collinearity among the independent variables causes the estimated coefficients to change when an independent variable (here, income) is added to the model. The multicollinearity in this case is not severe and is therefore not problematic. Correlation between the independent variables implies that when income shifts, the price of coffee also tends to change, and this explains why the coefficient for the price moves toward zero (Anderson, Sweeney, & Williams, 2011).
The value of R2 for the multiple regression is 0.9737, while R2 for the simple linear regression [1] was 0.2323. The implication is that the addition of an independent variable increases the R-squared of the model. The addition of an extra variable causes the SSE to decrease and, at the same time, the SSR to increase by the same amount (Su, Yan & Tsai, 2012). However, the value of SST remains the same; that is, it does not change with the model. The increase in SSR is the amount of variation explained by the variables in the larger model that was not accounted for in the smaller model. Because OLS chooses the coefficients that minimize the SSE, adding a regressor cannot increase the SSE, and so R-squared cannot fall.
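The decomposition SST = SSR + SSE behind this argument can be illustrated numerically. The sketch below is an assumption-laden cross-check, not output from the original regression: SST and the two R-squared values are taken from the text, while the SSR and SSE figures it prints are derived from those rounded values rather than reported in the paper. The function name decompose is hypothetical.

```python
SST = 617313.1429   # total sum of squares of demand, quoted in Section A


def decompose(r_squared, sst):
    """Split SST into explained (SSR) and unexplained (SSE) parts."""
    ssr = r_squared * sst
    sse = sst - ssr
    return ssr, sse


# R-squared values quoted in the text (rounded to four decimals).
ssr_simple, sse_simple = decompose(0.2323, SST)   # demand on price only
ssr_multi, sse_multi = decompose(0.9737, SST)     # demand on price and income

gain_in_ssr = ssr_multi - ssr_simple   # extra variation explained by income
drop_in_sse = sse_simple - sse_multi   # matching fall in unexplained variation

print(f"Increase in SSR: {gain_in_ssr:.2f}")
print(f"Decrease in SSE: {drop_in_sse:.2f}")
```

Since SST is the same for both models, the rise in SSR and the fall in SSE are equal by construction.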
Conclusion
The demand for coffee and price have a weak but significant negative relationship, and the linear model is Y=3366.055-11.3635X1, where X1 is the price, meaning that a one-pound increase in price reduces demand by 11.3635 units. The model has explanatory power because the hypothesis test rejects the null hypothesis (there is no linear relationship) in favor of the alternative hypothesis. On the other hand, demand for coffee and income have a strong, positive and significant relationship, as evidenced by an R of 0.9450. The linear relationship between the variables is Y=168.6861+6.2826X2, which means that a one-pound increase in consumer income increases demand by 6.2826 units. The model has explanatory power because the estimated t-statistic exceeds the critical t-value, so we reject the null hypothesis in favor of the alternative hypothesis (there is a linear relationship).
The multiple linear regression model with demand as the dependent variable and price and income as the independent variables can be expressed as Y=1616.506-6.8558X1+5.8635X2. This means that, holding the other variable constant, a one-pound increase in price reduces demand by 6.8558 units, while a one-pound increase in income increases demand by 5.8635 units. The coefficient of price has moved from -11.3635 to -6.8558, becoming smaller in absolute value, with the cause being moderate multicollinearity between the independent variables. Also, R-squared has increased after the addition of the extra variable because the SSE has fallen. The equations in Section A are valid in the sense that both slopes are statistically significant, although only the income model fits the data well (R-squared of 0.8913); the price model explains just 23.23% of the variation in demand. The equation in Section B is also valid because its R-squared of 0.9737 indicates that the model fits the data well.
References
Anderson, D. R., Sweeney, D. J., & Williams, T. A. (2011). Essentials of modern business statistics with Microsoft Excel. Cengage Learning
Bluman, A. G. (2000). Elementary statistics. McGraw Hill Publishers
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2013). Applied multiple regression/correlation analysis for the behavioral sciences. Routledge
McHugh, M. L. (2008). Standard error: meaning and interpretation. Biochemia Medica, 18(1), 7-13
Schielzeth, H. (2010). Simple means to improve the interpretability of regression coefficients. Methods in Ecology and Evolution, 1(2), 103-113
Su, X., Yan, X., & Tsai, C. L. (2012). Linear regression. Wiley Interdisciplinary Reviews: computational statistics, 4(3), 275-294
Wang, G. C., & Jain, C. L. (2003). Regression analysis: modeling & forecasting. Institute of Business Forec