The stacking model is very common in Kaglle competitions. Why?
【机器学习大杀器】Stacking堆叠模型(English)
2.1 description of the algorithms:
2.2 interpretation of the estimated models:
3.2 the coefficient of last model
===============================================
Include three models:
=================
==============================================================
When we get three models with equal robustness but different structure, which model should we choose at this time? I think this is a very difficult thing, but through the integrated learning method, we can easily synthesize these three models into a better model that absorbs the advantages of each model. Stacking is just such a strategy for merging models.

In this experiment, we just used such a stacking strategy, which greatly improved the effect of our model. In the first layer of the model, there are three stacked models: {1.Lasso, 2.Random Forest, 3.Gradient Boost}, and used {Logistics Region} as the second layer of the model.
Stacking is generally composed of two layers. Level 1: basic models with excellent performance (there can be multiple models); The second layer: take the output of the models in the first layer as the model obtained from the training set. The second layer model is also called "meta model". The key role is to integrate the results of all models in the first layer and output them. That is, the second layer model trains the output of the first layer model as a feature. The overall steps of stacking are as follows:

- from sklearn.ensemble import StackingRegressor
- models = [('Lasso', lo), ('Random Forest', rf), ('Gradient Boost', gb_boost)]
-
- stack = StackingRegressor(models, final_estimator=LinearRegression(positive=True), cv=5, n_jobs=4)
- stack.fit(x_train, y_train)
- stack_log_pred = stack.predict(x_test)
- print(stack_log_pred)
- stack_pred = np.exp(stack_log_pred)
-
- plt.barh(np.arange(len(models)), stack.final_estimator_.coef_)
- plt.yticks(np.arange(len(models)), ['Linear Regression', 'Random Forest', 'Gradient Boost']);
- plt.xlabel('Model coefficient')
- plt.title('Figure 4. Model coefficients for our stacked model');
In order to better understand what Stacking has done, we printed the coefficient of the last layer of the Stacking model, i.e. the second layer of the Linear Regression model, after we completed the training with skleran, as shown in the figure:

Coefficients of high and positive linear regression models coef_ It means a stronger relationship. From the above figure, we can see that the proportion of the three models we combined in Stacking is 8:2:0. It can be seen that Gradient Boost has the greatest impact on the overall prediction results of the model.