Missing Value Imputation: SimpleImputer

SimpleImputer

sklearn.impute.SimpleImputer(*, missing_values=nan, strategy='mean', fill_value=None, verbose='deprecated', copy=True, add_indicator=False)

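Imputes missing values using simple strategies.

Missing values are filled column by column, either with a statistic computed from that column (mean, median, or most frequent value) or with a constant.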
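As a minimal sketch of the column-wise behaviour described above, the following fills each missing entry with the mean of its own column. The array `X` is toy data chosen for illustration, not taken from the original article.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (illustrative assumption): one missing value in each column.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan]])

# strategy="mean" is the default; it is spelled out here for clarity.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)

# Column 0 mean is (1 + 7) / 2 = 4.0, column 1 mean is (2 + 3) / 2 = 2.5.
print(X_filled)
```

Each column is handled independently, so a missing value in one column never affects the fill value of another.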

Parameters

missing_values

int, float, str, np.nan, None or pandas.NA, default=np.nan
Placeholder for missing values. Every occurrence of missing_values in the data is imputed.
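When a dataset marks missing entries with a sentinel rather than NaN, `missing_values` can point at that sentinel. The sketch below assumes a hypothetical convention where `-1` means "missing"; the data is invented for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (illustrative assumption): -1 is used as the missing-value marker.
X = np.array([[1.0, -1.0],
              [3.0, 4.0],
              [-1.0, 6.0]])

# Treat -1 as missing and fill it with the column median.
imputer = SimpleImputer(missing_values=-1, strategy="median")
X_filled = imputer.fit_transform(X)

# Column 0 median of [1, 3] is 2.0; column 1 median of [4, 6] is 5.0.
print(X_filled)
```

Only entries equal to the placeholder are replaced, so a sentinel that can also occur as a legitimate value would be imputed as well.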

strategy

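str, default='mean'

Imputation strategy	Value used to fill missing entries
mean	Mean of each column (numeric data only)
median	Median of each column (numeric data only)
most_frequent	Most frequent value of each column (numeric or string data)
constant	The value given by `fill_value`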
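To make the difference between the statistical strategies concrete, this sketch fits one imputer per strategy on the same toy array and collects the fill value each one learns for every column. The data and the `fills` dictionary are assumptions made for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (illustrative assumption): one missing value per column.
X = np.array([[1.0, 10.0],
              [1.0, 20.0],
              [4.0, 20.0],
              [np.nan, 50.0],
              [10.0, np.nan]])

fills = {}
for strategy in ("mean", "median", "most_frequent"):
    imputer = SimpleImputer(strategy=strategy).fit(X)
    # statistics_ holds the learned fill value of each column.
    fills[strategy] = imputer.statistics_
    print(strategy, imputer.statistics_)

# Column 0 observed values: 1, 1, 4, 10  -> mean 4.0,  median 2.5,  mode 1.0
# Column 1 observed values: 10, 20, 20, 50 -> mean 25.0, median 20.0, mode 20.0
```

The median and most frequent value are less sensitive to outliers than the mean, which is one reason to prefer them on skewed columns.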

fill_value

str or numerical value, default=None
When strategy='constant', fill_value is used to replace every missing value.
When left at its default (None), fill_value is 0 for numeric data and 'missing_value' for string or object data.
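The defaults described above can be checked with a short sketch. The arrays and the custom fill values (`-1.0`, `"unknown"`) are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy numeric and string data (illustrative assumptions).
X_num = np.array([[1.0, np.nan],
                  [np.nan, 4.0]])
X_str = np.array([["red"], [np.nan], ["blue"]], dtype=object)

# fill_value left at None: numeric data is filled with 0,
# string/object data with the literal string "missing_value".
num_default = SimpleImputer(strategy="constant").fit_transform(X_num)
str_default = SimpleImputer(strategy="constant").fit_transform(X_str)

# An explicit fill_value overrides those defaults.
num_custom = SimpleImputer(strategy="constant", fill_value=-1.0).fit_transform(X_num)
str_custom = SimpleImputer(strategy="constant", fill_value="unknown").fit_transform(X_str)

print(num_default)
print(str_default)
print(num_custom)
print(str_custom)
```

A constant that cannot occur naturally in the column (such as a dedicated category name) keeps imputed entries distinguishable from observed ones.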

copy

bool, default=True
If True, a copy of X is created. If False, imputation is done in place whenever possible.

Attributes

statistics_

array of shape (n_features,)
The imputation fill value for each feature.

indicator_

MissingIndicator
Indicator used to add binary indicators for missing values. It is None when add_indicator=False.
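A small sketch of what `add_indicator=True` does to the output, using invented toy data. With the indicator's default settings, an indicator column is appended only for features that contained missing values during fit.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (illustrative assumption): only column 0 has a missing value.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, 6.0]])

imputer = SimpleImputer(strategy="mean", add_indicator=True)
X_out = imputer.fit_transform(X)

# The first two columns are the imputed features; the third is a 0/1 flag
# marking the rows where column 0 was originally missing.
print(X_out)

# Indices of the features that received an indicator column.
print(imputer.indicator_.features_)
```

Keeping the flag lets a downstream model learn from the fact that a value was missing, not only from the imputed number.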

n_features_in_

int
Number of features seen during fit.

feature_names_in_

ndarray of shape (n_features_in_,)
Names of the features seen during fit. Defined only when X has feature names that are all strings.

Methods

fit(X[, y])

Fit the imputer on X.

fit_transform(X[, y])

Fit to data, then transform it.

get_feature_names_out([input_features])

Get output feature names for transformation.

get_params([deep])

Get parameters for this estimator.

inverse_transform(X)

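Convert the data back to the original representation, restoring the missing values. This is only available when the imputer was created with add_indicator=True.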
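A minimal round-trip sketch for `inverse_transform`, using invented toy data. The method relies on the indicator columns to know which entries were imputed, so the imputer here is created with `add_indicator=True`.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (illustrative assumption): one missing value in each column.
X = np.array([[1.0, np.nan],
              [np.nan, 3.0],
              [7.0, 6.0]])

imputer = SimpleImputer(strategy="mean", add_indicator=True)
X_imputed = imputer.fit_transform(X)   # imputed features + indicator columns

# The indicator columns tell inverse_transform which entries to turn back
# into the missing-value placeholder (NaN here).
X_restored = imputer.inverse_transform(X_imputed)
print(X_restored)
```

Only features that have an indicator column can be restored this way.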

set_params(**params)

Set the parameters of this estimator.
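`get_params` and `set_params` follow the usual scikit-learn estimator convention. A brief sketch, with parameter values chosen only for illustration:

```python
from sklearn.impute import SimpleImputer

imputer = SimpleImputer()

# get_params returns the constructor arguments as a dict.
params = imputer.get_params()
print(params["strategy"])

# set_params updates them in place and returns the estimator itself,
# which is what lets tools such as grid search reconfigure it.
returned = imputer.set_params(strategy="median", add_indicator=True)
print(imputer.get_params()["strategy"])
```

Changing parameters does not refit the imputer; `fit` has to be called again for the new settings to take effect.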

transform(X)

Impute all missing values in X.
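A common way to combine `fit` and `transform` is to learn the fill values on training data and reuse them on new data. The sketch below assumes a hypothetical train/test split with invented values.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy training and test data (illustrative assumptions).
X_train = np.array([[1.0, 10.0],
                    [3.0, np.nan],
                    [np.nan, 30.0]])
X_test = np.array([[np.nan, np.nan],
                   [5.0, 6.0]])

# Learn the column means from the training data only.
imputer = SimpleImputer(strategy="mean").fit(X_train)

# transform reuses the training means (2.0 and 20.0); it does not
# recompute statistics from X_test.
X_test_filled = imputer.transform(X_test)
print(X_test_filled)
```

Fitting on the training set alone keeps information from the test set out of the learned fill values.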

Original article: https://blog.csdn.net/m0_54510474/article/details/128032519