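Task: Design and implement a simple SVM classifier, applied to iris classification. Split the iris dataset into a training set and a test set, train an SVM classification model on the training set, and use the model to predict the class of each sample in the test set.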
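The approach and steps are as follows. (1) Read the dataset and separate the labels from the features. (2) Standardize the dataset. (3) Split the dataset into a training set and a test set. (4) Build the SVM model. (5) Output the predictions for the test set, evaluate the classifier's performance, and print the test report.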
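- Import the required modules and the dataset, then separate the labels from the features. The read_csv() function from the Pandas library reads the data file 'iris.csv', and the iloc indexer selects the first 4 columns of all 150 rows as the feature data X and the 5th column as the label data y.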
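- Standardize the data to improve the SVM's classification performance. The StandardScaler class from the preprocessing module standardizes the feature data X: each feature is centered to zero mean and scaled to unit variance, so that no feature dominates merely because it has a larger numeric range.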
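- Randomly split the dataset into a training set (X_train, y_train) and a test set (X_test, y_test) at an 8:2 ratio. 20% of the data goes to the test set (30 of the 150 samples), and the random seed is set to 42 so that the split is the same on every run.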
- Build the SVM model. The SVC class from the svm module creates an SVM with a linear kernel, named svm. Calling fit() with the training data (X_train, y_train) as arguments trains the model.
- Predict on the test set. Passing the test features X_test to predict() classifies each test sample and returns the predictions y_pred.
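- Output the test results and the classification report. print() outputs the predictions y_pred and the evaluation report classification_report(y_test, y_pred). classification_report() shows the per-class precision, recall, F1 score, and support, together with the overall accuracy and the macro and weighted averages.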
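To make the columns of the classification report concrete, the sketch below recomputes per-class precision, recall, F1, and the support-weighted average in plain Python. The helper names per_class_metrics and weighted_average and the six sample labels are made up for illustration. They are not scikit-learn functions, and the labels are not results from this exercise.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} computed from scratch."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1, tp + fn)
    return metrics


def weighted_average(metrics, index):
    """Average one metric column, weighting each class by its support."""
    total = sum(m[3] for m in metrics.values())
    return sum(m[index] * m[3] for m in metrics.values()) / total


# Hypothetical labels, invented for illustration only.
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

metrics = per_class_metrics(y_true, y_pred)
for label, (p, r, f1, support) in metrics.items():
    print(f"{label:<12} precision={p:.2f} recall={r:.2f} f1={f1:.2f} support={support}")
print(f"weighted avg recall={weighted_average(metrics, 1):.2f}")
```

In this made-up example one versicolor sample is predicted as virginica, which lowers the recall of versicolor and the precision of virginica while leaving setosa untouched.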
iris.csv
- sepal_length,sepal_width,petal_length,petal_width,species
- 5.1,3.5,1.4,0.2,Iris-setosa
- 4.9,3,1.4,0.2,Iris-setosa
- 4.7,3.2,1.3,0.2,Iris-setosa
- 4.6,3.1,1.5,0.2,Iris-setosa
- 5,3.6,1.4,0.2,Iris-setosa
- 5.4,3.9,1.7,0.4,Iris-setosa
- 4.6,3.4,1.4,0.3,Iris-setosa
- 5,3.4,1.5,0.2,Iris-setosa
- 4.4,2.9,1.4,0.2,Iris-setosa
- 4.9,3.1,1.5,0.1,Iris-setosa
- 5.4,3.7,1.5,0.2,Iris-setosa
- 4.8,3.4,1.6,0.2,Iris-setosa
- 4.8,3,1.4,0.1,Iris-setosa
- 4.3,3,1.1,0.1,Iris-setosa
- 5.8,4,1.2,0.2,Iris-setosa
- 5.7,4.4,1.5,0.4,Iris-setosa
- 5.4,3.9,1.3,0.4,Iris-setosa
- 5.1,3.5,1.4,0.3,Iris-setosa
- 5.7,3.8,1.7,0.3,Iris-setosa
- 5.1,3.8,1.5,0.3,Iris-setosa
- 5.4,3.4,1.7,0.2,Iris-setosa
- 5.1,3.7,1.5,0.4,Iris-setosa
- 4.6,3.6,1,0.2,Iris-setosa
- 5.1,3.3,1.7,0.5,Iris-setosa
- 4.8,3.4,1.9,0.2,Iris-setosa
- 5,3,1.6,0.2,Iris-setosa
- 5,3.4,1.6,0.4,Iris-setosa
- 5.2,3.5,1.5,0.2,Iris-setosa
- 5.2,3.4,1.4,0.2,Iris-setosa
- 4.7,3.2,1.6,0.2,Iris-setosa
- 4.8,3.1,1.6,0.2,Iris-setosa
- 5.4,3.4,1.5,0.4,Iris-setosa
- 5.2,4.1,1.5,0.1,Iris-setosa
- 5.5,4.2,1.4,0.2,Iris-setosa
- 4.9,3.1,1.5,0.2,Iris-setosa
- 5,3.2,1.2,0.2,Iris-setosa
- 5.5,3.5,1.3,0.2,Iris-setosa
- 4.9,3.6,1.4,0.1,Iris-setosa
- 4.4,3,1.3,0.2,Iris-setosa
- 5.1,3.4,1.5,0.2,Iris-setosa
- 5,3.5,1.3,0.3,Iris-setosa
- 4.5,2.3,1.3,0.3,Iris-setosa
- 4.4,3.2,1.3,0.2,Iris-setosa
- 5,3.5,1.6,0.6,Iris-setosa
- 5.1,3.8,1.9,0.4,Iris-setosa
- 4.8,3,1.4,0.3,Iris-setosa
- 5.1,3.8,1.6,0.2,Iris-setosa
- 4.6,3.2,1.4,0.2,Iris-setosa
- 5.3,3.7,1.5,0.2,Iris-setosa
- 5,3.3,1.4,0.2,Iris-setosa
- 7,3.2,4.7,1.4,Iris-versicolor
- 6.4,3.2,4.5,1.5,Iris-versicolor
- 6.9,3.1,4.9,1.5,Iris-versicolor
- 5.5,2.3,4,1.3,Iris-versicolor
- 6.5,2.8,4.6,1.5,Iris-versicolor
- 5.7,2.8,4.5,1.3,Iris-versicolor
- 6.3,3.3,4.7,1.6,Iris-versicolor
- 4.9,2.4,3.3,1,Iris-versicolor
- 6.6,2.9,4.6,1.3,Iris-versicolor
- 5.2,2.7,3.9,1.4,Iris-versicolor
- 5,2,3.5,1,Iris-versicolor
- 5.9,3,4.2,1.5,Iris-versicolor
- 6,2.2,4,1,Iris-versicolor
- 6.1,2.9,4.7,1.4,Iris-versicolor
- 5.6,2.9,3.6,1.3,Iris-versicolor
- 6.7,3.1,4.4,1.4,Iris-versicolor
- 5.6,3,4.5,1.5,Iris-versicolor
- 5.8,2.7,4.1,1,Iris-versicolor
- 6.2,2.2,4.5,1.5,Iris-versicolor
- 5.6,2.5,3.9,1.1,Iris-versicolor
- 5.9,3.2,4.8,1.8,Iris-versicolor
- 6.1,2.8,4,1.3,Iris-versicolor
- 6.3,2.5,4.9,1.5,Iris-versicolor
- 6.1,2.8,4.7,1.2,Iris-versicolor
- 6.4,2.9,4.3,1.3,Iris-versicolor
- 6.6,3,4.4,1.4,Iris-versicolor
- 6.8,2.8,4.8,1.4,Iris-versicolor
- 6.7,3,5,1.7,Iris-versicolor
- 6,2.9,4.5,1.5,Iris-versicolor
- 5.7,2.6,3.5,1,Iris-versicolor
- 5.5,2.4,3.8,1.1,Iris-versicolor
- 5.5,2.4,3.7,1,Iris-versicolor
- 5.8,2.7,3.9,1.2,Iris-versicolor
- 6,2.7,5.1,1.6,Iris-versicolor
- 5.4,3,4.5,1.5,Iris-versicolor
- 6,3.4,4.5,1.6,Iris-versicolor
- 6.7,3.1,4.7,1.5,Iris-versicolor
- 6.3,2.3,4.4,1.3,Iris-versicolor
- 5.6,3,4.1,1.3,Iris-versicolor
- 5.5,2.5,4,1.3,Iris-versicolor
- 5.5,2.6,4.4,1.2,Iris-versicolor
- 6.1,3,4.6,1.4,Iris-versicolor
- 5.8,2.6,4,1.2,Iris-versicolor
- 5,2.3,3.3,1,Iris-versicolor
- 5.6,2.7,4.2,1.3,Iris-versicolor
- 5.7,3,4.2,1.2,Iris-versicolor
- 5.7,2.9,4.2,1.3,Iris-versicolor
- 6.2,2.9,4.3,1.3,Iris-versicolor
- 5.1,2.5,3,1.1,Iris-versicolor
- 5.7,2.8,4.1,1.3,Iris-versicolor
- 6.3,3.3,6,2.5,Iris-virginica
- 5.8,2.7,5.1,1.9,Iris-virginica
- 7.1,3,5.9,2.1,Iris-virginica
- 6.3,2.9,5.6,1.8,Iris-virginica
- 6.5,3,5.8,2.2,Iris-virginica
- 7.6,3,6.6,2.1,Iris-virginica
- 4.9,2.5,4.5,1.7,Iris-virginica
- 7.3,2.9,6.3,1.8,Iris-virginica
- 6.7,2.5,5.8,1.8,Iris-virginica
- 7.2,3.6,6.1,2.5,Iris-virginica
- 6.5,3.2,5.1,2,Iris-virginica
- 6.4,2.7,5.3,1.9,Iris-virginica
- 6.8,3,5.5,2.1,Iris-virginica
- 5.7,2.5,5,2,Iris-virginica
- 5.8,2.8,5.1,2.4,Iris-virginica
- 6.4,3.2,5.3,2.3,Iris-virginica
- 6.5,3,5.5,1.8,Iris-virginica
- 7.7,3.8,6.7,2.2,Iris-virginica
- 7.7,2.6,6.9,2.3,Iris-virginica
- 6,2.2,5,1.5,Iris-virginica
- 6.9,3.2,5.7,2.3,Iris-virginica
- 5.6,2.8,4.9,2,Iris-virginica
- 7.7,2.8,6.7,2,Iris-virginica
- 6.3,2.7,4.9,1.8,Iris-virginica
- 6.7,3.3,5.7,2.1,Iris-virginica
- 7.2,3.2,6,1.8,Iris-virginica
- 6.2,2.8,4.8,1.8,Iris-virginica
- 6.1,3,4.9,1.8,Iris-virginica
- 6.4,2.8,5.6,2.1,Iris-virginica
- 7.2,3,5.8,1.6,Iris-virginica
- 7.4,2.8,6.1,1.9,Iris-virginica
- 7.9,3.8,6.4,2,Iris-virginica
- 6.4,2.8,5.6,2.2,Iris-virginica
- 6.3,2.8,5.1,1.5,Iris-virginica
- 6.1,2.6,5.6,1.4,Iris-virginica
- 7.7,3,6.1,2.3,Iris-virginica
- 6.3,3.4,5.6,2.4,Iris-virginica
- 6.4,3.1,5.5,1.8,Iris-virginica
- 6,3,4.8,1.8,Iris-virginica
- 6.9,3.1,5.4,2.1,Iris-virginica
- 6.7,3.1,5.6,2.4,Iris-virginica
- 6.9,3.1,5.1,2.3,Iris-virginica
- 5.8,2.7,5.1,1.9,Iris-virginica
- 6.8,3.2,5.9,2.3,Iris-virginica
- 6.7,3.3,5.7,2.5,Iris-virginica
- 6.7,3,5.2,2.3,Iris-virginica
- 6.3,2.5,5,1.9,Iris-virginica
- 6.5,3,5.2,2,Iris-virginica
- 6.2,3.4,5.4,2.3,Iris-virginica
- 5.9,3,5.1,1.8,Iris-virginica
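The feature/label split and the standardization step can be checked by hand on a few rows of this file. The sketch below uses only the Python standard library on a five-row excerpt copied from the data above. The helper name standardize is made up for illustration. It uses the population standard deviation, which is what StandardScaler uses, and falls back to a scale of 1 for a constant column.

```python
import csv
import io
from statistics import fmean, pstdev

# Header plus five rows copied from iris.csv (an excerpt, not the full file).
excerpt = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3,1.4,0.2,Iris-setosa
7,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6,2.5,Iris-virginica
"""

rows = list(csv.reader(io.StringIO(excerpt)))
header, records = rows[0], rows[1:]

# Same split as df.iloc[:, :-1] (features) and df.iloc[:, -1] (labels).
X = [[float(value) for value in record[:-1]] for record in records]
y = [record[-1] for record in records]


def standardize(matrix):
    """Scale every column to zero mean and unit (population) variance."""
    columns = list(zip(*matrix))
    means = [fmean(column) for column in columns]
    stds = [pstdev(column) or 1.0 for column in columns]  # avoid division by zero
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in matrix]


X_scaled = standardize(X)
for name, column in zip(header[:-1], zip(*X_scaled)):
    print(f"{name:<13} mean={fmean(column):+.3f} std={pstdev(column):.3f}")
```

After scaling, every feature column has a mean of approximately 0 and a standard deviation of approximately 1, so petal_length no longer outweighs petal_width simply because its raw values are larger.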
main.py
- import pandas as pd
- from sklearn.model_selection import train_test_split
- from sklearn.preprocessing import StandardScaler
- from sklearn.svm import SVC
- from sklearn.metrics import classification_report
-
- # Read the dataset
- df = pd.read_csv('iris.csv')
-
- # Separate the features (first 4 columns) from the labels (last column)
- X = df.iloc[:, :-1].values
- y = df.iloc[:, -1].values
-
- # Standardize the features (zero mean, unit variance)
- scaler = StandardScaler()
- X = scaler.fit_transform(X)
-
- # Split into training and test sets (80/20, fixed random seed)
- X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
-
- # Build and train the SVM model with a linear kernel
- svm = SVC(kernel='linear')
- svm.fit(X_train, y_train)
-
- # Predict the classes of the test set
- y_pred = svm.predict(X_test)
-
- # Print the predictions and the classification report
- print('Test set predictions:\n', y_pred)
- print('Classification performance report:\n', classification_report(y_test, y_pred))
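As a variation on main.py, the sketch below fits the scaler on the training split only, by wrapping StandardScaler and SVC in a scikit-learn pipeline. This keeps test-set statistics out of the preprocessing step. Two assumptions differ from the exercise: scikit-learn's bundled iris data (load_iris) stands in for iris.csv so the block runs without the file, which means the labels are the integers 0, 1, 2 rather than species names, and stratify=y is added so each class is equally represented in the test set.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the bundled dataset stands in for iris.csv.
X, y = load_iris(return_X_y=True)

# Split first, so the test set plays no part in fitting the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The pipeline fits StandardScaler on X_train only, then trains the linear SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X_train, y_train)

# predict() reuses the training-set mean and variance to scale X_test.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print("Test set predictions:\n", y_pred)
print("Classification performance report:\n", classification_report(y_test, y_pred))
```

On a dataset as small and well separated as iris the difference from main.py is likely to be minor, but the pipeline form carries over unchanged to cross-validation, where scaling before splitting would leak information between folds.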
Run results