在4-3中使用了分类准确度accuracy来评判KNN算法的性能。

使用accuracy来评价KNN算法对手写数据集的分类效果

加载手写数据集

from sklearn import datasets
digits = datasets.load_digits()

digits的内容如下:

输入:digits.keys()
输出:dict_keys(['data', 'target', 'target_names', 'images', 'DESCR'])

输入:print(digits.DESCR)
输出:digits的官方说明

输入:digits.data.shape()
输出:(1797, 64)

输入:digits.target.shape()
输出:(1797,)

输入:digits.target_names
输出:array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

输入:

some_digit = X[666]

some_digit_image = some_digit.reshape(8, 8)
plt.imshow(some_digit_image, cmap = matplotlib.cm.binary)
plt.show()

输出:第666个样本的图像

train_test_split + KNN + accuracy

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from sklearn import datasets

digits = datasets.load_digits()
X = digits.data
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_ratio=0.2)
my_knn_clf = KNNClassifier(k=3)
my_knn_clf.fit(X_train, y_train)
y_predict = my_knn_clf.predict(X_test)
accuracy = sum(y_predict == y_test) / len(y_test)

scikit-learn中的accuracy_score

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=666)

from sklearn.neighbors import KNeighborsClassifier
knn_clf = KNeighborsClassifier(n_neighbors=3)

knn_clf.fit(X_train, y_train)
y_predict = knn_clf.predict(X_test)

from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_predict)