
python sklearn decision_function, predict_proba, predict

2022-03-02 01:39:29
import matplotlib.pyplot as plt
import numpy as np
from sklearn.svm import SVC
X = np.array([[-1,-1],[-2,-1],[1,1],[2,1],[-1,1],[-1,2],[1,-1],[1,-2]])
y = np.array([0,0,1,1,2,2,3,3])
# y = np.array([1,1,2,2,3,3,4,4])
# decision_function_shape defaults to "ovr"; pass decision_function_shape="ovo" to get the pairwise scores instead.
clf = SVC(probability=True)
clf.fit(X, y)
print(clf.decision_function(X))
'''
With n classes, SVC trains one binary classifier for every pair of classes, n*(n-1)/2 in total, and each of them defines its own separating hyperplane.
With decision_function_shape="ovo", decision_function() returns those n*(n-1)/2 values for each sample. Each value is the signed score of the sample against one pairwise hyperplane, proportional to its distance from that hyperplane.
With decision_function_shape="ovr" (the default), the pairwise results are combined into n values per sample, one per class.
I think this function is mainly useful for plotting, most obviously in binary classification: the score shows on which side of the hyperplane each point lies and how far away it is, so the data, the hyperplane, and the margin planes can all be drawn.
For the 4 classes here, "ovr" gives 4 values per sample and "ovo" gives 6.
'''
print(clf.predict(X))
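print(clf.predict_proba(X))  # Estimated class probabilities, one column per class in clf.classes_ order, each row summing to 1. The most probable class usually matches predict(), but on a dataset this small the two can disagree.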
#plot
plot_step=0.02
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),
                     np.arange(y_min, y_max, plot_step))

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])  # Predict the class of every grid point. The filled contours of these predictions show the decision regions, and the borders between regions are the decision boundaries.
Z = Z.reshape(xx.shape)
cs = plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)
plt.axis("tight")

class_names="ABCD"
plot_colors="rybg"
for i, n, c in zip(range(4), class_names, plot_colors):
    idx = y == i  # boolean mask for the samples of class i (i runs from 0 to 3, four classes)
    plt.scatter(X[idx, 0], X[idx, 1],
                c=c,  # a fixed color per class, so no colormap is needed
                label="Class %s" % n)
plt.xlim(x_min, x_max)
plt.ylim(y_min, y_max)
plt.legend(loc='upper right')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Decision Boundary')
plt.show()
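
The count of decision_function values described in the comments above can be checked directly. The following is only a minimal sketch on the same toy data, not part of the original script: the variable names (`pairs`, `ovo_scores`, `ovr_scores`, `tie`, `pred_from_scores`) are made up for illustration, and `break_ties=True` is an assumption added so that `predict()` is defined as the argmax over the "ovr" scores. With the default `break_ties=False`, the two can differ when classes tie on votes.

```python
from itertools import combinations

import numpy as np
from sklearn.svm import SVC

# Same toy data as the script above: 4 classes, 2 points each.
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1],
              [-1, 1], [-1, 2], [1, -1], [1, -2]])
y = np.array([0, 0, 1, 1, 2, 2, 3, 3])

# One-vs-one uses one binary classifier per unordered pair of classes.
n_classes = len(np.unique(y))
pairs = list(combinations(range(n_classes), 2))
print(len(pairs))  # 6, i.e. n*(n-1)/2 for n = 4

# "ovo" exposes the raw pairwise scores, "ovr" one combined score per class.
ovo_scores = SVC(decision_function_shape="ovo").fit(X, y).decision_function(X)
ovr_scores = SVC(decision_function_shape="ovr").fit(X, y).decision_function(X)
print(ovo_scores.shape)  # (8, 6)
print(ovr_scores.shape)  # (8, 4)

# Assumption for illustration: break_ties=True makes predict() pick the
# class whose "ovr" score is largest.
tie = SVC(decision_function_shape="ovr", break_ties=True).fit(X, y)
pred_from_scores = tie.classes_[np.argmax(tie.decision_function(X), axis=1)]
print(np.array_equal(pred_from_scores, tie.predict(X)))
```

The shapes match the note in the comments: 4 columns for "ovr" and 6 for "ovo", with one row for each of the 8 samples.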
