Summary of Information Retrieval Evaluation Metrics - JobPlus

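1. Goal of NDCG: we want the ranked list to be of the highest possible quality. The nearer the top the more relevant items are placed, the higher the resulting NDCG.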
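The text does not spell out the formula. A common definition, chosen here because it matches the gain (`2 ** y_true - 1`) and discount (`np.log2(position + 2)` with zero-based positions) used in the code in part 2, is sketched below. The symbols $rel_i$ (relevance of the item at rank $i$) and IDCG (the DCG of the ideal ordering) are notation introduced for this sketch.

```latex
\mathrm{DCG@}k = \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log_2(i + 1)},
\qquad
\mathrm{NDCG@}k = \frac{\mathrm{DCG@}k}{\mathrm{IDCG@}k}
```

Because the denominator grows with the rank $i$, a relevant item contributes more when it sits nearer the top, which is why better orderings score higher.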

Differences between AUC and NDCG:

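1. Meaning of AUC: the probability that a positive sample is ranked ahead of a negative sample. AUC looks at the global ordering: every positive that is ranked ahead of a negative earns credit, and no position weighting is applied.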

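2. NDCG also looks at ranking, but at weighted ranking. For example, we may want ranking accuracy in the top 10 to matter more than ranking accuracy in the bottom 10. For this kind of weighted ranking, NDCG is the better fit.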
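To make the weighting concrete, here is a minimal sketch that compares the total position weight of the top 10 and bottom 10 ranks in a list of 100 items. It assumes the logarithmic discount `1 / log2(rank + 1)` used by the code in part 2; the list length of 100 and the variable names are illustrative choices, not something the original specifies.

```python
import numpy as np

# Ranks 1..100 of a hypothetical result list
ranks = np.arange(1, 101)

# Position weight under the log2 discount: rank 1 has weight 1.0,
# and the weight shrinks as the rank grows
weights = 1.0 / np.log2(ranks + 1)

top10_weight = weights[:10].sum()      # total weight of ranks 1..10
bottom10_weight = weights[-10:].sum()  # total weight of ranks 91..100

print("top-10 weight:   ", top10_weight)
print("bottom-10 weight:", bottom10_weight)
```

A mistake among the first ten positions therefore moves the metric far more than the same mistake among the last ten.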

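So the difference between AUC and NDCG is whether the ranking is weighted. Under AUC, ranking quality in the top 10 and in the bottom 10 count equally. Under NDCG, positions are weighted, so ranking quality in the top 10 carries a different weight than in the bottom 10.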
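The contrast can be checked on a toy case. The sketch below builds two rankings of six items (two positives, four negatives) that have the same AUC but different NDCG. The helper names `auc_from_ranking` and `ndcg_from_ranking` and both rankings are made up for illustration, and NDCG uses the same `2^rel - 1` gain and `log2` discount as the code in part 2.

```python
import numpy as np

def auc_from_ranking(labels):
    """AUC of a ranked list (best first): share of (pos, neg) pairs with pos ahead."""
    labels = np.asarray(labels)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    # neg_below[i] = number of negatives at position i or later
    neg_below = np.cumsum((labels == 0)[::-1])[::-1]
    return float(neg_below[labels == 1].sum() / (n_pos * n_neg))

def ndcg_from_ranking(labels):
    """NDCG of a ranked list (best first) with gain 2^rel - 1 and log2 discount."""
    labels = np.asarray(labels, dtype=float)
    discounts = np.log2(np.arange(len(labels)) + 2)
    dcg = np.sum((2 ** labels - 1) / discounts)
    ideal = np.sum((2 ** np.sort(labels)[::-1] - 1) / discounts)
    return float(dcg / ideal)

# 1 = positive, 0 = negative, listed from rank 1 downwards
ranking_a = [1, 0, 0, 0, 1, 0]  # positives at ranks 1 and 5
ranking_b = [0, 1, 0, 1, 0, 0]  # positives at ranks 2 and 4

auc_a, auc_b = auc_from_ranking(ranking_a), auc_from_ranking(ranking_b)
ndcg_a, ndcg_b = ndcg_from_ranking(ranking_a), ndcg_from_ranking(ranking_b)
print("AUC :", auc_a, auc_b)
print("NDCG:", ndcg_a, ndcg_b)
```

Both rankings order 5 of the 8 positive-negative pairs correctly, so AUC cannot tell them apart. NDCG prefers `ranking_a` because it places a positive at rank 1.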

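2. Note: sklearn only gained built-in NDCG computation in later releases (0.20 according to the original note; the public `sklearn.metrics.ndcg_score` is listed in the release notes from 0.22), so for older versions we can copy the code out and use it directly.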

[python]

import numpy as np
from sklearn.preprocessing import LabelBinarizer
from sklearn.utils import check_X_y


def dcg_score(y_true, y_score, k=5):
    """Discounted cumulative gain of the top-k items ranked by y_score."""
    # Indices of the items sorted by descending predicted score
    order = np.argsort(y_score)[::-1]
    # Relevance labels of the k highest-scored items
    y_true = np.take(y_true, order[:k])
    # Exponential gain: 2^rel - 1
    gain = 2 ** y_true - 1
    # Logarithmic discount: log2(rank + 1), with rank starting at 1
    discounts = np.log2(np.arange(len(y_true)) + 2)
    return np.sum(gain / discounts)


def ndcg_score(y_true, y_score, k=5):
    """Mean NDCG@k over samples: y_true holds class labels, y_score per-class scores."""
    y_score, y_true = check_X_y(y_score, y_true)

    # Make sure we use all the labels (max between the length and the highest
    # number in the array)
    lb = LabelBinarizer()
    lb.fit(np.arange(max(np.max(y_true) + 1, len(y_true))))
    binarized_y_true = lb.transform(y_true)
    if binarized_y_true.shape != y_score.shape:
        raise ValueError("y_true and y_score have different value ranges")

    scores = []

    # For each sample, compute its DCG and normalize by the ideal DCG
    for y_value_true, y_value_score in zip(binarized_y_true, y_score):
        actual = dcg_score(y_value_true, y_value_score, k)
        best = dcg_score(y_value_true, y_value_true, k)
        scores.append(actual / best)
    return np.mean(scores)


# NDCG scorer usage
# This NDCG code has trouble with two-class (binary) inputs, so the example
# is cast as a three-class problem
y_true = [0, 1, 0]
y_score = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(ndcg_score(y_true, y_score, k=2))

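Note: this NDCG code does not seem to handle binary classification well, so as a workaround we switch to three classes and fill in the third class with probability 0.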
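The padding workaround can be sketched as follows. The helper name `pad_to_three_classes` and the sample scores are hypothetical. The only assumption is that the binary scores arrive as a two-column array, to which a zero-probability third column is appended before the array is passed to the `ndcg_score` function above.

```python
import numpy as np

def pad_to_three_classes(proba_2col):
    """Append an all-zero third column to a (n_samples, 2) score array."""
    proba = np.asarray(proba_2col, dtype=float)
    zeros = np.zeros((proba.shape[0], 1))
    return np.hstack([proba, zeros])

# Hypothetical binary scores: one row per sample, columns = class 0, class 1
binary_scores = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
padded = pad_to_three_classes(binary_scores)
print(padded)
```

Note that the `ndcg_score` above also requires `y_score` to have the same shape as the binarized labels, so the padded array still has to pass that shape check.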

