Model Evaluation and Validation in Python

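# Model evaluation and validation in Python
# Model evaluation is a key step in the machine learning workflow.
# Cross-validation estimates generalization performance more reliably than a single split.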

# 1. Import libraries
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import (
    cross_val_score, StratifiedKFold, train_test_split
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    confusion_matrix, classification_report,
    precision_score, recall_score, f1_score,
    roc_curve, roc_auc_score
)
from sklearn.ensemble import RandomForestClassifier

# 2. Load the data (binary target: 0 = malignant, 1 = benign)
cancer = load_breast_cancer()
X, y = cancer.data, cancer.target
# Hold out 30% of the samples as a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# 3. Cross-validation basics
model = LogisticRegression(max_iter=5000, random_state=42)
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring='accuracy')
print("=== 5-fold cross-validation ===")
print(f"Score per fold: {cv_scores}")
print(f"Mean accuracy: {cv_scores.mean():.4f}")

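# 4. Stratified cross-validation with StratifiedKFold
# For a classifier, cv=5 already uses unshuffled StratifiedKFold;
# here the folds are additionally shuffled with a fixed seed.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_strat = cross_val_score(model, X_train, y_train, cv=skf, scoring='accuracy')
print(f"\nStratifiedKFold mean accuracy: {cv_strat.mean():.4f}")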

# 5. Multiple evaluation metrics
print("\nMultiple metrics (5-fold CV):")
for metric in ['accuracy', 'precision', 'recall', 'f1', 'roc_auc']:
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring=metric)
    print(f"  {metric}: {scores.mean():.4f}")

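# 6. Confusion matrix (rows = actual class, columns = predicted class)
# Label 1 (benign) is the positive class, label 0 (malignant) the negative class.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print("\n=== Confusion matrix ===")
print("                 Pred. negative  Pred. positive")
print(f"Actual negative  TN={cm[0,0]:4d}         FP={cm[0,1]:4d}")
print(f"Actual positive  FN={cm[1,0]:4d}         TP={cm[1,1]:4d}")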

# 7. Precision, recall, F1
# By default these scores treat label 1 (benign in this dataset) as the positive class.
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"\nPrecision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1 score: {f1:.4f}")
print("\nFull classification report:")
print(classification_report(y_test, y_pred, target_names=cancer.target_names))

# 8. ROC curve and AUC
# Use the predicted probability of class 1 as the ranking score
y_prob = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
auc_score = roc_auc_score(y_test, y_prob)
print("\n=== ROC-AUC ===")
print(f"AUC: {auc_score:.4f}")

# 9. Comparing different models
print("\nModel comparison (5-fold CV AUC):")
models = {
    'LR': LogisticRegression(max_iter=5000, random_state=42),
    'RF': RandomForestClassifier(n_estimators=100, random_state=42)
}
for name, m in models.items():
    scores = cross_val_score(m, X_train, y_train, cv=5, scoring='roc_auc')
    print(f"  {name}: {scores.mean():.4f}")

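# 10. Choosing a validation strategy
# Large dataset: a simple train/test split is usually enough
# Small dataset: use cross-validation (K=5 or K=10)
# Imbalanced classes: use StratifiedKFold
# Time series: use TimeSeriesSplit
print(f"\nTest set accuracy: {model.score(X_test, y_test):.4f}")
print(f"Cross-validation accuracy: {cv_scores.mean():.4f}")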
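
Section 10 recommends TimeSeriesSplit for time-ordered data, but the script never uses it. The following is a minimal sketch of how its folds are laid out, assuming a hypothetical toy array of 12 samples in time order (`X_ts` is an illustrative name and is not part of the breast cancer data above). Each fold trains only on samples that come before its test window, so no future information leaks into training.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Hypothetical toy data: 12 samples in time order (not from the dataset above).
X_ts = np.arange(12).reshape(-1, 1)

# Three expanding-window folds; the training set grows, the test window moves forward.
tscv = TimeSeriesSplit(n_splits=3)
splits = list(tscv.split(X_ts))

for fold, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"Fold {fold}: train={train_idx.tolist()} test={test_idx.tolist()}")
```

The resulting `tscv` object can be passed as the `cv` argument of `cross_val_score` in the same way as `skf` in section 4.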
