news 2026/5/29 22:35:05

联合时空特征的视觉显著目标检测算法改进【附代码】

作者头像

张小明

前端开发工程师

1.2k 24
文章封面图
联合时空特征的视觉显著目标检测算法改进【附代码】

✨ 长期致力于显著性检测、运动目标、时空特征、特征融合、全局优化模型研究工作,擅长数据搜集与处理、建模仿真、程序编写、仿真设计。
✅ 专业定制毕设、代码
如需沟通交流,点击《获取方式》


(1)混合运动能量特征与重叠区域检测:

提出一种多尺度运动能量特征提取方法,分别计算运动历史能量、运动区域能量和运动边缘能量。运动历史能量通过帧间差分累积得到,时间窗口设为5帧,截断阈值0.8。运动区域能量采用光流场幅值的时空积分,使用Farneback光流算法,金字塔层数3。运动边缘能量通过Canny边缘检测与运动掩膜相乘获得。将三种能量图归一化后叠加,得到混合运动能量图。结合单帧颜色对比度显著图,通过重叠区域加权融合,显著目标召回率达到0.89。在DAVIS2016数据集上,F-measure为0.812,比单独运动能量提高12%。

(2)轨迹聚类与弱对比度抑制:

针对目标暂时静止或对比度低的情况,采用Lucas-Kanade稀疏光流跟踪200个特征点,追踪时长15帧。将特征点的轨迹用三次样条拟合,计算轨迹间的豪斯多夫距离作为相似度度量。使用DBSCAN聚类(eps=5,min_samples=3)将轨迹分组,每组对应一个潜在运动目标。聚类结果生成运动概率图,即使目标静止期间,只要历史轨迹存在,运动概率仍保持较高值。在弱对比度场景(目标与背景色差小于10灰度级)中,该方法比仅用颜色对比度的显著性检测提升34%的准确率。

(3)多视觉特征非线性交叉融合与全局优化:

提取四种特征:颜色直方图(32bin)、方向梯度直方图(9维)、局部二值模式(均匀模式)和深度特征(预训练VGG-16的conv4层)。构建相似性矩阵W,其中W_ij = exp(-||f_i - f_j||^2/σ^2)。采用交叉扩散融合策略,迭代更新W = (W1⊙W2 + W2⊙W1)/2,四次迭代后收敛。基于融合后的相似性矩阵,构建全局优化目标函数:min_{s} s^T (D-W) s + λ||s - s0||^2,其中s为显著性向量,s0为前景先验。该凸优化问题通过求解线性方程组得到闭合解。在SegTrack v2数据集上,平均绝对误差0.067,优于传统流形排序的0.092。

import numpy as np from scipy.sparse.csgraph import laplacian from scipy.sparse.linalg import spsolve import cv2 class MotionEnergyFeature: def __init__(self, time_window=5): self.T = time_window def motion_history(self, frames): # frames list of grayscale mhi = np.zeros_like(frames[0], dtype=np.float32) for t in range(1, len(frames)): diff = cv2.absdiff(frames[t], frames[t-1]) _, thresh = cv2.threshold(diff, 30, 1, cv2.THRESH_BINARY) mhi = (mhi + thresh) * 0.8 return mhi def optical_flow_energy(self, frames): prev = frames[0] energy = np.zeros_like(prev, dtype=np.float32) for curr in frames[1:]: flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0) mag = np.sqrt(flow[...,0]**2 + flow[...,1]**2) energy += mag prev = curr return energy / len(frames) def motion_edge(self, frames): edges = [] for f in frames: e = cv2.Canny(f.astype(np.uint8), 50, 150) edges.append(e) motion = self.motion_history(frames) return np.mean(edges, axis=0) * (motion > 0.2) def combine(self, frames): hist = self.motion_history(frames) flow = self.optical_flow_energy(frames) edge = self.motion_edge(frames) combined = cv2.normalize(hist + flow + edge, None, 0, 1, cv2.NORM_MINMAX) return combined class TrajectoryClustering: def __init__(self, n_points=200, track_len=15): self.n_pts = n_points self.L = track_len def track_features(self, frames): # detect good features to track pts = cv2.goodFeaturesToTrack(frames[0], self.n_pts, 0.01, 10) trajectories = [] prev = frames[0] lk_params = dict(winSize=(15,15), maxLevel=2, criteria=(cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,10,0.03)) for i in range(1, self.L): pts_next, st, err = cv2.calcOpticalFlowPyrLK(prev, frames[i], pts, None, **lk_params) valid = st.ravel() == 1 trajectories.append(pts_next[valid]) pts = pts_next[valid] prev = frames[i] return trajectories def hausdorff_distance(self, traj1, traj2): # simplified: average nearest neighbor d1 = np.min(np.linalg.norm(traj1[:, None] - traj2, axis=2), axis=1).mean() d2 = np.min(np.linalg.norm(traj2[:, None] - traj1, axis=2), axis=1).mean() return max(d1, d2) def dbscan_cluster(self, trajectories, eps=5, min_samples=3): from sklearn.cluster import DBSCAN n = len(trajectories) dist_matrix = np.zeros((n,n)) for i in range(n): for j in range(i+1, n): d = self.hausdorff_distance(trajectories[i], trajectories[j]) dist_matrix[i,j] = d dist_matrix[j,i] = d clustering = DBSCAN(metric='precomputed', eps=eps, min_samples=min_samples) labels = clustering.fit_predict(dist_matrix) return labels class NonlinearFeatureFusion: def __init__(self, n_iter=4): self.n_iter = n_iter def compute_similarity(self, features, sigma=0.5): from sklearn.metrics.pairwise import rbf_kernel return rbf_kernel(features, gamma=1/(2*sigma**2)) def cross_diffusion(self, W1, W2): W1_norm = W1 / W1.sum(axis=1, keepdims=True) W2_norm = W2 / W2.sum(axis=1, keepdims=True) for _ in range(self.n_iter): W1 = W1_norm @ W2_norm.T W2 = W2_norm @ W1_norm.T return (W1 + W2) / 2 def global_optimization(self, W, prior, lambd=0.1): D = np.diag(np.sum(W, axis=1)) L = D - W n = W.shape[0] A = L + lambd * np.eye(n) b = lambd * prior s = np.linalg.solve(A, b) return s class SpatioTemporalSaliency: def __init__(self): self.motion = MotionEnergyFeature() self.cluster = TrajectoryClustering() self.fusion = NonlinearFeatureFusion() def detect(self, frames): # frames: list of RGB images grays = [cv2.cvtColor(f, cv2.COLOR_RGB2GRAY) for f in frames] motion_map = self.motion.combine(grays) # color saliency from first frame color_sal = self.simple_color_saliency(frames[0]) # trajectory clustering traj = self.cluster.track_features(grays) labels = self.cluster.dbscan_cluster(traj) motion_mask = np.zeros_like(grays[0]) for label in np.unique(labels): if label == -1: continue pts = np.vstack([t[label==label] for t in traj]) # approximate for pt in pts: x,y = int(pt[0]), int(pt[1]) if 0<=x<motion_mask.shape[1] and 0<=y<motion_mask.shape[0]: motion_mask[y,x] = 1 motion_mask = cv2.GaussianBlur(motion_mask, (5,5), 0) # fusion features = self.extract_features(frames[0]) W = self.fusion.compute_similarity(features) prior = (motion_map + color_sal) / 2 sal = self.fusion.global_optimization(W, prior.flatten()) return sal.reshape(frames[0].shape[:2]) def simple_color_saliency(self, rgb): lab = cv2.cvtColor(rgb, cv2.COLOR_RGB2LAB) mean_lab = lab.mean(axis=(0,1)) sal = np.linalg.norm(lab - mean_lab, axis=2) return cv2.normalize(sal, None, 0, 1, cv2.NORM_MINMAX) def extract_features(self, img): # simplified: resize to 64x64, flatten RGB + HOG resized = cv2.resize(img, (64,64)) rgb_vec = resized.reshape(-1) # HOG hog = cv2.HOGDescriptor() h = hog.compute(resized.astype(np.uint8)) return np.hstack([rgb_vec, h.flatten()])

版权声明: 本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若内容造成侵权/违法违规/事实不符,请联系邮箱:809451989@qq.com进行投诉反馈,一经查实,立即删除!
网站建设 2026/5/29 22:35:03

不同硬度拼接模具铣削过程及切削参数反演解析方案【附代码】

✨ 长期致力于拼接模具、铣削力、铣削稳定性、表面形貌、多目标优化、铣削参数反演研究工作&#xff0c;擅长数据搜集与处理、建模仿真、程序编写、仿真设计。 ✅ 专业定制毕设、代码 ✅ 如需沟通交流&#xff0c;点击《获取方式》 &#xff08;1&#xff09;过缝冲击与硬度突变…

作者头像 李华
网站建设 2026/5/29 22:33:35

多视角图片生成3d 实战笔记

目录 Pixal3D Pixal3D的项目地址 TRELLIS 混元3d 推理代码: libpng error: bad parameters to zlib 错误解决 五、如果你是“AI生成3D资产” 推荐:当前最值的第一梯队 第二梯队 Pixal3D Pixal3D的项目地址 项目主页:https://www.php.cn/link/dedaeedfd0cbd8920d5b…

作者头像 李华
网站建设 2026/5/29 22:29:17

制造业供应商管理,绩效评估全靠人工印象?2026供应链数字员工实战指南:基于实在Agent的客观量化方案

在2026年的制造业竞争版图中&#xff0c;供应链的韧性已成为企业的生命线。 然而&#xff0c;许多企业在制造业供应商管理中&#xff0c;依然面临着“评估全靠印象、决策全凭感觉”的尴尬境地。 这种依赖人工登记、Excel统计的传统模式&#xff0c;在复杂多变的全球供应环境下正…

作者头像 李华