Using the YOLOv3 Detection Algorithm for Person Localization and Distance Calculation: Building a Global Positioning System

IT大咖说 2020-06-30


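A few days ago I heard that YOLO V4 had been released, overturning the earlier news that the author of the YOLO series would stop updating the detection algorithm. Then, just as suddenly, came word of YOLO V5, reportedly with considerably better detection speed and accuracy. The field moves so fast that it is easy to miss a lot if you look away for a moment. YOLO V3 remains one of the most popular object detection algorithms and offers a good balance of speed and accuracy, so in this article we use it to detect people and compute the distances between them.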

Object detection

First, we write an object detection function that detects people.

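    from scipy.spatial import distance as dist
    import numpy as np
    import imutils
    import cv2
    import os

    def detect_people(frame, net, ln, personIdx=0):
        # frame dimensions, used later to scale the normalized boxes
        (H, W) = frame.shape[:2]
        results = []
        # turn the frame into a 416x416 blob and run a forward pass
        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
        net.setInput(blob)
        layerOutputs = net.forward(ln)
        # boxes, centroids and confidences of the detected people
        boxes = []
        centroids = []
        confidences = []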

from scipy.spatial import distance as dist: SciPy's distance module, used here mainly to compute the Euclidean distances between centroids.

detect_people(frame, net, ln, personIdx=0) takes four parameters:

frame: a single image extracted from the video, on which detection is run

net: the YOLO v3 object detection neural network

ln: the names of YOLO v3's output layers; they are passed to the forward pass so that it returns the outputs of those layers for detection

personIdx: the index of the "person" class among the classes the network can detect

(H, W) = frame.shape[:2]: gets the height and width of the video frame

    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    layerOutputs = net.forward(ln)

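These three lines are the core of the YOLO v3 detection code: we compute a blob from the frame, feed it into the network, and run a forward pass through the output layers to obtain the predictions.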
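To make the preprocessing step more concrete, here is a rough pure-Python sketch of the per-pixel part of what cv2.dnn.blobFromImage is asked to do above: scale by 1/255, swap from BGR to RGB order, and rearrange the data into batch, channel, height, width layout. The to_blob helper is hypothetical and exists only for illustration; it skips the resize to 416x416 and is not how OpenCV implements the call.

```python
def to_blob(image_bgr, scale=1 / 255.0):
    """Convert an H x W x 3 nested list of BGR pixels into a 1 x 3 x H x W nested list.

    Hypothetical illustration only: no resizing or cropping is done here.
    """
    height = len(image_bgr)
    width = len(image_bgr[0])
    planes = []
    # Channel indices 2, 1, 0 pick R, G, B out of OpenCV's B, G, R pixel order
    # (this mirrors swapRB=True).
    for channel in (2, 1, 0):
        plane = [
            [image_bgr[row][col][channel] * scale for col in range(width)]
            for row in range(height)
        ]
        planes.append(plane)
    return [planes]  # the outer list is the batch dimension (one image)


# A 1 x 2 image: one pure-blue pixel and one pure-red pixel, in BGR order.
tiny = [[[255, 0, 0], [0, 0, 255]]]
blob = to_blob(tiny)
print(len(blob), len(blob[0]), len(blob[0][0]), len(blob[0][0][0]))
```

The printed sizes are the batch, channel, height and width of the result, which is the layout the network expects as input.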

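        # (inside detect_people) loop over every detection of every output layer
        for output in layerOutputs:
            for detection in output:
                # entries 0-3 are the box, 4 is objectness, 5 onward are class scores
                scores = detection[5:]
                classID = np.argmax(scores)
                confidence = scores[classID]
                # keep only confident detections of the person class
                if classID == personIdx and confidence > 0.5:
                    # scale the normalized box back to frame coordinates
                    box = detection[0:4] * np.array([W, H, W, H])
                    (centerX, centerY, width, height) = box.astype("int")
                    # convert the center point to the top-left corner
                    x = int(centerX - (width / 2))
                    y = int(centerY - (height / 2))
                    boxes.append([x, y, int(width), int(height)])
                    centroids.append((centerX, centerY))
                    confidences.append(float(confidence))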

Once the network has finished, all of its detection results are stored in layerOutputs.

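The for loops walk through every detection and read its class scores. We keep only detections whose class is person and whose confidence is greater than 0.5; everything else is ignored.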
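The filtering rule can be tried in isolation with a small sketch. It assumes the detection vector layout used by the loop above (four box values, one objectness value, then one score per class). The parse_detection helper and the sample vectors are made up for illustration, with only three classes instead of the 80 listed in coco.names.

```python
def parse_detection(detection, person_idx=0, min_confidence=0.5):
    """Return the person confidence if the detection qualifies, otherwise None."""
    scores = detection[5:]  # class scores start at index 5
    # Plain-Python argmax: index of the highest class score.
    class_id = max(range(len(scores)), key=lambda k: scores[k])
    confidence = scores[class_id]
    if class_id == person_idx and confidence > min_confidence:
        return confidence
    return None


# Made-up vectors: [cx, cy, w, h, objectness, score_person, score_bicycle, score_car]
person = [0.5, 0.5, 0.2, 0.4, 0.9, 0.8, 0.1, 0.05]
car = [0.3, 0.6, 0.3, 0.2, 0.9, 0.1, 0.1, 0.85]
weak_person = [0.7, 0.4, 0.1, 0.3, 0.4, 0.3, 0.1, 0.05]

for name, det in [("person", person), ("car", car), ("weak_person", weak_person)]:
    print(name, parse_detection(det))
```

Only the first vector passes: the second is confidently a different class, and the third is a person whose score is below the 0.5 threshold.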

    box = detection[0:4] * np.array([W, H, W, H])
    (centerX, centerY, width, height) = box.astype("int")
    x = int(centerX - (width / 2))
    y = int(centerY - (height / 2))

When a detection of interest is found, we record the coordinates of its center (the centroid) together with its bounding box, and save both.
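As a worked example of this conversion, the sketch below turns a normalized (center x, center y, width, height) box into a pixel box anchored at its top-left corner. The to_pixel_box name and the sample numbers are assumptions chosen so the arithmetic is easy to follow by hand.

```python
def to_pixel_box(detection, frame_width, frame_height):
    """Scale a normalized YOLO box to pixels and return ([x, y, w, h], centroid)."""
    center_x = int(detection[0] * frame_width)
    center_y = int(detection[1] * frame_height)
    width = int(detection[2] * frame_width)
    height = int(detection[3] * frame_height)
    # The network reports the box center, while drawing needs the top-left corner.
    x = int(center_x - width / 2)
    y = int(center_y - height / 2)
    return [x, y, width, height], (center_x, center_y)


box, centroid = to_pixel_box([0.5, 0.5, 0.25, 0.5], frame_width=800, frame_height=400)
print(box, centroid)  # [300, 100, 200, 200] (400, 200)
```

A box centered in an 800x400 frame that covers a quarter of the width and half of the height ends up 200x200 pixels in size, with its top-left corner at (300, 100).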

        # (inside detect_people) suppress weak, overlapping boxes
        idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.3, 0.3)
        if len(idxs) > 0:
            for i in idxs.flatten():
                (x, y) = (boxes[i][0], boxes[i][1])
                (w, h) = (boxes[i][2], boxes[i][3])
                # store (confidence, box corners, centroid) for each person
                r = (confidences[i], (x, y, x + w, y + h), centroids[i])
                results.append(r)
        return results

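Because people are moving, it is easy to capture frames in which two people overlap. Non-maximum suppression (NMS) is used here to discard overlapping boxes with lower confidence.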
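The idea behind NMS can be sketched in a few lines of plain Python. This is a simplified greedy version written for illustration, using [x, y, w, h] boxes like the ones passed to cv2.dnn.NMSBoxes; the iou and greedy_nms names are hypothetical, and OpenCV's own implementation may differ in its details.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def greedy_nms(boxes, scores, score_threshold=0.3, iou_threshold=0.3):
    """Return the indices of the boxes that survive suppression, best score first."""
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap too much with a better box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [[0, 0, 100, 100], [10, 10, 100, 100], [300, 300, 50, 50]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, well above the 0.3 threshold, so it is dropped in favor of the higher-scoring first box. The third box is far away and survives.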

Finally, we save each detected bounding box, its centroid coordinates, and its confidence in results for the later calculations.

Initializing the neural network

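    # paths to the COCO class labels, the YOLO v3 weights and the model config
    labelsPath = os.path.sep.join(["yolo-coco", "coco.names"])
    LABELS = open(labelsPath).read().strip().split("\n")
    weightsPath = os.path.sep.join(["yolo-coco", "yolov3.weights"])
    configPath = os.path.sep.join(["yolo-coco", "yolov3.cfg"])
    net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)
    ln = net.getLayerNames()
    # getUnconnectedOutLayers() returns 1-based indices; flatten() copes with both
    # the N x 1 shape of older OpenCV releases and the 1-D shape of newer ones
    ln = [ln[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]
    vs = cv2.VideoCapture("123.mp4")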

    # vs = cv2.VideoCapture(0)  # use this instead to open the webcam

With the detection function in place, we need to initialize the neural network. Here we load the YOLO v3 model directly with OpenCV's dnn.readNetFromDarknet.

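The list comprehension over net.getUnconnectedOutLayers() extracts the names of the network's output layers. For why the output layers need to be extracted, see the earlier articles in this series.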
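The index handling in that step is easy to get wrong, because the shape returned by getUnconnectedOutLayers() depends on the OpenCV version: older releases wrap each index in a one-element array, while newer ones return plain integers. The sketch below shows the lookup on plain lists; output_layer_names is a hypothetical helper and the layer list is shortened and made up for illustration.

```python
def output_layer_names(layer_names, out_layer_ids):
    """Map 1-based output layer indices to layer names, for nested or flat index lists."""
    names = []
    for entry in out_layer_ids:
        # Unwrap one-element sequences such as [82]; plain integers pass through.
        index = entry[0] if hasattr(entry, "__len__") else entry
        names.append(layer_names[int(index) - 1])  # indices are 1-based
    return names


# Shortened, made-up layer list; the real network has many more layers.
layers = ["conv_0", "yolo_82", "conv_83", "yolo_94", "yolo_106"]

print(output_layer_names(layers, [[2], [4], [5]]))  # nested indices
print(output_layer_names(layers, [2, 4, 5]))        # flat indices
```

Both calls select the same three layer names, which is the list that net.forward needs in order to return the detections.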

For this demonstration, we simply open a video file and run the network on it.

Running detection and computing the distances between targets

    while True:
        # read the next frame from the file
        (grabbed, frame) = vs.read()
        # if the frame was not grabbed, then we have reached the end of the stream
        if not grabbed:
            break
        # resize the frame and then detect people (and only people) in it
        frame = imutils.resize(frame, width=700)
        results = detect_people(frame, net, ln, personIdx=LABELS.index("person"))
        # initialize the set of indexes that violate the minimum social distance
        violate = set()
        # ensure there are *at least* two people detections (required in
        # order to compute our pairwise distance maps)
        if len(results) >= 2:
            # extract all centroids from the results and compute the
            # Euclidean distances between all pairs of the centroids
            centroids = np.array([r[2] for r in results])
            D = dist.cdist(centroids, centroids, metric="euclidean")
            # loop over the upper triangular of the distance matrix
            for i in range(0, D.shape[0]):
                for j in range(i + 1, D.shape[1]):
                    # check to see if the distance between any two
                    # centroids is below the threshold (50 pixels)
                    if D[i, j] < 50:
                        violate.add(i)
                        violate.add(j)
        # draw every detection: red if it violates the distance, green otherwise
        for (i, (prob, bbox, centroid)) in enumerate(results):
            (startX, startY, endX, endY) = bbox
            (cX, cY) = centroid
            color = (0, 255, 0)
            if i in violate:
                color = (0, 0, 255)
            cv2.rectangle(frame, (startX, startY), (endX, endY), color, 2)
            cv2.circle(frame, (int(cX), int(cY)), 5, color, 1)
        # show the number of violations at the bottom of the frame
        text = " Distancing: {}".format(len(violate))
        cv2.putText(frame, text, (10, frame.shape[0] - 25),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.85, (0, 0, 255), 3)
        cv2.imshow("Frame", frame)
        key = cv2.waitKey(1) & 0xFF
        # press q to quit
        if key == ord("q"):
            break
    # release the video source and close the display window
    vs.release()
    cv2.destroyAllWindows()

Once the video is open, we read its frames one by one in real time and resize each frame so that detection runs faster.

    results = detect_people(frame, net, ln, personIdx=LABELS.index("person"))

This calls the detection function defined earlier to detect the people in the frame.

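After obtaining the detections, we must make sure that at least two people are present in the frame, because only then does a distance calculation make sense.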

    centroids = np.array([r[2] for r in results])
    D = dist.cdist(centroids, centroids, metric="euclidean")

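Once people have been detected, we extract the centroids of all detections and compute the Euclidean distance between every pair of centroids.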
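To see what the resulting matrix looks like, here is a small pure-Python sketch that builds the same kind of pairwise Euclidean distance table as dist.cdist, using math.dist (available from Python 3.8). The pairwise_distances name and the sample centroids are assumptions for illustration.

```python
import math


def pairwise_distances(points):
    """Return a square table where entry [i][j] is the distance from point i to point j."""
    return [[math.dist(p, q) for q in points] for p in points]


centroids = [(0, 0), (3, 4), (100, 100)]
D = pairwise_distances(centroids)
print(D[0][1], D[1][0], D[0][0])  # 5.0 5.0 0.0
```

The table is symmetric with zeros on the diagonal, which is why the article's loop only needs to visit the entries above the diagonal.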

    for i in range(0, D.shape[0]):
        for j in range(i + 1, D.shape[1]):
            if D[i, j] < 50:
                violate.add(i)
                violate.add(j)

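The for loops check whether any pair of centroids is closer than the configured threshold (50 pixels here) and record the indices of those centroids.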
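The same check can be written as a standalone function, which makes it easy to test with hand-picked points. This is a minimal sketch rather than the article's code: find_violations is a hypothetical name, and it uses itertools.combinations in place of the nested index loops.

```python
from itertools import combinations
import math


def find_violations(centroids, min_distance=50):
    """Return the indices of all centroids closer than min_distance to another centroid."""
    violate = set()
    # combinations(..., 2) visits each unordered pair exactly once,
    # which matches looping over the upper triangle of the distance matrix.
    for (i, p), (j, q) in combinations(enumerate(centroids), 2):
        if math.dist(p, q) < min_distance:
            violate.update((i, j))
    return violate


print(find_violations([(100, 100), (130, 120), (400, 300)]))  # {0, 1}
```

The first two points are about 36 pixels apart and are flagged, while the third is far from both. Because the threshold is measured in pixels, the same value corresponds to different real-world distances depending on how far the people are from the camera.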

    for (i, (prob, bbox, centroid)) in enumerate(results):
        (startX, startY, endX, endY) = bbox
        (cX, cY) = centroid
        color = (0, 255, 0)
        if i in violate:
            color = (0, 0, 255)

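To make the two cases easy to tell apart, a person whose centroid is closer than the threshold to another centroid is drawn with a red box, and everyone else is drawn with a green box.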

Finally, the violation count and the boxes are drawn on the video in real time.

Closing remarks

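This article used YOLO v3 to detect people in video frames in real time and computed the distances between their centroids. With this approach most of the computation is spent in the neural network's detection step, because every frame requires a full detection pass plus the centroid calculations.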

Of course, if your computer is powerful enough, this design is worth considering.

Source: 人工智能研究所

https://www.toutiao.com/i6837483778001076747/
