WO2020007156A1 - Human body recognition method, device, and storage medium - Google Patents


Info

Publication number
WO2020007156A1
Authority
WO
WIPO (PCT)
Prior art keywords
target person
camera
cameras
coordinates
image
Application number
PCT/CN2019/089969
Inventor
王健
李旭斌
亢乐
刘泽宇
迟至真
张成月
刘霄
孙昊
文石磊
包英泽
陈明裕
丁二锐
Original Assignee
百度在线网络技术(北京)有限公司
Application filed by 百度在线网络技术(北京)有限公司
Priority to EP19830531.0A (patent EP3819815A4)
Priority to KR1020207013408A (patent KR102377295B1)
Priority to JP2020526101A (patent JP7055867B2)
Publication of WO2020007156A1
Priority to US16/934,174 (patent US11354923B2)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification
    • G06V40/173 - Face re-identification, e.g. recognising unknown faces across different face tracks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/64 - Three-dimensional objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 - Static body considered as a whole, e.g. static pedestrian or occupant recognition

Definitions

  • The present invention relates to the technical field of image recognition, and in particular to a human body recognition method, device, and storage medium.
  • At present, multi-person tracking and recognition under multi-view conditions mainly depends on two-dimensional image information: people are recognized and associated across cameras through the semantic features of the human body in the two-dimensional image.
  • However, the posture of the same person may differ greatly across cameras, which leads to large deviations in the visual features of the human body in the two-dimensional images.
  • As a result, cross-camera recognition based on the information provided by two-dimensional images has a low accuracy rate and is prone to human body recognition errors.
  • The invention provides a human body recognition method, device, and storage medium that introduce the three-dimensional space coordinates of the human body into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.
  • In a first aspect, the present invention provides a human body recognition method, including:
  • determining the coordinates of a target person in the three-dimensional space according to images containing the target person collected by at least two cameras;
  • calculating the back projection error of the target person under each camera according to the coordinates of the target person in the three-dimensional space;
  • for each camera, determining whether the camera has a human body recognition error according to the back projection error of the camera;
  • when there is a human body recognition error, re-recognizing the target person under that camera using the pedestrian re-identification technology ReID, until the back projection error of every camera containing the target person is not greater than a preset threshold.
  • In this embodiment, the three-dimensional space coordinates of the human body can be introduced into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.
  • In a possible design, before determining the coordinates of the target person in the three-dimensional space according to the images containing the target person collected by at least two cameras, the method further includes:
  • using the pedestrian re-identification technology ReID to perform human body recognition on the images collected by multiple cameras in the scene, to obtain the correspondence of the target person under the multiple cameras;
  • according to the correspondence of the target person under the multiple cameras, selecting the images containing the target person collected by at least two cameras.
  • In this embodiment, at least two cameras are arranged in the scene in advance, each with a different viewing angle. These cameras can track and recognize human activity in the scene; the correspondence between the target person and the multiple cameras is obtained through ReID, so that images containing the target person are obtained, which improves the tracking accuracy of the target person.
  • In a possible design, determining the coordinates of the target person in the three-dimensional space according to the images containing the target person collected by at least two cameras includes:
  • selecting images containing the target person collected by any two cameras at the same time;
  • obtaining the coordinates of the target person in each of the two images, and the camera matrices of the two cameras, where each camera matrix is obtained from known camera parameters;
  • obtaining the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the images and the camera matrices of the two cameras.
  • In this embodiment, the coordinates of the target person in the three-dimensional space can be accurately computed from the coordinates of the target person in the images and the camera matrices of the two cameras.
  • obtaining the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the image and the camera matrix of the two cameras includes:
  • Let $X_1$ and $X_2$ be the coordinates of the target person in the images under the two cameras, $P_1$ the camera matrix of the camera corresponding to $X_1$, and $P_2$ the camera matrix of the camera corresponding to $X_2$. Then $X_1$, $X_2$ and the coordinates $W$ of the target person in the three-dimensional space satisfy:
  • $X_1 = P_1 * W$, $X_2 = P_2 * W$
  • where $*$ denotes multiplication.
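The text gives the relation between image coordinates and $W$ but not a solver. One common way to recover $W$ from two views is linear (DLT) triangulation; the sketch below is that standard technique, not a method stated in the source. It assumes each camera matrix is 3x4, that $W$ is treated in homogeneous coordinates so the relation holds up to scale, and that image coordinates are pixel pairs (u, v). The function name `triangulate` is hypothetical.

```python
import numpy as np

def triangulate(x1, x2, P1, P2):
    """Linear (DLT) triangulation of one 3D point from two views.

    x1, x2: (u, v) image coordinates of the target person in each image.
    P1, P2: 3x4 camera matrices (assumed shape). Returns W as a 3-vector.
    """
    rows = []
    for (u, v), P in ((x1, P1), (x2, P2)):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The least-squares solution is the right singular vector
    # associated with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    W_h = vt[-1]
    return W_h[:3] / W_h[3]
```

With noise-free coordinates the recovered point reprojects exactly onto both observations; with real detections it is only a least-squares estimate.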
  • calculating the back projection errors of the target person under different cameras according to the coordinates of the target person in the three-dimensional space includes:
  • Let $U_i = P_i * W$,
  • where $U_i$ is the back-projected coordinates of $W$ under the $i$-th camera, $P_i$ is the camera matrix of the $i$-th camera, $i = 1, 2, 3, \dots, N$, and $N$ is the total number of cameras whose images contain the target person.
  • Let $e_i = U_i - X_i$,
  • where $e_i$ is the back projection error under the $i$-th camera and $X_i$ is the coordinates of the target person in the image corresponding to the $i$-th camera.
  • In this embodiment, the back-projected coordinates of the three-dimensional point in the image collected by a camera can be calculated from the coordinates in the three-dimensional space and the camera matrix of that camera. The difference between the back-projected coordinates and the corresponding coordinates in the image collected by the camera (obtained with an existing two-dimensional image coordinate algorithm) then gives the back projection error for that point.
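The back projection error can be sketched in a few lines. This example assumes 3x4 camera matrices and homogeneous normalization of the projected point, and it reports the Euclidean norm of the error vector alongside the vector itself so that it can later be compared with a scalar threshold; the norm is an assumption, since the source defines only the difference. The function and parameter names are hypothetical.

```python
import numpy as np

def back_projection_errors(W, cameras, observations):
    """Return {camera_id: (e_i, norm of e_i)} for a 3D point W.

    cameras: {camera_id: 3x4 camera matrix P_i} (assumed shape).
    observations: {camera_id: (u, v)}, the image coordinates X_i.
    """
    W_h = np.append(np.asarray(W, dtype=float), 1.0)  # homogeneous coordinates
    errors = {}
    for cam_id, P in cameras.items():
        u_h = P @ W_h
        U = u_h[:2] / u_h[2]  # back-projected coordinates U_i, in pixels
        e = U - np.asarray(observations[cam_id], dtype=float)  # e_i = U_i - X_i
        errors[cam_id] = (e, float(np.linalg.norm(e)))
    return errors
```

Keeping the signed vector as well as its norm makes it possible to inspect in which direction a mismatched detection lies.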
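  • In a possible design, determining whether the camera has a human body recognition error according to the back projection error of the camera includes:
  • if the back projection error of the camera is greater than a preset threshold, determining that the camera has a human body recognition error.
  • In this embodiment, the back projection error can be used to evaluate the human body recognition result, making the result more accurate.
  • In a possible design, after the target person under the camera has been re-recognized using the pedestrian re-identification technology ReID until the back projection error of every camera containing the target person is not greater than the preset threshold, the method further includes:
  • obtaining the coordinates of the target person in the images corresponding to the different cameras, and the image tags;
  • sending the coordinates and image tags to a monitoring platform.
  • In this embodiment, the obtained coordinates of the target person in the images corresponding to the different cameras, together with the image tags, may be sent to the monitoring platform, so that the monitoring platform can accurately monitor the target person.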
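The source does not specify the message format or transport used to reach the monitoring platform. As a purely illustrative sketch, the per-camera coordinates and image tags could be serialized as JSON before sending; every field name below (`person`, `records`, `camera`, `x`, `y`, `tag`) and the function `build_report` are hypothetical.

```python
import json

def build_report(person_id, observations, tags):
    """Serialize per-camera image coordinates and image tags for one person.

    observations: {camera_id: (x, y)}; tags: {camera_id: image tag string}.
    The payload layout is an assumption made for illustration only.
    """
    records = [
        {"camera": cam_id, "x": float(x), "y": float(y), "tag": tags[cam_id]}
        for cam_id, (x, y) in sorted(observations.items())
    ]
    return json.dumps({"person": person_id, "records": records})
```

Sorting by camera identifier keeps the payload deterministic, which simplifies comparing successive reports.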
  • In a second aspect, an embodiment of the present invention provides a human body recognition device, including:
  • a determining module, configured to determine the coordinates of a target person in the three-dimensional space according to images containing the target person collected by at least two cameras;
  • a calculation module, configured to calculate the back projection error of the target person under each camera according to the coordinates of the target person in the three-dimensional space;
  • a judging module, configured to determine, for each camera, whether the camera has a human body recognition error according to the back projection error of the camera;
  • a recognition module, configured to re-recognize the target person under the camera using the pedestrian re-identification technology ReID when there is a human body recognition error, until the back projection error of every camera containing the target person is not greater than a preset threshold.
  • In a possible design, the device further includes:
  • a pre-identification module, configured to, before the coordinates of the target person in the three-dimensional space are determined according to the images containing the target person collected by at least two cameras, use the pedestrian re-identification technology ReID to perform human body recognition on the images collected by multiple cameras in the scene, to obtain the correspondence of the target person under the multiple cameras;
  • and, according to the correspondence of the target person under the multiple cameras, to select the images containing the target person collected by at least two cameras.
  • In a possible design, the determining module is specifically configured to:
  • select images containing the target person collected by any two cameras at the same time; obtain the coordinates of the target person in each of the two images and the camera matrices of the two cameras, where each camera matrix is obtained from known camera parameters; and obtain the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the images and the camera matrices of the two cameras.
  • obtaining the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the image and the camera matrix of the two cameras includes:
  • Let $X_1$ and $X_2$ be the coordinates of the target person in the images under the two cameras, $P_1$ the camera matrix of the camera corresponding to $X_1$, and $P_2$ the camera matrix of the camera corresponding to $X_2$. Then $X_1$, $X_2$ and the coordinates $W$ of the target person in the three-dimensional space satisfy:
  • $X_1 = P_1 * W$, $X_2 = P_2 * W$
  • where $*$ denotes multiplication.
  • calculating the back projection errors of the target person under different cameras according to the coordinates of the target person in the three-dimensional space includes:
  • Let $U_i = P_i * W$,
  • where $U_i$ is the back-projected coordinates of $W$ under the $i$-th camera, $P_i$ is the camera matrix of the $i$-th camera, $i = 1, 2, 3, \dots, N$, and $N$ is the total number of cameras whose images contain the target person.
  • Let $e_i = U_i - X_i$,
  • where $e_i$ is the back projection error under the $i$-th camera and $X_i$ is the coordinates of the target person in the image corresponding to the $i$-th camera.
  • In a possible design, the judging module is specifically configured to:
  • determine that the camera has a human body recognition error if the back projection error of the camera is greater than a preset threshold.
  • In a possible design, the device further includes:
  • a sending module, configured to, after the target person under the camera has been re-recognized using the pedestrian re-identification technology ReID until the back projection error of every camera containing the target person is not greater than the preset threshold, obtain the coordinates of the target person in the images corresponding to the different cameras, and the image tags;
  • and to send the coordinates and image tags to a monitoring platform.
  • In a third aspect, an embodiment of the present invention provides a server, including a processor and a memory, where the memory stores instructions executable by the processor, and the processor is configured to perform, by executing the executable instructions, the human body recognition method according to any one of the first aspect.
  • In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the human body recognition method according to any one of the first aspect is implemented.
  • The human body recognition method, device, and storage medium provided by the present invention determine the coordinates of a target person in the three-dimensional space according to images containing the target person collected by at least two cameras; calculate the back projection error of the target person under each camera according to those coordinates; determine whether a camera has a human body recognition error according to its back projection error; and, when there is a human body recognition error, re-recognize the target person under that camera using the pedestrian re-identification technology ReID, until the back projection error of every camera containing the target person is not greater than a preset threshold.
  • The invention can introduce the three-dimensional space coordinates of the human body into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.
  • FIG. 1 is a schematic structural diagram of an application scenario according to the present invention.
  • FIG. 2 is a flowchart of a human body recognition method provided in Embodiment 1 of the present invention.
  • FIG. 3 is a schematic structural diagram of a human body identification device provided in Embodiment 2 of the present invention.
  • FIG. 4 is a schematic structural diagram of a human body identification device provided in Embodiment 3 of the present invention.
  • FIG. 5 is a server provided in Embodiment 4 of the present invention.
  • Pedestrian re-identification (person re-identification, ReID) is a technology that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given an image of a monitored pedestrian, retrieve images of that pedestrian across devices. It is intended to compensate for the visual limitations of fixed cameras, and can be combined with pedestrian detection and pedestrian tracking, with wide application in intelligent video surveillance, intelligent security, and other fields.
  • FIG. 1 is a schematic structural diagram of an application scenario of the present invention. As shown in FIG. 1, all cameras in the scene form a camera group 10, and different cameras 11 in the camera group 10 send the collected target person images to the server 20.
  • the server 20 determines the coordinates of the target person in the three-dimensional space according to the images containing the target person collected by the at least two cameras.
  • the three-dimensional space in this embodiment refers to the space within the scene.
  • The server 20 calculates the back projection error of the target person under each camera 11 according to the coordinates of the target person in the three-dimensional space, and determines whether each camera 11 has a human body recognition error according to the back projection error of that camera 11.
  • When there is a human body recognition error, the pedestrian re-identification technology ReID is used to re-recognize the target person under that camera 11, until the back projection error of every camera 11 containing the target person is not greater than a preset threshold.
  • The server 20 sends the coordinates of the finally recognized target person in the images corresponding to the different cameras, together with the image tags, to the monitoring platform 30.
  • In this way, the three-dimensional space coordinates of the human body can be introduced into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.
  • FIG. 2 is a flowchart of a human body recognition method provided in Embodiment 1 of the present invention. As shown in FIG. 2, the method in this embodiment may include:
  • S101 Determine coordinates of a target person in a three-dimensional space according to an image including the target person collected by at least two cameras.
  • In this embodiment, images containing the target person collected by any two cameras at the same time may be selected; the coordinates of the target person in each of the two images and the camera matrices of the two cameras are obtained, where each camera matrix is obtained from known camera parameters; and the coordinates of the target person in the three-dimensional space are obtained from the coordinates of the target person in the images and the camera matrices of the two cameras.
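The text says only that the camera matrix is obtained from known camera parameters. A minimal sketch, assuming the usual pinhole model in which those parameters are an intrinsic matrix K, a rotation R, and a translation t (these names and the decomposition P = K [R | t] are assumptions, not taken from the source):

```python
import numpy as np

def camera_matrix(K, R, t):
    """Compose a 3x4 camera matrix P = K [R | t].

    K: 3x3 intrinsics, R: 3x3 rotation, t: translation of length 3.
    All three are assumed to come from a prior calibration of the camera.
    """
    K = np.asarray(K, dtype=float)
    Rt = np.hstack([
        np.asarray(R, dtype=float),
        np.asarray(t, dtype=float).reshape(3, 1),
    ])
    return K @ Rt
```

For a camera placed at the origin of the scene frame with no rotation, R is the identity and t is zero, so a point on the optical axis projects onto the principal point.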
  • a plurality of cameras are arranged in the scene in advance, and each camera has a different viewing angle. Through these cameras, human activities in the scene can be tracked and identified.
  • The pedestrian re-identification technology ReID can be used to perform human body recognition on the images collected by the multiple cameras in the scene, to obtain the correspondence of the target person under the multiple cameras; according to this correspondence, the images containing the target person collected by at least two cameras are selected.
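The selection step can be illustrated with a small grouping routine. It assumes that an upstream ReID model has already assigned an identity label to each detection taken at one time instant; the tuple layout and the name `views_per_person` are hypothetical.

```python
from collections import defaultdict

def views_per_person(detections, min_cameras=2):
    """Group same-time detections by ReID identity.

    detections: iterable of (camera_id, person_id, (u, v)) tuples.
    Returns {person_id: {camera_id: (u, v)}} for people seen by at least
    min_cameras cameras, since fewer views cannot be triangulated.
    """
    grouped = defaultdict(dict)
    for camera_id, person_id, xy in detections:
        grouped[person_id][camera_id] = xy
    return {
        pid: views for pid, views in grouped.items()
        if len(views) >= min_cameras
    }
```

A person visible in only one camera is dropped here because the method needs images from at least two cameras to determine three-dimensional coordinates.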
  • Let $X_1$ and $X_2$ be the coordinates of the target person in the images under the two cameras, $P_1$ the camera matrix of the camera corresponding to $X_1$, and $P_2$ the camera matrix of the camera corresponding to $X_2$. Then $X_1$, $X_2$ and the coordinates $W$ of the target person in the three-dimensional space satisfy:
  • $X_1 = P_1 * W$, $X_2 = P_2 * W$
  • where $*$ denotes multiplication.
  • The back-projected coordinates of the three-dimensional point in the image collected by a camera can be calculated from the coordinates in the three-dimensional space and the camera matrix of that camera; the difference between the back-projected coordinates and the corresponding coordinates in the image collected by the camera (obtained with an existing two-dimensional image coordinate algorithm) then gives the corresponding back projection error.
  • Let $U_i = P_i * W$,
  • where $U_i$ is the back-projected coordinates of $W$ under the $i$-th camera, $P_i$ is the camera matrix of the $i$-th camera, $i = 1, 2, 3, \dots, N$, and $N$ is the total number of cameras whose images contain the target person.
  • Let $e_i = U_i - X_i$,
  • where $e_i$ is the back projection error under the $i$-th camera and $X_i$ is the coordinates of the target person in the image corresponding to the $i$-th camera.
  • whether the human body recognition error exists in the image corresponding to the camera can be determined by the magnitude of the back projection error.
  • If the back projection error of a camera is greater than a preset threshold, it is determined that the camera has a human body recognition error. If the back projection error of a camera is not greater than the preset threshold, it is determined that the human body recognition result of that camera is correct.
  • the existing person re-identification technology ReID can be used to re-identify the target person under the camera that has a human recognition error, so as to effectively exclude the result of the recognition error and improve the accuracy of human recognition.
  • In this embodiment, the coordinates of the target person in the three-dimensional space are determined according to images containing the target person collected by at least two cameras; the back projection error of the target person under each camera is calculated according to those coordinates; whether a camera has a human body recognition error is determined according to its back projection error; and, when there is a human body recognition error, the pedestrian re-identification technology ReID is used to re-recognize the target person under that camera, until the back projection error of every camera containing the target person is not greater than a preset threshold.
  • The invention can introduce the three-dimensional space coordinates of the human body into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.
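The overall check-and-retry flow of this embodiment can be sketched as a loop. The sketch assumes 3x4 camera matrices, uses the Euclidean norm of the error vector for the threshold comparison, and adds a `max_rounds` guard that the source does not mention. The callbacks `triangulate` and `reidentify` are hypothetical stand-ins for the triangulation step and the ReID model.

```python
import numpy as np

def refine_matches(cameras, observations, triangulate, reidentify,
                   threshold, max_rounds=10):
    """Repeat ReID on cameras whose back projection error exceeds the threshold.

    cameras: {camera_id: 3x4 matrix}; observations: {camera_id: (u, v)}.
    triangulate(observations, cameras) -> 3D point W (hypothetical callback).
    reidentify(camera_id, W) -> new (u, v) for that camera (hypothetical).
    """
    observations = dict(observations)  # do not mutate the caller's mapping
    for _ in range(max_rounds):
        W = np.asarray(triangulate(observations, cameras), dtype=float)
        bad = []
        for cam_id, P in cameras.items():
            h = P @ np.append(W, 1.0)
            error = np.linalg.norm(h[:2] / h[2] - np.asarray(observations[cam_id]))
            if error > threshold:
                bad.append(cam_id)  # this camera has a recognition error
        if not bad:
            return W, observations
        for cam_id in bad:
            observations[cam_id] = reidentify(cam_id, W)
    raise RuntimeError("back projection errors did not fall below the threshold")
```

The round limit is a design choice for the sketch: without it, a ReID model that keeps returning the same wrong match would loop forever.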
  • FIG. 3 is a schematic structural diagram of a human body recognition device provided in Embodiment 2 of the present invention. As shown in FIG. 3, the human body recognition device in this embodiment may include:
  • a determining module 41 configured to determine coordinates of a target person in a three-dimensional space according to an image including the target person collected by at least two cameras;
  • a calculation module 42 for respectively calculating back projection errors of the target person under different cameras according to the coordinates of the target person in the three-dimensional space;
  • a judging module 43 configured to determine, for each camera, whether there is a human recognition error on the camera according to the back projection error of the camera;
  • a recognition module 44, configured to re-recognize the target person under the camera using the pedestrian re-identification technology ReID when there is a human body recognition error, until the back projection error of every camera containing the target person is not greater than a preset threshold.
  • The determining module 41 is specifically configured to:
  • select images containing the target person collected by any two cameras at the same time; obtain the coordinates of the target person in each of the two images and the camera matrices of the two cameras, where each camera matrix is obtained from known camera parameters; and obtain the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the images and the camera matrices of the two cameras.
  • obtaining the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the image and the camera matrix of the two cameras includes:
  • Let $X_1$ and $X_2$ be the coordinates of the target person in the images under the two cameras, $P_1$ the camera matrix of the camera corresponding to $X_1$, and $P_2$ the camera matrix of the camera corresponding to $X_2$. Then $X_1$, $X_2$ and the coordinates $W$ of the target person in the three-dimensional space satisfy:
  • $X_1 = P_1 * W$, $X_2 = P_2 * W$
  • where $*$ denotes multiplication.
  • calculating the back projection errors of the target person under different cameras according to the coordinates of the target person in the three-dimensional space includes:
  • Let $U_i = P_i * W$,
  • where $U_i$ is the back-projected coordinates of $W$ under the $i$-th camera, $P_i$ is the camera matrix of the $i$-th camera, $i = 1, 2, 3, \dots, N$, and $N$ is the total number of cameras whose images contain the target person.
  • Let $e_i = U_i - X_i$,
  • where $e_i$ is the back projection error under the $i$-th camera and $X_i$ is the coordinates of the target person in the image corresponding to the $i$-th camera.
  • The judging module 43 is specifically configured to:
  • determine that the camera has a human body recognition error if the back projection error of the camera is greater than a preset threshold.
  • the human body identification device in this embodiment can execute the technical solutions in the methods of any one of the above method embodiments.
  • the implementation principles and technical effects are similar, and are not described here again.
  • FIG. 4 is a schematic structural diagram of a human body recognition device provided in Embodiment 3 of the present invention. As shown in FIG. 4, based on the device shown in FIG. 3, the human body recognition device in this embodiment may further include:
  • a pre-identification module 45, configured to, before the coordinates of the target person in the three-dimensional space are determined according to the images containing the target person collected by at least two cameras, use the pedestrian re-identification technology ReID to perform human body recognition on the images collected by multiple cameras in the scene, to obtain the correspondence of the target person under the multiple cameras;
  • and, according to the correspondence of the target person under the multiple cameras, to select the images containing the target person collected by at least two cameras.
  • The device may further include:
  • a sending module 46, configured to, after the target person under the camera has been re-recognized using the pedestrian re-identification technology ReID until the back projection error of every camera containing the target person is not greater than the preset threshold, obtain the coordinates of the target person in the images corresponding to the different cameras, and the image tags;
  • and to send the coordinates and image tags to a monitoring platform.
  • the human body recognition device of this embodiment can execute the technical solutions in the methods of any one of the method embodiments described above, and the implementation principles and technical effects thereof are similar, and are not repeated here.
  • FIG. 5 is a server provided in Embodiment 4 of the present invention.
  • the server 50 in this embodiment includes a processor 51 and a memory 52.
  • the memory 52 is configured to store a computer program (such as an application program, a functional module, and the like that implements the above-mentioned human body recognition method) and computer instructions.
  • The computer program and the computer instructions may be stored in partitions in one or more memories 52, and the computer program, computer instructions, data, and the like may be called by the processor 51.
  • the processor 51 is configured to execute the computer program stored in the memory 52 to implement each step in the method according to the foregoing embodiment. For details, refer to related descriptions in the foregoing method embodiments.
  • the memory 52 and the processor 51 may be coupled and connected through a bus 53.
  • the server in this embodiment may execute the technical solutions in the methods of any of the foregoing method embodiments, and the implementation principles and technical effects thereof are similar, and details are not described herein again.
  • an embodiment of the present application further provides a computer-readable storage medium.
  • The computer-readable storage medium stores computer-executable instructions.
  • The user equipment performs the various possible methods described above.
  • the computer-readable medium includes a computer storage medium and a communication medium, and the communication medium includes any medium that facilitates transfer of a computer program from one place to another.
  • A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may also be an integral part of the processor.
  • the processor and the storage medium may reside in an ASIC.
  • the ASIC may reside in a user equipment.
  • the processor and the storage medium may also exist as discrete components in a communication device.
  • A person of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be implemented by a program instructing related hardware.
  • The program may be stored in a computer-readable storage medium.
  • When the program is executed, the steps of the foregoing method embodiments are performed; the storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Quality & Reliability (AREA)

Abstract

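The present invention provides a human body recognition method, device, and storage medium. The method includes: determining the coordinates of a target person in the three-dimensional space according to images containing the target person collected by at least two cameras; calculating the back projection error of the target person under each camera according to the coordinates of the target person in the three-dimensional space; determining whether a camera has a human body recognition error according to the back projection error of the camera; and, when there is a human body recognition error, re-recognizing the target person under that camera using the pedestrian re-identification technology ReID, until the back projection error of every camera containing the target person is not greater than a preset threshold. The invention can introduce the three-dimensional space coordinates of the human body into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.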

Description

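Human body recognition method, device, and storage medium
This application claims priority to Chinese patent application No. 2018107196923, filed with the Chinese Patent Office on July 3, 2018 by the applicant 百度在线网络技术(北京)有限公司 and entitled "Human body recognition method, device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the technical field of image recognition, and in particular to a human body recognition method, device, and storage medium.
Background
With the development of surveillance technology, the number of deployed cameras has gradually increased, making real-time tracking and recognition of people in closed scenes possible.
At present, multi-person tracking and recognition under multi-view conditions mainly depends on two-dimensional image information: people are recognized and associated across cameras through the semantic features of the human body in the two-dimensional image.
However, the posture of the same person may differ greatly across cameras, which leads to large deviations in the visual features of the human body in the two-dimensional images. As a result, cross-camera recognition based on the information provided by two-dimensional images has a low accuracy rate and is prone to human body recognition errors.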
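Summary of the invention
The present invention provides a human body recognition method, device, and storage medium that can introduce the three-dimensional space coordinates of the human body into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.
In a first aspect, the present invention provides a human body recognition method, including:
determining the coordinates of a target person in the three-dimensional space according to images containing the target person collected by at least two cameras;
calculating the back projection error of the target person under each camera according to the coordinates of the target person in the three-dimensional space;
for each camera, determining whether the camera has a human body recognition error according to the back projection error of the camera;
when there is a human body recognition error, re-recognizing the target person under that camera using the pedestrian re-identification technology ReID, until the back projection error of every camera containing the target person is not greater than a preset threshold.
In this embodiment, the three-dimensional space coordinates of the human body can be introduced into human body re-identification technology to pre-judge the recognition result of each image and to re-recognize images with recognition errors, thereby effectively improving the accuracy of the human body recognition result.
In a possible design, before determining the coordinates of the target person in the three-dimensional space according to the images containing the target person collected by at least two cameras, the method further includes:
using the pedestrian re-identification technology ReID to perform human body recognition on the images collected by multiple cameras in the scene, to obtain the correspondence of the target person under the multiple cameras;
according to the correspondence of the target person under the multiple cameras, selecting the images containing the target person collected by at least two cameras.
In this embodiment, at least two cameras are arranged in the scene in advance, each with a different viewing angle. These cameras can track and recognize human activity in the scene; the correspondence between the target person and the multiple cameras is obtained through the pedestrian re-identification technology ReID, so that images containing the target person are obtained, which improves the tracking accuracy of the target person.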
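In a possible design, determining the coordinates of the target person in the three-dimensional space according to the images containing the target person collected by at least two cameras includes:
selecting images containing the target person collected by any two cameras at the same time;
obtaining the coordinates of the target person in each of the two images, and the camera matrices of the two cameras, where each camera matrix is obtained from known camera parameters;
obtaining the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the images and the camera matrices of the two cameras.
In this embodiment, the coordinates of the target person in the three-dimensional space can be accurately computed from the coordinates of the target person in the images and the camera matrices of the two cameras.
In a possible design, obtaining the coordinates of the target person in the three-dimensional space according to the coordinates of the target person in the images and the camera matrices of the two cameras includes:
Let $X_1$ and $X_2$ be the coordinates of the target person in the images under the two cameras, $P_1$ the camera matrix of the camera corresponding to $X_1$, and $P_2$ the camera matrix of the camera corresponding to $X_2$. Then $X_1$, $X_2$ and the coordinates $W$ of the target person in the three-dimensional space satisfy:
$X_1 = P_1 * W$, $X_2 = P_2 * W$;
where $*$ denotes multiplication.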
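In one possible design, calculating the back-projection error of the target person under each of the different cameras from the coordinates of the target person in three-dimensional space includes:
letting U_i = P_i * W;
where U_i is the back-projected coordinate of W under the i-th camera and P_i is the camera matrix of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person;
letting e_i = U_i - X_i;
where e_i is the back-projection error under the i-th camera and X_i is the coordinate of the target person in the image of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person.
In this embodiment, the back-projected coordinates of the three-dimensional point in the image captured by a camera are computed from the three-dimensional coordinates and the camera's camera matrix. Taking the difference between the back-projected coordinates and the corresponding coordinates in the captured image (obtained with an existing two-dimensional image coordinate algorithm) then accurately gives the back-projection error for the three-dimensional coordinates.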
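In one possible design, determining from the back-projection error of the camera whether a human body recognition error exists for the camera includes:
if the back-projection error of the camera is greater than the preset threshold, determining that a human body recognition error exists for the camera.
In this embodiment, the back-projection error is used to evaluate the human body recognition results, making them more accurate.
In one possible design, after re-identifying the target person under the camera using ReID until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold, the method further includes:
obtaining the coordinates of the target person in the images of the different cameras, together with the image labels;
sending the coordinates and the image labels to a monitoring platform.
In this embodiment, the obtained coordinates of the target person in the images of the different cameras and the image labels are sent to the monitoring platform, so that the monitoring platform can monitor the target person accurately.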
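In a second aspect, an embodiment of the present invention provides a human body recognition apparatus, including:
a determining module, configured to determine the coordinates of a target person in three-dimensional space from images containing the target person captured by at least two cameras;
a calculation module, configured to calculate, from the coordinates of the target person in three-dimensional space, the back-projection error of the target person under each of the different cameras;
a judging module, configured to determine, for each camera, from the back-projection error of the camera whether a human body recognition error exists for the camera;
a recognition module, configured to, when a human body recognition error exists, re-identify the target person under the camera using person re-identification (ReID) technology, until the back-projection error of every camera whose image contains the target person is no greater than a preset threshold.
In one possible design, the apparatus further includes:
a pre-recognition module, configured to, before the coordinates of the target person in three-dimensional space are determined from the images containing the target person captured by at least two cameras, perform human body recognition on the images captured by multiple cameras in the scene using ReID to obtain the correspondence of the target person across the multiple cameras;
and to select, according to the correspondence of the target person across the multiple cameras, the images containing the target person captured by at least two cameras.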
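In one possible design, the determining module is specifically configured to:
select images containing the target person captured by any two cameras at the same moment;
obtain the coordinates of the target person within each of the two images, and the camera matrices of the two cameras, where each camera matrix is obtained from known camera parameters;
obtain the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras.
In one possible design, obtaining the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras includes:
letting X_1 and X_2 be the coordinates of the target person in the images of the two cameras, P_1 the camera matrix of the camera corresponding to X_1, and P_2 the camera matrix of the camera corresponding to X_2; then X_1, X_2, and the coordinates W of the target person in three-dimensional space satisfy the following relations:
X_1 = P_1 * W, X_2 = P_2 * W;
where * denotes multiplication.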
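In one possible design, calculating the back-projection error of the target person under each of the different cameras from the coordinates of the target person in three-dimensional space includes:
letting U_i = P_i * W;
where U_i is the back-projected coordinate of W under the i-th camera and P_i is the camera matrix of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person;
letting e_i = U_i - X_i;
where e_i is the back-projection error under the i-th camera and X_i is the coordinate of the target person in the image of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person.
In one possible design, the judging module is specifically configured to:
determine that a human body recognition error exists for the camera if the back-projection error of the camera is greater than the preset threshold.
In one possible design, the apparatus further includes:
a sending module, configured to, after the target person under the camera has been re-identified using ReID until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold, obtain the coordinates of the target person in the images of the different cameras, together with the image labels;
and to send the coordinates and the image labels to a monitoring platform.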
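In a third aspect, an embodiment of the present invention provides a server, including a processor and a memory, the memory storing instructions executable by the processor, where the processor is configured to perform the human body recognition method of any design of the first aspect by executing the executable instructions.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the human body recognition method of any design of the first aspect.
In the human body recognition method, apparatus, and storage medium provided by the present invention, the coordinates of a target person in three-dimensional space are determined from images containing the target person captured by at least two cameras; the back-projection error of the target person under each camera is calculated from those coordinates; whether a human body recognition error exists for a camera is determined from that camera's back-projection error; and, when a recognition error exists, the target person under that camera is re-identified using ReID until the back-projection error of every camera whose image contains the target person is no greater than a preset threshold. By introducing the three-dimensional coordinates of the human body into person re-identification, the invention pre-evaluates the image recognition results and re-identifies images with recognition errors, thereby effectively improving the accuracy of human body recognition.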
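Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of an application scenario of the present invention;
FIG. 2 is a flowchart of the human body recognition method provided by Embodiment 1 of the present invention;
FIG. 3 is a schematic structural diagram of the human body recognition apparatus provided by Embodiment 2 of the present invention;
FIG. 4 is a schematic structural diagram of the human body recognition apparatus provided by Embodiment 3 of the present invention;
FIG. 5 shows the server provided by Embodiment 4 of the present invention.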
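Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can, for example, be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
The technical solutions of the present invention are described in detail below through specific embodiments. These specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments.
Some terms used in this application are explained below to aid understanding by those skilled in the art:
1) Person re-identification (ReID) is a technique that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given an image of a pedestrian from surveillance footage, retrieve images of that pedestrian captured by other devices. It aims to compensate for the limited field of view of fixed cameras, can be combined with pedestrian detection and pedestrian tracking, and is widely applicable to intelligent video surveillance, intelligent security, and similar fields.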
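Multiple cameras are arranged in a scene in advance and observe human activity in the scene from different viewing angles. Specifically, FIG. 1 is a schematic structural diagram of an application scenario of the present invention. As shown in FIG. 1, all cameras in the scene form a camera group 10. The different cameras 11 in the camera group 10 send the captured images of the target person to a server 20, and the server 20 determines the coordinates of the target person in three-dimensional space from the images containing the target person captured by at least two cameras. In this embodiment, the three-dimensional space refers to the space within the scene. The server 20 calculates, from the coordinates of the target person in three-dimensional space, the back-projection error of the target person under each camera 11; determines from the back-projection error of a camera 11 whether a human body recognition error exists for that camera 11; and, when a recognition error exists, re-identifies the target person under that camera 11 using ReID until the back-projection error of every camera 11 whose image contains the target person is no greater than the preset threshold. The server 20 sends the finally recognized coordinates of the target person in the images of the different cameras, together with the image labels, to a monitoring platform 30. This embodiment introduces the three-dimensional coordinates of the human body into person re-identification to pre-evaluate the image recognition results and re-identifies images with recognition errors, thereby effectively improving the accuracy of human body recognition.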
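The technical solutions of the present invention, and how they solve the above technical problem, are described in detail below through specific embodiments. These specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present invention are described below with reference to the accompanying drawings.
FIG. 2 is a flowchart of the human body recognition method provided by Embodiment 1 of the present invention. As shown in FIG. 2, the method in this embodiment may include:
S101: Determine the coordinates of a target person in three-dimensional space from images containing the target person captured by at least two cameras.
In an optional implementation, images containing the target person captured by any two cameras at the same moment are selected; the coordinates of the target person within each of the two images and the camera matrices of the two cameras are obtained, where each camera matrix is obtained from known camera parameters; and the coordinates of the target person in three-dimensional space are obtained from the coordinates of the target person within the images and the camera matrices of the two cameras.
In this embodiment, multiple cameras with different viewing angles are arranged in the scene in advance, and human activity in the scene can be tracked and recognized through these cameras. In an optional implementation, ReID is used to perform human body recognition on the images captured by the multiple cameras in the scene, giving the correspondence of the target person across the multiple cameras; according to this correspondence, the images containing the target person captured by at least two cameras are selected.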
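The selection step can be sketched as follows. This is a minimal illustration, assuming the ReID stage has already attached an identity label to each detection; the record layout `(camera_id, person_label, (u, v))` and the function names are hypothetical and do not come from the source.

```python
from collections import defaultdict


def group_by_identity(detections):
    """Group per-camera detections by the identity label assigned by ReID.

    `detections` is an iterable of (camera_id, person_label, (u, v)) tuples;
    this record layout is an assumption made for illustration.
    """
    groups = defaultdict(dict)
    for camera_id, label, pixel in detections:
        groups[label][camera_id] = pixel
    return groups


def views_of(groups, label, min_cameras=2):
    """Return {camera_id: pixel} for `label`, or {} if too few cameras saw it."""
    views = groups.get(label, {})
    return dict(views) if len(views) >= min_cameras else {}


detections = [
    ("cam1", "person_A", (320.0, 240.0)),
    ("cam2", "person_A", (280.0, 250.0)),
    ("cam3", "person_B", (100.0, 90.0)),
]
groups = group_by_identity(detections)
```

Here `views_of(groups, "person_A")` yields two views, enough for the three-dimensional step, while `person_B` is seen by a single camera and is filtered out.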
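Specifically, to obtain the three-dimensional coordinates, let X_1 and X_2 be the coordinates of the target person in the images of the two cameras, P_1 the camera matrix of the camera corresponding to X_1, and P_2 the camera matrix of the camera corresponding to X_2; then X_1, X_2, and the coordinates W of the target person in three-dimensional space satisfy the following relations:
X_1 = P_1 * W, X_2 = P_2 * W;
where * denotes multiplication.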
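The source does not say how W is solved from these two relations. One common approach is linear triangulation; the sketch below assumes 3×4 camera matrices and pixel coordinates (u, v), rewrites the relations as four linear equations in (x, y, z), and solves them in the least-squares sense. The function names and the toy camera matrices are illustrative assumptions.

```python
def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate geometry: the views do not constrain the point")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def triangulate(p1, x1, p2, x2):
    """Least-squares 3-D point from pixels x1, x2 and 3x4 camera matrices p1, p2."""
    rows, rhs = [], []
    for P, (u, v) in ((p1, x1), (p2, x2)):
        for coord, r in ((u, 0), (v, 1)):
            # coord * (P[2] . W) - (P[r] . W) = 0, with W = (x, y, z, 1)
            rows.append([coord * P[2][k] - P[r][k] for k in range(3)])
            rhs.append(P[r][3] - coord * P[2][3])
    # Normal equations (A^T A) w = A^T b of the 4x3 system.
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    atb = [sum(row[i] * b for row, b in zip(rows, rhs)) for i in range(3)]
    return solve_linear(ata, atb)


# Toy setup: two cameras with identity intrinsics, the second shifted 1 unit along x.
P1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
P2 = [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]]
W = triangulate(P1, (0.25, 0.1), P2, (-0.25, 0.1))  # approximately [0.5, 0.2, 2.0]
```

Solving the normal equations keeps the sketch dependency-free; a production system would more likely use an SVD-based solver, which is numerically more robust.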
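S102: Calculate, from the coordinates of the target person in three-dimensional space, the back-projection error of the target person under each of the different cameras.
In this embodiment, the back-projected coordinates of the three-dimensional point in the image captured by a camera are computed from the three-dimensional coordinates and the camera's camera matrix. The difference between the back-projected coordinates and the corresponding coordinates in the captured image (obtained with an existing two-dimensional image coordinate algorithm) gives the corresponding back-projection error.
In an optional implementation, assuming the coordinates W of the target person in three-dimensional space have been obtained, let U_i = P_i * W;
where U_i is the back-projected coordinate of W under the i-th camera and P_i is the camera matrix of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person;
let e_i = U_i - X_i;
where e_i is the back-projection error under the i-th camera and X_i is the coordinate of the target person in the image of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person.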
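A sketch of this computation is given below. The source defines e_i as the difference U_i - X_i; comparing it with a threshold requires a scalar, so the sketch assumes the Euclidean norm of that difference in pixels. It also assumes 3×4 camera matrices with a homogeneous divide. Function names and the toy data are illustrative.

```python
import math


def project(P, W):
    """Back-project the 3-D point W = (x, y, z) into the image of camera matrix P."""
    x, y, z = W
    h = [P[r][0] * x + P[r][1] * y + P[r][2] * z + P[r][3] for r in range(3)]
    if abs(h[2]) < 1e-12:
        raise ValueError("point lies on the camera's principal plane")
    return (h[0] / h[2], h[1] / h[2])  # homogeneous divide


def back_projection_errors(cameras, observations, W):
    """Return {camera_id: ||U_i - X_i||} for every camera that observed the person."""
    errors = {}
    for cam_id, P in cameras.items():
        u, v = project(P, W)
        ox, oy = observations[cam_id]
        errors[cam_id] = math.hypot(u - ox, v - oy)
    return errors


cameras = {
    "cam1": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    "cam2": [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]],
    "cam3": [[1, 0, 0, 0], [0, 1, 0, -1], [0, 0, 1, 0]],
}
# cam3's observation is deliberately wrong, as if ReID matched a different person.
observations = {"cam1": (0.25, 0.1), "cam2": (-0.25, 0.1), "cam3": (0.25, 0.0)}
errors = back_projection_errors(cameras, observations, (0.5, 0.2, 2.0))
```

In this toy data the first two cameras agree with the three-dimensional point, while the mismatched observation in `cam3` produces a clearly larger error.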
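S103: For each camera, determine from the back-projection error of the camera whether a human body recognition error exists for the camera.
In this embodiment, the magnitude of the back-projection error is used to determine whether a human body recognition error exists in the image of a camera. Optionally, if the back-projection error of a camera is greater than the preset threshold, a human body recognition error is determined to exist for that camera; if the back-projection error of a camera is no greater than the preset threshold, the human body recognition result of that camera is determined to be correct.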
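The decision rule is a plain comparison. A minimal sketch, assuming scalar per-camera errors in pixels and an illustrative threshold value (the source does not specify one):

```python
def flag_misidentified(errors, threshold):
    """Return the camera ids whose back-projection error exceeds the threshold."""
    # "No greater than" counts as correct, so an error equal to the threshold passes.
    return sorted(cam for cam, err in errors.items() if err > threshold)


errors = {"cam1": 0.0, "cam2": 5.0, "cam3": 12.5}  # pixels, made-up values
flagged = flag_misidentified(errors, threshold=5.0)
```

Only `cam3` is flagged here; `cam2` sits exactly on the threshold and is treated as correct, matching the "no greater than" wording.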
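S104: When a human body recognition error exists, re-identify the target person under the camera using person re-identification (ReID) technology, until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold.
In this embodiment, existing ReID technology is used to re-identify the target person under any camera for which a human body recognition error exists, which effectively eliminates wrong recognition results and improves the accuracy of human body recognition.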
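The "re-identify until all errors are within the threshold" loop might be organized as below. This is a sketch under stated assumptions: `compute_errors` and `reidentify` are hypothetical callbacks standing in for the back-projection step and the ReID model, and the `max_rounds` guard is an addition of this sketch (the source describes an unbounded "until").

```python
import math


def refine(observations, compute_errors, reidentify, threshold, max_rounds=10):
    """Re-identify flagged cameras until every back-projection error is within threshold."""
    for _ in range(max_rounds):
        errors = compute_errors(observations)
        bad = [cam for cam, err in errors.items() if err > threshold]
        if not bad:
            return observations, errors
        for cam in bad:
            # Ask the (hypothetical) ReID callback for a new match in this camera.
            observations[cam] = reidentify(cam, observations[cam])
    raise RuntimeError("errors did not fall below the threshold within max_rounds")


# Toy stand-ins: fixed back-projected coordinates and a queue of alternative detections.
expected = {"cam1": (10.0, 10.0), "cam2": (20.0, 20.0)}
candidates = {"cam2": [(21.0, 20.0)]}


def compute_errors(obs):
    return {c: math.hypot(p[0] - expected[c][0], p[1] - expected[c][1])
            for c, p in obs.items()}


def reidentify(cam, old_pixel):
    return candidates[cam].pop(0)


start = {"cam1": (10.0, 10.5), "cam2": (90.0, 20.0)}
final_obs, final_err = refine(dict(start), compute_errors, reidentify, threshold=2.0)
```

In a fuller implementation, `compute_errors` would also re-estimate the three-dimensional coordinate from the updated observations in each round before back-projecting.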
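In an optional implementation, after the target person under the camera has been re-identified using ReID until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold, the coordinates of the target person in the images of the different cameras and the image labels are obtained, and the coordinates and image labels are sent to the monitoring platform.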
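The source does not define the message sent to the monitoring platform. Purely as an illustration, the final coordinates and label could be serialized as JSON; the field names below (`label`, `cameras`, `camera_id`, `u`, `v`) are hypothetical.

```python
import json


def build_report(person_label, observations):
    """Serialize the final per-camera image coordinates and the label as JSON."""
    cameras = [
        {"camera_id": cam, "u": u, "v": v}
        for cam, (u, v) in sorted(observations.items())
    ]
    return json.dumps({"label": person_label, "cameras": cameras}, sort_keys=True)


report = build_report("person_A", {"cam2": (280.0, 250.0), "cam1": (320.0, 240.0)})
```

Sorting the cameras and keys makes the payload deterministic, which simplifies comparison and logging on the receiving side.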
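In this embodiment, the coordinates of a target person in three-dimensional space are determined from images containing the target person captured by at least two cameras; the back-projection error of the target person under each camera is calculated from those coordinates; whether a human body recognition error exists for a camera is determined from that camera's back-projection error; and, when a recognition error exists, the target person under that camera is re-identified using ReID until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold. By introducing the three-dimensional coordinates of the human body into person re-identification, the invention pre-evaluates the image recognition results and re-identifies images with recognition errors, thereby effectively improving the accuracy of human body recognition.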
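FIG. 3 is a schematic structural diagram of the human body recognition apparatus provided by Embodiment 2 of the present invention. As shown in FIG. 3, the human body recognition apparatus of this embodiment may include:
a determining module 41, configured to determine the coordinates of a target person in three-dimensional space from images containing the target person captured by at least two cameras;
a calculation module 42, configured to calculate, from the coordinates of the target person in three-dimensional space, the back-projection error of the target person under each of the different cameras;
a judging module 43, configured to determine, for each camera, from the back-projection error of the camera whether a human body recognition error exists for the camera;
a recognition module 44, configured to, when a human body recognition error exists, re-identify the target person under the camera using person re-identification (ReID) technology, until the back-projection error of every camera whose image contains the target person is no greater than a preset threshold.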
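In one possible design, the determining module 41 is specifically configured to:
select images containing the target person captured by any two cameras at the same moment;
obtain the coordinates of the target person within each of the two images, and the camera matrices of the two cameras, where each camera matrix is obtained from known camera parameters;
obtain the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras.
In one possible design, obtaining the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras includes:
letting X_1 and X_2 be the coordinates of the target person in the images of the two cameras, P_1 the camera matrix of the camera corresponding to X_1, and P_2 the camera matrix of the camera corresponding to X_2; then X_1, X_2, and the coordinates W of the target person in three-dimensional space satisfy the following relations:
X_1 = P_1 * W, X_2 = P_2 * W;
where * denotes multiplication.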
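In one possible design, calculating the back-projection error of the target person under each of the different cameras from the coordinates of the target person in three-dimensional space includes:
letting U_i = P_i * W;
where U_i is the back-projected coordinate of W under the i-th camera and P_i is the camera matrix of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person;
letting e_i = U_i - X_i;
where e_i is the back-projection error under the i-th camera and X_i is the coordinate of the target person in the image of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person.
In one possible design, the judging module 43 is specifically configured to:
determine that a human body recognition error exists for the camera if the back-projection error of the camera is greater than the preset threshold.
The human body recognition apparatus of this embodiment can execute the technical solution of any of the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.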
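FIG. 4 is a schematic structural diagram of the human body recognition apparatus provided by Embodiment 3 of the present invention. As shown in FIG. 4, in addition to the modules of the apparatus shown in FIG. 3, the human body recognition apparatus of this embodiment may further include:
a pre-recognition module 45, configured to, before the coordinates of the target person in three-dimensional space are determined from the images containing the target person captured by at least two cameras, perform human body recognition on the images captured by multiple cameras in the scene using ReID to obtain the correspondence of the target person across the multiple cameras;
and to select, according to the correspondence of the target person across the multiple cameras, the images containing the target person captured by at least two cameras.
In one possible design, the apparatus further includes:
a sending module 46, configured to, after the target person under the camera has been re-identified using ReID until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold, obtain the coordinates of the target person in the images of the different cameras, together with the image labels;
and to send the coordinates and the image labels to a monitoring platform.
The human body recognition apparatus of this embodiment can execute the technical solution of any of the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.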
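FIG. 5 shows the server provided by Embodiment 4 of the present invention. As shown in FIG. 5, the server 50 in this embodiment includes a processor 51 and a memory 52.
The memory 52 is configured to store computer programs (such as application programs and functional modules implementing the above human body recognition method), computer instructions, and the like. These computer programs and instructions may be stored in partitions in one or more memories 52, and the computer programs, computer instructions, and data can be invoked by the processor 51.
The processor 51 is configured to execute the computer programs stored in the memory 52 to implement the steps of the methods in the above embodiments. For details, refer to the descriptions in the method embodiments above. The memory 52 and the processor 51 may be coupled through a bus 53.
The server of this embodiment can execute the technical solution of any of the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.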
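In addition, an embodiment of this application provides a computer-readable storage medium storing computer-executable instructions. When at least one processor of a user equipment executes the computer-executable instructions, the user equipment performs the various possible methods described above.
Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium accessible to a general-purpose or special-purpose computer. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a user equipment. The processor and the storage medium may also exist as discrete components in a communication device.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be equivalently replaced, without causing the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.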

Claims (16)

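  1. A human body recognition method, characterized by comprising:
    determining the coordinates of a target person in three-dimensional space from images containing the target person captured by at least two cameras;
    calculating, from the coordinates of the target person in three-dimensional space, the back-projection error of the target person under each of the different cameras;
    for each camera, determining from the back-projection error of the camera whether a human body recognition error exists for the camera;
    when a human body recognition error exists, re-identifying the target person under the camera using person re-identification (ReID) technology, until the back-projection error of every camera whose image contains the target person is no greater than a preset threshold.
  2. The method according to claim 1, characterized in that, before determining the coordinates of the target person in three-dimensional space from the images containing the target person captured by at least two cameras, the method further comprises:
    performing human body recognition on the images captured by multiple cameras in the scene using ReID, to obtain the correspondence of the target person across the multiple cameras;
    selecting, according to the correspondence of the target person across the multiple cameras, the images containing the target person captured by at least two cameras.
  3. The method according to claim 1, characterized in that determining the coordinates of the target person in three-dimensional space from the images containing the target person captured by at least two cameras comprises:
    selecting images containing the target person captured by any two cameras at the same moment;
    obtaining the coordinates of the target person within each of the two images, and the camera matrices of the two cameras, wherein each camera matrix is obtained from known camera parameters;
    obtaining the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras.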
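  4. The method according to claim 3, characterized in that obtaining the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras comprises:
    letting X_1 and X_2 be the coordinates of the target person in the images of the two cameras, P_1 the camera matrix of the camera corresponding to X_1, and P_2 the camera matrix of the camera corresponding to X_2; then X_1, X_2, and the coordinates W of the target person in three-dimensional space satisfy the following relations:
    X_1 = P_1 * W, X_2 = P_2 * W;
    where * denotes multiplication.
  5. The method according to claim 4, characterized in that calculating the back-projection error of the target person under each of the different cameras from the coordinates of the target person in three-dimensional space comprises:
    letting U_i = P_i * W;
    where U_i is the back-projected coordinate of W under the i-th camera and P_i is the camera matrix of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person;
    letting e_i = U_i - X_i;
    where e_i is the back-projection error under the i-th camera and X_i is the coordinate of the target person in the image of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person.
  6. The method according to claim 1, characterized in that determining from the back-projection error of the camera whether a human body recognition error exists for the camera comprises:
    if the back-projection error of the camera is greater than the preset threshold, determining that a human body recognition error exists for the camera.
  7. The method according to any one of claims 1 to 6, characterized in that, after re-identifying the target person under the camera using ReID until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold, the method further comprises:
    obtaining the coordinates of the target person in the images of the different cameras, together with the image labels;
    sending the coordinates and the image labels to a monitoring platform.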
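  8. A human body recognition apparatus, characterized by comprising:
    a determining module, configured to determine the coordinates of a target person in three-dimensional space from images containing the target person captured by at least two cameras;
    a calculation module, configured to calculate, from the coordinates of the target person in three-dimensional space, the back-projection error of the target person under each of the different cameras;
    a judging module, configured to determine, for each camera, from the back-projection error of the camera whether a human body recognition error exists for the camera;
    a recognition module, configured to, when a human body recognition error exists, re-identify the target person under the camera using person re-identification (ReID) technology, until the back-projection error of every camera whose image contains the target person is no greater than a preset threshold.
  9. The apparatus according to claim 8, characterized by further comprising:
    a pre-recognition module, configured to, before the coordinates of the target person in three-dimensional space are determined from the images containing the target person captured by at least two cameras, perform human body recognition on the images captured by multiple cameras in the scene using ReID to obtain the correspondence of the target person across the multiple cameras;
    and to select, according to the correspondence of the target person across the multiple cameras, the images containing the target person captured by at least two cameras.
  10. The apparatus according to claim 8, characterized in that the determining module is specifically configured to:
    select images containing the target person captured by any two cameras at the same moment;
    obtain the coordinates of the target person within each of the two images, and the camera matrices of the two cameras, wherein each camera matrix is obtained from known camera parameters;
    obtain the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras.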
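  11. The apparatus according to claim 10, characterized in that obtaining the coordinates of the target person in three-dimensional space from the coordinates of the target person within the images and the camera matrices of the two cameras comprises:
    letting X_1 and X_2 be the coordinates of the target person in the images of the two cameras, P_1 the camera matrix of the camera corresponding to X_1, and P_2 the camera matrix of the camera corresponding to X_2; then X_1, X_2, and the coordinates W of the target person in three-dimensional space satisfy the following relations:
    X_1 = P_1 * W, X_2 = P_2 * W;
    where * denotes multiplication.
  12. The apparatus according to claim 11, characterized in that calculating the back-projection error of the target person under each of the different cameras from the coordinates of the target person in three-dimensional space comprises:
    letting U_i = P_i * W;
    where U_i is the back-projected coordinate of W under the i-th camera and P_i is the camera matrix of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person;
    letting e_i = U_i - X_i;
    where e_i is the back-projection error under the i-th camera and X_i is the coordinate of the target person in the image of the i-th camera; i = 1, 2, 3, …, N; N is the total number of cameras whose images contain the target person.
  13. The apparatus according to claim 8, characterized in that the judging module is specifically configured to:
    determine that a human body recognition error exists for the camera if the back-projection error of the camera is greater than the preset threshold.
  14. The apparatus according to any one of claims 8 to 13, characterized by further comprising:
    a sending module, configured to, after the target person under the camera has been re-identified using ReID until the back-projection error of every camera whose image contains the target person is no greater than the preset threshold, obtain the coordinates of the target person in the images of the different cameras, together with the image labels;
    and to send the coordinates and the image labels to a monitoring platform.
  15. A server, characterized by comprising a processor and a memory, the memory storing instructions executable by the processor, wherein the processor is configured to perform the human body recognition method according to any one of claims 1 to 7 by executing the executable instructions.
  16. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the human body recognition method according to any one of claims 1 to 7.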
PCT/CN2019/089969 2018-07-03 2019-06-04 Human body recognition method, apparatus, and storage medium WO2020007156A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP19830531.0A EP3819815A4 (en) 2018-07-03 2019-06-04 METHOD AND DEVICE FOR RECOGNIZING A HUMAN BODY AND STORAGE MEDIA
KR1020207013408A KR102377295B1 (ko) 2018-07-03 2019-06-04 Human body identification method, apparatus, and storage medium
JP2020526101A JP7055867B2 (ja) 2018-07-03 2019-06-04 Human body recognition method, device, and storage medium
US16/934,174 US11354923B2 (en) 2018-07-03 2020-07-21 Human body recognition method and apparatus, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810719692.3 2018-07-03
CN201810719692.3A CN109063567B (zh) 2018-07-03 2018-07-03 Human body recognition method, apparatus, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/934,174 Continuation US11354923B2 (en) 2018-07-03 2020-07-21 Human body recognition method and apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2020007156A1 (zh) 2020-01-09

Family

ID=64818440

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089969 WO2020007156A1 (zh) 2018-07-03 2019-06-04 Human body recognition method, apparatus, and storage medium

Country Status (6)

Country Link
US (1) US11354923B2 (zh)
EP (1) EP3819815A4 (zh)
JP (1) JP7055867B2 (zh)
KR (1) KR102377295B1 (zh)
CN (1) CN109063567B (zh)
WO (1) WO2020007156A1 (zh)

Also Published As

Publication number Publication date
KR102377295B1 (ko) 2022-03-22
CN109063567A (zh) 2018-12-21
CN109063567B (zh) 2021-04-13
EP3819815A1 (en) 2021-05-12
US11354923B2 (en) 2022-06-07
JP2021502646A (ja) 2021-01-28
US20200349349A1 (en) 2020-11-05
EP3819815A4 (en) 2022-05-04
KR20200068709A (ko) 2020-06-15
JP7055867B2 (ja) 2022-04-18

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19830531; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 20207013408; Country of ref document: KR; Kind code of ref document: A. Ref document number: 2020526101; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2019830531; Country of ref document: EP; Effective date: 20210203)