CN113705388A - Method and system for positioning space positions of multiple persons in real time based on camera information - Google Patents

Method and system for positioning space positions of multiple persons in real time based on camera information

Info

Publication number
CN113705388A
CN113705388A (Application CN202110931717.8A)
Authority
CN
China
Prior art keywords
video
rays
point
points
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110931717.8A
Other languages
Chinese (zh)
Other versions
CN113705388B (en)
Inventor
顾海军
肖世锋
包飞
贾明
田楠
罗清
黄健
侯丽娟
刘岱
曲量
张晔
彭莉
石育
鲁谨慈
黄祥勇
刘祖军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Hunan Electric Power Co Ltd
Xiangxi Power Supply Co of State Grid Hunan Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Hunan Electric Power Co Ltd
Xiangxi Power Supply Co of State Grid Hunan Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Hunan Electric Power Co Ltd, Xiangxi Power Supply Co of State Grid Hunan Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202110931717.8A
Publication of CN113705388A
Application granted
Publication of CN113705388B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method and a system for positioning the spatial positions of multiple persons in real time based on camera information. The method comprises the following steps: 1) calibrating each video camera channel to obtain the camera's internal parameters, external parameters and distortion coefficients; 2) acquiring video images, detecting the heads of persons in the video images, and calculating the center position of each head; 3) obtaining rays for each video channel from the camera calibration and the head center positions in the video images, each ray starting at the camera's optical center and passing through a head center position detected in the video image; 4) calculating the minimum distance between all rays from different video channels, and taking the midpoint of the shortest segment connecting each pair of rays as a valid derived point; 5) clustering the valid derived points, discarding false cluster points, and taking the remaining cluster points as the spatial positions of the persons. The invention has the advantages of good real-time performance and high positioning accuracy.

Description

Method and system for positioning space positions of multiple persons in real time based on camera information
Technical Field
The invention relates generally to the technical field of visual detection, and in particular to a method and system for positioning the spatial positions of multiple persons in real time based on camera information.
Background
Positioning the spatial positions of multiple persons based on camera information means acquiring indoor video data through several cameras and deducing the spatial positions of the persons from that data. It is the core of monitoring worker behavior in a high-voltage room, and has become an indispensable link in tasks such as detecting whether the safety supervisor leaves the post, detecting whether workers leave the safety supervisor's line of sight, identifying the three-way handover process, and identifying the identities of workers during operations.
Methods for positioning multi-person spatial positions based on camera information fall into two categories. The first locates a person's spatial position by reconstructing the person's 3D contour: the human body is segmented in each video channel, homography matrices are computed at different heights, and the segmentation results are multiplied by the homography matrices to obtain the 3D contour. The second establishes associations across the video images of different cameras: detected persons in different camera views are identified as the same person, and that person's spatial position is then located from the corresponding image regions. Both approaches have drawbacks when locating the spatial position of a person. The computation time required by the 3D contour method makes real-time operation difficult, and accurate 3D contour reconstruction requires many video channels, which further limits its practical application. The association approach is affected by occlusion, imaging distortion, image boundary effects and similar factors, which reduce the accuracy of associating the same person across different video images and thus degrade positioning accuracy.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the problems in the prior art, the invention provides a method and a system for positioning the spatial positions of multiple persons in real time based on camera information, with good real-time performance and high positioning accuracy.
To solve the above technical problems, the technical solution provided by the invention is as follows:
a method for positioning the space position of multiple persons in real time based on camera information comprises the following steps:
1) calibrating each path of video camera to obtain internal parameters, external parameters and distortion coefficients of the camera;
2) acquiring a video image, detecting the head of a person in the video image, and calculating the center position of the head;
3) obtaining rays in each path of video image through the calibration of a video camera and the center position of the head of a person in the video image; each ray starts from the optical center of the video image and passes through the head center position of the person detected in the video image;
4) calculating the minimum distance between all rays in the video images of different paths, and taking the midpoint of the connecting line of the minimum distance between all the rays as an effective derivation point;
5) and clustering the effective derived points, kicking off the false gathering points, and taking the rest gathering points as the spatial positions of the personnel.
As a further improvement of the above technical solution:
the specific process of the step 3) is as follows:
3.1) normalizing the pixel coordinates of the points before distortion to obtain normalized coordinates of distortion-removed points;
and 3.2) acquiring the position of the video camera in a world coordinate system and the direction of the ray, and then obtaining the corresponding ray according to the coordinate of the distortion removing point.
In step 4), suppose n1 person heads are detected in video channel A, yielding rays ra1, ra2, …, ran1, and n2 person heads are detected in video channel B, yielding rays rb1, rb2, …, rbn2. For ray rai of channel A and ray rbj of channel B, the distance is defined as min(Dist(p, rai) + Dist(p, rbj)) subject to the constraint Dist(p, rai) = Dist(p, rbj); that is, a spatial point p is sought whose summed distance to rays rai and rbj is minimal, this minimum is defined as the distance between rays rai and rbj, and the constraint guarantees the uniqueness of p. The point p is called the minimum-distance derived point of rays rai and rbj; if the minimum distance between rays rai and rbj is less than a given threshold, p is considered a valid minimum-distance derived point.
In step 5), the cluster points of the valid derived points are potential spatial positions of persons. The cluster points are sorted, and the corresponding rays are preferentially assigned to the cluster points ranked first; if a later cluster point is left with only one ray, or no assignable ray, it is considered a false cluster point. False cluster points are discarded, and the remaining cluster points are the spatial positions of the persons.
The sorting principle considers two factors: the height corresponding to a cluster point should lie between 1.5 and 1.85 meters, and the class corresponding to the cluster point should contain as many minimum-distance derived points as possible.
In step 5), the clustering adopts the mean shift method, the density peak method or hierarchical clustering.
The process of clustering by the mean shift method is:
the valid points are weighted in three-dimensional space with a Gaussian kernel function, and each valid point initially serves as a class center; for each class center, the points whose distance to the class center is smaller than a preset bandwidth are found and form its cluster; for each cluster, the vectors from the class center to each point in the cluster are computed and summed to give an offset vector, and the class center is moved along the offset vector to serve as the new class center; the class centers are moved iteratively in this way until convergence; class centers whose mutual distance is smaller than the preset bandwidth are merged, and each remaining class center is taken as the three-dimensional position of one head.
The invention also discloses a system for positioning the spatial positions of multiple persons in real time based on camera information, which comprises:
a first module for calibrating each video camera channel to obtain the camera's internal parameters, external parameters and distortion coefficients;
a second module for acquiring video images, detecting the heads of persons in the video images and calculating the center position of each head;
a third module for obtaining rays for each video channel from the camera calibration and the head center positions in the video images, each ray starting at the camera's optical center and passing through a head center position detected in the video image;
a fourth module for calculating the minimum distance between all rays from different video channels and taking the midpoint of the shortest segment connecting each pair of rays as a valid derived point;
and a fifth module for clustering the valid derived points, discarding false cluster points, and taking the remaining cluster points as the spatial positions of the persons.
The invention further discloses a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program carries out the steps of the above method for positioning the spatial positions of multiple persons in real time based on camera information.
The invention also discloses a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, performs the steps of the above method for positioning the spatial positions of multiple persons in real time based on camera information.
Compared with the prior art, the invention has the advantages that:
the invention relates to a method for positioning the space position of multiple persons in real time based on camera information, which comprises the steps of firstly, calibrating a video camera to determine the internal reference, the external reference and the distortion coefficient of the camera, detecting the heads in video images in all paths and obtaining the central position of the heads; calibrating the center position of the head in the image by a camera, wherein each center position of the head corresponds to a ray starting from an optical center, and the real head position is on the ray; calculating the minimum distance between the rays in the video images of different paths, and determining a point on the space from the minimum distance, wherein the point is equal to the two rays in distance and the sum of the distances is equal to the minimum distance between the two rays; if the minimum distance is smaller than a given threshold value, the determined point is called an effective derivation point of the minimum distance between the rays; clustering the effective lead-out points, and eliminating false clustering points by using the height of a person and only corresponding information of one ray, wherein the positions of the rest clustering points are the spatial positions of the person; the invention has good real-time performance and high positioning precision, is not limited to the space positioning of personnel in the high-pressure chamber, and is suitable for the space positioning of a plurality of personnel in any scene.
The invention adopts deep learning to detect person heads; each head center position defines a ray from the camera's optical center, and the true head position lies on that ray. Because only the heads are 3D-reconstructed by intersecting rays, the poor real-time performance of traditional full 3D reconstruction is avoided, so real-time performance is good; and because deep learning is used to detect heads and only part of the target is reconstructed, 3D reconstruction precision is greatly improved, so positioning accuracy is high.
Drawings
FIG. 1 is a flow chart of an embodiment of the method of the present invention.
FIG. 2 is a diagram illustrating a multi-person positioning reconstruction result in a specific application of the method of the present invention; it shows the images captured by three different cameras at the same time instant, the boxes are the detection results, and boxes of the same gray level indicate the same person.
Detailed Description
The invention is further described below with reference to the figures and the specific embodiments of the description.
According to the camera model, the spatial position of a person's head lies on the ray from the optical center through the head position in the image. For the same person, each video channel in which the head is detected yields a corresponding ray, and the intersection of the rays from different video channels is the 3D spatial position of the head. Based on this fact, the invention provides a method for positioning the spatial positions of multiple persons in real time based on camera information, as shown in FIG. 1, which comprises the following steps:
1) calibrating each video camera channel to obtain the camera's internal parameters, external parameters and distortion coefficients;
2) acquiring video images, detecting the heads of persons in the video images, and calculating the center position of each head;
3) obtaining rays for each video channel from the camera calibration and the head center positions in the video images; each ray starts at the camera's optical center and passes through a head center position detected in the video image;
4) calculating the minimum distance between all rays from different video channels, and taking the midpoint of the shortest segment connecting each pair of rays as a valid derived point;
5) clustering the valid derived points, discarding false cluster points, and taking the remaining cluster points as the spatial positions of the persons.
In this method, the video cameras are first calibrated to determine their internal parameters, external parameters and distortion coefficients, and heads are detected in the video images of all channels to obtain their center positions. Through the camera calibration, each head center position in an image corresponds to a ray starting at the optical center, and the true head position lies on that ray. The minimum distance between rays from different video channels is calculated, and from it a spatial point is determined that is equidistant from the two rays and whose summed distance to them equals the minimum distance between them; if this minimum distance is smaller than a given threshold, the determined point is called a valid minimum-distance derived point. The valid derived points are clustered, and false cluster points are eliminated using the person-height constraint and the fact that each ray corresponds to only one position; the positions of the remaining cluster points are the spatial positions of the persons. The invention has good real-time performance and high positioning accuracy, is not limited to positioning personnel in a high-voltage room, and is suitable for the spatial positioning of multiple persons in any scene.
In a specific embodiment, in step 1), the video camera calibration is implemented on the checkerboard calibration principle: checkerboard grids are laid out on the ground, and the camera's internal parameters, external parameters and distortion coefficients are solved from the relationships among the checkerboard grid points. This calibration approach is mature, reliable and simple to operate.
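By way of illustration, a minimal sketch of such a checkerboard calibration using OpenCV follows. The board geometry (9x6 inner corners, 30 mm squares) and the image folder are assumptions of the example, not values fixed by the invention.

```python
import glob

import cv2
import numpy as np

PATTERN = (9, 6)      # assumed inner-corner grid of the printed checkerboard
SQUARE_MM = 30.0      # assumed square size

# 3D corner coordinates in the board's own plane (Z = 0).
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_points, img_points = [], []
for path in glob.glob("calib/cam1/*.jpg"):  # assumed per-channel image folder
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# Intrinsics K, distortion coefficients (k1, k2, p1, p2, k3), per-view extrinsics.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```

cv2.calibrateCamera internally performs the joint refinement of intrinsics, extrinsics and distortion coefficients that the embodiment below describes.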
In a specific embodiment, in step 2), the detection of person heads in the video images may employ a deep-learning detection network or another detection network.
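For illustration only, the sketch below reduces detection boxes to center positions. Since the patent does not fix a particular network, a generic COCO person detector from torchvision stands in here for the head-trained detector the method actually assumes.

```python
import torch
import torchvision

# Stand-in detector: COCO person boxes; a real deployment would use a
# network trained on head annotations, since the method detects heads.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def box_centers(frame_bgr, score_thresh=0.7):
    """Return the (cx, cy) pixel centers of detected boxes in one frame
    (frame_bgr: HxWx3 uint8 numpy array in BGR order)."""
    img = torch.from_numpy(frame_bgr[:, :, ::-1].copy()).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([img])[0]
    centers = []
    for box, score, label in zip(out["boxes"], out["scores"], out["labels"]):
        if label.item() == 1 and score.item() >= score_thresh:  # COCO class 1 = person
            x1, y1, x2, y2 = box.tolist()
            centers.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0))
    return centers
```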
In a specific embodiment, the specific process of step 3) is:
3.1) normalizing the pixel coordinates of each point before undistortion to obtain the normalized coordinates of the distortion-free point;
3.2) obtaining the position of the video camera in the world coordinate system and the direction of the ray, and then deriving the corresponding ray from the distortion-free point coordinates.
Specifically, the implementation of step 3.1) is as follows:
Given the image center (cx, cy), the focal lengths fx, fy in the two directions, and the distortion coefficients k1, k2, k3, the pixel coordinates of a point before undistortion can be read from the txt file as point = [x, y].
The relationship between the coordinates before and after undistortion is:
x1 = u(1 + k1·r^2 + k2·r^4 + k3·r^6)
y1 = v(1 + k1·r^2 + k2·r^4 + k3·r^6)
where x1, y1 are the normalized coordinates of the observed (distorted) point and u, v are the distortion-free normalized coordinates.
Since r^2 = u^2 + v^2 and r is itself unknown, u and v are solved iteratively by fixed-point iteration, as follows:
1. First normalize the pixel coordinates of the point before undistortion:
x1 = (x − cx)/fx
y1 = (y − cy)/fy
2. Then take x1, y1 as the initial values of u, v.
3. Compute currs = (1 + k1·r^2 + k2·r^4 + k3·r^6) from the current values of u and v.
4. Update u = x1/currs and v = y1/currs.
5. Repeat steps 3 and 4 five times to obtain the distortion-free normalized coordinates u, v.
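A minimal sketch of this fixed-point undistortion, transcribing steps 1 to 5 above; the update in step 4 is the division by currs implied by the relation x1 = u·currs.

```python
def undistort_normalized(x, y, cx, cy, fx, fy, k1, k2, k3, iters=5):
    """Fixed-point undistortion of one detected pixel (steps 1-5 above)."""
    # Step 1: normalize the pixel coordinates of the observed point.
    x1 = (x - cx) / fx
    y1 = (y - cy) / fy
    # Step 2: initialize the distortion-free estimate with x1, y1.
    u, v = x1, y1
    # Steps 3-5: recompute the radial factor and divide it out, five times.
    for _ in range(iters):
        r2 = u * u + v * v
        currs = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
        u, v = x1 / currs, y1 / currs
    return u, v
```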
Specifically, the implementation of step 3.2) is as follows:
To obtain the ray corresponding to the distortion-free point coordinates, the position of the video camera in the world coordinate system and the direction of the ray must be solved.
Let C denote the video camera coordinate system and W the world coordinate system.
First, solve the position of the video camera in the world coordinate system.
Let the coordinates of the camera's optical center be X_W in the world coordinate system and X_C in the camera coordinate system. The transformation relations are:
X_C = R_W_C·X_W + T_W_C
X_W = R_C_W·X_C + T_C_W
where R_W_C and R_C_W are transposes of each other.
We need X_W. The optical center in the camera coordinate system is X_C = [0, 0, 0]^T, so X_W = T_C_W.
From the formulas above, the conversion relation X_W = T_C_W = −R_C_W·T_W_C is obtained.
Then determine the direction of the ray. In the camera coordinate system the direction of the ray is direction = [u, v, 1]^T; in the world coordinate system this direction is rotated, i.e. the direction becomes R_C_W·direction.
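A small sketch of this ray construction with numpy, assuming the extrinsics are supplied as the rotation R_W_C and translation T_W_C defined above:

```python
import numpy as np

def camera_ray(u, v, R_wc, T_wc):
    """Ray (origin, unit direction) in world coordinates for one
    distortion-free normalized point (u, v), given X_C = R_wc @ X_W + T_wc."""
    R_cw = R_wc.T                             # R_C_W is the transpose of R_W_C
    origin = -R_cw @ T_wc                     # optical center: X_W = -R_C_W @ T_W_C
    direction = R_cw @ np.array([u, v, 1.0])  # rotate [u, v, 1]^T into the world frame
    return origin, direction / np.linalg.norm(direction)
```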
In a specific embodiment, in step 4), n1 person heads are detected in video channel A, yielding rays ra1, ra2, …, ran1, and n2 person heads are detected in video channel B, yielding rays rb1, rb2, …, rbn2. For ray rai of channel A and ray rbj of channel B, the distance is defined as min(Dist(p, rai) + Dist(p, rbj)) subject to the constraint Dist(p, rai) = Dist(p, rbj); that is, a spatial point p is sought whose summed distance to rays rai and rbj is minimal. This minimum is defined as the distance between rays rai and rbj, and the constraint guarantees the uniqueness of p. The point p is called the minimum-distance derived point of rays rai and rbj, and if the minimum distance between rays rai and rbj is less than a given threshold (e.g. 150 cm), the derived point is considered valid.
Specifically, the minimum distance between two rays in step 4) is computed as the optimization problem
min over ta, tb of ||ra(ta) − rb(tb)||^2
where ra(t) = (xa + ma·t, ya + na·t, za + la·t) and rb(t) = (xb + mb·t, yb + nb·t, zb + lb·t) are the ray equations.
The optimization problem can be solved by differentiation: setting the derivatives with respect to ta and tb to zero yields the minimizing parameters ta*, tb*. The derived point p of the two rays is then determined as the midpoint
p = (ra(ta*) + rb(tb*)) / 2.
If ||ra(ta*) − rb(tb*)||^2 is less than the square of 150 cm, p is considered a valid minimum-distance derived point.
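Setting the two derivatives to zero gives a closed-form solution; the sketch below is one way to write it, assuming unit direction vectors and coordinates in centimeters, and treating the rays as full lines (a production version might additionally restrict ta and tb to non-negative values, since these are rays rather than lines):

```python
import numpy as np

def derived_point(oa, da, ob, db, thresh_cm=150.0):
    """Minimum-distance derived point p of two rays (origin o, unit
    direction d), or None if their closest approach exceeds the threshold."""
    w = oa - ob
    b = da @ db                      # cosine between the two directions
    d, e = da @ w, db @ w
    denom = 1.0 - b * b              # zero for (near-)parallel rays
    if denom < 1e-9:
        return None                  # parallel rays: no unique closest pair
    ta = (b * e - d) / denom         # parameters of the closest points
    tb = (e - b * d) / denom
    pa, pb = oa + ta * da, ob + tb * db
    if np.linalg.norm(pa - pb) >= thresh_cm:
        return None                  # rays pass too far apart: no valid point
    return (pa + pb) / 2.0           # midpoint of the shortest segment
```

In the embodiment below, this is evaluated for every ray pair drawn from two different video channels.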
In a specific embodiment, in step 5), the valid minimum-distance derived points are clustered. The clustering method is mean shift, the kernel function is a three-dimensional Gaussian, the covariance matrix of the Gaussian is diagonal, and the diagonal elements are 100. Of course, clustering the valid minimum-distance derived points is not limited to mean shift; other methods such as density peak clustering and hierarchical clustering can also be adopted.
Specifically, the cluster points of the valid derived points are potential spatial positions of persons. The cluster points are sorted; the sorting principle considers two factors: the height corresponding to a cluster point should lie between 1.5 and 1.85 meters, and the class corresponding to the cluster point should contain as many minimum-distance derived points as possible. For cluster points ranked first, the corresponding rays are assigned preferentially. If a later cluster point is left with only one ray, or no assignable ray, it is considered a false cluster point; false cluster points are discarded, and the remaining cluster points are the spatial positions of the persons.
The method for eliminating false cluster points is as follows. The cluster points are first divided into two categories: type I cluster points, whose height is between 1.5 and 1.85 meters, and type II cluster points, whose height is not. The type I cluster points are sorted from large to small by the number of minimum-distance derived points contained in their corresponding class. After the type I cluster points are sorted, the type II cluster points are sorted by the same principle, and all type II cluster points are ranked after the type I cluster points. Considering that each minimum-distance derived point corresponds to two rays in different video cameras, the valid derived points of the higher-ranked cluster points are processed first: the rays corresponding to those derived points are marked, and each ray may be marked at most once. For a cluster point processed later, if fewer than two of its rays remain unmarked, it is considered a false cluster point.
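A sketch of this two-type ranking and greedy ray assignment; the cluster representation (a center plus the ids of its supporting rays, two per derived point) and the use of the z coordinate as height in centimeters are assumptions of the example:

```python
def prune_clusters(clusters, min_h=150.0, max_h=185.0):
    """clusters: list of (center_xyz, ray_ids). Returns the centers kept
    as person positions after false cluster points are discarded."""
    def rank(cluster):
        center, ray_ids = cluster
        type_i = min_h <= center[2] <= max_h   # type I: plausible standing height
        # Type I before type II, each sorted by supporting ray count, descending.
        return (not type_i, -len(ray_ids))

    kept, used = [], set()
    for center, ray_ids in sorted(clusters, key=rank):
        free = [r for r in ray_ids if r not in used]
        if len(free) < 2:        # fewer than two unclaimed rays: false cluster
            continue
        used.update(free)        # each ray may be claimed at most once
        kept.append(center)
    return kept
```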
The invention described above is further detailed below with reference to a specific embodiment, whose results are shown in FIG. 2:
(1) Four channels of video image data are acquired and aligned by time point; each channel's video image is input into a deep detection network, and the head regions of persons in the images are detected.
(2) The four video cameras are calibrated. First a template is printed and pasted on a plane, several template images are shot from different angles, and their feature points are detected. The camera's internal and external parameters are then solved under the ideal distortion-free assumption and refined with maximum-likelihood estimation; the actual radial distortion coefficients are computed by least squares; finally the internal parameters, external parameters and distortion coefficients are jointly refined by the maximum-likelihood method to improve estimation accuracy, yielding the video camera's internal parameters, external parameters and distortion coefficients.
(3) Rays are obtained from the calibration information and the detection results. The optical center is determined from the camera calibration information, and rays are formed that start at the optical center and pass through the detected head centers in the image. According to the camera imaging model, the true spatial position of the head lies on the ray. To obtain the ray, the head center coordinates from image detection are normalized using the camera intrinsics and the distortion model, and the normalized coordinates are converted into a ray equation through the camera extrinsics.
(4) The minimum distances between rays derived from different video channels, and the corresponding derived points, are calculated. The four cameras are paired pairwise, giving six groups of computations (channels 1-2, 1-3, 1-4, 2-3, 2-4 and 3-4); each group's computation is as described in the summary of the invention. The minimum distance between any two rays from different video channels is computed by finding one point on each ray such that the distance between the two points is minimal; the midpoint of the two points found is the minimum-distance derived point between the rays. If the minimum distance is less than 150 cm, the derived point is called a valid point.
(5) The derived points are clustered using the mean shift method with the following settings: the valid points are weighted in three-dimensional space with a Gaussian kernel, each valid point initially serves as a class center, and the bandwidth is 150 cm; for each class center, the points less than 150 cm from the class center are found as its cluster; for each cluster, the vectors from the class center to each point in the cluster are computed and summed to give an offset vector, and the class center is moved along the offset vector to serve as the new class center; the class centers are moved iteratively in this way until convergence (the class centers no longer change); class centers less than 150 cm apart are merged, and each class center is taken as the three-dimensional position of one head.
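A minimal sketch of this mean shift variant, combining the 150 cm bandwidth with the diagonal Gaussian covariance of 100 given earlier; the iteration cap and convergence tolerance are assumptions of the example:

```python
import numpy as np

def mean_shift(points, bandwidth=150.0, var=100.0, max_iters=50, tol=1e-3):
    """Gaussian-kernel mean shift over the valid derived points (N x 3, cm).
    Returns the merged class centers, one candidate head position each."""
    pts = np.asarray(points, dtype=float)
    centers = pts.copy()                  # every valid point starts as a class center
    for _ in range(max_iters):
        moved = 0.0
        for i, c in enumerate(centers):
            d2 = np.sum((pts - c) ** 2, axis=1)
            near = d2 < bandwidth ** 2    # the cluster of this class center
            w = np.exp(-d2[near] / (2.0 * var))  # Gaussian weights, diagonal cov = 100
            new_c = (w[:, None] * pts[near]).sum(axis=0) / w.sum()
            moved = max(moved, float(np.linalg.norm(new_c - c)))
            centers[i] = new_c
        if moved < tol:                   # converged: class centers no longer change
            break
    merged = []                           # merge class centers closer than the bandwidth
    for c in centers:
        if all(np.linalg.norm(c - m) >= bandwidth for m in merged):
            merged.append(c)
    return merged
```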
(6) False cluster points are eliminated using prior information such as height and ray-matching characteristics. The reconstructed heights are grouped: a height in (150 cm, 185 cm) is considered a standing head position. Since the reprojection error is smallest in the standing state, cluster points whose height falls in this interval are sorted first. A reconstructed three-dimensional ray can belong to only one cluster point; a ray that belongs to several cluster points and whose height indicates a non-standing state is considered abnormal, and the head box corresponding to that ray is eliminated.
The invention adopts deep-learning technology to detect person heads; each head center defines a ray from the video camera's optical center, and the true head position lies on that ray. The rays of the same person from different video cameras correspond, and the intersection of rays across video channels gives the spatial coordinates of the head. The spatial coordinates are clustered, and false cluster points are eliminated using prior information such as height and the fact that each ray corresponds to only one spatial coordinate; the positions of the remaining cluster points are the spatial positions of the heads. Because only the heads are 3D-reconstructed by ray intersection, the poor real-time performance of traditional 3D reconstruction is avoided; because deep learning is used to detect heads, 3D reconstruction precision is greatly improved when only part of the target is reconstructed.
The invention also discloses a system for positioning the spatial positions of multiple persons in real time based on camera information, which comprises:
a first module for calibrating each video camera channel to obtain the camera's internal parameters, external parameters and distortion coefficients;
a second module for acquiring video images, detecting the heads of persons in the video images and calculating the center position of each head;
a third module for obtaining rays for each video channel from the camera calibration and the head center positions in the video images, each ray starting at the camera's optical center and passing through a head center position detected in the video image;
a fourth module for calculating the minimum distance between all rays from different video channels and taking the midpoint of the shortest segment connecting each pair of rays as a valid derived point;
and a fifth module for clustering the valid derived points, discarding false cluster points, and taking the remaining cluster points as the spatial positions of the persons.
This system for positioning the spatial positions of multiple persons in real time based on camera information corresponds to the above method and has the same advantages.
The invention further discloses a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program carries out the steps of the above method for positioning the spatial positions of multiple persons in real time based on camera information. The invention also discloses a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, performs the steps of the above method. All or part of the flow of the method embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. The memory may be used to store computer programs and/or modules, and the processor performs various functions by running or executing the computer programs and/or modules stored in the memory and by invoking data stored in the memory. The memory may include high-speed random access memory, and may also include non-volatile memory such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; all technical solutions within the idea of the present invention belong to its protection scope. It should be noted that modifications and refinements made by those skilled in the art without departing from the principle of the present invention shall also fall within the protection scope of the present invention.

Claims (10)

1. A method for positioning the spatial positions of multiple persons in real time based on camera information, characterized by comprising the following steps:
1) calibrating each video camera channel to obtain the camera's internal parameters, external parameters and distortion coefficients;
2) acquiring video images, detecting the heads of persons in the video images, and calculating the center position of each head;
3) obtaining rays for each video channel from the camera calibration and the head center positions in the video images; each ray starts at the camera's optical center and passes through a head center position detected in the video image;
4) calculating the minimum distance between all rays from different video channels, and taking the midpoint of the shortest segment connecting each pair of rays as a valid derived point;
5) clustering the valid derived points, discarding false cluster points, and taking the remaining cluster points as the spatial positions of the persons.
2. The method for positioning the spatial positions of multiple persons in real time based on camera information according to claim 1, wherein the specific process of step 3) is:
3.1) normalizing the pixel coordinates of each point before undistortion to obtain the normalized coordinates of the distortion-free point;
3.2) obtaining the position of the video camera in the world coordinate system and the direction of the ray, and then deriving the corresponding ray from the distortion-free point coordinates.
3. The method for positioning the spatial positions of multiple persons in real time based on camera information according to claim 1, wherein in step 4), n1 person heads are detected in video channel A, yielding rays ra1, ra2, …, ran1, and n2 person heads are detected in video channel B, yielding rays rb1, rb2, …, rbn2; for ray rai of channel A and ray rbj of channel B, the distance is defined as min(Dist(p, rai) + Dist(p, rbj)) subject to the constraint Dist(p, rai) = Dist(p, rbj), that is, a spatial point p is sought whose summed distance to rays rai and rbj is minimal; this minimum is defined as the distance between rays rai and rbj, and the constraint guarantees the uniqueness of p; the point p is called the minimum-distance derived point of rays rai and rbj; if the minimum distance between rays rai and rbj is less than a given threshold, p is considered a valid minimum-distance derived point.
4. The method for positioning the spatial positions of multiple persons in real time based on camera information according to any one of claims 1 to 3, wherein in step 5), the cluster points of the valid derived points are potential spatial positions of persons; the cluster points are sorted, and the corresponding rays are preferentially assigned to the cluster points ranked first; if a later cluster point is left with only one ray, or no assignable ray, it is considered a false cluster point; false cluster points are discarded, and the remaining cluster points are the spatial positions of the persons.
5. The method for positioning the spatial positions of multiple persons in real time based on camera information according to claim 4, wherein the sorting principle considers two factors: the height corresponding to a cluster point should lie between 1.5 and 1.85 meters, and the class corresponding to the cluster point should contain as many minimum-distance derived points as possible.
6. The method for positioning the spatial positions of multiple persons in real time based on camera information according to claim 5, wherein in step 5), the clustering adopts the mean shift method, the density peak method or hierarchical clustering.
7. The method for positioning the spatial positions of multiple persons in real time based on camera information according to claim 6, wherein the process of clustering by the mean shift method is:
the valid points are weighted in three-dimensional space with a Gaussian kernel function, and each valid point initially serves as a class center; for each class center, the points whose distance to the class center is smaller than a preset bandwidth are found and form its cluster; for each cluster, the vectors from the class center to each point in the cluster are computed and summed to give an offset vector, and the class center is moved along the offset vector to serve as the new class center; the class centers are moved iteratively in this way until convergence; class centers whose mutual distance is smaller than the preset bandwidth are merged, and each remaining class center is taken as the three-dimensional position of one head.
8. A system for positioning the spatial positions of multiple persons in real time based on camera information, characterized by comprising:
a first module for calibrating each video camera channel to obtain the camera's internal parameters, external parameters and distortion coefficients;
a second module for acquiring video images, detecting the heads of persons in the video images and calculating the center position of each head;
a third module for obtaining rays for each video channel from the camera calibration and the head center positions in the video images, each ray starting at the camera's optical center and passing through a head center position detected in the video image;
a fourth module for calculating the minimum distance between all rays from different video channels and taking the midpoint of the shortest segment connecting each pair of rays as a valid derived point;
and a fifth module for clustering the valid derived points, discarding false cluster points, and taking the remaining cluster points as the spatial positions of the persons.
9. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the computer program carries out the steps of the method for positioning the spatial positions of multiple persons in real time based on camera information according to any one of claims 1 to 7.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the computer program, when executed by the processor, performs the steps of the method for positioning the spatial positions of multiple persons in real time based on camera information according to any one of claims 1 to 7.
CN202110931717.8A 2021-08-13 2021-08-13 Method and system for positioning spatial positions of multiple persons in real time based on camera shooting information Active CN113705388B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110931717.8A CN113705388B (en) 2021-08-13 2021-08-13 Method and system for positioning spatial positions of multiple persons in real time based on camera shooting information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110931717.8A CN113705388B (en) 2021-08-13 2021-08-13 Method and system for positioning spatial positions of multiple persons in real time based on camera shooting information

Publications (2)

Publication Number Publication Date
CN113705388A 2021-11-26
CN113705388B (en) 2024-01-12

Family

ID=78652652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110931717.8A Active CN113705388B (en) 2021-08-13 2021-08-13 Method and system for positioning spatial positions of multiple persons in real time based on camera shooting information

Country Status (1)

Country Link
CN (1) CN113705388B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140327780A1 (en) * 2011-11-29 2014-11-06 Xovis Ag Method and device for monitoring a monitoring region
CN105700029A (en) * 2016-01-22 2016-06-22 清华大学 Method, device and system for inspecting object based on cosmic ray
CN107436969A (en) * 2017-07-03 2017-12-05 四川大学 A kind of three-dimensional multi-target orientation method based on genetic algorithm
CN107767420A (en) * 2017-08-16 2018-03-06 华中科技大学无锡研究院 A kind of scaling method of underwater stereoscopic vision system
CN108263389A (en) * 2018-01-26 2018-07-10 深圳市九洲源科技有限公司 A kind of vehicle front false target device for eliminating and method
CN110458940A (en) * 2019-07-24 2019-11-15 兰州未来新影文化科技集团有限责任公司 The processing method and processing unit of motion capture
CN111028271A (en) * 2019-12-06 2020-04-17 浩云科技股份有限公司 Multi-camera personnel three-dimensional positioning and tracking system based on human skeleton detection
CN111080712A (en) * 2019-12-06 2020-04-28 浩云科技股份有限公司 Multi-camera personnel positioning, tracking and displaying method based on human body skeleton detection
CN111079859A (en) * 2019-12-31 2020-04-28 哈尔滨工程大学 Passive multi-station multi-target direction finding cross positioning and false point removing method
CN111598001A (en) * 2020-05-18 2020-08-28 哈尔滨理工大学 Apple tree pest and disease identification method based on image processing
CN112381025A (en) * 2020-11-23 2021-02-19 恒大新能源汽车投资控股集团有限公司 Driver attention detection method and device, electronic equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LAURA LEAL-TAIXE: "Branch-and-price global optimization for multi-view multi-target", 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 1987-1994
MANISH KUSHWAHA et al.: "Collaborative 3D target tracking in distributed smart camera networks for wide-area surveillance", Journal of Sensor and Actuator Networks, vol. 2, no. 2, pages 316-353
MURTAZA TAJ et al.: "Multi-view multi-object detection and tracking", Computer Vision, pages 263-280
杨志国 et al.: "Improved clustering algorithm for SAR target detection", College of Electronic Science and Engineering, National University of Defense Technology, vol. 13, no. 11, pages 2132-2138
赵倩: "Research on key technologies of multi-target tracking in a multi-view environment", China Masters' Theses Full-text Database (Information Science and Technology), no. 2019, pages 138-605

Also Published As

Publication number Publication date
CN113705388B (en) 2024-01-12

Similar Documents

Publication Publication Date Title
CN111462200B (en) Cross-video pedestrian positioning and tracking method, system and equipment
CN106709950B (en) Binocular vision-based inspection robot obstacle crossing wire positioning method
US11763485B1 (en) Deep learning based robot target recognition and motion detection method, storage medium and apparatus
Spreeuwers Fast and accurate 3D face recognition: using registration to an intrinsic coordinate system and fusion of multiple region classifiers
CN109190508B (en) Multi-camera data fusion method based on space coordinate system
JP6448223B2 (en) Image recognition system, image recognition apparatus, image recognition method, and computer program
CN110378931A (en) A kind of pedestrian target motion track acquisition methods and system based on multi-cam
CN107578376B (en) Image splicing method based on feature point clustering four-way division and local transformation matrix
Tang et al. ESTHER: Joint camera self-calibration and automatic radial distortion correction from tracking of walking humans
CN110084243A (en) It is a kind of based on the archives of two dimensional code and monocular camera identification and localization method
CN108257155A (en) A kind of extension target tenacious tracking point extracting method based on part and Global-Coupling
CN110267101A (en) A kind of unmanned plane video based on quick three-dimensional picture mosaic takes out frame method automatically
CN114898353B (en) License plate recognition method based on video sequence image characteristics and information
CN113255608A (en) Multi-camera face recognition positioning method based on CNN classification
CN115239882A (en) Crop three-dimensional reconstruction method based on low-light image enhancement
CN110458019B (en) Water surface target detection method for eliminating reflection interference under scarce cognitive sample condition
CN110909617B (en) Living body face detection method and device based on binocular vision
Theiner et al. Tvcalib: Camera calibration for sports field registration in soccer
CN112418250A (en) Optimized matching method for complex 3D point cloud
CN117036404A (en) Monocular thermal imaging simultaneous positioning and mapping method and system
CN112131984A (en) Video clipping method, electronic device and computer-readable storage medium
CN114998532B (en) Three-dimensional image visual transmission optimization method based on digital image reconstruction
CN114399731B (en) Target positioning method under supervision of single coarse point
CN113705388A (en) Method and system for positioning space positions of multiple persons in real time based on camera information
Abdel-Wahab et al. Efficient reconstruction of large unordered image datasets for high accuracy photogrammetric applications

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant