WO2022227462A1 - Positioning method and apparatus, electronic device, and storage medium - Google Patents


Info

Publication number
WO2022227462A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
target object
position coordinates
initial position
video
Prior art date
Application number
PCT/CN2021/127625
Other languages
French (fr)
Chinese (zh)
Inventor
关英妲
刘文韬
钱晨
Original Assignee
北京市商汤科技开发有限公司
Priority date
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Publication of WO2022227462A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06T5/80
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • The present disclosure relates to the field of computer vision technology, and in particular to a positioning method and apparatus, an electronic device, and a storage medium.
  • Computer vision, as one of the key technologies, is widely used.
  • Positioning technology based on computer vision can locate a target object in a target place under different scenarios and determine the position of the target object in the target place.
  • The position of the target object in an image of the target place collected by a camera can be determined first, and the position of the target object in the target place can then be determined from it, so as to track the target object in the target place.
  • the embodiments of the present disclosure provide at least one positioning solution.
  • an embodiment of the present disclosure provides a positioning method, including:
  • the initial position coordinates of the same target object in the target objects are fused to obtain the target position coordinates of the target object in the target place.
  • the initial position coordinates of the target object in different video pictures can be determined from the video pictures collected at the same time by a plurality of acquisition devices with different acquisition perspectives set in the target place, and the initial position coordinates of the same target object in different video pictures can then be fused to determine the target position coordinates of the target object in the target place. In this way, on the one hand, target objects can be comprehensively located in a target place that is large and/or complex; on the other hand, target position coordinates of higher accuracy can be obtained for the same target object.
  • the initial position coordinates of the target object in the target place are respectively determined, including:
  • the initial position coordinates, in the world coordinate system corresponding to the target place, of at least one of the target objects collected by the collection device are determined based on the pixel coordinates of the at least one target object in the video picture collected by the collection device and the parameter information of the collection device.
  • the pixel coordinates of the target object in the video picture can be determined first, and the initial position coordinates of the target object in the target place can then be obtained according to the parameter information of the acquisition device, which prepares for the subsequent determination of the target position coordinates of the target object in the target place.
  • acquiring the pixel coordinates of the target object in multiple video frames includes:
  • the neural network includes a plurality of target detection sub-networks for detecting target objects of different sizes;
  • the pixel coordinates of the target position point on the detection frame of the target object in each video picture are extracted to obtain the pixel coordinates of the target object in that video picture.
  • the neural network includes a plurality of target detection sub-networks for detecting target objects of different sizes, so that when the target objects in a video picture are detected by the neural network, target objects of different sizes in the same video picture can be accurately detected.
  • determining, based on the pixel coordinates of at least one of the target objects in the video picture collected by the collection device and the parameter information of the collection device, the initial position coordinates of the at least one target object in the world coordinate system corresponding to the target place includes:
  • the pixel coordinates of at least one of the target objects in the video picture collected by the acquisition device are corrected to obtain the corrected pixel coordinates of the at least one target object in the video picture;
  • the pixel coordinates are first corrected based on the internal parameter matrix and the distortion coefficients of the capture device that captures the video picture, so that corrected pixel coordinates of higher accuracy are obtained, and initial position coordinates of higher accuracy are in turn obtained for the target object in the target place.
  • the initial position coordinates of the same target object in the target objects are fused to obtain the target position coordinates of the target object in the target place, including:
  • the plurality of initial position coordinates associated with the target object are sequentially fused to obtain the target position coordinates of the target object in the target place.
  • the initial position coordinates of the same target object collected by multiple collection devices can be fused, so that target position coordinates of higher accuracy can be obtained for the same target object.
  • the multiple initial position coordinates associated with the target object are sequentially fused to obtain the target position coordinates of the target object in the target place, including:
  • the first intermediate fusion position coordinates are fused with any other initial position coordinates to be fused among the plurality of initial position coordinates to generate second intermediate fusion position coordinates; the second intermediate fusion position coordinates are used as the updated first intermediate fusion position coordinates, and the step of generating the second intermediate fusion position coordinates is repeated until no initial position coordinates to be fused remain among the plurality of initial position coordinates.
  • the first intermediate fusion position coordinate is fused with any other initial position coordinate to be fused among the plurality of initial position coordinates to generate a second intermediate fusion position coordinate, including:
  • multiple initial position coordinates associated with the same target object may be fused in a manner of taking midpoints in sequence, so as to obtain target position coordinates with higher accuracy.
  • determining a plurality of initial position coordinates associated with the same target object in the target objects including:
  • for any two video pictures, the target object in the first of the two video pictures is determined as a first target object, and the target object in the second of the two video pictures is determined as a second target object;
  • the initial position coordinates of each first target object in the first video picture and the initial position coordinates of each second target object in the second video picture are determined; for the initial position coordinates of each first target object, the distance between the initial position coordinates of that first target object and the initial position coordinates of each second target object is determined;
  • the second target object with the minimum distance from the first target object is determined to be the same target object as the first target object, where the minimum distance is less than a preset fusion distance threshold; the initial position coordinates of the first target object and the initial position coordinates of the second target object with the minimum distance from the first target object are taken as the plurality of initial position coordinates associated with the same target object among the target objects.
  • the initial position coordinates associated with the same target object can be quickly determined, which provides a basis for subsequently determining the target position coordinates of each target object.
  • the positioning method further includes:
  • an early warning prompt is performed.
  • based on the high-accuracy target position coordinates of each target object in the target place, it can be determined whether a target object in the target place has entered a preset target area, such as a preset dangerous area, so that a warning prompt can be issued in time and the safety of the target place is improved.
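The area check described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the preset target area is an axis-aligned rectangle on the ground plane of the world coordinate system and that target position coordinates are (x, y) pairs; the names `is_in_area`, `objects_to_warn`, and the object identifiers are hypothetical and do not come from the disclosure.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
# Hypothetical area layout: (x_min, y_min, x_max, y_max) in world coordinates.
Area = Tuple[float, float, float, float]


def is_in_area(position: Point, area: Area) -> bool:
    """Return True if the position lies inside or on the border of the area."""
    x, y = position
    x_min, y_min, x_max, y_max = area
    return x_min <= x <= x_max and y_min <= y <= y_max


def objects_to_warn(target_positions: Dict[str, Point], area: Area) -> List[str]:
    """Return the sorted identifiers of target objects located in the area."""
    return sorted(
        object_id
        for object_id, position in target_positions.items()
        if is_in_area(position, area)
    )


# Example: one of two objects stands inside an assumed dangerous area.
danger_area = (0.0, 0.0, 2.0, 3.0)
positions = {"worker_1": (1.0, 1.5), "worker_2": (4.0, 1.0)}
warned = objects_to_warn(positions, danger_area)
```

A real deployment would likely need arbitrary polygonal areas; the rectangle is only the simplest case that shows where the fused target position coordinates are consumed.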
  • an embodiment of the present disclosure provides a positioning device, including:
  • an acquisition module, configured to acquire multiple video pictures collected at the same time by multiple collection devices set in the target place, where different collection devices have different collection perspectives in the target place, the multiple video pictures include a target object, and the target object is an object to be located in the target place;
  • a determining module, configured to respectively determine the initial position coordinates of the target object in the target place based on the multiple video pictures;
  • a fusion module, configured to fuse the initial position coordinates of the same target object among the target objects to obtain the target position coordinates of the target object in the target place.
  • embodiments of the present disclosure provide an electronic device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the positioning method according to the first aspect.
  • an embodiment of the present disclosure provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the positioning method according to the first aspect are executed.
  • an embodiment of the present disclosure provides a computer program product, where the computer program product includes a computer program stored on a storage medium, and when the computer program is executed by a processor, the steps of the positioning method according to the first aspect are executed.
  • FIG. 1 shows a flowchart of a positioning method provided by an embodiment of the present disclosure
  • FIG. 2 shows a flowchart of a method for determining the initial position coordinates of a target object provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a target object detected in a video picture provided by an embodiment of the present disclosure
  • FIG. 4 shows a flowchart of a method for determining target position coordinates of a target object provided by an embodiment of the present disclosure
  • FIG. 5 shows a flowchart of a method for early warning provided by an embodiment of the present disclosure
  • FIG. 6 shows a schematic structural diagram of a positioning device provided by an embodiment of the present disclosure
  • FIG. 7 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • In many application scenarios, it is necessary to locate target objects in a place. For example, in a factory, it is necessary to detect whether employees are working in designated locations or whether they have entered a dangerous area. In a shopping mall, the distribution of customer flow can be detected by locating customers. In the process of locating target objects in a place, the positions of the target objects can be determined from images collected by a plurality of cameras. However, for target places that are complex and large, it may not be possible to capture all the target objects when locating them based on multiple cameras, so the positioning is incomplete; there may also be occluded areas, and target objects in these occluded areas cannot be located.
  • the present disclosure provides a positioning method, which determines the initial position coordinates of the target object in different video pictures from the video pictures collected at the same time by a plurality of acquisition devices with different acquisition perspectives set in the target place, and then fuses the initial position coordinates of the same target object in different video pictures to determine the target position coordinates of the target object in the target place.
  • the execution subject of the positioning method provided by the embodiment of the present disclosure is a computer device with computing capability, for example a server or other processing device.
  • the positioning method may be implemented by the processor invoking computer-readable instructions stored in the memory.
  • the positioning method includes the following S101-S103:
  • S101 acquiring video images collected at the same time by multiple collection devices set in the target site; wherein, different collection devices have different collection perspectives in the target site, and the video images include target objects.
  • the target place may be the place corresponding to the application scenario.
  • for example, when employees in a factory need to be located, the target place may be the factory;
  • when customers in a shopping mall need to be located, the target place may be the shopping mall; when athletes in a gymnasium need to be located, the target place may be the gymnasium.
  • the target objects are objects to be located in the target location, such as the aforementioned employees, customers and athletes.
  • the collection device may be a monocular camera or a binocular camera, and multiple collection devices may be set in the target site.
  • the installation positions of multiple collection devices can be determined according to the actual site of the target site.
  • the acquisition angles of the acquisition devices in the target place may differ, so that the entire area of the target place is covered without blind spots.
  • however, too many capture devices produce too many video pictures at the same time, which slows the processing of the video pictures. Therefore, when installing the acquisition devices in the target place, both the installation angles and the number of acquisition devices need to be considered.
  • each target object entering the target place can be captured by two acquisition devices at the same time, so that the multiple acquisition devices set in the target place can completely capture video pictures of the entire area of the target place.
  • S102 Determine the initial position coordinates of the target object in the target place based on the video images collected by multiple collection devices at the same time.
  • target detection can then be performed on the video pictures collected by the multiple collection devices at the same time, and the initial position coordinates of the target objects in the different video pictures, in the world coordinate system corresponding to the target place, can be determined.
  • the initial position coordinates of the target object in the video picture can be determined based on the detected pixel coordinates of the target object in the video picture and the parameter information of the acquisition device that collects the video picture.
  • the world coordinate system corresponding to the target place may be predetermined. For example, take the center point of the ground of the target place as the origin of the world coordinate system, take the direction passing through the origin and perpendicular to the ground as the Z-axis direction, take a direction on the ground of the target place passing through the origin as the X-axis direction, and take the direction on the ground of the target place that passes through the origin and is perpendicular to the X-axis as the Y-axis direction.
  • the initial position coordinates of the same target object can be fused to obtain the target position coordinates of the same target object in the world coordinate system corresponding to the target location.
  • the initial position coordinates of the target object in different video pictures can be determined from the video pictures collected at the same time by a plurality of acquisition devices with different acquisition perspectives set in the target place, and the initial position coordinates of the same target object in different video pictures can then be fused to determine the target position coordinates of the target object in the target place. In this way, on the one hand, target objects can be comprehensively located in a target place that is large and/or complex; on the other hand, target position coordinates of higher accuracy can be obtained for the same target object.
  • S201 Acquire pixel coordinates of a target object in a video image separately collected by multiple collection devices at the same time.
  • the target object in the video picture can be identified based on a pre-trained neural network for target detection, and the pixel coordinates of the set position point in the target object in the image coordinate system corresponding to the video picture can be read, The pixel coordinates corresponding to the set position point are taken as the pixel coordinates of the target object.
  • S2011 inputting a plurality of video frames into a pre-trained neural network to obtain a detection frame of a target object in each video frame; wherein, the neural network includes a plurality of target detection sub-networks for detecting target objects of different sizes;
  • the neural network can detect each target object contained in the video picture, and mark the detection frame of each target object.
  • FIG. 3 is a schematic diagram of the detection frames of target objects included in a video picture.
  • the video picture contains the detection frames of two target objects: the detection frame A1B1C1D1 of target object 1 and the detection frame A2B2C2D2 of target object 2.
  • a position point on the detection frame of each target object can be extracted as the target position point; for example, the midpoint of the bottom edge of the detection frame is extracted as the target position point.
  • the pixel coordinates of the target object 1 are represented by the pixel coordinates of the midpoint K1 of the bottom edge D1C1 of the detection frame A1B1C1D1
  • the pixel coordinates of the target object 2 are represented by the pixel coordinates of the midpoint position K2 of the bottom edge D2C2 of the detection frame A2B2C2D2.
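The extraction of the target position point can be sketched as follows. The sketch assumes an axis-aligned detection frame given as (x_min, y_min, x_max, y_max) in an image coordinate system whose y-axis points downward, so that the bottom edge (D1C1 in FIG. 3) lies at y_max; the function name `bottom_midpoint` and the box layout are illustrative assumptions, not part of the disclosure.

```python
from typing import Tuple


def bottom_midpoint(
    x_min: float, y_min: float, x_max: float, y_max: float
) -> Tuple[float, float]:
    """Return the pixel coordinates of the midpoint of the bottom edge.

    Assumes image coordinates with x to the right and y downward, so the
    bottom edge of the detection frame is the edge with y == y_max.
    """
    if x_max < x_min or y_max < y_min:
        raise ValueError("detection frame corners are not ordered")
    return ((x_min + x_max) / 2.0, y_max)


# Example: a detection frame spanning x in [100, 180] and y in [50, 250].
k1 = bottom_midpoint(100.0, 50.0, 180.0, 250.0)
```

The bottom-edge midpoint is a natural choice for a standing person because it approximates the point where the object touches the ground plane.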
  • the neural network used in the embodiments of the present disclosure may include multiple target detection sub-networks for detecting target objects of different sizes.
  • for example, the neural network can be a feature pyramid network.
  • each target detection sub-network in the feature pyramid network is used to detect and identify, in the video picture, target objects of the size corresponding to that sub-network.
  • in this way, target objects of different sizes in the same video picture can be accurately detected.
  • the neural network includes a plurality of target detection sub-networks for detecting target objects of different sizes. In this way, when the target object in the video picture is detected by the neural network, the target objects of different sizes in the same video picture can be accurately detected.
  • the parameter information of each acquisition device may include a homography matrix of the acquisition device, where the homography matrix represents the transformation relationship between the image coordinate system corresponding to the video picture acquired by the acquisition device and the world coordinate system corresponding to the target place where the acquisition device is located. In this way, after the pixel coordinates of the target object in the image coordinate system corresponding to the video picture are obtained, the initial position coordinates of the target object in the world coordinate system corresponding to the target place can be determined according to the parameter information of the acquisition device.
  • the world coordinate system corresponding to the target site may take a fixed position in the target site as the coordinate origin to establish a unique world coordinate system.
  • the pixel coordinates of the target object in the video picture can be determined first, and the initial position coordinates of the target object in the target place can then be obtained according to the parameter information of the acquisition device, which prepares for the subsequent determination of the target position coordinates of the target object in the target place.
  • determining the initial position coordinates in the world coordinate system includes the following S2021 to S2022:
  • the internal parameter matrix of the acquisition device contains (f_x, f_y) and (c_x, c_y), where (f_x, f_y) represents the focal length of the capture device, and (c_x, c_y) represents the pixel coordinates, in the image coordinate system, of the center point of the video picture captured by the capture device.
  • the distortion parameters of the acquisition device include radial distortion parameters and tangential distortion coefficients.
  • the internal parameter matrix and distortion parameters of each acquisition device may be predetermined in the manner of Zhang Zhengyou's chessboard calibration. For example, multiple checkerboard images can be taken from different angles to detect feature points in the images. According to the pixel coordinates of these feature points in the checkerboard image, the internal parameter matrix and distortion parameters of the acquisition device are solved, and then the internal parameter matrix and distortion parameters are continuously optimized. In the optimization process, the same pixel coordinates can be corrected according to the internal parameter matrix and distortion parameters obtained twice adjacently. Whether to end the optimization is determined by the difference between the two corrected pixel coordinates before and after, for example, after the difference is no longer reduced, the optimization can be ended to obtain the internal parameter matrix and distortion parameters of the acquisition device.
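The correction of pixel coordinates with the internal parameters and distortion parameters can be sketched as below. The disclosure does not state which distortion model is used; this sketch assumes the common Brown-Conrady model with two radial coefficients (k1, k2) and two tangential coefficients (p1, p2), and inverts it by fixed-point iteration. All function and parameter names are illustrative assumptions.

```python
from typing import Tuple


def _distortion_terms(x, y, k1, k2, p1, p2):
    """Radial factor and tangential offsets at normalized point (x, y)."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return radial, dx, dy


def distort_normalized(x, y, k1, k2, p1, p2) -> Tuple[float, float]:
    """Forward model: ideal normalized coordinates -> distorted ones."""
    radial, dx, dy = _distortion_terms(x, y, k1, k2, p1, p2)
    return x * radial + dx, y * radial + dy


def correct_pixel(u, v, fx, fy, cx, cy, k1, k2, p1, p2,
                  iterations: int = 20) -> Tuple[float, float]:
    """Return corrected (undistorted) pixel coordinates for pixel (u, v)."""
    # Normalize with the intrinsic parameters (f_x, f_y) and (c_x, c_y).
    xd = (u - cx) / fx
    yd = (v - cy) / fy
    x, y = xd, yd  # initial guess: the distorted point itself
    for _ in range(iterations):
        # Fixed-point update: undo the distortion evaluated at the estimate.
        radial, dx, dy = _distortion_terms(x, y, k1, k2, p1, p2)
        x = (xd - dx) / radial
        y = (yd - dy) / radial
    # Map back to pixel coordinates.
    return fx * x + cx, fy * y + cy
```

The fixed-point iteration converges quickly for the mild distortion typical of surveillance lenses; strongly distorted (for example fisheye) lenses would need a different model.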
  • the homography matrix may represent the conversion relationship between the image coordinate system corresponding to the video picture captured by the capture device and the world coordinate system corresponding to the target location where the capture device is located.
  • the homography matrix can be determined when the acquisition device is calibrated in advance. For example, a sample video image with multiple markers can be collected by a collection device, and the intersection of the multiple markers and the ground (the plane where the X and Y axes of the world coordinate system are located) is in the world coordinate system corresponding to the target site.
  • the initial position coordinates of the target object in the video picture can be obtained according to the corrected pixel coordinates of the target object in the video picture and the homography matrix of the acquisition device that collects the video picture.
  • the pixel coordinates are first corrected based on the internal parameter matrix and the distortion coefficients of the capture device that captures the video picture, so that corrected pixel coordinates of higher accuracy are obtained, and initial position coordinates of higher accuracy are in turn obtained for the target object in the target place.
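Mapping corrected pixel coordinates to the ground plane with a homography matrix can be sketched as follows. The sketch assumes a 3 × 3 homography, stored as nested lists, that maps homogeneous pixel coordinates to homogeneous coordinates on the X-Y plane of the world coordinate system; the function name `pixel_to_ground` and the example matrix values are illustrative assumptions.

```python
from typing import List, Tuple


def pixel_to_ground(
    h: List[List[float]], u: float, v: float
) -> Tuple[float, float]:
    """Apply homography h to pixel (u, v) and return world (X, Y)."""
    # Homogeneous multiplication: [xw, yw, w]^T = h @ [u, v, 1]^T
    xw = h[0][0] * u + h[0][1] * v + h[0][2]
    yw = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    # Divide by the homogeneous scale to get ground-plane coordinates.
    return xw / w, yw / w


# Example with an arbitrary, made-up homography.
example_h = [
    [0.5, 0.0, -10.0],
    [0.0, 0.25, 5.0],
    [0.0, 0.0, 2.0],
]
initial_position = pixel_to_ground(example_h, 100.0, 40.0)
```

Because the target position point is taken on the ground (the bottom edge of the detection frame), a plane-to-plane homography is sufficient and no depth estimate is needed.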
  • S301 Determine a plurality of initial position coordinates associated with the same target object based on the initial position coordinates of the target object determined based on the plurality of video images.
  • each target object is captured by at least two capture devices at the same time. When a target object is captured by different capture devices at the same time, the parameter information of each capture device contains a certain error, and the errors differ between capture devices. Therefore, the initial position coordinates of the same target object determined from different video pictures may differ. Before the initial position coordinates of the same target object are fused, the multiple initial position coordinates associated with that target object need to be determined.
  • for the multiple initial position coordinates associated with the same target object, the first two may be fused first to obtain fused position coordinates. The fused position coordinates are then fused with the third initial position coordinates, and so on until the last initial position coordinates have been fused; the final fused position coordinates are used as the target position coordinates of the same target object.
  • the initial position coordinates of the same target object collected by multiple collection devices can be fused, so that target position coordinates of higher accuracy can be obtained for the same target object.
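The sequential fusion described above, combined with the midpoint rule mentioned earlier in the disclosure, can be sketched as follows. The function name `fuse_sequentially` and the use of 2D (x, y) tuples are illustrative assumptions.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def fuse_sequentially(initial_coords: Sequence[Point]) -> Point:
    """Fuse initial position coordinates by taking midpoints in sequence.

    The first coordinate is the first intermediate fused position; it is
    then repeatedly replaced by its midpoint with the next coordinate
    until no coordinates to be fused remain.
    """
    if not initial_coords:
        raise ValueError("at least one initial position coordinate is required")
    fused = (float(initial_coords[0][0]), float(initial_coords[0][1]))
    for x, y in initial_coords[1:]:
        fused = ((fused[0] + x) / 2.0, (fused[1] + y) / 2.0)
    return fused


# Example: three views of the same object.
target_position = fuse_sequentially([(0.0, 0.0), (2.0, 2.0), (5.0, 3.0)])
```

Note that taking midpoints in sequence weights later coordinates more heavily than earlier ones, so the result is generally not the arithmetic mean of all inputs and depends on the fusion order.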
  • S3011 for any two video pictures among the plurality of video pictures, determine the target object in the first of the two video pictures as a first target object and the target object in the second of the two video pictures as a second target object, and for the initial position coordinates of each first target object, determine the distance between the initial position coordinates of that first target object and the initial position coordinates of each second target object in the second video picture;
  • S3012 Determine that the second target object with the minimum distance from the first target object and the first target object are the same target object, where the minimum distance is less than a preset fusion distance threshold; take the initial position coordinates of the first target object and the initial position coordinates of the second target object with the minimum distance from the first target object as the multiple initial position coordinates associated with the same target object among the target objects.
  • A collection devices are set up in the target place. Assuming that the video pictures captured by the A collection devices at the same time all contain at least one target object, A groups of initial position coordinates can be obtained at this moment, and the A groups of initial position coordinates constitute the initial coordinate set S = {S1, S2, S3, ..., SA}.
  • S1, S2, S3, ..., SA represent, in order, the initial position coordinates of the target objects in the video pictures shot by the first acquisition device, the second acquisition device, the third acquisition device, through the A-th acquisition device of the A acquisition devices.
  • the following takes the video pictures captured at the same time by the first capture device and the second capture device as the two video pictures, as an example of how to determine the multiple initial position coordinates associated with the same target object:
  • S1 includes $a$ initial position coordinates (also referred to as first initial position coordinates) of first target objects;
  • S2 includes $b$ initial position coordinates (also referred to as second initial position coordinates) of second target objects.
  • the Euclidean distance between each first initial position coordinate and each second initial position coordinate can be determined to obtain the distance matrix, where $d_{ij}$ denotes the distance between the $i$-th first initial position coordinate and the $j$-th second initial position coordinate:

$$\begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1b} \\ d_{21} & d_{22} & \cdots & d_{2b} \\ \vdots & \vdots & \ddots & \vdots \\ d_{a1} & d_{a2} & \cdots & d_{ab} \end{bmatrix}$$
  • multiple initial position coordinates associated with the same target object in S1 and S2 can be determined in the following manner, including S30121 to S30124:
  • the elements in the current distance matrix include the Euclidean distance between each first initial position coordinate in S1 and each second initial position coordinate in S2.
  • S30122 Determine whether the current minimum distance is less than a preset fusion distance threshold.
  • the preset fusion distance threshold may be set empirically. For example, the same target object is photographed by different collection devices in advance, and multiple position coordinates of the same target object in the target place are then determined from the video pictures collected by the different collection devices.
  • the preset fusion distance threshold is determined according to distances between a plurality of position coordinates.
  • the a-th first initial position coordinate in S1 and the first second initial position coordinate in S2 can be regarded as the same as the The initial position coordinates associated with the target object.
  • For example, the current distance matrix is calculated from the initial position coordinates in S1 and S2 and is a 3 × 3 matrix with elements d_11 through d_33.
  • The preset fusion distance threshold is d_th. Assuming that d_11 is the minimum distance in the current matrix and is less than d_th, the first first initial position coordinate in S1 and the first second initial position coordinate in S2 are initial position coordinates associated with the same target object. In the current distance matrix, all other distances calculated from either of these two initial position coordinates are d_12, d_13, d_21, and d_31. Therefore, according to S30124, d_11, d_12, d_13, d_21, and d_31 in the current matrix are set to d_th, so that only d_22, d_23, d_32, and d_33 keep their original values.
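Written out, the 3 × 3 example reads as follows. Here d_{ij} is the Euclidean distance between the i-th first initial position coordinate in S1 and the j-th second initial position coordinate in S2; the symbols D and D' are introduced only for this illustration and do not appear in the source.

```latex
D =
\begin{pmatrix}
d_{11} & d_{12} & d_{13} \\
d_{21} & d_{22} & d_{23} \\
d_{31} & d_{32} & d_{33}
\end{pmatrix}
\quad\xrightarrow{\; d_{11} = \min D,\ d_{11} < d_{th} \;}\quad
D' =
\begin{pmatrix}
d_{th} & d_{th} & d_{th} \\
d_{th} & d_{22} & d_{23} \\
d_{th} & d_{32} & d_{33}
\end{pmatrix}
```

The next search for a minimum below d_th can therefore only select one of d_22, d_23, d_32, d_33.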
  • When the current minimum distance is searched for again, the elements set to the preset fusion distance threshold can be excluded, which improves search efficiency.
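The association procedure of S30121 to S30124 can be sketched as below. This is a minimal illustration, not the applicant's implementation: the function name `associate`, the use of 2D tuples for coordinates, and the reading of the steps (find the minimum, compare with the threshold, record the pair, overwrite the affected row and column with the threshold, repeat) are assumptions inferred from the worked example above.

```python
import math


def associate(first_coords, second_coords, d_th):
    """Greedily pair coordinates from two views that likely belong to the same target.

    Returns a list of (i, j) index pairs into first_coords and second_coords.
    """
    # Current distance matrix: dist[i][j] is the Euclidean distance between
    # the i-th first coordinate and the j-th second coordinate.
    dist = [[math.dist(p, q) for q in second_coords] for p in first_coords]
    pairs = []
    while dist and dist[0]:
        # Find the current minimum distance and its position.
        d_min, i, j = min(
            (dist[r][c], r, c)
            for r in range(len(first_coords))
            for c in range(len(second_coords))
        )
        # Stop once the minimum is no longer below the fusion threshold.
        if d_min >= d_th:
            break
        # The two coordinates are treated as the same target object.
        pairs.append((i, j))
        # Set every distance computed from either coordinate to the threshold,
        # so neither coordinate can be selected again.
        for c in range(len(second_coords)):
            dist[i][c] = d_th
        for r in range(len(first_coords)):
            dist[r][j] = d_th
    return pairs


s1 = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
s2 = [(0.2, 0.1), (5.3, 5.0), (20.0, 20.0)]
print(associate(s1, s2, d_th=1.0))  # [(0, 0), (1, 1)]
```

The third coordinate in each list has no counterpart within the threshold, so it stays unpaired.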
  • After the multiple initial position coordinates associated with the same target object in S1 and S2 are obtained, the same procedure can be applied to any other two video frames to determine whether they contain initial position coordinates associated with the same target object.
  • Once all video images collected by the A collection devices at the same moment have been processed in this way, the different initial position coordinates of each target object across those video images are obtained.
  • The initial position coordinates associated with the same target object are fused to obtain the target position coordinates, in the target place, of each target object in the A video images shot by the A collection devices at the same moment.
  • Coordinate fusion can be performed on the plurality of initial position coordinates to obtain the updated initial position coordinates of the same target object.
  • The initial position coordinates in S1 and S2 that are not involved in the fusion form S2' together with the updated initial position coordinates. A new current distance matrix is then formed from the initial position coordinates in S2' and S3, and steps S30121 to S30124 are repeated to obtain the multiple initial position coordinates associated with the same target object in S2' and S3; S3' is obtained in the same way.
  • A new current distance matrix is further formed from the initial position coordinates in S3' and S4, and steps S30121 to S30124 are repeated, until fusion with the initial position coordinates of the last element in the initial coordinate set is completed. This yields the target position coordinates, in the target place, of each target object in the A video frames shot by the A collection devices at the same moment.
  • If any initial position coordinate is detected not to have been involved in fusion from beginning to end, then, considering that each target object in the target place is captured by at least two collection devices at the same moment, that initial position coordinate can be filtered out as an erroneous initial position coordinate.
  • In this way, the initial position coordinates associated with the same target object can be quickly determined, which provides a basis for subsequently determining the target position coordinates of each target object.
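The cascade over S1, S2, ..., SA and the final filtering step can be sketched as follows. This is an illustration under stated assumptions: the names `match`, `midpoint`, and `fuse_views` are hypothetical, matched pairs are fused by taking their midpoint, and matched entries are removed from the candidate distances rather than overwritten with the threshold (the two are equivalent for the minimum search). A coordinate that never takes part in any fusion is dropped at the end, on the premise that every target is seen by at least two collection devices.

```python
import math


def midpoint(p, q):
    """Midpoint of two coordinates of equal dimension."""
    return tuple((a + b) / 2 for a, b in zip(p, q))


def match(first, second, d_th):
    """Greedy minimum-distance pairing below d_th; returns (i, j) index pairs."""
    dist = {(i, j): math.dist(p, q)
            for i, p in enumerate(first) for j, q in enumerate(second)}
    pairs = []
    while dist:
        (i, j), d_min = min(dist.items(), key=lambda item: item[1])
        if d_min >= d_th:
            break
        pairs.append((i, j))
        # Exclude every distance computed from either matched coordinate.
        dist = {k: v for k, v in dist.items() if k[0] != i and k[1] != j}
    return pairs


def fuse_views(coordinate_sets, d_th):
    """Fold S1, S2, ..., SA into one list of fused target coordinates."""
    # Each entry is (coordinate, was_ever_fused).
    current = [(p, False) for p in coordinate_sets[0]]
    for next_set in coordinate_sets[1:]:
        pairs = match([p for p, _ in current], next_set, d_th)
        used_current = {i for i, _ in pairs}
        used_next = {j for _, j in pairs}
        # Fused pairs first, then everything that did not take part this round.
        merged = [(midpoint(current[i][0], next_set[j]), True) for i, j in pairs]
        merged += [e for i, e in enumerate(current) if i not in used_current]
        merged += [(q, False) for j, q in enumerate(next_set) if j not in used_next]
        current = merged
    # Coordinates never involved in any fusion are filtered out as errors.
    return [p for p, fused in current if fused]


views = [
    [(0.0, 0.0), (10.0, 10.0)],   # S1
    [(0.5, 0.0), (50.0, 50.0)],   # S2, (50, 50) is seen by no other device
    [(10.0, 10.5)],               # S3
]
print(fuse_views(views, d_th=1.0))  # [(10.0, 10.25), (0.25, 0.0)]
```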
  • S3021 Select any initial position coordinate from a plurality of initial position coordinates associated with the same target object, and use the initial position coordinate as the first intermediate fusion position coordinate.
  • the initial position coordinates to be fused refer to the initial position coordinates that do not participate in the fusion.
  • Fusing the first intermediate fusion position coordinate with any other initial position coordinate to be fused among the plurality of initial position coordinates to generate the second intermediate fusion position coordinate includes: determining the midpoint coordinate of the first intermediate fusion position coordinate and that other initial position coordinate to be fused, and using the midpoint coordinate as the generated second intermediate fusion position coordinate.
  • For example, for N initial position coordinates associated with target object A, any initial position coordinate may be used as the first intermediate fusion position coordinate, and the midpoint coordinate of the first intermediate fusion position coordinate and any other initial position coordinate of target object A to be fused is determined.
  • The midpoint coordinate is used as the updated first intermediate fusion position coordinate and continues to be fused with any other initial position coordinate to be fused. When no initial position coordinate to be fused remains among the N initial position coordinates, the target position coordinate of target object A is obtained.
  • multiple initial position coordinates associated with the same target object may be fused in a manner of taking midpoints in sequence, thereby obtaining target position coordinates with higher accuracy.
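The sequential midpoint fusion described above can be sketched as follows. The function name `fuse_sequentially` and the choice to start from the first element of the list are illustrative assumptions; the source allows any coordinate to be chosen first.

```python
def fuse_sequentially(coords):
    """Fuse the coordinates of one target by repeatedly taking midpoints."""
    if not coords:
        raise ValueError("at least one initial position coordinate is required")
    remaining = list(coords)
    # Any coordinate may serve as the first intermediate fusion position;
    # this sketch simply takes the first one in the list.
    fused = remaining.pop(0)
    while remaining:
        other = remaining.pop(0)
        # The midpoint becomes the updated first intermediate fusion position.
        fused = tuple((a + b) / 2 for a, b in zip(fused, other))
    return fused


print(fuse_sequentially([(0.0, 0.0), (2.0, 2.0), (4.0, 0.0)]))  # (2.5, 0.5)
```

Note that taking midpoints in sequence gives later coordinates more weight than earlier ones, so the result depends on the order of fusion and generally differs from the arithmetic mean of the N coordinates.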
  • the positioning method proposed in the embodiment of the present disclosure can accurately determine the target position coordinates of each target object in the target place, and this method can be applied to various application scenarios. Taking the application in a factory as an example, after obtaining the target position coordinates of the target object in the target place, as shown in FIG. 5 , the positioning method provided by the embodiment of the present disclosure further includes the following S401 to S402:
  • A coordinate range corresponding to a dangerous target area in the factory may be set in advance in the world coordinate system corresponding to the target site. Whether any target object has entered the target area is then determined from the target position coordinates of each target object in the target place and the coordinate range of the target area. When it is determined that a target object has entered the target area, an early warning prompt is issued.
  • the early warning prompts may include, but are not limited to, sound and light alarm prompts, voice alarm prompts, and the like. Through the early warning prompts, the safety of employees in the target site can be guaranteed and the safety of the target site can be improved.
  • After the target position coordinates of each target object in the target place are obtained with high accuracy, whether a target object in the target place has entered a preset target area, such as a preset dangerous area, can be determined, so that an early warning prompt can be issued in time and the safety of the target site improved.
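The check against a preset target area can be sketched as follows. The source only states that a coordinate range is set in advance; modelling it as an axis-aligned rectangle in world coordinates, and the names `TargetArea`, `objects_in_area`, and the object identifiers, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetArea:
    """Axis-aligned coordinate range in the world coordinate system (assumed shape)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, point):
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def objects_in_area(target_positions, area):
    """Return the IDs of target objects whose target position lies inside the area."""
    return [obj_id for obj_id, pos in target_positions.items() if area.contains(pos)]


danger_zone = TargetArea(x_min=0.0, x_max=2.0, y_min=0.0, y_max=3.0)
positions = {"worker_1": (1.0, 1.5), "worker_2": (6.0, 4.0)}
intruders = objects_in_area(positions, danger_zone)
if intruders:
    # Stand-in for a sound-and-light or voice alarm.
    print("warning: target objects entered the target area:", intruders)
```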
  • The order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • The embodiment of the present disclosure also provides a positioning device corresponding to the positioning method. Since the device in the embodiment of the present disclosure solves the problem on a principle similar to that of the above positioning method, the implementation of the device can refer to the implementation of the method, and repeated descriptions are omitted.
  • the positioning device includes:
  • the acquisition module 501 is used to acquire the video images collected by a plurality of collection devices set in the target site at the same time; wherein, different collection devices have different collection perspectives in the target site, and the video images include the target object;
  • the determining module 502 is configured to respectively determine the initial position coordinates of the target object in the target place based on the video images collected by multiple collection devices at the same time;
  • the fusion module 503 is used to fuse the initial position coordinates of the same target object to obtain the target position coordinates of the target object in the target place.
  • the steps include:
  • the determining module 502 when used to acquire the pixel coordinates of the target object in the video images captured by multiple capturing devices at the same moment, the following steps are included:
  • the neural network includes multiple target detection sub-networks for detecting target objects of different sizes;
  • the pixel coordinates, in each video picture, of the target position point on the detection frame of the target object are extracted to obtain the pixel coordinates of the target object in that video picture.
  • when the determination module 502 is used to determine, based on the pixel coordinates of the target object in the video picture collected by each collection device and the parameter information of that collection device, the initial position coordinates of the target object in the world coordinate system corresponding to the target place, this includes:
  • the initial position coordinates of the target object in the video frame are determined.
  • the fusion module 503 is used to fuse the initial position coordinates of the same target object to obtain the target position coordinates of the target object in the target place, including:
  • the multiple initial position coordinates associated with the same target object are sequentially fused to obtain the target position coordinates of the same target object in the target place.
  • the fusion module 503 when used to sequentially fuse multiple initial position coordinates associated with the same target object to obtain the target position coordinates of the same target object in the target place, it includes:
  • the first intermediate fusion position coordinates are fused with any other initial position coordinates to be fused among the plurality of initial position coordinates to generate the second intermediate fusion position coordinates, the second intermediate fusion position coordinates are used as the updated first intermediate fusion position coordinates, and the process returns to the step of generating the second intermediate fusion position coordinates until no initial position coordinates to be fused remain.
  • the fusion module 503 is used to fuse the first intermediate fusion position coordinate with any other initial position coordinate to be fused among the plurality of initial position coordinates to generate the second intermediate fusion position coordinate , including:
  • when the fusion module 503 is used to determine, based on the initial position coordinates of the target object determined from multiple video frames, multiple initial position coordinates associated with the same target object, this includes:
  • for any two video pictures, the target object in the first video picture is determined as the first target object, and the target object in the second video picture is determined as the second target object;
  • for the initial position coordinates of each first target object, the distance between the initial position coordinates of that first target object and the initial position coordinates of each second target object in the second video picture is determined;
  • the second target object having the minimum distance from the first target object is determined to be the same target object as the first target object, wherein the minimum distance is less than the preset fusion distance threshold; the initial position coordinates of the first target object and of that second target object are taken as the multiple initial position coordinates associated with the same target object.
  • the determination module 502 is further configured to:
  • an early warning prompt is given.
  • an embodiment of the present disclosure further provides an electronic device 600 .
  • a schematic structural diagram of the electronic device 600 provided by the embodiment of the present disclosure includes:
  • the processor 610 executes the following instructions: acquiring video images collected at the same time by multiple collection devices set in the target site, wherein different collection devices have different collection perspectives in the target site, and the video images include target objects; determining, based on the video images collected by the multiple collection devices at the same time, the initial position coordinates of the target objects in the target place respectively; and fusing the initial position coordinates of the same target object to obtain the target position coordinates of the target object in the target place.
  • Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the positioning method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure further provides a computer program product, where the computer program product carries a program code and is stored in a storage medium, where the instructions included in the program code can be used to execute the steps of the positioning method described in the above method embodiments, For details, reference may be made to the foregoing method embodiments, which will not be repeated here.
  • the above-mentioned computer program product can be specifically implemented by means of hardware, software or a combination thereof.
  • in one optional embodiment, the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Studio Devices (AREA)

Abstract

The present disclosure provides a positioning method and apparatus, an electronic device, and a storage medium. The positioning method comprises: acquiring video images collected at the same moment by multiple collection devices disposed within a target site, wherein different collection devices have different collection angles of view in the target site, and the video images comprise a target object; on the basis of the video images collected at the same moment by the multiple collection devices, determining initial position coordinates of the target object in the target site, respectively; and fusing the initial position coordinates of the same target object, and obtaining target position coordinates of the target object in the target site.

Description

Positioning Method and Apparatus, Electronic Device, and Storage Medium
Cross-Reference to Related Applications
This patent application claims priority to Chinese patent application No. 202110467657.9, filed on April 28, 2021 and entitled "A positioning method, apparatus, electronic device and storage medium", which is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer vision technology, and in particular to a positioning method, an apparatus, an electronic device, and a storage medium.
Background
Artificial intelligence technology plays an increasingly important role in building intelligent education, entertainment, and daily life, and computer vision, as one of its key technologies, is widely used. For example, positioning technology based on computer vision can locate target objects in target places in different scenarios and determine the positions of the target objects within the target place.
In positioning based on computer vision, the position of a target object in an image of the target place captured by a camera can be determined first, and the position of the target object within the target place can then be determined, thereby tracking the target object in the target place.
Summary of the Invention
Embodiments of the present disclosure provide at least one positioning solution.
In a first aspect, an embodiment of the present disclosure provides a positioning method, including:
acquiring multiple video pictures collected at the same moment by multiple collection devices set in a target place, wherein different collection devices have different collection perspectives in the target place, the multiple video pictures include target objects, and the target objects are objects to be positioned in the target place;
determining, based on the multiple video pictures, the initial position coordinates of the target objects in the target place respectively;
fusing the initial position coordinates of a same target object among the target objects to obtain the target position coordinates of that target object in the target place.
In the embodiments of the present disclosure, the initial position coordinates of the target objects in different video pictures can be determined from the video pictures collected at the same moment by multiple collection devices with different collection perspectives set in the target place, and the initial position coordinates of the same target object in different video pictures can then be fused to determine the target position coordinates of the target object in the target place. In this way, on the one hand, target objects can be comprehensively positioned in a target place that is large and/or has a complex layout; on the other hand, target position coordinates of higher accuracy can be obtained for the same target object.
In a possible implementation, determining, based on the multiple video pictures, the initial position coordinates of the target objects in the target place respectively includes:
acquiring the pixel coordinates of the target objects in the multiple video pictures;
for each of the multiple collection devices, determining, based on the pixel coordinates of at least one of the target objects in the video picture collected by that collection device and the parameter information of that collection device, the initial position coordinates, in the world coordinate system corresponding to the target place, of the at least one target object collected by that collection device.
In the embodiments of the present disclosure, the pixel coordinates of a target object in a video picture can be determined first, and the initial position coordinates of the target object in the target place can then be obtained according to the parameter information of the collection device, in preparation for subsequently determining the target position coordinates of the target object in the target place.
In a possible implementation, acquiring the pixel coordinates of the target objects in the multiple video pictures includes:
inputting the multiple video pictures into a pre-trained neural network and, for each of the multiple video pictures, obtaining the detection frame of each target object in that video picture, wherein the neural network includes multiple target detection sub-networks for detecting target objects of different sizes;
extracting the pixel coordinates, in that video picture, of a target position point on the detection frame of the target object, to obtain the pixel coordinates of the target object in that video picture.
In the embodiments of the present disclosure, the neural network includes multiple target detection sub-networks for detecting target objects of different sizes, so that when target detection is performed on a video picture through the neural network, target objects of different sizes in the same video picture can be detected accurately.
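Extracting a target position point from a detection frame can be sketched as follows. The passage above does not say which point on the detection frame is used; this sketch assumes the midpoint of the bottom edge (a common choice for a standing person, since it approximates the point of contact with the ground), a box format of (x_min, y_min, x_max, y_max), and image coordinates with y increasing downward. The name `target_point` is hypothetical.

```python
def target_point(box):
    """Return the assumed target position point of a detection frame.

    box is (x_min, y_min, x_max, y_max) in pixel coordinates, y pointing down.
    The bottom-edge midpoint is an assumption, not taken from the source.
    """
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, y_max)


detections = [(100, 50, 140, 170), (300, 80, 360, 260)]
print([target_point(b) for b in detections])  # [(120.0, 170), (330.0, 260)]
```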
In a possible implementation, determining, based on the pixel coordinates of at least one of the target objects in the video picture collected by the collection device and the parameter information of the collection device, the initial position coordinates, in the world coordinate system corresponding to the target place, of the at least one target object collected by the collection device includes:
correcting, based on a predetermined intrinsic parameter matrix and predetermined distortion parameters of the collection device, the pixel coordinates of at least one of the target objects in the video picture collected by the collection device, to obtain the corrected pixel coordinates of the at least one target object in the video picture;
determining, based on a predetermined homography matrix of the collection device and the corrected pixel coordinates of the at least one target object in the video picture collected by the collection device, the initial position coordinates of the at least one target object in the video picture.
In the embodiments of the present disclosure, after the pixel coordinates of a target object in a video picture are obtained, the pixel coordinates are first corrected based on the intrinsic parameter matrix and the distortion coefficients of the collection device that captured the video picture, so that corrected pixel coordinates of higher accuracy are obtained, which in turn yields initial position coordinates of the target object in the target place with higher accuracy.
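The two steps above, correction followed by the homography mapping, can be sketched in plain Python as follows. The source does not specify the distortion model; this sketch assumes a purely radial model with two coefficients (k1, k2) inverted by fixed-point iteration, an intrinsic matrix reduced to (fx, fy, cx, cy), and a homography H that maps corrected pixels to ground-plane world coordinates. All function names and numeric values are hypothetical.

```python
def undistort_pixel(u, v, fx, fy, cx, cy, k1, k2, iterations=10):
    """Correct a pixel for radial distortion (assumed two-coefficient model)."""
    # Normalize the distorted pixel with the intrinsic parameters.
    xd = (u - cx) / fx
    yd = (v - cy) / fy
    # Fixed-point iteration inverting x_d = x * (1 + k1*r^2 + k2*r^4).
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    # Project the corrected normalized point back to pixel coordinates.
    return x * fx + cx, y * fy + cy


def apply_homography(H, u, v):
    """Map a corrected pixel (u, v) to world coordinates with a 3x3 homography."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    # Divide by the homogeneous component to obtain Cartesian coordinates.
    return x / w, y / w


# Hypothetical calibration values, for illustration only.
H = [[0.01, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 1.0]]
u, v = undistort_pixel(700.0, 400.0, fx=1000.0, fy=1000.0,
                       cx=640.0, cy=360.0, k1=-0.2, k2=0.05)
print(apply_homography(H, u, v))
```

In practice both the intrinsic parameters and the homography would come from calibrating each collection device; a library routine for undistortion could replace the hand-written iteration.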
In a possible implementation, fusing the initial position coordinates of a same target object among the target objects to obtain the target position coordinates of that target object in the target place includes:
determining, based on the initial position coordinates of the target objects in the target place, multiple initial position coordinates associated with a same target object among the target objects;
fusing the multiple initial position coordinates associated with that target object in sequence to obtain the target position coordinates of that target object in the target place.
In the embodiments of the present disclosure, considering that the initial position coordinates of the same target object determined from video pictures collected by different collection devices contain some error, the initial position coordinates of the same target object obtained from multiple collection devices can be fused, so that target position coordinates of higher accuracy are obtained for that target object.
In a possible implementation, fusing the multiple initial position coordinates associated with the target object in sequence to obtain the target position coordinates of the target object in the target place includes:
selecting any initial position coordinate from the multiple initial position coordinates associated with the target object, and using the selected initial position coordinate as a first intermediate fusion position coordinate;
fusing the first intermediate fusion position coordinate with any other initial position coordinate to be fused among the multiple initial position coordinates to generate a second intermediate fusion position coordinate, using the second intermediate fusion position coordinate as the updated first intermediate fusion position coordinate, and returning to the step of generating the second intermediate fusion position coordinate, until no initial position coordinate to be fused remains among the multiple initial position coordinates.
In a possible implementation, fusing the first intermediate fusion position coordinate with any other initial position coordinate to be fused among the multiple initial position coordinates to generate a second intermediate fusion position coordinate includes:
determining the midpoint coordinate of the first intermediate fusion position coordinate and the other initial position coordinate to be fused among the multiple initial position coordinates, and using the midpoint coordinate as the second intermediate fusion position coordinate.
In the embodiments of the present disclosure, multiple initial position coordinates associated with the same target object can be fused by taking midpoints in sequence, so as to obtain target position coordinates of higher accuracy.
In a possible implementation, determining, based on the initial position coordinates of the target objects in the target place, multiple initial position coordinates associated with a same target object among the target objects includes:
for any two video pictures among the multiple video pictures, determining the target objects in the first of the two video pictures as first target objects and the target objects in the second of the two video pictures as second target objects; determining the initial position coordinates of each first target object in the first video picture; determining the initial position coordinates of each second target object in the second video picture; and, for the initial position coordinates of each first target object, determining the distance between the initial position coordinates of that first target object and the initial position coordinates of each second target object;
determining that the second target object having the minimum distance from that first target object is the same target object as that first target object, wherein the minimum distance is less than a preset fusion distance threshold; and taking the initial position coordinates of that first target object and the initial position coordinates of the second target object having the minimum distance from it as the multiple initial position coordinates associated with the same target object among the target objects.
In the embodiments of the present disclosure, the initial position coordinates associated with the same target object can be quickly determined from the initial position coordinates of the different target objects in any two video pictures and the preset fusion distance threshold, which provides a basis for subsequently determining the target position coordinates of each target object.
In a possible implementation, after the target position coordinates of the target object in the target place are obtained, the positioning method further includes:
determining, based on the target position coordinates respectively corresponding to the target objects in the target place and a preset target area, whether there is a target object that has entered the target area;
issuing an early warning prompt when it is determined that there is a target object that has entered the target area.
In the embodiments of the present disclosure, after target position coordinates of high accuracy are obtained for each target object in the target place, whether a target object in the target place has entered a preset target area, such as a preset dangerous area, can be determined, so that an early warning prompt can be issued in time and the safety of the target place is improved.
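One way to picture the target-area check is a point-in-polygon test on the ground plane. The sketch below assumes that the preset target area is given as a polygon of world-coordinate vertices, which the disclosure does not specify; the function names `in_polygon` and `objects_in_area` and the dictionary layout are hypothetical.

```python
def in_polygon(point, polygon):
    """Ray-casting test: is point (x, y) inside the polygon [(x, y), ...]?"""
    x, y = point
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def objects_in_area(target_positions, area):
    """Return the ids of target objects whose target position lies in the area.

    target_positions: dict mapping an object id to its (x, y) target position.
    area: list of polygon vertices describing the preset target area.
    """
    return [obj_id for obj_id, pos in target_positions.items()
            if in_polygon(pos, area)]
```

A non-empty result from `objects_in_area` would correspond to the case in which an early warning prompt is issued.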
In a second aspect, an embodiment of the present disclosure provides a positioning apparatus, including:
an acquisition module, configured to acquire a plurality of video pictures collected at the same moment by a plurality of acquisition devices arranged in a target place, wherein different acquisition devices have different viewing angles in the target place, the plurality of video pictures include target objects, and the target objects are objects to be positioned in the target place;
a determining module, configured to determine, based on the plurality of video pictures, the initial position coordinates of the target objects in the target place respectively;
a fusion module, configured to fuse the initial position coordinates of the same target object among the target objects to obtain the target position coordinates of that target object in the target place.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate through the bus; and the machine-readable instructions, when executed by the processor, perform the steps of the positioning method according to the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when run by a processor, performs the steps of the positioning method according to the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product that includes a computer program and is stored on a storage medium, wherein the computer program, when executed by a processor, performs the steps of the positioning method according to the first aspect.
To make the above objects, features, and advantages of the present disclosure more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings required by the embodiments are briefly introduced below. The drawings are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting its scope. A person of ordinary skill in the art can derive other related drawings from these drawings without creative effort.
FIG. 1 shows a flowchart of a positioning method provided by an embodiment of the present disclosure;
FIG. 2 shows a flowchart of a method for determining the initial position coordinates of a target object provided by an embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of target objects detected in a video picture provided by an embodiment of the present disclosure;
FIG. 4 shows a flowchart of a method for determining the target position coordinates of a target object provided by an embodiment of the present disclosure;
FIG. 5 shows a flowchart of an early warning prompt method provided by an embodiment of the present disclosure;
FIG. 6 shows a schematic structural diagram of a positioning apparatus provided by an embodiment of the present disclosure;
FIG. 7 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The term "and/or" herein merely describes an association relationship and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone. In addition, the term "at least one" herein means any one of a plurality of items or any combination of at least two of them; for example, "including at least one of A, B, and C" can mean including any one or more elements selected from the set consisting of A, B, and C.
In many application scenarios, target objects in a place need to be positioned. In a factory, for example, it may be necessary to detect whether employees are working at designated positions or have entered a dangerous area. In a shopping mall, the distribution of foot traffic can be detected by positioning customers. When target objects in a place are positioned, their positions can be determined from images collected by multiple cameras. However, for target places with complex layouts and large areas, positioning based on multiple cameras may fail to capture all target objects, so positioning is incomplete; there may also be occluded regions, and target objects in those regions cannot be positioned.
Based on the above research, the present disclosure provides a positioning method. From the video pictures collected at the same moment by a plurality of acquisition devices that are arranged in a target place and have different viewing angles, the initial position coordinates of the target objects in the different video pictures are determined; the initial position coordinates of the same target object in different video pictures are then fused to determine the target position coordinates of that target object in the target place. In this way, on the one hand, target objects in a target place that is large and/or has a complex layout can be positioned comprehensively; on the other hand, target position coordinates of higher accuracy can be obtained for the same target object.
To facilitate understanding of this embodiment, a positioning method disclosed in the embodiments of the present disclosure is first described in detail. The positioning method is executed by a computer device with computing capability, for example a server or another processing device. In some possible implementations, the positioning method can be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to FIG. 1, which is a flowchart of the positioning method provided by an embodiment of the present disclosure, the positioning method includes the following S101 to S103:
S101: acquire video pictures collected at the same moment by a plurality of acquisition devices arranged in a target place, wherein different acquisition devices have different viewing angles in the target place, and the video pictures include target objects.
Exemplarily, for different application scenarios, the target place is the place corresponding to the scenario: a factory when employees in the factory need to be positioned, a shopping mall when customers in the mall need to be positioned, or a gymnasium when athletes in the gymnasium need to be positioned.
Exemplarily, the target objects are the objects to be positioned in the target place, such as the employees, customers, and athletes mentioned above.
Exemplarily, an acquisition device may be a monocular camera or a binocular camera, and a plurality of acquisition devices may be arranged in the target place. The installation positions of the acquisition devices can be determined according to the actual layout of each target place. For example, the acquisition devices can be given different viewing angles so that together they cover the entire area of the target place without blind spots. In addition, too many acquisition devices produce too many video pictures at the same moment, which slows down the processing of the video pictures. Therefore, both the installation angles and the number of acquisition devices need to be considered when they are installed in the target place. For example, the devices can be arranged so that each target object entering the target place is captured by two acquisition devices at the same time; in this way, the acquisition devices arranged in the target place can completely capture video pictures of the entire area of the target place.
S102: based on the video pictures collected at the same moment by the plurality of acquisition devices, determine the initial position coordinates of the target objects in the target place respectively.
Exemplarily, after the video pictures collected at the same moment by the plurality of acquisition devices are acquired, target detection can be performed on them to determine the initial position coordinates of the target objects in the different video pictures, expressed in the world coordinate system corresponding to the target place. Specifically, the initial position coordinates of a target object in a video picture can be determined based on the pixel coordinates of the detected target object in that video picture and the parameter information of the acquisition device that collected it.
Exemplarily, the world coordinate system corresponding to the target place can be determined in advance. For example, the center point of the ground of the target place is taken as the origin, the direction through the origin perpendicular to the ground as the Z-axis direction, one direction on the ground through the origin as the X-axis direction, and the direction on the ground through the origin and perpendicular to the X axis as the Y-axis direction.
S103: fuse the initial position coordinates of the same target object among the target objects to obtain the target position coordinates of that target object in the target place.
Exemplarily, because the parameter information of different acquisition devices contains some error, the initial position coordinates of the same target object determined from video pictures collected by different acquisition devices differ somewhat. The initial position coordinates of the same target object can be fused to obtain its target position coordinates in the world coordinate system corresponding to the target place.
In the embodiments of the present disclosure, the initial position coordinates of the target objects in different video pictures can be determined from the video pictures collected at the same moment by a plurality of acquisition devices that are arranged in the target place and have different viewing angles; the initial position coordinates of the same target object in different video pictures are then fused to determine the target position coordinates of that target object in the target place. In this way, on the one hand, target objects in a target place that is large and/or has a complex layout can be positioned comprehensively; on the other hand, target position coordinates of higher accuracy can be obtained for the same target object.
For S102 above, determining the initial position coordinates of the target objects in the target place based on the video pictures collected at the same moment by the plurality of acquisition devices may, as shown in FIG. 2, include the following S201 to S202:
S201: acquire the pixel coordinates of the target objects in the video pictures respectively collected by the plurality of acquisition devices at the same moment.
Exemplarily, the target objects in a video picture can be identified by a pre-trained neural network for target detection. The pixel coordinates, in the image coordinate system corresponding to the video picture, of a set position point on a target object can then be read and used as the pixel coordinates of that target object.
Specifically, acquiring the pixel coordinates of the target objects contained in the video pictures respectively collected by the plurality of acquisition devices at the same moment may include the following S2011 to S2012:
S2011: input the plurality of video pictures into a pre-trained neural network to obtain a detection box for each target object in each video picture, wherein the neural network includes a plurality of target detection sub-networks for detecting target objects of different sizes;
S2012: extract the pixel coordinates, in each video picture, of a target position point on the detection box of each target object in that video picture to obtain the pixel coordinates of the target object in that video picture.
Exemplarily, the neural network can detect each target object contained in a video picture and mark a detection box for it. FIG. 3 is a schematic diagram of the detection boxes of the target objects contained in a video picture. The video picture contains detection boxes for two target objects: detection box A1B1C1D1 of target object 1 and detection box A2B2C2D2 of target object 2. A position point on the detection box of each target object can be extracted as the target position point, for example the midpoint of the bottom edge of the detection box. In FIG. 3, the pixel coordinates of target object 1 are represented by the pixel coordinates of the midpoint K1 of the bottom edge D1C1 of detection box A1B1C1D1, and the pixel coordinates of target object 2 by those of the midpoint K2 of the bottom edge D2C2 of detection box A2B2C2D2.
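Extracting the bottom-edge midpoint of a detection box is a one-line computation. The sketch below assumes that a detection box is given as `(x_min, y_min, x_max, y_max)` in pixel coordinates with the y axis pointing downward, which is a common convention but is not stated in the disclosure; the function name `bottom_midpoint` is hypothetical.

```python
def bottom_midpoint(box):
    """Return the midpoint of the bottom edge of a detection box.

    box: (x_min, y_min, x_max, y_max) in pixels, with y growing downward
    (assumed layout), so the bottom edge lies at y_max.
    """
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, float(y_max))
```

For a standing person, this point approximates where the feet touch the ground, which is why it is a natural candidate for a later mapping onto the ground plane.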
Exemplarily, because the positions of target objects in the target place change and the acquisition devices have different viewing angles, the target objects contained in the video pictures collected by different acquisition devices at the same moment may differ in size. To mark detection boxes accurately for target objects of different sizes, the neural network used in the embodiments of the present disclosure may include a plurality of target detection sub-networks for detecting target objects of different sizes; it may be, for example, a feature pyramid network. Each target detection sub-network in the feature pyramid network detects the target objects in the video picture whose size corresponds to that sub-network, so that target objects of different sizes in the same video picture can be detected accurately.
In the embodiments of the present disclosure, the neural network includes a plurality of target detection sub-networks for detecting target objects of different sizes. Thus, when target detection is performed on a video picture by the neural network, target objects of different sizes in the same video picture can be detected accurately.
S202: based on the pixel coordinates of the target objects in the video picture collected by each acquisition device and the parameter information of that acquisition device, determine the initial position coordinates, in the world coordinate system corresponding to the target place, of the target objects captured by that acquisition device.
Exemplarily, the parameter information of each acquisition device may include a homography matrix of the device, which represents the conversion relationship between the image coordinate system of the video pictures collected by the device and the world coordinate system of the target place where the device is located. Thus, once the pixel coordinates of a target object in the image coordinate system are obtained, its initial position coordinates in the world coordinate system can be determined from the parameter information of the acquisition device.
Exemplarily, a unique world coordinate system can be established for the target place with a fixed position in the target place as the coordinate origin. For example, the center point of the ground of the target place can be taken as the origin, a chosen direction on the ground as the positive X-axis direction, the direction on the ground perpendicular to the X axis as the positive Y-axis direction, and the upward direction perpendicular to the ground as the positive Z-axis direction.
In the embodiments of the present disclosure, the pixel coordinates of a target object in a video picture can be determined first, and its initial position coordinates in the target place then obtained from the parameter information of the acquisition device, in preparation for subsequently determining its target position coordinates in the target place.
In one implementation, for S202 above, determining the initial position coordinates, in the world coordinate system corresponding to the target place, of the target objects captured by each acquisition device, based on the pixel coordinates of the target objects in the video picture collected by that device and the parameter information of that device, includes the following S2021 to S2022:
S2021: based on the predetermined intrinsic matrix and distortion parameters of each acquisition device, correct the pixel coordinates of the target objects in the video picture collected by that device to obtain the corrected pixel coordinates of the target objects in the video picture.
Exemplarily, the intrinsic matrix of an acquisition device has the form

$$\begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

where $(f_x, f_y)$ represents the focal length of the acquisition device and $(c_x, c_y)$ represents the pixel coordinates, in the image coordinate system, of the center point of the video picture collected by the device. The distortion parameters of the acquisition device include radial distortion coefficients and tangential distortion coefficients. Once the intrinsic matrix and distortion coefficients of each acquisition device have been obtained in advance, they can be used to remove the distortion from the pixel coordinates of the target objects in the video picture collected by that device. For example, the corrected pixel coordinates of the target objects can be obtained with the undistortion function in OpenCV.
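To make the correction step concrete, the following pure-Python sketch inverts a radial and tangential distortion model by fixed-point iteration. It is an illustration under assumptions, not the routine used in the disclosure: the model is the widely used one with two radial coefficients `k1`, `k2` and two tangential coefficients `p1`, `p2`, and the function names `distort` and `undistort_pixel` are hypothetical.

```python
def distort(xn, yn, k1, k2, p1, p2):
    """Apply radial (k1, k2) and tangential (p1, p2) distortion to a
    normalized image point (xn, yn)."""
    r2 = xn * xn + yn * yn
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    xd = xn * radial + 2.0 * p1 * xn * yn + p2 * (r2 + 2.0 * xn * xn)
    yd = yn * radial + p1 * (r2 + 2.0 * yn * yn) + 2.0 * p2 * xn * yn
    return xd, yd


def undistort_pixel(u, v, fx, fy, cx, cy, k1, k2, p1, p2, iterations=20):
    """Return corrected pixel coordinates for a distorted pixel (u, v).

    The pixel is normalized with the intrinsic parameters, the distortion
    model is inverted by fixed-point iteration, and the result is mapped
    back to pixel coordinates with the same intrinsic parameters.
    """
    xd = (u - cx) / fx
    yd = (v - cy) / fy
    x, y = xd, yd  # initial guess: the distorted point itself
    for _ in range(iterations):
        r2 = x * x + y * y
        radial = 1.0 + k1 * r2 + k2 * r2 * r2
        dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
        dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
        x = (xd - dx) / radial
        y = (yd - dy) / radial
    return x * fx + cx, y * fy + cy
```

With all distortion coefficients set to zero, the function returns the input pixel unchanged (up to floating-point rounding); for moderate distortion, the iteration converges within a few steps.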
Exemplarily, the intrinsic matrix and distortion parameters of each acquisition device can be determined in advance by Zhang Zhengyou's checkerboard calibration method. For example, multiple checkerboard images are captured from different angles and the feature points in the images are detected. The intrinsic matrix and distortion parameters of the acquisition device are solved from the pixel coordinates of these feature points in the checkerboard images and then iteratively optimized. During optimization, the same pixel coordinates are corrected with the intrinsic matrix and distortion parameters obtained in two successive iterations, and the difference between the two corrected pixel coordinates determines whether to stop; for example, optimization can stop once this difference no longer decreases, which yields the intrinsic matrix and distortion parameters of the acquisition device.
S2022: based on the predetermined homography matrix of the acquisition device and the corrected pixel coordinates of the target objects in the video picture collected by the device, determine the initial position coordinates of the target objects in the video picture.
Exemplarily, the homography matrix represents the conversion relationship between the image coordinate system of the video pictures collected by the acquisition device and the world coordinate system of the target place where the device is located. It can be determined when the acquisition device is calibrated in advance. For example, the acquisition device collects a sample video picture containing multiple markers; the world coordinates, in the world coordinate system of the target place, of the points where the markers meet the ground (the plane containing the X and Y axes of the world coordinate system) are determined in advance; the corrected pixel coordinates of these points in the sample video picture are determined as described above; and the homography matrix of the acquisition device is then determined from the corrected pixel coordinates and world coordinates of the markers.
Exemplarily, the initial position coordinates of a target object in a video picture, expressed in the world coordinate system corresponding to the target place, can be obtained from the corrected pixel coordinates of the target object and the homography matrix of the acquisition device that collected the video picture.
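Applying a homography to a corrected pixel amounts to a matrix product in homogeneous coordinates followed by a division. The sketch below assumes that the 3x3 matrix `H` maps image coordinates to ground-plane world coordinates (the direction is an assumption; if the calibrated matrix maps world to image, its inverse would be used instead). The function name `pixel_to_world` is hypothetical.

```python
def pixel_to_world(u, v, H):
    """Map a corrected pixel (u, v) to ground-plane world coordinates.

    H: 3x3 homography as a nested list, assumed to map image coordinates
    to world coordinates on the plane spanned by the X and Y axes.
    """
    xw = H[0][0] * u + H[0][1] * v + H[0][2]
    yw = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("pixel maps to a point at infinity")
    # Divide by the homogeneous scale to obtain Euclidean coordinates.
    return xw / w, yw / w
```

Because the homography relates the image to a single plane, the result is meaningful only for points that lie on the ground, which is consistent with using the bottom-edge midpoint of a detection box as the target position point.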
In the embodiments of the present disclosure, after the pixel coordinates of a target object in a video picture are obtained, they are first corrected based on the intrinsic matrix and distortion coefficients of the acquisition device that collected the video picture. This yields corrected pixel coordinates of higher accuracy and, in turn, initial position coordinates of higher accuracy for the target object in the target place.
In one implementation, for S103 above, fusing the initial position coordinates of the same target object to obtain its target position coordinates in the target place may, as shown in FIG. 4, include the following S301 to S302:
S301: based on the initial position coordinates of the target objects determined from the plurality of video pictures, determine a plurality of initial position coordinates associated with the same target object.
Exemplarily, as mentioned above, each target object in the target place is captured by at least two acquisition devices at the same time. When a target object is captured by different acquisition devices at the same moment, the parameter information of each device contains some error, and the errors differ between devices. The initial position coordinates of the same target object determined from different video pictures may therefore differ. Before the initial position coordinates of the same target object are fused, the plurality of initial position coordinates associated with that target object must first be determined.
S302,将与同一目标对象关联的多个初始位置坐标进行依次融合,得到同一目标对象在目标场所中的目标位置坐标。S302 , successively fuse multiple initial position coordinates associated with the same target object to obtain target position coordinates of the same target object in the target place.
示例性地,假设与同一目标对象关联的多个初始位置坐标包含N个,可以先将前两个进行融合,得到融合后的初始位置坐标。然后将融合后的初始位置坐标与第三个初始位置坐标进行融合,直至与最后一个初始位置坐标进行融合后,将最终融合得到的位置坐标作为同一目标对象的目标位置坐标。Exemplarily, assuming that the multiple initial position coordinates associated with the same target object include N, the first two may be fused first to obtain the fused initial position coordinates. Then, the fused initial position coordinates are fused with the third initial position coordinates until the last initial position coordinates are fused, and the final fused position coordinates are used as the target position coordinates of the same target object.
本公开实施例中,考虑到基于不同采集设备采集视频画面确定的同一目标对象的初始位置坐标会存在一些误差,因此可以通过对多个采集设备采集的同一目标对象的初始位置坐标进行融合,从而可以得到该同一目标对象准确度较高的目标位置坐标。In the embodiment of the present disclosure, considering that there may be some errors in the initial position coordinates of the same target object determined based on the video images collected by different collection devices, the initial position coordinates of the same target object collected by multiple collection devices can be fused, thereby The target position coordinates with higher accuracy of the same target object can be obtained.
In one implementation, with respect to S301 above, determining multiple initial position coordinates associated with the same target object, based on the initial position coordinates of target objects determined from multiple video frames, includes the following S3011 to S3012:
S3011: for any two of the multiple video frames, take the target objects in the first of the two video frames as first target objects and the target objects in the second as second target objects; for the initial position coordinates of each first target object, determine the distance between those coordinates and the initial position coordinates of each second target object in the second video frame;
S3012: determine that the second target object having the minimum distance from the first target object is the same target object as that first target object, where the minimum distance is less than a preset fusion distance threshold; and take the initial position coordinates of that first target object and of that second target object as the multiple initial position coordinates associated with the same target object.
By way of example, suppose A capture devices are installed in the target site and that, at a given moment, the video frame captured by each of the A capture devices contains at least one target object. A groups of initial position coordinates are then obtained for that moment, forming the initial coordinate set s = {S1, S2, S3, ..., SA}, where S1, S2, S3, ..., SA denote the initial position coordinates of the target objects in the video frames captured by the first, second, third, ..., A-th capture device respectively. The following takes the video frames captured at the same moment by the first and second capture devices as the two video frames, and explains how the multiple initial position coordinates associated with the same target object are determined:
By way of example, S1 contains the initial position coordinates of a first target objects (also called first initial position coordinates) and S2 contains the initial position coordinates of b second target objects (also called second initial position coordinates). The Euclidean distance between each first initial position coordinate and each second initial position coordinate can be determined, giving the distance matrix:
$$
\begin{pmatrix}
d_{11} & \cdots & d_{1b} \\
\vdots & d_{ij} & \vdots \\
d_{a1} & \cdots & d_{ab}
\end{pmatrix}
$$
Here, $d_{11}$ is the distance between the 1st first initial position coordinate in S1 and the 1st second initial position coordinate in S2; $d_{1b}$ is the distance between the 1st first initial position coordinate in S1 and the b-th second initial position coordinate in S2; $d_{ij}$ is the distance between the i-th first initial position coordinate in S1 and the j-th second initial position coordinate in S2; $d_{a1}$ is the distance between the a-th first initial position coordinate in S1 and the 1st second initial position coordinate in S2; and $d_{ab}$ is the distance between the a-th first initial position coordinate in S1 and the b-th second initial position coordinate in S2.
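The distance matrix described above can be sketched in a few lines. This is only an illustration under assumptions: the function name `distance_matrix`, the two-dimensional ground-plane coordinates, and the sample values are hypothetical and do not come from the disclosure.

```python
import math

def distance_matrix(first_coords, second_coords):
    """Return an a-by-b matrix whose entry [i][j] is the Euclidean distance
    between first_coords[i] (from S1) and second_coords[j] (from S2)."""
    return [[math.dist(p, q) for q in second_coords] for p in first_coords]

# Hypothetical world coordinates (in metres) obtained from two capture devices.
s1 = [(0.0, 0.0), (3.0, 4.0)]                 # a = 2 first initial position coordinates
s2 = [(0.0, 1.0), (3.0, 4.5), (10.0, 10.0)]   # b = 3 second initial position coordinates

d = distance_matrix(s1, s2)
for row in d:
    print([round(value, 3) for value in row])
```

Row i of the result corresponds to the i-th coordinate in S1 and column j to the j-th coordinate in S2, matching the indexing of $d_{ij}$.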
By way of example, in operation, the multiple initial position coordinates in S1 and S2 associated with the same target object can be determined as follows, through S30121 to S30124:
S30121: find the current minimum distance among the elements of the current distance matrix;
By way of example, when the minimum distance is searched for the first time, the elements of the current distance matrix are the Euclidean distances between each first initial position coordinate in S1 and each second initial position coordinate in S2.
S30122: determine whether the current minimum distance is less than the preset fusion distance threshold.
By way of example, the preset fusion distance threshold can be set empirically. For instance, the same target object is photographed in advance by different capture devices, and multiple position coordinates of that target object in the target site are determined from the video frames captured by the respective devices. The preset fusion distance threshold is then determined from the distances between these position coordinates.
S30123: if the current minimum distance is less than the preset fusion distance threshold, determine that the two initial position coordinates associated with the current minimum distance are initial position coordinates associated with the same target object.
By way of example, if $d_{a1}$ is found to be the current minimum distance and $d_{a1}$ is less than the preset fusion distance threshold, the a-th first initial position coordinate in S1 and the 1st second initial position coordinate in S2 can be taken as initial position coordinates associated with the same target object.
S30124: set the current minimum distance in the current distance matrix, together with all other distances that involve either of the two initial position coordinates associated with the current minimum distance, to the preset fusion distance threshold, and then return to S30121. The loop ends when the current minimum distance in the current distance matrix is greater than or equal to the preset fusion distance threshold, at which point all initial position coordinates in S1 and S2 that are associated with the same target object have been obtained.
By way of example, suppose the current distance matrix is computed from the initial position coordinates in S1 and S2 and is a 3×3 matrix:
$$
\begin{pmatrix}
d_{11} & d_{12} & d_{13} \\
d_{21} & d_{22} & d_{23} \\
d_{31} & d_{32} & d_{33}
\end{pmatrix}
$$
Let the preset fusion distance threshold be $d_{th}$. Suppose $d_{11}$ is the minimum distance in the current matrix and is less than $d_{th}$; then the 1st first initial position coordinate in S1 and the 1st second initial position coordinate in S2 are associated initial position coordinates of the same target object. In the current distance matrix, all other distances computed from either of these two initial position coordinates are $d_{12}$, $d_{13}$, $d_{21}$ and $d_{31}$. According to S30124, $d_{11}$, $d_{12}$, $d_{13}$, $d_{21}$ and $d_{31}$ in the current matrix are therefore all set to $d_{th}$, giving the matrix:
$$
\begin{pmatrix}
d_{th} & d_{th} & d_{th} \\
d_{th} & d_{22} & d_{23} \\
d_{th} & d_{32} & d_{33}
\end{pmatrix}
$$
Execution then returns to S30121.
By way of example, once the current minimum distance and all other distances involving either of its two associated initial position coordinates have been set to the preset fusion distance threshold, the elements set to the threshold can be excluded from subsequent searches for the current minimum distance, which improves search efficiency.
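The loop S30121 to S30124 can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function name `match_pairs` and the numeric distances are assumptions chosen only to make the steps concrete.

```python
def match_pairs(dist, threshold):
    """Greedy association of coordinates in S1 (rows) and S2 (columns).

    Returns a list of (i, j) index pairs judged to belong to the same target object.
    """
    d = [row[:] for row in dist]          # work on a copy of the distance matrix
    pairs = []
    if not d or not d[0]:
        return pairs
    rows, cols = len(d), len(d[0])
    while True:
        # S30121: find the current minimum distance and its position.
        value, i, j = min((d[r][c], r, c) for r in range(rows) for c in range(cols))
        # S30122: stop once the minimum is no longer below the threshold.
        if value >= threshold:
            break
        # S30123: the two coordinates belong to the same target object.
        pairs.append((i, j))
        # S30124: overwrite row i and column j with the threshold value.
        for c in range(cols):
            d[i][c] = threshold
        for r in range(rows):
            d[r][j] = threshold
    return pairs

# Hypothetical 3x3 distance matrix (metres) and threshold.
example = [
    [0.2, 5.0, 6.0],
    [4.0, 0.4, 3.0],
    [7.0, 2.5, 0.9],
]
print(match_pairs(example, threshold=1.0))
```

Each pass overwrites at least one entry that was below the threshold, so the loop terminates after at most min(a, b) matches.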
By way of example, in one implementation, after the multiple initial position coordinates in S1 and S2 associated with the same target object are obtained, the same determination continues for every other pair of video frames. Once all video frames captured by the A capture devices at the same moment have been processed, the different initial position coordinates of each target object in those video frames are known. The initial position coordinates associated with the same target object are then fused, giving the target position coordinates in the target site of each target object in the A video frames captured by the A capture devices at that moment.
By way of example, in another implementation, after the multiple initial position coordinates in S1 and S2 associated with the same target object are obtained, those coordinates are fused to give updated initial position coordinates for that target object. The initial position coordinates in S1 and S2 that did not take part in the fusion, together with the updated initial position coordinates, form S2'. A new current distance matrix is then built from the initial position coordinates in S2' and S3, and S30121 to S30124 are repeated to obtain the multiple initial position coordinates in S2' and S3 associated with the same target object; S3' is obtained in the same way. A new current distance matrix is next built from the initial position coordinates in S3' and S4, and S30121 to S30124 are repeated, and so on until fusion with the initial position coordinates of the last element of the initial coordinate set is complete. The result is the target position coordinates in the target site of each target object in the A video frames captured by the A capture devices at the same moment.
In particular, once fusion with the initial position coordinates of the last element of the initial coordinate set is complete, if any initial position coordinate is found never to have taken part in a fusion from start to finish, that coordinate can be filtered out as an erroneous initial position coordinate, since each target object in the target site is captured by at least two capture devices simultaneously.
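The second implementation, including the final filtering step, might look like the sketch below. Several points are assumptions and not taken from the disclosure: the name `fuse_cameras`, the use of the midpoint as the fusion of a matched pair, and the sample coordinates. Instead of overwriting matrix entries with the threshold, the sketch visits candidate pairs in ascending order of distance and skips coordinates that are already matched, which selects the same pairs up to tie-breaking.

```python
import math

def fuse_cameras(coordinate_sets, threshold):
    """Chain S1, S2, ..., SA and drop coordinates that were never fused."""
    if not coordinate_sets:
        return []
    # Each entry is [point, fused_flag]; the flag records whether the point
    # has taken part in at least one fusion.
    current = [[p, False] for p in coordinate_sets[0]]
    for next_set in coordinate_sets[1:]:
        incoming = [[p, False] for p in next_set]
        candidates = sorted(
            (math.dist(a[0], b[0]), i, j)
            for i, a in enumerate(current)
            for j, b in enumerate(incoming)
        )
        used_i, used_j, merged = set(), set(), []
        for dist, i, j in candidates:
            if dist >= threshold:
                break                      # no remaining pair is close enough
            if i in used_i or j in used_j:
                continue                   # one of the two is already matched
            used_i.add(i)
            used_j.add(j)
            a, b = current[i][0], incoming[j][0]
            midpoint = tuple((x + y) / 2 for x, y in zip(a, b))
            merged.append([midpoint, True])
        # Unmatched coordinates are carried forward (this forms S2', S3', ...).
        merged += [e for i, e in enumerate(current) if i not in used_i]
        merged += [e for j, e in enumerate(incoming) if j not in used_j]
        current = merged
    # Filter out coordinates that never took part in a fusion.
    return [point for point, fused in current if fused]

# Hypothetical coordinates from three capture devices; (50, 50) is seen by one only.
sets = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(0.2, 0.0), (10.0, 10.2)],
    [(0.1, 0.4), (50.0, 50.0)],
]
print(fuse_cameras(sets, threshold=1.0))
```

In this sample the coordinate seen by only one device is discarded, while the two objects seen by at least two devices are kept.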
In the embodiments of the present disclosure, the initial position coordinates associated with the same target object can be determined quickly from the initial position coordinates of the different target objects in any two video frames and the preset fusion distance threshold, providing the basis for subsequently determining the target position coordinates of each target object.
Specifically, with respect to S302 above, sequentially fusing the multiple initial position coordinates associated with the same target object to obtain the target position coordinates of that target object in the target site may include the following S3021 to S3022:
S3021: select any one of the multiple initial position coordinates associated with the same target object and take it as the first intermediate fused position coordinate.
S3022: fuse the first intermediate fused position coordinate with any other initial position coordinate to be fused among the multiple initial position coordinates to generate a second intermediate fused position coordinate; take the second intermediate fused position coordinate as the updated first intermediate fused position coordinate, and return to the step of generating the second intermediate fused position coordinate, until no initial position coordinate remains to be fused.
Here, an initial position coordinate to be fused is an initial position coordinate that has not yet taken part in the fusion.
By way of example, fusing the first intermediate fused position coordinate with any other initial position coordinate to be fused among the multiple initial position coordinates to generate the second intermediate fused position coordinate includes: determining the midpoint of the first intermediate fused position coordinate and that other initial position coordinate, and taking the midpoint as the generated second intermediate fused position coordinate.
By way of example, in combination with the above embodiments, if N initial position coordinates are determined to be associated with target object A, any one of them can be taken as the first intermediate fused position coordinate, and the midpoint of this coordinate and any other initial position coordinate to be fused is determined. The midpoint is then taken as the updated first intermediate fused position coordinate and fused with any other initial position coordinate still to be fused. When none of the N initial position coordinates remains to be fused, the target position coordinates of target object A are obtained.
In the embodiments of the present disclosure, it is proposed that the multiple initial position coordinates associated with the same target object can be fused by successively taking midpoints, thereby obtaining target position coordinates of higher accuracy.
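The successive-midpoint fusion of S3021 and S3022 can be sketched as below. The function name `fuse_sequentially` and the sample coordinates are assumptions for illustration only.

```python
def fuse_sequentially(coords):
    """Fuse coordinates by repeatedly taking the midpoint with the next one."""
    fused = coords[0]                      # S3021: first intermediate fused coordinate
    for nxt in coords[1:]:                 # S3022: fuse with each remaining coordinate
        fused = tuple((a + b) / 2 for a, b in zip(fused, nxt))
    return fused

# Three hypothetical initial position coordinates of one target object.
print(fuse_sequentially([(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]))
```

For this input the result is (2.5, 0.0), whereas the arithmetic mean would be (2.0, 0.0). Successive midpoints give later coordinates more weight, so with three or more coordinates the result depends on the order in which they are fused.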
The positioning method proposed in the embodiments of the present disclosure can accurately determine the target position coordinates of each target object in the target site, and can be applied in a variety of scenarios. Taking application in a factory as an example, after the target position coordinates of the target objects in the target site are obtained, as shown in FIG. 5, the positioning method provided by the embodiments of the present disclosure further includes the following S401 to S402:
S401: based on the target position coordinates corresponding to each target object in the target site and a preset target area, determine whether any target object has entered the target area;
S402: if a target object is determined to have entered the target area, issue an early warning.
By way of example, where the target site is a factory, the coordinate range corresponding to a dangerous target area in the factory can be set in advance in the world coordinate system corresponding to the target site. Whether any target object has entered the target area is then determined from the target position coordinates of each target object in the target site and the coordinate range corresponding to the target area. If a target object is determined to have entered the target area, an early warning is issued.
By way of example, the early warning may include, but is not limited to, an audible and visual alarm or a voice alarm. The early warning helps protect employees in the target site and improves the safety of the target site.
In the embodiments of the present disclosure, once target position coordinates of higher accuracy have been obtained for each target object in the target site, a preset target area, such as a preset danger zone, can be used to determine whether a target object in the target site has entered the target area, so that a timely warning can be issued and the safety of the target site improved.
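The check in S401 and S402 can be illustrated with a short sketch. The disclosure only states that the target area corresponds to a coordinate range in world coordinates; representing it as an axis-aligned rectangle, and the names `Zone` and `objects_in_zone`, are assumptions made here.

```python
from typing import Dict, List, NamedTuple, Tuple

class Zone(NamedTuple):
    """Axis-aligned rectangle in world coordinates (hypothetical representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

def objects_in_zone(positions: Dict[str, Tuple[float, float]], zone: Zone) -> List[str]:
    """S401: return the identifiers of target objects whose target position
    coordinates fall inside the zone."""
    return sorted(
        name
        for name, (x, y) in positions.items()
        if zone.x_min <= x <= zone.x_max and zone.y_min <= y <= zone.y_max
    )

danger = Zone(0.0, 0.0, 5.0, 5.0)
positions = {"worker_1": (2.0, 3.0), "worker_2": (8.0, 1.0)}

intruders = objects_in_zone(positions, danger)
if intruders:
    # S402: stand-in for an audible, visual, or voice alarm.
    print("warning: target objects in danger zone:", intruders)
```

A non-rectangular area would need a point-in-polygon test in place of the range comparison.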
Taking a factory as the target site, the positioning method provided by the present disclosure is described as a whole below with reference to a specific embodiment:
1) Install capture devices in the factory, for example multiple cameras. To position targets in the scene precisely while keeping the algorithm general and robust, the capture devices are given different viewing angles in the factory, and it is ensured that each employee entering the factory is captured by at least two capture devices simultaneously.
2) Determine the intrinsic parameter matrix and distortion coefficients of each camera using Zhang Zhengyou's calibration method.
3) Place multiple markers in the factory and determine the position coordinates, in the world coordinate system corresponding to the factory, of the points where the markers meet the ground. Determine the corrected pixel coordinates of these points in a sample video frame according to the camera's intrinsic parameter matrix and distortion coefficients. Then determine the homography matrix of each camera from the position coordinates of these points in the world coordinate system and their corrected pixel coordinates in the sample video frame.
4) Perform target detection on the video frames captured by the cameras in the factory using a neural network that incorporates a feature pyramid, obtaining the pixel coordinates of the employees contained in each video frame.
5) Correct the pixel coordinates of the employees contained in a video frame according to the intrinsic parameter matrix and distortion coefficients of the camera that captured it, obtaining the corrected pixel coordinates of the employees contained in that video frame.
6) Determine the initial position coordinates in the factory of the employees contained in a video frame according to the homography matrix of the camera that captured it and the corrected pixel coordinates of those employees.
7) Fuse the initial position coordinates of the same employee in the video frames captured at the same moment, obtaining the target position coordinates of the employees in the factory at that moment.
8) Determine whether any employee has entered a danger zone according to the target position coordinates of the employees in the factory at that moment and the danger zones preset in the factory. If an employee is determined to have entered a danger zone, issue an early warning.
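Step 6) above, mapping corrected pixel coordinates to the factory ground plane through a homography, can be sketched as follows. The sketch assumes that the homography maps pixel coordinates to world coordinates and that the input has already been corrected as in step 5); the function name `pixel_to_world` and the matrix values are hypothetical.

```python
def pixel_to_world(h, u, v):
    """Apply a 3x3 homography h (nested lists) to the corrected pixel (u, v).

    The result is divided by the third homogeneous component.
    """
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if w == 0:
        raise ValueError("pixel maps to a point at infinity")
    return (x / w, y / w)

# Hypothetical homography of one camera.
homography = [
    [0.5, 0.0, -10.0],
    [0.0, 0.5, -20.0],
    [0.0, 0.0, 2.0],
]
print(pixel_to_world(homography, 100.0, 200.0))
```

In practice the homography would be estimated per camera from the marker correspondences of step 3); the values here are chosen only so the arithmetic is easy to follow.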
Those skilled in the art will understand that, in the above method of the specific implementations, the order in which the steps are written does not imply a strict order of execution and places no limitation on the implementation; the actual order of execution of the steps should be determined by their function and possible internal logic.
Based on the same technical concept, the embodiments of the present disclosure also provide a positioning apparatus corresponding to the positioning method. Because the apparatus solves the problem on a principle similar to that of the positioning method described above, its implementation may refer to the implementation of the method, and repeated details are omitted.
FIG. 6 is a schematic diagram of a positioning apparatus 500 provided by an embodiment of the present disclosure. The positioning apparatus includes:
an acquisition module 501, configured to acquire video frames captured at the same moment by multiple capture devices installed in a target site, where different capture devices have different viewing angles in the target site and the video frames include target objects;
a determination module 502, configured to determine the initial position coordinates of the target objects in the target site, respectively, based on the video frames captured by the multiple capture devices at the same moment;
a fusion module 503, configured to fuse the initial position coordinates of the same target object to obtain the target position coordinates of that target object in the target site.
In one implementation, when determining the initial position coordinates of the target objects in the target site, respectively, based on the video frames captured by the multiple capture devices at the same moment, the determination module 502 is configured to:
obtain the pixel coordinates of the target objects in the video frames captured by the multiple capture devices at the same moment;
determine, based on the pixel coordinates of a target object in the video frame captured by each capture device and the parameter information of that capture device, the initial position coordinates of the target object captured by that capture device in the world coordinate system corresponding to the target site.
In a possible implementation, when obtaining the pixel coordinates of the target objects in the video frames captured by the multiple capture devices at the same moment, the determination module 502 is configured to:
input the multiple video frames into a pre-trained neural network to obtain the detection box of each target object in each video frame, where the neural network contains multiple target detection sub-networks for detecting target objects of different sizes;
extract the pixel coordinates, in each video frame, of a target position point on the detection box of each target object in that video frame, obtaining the pixel coordinates of the target object in that video frame.
In a possible implementation, when determining, based on the pixel coordinates of a target object in the video frame captured by each capture device and the parameter information of that capture device, the initial position coordinates of the target object in the world coordinate system corresponding to the target site, the determination module 502 is configured to:
correct the pixel coordinates of the target object in the video frame captured by each capture device based on the predetermined intrinsic parameter matrix and distortion coefficients of that capture device, obtaining the corrected pixel coordinates of the target object in that video frame;
determine the initial position coordinates of the target object in that video frame based on the predetermined homography matrix of that capture device and the corrected pixel coordinates of the target object in the video frame captured by that capture device.
在一种可能的实施方式中,融合模块503在用于对同一目标对象的初始位置坐标进行融合,得到目标对象在目标场所中的目标位置坐标,包括:In a possible implementation, the fusion module 503 is used to fuse the initial position coordinates of the same target object to obtain the target position coordinates of the target object in the target place, including:
基于多张视频画面确定的目标对象的初始位置坐标,确定与同一目标对象关联的多个初始位置坐标;Determine a plurality of initial position coordinates associated with the same target object based on the initial position coordinates of the target object determined based on the plurality of video images;
将与同一目标对象关联的多个初始位置坐标进行依次融合,得到同一目标对象在目标场所中的目标位置坐标。The multiple initial position coordinates associated with the same target object are sequentially fused to obtain the target position coordinates of the same target object in the target place.
在一种可能的实施方式中,融合模块503在用于将与同一目标对象关联的多个初始位置坐标进行依次融合,得到同一目标对象在目标场所中的目标位置坐标时,包括:In a possible implementation manner, when the fusion module 503 is used to sequentially fuse multiple initial position coordinates associated with the same target object to obtain the target position coordinates of the same target object in the target place, it includes:
从同一目标对象关联的多个初始位置坐标中选取任一初始位置坐标,将该初始位置坐标作为第一中间融合位置坐标;Select any initial position coordinate from a plurality of initial position coordinates associated with the same target object, and use the initial position coordinate as the first intermediate fusion position coordinate;
将第一中间融合位置坐标与所述多个初始位置坐标中其它任一待融合的初始位置坐标进行融合,生成第二中间融合位置坐标,将第二中间融合位置坐标作为更新后的第一中间融合位置坐标,并返回生成第二中间融合位置坐标的步骤,直到不存在待融合的初始位置坐标。The first intermediate fusion position coordinates are fused with any other initial position coordinates to be fused in the plurality of initial position coordinates to generate the second intermediate fusion position coordinates, and the second intermediate fusion position coordinates are used as the updated first intermediate The position coordinates are fused, and the step of generating the second intermediate fused position coordinates is returned until there are no initial position coordinates to be fused.
在一种可能的实施方式中,融合模块503在用于将第一中间融合位置坐标与所述多个初始位置坐标中其它任一待融合的初始位置坐标进行融合,生成第二中间融合位置坐标时,包括:In a possible implementation manner, the fusion module 503 is used to fuse the first intermediate fusion position coordinate with any other initial position coordinate to be fused among the plurality of initial position coordinates to generate the second intermediate fusion position coordinate , including:
确定第一中间融合位置坐标与其它任一待融合的初始位置坐标的中点坐标,将该中点坐标作为生成的第二中间融合位置坐标。Determine the midpoint coordinate of the first intermediate fusion position coordinate and any other initial position coordinate to be fused, and use the midpoint coordinate as the generated second intermediate fusion position coordinate.
In a possible implementation, when determining, based on the initial position coordinates of the target objects determined from the plurality of video frames, a plurality of initial position coordinates associated with the same target object, the fusion module 503 is configured to:
for any two video frames among the plurality of video frames, determine the target objects in the first of the two video frames as first target objects and the target objects in the second of the two video frames as second target objects; and, for the initial position coordinates of each first target object, determine the distance between the initial position coordinates of that first target object and the initial position coordinates of each second target object in the second video frame;
determine that the second target object having the minimum distance from that first target object is the same target object as that first target object, where the minimum distance is less than a preset fusion distance threshold; and take the initial position coordinates of that first target object, together with the initial position coordinates of the second target object having the minimum distance from it, as the plurality of initial position coordinates associated with the same target object.
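By way of illustration only, the nearest-distance association between two video frames can be sketched as follows. This is a minimal sketch under stated assumptions: positions are 2D world coordinates, the distance is Euclidean, and the names `match_targets` and `max_fusion_distance` are hypothetical.

```python
import math
from typing import List, Tuple

# Assumption: a position is a 2D (x, y) coordinate in the world coordinate system.
Point = Tuple[float, float]


def match_targets(
    first: List[Point],
    second: List[Point],
    max_fusion_distance: float,
) -> List[Tuple[int, int]]:
    """Pair each first target object with its nearest second target object.

    Returns (first_index, second_index) pairs. A pair is kept only when the
    minimum distance is below the preset fusion distance threshold.
    """
    matches = []
    for i, p in enumerate(first):
        best_j = None
        best_d = math.inf
        for j, q in enumerate(second):
            d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d < max_fusion_distance:
            matches.append((i, best_j))
    return matches


if __name__ == "__main__":
    frame_a = [(0.0, 0.0), (10.0, 10.0)]
    frame_b = [(9.5, 10.0), (0.2, 0.1), (50.0, 50.0)]
    print(match_targets(frame_a, frame_b, max_fusion_distance=1.0))
```

The sketch does not enforce one-to-one matching, so two first target objects could be paired with the same second target object; whether that is acceptable depends on the application and is not specified here.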
In a possible implementation, after the fusion module 503 obtains the target position coordinates of the target object in the target place, the determination module 502 is further configured to:
determine, based on the target position coordinates respectively corresponding to the target objects in the target place and a preset target area, whether any target object has entered the target area; and
issue an early-warning prompt when it is determined that a target object has entered the target area.
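By way of illustration only, the target-area check can be sketched as follows. The shape of the preset target area is not specified above, so this sketch assumes an axis-aligned rectangle in world coordinates; the names `Rect`, `targets_in_area`, and `warn_on_entry`, and the use of a printed message as the early-warning prompt, are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
# Assumption: the target area is an axis-aligned rectangle
# given as (x_min, y_min, x_max, y_max) in world coordinates.
Rect = Tuple[float, float, float, float]


def targets_in_area(target_positions: Dict[str, Point], area: Rect) -> List[str]:
    """Return the identifiers of target objects whose position lies in the area."""
    x_min, y_min, x_max, y_max = area
    return [
        target_id
        for target_id, (x, y) in target_positions.items()
        if x_min <= x <= x_max and y_min <= y <= y_max
    ]


def warn_on_entry(target_positions: Dict[str, Point], area: Rect) -> List[str]:
    """Print a warning for each target object inside the area and return their ids."""
    entered = targets_in_area(target_positions, area)
    for target_id in entered:
        print(f"warning: target {target_id} has entered the target area")
    return entered


if __name__ == "__main__":
    positions = {"a": (1.0, 1.0), "b": (5.0, 5.0)}
    warn_on_entry(positions, (0.0, 0.0, 2.0, 2.0))
```

A polygonal target area would need a point-in-polygon test in place of the rectangle comparison.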
For a description of the processing flow of each module in the apparatus and of the interaction flow between the modules, reference may be made to the relevant description in the foregoing method embodiments; details are not repeated here.
Corresponding to the positioning method in FIG. 1, an embodiment of the present disclosure further provides an electronic device 600. FIG. 7 is a schematic structural diagram of the electronic device 600 provided by the embodiment of the present disclosure, which includes:
a processor 610, a memory 620, and a bus 630. The memory 620 is configured to store execution instructions and includes an internal memory 621 and an external memory 622. The internal memory 621 temporarily stores operation data of the processor 610 and data exchanged with the external memory 622, such as a hard disk; the processor 610 exchanges data with the external memory 622 through the internal memory 621. When the electronic device 600 runs, the processor 610 communicates with the memory 620 through the bus 630, so that the processor 610 executes the following instructions: acquiring video frames captured at the same moment by a plurality of acquisition devices arranged in a target place, where different acquisition devices have different acquisition viewing angles in the target place and the video frames contain target objects; determining, based on the video frames captured at the same moment by the plurality of acquisition devices, the initial position coordinates of the target objects in the target place respectively; and fusing the initial position coordinates of the same target object to obtain the target position coordinates of that target object in the target place.
An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the positioning method described in the foregoing method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product that carries program code and is stored in a storage medium. The instructions included in the program code can be used to perform the steps of the positioning method described in the foregoing method embodiments; for details, refer to the foregoing method embodiments, which are not repeated here.
The computer program product may be implemented in hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and apparatus described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments described above are merely specific implementations of the present disclosure, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (12)

  1. A positioning method, comprising:
    acquiring a plurality of video frames captured at the same moment by a plurality of acquisition devices arranged in a target place, wherein different acquisition devices have different acquisition viewing angles in the target place, the plurality of video frames include target objects, and the target objects are objects to be positioned in the target place;
    determining, based on the plurality of video frames, initial position coordinates of the target objects in the target place respectively; and
    fusing the initial position coordinates of a same target object among the target objects to obtain target position coordinates of that target object in the target place.
  2. The positioning method according to claim 1, wherein determining, based on the plurality of video frames, the initial position coordinates of the target objects in the target place respectively comprises:
    acquiring pixel coordinates of the target objects in the plurality of video frames; and
    for each of the plurality of acquisition devices, determining, based on the pixel coordinates of at least one of the target objects in the video frame captured by the acquisition device and parameter information of the acquisition device, the initial position coordinates, in a world coordinate system corresponding to the target place, of the at least one of the target objects captured by the acquisition device.
  3. The positioning method according to claim 2, wherein acquiring the pixel coordinates of the target objects in the plurality of video frames comprises:
    inputting the plurality of video frames into a pre-trained neural network;
    for each of the plurality of video frames,
    obtaining a detection box of a target object in the video frame; and
    extracting the pixel coordinates, in the video frame, of a target position point on the detection box of the target object in the video frame, to obtain the pixel coordinates of the target object in the video frame.
  4. The positioning method according to claim 2 or 3, wherein determining, based on the pixel coordinates of at least one of the target objects in the video frame captured by the acquisition device and the parameter information of the acquisition device, the initial position coordinates, in the world coordinate system corresponding to the target place, of the at least one of the target objects captured by the acquisition device comprises:
    correcting, based on a predetermined intrinsic parameter matrix and predetermined distortion parameters of the acquisition device, the pixel coordinates of the at least one of the target objects in the video frame captured by the acquisition device, to obtain corrected pixel coordinates of the at least one of the target objects in the video frame; and
    determining, based on a predetermined homography matrix of the acquisition device and the corrected pixel coordinates of the at least one of the target objects in the video frame captured by the acquisition device, the initial position coordinates of the at least one of the target objects in the video frame.
  5. The positioning method according to any one of claims 1 to 4, wherein fusing the initial position coordinates of the same target object among the target objects to obtain the target position coordinates of that target object in the target place comprises:
    determining, based on the initial position coordinates of the target objects in the target place, a plurality of initial position coordinates associated with the same target object among the target objects; and
    sequentially fusing the plurality of initial position coordinates associated with that target object to obtain the target position coordinates of that target object in the target place.
  6. The positioning method according to claim 5, wherein sequentially fusing the plurality of initial position coordinates associated with that target object to obtain the target position coordinates of that target object in the target place comprises:
    selecting any one of the plurality of initial position coordinates associated with that target object, and taking the selected initial position coordinates as first intermediate fused position coordinates; and
    fusing the first intermediate fused position coordinates with any other to-be-fused initial position coordinates among the plurality of initial position coordinates to generate second intermediate fused position coordinates, taking the second intermediate fused position coordinates as the updated first intermediate fused position coordinates, and returning to the step of generating the second intermediate fused position coordinates, until no to-be-fused initial position coordinates remain among the plurality of initial position coordinates.
  7. The positioning method according to claim 6, wherein fusing the first intermediate fused position coordinates with any other to-be-fused initial position coordinates among the plurality of initial position coordinates to generate the second intermediate fused position coordinates comprises:
    determining the midpoint coordinates of the first intermediate fused position coordinates and the other to-be-fused initial position coordinates among the plurality of initial position coordinates, and taking the midpoint coordinates as the second intermediate fused position coordinates.
  8. The positioning method according to any one of claims 5 to 7, wherein determining, based on the initial position coordinates of the target objects in the target place, the plurality of initial position coordinates associated with the same target object among the target objects comprises:
    for any two video frames among the plurality of video frames, determining the target objects in the first video frame of the two video frames as first target objects, and determining the target objects in the second video frame of the two video frames as second target objects;
    determining the initial position coordinates of each first target object in the first video frame;
    determining the initial position coordinates of each second target object in the second video frame;
    for the initial position coordinates of each first target object,
    determining the distance between the initial position coordinates of that first target object and the initial position coordinates of each second target object;
    determining that the second target object having the minimum distance from that first target object is the same target object as that first target object, wherein the minimum distance is less than a preset fusion distance threshold; and
    taking the initial position coordinates of that first target object and the initial position coordinates of the second target object having the minimum distance from that first target object as the plurality of initial position coordinates associated with the same target object among the target objects.
  9. The positioning method according to any one of claims 1 to 8, wherein, after the target position coordinates of that target object in the target place are obtained, the positioning method further comprises:
    determining, based on the target position coordinates respectively corresponding to the target objects in the target place and a preset target area, whether any target object has entered the target area; and
    issuing an early-warning prompt when it is determined that a target object has entered the target area.
  10. A positioning apparatus, comprising:
    an acquisition module, configured to acquire a plurality of video frames captured at the same moment by a plurality of acquisition devices arranged in a target place, wherein different acquisition devices have different acquisition viewing angles in the target place, the plurality of video frames contain target objects, and the target objects are objects to be positioned in the target place;
    a determination module, configured to determine, based on the plurality of video frames, initial position coordinates of the target objects in the target place respectively; and
    a fusion module, configured to fuse the initial position coordinates of a same target object among the target objects to obtain target position coordinates of that target object in the target place.
  11. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus; and the machine-readable instructions, when executed by the processor, perform the steps of the positioning method according to any one of claims 1 to 9.
  12. A computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the positioning method according to any one of claims 1 to 9.
PCT/CN2021/127625 2021-04-28 2021-10-29 Positioning method and apparatus, electronic device, and storage medium WO2022227462A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110467657.9 2021-04-28
CN202110467657.9A CN113129378A (en) 2021-04-28 2021-04-28 Positioning method, positioning device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022227462A1 (en) 2022-11-03

Family

ID=76781086

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/127625 WO2022227462A1 (en) 2021-04-28 2021-10-29 Positioning method and apparatus, electronic device, and storage medium

Country Status (3)

Country Link
CN (1) CN113129378A (en)
TW (1) TW202242803A (en)
WO (1) WO2022227462A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117541910A (en) * 2023-10-27 2024-02-09 北京市城市规划设计研究院 Fusion method and device for urban road multi-radar data

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113129378A (en) * 2021-04-28 2021-07-16 北京市商汤科技开发有限公司 Positioning method, positioning device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150268033A1 (en) * 2014-03-21 2015-09-24 The Boeing Company Relative Object Localization Process for Local Positioning System
CN107015193A (en) * 2017-04-18 2017-08-04 中国矿业大学(北京) A kind of binocular CCD vision mine movable object localization methods and system
CN110210446A (en) * 2019-06-12 2019-09-06 广东工业大学 A kind of sitting posture condition detection method, device, equipment and the medium of target object
CN110363179A (en) * 2019-07-23 2019-10-22 联想(北京)有限公司 Ground picture capturing method, device, electronic equipment and storage medium
CN112653848A (en) * 2020-12-23 2021-04-13 北京市商汤科技开发有限公司 Display method and device in augmented reality scene, electronic equipment and storage medium
CN113129378A (en) * 2021-04-28 2021-07-16 北京市商汤科技开发有限公司 Positioning method, positioning device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163914B (en) * 2018-08-01 2021-05-25 京东方科技集团股份有限公司 Vision-based positioning
CN111380502B (en) * 2020-03-13 2022-05-24 商汤集团有限公司 Calibration method, position determination method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
TW202242803A (en) 2022-11-01
CN113129378A (en) 2021-07-16

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21938925; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21938925; Country of ref document: EP; Kind code of ref document: A1)