CN116843727B - Cross-video-source target handover positioning method and system


Info

Publication number
CN116843727B
Authority
CN
China
Prior art keywords
video
weight
calculating
positioning
value
Prior art date
Legal status
Active
Application number
CN202311119686.1A
Other languages
Chinese (zh)
Other versions
CN116843727A (en)
Inventor
黄程
杨光
李卫红
Current Assignee
Guangdong Normal University Weizhi Information Technology Co ltd
Original Assignee
Guangdong Normal University Weizhi Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Normal University Weizhi Information Technology Co ltd
Priority to CN202311119686.1A
Publication of CN116843727A
Application granted
Publication of CN116843727B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/292: Multi-camera tracking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N 21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00: Television systems
    • H04N 7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N 7/181: Closed-circuit television [CCTV] systems for receiving images from a plurality of remote sources

Abstract

The invention discloses a cross-video-source target handover positioning method and system, the method comprising: acquiring a first video picture from a first video source and a second video picture from a second video source, the first video picture and the second video picture having an overlapping portion; performing positioning calculation on a target object according to the first video picture and the second video picture respectively, to obtain first position information and second position information; calculating the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture, and determining weight information corresponding to the target object according to the proportion; and determining final position information corresponding to the target object according to the weight information, the first position information and the second position information. The invention can thereby locate an object more accurately by means of the characteristics and parameters of the overlapping field of view, achieving a better cross-video-source target handover positioning effect.

Description

Cross-video-source target handover positioning method and system
Technical Field
The invention relates to the technical field of video target positioning, and in particular to a cross-video-source target handover positioning method and system.
Background
In recent years, deep learning and a variety of high-performing neural network models have driven rapid progress in intelligent video surveillance. Problems remain, however, when multiple video sources such as cameras cooperate to locate a target. In cross-video-source tracking and positioning, target handover is typically limited to simply passing the target information from camera A to camera B, which then continues the tracking. Because differences between cameras, or differences in their mounting poses, introduce errors into the tracking and positioning results, handing the target over directly causes jumps in the target's position points and abrupt changes in its trajectory. The prior art is therefore deficient and needs improvement.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a cross-video-source target handover positioning method and system that can locate an object more accurately by means of the characteristics and parameters of the overlapping field of view, so as to achieve a better cross-video-source target handover positioning effect.
To solve the above technical problem, a first aspect of the invention discloses a cross-video-source target handover positioning method, the method comprising:
acquiring a first video picture including a target object from a first video source and a second video picture including the target object from a second video source; the first video picture and the second video picture have an overlapping portion;
performing positioning calculation on the target object according to the first video picture and the second video picture respectively, to obtain first position information and second position information;
calculating the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture, and determining weight information corresponding to the target object according to the proportion;
and determining final position information corresponding to the target object according to the weight information, the first position information and the second position information.
As an optional implementation manner, in the first aspect of the invention, performing positioning calculation on the target object according to the first video picture and the second video picture to obtain the first position information and the second position information includes:
unifying the coordinate systems of the first video source and the second video source based on a homonymous-point coordinate-system conversion algorithm, to obtain coordinate conversion relations corresponding to the first video source and the second video source respectively;
performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the first video source and the first video picture, to obtain the first position information;
and performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the second video source and the second video picture, to obtain the second position information.
As an optional implementation manner, in the first aspect of the invention, calculating the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture, and determining the weight information corresponding to the target object according to the proportion, includes:
calculating a target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture, the boundary corresponding to a view boundary of the first video source or a view boundary of the second video source;
and calculating the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion.
As an optional implementation manner, in the first aspect of the invention, the target object gradually moves away from the first video source and approaches the second video source during its movement; calculating the target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture includes:
determining, according to the first video picture and the second video picture, an object position of the target object and a target boundary in the overlapping portion corresponding to the view boundary of the second video source;
calculating the projection distance between the object position and the target boundary relative to the shooting direction of the video sources, to obtain the target distance;
and calculating the projection length of the overlapping portion relative to the shooting direction of the video sources, to obtain the total length of the overlapping portion.
As an optional implementation manner, in the first aspect of the invention, calculating the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion includes:
calculating the distance ratio of the target distance to the total length of the overlapping portion;
determining the distance ratio as a first weight corresponding to the first video source;
and calculating the difference between 1 and the distance ratio, and determining the difference as a second weight corresponding to the second video source.
As an optional implementation manner, in the first aspect of the invention, after calculating the difference between 1 and the distance ratio and determining the difference as the second weight corresponding to the second video source, the method further includes:
inputting the first video picture into a pre-trained speed prediction neural network model and a pre-trained positioning-accuracy prediction neural network model, to obtain an output first speed prediction value and first positioning-accuracy prediction value corresponding to the first video picture; the speed prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding speed labels; the positioning-accuracy prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding positioning-accuracy labels;
calculating a first speed weight inversely proportional to the first speed prediction value, a first positioning weight proportional to the first positioning-accuracy prediction value, and a first model-related weight; the first model-related weight is proportional to a first prediction characterization value; the first prediction characterization value is the difference between the ratio of the first speed prediction value to the first positioning-accuracy prediction value and a preset ratio threshold, and characterizes the accuracy of the model predictions;
multiplying the first weight by the first speed weight, the first positioning weight and the first model-related weight, to obtain a new first weight;
inputting the second video picture into the speed prediction neural network model and the positioning-accuracy prediction neural network model, to obtain an output second speed prediction value and second positioning-accuracy prediction value corresponding to the second video picture;
calculating a second speed weight inversely proportional to the second speed prediction value, a second positioning weight proportional to the second positioning-accuracy prediction value, and a second model-related weight; the second model-related weight is proportional to a second prediction characterization value; the second prediction characterization value is the difference between the ratio of the second speed prediction value to the second positioning-accuracy prediction value and the ratio threshold, and characterizes the accuracy of the model predictions;
and multiplying the second weight by the second speed weight, the second positioning weight and the second model-related weight, to obtain a new second weight.
As an optional implementation manner, in the first aspect of the present invention, the determining final location information corresponding to the target object according to the weight information, the first location information, and the second location information includes:
Calculating a first product of the first location information and the first weight;
calculating a second product of the second location information and the second weight;
and calculating the sum of the first product and the second product to obtain final position information corresponding to the target object.
A second aspect of the invention discloses a cross-video-source target handover positioning system, the system comprising:
an acquisition module for acquiring a first video picture including a target object from a first video source and a second video picture including the target object from a second video source; the first video picture and the second video picture have an overlapping portion;
a positioning module for performing positioning calculation on the target object according to the first video picture and the second video picture respectively, to obtain first position information and second position information;
a calculating module for calculating the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture, and determining weight information corresponding to the target object according to the proportion;
and a determining module for determining final position information corresponding to the target object according to the weight information, the first position information and the second position information.
As an optional implementation manner, in the second aspect of the invention, the specific manner in which the positioning module performs positioning calculation on the target object according to the first video picture and the second video picture to obtain the first position information and the second position information includes:
unifying the coordinate systems of the first video source and the second video source based on a homonymous-point coordinate-system conversion algorithm, to obtain coordinate conversion relations corresponding to the first video source and the second video source respectively;
performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the first video source and the first video picture, to obtain the first position information;
and performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the second video source and the second video picture, to obtain the second position information.
As an optional implementation manner, in the second aspect of the invention, the specific manner in which the calculating module calculates the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture and determines the weight information corresponding to the target object according to the proportion includes:
calculating a target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture, the boundary corresponding to a view boundary of the first video source or a view boundary of the second video source;
and calculating the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion.
As an optional implementation manner, in the second aspect of the invention, the target object gradually moves away from the first video source and approaches the second video source during its movement; the specific manner in which the calculating module calculates the target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture includes:
determining, according to the first video picture and the second video picture, an object position of the target object and a target boundary in the overlapping portion corresponding to the view boundary of the second video source;
calculating the projection distance between the object position and the target boundary relative to the shooting direction of the video sources, to obtain the target distance;
and calculating the projection length of the overlapping portion relative to the shooting direction of the video sources, to obtain the total length of the overlapping portion.
As an optional implementation manner, in the second aspect of the invention, the specific manner in which the calculating module calculates the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion includes:
calculating the distance ratio of the target distance to the total length of the overlapping portion;
determining the distance ratio as a first weight corresponding to the first video source;
and calculating the difference between 1 and the distance ratio, and determining the difference as a second weight corresponding to the second video source.
As an optional implementation manner, in the second aspect of the invention, after calculating the difference between 1 and the distance ratio and determining the difference as the second weight corresponding to the second video source, the calculating module further performs the following steps:
inputting the first video picture into a pre-trained speed prediction neural network model and a pre-trained positioning-accuracy prediction neural network model, to obtain an output first speed prediction value and first positioning-accuracy prediction value corresponding to the first video picture; the speed prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding speed labels; the positioning-accuracy prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding positioning-accuracy labels;
calculating a first speed weight inversely proportional to the first speed prediction value, a first positioning weight proportional to the first positioning-accuracy prediction value, and a first model-related weight; the first model-related weight is proportional to a first prediction characterization value; the first prediction characterization value is the difference between the ratio of the first speed prediction value to the first positioning-accuracy prediction value and a preset ratio threshold, and characterizes the accuracy of the model predictions;
multiplying the first weight by the first speed weight, the first positioning weight and the first model-related weight, to obtain a new first weight;
inputting the second video picture into the speed prediction neural network model and the positioning-accuracy prediction neural network model, to obtain an output second speed prediction value and second positioning-accuracy prediction value corresponding to the second video picture;
calculating a second speed weight inversely proportional to the second speed prediction value, a second positioning weight proportional to the second positioning-accuracy prediction value, and a second model-related weight; the second model-related weight is proportional to a second prediction characterization value; the second prediction characterization value is the difference between the ratio of the second speed prediction value to the second positioning-accuracy prediction value and the ratio threshold, and characterizes the accuracy of the model predictions;
and multiplying the second weight by the second speed weight, the second positioning weight and the second model-related weight, to obtain a new second weight.
As an optional implementation manner, in the second aspect of the invention, the specific manner in which the determining module determines the final position information corresponding to the target object according to the weight information, the first position information and the second position information includes:
calculating a first product of the first location information and the first weight;
calculating a second product of the second location information and the second weight;
and calculating the sum of the first product and the second product to obtain final position information corresponding to the target object.
In a third aspect, the present invention discloses another cross-video source target handover positioning system, the system comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform some or all of the steps in the cross-video source target handover positioning method disclosed in the first aspect of the present invention.
A fourth aspect of the invention discloses a computer storage medium storing computer instructions which, when invoked, are operable to perform part or all of the steps of the method of cross-video source object handover positioning disclosed in the first aspect of the invention.
Compared with the prior art, the invention has the following beneficial effects:
the invention calculates weights from the proportion of the overlapping field of view occupied by the target object and uses them to adjust the position calculation of the target object, thereby locating the object more accurately by means of the characteristics and parameters of the overlapping field of view and achieving a better cross-video-source target handover positioning effect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for positioning a target handover across video sources according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a cross-video source target handover positioning system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another cross-video source target handover positioning system according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a target handover positioning calculation method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The terms first, second and the like in the description and in the claims and in the above-described figures are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The invention discloses a cross-video-source target handover positioning method and system that calculate weights from the proportion of the overlapping field of view occupied by the target object and use them to adjust the position calculation of the target object, thereby locating the object more accurately by means of the characteristics and parameters of the overlapping field of view and achieving a better cross-video-source target handover positioning effect. This is described in detail below.
Example 1
Referring to fig. 1, fig. 1 is a schematic flow chart of a cross-video-source target handover positioning method according to an embodiment of the present invention. The method described in fig. 1 may be applied to a corresponding data processing device, data processing terminal or data processing server, where the server may be a local server or a cloud server; the embodiment of the present invention is not limited in this respect. As shown in fig. 1, the cross-video-source target handover positioning method may include the following operations:
101. A first video picture including a target object is acquired from a first video source, and a second video picture including the target object is acquired from a second video source.
Specifically, the first video picture and the second video picture have an overlapping portion, which is the overlapping field of view between the first video source and the second video source.
102. Positioning calculation is performed on the target object according to the first video picture and the second video picture respectively, to obtain first position information and second position information.
103. The proportion of the overlapping portion occupied by the target object is calculated according to the first video picture and the second video picture, and weight information corresponding to the target object is determined according to the proportion.
104. Final position information corresponding to the target object is determined according to the weight information, the first position information and the second position information.
Therefore, the method described in the embodiment of the present invention can calculate weights from the proportion of the overlapping field of view occupied by the target object and use them to adjust the position calculation of the target object, thereby locating the object more accurately by means of the characteristics and parameters of the overlapping field of view and achieving a better cross-video-source target handover positioning effect.
As an optional embodiment, in the above step, performing positioning calculation on the target object according to the first video picture and the second video picture to obtain the first position information and the second position information includes:
unifying the coordinate systems of the first video source and the second video source based on a homonymous-point coordinate-system conversion algorithm, to obtain coordinate conversion relations corresponding to the first video source and the second video source respectively;
performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the first video source and the first video picture, to obtain the first position information;
and performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the second video source and the second video picture, to obtain the second position information.
Specifically, to unify the coordinate systems, the pixel coordinate system of each video source can be converted, through the imaging principle and a series of related geometric transformation matrix operations, into the world coordinate system, so that the actual position of the target can be recovered. Then, by selecting four pairs of homonymous points, the world coordinate systems of the two video sources are combined, via singular value decomposition, to obtain the rotation-translation matrix between the two coordinate systems and complete the unification of the coordinate systems.
Specifically, the relation between the physical coordinate system and the pixel coordinate system can be derived from the imaging principle of the camera device corresponding to each video source. The purpose of the coordinate-system conversion is to describe the position information of the target better: the three-dimensional spatial information of the target is obtained from the two-dimensional image information.
Specifically, the process of converting pixel coordinates into world coordinates is as follows:
a. the camera internal and external parameters are acquired through camera calibration, such as focal length, pixel size, rotation matrix and translation vector of the camera, etc. Alternatively, zhang Zhengyou checkerboard camera calibration may be used to obtain the internal and external parameters of the camera.
b. The pixel coordinates are converted into coordinates in the camera coordinate system according to the internal parameters of the camera; the conversion involves the pixel size and the focal length of the camera. The pixel coordinates $(u, v)$ are first converted into image coordinates $(x, y)$ on the continuous physical image plane:

$$x = (u - u_0)\,dx, \qquad y = (v - v_0)\,dy$$

c. The perspective relation between the camera coordinate system and the image coordinate system can be derived from the principle of similar triangles; given the distance $Z_c$ between the object and the camera in space and the focal length $f$ of the camera, the camera coordinates $(X_c, Y_c, Z_c)$ satisfy:

$$X_c = \frac{x\,Z_c}{f}, \qquad Y_c = \frac{y\,Z_c}{f}$$

d. The world coordinate system and the camera coordinate system are related by a rigid transformation, consisting mainly of a rotation and a translation:

$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = R \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} + T$$

where $u_0$ and $v_0$ are the coordinates of the origin of the image coordinate system in the horizontal and vertical pixel directions respectively, $(x, y)$ are the image coordinates corresponding to the pixel coordinates $(u, v)$, $(X_c, Y_c, Z_c)$ are the corresponding camera coordinates, $dx$ and $dy$ are the lengths of a pixel cell in the horizontal and vertical directions respectively, $f$ is the focal length of the camera, $Z_c$ is the distance of the object in space from the camera, $(X_w, Y_w, Z_w)$ are the corresponding world coordinates, $R$ is a $3 \times 3$ rotation matrix, and $T$ is a translation vector.

Combining the above formulas yields the equation converting pixel coordinates into world coordinates:

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1/dx & 0 & u_0 \\ 0 & 1/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \left( R \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} + T \right)$$
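To make the conversion concrete, the following is a minimal Python sketch of the pixel-to-world conversion described in steps b to d. It assumes the internal parameters (u0, v0, dx, dy, f), the extrinsic pose (R, T) and the object's depth Zc are already known from calibration (for example, a Zhang-style checkerboard calibration as mentioned above); all function and variable names are illustrative, not taken from the patent.

```python
import numpy as np

def pixel_to_world(u, v, Zc, u0, v0, dx, dy, f, R, T):
    """Convert pixel coordinates (u, v) to world coordinates, given the depth Zc."""
    # step b: pixel coordinates -> image (physical) coordinates
    x = (u - u0) * dx
    y = (v - v0) * dy
    # step c: image coordinates -> camera coordinates via similar triangles
    Pc = np.array([x * Zc / f, y * Zc / f, Zc])
    # step d inverted: camera -> world coordinates, from Pc = R @ Pw + T
    Pw = R.T @ (Pc - T)  # R is orthonormal, so its inverse is its transpose
    return Pw

# Example with an identity extrinsic pose (assumed values):
R, T = np.eye(3), np.zeros(3)
print(pixel_to_world(u=960, v=540, Zc=5.0, u0=960, v0=540,
                     dx=1e-5, dy=1e-5, f=0.004, R=R, T=T))  # -> [0. 0. 5.]
```

Note that a single camera only determines a ray per pixel; the sketch assumes the depth Zc is supplied externally, e.g. by a ground-plane constraint.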
the camera devices corresponding to the two video sources respectively obtain corresponding three-dimensional coordinate systems after projection transformation, the two three-dimensional coordinate systems are integrated under the same coordinate system, and according to homonymous point information, the rotation translation matrix can be utilized to realize the integration of the coordinate systems:
selecting homonymy point P in overlapping vision fields of camera equipment CamA and CamB corresponding to two video sources Ai (x Ai ,y Ai ,z Ai ) And P Bi (x Bi ,y Bi, z Bi ) (where i=1, 2,3, 4), we express these coordinates as a matrix form:
and multiplying the transposed matrix of B with the matrix of A to obtain a matrix M, and then carrying out singular value division on the matrix M to obtain a left singular vector U, a right singular vector V and a singular value matrix S of the matrix M. Taking the first three columns in U, V to form R (rotation matrix), subtracting the average coordinate of A from the average coordinate of B to obtain a translation vector T, obtaining a final transformation matrix T, and realizing the unification of two coordinate systems:
Through this embodiment, the coordinate systems of the two video sources can be unified based on the homonymous-point coordinate-system conversion algorithm, and accurate first position information and second position information in the same coordinate system can then be calculated, so that an accurate specific position of the target object can be obtained in the subsequent calculation for positioning it.
As an optional embodiment, in the above step, calculating the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture, and determining the weight information corresponding to the target object according to the proportion, includes:
calculating a target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture, the boundary corresponding to a view boundary of the first video source or a view boundary of the second video source;
and calculating the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion.
Through this embodiment, the weight information corresponding to the target object can be calculated from the ratio of the target distance, measured from the target object to one boundary of the overlapping portion, to the total length of the overlapping portion. The weight information is thus adjusted flexibly according to the proportion of the overlapping portion occupied by the target object, so that a more accurate specific position of the target object can be obtained in the subsequent calculation for positioning it.
As an optional embodiment, the target object gradually moves away from the first video source and approaches the second video source during its movement; in the above step, calculating the target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture includes:
determining, according to the first video picture and the second video picture, the object position of the target object and the target boundary in the overlapping portion corresponding to the view boundary of the second video source;
calculating the projection distance between the object position and the target boundary relative to the shooting direction of the video sources, to obtain the target distance;
and calculating the projection length of the overlapping portion relative to the shooting direction of the video sources, to obtain the total length of the overlapping portion.
Through this embodiment, the projection distance between the object position and the target boundary relative to the shooting direction of the video sources, and the projection length of the overlapping portion relative to that direction, can be calculated according to the projection principle, giving a more accurate proportion for the target object, so that a more accurate specific position of the target object can be obtained in the subsequent calculation for positioning it.
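A minimal sketch of these projection computations follows; the shooting-direction vector, the boundary points and all names are assumptions for illustration only.

```python
import numpy as np

def projected_length(p_from, p_to, shoot_dir):
    """Length of the vector p_from -> p_to projected onto the unit shooting direction."""
    d = np.asarray(shoot_dir, float)
    d = d / np.linalg.norm(d)
    return abs(float(np.dot(np.asarray(p_to, float) - np.asarray(p_from, float), d)))

shoot_dir = np.array([1.0, 0.0, 0.0])   # assumed camera shooting direction
obj = np.array([2.0, 1.0, 0.0])         # object position X (world frame)
boundary_b = np.array([6.0, 0.5, 0.0])  # target boundary (second source's view edge)
overlap_a = np.array([0.0, 0.0, 0.0])   # opposite boundary of the overlapping portion

target_distance = projected_length(obj, boundary_b, shoot_dir)       # 4.0
overlap_total = projected_length(overlap_a, boundary_b, shoot_dir)   # 6.0
```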
As an optional embodiment, in the above step, calculating the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion includes:
calculating the distance ratio of the target distance to the total length of the overlapping portion;
determining the distance ratio as the first weight corresponding to the first video source;
and calculating the difference between 1 and the distance ratio, and determining the difference as the second weight corresponding to the second video source.
Through this embodiment, the distance ratio is determined as the first weight corresponding to the first video source, so that the weight of the first position information decreases as the target object moves away from the first video source; the difference between 1 and the distance ratio is then determined as the second weight corresponding to the second video source, so that the weight of the second position information increases correspondingly. In this way, a more accurate specific position of the target object can be obtained in the subsequent calculation for positioning it.
As an optional embodiment, in the step, determining final position information corresponding to the target object according to the weight information, the first position information, and the second position information includes:
calculating a first product of the first location information and the first weight;
calculating a second product of the second location information and the second weight;
and calculating the sum of the first product and the second product to obtain final position information corresponding to the target object.
Through this embodiment, the first position information and the second position information can be weighted and summed according to the weight information to determine the final position information corresponding to the target object, so that a more accurate specific position of the target object is calculated for positioning it.
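Putting the last two optional embodiments together, a short sketch (assumed names and values) of the distance-ratio weights and the weighted fusion:

```python
import numpy as np

def handover_weights(target_distance, overlap_total):
    """First/second weights from the distance ratio k = target_distance / overlap_total."""
    k = float(np.clip(target_distance / overlap_total, 0.0, 1.0))
    return k, 1.0 - k  # weight for the first source, weight for the second source

def fuse_positions(p_first, p_second, w_first, w_second):
    """Final position as the weighted sum of the two estimates in the unified frame."""
    return w_first * np.asarray(p_first, float) + w_second * np.asarray(p_second, float)

w_a, w_b = handover_weights(target_distance=4.0, overlap_total=6.0)  # 2/3, 1/3
p_final = fuse_positions([10.2, 5.0, 0.0], [10.6, 5.2, 0.0], w_a, w_b)
```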
As an optional embodiment, in the above step, after calculating the difference between 1 and the distance ratio and determining the difference as the second weight corresponding to the second video source, the method further includes:
inputting the first video picture into a pre-trained speed prediction neural network model and a pre-trained positioning-accuracy prediction neural network model, to obtain an output first speed prediction value and first positioning-accuracy prediction value corresponding to the first video picture; the speed prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding speed labels; the positioning-accuracy prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding positioning-accuracy labels;
calculating a first speed weight inversely proportional to the first speed prediction value, a first positioning weight proportional to the first positioning-accuracy prediction value, and a first model-related weight; the first model-related weight is proportional to a first prediction characterization value; the first prediction characterization value is the difference between the ratio of the first speed prediction value to the first positioning-accuracy prediction value and a preset ratio threshold, and characterizes the accuracy of the model predictions;
multiplying the first weight by the first speed weight, the first positioning weight and the first model-related weight, to obtain a new first weight;
inputting the second video picture into the speed prediction neural network model and the positioning-accuracy prediction neural network model, to obtain an output second speed prediction value and second positioning-accuracy prediction value corresponding to the second video picture;
calculating a second speed weight inversely proportional to the second speed prediction value, a second positioning weight proportional to the second positioning-accuracy prediction value, and a second model-related weight; the second model-related weight is proportional to a second prediction characterization value; the second prediction characterization value is the difference between the ratio of the second speed prediction value to the second positioning-accuracy prediction value and the ratio threshold, and characterizes the accuracy of the model predictions;
and multiplying the second weight by the second speed weight, the second positioning weight and the second model-related weight, to obtain a new second weight.
Optionally, the prediction neural network models may be neural networks with an RNN structure.
Optionally, the prediction characterization value is used to characterize whether the ratio of the speed prediction value to the positioning-accuracy prediction value conforms to a preset rule. Because the positioning-accuracy prediction value is generally inversely related to the speed prediction value, the relationship between the two reflects the prediction accuracy of the neural network models; once the ratio becomes unbalanced, the network models' predictions are suspect, and the corresponding weight should be reduced promptly.
Through this optional embodiment, the speed prediction neural network model and the positioning-accuracy prediction neural network model are added to update the positioning weights, so that a more accurate specific position of the target object can be calculated for positioning it.
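The text fixes only the monotonic relationships of the speed weight, positioning weight and model-related weight, not their functional forms. The sketch below therefore assumes simple reciprocal/linear forms and, consistent with the preceding paragraph, interprets the prediction characterization value as flagging an unbalanced speed/accuracy ratio, shrinking the model-related weight as the ratio deviates from the preset threshold. All names, constants and the final renormalization are assumptions.

```python
def refine_weight(base_weight, speed_pred, acc_pred, ratio_threshold=1.0):
    """Refine a distance-ratio weight with the two networks' predictions (assumed forms)."""
    speed_w = 1.0 / (1.0 + speed_pred)   # inversely proportional to predicted speed
    pos_w = acc_pred                     # proportional to predicted positioning accuracy
    deviation = abs(speed_pred / acc_pred - ratio_threshold)  # prediction characterization
    model_w = 1.0 / (1.0 + deviation)    # shrink when the ratio is unbalanced
    return base_weight * speed_w * pos_w * model_w

# speed_pred / acc_pred would come from the two RNN-based prediction models
w_a_new = refine_weight(base_weight=2 / 3, speed_pred=0.8, acc_pred=0.9)
w_b_new = refine_weight(base_weight=1 / 3, speed_pred=0.5, acc_pred=0.95)
# Optionally renormalize so the two weights again sum to 1 (assumption):
s = w_a_new + w_b_new
w_a_new, w_b_new = w_a_new / s, w_b_new / s
```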
In a specific embodiment, in the cross-video-source target handover positioning stage, the position coordinates obtained by the two cameras inevitably disagree: converting the pixel coordinates of the video target to world coordinates, including the coordinate-system unification process, introduces errors from the manually selected homonymous pixel point pairs, from the related conversions, and from positioning and tracking itself. The core issue is how to handle these two position points. Moreover, the farther the target is from the center point of a camera's view, the greater the positioning error when the target is positioned from video. For this reason, this solution adopts an adaptive overlapping view-line and prediction-model weighting method. As shown in fig. 4, where o_a and o_b are the view center points of cameras A and B, the ratio k of AB occupied by the projection X' of the target X onto the view line of the overlapping view View_AB is used as the degree of trust in the positioning of target X by monitor A and monitor B. As target X gradually leaves the view of monitor A, the trust in monitor A's positioning of X falls to 0, so that positioning transitions smoothly to monitor B, completing the whole target handover process.
The principle equation of the overlapping view-line and prediction-model weighting method is as follows:

$$P_X = K_2\,\frac{|X'B|}{|AB|}\,P_A + K_3\left(1 - \frac{|X'B|}{|AB|}\right)P_B$$

where $|X'B|$ and $|AB|$ represent the lengths of the line segments $X'B$ and $AB$ respectively, $P_A$ and $P_B$ represent the position information obtained at that moment by monitor A and monitor B from tracking and positioning the target X, $K_2$ and $K_3$ are the product weights formed from the speed prediction value, the positioning-accuracy prediction value and the prediction characterization value predicted by the prediction models for monitor A and monitor B, and $P_X$ represents the final position of the target X.
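As a purely illustrative numerical check (all values assumed), take $|X'B|/|AB| = 0.25$, $K_2 = K_3 = 1$, $P_A = (10.2,\ 5.0)$ and $P_B = (10.6,\ 5.2)$:

$$P_X = 0.25 \times (10.2,\ 5.0) + 0.75 \times (10.6,\ 5.2) = (10.5,\ 5.15)$$

The fused point lies closer to monitor B's estimate, as expected for a target that has already traversed most of the overlapping view.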
Example 2
Referring to fig. 2, fig. 2 is a schematic structural diagram of a cross-video-source target handover positioning system according to an embodiment of the present invention. The system described in fig. 2 may be applied to a corresponding data processing device, data processing terminal or data processing server, where the server may be a local server or a cloud server; the embodiment of the present invention is not limited in this respect. As shown in fig. 2, the system may include:
an acquisition module 201, configured to acquire a first video picture including a target object from a first video source and a second video picture including the target object from a second video source; the first video picture and the second video picture have an overlapping portion;
a positioning module 202, configured to perform positioning calculation on the target object according to the first video picture and the second video picture respectively, to obtain first position information and second position information;
a calculating module 203, configured to calculate the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture, and determine weight information corresponding to the target object according to the proportion;
a determining module 204, configured to determine final position information corresponding to the target object according to the weight information, the first position information and the second position information.
As an optional embodiment, the specific manner in which the positioning module 202 performs positioning calculation on the target object according to the first video picture and the second video picture to obtain the first position information and the second position information includes:
unifying the coordinate systems of the first video source and the second video source based on a homonymous-point coordinate-system conversion algorithm, to obtain coordinate conversion relations corresponding to the first video source and the second video source respectively;
performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the first video source and the first video picture, to obtain the first position information;
and performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the second video source and the second video picture, to obtain the second position information.
As an optional embodiment, the specific manner in which the calculating module 203 calculates the proportion of the overlapping portion occupied by the target object according to the first video picture and the second video picture and determines the weight information corresponding to the target object according to the proportion includes:
calculating a target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture, the boundary corresponding to a view boundary of the first video source or a view boundary of the second video source;
and calculating the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion.
As an optional embodiment, the target object gradually moves away from the first video source and approaches the second video source during its movement; the specific manner in which the calculating module 203 calculates the target distance of the target object relative to one boundary of the overlapping portion according to the first video picture and the second video picture includes:
determining, according to the first video picture and the second video picture, the object position of the target object and the target boundary in the overlapping portion corresponding to the view boundary of the second video source;
calculating the projection distance between the object position and the target boundary relative to the shooting direction of the video sources, to obtain the target distance;
and calculating the projection length of the overlapping portion relative to the shooting direction of the video sources, to obtain the total length of the overlapping portion.
As an optional embodiment, the specific manner in which the calculating module 203 calculates the weight information corresponding to the target object according to the ratio of the target distance to the total length of the overlapping portion includes:
calculating the distance ratio of the target distance to the total length of the overlapping portion;
determining the distance ratio as a first weight corresponding to the first video source;
and calculating the difference between 1 and the distance ratio, and determining the difference as a second weight corresponding to the second video source.
As an optional embodiment, after calculating the difference between 1 and the distance ratio and determining the difference as the second weight corresponding to the second video source, the calculating module 203 further performs the following steps:
inputting the first video picture into a pre-trained speed prediction neural network model and a pre-trained positioning-accuracy prediction neural network model, to obtain an output first speed prediction value and first positioning-accuracy prediction value corresponding to the first video picture; the speed prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding speed labels; the positioning-accuracy prediction neural network model is trained on a training data set comprising a plurality of training video pictures and corresponding positioning-accuracy labels;
calculating a first speed weight inversely proportional to the first speed prediction value, a first positioning weight proportional to the first positioning-accuracy prediction value, and a first model-related weight; the first model-related weight is proportional to a first prediction characterization value; the first prediction characterization value is the difference between the ratio of the first speed prediction value to the first positioning-accuracy prediction value and a preset ratio threshold, and characterizes the accuracy of the model predictions;
multiplying the first weight by the first speed weight, the first positioning weight and the first model-related weight, to obtain a new first weight;
inputting the second video picture into the speed prediction neural network model and the positioning-accuracy prediction neural network model, to obtain an output second speed prediction value and second positioning-accuracy prediction value corresponding to the second video picture;
calculating a second speed weight inversely proportional to the second speed prediction value, a second positioning weight proportional to the second positioning-accuracy prediction value, and a second model-related weight; the second model-related weight is proportional to a second prediction characterization value; the second prediction characterization value is the difference between the ratio of the second speed prediction value to the second positioning-accuracy prediction value and the ratio threshold, and characterizes the accuracy of the model predictions;
and multiplying the second weight by the second speed weight, the second positioning weight and the second model-related weight, to obtain a new second weight.
As an optional embodiment, the specific manner in which the determining module 204 determines the final position information corresponding to the target object according to the weight information, the first position information and the second position information includes:
Calculating a first product of the first location information and the first weight;
calculating a second product of the second location information and the second weight;
and calculating the sum of the first product and the second product to obtain final position information corresponding to the target object.
For the details and technical effects of the modules in this embodiment, reference may be made to the description in Example 1; they are not repeated here.
Example 3
Referring to fig. 3, fig. 3 is a schematic structural diagram of another cross-video source target handover positioning system according to an embodiment of the present invention. As shown in fig. 3, the system may include:
a memory 301 storing executable program code;
a processor 302 coupled with the memory 301;
the processor 302 invokes the executable program code stored in the memory 301 to perform some or all of the steps in the cross-video source object handover positioning method disclosed in the embodiment of the present invention.
Example 4
The embodiment of the invention discloses a computer storage medium storing computer instructions which, when invoked, are used to perform part or all of the steps in the cross-video-source target handover positioning method disclosed in the embodiments of the present invention.
The system embodiments described above are merely illustrative, in which the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied essentially or in part in the form of a software product that may be stored in a computer-readable storage medium including Read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), programmable Read-Only Memory (Programmable Read-Only Memory, PROM), erasable programmable Read-Only Memory (Erasable Programmable Read Only Memory, EPROM), one-time programmable Read-Only Memory (OTPROM), electrically erasable programmable Read-Only Memory (EEPROM), compact disc Read-Only Memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc Memory, magnetic disc Memory, tape Memory, or any other medium that can be used for computer-readable carrying or storing data.
Finally, it should be noted that the cross-video-source target handover positioning method and system disclosed in the embodiments of the present invention are described only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced equivalently; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (6)

1. A cross-video-source target handover positioning method, the method comprising:
acquiring a first video picture comprising a target object from a first video source and a second video picture comprising the target object from a second video source; the first video picture and the second video picture have an overlapping part; the target object gradually moves away from the first video source and approaches the second video source during its movement;
performing positioning calculation on the target object according to the first video picture and the second video picture respectively, to obtain first position information and second position information;
determining, according to the first video picture and the second video picture, an object position of the target object and a target boundary corresponding to a view boundary of the second video source in the overlapping part;
calculating the projection distance between the object position and the target boundary relative to the shooting direction of the video source, to obtain a target distance;
calculating the projection length of the overlapping part relative to the shooting direction of the video source, to obtain a total length of the overlapping part;
calculating a distance ratio value of the target distance to the total length of the overlapping part;
determining the distance ratio value as a first weight corresponding to the first video source;
calculating the difference between 1 and the distance ratio value, and determining the difference as a second weight corresponding to the second video source;
calculating a first product of the first position information and the first weight;
calculating a second product of the second position information and the second weight;
and calculating the sum of the first product and the second product to obtain final position information corresponding to the target object.
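By way of a non-limiting illustration (outside the claims), the weighting pipeline of claim 1 may be sketched in Python under simplifying assumptions: positions are 2-D ground-plane coordinates, and the shooting direction is supplied as a vector, which the claim leaves unspecified.

    import numpy as np

    def handover_fuse(obj_pos, boundary_pt, overlap_start, overlap_end,
                      shoot_dir, pos_from_cam1, pos_from_cam2):
        """Weighted handover positioning via the distance ratio of claim 1."""
        d = np.asarray(shoot_dir, dtype=float)
        d /= np.linalg.norm(d)                    # unit vector along the shooting direction
        # Projection distance from the object position to the target boundary.
        target_distance = abs(np.dot(np.asarray(obj_pos, float) - np.asarray(boundary_pt, float), d))
        # Projection length of the whole overlapping part.
        overlap_total = abs(np.dot(np.asarray(overlap_end, float) - np.asarray(overlap_start, float), d))
        w1 = target_distance / overlap_total      # first weight (distance ratio value)
        w2 = 1.0 - w1                             # second weight
        return w1 * np.asarray(pos_from_cam1, float) + w2 * np.asarray(pos_from_cam2, float)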
2. The cross-video-source target handover positioning method according to claim 1, wherein the performing positioning calculation on the target object according to the first video picture and the second video picture respectively, to obtain first position information and second position information, comprises:
unifying the coordinate systems of the first video source and the second video source based on a coordinate system conversion algorithm using homonymous point information, to obtain coordinate conversion relations corresponding to the first video source and the second video source respectively;
performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the first video source and the first video picture, to obtain the first position information;
and performing positioning calculation on the target object according to the coordinate conversion relation corresponding to the second video source and the second video picture, to obtain the second position information.
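One plausible realization of the homonymous-point coordinate unification (an illustrative assumption; claim 2 does not fix the algebraic form of the conversion relation) fits a planar homography between matched points with OpenCV and maps detections from one source into the unified frame:

    import cv2
    import numpy as np

    # Homonymous points: pixel coordinates in the second video source paired with
    # their coordinates in the unified (first-source) frame; values are made up.
    pts_src = np.array([[100, 200], [400, 210], [390, 500], [110, 480]], dtype=np.float32)
    pts_dst = np.array([[10.0, 5.0], [22.0, 5.2], [21.5, 14.8], [10.4, 14.5]], dtype=np.float32)

    H, _ = cv2.findHomography(pts_src, pts_dst)  # 3x3 coordinate conversion relation

    def to_unified(pt, H):
        """Map a pixel from the second source into the unified coordinate system."""
        v = H @ np.array([pt[0], pt[1], 1.0])
        return v[:2] / v[2]

    print(to_unified((250, 350), H))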
3. The cross-video-source target handover positioning method according to claim 1, wherein after the calculating the difference between 1 and the distance ratio value and determining the difference as the second weight corresponding to the second video source, the method further comprises:
inputting the first video picture into a pre-trained speed prediction neural network model and a pre-trained positioning accuracy prediction neural network model, to obtain an output first speed prediction value and first positioning accuracy prediction value corresponding to the first video picture; the speed prediction neural network model is obtained by training on a training data set comprising a plurality of training video pictures and corresponding speed labels; the positioning accuracy prediction neural network model is obtained by training on a training data set comprising a plurality of training video pictures and corresponding positioning accuracy labels;
calculating a first speed weight inversely proportional to the first speed prediction value, calculating a first positioning weight proportional to the first positioning accuracy prediction value, and calculating a first model-related weight; the first model-related weight is proportional to a first prediction characterization value; the first prediction characterization value is the difference between the ratio of the first speed prediction value to the first positioning accuracy prediction value and a preset ratio threshold, and characterizes how accurate the model prediction is;
calculating the product of the first speed weight, the first positioning weight, the first model-related weight and the first weight, to obtain a new first weight;
inputting the second video picture into the speed prediction neural network model and the positioning accuracy prediction neural network model, to obtain an output second speed prediction value and second positioning accuracy prediction value corresponding to the second video picture;
calculating a second speed weight inversely proportional to the second speed prediction value, calculating a second positioning weight proportional to the second positioning accuracy prediction value, and calculating a second model-related weight; the second model-related weight is proportional to a second prediction characterization value; the second prediction characterization value is the difference between the ratio of the second speed prediction value to the second positioning accuracy prediction value and the ratio threshold, and characterizes how accurate the model prediction is;
and calculating the product of the second speed weight, the second positioning weight, the second model-related weight and the second weight, to obtain a new second weight.
4. A cross-video-source target handover positioning system, the system comprising:
an acquisition module, configured to acquire a first video picture comprising a target object from a first video source and a second video picture comprising the target object from a second video source; the first video picture and the second video picture have an overlapping part; the target object gradually moves away from the first video source and approaches the second video source during its movement;
the positioning module is used for performing positioning calculation on the target object according to the first video picture and the second video picture respectively to obtain first position information and second position information;
the calculating module is configured to calculate, according to the first video picture and the second video picture, the proportion of the overlapping part occupied by the target object, and determine, according to the proportion, weight information corresponding to the target object, specifically including:
determining, according to the first video picture and the second video picture, an object position of the target object and a target boundary corresponding to a view boundary of the second video source in the overlapping part;
calculating the projection distance between the object position and the target boundary relative to the shooting direction of the video source, to obtain a target distance;
calculating the projection length of the overlapping part relative to the shooting direction of the video source, to obtain a total length of the overlapping part;
calculating a distance ratio value of the target distance to the total length of the overlapping part;
determining the distance ratio value as a first weight corresponding to the first video source;
calculating the difference between 1 and the distance ratio value, and determining the difference as a second weight corresponding to the second video source;
the determining module is configured to determine final position information corresponding to the target object according to the weight information, the first position information and the second position information, specifically including:
calculating a first product of the first position information and the first weight;
calculating a second product of the second position information and the second weight;
and calculating the sum of the first product and the second product to obtain final position information corresponding to the target object.
5. A cross-video-source target handover positioning system, the system comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the cross-video-source target handover positioning method according to any one of claims 1-3.
6. A computer storage medium, wherein the computer storage medium stores computer instructions which, when invoked, are used to perform the cross-video-source target handover positioning method according to any one of claims 1-3.
CN202311119686.1A 2023-09-01 2023-09-01 Target handover positioning method and system crossing video sources Active CN116843727B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311119686.1A CN116843727B (en) 2023-09-01 2023-09-01 Target handover positioning method and system crossing video sources

Publications (2)

Publication Number Publication Date
CN116843727A (en) 2023-10-03
CN116843727B (en) 2023-11-24

Family

ID=88167456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311119686.1A Active CN116843727B (en) 2023-09-01 2023-09-01 Target handover positioning method and system crossing video sources

Country Status (1)

Country Link
CN (1) CN116843727B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019194857A (en) * 2018-05-04 2019-11-07 キヤノン株式会社 Object tracking method and device
CN113590874A (en) * 2021-09-28 2021-11-02 山东力聚机器人科技股份有限公司 Video positioning method and device, and model training method and device
CN113706555A (en) * 2021-08-12 2021-11-26 北京达佳互联信息技术有限公司 Video frame processing method and device, electronic equipment and storage medium
CN116125462A (en) * 2023-02-17 2023-05-16 南京理工大学 Maneuvering target tracking method under pure angle measurement
CN116523962A (en) * 2023-04-20 2023-08-01 北京百度网讯科技有限公司 Visual tracking method, device, system, equipment and medium for target object
CN116580107A (en) * 2023-05-08 2023-08-11 北京理工大学 Cross-view multi-target real-time track tracking method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8913791B2 (en) * 2013-03-28 2014-12-16 International Business Machines Corporation Automatically determining field of view overlap among multiple cameras

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Robust Visual Tracking via Semiadaptive Weighted Convolutional Features; Haijun Wang et al.; IEEE Signal Processing Letters, vol. 25, no. 5, pp. 670-674 *
Research on Panoramic Perception Technology for Ship Navigation Environments Based on Multiple Cameras; Liu Lian; China Masters' Theses Full-text Database, Engineering Science and Technology II, no. 01, pp. C031-809 *
Research on Image Geolocation Based on Cross-View Matching; Sun Bin; China Masters' Theses Full-text Database, Basic Sciences, no. 01, pp. A008-191 *

Similar Documents

Publication Publication Date Title
CN112444242B (en) Pose optimization method and device
CN110146099B (en) Synchronous positioning and map construction method based on deep learning
CN113140011B (en) Infrared thermal imaging monocular vision distance measurement method and related components
US11315264B2 (en) Laser sensor-based map generation
CN109559371B (en) Method and device for three-dimensional reconstruction
US20210044787A1 (en) Three-dimensional reconstruction method, three-dimensional reconstruction device, and computer
US11615548B2 (en) Method and system for distance measurement based on binocular camera, device and computer-readable storage medium
CN105706112A (en) Method for camera motion estimation and correction
KR102450931B1 (en) Image registration method and associated model training method, apparatus, apparatus
US20040257452A1 (en) Recursive least squares approach to calculate motion parameters for a moving camera
CN112927279A (en) Image depth information generation method, device and storage medium
JP2021105887A (en) Three-dimensional pose obtaining method and device
CN113643342B (en) Image processing method and device, electronic equipment and storage medium
JP2961264B1 (en) Three-dimensional object model generation method and computer-readable recording medium recording three-dimensional object model generation program
CN112541938A (en) Pedestrian speed measuring method, system, medium and computing device
CN113706579A (en) Prawn multi-target tracking system and method based on industrial culture
EP4317910A1 (en) Computer program, model generation method, estimation method and estimation device
CN112790758A (en) Human motion measuring method and system based on computer vision and electronic equipment
CN113168716A (en) Object resolving and point-winding flying method and device
CN114898355A (en) Method and system for self-supervised learning of body-to-body movements for autonomous driving
CN116843727B (en) Target handover positioning method and system crossing video sources
CN114266823A (en) Monocular SLAM method combining SuperPoint network characteristic extraction
CN111553954B (en) Online luminosity calibration method based on direct method monocular SLAM
CN117132649A (en) Ship video positioning method and device for artificial intelligent Beidou satellite navigation fusion
CN111449684A (en) Method and system for quickly acquiring cardiac ultrasound standard scanning section

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant