CN116993785B - Target object visual tracking method and device, electronic equipment and storage medium


Info

Publication number
CN116993785B
Authority
CN
China
Prior art keywords
target object
image
target
image block
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311115538.2A
Other languages
Chinese (zh)
Other versions
CN116993785A (en)
Inventor
王璠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East Joe Technology Co ltd
Original Assignee
East Joe Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East Joe Technology Co ltd filed Critical East Joe Technology Co ltd
Priority to CN202311115538.2A priority Critical patent/CN116993785B/en
Publication of CN116993785A publication Critical patent/CN116993785A/en
Application granted granted Critical
Publication of CN116993785B publication Critical patent/CN116993785B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of construction safety supervision, and in particular to a target object visual tracking method and device, an electronic device and a storage medium. The method first acquires a plurality of first image blocks; it then performs an edge extraction operation on each of the first image blocks to obtain a plurality of corresponding first feature matrices; finally, it determines the position of the target object according to a plurality of first similarities and the probabilities that the target object appears in the first image blocks. In the embodiment of the invention, similarity is judged between an edge feature matrix constructed from the edge features of each image block and the edge feature matrix of an image containing the target object, and the position of the target object is determined according to the probability that the target object appears in each image block; computing the edge feature matrices requires less computation than existing algorithms. Because the probability of each image block is considered, the target object can still be tracked when it is partially occluded, which improves the tracking recognition rate.

Description

Target object visual tracking method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of construction safety supervision, in particular to a target object visual tracking method, a target object visual tracking device, electronic equipment and a storage medium.
Background
With the rapid development of the construction industry and the continuous growth of project scale, the original construction safety supervision methods show various shortcomings. For example, constructors unfamiliar with the site environment may work outside their permitted range; during construction, obstacles may block the line of sight, and a constructor who cannot tell directions may become lost. Such factors threaten the life and property of personnel at all times. In addition, a construction site has a complex environment and many kinds of tools; items are often lost on site and are difficult to find, which increases project risk.
To ensure construction safety, various methods have been devised, such as installing fences around the construction site and establishing access control systems. However, fences and access control alone cannot effectively solve safety problems inside the site.
Another means of safety supervision is to install video surveillance. If, after video surveillance is installed, potential safety hazards on the site are detected by traditional visual processing algorithms, the computation load of those algorithms is large and their real-time performance is poor when many target objects are monitored.
Based on the above, a target object visual tracking method needs to be developed and designed.
Disclosure of Invention
The embodiment of the invention provides a target object visual tracking method, a target object visual tracking device, electronic equipment and a storage medium, which are used for solving the problem of large calculation amount of a visual processing algorithm in the prior art.
In a first aspect, an embodiment of the present invention provides a target object visual tracking method, including:
acquiring a plurality of first image blocks, wherein the first image blocks are acquired based on image division of a target area, and the target area represents an area where a target object appears;
respectively carrying out edge extraction operation on the plurality of first image blocks to obtain a plurality of first feature matrixes corresponding to the plurality of first image blocks;
and determining the position of the target object according to a plurality of first similarities and probabilities of the target object appearing in the plurality of first image blocks, wherein the first similarities are determined according to a first feature matrix and a feature matrix of a first target image block, the first similarities represent the similarity degree of the first image block and the first target image block, and the first target image block contains the image of the target object.
In one possible implementation, the acquiring a plurality of first image blocks includes:
acquiring a first position, a second position and a third position, wherein the first position, the second position and the third position are determined based on the first target image block, a second target image block and a third target image block respectively, the second target image block and the third target image block contain images of the target object, and the time nodes corresponding to the first target image block, the second target image block and the third target image block are sequential;
determining a moving direction, a direction change rate, a moving speed and a moving speed change rate of the target object according to the first position, the second position and the third position;
determining a target moving direction and a moving radius according to the moving direction, the direction change rate, the moving speed and the moving speed change rate;
determining the target area according to the first position, the target moving direction and the moving radius, wherein the target area is a rectangle taking the first position as a center, the long axis or the short axis of the target area is parallel to the target moving direction, and a planning circle is contained in the target area, and is a circle taking the first position as a center and the moving radius as a radius;
and dividing the image of the target area into a plurality of image blocks serving as the plurality of first image blocks according to the major axis and the minor axis of the target area.
In one possible implementation manner, the obtaining, by the edge extraction operation, the plurality of first feature matrices corresponding to the plurality of first image blocks includes:
obtaining a target row number and a target column number, wherein the target row number and the target column number are respectively the number of rows and the number of columns of the feature matrix of the first target image block;
for each of the plurality of first image blocks, performing the following operations, respectively:
the image block is subjected to color removal to obtain a gray level image;
extracting a horizontal edge feature matrix and a vertical edge feature matrix from the gray level map by adopting a horizontal edge detection template and a vertical edge detection template;
constructing a fusion matrix representing edge characteristics in an edge synthesis and breakpoint connection mode according to the horizontal edge characteristic matrix and the vertical edge characteristic matrix;
deleting the edge blank rows and the edge blank columns of the fusion matrix, wherein the edge blank rows and the edge blank columns are rows and columns which are positioned at the edge of the fusion matrix and do not comprise edge features;
and performing a pooling operation on the fusion matrix after the deletion operation according to the target row number and the target column number, and taking the fusion matrix after the pooling operation as a first feature matrix.
In one possible implementation manner, the extracting the horizontal edge feature matrix and the vertical edge feature matrix from the gray scale map by using a horizontal edge detection template and a vertical edge detection template includes:
acquiring a positioning instruction, the horizontal edge detection template and the vertical edge detection template;
according to the positioning instruction, a first data block of the same size as the horizontal edge detection template and the vertical edge detection template is taken out from the gray level map;
extracting the horizontal edge characteristic and the vertical edge characteristic of the first data block according to a first formula, the horizontal edge detection template and the vertical edge detection template, wherein the first formula is as follows:
$$E=\sum_{i=1}^{n}\sum_{j=1}^{m}T_{ij}\,G_{ij}$$
where $E$ is the horizontal edge feature or the vertical edge feature, $T_{ij}$ is the element in row $i$ and column $j$ of the horizontal edge detection template or the vertical edge detection template, $G_{ij}$ is the element in row $i$ and column $j$ of the first data block, $n$ is the number of rows of the first data block, and $m$ is the number of columns of the first data block;
according to the positioning instruction, adding the horizontal edge feature and the vertical edge feature into the horizontal edge feature matrix and the vertical edge feature matrix respectively;
and if the positioning instruction has not reached the end of the gray level map, shifting the positioning instruction by a preset shift distance, and jumping to the step of taking out from the gray level map a first data block of the same size as the horizontal edge detection template and the vertical edge detection template according to the positioning instruction.
In one possible implementation manner, the constructing a fusion matrix for representing edge features according to the horizontal edge feature matrix and the vertical edge feature matrix through edge synthesis and breakpoint connection includes:
acquiring an edge detection threshold value and a neighborhood radius;
calculating the average value of the corresponding elements of the horizontal edge feature matrix and the vertical edge feature matrix, and adding the average value into a fusion matrix;
resetting elements smaller than the edge detection threshold value in the fusion matrix;
obtaining a breakpoint in the fusion matrix, wherein the breakpoint is a point representing an open loop of an edge;
For each breakpoint in the fusion matrix, performing the steps of:
obtaining a plurality of neighborhood points in the breakpoint neighborhood according to the neighborhood radius;
judging the continuity of the plurality of neighborhood points according to a second formula, wherein the second formula is as follows:
$$\left|E_h^{(b)}-E_h^{(p)}\right|\le T_1\ \text{and}\ \left|E_v^{(b)}-E_v^{(p)}\right|\le T_2$$
where $T_1$ is the first breakpoint threshold, $T_2$ is the second breakpoint threshold, $E_h^{(b)}$ is the value of the breakpoint in the horizontal edge feature matrix, $E_v^{(b)}$ is the value of the breakpoint in the vertical edge feature matrix, $E_h^{(p)}$ is the value of the neighborhood point in the horizontal edge feature matrix, and $E_v^{(p)}$ is the value of the neighborhood point in the vertical edge feature matrix;
and if a point meeting the second formula exists among the plurality of neighborhood points, taking the point meeting the second formula as a continuation point, replacing the value of the continuation point in the fusion matrix with its continuation average value, taking the continuation point as a new breakpoint, and jumping to the step of obtaining a plurality of neighborhood points in the breakpoint neighborhood according to the neighborhood radius, wherein the continuation average value is the average of the values of the continuation point in the vertical edge feature matrix and in the horizontal edge feature matrix.
In one possible implementation manner, the determining the location of the target object according to the first similarities and the probabilities of the target object appearing in the first image blocks includes:
If one first similarity higher than a first similarity threshold exists in the plurality of first similarities, the first similarity higher than the first similarity threshold is taken as a target similarity, and the position of the target object is determined according to a first image block corresponding to the target similarity;
otherwise, determining a plurality of labels according to a second similarity threshold and the first similarities, determining a first image block where the target object appears according to the labels, a plurality of position probabilities and a plurality of condition image similarity probabilities, and determining the position of the target object according to the first image block where the target object appears, wherein the labels represent whether the first image block is similar to the first target image block, the position probabilities are probabilities that the target object appears in the first image block, the condition image similarity probabilities are probabilities that images are similar when the target object appears in the first image block, and the second similarity threshold is smaller than the first similarity threshold.
In one possible implementation manner, the determining, according to the plurality of labels, the plurality of location probabilities, and the plurality of conditional image similarity probabilities, the first image block in which the target object appears includes:
Determining an image similarity probability according to a third formula, the plurality of position probabilities and the plurality of conditional image similarity probabilities, wherein the image similarity probability is a probability that similar image blocks exist in the plurality of first image blocks, and the third formula is:
$$P(S)=\frac{1}{n}\sum_{i=1}^{n}\Bigl[P(S\mid A_i)\,P(A_i)+P(S\mid\bar A_i)\,P(\bar A_i)\Bigr]$$
where $P(S)$ is the image similarity probability, $P(S\mid A_i)$ is the probability that the first image blocks are similar when the target object appears in the $i$-th first image block, $P(A_i)$ is the prior probability that the target object appears in the $i$-th first image block, $P(S\mid\bar A_i)$ is the probability that the first image blocks are similar when the target object does not appear in the $i$-th first image block, $P(\bar A_i)$ is the prior probability that the target object does not appear in the $i$-th first image block, and $n$ is the number of first image blocks;
determining a plurality of conditional position probabilities according to a fourth formula, the plurality of labels, the image similarity probability, the plurality of position probabilities, and the plurality of conditional image similarity probabilities, wherein the conditional position probabilities characterize a probability that the target object appears in the first image block when images are similar or a probability that the target object appears in the first image block when images are dissimilar, the fourth formula being:
$$P_c^{(i)}=l_i\,\frac{P(S\mid A_i)\,P(A_i)}{P(S)}+\bigl(1-l_i\bigr)\,\frac{P(\bar S\mid A_i)\,P(A_i)}{P(\bar S)}$$
where $P_c^{(i)}$ is the conditional position probability, the first quotient gives $P(A_i\mid S)$, the probability that the target object appears in the $i$-th first image block when the images are similar, $l_i$ is the label, the second quotient gives $P(A_i\mid\bar S)$, the probability that the target object appears in the $i$-th first image block when the images are dissimilar, $P(\bar S\mid A_i)$ is the probability that the first image blocks are dissimilar when the target object appears in the $i$-th first image block, and $P(\bar S)$ is the image dissimilarity probability;
determining a posterior probability that the target object appears in each first image block according to a fifth formula and the plurality of conditional position probabilities, wherein the fifth formula is:
$$P(A_i\mid O)=\frac{P_c^{(i)}}{\sum_{j=1}^{n}P_c^{(j)}}$$
where $P(A_i\mid O)$ is the posterior probability that the target object appears in the $i$-th first image block;
and taking the target first image block as a first image block of the target object, wherein the posterior probability of the target first image block is maximum and the posterior probability of the target first image block is greater than a posterior probability threshold.
In a second aspect, an embodiment of the present invention provides a target object visual tracking device, configured to implement the target object visual tracking method according to the first aspect or any one of the possible implementation manners of the first aspect, where the target object visual tracking device includes:
the image block acquisition module is used for acquiring a plurality of first image blocks, wherein the first image blocks are acquired based on image division of a target area, and the target area represents an area where a target object appears;
The feature matrix extraction module is used for obtaining a plurality of first feature matrixes corresponding to the plurality of first image blocks through edge extraction operation on the plurality of first image blocks respectively;
the method comprises the steps of,
and the target object tracking module is used for determining the position of the target object according to a plurality of first similarities and the probability of the target object appearing in the plurality of first image blocks, wherein the first similarities are determined according to a first feature matrix and the feature matrix of the first target image block, the first similarities represent the similarity degree of the first image block and the first target image block, and the first target image block contains the image of the target object.
In a third aspect, an embodiment of the present invention provides an electronic device, comprising a memory and a processor, the memory storing a computer program executable on the processor, the processor implementing the steps of the method according to the first aspect or any one of the possible implementations of the first aspect when the computer program is executed.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method as described above in the first aspect or any one of the possible implementations of the first aspect.
Compared with the prior art, the embodiment of the invention has the beneficial effects that:
the embodiment of the invention discloses a target object visual tracking method, which comprises the steps of firstly, acquiring a plurality of first image blocks, wherein the first image blocks are acquired based on image division of a target area, and the target area represents an area where a target object appears; then, a plurality of first feature matrixes corresponding to the plurality of first image blocks are obtained through edge extraction operation on the plurality of first image blocks respectively; and finally, determining the position of the target object according to a plurality of first similarities and the probability of the target object appearing in the plurality of first image blocks, wherein the first similarities are determined according to a first feature matrix and the feature matrix of the first target image block, the first similarities represent the similarity degree of the first image block and the first target image block, and the first target image block contains the image of the target object. According to the edge feature matrix constructed according to the edge features of the image block, the similarity is judged according to the edge feature matrix and the edge feature matrix of the image containing the target object, the position of the target object is determined according to the probability that the target object appears in the image block, the calculated amount is smaller than that of the existing algorithm, and the probability that the target object appears in the image block is considered, so that the target object can still be tracked when the target object is partially covered, and the tracking recognition rate is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for visual tracking of a target object provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of image segmentation provided by an embodiment of the present invention;
FIG. 3 is a functional block diagram of a target object visual tracking device provided by an embodiment of the present invention;
fig. 4 is a functional block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the following description will be made with reference to the accompanying drawings.
The following describes in detail the embodiments of the present invention, and the present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation procedure are given, but the protection scope of the present invention is not limited to the following embodiments.
Fig. 1 is a flowchart of a target object visual tracking method according to an embodiment of the present invention.
As shown in fig. 1, a flowchart of an implementation of the target object visual tracking method according to an embodiment of the present invention is shown, and the details are as follows:
in step 101, a plurality of first image blocks are acquired, wherein the first image blocks are obtained based on an image partition of a target region, which characterizes a region where the target object appears.
In some embodiments, the step 101 includes:
acquiring a first position, a second position and a third position, wherein the first position, the second position and the third position are determined based on the first target image block, a second target image block and a third target image block respectively, the second target image block and the third target image block contain images of the target object, and the time nodes corresponding to the first target image block, the second target image block and the third target image block are sequential;
Determining a moving direction, a direction change rate, a moving speed and a moving speed change rate of the target object according to the first position, the second position and the third position;
determining a target moving direction and a moving radius according to the moving direction, the direction change rate, the moving speed and the moving speed change rate;
determining the target area according to the first position, the target moving direction and the moving radius, wherein the target area is a rectangle taking the first position as a center, the long axis or the short axis of the target area is parallel to the target moving direction, and a planning circle is contained in the target area, and is a circle taking the first position as a center and the moving radius as a radius;
and dividing the image of the target area into a plurality of image blocks serving as the plurality of first image blocks according to the major axis and the minor axis of the target area.
Illustratively, FIG. 2 shows a schematic diagram of the image segmentation. In the figure, the speed and direction of the tracked target object 201 change over time, which is reflected in the segmented trajectories 202 corresponding to a plurality of time periods; the segmented trajectories 202 therefore differ in length and form angles with one another.
From these segmented trajectories 202, the region 204 where the target object 201 will be active in the next time period can be deduced. Since the probability of the target object 201 appearing differs across positions in the region 204, this embodiment divides the region into a plurality of blocks 205 (nine blocks in the figure), so that the position of the target object 201 can be determined according to the probability of its appearing in each block 205.
When determining the position and direction of the area, this embodiment obtains three successive positions. It calculates the moving speed from the position difference and the time difference, and the moving speed change rate from the difference between two successive speeds; similarly, it calculates the moving direction from two positions and the direction change rate from the angle difference between two successive directions. From the moving speed and the moving speed change rate, the moving radius of the target object in the next period can be determined; likewise, the moving direction of the next period can be determined. A rectangular region containing the circle planned with the moving radius can then be laid out based on the moving direction and the moving radius, and the rectangular region is divided according to the moving direction to obtain a plurality of blocks.
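As a minimal sketch of this region planning step (the variable names, the linear extrapolation rule, and the use of the most recent interval as the prediction horizon are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def plan_target_area(p1, p2, p3, t1, t2, t3):
    """Plan the search rectangle from three successive positions.

    p1 is the most recent position, p3 the oldest; times t1 > t2 > t3.
    Returns the rectangle centre, the target moving direction (an
    angle) and the moving radius that the rectangle must contain.
    """
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    # Moving speeds over the two most recent intervals.
    v1 = np.linalg.norm(p1 - p2) / (t1 - t2)
    v2 = np.linalg.norm(p2 - p3) / (t2 - t3)
    speed_rate = (v1 - v2) / (t1 - t2)           # moving-speed change rate
    # Moving directions (angles) over the two intervals;
    # angle wrap-around is ignored here for brevity.
    d1 = np.arctan2(p1[1] - p2[1], p1[0] - p2[0])
    d2 = np.arctan2(p2[1] - p3[1], p2[0] - p3[0])
    dir_rate = (d1 - d2) / (t1 - t2)             # direction change rate
    dt = t1 - t2                                  # assumed prediction horizon
    # Linear extrapolation one period ahead.
    target_direction = d1 + dir_rate * dt
    moving_radius = max(v1 + speed_rate * dt, 0.0) * dt
    return p1, target_direction, moving_radius
```

The rectangle centred at the first position, aligned with the returned direction and enclosing the circle of the returned radius, is then divided into a grid of first image blocks, nine in the example of fig. 2.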
In step 102, a plurality of first feature matrices corresponding to the plurality of first image blocks are obtained through an edge extraction operation on the plurality of first image blocks, respectively.
In some embodiments, the step 102 includes:
obtaining a target row number and a target column number, wherein the target row number and the target column number are respectively the number of rows and the number of columns of the feature matrix of the first target image block;
for each of the plurality of first image blocks, performing the following operations, respectively:
the image block is subjected to color removal to obtain a gray level image;
extracting a horizontal edge feature matrix and a vertical edge feature matrix from the gray level map by adopting a horizontal edge detection template and a vertical edge detection template;
constructing a fusion matrix representing edge characteristics in an edge synthesis and breakpoint connection mode according to the horizontal edge characteristic matrix and the vertical edge characteristic matrix;
deleting the edge blank rows and the edge blank columns of the fusion matrix, wherein the edge blank rows and the edge blank columns are rows and columns which are positioned at the edge of the fusion matrix and do not comprise edge features;
and performing a pooling operation on the fusion matrix after the deletion operation according to the target row number and the target column number, and taking the fusion matrix after the pooling operation as a first feature matrix.
In some embodiments, the extracting the horizontal edge feature matrix and the vertical edge feature matrix from the gray scale map using the horizontal edge detection template and the vertical edge detection template includes:
acquiring a positioning instruction, the horizontal edge detection template and the vertical edge detection template;
according to the positioning instruction, a first data block of the same size as the horizontal edge detection template and the vertical edge detection template is taken out from the gray level map;
extracting the horizontal edge characteristic and the vertical edge characteristic of the first data block according to a first formula, the horizontal edge detection template and the vertical edge detection template, wherein the first formula is as follows:
$$E=\sum_{i=1}^{n}\sum_{j=1}^{m}T_{ij}\,G_{ij}$$
where $E$ is the horizontal edge feature or the vertical edge feature, $T_{ij}$ is the element in row $i$ and column $j$ of the horizontal edge detection template or the vertical edge detection template, $G_{ij}$ is the element in row $i$ and column $j$ of the first data block, $n$ is the number of rows of the first data block, and $m$ is the number of columns of the first data block;
according to the positioning instruction, adding the horizontal edge feature and the vertical edge feature into the horizontal edge feature matrix and the vertical edge feature matrix respectively;
and if the positioning instruction has not reached the end of the gray level map, shifting the positioning instruction by a preset shift distance, and jumping to the step of taking out from the gray level map a first data block of the same size as the horizontal edge detection template and the vertical edge detection template according to the positioning instruction.
In some embodiments, the constructing a fusion matrix characterizing edge features according to the horizontal edge feature matrix and the vertical edge feature matrix by means of edge synthesis and breakpoint connection includes:
acquiring an edge detection threshold value and a neighborhood radius;
calculating the average value of the corresponding elements of the horizontal edge feature matrix and the vertical edge feature matrix, and adding the average value into a fusion matrix;
resetting elements smaller than the edge detection threshold value in the fusion matrix;
obtaining a breakpoint in the fusion matrix, wherein the breakpoint is a point representing an open loop of an edge;
for each breakpoint in the fusion matrix, performing the steps of:
obtaining a plurality of neighborhood points in the breakpoint neighborhood according to the neighborhood radius;
judging the continuity of the plurality of neighborhood points according to a second formula, wherein the second formula is as follows:
$$\left|E_h^{(b)}-E_h^{(p)}\right|\le T_1\ \text{and}\ \left|E_v^{(b)}-E_v^{(p)}\right|\le T_2$$
where $T_1$ is the first breakpoint threshold, $T_2$ is the second breakpoint threshold, $E_h^{(b)}$ is the value of the breakpoint in the horizontal edge feature matrix, $E_v^{(b)}$ is the value of the breakpoint in the vertical edge feature matrix, $E_h^{(p)}$ is the value of the neighborhood point in the horizontal edge feature matrix, and $E_v^{(p)}$ is the value of the neighborhood point in the vertical edge feature matrix;
and if a point meeting the second formula exists among the plurality of neighborhood points, taking the point meeting the second formula as a continuation point, replacing the value of the continuation point in the fusion matrix with its continuation average value, taking the continuation point as a new breakpoint, and jumping to the step of obtaining a plurality of neighborhood points in the breakpoint neighborhood according to the neighborhood radius, wherein the continuation average value is the average of the values of the continuation point in the vertical edge feature matrix and in the horizontal edge feature matrix.
Illustratively, the visual tracking of the target object in the embodiments of the present invention is based on extracting image features and comparing them against the features of the image from a previous period. One existing technique is to identify and track targets using convolutional neural networks (CNN).
A tracking method based on convolutional neural networks requires a large number of samples for training in order to raise the recognition rate for target images. When applied to a construction site, the images are captured from video streams, so there are many images and many targets to track; collecting training samples for every target is clearly costly. Moreover, some tools, such as the multiple shovels used on site, are very similar in shape, and it is difficult to identify, distinguish and locate them by such a method.
In addition, objects at a construction site are often occluded, which degrades recognition; for example, when a worker walks behind a sand pile, only the upper body appears in the video, which can cause CNN-based recognition and tracking to fail.
Furthermore, as mentioned above, convolutional neural networks involve a large amount of computation, low efficiency and poor real-time performance, especially when there are many images and targets.
The method of this embodiment is based on image processing: edge features are extracted, an edge feature matrix is constructed, and the position of the target object is finally determined by judging the similarity of matrices together with the probability of the positions where the target appears. The computation cost is therefore low, no up-front investment in sample collection and training is needed, and recognition failures caused by incomplete images are avoided.
For feature matrix extraction, the embodiment of the invention first performs color removal on the image to obtain a gray level map, and then extracts edges from the gray level map using a vertical edge detection template and a horizontal edge detection template.
in the application mode of the template, firstly, a data block (the same type of the data block and the edge detection template) is taken out from the image, and then the edge characteristics are calculated by using a first formula:
$$E=\sum_{i=1}^{n}\sum_{j=1}^{m}T_{ij}\,G_{ij}$$
where $E$ is the horizontal edge feature or the vertical edge feature, $T_{ij}$ is the element in row $i$ and column $j$ of the horizontal edge detection template or the vertical edge detection template, $G_{ij}$ is the element in row $i$ and column $j$ of the first data block, $n$ is the number of rows of the first data block, and $m$ is the number of columns of the first data block.
The obtained edge feature is placed into the edge feature matrix at the position corresponding to the data block, and the data block position is then shifted in a predetermined order, for example from left to right, one data position at a time, moving down one data position upon reaching the rightmost side and restarting from the leftmost position, until the lower right corner of the image is reached.
This process yields two edge feature matrices, representing horizontal edges and vertical edges respectively, and the two matrices are fused to obtain a complete edge matrix. One way of fusing is to calculate the arithmetic mean of the elements at the same position in the two edge matrices and put it into the fusion matrix.
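The scan-and-fuse procedure just described can be sketched as follows; the 3x3 Sobel-style kernels and the unit shift distance are illustrative assumptions, since the patent's own template coefficients are not reproduced here:

```python
import numpy as np

# Assumed Sobel-style templates; the patent's own coefficients differ.
H_TEMPLATE = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float)  # horizontal edges
V_TEMPLATE = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # vertical edges

def edge_feature_matrix(gray, template, step=1):
    """Slide `template` over `gray`, computing at each position the
    first formula: the sum of elementwise products of the template and
    the co-located data block."""
    n, m = template.shape
    rows = (gray.shape[0] - n) // step + 1
    cols = (gray.shape[1] - m) // step + 1
    out = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            block = gray[r*step:r*step+n, c*step:c*step+m]  # first data block
            out[r, c] = np.sum(template * block)
    return out

def fuse(h_edges, v_edges, threshold):
    """Edge synthesis as described: arithmetic mean of the elements at
    the same position, then reset responses below the edge detection
    threshold."""
    fused = (h_edges + v_edges) / 2.0
    fused[fused < threshold] = 0.0
    return fused
```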
In practice, the fusion matrix contains open-loop points on edges that could be closed loops; for example, the outer contour of a mixer could form a closed loop, but the edges at bright spots break under overhead illumination. Some of these edges can be found and connected. In this embodiment, the location of a breakpoint is obtained first. (One way to obtain it: judge the direction from an edge point to its nearest edge point, and look for the nearest edge point in the opposite direction; if the distance to edge points in that opposite direction is large, the edge is broken in that direction, so the point is judged to be a breakpoint.) A plurality of neighborhood points are then found from the breakpoint location and the neighborhood radius; for example, with a radius of 5 pixels, the points within 5 pixels are searched, and the continuity of each neighborhood point is judged by the second formula:
$$\left|E_h^{(b)}-E_h^{(p)}\right|\le T_1\ \text{and}\ \left|E_v^{(b)}-E_v^{(p)}\right|\le T_2$$
where $T_1$ is the first breakpoint threshold, $T_2$ is the second breakpoint threshold, $E_h^{(b)}$ is the value of the breakpoint in the horizontal edge feature matrix, $E_v^{(b)}$ is the value of the breakpoint in the vertical edge feature matrix, $E_h^{(p)}$ is the value of the neighborhood point in the horizontal edge feature matrix, and $E_v^{(p)}$ is the value of the neighborhood point in the vertical edge feature matrix.
When a neighborhood point meets the condition of the second formula, its continuation average value replaces its value in the fusion matrix, the point is taken as the new breakpoint, and the connection steps are repeated.
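A small sketch of this neighborhood continuity test, assuming the thresholded-difference reading of the second formula above:

```python
def connectable_neighbors(bp, h_edges, v_edges, radius, t1, t2):
    """Return the neighborhood points that continue the edge at
    breakpoint `bp`: both the horizontal and the vertical edge
    responses must stay within the two breakpoint thresholds."""
    r0, c0 = bp
    rows, cols = h_edges.shape
    result = []
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            if (r, c) == (r0, c0):
                continue
            if (abs(h_edges[r0, c0] - h_edges[r, c]) <= t1
                    and abs(v_edges[r0, c0] - v_edges[r, c]) <= t2):
                result.append((r, c))
    return result
```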
When the connection is completed, the rows and columns located at the outer periphery without edge points can be deleted.
The last step is the pooling operation, whose goal is to compress the fusion matrix and reduce its numbers of rows and columns. In one application scenario, max pooling is adopted: the fusion matrix is divided into a plurality of small data blocks, the maximum value of each data block is taken, and the maxima are rearranged according to the positions of their data blocks, so that the numbers of rows and columns of the fusion matrix become the same as those of the feature matrix of the previous target image.
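A minimal max-pooling sketch of this compression step; how uneven border blocks are split is an implementation choice assumed here:

```python
import numpy as np

def pool_to_shape(fused, target_rows, target_cols):
    """Max-pool `fused` into a target_rows x target_cols matrix by
    splitting it into a grid of small data blocks and keeping the
    maximum of each block. Assumes the fusion matrix is at least
    target_rows x target_cols."""
    row_edges = np.linspace(0, fused.shape[0], target_rows + 1, dtype=int)
    col_edges = np.linspace(0, fused.shape[1], target_cols + 1, dtype=int)
    out = np.empty((target_rows, target_cols))
    for i in range(target_rows):
        for j in range(target_cols):
            block = fused[row_edges[i]:row_edges[i+1],
                          col_edges[j]:col_edges[j+1]]
            out[i, j] = block.max()
    return out
```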
In step 103, the position of the target object is determined according to a plurality of first similarities and probabilities of the target object appearing in the plurality of first image blocks, wherein the first similarities are determined according to a first feature matrix and a feature matrix of a first target image block, the first similarities represent the similarity degree of the first image block and the first target image block, and the first target image block contains the image of the target object.
In some embodiments, the step 103 includes:
if one first similarity higher than a first similarity threshold exists in the plurality of first similarities, the first similarity higher than the first similarity threshold is taken as a target similarity, and the position of the target object is determined according to a first image block corresponding to the target similarity;
otherwise, determining a plurality of labels according to a second similarity threshold and the first similarities, determining a first image block of the target object according to the labels, a plurality of position probabilities and a plurality of conditional image similarity probabilities, and determining the position of the target object according to the first image block of the target object, wherein the labels represent whether the first image block is similar to the first target image block, the position probabilities are probabilities of the target object appearing in the first image block, the conditional image similarity probabilities are probabilities of image similarity when the target object appears in the first image block, and the second similarity threshold is smaller than the first similarity threshold.
In some embodiments, the determining of the first image block in which the target object appears according to the plurality of labels, the plurality of location probabilities, and the plurality of conditional image similarity probabilities includes:
Determining an image similarity probability according to a third formula, the plurality of position probabilities and the plurality of conditional image similarity probabilities, wherein the image similarity probability is a probability that similar image blocks exist in the plurality of first image blocks, and the third formula is:
$$P(S)=\frac{1}{n}\sum_{i=1}^{n}\Bigl[P(S\mid A_i)\,P(A_i)+P(S\mid\bar A_i)\,P(\bar A_i)\Bigr]$$
where $P(S)$ is the image similarity probability, $P(S\mid A_i)$ is the probability that the first image blocks are similar when the target object appears in the $i$-th first image block, $P(A_i)$ is the prior probability that the target object appears in the $i$-th first image block, $P(S\mid\bar A_i)$ is the probability that the first image blocks are similar when the target object does not appear in the $i$-th first image block, $P(\bar A_i)$ is the prior probability that the target object does not appear in the $i$-th first image block, and $n$ is the number of first image blocks;
determining a plurality of conditional position probabilities according to a fourth formula, the plurality of labels, the image similarity probability, the plurality of position probabilities, and the plurality of conditional image similarity probabilities, wherein the conditional position probabilities characterize a probability that the target object appears in the first image block when images are similar or a probability that the target object appears in the first image block when images are dissimilar, the fourth formula being:
$$P_c^{(i)}=l_i\,\frac{P(S\mid A_i)\,P(A_i)}{P(S)}+\bigl(1-l_i\bigr)\,\frac{P(\bar S\mid A_i)\,P(A_i)}{P(\bar S)}$$
where $P_c^{(i)}$ is the conditional position probability, the first quotient gives $P(A_i\mid S)$, the probability that the target object appears in the $i$-th first image block when the images are similar, $l_i$ is the label, the second quotient gives $P(A_i\mid\bar S)$, the probability that the target object appears in the $i$-th first image block when the images are dissimilar, $P(\bar S\mid A_i)$ is the probability that the first image blocks are dissimilar when the target object appears in the $i$-th first image block, and $P(\bar S)$ is the image dissimilarity probability;
determining a posterior probability that the target object appears in each first image block according to a fifth formula and the plurality of conditional position probabilities, wherein the fifth formula is:
$$P(A_i\mid O)=\frac{P_c^{(i)}}{\sum_{j=1}^{n}P_c^{(j)}}$$
where $P(A_i\mid O)$ is the posterior probability that the target object appears in the $i$-th first image block;
and taking the target first image block as a first image block of the target object, wherein the posterior probability of the target first image block is maximum and the posterior probability of the target first image block is greater than a posterior probability threshold.
Illustratively, with the matrix obtained by the foregoing process, the similarity to the feature matrix of the target object's previous image is determined first, and the position of the target object is then determined based on the probability that the target object appears in each of the plurality of blocks.
The similarity is judged in a plurality of ways, for example, the following formula is adopted:
$$F=\frac{\sum_{i=1}^{n}\sum_{j=1}^{m}\bigl(A_{ij}-\bar A\bigr)\bigl(B_{ij}-\bar B\bigr)}{\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{m}\bigl(A_{ij}-\bar A\bigr)^2}\,\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{m}\bigl(B_{ij}-\bar B\bigr)^2}}$$
where $F$ is the similarity, $A$ is the first feature matrix, $B$ is the feature matrix of the first target image block, $\bar A$ and $\bar B$ are the mean values of the two matrices, $n$ is the number of rows of the first feature matrix, and $m$ is the number of columns of the first feature matrix.
If the similarity value calculated by the above formula is relatively high, for example 0.9 or more (the similarity calculated by the above formula is distributed between -1 and 1), it can be determined that the target object appears in the image block.
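A brief sketch of this similarity measure, assuming the zero-mean normalized correlation form shown above:

```python
import numpy as np

def similarity(a, b):
    """Normalized correlation between two equally shaped feature
    matrices; the result lies in [-1, 1]."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```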
In practice, however, the similarity calculated by the above formula cannot by itself sufficiently prove that the target object appears in an image block. For example, the front-most image block 205 shown in fig. 2 is the image block where the target object 203 is most likely to appear; although its similarity is only 0.6, the similarity of the other image blocks is as low as 0.3 or less, so we still have reason to believe that the target object appears in this block.
The above judgment process involves conditional probability and joint probability. To solve the problem that similarity alone cannot sufficiently prove whether the object appears in an image block, this embodiment first obtains a plurality of conditional probabilities and determines whether the target object appears in an image block based on them. The conditional probability obtained in the embodiment of the present invention is the probability that the images are similar when the target object appears in the image block.
When the target object appears in different image blocks, its image similarity probability differs. The probability of similarity is high when it appears in the front-most block, because the target object is oriented the same way as in the previous image; correspondingly, the probability of similarity is low when it appears at the rear.
Given the similarity probability of each image block, this embodiment may apply the third formula to calculate the overall image similarity probability:
$$P(S)=\frac{1}{n}\sum_{i=1}^{n}\Bigl[P(S\mid A_i)\,P(A_i)+P(S\mid\bar A_i)\,P(\bar A_i)\Bigr]$$
where $P(S)$ is the image similarity probability, $P(S\mid A_i)$ is the probability that the first image blocks are similar when the target object appears in the $i$-th first image block, $P(A_i)$ is the prior probability that the target object appears in the $i$-th first image block, $P(S\mid\bar A_i)$ is the probability that the first image blocks are similar when the target object does not appear in the $i$-th first image block, $P(\bar A_i)$ is the prior probability that the target object does not appear in the $i$-th first image block, and $n$ is the number of first image blocks.
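A minimal sketch of this computation, following the averaged form of the third formula above; all input arrays are hypothetical:

```python
import numpy as np

def image_similarity_probability(p_loc, p_sim_given_loc, p_sim_given_not_loc):
    """Overall probability that a similar image block exists, averaged
    over the n first image blocks.

    p_loc[i]               -- prior P(A_i): target appears in block i
    p_sim_given_loc[i]     -- P(S | A_i)
    p_sim_given_not_loc[i] -- P(S | not A_i)
    """
    p_loc = np.asarray(p_loc, dtype=float)
    return float(np.mean(np.asarray(p_sim_given_loc) * p_loc
                         + np.asarray(p_sim_given_not_loc) * (1.0 - p_loc)))
```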
Meanwhile, this embodiment subdivides the similarities that fall below the obvious-similarity level (as described above, an obviously similar image block is accepted without considering position probability): for example, a similarity above 0.9 directly determines that the target appears in the image block; a similarity below 0.9 but above 0.5 is labeled similar; a similarity below 0.5 is labeled dissimilar. Labels are thereby set for the plurality of image blocks.
In this way, whether each image block is similar is determined. Having determined similarity for each image block, this embodiment further determines the probability that the target object appears in each image block, to which the fourth formula may be applied:
$$P_c^{(i)}=l_i\,\frac{P(S\mid A_i)\,P(A_i)}{P(S)}+\bigl(1-l_i\bigr)\,\frac{P(\bar S\mid A_i)\,P(A_i)}{P(\bar S)}$$
where $P_c^{(i)}$ is the conditional position probability, the first quotient gives $P(A_i\mid S)$, the probability that the target object appears in the $i$-th first image block when the images are similar, $l_i$ is the label, the second quotient gives $P(A_i\mid\bar S)$, the probability that the target object appears in the $i$-th first image block when the images are dissimilar, $P(\bar S\mid A_i)$ is the probability that the first image blocks are dissimilar when the target object appears in the $i$-th first image block, and $P(\bar S)$ is the image dissimilarity probability.
The above procedure describes the probability that the target object appears in an image block when the image blocks are known to be similar or dissimilar. The formula covers several possible situations: even if an image block is similar, the target object does not necessarily appear in it; for example, if background content occupied most of the target object's previous image, the image similarity would be high regardless. Conversely, if a block is not very similar, the target may still be present in it, for example when the target object is occluded by a tree. Comprehensively considering the occurrence probability together with the similarity is therefore the more reliable scheme.
After the conditional position probabilities for the plurality of image blocks are obtained, the posterior probability for each image block may be determined by a fifth equation:
$$P(A_i\mid O)=\frac{P_c^{(i)}}{\sum_{j=1}^{n}P_c^{(j)}}$$
where $P(A_i\mid O)$ is the posterior probability that the target object appears in the $i$-th first image block.
If the posterior probability of an image block is the highest and exceeds the posterior probability threshold, it may be determined that the target object appears in that image block. Conversely, if its posterior probability is not the highest, or is the highest but fails to exceed the threshold, it cannot serve as a basis for concluding that the target object appears in the block; in that case an auxiliary method should be adopted, for example enabling a convolutional neural network to make a detailed determination of which image block the target appears in.
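Combining the fourth and fifth formulas above, a minimal block-selection sketch (the names, the hypothetical probability inputs, and the None fallback for the auxiliary-method case are illustrative assumptions):

```python
import numpy as np

def locate_target(labels, p_loc, p_sim_given_loc, p_sim_given_not_loc,
                  posterior_threshold):
    """Pick the first image block where the target appears, combining
    similarity labels with position priors via Bayes' rule.

    labels[i] is 1 when block i was labeled similar, else 0. Returns
    the block index, or None when no posterior clears the threshold
    (the text then falls back to an auxiliary method such as a CNN).
    Assumes 0 < P(S) < 1 and a nonzero sum of conditional position
    probabilities.
    """
    labels = np.asarray(labels, dtype=float)
    p_loc = np.asarray(p_loc, dtype=float)            # priors P(A_i)
    p_s_a = np.asarray(p_sim_given_loc, dtype=float)  # P(S | A_i)
    p_s_na = np.asarray(p_sim_given_not_loc, dtype=float)
    # Third formula (averaged form): overall image similarity probability.
    p_s = float(np.mean(p_s_a * p_loc + p_s_na * (1.0 - p_loc)))
    p_ns = 1.0 - p_s
    # Fourth formula: conditional position probability per block,
    # taking the similar or dissimilar branch according to the label.
    cond = labels * (p_s_a * p_loc / p_s) \
         + (1.0 - labels) * ((1.0 - p_s_a) * p_loc / p_ns)
    # Fifth formula: normalize into posteriors over the blocks.
    posterior = cond / cond.sum()
    best = int(np.argmax(posterior))
    return best if posterior[best] > posterior_threshold else None
```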
An implementation of the target object visual tracking method of the invention first acquires a plurality of first image blocks, wherein the first image blocks are obtained based on image division of a target area, and the target area represents an area where a target object appears; then obtains a plurality of first feature matrices corresponding to the plurality of first image blocks through an edge extraction operation on each of them; and finally determines the position of the target object according to a plurality of first similarities and the probabilities of the target object appearing in the plurality of first image blocks, wherein the first similarities are determined from a first feature matrix and the feature matrix of the first target image block, the first similarities represent the degree of similarity between the first image block and the first target image block, and the first target image block contains the image of the target object. Similarity is judged between the edge feature matrix constructed from the edge features of each image block and the edge feature matrix of the image containing the target object, and the position of the target object is determined according to the probability of the target object appearing in each image block; the computation load is smaller than that of existing algorithms, and because the probability of each image block is considered, the target object can still be tracked when partially occluded, improving the tracking recognition rate.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
The following are device embodiments of the invention, for details not described in detail therein, reference may be made to the corresponding method embodiments described above.
Fig. 3 is a functional block diagram of a target object visual tracking apparatus according to an embodiment of the present invention, and referring to fig. 3, the target object visual tracking apparatus includes: an image block acquisition module 301, a feature matrix extraction module 302, and a target object tracking module 303, wherein:
the image block obtaining module 301 is configured to obtain a plurality of first image blocks, where the first image blocks are obtained based on image division of a target area, and the target area represents an area where a target object appears;
the feature matrix extracting module 302 is configured to obtain a plurality of first feature matrices corresponding to the plurality of first image blocks through edge extraction operations on the plurality of first image blocks respectively;
the target object tracking module 303 is configured to determine a position of the target object according to a plurality of first similarities and probabilities that the target object appears in the plurality of first image blocks, where the first similarities are determined according to a first feature matrix and a feature matrix of a first target image block, and the first similarities represent a degree of similarity between the first image block and the first target image block, and the first target image block includes an image of the target object.
Fig. 4 is a functional block diagram of an electronic device provided by an embodiment of the present invention. As shown in fig. 4, the electronic apparatus 4 of this embodiment includes: a processor 400 and a memory 401, said memory 401 having stored therein a computer program 402 executable on said processor 400. The processor 400, when executing the computer program 402, implements the steps of the respective target object visual tracking method and embodiment described above, such as steps 101 to 103 shown in fig. 1.
By way of example, the computer program 402 may be partitioned into one or more modules/units that are stored in the memory 401 and executed by the processor 400 to accomplish the present invention.
The electronic device 4 may be a computing device such as a desktop computer, a notebook computer, a palm computer, a cloud server, etc. The electronic device 4 may include, but is not limited to, a processor 400, a memory 401. It will be appreciated by those skilled in the art that fig. 4 is merely an example of the electronic device 4 and is not meant to be limiting of the electronic device 4, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., the electronic device 4 may further include input-output devices, network access devices, buses, etc.
The processor 400 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 401 may be an internal storage unit of the electronic device 4, such as a hard disk or a memory of the electronic device 4. The memory 401 may also be an external storage device of the electronic device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device 4. Further, the memory 401 may also include both an internal storage unit and an external storage device of the electronic device 4. The memory 401 is used for storing the computer program 402 and other programs and data required by the electronic device 4. The memory 401 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of the functional units and modules is illustrated. In practical applications, the above functions may be distributed among different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts of an embodiment that are not detailed, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other manners. For example, the apparatus/electronic device embodiments described above are merely illustrative: the division into modules or units is merely a logical functional division, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the procedures in the methods of the above embodiments by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, implements the steps of the method and apparatus embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
The above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features thereof may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention and should be included in the protection scope of the present invention.

Claims (8)

1. A method for visual tracking of a target object, comprising:
acquiring a plurality of first image blocks, wherein the first image blocks are acquired based on image division of a target area, and the target area represents an area where a target object appears;
respectively carrying out edge extraction operation on the plurality of first image blocks to obtain a plurality of first feature matrixes corresponding to the plurality of first image blocks;
determining the position of the target object according to a plurality of first similarities and probabilities of the target object appearing in the plurality of first image blocks, wherein the first similarities are determined according to a first feature matrix and a feature matrix of a first target image block, the first similarities represent the similarity degree of the first image block and the first target image block, and the first target image block contains an image of the target object;
wherein the determining the position of the target object according to the plurality of first similarities and the probabilities of the target object appearing in the plurality of first image blocks includes:
if a first similarity higher than a first similarity threshold exists among the plurality of first similarities, taking that first similarity as a target similarity, and determining the position of the target object according to the first image block corresponding to the target similarity;
otherwise, determining a plurality of labels according to a second similarity threshold and the plurality of first similarities, determining the first image block where the target object appears according to the labels, a plurality of position probabilities and a plurality of conditional image similarity probabilities, and determining the position of the target object according to the first image block where the target object appears, wherein a label characterizes whether a first image block is similar to the first target image block, a position probability is the probability that the target object appears in a first image block, a conditional image similarity probability is the probability that the images are similar when the target object appears in a first image block, and the second similarity threshold is smaller than the first similarity threshold;
wherein the determining, according to the plurality of labels, the plurality of position probabilities, and the plurality of conditional image similarity probabilities, the first image block in which the target object appears includes:
determining an image similarity probability according to a third formula, the plurality of position probabilities and the plurality of conditional image similarity probabilities, wherein the image similarity probability is the probability that similar image blocks exist among the plurality of first image blocks, and the third formula is:

$$P(IS)=\frac{1}{N}\sum_{n=1}^{N}\left[P(IS\mid PST_{n})\,P(PST_{n})+P(IS\mid \overline{PST_{n}})\,P(\overline{PST_{n}})\right]$$

wherein P(IS) is the image similarity probability, P(IS|PST_n) is the probability that the first image blocks are similar when the target object appears in the n-th first image block, P(PST_n) is the prior probability that the target object appears in the n-th first image block, P(IS|PST̄_n) is the probability that the first image blocks are similar when the target object does not appear in the n-th first image block, P(PST̄_n) is the prior probability that the target object does not appear in the n-th first image block, and N is the number of first image blocks;
determining a plurality of conditional position probabilities according to a fourth formula, the plurality of labels, the image similarity probability, the plurality of position probabilities, and the plurality of conditional image similarity probabilities, wherein a conditional position probability characterizes the probability that the target object appears in a first image block when the images are similar, or the probability that the target object appears in a first image block when the images are dissimilar, the fourth formula being:

$$P(CPST_{n})=Label\cdot\frac{P(IS\mid PST_{n})\,P(PST_{n})}{P(IS)}+(1-Label)\cdot\frac{P(\overline{IS}\mid PST_{n})\,P(PST_{n})}{P(\overline{IS})}$$

wherein P(CPST_n) is the conditional position probability, P(PST_n|IS) is the probability that the target object appears in the n-th first image block when the images are similar, Label is the label, P(PST_n|IS̄) is the probability that the target object appears in the n-th first image block when the images are dissimilar, P(IS̄|PST_n) is the probability that the first image blocks are dissimilar when the target object appears in the n-th first image block, and P(IS̄) is the image dissimilarity probability;
determining a posterior probability that the target object appears in each first image block according to a fifth formula and the plurality of conditional position probabilities, wherein the fifth formula is:

$$P(RPST_{n})=\frac{P(CPST_{n})}{\sum_{m=1}^{N}P(CPST_{m})}$$

wherein P(RPST_n) is the posterior probability that the target object appears in the n-th first image block;
and taking a target first image block as the first image block where the target object appears, wherein the target first image block is the first image block whose posterior probability is the largest and greater than a posterior probability threshold.
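For illustration only, a minimal Python sketch of this probabilistic fallback, assuming the formulas as reconstructed above; the input names (labels, p_pst, p_is_given_pst), the default threshold, and the use of 1 − P(IS|PST_n) as a stand-in for the absent-case similarity probability are all assumptions of this sketch, not terms fixed by the claim:

```python
import numpy as np

def locate_by_posterior(labels, p_pst, p_is_given_pst, posterior_threshold=0.5):
    """Sketch of the probabilistic fallback of claim 1.

    labels         : 0/1 per first image block (similar to the target block?)
    p_pst          : position probabilities P(PST_n)
    p_is_given_pst : conditional image similarity probabilities P(IS|PST_n)
    """
    labels = np.asarray(labels, dtype=float)
    p_pst = np.asarray(p_pst, dtype=float)
    p_is_pst = np.asarray(p_is_given_pst, dtype=float)
    n_blocks = p_pst.size

    # Third formula: image similarity probability by total probability.
    # (1 - p_is_pst) stands in for P(IS | not PST_n) -- an assumption.
    p_is = np.sum(p_is_pst * p_pst
                  + (1.0 - p_is_pst) * (1.0 - p_pst)) / n_blocks
    p_not_is = 1.0 - p_is

    # Fourth formula: conditional position probabilities via Bayes' rule,
    # selected per block by its label.
    p_pst_given_is = p_is_pst * p_pst / max(p_is, 1e-12)
    p_pst_given_not = (1.0 - p_is_pst) * p_pst / max(p_not_is, 1e-12)
    p_cpst = labels * p_pst_given_is + (1.0 - labels) * p_pst_given_not

    # Fifth formula: posterior by normalizing over all first image blocks.
    p_rpst = p_cpst / max(float(p_cpst.sum()), 1e-12)

    # The target first image block must have the largest posterior and
    # exceed the posterior probability threshold.
    best = int(np.argmax(p_rpst))
    return best if p_rpst[best] > posterior_threshold else None
```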
2. The method of claim 1, wherein the acquiring a plurality of first image blocks comprises:
acquiring a first position, a second position and a third position, wherein the first position, the second position and the third position are determined based on the first target image block, a second target image block and a third target image block respectively, the second target image block and the third target image block contain images of the target object, and the time nodes corresponding to the first, second and third target image blocks are sequentially ordered;
determining a moving direction, a direction change rate, a moving speed and a moving speed change rate of the target object according to the first position, the second position and the third position;
determining a target moving direction and a moving radius according to the moving direction, the direction change rate, the moving speed and the moving speed change rate;
determining the target area according to the first position, the target moving direction and the moving radius, wherein the target area is a rectangle centered on the first position, the long axis or the short axis of the target area is parallel to the target moving direction, and the target area contains a planning circle, the planning circle being a circle centered on the first position with the moving radius as its radius;
and dividing the image of the target area into a plurality of image blocks, as the plurality of first image blocks, according to the long axis and the short axis of the target area.
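A minimal sketch of this target-area prediction, assuming 2-D positions as numpy arrays; the time step dt, the radius_scale heuristic and the 2:1 aspect ratio of the rectangle are assumptions, since the claim does not fix how the direction and speed change rates map to the moving radius:

```python
import numpy as np

def predict_target_area(p1, p2, p3, dt=1.0, radius_scale=1.5):
    """Sketch of claim 2: derive a search rectangle from three past positions.

    p1 is the newest position, p3 the oldest (assumed ordering).
    """
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))

    v1, v2 = (p1 - p2) / dt, (p2 - p3) / dt            # two velocity samples
    speed = np.linalg.norm(v1)
    speed_rate = (speed - np.linalg.norm(v2)) / dt     # moving speed change rate
    ang1 = np.arctan2(v1[1], v1[0])                    # moving direction
    ang2 = np.arctan2(v2[1], v2[0])
    direction_rate = (ang1 - ang2) / dt                # direction change rate

    # Extrapolate one step ahead to get the target moving direction/radius.
    target_angle = ang1 + direction_rate * dt
    target_dir = np.array([np.cos(target_angle), np.sin(target_angle)])
    radius = radius_scale * max(speed + speed_rate * dt, 0.0) * dt

    # Rectangle centred on the first position, one axis parallel to the
    # target moving direction, large enough to contain the planning circle.
    return {"center": p1, "axis": target_dir,
            "half_long": 2.0 * radius, "half_short": radius,
            "planning_radius": radius}
```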
3. The method according to claim 1, wherein the obtaining a plurality of first feature matrices corresponding to the plurality of first image blocks through edge extraction operations on the plurality of first image blocks respectively includes:
acquiring a target row number and a target column number, wherein the target row number and the target column number are respectively the number of rows and the number of columns of the feature matrix of the first target image block;
for each of the plurality of first image blocks, performing the following operations, respectively:
converting the image block to gray scale to obtain a gray-scale map;
extracting a horizontal edge feature matrix and a vertical edge feature matrix from the gray-scale map using a horizontal edge detection template and a vertical edge detection template;
constructing, according to the horizontal edge feature matrix and the vertical edge feature matrix, a fusion matrix characterizing edge features by means of edge synthesis and breakpoint connection;
deleting the edge blank rows and edge blank columns of the fusion matrix, wherein the edge blank rows and edge blank columns are rows and columns which are located at the edge of the fusion matrix and do not include edge features;
and performing a pooling operation on the fusion matrix after the deleting operation according to the target row number and the target column number, and taking the fusion matrix after the pooling operation as a first feature matrix.
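A condensed sketch of this per-block pipeline in Python, assuming Sobel operators as the edge detection templates and mean pooling to the target row/column counts (both assumptions; the claims do not fix the template values or the pooling method). Breakpoint connection is deferred to the sketch after claim 5:

```python
import numpy as np
from scipy.signal import correlate2d

# Assumed 3x3 edge detection templates; the claims do not fix their values.
TEMPLATE_H = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
TEMPLATE_V = TEMPLATE_H.T

def block_feature(block, target_rows, target_cols, edge_threshold=32.0):
    """Sketch of claim 3: grayscale -> edge matrices -> fusion -> trim -> pool."""
    gray = block.mean(axis=2) if block.ndim == 3 else block.astype(float)

    h_edges = correlate2d(gray, TEMPLATE_H, mode="valid")
    v_edges = correlate2d(gray, TEMPLATE_V, mode="valid")

    # Edge synthesis: average of corresponding elements (absolute responses).
    fusion = (np.abs(h_edges) + np.abs(v_edges)) / 2.0
    fusion[fusion < edge_threshold] = 0.0

    # Delete edge blank rows/columns (border rows/columns with no edges).
    rows = np.flatnonzero(fusion.any(axis=1))
    cols = np.flatnonzero(fusion.any(axis=0))
    if rows.size and cols.size:
        fusion = fusion[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Pool to the target row/column counts of the target feature matrix.
    r_idx = np.linspace(0, fusion.shape[0], target_rows + 1).astype(int)
    c_idx = np.linspace(0, fusion.shape[1], target_cols + 1).astype(int)

    def cell_mean(i, j):
        cell = fusion[r_idx[i]:r_idx[i + 1], c_idx[j]:c_idx[j + 1]]
        return float(cell.mean()) if cell.size else 0.0

    return np.array([[cell_mean(i, j) for j in range(target_cols)]
                     for i in range(target_rows)])
```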
4. The method according to claim 3, wherein extracting a horizontal edge feature matrix and a vertical edge feature matrix from the gray-scale map using a horizontal edge detection template and a vertical edge detection template comprises:
acquiring a positioning indicator, the horizontal edge detection template and the vertical edge detection template;
taking out, from the gray-scale map according to the positioning indicator, a first data block of the same shape as the horizontal edge detection template and the vertical edge detection template;
extracting the horizontal edge feature and the vertical edge feature of the first data block according to a first formula, the horizontal edge detection template and the vertical edge detection template, wherein the first formula is:

$$feature=\sum_{rn=1}^{Rn}\sum_{ln=1}^{Ln} module(rn,ln)\cdot datablock(rn,ln)$$

wherein feature is the horizontal edge feature or the vertical edge feature, module(rn, ln) is the element in row rn and column ln of the horizontal edge detection template or the vertical edge detection template, datablock(rn, ln) is the element in row rn and column ln of the first data block, Rn is the number of rows of the first data block, and Ln is the number of columns of the first data block;
adding, according to the positioning indicator, the horizontal edge feature and the vertical edge feature into the horizontal edge feature matrix and the vertical edge feature matrix respectively;
and if the positioning indicator has not reached the end of the gray-scale map, shifting the positioning indicator by a preset shifting distance and jumping back to the step of taking out, from the gray-scale map according to the positioning indicator, a first data block of the same shape as the horizontal edge detection template and the vertical edge detection template.
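A direct sketch of this scan in Python, with the first formula implemented as an element-wise product-and-sum over each data block; the (row, col) loop indices play the role of the positioning indicator and the stride parameter stands in for the preset shifting distance (both assumptions about the concrete representation):

```python
import numpy as np

def scan_edges(gray, template_h, template_v, stride=1):
    """Sketch of claim 4: slide both templates over the gray-scale map.

    Returns the horizontal and vertical edge feature matrices.
    """
    gray = np.asarray(gray, dtype=float)
    rn, ln = template_h.shape                    # data block rows/columns
    out_r = (gray.shape[0] - rn) // stride + 1
    out_c = (gray.shape[1] - ln) // stride + 1
    h_feat = np.empty((out_r, out_c))
    v_feat = np.empty((out_r, out_c))

    for i in range(out_r):
        for j in range(out_c):
            # Take out a first data block of the same shape as the templates.
            datablock = gray[i * stride:i * stride + rn,
                             j * stride:j * stride + ln]
            # First formula: sum of element-wise products over the block.
            h_feat[i, j] = np.sum(template_h * datablock)
            v_feat[i, j] = np.sum(template_v * datablock)
    return h_feat, v_feat
```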
5. The method according to claim 3, wherein constructing, according to the horizontal edge feature matrix and the vertical edge feature matrix, a fusion matrix characterizing edge features by means of edge synthesis and breakpoint connection comprises:
acquiring an edge detection threshold value and a neighborhood radius;
calculating the average value of the corresponding elements of the horizontal edge feature matrix and the vertical edge feature matrix, and adding the average values into a fusion matrix;
zeroing the elements of the fusion matrix that are smaller than the edge detection threshold;
acquiring the breakpoints in the fusion matrix, wherein a breakpoint is a point at which an edge is open (discontinuous);
for each breakpoint in the fusion matrix, performing the steps of:
obtaining a plurality of neighborhood points in the breakpoint neighborhood according to the neighborhood radius;
judging the continuity of the plurality of neighborhood points according to a second formula, wherein the second formula is:

$$\left|\sqrt{V_{bx}^{2}+H_{bx}^{2}}-\sqrt{V_{nx}^{2}+H_{nx}^{2}}\right|\le A_{th}\quad\text{and}\quad\left|\arctan\frac{V_{bx}}{H_{bx}}-\arctan\frac{V_{nx}}{H_{nx}}\right|\le\theta_{th}$$

wherein Ath is a first breakpoint threshold, θth is a second breakpoint threshold, V_bx is the value of the breakpoint in the horizontal edge feature matrix, H_bx is the value of the breakpoint in the vertical edge feature matrix, V_nx is the value of a neighborhood point in the horizontal edge feature matrix, and H_nx is the value of the neighborhood point in the vertical edge feature matrix;
and if points satisfying the second formula exist among the plurality of neighborhood points, taking the points satisfying the second formula as continuation points, replacing the value of each continuation point in the fusion matrix with its continuation average value, and jumping back to the step of acquiring a plurality of neighborhood points in the breakpoint neighborhood according to the neighborhood radius, wherein the continuation average value is the average of the continuation point's value in the horizontal edge feature matrix and its value in the vertical edge feature matrix.
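A sketch of this breakpoint-connection loop in Python, assuming the second formula as reconstructed above (magnitude and direction agreement within Ath and θth); the threshold values are illustrative, arctan2 replaces the arctan of a ratio to avoid division by zero, and breakpoint detection itself is left as an input:

```python
import numpy as np

def connect_breakpoints(fusion, h_feat, v_feat, breakpoints,
                        radius=1, a_th=8.0, theta_th=0.35):
    """Sketch of claim 5's breakpoint connection loop.

    `breakpoints` is an iterable of (row, col) open-edge endpoints; how they
    are detected is not shown. All three matrices are assumed to share a shape.
    """
    def magnitude(r, c):
        return float(np.hypot(h_feat[r, c], v_feat[r, c]))

    def direction(r, c):
        return float(np.arctan2(v_feat[r, c], h_feat[r, c]))

    visited = set()
    pending = list(breakpoints)
    while pending:
        br, bc = pending.pop()
        if (br, bc) in visited:
            continue
        visited.add((br, bc))
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                nr, nc = br + dr, bc + dc
                if (dr, dc) == (0, 0):
                    continue
                if not (0 <= nr < fusion.shape[0] and 0 <= nc < fusion.shape[1]):
                    continue
                # Second formula: magnitude and direction must both agree.
                if (abs(magnitude(br, bc) - magnitude(nr, nc)) <= a_th and
                        abs(direction(br, bc) - direction(nr, nc)) <= theta_th):
                    # Continuation point: write its average edge response into
                    # the fusion matrix and keep extending from it.
                    fusion[nr, nc] = (abs(h_feat[nr, nc])
                                      + abs(v_feat[nr, nc])) / 2.0
                    pending.append((nr, nc))
    return fusion
```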
6. A target object visual tracking device for implementing the target object visual tracking method according to any one of claims 1 to 5, the target object visual tracking device comprising:
the image block acquisition module is used for acquiring a plurality of first image blocks, wherein the first image blocks are acquired based on image division of a target area, and the target area represents an area where a target object appears;
the feature matrix extraction module is used for obtaining a plurality of first feature matrixes corresponding to the plurality of first image blocks through edge extraction operation on the plurality of first image blocks respectively;
and the target object tracking module is used for determining the position of the target object according to a plurality of first similarities and the probability of the target object appearing in the plurality of first image blocks, wherein the first similarities are determined according to a first feature matrix and the feature matrix of the first target image block, the first similarities represent the similarity degree of the first image block and the first target image block, and the first target image block contains the image of the target object.
7. An electronic device comprising a memory and a processor, the memory having stored therein a computer program executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 5.
8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
CN202311115538.2A 2023-08-31 2023-08-31 Target object visual tracking method and device, electronic equipment and storage medium Active CN116993785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311115538.2A CN116993785B (en) 2023-08-31 2023-08-31 Target object visual tracking method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116993785A CN116993785A (en) 2023-11-03
CN116993785B true CN116993785B (en) 2024-02-02

Family

ID=88533931

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117292104B (en) * 2023-11-22 2024-02-27 南京掌控网络科技有限公司 Goods shelf display detection method and system based on image recognition

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654514A (en) * 2015-12-31 2016-06-08 中国人民解放军国防科学技术大学 Image target tracking method
CN106204644A (en) * 2016-07-01 2016-12-07 屈桢深 A kind of target long-term follow method based on video
CN106886784A (en) * 2017-02-16 2017-06-23 长沙理工大学 A kind of modified joint sparse based on template renewal represents foreign matter tracking in big transfusion
CN109671103A (en) * 2018-12-12 2019-04-23 易视腾科技股份有限公司 Method for tracking target and device
CN110488874A (en) * 2019-08-29 2019-11-22 五邑大学 A kind of education auxiliary robot and its control method
CN110516705A (en) * 2019-07-19 2019-11-29 平安科技(深圳)有限公司 Method for tracking target, device and computer readable storage medium based on deep learning
CN110570451A (en) * 2019-08-05 2019-12-13 武汉大学 multithreading visual target tracking method based on STC and block re-detection
CN111340842A (en) * 2020-02-17 2020-06-26 江南大学 Correlation filtering target tracking algorithm based on joint model
CN111612822A (en) * 2020-05-21 2020-09-01 广州海格通信集团股份有限公司 Object tracking method and device, computer equipment and storage medium
CN113362341A (en) * 2021-06-10 2021-09-07 中国人民解放军火箭军工程大学 Air-ground infrared target tracking data set labeling method based on super-pixel structure constraint
CN114463664A (en) * 2021-12-23 2022-05-10 智能多维数据分析研究中心有限公司 Novel ice hockey tracking method for ice hockey sports
CN116030096A (en) * 2023-01-13 2023-04-28 普联技术有限公司 Target identification method, device, terminal equipment and computer readable storage medium
CN116503239A (en) * 2023-05-15 2023-07-28 芯驿电子科技(上海)有限公司 Vehicle-mounted video data abnormality processing and simulation method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010039788A (en) * 2008-08-05 2010-02-18 Toshiba Corp Image processing apparatus and method thereof, and image processing program
US11914674B2 (en) * 2011-09-24 2024-02-27 Z Advanced Computing, Inc. System and method for extremely efficient image and pattern recognition and artificial intelligence platform

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Jieyu Chen et al.; "Multiple Object Tracking Using Edge Multi-Channel Gradient Model With ORB Feature"; IEEE Access; vol. 9; pp. 2294-2309 *
Shen Yingju; "Research on Visual Target Tracking Algorithms Based on Image Sets"; China Masters' Theses Full-text Database, Information Science and Technology (no. 3); p. I138-6096 *
Xiong Ziwei; "Research on Target Tracking Based on Robust Feature Learning"; China Masters' Theses Full-text Database, Information Science and Technology (no. 7); p. I138-1123 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant