CN112819706B - Method for determining identification frame of superimposed display, readable storage medium and electronic device - Google Patents


Info

Publication number
CN112819706B
CN112819706B
Authority
CN
China
Prior art keywords
frame
pixel
identification
pixels
value
Prior art date
Legal status
Active
Application number
CN202110046727.3A
Other languages
Chinese (zh)
Other versions
CN112819706A (en)
Inventor
叶炎钟
李文斌
Current Assignee
Hangzhou Ruiying Technology Co ltd
Original Assignee
Hangzhou Ruiying Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Ruiying Technology Co ltd
Priority to CN202110046727.3A
Publication of CN112819706A
Application granted
Publication of CN112819706B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a method for determining an identification frame for superimposed display, a readable storage medium, and an electronic device. The method comprises the following steps: acquiring position information of a first identification frame in an original frame; converting the position information of the first identification frame in the original frame into position information of the first identification frame in the output frame; and, according to the position information of the first identification frame in the output frame, if the position of the start point of the first identification frame in the output frame is determined to point between two pixels of the output frame in the motion direction of the object, setting a second identification frame superimposed and displayed on the output frame as follows: with the two pixels as references, a first preset number of pixels and a second preset number of pixels are extended to the two sides in the motion direction of the target object, yielding the frame of the second identification frame in that direction, where the motion direction of the target object is horizontal or vertical. The invention reduces jitter of the identification frame during superimposed display.

Description

Method for determining identification frame of superimposed display, readable storage medium and electronic device
Technical Field
The present invention relates to the field of video display technologies, and in particular, to a method for determining an identification frame for overlay display, a readable storage medium, and an electronic device.
Background
In video processing, after an object and its position are identified in a frame, a border (usually rectangular) is drawn at the object's contour boundary to mark the object. If the object is moving, the displacement of the identification frame relative to the object changes after the frame is superimposed, producing a visible jitter effect.
Video image formats in which color components must be reconstructed by interpolation include YUV422, YVU422, YUV420, and YVU420. Taking these as examples, and YUV420 in particular, the process of superimposing and displaying an identification frame is as follows:
1) When an object and its position are identified in the current frame of the acquired video stream, the position at which the identification frame will be superimposed at the object's contour boundary in the current frame is determined.
The location of the identification frame in the current frame may be expressed by the start-point coordinates of the identification frame, the whole frame width (the width of the entire identification frame), and the whole frame height (the height of the entire identification frame), where the width of the identification frame's border in each of the four directions (hereinafter, the frame width) is generally two pixels.
Fig. 1 presents a schematic view of an identification frame. Because a YUV420-format image shares one U component and one V component among every 4 Y components, the abscissa and ordinate of the identification frame's start point in the current frame must both be even (the units of both coordinates are pixels; the top-left vertex of the current frame is the origin (0, 0), horizontal-right is the positive horizontal direction, and vertical-down is the positive vertical direction).
2) The position information of the identification frame in the current frame (the start-point coordinates, whole frame width, and whole frame height) is normalized, and the normalized position information, comprising the normalized start-point coordinates, normalized whole frame width, and normalized whole frame height of the identification frame, is stored in a cache.
Divide the abscissa of the identification frame's start point by the width of the current frame to obtain the normalized start-point abscissa, and divide the ordinate of the start point by the height of the current frame to obtain the normalized start-point ordinate; divide the whole frame width of the identification frame by the width of the current frame to obtain the normalized whole frame width; and divide the whole frame height of the identification frame by the height of the current frame to obtain the normalized whole frame height. All four normalized values are then less than 1.
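The normalization in step 2) can be sketched as follows (a minimal illustration; the function and argument names are not from the patent):

```python
def normalize_box(x, y, box_w, box_h, frame_w, frame_h):
    """Normalize an identification frame's start point and size by the
    current frame's resolution; every returned value is less than 1."""
    return (x / frame_w, y / frame_h, box_w / frame_w, box_h / frame_h)
```

For a 1920x1080 current frame, a box starting at (960, 270) with size 240x135 normalizes to (0.5, 0.25, 0.125, 0.125).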
3) When the current frame is to be displayed, the stored normalized position information of each identification frame in the current frame is read, and for each identification frame the normalized start-point coordinates, normalized whole frame width, and normalized whole frame height are converted into display position information as follows:
Multiplying the abscissa and the ordinate of the normalized starting point of the identification frame by the preset width and height of the output frame to obtain the abscissa and the ordinate of the display starting point of the identification frame;
Multiplying the normalized whole frame width of the identification frame by the preset width of the output frame to obtain the display width of the identification frame;
And multiplying the normalized whole frame height of the identification frame by the preset output frame height to obtain the display height of the identification frame.
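The conversion in step 3) is the inverse scaling by the output frame's preset dimensions; a minimal sketch (hypothetical names), whose results may be fractional, which is what later causes the jitter:

```python
def to_display_position(nx, ny, nw, nh, out_w, out_h):
    """Map normalized box values back to output-frame pixel units.
    Results are floats and may carry a fractional part."""
    return (nx * out_w, ny * out_h, nw * out_w, nh * out_h)
```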
4) If the abscissa or ordinate of the display start point obtained in 3) is not an integer, the non-integer coordinate(s) are rounded.
As shown in fig. 2: if the abscissa of the display start point of the identification frame obtained in 3) is 1001.41, rounding yields 1000 (because a YUV420-format image shares one U and one V component among every 4 Y components, the abscissa and ordinate of the display start point must be even), and the left frame of the identification frame is the left frame of identification frame 1 shown in fig. 2;
As shown in fig. 2: if the abscissa of the display start point of the identification frame obtained in 3) is 1001.51, rounding yields 1002, and the left frame of the identification frame is the left frame of identification frame 2 shown in fig. 2.
Thus, although the display start-point abscissas of the identification frame calculated in 3) differ by only 0.1 pixel, a deviation of 2 pixels can appear when the identification frame is displayed, causing the identification frame to jitter when the video is displayed in real time.
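One way to reproduce the mapping in the two examples above (1001.41 to 1000, 1001.51 to 1002) is to round to the nearest integer and then force the result even; the helper name is an assumption, not the patent's:

```python
def display_abscissa(a):
    """Round a fractional display coordinate, then clear the lowest bit
    so the result is even (YUV420 shares one U and one V sample among
    every 4 Y samples, so the start coordinate must be even)."""
    return int(round(a)) & ~1
```

This makes the 2-pixel jump concrete: a 0.1-pixel input difference (1001.41 vs 1001.51) lands on pixels two apart (1000 vs 1002).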
Extensive experimentation has shown that, at present, algorithms that superimpose identification frames while video is displayed produce an average deviation of about 0.5 pixels.
If round-to-nearest is replaced with another rounding method, cases remain in which display positions calculated in 3) that differ by only 0.1 pixel still produce a 2-pixel deviation when the identification frames are displayed.
In addition, even when the image format onto which the identification frame is superimposed carries every component in full (that is, no component of any pixel needs to be reconstructed by interpolation, as in YUV444, YVU444, or RGB), jitter during superimposition is much smaller but still present. The main reason is that the normalized position information of the identification frame is fractional; after conversion into display position information it may still be fractional, and the fractional part cannot be honored when the frame is actually displayed, so the display position deviates from the true position and jitter results.
Disclosure of Invention
Embodiments of the invention provide a method for determining an identification frame for superimposed display, a readable storage medium, and an electronic device, so as to reduce jitter when the identification frame is superimposed and displayed.
The technical scheme of the embodiment of the invention is realized as follows:
a method of determining an identification frame for a superimposed display, the method comprising:
acquiring position information of a first identification frame in an original frame, wherein the first identification frame is used for marking a target object in the original frame;
Converting the position information of the first identification frame in the original frame into the position information of the first identification frame in the output frame; the output frame is the same as the original frame in content but different in resolution;
According to the position information of the first identification frame in the output frame, if the position of the starting point of the first identification frame in the output frame is determined to be between two pixels pointing to the output frame in the moving direction of the target object, a second identification frame which is displayed in a superimposed manner on the output frame is set in the following manner:
And respectively expanding a first preset number of pixels and a second preset number of pixels to two sides in the moving direction of the target object by taking the two pixels as references to obtain a frame of the second identification frame in the moving direction of the target object, wherein the moving direction of the target object is a horizontal direction or a vertical direction.
When the position of the start point of the first identification frame in the output frame points exactly to one pixel of the output frame, taking the two pixels as references comprises:
the pixel of the output frame to which the position of the start point of the first identification frame in the output frame is exactly pointed is taken as a first pixel, and,
When the motion direction of the target object is the horizontal direction, taking the pixel which is in the same row with the first pixel and is positioned on the right side of the first pixel on the output frame as a second pixel; when the motion direction of the target object is the vertical direction, taking the pixel which is in the same column with the first pixel and is positioned below the first pixel on the output frame as a second pixel;
The first pixel and the second pixel are used as references.
After the first preset number of pixels and the second preset number of pixels are respectively extended to two sides in the moving direction of the target object, the method further comprises the steps of:
If the moving direction of the target object is the horizontal direction, taking the pixel at the leftmost side after expansion as a starting point of the first row of pixels of the left frame of the second identification frame, and taking the pixel at the rightmost side after expansion as an end point of the first row of pixels of the left frame of the second identification frame;
if the moving direction of the object is a vertical direction, the pixel at the uppermost position after expansion is taken as a starting point of the first column of pixels of the upper frame of the second identification frame, and the pixel at the lowermost position after expansion is taken as an end point of the first column of pixels of the upper frame of the second identification frame.
The first preset number is 1+2×m, the second preset number is 1+2×n, m and n are non-negative integers, and m and n are not necessarily equal.
After determining that the position of the start point of the first identification frame in the output frame points between two pixels of the output frame in the motion direction of the object, the method further comprises:
Determining that the first of the two pixels is an even-numbered pixel in its row of the output frame, where the first pixel is whichever of the two pixels lies on the left or above;
and after the frame of the second identification frame in the movement direction of the target object is obtained, the method further comprises:
for each non-boundary pixel on the second identification frame, setting the Y, U, V value of that pixel to be the same as the Y, U, V value of the predefined second identification frame;
For each boundary pixel on the second identification frame, the U, V value of the pixel is set to be the same as the U, V value of the predefined second identification frame, and the Y value of the pixel is set to be: and fusing the Y value of the second predefined identification frame with the Y value of the pixel of the output frame corresponding to the pixel.
Setting the Y value of the pixel to the fused value of the Y value of the predefined second identification frame and the Y value of the corresponding output-frame pixel comprises:
If the pixel is located at the left or upper boundary of the frame of the second identification frame, the Y value of the pixel equals α·[1-(A-A0)]·Y1 + β·(A-A0)·Y0;
If the pixel is located at the right or lower boundary of the frame of the second identification frame, the Y value of the pixel equals α·(A-A0)·Y1 + β·[1-(A-A0)]·Y0;
where Y1 is the Y value of the predefined second identification frame and Y0 is the Y value of the output-frame pixel corresponding to this pixel; A is the abscissa of the start point of the first identification frame in the output frame after the position information of the first identification frame in the original frame has been converted; A0 is the integer part of A; and α and β are preset constants with 0 < α ≤ 1 and 0 < β ≤ 1.
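The two fusion formulas can be written out as a short helper (the name `fuse_y` and the `side` argument are hypothetical; `alpha` and `beta` default to 1.0 here, within the patent's 0 < α ≤ 1, 0 < β ≤ 1 range):

```python
def fuse_y(y1, y0, a, side, alpha=1.0, beta=1.0):
    """Blend the predefined border Y value (y1) with the underlying
    output-frame pixel's Y value (y0) using frac = A - A0, the
    fractional part of the converted start abscissa A."""
    frac = a - int(a)  # A - A0
    if side in ("left", "upper"):
        return alpha * (1 - frac) * y1 + beta * frac * y0
    # right or lower boundary
    return alpha * frac * y1 + beta * (1 - frac) * y0
```

With y1 = 200, y0 = 100, and A = 1001.25 (so frac = 0.25), the left boundary weighs the border color more heavily (175.0) and the right boundary weighs the background more heavily (125.0), which is the sub-pixel shading that hides the start point's fractional offset.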
After determining that the position of the start point of the first identification frame in the output frame points between two pixels of the output frame in the motion direction of the object, the method further comprises:
Determining that the first of the two pixels is an odd-numbered pixel in its row of the output frame, where the first pixel is whichever of the two pixels lies on the left or above;
and after the frame of the second identification frame in the movement direction of the target object is obtained, the method further comprises:
for each non-boundary pixel on the second identification frame, setting the Y, U, V value of that pixel to be the same as the Y, U, V value of the predefined second identification frame;
for each boundary pixel on the second identification frame, setting the U, V value of the pixel to be the same as the U, V value of the pixel of the output frame corresponding to the pixel, and setting the Y value of the pixel to be: and fusing the Y value of the second predefined identification frame with the Y value of the pixel of the output frame corresponding to the pixel.
Setting the Y value of the pixel to the fused value of the Y value of the predefined second identification frame and the Y value of the corresponding output-frame pixel comprises:
If the pixel is located at the left or upper boundary of the frame of the second identification frame, the Y value of the pixel equals α·[1-(A-A0)]·Y1 + β·(A-A0)·Y0;
If the pixel is located at the right or lower boundary of the frame of the second identification frame, the Y value of the pixel equals α·(A-A0)·Y1 + β·[1-(A-A0)]·Y0;
where Y1 is the Y value of the predefined second identification frame and Y0 is the Y value of the output-frame pixel corresponding to this pixel; A is the abscissa of the start point of the first identification frame in the output frame after the position information of the first identification frame in the original frame has been converted; A0 is the integer part of A; and α and β are preset constants with 0 < α ≤ 1 and 0 < β ≤ 1.
The format of the output frame may be YUV420, YVU420, YUV422, YVU422, YUV400, YUV444, or RGB.
A non-transitory computer readable storage medium storing instructions which, when executed by a processor, cause the processor to perform the steps of the method of determining an overlaid displayed identification box as claimed in any preceding claim.
An electronic device comprising the non-transitory computer readable storage medium described above and a processor having access to that storage medium.
In the embodiment of the invention, when the position of the start point of the first identification frame in the output frame points between two pixels of the output frame in the motion direction of the object, a first preset number of pixels and a second preset number of pixels are extended to the two sides in the motion direction, taking the two pixels as references, to obtain the frame of the second identification frame in that direction, where the motion direction of the object is horizontal or vertical. As a result, the frame of the second identification frame stays attached to the object's contour whenever the output frame is displayed, and jitter of the identification frame during display is avoided.
Drawings
FIG. 1 is a schematic diagram of an identification frame;
FIG. 2 is a schematic diagram showing the difference between superimposed recognition frames when the abscissa of the display start points of the converted recognition frames differ by 0.1 pixel in the conventional scheme;
FIG. 3 is a flowchart of a method for determining an identification frame for overlay display according to an embodiment of the present invention;
FIG. 4 is a first application example of determining a second identification frame provided by the present invention;
FIG. 5 is a second application example of determining a second identification frame provided by the present invention;
FIG. 6 is a flowchart of a method for determining an identification frame for overlay display according to another embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The invention will be described in further detail with reference to the accompanying drawings and specific examples.
Fig. 3 is a flowchart of a method for determining an identification frame displayed in a superimposed manner according to an embodiment of the present invention, which specifically includes the following steps:
Step 301: acquire the position information of the first identification frame in the original frame. The first identification frame is used to mark a target object in the original frame.
The position information of the first identification frame in the original frame includes: the position information of the starting point of the first identification frame in the original frame, the whole frame width and the whole frame height.
Step 302: the position information of the first identification frame in the original frame is converted into the position information of the first identification frame in the output frame. Wherein the output frame is the same as the original frame in content but different in resolution.
Step 303: according to the position information of the first identification frame in the output frame, if the position of the starting point of the first identification frame in the output frame is determined to be between two pixels pointing to the output frame in the moving direction of the target object, a second identification frame which is displayed in a superimposed manner on the output frame is set as follows:
And respectively expanding a first preset number of pixels and a second preset number of pixels to two sides in the moving direction of the target object by taking the two pixels as references to obtain a frame of the second identification frame in the moving direction of the target object, wherein the moving direction of the target object is a horizontal direction or a vertical direction.
The motion direction of the target object is a horizontal direction or a vertical direction, wherein the horizontal direction is the width direction of the output frame, and the vertical direction is the height direction of the output frame. Then:
When the motion direction of the object is horizontal, the phrase "the position of the start point of the first identification frame in the output frame points between two pixels of the output frame in the motion direction of the object" means that the start point points between two pixels along the width direction of the output frame; that is, the two pixels are horizontally adjacent on the output frame and the start point of the first identification frame lies between them.
When the motion direction of the object is vertical, the phrase means that the start point points between two pixels along the height direction of the output frame; that is, the two pixels are vertically adjacent on the output frame and the start point of the first identification frame lies between them.
When the motion direction of the target object is horizontal, extending a first preset number of pixels and a second preset number of pixels to the two sides with the two pixels as references means extending, on the row containing the two pixels, the first preset number of pixels to the left and the second preset number of pixels to the right, which yields the first row of the left frame of the second identification frame.
When the motion direction of the target object is vertical, it means extending, on the column containing the two pixels, the first preset number of pixels upward and the second preset number of pixels downward, which yields the first column of the upper frame of the second identification frame.
Since the whole frame width and whole frame height of the second identification frame are known (they can be calculated from the whole frame width and height of the first identification frame in the original frame together with the resolutions of the original frame and the output frame), and the frame widths of the four frames of the second identification frame are also preset, once the first row or first column of the left or upper frame of the second identification frame is obtained (in essence, once the position of the second identification frame's start point in the output frame is known), all four frames of the second identification frame can be obtained. The second identification frame is then superimposed on the output frame.
In the above embodiment, when the start point of the first identification frame in the output frame points between two pixels of the output frame in the motion direction of the object, a first preset number of pixels and a second preset number of pixels are extended to the two sides in the motion direction (horizontal or vertical), taking the two pixels as references, to obtain the frame of the second identification frame in that direction. The frame of the second identification frame therefore stays attached to the object's contour when the output frame is displayed, avoiding jitter of the identification frame.
In an alternative embodiment, when the position of the starting point of the first identification frame in the output frame points to exactly one pixel of the output frame, in step 303, based on the two pixels, the method includes:
the pixel of the output frame to which the position of the start point of the first identification frame in the output frame is exactly pointed is taken as the first pixel, and,
When the motion direction of the target object is the horizontal direction, taking the pixel which is in the same row with the first pixel and is positioned on the right side of the first pixel on the output frame as a second pixel; when the motion direction of the target object is the vertical direction, taking the pixel which is in the same column with the first pixel and is positioned below the first pixel on the output frame as a second pixel;
The first pixel and the second pixel are used as references.
In an optional embodiment, in step 303, after expanding the first preset number of pixels and the second preset number of pixels to two sides in the moving direction of the target object, the method further includes:
If the moving direction of the target object is the horizontal direction, taking the pixel at the leftmost side after expansion as a starting point of the first row of pixels of the left frame of the second identification frame, and taking the pixel at the rightmost side after expansion as an end point of the first row of pixels of the left frame of the second identification frame;
If the moving direction of the object is a vertical direction, the pixel at the uppermost position after expansion is taken as a starting point of the first column of pixels of the upper frame of the second identification frame, and the pixel at the lowermost position after expansion is taken as an end point of the first column of pixels of the upper frame of the second identification frame.
In an alternative embodiment, the first preset number is 1+2×m, the second preset number is 1+2×n, m and n are non-negative integers, and m and n are not necessarily equal.
Fig. 4 shows an application example of determining the second identification frame. As shown in fig. 4, the top-left vertex of the output frame is the origin (0, 0), horizontal-right is the positive horizontal direction, vertical-down is the positive vertical direction, and the coordinate unit is the pixel. Let the motion direction of the object be horizontal, and suppose step 302 computes that the start point of the first identification frame lies between the two pixels with abscissas 1001 and 1002 on the output frame. Taking those two pixels as references and extending 1 pixel to the left and 1 pixel to the right, the four pixels (abscissas 1000, 1001, 1002, 1003) form the 1st row of pixels of the left frame of the second identification frame, and the widths of the four frames of the second identification frame are all 4 pixels.
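The expansion in this example can be sketched with a small helper (the name is hypothetical; `m` and `n` are the non-negative integers from the preset-number formulas 1+2×m and 1+2×n):

```python
def border_first_row(p1, p2, m=0, n=0):
    """Given two adjacent abscissas p1 < p2 that straddle the start
    point, extend (1 + 2*m) pixels left of p1 and (1 + 2*n) pixels
    right of p2; returns the inclusive abscissa range of the 1st row
    of the left frame (total width (1+2*m) + (1+2*n) + 2 pixels)."""
    return (p1 - (1 + 2 * m), p2 + (1 + 2 * n))
```

With m = n = 0 and the straddled pixels at 1001 and 1002, this gives the range 1000 through 1003, the 4-pixel-wide frame of fig. 4.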
In an optional embodiment, in step 303, after it is determined that the position of the start point of the first identification frame in the output frame points between two pixels of the output frame in the motion direction of the object, the method further includes: determining that the first of the two pixels is an even-numbered pixel in its row of the output frame, where the first pixel is whichever of the two pixels lies on the left or above;
in step 303, after obtaining the frame of the second recognition frame in the moving direction of the target object, the method further includes:
for each non-boundary pixel on the second identification frame, setting the Y, U, V value of that pixel to be the same as the Y, U, V value of the predefined second identification frame;
for each boundary pixel on the second identification frame, the U, V value of the pixel is set to be the same as the U, V value of the predefined second identification frame, and the Y value of the pixel is set to be: and the Y value of the second predefined identification frame is fused with the Y value of the pixel of the output frame corresponding to the pixel.
Taking Fig. 4 as an example: let the upper left vertex of the output frame be the origin (0, 0), horizontal right be the positive horizontal direction, vertical down be the positive vertical direction, the coordinate unit be one pixel, and the motion direction of the target object be horizontal. If the abscissa of the start point of the first identification frame in the output frame calculated in step 302 is 1001.41, the start point lies between the two pixels with abscissas 1001 and 1002 on the output frame. The first of these two pixels, the one with abscissa 1001, is the 1002nd (i.e., even-numbered) pixel of its row. Taking the two pixels with abscissas 1001 and 1002 as references and extending one pixel to the left and one to the right, the leftmost of the four pixels has abscissa 1000 and the rightmost has abscissa 1003; that is, the abscissas of the pixels of the left frame of the second identification frame are, from left to right: 1000, 1001, 1002, 1003.
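The reference-pixel selection and expansion just described can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name and the fixed extension of one pixel per side are assumptions of this example.

```python
import math

def left_frame_columns(start_x: float, extend_left: int = 1, extend_right: int = 1) -> list:
    """Return the abscissas of one row of the left frame of the second
    identification frame, given the sub-pixel abscissa of the first
    identification frame's start point (illustrative sketch)."""
    first = math.floor(start_x)   # left reference pixel
    second = first + 1            # right reference pixel
    # Extend the preset number of pixels to each side of the reference pair.
    return list(range(first - extend_left, second + extend_right + 1))

# Start point at abscissa 1001.41 -> columns 1000, 1001, 1002, 1003 (width 4).
```

Note that any start abscissa strictly between 1001 and 1002 yields the same four columns, which is the source of the stability discussed below.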
Consider that when the output frame format is YUV420, YVU420, YUV422, or YVU422, four Y components (for 420) or two Y components (for 422) must share one U component and one V component, so two horizontally adjacent pixels whose abscissas are, in order, even and odd must be given the same U, V values; that is, the pixels at abscissas 1000 and 1001 share U, V values, as do the pixels at abscissas 1002 and 1003. Since the pixels at abscissas 1000 through 1003 all lie on the second identification frame, their U, V values are all set to the predefined U, V values of the second identification frame.
Meanwhile, to keep the displacement between the second identification frame and the target object as consistent as possible across successive frames, the Y value of each boundary pixel of the second identification frame is fused with the Y value of the output-frame pixel at that boundary pixel's position.
As the above example shows, when the abscissa of the start point of the first identification frame in the output frame calculated in step 302 is 1001.41 in one frame and 1001.51 in the next (a difference of only 0.1 pixel), the displacement between the second identification frame and the target object remains essentially consistent across the two frames, and the 2-pixel jump of the conventional scheme does not occur.
In an optional embodiment, in step 303, after determining that the position of the start point of the first identification frame in the output frame points between two pixels of the output frame in the motion direction of the target object, the method further includes:
determining that the first pixel of the two pixels is an odd-numbered pixel in its row of the output frame, where the first pixel is the left pixel of the two (horizontal motion) or the upper pixel of the two (vertical motion);
in step 303, after obtaining the frame of the second identification frame in the motion direction of the target object, the method further includes:
for each non-boundary pixel on the second identification frame, setting the Y, U, V values of that pixel to the predefined Y, U, V values of the second identification frame;
for each boundary pixel on the second identification frame, setting the U, V values of that pixel to the U, V values of the output-frame pixel corresponding to that pixel, and setting the Y value of that pixel to a value obtained by fusing the predefined Y value of the second identification frame with the Y value of the output-frame pixel corresponding to that pixel.
Fig. 5 is a schematic diagram of application example two of determining the second identification frame according to the present invention. As shown in Fig. 5, let the upper left vertex of the output frame be the origin (0, 0), horizontal right be the positive horizontal direction, vertical down be the positive vertical direction, the coordinate unit be one pixel, and the motion direction of the target object be horizontal. In step 302 the abscissa of the start point of the first identification frame in the output frame is calculated to be 1002.41, so the start point lies between the two pixels with abscissas 1002 and 1003 on the output frame. The first of these two pixels, the one with abscissa 1002, is the 1003rd (i.e., odd-numbered) pixel of its row. Taking the two pixels with abscissas 1002 and 1003 as references and extending one pixel to the left and one to the right, the leftmost of the four pixels has abscissa 1001 and the rightmost has abscissa 1004; that is, the abscissas of the pixels of the left frame of the second identification frame are, from left to right: 1001, 1002, 1003, 1004.
Consider that when the output frame format is YUV420, YVU420, YUV422, or YVU422, four Y components or two Y components share one U component and one V component, so two horizontally adjacent pixels whose abscissas are, in order, even and odd must be given the same U, V values. The pixels at abscissas 1002 and 1003 therefore share U, V values, and since both lie on the second identification frame, their U, V values are set to the predefined U, V values of the second identification frame; the pixel at abscissa 1001, however, must share the U, V values of the pixel at abscissa 1000 in the same row of the output frame, and the pixel at abscissa 1004 must share the U, V values of the pixel at abscissa 1005 in the same row of the output frame.
Meanwhile, to make the second identification frame transition smoothly into the output frame, the Y value of each boundary pixel of the second identification frame is fused with the Y value of the output-frame pixel at that boundary pixel's position.
As the above example shows, when the abscissa of the start point of the first identification frame in the output frame calculated in step 302 is 1002.41 in one frame and 1002.51 in the next (a difference of only 0.1 pixel), the resulting second identification frame is identical in both frames, and the 2-pixel jump of the prior art does not occur.
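The chroma-sharing rule of these two examples can be sketched as follows (an illustrative helper, not from the patent): abscissas 2k and 2k+1 form a chroma pair, and a pair keeps the predefined frame chroma only when both of its pixels lie on the second identification frame.

```python
def uv_source(columns):
    """Decide, for each abscissa of a frame row, whether its U, V values
    come from the predefined second identification frame or from the
    output frame, under YUV420/422 chroma pairing (sketch)."""
    on_frame = set(columns)
    sources = {}
    for x in columns:
        partner = x + 1 if x % 2 == 0 else x - 1  # abscissas 2k and 2k+1 share U, V
        sources[x] = "predefined" if partner in on_frame else "output_frame"
    return sources
```

With columns 1001 through 1004 (application example two) the edge pixels 1001 and 1004 fall back to the output frame's chroma, while with columns 1000 through 1003 (application example one) all four keep the predefined chroma, matching the two examples above.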
In an alternative embodiment, setting the Y value of the pixel to a value obtained by fusing the predefined Y value of the second identification frame with the Y value of the output-frame pixel corresponding to that pixel includes:
if the pixel is located on the left or upper boundary of a frame of the second identification frame, the Y value of the pixel equals α·[1-(A-A0)]·Y1 + β·(A-A0)·Y0, where "left boundary" applies to the left and right frames and "upper boundary" applies to the upper and lower frames;
if the pixel is located on the right or lower boundary of a frame of the second identification frame, the Y value of the pixel equals α·(A-A0)·Y1 + β·[1-(A-A0)]·Y0, where "right boundary" applies to the left and right frames and "lower boundary" applies to the upper and lower frames;
where Y1 is the predefined Y value of the second identification frame and Y0 is the Y value of the output-frame pixel corresponding to the pixel; A is the abscissa of the start point of the first identification frame in the output frame after converting the position information of the first identification frame in the original frame into position information in the output frame, and A0 is the integer part of A. For example, if the abscissa of the start point of the first identification frame in the output frame calculated in step 302 is 1002.41, then A = 1002.41 and A0 = 1002. α and β are preset constants with 0 < α ≤ 1 and 0 < β ≤ 1; typically α = 1 and β = 1.
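The two fusion formulas can be written as one small function; this is a sketch under the patent's definitions, with the function and parameter names our own.

```python
def blended_y(a: float, y1: float, y0: float, side: str,
              alpha: float = 1.0, beta: float = 1.0) -> float:
    """Fuse the predefined frame luma Y1 with the output-frame luma Y0,
    weighted by the fractional part of the start abscissa A (sketch).
    side is 'left'/'upper' or 'right'/'lower'."""
    frac = a - int(a)  # A - A0, the sub-pixel offset of the start point
    if side in ("left", "upper"):
        return alpha * (1.0 - frac) * y1 + beta * frac * y0
    # right or lower boundary: weights are swapped
    return alpha * frac * y1 + beta * (1.0 - frac) * y0
```

With α = β = 1, the left-boundary and right-boundary blends of the same Y1, Y0 pair sum to Y1 + Y0, so the total luma contributed by the two edges stays constant as the sub-pixel offset varies, which is what suppresses visible jitter.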
In an alternative embodiment, the format of the output frame in embodiments of the present invention may be YUV420, YVU420, YUV422, YVU422, YUV400, YUV444, or RGB.
Fig. 6 is a flowchart of a method for determining an identification frame displayed in a superimposed manner according to another embodiment of the present invention, which specifically includes the following steps:
Step 601: collect an original video stream and perform target-object recognition on each original frame in it; normalize the position information of each identification frame marking a target object, put the resulting normalized identification-frame position information into a first buffer queue, and put the original frames into a second buffer queue.
Step 602: when the number of frames in the second buffer queue reaches a preset threshold, look up, by the frame identifier of the frame to be displayed (such as a frame number or a timestamp), the normalized position information of the identification frames of that frame in the first buffer queue, including the normalized start-point coordinates, normalized whole-frame width, and normalized whole-frame height of each identification frame; then perform the following steps 603-607 for each identification frame.
If the normalized position information of the identification frames of the frame to be displayed cannot be found in the first buffer queue by its frame identifier, predict the normalized position information of each identification frame of the frame to be displayed from the normalized position information of each identification frame of the previous frame plus the inter-frame offset.
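This fallback can be sketched as follows; the (x, y, w, h) tuple layout and the per-box offsets are assumptions of this illustration, not data structures named by the patent.

```python
def predict_positions(last_positions, offsets):
    """When a frame's identification-frame info is missing from the first
    buffer queue, extrapolate each box from the previous frame's
    normalized position plus its inter-frame offset (sketch)."""
    return [(x + dx, y + dy, w, h)
            for (x, y, w, h), (dx, dy) in zip(last_positions, offsets)]
```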
Step 603: take the current identification frame as the first identification frame, and convert the normalized position information of the first identification frame into position information of the first identification frame in the output frame:
multiplying the abscissa and the ordinate of the normalized starting point of the first identification frame by the width and the height of the output frame respectively to obtain the abscissa and the ordinate of the starting point of the first identification frame in the output frame;
multiplying the normalized whole frame width of the first identification frame by the width of the output frame to obtain the whole frame width of the first identification frame on the output frame;
and multiplying the normalized whole frame height of the first identification frame by the height of the output frame to obtain the whole frame height of the first identification frame on the output frame.
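The conversion of step 603 is a plain rescaling of the normalized box by the output resolution; a minimal sketch (function name assumed):

```python
def to_output_coords(norm_box, out_w: int, out_h: int):
    """Map a normalized identification frame (x, y, w, h, all in [0, 1])
    to output-frame coordinates by multiplying with the output width
    and height (sketch of step 603)."""
    x, y, w, h = norm_box
    return (x * out_w, y * out_h, w * out_w, h * out_h)
```

The results are generally fractional, which is exactly why the start point of the first identification frame can fall between two pixels of the output frame.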
Here, the movement direction of the target is set to be the horizontal direction.
Before a frame is output for display, its resolution is converted to the preset resolution of the output frame.
Step 604: determine that the abscissa of the start point of the first identification frame lies between two pixels of the output frame, and, taking those two pixels as references, extend one pixel to the left and one to the right to obtain the first row of pixels of the left frame of the second identification frame.
Here, the frame width of the second recognition frame is set to 4 pixels.
Step 605: judge whether the abscissa in the output frame of the start point of the second identification frame (i.e., the left pixel of the two reference pixels) is odd or even; if odd, execute step 606; if even, execute step 607.
Here, the upper left vertex of the output frame is the origin (0, 0), horizontal right is the positive horizontal direction, vertical down is the positive vertical direction, and the coordinate unit is one pixel.
Step 606: for each non-boundary pixel on the second identification frame, set the Y, U, V values of that pixel to the predefined Y, U, V values of the second identification frame; for each boundary pixel on the second identification frame, set the U, V values of that pixel to the predefined U, V values of the second identification frame, and set the Y value of that pixel to a value obtained by fusing the predefined Y value of the second identification frame with the Y value of the output-frame pixel corresponding to that pixel. The process then ends.
Here, boundary pixels are the pixels of the second identification frame that, after the second identification frame is superimposed on the output frame, are adjacent to original pixels of the output frame, such as the leftmost and rightmost pixels of its left and right frames and the uppermost and lowermost pixels of its upper and lower frames; the remaining pixels of the second identification frame are non-boundary pixels.
Step 607: for each non-boundary pixel on the second identification frame, set the Y, U, V values of that pixel to the predefined Y, U, V values of the second identification frame; for each boundary pixel on the second identification frame, set the U, V values of that pixel to the U, V values of the output-frame pixel corresponding to that pixel, and set the Y value of that pixel to a value obtained by fusing the predefined Y value of the second identification frame with the Y value of the output-frame pixel corresponding to that pixel.
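The parity dispatch of steps 605-607 reduces to one test. Sketch only; recall that with the origin at (0, 0), the pixel at abscissa x is the (x+1)-th pixel of its row, so an odd abscissa means an even-numbered pixel.

```python
def boundary_uv_rule(left_reference_x: int) -> str:
    """Step 605 sketch: choose where the boundary pixels of the second
    identification frame take their U, V values from, based on the
    parity of the left reference pixel's abscissa."""
    # Odd abscissa (even-numbered pixel)  -> step 606: predefined frame chroma.
    # Even abscissa (odd-numbered pixel)  -> step 607: output frame chroma.
    return "predefined" if left_reference_x % 2 == 1 else "output_frame"
```

Applied to the two application examples: abscissa 1001 (Fig. 4) selects the predefined chroma for boundary pixels, abscissa 1002 (Fig. 5) selects the output frame's chroma.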
Embodiments of the present invention also provide a non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps of the method as described in steps 301-303, or steps 601-607.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, where the apparatus includes a non-transitory computer readable storage medium 71 as described above, and a processor 72 that can access the non-transitory computer readable storage medium 71.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (11)

1. A method of determining an identification frame for a superimposed display, the method comprising:
acquiring position information of a first identification frame in an original frame, wherein the first identification frame is used for marking a target object in the original frame;
Converting the position information of the first identification frame in the original frame into the position information of the first identification frame in the output frame; the output frame is the same as the original frame in content but different in resolution;
According to the position information of the first identification frame in the output frame, if the position of the starting point of the first identification frame in the output frame is determined to be between two pixels pointing to the output frame in the moving direction of the target object, a second identification frame which is displayed in a superimposed manner on the output frame is set in the following manner:
Respectively expanding a first preset number of pixels and a second preset number of pixels to two sides in the moving direction of the target object by taking the two pixels as references to obtain a frame of the second identification frame in the moving direction of the target object, wherein the moving direction of the target object is a horizontal direction or a vertical direction;
The step of expanding a first preset number of pixels and a second preset number of pixels to two sides in the movement direction of the object with the two pixels as references to obtain a frame of the second identification frame in the movement direction of the object, includes:
When the moving direction of the target object is the horizontal direction, taking the two pixels as a reference, expanding a preset first number of pixels leftwards and expanding a preset second number of pixels rightwards on the row where the two pixels are located, so as to obtain a first row of a left frame of the second identification frame;
When the moving direction of the target object is in the vertical direction, the two pixels are used as the reference, and a preset first number of pixels are expanded upwards and a preset second number of pixels are expanded downwards on the column where the two pixels are located, so that a first column of the upper frame of the second identification frame is obtained.
2. The method of claim 1, wherein when the position of the start point of the first identification frame in the output frame points to exactly one pixel of the output frame, the referencing the two pixels comprises:
the pixel of the output frame to which the position of the start point of the first identification frame in the output frame is exactly pointed is taken as a first pixel, and,
When the motion direction of the target object is the horizontal direction, taking the pixel which is in the same row with the first pixel and is positioned on the right side of the first pixel on the output frame as a second pixel; when the motion direction of the target object is the vertical direction, taking the pixel which is in the same column with the first pixel and is positioned below the first pixel on the output frame as a second pixel;
The first pixel and the second pixel are used as references.
3. The method according to claim 1 or 2, wherein after expanding the first preset number of pixels and the second preset number of pixels to both sides in the movement direction of the object, respectively, further comprises:
If the moving direction of the target object is the horizontal direction, taking the pixel at the leftmost side after expansion as a starting point of the first row of pixels of the left frame of the second identification frame, and taking the pixel at the rightmost side after expansion as an end point of the first row of pixels of the left frame of the second identification frame;
if the moving direction of the object is a vertical direction, the pixel at the uppermost position after expansion is taken as a starting point of the first column of pixels of the upper frame of the second identification frame, and the pixel at the lowermost position after expansion is taken as an end point of the first column of pixels of the upper frame of the second identification frame.
4. The method according to claim 3, wherein the first preset number is 1+2×m and the second preset number is 1+2×n, where m and n are non-negative integers and are not necessarily equal.
5. The method of claim 4, wherein the determining the location of the start point of the first identification frame in the output frame after pointing between two pixels of the output frame in the direction of motion of the object further comprises:
determining that a first pixel of the two pixels is an even-numbered pixel in its row of the output frame, wherein the first pixel is the left one or the upper one of the two pixels;
and after the frame of the second identification frame in the movement direction of the target object is obtained, the method further comprises:
for each non-boundary pixel on the second identification frame, setting the Y, U, V value of that pixel to be the same as the Y, U, V value of the predefined second identification frame;
for each boundary pixel on the second identification frame, setting the U, V values of that pixel to be the same as the predefined U, V values of the second identification frame, and setting the Y value of that pixel to a value obtained by fusing the predefined Y value of the second identification frame with the Y value of the output-frame pixel corresponding to that pixel.
6. The method of claim 5, wherein setting the Y value of the pixel to the value obtained by fusing the predefined Y value of the second identification frame with the Y value corresponding to the pixel comprises:
if the pixel is located on the left or upper boundary of a frame of the second identification frame, the Y value of the pixel equals α·[1-(A-A0)]·Y1 + β·(A-A0)·Y0;
if the pixel is located on the right or lower boundary of a frame of the second identification frame, the Y value of the pixel equals α·(A-A0)·Y1 + β·[1-(A-A0)]·Y0;
wherein Y1 is the predefined Y value of the second identification frame and Y0 is the Y value of the output-frame pixel corresponding to the pixel; A is the abscissa of the start point of the first identification frame in the output frame after converting the position information of the first identification frame in the original frame into position information of the first identification frame in the output frame; A0 is the integer part of A; and α and β are preset constants with 0 < α ≤ 1 and 0 < β ≤ 1.
7. The method of claim 4, wherein the determining the location of the start point of the first identification frame in the output frame after pointing between two pixels of the output frame in the direction of motion of the object further comprises:
determining that a first pixel of the two pixels is an odd-numbered pixel in its row of the output frame, wherein the first pixel is the left one or the upper one of the two pixels;
and after the frame of the second identification frame in the movement direction of the target object is obtained, the method further comprises:
for each non-boundary pixel on the second identification frame, setting the Y, U, V value of that pixel to be the same as the Y, U, V value of the predefined second identification frame;
for each boundary pixel on the second identification frame, setting the U, V values of that pixel to be the same as the U, V values of the output-frame pixel corresponding to that pixel, and setting the Y value of that pixel to a value obtained by fusing the predefined Y value of the second identification frame with the Y value of the output-frame pixel corresponding to that pixel.
8. The method of claim 7, wherein setting the Y value of the pixel to the value obtained by fusing the predefined Y value of the second identification frame with the Y value corresponding to the pixel comprises:
if the pixel is located on the left or upper boundary of a frame of the second identification frame, the Y value of the pixel equals α·[1-(A-A0)]·Y1 + β·(A-A0)·Y0;
if the pixel is located on the right or lower boundary of a frame of the second identification frame, the Y value of the pixel equals α·(A-A0)·Y1 + β·[1-(A-A0)]·Y0;
wherein Y1 is the predefined Y value of the second identification frame and Y0 is the Y value of the output-frame pixel corresponding to the pixel; A is the abscissa of the start point of the first identification frame in the output frame after converting the position information of the first identification frame in the original frame into position information of the first identification frame in the output frame; A0 is the integer part of A; and α and β are preset constants with 0 < α ≤ 1 and 0 < β ≤ 1.
9. The method according to claim 1 or 2, wherein the format of the output frame is YUV420, YVU420, YUV422, YVU422, YUV400, YUV444, or RGB.
10. A non-transitory computer readable storage medium storing instructions which, when executed by a processor, cause the processor to perform the steps of the method of determining an overlaid displayed identification box of any of claims 1 to 9.
11. An electronic device comprising the non-transitory computer-readable storage medium of claim 10, and the processor having access to the non-transitory computer-readable storage medium.
CN202110046727.3A 2021-01-14 2021-01-14 Method for determining identification frame of superimposed display, readable storage medium and electronic device Active CN112819706B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110046727.3A CN112819706B (en) 2021-01-14 2021-01-14 Method for determining identification frame of superimposed display, readable storage medium and electronic device


Publications (2)

Publication Number Publication Date
CN112819706A CN112819706A (en) 2021-05-18
CN112819706B true CN112819706B (en) 2024-05-14

Family

ID=75869425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110046727.3A Active CN112819706B (en) 2021-01-14 2021-01-14 Method for determining identification frame of superimposed display, readable storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN112819706B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104715249A (en) * 2013-12-16 2015-06-17 株式会社理光 Object tracking method and device
CN105830430A (en) * 2013-12-21 2016-08-03 高通股份有限公司 System and method to stabilize display of an object tracking box
CN108875683A (en) * 2018-06-30 2018-11-23 北京宙心科技有限公司 Robot vision tracking method and system
CN108983306A (en) * 2018-06-06 2018-12-11 浙江大华技术股份有限公司 A kind of method and rays safety detection apparatus of article frame flow display
CN110675425A (en) * 2019-08-22 2020-01-10 腾讯科技(深圳)有限公司 Video frame identification method, device, equipment and medium
CN111914653A (en) * 2020-07-02 2020-11-10 泰康保险集团股份有限公司 Personnel marking method and device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on automatic search and filtering methods for the anti-shake region of a video vehicle monitoring system; Fei Guangyan; CNKI Master's Electronic Journals (Issue 11); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant