CN110677585A - Target detection frame output method and device, terminal and storage medium - Google Patents

Target detection frame output method and device, terminal and storage medium

Info

Publication number
CN110677585A
CN110677585A (application CN201910945467.6A)
Authority
CN
China
Prior art keywords
target detection
frame
detection frame
image frame
current image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910945467.6A
Other languages
Chinese (zh)
Inventor
贾玉虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910945467.6A priority Critical patent/CN110677585A/en
Publication of CN110677585A publication Critical patent/CN110677585A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/62Control of parameters via user interfaces
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/67Focus control based on electronic image sensor signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)

Abstract

The embodiments of the invention relate to the technical field of photographing and provide an output method, an output device, a terminal and a storage medium for a target detection frame. The output method of the target detection frame comprises the following steps: acquiring a current image frame; predicting a first target detection frame of the current image frame according to the current image frame; judging the coincidence degree of the areas occupied by the first target detection frame and a reference target detection frame in the image frame, the reference target detection frame being a predicted target detection frame for a reference image frame acquired before the current image frame; and displaying a target detection frame on the current image frame according to the judgment result.

Description

Target detection frame output method and device, terminal and storage medium
Technical Field
The invention belongs to the technical field of photographing, and particularly relates to an output method and device of a target detection frame, a terminal and a storage medium.
Background
When a mobile phone is used for photographing with preview, it can use a target detection frame to frame a target object (a person, an animal, an automobile and the like) in the preview image, to facilitate the user's framing and focusing. However, when the mobile phone shakes slightly, the target detection frame in the preview image jumps along with the shake, which interferes with the photographing operation.
Disclosure of Invention
In view of this, embodiments of the present invention provide an output method, an output device, a terminal, and a storage medium for a target detection frame, so as to at least solve the problem in the related art that when a terminal performs a photo preview, the target detection frame may jump along with the shake of the terminal.
The technical scheme of the embodiment of the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides an output method for a target detection box, where the method includes:
acquiring a current image frame;
predicting a first target detection frame of the current image frame according to the current image frame;
judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame; the reference target detection frame is a predicted target detection frame for a reference image frame, and the reference image frame is acquired before the current image frame;
and displaying a target detection frame on the current image frame according to the judgment result.
In the foregoing solution, the judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame includes:
calculating the intersection-over-union (IoU) ratio of the first target detection frame and the reference target detection frame in the image frame, and judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame according to the IoU calculation result.
In the foregoing solution, the displaying a target detection frame on the current image frame according to the determination result includes:
when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is greater than or equal to a set value, displaying the reference target detection frame on the current image frame;
and when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is smaller than the set value, displaying the first target detection frame on the current image frame.
In the above-described aspect, when the reference target detection frame is displayed on the current image frame, the output method includes:
and displaying the reference target detection frame on the current image frame according to the position parameter of the reference target detection frame.
In the foregoing solution, after the first target detection frame is displayed on the current image frame, the output method further includes:
and the position parameter of the first target detection frame of the current image frame is used as the position parameter of the reference target detection frame of the stored reference image, the update of the position parameter of the reference target detection frame of the stored reference image is completed, and the updated position parameter of the reference target detection frame of the reference image frame is stored in the register.
In the foregoing solution, the output method further includes:
detecting the running state of a camera of the terminal;
and when the running state of the camera is closed, emptying the position parameters of the reference target detection frame of the reference image frame stored in the register.
In the above scheme, the reference target detection frame is a target detection frame displayed on a reference image frame by the camera; the reference image frame is a previous image frame of the current image frame shot by the camera.
In the foregoing scheme, the acquiring the current image frame specifically includes:
collecting a preview picture shot by a camera of the terminal as the current image frame; or,
acquiring a frame in the video stream shot by the camera of the terminal as the current image frame.
In a second aspect, an embodiment of the present invention provides an output apparatus for an object detection frame, where the apparatus includes:
the acquisition module is used for acquiring a current image frame;
the prediction module is used for predicting a first target detection frame of the current image frame according to the current image frame;
the judging module is used for judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame; the reference target detection frame is a predicted target detection frame for a reference image frame, and the reference image frame is acquired before the current image frame;
and the display module is used for displaying the target detection frame on the current image frame according to the judgment result.
In a third aspect, an embodiment of the present invention provides a terminal, including a processor and a memory connected to each other, where the memory is used to store a computer program including program instructions, and the processor is configured to call the program instructions and execute the steps of the output method of the target detection frame provided in the first aspect of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the output method of the target detection frame provided in the first aspect of the embodiments of the present invention.
According to the solution provided by the embodiments of the invention, the current image frame is acquired and a first target detection frame of the current image frame is predicted; the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is judged; and a target detection frame is displayed on the current image frame according to the judgment result. The embodiments of the invention can prevent the target detection frame from jumping when the terminal shakes slightly during photographing preview, while ensuring the accuracy of target detection. This improves the smoothness and stability of the target detection frame's movement in the preview image, brings a good visual effect to the user, and eliminates possible interference with the photographing operation.
Drawings
Fig. 1 is a schematic diagram of a photo preview effect provided by the related art;
fig. 2 is a schematic diagram of an image frame provided by the related art being processed;
fig. 3 is a schematic flow chart illustrating an implementation of an output method of a target detection box according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the calculation of the intersection-over-union ratio according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart illustrating an implementation of another method for outputting a target detection box according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a preview effect of photographing according to an embodiment of the present invention;
FIG. 7 is a schematic flow chart illustrating an implementation of another method for outputting a target detection box according to an embodiment of the present invention;
fig. 8 is a block diagram of an output device of an object detection box according to an embodiment of the present invention;
fig. 9 is a schematic diagram of a hardware structure of a terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The technical means described in the embodiments of the present invention may be arbitrarily combined without conflict.
In addition, in the embodiments of the present invention, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or a sequential order.
Target detection refers to the process of searching a captured picture for a target object; if the picture contains the target object, the position and size of the target object are returned, and the returned result can be displayed in the image in the form of a target detection frame. Target detection is usually applied in a terminal with a camera. For example, when a user uses a mobile phone for photographing preview, the mobile phone may use a target detection frame to frame a target object (the target object may be a person, an animal, an automobile, etc.) in the preview image, to facilitate the user's framing and focusing. Referring to fig. 1, fig. 1 is a schematic diagram of a photographing preview effect provided by the related art. As shown in fig. 1, when photographing, the mobile phone displays the image frames captured by the camera on the display screen in real time to form a preview picture, and a target detection frame is displayed on each preview image frame containing the target object. When the mobile phone shakes slightly during photographing, the target detection frame in the preview image jumps along with it, which affects the user's visual experience, easily gives the user the false impression that focusing has failed, and interferes with the photographing operation.
To keep the target detection frame from jumping, the related art generally reduces the acquisition frame rate. Referring to fig. 2, for example, the terminal camera acquires an image once every 10 image frames, predicts the target detection frame on the acquired image, and displays that target detection frame on the subsequent 10 image frames. Although reducing the acquisition frame rate reduces how often the target detection frame jumps, this method has poor real-time performance: in applications with high real-time requirements on target detection, such as photographing preview of a moving object, the target detection frame output by this method cannot keep up with the rapid movement of the object in the picture.
The related art also uses Kalman filtering to handle the jumping of the target detection frame. However, this method is likewise unsuitable for applications with high real-time requirements on target detection: the output of the target detection frame is delayed to a certain degree, and the Kalman filtering method fails when the target is occluded or the movement amplitude of the mobile phone lens suddenly increases.
To address the defect that the related art cannot be applied to real-time target detection, the embodiments of the present invention provide an output method of a target detection frame that prevents the target detection frame from jumping in real-time target detection scenarios. The technical means of the present invention are explained below through specific embodiments.
Referring to fig. 3, fig. 3 is a schematic flow chart of an implementation of an output method of a target detection frame according to an embodiment of the present invention. The method is executed by a terminal such as a mobile phone or a tablet. Referring to fig. 3, the output method of the target detection frame includes:
and S101, acquiring a current image frame.
Here, the terminal may collect a preview picture taken by the camera as the current image frame, or collect one frame of the video stream shot by the camera as the current image frame.
It should be understood that the current image frame is not limited to one generated by the terminal through the camera; it may also be an image frame of a video pre-stored locally, or an image frame of a video that the terminal acquires from a network server through wired or wireless communication.
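As an illustration of S101, the following minimal sketch grabs one frame with OpenCV; the use of cv2.VideoCapture and the source argument (a camera index or a video file/stream path) are assumptions for the example, not requirements of the invention.

```python
import cv2  # OpenCV, assumed available on the terminal side for illustration

def acquire_current_frame(source=0):
    """S101 sketch: read one image frame from a camera (index) or video (path/URL)."""
    cap = cv2.VideoCapture(source)
    ok, frame = cap.read()  # frame is a BGR numpy array when ok is True
    cap.release()
    return frame if ok else None
```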
S102, predicting a first target detection frame of the current image frame according to the current image frame.
The terminal detects a target in the current image frame and, according to the detected target, determines the position parameter of the first target detection frame in the current image frame, thereby completing the prediction of the first target detection frame. It should be understood that step S102 only predicts the position parameter of the first target detection frame in the current image frame and does not display the first target detection frame on the current image frame.
In practical application, a trained target detection model may be stored in the terminal in advance. The target detection model can detect the region where the target object is located in an image, and the detection result can be framed with a target detection frame, which is a rectangular frame. The current image frame is input into the target detection model to obtain the position parameters of the target detection frame of the current image frame. The target detection model may be obtained by training a convolutional neural network to convergence on a sample set (containing target images and labels indicating the positions of target object regions) using machine learning/deep learning techniques; in practical application, the target detection network may be constructed based on networks such as YOLO, SSD, and Fast R-CNN.
It should be understood that the target object in the current image frame may be a human face, an automobile, a building, a plant, an animal, etc. For example, if the terminal is a user's mobile phone, the typical target object is a human face; if the terminal is a driving recorder, the target object is an automobile. The position parameters of the target detection frame may include the coordinates of the four vertices of the target detection frame; or the coordinates of any pair of diagonal vertices of the target detection frame; or the coordinates of any one vertex of the target detection frame together with its length and width.
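As a sketch of S102 under stated assumptions: `model` below stands for any pre-stored trained detector (YOLO/SSD-style), and its interface — returning (x1, y1, x2, y2, score) tuples in the diagonal-vertex parameterization — is hypothetical, chosen only to illustrate how the position parameter of the first target detection frame could be obtained without displaying it.

```python
def predict_first_box(frame, model):
    """S102 sketch: predict the first target detection frame's position parameter.

    Returns a (x1, y1, x2, y2) diagonal-vertex tuple, or None if the frame
    contains no target object. Nothing is drawn on the frame here.
    """
    detections = model(frame)  # assumed: list of (x1, y1, x2, y2, score)
    if not detections:
        return None
    best = max(detections, key=lambda d: d[4])  # keep the most confident target
    return best[:4]
```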
S103, judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame; the reference target detection frame is a predicted target detection frame for a reference image frame, and the reference image frame is acquired before the current image frame.
In the embodiment of the present invention, the reference target detection frame is a predicted target detection frame for a reference image frame, and the reference image frame is acquired before the current image frame. Here, if the terminal predicts a target detection frame for every image frame in the preview picture or video stream, the reference image frame is the image frame immediately preceding the current image frame captured by the camera. If the terminal instead predicts a target detection frame only once every 10 output image frames, the reference image frame is the image frame 10 frames before the current image frame captured by the camera. The reference target detection frame being a predicted target detection frame for the reference image frame means a target detection frame that has already been displayed on the reference image frame.
It should be understood that the current image frame and the reference image frame should belong to the same video stream or preview picture stream displayed by the terminal, rather than being two unassociated image frames.
Generally, the coincidence degree takes values in the interval [0, 1]. The smaller the coincidence degree, the larger the displacement of the target object between the current image frame and the reference image frame; a coincidence degree of 0 indicates that the first target detection frame and the reference target detection frame do not overlap at all in the image frame. The higher the coincidence degree, the smaller the displacement of the target object between the current image frame and the reference image frame; a coincidence degree of 1 indicates that the target object has not moved between the current image frame and the reference image frame, that is, the position parameters of the first target detection frame and the reference target detection frame are the same.
Further, as an embodiment of the present invention, the judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame includes:
calculating the intersection-over-union (IoU) ratio of the first target detection frame and the reference target detection frame in the image frame, and judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame according to the IoU calculation result.
In the embodiment of the present invention, the IoU of the first target detection frame and the reference target detection frame in the image frame may be determined from the position parameter of the first target detection frame and the position parameter of the reference target detection frame, and the IoU calculation result may be used as the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame. Referring to fig. 4, fig. 4 is a schematic diagram of calculating the IoU according to an embodiment of the present invention. As shown in fig. 4, the IoU is the intersection-over-union ratio of the first target detection frame and the reference target detection frame in the image frame: the intersection is the area of the region where the two target detection frames overlap, and the union is the area of the region they jointly cover. The IoU of the two target detection frames is the ratio of the area of their intersection region to the area of their union region, where the area of the union region equals the sum of the areas of the two target detection frames minus the area of their intersection region. Generally, the IoU takes values in the interval [0, 1]. An IoU of 0 indicates that the first target detection frame and the reference target detection frame do not intersect at all; the closer the IoU is to 1, the more the two target detection frames overlap; an IoU of 1 indicates that the first target detection frame and the reference target detection frame coincide completely in the image frame, that is, their position parameters are the same.
For example, when the target detection frames are rectangles, the length and width of the rectangle where the first target detection frame intersects the reference target detection frame can be determined from the four vertex coordinates of the first target detection frame and the four vertex coordinates of the reference target detection frame. From these, the area of the intersecting rectangle (referred to as the intersection area) is obtained. Then, the sum of the areas of the first target detection frame and the reference target detection frame (referred to as the total area) is calculated, and the difference between the total area and the intersection area (referred to as the union area) is calculated. Finally, the ratio of the intersection area to the union area is determined as the IoU of the first target detection frame and the reference target detection frame, and this IoU calculation result is the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame.
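A minimal sketch of the IoU calculation described above, assuming the diagonal-vertex (x1, y1, x2, y2) parameterization; the function name is illustrative.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned rectangles (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Length and width of the intersecting rectangle; 0 if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    # Union area = sum of both areas minus the intersection area.
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - intersection)
    return intersection / union if union > 0 else 0.0
```

With identical boxes this returns 1 and with disjoint boxes 0, matching the [0, 1] value interval described above.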
S104, displaying a target detection frame on the current image frame according to the judgment result.
Referring to fig. 5, which is a schematic flowchart illustrating another method for outputting a target detection frame according to an embodiment of the present invention, as shown in fig. 5, the displaying a target detection frame on the current image frame according to a determination result includes:
S1041, when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is greater than or equal to a set value, displaying the reference target detection frame on the current image frame.
When the coincidence degree of the first target detection frame and the reference target detection frame is greater than or equal to the set value, the two frames essentially coincide. This further indicates that the position of the target object in the current image frame has not shifted much from its position in the reference image frame when the target detection frame was last predicted, so the terminal displays the reference target detection frame on the current image frame. Because the terminal displays the same reference target detection frame on both the current image frame and the reference image frame, the target detection frame the user sees does not jump, and it still accurately frames the target object.
In practical applications, the set value may be set to a number close to 1, for example, the set value may be set to 0.8.
It should be understood that, when the terminal adopts the IoU as the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame, step S1041 should be: when the IoU of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is greater than or equal to the set value, displaying the reference target detection frame on the current image frame.
S1042, when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is smaller than the set value, displaying the first target detection frame on the current image frame.
When the coincidence degree is smaller than the set value, it indicates that the position of the target object in the current image frame has shifted considerably from its position in the reference image frame when the target detection frame was last predicted, for example because the terminal shook over a larger range or the target object moved rapidly in the captured picture. Since the target detection frame has a large displacement between the current image frame and the reference image frame, the terminal needs to update the target detection frame in order to accurately frame the target object, and therefore displays the first target detection frame on the current image frame.
It should be understood that, when the terminal adopts the IoU as the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame, step S1042 should be: when the IoU of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is smaller than the set value, displaying the first target detection frame on the current image frame.
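Steps S1041 and S1042 reduce to a single comparison. The sketch below reuses the iou helper from the earlier example; the threshold value 0.8 follows the illustrative set value mentioned above, and the function name is hypothetical.

```python
SET_VALUE = 0.8  # illustrative threshold close to 1, as suggested above

def select_display_box(first_box, reference_box):
    """Decide which target detection frame to display on the current image frame."""
    if reference_box is not None and iou(first_box, reference_box) >= SET_VALUE:
        return reference_box  # S1041: barely moved, reuse the old box (no jump)
    return first_box          # S1042: large displacement, show the fresh prediction
```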
Further, the displaying the reference target detection frame on the current image frame includes: outputting the reference target detection frame on the current image frame according to the position parameter of the reference target detection frame.
In one embodiment, the terminal stores the location parameters of the object detection boxes that have been displayed historically.
Further, after the first target detection frame is output on the current image frame, the output method further includes:
and the terminal takes the position parameter of the first target detection frame of the current image frame as the position parameter of the reference target detection frame of the stored reference image, completes the updating of the position parameter of the reference target detection frame of the stored reference image, and stores the updated position parameter of the reference target detection frame of the reference image frame into the register.
After the terminal outputs the first target detection frame on the current image frame, it takes the position parameter of the first target detection frame of the current image frame as the position parameter of the reference target detection frame of the stored reference image frame, updating the stored position parameter. That is, the terminal always stores only one position parameter of the reference target detection frame, and each time the terminal outputs the first target detection frame on the current image frame, the stored position parameter of the reference target detection frame is updated. Specifically, the updated position parameter of the reference target detection frame of the reference image frame is stored in a register, and whenever the terminal needs to display the reference target detection frame of the reference image frame, it reads the position parameter from the register.
In the first aspect, storing the position parameter of the first target detection frame of the current image frame preserves the latest position of the target object for reference the next time the terminal displays a target detection frame. In the second aspect, the stored position parameter of the reference target detection frame of the reference image is discarded because the target detection frame has undergone a large displacement, so the old parameter no longer has reference value for subsequent display of the target detection frame, and discarding it also saves register space. In the third aspect, because the position parameter of the first target detection frame overwrites the stored position parameter of the reference target detection frame, the register always holds the position parameter of only one target detection frame; no searching among multiple sets of position parameters is needed when reading, which improves reading efficiency.
In the embodiment of the invention, the terminal stores the position parameters of the reference target detection frame of the reference image frame in the register of the processor, and the register has very high reading and writing speed, so that when the terminal needs to display the reference target detection frame, the position parameters of the reference target detection frame can be quickly read.
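The single-slot bookkeeping described here can be modeled as below; the BoxRegister class is only a software stand-in for the processor register the patent mentions, and its method names are assumptions.

```python
class BoxRegister:
    """Single-slot store for the reference box's position parameter."""

    def __init__(self):
        self._box = None

    def read(self):
        """Return the stored reference box, or None if nothing is stored."""
        return self._box

    def write(self, box):
        # Overwrite the previous value: only one set of position parameters
        # is ever kept, so reads never search among multiple entries.
        self._box = box

    def clear(self):
        self._box = None
```

In this model, a display step would call register.write(first_box) exactly when S1042 chooses the first target detection frame.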
As another embodiment of the present invention, the terminal may also generate a new target detection frame from the first target detection frame of the current image frame and the reference target detection frame of the reference image frame, and display the new target detection frame on the current image frame. For example, the terminal may average or weight the position parameters of the first target detection frame and the reference target detection frame to obtain the position parameter of the new target detection frame. Generating the position parameter of a new target detection frame from the position parameters of the first target detection frame and the reference target detection frame and displaying the new frame on the current image frame can improve the accuracy of the terminal's target detection.
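A sketch of this alternative embodiment, assuming the diagonal-vertex parameterization from the earlier examples; the 0.5 default weight (plain averaging) is an assumption, since the patent leaves the weighting unspecified.

```python
def blend_boxes(first_box, reference_box, weight=0.5):
    """Weighted average of two boxes' coordinates to form a new detection frame."""
    return tuple(weight * f + (1.0 - weight) * r
                 for f, r in zip(first_box, reference_box))
```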
Referring to fig. 6, fig. 6 is a schematic diagram of a photographing preview effect according to an embodiment of the present invention. As shown in fig. 6, when the mobile phone takes pictures, it continuously displays the image frames imaged by the camera in real time on the display screen to form a preview picture, and a target detection frame is displayed for each image frame. Comparing fig. 6 with fig. 1 for the same second frame image: in fig. 1 the position of the target detection frame changes because of the shake of the mobile phone, whereas in fig. 6 the target detection frame in the second frame image is in the same position as in the first frame image. When the mobile phone uses the output method of the target detection frame provided by the embodiment of the invention and shakes slightly during shooting, the position of the target detection frame in the preview picture remains unchanged and does not jump along with the shake, so the user's visual experience is not affected.
In the method, the current image frame is acquired and a first target detection frame of the current image frame is predicted; the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is judged; and a target detection frame is displayed on the current image frame according to the judgment result. The embodiment of the invention can prevent the target detection frame from jumping due to slight shaking of the terminal during photographing preview, while ensuring the accuracy of target detection. It improves the smoothness and stability of the target detection frame's movement in the preview image, brings a good visual effect to the user, and eliminates possible interference with the photographing operation.
Referring to fig. 7, fig. 7 is a schematic flow chart illustrating an implementation of another method for outputting a target detection box according to an embodiment of the present invention, and as shown in fig. 7, the method for outputting a target detection box further includes:
and S701, detecting the running state of a camera of the terminal.
The running state of the camera of the terminal comprises opening and closing.
S702, when the running state of the camera is closed, emptying the position parameters of the reference target detection frame of the reference image frame stored in the register.
On the one hand, when the running state is closed, the terminal stops shooting and no longer needs to display the target detection frame, so the position parameter of the reference target detection frame of the reference image frame stored in the register can be deleted. On the other hand, clearing this position parameter from the register prevents a wrong target detection frame from being displayed the next time the terminal opens the camera for shooting.
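Steps S701 and S702 can be sketched as a small state hook; the state strings and the reuse of the BoxRegister class from the earlier sketch are assumptions for illustration.

```python
def on_camera_state_changed(state, register):
    """S701/S702 sketch: clear the stored reference box when the camera closes."""
    if state == "closed":
        register.clear()  # prevents drawing a stale box in the next session
```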
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Referring to fig. 8, fig. 8 is a schematic diagram of an output apparatus of an object detection frame according to an embodiment of the present invention, as shown in fig. 8, the apparatus includes: the device comprises an acquisition module, a prediction module, a judgment module and a display module.
The acquisition module is used for acquiring a current image frame;
the prediction module is used for predicting a first target detection frame of the current image frame according to the current image frame;
the judging module is used for judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame; the reference target detection frame is a predicted target detection frame for a reference image frame, and the reference image frame is acquired before the current image frame;
and the display module is used for displaying the target detection frame on the current image frame according to the judgment result.
The judging module is specifically configured to: calculate the intersection-over-union (IoU) ratio of the first target detection frame and the reference target detection frame in the image frame, and judge the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame according to the IoU calculation result.
The display module is specifically configured to: when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is greater than or equal to a set value, display the reference target detection frame on the current image frame;
and when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is smaller than the set value, displaying the first target detection frame on the current image frame.
The display module is specifically configured to:
and displaying the reference target detection frame on the current image frame according to the position parameter of the reference target detection frame.
The device further comprises:
and the storage module is used for taking the position parameter of the first target detection frame of the current image frame as the position parameter of the reference target detection frame of the stored reference image, finishing updating the position parameter of the reference target detection frame of the stored reference image, and storing the updated position parameter of the reference target detection frame of the reference image frame to the register.
The device further comprises:
the detection module is used for detecting the running state of a camera of the terminal;
and the clearing module is used for clearing the position parameters of the reference target detection frame of the reference image frame stored in the register when the running state of the camera is closed.
The reference target detection frame is a target detection frame displayed on a reference image frame by the camera; the reference image frame is a previous image frame of the current image frame shot by the camera.
The acquisition module is specifically configured to:
collecting a preview picture shot by a camera of the terminal as the current image frame; or,
acquiring a frame in the video stream shot by the camera of the terminal as the current image frame.
It should be noted that the output device of the target detection frame provided in the above embodiment is illustrated only by the division into the above modules; in practical applications, the processing may be distributed to different modules as needed, that is, the internal structure of the device may be divided into different modules to complete all or part of the processing described above. In addition, the output device of the target detection frame and the output method embodiments of the target detection frame provided in the above embodiments belong to the same concept; their specific implementation processes are detailed in the method embodiments and are not repeated here.
Fig. 9 is a schematic diagram of a terminal according to an embodiment of the present invention. As shown in fig. 9, the terminal of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The processor, when executing the computer program, implements the steps in the various method embodiments described above, such as steps S101 to S104 shown in fig. 3. Alternatively, the processor, when executing the computer program, implements the functions of each module/unit in the above-mentioned device embodiments, such as the functions of the acquisition module, the prediction module, the judging module and the display module shown in fig. 8.
Illustratively, the computer program may be partitioned into one or more modules that are stored in the memory and executed by the processor to implement the invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program in the terminal.
The terminal may include, but is not limited to, a processor, a memory. Those skilled in the art will appreciate that fig. 9 is only an example of a terminal and is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or different components, e.g., the terminal may also include input-output devices, network access devices, buses, etc.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may be an internal storage unit of the terminal, such as a hard disk or a memory of the terminal. The memory may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Memory Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the terminal. Further, the memory may also include both an internal storage unit and an external storage device of the terminal. The memory is used for storing the computer program and other programs and data required by the terminal. The memory may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal and method may be implemented in other ways. For example, the above-described apparatus/terminal embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (11)

1. An output method of a target detection frame, applied to a terminal, the method comprising the following steps:
acquiring a current image frame;
predicting a first target detection frame of the current image frame according to the current image frame;
judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame; the reference target detection frame is a predicted target detection frame for a reference image frame, and the reference image frame is acquired before the current image frame;
and displaying a target detection frame on the current image frame according to the judgment result.
2. The output method according to claim 1, wherein the judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame comprises:
calculating the intersection-over-union (IoU) ratio of the first target detection frame and the reference target detection frame in the image frame, and judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame according to the IoU calculation result.
3. The output method according to claim 1, wherein the displaying a target detection frame on the current image frame according to the determination result includes:
when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is greater than or equal to a set value, displaying the reference target detection frame on the current image frame;
and when the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame is smaller than the set value, displaying the first target detection frame on the current image frame.
4. The output method according to claim 3, wherein the displaying the reference target detection frame on the current image frame includes:
and displaying the reference target detection frame on the current image frame according to the position parameter of the reference target detection frame.
5. The output method according to claim 3, wherein after the first target detection frame is displayed on the current image frame, the output method further comprises:
taking the position parameter of the first target detection frame of the current image frame as the position parameter of the reference target detection frame of the stored reference image frame, thereby completing the update of the stored position parameter, and storing the updated position parameter of the reference target detection frame of the reference image frame in the register.
6. The output method according to claim 5, characterized in that the output method further comprises:
detecting the running state of a camera of the terminal;
and when the running state of the camera is closed, emptying the position parameters of the reference target detection frame of the reference image frame stored in the register.
7. The output method according to claim 1, wherein the reference target detection frame is a target detection frame displayed on a reference image frame by the camera; the reference image frame is a previous image frame of the current image frame shot by the camera.
8. The output method according to claim 1, wherein the acquiring the current image frame specifically comprises:
collecting a preview picture shot by a camera of the terminal as the current image frame; or,
acquiring a frame in the video stream shot by the camera of the terminal as the current image frame.
9. An output device of a target detection frame, comprising:
the acquisition module is used for acquiring a current image frame;
the prediction module is used for predicting a first target detection frame of the current image frame according to the current image frame;
the judging module is used for judging the coincidence degree of the areas occupied by the first target detection frame and the reference target detection frame in the image frame; the reference target detection frame is a predicted target detection frame for a reference image frame, and the reference image frame is acquired before the current image frame;
and the display module is used for displaying the target detection frame on the current image frame according to the judgment result.
10. A terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the output method of the target detection frame according to any one of claims 1 to 8 when executing the computer program.
11. A computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, cause the processor to execute the output method of the target detection frame according to any one of claims 1 to 8.
CN201910945467.6A 2019-09-30 2019-09-30 Target detection frame output method and device, terminal and storage medium Pending CN110677585A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910945467.6A CN110677585A (en) 2019-09-30 2019-09-30 Target detection frame output method and device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910945467.6A CN110677585A (en) 2019-09-30 2019-09-30 Target detection frame output method and device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN110677585A true CN110677585A (en) 2020-01-10

Family

ID=69080630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910945467.6A Pending CN110677585A (en) 2019-09-30 2019-09-30 Target detection frame output method and device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN110677585A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3032462A1 (en) * 2014-12-09 2016-06-15 Ricoh Company, Ltd. Method and apparatus for tracking object, and non-transitory computer-readable recording medium
CN105260412A (en) * 2015-09-24 2016-01-20 东方网力科技股份有限公司 Image storage method and device, and image retrieval method and device
CN106875425A (en) * 2017-01-22 2017-06-20 北京飞搜科技有限公司 A kind of multi-target tracking system and implementation method based on deep learning
CN108197568A (en) * 2017-12-31 2018-06-22 广州二元科技有限公司 A kind of processing method for the recognition of face being lifted in digitized video
CN109308469A (en) * 2018-09-21 2019-02-05 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN109829445A (en) * 2019-03-01 2019-05-31 大连理工大学 A kind of vehicle checking method in video flowing

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626990A (en) * 2020-05-06 2020-09-04 北京字节跳动网络技术有限公司 Target detection frame processing method and device and electronic equipment
CN111626990B (en) * 2020-05-06 2023-05-23 北京字节跳动网络技术有限公司 Target detection frame processing method and device and electronic equipment
CN111695540A (en) * 2020-06-17 2020-09-22 北京字节跳动网络技术有限公司 Video frame identification method, video frame cutting device, electronic equipment and medium
CN111815959A (en) * 2020-06-19 2020-10-23 浙江大华技术股份有限公司 Vehicle violation detection method and device and computer readable storage medium
WO2022100262A1 (en) * 2020-11-12 2022-05-19 海信视像科技股份有限公司 Display device, human body posture detection method, and application
CN115564701A (en) * 2021-06-30 2023-01-03 深圳开立生物医疗科技股份有限公司 Polyp detection device, method, device and medium
CN115564701B (en) * 2021-06-30 2024-05-17 深圳开立生物医疗科技股份有限公司 Polyp detection device, method, equipment and medium
CN113642425A (en) * 2021-07-28 2021-11-12 北京百度网讯科技有限公司 Multi-mode-based image detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200110)