CN111277753A - Focusing method and device, terminal equipment and storage medium - Google Patents
Focusing method and device, terminal equipment and storage medium
- Publication number
- CN111277753A CN111277753A CN202010084012.2A CN202010084012A CN111277753A CN 111277753 A CN111277753 A CN 111277753A CN 202010084012 A CN202010084012 A CN 202010084012A CN 111277753 A CN111277753 A CN 111277753A
- Authority
- CN
- China
- Prior art keywords
- preview image
- image
- current preview
- target
- focusing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/67—Focus control based on electronic image sensor signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/63—Control of cameras or camera modules by using electronic viewfinders
- H04N23/631—Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
- H04N23/632—Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters for displaying or modifying preview images prior to image capturing, e.g. variety of image resolutions or capturing parameters
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The application discloses a focusing method, a focusing apparatus, a terminal device and a storage medium. The method may include the following steps: acquiring a current preview image; acquiring the subject confidence of the current preview image; detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence; and performing focusing processing in a corresponding focusing mode according to the detection result. By acquiring the subject confidence of the preview image, the method can detect whether the current preview image is a subject image or a non-subject image and perform focusing in the focusing mode matched to the detection result. Focusing is thus achieved in scenes both with and without a subject, erroneous target segmentation and erroneous target detection in subject-free scenes are reduced, inaccurate auto-focusing caused by segmentation or detection errors is avoided, and both the precision of subject segmentation or subject detection and the accuracy of auto-focusing are improved.
Description
Technical Field
The present application relates to the field of image technologies, and in particular, to a focusing method and apparatus, a terminal device, and a computer-readable storage medium.
Background
With the development of imaging technology, capturing images with terminal devices has become increasingly common. During image capture, a subject is generally detected by a target segmentation or target detection method, and the segmentation or detection result is passed to an auto-focus module for focusing processing.
However, in a scene without a subject target, erroneous segmentation or erroneous detection is likely to occur, so that the erroneous result is passed to the auto-focus module, causing inaccurate focusing and degrading the user experience.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the related art to some extent.
In a first aspect, an embodiment of the present application provides a focusing method, including the following steps: acquiring a current preview image; acquiring the subject confidence of the current preview image; detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence; and performing focusing processing in a corresponding focusing mode according to the detection result.
In a second aspect, an embodiment of the present application provides a focusing apparatus, including: a preview image acquisition module, configured to acquire a current preview image; a subject confidence obtaining module, configured to obtain the subject confidence of the current preview image; a detection module, configured to detect whether the current preview image is a subject image or a non-subject image according to the subject confidence; and a focusing module, configured to perform focusing processing in a corresponding focusing mode according to the detection result.
In a third aspect, an embodiment of the present application provides a terminal device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the focusing method described above.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the focusing method described in the first aspect of the present application.
According to the technical scheme of the embodiment of the application, the current preview image can be obtained, the subject confidence of the current preview image is then obtained, whether the current preview image is a subject image or a non-subject image is detected according to the subject confidence, and focusing is performed in a corresponding focusing mode according to the detection result. By acquiring the subject confidence of the preview image and selecting the focusing mode according to the detection result, focusing is achieved in scenes both with and without a subject, erroneous target segmentation and erroneous target detection in subject-free scenes are reduced, inaccurate auto-focusing caused by segmentation or detection errors is avoided, and both the precision of subject segmentation or subject detection and the accuracy of auto-focusing are improved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flowchart of a focusing method according to an embodiment of the present application.
FIG. 2 is a flowchart of a focusing method according to an embodiment of the present application.
FIG. 3 is a schematic diagram of a target segmentation architecture according to an embodiment of the present application.
FIG. 4 is a flowchart of a focusing method according to another embodiment of the present application.
FIG. 5 is a schematic structural diagram of a focusing device according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
In the prior art, a subject is generally detected by a target segmentation or target detection method, and focusing is performed by passing the segmentation or detection result to the auto-focus module.
In graph-based target segmentation methods, an image is generally partitioned into several sub-images such that similarity within each sub-image is maximized and similarity between sub-images is minimized; examples include Normalized Cut and Graph Cut. Clustering-based segmentation methods generally initialize a coarse clustering and iteratively group pixels with similar features into the same superpixel until convergence, yielding the final segmentation result; examples include k-means (the k-means clustering algorithm) and SLIC (Simple Linear Iterative Clustering). Semantics-based segmentation methods generally use a convolutional neural network to classify each pixel with a softmax cross-entropy loss, thereby segmenting the target; examples include FCN (Fully Convolutional Networks) and the DeepLab series.
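As an illustration of the clustering-based approach described above, the following is a minimal sketch using the SLIC implementation in scikit-image; the image path and parameter values are illustrative assumptions, not values specified by this application.

```python
import numpy as np
from skimage import io
from skimage.segmentation import slic

# Load a preview image (the path is a placeholder for illustration).
image = io.imread("preview.jpg")

# SLIC iteratively clusters pixels with similar color and position into
# superpixels until convergence, yielding the final over-segmentation.
segments = slic(image, n_segments=100, compactness=10.0, start_label=1)

print("number of superpixels:", np.unique(segments).size)
```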
Target detection methods include detecting a subject with integral-image (Haar-like) features combined with AdaBoost, and determining subject candidate regions by extracting HOG features combined with an SVM (Support Vector Machine) classifier or a DPM (Deformable Parts Model). Deep-learning-based target detection methods fall into two groups. One-stage networks, such as the YOLO (You Only Look Once) and SSD series, regress the position of the target box directly, converting target localization into a regression problem without generating candidate boxes. Two-stage networks, such as the Faster R-CNN family, recommend candidate regions with an RPN (Region Proposal Network).
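For instance, the classic integral-image + AdaBoost pipeline is available in OpenCV as pretrained Haar cascades; the sketch below uses the bundled frontal-face model purely for illustration and assumes the opencv-python package is installed.

```python
import cv2

# Pretrained Haar cascade: integral-image (Haar-like) features + AdaBoost.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

frame = cv2.imread("preview.jpg")  # placeholder path
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Each detection is an (x, y, w, h) box, i.e. a region position and a size.
boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in boxes:
    print("subject candidate at", (x, y), "with size", (w, h))
```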
However, in a scene without a subject target, erroneous segmentation or erroneous detection is likely to occur, so that the erroneous result is passed to the auto-focus module, causing inaccurate focusing and degrading the user experience.
Focusing methods, apparatuses, terminal devices, and storage media according to embodiments of the present application are described below with reference to the accompanying drawings.
FIG. 1 is a flowchart of a focusing method according to an embodiment of the present application. It should be noted that the focusing method of the embodiments of the present application can be applied to the focusing device of the embodiments of the present application, and the device can be configured on a terminal device. In an embodiment of the present application, the terminal device may be a mobile terminal, such as a smart phone, a tablet computer, a wearable device, and the like.
As shown in fig. 1, the focusing method may include:
S110, acquiring the current preview image.
In the embodiment of the application, the current preview image can be obtained when the camera captures the current frame. The preview image is generated by the terminal device capturing the current scene in real time through a camera, and may include a person image and a background image. The preview image includes, but is not limited to, an RGB (red, green and blue three-channel) image and a depth map. To reduce the amount of computation in subsequent processing, the preview image may be downscaled to a smaller size, for example, 224 × 224.
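A minimal sketch of this acquisition step, assuming an OpenCV-accessible camera; the device index and the 224 × 224 target size follow the example above and are not mandated by the application.

```python
import cv2

# Open the default camera (device index 0 is an assumption).
capture = cv2.VideoCapture(0)
ok, frame = capture.read()
capture.release()

if ok:
    # Downscale the preview to reduce the cost of subsequent processing.
    preview = cv2.resize(frame, (224, 224), interpolation=cv2.INTER_AREA)
    print("preview shape:", preview.shape)  # (224, 224, 3), BGR
```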
In an embodiment of the present application, the current image captured by the camera may be an image captured after the camera on the terminal device is turned on.
The camera may be a device installed in the terminal device, and the camera includes, but is not limited to, one or more of various wide-angle cameras, telephoto cameras, color cameras, or black and white cameras.
In one embodiment of the application, the preview image can be displayed on a display screen of the terminal device in real time.
In an embodiment of the present application, the current preview image may also be a frame image in a video acquired by the terminal device when recording the video.
S120, acquiring the subject confidence of the current preview image.
In the embodiment of the present application, after the current preview image is obtained, the subject confidence of the current preview image may be obtained through a Support Vector Machine (SVM) regression model or through a classification network model based on a Convolutional Neural Network (CNN). For example, the subject confidence of the current preview image may be obtained in the following two ways:
as an embodiment of an implementation manner, the features of the current preview image may be extracted, and then the features of the current preview image are input into a trained support vector machine regression model for regression prediction to obtain a regression value, and the obtained regression value is determined as a main confidence of the current preview image.
In another implementation, the current preview image is input into a trained neural network classifier, where the neural network classifier is a classification network model based on a convolutional neural network and includes an input layer for extracting features and an output layer for outputting network raw output values. The network raw output values output by the classifier are then obtained, probability conversion is performed on the target output value among the network raw output values, and the probability obtained after the conversion is determined as the subject confidence of the current preview image, where the target output value refers to the output value corresponding to the subject class label. Specific implementations can refer to the following embodiments.
S130, detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence.
In an embodiment of the present application, after the subject confidence of the current preview image is obtained, whether the subject confidence is smaller than a target threshold may be determined; if the subject confidence is smaller than the target threshold, the current preview image is determined to be a non-subject image, and if the subject confidence is greater than or equal to the target threshold, the current preview image is determined to be a subject image.
That is, once the subject confidence of the current preview image is obtained, it may be compared with a target threshold: below the threshold, the current preview image is determined to be a non-subject image; at or above the threshold, it is determined to be a subject image. As an example, the target threshold may be 0.5.
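A minimal sketch of this comparison, with the 0.5 threshold taken from the example above:

```python
def is_subject_image(subject_confidence: float, target_threshold: float = 0.5) -> bool:
    # Below the threshold the frame is treated as a non-subject image;
    # at or above the threshold it is treated as a subject image.
    return subject_confidence >= target_threshold
```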
S140, performing focusing processing in a corresponding focusing mode according to the detection result.
Optionally, in an embodiment of the present application, when the current preview image is detected to be a subject image, the size and region position of the subject target are detected from the current preview image, and focusing is performed according to the detected size and region position of the subject target; when the current preview image is detected to be a non-subject image, focusing processing is performed according to the target area in the current preview image.
The target area may be the middle area, i.e., the central region of the current preview image.
As an example, when the current preview image is detected to be a non-subject image, the middle area of the current preview image may be auto-focused; when it is detected to be a subject image, the size and region position of the subject target may be detected from the current preview image by means of target segmentation or target detection, and focusing is performed according to the detected size and region position. The specific implementation process can refer to the following embodiments.
According to the focusing method of the embodiment of the application, the current preview image can be obtained, the subject confidence of the current preview image is then obtained, whether the current preview image is a subject image or a non-subject image is detected according to the subject confidence, and focusing is performed in a corresponding focusing mode according to the detection result. By acquiring the subject confidence of the preview image and selecting the focusing mode according to the detection result, focusing is achieved in scenes both with and without a subject, erroneous target segmentation and erroneous target detection in subject-free scenes are reduced, inaccurate auto-focusing caused by segmentation or detection errors is avoided, and both the precision of subject segmentation or subject detection and the accuracy of auto-focusing are improved.
Fig. 2 is a flowchart of a focusing method according to an embodiment of the present application, and as shown in fig. 2, the focusing method may include:
S210, acquiring the current preview image.
For example, when the camera takes a picture of a user standing with the sea behind them, the current preview image can be obtained; the acquired preview image includes both the user and the sea.
S220, acquiring the subject confidence of the current preview image.
In one implementation, the features of the current preview image may be extracted, the features are input into a trained support vector machine regression model for regression prediction to obtain a regression value, and the obtained regression value is determined as the subject confidence of the current preview image.
The features of the current preview image include, but are not limited to, HOG (Histogram of Oriented Gradients), texture, color, and shape features.
The support vector machine regression model includes, but is not limited to, a support vector machine regression model configured by a linear kernel function, a support vector machine regression model configured by a polynomial kernel function, and a support vector machine regression model configured by a radial basis kernel function.
For example, after the current preview image is obtained, the HOG features of the current preview image may be extracted and input into a support vector machine regression model configured with a linear kernel function to obtain a regression value, for example a value between 0 and 1; the obtained regression value is then determined as the subject confidence of the current preview image.
The HOG features of the current preview image may be extracted as follows: normalize the whole image; compute the image gradients in the horizontal and vertical directions; compute the gradient orientation value at each pixel position, which provides an encoding of local image regions while remaining weakly sensitive to the pose and appearance of human subjects in the image; normalize the gradient strength over blocks, since changes in local illumination and in foreground-background contrast make the range of gradient strengths very large; and finally collect the HOG features of all overlapping blocks in the detection window and concatenate them into the final feature vector for classification, thereby obtaining the HOG features of the current preview image.
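A minimal sketch of this pipeline, using scikit-image's HOG extractor and scikit-learn's SVR with a linear kernel; `training_images`, `y_train` and `preview` are hypothetical placeholders for a labeled training set and a live frame.

```python
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVR

def hog_features(image_bgr: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (224, 224))
    # Gradient-orientation histograms over overlapping, block-normalized
    # cells, concatenated into one feature vector.
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Hypothetical training set: feature vectors with subject labels in [0, 1].
X_train = np.stack([hog_features(img) for img in training_images])
model = SVR(kernel="linear")
model.fit(X_train, y_train)

# The regression value is taken as the subject confidence.
confidence = float(model.predict(hog_features(preview)[None, :])[0])
```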
For another example, when the current preview image is obtained, the texture features of the current preview image may be extracted by the extraction module. A texture image exhibits a regular distribution of gray values caused by the repeated arrangement of features across the image. The texture features are then input into a support vector machine regression model configured with a polynomial kernel function to obtain a regression value, and the obtained regression value is determined as the subject confidence of the current preview image.
The texture features of the current preview image can be extracted through statistical algorithms, geometric algorithms, and signal-processing algorithms.
For another example, when the current preview image is obtained, the shape features of the current preview image may be extracted by the extraction module and input into a support vector machine regression model configured with a radial basis kernel function to obtain a regression value, and the obtained regression value is determined as the subject confidence of the current preview image.
The shape features of the current preview image can be extracted through boundary-feature algorithms, shape-invariant-moment algorithms, and Fourier shape descriptor algorithms.
In another implementation, the current preview image is input into a trained neural network classifier, where the neural network classifier is a classification network model based on a convolutional neural network and includes an input layer for extracting features and an output layer for outputting network raw output values. The network raw output values output by the classifier are then obtained, probability conversion is performed on the target output value among the network raw output values, and the probability obtained after the conversion is determined as the subject confidence of the current preview image, where the target output value refers to the output value corresponding to the subject class label. As an example, the classification network model may be a binary classification network model.
For example, after the current preview image is acquired, it may be input into a trained neural network classifier; the classifier extracts the features of the preview image and, based on these features, outputs a network raw output value, which in this case consists of a single output value. The raw output value is then converted to a probability through a Sigmoid function, and the resulting probability is determined as the subject confidence of the current preview image.
For another example, after the current preview image is acquired, it may be input into a trained neural network classifier; the classifier extracts the features of the preview image and, based on these features, outputs network raw output values, which in this case consist of two output values. The two output values are converted to probabilities through a Softmax function, and the converted probability of the output value corresponding to the subject class label is determined as the subject confidence of the current preview image.
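A minimal sketch of both conversions in PyTorch; the logit values are illustrative stand-ins for the raw outputs of a trained classifier.

```python
import torch

# Raw network outputs (logits) for one preview image; values are made up.
single_logit = torch.tensor([1.2])       # one-output classifier
pair_logits = torch.tensor([0.3, 1.7])   # two outputs: [no-subject, subject]

# Case 1: a single raw output value -> Sigmoid gives the subject confidence.
confidence_sigmoid = torch.sigmoid(single_logit).item()

# Case 2: two raw output values -> Softmax; the probability of the output
# corresponding to the subject class label is the subject confidence.
confidence_softmax = torch.softmax(pair_logits, dim=0)[1].item()

print(confidence_sigmoid, confidence_softmax)
```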
S230, detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence.
For example, once the subject confidence of the current preview image is obtained, it may be compared with a target threshold: if the subject confidence is 0.6 and the target threshold is 0.5, then since 0.6 is greater than 0.5, the current preview image is determined to be a subject image.
S240, when the current preview image is detected to be a non-subject image, performing focusing processing according to the target area in the current preview image.
The target area may be the middle area.
That is, when it is detected that the current preview image is a non-subject image, auto-focusing may be performed on the middle area in the current preview image.
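A minimal sketch of the middle-area focusing, assuming the auto-focus routine accepts a rectangular ROI; the one-third proportions are an illustrative assumption, not specified by the application.

```python
def center_focus_roi(width: int, height: int) -> tuple[int, int, int, int]:
    # Middle third of the frame as (x, y, w, h); the proportion is an
    # assumption chosen to illustrate the "middle area" target region.
    w, h = width // 3, height // 3
    return (width - w) // 2, (height - h) // 2, w, h

roi = center_focus_roi(224, 224)  # e.g. for the downscaled preview
```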
S250, when the current preview image is detected to be a subject image, detecting the size and region position of the subject target from the current preview image by means of target segmentation, and performing focusing processing according to the detected size and region position of the subject target.
In an embodiment of the present application, when the current preview image is detected to be a subject image, the current preview image may be input into a trained target segmentation network model. The target segmentation network model has learned the mapping relationship between image features and the size and region position of the subject, and includes an input layer for extracting image features and an output layer for outputting the size and region position of the subject. The size and region position of the subject target output by the model are then obtained, and focusing is performed according to the detected size and region position of the subject target.
The target segmentation network model is a segmentation model based on a target segmentation algorithm, where the target segmentation algorithm includes, but is not limited to, the DeepLab series, U-Net, and FCN (Fully Convolutional Networks), and comprises an Encoder feature-encoding module and a Decoder target-mask generation module. For example, the network structure of the target segmentation network model may be as shown in FIG. 3.
That is, when the current preview image is detected to be a subject image, it may be input into the trained target segmentation network model; the model extracts image features and, based on the learned mapping relationship between image features and the size and region position of the subject, outputs the size and region position of the subject target, after which auto-focusing is performed according to the detected size and region position of the subject target.
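A minimal sketch of this inference step in PyTorch, with torchvision's pretrained FCN standing in for the trained target segmentation network model; reducing the predicted mask to a size and region position is shown as a simple bounding-box computation.

```python
import torch
from torchvision.models.segmentation import fcn_resnet50

model = fcn_resnet50(weights="DEFAULT").eval()  # stand-in segmentation model

# `preview` is assumed to be a (3, 224, 224) float tensor, normalized as
# the model expects.
with torch.no_grad():
    logits = model(preview.unsqueeze(0))["out"][0]
mask = logits.argmax(dim=0) != 0  # non-background pixels form the subject

ys, xs = torch.nonzero(mask, as_tuple=True)
if len(xs) > 0:
    x0, y0, x1, y1 = xs.min(), ys.min(), xs.max(), ys.max()
    # Region position (top-left corner) and size of the subject target.
    region = (int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1))
```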
According to the focusing method of the embodiment of the application, the current preview image can be obtained, the subject confidence of the current preview image is then obtained, and whether the current preview image is a subject image or a non-subject image is detected according to the subject confidence. When the current preview image is detected to be a non-subject image, focusing processing is performed according to the target area in the current preview image; when it is detected to be a subject image, the size and region position of the subject target can be detected from the current preview image by means of target segmentation, and focusing processing is performed according to the detected size and region position. By acquiring the subject confidence of the preview image and selecting the focusing mode according to the detection result, focusing is achieved in scenes both with and without a subject, erroneous target segmentation in subject-free scenes is reduced, inaccurate auto-focusing caused by segmentation errors is avoided, and both the precision of subject segmentation and the accuracy of auto-focusing are improved.
FIG. 4 is a flowchart of a focusing method according to another embodiment of the present application. As shown in FIG. 4, the focusing method may include:
S410, acquiring the current preview image.
S420, acquiring the subject confidence of the current preview image.
S430, detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence.
S440, when the current preview image is detected to be a non-subject image, performing focusing processing according to the target area in the current preview image.
The target area may be the middle area.
That is, when it is detected that the current preview image is a non-subject image, auto-focusing may be performed on the middle area in the current preview image.
It should be noted that, in the embodiment of the present application, the implementation manners of the steps S410 to S440 may refer to the description of the implementation manners of the steps S210 to S240, and are not described herein again.
S450, when the current preview image is detected to be a subject image, the size and region position of the subject target can be detected from the current preview image by means of target detection, and focusing processing is performed according to the detected size and region position of the subject target.
In an embodiment of the present application, when the current preview image is detected to be a subject image, the current preview image may be input into a trained target detection network model. The target detection network model has learned the mapping relationship between image features and the size and position of the subject's inscribed rectangle, and includes an input layer for extracting image features and an output layer for outputting the size and position of the subject inscribed rectangle. The size and position of the subject inscribed rectangle output by the model are then obtained; the size of the inscribed rectangle is determined as the size of the subject target and its position as the region position of the subject target, and focusing processing is performed according to the detected size and region position of the subject target.
The target detection model is a detection model based on a target detection algorithm, where the target detection algorithm includes, but is not limited to, SSD, YOLO, and Faster R-CNN (faster region-based convolutional neural network).
That is, when the current preview image is detected to be a subject image, it may be input into the trained target detection network model; the model extracts image features and, based on the learned mapping relationship between image features and the size and position of the subject inscribed rectangle, outputs the size and position of the subject inscribed rectangle, thereby obtaining the size and position output by the target detection network model, after which auto-focusing is performed according to the detected size and region position of the subject target.
It should be noted that, since the circumscribed rectangle of the subject would include background regions and thus easily make auto-focusing inaccurate, in the embodiment of the present application the target detection network model outputs the size and position of the subject's inscribed rectangle.
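A minimal sketch of this detection step, with torchvision's pretrained Faster R-CNN standing in for the trained target detection network model; since the pretrained model returns circumscribed boxes, shrinking the box toward an inscribed rectangle is shown as an illustrative post-processing assumption.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()  # stand-in detector

# `preview` is assumed to be a (3, H, W) float tensor with values in [0, 1].
with torch.no_grad():
    detections = model([preview])[0]

if len(detections["boxes"]) > 0:
    x0, y0, x1, y1 = detections["boxes"][0].tolist()  # highest-scoring box
    # Illustrative shrink toward an inscribed rectangle, excluding background
    # near the box edges; the shrink factor is an assumption.
    shrink = 0.15
    dw, dh = (x1 - x0) * shrink / 2, (y1 - y0) * shrink / 2
    region = (x0 + dw, y0 + dh, (x1 - x0) - 2 * dw, (y1 - y0) - 2 * dh)
```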
According to the focusing method of the embodiment of the application, the current preview image can be obtained, the subject confidence of the current preview image is then obtained, and whether the current preview image is a subject image or a non-subject image is detected according to the subject confidence. When the current preview image is detected to be a non-subject image, focusing processing is performed according to the target area in the current preview image; when it is detected to be a subject image, the size and region position of the subject target can be detected from the current preview image by means of target detection, and focusing processing is performed according to the detected size and region position. By acquiring the subject confidence of the preview image and selecting the focusing mode according to the detection result, focusing is achieved in scenes both with and without a subject, erroneous target detection in subject-free scenes is reduced, inaccurate auto-focusing caused by detection errors is avoided, and both the precision of subject detection and the accuracy of auto-focusing are improved.
Corresponding to the focusing methods provided by the above embodiments, an embodiment of the present application further provides a focusing apparatus, and since the focusing apparatus provided by the embodiment of the present application corresponds to the focusing methods provided by the above embodiments, the implementation manner of the focusing method is also applicable to the focusing apparatus provided by the embodiment, and is not described in detail in the embodiment. FIG. 5 is a schematic structural diagram of a focusing device according to an embodiment of the present application.
As shown in fig. 5, the focusing apparatus 500 includes: a preview image acquisition module 510, a subject confidence acquisition module 520, a detection module 530, and a focus module 540. Wherein:
the preview image acquiring module 510 is used for acquiring a current preview image.
The subject confidence obtaining module 520 is configured to obtain a subject confidence of the current preview image; as an example, the subject confidence obtaining module is specifically configured to: extracting the characteristics of the current preview image; inputting the characteristics of the current preview image into a trained support vector machine regression model for regression prediction to obtain a regression value; and determining the obtained regression value as the subject confidence of the current preview image.
In an embodiment of the present application, the subject confidence obtaining module 520 is specifically configured to: input the current preview image into a trained neural network classifier, where the neural network classifier is a classification network model based on a convolutional neural network and includes an input layer for extracting features and an output layer for outputting network raw output values; acquire the network raw output values output by the neural network classifier; and perform probability conversion on the target output value among the network raw output values and determine the probability obtained after the conversion as the subject confidence of the current preview image, where the target output value is the output value corresponding to the subject class label.
The detecting module 530 is configured to detect whether the current preview image is a subject image or a non-subject image according to the subject confidence level; as an example, the detection module is specifically configured to: judging whether the subject confidence is smaller than a target threshold value; if the subject confidence is smaller than the target threshold, judging that the current preview image is a non-subject image; and if the subject confidence is greater than or equal to the target threshold, judging that the current preview image is a subject image.
The focusing module 540 is configured to perform focusing processing in a corresponding focusing mode according to the detection result. As an example, the focusing module is specifically configured to: when the current preview image is detected to be a subject image, detect the size and region position of the subject target from the current preview image, and perform focusing processing according to the detected size and region position of the subject target; and when the current preview image is detected to be a non-subject image, perform focusing processing according to the target area in the current preview image.
In an embodiment of the present application, the focusing module 540 is specifically configured to: input the current preview image into a trained target segmentation network model, where the target segmentation network model has learned the mapping relationship between image features and the size and region position of the subject, and includes an input layer for extracting image features and an output layer for outputting the size and region position of the subject; and acquire the size and region position of the subject target output by the target segmentation network model.
In an embodiment of the present application, the focusing module 540 is specifically configured to: input the current preview image into a trained target detection network model, where the target detection network model has learned the mapping relationship between image features and the size and position of the subject inscribed rectangle, and includes an input layer for extracting image features and an output layer for outputting the size and position of the subject inscribed rectangle; acquire the size and position of the subject inscribed rectangle output by the target detection network model; and determine the size of the subject inscribed rectangle as the size of the subject target and the position of the subject inscribed rectangle as the region position of the subject target.
According to the focusing apparatus of the embodiment of the application, the current preview image can be obtained, the subject confidence of the current preview image is then obtained, whether the current preview image is a subject image or a non-subject image is detected according to the subject confidence, and focusing is performed in a corresponding focusing mode according to the detection result. By acquiring the subject confidence of the preview image and selecting the focusing mode according to the detection result, focusing is achieved in scenes both with and without a subject, erroneous target segmentation and erroneous target detection in subject-free scenes are reduced, inaccurate auto-focusing caused by segmentation or detection errors is avoided, and both the precision of subject segmentation or subject detection and the accuracy of auto-focusing are improved.
In order to implement the above embodiment, the present application further provides a terminal device.
Fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 6, the terminal device 600 may include: a memory 610, a processor 620 and a computer program 630 stored in the memory 610 and operable on the processor 620, wherein the processor 620 implements the focusing method according to any one of the above embodiments when executing the program.
In order to implement the above embodiments, the present application also proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the focusing method of any one of the above embodiments.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.
Claims (16)
1. A focusing method, comprising:
acquiring a current preview image;
acquiring the subject confidence of the current preview image;
detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence;
and performing focusing processing in a corresponding focusing mode according to the detection result.
2. The focusing method of claim 1, wherein obtaining the subject confidence level of the current preview image comprises:
extracting the characteristics of the current preview image;
inputting the characteristics of the current preview image into a trained support vector machine regression model for regression prediction to obtain a regression value;
and determining the obtained regression value as the subject confidence of the current preview image.
3. The focusing method of claim 1, wherein obtaining the subject confidence level of the current preview image comprises:
inputting the current preview image into a trained neural network classifier; the neural network classifier is a classification network model based on a convolutional neural network, and comprises an input layer for extracting features and an output layer for outputting network original output values;
acquiring a network original output value output by the neural network classifier;
performing probability conversion on a target output value in the network original output values, and determining the probability obtained after conversion as the subject confidence of the current preview image; wherein the target output value is the output value corresponding to the subject class label.
4. The focusing method of claim 1, wherein detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence level comprises:
judging whether the subject confidence is smaller than a target threshold value;
if the subject confidence is smaller than the target threshold, judging that the current preview image is a non-subject image;
and if the subject confidence is greater than or equal to the target threshold, judging that the current preview image is a subject image.
5. The focusing method according to claim 1, wherein the performing the focusing process by using the corresponding focusing manner according to the detection result comprises:
when the current preview image is detected to be a subject image, detecting the size and the region position of a subject target from the current preview image, and performing focusing processing according to the detected size and region position of the subject target;
and when the current preview image is detected to be a non-subject image, performing focusing processing according to a target area in the current preview image.
6. The focusing method of claim 5, wherein detecting the size and the area position of the subject target from the current preview image comprises:
inputting the current preview image into a trained target segmentation network model; wherein the target segmentation network model has learned the mapping relation between each image feature and the size and the region position of the subject, and comprises an input layer for extracting the image feature and an output layer for outputting the size and the region position of the subject;
and acquiring the size and the region position of the subject target output by the target segmentation network model.
7. The focusing method of claim 5, wherein detecting the size and the area position of the subject target from the current preview image comprises:
inputting the current preview image into a trained target detection network model; wherein the target detection network model is obtained by learning the mapping relation between each image feature and the size and the position of the subject inscribed rectangle, and comprises an input layer for extracting the image feature and an output layer for outputting the size and the position of the subject inscribed rectangle;
acquiring the size and the position of the subject inscribed rectangle output by the target detection network model;
and determining the size of the subject inscribed rectangle to be the size of the subject target, and determining the position of the subject inscribed rectangle to be the region position of the subject target.
8. A focusing apparatus, comprising:
the preview image acquisition module is used for acquiring a current preview image;
a subject confidence obtaining module, configured to obtain the subject confidence of the current preview image;
the detection module is used for detecting whether the current preview image is a subject image or a non-subject image according to the subject confidence;
and the focusing module is used for performing focusing processing in a corresponding focusing mode according to the detection result.
9. The focusing device of claim 8, wherein the subject confidence acquisition module is specifically configured to:
extracting the characteristics of the current preview image;
inputting the characteristics of the current preview image into a trained support vector machine regression model for regression prediction to obtain a regression value;
and determining the obtained regression value as the subject confidence of the current preview image.
10. The focusing device of claim 8, wherein the subject confidence acquisition module is specifically configured to:
inputting the current preview image into a trained neural network classifier; the neural network classifier is a classification network model based on a convolutional neural network, and comprises an input layer for extracting features and an output layer for outputting network original output values;
acquiring a network original output value output by the neural network classifier;
performing probability conversion on a target output value in the network original output values, and determining the probability obtained after conversion as the subject confidence of the current preview image; wherein the target output value is the output value corresponding to the subject class label.
11. The focusing device of claim 8, wherein the detection module is specifically configured to:
judging whether the subject confidence is smaller than a target threshold value;
if the subject confidence is smaller than the target threshold, judging that the current preview image is a non-subject image;
and if the subject confidence is greater than or equal to the target threshold, judging that the current preview image is a subject image.
12. The focusing device of claim 8, wherein the focusing module is specifically configured to:
when the current preview image is detected to be a subject image, detecting the size and the region position of a subject target from the current preview image, and performing focusing processing according to the detected size and region position of the subject target;
and when the current preview image is detected to be a non-subject image, performing focusing processing according to a target area in the current preview image.
13. The focusing device of claim 12, wherein the focusing module is specifically configured to:
inputting the current preview image into a trained target segmentation network model; wherein the target segmentation network model has learned the mapping relation between each image feature and the size and the region position of the subject, and comprises an input layer for extracting the image feature and an output layer for outputting the size and the region position of the subject;
and acquiring the size and the region position of the subject target output by the target segmentation network model.
14. The focusing device of claim 12, wherein the focusing module is specifically configured to:
inputting the current preview image into a trained target detection network model; wherein the target detection network model is obtained by learning the mapping relation between each image feature and the size and the position of the subject inscribed rectangle, and comprises an input layer for extracting the image feature and an output layer for outputting the size and the position of the subject inscribed rectangle;
acquiring the size and the position of the subject inscribed rectangle output by the target detection network model;
and determining the size of the subject inscribed rectangle to be the size of the subject target, and determining the position of the subject inscribed rectangle to be the region position of the subject target.
15. A terminal device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the focusing method as claimed in any one of claims 1 to 7 when executing the program.
16. A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing a focusing method as recited in any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010084012.2A CN111277753A (en) | 2020-02-10 | 2020-02-10 | Focusing method and device, terminal equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111277753A true CN111277753A (en) | 2020-06-12 |
Family
ID=71003580
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010084012.2A Pending CN111277753A (en) | 2020-02-10 | 2020-02-10 | Focusing method and device, terminal equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111277753A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103955718A (en) * | 2014-05-15 | 2014-07-30 | 厦门美图之家科技有限公司 | Image subject recognition method |
CN109167910A (en) * | 2018-08-31 | 2019-01-08 | 努比亚技术有限公司 | focusing method, mobile terminal and computer readable storage medium |
CN109587394A (en) * | 2018-10-23 | 2019-04-05 | 广东智媒云图科技股份有限公司 | A kind of intelligence patterning process, electronic equipment and storage medium |
CN110149482A (en) * | 2019-06-28 | 2019-08-20 | Oppo广东移动通信有限公司 | Focusing method, device, electronic equipment and computer readable storage medium |
CN110418064A (en) * | 2019-09-03 | 2019-11-05 | 北京字节跳动网络技术有限公司 | Focusing method, device, electronic equipment and storage medium |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114666490A (en) * | 2020-12-23 | 2022-06-24 | 北京小米移动软件有限公司 | Focusing method and device, electronic equipment and storage medium |
CN114666490B (en) * | 2020-12-23 | 2024-02-09 | 北京小米移动软件有限公司 | Focusing method, focusing device, electronic equipment and storage medium |
CN113572957A (en) * | 2021-06-26 | 2021-10-29 | 荣耀终端有限公司 | Shooting focusing method and related equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Sajid et al. | Universal multimode background subtraction | |
US10491895B2 (en) | Fast and robust human skin tone region detection for improved video coding | |
AU2021201933B2 (en) | Hierarchical multiclass exposure defects classification in images | |
US8374440B2 (en) | Image processing method and apparatus | |
JP6332937B2 (en) | Image processing apparatus, image processing method, and program | |
US8934724B2 (en) | Image recognition device, image recognizing method, storage medium that stores computer program for image recognition | |
US8306262B2 (en) | Face tracking method for electronic camera device | |
US8446494B2 (en) | Automatic redeye detection based on redeye and facial metric values | |
TW202014984A (en) | Image processing method, electronic device, and storage medium | |
CN107633237B (en) | Image background segmentation method, device, equipment and medium | |
CN111754531A (en) | Image instance segmentation method and device | |
CN110580499B (en) | Deep learning target detection method and system based on crowdsourcing repeated labels | |
CN111277753A (en) | Focusing method and device, terminal equipment and storage medium | |
CN113691724A (en) | HDR scene detection method and device, terminal and readable storage medium | |
WO2015189369A1 (en) | Methods and systems for color processing of digital images | |
EP4332910A1 (en) | Behavior detection method, electronic device, and computer readable storage medium | |
CN111275045B (en) | Image main body recognition method and device, electronic equipment and medium | |
CN116168328A (en) | Thyroid nodule ultrasonic inspection system and method | |
CN115424293A (en) | Living body detection method, and training method and device of living body detection model | |
CN112784691B (en) | Target detection model training method, target detection method and device | |
JP2023519527A (en) | Generating segmentation masks based on autoencoders in alpha channel | |
EP3038059A1 (en) | Methods and systems for color processing of digital images | |
Zhou et al. | Feature fusion detector for semantic cognition of remote sensing | |
CN115331310B (en) | Multi-user gesture recognition method, device and medium | |
CN112926417A (en) | Pedestrian detection method, system, device and medium based on deep neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200612 |