CN112565586A - Automatic focusing method and device - Google Patents


Info

Publication number
CN112565586A
Authority
CN
China
Prior art keywords
area
position information
target
objects
deep learning
Prior art date
Legal status
Pending
Application number
CN201910920070.1A
Other languages
Chinese (zh)
Inventor
董中要
Current Assignee
Beijing Anyun Century Technology Co Ltd
Original Assignee
Beijing Anyun Century Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Anyun Century Technology Co Ltd filed Critical Beijing Anyun Century Technology Co Ltd
Priority to CN201910920070.1A
Publication of CN112565586A


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/67Focus control based on electronic image sensor signals

Abstract

The invention discloses an auto-focusing method applied to an electronic device provided with one or more image acquisition units. The method comprises the following steps: acquiring an image of the current scene with the image acquisition unit to obtain at least one picture; performing target detection on the picture with a deep learning model to obtain position information of a target area, where the target area is the area of the current scene that is of interest to the user of the electronic device; and, based on the position information of the target area, controlling the image acquisition unit to focus on the target area and take a photograph. The invention achieves automatic focusing for photographing and improves focusing speed. The invention further discloses an auto-focusing apparatus, an electronic device, and a computer-readable storage medium.

Description

Automatic focusing method and device
Technical Field
The present invention relates to the field of photographing technologies, and in particular, to an auto-focusing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development of science and technology, smartphones have become widely popularized; they offer rich functionality and are well loved by users. People use smartphones to make calls, send and receive text messages, browse the web, edit documents, listen to music, watch movies, shop online, trade stocks, take photos and record videos, chat, hail rides, book flight and train tickets, order takeout, play games, and transfer and manage money.
Photographing is an important function of the smartphone, and the quality of its photographing directly affects the smartphone's sales. Photographing technology has therefore long been a research and development focus of major handset manufacturers.
At present, there are two focusing methods for phone photography. One is manual focusing, in which the user taps a specific area of the screen. The other is automatic focusing, based either on the distance between the lens and the subject or on focus detection of a sharp image on the focusing screen. However, the existing manual method is inconvenient to operate and slow, and tap-to-focus is impossible when shooting with a selfie stick; the existing automatic method is not accurate enough and has large errors.
Disclosure of Invention
The embodiments of the present application provide an auto-focusing method and apparatus, an electronic device, and a computer-readable storage medium. They solve the technical problems of slow or inaccurate focusing in prior-art focusing methods, realize automatic focusing for photographing, and improve both focusing speed and focusing accuracy.
In a first aspect, the present application provides the following technical solutions through an embodiment of the present application:
An auto-focusing method applied to an electronic device provided with one or more image acquisition units, the method comprising the following steps:
acquiring an image of the current scene with the image acquisition unit to obtain at least one picture;
performing target detection on the picture with a deep learning model to obtain position information of a target area, where the target area is the area of the current scene that is of interest to the user, the user corresponding to the electronic device;
and, based on the position information of the target area, controlling the image acquisition unit to focus on the target area and take a photograph.
Preferably, the performing target detection on the picture by using the deep learning model to obtain position information of a target region includes:
inputting the picture into the deep learning model, wherein the deep learning model is a neural network model;
acquiring position information, output by the deep learning model, of the region where a candidate target object is located;
and acquiring the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
Preferably, the obtaining the position information of the area where the target object is located based on the position information of the area where the candidate target object is located includes:
when the candidate target object is a single object, taking the position information of the area where the candidate target object is located as the position information of the area where the target object is located; or
when the candidate target objects are a plurality of objects, determining the target object from the plurality of objects and extracting the position information of the area where the target object is located.
Preferably, the determining the target object from the plurality of objects includes:
selecting an object with the largest area from the plurality of objects as the target object based on the area size of the region where each object in the plurality of objects is located; or
Selecting an object with the position closest to the central point of the picture from the plurality of objects as the target object based on the position information of the area where each object in the plurality of objects is located; or
Selecting, from the plurality of objects, the object with the highest confidence as the target object, based on the confidence of each object, where the confidence is provided by the deep learning model and represents how reliably the model identified each object; or
Selecting an object with the highest type weight from the plurality of objects as the target object based on the type of each object in the plurality of objects, wherein the weights of different types of objects are different from each other; or
Randomly selecting one object from the plurality of objects as the target object.
Preferably, the training method of the deep learning model includes:
acquiring a plurality of data sets, wherein each data set comprises a plurality of picture materials, each picture material comprises one or more objects, and each object is marked with corresponding name information and position information of a region where the object is located;
and training the plurality of data sets as training samples to obtain the deep learning model.
Preferably, before the performing target detection on the picture by using the deep learning model to obtain the position information of a target region, the method further includes:
acquiring a history picture shot by the user from the electronic equipment locally;
and fine-tuning the deep learning model based on the historical pictures.
Based on the same inventive concept, in a second aspect, the present application provides the following technical solutions through an embodiment of the present application:
an auto-focusing apparatus for use in an electronic device having one or more image capturing units, the apparatus comprising:
the acquisition module is used for acquiring an image of the current scene by using the image acquisition unit to obtain at least one picture;
the detection module is used for performing target detection on the picture with a deep learning model to obtain position information of a target area, where the target area is the area of the current scene that is of interest to the user, the user corresponding to the electronic device;
and the focusing module is used for controlling the image acquisition unit to focus and photograph the target area based on the position information of the target area.
Preferably, the detection module includes:
the input sub-module is used for inputting the picture into the deep learning model, wherein the deep learning model is a neural network model;
the acquisition submodule is used for acquiring the position information of the region where the candidate target object output by the deep learning model is located;
and the obtaining submodule is used for obtaining the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
Preferably, the obtaining submodule is specifically configured to:
when the candidate target object is a single object, taking the position information of the area where the candidate target object is located as the position information of the area where the target object is located; or
when the candidate target objects are a plurality of objects, determining the target object from the plurality of objects and extracting the position information of the area where the target object is located.
Preferably, the obtaining submodule is specifically configured to:
selecting an object with the largest area from the plurality of objects as the target object based on the area size of the region where each object in the plurality of objects is located; or
Selecting an object with the position closest to the central point of the picture from the plurality of objects as the target object based on the position information of the area where each object in the plurality of objects is located; or
Selecting, from the plurality of objects, the object with the highest confidence as the target object, based on the confidence of each object, where the confidence is provided by the deep learning model and represents how reliably the model identified each object; or
Selecting an object with the highest type weight from the plurality of objects as the target object based on the type of each object in the plurality of objects, wherein the weights of different types of objects are different from each other; or
Randomly selecting one object from the plurality of objects as the target object.
Preferably, the auto-focusing apparatus further includes:
the training module is used for training the deep learning model; wherein the training the deep learning model comprises: acquiring a plurality of data sets, wherein each data set comprises a plurality of picture materials, each picture material comprises one or more objects, and each object is marked with corresponding name information and position information of a region where the object is located; and training the plurality of data sets as training samples to obtain the deep learning model.
Preferably, the auto-focusing apparatus further includes:
the acquisition module is used for locally acquiring historical pictures shot by the user from the electronic equipment before the pictures are subjected to target detection by using a deep learning model and position information of a target area is acquired;
and the fine tuning module is used for fine tuning the deep learning model based on the historical pictures.
Based on the same inventive concept, in a third aspect, the present application provides the following technical solutions through an embodiment of the present application:
an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the method steps of any of the embodiments of the first aspect.
Based on the same inventive concept, in a fourth aspect, the present application provides the following technical solutions through an embodiment of the present application:
a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method steps of any of the embodiments of the first aspect.
One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
in an embodiment of the present application, an auto-focusing method is disclosed, applied to an electronic device having one or more image acquisition units. The method includes: acquiring an image of the current scene with the image acquisition unit to obtain at least one picture; performing target detection on the picture with a deep learning model to obtain position information of a target area, where the target area is the area of the current scene that is of interest to the user of the electronic device; and, based on the position information of the target area, controlling the image acquisition unit to focus on the target area and take a photograph. Because the electronic device can determine a target area from the picture with the deep learning model and focus automatically, the technical problems of slow or inaccurate focusing in prior-art focusing methods are solved, automatic focusing for photographing is realized, and both focusing speed and focusing accuracy are improved.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating an auto-focusing method according to an embodiment of the present disclosure;
FIG. 2 is a block diagram of an auto-focusing apparatus according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of an electronic device according to an embodiment of the present disclosure;
fig. 4 is a block diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
The embodiments of the present application provide an auto-focusing method and apparatus, an electronic device, and a computer-readable storage medium. They solve the technical problems of slow or inaccurate focusing in prior-art focusing methods, realize automatic focusing for photographing, and improve both focusing speed and focusing accuracy.
In order to solve the technical problems, the general idea of the embodiment of the application is as follows:
An auto-focusing method applied to an electronic device provided with one or more image acquisition units, the method comprising the following steps: acquiring an image of the current scene with the image acquisition unit to obtain at least one picture; performing target detection on the picture with a deep learning model to obtain position information of a target area, where the target area is the area of the current scene that is of interest to the user of the electronic device; and, based on the position information of the target area, controlling the image acquisition unit to focus on the target area and take a photograph. Because the electronic device can determine a target area from the picture with the deep learning model and focus automatically, the technical problems of slow or inaccurate focusing in prior-art focusing methods are solved, automatic focusing for photographing is realized, and both focusing speed and focusing accuracy are improved.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
First, note that the term "and/or" herein merely describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
Example one
The embodiment provides an auto-focusing method, which is applied to electronic equipment, where the electronic equipment may be: a smart phone, or a tablet computer, or a digital camera, or a game console, or a smart television, etc. Here, the electronic device is not particularly limited in the embodiment as to what kind of device is. And, the electronic device has one or more image capturing units (i.e., cameras) for implementing photographing or video recording functions.
As shown in fig. 1, the auto-focusing method includes:
step S101: and acquiring an image of the current scene by using an image acquisition unit to obtain at least one picture.
In a specific implementation process, after it is detected that the user has started the photographing function, the image acquisition unit is controlled to acquire images, and the acquired images are displayed on the screen of the electronic device in real time for the user to preview; this is the framing process. After the user finishes framing, the posture of the electronic device becomes relatively fixed and the image acquired by the image acquisition unit stabilizes, no longer changing significantly. At this point at least one picture can be captured; this picture is the framing (viewfinder) picture.
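The patent does not specify how "the image tends to be stable" is detected; one common heuristic is to compare consecutive preview frames and treat a small mean pixel difference as stability. A minimal sketch under that assumption (frames flattened to lists of gray values; the function name and threshold are illustrative, not from the patent):

```python
def is_stable(prev_frame, frame, thresh=5.0):
    """Mean absolute pixel difference below thresh => scene is stable."""
    diffs = [abs(a - b) for a, b in zip(prev_frame, frame)]
    return sum(diffs) / len(diffs) < thresh

# Nearly identical frames are stable; very different ones are not.
print(is_stable([10, 10, 10], [11, 9, 12]))      # True
print(is_stable([0, 0, 0], [100, 100, 100]))     # False
```

In practice a real implementation would run on downsampled luma planes from the camera preview stream rather than full frames.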
Step S102: performing target detection on the picture with the deep learning model to obtain position information of a target area, where the target area is the area of the current scene that is of interest to the user, the user corresponding to the electronic device.
In a specific implementation, target detection may be performed on the picture obtained in step S101 (i.e., the framing picture) using a deep learning model (e.g., a neural network model). The purpose of target detection is to identify a target area in the picture and obtain its position information, where the target area is the area of the current scene that is of interest to the user. For example, if an object the user is interested in (hereinafter the "target object") is located in a certain area, the area where that object is located is the target area.
In a specific implementation process, the user refers to a user of the electronic device, and for example, a smart phone, the user generally refers to an owner of the smart phone.
As an alternative embodiment, step S102 includes:
inputting the picture into a deep learning model, wherein the deep learning model is specifically an SSD neural network model; acquiring position information of a region where a candidate target object is output by a deep learning model; and obtaining the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
In a specific implementation process, a pre-trained deep learning model may be obtained, here specifically an SSD neural network model. The picture obtained in step S101 is input into the SSD neural network model, which can identify the name and position of each object in the picture, thereby obtaining name information and position information for each object. These objects are the candidate target objects; a target object is then determined from among them, the area where the target object is located is the target area, and the position information of the target area is obtained.
SSD: short for Single Shot MultiBox Detector, a one-stage object detection algorithm.
Target detection (object detection): an algorithm for detecting target areas in an image.
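For illustration, the detector's output can be treated as a list of (name, box, confidence) triples; the sketch below (the function name and triple format are assumptions, not from the patent — a real SSD such as torchvision's returns boxes, labels, and scores in a similar form) converts the highest-confidence detection into (x, y, width, height) position information for the target area:

```python
def target_area_from_detections(detections):
    """Pick the highest-confidence detection and return its box as
    (x, y, width, height) position information, or None if empty."""
    if not detections:
        return None
    name, (x1, y1, x2, y2), conf = max(detections, key=lambda d: d[2])
    return (x1, y1, x2 - x1, y2 - y1)

# Hypothetical model output: two candidate target objects.
dets = [("bicycle",    (10, 20, 110, 220),  0.70),
        ("motorcycle", (150, 30, 400, 260), 0.85)]
print(target_area_from_detections(dets))  # (150, 30, 250, 230)
```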
As an alternative embodiment, the obtaining the position information of the area where the target object is located based on the position information of the area where the candidate target object is located includes:
when the candidate target object is an object, taking the position information of the area where the candidate target object is as the position information of the area where the target object is; or
When the candidate target objects are a plurality of objects, the target objects are determined from the plurality of objects, and the position information of the area where the target objects are located is extracted.
In the implementation process, when the deep learning model detects a single object (i.e., one candidate target object) in the picture, that object is the final target object. When the deep learning model detects a plurality of objects (i.e., multiple candidate target objects) in the picture, one of them must be determined as the target object.
As an alternative embodiment, determining the target object from the plurality of objects includes the following manners (1) to (5):
(1) Selecting, from the plurality of objects, the object with the largest area as the target object, based on the area of the region where each object is located.
In the implementation process, generally speaking, the more interested a user is in an object, the larger the area that object occupies in the viewfinder picture: it is the focus of the framing. Therefore, the area each object occupies in the picture (i.e., the viewfinder picture) can be calculated from the position information (i.e., coordinate information) of the region where it is located, and the object with the largest area taken as the target object. That object is the one the user is interested in, and its region is the region of interest (i.e., the target area).
(2) Selecting, from the plurality of objects, the object closest to the center point of the picture as the target object, based on the position information of the area where each object is located.
In practice, generally speaking, the more interested a user is in an object, the closer that object is to the center point of the viewfinder picture. Therefore, the center-point coordinates of each object in the picture (i.e., the viewfinder picture) can be calculated from the position information (i.e., coordinate information) of the area where it is located, and the object whose center point is closest to the picture's center point taken as the target object. That object is the one the user is interested in, and its area is the area of interest (i.e., the target area).
(3) Selecting, from the plurality of objects, the object with the highest confidence as the target object, based on the confidence of each object, where the confidence is provided by the deep learning model and represents how reliably the model identified each object.
In a specific implementation process, when the deep learning model outputs the name information and position information of each object in the picture, it can also output a confidence for each object, representing how reliable the model's identification of that object is. For example, the model may recognize object A in the picture as a "bicycle" with 70% confidence (i.e., 70% certainty that object A is a bicycle) and object B as a "motorcycle" with 85% confidence (i.e., 85% certainty that object B is a motorcycle). In that case, object B may be selected as the target object.
(4) Selecting, from the plurality of objects, the object whose type has the highest weight as the target object, based on the type of each object, where different types of objects have different weights.
In a specific implementation process, different weights can be set for different types of objects, for example: weight of "person" > "animal" > "car" > "building". The weights can reflect the user's preferences; for example, a user may photograph people most often, animals next, then cars, and buildings least. In this way, when the deep learning model recognizes a plurality of objects, they can be sorted by type weight and the object with the highest weight selected as the target object. For example, if a "dog" and a "car" are recognized in the picture, the "dog" is taken as the target object because the weight of "animal" is greater than that of "car".
(5) Randomly selecting one object from the plurality of objects as the target object.
In the specific implementation process, manners (1) to (5) can also be used in combination, which is not described again here.
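The selection strategies above — largest area, nearest to the picture center, highest confidence, highest type weight, and random choice — can all be expressed as key functions over the candidate list. A hedged sketch (the function, field names, and example weights are assumptions for illustration, not from the patent):

```python
import random

# Example type weights: "person" > "animal" > "car" > "building".
TYPE_WEIGHT = {"person": 4, "animal": 3, "car": 2, "building": 1}

def box_area(c):
    x1, y1, x2, y2 = c["box"]
    return (x2 - x1) * (y2 - y1)

def center_distance(c, img_w, img_h):
    x1, y1, x2, y2 = c["box"]
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    return (cx - img_w / 2) ** 2 + (cy - img_h / 2) ** 2

def pick_target(cands, mode, img_w=0, img_h=0):
    """Each candidate is a dict with a 'box' (x1, y1, x2, y2),
    a model 'conf'idence, and an object 'type'."""
    if len(cands) == 1:           # single candidate: use it directly
        return cands[0]
    if mode == "area":            # largest area
        return max(cands, key=box_area)
    if mode == "center":          # closest to picture center
        return min(cands, key=lambda c: center_distance(c, img_w, img_h))
    if mode == "confidence":      # highest model confidence
        return max(cands, key=lambda c: c["conf"])
    if mode == "type":            # highest type weight
        return max(cands, key=lambda c: TYPE_WEIGHT.get(c["type"], 0))
    return random.choice(cands)   # random fallback

cands = [
    {"type": "car",    "box": (0, 0, 200, 100),     "conf": 0.9},
    {"type": "animal", "box": (300, 200, 380, 260), "conf": 0.7},
]
print(pick_target(cands, "type")["type"])        # animal
print(pick_target(cands, "confidence")["type"])  # car
```

Combining strategies, as the text suggests, would amount to composing these keys, e.g. breaking ties on area by confidence.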
After the target object is determined from the plurality of objects, the area where the target object is located is the target area interested by the user, and the position information of the target area is further obtained, so that automatic focusing can be achieved, and the focusing is faster and more accurate.
As an optional embodiment, the training method of the deep learning model includes: acquiring a plurality of data sets, wherein each data set comprises a plurality of picture materials, each picture material comprises one or more objects, and each object is marked with corresponding name information and position information of a region where the object is located; and training a plurality of data sets as training samples to obtain a deep learning model.
In the specific implementation process, the deep learning model needs to be trained in advance. Specifically, the PASCAL VOC2012 data set may be used as the training sample and input into the deep learning model for training, finally yielding the deep learning model required in this embodiment.
The PASCAL VOC2012 data set currently contains 11,700 picture materials. Each picture material contains one or more objects covering many aspects of life, such as people, animals, plants, buildings, landscapes, vehicles, clothing, household goods, electronic products, cultural goods, medical goods, and the like. Each object is labeled with its name information and the position information of the region where it is located (i.e., its position within the picture material); in total, the data set labels about 27,000 objects (counting repeated occurrences of the same kind of object). For example, if a picture material contains an image of a bicycle, the position coordinates of the bicycle and the name "bicycle" are marked in the picture's attribute information.
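In PASCAL VOC, each picture material's labels are stored in a per-image XML file giving every object's name and bounding box. A minimal parsing sketch using only the standard library (the XML snippet is an illustrative example in VOC format, not an actual entry from the data set):

```python
import xml.etree.ElementTree as ET

VOC_XML = """
<annotation>
  <filename>example.jpg</filename>
  <object>
    <name>bicycle</name>
    <bndbox><xmin>34</xmin><ymin>50</ymin><xmax>200</xmax><ymax>310</ymax></bndbox>
  </object>
</annotation>
"""

def parse_voc(xml_text):
    """Return a list of (name, (xmin, ymin, xmax, ymax)) labels."""
    root = ET.fromstring(xml_text)
    labels = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        box = tuple(int(bb.findtext(t)) for t in ("xmin", "ymin", "xmax", "ymax"))
        labels.append((name, box))
    return labels

print(parse_voc(VOC_XML))  # [('bicycle', (34, 50, 200, 310))]
```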
Of course, other data sets may be used in addition to the PASCAL VOC2012 data set, and are not specifically limited herein.
In a specific implementation process, when the PASCAL VOC2012 data set is used as the training sample, the deep learning model learns the name information and position information of every object in every picture material. After training, when any picture is input into the model, it can identify what objects are in the picture (i.e., obtain their name information) and where they are located in the picture (i.e., obtain their position information).
In more detail, after a picture is input into the SSD neural network model, it undergoes operations such as convolution and pooling to obtain feature maps of different sizes; k candidate boxes of different aspect ratios are generated for each position on each feature map, forming a large number of candidate boxes (for example, 8,732 candidate boxes when k is 9); finally, the region where each object is located is determined by a non-maximum suppression algorithm, and the position information of that region is generated.
Non-maximum suppression algorithm: an algorithm that eliminates redundant candidate boxes and finds the best detection position for each object.
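A greedy non-maximum suppression pass keeps the highest-scoring candidate box and discards any remaining box that overlaps it beyond an intersection-over-union (IoU) threshold, then repeats. A pure-Python sketch for illustration (production detectors use optimized implementations; the 0.5 threshold is a common but arbitrary choice):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedily keep the highest-scoring boxes, suppressing overlaps.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 100, 100), (10, 10, 110, 110), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]  (box 1 overlaps box 0 too much)
```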
As an alternative embodiment, before step S102, the method further includes:
acquiring a history picture shot by a user from the electronic equipment locally; and fine-tuning the deep learning model based on the historical pictures.
In particular implementations, before the deep learning model is used, it may also be fine-tuned based on historical photos stored locally on the electronic device. For example, historical photos taken by the user are obtained from the smartphone's album and input into the deep learning model to fine-tune it. In this way, the deep learning model can learn the user's photographing habits and preferences, for example whether the user prefers to photograph people or animals, cars or buildings.
After the deep learning model has been fine-tuned in this way, when target recognition is performed on the picture (i.e., the preview picture), objects the user likes are recognized more easily; such an object is taken as the target object, and the area where it is located is obtained as the target area. Recognition accuracy is higher, the area of the current scene that interests the user is identified more precisely, and focusing is faster and more accurate.
Step S103: controlling the image acquisition unit, based on the position information of the target area, to focus on the target area and take a photograph.
In a specific implementation, after the position information of the target area is obtained, the image acquisition unit (i.e., the camera) can be controlled, based on that position information, to focus on the target area. The purpose of focusing is to bring the focus of the image acquisition unit onto the target area of interest to the user, so that the photographed picture is clearer. Further, when the user is detected to trigger the shutter key, a photograph is taken and the resulting picture (i.e., the photo) is saved to an album or another folder.
In this way, the auto-focusing method of this embodiment determines a target region from the picture (i.e., the preview picture) using a deep learning model and focuses automatically. Compared with manual focusing in the prior art, focusing is faster, operation is simpler, and focusing is unaffected in scenes where a selfie stick is used. Compared with prior-art auto-focusing methods, this method automatically identifies the area of interest to the user in the current scene and focuses on it, so focusing is more accurate and more readily matches the user's intention.
The technical scheme in the embodiment of the application at least has the following technical effects or advantages:
in an embodiment of the present application, an auto-focusing method is disclosed, which is applied to an electronic device having one or more image acquisition units, and the method includes: acquiring an image of the current scene by using the image acquisition unit to obtain at least one picture; performing target detection on the picture by using a deep learning model to obtain position information of a target area, wherein the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device; and controlling the image acquisition unit to focus on and photograph the target area based on the position information of the target area. Because the electronic device can determine a target area from the picture using the deep learning model and focus automatically, the technical problems of slow or inaccurate focusing in prior-art focusing methods are solved: photographing is focused automatically, focusing speed is increased, and focusing accuracy is improved.
Example two
Based on the same inventive concept, as shown in fig. 2, the present embodiment provides an auto-focusing apparatus 200, applied in an electronic device having one or more image capturing units, the apparatus comprising:
the acquisition module 201 is configured to acquire an image of a current scene by using the image acquisition unit to obtain at least one picture;
the detection module 202 is configured to perform target detection on the picture by using a deep learning model to obtain position information of a target area, where the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device;
and the focusing module 203 is configured to control the image acquisition unit to focus and photograph the target area based on the position information of the target area.
As an alternative embodiment, the detection module 202 includes:
the input sub-module is used for inputting the picture into the deep learning model, wherein the deep learning model is a neural network model;
the acquisition submodule is used for acquiring the position information of the region where the candidate target object output by the deep learning model is located;
and the obtaining submodule is used for obtaining the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
As an optional embodiment, the obtaining submodule is specifically configured to:
when the candidate target object is a single object, take the position information of the area where the candidate target object is located as the position information of the area where the target object is located; or
when the candidate target objects are a plurality of objects, determine the target object from the plurality of objects and extract the position information of the area where the target object is located.
As an optional embodiment, the obtaining submodule is specifically configured to:
select, from the plurality of objects, the object with the largest area as the target object, based on the area of the region where each of the plurality of objects is located; or
select, from the plurality of objects, the object closest to the center point of the picture as the target object, based on the position information of the region where each of the plurality of objects is located; or
select, from the plurality of objects, the object with the highest confidence as the target object, based on the confidence of each of the plurality of objects, wherein the confidence is provided by the deep learning model and represents how reliably the model has identified each object; or
select, from the plurality of objects, the object whose type has the highest weight as the target object, based on the type of each of the plurality of objects, wherein different types of objects have different weights; or
randomly select one object from the plurality of objects as the target object.
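The five selection strategies above can be sketched as a single dispatch function. The object-dict layout ("box", "score", "label"), the strategy names, and normalized coordinates are illustrative assumptions, not a format defined by the patent.

```python
import math
import random

def pick_target(objects, strategy="area", picture_center=(0.5, 0.5),
                type_weights=None):
    """Choose one target object from the detector's candidates.
    Each object is a dict with 'box' (x1, y1, x2, y2, normalized),
    'score' (detector confidence) and 'label' (object type)."""
    def box_area(o):
        x1, y1, x2, y2 = o["box"]
        return (x2 - x1) * (y2 - y1)

    def center_distance(o):
        x1, y1, x2, y2 = o["box"]
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        return math.hypot(cx - picture_center[0], cy - picture_center[1])

    if strategy == "area":          # largest region wins
        return max(objects, key=box_area)
    if strategy == "center":        # closest to the picture center wins
        return min(objects, key=center_distance)
    if strategy == "confidence":    # most reliable detection wins
        return max(objects, key=lambda o: o["score"])
    if strategy == "type":          # highest-weighted object class wins
        return max(objects, key=lambda o: type_weights.get(o["label"], 0))
    return random.choice(objects)   # fallback: random pick
```

For example, with a large car near a corner and a small, high-confidence person at the center, the "area" strategy picks the car while "center" and "confidence" pick the person.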
As an alternative embodiment, the automatic focusing apparatus 200 further includes:
the training module is used for training the deep learning model, wherein the training comprises: acquiring a plurality of data sets, wherein each data set comprises a plurality of picture materials, each picture material contains one or more objects, and each object is labeled with its corresponding name information and the position information of the region where it is located; and training with the plurality of data sets as training samples to obtain the deep learning model.
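The annotated picture material handled by the training module can be pictured as a simple record; the field names below are illustrative assumptions (real detection datasets such as PASCAL VOC carry the same name-plus-region information in XML annotations).

```python
# One labelled picture material: each object carries its name and the
# position (x1, y1, x2, y2, in pixels) of the region where it is located.
picture_material = {
    "image": "photo_0001.jpg",
    "objects": [
        {"name": "person", "region": (120, 40, 380, 600)},
        {"name": "dog", "region": (400, 420, 620, 610)},
    ],
}

def annotation_is_valid(material):
    """Check every labelled object has a name and a well-formed region."""
    return all(
        o["name"]
        and o["region"][0] < o["region"][2]
        and o["region"][1] < o["region"][3]
        for o in material["objects"]
    )

print(annotation_is_valid(picture_material))  # True
```

A training pipeline would iterate over many such records, feeding the images and their labelled regions to the detector as supervision.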
As an alternative embodiment, the automatic focusing apparatus 200 further includes:
the acquisition module is used for acquiring historical pictures taken by the user from local storage of the electronic device before target detection is performed on the picture by using the deep learning model to obtain the position information of a target area;
and the fine tuning module is used for fine tuning the deep learning model based on the historical pictures.
Since the apparatus described in this embodiment is the apparatus used to implement the auto-focusing method of the first embodiment, a person skilled in the art can understand the specific implementation of this apparatus and its various variations based on the auto-focusing method described above, and therefore how the apparatus implements that method is not described in detail here. Any apparatus used by a person skilled in the art to implement the auto-focusing method of the embodiments of the present application falls within the scope of the present application.
The technical scheme in the embodiment of the application at least has the following technical effects or advantages:
in an embodiment of the present application, an auto-focusing apparatus is disclosed, which is applied to an electronic device having one or more image acquisition units, and the apparatus includes: an acquisition module for acquiring an image of the current scene by using the image acquisition unit to obtain at least one picture; a detection module for performing target detection on the picture by using a deep learning model to obtain position information of a target area, where the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device; and a focusing module for controlling the image acquisition unit to focus on and photograph the target area based on the position information of the target area. Because the electronic device can determine a target area from the picture using the deep learning model and focus automatically, the technical problems of slow or inaccurate focusing in prior-art focusing methods are solved: photographing is focused automatically, focusing speed is increased, and focusing accuracy is improved.
EXAMPLE III
Based on the same inventive concept, as shown in fig. 3, the present embodiment provides an electronic device 300, which includes a memory 310, a processor 320, and a computer program 311 stored in the memory 310 and executable on the processor 320, wherein the processor 320 executes the computer program 311 to implement the following method steps:
acquiring an image of a current scene by using the image acquisition unit to obtain at least one picture; performing target detection on the picture by using a deep learning model to obtain position information of a target area, wherein the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device; and controlling the image acquisition unit to focus on and photograph the target area based on the position information of the target area.
In a specific implementation, when the processor 320 executes the program 311, any of the method steps of the first embodiment may also be implemented.
Example four
Based on the same inventive concept, as shown in fig. 4, the present embodiment provides a computer-readable storage medium 400, on which a computer program 411 is stored, the computer program 411 implementing the following steps when being executed by a processor:
acquiring an image of a current scene by using the image acquisition unit to obtain at least one picture; performing target detection on the picture by using a deep learning model to obtain position information of a target area, wherein the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device; and controlling the image acquisition unit to focus on and photograph the target area based on the position information of the target area.
In a specific implementation, the computer program 411, when executed by a processor, may implement any of the method steps of the first embodiment.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the various inventive aspects. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following this detailed description are hereby expressly incorporated into it, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of an autofocus device, electronic device, and the like in accordance with embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names.
A1. An auto-focusing method, applied to an electronic device having one or more image acquisition units, wherein the method comprises:
acquiring an image of a current scene by using the image acquisition unit to obtain at least one picture;
performing target detection on the picture by using a deep learning model to obtain position information of a target area, wherein the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device;
and controlling the image acquisition unit to focus and photograph the target area based on the position information of the target area.
A2. The auto-focusing method of A1, wherein the performing target detection on the picture by using a deep learning model to obtain position information of a target area comprises:
inputting the picture into the deep learning model, wherein the deep learning model is a neural network model;
acquiring the position information, output by the deep learning model, of the region where a candidate target object is located;
and acquiring the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
A3. The auto-focusing method of A2, wherein the acquiring position information of the area where the target object is located based on the position information of the area where the candidate target object is located comprises:
when the candidate target object is a single object, taking the position information of the area where the candidate target object is located as the position information of the area where the target object is located; or
when the candidate target objects are a plurality of objects, determining the target object from the plurality of objects and extracting the position information of the area where the target object is located.
A4. The auto-focusing method of A3, wherein the determining the target object from the plurality of objects comprises:
selecting, from the plurality of objects, the object with the largest area as the target object, based on the area of the region where each of the plurality of objects is located; or
selecting, from the plurality of objects, the object closest to the center point of the picture as the target object, based on the position information of the region where each of the plurality of objects is located; or
selecting, from the plurality of objects, the object with the highest confidence as the target object, based on the confidence of each of the plurality of objects, wherein the confidence is provided by the deep learning model and represents how reliably the model has identified each object; or
selecting, from the plurality of objects, the object whose type has the highest weight as the target object, based on the type of each of the plurality of objects, wherein different types of objects have different weights; or
randomly selecting one object from the plurality of objects as the target object.
A5. The auto-focusing method of A1, wherein the training method of the deep learning model comprises:
acquiring a plurality of data sets, wherein each data set comprises a plurality of picture materials, each picture material contains one or more objects, and each object is labeled with its corresponding name information and the position information of the region where it is located;
and training with the plurality of data sets as training samples to obtain the deep learning model.
A6. The auto-focusing method of any one of A1 to A5, wherein before the performing target detection on the picture by using the deep learning model to obtain position information of a target area, the method further comprises:
acquiring historical pictures taken by the user from local storage of the electronic device;
and fine-tuning the deep learning model based on the historical pictures.
B7. An auto-focusing apparatus, applied to an electronic device having one or more image acquisition units, the apparatus comprising:
the acquisition module is used for acquiring an image of the current scene by using the image acquisition unit to obtain at least one picture;
the detection module is used for performing target detection on the picture by using a deep learning model to obtain position information of a target area, wherein the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device;
and the focusing module is used for controlling the image acquisition unit to focus and photograph the target area based on the position information of the target area.
B8. The auto-focusing apparatus of B7, wherein the detection module comprises:
the input sub-module is used for inputting the picture into the deep learning model, wherein the deep learning model is a neural network model;
the acquisition submodule is used for acquiring the position information of the region where the candidate target object output by the deep learning model is located;
and the obtaining submodule is used for obtaining the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
B9. The auto-focusing apparatus of B8, wherein the obtaining submodule is specifically configured to:
when the candidate target object is a single object, take the position information of the area where the candidate target object is located as the position information of the area where the target object is located; or
when the candidate target objects are a plurality of objects, determine the target object from the plurality of objects and extract the position information of the area where the target object is located.
B10. The auto-focusing apparatus of B9, wherein the obtaining submodule is specifically configured to:
select, from the plurality of objects, the object with the largest area as the target object, based on the area of the region where each of the plurality of objects is located; or
select, from the plurality of objects, the object closest to the center point of the picture as the target object, based on the position information of the region where each of the plurality of objects is located; or
select, from the plurality of objects, the object with the highest confidence as the target object, based on the confidence of each of the plurality of objects, wherein the confidence is provided by the deep learning model and represents how reliably the model has identified each object; or
select, from the plurality of objects, the object whose type has the highest weight as the target object, based on the type of each of the plurality of objects, wherein different types of objects have different weights; or
randomly select one object from the plurality of objects as the target object.
B11. The auto-focusing apparatus of B7, further comprising:
the training module is used for training the deep learning model, wherein the training comprises: acquiring a plurality of data sets, wherein each data set comprises a plurality of picture materials, each picture material contains one or more objects, and each object is labeled with its corresponding name information and the position information of the region where it is located; and training with the plurality of data sets as training samples to obtain the deep learning model.
B12. The auto-focusing apparatus of any one of B7 to B11, further comprising:
the acquisition module is used for acquiring historical pictures taken by the user from local storage of the electronic device before target detection is performed on the picture by using the deep learning model to obtain the position information of a target area;
and the fine tuning module is used for fine tuning the deep learning model based on the historical pictures.
C13. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method steps of any one of A1 to A6.
D14. A computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method steps of any one of A1 to A6.

Claims (10)

1. An auto-focusing method applied to an electronic device having one or more image capturing units, the method comprising:
acquiring an image of a current scene by using the image acquisition unit to obtain at least one picture;
performing target detection on the picture by using a deep learning model to obtain position information of a target area, wherein the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device;
and controlling the image acquisition unit to focus and photograph the target area based on the position information of the target area.
2. The auto-focusing method of claim 1, wherein the performing target detection on the picture by using the deep learning model to obtain the position information of the target area comprises:
inputting the picture into the deep learning model, wherein the deep learning model is a neural network model;
acquiring the position information, output by the deep learning model, of the region where a candidate target object is located;
and acquiring the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
3. The auto-focusing method of claim 2, wherein the obtaining of the position information of the area where a target object is located based on the position information of the area where the candidate target object is located comprises:
when the candidate target object is a single object, taking the position information of the area where the candidate target object is located as the position information of the area where the target object is located; or
when the candidate target objects are a plurality of objects, determining the target object from the plurality of objects and extracting the position information of the area where the target object is located.
4. The auto-focusing method of claim 3, wherein said determining the target object from the plurality of objects comprises:
selecting, from the plurality of objects, the object with the largest area as the target object, based on the area of the region where each of the plurality of objects is located; or
selecting, from the plurality of objects, the object closest to the center point of the picture as the target object, based on the position information of the region where each of the plurality of objects is located; or
selecting, from the plurality of objects, the object with the highest confidence as the target object, based on the confidence of each of the plurality of objects, wherein the confidence is provided by the deep learning model and represents how reliably the model has identified each object; or
selecting, from the plurality of objects, the object whose type has the highest weight as the target object, based on the type of each of the plurality of objects, wherein different types of objects have different weights; or
randomly selecting one object from the plurality of objects as the target object.
5. The auto-focusing method of claim 1, wherein the training method of the deep learning model comprises:
acquiring a plurality of data sets, wherein each data set comprises a plurality of picture materials, each picture material contains one or more objects, and each object is labeled with its corresponding name information and the position information of the region where it is located;
and training with the plurality of data sets as training samples to obtain the deep learning model.
6. The auto-focusing method of any one of claims 1 to 5, wherein before the performing the target detection on the picture by using the deep learning model to obtain the position information of a target region, the method further comprises:
acquiring historical pictures taken by the user from local storage of the electronic device;
and fine-tuning the deep learning model based on the historical pictures.
7. An auto-focusing apparatus for use in an electronic device having one or more image capturing units, the apparatus comprising:
the acquisition module is used for acquiring an image of the current scene by using the image acquisition unit to obtain at least one picture;
the detection module is used for performing target detection on the picture by using a deep learning model to obtain position information of a target area, wherein the target area is an area of interest to the user in the current scene, and the user corresponds to the electronic device;
and the focusing module is used for controlling the image acquisition unit to focus and photograph the target area based on the position information of the target area.
8. The autofocus apparatus of claim 7, wherein the detection module comprises:
the input sub-module is used for inputting the picture into the deep learning model, wherein the deep learning model is a neural network model;
the acquisition submodule is used for acquiring the position information of the region where the candidate target object output by the deep learning model is located;
and the obtaining submodule is used for obtaining the position information of the area where the target object is located based on the position information of the area where the candidate target object is located, and taking the position information of the area where the target object is located as the position information of the target area.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, is adapted to carry out the method steps of any of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method steps of any of claims 1 to 6.
CN201910920070.1A 2019-09-26 2019-09-26 Automatic focusing method and device Pending CN112565586A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910920070.1A CN112565586A (en) 2019-09-26 2019-09-26 Automatic focusing method and device

Publications (1)

Publication Number Publication Date
CN112565586A true CN112565586A (en) 2021-03-26

Family

ID=75030119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910920070.1A Pending CN112565586A (en) 2019-09-26 2019-09-26 Automatic focusing method and device

Country Status (1)

Country Link
CN (1) CN112565586A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106713761A (en) * 2017-01-11 2017-05-24 中控智慧科技股份有限公司 Image processing method and apparatus
CN108712609A (en) * 2018-05-17 2018-10-26 Oppo广东移动通信有限公司 Focusing process method, apparatus, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115527100A (en) * 2022-10-09 2022-12-27 范孝徐 Identification analysis system and method for associated database
CN115527100B (en) * 2022-10-09 2023-05-23 广州佳禾科技股份有限公司 Identification analysis system and method for association database

Similar Documents

Publication Publication Date Title
US11483268B2 (en) Content navigation with automated curation
CN101910936B (en) Guided photography based on image capturing device rendered user recommendations
CN108401112B (en) Image processing method, device, terminal and storage medium
US20110314049A1 (en) Photography assistant and method for assisting a user in photographing landmarks and scenes
TW201011696A (en) Information registering device for detection, target sensing device, electronic equipment, control method of information registering device for detection, control method of target sensing device, information registering device for detection control progr
TWI586160B (en) Real time object scanning using a mobile phone and cloud-based visual search engine
US20140133764A1 (en) Automatic curation of digital images
US11514713B2 (en) Face quality of captured images
WO2015145769A1 (en) Imaging device, information processing device, photography assistance system, photography assistance program, and photography assistance method
CN110581950B (en) Camera, system and method for selecting camera settings
CN103945116A (en) Apparatus and method for processing image in mobile terminal having camera
CN108780568A (en) A kind of image processing method, device and aircraft
CN107203646A (en) A kind of intelligent social sharing method and device
JP6323548B2 (en) Imaging assistance system, imaging apparatus, information processing apparatus, imaging assistance program, and imaging assistance method
CN103353879B (en) Image processing method and apparatus
CN110047115B (en) Star image shooting method and device, computer equipment and storage medium
CN111314620B (en) Photographing method and apparatus
CN112565586A (en) Automatic focusing method and device
CN114697539A (en) Photographing recommendation method and device, electronic equipment and storage medium
CN112184722A (en) Image processing method, terminal and computer storage medium
CN116109922A (en) Bird recognition method, bird recognition apparatus, and bird recognition system
CN114677620A (en) Focusing method, electronic device and computer readable medium
CN113989387A (en) Camera shooting parameter adjusting method and device and electronic equipment
CN117459830B (en) Automatic zooming method and system for mobile equipment
JP2017184021A (en) Content providing device and content providing program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination