WO2020042126A1 - Focusing apparatus, method, and related device - Google Patents

Focusing apparatus, method, and related device

Info

Publication number
WO2020042126A1
Authority
WO
WIPO (PCT)
Prior art keywords
roi
image
target
effective
information
Prior art date
Application number
PCT/CN2018/103370
Other languages
English (en)
Chinese (zh)
Inventor
马彦鹏
宋永福
杨琪
王军
陈聪
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to PCT/CN2018/103370
Priority to CN201880096896.4A
Publication of WO2020042126A1

Links

Images

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/02: Constructional features of telephone sets
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60: Control of cameras or camera modules

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to a focusing device, method, and related equipment.
  • Smartphone camera technology is moving toward SLR-level quality.
  • SLR: single-lens reflex, a camera design in which the viewfinder and the sensor share a single lens.
  • Many smartphone cameras have already surpassed traditional compact ("point-and-shoot") cameras in imaging capability.
  • High-quality photography relies on high-precision focusing technology.
  • When shooting static scenes, existing focusing technology generally places the focus point at the center of the frame. This focusing method meets the needs of most consumers.
  • When the subject is not at the center, however, center focus often leaves the shooting target blurred.
  • When shooting dynamic scenes, especially a fast-moving target, this fixed center focus cannot meet users' needs, so high-precision motion-tracking focus technology is urgently needed.
  • Embodiments of the present invention provide a focusing device, method, and related equipment to improve focusing accuracy.
  • an embodiment of the present invention provides a focusing device, including a processor, a neural network processor and an image signal processor coupled to the processor; the image signal processor is configured to generate a first image
  • the neural network processor is configured to obtain a first region of interest (ROI) set in the first image, where the first ROI set includes one or more first ROIs, and each first ROI includes one shooting object; the processor is configured to: obtain a second ROI set in the first image, where the second ROI set includes one or more second ROIs, and each second ROI is a motion region; determine a target ROI in the first image based on the first ROI set and the second ROI set; determine characteristic information of the target ROI; identify position information and size information of the target ROI in a second image generated by the image signal processor according to the characteristic information of the target ROI, where the first image is located before the second image in the time domain; and perform focusing according to the position information and size information.
  • one or more candidate shooting objects are obtained by using the NPU for AI object detection on the image frames generated by the ISP in the focusing device, and one or more candidate motion regions are obtained by using the processor for moving-object detection.
  • the detected shooting objects and motion regions are then combined to determine the target ROI to be focused, and subsequent tracking and focusing are performed based on the characteristic information of the target ROI. That is, AI target detection and moving-target detection are used to automatically identify the target ROI in the field of view (FOV); the target-ROI tracking algorithm then accurately calculates the real-time motion trajectory and size of the target ROI; finally, the autofocus (AF) algorithm follows the calculated movement track to perform motion follow-focus.
  • the entire process requires no manual intervention by the user, and the tracking focus is accurate, which greatly improves the shooting experience and results.
  • the processor is specifically configured to: determine a valid first ROI from the one or more first ROIs in the first ROI set, where the valid first ROI is within a first preset region of the first image; determine a valid second ROI from the one or more second ROIs in the second ROI set, where the valid second ROI is within a second preset region of the first image; and, in a case where the intersection-over-union ratio of the valid first ROI and the valid second ROI is greater than or equal to a preset threshold, determine the valid first ROI as the target ROI.
  • the first ROI set and the second ROI set are filtered to improve the recognition accuracy of the target ROI. When the overlapping area between the valid first ROI and the valid second ROI is large, it indicates that both the subject detection and the motion detection are likely to cover the valid first ROI, so the valid first ROI can be used as the target ROI.
  • the processor is further specifically configured to: when the intersection-over-union ratio of the valid first ROI and the valid second ROI is less than the preset threshold, determine, of the valid second ROI and the valid first ROI, the one closer to the center point of the first image as the target ROI.
  • when the overlapping area between the valid first ROI and the valid second ROI is small, it may indicate that the detection at this time is incorrect or that the target ROI is drifting, so the ROI closer to the center point may be selected as the target ROI.
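  • As an illustration only (not the claimed implementation), the two selection rules above can be sketched in a few lines; the box format (x1, y1, x2, y2), the helper names, and the default threshold of 0.5 are assumptions:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def center_dist(box, center):
    """Distance from the box center to a reference point."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    return ((cx - center[0]) ** 2 + (cy - center[1]) ** 2) ** 0.5

def select_target_roi(valid_first, valid_second, frame_center, threshold=0.5):
    # High overlap: subject detection and motion detection agree,
    # so the AI-detected (first) ROI is taken as the target ROI.
    if iou(valid_first, valid_second) >= threshold:
        return valid_first
    # Low overlap: possible misdetection or drift; fall back to the
    # ROI closer to the image center.
    return min((valid_first, valid_second),
               key=lambda b: center_dist(b, frame_center))
```

For example, boxes (0, 0, 10, 10) and (1, 1, 11, 11) have IoU 81/119 ≈ 0.68, so under the default threshold the first (AI-detected) box would be kept.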
  • the valid first ROI has the highest evaluation score among the one or more first ROIs within the first preset region of the first image; and/or the valid second ROI has the highest evaluation score among the one or more second ROIs within the second preset region of the first image; where the evaluation score of each ROI satisfies at least one of the following: it is proportional to the area of the ROI, inversely proportional to the distance of the ROI from the center point of the first image, and proportional to the priority of the object category to which the ROI belongs.
  • when multiple candidate ROIs remain after the processor filters by the preset region, the processor judges them by ROI area, by distance from the center point of the first image, and by the priority of the category to which the subject belongs, and selects the ROI most likely to be tracked and focused.
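  • A toy scoring function consistent with the three proportionalities above might look as follows; the weights w_area, w_dist, and w_prio are illustrative assumptions and would need tuning in practice:

```python
def roi_score(box, category_priority, frame_center,
              w_area=1.0, w_dist=1.0, w_prio=1.0, eps=1e-6):
    """Score a candidate ROI given as (x1, y1, x2, y2)."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    dist = ((cx - frame_center[0]) ** 2 + (cy - frame_center[1]) ** 2) ** 0.5
    # proportional to area and category priority,
    # inversely proportional to distance from the image center
    return w_area * area + w_dist / (dist + eps) + w_prio * category_priority

def pick_valid_roi(rois, priorities, frame_center):
    """Return the highest-scoring ROI among candidates already inside the preset region."""
    scored = zip(rois, priorities)
    return max(scored, key=lambda rp: roi_score(rp[0], rp[1], frame_center))[0]
```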
  • the processor is further configured to update the feature information of the target ROI based on the feature information corresponding to the position and size of the target ROI in the historical image.
  • the characteristic information of the target ROI is determined according to the characteristic information of the first image corresponding to the target ROI and the characteristic information of at least one third image, where the at least one third image is located between the first image and the second image in the time domain.
  • the processor not only needs to determine the initial value of the target ROI, but also needs to update the feature information in real time based on the motion tracking situation of the target ROI to more accurately track the focus.
  • the processor is further configured to: recalculate the target ROI after a first preset time period; or recalculate the target ROI when the tracking confidence of the target ROI is less than a confidence threshold, where the tracking confidence is used to indicate the tracking accuracy of the target ROI, and the tracking confidence is directly proportional to the tracking accuracy.
  • the processor not only needs to update the feature information in real time based on the tracking situation of the target ROI so as to track focus more accurately, but the updated feature information must also remain timely.
  • when the tracking confidence of the target ROI is low, the processor should consider re-initializing the related parameters and performing a new round of target ROI confirmation and tracking.
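  • The two re-initialization triggers (a fixed refresh period and a tracking-confidence floor) could be wrapped as in this hypothetical sketch; the class name, the injected clock, and the default values are assumptions, not details from the patent:

```python
import time

class RoiTracker:
    """Illustrative wrapper around the two re-initialization triggers."""

    def __init__(self, refresh_period_s=2.0, conf_threshold=0.3,
                 now=time.monotonic):
        self.refresh_period_s = refresh_period_s
        self.conf_threshold = conf_threshold
        self._now = now            # injectable clock, eases testing
        self._last_init = now()

    def needs_reinit(self, tracking_confidence):
        # Trigger 1: the feature information has aged past the refresh period.
        expired = (self._now() - self._last_init) >= self.refresh_period_s
        # Trigger 2: tracking confidence fell below the floor (possible drift).
        drifting = tracking_confidence < self.conf_threshold
        return expired or drifting

    def mark_reinitialised(self):
        self._last_init = self._now()
```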
  • the feature information includes one or more of histogram of oriented gradients (HOG) information, Lab color information, and convolutional neural network (CNN) information.
  • the embodiments of the present invention provide multiple extraction methods of feature information to meet the requirements for extracting feature information in different images or different scenes.
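  • As a rough illustration of the directional-gradient (HOG-style) feature mentioned above, the following sketch builds a magnitude-weighted histogram of gradient orientations for an image patch; real HOG additionally uses cell/block structure and block normalization, which are omitted here:

```python
import numpy as np

def hog_orientation_hist(patch, n_bins=9):
    """Toy HOG-style descriptor: magnitude-weighted histogram of
    unsigned gradient orientations over a single patch."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    bin_width = 180.0 / n_bins
    bin_idx = np.minimum((ang / bin_width).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    # accumulate each pixel's gradient magnitude into its orientation bin
    np.add.at(hist, bin_idx.ravel(), mag.ravel())
    return hist / (hist.sum() + 1e-9)              # L1-normalised
```

A patch containing only a vertical edge produces purely horizontal gradients, so almost all of the mass lands in the first (0-degree) bin.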
  • an embodiment of the present invention provides a focusing method, which may include:
  • determining a first ROI set and a second ROI set, the first ROI set being a ROI set obtained from a first image generated by an image signal processor, the first ROI set including one or more first ROIs, each first ROI including a photographic subject; the second ROI set being a ROI set obtained from the first image, the second ROI set including one or more second ROIs, each second ROI being a motion region; determining a target ROI in the first image based on the first ROI set and the second ROI set; determining characteristic information of the target ROI; identifying position information and size information of the target ROI in a second image generated by the image signal processor based on the characteristic information of the target ROI, where the first image is located before the second image in the time domain; and focusing based on the position information and size information.
  • the determining a target ROI in the first image based on the first ROI set and the second ROI set includes: determining a valid first ROI from the one or more first ROIs in the first ROI set, the valid first ROI being within a first preset region of the first image; determining a valid second ROI from the one or more second ROIs in the second ROI set, the valid second ROI being within a second preset region of the first image; and, when the intersection-over-union (IoU) ratio of the valid first ROI and the valid second ROI is greater than or equal to a preset threshold, determining the valid first ROI as the target ROI.
  • the method further includes: when the intersection-over-union ratio (IoU) of the valid first ROI and the valid second ROI is less than the preset threshold, determining, of the valid second ROI and the valid first ROI, the ROI closer to the center point of the first image as the target ROI.
  • the valid first ROI has the highest evaluation score among the one or more first ROIs within the first preset region of the first image; and/or the valid second ROI has the highest evaluation score among the one or more second ROIs within the second preset region of the first image; where the evaluation score of each ROI satisfies at least one of the following: it is proportional to the area of the ROI, inversely proportional to the distance of the ROI from the center point of the first image, and proportional to the priority of the object category to which the ROI belongs.
  • the method further includes: updating the feature information of the target ROI based on the feature information corresponding to the position and size of the target ROI in the historical image.
  • the characteristic information of the target ROI is determined according to the characteristic information of the first image corresponding to the target ROI and the characteristic information of at least one third image, where the at least one third image is located between the first image and the second image in the time domain.
  • the method further includes: recalculating the target ROI after a first preset period of time; or recalculating the target ROI when the tracking confidence is less than a confidence threshold, where the tracking confidence is used to indicate the tracking accuracy of the target ROI, and the tracking confidence is directly proportional to the tracking accuracy.
  • the feature information includes one or more of histogram of oriented gradients (HOG) information, Lab color information, and convolutional neural network (CNN) information.
  • an embodiment of the present invention provides a focusing device, which may include:
  • a first processing unit configured to determine a first ROI set and a second ROI set, where the first ROI set is a ROI set obtained from a first image generated by an image signal processor, and the first ROI set Including one or more first ROIs, each of which includes a photographic subject; the second ROI set is a ROI set obtained from the first image, and the second ROI set includes one or more A second ROI, each second ROI being a motion region; a second processing unit, configured to determine a target ROI in the first image based on the first ROI set and the second ROI set; a third processing unit, Used to determine feature information of the target ROI; a recognition unit, configured to identify position information and size information of the target ROI in a second image generated by the image signal processor according to the feature information of the target ROI, The first image is located before the second image in a time domain; a focusing unit is configured to focus according to the position information and size information.
  • the second processing unit is specifically configured to: determine a valid first ROI from the one or more first ROIs in the first ROI set, where the valid first ROI is within a first preset region of the first image; determine a valid second ROI from the one or more second ROIs in the second ROI set, where the valid second ROI is within a second preset region of the first image; and, in a case where the intersection-over-union ratio of the valid first ROI and the valid second ROI is greater than or equal to a preset threshold, determine the valid first ROI as the target ROI.
  • the second processing unit is further configured to: when the intersection-over-union ratio of the valid first ROI and the valid second ROI is less than the preset threshold, determine, of the valid second ROI and the valid first ROI, the one closer to the center point of the first image as the target ROI.
  • the valid first ROI has the highest evaluation score among the one or more first ROIs within the first preset region of the first image; and/or the valid second ROI has the highest evaluation score among the one or more second ROIs within the second preset region of the first image; where the evaluation score of each ROI satisfies at least one of the following: it is proportional to the area of the ROI, inversely proportional to the distance of the ROI from the center point of the first image, and proportional to the priority of the object category to which the ROI belongs.
  • the third processing unit is further configured to update the feature information of the target ROI based on the feature information corresponding to the position and size of the target ROI in the historical image.
  • the characteristic information of the target ROI is determined according to the characteristic information of the first image corresponding to the target ROI and the characteristic information of at least one third image, where the at least one third image is located between the first image and the second image in the time domain.
  • the apparatus further includes:
  • a first initialization unit, configured to recalculate the target ROI after a first preset time period; or
  • a second initialization unit, configured to recalculate the target ROI when the tracking confidence of the target ROI is less than a confidence threshold, where the tracking confidence is used to indicate the tracking accuracy of the target ROI, and the tracking confidence is directly proportional to the tracking accuracy.
  • the feature information includes one or more of histogram of oriented gradients (HOG) information, Lab color information, and convolutional neural network (CNN) information.
  • an embodiment of the present invention provides an electronic device, including an image sensor and the focusing device according to any one of the foregoing first aspects; wherein
  • the image sensor is used to collect image data
  • the image signal processor is configured to generate the first image based on the image data.
  • the electronic device further includes: a memory for storing program instructions; and the program instructions are executed by the processor.
  • the present application provides a focusing device having the function of implementing any of the above-mentioned focusing methods.
  • This function can be implemented by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the present application provides a terminal.
  • the terminal includes a processor, and the processor is configured to support the terminal to perform a corresponding function in a focusing method provided in the second aspect.
  • the terminal may further include a memory, which is used for coupling with the processor, and stores the program instructions and data necessary for the terminal.
  • the terminal may further include a communication interface for the terminal to communicate with other devices or a communication network.
  • the present application provides a computer storage medium that stores a computer program that, when executed by a processor, implements the focusing method flow described in any one of the second aspects.
  • an embodiment of the present invention provides a computer program; the computer program includes instructions which, when run, perform the focusing method process according to any one of the second aspects.
  • the present application provides a chip system that includes a processor, and is configured to implement functions involved in the focusing method process in any one of the foregoing second aspects.
  • the chip system further includes a memory, and the memory is configured to store program instructions and data necessary for the focusing method.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • FIG. 1 is a schematic structural diagram of a focusing device according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a first image according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of another focusing device according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a functional principle of a focusing device according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an SSD network implementation process provided by an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of screening a target ROI provided by an embodiment of the present invention.
  • FIG. 7 is a schematic flowchart of determining a target ROI according to an embodiment of the present invention.
  • FIG. 8 is a schematic flowchart of a target ROI tracking process according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of target ROI tracking provided by an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of updating feature information of a target ROI according to an embodiment of the present invention.
  • FIG. 11 is a hardware structural diagram of a neural network processor according to an embodiment of the present invention.
  • FIG. 12 is a schematic flowchart of a focusing method according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of another focusing device according to an embodiment of the present invention.
  • an embodiment herein means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application.
  • the appearances of this phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are they separate or alternative embodiments that are mutually exclusive with other embodiments. It is explicitly and implicitly understood by those skilled in the art that the embodiments described herein may be combined with other embodiments.
  • a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and / or a computer.
  • an application running on a computing device and a computing device can be components.
  • One or more components can reside within a process and / or thread of execution, and a component can be localized on one computer and / or distributed between 2 or more computers.
  • these components can execute from various computer readable media having various data structures stored thereon.
  • a component may, for example, communicate by way of local and/or remote processes based on a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or interacting with other systems across a network such as the Internet by way of the signal).
  • ROI (region of interest): in image processing, the region to be processed, outlined within the image in the form of a box, circle, ellipse, or irregular polygon; it is called the region of interest.
  • AI (artificial intelligence): the theory, methods, technologies, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results.
  • artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can respond in a similar way to human intelligence.
  • Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have functions of perception, reasoning and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic theories of AI.
  • a convolutional neural network (CNN) is a multi-layer neural network; each layer consists of multiple two-dimensional planes, and each plane consists of multiple independent neurons. The neurons of a plane share weights, and weight sharing reduces the number of parameters in the neural network.
  • a processor performing a convolution operation usually converts the convolution of an input signal feature and a weight into a matrix multiplication between a signal matrix and a weight matrix. In the actual matrix multiplication, the signal matrix and the weight matrix are partitioned into blocks to obtain multiple fractal signal matrices and fractal weight matrices, and matrix multiply-accumulate operations are then performed on these fractal matrices.
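  • The lowering of convolution to matrix multiplication described above can be illustrated (without the fractal blocking step) by a minimal im2col-style sketch for a single-channel, valid-padding case:

```python
import numpy as np

def im2col_conv2d(x, w):
    """Valid 2-D convolution (strictly, cross-correlation, as in most NN
    frameworks) computed by lowering to one matrix multiplication."""
    H, W = x.shape
    kh, kw = w.shape
    oh, ow = H - kh + 1, W - kw + 1
    # each output position becomes one row of the signal matrix
    cols = np.empty((oh * ow, kh * kw))
    for i in range(oh):
        for j in range(ow):
            cols[i * ow + j] = x[i:i + kh, j:j + kw].ravel()
    # signal matrix times flattened weight vector, reshaped to the output map
    return (cols @ w.ravel()).reshape(oh, ow)
```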
  • ISP (image signal processor): an image processor for cameras, mainly used to process the output signal of the front-end image sensor and to match image sensors from different manufacturers. Its pipelined image-processing engine can process image signals at high speed, and it is also equipped with dedicated circuits for auto exposure / auto focus / auto white balance evaluation.
  • IoU (intersection over union): a concept used in object detection; it is the overlap rate between the generated candidate box and the ground-truth box, that is, the ratio of the area of their intersection to the area of their union. Ideally the two boxes overlap completely, i.e., the ratio is 1.
  • a fixed center position is set in advance as the focus area.
  • the AF algorithm needs to reconfigure the focus point, which lengthens the focusing time and the user's photo taking time.
  • the focus cannot follow the target movement in real time.
  • Focus tracking method based on feature-point detection: this method detects feature points in the picture in real time and then places the focus on the feature points.
  • Target tracking method based on motion detection: by comparing the content changes between two consecutive frames, moving objects in the shooting scene are quickly identified, the motion region is output to the AF algorithm in real time, and the focus point is then adjusted to the motion region in real time to achieve moving-target focus tracking.
  • an artificial intelligence servo autofocus function is implemented in the prior art.
  • In a high-speed continuous focusing mode for moving subjects, half-pressing the shutter captures the subject in the viewfinder and detects its movement track.
  • the built-in autofocus sensor in the SLR can identify whether the object is stationary or moving, and identify its moving direction, so that it can achieve accurate focus when shooting sports, children or animals.
  • the problems and application scenarios that the embodiments of the present invention mainly solve include the following:
  • AI object detection algorithm is used to detect the main object in the picture, and then the main object area is input to the target tracking algorithm to monitor the status of the target in real time
  • the AF algorithm directly sets the focus on the main target object to stabilize the focus.
  • the tracking algorithm will follow the target's movement in real time, and the AF algorithm will do the tracking focus in real time.
  • the AI object detection algorithm, combined with the moving-target detection algorithm, comprehensively outputs the main object in the current picture; the target tracking algorithm then monitors the position and size of the output moving target in real time, solving problems such as misidentification of the moving target, poor target smoothness, unstable target tracking, and discontinuous focus.
  • FIG. 1 is a schematic structural diagram of a focusing device according to an embodiment of the present invention.
  • the focusing device 10 may include a processor 101, and a neural network processor 102 and an image signal processor 103 coupled to the processor 101; wherein,
  • Image Signal Processor (ISP) 103 is used to generate the first image, which can match the image sensors of different manufacturers to process the image data output by the front-end image sensor, and generate corresponding image signals based on the image data .
  • a neural network processor (Neural Processing Unit, NPU) 102, configured to obtain a first region of interest (ROI) set in the first image, where the first ROI set includes one or more first ROIs, and each first ROI includes a subject.
  • the subject can be any object, such as a person, an animal, a building, a plant, etc.
  • for example, if the neural network processor 102 recognizes that there is a flower, a person, and a dog in the first image, the first ROI set includes three first ROIs: the plant, the person, and the animal.
  • FIG. 2 is a schematic diagram of a first image provided by an embodiment of the present invention.
  • the NPU recognizes a human face (area 1), a dog face (area 3), a flower (area 4), and a table (area 5) as first ROIs.
  • a processor (Central Processing Unit, CPU) 101, configured to: obtain a second ROI set in the first image; determine a target ROI in the first image based on the first ROI set and the second ROI set; determine characteristic information of the target ROI; identify position information and size information of the target ROI in the second image generated by the image signal processor 103 according to the characteristic information of the target ROI; and focus according to the position information and size information.
  • the second ROI set includes one or more second ROIs, and each second ROI is a motion region. For example, if a puppy is moving through a frame or frames before the first image and the first image, then the area where the puppy is located in the first image is determined as the second ROI.
  • the first image is located before the second image in the time domain; that is, the feature information of the target ROI, determined by integrating AI recognition and motion detection in a previously collected image, serves as the basis for subsequently tracking the target ROI and focusing on it in real time. It can be understood that if no object movement is detected in the first image, the second ROI set may be an empty set, which corresponds to a static shooting scene.
  • the CPU detects through motion detection that the person is moving, and thus recognizes region 2, where the person is located, as a motion region, that is, a second ROI.
  • the processor 101 is further configured to, for example, run a general operating system software, and control the neural network processor 102 and the image signal processor 103 to perform focusing under the function of the general operating system software.
  • the first image generated by the image signal processor 103 is sent to the neural network processor 102 to obtain a first ROI set, and the first ROI set obtained by the neural network processor 102 is received.
  • the processor 101 is further configured to complete calculation processing and control related to the focusing process.
  • the aforementioned neural network processor may also be integrated in the processor 101 as a part of the processor 101; it may also be another functional chip coupled to the processor 101 and capable of obtaining the first ROI set; Similarly, the functions performed by the processor 101 may be distributed and executed on multiple different function chips, which is not specifically limited in the embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of another focusing device according to an embodiment of the present invention
  • FIG. 4 is a functional principle schematic diagram of a focusing device according to an embodiment of the present invention.
  • the focusing device 10 may include a processor 101, a neural network processor 102 and an image signal processor 103 coupled to the processor 101, and a lens 104, an image sensor 105, and a focus motor (Voice Coil Motor, VCM) 106 coupled to the image signal processor 103; wherein,
  • the lens 104 is configured to focus the optical information of the real world on the image sensor through the principle of optical imaging.
  • the lens 104 may be a rear camera, a front camera, a rotary camera, etc. of a terminal (such as a smart phone).
  • the image sensor 105 is configured to output image data based on the optical information collected by the lens 104, providing the image data to the image signal processor 103 to generate a corresponding image signal.
  • the focus motor 106 may include a mechanical structure for performing static or dynamic focusing based on the position information and size information of the target ROI determined by the processor 101. For example, if the processor 101 recognizes that the target ROI is stationary, it controls the focus motor 106 to perform static focusing; if the processor 101 recognizes that the target ROI is moving, it controls the focus motor 106 to perform dynamic focusing.
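  • The static/dynamic dispatch described above might be expressed as in this hypothetical sketch; the vcm interface with static_focus and dynamic_focus methods is an assumption for illustration, not an API from the patent:

```python
def drive_focus(vcm, target_roi_state, position, size):
    """Hypothetical dispatch: pick static or dynamic (follow) focus
    based on whether the target ROI is moving."""
    if target_roi_state == "moving":
        vcm.dynamic_focus(position, size)   # continuous follow-focus
    else:
        vcm.static_focus(position, size)    # single focus sweep
```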
  • the focusing device in FIG. 1 or FIG. 3 may be located in a terminal (such as a smartphone, a tablet, or a smart wearable device), a smart camera device (such as a smart camera or a smart tracking device), a smart monitoring device, an aerial drone, and the like; this application does not list them one by one.
  • in the focusing device of FIG. 1 or FIG. 3 described above, one or more candidate shooting objects are obtained through AI object detection performed by the NPU on the image frames generated by the ISP, and one or more candidate motion regions are obtained through moving-object detection performed by the processor.
  • the detected shooting objects and motion regions are then combined to determine the target ROI to be finally focused on, and subsequent tracking and focusing are performed based on the feature information of the target ROI. That is, AI object detection and moving-object detection are used to automatically and comprehensively identify the target ROI in the field of view (FOV), and a target-ROI tracking algorithm is then used to accurately calculate the real-time motion trajectory and size of the target ROI.
  • the auto-focus (AF) algorithm performs motion follow-focus based on the real-time motion trajectory of the target ROI. The entire process requires no manual intervention by the user, and the tracking focus is accurate, which greatly improves the shooting experience and effect.
  • the neural network processor 102 obtains the first ROI set in the first image; a specific implementation may be as follows:
  • the neural network processor 102 uses an AI object detection algorithm to obtain the target object in the picture (the first image), that is, the target ROI. The detection network uses a general-purpose structure (such as the first few layers of resnet18, resnet26, etc.) as the base network, and adds further layers on top of it as the detection structure.
  • the classification base model extracts the low-level features of the image and ensures that these low-level features are discriminative. Adding a classifier on shallow features helps improve classification performance.
  • the detection part outputs a series of discretized bounding boxes on feature maps at different levels, together with the probability that each box contains an object instance. Finally, a non-maximum suppression (NMS) algorithm is applied to obtain the final object prediction result.
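The NMS step mentioned above can be sketched as follows; this is an illustrative greedy implementation, not the patent's own code, and the 0.5 IoU threshold is an assumed default:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes."""
    order = np.argsort(scores)[::-1]          # best-scoring box first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of the kept box with every remaining box
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_thresh]       # suppress heavy overlaps
    return keep
```

Boxes are visited in descending score order; any remaining box whose overlap with an already-kept box exceeds the threshold is discarded.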
  • the detection model algorithm may adopt a single shot detection (SSD) framework. Please refer to FIG. 5.
  • the main body adopts a one-stage detection structure, which avoids feeding a large number of candidate target positions into a second stage as faster-rcnn does, thereby greatly improving detection speed.
  • each layer of features has a different receptive field, so the detector can adapt to targets of different sizes and achieve better performance.
  • the default boxes determine the initial positions of the final prediction boxes. Through different sizes and aspect ratios, they can adapt to main objects of different sizes and shapes and provide good initial values that make the prediction more accurate.
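As a sketch of the default-box mechanism just described, the following generates center-form boxes over a feature-map grid; the grid size, scale, and aspect-ratio set are illustrative assumptions rather than the framework's actual configuration:

```python
import itertools, math

def default_boxes(fmap_size, scale, ratios=(1.0, 2.0, 0.5)):
    """Default boxes (cx, cy, w, h) in [0, 1] image coordinates: one box
    per aspect ratio at the center of every feature-map cell."""
    boxes = []
    for i, j in itertools.product(range(fmap_size), repeat=2):
        cx = (j + 0.5) / fmap_size            # cell center, normalized
        cy = (i + 0.5) / fmap_size
        for r in ratios:
            # same area (scale**2) for every ratio; only the shape varies
            boxes.append((cx, cy, scale * math.sqrt(r), scale / math.sqrt(r)))
    return boxes
```

Lower-resolution feature maps would use larger scales, matching the multi-scale behavior described above.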
  • because the AI object detection algorithm runs on the NPU, considering power-consumption constraints, it may output detection results once every 10 frames.
  • the types of objects that can be detected include: flowers, people, cats, dogs, birds, bicycles, buses, motorcycles, trucks, cars, trains, boats, horses, kites, balloons, vases, bowls, plates, cups, and classic handbags.
  • the priority of the object category to which the shooting object belongs can be divided into four levels, the first priority is human, the second priority is flower, the third priority is cat and dog, and the fourth priority is the rest.
  • the specific implementation manner of the processor 101 in the focusing device 10 acquiring the second ROI set in the first image may be as follows:
  • the processor 101 may obtain a second ROI set by using a moving target detection algorithm.
  • the moving object detection algorithm is performed once every two frames, that is, the moving area in the current image is output every two frames.
  • the speed of the movement and the direction of the movement can be further output.
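A moving-region detector of this kind can be sketched with simple frame differencing; the change threshold and the single bounding-box output are illustrative assumptions (the patent does not specify the algorithm's internals):

```python
import numpy as np

def motion_region(prev_frame, cur_frame, thresh=25):
    """Bounding box (x, y, w, h) of pixels that changed between two
    grayscale frames, or None if nothing moved."""
    diff = np.abs(cur_frame.astype(np.int16) - prev_frame.astype(np.int16))
    ys, xs = np.nonzero(diff > thresh)        # coordinates of changed pixels
    if xs.size == 0:
        return None                           # static scene: no motion ROI
    return (int(xs.min()), int(ys.min()),
            int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1))
```

Comparing the box position across successive frame pairs would also yield the motion speed and direction mentioned above.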
  • region 2 is the second ROI, which is the motion region output by the motion detection algorithm
  • region 1 is the final target ROI.
  • the specific implementation in which the processor 101 in the focusing device 10 determines the target ROI in the first image based on the first ROI set and the second ROI set may be as follows: the processor 101 determines a valid first ROI from the one or more first ROIs in the first ROI set, and determines a valid second ROI from the one or more second ROIs in the second ROI set; and when the intersection-over-union (IoU) ratio between the effective first ROI and the effective second ROI is greater than or equal to a preset threshold, determines the effective first ROI as the target ROI; wherein the effective first ROI is within a first preset region of the first image, and the effective second ROI is within a second preset region of the first image.
  • when the intersection-over-union (IoU) ratio between the effective first ROI and the effective second ROI is less than the preset threshold, the processor 101 determines, of the effective second ROI and the effective first ROI, the ROI closer to the center point of the first image as the target ROI. That is, when the overlapping area between the effective first ROI and the effective second ROI is large, it indicates that the subject detection and the motion detection likely cover the same target, so the effective first ROI can be used as the target ROI; when the overlapping area is small, the detection may be wrong or the target ROI may be drifting, so the ROI closer to the center point is selected as the target ROI.
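The selection logic above (trust the subject ROI when it strongly overlaps the motion ROI, otherwise prefer the ROI nearer the image center) can be sketched as follows; ROIs are (x, y, w, h) boxes and the 0.5 IoU threshold is an assumed value:

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def dist_to_center(roi, width, height):
    cx, cy = roi[0] + roi[2] / 2, roi[1] + roi[3] / 2
    return ((cx - width / 2) ** 2 + (cy - height / 2) ** 2) ** 0.5

def select_target(first_roi, second_roi, width, height, iou_thresh=0.5):
    """Choose the target ROI from a valid subject ROI and a valid motion ROI."""
    if iou(first_roi, second_roi) >= iou_thresh:
        return first_roi                  # detections agree: keep the subject ROI
    # detections disagree (possible error or drift): prefer the central ROI
    return min((first_roi, second_roi),
               key=lambda r: dist_to_center(r, width, height))
```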
  • the target ROI may also be selected according to other calculation rules, such as merging a valid first ROI and a valid second ROI into a new ROI; this application does not enumerate them.
  • FIG. 6 is a schematic diagram of screening a target ROI provided by an embodiment of the present invention.
  • a first image (field of view of a camera) displayed on a mobile phone screen in FIG. 6 has a width of width and a height of height.
  • the second ROI is valid within the second preset region.
  • the width of the invalid border region is w2 = min(width, height) × 0.1; under this rule, ROI1 and ROI2 are valid and ROI0 is invalid.
  • the effective first ROI has the highest evaluation score among the one or more first ROIs in the first preset region of the first image; and/or the effective second ROI has the highest evaluation score among the one or more second ROIs in the second preset region of the first image; wherein the evaluation score of each ROI satisfies at least one of the following: it is proportional to the area of the ROI, inversely proportional to the distance of the ROI from the center point of the first image, and positively related to the priority of the object category to which the ROI belongs. That is, when multiple ROIs remain after filtering by the corresponding preset regions, the target can be determined from the area of each ROI, its distance from the center point of the first image, and the priority of the category to which the subject belongs.
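The validity margin and the evaluation score described above can be sketched as follows; the 0.1 margin factor follows the text, while the priority weights and the exact combination of area, center distance, and priority are illustrative assumptions:

```python
def is_valid(roi, width, height):
    """An (x, y, w, h) ROI is valid if it keeps a margin of
    min(width, height) * 0.1 from every edge of the image."""
    m = min(width, height) * 0.1
    x, y, w, h = roi
    return x >= m and y >= m and x + w <= width - m and y + h <= height - m

# Hypothetical weights for the four priority levels
# (people > flowers > cats/dogs > everything else).
PRIORITY = {"person": 4, "flower": 3, "cat": 2, "dog": 2}

def score(roi, label, width, height):
    """Higher for larger area, smaller center distance, higher priority."""
    x, y, w, h = roi
    cx, cy = x + w / 2, y + h / 2
    dist = ((cx - width / 2) ** 2 + (cy - height / 2) ** 2) ** 0.5
    return w * h * PRIORITY.get(label, 1) / (1.0 + dist)
```

The ROI with the highest score among the valid candidates would then be taken as the effective ROI for its set.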
  • the priority of different object categories can also be set according to the current shooting mode. For example, in portrait mode, people have the highest priority, and in landscape mode, plants or buildings have the highest priority.
  • FIG. 7 is a schematic flowchart of determining a target ROI according to an embodiment of the present invention.
  • AI object detection is performed by the NPU to obtain a first ROI set
  • moving object detection is performed by the CPU to obtain a second ROI set.
  • the processor 101 detects whether the first ROIs in the first ROI set and the second ROIs in the second ROI set are valid.
  • the focusing device 10 in the embodiment of the present invention may also combine other preset strategies to provide different methods for determining the target ROI in different scenarios.
  • the preset strategy may include: 1) user-specified priority; 2) AI object detection priority; 3) motion detection priority; 4) joint selection of object detection and motion detection.
  • the feature information of the target ROI determined by the processor 101 in the above focusing device 10 includes one or more of histogram-of-oriented-gradients (HOG) information, color (Lab) information, and convolutional neural network (CNN) information.
  • that is, it may include only the color Lab information extracted by the processor 101, only the HOG information extracted by the processor 101, or only the CNN information extracted by the neural network processor 102, or any two, or all three, of the above three types of information.
  • the above HOG information and color Lab information can be extracted by the processor 101, while the CNN information can be extracted by the neural network processor 102 and then sent to the processor 101.
  • the processor 101 further updates the feature information of the target ROI based on the feature information corresponding to the position and size of the target ROI in the historical image.
  • the feature information of the target ROI is determined according to the feature information of the first image corresponding to the target ROI and the feature information of at least one third image, where the at least one third image is located between the first image and the second image in the time domain. That is, the processor 101 in the focusing device 10 performs this determination in the process of identifying the position information and size information of the target ROI in the second image generated by the image signal processor according to the feature information of the target ROI.
  • the processor 101 recalculates the target ROI after the first preset time period; or recalculates the target ROI when the tracking confidence of the target ROI is less than a confidence threshold, where the tracking confidence is used to indicate the tracking accuracy of the target ROI and is directly proportional to the tracking accuracy.
  • the processor 101 not only needs to update the feature information in real time based on the tracking condition of the target ROI so as to track focus more accurately; the updated feature information is also time-limited. After a long period of time, or when the confidence of the currently tracked target ROI is low, it is necessary to initialize the related parameters and perform a new round of target ROI confirmation and tracking.
  • FIG. 8 is a schematic diagram of a target ROI tracking process according to an embodiment of the present invention.
  • the processor 101 selects a certain feature or a combination of multiple features according to a preset rule to determine the feature information, and decides after a rule judgment whether the tracker needs to be initialized. If the tracker does not need to be initialized, it directly enters the tracking calculation, outputs the position and size information of the target ROI together with a response map of the target's possible positions, and finally updates the feature information based on the new position and size of the target ROI. The process may mainly include the following steps:
  • this part can choose different feature combinations according to different needs, such as using the HOG feature alone, or a combination of HOG + Lab + CNN;
  • the tracking calculation uses correlation-filter algorithms, such as KCF (Kernelized Correlation Filter) and ECO (Efficient Convolution Operators).
  • the response map output for each frame of image is a w × h floating-point two-dimensional array F[w][h], which can be written as F_{w,h} and has been normalized to the range 0 to 1.0.
  • the response map reflects the possible distribution of the target ROI in the picture, and the point with the largest value is where the target ROI is located; the response map can therefore reflect the confidence of target ROI tracking.
  • an average correlation peak energy index, the average peak-to-correlation energy (APCE), can be computed as
  • APCE = |F_max − F_min|² / mean( Σ_{w,h} (F_{w,h} − F_min)² )
  • where F_max = max(F[w][h]) is the maximum value of F[w][h]; F_min = min(F[w][h]) is the minimum value of F[w][h]; and Σ_{w,h} (F_{w,h} − F_min)² means traversing each value of F_{w,h}, subtracting the minimum value, squaring, and summing the results, over which the mean is taken.
  • this indicator can be used as follows: when its calculated value drops sharply compared with the historical average, it indicates that the position and size of the target ROI in the current frame are not reliable, for example because the target ROI is occluded or lost.
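A minimal implementation of the APCE index as described above, assuming the denominator is the mean of the squared minimum-subtracted response values:

```python
import numpy as np

def apce(response):
    """Average peak-to-correlation energy of a 2-D response map:
    |F_max - F_min|^2 / mean((F - F_min)^2)."""
    f_max, f_min = response.max(), response.min()
    energy = np.mean((response - f_min) ** 2)
    return (f_max - f_min) ** 2 / energy
```

A single sharp peak yields a high APCE, while a flat or multi-modal map (occlusion, loss) yields a low one; the sharp drop relative to the historical average is what flags an unreliable frame.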
  • FIG. 9 is a schematic diagram of target ROI tracking provided by an embodiment of the present invention.
  • the initial position of the target ROI is 1, and during the movement process from 1 to 6 in the picture, the target tracking algorithm module outputs the position and size of the target in each frame in real time. At this time the tracking confidence is high, and the feature information of the target ROI needs to be updated in real time.
  • the processor 101 uses the target ROI determined from the first image as the initial ROI input. After feature extraction, feature selection, and tracking calculation, the position and size of the target ROI in each subsequent frame image (including the first image) are calculated in real time. The basis for judging whether the feature information is updated is as follows:
  • if the feature information update condition is satisfied, the feature information is updated;
  • otherwise the target ROI feature information is not updated, that is, the feature information of the current image frame does not participate in updating the target ROI feature information, so as to optimize the tracking system and avoid tracking drift of the target ROI;
  • when the initialization restart condition is met, the processor 101 may be triggered to re-determine the target ROI (including the NPU reacquiring the first ROI set and the CPU reacquiring the second ROI set), that is, the tracking initialization is completed again.
  • the position information and size information of the target ROI are output in real time.
  • the position is constrained: the green frame is the effective range when the target is stationary, in which case the output is sent to the AF algorithm for stable focusing; the other frame is the effective range when the target is moving, in which case the real-time output is sent to the AF algorithm for motion tracking.
  • FIG. 10 is a schematic diagram of updating feature information of a target ROI according to an embodiment of the present invention.
  • the image signal processor 103 generates n frames of images in the first preset time period.
  • when the image signal processor generates the first frame image, the feature information of the target ROI is extracted, that is, feature information A in FIG. 10, which is also the initial identifying feature information of the target ROI; when the image signal processor generates the second frame image, feature information B of the second frame image is first obtained, where feature information B may be obtained as follows:
  • based on the position and size of the target ROI in the first frame image, the feature information of the corresponding position and size in the second frame image is extracted, that is, feature information B.
  • feature information of the target ROI is extracted from subsequent image frames in the same way, and the principle is not repeated here.
  • the processor 101 compares feature information B with feature information A to determine, in the second frame image, the position and size of the target ROI determined in the first frame image; at the same time, it determines from feature information A and feature information B whether the second frame satisfies the feature information update condition.
  • the most recently updated feature information is used as the comparison model. When it is determined that the initialization restart condition is met but the specified time point has not been reached (that is, the time point at which the processor 101 outputs a new target ROI), the most recently updated feature information continues to be used as the comparison model; however, if the initialization restart condition is met and the specified time point is reached, the target ROI newly output by the processor 101 can be used, and a new round of ROI tracking calculation is performed.
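The gated model update described above can be sketched as a linear blend that is skipped when tracking confidence collapses; the learning rate, the APCE-based gate, and its drop ratio are illustrative assumptions, not values from the patent:

```python
import numpy as np

def update_features(model, new_feat, apce_now, apce_history,
                    lr=0.02, drop_ratio=0.45):
    """Blend the current frame's features into the comparison model unless
    confidence dropped sharply (e.g. occlusion): then keep the old model."""
    if apce_now < drop_ratio * np.mean(apce_history):
        return model                      # unreliable frame: skip the update
    return (1.0 - lr) * model + lr * new_feat
```

This mirrors the figure's behavior: a frame whose features fail the update condition (like frame 4) leaves the model from the previous update untouched.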
  • the application does not specifically limit the conditions for updating the characteristic information and the update formula.
  • feature information D of the target ROI is determined in the fourth frame image, and after feature information A'' obtained from the third-frame update is correlated with feature information D, it is determined that the fourth frame does not meet the feature information update condition (for example, the target ROI is occluded or drifts greatly in the fourth frame). Therefore, feature information D of the fourth frame does not participate in subsequent updates of the feature information, and the feature information updated in the third frame continues to be used.
  • that is, after feature information E is determined in the fifth frame, it is still correlated with the feature information updated in the third frame; further, it is assumed that feature information E is then merged with feature information A'' updated in the third frame.
  • in the eleventh frame image, the feature information is recalculated.
  • tracking and focusing can be performed based on the embodiment of the invention described above, and feature information is updated, which is not exhaustive here.
  • after the processor 101 enters the target ROI tracking and focusing process, the current state of the target ROI is determined according to the real-time target ROI information.
  • the target ROI is tracked and focused.
  • the combination of the target detection algorithm, the motion detection algorithm, and the tracking algorithm can solve the two major problems of having no ROI information while tracking a moving target and of losing the ROI after the target becomes stationary.
  • the AF algorithm can directly follow the ROI window for motion tracking, and when the moving target becomes stationary it can perform stable focusing, which solves the focus-selection problem when the target is not in the center.
  • FIG. 11 is a hardware structural diagram of a neural network processor according to an embodiment of the present invention.
  • the neural network processor NPU 102 is mounted as a coprocessor on a host CPU (Host CPU), which assigns tasks to it.
  • the core part of the NPU is an arithmetic circuit 1203.
  • the controller 1204 controls the arithmetic circuit 1203 to extract matrix data in the memory and perform multiplication operations.
  • the arithmetic circuit 1203 includes multiple processing engines (PEs). In some implementations, the arithmetic circuit 1203 is a two-dimensional systolic array; it may also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 1203 is a general-purpose matrix processor.
  • the arithmetic circuit takes the data corresponding to matrix B from the weight memory 1202 and buffers it on each PE in the arithmetic circuit; it then takes matrix A data from the input memory 1201 and performs the matrix operation with matrix B, and partial or final results of the resulting matrix are stored in the accumulator 1208.
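The accumulator's role can be illustrated in software: the matrix product is built from partial results over slices of the inner dimension, much as the systolic array accumulates partial sums (a functional sketch only, not a model of the hardware):

```python
import numpy as np

def tiled_matmul(a, b, tile=2):
    """Compute A @ B by accumulating partial products over K-dimension tiles."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    acc = np.zeros((m, n))                # plays the role of the accumulator
    for s in range(0, k, tile):           # stream one tile of A and B at a time
        acc += a[:, s:s + tile] @ b[s:s + tile, :]
    return acc
```

Each loop iteration contributes a partial result, and the final matrix is only complete once every tile has been accumulated.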
  • the unified memory 1206 is used to store input data and output data.
  • the weight data is transferred to the weight memory 1202 through the storage unit access controller (Direct Memory Access Controller, DMAC) 1205.
  • the input data is also transferred to the unified memory 1206 through the DMAC.
  • BIU stands for Bus Interface Unit, that is, the bus interface unit 1210, which is used for the interaction between the AXI bus and the DMAC and the instruction fetch memory 1209.
  • the bus interface unit 1210 (Bus Interface Unit, BIU) is used for the instruction fetch memory 1209 to obtain instructions from the external memory, and is also used for the storage unit access controller 1205 to obtain the original data of input matrix A or weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 1206, to transfer the weight data to the weight memory 1202, or to transfer the input data to the input memory 1201.
  • the vector calculation unit 1207 includes a plurality of arithmetic processing units, and further processes the output of the arithmetic circuit when necessary, for example with vector multiplication, vector addition, exponential operation, logarithmic operation, and size comparison. It is mainly used for non-convolutional / fully connected (FC) layer calculations in a neural network, such as pooling, batch normalization, and local response normalization.
  • in some implementations, the vector calculation unit 1207 can store the processed output vector into the unified memory 1206.
  • the vector calculation unit 1207 may apply a non-linear function to the output of the arithmetic circuit 1203, such as a vector of accumulated values, to generate an activation value.
  • the vector calculation unit 1207 generates a normalized value, a merged value, or both.
  • a vector of the processed output can be used as an activation input to the arithmetic circuit 1203, for example for use in subsequent layers in a neural network.
  • An instruction fetch memory 1209 connected to the controller 1204 is used to store instructions used by the controller 1204;
  • the unified memory 1206, the input memory 1201, the weight memory 1202, and the instruction fetch memory 1209 are all on-chip memories; the external memory is a memory external to the NPU hardware architecture.
  • FIG. 12 is a schematic flowchart of a focusing method according to an embodiment of the present invention.
  • the focusing method is applicable to any one of the focusing devices in FIG. 1 and FIG. 3 and a device including the focusing device.
  • the method may include the following steps S201-S205.
  • Step S201 Determine a first ROI set and a second ROI set, where the first ROI set is a ROI set obtained from a first image generated by an image signal processor, and the first ROI set includes one or more First ROIs, each of which includes a photographic subject; the second ROI set is a ROI set obtained from the first image, and the second ROI set includes one or more second ROIs, Each second ROI is a motion area;
  • Step S202 determine a target ROI in the first image based on the first ROI set and the second ROI set;
  • the determining a target ROI in the first image based on the first ROI set and the second ROI set includes:
  • determining a valid first ROI from the one or more first ROIs in the first ROI set, and a valid second ROI from the one or more second ROIs in the second ROI set; and when the intersection-over-union (IoU) ratio between the effective first ROI and the effective second ROI is greater than or equal to a preset threshold, determining the effective first ROI as the target ROI.
  • the method further includes:
  • when the IoU ratio is less than the preset threshold, of the effective second ROI and the effective first ROI, the ROI closer to the center point of the first image is determined as the target ROI.
  • the effective first ROI has the highest evaluation score among one or more first ROIs within a preset area of the first image; and/or the effective second ROI has the highest evaluation score among the one or more second ROIs within the preset region of the first image; wherein the evaluation score of each ROI satisfies at least one of the following: it is proportional to the area of the ROI, inversely proportional to the distance of the ROI from the center point of the first image, and positively related to the priority of the object category to which the ROI belongs.
  • Step S203 determine the characteristic information of the target ROI
  • the feature information includes one or more of histogram-of-oriented-gradients (HOG) information, color (Lab) information, and convolutional neural network (CNN) information.
  • the feature information of the target ROI is also updated based on the feature information corresponding to the position and size of the target ROI in the historical image.
  • the characteristic information of the target ROI is determined according to the characteristic information of the first image corresponding to the target ROI and the characteristic information of at least one third image.
  • the at least one third image is located between the first image and the second image in the time domain.
  • Step S204 Identify the position information and size information of the target ROI in the second image generated by the image signal processor according to the feature information of the target ROI, where the first image is located before the second image in the time domain.
  • Step S205 Focus according to the position information and size information.
  • in a possible implementation, the target ROI is recalculated after a first preset time period, or when the tracking confidence of the target ROI is less than a confidence threshold, where the tracking confidence is used to indicate the tracking accuracy of the target ROI and is directly proportional to the tracking accuracy.
  • FIG. 13 is a schematic structural diagram of another focusing device according to an embodiment of the present invention.
  • the focusing device 30 may include a first processing unit 301, a second processing unit 302, a third processing unit 303, a recognition unit 304, and a focusing unit 305, wherein:
  • the first processing unit 301 is configured to determine a first ROI set and a second ROI set, where the first ROI set is a ROI set obtained from a first image generated by an image signal processor, and the first ROI
  • the set includes one or more first ROIs, and each first ROI includes a subject;
  • the second ROI set is a ROI set obtained from the first image, and the second ROI set includes one or more Second ROIs, each second ROI is a motion area;
  • a second processing unit 302 configured to determine a target ROI in the first image based on the first ROI set and the second ROI set;
  • a third processing unit 303 configured to determine feature information of the target ROI
  • a recognition unit 304 configured to identify position information and size information of the target ROI in a second image generated by the image signal processor according to the characteristic information of the target ROI, where the first image is located in the time domain Before the second image;
  • the focusing unit 305 is configured to perform focusing according to the position information and the size information.
  • the second processing unit 302 is specifically configured to:
  • determining a valid first ROI from the one or more first ROIs in the first ROI set, and a valid second ROI from the one or more second ROIs in the second ROI set; and when the intersection-over-union (IoU) ratio between the effective first ROI and the effective second ROI is greater than or equal to a preset threshold, determining the effective first ROI as the target ROI.
  • the second processing unit 302 is further configured to:
  • when the IoU ratio is less than the preset threshold, of the effective second ROI and the effective first ROI, the ROI closer to the center point of the first image is determined as the target ROI.
  • the effective first ROI has the highest evaluation score among one or more first ROIs within a preset area of the first image; and/or the effective second ROI has the highest evaluation score among the one or more second ROIs within the preset region of the first image; wherein the evaluation score of each ROI satisfies at least one of the following: it is proportional to the area of the ROI, inversely proportional to the distance of the ROI from the center point of the first image, and positively related to the priority of the object category to which the ROI belongs.
  • the third processing unit 303 is further configured to update the feature information of the target ROI based on the feature information corresponding to the position and size of the target ROI in the historical image.
  • the characteristic information of the target ROI is determined according to the characteristic information of the first image corresponding to the target ROI and the characteristic information of at least one third image.
  • the at least one third image is located between the first image and the second image in the time domain.
  • the apparatus further includes:
  • a first initialization unit 306 configured to recalculate the target ROI after a first preset time period
  • a second initialization unit 307 is configured to recalculate the target ROI when the tracking confidence of the target ROI is less than a confidence threshold, where the tracking confidence is used to indicate the tracking accuracy of the target ROI , The tracking confidence is directly proportional to the tracking accuracy.
  • the feature information includes one or more of histogram-of-oriented-gradients (HOG) information, color (Lab) information, and convolutional neural network (CNN) information.
  • Each unit in FIG. 13 may be implemented in software, hardware, or a combination thereof.
  • units implemented in hardware may include circuits, algorithm circuits, or analog circuits.
  • a unit implemented in software may include program instructions, which are regarded as a software product stored in a memory and may be run by a processor to implement the related functions; for details, refer to the previous introduction.
  • An embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a program, and when the program is executed, it includes part or all of the steps described in any of the foregoing method embodiments.
  • An embodiment of the present invention further provides a computer program.
  • the computer program includes instructions.
  • when the computer program is executed by a computer, the computer can perform part or all of the steps of any of the foregoing method embodiments.
  • the disclosed device may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the above units is only a logical function division.
  • multiple units or components may be combined or integrated.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical or other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, which may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • the functional units in the embodiments of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device, and specifically a processor in a computer device) to perform all or part of the steps of the foregoing methods in each embodiment of the present application.
  • a computer device which may be a personal computer, a server, or a network device, and specifically a processor in a computer device
  • the foregoing storage medium may include: a U disk, a mobile hard disk, a magnetic disk, an optical disk, a read-only memory (abbreviation: ROM), or a random access memory (Random Access Memory, abbreviation: RAM).
  • ROM read-only memory
  • RAM random access memory

Abstract

The present invention relates to a focusing apparatus, method, and related device. The focusing apparatus comprises a processor, and an NPU and an ISP coupled to a CPU. The ISP is configured to generate a first image. The NPU is configured to acquire a first region-of-interest (ROI) set in the first image, the first ROI set comprising one or more first ROIs, each first ROI containing a photographic subject. The CPU is configured to: acquire a second ROI set in the first image, the second ROI set comprising one or more second ROIs, each second ROI being a motion region; determine a target ROI in the first image on the basis of the first ROI set and the second ROI set; and, according to feature information of the target ROI, identify position information and size information of the target ROI in a second image and perform focusing, the first image preceding the second image in the time domain. The present invention improves focusing accuracy.
PCT/CN2018/103370 2018-08-30 2018-08-30 Focusing apparatus, method, and related device WO2020042126A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/103370 WO2020042126A1 (fr) 2018-08-30 2018-08-30 Focusing apparatus, method, and related device
CN201880096896.4A CN112602319B (zh) 2018-08-30 2018-08-30 Focusing apparatus, method, and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/103370 WO2020042126A1 (fr) 2018-08-30 2018-08-30 Focusing apparatus, method, and related device

Publications (1)

Publication Number Publication Date
WO2020042126A1 true WO2020042126A1 (fr) 2020-03-05

Family

ID=69644764

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/103370 WO2020042126A1 (fr) 2018-08-30 2018-08-30 Focusing apparatus, method, and related device

Country Status (2)

Country Link
CN (1) CN112602319B (fr)
WO (1) WO2020042126A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626916A (zh) * 2020-06-01 2020-09-04 上海商汤智能科技有限公司 Information processing method, apparatus, and device
CN112132162A (zh) * 2020-09-08 2020-12-25 Oppo广东移动通信有限公司 Image processing method, image processor, electronic device, and readable storage medium
CN115735226A (zh) * 2020-12-01 2023-03-03 华为技术有限公司 Image processing method and device
CN116055866A (zh) * 2022-05-30 2023-05-02 荣耀终端有限公司 Photographing method and related electronic device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114827481B (zh) * 2022-06-29 2022-10-25 深圳思谋信息科技有限公司 Focusing method and apparatus, zoom device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007077283A1 (fr) * 2005-12-30 2007-07-12 Nokia Corporation Method and device for controlling auto focusing of a video camera by tracking a region-of-interest
KR20110007437A (ko) * 2009-07-16 2011-01-24 삼성전기주식회사 Automatic tracking system for a moving subject and method therefor
CN106060407A (zh) * 2016-07-29 2016-10-26 努比亚技术有限公司 Focusing method and terminal
CN108024065A (zh) * 2017-12-28 2018-05-11 努比亚技术有限公司 Terminal photographing method, terminal, and computer-readable storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5493789B2 (ja) * 2009-12-07 2014-05-14 株式会社リコー Imaging apparatus and imaging method
JP2013191011A (ja) * 2012-03-14 2013-09-26 Casio Comput Co Ltd Image processing apparatus, image processing method, and program
US9538065B2 (en) * 2014-04-03 2017-01-03 Qualcomm Incorporated System and method for multi-focus imaging
CN106324945A (zh) * 2015-06-30 2017-01-11 中兴通讯股份有限公司 Non-contact autofocus method and apparatus
US9858496B2 (en) * 2016-01-20 2018-01-02 Microsoft Technology Licensing, Llc Object detection and classification in images
CN106254780A (zh) * 2016-08-31 2016-12-21 宇龙计算机通信科技(深圳)有限公司 Dual-camera photographing control method, photographing control apparatus, and terminal
CN107302658B (zh) * 2017-06-16 2019-08-02 Oppo广东移动通信有限公司 Focusing method, apparatus, and computer device for achieving a clear face


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626916A (zh) * 2020-06-01 2020-09-04 上海商汤智能科技有限公司 Information processing method, apparatus, and device
CN111626916B (zh) * 2020-06-01 2024-03-22 上海商汤智能科技有限公司 Information processing method, apparatus, and device
CN112132162A (zh) * 2020-09-08 2020-12-25 Oppo广东移动通信有限公司 Image processing method, image processor, electronic device, and readable storage medium
CN112132162B (zh) * 2020-09-08 2024-04-02 Oppo广东移动通信有限公司 Image processing method, image processor, electronic device, and readable storage medium
CN115735226A (zh) * 2020-12-01 2023-03-03 华为技术有限公司 Image processing method and device
CN115735226B (zh) * 2020-12-01 2023-08-22 华为技术有限公司 Image processing method and chip
CN116055866A (zh) * 2022-05-30 2023-05-02 荣耀终端有限公司 Photographing method and related electronic device
CN116055866B (zh) * 2022-05-30 2023-09-12 荣耀终端有限公司 Photographing method and related electronic device

Also Published As

Publication number Publication date
CN112602319B (zh) 2022-09-23
CN112602319A (zh) 2021-04-02

Similar Documents

Publication Publication Date Title
WO2020042126A1 (fr) Focusing apparatus, method, and related device
CN109559320B (zh) Method and system for implementing a visual SLAM semantic mapping function based on a dilated-convolution deep neural network
WO2020259179A1 (fr) Focusing method, electronic device, and computer-readable storage medium
US11847826B2 (en) System and method for providing dominant scene classification by semantic segmentation
US11410038B2 (en) Frame selection based on a trained neural network
CN110866480B (zh) Object tracking method and apparatus, storage medium, and electronic apparatus
CN114424253A (zh) Model training method and apparatus, storage medium, and electronic device
WO2020103110A1 (fr) Point cloud map-based image boundary acquisition method and device, and aircraft
WO2020103108A1 (fr) Semantics generation method and device, drone, and storage medium
US20170054897A1 (en) Method of automatically focusing on region of interest by an electronic device
US20220223153A1 (en) Voice controlled camera with AI scene detection for precise focusing
KR20230084486A (ko) Segmentation for image effects
WO2021104124A1 (fr) Enclosure information determination method, apparatus, and system, and storage medium
WO2019144263A1 (fr) Control method and device for a movable platform, and computer-readable storage medium
CN111147751B (zh) Photographing mode generation method and apparatus, and computer-readable storage medium
WO2023138403A1 (fr) Trigger gesture determination method, apparatus, and device
CN111291646A (zh) Pedestrian flow statistics method, apparatus, device, and storage medium
CN106922181A (zh) Direction-aware autofocus
WO2019052197A1 (fr) Aircraft parameter adjustment method and apparatus
CN113056907A (zh) Photographing method, photographing apparatus, and storage medium
CN117252912A (zh) Depth image acquisition method, electronic device, and storage medium
CN115457666A (zh) Method and system for identifying the motion center of gravity of a living object, and computer-readable storage medium
CN115223135A (zh) Parking space tracking method and apparatus, vehicle, and storage medium
CN114677620A (zh) Focusing method, electronic device, and computer-readable medium
CN110012208B (zh) Photographing focusing method and apparatus, storage medium, and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18931886

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18931886

Country of ref document: EP

Kind code of ref document: A1