CN118264891A

CN118264891A - Image processing method, terminal and computer storage medium

Info

Publication number: CN118264891A
Application number: CN202410283035.4A
Authority: CN
Inventors: 周晨航
Original assignee: Shanghai Chuanying Information Technology Co Ltd
Current assignee: Shanghai Chuanying Information Technology Co Ltd
Priority date: 2020-09-15
Filing date: 2020-09-15
Publication date: 2024-06-28
Also published as: CN112184722B; CN112184722A

Abstract

The application discloses an image processing method, a terminal and a computer storage medium, wherein the image processing method is applied to the terminal and comprises the following steps: acquiring an image to be shot; detecting a preset key point of a target object in the image to be shot to obtain the position information of the preset key point of the target object in the image to be shot; focusing the target object according to the position information of the preset key point of the target object in the image to be shot. According to the image processing method, the terminal and the computer storage medium, the target object in the image to be shot is subjected to focusing processing according to the position information of the preset key point of the target object in the image to be shot by detecting the preset key point of the target object, so that a clear image of the target object is shot, and the image quality is improved.

Description

Image processing method, terminal and computer storage medium

Technical Field

The present application relates to the field of image processing, and in particular, to an image processing method, a terminal, and a computer storage medium.

Background

Focus-tracking photography, also called panning, is a slow-shutter technique in which the camera lens follows the target object along its motion track, moving almost parallel to it and steadily at the same speed. The result is a sharp subject against a blurred background, which gives the photo a sense of motion and strong visual impact. The effect differs from the background blur produced by a large aperture: the panning background shows motion blur, whereas the large-aperture background shows static blur. For example, when a moving vehicle is photographed from another vehicle travelling on the road at the same speed, with the focus on the photographed vehicle, that vehicle stays relatively still in the frame while the background moves, so the captured picture shows a sharp vehicle against a motion-blurred background. At present, focus-tracking photography is mostly used in professional shooting, depends heavily on external shooting conditions, and is difficult to apply to instant imaging on a terminal. Although some software can achieve a similar effect, it requires so much manual intervention that operation is complex and the user experience suffers.

The foregoing description is provided for general background information and does not necessarily constitute prior art.

Disclosure of Invention

The application aims to provide an image processing method, a terminal and a computer storage medium, which improve image quality.

In order to achieve the above purpose, the technical scheme of the application is realized as follows:

In a first aspect, an embodiment of the present application provides an image processing method, which is applied to a terminal, including:

Acquiring an image to be processed;

Performing target object segmentation processing on the image to be processed to obtain an area where the target object is located in the image to be processed;

And carrying out superposition processing on the region except the region where the target object is located in the image to be processed to obtain the image to be processed with a dynamic background picture.

As one embodiment, the image to be processed is a single frame image; the step of performing superposition processing on the region except the region where the target object is located in the image to be processed to obtain the image to be processed with a background picture in a dynamic effect, includes:

And under the condition that the area where the target object is located is kept unchanged, the areas except the area where the target object is located in at least two images to be processed are subjected to dislocation superposition, and the images to be processed with the background image being dynamic effects are obtained.
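The single-frame case above can be illustrated with a short sketch. It is only one possible reading of "dislocation superposition", assuming NumPy, horizontal pixel offsets, and a simple average; the function name `dislocated_superposition` and the `shifts` parameter are hypothetical and do not come from the application.

```python
import numpy as np

def dislocated_superposition(image, mask, shifts=(0, 2, 4, 6)):
    """Average shifted copies of the background; leave the target untouched.

    image:  H x W (grayscale) or H x W x C array.
    mask:   H x W boolean array, True where the target object is located.
    shifts: hypothetical horizontal offsets (in pixels) of the copies.
    """
    image = np.asarray(image, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    background = ~mask
    acc = np.zeros_like(image)
    weight = np.zeros(mask.shape, dtype=np.float64)
    for s in shifts:
        # np.roll wraps around at the border; a real implementation would pad.
        shifted_img = np.roll(image, s, axis=1)
        w = np.roll(background, s, axis=1).astype(np.float64)
        # Only background pixels of the shifted copy contribute.
        acc += shifted_img * (w if image.ndim == 2 else w[..., None])
        weight += w
    safe = np.maximum(weight, 1.0)
    blended = acc / (safe if image.ndim == 2 else safe[..., None])
    # Keep original pixels inside the target and where nothing contributed.
    keep = mask | (weight == 0)
    keep = keep if image.ndim == 2 else keep[..., None]
    return np.where(keep, image, blended)
```

Excluding the shifted target pixels from the average avoids ghost copies of the object bleeding into the blurred background; the number and spacing of the offsets control how strong the motion effect looks.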

As one implementation manner, the image to be processed is at least two continuous frame images containing the same target object; the step of performing superposition processing on the region except the region where the target object is located in the image to be processed to obtain the image to be processed with a background picture in a dynamic effect, includes:

determining a first to-be-processed image serving as a reference image and a second to-be-processed image serving as a non-reference image from the to-be-processed images;

performing image registration processing on the first to-be-processed image and the second to-be-processed image according to the region where the target object is located in the first to-be-processed image and the region where the target object is located in the second to-be-processed image;

and after the image registration is completed, overlapping the region except the region where the target object is positioned in the second image to be processed to the region except the region where the target object is positioned in the first image to be processed, so as to obtain the image to be processed, wherein the background picture of the image to be processed is in a dynamic effect.
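The multi-frame steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the registration method of the application: it assumes NumPy, a translation-only registration obtained by aligning the centroids of the two target regions, and a fixed blending weight. The names `centroid`, `register_and_overlay`, and `alpha` are hypothetical.

```python
import numpy as np

def centroid(mask):
    """Return the (row, column) centroid of a boolean mask."""
    ys, xs = np.nonzero(mask)
    return ys.mean(), xs.mean()

def register_and_overlay(ref, ref_mask, other, other_mask, alpha=0.5):
    """Align `other` to `ref` on the target region, then blend backgrounds.

    ref, other:           H x W or H x W x C arrays (reference / non-reference).
    ref_mask, other_mask: H x W boolean arrays, True on the target object.
    alpha:                hypothetical weight of the non-reference background.
    """
    ref = np.asarray(ref, dtype=np.float64)
    other = np.asarray(other, dtype=np.float64)
    ref_mask = np.asarray(ref_mask, dtype=bool)
    other_mask = np.asarray(other_mask, dtype=bool)

    # Translation-only registration (assumption): move the target of the
    # non-reference frame onto the target of the reference frame.
    (ry, rx), (oy, ox) = centroid(ref_mask), centroid(other_mask)
    dy, dx = int(round(ry - oy)), int(round(rx - ox))
    aligned = np.roll(other, (dy, dx), axis=(0, 1))
    aligned_mask = np.roll(other_mask, (dy, dx), axis=(0, 1))

    # Superimpose only where both frames show background.
    both_bg = ~ref_mask & ~aligned_mask
    out = ref.copy()
    out[both_bg] = (1 - alpha) * ref[both_bg] + alpha * aligned[both_bg]
    return out, (dy, dx)
```

Because the frames are aligned on the target object, the object stays sharp while the background, which moved between frames, is superimposed at different positions and appears blurred along the direction of motion.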

As one embodiment, before performing the image registration processing on the first to-be-processed image and the second to-be-processed image according to the region where the target object is located in the first to-be-processed image and the region where the target object is located in the second to-be-processed image, the method further includes:

acquiring a first characteristic point of an area where the target object is located in the first image to be processed and a second characteristic point of an area where the target object is located in the second image to be processed;

And detecting whether the target object is in a rigid motion state according to the first feature points and the second feature points, and if so, executing the step of carrying out image registration processing on the first to-be-processed image and the second to-be-processed image according to the region where the target object is in the first to-be-processed image and the region where the target object is in the second to-be-processed image.
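One way to test for rigid motion from matched feature points is to check that the pairwise distances between points are preserved from one frame to the next. The sketch below assumes the first and second feature points are already matched in the same order; the function name `is_rigid_motion` and the tolerance `rel_tol` are hypothetical, and the application does not state which criterion it uses.

```python
import math
from itertools import combinations

def is_rigid_motion(points_a, points_b, rel_tol=0.05):
    """Return True if all pairwise distances are preserved within rel_tol.

    points_a, points_b: matched lists of (x, y) feature points taken from
    the target region of the first and second image, in the same order.
    """
    if len(points_a) != len(points_b) or len(points_a) < 2:
        raise ValueError("need at least two matched feature points")
    for i, j in combinations(range(len(points_a)), 2):
        da = math.dist(points_a[i], points_a[j])
        db = math.dist(points_b[i], points_b[j])
        # A rigid body keeps the distance between any two of its points.
        if abs(da - db) > rel_tol * max(da, db, 1e-9):
            return False
    return True
```

A translated or rotated vehicle passes this check, whereas a walking pedestrian whose limbs move relative to each other does not. Note that a target moving toward or away from the camera changes scale in the image and would also fail unless the tolerance is relaxed.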

As one embodiment, the performing object segmentation processing on the image to be processed to obtain an area where the object is located in the image to be processed includes:

Inputting the image to be processed into a trained target object segmentation neural network model to obtain an area where the target object is located in the image to be processed, which is output by the trained target object segmentation neural network model; or

receiving a selection operation of a user on the image to be processed, and taking the area selected by the selection operation as the area where the target object is located in the image to be processed.

In a second aspect, an embodiment of the present application provides an image processing method, including:

Acquiring an image to be shot;

detecting a preset key point of a target object in the image to be shot to obtain the position information of the preset key point of the target object in the image to be shot;

Focusing the target object according to the position information of the preset key point of the target object in the image to be shot.
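As an illustration of the last step, the key point positions can be turned into a focus window that is handed to the camera's autofocus. The sketch below only computes the window; how it is passed to the focusing hardware is device-specific and is not shown. The function name `focus_window` and the `margin` parameter are hypothetical.

```python
def focus_window(keypoints, image_size, margin=0.1):
    """Return (left, top, right, bottom) of a focus window around keypoints.

    keypoints:  non-empty list of (x, y) positions of the preset key points.
    image_size: (width, height) of the image to be shot.
    margin:     hypothetical padding, as a fraction of the keypoint extent.
    """
    if not keypoints:
        raise ValueError("no key points detected")
    width, height = image_size
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    pad_x = margin * (max(xs) - min(xs))
    pad_y = margin * (max(ys) - min(ys))
    # Clamp the padded window to the image borders.
    left = max(0.0, min(xs) - pad_x)
    top = max(0.0, min(ys) - pad_y)
    right = min(float(width), max(xs) + pad_x)
    bottom = min(float(height), max(ys) + pad_y)
    return left, top, right, bottom
```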

As one embodiment, the method further comprises:

acquiring position change information of a preset key point of the target object in at least two continuous frames of images;

Judging whether the target object is in a preset motion state according to the position change information;

and when it is determined that the target object is in the preset motion state, starting a preset image shooting mode.
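A minimal sketch of the motion-state judgment, assuming the preset motion state simply means "moving faster than a threshold": the mean displacement of the key points between consecutive frames is compared with a hypothetical threshold `min_pixels_per_frame`. The function names are illustrative, and the application may define the preset motion state differently.

```python
import math

def mean_displacement(prev, curr):
    """Average distance moved by matched key points between two frames."""
    return sum(math.dist(p, c) for p, c in zip(prev, curr)) / len(prev)

def in_preset_motion_state(frames, min_pixels_per_frame=5.0):
    """frames: list of per-frame key point lists [(x, y), ...], same order.

    Returns True when the average per-frame displacement reaches the
    (hypothetical) threshold, i.e. the target counts as moving.
    """
    if len(frames) < 2:
        raise ValueError("need at least two consecutive frames")
    steps = [mean_displacement(a, b) for a, b in zip(frames, frames[1:])]
    return sum(steps) / len(steps) >= min_pixels_per_frame
```

A caller would start the preset image shooting mode only when this function returns True.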

As one embodiment, the method further comprises:

Acquiring the posture of the target object according to the position information of the preset key point of the target object in the image to be shot;

And shooting the image to be shot when the posture of the target object is determined to be the preset posture.
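The trigger condition above can be sketched as a predicate over named key points. The example condition ("both wrists above both shoulders") and the key point names are purely hypothetical; the application does not specify which condition is preset.

```python
def arms_raised(keypoints):
    """Hypothetical preset check: both wrists are above the shoulders.

    keypoints: dict mapping a key point name to (x, y), with y growing
    downward as in image coordinates.
    """
    required = ("left_wrist", "right_wrist", "left_shoulder", "right_shoulder")
    if any(name not in keypoints for name in required):
        return False
    return (keypoints["left_wrist"][1] < keypoints["left_shoulder"][1]
            and keypoints["right_wrist"][1] < keypoints["right_shoulder"][1])

def should_capture(keypoints, preset_check=arms_raised):
    """Return True when the detected key points satisfy the preset check."""
    return preset_check(keypoints)
```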

As one embodiment, the detecting the preset key point of the target object in the image to be shot to obtain the position information of the preset key point of the target object in the image to be shot includes:

performing target object segmentation processing on the image to be processed to obtain the region where the target object is located in the image to be processed, or a position rectangle (bounding box) of the target object;

and detecting the preset key points according to the region or the position rectangle of the target object in the image to be processed, and obtaining the position information of the preset key points of the target object in the image to be shot.
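The two steps above amount to running the key point detector on a crop and mapping the results back to full-image coordinates. In the sketch below the detector is an arbitrary callable; `brightest_pixel` is only a stand-in so the example runs, not a real key point detector, and all names are hypothetical.

```python
def detect_in_box(image, box, detector):
    """Detect key points inside a bounding box and return image coordinates.

    image:    2-D list of rows (any sliceable sequence of sequences).
    box:      (left, top, right, bottom) in integer pixel coordinates.
    detector: callable taking the crop and returning [(x, y), ...] in
              crop coordinates.
    """
    left, top, right, bottom = box
    crop = [row[left:right] for row in image[top:bottom]]
    # Shift crop coordinates back into the coordinate system of the image.
    return [(x + left, y + top) for x, y in detector(crop)]

def brightest_pixel(crop):
    """Stand-in detector: returns the position of the largest value."""
    value, x, y = max((v, x, y)
                      for y, row in enumerate(crop)
                      for x, v in enumerate(row))
    return [(x, y)]
```

Restricting detection to the target region reduces the search area and keeps key points from other objects out of the result.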

In a third aspect, an embodiment of the present application provides a terminal including a processor and a memory for storing a program; the program, when executed by the processor, causes the processor to implement the image processing method as described in the first aspect and/or the second aspect.

In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a computer program which, when executed by a processor, implements the image processing method according to the first and/or second aspects.

The embodiment of the application provides an image processing method, a terminal and a computer storage medium, wherein the image processing method is applied to the terminal and comprises the following steps: acquiring an image to be shot; detecting a preset key point of a target object in the image to be shot to obtain the position information of the preset key point of the target object in the image to be shot; focusing the target object according to the position information of the preset key point of the target object in the image to be shot. In this way, the target object in the image to be shot is subjected to preset key point detection, and focusing processing is performed on the target object according to the position information of the preset key point of the target object in the image to be shot, so that a clear image of the target object is shot, and the image quality is improved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.

Fig. 1 is a schematic diagram of a hardware structure of a mobile terminal implementing various embodiments of the present application;

Fig. 2 is a schematic diagram of a communication network system according to an embodiment of the present application;

Fig. 3 is a schematic flow chart of an image processing method according to an embodiment of the present application;

Fig. 4 is a second flowchart of an image processing method according to an embodiment of the present application;

Fig. 5 is a schematic flow chart of an image processing method according to an embodiment of the present application;

Fig. 6 is a schematic diagram of a specific flow chart of an image processing method according to an embodiment of the present application;

Fig. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application.

Detailed Description

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element. Furthermore, elements having the same name in different embodiments of the application may have the same meaning or different meanings; the particular meaning is determined by its interpretation in the specific embodiment or by the context of that embodiment.

It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope herein. The word "if" as used herein may be interpreted as "when," "upon," or "in response to a determination," depending on the context. Furthermore, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes," and/or "including" specify the presence of stated features, steps, operations, elements, components, items, categories, and/or groups, but do not preclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, categories, and/or groups. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions, steps or operations is in some way inherently mutually exclusive.

It should be understood that, although the steps in the flowcharts in the embodiments of the present application are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of the steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

It should be noted that step numbers such as S101 and S102 are used in this document to describe the corresponding content more clearly and concisely, and do not constitute a substantive limitation on the order of the steps. In a specific implementation, those skilled in the art may execute S102 before S101, and such variations are all within the scope of the present application.

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

In the following description, suffixes such as "module", "component" or "unit" used to denote elements serve only to facilitate the description of the present application and have no specific meaning in themselves. Thus, "module", "component" and "unit" may be used interchangeably.

The terminal may be implemented in various forms. For example, the terminals described in the present application may include mobile terminals such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a personal digital assistant (PDA), a portable media player (PMP), a navigation device, a wearable device, a smart bracelet, and a pedometer, as well as fixed terminals such as a digital TV and a desktop computer.

The following description will be given taking a mobile terminal as an example, and those skilled in the art will understand that the configuration according to the embodiment of the present application can be applied to a fixed type terminal in addition to elements particularly used for a moving purpose.

Referring to fig. 1, which is a schematic diagram of a hardware structure of a mobile terminal implementing various embodiments of the present application, the mobile terminal 100 may include: an RF (Radio Frequency) unit 101, a WiFi module 102, an audio output unit 103, an A/V (audio/video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111. Those skilled in the art will appreciate that the mobile terminal structure shown in fig. 1 does not limit the mobile terminal, and that the mobile terminal may include more or fewer components than shown, may combine certain components, or may use a different arrangement of components.

The following describes the components of the mobile terminal in detail with reference to fig. 1:

The radio frequency unit 101 may be used to receive and transmit signals while information is sent and received or during a call. Specifically, it receives downlink information from the base station and passes it to the processor 110 for processing, and it transmits uplink data to the base station. Typically, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution), and TDD-LTE (Time Division Duplexing-Long Term Evolution).

WiFi belongs to a short-distance wireless transmission technology, and a mobile terminal can help a user to send and receive e-mails, browse web pages, access streaming media and the like through the WiFi module 102, so that wireless broadband Internet access is provided for the user. Although fig. 1 shows a WiFi module 102, it is understood that it does not belong to the necessary constitution of a mobile terminal, and can be omitted entirely as required within a range that does not change the essence of the invention.

The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a talk mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output (e.g., a call signal reception sound, a message reception sound, etc.) related to a specific function performed by the mobile terminal 100. The audio output unit 103 may include a speaker, a buzzer, and the like.

The A/V input unit 104 is used to receive an audio or video signal. The A/V input unit 104 may include a graphics processor (Graphics Processing Unit, GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capturing device (e.g. a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106. The image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound (audio data) in a phone call mode, a recording mode, a voice recognition mode, and the like, and can process such sound into audio data. In the phone call mode, the processed audio (voice) data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 101 and then output. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and transmitting the audio signal.

The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor. Optionally, the ambient light sensor may adjust the brightness of the display panel 1061 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1061 and/or the backlight when the mobile terminal 100 is moved to the ear. As one kind of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in all directions (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait screens, related games, and magnetometer attitude calibration), in vibration-recognition functions (such as a pedometer and tap detection), and the like. The mobile phone may also be configured with other sensors such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described in detail here.

The display unit 106 is used to display information input by a user or information provided to the user. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.

The user input unit 107 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile terminal. In particular, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on the touch panel 1071 or thereabout by using any suitable object or accessory such as a finger, a stylus, etc.) and drive the corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts of a touch detection device and a touch controller. Optionally, the touch detection device detects the touch azimuth of the user, detects a signal brought by touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, and sends the touch point coordinates to the processor 110, and can receive and execute commands sent from the processor 110. Further, the touch panel 1071 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 107 may include other input devices 1072 in addition to the touch panel 1071. In particular, other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, mouse, joystick, etc., as specifically not limited herein.

Further, the touch panel 1071 may overlay the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, it transfers the operation to the processor 110 to determine the type of touch event, and the processor 110 then provides a corresponding visual output on the display panel 1061 according to the type of touch event. Although in fig. 1 the touch panel 1071 and the display panel 1061 are two independent components that implement the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 may be integrated with the display panel 1061 to implement the input and output functions of the mobile terminal, which is not limited herein.

The interface unit 108 serves as an interface through which at least one external device can be connected with the mobile terminal 100. For example, the external devices may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and an external device.

Memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a storage program area and a storage data area, and alternatively, the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The processor 110 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by running or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby performing overall monitoring of the mobile terminal. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor, the application processor optionally handling mainly an operating system, a user interface, an application program, etc., the modem processor handling mainly wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 110.

The mobile terminal 100 may further include a power source 111 (e.g., a battery) for supplying power to the respective components, and preferably, the power source 111 may be logically connected to the processor 110 through a power management system, so as to perform functions of managing charging, discharging, and power consumption management through the power management system.

Although not shown in fig. 1, the mobile terminal 100 may further include a bluetooth module or the like, which is not described herein.

In order to facilitate understanding of the embodiments of the present application, a communication network system on which the mobile terminal of the present application is based will be described below.

Referring to fig. 2, fig. 2 is a schematic diagram of a communication network system according to an embodiment of the present application. The communication network system is an LTE system of universal mobile telecommunications technology, and the LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and an operator's IP service 204, which are communicatively connected in sequence.

Specifically, the UE201 may be the terminal 100 described above, and will not be described herein.

The E-UTRAN202 includes eNodeB2021, other eNodeB2022, and so on. Optionally, the eNodeB2021 may connect with the other eNodeB2022 over a backhaul (e.g., an X2 interface), the eNodeB2021 is connected to the EPC203, and the eNodeB2021 may provide the UE201 with access to the EPC203.

EPC203 may include MME (Mobility Management Entity) 2031, HSS (Home Subscriber Server) 2032, other MMEs 2033, SGW (Serving Gateway) 2034, PGW (PDN Gateway, packet data network gateway) 2035, PCRF (Policy and Charging Rules Function) 2036, and so on. Optionally, MME2031 is a control node that handles signaling between UE201 and EPC203, providing bearer and connection management. HSS2032 provides registers, such as a home location register (not shown), for management functions and holds user-specific information about service characteristics, data rates, etc. All user data may be sent through SGW2034. PGW2035 may provide IP address allocation and other functions for UE201. PCRF2036 is the policy decision point of policy and charging control for traffic data flows and IP bearer resources; it selects and provides available policy and charging control decisions for a policy and charging enforcement function (not shown).

IP services 204 may include the internet, intranets, IMS (IP Multimedia Subsystem), or other IP services, etc.

Although the LTE system is described above as an example, it should be understood by those skilled in the art that the present application is not limited to LTE systems, but may be applied to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, and future new network systems.

Based on the above-mentioned mobile terminal hardware structure and communication network system, various embodiments of the present application are presented.

Referring to fig. 3, an image processing method provided in an embodiment of the present application is suitable for processing an image so that the picture shows a dynamic effect. The image processing method may be performed by an image processing apparatus provided in an embodiment of the present application, and the image processing apparatus may be implemented in software and/or hardware. In a specific application, the image processing apparatus may be a terminal such as a smart phone, a personal digital assistant, a tablet computer, or a video camera. In this embodiment, taking a terminal as the execution subject of the image processing method as an example, the image processing method includes the following steps:

Step S301: acquiring an image to be processed;

It should be noted that the image to be processed may be a single frame image, or may be at least two continuous frame images including the same target object, that is, at least two images obtained by continuously capturing the same target object while the capturing position remains unchanged.

Step S302: performing target object segmentation processing on the image to be processed to obtain an area where the target object is located in the image to be processed;

Here, in order to keep the target object clear while the background is motion-blurred, the target object needs to be segmented from the image so that it is excluded from the subsequent processing. The target object can be set according to actual requirements, and can be a vehicle, a pedestrian or another common moving object. It should be noted that the terminal may perform the target object segmentation processing on the image to be processed in an automatic mode or a manual mode; correspondingly, the terminal may provide a virtual key switch for turning on the automatic mode and/or the manual mode, so that the user can select which mode is adopted. In an embodiment, the performing target object segmentation processing on the image to be processed to obtain an area where the target object is located in the image to be processed includes: inputting the image to be processed into a trained target object segmentation neural network model to obtain the area where the target object is located in the image to be processed, which is output by the trained target object segmentation neural network model; optionally, the target object segmentation neural network model is obtained by training based on historical images containing the target object. It should be noted that the terminal may build the target object segmentation neural network model in advance by training based on historical images containing the target object, or the terminal may acquire, from a server, a target object segmentation neural network model built by such training. Given the input image to be processed, the target object segmentation neural network model automatically identifies the region where the target object is located, which is equivalent to performing matting on the image to be processed to obtain that region.
Here, the target object segmentation neural network model may be built by using an artificial intelligence algorithm; in this embodiment, a segmentation network of the DeepLab series is used to build the target object segmentation neural network model. In addition, when the terminal performs the target object segmentation processing on the image to be processed in the manual mode, the user can select the region where the target object is located from the image to be processed as required; specifically, the region can be selected by continuously moving a finger on the touch screen of the terminal. In an embodiment, the performing target object segmentation processing on the image to be processed to obtain an area where the target object is located in the image to be processed includes: receiving a selection operation of a user on the image to be processed, and taking the area selected by the selection operation as the area where the target object is located in the image to be processed. That is, the user selects a specific area as the area where the target object is located through touch input, and the terminal correspondingly uses the area selected by the user as the area where the target object is located in the image to be processed. Therefore, by providing multiple ways of acquiring the region where the target object is located in the image to be processed, operation is more convenient and the user experience is further improved.

Step S303: and carrying out superposition processing on the region except the region where the target object is located in the image to be processed to obtain the image to be processed with a dynamic background picture.

It will be appreciated that the superposition processing differs depending on whether the image to be processed is a single frame or multiple frames. In an embodiment, when the image to be processed is a single frame image, the performing superposition processing on the area except the area where the target object is located in the image to be processed, to obtain the image to be processed with a background picture having a dynamic effect, includes: with the area where the target object is located kept unchanged, performing dislocation superposition on the areas except the area where the target object is located in at least two images to be processed (the original image and at least one shifted version of it), to obtain the image to be processed with a background picture having a dynamic effect. That is, with the image to be processed as a reference and the area where the target object is located kept unchanged, the area outside the target object in one image to be processed is superimposed, with an offset, on the area outside the target object in at least one other image to be processed, so that the area outside the target object presents a dynamic blurring effect while the area where the target object is located remains clear. Here, the dislocation superposition may proceed as follows: with the image to be processed as a reference, the area other than the area where the target object is located is moved in a preset direction, such as left, right, up or down, to generate an image to be superimposed; the image to be superimposed is then superimposed on the original image to be processed while the area where the target object is located in the original image is kept unchanged.
Therefore, an image whose picture has a dynamic effect is obtained by dislocation superposition of the areas outside the area where the target object is located; the operation is simple and quick, and the user experience is further improved.
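As a non-limiting illustration of the single-frame dislocation superposition described above, the following minimal sketch averages each background pixel with horizontally shifted background pixels while leaving the target region untouched. The function name `dislocation_superpose`, the list-of-rows grayscale representation, the equal-weight averaging and the edge clamping are assumptions made for illustration only; the embodiment does not prescribe them.

```python
from typing import List

Image = List[List[float]]  # grayscale image stored as rows of pixel values
Mask = List[List[bool]]    # True where the target object is located


def dislocation_superpose(image: Image, mask: Mask, offsets: List[int]) -> Image:
    """Blend each background pixel with background pixels shifted horizontally.

    Pixels inside the target region (mask True) are copied unchanged, and
    target pixels are never smeared into the background.
    """
    width = len(image[0])
    result = []
    for y, row in enumerate(image):
        out_row = []
        for x, pixel in enumerate(row):
            if mask[y][x]:
                out_row.append(pixel)  # keep the target object sharp
                continue
            samples = [pixel]
            for d in offsets:
                src = min(max(x - d, 0), width - 1)  # clamp at the image edge
                if not mask[y][src]:  # only background contributes to the blur
                    samples.append(row[src])
            out_row.append(sum(samples) / len(samples))
        result.append(out_row)
    return result


# One-row example: a bright background detail at x=2, the target object at x=4.
image = [[0.0, 0.0, 90.0, 0.0, 50.0, 0.0]]
mask = [[False, False, False, False, True, False]]
blurred = dislocation_superpose(image, mask, offsets=[1, 2])
print(blurred)
```

In this example the background detail is spread over its neighbouring columns in the shift direction, while the target pixel keeps its original value. Shifting in another preset direction would only change how the source coordinate is computed.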

In an embodiment, when the image to be processed is at least two continuous frame images including the same target object, the overlapping processing is performed on the area of the image to be processed except the area where the target object is located, so as to obtain the image to be processed with a background picture in a dynamic effect, including:

It will be appreciated that, since the continuous frame images are obtained by photographing the target object at the same photographing position, the background of each frame image in the continuous frame images should be substantially the same, and all the continuous frame images contain the target object. The first to-be-processed image serving as the reference image and the second to-be-processed image serving as the non-reference image are determined from the to-be-processed images, wherein one frame of image is arbitrarily selected from the to-be-processed images to serve as the first to-be-processed image, and at least one frame of image with shooting time before the first to-be-processed image is selected to serve as the second to-be-processed image. Here, according to the region where the target object is located in the first to-be-processed image and the region where the target object is located in the second to-be-processed image, the positions of the target object in the first to-be-processed image and the second to-be-processed image can be obtained, respectively, then the first to-be-processed image and the second to-be-processed image are subjected to image registration processing by taking the first to-be-processed image as a reference image, and then the region except the region where the target object is located in the second to-be-processed image is overlapped to the region except the region where the target object is located in the first to-be-processed image, so that the region except the region where the target object is located in the first to-be-processed image presents a dynamic blurring effect, the region where the target object is located in the first to-be-processed image remains clear, and the to-be-processed image with a dynamic effect is obtained. 
It should be noted that, because the target object moves, its position differs between continuous frame images. The displacement of the target object between two images can therefore be obtained by performing image registration processing on them. One frame image is then taken as the reference image and, with the area where the target object is located kept unchanged, the other frame image is moved by that displacement and superimposed on the reference image, so that the picture of the reference image presents a dynamic effect. Therefore, by superimposing the areas of the continuous frame images other than the area where the target object is located, an image whose picture has a dynamic effect is obtained; the operation is simple and quick, and the user experience is further improved.
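The multi-frame case described above can be sketched as follows. This is a simplified illustration under stated assumptions: the displacement of the target object is estimated from the centroids of the two target regions, which stands in for the image registration processing and is not the registration method of the embodiment; the names `target_displacement` and `superpose_background` and the equal-weight averaging are likewise hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]
Mask = List[List[bool]]


def mask_centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (row, column) centroid of the True cells of a mask."""
    cells = [(y, x) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
    return (sum(y for y, _ in cells) / len(cells),
            sum(x for _, x in cells) / len(cells))


def target_displacement(ref_mask: Mask, other_mask: Mask) -> Tuple[int, int]:
    """Displacement (rows, columns) that moves the target in the
    non-reference image onto the target in the reference image."""
    ry, rx = mask_centroid(ref_mask)
    oy, ox = mask_centroid(other_mask)
    return round(ry - oy), round(rx - ox)


def superpose_background(ref_image: Image, ref_mask: Mask,
                         other_image: Image, other_mask: Mask) -> Image:
    """Shift the non-reference image by the target displacement and average
    its background into the background of the reference image."""
    dy, dx = target_displacement(ref_mask, other_mask)
    height, width = len(ref_image), len(ref_image[0])
    result = [row[:] for row in ref_image]
    for y in range(height):
        for x in range(width):
            sy, sx = y - dy, x - dx  # source pixel in the non-reference image
            inside = 0 <= sy < height and 0 <= sx < width
            if ref_mask[y][x] or not inside or other_mask[sy][sx]:
                continue  # keep target pixels and uncovered pixels as they are
            result[y][x] = (ref_image[y][x] + other_image[sy][sx]) / 2
    return result


# Static background detail at x=1; the target moves from x=2 to x=4.
other_image = [[0.0, 80.0, 50.0, 0.0, 0.0, 0.0]]
other_mask = [[False, False, True, False, False, False]]
ref_image = [[0.0, 80.0, 0.0, 0.0, 50.0, 0.0]]
ref_mask = [[False, False, False, False, True, False]]
result = superpose_background(ref_image, ref_mask, other_image, other_mask)
print(result)
```

Because the earlier frame is shifted so that the target objects coincide, the static background detail lands at a different position in the reference image and produces a trailing copy, while the target region of the reference image is left unchanged.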

In an embodiment, before performing the image registration processing on the first to-be-processed image and the second to-be-processed image according to the region where the target object is located in the first to-be-processed image and the region where the target object is located in the second to-be-processed image, the method further includes:

It will be appreciated that continuous frame images whose regions outside the target object can be superimposed should at least ensure that the pose of the target object is substantially consistent in each frame. Taking a person as the target object as an example, if the person is standing in one frame image, the person should also be standing, rather than squatting, in the other continuous frame images. The rigid motion state means that the posture of the target object is basically consistent across the continuous frame images, for example always standing or always squatting. For different target objects, the corresponding feature points may be different; for example, if the target object is a person, the corresponding feature points may be the head, neck, hands, back, etc.; if the target object is a vehicle, the corresponding feature points may be the wheels, windows, etc. Taking a person as the target object as an example, the first feature points may be the head, neck and back, and the second feature points may likewise be the head, neck and back. Detecting whether the target object is in a rigid motion state according to the first feature points and the second feature points may be done by judging, according to the first feature points and the second feature points, whether the similarity of the target object is equal to or greater than a preset threshold; if so, the target object is in the rigid motion state; otherwise, the target object is not in the rigid motion state.
Therefore, image registration processing is performed on the first to-be-processed image and the second to-be-processed image, according to the regions where the target object is located in them, only when the target object is in a rigid motion state. As a result, after the regions outside the target object are superimposed, the dynamic effect is basically consistent across the resulting image, which avoids the situation in which some regions outside the target object show the dynamic effect while others do not.
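The rigid motion check described above can be illustrated with the following minimal sketch. The embodiment only states that a similarity is compared with a preset threshold; the particular similarity measure used here (mean distance between centroid-aligned feature points, mapped to the range (0, 1]), the threshold value 0.8 and the function names are assumptions for illustration. Both point sets are assumed to contain the same named feature points.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def centered(points: Dict[str, Point]) -> Dict[str, Point]:
    """Translate the points so that their centroid is at the origin."""
    cx = sum(x for x, _ in points.values()) / len(points)
    cy = sum(y for _, y in points.values()) / len(points)
    return {name: (x - cx, y - cy) for name, (x, y) in points.items()}


def pose_similarity(first: Dict[str, Point], second: Dict[str, Point]) -> float:
    """Return 1.0 for identical layouts and smaller values as they diverge."""
    a, b = centered(first), centered(second)
    mean_dist = sum(math.dist(a[name], b[name]) for name in a) / len(a)
    return 1.0 / (1.0 + mean_dist)


def is_rigid_motion(first: Dict[str, Point], second: Dict[str, Point],
                    threshold: float = 0.8) -> bool:
    """Rigid motion state: similarity is equal to or greater than the threshold."""
    return pose_similarity(first, second) >= threshold


standing = {"head": (10.0, 0.0), "neck": (10.0, 5.0), "back": (10.0, 12.0)}
standing_moved = {"head": (16.0, 0.0), "neck": (16.0, 5.0), "back": (16.0, 12.0)}
squatting = {"head": (16.0, 6.0), "neck": (17.0, 9.0), "back": (20.0, 12.0)}

print(is_rigid_motion(standing, standing_moved))  # same posture, translated
print(is_rigid_motion(standing, squatting))       # posture changed
```

Aligning the centroids first makes the measure insensitive to where the target object is in the frame, so a pure translation of the target object counts as rigid motion.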

In summary, in the image processing method provided in the foregoing embodiment, the target object in the image is subjected to segmentation processing, and then the region except the region where the target object is located in the image is subjected to superposition processing, so as to obtain the image with a dynamic effect on the screen, which is simple and convenient to operate, and improves the user experience.

In an embodiment, before the obtaining the image to be processed, the method further includes:

Detecting preset key points of a target object in an image to be shot, and acquiring position information of the preset key points of the target object in the image to be shot;

It can be understood that, while previewing an image, the terminal may detect preset key points of the target object in the image to be shot, that is, the preview image, so as to obtain position information of the preset key points of the target object in the image to be shot. Here, the terminal may acquire the position of the target object in the image to be shot by performing target object segmentation processing on the image to be shot, and then detect the preset key points of the target object according to that position. Taking a human body as the target object as an example, the terminal can rapidly detect and locate the preset key points of the human body through a configured human skeleton key point detection network. The number of preset key points can be determined according to actual requirements and may be one or more; taking a human body as the target object as an example, the preset key points can be parts such as the top of the head, the facial features, the neck and the limbs. When shooting the target object, the target object is usually taken as the focus, so the target object can be focused according to the position information of its preset key points in the image to be shot, a clear image of the target object can be shot, and that image can be used as the image to be processed. In this way, focusing processing is performed on the target object according to the position information of the preset key points of the target object in the image to be shot, so that a clear image of the target object is shot and the image quality is improved.
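One possible way of turning key point positions into a focusing input is sketched below: the key points are enclosed in a rectangle, padded by a margin and clamped to the image bounds. The function name `focus_region`, the margin value and the rectangle representation are hypothetical, and no camera interface is modelled; how the rectangle is handed to the focusing hardware is outside this sketch.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]  # (x, y) pixel position of a key point


def focus_region(keypoints: Dict[str, Point],
                 image_size: Tuple[int, int],
                 margin: int = 10) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a padded box around the key points,
    clamped to an image of size (width, height)."""
    width, height = image_size
    xs = [x for x, _ in keypoints.values()]
    ys = [y for _, y in keypoints.values()]
    left = max(min(xs) - margin, 0)
    top = max(min(ys) - margin, 0)
    right = min(max(xs) + margin, width - 1)
    bottom = min(max(ys) + margin, height - 1)
    return left, top, right, bottom


# Example key points, as a skeleton key point detector might report them.
keypoints = {"head_top": (320, 5), "neck": (322, 60), "left_hand": (280, 150)}
region = focus_region(keypoints, image_size=(640, 480))
print(region)
```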

In an embodiment, after focusing the target object according to the position information of the preset key point of the target object in the image to be shot, the method further includes:

It can be understood that the posture of the target object is obtained according to the position information of the preset key points of the target object in the image to be shot; for example, taking a human body as the target object, the posture of the target object may be a standing posture, a moving posture, a specific hand gesture, or the like. The terminal can set at least one preset posture according to user requirements, so that the image to be shot is captured when the posture of the target object is detected to be the preset posture, thereby realizing automatic snapshot of the image. It should be noted that the terminal may determine whether the target object is in a motion state according to the position change information of the preset key points of the target object in N consecutive frames of images; if so, a preset snapshot mode is started, so that shooting is performed automatically when the posture of the target object is detected to be the preset posture. This improves the shooting experience and further improves the user experience.
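The snapshot decision described above can be sketched as follows. The motion measure (mean per-frame displacement of the key points), the threshold of 5 pixels, the string posture labels and the function names are assumptions for illustration; posture classification itself is taken as given, and consecutive frames are assumed to share at least one key point.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Keypoints = Dict[str, Point]


def mean_keypoint_motion(frames: List[Keypoints]) -> float:
    """Average distance moved by the key points between consecutive frames."""
    steps = []
    for prev, curr in zip(frames, frames[1:]):
        shared = prev.keys() & curr.keys()
        steps.append(sum(math.dist(prev[n], curr[n]) for n in shared) / len(shared))
    return sum(steps) / len(steps)


def should_capture(frames: List[Keypoints], current_posture: str,
                   preset_postures: Set[str],
                   motion_threshold: float = 5.0) -> bool:
    """Capture only when the target is moving and shows a preset posture."""
    in_motion = mean_keypoint_motion(frames) >= motion_threshold
    return in_motion and current_posture in preset_postures


jumping = [{"head": (100.0, 200.0)}, {"head": (100.0, 188.0)}, {"head": (100.0, 170.0)}]
standing_still = [{"head": (100.0, 200.0)}, {"head": (100.0, 200.0)}, {"head": (101.0, 200.0)}]

print(should_capture(jumping, "jump", {"jump"}))
print(should_capture(standing_still, "jump", {"jump"}))
```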

Referring to fig. 4, an image processing method provided by an embodiment of the present application is suitable for focused shooting. The image processing method may be performed by an image processing apparatus provided by an embodiment of the present application, and the image processing apparatus may be implemented in software and/or hardware. In a specific application, the image processing apparatus may be a terminal such as a smart phone, a personal digital assistant, a tablet computer, or a video camera. In this embodiment, taking a terminal as the execution subject of the image processing method as an example, the image processing method includes the following steps:

Step S401: acquiring an image to be shot;

Step S402: detecting a preset key point of a target object in an image to be shot, and obtaining position information of the preset key point of the target object in the image to be shot;

Step S403: focusing the target object according to the position information of the preset key point of the target object in the image to be shot.

In an embodiment, further comprising:

In an embodiment, the detecting the preset key point of the target object in the image to be shot to obtain the position information of the preset key point of the target object in the image to be shot includes:

Based on the same inventive concept as the foregoing embodiments, this embodiment describes the technical solution of the foregoing embodiments in detail by way of a specific example, taking the main body (that is, the subject of the shot) as the target object. Referring to fig. 5, an image processing method provided by this embodiment of the present application, which may also be referred to as a focus tracking shooting method, includes the following steps:

Step S501: acquiring a clear image to be processed of a main body;

Specifically, a clear planar image of a subject to be processed or a video or multi-frame image with a moving object is obtained.

Step S502: training a target segmentation network according to the collected data;

Here, data for the objects to be segmented is collected according to the requirements of the product design, and a dedicated target segmentation network is trained. Focus tracking shooting usually takes moving vehicles, pedestrians and other common moving objects as its subjects, so related data is collected to construct an AI training data set, and the target segmentation network is trained on the collected data set. The trained target segmentation network can then be used to segment the subject of the image acquired in step S501.

Step S503: acquiring a main body region of interest by adopting a manual or automatic mode;

Here, the subject target segmentation may be set to an automatic mode in which the subject target types existing in the sample set are segmented according to a pre-trained target segmentation network, and a manual mode in which a region to be kept clear, i.e., a subject region of interest, is selected by a user according to user's needs.

Step S504: the image matching of the continuous frame image and the reference image is completed by utilizing the characteristic points, namely, the continuous frame is mapped to the reference image;

Here, for a single frame image, after the region of interest is acquired, blurring is performed along a selectable axis (the blur may be at any angle); specifically, the blurring may superimpose images by the image dislocation superposition method while keeping the region of interest clear. For multi-frame images, the dynamic effect outside the subject region of interest is synthesized from several frames, and the result is more natural and realistic than with single-frame processing. Firstly, a reference image is selected as the reference for the focus tracking dynamic effect, and the subject region of interest is acquired in a manual or automatic mode. Continuous frame images from a period of time before the moment of the reference image are then acquired, and the subject region of interest is acquired in these continuous frame images at intervals. Next, feature points of the subject region of interest are calculated in the reference image and in the continuous frame images, and image matching of the continuous frame images with the reference image is completed by using the feature points. Finally, after registration, the reference image is used as the target image and the continuous frames of the preceding period are mapped onto it. The aim of this step is to simulate background motion while the position of the subject stays fixed.

Step S505: and the multi-frame focus tracking background is overlapped to realize the focus tracking dynamic effect.

Here, by superposing the motion backgrounds detected in the multi-frame images, the superposition of multi-frame focus tracking dynamic effects is realized, the main body is ensured to be clear and unchanged, and the main body area is not provided with dynamic special effects, namely the background presents dynamic effects.

In addition, during video recording the terminal can judge, based on a motion detection technology, whether a moving target subject exists in the current scene. Specifically, the terminal can perform motion detection on the preceding and following frame images by using an AI technology, a traditional optical flow method or another algorithm. If a moving target subject exists, the feature points of the subject region of interest in the preceding and following frame images are extracted, the two frame images are matched based on the feature points, and the similarity of the target subject is judged from the matched frame images. If the similarity is high, the target subject is in a rigid motion state; otherwise, the target subject is in a non-rigid motion state. When the target subject is in the rigid motion state, the focus tracking effect can be generated automatically for it, and the generated focus tracking effect image is displayed to the user as a thumbnail of the video for the user to select.

In summary, in the image processing method provided in this embodiment, the focus tracking effect may be synthesized based on a single image or multiple images, so that a static image may have a "speed" sense immediately. The special effect is suitable for some moving targets, and the dynamic sense of the picture is improved.

Based on the same inventive concept as the foregoing embodiments, this embodiment describes the technical solution of the foregoing embodiments in detail by way of a specific example, taking a human body as the target object. Referring to fig. 6, an image processing method provided by this embodiment of the present application, which may also be referred to as a focusing shooting method, includes the following steps:

Step S601: acquiring a plane image of a photographed moving person;

Here, the terminal may acquire a planar image of the subject moving person through the camera preview interface.

Step S602: acquiring a portrait outline or a portrait frame area according to a lightweight AI portrait segmentation network or a lightweight AI portrait area detection network;

Here, the terminal may design a lightweight AI portrait segmentation network to complete segmentation of a human body region in the planar image, or design a lightweight AI portrait region detection network to detect a portrait position rectangular frame, and then cut out a portrait region or a portrait contour region according to a portrait position obtained by the lightweight AI portrait segmentation network or the lightweight AI portrait region detection network.

Step S603: performing AI human skeleton key point detection within the portrait outline or portrait frame area;

Here, the terminal can design a lightweight AI human skeleton key point detection network to rapidly detect and locate human key points. The key points can be determined according to requirements and include main parts such as the top of the head, the facial features, the neck and the limbs. The lightweight AI human skeleton key point detection network supports complex scenes such as multi-person detection and large movements, and is therefore suitable for conditions such as slight occlusion and truncation of the human body.

Step S604: taking the key points of the human bones as focusing points or reference points for image shooting.

Here, the terminal may map the detected human skeleton key points to coordinate positions in the full image for the subsequent focusing calculation, and take the human skeleton key points as focus points or reference points for image shooting. It will be appreciated that the more human skeleton key points are acquired, the more accurate the focusing and the higher the rate of usable shots, but the time consumed also increases linearly. In addition, when a plurality of persons appear in the preview screen, single-person focusing or multi-person focusing may be selected. Using the human skeleton key points as focus points enables more accurate portrait position tracking and focus point selection, and effectively improves the focusing accuracy and the rate of usable shots in motion photography. In addition, the lightweight AI portrait segmentation network and the lightweight AI portrait area detection network are designed to improve detection accuracy and optimize computational efficiency; human skeleton key point detection can also be performed directly on the plane image of the moving person.
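Mapping key points detected inside a cropped portrait area back to full-image coordinates can be sketched as below. The names `to_full_image` and `focus_point`, and the choice of the key point centroid as a single focus point, are assumptions for illustration; the embodiment leaves the focusing calculation itself open.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y)


def to_full_image(keypoints: Dict[str, Point],
                  crop_origin: Tuple[float, float]) -> Dict[str, Point]:
    """Convert key points from crop coordinates to full-image coordinates.

    `crop_origin` is the (x, y) position of the crop's top-left corner
    in the full image.
    """
    ox, oy = crop_origin
    return {name: (x + ox, y + oy) for name, (x, y) in keypoints.items()}


def focus_point(keypoints: Dict[str, Point]) -> Point:
    """Use the centroid of the key points as a single focus point."""
    xs = [x for x, _ in keypoints.values()]
    ys = [y for _, y in keypoints.values()]
    return sum(xs) / len(xs), sum(ys) / len(ys)


# Key points detected inside a portrait frame whose top-left corner is (200, 100).
crop_keypoints = {"head_top": (30.0, 5.0), "neck": (32.0, 40.0)}
full_keypoints = to_full_image(crop_keypoints, crop_origin=(200.0, 100.0))
print(full_keypoints)
print(focus_point(full_keypoints))
```

For multi-person focusing, the same mapping could be applied per person, with one focus point computed for each set of key points.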

In addition, the image processing method provided by this embodiment can be further extended to self-starting automatic capture of human postures based on human body key point focus detection. Specifically, a human posture can be preset first; when the user makes a certain posture, posture detection based on the preset key points allows a photo of that specific posture to be captured while focusing accurately. For example, when the user jumps, the motion state of the user can be determined according to how the focus changed over the previous frames; once vigorous motion of the user is detected, the motion snapshot mode is automatically started so that an image is automatically captured when the posture of the user is a preset posture, or the motion video recording mode is automatically started. Here, once human body key point focus detection is started, its accurate focusing is relied on to capture the clearest posture photos for the user to select and save. When the motion video recording mode is activated, a sequence of sharp video frames can be generated thanks to the accurate focusing of human body key point focus detection. In addition, the slow motion mode can be activated in the video recording mode; it likewise relies on the accurate focusing of human body key point focus detection to generate a sequence of sharp video frames, which benefits the quality of slow-motion video generated by video frame interpolation.

In summary, in the image processing method provided in the above embodiment, the intelligent rapid auto-focusing method based on human body segmentation and human body key point detection can prevent the condition that the human image is out of focus in the motion shooting, improve the focusing speed and precision of the motion shooting, and improve the quality of the motion shooting image. That is, by means of advanced AI motion detection, object segmentation and keypoint detection algorithms, the focal point is always close to the portrait, and a clear and unblurred image can be shot by pressing the shutter. Meanwhile, the algorithm provided by the embodiment is not limited to moving portrait shooting, and can be correspondingly expanded to moving capture shooting scenes of animals, moving objects and the like.

Based on the same inventive concept as the previous embodiments, an embodiment of the present application provides a terminal, as shown in fig. 7, including: a processor 310 and a memory 311 for storing a computer program capable of running on the processor 310. The processor 310 illustrated in fig. 7 does not indicate that the number of processors 310 is one; it merely indicates the positional relationship of the processor 310 with respect to other devices, and in practical applications the number of processors 310 may be one or more. Likewise, the memory 311 illustrated in fig. 7 only indicates the positional relationship of the memory 311 with respect to other devices, and in practical applications the number of memories 311 may be one or more. The processor 310 is configured to implement the image processing method applied to the terminal described above when running the computer program.

The terminal may further include: at least one network interface 312. The various components in the terminal are coupled together by a bus system 313. It is appreciated that the bus system 313 is used to enable connected communication between these components. The bus system 313 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration the various buses are labeled as bus system 313 in fig. 7.

The memory 311 may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories. The nonvolatile memory may be a read-only memory (ROM, Read Only Memory), a programmable read-only memory (PROM, Programmable Read-Only Memory), an erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), an electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), a magnetic random access memory (FRAM, Ferromagnetic Random Access Memory), a flash memory (Flash Memory), a magnetic surface memory, an optical disk, or a compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); the magnetic surface memory may be a disk memory or a tape memory. The volatile memory may be a random access memory (RAM, Random Access Memory), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM, Static Random Access Memory), synchronous static random access memory (SSRAM, Synchronous Static Random Access Memory), dynamic random access memory (DRAM, Dynamic Random Access Memory), synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory), double data rate synchronous dynamic random access memory (DDRSDRAM, Double Data Rate Synchronous Dynamic Random Access Memory), enhanced synchronous dynamic random access memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), synchronous link dynamic random access memory (SLDRAM, SyncLink Dynamic Random Access Memory), and direct Rambus random access memory (DRRAM, Direct Rambus Random Access Memory). The memory 311 described in embodiments of the present application is intended to include, without being limited to, these and any other suitable types of memory.

The memory 311 in the embodiment of the present application is used to store various types of data to support the operation of the terminal. Examples of such data include: any computer program for operating on the terminal, such as an operating system and application programs; contact data; telephone book data; messages; pictures; video, etc. The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application programs may include various application programs, such as a media player (Media Player) and a browser (Browser), for implementing various application services. Here, a program for implementing the method of the embodiment of the present application may be included in an application program.

Based on the same inventive concept as the previous embodiments, the present embodiment further provides a computer storage medium in which a computer program is stored. The computer storage medium may be a memory such as a magnetic random access memory (FRAM, Ferromagnetic Random Access Memory), a read-only memory (ROM, Read Only Memory), a programmable read-only memory (PROM, Programmable Read-Only Memory), an erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), an electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), a flash memory (Flash Memory), a magnetic surface memory, an optical disk, or a compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); it may also be any of various devices including one or any combination of the above-described memories, such as a mobile phone, a computer, a tablet device, or a personal digital assistant. The computer program stored in the computer storage medium, when executed by a processor, implements the image processing method applied to the terminal. For the specific steps implemented when the computer program is executed by the processor, refer to the description of the embodiment shown in fig. 3 or fig. 4, which is not repeated here.

The technical features of the above-described embodiments may be arbitrarily combined. For brevity of description, not all possible combinations of the technical features in the above-described embodiments are described; however, as long as there is no contradiction between the combinations of the technical features, they should be considered to be within the scope of this description.

In this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a list of elements is included, and may include other elements not expressly listed.

The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. An image processing method, the method comprising:

Acquiring an image to be shot;

2. The method as recited in claim 1, further comprising:

3. The method as recited in claim 1, further comprising:

4. A method according to any one of claims 1 to 3, wherein the performing preset keypoint detection on the target object in the image to be photographed to obtain the position information of the preset keypoints of the target object in the image to be photographed includes:

5. A terminal, comprising: a processor and a memory for storing a computer program capable of running on the processor,

Wherein the processor, when running the computer program, implements the image processing method according to any one of claims 1 to 4.

6. A computer storage medium, characterized in that a computer program is stored, which, when being executed by a processor, implements the image processing method according to any one of claims 1 to 4.