CN112437226B - Image processing method, apparatus and storage medium - Google Patents

Image processing method, apparatus and storage medium

Info

Publication number
CN112437226B
Authority: CN (China)
Prior art keywords: image, processed, target image, target, prediction model
Legal status: Active
Application number: CN202010970450.9A
Other languages: Chinese (zh)
Other versions: CN112437226A (en)
Inventor: 周晨航
Current Assignee: Shanghai Chuanying Information Technology Co Ltd
Original Assignee: Shanghai Chuanying Information Technology Co Ltd
Application filed by Shanghai Chuanying Information Technology Co Ltd
Priority to CN202010970450.9A
Publication of CN112437226A
Application granted
Publication of CN112437226B


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/80: Camera processing pipelines; Components thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/61: Control of cameras or camera modules based on recognised objects
    • H04N 23/611: Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H04N 23/63: Control of cameras or camera modules by using electronic viewfinders
    • H04N 23/631: Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
    • H04N 23/632: Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters for displaying or modifying preview images prior to image capturing, e.g. variety of image resolutions or capturing parameters
    • H04N 23/633: Control of cameras or camera modules by using electronic viewfinders for displaying additional information relating to control or operation of the camera
    • H04N 5/00: Details of television systems
    • H04N 5/222: Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/2621: Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability

Abstract

The embodiments of the present application disclose an image processing method, an image processing device, and a storage medium. The method includes: acquiring an image to be processed, where the image to be processed includes a first object; and inputting the image to be processed into an object prediction model for processing to obtain a target image, where the target image includes a second object matched with the first object. The embodiments of the present application enable image prediction for associated objects and provide a novel image prediction application.

Description

Image processing method, apparatus and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, and a storage medium.
Background
Artificial Intelligence (AI) is a new technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence.
Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image processing, and the like. In the field of image processing, how to implement image prediction applications is a current research focus.
The foregoing description is provided for general background information and is not admitted to be prior art.
Disclosure of Invention
The embodiments of the present application provide an image processing method, an image processing device, and a storage medium, which enable image prediction for associated objects and provide a novel image prediction application.
In a first aspect, an embodiment of the present application provides an image processing method, including:
acquiring an image to be processed, wherein the image to be processed comprises a first object;
and inputting the image to be processed into an object prediction model for processing to obtain a target image, wherein the target image comprises a second object matched with the first object.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
an acquisition unit, configured to acquire an image to be processed, where the image to be processed includes a first object;
and a processing unit, configured to input the image to be processed into an object prediction model for processing to obtain a target image, where the target image includes a second object matched with the first object.
In a third aspect, an embodiment of the present application provides a computer device, including a processor and a memory, where the memory is configured to store a computer program including program instructions, and the processor is configured to call the program instructions to perform the method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing one or more instructions adapted to be loaded by a processor to perform the method according to the first aspect.
In the embodiments of the present application, an image to be processed that includes a first object is obtained, and the image to be processed is input into an object prediction model for processing to obtain a target image that includes a second object matched with the first object. Therefore, the embodiments of the present application enable image prediction for associated objects and provide a novel image prediction application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the application. In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly described below; it is apparent that other drawings can be obtained by those skilled in the art from these drawings without inventive effort.
Fig. 1 is a schematic hardware structure diagram of a mobile terminal implementing various embodiments of the present application;
fig. 2 is a communication network system architecture diagram according to an embodiment of the present application;
FIG. 3 is an exemplary system architecture diagram provided by embodiments of the present application;
fig. 4 is a flowchart of an image processing method provided in an embodiment of the present application;
FIG. 5a is a schematic diagram of a display interface provided by an embodiment of the present application;
FIG. 5b is a schematic view of another display interface provided by an embodiment of the present application;
FIG. 6a is a schematic diagram of yet another display interface provided by an embodiment of the present application;
FIG. 6b is a schematic diagram of yet another display interface provided by an embodiment of the present application;
FIG. 6c is a schematic view of a sticker selection page provided by an embodiment of the present application;
FIG. 7 is a flowchart of another image processing method provided in the embodiments of the present application;
FIG. 8a is a schematic diagram of a tag selection page provided by an embodiment of the present application;
FIG. 8b is a schematic diagram of another tab selection page provided by an embodiment of the present application;
FIG. 8c is a schematic diagram of another tab selection page provided by an embodiment of the present application;
fig. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings. With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element recited by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element. Further, similarly named elements, features, or components in different embodiments of the disclosure may have the same meaning or different meanings; their particular meaning should be determined by their interpretation in, or the context of, the specific embodiment.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope herein. The word "if," as used herein, may be interpreted as "when," "upon," or "in response to determining," depending on the context. Also, as used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes," and/or "including," when used in this specification, specify the presence of stated features, steps, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, items, species, and/or groups thereof. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C." An exception to this definition occurs only when a combination of elements, functions, steps, or operations is inherently mutually exclusive in some way.
It should be understood that, although the steps in the flowcharts in the embodiments of the present application are shown in the sequence indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not bound to a strict order and may be performed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times and in different orders, and which may be performed alternately or in turns with other steps or with at least some of the sub-steps or stages of other steps.
It should be noted that step numbers such as S201 and S202 are used herein only to describe the corresponding contents more clearly and briefly and do not constitute a substantive limitation on the sequence; in a specific implementation, those skilled in the art may perform S202 first and then S201, and such variations fall within the scope of protection of the present application.
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only for convenience of description of the present application and have no specific meaning in themselves. Thus, "module", "component" and "unit" may be used interchangeably.
The apparatus may be embodied in various forms. For example, the devices described in the present application may include mobile terminals such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation device, a wearable device, a smart band, a pedometer, and the like, and fixed terminals such as a Digital TV, a desktop computer, and the like.
The following description takes a mobile terminal as an example, and it will be understood by those skilled in the art that, except for elements particularly used for mobile purposes, the configuration according to the embodiments of the present application can also be applied to fixed terminals.
Referring to fig. 1, which is a schematic diagram of a hardware structure of a mobile terminal for implementing various embodiments of the present application, the mobile terminal 100 may include: RF (Radio Frequency) unit 101, WiFi module 102, audio output unit 103, a/V (audio/video) input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, processor 110, and power supply 111. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 1 is not intended to be limiting of mobile terminals, which may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile terminal in detail with reference to fig. 1:
the radio frequency unit 101 may be configured to receive and transmit signals during information transmission and reception or during a call, and specifically, receive downlink information of a base station and then process the downlink information to the processor 110; in addition, the uplink data is transmitted to the base station. Typically, radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communications), GPRS (General Packet Radio Service), CDMA2000(Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division duplex Long Term Evolution), and TDD-LTE (Time Division duplex Long Term Evolution).
WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help the user receive and send e-mails, browse web pages, access streaming media, and the like, providing the user with wireless broadband Internet access. Although fig. 1 shows the WiFi module 102, it is understood that it is not an essential component of the mobile terminal and may be omitted as needed without changing the essence of the invention.
The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output related to a specific function performed by the mobile terminal 100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 103 may include a speaker, a buzzer, and the like.
The A/V input unit 104 is used to receive audio or video signals. The A/V input unit 104 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capture device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106. The image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 may receive sounds (audio data) in a phone call mode, a recording mode, a voice recognition mode, or the like, and may process such sounds into audio data. In the phone call mode, the processed audio (voice) data may be converted into a format transmittable to a mobile communication base station via the radio frequency unit 101 for output. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and transmitting audio signals.
The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that may optionally adjust the brightness of the display panel 1061 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel 1061 and/or the backlight when the mobile terminal 100 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The display unit 106 is used to display information input by a user or information provided to the user. The Display unit 106 may include a Display panel 1061, and the Display panel 1061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 107 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect a touch operation performed by a user on or near the touch panel 1071 (e.g., an operation performed by the user on or near the touch panel 1071 using a finger, a stylus, or any other suitable object or accessory), and drive a corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts of a touch detection device and a touch controller. Optionally, the touch detection device detects a touch orientation of a user, detects a signal caused by a touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 110, and can receive and execute commands sent by the processor 110. In addition, the touch panel 1071 may be implemented in various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may include other input devices 1072. In particular, other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like, and are not limited to these specific examples.
Further, the touch panel 1071 may cover the display panel 1061, and when the touch panel 1071 detects a touch operation thereon or nearby, the touch panel 1071 transmits the touch operation to the processor 110 to determine the type of the touch event, and then the processor 110 provides a corresponding visual output on the display panel 1061 according to the type of the touch event. Although the touch panel 1071 and the display panel 1061 are shown in fig. 1 as two separate components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 1071 and the display panel 1061 may be integrated to implement the input and output functions of the mobile terminal, and is not limited herein.
The interface unit 108 serves as an interface through which at least one external device is connected to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and external devices.
The memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a program storage area and a data storage area. Optionally, the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, and the like), and so on; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile phone. Further, the memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 110 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby performing overall monitoring of the mobile terminal. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor, optionally, the application processor mainly handles operating systems, user interfaces, application programs, etc., and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.
The mobile terminal 100 may further include a power supply 111 (e.g., a battery) for supplying power to various components, and preferably, the power supply 111 may be logically connected to the processor 110 via a power management system, so as to manage charging, discharging, and power consumption management functions via the power management system.
Although not shown in fig. 1, the mobile terminal 100 may further include a bluetooth module or the like, which is not described in detail herein.
In order to facilitate understanding of the embodiments of the present application, a communication network system on which the mobile terminal of the present application is based is described below.
Referring to fig. 2, fig. 2 is an architecture diagram of a communication Network system according to an embodiment of the present disclosure, where the communication Network system is an LTE system of a universal mobile telecommunications technology, and the LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203, and an IP service 204 of an operator, which are in communication connection in sequence.
Specifically, the UE201 may be the terminal 100 described above, and is not described herein again.
The E-UTRAN 202 includes an eNodeB 2021 and other eNodeBs 2022, among others. Optionally, the eNodeB 2021 may be connected with the other eNodeBs 2022 through a backhaul (e.g., an X2 interface), the eNodeB 2021 is connected to the EPC 203, and the eNodeB 2021 may provide the UE 201 with access to the EPC 203.
The EPC 203 may include an MME (Mobility Management Entity) 2031, an HSS (Home Subscriber Server) 2032, other MMEs 2033, an SGW (Serving Gateway) 2034, a PGW (PDN Gateway) 2035, a PCRF (Policy and Charging Rules Function) 2036, and the like. Optionally, the MME 2031 is a control node that handles signaling between the UE 201 and the EPC 203, providing bearer and connection management. The HSS 2032 provides registers to manage functions such as the home location register (not shown) and holds subscriber-specific information about service characteristics, data rates, and the like. All user data may be sent through the SGW 2034; the PGW 2035 may provide IP address assignment for the UE 201 as well as other functions; and the PCRF 2036 is the policy and charging control policy decision point for service data flows and IP bearer resources, which selects and provides available policy and charging control decisions for a policy and charging enforcement function (not shown).
The IP services 204 may include the internet, intranets, IMS (IP Multimedia Subsystem), or other IP services, among others.
Although the LTE system is described as an example, it should be understood by those skilled in the art that the present application is not limited to the LTE system, but may also be applied to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, and future new network systems.
Based on the above mobile terminal hardware structure and communication network system, various embodiments of the present application are provided.
The embodiments of the present application mainly relate to Artificial Intelligence (AI), Natural Language Processing (NLP), Machine Learning (ML), and Image Processing (IP). By combining AI, NLP, ML, and IP, the target image corresponding to the image to be processed can be predicted more accurately according to information of the image to be processed. AI is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines have the capabilities of perception, reasoning, and decision-making.
AI technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
NLP is an important direction in the fields of computer science and AI. It studies various theories and methods that enable effective communication between humans and computers using natural language. NLP is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, i.e., the language people use every day, so it is closely related to the study of linguistics. NLP techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graphs, and the like.
ML is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve its own performance. ML is the core of artificial intelligence and the fundamental way to make computers intelligent, and its applications are spread across all fields of artificial intelligence. ML and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and demonstration learning.
In terms of image processing, embodiments of the present application also relate to Generative Adversarial Networks (GAN), which may include a generative network (G network) and a discriminative network (D network). The generative network is a network that generates images and can be understood as an image generator; the discriminative network is a network that determines whether an input image is a real image and can be understood as an image discriminator. The generative network and the discriminative network are trained against each other continuously. In the embodiments of the present application, the generative adversarial network is merely used as an example; other neural networks may also be used to perform image processing, such as Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Recurrent Neural Networks (RNN).
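For illustration only, the following minimal PyTorch sketch shows one adversarial training step of the kind described above, in which the discriminative network learns to separate real images from generated ones and the generative network learns to fool it; the latent size of 128, the single-logit discriminator output, and all names are assumptions for the sketch, not the patented implementation.

```python
# Illustrative sketch of one adversarial (GAN) training step; names are hypothetical.
import torch
import torch.nn as nn

def gan_step(generator, discriminator, real_images, opt_g, opt_d, device="cpu"):
    """One adversarial update: D learns to tell real from generated images,
    G learns to fool D. real_images has shape (N, C, H, W)."""
    bce = nn.BCEWithLogitsLoss()
    real_images = real_images.to(device)
    n = real_images.size(0)
    noise = torch.randn(n, 128, device=device)   # latent input size is an assumption

    # update the discriminative network (D)
    opt_d.zero_grad()
    fake_images = generator(noise).detach()
    loss_d = bce(discriminator(real_images), torch.ones(n, 1, device=device)) + \
             bce(discriminator(fake_images), torch.zeros(n, 1, device=device))
    loss_d.backward()
    opt_d.step()

    # update the generative network (G)
    opt_g.zero_grad()
    fake_images = generator(noise)
    loss_g = bce(discriminator(fake_images), torch.ones(n, 1, device=device))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```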
In the embodiments of the present application, based on a trained object prediction model, the image to be processed can be processed by the object prediction model to obtain a target image matched with the image to be processed, so that image prediction for associated objects can be achieved, providing a novel image prediction application. Referring specifically to fig. 3, fig. 3 is a diagram illustrating an exemplary system architecture to which an embodiment of the image processing method or the image processing apparatus of the present application may be applied.
As shown in fig. 3, the system architecture may include an input device 301, a terminal device 302, and a server 303. The network serves as a medium for providing communication links between the terminal device 302 and the server 303. The network may include various connection types, such as wired links, wireless communication links, or fiber optic cables. The input device 301 interacts with the terminal device 302 through the network, and various applications, such as a music playing application, an image processing application, a social application, and a search application, may be installed on the terminal device 302.
The terminal device 302 may be hardware or software, and the embodiment of the present application is not limited. When the terminal device 302 is hardware, it may be various devices with a display screen, including but not limited to a smart phone, a PC (Personal Computer), a notebook Computer, a PAD (tablet Computer), a smart wearable device, and the like; when the terminal device 302 is software, it can be installed in the above-listed devices. The server 303 may be a server, a server cluster composed of a plurality of servers, or a cloud computing service center.
It should be noted that the image processing method provided in the embodiment of the present application may be executed by the terminal device 302, or may be executed by the server 303, and accordingly, the image processing apparatus may be provided in the server 303, or may be provided in the terminal device 302. It is understood that the number of terminal devices, networks, and servers in fig. 3 is merely illustrative, and that any number of terminal devices, networks, and servers may be present, as appropriate.
Taking the example that the image processing apparatus is disposed in the server 303, the image processing flow mainly includes: the server 303 obtains an image to be processed (e.g., a face image, an eye image, a complete or partial human body image of a child), where the image to be processed includes a first object (e.g., a face, an eye, a complete or partial human body, etc. of a child), inputs the image to be processed into the object prediction model to be processed, and obtains a target image (e.g., a face image, an eye image, a complete or partial human body image, etc. of a middle-aged person), where the target image includes a second object (e.g., a face, an eye, a complete or partial human body, etc. of a middle-aged person). Optionally, the second object is matched with the first object, or a person to which the second object belongs and a person to which the first object belongs have an association relationship; if the first object is a face or a complete human body of a child and the second object is a face or a complete human body of a middle-aged person, the face or the complete human body of the middle-aged person may be a face or a complete human body of a father or a mother of the child corresponding to the predicted face or the complete human body of the child.
Referring to fig. 4, fig. 4 is a flowchart of an image processing method according to an embodiment of the present disclosure. The main body of execution of the method is as mentioned above, which may be the terminal device 302 or the server 303. The specific steps of the image processing method may include the following steps S401 to S402:
S401, obtaining an image to be processed, wherein the image to be processed comprises a first object.
In the embodiment of the application, the image to be processed may be an image selected by a user for an image locally stored in the terminal, or an image acquired from other devices or a network. The image to be processed may be an image related to a person, for example: the image of a face, eyes, nose, feet, hands, a whole body, etc., may be an image relating to an animal, such as an image of a cat face, cat eyes, etc. The image to be processed may be one, two or more than two, and the image to be processed includes a first object, such as a human face, a pair of eyes, a nose, etc. Of course, a plurality of objects, such as a plurality of human faces, a plurality of whole bodies of different persons (whole bodies including heads, limbs, and the like of persons), may be included in the image to be processed.
In an alternative embodiment, the acquiring the image to be processed may include: controlling a camera to shoot a shot object to obtain an initial video, and acquiring an image to be processed from the initial video; optionally, after the image to be processed is input into the object prediction model for processing to obtain the target image, the method further includes: correspondingly adjusting the motion state corresponding to the second object in the target image according to the motion state corresponding to the first object in each frame of image in the initial video to obtain a first image set; and generating a special effect video according to the initial video and the first image set.
In the embodiment of the application, an initial video is obtained by shooting through a camera, an image to be processed is obtained from the initial video, the image to be processed is input into an object prediction model to be processed, after a target image corresponding to each frame of the image to be processed is obtained, a motion state corresponding to a second object in the target image can be correspondingly adjusted according to a motion state corresponding to a first object in each frame of the image in the initial video, a first image set is obtained, and a special effect video is generated according to the initial video and the first image set. For example, the initial video shot by the camera is a three-minute video in which a pair of couples jog on the grassland first and then walk, the terminal acquires the complete human body images of the couple from each frame of video picture, inputs each frame of the complete human body images of the couple into the object prediction model for processing to obtain the corresponding human body image of the child, then correspondingly adjusts the motion state of the child in the generated corresponding image of the child according to the motion state corresponding to the couple in each frame of image in the initial video to obtain a first image set, and finally generates a special effect video in which the couple and the child jog on the grassland first and then walk together. It can be understood that if the couple is in hand, the child can also be in hand with the couple in the last special effect video, and of course, the position of the child in each frame of image can be set systematically or set by the user.
In an alternative embodiment, the generating a special effect video from the initial video and the first set of images may include: respectively synthesizing each image in the first image set with a corresponding image in the initial video to generate a second image set; and generating a special effect video according to the second image set.
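As a hedged sketch of the video flow described in the above embodiments (not the claimed implementation), the loop below reads each frame of the initial video, obtains the predicted image of the second object for that frame, composites it back into the frame to form the second image set, and writes the result as the special effect video; `predict_object` and `paste_object` are hypothetical helpers standing in for the object prediction model and the compositing step.

```python
# Illustrative frame-by-frame pipeline for the special-effect video (hypothetical helpers).
import cv2

def make_special_effect_video(src_path, dst_path, predict_object, paste_object):
    """predict_object(frame) -> predicted image of the second object (assumed);
    paste_object(frame, obj_img) -> frame with that object composited in (assumed)."""
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

    while True:
        ok, frame = cap.read()                   # one frame of the initial video
        if not ok:
            break
        obj_img = predict_object(frame)          # target image containing the second object
        composed = paste_object(frame, obj_img)  # synthesized frame of the second image set
        writer.write(composed)

    cap.release()
    writer.release()
```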
In an alternative embodiment, the acquiring the image to be processed may include: controlling a camera to acquire images of a shot object to obtain a preview image, and displaying the preview image in a display interface; determining the preview image as an image to be processed; optionally, after the image to be processed is input into the object prediction model for processing to obtain the target image, the method may further include: and displaying the target image in the display interface.
In the embodiment of the application, a camera is used for acquiring an image to obtain a preview image, the preview image is determined as the image to be processed and input into the object prediction model for processing to obtain a target image, and then the target image is displayed in the display interface. For example, when the photographed object is a middle-aged person, the preview image is a complete human body image of the middle-aged person; the complete human body image is input into the object prediction model for processing to obtain a complete human body image of a child, and the complete human body image of the child is displayed on the display interface.
In an alternative embodiment, the displaying the target image in the display interface may include: replacing the preview image with the target image in the display interface; or, displaying the preview image and the target image in the display interface, wherein optionally, the display positions of the preview image and the target image are different; or, the preview image and the target image are synthesized to obtain a synthesized image, and the synthesized image is displayed in the display interface, and optionally, a position corresponding to the first object in the synthesized image is adjacent to a position corresponding to the second object.
In the embodiment of the application, displaying the target image on the display interface may be to replace the preview image with the target image, that is, only the target image is displayed on the display interface; or displaying two images on a display interface, wherein the two images comprise a preview image and a target image (as shown in fig. 5a) with different display positions, so that a user can compare the images conveniently; or the obtained target image and the preview image are synthesized to obtain one image, and the positions of the corresponding objects in the target image and the preview image are adjacent (as shown in fig. 5 b). For example, the preview image is an image of an elderly person, the target image is an image of a girl, and the two images are combined to obtain an image of a girl standing beside an elderly person. In an optional embodiment, the method may further comprise: and when the triggering operation of the user on the shooting button is detected, saving the preview image and the target image. In an optional embodiment, the method may further include: and when the triggering operation of the user on the shooting button is detected, saving the image displayed in the display interface.
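Purely as an illustration of the composite-display option above, the following Pillow sketch places the target image beside the preview image so that the first object and the second object appear adjacent in one synthesized image; the function name and the side-by-side layout are assumptions, since the embodiment leaves the exact positions to the system or the user.

```python
# Illustrative compositing of the preview image and the target image (Pillow).
from PIL import Image

def compose_side_by_side(preview: Image.Image, target: Image.Image) -> Image.Image:
    """Place the target image next to the preview image so the first and
    second objects appear adjacent, as in the composite-display option."""
    target = target.resize((target.width * preview.height // target.height,
                            preview.height))               # match heights
    canvas = Image.new("RGB", (preview.width + target.width, preview.height))
    canvas.paste(preview, (0, 0))
    canvas.paste(target, (preview.width, 0))
    return canvas
```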
S402, inputting the image to be processed into the object prediction model for processing to obtain a target image, wherein the target image comprises a second object matched with the first object.
In the embodiment of the application, the object prediction model may have a preset attribute tag set by a relevant operator, or a preset attribute tag set by a user, or the object prediction model does not have a preset attribute tag set, and the user needs to define the attribute tag by himself/herself. Optionally, the preset attribute tag may include a first attribute tag and/or a second attribute tag. In an optional implementation manner, the object prediction model includes a preset attribute tag, and the inputting the image to be processed into the object prediction model for processing to obtain the target image may include: and inputting the image to be processed into an object prediction model for processing to obtain a target image, wherein the attribute characteristics of the second object are matched with the preset attribute label.
In an optional implementation manner, the object prediction model includes a prediction network and a judgment network, and inputting the image to be processed into the object prediction model for processing to obtain the target image may include: inputting the image to be processed into the prediction network for processing to obtain an image to be output; inputting the image to be output into the judgment network for processing to obtain a judgment result; and when the judgment result indicates that the similarity between the image to be processed and the image to be output is greater than or equal to a similarity threshold, taking the image to be output as the target image.
Optionally, the object prediction model may include two neural networks, one being the prediction network and the other being the judgment network. When the image to be processed is input into the object prediction model, the image to be processed is first input into the prediction network for processing to generate an image to be output; the image to be output is then input into the judgment network for judgment, and if the judgment result indicates that the similarity between the image to be processed and the image to be output is greater than or equal to the similarity threshold, the image to be output is taken as the target image. It is understood that the similarity threshold may be preset, and may be, for example, in the range of 62% to 81%; the embodiment of the present application is not limited thereto.
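A minimal sketch of this two-stage flow, assuming the prediction network maps the image to be processed to an image to be output and the judgment network returns a scalar similarity between the two images, might look as follows; all names and the default threshold value are illustrative assumptions.

```python
# Illustrative two-stage inference with a prediction network and a judgment network
# (module and function names are assumptions, not the patented implementation).
import torch

def predict_target_image(image_to_process, prediction_net, judgment_net,
                         similarity_threshold=0.7):
    """Run the prediction network, then let the judgment network score the
    similarity between input and output; accept the output as the target
    image only when the score reaches the threshold."""
    with torch.no_grad():
        image_to_output = prediction_net(image_to_process)
        similarity = judgment_net(image_to_process, image_to_output).item()
    if similarity >= similarity_threshold:
        return image_to_output          # taken as the target image
    return None                         # below threshold: no target image in this sketch
```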
In the embodiment of the present application, the object prediction model is obtained by training according to image sample data, specifically, by obtaining image sample data, the image sample data may include a training sample image and a result sample image, optionally, an object in the training sample image matches an object in the result sample image, and the initial object prediction model is trained according to the training sample image and the result sample image, so as to obtain a trained object prediction model. For example, image sample data related to the faces of parents and children may be created, the face image of children/the face image of parents is used as a training sample image, the face image of parents/the face image of children is used as a result sample image, an initial object prediction model is trained for multiple times, and parameters in the object prediction model are adjusted for multiple times, so that the face image of children is input into the trained object prediction model for processing, and the face image of parents can be obtained, or the face image of parents is input into the trained object prediction model for processing, and the face image of children can be obtained. It should be noted that, only the training prediction of the face image is taken as an example here, other face organs, a complete human body, a partial human body, an animal, and the like may also be selected, and the embodiment of the present application is not limited. It is understood that the training sample images may be one, two, or more than two, and the result sample images may also be one, two, or more than two.
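The following simplified sketch illustrates only the paired-supervision idea described above (training sample image mapped to result sample image, e.g. a child face paired with a parent face); it uses a plain pixel-wise loss rather than the adversarial training discussed in this application, and every name in it is hypothetical.

```python
# Simplified, hypothetical training loop over paired sample images.
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

class PairedFaceDataset(Dataset):
    """Pairs of (training sample image, result sample image), e.g. child/parent faces."""
    def __init__(self, pairs):                # pairs: list of (tensor, tensor)
        self.pairs = pairs
    def __len__(self):
        return len(self.pairs)
    def __getitem__(self, idx):
        return self.pairs[idx]

def train_object_predictor(model, pairs, epochs=10, lr=1e-4):
    loader = DataLoader(PairedFaceDataset(pairs), batch_size=8, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.L1Loss()                     # pixel-wise reconstruction loss
    for _ in range(epochs):
        for sample_img, result_img in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(sample_img), result_img)
            loss.backward()
            optimizer.step()
    return model
```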
In an optional embodiment, if the first object is a face image, the image to be processed is input into the object prediction model to be processed, so as to obtain a target image, and a person corresponding to the second object in the target image has a relationship with a person corresponding to the first object.
In this embodiment of the application, the image to be processed may include one or more first objects. When the image to be processed includes a plurality of first objects, the object prediction model may select one of the first objects for processing; for example, if the image to be processed includes a plurality of identical face images, the object prediction model may select one of the face images for processing. The kinship relationship here may refer to a direct (lineal) blood relationship, i.e., relatives who are directly related by blood across successive generations, such as parents and children, grandparents and grandchildren, and the like, with lineal relatives being linked together by a chain of direct birth facts. It is understood that, to realize the image prediction here, corresponding training sample images and result sample images are needed to train the object prediction model; for example, the training sample image may be a facial image of a mother or a facial image of a father, and the result sample image is a facial image of a child; or the training sample image may be a facial image of a child and the result sample image is a facial image of a father or a mother. It should be noted that the training of face images is taken only as an example; images of other human organs, a complete human body, a partial human body, animals, and the like may also be used for training, and the embodiment of the present application is not limited thereto.
In an alternative embodiment, the image to be processed may include a first image and a second image, and the person corresponding to the object in the first image and the person corresponding to the object in the second image have a relationship; inputting an image to be processed into the object prediction model for processing to obtain a target image, which may include: inputting the first image and the second image into an object prediction model for processing to obtain a target image; the person corresponding to the second object in the target image is the predicted person having a relationship with the persons corresponding to the objects in the first and second images.
In the embodiment of the present application, the person corresponding to the object in the first image and the person corresponding to the object in the second image have a kinship relationship, where the relationship may be a direct (lineal) relationship, or a sibling relationship such as that between brothers or sisters. It can be understood that there may be one or more images to be processed. When the image to be processed includes two or more different objects with or without an association relationship, the image to be processed may be directly input into the object prediction model for processing; or an image area corresponding to each object (such as the first image and the second image) may be extracted from the image to be processed, and the extracted image area corresponding to each object may be input into the object prediction model. Optionally, taking the object as a face image as an example, the association relationship may be a direct blood relationship, a sibling relationship, or the like.
For example, taking the object as a face, where the first image is an image of the face of child a, the second image is an image of the face of child b, and child a and child b are siblings, the first image and the second image are input into the trained object prediction model for processing to obtain the person corresponding to the second object in the target image, where the predicted person has a kinship relationship with the persons corresponding to the objects in the first image and the second image; for example, the target image may be a predicted image of the father or mother of child a and child b, in which case the predicted relationship is specifically a parent-child relationship. Alternatively, the first image may be an image of the face of child a and the second image may be an image of the face of child a's father; the two images are input into the object prediction model for processing, and the obtained second object in the target image is the predicted face of child a's mother. It should be noted that the object may also be an animal, a complete human body, a partial human body, or the like, and the embodiment of the present application is not particularly limited. It is understood that, to implement the image prediction here, corresponding training sample images and result sample images are required to train the object prediction model; for example, the training sample images may be a facial image of child c and a facial image of child d, and the result sample image may be a facial image of the father or of the mother, or the result sample images may be a facial image of the father and a facial image of the mother; or the training sample images may be a facial image of child c and a facial image of the father/mother, and the result sample image is a facial image of the mother/father; or the training sample images may be facial images of a plurality of children and a facial image of either parent, and the result sample image is a facial image of the other parent not included in the training sample images. It should be noted that the parents and the children in the training sample images and the result sample images have a kinship relationship, but it is to be understood that the person corresponding to the object in the training sample image and the person corresponding to the object in the result sample image may also have other relationships.
In an optional implementation manner, the image to be processed may include a first image and a second image, and the inputting the image to be processed into the object prediction model for processing to obtain the target image may include: inputting the first image and the second image into an object prediction model for processing to obtain a target image; the person corresponding to the second object in the target image is the predicted person having a relationship with the persons corresponding to the objects in the first and second images.
In the embodiment of the present application, for example, taking the object in the image as a face, the first image is a face image of a young man and the second image is a face image of a young woman. The first image and the second image are input into the object prediction model for processing, and the person corresponding to the second object in the obtained target image is a predicted person having a kinship relationship with the persons corresponding to the objects in the first image and the second image. If the preset attribute tag in the object prediction model is a smiling face of a man wearing glasses, the person corresponding to the second object in the target image is a predicted smiling man wearing glasses who has a kinship relationship with the young man and the young woman; the kinship relationship may be a parent-child relationship, that is, by inputting a face image of a man and a face image of a woman into the object prediction model, a predicted image of the face of their child may be obtained. For the image prediction here, corresponding training sample images and result sample images are needed to train the object prediction model; for example, the training sample images may be a facial image of a father and a facial image of a mother, and the result sample image may be a facial image of child e, or the result sample images may be a facial image of child f and a facial image of child e. It should be noted that the training of face images is taken only as an example; images of other human organs, a complete human body, a partial human body, animals, and the like may also be used for training, and the embodiment of the present application is not limited thereto.
In an optional implementation manner, the image to be processed is input into the object prediction model for processing, so as to obtain a target image comprising a third image and a fourth image; optionally, the persons corresponding to the objects in the third image and the fourth image are predicted persons having a relationship with the person corresponding to the object in the image to be processed.
In the embodiment of the present application, the relationship may be an immediate-family relationship. Taking the object as a face image as an example, if the image to be processed is an image of a middle-aged male and includes a first object of a face, the image is input into the object prediction model, and the obtained target image comprises a third image and a fourth image, where the third image may be a predicted image of child a of the middle-aged male and the fourth image may be a predicted image of child b of the middle-aged male. In order to realize this image prediction, training sample images and result sample images are needed to train the object prediction model: the training sample image may be a facial image of a child, and the result sample images may be a facial image of the father and a facial image of the mother; or the training sample image may be a facial image of the father or the mother, and the result sample images are a facial image of child c and a facial image of child d. It should be noted that only training on facial images is taken as an example here; training with images of a human organ, a whole human body, a part of a human body, an animal, and the like may also be selected, and the embodiment of the present application is not limited.
In an optional implementation manner, after the image to be processed is input into the object prediction model for processing, and a target image is obtained, the method may further include: determining an image area corresponding to a second object in the target image, and setting the image area as a movable image area; when the movement operation of the user for the image area is detected, the area position of the image area in the target image is adjusted according to the movement operation.
In this embodiment of the application, after obtaining the target image, the terminal may determine an image area corresponding to the second object, for example, a face image area or a whole or partial human body image area, and set the image area as a movable image area, and the user may move the image area to a desired position through a moving operation. It can be understood that the background image in the target image may also be set as a movable image area, so that the user can move the background image through a moving operation, thereby improving the user experience.
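As a rough illustration of the movable image area described above, the sketch below cuts out the pixels marked by a binary mask for the second object and re-pastes them at the offset given by the user's moving operation; the use of a mask and the crude median fill of the vacated area are assumptions for illustration, not requirements of this embodiment.

```python
# Illustrative sketch only: moving the image area of the second object by (dx, dy).
import numpy as np

def move_image_area(target: np.ndarray, mask: np.ndarray, dx: int, dy: int) -> np.ndarray:
    """target: H x W x 3 image; mask: H x W boolean array marking the movable image area."""
    ys, xs = np.nonzero(mask)
    patch = target[mask].copy()                              # pixels of the movable area
    moved = target.copy()
    moved[mask] = np.median(target[mask], axis=0)            # crude fill of the vacated area
    new_ys = np.clip(ys + dy, 0, target.shape[0] - 1)
    new_xs = np.clip(xs + dx, 0, target.shape[1] - 1)
    moved[new_ys, new_xs] = patch                            # drop the area at its new position
    return moved
```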
In an alternative embodiment, the target image is a three-dimensional image, and the method may further include: when the rotation operation of the user on the second object in the target image is detected, adjusting the image content corresponding to the second object in the target image according to the rotation operation.
In this embodiment, if the target image is a three-dimensional image, the user may perform rotation and movement operations on a second object in the target image, for example, rotate the human body 45 degrees to the left, move the human body 2 centimeters to the right, and the like; of course, it can be understood that the user may perform a rotation operation on the background image in the target image, so that the background image can be rotated to an angle or direction that the user likes, and further the target image can better meet the personalized requirements of the user.
In an optional implementation manner, after the image to be processed is input into the object prediction model for processing, and a target image is obtained, the method may further include: acquiring a sticker to be added; and adding the sticker to be added into the target image to obtain a special effect image. In the embodiment of the application, as shown in fig. 6a, various stickers are set in the terminal, for example, stickers related to characters, cartoons, seasons, and expressions, and the user can select one to add to the target image to obtain a special effect image. It is understood that the sticker of the embodiment of the present application may be dynamic or static.
In an optional embodiment, the adding the sticker to be added to the target image to obtain a special effect image may include: when the sticker to be added is a decorative sticker related to the five sense organs, determining a target five sense organs matched with the sticker to be added in the second object; and adding the sticker to be added to the position of the target five sense organs in the target image to obtain a special effect image. In the embodiment of the application, when the sticker to be added selected by the user is a decorative sticker related to the five sense organs (as shown in fig. 6b, such decorative stickers may include glasses, earrings, lipstick, and the like), the decorative sticker to be added is correspondingly added to the position of the target five sense organs in the target image to obtain a special effect image; for example, a glasses sticker is added at the position of the eyes, and an earring sticker is added at the corresponding position of the ears.
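A minimal sketch of this feature-matched placement is given below. It assumes the face_recognition library for facial landmarks and a hand-written mapping from sticker type to landmark keys; the mapping, the scaling factor, and the use of the jawline landmarks as a stand-in for the ear positions are all assumptions made for illustration.

```python
# Illustrative sketch only: matching a decorative sticker to a target facial feature
# and pasting it at that position in the target image.
import face_recognition
import numpy as np
from PIL import Image

# Assumed mapping from sticker kind to face_recognition landmark keys.
FEATURE_FOR_STICKER = {
    "glasses":  ("left_eye", "right_eye"),
    "lipstick": ("top_lip", "bottom_lip"),
    "earrings": ("chin",),          # approximation: ears are not landmarked, so the jawline is used
}

def add_feature_sticker(target_path: str, sticker_path: str, kind: str) -> Image.Image:
    target = Image.open(target_path).convert("RGBA")
    landmarks = face_recognition.face_landmarks(np.array(target.convert("RGB")))[0]  # assumes one face found
    points = [p for key in FEATURE_FOR_STICKER[kind] for p in landmarks[key]]
    xs, ys = zip(*points)
    cx, cy = (min(xs) + max(xs)) // 2, (min(ys) + max(ys)) // 2      # centre of the matched feature
    width = max(1, int((max(xs) - min(xs)) * 1.4))                   # slightly wider than the feature
    sticker = Image.open(sticker_path).convert("RGBA")
    sticker = sticker.resize((width, max(1, int(width * sticker.height / sticker.width))))
    target.paste(sticker, (cx - sticker.width // 2, cy - sticker.height // 2), sticker)
    return target
```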
In an optional embodiment, the adding the sticker to be added to the target image to obtain a special effect image may include: adding the sticker to be added to an initial position in the target image; and when a moving operation of the user on the sticker to be added is detected, adjusting the adding position of the sticker to be added according to the moving operation to obtain a special effect image. In the embodiment of the application, after the user selects the sticker to be added, the sticker may be added to an initial position in the target image, where the initial position may be set by the system or by the user; the user can then move the sticker to a selected position through a moving operation, and the sticker is adjusted to that position to obtain the special effect image, thereby meeting the personalized requirements of the user.
In an alternative embodiment, obtaining the sticker to be added may include: displaying a sticker selection page, the sticker selection page including at least one sticker option; and when the selection operation aiming at the sticker selection page is detected, using the sticker selected by the selection operation as the sticker to be added. In this embodiment of the application, as shown in fig. 6c, the terminal interface displays a sticker selection page, where the sticker selection page includes at least one sticker option, each sticker option corresponds to one sticker picture or multiple sticker pictures, and when a user selects a sticker, a sticker selected by the user from the sticker pictures corresponding to the sticker options is used as a sticker to be added. It will be appreciated that a sticker selection sub-page may be provided for each sticker option, such as a sticker selection sub-page relating to spring, summer, and autumn may be included with respect to the season sticker option, which may correspond to multiple stickers, such as trees, grass, flowers, etc.
In an alternative embodiment, the displaying the sticker selection page may include: acquiring the characteristic attribute of the person corresponding to the second object; selecting at least one type of paster matched with the characteristic attribute from a paster database according to the characteristic attribute; and displaying a sticker selection page according to the selected at least one type of stickers matched with the characteristic attributes. In the embodiment of the application, when a sticker selection page is displayed, the feature attributes of a person corresponding to a second object in a target image can be obtained first, wherein the feature attributes can include one or more of age, expression, gender and skin color, at least one type of sticker matched with the feature attributes is selected from a sticker database according to the feature attributes, and a sticker selection interface is displayed according to the selected at least one type of sticker matched with the feature attributes; it is understood that the terminal stores in advance the correspondence of the characteristic attribute with the type of the sticker.
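The attribute-to-sticker matching can be pictured with the toy lookup below: the characteristic attributes obtained for the person corresponding to the second object select which sticker types appear on the selection page. The attribute values and the database layout are assumptions made up for illustration and are not specified by this embodiment.

```python
# Illustrative sketch only: selecting sticker types that match the person's characteristic attributes.
STICKER_DATABASE = {
    "cartoon": {"ages": {"child", "teen"},                      "genders": {"male", "female"}},
    "beard":   {"ages": {"adult", "elderly"},                   "genders": {"male"}},
    "flowers": {"ages": {"teen", "adult", "elderly"},           "genders": {"female"}},
    "season":  {"ages": {"child", "teen", "adult", "elderly"},  "genders": {"male", "female"}},
}

def select_sticker_types(age_group: str, gender: str) -> list[str]:
    """Return the sticker types whose stored attributes match the person's characteristic attributes."""
    return [name for name, attrs in STICKER_DATABASE.items()
            if age_group in attrs["ages"] and gender in attrs["genders"]]

# Example: the selection page for a predicted elderly woman would list these types.
print(select_sticker_types("elderly", "female"))   # ['flowers', 'season'] under this toy database
```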
In the embodiment of the application, an image to be processed comprising a first object is obtained, and the image to be processed is input into an object prediction model for processing, so as to obtain a target image comprising a second object matched with the first object; therefore, image prediction of associated objects can be realized through the embodiment of the application, and a novel image prediction application is provided.
Referring to fig. 7, fig. 7 is a flowchart of another image processing method according to an embodiment of the present disclosure. The execution subject of the method is as mentioned above, and may be the terminal device 302 or the server 303; in this method, no preset attribute tag is set in the object prediction model.
The specific steps of the image processing method may include the following steps S701 to S706:
S701, acquiring an image to be processed, wherein the image to be processed comprises a first object.
It should be noted that step S701 in this embodiment may specifically refer to step S401 in the foregoing embodiment, and details of this embodiment are not described again.
S702, acquiring a first attribute label.
In the embodiment of the present application, the first attribute tag is a generation-class attribute tag, that is, it plays a critical role in the finally generated target image. Taking the image to be processed as an image of a person as an example, as shown in fig. 8a, the first attribute tag may include a specific attribute tag under tag options such as gender and age. In an alternative embodiment, obtaining the first attribute tag may include: displaying a tag selection page, wherein the tag selection page comprises at least one attribute tag option; and when a selection operation aiming at the tag selection page is detected, taking the attribute tag selected by the selection operation as the first attribute tag. That is to say, when no preset attribute tag is set in the object prediction model, a tag selection page is provided for the user to select from; for example, the user may select on the tag selection page that the gender of the predicted object in the target image is male and the age is youth, and the person corresponding to the object in the finally generated target image is then a young male. It should be noted that the display mode of the tag selection page is only an example; the tag selection page may also be displayed in a preset fixed screen area, in a floating window, or in other modes, and the embodiment of the present application is not limited.
S703, inputting the image to be processed and the first attribute label into the object prediction model for processing to obtain a target image, wherein the first attribute feature of the second object is matched with the first attribute label.
In the embodiment of the application, the image to be processed and the first attribute tag are input into the object prediction model for processing to obtain a target image, and the first attribute feature of the second object is matched with the first attribute tag; for example, if the first attribute tag selects female and elderly, the second object in the target image is an elderly woman.
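One common way to feed such a generation-class tag into the model, sketched below as an assumption rather than the method fixed by this embodiment, is to encode the selected tag values as a vector and concatenate it with the image as extra input channels.

```python
# Illustrative sketch only: combining the image to be processed with a first attribute tag.
import torch

GENDER = {"male": 0, "female": 1}
AGE = {"child": 0, "youth": 1, "middle-aged": 2, "elderly": 3}

def encode_first_attribute_tag(gender: str, age: str) -> torch.Tensor:
    tag = torch.zeros(len(GENDER) + len(AGE))
    tag[GENDER[gender]] = 1.0
    tag[len(GENDER) + AGE[age]] = 1.0
    return tag                                    # e.g. "female" + "elderly" -> [0, 1, 0, 0, 0, 1]

def predict_with_tag(model: torch.nn.Module, image: torch.Tensor, gender: str, age: str) -> torch.Tensor:
    """image: (1, 3, H, W); the tag vector is tiled over the spatial grid and concatenated as
    channels, so the model is assumed to accept 3 + 6 input channels."""
    tag = encode_first_attribute_tag(gender, age)
    _, _, h, w = image.shape
    tag_maps = tag.view(1, -1, 1, 1).expand(1, tag.numel(), h, w)
    return model(torch.cat([image, tag_maps], dim=1))
```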
S704, acquiring a second attribute label.
In the embodiment of the present application, the second attribute tag is a modification-class attribute tag, that is, it plays a role in assisting modification of the object in the final target image. Taking the image to be processed as an image of a person as an example, as shown in fig. 8b, the second attribute tag may include a specific tag corresponding to a tag option such as posture, expression, hair, or accessory. It should be noted that the tag selection page for acquiring the first attribute tag may be the same as the tag selection page for acquiring the second attribute tag (as shown in fig. 8c), that is, the tag selection page displays both the first attribute tag options and the second attribute tag options. Of course, the first and second attribute tag options may also be displayed on different tag selection pages (as in fig. 8a and fig. 8b).
In an optional implementation manner, after the first attribute tag is obtained, a second attribute tag is obtained, and the image to be processed, the first attribute tag, and the second attribute tag are input into the object prediction model to be processed, so as to obtain a target image, where a first attribute feature of the second object is matched with the first attribute tag, and a second attribute feature of the second object is matched with the second attribute tag.
S705, determining a second attribute feature of the second object according to the second attribute tag.
In this embodiment of the application, the second attribute tag may include a plurality of attribute tags, such as, in the posture attribute tag option, a front face turned 30 degrees to the left, a front face turned 60 degrees to the right, a front face turned 10 degrees to the left, or a face turned 45 degrees to the right; in the expression attribute tag option, a smile, a sweet smile, tears, or a fake smile; in the hair attribute tag option, wavy curls, a ponytail, short straight hair, or a bob; and attribute tags in other attribute tag options. The second attribute feature of the second object is determined through the acquired second attribute tag, so as to process the object in the target image.
S706, processing the target image according to the second attribute feature to obtain a processed target image.
In this embodiment of the application, after the predicted target image matching the first attribute tag is obtained, the target image may be further processed. Specifically, taking the target image as an image of a person as an example, as shown in fig. 8b, the user may select attribute tags on the tag selection page, so that the finally processed target image matches the second attribute feature. For example, if the predicted target image matching the first attribute tag is an image of an elderly woman, and the user selects on the tag selection page that the expression is smile, the hair is wavy curls, and the accessory is glasses, the processed target image is an image of a smiling elderly woman with wavy curls wearing glasses.
In an optional embodiment, after the image to be processed is input into the object prediction model for processing, and a target image is obtained, the method may further include: beautifying the target image to obtain a beautified target image; optionally, the beautification treatment comprises a beauty treatment and/or a filter treatment. In the embodiment of the present application, after obtaining the target image, the beautification processing may be performed on the target image, specifically, the user may select to beautify an object of the target image, may select to beautify a background image in the target image, and may select to beautify an area at any position in the target image, and optionally, the beautification processing may include a beauty processing and/or a filter processing, a style conversion, and the like.
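The beautification step is left open by this embodiment; purely as an illustration, the sketch below applies a beauty (skin-smoothing) pass with a bilateral filter followed by a simple warm-tone filter pass, with all parameter values chosen arbitrarily.

```python
# Illustrative sketch only: a beauty pass plus a simple filter pass on the target image.
import cv2
import numpy as np

def beautify(target_bgr: np.ndarray) -> np.ndarray:
    smoothed = cv2.bilateralFilter(target_bgr, d=9, sigmaColor=75, sigmaSpace=75)  # skin smoothing
    warm = smoothed.astype(np.float32)
    warm[..., 2] = np.clip(warm[..., 2] * 1.08, 0, 255)   # boost the red channel (BGR layout)
    warm[..., 0] = np.clip(warm[..., 0] * 0.95, 0, 255)   # soften the blue channel
    return warm.astype(np.uint8)
```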
In the embodiment of the application, an image to be processed and a first attribute tag selected by a user are input into an object prediction model to obtain a target image in which the first attribute feature of the second object matches the first attribute tag; then, a second attribute feature of the second object is determined according to a second attribute tag selected by the user, and the target image is processed according to the second attribute feature to obtain a processed target image. Image prediction of associated objects can thus be realized through the embodiment of the application, a novel image prediction application is provided, and the interest of human-computer interaction is improved.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an image processing apparatus according to an exemplary embodiment of the present application, where the apparatus may be mounted on a computer device in the foregoing method embodiment, and the computer device may specifically be the server 303 shown in fig. 3. Of course, in some embodiments, the object prediction model may also be installed on a terminal device, and an application program designed based on the corresponding object prediction model is installed on the terminal device. The image processing apparatus shown in fig. 9 may be used to perform some or all of the functions in the method embodiments described above with respect to fig. 4 and 7. Wherein, the detailed description of each unit is as follows:
an obtaining unit 901, configured to obtain an image to be processed, where the image to be processed includes a first object;
a processing unit 902, configured to input the image to be processed into an object prediction model for processing, so as to obtain a target image, where the target image includes a second object matched with the first object.
In an optional embodiment, the obtaining unit 901 is further configured to obtain a first attribute tag; the processing unit 902 is further configured to input the image to be processed and the first attribute tag into an object prediction model for processing, so as to obtain a target image, where the first attribute feature of the second object is matched with the first attribute tag.
In an optional implementation manner, the obtaining unit 901 is specifically configured to display a tag selection page, where the tag selection page includes at least one attribute tag option; and when the selection operation aiming at the tag selection page is detected, taking the attribute tag selected by the selection operation as a first attribute tag.
In an optional implementation manner, the image to be processed includes a first image and a second image, and a person corresponding to an object in the first image and a person corresponding to an object in the second image have a relationship; the processing unit 902 is specifically configured to:
inputting the first image and the second image into an object prediction model for processing to obtain a target image; the person corresponding to the second object in the target image is a predicted person having a relationship with the persons corresponding to the objects in the first image and the second image.
In an optional implementation manner, the processing unit 902 is specifically configured to input the image to be processed into an object prediction model for processing, so as to obtain a target image; optionally, the target image includes a third image and a fourth image, and the people corresponding to the objects in the third image and the fourth image are predicted people having a relationship with the people corresponding to the objects in the image to be processed.
In an optional implementation manner, the obtaining unit 901 is further configured to obtain a second attribute tag, and determine a second attribute feature of the second object according to the second attribute tag;
the processing unit 902 is further configured to process the target image according to the second attribute feature to obtain a processed target image.
In an optional implementation manner, the object prediction model includes a prediction network and a decision network, and the processing unit 902 is specifically configured to input the image to be processed into the prediction network for processing, so as to obtain an image to be output;
inputting the image to be output into the judgment network for processing to obtain a judgment result;
and when the judgment result indicates that the similarity between the image to be processed and the image to be output is greater than or equal to a similarity threshold, taking the image to be output as a target image.
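The two-stage flow above can be pictured with the sketch below: the prediction network proposes an image to be output, the judgment network scores how plausibly it corresponds to the image to be processed, and the output is accepted as the target image only when that score reaches the similarity threshold. The network interfaces, the 0.8 threshold and the retry loop (which only helps if the prediction network is stochastic) are assumptions for illustration; this embodiment only specifies the acceptance condition.

```python
# Illustrative sketch only: prediction network + judgment network with a similarity threshold.
import torch

def predict_target_image(prediction_net: torch.nn.Module,
                         judgment_net: torch.nn.Module,
                         image_to_process: torch.Tensor,
                         similarity_threshold: float = 0.8,
                         max_attempts: int = 5):
    prediction_net.eval()
    judgment_net.eval()
    with torch.no_grad():
        for _ in range(max_attempts):
            image_to_output = prediction_net(image_to_process)
            # the judgment network is assumed to score the (input, output) pair with a sigmoid in [0, 1]
            judgment = judgment_net(torch.cat([image_to_process, image_to_output], dim=1))
            if judgment.item() >= similarity_threshold:
                return image_to_output           # accepted as the target image
    return None                                  # no acceptable prediction within the attempt budget
```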
In an optional implementation manner, the object prediction model is obtained by training according to image sample data, and the obtaining unit 901 is further configured to obtain the image sample data, where the image sample data includes a training sample image and a result sample image; optionally, objects in the training sample images and objects in the result sample images match; the processing unit 902 is further configured to train an initial object prediction model according to the training sample images and the result sample images to obtain a trained object prediction model.
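A minimal training-step sketch in the style of a conditional GAN is shown below, pairing each training sample image with its result sample image; the binary cross-entropy losses, the L1 reconstruction weight of 100 and the assumption that the judgment network ends in a sigmoid are illustrative choices, not details fixed by this embodiment.

```python
# Illustrative sketch only: one paired training step for the object prediction model.
import torch
import torch.nn.functional as F

def train_step(prediction_net, judgment_net, opt_g, opt_d, training_sample, result_sample):
    # 1. the judgment network learns to tell real result samples from predicted ones
    fake = prediction_net(training_sample).detach()
    d_real = judgment_net(torch.cat([training_sample, result_sample], dim=1))
    d_fake = judgment_net(torch.cat([training_sample, fake], dim=1))
    d_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2. the prediction network learns to fool the judgment network and to match the result sample
    fake = prediction_net(training_sample)
    d_fake = judgment_net(torch.cat([training_sample, fake], dim=1))
    g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake)) + \
             100.0 * F.l1_loss(fake, result_sample)          # pix2pix-style reconstruction term
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```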
In an optional implementation manner, the processing unit 902 is configured to input the image to be processed into an object prediction model for processing, and after obtaining a target image, is further configured to determine an image area corresponding to the second object in the target image, and set the image area as a movable image area; the image processing device is further used for adjusting the area position of the image area in the target image according to the movement operation when the movement operation of the user on the image area is detected.
In an optional implementation manner, the target image is a three-dimensional image, and the processing unit 902 is further configured to, when a rotation operation of the user on the second object in the target image is detected, adjust image content corresponding to the second object in the target image according to the rotation operation.
In an optional implementation manner, the obtaining unit 901 is further configured to control the camera to shoot a shooting object, obtain an initial video, and obtain an image to be processed from the initial video; the processing unit 902 is further configured to input the image to be processed into an object prediction model for processing, so as to obtain a target image, and further configured to correspondingly adjust a motion state corresponding to the second object in the target image according to a motion state corresponding to the first object in each frame of image in the initial video, so as to obtain a first image set; and further configured to generate a special effects video from the initial video and the first set of images.
In an optional implementation manner, the processing unit 902 is configured to generate a special effect video according to the initial video and the first image set, and specifically, to synthesize each image in the first image set with a corresponding image in the initial video, respectively, to generate a second image set; and is further configured to generate a special effect video from the second set of images.
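As a rough illustration of this unit's behaviour, the sketch below blends every adjusted target image in the first image set with the corresponding frame of the initial video to build the second image set, and writes the result out as the special effect video; the simple 50/50 overlay, the mp4v codec and the 30 fps rate are assumptions, and the actual synthesis method is not limited by this embodiment.

```python
# Illustrative sketch only: compositing the first image set with the initial video frames
# and writing the composited frames (second image set) out as a special effect video.
import cv2

def generate_special_effect_video(initial_frames, first_image_set, out_path="special_effect.mp4", fps=30):
    """initial_frames and first_image_set: equal-length lists of same-sized BGR numpy arrays."""
    h, w = initial_frames[0].shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    second_image_set = []
    for frame, target in zip(initial_frames, first_image_set):
        composite = cv2.addWeighted(frame, 0.5, target, 0.5, 0)   # simple overlay as the synthesis step
        second_image_set.append(composite)
        writer.write(composite)
    writer.release()
    return second_image_set
```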
In an optional implementation manner, when acquiring the image to be processed, the obtaining unit 901 is specifically configured to control the camera to perform image acquisition on a shot object to obtain a preview image, and display the preview image in a display interface; and determine the preview image as the image to be processed; the apparatus further comprises a display unit 903 for displaying the target image in the display interface.
In an optional implementation manner, the display unit 903 is specifically configured to replace the preview image with the target image in the display interface; or, displaying the preview image and the target image in the display interface, wherein optionally, the display positions of the preview image and the target image are different; or, the preview image and the target image are synthesized to obtain a synthesized image, and the synthesized image is displayed in the display interface, and optionally, a position corresponding to the first object in the synthesized image is adjacent to a position corresponding to the second object.
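For the third display option, a composite can be produced in many ways; the sketch below simply places the target image next to the preview image so that the predicted second object sits adjacent to the captured first object, which is one possible interpretation rather than the prescribed synthesis.

```python
# Illustrative sketch only: synthesising the preview image and the target image side by side.
from PIL import Image

def compose_preview_and_target(preview: Image.Image, target: Image.Image) -> Image.Image:
    preview = preview.convert("RGB")
    target = target.convert("RGB").resize(preview.size)
    canvas = Image.new("RGB", (preview.width + target.width, preview.height))
    canvas.paste(preview, (0, 0))
    canvas.paste(target, (preview.width, 0))   # predicted person placed adjacent to the captured one
    return canvas
```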
In an optional embodiment, the obtaining unit 901 is further configured to obtain a sticker to be added; the processing unit 902 is further configured to add the to-be-added sticker to the target image, so as to obtain a special effect image.
In an optional embodiment, the processing unit 902 is further specifically configured to determine, when the sticker to be added is a decorative sticker related to the five sense organs, a target five sense organs in the second object that match the sticker to be added; and add the sticker to be added to the position of the target five sense organs in the target image to obtain a special effect image.
In an optional implementation, the processing unit 902 is specifically further configured to add the sticker to be added to an initial position in the target image; and when a moving operation of the user on the sticker to be added is detected, adjust the adding position of the sticker to be added according to the moving operation to obtain a special effect image.
In an alternative embodiment, the display unit 903 is further configured to display a sticker selection page, where the sticker selection page includes at least one sticker option; the processing unit 902 is further configured to, when a selection operation for the sticker selection page is detected, take the sticker selected by the selection operation as a sticker to be added.
In an optional implementation manner, the obtaining unit 901 is further configured to obtain a characteristic attribute of a person corresponding to the second object; the processing unit 902 is further configured to select at least one type of sticker matching the characteristic attribute from a sticker database according to the characteristic attribute; and displaying a sticker selection page according to the selected at least one type of stickers matched with the characteristic attributes.
In an optional implementation manner, the processing unit 902 is further configured to perform a beautification processing on the target image, so as to obtain an beautified target image; optionally, the beautification treatment comprises a beauty treatment and/or a filter treatment.
According to an embodiment of the present application, some steps involved in the image processing methods shown in fig. 4 and fig. 7 may be performed by respective units in the image processing apparatus shown in fig. 9. For example, step S401 shown in fig. 4 may be executed by the acquisition unit 901 shown in fig. 9, and step S402 may be executed by the processing unit 902 shown in fig. 9. Steps S701 to S702 and S704 to S705 shown in fig. 7 may be executed by the acquisition unit 901 shown in fig. 9, and steps S703 and S706 may be executed by the processing unit 902 shown in fig. 9. The units in the image processing apparatus shown in fig. 9 may be respectively or entirely combined into one or several other units, or some unit(s) may be further split into multiple functionally smaller units; this can achieve the same operation without affecting the achievement of the technical effects of the embodiments of the present application. The units are divided based on logic functions; in practical application, the function of one unit may be realized by a plurality of units, or the functions of a plurality of units may be realized by one unit. In other embodiments of the present application, the image processing apparatus may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by the cooperation of a plurality of units.
According to another embodiment of the present application, the image processing apparatus shown in fig. 9 may be constructed by running a computer program (including program codes) capable of executing the steps involved in the methods shown in fig. 4 and fig. 7 on a general-purpose computing apparatus, such as a computer, that includes processing elements and storage elements such as a Central Processing Unit (CPU), a random access memory (RAM), and a read-only memory (ROM), thereby implementing the image processing method of the embodiment of the present application. The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed by the above-described computing apparatus via the computer-readable recording medium.
Based on the same inventive concept, the principle and the advantageous effect of the image processing apparatus provided in the embodiment of the present application for solving the problem are similar to those of the image processing method in the embodiment of the present application for solving the problem, and for brevity, the principle and the advantageous effect of the implementation of the method can be referred to, and are not described herein again.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a computer device according to an exemplary embodiment of the present application, where the computer device includes at least a processor 1001, a communication interface 1002, and a memory 1003. The processor 1001, the communication interface 1002, and the memory 1003 may be connected by a bus or in other manners. The processor 1001 (or Central Processing Unit, CPU) is the computing core and control core of the terminal, and can analyze various instructions in the terminal and process various data of the terminal. For example, the CPU can be used for analyzing a power-on/power-off instruction sent to the terminal by a user and controlling the terminal to perform power-on/power-off operations; for another example, the CPU may transmit various types of interactive data between the internal structures of the terminal, and so on. The communication interface 1002 may optionally include a standard wired interface or a wireless interface (e.g., WI-FI, a mobile communication interface, etc.), which may be controlled by the processor 1001 for transceiving data; the communication interface 1002 may also be used for transmission and interaction of data inside the terminal. The memory 1003 (Memory) is a storage device in the terminal, and stores programs and data. It is understood that the memory 1003 may include a built-in memory of the terminal, and may also include an expansion memory supported by the terminal. The memory 1003 provides storage space that stores the operating system of the terminal, which may include, but is not limited to: an Android system, an iOS system, a Windows Phone system, etc., which is not limited in this application.
In the embodiment of the present application, the processor 1001 executes the executable program code in the memory 1003 to perform the following operations:
acquiring an image to be processed through a communication interface 1002, wherein the image to be processed comprises a first object;
and inputting the image to be processed into an object prediction model for processing to obtain a target image, wherein the target image comprises a second object matched with the first object.
In an alternative embodiment, the first attribute tag is obtained through the communication interface 1002; and inputting the image to be processed and the first attribute label into an object prediction model for processing to obtain a target image, wherein the first attribute characteristic of the second object is matched with the first attribute label.
In an optional embodiment, the obtaining of the first attribute tag through the communication interface 1002 specifically includes displaying a tag selection page, where the tag selection page includes at least one attribute tag option;
and when the selection operation aiming at the tag selection page is detected, taking the attribute tag selected by the selection operation as a first attribute tag.
In an optional implementation manner, the image to be processed includes a first image and a second image, and a person corresponding to an object in the first image and a person corresponding to an object in the second image have a relationship; the processor 1001 inputs the image to be processed into the object prediction model for processing, so as to obtain a target image, which may specifically include: inputting the first image and the second image into an object prediction model for processing to obtain a target image; the person corresponding to the second object in the target image is a predicted person having a relationship with the persons corresponding to the objects in the first image and the second image.
In an optional implementation manner, the inputting, by the processor 1001, the image to be processed into the object prediction model for processing to obtain the target image may specifically include: inputting the image to be processed into an object prediction model for processing to obtain a target image; optionally, the target image includes a third image and a fourth image, and the people corresponding to the objects in the third image and the fourth image are predicted people having a relationship with the people corresponding to the objects in the image to be processed.
In an optional implementation manner, after the processor 1001 inputs the image to be processed into the object prediction model for processing, so as to obtain a target image, the processor 1001 acquires a second attribute tag through the communication interface 1002; determining a second attribute feature of the second object according to the second attribute tag; the processor 1001 processes the target image according to the second attribute feature to obtain a processed target image.
In an optional implementation manner, the object prediction model includes a prediction network and a decision network, and the processor 1001 inputs the image to be processed into the object prediction model to process the image to be processed, so as to obtain the target image, which may specifically include: inputting the image to be processed into the prediction network for processing to obtain an image to be output; inputting the image to be output into the judgment network for processing to obtain a judgment result;
and when the judgment result indicates that the similarity between the image to be processed and the image to be output is greater than or equal to a similarity threshold, taking the image to be output as a target image.
In an optional implementation, the object prediction model is obtained by training according to image sample data, and the processor 1001 acquires the image sample data through the communication interface 1002, where the image sample data includes a training sample image and a result sample image; optionally, objects in the training sample images and objects in the result sample images match; the processor 1001 trains an initial object prediction model according to the training sample image and the result sample image to obtain a trained object prediction model.
In an optional implementation manner, the processor 1001 is further configured to determine an image area corresponding to the second object in the target image after inputting the image to be processed into the object prediction model for processing to obtain the target image, and set the image area as a movable image area; the image processing device is further used for adjusting the area position of the image area in the target image according to the movement operation when the movement operation of the user on the image area is detected.
In an optional implementation manner, the target image is a three-dimensional image, and when detecting a rotation operation of a user on the second object in the target image, the processor 1001 adjusts image content corresponding to the second object in the target image according to the rotation operation.
In an optional implementation manner, the communication interface 1002 controls the camera to shoot a shooting object to obtain an initial video, and obtains an image to be processed from the initial video; the processor 1001 inputs the image to be processed into an object prediction model for processing, and after obtaining a target image, is further configured to correspondingly adjust a motion state corresponding to the second object in the target image according to a motion state corresponding to the first object in each frame of image in the initial video, so as to obtain a first image set; and further configured to generate a special effects video from the initial video and the first set of images.
In an optional implementation manner, the processor 1001 generates a special effect video according to the initial video and the first image set, and is specifically configured to combine each image in the first image set with a corresponding image in the initial video, so as to generate a second image set; and is further configured to generate a special effect video from the second set of images.
In an optional implementation manner, the communication interface 1002 acquires an image to be processed, the processor 1001 controls the camera to acquire an image of a shot object, so as to obtain a preview image, and the preview image is displayed in the display interface; the image processing device is used for determining the preview image as an image to be processed; the processor 1001 displays the target image in the display interface.
In an alternative embodiment, the processor 1001 replaces the preview image with the target image in the display interface; or, displaying the preview image and the target image in the display interface, wherein optionally, the display positions of the preview image and the target image are different; or, the preview image and the target image are synthesized to obtain a synthesized image, and the synthesized image is displayed in the display interface, and optionally, a position corresponding to the first object in the synthesized image is adjacent to a position corresponding to the second object.
In an alternative embodiment, the communication interface 1002 obtains the sticker to be added; the processor 1001 adds the sticker to be added to the target image, and obtains a special effect image.
In an alternative embodiment, when the sticker to be added is a decorative sticker related to the five sense organs, the processor 1001 determines a target five sense organs in the second object that match the sticker to be added; and adds the sticker to be added to the position of the target five sense organs in the target image to obtain a special effect image.
In an alternative embodiment, the processor 1001 adds the sticker to be added to an initial position in the target image; when a moving operation of the user on the sticker to be added is detected, the adding position of the sticker to be added is adjusted according to the moving operation to obtain a special effect image.
In an alternative embodiment, the processor 1001 displays a sticker selection page including at least one sticker option; when detecting a selection operation for the sticker selection page, the processor 1001 takes the sticker selected by the selection operation as a sticker to be added.
In an optional implementation manner, the communication interface 1002 acquires a characteristic attribute of a person corresponding to the second object; the processor 1001 selects at least one type of sticker matched with the characteristic attribute from a sticker database according to the characteristic attribute; and displaying a sticker selection page according to the selected at least one type of stickers matched with the characteristic attributes.
In an alternative embodiment, the processor 1001 performs beautification processing on the target image to obtain an beautified target image; optionally, the beautification treatment comprises a beauty treatment and/or a filter treatment.
Based on the same inventive concept, the principle and the beneficial effect of the problem solving of the terminal device provided in the embodiment of the present application are similar to the principle and the beneficial effect of the problem solving of the image processing method in the embodiment of the present application, and for brevity, the principle and the beneficial effect of the implementation of the method can be referred to, and are not described herein again.
The present application further provides an apparatus, comprising: a memory, a processor, a computer program stored on the memory, which computer program, when executed by the processor, implements the steps of the method as described above.
An embodiment of the present application further provides a chip, which includes a memory and a processor, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a device in which the chip is installed executes the method described in the above various possible embodiments.
The embodiment of the present application further provides a computer-readable storage medium, where one or more instructions are stored in the computer-readable storage medium, and the one or more instructions are adapted to be loaded by a processor and execute the image processing method according to the above method embodiment.
Embodiments of the present application further provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the image processing method described in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned embodiments of the method are described as a series of acts or combinations, but those skilled in the art should understand that the present application is not limited by the order of acts described, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs.
The modules in the device can be merged, divided and deleted according to actual needs.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, a controlled terminal, or a network device) to execute the method of each embodiment of the present application.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. The embodiments of the present application are intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof.

Claims (19)

1. An image processing method, characterized in that the method comprises:
acquiring an image to be processed, wherein the image to be processed comprises a first object; wherein the image to be processed comprises at least one image;
inputting the image to be processed into an object prediction model for processing to obtain a target image, wherein the target image comprises a second object matched with the first object, and the method comprises the following steps: acquiring a first attribute label, inputting the image to be processed and the first attribute label into an object prediction model for processing to obtain a target image, wherein a first attribute feature of the second object is matched with the first attribute label, and the first attribute label is a generation class attribute label playing a critical role in generation of the target image;
the acquiring of the image to be processed includes:
controlling a camera to shoot a shot object to obtain an initial video, and acquiring an image to be processed from the initial video;
after the image to be processed is input into the object prediction model for processing to obtain the target image, the method further includes:
correspondingly adjusting the motion state corresponding to the second object in the target image according to the motion state corresponding to the first object in each frame of image in the initial video to obtain a first image set;
generating a special effect video from the initial video and the first set of images, comprising: and respectively synthesizing each image in the first image set with a corresponding image in the initial video to generate a second image set, and generating a special effect video according to the second image set.
2. The method of claim 1, wherein obtaining the first attribute tag comprises:
displaying a tag selection page, wherein the tag selection page comprises at least one attribute tag option;
and when the selection operation aiming at the tag selection page is detected, taking the attribute tag selected by the selection operation as a first attribute tag.
3. The method according to claim 1, wherein the image to be processed comprises a first image and a second image, and the person corresponding to the object in the first image and the person corresponding to the object in the second image have a relationship;
the inputting the image to be processed into an object prediction model for processing to obtain a target image comprises:
inputting the first image and the second image into an object prediction model for processing to obtain a target image; the person corresponding to the second object in the target image is a predicted person having a relationship with the persons corresponding to the objects in the first image and the second image.
4. The method according to claim 1, wherein the inputting the image to be processed into an object prediction model for processing to obtain a target image comprises:
and inputting the image to be processed into an object prediction model for processing to obtain a target image comprising a third image and a fourth image.
5. The method according to any one of claims 1 to 4, wherein after the image to be processed is input into an object prediction model for processing, and a target image is obtained, the method further comprises:
acquiring a second attribute label;
determining a second attribute feature of the second object according to the second attribute tag;
and processing the target image according to the second attribute characteristics to obtain a processed target image.
6. The method according to any one of claims 1 to 4, wherein the object prediction model comprises a prediction network and a decision network, and the inputting the image to be processed into the object prediction model for processing to obtain the target image comprises:
inputting the image to be processed into the prediction network for processing to obtain an image to be output;
inputting the image to be output into the judgment network for processing to obtain a judgment result;
and when the judgment result indicates that the similarity between the image to be processed and the image to be output is greater than or equal to a similarity threshold, taking the image to be output as a target image.
7. The method of any of claims 1 to 4, wherein the object prediction model is trained from image sample data, the method further comprising:
acquiring image sample data, wherein the image sample data comprises a training sample image and a result sample image; and training an initial object prediction model according to the training sample image and the result sample image to obtain a trained object prediction model.
8. The method according to any one of claims 1 to 4, wherein after the image to be processed is input into an object prediction model for processing, and a target image is obtained, the method further comprises:
determining an image area corresponding to the second object in the target image, and setting the image area as a movable image area;
when the movement operation of the user for the image area is detected, the area position of the image area in the target image is adjusted according to the movement operation.
9. The method of claim 8, wherein the target image is a three-dimensional image, the method further comprising:
when the rotation operation of the user on the second object in the target image is detected, adjusting the image content corresponding to the second object in the target image according to the rotation operation.
10. The method according to any one of claims 1 to 4, wherein the acquiring an image to be processed comprises:
controlling a camera to acquire images of a shot object to obtain a preview image, and displaying the preview image in a display interface;
determining the preview image as an image to be processed;
after the image to be processed is input into the object prediction model for processing to obtain the target image, the method further includes:
and displaying the target image in the display interface.
11. The method of claim 10, wherein displaying the target image in a display interface comprises:
replacing the preview image with the target image in the display interface;
or, displaying the preview image and the target image in the display interface;
or, the preview image and the target image are synthesized to obtain a synthesized image, and the synthesized image is displayed in the display interface.
12. The method according to any one of claims 1 to 4, wherein after the image to be processed is input into an object prediction model for processing, and a target image is obtained, the method further comprises:
acquiring a sticker to be added;
and adding the sticker to be added into the target image to obtain a special effect image.
13. The method of claim 12, wherein the adding the sticker to be added to the target image to obtain a special effect image comprises:
when the sticker to be added is a decorative sticker related to the five sense organs, determining a target five sense organs matched with the sticker to be added in the second object;
and adding the sticker to be added to the position of the target five sense organs in the target image to obtain a special effect image.
14. The method according to claim 12, wherein the adding the sticker to be added to the target image to obtain a special effect image comprises:
adding the sticker to be added to an initial position in the target image;
when the moving operation of the user on the sticker to be added is detected, the adding position of the sticker to be added is adjusted according to the moving operation, and a special effect image is obtained.
15. The method of claim 12, wherein the obtaining the sticker to be added comprises:
displaying a sticker selection page, the sticker selection page including at least one sticker option;
and when the selection operation aiming at the sticker selection page is detected, using the sticker selected by the selection operation as the sticker to be added.
16. The method of claim 15, wherein displaying the sticker selection page comprises:
acquiring the characteristic attribute of the person corresponding to the second object;
selecting at least one type of paster matched with the characteristic attribute from a paster database according to the characteristic attribute;
and displaying a sticker selection page according to the selected at least one type of stickers matched with the characteristic attributes.
17. The method according to any one of claims 1 to 4, wherein after the image to be processed is input into an object prediction model for processing, and a target image is obtained, the method further comprises:
and beautifying the target image to obtain a beautified target image.
18. A computer device comprising a processor, a memory, wherein the memory is configured to store a computer program comprising program instructions, and wherein the processor is configured to invoke the program instructions to perform the method of any of claims 1 to 17.
19. A computer-readable storage medium, comprising: the computer-readable storage medium stores one or more instructions adapted to be loaded by a processor and to perform the method of any of claims 1-17.
CN202010970450.9A 2020-09-15 2020-09-15 Image processing method, apparatus and storage medium Active CN112437226B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010970450.9A CN112437226B (en) 2020-09-15 2020-09-15 Image processing method, apparatus and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010970450.9A CN112437226B (en) 2020-09-15 2020-09-15 Image processing method, apparatus and storage medium

Publications (2)

Publication Number Publication Date
CN112437226A CN112437226A (en) 2021-03-02
CN112437226B true CN112437226B (en) 2022-09-16

Family

ID=74690010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010970450.9A Active CN112437226B (en) 2020-09-15 2020-09-15 Image processing method, apparatus and storage medium

Country Status (1)

Country Link
CN (1) CN112437226B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114143454B (en) * 2021-11-19 2023-11-03 维沃移动通信有限公司 Shooting method, shooting device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN112437226A (en) 2021-03-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant