CN116012270A - Image processing method and device

Image processing method and device

Info

Publication number
CN116012270A
CN116012270A (application CN202111222630.XA)
Authority
CN
China
Prior art keywords
image
feature points
type
pixel values
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111222630.XA
Other languages
Chinese (zh)
Inventor
金卓群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111222630.XA priority Critical patent/CN116012270A/en
Publication of CN116012270A publication Critical patent/CN116012270A/en
Pending legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The application provides an image processing method and device. The method comprises the following steps: acquiring a first image; acquiring first type feature points in the first image; acquiring a second image; transforming the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image; fusing the pixel values of the first type of feature points and the pixel values of the corresponding feature points in the second image to obtain a third image; and outputting the third image. The method and device enable the fusion of pixels in a local area of the first image with pixels in the corresponding area of the second image, improving user experience. The embodiments of the application can be applied to artificial intelligence, for example to face deformation and face fusion scenarios in the field of computer vision.

Description

Image processing method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for image processing.
Background
Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain optimal results. Computer Vision (CV) is an important branch of artificial intelligence. It mainly uses cameras and computers, in place of human eyes, to recognize, track and measure targets, and further performs graphic processing so that the processed images are more suitable for human observation or for transmission to instruments for detection.
Image fusion is an important application of computer vision. In the scheme provided by the related art, at least two images of the same size are integrated into one image by assigning different transparencies to each image as a whole. However, images fused by this existing scheme tend to have a poor fusion effect.
Disclosure of Invention
The embodiments of the application provide an image processing method and device, which can fuse pixels in a local area of a first image with pixels in the corresponding area of a second image, improving user experience.
In a first aspect, embodiments of the present application provide a method for image processing, including:
acquiring a first image;
acquiring first type feature points in the first image;
acquiring a second image;
transforming the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image;
fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a third image;
outputting the third image.
In a second aspect, embodiments of the present application provide an apparatus for image processing, including:
an acquisition unit, configured to acquire a first image;
the acquisition unit is further configured to acquire first type feature points in the first image;
the acquisition unit is further configured to acquire a second image;
a transformation unit, configured to transform the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image;
a fusion unit, configured to fuse the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a third image; and
an output unit, configured to output the third image.
In a third aspect, an electronic device is provided, comprising: a processor and a memory; the memory is used for storing a computer program; the processor is configured to execute the computer program to implement the method described above.
In a fourth aspect, a chip is provided, comprising: a processor for calling and running a computer program from the memory, causing the device on which the chip is mounted to perform the method as described above.
In a fifth aspect, there is provided a computer readable storage medium comprising computer instructions which, when executed by a computer, cause the computer to implement a method as previously described.
In a sixth aspect, there is provided a computer program product comprising computer program instructions which, when run on a computer, cause the computer to perform the method as described above.
According to the embodiments of the application, the feature points of a local area (or a specific area), for example the first type of feature points, can be obtained by classifying the feature points of the first image. The first type of feature points are then transformed into the corresponding feature points in the second image, and the third image is obtained by fusing the pixel values of the first type of feature points with the pixel values of the corresponding feature points in the second image. In this way, the pixels of the local area (or specific area) in the first image can be fused with the pixels of the corresponding area of the second image, improving user experience.
Furthermore, the embodiments of the application can adaptively adjust, based on the positions of different feature points within the local area, the fusion ratios of feature points at different positions in the composite image. This facilitates a smooth transition at the edges of the local area during image fusion, makes the fused result more realistic, and improves user experience.
Drawings
Fig. 1 is a schematic architecture diagram of a network system to which the scheme provided in the embodiments of the present application is applied;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a method of image processing provided by an embodiment of the present application;
FIG. 4 is an example of a display interface to which the method of image processing provided by the embodiments of the present application is applied;
FIG. 5A is an example of extracted face features provided by an embodiment of the present application;
fig. 5B is an example of feature points after outliers are deleted from the face feature points in fig. 5A;
fig. 6 is an example of feature points after region division of the feature points of the facial organ in fig. 5B;
fig. 7 is an example of feature points of the mouth region after region division of fig. 5B;
FIG. 8 is a schematic diagram of triangulating a face according to the Delaunay triangulation method provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of affine transformation of triangular regions provided by embodiments of the present application;
FIG. 10 is another example of a display interface to which the method of image processing provided by the embodiments of the present application is applied;
FIG. 11 is an alternative schematic block diagram of an apparatus for image processing of embodiments of the present application;
Fig. 12 is another alternative schematic block diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. It is evident that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without undue burden fall within the protection scope of the present application.
Artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines can perceive, reason and make decisions. It is a comprehensive discipline involving a wide range of fields, covering both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, intelligent transportation, and other directions.
As a scientific discipline, computer vision studies the theory and technology needed to build artificial intelligence systems capable of acquiring information from images or multidimensional data. Computer vision technologies typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, autonomous driving, intelligent transportation, etc., as well as common biometric technologies such as face recognition and fingerprint recognition.
With the research and progress of artificial intelligence technology, it has been applied in various fields such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, the internet of vehicles and intelligent transportation. With the development of technology, artificial intelligence will find application in more fields and deliver increasingly important value.
The scheme provided by the embodiment of the application relates to technologies such as computer vision of artificial intelligence, and particularly provides an image processing method, which comprises the following steps: acquiring a first image, a second image and a first type of feature point in the first image, converting the first type of feature point in the first image into a feature point corresponding to the first type of feature point in the second image, fusing the pixel value of the first type of feature point with the pixel value of the feature point corresponding to the first type of feature point in the second image, acquiring a third image, and outputting the third image.
In the embodiment of the present application, the first image is an image to be processed, which may also be referred to as an original image, that is, an image that is not processed by the image processing method in the embodiment of the present application; the second image comprises an image to be fused into the first image, which may also be referred to as a target feature map; the third image may also be referred to as a composite image, that is, an image obtained by processing the original image by the image processing method of the embodiment of the present application. The image processing may refer to fusing the first type of feature points in the first image with feature points corresponding to the second image, so as to generate a third image. For example, the first type of feature points in the first image can be subjected to feature transformation through a transformation matrix to obtain corresponding feature points in the second image, and the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the second image are fused according to the fusion ratio of the pixel values of the first type of feature points in the synthesized image (for example, the third image) to obtain the third image.
According to the embodiment of the application, the characteristic points of the local area (or the specific area) can be obtained by classifying the characteristic points of the first image, for example, the first type characteristic points are further converted into the characteristic points corresponding to the first type characteristic points in the second image, and then the third image is obtained by fusing the pixel values of the first type characteristic points and the pixel values of the characteristic points corresponding to the first type characteristic points in the second image, so that the fusion of the pixels in the local area (or the specific area) in the first image and the pixels in the area corresponding to the second image can be realized, and the user experience is improved.
The image processing method provided by the embodiments of the application can be used in scenarios such as computer vision and face special effects (for example, face fusion and face deformation). Face fusion refers to fusing the features of two face images according to a certain fusion ratio. Face deformation refers to applying a feature transformation to the face feature points so as to transform the original face features into target face features. For example, when the embodiments of the application are applied to a face special-effect scenario, face fusion or face deformation can be performed on local organs of the face (that is, local areas in a face image). Specifically, after the organ to be fused (or deformed) is specified (such as the mouth, eyes or nose), image fusion processing can be performed on that specific organ of the two face images.
As a specific example, in photo-editing software or in a beauty camera, the image processing method provided by the embodiments of the application may be applied to deform a local organ of a face; in video editing, live streaming or real-time video sessions, the method may be applied to transform local organs of the face in real time. In addition, the method may also be applied to online education: for example, based on a picture used in an online education scenario, a group of mouth-shape pictures of the face can be obtained, and a mouth-shape animation of the user's face can then be generated to facilitate the learning of mouth-shape pronunciation.
It should be noted that, in the embodiment of the present application, the method for processing an image is described by taking a face special effect as an example, but the embodiment of the present application is not limited thereto, and for example, the image processing method provided in the embodiment of the present application may also be used to process images of other objects, for example, to fuse or deform organs of an animal, or to fuse images of two other different objects, or the like.
The following is a few simple descriptions of application scenarios applicable to the technical solution of the embodiment of the present application with reference to fig. 1 and fig. 2. It should be noted that the application scenarios described below are only for illustrating the embodiments of the present application and are not limiting. In specific implementation, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.
Fig. 1 is a schematic diagram of an alternative architecture of a network system 100 to which the scheme provided by the embodiments of the present application is applied. As shown in fig. 1, terminal device 130 is connected to server 110 via network 120, which may be connected to database 140. The network 120 may be a wide area network or a local area network, or a combination of the two, which is not limited.
In some embodiments, the method for image processing provided in the embodiments of the present application may be implemented by the terminal device 130. For example, after acquiring the first image and the second image, the terminal device 130 may obtain the first type of feature points in the first image, for example by extracting the feature points in the first image and classifying them to obtain the first type of feature points. The terminal device 130 may then transform the first type of feature points in the first image into the feature points corresponding to the first type of feature points in the second image, and fuse the pixel values of the first type of feature points with the pixel values of the corresponding feature points in the second image to obtain a third image. For example, the terminal device 130 may determine the fusion ratio of the pixel values of the first type of feature points in the composite image, determine a transformation matrix between the first type of feature points and the corresponding feature points of the second image, transform the first type of feature points in the first image into the corresponding feature points in the second image according to the transformation matrix, fuse the pixel values of the first type of feature points with the pixel values of the corresponding feature points in the second image according to the fusion ratio to obtain the third image, and output the third image. For example, the terminal device 130 may display the third image. The second image may be stored locally in the terminal device 130 in advance, or may be obtained from the database 140 by the terminal device 130 through the network 120 and the server 110, which is not limited in this application.
In some embodiments, the method for image processing provided in the embodiments of the present application may be implemented cooperatively by the server 110 and the terminal device 130. For example, when receiving the first image sent by the terminal device 130 and processing it, the server 110 may obtain the first type of feature points in the first image, for example by extracting the feature points of the first image and classifying them to obtain the first type of feature points. The server 110 may then transform the first type of feature points in the first image into the feature points corresponding to the first type of feature points in the second image, and fuse the pixel values of the first type of feature points with the pixel values of the corresponding feature points in the second image to obtain a third image. For example, the server 110 may determine the fusion ratio of the pixel values of the first type of feature points in the composite image, determine a transformation matrix between the first type of feature points and the corresponding feature points of the second image, transform the first type of feature points in the first image into the corresponding feature points in the second image according to the transformation matrix, and fuse the pixel values of the first type of feature points with the pixel values of the corresponding feature points in the second image according to the fusion ratio to obtain the third image. For example, the server 110 may transmit the third image to the terminal device 130 through the network 120, and the terminal device 130 may display the third image. The second image may be stored locally in the server 110 in advance, or may be obtained from the database 140 by the server 110, which is not limited in this application.
In some alternative embodiments, the server 110 may determine the above fusion ratio and/or the transformation matrix, and send the determined fusion ratio and/or transformation matrix to the terminal device 130, where the terminal device 130 processes the first type of feature points in the first image according to the fusion ratio and the transformation matrix, so as to obtain the third image, thereby helping to reduce resource consumption at the terminal device side and reduce the computation cost of the terminal device.
It should be noted that, the storage location of the second image is not limited in this embodiment, and may be, for example, a location of a distributed file system or a blockchain in the server 110, in addition to the database 140.
In some embodiments, the server 110 may be an independent physical server, or may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, a cloud database, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and basic cloud computing services such as big data and an artificial intelligence platform, where the cloud services may be image processing services, and are called by the terminal device 130 to process an image sent by the terminal device 130, and finally send the processed image to the terminal device 130. The terminal device 130 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted device, etc. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
Fig. 2 is a schematic structural diagram of an electronic device 200 according to an embodiment of the present application. The electronic device 200 is, for example, a terminal device or a server in fig. 1. It will be appreciated that portions of the structure of the electronic device 200 may be default, such as the user input module 230, the display panel in the output module 240, or the audio video input module 260 when the electronic device 200 is a server. As shown in fig. 2, the electronic device 200 includes a communication module 210, a sensor 220, a user input module 230, an output module 240, a processor 250, an audio video input module 260, a memory 270, and a power supply 280.
The communication module 210 may include at least one module that enables communication between the electronic device and other electronic devices. For example, the communication module 210 may include one or more of a wired network interface, a broadcast receiving module, a mobile communication module, a wireless internet module, a local area communication module, and a location (or position) information module, etc. The various modules are implemented in various ways in the prior art and are not described in detail herein.
The sensor 220 may sense a current state of the electronic device, such as an open/closed state, a position, whether there is contact with a user, a direction, and acceleration/deceleration, and the sensor 220 may generate a sensing signal for controlling an operation of the system.
The user input module 230 is used for receiving input digital information, character information or touch operation/non-touch gestures, receiving signal input related to user settings of the system and function control, and the like. The user input module 230 includes a touch panel and/or other input device such as a keyboard, mouse, touch screen display, or other input buttons and controls.
The output module 240 includes a display panel for displaying information input by a user, information provided to the user, various menu interfaces of the system, or the like. Alternatively, the display panel may be configured in the form of a liquid crystal display (liquid crystal display, LCD) or an organic light-emitting diode (OLED), or the like. In other embodiments, the touch panel may be overlaid on the display panel to form a touch display. In addition, the output module 240 may further include an audio output module, an alarm, a haptic module, and the like. For example, in the embodiment of the present application, the display panel may display the first image, the second image, the third image, or the like, which is not limited.
The audio/video input module 260 is used for inputting audio signals or video signals. The audio video input module 260 may include a camera and a microphone. For example, in the embodiment of the present application, the camera may acquire the first image to be processed.
The power supply 280 may receive external power and internal power under the control of the processor 250 and provide power required for the operation of the various components of the system.
Processor 250 may represent one or more processors. For example, processor 250 may include one or more central processing units, or one central processing unit and one graphics processing unit, or one application processor and one coprocessor (e.g., a micro-control unit). When the processor 250 includes a plurality of processors, the plurality of processors may be integrated on the same chip or may be separate chips. A processor may include one or more physical cores, where a physical core is the smallest processing unit.
The memory 270 stores computer programs including an operating system program 272, an application program 271, and the like. Typical operating systems include Windows from Microsoft Corporation and macOS from Apple Inc. for desktop or notebook systems, and the Android system developed by Google Inc. for mobile terminals. The method provided in the embodiment of the present application may be implemented by means of software and may be considered as a specific implementation of the application 271.
Memory 270 may be one or more of the following types: flash memory, hard disk type memory, micro multimedia card memory, card memory (e.g., SD or XD memory), random access memory (RAM), static RAM (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, or optical disk. In other embodiments, memory 270 may also be a network storage device on the Internet, and the system may update or read memory 270 over the Internet.
The processor 250 is configured to read the computer program in the memory 270 and then execute the method defined by the computer program, e.g. the processor 250 reads the operating system program 272 to run the operating system and implement various functions of the operating system on the system, or reads one or more application programs 271 to run applications on the electronic device 200.
The memory 270 also stores other data 273 than a computer program, such as a first image, a second image, a third image, feature points of images, fusion ratios, a transformation matrix, and the like, which are referred to in the present application.
The connection relationship between each module in fig. 2 is only an example, and the method provided in any embodiment of the present application may also be applied to electronic devices with other connection manners, for example, all modules are connected through a bus.
The methods provided by the embodiments of the present application may also be implemented in hardware. For example, the apparatus implementing the methods may be a processor in the form of a hardware decoding processor programmed to perform the image processing methods provided by the embodiments of the present application; such a processor may employ one or more application-specific integrated circuits (ASIC), DSPs, programmable logic devices (PLD), complex programmable logic devices (CPLD), field-programmable gate arrays (FPGA), or other electronic components.
The method for image processing provided in the embodiments of the present application is described in detail below with reference to the accompanying drawings.
Fig. 3 shows a schematic flow chart of a method 300 of image processing provided by an embodiment of the present application. The method 300 may be performed by the terminal device 130 in fig. 1, or performed by the terminal device 130 and the server 110 cooperatively, without limitation. As shown in fig. 3, the method 300 includes steps 310 through 360.
310, acquiring a first image.
The first image may be, for example, a photograph taken by a camera of the terminal device, or a frame of image in a taken video, or an image acquired by the terminal device from a storage unit thereof, or from other electronic devices, without limitation. In some embodiments, the terminal device may obtain the first image through a camera. In other embodiments, the terminal device may send the first image acquired by the camera to other devices, for example, a server, and the corresponding server acquires the first image.
When the terminal device obtains the first image through the camera, the first image may include an image captured by the camera when the camera is opened to perform shooting, or include an image captured by the camera after the user clicks to shoot or an image in a video, which is not limited in this application.
When the embodiment of the application is applied to scenes such as face special effects (such as face fusion and face deformation), the first image comprises a face image.
Fig. 4 shows a schematic diagram of an interface to which the method of image processing of the embodiment of the present application is applied. As shown in fig. 4, the original image (an example of the first image) may be uploaded in module 401 of the interface. Optionally, the original image may be a photo taken in real time or a photo selected from the local album, without limitation. Module 401 in the (a) diagram of fig. 4 shows the interface before the original image is uploaded, and module 401 in the (b) diagram shows the interface after the original image is uploaded.
320, acquiring first type feature points in the first image.
For example, after the first image is acquired, feature points in the first image may be extracted, and the feature points of the first image may be classified to acquire the first type of feature points. As a possible implementation, a deep network capable of extracting features may be obtained based on a deep learning method, for example through extensive feature training; alternatively, feature extraction may be performed with an existing pre-trained deep feature extractor.
For scenes such as face special effects (such as face fusion and face deformation), namely when the first image comprises a face image, feature points of the face image in the first image can be extracted, for example, feature points of organs of the face are included. Further, the feature points of the face image may be classified, and the feature points of the first type, for example, the feature points of an organ (such as a nose, a mouth, or eyes, etc.), may be obtained.
For example, the feature extraction of the image may be performed with a well-trained deep learning model, such as a face recognition model or a face key-point detector. For example, the face position may be detected using a histogram of oriented gradients (HOG), with the image transformed into HOG form. By way of example, such a model can be tested on the Labeled Faces in the Wild face dataset with an accuracy of up to 99.38%. Fig. 5A shows an example of face features extracted by a detector employing 68 feature points. Referring to fig. 5A, the pixel positions of 68 feature points on a face can be obtained.
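As an illustrative, non-limiting sketch of this feature-extraction step (the dlib library and the model file name are assumptions, not specified by the present application), the 68 feature points could be obtained as follows:

```python
# Illustrative sketch: extracting 68 facial landmarks with an HOG-based
# face detector. The dlib library and the model file name are assumptions.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()                 # HOG-based face detector
predictor = dlib.shape_predictor(
    "shape_predictor_68_face_landmarks.dat")                 # assumed model file

def extract_landmarks(image_path):
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)                                 # detected face rectangles
    landmarks = []
    for face in faces:
        shape = predictor(gray, face)
        # Collect the (x, y) pixel position of each of the 68 feature points.
        landmarks.append([(shape.part(i).x, shape.part(i).y) for i in range(68)])
    return landmarks
```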
In other possible implementations, the face feature recognition may be performed by using wavelet transformation and a principal component analysis method to fuse features (such as all features) in the image, which is not limited in this application.
After the feature points in the first image are obtained, the feature points of the first image can be classified to obtain the first type of feature points. Illustratively, the result of feature extraction of the first image is a set of points scattered in two-dimensional space. In the embodiment of the present application, a local area (or specific area) of the first image needs to be fused, and the region corresponding to each feature point needs to be known, so these scattered point sets are classified. Classification may also be referred to as clustering, the aim of which is to group similar samples together. By way of example, clustering (classification) algorithms include, without limitation, partitioning methods (e.g., the k-means algorithm, the k-medoids algorithm, the CLARANS algorithm, etc.), hierarchical methods (e.g., the BIRCH, CURE and CHAMELEON algorithms), density-based methods (e.g., the DBSCAN, OPTICS and DENCLUE algorithms), grid-based methods (the STING, CLIQUE and WaveCluster algorithms), model-based methods, and so on. Clustering algorithms are commonly used for data analysis and can help classify samples; they can likewise be used to classify feature points.
In some embodiments, since the feature points of the first image form a small, two-dimensional scattered point set with a single sample attribute, the most commonly used partitioning algorithm, the k-means clustering algorithm, may be adopted. For example, k feature points may be randomly selected as initial centers, each sample is assigned to the class of the center it is nearest to, the mean of each class is then taken as the new center, and the process is iterated until the classification converges. Here, the value of k needs to be determined in advance. It should be noted that the clustering result is sensitive to the initial values, noise and outliers, and the final result may only be locally optimal.
It should be noted that, the k-means clustering algorithm is taken as an example in the embodiment of the present application, but the embodiment of the present application is not limited thereto, and may also be classified by using other algorithms as described above, for example.
In some possible implementations, sparse points may be screened for the point set first, and outliers in some non-specific regions may be eliminated. For example, for a scene where the first image comprises a face image, outliers of some non-face organ areas may be eliminated. In some alternative embodiments, the facial organ feature points include at least one of eye region feature points, nose region feature points, and mouth region feature points. Accordingly, the facial organ feature points are classified, and the obtained first type of feature points may include at least one of an eye region feature point, a nose region feature point, and a mouth region feature point.
Fig. 5B is an example of the feature points after outliers are deleted from the face feature points in fig. 5A. As shown in fig. 5B, the classification targets are the facial organs, comprising 2 eye regions (a left eye region and a right eye region), 1 nose region and 1 mouth region, so the value of k can be set to 4. Fig. 6 is an example of the feature points after region division of the facial-organ feature points in fig. 5B. As shown in fig. 6, the feature points are classified into four types, corresponding to the left eye, the right eye, the nose and the mouth, and the center point of each divided region is marked. It can be seen that the feature points are divided as expected. Fig. 7 is an example of the feature points of the mouth region after the region division of fig. 5B.
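A minimal sketch of this clustering step, assuming the landmark coordinates from the previous sketch and scikit-learn's KMeans implementation (the library choice and the mouth-selection heuristic are assumptions; the application only specifies k-means with k = 4):

```python
# Illustrative sketch: clustering facial feature points into k = 4 organ
# regions (left eye, right eye, nose, mouth) with the k-means algorithm.
import numpy as np
from sklearn.cluster import KMeans

def cluster_feature_points(points, k=4):
    """points: list of (x, y) feature-point coordinates (outliers already removed)."""
    pts = np.asarray(points, dtype=np.float64)
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pts)
    # labels[i] is the region index of point i; cluster_centers_ are the
    # center points of the divided regions.
    return kmeans.labels_, kmeans.cluster_centers_

# Usage (assumptions): pick the mouth cluster as the one whose center lies
# lowest in the image, and treat its points as the first type of feature points.
# labels, centers = cluster_feature_points(landmarks[0])
# mouth_label = int(np.argmax(centers[:, 1]))
# mouth_points = np.asarray(landmarks[0])[labels == mouth_label]
```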
In some alternative embodiments, the type of the first type of feature points may also be obtained, and the first type of feature points may be determined according to that type. For example, when the embodiments of the present application are applied to scenarios such as face special effects (e.g., face fusion and face deformation), the type of the first type of feature points may include at least one of a mouth region, a nose region, a left eye region, a right eye region, and the like. After the type of the first type of feature points is obtained, the feature points of the face can be clustered in the manner described above, and the type of feature points to be fused, for example the first type of feature points, can be selected.
With continued reference to FIG. 4, in block 402 of the interface, a region of conversion may be selected, such as a mouth region, or other region, without limitation.
330, acquiring a second image.
The second image may include, for example, an image that is to be fused into the first image, such as an entirety of the image that is to be fused into the first image, or an image that includes the image that is to be fused into the first image and other partial images. As an example, the second image may be a photograph taken by a camera of the terminal device, or a frame of image in a video taken, or an image selected by the terminal device from a target feature gallery. The target feature gallery may be stored locally at the terminal device or obtained from a server, for example, without limitation.
When the embodiment of the application is applied to scenes such as face special effects (such as face fusion and face deformation), the second image comprises images of partial organs of the face, such as partial images of mouth areas, eye areas or nose areas.
Continuing with the example of fig. 4, in block 403 in the interface, a target feature map may be uploaded, where the target feature map is an example of a second image in the embodiments of the present application, that is, an image that needs to be fused into the artwork. Optionally, when uploading the target feature map, uploading may be selected from the target feature map library, or uploading the photo from the local album, without limitation. Alternatively, when a photograph is selected, the photograph may be processed, such as by cropping. Alternatively, the number of the target feature maps may be one or more, which is not limited. The module 403 in the (a) diagram in fig. 4 includes a schematic diagram of the interface before uploading the target feature diagram, and the module 403 in the (b) diagram includes a schematic diagram of the interface after uploading the target feature diagram.
In the interface shown in fig. 4, once the original image has been uploaded, the area to be converted has been selected, and the target feature map has been selected, the user may click or touch the "start conversion" area 404, so that the device fuses the original image and the target feature map to obtain a third image. The process of acquiring the third image is described in steps 340 and 350 below.
340, transforming the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image.
In some alternative embodiments, first positions of three points among the first type of feature points may be acquired, and second positions of the three points in the second image corresponding to those three points may be acquired. A transformation matrix between the first positions and the second positions is then determined as the transformation matrix between the first type of feature points and the corresponding feature points of the second image. Finally, the first type of feature points in the first image are transformed into the feature points corresponding to the first type of feature points in the second image according to the transformation matrix.
Here, the transformation matrix may be referred to as an affine transformation matrix, and the corresponding feature points of the first type may be affine transformed into corresponding feature points in the second image.
As a possible implementation manner, the first type of feature points may be divided into at least one triangle area according to a triangulation image method, and then three vertex positions in the at least one triangle area are determined as the first positions.
As an example, the triangulation method may be the Delaunay triangulation method: for a given set of plane points, a triangulation can be determined that maximizes the minimum interior angle of the resulting triangles, so that no divided triangle has an excessively small interior angle. The embodiment of the application can divide the image (such as each facial organ area) into a plurality of regions based on the feature points. Fig. 8 shows a schematic diagram of triangulating a face according to the Delaunay triangulation method. Optionally, when the feature points are few, for example when only key feature points are available, triangle areas can be subdivided by interpolating in the vicinity of the feature points for the subsequent affine transformation.
For example, the first type of feature points in the first image may be divided into at least one triangle area #1, and the feature points corresponding to the first type of feature points in the second image may be divided into at least one triangle area #2, by the Delaunay triangulation method. Optionally, each pair of corresponding small triangle areas may be aligned one by one and then affine transformed. Refer to fig. 9, in which (a) illustrates an example of triangle area #1 and (b) illustrates an example of triangle area #2. Triangle area #1 corresponds to triangle area #2, and the three vertices 1, 2, 3 in triangle area #1 correspond to the three vertices 1, 2, 3 in triangle area #2, respectively. In this way, affine transformation can be performed between the three vertices of triangle area #1 and the three vertices of the corresponding triangle area #2, and the transformation matrix between them can be obtained. Accordingly, points other than the three vertices within triangle area #1 can be converted into triangle area #2 by applying the transformation matrix.
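A hedged sketch of such a triangulation using OpenCV's Subdiv2D, which performs Delaunay triangulation by point-by-point insertion; whether this exact API is used by the application is an assumption:

```python
# Illustrative sketch: Delaunay triangulation of the feature points of one
# region using OpenCV's Subdiv2D (point-by-point insertion).
import cv2
import numpy as np

def delaunay_triangles(points, image_shape):
    """points: (N, 2) array of feature-point coordinates in one region."""
    h, w = image_shape[:2]
    subdiv = cv2.Subdiv2D((0, 0, w, h))
    for x, y in points:
        subdiv.insert((float(x), float(y)))
    triangles = []
    for x1, y1, x2, y2, x3, y3 in subdiv.getTriangleList():
        tri = np.float32([[x1, y1], [x2, y2], [x3, y3]])
        # Discard triangles whose vertices fall outside the image bounds.
        if np.all((tri[:, 0] >= 0) & (tri[:, 0] < w) &
                  (tri[:, 1] >= 0) & (tri[:, 1] < h)):
            triangles.append(tri)
    return triangles
```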
As one possible implementation, the transformation matrix may be obtained using cv2.getAffineTransform(srcTri, dstTri) in OpenCV, where "srcTri" represents the positions of three points in the original image (e.g., the first image) and "dstTri" represents the positions of the corresponding three points in the target image (e.g., the second image). The essence of affine transformation is to obtain a transformation matrix from key points (for example, the three vertices of the triangle area) and then apply the matrix, by matrix multiplication, to the other points in the region (for example, the triangle area) to obtain the positions of the transformed pixel points. Specifically, the affine transformation may be performed according to the following formula (1):
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \tag{1}$$

where (x, y) represents a pixel point in the original image, (x', y') represents the corresponding pixel point in the target image, and the 2×3 matrix is the transformation matrix.
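A minimal sketch of warping one triangular region according to formula (1), assuming OpenCV's cv2.getAffineTransform and cv2.warpAffine are used (the helper name and the border handling are assumptions):

```python
# Illustrative sketch: affine-transforming one triangular region of the
# first image toward the corresponding triangle of the second image.
import cv2
import numpy as np

def warp_triangle(src_img, src_tri, dst_tri, dst_size):
    """src_tri, dst_tri: 3x2 arrays of corresponding vertices; dst_size: (width, height)."""
    # Transformation matrix between the first positions and the second positions.
    matrix = cv2.getAffineTransform(np.float32(src_tri), np.float32(dst_tri))
    # Apply the matrix to the pixels of the source image (formula (1)).
    warped = cv2.warpAffine(src_img, matrix, dst_size,
                            flags=cv2.INTER_LINEAR,
                            borderMode=cv2.BORDER_REFLECT_101)
    # Keep only the pixels that fall inside the destination triangle.
    mask = np.zeros(warped.shape[:2], dtype=np.uint8)
    cv2.fillConvexPoly(mask, np.int32(dst_tri), 1)
    return warped, mask
```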
It should be noted that, in the embodiment of the present application, the triangulation method is described taking the Delaunay triangulation method as an example; the triangulation in OpenCV is based on the Bowyer-Watson algorithm, which uses point-by-point insertion. Optionally, the first image or the second image may also be triangulated based on other algorithms such as the edge-flipping algorithm or the divide-and-conquer algorithm, which is not limited in the embodiments of the present application.
Therefore, in the embodiment of the present application, the local area in the first image can be transformed into the corresponding area in the second image by triangulating the local area of the image and transforming the pixels of the triangulated area.
350, fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a third image.
In some optional embodiments, a fusion ratio of the pixel values of the first type of feature points in the composite image may be determined, and then, according to the fusion ratio, the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image are fused to obtain the third image.
Specifically, the fusion ratio may refer to the proportion that the pixel value of a first type feature point contributes to the pixel value of the composite image (i.e., the fused image), and may be denoted as α, for example. Accordingly, when two images (for example, the first image and the second image) are fused into one image, the fusion ratio of the pixel values of the feature points of the corresponding region of the second image in the composite image is (1 - α). In some embodiments, the fusion ratio may also be referred to as transparency, without limitation.
In some alternative embodiments, the fusion ratio of the pixel values of the first type of feature points in the composite image may be determined according to the positions of the first type of feature points. In other words, different fusion ratios may be employed for feature points at different positions, e.g., feature points in different regions. That is, in the embodiment of the present application, the fusion ratio can be adaptively adjusted according to the position of the feature point. For example, different fusion ratios may be employed for the feature points of the 4 regions in fig. 6.
As a possible implementation manner, the fusion ratio of the pixel value of the feature point in the composite image may be determined based on the distance between the feature point and the center point of the region where the feature point is located. For example, a center point of the first type of feature point may be obtained, a distance between the center point and a feature point in the first type of feature point may be determined, and then a fusion ratio of pixel values of the feature point in the first type of feature point in the composite image may be determined according to the magnitude of the distance.
For example, a larger blending ratio may be set for feature points closer to the center point P (e.g., D (x, y) is smaller than a preset threshold), and a smaller blending ratio may be set for feature points farther from the center point P (e.g., D (x, y) is larger than a preset threshold).
As a specific implementation, the distance between a feature point i and the center point p can be determined from the coordinates of the feature point i among the first type of feature points and the coordinates of the center point p of the first type of feature points. The fusion ratio α_i of the pixel value of the feature point i in the composite image can then be determined from the distance between the feature point i and the center point p and the maximum distance between the first type of feature points and the center point p.
As a specific example, the fusion ratio α_i of the pixel value of the feature point i among the first type of feature points in the composite image may be determined according to the following formula (2):

$$\alpha_i = 1 - \frac{D_i(x, y)}{D_{\max}} \tag{2}$$

where

$$D_i(x, y) = \sqrt{(x - x_p)^2 + (y - y_p)^2}, \qquad D_{\max} = \max_i D_i(x, y);$$

(x, y) represents the coordinates of the feature point i, (x_p, y_p) represents the coordinates of the center point p, D_i(x, y) represents the distance between the feature point i and the center point p, and D_max represents the maximum distance between the first type of feature points and the center point p.
Therefore, according to the embodiment of the application, the fusion ratio of the pixel value of the feature point in the synthesized image is determined according to the distances between different feature points in the local area and the center point of the area, so that the fusion ratio of the feature points at different positions in different areas can be adjusted in a self-adaptive mode.
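A minimal sketch of this adaptive fusion ratio, assuming the linear falloff of formula (2), i.e. α_i = 1 - D_i/D_max, so that points near the region center receive a larger ratio and edge points a smaller one:

```python
# Illustrative sketch: per-feature-point fusion ratio, assuming the linear
# falloff of formula (2): alpha_i = 1 - D_i / D_max.
import numpy as np

def fusion_ratios(points, center):
    """points: (N, 2) feature-point coordinates; center: (x_p, y_p) of the region."""
    pts = np.asarray(points, dtype=np.float64)
    d = np.linalg.norm(pts - np.asarray(center, dtype=np.float64), axis=1)
    d_max = d.max()
    if d_max == 0:
        return np.ones(len(pts))
    return 1.0 - d / d_max        # alpha_i in [0, 1], largest at the center
```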
In some optional embodiments, the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image may be fused to obtain a target image, and then the target image and a portion of the first image except for the image corresponding to the first type of feature points are overlapped to obtain the third image.
When the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image are fused, the fusion ratio of the pixel values of the first type of feature points is alpha, and then the fusion ratio of the pixel values of the feature points corresponding to the first type of feature points in the second image is (1-alpha). That is, the target image includes two partial images (or pixels), one of which is an image (or pixel) in which a partial region of the first image is retained, and the other of which is an image (or pixel) of a corresponding region after the partial region of the first image is converted. Accordingly, the third image includes images (or pixels) other than the converted local area in the first image in addition to the target image.
As a possible implementation manner, the first type of feature points in the first image may be processed according to the following formula (3), to obtain a third image:
$$F = \alpha\,(1 - \text{mask}) \cdot f_{MO} \cdot M + (1 - \alpha)(1 - \text{mask}) \cdot M + \text{mask} \cdot M \tag{3}$$
where F represents the third image; M represents the first image (for example, when the first image comprises a face image, M may represent the reference face); O represents the second image (for example, when the second image comprises a face image, such as a facial organ, O may represent the target face); f_MO is the transformation matrix from M to O; and mask is a mask that filters out the non-D regions of M, where D is the region of M whose features need to be changed (e.g., one or more facial organs such as the eye region, mouth region or nose region).
Optionally, the mask may have the same size as M, for example a two-dimensional matrix in which mask(x, y) = 0 when the pixel (x, y) belongs to the region D, and mask(x, y) = 1 otherwise.
In formula (3), the term α(1 - mask) · f_MO · M means that each pixel of the D region is multiplied by the transformation matrix f_MO and by the fusion ratio α corresponding to that pixel, with (1 - mask) restricting the deformation to the D region only; this term represents the pixels obtained by converting the D region of M into the deformed corresponding region. The term (1 - α)(1 - mask) · M means that each pixel of the D region is multiplied by the fusion ratio (1 - α), representing the pixels of the D region of M that are retained. The term mask · M represents the pixels of the non-D part of M, i.e., the non-D region of M is kept entirely as in the original.
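A minimal sketch of the blending in formula (3), assuming the warped D-region image, the per-pixel fusion ratio α, and the binary mask (1 outside region D, 0 inside it) are already available as NumPy arrays:

```python
# Illustrative sketch of formula (3): blend the warped D-region pixels with
# the original image according to the per-pixel fusion ratio alpha, and keep
# all non-D pixels unchanged.
import numpy as np

def blend(original, warped, alpha, mask):
    """
    original: H x W x 3 first image M.
    warped:   H x W x 3 image after applying f_MO to the D region of M.
    alpha:    H x W per-pixel fusion ratio (values outside region D are ignored).
    mask:     H x W binary mask, 1 outside region D and 0 inside it.
    """
    alpha = alpha[..., None].astype(np.float64)
    mask = mask[..., None].astype(np.float64)
    fused = (alpha * (1 - mask) * warped
             + (1 - alpha) * (1 - mask) * original
             + mask * original)
    return np.clip(fused, 0, 255).astype(np.uint8)
```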
With continued reference to fig. 4, after the user clicks the "start conversion" area 404 in the interface shown in the (b) diagram, and after the feature points of the mouth region in the original image are obtained, the background may perform an affine transformation on the mouth region of the face image according to the affine transformation matrix to obtain the target pixel points that need to be fused into the original image, for example the mouth shape corresponding to the target feature map in the (b) diagram of fig. 4. The background can then calculate the fusion ratio for each position in the mouth region and, according to these fusion ratios, fuse the pixel values of the first type of feature points with the pixel values of the corresponding feature points in the second image to obtain the third image.
360, outputting the third image.
Illustratively, after acquiring the third image, the terminal device may display the third image through the interface. Referring to fig. 10, an example of a face after the original image and the target feature image in fig. 4 (b) are fused is shown, where the mouth region of the conversion result in the module 405 is obtained by fusing the mouth region in the original image in the module 401 and the mouth shape of the target feature image in the module 403 in fig. 4 (b).
Therefore, by classifying the feature points of the first image, the embodiments of the application can obtain the feature points of a local area (or specific area), for example the first type of feature points, transform the first type of feature points in the first image into the feature points corresponding to the first type of feature points in the second image, and then obtain the third image by fusing the pixel values of the first type of feature points with the pixel values of the corresponding feature points in the second image. In this way, the pixels of the local area (or specific area) in the first image can be fused with the pixels of the corresponding area of the second image, improving user experience.
Further, the embodiments of the application can adaptively adjust, based on the positions of different feature points within the local area, the fusion ratios of feature points at different positions in the composite image. This solves or mitigates the problem of unnatural edges, such as abrupt edge transitions, at the boundary of the local area (such as the mouth region, eye region or nose region) caused by image fusion, achieves a smooth transition at the edges of the local area during fusion, makes the fused result more realistic, and improves user experience.
The embodiments of the application can be applied to scenarios such as face special effects (for example, face fusion and face deformation). In particular, the embodiments of the application can adaptively fuse or deform local organs of a face while the other parts remain unchanged. For example, for an organ specified in the face to be transformed or deformed, such as the mouth, the mouth regions of two faces A and B may be fused and migrated according to a certain fusion ratio; for the edge positions of the mouth region, the embodiment of the application can adaptively adjust the fusion ratio according to the positions of the mouth feature points (for example, their distance from the center point of the region), so as to achieve a natural transition when the local regions are fused. As another example, an existing mouth shape in the program, such as a smile or a laugh, may be migrated to the user's mouth, or an organ such as the eyes or nose may be enlarged or reduced.
The specific embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solutions of the present application within the scope of the technical concept of the present application, and all the simple modifications belong to the protection scope of the present application. For example, the specific features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described in detail. As another example, any combination of the various embodiments of the present application may be made without departing from the spirit of the present application, which should also be considered as disclosed herein.
It should be further understood that, in the various method embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic of the processes, and should not constitute any limitation on the implementation process of the embodiments of the present application. It is to be understood that the numbers may be interchanged where appropriate such that the described embodiments of the application may be implemented in other sequences than those illustrated or described.
Method embodiments of the present application are described in detail above in connection with fig. 1-10, and apparatus embodiments of the present application are described in detail below in connection with fig. 11-12.
Fig. 11 is a schematic block diagram of an apparatus 600 for image processing according to an embodiment of the present application. As shown in fig. 11, the apparatus 600 may include an acquisition unit 610, a transformation unit 620, a fusion unit 630, and an output unit 640.
An acquisition unit 610 is configured to acquire a first image.
The obtaining unit 610 is further configured to obtain a first type of feature point in the first image.
The acquiring unit 610 is further configured to acquire a second image.
A transformation unit 620, configured to transform the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image.
A fusion unit 630, configured to fuse the pixel values of the first type of feature points with the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a third image.
An output unit 640, configured to output the third image.
In some alternative embodiments, the fusion unit 630 is specifically configured to:
determining the fusion ratio of the pixel values of the first type of feature points in the composite image according to the positions of the first type of feature points;
and according to the fusion ratio, fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain the third image.
In some alternative embodiments, the fusion unit 630 is specifically configured to:
acquiring a center point of the first type of feature points;
determining a distance between the center point and a feature point in the first type of feature points;
and determining the fusion ratio of the pixel values of the feature points in the first type of feature points in the composite image according to the distance.
In some alternative embodiments, the fusion unit 630 is specifically configured to:
determining the distance between the feature point i and the center point p according to the coordinates of the feature point i in the first type of feature points and the coordinates of the center point p of the first type of feature points;
and determining a fusion ratio α_i of the pixel values of the feature point i in the first type of feature points in the composite image according to the distance between the feature point i and the center point p and the maximum distance between the first type of feature points and the center point p.
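Stated as a formula, and noting that the text only requires that α_i depend on these two distances without fixing the functional form, one plausible instantiation (an assumption, with α_i read as the weight of the first image's pixel value) is:

```latex
\alpha_i \;=\; \frac{d(i, p)}{\max_{j} d(j, p)}, \qquad
I_3(i) \;=\; \alpha_i\, I_1(i) \;+\; \bigl(1 - \alpha_i\bigr)\, I_2(i')
```

where d(i, p) is the distance from feature point i to the center point p, the maximum runs over the first type of feature points, I_1 and I_2 are the pixel values of the first image and of the corresponding feature point i' in the second image, and I_3 is the fused pixel value.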
In some alternative embodiments, the fusion unit 630 is specifically configured to:
fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a target image;
and superposing the target image with the part of the first image other than the image corresponding to the first type of feature points to obtain the third image.
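A brief sketch of this composition step, assuming OpenCV and NumPy; the helper name compose_third_image, the use of a convex-hull mask, and the assumption that the fused region is already laid out at the first image's resolution are illustrative choices, not details from the patent.

```python
import cv2
import numpy as np

def compose_third_image(first_image, fused_region, region_points):
    """Overlay the fused local region onto the rest of the first image.

    `fused_region` is assumed to be an image of the same size as
    `first_image` whose pixels inside the region already hold the fused
    values; `region_points` are the first-type feature points (N x 2, int).
    """
    mask = np.zeros(first_image.shape[:2], dtype=np.uint8)
    hull = cv2.convexHull(np.asarray(region_points, dtype=np.int32))
    cv2.fillConvexPoly(mask, hull, 255)       # area covered by the feature points

    third = first_image.copy()
    third[mask > 0] = fused_region[mask > 0]  # keep everything else from the first image
    return third
```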
In some alternative embodiments, the transformation unit 620 is specifically configured to:
acquiring first positions of three points in the first type of feature points;
acquiring second positions of three points in the second image, which correspond to the three points in the first type of feature points;
determining a transformation matrix between the first positions and the second positions as a transformation matrix between the first type of feature points and the corresponding feature points of the second image;
and according to the transformation matrix, transforming the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image.
In some alternative embodiments, the transformation unit 620 is specifically configured to:
dividing the first type of feature points into at least one triangular region according to a triangulation method;
and determining the positions of three vertices of the at least one triangular region as the first positions.
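As a sketch of this step, assuming OpenCV, NumPy, and SciPy: Delaunay triangulation is used here as one concrete triangulation method (the text does not mandate a specific one), and cv2.getAffineTransform derives a 2x3 affine matrix from the three matching vertices of each triangle pair. The name triangle_affine_matrices is illustrative.

```python
import cv2
import numpy as np
from scipy.spatial import Delaunay

def triangle_affine_matrices(src_points, dst_points):
    """Affine matrix per triangle between two matching point sets.

    `src_points` are first-type feature points in the first image and
    `dst_points` the corresponding feature points in the second image,
    listed in the same order.
    """
    src = np.asarray(src_points, dtype=np.float32)
    dst = np.asarray(dst_points, dtype=np.float32)
    triangles = Delaunay(src).simplices          # vertex indices of each triangle

    matrices = []
    for tri in triangles:
        # Three vertex positions in each image define one 2x3 affine matrix.
        m = cv2.getAffineTransform(src[tri], dst[tri])
        matrices.append((tri, m))
    return matrices
```

Each matrix can then be applied with cv2.warpAffine to map the pixels of its triangle from one image to the other, triangle by triangle.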
In some alternative embodiments, the first image comprises a facial image and the first type of feature points comprise facial organ feature points.
In some alternative embodiments, the facial organ feature points include at least one of eye region feature points, nose region feature points, and mouth region feature points.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here. Specifically, the apparatus 600 for image processing in this embodiment may correspond to the entity that performs the method 300 in the embodiments of the present application, and the foregoing and other operations and/or functions of the modules in the apparatus 600 are respectively intended to implement the corresponding flows of the method in fig. 3; for brevity, they are not described again here.
The apparatus and system of the embodiments of the present application are described above in terms of functional modules with reference to the accompanying drawings. It should be understood that the functional modules may be implemented in hardware, by instructions in software, or by a combination of hardware and software modules. Specifically, the steps of the method embodiments in the embodiments of the present application may be completed by integrated logic circuits of hardware in a processor and/or by instructions in the form of software, and the steps of the methods disclosed in the embodiments of the present application may be directly performed by a hardware decoding processor or by a combination of hardware and software modules in the decoding processor. Optionally, the software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method embodiments in combination with its hardware.
Fig. 12 is a schematic block diagram of an electronic device 800 provided in an embodiment of the present application.
As shown in fig. 12, the electronic device 800 may include:
a memory 810 and a processor 820, the memory 810 being for storing a computer program and transmitting the program code to the processor 820. In other words, the processor 820 may call and run a computer program from the memory 810 to implement the method of image processing in the embodiments of the present application.
For example, the processor 820 may be used to perform the steps of the method 300 described above according to instructions in the computer program.
In some embodiments of the present application, the processor 820 may include, but is not limited to:
a general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
In some embodiments of the present application, the memory 810 includes, but is not limited to:
volatile memory and/or nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
In some embodiments of the present application, the computer program may be partitioned into one or more modules that are stored in the memory 810 and executed by the processor 820 to perform the image processing method provided herein. The one or more modules may be a series of computer program instruction segments capable of performing specified functions, and the instruction segments describe the execution of the computer program in the electronic device 800.
Optionally, as shown in fig. 12, the electronic device 800 may further include:
a transceiver 830, the transceiver 830 being connectable to the processor 820 or the memory 810.
Processor 820 may control transceiver 830 to communicate with other devices, and in particular, may send information or data to other devices or receive information or data sent by other devices. Transceiver 830 may include a transmitter and a receiver. Transceiver 830 may further include antennas, the number of which may be one or more.
It should be appreciated that the various components in the electronic device 800 are connected by a bus system that includes a power bus, a control bus, and a status signal bus in addition to a data bus.
According to an aspect of the present application, there is provided an apparatus for image processing, comprising a processor and a memory for storing a computer program, the processor being configured to invoke and run the computer program stored in the memory, so as to cause the apparatus to perform the method of the above method embodiments.
According to an aspect of the present application, there is provided a computer storage medium having stored thereon a computer program which, when executed by a computer, enables the computer to perform the method of the above-described method embodiments. Alternatively, embodiments of the present application also provide a computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method of the method embodiments described above.
According to another aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the computer device to perform the method of the above-described method embodiments.
In other words, when the embodiments are implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
It should be understood that in the embodiments of the present application, "B corresponding to a" means that B is associated with a. In one implementation, B may be determined from a. It should also be understood that determining B from a does not mean determining B from a alone, but may also determine B from a and/or other information.
In the description of the present application, unless otherwise indicated, "at least one" means one or more, and "a plurality" means two or more. In addition, "and/or" describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of" the following items or the like means any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may be single or plural.
It should be further understood that the description of the first, second, etc. in the embodiments of the present application is for purposes of illustration and distinction only, and does not represent a specific limitation on the number of devices in the embodiments of the present application, and should not constitute any limitation on the embodiments of the present application.
It should also be appreciated that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus, device, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes or substitutions are covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (17)

1. A method of image processing, comprising:
acquiring a first image;
acquiring first type feature points in the first image;
acquiring a second image;
transforming the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image;
fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a third image;
outputting the third image.
2. The method according to claim 1, wherein the fusing the pixel values of the first type of feature points with the pixel values of feature points corresponding to the first type of feature points in the second image to obtain a third image includes:
determining the fusion ratio of the pixel values of the first type of feature points in the composite image according to the positions of the first type of feature points;
and according to the fusion ratio, fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain the third image.
3. The method according to claim 2, wherein determining a fusion ratio of pixel values of the first type of feature points in the composite image according to the positions of the first type of feature points includes:
acquiring a center point of the first type of feature points;
determining a distance between the center point and a feature point in the first type of feature points;
and determining the fusion ratio of the pixel values of the feature points in the first type of feature points in the composite image according to the distance.
4. A method according to claim 3, wherein determining a fusion ratio of pixel values of feature points in the first type of feature points in the composite image according to the magnitude of the distance includes:
determining the distance between the feature point i and the center point p according to the coordinates of the feature point i in the first type of feature points and the coordinates of the center point p of the first type of feature points;
and determining a fusion ratio α_i of the pixel values of the feature point i in the first type of feature points in the composite image according to the distance between the feature point i and the center point p and the maximum distance between the first type of feature points and the center point p.
5. The method according to any one of claims 1-4, wherein the fusing the pixel values of the first type of feature points with the pixel values of feature points corresponding to the first type of feature points in the second image to obtain a third image includes:
fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a target image;
and superposing the target image with the part of the first image other than the image corresponding to the first type of feature points to obtain the third image.
6. The method according to any one of claims 1-5, wherein said transforming the first type of feature points in the first image into feature points in the second image corresponding to the first type of feature points comprises:
acquiring first positions of three points in the first type of feature points;
acquiring second positions of three points in the second image that correspond to the three points in the first type of feature points;
determining a transformation matrix between the first positions and the second positions as a transformation matrix between the first type of feature points and the corresponding feature points of the second image;
and transforming the first type of feature points into feature points corresponding to the first type of feature points in the second image according to the transformation matrix.
7. The method of claim 6, wherein the obtaining the first location of three points of the first type of feature point comprises:
dividing the first type of feature points into at least one triangular region according to a triangulation method;
and determining the positions of three vertices of the at least one triangular region as the first positions.
8. The method of any of claims 1-7, wherein the first image comprises a facial image and the first type of feature points comprise facial organ feature points.
9. The method of claim 8, wherein the facial organ feature points comprise at least one of eye region feature points, nose region feature points, and mouth region feature points.
10. An apparatus for image processing, comprising:
an acquisition unit configured to acquire a first image;
the acquisition unit is also used for acquiring first type feature points in the first image;
the acquisition unit is also used for acquiring a second image;
a transformation unit, configured to transform the first type of feature points in the first image into feature points corresponding to the first type of feature points in the second image;
the fusion unit is used for fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a third image;
and the output unit is used for outputting the third image.
11. The device according to claim 10, wherein the fusion unit is specifically configured to:
determining the fusion ratio of the pixel values of the first type of feature points in the composite image according to the positions of the first type of feature points;
and according to the fusion ratio, fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain the third image.
12. The device according to claim 11, wherein the fusion unit is specifically configured to:
acquiring a center point of the first type of feature points;
determining a distance between the center point and a feature point in the first type of feature points;
and determining the fusion ratio of the pixel values of the feature points in the first type of feature points in the composite image according to the distance.
13. The device according to claim 12, wherein the fusion unit is specifically configured to:
determining the distance between the feature point i and the center point p according to the coordinates of the feature point i in the first type of feature points and the coordinates of the center point p of the first type of feature points;
and determining a fusion ratio α_i of the pixel values of the feature point i in the first type of feature points in the composite image according to the distance between the feature point i and the center point p and the maximum distance between the first type of feature points and the center point p.
14. The device according to any one of claims 10-13, wherein the fusion unit is specifically configured to:
fusing the pixel values of the first type of feature points and the pixel values of the feature points corresponding to the first type of feature points in the second image to obtain a target image;
and superposing the target image with the part of the first image other than the image corresponding to the first type of feature points to obtain the third image.
15. An electronic device comprising a processor and a memory, the memory having instructions stored therein that when executed by the processor cause the processor to perform the method of any of claims 1-9.
16. A computer storage medium for storing a computer program, the computer program comprising instructions for performing the method of any one of claims 1-9.
17. A computer program product comprising computer program code which, when run by an electronic device, causes the electronic device to perform the method of any one of claims 1-9.