WO2021147966A1 - Image recognition method and device - Google Patents

Image recognition method and device (图像识别方法及装置)

Info

Publication number
WO2021147966A1
Authority
WO
WIPO (PCT)
Prior art keywords
image, recognized, key point, coordinates, splicing
Application number
PCT/CN2021/073150
Other languages
English (en)
French (fr)
Inventor
施雪梅
许强强
杨浩
Original Assignee
北京达佳互联信息技术有限公司
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2021147966A1
Priority to US17/746,842 (published as US20220279241A1)

Classifications

    • G06V 10/16: Image acquisition using multiple overlapping images; image stitching
    • G06T 3/04: Context-preserving transformations, e.g. by using an importance map
    • G06T 3/4038: Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T 7/11: Region-based segmentation
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06V 10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V 10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168: Feature extraction; face representation
    • H04N 21/2187: Live feed
    • H04N 21/4223: Cameras
    • H04N 21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/44016: Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N 21/4788: Supplemental services communicating with other users, e.g. chatting
    • H04N 21/8146: Monomedia components involving graphical data, e.g. 3D object, 2D graphics
    • G06T 2207/10016: Video; image sequence
    • G06T 2207/20021: Dividing image into blocks, subimages or windows

Definitions

  • The present disclosure relates to the field of video technology, and in particular to an image recognition method and device.
  • Video communication is widely used in application scenarios such as video calls, video conferences, and live video broadcasts.
  • A user can shoot video through the local terminal, which plays both the video captured locally and the video captured by the other party's terminal, so the user can see real-time video of both parties through the local terminal.
  • Users can apply special-effects processing to the video images; for example, in a live video broadcast, users place animated stickers on the video images of both parties.
  • The present disclosure provides an image recognition method and device.
  • The technical solutions of the present disclosure are as follows.
  • According to an aspect, an image recognition method is provided, including: acquiring a plurality of images to be recognized; splicing the plurality of images to be recognized to obtain a target image; inputting the target image into an image recognition model to obtain a plurality of first key points of the target image; and determining a respective second key point of each of the images to be recognized according to the plurality of first key points of the target image.
  • In some embodiments, the pixel coordinates of a first key point on the target image are first key point coordinates, and determining the respective second key point of each image to be recognized according to the plurality of first key points of the target image includes:
  • determining a coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into second key point coordinates on the image to be recognized;
  • converting the first key point coordinates into the second key point coordinates according to the coordinate conversion parameter; and
  • using the pixel at the second key point coordinates in the image to be recognized as the second key point.
  • In some embodiments, the target image includes a plurality of image regions respectively corresponding to the images to be recognized, and determining the coordinate conversion parameter corresponding to the first key point coordinates includes: determining the target image region in which the first key point coordinates are located in the target image, and determining the coordinate conversion parameter according to the image to be recognized corresponding to that target image region.
  • In some embodiments, the method further includes: determining the image boundary of each image to be recognized according to the pixel coordinates of its pixels; determining the pixel coordinates of the image boundaries on the target image to obtain image region division coordinates; and dividing the target image into the plurality of image regions respectively corresponding to the plurality of images to be recognized according to the image region division coordinates.
  • In some embodiments, determining the coordinate conversion parameter corresponding to the first key point coordinates includes: using a pixel in the image to be recognized as a reference pixel; using the pixel coordinates of the reference pixel on the image to be recognized as pre-splicing reference pixel coordinates and its pixel coordinates on the target image as post-splicing reference pixel coordinates; and determining the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates.
  • In some embodiments, determining the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates includes:
  • using the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
  • using the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
  • In some embodiments, converting the first key point coordinates into the second key point coordinates according to the coordinate conversion parameter includes:
  • when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates; or,
  • when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
  • In some embodiments, splicing the plurality of images to be recognized to obtain the target image includes: scaling at least one of the plurality of images to be recognized to obtain scaled images, the image sizes of the plurality of scaled images being the same; and splicing the plurality of scaled images to obtain the target image.
  • According to another aspect, a video live broadcast method is provided, including: obtaining the live video streams of a first account and a second account; extracting a first image to be recognized and a second image to be recognized from the respective live video streams; splicing the two images to obtain a target image; inputting the target image into an image recognition model to obtain a plurality of first key points; determining the respective second key points of the first and second images to be recognized according to the first key points; adding image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and adding image special effects to the second image to be recognized according to its second key points to obtain a second special-effect image; and playing the special-effect live videos of the first account and the second account.
  • According to another aspect, an image recognition device is provided, including:
  • an image acquisition unit configured to acquire a plurality of images to be recognized;
  • an image splicing unit configured to splice the plurality of images to be recognized to obtain a target image;
  • a key point recognition unit configured to input the target image into an image recognition model to obtain a plurality of first key points of the target image; and
  • a key point determination unit configured to determine the respective second key point of each image to be recognized according to the plurality of first key points of the target image.
  • In some embodiments, the pixel coordinates of a first key point on the target image are first key point coordinates, and the key point determination unit is configured to:
  • determine a coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into second key point coordinates on the image to be recognized;
  • convert the first key point coordinates into the second key point coordinates according to the coordinate conversion parameter; and
  • use the pixel at the second key point coordinates in the image to be recognized as the second key point.
  • In some embodiments, the target image includes a plurality of image regions respectively corresponding to the images to be recognized, and the key point determination unit is configured to: determine the target image region in which the first key point coordinates are located in the target image, and determine the coordinate conversion parameter corresponding to the first key point coordinates according to the image to be recognized corresponding to that target image region.
  • In some embodiments, the device further includes:
  • a dividing unit configured to determine the image boundary of each image to be recognized according to the pixel coordinates of its pixels; determine the pixel coordinates of the image boundaries on the target image to obtain image region division coordinates; and divide the target image into the plurality of image regions respectively corresponding to the plurality of images to be recognized according to the image region division coordinates.
  • In some embodiments, the key point determination unit is configured to: use a pixel in the image to be recognized as a reference pixel, use its pixel coordinates on the image to be recognized as the pre-splicing reference pixel coordinates and its pixel coordinates on the target image as the post-splicing reference pixel coordinates, and determine the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates.
  • In some embodiments, the key point determination unit is configured to:
  • use the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
  • use the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
  • In some embodiments, the key point determination unit is configured to:
  • when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtract the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates; or,
  • when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, add the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
  • In some embodiments, the image splicing unit is configured to: scale at least one of the plurality of images to be recognized to obtain scaled images, the image sizes of the plurality of scaled images being the same; and splice the plurality of scaled images to obtain the target image.
  • According to another aspect, a video live broadcast device is provided, including:
  • a video stream obtaining unit configured to obtain the live video stream of a first account and the live video stream of a second account;
  • an image acquisition unit configured to extract a first image to be recognized from the live video stream of the first account and extract a second image to be recognized from the live video stream of the second account;
  • an image splicing unit configured to splice the first image to be recognized and the second image to be recognized to obtain a target image;
  • a key point recognition unit configured to input the target image into an image recognition model to obtain a plurality of first key points of the target image;
  • a key point determination unit configured to determine the respective second key points of the first image to be recognized and the second image to be recognized according to the plurality of first key points of the target image;
  • a special effect adding unit configured to add image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and to add image special effects to the second image to be recognized according to its second key points to obtain a second special-effect image; and
  • a special effect playing unit configured to play the special-effect live video of the first account and the special-effect live video of the second account, where the special-effect live video of the first account includes the first special-effect image and the special-effect live video of the second account includes the second special-effect image.
  • According to another aspect, a computer device is provided, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the instructions to implement the method in the embodiments described in the above aspects.
  • According to another aspect, a storage medium is provided; when instructions in the storage medium are executed by a processor of a computer device, the computer device can execute the method in the embodiments described in the above aspects.
  • According to another aspect, a computer program product is provided, including computer program code that, when executed by a computer, causes the computer to execute the method in the embodiments described in the above aspects.
  • FIG. 1 is a schematic flowchart of an image recognition method according to an embodiment.
  • FIG. 2 is an application environment diagram of an image recognition method according to an embodiment.
  • FIG. 3 is an application scenario of a live video broadcast according to an embodiment.
  • FIG. 4 is a schematic diagram of a video playback interface according to an embodiment.
  • FIG. 5 is a schematic diagram of adding image special effects during a live video broadcast according to an embodiment.
  • FIG. 6 is a schematic diagram of adding image special effects to a video playback interface according to an embodiment.
  • FIG. 7 is a schematic diagram of the splicing edges of images to be recognized according to an embodiment.
  • FIG. 8 is a schematic diagram of a spliced image according to an embodiment.
  • FIG. 9 is a schematic diagram of the key points of a spliced image according to an embodiment.
  • FIG. 10 is a schematic diagram of the key points of an image according to an embodiment.
  • FIG. 11 is a schematic diagram of adding image special effects to an image according to key points, according to an embodiment.
  • FIG. 12 is a flowchart of the steps of determining the key points of an image according to an embodiment.
  • FIG. 13 is a schematic diagram of the two-dimensional coordinate system of a spliced image according to an embodiment.
  • FIG. 14 is a schematic diagram of determining the second key point coordinates according to an embodiment.
  • FIG. 15 is a schematic flowchart of a video live broadcast method according to an embodiment.
  • FIG. 16 is a structural block diagram of a live broadcast system according to an embodiment.
  • FIG. 17 is a schematic flowchart of a video live broadcast method according to an embodiment.
  • FIG. 18 is a structural block diagram of an image recognition device according to an embodiment.
  • FIG. 19 is a structural block diagram of a video live broadcast device according to an embodiment.
  • FIG. 20 is a structural block diagram of a computer device according to an embodiment.
  • According to an embodiment, an image recognition method is provided.
  • The image recognition method provided in this embodiment is applied in the application environment shown in FIG. 2.
  • The application environment includes a first terminal 21, a second terminal 22, and a server 23.
  • The first terminal 21 and the second terminal 22 include, but are not limited to, personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices.
  • The server 23 is implemented as an independent server or a server cluster composed of multiple servers.
  • The above image recognition method is applied in video communication scenarios such as video calls, video conferences, and live video broadcasts.
  • In particular, it is applied in scenarios where image special effects are added to the images in a video during video communication, and more generally in any scenario where multiple images are recognized.
  • An application scenario of a live video broadcast according to an embodiment is provided.
  • The first user logs in to the first account on the live video platform through the first terminal 21 and takes a selfie through the first terminal 21; the first terminal 21 sends the captured video stream to the server 23, and the server 23 sends the video stream of the first account to the second terminal 22.
  • The second user logs in to the second account on the live video platform through the second terminal 22 and takes a selfie through the second terminal 22; the second terminal 22 sends the captured video stream to the server 23, and the server 23 sends the video stream of the second account to the first terminal 21.
  • The first terminal 21 and the second terminal 22 thus both obtain the video streams of the first account and the second account; that is, each terminal obtains two video streams.
  • The first terminal 21 and the second terminal 22 each play the live video according to the two video streams, so both the first user and the second user can watch the live broadcast of themselves and the other party on their terminal.
  • The server 23 can also send the two video streams to a third terminal 24 of other users, who watch the live images of the first user and the second user through the third terminal 24.
  • FIG. 4 provides a schematic diagram of a video playback interface according to an embodiment.
  • The first user and the second user conducting the live video can watch the live broadcast of themselves and the other party in real time and communicate by voice, text, or other means; the live broadcast of both parties, as well as the content they exchange, can also be watched by other users in real time, so this application scenario is usually also called "live streaming".
  • FIG. 5 provides a schematic diagram of adding image special effects during a live video broadcast according to an embodiment.
  • The second user submits a special effect instruction through the second terminal 22, and special effects are added to the faces displayed in the pictures of the first account and the second account on the video playback interface.
  • To do this, the second terminal 22 needs to create an image recognition instance to perform image recognition on consecutive frames in a video stream, identify the key points in each image, add image special effects based on those key points, and display the images with the special effects added.
  • Since there are two video streams, the second terminal 22 needs to create an image recognition instance for each stream to input its images into the image recognition model, which outputs the key points of the images in both streams.
  • Executing an image recognition instance to perform image recognition through the image recognition model consumes the processing resources of the second terminal 22.
  • The image recognition method in the related art therefore consumes a large amount of terminal processing resources; on terminals with poor performance, executing multiple image recognition instances to recognize multiple video streams at the same time may cause problems such as screen freezes and delays due to insufficient processing resources.
  • The applicant conducted in-depth research on image recognition methods in the related art.
  • The applicant found that after the second terminal 22 creates an image recognition instance, it performs image recognition according to that instance and inputs the image into the image recognition model; during recognition, the second terminal 22 scans each pixel of the entire image in a certain order, and each scanning pass consumes considerable processing resources of the terminal. The applicant therefore proposes a new image recognition method, applicable to the above application scenarios, that completes image recognition through a single image recognition instance, reducing the consumption of terminal processing resources and improving the efficiency of image recognition.
  • The image recognition method of this embodiment is described taking its application to the second terminal 22 in FIG. 2 as an example; the method includes the following steps.
  • In step S11, a plurality of images to be recognized are acquired.
  • An image to be recognized is an image that is currently to undergo image recognition to obtain key points.
  • The image recognition method is applied in a video communication scenario: the first terminal 21 and the second terminal 22 have video applications installed, the first user logs in to the first account of the video application platform through the video application of the first terminal 21, and the second user logs in to the second account through the video application of the second terminal 22.
  • The first terminal 21 and the second terminal 22 are connected through the server 23 for video communication.
  • The first user shoots through the first terminal 21 to obtain the video stream of the first account, which is forwarded to the second terminal 22 through the server 23; the second user shoots through the second terminal 22 to obtain the video stream of the second account.
  • The second terminal 22 thus obtains two video streams.
  • The video application of the second terminal 22 provides a video playback interface on which video is played according to the images in the video streams of the first account and the second account.
  • The video playback interface of the second terminal 22 is divided into left and right sub-interfaces: the left sub-interface displays consecutive frames from the video stream of the first account, and the right sub-interface displays consecutive frames from the video stream of the second account.
  • The video application of the second terminal 22 provides an entry through which the user can request image special effects.
  • For example, a virtual button 51 labeled "facial expression special effect" is set on the video playback interface; when the user taps the virtual button 51, an expression special effect is added to the human faces in the images.
  • The second terminal 22 extracts images from the two video streams. Since each video stream contains multiple images, the second terminal 22 extracts one or more consecutive frames from each stream, obtaining the image of the first account and the image of the second account.
  • The image of the first account and the image of the second account serve as the aforementioned multiple images to be recognized.
  • In step S12, the plurality of images to be recognized are spliced to obtain a target image.
  • The target image is the image obtained by splicing the multiple images to be recognized.
  • The second terminal 22 splices the images to be recognized extracted from the two video streams and uses the spliced image as the aforementioned target image.
  • The second terminal 22 selects one of the image edges of each image to be recognized as its splicing edge and splices the plurality of images to be recognized along these splicing edges.
  • The splicing edges of the images to be recognized are brought together, thereby completing the splicing of the multiple images to be recognized.
  • For example, the second terminal 22 splices the images to be recognized left and right: for two images, the right image edge of one image and the left image edge of the other are selected as the splicing edges, and splicing is performed along the respective splicing edges of the two images.
  • FIG. 7 provides a schematic diagram of the splicing edges of images to be recognized according to an embodiment.
  • There are two images to be recognized: image 61, extracted from the video stream of the first account, and image 62, extracted from the video stream of the second account. The right image edge of image 61 and the left image edge of image 62 are selected as the splicing edges, and splicing is performed along the splicing edges of image 61 and image 62.
  • FIG. 8 provides a schematic diagram of a spliced image according to an embodiment. As shown in the figure, after splicing along the splicing edges of image 61 and image 62, a target image 63 composed of image 61 and image 62 is obtained.
  • Alternatively, the second terminal 22 splices the images to be recognized top and bottom: for example, the upper image edge of one image to be recognized and the lower image edge of another are selected as the splicing edges, and splicing is performed along them.
  • The second terminal 22 may also first generate a blank image, add the multiple images to be recognized onto the blank image, and use the resulting image as the aforementioned target image.
  • The second terminal 22 can use multiple splicing methods to splice the multiple images to be recognized into the target image; the present disclosure does not limit the splicing method.
  • Each image to be recognized is essentially a pixel array, and each pixel of the image has a corresponding pixel value and pixel coordinates.
  • Splicing multiple images to be recognized into a target image essentially generates a new pixel array representing the target image from the pixel arrays of the images to be recognized; that is, splicing changes the pixel values and pixel coordinates in the pixel array.
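  • To make the splicing concrete, the following is a minimal sketch (illustrative only, not part of the disclosure) of a left-right splice of two equally sized frames in Python with NumPy; the helper name splice_images is assumed:

        import numpy as np

        def splice_images(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
            # Splice two images to be recognized left-right into one target
            # image. Both inputs are H x W x C pixel arrays with the same
            # height; the right edge of img_a and the left edge of img_b
            # act as the splicing edges.
            if img_a.shape[0] != img_b.shape[0]:
                raise ValueError("images must have the same height before splicing")
            # The target image is a new pixel array: img_b's pixel values are
            # kept, but their x coordinates shift right by the width of img_a.
            return np.hstack([img_a, img_b])

        # Two 720 x 1280-pixel frames yield one 1440 x 1280-pixel target image.
        frame_a = np.zeros((1280, 720, 3), dtype=np.uint8)
        frame_b = np.zeros((1280, 720, 3), dtype=np.uint8)
        target = splice_images(frame_a, frame_b)
        assert target.shape == (1280, 1440, 3)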
  • In step S13, the target image is input into the image recognition model to obtain the first key points of the target image.
  • A first key point is a pixel with a specific characteristic in the target image; it is a key point of any part of a target object in the target image, for example a key point of a face or of facial features.
  • The second terminal 22 creates an image recognition instance for image recognition of the target image and executes the instance to input the target image into the image recognition model; according to the image recognition instance, each pixel in the target image is scanned to determine whether it is a key point.
  • The second terminal 22 recognizes the key points in the target image through the image recognition model and uses them as the aforementioned first key points.
  • According to the first key points in the target image, the second terminal 22 determines the pixel coordinates of each first key point in the two-dimensional coordinate system constructed on the target image.
  • FIG. 9 provides a schematic diagram of the key points of a target image according to an embodiment. As shown in the figure, after image recognition, the key points 64 with the contour features of the faces in the target image 63 are obtained.
  • In step S14, the respective second key point of each image to be recognized is determined according to the first key points of the target image.
  • The second terminal 22 uses the first key points of the target image to determine one or more pixels of each image to be recognized as its key points, obtaining the above second key points. For example, after obtaining the first key points of the target image, the second terminal 22 determines the pixel corresponding to each first key point in each image to be recognized and uses that pixel as a second key point of that image.
  • FIG. 10 provides a schematic diagram of the second key points of each image to be recognized according to an embodiment. As shown in the figure, after determining the first key points 64 of the target image 63, the second terminal 22 determines the second key points 65 of image 61 and image 62.
  • After the second terminal 22 obtains the second key points of each image to be recognized, it adds image special effects to each image according to its second key points and displays the images with the special effects added.
  • FIG. 11 provides a schematic diagram of adding image special effects to the images to be recognized according to the second key points, according to an embodiment.
  • After the second terminal 22 obtains the second key points 65 with the contour features of the human faces in image 61 and image 62, it adds an expression special effect to each face.
  • The second terminal 22 determines the second key points of each image to be recognized according to the first key points of the target image in one of the following ways.
  • In one way, after obtaining the target image, the second terminal 22 records, for each pixel of each image to be recognized, the corresponding pixel in the target image; after the first key points of the target image are obtained, the pixels corresponding to them in each image to be recognized are determined, thereby obtaining the second key points of the images to be recognized.
  • In another way, the second terminal 22 first determines at least one pixel in the image to be recognized as a reference pixel (for example, the pixel at an end point of the image) and records its pixel coordinates in the two-dimensional coordinate system constructed on the image to be recognized as the pre-splicing reference pixel coordinates. After obtaining the target image, the second terminal 22 determines the pixel coordinates of the reference pixel in the two-dimensional coordinate system constructed on the target image as the post-splicing reference pixel coordinates, and calculates the coordinate difference between the pre-splicing and post-splicing reference pixel coordinates as the coordinate conversion parameter.
  • After obtaining a first key point of the target image, the second terminal 22 converts its pixel coordinates on the target image into the pixel coordinates of the corresponding pixel on the image to be recognized according to the coordinate conversion parameter; the pixel at the converted coordinates is the second key point on the image to be recognized.
  • The second terminal 22 can also use other methods to determine the second key points of each image to be recognized according to the first key points of the target image.
  • When the second terminal 22 executes an image recognition instance, it inputs an image into the image recognition model, and during recognition the model scans each pixel; this scanning of each image consumes considerable terminal processing resources.
  • By splicing multiple images into one target image and inputting the target image into the image recognition model, the second terminal only needs to perform a single scanning pass over the target image instead of scanning each of the multiple images to be recognized separately, thereby saving the processing resources required for scanning.
  • In the above embodiment, the multiple images to be recognized are spliced into a target image, the target image is input into the image recognition model to obtain the first key points of the target image, and the second key points of the multiple images to be recognized are determined according to the first key points.
  • Therefore, image recognition of multiple images to be recognized is achieved by inputting only the target image into the image recognition model, and the key points of each image are obtained. This saves the processing resources required for image recognition and solves the problem that image recognition methods in the related art seriously consume terminal processing resources.
  • In particular, the second terminal 22 reduces the consumption of processing resources when recognizing image key points in order to add image special effects; as this consumption is reduced, problems such as frame freezes and delays in video communication caused by insufficient processing resources of the second terminal 22 are avoided.
  • In some embodiments, step S14 includes the following steps.
  • S121: Determine the coordinate conversion parameter corresponding to the first key point coordinates; the coordinate conversion parameter is a parameter used to convert the first key point coordinates into second key point coordinates on the image to be recognized.
  • The coordinate conversion parameter corresponding to the first key point coordinates may be the coordinate conversion parameter of the image to be recognized corresponding to the first key point, that is, a parameter for pixel coordinate conversion between that image to be recognized and the target image.
  • In this case, the step includes: for each first key point, determining the image to be recognized corresponding to the first key point, and determining the coordinate conversion parameter of that image to be recognized.
  • The second terminal 22 determines the pixel coordinates of the first key point on the target image as the aforementioned first key point coordinates.
  • A two-dimensional coordinate system is first constructed on the target image, and each pixel of the target image has corresponding pixel coordinates in this coordinate system.
  • FIG. 13 provides a schematic diagram of the two-dimensional coordinate system of a target image according to an embodiment: the lower-left corner of the target image is taken as the origin O, the lower horizontal edge of the target image as the X axis, and the left vertical edge of the target image as the Y axis, thereby constructing the two-dimensional coordinate system of the target image.
  • Each first key point 64 in the target image has corresponding first key point coordinates (X1, Y1) in this two-dimensional coordinate system.
  • After determining one or more first key point coordinates, the second terminal 22 determines the coordinate conversion parameter corresponding to each of them.
  • After splicing, the pixel coordinates of a pixel of the image to be recognized change from its coordinates on the image to be recognized to its coordinates on the target image.
  • The coordinate conversion parameter is therefore obtained from the difference between a pixel's coordinates on the image to be recognized and its coordinates on the target image. For example, if a pixel's coordinates on the image to be recognized are (5, 10) and its coordinates on the target image are (15, 10), the coordinate difference is (10, 0), and this difference is used as the coordinate conversion parameter.
  • Pixels belonging to different images to be recognized have different coordinate differences, so the corresponding coordinate conversion parameter is determined according to the first key point coordinates, and coordinate conversion is performed with the matching parameter.
  • S122: Convert the first key point coordinates into the second key point coordinates according to the coordinate conversion parameter corresponding to the first key point coordinates.
  • When the coordinate conversion parameter corresponding to the first key point coordinates is the coordinate conversion parameter of an image to be recognized, this step includes converting the first key point coordinates into the second key point coordinates according to that image's coordinate conversion parameter.
  • The second terminal 22 obtains the coordinate conversion parameter corresponding to the first key point coordinates and converts the first key point coordinates into the second key point coordinates accordingly; through the coordinate conversion parameter, the pixel coordinates of a key point on the target image are restored to its pixel coordinates on the image to be recognized.
  • S123: Use the pixel at the second key point coordinates in the image to be recognized as the second key point.
  • After determining the second key point coordinates, the second terminal 22 finds the pixel at those coordinates on the image to be recognized, uses it as the second key point of the image, and marks it.
  • FIG. 14 provides a schematic diagram of determining the second key point coordinates according to an embodiment. Assuming the first key point coordinates of the first key point 64 of the target image 63 are (15, 10) and the coordinate conversion parameter is the coordinate difference (10, 0), subtracting (10, 0) from (15, 10) yields the second key point coordinates (5, 10); the pixel at (5, 10) is found in image 62 and taken as the second key point 65.
  • In this embodiment, the first key point coordinates are converted into second key point coordinates according to the coordinate conversion parameter, and the pixel at the second key point coordinates in the image to be recognized is used as the second key point. With only a small number of coordinate conversion parameters, the second key points of each image to be recognized can therefore be determined from the multiple first key points of the target image, without establishing a one-to-one correspondence between the pixels of the images to be recognized and the pixels of the target image, which further saves the processing resources of the second terminal 22.
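  • As a minimal sketch of this conversion (illustrative helper names; the parameter is taken as the post-splicing reference pixel coordinates minus the pre-splicing reference pixel coordinates, matching the worked example above):

        import numpy as np

        def conversion_parameter(ref_before, ref_after):
            # Post-splicing reference pixel coordinates minus pre-splicing
            # reference pixel coordinates.
            return np.array(ref_after) - np.array(ref_before)

        def second_keypoint(first_kp, param):
            # Subtract the conversion parameter from the first key point
            # coordinates to recover the second key point coordinates.
            return np.array(first_kp) - param

        # Reference pixel (5, 10) before splicing becomes (15, 10) on the
        # target image, so the coordinate conversion parameter is (10, 0).
        param = conversion_parameter((5, 10), (15, 10))
        # A first key point at (15, 10) maps back to (5, 10) on image 62.
        assert tuple(second_keypoint((15, 10), param)) == (5, 10)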
  • In some embodiments, the target image includes multiple image regions, each with a corresponding image to be recognized.
  • In this case, step S121 includes: determining the target image region in which the first key point coordinates are located in the target image, and determining the coordinate conversion parameter corresponding to the first key point coordinates according to the image to be recognized corresponding to that target image region.
  • The second terminal 22 determines the image boundary of each image to be recognized from the pixel coordinates of its pixels, and these boundaries divide the target image obtained by splicing into multiple image regions.
  • After obtaining a first key point of the target image, the second terminal 22 first determines the image region in which the first key point coordinates are located as the aforementioned target image region, then determines the image to be recognized corresponding to that region, and determines the coordinate conversion parameter corresponding to the first key point coordinates according to that image.
  • That is, the second terminal 22 uses the coordinate conversion parameter of the image to be recognized corresponding to the target image region as the coordinate conversion parameter corresponding to the first key point coordinates.
  • In this embodiment, the coordinate conversion parameter corresponding to a first key point is determined from the image region in which it is located on the target image, so there is no need to record a coordinate conversion parameter for each pixel of the target image. This saves the processing resources required for image recognition, reduces terminal consumption, and improves image recognition efficiency.
  • In some embodiments, after step S12, the method further includes: dividing the target image into multiple image regions respectively corresponding to the multiple images to be recognized.
  • The second terminal 22 determines, from the pixel coordinates of each pixel of an image to be recognized, whether the pixel lies on the image boundary, thereby determining the image boundary of that image. The second terminal 22 then finds the pixel coordinates of the image boundary on the target image to obtain the image region division coordinates, and divides the target image into several image regions based on these coordinates, each region having a corresponding image to be recognized.
  • In this embodiment, the image boundary of each image to be recognized is determined from the pixel coordinates of its pixels, the image region division coordinates are determined on the target image using these boundaries, and the target image is divided based on the division coordinates into the image regions corresponding to the multiple images to be recognized. The image regions corresponding to the images to be recognized are thus obtained in a convenient manner, which improves the efficiency of image recognition; a region-lookup sketch follows.
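  • Under the left-right splicing assumed above, the region lookup reduces to comparing a first key point's x coordinate against the division coordinates; a sketch under that assumption (helper names are not from the disclosure):

        import numpy as np

        def build_regions(widths):
            # For a left-right splice, the image region division coordinates
            # are the x coordinates of the sub-image boundaries on the target
            # image; each region stores (x_start, x_end, conversion parameter).
            regions, x = [], 0
            for w in widths:
                # Every pixel of this sub-image is shifted right by x on the
                # target image, so the region's conversion parameter is (x, 0).
                regions.append((x, x + w, np.array([x, 0])))
                x += w
            return regions

        def parameter_for(regions, first_kp):
            # Find the region containing the first key point coordinates and
            # return that region's coordinate conversion parameter.
            for x_start, x_end, param in regions:
                if x_start <= first_kp[0] < x_end:
                    return param
            raise ValueError("first key point lies outside the target image")

        regions = build_regions([720, 720])         # two 720-pixel-wide sub-images
        param = parameter_for(regions, (900, 400))  # falls in the second region
        assert tuple(param) == (720, 0)             # second key point: (180, 400)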
  • In some embodiments, after step S12, the method further includes the following.
  • The second terminal 22 uses any one or more pixels in the image to be recognized as the aforementioned reference pixel; for example, the pixel at an end point of the image to be recognized is used as the reference pixel.
  • The second terminal 22 determines the pixel coordinates of the reference pixel on the image to be recognized as the pre-splicing reference pixel coordinates, and determines the pixel coordinates of the reference pixel on the target image as the post-splicing reference pixel coordinates.
  • The difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates is used as the coordinate conversion parameter; or,
  • the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates is used as the coordinate conversion parameter.
  • In this case, step S122 includes the following.
  • When the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, the coordinate conversion parameter is subtracted from the first key point coordinates to obtain the second key point coordinates.
  • When the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, the coordinate conversion parameter is added to the first key point coordinates to obtain the second key point coordinates.
  • For example, the first key point coordinates of a first key point on the target image are (20, 20), and the corresponding coordinate conversion parameter is the coordinate difference (10, 0); subtracting (10, 0) from (20, 20) gives the second key point coordinates (10, 20), and the pixel at (10, 20) on the image to be recognized is used as the second key point.
  • In this embodiment, the coordinate conversion parameter is used to obtain the second key points of each image from the first key points of the target image.
  • In some embodiments, step S12 includes: scaling at least one of the multiple images to be recognized to obtain scaled images, the image sizes of the multiple scaled images being the same, and splicing the multiple scaled images to obtain the target image.
  • The second terminal 22 scales the images to be recognized to adjust their image sizes, obtains multiple images of the same size as the aforementioned scaled images, and splices the scaled images to obtain the target image.
  • The second terminal 22 may scale all of the multiple images to be recognized or only some of them.
  • For example, the image size of image A is 720 x 1280 pixels and the image size of image B is 540 x 960 pixels. Image B is scaled to obtain a scaled image B' of 720 x 1280 pixels, and image A and the scaled image B' are spliced to obtain a target image with an image size of 1440 x 1280 pixels.
  • In this embodiment, the images to be recognized are scaled to scaled images of the same image size, so the terminal splices images of the same size into the target image, which reduces the resources consumed by the image splicing process.
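  • A sketch of this normalization using OpenCV and NumPy (the 720 x 1280 size follows the example above; cv2.resize takes a (width, height) tuple, and scale_and_splice is an illustrative name):

        import cv2
        import numpy as np

        def scale_and_splice(images, size=(720, 1280)):
            # Scale each image to be recognized to the same width x height,
            # then splice the scaled images left-right into the target image.
            scaled = [cv2.resize(img, size) for img in images]
            return np.hstack(scaled)

        # Image A is already 720 x 1280; image B (540 x 960) is scaled to
        # 720 x 1280, and the spliced target image is 1440 x 1280 pixels.
        img_a = np.zeros((1280, 720, 3), dtype=np.uint8)
        img_b = np.zeros((960, 540, 3), dtype=np.uint8)
        target = scale_and_splice([img_a, img_b])
        assert target.shape == (1280, 1440, 3)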
  • In some embodiments, step S11 includes: receiving the respective live video streams of the first account and the second account, and extracting images from them to obtain a first image to be recognized and a second image to be recognized.
  • In this case, the method further includes:
  • adding image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and adding image special effects to the second image to be recognized according to its second key points to obtain a second special-effect image; and
  • playing the special-effect live video of the first account and the special-effect live video of the second account, where the special-effect live video of the first account includes the first special-effect image and the special-effect live video of the second account includes the second special-effect image.
  • In detail, the second terminal 22 receives the video streams of the first account and the second account and extracts images from them to obtain the first image to be recognized and the second image to be recognized.
  • The target image is obtained by splicing the first image to be recognized and the second image to be recognized, and an image recognition instance is created and executed to input the target image into the image recognition model.
  • The image recognition model outputs the first key points of the target image, and the second terminal 22 obtains the respective second key points of the first and second images to be recognized according to the first key points.
  • The second terminal 22 adds image special effects to the first image to be recognized according to its second key points to obtain the aforementioned first special-effect image; similarly, it adds image special effects to the second image to be recognized according to its second key points to obtain the second special-effect image. For example, an expression special effect is added to the face in each image to be recognized.
  • By processing consecutive frames in this way, the second terminal 22 obtains multiple consecutive special-effect images and displays them in sequence, that is, it plays the live video that includes the image special effects.
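  • Putting the steps together, one frame from each stream can be processed with a single recognition pass, roughly as follows (model and add_effect are placeholders, not the disclosure's actual recognition model or effect renderer):

        import numpy as np

        def process_frame_pair(frame_a, frame_b, model, add_effect):
            # model(target) returns the first key points [(x, y), ...] of the
            # target image; add_effect(image, kps) draws a special effect on
            # an image at the given key points.
            target = np.hstack([frame_a, frame_b])   # one spliced target image
            first_kps = model(target)                # one model pass, one scan
            offset = frame_a.shape[1]                # x shift of frame_b's region
            kps_a = [(x, y) for x, y in first_kps if x < offset]
            kps_b = [(x - offset, y) for x, y in first_kps if x >= offset]
            return add_effect(frame_a, kps_a), add_effect(frame_b, kps_b)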
  • A video live broadcast method is also provided; taking its application to the second terminal 22 in FIG. 2 as an example, the method includes the following steps:
  • S151: Obtain the live video stream of the first account, and obtain the live video stream of the second account.
  • S152: Extract the first image to be recognized from the live video stream of the first account, and extract the second image to be recognized from the live video stream of the second account.
  • S153: Splice the first image to be recognized and the second image to be recognized to obtain a target image.
  • S154: Input the target image into the image recognition model to obtain multiple first key points of the target image.
  • S155: Determine the respective second key points of the first image to be recognized and the second image to be recognized according to the multiple first key points of the target image.
  • in this way, the first image to be recognized and the second image to be recognized are spliced into the target image, the target image is input into the image recognition model to obtain the first key points of the target image, and the second key points of each image to be recognized are determined according to the first key points. Therefore, only the target image needs to be input into the image recognition model to recognize multiple images to be recognized and obtain their respective key points.
  • there is no need to input each image into the image recognition model separately to identify its key points, which saves the processing resources the terminal requires for image recognition and solves the problem in the related art that image recognition methods severely consume terminal processing resources.
  • accordingly, the terminal consumes fewer processing resources when identifying key points of an image in order to add image special effects; with this consumption reduced, problems such as picture freezing and delay in video communication caused by insufficient terminal processing resources are avoided.
  • a live broadcast system 1600 is provided, including a first terminal 21 and a second terminal 22;
  • the first terminal 21 is configured to generate a live video stream of the first account and send it to the second terminal 22;
  • in some embodiments, the first terminal 21 sends the live video stream of the first account to the second terminal 22 through the server 23;
  • the second terminal 22 is configured to generate a live video stream of the second account;
  • the second terminal 22 is further configured to extract the first image to be recognized from the live video stream of the first account, and to extract the second image to be recognized from the live video stream of the second account;
  • the second terminal 22 is further configured to input the spliced image into the image recognition model to obtain multiple first key points of the target image;
  • the second terminal 22 is further configured to determine the respective second key points of the first image to be recognized and the second image to be recognized according to the multiple first key points of the target image;
  • the second terminal 22 is further configured to add image special effects to the first image to be recognized according to its second key points to obtain the first special-effect image, and to add image special effects to the second image to be recognized according to its second key points to obtain the second special-effect image;
  • the second terminal 22 is further configured to play the special-effect live video of the first account and the special-effect live video of the second account; the special-effect live video of the first account includes the first special-effect image, and the special-effect live video of the second account includes the second special-effect image.
  • S1701: obtain the video stream of the first account and the video stream of the second account;
  • S1702: extract images from the video stream of the first account and the video stream of the second account respectively, to obtain a first image to be recognized and a second image to be recognized;
  • S1703: scale the first image to be recognized and the second image to be recognized so that they have the same image size;
  • S1704: splice the first image to be recognized and the second image to be recognized to obtain a target image;
  • S1705: determine the respective reference pixels of the first image to be recognized and the second image to be recognized;
  • S1706: determine the pre-splicing reference pixel coordinates of each reference pixel on its own image, and determine the post-splicing reference pixel coordinates of each reference pixel on the spliced image;
  • S1707: calculate, for each image, the difference between the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates, obtaining a first coordinate conversion parameter and a second coordinate conversion parameter;
  • S1708: establish a correspondence between the first image to be recognized and the first coordinate conversion parameter, and between the second image to be recognized and the second coordinate conversion parameter;
  • S1709: create and execute an image recognition instance, and input the target image into the image recognition model to obtain multiple first key points in the target image;
  • S1710: determine, according to the image region where each first key point is located in the target image, whether it corresponds to the first image to be recognized or the second image to be recognized;
  • S1711: determine the corresponding first or second coordinate conversion parameter according to the image to be recognized that corresponds to the first key point;
  • S1712: subtract the first or second coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates on the first or second image to be recognized;
  • S1713: take the pixel at the second key point coordinates in the first or second image to be recognized as that image's second key point;
  • S1714: add image special effects to the first image to be recognized and the second image to be recognized according to their respective second key points, obtaining a first special-effect image and a second special-effect image;
  • S1715: play the special-effect live video of the first account, which includes the first special-effect image, and the special-effect live video of the second account, which includes the second special-effect image. A condensed per-frame sketch of steps S1701 to S1715 follows.
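The following is a condensed, hedged sketch of steps S1701 to S1715 for one frame from each stream; `recognize` stands in for the single image recognition instance and `add_effect` for the effect renderer (for example, the `apply_effect` toy above), and both frames are assumed pre-scaled to one size. This is an assumption-laden reading of the flow, not the reference implementation.

```python
import numpy as np

def process_frame_pair(frame_a, frame_b, recognize, add_effect):
    """One pass of S1703-S1714: splice, recognize once, map back, decorate."""
    h, w = frame_a.shape[:2]
    target = np.hstack([frame_a, frame_b])            # S1703-S1704

    # Reference pixel: each frame's top-left corner. Pre-splicing it is
    # (0, 0) on both; post-splicing it is (0, 0) for A and (w, 0) for B,
    # so the conversion parameters are the differences.  # S1705-S1708
    conv = {"a": np.array([0, 0]), "b": np.array([w, 0])}

    kps_a, kps_b = [], []
    for x, y in recognize(target):                     # S1709
        point = np.array([x, y])
        if x < w:                                      # S1710 region lookup
            kps_a.append(tuple(point - conv["a"]))     # S1711-S1713
        else:
            kps_b.append(tuple(point - conv["b"]))
    return add_effect(frame_a, kps_a), add_effect(frame_b, kps_b)  # S1714
```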
  • although the steps in the flowcharts of the present disclosure are displayed in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering restriction on their execution, and they may be executed in other orders. Moreover, at least some of the steps include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
  • an image recognition apparatus 1800 is provided, including:
  • the image acquisition unit 1801, configured to acquire multiple images to be recognized;
  • the image splicing unit 1802, configured to splice the multiple images to be recognized to obtain a target image;
  • the key point recognition unit 1803, configured to input the target image into the image recognition model to obtain multiple first key points of the target image;
  • the key point determining unit 1804, configured to determine the respective second key points of each image to be recognized according to the multiple first key points of the target image.
  • the pixel coordinates of a first key point on the target image are first key point coordinates, and
  • the key point determining unit 1804 is configured to execute:
  • determining the coordinate conversion parameter corresponding to the first key point coordinates; the coordinate conversion parameter is a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
  • converting the first key point coordinates into second key point coordinates according to the corresponding coordinate conversion parameter;
  • taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
  • the target image includes multiple image regions, each with a corresponding image to be recognized, and the key point determining unit 1804 is configured to execute:
  • determining, among the multiple image regions of the target image, the target image region in which the first key point coordinates are located;
  • determining the image to be recognized corresponding to that target image region as the image to be recognized corresponding to the first key point coordinates.
  • the apparatus further includes:
  • a dividing unit, configured to determine the image boundary of each image to be recognized based on the pixel coordinates of its pixels; determine the pixel coordinates of those image boundaries on the target image to obtain image region division coordinates; and,
  • according to the image region division coordinates, divide the target image into multiple image regions respectively corresponding to the multiple images to be recognized; a sketch of this division follows.
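A minimal sketch of the dividing unit's logic under the assumption of left-to-right splicing, where each image's right boundary becomes a vertical division coordinate on the target image; the function names are illustrative.

```python
def division_coordinates(widths):
    """x-coordinates of each spliced image's right boundary on the target.

    `widths` are the post-scaling widths of the images in splicing order;
    cumulative sums give the image region division coordinates, e.g.
    [720, 1440] for two 720-pixel-wide images.
    """
    bounds, x = [], 0
    for w in widths:
        x += w
        bounds.append(x)
    return bounds

def region_index(x, bounds):
    """Return the index of the image region containing x on the target."""
    for i, b in enumerate(bounds):
        if x < b:
            return i
    return len(bounds) - 1  # clamp points on the last boundary
```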
  • the key point determining unit 1804 is configured to execute:
  • determining at least one pixel in the image to be recognized as a reference pixel; determining the reference pixel's coordinates on the image to be recognized to obtain pre-splicing reference pixel coordinates, and its coordinates on the target image to obtain post-splicing reference pixel coordinates; and determining the coordinate conversion parameter based on the post-splicing and pre-splicing reference pixel coordinates.
  • the key point determining unit 1804 is configured to execute:
  • taking the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
  • taking the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
  • the key point determining unit 1804 is configured to execute:
  • when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates;
  • when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates,
  • adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates. Both sign conventions are sketched below.
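The two sign conventions read symmetrically in code. A small sketch, assuming coordinates are 2-vectors and using a corner reference pixel; the concrete numbers are only for illustration.

```python
import numpy as np

pre  = np.array([0, 0])    # reference pixel on the image to be recognized
post = np.array([720, 0])  # the same pixel on the spliced target image

first_kp = np.array([735, 10])  # a first key point on the target image

# Convention 1: parameter = post - pre, so subtract it.
conv = post - pre
second_kp = first_kp - conv            # -> [15, 10]

# Convention 2: parameter = pre - post, so add it instead.
conv2 = pre - post
assert np.array_equal(first_kp + conv2, second_kp)
```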
  • the image splicing unit 1802 is further configured to perform scaling on at least one of the multiple images to be recognized, the scaling being used to make the image sizes of the multiple images to be recognized the same.
  • a video live broadcast apparatus 1900 is provided, including:
  • the video stream obtaining unit 1901, configured to obtain the live video stream of the first account and the live video stream of the second account;
  • the image acquisition unit 1902, configured to extract the first image to be recognized from the live video stream of the first account, and the second image to be recognized from the live video stream of the second account;
  • the image splicing unit 1903, configured to splice the first image to be recognized and the second image to be recognized to obtain a target image;
  • the key point recognition unit 1904, configured to input the target image into the image recognition model to obtain multiple first key points of the target image;
  • the key point determining unit 1905, configured to determine the respective second key points of the first image to be recognized and the second image to be recognized according to the multiple first key points of the target image;
  • the special effect adding unit 1906, configured to add image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and to the second image to be recognized according to its second key points to obtain a second special-effect image;
  • the special effect playing unit 1907, configured to play the special-effect live video of the first account and the special-effect live video of the second account; the special-effect live video of the first account includes the first special-effect image, and that of the second account includes the second special-effect image.
  • for limitations on the image recognition apparatus and the video live broadcast apparatus, refer to the limitations on the image recognition method and the video live broadcast method above, which are not repeated here. Each module of the above apparatuses can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above modules can be embedded in, or independent of, the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
  • the image recognition apparatus and video live broadcast apparatus provided above can be used to implement the image recognition method and video live broadcast method provided in any of the above embodiments, with the corresponding functions and beneficial effects.
  • An embodiment of the present disclosure shows a computer device, which includes a processor;
  • a memory for storing instructions executable by the processor;
  • the processor is configured to execute the instructions to implement the following steps:
  • acquiring multiple images to be recognized; splicing the multiple images to be recognized to obtain a target image; inputting the target image into an image recognition model to obtain multiple first key points of the target image; and determining the respective second key points of each image to be recognized according to the multiple first key points.
  • the pixel coordinates of a first key point on the target image are first key point coordinates, and
  • the processor is configured to execute the instructions to implement the following steps:
  • determining the coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized; converting the first key point coordinates into second key point coordinates according to that parameter;
  • taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
  • the processor is configured to execute the instructions to implement the following steps:
  • determining, among the multiple image regions of the target image, the target image region in which the first key point coordinates are located, and determining the image to be recognized corresponding to that region as the image to be recognized corresponding to the first key point coordinates.
  • the processor is configured to execute the instructions to implement the following steps:
  • determining the image boundary of each image to be recognized according to the pixel coordinates of its pixels; determining the pixel coordinates of those boundaries on the target image to obtain image region division coordinates; and dividing the target image, according to them, into multiple image regions respectively corresponding to the multiple images to be recognized.
  • the processor is configured to execute the instructions to implement the following steps:
  • determining at least one pixel of the image to be recognized as a reference pixel; determining its coordinates on the image to be recognized to obtain pre-splicing reference pixel coordinates, and its coordinates on the target image to obtain post-splicing reference pixel coordinates; and determining the coordinate conversion parameter from these two coordinates.
  • the processor is configured to execute the instructions to implement the following steps:
  • taking the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
  • taking the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
  • the processor is configured to execute the instructions to implement the following steps:
  • when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates;
  • when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates,
  • adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
  • the processor is configured to execute the instructions to implement the following steps: scaling at least one of the multiple images to be recognized to obtain scaled images of the same image size, and splicing the multiple scaled images to obtain the target image.
  • An embodiment of the present disclosure shows another computer device, which includes a processor;
  • a memory for storing instructions executable by the processor;
  • the processor is configured to execute the instructions to implement the following steps:
  • obtaining the live video stream of the first account and the live video stream of the second account; extracting a first image to be recognized from the former and a second image to be recognized from the latter; splicing the two images to obtain a target image; inputting the target image into the image recognition model to obtain multiple third key points of the target image; and determining the respective fourth key points of the first image to be recognized and the second image to be recognized according to the multiple third key points;
  • adding image special effects to the first image to be recognized according to its fourth key points to obtain a first special-effect image, and
  • adding image special effects to the second image to be recognized according to its fourth key points to obtain a second special-effect image;
  • playing the special-effect live video of the first account and the special-effect live video of the second account; the special-effect live video of the first account includes the first special-effect image, and that of the second account includes the second special-effect image.
  • FIG. 20 shows a computer device according to an embodiment of the present disclosure.
  • the computer device is provided as a terminal, and its internal structure is shown in FIG. 20.
  • the computer device includes a processor, a memory, a network interface, a display screen, and an input apparatus connected through a system bus.
  • the processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with external terminals through a network connection.
  • the computer program, when executed by the processor, implements an image recognition method and a video live broadcast method.
  • the display screen of the computer device is a liquid crystal display or an electronic ink display;
  • the input apparatus of the computer device is a touch layer covering the display screen, or a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse, among others.
  • FIG. 20 is only a block diagram of part of the structure related to the solution of the present disclosure and does not limit the computer devices to which the solution is applied;
  • a computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
  • the present disclosure also provides a computer program product including computer program code; in response to the computer program code being run by a computer, the computer executes the above-mentioned image recognition method and video live broadcast method.
  • all or part of the processes in the above method embodiments can be completed by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided in the present disclosure can include non-volatile and/or volatile memory.
  • non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • volatile memory may include random access memory (RAM) or external cache memory.
  • by way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present disclosure relates to an image recognition method and apparatus. The method includes: acquiring multiple images to be recognized; splicing the multiple images to be recognized to obtain a target image; inputting the target image into an image recognition model to obtain multiple first key points of the target image; and determining, according to the multiple first key points of the target image, the respective second key points of each image to be recognized.

Description

Image recognition method and apparatus
The present disclosure claims priority to Chinese patent application No. 202010070867.X, filed on January 21, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of video technologies, and in particular to an image recognition method and apparatus.
Background
At present, with the development of video technologies, more and more users conduct video communication through terminals such as mobile phones or desktop computers. Video communication is widely used in application scenarios such as video calls, video conferences, and video live broadcasts. Usually, in these scenarios, a user can shoot video with the local terminal and play the video captured locally, and the local terminal can also play the video captured at the other end, so the user sees real-time video of both parties on the local terminal.
Usually, in these scenarios, the user can apply special-effect processing to the video images. For example, in a video live broadcast, the user can attach animated stickers to the video images of both parties.
Summary
The present disclosure provides an image recognition method and apparatus. The technical solutions of the present disclosure are as follows:
According to one aspect of the embodiments of the present disclosure, an image recognition method is provided, including:
acquiring multiple images to be recognized;
splicing the multiple images to be recognized to obtain a target image;
inputting the target image into an image recognition model to obtain multiple first key points of the target image;
determining, according to the multiple first key points of the target image, the respective second key points of each image to be recognized.
In some embodiments, the pixel coordinates of a first key point on the target image are first key point coordinates, and determining the respective second key points of each image to be recognized according to the multiple first key points includes:
determining a coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
converting the first key point coordinates into second key point coordinates according to the coordinate conversion parameter corresponding to the first key point coordinates;
taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
In some embodiments, the target image includes multiple image regions, each having a corresponding image to be recognized, and determining the coordinate conversion parameter corresponding to the first key point coordinates includes:
determining, among the multiple image regions in the spliced image, the target image region in which the first key point coordinates are located;
determining the coordinate conversion parameter corresponding to the first key point coordinates according to the image to be recognized that corresponds to the target image region.
In some embodiments, the method further includes:
determining the image boundary of the image to be recognized according to the pixel coordinates of the pixels in the image to be recognized;
determining the pixel coordinates of the image boundary on the target image to obtain image region division coordinates;
dividing the target image, according to the image region division coordinates, into multiple image regions respectively corresponding to the multiple images to be recognized.
In some embodiments, determining the coordinate conversion parameter corresponding to the first key point coordinates includes:
determining at least one pixel in the image to be recognized as a reference pixel;
determining the pixel coordinates of the reference pixel on the image to be recognized to obtain pre-splicing reference pixel coordinates, and determining the pixel coordinates of the reference pixel on the target image to obtain post-splicing reference pixel coordinates;
determining the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates.
In some embodiments, determining the coordinate conversion parameter based on the post-splicing and pre-splicing reference pixel coordinates includes:
taking the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
taking the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
In some embodiments, converting the first key point coordinates into the second key point coordinates according to the corresponding coordinate conversion parameter includes:
when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates;
when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
In some embodiments, splicing the multiple images to be recognized to obtain the target image includes:
scaling at least one of the multiple images to be recognized to obtain scaled images, the multiple scaled images having the same image size;
splicing the multiple scaled images to obtain the target image.
According to another aspect of the embodiments of the present disclosure, a video live broadcast method is provided, including:
obtaining the live video stream of a first account, and obtaining the live video stream of a second account;
extracting a first image to be recognized from the live video stream of the first account, and extracting a second image to be recognized from the live video stream of the second account;
splicing the first image to be recognized and the second image to be recognized to obtain a target image;
inputting the target image into an image recognition model to obtain multiple first key points of the target image;
determining, according to the multiple first key points of the target image, the respective second key points of the first image to be recognized and the second image to be recognized;
adding image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and adding image special effects to the second image to be recognized according to its second key points to obtain a second special-effect image;
playing the special-effect live video of the first account and the special-effect live video of the second account, the former including the first special-effect image and the latter including the second special-effect image.
According to another aspect of the embodiments of the present disclosure, an image recognition apparatus is provided, including:
an image acquisition unit configured to acquire multiple images to be recognized;
an image splicing unit configured to splice the multiple images to be recognized to obtain a target image;
a key point recognition unit configured to input the target image into an image recognition model to obtain multiple first key points of the target image;
a key point determining unit configured to determine, according to the multiple first key points of the target image, the respective second key points of each image to be recognized.
In some embodiments, the pixel coordinates of a first key point on the target image are first key point coordinates, and the key point determining unit is configured to execute:
determining a coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
converting the first key point coordinates into second key point coordinates according to the coordinate conversion parameter corresponding to the first key point coordinates;
taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
In some embodiments, the target image includes multiple image regions, each having a corresponding image to be recognized, and the key point determining unit is configured to execute:
determining, among the multiple image regions in the spliced image, the target image region in which the first key point coordinates are located;
determining the coordinate conversion parameter corresponding to the first key point coordinates according to the image to be recognized that corresponds to the target image region.
In some embodiments, the apparatus further includes:
a dividing unit configured to determine the image boundary of the image to be recognized according to the pixel coordinates of its pixels; determine the pixel coordinates of the image boundary on the target image to obtain image region division coordinates; and divide the target image, according to the image region division coordinates, into multiple image regions respectively corresponding to the multiple images to be recognized.
In some embodiments, the key point determining unit is configured to execute:
determining at least one pixel in the image to be recognized as a reference pixel;
determining the pixel coordinates of the reference pixel on the image to be recognized to obtain pre-splicing reference pixel coordinates, and determining its pixel coordinates on the target image to obtain post-splicing reference pixel coordinates;
determining the coordinate conversion parameter based on the post-splicing and pre-splicing reference pixel coordinates.
In some embodiments, the key point determining unit is configured to execute:
taking the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
taking the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
In some embodiments, the key point determining unit is configured to execute:
when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates;
when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
In some embodiments, the image splicing unit is configured to execute:
scaling at least one of the multiple images to be recognized to obtain scaled images, the multiple scaled images having the same image size;
splicing the multiple scaled images to obtain the target image.
According to another aspect of the embodiments of the present disclosure, a video live broadcast apparatus is provided, including:
a video stream obtaining unit configured to obtain the live video stream of a first account and the live video stream of a second account;
an image acquisition unit configured to extract a first image to be recognized from the live video stream of the first account and a second image to be recognized from the live video stream of the second account;
an image splicing unit configured to splice the first image to be recognized and the second image to be recognized to obtain a target image;
a key point recognition unit configured to input the target image into an image recognition model to obtain multiple first key points of the target image;
a key point determining unit configured to determine, according to the multiple first key points of the target image, the respective second key points of the first image to be recognized and the second image to be recognized;
a special effect adding unit configured to add image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and to the second image to be recognized according to its second key points to obtain a second special-effect image;
a special effect playing unit configured to play the special-effect live video of the first account and that of the second account, the former including the first special-effect image and the latter including the second special-effect image.
According to another aspect of the embodiments of the present disclosure, a computer device is provided, including: a processor; and a memory for storing instructions executable by the processor; the processor is configured to execute the instructions to implement the method of the embodiments described in the above aspects.
According to another aspect of the embodiments of the present disclosure, a storage medium is provided; the instructions in the storage medium are executed by the processor of a computer device, enabling the computer device to execute the method of the embodiments described in the above aspects.
According to another aspect of the embodiments of the present disclosure, a computer program product is provided, including computer program code; the computer program code is run by a computer, causing the computer to execute the method of the embodiments described in the above aspects.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an image recognition method according to an embodiment;
FIG. 2 is a diagram of an application environment of an image recognition method according to an embodiment;
FIG. 3 is an application scenario of a video live broadcast according to an embodiment;
FIG. 4 is a schematic diagram of a video playback interface according to an embodiment;
FIG. 5 is a schematic diagram of adding image special effects during a video live broadcast according to an embodiment;
FIG. 6 is a schematic diagram of adding image special effects on a video playback interface according to an embodiment;
FIG. 7 is a schematic diagram of the splicing edges of images according to an embodiment;
FIG. 8 is a schematic diagram of a spliced image according to an embodiment;
FIG. 9 is a schematic diagram of the key points of a spliced image according to an embodiment;
FIG. 10 is a schematic diagram of the key points of an image according to an embodiment;
FIG. 11 is a schematic diagram of adding image special effects to an image according to key points, according to an embodiment;
FIG. 12 is a flowchart of the step of determining key points of an image according to an embodiment;
FIG. 13 is a schematic diagram of the two-dimensional coordinate system of a spliced image according to an embodiment;
FIG. 14 is a schematic diagram of determining second key point coordinates according to an embodiment;
FIG. 15 is a schematic flowchart of a video live broadcast method according to an embodiment;
FIG. 16 is a structural block diagram of a live broadcast system according to an embodiment;
FIG. 17 is a schematic flowchart of a video live broadcast method according to an embodiment;
FIG. 18 is a structural block diagram of an image recognition apparatus according to an embodiment;
FIG. 19 is a structural block diagram of a video live broadcast apparatus according to an embodiment;
FIG. 20 is a structural block diagram of a computer device according to an embodiment.
Detailed Description
To make the objectives, technical solutions, and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here are only intended to explain the present disclosure, not to limit it.
In some embodiments, as shown in FIG. 1, an image recognition method is provided. The method is applied in the application environment shown in FIG. 2, which includes a first terminal 21, a second terminal 22, and a server 23. The first terminal 21 and the second terminal 22 include, but are not limited to, personal computers, laptops, smartphones, tablets, and portable wearable devices. The server 23 is an independent server or a server cluster composed of multiple servers.
In some embodiments, the above image recognition method is applied in video communication scenarios such as video calls, video conferences, and video live broadcasts; for example, in the scenario of adding image special effects to video images during video communication, or in scenarios where multiple images need to be recognized.
For example, referring to FIG. 3, an application scenario of a video live broadcast according to an embodiment is provided. As shown, a first user logs in to a first account on the video live broadcast platform through the first terminal 21 and shoots video of themselves; the first terminal 21 sends the captured video stream to the server 23, which forwards the first account's video stream to the second terminal 22. A second user logs in to a second account through the second terminal 22 and likewise shoots video; the second terminal 22 sends its captured stream to the server 23, which forwards the second account's stream to the first terminal 21. Thus both terminals obtain the video streams of both accounts, that is, each terminal obtains two video streams and plays a live broadcast from them. Both users can watch the live pictures of themselves and the other party on their terminals. In addition, the server 23 can send the two video streams to third terminals 24 of other users, who watch the live pictures of the first and second users through those terminals.
Referring to FIG. 4, a schematic diagram of a video playback interface according to an embodiment is provided. As shown, the video playback interfaces of the first terminal 21, the second terminal 22, and the third terminal 24 simultaneously play the video streams of the first account and the second account. In this scenario, the first and second users can watch the live pictures of both parties in real time and communicate by at least one of voice, text, and other means, and the live pictures of both parties, together with their exchanges, can also be watched by other users in real time; this scenario is therefore commonly called "live co-hosting" (lian mai).
During a video live broadcast, users can add image special effects to people, backgrounds, and other content. Referring to FIG. 5, a schematic diagram of adding image special effects during a live broadcast is provided: the second user submits a special-effect instruction through the second terminal 22, and expression special effects are added to the faces displayed in the pictures of the first account and the second account on the video playback interface.
To add image special effects, the second terminal 22 needs to create an image recognition instance to perform image recognition on consecutive frames of the video streams, identify the key points in the images, add image special effects according to those key points, and display the resulting images. In the above live broadcast scenario there are two video streams, so the second terminal 22 would need to create an image recognition instance for each stream, input the images of each stream into the image recognition model separately, and obtain the key points of the images of both streams through the model.
However, executing an image recognition instance to recognize images through the image recognition model consumes the processing resources of the second terminal 22, and keeping the live broadcast real-time requires executing multiple image recognition instances at the same time; the image recognition method in the related art therefore consumes a large amount of the terminal's processing resources. For a terminal with weaker performance, running multiple image recognition instances on multiple video streams simultaneously may cause picture freezing, delay, and similar problems due to insufficient processing resources.
In view of the above problems, the applicant studied the image recognition method in the related art in depth and found that after the second terminal 22 creates an image recognition instance, it executes recognition according to that instance and inputs the image into the image recognition model; during recognition, the second terminal 22 scans every pixel of the whole image in a certain order, and each scanning pass consumes considerable processing resources. The applicant therefore proposes a new image recognition method which, applied to the above scenarios, completes image recognition with a single image recognition instance, reducing the consumption of the terminal's processing resources and improving recognition efficiency.
An image recognition method in this embodiment is described by taking its application to the second terminal 22 in FIG. 2 as an example, and includes the following steps:
In step S11, multiple images to be recognized are acquired.
An image to be recognized is an image on which image recognition is currently to be performed to obtain key points.
In some embodiments, the image processing method is applied in a video communication scenario; a video application is installed on the first terminal 21 and the second terminal 22, the first user logs in to a first account of the video application platform through the first terminal 21, and the second user logs in to a second account through the second terminal 22. The two terminals are connected through the server 23 for video communication. The first user shoots with the first terminal 21 to obtain the first account's video stream, which is forwarded to the second terminal 22 through the server 23; the second user shoots with the second terminal 22 to obtain the second account's video stream. The second terminal 22 thus obtains two video streams.
The video application of the second terminal 22 provides a video playback interface on which video is played from the images in the two accounts' video streams. For example, referring to FIG. 4, the playback interface of the second terminal 22 is divided into left and right sub-interfaces; the left one displays consecutive frames from the first account's stream, and the right one displays consecutive frames from the second account's stream.
The video application of the second terminal 22 provides a special-effect entry for the user to request image special effects. For example, referring to FIG. 6, a virtual button 51 for a "facial expression effect" is provided on the playback interface; when the user taps it, an expression special effect can be added to the faces in the images. In response to the user's request, the second terminal 22 extracts images from the two video streams. Since each stream contains many images, the second terminal 22 extracts one frame, or multiple consecutive frames, from each stream, obtaining the first account's image and the second account's image. In the embodiments of the present disclosure, these are taken as the multiple images to be recognized. A hedged sketch of pulling one frame from each stream follows.
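A hedged sketch of the frame extraction, assuming the two streams are reachable as URLs that OpenCV can open; real clients would keep the captures open across frames and handle reconnection, so this is only illustrative.

```python
import cv2

def grab_frames(url_first_account, url_second_account):
    """Pull one frame from each account's stream (illustrative only)."""
    frames = []
    for url in (url_first_account, url_second_account):
        cap = cv2.VideoCapture(url)
        ok, frame = cap.read()
        cap.release()
        frames.append(frame if ok else None)
    return frames  # [first image to be recognized, second image to be recognized]
```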
In step S12, the multiple images to be recognized are spliced to obtain a target image.
The target image is the image obtained by splicing the multiple images to be recognized.
In some embodiments, the second terminal 22 splices the images extracted from the two video streams and takes the spliced result as the target image.
Splicing can be implemented in several ways. In some embodiments, for each image to be recognized, the second terminal 22 selects one of its image edges as a splicing edge and splices the multiple images so that their splicing edges coincide, thereby completing the splicing.
In some embodiments, the second terminal 22 splices the images left and right. For example, for two images, the right edge of one image and the left edge of the other are selected as splicing edges, and the images are spliced along them.
Referring to FIG. 7, a schematic diagram of the splicing edges of images to be recognized is provided. As shown, there are two images to be recognized, image 61 and image 62, extracted from the video streams of the first and second accounts. The right edge of image 61 and the left edge of image 62 are selected as splicing edges, and the two images are spliced along them.
Referring to FIG. 8, a schematic diagram of a spliced image is provided. As shown, splicing images 61 and 62 along their splicing edges yields a target image 63 composed of the two.
In some embodiments, the second terminal 22 splices the images top and bottom: the top edge of one image and the bottom edge of another are selected as splicing edges, and the images are spliced along them.
In some embodiments, the second terminal 22 first generates a blank image, adds the multiple images to be recognized to it, and takes the result as the target image.
In some embodiments, the second terminal 22 splices the multiple images into the target image in any of several ways; the present disclosure does not limit the splicing method.
In some embodiments, each image to be recognized essentially consists of a pixel array, and every pixel has a corresponding pixel value and pixel coordinates. Splicing multiple images into the target image essentially generates, from the pixel arrays of the images to be recognized, a new pixel array representing the target image; that is, splicing changes the pixel values and pixel coordinates in the array. A sketch of the blank-image strategy as plain array operations follows.
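Since each image is just a pixel array, the blank-image splicing strategy reduces to slice assignment. A minimal sketch assuming two same-sized arrays pasted left and right:

```python
import numpy as np

def paste_splice(img_left, img_right):
    """Splice by pasting both pixel arrays onto one blank target array."""
    h, w = img_left.shape[:2]
    assert img_right.shape[:2] == (h, w), "scale the images to one size first"
    target = np.zeros((h, 2 * w, 3), dtype=img_left.dtype)  # blank image
    target[:, :w] = img_left    # left region keeps its pixel coordinates
    target[:, w:] = img_right   # right region's x-coordinates shift by +w
    return target
```

The slice assignments are exactly the change of pixel coordinates described above: every pixel of the right image moves from (x, y) to (x + w, y).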
In step S13, the target image is input into the image recognition model to obtain the first key points of the target image.
A first key point is a pixel of the target image that has a specific feature; it is a key point of any part of a target object in the target image, for example a face key point or a facial-feature key point.
In some embodiments, the second terminal 22 creates an image recognition instance for recognizing the target image and executes it to input the target image into the image recognition model; the second terminal 22 then scans each pixel of the target image according to the instance and judges whether the pixel is a key point.
Through the image recognition model, the second terminal 22 identifies the key points of the target image as the first key points, and determines the pixel coordinates of each first key point in the two-dimensional coordinate system constructed on the target image.
Referring to FIG. 9, a schematic diagram of the key points of a target image is provided. As shown, image recognition yields the key points 64 of the target image 63 that have face-contour features.
In step S14, the respective second key points of each image to be recognized are determined according to the first key points of the target image.
In some embodiments, the second terminal 22 uses the first key points of the target image to determine one or more pixels of each image to be recognized as key points, obtaining the second key points. For example, after obtaining the first key points, the second terminal 22 determines the pixel in each image to be recognized that corresponds to each first key point and takes it as a second key point of that image.
Referring to FIG. 10, a schematic diagram of the second key points of each image to be recognized is provided. As shown, after determining the first key points 64 of the target image 63, the second terminal 22 determines the respective second key points 65 of image 61 and image 62.
In some embodiments, after obtaining the second key points of each image to be recognized, the second terminal 22 adds image special effects to each image according to its second key points and displays the images with the effects added.
Referring to FIG. 11, a schematic diagram of adding image special effects according to the second key points is provided. As shown, after obtaining the second key points 65 of images 61 and 62 that have face-contour features, the second terminal 22 adds expression special effects to the faces.
The second terminal 22 can determine the second key points from the first key points in several ways.
In some embodiments, after obtaining the target image, the second terminal 22 records, for each pixel of the images to be recognized, the corresponding pixel in the target image; after obtaining the first key points, it determines the pixels in each image that correspond to them, thereby obtaining the second key points.
In some embodiments, the second terminal 22 first determines at least one pixel of the image to be recognized as a reference pixel, for example a pixel at an image corner, and records that pixel's coordinates in the coordinate system of the image to be recognized as the pre-splicing reference pixel coordinates. After obtaining the target image, it determines the same pixel's coordinates in the coordinate system of the target image as the post-splicing reference pixel coordinates, and computes the coordinate difference between the two as the coordinate conversion parameter. After obtaining the first key points, the second terminal 22 uses each first key point's coordinates on the target image and the coordinate conversion parameter to convert them into coordinates on the image to be recognized; the pixel at the converted coordinates is that image's second key point.
Of course, the second terminal 22 can also determine the second key points from the first key points in other ways.
In some embodiments, when the second terminal 22 executes an image recognition instance, it inputs the target image into the image recognition model; recognizing the target image is essentially a process in which the second terminal 22 scans every pixel of the whole image, and scanning each image consumes considerable processing resources. By splicing multiple images into one target image and inputting it into the model, the second terminal needs only a single scanning pass over the target image instead of separate passes over each image to be recognized, saving the processing resources consumed by scanning.
In the above image recognition method, multiple images to be recognized are acquired and spliced into a target image, the target image is input into the image recognition model to obtain its first key points, and the respective second key points of the multiple images are determined from the first key points. Only the target image needs to be input into the model to recognize all the images and obtain their key points; there is no need to execute multiple image recognition instances and input each image into the model separately. This saves the processing resources the second terminal 22 needs for image recognition and solves the problem in the related art that image recognition severely consumes terminal processing resources.
Moreover, when this method is applied to the scenario of adding image special effects during video communication, the second terminal 22 consumes fewer processing resources when identifying key points to add effects; with this consumption reduced, problems such as picture freezing and delay in video communication caused by insufficient processing resources of the second terminal 22 are avoided.
As shown in FIG. 12, in some embodiments a flowchart of the step of determining the key points of an image is provided; the pixel coordinates of a first key point on the target image are the first key point coordinates, and step S14 includes:
S121: determining the coordinate conversion parameter corresponding to the first key point coordinates; the coordinate conversion parameter is a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized.
The coordinate conversion parameter corresponding to the first key point coordinates may be the conversion parameter of the image to be recognized that corresponds to the first key point, that is, the parameter for converting pixel coordinates between that image and the target image. Accordingly, this step includes: for each first key point, determining its corresponding image to be recognized, and determining that image's coordinate conversion parameter.
In some embodiments, after obtaining a first key point, the second terminal 22 determines its pixel coordinates on the target image as the first key point coordinates.
In some embodiments, to determine these coordinates, a two-dimensional coordinate system is first constructed on the target image, in which every pixel of the target image has corresponding pixel coordinates.
FIG. 13 provides a schematic diagram of the two-dimensional coordinate system of a target image. As shown, the bottom-left corner of the target image is the origin O, the horizontal bottom edge is the X axis, and the vertical left edge is the Y axis. Each first key point 64 of the target image has corresponding first key point coordinates (X1, Y1) in this system.
After determining one or more first key point coordinates, the second terminal 22 determines the coordinate conversion parameter corresponding to them.
In some embodiments, after the multiple images are spliced into the target image, the pixel coordinates of a pixel on its image to be recognized are changed to that pixel's coordinates on the target image. To determine, from a first key point's coordinates on the target image, its coordinates on the image to be recognized, the coordinate conversion parameter is used to convert the former into the latter.
The coordinate conversion parameter is obtained, after the target image is generated, from the difference between a pixel's coordinates on the image to be recognized and that pixel's coordinates on the target image.
For example, if a pixel's coordinates on the image to be recognized are (5, 10) and its coordinates on the target image are (15, 10), the coordinate difference is (10, 0), which is taken as the coordinate conversion parameter.
Since, after splicing, the difference between a pixel's coordinates on its own image and on the target image varies from image to image, the conversion parameter corresponding to the first key point coordinates is determined so that conversion can be performed with the matching parameter. A short sketch of deriving and applying the parameter follows.
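Under the assumptions above, deriving and applying the conversion parameter is two subtractions; the helper names below are hypothetical.

```python
def conversion_parameter(pre_xy, post_xy):
    """Post-splicing minus pre-splicing coordinates of a reference pixel."""
    return (post_xy[0] - pre_xy[0], post_xy[1] - pre_xy[1])

def to_second_keypoint(first_kp_xy, param):
    """Map a first key point on the target back onto its source image."""
    return (first_kp_xy[0] - param[0], first_kp_xy[1] - param[1])

# The worked example from the text: a pixel at (5, 10) lands at (15, 10)
# after splicing, so the parameter is (10, 0), and a first key point at
# (15, 10) maps back to the second key point (5, 10).
param = conversion_parameter((5, 10), (15, 10))
assert param == (10, 0)
assert to_second_keypoint((15, 10), param) == (5, 10)
```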
S122: converting the first key point coordinates into second key point coordinates according to the corresponding coordinate conversion parameter.
In some embodiments, the coordinate conversion parameter corresponding to the first key point coordinates is the conversion parameter of the image to be recognized; this step then includes converting the first key point coordinates into the second key point coordinates according to that image's parameter.
In some embodiments, the second terminal 22 obtains the conversion parameter corresponding to the first key point coordinates and converts them into second key point coordinates accordingly; through the parameter, the key point's pixel coordinates on the target image are restored to pixel coordinates on the image to be recognized.
S123: taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
In some embodiments, after determining the second key point coordinates, the second terminal 22 looks up the pixel at those coordinates on the image to be recognized as its second key point and marks it.
FIG. 14 provides a schematic diagram of determining the second key point coordinates. Suppose the first key point 64 of the target image 63 has coordinates (15, 10) and the conversion parameter is the coordinate difference (10, 0); subtracting (10, 0) from (15, 10) yields the second key point coordinates (5, 10), and the pixel at (5, 10) in image 62 is the second key point 65.
In the above method, the conversion parameter corresponding to the first key point coordinates is determined first, the first key point coordinates are converted into second key point coordinates according to it, and the pixel at those coordinates in the image to be recognized is taken as the second key point. With only a small number of conversion parameters, the second key points of every image to be recognized can be determined from the first key points of the target image, without establishing a one-to-one correspondence between the pixels of the images to be recognized and the pixels of the target image; this further saves the processing resources of the second terminal 22.
In some embodiments, the target image includes multiple image regions, each with a corresponding image to be recognized, and step S121 includes:
determining, among the multiple image regions of the target image, the target image region in which the first key point coordinates are located; and determining the conversion parameter corresponding to the first key point coordinates according to the image to be recognized that corresponds to that region.
In some embodiments, when splicing the images into the target image, the second terminal 22 determines each image's boundary from the pixel coordinates of its pixels and, based on those boundaries, divides the spliced target image into multiple image regions. After obtaining the first key points, the second terminal 22 first determines the region in which the first key point coordinates are located as the target image region; it then determines the image to be recognized corresponding to that region, and from it the conversion parameter corresponding to the first key point coordinates.
In this method, the conversion parameter for a first key point is determined from the image region in which the key point lies on the target image, without recording a conversion parameter for every pixel of the target image; this saves the processing resources needed for recognition, reduces terminal consumption, and improves recognition efficiency.
In some embodiments, after step S12, the method further includes:
determining the image boundary of the image to be recognized according to the pixel coordinates of its pixels;
determining the pixel coordinates of the image boundary on the target image to obtain image region division coordinates;
dividing the target image, according to the division coordinates, into multiple image regions respectively corresponding to the multiple images to be recognized.
In some embodiments, the second terminal 22 judges from a pixel's coordinates whether it lies on the boundary of its image, thereby determining the boundary; it then looks up the boundary's pixel coordinates on the target image to obtain the division coordinates, and divides the target image into several regions, each with a corresponding image to be recognized.
In this method, the boundaries determined from pixel coordinates yield the division coordinates on the target image, which divide the target image into regions corresponding to the images to be recognized; the regions corresponding to each image are thus obtained conveniently, improving recognition efficiency.
In some embodiments, after step S12, the method further includes:
determining at least one pixel of the image to be recognized as a reference pixel; determining the reference pixel's coordinates on the image to be recognized to obtain pre-splicing reference pixel coordinates, and its coordinates on the target image to obtain post-splicing reference pixel coordinates; determining the coordinate conversion parameter from the two; and recording the correspondence between the image to be recognized and the conversion parameter.
In some embodiments, the second terminal 22 takes any one or more pixels of the image to be recognized, for example a corner pixel, as the reference pixel.
The second terminal 22 then determines the reference pixel's coordinates on the image to be recognized as the pre-splicing reference pixel coordinates, and its coordinates on the target image as the post-splicing reference pixel coordinates.
Finally, the conversion parameter is determined from the post-splicing and pre-splicing reference pixel coordinates, and the correspondence between the image to be recognized and the parameter is recorded.
In some embodiments, the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates is taken as the conversion parameter; or,
the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates is taken as the conversion parameter.
In some embodiments, step S122 includes:
when the conversion parameter is the difference of the post-splicing coordinates minus the pre-splicing coordinates, subtracting the parameter from the first key point coordinates to obtain the second key point coordinates; when the parameter is the difference of the pre-splicing coordinates minus the post-splicing coordinates, adding the parameter to the first key point coordinates to obtain the second key point coordinates.
For example, if a first key point's coordinates on the target image are (20, 20) and its conversion parameter is the difference (10, 0), subtracting (10, 0) from (20, 20) yields the second key point coordinates (10, 20), and the pixel at (10, 20) on the image to be recognized is taken as the second key point. Thus, using the conversion parameter, the image's second key point is obtained from the target image's first key point.
In some embodiments, step S12 includes:
scaling the multiple images to be recognized to obtain scaled images of the same image size, and splicing the multiple scaled images to obtain the target image.
In some embodiments, the second terminal 22 scales each image to be recognized to adjust its size, obtaining multiple images of the same size as the scaled images, and splices them into the target image.
In some embodiments, the second terminal 22 scales all of the images to be recognized, or only some of them. For example, if the size of an image A is 720 × 1280 pixels and the size of another image B is 540 × 960 pixels, image B is scaled to obtain a 720 × 1280 scaled image B', and image A is spliced with the scaled image B' to obtain a target image of 1440 × 1280 pixels.
In this method, scaling the images to be recognized to the same size lets the terminal splice same-sized images into the target image, reducing the resources consumed by the splicing process.
In some embodiments, step S11 includes:
receiving multiple video streams, which originate from a first account and a second account respectively;
extracting a first image to be recognized from the first account's video stream, and a second image to be recognized from the second account's video stream;
and, after the respective second key points of each image to be recognized are determined according to the first key points of the target image, the method further includes:
adding image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and to the second image to be recognized according to its second key points to obtain a second special-effect image;
playing the special-effect live video of the first account, which includes the first special-effect image, and the special-effect live video of the second account, which includes the second special-effect image.
In some embodiments, the second terminal 22 receives the video streams of the two accounts and extracts an image from each, obtaining the first image to be recognized and the second image to be recognized.
The two images are spliced into the target image; an image recognition instance is created and executed to input the target image into the image recognition model, which outputs the target image's first key points; and the second terminal 22 obtains the respective second key points of the two images from the first key points.
The second terminal 22 adds image special effects to the first image to be recognized according to its second key points, obtaining the above first special-effect image, and likewise to the second image to obtain the above second special-effect image.
Referring to FIG. 11, expression special effects are added to the faces in the images according to the second key points 65 of the first image 61 and the second image 62 that have face-contour features.
By repeating the above steps for consecutive frames of the video streams, the second terminal 22 obtains multiple consecutive special-effect images and displays them in sequence, that is, it plays a special-effect live video that includes the special-effect images.
In some embodiments, as shown in FIG. 15, a video live broadcast method is also provided, described by taking its application to the second terminal 22 in FIG. 2 as an example; it includes the following steps:
S151: obtain the live video stream of the first account and the live video stream of the second account;
S152: extract a first image to be recognized from the first account's stream and a second image to be recognized from the second account's stream;
S153: splice the first image to be recognized and the second image to be recognized to obtain a target image;
S154: input the target image into the image recognition model to obtain multiple first key points of the target image;
S155: determine the respective second key points of the two images according to the multiple first key points of the target image;
S156: add image special effects to the first image according to its second key points to obtain a first special-effect image, and to the second image according to its second key points to obtain a second special-effect image;
S157: play the special-effect live video of the first account, which includes the first special-effect image, and that of the second account, which includes the second special-effect image.
Since the implementation of each step has been described in detail in the preceding embodiments, it is not repeated here.
In the above video live broadcast method, the live video streams of the two accounts are obtained, the first and second images to be recognized are extracted from them and spliced into a target image, the target image is input into the image recognition model to obtain its first key points, and the second key points of each image are determined from the first key points. Only the target image needs to be input into the model to recognize multiple images and obtain their respective key points; there is no need to execute multiple image recognition instances and input the images into the model separately. This saves the terminal's processing resources for image recognition and solves the problem in the related art that image recognition severely consumes terminal processing resources.
Moreover, applying this image recognition method to the scenario of adding image special effects during video communication reduces the terminal's resource consumption when identifying key points for effects; with this consumption reduced, problems such as picture freezing and delay in video communication caused by insufficient processing resources are avoided.
In some embodiments, as shown in FIG. 16, a live broadcast system 1600 is also provided, including:
a first terminal 21 and a second terminal 22;
the first terminal 21 is configured to generate the live video stream of a first account and send it to the second terminal 22;
in some embodiments, the first terminal 21 sends the first account's live video stream to the second terminal 22 through the server 23;
the second terminal 22 is configured to generate the live video stream of a second account;
the second terminal 22 is further configured to extract a first image to be recognized from the first account's stream and a second image to be recognized from the second account's stream;
the second terminal 22 is further configured to input the spliced image into the image recognition model to obtain multiple first key points of the target image;
the second terminal 22 is further configured to determine the respective second key points of the two images according to the multiple first key points of the target image;
the second terminal 22 is further configured to add image special effects to the first image according to its second key points to obtain a first special-effect image, and to the second image according to its second key points to obtain a second special-effect image;
the second terminal 22 is further configured to play the special-effect live video of the first account, which includes the first special-effect image, and that of the second account, which includes the second special-effect image.
Since the implementation of the steps executed by the first terminal 21 and the second terminal 22 has been described in detail in the preceding embodiments, it is not repeated here.
To help those skilled in the art understand the embodiments of the present disclosure in depth, image processing within a single live broadcast flow is described as an example, as shown in FIG. 17, including the following steps:
S1701: obtain the video stream of the first account and the video stream of the second account;
S1702: extract an image from each stream, obtaining a first image to be recognized and a second image to be recognized;
S1703: scale the two images so that they have the same image size;
S1704: splice the two images to obtain a target image;
S1705: determine the respective reference pixels of the two images;
S1706: determine each reference pixel's pre-splicing coordinates on its own image and its post-splicing coordinates on the spliced image;
S1707: compute, for each image, the difference between the post-splicing and pre-splicing reference pixel coordinates, obtaining a first coordinate conversion parameter and a second coordinate conversion parameter;
S1708: establish the correspondence between the first image and the first parameter, and between the second image and the second parameter;
S1709: create and execute an image recognition instance, and input the target image into the image recognition model to obtain multiple first key points of the target image;
S1710: determine, from the image region in which each first key point lies in the target image, whether it corresponds to the first or the second image to be recognized;
S1711: determine the corresponding first or second conversion parameter from the image corresponding to the first key point;
S1712: subtract that parameter from the first key point coordinates to obtain the second key point coordinates on the corresponding image;
S1713: take the pixel at the second key point coordinates in the corresponding image as that image's second key point;
S1714: add image special effects to the two images according to their respective second key points, obtaining a first and a second special-effect image;
S1715: play the special-effect live video of the first account, which includes the first special-effect image, and that of the second account, which includes the second special-effect image.
In some embodiments, although the steps in the flowcharts of the present disclosure are displayed in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering restriction on their execution, and they may be executed in other orders. Moreover, at least some of the steps include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In some embodiments, as shown in FIG. 18, an image recognition apparatus 1800 is provided, including:
an image acquisition unit 1801 configured to acquire multiple images to be recognized;
an image splicing unit 1802 configured to splice the multiple images to be recognized to obtain a target image;
a key point recognition unit 1803 configured to input the target image into the image recognition model to obtain multiple first key points of the target image;
a key point determining unit 1804 configured to determine the respective second key points of each image to be recognized according to the multiple first key points of the target image.
In some embodiments, the pixel coordinates of a first key point on the target image are first key point coordinates, and the key point determining unit 1804 is configured to execute:
determining the coordinate conversion parameter corresponding to the first key point coordinates, the parameter being used to convert the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
converting the first key point coordinates into second key point coordinates according to the corresponding parameter;
taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
In some embodiments, the target image includes multiple image regions, each with a corresponding image to be recognized, and the key point determining unit 1804 is configured to execute:
determining, among the multiple image regions of the target image, the target image region in which the first key point coordinates are located;
determining the image to be recognized corresponding to that region as the image corresponding to the first key point coordinates.
In some embodiments, the apparatus further includes:
a dividing unit configured to determine the image boundary of the image to be recognized according to the pixel coordinates of its pixels; determine the boundary's pixel coordinates on the target image to obtain image region division coordinates; and divide the target image, according to them, into multiple image regions respectively corresponding to the multiple images to be recognized.
In some embodiments, the key point determining unit 1804 is configured to execute:
determining at least one pixel of the image to be recognized as a reference pixel;
determining its coordinates on the image to be recognized to obtain pre-splicing reference pixel coordinates, and on the target image to obtain post-splicing reference pixel coordinates;
determining the coordinate conversion parameter from the two.
In some embodiments, the key point determining unit 1804 is configured to execute:
taking the difference of the post-splicing coordinates minus the pre-splicing coordinates as the conversion parameter; or,
taking the difference of the pre-splicing coordinates minus the post-splicing coordinates as the conversion parameter.
In some embodiments, the key point determining unit 1804 is configured to execute:
when the parameter is the post-splicing coordinates minus the pre-splicing coordinates, subtracting the parameter from the first key point coordinates to obtain the second key point coordinates;
when the parameter is the pre-splicing coordinates minus the post-splicing coordinates, adding the parameter to the first key point coordinates to obtain the second key point coordinates.
In some embodiments, the image splicing unit 1802 is further configured to execute:
scaling at least one of the multiple images to be recognized, the scaling being used to make the image sizes of the multiple images to be recognized the same.
In some embodiments, as shown in FIG. 19, a video live broadcast apparatus 1900 is provided, including:
a video stream obtaining unit 1901 configured to obtain the live video streams of the first account and the second account;
an image acquisition unit 1902 configured to extract a first image to be recognized from the first account's stream and a second image to be recognized from the second account's stream;
an image splicing unit 1903 configured to splice the two images to obtain a target image;
a key point recognition unit 1904 configured to input the target image into the image recognition model to obtain multiple first key points of the target image;
a key point determining unit 1905 configured to determine the respective second key points of the two images according to the multiple first key points;
a special effect adding unit 1906 configured to add image special effects to the first image according to its second key points to obtain a first special-effect image, and to the second image according to its second key points to obtain a second special-effect image;
a special effect playing unit 1907 configured to play the special-effect live video of the first account, which includes the first special-effect image, and that of the second account, which includes the second special-effect image.
For limitations on the image recognition apparatus and video live broadcast apparatus, refer to the limitations on the image recognition and video live broadcast methods above, which are not repeated here. Each module of the above apparatuses can be implemented in whole or in part by software, hardware, or a combination thereof; the modules can be embedded in or independent of the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call them and execute the corresponding operations.
The image recognition apparatus and video live broadcast apparatus provided above can be used to execute the image recognition method and video live broadcast method provided in any of the above embodiments, and have the corresponding functions and beneficial effects.
An embodiment of the present disclosure shows a computer device including a processor;
a memory for storing instructions executable by the processor;
the processor is configured to execute the instructions to implement the following steps:
acquiring multiple images to be recognized;
splicing the multiple images to be recognized to obtain a target image;
inputting the target image into an image recognition model to obtain multiple first key points of the target image;
determining the respective second key points of each image to be recognized according to the multiple first key points of the target image.
In some embodiments, the pixel coordinates of a first key point on the target image are first key point coordinates, and the processor is configured to execute the instructions to implement the following steps:
determining the coordinate conversion parameter corresponding to the first key point coordinates, the parameter being used to convert the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
converting the first key point coordinates into second key point coordinates according to the corresponding parameter;
taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
In some embodiments, the processor is configured to execute the instructions to implement the following steps:
determining, among the multiple image regions of the target image, the target image region in which the first key point coordinates are located;
determining the image to be recognized corresponding to that region as the image corresponding to the first key point coordinates.
In some embodiments, the processor is configured to execute the instructions to implement the following steps:
determining the image boundary of the image to be recognized according to the pixel coordinates of its pixels;
determining the boundary's pixel coordinates on the target image to obtain image region division coordinates;
dividing the target image, according to them, into multiple image regions respectively corresponding to the multiple images to be recognized.
In some embodiments, the processor is configured to execute the instructions to implement the following steps:
determining at least one pixel of the image to be recognized as a reference pixel;
determining its coordinates on the image to be recognized to obtain pre-splicing reference pixel coordinates, and on the target image to obtain post-splicing reference pixel coordinates;
determining the coordinate conversion parameter from the two.
In some embodiments, the processor is configured to execute the instructions to implement the following steps:
taking the difference of the post-splicing coordinates minus the pre-splicing coordinates as the conversion parameter; or,
taking the difference of the pre-splicing coordinates minus the post-splicing coordinates as the conversion parameter.
In some embodiments, the processor is configured to execute the instructions to implement the following steps:
when the parameter is the post-splicing coordinates minus the pre-splicing coordinates, subtracting the parameter from the first key point coordinates to obtain the second key point coordinates;
when the parameter is the pre-splicing coordinates minus the post-splicing coordinates, adding the parameter to the first key point coordinates to obtain the second key point coordinates.
In some embodiments, the processor is configured to execute the instructions to implement the following steps:
scaling at least one of the multiple images to be recognized to obtain scaled images of the same image size;
splicing the multiple scaled images to obtain the target image.
An embodiment of the present disclosure shows another computer device, which includes a processor;
a memory for storing instructions executable by the processor;
the processor is configured to execute the instructions to implement the following steps:
obtaining the live video stream of the first account and the live video stream of the second account;
extracting a first image to be recognized from the first account's stream and a second image to be recognized from the second account's stream;
splicing the two images to obtain a target image;
inputting the target image into the image recognition model to obtain multiple third key points of the target image;
determining the respective fourth key points of the two images according to the multiple third key points of the target image;
adding image special effects to the first image according to its fourth key points to obtain a first special-effect image, and to the second image according to its fourth key points to obtain a second special-effect image;
playing the special-effect live video of the first account, which includes the first special-effect image, and that of the second account, which includes the second special-effect image.
FIG. 20 shows a computer device according to an embodiment of the present disclosure. The computer device is provided as a terminal, and its internal structure is shown in FIG. 20. The computer device includes a processor, a memory, a network interface, a display screen, and an input apparatus connected through a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running them. The network interface is used to communicate with external terminals through a network connection. When executed by the processor, the computer program implements an image recognition method and a video live broadcast method. The display screen is a liquid crystal display or an electronic ink display, and the input apparatus is a touch layer covering the display screen, a button, trackball, or touchpad on the device housing, or an external keyboard, touchpad, or mouse, among others.
Those skilled in the art can understand that the structure shown in FIG. 20 is only a block diagram of part of the structure related to the solution of the present disclosure and does not limit the computer devices to which the solution is applied; a computer device may include more or fewer components than shown, combine certain components, or have a different component arrangement.
The present disclosure also provides a computer program product including computer program code; in response to the code being run by a computer, the computer executes the above image recognition method and video live broadcast method.
Those of ordinary skill in the art can understand that all or part of the processes in the above method embodiments can be completed by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided in the present disclosure can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Other embodiments of the present disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. The present disclosure is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features are described; however, as long as a combination involves no contradiction, it should be considered within the scope of this specification.

Claims (29)

  1. An image recognition method, comprising:
    acquiring multiple images to be recognized;
    splicing the multiple images to be recognized to obtain a target image;
    inputting the target image into an image recognition model to obtain multiple first key points of the target image;
    determining, according to the multiple first key points of the target image, the respective second key points of each image to be recognized.
  2. The method according to claim 1, wherein the pixel coordinates of the first key point on the target image are first key point coordinates, and determining the respective second key points of each image to be recognized according to the multiple first key points of the target image comprises:
    determining a coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
    converting the first key point coordinates into second key point coordinates according to the coordinate conversion parameter corresponding to the first key point coordinates;
    taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
  3. The method according to claim 2, wherein the target image comprises multiple image regions, each having a corresponding image to be recognized, and determining the coordinate conversion parameter corresponding to the first key point coordinates comprises:
    determining, among the multiple image regions in the spliced image, the target image region in which the first key point coordinates are located;
    determining the coordinate conversion parameter corresponding to the first key point coordinates according to the image to be recognized that corresponds to the target image region.
  4. The method according to claim 3, further comprising:
    determining the image boundary of the image to be recognized according to the pixel coordinates of the pixels in the image to be recognized;
    determining the pixel coordinates of the image boundary of the image to be recognized on the target image to obtain image region division coordinates;
    dividing the target image, according to the image region division coordinates, into multiple image regions respectively corresponding to the multiple images to be recognized.
  5. The method according to claim 2, wherein determining the coordinate conversion parameter corresponding to the first key point coordinates comprises:
    determining at least one pixel in the image to be recognized as a reference pixel;
    determining the pixel coordinates of the reference pixel on the image to be recognized to obtain pre-splicing reference pixel coordinates, and determining the pixel coordinates of the reference pixel on the target image to obtain post-splicing reference pixel coordinates;
    determining the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates.
  6. The method according to claim 5, wherein determining the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates comprises:
    taking the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
    taking the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
  7. The method according to claim 6, wherein converting the first key point coordinates into the second key point coordinates according to the coordinate conversion parameter corresponding to the first key point coordinates comprises:
    when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates;
    when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
  8. The method according to claim 1, wherein splicing the multiple images to be recognized to obtain the target image comprises:
    scaling at least one of the multiple images to be recognized to obtain scaled images, the multiple scaled images having the same image size;
    splicing the multiple scaled images to obtain the target image.
  9. A video live broadcast method, comprising:
    obtaining the live video stream of a first account, and obtaining the live video stream of a second account;
    extracting a first image to be recognized from the live video stream of the first account, and extracting a second image to be recognized from the live video stream of the second account;
    splicing the first image to be recognized and the second image to be recognized to obtain a target image;
    inputting the target image into an image recognition model to obtain multiple first key points of the target image;
    determining, according to the multiple first key points of the target image, the respective second key points of the first image to be recognized and the second image to be recognized;
    adding image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and adding image special effects to the second image to be recognized according to its second key points to obtain a second special-effect image;
    playing the special-effect live video of the first account and the special-effect live video of the second account; the special-effect live video of the first account comprises the first special-effect image, and the special-effect live video of the second account comprises the second special-effect image.
  10. An image recognition apparatus, comprising:
    an image acquisition unit configured to acquire multiple images to be recognized;
    an image splicing unit configured to splice the multiple images to be recognized to obtain a target image;
    a key point recognition unit configured to input the target image into an image recognition model to obtain multiple first key points of the target image;
    a key point determining unit configured to determine, according to the multiple first key points of the target image, the respective second key points of each image to be recognized.
  11. The apparatus according to claim 10, wherein the pixel coordinates of the first key point on the target image are first key point coordinates, and the key point determining unit is configured to execute:
    determining a coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
    converting the first key point coordinates into second key point coordinates according to the coordinate conversion parameter corresponding to the first key point coordinates;
    taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
  12. The apparatus according to claim 11, wherein the target image comprises multiple image regions, each having a corresponding image to be recognized, and the key point determining unit is configured to execute:
    determining, among the multiple image regions in the spliced image, the target image region in which the first key point coordinates are located;
    determining the coordinate conversion parameter corresponding to the first key point coordinates according to the image to be recognized that corresponds to the target image region.
  13. The apparatus according to claim 12, further comprising:
    a dividing unit configured to determine the image boundary of the image to be recognized according to the pixel coordinates of the pixels in the image to be recognized; determine the pixel coordinates of the image boundary on the target image to obtain image region division coordinates; and divide the target image, according to the image region division coordinates, into multiple image regions respectively corresponding to the multiple images to be recognized.
  14. The apparatus according to claim 11, wherein the key point determining unit is configured to execute:
    determining at least one pixel in the image to be recognized as a reference pixel;
    determining the pixel coordinates of the reference pixel on the image to be recognized to obtain pre-splicing reference pixel coordinates, and determining the pixel coordinates of the reference pixel on the target image to obtain post-splicing reference pixel coordinates;
    determining the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates.
  15. The apparatus according to claim 14, wherein the key point determining unit is configured to execute:
    taking the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
    taking the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
  16. The apparatus according to claim 15, wherein the key point determining unit is configured to execute:
    when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates;
    when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
  17. The apparatus according to claim 10, wherein the image splicing unit is configured to execute:
    scaling at least one of the multiple images to be recognized to obtain scaled images, the multiple scaled images having the same image size;
    splicing the multiple scaled images to obtain the target image.
  18. A video live broadcast apparatus, comprising:
    a video stream obtaining unit configured to obtain the live video stream of a first account and the live video stream of a second account;
    an image acquisition unit configured to extract a first image to be recognized from the live video stream of the first account, and a second image to be recognized from the live video stream of the second account;
    an image splicing unit configured to splice the first image to be recognized and the second image to be recognized to obtain a target image;
    a key point recognition unit configured to input the target image into an image recognition model to obtain multiple third key points of the target image;
    a key point determining unit configured to determine, according to the third key points of the target image, the respective fourth key points of the first image to be recognized and the second image to be recognized;
    a special effect adding unit configured to add image special effects to the first image to be recognized according to its fourth key points to obtain a first special-effect image, and to the second image to be recognized according to its fourth key points to obtain a second special-effect image;
    a special effect playing unit configured to play the special-effect live video of the first account and the special-effect live video of the second account; the special-effect live video of the first account comprises the first special-effect image, and the special-effect live video of the second account comprises the second special-effect image.
  19. A computer device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the following steps:
    acquiring multiple images to be recognized;
    splicing the multiple images to be recognized to obtain a target image;
    inputting the target image into an image recognition model to obtain multiple first key points of the target image;
    determining, according to the multiple first key points of the target image, the respective second key points of each image to be recognized.
  20. The computer device according to claim 19, wherein the pixel coordinates of the first key point on the target image are first key point coordinates, and the processor is configured to execute the instructions to implement the following steps:
    determining a coordinate conversion parameter corresponding to the first key point coordinates, the coordinate conversion parameter being a parameter for converting the first key point coordinates into the coordinates that locate the second key point on the image to be recognized;
    converting the first key point coordinates into second key point coordinates according to the coordinate conversion parameter corresponding to the first key point coordinates;
    taking the pixel at the second key point coordinates in the image to be recognized as the second key point.
  21. The computer device according to claim 20, wherein the target image comprises multiple image regions, each having a corresponding image to be recognized, and the processor is configured to execute the instructions to implement the following steps:
    determining, among the multiple image regions in the target image, the target image region in which the first key point coordinates are located;
    determining the image to be recognized corresponding to the target image region as the image to be recognized corresponding to the first key point coordinates.
  22. The computer device according to claim 21, wherein the processor is configured to execute the instructions to implement the following steps:
    determining the image boundary of the image to be recognized according to the pixel coordinates of the pixels in the image to be recognized;
    determining the pixel coordinates of the image boundary on the target image to obtain image region division coordinates;
    dividing the target image, according to the image region division coordinates, into multiple image regions respectively corresponding to the multiple images to be recognized.
  23. The computer device according to claim 20, wherein the processor is configured to execute the instructions to implement the following steps:
    determining at least one pixel in the image to be recognized as a reference pixel;
    determining the pixel coordinates of the reference pixel on the image to be recognized to obtain pre-splicing reference pixel coordinates, and determining the pixel coordinates of the reference pixel on the target image to obtain post-splicing reference pixel coordinates;
    determining the coordinate conversion parameter based on the post-splicing reference pixel coordinates and the pre-splicing reference pixel coordinates.
  24. The computer device according to claim 23, wherein the processor is configured to execute the instructions to implement the following steps:
    taking the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates as the coordinate conversion parameter; or,
    taking the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates as the coordinate conversion parameter.
  25. The computer device according to claim 24, wherein the processor is configured to execute the instructions to implement the following steps:
    when the coordinate conversion parameter is the difference obtained by subtracting the pre-splicing reference pixel coordinates from the post-splicing reference pixel coordinates, subtracting the coordinate conversion parameter from the first key point coordinates to obtain the second key point coordinates;
    when the coordinate conversion parameter is the difference obtained by subtracting the post-splicing reference pixel coordinates from the pre-splicing reference pixel coordinates, adding the coordinate conversion parameter to the first key point coordinates to obtain the second key point coordinates.
  26. The computer device according to claim 19, wherein the processor is configured to execute the instructions to implement the following steps:
    scaling at least one of the multiple images to be recognized to obtain scaled images, the multiple scaled images having the same image size;
    splicing the multiple scaled images to obtain the target image.
  27. A computer device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the following steps:
    obtaining the live video stream of a first account, and obtaining the live video stream of a second account;
    extracting a first image to be recognized from the live video stream of the first account, and extracting a second image to be recognized from the live video stream of the second account;
    splicing the first image to be recognized and the second image to be recognized to obtain a target image;
    inputting the target image into an image recognition model to obtain multiple first key points of the target image;
    determining, according to the multiple first key points of the target image, the respective second key points of the first image to be recognized and the second image to be recognized;
    adding image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and adding image special effects to the second image to be recognized according to its second key points to obtain a second special-effect image;
    playing the special-effect live video of the first account and the special-effect live video of the second account; the special-effect live video of the first account comprises the first special-effect image, and the special-effect live video of the second account comprises the second special-effect image.
  28. A storage medium, wherein, in response to instructions in the storage medium being executed by the processor of a computer device, the computer device is enabled to execute the following steps:
    acquiring multiple images to be recognized;
    splicing the multiple images to be recognized to obtain a target image;
    inputting the target image into an image recognition model to obtain multiple first key points of the target image;
    determining, according to the multiple first key points of the target image, the respective second key points of each image to be recognized.
  29. A storage medium, wherein instructions in the storage medium are executed by the processor of a computer device, enabling the computer device to execute the following steps:
    obtaining the live video stream of a first account, and obtaining the live video stream of a second account;
    extracting a first image to be recognized from the live video stream of the first account, and extracting a second image to be recognized from the live video stream of the second account;
    splicing the first image to be recognized and the second image to be recognized to obtain a target image;
    inputting the target image into an image recognition model to obtain multiple first key points of the target image;
    determining, according to the multiple first key points of the target image, the respective second key points of the first image to be recognized and the second image to be recognized;
    adding image special effects to the first image to be recognized according to its second key points to obtain a first special-effect image, and adding image special effects to the second image to be recognized according to its second key points to obtain a second special-effect image;
    playing the special-effect live video of the first account and the special-effect live video of the second account; the special-effect live video of the first account comprises the first special-effect image, and the special-effect live video of the second account comprises the second special-effect image.
PCT/CN2021/073150 2020-01-21 2021-01-21 Image recognition method and apparatus WO2021147966A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/746,842 US20220279241A1 (en) 2020-01-21 2022-05-17 Method and device for recognizing images

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010070867.X 2020-01-21
CN202010070867.XA CN113225613B (zh) Image recognition and video live broadcast method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/746,842 Continuation US20220279241A1 (en) 2020-01-21 2022-05-17 Method and device for recognizing images

Publications (1)

Publication Number Publication Date
WO2021147966A1 true WO2021147966A1 (zh) 2021-07-29

Family

ID=76993169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/073150 WO2021147966A1 (zh) Image recognition method and apparatus

Country Status (3)

Country Link
US (1) US20220279241A1 (zh)
CN (1) CN113225613B (zh)
WO (1) WO2021147966A1 (zh)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106791710A (zh) * 2017-02-10 2017-05-31 北京地平线信息技术有限公司 目标检测方法、装置和电子设备
CN107343211A (zh) * 2016-08-19 2017-11-10 北京市商汤科技开发有限公司 视频图像处理方法、装置和终端设备
CN107770484A (zh) * 2016-08-19 2018-03-06 杭州海康威视数字技术股份有限公司 一种视频监控信息生成方法、装置及摄像机
US20180070075A1 (en) * 2008-08-08 2018-03-08 Avigilon Fortress Corporation Automatic calibration of ptz camera system
CN109068181A (zh) * 2018-07-27 2018-12-21 广州华多网络科技有限公司 基于视频直播的足球游戏交互方法、系统、终端及装置
CN109729379A (zh) * 2019-02-01 2019-05-07 广州虎牙信息科技有限公司 直播视频连麦的实现方法、装置、终端和存储介质
CN110188640A (zh) * 2019-05-20 2019-08-30 北京百度网讯科技有限公司 人脸识别方法、装置、服务器和计算机可读介质
CN111027526A (zh) * 2019-10-25 2020-04-17 深圳羚羊极速科技有限公司 一种提高车辆目标检测识别检测效率的方法
CN112101305A (zh) * 2020-05-12 2020-12-18 杭州宇泛智能科技有限公司 多路图像处理方法、装置及电子设备

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107895344B (zh) * 2017-10-31 2021-05-11 深圳市森国科科技股份有限公司 视频拼接装置及方法


Also Published As

Publication number Publication date
CN113225613A (zh) 2021-08-06
CN113225613B (zh) 2022-07-08
US20220279241A1 (en) 2022-09-01

Similar Documents

Publication Publication Date Title
US11373275B2 (en) Method for generating high-resolution picture, computer device, and storage medium
US9497416B2 (en) Virtual circular conferencing experience using unified communication technology
US20220233957A1 (en) Lag detection method and apparatus, device, and readable storage medium
US20220188357A1 (en) Video generating method and device
US11409794B2 (en) Image deformation control method and device and hardware device
CN111405301B (zh) 终端的录屏交互方法、装置、计算机设备及存储介质
CN113542875B (zh) 视频处理方法、装置、电子设备以及存储介质
WO2019000793A1 (zh) 直播中打码方法及装置、电子设备及存储介质
US11627281B2 (en) Method and apparatus for video frame interpolation, and device and storage medium
CN113852756B (zh) 图像获取方法、装置、设备和存储介质
CN112989112B (zh) 在线课堂内容采集方法及装置
CN110430356A (zh) 一种修图方法与电子设备
US9036921B2 (en) Face and expression aligned movies
WO2021057957A1 (zh) 视频通话方法、装置、计算机设备和存储介质
WO2021147966A1 (zh) 图像识别方法及装置
CN108320331B (zh) 一种生成用户场景的增强现实视频信息的方法与设备
CN111475677A (zh) 图像处理方法、装置、存储介质及电子设备
CN113918023B (zh) 屏保显示方法、装置及显示设备
JP2023539273A (ja) 対象の追加方式を決定するための方法、装置、電子機器及び媒体
CN108769525B (zh) 一种图像调整方法、装置、设备及存储介质
CN113992866B (zh) 视频制作方法及装置
WO2024045026A1 (zh) 一种显示方法、电子设备、显示设备、传屏器及介质
WO2022104800A1 (zh) 一种虚拟名片的发送方法、装置、系统及可读存储介质
CN113362224A (zh) 图像处理方法、装置、电子设备及可读存储介质
CN115484492A (zh) 界面时延的获取方法及装置

Legal Events

121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21744692; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21744692; Country of ref document: EP; Kind code of ref document: A1)