CN112307925B - Image detection method, image display method, related device and storage medium - Google Patents

Info

Publication number
CN112307925B
Authority
CN
China
Prior art keywords
image
resource
target
frame
special effect
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011151676.2A
Other languages
Chinese (zh)
Other versions
CN112307925A (en)
Inventor
蒋政胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202011151676.2A priority Critical patent/CN112307925B/en
Publication of CN112307925A publication Critical patent/CN112307925A/en
Application granted granted Critical
Publication of CN112307925B publication Critical patent/CN112307925B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Embodiments of the invention disclose an image detection method, an image display method, a related device, and a storage medium. The image detection method comprises: playing a fusion resource image of a target resource on a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object used for detection into a dynamic template image of the target resource; recording the screen while the fusion resource image is played to obtain a recorded video of the playing process; performing object recognition on multiple candidate image frames of the recorded video to obtain a recognition result for each candidate frame; selecting a key frame image from the candidate frames according to the recognition results; and performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image. The embodiments can enrich the special effect display logic of the dynamic template image of the target resource and improve user stickiness; they also allow the special effect display logic to be detected more reliably, which improves the accuracy of the detection result.

Description

Image detection method, image display method, related device and storage medium
Technical Field
The present invention relates to the field of internet technologies, in particular to the field of image processing, and more specifically to an image detection method, an image display method, an image detection device, an image display device, a terminal, and a computer storage medium.
Background
With the development of internet technology, more and more clients support a communication function through which any user can exchange information with other users. During such an exchange, the client may also allow a user to send one or more resources (such as virtual gifts or electronic red envelopes) to other users; after a triggering operation for sending a resource is detected, the client plays the template image of the target resource according to that template image's special effect display logic, which makes the interaction more engaging.
At present, after the template image of a resource is generated, its special effect display logic usually has to undergo anomaly detection first, and the template image is released only after the logic is confirmed to be free of abnormalities. This ensures that, once a triggering operation for sending the resource is detected in actual use, the template image can be played normally according to its special effect display logic. How to perform anomaly detection on the special effect display logic of a resource's template image has therefore become a research hotspot.
Disclosure of Invention
Embodiments of the invention provide an image detection method, an image display method, a related device, and a storage medium that can better detect a new special effect display logic and improve the accuracy of the detection result; the new special effect display logic also enriches the ways in which a dynamic template image can be displayed and improves user stickiness.
In one aspect, an embodiment of the present invention provides an image detection method, including:
playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;
recording a screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;
performing object recognition on the multi-frame candidate images of the recorded video to obtain a recognition result for each candidate frame, wherein the recognition result of any candidate image indicates whether that candidate image includes a target object associated with the reference object;
selecting a key frame image from the multi-frame candidate images according to the recognition result of each candidate frame, wherein the key frame image is a candidate image that includes the target object;
and performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
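The recognition, key frame selection, and anomaly detection steps above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: playback and screen recording are device-dependent and are assumed to have already produced `candidate_frames`, and the two callables `includes_target_object` and `includes_target_element` are hypothetical stand-ins for a real recognizer and a real image matcher.

```python
from typing import Callable, Sequence, TypeVar

Frame = TypeVar("Frame")


def display_logic_is_abnormal(
    candidate_frames: Sequence[Frame],
    includes_target_object: Callable[[Frame], bool],
    includes_target_element: Callable[[Frame], bool],
) -> bool:
    """Return True when the special effect display logic looks abnormal."""
    # Object recognition: one boolean recognition result per candidate frame.
    results = [includes_target_object(frame) for frame in candidate_frames]
    # Key frame selection: keep only candidates that include the target object.
    key_frames = [f for f, hit in zip(candidate_frames, results) if hit]
    if not key_frames:
        # Assumption of this sketch: no key frame at all counts as abnormal.
        return True
    # Assumption of this sketch: the earliest key frame is the one inspected.
    return not includes_target_element(key_frames[0])
```

In this sketch the frames can be any type the two callables understand, so the same control flow can be exercised with simple strings before a real recognizer is plugged in.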
In another aspect, an embodiment of the present invention provides an image detection apparatus, including:
the playing unit is used for playing a fusion resource image of the target resource in the terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;
the screen recording unit is used for recording the screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;
the processing unit is used for performing object recognition on the multi-frame candidate images of the recorded video to obtain a recognition result for each candidate frame, wherein the recognition result of any candidate image indicates whether that candidate image includes a target object associated with the reference object;
the processing unit is further configured to select a key frame image from the multiple frame candidate images according to the identification result of each frame candidate image, where the key frame image is a candidate image including the target object;
the processing unit is further configured to perform anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
In one embodiment, the target resource is sent by a first test user to a second test user in a communication scenario;
the reference object comprises at least one of: the user portrait of the first test user and the user portrait of the second test user; the user portrait is a face image or a gesture image;
when the user portrait is the face image, the target object is a target number of faces; when the user portrait is the gesture image, the target object is a target number of gestures; wherein the target number is the number of user portraits included in the reference object.
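As a small illustration of how the target number relates to the reference object, the sketch below derives it from the list of fused user portraits and compares it with a detector's count. The function names are hypothetical, and treating "includes the target object" as an exact count match is an assumption of this sketch, not something the text specifies.

```python
from typing import Sequence


def target_number(reference_object: Sequence[str]) -> int:
    """Number of user portraits contained in the reference object."""
    return len(reference_object)


def frame_includes_target_object(
    detected_count: int, reference_object: Sequence[str]
) -> bool:
    # Assumption: the candidate image includes the target object when the
    # detector reports exactly the target number of faces (or gestures).
    return detected_count == target_number(reference_object)
```

For example, when both the first and the second test user's portraits are fused, the target number is 2, and a frame in which only one face is detected would not be treated as including the target object.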
In still another embodiment, when the processing unit is configured to perform object recognition on multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image, the processing unit may be specifically configured to:
if the user portrait is the face image, carrying out face recognition on the multi-frame candidate images of the recorded video to obtain the recognition result of each frame of candidate image;
and if the user portrait is a gesture image, carrying out gesture recognition on the multi-frame candidate images of the recorded video to obtain recognition results of the candidate images of each frame.
In still another embodiment, before the processing unit is configured to perform object recognition on the multiple frames of candidate images of the recorded video, the processing unit is further configured to:
splitting the recorded video into frames to obtain multi-frame initial images;
and determining multi-frame candidate images of the recorded video from the multi-frame initial images.
In still another embodiment, the processing unit, when configured to perform frame processing on the recorded video to obtain a multi-frame initial image, may be specifically configured to:
performing picture segmentation on the recorded video to obtain multi-frame recorded pictures of the fusion resource images;
determining the multi-frame recorded picture as a multi-frame initial image, or respectively cutting out an image of a target area from each frame recorded picture as a multi-frame initial image; and the region center point of the target region coincides with the display position of the fusion resource image in the terminal screen.
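Cutting a target area out of each recorded picture can be sketched as below. This is a minimal example under stated assumptions: a picture is modelled as a list of pixel rows, the display position of the fusion resource image is given as a hypothetical `(center_row, center_col)` pair, and the region is clamped to the picture bounds.

```python
from typing import List, Sequence


def crop_target_region(
    picture: Sequence[Sequence[int]],
    center_row: int,
    center_col: int,
    height: int,
    width: int,
) -> List[List[int]]:
    """Cut out a height x width region centred on the display position."""
    # Place the region so that its centre falls on the display position,
    # as closely as the pixel grid allows, without leaving the picture.
    top = max(0, center_row - height // 2)
    left = max(0, center_col - width // 2)
    bottom = min(len(picture), top + height)
    right = min(len(picture[0]), left + width)
    return [list(row[left:right]) for row in picture[top:bottom]]
```

Cropping to the target area keeps only the part of the screen where the special effect is displayed, so later recognition works on smaller images.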
In another embodiment, each frame of initial image has an image sequence number, and the frames of initial images are sequentially arranged according to the image sequence numbers; correspondingly, when the processing unit is used for determining the multi-frame candidate image of the recorded video from the multi-frame initial image, the processing unit can be specifically used for:
taking the multi-frame initial image as a multi-frame candidate image of the recorded video; or,
and sampling the sequentially arranged multi-frame initial images according to the sampling frequency to obtain multi-frame candidate images of the recorded video.
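The sampling alternative can be sketched as follows. The sketch assumes that "sampling according to the sampling frequency" means keeping every `step`-th initial image in sequence order; the `(sequence_number, image)` tuple layout and the `step` parameter are illustrative choices, not taken from the text.

```python
from typing import List, Sequence, Tuple, TypeVar

Image = TypeVar("Image")


def sample_candidates(
    initial_images: Sequence[Tuple[int, Image]], step: int
) -> List[Tuple[int, Image]]:
    """Keep every step-th initial image, ordered by image sequence number."""
    if step < 1:
        raise ValueError("step must be at least 1")
    # Arrange the initial images by image sequence number first.
    ordered = sorted(initial_images, key=lambda item: item[0])
    # step == 1 keeps every initial image as a candidate image.
    return ordered[::step]
```

A larger step reduces the number of frames passed to object recognition, at the cost of possibly skipping short-lived frames.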
In still another embodiment, the processing unit, when configured to select a key frame image from the multiple frame candidate images according to the identification result of each frame candidate image, may be specifically configured to:
screening candidate images comprising the target object from the multi-frame candidate images according to the identification result of each frame of candidate images;
and selecting a key frame image from the screened candidate images of each frame.
In still another embodiment, the processing unit, when configured to select a key frame image from the screened candidate images, may be specifically configured to:
arranging the screened candidate images in ascending order of their image sequence numbers to obtain the arrangement position of each screened candidate image;
and selecting, from the arranged candidate images, the candidate image whose arrangement position is the target position as the key frame image.
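The screening and position-based selection described above can be sketched as below. The recognition results are modelled as a hypothetical mapping from image sequence number to a boolean, and the target position is a zero-based index whose default of 0 (the earliest screened frame) is an assumption of this sketch.

```python
from typing import Dict, Optional


def select_key_frame(
    recognition_results: Dict[int, bool], target_position: int = 0
) -> Optional[int]:
    """Return the sequence number of the key frame, or None if there is none."""
    # Screen out the candidate images that include the target object and
    # arrange them in ascending order of image sequence number.
    screened = sorted(seq for seq, hit in recognition_results.items() if hit)
    if target_position < 0 or target_position >= len(screened):
        return None
    # The key frame is the screened candidate at the target position.
    return screened[target_position]
```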
In still another embodiment, the processing unit, when configured to perform anomaly detection on the special effect exhibition logic of the dynamic template image according to the keyframe image, may be specifically configured to:
performing static image matching between the key frame image and a target image element to obtain a matching result;
if the matching result indicates that the key frame image comprises the target image element, determining that the special effect display logic of the dynamic template image is not abnormal;
and if the matching result indicates that the key frame image does not comprise the target image element, determining that the special effect display logic of the dynamic template image is abnormal.
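The matching decision above can be illustrated with a deliberately simple stand-in. The sketch below uses exact sub-image search over grayscale values stored as nested lists; a practical system would more likely use a tolerant template-matching routine, so the exact-equality comparison here is an assumption made only to keep the example self-contained.

```python
from typing import Sequence

Grid = Sequence[Sequence[int]]


def key_frame_contains_element(key_frame: Grid, element: Grid) -> bool:
    """Return True if element appears somewhere inside key_frame."""
    frame_h, frame_w = len(key_frame), len(key_frame[0])
    elem_h, elem_w = len(element), len(element[0])
    # Slide the element over every position where it fits in the key frame.
    for top in range(frame_h - elem_h + 1):
        for left in range(frame_w - elem_w + 1):
            if all(
                list(key_frame[top + r][left:left + elem_w]) == list(element[r])
                for r in range(elem_h)
            ):
                return True
    return False


def logic_is_abnormal(key_frame: Grid, element: Grid) -> bool:
    # No abnormality when the key frame includes the target image element.
    return not key_frame_contains_element(key_frame, element)
```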
In yet another embodiment, the special effect presentation logic of the dynamic template image has an abnormality including at least one of: an abnormality exists in the process of fusing the reference object to the dynamic template image, and an abnormality exists in the playing process of the fused resource image; accordingly, the processing unit may be further configured to:
if the special effect display logic of the dynamic template image is abnormal, outputting alarm information to prompt a technician to analyze the abnormal condition of the special effect display logic of the dynamic template image;
and if the special effect display logic of the dynamic template image is not abnormal, releasing the dynamic template image.
In still another embodiment, the fused resource image is played in the terminal screen when no abnormality occurs in the special effect shielding function and the special effect shielding function is in a closed state;
the special effect shielding function is checked for abnormality as follows:
starting the special effect shielding function, and playing a test image of a target resource in a terminal screen, wherein the test image comprises the fusion resource image or the dynamic template image;
recording a screen during the playing process of the test image to obtain a screen recording video about the playing process of the test image; detecting whether the multi-frame image of the screen recording video comprises the target object or not;
if so, determining that the special effect shielding function is abnormal; and if not, determining that the special effect shielding function is not abnormal.
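The shielding check and the playback precondition can be sketched as follows. The callable `includes_target_object` is a hypothetical recognizer, and the frames are assumed to come from a screen recording made while the special effect shielding function was turned on.

```python
from typing import Callable, Iterable, TypeVar

Frame = TypeVar("Frame")


def shielding_is_abnormal(
    frames_with_shielding_on: Iterable[Frame],
    includes_target_object: Callable[[Frame], bool],
) -> bool:
    # With shielding on, no special effect should be visible; any frame that
    # still shows the target object means the shielding function failed.
    return any(includes_target_object(f) for f in frames_with_shielding_on)


def may_play_fusion_image(shielding_abnormal: bool, shielding_enabled: bool) -> bool:
    # The fusion resource image is played only when the shielding function
    # is free of abnormalities and is currently in the closed state.
    return not shielding_abnormal and not shielding_enabled
```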
In still another aspect, an embodiment of the present invention provides a terminal, where the terminal includes an input device and an output device, and the terminal further includes:
a processor adapted to implement one or more instructions; and,
a computer storage medium storing one or more first instructions adapted to be loaded by the processor and to perform the steps of:
playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;
recording a screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;
performing object recognition on the multi-frame candidate images of the recorded video to obtain a recognition result for each candidate frame, wherein the recognition result of any candidate image indicates whether that candidate image includes a target object associated with the reference object;
selecting a key frame image from the multi-frame candidate images according to the identification result of each frame of candidate image, wherein the key frame image refers to the candidate image comprising the target object;
and performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
In yet another aspect, embodiments of the present invention provide a computer storage medium storing one or more first instructions adapted to be loaded by a processor and to perform the steps of:
playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;
recording a screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;
performing object recognition on the multi-frame candidate images of the recorded video to obtain a recognition result for each candidate frame, wherein the recognition result of any candidate image indicates whether that candidate image includes a target object associated with the reference object;
selecting a key frame image from the multi-frame candidate images according to the identification result of each frame of candidate image, wherein the key frame image refers to the candidate image comprising the target object;
and performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
For the dynamic template image of a target resource, the embodiments of the invention provide a special effect display logic that fuses other data content into the dynamic template image and plays the resulting special effect image, which effectively enriches the ways in which the dynamic template image can be displayed. To perform anomaly detection on this special effect display logic, the fusion resource image obtained by fusing the reference object into the dynamic template image of the target resource is played first, and the screen is recorded during playback to obtain a recorded video. Object recognition is then performed on multiple candidate image frames of the recorded video to determine whether each candidate frame includes the target object. Because a target object associated with the reference object is introduced, recognition only needs to decide whether a candidate image includes the target object, not whether it includes the reference object itself; this reduces recognition complexity and saves processing resources. A key frame image that includes the target object is then selected from the candidate frames. Since the key frame image includes the target object, it has high reference value, so performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image improves both detection efficiency and the accuracy of the detection result.
In still another aspect, an embodiment of the present invention provides an image display method, including:
responding to a triggering operation of resource transmission from a first terminal user to a second terminal user in a communication scene, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method;
selecting a resource identifier of the target resource in the resource selection interface;
if the confirmation operation aiming at the resource identification is detected, playing a target special effect image of a target resource in a terminal screen; the target special effect image is obtained by fusing a user identification of a target terminal user into a dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first end user and the second end user.
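The condition under which a resource identifier appears in the resource selection interface can be sketched as below. The mapping from resource identifier to detection outcome is a hypothetical data structure; the text only requires that an identifier is shown once the special effect display logic of its dynamic template image has been determined to be free of abnormalities.

```python
from typing import Dict, List


def selectable_resource_ids(detection_abnormal: Dict[str, bool]) -> List[str]:
    """Return the resource identifiers to show in the selection interface."""
    # Keep only resources whose dynamic template image passed anomaly
    # detection; insertion order of the mapping is preserved.
    return [rid for rid, abnormal in detection_abnormal.items() if not abnormal]
```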
In still another aspect, an embodiment of the present invention provides an image display apparatus, including:
the output unit is used for responding to the triggering operation of resource transmission from the first terminal user to the second terminal user in the communication scene and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method;
the processing unit is used for selecting the resource identification of the target resource in the resource selection interface;
the output unit is further configured to play a target special effect image of a target resource in a terminal screen if a confirmation operation for the resource identifier is detected; the target special effect image is obtained by fusing a user identification of a target terminal user into a dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first end user and the second end user.
In still another aspect, an embodiment of the present invention provides a terminal, where the terminal includes an input device and an output device, and the terminal further includes:
a processor adapted to implement one or more instructions; and,
a computer storage medium storing one or more second instructions adapted to be loaded by the processor and to perform the steps of:
responding to a triggering operation of resource transmission from a first terminal user to a second terminal user in a communication scene, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method;
selecting a resource identifier of the target resource in the resource selection interface;
if the confirmation operation aiming at the resource identification is detected, playing a target special effect image of a target resource in a terminal screen; the target special effect image is obtained by fusing a user identification of a target terminal user into a dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first end user and the second end user.
In yet another aspect, embodiments of the present invention provide a computer storage medium storing one or more second instructions adapted to be loaded by a processor and to perform the steps of:
responding to a triggering operation of resource transmission from a first terminal user to a second terminal user in a communication scene, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method;
selecting a resource identifier of the target resource in the resource selection interface;
if the confirmation operation aiming at the resource identification is detected, playing a target special effect image of a target resource in a terminal screen; the target special effect image is obtained by fusing a user identification of a target terminal user into a dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first end user and the second end user.
The embodiments of the invention can output a resource selection interface in response to a triggering operation by which a first terminal user sends a resource to a second terminal user in a communication scenario. The resource selection interface includes a resource identifier of a target resource, and the identifier is displayed in the interface on condition that the special effect display logic of the target resource's dynamic template image has been determined to be free of abnormalities; this effectively ensures the usability of the dynamic template image and reduces the probability that it is displayed abnormally in actual use. After the user confirms selection of the target resource's resource identifier in the resource selection interface, the terminal can fuse the user identification of the first terminal user and/or the second terminal user into the dynamic template image of the target resource and play the resulting target special effect image on the terminal screen. This effectively enriches the ways in which the dynamic template image can be displayed, makes the image display more engaging, and improves user stickiness.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1a is a system architecture diagram of a communication system according to an embodiment of the present invention;
FIG. 1b is a schematic diagram of an image detection scheme according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of an image detection method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of generating a fused resource image according to an embodiment of the present invention;
FIG. 4 is a flowchart of an image detection method according to another embodiment of the present invention;
FIG. 5a is a schematic diagram of an initial image obtained by framing according to an embodiment of the present invention;
FIG. 5b is a schematic diagram of object recognition provided by an embodiment of the present invention;
FIG. 5c is a schematic diagram of another object recognition provided by an embodiment of the present invention;
FIG. 5d is a schematic diagram illustrating a comparison of adjacent initial images according to an embodiment of the present invention;
FIG. 5e is a schematic diagram of a candidate image according to an embodiment of the present application;
FIG. 5f is a schematic diagram of a special effect marker provided by an embodiment of the present application;
FIG. 6 is a schematic flow chart of an image display method according to an embodiment of the present application;
FIG. 7a is a schematic diagram of a resource selection interface according to an embodiment of the present application;
FIG. 7b is a schematic diagram of a selected resource identifier according to an embodiment of the present application;
FIG. 7c is a schematic diagram of playing a target special effect image according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an image detection device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an image display device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present application. All other embodiments that a person of ordinary skill in the art can obtain from these embodiments without inventive effort fall within the scope of protection of the application.
An embodiment of the invention involves a communication system. Referring to FIG. 1a, the communication system may comprise at least two terminals 11 and a server 12. Communication between any of the terminals 11 and the server 12 may be achieved by any communication means, such as mobile communication based on the third generation partnership project (3rd Generation Partnership Project, 3GPP), the fourth generation mobile communication technology (4G), the fifth generation mobile communication technology (5G), long term evolution (Long Term Evolution, LTE), or worldwide interoperability for microwave access (Worldwide Interoperability for Microwave Access, WiMAX), or computer network communication based on the TCP/IP protocol suite, the user datagram protocol (User Datagram Protocol, UDP), and the like.
Here, a terminal 11 is any intelligent device with a communication function and a human-machine interaction function, including but not limited to: smart phones, tablet computers, smart televisions, wearable devices, laptop computers, desktop computers, and the like. A communication APP can be installed and run in any terminal 11; a communication APP here is an APP (application) with a communication function, such as a social APP (e.g., an instant messaging APP or a content interaction APP), a live APP, and so on. Optionally, any terminal 11 may also install and run various other APPs, such as a video playing APP, a music playing APP, etc. The server 12 is a service device that can establish a communication connection between any two terminals 11 and provide each terminal 11 with various services such as an information interaction service and a resource processing service; it includes but is not limited to: a data processing server, an application server, a web server, etc.
The above-mentioned resources refer to anything that can be transferred between users, including but not limited to: virtual resources, object resources, electronic resources, etc. A virtual resource is a resource that circulates on a network and has no physical entity, such as a virtual gift in a live scene or a game prop in a game scene; an object resource is a resource with a physical entity, such as a movable automobile or a singing microphone; an electronic resource is electronic money that exists in electronic form, such as an electronic red packet in a social scene or funds transferred between users. Any resource can have a corresponding dynamic template image, i.e., a template image with a dynamic effect generated by switching among multiple frames of static template images at a preset frequency.
It should be understood that fig. 1a merely illustrates the system architecture of the communication system by way of example and does not limit it. For example, the number of terminals in fig. 1a is 4; in other embodiments, however, the number of terminals is not limited to 4 and may be 2, 3, 10, etc. As another example, the server 12 in fig. 1a is a stand-alone service device; in other embodiments, however, the server 12 may be deployed as a cluster device composed of a plurality of service devices, such as a cluster composed of at least a communication server for providing information interaction services and a resource server for providing resource processing services, and so on.
In a specific implementation, the above-mentioned communication system operates as follows: when a user has a communication requirement, the user can send a communication request to the server through the communication APP in the terminal that the user uses, so as to request the server to establish a communication connection between this terminal and the terminals used by at least one other user. After the server successfully establishes the communication connection, the user can interact with each other user through the communication function provided by the communication APP. For example, if the communication APP is a social APP and the other users are all contacts of the user, the user may hold a chat session with each other user through the communication function provided by the social APP; for another example, if the communication APP is a live APP and the other users include a host user (i.e., the user who performs the live broadcast) and other audience users (i.e., users who watch the live broadcast), the user may watch the live picture of the host user, view comment information sent by other audience users, and so on through the communication function provided by the live APP. During this information interaction, if the terminal detects that the user performs a triggering operation of sending a certain resource to a target user, the terminal sends the resource to the target user; it can also acquire the dynamic template image of the resource and then perform special effect playing in its terminal screen according to the dynamic template image.
Optionally, any other terminal (such as a terminal used by a target user who receives a target resource, a terminal used by the remaining users except for any user and the target user) may also acquire a dynamic template image of the resource, and perform special effect playing in its own terminal screen according to the dynamic template image, and so on.
In order to enrich the special effect display modes of the dynamic template images of the resources, the embodiment of the invention provides a brand-new special effect display logic. Specifically, the special effect display logic refers to: logic for fusing other data contents in the dynamic template image to play the fused special effect image; other data content mentioned herein may include, but is not limited to: user identification of a resource sender, user identification of a resource receiver, user identification of other users in the same communication scene as the resource sender and the resource receiver, data content preset according to service requirements, and the like. In addition, the embodiment of the invention also provides an image detection scheme for better detecting the abnormality of the special effect display logic of the dynamic template image. In a specific implementation, the image detection scheme may be performed by the test terminal or a communication APP running in the test terminal; for convenience of explanation, the following description will take the test terminal to execute the image detection scheme as an example.
Referring to fig. 1b, the general principle of the image detection scheme is as follows: firstly, a reference object for detection can be fused into a dynamic template image to be detected, and a fused special effect image is obtained. And secondly, the special effect image can be played and the whole playing process is recorded and collected. Then, the video acquired by the screen recording can be subjected to framing processing, and the object recognition is carried out on each frame of image obtained by framing by adopting a corresponding algorithm so as to find out the key frame image fused with the reference object. Finally, the key frame image can be subjected to image recognition, so that whether the dynamic template image is normally fused with a reference object or not and whether the fused special effect image is normally displayed and played or not are judged, and further, the abnormal detection of special effect display logic of the dynamic template image is realized.
Practice shows that the image detection scheme provided by the embodiment of the invention has at least the following beneficial effects: (1) the whole detection process does not need to check a background log system, so that the detection threshold can be effectively reduced; (2) the test terminal automatically detects the abnormality of the special effect display logic of the dynamic template image through a series of operations such as screen recording, framing, object recognition, image recognition and the like, and the manual detection of a user is not needed, so that the labor cost can be effectively saved; (3) the part related to image recognition does not need to collect a large amount of materials, only needs to correspondingly recognize the key frame image, can reduce the detection time to a certain extent, and effectively improves the detection efficiency; (4) the accuracy of the detection result can be effectively improved by performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image with higher reference degree.
Based on the related description of the image detection scheme, the embodiment of the invention provides an image detection method; the image detection method can be executed by the test terminal or a communication APP running in the test terminal; for convenience of explanation, the test terminal will be used to execute the image detection method. Referring to fig. 2, the image detection method may include the following steps S201 to S205:
S201, playing the fusion resource image of the target resource in the terminal screen.
In the embodiment of the invention, the target resource can be the fusion resource corresponding to any dynamic template image to be detected. A fusion resource is a resource whose dynamic template image has the special effect display logic mentioned above; the resource here may be an object resource, an electronic resource, a virtual resource, and so forth. Specifically, when the resource is an object resource, the target resource may be any fusion object resource; when the resource is an electronic resource, the target resource may be any fusion electronic resource; when the resource is a virtual resource, the target resource may be any fusion virtual resource, such as a fusion gift, a fusion prop, or the like. A fusion gift is a virtual gift resource whose dynamic template image has the special effect display logic; that is, after an audience user in a live scene gives a virtual gift to a host user, if the user identification of the audience user and/or the user identification of the host user can be fused into the dynamic template image of the virtual gift so that the fused special effect image is played, the virtual gift may be referred to as a fusion gift. The user identification referred to herein may comprise at least one of: a user portrait and a user name. A user portrait is a picture that the user uses during communication to identify himself or herself, such as the user's avatar; a user name is a name that the user uses during communication to identify himself or herself, such as the user's nickname or remark name.
Similarly, the fusion prop refers to a virtual game prop corresponding to a dynamic template image having the special effect display logic.
In a specific implementation, since the detection of the special effect exhibition logic of the dynamic template image for the target resource mainly involves two aspects: on one hand, whether the dynamic template image can normally fuse other data contents or not is detected, and on the other hand, whether the fused special effect image can normally display and play is detected; and, it is contemplated that there are currently more sophisticated object recognition techniques, such as face recognition techniques, gesture recognition techniques, text recognition techniques, and so forth. Therefore, the embodiment of the invention can introduce corresponding reference objects according to the actually adopted object recognition technology, such as a face image corresponding to the face recognition technology, a gesture image corresponding to the gesture recognition technology, text content corresponding to the text recognition technology and the like; therefore, the object recognition technology can be conveniently used for carrying out object recognition based on the reference object, so that whether the dynamic template image can normally fuse other data contents or not is detected, and whether the fused special effect image can normally display and play is detected.
Based on this, the test terminal can acquire a reference object for detection. In one embodiment, the reference object may be a default object preset according to the service requirement, such as a default face image, a default gesture image, or default text content. In another embodiment, it is considered that in practical application the dynamic template image of a target resource is used, and its special effect display logic applied, only after a user in a communication scene is detected sending the target resource to another user; therefore, when performing anomaly detection on the special effect display logic of the dynamic template image, the detection environment can be set in a real, normal scene so that the detection process fits the real application process more closely. In this case, the target resource may be sent by a first test user to a second test user in the communication scene; accordingly, the reference object may include at least one of: the user portrait of the first test user and the user portrait of the second test user. It should be noted that the user portrait here may be a face image or a gesture image; the face image used by the first test user or the second test user may be an image generated by capturing the face of the corresponding test user, or may be any image containing a face that the corresponding test user downloaded from the internet.
It should be noted that, in addition to the user portraits, the reference object mentioned above may also include the user name of the first test user and/or the user name of the second test user; alternatively, the reference object may include no user portrait and include only the user name of the first test user and/or the user name of the second test user. For convenience of explanation, the following description assumes that the reference object includes at least the user portrait of the first test user and the user portrait of the second test user, and that both user portraits are face images. Also, the communication scenes mentioned above may include, but are not limited to: a social session scene built on a social APP, a live scene built on a live APP, and so on. Specifically, when the communication scene is a social session scene, the first test user may be any user in the target session, and the second test user may be any user in the target session other than the first test user; the target session mentioned here may be a single chat session composed of two users or a group session composed of at least three users, which is not limited here. When the communication scene is a live scene, the first test user may be any audience user in a live room (a virtual room in which live pictures can be viewed), the second test user may be the host user in the live room, and so on.
After the reference object is acquired, the test terminal can fuse the reference object for detection into the dynamic template image of the target resource to obtain a fusion resource image of the target resource. The dynamic template image is a template image with a dynamic effect, generated by switching among multiple frames of static template images at a preset frequency. When fusing the reference object into the dynamic template image of the target resource, the test terminal can directly fuse the reference object into some or all of the static template image frames of the dynamic template image to obtain the fusion resource image of the target resource. Alternatively, to facilitate the subsequent object recognition and improve its accuracy, the test terminal can first standardize the user portrait (such as a face image) included in the reference object, and then fuse the standardized reference object into some or all of the static template image frames of the dynamic template image to obtain the fusion resource image of the target resource. Taking the case of fusing the reference object into some of the static template images as an example, a schematic diagram of the fusion resource image is shown in fig. 3. After the fusion resource image is obtained, the test terminal can play the fusion resource image of the target resource in the terminal screen.
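By way of non-limiting illustration, the fusion of a reference object into some of the static template frames could be sketched as follows. This is a minimal sketch under stated assumptions: frames and the reference object are modelled as plain 2D lists of pixel values, and the function name `fuse_reference` and its parameters (`top`, `left`, `frame_indices`) are hypothetical names introduced only for this example; the embodiment does not prescribe a particular fusion routine.

```python
from copy import deepcopy


def fuse_reference(template_frames, reference, top, left, frame_indices=None):
    """Paste `reference` into selected static template frames.

    Hypothetical helper: frames and the reference are 2D lists of pixel
    values. `frame_indices` selects the frames that receive the reference;
    None means every frame. The input frames are left unmodified.
    """
    fused = deepcopy(template_frames)
    targets = range(len(fused)) if frame_indices is None else frame_indices
    for i in targets:
        for r, row in enumerate(reference):
            for c, pixel in enumerate(row):
                fused[i][top + r][left + c] = pixel
    return fused


# Three blank 4x4 static template frames (assumed toy data).
template = [[[0] * 4 for _ in range(4)] for _ in range(3)]
# A 2x2 stand-in for a standardized user portrait.
portrait = [[9, 9], [9, 9]]

# Fuse the portrait into frames 1 and 2 only ("part of the frames").
fusion_resource_image = fuse_reference(
    template, portrait, top=1, left=1, frame_indices=[1, 2]
)
```

Here frame 0 stays identical to the template while frames 1 and 2 carry the portrait, which mirrors the case of fusing the reference object into only part of the static template images.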
S202, recording the screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image.
In a specific implementation, the test terminal can execute the screen recording operation immediately after detecting the play event of the fusion resource image. Further, if the target resource is sent to the second test user by the first test user in the communication scene, the test terminal needs a certain amount of time to acquire the fusion resource image of the target resource after the first test user sends the target resource; therefore, the test terminal can instead execute the screen recording operation immediately after detecting that the first test user performs the triggering operation of sending the target resource to the second test user. This ensures that the whole playing process of the fusion resource image is recorded, that every frame of the fusion resource image is captured, and thus that the recorded video is complete. When executing the screen recording operation, the test terminal can call third-party screen recording software to record the playing process of the fusion resource image and obtain the recorded video. Further, considering that third-party screen recording software may fail during recording, the test terminal can, to improve the stability of the screen recording, call an underlying screen recording command (such as the adb shell screenrecord command) to record the playing process of the fusion resource image and obtain the recorded video.
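As one possible illustration of calling the underlying screen recording command, the sketch below builds an `adb shell screenrecord` invocation and starts it without blocking, so that recording can begin as soon as the send operation is triggered. The helper names, the output path `/sdcard/fusion_play.mp4` and the 15-second limit are assumptions made for this example only; `--time-limit` and `adb -s <serial>` are existing adb options.

```python
import subprocess


def build_screenrecord_cmd(device_path, time_limit_s=None, serial=None):
    """Build the adb command line for recording the device screen.

    Hypothetical helper. `device_path` is where the video is stored on the
    device; `time_limit_s` maps to screenrecord's --time-limit option;
    `serial` selects a specific device via `adb -s`.
    """
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]
    cmd += ["shell", "screenrecord"]
    if time_limit_s is not None:
        cmd += ["--time-limit", str(time_limit_s)]
    cmd.append(device_path)
    return cmd


def start_recording(device_path, time_limit_s=None, serial=None):
    """Start recording in the background and return the process handle."""
    return subprocess.Popen(build_screenrecord_cmd(device_path, time_limit_s, serial))


# Command that would be issued right after the send operation is detected.
cmd = build_screenrecord_cmd("/sdcard/fusion_play.mp4", time_limit_s=15)
```

After the recording finishes, the file could be copied off the device with `adb pull` before framing; that step is omitted here.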
And S203, performing object recognition on the multi-frame candidate images of the recorded video to obtain recognition results of the candidate images of each frame.
From the foregoing, the fusion resource image is a special effect image obtained by fusing the reference object for detection into the dynamic template image of the target resource; if the dynamic template image of the target resource can normally fuse other data content (such as the reference object), the recorded video obtained by recording the playing process of the fusion resource image necessarily includes the reference object. Based on this principle, the test terminal can perform object recognition on the recorded video. Specifically, since the recorded video is composed of recorded pictures frame by frame (i.e., jpg (Joint Photographic Experts Group, an image file format) pictures or png (Portable Network Graphics) pictures), the test terminal may first perform framing processing on the recorded video to obtain multiple frames of candidate images, and then identify whether each of these candidate images includes a target object associated with the reference object, thereby obtaining a recognition result for each frame of candidate image. The recognition result of any candidate image indicates whether that candidate image includes the target object associated with the reference object. By performing object recognition on the multiple frames of candidate images, the test terminal can conveniently determine, from the recognition results, whether the dynamic template image is normally fused with the reference object, and, once that is determined, further detect whether the fused special effect image (namely the fusion resource image) is normally played and displayed.
The target object and the reference object mentioned above may be the same or different; i.e., the target object may or may not be the reference object itself. When the target object and the reference object are different, the target object is determined as follows. When the reference object comprises the user portrait of the first test user and/or the user portrait of the second test user: if the user portrait is a face image, the target object may be a target number of faces; if the user portrait is a gesture image, the target object may be a target number of gestures. The target number is the number of user portraits included in the reference object; for example, if the reference object includes only the user portrait of the first test user or only the user portrait of the second test user, the target number is 1; if the reference object includes both, the target number is 2. When the reference object is a default object preset according to the service requirement: if the default object is a default face image, the target object is a face; if the default object is a default gesture image, the target object is a gesture; if the default object is default text content, the target object is text.
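The per-frame recognition result described for step S203 can be sketched as follows. This is only an illustrative sketch: the detector is injected as a callable, and `count_faces_stub` is a toy stand-in (it counts "F" markers in a text frame) rather than a real face recognition algorithm. Treating a frame as containing the target object only when the detected count equals the target number is an assumption of this example.

```python
def target_number(reference_portraits):
    """The target number is the number of user portraits in the reference object."""
    return len(reference_portraits)


def recognize_frames(candidate_frames, count_objects, target_count):
    """Return one boolean per candidate frame.

    True means the frame is taken to include the target object, i.e. the
    injected detector `count_objects` reports exactly `target_count`
    objects (assumption: exact match rather than "at least").
    """
    return [count_objects(frame) == target_count for frame in candidate_frames]


def count_faces_stub(frame):
    """Toy stand-in for a face detector: counts 'F' markers in a text frame."""
    return sum(row.count("F") for row in frame)


# Reference object with the portraits of both test users -> target number 2.
n = target_number(["first_user_portrait", "second_user_portrait"])

frames = [
    ["....", "...."],  # no face yet
    ["F...", "...."],  # only one face visible
    ["F..F", "...."],  # both fused portraits visible
]
results = recognize_frames(frames, count_faces_stub, target_count=n)
```

In a real test terminal, `count_faces_stub` would be replaced by whichever face, gesture or text recognition technique is actually adopted.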
S204, selecting a key frame image from the multi-frame candidate images according to the identification result of each frame candidate image.
In a specific implementation, after the test terminal obtains the recognition result of each frame of candidate image through step S203, it can select a key frame image from the multiple frames of candidate images according to these recognition results; a key frame image is a candidate image that includes the target object. If a key frame image is successfully selected, this indicates that at least one frame of the candidate images includes the target object, and it can be determined that the dynamic template image of the target resource is normally fused with the reference object; in this case, the test terminal may further detect whether the fusion resource image is normally played and displayed by executing step S205, so as to determine whether the special effect display logic of the dynamic template image is abnormal. If no key frame image can be selected, this indicates that none of the candidate images includes the target object, and it can be determined that the dynamic template image of the target resource cannot normally fuse the reference object; in this case, the test terminal can directly determine that the special effect display logic of the dynamic template image is abnormal.
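The selection in step S204 and the branch that follows it can be sketched in a few lines. The function name `select_key_frames` and the string frame labels are hypothetical and serve only to illustrate the logic.

```python
def select_key_frames(candidate_frames, recognition_results):
    """Keep the candidate frames whose recognition result is True."""
    return [
        frame
        for frame, has_target in zip(candidate_frames, recognition_results)
        if has_target
    ]


candidates = ["frame_0", "frame_1", "frame_2", "frame_3"]
results = [False, True, True, False]

key_frames = select_key_frames(candidates, results)

# An empty selection means the reference object was never fused, so the
# special effect display logic can be reported as abnormal right away;
# otherwise detection continues with step S205 on the key frames.
fusion_ok = bool(key_frames)
```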
S205, performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
In a specific implementation, image recognition may be performed on the key frame image to detect whether the key frame image includes a target image element. In one embodiment, image recognition can be performed directly on the key frame image to obtain at least one image element contained in the key frame image; it is then detected whether the at least one image element includes the target image element, thereby determining whether the key frame image includes the target image element. In another embodiment, image matching against a static image may be used to locate a key position in the key frame image, and it is then detected whether the image element at the key position is the target image element, thereby determining whether the key frame image includes the target image element.
The target image element is preset according to the service requirement and the dynamic template image; for example, for the dynamic template image shown in fig. 3, the target image element may be a "wing" image element, a "gemstone" image element, and so on. If the key frame image is detected to comprise the target image element, determining that the fusion resource image is normally played and displayed; in this case, it can be determined that there is no abnormality in the special effect display logic of the dynamic template image; if the key frame image is detected to not comprise the target image element, determining that the fusion resource image is not normally played and displayed; in this case, it can be determined that there is an abnormality in the special effect presentation logic of the dynamic template image.
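The decision made in step S205 can be illustrated with the first embodiment above, in which image recognition yields a set of image element labels for the key frame. The sketch assumes that recognition has already produced those labels and that every preset target image element must be present; both the function name and the "all elements required" rule are assumptions of this example.

```python
def display_logic_is_abnormal(recognized_elements, target_elements):
    """Compare recognized image elements with the preset target elements.

    Returns (abnormal, missing): `abnormal` is True when at least one
    target image element is absent from the key frame, and `missing`
    is the set of absent elements.
    """
    missing = set(target_elements) - set(recognized_elements)
    return bool(missing), missing


# Target image elements preset for the template of fig. 3.
targets = ["wing", "gemstone"]

abnormal, missing = display_logic_is_abnormal(
    recognized_elements=["wing", "gemstone", "face"],
    target_elements=targets,
)
```

With the labels above, no target element is missing, so the fusion resource image is treated as normally played and displayed.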
For the dynamic template image of the target resource, the embodiment of the invention provides special effect display logic that can fuse other data content into the dynamic template image and play the fused special effect image, which effectively enriches the special effect display modes of the dynamic template image. When performing anomaly detection on this special effect display logic, the fusion resource image obtained by fusing the reference object into the dynamic template image of the target resource is played first, and the playing process is recorded to obtain a recorded video. Then, object recognition is performed on multiple frames of candidate images of the recorded video to identify whether each frame of candidate image includes the target object; by introducing a target object associated with the reference object, object recognition only needs to identify whether a candidate image includes the target object rather than the reference object itself, which effectively reduces the recognition complexity and saves processing resources. Next, a key frame image including the target object is selected from the multiple frames of candidate images; because the key frame image includes the target object, it has a higher reference value, so performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image improves the detection efficiency and effectively improves the accuracy of the detection result.
Based on the related description of the method embodiment shown in fig. 2, the embodiment of the present invention also provides another, more specific image detection method; this image detection method can be executed by the test terminal or by a communication APP running in the test terminal, and for convenience of explanation the following description takes execution by the test terminal as an example. In addition, considering that in practical application a user may not want to watch some of the special effect images generated by fusing other data content into the dynamic template image, the embodiment of the invention can also provide the user with a special effect shielding function, which shields the special effect images obtained by fusing the related data content into the dynamic template image. Accordingly, to ensure the normal use of the special effect shielding function, the test terminal also needs to perform anomaly detection on the special effect shielding function. After the special effect shielding function is turned on, if the function is not abnormal, only the dynamic template image is played for the user and the related fusion resource image is not played; the related fusion resource image is played only when the special effect shielding function is abnormal. Therefore, anomaly detection of the special effect shielding function can proceed as follows:
Firstly, the special effect shielding function can be turned on, and a test image of the target resource is played in the terminal screen; the test image may be the fusion resource image or the dynamic template image. Secondly, the playing process of the test image can be recorded to obtain a screen recording video of the playing process of the test image. Then, it can be detected whether the multiple frames of images of the screen recording video include the target object; specifically, object recognition may be performed on the multiple frames of images of the screen recording video to determine whether they include the target object, and the specific implementation of this object recognition may refer to the object recognition performed on the multiple frames of candidate images of the recorded video of the playing process of the fusion resource image in other embodiments of the present invention, which is not repeated here. If the target object is included, this indicates that the played test image is the fusion resource image, so it is determined that the special effect shielding function is abnormal; in this case, technicians can overhaul the special effect shielding function to ensure its subsequent normal use. If the target object is not included, this indicates that the played test image is the dynamic template image, so it can be determined that the special effect shielding function is not abnormal.
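The verdict on the special effect shielding function reduces to a single check over the per-frame recognition results of the screen recording video. The sketch below is illustrative only; the function name is hypothetical, and the recognition results are assumed to have been produced by the same object recognition used for the candidate images.

```python
def shield_function_is_abnormal(recognition_results):
    """Judge the shielding function from per-frame recognition results.

    With shielding turned on, the target object should never appear. If any
    frame contains it, the fusion resource image was played, so the
    shielding function is abnormal.
    """
    return any(recognition_results)


# Shielding works: only the dynamic template image was played.
ok_case = shield_function_is_abnormal([False, False, False])

# Shielding failed: the fused portraits showed up in one recorded frame.
bad_case = shield_function_is_abnormal([False, True, False])
```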
Therefore, in the whole process of abnormality detection of the special effect shielding function, no human intervention is needed, and the labor cost can be saved. And moreover, the abnormal detection of the special effect shielding function is automatically realized, so that the detection speed can be effectively improved, the detection time is effectively shortened, and the detection efficiency is improved. In addition, whether the special effect shielding function is abnormal or not is detected by combining an object recognition technology on the basis of automation, and the detection accuracy can be effectively improved.
After determining that the special effect shielding function is not abnormal, the test terminal can close the special effect shielding function, and the image detection method shown in fig. 4 is adopted to detect the abnormality of the special effect display logic of the dynamic template image of the target resource. Referring to fig. 4, the image detection method may include the following steps S401 to S407:
S401, playing the fusion resource image of the target resource in the terminal screen.
S402, recording the screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image.
The fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource; and the fusion resource image is played in the terminal screen under the condition that the special effect shielding function is not abnormal and the special effect shielding function is in a closed state. It should be noted that, the specific implementation manner of the steps S401 to S402 may refer to the related descriptions of the steps S201 to S202 in the above embodiment of the invention, which are not repeated herein.
S403, framing the recorded video to obtain multi-frame initial images.
In a specific implementation, the test terminal may first split the recorded video into pictures to obtain multiple frames of recorded pictures of the fusion resource image. Then, the multiple frames of recorded pictures can be directly determined to be the multiple frames of initial images; alternatively, the image of a target area can be cut out from each frame of recorded picture to serve as an initial image. The center point of the target area coincides with the display position of the fusion resource image in the terminal screen, and the area of the target area can be set according to an empirical value or according to the display size of the fusion resource image; specifically, the area of the target area may be greater than or equal to the display size of the fusion resource image. For example, suppose the ratio of the distance between the display position of the fusion resource image and the top of the terminal screen to the length of the terminal screen is 1/3, i.e., the fusion resource image is displayed at 1/3 of the length of the terminal screen; then the center point of the target area lies at 1/3 of the length of each recorded picture, and the image of the target area centered at 1/3 of the picture length can be cut out from each frame of recorded picture as an initial image, as shown in fig. 5a.
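The cutting of the target area can be sketched as follows. The sketch makes two assumptions that the embodiment does not fix: a recorded picture is modelled as a list of pixel rows, and the target area is a full-width horizontal band whose height (`region_height`) is chosen by the tester. The function name `crop_target_region` is hypothetical.

```python
def crop_target_region(frame, center_ratio, region_height):
    """Cut a horizontal band out of a recorded picture.

    `frame` is a list of pixel rows. The band is centered at
    `center_ratio` of the picture length (e.g. 1/3 from the top), spans
    `region_height` rows, and is clamped to the picture bounds.
    """
    height = len(frame)
    center = round(height * center_ratio)
    top = max(0, center - region_height // 2)
    bottom = min(height, top + region_height)
    return frame[top:bottom]


# A 12-row recorded picture; each row is labelled with its own index.
recorded_picture = [[i] for i in range(12)]

# Fusion resource image displayed at 1/3 of the screen length; keep 4 rows.
initial_image = crop_target_region(recorded_picture, center_ratio=1 / 3, region_height=4)
```

For the 12-row picture above, the band covers rows 2 to 5, i.e. it is centered on row 4, which is one third of the picture length.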
Thus, the test terminal can support two framing modes: the first frames the recorded video directly using 100% of the recorded picture, and the second frames it using only the image of the target area cut out from the recorded picture. Compared with the first framing mode, the second framing mode may have at least the following advantages: (1) Framing is customized to the display mode of the fusion resource image, so the initial images obtained by framing contain the complete information of the fusion resource image and the subsequent object recognition effect is not impaired; with the recognition effect kept the same as in the first framing mode, the object recognition time can be effectively shortened because no object recognition is performed on the useless image content outside the target area, which improves the object recognition efficiency. (2) The recorded picture may contain interference objects other than the reference object, and if object recognition is performed on the whole recorded picture, these interference objects may affect the accuracy of the object recognition result. For example, if the recorded picture includes other faces (i.e., interference objects), the test terminal recognizes not only the faces included in the reference object but also the other faces, producing the interference-affected output shown in fig. 5b and reducing the accuracy of the object recognition result. If the image of the target area is taken as the initial image instead, the influence of other interference objects (such as the game face image in the live game picture shown in fig. 5b) can be effectively reduced, which improves the accuracy of the object recognition result, as shown in fig. 5c.
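The second framing mode can be illustrated with a minimal sketch. It assumes a frame is a list of pixel rows, that the target area is centered horizontally (the description only fixes its vertical position at 1/3 of the picture length), and that the box is clamped to the picture; the names `CropBox`, `target_region` and `crop` are hypothetical and not part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropBox:
    left: int
    top: int
    right: int
    bottom: int


def target_region(frame_w: int, frame_h: int,
                  region_w: int, region_h: int,
                  center_ratio: float = 1 / 3) -> CropBox:
    """Box of size region_w x region_h whose center lies at
    center_ratio of the picture length, measured from the top.

    Horizontal centering is an assumption of this sketch.
    """
    center_x = frame_w // 2
    center_y = round(frame_h * center_ratio)
    # Clamp the box so that it never leaves the recorded picture.
    left = max(0, center_x - region_w // 2)
    top = max(0, center_y - region_h // 2)
    right = min(frame_w, left + region_w)
    bottom = min(frame_h, top + region_h)
    return CropBox(left, top, right, bottom)


def crop(frame: list[list[int]], box: CropBox) -> list[list[int]]:
    """Cut the target area out of one recorded picture (rows of pixels)."""
    return [row[box.left:box.right] for row in frame[box.top:box.bottom]]


# A 1080 x 1920 recorded picture with a full-width, 800-pixel-high target area.
print(target_region(1080, 1920, 1080, 800))
```

Only the pixels inside the returned box would be handed to object recognition, which is what removes interference objects located elsewhere in the picture.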
S404, determining multi-frame candidate images of the recorded video from the multi-frame initial images.
In a specific implementation, each frame of initial image can have an image sequence number, and the frames of initial images can be arranged in order of their image sequence numbers; the image sequence number of any initial image indicates the order in which the recorded picture corresponding to that initial image was captured during screen recording. For example, if the image sequence number of initial image A is 1, the recorded picture corresponding to initial image A is the 1st frame captured by the screen recording; if the image sequence number of initial image B is 100, the recorded picture corresponding to initial image B is the 100th frame captured by the screen recording, and so on.
In one embodiment, the multi-frame initial images may be used directly as the multi-frame candidate images of the recorded video. In still another embodiment, because the fusion resource image is obtained by fusing the reference object into all or some of the static template images in the dynamic template image, and because screen recording may start earlier than playback of the fusion resource image, only some of the recorded pictures in the recorded video may show the fused special effect; consequently, only some of the initial images obtained by framing the recorded video may show the fused special effect. In this case, if the multi-frame initial images were used directly as candidate images, performing object recognition on every frame of initial image would consume a large amount of time. The embodiment of the invention therefore further studies the multi-frame initial images and finds that adjacent initial images, especially adjacent initial images showing the fused special effect, differ little from one another, as shown in fig. 5d (the left image is the 910th initial image, and the right image is the 923rd initial image). Based on this, the test terminal can sample the sequentially arranged multi-frame initial images according to a sampling frequency to obtain the multi-frame candidate images of the recorded video, so that object recognition is performed only on the sampled candidate images.
The sampling frequency can be set according to an empirical value or service requirements. For example, the sampling frequency may be set to one sample for every 10 images; when there are 1000 initial images, sampling at this frequency reduces the 1000 initial images to 100 candidate images for object recognition, which effectively shortens the object recognition time and improves the object recognition efficiency.
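The sampling step can be sketched as follows. The sketch works on image sequence numbers only, and it assumes that the last image of every group of 10 is the one kept; the description does not say which image of each group is sampled, and the function name `sample_candidates` is hypothetical.

```python
def sample_candidates(initial_numbers: list[int], interval: int = 10) -> list[int]:
    """Keep one image sequence number out of every `interval` initial images.

    Which image of each group is kept is an assumption of this sketch
    (here: the last one, so 1..1000 yields 10, 20, ..., 1000).
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return initial_numbers[interval - 1::interval]


initial_numbers = list(range(1, 1001))   # 1000 sequentially arranged initial images
candidates = sample_candidates(initial_numbers)
print(len(candidates))                   # 100 candidate images remain
```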
S405, performing object recognition on multiple frames of candidate images of the recorded video to obtain recognition results of the candidate images of each frame.
As can be seen from the foregoing, the reference object can be implemented in a variety of ways; accordingly, the specific way in which the test terminal performs object recognition on the multi-frame candidate images of the recorded video differs with the reference object. The details are as follows:
When the reference object comprises a user portrait of the first test user and/or a user portrait of the second test user: if the user portrait is a face image, face recognition can be performed on the multi-frame candidate images of the recorded video to obtain the recognition result of each frame of candidate image; if the user portrait is a gesture image, gesture recognition can be performed on the multi-frame candidate images of the recorded video to obtain the recognition result of each frame of candidate image. Optionally, if the reference object includes the user name of the first test user and/or the user name of the second test user, text recognition may further be performed on the multi-frame candidate images of the recorded video to obtain the recognition result of each frame of candidate image.
When the reference object comprises a default object set according to service requirements, if the default object is a default face image, the face recognition can be carried out on multiple frames of candidate images of the recorded video to obtain the recognition result of each frame of candidate image; if the default object is a default gesture image, gesture recognition can be performed on multiple frames of candidate images of the recorded video, and recognition results of the candidate images of each frame are obtained; if the default object is the default text content, text recognition can be carried out on multiple frames of candidate images of the recorded video, and recognition results of the candidate images of each frame are obtained.
S406, selecting a key frame image from the multi-frame candidate images according to the identification result of each frame candidate image.
The identification result of any candidate image is used for indicating: whether any candidate image includes a target object associated with a reference object. In the specific implementation of step S406, the test terminal may first screen out candidate images including the target object from the multiple frames of candidate images according to the recognition result of each frame of candidate image; then, a key frame image can be selected from the screened candidate images of each frame. In one embodiment, each of the selected frame candidate images may be directly used as a key frame image; or selecting one candidate image from the screened candidate images of each frame as a key frame image.
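The screening step can be sketched under two assumptions: that the recognition result of each candidate image has been reduced to a count of detected faces, and that a candidate image "includes the target object" when that count equals the target number (for instance 2 when the reference object contains two user portraits). The function name `screen_candidates` and the sample counts are hypothetical.

```python
def screen_candidates(face_counts: dict[int, int], target_number: int) -> list[int]:
    """Return the image sequence numbers of the candidate images whose
    recognition result contains exactly `target_number` faces.

    `face_counts` maps image sequence number -> number of detected faces;
    treating "includes the target object" as an exact count match is an
    assumption of this sketch.
    """
    return sorted(number for number, count in face_counts.items()
                  if count == target_number)


# Hypothetical recognition results for four sampled candidate images.
results = {780: 0, 790: 2, 800: 2, 810: 3}
print(screen_candidates(results, target_number=2))
```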
In yet another embodiment, although each screened candidate image includes the target object, anomaly detection of the special effect display logic is subsequently carried out by identifying whether the key frame image includes a target image element. Research shows that the fusion resource image may present some exposure special effects during playback, so that some of the screened candidate images are partially exposed; the element information in these candidate images is then unclear, which makes them unsuitable for the subsequent anomaly detection of the special effect display logic. For example, referring to fig. 5e, the image sequence numbers of the screened candidate images that include the target object are, in order: 790, 800, 810, …, 910; it can be seen that the fused special effect is shown starting from the candidate image with image sequence number 790. However, although the candidate image with image sequence number 790 includes the target object (e.g., 2 faces), its image elements (e.g., the "wing" elements) are unclear because of the exposure special effect; if the subsequent detection were performed on this candidate image, the accuracy of the detection result might be affected. Therefore, to reduce the probability that a candidate image with an exposure special effect is selected as the key frame image and affects the subsequent detection result, the test terminal may, when selecting the key frame image, arrange the screened candidate images in ascending order of their image sequence numbers to obtain the arrangement position of each screened candidate image, and then select, from the sequentially arranged candidate images, the candidate image whose arrangement position is the target position as the key frame image.
Alternatively, in other embodiments, the test terminal may add the image sequence numbers of the screened candidate images to an array in ascending order of image sequence number, and then use the candidate image indicated by the image sequence number at the target position in the array as the key frame image.
The target position can be set according to an empirical value or the exposure special effect display logic; for example, if the exposure special effect display logic indicates that the exposure special effect is generally shown in the first N frames of candidate images, the target position may be set to any position after the N-th frame of candidate image. For example, as shown in fig. 5e, there are 13 screened candidate images; the target position may then be set to the middle position of the sequentially arranged candidate images or the middle position of the array, in which case the key frame image is the candidate image with image sequence number 850 corresponding to the middle position.
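The array-based selection can be sketched as follows, using the image sequence numbers of the fig. 5e example (790 to 910 in steps of 10, 13 images in total). The function name `select_key_frame` and the default of falling back to the middle position are assumptions of this sketch; any position after the frames affected by the exposure special effect would serve equally.

```python
from typing import Optional


def select_key_frame(screened_numbers: list[int],
                     target_position: Optional[int] = None) -> int:
    """Sort the screened image sequence numbers in ascending order and
    return the one at `target_position` (0-based index into the array).

    Defaulting to the middle position is an assumption of this sketch.
    """
    ordered = sorted(screened_numbers)
    if not ordered:
        raise ValueError("no candidate image includes the target object")
    if target_position is None:
        target_position = len(ordered) // 2
    return ordered[target_position]


screened = list(range(790, 911, 10))     # 790, 800, ..., 910 (13 images)
print(select_key_frame(screened))        # 850, the middle of the array
```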
S407, carrying out anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
In a specific implementation, static image matching processing can be performed on the key frame image and the target image element to obtain a matching result. If the matching result indicates that the key frame image comprises the target image element, determining that the special effect display logic of the dynamic template image is not abnormal; if the matching result indicates that the key frame image does not comprise the target image element, the special effect display logic of the dynamic template image can be determined to be abnormal. The specific implementation mode of carrying out static image matching processing on the key frame image and the target image element to obtain a matching result is as follows:
First, special effect marking can be performed on the key frame image to obtain a special effect marking result. Special effect marking here means locating and marking the key positions in the key frame image; accordingly, the special effect marking result includes the position information of the key positions in the key frame image. The key positions can be set according to an empirical value or service requirements; taking the candidate image with image sequence number 850 as the key frame image, a key position may be the position of the "wing" image element and/or the position of the "gemstone" image element, and the special effect marking result may be as shown in fig. 5f. After the special effect marking result is obtained, the key feature elements at the key positions of the key frame image can be obtained according to the special effect marking result, and static image matching is performed between the key feature elements and the target image elements. If the matching succeeds, a matching result indicating that the key frame image includes the target image element is generated; if the matching fails, a matching result indicating that the key frame image does not include the target image element is generated.
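The matching flow can be sketched as follows. The description does not specify the matching algorithm, so this sketch substitutes a mean absolute pixel difference on small grayscale grids; the key position is given as a `(left, top, right, bottom)` box, and the names `extract_region`, `matches_target_element` and the threshold `max_diff` are all hypothetical.

```python
Image = list[list[int]]          # grayscale image as rows of pixel values
Box = tuple[int, int, int, int]  # (left, top, right, bottom) of a key position


def extract_region(image: Image, box: Box) -> Image:
    """Cut the key feature element out of the key frame image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def matches_target_element(key_frame: Image, key_box: Box,
                           target_element: Image,
                           max_diff: float = 10.0) -> bool:
    """True if the region at the key position resembles the target element.

    Mean absolute difference is a stand-in for the unspecified static
    image matching; `max_diff` is an arbitrary illustrative threshold.
    """
    region = extract_region(key_frame, key_box)
    if not target_element or not target_element[0]:
        return False
    same_shape = (len(region) == len(target_element) and
                  all(len(r) == len(t) for r, t in zip(region, target_element)))
    if not same_shape:
        return False
    total = sum(abs(a - b)
                for r, t in zip(region, target_element)
                for a, b in zip(r, t))
    pixels = len(target_element) * len(target_element[0])
    return total / pixels <= max_diff


wing = [[200, 210], [220, 230]]                 # hypothetical target image element
key_frame = [[0] * 4 for _ in range(4)]
key_frame[1][1:3], key_frame[2][1:3] = [198, 212], [221, 229]
abnormal = not matches_target_element(key_frame, (1, 1, 3, 3), wing)
print("special effect display logic abnormal:", abnormal)
```

A failed match at the marked key position is what the method treats as an abnormality in the special effect display logic.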
It should be noted that, in practical applications, the image detection method shown in fig. 2 or fig. 4 may be packaged as an interface, so that the interface can be invoked directly to perform anomaly detection on the special effect display logic of any dynamic template image to be detected, which avoids rewriting the code repeatedly and improves detection efficiency.
The embodiment of the invention can combine object recognition and image recognition technologies in an automated manner and can perform, in a customized way, both anomaly detection on the special effect display logic of the dynamic resource template and anomaly detection on the shielding logic of the special effect shielding function, which effectively shortens the detection time and improves the detection efficiency and detection accuracy. In addition, for the special effect display logic of the dynamic resource template, the recorded video obtained by screen recording is framed using a new video framing mode, which can effectively shorten the running time and improve the recognition accuracy of the subsequent object recognition; performing special processing (such as sampling) on adjacent similar images before object recognition further shortens the time required for object recognition. After the key frame image is obtained, whether the special effect display logic is abnormal is judged by matching image elements at the key positions of the key frame image; a large amount of material does not need to be collected, and only the corresponding elements at the key positions of the key frame image need to be identified, so the detection efficiency and the accuracy of the detection result can be improved.
In practical applications, after anomaly detection is performed on the special effect display logic of the dynamic template image of the target resource using the image detection method shown in fig. 2 or fig. 4, corresponding processing may be performed according to the detection result. Specifically, if the special effect display logic of the dynamic template image is not abnormal, the dynamic template image is released; releasing the dynamic template image means making the dynamic template image publicly available for use by each user. If the special effect display logic of the dynamic template image is abnormal, then, as described above, the abnormality may include at least one of the following: an abnormality in the process of fusing the reference object into the dynamic template image, and an abnormality in the playing process of the fusion resource image. In this case, the test terminal can output alarm information to prompt a technician to analyze the abnormality of the special effect display logic of the dynamic template image, so that the technician can modify the special effect display logic in a targeted manner according to the analysis result until the modified special effect display logic is free of abnormality, after which the dynamic template image can be released.
After the dynamic template image is released, any terminal in the communication system can, upon detecting that a first terminal user sends the target resource to a second terminal user, play the special effect of the dynamic template image according to its special effect display logic. Based on this, the embodiment of the invention also provides an image display method; the image display method can be executed by the first terminal in the communication system or by the communication APP running in the first terminal. For convenience of explanation, the following description takes the case in which the image display method is executed by the first terminal, the first terminal being the terminal used by the first terminal user in the communication system. Referring to fig. 6, the image display method may include the following steps S601 to S603:
S601, outputting a resource selection interface in response to a triggering operation by which a first terminal user in a communication scene transmits a resource to a second terminal user.
Among other things, the communication scenario may include, but is not limited to: social session scenes built based on social APP, live broadcast scenes built based on live APP, and so on. When the communication scene is a social session scene, the first terminal user can be any user in the target session, and the second terminal user can be any user except the first terminal user in the target session; when the communication scene is a live scene, the first end user may be any viewer user in the live room, the second end user may be a host user in the live room, and so on.
When the first terminal user wants to transmit the target resource to the second terminal user, a triggering operation for resource transmission may be input in the current interface displayed in the first terminal. The current interface referred to herein may include, but is not limited to: a session interface of a target session, a live interface in a live scene, and the like; the triggering operation for resource transmission input by the first terminal user on the current interface may include, but is not limited to: an input operation of a preset voice password, an input operation of a preset gesture, a click operation or a press operation on a trigger component in the current interface, and the like. Correspondingly, the first terminal can output a resource selection interface in response to the triggering operation for resource transmission. The resource selection interface comprises at least a resource identifier of the target resource; the target resource here may be any fusion resource, such as a fusion gift or a fusion electronic resource, and the resource identifier of the target resource may include, but is not limited to: a resource name, a resource number, a resource image, etc. The resource image may be a thumbnail corresponding to any one of the static template images in the dynamic template image of the target resource, or may be an image independent of the dynamic template image, which is not limited here. Moreover, the resource identifier is displayed in the resource selection interface only when the image detection method shown in fig. 2 or fig. 4 has determined that the special effect display logic of the dynamic template image of the target resource is not abnormal.
Specifically, when outputting the resource selection interface, the first terminal may output the resource selection interface directly on the current interface, replace the current interface with the resource selection interface, or switch from the current interface to the resource selection interface. Taking as an example the case in which the target resource is a fusion gift, the current interface is a live interface, and the resource selection interface is output on the current interface, a schematic diagram of the output resource selection interface 70 and the resource identifier 71 of the target resource can be seen in fig. 7a.
S602, selecting the resource identification of the target resource in the resource selection interface.
In a specific implementation, a first terminal user selects a resource identifier of a target resource in a resource selection interface; accordingly, the first terminal may select the resource identifier of the target resource in the resource selection interface according to the selection operation of the first terminal user, as shown in fig. 7 b. It should be understood that fig. 7b is only a schematic diagram for exemplarily characterizing the resource identification of the selected target resource, which is not limited thereto; for example, in other embodiments, the first terminal may also switch the display state (display color, display size, display position, etc.) of the resource identifier of the target resource from the first state to the second state in the resource selection interface to select the resource identifier of the target resource.
S603, if the confirmation operation for the resource identification is detected, the target special effect image of the target resource is played in the terminal screen.
In a specific implementation, after the first terminal user selects the resource identifier of the target resource, a confirmation operation can be performed on the resource identifier; the confirmation operation may include: an operation of inputting a confirmation gesture or confirmation voice, a triggering operation on the confirmation component 72 in the resource selection interface, and the like. Correspondingly, the first terminal can play the target special effect image of the target resource in the terminal screen in response to the confirmation operation; the target special effect image is obtained by fusing the user identification of a target user into the dynamic template image of the target resource. The target user may comprise at least one of: the first terminal user and the second terminal user; the user identification includes at least one of: a user portrait and a user name; the user portraits mentioned herein may include, but are not limited to: face images, gesture images, scenic images, animal images, and so forth. Taking as an example the case in which the target user comprises the first terminal user and the second terminal user and the user identification comprises a user portrait and a user name, a schematic diagram of playing the target special effect image 73 is shown in fig. 7c.
The embodiment of the invention can output a resource selection interface in response to a triggering operation by which a first terminal user in a communication scene transmits a resource to a second terminal user. The resource selection interface comprises a resource identifier of a target resource; the resource identifier is displayed in the resource selection interface only when the special effect display logic of the dynamic template image of the target resource has been determined to be free of abnormality. This effectively ensures the usability of the dynamic template image and reduces the probability that the dynamic template image is displayed abnormally in actual use. After the user selects and confirms the resource identifier of the target resource through the resource selection interface, the terminal can fuse the user identification of the first terminal user and/or the user identification of the second terminal user into the dynamic template image of the target resource and play the resulting target special effect image of the target resource in the terminal screen, which effectively enriches the special effect display modes of the dynamic template image, makes image display more interesting, and improves user stickiness.
It will be appreciated that in the specific implementation of any of the above method embodiments, relevant data (e.g., face images, user identifications, etc.) relating to user information, etc., is collected with user permissions or consents, and the collection, use and processing of the relevant data complies with relevant legal regulations and standards of the relevant country and region when the above method embodiments are applied to specific products or technologies.
Based on the above description of the embodiments of the image detection method, the embodiments of the present invention also disclose an image detection apparatus, which may be a computer program (including program code) running in the above-mentioned test terminal. The image detection apparatus may perform the method shown in fig. 2 or fig. 4. Referring to fig. 8, the image detection apparatus may run the following units:
a playing unit 801, configured to play, in a terminal screen, a fused resource image of a target resource, where the fused resource image is a special effect image obtained by fusing a reference object for detection to a dynamic template image of the target resource;
a screen recording unit 802, configured to record a screen of the playing process of the fused resource image, so as to obtain a recorded video related to the playing process of the fused resource image;
the processing unit 803 is configured to perform object recognition on multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image, where the recognition result of any candidate image is used for indicating: whether the any candidate image includes a target object associated with the reference object;
the processing unit 803 is further configured to select a key frame image from the multiple frame candidate images according to the identification result of each frame candidate image, where the key frame image is a candidate image including the target object;
The processing unit 803 is further configured to perform anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
In one embodiment, the target resource is sent by a first test user to a second test user in a communication scenario;
the reference object comprises at least one of: the user portrait of the first test user and the user portrait of the second test user; the user portrait is a face image or a gesture image;
when the user portrait is the face image, the target object is the face with the target number; when the user portrait is the gesture image, the target object is a target number of gestures; wherein the target number is the number of user portraits included in the reference object.
In still another embodiment, when the processing unit 803 is configured to perform object recognition on multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image, the processing unit may be specifically configured to:
if the user portrait is the face image, carrying out face recognition on the multi-frame candidate images of the recorded video to obtain the recognition result of each frame of candidate image;
And if the user portrait is a gesture image, carrying out gesture recognition on the multi-frame candidate images of the recorded video to obtain recognition results of the candidate images of each frame.
In still another embodiment, before the processing unit 803 is configured to perform object recognition on multiple frames of candidate images of the recorded video, the processing unit may be further configured to:
carrying out framing treatment on the recorded video to obtain multi-frame initial images;
and determining multi-frame candidate images of the recorded video from the multi-frame initial images.
In still another embodiment, the processing unit 803, when configured to perform framing processing on the recorded video to obtain a multi-frame initial image, may be specifically configured to:
performing picture segmentation on the recorded video to obtain multi-frame recorded pictures of the fusion resource images;
determining the multi-frame recorded picture as a multi-frame initial image, or respectively cutting out an image of a target area from each frame recorded picture as a multi-frame initial image; and the region center point of the target region coincides with the display position of the fusion resource image in the terminal screen.
In another embodiment, each frame of initial image has an image sequence number, and the frames of initial images are sequentially arranged according to the image sequence numbers; accordingly, the processing unit 803 may be specifically configured to, when determining the multiple candidate images of the recorded video from the multiple initial images:
Taking the multi-frame initial image as a multi-frame candidate image of the recorded video; or,
and sampling the sequentially arranged multi-frame initial images according to the sampling frequency to obtain multi-frame candidate images of the recorded video.
In still another embodiment, the processing unit 803 may be specifically configured to, when configured to select a key frame image from the multiple frame candidate images according to the identification result of each frame candidate image:
screening candidate images comprising the target object from the multi-frame candidate images according to the identification result of each frame of candidate images;
and selecting a key frame image from the screened candidate images of each frame.
In still another embodiment, the processing unit 803 may be specifically configured to, when selecting a key frame image from the screened candidate images:
arranging the screened frame candidate images according to the sequence from small to large of the image sequence numbers of the screened frame candidate images to obtain the arrangement positions of the screened frame candidate images;
and selecting the candidate image with the arrangement position at the target position from the sequentially arranged multi-frame candidate images as a key frame image.
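In still another embodiment, the processing unit 803, when configured to perform anomaly detection on the special effect display logic of the dynamic template image according to the key frame image, may be specifically configured to: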
Performing static image matching processing on the key frame image and the target image element to obtain a matching result;
if the matching result indicates that the key frame image comprises the target image element, determining that the special effect display logic of the dynamic template image is not abnormal;
and if the matching result indicates that the key frame image does not comprise the target image element, determining that the special effect display logic of the dynamic template image is abnormal.
In yet another embodiment, the special effect presentation logic of the dynamic template image has an abnormality including at least one of: an abnormality exists in the process of fusing the reference object to the dynamic template image, and an abnormality exists in the playing process of the fused resource image; accordingly, the processing unit 803 may be further configured to:
if the special effect display logic of the dynamic template image is abnormal, outputting alarm information to prompt a technician to analyze the abnormal condition of the special effect display logic of the dynamic template image;
and if the special effect display logic of the dynamic template image is not abnormal, releasing the dynamic template image.
In still another embodiment, the fusion resource image is played in the terminal screen when the special effect shielding function has no abnormality and is in a turned-off state;
the abnormality of the special effect shielding function is detected as follows:
starting the special effect shielding function, and playing a test image of a target resource in a terminal screen, wherein the test image comprises the fusion resource image or the dynamic template image;
recording a screen during the playing process of the test image to obtain a screen recording video about the playing process of the test image; detecting whether the multi-frame image of the screen recording video comprises the target object or not;
if so, determining that the special effect shielding function is abnormal; and if not, determining that the special effect shielding function is not abnormal.
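The decision rule for the special effect shielding function can be sketched as below. The screen recording and the object detector are outside the scope of this example; `contains_target` is a hypothetical callable standing in for whatever recognizer is used, and the function name is an assumption.

```python
from typing import Callable, Iterable, TypeVar

Frame = TypeVar("Frame")


def shielding_is_abnormal(
    recorded_frames: Iterable[Frame],
    contains_target: Callable[[Frame], bool],
) -> bool:
    """Check a recording made while the shielding function was turned on.

    If any recorded frame still shows the target object, the special effect
    was played despite being shielded, so the function is judged abnormal.
    """
    return any(contains_target(frame) for frame in recorded_frames)
```

Because `any` stops at the first matching frame, the check ends as soon as one frame reveals the target object.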
According to one embodiment of the present invention, the steps involved in the method shown in fig. 2 or fig. 4 may be performed by the respective units in the image detection apparatus shown in fig. 8. For example, steps S201 and S202 shown in fig. 2 may be performed by the playing unit 801 and the recording unit 802 shown in fig. 8, respectively, and steps S203 to S205 may be performed by the processing unit 803 shown in fig. 8. As another example, steps S401 and S402 shown in fig. 4 may be performed by the playing unit 801 and the recording unit 802 shown in fig. 8, respectively, steps S403 to S407 may each be performed by the processing unit 803 shown in fig. 8, and so on.
According to another embodiment of the present invention, the units of the image detection apparatus shown in fig. 8 may be combined, individually or all together, into one or several other units, or one or more of them may be further split into a plurality of functionally smaller units; either arrangement achieves the same operation without affecting the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the present invention, the image detection apparatus may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and by a plurality of units in cooperation.
According to another embodiment of the present invention, the image detection apparatus shown in fig. 8 may be constructed, and the image detection method of the embodiment of the present invention implemented, by running a computer program (including program code) capable of executing the steps of the methods shown in fig. 2 or fig. 4 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a Central Processing Unit (CPU), a random access memory (RAM), and a read-only memory (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed by the above-described computing device via the computer-readable recording medium.
For the dynamic template image of the target resource, the embodiment of the invention provides special effect display logic that fuses other data content into the dynamic template image and plays the resulting special effect image, which can effectively enrich the special effect display modes of the dynamic template image. When anomaly detection is performed on the special effect display logic, the fusion resource image, obtained by fusing the reference object into the dynamic template image of the target resource, can be played first, and the screen can be recorded during playing to obtain a recorded video. Object recognition can then be performed on multiple frames of candidate images of the recorded video to recognize whether each frame of candidate image includes the target object. Because a target object associated with the reference object is introduced, object recognition only needs to determine whether a candidate image includes the target object, not whether it includes the reference object itself, which effectively reduces recognition complexity and saves processing resources. A key frame image including the target object can then be selected from the multi-frame candidate images. Because the key frame image includes the target object, it has a high reference value; performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image can therefore improve detection efficiency and effectively improve the accuracy of the detection result.
Based on the above description of the embodiments of the image display method, the embodiments of the present invention also disclose an image display apparatus, which may be a computer program (including program code) running in the above-mentioned first terminal. The image display apparatus may perform the method shown in fig. 6. Referring to fig. 9, the image display apparatus may operate as follows:
an output unit 901, configured to output a resource selection interface in response to a triggering operation by which a first terminal user in a communication scenario transmits a resource to a second terminal user; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method;
a processing unit 902, configured to select a resource identifier of the target resource in the resource selection interface;
the output unit 901 is further configured to play a target special effect image of the target resource in a terminal screen if a confirmation operation for the resource identifier is detected; the target special effect image is obtained by fusing a user identification of a target terminal user into the dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first terminal user and the second terminal user.
According to one embodiment of the present invention, the steps involved in the method shown in fig. 6 may be performed by the units in the image display apparatus shown in fig. 9. For example, steps S601 and S603 shown in fig. 6 may be performed by the output unit 901 shown in fig. 9, and step S602 may be performed by the processing unit 902 shown in fig. 9.
According to another embodiment of the present invention, the units of the image display apparatus shown in fig. 9 may be combined, individually or all together, into one or several other units, or one or more of them may be further split into a plurality of functionally smaller units; either arrangement achieves the same operation without affecting the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the present invention, the image display apparatus may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and by a plurality of units in cooperation.
According to another embodiment of the present invention, the image display apparatus shown in fig. 9 may be constructed, and the image display method of the embodiment of the present invention implemented, by running a computer program (including program code) capable of executing the steps of the method shown in fig. 6 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a Central Processing Unit (CPU), a random access memory (RAM), and a read-only memory (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed by the above-described computing device via the computer-readable recording medium.
The embodiment of the invention can output the resource selection interface in response to the triggering operation by which the first terminal user in the communication scenario transmits a resource to the second terminal user. The resource selection interface comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal; this effectively ensures the usability of the dynamic template image and reduces the probability that it displays abnormally in actual use. After the user confirms the selection of the resource identifier of the target resource through the resource selection interface, the terminal can fuse the user identification of the first terminal user and/or the user identification of the second terminal user into the dynamic template image of the target resource and play the resulting target special effect image in the terminal screen, which effectively enriches the special effect display modes of the dynamic template image, makes image display more engaging, and improves user stickiness.
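The gating of resource identifiers on the detection outcome can be sketched as follows. The mapping from resource identifier to detection verdict and the function name are assumptions for illustration; how verdicts are stored and delivered to the first terminal is not described here.

```python
from typing import List, Mapping


def selectable_resource_ids(abnormal_by_resource: Mapping[str, bool]) -> List[str]:
    """Return the resource identifiers to show in the resource selection interface.

    `abnormal_by_resource` maps a resource identifier to True when the special
    effect display logic of its dynamic template image was found abnormal.
    Only resources that passed detection are listed.
    """
    return [
        resource_id
        for resource_id, abnormal in abnormal_by_resource.items()
        if not abnormal
    ]
```

Resources without a recorded verdict are simply absent from the mapping and therefore not shown, which is a conservative choice made for this sketch.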
Based on the description of the method embodiments and the apparatus embodiments, the embodiment of the invention also provides a terminal; the terminal may be the test terminal mentioned above or the first terminal mentioned above. Referring to fig. 10, the terminal may include at least a processor 1001, an input device 1002, an output device 1003, and a computer storage medium 1004. The processor 1001, input device 1002, output device 1003, and computer storage medium 1004 within the terminal may be connected by a bus or by other means. The computer storage medium 1004 may reside in a memory of the terminal and is used to store a computer program comprising program instructions, and the processor 1001 is used to execute the program instructions stored in the computer storage medium 1004. The processor 1001, or CPU (Central Processing Unit), is the computing core and control core of the terminal; it is adapted to implement one or more instructions, in particular to load and execute one or more instructions to implement a corresponding method flow or a corresponding function.
In one embodiment, if the terminal is a test terminal, the processor 1001 in the embodiment of the present invention may be configured to perform a series of image detection processes, which specifically includes: playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource; recording a screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image; object recognition is carried out on the multi-frame candidate images of the recorded video, so that recognition results of the candidate images of each frame are obtained, and the recognition results of any candidate image are used for indicating: whether the any candidate image includes a target object associated with the reference object; selecting a key frame image from the multi-frame candidate images according to the identification result of each frame of candidate image, wherein the key frame image refers to the candidate image comprising the target object; and performing anomaly detection on special effect display logic of the dynamic template image according to the key frame image, and the like.
In still another embodiment, if the terminal is the first terminal, the processor 1001 in the embodiment of the present invention may be configured to perform a series of image display processes, including: responding to a triggering operation of resource transmission from a first terminal user to a second terminal user in a communication scenario, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method; selecting the resource identifier of the target resource in the resource selection interface; if a confirmation operation for the resource identifier is detected, playing a target special effect image of the target resource in a terminal screen; the target special effect image is obtained by fusing a user identification of a target terminal user into the dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first terminal user and the second terminal user.
The embodiment of the invention also provides a computer storage medium (memory), which is a memory device in the terminal and is used for storing programs and data. It will be appreciated that the computer storage medium herein may include both a built-in storage medium in the terminal and an extended storage medium supported by the terminal. The computer storage medium provides a storage space that stores an operating system of the terminal. Also stored in this storage space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 1001. The computer storage medium herein may be a high-speed RAM or a non-volatile memory, such as at least one magnetic disk memory; optionally, it may be at least one computer storage medium located remotely from the processor.
In one embodiment, one or more first instructions stored in a computer storage medium may be loaded and executed by the processor 1001 to implement the respective steps of the methods described above in relation to the image detection method embodiments; in particular implementations, one or more first instructions in the computer storage medium are loaded by the processor 1001 and perform the steps of:
playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;
recording a screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;
object recognition is carried out on the multi-frame candidate images of the recorded video, so that recognition results of the candidate images of each frame are obtained, and the recognition results of any candidate image are used for indicating: whether the any candidate image includes a target object associated with the reference object;
selecting a key frame image from the multi-frame candidate images according to the identification result of each frame of candidate image, wherein the key frame image refers to the candidate image comprising the target object;
and performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.
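The recognition, key frame selection, and anomaly detection steps above can be tied together in a short sketch. Playing and screen recording are assumed to have already produced the candidate frames in sequence order. The two callables, the function name, and the `None` return value for the case where no key frame is found are assumptions of this example; the steps above do not state how that case is handled.

```python
from typing import Callable, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def detect_display_logic(
    candidates: Sequence[Frame],
    includes_target: Callable[[Frame], bool],
    includes_element: Callable[[Frame], bool],
    target_position: int = 0,
) -> Optional[bool]:
    """Return True if abnormal, False if not abnormal, None if undecided.

    `candidates` are the candidate images in sequence-number order.
    `includes_target` is the object recognition result for one frame.
    `includes_element` is the static image matching result for one frame.
    """
    # Keep only the candidate images that include the target object.
    screened = [frame for frame in candidates if includes_target(frame)]
    # Without a frame at the target position there is no key frame to judge.
    if not 0 <= target_position < len(screened):
        return None
    key_frame = screened[target_position]
    # Abnormal exactly when the key frame lacks the target image element.
    return not includes_element(key_frame)
```

Passing the recognizer and the matcher as parameters keeps the control flow testable without any real video or detection model.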
In one embodiment, the target resource is sent by a first test user to a second test user in a communication scenario;
the reference object comprises at least one of: the user portrait of the first test user and the user portrait of the second test user; the user portrait is a face image or a gesture image;
when the user portrait is the face image, the target object is the face with the target number; when the user portrait is the gesture image, the target object is a target number of gestures; wherein the target number is the number of user portraits included in the reference object.
In still another embodiment, when performing object recognition on the multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image, the one or more first instructions may be loaded and specifically executed by the processor 1001:
if the user portrait is the face image, carrying out face recognition on the multi-frame candidate images of the recorded video to obtain the recognition result of each frame of candidate image;
and if the user portrait is a gesture image, carrying out gesture recognition on the multi-frame candidate images of the recorded video to obtain recognition results of the candidate images of each frame.
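The choice between face recognition and gesture recognition, together with the target number of objects, can be sketched as a simple dispatch. The counting callables stand in for real face and gesture detectors and are hypothetical; treating a frame as including the target object when the detected count equals the target number is also an assumption of this example.

```python
from typing import Callable, List, Mapping, Sequence, TypeVar

Frame = TypeVar("Frame")


def recognize_frames(
    candidates: Sequence[Frame],
    portrait_kind: str,
    target_number: int,
    counters: Mapping[str, Callable[[Frame], int]],
) -> List[bool]:
    """Produce one recognition result per candidate image.

    `portrait_kind` is "face" or "gesture" and selects the recognizer.
    `target_number` is the number of user portraits in the reference object.
    `counters` maps a portrait kind to a function counting detected objects.
    """
    if portrait_kind not in counters:
        raise ValueError(f"unsupported portrait kind: {portrait_kind}")
    count_objects = counters[portrait_kind]
    # A frame includes the target object when it shows the expected number.
    return [count_objects(frame) == target_number for frame in candidates]
```

For a reference object made of the portraits of both test users, `target_number` would be 2.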
In still another embodiment, before performing object recognition on the multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image, the one or more first instructions may further be loaded and specifically executed by the processor 1001:
carrying out framing treatment on the recorded video to obtain multi-frame initial images;
and determining multi-frame candidate images of the recorded video from the multi-frame initial images.
In still another embodiment, when the recorded video is subjected to framing to obtain a multi-frame initial image, the one or more first instructions may be loaded and specifically executed by the processor 1001:
performing picture segmentation on the recorded video to obtain multi-frame recorded pictures of the fusion resource images;
determining the multi-frame recorded picture as a multi-frame initial image, or respectively cutting out an image of a target area from each frame recorded picture as a multi-frame initial image; and the region center point of the target region coincides with the display position of the fusion resource image in the terminal screen.
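Cutting a target area out of a recorded picture can be sketched as below. The list-of-rows picture representation, the function name, and the clipping of the area at the picture border are assumptions of this example; the size of the target area is not specified above and is passed in as a parameter.

```python
from typing import List, Tuple

Picture = List[List[int]]  # pixels, row-major; assumed non-empty


def crop_target_area(
    picture: Picture,
    center: Tuple[int, int],
    size: Tuple[int, int],
) -> Picture:
    """Cut out the area of `size` (rows, cols) centered on `center` (row, col).

    `center` is the display position of the fusion resource image in the
    recorded picture. Parts of the area outside the picture are clipped.
    """
    height, width = len(picture), len(picture[0])
    (center_row, center_col), (area_h, area_w) = center, size
    top = center_row - area_h // 2
    left = center_col - area_w // 2
    # Compute the far edges before clamping so the area is clipped, not shifted.
    bottom = min(height, top + area_h)
    right = min(width, left + area_w)
    top, left = max(0, top), max(0, left)
    return [row[left:right] for row in picture[top:bottom]]
```

Cropping to the area around the special effect keeps unrelated screen content, such as chat bubbles, out of the frames handed to object recognition.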
In another embodiment, each frame of initial image has an image sequence number, and the frames of initial images are sequentially arranged according to the image sequence numbers; accordingly, in determining the multiple candidate images of the recorded video from the multiple initial images, the one or more first instructions may be loaded and specifically executed by the processor 1001:
taking the multi-frame initial image as a multi-frame candidate image of the recorded video; or,
and sampling the sequentially arranged multi-frame initial images according to the sampling frequency to obtain multi-frame candidate images of the recorded video.
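Both options for determining the candidate images can be expressed with one helper. Interpreting the sampling frequency as "keep every N-th initial image" is an assumption of this sketch, as are the function and parameter names.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def determine_candidates(
    initial_frames: Sequence[Frame],
    sampling_interval: int = 1,
) -> List[Frame]:
    """Choose candidate images from sequentially arranged initial images.

    With `sampling_interval` equal to 1 every initial image becomes a
    candidate; a larger value keeps one frame out of every
    `sampling_interval` frames, starting with the first.
    """
    if sampling_interval < 1:
        raise ValueError("sampling_interval must be at least 1")
    return list(initial_frames[::sampling_interval])
```

A larger interval reduces the number of frames that must go through object recognition, at the cost of possibly skipping the frame in which the target object first appears.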
In yet another embodiment, the one or more first instructions may be loaded and specifically executed by the processor 1001 when selecting a key frame image from the multiple frame candidate images according to the identification result of each frame candidate image:
screening candidate images comprising the target object from the multi-frame candidate images according to the identification result of each frame of candidate images;
and selecting a key frame image from the screened candidate images of each frame.
In yet another embodiment, when selecting a key frame image from the screened candidate images, the one or more first instructions may be loaded and specifically executed by the processor 1001:
arranging the screened frame candidate images according to the sequence from small to large of the image sequence numbers of the screened frame candidate images to obtain the arrangement positions of the screened frame candidate images;
and selecting the candidate image with the arrangement position at the target position from the sequentially arranged multi-frame candidate images as a key frame image.
In yet another embodiment, when performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image, the one or more first instructions may be loaded and specifically executed by the processor 1001:
performing static image matching processing on the key frame image and the target image element to obtain a matching result;
if the matching result indicates that the key frame image comprises the target image element, determining that the special effect display logic of the dynamic template image is not abnormal;
and if the matching result indicates that the key frame image does not comprise the target image element, determining that the special effect display logic of the dynamic template image is abnormal.
In yet another embodiment, the abnormality of the special effect display logic of the dynamic template image includes at least one of the following: an abnormality in the process of fusing the reference object into the dynamic template image, and an abnormality in the playing process of the fusion resource image; accordingly, the one or more first instructions may also be loaded and specifically executed by the processor 1001:
if the special effect display logic of the dynamic template image is abnormal, outputting alarm information to prompt a technician to analyze the abnormal condition of the special effect display logic of the dynamic template image;
and if the special effect display logic of the dynamic template image is not abnormal, releasing the dynamic template image.
In still another embodiment, the fusion resource image is played in the terminal screen when the special effect shielding function has no abnormality and is in a turned-off state;
the abnormality of the special effect shielding function is detected as follows:
starting the special effect shielding function, and playing a test image of a target resource in a terminal screen, wherein the test image comprises the fusion resource image or the dynamic template image;
recording a screen during the playing process of the test image to obtain a screen recording video about the playing process of the test image; detecting whether the multi-frame image of the screen recording video comprises the target object or not;
if so, determining that the special effect shielding function is abnormal; and if not, determining that the special effect shielding function is not abnormal.
For the dynamic template image of the target resource, the embodiment of the invention provides special effect display logic that fuses other data content into the dynamic template image and plays the resulting special effect image, which can effectively enrich the special effect display modes of the dynamic template image. When anomaly detection is performed on the special effect display logic, the fusion resource image, obtained by fusing the reference object into the dynamic template image of the target resource, can be played first, and the screen can be recorded during playing to obtain a recorded video. Object recognition can then be performed on multiple frames of candidate images of the recorded video to recognize whether each frame of candidate image includes the target object. Because a target object associated with the reference object is introduced, object recognition only needs to determine whether a candidate image includes the target object, not whether it includes the reference object itself, which effectively reduces recognition complexity and saves processing resources. A key frame image including the target object can then be selected from the multi-frame candidate images. Because the key frame image includes the target object, it has a high reference value; performing anomaly detection on the special effect display logic of the dynamic template image according to the key frame image can therefore improve detection efficiency and effectively improve the accuracy of the detection result.
In yet another embodiment, one or more second instructions stored in a computer storage medium may be loaded and executed by the processor 1001 to implement the respective steps of the method described above in relation to the image display method embodiment; in particular implementations, one or more second instructions in the computer storage medium are loaded by the processor 1001 and perform the steps of:
responding to a triggering operation of resource transmission from a first terminal user to a second terminal user in a communication scene, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method;
selecting a resource identifier of the target resource in the resource selection interface;
if a confirmation operation for the resource identifier is detected, playing a target special effect image of the target resource in a terminal screen; the target special effect image is obtained by fusing a user identification of a target terminal user into the dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first terminal user and the second terminal user.
The embodiment of the invention can output the resource selection interface in response to the triggering operation by which the first terminal user in the communication scenario transmits a resource to the second terminal user. The resource selection interface comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal; this effectively ensures the usability of the dynamic template image and reduces the probability that it displays abnormally in actual use. After the user confirms the selection of the resource identifier of the target resource through the resource selection interface, the terminal can fuse the user identification of the first terminal user and/or the user identification of the second terminal user into the dynamic template image of the target resource and play the resulting target special effect image in the terminal screen, which effectively enriches the special effect display modes of the dynamic template image, makes image display more engaging, and improves user stickiness.
It should be noted that, according to an aspect of the present application, there is also provided a computer program product or a computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the methods provided in the various alternatives of the image detection method embodiments shown in fig. 2 or fig. 4 described above, or to perform the methods provided in the various alternatives of the image display method embodiment shown in fig. 6 described above.
It is also to be understood that the foregoing is merely illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.

Claims (14)

1. An image detection method, comprising:
playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;
recording a screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;
object recognition is carried out on the multi-frame candidate images of the recorded video, so that recognition results of the candidate images of each frame are obtained, and the recognition results of any candidate image are used for indicating: whether the any candidate image includes a target object associated with the reference object;
selecting a key frame image from the multi-frame candidate images according to the identification result of each frame of candidate image, wherein the key frame image refers to the candidate image comprising the target object;
performing static image matching processing on the key frame image and the target image element to obtain a matching result;
if the matching result indicates that the key frame image comprises the target image element, determining that the special effect display logic of the dynamic template image is not abnormal;
and if the matching result indicates that the key frame image does not comprise the target image element, determining that the special effect display logic of the dynamic template image is abnormal.
2. The method of claim 1, wherein the target resource is transmitted by a first test user to a second test user in a communication scenario;
the reference object comprises at least one of: the user portrait of the first test user and the user portrait of the second test user; the user portrait is a face image or a gesture image;
when the user portrait is the face image, the target object is the face with the target number; when the user portrait is the gesture image, the target object is a target number of gestures; wherein the target number is the number of user portraits included in the reference object.
3. The method as claimed in claim 2, wherein said performing object recognition on the multiple frames of candidate images of the recorded video to obtain recognition results of each frame of candidate image comprises:
if the user portrait is the face image, carrying out face recognition on the multi-frame candidate images of the recorded video to obtain the recognition result of each frame of candidate image;
and if the user portrait is a gesture image, carrying out gesture recognition on the multi-frame candidate images of the recorded video to obtain recognition results of the candidate images of each frame.
4. The method as claimed in any one of claims 1 to 3, wherein before performing object recognition on the multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image, the method further comprises:
carrying out framing treatment on the recorded video to obtain multi-frame initial images;
and determining multi-frame candidate images of the recorded video from the multi-frame initial images.
5. The method of claim 4, wherein framing the recorded video to obtain a multi-frame initial image comprises:
performing picture segmentation on the recorded video to obtain multi-frame recorded pictures of the fusion resource images;
determining the multi-frame recorded picture as a multi-frame initial image, or respectively cutting out an image of a target area from each frame recorded picture as a multi-frame initial image; and the region center point of the target region coincides with the display position of the fusion resource image in the terminal screen.
6. The method of claim 4, wherein each frame of initial image has an image sequence number, and wherein the frames of initial images are arranged in sequence according to the image sequence number; the determining the multi-frame candidate image of the recorded video from the multi-frame initial image comprises the following steps:
taking the multi-frame initial image as a multi-frame candidate image of the recorded video; or,
sampling the sequentially arranged multi-frame initial images according to a sampling frequency to obtain the multi-frame candidate images of the recorded video.
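The two alternatives of claim 6 can be sketched as follows. The claims do not define "sampling frequency"; this example assumes it means keeping every N-th initial image, and the names `select_candidates` and `sampling_step` are hypothetical.

```python
from typing import List, Optional, Tuple

# (image sequence number, hypothetical frame payload)
InitialImage = Tuple[int, str]


def select_candidates(initial_images: List[InitialImage],
                      sampling_step: Optional[int] = None) -> List[InitialImage]:
    """Return the candidate images of the recorded video.

    With no sampling step, every initial image is a candidate. Otherwise the
    images are ordered by sequence number and every `sampling_step`-th one
    is kept, starting from the first.
    """
    ordered = sorted(initial_images, key=lambda image: image[0])
    if sampling_step is None:
        return ordered
    if sampling_step < 1:
        raise ValueError("sampling_step must be a positive integer")
    return ordered[::sampling_step]


# Usage: five initial images supplied out of order, sampled every 2nd frame.
frames = [(3, "c"), (1, "a"), (2, "b"), (5, "e"), (4, "d")]
sampled = select_candidates(frames, sampling_step=2)
```

Sampling reduces the number of frames passed to object recognition, at the cost of possibly skipping frames that contain the target object.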
7. A method according to any one of claims 1 to 3, wherein selecting a key frame image from the plurality of frame candidate images according to the identification result of each frame candidate image comprises:
screening candidate images comprising the target object from the multi-frame candidate images according to the identification result of each frame of candidate images;
and selecting the key frame image from the screened candidate images.
8. The method of claim 7, wherein selecting a key frame image from the screened frame candidate images comprises:
arranging the screened candidate images in ascending order of their image sequence numbers to obtain the arrangement position of each screened candidate image;
and selecting, from the sequentially arranged candidate images, the candidate image whose arrangement position is the target position as the key frame image.
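The screening and selection steps of claims 7 and 8 can be sketched as below. The `Candidate` type, the zero-based `target_position`, and the choice to return `None` when too few candidates survive screening are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    seq: int          # image sequence number
    has_target: bool  # recognition result: does the frame include the target object?


def select_key_frame(candidates: List[Candidate],
                     target_position: int = 0) -> Optional[Candidate]:
    """Screen candidates that include the target object, order them by
    ascending sequence number, and pick the one at `target_position`."""
    screened = sorted((c for c in candidates if c.has_target),
                      key=lambda c: c.seq)
    if 0 <= target_position < len(screened):
        return screened[target_position]
    return None  # assumption: no key frame when the position is out of range


# Usage: frames 3, 5 and 9 contain the target object; position 1 selects frame 5.
candidates = [Candidate(5, True), Candidate(2, False),
              Candidate(3, True), Candidate(9, True)]
key_frame = select_key_frame(candidates, target_position=1)
```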
9. The method of claim 1, wherein an abnormality in the special effect display logic of the dynamic template image includes at least one of: an abnormality in the process of fusing the reference object into the dynamic template image, and an abnormality in the playing process of the fused resource image;
the method further comprises the steps of:
if the special effect display logic of the dynamic template image is abnormal, outputting alarm information to prompt a technician to analyze the abnormal condition of the special effect display logic of the dynamic template image;
and if the special effect display logic of the dynamic template image is not abnormal, releasing the dynamic template image.
10. A method according to any one of claims 1 to 3, wherein the fused resource image is played in the terminal screen in a case where no abnormality occurs in the special effect mask function and the special effect mask function is in an off state;
the abnormality of the special effect mask function is detected as follows:
enabling the special effect mask function, and playing a test image of the target resource in the terminal screen, wherein the test image comprises the fusion resource image or the dynamic template image;
recording the screen during the playing of the test image to obtain a screen-recording video of the playing process of the test image; detecting whether the multiple frames of images of the screen-recording video include the target object;
if so, determining that the special effect mask function is abnormal; and if not, determining that the special effect mask function is not abnormal.
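The decision rule in claim 10 can be written as a short sketch: with the mask function enabled, any recorded frame that still shows the target object indicates an abnormality. The function name and the `contains_target` callback, which stands in for the unspecified object detector, are hypothetical.

```python
from typing import Callable, Iterable, TypeVar

FrameT = TypeVar("FrameT")


def mask_function_is_abnormal(frames: Iterable[FrameT],
                              contains_target: Callable[[FrameT], bool]) -> bool:
    """Return True if any frame recorded with the mask enabled still
    includes the target object, i.e. the mask failed to hide the effect."""
    return any(contains_target(frame) for frame in frames)


# Usage with a stand-in detector: each frame is the set of labels found in it.
recorded = [{"background"}, {"background", "face"}, {"background"}]
abnormal = mask_function_is_abnormal(recorded, lambda labels: "face" in labels)
```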
11. An image display method, comprising:
responding to a triggering operation of resource transmission from a first terminal user to a second terminal user in a communication scene, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, wherein the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method as set forth in any one of claims 1 to 10;
selecting a resource identifier of the target resource in the resource selection interface;
if a confirmation operation for the resource identifier is detected, playing a target special effect image of the target resource in a terminal screen; the target special effect image is obtained by fusing a user identifier of a target terminal user into the dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first terminal user and the second terminal user.
12. An image detection apparatus, comprising:
a playing unit, configured to play a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;
a screen recording unit, configured to record the screen during the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;
a processing unit, configured to perform object recognition on multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image, wherein the recognition result of any candidate image indicates whether that candidate image includes a target object associated with the reference object;
the processing unit is further configured to select a key frame image from the multiple frames of candidate images according to the recognition result of each frame of candidate image, wherein the key frame image is a candidate image including the target object;
the processing unit is further configured to perform static image matching on the key frame image and a target image element to obtain a matching result; if the matching result indicates that the key frame image includes the target image element, determine that the special effect display logic of the dynamic template image is not abnormal; and if the matching result indicates that the key frame image does not include the target image element, determine that the special effect display logic of the dynamic template image is abnormal.
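The final matching step performed by the processing unit can be illustrated with a deliberately simple sketch. The claims do not say how static image matching is carried out; this example assumes exact pixel equality between the target image element and some sub-region of the key frame, whereas a practical system would more likely use a tolerance-based template match. All names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # hypothetical representation: rows of pixel values


def contains_element(image: Image, element: Image) -> bool:
    """Return True if `element` appears as an exact sub-region of `image`."""
    image_h, image_w = len(image), len(image[0])
    elem_h, elem_w = len(element), len(element[0])
    for top in range(image_h - elem_h + 1):
        for left in range(image_w - elem_w + 1):
            if all(image[top + r][left:left + elem_w] == element[r]
                   for r in range(elem_h)):
                return True
    return False


def effect_logic_is_abnormal(key_frame: Image, target_element: Image) -> bool:
    """The special effect display logic is treated as abnormal when the
    key frame does not include the target image element."""
    return not contains_element(key_frame, target_element)


# Usage: the 2x2 element [[5, 6], [8, 9]] sits in the lower-right corner.
key_frame = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
abnormal = effect_logic_is_abnormal(key_frame, [[5, 6], [8, 9]])
```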
13. An image display device, comprising:
an output unit, configured to output a resource selection interface in response to a triggering operation of resource transmission from a first terminal user to a second terminal user in a communication scene; the resource selection interface at least comprises a resource identifier of a target resource, wherein the resource identifier is displayed in the resource selection interface under the condition that the special effect display logic of the dynamic template image of the target resource is determined to be not abnormal by adopting the image detection method as set forth in any one of claims 1 to 10;
a processing unit, configured to select the resource identifier of the target resource in the resource selection interface;
the output unit is further configured to play a target special effect image of the target resource in a terminal screen if a confirmation operation for the resource identifier is detected; the target special effect image is obtained by fusing a user identifier of a target terminal user into the dynamic template image of the target resource, and the target terminal user comprises at least one of the following: the first terminal user and the second terminal user.
14. A terminal comprising an input device and an output device, further comprising:
a processor adapted to implement one or more instructions; and
a computer storage medium storing one or more first instructions adapted to be loaded by the processor and to perform the image detection method of any one of claims 1-10; alternatively, the computer storage medium stores one or more second instructions adapted to be loaded by the processor and to perform the image display method of claim 11.
CN202011151676.2A 2020-10-23 2020-10-23 Image detection method, image display method, related device and storage medium Active CN112307925B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011151676.2A CN112307925B (en) 2020-10-23 2020-10-23 Image detection method, image display method, related device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011151676.2A CN112307925B (en) 2020-10-23 2020-10-23 Image detection method, image display method, related device and storage medium

Publications (2)

Publication Number Publication Date
CN112307925A CN112307925A (en) 2021-02-02
CN112307925B true CN112307925B (en) 2023-11-28

Family

ID=74330340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011151676.2A Active CN112307925B (en) 2020-10-23 2020-10-23 Image detection method, image display method, related device and storage medium

Country Status (1)

Country Link
CN (1) CN112307925B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111356016B (en) 2020-03-11 2022-04-22 北京小米松果电子有限公司 Video processing method, video processing apparatus, and storage medium
CN114501041B (en) * 2021-04-06 2023-07-14 抖音视界有限公司 Special effect display method, device, equipment and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
WO2018095252A1 (en) * 2016-11-28 2018-05-31 优酷网络技术(北京)有限公司 Video recording method and device
CN108958610A (en) * 2018-07-27 2018-12-07 北京微播视界科技有限公司 Special efficacy generation method, device and electronic equipment based on face
CN110851652A (en) * 2019-11-29 2020-02-28 京东方科技集团股份有限公司 Method and device for assisting in viewing driving record video

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
WO2010002921A1 (en) * 2008-07-01 2010-01-07 Yoostar Entertainment Group, Inc. Interactive systems and methods for video compositing

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
WO2018095252A1 (en) * 2016-11-28 2018-05-31 优酷网络技术(北京)有限公司 Video recording method and device
CN108958610A (en) * 2018-07-27 2018-12-07 北京微播视界科技有限公司 Special efficacy generation method, device and electronic equipment based on face
WO2020019663A1 (en) * 2018-07-27 2020-01-30 北京微播视界科技有限公司 Face-based special effect generation method and apparatus, and electronic device
CN110851652A (en) * 2019-11-29 2020-02-28 京东方科技集团股份有限公司 Method and device for assisting in viewing driving record video

Non-Patent Citations (1)

Title
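Key frame extraction method based on moving-target saliency in surveillance video; Hu Yuanyuan et al.; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition); Vol. 36, No. 1; full text *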

Also Published As

Publication number Publication date
CN112307925A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
US10873769B2 (en) Live broadcasting method, method for presenting live broadcasting data stream, and terminal
CN106658200B (en) Live video sharing and acquiring method and device and terminal equipment thereof
CN102799383B (en) Screen sectional drawing method and screen sectional drawing device for mobile terminals
CN105871690B (en) Method and device for realizing instant messaging
CN112307925B (en) Image detection method, image display method, related device and storage medium
CN111654730B (en) Video playing method, data processing method, related device and medium
CN109274999A (en) A kind of video playing control method, device, equipment and medium
CN110389697B (en) Data interaction method and device, storage medium and electronic device
CN113014854B (en) Method, device, equipment and medium for generating interactive record
CN107168974A (en) The display control method and device of message in display items related content, social class application
CN111901695A (en) Video content interception method, device and equipment and computer storage medium
US11163827B2 (en) Video processing method, device, terminal and storage medium
CN116980605A (en) Video processing method, apparatus, computer device, storage medium, and program product
CN115278346B (en) Method for sending comments and receiving comments in live broadcasting room and related equipment
CN110619513A (en) Electronic resource obtaining method, electronic resource distributing method and related equipment
CN109672710B (en) File uploading method, system and equipment
CN116049490A (en) Material searching method and device and electronic equipment
CN115238125A (en) Data processing method and device, computer equipment and readable storage medium
CN112351420A (en) Networking identity creating method and device of terminal device and readable storage medium
CN115563320A (en) Information reply method, device, electronic equipment, computer storage medium and product
CN115563319A (en) Information reply method, device, electronic equipment, computer storage medium and product
CN112165626A (en) Image processing method, resource acquisition method, related device and medium
CN115599335B (en) Method and system for sharing layout files based on multi-screen mode
CN116828018A (en) Game video processing method, device and equipment for cloud game and storage medium
CN111898640B (en) Method and device for pushing pictures by analog snapshot machine, test system and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40038802; Country of ref document: HK)

GR01 Patent grant