CN112307925A

CN112307925A - Image detection method, image display method, related device and storage medium

Info

Publication number: CN112307925A
Application number: CN202011151676.2A
Authority: CN
Inventors: 蒋政胜
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-10-23
Filing date: 2020-10-23
Publication date: 2021-02-02
Anticipated expiration: 2040-10-23
Also published as: CN112307925B

Abstract

The embodiment of the invention discloses an image detection method, an image display method, related equipment and a storage medium, wherein the image detection method comprises the following steps: playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource; recording the screen during the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image; carrying out object identification on a plurality of frames of candidate images of the recorded video to obtain an identification result of each frame of candidate image; selecting a key frame image from the plurality of frames of candidate images according to the identification result of each frame of candidate image; and carrying out anomaly detection on the special effect display logic of the dynamic template image according to the key frame image. The embodiment of the invention can enrich the special effect display logic of the dynamic template image of the target resource and improve user stickiness; the special effect display logic can also be detected more reliably, which improves the accuracy of the detection result.

Description

Image detection method, image display method, related device and storage medium

Technical Field

The present invention relates to the field of internet technologies, in particular to the field of image processing technologies, and more particularly to an image detection method, an image display method, an image detection apparatus, an image display apparatus, a terminal, and a computer storage medium.

Background

With the development of internet technology, more and more clients support a communication function, so that any user can exchange information with other users through the communication function. In the information interaction process, the client can also support any user in sending one or more resources (such as virtual gifts, electronic red envelopes and the like) to other users; after a trigger operation of the user for sending a resource is detected, the template image of the target resource sent by the user is played according to the special effect display logic of that template image, which makes the interaction more engaging.

At present, after a template image of any resource is generated, it is usually necessary to perform anomaly detection on the special effect display logic of the template image, and the template image is released only after it is ensured that no anomaly occurs in the special effect display logic. In this way, after the trigger operation of sending the resource is detected in practical application, the template image can be played normally according to its special effect display logic. Based on the above, how to detect anomalies in the special effect display logic of the template image of a resource has become a research hotspot.

Disclosure of Invention

The embodiment of the invention provides an image detection method, an image display method, related equipment and a storage medium, which can better detect a brand-new special effect display logic and improve the accuracy of the detection result; moreover, through the brand-new special effect display logic, the special effect display modes of the dynamic template image can be enriched, and user stickiness is improved.

In one aspect, an embodiment of the present invention provides an image detection method, where the method includes:

playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;

recording the screen during the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;

carrying out object identification on a plurality of frames of candidate images of the recorded video to obtain an identification result of each frame of candidate image, wherein the identification result of any candidate image is used for indicating whether that candidate image includes a target object associated with the reference object;

selecting a key frame image from the plurality of frames of candidate images according to the identification result of each frame of candidate image, wherein the key frame image is a candidate image comprising the target object;

and carrying out anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.

In another aspect, an embodiment of the present invention provides an image detection apparatus, where the apparatus includes:

a playing unit, configured to play a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource;

a screen recording unit, configured to record the screen during the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image;

a processing unit, configured to perform object identification on multiple frames of candidate images of the recorded video to obtain an identification result of each frame of candidate image, where the identification result of any candidate image is used to indicate whether that candidate image includes a target object associated with the reference object;

the processing unit is further configured to select a key frame image from the multiple frames of candidate images according to the identification result of each frame of candidate image, where the key frame image is a candidate image including the target object;

the processing unit is further configured to perform anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.

In one embodiment, the target resource is sent from a first test user to a second test user in a communication scenario;

the reference object comprises at least one of: the user portrait of the first test user and the user portrait of the second test user; the user portrait is a face image or a gesture image;

when the user portrait is the face image, the target object is a target number of faces; when the user portrait is the gesture image, the target object is a target number of gestures; wherein the target number is the number of user portraits comprised by the reference object.

In another embodiment, when the processing unit is configured to perform object recognition on multiple candidate images of the recorded video to obtain a recognition result of each candidate image, the processing unit may be specifically configured to:

if the user portrait is the face image, carrying out face recognition on the multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image;

and if the user portrait is a gesture image, performing gesture recognition on the multi-frame candidate image of the recorded video to obtain a recognition result of each frame of candidate image.

In another embodiment, before the processing unit is configured to perform object recognition on multiple candidate images of the recorded video to obtain a recognition result of each candidate image, the processing unit may be further configured to:

performing frame division processing on the recorded video to obtain a plurality of frames of initial images;

and determining a plurality of candidate images of the recorded video from the plurality of initial images.

In another embodiment, when the processing unit is configured to perform framing processing on the recorded video to obtain multiple frames of initial images, the processing unit may be specifically configured to:

carrying out picture segmentation on the recorded video to obtain a multi-frame recorded picture of the fusion resource image;

determining the multi-frame recording picture as a multi-frame initial image, or cutting an image of a target area from each frame recording picture as the multi-frame initial image; and the central point of the target area coincides with the display position of the fusion resource image in the terminal screen.
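By way of illustration only, cutting the target area out of one recorded picture could be sketched as follows. The frame representation (rows of pixel values), the function name and the width and height parameters are assumptions made for this sketch and are not taken from the embodiment; only the idea that the center of the target area coincides with the display position comes from the text above.

```python
from typing import List

Frame = List[List[int]]  # hypothetical frame type: rows of pixel values


def crop_target_area(frame: Frame, center_x: int, center_y: int,
                     width: int, height: int) -> Frame:
    """Cut a width x height region whose center is (center_x, center_y).

    The region is clamped to the frame borders, so it may be smaller
    than requested when the center lies near an edge.
    """
    frame_height = len(frame)
    frame_width = len(frame[0]) if frame_height else 0
    left = max(0, center_x - width // 2)
    top = max(0, center_y - height // 2)
    right = max(left, min(frame_width, center_x - width // 2 + width))
    bottom = max(top, min(frame_height, center_y - height // 2 + height))
    return [row[left:right] for row in frame[top:bottom]]
```

Cropping before recognition keeps only the part of the screen where the fusion resource image is shown, which is presumably why the embodiment offers it as an alternative to using the full recorded picture.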

In another embodiment, each frame of initial image has an image sequence number, and the initial images of each frame are arranged in sequence according to the image sequence numbers; correspondingly, when the processing unit is configured to determine, from the multiple initial images, multiple candidate images of the recorded video, the processing unit may be specifically configured to:

taking the multi-frame initial image as a multi-frame candidate image of the recorded video; or,

and sampling the multiple frames of initial images which are sequentially arranged according to the sampling frequency to obtain multiple frames of candidate images of the recorded video.
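A minimal sketch of the two options above, assuming that the "sampling frequency" can be expressed as "keep every N-th initial image". The tuple layout and the parameter name sampling_step are hypothetical; a step of 1 corresponds to taking all initial images as candidate images.

```python
from typing import List, Tuple

# (image sequence number, hypothetical frame payload)
InitialImage = Tuple[int, str]


def determine_candidates(initial_images: List[InitialImage],
                         sampling_step: int = 1) -> List[InitialImage]:
    """Order initial images by sequence number and keep every sampling_step-th one."""
    if sampling_step < 1:
        raise ValueError("sampling_step must be at least 1")
    ordered = sorted(initial_images, key=lambda image: image[0])
    return ordered[::sampling_step]
```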

In another embodiment, when the processing unit is configured to select a key frame image from the multiple frame candidate images according to the identification result of each frame candidate image, the processing unit may be specifically configured to:

screening out candidate images comprising the target object from the multiple frames of candidate images according to the identification result of each frame of candidate image;

and selecting a key frame image from the screened candidate images of each frame.

In another embodiment, the processing unit, when being configured to select a key frame image from the screened frame candidate images, may be specifically configured to:

arranging the screened candidate images in ascending order of their image sequence numbers to obtain the arrangement position of each screened candidate image;

and selecting, from the sequentially arranged candidate images, the candidate image whose arrangement position is the target position as the key frame image.
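The screening and selection steps above can be sketched as follows. The Candidate class, its field names and the zero-based target_position are assumptions of this sketch; the embodiment does not state here how the target position is chosen or what happens when no candidate qualifies, so the sketch simply returns None in that case.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    sequence_number: int   # image sequence number of the candidate image
    contains_target: bool  # identification result for this candidate image


def select_key_frame(candidates: Iterable[Candidate],
                     target_position: int = 0) -> Optional[Candidate]:
    """Screen candidates that include the target object, sort them by
    sequence number, and return the one at target_position (zero-based)."""
    screened = sorted(
        (c for c in candidates if c.contains_target),
        key=lambda c: c.sequence_number,
    )
    if 0 <= target_position < len(screened):
        return screened[target_position]
    return None  # assumption: no key frame when the position is out of range
```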

In another embodiment, when the processing unit is configured to perform the anomaly detection on the special effect presentation logic of the dynamic template image according to the key frame image, the processing unit may be specifically configured to:

performing static image matching processing on the key frame image and the target image element to obtain a matching result;

if the matching result indicates that the key frame image comprises the target image element, determining that the special effect display logic of the dynamic template image is not abnormal;

and if the matching result indicates that the key frame image does not comprise the target image element, determining that the special effect display logic of the dynamic template image is abnormal.
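For illustration, the decision rule above can be sketched with an exact sub-image search standing in for the static image matching. This stand-in is an assumption: the text does not specify the matching algorithm, and a practical matcher would normally tolerate scaling and compression noise. All names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # hypothetical image type: rows of pixel values


def contains_element(key_frame: Image, element: Image) -> bool:
    """Return True if element occurs pixel-for-pixel somewhere in key_frame."""
    if not key_frame or not element or not element[0]:
        return False
    element_h, element_w = len(element), len(element[0])
    frame_h, frame_w = len(key_frame), len(key_frame[0])
    for top in range(frame_h - element_h + 1):
        for left in range(frame_w - element_w + 1):
            if all(key_frame[top + dy][left:left + element_w] == element[dy]
                   for dy in range(element_h)):
                return True
    return False


def display_logic_is_abnormal(key_frame: Image, element: Image) -> bool:
    """Abnormal when the key frame does not include the target image element."""
    return not contains_element(key_frame, element)
```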

In another embodiment, an abnormality in the special effect display logic of the dynamic template image includes at least one of the following situations: the process of fusing the reference object into the dynamic template image is abnormal, and the playing process of the fusion resource image is abnormal; accordingly, the processing unit may be further configured to:

if the special effect display logic of the dynamic template image is abnormal, outputting alarm information to prompt a technician to analyze the abnormal condition of the special effect display logic of the dynamic template image;

and if the special effect display logic of the dynamic template image is not abnormal, issuing the dynamic template image.

In another embodiment, the fusion resource image is played in the terminal screen when the special effect shielding function is not abnormal and the special effect shielding function is in a closed state;

the abnormality detection method of the special effect shielding function is as follows:

starting the special effect shielding function, and playing a test image of a target resource in a terminal screen, wherein the test image comprises the fusion resource image or the dynamic template image;

recording a screen in the playing process of the test image to obtain a screen recording video about the playing process of the test image; detecting whether a multi-frame image of the screen recording video comprises the target object or not;

if yes, determining that the special effect shielding function is abnormal; and if not, determining that the special effect shielding function is not abnormal.
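The check above reduces to "the shielding function is abnormal if any frame recorded while shielding is on still shows the target object". A minimal sketch, in which the detector is passed in as a hypothetical predicate rather than implemented:

```python
from typing import Callable, Iterable, TypeVar

Frame = TypeVar("Frame")


def shielding_function_is_abnormal(
    recorded_frames: Iterable[Frame],
    contains_target_object: Callable[[Frame], bool],
) -> bool:
    """With shielding turned on, no recorded frame should show the target
    object; a single frame that does means the function is abnormal."""
    return any(contains_target_object(frame) for frame in recorded_frames)
```

In this sketch a frame is whatever the caller's detector accepts, for example a set of labels produced by a face or gesture recognizer.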

In another aspect, an embodiment of the present invention provides a terminal, where the terminal includes an input device and an output device, and the terminal further includes:

a processor adapted to implement one or more instructions; and

a computer storage medium storing one or more first instructions adapted to be loaded by the processor and to perform the steps of:

In yet another aspect, an embodiment of the present invention provides a computer storage medium, where one or more first instructions are stored, and the one or more first instructions are adapted to be loaded by a processor and perform the following steps:

The embodiment of the invention provides a special effect display logic which can fuse other data contents in a dynamic template image and play a fused special effect image aiming at the dynamic template image of a target resource, and can effectively enrich a special effect display mode of the dynamic template image. When the special effect display logic is detected abnormally, a fused resource image obtained by fusing the reference object to the dynamic template image of the target resource can be played first, and a recorded video is obtained by recording a screen in the playing process of the fused resource image. Then, object recognition can be carried out on the multi-frame candidate images of the recorded video so as to recognize whether each frame of candidate image comprises a target object or not; by introducing the target object associated with the reference object, only whether the target object is included in the candidate image or not can be recognized during object recognition, and whether the reference object is included in the candidate image or not does not need to be recognized, so that the recognition complexity can be effectively reduced, and the processing resource is saved. Then, a key frame image including the target object may be selected from the plurality of candidate images; because the key frame image comprises the target object, the key frame image can have higher reference degree, and the special effect display logic of the dynamic template image is subjected to anomaly detection according to the key frame image, so that the detection efficiency can be improved, and the accuracy of the detection result can be effectively improved.

In another aspect, an embodiment of the present invention provides an image display method, where the method includes:

responding to a triggering operation of sending the resource to a second terminal user by a first terminal user in a communication scene, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the image detection method is adopted to determine that the special effect display logic of the dynamic template image of the target resource is not abnormal;

selecting the resource identifier of the target resource in the resource selection interface;

if a confirmation operation for the resource identifier is detected, playing a target special effect image of the target resource in a terminal screen; the target special effect image is obtained by fusing a user identifier of a target terminal user into the dynamic template image of the target resource, wherein the target terminal user comprises at least one of the following: the first terminal user and the second terminal user.

In still another aspect, an embodiment of the present invention provides an image display apparatus, including:

the output unit is used for responding to the triggering operation of sending the resource to the second terminal user by the first terminal user in the communication scene and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the image detection method is adopted to determine that the special effect display logic of the dynamic template image of the target resource is not abnormal;

the processing unit is used for selecting the resource identifier of the target resource in the resource selection interface;

the output unit is further used for playing a target special effect image of the target resource in a terminal screen if a confirmation operation for the resource identifier is detected; the target special effect image is obtained by fusing a user identifier of a target terminal user into the dynamic template image of the target resource, wherein the target terminal user comprises at least one of the following: the first terminal user and the second terminal user.

a computer storage medium storing one or more second instructions adapted to be loaded by the processor and to perform the steps of:

In yet another aspect, an embodiment of the present invention provides a computer storage medium, where one or more second instructions are stored, and the one or more second instructions are adapted to be loaded by a processor and perform the following steps:

The embodiment of the invention can respond to the triggering operation of sending the resource to the second terminal user by the first terminal user in the communication scene, and output the resource selection interface. The resource selection interface comprises a resource identifier of the target resource, and the resource identifier is displayed in the resource selection interface only when the special effect display logic of the dynamic template image of the target resource has been determined not to be abnormal; therefore, the usability of the dynamic template image can be effectively ensured, and the probability of a display anomaly of the dynamic template image in actual use is reduced. After a user confirms the selection of the resource identifier of the target resource through the resource selection interface, the terminal can fuse the user identifier of the first terminal user and/or the user identifier of the second terminal user into the dynamic template image of the target resource and play the resulting target special effect image of the target resource in the terminal screen, so that the special effect display modes of the dynamic template image are effectively enriched, the image display becomes more engaging, and user stickiness is improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1a is a system architecture diagram of a communication system according to an embodiment of the present invention;

FIG. 1b is a schematic diagram of an image detection scheme provided by an embodiment of the present invention;

FIG. 2 is a schematic flow chart of an image detection method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a method for generating a fusion resource image according to an embodiment of the invention;

FIG. 4 is a flowchart illustrating an image detection method according to another embodiment of the present invention;

FIG. 5a is a schematic diagram of obtaining an initial image by framing according to an embodiment of the present invention;

FIG. 5b is a schematic diagram of object recognition provided by an embodiment of the present invention;

FIG. 5c is a schematic diagram of another object recognition provided by embodiments of the present invention;

FIG. 5d is a schematic diagram illustrating a comparison of adjacent initial images according to an embodiment of the present invention;

FIG. 5e is a schematic diagram of a candidate image according to an embodiment of the present invention;

FIG. 5f is a diagram of an effect mark provided by an embodiment of the present invention;

FIG. 6 is a flowchart illustrating an image displaying method according to an embodiment of the present invention;

FIG. 7a is a schematic diagram of a resource selection interface according to an embodiment of the present invention;

FIG. 7b is a diagram of a selected resource identifier according to an embodiment of the present invention;

FIG. 7c is a schematic diagram of playing a target special effect image according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of an image detection apparatus according to an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of an image display device according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the embodiments described below are only a few embodiments of the present application and are not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the various embodiments of the present application without inventive step, are within the scope of the present application.

In an embodiment of the present invention, a communication system is involved; referring to FIG. 1a, the communication system may comprise at least: at least two terminals 11 and a server 12. Any one of the terminals 11 and the server 12 may communicate with each other through any communication method, such as the 3rd Generation Partnership Project (3GPP), fourth generation mobile communication technology (4G), fifth generation mobile communication technology (5G), Long Term Evolution (LTE), Worldwide Interoperability for Microwave Access (WiMAX), or computer network communication based on the TCP/IP protocol suite, the User Datagram Protocol (UDP), and so on.

Any terminal 11 is any intelligent device with a communication function and a human-computer interaction function, which may include, but is not limited to: smart phones, tablets, smart televisions, wearable devices, laptop computers, desktop computers, and the like. In addition, a communication APP can be installed and run in any terminal 11; here, the communication APP refers to an APP (application) with a communication function, such as a social APP (e.g., an instant messaging APP or a content interaction APP), a live-streaming APP, and the like. Optionally, various other APPs, such as a video playing APP and a music playing APP, may also be installed and run in any terminal 11. The server 12 is a service device that can establish a communication connection between any two terminals 11 and provide services such as an information interaction service and a resource processing service for each terminal 11; it may include, but is not limited to: data processing servers, application servers, web servers, and the like.

The above-mentioned resources refer to anything that can be transferred between users, which may include, but is not limited to: virtual resources, object resources, and electronic resources, among others. A virtual resource refers to a resource which circulates on the network and has no physical entity, such as a virtual gift in a live-streaming scene, a game item in a game scene, and the like; an object resource refers to a resource with a physical entity, such as a car that can be driven, a microphone that can be used for singing, and so on; an electronic resource refers to money in electronic form, such as an electronic red envelope in a social scene, a money transfer between users, and so forth. Any resource can have a corresponding dynamic template image; a dynamic template image is a template image that produces a dynamic effect by switching among a plurality of static template images at a preset frequency.

It should be understood that FIG. 1a merely schematically represents the system architecture of the communication system, and is not intended to limit it. For example, the number of terminals in FIG. 1a is 4; but in other embodiments the number of terminals is not limited to 4 and may be 2, 3, 10, etc. As another example, the server 12 in FIG. 1a is a stand-alone service device; however, in other embodiments, the server 12 may be deployed as a cluster formed by a plurality of service devices, for example, a cluster formed by at least a communication server for providing the information interaction service and a resource server for providing the resource processing service, and so on.

In a specific implementation, the operation principle of the above-mentioned communication system is as follows: when any user has a communication demand, a communication request can be sent to the server through the communication APP in any terminal in the communication system, so as to request the server to establish a communication connection between that terminal and the terminals used by at least one other user. After the server successfully establishes the communication connection, the user can exchange information with each other user through the communication function provided by the communication APP in the terminal. For example, if the communication APP is a social APP and the other users are all contacts of the user, the user can hold a chat session with each other user through the communication function provided by the social APP; for another example, if the communication APP is a live-streaming APP, and the other users include an anchor user (i.e., the user who is live streaming) and other audience users (users watching the live stream), the user can watch the live picture of the anchor user through the communication function provided by the live-streaming APP, view comment information sent by the other audience users, and the like. In the process of information interaction between the user and each other user, if the terminal detects a trigger operation by which the user sends a certain resource to a target user, the resource is sent to the target user; moreover, a dynamic template image of the resource can be acquired, and special effect playing is then carried out in the terminal screen of the terminal according to the dynamic template image.
Optionally, any other terminal (e.g., the terminal used by the target user who receives the resource, or a terminal used by a remaining user other than the sending user and the target user) may also obtain the dynamic template image of the resource and perform special effect playing on its own terminal screen according to the dynamic template image, and so on.

In order to enrich the special effect display modes of the dynamic template image of a resource, the embodiment of the invention provides a brand-new special effect display logic. Specifically, the special effect display logic is the logic of fusing other data content into the dynamic template image and playing the fused special effect image; the other data content mentioned herein may include, but is not limited to: the user identifier of the resource sender, the user identifier of the resource receiver, the user identifiers of other users in the same communication scene as the resource sender and the resource receiver, data content preset according to business requirements, and the like. In addition, the embodiment of the invention also provides an image detection scheme to better detect anomalies in the special effect display logic of the dynamic template image. In a specific implementation, the image detection scheme may be executed by the test terminal or by a communication APP running in the test terminal; for convenience of illustration, the following description takes execution by the test terminal as an example.

Referring to FIG. 1b, the general principle of this image detection scheme is as follows: firstly, a reference object for detection can be fused into a dynamic template image to be detected, so as to obtain a fused special effect image. Secondly, the special effect image can be played and the whole playing process can be recorded and collected. Then, the video collected by the screen recording can be subjected to framing processing, and corresponding algorithms are adopted to perform object identification on each framed image so as to find out the key frame image fused with the reference object. Finally, the key frame image can be subjected to image recognition, so that whether the dynamic template image is normally fused with the reference object or not and whether the fused special effect image is normally displayed and played or not are judged, and the abnormity detection of the special effect display logic of the dynamic template image is further realized.
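As a rough end-to-end sketch of the principle just described, the recognition steps can be modeled as two predicates supplied by the caller. The function name, the predicates and the zero-based target_position are all assumptions of this sketch, and screen recording and framing are taken as already done (recorded_frames is the framed video). The text does not say what happens when no frame shows the target object, so the sketch returns None in that case.

```python
from typing import Callable, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def detect_display_logic_anomaly(
    recorded_frames: Sequence[Frame],
    has_target_object: Callable[[Frame], bool],
    has_target_element: Callable[[Frame], bool],
    target_position: int = 0,
) -> Optional[bool]:
    """Return True if abnormal, False if normal, None if no key frame exists."""
    # Object recognition: keep only the frames that show the target object.
    screened = [frame for frame in recorded_frames if has_target_object(frame)]
    if not 0 <= target_position < len(screened):
        return None  # assumption: this case is not covered by the description
    key_frame = screened[target_position]
    # Image recognition on the key frame decides the outcome.
    return not has_target_element(key_frame)
```

Passing the recognizers in as parameters keeps the sketch independent of any particular face, gesture or image matching library.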

Practice shows that the image detection scheme provided by the embodiment of the invention has at least the following beneficial effects: first, the background log system does not need to be checked at any point in the detection process, which effectively lowers the detection threshold; second, the test terminal automatically performs anomaly detection on the special effect display logic of the dynamic template image through a series of operations such as screen recording, framing, object recognition and image recognition, so no manual detection by a user is needed and labor cost is effectively saved; third, the image recognition part does not need to collect a large amount of material and only needs to recognize the key frame image, which reduces the detection time to a certain extent and effectively improves the detection efficiency; and fourth, anomaly detection is performed on the special effect display logic of the dynamic template image according to a key frame image with a higher reference value, which effectively improves the accuracy of the detection result.

Based on the related description of the image detection scheme, the embodiment of the invention provides an image detection method; the image detection method can be executed by a test terminal or a communication APP operated in the test terminal; for convenience of illustration, the image detection method performed by the test terminal is described as an example. Referring to fig. 2, the image detection method may include the following steps S201 to S205:

S201, playing a fusion resource image of the target resource in a terminal screen.

In the embodiment of the invention, the target resource can be the fusion resource corresponding to any dynamic template image to be detected; the term "fusion resource" refers to a resource corresponding to a dynamic template image that has the special effect display logic mentioned above. That is, if the dynamic template image of a resource has the above-mentioned special effect display logic, the resource may be referred to as a fusion resource; the resources referred to herein may be physical resources, electronic resources, virtual resources, and the like. Specifically, when the resource is a physical resource, the target resource may be any fusion physical resource; when the resource is an electronic resource, the target resource can be any fusion electronic resource; when the resource is a virtual resource, the target resource may be any fusion virtual resource, such as a fusion gift, a fusion item, and so on. The fusion gift refers to a virtual gift resource corresponding to a dynamic template image that has the special effect display logic; that is, after an audience user presents a virtual gift to an anchor user in a live broadcast scene, if the user identifier of the audience user and/or the user identifier of the anchor user can be fused into the dynamic template image of the virtual gift so that the fused special effect image is played, the virtual gift may be called a fusion gift. The user identifier mentioned herein may comprise at least one of: a user portrait and a user name; the user portrait refers to a picture used by a user in a communication process to identify the user, such as a user avatar; the user name refers to a name used by a user in a communication process to identify the user, such as a user nickname, a user remark name and the like.
Similarly, the fusion item refers to a virtual game item corresponding to a dynamic template image that has the special effect display logic.

In a specific implementation, the detection of the special effect display logic of the dynamic template image of the target resource mainly involves two aspects: on the one hand, detecting whether the dynamic template image can be normally fused with other data contents; on the other hand, detecting whether the fused special effect image can be normally displayed and played. Moreover, relatively mature object recognition technologies already exist, such as face recognition, gesture recognition, and text recognition. Therefore, according to the embodiment of the invention, a corresponding reference object can be introduced according to the object recognition technology actually adopted, such as a face image for face recognition, a gesture image for gesture recognition, or text content for text recognition; the object recognition technology can then be conveniently used to perform object recognition based on the reference object, so as to detect whether the dynamic template image can normally fuse other data contents and whether the fused special effect image can be normally displayed and played.

Based on this, the test terminal can acquire a reference object for detection. In one embodiment, the reference object may be a default object preset according to business requirements, such as a default face image, a default gesture image, or default text content. In another embodiment, it is considered that in practical applications the dynamic template image of a target resource is used after it is detected that a user in a communication scene sends the target resource to another user, and this is when the special effect display logic of the dynamic template image is applied; therefore, when anomaly detection is performed on the special effect display logic of the dynamic template image, the detection environment can be set in a real communication scene, so that the detection process is closer to the real application process. In this case, the target resource may be sent by a first test user to a second test user in the communication scene; accordingly, the reference object may comprise at least one of: a user portrait of the first test user and a user portrait of the second test user. It should be noted that the user portrait here may be a face image or a gesture image; the face image used by the first test user or the second test user may be an image generated by capturing the face of the corresponding test user, or may be any image containing a face that the corresponding test user downloaded from the internet.

It should be noted that, in addition to the user portraits, the reference object mentioned above may also include a user name of the first test user and/or a user name of the second test user; alternatively, the reference object may include no user portrait and include only the user name of the first test user and/or the user name of the second test user, and so on. For convenience of description, the following takes as an example a reference object that includes at least a user portrait of the first test user and a user portrait of the second test user, both of which are face images. Also, the communication scenes mentioned above may include, but are not limited to: a social session scene built on a social APP, a live broadcast scene built on a live broadcast APP, and the like. Specifically, when the communication scene is a social session scene, the first test user may be any user in a target session, and the second test user may be any user in the target session other than the first test user; the target session mentioned herein may be a single chat session composed of two users, or a group session composed of at least three users, which is not limited herein. When the communication scene is a live broadcast scene, the first test user may be any audience user in a live broadcast room (a virtual room in which a live broadcast picture can be viewed), the second test user may be the anchor user of the live broadcast room, and so on.

After the reference object is obtained, the test terminal can fuse the reference object for detection into the dynamic template image of the target resource to obtain the fusion resource image of the target resource. As can be seen from the foregoing, the dynamic template image refers to a template image with a dynamic effect generated by switching among multiple frames of static template images at a predetermined frequency; when fusing the reference object into the dynamic template image of the target resource, the test terminal can directly fuse the reference object into some or all of the static template images in the dynamic template image to obtain the fusion resource image of the target resource. Alternatively, to facilitate subsequent object identification and improve its accuracy, the test terminal can first standardize the user portrait (such as a face image) included in the reference object, and then fuse the standardized reference object into some or all of the static template images in the dynamic template image to obtain the fusion resource image of the target resource. For example, when the reference object is fused into some of the static template images, a schematic diagram of the resulting fusion resource image is shown in fig. 3. After the fusion resource image is obtained, the test terminal can play the fusion resource image of the target resource in the terminal screen.

S202, recording the screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image.

In a specific implementation, the test terminal can execute the screen recording operation immediately after a play event of the fusion resource image is detected. Further, if the target resource is sent by the first test user to the second test user in the communication scene, the test terminal needs a certain time after the first test user sends the target resource to obtain the fusion resource image of the target resource; therefore, the test terminal can execute the screen recording operation immediately after detecting that the first test user performs the trigger operation of sending the target resource to the second test user. This ensures that the whole playing process of the fusion resource image is recorded, that every frame of the fusion resource image is captured, and that the recorded video is complete. When executing the screen recording operation, the test terminal can call third-party screen recording software to record the playing process of the fusion resource image and obtain the recorded video of that process. Furthermore, considering that third-party screen recording software may fail during screen recording, the test terminal can instead call an underlying screen recording command (such as the adb shell screenrecord command) to record the playing process of the fusion resource image and obtain the recorded video, thereby improving the stability of screen recording.
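As a minimal sketch of calling the underlying screen recording command, the following Python snippet builds and launches an `adb shell screenrecord` invocation. The helper names, the remote path, and the 30-second default are illustrative assumptions rather than part of the described method; `--time-limit` is a standard option of the `screenrecord` tool.

```python
import subprocess


def build_screenrecord_cmd(remote_path, time_limit_s=30, serial=None):
    """Build the argument list for `adb shell screenrecord` (not executed here).

    remote_path: file path on the device, e.g. "/sdcard/fusion.mp4" (assumed).
    serial: optional device serial, passed to adb with -s.
    """
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]
    cmd += ["shell", "screenrecord", "--time-limit", str(time_limit_s), remote_path]
    return cmd


def start_recording(remote_path, time_limit_s=30, serial=None):
    """Start recording in the background and return the process handle.

    Launching without waiting lets the test script trigger the send
    operation right after recording has begun.
    """
    return subprocess.Popen(build_screenrecord_cmd(remote_path, time_limit_s, serial))
```

Starting the recording before the send operation is triggered mirrors the ordering described above, so that the first frames of the fusion resource image are not missed.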

S203, carrying out object identification on the multi-frame candidate images of the recorded video to obtain the identification result of each frame of candidate image.

As can be seen from the foregoing, the fusion resource image is a special effect image obtained by fusing a reference object for detection into the dynamic template image of the target resource; if the dynamic template image of the target resource can normally fuse other data contents (such as the reference object), the reference object necessarily appears in the recorded video obtained by screen-recording the playing process of the fusion resource image. Based on this principle, the test terminal can perform object identification on the recorded video. Specifically, since the recorded video is composed of frame after frame of recorded pictures (e.g., JPG (Joint Photographic Experts Group) pictures or PNG (Portable Network Graphics) pictures), the test terminal can first perform framing processing on the recorded video to obtain multi-frame candidate images, and then identify whether each of the multi-frame candidate images includes a target object associated with the reference object, so as to obtain the identification result of each frame of candidate image. The identification result of any candidate image indicates whether that candidate image includes a target object associated with the reference object. By performing object identification on the multi-frame candidate images, it can be conveniently determined from the identification results whether the dynamic template image is normally fused with the reference object; after it is determined that the dynamic template image is normally fused with the reference object, it is further detected whether the special effect image obtained by fusion (namely the fusion resource image) is normally played and displayed.

The target object and the reference object mentioned above can be the same or different; that is, the target object may or may not be the reference object itself. When the target object and the reference object are different, the target object is determined as follows. When the reference object comprises a user portrait of the first test user and/or a user portrait of the second test user: if the user portrait is a face image, the target object may be a target number of faces; if the user portrait is a gesture image, the target object is a target number of gestures. The target number is the number of user portraits included in the reference object; for example, if the reference object includes only the user portrait of the first test user or only the user portrait of the second test user, the target number is 1; if the reference object includes both the user portrait of the first test user and the user portrait of the second test user, the target number is 2. When the reference object is a default object preset according to business requirements: if the default object is a default face image, the target object is a face; if the default object is a default gesture image, the target object is a gesture; and if the default object is default text content, the target object is a text.
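The mapping from reference object to target object can be illustrated with a short Python sketch. The dictionary layout and the function name are hypothetical, and treating a default object as a single target is an assumption made for the sketch.

```python
def expected_target(reference):
    """Map a reference object description to (target kind, target number).

    `reference` is a hypothetical dict such as
      {"kind": "face", "portraits": ["first_user", "second_user"]}
    or
      {"kind": "text", "default": True}
    """
    kind = reference["kind"]  # "face", "gesture" or "text"
    if reference.get("default"):
        # Default object preset by business requirements: assume one target.
        return kind, 1
    # Otherwise the target number equals the number of user portraits.
    return kind, len(reference["portraits"])
```

With portraits of both test users as face images, the expected target is two faces; with only one portrait, it is one face.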

S204, selecting a key frame image from the multi-frame candidate images according to the identification result of each frame candidate image.

In a specific implementation, after the test terminal obtains the identification result of each frame of candidate image through step S203, the test terminal may select a key frame image from the multi-frame candidate images according to these identification results; the key frame image refers to a candidate image that includes the target object. If a key frame image is successfully selected, at least one of the candidate images includes the target object, and it can be determined that the reference object is normally fused into the dynamic template image of the target resource; in this case, the test terminal may further detect whether the fusion resource image is normally played and displayed by executing step S205, so as to determine whether the special effect display logic of the dynamic template image is abnormal. If no key frame image can be selected, none of the candidate images includes the target object, and it can be determined that the reference object cannot be normally fused into the dynamic template image of the target resource; in this case, the test terminal can directly determine that the special effect display logic of the dynamic template image is abnormal.

S205, carrying out anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.

In a specific implementation, image recognition may be performed on the key frame image to detect whether a target image element is included in the key frame image. In one embodiment, image recognition can be performed directly on the key frame image to obtain at least one image element contained in the key frame image; then, whether the target image element is among the at least one image element is detected, thereby determining whether the target image element is included in the key frame image. In another embodiment, static image matching may be used to perform image recognition at key positions in the key frame image to detect whether the image elements at the key positions are the target image elements, thereby determining whether the target image elements are included in the key frame image.
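One simple way to picture matching at a key position is a pixel-wise comparison between a stored template of the target image element and the region of the key frame at that position. The sketch below is a plain-Python illustration on grayscale images stored as lists of rows; the mean-absolute-difference criterion and the threshold value are assumptions for illustration, not the matching method of the embodiment.

```python
def element_at_position(frame, template, top, left, max_mean_diff=10.0):
    """Return True if `template` matches `frame` at (top, left).

    `frame` and `template` are grayscale images given as lists of rows of
    0-255 integers. `max_mean_diff` is an assumed tolerance.
    """
    h, w = len(template), len(template[0])
    # A template that does not fit inside the frame cannot match.
    if top < 0 or left < 0 or top + h > len(frame) or left + w > len(frame[0]):
        return False
    total = sum(
        abs(frame[top + r][left + c] - template[r][c])
        for r in range(h)
        for c in range(w)
    )
    return total / (h * w) <= max_mean_diff


# Toy 4x4 key frame with a bright 2x2 "element" at row 1, column 1.
key_frame = [
    [0, 0, 0, 0],
    [0, 200, 200, 0],
    [0, 200, 200, 0],
    [0, 0, 0, 0],
]
element = [[200, 200], [200, 200]]
```

Here `element_at_position(key_frame, element, 1, 1)` succeeds, while the same check at row 0, column 0 fails because most pixels in that region differ from the template.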

The target image elements are preset according to business requirements and the dynamic template image; for example, for the dynamic template image shown in fig. 3, the target image element may be a "wings" image element, a "jewel" image element, and so on. If it is detected that the key frame image includes the target image element, it can be determined that the fusion resource image is normally played and displayed, and therefore that the special effect display logic of the dynamic template image is not abnormal; if it is detected that the key frame image does not include the target image element, it can be determined that the fusion resource image is not normally played and displayed, and therefore that the special effect display logic of the dynamic template image is abnormal.
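The decision flow of steps S203 to S205 can be summarized in a short Python sketch. The function and parameter names are hypothetical; the object recognizer and the image element check are abstracted as precomputed results and a callable, since the embodiment leaves their implementation open.

```python
def detect_display_logic(recognition_results, has_target_element):
    """Return (is_abnormal, reason) for the special effect display logic.

    recognition_results: one bool per candidate image, True when the image
        includes the target object (result of step S203).
    has_target_element: callable taking a candidate index and returning True
        when that image contains the target image element (step S205).
    """
    # S204: key frame candidates are the images that include the target object.
    hits = [i for i, found in enumerate(recognition_results) if found]
    if not hits:
        # No candidate shows the target object: fusion itself failed.
        return True, "reference object was not fused into the template"
    key_frame = hits[0]  # any hit may serve as the key frame in this sketch
    if has_target_element(key_frame):
        return False, "fusion resource image played normally"
    return True, "fusion resource image was not displayed normally"
```

The two abnormal outcomes correspond to the two aspects being detected: whether the reference object was fused at all, and whether the fused image was displayed as expected.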

Based on the above description of the method embodiment shown in fig. 2, the embodiment of the present invention further proposes another, more specific image detection method; the image detection method can be executed by a test terminal or a communication APP running in the test terminal; for convenience of illustration, the image detection method is described below as being performed by the test terminal. In addition, considering that in practical applications some users may not want to view the special effect images generated after other data contents are fused into the dynamic template image, the embodiment of the present invention may further provide a special effect shielding function with which a user can shield the special effect image obtained by fusing the relevant data contents into the dynamic template image. Therefore, in order to ensure the normal use of the special effect shielding function, the test terminal also needs to perform anomaly detection on the special effect shielding function. After the special effect shielding function is turned on, if the function is not abnormal, only the dynamic template image is played for the user and the related fusion resource image is not played; the related fusion resource image is played only when the special effect shielding function is abnormal. Accordingly, the anomaly detection of the special effect shielding function may proceed as follows:

First, the special effect shielding function can be turned on, and a test image of the target resource is played in the terminal screen; the test image may be a fusion resource image or a dynamic template image. Second, the playing process of the test image can be screen-recorded to obtain a screen recording video of the playing process of the test image. Then, it can be detected whether the multi-frame images of the screen recording video include the target object; specifically, object recognition may be performed on the multi-frame images of the screen recording video to determine whether they include the target object, and for the specific implementation reference may be made to the object recognition performed on the multi-frame candidate images of the recorded video of the playing process of the fusion resource image in the other embodiments of the invention, which is not described herein again. If the target object is included, the played test image is the fusion resource image, so it can be determined that the special effect shielding function is abnormal; in this case, the special effect shielding function can be repaired by technicians to ensure that it can subsequently be put into normal use. If the target object is not included, the played test image is the dynamic template image, so it can be determined that the special effect shielding function is not abnormal.
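The check on the special effect shielding function reduces to a single test over the recorded frames, sketched below in Python. The function name is hypothetical, and the per-frame recognition results are assumed to have been computed beforehand.

```python
def shielding_function_abnormal(frame_has_target):
    """Decide whether the special effect shielding function is abnormal.

    frame_has_target: one bool per frame of the video recorded while the
    shielding function was turned on; True means the frame includes the
    target object.
    """
    # With shielding on, any frame showing the target object means the
    # fusion resource image was played, so the function is abnormal.
    return any(frame_has_target)
```

For example, `shielding_function_abnormal([False, False, True])` reports an abnormality, while a recording in which no frame contains the target object does not.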

Therefore, in the whole process of carrying out abnormity detection on the special effect shielding function, human intervention is not needed, and the labor cost can be saved. Moreover, the abnormity detection of the special effect shielding function is automatically realized, and the detection speed can be effectively increased, so that the detection time is effectively shortened, and the detection efficiency is improved. In addition, whether the special effect shielding function is abnormal or not is detected by combining an object recognition technology on the basis of automation, and the detection accuracy can be effectively improved.

After determining that the special effect shielding function is not abnormal, the test terminal may close the special effect shielding function, and perform abnormality detection on the special effect display logic of the dynamic template image of the target resource by using the image detection method shown in fig. 4. Referring to fig. 4, the image detection method may include the following steps S401 to S407:

S401, playing a fusion resource image of the target resource in a terminal screen.

S402, recording the screen of the playing process of the fusion resource image to obtain a recorded video of the playing process of the fusion resource image.

The fusion resource image is a special effect image obtained by fusing a reference object for detection to a dynamic template image of a target resource; and the fused resource image is played in the terminal screen under the condition that the special effect shielding function is not abnormal and the special effect shielding function is in a closed state. It should be noted that, for specific implementation of steps S401 to S402, reference may be made to the related description of steps S201 to S202 in the foregoing embodiment of the present invention, and details are not described herein again.

S403, performing framing processing on the recorded video to obtain a plurality of frames of initial images.

In a specific implementation, the test terminal may perform framing processing on the recorded video to obtain multiple frames of recorded pictures of the fusion resource image. Then, the multiple frames of recorded pictures can be directly determined as the multi-frame initial images; or the image of a target area can be cropped from each frame of recorded picture to serve as the multi-frame initial images. The center point of the target area coincides with the display position of the fusion resource image in the terminal screen, and the size of the target area can be set according to an empirical value or the display size of the fusion resource image; specifically, the size of the target area may be greater than or equal to the display size of the fusion resource image. For example, suppose the ratio of the distance between the display position of the fusion resource image and the top of the terminal screen to the length of the terminal screen is 1/3, that is, the fusion resource image is displayed at 1/3 of the length of the terminal screen; then the center point of the target area is located at 1/3 of the length of each recorded picture, and the image of the target area centered at that point is cropped from each frame of recorded picture as an initial image, as shown in fig. 5a.
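The crop box for the target area in the 1/3 example can be computed as in the following Python sketch. The function name, the full-width crop, and the example sizes (a 1080x1920 recorded picture and a 600-pixel-high target area) are assumptions for illustration; the returned tuple uses the (left, upper, right, lower) ordering that Pillow's `Image.crop` accepts.

```python
def target_region(frame_w, frame_h, region_h, num=1, den=3):
    """Return the crop box (left, upper, right, lower) of the target area.

    The area spans the full frame width (an assumption) and is centered
    vertically at num/den of the frame height, clamped to the frame.
    """
    center_y = frame_h * num // den        # e.g. 1/3 of the picture length
    top = center_y - region_h // 2
    bottom = top + region_h
    top = max(0, top)                      # keep the box inside the frame
    bottom = min(frame_h, bottom)
    return (0, top, frame_w, bottom)


# 1080x1920 recorded picture, 600-pixel-high target area centered at 1/3.
box = target_region(1080, 1920, 600)
```

For these sizes the center lies at row 640, so the box covers rows 340 to 940 of each recorded picture.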

Therefore, the test terminal can support two framing modes: the first is to frame the recorded video directly using 100% of each recorded picture, and the second is to take only the image of the target area in each recorded picture. Compared with the first framing mode, the second has at least the following advantages. First, the framing is customized according to the display mode of the fusion resource image, so that each initial image obtained by framing still contains the complete information of the fusion resource image and the subsequent object recognition effect is not impaired; while the recognition effect stays the same as in the first framing mode, object recognition is no longer performed on the useless image content outside the target area, so the object recognition time can be effectively shortened and the object recognition efficiency improved. Second, interfering objects other than the reference object may exist in the recorded picture; if object identification is performed on the whole recorded picture, these interfering objects may affect the accuracy of the object identification result. For example, if the reference object includes a face image and the recorded picture also includes other faces (i.e., interfering objects), the test terminal may recognize not only the face included in the reference object but also the other faces, producing an output affected by the interfering items as shown in fig. 5b, which reduces the accuracy of the object recognition result. By cropping the image of the target area as the initial image, the influence of other interfering objects (such as the face image in the live game picture shown in fig. 5b) can be effectively reduced and the accuracy of the object recognition result improved, as shown in fig. 5c.

S404, determining a plurality of candidate images of the recorded video from the plurality of initial images.

In a specific implementation, each frame of initial image may have an image sequence number, and each frame of initial image may be arranged in sequence according to the image sequence number; the image sequence number of any initial image can be used for indicating the screen recording acquisition sequence of the recording picture corresponding to the initial image. For example, if the image sequence number of the initial image a is 1, it indicates that the recording picture corresponding to the initial image a is the 1 st picture acquired by recording; if the image sequence number of the initial image B is 100, it indicates that the recording picture corresponding to the initial image B is the 100 th frame picture acquired by recording, and so on.

In one embodiment, the multi-frame initial images can be directly used as the multi-frame candidate images of the recorded video. In another embodiment, it is considered that the fusion resource image is obtained by fusing the reference object into all or part of the static template images in the dynamic template image, and that the screen recording may start earlier than the playing of the fusion resource image; therefore, only part of the recorded pictures in the recorded video may show the fusion special effect, so that only part of the multi-frame initial images obtained by framing the recorded video show the fusion special effect. In this case, if all the initial images are directly used as candidate images, performing object recognition on every initial image may take a long time. In this regard, the embodiment of the present invention further studies the multi-frame initial images and finds that the image difference between nearby initial images is small, in particular between nearby initial images that show the fusion special effect, as shown in fig. 5d (the left image is the 910th initial image, and the right image is the 923rd initial image). Based on this, the test terminal can sample the sequentially arranged multi-frame initial images at a sampling frequency to obtain the multi-frame candidate images of the recorded video, so that object identification is subsequently performed only on the sampled candidate images.
The sampling frequency can be set according to an empirical value or a service requirement; for example, the sampling frequency can be set to one sample per 10 images, so that when there are 1000 initial images, sampling reduces them to 100 candidate images for object recognition, which effectively shortens the object recognition time and improves the object recognition efficiency.
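The sampling step can be sketched in a few lines of Python. The function name is hypothetical, and keeping the images whose sequence number is a multiple of the step is an assumption chosen to agree with sequence numbers such as 790 and 800 used in later examples.

```python
def sample_candidates(sequence_numbers, step=10):
    """Keep one initial image out of every `step`, by image sequence number.

    sequence_numbers: image sequence numbers of the initial images, in
    screen recording order (1-based).
    """
    return [n for n in sequence_numbers if n % step == 0]


# 1000 initial images numbered 1..1000, sampled once per 10 images.
candidates = sample_candidates(range(1, 1001), step=10)
```

With 1000 initial images and a step of 10, `candidates` holds the 100 sequence numbers 10, 20, and so on up to 1000, so object recognition runs on one tenth of the frames.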

S405, carrying out object identification on the multi-frame candidate images of the recorded video to obtain the identification result of each frame of candidate image.

From the foregoing, it can be seen that there are a variety of implementations of the reference object; then, with the difference of the reference objects, the specific implementation of the test terminal for performing object identification on the multi-frame candidate images of the recorded video is also different; the method comprises the following specific steps:

when the reference object comprises a user portrait of a first test user and/or a user portrait of a second test user, if the user portrait is a face image, face recognition can be carried out on a plurality of frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate images; if the user portrait is a gesture image, gesture recognition can be carried out on multiple frames of candidate images of the recorded video, and recognition results of the candidate images of all the frames are obtained. Optionally, if the reference object includes the user name of the first test user and/or the user name of the second test user, text recognition may be performed on multiple candidate images of the recorded video to obtain a recognition result of each candidate image.

When the reference object comprises a default object set according to business requirements, if the default object is a default face image, face recognition can be carried out on multiple frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image; if the default object is a default gesture image, performing gesture recognition on a plurality of frames of candidate images of the recorded video to obtain a recognition result of each frame of candidate image; if the default object is the default text content, text recognition can be performed on the multi-frame candidate images of the recorded video, and recognition results of the candidate images of all the frames are obtained.
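The choice of recognition technique by reference object content can be expressed as a simple lookup, sketched below in Python. The content labels and technique names are placeholders for illustration and do not refer to any particular recognition library.

```python
# Hypothetical labels: content type of the reference object -> technique.
RECOGNIZER_BY_CONTENT = {
    "face": "face_recognition",
    "gesture": "gesture_recognition",
    "text": "text_recognition",
}


def recognizers_for(reference_contents):
    """Return the recognition techniques needed for a reference object.

    reference_contents: content types present in the reference object, for
    example ["face", "text"] for a face portrait plus a user name.
    """
    needed = []
    for content in reference_contents:
        technique = RECOGNIZER_BY_CONTENT[content]  # KeyError on unknown type
        if technique not in needed:                 # keep each technique once
            needed.append(technique)
    return needed
```

A reference object made of two face portraits and a user name would therefore call for face recognition and text recognition on each candidate image.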

S406, selecting a key frame image from the multi-frame candidate images according to the identification result of each frame candidate image.

The identification result of any candidate image indicates whether that candidate image includes a target object associated with the reference object. In the specific implementation of step S406, the test terminal may first screen out the candidate images that include the target object from the multi-frame candidate images according to the identification result of each frame of candidate image; then, a key frame image may be selected from the screened candidate images. In one embodiment, every screened candidate image can be directly used as a key frame image; or, one candidate image is arbitrarily selected from the screened candidate images as the key frame image.

In another embodiment, although each frame of the selected candidate image includes the target object, the anomaly detection of the special effect display logic is realized by identifying whether the key frame image includes the target image pixel. And the research shows that: some exposure special effects may exist in the process of playing the fusion resource image, so that partial exposure exists in the candidate image screened out in the part, and therefore element information in the part of the candidate image is unclear, and the part of the candidate image is not suitable for being subsequently used for carrying out abnormity detection on special effect display logic. For example, referring to fig. 5e, the image numbers of the screened candidate images including the target object are sequentially: 790. 800, 810 … 910; it can be seen that the fusion special effect starts to be shown from the candidate image with the image number of 790. However, although the candidate image with the image number 790 includes target objects (e.g., 2 faces), the candidate image has an unclear image element (e.g., a "wing" element) due to the existence of the exposure special effect; if subsequent detection is performed according to the candidate image, the accuracy of the detection result may be affected. Based on the method, in order to reduce the probability of selecting the candidate image with the special exposure effect as the key frame image, the subsequent detection result is influenced; when the test terminal selects the key frame image, the screened candidate images of each frame can be arranged according to the sequence of the image sequence numbers of the screened candidate images of each frame from small to large to obtain the arrangement position of the screened candidate images of each frame; and then selecting the candidate image with the arrangement position at the target position from the multiple candidate images arranged in sequence as a key frame image. 
Alternatively, in other embodiments, the test terminal may add the image sequence numbers of the screened candidate images to an array in ascending order; the candidate image indicated by the image sequence number at the target position in the array is then used as the key frame image.

The target position mentioned above may be set according to an empirical value or according to the display logic of the exposure special effect; for example, if the display logic of the exposure special effect indicates that the exposure special effect is generally shown in the first N frames of candidate images, the target position may be set to any position after the N-th frame of candidate image. For example, continuing the example shown in fig. 5e, the number of screened candidate images is 13; the target position may be the middle position of the arranged candidate images or the middle position of the array, in which case the key frame image is the candidate image with image number 850 at the middle position.
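The key frame selection described above can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiment: the function name `select_key_frame`, the dictionary-based recognition results, and the default of using the middle position are assumptions introduced for this example.

```python
from typing import Dict, List, Optional


def select_key_frame(recognition: Dict[int, bool],
                     target_position: Optional[int] = None) -> int:
    """Return the image sequence number of the key frame image.

    `recognition` maps an image sequence number to the recognition result
    (True if the candidate image includes the target object). The names and
    the data layout are hypothetical.
    """
    # Screen out the candidate images that include the target object and
    # arrange their sequence numbers in ascending order (the "array").
    screened: List[int] = sorted(n for n, has_target in recognition.items() if has_target)
    if not screened:
        raise ValueError("no candidate image includes the target object")
    # Assumed default: the middle position, which skips early frames that may
    # still carry the exposure special effect.
    if target_position is None:
        target_position = len(screened) // 2
    if not 0 <= target_position < len(screened):
        raise IndexError("target position is outside the screened candidates")
    return screened[target_position]


# Example modelled on fig. 5e: candidates 790, 800, ..., 910 include the target object.
recognition = {n: n >= 790 for n in range(700, 920, 10)}
key_frame = select_key_frame(recognition)
print(key_frame)
```

With 13 screened candidate images the middle position is index 6 of the ascending array, so the sketch picks image number 850, matching the example in the text. A position derived from the exposure special effect display logic (any index after the first N frames) can be passed as `target_position` instead.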

And S407, carrying out anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.

In a specific implementation, the key frame image and the target image element may be subjected to static image matching processing to obtain a matching result. If the matching result indicates that the key frame image comprises the target image element, determining that the special effect display logic of the dynamic template image has no abnormality; if the matching result indicates that the key frame image does not include the target image element, it may be determined that the special effect presentation logic of the dynamic template image is abnormal. The specific implementation method for obtaining the matching result by performing static image matching processing on the key frame image and the target image element is as follows:

First, special effect marking may be performed on the key frame image to obtain a special effect marking result. Special effect marking here refers to finding the key frame image and marking the key positions in it; accordingly, the special effect marking result includes the position information of the key positions in the key frame image. The key positions may be set according to an empirical value or service requirements; taking the case where the key frame image is the candidate image with image number 850 as an example, the key positions may be the position of the "wings" image element and/or the position of the "gems" image element, and the special effect marking result may then be as shown in fig. 5f. After the marking result is obtained, the key feature elements at the key positions of the key frame image may be obtained according to the special effect marking result, and static image matching is performed between the key feature elements and the target image elements. If the matching succeeds, a matching result indicating that the key frame image includes the target image element is generated; if the matching fails, a matching result indicating that the key frame image does not include the target image element is generated.
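The marking and matching steps above can be illustrated with a small sketch. The text does not specify the matching algorithm, so the mean absolute pixel difference used here is an assumed stand-in; the grayscale list-of-lists image format, the box layout `(top, left, height, width)`, the tolerance value, the rule that every marked key position must match, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # grayscale rows, values 0-255 (assumed format)
Box = Tuple[int, int, int, int]  # (top, left, height, width) of a key position


def crop(image: Image, box: Box) -> Image:
    """Extract the key feature element at a marked key position."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def matches(region: Image, element: Image, tolerance: float = 8.0) -> bool:
    """Assumed matching rule: mean absolute pixel difference within tolerance."""
    if len(region) != len(element) or any(len(r) != len(e) for r, e in zip(region, element)):
        return False
    diffs = [abs(a - b) for r, e in zip(region, element) for a, b in zip(r, e)]
    return bool(diffs) and sum(diffs) / len(diffs) <= tolerance


def has_anomaly(key_frame: Image, marks: Dict[str, Box], elements: Dict[str, Image]) -> bool:
    """Report an abnormality if any marked key position fails to match its element."""
    return not all(matches(crop(key_frame, box), elements[name])
                   for name, box in marks.items())


# Toy data: a 2x2 "wings" element expected at row 1, column 1 of a 4x4 key frame.
wings = [[200, 210], [220, 230]]
frame = [[0] * 4 for _ in range(4)]
frame[1][1:3] = [201, 209]
frame[2][1:3] = [221, 229]
marks = {"wings": (1, 1, 2, 2)}

ok_result = has_anomaly(frame, marks, {"wings": wings})
blank_result = has_anomaly([[0] * 4 for _ in range(4)], marks, {"wings": wings})
```

In this toy data the first frame contains the element within tolerance, so no abnormality is reported, while the blank frame fails the match. A production system would more likely rely on an image library's template matching rather than this hand-rolled comparison.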

It should be noted that, in practical applications, the image detection method shown in fig. 2 or fig. 4 may be encapsulated into an interface, so that the interface can be called directly to perform anomaly detection on the special effect display logic of any dynamic template image to be detected subsequently, thereby avoiding repeated code writing and improving detection efficiency.

The embodiment of the invention can combine object recognition technology and image recognition technology on the basis of automation, and can realize, in a customized manner, anomaly detection of the special effect display logic of the dynamic template image and anomaly detection of the shielding logic of the special effect shielding function, which effectively shortens the detection time and improves the detection efficiency and accuracy. In addition, for the special effect display logic of the dynamic template image, the recorded video obtained by screen recording is split into frames in a new video framing manner, which can effectively reduce the running time and improve the recognition accuracy of subsequent object recognition; by performing object recognition only after special processing (such as sampling processing) of adjacent similar candidate images, the time required for object recognition can be further shortened. After the key frame image is obtained, whether the special effect display logic is abnormal is judged by matching and searching for image elements at the key positions of the key frame image; a large amount of material does not need to be collected, and only the corresponding element recognition needs to be performed on the key frame image and the key positions, so that the detection efficiency and the accuracy of the detection result can be improved.

In practical applications, after the image detection method shown in fig. 2 or fig. 4 is used to perform anomaly detection on the special effect display logic of the dynamic template image of the target resource, corresponding processing may be performed according to the detection result. Specifically, if the special effect display logic of the dynamic template image has no abnormality, the dynamic template image is released; releasing the dynamic template image means making the dynamic template image publicly available for every user to use. If the special effect display logic of the dynamic template image has an abnormality, then, as can be seen from the foregoing, the abnormality may include at least one of the following situations: an abnormality exists in the process of fusing the reference object into the dynamic template image, and an abnormality exists in the playing process of the fusion resource image. In this case, the test terminal may output alarm information to prompt a technician to analyze the abnormality in the special effect display logic of the dynamic template image, so that the technician can modify the special effect display logic in a targeted manner according to the analysis result until the modified special effect display logic has no abnormality, and the modified special effect display logic is then released.
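The release-or-alarm handling described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Anomaly` enumeration, the function `handle_detection_result`, the in-memory `released` list, and the message formats are all hypothetical and do not come from the embodiment.

```python
from enum import Enum
from typing import List, Set


class Anomaly(Enum):
    # The two abnormal situations named in the text.
    FUSION = "abnormality while fusing the reference object into the dynamic template image"
    PLAYBACK = "abnormality while playing the fusion resource image"


def handle_detection_result(template_id: str,
                            anomalies: Set[Anomaly],
                            released: List[str]) -> str:
    """Release the dynamic template image, or return alarm information."""
    if not anomalies:
        # No abnormality: make the template publicly available.
        released.append(template_id)
        return f"released {template_id}"
    # Abnormality: output alarm information for a technician to analyze.
    reasons = "; ".join(sorted(a.value for a in anomalies))
    return f"ALARM {template_id}: {reasons}"


released: List[str] = []
ok_message = handle_detection_result("template-a", set(), released)
alarm_message = handle_detection_result("template-b", {Anomaly.PLAYBACK}, released)
```

Only `template-a` ends up in `released`; `template-b` stays unreleased until its special effect display logic is fixed and passes detection again.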

After the dynamic template image is released, any terminal in the communication system may, after detecting that a first terminal user sends the target resource to a second terminal user, perform special effect playing according to the dynamic template image and its special effect display logic. Based on this, the embodiment of the invention also provides an image display method; the image display method may be executed by the first terminal in the communication system or by a communication APP running in the first terminal. For convenience of illustration, the following description takes the case where the first terminal, used by the first terminal user in the communication system, executes the image display method as an example. Referring to fig. 6, the image display method may include the following steps S601 to S603:

S601, responding to the trigger operation of sending the resource to the second terminal user by the first terminal user in the communication scene, and outputting a resource selection interface.

The communication scenarios may include, but are not limited to: a social session scene built based on a social APP, a live scene built based on a live APP, and the like. When the communication scene is a social session scene, the first terminal user may be any user in the target session, and the second terminal user may be any user in the target session other than the first terminal user; when the communication scene is a live scene, the first terminal user may be any viewer user in the live room, the second terminal user may be an anchor user in the live room, and so on.

When the first terminal user wants to send a target resource to the second terminal user, a trigger operation for sending the resource can be input in the current interface displayed in the first terminal. The current interface referred to here may include, but is not limited to: a session interface of a target session, a live interface in a live scene, and the like; the trigger operation for sending the resource input by the first terminal user in the current interface may include, but is not limited to: an input operation of a preset voice password, an input operation of a preset gesture, a click operation or a press operation on a trigger component in the current interface, and the like. Correspondingly, the first terminal can respond to the trigger operation for sending the resource and output a resource selection interface. The resource selection interface includes at least a resource identifier of the target resource; the target resource may be any fusion resource, such as a fusion gift or a fusion electronic resource, and the resource identifier of the target resource may include, but is not limited to: a resource name, a resource number, a resource image, etc. The resource image may be a thumbnail corresponding to any one of the static template images in the dynamic template image of the target resource, or may be an image independent of the dynamic template image, which is not limited here. The resource identifier is displayed in the resource selection interface only when the image detection method shown in fig. 2 or fig. 4 has determined that the special effect display logic of the dynamic template image of the target resource has no abnormality.

Specifically, when the first terminal outputs the resource selection interface, the first terminal may directly output the resource selection interface on the current interface, or replace the current interface with the resource selection interface, or may switch from the current interface to the resource selection interface. For example, taking the target resource as the fusion gift, the current interface as the live interface, and the resource selection interface output on the current interface, a schematic diagram of the output resource selection interface 70 and the resource identifier 71 of the target resource is shown in fig. 7 a.

S602, selecting the resource identifier of the target resource in the resource selection interface.

In a specific implementation, the first terminal user selects the resource identifier of the target resource in the resource selection interface; accordingly, the first terminal may select the resource identifier of the target resource in the resource selection interface according to the selection operation of the first terminal user, as shown in fig. 7b. It should be understood that fig. 7b only exemplarily shows the selected resource identifier of the target resource and is not limiting; for example, in other embodiments, the first terminal may also switch the display state (display color, display size, display position, etc.) of the resource identifier of the target resource from a first state to a second state in the resource selection interface to indicate that the resource identifier of the target resource is selected.

S603, if the confirmation operation aiming at the resource identification is detected, playing a target special effect image of the target resource in a terminal screen.

In a specific implementation, after the first terminal user selects the resource identifier of the target resource, the first terminal user can perform a confirmation operation on the resource identifier; the confirmation operation may include: an operation of inputting a confirmation gesture or a confirmation voice, a trigger operation on a confirmation component 72 in the resource selection interface, and so forth. Correspondingly, the first terminal can respond to the confirmation operation and play the target special effect image of the target resource in the terminal screen; the target special effect image is obtained by fusing the user identifier of the target terminal user into the dynamic template image of the target resource. The target terminal user may include at least one of: the first terminal user and the second terminal user; the user identifier includes at least one of: a user portrait and a user name; the user portrait referred to here may include, but is not limited to: face images, gesture images, landscape images, animal images, and the like. Taking the case where the target terminal users include the first terminal user and the second terminal user, and the user identifier includes a user portrait and a user name as an example, a schematic diagram of playing the target special effect image 73 can be seen in fig. 7c.
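Steps S601 to S603 can be summarized in a short sketch. It is a simplified model under stated assumptions, not the terminal's actual implementation: the classes `Resource` and `ResourceSelectionInterface`, the `logic_ok` flag standing for the outcome of the image detection method, and the returned description string are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    resource_id: str
    name: str
    logic_ok: bool  # assumed flag: detection found no abnormality in the display logic


@dataclass
class ResourceSelectionInterface:
    resources: Dict[str, Resource]
    selected: Optional[str] = None

    def visible_ids(self) -> List[str]:
        """S601: only resources whose display logic passed detection are shown."""
        return [rid for rid, res in self.resources.items() if res.logic_ok]

    def select(self, resource_id: str) -> None:
        """S602: select the resource identifier of the target resource."""
        if resource_id not in self.visible_ids():
            raise KeyError(f"{resource_id} is not shown in the interface")
        self.selected = resource_id

    def confirm(self, first_user: str, second_user: str) -> str:
        """S603: on confirmation, describe the target special effect image to play."""
        if self.selected is None:
            raise RuntimeError("no resource identifier has been selected")
        name = self.resources[self.selected].name
        return f"play '{name}' fused with user identifiers: {first_user}, {second_user}"


interface = ResourceSelectionInterface({
    "g1": Resource("g1", "fusion gift", logic_ok=True),
    "g2": Resource("g2", "untested gift", logic_ok=False),
})
shown = interface.visible_ids()
interface.select("g1")
playback = interface.confirm("viewer", "anchor")
```

Here only `g1` is displayed, because `g2` has not passed anomaly detection, and confirming the selection yields the playback description for the fused special effect.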

Based on the description of the above embodiment of the image detection method, the embodiment of the present invention also discloses an image detection apparatus, which may be a computer program (including a program code) running in the above mentioned test terminal. The image detection apparatus may perform the method shown in fig. 2 or fig. 4. Referring to fig. 8, the image detection apparatus may operate as follows:

a playing unit 801, configured to play a fusion resource image of a target resource in a terminal screen, where the fusion resource image is a special effect image obtained by fusing a reference object for detection to a dynamic template image of the target resource;

a screen recording unit 802, configured to record a screen of the fusion resource image in a playing process of the fusion resource image, so as to obtain a recorded video of the playing process of the fusion resource image;

a processing unit 803, configured to perform object recognition on multiple candidate images of the recorded video to obtain recognition results of the candidate images of each frame, where the recognition result of any candidate image is used to indicate: whether the any of the candidate images includes a target object associated with the reference object;

the processing unit 803 is further configured to select a key frame image from the multiple frame candidate images according to the recognition result of the frame candidate images, where the key frame image is a candidate image including the target object;

the processing unit 803 is further configured to perform anomaly detection on the special effect display logic of the dynamic template image according to the key frame image.

In another embodiment, when the processing unit 803 is configured to perform object recognition on multiple candidate images of the recorded video to obtain a recognition result of each candidate image, the processing unit may be specifically configured to:

In yet another embodiment, before the processing unit 803 is configured to perform object recognition on multiple candidate images of the recorded video to obtain a recognition result of each candidate image, the processing unit may be further configured to:

In another embodiment, when the processing unit 803 is configured to perform frame division processing on the recorded video to obtain multiple frames of initial images, it may be specifically configured to:

In another embodiment, each frame of initial image has an image sequence number, and the initial images of each frame are arranged in sequence according to the image sequence numbers; correspondingly, when the processing unit 803 is configured to determine, from the multiple frames of initial images, multiple frames of candidate images of the recorded video, the processing unit may be specifically configured to:

In another embodiment, when the processing unit 803 is configured to select a key frame image from the multiple frame candidate images according to the recognition result of each frame candidate image, it is specifically configured to:

In another embodiment, when the processing unit 803 is configured to select a key frame image from the screened frame candidate images, it may be specifically configured to:

In another embodiment, when the processing unit 803 is configured to perform anomaly detection on the special effect presentation logic of the dynamic template image according to the key frame image, it may be specifically configured to:

In another embodiment, the logic for showing the special effects of the dynamic template image has an abnormality, which includes at least one of the following situations: the process of fusing the reference object to the dynamic template image is abnormal, and the playing process of the fusion resource image is abnormal; accordingly, the processing unit 803 may also be configured to:

According to an embodiment of the present invention, each step involved in the method shown in fig. 2 or fig. 4 may be performed by each unit in the image detection apparatus shown in fig. 8. For example, steps S201 and S202 shown in fig. 2 may be performed by the playback unit 801 and the screen recording unit 802 shown in fig. 8, respectively, and steps S203 to S205 may be performed by the processing unit 803 shown in fig. 8, respectively. As another example, steps S401 and S402 shown in fig. 4 may be performed by the playback unit 801 and the screen recording unit 802 shown in fig. 8, respectively, steps S403 to S407 may be performed by the processing unit 803 shown in fig. 8, and so on.

According to another embodiment of the present invention, the units in the image detection apparatus shown in fig. 8 may be separately or entirely combined into one or several other units, or one or more of the units may be further split into multiple functionally smaller units, which can achieve the same operation without affecting the technical effect of the embodiment of the present invention. The units are divided based on logical functions; in practical applications, the function of one unit may be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present invention, the image detection apparatus may also include other units, and in practical applications these functions may also be realized with the assistance of other units and through the cooperation of multiple units.

According to another embodiment of the present invention, the image detection apparatus shown in fig. 8 may be constructed, and the image detection method of the embodiment of the present invention implemented, by running a computer program (including program code) capable of executing the steps involved in the corresponding method shown in fig. 2 or fig. 4 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a Central Processing Unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed in the above computing device via the computer-readable recording medium.

Based on the description of the above embodiment of the image display method, the embodiment of the present invention also discloses an image display apparatus, which may be a computer program (including a program code) running in the above-mentioned first terminal. The image display apparatus may perform the method shown in fig. 6. Referring to fig. 9, the image display apparatus may operate as follows:

an output unit 901, configured to respond to a trigger operation of a first terminal user sending a resource to a second terminal user in a communication scenario, and output a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the image detection method is adopted to determine that the special effect display logic of the dynamic template image of the target resource is not abnormal;

a processing unit 902, configured to select a resource identifier of the target resource in the resource selection interface;

the output unit 901 is further configured to play a target special effect image of a target resource in a terminal screen if a confirmation operation for the resource identifier is detected; the target special effect image is obtained by fusing a user identifier of a target terminal user into the dynamic template image of the target resource, wherein the target terminal user comprises at least one of the following items: the first end user and the second end user.

According to an embodiment of the present invention, each step involved in the method shown in fig. 6 may be performed by each unit in the image display apparatus shown in fig. 9. For example, steps S601 and S603 shown in fig. 6 may be performed by the output unit 901 shown in fig. 9, and step S602 may be performed by the processing unit 902 shown in fig. 9.

According to another embodiment of the present invention, the units in the image display apparatus shown in fig. 9 may be separately or entirely combined into one or several other units, or one or more of the units may be further split into multiple functionally smaller units, which can achieve the same operation without affecting the technical effect of the embodiment of the present invention. The units are divided based on logical functions; in practical applications, the function of one unit may be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present invention, the image display apparatus may also include other units, and in practical applications these functions may also be realized with the assistance of other units and through the cooperation of multiple units.

According to another embodiment of the present invention, the image display apparatus shown in fig. 9 may be constructed, and the image display method of the embodiment of the present invention implemented, by running a computer program (including program code) capable of executing the steps involved in the corresponding method shown in fig. 6 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a Central Processing Unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed in the above computing device via the computer-readable recording medium.

Based on the description of the method embodiment and the device embodiment, the embodiment of the invention also provides a terminal; the terminal may be the above mentioned test terminal or the above mentioned first terminal. Referring to fig. 10, the terminal may include at least a processor 1001, an input device 1002, an output device 1003, and a computer storage medium 1004. The processor 1001, the input device 1002, the output device 1003, and the computer storage medium 1004 in the terminal may be connected by a bus or other means. A computer storage medium 1004 may be stored in the memory of the terminal, the computer storage medium 1004 being used for storing a computer program comprising program instructions, the processor 1001 being used for executing the program instructions stored by the computer storage medium 1004. The processor 1001 (or CPU) is a computing core and a control core of the terminal, and is adapted to implement one or more instructions, and in particular, is adapted to load and execute the one or more instructions so as to implement a corresponding method flow or a corresponding function.

In an embodiment, if the terminal is a test terminal, the processor 1001 according to the embodiment of the present invention may be configured to perform a series of image detection processes, which specifically includes: playing a fusion resource image of a target resource in a terminal screen, wherein the fusion resource image is a special effect image obtained by fusing a reference object for detection into a dynamic template image of the target resource; recording a screen in the playing process of the fusion resource image to obtain a recorded video related to the playing process of the fusion resource image; carrying out object identification on a plurality of candidate images of the recorded video to obtain an identification result of each candidate image, wherein the identification result of any candidate image is used for indicating: whether the any of the candidate images includes a target object associated with the reference object; selecting a key frame image from the plurality of frame candidate images according to the identification result of each frame candidate image, wherein the key frame image is a candidate image comprising the target object; and carrying out anomaly detection on the special effect display logic of the dynamic template image according to the key frame image, and the like.

In another embodiment, if the terminal is a first terminal, the processor 1001 according to an embodiment of the present invention may be configured to perform a series of image display processing, including: responding to a triggering operation of sending the resource to a second terminal user by a first terminal user in a communication scene, and outputting a resource selection interface; the resource selection interface at least comprises a resource identifier of a target resource, and the resource identifier is displayed in the resource selection interface under the condition that the image detection method is adopted to determine that the special effect display logic of the dynamic template image of the target resource is not abnormal; selecting the resource identifier of the target resource in the resource selection interface; if the confirmation operation aiming at the resource identification is detected, playing a target special effect image of the target resource in a terminal screen; the target special effect image is obtained by fusing a user identifier of a target terminal user into the dynamic template image of the target resource, wherein the target terminal user comprises at least one of the following items: the first end user and the second end user.

The embodiment of the invention also provides a computer storage medium (Memory), which is a Memory device in the terminal and is used for storing programs and data. It is understood that the computer storage medium herein may include a built-in storage medium in the terminal, and may also include an extended storage medium supported by the terminal. The computer storage medium provides a storage space that stores an operating system of the terminal. Also stored in this memory space are one or more instructions, which may be one or more computer programs (including program code), suitable for loading and execution by processor 1001. The computer storage medium may be a high-speed RAM memory, or may be a non-volatile memory (non-volatile memory), such as at least one disk memory; and optionally at least one computer storage medium located remotely from the processor.

In one embodiment, one or more first instructions stored in a computer storage medium may be loaded and executed by the processor 1001 to implement the corresponding steps of the method described above in relation to the image detection method embodiment; in a specific implementation, one or more first instructions in the computer storage medium are loaded by the processor 1001 and perform the following steps:

In another embodiment, when performing object recognition on multiple candidate images of the recorded video to obtain a recognition result of each candidate image, the one or more first instructions may be loaded and specifically executed by the processor 1001:

In yet another embodiment, before performing object recognition on multiple candidate images of the recorded video to obtain a recognition result of each candidate image, the one or more first instructions may be further loaded and specifically executed by the processor 1001:

In another embodiment, when performing frame division processing on the recorded video to obtain multiple frames of initial images, the one or more first instructions may be loaded and specifically executed by the processor 1001:

In another embodiment, each frame of initial image has an image sequence number, and the initial images of each frame are arranged in sequence according to the image sequence numbers; accordingly, when determining multiple candidate images of the recorded video from the multiple initial images, the one or more first instructions may be loaded and specifically executed by the processor 1001:

In still another embodiment, when selecting a key frame image from the multiple frame candidate images according to the identification result of each frame candidate image, the one or more first instructions may be loaded and specifically executed by the processor 1001:

In another embodiment, when selecting a key frame image from the screened candidate images, the one or more first instructions may be loaded and specifically executed by the processor 1001:

In another embodiment, when performing anomaly detection on the special effect presentation logic of the dynamic template image according to the key frame image, the one or more first instructions may be loaded and specifically executed by the processor 1001:

In another embodiment, the logic for showing the special effects of the dynamic template image has an abnormality, which includes at least one of the following situations: the process of fusing the reference object to the dynamic template image is abnormal, and the playing process of the fusion resource image is abnormal; accordingly, the one or more first instructions may also be loaded and specifically executed by the processor 1001:

In yet another embodiment, one or more second instructions stored in the computer storage medium may be loaded and executed by the processor 1001 to implement the corresponding steps of the method in the above-described embodiment of the image display method; in a specific implementation, one or more second instructions in the computer storage medium are loaded by the processor 1001 and perform the following steps:

It should be noted that according to an aspect of the present application, a computer program product or a computer program is also provided, and the computer program product or the computer program includes computer instructions, and the computer instructions are stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternatives in the aspect of the embodiment of the image detection method shown in fig. 2 or fig. 4 described above; or cause the computer device to perform the method provided in the various alternatives in the aspect of the embodiment of the image display method shown in fig. 6 described above.

It should be understood, however, that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.

Claims

1. An image detection method, comprising:

2. The method of claim 1, wherein the target resource is sent by a first test user to a second test user in a communication scenario;

3. The method of claim 2, wherein the performing object recognition on multiple candidate images of the recorded video to obtain a recognition result of each candidate image comprises:

4. The method according to any one of claims 1 to 3, wherein before performing object recognition on the plurality of candidate images of the recorded video to obtain the recognition result of each candidate image, the method further comprises:

5. The method of claim 4, wherein the framing the recorded video to obtain a plurality of initial images comprises:

6. The method of claim 4, wherein each frame of initial image has an image number, and the frames of initial images are arranged in sequence according to the image numbers; the determining, from the plurality of frames of initial images, a plurality of frames of candidate images of the recorded video comprises:

7. The method according to any one of claims 1 to 3, wherein the selecting a key frame image from the plurality of frames of candidate images according to the recognition result of each frame of candidate image comprises:

8. The method of claim 7, wherein the selecting a key frame image from the screened frames of candidate images comprises:

9. The method of any one of claims 1 to 4, wherein the performing anomaly detection on the special effect presentation logic of the dynamic template image according to the key frame image comprises:

10. The method of claim 9, wherein an abnormality in the special effect presentation logic of the dynamic template image comprises at least one of: the process of fusing the reference object into the dynamic template image is abnormal, or the playing process of the fusion resource image is abnormal;

the method further comprises the following steps:

11. The method of any one of claims 1 to 3, wherein the fusion resource image is played in the terminal screen when a special effect masking function has no abnormality and the special effect masking function is in an off state;

12. An image display method, comprising:

outputting a resource selection interface in response to a triggering operation by which a first terminal user sends a resource to a second terminal user in a communication scenario; the resource selection interface at least comprises a resource identifier of a target resource, wherein the resource identifier is displayed in the resource selection interface when it is determined, using the image detection method according to any one of claims 1 to 11, that the special effect presentation logic of the dynamic template image of the target resource has no abnormality;

13. An image detection apparatus, characterized by comprising:

14. An image display apparatus, comprising:

an output unit, configured to output a resource selection interface in response to a triggering operation by which a first terminal user sends a resource to a second terminal user in a communication scenario; the resource selection interface at least comprises a resource identifier of a target resource, wherein the resource identifier is displayed in the resource selection interface when it is determined, using the image detection method according to any one of claims 1 to 11, that the special effect presentation logic of the dynamic template image of the target resource has no abnormality;

15. A terminal comprising an input device and an output device, further comprising:

a computer storage medium storing one or more first instructions adapted to be loaded by the processor and to perform the image detection method of any of claims 1-11; alternatively, the computer storage medium stores one or more second instructions adapted to be loaded by the processor and to perform the image display method of claim 12.