CN111859025A - Expression instruction generation method, device, equipment and storage medium - Google Patents

Expression instruction generation method, device, equipment and storage medium

Info

Publication number
CN111859025A
Authority
CN
China
Prior art keywords
facial expression
user
video
specific facial
quantized amplitude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010631788.1A
Other languages
Chinese (zh)
Inventor
欧阳育军
吴怡蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huaduo Network Technology Co Ltd
Original Assignee
Guangzhou Huaduo Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huaduo Network Technology Co Ltd
Priority to CN202010631788.1A
Publication of CN111859025A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G06F16/784Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/735Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an expression instruction generation method, device, equipment and storage medium, wherein the method comprises the following steps: monitoring, in real time, a video image acquired by the local device's camera, identifying a specific facial expression image from the video image, tracking the change amplitude of the specific facial expression in that image, quantizing the change amplitude, and generating a corresponding quantized amplitude; in parallel with the monitoring, playing a target video provided correspondingly for the specific facial expression; and tracking and detecting whether the quantized amplitude meets a preset condition, and, when the quantized amplitude still meets the preset condition after the target video has finished playing, issuing the expression instruction to trigger a pre-associated event. The method and device have obvious and rich advantages: they achieve the effect of generating related instructions from the user's facial expression, belong to a basic technology, can expand and enrich various computer application scenarios, and are suitable for meeting various practical requirements of computer applications.

Description

Expression instruction generation method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer control technologies, and in particular, to a method, an apparatus, a device, and a storage medium for generating an expression instruction.
Background
With the rapid development of mobile communication technology, short-video platforms have emerged one after another, video applications have become increasingly common, and people are increasingly accustomed to spreading and acquiring information through them. The rapid popularization of video as an information form naturally gives rise to various application requirements based on video images. How to obtain the real-time dynamic state of a user, especially his or her various expressions, from video images, for interaction among users of video applications or for human-computer interaction, is a technology that urgently needs further research and deepening.
Traditional human-computer interaction or user-to-user interaction generally relies on words, images, videos and voice generated by users to carry and convey user intentions, and such data is usually used for instant interaction, which does not necessarily suit certain specific application scenarios. For example, in some entertainment or communication scenarios in which a user's expression is recognized and used to confirm an instruction or to decide an event trigger condition, instant interaction is not necessarily required; it may even be necessary to allow the user a period of hesitation and weighing, so as to accommodate a delayed decision before the relevant user instruction is finally confirmed. For such application scenarios, the traditional instant-interaction manner obviously cannot satisfy the application requirements.
On the other hand, regardless of which user action causes a mobile terminal to start its camera to acquire video images, the camera has never been developed and utilized as a user interaction portal. The internet economy is characterized by its dependence on user traffic. Given that AI face recognition technology is maturing day by day and short-video consumption is prevalent, if the camera can acquire the user's video data as an access entry to an internet platform's services and thereby provide richer interaction modes, the internet platform can generate larger user traffic, improve indicators such as daily active users and retention rate, and improve the software and hardware operating efficiency of the platform's related services. Obviously, to achieve such a goal, it is necessary to further consider providing the necessary technical support so that the invocation of the camera becomes effective.
From the above analysis, there is large room for technical mining and utilization of the user expression information contained in the video images acquired by terminal devices; making good use of this information helps to deepen human-computer interaction technology and facilitates the realization of richer internet service forms.
Disclosure of Invention
The first objective of the present application is to provide an expression instruction generating method, so as to more effectively utilize expression information contained in a user video image acquired by a terminal device in real time.
As another object of the present application, there is provided an expression instruction generation device adapted to the aforementioned method.
As a further object of the present application, an electronic device adapted thereto is provided based on the foregoing method.
As a further object of the present application, a non-volatile storage medium is provided, which is adapted to store a computer program implemented according to the method.
In order to meet the various objectives of the present application, the following technical solution is adopted:
the expression instruction generation method provided to serve the primary objective of the present application comprises the following steps:
monitoring, in real time, a video image acquired by the local device's camera, identifying a specific facial expression image from the video image, tracking the change amplitude of the specific facial expression in that image, quantizing the change amplitude, and generating a corresponding quantized amplitude;
in parallel with the monitoring, playing a target video correspondingly provided for the specific facial expression;
and tracking and detecting whether the quantized amplitude meets a preset condition, and when the quantized amplitude meets the preset condition after the target video is played, issuing the expression instruction to trigger a pre-associated event.
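A minimal sketch of how these three steps might be orchestrated in code follows; the function and parameter names are illustrative assumptions (not taken from the patent), with the recognition, playback and preset-condition logic supplied as callables.

```python
import queue
import threading
import time

def run_expression_instruction_flow(recognize_amplitude, play_target_video, condition):
    """Hypothetical orchestration of the three claimed steps."""
    amplitudes = queue.Queue()
    playback_done = threading.Event()

    def monitor():
        # Step 1: monitor camera frames, recognize the specific facial
        # expression and push its quantized change amplitude.
        while not playback_done.is_set():
            amp = recognize_amplitude()
            if amp is not None:
                amplitudes.put(amp)
            time.sleep(1 / 25)  # e.g. sample at 25 frames per second

    def playback():
        # Step 2: play the target video in parallel with the monitoring.
        play_target_video()
        playback_done.set()

    threading.Thread(target=monitor, daemon=True).start()
    threading.Thread(target=playback, daemon=True).start()

    satisfied = True
    # Step 3: track every quantized amplitude until playback ends.
    while not playback_done.is_set() or not amplitudes.empty():
        try:
            amp = amplitudes.get(timeout=0.1)
        except queue.Empty:
            continue
        if not condition(amp):
            satisfied = False  # violated at some moment during playback

    # Issue the expression instruction only if the condition still holds
    # after the target video has finished playing.
    return "EXPRESSION_INSTRUCTION" if satisfied else None
```

With stub implementations of the three callables this runs as-is; in a real application the monitoring callable would consume the camera stream, and the condition would encode one of the preset conditions described below.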
In one class of embodiments, the method of the present application further includes the following pre-step:
an access entry is displayed on the user interface of an application program of the local device, and the subsequent steps of the method are performed in response to an event that triggers the access entry.
In this type of embodiment, the application program is a camera program of the local device, and the access entry is represented as a preset control or a preset touch instruction of the camera program's interface;
or the application program is one installed on the local device for providing a video playing service, and the access entry is triggered by default by a specific type of video being accessed through that application, or is represented as a preset control or a preset touch instruction associated with the specific type of video.
In a preferred embodiment, the monitoring step and the playing step are respectively executed in different threads.
In some embodiments, the target video is a video that is pre-associated with the specific facial expression and is pushed by a remote server.
In one class of embodiments, the preset condition is configured to include a value interval corresponding to the quantized amplitude; the preset condition is satisfied when it is continuously detected, up to the end of the target video's playback, that the quantized amplitudes all exceed (or all belong to) the value interval; otherwise, once a quantized amplitude is detected to belong to (or, respectively, exceed) the value interval during playback, the preset condition is not satisfied;
or: the preset condition is configured to include a threshold corresponding to the quantized amplitude; the preset condition is satisfied when it is continuously detected, up to the end of the target video's playback, that the quantized amplitudes are all greater than (or all less than) the threshold; otherwise, once a quantized amplitude is detected to be less than (or, respectively, greater than) the threshold during playback, the preset condition is not satisfied.
In a more specific embodiment, when the quantized amplitude is detected, an average value of the quantized amplitudes within a fixed time length range is used to detect whether the quantized amplitude satisfies the preset condition.
In a further embodiment, in the step of tracking detection, when a preset condition is satisfied, result information about the satisfaction of the preset condition is output, and the result information is analyzed as the expression instruction, so as to trigger a pre-associated event according to the expression instruction.
In a preferred embodiment, the pre-associated event includes any one or more of the following: playing a preset special effect, sending reward information to the local user, opening the permission for a further game level, changing an identity characteristic parameter of the local user, and popping up a sharing operation interface.
In a further embodiment of the method,
when the step of monitoring the video image acquired by the camera in real time identifies, during the monitoring, that the specific facial expression image is not within the designated range of the camera lens, or identifies that the specific facial expression image contains the feature of the user's eyes being closed, the execution of the method is reset or terminated.
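A sketch of how such a reset check could look, assuming hypothetical per-frame detection results (a face-in-frame flag and an eye-closure flag) supplied by the recognition algorithm:

```python
def should_reset(face_in_frame: bool, eyes_closed: bool) -> bool:
    """Reset or terminate the flow when the user tries to dodge the video:
    the face leaves the designated lens range, or the eyes are closed."""
    return (not face_in_frame) or eyes_closed

# Example: per-frame detections from the recognition algorithm (hypothetical)
frames = [(True, False), (True, False), (False, False)]
for face_in_frame, eyes_closed in frames:
    if should_reset(face_in_frame, eyes_closed):
        print("cheating behaviour detected: resetting the method")
        break
```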
In an expanded embodiment, the method comprises the following steps:
and synchronously displaying an emoticon on a graphical user interface of the local device in parallel with the monitoring step, wherein the emoticon correspondingly represents the change of the specific facial expression image according to the quantized amplitude.
In a further embodiment, the video image, the emoticon, and the target video are loaded in the same graphical user interface for visual display.
Another object of the present application is to provide an expression instruction generation device, including:
the monitoring unit is used for monitoring, in real time, a video image acquired by the local device's camera, identifying a specific facial expression image from the video image, tracking the change amplitude of the specific facial expression in that image, quantizing the change amplitude, and generating a corresponding quantized amplitude;
a playing unit, configured to play a target video correspondingly provided for the specific facial expression in parallel with the monitoring;
And the triggering unit is used for tracking and detecting whether the quantized amplitude values meet preset conditions or not, and issuing the expression instruction to trigger a pre-associated event when the quantized amplitude values meet the preset conditions after the target video is played.
Another object of the present application is to provide an electronic device, which includes a central processing unit and a memory, the central processing unit being used for invoking and running a computer program stored in the memory to execute the steps of the expression instruction generation method.
A non-volatile storage medium storing a computer program implemented according to the expression instruction generation method, the computer program being invoked by a computer to execute the steps included in the method.
Compared with the prior art, the application has the following advantages:
firstly, after the method is run, specific facial expression images — such as images of expressions corresponding to joy, anger, sadness, smiling and the like — can be identified from the video images acquired by monitoring the local device's camera in real time, and on the basis of identifying one specific type of image the expression is further quantized and converted into a corresponding quantized amplitude; meanwhile, a target video pre-associated with that expression is played for the local user to watch, which can to some extent stimulate the user to change his or her mood and gradually show expression changes, and through continuous monitoring and quantization the application program can grasp the degree of the user's expression change relatively accurately, so that, depending on whether the quantized amplitude reaches a preset condition, the relevant expression instruction can be issued and a related event triggered. Therefore, in this technology, firstly, the user's real evaluation of the played target video can be quantitatively examined, and the obtained evaluation index can obviously be used for subsequent data mining; secondly, emotional stimulation of the user can be realized by technical means according to a set expected target (the preset condition); thirdly, an expression instruction for triggering a relevant event is generated by examining, while the user watches the target video, whether the quantized amplitude of the user's expression meets the preset condition, so that a delayed-decision mode for user instructions is in practice provided by technical means, and at the same time the instruction is determined according to the user's real emotion, which allows richer computer events to be matched to the user's emotion; and fourthly, virtual activities based on short videos can be realized with this technology: by requiring the user to watch the video while examining whether the quantized amplitude meets a certain preset condition, the user's facial expression activity is guided to meet certain requirements, an associated event is triggered when the quantized amplitude meets them, and by predefining the associated events as, for example, electronic-gift issuing events or system-permission opening events, the platform side can realize game-style interaction with the user and thereby realize the virtual activity.
Secondly, a program module developed according to the method can be integrated into a common third-party application program, for example an application program provided by a live-broadcast platform or a short-video service application, with an access entry set in the application program to invoke it; it can also be realized as a functional module of a terminal device, in particular of the camera program of a mobile terminal, so that the camera program serves as a traffic-directing entry, thereby providing richer traffic channels for the related platform parties, making efficient use of the mobile terminal's camera program, and improving the utilization efficiency of various terminal devices.
Moreover, program developers can flexibly change the rules for satisfying the preset conditions by flexibly setting those conditions, so as to serve various practical purposes. For example, for the purpose of stimulating a user's mood, the preset condition can be set to a relatively high threshold, and when the quantized amplitude of the user's smile exceeds that threshold a reward event is triggered. For another example, for the purpose of controlling a user's angry mood, a relatively low threshold may be set, and when the quantized amplitude of the user's angry expression falls below that threshold a certain reward may likewise be given, which has the effect of relieving the user's mood. As another example, a game can be implemented in which the user must keep a smile within a certain form: the preset condition is set to a numerical interval, and when the quantized amplitude of the user's smile goes beyond this interval the user fails the challenge, otherwise the challenge succeeds. Whether successful or not, some corresponding event may be triggered. Such applications are varied and flexible; obviously, a technical path for generating instructions from expressions is provided by technical means, thereby offering a new implementation for terminal devices to make efficient use of user expression information.
In addition, the application also provides a technical means for preventing the user from cheating. When a user moves the head away or closes the eyes while watching the video, attempting not to receive the video's information and yet still satisfy the preset condition, the method, which monitors the user's specific facial expression image, recognizes these situations and, in response, resets or terminates its execution. The execution logic and rules of the method are thus maintained, the user cannot satisfy the preset condition by cheating, the reliability of the method is improved, the user is ensured to actually watch the content of the target video, and the human-computer interaction between the user and the terminal device is guaranteed to always follow the expected rules. Obviously, once such a function is embodied, the method of the present application is also applicable to application scenarios of advertisement video promotion; that is, when the target video contains advertisement content, the manner of the present application can relatively ensure that the advertisement information therein is widely and effectively spread and received.
In summary, the application has obvious and rich advantages, achieves the effect of generating related instructions according to the facial expressions of the user, belongs to basic technology, can expand and enrich various computer application scenes, and is suitable for meeting various practical requirements based on computer application.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic diagram of a typical network deployment architecture related to implementing the technical solution of the present application;
FIG. 2 is a schematic flowchart of an exemplary embodiment of the expression instruction generation method of the present application;
FIG. 3 is a schematic block diagram of an exemplary embodiment of the expression instruction generation apparatus of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, "client," "terminal," and "terminal device" as used herein include both devices that are wireless signal receivers, which are devices having only wireless signal receivers without transmit capability, and devices that are receive and transmit hardware, which have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: cellular or other communication devices such as personal computers, tablets, etc. having single or multi-line displays or cellular or other communication devices without multi-line displays; PCS (personal communications Service), which may combine voice, data processing, facsimile and/or data communications capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global positioning system) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client," "terminal device" can be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. The "client", "terminal Device" used herein may also be a communication terminal, a web terminal, a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a Mobile phone with music/video playing function, and may also be a smart tv, a set-top box, and the like.
The hardware referred to by the names "server", "client", "service node", etc. is essentially an electronic device with the performance of a personal computer, and is a hardware device having necessary components disclosed by the von neumann principle such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, an output device, etc., a computer program is stored in the memory, and the central processing unit calls a program stored in an external memory into the internal memory to run, executes instructions in the program, and interacts with the input and output devices, thereby completing a specific function.
It should be noted that the concept of "server" as referred to in this application can be extended to the case of a server cluster. According to the network deployment principle understood by those skilled in the art, the servers should be logically divided, and in physical space, the servers may be independent from each other but can be called through an interface, or may be integrated into one physical computer or a set of computer clusters. Those skilled in the art will appreciate this variation and should not be so limited as to restrict the implementation of the network deployment of the present application.
Referring to fig. 1, the hardware basis required for implementing the related art embodiments of the present application may be deployed according to the architecture shown in the figure. The server 80 is deployed at the cloud end, and serves as a front-end application server, and is responsible for further connecting a related data server, a video streaming server, and other servers providing related support, so as to form a logically associated server cluster to provide services for related terminal devices, such as a smart phone 81 and a personal computer 82 shown in the figure. Both the smart phone and the personal computer can access the internet through a known network access mode, and establish a data communication link with the cloud server 80 so as to run a terminal application program related to the service provided by the server. In the related technical solution of the present application, the server 80 may be responsible for establishing the short video push service, and the terminal correspondingly runs an application program corresponding to the short video push service.
The application program referred to here is an application program running on the terminal device that implements the method of the present application by programming; its program code can be stored in a non-volatile storage medium recognizable by the terminal device and called into memory by the central processing unit to run, and the related apparatus of the present application is constructed through the running of this application program on the terminal device.
In order to support the operation of the application program, the terminal device is provided with a related operating system, such as iOS, Android, or another operating system providing equivalent functions; under the support of such an operating system, the adaptively developed application program can run normally, realizing human-computer interaction and remote interaction.
For the various terminal devices popular at present, in particular mobile devices such as tablets and mobile phones, a camera is usually built in, and a personal computer can also be connected to an external camera; under these conditions, an application program implementing the method of the present application can invoke the camera, so nothing prevents the method of the present application from being realized by utilizing such camera devices.
A terminal device equipped with a camera usually has a corresponding background driver and a front-end camera program built in: the driver drives the camera at the bottom layer of the system to obtain video images, while the camera program mainly provides the human-computer interaction interface and realizes the terminal device's photographing function by calling the driver. Therefore, the method of the present application can be implemented within the camera program, so that the camera program serves as a traffic-directing path that extends the access sources of the short-video push service platform.
The method of the application can also be programmed and built in an application program providing the network live broadcast, and the function can be expanded as a part of the application program. The live webcast refers to a live webcast room network service realized based on the network deployment architecture.
The live broadcast room is a video chat room realized by means of an internet technology, generally has an audio and video broadcast control function and comprises a main broadcast user and audience users, wherein the audience users can comprise registered users registered in a platform or unregistered tourist users; either registered users who are interested in the anchor user or registered or unregistered users who are not interested in the anchor user. The interaction between the anchor user and the audience user can be realized through known online interaction modes such as voice, video, characters and the like, generally, the anchor user performs programs for the audience user in the form of audio and video streams, and economic transaction behaviors can also be generated in the interaction process. Of course, the application form of the live broadcast room is not limited to online entertainment, and can be popularized to other relevant scenes, such as an educational training scene, a video conference scene, a product recommendation and sale scene, and any other scene needing similar interaction.
The application program related to the present application can generate an action of dispatching an electronic gift according to the requirements of the platform-side design. The electronic gift is a non-physical, electronic token representing some tangible or intangible value; its form of realization is broad and flexible, and it is usually presented to the user in a visual form, such as an icon, a quantity and a value. An electronic gift usually needs to be purchased and consumed by the user, but may also be a gift provided by the internet service platform; once generated, the electronic gift may itself be exchangeable for real-world value or may be non-exchangeable, depending on the technical implementation of the internet service platform, and this does not essentially affect the implementation of the present application. Accordingly, the act of the user purchasing an electronic gift constitutes an act of the user consuming it. The act of dispatching an electronic gift is represented at the program level and triggers a corresponding consumption event. The electronic gift can be purchased by the user within the application program, and the event triggered by the act of purchasing it can be regarded as a consumption event. When the platform side issues an electronic gift unconditionally and actively, information that the relevant reward has been obtained is correspondingly sent to the user, so that when the user receives the reward information, the user usually obtains a certain electronic gift given by the platform side.
The application program related to the application can trigger the action of playing special effects according to the requirement of the platform side design. The special effect is a technical control effect realized by a computer animation display effect or a similar mode, and is usually used for enhancing interactive perception atmosphere in a graphical user interface of an application program. When the special effect is triggered to play, the user interface can see the corresponding animation playing effect, so that the special effect is sensed. The realization form of the special effect is various and can be flexibly realized by the technicians in the field.
The application program related to the present application can open for the user the permission for a further game level according to the requirements of the platform-side design; this generally occurs where the application program is implemented in the scenario of a level-passing game. In this case, after the user successfully achieves a certain preset condition in the previous game level, the application program opens the access permission for the next game level, so that the user can enter the next level and continue the game.
The application program related to the application can change the identity characteristic parameters of the user according to the design requirements of the platform side. Generally, a user of an application, which may be a registered user on a platform side, is often in an account login state. The user usually has at least one identity characteristic parameter, for example, when the user is a main broadcasting user, the normally expressed user vote number represents an evaluation value of a certain dimension of the main broadcasting user, and the live broadcasting platform can set the relative status of the main broadcasting user among part or all other main broadcasting users according to the identity characteristic parameter, for example, a corresponding level label is identified for the main broadcasting user according to the specific evaluation value. For the acquisition of the identity characteristic parameters, the platform side can acquire or convert the identity characteristic parameters by designing various activities according to the needs of the platform side, for example, the identity characteristic parameters are quantitatively determined according to the number of concerned users of the anchor user, the activity of the users during live broadcasting, the profit capacity during live broadcasting room activities and any similar trigger factors. Such identity characteristic parameters can be further used for secondary data mining by the live broadcast platform, more complex platform functions are realized based on the identity characteristic parameters, and various activities of the live broadcast room of the anchor user are served as necessary. Of course, these identity characteristic parameters may also be expressed as "personality values" for example, and it is obvious that the identity of the user can be represented to some extent by converting the quantified amplitude of the expression of the user into the "personality values" and increasing or decreasing the quantified amplitude correspondingly.
The application program related to the application can be triggered to pop up a sharing operation interface according to the requirement designed by the platform side, the interface usually contains related information for popularization, and a user can share the related information to a third-party service platform or the application program or share the related user, friends, comment areas and the like of the same platform through further operation on the interface, so that more users can access functions related to the method of the application through the shared information, and the application program of the application can be popularized in a fission mode.
The person skilled in the art will know this: although the various methods of the present application are described based on the same concept so as to be common to each other, they may be independently performed unless otherwise specified. In the same way, for each embodiment disclosed in the present application, it is proposed based on the same inventive concept, and therefore, concepts of the same expression and concepts of which expressions are different but are appropriately changed only for convenience should be equally understood.
Referring to fig. 2, an exemplary embodiment of the expression instruction generation method of the present application is presented in the form of an entertainment activity so as to be disclosed more vividly, but it will be understood by those skilled in the art that this form of presentation does not affect the essence of the method, which adopts sufficient technical means.
In an exemplary embodiment, the method of the present application includes the steps of:
step S11, monitoring a video image acquired by the local camera device in real time, identifying a specific facial expression image therefrom, tracking a variation range of a specific facial expression therein, quantizing the variation range, and generating a corresponding quantized amplitude:
when the application program implementing the method is started and the steps of the method are started, firstly, the camera device of the local terminal device where the application program is located is started, and the video image is started to be acquired in real time through the camera device.
In a practical scenario of the method, the user is usually the user of the terminal device, such as a handheld device like a mobile phone or tablet. Such a device usually has a front camera and a rear camera, or has a camera that serves as either by switching its orientation; therefore the camera turned on here may default to the front camera, or the camera may be placed in an orientation suitable for serving as the front camera, mainly so as to acquire the video image of the terminal device's user. Of course, on some special occasions, for example in a split-screen scenario for displaying the graphical user interface output by the terminal device, the rear camera of the terminal device can be used instead of the front camera. It should be understood that the choice between front and rear camera for capturing the facial expression interacting with the application program, determined according to the specific use case, should not be a factor affecting the inventive spirit of the present application.
As one of the functions conventionally implemented by the application program, a video image acquired by the camera device is displayed in the current graphical user interface of the local terminal device, so that a user of the local terminal device can intuitively know the instant expression state of the user. The video image acquired by the camera equipment can be displayed in a graphical user interface in a half-window mode or in a full-screen mode; the display can be performed in a semitransparent layer mode or a non-transparent layer mode, and the implementation of the application is not affected.
The algorithms for recognizing specific types of facial expression images from the video images, including various algorithms obtained by machine learning and various algorithms implemented based on AI technology, are well known to those skilled in the art, and all of the functions and effects of recognizing specific facial expression images from video images acquired by the camera device can be realized. In some cases, the relevant algorithms to achieve this function and effect can be programmed into the relevant SDK and subsequently utilized by developers by way of method calls. Alternatively, known algorithms for achieving equivalent functions and effects may be specifically programmed by developers to be implemented in the application program of the method. And so on, as would be known to one skilled in the art.
In principle, recognizing a specific facial expression image is usually recognized in units of image frames of a video image, and the purpose of recognizing a specific facial expression image is undoubtedly to determine whether the current facial expression of the user belongs to a specific type. The specificity of the facial expression refers to various expressions formed by the natural expression of human emotions such as joy, funny, wild laughing, twitching, crying, hurting, sadness, excitement … … and the like, and the expressions are classified in advance to form various facial expressions, for example, facial expressions related to smiling, laughing and wild laughing can be classified as smiling face class, and twitching and crying can be classified as crying face class, and the like. The specific facial expression refers to a facial expression activity state corresponding to one type of facial expression, and accordingly, the specific facial expression image is represented in the video image, namely one frame or several frames of video images representing the facial expression activity state. It will be appreciated that one skilled in the art may determine a particular facial expression image by technical means to confirm that the user's face is exhibiting a particular facial expression.
The user's facial expression can hardly remain still, particularly when the application program implementing the method plays the target video to apply emotional stimulation to the user; therefore this step monitors whether the change amplitude of the user's facial expression remains unchanged or varies, that is, it monitors the changing state of the specific facial expression.
In order to quantify the change in the user's facial expression, a program developer can, in this method, establish in advance a quantization standard for the degree of change of the user's facial expression according to factual rules. For example, different quantized amplitudes can correspond to different degrees to which the corners of the mouth turn up and/or the mouth opens: generally speaking, a closed-mouth smile corresponds to a lower quantized amplitude and an open-mouth laugh to a higher one, and in this way a quantization reference system can be determined. Quantization methods for other specific types of facial expression are likewise possible. Thus, the corresponding quantized amplitude can be determined according to the specific change amplitude of the specific facial expression in the video image. Of course, there are many ways of converting a facial expression into a quantized amplitude according to its change amplitude; they are not limited to the examples here, and those skilled in the art can implement them flexibly in light of these examples, which will not be repeated.
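As a concrete illustration of such a quantization reference system, the sketch below maps hypothetical mouth-landmark measurements (corner upturn and mouth opening, both normalized to face size) onto a 0–100 quantized amplitude; the weights and ranges are illustrative assumptions, not values from the patent.

```python
def quantize_smile(corner_upturn: float, mouth_opening: float) -> float:
    """Map mouth-corner upturn and mouth-opening degree (each normalized to
    the 0..1 range relative to face size) to a 0..100 quantized amplitude.
    A closed-mouth smile yields a low amplitude, an open-mouth laugh a high one."""
    corner_upturn = min(max(corner_upturn, 0.0), 1.0)
    mouth_opening = min(max(mouth_opening, 0.0), 1.0)
    # illustrative weighting: corner upturn dominates, mouth opening adds on top
    return round(100 * (0.6 * corner_upturn + 0.4 * mouth_opening), 1)

print(quantize_smile(0.3, 0.0))   # closed-mouth smile -> low amplitude
print(quantize_smile(0.9, 0.8))   # open-mouth laugh  -> high amplitude
```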
As mentioned above, the video image is composed of image frames, such as 60 frames/second or 25 frames/second, and in any case, when quantization is performed, the image frames are acquired and subjected to the recognition to recognize the specific facial expression.
In the present exemplary embodiment, the recognition may be performed for each frame so that the recognition of the facial expression of the user is accurate to each minute expressive activity. In other embodiments, the non-adjacent image frames may be identified by, for example, interval sampling, in consideration of allowing the change in facial expression of the user to be changed within a certain range. The inventive spirit of the present application is not affected by any embodiment.
In an improved embodiment, in order to avoid the uneven property of the change of the quantized amplitude value due to the roughness of the algorithm, when the quantized amplitude value is detected, the quantized amplitude value within a fixed time length range, such as 1 second, is averaged, and the averaged quantized amplitude value is provided to the subsequent step for detecting whether the quantized amplitude value meets the preset condition.
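A minimal moving-average sketch of this smoothing, assuming 25 amplitude samples per second so that a one-second window covers 25 values (the class and window size are illustrative):

```python
from collections import deque

class AmplitudeSmoother:
    """Average the quantized amplitudes over a fixed-length time window
    (e.g. one second of frames) before checking the preset condition."""

    def __init__(self, window_size: int = 25):  # ~1 s at 25 frames per second
        self.window = deque(maxlen=window_size)

    def push(self, amplitude: float) -> float:
        self.window.append(amplitude)
        return sum(self.window) / len(self.window)

smoother = AmplitudeSmoother()
for raw in (40, 70, 55, 65):       # raw per-frame quantized amplitudes
    print(smoother.push(raw))      # smoothed value handed to the detector
```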
It is noted that tracking the amplitude of change of a particular facial expression to achieve quantification as described herein means that the particular facial expression is quantified only after it has been identified from the image frames, but it should be understood that the tracking may also be non-instantaneous, e.g., a suitable time difference may occur, and that the time difference is suitably such that it does not cause the user to perceive a significant difference between the quantified amplitude and his or her true emotion.
This step can be implemented as a callback function in an SDK that is convenient to invoke programmatically, so that it can run as an independent thread in parallel with the main process and the other threads of the application program implementing the method; the thread, the other threads and the main process thus run in parallel without conflict.
It can be understood that the quantized amplitude of the facial expression of the user, which changes relatively in real time, is obtained by monitoring the video image of the camera device and identifying and quantizing the specific facial expression, and the quantized amplitude can be returned to the caller of the callback function for further utilization.
Step S12, in parallel with the monitoring, playing the target video provided corresponding to the specific facial expression:
in the exemplary embodiment of the present application, in order to implement parallelism of functions executed by the present step and the previous step, and to implement playing the target video while monitoring, the present step may be implemented as an independent thread, so that the video provided for the specific facial expression may be played "synchronously" while the previous step implements the monitoring action.
The target video is usually prepared in advance to match the design intention of the program developer, and may be stored on a remote server or on the terminal device; this does not affect the implementation of the present application.
And establishing a logical association between the target video and the specific facial expression according to the design intention of the developer. For example, for a design intention aiming at stimulating a user to laugh or control a sad emotion by watching a target video, the target video is preferably some comedy fragments for laughing or other witty humorous video content; for a design intent intended to train the face expression ability of a user by watching a target video, the target video may prefer segments of various effects, including comedy segments, tragedy segments, and the like. It should be noted that the association between the target video and the specific facial expression is logical, and can be flexibly matched by those skilled in the art according to the program development intention in combination with the examples herein, and the matching relationship between the two is not one-to-one, but can be various, depending on the program development intention. It can be appreciated how the specific content of the target video is specifically matched to a particular facial expression, without affecting the implementation of the inventive spirit of the present application.
In the exemplary embodiment of the present application, the target video may be pushed by a remote server. Specifically, when the application program implementing the method identifies for the first time that the user's facial expression belongs to the specific type, it submits, in accordance with the developer's specification at programming time and in line with the application's design intention, a specific label of the specific facial expression to the remote server; the remote server selects a target video from a preset video library according to the label and pushes it to the terminal device, and the application program on the terminal device can then open a playing window and play the target video.
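One way this exchange could look, sketched with a hypothetical HTTP endpoint and JSON fields (the URL, field names and response format are assumptions for illustration only):

```python
import json
import urllib.request

def request_target_video(expression_label: str) -> str:
    """Submit the specific facial-expression label to the remote server and
    receive the URL of a target video chosen from its preset video library.
    Endpoint and payload shape are hypothetical."""
    payload = json.dumps({"expression_label": expression_label}).encode("utf-8")
    req = urllib.request.Request(
        "https://example.com/api/target-video",   # hypothetical endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["video_url"]

# e.g. when the app recognizes a smiling face for the first time:
# video_url = request_target_video("smile")
```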
In an improved embodiment, the current graphical user interface of the terminal device allows the target video to be played in full screen, and at the same time, the video image obtained by the camera device in real time is not displayed in the current graphical user interface, but the application program still monitors and detects the quantized amplitude at the background, and displays an emoticon displayed in animation form in the current graphical user interface, so that the animation change state of the emoticon is associated with the quantized amplitude returned in the monitoring step, and the emoticon correspondingly represents the change of the specific facial expression image according to the quantized amplitude, therefore, a user of the terminal device can perceive the change degree of the facial expression of the user through the emoticon, even though the user cannot view the video image of the user in real time.
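A sketch of how the animated emoticon could be driven by the quantized amplitude returned from the monitoring step, using illustrative amplitude bands and emoticon names that are assumptions rather than values from the patent:

```python
def emoticon_for_amplitude(amplitude: float) -> str:
    """Pick an emoticon frame that mirrors the current quantized amplitude,
    so the user can perceive their expression change without seeing the
    camera preview. The bands below are illustrative assumptions."""
    if amplitude < 30:
        return "neutral_face"
    if amplitude < 60:
        return "slight_smile"
    return "big_laugh"

for amp in (10, 45, 80):
    print(amp, "->", emoticon_for_amplitude(amp))
```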
In another improved embodiment, in consideration of the requirement of monitoring in the previous step and the requirement of playing in the present step, in order to improve user experience and increase interaction perception, an upper half window and a lower half window may be respectively used for displaying a video image acquired by the camera device in real time and playing the target video in a graphical user interface of the terminal device. Similarly, the emoticon described in the previous embodiment may also be implemented in a graphical user interface, so that the video image, the emoticon, and the target video are simultaneously loaded and displayed in the current graphical user interface.
It will be further appreciated from the above disclosure that this step may run, in the memory of the terminal device, in parallel with the previous step that implements the monitoring.
Step S13, tracking and detecting whether the quantized amplitude meets the preset condition, and when the quantized amplitude meets the preset condition after the target video is played, issuing the expression instruction to trigger the pre-associated event:
this step can be implemented by the main process of the application of the method or by other parallel processes. When the target video is being played on the graphical user interface, and the terminal device is outputting the quantized amplitude of the specific facial expression in the video image acquired by the camera device in real time, one process of the application program may be responsible for executing the function of this step in parallel.
Specifically, during the playing of the target video, the step of monitoring continuously outputs the quantized amplitude values in response to the actual facial expression activities of the user, so that the step tracks each output quantized amplitude value to detect whether the quantized amplitude value satisfies a predetermined condition. This predetermined condition is predetermined according to the programming intent, and is usually already specified in the code of the application or in one or more rule files or data tables for the application to call.
Detecting whether the quantized amplitude meets preset conditions, and adopting a plurality of corresponding implementation modes according to different preset conditions, wherein the detection methods are determined based on different program design intents as follows:
in the first type of detection method, the preset condition is configured to include a value interval corresponding to the quantized amplitude value, and whether the actual facial expression change condition of the current user meets the preset condition is determined by detecting whether the obtained quantized amplitude value exceeds or belongs to the value interval. The alternative cases of "exceeding" and "belonging" are included here, and are suitable for different programming intents.
For example, for the "belonging" case, the application of the present application is designed to stimulate the user to smile by playing the target video, and the rule of the game agreed with the user is that the user must keep the smile within a certain range of variation — which can be vividly understood as: if the user holds the smile back, the challenge succeeds; if not, it fails. Accordingly, the value interval is set to (30, 60): if the quantized amplitude representing the user's facial expression stays between 30 and 60 through to the end of the target video's playback, the user is considered to have succeeded in the challenge and the preset condition is satisfied; conversely, if at any moment during playback the quantized amplitude falls below 30 or rises above 60, the user of the terminal device fails the challenge. In this case, as soon as the callback function related to the monitoring returns a quantized amplitude outside the range of 30 to 60, it is determined that the preset condition is not satisfied.
As another example, the "exceeding" case differs from the former in how the value interval is used for comparison, and is suitable, for instance, for encouraging the user to reach a large emotional fluctuation, such as a case where the user must continuously either keep laughing (quantized amplitude above 60) or hold the laugh back entirely (quantized amplitude below 30) to satisfy the preset condition; if the user's facial expression does not change and the quantized amplitude stays in the interval of 30 to 60, the preset condition is not satisfied.
In the second type of detection method, the detection condition can be further simplified. This method differs from the first mainly in that a threshold is set instead of the aforementioned value interval; for example, the threshold corresponding to the specific facial expression of a smiling face is set to 50. With such a threshold, and depending on whether the programmer's design intent is that the user should laugh or should not laugh, the preset condition is configured to be satisfied when the obtained quantized amplitudes are all greater than, or all less than, the threshold throughout the playing of the target video; otherwise, if at any time during playback the quantized amplitudes corresponding to the facial expressions in one or more image frames are detected to be correspondingly less than, or greater than, the threshold, the preset condition is determined not to be satisfied.
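By way of non-limiting illustration only, the two detection methods described above may be sketched in Python as follows; the function names check_interval and check_threshold, the mode flags and the default values 30, 60 and 50 are assumptions introduced here for illustration and are not identifiers prescribed by the present application.

    # Illustrative sketch only; names and numeric values are assumptions.
    def check_interval(amplitude, low=30.0, high=60.0, mode="belong"):
        """First detection method: compare one quantized amplitude with a value interval.
        mode="belong": the condition holds while the amplitude stays inside (low, high);
        mode="exceed": the condition holds while the amplitude stays outside that interval."""
        inside = low < amplitude < high
        return inside if mode == "belong" else not inside

    def check_threshold(amplitude, threshold=50.0, mode="above"):
        """Second detection method: compare one quantized amplitude with a single threshold."""
        return amplitude > threshold if mode == "above" else amplitude < threshold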
It can be understood that, based on the above two detection methods, various preset conditions can be flexibly designed for different programming purposes and intents. Given this disclosure of how the preset conditions are applied, a person skilled in the art can configure and implement them flexibly in the course of program development.
In summary, in a typical application of the present application, whether the quantized amplitude values satisfy the preset condition is detected continuously while the target video plays. If all quantized amplitude values satisfy the preset condition by the time the target video finishes playing, this may be regarded as the generation condition of an expression instruction for triggering a default, pre-associated event; if the quantized amplitude fails to satisfy the preset condition at any time during playback, additional processing may be performed, for example generating another expression instruction for triggering a different pre-associated event, or even terminating the playing of the target video.
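As a further non-limiting sketch, the continuous tracking described above may be expressed as a loop that consumes each quantized amplitude emitted during playback and stops as soon as the preset condition is broken or the video ends; the names track_amplitudes, on_success and on_failure, and the assumption that the stream yields (amplitude, playback_finished) pairs, are illustrative only.

    # Illustrative sketch; amplitude_stream yields (amplitude, playback_finished) pairs.
    def track_amplitudes(amplitude_stream, condition, on_success, on_failure):
        for amplitude, playback_finished in amplitude_stream:
            if not condition(amplitude):
                on_failure()      # preset condition broken during playback
                return
            if playback_finished:
                on_success()      # condition held until the target video ended
                return

Used together with the previous sketch, the condition argument could be, for example, lambda a: check_interval(a, 30, 60, "belong").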
From the technical means whereby the pre-associated event is triggered when the preset condition is satisfied, it can be understood that satisfying the preset condition (or, equally, failing to satisfy it) serves in the method of the present application as the trigger condition of a preset computer event, and can therefore be understood as a generalized computer instruction, namely the "expression instruction" of the present application. That is to say, the expression instruction described in the present application refers to the result information generated, through the foregoing processing, according to whether the quantized amplitude satisfies the preset condition; this result information is parsed as an expression instruction by the application program of the present application, and according to it the method of the present application triggers a pre-associated computer event. These pre-associated events may in principle be any type of computer event executable by computer code, such as playing a predetermined special effect, sending reward information to the local user, opening a further link, changing an identity characteristic of the local user, popping up a sharing operation interface, and so forth, as previously described.
As for the two results produced according to whether the quantized amplitude satisfies the preset condition: when the preset condition is satisfied, this is regarded as the generation condition of one expression instruction and triggers one pre-associated event, referred to as the first event; when the preset condition is not satisfied, this likewise constitutes the generation condition of another expression instruction and triggers another pre-associated event, referred to as the second event. Therefore, as long as the developer, at programming time, associates the two results, satisfied and not satisfied, with different preset events, the judgment result can be used to make the terminal device carry out the next action whether or not the preset condition is satisfied, so that the execution result of the corresponding event is presented. In a typical application scenario, in accordance with the programming intent, a user holds back laughter throughout the watching of the target video, so that the corresponding quantized amplitude remains within the value interval until the target video finishes playing; this is regarded as satisfying the preset condition, and the hold-back-laughter challenge agreed between the user and the application program succeeds. The application program obtains the corresponding result information, issues an expression instruction with that result as its generation condition, and accordingly triggers the pre-associated first event: the first event first outputs notification information of the successful challenge to the graphical user interface of the terminal device, then dispatches a preset electronic gift to the logged-in user of the application program and credits the added electronic gift to that user's personal account. In a second round of the challenge, by contrast, the user is made to laugh partway through by the plot of the target video, the quantized amplitude corresponding to the user's facial expression exceeds the value interval, and the user's facial expression performance does not satisfy the preset condition; the application program then obtains the corresponding result information, uses it as an expression instruction to trigger the associated second event, and the second event merely outputs notification information of "challenge failed" in the current graphical user interface.
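The correspondence between the two results and the two pre-associated events may be sketched, purely by way of assumption, as a simple dispatch table; the bodies of first_event and second_event are placeholders standing in for the notification, gift-dispatch and "challenge failed" behaviours described above.

    # Illustrative sketch; event bodies are placeholders for the behaviours described above.
    def first_event():
        print("challenge succeeded")                              # notify the user
        print("electronic gift credited to the user's account")   # reward the logged-in user

    def second_event():
        print("challenge failed")                                 # only notify the user

    PRE_ASSOCIATED_EVENTS = {True: first_event, False: second_event}

    def issue_expression_instruction(condition_satisfied):
        # The result information acts as the expression instruction and simply
        # triggers the event pre-associated with it.
        PRE_ASSOCIATED_EVENTS[condition_satisfied]()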
In order to facilitate integration of the functions implemented by the method of the present application into a functional module of the application program of the present application, a preliminary step may be added to the method with reference to the following improved embodiments: an access entry for the method is provided in the application program, so that a user may invoke an application instance implemented by the method through that access entry, and when the user operates the access entry, an event that invokes the application instance of the method through the access entry is triggered. This is detailed as follows:
In one embodiment, the application program of the present application is one based on a video playing service such as short video. The platform side pushes, to users of the application program, randomly or according to a certain policy, the target video used to trigger execution of the relevant code of the method. A user of the terminal device "swipes" to this target video while browsing short videos with the application program and views it in the graphical user interface of the terminal device; the target video itself forms the access entry, so that the user is automatically invited to take part in a "hold-back-laughter" activity, and once the user accepts the invitation an application instance of the method begins to execute, thereby making use of the method of the present application. A target video of this type usually carries a relevant tag, so that it can be identified by the tag attached when the server pushes it. Of course, in this embodiment the target video need not by default automatically invite the user to participate in the application instance of the method; instead, a corresponding preset control may be provided on the graphical user interface as the access entry, or a preset touch instruction operated on the terminal device may serve by default as the access entry. In any case, as long as the application instance of the method of the present application can ultimately be invoked, the implementation of the present application is not affected.
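Assuming, for illustration only, that each pushed video carries a list of tags, identifying the target video that forms the access entry might be sketched as follows; the tag value "hold_back_laughter" and the function names are hypothetical.

    # Illustrative sketch; the tag value and function names are assumptions.
    def is_challenge_entry(video_tags):
        return "hold_back_laughter" in video_tags

    def on_video_displayed(video_tags):
        if is_challenge_entry(video_tags):
            print("invite the user to the hold-back-laughter activity")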
In another embodiment, in order to turn the camera program of a terminal device into a traffic entry for the Internet, and with reference to the previous embodiment, the camera program serves as the application program described in the present application: an application instance of the method of the present application is integrated into the camera program, and the preset control is provided in the graphical user interface of the camera program, or the preset touch instruction is recognized in that interface, in order to invoke the instance of the method of the present application. The invocation of the method instance can likewise be implemented in this way.
It can be seen that the way of setting the access entry for invoking an instance of the method of the present application is very flexible, and those skilled in the art can implement it flexibly according to the disclosure of the above embodiments.
On the other hand, in the monitoring process of step S11, in order to prevent the user from cheating by improper means such as turning the face away, closing the eyes so as not to watch the target video, or presenting a static facial expression picture, an alarm message may be generated whenever it is recognized during monitoring that the specific facial expression image in the video image is not within the specified range of the lens, or that the specific facial expression image contains a feature such as the user's eyes being closed or unfocused. The alarm notifies the application program of the present application to terminate or reset the playing of the target video, thereby implementing a punishment mechanism against rule violations by the user of the terminal device. It should be noted that the technical means for identifying from a video image whether the eyes in a facial expression image are closed and whether the two eyes are focused are known to those skilled in the art, and a detailed description is therefore omitted.
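The anti-cheating check described above may be sketched, under assumed flag names, as a per-frame test that raises an alarm when the face leaves the lens range or the eyes are closed or unfocused; the flag names are assumptions about what the face-analysis step reports for each frame.

    # Illustrative sketch; the flag names are assumptions.
    def anti_cheat_check(face_in_lens_range, eyes_closed, eyes_focused):
        if not face_in_lens_range or eyes_closed or not eyes_focused:
            return "alarm"   # the application should terminate or reset playback
        return "ok"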
Besides the application scenarios involved in the disclosure of the above embodiments, the method of the present application can also be applied to scenarios that require intelligent recognition of a user's attitude in order to implement a deferred decision, from which further embodiments in various situations can be derived.
In one class of embodiments, the target video may be content such as a movie, a drama, or news. While the target video is playing, the camera device is started in the background to acquire video images in real time; the viewer's facial expression is identified from the video images, specific facial expression images are recognized, including smiling faces representing different degrees of satisfaction or disgusted faces representing different degrees of dislike, and these are correspondingly converted into quantized amplitude values. When the quantized amplitude values satisfy the preset condition, a corresponding expression instruction is generated according to the pre-established association, and a pre-associated event, such as an evaluation event, is triggered. In this way the method can be used to obtain the user's overall satisfaction with the target video being watched and, ultimately, evaluation data for the target video; such evaluation data are naturally also suitable for further data mining. Naturally, to adapt to this scenario the preset condition and its trigger mechanism may be modified where necessary: for example, when detecting whether the quantized amplitude satisfies the preset condition, it may further be considered whether the sum or the average of the quantized amplitudes over multiple time periods of the whole playing process reaches or exceeds a certain threshold, in which case the preset condition is determined to be satisfied; if the sum or average of the quantized amplitudes of one or more time periods fails to reach the threshold, the preset condition is considered not to be satisfied, the viewer is judged not to like the content played by the target video, and the playing of the target video is terminated or reset.
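The modified trigger mechanism of this evaluation scenario may be sketched as follows; the division into fixed-length periods and the names satisfied_over_periods, period_length and threshold are illustrative assumptions.

    # Illustrative sketch; amplitudes is the sequence of quantized amplitudes
    # collected while the target video played.
    def satisfied_over_periods(amplitudes, period_length, threshold):
        periods = [amplitudes[i:i + period_length]
                   for i in range(0, len(amplitudes), period_length)]
        averages = [sum(p) / len(p) for p in periods if p]
        # The preset condition holds only if every period's average reaches the threshold.
        return bool(averages) and all(avg >= threshold for avg in averages)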
Therefore, according to the teaching of the above embodiments of the present application, those skilled in the art can apply the method of the present application to various application scenarios, so that its beneficial effects can be embodied and exerted to the greatest extent.
Further, an expression instruction generation apparatus of the present application may be constructed by modularizing the steps of the methods disclosed in the above embodiments. Following this idea, and referring to fig. 3, in an exemplary embodiment the apparatus includes:
the monitoring unit 51 is configured to monitor a video image acquired by the local camera device in real time, identify a specific facial expression image from the video image, track the change amplitude of a specific facial expression in the specific facial expression image, quantize the change amplitude, and generate a corresponding quantized amplitude;
a playing unit 52, configured to play the target video correspondingly provided for the specific facial expression in parallel with the monitoring;
and the triggering unit 53 is configured to track and detect whether the quantized amplitude values meet preset conditions, and issue the expression instruction to trigger a pre-associated event when the quantized amplitude values all meet the preset conditions after the target video is played.
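For illustration only, the cooperation of the three units may be sketched as follows; the class name and method names are assumptions and do not limit the apparatus.

    # Illustrative sketch of the apparatus structure; names are assumptions.
    class ExpressionInstructionApparatus:
        def __init__(self, listening_unit, playing_unit, triggering_unit):
            self.listening_unit = listening_unit      # unit 51: produces quantized amplitudes
            self.playing_unit = playing_unit          # unit 52: plays the target video in parallel
            self.triggering_unit = triggering_unit    # unit 53: checks the preset condition

        def run(self):
            self.playing_unit.start()
            for amplitude in self.listening_unit.amplitudes():
                self.triggering_unit.track(amplitude)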
Further, to facilitate the implementation of the present application, the present application provides an electronic device, including a central processing unit and a memory, where the central processing unit is configured to call and run a computer program stored in the memory to perform the steps of the expression instruction generation method in the foregoing embodiments.
It can be seen that the memory is suitably a non-volatile storage medium: the aforementioned method is implemented as a computer program and installed in an electronic device such as a mobile phone or a computer, the relevant program code and data are stored in the non-volatile storage medium of the electronic device, and the program is executed by the central processing unit of the electronic device after being called from the non-volatile storage medium into memory, so as to achieve the intended purpose of the present application. It is therefore understood that, in an embodiment of the present application, a non-volatile storage medium may further be provided, in which a computer program implemented according to the various embodiments of the expression instruction generation method is stored; when the computer program is called by a computer, it executes the steps included in the method.
In conclusion, the present application has obvious and rich advantages: it realizes the effect of generating relevant instructions according to the facial expression of the user, belongs to a basic technology capable of extending and enriching a variety of computer application scenarios, and is suitable for meeting various practical needs based on computer applications.
As will be appreciated by one skilled in the art, the present application includes apparatus directed to performing one or more of the operations and methods described herein. These devices may be specially designed and manufactured for the required purposes, or they may comprise known devices in a general-purpose computer, selectively activated or reconfigured by computer programs stored in their memories. Such a computer program may be stored in a device (e.g., computer) readable medium, including, but not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROMs (Read-Only Memories), RAMs (Random Access Memories), EPROMs (Erasable Programmable Read-Only Memories), EEPROMs (Electrically Erasable Programmable Read-Only Memories), flash memories, magnetic cards, or optical cards, or any other type of medium suitable for storing electronic instructions, each coupled to a bus. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
It will be understood by those within the art that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks therein, can be implemented by computer program instructions. Those skilled in the art will appreciate that these computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the aspects specified in the block or blocks of the block diagrams and/or flowchart illustrations disclosed herein.
Those of skill in the art will appreciate that the various operations, methods, steps in the processes, acts, or solutions discussed in this application can be interchanged, modified, combined, or eliminated. Further, other steps, measures, or schemes in various operations, methods, or flows that have been discussed in this application can be alternated, altered, rearranged, broken down, combined, or deleted. Further, steps, measures, schemes in the prior art having various operations, methods, procedures disclosed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and such improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims (15)

1. An expression instruction generation method is characterized by comprising the following steps:
monitoring a video image acquired in real time by the camera shooting equipment of the local machine, identifying a specific facial expression image from the video image, tracking the change amplitude of a specific facial expression in the specific facial expression image, quantizing the change amplitude, and generating a corresponding quantized amplitude;
in parallel with the monitoring, playing a target video correspondingly provided for the specific facial expression;
and tracking and detecting whether the quantized amplitude meets a preset condition, and when the quantized amplitude meets the preset condition after the target video is played, issuing the expression instruction to trigger a pre-associated event.
2. The method of claim 1, further comprising the preliminary steps of:
an access entry is displayed at the user interface of the native application, and subsequent steps of the method are performed in response to an event that triggers the access entry.
3. The method according to claim 2, wherein the application program is a native camera program, and the access entry is represented as a preset control or a preset touch instruction of the camera program interface;
or the application program is one installed on the local machine for providing a video playing service, and the access entry is by default a specific type of video being accessed through the video playing service application program, or the access entry is represented as a preset control or a preset touch instruction associated with the specific type of video.
4. The method of claim 1, wherein the listening step and the playing step are run on different threads.
5. The method of claim 1, wherein the target video is a video previously associated with the specific facial expression and pushed by a remote server.
6. The method according to claim 1, wherein the preset condition is configured to include a value interval corresponding to the quantized amplitude value, and when the quantized amplitude value is detected to exceed or belong to the value interval continuously by the end of the target video playing, the preset condition is satisfied, otherwise, when the quantized amplitude value is detected to belong to or exceed the value interval in the target video playing process, the preset condition is not satisfied;
Or: the preset condition is configured to include a threshold corresponding to the quantized amplitude, and the preset condition is satisfied when the quantized amplitude is continuously detected to be greater than/less than the threshold by the end of the target video playing, otherwise, the preset condition is not satisfied when the quantized amplitude is detected to be less than/greater than the threshold in the target video playing process.
7. The method of claim 6, wherein when detecting the quantized amplitude, an average value of the quantized amplitude within a fixed time period is used to detect whether the quantized amplitude satisfies the predetermined condition.
8. The method according to claim 1, wherein in the step of tracking detection, when the preset condition is satisfied, result information about the satisfaction of the preset condition is output, and the result information is parsed into the expression instruction, so as to trigger a pre-associated event according to the expression instruction.
9. The method of claim 8, wherein the pre-correlated event comprises any one or more of: playing a preset special effect, sending reward information to a local user, starting the authority of a further customs link, changing the identity characteristic parameters of the local user, and popping up a sharing operation interface.
10. The method of claim 1,
when the step of monitoring the video image acquired by the camera equipment in real time identifies that the specific facial expression image is not in the lens designated range in the monitoring process, or identifies that the specific facial expression image contains the eye closing feature of the user, the execution of the method is reset or stopped.
11. Method according to claim 1, characterized in that it comprises the following steps:
and synchronously displaying the emoticons on a graphical user interface of the local machine in parallel with the monitoring step, wherein the emoticons are used for correspondingly representing the change of the specific facial expression image according to the quantitative amplitude.
12. The method of claim 11, wherein the video image, the emoticon, and the target video are loaded in a same graphical user interface for visual presentation.
13. An expression instruction generation device, characterized by comprising:
the monitoring unit is used for monitoring a video image acquired in real time by the camera shooting equipment of the local machine, identifying a specific facial expression image from the video image, tracking the change amplitude of a specific facial expression in the specific facial expression image, quantizing the change amplitude, and generating a corresponding quantized amplitude;
A playing unit, configured to play a target video correspondingly provided for the specific facial expression in parallel with the monitoring;
and the triggering unit is used for tracking and detecting whether the quantized amplitude values meet preset conditions or not, and issuing the expression instruction to trigger a pre-associated event when the quantized amplitude values meet the preset conditions after the target video is played.
14. An electronic device comprising a central processing unit and a memory, wherein the central processing unit is configured to invoke and run a computer program stored in the memory to perform the steps of the expression instruction generation method according to any one of claims 1 to 12.
15. A non-volatile storage medium storing a computer program implemented according to the expression instruction generation method of any one of claims 1 to 12, wherein the computer program, when called by a computer, executes the steps included in the method.
CN202010631788.1A 2020-07-03 2020-07-03 Expression instruction generation method, device, equipment and storage medium Pending CN111859025A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010631788.1A CN111859025A (en) 2020-07-03 2020-07-03 Expression instruction generation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111859025A true CN111859025A (en) 2020-10-30

Family

ID=73153454

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010631788.1A Pending CN111859025A (en) 2020-07-03 2020-07-03 Expression instruction generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111859025A (en)

Citations (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007280074A (en) * 2006-04-07 2007-10-25 Hitachi Ltd Portable terminal and control method
US20090251557A1 (en) * 2008-04-08 2009-10-08 Samsung Electronics Co. Ltd. Image pickup control method and apparatus for camera
US7720784B1 (en) * 2005-08-30 2010-05-18 Walt Froloff Emotive intelligence applied in electronic devices and internet using emotion displacement quantification in pain and pleasure space
CN101999108A (en) * 2008-01-31 2011-03-30 美国索尼电脑娱乐有限责任公司 Laugh detector and system and method for tracking an emotional response to a media presentation
KR20110099590A (en) * 2010-03-02 2011-09-08 엘지전자 주식회사 Content control apparatus and method thereof
JP2011237970A (en) * 2010-05-10 2011-11-24 Nippon Hoso Kyokai <Nhk> Facial expression variation measurement device, program thereof and broadcast interest level measurement device
KR20120035807A (en) * 2010-10-06 2012-04-16 원광대학교산학협력단 A recognition system of facial expression for laughter therapy on mobile terminal, and a method thereof
CN102467668A (en) * 2010-11-16 2012-05-23 鸿富锦精密工业(深圳)有限公司 Emotion detecting and soothing system and method
US20120222057A1 (en) * 2011-02-27 2012-08-30 Richard Scott Sadowsky Visualization of affect responses to videos
US20120259240A1 (en) * 2011-04-08 2012-10-11 Nviso Sarl Method and System for Assessing and Measuring Emotional Intensity to a Stimulus
JP2012238232A (en) * 2011-05-12 2012-12-06 Nippon Hoso Kyokai <Nhk> Interest section detection device, viewer interest information presentation device, and interest section detection program
CN102955565A (en) * 2011-08-31 2013-03-06 德信互动科技(北京)有限公司 Man-machine interaction system and method
JP2013081516A (en) * 2011-10-06 2013-05-09 Panasonic Corp Device and method for determining sleepiness
KR20140059110A (en) * 2012-11-07 2014-05-15 안강석 A method for providing of social networking service, and a server therefor
US20140301599A1 (en) * 2013-04-04 2014-10-09 Suprema Inc. Method for face recognition
CN104298682A (en) * 2013-07-18 2015-01-21 广州华久信息科技有限公司 Information recommendation effect evaluation method and mobile phone based on facial expression images
KR20150033173A (en) * 2013-09-23 2015-04-01 (주)이노지움 Satisfaction reporting system using face recognition
CN104780339A (en) * 2015-04-16 2015-07-15 美国掌赢信息科技有限公司 Method and electronic equipment for loading expression effect animation in instant video
CN105615902A (en) * 2014-11-06 2016-06-01 北京三星通信技术研究有限公司 Emotion monitoring method and device
CN106060572A (en) * 2016-06-08 2016-10-26 乐视控股(北京)有限公司 Video playing method and device
CN106855936A (en) * 2015-12-09 2017-06-16 陕西天堃电子科技有限公司 A kind of intelligence tries to please device
JP2017121373A (en) * 2016-01-07 2017-07-13 日本電信電話株式会社 Information processing device, information processing method, and program
CN107368495A (en) * 2016-05-12 2017-11-21 阿里巴巴集团控股有限公司 Determine method and device of the user to the mood of internet object
KR20180011664A (en) * 2016-07-25 2018-02-02 한양대학교 에리카산학협력단 A method for analysing face information, and an appratus for analysing face information to present faces, identify mental status or compensate it
CN108056774A (en) * 2017-12-29 2018-05-22 中国人民解放军战略支援部队信息工程大学 Experimental paradigm mood analysis implementation method and its device based on visual transmission material
US20180152666A1 (en) * 2016-11-29 2018-05-31 Facebook, Inc. Face detection for video calls
US20180303397A1 (en) * 2010-06-07 2018-10-25 Affectiva, Inc. Image analysis for emotional metric evaluation
CN109190487A (en) * 2018-08-07 2019-01-11 平安科技(深圳)有限公司 Face Emotion identification method, apparatus, computer equipment and storage medium
CN109522818A (en) * 2018-10-29 2019-03-26 中国科学院深圳先进技术研究院 A kind of method, apparatus of Expression Recognition, terminal device and storage medium
US20190147229A1 (en) * 2017-11-13 2019-05-16 Facebook, Inc. Methods and systems for playing musical elements based on a tracked face or facial feature
CN110166836A (en) * 2019-04-12 2019-08-23 深圳壹账通智能科技有限公司 A kind of TV program switching method, device, readable storage medium storing program for executing and terminal device
CN110534135A (en) * 2019-10-18 2019-12-03 四川大学华西医院 A method of emotional characteristics are assessed with heart rate response based on language guidance
WO2019233075A1 (en) * 2018-06-04 2019-12-12 珠海格力电器股份有限公司 Method and device for recognizing dynamic facial expression
CN110765873A (en) * 2019-09-19 2020-02-07 华中师范大学 Facial expression recognition method and device based on expression intensity label distribution
CN111079529A (en) * 2019-11-07 2020-04-28 广州华多网络科技有限公司 Information prompting method and device, electronic equipment and storage medium
US20200188629A1 (en) * 2018-01-26 2020-06-18 Trev Labs, Llc Craniaofacial emotional response influencer for audio and visual media

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114285933A (en) * 2021-12-23 2022-04-05 天翼视讯传媒有限公司 Method for realizing change of mobile phone video and picture along with change of face position
CN115118697A (en) * 2022-06-27 2022-09-27 北京爱奇艺科技有限公司 Resource access authority activation method and device
CN115118697B (en) * 2022-06-27 2024-04-26 北京爱奇艺科技有限公司 Method and device for activating resource access rights
CN117667951A (en) * 2024-01-31 2024-03-08 杭州海康威视数字技术股份有限公司 Data processing method and device for characteristic data of camera
CN117667951B (en) * 2024-01-31 2024-05-03 杭州海康威视数字技术股份有限公司 Data processing method and device for characteristic data of camera

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination