CN109621425B - Video generation method, device, equipment and storage medium

Info

Publication number: CN109621425B
Application number: CN201811588828.8A
Authority: CN
Inventors: 张庭亮
Original assignee: Guangzhou Cubesili Information Technology Co Ltd
Current assignee: Guangzhou Cubesili Information Technology Co Ltd
Priority date: 2018-12-25
Filing date: 2018-12-25
Publication date: 2023-08-18
Anticipated expiration: 2038-12-25
Also published as: CN109621425A

Abstract

The application provides a video generation method, apparatus, device, and computer-readable storage medium. The method comprises: after detecting a dance machine game start instruction, playing the selected dance game video and synchronously shooting the user's actions to obtain action video frames, wherein a game video frame of the dance game video comprises at least one virtual key and at least one note point; generating a score image based on the degree of difference among the position of the note point in the game video frame, the position of the virtual key, and the position of the specified feature part identified in the corresponding action video frame; and generating a target video frame from the game video frame, the corresponding action video frame, and the score image. The application lets a user play the dance machine game anytime and anywhere, without preparing additional dance equipment and without venue restrictions, thereby improving the user's game experience.

Description

Video generation method, device, equipment and storage medium

Technical Field

The present application relates to the field of image processing, and in particular, to a video generating method, apparatus, device, and computer readable storage medium.

Background

In the prior art, a motion-sensing dance game usually requires auxiliary equipment. A dance machine game, for example, requires dance equipment such as a dance mat or a dance machine: the user performs limb actions on the dance equipment, the equipment recognizes those actions, and the user thereby interacts with the game shown on a computer or mobile phone screen to complete the game.

However, in carrying out the invention, the inventor found the following problems. To take part in a motion-sensing dance game such as a dance machine game, a user must either purchase the related dance equipment or pay to visit a game venue that has it. The equipment has considerable weight and volume and is not easy to carry, so these constraints prevent the user from playing anytime and anywhere. Existing motion-sensing dance games also incur wear on the dance equipment and add to the user's expenses. Moreover, because the user's game process is not recorded, the user cannot intuitively review his or her performance when the game ends, which reduces the user's desire to participate.

Disclosure of Invention

In view of this, the present application provides a video generation method, apparatus, device, and computer-readable storage medium.

A first aspect of the present application provides a video generation method, which specifically includes:

playing the selected dance game video after detecting the dance machine game starting instruction, and synchronously shooting the action of the user to obtain an action video frame; the game video frame of the dance game video comprises at least one virtual key and at least one note point;

generating a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the designated feature parts identified in the corresponding action video frame;

and generating a target video frame according to the game video frame, the corresponding action video frame and the score image.

Preferably, the method further comprises:

shooting the gesture of the user in advance to obtain a preliminary video frame;

and performing gesture correction on the user by identifying whether the specified feature part of the user is present in the preliminary video frame.

Preferably, the specified feature is identified based on a pre-established convolutional neural network model.

Preferably, the corresponding action video frames include all action video frames that are spaced from the game video frames by a predetermined time.

Preferably, the generating a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the specified feature parts identified in the corresponding action video frame includes:

when it is detected that the region of the note point in the game video frame overlaps the region of the virtual key within the same coordinate range, judging whether the region of the virtual key that overlaps the region of the note point also overlaps, within the same coordinate range, the region of the specified feature part identified in the corresponding action video frame, and generating a score image according to the judgment result.

Preferably, whether two regions overlap is determined by judging whether the intersection of the coordinate sets contained in the two regions, within the same coordinate range, is non-empty.

Preferably, the priorities of the triggering events corresponding to different virtual keys are different;

the determining whether the region of the virtual key overlapping with the region of the note point overlaps with the region of the specified feature identified in the corresponding action video frame within the same coordinate range includes:

when the part of the game video frame that overlaps the region of the specified feature part identified in the corresponding action video frame includes at least two virtual key regions, determining, based on the priorities of the triggering events corresponding to the virtual keys, a priority virtual key that overlaps the region of the specified feature part, and judging whether the priority virtual key is the virtual key that overlaps the region of the note point.

Preferably, the generating a target video frame according to the game video frame, the corresponding action video frame and the score image includes:

and superposing the game video frame, the corresponding action video frame and the score image to generate a target video frame.

According to a second aspect of an embodiment of the present application, there is provided a video generating apparatus, the apparatus including:

the video playing and shooting module is used for playing the selected dance game video after detecting the dance machine game starting instruction and synchronously shooting the action of the user to obtain an action video frame; the game video frame of the dance game video comprises at least one virtual key and at least one note point;

a score image generation module for generating a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the designated feature parts identified in the corresponding action video frame;

a target video frame generation module for generating a target video frame according to the game video frame, the corresponding action video frame and the score image.

According to a third aspect of an embodiment of the present application, there is provided an electronic apparatus including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

According to a fourth aspect of embodiments of the present application, there is also provided a computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any of the first aspects.

The technical scheme provided by the embodiment of the application can comprise the following beneficial effects:

According to the method, after a dance machine game start instruction is detected, the selected dance game video is played and the user's actions are shot synchronously to obtain action video frames, the game video frames of the dance game video comprising at least one virtual key and at least one note point. A score image is generated based on the degree of difference among the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the specified feature parts identified in the corresponding action video frame. Finally, a target video frame is generated from the game video frame, the corresponding action video frame, and the score image. The user does not need to prepare additional dance equipment and is not restricted by venue; a mobile terminal with a shooting function is sufficient to play the dance machine game anytime and anywhere. This reduces equipment wear, saves game expense, and broadens the applicability of the dance machine game. Because the user's game process is recorded intuitively, the user can review his or her own performance, which improves the user's sense of participation and immersion in the game.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.

Drawings

FIG. 1 is a flow chart illustrating an embodiment of a video generation method according to an exemplary embodiment of the present application;

FIG. 2 is a schematic diagram of a target video frame according to an exemplary embodiment of the present application;

FIG. 3 is a flow chart illustrating the score determination performed by the electronic device according to an exemplary embodiment of the present application;

FIG. 4 is a flowchart illustrating another video generation method according to an exemplary embodiment of the present application;

FIG. 5 is a schematic diagram illustrating an embodiment of a video generating apparatus according to an exemplary embodiment of the present application;

FIG. 6 is a schematic diagram of an electronic device according to an exemplary embodiment of the present application.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. The word "if" as used herein may be interpreted as "when …", "upon …", or "in response to determining", depending on the context.

In the prior art, a motion-sensing dance game usually requires auxiliary dance equipment. That equipment has considerable weight and volume and is not easy to carry, so the user cannot play the motion-sensing dance game anytime and anywhere. Existing motion-sensing dance games also incur wear on the dance equipment and add to the user's expenses. In addition, because the user's game process is not recorded, the user cannot intuitively review his or her performance when the game ends, which reduces the user's desire to participate.

Accordingly, to solve the above problems of motion-sensing dance games in the prior art, the present application provides a video generation method, as shown in fig. 1. The video generation method may be integrated into an electronic device, and the electronic device may be a mobile phone, a computer, a smart tablet, a PDA (Personal Digital Assistant), or another computing device with a video capturing function.

Fig. 1 is a flowchart of an embodiment of a video generating method according to an exemplary embodiment of the present application, where the method specifically includes:

Step S101, after detecting a dance machine game starting instruction, playing a selected dance game video, and synchronously shooting actions of a user to obtain action video frames; the game video frame of the dance game video comprises at least one virtual key and at least one note point.

Step S102, generating a score image based on the difference degree among the positions of the note points in the game video frame, the positions of the virtual keys and the positions of the appointed characteristic parts identified in the corresponding action video frame.

Step S103, generating a target video frame according to the game video frame, the corresponding action video frame and the score image.

In step S101, after detecting the dance machine game start instruction, the electronic device plays the dance game video selected by the user and synchronously shoots the user's actions through its own camera device. The actions that the user performs in response to the dance game video being played are thus recorded in step with the video, which facilitates the subsequent score judgment.

In one possible implementation, the dance machine game start instruction may be triggered in ways that include, but are not limited to: confirming receipt of the instruction when a preset button is detected to be triggered; confirming receipt of the instruction when a specified action made by the user is recognized; and confirming receipt of the instruction when a voice confirmation message from the user is received.

In one possible implementation manner, for the process in which the user selects a dance game video, an application program of the electronic device may set a triggering condition for dance game video selection. For example, the triggering condition may be a dance game video selection control: by triggering the control, the user enters a dance game video selection interface and selects a dance game video, so that the electronic device obtains the dance game video selected by the user. As another possibility, the triggering condition may be receipt of the dance machine game start instruction: the electronic device then automatically enters the dance game video selection interface, the user selects one of the dance game videos according to his or her own needs, and the electronic device obtains the selected dance game video for playing.

In step S102, while playing the dance game video selected by the user and synchronously shooting the user's actions through its own camera device, the electronic device acquires a game video frame of the played dance game video and the corresponding action video frame containing the user's action in order to perform score judgment. As shown in fig. 2, the game video frame may include at least one virtual key and at least one note point. The virtual key may be rectangular, circular, elliptical, hexagonal, or the like, and the note point may be circular, star-shaped, bar-shaped, or the like; the present application does not limit this. The electronic device identifies the specified feature part of the user in the action video frame and then generates a score image according to the degree of difference among the position of the note point in the game video frame, the position of the virtual key, and the position of the specified feature part identified in the corresponding action video frame.

The specified feature part may be identified based on a pre-established convolutional neural network model, and may be a part such as a foot, hand, head, waist, or shoulder. A series of specified feature part data is stored in the convolutional neural network model through a training process. In use, the electronic device takes the action video frame as the input parameter of the convolutional neural network model, and the model outputs, through its convolution operations, the specified feature part data corresponding to that action video frame.

It should be noted that a person needs a certain reaction time between seeing a dance action in the dance game video and performing the corresponding action. A delay judgment time is therefore allowed for: the action video frames corresponding to a game video frame are set to include all action video frames within a preset time interval before and after that game video frame. For example, when the electronic device acquires a game video frame, it performs score judgment, one by one, on all action video frames within 0.2 s before and 0.2 s after that game video frame, which reflects a user-friendly design.
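As an illustrative, non-limiting sketch of the frame matching described above, the following Python snippet selects the action video frames that fall within a preset interval before and after a game video frame. The function name `frames_in_window`, the use of millisecond timestamps, and the assumption that timestamps are sorted in capture order are hypothetical choices made for the example and do not come from the application.

```python
from bisect import bisect_left, bisect_right


def frames_in_window(action_times_ms, game_time_ms, window_ms=200):
    """Return indices of action frames within +/- window_ms of a game frame.

    action_times_ms must be sorted in ascending (capture) order.
    """
    lo = bisect_left(action_times_ms, game_time_ms - window_ms)
    hi = bisect_right(action_times_ms, game_time_ms + window_ms)
    return list(range(lo, hi))


# Hypothetical data: action frames captured every 100 ms,
# and a game video frame displayed at 300 ms.
action_times_ms = [0, 100, 200, 300, 400, 500, 600]
matched = frames_in_window(action_times_ms, 300)
print(matched)
```

Integer milliseconds are used here only to avoid floating-point comparisons at the window boundaries; each returned index would then be passed to the score judgment one by one.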

In this embodiment, fig. 3 is a schematic flow chart of the score judgment performed by the electronic device. In the score judgment, the electronic device first detects whether the region of the note point in the game video frame overlaps the region of the virtual key within the same coordinate range. When the intersection of the coordinate set contained in the note point region and the coordinate set contained in the virtual key region is non-empty, the two regions are determined to overlap. The electronic device then judges whether the region of the virtual key that overlaps the note point region also overlaps, within the same coordinate range, the region of the specified feature part identified in the corresponding action video frame. When the intersection of the coordinate set contained in the virtual key region and the coordinate set contained in the specified feature part region is non-empty, the two regions are determined to overlap; this means that the user has scored, and the electronic device generates a new score image according to a preset score increase. If that intersection is empty, the virtual key region and the specified feature part region are determined not to overlap; this means that the user has failed to score, and the electronic device generates a new score image according to a preset score reduction.

In addition, when the intersection of the coordinate set contained in the note point region and the coordinate set contained in the virtual key region is empty, it is determined that the note point region and the virtual key region in the game video frame do not overlap within the same coordinate range. In this case no further judgment against the action video frame is needed, the current score does not change, and the electronic device takes the score image of the previous game video frame as the current score image.
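The two-stage judgment above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: regions are modelled as Python sets of integer pixel coordinates, and the helper names `rect_region`, `overlaps`, and `update_score`, together with the score increase of 10 and reduction of 5, are hypothetical values chosen for the example.

```python
def rect_region(x, y, w, h):
    """Coordinate set of an axis-aligned rectangle (illustrative region shape)."""
    return {(i, j) for i in range(x, x + w) for j in range(y, y + h)}


def overlaps(a, b):
    """Two regions overlap when the intersection of their coordinate sets is non-empty."""
    return not a.isdisjoint(b)


def update_score(score, note, key, feature, gain=10, penalty=5):
    """Apply the two-stage judgment to one game frame / action frame pair."""
    if not overlaps(note, key):
        return score            # note point not on the key: score unchanged
    if overlaps(key, feature):
        return score + gain     # feature part hits the key: preset increase
    return score - penalty      # feature part misses the key: preset reduction


note = rect_region(0, 0, 2, 2)
key = rect_region(1, 1, 2, 2)
hit = update_score(100, note, key, rect_region(2, 2, 2, 2))
miss = update_score(100, note, key, rect_region(5, 5, 1, 1))
idle = update_score(100, rect_region(10, 10, 1, 1), key, rect_region(2, 2, 2, 2))
print(hit, miss, idle)
```

`set.isdisjoint` is used because it tests for an empty intersection without building the intersection set.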

As a possible implementation manner, after determining that the region of the note point in the game video frame overlaps the region of the virtual key within the same coordinate range, the electronic device further judges whether the region of the virtual key that overlaps the note point region also overlaps, within the same coordinate range, the region of the specified feature part identified in the corresponding action video frame. During this judgment, the electronic device may detect that the part of the game video frame overlapping the specified feature part region includes at least two virtual keys. For example, where the specified feature part is a foot and the user acts according to the positions of the note point and the virtual keys in the game video frame, the electronic device may detect that the user's foot region overlaps the regions of two virtual keys within the same coordinate range, which means that the user's foot has stepped on two virtual keys at once. To handle this case, the trigger events corresponding to different virtual keys are given different priorities. The electronic device determines, based on those priorities, the priority virtual key overlapping the specified feature part region, and judges whether the priority virtual key is the virtual key that overlaps the note point region. If so, the user scores, and the electronic device generates a new score image according to a preset score increase; if not, the user loses points, and the electronic device generates a new score image according to the preset score reduction.
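By way of illustration only, the priority-based resolution described above might look like the following sketch. The key names, the convention that a smaller number means a higher priority, and the functions `resolve_priority_key` and `judge_hit` are assumptions made for the example; the application does not specify how priorities are encoded.

```python
def rect_region(x, y, w, h):
    """Coordinate set of an axis-aligned rectangle (illustrative region shape)."""
    return {(i, j) for i in range(x, x + w) for j in range(y, y + h)}


def resolve_priority_key(keys, feature, priority):
    """Among the virtual keys touched by the feature part, return the one whose
    trigger event has the highest priority (smallest number), or None."""
    touched = [name for name, region in keys.items()
               if not region.isdisjoint(feature)]
    if not touched:
        return None
    return min(touched, key=lambda name: priority[name])


def judge_hit(keys, feature, priority, note_key):
    """True when the priority key is the key that the note point overlaps."""
    return resolve_priority_key(keys, feature, priority) == note_key


keys = {"left": rect_region(0, 0, 4, 4), "right": rect_region(4, 0, 4, 4)}
priority = {"left": 1, "right": 2}   # hypothetical priorities
foot = rect_region(3, 0, 2, 2)       # straddles both keys

chosen = resolve_priority_key(keys, foot, priority)
print(chosen, judge_hit(keys, foot, priority, "right"))
```

In this example the foot touches both keys, the higher-priority "left" key is treated as the one triggered, and a note point sitting on the "right" key therefore counts as a miss.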

As another possible implementation manner, for the above situation, the present application may instead determine which virtual key was stepped on according to the area sizes of the at least two intersecting regions formed between the specified feature part region and the at least two virtual key regions. As an example, the virtual key that was hit may be determined by acquiring the minimum bounding rectangle of each of the at least two intersecting regions and then comparing the areas of those minimum bounding rectangles, which reduces the amount of calculation and improves program running efficiency.
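The area-based alternative can be sketched as below. Two points are assumptions rather than statements from the application: the minimum bounding rectangle is taken to be axis-aligned, and the key whose intersecting region has the larger bounding rectangle is taken to be the one hit. The names `bounding_rect_area` and `key_by_overlap_area` are likewise hypothetical.

```python
def rect_region(x, y, w, h):
    """Coordinate set of an axis-aligned rectangle (illustrative region shape)."""
    return {(i, j) for i in range(x, x + w) for j in range(y, y + h)}


def bounding_rect_area(points):
    """Pixel area of the axis-aligned bounding rectangle of a non-empty point set."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)


def key_by_overlap_area(keys, feature):
    """Return the key whose intersection with the feature part has the largest
    bounding rectangle (assumed selection rule), or None if nothing is touched."""
    best_name, best_area = None, 0
    for name, region in keys.items():
        intersection = region & feature
        if intersection:
            area = bounding_rect_area(intersection)
            if area > best_area:
                best_name, best_area = name, area
    return best_name


keys = {"left": rect_region(0, 0, 4, 4), "right": rect_region(4, 0, 4, 4)}
foot = rect_region(3, 0, 3, 2)   # 1 column on "left", 2 columns on "right"
stepped = key_by_overlap_area(keys, foot)
print(stepped)
```

Comparing bounding rectangles needs only the extreme coordinates of each intersecting region, which is consistent with the reduced calculation amount mentioned above.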

In an embodiment, the dance game video further has a corresponding game level, and the user selects the game level when selecting the dance game video. At different game levels, the number of virtual keys included in the game video frames of the dance game video differs, and accordingly the specified feature parts that the electronic device must identify from the corresponding action video frames also differ. As an example: at game level 1, the game video frames include 3 virtual keys and the electronic device needs to identify foot key points; at game level 2, the game video frames include 5 virtual keys and the electronic device needs to identify foot key points and hand key points; at game level 3, the game video frames include 6 virtual keys and the electronic device needs to identify foot key points, hand key points, head key points, and so on. The specific configuration can be set according to actual needs.

In step S103, after the score image is obtained, the electronic device superimposes the game video frame, the corresponding action video frame, and the score image to generate a target video frame, a schematic diagram of which is shown in fig. 2. After the dance game video ends, the electronic device integrates all the target video frames to generate a target dance video that records the user's dance actions and scoring.
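The example levels above can be expressed as a small lookup table. The sketch below simply transcribes the three example levels from the description; the dictionary layout and the function name `required_features` are hypothetical, and the "and so on" for level 3 is left unexpanded.

```python
# Example game levels transcribed from the description above.
GAME_LEVELS = {
    1: {"virtual_keys": 3, "features": ("foot",)},
    2: {"virtual_keys": 5, "features": ("foot", "hand")},
    3: {"virtual_keys": 6, "features": ("foot", "hand", "head")},
}


def required_features(level):
    """Feature parts the device must identify for the selected game level."""
    try:
        return GAME_LEVELS[level]["features"]
    except KeyError:
        raise ValueError(f"unknown game level: {level}") from None


print(required_features(2))
```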
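The application does not specify how the superposition is carried out. Purely as an assumed illustration, the sketch below alpha-blends a game video frame over an action video frame of the same size and then pastes the score image into the top-left corner. Frames are modelled as nested lists of grayscale values, and `blend`, `paste`, `compose_target_frame`, and the blending weight are all hypothetical.

```python
def blend(game, action, alpha=0.5):
    """Per-pixel weighted sum of two equally sized grayscale frames."""
    return [[round(alpha * g + (1 - alpha) * a) for g, a in zip(grow, arow)]
            for grow, arow in zip(game, action)]


def paste(frame, patch, top=0, left=0):
    """Copy of frame with patch written over it at (top, left)."""
    out = [row[:] for row in frame]
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            out[top + r][left + c] = value
    return out


def compose_target_frame(game, action, score_image, alpha=0.5):
    """Superimpose the game frame, action frame, and score image."""
    return paste(blend(game, action, alpha), score_image)


game = [[100, 100], [100, 100]]
action = [[200, 200], [200, 200]]
score_image = [[255]]
target = compose_target_frame(game, action, score_image)
print(target)
```

A practical implementation would more likely rely on an image or video library for compositing; the nested-list form is used only to keep the example self-contained.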

In one possible implementation manner, after the dance game video is finished, the electronic device generates a total evaluation image according to all the score images, and then combines all the target video frames to generate a target dance video for recording the dance actions and the score conditions of the user.

In one possible implementation manner, the electronic device further has a sharing function, and when a user is detected to trigger a sharing control, the target dance video is shared, so that the popularity of the dance machine game is improved.

As shown in fig. 4, the present invention further provides another video generating method, where the method specifically includes:

Step S201, shooting the gesture of the user in advance to acquire a preliminary video frame.

Step S202, performing gesture correction on the user by identifying whether the specified feature of the user exists in the preliminary video frame.

Step S203, after detecting a dance machine game starting instruction, playing a selected dance game video, and synchronously shooting actions of a user to obtain action video frames; the game video frame of the dance game video comprises at least one virtual key and at least one note point. Similar to step S101 shown in fig. 1, a detailed description thereof will be omitted.

Step S204, generating a score image based on the difference degree between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the specified feature parts identified in the corresponding action video frame. Similar to step S102 shown in fig. 1, a detailed description is omitted here.

Step S205, generating a target video frame according to the game video frame, the corresponding action video frame and the score image. Similar to step S103 shown in fig. 1, a detailed description thereof will be omitted.

In step S201, the electronic device may start its own image capturing device when the user opens the dance machine game preparation interface, and then capture the user's preparatory gesture through the image capturing device to obtain the preliminary video frame, so that gesture correction can be performed.

In step S202, the electronic device performs gesture correction on the user by identifying whether the specified feature part of the user is present in the preliminary video frame. The specified feature part may be identified based on a pre-established convolutional neural network model, in which a series of specified feature part data is stored through a training process. In use, the electronic device takes the preliminary video frame as the input parameter of the convolutional neural network model, and the model outputs, through its convolution operations, the specified feature part data corresponding to the preliminary video frame. If the specified feature part of the user is identified in the preliminary video frame, the gesture is determined to be correct and the dance machine game may proceed. If the specified feature part is not identified, the user is reminded to adjust the gesture and the correction is performed again. For example, a corresponding outline area may be displayed on the dance machine game preparation interface so that the user can quickly adjust the preparatory gesture to fit the outline, which saves preparation time before the game. The gesture correction process ensures that the specified feature part stays within the effective shooting range and can be accurately identified during the subsequent game, avoiding failures in which the specified feature part cannot be identified.

In a possible implementation manner, the specified feature parts that the electronic device identifies in the preliminary video frame differ according to the game level of the dance game video selected by the user. For example, if the user selects game level 1, the electronic device needs to identify whether the user's foot key points are present in the preliminary video frame; if the user selects game level 2, it needs to identify whether both the foot key points and the hand key points are present. This ensures that, when dance game videos of different game levels are played, the user's specified feature parts are within the effective shooting range and can be accurately identified.
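The presence check behind the gesture correction can be sketched as follows. The convolutional neural network itself is outside the scope of this example: its output is assumed, hypothetically, to be a dictionary mapping each feature part name to an `(x, y)` key point or to `None` when the part is not detected. The function name `missing_features` and the frame size are also assumptions.

```python
def missing_features(detected, required, width, height):
    """Return the required feature parts that are absent from the preliminary
    frame or lie outside the frame bounds (outside the shooting range)."""
    missing = []
    for part in required:
        point = detected.get(part)
        if point is None:
            missing.append(part)
            continue
        x, y = point
        if not (0 <= x < width and 0 <= y < height):
            missing.append(part)
    return missing


# Hypothetical detector output for a 640x480 preliminary video frame.
detected = {"foot": (320, 450), "hand": None}

level_1 = missing_features(detected, ("foot",), 640, 480)
level_2 = missing_features(detected, ("foot", "hand"), 640, 480)
print(level_1, level_2)
```

An empty result means the gesture is acceptable for the selected game level; a non-empty result lists the parts about which the user would be reminded before the correction is repeated.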

The present application also provides embodiments of the video generating apparatus, the electronic device, and the computer-readable storage medium, corresponding to embodiments of the video generating method of the present application.

Referring to fig. 5, a block diagram of a video generating apparatus according to an embodiment of the present application includes:

the video playing and shooting module 11 is used for playing the selected dance game video after detecting the dance machine game starting instruction and synchronously shooting the action of the user to obtain an action video frame; the game video frame of the dance game video comprises at least one virtual key and at least one note point.

The score image generating module 12 is configured to generate a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the specified feature points identified in the corresponding action video frame.

The target video frame generating module 13 is configured to generate a target video frame according to the game video frame, the corresponding action video frame, and the score image.

Preferably, the apparatus further comprises:

a preliminary video frame acquisition module, used for shooting the gesture of the user in advance to acquire the preliminary video frame.

a gesture correction module, used for performing gesture correction on the user by identifying whether the specified feature part of the user is present in the preliminary video frame.

Preferably, the score image generation module 12 includes:

Preferably, the priorities of the triggering events corresponding to different virtual keys are different.

The score image generation module 12 includes:

a first judging unit, used for judging whether the region of the note point in the game video frame overlaps the region of the virtual key within the same coordinate range, and if so, invoking the second judging unit.

a second judging unit, configured to determine, when the part of the game video frame that overlaps the region of the specified feature part identified in the corresponding action video frame includes at least two virtual key regions, a priority virtual key overlapping the region of the specified feature part based on the priorities of the trigger events corresponding to the virtual keys, and to judge whether the priority virtual key is the virtual key that overlaps the region of the note point.

Preferably, the target video frame generation module 13 includes:

Since the apparatus embodiments essentially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The apparatus embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present invention. Those of ordinary skill in the art can understand and implement the present invention without creative effort.

Accordingly, as shown in fig. 6, the present invention further provides an electronic device 30, including a processor 31; a memory 32 for storing executable instructions, the memory 32 comprising a computer program 33; wherein the processor 31 is configured to:

playing the selected dance game video after detecting the dance machine game starting instruction, and synchronously shooting the action of the user to obtain an action video frame; the game video frame of the dance game video comprises at least one virtual key and at least one note point.

generating a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the designated feature parts identified in the corresponding action video frame.
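By way of non-limiting illustration, the two-stage overlap judgment underlying the score image might be sketched with coordinate sets as follows. The sketch assumes that each region is represented as a set of integer pixel coordinates; the constants `PRESET_SCORE` and `PRESET_DEDUCTION` carry hypothetical values, and the helper names are introduced only for this example.

```python
from typing import Set, Tuple

Coord = Tuple[int, int]

# Hypothetical values; the application only speaks of a preset score
# and a preset deduction without fixing their magnitudes.
PRESET_SCORE = 10
PRESET_DEDUCTION = 5


def region(left: int, top: int, width: int, height: int) -> Set[Coord]:
    """Build the coordinate set covered by a rectangular region."""
    return {
        (x, y)
        for x in range(left, left + width)
        for y in range(top, top + height)
    }


def score_delta(note: Set[Coord], key: Set[Coord], feature: Set[Coord]) -> int:
    """Return the score change for one game frame / action frame pair."""
    if not (note & key):
        # The note point has not reached the virtual key: nothing to judge.
        return 0
    if key & feature:
        # The designated feature part overlaps the key the note point is on.
        return PRESET_SCORE
    # The note point is on the key, but the feature part is elsewhere.
    return -PRESET_DEDUCTION
```

A score image would then be rendered from the returned value; the rendering itself is outside the scope of this sketch.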

The processor 31 executes the computer program 33 stored in the memory 32. The processor 31 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.

The memory 32 stores a computer program of the video generating method, and the memory 32 may include at least one type of storage medium including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. Moreover, the apparatus may cooperate with a network storage device that performs the storage function of the memory via a network connection. The memory 32 may be an internal storage unit of the electronic device 30, such as a hard disk or a memory of the electronic device 30. The memory 32 may also be an external storage device of the electronic device 30, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device 30. Further, the memory 32 may also include both internal and external storage units of the electronic device 30. The memory 32 is used to store a computer program 33 as well as other programs and data required by the device. The memory 32 may also be used to temporarily store data that has been output or is to be output.

The various embodiments described herein may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For a hardware implementation, the embodiments described herein may be implemented using at least one of Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and electronic units designed to perform the functions described herein. For a software implementation, an embodiment such as a process or function may be implemented with a separate software module that performs at least one function or operation. The software code may be implemented by a software application (or program) written in any suitable programming language, which may be stored in memory and executed by a controller.

The electronic device 30 may take forms including, but not limited to, the following:

(1) Mobile communication devices: such devices are characterized by mobile communication capabilities and are primarily aimed at providing voice and data communications. Such terminals include smart phones (e.g., iPhone), multimedia phones, feature phones, and low-end phones.

(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as iPad.

(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players (e.g., iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.

(4) Other electronic devices with data interaction functions.

The device may include, but is not limited to, a processor 31 and a memory 32. It will be appreciated by those skilled in the art that fig. 6 is merely an example of the electronic device 30 and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the device may further include an input-output device, a network access device, a bus, an imaging device, and the like.

The implementation process of the functions and roles of each unit in the above-mentioned device is specifically detailed in the implementation process of the corresponding steps in the above-mentioned method, and will not be described herein again.

In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as a memory comprising instructions executable by a processor of an apparatus to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.

A non-transitory computer-readable storage medium is provided; when the instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform the video generation method described above.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It is to be understood that the invention is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims

1. A method of video generation, the method comprising:

generating a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the designated feature parts identified in the corresponding action video frame; the method specifically comprises the following steps:

detecting whether the region of the note point in the game video frame and the region of the virtual key overlap within the same coordinate range;

when the intersection of the coordinate set contained in the region of the note point and the coordinate set contained in the region of the virtual key is a non-empty set, determining that the region of the note point in the game video frame overlaps the region of the virtual key within the same coordinate range, and further judging whether the region of the virtual key overlapping the region of the note point overlaps, within the same coordinate range, the region of the designated feature part identified in the corresponding action video frame;

when, within the same coordinate range, the intersection of the coordinate set contained in the region of the virtual key and the coordinate set contained in the region of the designated feature part identified in the corresponding action video frame is a non-empty set, determining that the region of the virtual key and the region of the designated feature part identified in the corresponding action video frame overlap within the same coordinate range, and generating a score image according to a preset score;

when the portion of the game video frame that overlaps the region of the designated feature part identified in the corresponding action video frame comprises the regions of at least two virtual keys, determining a hit virtual key based on the priorities of the trigger events corresponding to the virtual keys, or determining a hit virtual key according to the area sizes of the at least two intersecting regions formed by the region of the designated feature part with the regions of the at least two virtual keys respectively;
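By way of non-limiting illustration, the area-based alternative for determining the hit virtual key might be sketched as follows. The sketch assumes axis-aligned rectangular regions and assumes that the virtual key with the largest intersecting area is taken as the hit virtual key; the text above only states that the determination depends on the area sizes. The names `Box`, `intersection_area`, and `hit_key_by_area` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Assumption: (left, top, right, bottom), with right and bottom exclusive.
Box = Tuple[int, int, int, int]


def intersection_area(a: Box, b: Box) -> int:
    """Area of the rectangle shared by a and b, or 0 if they do not overlap."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0


def hit_key_by_area(feature: Box, keys: Dict[str, Box]) -> Optional[str]:
    """Pick the key whose region shares the largest area with the feature part."""
    areas = {name: intersection_area(feature, box) for name, box in keys.items()}
    best = max(areas, key=areas.get, default=None)
    if best is None or areas[best] == 0:
        return None  # the feature part touches no virtual key
    return best
```

When two intersecting regions have equal areas, this sketch returns the key listed first; a priority-based rule could be used to break such ties.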

2. The video generation method according to claim 1, further comprising:

3. The video generation method according to claim 2, wherein the designated feature part is identified based on a convolutional neural network model established in advance.

4. The video generation method according to claim 1, wherein the corresponding action video frames include all action video frames that are separated from the game video frames by a predetermined time.
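By way of non-limiting illustration, one possible reading of this correspondence is that an action video frame corresponds to a game video frame when their timestamps differ by no more than a predetermined interval. The sketch below adopts that reading as an assumption; the millisecond timestamps, the frame labels, and the name `corresponding_frames` are hypothetical.

```python
from typing import List, Tuple


def corresponding_frames(
    game_ts_ms: int,
    action_frames: List[Tuple[int, str]],
    window_ms: int,
) -> List[str]:
    """Return the action frames whose timestamps lie within window_ms of the game frame.

    action_frames holds (timestamp in milliseconds, frame label) pairs.
    """
    return [
        frame
        for ts_ms, frame in action_frames
        if abs(ts_ms - game_ts_ms) <= window_ms
    ]
```

Judging a game video frame against several nearby action video frames tolerates small delays between the displayed note point and the user's movement.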

5. The video generation method according to claim 1, wherein the generating a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the designated feature parts identified in the corresponding action video frame, further comprises:

when, within the same coordinate range, the intersection of the coordinate set contained in the region of the virtual key and the coordinate set contained in the region of the designated feature part identified in the corresponding action video frame is an empty set, determining that the region of the virtual key and the region of the designated feature part identified in the corresponding action video frame do not overlap within the same coordinate range, and generating a score image according to a preset deduction.

6. The video generation method according to claim 1, further comprising:

and judging whether the hit virtual key is a virtual key overlapped with the region of the note point.

7. The video generation method according to claim 1, wherein the generating a target video frame from the game video frame, the corresponding action video frame, and the score image includes:

8. A video generating apparatus, the apparatus comprising:

a score image generation module for generating a score image based on the degree of difference between the positions of the note points in the game video frame, the positions of the virtual keys, and the positions of the designated feature parts identified in the corresponding action video frame; specifically including: detecting whether the region of the note point in the game video frame and the region of the virtual key overlap within the same coordinate range; when the intersection of the coordinate set contained in the region of the note point and the coordinate set contained in the region of the virtual key is a non-empty set, determining that the region of the note point in the game video frame overlaps the region of the virtual key within the same coordinate range, and further judging whether the region of the virtual key overlapping the region of the note point overlaps, within the same coordinate range, the region of the designated feature part identified in the corresponding action video frame; when, within the same coordinate range, the intersection of the coordinate set contained in the region of the virtual key and the coordinate set contained in the region of the designated feature part identified in the corresponding action video frame is a non-empty set, determining that the region of the virtual key and the region of the designated feature part identified in the corresponding action video frame overlap within the same coordinate range, and generating a score image according to a preset score; when the portion of the game video frame that overlaps the region of the designated feature part identified in the corresponding action video frame comprises the regions of at least two virtual keys, determining a hit virtual key based on the priorities of the trigger events corresponding to the virtual keys, or determining a hit virtual key according to the area sizes of the at least two intersecting regions formed by the region of the designated feature part with the regions of the at least two virtual keys respectively;

9. An electronic device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

detecting whether the region of the note point in the game video frame and the region of the virtual key overlap within the same coordinate range; when the intersection of the coordinate set contained in the region of the note point and the coordinate set contained in the region of the virtual key is a non-empty set, determining that the region of the note point in the game video frame overlaps the region of the virtual key within the same coordinate range, and further judging whether the region of the virtual key overlapping the region of the note point overlaps, within the same coordinate range, the region of the designated feature part identified in the corresponding action video frame; when, within the same coordinate range, the intersection of the coordinate set contained in the region of the virtual key and the coordinate set contained in the region of the designated feature part identified in the corresponding action video frame is a non-empty set, determining that the region of the virtual key and the region of the designated feature part identified in the corresponding action video frame overlap within the same coordinate range, and generating a score image according to a preset score; when the portion of the game video frame that overlaps the region of the designated feature part identified in the corresponding action video frame comprises the regions of at least two virtual keys, determining a hit virtual key based on the priorities of the trigger events corresponding to the virtual keys, or determining a hit virtual key according to the area sizes of the at least two intersecting regions formed by the region of the designated feature part with the regions of the at least two virtual keys respectively;

10. A computer readable storage medium having stored thereon computer instructions, which when executed by a processor, implement the steps of the video generation method of any of claims 1 to 7.