WO2019223782A1 - Game scene description method and apparatus, device, and storage medium - Google Patents
Game scene description method and apparatus, device, and storage medium
- Publication number
- WO2019223782A1 (PCT/CN2019/088348)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- game
- area
- map
- video frame
- image
- Prior art date
Classifications
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/50—Controlling the output signals based on the game progress
- A63F13/52—Controlling the output signals based on the game progress involving aspects of the displayed game scene
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/60—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
- A63F13/67—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor adaptively or by learning from player actions, e.g. skill level adjustment or by storing successful combat sequences for re-use
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/478—Supplemental services, e.g. displaying phone caller identification, shopping application
- H04N21/4781—Games
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23412—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44012—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/50—Controlling the output signals based on the game progress
- A63F13/53—Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game
- A63F13/537—Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen
- A63F13/5378—Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen for displaying an additional top view, e.g. radar screens or maps
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/85—Providing additional services to players
- A63F13/86—Watching games played by other players
Definitions
- The embodiments of the present application relate to the field of computer vision technology, and for example to a method, apparatus, device, and storage medium for describing a game scene.
- With the development of the game live-streaming industry and the growing number of game anchors, anchor clients send large numbers of live game video streams to the server, and the server distributes them to user clients for users to watch.
- The information carried by a live game video stream is very limited, for example, the live room number, the anchor's name, and the signature added by the anchor. Such information cannot accurately describe the game scene inside the live game video stream, so live game video streams cannot be pushed or distinguished for specific game scenes; this in turn fails to meet users' personalized needs and is not conducive to improving the content distribution efficiency of the game live-streaming industry.
- This application provides a method, apparatus, device, and storage medium for describing a game scene, so as to accurately describe the game scene inside a live game video stream.
- In a first aspect, an embodiment of the present application provides a method for describing a game scene, including: acquiring at least one video frame in a live game video stream; capturing the game map area image in the at least one video frame; inputting the game map area image into a first target detection model to obtain the display area of game elements on the game map area image; inputting the image of the display area of the game elements into a classification model to obtain the state of the game elements; and using the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.
- In a second aspect, an embodiment of the present application further provides an apparatus for describing a game scene, where the apparatus includes:
- an acquisition module configured to acquire at least one video frame in a live game video stream;
- an interception module configured to capture the game map area image in the at least one video frame;
- a display area recognition module configured to input the game map area image to a first target detection model to obtain a display area of a game element on the game map area image;
- a state recognition module configured to input an image of a display area of the game element into a classification model to obtain a state of the game element;
- a forming module configured to use the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.
- an embodiment of the present application further provides an electronic device, including:
- one or more processors;
- a memory configured to store one or more programs;
- when the one or more programs are executed by the one or more processors, the one or more processors implement the method for describing a game scene according to any embodiment.
- In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method for describing a game scene according to any embodiment is implemented.
- By acquiring at least one video frame in a live game video stream and capturing the game map area image in the at least one video frame, the present application obtains, from the live game video stream, a game map that reflects the game situation; through the first target detection model and the classification model, the display area and state of game elements on the game map area image are obtained, applying deep-learning-based image recognition algorithms to the understanding of the game map and extracting the display area and state of the game elements; then, the display area and state of the game elements are used to form description information of the game scene shown by the at least one video frame. In this way, with the game map as the recognition target and in combination with image recognition algorithms, the specific game scene inside the live game video stream is obtained, which facilitates subsequent pushing or classification of live game video streams for specific game scenes, meets users' personalized needs, and helps improve the content distribution efficiency of the game live-streaming industry.
- FIG. 1 is a flowchart of a method for describing a game scene provided in Embodiment 1 of the present application;
- FIG. 2 is a flowchart of a method for describing a game scene provided in Embodiment 2 of the present application;
- FIG. 3 is a flowchart of a method for describing a game scene provided in Embodiment 3 of the present application;
- FIG. 4 is a schematic structural diagram of a game scene description apparatus provided in Embodiment 4 of the present application;
- FIG. 5 is a schematic structural diagram of an electronic device provided in Embodiment 5 of the present application.
- FIG. 1 is a flowchart of a method for describing a game scene provided in Embodiment 1 of the present application. This embodiment is applicable to describing the game scene inside a live game video stream.
- The method may be executed by a game scene description apparatus.
- The apparatus may be composed of hardware and/or software, and may generally be integrated in a server, an anchor client, or a user client.
- The method includes the following steps.
- S110 Acquire at least one video frame in a live game video stream.
- The game scene description apparatus receives, in real time, the live game video stream corresponding to the anchor's live room.
- A live game video stream is a video stream whose content is a game, for example, a video stream of the game Honor of Kings or a video stream of the game League of Legends.
- To ensure the timeliness of the video frames, and thus the accuracy and timeliness of the subsequently recognized content, at least one video frame is captured from an arbitrary position in the currently received live game video stream.
- S120 Capture the game map area image in the at least one video frame.
- The video frame shows a game display interface, which is the main interface of the game application.
- the game display interface displays a game map.
- the image of the display area of the game map is called the game map area image.
- capturing the game map area image in at least one video frame includes at least the following two implementations:
- First implementation: To facilitate play, the game map is generally displayed in a preset display area of the game display interface.
- The display area of the game map can be represented by (abscissa, ordinate, width, height), and it varies with the game type. On this basis, the display area of the game map is determined according to the game type, and the image of that display area is captured from the at least one video frame. It is worth noting that the first implementation uses the display area of the game map on the game display interface as the display area of the game map on the video frame; this approach yields fairly accurate results when the video frame shows the game display interface full screen, as in the sketch below.
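- As a minimal sketch of this first implementation (the game keys and region coordinates below are illustrative assumptions, not values from this application), the preset map display area per game type can be kept in a lookup table and used to crop each frame:

```python
import numpy as np

# Hypothetical preset map display areas per game type, as (x, y, width, height).
# Real values depend on each game's UI layout and the stream resolution.
MAP_REGIONS = {
    "honor_of_kings": (0, 0, 280, 280),          # assumed top-left minimap
    "league_of_legends": (1640, 800, 280, 280),  # assumed bottom-right minimap
}

def crop_game_map(frame: np.ndarray, game_type: str) -> np.ndarray:
    """Crop the game map area image from an H x W x 3 video frame."""
    x, y, w, h = MAP_REGIONS[game_type]
    return frame[y:y + h, x:x + w]
```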
- Second implementation: Identify the display area of the game map based on a target detection model. The target detection model includes, but is not limited to, convolutional networks such as YOLO (You Only Look Once), Residual Neural Network (ResNet), MobileNetV1, and MobileNetV2 combined with the Single Shot MultiBox Detector (SSD), or the Faster Regions with Convolutional Neural Network (Faster R-CNN).
- The target detection model extracts features of the video frame and matches them against pre-stored features of the game map to obtain the display area of the game map; the image of that display area is then captured from the at least one video frame. It is worth noting that the second implementation yields fairly accurate results whether or not the video frame shows the game display interface full screen.
- S130 Input the game map area image to the first target detection model to obtain a display area of game elements on the game map area image.
- S140 Input the image of the display area of the game element into the classification model to obtain the state of the game element.
- Game elements on the game map include, but are not limited to, game characters, defense towers, beasts, and so on.
- The state of a game element includes, but is not limited to, its name, survival state, team, type, and so on, for example: the name of a game character, the team the game character belongs to, and the game character's survival state; the name, survival state, and team of a defense tower; and the name and survival state of a beast.
- the display area and status of game elements can reflect the current game situation.
- the model used to detect the display area of the game element is called the first target detection model
- the model used to detect the display area of the game map is called the second target detection model.
- The second target detection model includes, but is not limited to, convolutional networks such as YOLO, ResNet, MobileNetV1, and MobileNetV2 combined with SSD, or Faster R-CNN.
- The classification model includes, but is not limited to, lightweight classification networks of the kind used on CIFAR-10, as well as ResNet, MobileNet, Inception, and so on.
- S150 Use display areas and states of game elements to form description information of a game scene displayed in at least one video frame.
- The display area of a game element output by the first target detection model is in numeric format; for example, it is represented by (abscissa, ordinate, width, height), or, if the width and height of the game element are preset, directly by (abscissa, ordinate).
- The state output by the classification model is in string format, for example, the name or number of a game character, or the type and survival state of a defense tower.
- The description information may be formatted as a chart, text, numbers, or characters, and its content includes, but is not limited to, the attack route, attack style, and degree of participation.
- Depending on the number of video frames and the format of the description information, S150 includes the following optional implementations:
- In an optional implementation, there may be one, two, or more video frames. The numeric display areas and string states of the game elements in the at least one video frame are assembled into an array and used directly as the description information of the game scene, for example (abscissa, ordinate, state), as in the sketch below.
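- For instance, a sketch of this array-style description information, with hypothetical inputs:

```python
def build_description(display_areas, states):
    """Combine numeric display areas (x, y) with string states into
    (abscissa, ordinate, state) tuples, one per detected game element."""
    return [(x, y, state) for (x, y), state in zip(display_areas, states)]

# build_description([(120, 45)], ["anchor-side tower, full health"])
# -> [(120, 45, "anchor-side tower, full health")]
```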
- In another optional implementation, there may be one, two, or more video frames. The numeric display areas and string states are converted into text, and connecting words are added between the pieces of text to form the description information of the game scene.
- For example, the description information might read: in the first video frame, the anchor side's highland defense tower is at full health and the anchor side's game characters are concentrated in the middle lane; in the second video frame, the anchor side's highland defense tower is at low health and the anchor side's game characters are concentrated on the high ground.
- In yet another optional implementation, the number of video frames is one. The correspondence between the display area and state of the game elements and the description information is stored in advance, and the description information of the game scene shown by the one video frame is obtained according to that correspondence. For example, "the anchor side's highland defense tower is at full health and the anchor side's game characters are concentrated in the middle lane" corresponds to "the anchor side is expected to win", and "the anchor side's highland defense tower is at low health and the anchor side's game characters are concentrated on the high ground" corresponds to "the anchor side is defending".
- In yet another optional implementation, the number of video frames is two or more. Based on the display areas and states of the game elements in the two or more video frames, the change trend of the display areas and the change trend of the states are obtained; the trends can be presented in the form of charts. The description information of the game scene shown by the two or more video frames is then obtained according to the correspondence between the trends and the description information.
- For example, the trend "the health of the anchor side's highland defense tower keeps dropping" corresponds to "the anchor side is about to lose", and the trend "the anchor's game character moves from the middle of the map to the enemy high ground" corresponds to "the anchor side is attacking the crystal".
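- A minimal sketch of this trend-based correspondence, assuming a hand-written lookup table (the trend keys, state fields, and description strings are illustrative, not fixed by this application):

```python
# Hypothetical correspondence between change trends and scene descriptions.
TREND_DESCRIPTIONS = {
    "tower_health_decreasing": "the anchor side is about to lose",
    "characters_moving_to_enemy_highland": "the anchor side is attacking the crystal",
}

def describe_trend(frame_states):
    """frame_states: time-ordered per-frame dicts, e.g. {"tower_health": 0.8}."""
    healths = [s["tower_health"] for s in frame_states]
    # Toy trend extraction: tower health strictly decreasing across frames.
    if all(a > b for a, b in zip(healths, healths[1:])):
        return TREND_DESCRIPTIONS["tower_health_decreasing"]
    return "no notable trend"
```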
- In this embodiment, a game map that reflects the game situation is obtained from the live game video stream, and the first target detection model and the classification model yield the display area and state of the game elements on the game map area image, from which the description information of the game scene is formed as above.
- This embodiment elaborates on S120 of the above embodiment. In this embodiment, capturing the game map area image in at least one video frame includes: inputting the at least one video frame into a second target detection model to obtain the game map detection area in the at least one video frame; correcting the game map detection area by feature-matching the line features in the game map detection area against prior features, to obtain a game map correction area; in the case where the deviation distance of the game map correction area from the game map detection area exceeds a deviation threshold, capturing the image of the game map detection area in the video frame; and in the case where the deviation distance does not exceed the deviation threshold, capturing the image of the game map correction area in the video frame. FIG. 2 is a flowchart of a method for describing a game scene provided in Embodiment 2 of the present application. As shown in FIG. 2, the method provided in this embodiment includes the following steps.
- S210 Acquire at least one video frame in a live game video stream; this step is the same as S110 and is not repeated here.
- S220 Input at least one video frame to a second target detection model to obtain a game map detection area in at least one video frame.
- Before the at least one video frame is input into the second target detection model, the method further includes training the second target detection model.
- In an embodiment, the second target detection model can be generated by training through the following two steps.
- Step 1: Obtain multiple sample video frames. The sample video frames correspond to the same game type as the at least one video frame in S210; because image characteristics such as the color, texture, paths, and size of the game map are the same within a given game, the second target detection model trained on these sample video frames can be applied to recognizing the display area of the game map.
- Step 2: Construct a training sample set from the multiple sample video frames and the display areas of the game map on the multiple sample video frames, and train the second target detection model. In an embodiment, the difference between the display area output by the second target detection model and the display area in the sample set is used as the cost function, and the parameters of the second target detection model are updated iteratively until the cost function falls below a loss threshold, at which point training of the second target detection model is complete.
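- In PyTorch-style code, the iterate-until-threshold training described above might look as follows (a sketch assuming a regression loss on the (x, y, w, h) display areas; the application does not prescribe a framework, optimizer, or concrete loss):

```python
import torch

def train_detector(model, loader, loss_threshold=0.01, lr=1e-4, max_epochs=100):
    """Iterate the detector's parameters until the cost (difference between
    predicted and labeled display areas) falls below the loss threshold."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    cost_fn = torch.nn.SmoothL1Loss()  # regression on (x, y, w, h) boxes
    for _ in range(max_epochs):
        epoch_cost = 0.0
        for frames, target_boxes in loader:
            optimizer.zero_grad()
            cost = cost_fn(model(frames), target_boxes)
            cost.backward()
            optimizer.step()
            epoch_cost += cost.item()
        if epoch_cost / len(loader) < loss_threshold:
            break  # training of the detection model is complete
    return model
```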
- The second target detection model includes a feature map generation sub-model, a grid segmentation sub-model, and a localization sub-model connected in sequence.
- In S220, the at least one video frame is input into the feature map generation sub-model to generate a feature map of the video frame.
- the feature map may be two-dimensional or three-dimensional.
- the feature map of the video frame is input to the grid segmentation sub-model, and the feature map is divided into multiple grids; the difference between the size of the grid and the size of the game map is within a preset size range.
- the size of the grid is represented by hyperparameters, and it is set according to the size of the game map before the second target detection model is trained.
- The multiple grids are input into the localization sub-model, which is loaded with the features of the standard game map. The localization sub-model matches each grid against the features of the standard game map to obtain the matching degree between each grid and the standard game map's features; the matching degree is, for example, the cosine similarity or the distance between the two features.
- The area corresponding to the grids whose matching degree exceeds the matching degree threshold is taken as the game map detection area. If no grid's matching degree exceeds the threshold, there is no game map in the video frame, and the localization sub-model directly outputs "no game map exists".
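- A sketch of this matching step, assuming grid features and the standard-map feature are already flattened vectors (function and variable names are hypothetical):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def locate_map(grid_features, standard_map_feature, match_threshold=0.8):
    """Return the indices of grids whose matching degree with the standard
    game map feature exceeds the threshold; an empty list corresponds to the
    'no game map exists' output."""
    degrees = [cosine(g, standard_map_feature) for g in grid_features]
    return [i for i, d in enumerate(degrees) if d > match_threshold]
```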
- the detection area of the game map is directly identified by the second target detection model.
- an image of the game map detection area may be directly intercepted from the video frame as the game map area image.
- S230 Correct the game map detection area by performing feature matching on the line features and the prior features in the game map detection area to obtain a game map correction area.
- Considering that the game map detection area may contain errors, it is corrected in this embodiment.
- prior characteristics of the lines in the standard game map area such as line angle, line thickness, line color, and the like are stored in advance.
- a straight line with a specified width and angle in the detection area of the game map is extracted as a line feature.
- Feature matching is performed on the line features and the prior features in the game map detection area, that is, the matching degree between the aforementioned line features and the prior features is calculated. If the matching degree is greater than the matching degree threshold, an image of the game map detection area is intercepted from the video frame as the game map area image.
- If the matching degree is less than or equal to the matching degree threshold, the display position of the game map detection area is corrected until the matching degree exceeds the threshold.
- the corrected area is called the game map correction area.
- an image of the game map correction area is captured from a video frame as the game map area image.
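- One plausible realization of the line-feature matching (a sketch using OpenCV's probabilistic Hough transform; the prior angles, tolerances, and scoring rule are assumptions, not specified by this application):

```python
import cv2
import numpy as np

# Assumed prior line angles (degrees) of the standard game map, e.g. lane lines.
PRIOR_ANGLES = np.array([0.0, 45.0, 90.0, 135.0])

def line_match_degree(map_region_bgr: np.ndarray) -> float:
    """Score how well line angles in the detected region match the priors."""
    gray = cv2.cvtColor(map_region_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                            minLineLength=20, maxLineGap=5)
    if lines is None:
        return 0.0
    angles = [np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180.0
              for x1, y1, x2, y2 in lines[:, 0]]
    # Fraction of detected lines within 5 degrees of some prior angle.
    hits = sum(float(np.min(np.abs(PRIOR_ANGLES - a))) < 5.0 for a in angles)
    return hits / len(angles)
```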
- S240 Determine whether the deviation distance of the game map correction area from the game map detection area exceeds the deviation threshold. If it does, jump to S250; if it does not, jump to S260.
- S250 Capture the image of the game map detection area in the video frame, then go to S270.
- S260 Capture the image of the game map correction area in the video frame, then go to S270.
- Considering that the game map correction area may be over-corrected, leaving the game map insufficiently precisely located, this embodiment computes the offset distance of the game map correction area relative to the game map detection area, for example, the offset of the center of the correction area relative to the center of the detection area, or the offset of the upper-right corner of the correction area relative to the upper-right corner of the detection area. If the offset distance of a video frame's game map correction area from that frame's game map detection area exceeds the deviation threshold, the correction is excessive; the correction area is discarded, and the image of the frame's game map detection area is captured as the frame's game map area image. If the offset distance does not exceed the deviation threshold, the correction is not excessive, and the image of the frame's game map correction area is captured as the frame's game map area image.
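- The over-correction check of S240-S260 reduces to comparing a center (or corner) offset against the deviation threshold; a sketch with boxes as (x, y, w, h) tuples and an assumed threshold:

```python
import math

def choose_map_area(detected, corrected, deviation_threshold=20.0):
    """Keep the correction area unless it drifted too far from the detection
    area, in which case it is over-corrected and is discarded.
    Boxes are (x, y, w, h) tuples; the offset is measured between centers."""
    def center(box):
        x, y, w, h = box
        return (x + w / 2.0, y + h / 2.0)
    (cx1, cy1), (cx2, cy2) = center(detected), center(corrected)
    offset = math.hypot(cx2 - cx1, cy2 - cy1)
    return detected if offset > deviation_threshold else corrected
```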
- S270 Input the game map area image into the first target detection model to obtain the display area of game elements on the game map area image.
- S280 Input the image of the display area of the game elements into the classification model to obtain the state of the game elements.
- S290 Use display areas and states of game elements to form description information of a game scene displayed in at least one video frame.
- S270, S280, and S290 are respectively the same as S130, S140, and S150 in the foregoing embodiment, and details are not described herein again.
- In this embodiment, the game map detection area is corrected by feature-matching its line features against prior features to obtain the game map correction area; if the deviation distance of the correction area from the detection area exceeds the deviation threshold, the image of the game map detection area in the video frame is captured, and if it does not, the image of the game map correction area is captured. The game map image is thus precisely located through feature matching and area correction.
- This embodiment elaborates on S130 of the above embodiment. In this embodiment, inputting the game map area image into the first target detection model to obtain the display area of game elements on the game map area image includes: inputting the game map area image into a feature map generation sub-model to generate a feature map of the game map area image; inputting the feature map into a grid segmentation sub-model to divide it into multiple grids, where the difference between the grid size and the minimum size of the game elements is within a preset size range; inputting the multiple grids into a localization sub-model to obtain the matching degree of each grid with the features of multiple kinds of game elements; and using a non-maximum suppression algorithm to determine the area corresponding to the grid with the highest matching degree as the display area of the corresponding kind of game element on the game map area image.
- FIG. 3 is a flowchart of a method for describing a game scene provided in Embodiment 3 of the present application. As shown in FIG. 3, the method provided in this embodiment includes the following steps.
- S310 Acquire at least one video frame in a live game video stream; this step is the same as S110 and is not repeated here.
- S320 Capture the game map area image in the at least one video frame; for details, see Embodiments 1 and 2 above.
- Before the game map area image is input into the first target detection model to obtain the display area of game elements on the game map area image, the method further includes training the first target detection model. In an embodiment, the first target detection model can be generated by training through the following two steps.
- Step 1: Obtain multiple game map sample images, i.e., images of the game map. The game map sample images correspond to the same game type as the game map area image; because image characteristics such as the color, shape, and texture of game elements are the same within a given game, the first target detection model trained on the game map sample images can be applied to recognizing the display area of game elements.
- Step 2: Construct a training sample set from the multiple game map sample images and the display areas of game elements on the multiple game map sample images, and train the first target detection model. In an embodiment, the difference between the display area output by the first target detection model and the display area in the sample set is used as the cost function, and the parameters of the first target detection model are updated iteratively until the cost function falls below a loss threshold, at which point training of the first target detection model is complete.
- The first target detection model includes a feature map generation sub-model, a grid segmentation sub-model, and a localization sub-model connected in sequence.
- the detection process of the first target detection model is described below through S330-S350.
- S330 Input the game map area image into the feature map generation sub-model to generate a feature map of the game map area image. The feature map may be two-dimensional or three-dimensional.
- S340 Input the feature map into the grid segmentation sub-model to divide the feature map into multiple grids; the difference between the grid size and the minimum size of the game elements is within a preset size range.
- The game map displays at least one kind of game element, and different kinds of game elements generally differ in size. To avoid over-segmentation into grids, the difference between the grid size and the minimum size of the game elements is kept within a preset size range. In a concrete implementation, the grid size is expressed as a hyperparameter and is set according to the minimum size of the game elements before the first target detection model is trained.
- S350 Input the multiple grids into the localization sub-model to obtain the matching degree of each grid with the features of multiple kinds of game elements.
- The localization sub-model is loaded with the features of multiple standard game elements, and each grid is essentially a grid-sized feature. The localization sub-model matches each grid against the features of the multiple standard game elements to obtain the matching degree of each grid with each standard game element's features; the matching degree is, for example, the cosine similarity or the distance between the two features.
- Illustratively, the game elements include two kinds: game characters and defense towers. The localization sub-model is loaded with the features of a standard game character and the features of a standard defense tower. It matches grid 1 against the features of the standard game character to obtain matching degree A, and against the features of the standard defense tower to obtain matching degree B; it then matches grid 2 against the features of the standard game character to obtain matching degree C, and against the features of the standard defense tower to obtain matching degree D.
- S360 Use a non-maximum suppression algorithm to determine the area corresponding to the grid with the highest matching degree as the display area of the corresponding kind of game element on the game map area image. The non-maximum suppression algorithm looks for maxima over the range of all grids and suppresses non-maxima. If matching degree C is the maximum, the area corresponding to grid 2 is taken as the display area of the game character; if matching degrees C and A are both maxima, the merged area of grid 1 and grid 2 is taken as the display area of the game character.
- In some embodiments, a certain kind of game element may not be displayed on the game map, so a matching degree threshold corresponding to each kind of game element is set. The non-maximum suppression algorithm is applied only to matching degrees exceeding the threshold; if no matching degree exceeds the threshold, that kind of game element is considered not to be displayed on the game map.
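- A sketch of this thresholded selection; with at most one surviving region per element type, grid-level non-maximum suppression degenerates to an argmax over the grids that pass the per-type threshold (the grid-merging case is omitted, and all names and values are illustrative):

```python
def locate_elements(grid_matches, thresholds):
    """grid_matches: dict of element type -> per-grid matching degrees.
    thresholds: dict of element type -> matching degree threshold.
    Returns element type -> index of the best grid, or None when the
    element is considered not displayed on the game map."""
    result = {}
    for elem, degrees in grid_matches.items():
        survivors = [(d, i) for i, d in enumerate(degrees) if d > thresholds[elem]]
        result[elem] = max(survivors)[1] if survivors else None
    return result

# locate_elements({"character": [0.2, 0.9], "tower": [0.4, 0.3]},
#                 {"character": 0.5, "tower": 0.5})
# -> {"character": 1, "tower": None}
```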
- S370 Input the image of the display area of the game element into the classification model to obtain the state of the game element. The image of the display area of the game element is captured and input into the classification model.
- the classification model stores the state and corresponding characteristics of standard game elements in advance.
- the classification model extracts features in the image and matches them with a feature library corresponding to the state of the game elements stored in advance to obtain the state corresponding to the feature with the highest matching degree.
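- A sketch of this nearest-feature state lookup, assuming the classification backbone exposes a feature vector and the per-state feature library is precomputed (all names are hypothetical):

```python
import numpy as np

def classify_state(element_feature: np.ndarray, state_library: dict) -> str:
    """Return the state whose stored reference feature best matches the
    extracted feature of the game element's display area image.
    state_library: dict of state name -> reference feature vector."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    return max(state_library, key=lambda s: cosine(element_feature, state_library[s]))
```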
- S380 Use display areas and states of game elements to form description information of the game scene displayed in at least one video frame.
- In this embodiment, precise localization of game elements is achieved through the feature map generation sub-model, the grid segmentation sub-model, and the localization sub-model, and accurate classification of game elements is achieved through the classification model, thereby improving the accuracy of the game scene description.
- FIG. 4 is a schematic structural diagram of a game scene description apparatus provided in Embodiment 4 of the present application. As shown in FIG. 4, the apparatus includes an acquisition module 41, an interception module 42, a display area recognition module 43, a state recognition module 44, and a forming module 45.
- The acquisition module 41 is configured to acquire at least one video frame in a live game video stream; the interception module 42 is configured to capture the game map area image in the at least one video frame; the display area recognition module 43 is configured to input the game map area image into the first target detection model to obtain the display area of game elements on the game map area image; the state recognition module 44 is configured to input the image of the display area of the game elements into the classification model to obtain the state of the game elements; and the forming module 45 is configured to use the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.
- By acquiring at least one video frame in a live game video stream and capturing the game map area image in the at least one video frame, the present application obtains, from the live game video stream, a game map that reflects the game situation; through the first target detection model and the classification model, the display area and state of game elements on the game map area image are obtained, applying deep-learning-based image recognition algorithms to the understanding of the game map and extracting the display area and state of the game elements; then, the display area and state of the game elements are used to form description information of the game scene shown by the at least one video frame. In this way, with the game map as the recognition target and in combination with image recognition algorithms, the specific game scene inside the live game video stream is obtained, which facilitates subsequent pushing or classification of live game video streams for specific game scenes, meets users' personalized needs, and helps improve the content distribution efficiency of the game live-streaming industry.
- In an optional implementation, the interception module 42 is configured to: input the at least one video frame into the second target detection model to obtain the game map detection area of each of the at least one video frame; correct the game map detection area by feature-matching the line features in the game map detection area against prior features, to obtain the game map correction area; in the case where the deviation distance of a video frame's game map correction area from that frame's game map detection area exceeds the deviation threshold, capture the image of the game map detection area in the frame; and in the case where the deviation distance does not exceed the deviation threshold, capture the image of the game map correction area in the frame.
- In an optional implementation, the apparatus further includes a training module configured to, before the at least one video frame is input into the second target detection model, obtain multiple sample video frames corresponding to the same game type as the at least one video frame, and construct a training sample set from the multiple sample video frames and the display areas of the game map on those frames to train the second target detection model.
- In an optional implementation, the training module is further configured to, before the game map area image is input into the first target detection model to obtain the display area of game elements on the game map area image, obtain multiple game map sample images corresponding to the same game type as the game map area image, and construct a training sample set from the multiple game map sample images and the display areas of game elements on those images to train the first target detection model.
- In an optional implementation, the first target detection model includes a feature map generation sub-model, a grid segmentation sub-model, and a localization sub-model.
- The display area recognition module 43 is configured to: input the game map area image into the feature map generation sub-model to generate a feature map of the game map area image; input the feature map into the grid segmentation sub-model to divide it into multiple grids, where the difference between the grid size and the minimum size of the game elements is within a preset size range; input the multiple grids into the localization sub-model to obtain the matching degree of each grid with the features of multiple kinds of game elements; and use a non-maximum suppression algorithm to determine the area corresponding to the grid with the highest matching degree as the display area of the corresponding kind of game element on the game map area image.
- In an optional implementation, the forming module 45 is configured to: obtain the description information of the game scene shown by one video frame according to the correspondence between the display area and state of the game elements in that frame and the description information; or, based on the display areas and states of the game elements in two or more video frames, obtain the change trends of the display areas and states of the game elements, and obtain the description information of the game scene shown by the two or more video frames according to the correspondence between the trends and the description information.
- The game scene description apparatus provided by the embodiments of the present application can execute the game scene description method provided by any embodiment of the present application, and has the corresponding functional modules and beneficial effects for executing the method.
- FIG. 5 is a schematic structural diagram of an electronic device according to Embodiment 5 of the present application.
- The electronic device may be a server, an anchor client, or a user client.
- The electronic device includes a processor 50 and a memory 51; the number of processors 50 in the electronic device may be one or more, and one processor 50 is taken as an example in FIG. 5; the processor 50 and the memory 51 in the electronic device may be connected via a bus or in other ways, and connection via a bus is taken as an example in FIG. 5.
- As a computer-readable storage medium, the memory 51 can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the game scene description method in the embodiments of the present application (for example, the acquisition module 41, interception module 42, display area recognition module 43, state recognition module 44, and forming module 45).
- the processor 50 executes various functional applications and data processing of the electronic device by running software programs, instructions, and modules stored in the memory 51, that is, implementing the foregoing method for describing a game scene.
- the memory 51 may mainly include a storage program area and a storage data area, where the storage program area may store an operating system and application programs required for at least one function; the storage data area may store data created according to the use of the terminal, and the like.
- the memory 51 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
- In some examples, the memory 51 may include memory remotely located relative to the processor 50, and such remote memory may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
- Embodiment 6 of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a computer processor, is used to execute a method for describing a game scene. The method includes: acquiring at least one video frame in a live game video stream; capturing the game map area image in the at least one video frame; inputting the game map area image into a first target detection model to obtain the display area of game elements on the game map area image; inputting the image of the display area of the game elements into a classification model to obtain the state of the game elements; and using the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.
- Of course, with the computer-readable storage medium provided by the embodiments of the present application, the stored computer program is not limited to the above method operations and can also execute related operations in the method for describing a game scene provided by any embodiment of the present application.
- From the above description of the embodiments, those skilled in the art will understand that the present application can be implemented by means of software and general-purpose hardware, and of course also by hardware.
- Based on this understanding, the technical solution of the present application can be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk, or optical disk, and includes multiple instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method of any embodiment of the present application.
Abstract
The present application discloses a game scene description method, apparatus, device, and storage medium. The method includes: acquiring at least one video frame in a live game video stream; capturing the game map area image in the at least one video frame; inputting the game map area image into a first target detection model to obtain the display area of game elements on the game map area image; inputting the image of the display area of the game elements into a classification model to obtain the state of the game elements; and using the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.
Description
This application claims priority to Chinese patent application No. 201810517799.X, filed with the Chinese Patent Office on May 25, 2018, the entire contents of which are incorporated herein by reference.
The embodiments of the present application relate to the field of computer vision technology, and for example to a game scene description method, apparatus, device, and storage medium.

With the development of the game live-streaming industry and the growing number of game anchors, anchor clients send large numbers of live game video streams to the server, and the server distributes them to user clients for users to watch.

The information carried by a live game video stream is very limited, for example, the live room number, the anchor's name, and the signature added by the anchor. Such information cannot accurately describe the game scene inside the live game video stream, so live game video streams cannot be pushed or distinguished for specific game scenes; this in turn fails to meet users' personalized needs and is not conducive to improving the content distribution efficiency of the game live-streaming industry.
Summary
The present application provides a game scene description method, apparatus, device, and storage medium, so as to accurately describe the game scene inside a live game video stream.

In a first aspect, an embodiment of the present application provides a game scene description method, including:

acquiring at least one video frame in a live game video stream;

capturing the game map area image in the at least one video frame;

inputting the game map area image into a first target detection model to obtain the display area of game elements on the game map area image;

inputting the image of the display area of the game elements into a classification model to obtain the state of the game elements; and

using the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.
In a second aspect, an embodiment of the present application further provides a game scene description apparatus, including:

an acquisition module configured to acquire at least one video frame in a live game video stream;

an interception module configured to capture the game map area image in the at least one video frame;

a display area recognition module configured to input the game map area image into a first target detection model to obtain the display area of game elements on the game map area image;

a state recognition module configured to input the image of the display area of the game elements into a classification model to obtain the state of the game elements; and

a forming module configured to use the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.
In a third aspect, an embodiment of the present application further provides an electronic device, including:

one or more processors; and

a memory configured to store one or more programs;

where, when the one or more programs are executed by the one or more processors, the one or more processors implement the game scene description method of any embodiment.

In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the game scene description method of any embodiment.
By acquiring at least one video frame in a live game video stream and capturing the game map area image in the at least one video frame, the present application obtains, from the live game video stream, a game map that reflects the game situation; through the first target detection model and the classification model, the display area and state of game elements on the game map area image are obtained, applying deep-learning-based image recognition algorithms to the understanding of the game map and extracting the display area and state of the game elements; then, the display area and state of the game elements are used to form description information of the game scene shown by the at least one video frame. In this way, with the game map as the recognition target and in combination with image recognition algorithms, the specific game scene inside the live game video stream is obtained, which facilitates subsequent pushing or classification of live game video streams for specific game scenes, meets users' personalized needs, and helps improve the content distribution efficiency of the game live-streaming industry.
FIG. 1 is a flowchart of a game scene description method provided in Embodiment 1 of the present application;

FIG. 2 is a flowchart of a game scene description method provided in Embodiment 2 of the present application;

FIG. 3 is a flowchart of a game scene description method provided in Embodiment 3 of the present application;

FIG. 4 is a schematic structural diagram of a game scene description apparatus provided in Embodiment 4 of the present application;

FIG. 5 is a schematic structural diagram of an electronic device provided in Embodiment 5 of the present application.
The present application is described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are merely used to explain the present application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the complete structure.
Embodiment 1
FIG. 1 is a flowchart of a game scene description method provided in Embodiment 1 of the present application. This embodiment is applicable to describing the game scene inside a live game video stream. The method may be executed by a game scene description apparatus, which may be composed of hardware and/or software and may generally be integrated in a server, an anchor client, or a user client. The method includes the following steps.

S110: Acquire at least one video frame in a live game video stream.

The game scene description apparatus receives, in real time, the live game video stream corresponding to an anchor's live room. A live game video stream is a video stream whose content is a game, for example, a video stream of the game Honor of Kings or of the game League of Legends. To ensure the timeliness of the video frames, and thus the accuracy and timeliness of the subsequently recognized content, at least one video frame is captured from an arbitrary position in the currently received live game video stream.
S120: Capture the game map area image in the at least one video frame.

The video frame shows a game display interface, which is the main interface of the game application, and a game map is displayed on that interface. For ease of description and distinction, the image of the display area of the game map is called the game map area image.

In an embodiment, capturing the game map area image in the at least one video frame includes at least the following two implementations:

First implementation: For the player's convenience, the game map is generally displayed in a preset display area of the game display interface. The display area of the game map can be represented by (abscissa, ordinate, width, height) and varies with the game type. On this basis, the display area of the game map is determined according to the game type, and the image of that display area is captured from the at least one video frame. It is worth noting that the first implementation uses the display area of the game map on the game display interface as its display area on the video frame; this approach yields fairly accurate results when the video frame shows the game display interface full screen.

Second implementation: Identify the display area of the game map based on a target detection model. The target detection model includes, but is not limited to, convolutional networks such as YOLO (You Only Look Once), Residual Neural Network (ResNet), MobileNetV1, and MobileNetV2 combined with the Single Shot MultiBox Detector (SSD), or Faster Regions with Convolutional Neural Network (Faster R-CNN). The target detection model extracts features of the video frame and matches them against pre-stored features of the game map to obtain the display area of the game map; the image of that display area is then captured from the at least one video frame. It is worth noting that the second implementation yields fairly accurate results whether or not the video frame shows the game display interface full screen.
S130: Input the game map area image into the first target detection model to obtain the display area of game elements on the game map area image.

S140: Input the image of the display area of the game elements into the classification model to obtain the state of the game elements.

Game elements on the game map include, but are not limited to, game characters, defense towers, beasts, and so on. The state of a game element includes, but is not limited to, its name, survival state, team, and type, for example: the name of a game character, the team the game character belongs to, and the game character's survival state; the name, survival state, and team of a defense tower; and the name and survival state of a beast. The display area and state of the game elements can reflect the current game situation.

For ease of description and distinction, the model used to detect the display area of game elements is called the first target detection model, and the above model used to detect the display area of the game map is called the second target detection model. In an embodiment, the second target detection model includes, but is not limited to, convolutional networks such as YOLO, ResNet, MobileNetV1, and MobileNetV2 combined with SSD, or Faster R-CNN. The classification model includes, but is not limited to, lightweight classification networks of the kind used on CIFAR-10, as well as ResNet, MobileNet, Inception, and so on.

S150: Use the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.

The display area of a game element output by the first target detection model is in numeric format; for example, it is represented by (abscissa, ordinate, width, height), or, if the width and height of the game element are preset, directly by (abscissa, ordinate).

The state output by the classification model is in string format, for example, the name or number of a game character, or the type and survival state of a defense tower. In an embodiment, the description information may be formatted as a chart, text, numbers, or characters, and its content includes, but is not limited to, the attack route, attack style, and degree of participation.
Depending on the number of video frames and the format of the description information, S150 includes the following optional implementations:

In an optional implementation, there may be one, two, or more video frames. The numeric display areas and string states of the game elements in the at least one video frame are assembled into arrays and used directly as the description information of the game scene, for example (abscissa, ordinate, state).

In another optional implementation, there may be one, two, or more video frames. The numeric display areas and string states are converted into text, and connecting words are added between the pieces of text to form the description information of the game scene. For example, the description information might read: in the first video frame, the anchor side's highland defense tower is at full health and the anchor side's game characters are concentrated in the middle lane; in the second video frame, the anchor side's highland defense tower is at low health and the anchor side's game characters are concentrated on the high ground.

In yet another optional implementation, the number of video frames is one. The correspondence between the display area and state of the game elements and the description information is stored in advance, and the description information of the game scene shown by the one video frame is obtained according to that correspondence. For example, "the anchor side's highland defense tower is at full health and the anchor side's game characters are concentrated in the middle lane" corresponds to "the anchor side is expected to win", and "the anchor side's highland defense tower is at low health and the anchor side's game characters are concentrated on the high ground" corresponds to "the anchor side is defending".

In yet another optional implementation, the number of video frames is two or more. Based on the display areas and states of the game elements in the two or more video frames, the change trend of the display areas and the change trend of the states are obtained; the trends can be presented in the form of charts. The description information of the game scene shown by the two or more video frames is then obtained according to the correspondence between the trends and the description information. For example, the trend "the health of the anchor side's highland defense tower keeps dropping" corresponds to "the anchor side is about to lose", and the trend "the anchor's game character moves from the middle of the map to the enemy high ground" corresponds to "the anchor side is attacking the crystal".

In this embodiment, by acquiring at least one video frame in a live game video stream and capturing the game map area image in the at least one video frame, a game map that reflects the game situation is obtained from the live game video stream; through the first target detection model and the classification model, the display area and state of the game elements on the game map area image are obtained, applying deep-learning-based image recognition algorithms to the understanding of the game map and extracting the display area and state of the game elements; then, the display area and state of the game elements are used to form description information of the game scene shown by the at least one video frame. In this way, with the game map as the recognition target and in combination with image recognition algorithms, the specific game scene inside the live game video stream is obtained, which facilitates subsequent pushing or classification of live game video streams for specific game scenes, meets users' personalized needs, and helps improve the content distribution efficiency of the game live-streaming industry.
Embodiment 2
This embodiment elaborates on S120 of the above embodiment. In this embodiment, capturing the game map area image in the at least one video frame includes: inputting the at least one video frame into a second target detection model to obtain the game map detection area in the at least one video frame; correcting the game map detection area by feature-matching the line features in the game map detection area against prior features, to obtain a game map correction area; in the case where the deviation distance of the game map correction area from the game map detection area exceeds a deviation threshold, capturing the image of the game map detection area in the video frame; and in the case where the deviation distance does not exceed the deviation threshold, capturing the image of the game map correction area in the video frame. FIG. 2 is a flowchart of a game scene description method provided in Embodiment 2 of the present application. As shown in FIG. 2, the method provided in this embodiment includes the following steps.
S210: Acquire at least one video frame in a live game video stream. S210 is the same as S110 and is not repeated here.

S220: Input the at least one video frame into the second target detection model to obtain the game map detection area in the at least one video frame.

Before the at least one video frame is input into the second target detection model, the method further includes training the second target detection model. In an embodiment, the second target detection model can be generated by training through the following two steps.

Step 1: Obtain multiple sample video frames. The sample video frames correspond to the same game type as the at least one video frame in S210; because image characteristics such as the color, texture, paths, and size of the game map are the same within a given game, the second target detection model trained on the sample video frames can be applied to recognizing the display area of the game map.

Step 2: Construct a training sample set from the multiple sample video frames and the display areas of the game map on those frames, and train the second target detection model. In an embodiment, the difference between the display area output by the second target detection model and the display area in the sample set is used as the cost function, and the parameters of the second target detection model are updated iteratively until the cost function falls below a loss threshold, at which point training of the second target detection model is complete.

The second target detection model includes a feature map generation sub-model, a grid segmentation sub-model, and a localization sub-model connected in sequence. In S220, the at least one video frame is input into the feature map generation sub-model to generate a feature map of the video frame, which may be two-dimensional or three-dimensional. The feature map is then input into the grid segmentation sub-model, which divides it into multiple grids; the difference between the grid size and the size of the game map is within a preset size range. In a concrete implementation, the grid size is expressed as a hyperparameter and is set according to the size of the game map before the second target detection model is trained. Next, the multiple grids are input into the localization sub-model, which is loaded with the features of the standard game map; the localization sub-model matches each grid against those features to obtain the matching degree between each grid and the standard game map's features (for example, the cosine similarity or the distance between the two features), and the area corresponding to the grids whose matching degree exceeds the matching degree threshold is taken as the game map detection area. If no grid's matching degree exceeds the threshold, there is no game map in the video frame, and the localization sub-model directly outputs "no game map exists".

As can be seen, the game map detection area is obtained directly from the second target detection model. In some embodiments, the image of the game map detection area can be cropped directly from the video frame as the game map area image.
S230: Correct the game map detection area by feature-matching the line features in the game map detection area against prior features, to obtain the game map correction area.

Considering that the game map detection area may contain errors, this embodiment corrects it. Illustratively, prior features of the lines in the standard game map area, such as line angle, line thickness, and line color, are stored in advance. Straight lines of specified width and angle are extracted from the game map detection area as line features. Feature matching between the line features and the prior features means computing the matching degree between them. If the matching degree is greater than the matching degree threshold, the image of the game map detection area is cropped from the video frame as the game map area image. If the matching degree is less than or equal to the threshold, the display position of the game map detection area is corrected until the matching degree exceeds the threshold. The corrected area is called the game map correction area. In some embodiments, the image of the game map correction area is cropped from the video frame as the game map area image.

S240: Determine whether the deviation distance of the game map correction area from the game map detection area exceeds the deviation threshold; in response to a determination that it does, jump to S250, and in response to a determination that it does not, jump to S260.

S250: Capture the image of the game map detection area in the video frame. Go to S270.

S260: Capture the image of the game map correction area in the video frame. Go to S270.

Considering that the game map correction area may be over-corrected, leaving the game map insufficiently precisely located, this embodiment computes the offset distance of the game map correction area relative to the game map detection area, for example, the offset of the center of the correction area relative to the center of the detection area, or the offset of the upper-right corner of the correction area relative to the upper-right corner of the detection area. If the offset distance of a video frame's game map correction area from that frame's game map detection area exceeds the deviation threshold, the correction is excessive; the correction area is discarded, and the image of the detection area is cropped as the frame's game map area image. If the offset distance does not exceed the deviation threshold, the correction is not excessive, and the image of the correction area is cropped as the frame's game map area image.
S270: Input the game map area image into the first target detection model to obtain the display area of game elements on the game map area image.

S280: Input the image of the display area of the game elements into the classification model to obtain the state of the game elements.

S290: Use the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.

S270, S280, and S290 are the same as S130, S140, and S150 in the above embodiment, respectively, and are not repeated here.

In this embodiment, the game map detection area is corrected by feature-matching its line features against prior features to obtain the game map correction area; if the deviation distance of the correction area from the detection area exceeds the deviation threshold, the image of the game map detection area in the video frame is captured, and if it does not, the image of the game map correction area is captured, so that the game map image is precisely located through feature matching and area correction.
Embodiment 3
This embodiment elaborates on S130 of the above embodiment. In this embodiment, inputting the game map area image into the first target detection model to obtain the display area of game elements on the game map area image includes: inputting the game map area image into a feature map generation sub-model to generate a feature map of the game map area image; inputting the feature map into a grid segmentation sub-model to divide it into multiple grids, where the difference between the grid size and the minimum size of the game elements is within a preset size range; inputting the multiple grids into a localization sub-model to obtain the matching degree of each grid with the features of multiple kinds of game elements; and using a non-maximum suppression algorithm to determine the area corresponding to the grid with the highest matching degree as the display area of the corresponding kind of game element on the game map area image. FIG. 3 is a flowchart of a game scene description method provided in Embodiment 3 of the present application. As shown in FIG. 3, the method provided in this embodiment includes the following steps.
S310: Acquire at least one video frame in a live game video stream. S310 is the same as S110 and is not repeated here.

S320: Capture the game map area image in the at least one video frame. For a description of S320, see Embodiments 1 and 2 above; it is not repeated here.

In this embodiment, before the game map area image is input into the first target detection model to obtain the display area of game elements on the game map area image, the method further includes training the first target detection model. In an embodiment, the first target detection model can be generated by training through the following two steps.

Step 1: Obtain multiple game map sample images, i.e., images of the game map. The game map sample images correspond to the same game type as the game map area image; because image characteristics such as the color, shape, and texture of game elements are the same within a given game, the first target detection model trained on the game map sample images can be applied to recognizing the display area of game elements.

Step 2: Construct a training sample set from the multiple game map sample images and the display areas of game elements on those images, and train the first target detection model. In an embodiment, the difference between the display area output by the first target detection model and the display area in the sample set is used as the cost function, and the parameters of the first target detection model are updated iteratively until the cost function falls below a loss threshold, at which point training of the first target detection model is complete.

The first target detection model includes a feature map generation sub-model, a grid segmentation sub-model, and a localization sub-model connected in sequence. The detection process of the first target detection model is described below through S330-S350.
S330: Input the game map area image into the feature map generation sub-model to generate a feature map of the game map area image. The feature map may be two-dimensional or three-dimensional.

S340: Input the feature map into the grid segmentation sub-model to divide the feature map into multiple grids; the difference between the grid size and the minimum size of the game elements is within a preset size range.

The game map displays at least one kind of game element, and different kinds of game elements generally differ in size. To avoid over-segmentation into grids, the difference between the grid size and the minimum size of the game elements is kept within a preset size range. In a concrete implementation, the grid size is expressed as a hyperparameter and is set according to the minimum size of the game elements before the first target detection model is trained.

S350: Input the multiple grids into the localization sub-model to obtain the matching degree of each grid with the features of multiple kinds of game elements.

S360: Use a non-maximum suppression algorithm to determine the area corresponding to the grid with the highest matching degree as the display area of the corresponding kind of game element on the game map area image.

The localization sub-model is loaded with the features of multiple standard game elements, and each grid is essentially a grid-sized feature. The localization sub-model matches each grid against the features of the multiple standard game elements to obtain the matching degree of each grid with each standard game element's features; the matching degree is, for example, the cosine similarity or the distance between the two features.
Illustratively, the game elements include two kinds: game characters and defense towers. The localization sub-model is loaded with the features of a standard game character and the features of a standard defense tower. It matches grid 1 against the features of the standard game character to obtain matching degree A, and against the features of the standard defense tower to obtain matching degree B; it then matches grid 2 against the features of the standard game character to obtain matching degree C, and against the features of the standard defense tower to obtain matching degree D.

The non-maximum suppression algorithm looks for maxima over all grids and suppresses non-maxima. If matching degree C is the maximum, the area corresponding to grid 2 is taken as the display area of the game character; if matching degrees C and A are both maxima, the merged area of grid 1 and grid 2 is taken as the display area of the game character.

In some embodiments, a certain kind of game element may not be displayed on the game map, so a matching degree threshold corresponding to each kind of game element is set. The non-maximum suppression algorithm is applied only to matching degrees exceeding the threshold; if no matching degree exceeds the threshold, that kind of game element is considered not to be displayed on the game map.

S370: Input the image of the display area of the game elements into the classification model to obtain the state of the game elements.

The image of the display area of a game element is captured and input into the classification model. The classification model stores in advance the states of standard game elements and the corresponding features; it extracts features from the image and matches them against the pre-stored feature library of game element states to obtain the state corresponding to the feature with the highest matching degree.

S380: Use the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.

In this embodiment, precise localization of game elements is achieved through the feature map generation sub-model, the grid segmentation sub-model, and the localization sub-model, and accurate classification of game elements is achieved through the classification model, thereby improving the accuracy of the game scene description.
Embodiment 4
FIG. 4 is a schematic structural diagram of a game scene description apparatus provided in Embodiment 4 of the present application. As shown in FIG. 4, the apparatus includes an acquisition module 41, an interception module 42, a display area recognition module 43, a state recognition module 44, and a forming module 45.

The acquisition module 41 is configured to acquire at least one video frame in a live game video stream; the interception module 42 is configured to capture the game map area image in the at least one video frame; the display area recognition module 43 is configured to input the game map area image into the first target detection model to obtain the display area of game elements on the game map area image; the state recognition module 44 is configured to input the image of the display area of the game elements into the classification model to obtain the state of the game elements; and the forming module 45 is configured to use the display area and state of the game elements to form description information of the game scene shown by the at least one video frame.

By acquiring at least one video frame in a live game video stream and capturing the game map area image in the at least one video frame, the present application obtains, from the live game video stream, a game map that reflects the game situation; through the first target detection model and the classification model, the display area and state of game elements on the game map area image are obtained, applying deep-learning-based image recognition algorithms to the understanding of the game map and extracting the display area and state of the game elements; then, the display area and state of the game elements are used to form description information of the game scene shown by the at least one video frame. In this way, with the game map as the recognition target and in combination with image recognition algorithms, the specific game scene inside the live game video stream is obtained, which facilitates subsequent pushing or classification of live game video streams for specific game scenes, meets users' personalized needs, and helps improve the content distribution efficiency of the game live-streaming industry.
在一可选实施方式中,截取模块42是设置为:将至少一个视频帧输入至第二目标检测模型,得到至少一个视频帧中每个视频帧的游戏地图检测区域;通过对游戏地图检测区域中的线路特征和先验特征进行特征匹配,校正游戏地图检测区域,以得到游戏地图校正区域;在一个视频帧的游戏地图校正区域相对于该视频帧的游戏地图检测区域的偏离距离超过偏离阈值的情况下,截取该视频帧中的游戏地图检测区域的图像。在一个视频帧的游戏地图校正区域相对于该视频帧的游戏地图检测区域的偏离距离未超过偏离阈值的情况下,截取该视频帧中的游戏地图校正区域的图像。
在一可选实施方式中,该装置还包括训练模块在将至少一个视频帧中输入至第二目标检测模型之前,设置为获取多个样本视频帧,样本视频帧与至少一个视频帧对应的游戏种类相同;将多个样本视频帧和多个样本视频帧上游戏地图的显示区域构成训练样本集,训练第二目标检测模型。
在一可选实施方式中,训练模块在将游戏地图区域图像输入至第一目标检测模型,得到游戏地图区域图像上游戏元素的显示区域之前,还设置为获取多个游戏地图样本图像,游戏地图样本图像与游戏地图区域图像对应的游戏种类相同;将多个游戏地图样本图像和多个游戏地图样本图像上游戏元素的显示区域构成训练样本集,训练第一目标检测模型。
在一可选实施方式中,第一目标检测模型包括特征图生成子模型、网格分割子模型和定位子模型。显示区域识别模块43是设置为:将游戏地图区域图像输入至特征图生成子模型,生成游戏地图区域图像的特征图;将特征图输入至网格分割子模型,将特征图分割为多个网格,网格的尺寸与游戏元素的最小尺寸之差在预设尺寸范围内;将多个网格输入至定位子模型,得到每个网格与多种游戏元素的特征的匹配度;采用非极大值抑制算法,确定匹配度最大的网格所对应的区域为所述游戏地图区域图像上对应种类的游戏元素的显示区域。
在一可选实施方式中,形成模块45是设置为:根据一个视频帧中游戏元素的显示区域和状态与描述信息的对应关系,得到一个视频帧展示的游戏场景的描述信息;或者,根据两个或两个以上视频帧中游戏元素的显示区域和状态,得到游戏元素的显示区域和状态的变化趋势;根据变化趋势与描述信息的对应关系,得到两个或两个以上视频帧展示的游戏场景的描述信息。
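By way of non-limiting illustration, the trend-based branch might be sketched as follows; the per-frame element positions, the three trend categories, and the rule table are all invented for the example and are not part of the original disclosure:

```python
def describe_scene(frames_elements, trend_rules):
    """Map the change trend of each element's display region across frames to
    scene description text via a lookup of trend rules.
    frames_elements: list (ordered by time) of dicts element -> (x, y) position.
    trend_rules: dict mapping (element, trend) -> description text."""
    descriptions = []
    first, last = frames_elements[0], frames_elements[-1]
    for element, (x0, _) in first.items():
        if element not in last:
            continue  # element disappeared; no trend to report in this sketch
        dx = last[element][0] - x0
        trend = "advancing" if dx > 0 else "retreating" if dx < 0 else "holding"
        if (element, trend) in trend_rules:
            descriptions.append(trend_rules[(element, trend)])
    return descriptions
```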
The game scene description apparatus provided in this embodiment of the present application can execute the game scene description method provided in any embodiment of the present application, and has the functional modules and effects corresponding to the executed method.
Embodiment 5
FIG. 5 is a structural diagram of an electronic device provided in Embodiment 5 of the present application. The electronic device may be a server, a streamer client, or a viewer client. As shown in FIG. 5, the electronic device includes a processor 50 and a memory 51. There may be one or more processors 50 in the electronic device; FIG. 5 takes one processor 50 as an example. The processor 50 and the memory 51 may be connected by a bus or in another manner; FIG. 5 takes a bus connection as an example.
As a computer-readable storage medium, the memory 51 may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the game scene description method in the embodiments of the present application (for example, the acquisition module 41, cropping module 42, display region recognition module 43, state recognition module 44, and formation module 45 of the game scene description apparatus). By running the software programs, instructions, and modules stored in the memory 51, the processor 50 executes the various functional applications and data processing of the electronic device, i.e., implements the game scene description method described above.
The memory 51 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system and the application program required by at least one function, and the data storage area may store data created according to the use of the terminal. In addition, the memory 51 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 51 may include memory located remotely from the processor 50, and such remote memory may be connected to the electronic device through a network. Examples of the network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
Embodiment 6
Embodiment 6 of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a computer processor, performs a game scene description method including: acquiring at least one video frame from a game live-streaming video stream; cropping the game map region image from the at least one video frame; inputting the game map region image into a first object detection model to obtain the display regions of game elements on the game map region image; inputting the image of each game element's display region into a classification model to obtain the state of the game element; and using the display regions and states of the game elements to form description information of the game scene presented by the at least one video frame.
Of course, in the computer-readable storage medium storing a computer program provided by the embodiments of the present application, the computer program is not limited to the above method operations and may also perform related operations in the game scene description method provided by any embodiment of the present application.
From the above description of the implementations, those skilled in the art can understand that the present application may be implemented by software together with general-purpose hardware, or of course by hardware alone. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk, or optical disc, and includes multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method of any embodiment of the present application.
It should be noted that, in the above embodiment of the game scene description apparatus, the included units and modules are divided only according to functional logic, but the division is not limited thereto as long as the corresponding functions can be implemented. In addition, the names of the functional units are only for ease of differentiation and are not intended to limit the protection scope of the present application.
Claims (10)
- A game scene description method, comprising: acquiring at least one video frame from a game live-streaming video stream; cropping a game map region image from the at least one video frame; inputting the game map region image into a first object detection model to obtain display regions of game elements on the game map region image; inputting an image of each game element's display region into a classification model to obtain a state of the game element; and using the display regions and states of the game elements to form description information of a game scene presented by the at least one video frame.
- The method of claim 1, wherein cropping the game map region image from the at least one video frame comprises: inputting the at least one video frame into a second object detection model to obtain a game map detection region in the at least one video frame; correcting the game map detection region by feature matching between route features in the game map detection region and prior features to obtain a game map corrected region; and in a case where a deviation distance of a video frame's game map corrected region from the video frame's game map detection region exceeds a deviation threshold, cropping an image of the game map detection region from the video frame.
- The method of claim 2, further comprising: in a case where the deviation distance of a video frame's game map corrected region from the video frame's game map detection region does not exceed the deviation threshold, cropping an image of the game map corrected region from the video frame.
- The method of claim 2 or 3, before inputting the at least one video frame into the second object detection model, further comprising: acquiring multiple sample video frames, the sample video frames belonging to the same game as the at least one video frame; and forming a training sample set from the multiple sample video frames and display regions of the game map on the multiple sample video frames to train the second object detection model.
- The method of any one of claims 1-4, before inputting the game map region image into the first object detection model to obtain the display regions of game elements on the game map region image, further comprising: acquiring multiple game map sample images, the game map sample images belonging to the same game as the game map region image; and forming a training sample set from the multiple game map sample images and display regions of game elements on the multiple game map sample images to train the first object detection model.
- The method of any one of claims 1-5, wherein the first object detection model comprises a feature map generation submodel, a grid segmentation submodel, and a localization submodel; and inputting the game map region image into the first object detection model to obtain the display regions of game elements on the game map region image comprises: inputting the game map region image into the feature map generation submodel to generate a feature map of the game map region image; inputting the feature map into the grid segmentation submodel to divide the feature map into multiple grid cells, wherein a difference between a cell size and a smallest game element size falls within a preset size range; inputting the grid cells into the localization submodel to obtain, for each cell, matching degrees against features of multiple kinds of game elements; and using a non-maximum suppression algorithm to determine a region corresponding to the cell with the highest matching degree as the display region of the game element of the corresponding kind on the game map region image.
- The method of any one of claims 1-6, wherein using the display regions and states of the game elements to form the description information of the game scene presented by the at least one video frame comprises: obtaining the description information of the game scene presented by one video frame according to a correspondence between the display regions and states of the game elements in the one video frame and description information; or wherein using the display regions and states of the game elements to form the description information of the game scene presented by the at least one video frame comprises: obtaining a change trend of the display regions and a change trend of the states of the game elements from multiple video frames, and obtaining the description information of the game scene presented by the multiple video frames according to a correspondence between the change trends and the description information.
- A game scene description apparatus, comprising: an acquisition module configured to acquire at least one video frame from a game live-streaming video stream; a cropping module configured to crop a game map region image from the at least one video frame; a display region recognition module configured to input the game map region image into a first object detection model to obtain display regions of game elements on the game map region image; a state recognition module configured to input an image of each game element's display region into a classification model to obtain a state of the game element; and a formation module configured to use the display regions and states of the game elements to form description information of a game scene presented by the at least one video frame.
- An electronic device, comprising: at least one processor; and a memory configured to store at least one program, wherein when the at least one program is executed by the at least one processor, the at least one processor implements the game scene description method of any one of claims 1-7.
- A computer-readable storage medium storing a computer program that, when executed by a processor, implements the game scene description method of any one of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG11202010692RA SG11202010692RA (en) | 2018-05-25 | 2019-05-24 | Game scene description method and apparatus, device, and storage medium |
US16/977,831 US20210023449A1 (en) | 2018-05-25 | 2019-05-24 | Game scene description method and apparatus, device, and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810517799.X | 2018-05-25 | ||
CN201810517799.XA CN108769821B (zh) | 2018-05-25 | 2018-05-25 | Game scene description method and apparatus, device, and storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019223782A1 (zh) | 2019-11-28 |
Family
ID=64006021
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/088348 WO2019223782A1 (zh) | Game scene description method and apparatus, device, and storage medium | 2018-05-25 | 2019-05-24 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20210023449A1 (zh) |
CN (1) | CN108769821B (zh) |
SG (1) | SG11202010692RA (zh) |
WO (1) | WO2019223782A1 (zh) |
Families Citing this family (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108769821B (zh) * | 2018-05-25 | 2019-03-29 | 广州虎牙信息科技有限公司 | Game scene description method and apparatus, device, and storage medium |
CN109582463B (zh) * | 2018-11-30 | 2021-04-06 | Oppo广东移动通信有限公司 | Resource configuration method and apparatus, terminal, and storage medium |
CN109819271A (zh) * | 2019-02-14 | 2019-05-28 | 网易(杭州)网络有限公司 | Method and apparatus for presenting a game live-streaming room, storage medium, and electronic device |
CN110135476A (zh) * | 2019-04-28 | 2019-08-16 | 深圳市中电数通智慧安全科技股份有限公司 | Detection method, apparatus, device, and system for personal safety equipment |
CN110177295B (zh) | 2019-06-06 | 2021-06-22 | 北京字节跳动网络技术有限公司 | Method, apparatus, and electronic device for handling out-of-bounds subtitles |
CN110227264B (zh) * | 2019-06-06 | 2023-07-11 | 腾讯科技(成都)有限公司 | Virtual object control method and apparatus, readable storage medium, and computer device |
CN110152301B (zh) * | 2019-06-18 | 2022-12-16 | 金陵科技学院 | Method for acquiring e-sports game data |
CN110276348B (zh) * | 2019-06-20 | 2022-11-25 | 腾讯科技(深圳)有限公司 | Image localization method and apparatus, server, and storage medium |
CN110532893A (zh) * | 2019-08-05 | 2019-12-03 | 西安电子科技大学 | Icon detection method for e-sports minimap images |
CN110569391B (zh) * | 2019-09-11 | 2021-10-15 | 腾讯科技(深圳)有限公司 | Broadcast event recognition method, electronic device, and computer-readable storage medium |
CN112492346A (zh) * | 2019-09-12 | 2021-03-12 | 上海哔哩哔哩科技有限公司 | Method for determining highlight moments in game video and method for playing game video |
US11154773B2 (en) * | 2019-10-31 | 2021-10-26 | Nvidia Corporation | Game event recognition |
CN110909630B (zh) * | 2019-11-06 | 2023-04-18 | 腾讯科技(深圳)有限公司 | Abnormal game video detection method and apparatus |
CN110865753B (zh) * | 2019-11-07 | 2021-01-22 | 支付宝(杭州)信息技术有限公司 | Application message notification method and apparatus |
CN111097168B (zh) * | 2019-12-24 | 2024-02-27 | 网易(杭州)网络有限公司 | Display control method and apparatus in game live streaming, storage medium, and electronic device |
CN111097169B (zh) * | 2019-12-25 | 2023-08-29 | 上海米哈游天命科技有限公司 | Game image processing method, apparatus, device, and storage medium |
CN111672109B (zh) * | 2020-06-10 | 2021-12-03 | 腾讯科技(深圳)有限公司 | Game map generation method, game testing method, and related apparatus |
CN112396697B (zh) * | 2020-11-20 | 2022-12-06 | 上海莉莉丝网络科技有限公司 | Method and system for generating regions within a game map, and computer-readable storage medium |
CN112560728B (zh) * | 2020-12-22 | 2023-07-11 | 上海幻电信息科技有限公司 | Target object recognition method and apparatus |
AU2021204578A1 (en) * | 2021-06-14 | 2023-01-05 | Sensetime International Pte. Ltd. | Methods, apparatuses, devices and storage media for controlling game states |
KR20220169466A (ko) * | 2021-06-18 | 2022-12-27 | Sensetime International Pte. Ltd. | Methods and apparatuses for controlling game states |
CN113728326A (zh) * | 2021-06-24 | 2021-11-30 | 商汤国际私人有限公司 | Game monitoring |
CN115623227B (zh) * | 2021-07-12 | 2024-08-20 | 北京字节跳动网络技术有限公司 | Method, apparatus, device, and computer-readable storage medium for photographing live video |
CN114708363A (zh) * | 2022-04-06 | 2022-07-05 | 广州虎牙科技有限公司 | Game live-streaming cover generation method and server |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106390459A (zh) * | 2016-09-19 | 2017-02-15 | 腾讯科技(深圳)有限公司 | Game data acquisition method and apparatus |
CN111405299B (zh) * | 2016-12-19 | 2022-03-01 | 广州虎牙信息科技有限公司 | Video-stream-based live-streaming interaction method and corresponding apparatus |
AU2018215460A1 (en) * | 2017-02-03 | 2019-09-19 | Taunt Inc. | System and method for synchronizing and predicting game data from game video and audio data |
CN107040795A (zh) * | 2017-04-27 | 2017-08-11 | 北京奇虎科技有限公司 | Live video monitoring method and apparatus |
US10719712B2 (en) * | 2018-02-26 | 2020-07-21 | Canon Kabushiki Kaisha | Classify actions in video segments using play state information |
US10449461B1 (en) * | 2018-05-07 | 2019-10-22 | Microsoft Technology Licensing, Llc | Contextual in-game element recognition, annotation and interaction based on remote user input |
US11148062B2 (en) * | 2018-05-18 | 2021-10-19 | Sony Interactive Entertainment LLC | Scene tagging |
Application events:
- 2018-05-25: CN application CN201810517799.XA filed, granted as CN108769821B (status: active)
- 2019-05-24: PCT application PCT/CN2019/088348 filed, published as WO2019223782A1 (status: application filing)
- 2019-05-24: US application US16/977,831 filed, published as US20210023449A1 (status: abandoned)
- 2019-05-24: SG application SG11202010692RA filed (status: unknown)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170228600A1 (en) * | 2014-11-14 | 2017-08-10 | Clipmine, Inc. | Analysis of video game videos for information extraction, content labeling, smart video editing/creation and highlights generation |
EP3050605A1 (en) * | 2015-02-02 | 2016-08-03 | GameFly Israel Ltd. | A method for event detection in real-time graphic applications |
CN107197370A (zh) * | 2017-06-22 | 2017-09-22 | 北京密境和风科技有限公司 | Scene detection method and apparatus for live video |
CN107569848A (zh) * | 2017-08-30 | 2018-01-12 | 武汉斗鱼网络科技有限公司 | Game classification method, apparatus, and electronic device |
CN107998655A (zh) * | 2017-11-09 | 2018-05-08 | 腾讯科技(成都)有限公司 | Data display method, apparatus, storage medium, and electronic apparatus |
CN108769821A (zh) * | 2018-05-25 | 2018-11-06 | 广州虎牙信息科技有限公司 | Game scene description method and apparatus, device, and storage medium |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111191542A (zh) * | 2019-12-20 | 2020-05-22 | 腾讯科技(深圳)有限公司 | Abnormal action recognition method and apparatus in virtual scenes, medium, and electronic device |
CN111191542B (zh) * | 2019-12-20 | 2023-05-02 | 腾讯科技(深圳)有限公司 | Abnormal action recognition method and apparatus in virtual scenes, medium, and electronic device |
CN112704874A (zh) * | 2020-12-21 | 2021-04-27 | 北京信息科技大学 | Method and apparatus for automatically generating Gothic scenes in 3D games |
CN112704874B (zh) * | 2020-12-21 | 2023-09-22 | 北京信息科技大学 | Method and apparatus for automatically generating Gothic scenes in 3D games |
CN113423000A (zh) * | 2021-06-11 | 2021-09-21 | 完美世界征奇(上海)多媒体科技有限公司 | Video generation method and apparatus, storage medium, and electronic apparatus |
CN113423000B (zh) * | 2021-06-11 | 2024-01-09 | 完美世界征奇(上海)多媒体科技有限公司 | Video generation method and apparatus, storage medium, and electronic apparatus |
Also Published As
Publication number | Publication date |
---|---|
SG11202010692RA (en) | 2020-11-27 |
CN108769821B (zh) | 2019-03-29 |
CN108769821A (zh) | 2018-11-06 |
US20210023449A1 (en) | 2021-01-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19807253; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 19807253; Country of ref document: EP; Kind code of ref document: A1 |