CN110652726A - Game auxiliary system based on image recognition and audio recognition - Google Patents

Game auxiliary system based on image recognition and audio recognition

Info

Publication number
CN110652726A
Authority
CN
China
Prior art keywords
game
image
module
recognition
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910926107.1A
Other languages
Chinese (zh)
Other versions
CN110652726B (en)
Inventor
范科
邬鑫宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HANGZHOU ICAFE TECHNOLOGY Co Ltd
Original Assignee
HANGZHOU ICAFE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HANGZHOU ICAFE TECHNOLOGY Co Ltd
Priority to CN201910926107.1A
Publication of CN110652726A
Application granted
Publication of CN110652726B
Legal status: Active
Anticipated expiration

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50 Controlling the output signals based on the game progress
    • A63F13/53 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game
    • A63F13/537 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen
    • A63F13/5375 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen for graphically or textually suggesting an action, e.g. by displaying an arrow indicating a turn in a driving game
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50 Controlling the output signals based on the game progress
    • A63F13/53 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game
    • A63F13/533 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game for prompting the player, e.g. by displaying a game menu
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50 Controlling the output signals based on the game progress
    • A63F13/53 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game
    • A63F13/537 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen
    • A63F13/5378 Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen for displaying an additional top view, e.g. radar screens or maps
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/131 Protocols for games, networked simulations or virtual reality
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/30 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device
    • A63F2300/303 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device for displaying additional data, e.g. simulating a Head Up Display
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/30 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device
    • A63F2300/303 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device for displaying additional data, e.g. simulating a Head Up Display
    • A63F2300/305 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device for displaying additional data, e.g. simulating a Head Up Display for providing a graphical or textual hint to the player
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/30 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device
    • A63F2300/303 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device for displaying additional data, e.g. simulating a Head Up Display
    • A63F2300/307 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device for displaying additional data, e.g. simulating a Head Up Display for displaying an additional window with a view from the top of the game field, e.g. radar screen
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/30 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device
    • A63F2300/308 Details of the user interface

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Optics & Photonics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a game auxiliary system based on image recognition and audio recognition. The system comprises a game image acquisition preprocessing module, a game sound acquisition preprocessing module, a digital image recognition module, a digital audio recognition module, and a recognition result prompting module. The five modules are divided according to the logical function of their processing tasks and together form a single application program, so data and signal transmission among the modules is completed through in-process communication, which is highly efficient. The system can accurately identify distant, concealed enemies and vehicles as well as the direction of gunshots, and displays prompt icons in a conspicuous color on the game screen in real time. In addition, by recognizing the electronic compass and the minimap, it can give the player travel instructions for entering the safe area. These auxiliary prompts can prolong the survival time of ordinary novice players and improve the game experience.

Description

Game auxiliary system based on image recognition and audio recognition
Technical Field
The invention belongs to the technical field of electronic game assistance, and particularly relates to a game assistance system based on image recognition and audio recognition.
Background
Electronic games, also called video games, are all interactive games that run on electronic device platforms. Mature electronic games appeared at the end of the 20th century; they changed the way people play games and the very definition of the word "game", and they are a cultural activity born of the development of science and technology. The term can also refer to electronic game software. Electronic games are in fact also an art form: they integrate fine art, music, film, AI, computer technology, and more, have strong cultural carrying capacity and appeal, and have a relatively low barrier to entry. Even people who find it hard to appreciate paintings, music, or books, or who lack imagination, can usually find ample fun in games. Games and films achieve similar effects and are both highly approachable, but games offer a stronger sense of immersion. The birth of electronic games has enriched human life, promoted the progress of society, enriched people's mental and material worlds, and made life happier.
However, in a number of very popular games, such as "strictly ceremonial", users who only want to relax occasionally are often eliminated early because they cannot find enemies quickly enough. This prevents a large number of ordinary novice players, and players with weaker visual or auditory discrimination, from enjoying the game and finding it satisfying. Game aids can help these disadvantaged players and let them experience the enjoyment of different types of games.
In recent years, thanks to the rapid improvement of computer hardware performance, machine learning algorithms represented by deep learning have achieved great success in fields such as machine vision and speech recognition, and recognition accuracy has improved greatly, so artificial intelligence has once again attracted wide attention from academia and industry. A series of developments such as AlphaGo, video recognition, fingerprint unlocking, picture recognition, and speech-to-text conversion has let people experience artificial intelligence first-hand and has changed the way they work and think. Over a hundred years ago, electricity transformed industries such as manufacturing, transportation, and agriculture; today, artificial intelligence, like electricity, is set to transform traditional industries.
Face recognition and picture recognition are two popular applications of artificial intelligence in vision and imaging. Object recognition is another common application of image classification. Take a simple mobile phone recognition model as an example: a model is first defined for the computer, and a large number of pictures of mobile phones are then prepared to train it; afterwards, when a picture is input, the computer can recognize whether or not it shows a mobile phone. Under normal conditions the model recognizes pictures accurately, but when the input pictures contain occlusion, varying shapes or angles, or difficult lighting, the previously built model may fail, which is a hard problem for computer vision in practice. The essence of machine learning is to find a function, and this function plays different roles in different fields: in speech recognition it maps a segment of speech to text; in image recognition it maps an image to a class.
The development and maturing of machine learning provides a good basis for recognizing game images and sounds and makes the corresponding auxiliary tool software feasible.
Disclosure of Invention
In view of the above, the invention provides a game auxiliary system based on image recognition and audio recognition, namely auxiliary tool software that uses machine learning to recognize images and sounds in first-person exploration shooting games. It can accurately identify distant, concealed enemies and vehicles as well as the direction of gunshots, and displays prompt icons in a conspicuous color on the game screen in real time. By recognizing the electronic compass and the minimap, it can also give the player travel instructions for entering the safe area. These auxiliary prompts can prolong the survival time of ordinary novice players and improve the game experience.
A game auxiliary system based on image recognition and audio recognition comprises a game image acquisition preprocessing module, a game sound acquisition preprocessing module, a digital image recognition module, a digital audio recognition module, and a recognition result prompting module, wherein:
the game image acquisition preprocessing module captures and preprocesses game video images in real time;
the game sound acquisition preprocessing module captures and preprocesses the game's multi-channel sound (such as stereo or 5.1 channels) in real time;
the digital image recognition module recognizes the electronic compass angle in the game video image, the player coordinates in the minimap, and the positions of characters and vehicles;
the digital audio recognition module uses the spectral characteristics of the target signal supplied by the audio offline analysis system to analyze and compare the audio data of each channel in the frequency domain and determine which channels contain a gunshot;
the recognition result prompting module highlights the electronic compass angle, the player coordinates in the minimap, the positions of characters and vehicles, and the position where a gunshot occurred on the game screen, and issues a voice alert, so as to draw the player's attention in time.
To avoid affecting the fluency of the game, the auxiliary tool's computation must minimize its use of the graphics card's GPU resources. Preferably, the game image acquisition preprocessing module converts the continuous game video into discrete digital images. For content displayed at fixed positions (such as the electronic compass and the minimap), it directly crops the image blocks of the corresponding areas and passes them to the digital image recognition module for recognition. For objects that do not appear at fixed positions (such as characters and vehicles), it crops the captured image with a sliding window, taking only the upper part of the image, scales the cropped image proportionally, and records the order of the frames so that the digital image recognition module can retrieve and analyze them. Cropping in this way covers as much of the area where players may appear as possible with as little resource consumption as possible.
Further, the game sound acquisition preprocessing module converts the continuous sound of each channel into digital audio packets of fixed duration, marks the channel to which each packet belongs, and then passes the packets to the digital audio recognition module for further processing.
Further, the digital image recognition module recognizes the electronic compass angle using two convolutional neural networks, assisted by some image preprocessing. The player coordinates in the minimap are recognized directly with image processing methods. Characters and vehicles are recognized by a model obtained through offline machine learning training. The model performs both target detection and target recognition: for a game video image supplied by the game image acquisition preprocessing module, it first detects the positions of objects of interest (such as characters and vehicles) and then classifies each object to determine which type of target it is.
Further, the model is a fully convolutional neural network based on the deep learning vision algorithm YOLOv2 (You Only Look Once, version 2). The network has an upsampling layer and a pyramid structure. The upsampling layer improves recognition accuracy and recall for small targets. The pyramid structure feeds shallow-layer information into the deep layers, combining shallow spatial information with deep semantic information, and works with the upsampling layer to further improve recognition accuracy.
Further, the digital image recognition module uses correlations between consecutive frames of the image sequence (such as the movement speed of characters and the driving speed of vehicles) to assist detection and recognition in a single image, further improving recognition accuracy. Meanwhile, because the game image acquisition preprocessing module crops and scales the input image, the digital image recognition module must convert coordinate systems after recognizing a target: it calculates the target's position in the original game image, sends that position to the recognition result prompting module for further processing, and feeds it back to the game image acquisition preprocessing module to help optimize performance and efficiency.
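The coordinate conversion described above can be illustrated with a short sketch. It assumes the crop is an axis-aligned rectangle whose top-left corner in the original frame is known and that the scaling is uniform; the function name `to_frame_coords` and the `(x, y, w, h)` box layout are hypothetical and are not taken from the patent.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height); hypothetical layout


def to_frame_coords(box: Box, crop_origin: Tuple[float, float], scale: float) -> Box:
    """Map a box detected in the cropped, scaled image back to the original frame.

    crop_origin: top-left corner of the crop inside the original frame.
    scale: factor applied to the crop before recognition (e.g. 0.5 for half size).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    x, y, w, h = box
    ox, oy = crop_origin
    # Undo the scaling first, then shift by the crop offset.
    return (ox + x / scale, oy + y / scale, w / scale, h / scale)


if __name__ == "__main__":
    # A box found at (100, 50) in a half-size crop taken at (300, 200).
    print(to_frame_coords((100, 50, 20, 40), (300, 200), 0.5))
```

The same converted position could be sent both to the prompting step and back to the cropping step, as the paragraph above describes.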
Further, when a gunshot is detected in several channels simultaneously, the digital audio recognition module jointly analyzes the volume amplitude of each channel in the time domain, treats the time-domain amplitude of each channel in which the gunshot was detected as a separate component of the gunshot, calculates the source direction of the gunshot in three-dimensional space using the properties of vectors, and then passes the direction to the recognition result prompting module for further processing.
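One way to picture the vector composition described above is the following sketch. It is a simplification under stated assumptions: it works only in the horizontal plane rather than in full three-dimensional space, and the speaker azimuths in `SPEAKER_AZIMUTH_DEG` are an assumed conventional 5.1 layout, not values given in the patent. The function name `gunshot_azimuth` is hypothetical.

```python
import math
from typing import Dict, Optional

# Assumed speaker directions in degrees (0 = straight ahead, positive = right).
# The LFE channel carries no directional information and is left out.
SPEAKER_AZIMUTH_DEG = {"FL": -30.0, "FR": 30.0, "C": 0.0, "SL": -110.0, "SR": 110.0}


def gunshot_azimuth(amplitudes: Dict[str, float]) -> Optional[float]:
    """Estimate the gunshot direction as the amplitude-weighted sum of speaker vectors.

    amplitudes: time-domain amplitude per channel in which a gunshot was detected.
    Returns the azimuth in degrees, or None if the components cancel out.
    """
    right = forward = 0.0
    for channel, amplitude in amplitudes.items():
        angle = math.radians(SPEAKER_AZIMUTH_DEG[channel])
        right += amplitude * math.sin(angle)
        forward += amplitude * math.cos(angle)
    if right == 0.0 and forward == 0.0:
        return None
    return math.degrees(math.atan2(right, forward))


if __name__ == "__main__":
    # Louder on the front-right channel than on the surround-right channel.
    print(gunshot_azimuth({"FR": 0.9, "SR": 0.3}))
```

Equal amplitudes on the two front channels cancel sideways and point straight ahead, while a single loud channel points at that speaker.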
Further, the AI algorithm model that the digital image recognition module uses to detect and recognize targets is obtained from an image offline training system. That system is based on deep learning and is trained on a large number of samples. First, images are collected from the game and strict labeling rules are formulated. During training, the backpropagation algorithm optimizes the model parameters according to the learned data distribution, so that they approach a global or local optimum under that distribution. For target detection, anchors of given sizes and aspect ratios are specified in advance; exploiting the one-to-one correspondence between outputs of the fully convolutional structure and image regions, the whole image is traversed and the positions of the labeled targets are recorded. Then, with intersection over union (IoU) as the metric, the labeled boxes are clustered with the k-means++ algorithm to obtain the most representative boxes, which are used as anchors. Finally, the AI algorithm model is trained iteratively on a GPU until it converges.
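The anchor clustering step can be sketched as follows. This is an illustrative sketch only: it assumes each labeled box is reduced to a `(width, height)` pair, uses `1 - IoU` as the distance, seeds the centers in the k-means++ manner, and updates each center as the mean of its cluster. The names `iou_wh` and `kmeanspp_anchors`, the iteration limit, and the random seed are hypothetical choices.

```python
import random
from typing import List, Tuple

WH = Tuple[float, float]  # (width, height) of a labeled box


def iou_wh(a: WH, b: WH) -> float:
    """IoU of two boxes aligned at a common top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_anchors(boxes: List[WH], k: int, iters: int = 50, seed: int = 0) -> List[WH]:
    """Cluster box sizes with distance 1 - IoU and k-means++ style seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        # Squared distance of every box to its nearest existing center.
        d2 = [min(1.0 - iou_wh(b, c) for c in centers) ** 2 for b in boxes]
        centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    for _ in range(iters):
        groups: List[List[WH]] = [[] for _ in centers]
        for b in boxes:
            best = max(range(len(centers)), key=lambda i: iou_wh(b, centers[i]))
            groups[best].append(b)
        # An empty cluster keeps its previous center.
        updated = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if updated == centers:
            break
        centers = updated
    return sorted(centers, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    labeled = [(10, 12), (12, 10), (11, 11), (100, 200), (110, 190), (105, 195)]
    print(kmeanspp_anchors(labeled, k=2))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is the usual reason for this choice in YOLO-style anchor selection.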
Further, the audio offline analysis system is based on frequency-domain analysis. It performs spectral analysis on several kinds of gunshot audio samples collected in advance to obtain the frequency-domain characteristics of the target samples, and provides these characteristics to the digital audio recognition module for spectral comparison. The digital audio recognition module converts audio signals from the time domain to the frequency domain using the fast Fourier transform. That is, the gunshot signals collected in advance are Fourier transformed and their frequency-domain characteristics are observed; if the various kinds of gunshots have relatively large amplitudes in certain fixed frequency bands, the digital audio recognition module takes those bands as reference intervals and checks whether the amplitude of the incoming audio in those bands exceeds a preset threshold, and if so, the channel is judged to contain a gunshot.
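The band comparison described above might look roughly like the sketch below. It is a minimal illustration, assuming a power-of-two packet length, a spectrum scaled by 1/n, and a rule that every reference band must exceed the threshold; the reference bands and threshold in the example are made-up values, and the names `fft`, `band_amplitude`, and `contains_gunshot` are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def fft(x: Sequence[float]) -> List[complex]:
    """Recursive radix-2 fast Fourier transform; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_amplitude(samples: Sequence[float], sample_rate: int,
                   lo_hz: float, hi_hz: float) -> float:
    """Largest spectral magnitude (scaled by 1/n) between lo_hz and hi_hz."""
    n = len(samples)
    spectrum = fft(samples)
    lo_bin = math.ceil(lo_hz * n / sample_rate)
    hi_bin = min(math.floor(hi_hz * n / sample_rate), n // 2)
    return max((abs(spectrum[k]) / n for k in range(lo_bin, hi_bin + 1)), default=0.0)


def contains_gunshot(samples: Sequence[float], sample_rate: int,
                     bands: Sequence[Tuple[float, float]], threshold: float) -> bool:
    """True if every reference band exceeds the threshold (assumed decision rule)."""
    return bool(bands) and all(
        band_amplitude(samples, sample_rate, lo, hi) > threshold for lo, hi in bands
    )


if __name__ == "__main__":
    rate, n = 8192, 1024
    tone = [math.sin(2 * math.pi * 800 * t / rate) for t in range(n)]
    print(contains_gunshot(tone, rate, [(760, 840)], threshold=0.2))
```

In practice the reference bands and the threshold would come from the offline analysis of recorded gunshot samples rather than being hard-coded.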
Based on the above technical scheme, the invention has the following beneficial technical effects:
1. Using image recognition, the invention identifies inconspicuous objects such as characters and vehicles in the game, helping players find enemies quickly and respond in time.
2. Using image recognition, the invention identifies the electronic compass angle and the player's position in the minimap, helping the player travel efficiently to the safe area.
3. Using sound recognition, the invention identifies the direction of gunshots in the game's multi-channel audio, helping the player quickly determine the enemy's position and respond in time.
4. The invention provides its game assistance through image and sound recognition, without injecting programs or code into the game, and therefore without harming game stability or tampering with game data.
5. Because the invention does not depend on code injection, it avoids being blocked by antivirus software and the like.
Drawings
FIG. 1 is a schematic diagram of the game assistance system of the present invention.
FIG. 2 is a schematic diagram of an image cropping scheme in the game assistance system of the present invention.
FIG. 3 is a schematic diagram of the workflow of the game assistance system of the present invention.
Detailed Description
In order to describe the present invention more specifically, the technical solution is described in detail below with reference to the accompanying drawings and specific embodiments.
The game auxiliary tool of the invention is auxiliary tool software that uses machine learning to recognize images and sounds in first-person exploration shooting games. It is standalone software for the Windows platform and must be installed on the same machine as the game client. As shown in FIG. 1, it comprises a game image acquisition preprocessing module, a game sound acquisition preprocessing module, a digital image recognition module, a digital audio recognition module, and a recognition result prompting module. The five modules are divided according to the logical function of their processing tasks and together form a single application program, so data and signal transmission among them is completed through in-process communication, which is highly efficient. Their workflow is shown in FIG. 3.
In addition, during development and debugging, the auxiliary tool software needs an image offline training system to supply the algorithm model for recognizing target objects in images, and a sound offline analysis system to supply the target sound spectral characteristics used for sound recognition and comparison. Once development is complete, the image recognition algorithm model and the sound spectral characteristics are integrated into the digital image recognition module and the digital audio recognition module, respectively. The image offline training system and the sound offline analysis system are therefore two separate software programs, both running on Linux, and need not be distributed to game players with the auxiliary tool software. The released auxiliary tool software must be installed on the player's computer and started before the game is launched. Because mainstream popular games all run on Windows, the auxiliary tool software is also developed for the Windows platform and runs only on Windows 7 and later versions of Windows.
The game image acquisition preprocessing module captures the game screen image in real time through the Windows DirectX system API, in RGB24 format. Because current mainstream game resolutions are large, recognizing the whole image would consume considerable computing resources and time. Therefore, to make later target recognition more efficient, reduce CPU and GPU usage, and keep the game running smoothly, the captured game image is cropped and only specific image areas are recognized. The positions of the electronic compass and the minimap are relatively fixed, so the image blocks of the areas where they are located are cropped directly and used as sub-images to be recognized. For objects such as characters and vehicles, whose display size and position are not fixed, the sub-image to be recognized is cropped on the following basis. The top of the screen shows distant areas, where objects are so small that they appear only as dots; neither the human eye nor an algorithm can tell what they are, and even an enemy at that distance poses no threat to the current player. The bottom of the screen shows the area immediately around the current player, where an enemy could not go unseen, so image recognition is unnecessary there. The resulting crop areas and positions are shown in FIG. 2. Assuming frame A and frame B are two consecutive game screenshots, the areas marked in frame A and frame B are the effective areas actually cropped; the effective-area images are scaled down proportionally by 40% to 50% and then passed to the digital image recognition module for target detection and recognition. While no target is detected in the image, the effective area alternates between the approximate positions shown in frame A and frame B of FIG. 2. When a detected target position is fed back from the digital image recognition module, the selection rule changes: the image is cropped with the target object at the center, using the size of the previous effective area. The core logic is that the cropped effective area tracks the movement of the target object over time.
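The effective-area selection rule described above can be sketched as follows. This is an assumption-laden illustration: the actual positions shown in FIG. 2 are not available here, so the idle crop origins are passed in as a hypothetical parameter `idle_origins`, and the function name `select_crop` and the clamping to the frame border are illustrative choices.

```python
from typing import Optional, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def select_crop(frame_size: Tuple[int, int],
                crop_size: Tuple[int, int],
                frame_index: int,
                idle_origins: Sequence[Tuple[int, int]],
                target_center: Optional[Tuple[int, int]] = None) -> Rect:
    """Choose the effective area for one frame.

    With no target, alternate between the idle origins (standing in for the
    frame A / frame B positions). With a target, center the crop on it.
    """
    frame_w, frame_h = frame_size
    crop_w, crop_h = crop_size
    if target_center is None:
        x, y = idle_origins[frame_index % len(idle_origins)]
    else:
        x = target_center[0] - crop_w // 2
        y = target_center[1] - crop_h // 2
    # Keep the crop fully inside the frame.
    x = max(0, min(x, frame_w - crop_w))
    y = max(0, min(y, frame_h - crop_h))
    return (x, y, crop_w, crop_h)


if __name__ == "__main__":
    origins = [(0, 100), (960, 100)]  # hypothetical frame A / frame B positions
    print(select_crop((1920, 1080), (960, 540), 0, origins))
    print(select_crop((1920, 1080), (960, 540), 1, origins))
    print(select_crop((1920, 1080), (960, 540), 2, origins, target_center=(1000, 400)))
```

Keeping the crop size constant while only moving its origin means the recognition step always receives images of the same dimensions.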
The game sound acquisition preprocessing module captures the PCM data of the game's multiple sound channels in real time through the Windows DirectX system API, splits the continuous PCM data of each channel into segments of 0.5 seconds, and passes the segmented PCM packets to the digital audio recognition module one by one in chronological order for gunshot detection.
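The segmentation step can be illustrated with the sketch below. It assumes the captured PCM data arrives as one interleaved sample sequence (channel 0, channel 1, ..., repeating), which is a common layout but is not stated in the patent; the `AudioPacket` type and the function `split_channels` are hypothetical names. A trailing partial segment is simply dropped here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AudioPacket:
    channel: int        # index of the sound channel the packet belongs to
    index: int          # position of the packet in time within its channel
    samples: List[int]  # PCM samples of one fixed-length segment


def split_channels(interleaved: Sequence[int], num_channels: int,
                   sample_rate: int, packet_seconds: float = 0.5) -> List[AudioPacket]:
    """Split interleaved PCM into per-channel packets of fixed duration."""
    packet_len = int(sample_rate * packet_seconds)
    if packet_len <= 0:
        raise ValueError("packet length must be at least one sample")
    packets: List[AudioPacket] = []
    for channel in range(num_channels):
        channel_samples = list(interleaved[channel::num_channels])
        for i in range(len(channel_samples) // packet_len):
            segment = channel_samples[i * packet_len:(i + 1) * packet_len]
            packets.append(AudioPacket(channel, i, segment))
    # Deliver packets in chronological order, channel by channel within a slot.
    packets.sort(key=lambda p: (p.index, p.channel))
    return packets


if __name__ == "__main__":
    # One second of toy stereo audio at 4 samples per second.
    stereo = [0, 10, 1, 11, 2, 12, 3, 13]
    for packet in split_channels(stereo, num_channels=2, sample_rate=4):
        print(packet)
```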
The digital image recognition module consists of three logical sub-modules, which recognize the electronic compass angle, the player's coordinates on the minimap, and the characters and vehicles in the game, respectively. Recognizing the compass angle means recognizing digits. Because single-digit recognition is more accurate than sequence recognition and needs less computing power, single-digit recognition is used. Before recognition, the digits must be cropped out of the compass. The in-game compass is graduated in steps of 5 degrees, so a reading may have one, two or three digits, giving three cases. For a reading with a known number of digits the crop positions can be set in advance, so the number of digits has to be determined before cropping; a separate convolutional neural network is used for this. Compass recognition therefore takes three steps: 1. determine the number of digits; 2. crop out the individual digits according to that number; 3. recognize each digit. Recognizing the player's coordinates on the minimap means recognizing the marked lines on the minimap: first-level marked lines divide the whole map into 8 × 8 cells, and second-level marked lines divide each cell into 10 × 10 sub-cells. Since the player is always at the center of the minimap, the player's position can be obtained by recognizing only the marked lines closest to the player.
First, the image is regularized separately in the horizontal and vertical directions and then binarized; any continuous run longer than 2/3 of the minimap width is identified as a marked line. Once a marked line is found, its label is cropped according to the line's position. A label always has two characters, a letter followed by a digit; the two characters are cropped separately and recognized with a convolutional neural network. Minimap coordinate recognition therefore also takes three steps: 1. locate the marked lines; 2. crop the marked-line labels; 3. recognize the individual letters and digits. The exploration shooting game 'dead reckoning', popular worldwide over the past two years, adds a survival rule: the game scene contains a safe zone that shrinks gradually as the game progresses, the player must stay inside it to survive, and the game marks the route to the safe zone with a dotted line on the minimap. After the auxiliary tool recognizes this safe-zone indicator line on the minimap and combines it with the player's current compass angle, it can issue movement prompts through the recognition result prompting module, such as go straight, turn left or turn right, which helps players with a weak sense of direction or little knowledge of the map.
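The marked-line detection rule (a continuous run longer than 2/3 of the minimap side) can be sketched on an already binarized image. This is a simplified illustration: the regularization and binarization steps are assumed to have been done, the image is a plain list of 0/1 rows, and the helper names are hypothetical.

```python
from typing import Iterable, List, Optional, Sequence, Tuple


def longest_run(values: Iterable[int]) -> int:
    """Length of the longest run of consecutive non-zero values."""
    best = current = 0
    for value in values:
        current = current + 1 if value else 0
        best = max(best, current)
    return best


def find_marked_lines(binary: Sequence[Sequence[int]]) -> Tuple[List[int], List[int]]:
    """Return (row indices, column indices) that qualify as marked lines."""
    height, width = len(binary), len(binary[0])
    # Integer comparison of run > 2/3 * side avoids floating-point rounding.
    rows = [y for y, row in enumerate(binary)
            if longest_run(row) * 3 > width * 2]
    cols = [x for x in range(width)
            if longest_run(binary[y][x] for y in range(height)) * 3 > height * 2]
    return rows, cols


def nearest_line(lines: Sequence[int], center: int) -> Optional[int]:
    """Marked line closest to the player, who sits at the minimap center."""
    return min(lines, key=lambda pos: abs(pos - center)) if lines else None
```

Because the player is always at the center of the minimap, `nearest_line(rows, height // 2)` and `nearest_line(cols, width // 2)` pick the lines whose labels would then be cropped and classified.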
For characters and vehicles in the game, the target positions are not fixed and the targets vary widely in appearance, so the invention processes and recognizes target images mainly with an algorithm model obtained through offline machine-learning training. The model has two functions: target detection and target recognition. For each game image supplied by the image acquisition preprocessing module, it first detects the positions of objects that may be of interest (characters, vehicles and so on) and then recognizes each object to decide which class of target it belongs to. The model is built on the fully convolutional network structure of the deep-learning vision algorithm YOLOv2 (You Only Look Once, version 2). The main strength of this structure is its very low demand on computing resources, which allows real-time image processing in most cases, but its accuracy and recall leave room for improvement. An up-sampling layer and a pyramid structure are therefore added to the network. The up-sampling layer improves recognition accuracy and recall for small targets. The pyramid structure feeds shallow-layer information into the deeper layers; combining shallow spatial information with deep semantic information, together with the up-sampling layer, further improves recognition accuracy. With the up-sampling layer and pyramid structure added to the YOLOv2 fully convolutional network, the improved model recognizes small objects better while still consuming few computing resources: accuracy on small objects improves by 10 percent and recall by 20 percent.
The above describes how the system recognizes targets in a single image. What the system captures from the game, however, is a sequence of images that are continuous in time, and consecutive frames are related to each other (for example through the running speed of a character or the driving speed of a vehicle). Making full use of the position and size relationships of a recognized target across consecutive frames allows individual misrecognitions by the fully convolutional network to be eliminated and raises overall accuracy. For example, suppose a small player target is recognized at time t while the current player stays still. If that target has shifted at time t+1 by far more than a character could run, the network is considered to have misidentified a vehicle as an enemy player. Similarly, the image recognition module uses other characteristics of the game itself, such as the aspect ratio of player characters, as auxiliary criteria, further improving recognition accuracy for characters and vehicles. In addition, because the input image has been cropped and scaled, the coordinates of a recognized target must be converted back to obtain its position in the original game image before being sent to the recognition result prompting module for further processing. The recognized target coordinates are also fed back to the game image acquisition preprocessing module, which helps with internal performance optimization and efficiency.
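Two of the steps in this paragraph lend themselves to a short sketch: mapping a detection from the cropped, scaled image back to the original screen, and the speed-based plausibility check between consecutive frames. The function names, the pixel-per-second speed unit and the `max_person_speed` parameter are assumptions for illustration; the patent does not give concrete thresholds.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def to_original(box: Box, crop_origin: Tuple[int, int], scale: float) -> Box:
    """Map a box from the scaled crop back to original screen coordinates.

    scale is the factor the crop was multiplied by before recognition
    (for example 0.5 if each side was halved).
    """
    x, y, w, h = box
    ox, oy = crop_origin
    return (ox + x / scale, oy + y / scale, w / scale, h / scale)


def recheck_label(prev_center: Tuple[float, float],
                  curr_center: Tuple[float, float],
                  dt: float, max_person_speed: float) -> str:
    """Re-label a 'person' detection that moved faster than a person can run."""
    distance = math.hypot(curr_center[0] - prev_center[0],
                          curr_center[1] - prev_center[1])
    # A displacement beyond what running allows suggests a vehicle instead.
    return "person" if distance <= max_person_speed * dt else "vehicle"
```

In practice the speed limit would also depend on the apparent size of the target, since a distant character covers fewer pixels per second; that refinement is left out of this sketch.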
The digital audio recognition module recognizes sounds mainly with the spectral-feature comparison scheme provided by the audio offline analysis system. The processing involves little computation and therefore consumes little computing power, and because the variety of sounds in the game is not particularly rich, the scheme recognizes gunshots with high accuracy. The key issue then becomes the real-time performance of the whole path from a shot being fired in the game to the auxiliary tool recognizing it and prompting the player: if the result arrives too late, it no longer helps the player. Digital audio recognition has to process sound data covering a period of time, but the frequency-domain conversion and spectral comparison are computed very quickly, so the delay arises mainly in the sound sampling stage. As described above, the sound acquisition module packages the collected sound data into segments of fixed duration before passing them to the audio recognition module, so the duration of each packet becomes the lag of the gunshot prompt. Earlier statistical analysis showed that a segment must contain at least half of the gunshot waveform for the gunshot to be recognized with high accuracy, and sample statistics show that a gunshot waveform generally lasts about 1 second. A segment length of 0.5 second is therefore chosen for sound acquisition, which preserves recognition accuracy while still assisting the player in real time.
The recognition result prompting module uses the transparent window technology of the Windows system to create a transparent window on top of the game screen. In this window it flashes conspicuous icons at the positions of enemies and other targets supplied by the digital image recognition module and at the position corresponding to the gunshot direction supplied by the digital audio recognition module, and it plays pre-recorded voice prompts, helping ordinary players discover enemies and their direction as early as possible.
The image offline training system provides the specific recognition algorithms and models for electronic compass recognition, minimap position recognition, and character and vehicle recognition in the game. Based on deep learning, a branch of machine learning, it learns from a large number of training samples to obtain an AI algorithm model that can detect and recognize target objects. Current AI algorithms essentially learn a data distribution: during training, the back-propagation algorithm optimizes the model parameters according to the learned distribution so that they approach a global or local optimum under that distribution. How well the model performs in practice therefore depends on whether the distribution learned in training matches the actual application scenario. For this reason, images must first be collected from the game and strict labeling rules drawn up, which helps the model converge and perform better when applied. The back-propagation algorithm mentioned above is based on gradient descent; to converge faster and avoid some local optima, this example uses the Adam optimizer, which combines momentum with per-parameter adaptive updates.
In addition, target detection is in a sense a selective traversal: anchors of given sizes and aspect ratios must be specified in advance, and the whole image is traversed using the one-to-one correspondence of image regions in the fully convolutional structure. The sizes and aspect ratios of the anchors therefore strongly affect detection performance. The positions of the labeled targets are recorded, the labeled boxes are clustered with the k-means++ algorithm using intersection over union (IoU) as the metric, and the most representative boxes are taken as anchors. The model is then trained iteratively on the GPU until it converges. For electronic compass recognition, a number of compass-angle image samples with different digit counts are collected and divided into three classes; the digits are then cropped out of the compass and divided into ten classes, 0 to 9. For minimap position recognition, the letters and digits of the marked-line labels are cropped from collected minimap samples and divided into 26 classes, comprising 16 letters and 10 digits. All three recognition tasks use a convolutional neural network with a Softmax classifier and cross-entropy loss, trained on the GPU; once the loss has converged, the network weights are saved and later loaded for recognition. For the detection and recognition model for characters and vehicles, weight data are obtained by feeding labeled images into the neural network and training with gradient descent; during later recognition, targets in an image can be detected and recognized simply by loading these weights.
Because the computing power and video memory of a single GPU are limited, several GPUs are used for parallel training, which speeds up training and allows a larger batch size, greatly reducing oscillation during training. When the model has roughly converged, its parameters are fine-tuned manually and training continues to reach a better result.
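The anchor selection described above (clustering the labeled boxes with intersection over union as the metric) can be illustrated with a small sketch. Two simplifications are assumptions of this example, not statements about the patented method: boxes are compared by width and height only, as if aligned at a common corner, and the randomized k-means++ seeding is replaced by a deterministic farthest-first seeding so the result is reproducible.

```python
from typing import List, Sequence, Tuple

WH = Tuple[float, float]  # (width, height) of a labeled box


def iou_wh(a: WH, b: WH) -> float:
    """IoU of two boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def seed_centers(boxes: Sequence[WH], k: int) -> List[WH]:
    """Farthest-first seeding (deterministic stand-in for k-means++)."""
    centers = [boxes[0]]
    while len(centers) < k:
        centers.append(max(
            boxes, key=lambda b: min(1 - iou_wh(b, c) for c in centers)))
    return centers


def cluster_anchors(boxes: Sequence[WH], k: int, iterations: int = 50) -> List[WH]:
    """k-means over box sizes with distance = 1 - IoU; returns k anchors."""
    if not 0 < k <= len(boxes):
        raise ValueError("k must be between 1 and the number of boxes")
    centers = seed_centers(boxes, k)
    for _ in range(iterations):
        groups: List[List[WH]] = [[] for _ in centers]
        for box in boxes:
            # Assign each box to the center it overlaps most.
            best = max(range(len(centers)), key=lambda i: iou_wh(box, centers[i]))
            groups[best].append(box)
        updated = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)
```

Using 1 - IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which is the usual reason for choosing this metric when deriving anchors.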
The audio offline analysis system is based on frequency-domain analysis. It performs spectral analysis on a number of pre-collected gunshot audio samples to obtain the frequency-domain features of the target samples and provides these features to the auxiliary tool software for spectral comparison. As demands on game quality rise, most shooting games use recordings of real gunshots to create atmosphere. Unlike the human voice, gunshots are high in frequency, which makes them considerably harder for the human ear to perceive, and to make scenes more realistic games also produce the sound of bullets hitting the ground. The gunshot and the bullet impact follow each other closely, with a very short interval, and the direction of the impact sound (on a wall, the ground and so on) may differ from the direction of the shot, which makes it even harder for the player to judge where the shot came from. When the waveforms of collected gunshot samples are examined, it is hard to find any clear difference in shape from other sounds, because the distinguishing features of the sound signal lie in the frequency domain rather than the time domain. The digital audio recognition module therefore converts the sound signal from the time domain to the frequency domain with the fast Fourier transform (FFT), a fast algorithm for computing the discrete Fourier transform. The FFT runs in O(n log n) time and uses few computing resources, so it fully meets the requirement for real-time feedback in games.
Gunshots are very high in frequency while other sounds lie much lower, which is a clear distinguishing factor. Applying the Fourier transform to pre-collected gunshot signals and examining their frequency-domain features shows that all kinds of gunshots have very large amplitudes in certain fixed frequency bands. The digital audio recognition module therefore takes these bands as reference intervals and checks whether the amplitude in them exceeds a preset threshold to decide whether a sound is a gunshot.
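The band-threshold comparison can be sketched with a small radix-2 FFT written in plain Python. The reference bands, the threshold and the rule that every reference band must exceed it are assumptions for illustration; the patent does not state concrete values. The sketch also requires the segment length to be a power of two, so a real 0.5-second segment would first need to be truncated or zero-padded.

```python
import cmath
from typing import List, Sequence, Tuple


def fft(samples: Sequence[float]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even, odd = fft(samples[0::2]), fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_amplitude(samples: Sequence[float], sample_rate: int,
                   low_hz: float, high_hz: float) -> float:
    """Largest single-sided amplitude among bins inside [low_hz, high_hz]."""
    n = len(samples)
    spectrum = fft(samples)
    amps = [abs(spectrum[k]) * 2 / n for k in range(1, n // 2)
            if low_hz <= k * sample_rate / n <= high_hz]
    return max(amps, default=0.0)


def is_gunshot(samples: Sequence[float], sample_rate: int,
               bands: Sequence[Tuple[float, float]], threshold: float) -> bool:
    """True if every reference band exceeds the preset threshold."""
    return all(band_amplitude(samples, sample_rate, lo, hi) > threshold
               for lo, hi in bands)
```

Running this check per channel on each 0.5-second packet gives the per-channel gunshot decision that the prompting module consumes.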
The embodiments described above are presented to enable a person of ordinary skill in the art to understand and use the invention. It will be readily apparent to those skilled in the art that various modifications may be made to these embodiments and that the generic principles described here may be applied to other embodiments without inventive effort. The present invention is therefore not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of this disclosure shall fall within the protection scope of the present invention.

Claims (9)

1. A game assistance system based on image recognition and audio recognition, characterized in that the system comprises a game image acquisition preprocessing module, a game sound acquisition preprocessing module, a digital image recognition module, a digital audio recognition module and a recognition result prompting module; wherein:
the game image acquisition preprocessing module is used for capturing and preprocessing game video images in real time;
the game sound acquisition preprocessing module is used for capturing and preprocessing multi-channel sound of a game in real time;
the digital image identification module is used for identifying an electronic compass angle in a game video image, player coordinates in a small map and positions of a person and a vehicle;
the digital audio recognition module adopts the frequency spectrum characteristics of the target signal given by the audio offline analysis system to perform frequency domain analysis and comparison on the audio data of each sound channel and judge which sound channel contains the gunshot information;
the recognition result prompting module is used for highlighting and marking the electronic compass angle, the coordinates of the player in the small map, the positions of the character and the vehicle and the corresponding positions of the occurrence of the gunshot on the game picture and giving out voice alarm, thereby achieving the purpose of timely reminding the player of paying attention.
2. The game assistance system according to claim 1, characterized in that: the game image acquisition preprocessing module converts continuous game video into discrete digital images; for content displayed at fixed positions, it directly cuts out the image blocks of the corresponding regions and passes them to the digital image recognition module for recognition; for targets that do not appear at fixed positions, it crops the captured image in a sliding-window manner, taking only the upper part of the image, then scales the cropped image proportionally and records the order of the frames for the digital image recognition module to retrieve and analyze.
3. The game assistance system according to claim 1, characterized in that: the game sound acquisition preprocessing module converts the continuous sound of each sound channel into digital audio packets of fixed duration, marks each audio packet with the sound channel it belongs to, and then passes the packets to the digital audio recognition module for further processing.
4. The game assistance system according to claim 1, characterized in that: the digital image recognition module recognizes the electronic compass angle with two convolutional neural networks, assisted by image preprocessing methods; it recognizes the player coordinates in the minimap directly with image processing methods; and it recognizes characters and vehicles with a model obtained through offline machine-learning training, the model having two functions, target detection and target recognition, that is, for a game video image supplied by the game image acquisition preprocessing module, the position of an object of interest is detected first, and the object is then recognized to decide which class of target it belongs to.
5. The game assistance system according to claim 4, characterized in that: the model is a fully convolutional neural network model based on the deep-learning vision algorithm YOLOv2 and is provided with an up-sampling layer and a pyramid structure; the up-sampling layer is used to improve recognition accuracy and recall for small targets; the pyramid structure is used to feed shallow-layer information into the deeper layers of the network, combining shallow spatial information with deep semantic information and, together with the up-sampling layer, further improving the accuracy of target recognition.
6. The game assistance system according to claim 4, characterized in that: the digital image recognition module uses the correlation between consecutive frames of the continuous image sequence to assist target detection and recognition in a single image, thereby further improving recognition accuracy; meanwhile, because the game image acquisition preprocessing module crops and scales the input image, the digital image recognition module converts coordinate systems after recognizing a target object, calculates the position coordinates of the target object in the original game image, sends them to the recognition result prompting module for further processing, and feeds the position coordinates of the target object back to the game image acquisition preprocessing module, which facilitates internal performance optimization and efficiency.
7. The game assistance system according to claim 1, characterized in that: when a gunshot is detected in several sound channels simultaneously, the digital audio recognition module analyzes it in combination with the volume amplitude of each sound channel in the time domain, treats the time-domain amplitude of each channel in which the gunshot is detected as a separate gunshot component, calculates the source direction of the gunshot in three-dimensional space according to the properties of vectors, and then passes the direction to the recognition result prompting module for further processing.
8. The game assistance system according to claim 4, characterized in that: for detecting and recognizing target objects, the digital image recognition module relies on an image offline training system to obtain an AI algorithm model capable of detecting and recognizing target objects; the image offline training system is based on deep learning in machine learning and learns from a large number of training samples, that is, images are first collected from the game and strict labeling rules are drawn up, and during training the parameters of the AI algorithm model are optimized by the back-propagation algorithm according to the learned data distribution so that they approach a global or local optimum under that distribution; for target detection, anchors of given sizes and aspect ratios are specified in advance and the whole image is traversed using the one-to-one correspondence of image regions in the fully convolutional structure, the positions of the labeled targets are recorded, the labeled boxes are clustered with the k-means++ algorithm using intersection over union as the metric, the most representative of the labeled boxes are taken as anchors, and finally the AI algorithm model is trained iteratively on the GPU until it converges.
9. The game assistance system according to claim 1, characterized in that: the audio offline analysis system is based on frequency-domain analysis, obtains the frequency-domain features of the target samples by performing spectral analysis on a plurality of pre-collected gunshot audio samples, and provides these features to the digital audio recognition module for spectral comparison; the digital audio recognition module converts audio signals from the time domain to the frequency domain with the fast Fourier transform, that is, the Fourier transform is applied to the pre-collected gunshot signals and their frequency-domain features are observed; if the amplitudes of the various gunshots are relatively large in certain fixed frequency bands, the digital audio recognition module takes these frequency bands as reference intervals and compares whether the amplitude of the sound signal in them exceeds a preset threshold, and if it does, a gunshot is determined to be present.
CN201910926107.1A 2019-09-27 2019-09-27 Game auxiliary system based on image recognition and audio recognition Active CN110652726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910926107.1A CN110652726B (en) 2019-09-27 2019-09-27 Game auxiliary system based on image recognition and audio recognition

Publications (2)

Publication Number Publication Date
CN110652726A true CN110652726A (en) 2020-01-07
CN110652726B CN110652726B (en) 2022-10-25

Family

ID=69039625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910926107.1A Active CN110652726B (en) 2019-09-27 2019-09-27 Game auxiliary system based on image recognition and audio recognition

Country Status (1)

Country Link
CN (1) CN110652726B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111603776A (en) * 2020-05-21 2020-09-01 上海艾为电子技术股份有限公司 Method for recognizing gunshot in audio data, method for driving motor and related device
CN113198172A (en) * 2020-02-03 2021-08-03 宏碁股份有限公司 Game key mode adjusting method and electronic device
CN113289327A (en) * 2021-06-18 2021-08-24 Oppo广东移动通信有限公司 Display control method and device of mobile terminal, storage medium and electronic equipment
WO2023216502A1 (en) * 2022-05-10 2023-11-16 网易(杭州)网络有限公司 Display control method and apparatus in game, storage medium and electronic device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012143509A (en) * 2011-01-14 2012-08-02 Furyu Kk Video game device, method of processing video game, video game processing program, and computer readable recording medium
CN103023872A (en) * 2012-11-16 2013-04-03 杭州顺网科技股份有限公司 Cloud game service platform
JP2013208227A (en) * 2012-03-30 2013-10-10 Konami Digital Entertainment Co Ltd Game device, control method of game device and program
CN106408001A (en) * 2016-08-26 2017-02-15 西安电子科技大学 Rapid area-of-interest detection method based on depth kernelized hashing
CN108126342A (en) * 2017-12-28 2018-06-08 珠海市君天电子科技有限公司 A kind of information processing method, device and terminal
CN108176049A (en) * 2017-12-28 2018-06-19 珠海市君天电子科技有限公司 A kind of information cuing method, device, terminal and computer readable storage medium
JP2018157987A (en) * 2017-03-23 2018-10-11 株式会社コナミアミューズメント Game system and computer program used for the same
CN109276882A (en) * 2018-09-20 2019-01-29 珠海市君天电子科技有限公司 A kind of game householder method, device, electronic equipment and storage medium
CN109685060A (en) * 2018-11-09 2019-04-26 科大讯飞股份有限公司 Image processing method and device
CN110009004A (en) * 2019-03-14 2019-07-12 努比亚技术有限公司 Image processing method, computer equipment and storage medium
CN110052025A (en) * 2019-05-24 2019-07-26 网易(杭州)网络有限公司 Method for sending information, display methods, device, equipment, medium in game
CN110170170A (en) * 2019-05-30 2019-08-27 维沃移动通信有限公司 A kind of information display method and terminal device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨露菁等: "《智能图像处理及应用》", 31 March 2019 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113198172A (en) * 2020-02-03 2021-08-03 宏碁股份有限公司 Game key mode adjusting method and electronic device
CN111603776A (en) * 2020-05-21 2020-09-01 上海艾为电子技术股份有限公司 Method for recognizing gunshot in audio data, method for driving motor and related device
CN111603776B (en) * 2020-05-21 2023-09-05 上海艾为电子技术股份有限公司 Method for identifying gunshot in audio data, motor driving method and related device
CN113289327A (en) * 2021-06-18 2021-08-24 Oppo广东移动通信有限公司 Display control method and device of mobile terminal, storage medium and electronic equipment
WO2023216502A1 (en) * 2022-05-10 2023-11-16 网易(杭州)网络有限公司 Display control method and apparatus in game, storage medium and electronic device

Also Published As

Publication number Publication date
CN110652726B (en) 2022-10-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant