WO2022201151A1 - System and method for measuring advertisements exposure in 3d computer games - Google Patents

Info

Publication number
WO2022201151A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
advertisement
game
exposure
frames
Prior art date
Application number
PCT/IL2022/050316
Other languages
French (fr)
Inventor
Jihad El-Sana
Original Assignee
Mirage Dynamics Ltd
Priority date
Filing date
Publication date
Application filed by Mirage Dynamics Ltd
Publication of WO2022201151A1

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • A63F13/61 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor using advertising information
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • A63F13/67 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor adaptively or by learning from player actions, e.g. skill level adjustment or by storing successful combat sequences for re-use
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/70 Game security or game management aspects
    • A63F13/79 Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items

Definitions

  • the present invention relates to the field of digital advertisements using computer graphics. More particularly, the invention relates to a system and method for measuring the exposure of players to advertisements displayed in computer games (such as 3D computer games).
  • One conventional approach to measuring the exposure is to intercept the rendering pipeline (a conceptual model that describes what steps a graphics system needs to perform to render a 3D scene to a 2D screen) and explore the view frustum to search for advertisements.
  • This approach could be implemented either in software or in hardware.
  • the software implementation must be integrated within the game engine and dramatically reduces the rendering speed of the 3D game, since at every frame the software must compute the view frustum, determine whether there are any advertisements within that frame, and determine whether they are visible to the player (e.g., by identifying which polygon is the closest to the player's view frustum and its visibility).
  • the hardware implementation is performed by exploring the z-buffer (a type of data buffer used in computer graphics to represent depth information of objects in 3D space from a particular perspective) to determine the existence and visibility of injected advertisements.
  • the z-buffer is read to a memory and scanned to search for advertisement polygons, while halting the game for the time required to determine whether or not an examined polygon that has been identified as an advertisement is visible to the player.
  • both software and hardware implementations require changes within the game engine or the game software and therefore require the intervention of the game designer to provide a specific API that allows measuring the exposure of the player to injected advertisements. They also impose additional processing time at each frame, which eventually reduces the game speed and degrades the game flow.
  • a method for measuring the level of exposure of players to advertisements displayed in computer games comprising the steps of: a) copying, at a predetermined rate, frames to be analyzed, from a frame buffer of a workstation executing the computer game, into a memory, such as a shared memory; b) processing and analyzing the copied frames by an independent deep learning application that runs in parallel to the game application, the deep learning application being adapted to: c) extract features from each analyzed frame using a Convolutional Neural Network (CNN) model; and d) localize the detection region of advertisements within each analyzed frame using a Recurrent Neural Network (RNN) model.
  • the deep learning application may be further adapted to: a) compute the boundaries of each detected advertisement using a quadrilateral regression model; b) improve the accuracy of the computed boundaries using a refinement model.
  • the level of exposure may be determined by the exposure time, the view angle and the number of pixels occupied by each advertisement in the screen space of the computer game.
  • the rate of copying the frame buffer into the memory may depend on the rate of the game application.
  • the copied frames from the frame buffer may be stored in the memory at lower resolution, compared to the game resolution.
  • the deep learning models may be pre-trained on an appropriate advertisement dataset and are updated regularly, using a cloud-based training process, during which additional new advertisements are continuously added to the dataset.
  • the method may further comprise the step of measuring changes in the orientation of the advertisement by computing a homography transformation among the detected objects across consecutive frames.
  • the method may further comprise the step of using a tracking model to reduce the time required for ads detection, where the tracking model includes: a) an adaptive correlation module for detecting the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame; b) a Key-point correspondence module for calculating the location of key-points in a current frame with respect to the preceding frame; and c) a homography transformation calculation module for determining the orientation of the detected advertisement, based on the results of the adaptive correlation module and the key-point correspondence module.
  • the method may further comprise the steps of: a) assigning a unique ID to each advertisement; and b) measuring, by the measurement module, exposure parameters that correspond to the advertisement.
  • the exposure parameters may include one or more of the following: the number of frames showing a particular advertisement; the screen size of the particular advertisement and its orientation,
  • the exposure parameters may be computed by the measurement module on the workstation that runs the game.
  • the exposure parameters may be computed by sending the frames to be analyzed to a remote server, which receives a frame or a set of frames that are compressed or down-sampled by the measurement module, and performs the frame analysis externally to the workstation.
  • a system for measuring the level of exposure of players to advertisements displayed in computer games comprising: a) a workstation comprising at least one processor, for executing the computer game; b) a software module installed on the workstation being adapted to copy, at a predetermined rate, frames to be analyzed, from a frame buffer of the workstation, into a memory, such as shared memory; c) a computerized device comprising at least one processor, for: processing and analyzing the copied frames by an independent deep learning application that runs in parallel to the game application, the deep learning application being adapted to: extract features from each analyzed frame using a Convolutional Neural Network (CNN) model; and localize the detection region of advertisements within each analyzed frame using a Recurrent Neural Network (RNN) model.
  • Fig. 1 illustrates a flowchart of the processing steps performed by the deep learning system, implementing a measurement module, according to an embodiment of the invention
  • Fig. 2 illustrates the process carried out by the tracking module, which allows detecting how an advertisement moves from frame to frame and how its orientation changes, according to an embodiment of the invention.
  • the present invention provides a method for measuring the exposure of players to advertisements displayed in 3D computer games without altering the game engine or the game software and without reducing the game speed.
  • the performed measurement considers the time and quality of the exposure to the game player. Since the proposed method does not require any changes on the game engine or the game software itself, it can be implemented once for all the games, as a service on top of (or within) the operating system level.
  • a frame buffer is a portion of random-access memory containing data representing all the pixels in a complete video frame
  • Microsoft DirectX is an application program interface (API) for creating and managing graphic images and multimedia effects in applications such as games or active Web pages that will run in Microsoft's Windows operating systems
  • OpenGL (Open Graphics Library) is the computer industry's standard application program interface (API) for defining 2-D and 3-D graphic images. Therefore, it is possible to copy this frame buffer into the main memory and possibly accumulate the grabbed frames into a video segment.
  • the method proposed by the present invention utilizes computer vision to detect, locate, and track advertisements in each frame or selected frames of the 3D game.
  • with this method, not only the exposure time is measured, but also the view angle and the number of pixels occupied by each advertisement in the screen space.
  • Dynamic Link Library is a collection of small programs that larger programs can load when needed to complete specific tasks
  • injection techniques are utilized to intercept calls to the Direct3D (Direct3D is the Microsoft 3D application programming interface (API) component of the DirectX API package) or OpenGL APIs and copy the frame buffer into a shared memory that enables an independent application to process and analyze the copied frames.
  • the rate of copying the frame buffer into the shared memory depends on the rate of the game.
  • the code intercepts each call to the SwapBuffers function (the SwapBuffers function is used to copy the contents of an off-screen buffer to an on-screen buffer; the back buffer is off-screen, and the front buffer is on-screen) and copies the previous framebuffer before the swap using, for example, the glReadPixels function (the glReadPixels function reads a block of pixels from the framebuffer).
  • other functions may be used to read pixels from frame-buffer to shared memory.
  • the intercepting process, which runs in parallel to the game application, is used to copy the frame buffer to a shared memory.
  • Another process analyzes the intercepted images on the local workstation, or sends them to another machine for further processing, to detect and localize advertisements using deep learning (deep learning is a type of machine learning technique that teaches computers to do what comes naturally to humans: learn by example. In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models are trained by using a large set of labeled data and neural network architectures that contain many layers, and can achieve state-of-the-art accuracy).
  • the copied frames from the frame buffer are stored in the shared memory using substantially lower resolution, compared to the game resolution. Then deep learning is applied to the lower resolution frames, in order to detect advertisements.
  • the deep learning models are pre-trained on an appropriate advertisement dataset and are updated regularly, using a cloud-based training process, during which additional new advertisements are continuously added to the dataset.
  • the deep learning system of the present invention comprises the following models: A Convolutional Neural Network (CNN) model, for extracting features from an analyzed frame.
  • a Recurrent Neural Network or a Transformers-based model (a transformer is a deep learning model that adopts the mechanism of self attention, differentially weighting the significance of each part of the input data), for localizing the detection region within the analyzed frame.
  • a regression model for determining (computing) the boundaries of each advertisement
  • a refinement model for improving the accuracy of the computed boundaries
  • CNNs are powerful image-processing artificial intelligence (AI) models that use deep learning to perform both generative and descriptive tasks, often using machine vision that includes image and video recognition, along with recommender systems and Natural Language Processing (NLP).
  • This neural network computational model uses a variation of multilayer perceptrons (a perceptron is a simple model of a biological neuron in an artificial neural network) and contains one or more convolutional layers that can be either entirely connected or pooled. These convolutional layers create feature maps that record a region of image which is ultimately broken into rectangles and sent out for nonlinear processing.
  • CNNs have their "neurons" arranged more like those of the visual cortex, the area responsible for processing visual stimuli in humans and other animals.
  • the layers of neurons are arranged in such a way as to cover the entire visual field avoiding the piecemeal image processing problem of traditional neural networks.
  • a CNN uses a system much like a multilayer perceptron that has been designed for reduced processing requirements.
  • the layers of a CNN consist of an input layer, an output layer and a hidden layer that includes multiple convolutional layers, pooling layers, fully connected layers and normalization layers.
  • A CNN has very high accuracy in image recognition problems and can automatically detect the important features without any human supervision. Therefore, the CNN model is very effective for extracting features from an analyzed frame.
  • feature extraction is a part of the dimensionality reduction process, in which, an initial set of the raw data is divided and reduced to more manageable groups.
  • the most important characteristic of these large data sets is that they have a large number of variables. These variables require a lot of computing resources to process. So, feature extraction helps to get the best feature from those big data sets by selecting and combining variables into features, thereby, effectively reducing the amount of data. These features are easy to process, but still able to describe the actual data set with accuracy and originality.
  • the technique of extracting the features is useful when there is a large data set and it is required to reduce the number of resources without losing any important or relevant information.
  • Feature extraction helps to reduce the amount of redundant data from the data set. The reduction of the data helps to build the model with less machine effort and also increases the speed of learning and generalization steps in the machine learning process.
  • a Recurrent Neural Network is a type of artificial neural network which uses sequential data or time series data. These deep learning algorithms are commonly used for ordinal or temporal problems, such as language translation, natural language processing (NLP), speech recognition, and image captioning; they are incorporated into popular applications such as Siri, voice search, and Google Translate. Like feedforward and Convolutional Neural Networks (CNNs), Recurrent Neural Networks utilize training data to learn. They are distinguished by their "memory" as they take information from prior inputs to influence the current input and output. While traditional deep neural networks assume that inputs and outputs are independent of each other, the output of a recurrent neural network depends on the prior elements within the sequence.
  • An RNN saves the output of processing nodes and feeds the result back into the model thereby learning to predict the outcome of a layer.
  • Each node in the RNN model acts as a memory cell, continuing the computation and implementation of operations. If the network's prediction is incorrect, then the system self-learns and continues working towards the correct prediction during backpropagation.
  • An RNN retains information through time. It is useful in time series prediction because of its ability to remember previous inputs as well. This is called Long Short Term Memory. RNNs are even used with convolutional layers to extend the effective pixel neighborhood. By doing so, the system proposed by the present invention actually performs continuous training.
  • a regression model is a statistical model that estimates the relationship between one dependent variable and one or more independent variables using a line (or a plane in the case of two or more independent variables).
  • a regression model can be used when the dependent variable is quantitative, except in the case of logistic regression, where the dependent variable is binary.
  • the present invention uses, for example, a quadrilateral regression model (or other polygonal shapes), for computing the boundaries of each advertisement.
  • FIG. 1 illustrates a flowchart of the processing steps performed by the deep learning system, implementing a measurement module, according to an embodiment of the invention.
  • a frame to be analyzed 101 is read from the frame buffer and fed into the CNN model 102.
  • the CNN model extracts features from the analyzed frame 101.
  • the features of the analyzed frame 101 (that have been extracted by the CNN model) are fed into a Recurrent Neural Network (RNN) model 103 that localizes the detection region within the analyzed frame 101.
  • the boundaries of each advertisement in the analyzed frame 101 are computed by a regression model 104, based on the data of localized detection region.
  • the accuracy of the computed boundaries is improved using a refinement model.
  • the computed boundaries are obtained. In this example, four advertisement polygons 106a-106d (marked by solid red lines) were detected in the analyzed frame 101.
  • planar homography is a transformation between two planes, i.e., a mapping between two planar projections of an image.
  • an element in one image has its projection in the other image in a homogeneous coordinate plane, retaining the same information but in a transformed perspective. When the homography is computed among the detected objects across the frames, temporal coherency among consecutive frames is utilized.
  • Pixel level accuracy may not be required for many measurements and in this case, applying the two main components, CNN and RNN is enough to provide such information.
  • a tracking model may be applied, which includes the following components:
  • An adaptive correlation module - is used for detecting the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame. This eliminates the need to perform a new search.
  • a Key-point correspondence module - is used to calculate the location of key-points in a current frame with respect to the preceding frame
  • a homography transformation calculation module - is used for determining the orientation of the detected advertisement (and changes in that orientation), based on the results of the adaptive correlation module and the key-point correspondence module
  • the process carried out by the tracking module allows detecting how an advertisement moves from frame to frame and how its orientation changes, as shown in Fig. 2.
  • the adaptive correlation module (which consists of correlation filters) detects the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame.
  • the key-point correspondence module calculates the location of key-points in a current frame with respect to the preceding frame, based on the results calculated in step 201.
  • the homography transformation calculation module determines the orientation of the detected advertisement, based on the results calculated in step 202.
  • the tracking process may be eliminated. However, the tracking process is used to model orientation and estimate view direction, when needed.
  • the Ad server sends a set of advertisements to the workstation, which embeds them within the game.
  • a unique ID is assigned by the Ad server to each advertisement, and the measurement module transfers exposure parameters, such as the number of frames showing a particular advertisement, the screen size of that particular advertisement and its duration, to the Ad server.
  • the exposure duration is measured as the time that elapsed between the first and the last frames in which that particular advertisement has been displayed. Computing the exposure parameters may be performed by the measurement module on the workstation that runs the game.
  • the measurement module sends the frames to be analyzed to a remote cloud server, which receives a frame or set of frames that are compressed or down-sampled (down-sampling is the process of reducing the sampling rate of a signal, to thereby reduce the data rate or the size of the data) by the measurement module, and then performs the frame analysis externally to the workstation.
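The exposure parameters listed above (number of frames showing an advertisement, its screen size, and the duration between the first and last frames in which it appears) could be accumulated per advertisement ID roughly as follows. This is a minimal sketch, not the patent's implementation: the `ExposureRecord` class, the `accumulate` helper, the report field names, and the sample detections are all hypothetical, and detections are assumed to arrive in time order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class ExposureRecord:
    """Running exposure totals for one advertisement ID (hypothetical structure)."""
    ad_id: str
    frame_count: int = 0
    first_seen: Optional[float] = None
    last_seen: Optional[float] = None
    pixel_sum: float = 0.0

    def update(self, timestamp: float, pixel_area: float) -> None:
        # Assumes detections are fed in increasing timestamp order.
        if self.first_seen is None:
            self.first_seen = timestamp
        self.last_seen = timestamp
        self.frame_count += 1
        self.pixel_sum += pixel_area

    def report(self) -> dict:
        # Duration is the time between the first and last frame showing the ad.
        seen = self.frame_count > 0
        return {
            "ad_id": self.ad_id,
            "frames": self.frame_count,
            "duration_s": (self.last_seen - self.first_seen) if seen else 0.0,
            "mean_pixels": (self.pixel_sum / self.frame_count) if seen else 0.0,
        }


def accumulate(detections: Iterable[Tuple[float, str, float]]) -> Dict[str, ExposureRecord]:
    """Fold (timestamp_seconds, ad_id, pixel_area) detections into per-ad records."""
    records: Dict[str, ExposureRecord] = {}
    for timestamp, ad_id, pixel_area in detections:
        records.setdefault(ad_id, ExposureRecord(ad_id)).update(timestamp, pixel_area)
    return records


# Made-up detections for illustration only.
detections = [
    (0.0, "ad-1", 24000),
    (0.5, "ad-1", 26000),
    (0.5, "ad-2", 1000),
    (2.0, "ad-1", 22000),
]
reports = {ad_id: rec.report() for ad_id, rec in accumulate(detections).items()}
```

A report of this shape is what a measurement module might transfer to the Ad server; the actual message format is not specified in the source.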

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Image Analysis (AREA)

Abstract

A system for measuring the level of exposure of players to advertisements displayed in computer games, comprising a workstation comprising at least one processor, for executing the computer game; a software module installed on the workstation being adapted to copy, at a predetermined rate, frames to be analyzed, from a frame buffer of the workstation, into a memory, such as shared memory; a computerized device comprising at least one processor, for processing and analyzing the copied frames by an independent deep learning application that runs in parallel to the game application. The deep learning application is adapted to extract features from each analyzed frame using a Convolutional Neural Network (CNN) model and localize the detection region of advertisements within each analyzed frame using a Recurrent Neural Network (RNN) model.

Description

SYSTEM AND METHOD FOR MEASURING ADVERTISEMENTS EXPOSURE IN 3D COMPUTER GAMES
Field of the Invention
The present invention relates to the field of digital advertisements using computer graphics. More particularly, the invention relates to a system and method for measuring the exposure of players to advertisements displayed in computer games (such as 3D computer games).
Background of the Invention
Computer games have been attracting the interest of many people across ages. Over the recent years, 3D photorealistic computer games have been leading the market of computer games, in terms of revenue. Such development motivated advertisement agencies to address this market.
Recently, computer games have been leveraging the opportunity to monetize their game space by injecting advertisements into the game field. For this purpose, game designers mark various spots, which are used to place personalized advertisements to be displayed to the player, within the game arena. Measuring the exposure of these advertisements has a great value for publishers. Currently, these advertisements are placed on the world coordinates of the game field, which are rendered into the screen, based on the view position and direction of the players. Players determine camera parameters, such as position and view direction, in real-time. As a result, measuring the exposure of the placed advertisements becomes a challenging task.
One conventional approach to measuring the exposure is to intercept the rendering pipeline (a conceptual model that describes what steps a graphics system needs to perform to render a 3D scene to a 2D screen) and explore the view frustum to search for advertisements. This approach could be implemented either in software or in hardware. The software implementation must be integrated within the game engine and dramatically reduces the rendering speed of the 3D game, since at every frame the software must compute the view frustum, determine whether there are any advertisements within that frame, and determine whether they are visible to the player (e.g., by identifying which polygon is the closest to the player's view frustum and its visibility).
The hardware implementation is performed by exploring the z-buffer (a type of data buffer used in computer graphics to represent depth information of objects in 3D space from a particular perspective) to determine the existence and visibility of injected advertisements. In this implementation, at each frame, the z-buffer is read to a memory and scanned to search for advertisement polygons, while halting the game for the time required to determine whether or not an examined polygon that has been identified as an advertisement is visible to the player.
However, both software and hardware implementations require changes within the game engine or the game software and therefore require the intervention of the game designer to provide a specific API that allows measuring the exposure of the player to injected advertisements. They also impose additional processing time at each frame, which eventually reduces the game speed and degrades the game flow.
It is therefore an object of the present invention to provide a method for measuring the exposure of players to advertisements displayed in 3D computer games without altering the game engine or the game software.
It is another object of the present invention to provide a method for measuring the exposure of players to advertisements displayed in 3D computer games, which does not reduce the game speed. It is a further object of the present invention to provide a method for measuring the exposure of players to advertisements displayed in 3D computer games, which considers the time, size, and quality of the exposure.
Other objects and advantages of the invention will become apparent as the description proceeds.
Summary of the Invention
A method for measuring the level of exposure of players to advertisements displayed in computer games, such as 3D games, comprising the steps of: a) copying, at a predetermined rate, frames to be analyzed, from a frame buffer of a workstation executing the computer game, into a memory, such as a shared memory; b) processing and analyzing the copied frames by an independent deep learning application that runs in parallel to the game application, the deep learning application being adapted to: c) extract features from each analyzed frame using a Convolutional Neural Network (CNN) model; and d) localize the detection region of advertisements within each analyzed frame using a Recurrent Neural Network (RNN) model.
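Step a) above, copying a frame into a shared memory that an independent application can read, might look like the following minimal sketch. It uses Python's standard `multiprocessing.shared_memory` module as a stand-in for whatever mechanism an actual implementation would use; the `publish_frame` and `read_frame` helpers and the dummy frame bytes are hypothetical, and both sides run in one process here purely for illustration.

```python
from multiprocessing import shared_memory


def publish_frame(frame_bytes: bytes) -> shared_memory.SharedMemory:
    """Writer side: copy raw frame bytes into a newly created shared-memory segment."""
    segment = shared_memory.SharedMemory(create=True, size=len(frame_bytes))
    segment.buf[:len(frame_bytes)] = frame_bytes
    return segment


def read_frame(name: str, size: int) -> bytes:
    """Reader side: attach to the named segment and copy the frame out."""
    segment = shared_memory.SharedMemory(name=name)
    try:
        # Slice to `size` because the OS may round the segment up to a page boundary.
        return bytes(segment.buf[:size])
    finally:
        segment.close()


# Dummy 1 KiB "frame"; a real frame would hold the pixels read from the frame buffer.
frame = bytes(range(256)) * 4
segment = publish_frame(frame)
received = read_frame(segment.name, len(frame))
segment.close()
segment.unlink()  # The creator releases the segment once the reader is done.
```

In a real deployment the reader would be a separate analysis process, and the two sides would need some agreed way to exchange the segment name and frame size.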
The deep learning application may be further adapted to: a) compute the boundaries of each detected advertisement using a quadrilateral regression model; b) improve the accuracy of the computed boundaries using a refinement model. The level of exposure may be determined by the exposure time, the view angle and the number of pixels occupied by each advisement in the screen space of the computer game.
The rate of copying the frame buffer into the memory may depend on the rate application of the game.
The copied frames from the frame buffer may be stored in the memory at lower resolution, compared to the game resolution.
The deep learning models may be pre-trained on an appropriate advertisement dataset and are updated regularly, using a cloud-based training process, during which additional new advertisements are continuously added to the dataset.
The method may further comprise the step of measuring changes in the orientation of the advertisement by computing a homography transformation among the detected objects across consecutive frames.
The method may further comprise the step of using a tracking model to reduce the time required for ads detection, where the tracking model includes: a) an adaptive correlation module for detecting the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame; b) a key-point correspondence module for calculating the location of key-points in a current frame with respect to the preceding frame; and c) a homography transformation calculation module for determining the orientation of the detected advertisement, based on the results of the adaptive correlation module and the key-point correspondence module.
The method may further comprise the steps of: a) assigning a unique ID to each advertisement; and b) measuring, by the measurements module, exposure parameters that correspond to the advertisement.
The exposure parameters may include one or more of the following: the number of frames showing a particular advertisement; the screen size of the particular advertisement and its orientation.
The exposure parameters may be computed by the measurement module on the workstation that runs the game.
The exposure parameters may be computed by sending the frames to be analyzed to a remote server, which receives a frame or a set of frames that are compressed or down-sampled by the measurement module, and performs the frame analysis externally to the workstation.
A system for measuring the level of exposure of players to advertisements displayed in computer games, comprising: a) a workstation comprising at least one processor, for executing the computer game; b) a software module installed on the workstation being adapted to copy, at a predetermined rate, frames to be analyzed, from a frame buffer of the workstation, into a memory, such as shared memory; c) a computerized device comprising at least one processor, for: processing and analyzing the copied frames by an independent deep learning application that runs in parallel to the game application, the deep learning application being adapted to: extract features from each analyzed frame using a Convolutional Neural Network (CNN) model; and localize the detection region of advertisements within each analyzed frame using a Recurrent Neural Network (RNN) model.
Brief Description of the Drawings
The above and other characteristics and advantages of the invention will be better understood through the following illustrative and non-limitative detailed description of preferred embodiments thereof, with reference to the appended drawings, wherein:
Fig. 1 illustrates a flowchart of the processing steps performed by the deep learning system, implementing a measurement module, according to an embodiment of the invention; and
Fig. 2 illustrates the process carried out by the tracking module, which allows detecting how an advertisement moves from frame to frame and how its orientation changes, according to an embodiment of the invention.
Detailed Description of the Invention
The present invention provides a method for measuring the exposure of players to advertisements displayed in 3D computer games without altering the game engine or the game software and without reducing the game speed. The performed measurement considers the time and quality of the exposure to the game player. Since the proposed method does not require any changes to the game engine or the game software itself, it can be implemented once for all games, as a service on top of (or within) the operating system level.
Current rendering pipelines that include a hardware implementation store the rendering results in a frame buffer (a portion of random-access memory containing data representing all the pixels in a complete video frame), which is accessible within Microsoft DirectX (DirectX is an application program interface (API) for creating and managing graphic images and multimedia effects in applications such as games or active Web pages that will run in Microsoft's Windows operating systems) and OpenGL (Open Graphics Library, the computer industry's standard application program interface (API) for defining 2-D and 3-D graphic images) implementations. Therefore, it is possible to copy this frame buffer into the main memory and, possibly, to accumulate the grabbed frames into a video segment.
The method proposed by the present invention utilizes computer vision to detect, locate, and track advertisements in each frame or selected frames of the 3D game. In this method, not only the exposure time is measured, but also the view angle and the number of pixels occupied by each advertisement in the screen space.
Accordingly, Dynamic Link Library (DLL is a collection of small programs that larger programs can load when needed to complete specific tasks) injection techniques are utilized to intercept calls to the Direct3D (Direct3D is the Microsoft 3D application programming interface (API) component of the DirectX API package) or OpenGL APIs and copy the frame buffer into a shared memory that enables an independent application to process and analyze the copied frames. The rate of copying the frame buffer into the shared memory depends on the rate of the game.
The code intercepts each SwapBuffers function (the SwapBuffers function is used to copy the contents of an off-screen buffer to an on-screen buffer; the back buffer is off-screen and the front buffer is on-screen) and copies the previous framebuffer before the swap using, for example, the glReadPixels function (the glReadPixels function reads a block of pixels from the framebuffer). Of course, other functions may be used to read pixels from the frame buffer into the shared memory.
The intercepting process, which runs in parallel to the game application, is used to copy the frame buffer to a shared memory. Another process analyzes the intercepted images on the local workstation, or sends them to another machine for further processing, to detect and localize advertisements using deep learning (deep learning is a type of machine learning technique that teaches computers to do what comes naturally to humans: learn by example. In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models are trained by using a large set of labeled data and neural network architectures that contain many layers, and can achieve state-of-the-art accuracy).
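The hand-off between the intercepting code and the independent analysis process can be sketched as follows. This is a minimal illustration only, assuming Python's standard `multiprocessing.shared_memory` module in place of an injected DLL. The names `FrameGrabber`, `on_swap`, `read_latest_frame`, `FRAME_BYTES` and `COPY_EVERY_N` are hypothetical and do not appear in the specification, and the frame is a tiny byte string standing in for real pixels read with a function such as glReadPixels.

```python
from multiprocessing import shared_memory

FRAME_BYTES = 4 * 4 * 3  # hypothetical 4x4 RGB frame, 3 bytes per pixel
COPY_EVERY_N = 2         # hypothetical sampling interval: copy every 2nd frame


class FrameGrabber:
    """Stand-in for the injected hook; on_swap plays the role of the
    intercepted SwapBuffers call."""

    def __init__(self):
        self.shm = shared_memory.SharedMemory(create=True, size=FRAME_BYTES)
        self.frame_index = 0
        self.copied = 0

    def on_swap(self, framebuffer: bytes) -> bool:
        """Copy the frame into shared memory when it falls on the interval."""
        if len(framebuffer) != FRAME_BYTES:
            raise ValueError("unexpected frame size")
        take = self.frame_index % COPY_EVERY_N == 0
        if take:
            self.shm.buf[:FRAME_BYTES] = framebuffer
            self.copied += 1
        self.frame_index += 1
        return take

    def close(self):
        """Release and remove the shared memory segment."""
        self.shm.close()
        self.shm.unlink()


def read_latest_frame(shm_name: str) -> bytes:
    """What the independent analysis process would do: attach by name, read."""
    view = shared_memory.SharedMemory(name=shm_name)
    try:
        return bytes(view.buf[:FRAME_BYTES])
    finally:
        view.close()
```

In this sketch the game-side hook only performs a memory copy, while all analysis happens in whichever process calls `read_latest_frame`. A real implementation would also need synchronization between writer and reader, which is omitted here.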
In order to save computational resources such as memory and processing time (which may or may not be required, depending on the game resolution, the available memory, and processing power), the copied frames from the frame buffer are stored in the shared memory using substantially lower resolution, compared to the game resolution. Then deep learning is applied to the lower resolution frames, in order to detect advertisements.
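One simple way to store a frame at lower resolution is block averaging, sketched below. The specification does not state which down-scaling method is used, so this is an assumption for illustration. The function name `downsample` is hypothetical, and the frame is modeled as a list of rows of grayscale values.

```python
def downsample(frame, factor):
    """Reduce resolution by averaging non-overlapping factor x factor blocks.

    `frame` is a list of rows of grayscale values; its height and width are
    assumed to be multiples of `factor`.
    """
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame size must be a multiple of factor")
    out = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [frame[y + dy][x + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

With a factor of 2, the number of pixels to be copied and analyzed drops by a factor of 4.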
According to the present invention, the deep learning models are pre-trained on an appropriate advertisement dataset and are updated regularly, using a cloud-based training process, during which additional new advertisements are continuously added to the dataset.
In one embodiment, the deep learning system of the present invention comprises the following models:
A Convolutional Neural Network (CNN) model, for extracting features from an analyzed frame.
A Recurrent Neural Network (RNN) or a Transformer-based model (a transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data), for localizing the detection region within the analyzed frame.
A regression model, for determining (computing) the boundaries of each advertisement.
A refinement model, for improving the accuracy of the computed boundaries.
CNN model
CNNs are powerful image-processing artificial intelligence (AI) models that use deep learning to perform both generative and descriptive tasks, often in machine vision applications that include image and video recognition, along with recommender systems and Natural Language Processing (NLP). This neural network computational model uses a variation of multilayer perceptrons (a perceptron is a simple model of a biological neuron in an artificial neural network) and contains one or more convolutional layers that can be either entirely connected or pooled. These convolutional layers create feature maps that record a region of the image, which is ultimately broken into rectangles and sent out for nonlinear processing. CNNs have their "neurons" arranged more like those of the visual cortex (located in the occipital lobe), the area responsible for processing visual stimuli in humans and other animals. The layers of neurons are arranged in such a way as to cover the entire visual field, avoiding the piecemeal image processing problem of traditional neural networks. A CNN uses a system much like a multilayer perceptron that has been designed for reduced processing requirements. The layers of a CNN consist of an input layer, an output layer and hidden layers that include multiple convolutional layers, pooling layers, fully connected layers and normalization layers. CNNs achieve very high accuracy in image recognition problems and can automatically detect the important features without any human supervision. Therefore, the CNN model is very effective for extracting features from an analyzed frame.
Generally, feature extraction is a part of the dimensionality reduction process, in which, an initial set of the raw data is divided and reduced to more manageable groups. The most important characteristic of these large data sets is that they have a large number of variables. These variables require a lot of computing resources to process. So, feature extraction helps to get the best feature from those big data sets by selecting and combining variables into features, thereby, effectively reducing the amount of data. These features are easy to process, but still able to describe the actual data set with accuracy and originality. The technique of extracting the features is useful when there is a large data set and it is required to reduce the number of resources without losing any important or relevant information. Feature extraction helps to reduce the amount of redundant data from the data set. The reduction of the data helps to build the model with less machine effort and also increases the speed of learning and generalization steps in the machine learning process.
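As a toy illustration of how a single convolutional layer turns an image into a feature map, the sketch below slides a small kernel over a grayscale image. The function name `conv2d_valid` and the hand-picked kernel are hypothetical; in a trained CNN the kernel weights are learned from data rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the feature map.

    As in most CNN libraries, the kernel is not flipped, so this is strictly
    a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Nonlinearity applied after the convolution: negative responses become 0."""
    return [[max(0, v) for v in row] for row in feature_map]
```

For an image containing a vertical edge and the kernel `[[-1, 1]]`, the feature map is non-zero only at the column where the intensity changes, which is the kind of localized response that later layers combine into higher-level features.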
RNN model
The RNN model is used for localizing the detection region within the analyzed frame. A Recurrent Neural Network (RNN) is a type of artificial neural network which uses sequential data or time series data. These deep learning algorithms are commonly used for ordinal or temporal problems, such as language translation, natural language processing (NLP), speech recognition, and image captioning; they are incorporated into popular applications such as Siri, voice search, and Google Translate. Like feedforward and Convolutional Neural Networks (CNNs), Recurrent Neural Networks utilize training data to learn. They are distinguished by their "memory", as they take information from prior inputs to influence the current input and output. While traditional deep neural networks assume that inputs and outputs are independent of each other, the output of a recurrent neural network depends on the prior elements within the sequence. While future events would also be helpful in determining the output of a given sequence, unidirectional recurrent neural networks cannot account for these events in their predictions. An RNN saves the output of processing nodes and feeds the result back into the model, thereby learning to predict the outcome of a layer. Each node in the RNN model acts as a memory cell, continuing the computation and implementation of operations. If the network's prediction is incorrect, then the system self-learns and continues working towards the correct prediction during backpropagation. An RNN retains information from earlier inputs through time, and this ability to remember previous inputs is what makes it useful in time series prediction. A widely used RNN variant designed to retain such information over long sequences is the Long Short-Term Memory (LSTM) network. RNNs are even used with convolutional layers to extend the effective pixel neighborhood. By doing so, the system proposed by the present invention actually performs continuous training.
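The "memory" described above can be made concrete with the update rule of a plain recurrent cell, h_t = tanh(W_x x_t + W_h h_(t-1) + b). The sketch below is a generic illustration, not the specific localization network of the invention. The names `rnn_step` and `run_rnn` and all weight values are hypothetical.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent update: h_t = tanh(W_x x_t + W_h h_(t-1) + b)."""
    h_next = []
    for i in range(len(b)):
        from_input = sum(w_x[i][j] * x[j] for j in range(len(x)))
        from_state = sum(w_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_next.append(math.tanh(from_input + from_state + b[i]))
    return h_next


def run_rnn(sequence, w_x, w_h, b):
    """Feed a sequence through the cell, starting from an all-zero state."""
    h = [0.0] * len(b)
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states
```

If the second input of a sequence is zero, the second state is still non-zero because it depends on the first state, which is how earlier elements influence later outputs.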
Regression Model
A regression model is a statistical model that estimates the relationship between one dependent variable and one or more independent variables using a line (or a plane in the case of two or more independent variables). A regression model can be used when the dependent variable is quantitative, except in the case of logistic regression, where the dependent variable is binary. The present invention uses, for example, a quadrilateral regression model (or other polygonal shapes), for computing the boundaries of each advertisement.
Refinement model
A refinement model is used for improving the accuracy of the boundaries computed by the regression model, by using a gradient search algorithm for detecting the maximal visible area polygon that corresponds to the boundaries that have been computed by the quadrilateral regression model.
Fig. 1 illustrates a flowchart of the processing steps performed by the deep learning system, implementing a measurement module, according to an embodiment of the invention. At the first step, a frame to be analyzed 101 is read from the frame buffer and fed into the CNN model 102. At the next step, the CNN model extracts features from the analyzed frame 101. At the next step, the features of the analyzed frame 101 (that have been extracted by the CNN model) are fed into a Recurrent Neural Network (RNN) model 103 that localizes the detection region within the analyzed frame 101. At the next step, the boundaries of each advertisement in the analyzed frame 101 are computed by a regression model 104, based on the data of the localized detection region. At the next step, the accuracy of the computed boundaries is improved using a refinement model. At the next step, the computed boundaries are obtained. In this example, four advertisement polygons 106a-106d (marked by solid red lines) were detected in the analyzed frame 101.
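Once the boundaries of an advertisement are available as a polygon in screen coordinates, its screen-space size can be estimated from the polygon area. The sketch below uses the shoelace formula, which is a standard geometric technique and an assumption here, since the specification does not say how the pixel count is obtained. The names `polygon_area` and `screen_coverage` are hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon given as [(x, y), ...] in order (shoelace formula)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def screen_coverage(vertices, screen_width, screen_height):
    """Fraction of the screen covered by the polygon (occlusion is ignored)."""
    return polygon_area(vertices) / (screen_width * screen_height)
```

This estimate treats the whole polygon as visible; handling partially hidden advertisements would require the visible-area polygon produced by the refinement step.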
In order to accelerate the detection process and compute the homography transformation (planar homography is a transformation that occurs between two planes, i.e., a mapping between two planar projections of an image. An element in one image has its projection in the other image in a homogeneous coordinate plane, retaining the same information but in a transformed perspective) among the detected objects across the frames, temporal coherency among consecutive frames is utilized. Pixel level accuracy may not be required for many measurements and in this case, applying the two main components, CNN and RNN, is enough to provide such information.
In order to reduce the time required for ads detection, a tracking model may be applied, which includes the following components:
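A planar homography is commonly represented as a 3x3 matrix acting on homogeneous coordinates. The sketch below shows how such a matrix maps a point from one frame to the next. The function name `apply_homography` and the example matrices are hypothetical, and estimating the matrix from point correspondences is not shown.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography given as a nested list.

    The point (x, y) is lifted to homogeneous coordinates (x, y, 1),
    multiplied by h, and divided by the resulting third component.
    """
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)
```

A matrix whose last row is (0, 0, 1) describes an affine change such as a translation, while non-zero entries in the last row introduce the perspective foreshortening seen when an advertisement is viewed at an angle.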
1. An adaptive correlation module - is used for detecting the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame. This eliminates the need to perform a new search.
2. A key-point correspondence module - is used to calculate the location of key-points in a current frame with respect to the preceding frame.
3. A homography transformation calculation module - is used for determining the orientation of the detected advertisement (and changes in that orientation), based on the results of the adaptive correlation module and the key-point correspondence module.
The process carried out by the tracking module allows detecting how an advertisement moves from frame to frame and what was the change in orientation, as shown in Fig. 2. At the first step 201, the adaptive correlation module (which consists of correlation filters) detects the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame. At the next step 202, the key-point correspondence module calculates the location of key-points in a current frame with respect to the preceding frame, based on the results calculated in step 201. At the next step 203, the homography transformation calculation module determines the orientation of the detected advertisement, based on the results calculated in step 202. For simple measurement purposes, the tracking process may be eliminated. However, the tracking process is used to model orientation and estimate view direction, when needed.
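The displacement detection of step 201 can be illustrated with a much simpler stand-in for adaptive correlation filters: an exhaustive search for the shift that minimizes the sum of squared differences between the advertisement patch in the preceding frame and the current frame. This swapped-in technique is plain template matching, not the adaptive filter of the invention. The function name `find_displacement` and the `search` radius are hypothetical.

```python
def find_displacement(prev_frame, curr_frame, box, search=2):
    """Return the (dx, dy) shift of the patch `box` between two frames.

    `box` is (x, y, width, height) of the advertisement in `prev_frame`.
    Frames are lists of rows of grayscale values. Shifts of up to `search`
    pixels in each direction are tried, and the one with the smallest sum
    of squared differences wins.
    """
    x0, y0, w, h = box
    template = [row[x0:x0 + w] for row in prev_frame[y0:y0 + h]]
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = x0 + dx, y0 + dy
            # Skip shifts that would move the patch outside the frame.
            if x < 0 or y < 0 or y + h > len(curr_frame) or x + w > len(curr_frame[0]):
                continue
            ssd = sum((curr_frame[y + i][x + j] - template[i][j]) ** 2
                      for i in range(h) for j in range(w))
            if best is None or ssd < best[0]:
                best = (ssd, dx, dy)
    return best[1], best[2]
```

Because only a small neighborhood around the previous location is examined, the cost per frame is far lower than running full detection again, which is the motivation for tracking given above.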
In a typical scenario, the Ad server sends a set of advertisements to the workstation, which embeds them within the game. A unique ID is assigned by the Ad server to each advertisement and the measurements module transfers exposure parameters, such as the number of frames showing a particular advertisement, the screen size of that particular advertisement and its duration, to the Ad server. The exposure duration is measured as the time that elapsed between the first and the last frames in which that particular advertisement has been displayed. Computing the exposure parameters may be performed by the measurement module on the workstation that runs the game. Alternatively, the measurement module sends the frames to be analyzed to a remote cloud server, which receives a frame or a set of frames that are compressed or down-sampled (down-sampling is the process of reducing the sampling rate of a signal, to thereby reduce the data rate or the size of the data) by the measurement module, and then performs the frame analysis externally to the workstation.
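The bookkeeping described above can be sketched as a small per-advertisement record keyed by the unique ID. The class names `ExposureRecord` and `ExposureLog`, the field names and the report layout are hypothetical, and the actual format exchanged with the Ad server is not specified in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ExposureRecord:
    ad_id: str
    frame_count: int = 0
    first_seen: Optional[float] = None
    last_seen: Optional[float] = None
    total_pixels: float = 0.0

    def observe(self, timestamp: float, pixel_area: float) -> None:
        """Register one analyzed frame in which the advertisement was detected."""
        if self.first_seen is None:
            self.first_seen = timestamp
        self.last_seen = timestamp
        self.frame_count += 1
        self.total_pixels += pixel_area

    @property
    def duration(self) -> float:
        """Time elapsed between the first and the last frame showing the ad."""
        if self.first_seen is None:
            return 0.0
        return self.last_seen - self.first_seen


class ExposureLog:
    def __init__(self) -> None:
        self.records: Dict[str, ExposureRecord] = {}

    def observe(self, ad_id: str, timestamp: float, pixel_area: float) -> None:
        record = self.records.setdefault(ad_id, ExposureRecord(ad_id))
        record.observe(timestamp, pixel_area)

    def report(self) -> Dict[str, dict]:
        """Summary per advertisement ID, as might be sent to the Ad server."""
        return {
            ad_id: {
                "frames": r.frame_count,
                "duration": r.duration,
                "mean_pixels": r.total_pixels / r.frame_count,
            }
            for ad_id, r in self.records.items()
        }
```

Each call to `ExposureLog.observe` corresponds to one detection of an advertisement in one analyzed frame, with the timestamp in seconds and the area in pixels.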
The above examples and description have of course been provided only for the purpose of illustrations, and are not intended to limit the invention in any way. As will be appreciated by the skilled person, the invention can be carried out in a great variety of ways, employing more than one technique from those described above, all without exceeding the scope of the invention.

Claims

1. A method for measuring the level of exposure of players to advertisements displayed in computer games, comprising: a) copying, at a predetermined rate, frames to be analyzed, from a frame buffer of a workstation executing said computer game, into a memory; b) processing and analyzing the copied frames by an independent deep learning application that runs in parallel to the game application, said deep learning application being adapted to: c) extract features from each analyzed frame using a Convolutional Neural Network (CNN) model; and d) localize the detection region of advertisements within said each analyzed frame using a Recurrent Neural Network (RNN) model.
2. A method according to claim 1, wherein the deep learning application is further adapted to: a) compute the boundaries of each detected advertisement using a quadrilateral regression model; b) improve the accuracy of the computed boundaries using a refinement model.
3. A method according to claim 1, wherein the computer game is a 3D game.
4. A method according to claim 1, wherein the level of exposure is determined by the exposure time, the view angle and the number of pixels occupied by each advertisement in the screen space of the computer game.
5. A method according to claim 1, wherein the rate of copying the frame buffer into the memory depends on the rate of the game application.
6. A method according to claim 1, wherein the copied frames from the frame buffer may be stored in the memory at lower resolution, compared to the game resolution.
7. A method according to claim 1, wherein the deep learning models are pre-trained on an appropriate advertisement dataset and are updated regularly, using a cloud-based training process, during which additional new advertisements are continuously added to the dataset.
8. A method according to claim 1, further comprising measuring changes in the orientation of the advertisement by computing a homography transformation among the detected objects across consecutive frames.
9. A method according to claim 1, further comprising using a tracking model to reduce the time required for ads detection, said tracking model includes: a) an adaptive correlation module for detecting the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame; b) a Key-point correspondence module for calculating the location of key-points in a current frame with respect to the preceding frame; and c) a homography transformation calculation module for determining the orientation of the detected advertisement, based on the results of said adaptive correlation module and said key-point correspondence module.
10. A method according to claim 1, further comprising: a) assigning a unique ID to each advertisement; and b) measuring, by the measurements module, exposure parameters that correspond to said advertisement.
11. A method according to claim 1, wherein the exposure parameters include one or more of the following: the number of frames showing a particular advertisement; the screen size of said particular advertisement and its orientation.
12. A method according to claim 1, wherein the exposure parameters may be computed by the measurement module on the workstation that runs the game.
13. A method according to claim 1, wherein the exposure parameters may be computed by sending the frames to be analyzed to a remote server, which receives a frame or a set of frames that are compressed or down-sampled by the measurement module, and performs the frame analysis externally to the workstation.
14. A system for measuring the level of exposure of players to advertisements displayed in computer games, comprising: a) a workstation comprising at least one processor, for executing said computer game; b) a software module installed on said workstation being adapted to copy, at a predetermined rate, frames to be analyzed, from a frame buffer of said workstation, into a memory; c) a computerized device comprising at least one processor, for: processing and analyzing the copied frames by an independent deep learning application that runs in parallel to the game application, said deep learning application being adapted to: extract features from each analyzed frame using a Convolutional Neural Network (CNN) model; and localize the detection region of advertisements within said each analyzed frame using a Recurrent Neural Network (RNN) model.
15. A system according to claim 14, in which the deep learning application is further adapted to: a) compute the boundaries of each detected advertisement using a quadrilateral regression model; and b) improve the accuracy of the computed boundaries using a refinement model.
16. A system according to claim 14, in which the computer game is a 3D game.
17. A system according to claim 14, in which the level of exposure is determined by the exposure time, the view angle and the number of pixels occupied by each advertisement in the screen space of the computer game.
18. A system according to claim 14, in which the rate of copying the frame buffer into the memory depends on the rate of the game application.
19. A system according to claim 14, in which the copied frames from the frame buffer are stored in the memory at lower resolution, compared to the game resolution.
20. A system according to claim 14, in which the deep learning models are pre-trained on an appropriate advertisement dataset and are updated regularly, using a cloud-based training process, during which additional new advertisements are continuously added to the dataset.
21. A system according to claim 14, in which changes in the orientation of the advertisement are measured by computing a homography transformation among the detected objects across consecutive frames.
22. A system according to claim 14, in which a tracking model is used to reduce the time required for ads detection, said tracking model includes: a) an adaptive correlation module for detecting the displacement of an advertisement in a current analyzed frame, relative to the preceding analyzed frame; b) a key-point correspondence module for calculating the location of key-points in a current frame with respect to the preceding frame; and c) a homography transformation calculation module for determining the orientation of the detected advertisement, based on the results of said adaptive correlation module and said key-point correspondence module.
A system according to claim 14, in which a unique ID is assigned to each advertisement and used by the measurements module to measure exposure parameters that correspond to said advertisement.
23. A system according to claim 14, in which the exposure parameters include one or more of the following: the number of frames showing a particular advertisement; the screen size of said particular advertisement and its orientation.
24. A system according to claim 14, in which the exposure parameters may be computed by the measurement module on the workstation that runs the game.
25. A system according to claim 14, in which the exposure parameters may be computed by sending the frames to be analyzed to a remote server, which receives a frame or a set of frames that are compressed or down-sampled by the measurement module, and performs the frame analysis externally to the workstation.
26. A system according to claim 14, in which the memory is a shared memory.
PCT/IL2022/050316 2021-03-21 2022-03-21 System and method for measuring advertisements exposure in 3d computer games WO2022201151A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163163885P 2021-03-21 2021-03-21
US63/163,885 2021-03-21

Publications (1)

Publication Number Publication Date
WO2022201151A1 true WO2022201151A1 (en) 2022-09-29

Family

ID=83395244

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2022/050316 WO2022201151A1 (en) 2021-03-21 2022-03-21 System and method for measuring advertisements exposure in 3d computer games

Country Status (1)

Country Link
WO (1) WO2022201151A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180060906A1 (en) * 2016-08-26 2018-03-01 Minkonet Corporation Method of collecting advertisement exposure data of game video
CN111488487A (en) * 2020-03-20 2020-08-04 西南交通大学烟台新一代信息技术研究院 Advertisement detection method and detection system for all-media data
US20200380334A1 (en) * 2019-05-28 2020-12-03 Himax Technologies Limited Convolutional neural network method and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22774498

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE