GB2608991A - Content generation system and method - Google Patents


Info

Publication number
GB2608991A
GB2608991A
Authority
GB
United Kingdom
Prior art keywords
content
viewers
modification
operable
modifications
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2109909.8A
Other versions
GB202109909D0 (en)
Inventor
Chiara Monti Maria
Bradley Timothy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Interactive Entertainment Inc
Original Assignee
Sony Interactive Entertainment Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Interactive Entertainment Inc filed Critical Sony Interactive Entertainment Inc
Priority to GB2109909.8A priority Critical patent/GB2608991A/en
Publication of GB202109909D0 publication Critical patent/GB202109909D0/en
Publication of GB2608991A publication Critical patent/GB2608991A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233Processing of audio elementary streams
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/20Input arrangements for video game devices
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/45Controlling the progress of the video game
    • A63F13/46Computing the game score
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/85Providing additional services to players
    • A63F13/86Watching games played by other players
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • H04N21/4394Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44218Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/4508Management of client data or end-user data
    • H04N21/4532Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/50Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers
    • A63F2300/53Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers details of basic data processing
    • A63F2300/535Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers details of basic data processing for monitoring, e.g. of user parameters, terminal parameters, application parameters, network parameters


Abstract

A system for modifying interactive content associated with a user, the system comprising a content streaming unit to stream the interactive content to an audience via an online hosting platform, a monitoring unit to determine indicators of the engagement or mood of the audience of the content, a modification unit to determine modifications to the content in dependence upon the indicators and a content modification unit to modify the interactive content in accordance with the modifications. The monitoring unit may analyse audio, video, image, text or language information regarding the audience to generate the indicators. The indicators may be generated for each viewer or for groups of viewers. The modifications may include inserting, removing, or modifying elements. Said elements may be virtual terrain, non-user characters, virtual objects, user characteristics, or virtual structures.

Description

CONTENT GENERATION SYSTEM AND METHOD

BACKGROUND OF THE INVENTION
Field of the invention
This disclosure relates to a content generation system and method.
Description of the Prior Art
The "background" description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description which may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present invention.
In recent years there has been a significant increase in the prevalence of user-generated viewing content for video games and the like. Streaming platforms, such as Twitch® and YouTube®, as well as platform-specific streaming options (such as directly spectating friends playing a game on a separate games console) have made such content much more accessible in conjunction with the increasing availability of high-speed internet connections for users. As a result of this accessibility, there has been a significant increase in both the number of viewers and the number of content creators (often referred to as streamers in reference to the act of streaming content to viewers).
This represents an interesting challenge in the context of generating content for users; not only should content be appealing and engaging for a player, but it is also considered desirable that the content is appealing and engaging for spectators. This may lead to an increased consideration of the visual appeal of content, the interactivity, events that occur within the content, and the mechanics present in the content. For example, viewers may be more concerned with the visual appeal of the content than the player is, as they are not occupied with actually playing the game. Similarly, increasing levels of interactivity (such as increasing player counts) may also be important, as this can enable a spectator to play against a streamer in-game. An example of this is the increasing popularity of 'battle royale'-style games which can feature as many as a hundred players in a single game. Equally, events and game mechanics may also be developed with a view to what will appeal to spectators rather than just the players themselves.
Of course, such an increased number of considerations can complicate the content generation process significantly; this can place a substantial burden upon developers. Not only is the process complicated, but the achievement of satisfactory outcomes (generating suitable content) can be more challenging in the face of balancing potentially conflicting design aims (such as when the interests of the player and the spectator diverge).
It is in the context of the above discussion that the present disclosure arises.

SUMMARY OF THE INVENTION
This disclosure is defined by claim 1.
Further respective aspects and features of the disclosure are defined in the appended claims.
It is to be understood that both the foregoing general description of the invention and the following detailed description are exemplary, but are not restrictive, of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete appreciation of the disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:

Figure 1 schematically illustrates a hardware arrangement for implementing one or more embodiments of the present disclosure;
Figure 2 schematically illustrates an exemplary arrangement for modifying content;
Figure 3 schematically illustrates a content modification method;
Figure 4 schematically illustrates an exemplary training process for a machine learning model;
Figure 5 schematically illustrates a system for modifying interactive content associated with a user; and
Figure 6 schematically illustrates a method for modifying interactive content associated with a user.

DESCRIPTION OF THE EMBODIMENTS

Referring now to the drawings, wherein like reference numerals designate identical or corresponding parts throughout the several views, embodiments of the present disclosure are described.
Figure 1 schematically illustrates a hardware arrangement for implementing one or more embodiments of the present disclosure. The hardware system 100 includes a content source 110, a server 120, and viewer devices 130, 140, and 150; of course, this represents a simplified example of such a system and in practice it is considered that the number of users, servers, and viewer devices may be selected freely rather than being limited in accordance with Figure 1.
The content source 110 may be any device which is capable of receiving inputs from a user and performing processing (such as control of an application or gameplay) of content in dependence upon those inputs; for instance, the content source 110 may be a games console. In other embodiments, the content source may be a personal computer, mobile phone, or a portable games console, for example. In a number of embodiments the content source 110 may be provided as a distributed system; this may include an arrangement in which the content source 110 includes a server in a cloud-based gaming arrangement, with a user able to provide inputs using a controller that communicates with the server directly or via another device (such as a games console).
The content source 110 is also operable to generate a video stream comprising at least a portion of the content being interacted with by the user. The generated video stream may comprise the display that the user sees when playing a game, for instance, and may be supplemented with one or more additional visual elements (such as identifying information for the player). In some cases the transmitted video may only comprise a portion of the user's display; for example, HUD elements may be omitted, or the user may be using a multi-display setup and only transmit video corresponding to a single display.
Alternatively, or in addition, the content source 110 may be configured to output information that enables the generation of a video stream representing the user's gameplay or the like. For instance, rather than outputting video data the content source may output information about key presses or events within the games that can be used to reproduce the in-game environment and actions.
The server 120 is provided to act as an intermediary for the streaming process; however, this is not required in all embodiments of the present disclosure as in some cases spectators may be able to obtain the video content directly from the user. The server 120 may be associated with a gaming platform or a video hosting service, for example. The server 120 may be configured to be able to communicate with one or more devices via a network connection (such as the internet). The server 120 may be configured to provide additional functionality (other than video content distribution); examples include interactive features such as chat rooms and reaction interactions enabling a viewer (and in some cases the user) to provide inputs in response to the content. These may also enable viewers (and in some cases the user) to interact with one another while viewing the content.
The system 100 is shown as comprising three viewer devices 130, 140, and 150. The first viewer device 130 is a laptop computer for viewing content, while the second viewer device 140 is a mobile phone and the third viewer device 150 is a games console associated with a display device 160 (such as a television). The viewer devices may be selected freely rather than being limited to the examples shown in this Figure; it is required only that the viewer devices are capable of displaying content and receiving one or more inputs from associated viewers. Each of the viewer devices is configured to receive inputs from one or more associated input devices; examples include mice, keyboards, microphones, cameras, biometric sensors, hardware motion detectors (such as gyroscopes and accelerometers) associated with a controller or the viewer device, and gamepads.
The system 100 of Figure 1 is therefore an example of a system in which a user is able to provide inputs to control processing (such as interacting with a game), generate video content corresponding to that processing (such as a gameplay stream), transmit the video content to one or more viewers, and receive one or more inputs from viewers of the video content. An exemplary use case is that of a player of a game streaming a video of their gameplay to one or more viewers, a number of which are able to provide feedback or otherwise interact with the user and/or other viewers.
Such arrangements enable viewers to watch the gameplay of a player, generally for entertainment purposes (although other purposes, such as educational, are also considered). It is therefore considered that it is beneficial to generate video content that is entertaining. While some players address this themselves, by providing a voiceover or other supplemental content to entertain viewers, it may be considered advantageous that the game (or other content) is itself more entertaining for viewers.
However, as discussed above, this means that game designers are often faced with two different and unpredictable sets of constraints on the design of the game: entertaining the viewers, and providing an enjoyable experience for a user. This creates a significant burden upon the designers.
Figure 2 schematically illustrates an exemplary arrangement for providing content that is considered more engaging for viewers. This arrangement comprises an input reception unit 200, an input analysis unit 210, and an output generation unit 220. In the following discussion it is considered that the arrangement of Figure 2 is embodied in the server 120 of Figure 1, but this should not be regarded as limiting. It is instead considered that the functionality described here may be provided at any device (such as the content source 110) or distributed amongst a number of devices as appropriate.
The input reception unit 200 is operable to receive one or more inputs from a number of viewers and optionally the user of the content source 110. These inputs may be provided in any suitable format; button presses, text, images, biometric information, and audio are all examples of suitable types of information that may be used as inputs. In some embodiments, these inputs may be received via a chat room function associated with the content (for example), and/or directly from users (such that no public interaction is necessary).
The input analysis unit 210 is operable to analyse one or more of the received inputs to generate a representation of those inputs. This representation may comprise one or more indicators of the engagement and/or mood of one or more viewers, for example. For instance, this may comprise performing an interpretation process so as to identify a meaning of an input (such as a text analysis or speech recognition process), and/or otherwise associating a tag or label with the content. An example of such a label may be an emotional evaluation of the input (or a series of inputs), or an engagement rating that indicates how engaged a viewer associated with the inputs is.
The input analysis process may be performed for each input individually, a set of inputs associated with a single viewer, and/or for inputs associated with multiple users. Similarly, inputs may be grouped on a per-type basis (such as considering voice inputs separately to text inputs). When considering more than one input at a time, the input analysis unit 210 may be configured to apply a weighting to the analysis in dependence upon the identity of a user associated with the input and/or the type of input, for example.
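By way of a non-limiting illustration, the weighting of inputs by type and by associated viewer described above could be sketched as follows. The weight values, the notion of "trusted" viewers, and the function names are hypothetical choices for this sketch and are not drawn from the disclosure:

```python
# Hypothetical sketch: combine per-input engagement scores using
# weights that depend on the input type and the viewer's identity.
TYPE_WEIGHTS = {"voice": 1.5, "text": 1.0, "image": 1.2}  # assumed values
TRUSTED_VIEWERS = {"moderator_1"}                          # assumed set

def weighted_engagement(inputs):
    """inputs: list of (viewer_id, input_type, score) tuples.

    Returns a single weighted-average engagement indicator."""
    total, weight_sum = 0.0, 0.0
    for viewer_id, input_type, score in inputs:
        w = TYPE_WEIGHTS.get(input_type, 1.0)
        if viewer_id in TRUSTED_VIEWERS:
            w *= 2.0  # trusted viewers count double (illustrative choice)
        total += w * score
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```

Here a single voice input from a trusted viewer outweighs a text input from an unknown viewer, reflecting the per-type and per-identity weighting referred to above.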
The analysis of the inputs may be performed using existing processes, such as natural language processing or speech-to-text applications.
In some embodiments, the analysis may also (or instead) consider the pattern and/or format of inputs rather than their content. For example, an interaction rate may be determined that indicates viewer engagement (for example, measuring the number of inputs over a period of time). Similarly, the format of inputs may be considered relevant in some cases; the use of images as inputs may indicate higher excitement than text inputs, for instance, as the viewer may select faster inputs (a single image versus a number of typed words) when excited. The change of inputs and/or their content over time may also be considered as a part of the analysis process; this may assist with determining average interactions for a group of viewers and detecting changes over time (such as increasing boredom).
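A minimal sketch of such a rate-based measure might compare the input rate in the most recent time window against the preceding one to detect declining engagement; the window length and function names here are assumptions for illustration:

```python
# Hypothetical sketch: estimate engagement from the rate of viewer
# inputs over a sliding time window, and flag a falling trend.
def interaction_rate(timestamps, now, window=60.0):
    """Inputs per second over the last `window` seconds before `now`."""
    recent = [t for t in timestamps if now - window <= t <= now]
    return len(recent) / window

def is_declining(timestamps, now, window=60.0):
    """True if the most recent window is quieter than the one before it."""
    current = interaction_rate(timestamps, now, window)
    previous = interaction_rate(timestamps, now - window, window)
    return current < previous
```

A falling rate across successive windows could then be taken as one of the indicators of increasing boredom discussed above.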
As noted above, the input analysis unit 210 may be configured to consider inputs on a per-group scale rather than a per-viewer scale. In such cases, a more statistical approach to the determination of attributes representing those viewers may be considered. For example, a generated representation can be based upon an average of all inputs for that group, based upon a modal approach (that is, identifying the most common inputs), and/or based upon larger-scale metrics such as the number of inputs per unit time for the group as a whole.
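The group-level statistics mentioned above (an average of indicators, a modal label, and larger-scale metrics such as inputs per unit time) could be sketched as follows; the record format and function name are illustrative assumptions:

```python
from statistics import mean, mode

# Hypothetical sketch of the group-level summaries described above:
# an average of numeric indicators, the modal (most common) label,
# and the group's overall input rate per unit time.
def summarise_group(records, duration_s):
    """records: list of (score, label) pairs for one viewer group."""
    scores = [score for score, _ in records]
    labels = [label for _, label in records]
    return {
        "mean_score": mean(scores),
        "modal_label": mode(labels),
        "inputs_per_second": len(records) / duration_s,
    }
```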
The output generation unit 220 is operable to generate one or more outputs that are used to modify the content that is being viewed. This may be implemented in any of a number of ways, a selection of which is discussed below in more detail. The function of the output generation unit 220 is to generate an output in dependence upon the received inputs and any analysis of those inputs as performed by the input analysis unit 210. For instance, when inputs are identified as indicating a lack of engagement of a number of viewers an output may be generated that causes modification of a game that is being watched by those viewers so as to increase viewer engagement.
The arrangement of Figure 2 is considered advantageous in that it enables content to be modified in a responsive manner so as to provide a desired experience in an efficient and effective manner. In other words, content may be adapted in response to inputs during playback, which can simplify the development process for the content; such a feature enables content to be developed which does not need to anticipate future reactions. This can further result in content that is reduced in complexity and storage size, as the number of eventualities that are required to be anticipated is reduced, thereby increasing the efficiency of the distribution of the content without compromising on its quality.

Figure 3 schematically illustrates an example of the use of such an arrangement. This example is based upon the use of text inputs from viewers, although the use of other forms of input is also anticipated, as noted above.
At a step 300, text inputs are received from one or more viewers. For instance, a monitoring of a chat room associated with the viewed content may be performed or one or more direct messages to the player being spectated may be used as an input.
At a step 310, the received text inputs are interpreted so as to enable a meaning to be derived. For example, a translation process may be performed so as to ensure that all of the received inputs are in the same language. A natural language processing algorithm may be considered to be a suitable text analysis process; such processing can be used to interpret text to derive a meaning. This meaning may be determined on a per-word, per-phrase, per-sentence, and/or per-input basis, for example. Alternatively, or in addition, multiple inputs may be considered in combination; for instance, grouped on a per-viewer basis, a temporal basis, and/or any other grouping of inputs.
At a step 320, one or more attributes of the inputs, viewers, and/or the content being viewed are determined; for example, the interpretation of the text inputs can be used to determine viewer engagement. For instance, this can include generating a label or other descriptor for one or more of the inputs (or any subset of those inputs, such as particular words or phrases). An example of this is applying a label to text inputs based upon the appearance of keywords, such as an attribute of 'bored' being applied to a message if the word 'dull' is detected. Such an attribute may also (or instead) be assigned to the corresponding viewer for that message (indicating that the user is bored), or to the content being viewed (indicating that the content is boring, or that a particular number of viewers are bored).
Rather than being an emotion-based attribute, the attribute can also (or instead) determine engagement or any other quantity. For instance, if an analysis of the text reveals that there are no keywords associated with the content being viewed, then it is considered that there is off-topic conversation and therefore it can be inferred that the viewers are not particularly engaged with the content.
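The keyword-based labelling of steps 310 and 320 above, including the off-topic inference, could be sketched along the following lines. The keyword lists and label names are purely illustrative, not taken from the disclosure:

```python
# Hypothetical sketch of keyword-based attribute labelling: map
# detected keywords to emotion labels, and infer disengagement
# when a message contains no content-related keywords at all.
KEYWORD_LABELS = {
    "dull": "bored", "boring": "bored",
    "awesome": "excited", "wow": "excited",
}
ON_TOPIC_KEYWORDS = {"boss", "loot", "level", "enemy"}  # assumed game terms

def label_message(text):
    """Return the set of attribute labels for one text input."""
    words = text.lower().split()
    labels = {KEYWORD_LABELS[w] for w in words if w in KEYWORD_LABELS}
    on_topic = any(w in ON_TOPIC_KEYWORDS for w in words)
    if not on_topic and not labels:
        labels.add("disengaged")  # off-topic chat implies low engagement
    return labels
```

As in the text above, a message containing 'dull' would be labelled 'bored', while wholly off-topic chat would be treated as a sign of disengagement.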
These attributes may be determined on an individual (per-viewer) basis or on a per-group basis; this may be decided on a per-attribute basis. For instance, 'engagement' may be determined on a per-group basis while 'excitement' is determined on a per-viewer basis. These groups may comprise the entirety of the viewer base (or at least a representative sample, such as a particular percentage of users selected at random or based upon a metric such as viewer profile attributes or length of time in a chatroom), or any other division of the viewers into groups. An example of such a division for groups is viewer age, in view of the fact that older and younger viewers may have differing opinions as to what constitutes engaging content. Equally, any other demographic information or viewer profile information (including whether a viewer follows or is friends with the content streamer) may be used to divide the viewers into groups. Each of these groups (and/or viewers) may have an associated weighting that is applied to any determination of an overall attribute of the viewers (or a larger group of the viewers).
At a step 330, an output is generated in dependence upon the determined attributes; for instance, an output indicating that viewers are bored may be generated and provided to a content source device. This output is generated so as to be representative of one or more attributes of the group (or a number of sub-groups within the overall group) of viewers. For example, an overall engagement of the viewers may be generated as an output, or an excitement level of each of several sub-groups. Alternatively, the output may be generated so as to indicate a change to the content that should be made; for instance, the output may comprise an instruction to 'increase viewer engagement', 'increase game speed', or 'spawn several enemies'.
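A sketch of this output-generation step, mapping aggregated attributes onto the kinds of instruction quoted above, might look as follows; the threshold values are illustrative assumptions:

```python
# Hypothetical sketch of step 330: turn aggregated viewer attributes
# into a directive for the content source. Thresholds are illustrative.
def generate_output(attributes):
    """attributes: dict such as {"engagement": 0.3, "excitement": 0.8}."""
    engagement = attributes.get("engagement", 1.0)
    excitement = attributes.get("excitement", 1.0)
    if engagement < 0.4:
        return "increase viewer engagement"
    if excitement < 0.4:
        return "spawn several enemies"
    return "no change"
```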
At a step 340 the viewed content is modified in response to the generated output; for example, the content may be modified to cause a boss fight to occur so as to reduce viewer boredom as determined by the text analysis. In some embodiments, this requires a translation of the generated output so as to determine a type of modification, while in others the generated output defines the modification.
Examples of such modifications include the spawning of enemies within a game, the providing of high-quality loot, increased difficulty, map/environment modification, non-controlled character motion (such as a random teleportation of a player's avatar), and/or modification of one or more parameters associated with a player's avatar (such as decreasing hit points, or increasing move speed).
As noted above, such a process may be modified freely to utilise different input formats. In some cases, this may comprise a modification to the process so as to be able to determine attributes or meanings from those inputs (such as a facial recognition process on received images of viewers to determine mood). Alternatively, or in addition, processing may be implemented that generates a text input from the received input from a viewer. For instance, a transcript of voice inputs may be generated, or image processing may be performed that generates a text description of the input image.
The form of the modifications may be determined in any suitable manner; the examples of modifications provided above should not be regarded as limiting upon the present disclosure. Any modification to the content may be considered as a candidate for modifying one or more attributes associated with one or more viewers of the content - these include visual modifications, interaction modifications, and gameplay modifications for example.
In some embodiments, the designer of a game may indicate a modification that is made in response to each attribute of the viewers - an example of this may be the use of a look-up table that indicates a correspondence between attributes and modifications. In some embodiments, a number of modifications may be associated with an attribute change - in such embodiments it is considered that any combination of one or more of the number of modifications may be implemented as appropriate.
This can improve the variability of the modifications, and thereby increase their effectiveness through a lack of predictability. Similarly, such a correspondence between attributes and modifications may be provided (or modified) by particular players, content hosts (such as video streaming platforms) and/or other services (such as an operating system associated with a games console).
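As a non-limiting illustration, such a look-up table with random selection from amongst the candidate modifications (improving variability, as noted above) might be sketched as follows; the table entries and modification names are assumptions for illustration only:

```python
import random

# Illustrative correspondence between a detected viewer attribute and
# candidate modifications; any one or more of the candidates may be applied.
MODIFICATION_TABLE = {
    'bored':      ['spawn_enemies', 'start_boss_fight', 'drop_rare_loot'],
    'excited':    ['maintain_pace'],
    'frustrated': ['reduce_difficulty', 'drop_healing_items'],
}

def choose_modifications(attribute, count=1):
    """Select up to `count` modifications at random for the given attribute."""
    candidates = MODIFICATION_TABLE.get(attribute, [])
    return random.sample(candidates, min(count, len(candidates)))
```

The random selection is one simple way to avoid the predictability that repeated use of a fixed attribute-to-modification mapping would introduce.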
Alternatively, or in addition, modifications may be determined using a trained machine learning model.
Such a model can be trained so as to determine an appropriate modification to apply to content in order to generate a desired change in one or more attributes associated with at least a subset of the viewers of the content. Here, the appropriateness of the modification is determined by its impact upon the attributes rather than a particular quality of the modification itself. In view of this, an appropriate modification may be determined in dependence upon the viewers, their associated attributes, and the content itself.
Figure 4 schematically illustrates an exemplary training process for a machine learning model. Any other suitable process may be used instead, and modifications to the exemplary process may be made as appropriate. The training of the machine learning model may be performed in a live or test environment in which viewers are watching content as it is being played. This can enable a direct feedback mechanism. Alternatively, or in addition, training may be performed in which recorded content and inputs are used as an input to the model (or the attributes associated with the viewers may be provided instead/in addition), with a supervisor to the process indicating whether a determined response is appropriate or not (alternatively, or in addition, the input data may be labelled with an appropriate response that can be used to determine the effectiveness of the output of the model).
A step 400 comprises analysing the attributes associated with viewers of the content; this step may be performed based upon an input to the model, or this step may comprise the generation of the attributes based upon received inputs and/or content. This analysis is performed so as to determine an appropriate modification for the content.
A step 410 comprises implementing a modification to the content. This may be a direct process (for example, if the model is associated with the content itself), or may comprise the issuing of one or more instructions and/or parameters to the content (or the device that is executing the content).
A step 420 comprises analysing a second set of attributes; this second set of attributes corresponds to the viewers of the content after the modification has been implemented in step 410. This second set of attributes can therefore be used to characterise a viewer response to the modification. This step may also comprise the identification of the attributes, for example by analysing one or more viewer inputs as has been discussed in more detail above.
A step 430 comprises evaluating the differences between the attributes in step 400 and the attributes in step 420. This enables a determination of the success of the modification - with success being measured by whether the difference in attributes corresponds to an intended change. For instance, if the initial attributes indicated viewer boredom, success would be indicated by the second set of attributes indicating a lower degree of boredom of the viewers. This success may be binary (that is, successful or unsuccessful modification), or it may be graduated in that a score is provided that scales in dependence upon the magnitude of the difference between the attributes.
A step 440 comprises the updating of the machine learning model in dependence upon the evaluation. This updating may include reinforcing certain modification/attribute correspondences where an evaluation indicates that the modification was successful, for example.
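The training loop of steps 400-440 could, for example, be sketched as a simple tabular reinforcement-style learner; the state, action, and reward definitions below are illustrative assumptions and not the disclosed implementation:

```python
import random
from collections import defaultdict

class ModificationAgent:
    """States are viewer attributes (e.g. a boredom label), actions are
    modifications, and the reward is the measured change in the attribute
    (steps 420-430), which is used to update the model (step 440)."""

    def __init__(self, actions, lr=0.1, epsilon=0.2):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.actions, self.lr, self.epsilon = actions, lr, epsilon

    def select(self, state):
        # Step 400/410: choose a modification for the current attributes,
        # occasionally exploring to diversify the modifications applied.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, score_before, score_after):
        # Steps 420-440: reward is the reduction in the undesirable
        # attribute (e.g. boredom); reinforce successful correspondences.
        reward = score_before - score_after
        key = (state, action)
        self.q[key] += self.lr * (reward - self.q[key])
```

A graduated success score, as described in step 430, maps naturally onto the scalar reward used here.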
As is apparent from consideration of the above example, a reinforcement learning approach may be particularly appropriate in embodiments of the present disclosure. In such embodiments, the evaluation of step 430 comprises the generation of a reward for the reinforcement learning agent in dependence upon the success (or lack of success) indicated by the change in attributes. However, it is also considered that alternative approaches to training a machine learning model may also be appropriate in embodiments of the present disclosure.
In some embodiments, a magnitude of the modification to be made is also determined (rather than only the type of the modification). For instance, in a case that an attribute (such as boredom) has an associated scale (for example, a score of one to ten indicating a level of boredom) the magnitude of the modification may be scaled correspondingly. For instance, if viewers are only a little bored (such as a two out of ten rating) then the modification may be minor - such as a slight increase in the number of enemies being faced in the content. Similarly, if the viewers are extremely bored (such as a nine out of ten rating) then a more significant modification may be made. An example of such a modification is that of increasing the difficulty level and spawning a challenging boss for the player to face in the content. Rather than a numerical value associated with an attribute, it is also considered that other gradations may be used - for instance, labels such as 'very bored', 'somewhat bored', and 'slightly bored'. The manner of conveying the distinction is not considered to be important, so long as it indicates different levels of an attribute value.
In some embodiments, the attributes identified or provided relate directly to particular characteristics such as engagement, boredom, or excitement. Each of these may be used or provided to a content modification system, or only the most significant (such as the value of the characteristic that has the greatest magnitude). Alternatively, or in addition, attributes may correspond to more general characteristics - examples include interaction rate, positivity, and responsiveness. These may then be used to infer a particular characteristic (such as boredom in the case that interaction rate and responsiveness are low, or excitement if all are high). In other words, the attributes that are used may be selected freely so long as it is possible to infer properties or characteristics of the viewers using those attributes.
Figure 5 schematically illustrates a system for modifying interactive content associated with a user. The system comprises a content streaming unit 500, a viewer monitoring unit 510, a modification generation unit 520, and a content modification unit 530. These units may be provided in a single processing device, such as a games console or server. Alternatively, the functionality of these units may be distributed amongst a number of different processing devices (such as games consoles and servers) as appropriate.
Examples of such arrangements are discussed in more detail below.
The content streaming unit 500 is operable to stream one or more images of the interactive content to one or more viewers. In some embodiments, the content streaming unit 500 is operable to stream the one or more images to an online platform for hosting content accessible to a plurality of viewers; the online platform may be a streaming portal associated with a particular game, games console, or operating system (for example), or may be a website that provides streaming and interaction (such as text chat) functionality. The content streaming unit 500 is operable to distribute images of the user's gameplay to an audience of viewers; this may include one or more overlays not associated with the content, and may not include all of the content (such as removing heads-up displays) in some embodiments. The content streaming unit 500 is operable to stream the modified content that is generated by the present arrangement, rather than being limited only to an initial streaming of content to which the modification is applied.
The viewer monitoring unit 510 is operable to determine one or more indicators of the engagement and/or mood of one or more of the viewers of the streamed content. In some embodiments the viewer monitoring unit 510 is operable to analyse one or more of audio, video, image, and text content to determine the one or more indicators; such examples are not considered limiting, and any combination of inputs may be used as the basis for such an analysis as discussed above. These indicators may include measurements such as the rate at which inputs are provided by a viewer (such as the rate at which text-based messages are sent) and/or determined characteristics of those inputs (such as a meaning of words used in the inputs).
The viewer monitoring unit 510 is operable in some embodiments to perform an analysis of language used by viewers to determine engagement and/or mood. For example, a natural language processing technique may be applied to identify the meaning of words or phrases that are provided as an input by viewers. Similar processes may also be applied for images, videos (including videos of the viewers themselves, for example including facial recognition of emotions), and audio inputs (which may include using a speech-to-text process).
In some embodiments, one or more of the indicators are determined on a per-viewer basis for each of a number of the one or more viewers. Alternatively, or in addition, in the case that there is a plurality of viewers one or more of the indicators may be determined on the basis of a group of viewers comprising at least a subset of the plurality of viewers.
The modification generation unit 520 is operable to determine one or more modifications to the content in dependence upon the determined indicators, wherein the modifications alter one or more aspects of the interactive content. In particular, the modification generation unit 520 may be operable to generate one or more modifications that are expected to modify the engagement and/or mood of one or more of the viewers. For instance, if the determined indicators suggest that viewers are bored then modifications may be made to the content so as to reduce the level of viewer boredom.
The modification generation unit 520 may be operable to determine modifications including one or more of inserting, removing, and/or modifying one or more elements, wherein the one or more elements include one or more of virtual terrain, non-user characters, virtual objects, user characters, and virtual structures. A correspondence between the modifications and a change in viewer attributes can be determined ahead of time (for example, by a human operator and/or a machine learning model) so as to assist in determining an appropriate modification. For instance, it may be determined that reducing a player's hit points in a game will increase viewer engagement or excitement while reducing the number of enemies present will increase boredom.
In some embodiments, the modification generation unit 520 is operable to determine the one or more modifications in further dependence upon viewer engagement statistics over time, a game state associated with the content, a state of a character controlled by the user, the context of the content, modification history, and/or an expected impact upon the game state associated with the content. For instance, the modification may be dependent upon a change of viewer engagement rather than an absolute determination (hence the consideration of engagement over time). A game state of the content could indicate that particular modifications are or are not appropriate - for instance, in a stealth mission reducing player hit points may make little difference to viewer engagement as this does not make a difference to the gameplay.
A state of the character controlled by the user can be the health of the character, for example; this may be relevant in determining an appropriate modification in a number of ways. In one example, it is considered that low health could mean that a modification is determined to be the dropping of rare loot rather than increasing game difficulty as a means of increasing viewer excitement - this can be seen as increasing viewer engagement without causing the player to struggle too much. The context of the content can be important in determining an appropriate modification, particularly in the sense of preserving a sense of immersion. For example, if a player is in an extremely difficult to access area it is inappropriate to spawn a large group of enemies as this would be seen as being unlikely. Instead, it may be determined that a single, higher-level enemy would be more appropriate as a modification. A modification history may also be useful to consider in diversifying the modifications that are applied, as a repetitive use of the same modification would be expected to have diminishing returns in its impact upon viewer attributes. The consideration of an expected impact on the game state is similar to the consideration of the game state, although it is more concerned with the change to the game state than the consistency. For instance, modifications that make it likely that a player will die in-game may be avoided as this can be unsatisfying for the player.
The content modification unit 530 is operable to modify the interactive content in accordance with the determined modifications. In line with the above, it is considered that the modifications can comprise the modification of one or more elements including terrain, characters, objects, and structures in a virtual environment. In some embodiments, the content modification unit 530 is operable to select from amongst a plurality of determined modifications in accordance with an input from the user.
In some embodiments, the modification generation unit 520 is operable to identify, independently of the content, a modification to be implemented while the content modification unit 530 is operable to determine a content-specific modification corresponding to the identified modification. This may be particularly useful in embodiments in which a system-level or server-based modification determination process is performed, with the content modification being performed by a specific application (such as a game) or by a separate device. Such arrangements may be advantageous in that the same process can be used for different types of content rather than being limited to a specific content. In such embodiments, the determined modifications may be less prescriptive than those generated by an in-application process; for instance, a modification may be 'spawn an enemy' rather than specifying a particular enemy, or 'increase excitement' with a game using this input to generate specific modifications to achieve the desired result of increasing excitement for viewers.
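The split between a content-independent instruction and its content-specific realisation could be sketched as follows; the instruction strings, the mapping, and the game state fields are assumptions for illustration:

```python
# A generic instruction (e.g. from a server or system-level process) is
# translated by the game itself into a content-specific modification.
GAME_SPECIFIC = {
    'spawn an enemy': lambda state: (
        {'spawn': 'goblin'} if state['level'] < 5 else {'spawn': 'dragon'}),
    'increase excitement': lambda state: (
        {'spawn': 'boss', 'music': 'intense'}),
}

def apply_generic(instruction, state):
    """Translate a content-independent instruction into concrete changes."""
    translate = GAME_SPECIFIC.get(instruction)
    return translate(state) if translate else {}
```

Because only the translation table is content-specific, the same upstream modification determination process can serve many different games.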
The arrangement of Figure 5 is an example of a processor (for example, a GPU and/or CPU located in a games console or any other computing device) that is operable to modify interactive content associated with a user, and in particular is operable to: stream one or more images of the interactive content to one or more viewers; determine one or more indicators of the engagement and/or mood of one or more of the viewers of the streamed content; determine one or more modifications to the content in dependence upon the determined indicators, wherein the modifications alter one or more aspects of the interactive content; and modify the interactive content in accordance with the determined modifications.
As noted above, rather than utilising a single processor the functionality of this arrangement may be provided using processors associated with a number of devices, such as a server and a games console.
It is considered that the above arrangement is able to determine an objective measure of how entertaining (for example) content is by considering the inputs of at least a representative sample of viewers. While an emotional response of a particular viewer is generally regarded as being subjective, by determining a response of a representative sample of viewers using gathered evidence (that is, no prediction or assumption is required to estimate a response) it is possible to objectively determine one or more properties of the content. In other words, if it is determined that at least a threshold number of viewers are bored by the content then it is considered that the content is objectively boring for those viewers. As the modification of the content is seen only by those viewers (and the streamer), it is considered irrelevant as to whether a wider audience would agree with that assessment - only the responses of the present viewers are considered relevant, which is a constraint that renders such a determination objective rather than subjective.
As noted above, there are numerous possible configurations for such an arrangement; a particular configuration may be adopted freely in dependence upon the requirements of a particular implementation of the content streaming arrangement.
In a first example, the viewer monitoring is performed by a server with the determined indicators being provided to a games console that is executing a game that is being streamed to the viewers. The games console (at a system or application level) can then use the determined indicators to determine an appropriate modification to be made to the game.
A second example is that of a cloud gaming arrangement, in which each of the steps is performed by a server that is executing a game remotely to the player that is streaming (or a separate server that is in communication with the server executing the gameplay).
A third example is that in which a number of the indicators are generated by the viewer devices for transmission to a streaming device or a server for generating modifications. In some embodiments, only these viewer-specific indicators may be used; alternatively, or in addition, additional processing may be performed to identify further indicators or to generate an average (or other representation) of those indicators. This processing at the viewer devices may be performed in addition to (or instead of) processing of the inputs by a server in some embodiments, which may decrease the burden upon the server.
Figure 6 schematically illustrates a method for modifying interactive content associated with a user.
A step 600 comprises streaming one or more images of the interactive content to one or more viewers, either directly or via a video or game server. This can include both initial streamed content that is viewed and streaming modified content (that is, content as modified by the present method).
A step 610 comprises determining one or more indicators of the engagement and/or mood of one or more of the viewers of the streamed content. These indicators may include measurements such as the rate at which inputs are provided by a viewer (such as the rate at which text-based messages are sent) and/or determined characteristics of those inputs (such as a meaning of words used in the inputs).
A step 620 comprises determining one or more modifications to the content in dependence upon the determined indicators, wherein the modifications alter one or more aspects of the interactive content. Such modifications may include one or more of inserting, removing, and/or modifying one or more elements of the content, wherein the one or more elements include one or more of virtual terrain, non-user characters, virtual objects, user characters, and virtual structures.
A step 630 comprises modifying the interactive content in accordance with the determined modifications. As noted above, the modified content (and a user's interaction with the modified content) may be streamed to the viewers; this modified content may form the basis of additional content modification processing.
The techniques described above may be implemented in hardware, software or combinations of the two. In the case that a software-controlled data processing apparatus is employed to implement one or more features of the embodiments, it will be appreciated that such software, and a storage or transmission medium such as a non-transitory machine-readable storage medium by which such software is provided, are also considered as embodiments of the disclosure.
Thus, the foregoing discussion discloses and describes merely exemplary embodiments of the present invention. As will be understood by those skilled in the art, the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting of the scope of the invention, as well as other claims. The disclosure, including any readily discernible variants of the teachings herein, defines, in part, the scope of the foregoing claim terminology such that no inventive subject matter is dedicated to the public.

Claims (15)

CLAIMS
  1. A system for modifying interactive content associated with a user, the system comprising: a content streaming unit operable to stream one or more images of the interactive content to one or more viewers; a viewer monitoring unit operable to determine one or more indicators of the engagement and/or mood of one or more of the viewers of the streamed content; a modification generation unit operable to determine one or more modifications to the content in dependence upon the determined indicators, wherein the modifications alter one or more aspects of the interactive content; and a content modification unit operable to modify the interactive content in accordance with the determined modifications.
  2. A system according to claim 1, wherein the content streaming unit is operable to stream the one or more images to an online platform for hosting content accessible to a plurality of viewers.
  3. A system according to any preceding claim, wherein the viewer monitoring unit is operable to analyse one or more of audio, video, image, and text content to determine the one or more indicators.
  4. A system according to any preceding claim, wherein one or more of the indicators are determined on a per-viewer basis for each of a number of the one or more viewers.
  5. A system according to any preceding claim, wherein, in the case that there is a plurality of viewers, one or more of the indicators are determined on the basis of a group of viewers comprising at least a subset of the plurality of viewers.
  6. A system according to any preceding claim, wherein the viewer monitoring unit is operable to perform an analysis of language used by viewers to determine engagement and/or mood.
  7. A system according to any preceding claim, wherein the modification generation unit is operable to generate one or more modifications that are expected to modify the engagement and/or mood of one or more of the viewers.
  8. A system according to any preceding claim, wherein the modification generation unit is operable to determine modifications including one or more of inserting, removing, and/or modifying one or more elements.
  9. A system according to claim 8, wherein the one or more elements include one or more of virtual terrain, non-user characters, virtual objects, user characters, and virtual structures.
  10. A system according to any preceding claim, wherein the modification generation unit is operable to determine the one or more modifications in further dependence upon viewer engagement statistics over time, a game state associated with the content, a state of a character controlled by the user, the context of the content, modification history, and/or an expected impact upon the game state associated with the content.
  11. A system according to any preceding claim, wherein: the modification generation unit is operable to identify, independently of the content, a modification to be implemented; and the content modification unit is operable to determine a content-specific modification corresponding to the identified modification.
  12. A system according to any preceding claim, wherein the content modification unit is operable to select from amongst a plurality of determined modifications in accordance with an input from the user.
  13. A method for modifying interactive content associated with a user, the method comprising: streaming one or more images of the interactive content to one or more viewers; determining one or more indicators of the engagement and/or mood of one or more of the viewers of the streamed content; determining one or more modifications to the content in dependence upon the determined indicators, wherein the modifications alter one or more aspects of the interactive content; and modifying the interactive content in accordance with the determined modifications.
  14. Computer software which, when executed by a computer, causes the computer to carry out the method of claim 13.
  15. A non-transitory machine-readable storage medium which stores computer software according to claim 14.
GB2109909.8A 2021-07-09 2021-07-09 Content generation system and method Pending GB2608991A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB2109909.8A GB2608991A (en) 2021-07-09 2021-07-09 Content generation system and method

Publications (2)

Publication Number Publication Date
GB202109909D0 GB202109909D0 (en) 2021-08-25
GB2608991A true GB2608991A (en) 2023-01-25

Family

ID=77353947

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2109909.8A Pending GB2608991A (en) 2021-07-09 2021-07-09 Content generation system and method

Country Status (1)

Country Link
GB (1) GB2608991A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006107799A1 (en) * 2005-04-01 2006-10-12 Motorola, Inc. Method and system for enhancing a user experience using a user's physiological state
WO2009134909A2 (en) * 2008-04-29 2009-11-05 Bally Gaming, Inc. Biofeedback for a gaming device, such as an electronic gaming machine (egm)
US20110009193A1 (en) * 2009-07-10 2011-01-13 Valve Corporation Player biofeedback for dynamically controlling a video game state
US20160345060A1 (en) * 2015-05-19 2016-11-24 The Nielsen Company (Us), Llc Methods and apparatus to adjust content presented to an individual
US20180288479A1 (en) * 2016-04-05 2018-10-04 Google Llc Identifying viewing characteristics of an audience of a content channel
US20190083888A1 (en) * 2014-12-29 2019-03-21 Ebay Inc. Audience adjusted gaming
US20200206631A1 (en) * 2018-12-27 2020-07-02 Electronic Arts Inc. Sensory-based dynamic game-state configuration
