CN116017093B - Video environment simulation method and system - Google Patents

Video environment simulation method and system

Info

Publication number
CN116017093B
CN116017093B
Authority
CN
China
Prior art keywords
special effect
feature
information
target film
audio
Prior art date
Legal status
Active
Application number
CN202211624265.XA
Other languages
Chinese (zh)
Other versions
CN116017093A (en)
Inventor
李敦杰
吴扬东
郑舜浩
叶经绍
Current Assignee
Guangzhou Xunkong Electronic Technology Co ltd
Original Assignee
Guangzhou Xunkong Electronic Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Xunkong Electronic Technology Co ltd filed Critical Guangzhou Xunkong Electronic Technology Co ltd
Priority to CN202211624265.XA priority Critical patent/CN116017093B/en
Publication of CN116017093A publication Critical patent/CN116017093A/en
Application granted granted Critical
Publication of CN116017093B publication Critical patent/CN116017093B/en

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02B: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO BUILDINGS, e.g. HOUSING, HOUSE APPLIANCES OR RELATED END-USER APPLICATIONS
    • Y02B 20/00: Energy efficient lighting technologies, e.g. halogen lamps or gas discharge lamps
    • Y02B 20/40: Control techniques providing energy savings, e.g. smart controller or presence detection

Landscapes

  • Image Analysis (AREA)

Abstract

The application relates to the technical field of video playing systems and discloses a video environment simulation method and system. The method comprises the following steps: parsing a target film file to generate target film audio and a target film image; inputting the target film audio and the target film image into an audio-video feature recognition model and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment is a segment that needs to trigger a sensory stimulation special effect; inputting the feature segments into a special effect matching model and matching the special effect control items corresponding to the feature segments; and generating special effect control information based on each special effect control item and the corresponding feature time information, generating a special effect control program based on the special effect control information, and sending the special effect control program to the sensory stimulation equipment. The method has the effect of improving the generation efficiency of the control program of the sensory stimulation equipment corresponding to a film.

Description

Video environment simulation method and system
Technical Field
The application relates to the technical field of video playing systems, in particular to a video environment simulation method and a video environment simulation system.
Background
A traditional cinema generally improves the audiovisual experience of users by projecting onto a large screen and arranging surround sound equipment. To further improve the viewing experience of film works, many 4D cinemas have been built and put into use, in which sensory stimulation devices capable of releasing water drops, bubbles, smoke, smells and the like are installed in the showing hall to stimulate senses other than sight and hearing, thereby improving the viewing experience of users.
However, an existing 4D cinema generally needs to write the control program of the sensory stimulation devices in advance according to the scenario, sound effects and pictures of the film, so that different functions are started at specific time nodes during film playback and the various special effects of the sensory stimulation devices stimulate senses other than the audience's sight and hearing. As a result, existing 4D cinemas generally suffer from a small number of available film sources and from the control programs for newly released films not being written in time, which is a great limitation.
Disclosure of Invention
In order to improve the generation efficiency of a sensory stimulation equipment control program corresponding to a film, the application provides a video environment simulation method and a system.
The first technical scheme adopted by the application is as follows:
A video environment simulation method, comprising:
parsing a target film file to generate target film audio and a target film image;
inputting the target film audio and the target film image into an audio-video feature recognition model, and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment is a segment that needs to trigger a sensory stimulation special effect;
inputting the feature segments into a special effect matching model, and matching the special effect control items corresponding to the feature segments;
and generating special effect control information based on each special effect control item and the corresponding feature time information, generating a special effect control program based on the special effect control information, and sending the special effect control program to the sensory stimulation equipment.
By adopting this technical solution, the target film file is acquired and parsed to generate the target film audio and the target film image, which facilitates the subsequent recognition and analysis of the audio content and image content of the target film and the determination of the special effect items that need to be used. The target film audio and the target film image are input into the audio-video feature recognition model to recognize feature segments and corresponding feature time information from the audio and image of the film, which facilitates the subsequent determination of the segments that need to trigger special effects and of their corresponding time information; the feature segments comprise feature audio segments and feature image segments, and a feature segment is a video segment, such as a thunder or rain segment, whose viewing experience should be enhanced by triggering a sensory stimulation special effect, and it can be recognized from either the audio or the image of the target film. The feature segments are input into the special effect matching model so that special effect control items of the corresponding type can be matched according to the content of each feature segment; for example, a rainy-day feature segment may correspond to a water-spraying special effect control item. Special effect control information is generated based on the special effect control items and the corresponding feature time information to record when each special effect control item of the sensory stimulation equipment needs to be triggered, and a special effect control program is generated based on the special effect control information and sent to the sensory stimulation equipment so as to control the sensory stimulation equipment to trigger the special effects during film playback. By analyzing the audio and image of the target film file, the special effect types and trigger times are recognized automatically and the special effect control program is generated, which improves the generation efficiency of the special effect control program corresponding to the film and alleviates the problem that current 4D cinemas have few film sources.
In a preferred example of the present application, after the step of parsing the target film file to generate the target film audio and the target film image, the method includes:
acquiring attribute information of a target film, wherein the attribute information comprises actor information and film label information;
determining sensory principal angle information of each segment in the target film based on the attribute information and the target film image;
and inputting the sensory principal angle information into the audio-video characteristic recognition model to generate a characteristic recognition correction rule.
By adopting this technical solution, and because the audience often empathizes with a specific character while watching a film, and may empathize with different characters in different segments, determining the character the audience empathizes with is important for setting the special effect control information. The attribute information of the target film, including actor information and film tag information, is acquired to obtain the main characters and film type, which facilitates the subsequent determination of the characters and things the audience is likely to empathize with, such as a specific vehicle in a racing film. The sensory principal angle corresponding to each segment in the target film is determined based on the attribute information of the target film and the target film image, and sensory principal angle information is generated. The sensory principal angle information is input into the audio-video feature recognition model to generate a corresponding feature recognition correction rule, which facilitates adjusting the recognition rules of the audio-video feature recognition model and determining the feature segments based on what the sensory principal angle experiences in the film, thereby improving how closely the sensory stimulation special effects fit the user's feelings.
In a preferred example of the present application, after the step of parsing the target film file to generate the target film audio and the target film image, the method further includes:
matching the special effect items in the special effect item library based on the film tag information, and determining expected item information and aversion item information;
and inputting the expected item information and the aversion item information into the special effect matching model, and determining corresponding amplified special effect items and reduced special effect items.
By adopting this technical solution, and because users expect different special effects in different types of films (for example, in a racing or car-chase film the audience expects to feel the acceleration and collisions of the vehicles, while in a maritime film the audience expects to feel the ship rocking under the impact of the waves) and some other types of special effects can interfere with the special effects the user expects to feel, the special effect items in the special effect item library are matched based on the film tag information and expected item information and aversion item information are determined, so as to obtain the special effects the audience expects, and the ones it does not expect, for that type of film. The expected item information and the aversion item information are input into the special effect matching model, amplified special effect items are generated based on the expected item information, and reduced special effect items are generated based on the aversion item information, which facilitates subsequently increasing the intensity of the amplified special effect items and reducing the intensity of the reduced special effect items, further improving how well the various sensory stimulation special effects fit the main body of the film and thereby improving the user's viewing experience.
In a preferred example of the present application, an audio feature library is provided in the audio-video feature recognition model and is configured to store feature audio, and the step of inputting the target film audio and the target film image into the audio-video feature recognition model and recognizing feature segments and corresponding feature time information includes:
inputting the target film audio into the audio-video feature recognition model, matching the target film audio against each feature audio, and defining each successfully matched audio segment as a feature audio segment;
and acquiring the start and stop time of the characteristic audio fragment in the target film audio, and determining corresponding characteristic time information.
By adopting this technical solution, an audio feature library is provided in the audio-video feature recognition model and is used for storing feature audio, such as thunder, gunshots and rain, which facilitates the subsequent comparison of the target film audio with each feature audio. The target film audio is input into the audio-video feature recognition model and matched against each feature audio, and the successfully matched audio segments in the target film audio are determined as feature audio segments, which facilitates matching the corresponding sensory stimulation special effects according to the features of those audio segments. The start and stop time of each feature audio segment within the target film audio is acquired to determine the feature time information corresponding to the feature audio segment, which makes it convenient to trigger the sensory stimulation special effects at specific times during film playback and improves the user's viewing experience.
In a preferred example of the present application, the step of inputting the target film audio and the target film image into the audio-video feature recognition model and recognizing feature segments and corresponding feature time information further includes:
inputting the target film image into an audio-video feature recognition model, recognizing feature events occurring in the image through an image recognition algorithm, and determining feature image fragments;
and acquiring the start and stop time of the characteristic image fragment in the target film image, and determining corresponding characteristic time information.
By adopting this technical solution, an image recognition algorithm and an image feature library are provided in the audio-video feature recognition model. The image feature library stores feature images and serves as the training data set of the image recognition algorithm, which makes it convenient to train the image recognition algorithm on the feature images in the library, improving its accuracy and success rate; the image recognition algorithm is used to recognize images of feature events in the target film, such as vehicle collisions, rain and wind. The target film image is input into the audio-video feature recognition model, the feature events occurring in the image are recognized by the image recognition algorithm, and the image segments corresponding to the feature events are defined as feature image segments, which facilitates matching the corresponding sensory stimulation special effects according to the features of those image segments. The start and stop time of each feature image segment within the target film image is acquired to determine the feature time information corresponding to the feature image segment, which makes it convenient to trigger the sensory stimulation special effects at specific times during film playback and improves the user's viewing experience.
The present application is in a preferred example: generating special effect control information based on the special effect control items and the corresponding characteristic time information, generating a special effect control program based on the special effect control information and sending the special effect control program to the sensory stimulation equipment, and further comprising:
shooting an image of an audience seat based on the characteristic time information, generating an audience feedback image, and identifying the expression of each audience in the audience feedback image through an expression identification algorithm;
and counting the number of audiences corresponding to various expressions in the audience feedback image, generating audience feeling information, and calculating audience satisfaction degree based on the audience feeling information.
By adopting this technical solution, the image pickup device is controlled, based on the feature time information, to photograph the audience seats, so that an image of the audience is captured at the time each special effect is triggered and an audience feedback image is generated; the expression of each audience member in the audience feedback image is recognized by an expression recognition algorithm, which facilitates the subsequent analysis of the audience's satisfaction with the sensory stimulation special effects. The number of audience members corresponding to each kind of expression in the audience feedback image is counted, audience feeling information is generated based on the counted data, and audience satisfaction is calculated based on the audience feeling information, which makes it convenient to adjust the special effect control program of the sensory stimulation equipment according to the audience satisfaction.
The present application is in a preferred example: after the step of generating the special effect control information based on each special effect control item and the corresponding characteristic time information, the method further comprises the following steps:
generating special effect subtitles based on the special effect control information, marking the special effect subtitles in the target film file, and generating a film special effect script;
and sending the film special effect script to a manager terminal.
By adopting this technical solution, after the special effect control information is generated, special effect subtitles are generated based on the special effect control information and marked into the target film file to generate a film special effect script, so that a manager can conveniently see, from the special effect subtitles in the film special effect script, which sensory stimulation special effects are triggered at each stage of the target film's showing. The film special effect script is sent to the manager terminal so that the manager can review the sensory stimulation special effects to be triggered at each stage of the showing and adjust the sensory stimulation equipment in advance, which facilitates safety management and cost control of the target film's sensory stimulation special effects.
The second object of the application is realized by the following technical scheme:
a video environment simulation system, comprising:
the target film analysis module is used for parsing the target film file and generating target film audio and target film images;
The feature segment identification module is used for inputting the target film audio and the target film image into an audio-video feature recognition model and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment refers to a segment that needs to trigger a sensory stimulation special effect;
the special effect item matching module is used for inputting the characteristic segments into a special effect matching model and matching special effect control items corresponding to the characteristic segments;
and the special effect control program generation module is used for generating special effect control information based on each special effect control item and corresponding characteristic time information, generating a special effect control program based on the special effect control information and transmitting the special effect control program to the sensory stimulation equipment.
By adopting this technical solution, the target film file is acquired and parsed to generate the target film audio and the target film image, which facilitates the subsequent recognition and analysis of the audio content and image content of the target film and the determination of the special effect items that need to be used. The target film audio and the target film image are input into the audio-video feature recognition model to recognize feature segments and corresponding feature time information from the audio and image of the film, which facilitates the subsequent determination of the segments that need to trigger special effects and of their corresponding time information; the feature segments comprise feature audio segments and feature image segments, and a feature segment is a video segment, such as a thunder or rain segment, whose viewing experience should be enhanced by triggering a sensory stimulation special effect, and it can be recognized from either the audio or the image of the target film. The feature segments are input into the special effect matching model so that special effect control items of the corresponding type can be matched according to the content of each feature segment; for example, a rainy-day feature segment may correspond to a water-spraying special effect control item. Special effect control information is generated based on the special effect control items and the corresponding feature time information to record when each special effect control item of the sensory stimulation equipment needs to be triggered, and a special effect control program is generated based on the special effect control information and sent to the sensory stimulation equipment so as to control the sensory stimulation equipment to trigger the special effects during film playback. By analyzing the audio and image of the target film file, the special effect types and trigger times are recognized automatically and the special effect control program is generated, which improves the generation efficiency of the special effect control program corresponding to the film and alleviates the problem that current 4D cinemas have few film sources.
The third object of the application is realized by the following technical scheme:
a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the video environment simulation method described above when the computer program is executed.
The fourth object of the application is realized by the following technical scheme:
a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the video environment simulation method described above.
In summary, the present application includes at least one of the following beneficial technical effects:
1. The target film file is obtained and parsed to generate the target film audio and the target film image, which facilitates the subsequent recognition and analysis of the audio content and image content of the target film and the determination of the special effect items that need to be used. The target film audio and the target film image are input into the audio-video feature recognition model to recognize feature segments and corresponding feature time information from the audio and image of the film, which facilitates the subsequent determination of the segments that need to trigger special effects and of their corresponding time information; the feature segments comprise feature audio segments and feature image segments, and a feature segment is a video segment, such as a thunder or rain segment, whose viewing experience should be enhanced by triggering a sensory stimulation special effect, and it can be recognized from either the audio or the image of the target film. The feature segments are input into the special effect matching model so that special effect control items of the corresponding type can be matched according to the content of each feature segment; for example, a rainy-day feature segment may correspond to a water-spraying special effect control item. Special effect control information is generated based on the special effect control items and the corresponding feature time information to record when each special effect control item of the sensory stimulation equipment needs to be triggered, and a special effect control program is generated based on the special effect control information and sent to the sensory stimulation equipment so as to control the sensory stimulation equipment to trigger the special effects during film playback. By analyzing the audio and image of the target film file, the special effect types and trigger times are recognized automatically and the special effect control program is generated, which improves the generation efficiency of the special effect control program corresponding to the film and alleviates the problem that current 4D cinemas have few film sources.
2. Since the audience often empathizes with a specific character while watching a film, and may empathize with different characters in different segments, determining the character the audience empathizes with is important for setting the special effect control information. The attribute information of the target film, including actor information and film tag information, is acquired to obtain the main characters and film type, which facilitates the subsequent determination of the characters and things the audience is likely to empathize with, such as a specific vehicle in a racing film. The sensory principal angle corresponding to each segment in the target film is determined based on the attribute information of the target film and the target film image, and sensory principal angle information is generated. The sensory principal angle information is input into the audio-video feature recognition model to generate a corresponding feature recognition correction rule, which facilitates adjusting the recognition rules of the audio-video feature recognition model and determining the feature segments based on what the sensory principal angle experiences in the film, thereby improving how closely the sensory stimulation special effects fit the user's feelings.
3. Because users expect different special effects in different types of films (for example, in a racing or car-chase film the audience expects to feel the acceleration and collisions of the vehicles, while in a maritime film the audience expects to feel the ship rocking under the impact of the waves) and some other types of special effects can interfere with the special effects the user expects to feel, the special effect items in the special effect item library are matched based on the film tag information and expected item information and aversion item information are determined, so as to obtain the special effects the audience expects, and the ones it does not expect, for that type of film. The expected item information and the aversion item information are input into the special effect matching model, amplified special effect items are generated based on the expected item information, and reduced special effect items are generated based on the aversion item information, which facilitates subsequently increasing the intensity of the amplified special effect items and reducing the intensity of the reduced special effect items, further improving how well the various sensory stimulation special effects fit the main body of the film and thereby improving the user's viewing experience.
Drawings
Fig. 1 is a flowchart of a video environment simulation method according to an embodiment of the application.
Fig. 2 is a flowchart of step S10 in the video environment simulation method of the present application.
Fig. 3 is a flowchart of step S20 in the video environment simulation method of the present application.
Fig. 4 is a flowchart of step S40 in the video environment simulation method of the present application.
FIG. 5 is another flow chart of the video environment simulation method of the present application.
FIG. 6 is a schematic block diagram of a video environment simulation system according to a second embodiment of the present application.
Fig. 7 is a schematic view of an apparatus in a third embodiment of the present application.
Detailed Description
The application is described in further detail below with reference to fig. 1 to 7.
At present, many 4D cinemas (also called 5D or 6D cinemas) are available on the market. Compared with an ordinary cinema, a 4D cinema adds special effects such as vibration, falling, wind blowing, water spraying, scratching, leg sweeping, smoke, rain, photoelectric effects, bubbles and smells during film playback to stimulate the audience's sight, hearing, touch, smell and other senses, thereby improving the user's viewing experience.
In the application, special effects such as vibration, falling, wind blowing, water spraying, scratching, leg sweeping, smoke, rain, photoelectric effects, bubbles and smells are defined as sensory stimulation special effects, and the equipment that triggers these special effects is defined as sensory stimulation equipment. To trigger the sensory stimulation special effects at the right time, an existing 4D cinema usually either has dedicated staff control the sensory stimulation equipment manually or writes the control program of the sensory stimulation equipment in advance, so that the equipment is automatically controlled to trigger the special effects at specific time nodes during film playback. If the sensory stimulation equipment is controlled by writing its control program in advance, the following problems exist:
1. Few film sources: because writing the control program of the sensory stimulation equipment in advance costs labour and time, many less popular films are not considered worth a dedicated control program;
2. The control program of the sensory stimulation equipment corresponding to a film is difficult to finish writing in time, so it is difficult to give a premiered or newly released film a 4D showing;
3. The triggering effect of the sensory stimulation special effects during film playback is limited by the skill of whoever writes the control program.
in order to solve the problems, the application discloses a video environment simulation method and a system.
Example 1
The application discloses a video environment simulation method which can be applied to processing video files to automatically generate a control program of sensory stimulation equipment.
As shown in fig. 1, the method specifically comprises the following steps:
s10: and analyzing the target film file to generate target film audio and target film images.
In this embodiment, the target film file refers to the file of a film to be shown with special effects triggered by the sensory stimulation equipment; the target film audio refers to the audio content of the target film file, and the target film image refers to the video content of the target film file.
Specifically, a target film file is obtained, and the target film file is analyzed to distinguish an audio file from an image file, so that the corresponding sensory stimulation special effect items can be conveniently identified and matched based on the audio file and the image file of the target film respectively.
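As an illustration of step S10, the following sketch shows one way the parsing could be implemented. It assumes the ffmpeg command-line tool is available; the output file names and the 1 frame-per-second sampling rate are illustrative choices, not requirements of the application.

```python
# Illustrative sketch of step S10: split a target film file into an audio track and
# sampled image frames. Assumes ffmpeg is installed; paths and sampling rate are
# example choices only.
import subprocess
from pathlib import Path

def parse_target_film(film_path: str, out_dir: str = "parsed") -> tuple[Path, Path]:
    out = Path(out_dir)
    frames_dir = out / "frames"
    frames_dir.mkdir(parents=True, exist_ok=True)

    audio_path = out / "target_film_audio.wav"
    # Extract the audio track (the "target film audio").
    subprocess.run(
        ["ffmpeg", "-y", "-i", film_path, "-vn",
         "-acodec", "pcm_s16le", "-ar", "16000", str(audio_path)],
        check=True,
    )
    # Extract image frames (the "target film image"), sampled at 1 frame per second.
    subprocess.run(
        ["ffmpeg", "-y", "-i", film_path, "-vf", "fps=1",
         str(frames_dir / "frame_%06d.jpg")],
        check=True,
    )
    return audio_path, frames_dir
```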
Wherein, referring to fig. 2, after step S10, it includes:
s11: and acquiring attribute information of the target film, wherein the attribute information comprises actor information and film label information.
Since the audience often empathizes with a specific character while watching a film, and may empathize with different characters in different segments, determining the character the audience empathizes with is critical to setting the special effect control information.
Specifically, the corresponding attribute information is acquired from a film review website based on the name of the target film, the attribute information including actor information and film tag information, so as to obtain the main characters and type of the target film and facilitate the subsequent evaluation of the characters and things the audience empathizes with and of the special effect items they expect. The film tag information may be obtained by analyzing the reviews of the target film on the film review website with a semantic recognition algorithm, or may be read directly from the film's profile information on the review website.
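As a toy illustration of how film tag information might be derived from review text when it cannot be read directly from the profile page, the following sketch uses a simple keyword lookup; the tag names and keyword lists are assumptions for the example and only stand in for the semantic recognition algorithm mentioned above.

```python
# Hypothetical keyword-based tag inference; the tags and keywords are illustrative.
TAG_KEYWORDS = {
    "racing":   ["race", "lap", "overtake", "circuit"],
    "maritime": ["ship", "waves", "storm", "ocean"],
    "war":      ["battle", "gunfire", "soldier"],
}

def infer_film_tags(reviews: list[str]) -> list[str]:
    """Return every tag whose keywords appear in the pooled review text."""
    text = " ".join(reviews).lower()
    return [tag for tag, words in TAG_KEYWORDS.items()
            if any(word in text for word in words)]
```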
S12: and determining the sensory principal angle information of each fragment in the target film based on the attribute information and the target film image.
In this embodiment, the sensory principal angle refers to the character or thing the audience empathizes with while watching the film; it may be a person or an object, and the sensory principal angle may differ between different segments of the film. The sensory principal angle information is generated by summarizing the feature information of the sensory principal angle in each segment of the target film together with the corresponding time of that segment.
Specifically, the key characters and film type are determined based on the attribute information of the target film, and the target film image is divided into a plurality of segments, the dividing time nodes being based on the editing (cut) time nodes of the target film or on the scene-switching time nodes, which facilitates determining the main character, or the main object that appears, in each segment of the target film and thus the sensory principal angle of each segment. The start and stop time of each segment within the target film's running time and the sensory principal angle corresponding to each segment are acquired, and the sensory principal angle information is generated, which facilitates the subsequent determination of the sensory principal angle of each segment.
Specifically, in this embodiment, when multiple characters appear in a certain segment of the target film, the rules for determining the sensory principal angle among them include: choosing the character that occupies the largest area of the image frame; choosing the character with the longest total on-screen duration within the segment; and choosing the character whose actor appears earliest in the cast list.
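The following sketch combines the three rules above into a single ranking; the CharacterStats structure and the way the per-segment statistics would be collected are assumptions made for the example.

```python
# Illustrative sketch of the sensory-principal-angle selection rules of step S12.
from dataclasses import dataclass

@dataclass
class CharacterStats:
    name: str
    cast_rank: int            # position in the cast list (0 = first billed)
    total_area: float = 0.0   # accumulated fraction of the image frame occupied
    on_screen_seconds: float = 0.0

def pick_sensory_principal_angle(stats: list[CharacterStats]) -> str:
    # Rule 1: largest accumulated frame area; rule 2: longest on-screen duration;
    # rule 3: earliest position in the cast list (used here as tie-breakers).
    best = max(stats, key=lambda c: (c.total_area, c.on_screen_seconds, -c.cast_rank))
    return best.name
```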
S13: and inputting the sensory principal angle information into the audio-video characteristic recognition model to generate a characteristic recognition correction rule.
Specifically, the sensory principal angle information is input into the audio-video feature recognition model, and a feature recognition correction rule is generated. In this embodiment, the target film audio and image need to be processed by the audio-video feature recognition model, and the feature recognition correction rule is used to correct the feature recognition rules of the model so that it recognizes, in a directed way, what the sensory principal angle in the target film feels, thereby improving how closely the sensory stimulation special effects fit the user's feelings.
Wherein, after step S10, further comprising:
s14: and matching the special effect items in the special effect item library based on the film tag information, and determining expected item information and aversion item information.
In this embodiment, the special effect item library is a database recording the sensory stimulation special effect items that the current video environment simulation method can match; a special effect item is an item of sensory stimulation special effect. An expected item is a special effect item, determined after evaluation based on the film type, that a viewer of this kind of film expects; an aversion item is a special effect item, determined after evaluation based on the film type, that a viewer of this kind of film dislikes.
Specifically, the film tag information is acquired and input into the special effect item library; according to a preset matching rule and the film tag information, the special effect items the audience is expected to want are determined as expected items and the special effect items the audience is expected to dislike are determined as aversion items, and the corresponding expected item information and aversion item information are generated.
In particular, the matching rule may be set by staff, by audience voting, or from historical feedback information.
S15: and inputting the expected item information and the aversion item information into the special effect matching model, and determining corresponding amplified special effect items and reduced special effect items.
In this embodiment, an amplified special effect item is a special effect item determined based on the expected item information, whose intensity is increased when it subsequently needs to be triggered; a reduced special effect item is a special effect item determined based on the aversion item information, whose intensity is reduced when it subsequently needs to be triggered.
Users expect different special effects in different types of films; for example, in a racing or car-chase film the audience expects to feel the acceleration and collisions of the vehicles, while in a maritime film the audience expects to feel the ship rocking under the impact of the waves, and some other types of special effects can interfere with the special effects the user expects to feel.
Specifically, in this embodiment, suitable special effect items need to be matched to the audio and image of the target film through the special effect matching model. The expected item information and aversion item information of the target film are input into the special effect matching model, and the amplified special effect items and reduced special effect items corresponding to the target film are determined, which facilitates subsequently increasing the intensity of the amplified special effect items and reducing the intensity of the reduced special effect items, further improving how well the various sensory stimulation special effects fit the main body of the film and improving the user's viewing experience.
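A minimal sketch of steps S14 and S15 follows, assuming a hand-written mapping from film tags to expected and aversion items; the tag names, item names and the 1.5x / 0.5x intensity factors are illustrative and not fixed by the application.

```python
# Hypothetical tag-to-item tables; in practice these would live in the special
# effect item library and the preset matching rules.
DESIRED_BY_TAG = {
    "racing":   ["seat vibration", "seat tilt", "wind"],
    "maritime": ["seat sway", "water spray", "wind"],
}
AVERSION_BY_TAG = {
    "racing":   ["smell"],
    "maritime": ["smoke"],
}

def classify_effect_items(film_tags: list[str]) -> dict[str, float]:
    """Return an intensity factor per effect item: >1 amplified, <1 reduced."""
    factors: dict[str, float] = {}
    for tag in film_tags:
        for item in DESIRED_BY_TAG.get(tag, []):
            factors[item] = max(factors.get(item, 1.0), 1.5)   # amplified item
        for item in AVERSION_BY_TAG.get(tag, []):
            factors[item] = min(factors.get(item, 1.0), 0.5)   # reduced item
    return factors
```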
S20: and inputting the target film audio frequency and the target film image into an audio-video feature recognition model, and recognizing feature fragments and corresponding feature time information, wherein the feature fragments comprise feature audio frequency fragments and feature image fragments, and the feature fragments refer to fragments needing to trigger a special effect of sensory stimulation.
In this embodiment, the audio-video feature recognition model is a model that recognizes feature events from the audio features of the target film audio and the image features of the target film image; a feature event is an event in the plot of the target film that needs to trigger a sensory stimulation special effect. The feature segments comprise feature audio segments and feature image segments, where a feature segment is a segment that needs to trigger a sensory stimulation special effect: a feature audio segment is a feature segment recognized from the target film audio, and a feature image segment is a feature segment recognized from the target film image.
Specifically, the target film audio and the target film image are input into the audio-video feature recognition model to recognize the feature segments, and the corresponding feature time information is determined from the start and stop times of the feature segments within the target film's running time, which facilitates determining, from the feature time information, the time nodes at which the sensory stimulation special effects are triggered.
Referring to fig. 3, in step S20, a specific method for inputting the target film audio and the target film image into the audio-video feature recognition model to recognize the feature segments and the feature time information includes:
s21: and inputting the target film audio into an audio-video feature recognition model, matching the target film audio with each feature audio, and defining the successfully matched audio fragment as a feature audio fragment.
In this embodiment, an audio feature library is provided in the audio-video feature recognition model and is used to store feature audio, i.e. audio that can be associated with a sensory stimulation special effect item, such as thunder, gunshots and rain, which facilitates the subsequent comparison of the target film audio with the feature audio.
Specifically, inputting the target film audio into an audio-video feature recognition model, enabling the target film audio to be matched with each segment of feature audio in an audio feature library, defining successfully matched segments as feature audio segments, and facilitating the follow-up matching of corresponding sensory stimulation special effects according to specific features of the feature audio segments.
S22: and acquiring the start and stop time of the characteristic audio fragment in the target film audio, and determining corresponding characteristic time information.
Specifically, after the characteristic audio fragment is matched, corresponding characteristic time information is determined according to the start and stop time of the characteristic audio fragment in the target film audio, so that the start time node and the duration of the sensory stimulation special effect can be conveniently determined later.
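The following sketch illustrates steps S21 and S22 with a naive envelope-correlation matcher; it is only a stand-in for whatever matcher the audio-video feature recognition model actually uses, and the window length and threshold are assumptions.

```python
# Match a piece of feature audio (e.g. thunder) against the target film audio and
# report start/stop times of matching segments. Overlapping matches are not merged.
import numpy as np

def envelope(signal: np.ndarray, sr: int, win_s: float = 0.5) -> np.ndarray:
    """RMS envelope computed over non-overlapping windows of win_s seconds."""
    win = max(1, int(sr * win_s))
    n = len(signal) // win
    return np.sqrt((signal[: n * win].reshape(n, win) ** 2).mean(axis=1))

def match_feature_audio(film: np.ndarray, feature: np.ndarray, sr: int,
                        label: str, threshold: float = 0.8, win_s: float = 0.5):
    film_env, feat_env = envelope(film, sr, win_s), envelope(feature, sr, win_s)
    matches = []
    for start in range(len(film_env) - len(feat_env) + 1):
        window = film_env[start: start + len(feat_env)]
        denom = np.linalg.norm(window) * np.linalg.norm(feat_env)
        score = float(window @ feat_env / denom) if denom > 0 else 0.0
        if score >= threshold:
            t0 = start * win_s
            matches.append({"label": label, "start_s": t0,
                            "stop_s": t0 + len(feat_env) * win_s, "score": score})
    return matches  # each entry is a feature audio segment with its time information
```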
S23: and inputting the target film image into an audio-video feature recognition model, recognizing feature events occurring in the image through an image recognition algorithm, and determining feature image fragments.
In this embodiment, an image recognition algorithm and an image feature library are provided in the audio-video feature recognition model. The image feature library stores feature images and serves as the training data set of the image recognition algorithm, which makes it convenient to train the image recognition algorithm on the feature images in the library and improve its accuracy and success rate. The image recognition algorithm is used to recognize images of feature events in the target film, such as vehicle collisions, rain and wind, which facilitates the subsequent recognition of feature events in the target film based on the target film image.
Specifically, the target film image is input into an audio-video feature recognition model, the image recognition algorithm is used for recognizing the image of the feature event in the target film image, the image segment corresponding to the recognized feature event is defined as the feature image segment, and the corresponding sensory stimulation special effect is conveniently matched according to the specific features of the feature image segment.
S24: and acquiring the start and stop time of the characteristic image fragment in the target film image, and determining corresponding characteristic time information.
Specifically, after the characteristic image segments are matched, corresponding characteristic time information is determined according to the start and stop time of the characteristic image segments in the target film image, so that the start time node and the duration of the sensory stimulation special effect can be conveniently determined later.
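Steps S23 and S24 can be pictured as in the sketch below: each sampled frame is classified, and consecutive frames showing the same event are merged into a feature image segment with start and stop times. The classify_frame stub stands in for the trained image recognition algorithm and is an assumption of the example.

```python
from pathlib import Path

def classify_frame(frame_path: Path) -> str | None:
    """Return an event label such as 'rain' or 'collision', or None. (Stub for the
    image recognition algorithm trained on the image feature library.)"""
    raise NotImplementedError

def detect_feature_image_segments(frames_dir: str, fps: float = 1.0) -> list[dict]:
    segments, current = [], None
    for idx, frame in enumerate(sorted(Path(frames_dir).glob("frame_*.jpg"))):
        label, t = classify_frame(frame), idx / fps
        if label and current and current["label"] == label:
            current["stop_s"] = t + 1.0 / fps           # extend the open segment
        else:
            if current:
                segments.append(current)                 # close the previous segment
            current = {"label": label, "start_s": t, "stop_s": t + 1.0 / fps} if label else None
    if current:
        segments.append(current)
    return segments
```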
S30: and inputting the characteristic segments into a special effect matching model, and matching special effect control items corresponding to the characteristic segments.
In this embodiment, the special effect matching model is used for matching corresponding sensory stimulation special effect items according to the feature segments; the special effect control item is a special effect item which is generated after the audio-video characteristic recognition and special effect matching are carried out on the target film and needs to be controlled.
Specifically, the characteristic segments are input into the special effect matching model, so that corresponding types of special effect control items can be matched from the special effect matching model according to the content of the characteristic segments, for example, the characteristic segments in rainy days can correspond to the special effect control items of water spraying.
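A minimal sketch of step S30 follows; the mapping table is an illustrative assumption standing in for the special effect matching model.

```python
# Hypothetical correspondence between feature-segment labels and effect control items.
EFFECT_BY_EVENT = {
    "rain":      "water spray",
    "thunder":   "light flash + seat vibration",
    "collision": "seat vibration",
    "wind":      "blower",
}

def match_effect_control_items(feature_segments: list[dict]) -> list[dict]:
    controls = []
    for seg in feature_segments:
        item = EFFECT_BY_EVENT.get(seg["label"])
        if item:                                   # keep only segments with a match
            controls.append({**seg, "effect_item": item})
    return controls
```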
S40: and generating special effect control information based on each special effect control item and corresponding characteristic time information, generating a special effect control program based on the special effect control information, and transmitting the special effect control program to the sensory stimulation equipment.
In this embodiment, the special effect control information is generated based on the special effect control items and the corresponding feature time information, and records each special effect item, its trigger time and duration, and the working gear and power of the corresponding sensory stimulation equipment when it is triggered. The special effect control program is a program, generated based on the special effect control information, that controls the operating parameters of the sensory stimulation equipment, such as the times and power levels at which it triggers the sensory stimulation special effects.
Specifically, the special effect control information is generated based on each special effect control item and the corresponding feature time information so as to record the trigger time, power and other information required by each special effect control item of the sensory stimulation equipment; the special effect control program is generated based on the special effect control information and sent to the sensory stimulation equipment, so that the sensory stimulation equipment is controlled to trigger the special effects at the specific stages of film playback.
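The sketch below shows one possible shape for the special effect control information and control program of step S40, reusing the intensity factors from the earlier tag-matching sketch; the JSON cue-list format, the default power level and the send_to_device transport are assumptions for illustration only.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class EffectCue:
    effect_item: str
    start_s: float
    duration_s: float
    power: float = 1.0      # working gear/power of the sensory stimulation equipment

def build_control_program(controls: list[dict], intensity: dict[str, float]) -> str:
    cues = [
        EffectCue(
            effect_item=c["effect_item"],
            start_s=c["start_s"],
            duration_s=c["stop_s"] - c["start_s"],
            power=intensity.get(c["effect_item"], 1.0),  # amplified/reduced items
        )
        for c in controls
    ]
    cues.sort(key=lambda cue: cue.start_s)
    return json.dumps([asdict(c) for c in cues], indent=2)   # the "control program"

def send_to_device(program_json: str, device_url: str) -> None:
    """Placeholder for delivering the control program to the sensory stimulation
    equipment, e.g. over the cinema's control network."""
    ...
```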
Wherein, referring to fig. 4, after step S40, it includes:
s41: and generating special subtitles based on the special effect control information, marking the special subtitles in the target film file, and generating the film special effect script.
Specifically, after the special effect control information is generated, special effect captions are generated based on the special effect control information, and the special effect captions are marked in the target film file, so that a film special effect script is generated, and a manager can conveniently judge the sensory stimulation special effect triggered by each stage of target film showing through the special effect captions in the film special effect script.
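As an illustration of step S41, the cue dictionaries from the previous sketch could be rendered as SRT-style special effect subtitles; the SRT format is a convenient assumption, since the application does not fix a subtitle format.

```python
def _srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def effect_subtitles(cues: list[dict]) -> str:
    """Render special effect cues as numbered SRT blocks for the film special effect script."""
    blocks = []
    for i, cue in enumerate(cues, start=1):
        start = cue["start_s"]
        stop = start + cue["duration_s"]
        blocks.append(f"{i}\n{_srt_time(start)} --> {_srt_time(stop)}\n"
                      f"[EFFECT] {cue['effect_item']} @ power {cue['power']}\n")
    return "\n".join(blocks)
```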
S42: and sending the film special effect script to a manager terminal.
Specifically, the film special effect script is sent to the manager terminal, so that the manager can review the sensory stimulation special effects to be triggered at each stage of the target film's showing and carry out advance preparation of the sensory stimulation equipment, for example adding or replacing a specific type of odour raw material in the odour generating device, which facilitates safety management and cost control of the target film's sensory stimulation special effects.
Wherein, referring to fig. 5, after step S40, the video environment simulation method further includes:
s50: and shooting an image of the audience seat based on the characteristic time information, generating an audience feedback image, and identifying the expression of each audience in the audience feedback image through an expression identification algorithm.
In this embodiment, the audience feedback image refers to an audience image captured at a time after the triggering of the sensory stimulus special effect.
Specifically, the image pickup device is controlled to shoot the image of the audience according to the characteristic time information, so that the image of the audience can be conveniently shot at the time after each special effect is triggered, and an audience feedback image is generated; the expression recognition algorithm is used for recognizing the expressions of each audience in the audience feedback image, so that the satisfaction degree of the audience on the sensory stimulation special effect can be conveniently analyzed later.
S60: and counting the number of audiences corresponding to various expressions in the audience feedback image, generating audience feeling information, and calculating audience satisfaction degree based on the audience feeling information.
In this embodiment, the audience experience information refers to information generated by counting based on the expressions of each audience, and is used to evaluate the overall experience of the audience on the special effects of sensory stimulation; the viewer satisfaction refers to information of satisfaction of the viewer with the sensory stimulus special effects determined after evaluation based on the viewer feeling information.
Specifically, the number of audiences corresponding to various expressions in the feedback image of the audience is counted, audience feeling information is generated based on the counted data, audience satisfaction is calculated based on the audience feeling information, and the follow-up regulation of the special effect control program of the sensory stimulation equipment according to the audience satisfaction is facilitated, so that the accuracy of controlling the sensory stimulation equipment is further improved, and the viewing experience of the audience is improved.
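The following sketch illustrates steps S50 and S60: the expressions recognized in an audience feedback image are counted and reduced to a single satisfaction score. The recognize_expressions stub stands in for the expression recognition algorithm, and the per-expression weights are illustrative assumptions.

```python
from collections import Counter

def recognize_expressions(feedback_image_path: str) -> list[str]:
    """Return one expression label per detected face, e.g. 'happy' or 'scared'. (Stub.)"""
    raise NotImplementedError

# Hypothetical weights mapping each expression to a satisfaction contribution in [0, 1].
EXPRESSION_WEIGHT = {"happy": 1.0, "surprised": 0.8, "neutral": 0.5,
                     "scared": 0.4, "disgusted": 0.0}

def audience_satisfaction(feedback_image_path: str) -> float:
    counts = Counter(recognize_expressions(feedback_image_path))  # audience feeling information
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(EXPRESSION_WEIGHT.get(expr, 0.5) * n for expr, n in counts.items()) / total
```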
It should be understood that the sequence number of each step in the above embodiment does not mean the sequence of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not be construed as limiting the implementation process of the embodiment of the present application.
Example two
As shown in fig. 6, the present application discloses a video environment simulation system for performing the steps of the video environment simulation method described above, which corresponds to the video environment simulation method in the above-described embodiment.
The video environment simulation system comprises a target film analysis module, a feature fragment identification module, a special effect item matching module and a special effect control program generation module. The detailed description of each functional module is as follows:
the target film analysis module is used for parsing the target film file and generating target film audio and target film images;
the feature segment identification module is used for inputting the target film audio and the target film image into an audio-video feature recognition model and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment refers to a segment that needs to trigger a sensory stimulation special effect;
the special effect item matching module is used for inputting the characteristic segments into a special effect matching model and matching special effect control items corresponding to the characteristic segments;
and the special effect control program generation module is used for generating special effect control information based on each special effect control item and corresponding characteristic time information, generating a special effect control program based on the special effect control information and transmitting the special effect control program to the sensory stimulation equipment.
The target film analysis module comprises:
the film attribute acquisition sub-module is used for acquiring attribute information of a target film, wherein the attribute information comprises actor information and film label information;
The sensory principal angle determining sub-module is used for determining sensory principal angle information of each segment in the target film based on the attribute information and the target film image;
the feature recognition correction sub-module is used for inputting the sensory principal angle information into the audio-video feature recognition model to generate a feature recognition correction rule;
the special effect item classification sub-module is used for matching special effect items in a special effect item library based on the film tag information and determining expected item information and aversion item information;
and the special effect item adjusting sub-module is used for inputting expected item information and aversion item information into the special effect matching model and determining corresponding amplified special effect items and reduced special effect items.
The feature segment identification module includes (an illustrative sketch of the audio matching follows this list):
the feature audio segment matching sub-module is used for inputting the target film audio into the audio-video feature recognition model, matching the target film audio against each stored feature audio, and defining a successfully matched audio segment as a feature audio segment;
the feature audio time determining sub-module is used for acquiring the start and stop times of the feature audio segment in the target film audio and determining the corresponding feature time information;
the feature image segment identification sub-module is used for inputting the target film images into the audio-video feature recognition model, recognizing feature events occurring in the images through an image recognition algorithm, and determining feature image segments;
and the feature image time determining sub-module is used for acquiring the start and stop times of the feature image segment in the target film images and determining the corresponding feature time information.
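The following is a simplified stand-in for the feature audio segment matching and time determination sub-modules: a sliding-window normalised correlation between the film audio and one stored feature audio clip. A production system would more likely use audio fingerprinting or a learned model; the point of the sketch is only how the start and stop times, i.e. the feature time information, are derived from a successful match. All names and thresholds are assumptions.

import numpy as np

def find_feature_audio(film_audio, feature_clip, sample_rate, threshold=0.8):
    """Return (start_s, end_s) of the best match above threshold, else None."""
    n = len(feature_clip)
    clip = (feature_clip - feature_clip.mean()) / (feature_clip.std() + 1e-9)
    best_score, best_start = -1.0, None
    for start in range(0, len(film_audio) - n + 1, max(1, n // 4)):  # hop = quarter clip
        window = film_audio[start:start + n]
        w = (window - window.mean()) / (window.std() + 1e-9)
        score = float(np.dot(w, clip)) / n
        if score > best_score:
            best_score, best_start = score, start
    if best_start is None or best_score < threshold:
        return None
    return best_start / sample_rate, (best_start + n) / sample_rate

# Example: a synthetic 1-second tone embedded in noise at t = 3 s.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
film = np.concatenate([0.1 * np.random.randn(3 * sr),   # 3 s of noise
                       tone,                              # the feature clip
                       0.1 * np.random.randn(2 * sr)])    # 2 s of noise
print(find_feature_audio(film, tone, sr))                 # approximately (3.0, 4.0)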
The special effect control program generation module includes (an illustrative sketch of the script format follows this list):
the film special effect script generation sub-module is used for generating special effect subtitles based on the special effect control information and marking the special effect subtitles in the target film file to generate a film special effect script;
and the film special effect script transmitting sub-module is used for transmitting the film special effect script to the manager terminal.
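A possible rendering of the special effect subtitles is sketched below in SRT-like form: each special effect control item becomes a timed cue that a manager can review alongside the film. The SRT-style layout and the field names are assumptions made for illustration; the disclosure does not fix the script format.

def to_timestamp(seconds: float) -> str:
    # Convert seconds to an HH:MM:SS,mmm subtitle timestamp.
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    ms = int(round((seconds - int(seconds)) * 1000))
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def build_effect_script(control_info):
    """control_info: list of (start_s, end_s, device, intensity) tuples."""
    lines = []
    for i, (start, end, device, intensity) in enumerate(control_info, 1):
        lines.append(str(i))
        lines.append(f"{to_timestamp(start)} --> {to_timestamp(end)}")
        lines.append(f"[EFFECT] {device} @ {intensity:.0%}")
        lines.append("")
    return "\n".join(lines)

script = build_effect_script([(120.0, 124.5, "vibration", 0.9),
                              (300.2, 305.0, "water_mist", 0.4)])
print(script)   # this text would be what is sent to the manager terminal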
The video environment simulation system further includes (an illustrative sketch of the satisfaction calculation follows this list):
the audience feedback image generation module is used for capturing images of the audience area based on the feature time information, generating audience feedback images, and recognizing the expression of each audience member in the audience feedback images through an expression recognition algorithm;
and the audience satisfaction calculating module is used for counting the number of audience members corresponding to each type of expression in the audience feedback images, generating audience feeling information, and calculating audience satisfaction based on the audience feeling information.
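As an illustration of the audience satisfaction calculation, the snippet below assigns each recognised expression a score and takes the count-weighted average. The expression-to-score table and the averaging formula are assumptions, since the disclosure only states that satisfaction is calculated from the counted expressions.

EXPRESSION_SCORE = {"happy": 1.0, "surprised": 0.8, "neutral": 0.5,
                    "fearful": 0.4, "disgusted": 0.0, "angry": 0.0}

def audience_satisfaction(expression_counts: dict) -> float:
    """expression_counts: e.g. {"happy": 40, "neutral": 10, "disgusted": 2}."""
    total = sum(expression_counts.values())
    if total == 0:
        return 0.0
    weighted = sum(EXPRESSION_SCORE.get(expr, 0.5) * n
                   for expr, n in expression_counts.items())
    return weighted / total

print(audience_satisfaction({"happy": 40, "neutral": 10, "disgusted": 2}))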
Embodiment Three
A computer device, which may be a server, may have an internal structure as shown in Fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and the computer programs in the non-volatile storage medium. The database of the computer device is used for storing data such as the target film file, the target film audio, the target film images, the audio-video feature recognition model, the feature segments, the feature time information, the special effect matching model, the special effect control items, the special effect control information, and the special effect control program. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the video environment simulation method.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor implements the following steps when executing the computer program (a sketch of dispatching the resulting control program follows these steps):
S10: analyzing the target film file to generate the target film audio and the target film images;
S20: inputting the target film audio and the target film images into an audio-video feature recognition model, and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment is a segment that needs to trigger a sensory stimulation special effect;
S30: inputting the feature segments into a special effect matching model, and matching special effect control items corresponding to the feature segments;
S40: generating special effect control information based on each special effect control item and the corresponding feature time information, generating a special effect control program based on the special effect control information, and transmitting the special effect control program to the sensory stimulation equipment.
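To make step S40 concrete, the following sketch shows one way the generated special effect control program could be dispatched to the sensory stimulation equipment at playback time: entries are sorted by their feature start time and triggered when the playback clock reaches them. The entry format, the polling loop, and the dispatch callback are all assumptions for illustration, not an interface defined in this disclosure.

import time

def run_control_program(program, playback_clock, dispatch, poll_s=0.01):
    """program: iterable of (start_s, device, intensity); dispatch(device, intensity) drives one device."""
    for start_s, device, intensity in sorted(program):
        while playback_clock() < start_s:
            time.sleep(poll_s)   # in practice the player clock would push events instead of being polled
        dispatch(device, intensity)

# Toy usage with a clock already past all cue times, so both cues fire immediately.
run_control_program(
    program=[(12.0, "wind", 0.6), (4.0, "vibration", 0.9)],
    playback_clock=lambda: 9999.0,
    dispatch=lambda device, intensity: print(f"trigger {device} at {intensity:.0%}"),
)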
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored; the computer program, when executed by a processor, implements the following steps:
S10: analyzing the target film file to generate the target film audio and the target film images;
S20: inputting the target film audio and the target film images into an audio-video feature recognition model, and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment is a segment that needs to trigger a sensory stimulation special effect;
S30: inputting the feature segments into a special effect matching model, and matching special effect control items corresponding to the feature segments;
S40: generating special effect control information based on each special effect control item and the corresponding feature time information, generating a special effect control program based on the special effect control information, and transmitting the special effect control program to the sensory stimulation equipment.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium which, when executed, may include the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional units and modules is illustrated. In practical applications, the above functions may be allocated to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above.
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (8)

1. A video environment simulation method, comprising:
analyzing a target film file to generate target film audio and target film images;
acquiring attribute information of the target film, wherein the attribute information comprises actor information and film tag information;
determining sensory protagonist information of each segment in the target film based on the attribute information and the target film images;
inputting the sensory protagonist information into an audio-video feature recognition model to generate a feature recognition correction rule;
matching special effect items in a special effect item library based on the film tag information, and determining expected item information and aversion item information;
inputting the expected item information and the aversion item information into a special effect matching model, and determining corresponding amplified special effect items and reduced special effect items;
inputting the target film audio and the target film images into the audio-video feature recognition model, and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment is a segment that needs to trigger a sensory stimulation special effect;
inputting the feature segments into the special effect matching model, and matching special effect control items corresponding to the feature segments;
and generating special effect control information based on each special effect control item and the corresponding feature time information, generating a special effect control program based on the special effect control information, and transmitting the special effect control program to sensory stimulation equipment.
2. The video environment simulation method according to claim 1, wherein the audio-video feature recognition model is configured with an audio feature library storing feature audio, and the step of inputting the target film audio and the target film images into the audio-video feature recognition model and recognizing feature segments and corresponding feature time information comprises:
inputting the target film audio into the audio-video feature recognition model, matching the target film audio against each feature audio, and defining a successfully matched audio segment as a feature audio segment;
and acquiring the start and stop times of the feature audio segment in the target film audio, and determining the corresponding feature time information.
3. The video environment simulation method according to claim 1, wherein the step of inputting the target film audio and the target film images into the audio-video feature recognition model and recognizing feature segments and corresponding feature time information comprises:
inputting the target film images into the audio-video feature recognition model, recognizing feature events occurring in the images through an image recognition algorithm, and determining feature image segments;
and acquiring the start and stop times of the feature image segment in the target film images, and determining the corresponding feature time information.
4. The video environment simulation method according to claim 1, wherein after the step of generating special effect control information based on each special effect control item and the corresponding feature time information, generating a special effect control program based on the special effect control information, and transmitting the special effect control program to the sensory stimulation equipment, the method further comprises:
capturing images of the audience area based on the feature time information, generating audience feedback images, and recognizing the expression of each audience member in the audience feedback images through an expression recognition algorithm;
and counting the number of audience members corresponding to each type of expression in the audience feedback images, generating audience feeling information, and calculating audience satisfaction based on the audience feeling information.
5. The video environment simulation method according to claim 1, wherein after the step of generating special effect control information based on each special effect control item and the corresponding feature time information, the method further comprises:
generating special effect subtitles based on the special effect control information, marking the special effect subtitles in the target film file, and generating a film special effect script;
and sending the film special effect script to a manager terminal.
6. A video environment simulation system, comprising:
a target film analysis module, used for analyzing a target film file and generating target film audio and target film images;
a feature segment identification module, used for inputting the target film audio and the target film images into an audio-video feature recognition model and recognizing feature segments and corresponding feature time information, wherein the feature segments comprise feature audio segments and feature image segments, and a feature segment is a segment that needs to trigger a sensory stimulation special effect;
a special effect item matching module, used for inputting the feature segments into a special effect matching model and matching special effect control items corresponding to the feature segments;
and a special effect control program generation module, used for generating special effect control information based on each special effect control item and the corresponding feature time information, generating a special effect control program based on the special effect control information, and sending the special effect control program to sensory stimulation equipment;
wherein the target film analysis module comprises:
a film attribute acquisition sub-module, used for acquiring attribute information of the target film, wherein the attribute information comprises actor information and film tag information;
a sensory protagonist determining sub-module, used for determining sensory protagonist information of each segment in the target film based on the attribute information and the target film images;
a feature recognition correction sub-module, used for inputting the sensory protagonist information into the audio-video feature recognition model to generate a feature recognition correction rule;
a special effect item classification sub-module, used for matching special effect items in a special effect item library based on the film tag information and determining expected item information and aversion item information;
and a special effect item adjusting sub-module, used for inputting the expected item information and the aversion item information into the special effect matching model and determining corresponding amplified special effect items and reduced special effect items.
7. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the video environment simulation method according to any one of claims 1 to 5 when executing the computer program.
8. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the video environment simulation method according to any one of claims 1 to 5.
CN202211624265.XA 2022-12-15 2022-12-15 Video environment simulation method and system Active CN116017093B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211624265.XA CN116017093B (en) 2022-12-15 2022-12-15 Video environment simulation method and system

Publications (2)

Publication Number Publication Date
CN116017093A CN116017093A (en) 2023-04-25
CN116017093B true CN116017093B (en) 2023-08-11

Family

ID=86024097


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108259983A (en) * 2017-12-29 2018-07-06 广州市百果园信息技术有限公司 A kind of method of video image processing, computer readable storage medium and terminal
CN111757175A (en) * 2020-06-08 2020-10-09 维沃移动通信有限公司 Video processing method and device
CN111770375A (en) * 2020-06-05 2020-10-13 百度在线网络技术(北京)有限公司 Video processing method and device, electronic equipment and storage medium
CN112040263A (en) * 2020-08-31 2020-12-04 腾讯科技(深圳)有限公司 Video processing method, video playing method, video processing device, video playing device, storage medium and equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108337532A (en) * 2018-02-13 2018-07-27 腾讯科技(深圳)有限公司 Perform mask method, video broadcasting method, the apparatus and system of segment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant