WO2017173976A1 - Method for realizing multi-dimensional control, intelligent terminal and controller - Google Patents

Method for realizing multi-dimensional control, intelligent terminal and controller Download PDF

Info

Publication number
WO2017173976A1
Authority
WO
WIPO (PCT)
Prior art keywords
controller
control
motion estimation
scene information
smart terminal
Prior art date
Application number
PCT/CN2017/079444
Other languages
French (fr)
Chinese (zh)
Inventor
赵秋林 (ZHAO Qiulin)
黄宇轩 (HUANG Yuxuan)
刘成刚 (LIU Chenggang)
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2017173976A1 publication Critical patent/WO2017173976A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units, using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • The present application relates to, but is not limited to, intelligent technology, and more particularly to a method, an intelligent terminal and a controller for implementing multi-dimensional control.
  • At present, this multi-dimensional user experience can only be enjoyed in dedicated cinemas.
  • The control commands for the multi-dimensional experience are synchronized with the movie in advance; for example, at the corresponding show time point, a control command is issued to the corresponding controller.
  • These control commands allow the controllers to produce effects such as vibration, wind blasts, smoke, bubbles, smells, scenery and character performances. That is to say, this new entertainment effect currently has limited availability for home use.
  • The present application provides a method, an intelligent terminal and a controller for realizing multi-dimensional control, which can add multi-dimensional experience effects to the played content in real time and is suitable for ordinary households.
  • the application provides a method for implementing multi-dimensional control, including:
  • the smart terminal analyzes the obtained currently played video content to identify the scene information corresponding to the video content
  • the smart terminal sends the scene information to the controller, so that the controller starts multi-dimensional control according to the scene information.
  • The analyzing of the acquired video content to identify the scene information corresponding to the video content may include:
  • when a video is played, the video frames are sampled and analyzed to search for candidate objects; for each sample frame, a motion estimation vector is acquired, and several regions in the set of macroblocks with large motion estimation vectors are delineated as marker regions;
  • the smart terminal continuously detects key frames in the currently played video frames; if a marker region persists in a sequence of video frames longer than a preset period, the smart terminal starts sampling and analyzing the key frames in that sequence, identifying, for each sample frame, the candidate objects within the video frame and their locations, so as to identify the scene information.
  • The delineating of several regions in the macroblock set with large motion estimation vectors as marker regions may include:
  • dividing the obtained motion estimation vectors into the following two categories by using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors;
  • delineating several regions in the set of macroblocks with large motion estimation vectors as marker regions, and using the objects located outside the marker regions as reference objects.
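As a hedged illustration of the classification step above (the embodiments later name k-means cluster analysis as one candidate algorithm), the following sketch partitions per-macroblock motion-vector magnitudes into the two described categories. The 1-D simplification and the function name are assumptions for illustration, not the patented implementation.

```python
def split_macroblocks(magnitudes, iters=20):
    """Two-class 1-D k-means over motion-vector magnitudes.

    Returns (small, large): magnitudes assigned to the 'small motion
    estimation vector' and 'large motion estimation vector' classes.
    """
    # Initialise the two centroids at the extremes of the data.
    c_small, c_large = min(magnitudes), max(magnitudes)
    small, large = [], []
    for _ in range(iters):
        # Assignment step: each magnitude joins its nearest centroid.
        small = [m for m in magnitudes if abs(m - c_small) <= abs(m - c_large)]
        large = [m for m in magnitudes if abs(m - c_small) > abs(m - c_large)]
        # Update step: recompute centroids from the assigned members.
        if small:
            c_small = sum(small) / len(small)
        if large:
            c_large = sum(large) / len(large)
    return small, large
```

The macroblocks falling in the `large` class form the candidate set from which marker regions are delineated.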
  • the present application further provides a method for implementing multi-dimensional control, comprising: the controller identifying an instruction that needs to start multi-dimensional experience control according to the obtained scene information corresponding to the currently played video content, and performing corresponding control.
  • a correspondence between different object categories and control information is preset in the controller
  • The determining, according to the obtained scene information, of the instruction that multi-dimensional experience control needs to be started may include: when the object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, determining the instruction to start the multi-dimensional experience control.
  • the controller may include at least one of the following: a vibration controller, an odor controller, a spray controller, a light controller, and a sound controller.
  • Distributed deployment or centralized deployment may be employed among multiple controllers.
  • the application further provides a method for implementing a multi-dimensional experience, including:
  • the smart terminal analyzes the obtained currently played video content to identify the scene information corresponding to the controller that initiated the request;
  • the smart terminal determines, according to the identified scenario information, whether to start multi-dimensional experience control
  • the corresponding control information is sent to the corresponding controller.
  • the foregoing method may further include:
  • the smart terminal listens to a query command from one or more controllers, and returns its own device description information to the controller that initiates the query request;
  • the analyzing the obtained video content, and identifying the scenario information corresponding to the controller that initiated the request may include:
  • the video frame is sampled and analyzed, and the candidate object is searched, wherein for each sample frame, a motion estimation vector is acquired, and a plurality of regions in the macroblock set with a large motion estimation vector are defined as a marker region;
  • identifying, for each sample frame, the candidate objects within the video frame that are related to the controller that initiated the query and established the session, together with their locations, so as to identify the scene information corresponding to that controller.
  • the delineating a plurality of regions in the macroblock set having a large motion estimation vector as the marker region may include:
  • the obtained motion estimation vector is divided into the following two categories by using a classification algorithm: a macroblock with a large motion estimation vector and a macroblock with a small motion estimation vector;
  • a plurality of regions in the macroblock set having a large motion estimation vector are defined as marker regions; objects located outside the marker regions are used as reference objects.
  • a correspondence between different object categories and control information is preset in the smart terminal
  • Determining whether multi-dimensional experience control needs to be started according to the identified scene information includes: when the object in the obtained scene information belongs to a preset object category that triggers control and the preset trigger condition is met, starting the corresponding multi-dimensional experience control.
  • the present application further provides an intelligent terminal, including: a first analysis module and a broadcast module; wherein
  • a first analysis module configured to analyze the obtained currently played video content, to identify scene information corresponding to the video content
  • the broadcast module is configured to send the identified scene information to the controller, so that the controller starts the multi-dimensional control according to the scene information.
  • The first analysis module may be configured to: when a video is played, sample and analyze video frames, obtaining a motion estimation vector for each sample frame; use a classification algorithm to divide the obtained motion estimation vectors into the following two categories: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; and delineate several regions in the set of macroblocks with large motion estimation vectors as marker regions;
  • the application further provides an intelligent terminal, comprising: a second analysis module and a determining module; wherein
  • the second analysis module is configured to analyze the obtained currently played video content to identify the scenario information corresponding to the controller that initiated the request;
  • the determining module is configured to determine whether the multi-dimensional experience control needs to be started according to the identified scenario information. When it is determined that the multi-dimensional experience control needs to be started, the corresponding control information is sent to the corresponding controller.
  • the smart terminal may further include: an establishing module configured to listen for query commands from one or more controllers, return the device description information of the smart terminal to the controller that initiated the query request, and establish a session with that controller.
  • the second analysis module may be configured to:
  • sample and analyze video frames when a video is played, obtaining a motion estimation vector for each sample frame;
  • divide the obtained motion estimation vectors into the following two categories by using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; delineate several regions in the set of macroblocks with large motion estimation vectors as marker regions; objects located outside the marker regions are called reference objects;
  • the determining module may be configured to: according to a preset correspondence between different object categories and control information, when the object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, start the corresponding multi-dimensional experience control and send the corresponding control information to the corresponding controller.
  • the application further provides a controller, comprising: an acquisition module and a control module; wherein
  • Obtaining a module configured to obtain scene information corresponding to the currently played video content
  • the control module is configured to perform corresponding control when it is determined according to the obtained scenario information that the multi-dimensional experience control needs to be started.
  • a correspondence between different object categories and control information is preset in the control module
  • the control module may be configured to start the multi-dimensional experience control when the object in the obtained scene information belongs to a preset object type of trigger control and meets a preset trigger condition.
  • the obtaining module is further configured to: send a query command to query device description information of the smart terminal in the current network, and listen to information broadcast by the smart terminal.
  • the technical solution of the present application includes the smart terminal analyzing the obtained currently played video content to identify the scene information corresponding to the video content; the intelligent terminal sends the scene information to the controller, so that the controller starts multi-dimensional control according to the scene information. Or, after the multi-dimensional experience function is started, the smart terminal analyzes the currently played video content to obtain scenario information corresponding to the controller that initiates the request; the smart terminal determines, according to the obtained scenario information, whether to start multi-dimensional experience control; When the multi-dimensional experience control needs to be started, the corresponding control information is sent to the corresponding controller.
  • The technical solution provided by the present application uses an intelligent terminal to perform audio and video detection, identify the current video playing scene, and control the various controllers according to the identified scenes so as to reconstruct the currently playing scene, thereby adding real-time multi-dimensional experience effects to the played content in a manner suitable for ordinary households.
  • FIG. 1 is a flowchart of a method for implementing a multi-dimensional experience according to an embodiment of the present invention
  • FIG. 2 is a flowchart of another method for implementing a multi-dimensional experience according to an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of a smart terminal according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of another smart terminal according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a controller according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a networking architecture of a controller deployed in a centralized manner according to an embodiment of the present invention
  • FIG. 7 is a schematic diagram of a networking architecture of a controller deployed in a distributed manner according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a method for implementing multi-dimensional control according to an embodiment of the present invention. As shown in FIG. 1 , the method includes:
  • Step 100 The smart terminal analyzes the obtained currently played video content to identify the scene information corresponding to the video content.
  • After the multi-dimensional experience function is started, when the smart terminal plays a video it first samples and analyzes the video frames and searches for candidate objects, such as flowers (e.g., corresponding to wind), grass, or rock slurry (e.g., corresponding to vibration); for each sample frame, a motion estimation vector is obtained, and a classification algorithm such as k-means cluster analysis divides the obtained motion estimation vectors into two categories: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors. Several regions in the set of macroblocks with large motion estimation vectors are delineated as marker regions. If a marker region is too small, it is discarded. Objects located outside the marker regions serve as reference objects for the large background. In this way, the possible areas where key candidate objects exist are found.
  • If, within a preset area, such as a rectangular area, the proportion of macroblocks with large motion estimation vectors exceeds a preset threshold, such as 80% (adjustable), the area is considered a marker region.
  • If the area of a delineated marker region is smaller than a preset proportion of the total area, such as 10% (adjustable), the marker region is discarded.
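The two thresholds above can be sketched as follows, assuming a boolean macroblock grid (1 = large motion estimation vector), a rectangular scan window, and the example 80% fill and 10% minimum-area values; all names and the scanning strategy are illustrative assumptions.

```python
def find_marker_regions(grid, win_h, win_w, fill_thresh=0.80, min_area_frac=0.10):
    """Return (top, left, height, width) windows that qualify as marker regions.

    A window qualifies when at least `fill_thresh` of its macroblocks have
    large motion vectors; windows covering less than `min_area_frac` of the
    whole frame are discarded outright.
    """
    rows, cols = len(grid), len(grid[0])
    total = rows * cols
    regions = []
    # Discard candidate regions that are too small a fraction of the frame.
    if win_h * win_w / total < min_area_frac:
        return regions
    for top in range(rows - win_h + 1):
        for left in range(cols - win_w + 1):
            window = [grid[r][c]
                      for r in range(top, top + win_h)
                      for c in range(left, left + win_w)]
            if sum(window) / len(window) >= fill_thresh:
                regions.append((top, left, win_h, win_w))
    return regions
```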
  • The smart terminal continuously detects the key frames, that is, the I frames, in the obtained video frames. If a marker region persists in a sequence of video frames longer than a preset period, the smart terminal starts sampling and analyzing the key frames in that video frame sequence, using an algorithm such as a neural network to identify, for each sample frame, the candidate objects and their locations within the video frame, thereby identifying the scene information. In this way, the identification of key candidate objects is achieved.
  • A candidate object identified in the marker region of a video frame sequence is marked as a candidate object category if the following conditions are met: 1) the object category exists in the marker regions of successive video frame sequences; and 2) each object of the object category keeps changing relative to the reference-object vector of each video sequence.
  • the scene information further includes additional recorded parameters, such as object duration, the relative speed of object movement, the object count, and the like.
  • The neural network used above can adopt the structure of AlexNet: eight layers in total, of which the first five are convolutional layers and the last three are fully connected layers, with the last layer using a softmax classifier. Among the five convolutional layers, the first convolves the input with a specific template stride, then applies ReLU as the activation function and is pooled after regularization; the obtained result serves as the input of the second convolutional layer. The following four convolutional layers are similar to the first but use lower-dimensional convolution templates. In the last three fully connected layers, ReLU is followed by dropout; finally, softmax is used as the loss function.
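The AlexNet-style structure described above can be summarized as the following illustrative layer table. This is a summary of the description, not an executable network; the layer names are conventional labels, not taken from the patent.

```python
# Eight learned layers: five convolutional, three fully connected,
# with softmax as the final classifier / training loss.
alexnet_layers = [
    ("conv1", "convolution + ReLU + regularization + pooling"),
    ("conv2", "convolution (lower-dimensional template) + ReLU + pooling"),
    ("conv3", "convolution + ReLU"),
    ("conv4", "convolution + ReLU"),
    ("conv5", "convolution + ReLU + pooling"),
    ("fc6",   "fully connected + ReLU + dropout"),
    ("fc7",   "fully connected + ReLU + dropout"),
    ("fc8",   "fully connected + softmax (loss function during training)"),
]

conv_layers = [name for name, _ in alexnet_layers if name.startswith("conv")]
fc_layers = [name for name, _ in alexnet_layers if name.startswith("fc")]
```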
  • For example, if a neural network detects a large area of flowers in the current picture, the edge contour of the flowers can be found. If it is also detected that the flowers sway to the right with a large amplitude, it can be inferred from the sway direction that the wind blows from left to right, and the level of the wind can be derived from the sway amplitude. If a character is also detected in the picture, the position and number of the characters are marked, and the relative movement speed between characters is found across frames. The information obtained in this way is the scene information required in this step.
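The wind-inference example above can be sketched as follows. The direction rule (sway right implies wind from the left) comes from the text; the amplitude thresholds and three-level scale are illustrative assumptions.

```python
def infer_wind(sway_direction, sway_amplitude):
    """Infer (wind_from, level) from the sway of a detected object.

    sway_direction: 'left' or 'right', the direction the contour displaces.
    sway_amplitude: 0.0-1.0, fraction of the object's width it sways.
    """
    # Objects sway away from the wind: swaying right means wind from the left.
    wind_from = "left" if sway_direction == "right" else "right"
    # Assumed thresholds mapping amplitude to a coarse wind level.
    if sway_amplitude < 0.1:
        level = 1   # light breeze
    elif sway_amplitude < 0.3:
        level = 2   # moderate wind
    else:
        level = 3   # strong wind
    return wind_from, level
```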
  • Step 101 The smart terminal sends the identified scene information to the controller, so that the controller starts multi-dimensional control according to the scene information.
  • the smart terminal sends the identified scene information to the controller, such as broadcasting the identified scene information.
  • the scene information may include: the type of flower, the approximate number of flowers; the direction of wind blowing and the level of wind; the number of characters and the speed of relative movement.
  • the control information is used for the controller that needs to start the multi-dimensional experience control to perform corresponding control.
  • the controller further includes: the controller identifies, according to the obtained scene information corresponding to the currently played video content, an instruction that needs to start multi-dimensional experience control, and performs corresponding control.
  • the controller in the present application may include, but is not limited to, at least one of the following: a vibration controller, an odor controller, a spray controller, a light controller, and a sound controller.
  • Controllers can be deployed in a distributed or centralized manner. With distributed deployment, each controller communicates with the intelligent terminal separately; with centralized deployment, multiple controllers can be placed in one device, such as a wearable device, which makes the experience more convenient for the user. The controllers and the intelligent terminal can communicate via Ethernet, WiFi, Bluetooth, and the like.
  • A correspondence between different object categories and control information is set in advance; when the object in the obtained scene information belongs to a preset object category that triggers control, and the preset trigger condition is satisfied, the instruction to start the corresponding multi-dimensional experience control is determined.
  • the correspondence may be set as: when the object in the obtained scene information belongs to an object category that triggers vibration, such as rock, and meets the trigger conditions, such as the number of objects being greater than one, the speed being greater than 1/8 of the screen per second, and the duration exceeding 3 seconds, the vibration controller is activated to trigger the vibration effect;
  • the correspondence may be set as: when the object in the obtained scene information belongs to an object category that triggers the generation of odor, such as osmanthus, and meets the trigger conditions, such as a continuous appearance time greater than 6 seconds and a count greater than 10, the odor controller is activated to release the scent of osmanthus;
  • the correspondence may be set as: when the object in the obtained scene information belongs to an object category that triggers sound, for example a character appearing on the screen, and the trigger conditions concerning the character's position, moving direction, moving speed and the like are met, the sound controller is activated to trigger footsteps that gradually shift in accordance with the direction in which the character moves.
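The rock and osmanthus correspondences above can be sketched as a small rule table evaluated against scene information. The thresholds come from the text; the field names, dictionary layout, and evaluation function are assumptions for illustration.

```python
# Preset correspondence: object category -> controller, guarded by a
# trigger condition over the scene information.
RULES = [
    {"category": "rock", "controller": "vibration",
     # more than one object, faster than 1/8 screen per second, > 3 s
     "test": lambda s: s["count"] > 1 and s["speed"] > 1 / 8 and s["duration"] > 3},
    {"category": "osmanthus", "controller": "odor",
     # continuous appearance > 6 s and count > 10
     "test": lambda s: s["duration"] > 6 and s["count"] > 10},
]

def controllers_to_start(scene):
    """scene: {'category': str, 'count': int,
               'speed': screens/second, 'duration': seconds}."""
    return [rule["controller"] for rule in RULES
            if rule["category"] == scene["category"] and rule["test"](scene)]
```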
  • FIG. 2 is a flowchart of another method for implementing multi-dimensional control according to an embodiment of the present invention. As shown in FIG. 2, the method includes:
  • Step 200 The smart terminal analyzes the obtained currently played video content to identify the scenario information corresponding to the controller that initiated the request.
  • the method may further include: after the controller is started, sending a query command to the smart terminal to query the device description information of smart terminals in the current network, and listening for information broadcast by the smart terminal;
  • the intelligent terminal acts as a convergence point that listens for queries from controllers and, when a query is received, returns its own device description information to the controller that initiated the query request;
  • the controller that receives the query response acts as a client to initiate a session to the smart terminal, and a session is established between the smart terminal and the controller.
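The query/response/session flow above can be sketched as a minimal in-process simulation. The class and method names, and the shape of the device description, are assumptions for illustration; a real deployment would carry these messages over Ethernet, WiFi or Bluetooth as the text notes.

```python
class SmartTerminal:
    """Convergence point: answers queries and accepts session requests."""

    def __init__(self, description):
        self.description = description  # device description information
        self.sessions = []

    def on_query(self, controller):
        # Return our own device description to the querying controller.
        return self.description

    def on_session_request(self, controller):
        self.sessions.append(controller)
        return True


class Controller:
    """Client side: queries the terminal, then opens a session."""

    def __init__(self, kind):
        self.kind = kind          # e.g. "vibration", "odor"
        self.session_open = False

    def discover_and_connect(self, terminal):
        info = terminal.on_query(self)            # query command + response
        if info is not None:                      # terminal answered
            self.session_open = terminal.on_session_request(self)
        return info
```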
  • In this step, the smart terminal collects the corresponding scene information for the controller's request.
  • For example, suppose the vibration controller initiated the query request.
  • In that case, the smart terminal only recognizes object categories, such as rock, that trigger vibration; that is, the objects in the scene information returned at this time only include object categories that trigger vibration.
  • Step 201 The smart terminal determines, according to the identified scenario information, whether to start multi-dimensional experience control.
  • a correspondence relationship between different object categories and control information is set in advance, and the object in the obtained scene information belongs to a preset object type of the trigger control, and the preset trigger condition is satisfied.
  • the corresponding multidimensional experience control is started.
  • The specific implementation of this step is consistent with the determination process described in the foregoing embodiment, and details are not described herein again.
  • Step 202 When it is determined that the multi-dimensional experience control needs to be started, the corresponding control information is sent to the corresponding controller.
  • the intelligent terminal directly delivers the final control information to the controller, and the controller only needs to start and trigger the corresponding action according to the received control command.
  • FIG. 3 is a schematic structural diagram of a smart terminal according to an embodiment of the present invention. As shown in FIG. 3, the smart terminal includes at least a first analysis module 300 and a broadcast module 301.
  • the first analysis module 300 is configured to analyze the acquired currently played video content to identify scene information corresponding to the video content.
  • the broadcast module 301 is configured to send the identified scene information to the controller, so that the controller starts the multi-dimensional control according to the scene information.
  • the first analysis module 300 can be configured to:
  • the motion estimation vector is obtained for each sampling frame;
  • the motion estimation vectors obtained are divided, by a classification algorithm such as k-means cluster analysis, into the following two categories: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; several regions in the set of macroblocks with large motion estimation vectors are delineated as marker regions; if a marker region is too small, it is discarded; objects located outside the marker regions are called reference objects;
  • each sample frame is identified by a neural network or the like to locate the candidate objects in the video frame and their locations, thereby obtaining the scene information.
  • FIG. 4 is a schematic structural diagram of another smart terminal according to an embodiment of the present invention. As shown in FIG. 4, the smart terminal includes at least a second analysis module 401 and a determining module 402.
  • the second analysis module 401 is configured to analyze the obtained currently played video content to identify the scenario information corresponding to the controller that initiated the request;
  • the determining module 402 is configured to determine whether the multi-dimensional experience control needs to be started according to the identified scenario information. When it is determined that the multi-dimensional experience control needs to be started, the corresponding control information is sent to the corresponding controller.
  • The smart terminal shown in FIG. 4 may further include an establishing module 400 configured to listen for query commands from one or more controllers, return the device description information of the smart terminal to the controller that initiated the query request, and establish a session with that controller.
  • the second analysis module 401 can be configured to:
  • for each sample frame, an algorithm such as a neural network identifies the candidate objects in the video frame that are related to the controller that initiated the query and established the session, together with their locations, thereby identifying the scene information corresponding to that controller.
  • the determining module 402 may be configured to: when the object in the obtained scene information belongs to a preset trigger-controlled object category according to a preset relationship between different object categories and control information, and meets a preset trigger condition The corresponding multi-dimensional experience control is started, and the corresponding control information is sent to the corresponding controller.
  • FIG. 5 is a schematic structural diagram of a controller according to an embodiment of the present invention. As shown in FIG. 5, the controller includes at least an obtaining module 500 and a control module 501.
  • the obtaining module 500 is configured to obtain scene information corresponding to the currently played video content.
  • the control module 501 is configured to perform corresponding control when it is determined that the multi-dimensional experience control needs to be started according to the obtained scenario information.
  • the control module 501 may be configured with a corresponding relationship between different object types and control information in advance; the control module 501 may be configured to: when the obtained object in the scene information belongs to a preset trigger-controlled object category, and meets Multi-dimensional experience control is initiated when a pre-set trigger condition is set.
  • the obtaining module 500 is further configured to: send a query command to query device description information of the smart terminal in the current network, and listen to information broadcast by the smart terminal.
  • FIG. 6 is a schematic diagram of a networking architecture of a centralized deployment of a controller according to an embodiment of the present invention.
  • a centralized deployment is adopted among the multiple controllers, which are placed, for example, in a wearable device.
  • Taking the vibration controller (for example, a vibration unit embedded in smart pants) as the controller that initiates the query request, in Embodiment 1 the smart terminal determines whether the vibration controller needs to be activated to trigger the vibration effect.
  • This embodiment may include:
  • After starting, the vibration controller sends an inquiry command to the intelligent terminal to query the device description information of intelligent terminals in the current network, and listens for the intelligent terminal's broadcast information. The intelligent terminal acts as a convergence point; when it detects that a vibration controller has initiated an inquiry, it reads its own device description information and returns it to the vibration controller through the query response. The vibration controller acts as a client to initiate a session request, and the intelligent terminal receives the session request and establishes a session between itself and the vibration controller.
  • the video frame is sampled and analyzed, and the candidate object is searched for, that is, the motion estimation vector is acquired for each sample frame.
  • the obtained motion estimation vectors of the video frames are classified into the following two types by using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors.
  • Several regions in the set of macroblocks with large motion estimation vectors are delineated as marker regions. If a marker region is too small, it is discarded. Objects located outside the marker regions are called reference objects.
  • the frames in the sequence of video frames are sampled and analyzed, and each sample frame is identified by a neural network or the like to locate the main object and the location in the video frame.
  • the neural network may adopt the structure of AlexNet: 8 layers in total, of which the first 5 are convolutional layers and the last 3 are fully connected layers; the last layer uses a softmax classifier.
  • the first layer is a convolutional layer that convolves the input at a specific template stride, then applies ReLU as the activation function, with pooling after normalization; the result serves as the input of the second convolutional layer.
  • the following 4 convolutional layers are similar to the first, except that convolution templates of lower dimension are used; in the last 3 fully connected layers, ReLU is followed by dropout and then full connection; finally, softmax loss is used as the loss function.
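The layer geometry of such a network can be checked with the standard convolution/pooling output-size formula, out = (in − kernel + 2·pad) // stride + 1. The kernel sizes and strides below follow the original AlexNet paper; the patent text itself does not fix them, so they are assumptions for the sketch:

```python
# Hedged sketch: classic AlexNet spatial geometry, traced layer by layer
# with the conv/pool output-size formula.

def out_size(in_size, kernel, stride, pad=0):
    return (in_size - kernel + 2 * pad) // stride + 1

s = 227                      # input 227x227 RGB image
s = out_size(s, 11, 4)       # conv1: 11x11 kernel, stride 4 -> 55
s = out_size(s, 3, 2)        # max-pool after conv1        -> 27
s = out_size(s, 5, 1, 2)     # conv2: 5x5, pad 2           -> 27
s = out_size(s, 3, 2)        # max-pool after conv2        -> 13
s = out_size(s, 3, 1, 1)     # conv3: 3x3, pad 1           -> 13
s = out_size(s, 3, 1, 1)     # conv4                       -> 13
s = out_size(s, 3, 1, 1)     # conv5                       -> 13
s = out_size(s, 3, 2)        # max-pool after conv5        -> 6
flattened = s * s * 256      # 256 channels out of conv5: 9216 inputs to fc6
```

This matches the "lower-dimension templates in later layers" description above: the kernels shrink from 11×11 to 5×5 to 3×3 as depth increases.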
  • a candidate object identified in the marker regions of the video frame sequence is marked with its candidate object category if the following conditions are met: 1) the object category exists in the marker regions throughout the consecutive sequence of video frames; 2) the relative position vector of each object of the object category, with respect to the reference in each video frame, keeps changing.
  • the scene information further includes recorded additional parameters such as object duration, relative speed of object position movement, and object count.
  • correspondences between different object categories and control information for the vibration controller are preset in the smart terminal; when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, the corresponding multi-dimensional experience control is started.
  • a plurality of vibration-triggering correspondences are preset for the vibration controller: each trigger item specifies a triggering object category and a trigger condition, and the vibration effect is triggered when the trigger item is satisfied.
  • the correspondence may be set as: when an object in the obtained scene information belongs to an object category that triggers vibration, such as rock, and meets the trigger condition, such as more than one object moving at a speed greater than 1/8 of the screen per second for longer than 3 seconds, the vibration controller is activated to trigger the vibration effect.
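The example trigger item above can be sketched as a simple predicate. The rule values (category "rock", count > 1, speed > 1/8 screen/second, duration > 3 s) come from the example in the text; the dictionary field names and sample records are illustrative assumptions:

```python
# Hedged sketch of one trigger-item check for the vibration controller.

def should_trigger(scene, rule):
    """Return True when a scene-info record satisfies one trigger item."""
    return (scene["category"] == rule["category"]
            and scene["count"] > rule["min_count"]
            and scene["speed"] > rule["min_speed"]        # screens per second
            and scene["duration"] > rule["min_duration"])  # seconds

vibration_rule = {"category": "rock", "min_count": 1,
                  "min_speed": 1 / 8, "min_duration": 3.0}

rolling_rocks = {"category": "rock", "count": 3, "speed": 0.25, "duration": 5.0}
static_rocks  = {"category": "rock", "count": 3, "speed": 0.05, "duration": 5.0}
```

Only `rolling_rocks` satisfies every clause of the rule, so only it would cause the smart terminal to send the vibration control information.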
  • the smart terminal then only needs to send the corresponding control information, that is, an instruction to trigger the vibration effect, to the vibration controller.
  • the smart terminal determines whether it is necessary to activate the odor controller to emit an odor effect, and then generates control information and sends it to the odor controller.
  • This embodiment may include:
  • a query command is sent to the smart terminal to query the device description information of the smart terminal in the current network, while listening for the broadcast information of the smart terminal; the smart terminal acts as a convergence point, and when it detects that an odor controller has initiated a query, it reads its own device description information and returns it to the odor controller through the query response; the odor controller acts as a client to initiate a session request, and the smart terminal receives the session request and establishes a session between itself and the odor controller.
  • the smart terminal classifies the objects in the scene; in some scenarios it is necessary to produce certain environmental odors to enrich the user experience, and accordingly, identifiable objects and the corresponding odors are preset.
  • when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, the corresponding multi-dimensional experience control is started.
  • a plurality of scent-triggering correspondences are preset for the odor controller: each trigger item specifies a triggering object category and a trigger condition, and the scent effect is triggered when the trigger item is satisfied.
  • the odor controller is activated to release the osmanthus fragrance.
  • the smart terminal then only needs to send the corresponding control information, that is, an instruction to release the osmanthus fragrance, to the odor controller.
  • FIG. 7 is a schematic diagram of a networking architecture of controllers deployed in a distributed manner according to an embodiment of the present invention. As shown in FIG. 7, in the third embodiment a distributed deployment between multiple controllers is assumed. In the third embodiment, the smart terminal only needs to identify the set object categories and broadcast the identified scene information; each controller then decides, for the scene information within its own control range, whether to start itself to trigger the multi-dimensional effect.
  • This embodiment may include:
  • the key frames in the currently played video frames are continuously detected.
  • the neural network detects that there is a large area of a sea of flowers in the current picture; after finding the edge contours of the flowers, if it is detected that the flowers sway with a larger amplitude to the right, then from the direction of the sway it can be inferred that the wind blows from left to right, and the level of the wind can be derived from the amplitude of the sway; if a person is also detected in the picture, the positions and number of the characters are marked, and the relative movement speed between characters is found through multiple frames.
  • the information obtained above is the scene information.
  • the smart terminal broadcasts the obtained scene information, that is, the type of flower, the approximate number of flowers, the direction and level of the wind, and the number of characters and their relative movement speed.
  • for each blowing controller, according to the obtained scene information, its own location, and the preset correspondence between different scene information and control information, it is determined whether blowing is required and at what strength. For example: the scene information says the wind blows from left to right. If the blowing controller is located on the left, it blows the wind corresponding to the scene information; if the blowing controller is located on the right, there is no need to trigger blowing.
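A distributed blowing controller's local decision can be sketched as below: each controller compares the broadcast wind direction with its own position and blows only when it sits on the upwind side. The two-position left/right model and all names are simplifying assumptions for the sketch, not the patent's exact scheme:

```python
# Hedged sketch: a blowing controller deciding locally from broadcast
# scene information whether it should blow.

def should_blow(controller_position, scene):
    """A controller blows only when it is on the side the wind comes from."""
    if scene.get("wind") is None:
        return False
    upwind = {"left_to_right": "left", "right_to_left": "right"}[scene["wind"]]
    return controller_position == upwind

scene_info = {"wind": "left_to_right", "wind_level": 3}
```

With this rule, a controller on the left blows at the broadcast wind level while the one on the right stays idle, matching the example above.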
  • for each odor controller, according to the obtained scene information and the preset correspondence between scene information and control information, the odor controller is triggered to release the fragrance of the flower in the corresponding scene information.
  • for each sound controller, according to the obtained scene information, the corresponding background sound is selected, such as the wind and the rustling of grass. Then, according to the moving speed and direction of the characters in the scene information, the preset correspondence between different scene information and control information, and the channel corresponding to the sound controller itself, the sound controller is triggered to select the strength or gradient of the footstep sound; the background sound and the footsteps are then superimposed and output, completing the sound output of this channel.
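The per-channel mixing step can be sketched as superimposing a footstep track, scaled by a speed-dependent gain, onto the background bed sample by sample. The gain curve and the sample values are illustrative assumptions, not from the patent:

```python
# Hedged sketch: one sound controller's channel mix of background and footsteps.

def footstep_gain(speed, max_speed=2.0):
    """Map character movement speed (screens/s) to a footstep gain in [0, 1]."""
    return max(0.0, min(1.0, speed / max_speed))

def mix_channel(background, footsteps, gain):
    """Superimpose the scaled footstep track onto the background for one channel."""
    return [b + gain * f for b, f in zip(background, footsteps)]

background = [0.1, 0.2, 0.1, 0.0]   # background bed samples (illustrative)
footsteps  = [0.0, 0.4, 0.0, 0.4]   # footstep track samples (illustrative)
out = mix_channel(background, footsteps, footstep_gain(1.0))
```

In a multi-channel setup, each controller would run this mix on its own channel's tracks, so faster character movement raises the footstep level on every channel consistently.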
  • the embodiment of the invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method for implementing multi-dimensional control according to any of the above embodiments.
  • computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules or other data.
  • computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and can include any information delivery media.
  • the embodiment of the present application provides a method, an intelligent terminal and a controller for implementing multi-dimensional control, which uses the intelligent terminal to implement audio and video detection, identifies the currently played video scene, and controls the various controllers according to the identified scenes to recreate the currently played scene, thereby adding multi-dimensional experience effects to played content in real time in a way that is suitable for ordinary households.

Abstract

Disclosed are a method for realizing a multi-dimensional experience, an intelligent terminal and a controller. The method comprises: an intelligent terminal analyzing the acquired currently playing video content so as to recognize scene information corresponding to the video content (100); and the intelligent terminal sending the scene information to a controller, so that the controller activates multi-dimensional control according to the scene information (101). In the method, an intelligent terminal is used to realize audio and video detection so as to recognize the scene currently playing in a video, and controllers are controlled so as to recreate the currently playing scene according to the various recognized scenes, whereby multi-dimensional experience effects are added in real time to the presented content in a way that is suitable for ordinary households.

Description

Method, intelligent terminal and controller for realizing multi-dimensional control

Technical Field

The present application relates to, but is not limited to, intelligent technology, and in particular to a method, an intelligent terminal and a controller for implementing multi-dimensional control.

Background

If, while users watch TV or movies, effects such as vibration, wind, smoke, bubbles, odors, scenery and character performances could be simulated and introduced, a unique form of performance would be created. These live special effects, closely combined with the plot, would create an environment consistent with the content of the film, allowing viewers to experience entirely new entertainment through multiple bodily senses: sight, smell, hearing and touch.

However, at present such a multi-dimensional user experience can only be had in dedicated cinemas, where the control instructions of the multi-dimensional experience are synchronized with the movie in advance; for example, a control instruction is issued to the corresponding controller at the corresponding playback time point so that the controller produces effects such as vibration, wind, smoke, bubbles, odors, scenery and character performances. In other words, the realization of this new entertainment effect is currently limited for household use.
Summary of the Invention

The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of the claims.

The present application provides a method, an intelligent terminal and a controller for realizing multi-dimensional control, which can add multi-dimensional experience effects to played content in real time and is suitable for ordinary households.

The present application provides a method for implementing multi-dimensional control, including:

an intelligent terminal analyzing the acquired currently played video content to identify scene information corresponding to the video content;

the intelligent terminal sending the scene information to a controller, so that the controller starts multi-dimensional control according to the scene information.
In an exemplary embodiment, analyzing the acquired video content to identify the scene information corresponding to the video content may include:

when the intelligent terminal plays a video, sampling and analyzing video frames and searching for candidate objects, wherein, for each sample frame, a motion estimation vector is acquired and several regions in the set of macroblocks with large motion estimation vectors are delimited as marker regions;

the intelligent terminal continuously detecting key frames in the currently played video frames and, if a marker region persists throughout a preset, sufficiently long sequence of video frames, starting to sample and analyze the key frames in the sequence, identifying and locating the candidate objects and their positions within each sample frame, so as to identify the scene information.
In an exemplary embodiment, delimiting several regions in the set of macroblocks with large motion estimation vectors as marker regions may include:

dividing the obtained motion estimation vectors into the following two classes using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors;

delimiting several regions in the set of macroblocks with large motion estimation vectors as marker regions, with objects located outside the marker regions serving as references.
The present application further provides a method for implementing multi-dimensional control, including: a controller identifying, according to the obtained scene information corresponding to the currently played video content, that multi-dimensional experience control needs to be started, and performing the corresponding control.

In an exemplary embodiment, correspondences between different object categories and control information are preset in the controller;

identifying, according to the obtained scene information, that multi-dimensional experience control needs to be started may include: when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, determining to start the multi-dimensional experience control.

In an exemplary embodiment, the controller may include at least one of the following: a vibration controller, an odor controller, a spray controller, a light controller and a sound controller.

In an exemplary embodiment, multiple controllers are deployed in a distributed manner or in a centralized manner.
The present application further provides a method for implementing a multi-dimensional experience, including:

an intelligent terminal analyzing the acquired currently played video content to identify scene information corresponding to a controller that initiated a request;

the intelligent terminal determining, according to the identified scene information, whether multi-dimensional experience control needs to be started;

when it is determined that multi-dimensional experience control needs to be started, delivering the corresponding control information to the corresponding controller.
In an exemplary embodiment, before the intelligent terminal analyzes the obtained video content, the above method may further include:

the intelligent terminal listening for a query command from one or more controllers and returning its own device description information to the controller that initiated the query request;

establishing a session with the controller that received the query response and initiated a session.
In an exemplary embodiment, analyzing the obtained video content and identifying the scene information corresponding to the controller that initiated the request may include:

when the intelligent terminal plays a video, sampling and analyzing video frames and searching for candidate objects, wherein, for each sample frame, a motion estimation vector is acquired and several regions in the set of macroblocks with large motion estimation vectors are delimited as marker regions;

continuously detecting key frames in the obtained video frames and, if a marker region persists throughout a preset, sufficiently long sequence of video frames, starting to sample and analyze the key frames in the sequence, identifying and locating within each sample frame the candidate objects related to the controller that initiated the query and established the session, along with their positions, so as to identify the scene information corresponding to that controller.
In an exemplary embodiment, delimiting several regions in the set of macroblocks with large motion estimation vectors as marker regions may include:

dividing the obtained motion estimation vectors into the following two classes using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors;

delimiting several regions in the set of macroblocks with large motion estimation vectors as marker regions, with objects located outside the marker regions serving as references.
In an exemplary embodiment, correspondences between different object categories and control information are preset in the intelligent terminal;

the intelligent terminal determining, according to the identified scene information, whether multi-dimensional experience control needs to be started includes: when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, starting the corresponding multi-dimensional experience control.
The present application further provides an intelligent terminal, including a first analysis module and a broadcast module, wherein:

the first analysis module is configured to analyze the acquired currently played video content to identify scene information corresponding to the video content;

the broadcast module is configured to send the identified scene information to a controller, so that the controller starts multi-dimensional control according to the scene information.
In an exemplary embodiment, the first analysis module may be configured to: when a video is played, sample and analyze video frames and, for each sample frame, acquire a motion estimation vector; divide the obtained motion estimation vectors into the following two classes using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; and delimit several regions in the set of macroblocks with large motion estimation vectors as marker regions;

continuously detect key frames in the currently played video frames and, if a marker region persists throughout the video frame sequence, start sampling and analyzing the key frames in the sequence, identifying and locating the candidate objects and their positions within each sample frame, so as to identify the scene information.
The present application further provides an intelligent terminal, including a second analysis module and a determining module, wherein:

the second analysis module is configured to analyze the acquired currently played video content to identify scene information corresponding to the controller that initiated a request;

the determining module is configured to determine, according to the identified scene information, whether multi-dimensional experience control needs to be started and, when it is determined that multi-dimensional experience control needs to be started, deliver the corresponding control information to the corresponding controller.

In an exemplary embodiment, the above intelligent terminal may further include an establishing module configured to listen for a query command from one or more controllers, return the device description information of the intelligent terminal to which it belongs to the controller that initiated the query request, and establish a session with the controller that initiates a session.
In an exemplary embodiment, the second analysis module may be configured to:

when a video is played, sample and analyze video frames and, for each sample frame, acquire a motion estimation vector; divide the obtained motion estimation vectors into the following two classes using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; delimit several regions in the set of macroblocks with large motion estimation vectors as marker regions, with objects located outside the marker regions called references;

continuously detect key frames in the currently played video frames and, if a marker region persists throughout the video frame sequence, start sampling and analyzing the frames in the sequence, identifying and locating within each sample frame the main objects related to the controller that initiated the query and established the session, along with their positions, so as to identify the scene information corresponding to that controller.

In an exemplary embodiment, the determining module may be configured to: according to preset correspondences between different object categories and control information, when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, start the corresponding multi-dimensional experience control and deliver the corresponding control information to the corresponding controller.
The present application further provides a controller, including an acquisition module and a control module, wherein:

the acquisition module is configured to acquire scene information corresponding to the currently played video content;

the control module is configured to perform the corresponding control when it determines, according to the obtained scene information, that multi-dimensional experience control needs to be started.

In an exemplary embodiment, correspondences between different object categories and control information are preset in the control module;

the control module may be configured to start the multi-dimensional experience control when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met.

In an exemplary embodiment, the acquisition module is further configured to send a query command to query the device description information of the intelligent terminal in the current network and to listen for the information broadcast by the intelligent terminal.
The technical solution of the present application includes: an intelligent terminal analyzing the acquired currently played video content to identify scene information corresponding to the video content, and sending the scene information to a controller, so that the controller starts multi-dimensional control according to the scene information. Alternatively, it includes: after the multi-dimensional experience function is started, the intelligent terminal analyzing the currently played video content to acquire scene information corresponding to the controller that initiated a request; the intelligent terminal determining, according to the obtained scene information, whether multi-dimensional experience control needs to be started; and, when it is determined that multi-dimensional experience control needs to be started, delivering the corresponding control information to the corresponding controller. The technical solution provided by the present application uses an intelligent terminal to implement audio and video detection, identifies the currently played video scene, and controls the various controllers according to the identified scenes to recreate the currently played scene, thereby adding multi-dimensional experience effects to played content in real time in a way that is suitable for ordinary households.

Other features and advantages of the present application will be set forth in the description that follows and in part will become apparent from the description, or may be learned by practice of the present application. The objectives and other advantages of the present application may be realized and obtained by the structures particularly pointed out in the description, the claims and the drawings.
Brief Description of the Drawings

The drawings described herein are intended to provide a further understanding of the present application and constitute a part of this application. The illustrative embodiments of the present application and their description are used to explain the present application and do not constitute an undue limitation of the present application. In the drawings:

FIG. 1 is a flowchart of a method for implementing a multi-dimensional experience according to an embodiment of the present invention;

FIG. 2 is a flowchart of another method for implementing a multi-dimensional experience according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an intelligent terminal according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of another intelligent terminal according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a controller according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a networking architecture of controllers deployed in a centralized manner according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a networking architecture of controllers deployed in a distributed manner according to an embodiment of the present invention.
Detailed Description

The embodiments of the present application will be described in detail below with reference to the accompanying drawings. It should be noted that, where there is no conflict, the embodiments in the present application and the features in the embodiments may be arbitrarily combined with each other.

FIG. 1 is a flowchart of a method for implementing multi-dimensional control according to an embodiment of the present invention. As shown in FIG. 1, the method includes:

Step 100: The intelligent terminal analyzes the acquired currently played video content to identify the scene information corresponding to the video content.
After the multi-dimensional experience function is started, first, when the intelligent terminal plays a video, it samples and analyzes video frames and attempts to search for candidate objects, such as flowers (e.g. corresponding to wind), grass, or molten rock (e.g. corresponding to vibration); that is, for each sample frame it acquires a motion estimation vector, and uses a classification algorithm such as k-means clustering to divide the obtained motion estimation vectors into two classes: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors. Several regions in the set of macroblocks with large motion estimation vectors are delimited as marker regions. If the area of a marker region is too small, the marker region is discarded. Objects located outside the marker regions serve as references for the large background. In this way, the possible regions where key candidate objects exist are found. For the whole video, within a preset region such as a rectangular region, if the proportion of macroblocks with large motion vectors to the total number of macroblocks exceeds a preset threshold such as 80% (adjustable), the region is considered to be a marker region. If the area of a delimited marker region is less than a preset proportion of the total area, such as 10% (adjustable), the marker region is discarded.
Then, the smart terminal continuously probes the key frames (I-frames) of the obtained video frames. If a marked region persists throughout a preset, sufficiently long sequence of video frames, the smart terminal starts sampling and analyzing the key frames in that video frame sequence, and for each sampled frame it identifies and locates the candidate objects in the frame and their positions using an algorithm such as a neural network, thereby identifying the scene information. In this way, the key candidate objects are recognized.
Here, if the previously obtained reference objects all exist in the currently sampled video frame sequence, a candidate object identified in the marked regions of the video frame sequence is labeled with its object category if the following conditions are satisfied: 1) the object category exists in the marked regions of consecutive video frame sequences; 2) for each object of the category, the relative position vector with respect to the reference object of each video sequence keeps changing. In an exemplary embodiment, if there is more than one category of candidate object, the scene information further includes recording additional parameters, such as object duration, relative speed of object position movement, object count, and so on.
For example, in a specific implementation, the neural network used above may adopt the AlexNet structure: eight layers in total, the first five being convolutional layers and the last three being fully connected layers, with the last layer using a softmax classifier. Among the five convolutional layers, the first is a convolutional layer that convolves with a specific template stride, then applies ReLU as the activation function, followed by normalization and pooling, and the result is fed as input to the second convolutional layer; the following four convolutional layers are similar to the first, except that lower-dimensional convolution templates are used. In the three fully connected layers, ReLU is followed by dropout before full connection; finally, softmax loss is used as the loss function.
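The AlexNet-style layout above can be summarized in code. This is only a schematic sketch of the eight-layer arrangement with plain-Python versions of the ReLU and softmax functions it names; the layer labels and their exact composition (e.g. where pooling recurs in layers 2 to 5) are simplifying assumptions, not the patent's definition.

```python
import math

def relu(x):
    """ReLU activation applied after each convolutional layer."""
    return [max(0.0, v) for v in x]

def softmax(x):
    """Softmax classifier used by the final (8th) layer."""
    m = max(x)                               # subtract max for numerical stability
    exps = [math.exp(v - m) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

# Schematic 8-layer AlexNet-style layout described in the text:
LAYERS = (
    ["conv+relu+norm+pool"]    # layer 1: convolution, ReLU, normalization, pooling
    + ["conv+relu"] * 4        # layers 2-5: similar, lower-dimensional templates
    + ["fc+relu+dropout"] * 2  # layers 6-7: fully connected, ReLU then dropout
    + ["fc+softmax"]           # layer 8: fully connected with softmax classifier
)
```

The softmax output sums to 1 and can be read as per-category probabilities for the candidate objects in a sampled frame.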
In this step, if the previously obtained reference objects do not exist in the currently sampled video frame sequence, the current search is abandoned and the procedure ends.
As an example: if a neural network detects a large field of flowers in the current picture, the edge contours of the flowers can be found. If it is further detected that the flowers sway to the right with a large amplitude, it can be inferred from the swaying direction that wind is blowing from left to right, and the wind level can be estimated from the swaying amplitude. If persons are also detected in the picture, their positions and number are marked, and the relative movement speed between persons is found across multiple frames. The information obtained in this way is the scene information required in this step.
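A minimal sketch of the wind inference just described: the sway direction gives the wind direction, and the sway amplitude is mapped to a coarse wind level. The amplitude thresholds and the function signature are illustrative assumptions; the patent does not specify them.

```python
def infer_wind(sway_dx, amplitude, level_thresholds=(0.02, 0.05, 0.10)):
    """Infer wind from flower sway.
    sway_dx:   signed horizontal sway (positive = flowers lean right).
    amplitude: sway amplitude as a fraction of frame width.
    Returns (direction, level) with level in 0..len(level_thresholds)."""
    direction = "left-to-right" if sway_dx > 0 else "right-to-left"
    level = sum(amplitude >= t for t in level_thresholds)
    return direction, level
```

For instance, flowers leaning right with a 6%-of-frame sway would yield a left-to-right wind of level 2 under these illustrative thresholds.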
Step 101: the smart terminal sends the identified scene information to the controllers, so that a controller starts multi-dimensional control according to the scene information.
The smart terminal sends the identified scene information to the controllers, for example by broadcasting it. Taking the example above, the scene information may include: the kind of flower and the approximate number of flowers; the wind direction and wind level; the number of persons and their relative movement speed.
The control information is used by a controller that needs to start multi-dimensional experience control to perform the corresponding control.
For each controller, the method further includes: the controller identifies, according to the obtained scene information corresponding to the currently played video content, an instruction indicating that it needs to start multi-dimensional experience control, and performs the corresponding control.
The controllers in the present application may include, but are not limited to, at least one of the following: a vibration controller, an odor controller, a spray controller, a light controller, and a sound controller.
The controllers may be deployed in either a distributed or a centralized manner. In a distributed deployment, each controller communicates with the smart terminal; in a centralized deployment, multiple controllers can be placed in a single device, such as a wearable device, which is more convenient for the user's experience. The controllers and the smart terminal may communicate over Ethernet, WiFi, Bluetooth, or the like.
In the controller in this step, correspondences between different object categories and control information are preset; when an object in the obtained scene information belongs to a preset trigger-control object category and a preset trigger condition is satisfied, the controller determines the instruction to start the corresponding multi-dimensional experience control.
For example, for a vibration controller, the correspondence may be set as: when an object in the obtained scene information belongs to an object category that triggers vibration, such as rock, and a trigger condition is satisfied, such as more than one object moving faster than 1/8 of the screen per second for more than 3 seconds, the vibration controller is activated to trigger the vibration effect;
As another example, for an odor controller, the correspondence may be set as: when an object in the obtained scene information belongs to an object category that triggers an odor, such as osmanthus, and a trigger condition is satisfied, such as appearing continuously for more than 6 seconds with a count above 10, the odor controller is activated to emit an osmanthus-scented odor.
As yet another example, for a sound controller, the correspondence may be: when an object in the obtained scene information belongs to an object category that triggers a sound, such as a person appearing in the picture, and trigger conditions concerning the person's position, movement direction, and movement speed are satisfied, the sound controller is activated to produce footstep sounds that fade with the person's movement direction.
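The preset "object category plus trigger condition" correspondences in the three examples above can be sketched as a small rule table. The dictionary layout, key names, and the speed unit (screens per second) are hypothetical; only the rock and osmanthus thresholds are taken from the text.

```python
# Hypothetical encoding of the preset category -> trigger-condition table.
# Thresholds are strict lower bounds, matching "greater than" in the text.
TRIGGER_RULES = {
    "vibration": {"category": "rock", "count_gt": 1,
                  "speed_gt": 1 / 8, "duration_gt": 3.0},
    "odor":      {"category": "osmanthus", "count_gt": 10,
                  "duration_gt": 6.0},          # no speed condition for odor
}

def should_trigger(controller, obj):
    """obj: dict with 'category', 'count', 'duration' (seconds) and
    optionally 'speed' (screens per second)."""
    rule = TRIGGER_RULES[controller]
    return (obj["category"] == rule["category"]
            and obj["count"] > rule.get("count_gt", 0)
            and obj.get("speed", 0.0) > rule.get("speed_gt", -1.0)
            and obj["duration"] > rule.get("duration_gt", 0.0))
```

A rock object seen twice, moving at 0.2 screens per second for 3.5 seconds, would thus trigger vibration, while a single rock would not.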
FIG. 2 is a flowchart of another method for implementing multi-dimensional control according to an embodiment of the present invention. As shown in FIG. 2, the method includes:
Step 200: the smart terminal analyzes the currently played video content that it has obtained, to identify the scene information corresponding to the controller that initiated the request.
Before this step, the method may further include: after one or more controllers start, they send a query command to the smart terminal to query the device information of smart terminals in the current network, and listen for information broadcast by the smart terminal;
acting as a convergence point, the smart terminal listens for queries from controllers, and when a query is heard, returns its own device description information to the controller that initiated the query;
the controller that receives the query response, acting as a client, initiates a session to the smart terminal, and a session is established between the smart terminal and the controller.
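The discovery and session handshake just described can be sketched as an in-memory exchange. This is a stand-in for the real network transport (Ethernet, WiFi, or Bluetooth per the text); the class and method names, and the shape of the device description, are assumptions for illustration.

```python
class SmartTerminal:
    """Convergence point: answers controller queries with its device description
    and accepts session requests."""
    def __init__(self, description):
        self.description = description
        self.sessions = set()

    def on_query(self, controller_id):
        # Query response: return own device description information.
        return self.description

    def on_session_request(self, controller_id):
        # Establish a session with the requesting controller.
        self.sessions.add(controller_id)
        return True

class Controller:
    def __init__(self, controller_id):
        self.controller_id = controller_id
        self.terminal_description = None

    def discover_and_connect(self, terminal):
        # 1) query the terminal's device description; 2) initiate a session as client.
        self.terminal_description = terminal.on_query(self.controller_id)
        return terminal.on_session_request(self.controller_id)
```

After `discover_and_connect`, the terminal tracks the controller in its session set and the controller holds the terminal's description for later use.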
The specific implementation of this step is consistent with step 100, except that in this step the smart terminal collects the scene information corresponding to the controller's request. For example, if the query request was initiated by a vibration controller, the smart terminal only recognizes object categories that trigger vibration, such as rock; that is, the objects in the scene information returned at this time will only be of categories that trigger vibration.
Step 201: the smart terminal determines, according to the identified scene information, whether multi-dimensional experience control needs to be started.
In this step, correspondences between different object categories and control information are preset in the smart terminal; when an object in the obtained scene information belongs to a preset trigger-control object category and a preset trigger condition is satisfied, the corresponding multi-dimensional experience control is started.
The specific implementation of this step is consistent with step 102 and is not repeated here.
Step 202: when it is determined that multi-dimensional experience control needs to be started, the corresponding control information is delivered to the corresponding controller.
In this step, the smart terminal delivers the final control information directly to the controller; the controller only needs to start and trigger the corresponding action according to the received control instruction.
FIG. 3 is a schematic structural diagram of a smart terminal according to an embodiment of the present invention. As shown in FIG. 3, it includes at least a first analysis module 300 and a broadcast module 301, where:
the first analysis module 300 is configured to analyze the currently played video content that has been obtained, to identify the scene information corresponding to the video content;
the broadcast module 301 is configured to send the identified scene information to the controllers, so that a controller starts multi-dimensional control according to the scene information.
The first analysis module 300 may be configured to:
when a video is played, sample and analyze video frames and attempt to search for candidate objects, that is, obtain the motion estimation vectors for each sampled frame; use a classification algorithm such as k-means clustering to divide the obtained motion estimation vectors into the following two classes: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; delimit several regions in which macroblocks with large motion estimation vectors are concentrated as marked regions; and if the area of a marked region is too small, discard that marked region. Objects located outside the marked regions are called reference objects;
continuously probe the key frames of the currently played video frames, and if a marked region persists throughout a preset, sufficiently long sequence of video frames, start sampling and analyzing the key frames in that sequence, and for each sampled frame identify and locate the candidate objects in the frame and their positions using an algorithm such as a neural network, thereby obtaining the scene information.
FIG. 4 is a schematic structural diagram of another smart terminal according to an embodiment of the present invention. As shown in FIG. 4, it includes at least a second analysis module 401 and a determination module 402, where:
the second analysis module 401 is configured to analyze the currently played video content that has been obtained, to identify the scene information corresponding to the controller that initiated the request;
the determination module 402 is configured to determine, according to the identified scene information, whether multi-dimensional experience control needs to be started, and when it is determined that multi-dimensional experience control needs to be started, to deliver the corresponding control information to the corresponding controller.
The smart terminal shown in FIG. 4 may further include an establishment module 400, configured to listen for query commands from one or more controllers, return the device description information of the smart terminal to which it belongs to the controller that initiated the query request, and establish a session with the controller that initiates the session.
The second analysis module 401 may be configured to:
continuously probe the key frames of the currently played video frames, and if a marked region persists throughout a preset, sufficiently long sequence of video frames, start sampling and analyzing the key frames in that sequence, and for each sampled frame identify and locate, using an algorithm such as a neural network, the candidate objects in the frame that are related to the controller that initiated the query and established the session, together with their positions, thereby identifying the scene information corresponding to that controller.
The determination module 402 may be configured to: according to preset correspondences between different object categories and control information, when an object in the obtained scene information belongs to a preset trigger-control object category and a preset trigger condition is satisfied, start the corresponding multi-dimensional experience control and deliver the corresponding control information to the corresponding controller.
FIG. 5 is a schematic structural diagram of a controller according to an embodiment of the present invention. As shown in FIG. 5, it includes at least an acquisition module 500 and a control module 501, where:
the acquisition module 500 is configured to obtain the scene information corresponding to the currently played video content;
the control module 501 is configured to perform the corresponding control when it determines, according to the obtained scene information, that it needs to start multi-dimensional experience control.
Correspondences between different object categories and control information are preset in the control module 501; the control module 501 may be configured to start multi-dimensional experience control when an object in the obtained scene information belongs to a preset trigger-control object category and a preset trigger condition is satisfied.
The acquisition module 500 is further configured to: send a query command to query the device description information of smart terminals in the current network, and listen for information broadcast by the smart terminal.
A detailed description is given below with reference to specific embodiments.
FIG. 6 is a schematic diagram of a networking architecture in which the controllers are deployed in a centralized manner according to an embodiment of the present invention. As shown in FIG. 6, in the first embodiment it is assumed that multiple controllers are deployed in a centralized manner, for example in a wearable device. The first embodiment takes a vibration controller (for example, one embedded in smart trousers) initiating a query request as an example, and in the first embodiment the smart terminal determines whether the vibration controller needs to be activated to trigger the vibration effect. This embodiment may include the following.
First, after the vibration controller starts, it sends a query command to the smart terminal to query the device description information of smart terminals in the current network, and listens for the smart terminal's broadcast information. The smart terminal, as the convergence point, on hearing a query initiated by the vibration controller, reads its own device description information and returns it to the vibration controller in a query response. The vibration controller, acting as a client, initiates a session request; the smart terminal receives the session request and establishes a session between itself and the vibration controller.
Next, when the smart terminal plays a video, it first samples and analyzes the video frames, attempting to search for candidate objects; that is, it obtains the motion estimation vectors for each sampled frame. A classification algorithm divides the obtained motion estimation vectors of the video frame into the following two classes: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors. Several regions in which macroblocks with large motion estimation vectors are concentrated are delimited as marked regions. If the area of a marked region is too small, that marked region is discarded. Objects located outside the marked regions are called reference objects.
If a marked region persists throughout a sufficiently long sequence of video frames, the frames in that sequence are sampled and analyzed, and for each sampled frame the main objects in the frame and their positions are identified and located using an algorithm such as a neural network. For example, in a specific implementation this neural network may adopt the AlexNet structure: eight layers in total, the first five being convolutional layers and the last three being fully connected layers, with the last layer using a softmax classifier. Among the five convolutional layers, the first convolves with a specific template stride, then applies ReLU as the activation function, followed by normalization and pooling, and the result is fed as input to the second convolutional layer; the following four convolutional layers are similar to the first, except that lower-dimensional convolution templates are used. In the three fully connected layers, ReLU is followed by dropout before full connection; finally, softmax loss is used as the loss function.
Then, if the previously obtained reference objects all exist in the currently sampled video frame sequence, a candidate object identified in the marked regions of the video frame sequence is labeled with its object category if the following conditions are satisfied: 1) the object category exists in the marked regions of consecutive video frame sequences; 2) for each object of the category, the relative position vector with respect to the reference object of each video sequence keeps changing. In an exemplary embodiment, if there is more than one category of candidate object, the scene information further includes recording additional parameters, such as object duration, relative speed of object position movement, object count, and so on.
In the first embodiment, the smart terminal holds correspondences between different object categories and control information; when an object in the obtained scene information belongs to a preset trigger-control object category and a preset trigger condition is satisfied, the corresponding multi-dimensional experience control is started. In the first embodiment, it is assumed that several vibration-triggering correspondences are preset for the vibration controller: each trigger item specifies the triggering object category and the trigger condition, and the vibration effect is triggered when the trigger item is satisfied. For example, for the vibration controller the correspondence may be set as: when an object in the obtained scene information belongs to an object category that triggers vibration, such as rock, and a trigger condition is satisfied, such as more than one object moving faster than 1/8 of the screen per second for more than 3 seconds, the vibration controller is activated to trigger the vibration effect.
Finally, in the first embodiment, the smart terminal only needs to deliver the corresponding control information, i.e. triggering the vibration effect, to the vibration controller.
In the second embodiment, taking an odor controller as an example, it is assumed that the smart terminal determines whether the odor controller needs to be activated to emit an odor effect, then generates the control information and sends it to the odor controller. This embodiment may include the following.
First, after the odor controller starts, it sends a query command to the smart terminal to query the device description information of smart terminals in the current network, and listens for the smart terminal's broadcast information. The smart terminal, as the convergence point, on hearing a query initiated by the odor controller, reads its own device description information and returns it to the odor controller in a query response. The odor controller, acting as a client, initiates a session request; the smart terminal receives the session request and establishes a session between itself and the odor controller.
Next, in the second embodiment, the smart terminal classifies according to the objects in the scene. In some scenes, certain environmental odors need to be produced to enrich the user experience; accordingly, the recognizable objects and their corresponding odors are preset.
When the smart terminal plays a video, one out of every several key frames of the video is taken as a sample. Using an algorithm such as a convolutional neural network on the samples, it is recognized that a large number of flower bouquets exist in the frame and persist for a considerable period of time. The specific implementation is consistent with the first embodiment and is not repeated here.
In the second embodiment, the smart terminal holds correspondences between different scene information and control information; when an object in the obtained scene information belongs to a preset trigger-control object category and a preset trigger condition is satisfied, the corresponding multi-dimensional experience control is started. In the second embodiment, it is assumed that several scent-triggering correspondences are preset for the odor controller: each trigger item specifies the triggering object category and the trigger condition, and the odor effect is triggered when the trigger item is satisfied. For example: when an object in the obtained scene information belongs to an object category that triggers an odor, such as osmanthus, and a trigger condition is satisfied, such as appearing continuously for more than 6 seconds with a count above 10, the odor controller is activated to emit an osmanthus-scented odor.
Finally, in the second embodiment, the smart terminal only needs to deliver the corresponding control information, i.e. triggering the osmanthus-scented odor, to the odor controller.
FIG. 7 is a schematic diagram of a networking architecture in which the controllers are deployed in a distributed manner according to an embodiment of the present invention. As shown in FIG. 7, in the third embodiment it is assumed that multiple controllers are deployed in a distributed manner. In the third embodiment, the smart terminal only needs to recognize the configured object categories and broadcast the identified scene information, while each controller determines, for the scene information within its own control scope, whether the controller needs to be activated to trigger a multi-dimensional effect. This embodiment may include the following.
First, the key frames of the currently played video frames are continuously probed. For example, a neural network detects a large field of flowers in the current picture and finds the edge contours of the flowers; if it is further detected that the flowers sway to the right with a large amplitude, it can be inferred from the swaying direction that wind is blowing from left to right, and the wind level can be estimated from the swaying amplitude. If persons are also detected in the picture, their positions and number are marked, and the relative movement speed between persons is found across multiple frames. The information obtained in this way is the scene information.
Next, the smart terminal broadcasts the obtained scene information, namely: the kind of flower and the approximate number of flowers; the wind direction and wind level; the number of persons and their relative movement speed.
Then, the processing for each controller is as follows:
Each blowing controller decides, according to the obtained scene information, its own position, and the correspondences between different scene information and control information, whether blowing needs to be triggered and at what magnitude. For example, if the wind in the scene information blows from left to right and the blowing controller is positioned on the left, it blows the wind force corresponding to the scene information; if the blowing controller is positioned on the right, blowing does not need to be triggered.
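The blowing controller's decision above reduces to a small rule: act only when the controller sits on the side the wind comes from, at the broadcast wind level. The function name and the left/right encoding are illustrative assumptions.

```python
def blow_command(wind_direction, wind_level, controller_side):
    """Return the blowing magnitude for a controller on the given side.
    wind_direction: 'left-to-right' or 'right-to-left' from the scene info.
    A controller on the side the wind comes FROM reproduces it at wind_level;
    a controller on the opposite side stays idle (0)."""
    source = "left" if wind_direction == "left-to-right" else "right"
    return wind_level if controller_side == source else 0
```

So for a left-to-right wind of level 3, the left-hand controller blows at level 3 and the right-hand controller does nothing.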
Each odor controller, according to the obtained scene information and the preset correspondences between different scene information and control information, is triggered to release the fragrance of the flower category in the corresponding scene information.
Each sound controller selects, according to the obtained scene information, the corresponding background sound, such as the sound of wind rustling through grass. Then, according to the movement speed and movement direction of the persons in the scene information, the preset correspondences between different scene information and control information, and the audio channel corresponding to the sound controller itself, the sound controller is triggered to select the strength or fade of the footstep sound, after which it superimposes the background sound and the footstep sound and outputs the result, completing the sound output for its channel.
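The per-channel superposition step can be sketched as follows. The sample representation (floating-point samples in [-1.0, 1.0]) and the single gain factor standing in for the footstep strength/fade are illustrative assumptions.

```python
def mix_channel(background, footsteps, footstep_gain):
    """Superimpose the background sound and footstep samples for one channel.
    footstep_gain (0.0-1.0) encodes the footstep strength/fade derived from the
    person's movement speed and direction; the sum is clamped to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, b + footstep_gain * f))
            for b, f in zip(background, footsteps)]
```

A controller on the channel nearer the walking person would use a higher `footstep_gain`, producing the footstep gradient across channels described above.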
In this way, under the combined effect of the various controllers, a scene of wind blowing over a field of flowers with people walking about is simulated for the user.
An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method for realizing multi-dimensional control described in any of the above embodiments.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and apparatus, may be implemented as software, firmware, hardware, or a suitable combination thereof. In a hardware implementation, the division between the functional modules/units mentioned above does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information, such as computer-readable instructions, data structures, program modules, or other data.
Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Furthermore, as is well known to those skilled in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media.
The above descriptions are merely preferred examples of the present application and are not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.
Industrial Applicability
Embodiments of the present application provide a method for realizing multi-dimensional control, a smart terminal, and a controller. The smart terminal performs audio and video detection to identify the scene of the currently playing video, and controls various controllers according to the identified scenes to reconstruct the scene being played, thereby adding multi-dimensional experience effects to the presented content in real time in a manner suitable for ordinary households.

Claims (21)

  1. A method for realizing multi-dimensional control, comprising:
    analyzing, by a smart terminal, acquired currently-playing video content to identify scene information corresponding to the video content; and
    sending, by the smart terminal, the scene information to a controller, so that the controller starts multi-dimensional control according to the scene information.
  2. The method according to claim 1, wherein analyzing the acquired currently-playing video content to identify the scene information corresponding to the video content comprises:
    when the smart terminal plays a video, sampling and analyzing video frames to search for candidate objects, wherein, for each sampled frame, motion estimation vectors are acquired and the region where macroblocks with large motion estimation vectors are concentrated is delineated as a marked region; and
    continuously probing, by the smart terminal, key frames in the currently-playing video frames, wherein, if a marked region persists throughout a preset sequence of video frames, the smart terminal starts sampling and analyzing the key frames in the sequence and, for each sampled frame, identifies and locates the candidate objects in the frame and their positions, so as to identify the scene information.
  3. The method according to claim 2, wherein delineating the region where macroblocks with large motion estimation vectors are concentrated as the marked region comprises:
    dividing the obtained motion estimation vectors into the following two classes by using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; and
    delineating the region where the macroblocks with large motion estimation vectors are concentrated as the marked region, wherein objects outside the marked region serve as reference objects.
  4. A method for realizing multi-dimensional control, comprising: identifying, by a controller according to obtained scene information corresponding to currently-playing video content, an instruction indicating that multi-dimensional experience control needs to be started, and performing the corresponding control.
  5. The method according to claim 4, wherein a correspondence between different object categories and control information is preset in the controller; and
    identifying, according to the obtained scene information, the instruction indicating that the controller needs to start multi-dimensional experience control comprises: determining the instruction to start the multi-dimensional experience control when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met.
  6. The method according to claim 4 or 5, wherein the controller comprises at least one of the following: a vibration controller, an odor controller, a spray controller, a light controller, and a sound controller.
  7. The method according to claim 6, wherein the plurality of controllers are deployed in a distributed or centralized manner.
  8. A method for realizing a multi-dimensional experience, comprising:
    analyzing, by a smart terminal, acquired currently-playing video content to identify scene information corresponding to a controller that initiated a request;
    determining, by the smart terminal according to the identified scene information, whether multi-dimensional experience control needs to be started; and
    when it is determined that multi-dimensional experience control needs to be started, delivering corresponding control information to the corresponding controller.
  9. The method according to claim 8, wherein, before the smart terminal analyzes the obtained video content, the method further comprises: the smart terminal, upon listening to a query command from one or more controllers, returning its own device description information to the controller that initiated the query request, and establishing a session with a controller that received the query response and initiated a session.
  10. The method according to claim 9, wherein analyzing the acquired video content to identify the scene information corresponding to the controller that initiated the request comprises:
    when the smart terminal plays a video, sampling and analyzing video frames to search for candidate objects, wherein, for each sampled frame, motion estimation vectors are acquired and the region where macroblocks with large motion estimation vectors are concentrated is delineated as a marked region; and
    continuously probing key frames in the obtained video frames, wherein, if a marked region persists throughout a preset sequence of video frames, sampling and analysis of the key frames in the sequence is started and, for each sampled frame, the candidate objects in the frame related to the controller that initiated the query and established the session, together with their positions, are identified and located, so as to identify the scene information corresponding to that controller.
  11. The method according to claim 10, wherein delineating the region where macroblocks with large motion estimation vectors are concentrated as the marked region comprises:
    dividing the obtained motion estimation vectors into the following two classes by using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; delineating the region where the macroblocks with large motion estimation vectors are concentrated as the marked region; and taking objects outside the marked region as reference objects.
  12. The method according to claim 9, wherein a correspondence between different object categories and control information is preset in the smart terminal; and
    determining, by the smart terminal according to the identified scene information, whether multi-dimensional experience control needs to be started comprises: starting the corresponding multi-dimensional experience control when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met.
  13. A smart terminal, comprising a first analysis module and a broadcast module, wherein:
    the first analysis module is configured to analyze acquired currently-playing video content to identify scene information corresponding to the video content; and
    the broadcast module is configured to send the identified scene information to a controller, so that the controller starts multi-dimensional control according to the scene information.
  14. The smart terminal according to claim 13, wherein the first analysis module is configured to: when a video is played, sample and analyze video frames and acquire motion estimation vectors for each sampled frame; divide the obtained motion estimation vectors into the following two classes by using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; delineate the region where the macroblocks with large motion estimation vectors are concentrated as a marked region; and
    continuously probe key frames in the currently-playing video frames, wherein, if a marked region persists throughout a sequence of video frames, sampling and analysis of the key frames in the sequence is started and, for each sampled frame, the candidate objects in the frame and their positions are identified and located, so as to identify the scene information.
  15. A smart terminal, comprising a second analysis module and a determination module, wherein:
    the second analysis module is configured to analyze acquired currently-playing video content to identify scene information corresponding to a controller that initiated a request; and
    the determination module is configured to determine, according to the identified scene information, whether multi-dimensional experience control needs to be started, and, when it is determined that multi-dimensional experience control needs to be started, to deliver corresponding control information to the corresponding controller.
  16. The smart terminal according to claim 15, further comprising: an establishment module configured to, upon listening to a query command from one or more controllers, return the device description information of the smart terminal to which it belongs to the controller that initiated the query request, and establish a session with the controller that initiated a session.
  17. The smart terminal according to claim 16, wherein the second analysis module is configured to:
    when a video is played, sample and analyze video frames and acquire motion estimation vectors for each sampled frame; divide the obtained motion estimation vectors into the following two classes by using a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; delineate the region where the macroblocks with large motion estimation vectors are concentrated as a marked region, wherein objects outside the marked region are referred to as reference objects; and
    continuously probe key frames in the currently-playing video frames, wherein, if a marked region persists throughout a sequence of video frames, sampling and analysis of the frames in the sequence is started and, for each sampled frame, the main objects in the frame related to the controller that initiated the query and established the session, together with their positions, are identified and located, so as to identify the scene information corresponding to that controller.
  18. The smart terminal according to claim 16, wherein the determination module is configured to: according to a preset correspondence between different object categories and control information, when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met, start the corresponding multi-dimensional experience control and deliver the corresponding control information to the corresponding controller.
  19. A controller, comprising an acquisition module and a control module, wherein:
    the acquisition module is configured to acquire scene information corresponding to currently-playing video content; and
    the control module is configured to perform the corresponding control when it is determined, according to the obtained scene information, that multi-dimensional experience control needs to be started.
  20. The controller according to claim 19, wherein a correspondence between different object categories and control information is preset in the control module; and
    the control module is configured to start the multi-dimensional experience control when an object in the obtained scene information belongs to a preset object category that triggers control and a preset trigger condition is met.
  21. The controller according to claim 19 or 20, wherein the acquisition module is further configured to: send a query command to query device description information of a smart terminal in a current network, and listen for information broadcast by the smart terminal.
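Claims 3, 11, 14, and 17 above refer to "a classification algorithm" that splits macroblock motion estimation vectors into a "large" class (the marked region) and a "small" class (background serving as reference) without naming a specific algorithm. As a purely illustrative sketch outside the claims, a one-dimensional two-means split over vector magnitudes is one way such a two-class division could be realized; all names and numeric values below are assumptions:

```python
# One possible instance of the unspecified "classification algorithm":
# a 1-D two-means split of macroblock motion-vector magnitudes into a
# "large" class (candidate marked region) and a "small" class
# (background / reference objects). Illustrative only.

def split_motion_vectors(magnitudes, iterations=20):
    """Return (large_indices, small_indices) for a list of magnitudes."""
    lo, hi = min(magnitudes), max(magnitudes)
    if lo == hi:                      # no motion contrast: nothing "large"
        return [], list(range(len(magnitudes)))
    c_small, c_large = float(lo), float(hi)
    for _ in range(iterations):       # standard two-means centroid updates
        small = [m for m in magnitudes if abs(m - c_small) <= abs(m - c_large)]
        large = [m for m in magnitudes if abs(m - c_small) > abs(m - c_large)]
        if small:
            c_small = sum(small) / len(small)
        if large:
            c_large = sum(large) / len(large)
    threshold = (c_small + c_large) / 2
    large_idx = [i for i, m in enumerate(magnitudes) if m > threshold]
    small_idx = [i for i, m in enumerate(magnitudes) if m <= threshold]
    return large_idx, small_idx

mags = [0.2, 0.1, 8.5, 9.0, 0.3, 7.8]   # per-macroblock vector magnitudes
large, small = split_motion_vectors(mags)
print(large)   # indices of macroblocks forming the marked region
print(small)   # indices of background macroblocks (reference objects)
```

The indices in the "large" class would then be grouped spatially to delineate the marked region; any threshold-based or clustering method with two output classes would serve the same role in the claims.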
PCT/CN2017/079444 2016-04-05 2017-04-05 Method for realizing multi-dimensional control, intelligent terminal and controller WO2017173976A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610206745.2 2016-04-05
CN201610206745.2A CN105760141B (en) 2016-04-05 2016-04-05 Method for realizing multidimensional control, intelligent terminal and controller

Publications (1)

Publication Number Publication Date
WO2017173976A1 true WO2017173976A1 (en) 2017-10-12

Family

ID=56333468

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/079444 WO2017173976A1 (en) 2016-04-05 2017-04-05 Method for realizing multi-dimensional control, intelligent terminal and controller

Country Status (2)

Country Link
CN (1) CN105760141B (en)
WO (1) WO2017173976A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110493090A (en) * 2019-08-22 2019-11-22 三星电子(中国)研发中心 A kind of method and system for realizing Intelligent home theater
CN111031392A (en) * 2019-12-23 2020-04-17 广州视源电子科技股份有限公司 Media file playing method, system, device, storage medium and processor
EP3675504A1 (en) * 2018-12-31 2020-07-01 Comcast Cable Communications LLC Environmental data for media content

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760141B (en) * 2016-04-05 2023-05-09 中兴通讯股份有限公司 Method for realizing multidimensional control, intelligent terminal and controller
CN106657975A (en) * 2016-10-10 2017-05-10 乐视控股(北京)有限公司 Video playing method and device
CN108063701B (en) * 2016-11-08 2020-12-08 华为技术有限公司 Method and device for controlling intelligent equipment
CN107743205A (en) * 2017-09-11 2018-02-27 广东欧珀移动通信有限公司 Image processing method and device, electronic installation and computer-readable recording medium
CN110475159A (en) * 2018-05-10 2019-11-19 中兴通讯股份有限公司 The transmission method and device of multimedia messages, terminal
CN109388719A (en) * 2018-09-30 2019-02-26 京东方科技集团股份有限公司 Multidimensional contextual data generating means and method based on Digitized Works
CN110245628B (en) * 2019-06-19 2023-04-18 成都世纪光合作用科技有限公司 Method and device for detecting discussion scene of personnel
CN112040289B (en) * 2020-09-10 2022-12-06 深圳创维-Rgb电子有限公司 Video playing control method and device, video playing equipment and readable storage medium
CN114885189A (en) * 2022-04-14 2022-08-09 深圳创维-Rgb电子有限公司 Control method, device and equipment for opening fragrance and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070217511A1 (en) * 2006-03-14 2007-09-20 Celestial Semiconductor, Ltd. Method and system for motion estimation with multiple vector candidates
CN105072483A (en) * 2015-08-28 2015-11-18 深圳创维-Rgb电子有限公司 Smart home equipment interaction method and system based on smart television video scene
CN105306982A (en) * 2015-05-22 2016-02-03 维沃移动通信有限公司 Sensory feedback method for mobile terminal interface image and mobile terminal thereof
CN105760141A (en) * 2016-04-05 2016-07-13 中兴通讯股份有限公司 Multi-dimensional control method, intelligent terminal and controllers

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101035279B (en) * 2007-05-08 2010-12-15 孟智平 Method for using the information set in the video resource
CN103559713B (en) * 2013-11-10 2017-01-11 深圳市幻实科技有限公司 Method and terminal for providing augmented reality
CN103679727A (en) * 2013-12-16 2014-03-26 中国科学院地理科学与资源研究所 Multi-dimensional space-time dynamic linkage analysis method and device
CN103970892B (en) * 2014-05-23 2017-03-01 无锡清华信息科学与技术国家实验室物联网技术中心 Various dimensions viewing system control method based on intelligent home device


Also Published As

Publication number Publication date
CN105760141A (en) 2016-07-13
CN105760141B (en) 2023-05-09

Similar Documents

Publication Publication Date Title
WO2017173976A1 (en) Method for realizing multi-dimensional control, intelligent terminal and controller
CN109873951B (en) Video shooting and playing method, device, equipment and medium
US11854547B2 (en) Network microphone device with command keyword eventing
US10554850B2 (en) Video ingestion and clip creation
CN106686404B (en) Video analysis platform, matching method, and method and system for accurately delivering advertisements
KR101588046B1 (en) Method and system for generating data for controlling a system for rendering at least one signal
CN106057205B (en) Automatic voice interaction method for intelligent robot
CN104620522B (en) User interest is determined by detected body marker
US11810597B2 (en) Video ingestion and clip creation
KR102197098B1 (en) Method and apparatus for recommending content
CN104618446A (en) Multimedia pushing implementing method and device
CN109635616A (en) Interactive approach and equipment
EP3675504A1 (en) Environmental data for media content
CN111442464B (en) Air conditioner and control method thereof
KR101924715B1 (en) Techniques for enabling auto-configuration of infrared signaling for device control
US20200143823A1 (en) Methods and devices for obtaining an event designation based on audio data
KR20160099289A (en) Method and system for video search using convergence of global feature and region feature of image
CN111096078B (en) Method and system for creating light script of video
US20180006869A1 (en) Control method and system, and electronic apparatus thereof
US20230147768A1 (en) Adaptive learning system for localizing and mapping user and object using an artificially intelligent machine
CN110889354B (en) Image capturing method and device of augmented reality glasses
EP3777485B1 (en) System and methods for augmenting voice commands using connected lighting systems
US20190066711A1 (en) Voice filtering system and method
UA146807U (en) WAY OF CONTROLLING OBJECTS OF ADDED REALITY
CN116386639A (en) Voice interaction method, related device, equipment, system and storage medium

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17778640

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17778640

Country of ref document: EP

Kind code of ref document: A1