CN105760141B - Method for realizing multidimensional control, intelligent terminal and controller - Google Patents

Method for realizing multidimensional control, intelligent terminal and controller

Info

Publication number
CN105760141B
Authority
CN
China
Prior art keywords
controller
intelligent terminal
motion estimation
scene information
video
Prior art date
Legal status
Active
Application number
CN201610206745.2A
Other languages
Chinese (zh)
Other versions
CN105760141A (en)
Inventor
赵秋林
黄宇轩
刘成刚
Current Assignee
ZTE Corp
Original Assignee
ZTE Corp
Priority date
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN201610206745.2A priority Critical patent/CN105760141B/en
Publication of CN105760141A publication Critical patent/CN105760141A/en
Priority to PCT/CN2017/079444 priority patent/WO2017173976A1/en
Application granted granted Critical
Publication of CN105760141B publication Critical patent/CN105760141B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Abstract

The invention discloses a method for realizing a multidimensional experience, an intelligent terminal and a controller. In the method, the intelligent terminal analyzes the currently played video content it has acquired to identify the scene information corresponding to that content, then sends the scene information to the controller so that the controller starts multidimensional control accordingly. The technical scheme of the invention uses the intelligent terminal for audio and video detection: it identifies the scene currently being played and drives the various controllers according to the identified scenes to reconstruct that scene, thereby adding a multidimensional experience to the played content in real time in a way that is practical for ordinary households.

Description

Method for realizing multidimensional control, intelligent terminal and controller
Technical Field
The present invention relates to, but is not limited to, intelligent technologies, and in particular to a method for implementing multidimensional control, an intelligent terminal, and a controller.
Background
If effects such as vibration, blowing, smoke, bubbles, smells, scenery and live performance could be simulated while a user watches television or a movie, a unique form of presentation would result: on-site special effects tightly coupled to the drama create an environment consistent with the content of the film, letting the audience experience an entirely new entertainment effect through multiple physical and sensory channels such as vision, smell, hearing and touch.
At present, however, such a multidimensional experience is available only with specially produced films, whose multidimensional control instructions are synchronized with the film in advance: a control instruction is sent to the corresponding controller at the corresponding point in the showing so that the controller produces the vibration, blowing, smoke, bubble, smell, scenery, performance or other effect. In other words, this entirely new entertainment effect currently cannot be enjoyed in an ordinary home.
Disclosure of Invention
The invention provides a method for realizing multidimensional control, an intelligent terminal and a controller, which can add multidimensional experience effects to played content in real time and are suitable for ordinary households.
To achieve this object, the invention provides a method for implementing multidimensional control, including: the intelligent terminal analyzes the acquired, currently played video content to identify the scene information corresponding to the video content;
the intelligent terminal sends the scene information to the controller so that the controller can start multidimensional control according to the scene information.
Optionally, analyzing the acquired video content and identifying the scene information includes:
when the intelligent terminal plays the video, sampling and analyzing video frames to search for candidate objects: for each sampled frame, acquiring the motion estimation vectors and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions;
and continuously detecting key frames among the video frames currently played by the intelligent terminal, and if a marked region persists throughout a video frame sequence of preset, relatively long duration, the intelligent terminal starting to sample and analyze the key frames in that sequence, identifying and locating the candidate objects and their positions in each sampled frame, so as to identify the scene information.
Optionally, demarcating the regions within the set of macroblocks having large motion estimation vectors as marked regions includes:
classifying the acquired motion estimation vectors into two classes with a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors;
and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions, with an object located outside the marked regions serving as a reference.
The invention also provides a method for realizing multidimensional control, in which the controller identifies, from the acquired scene information corresponding to the currently played video content, an instruction requiring it to start multidimensional experience control, and performs the corresponding control.
Optionally, correspondences between different object categories and control information are preset in the controller;
identifying, from the acquired scene information, the instruction requiring the controller itself to start multidimensional experience control includes: determining an instruction to start multidimensional experience control when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met.
Optionally, the controller includes: a vibration controller, and/or an odor controller, and/or a spray controller, and/or a light controller, and/or a sound controller.
Optionally, the controllers are deployed in a distributed or centralized manner.
The invention also provides a method for realizing multidimensional experience, which comprises the following steps:
the intelligent terminal analyzes the acquired, currently played video content to identify the scene information corresponding to the controller that initiated a request;
the intelligent terminal determines, according to the identified scene information, whether multidimensional experience control needs to be started;
and when it determines that multidimensional experience control should be started, it issues the corresponding control information to the corresponding controller.
Optionally, before the intelligent terminal analyzes the obtained video content, the method further includes:
the intelligent terminal listens for query commands from one or more controllers and returns its own device description information to a controller that initiates a query request;
and the controller that receives the query response initiates a session to the intelligent terminal as a client, and a session is established between the intelligent terminal and the controller.
Optionally, analyzing the acquired video content and identifying the scene information corresponding to the controller initiating the request includes:
when the intelligent terminal plays the video, sampling and analyzing video frames to search for candidate objects: for each sampled frame, acquiring the motion estimation vectors and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions;
and continuously detecting key frames among the acquired video frames, and if a marked region persists throughout a preset long-duration video frame sequence, starting to sample and analyze the key frames in that sequence, identifying and locating in each sampled frame the candidate objects, and their positions, relevant to the controller that initiated the query and established the session, so as to identify the scene information corresponding to that controller.
Optionally, demarcating the regions within the set of macroblocks having large motion estimation vectors as marked regions includes:
classifying the acquired motion estimation vectors into two classes with a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors;
and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions, with an object located outside the marked regions serving as a reference.
Optionally, correspondences between different object categories and control information are preset in the intelligent terminal;
the intelligent terminal determining, according to the acquired scene information, whether to start multidimensional experience control includes: when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met, starting the corresponding multidimensional experience control and issuing the corresponding control information to the corresponding controller.
The invention also provides an intelligent terminal, which comprises a first analysis module and a broadcasting module; wherein:
the first analysis module is configured to analyze, after the multidimensional experience function is started, the acquired, currently played video content to identify the scene information corresponding to the video content;
and the broadcasting module is configured to send the identified scene information to the controller so that the controller starts multidimensional control according to the scene information.
Optionally, the first analysis module is specifically configured to: when the video is playing, sample and analyze video frames, obtaining the motion estimation vectors for each sampled frame; classify the obtained motion estimation vectors into two classes with a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; and demarcate a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions;
and continuously detect key frames among the currently played video frames; if a marked region persists throughout a long-duration video frame sequence, start sampling and analyzing the key frames in that sequence, identifying and locating the candidate objects and their positions in each sampled frame so as to identify the scene information.
The invention also provides an intelligent terminal, which comprises a second analysis module and a determination module; wherein:
the second analysis module is configured to analyze, after the multidimensional experience function is started, the acquired, currently played video content to identify the scene information corresponding to the controller that initiated a request;
and the determination module is configured to determine, according to the identified scene information, whether multidimensional experience control needs to be started, and to issue the corresponding control information to the corresponding controller when it does.
Optionally, the terminal further comprises an establishing module, configured to listen for query commands from one or more controllers, return the device description information of its intelligent terminal to the controller that initiated the query request, and establish a session with the controller that initiates a session.
Optionally, the second analysis module is specifically configured to:
when the video is playing, sample and analyze video frames, obtaining the motion estimation vectors for each sampled frame; classify the obtained motion estimation vectors into two classes with a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; demarcate a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions, with objects located outside the marked regions serving as references;
and continuously detect key frames among the currently played video frames; if a marked region persists throughout a long-lasting video frame sequence, start sampling and analyzing the frames in that sequence, identifying and locating in each sampled frame the main objects, and their positions, relevant to the controller that initiated the query and established the session, so as to identify the scene information corresponding to that controller.
Optionally, the determination module is specifically configured to: when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met, start the corresponding multidimensional experience control and issue the corresponding control information to the corresponding controller.
The invention further provides a controller, which comprises an acquisition module and a control module; wherein:
the acquisition module is configured to acquire the scene information corresponding to the currently played video content;
and the control module is configured to perform the corresponding control when it determines, according to the acquired scene information, that multidimensional experience control needs to be started.
Optionally, correspondences between different object categories and control information are preset in the control module;
the control module is specifically configured to: start multidimensional experience control when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met.
Optionally, the acquisition module is further configured to: send a query command to query the device information of intelligent terminals in the current network, and listen for information broadcast by the intelligent terminal.
Compared with the prior art, the technical scheme of the application includes: the intelligent terminal analyzes the acquired, currently played video content to identify the scene information corresponding to the video content, and sends the scene information to the controller so that the controller starts multidimensional control accordingly. Alternatively, after the multidimensional experience function is started, the intelligent terminal analyzes the currently played video content to acquire the scene information corresponding to the controller that initiated a request, determines according to that information whether multidimensional experience control needs to be started, and, when it does, issues the corresponding control information to the corresponding controller. The technical scheme of the invention uses the intelligent terminal for audio and video detection: it identifies the scene currently being played and drives the various controllers according to the identified scenes to reconstruct that scene, thereby adding a multidimensional experience to the played content in real time in a way that is practical for ordinary households.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a flow chart of a method of implementing a multi-dimensional experience in accordance with the present invention;
FIG. 2 is a flow chart of another method of implementing a multidimensional experience in accordance with the present invention;
FIG. 3 is a schematic diagram of a composition structure of an intelligent terminal according to the present invention;
fig. 4 is a schematic diagram of a composition structure of another intelligent terminal according to the present invention;
FIG. 5 is a schematic diagram of the structure of the controller according to the present invention;
FIG. 6 is a schematic diagram of a networking architecture for a controller employing centralized deployment in accordance with the present invention;
FIG. 7 is a schematic diagram of a networking architecture employing distributed deployment of the controller of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail hereinafter with reference to the accompanying drawings. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be arbitrarily combined with each other.
FIG. 1 is a flow chart of a method for implementing multidimensional control according to the present invention, as shown in FIG. 1, comprising:
step 100: and the intelligent terminal analyzes the acquired video content which is currently played so as to identify scene information corresponding to the video content.
After the multidimensional experience function is started, the intelligent terminal first samples and analyzes video frames while playing the video, trying to find candidate objects such as flowers (corresponding to wind), grass, or magma (corresponding to vibration). That is, for each sampled frame it obtains the motion estimation vectors and classifies them into two classes using a classification algorithm such as k-means clustering: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors. A number of regions within the set of macroblocks having large motion estimation vectors are demarcated as marked regions; a marked region whose area is too small is discarded. Objects located outside the marked regions serve as references, i.e. the large background. In this way the possible regions containing the key candidate objects are found. Concretely, for the whole video, a preset area such as a rectangular region is considered a marked region if the proportion of macroblocks with large motion vectors among all its macroblocks exceeds a preset threshold, e.g. 80% (adjustable); and a marked region is abandoned if its area falls below a preset threshold ratio of the preset area, e.g. 10% (adjustable).
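As an illustration, a minimal Python sketch of the marked-region search just described; the 80% and 10% thresholds follow the text, while the macroblock grid, the rectangle size and the use of scikit-learn/SciPy are assumptions:

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def find_marked_regions(mv_mag, big_ratio=0.8, min_area_ratio=0.10, rect=(8, 8)):
    """mv_mag: 2-D array of per-macroblock motion-vector magnitudes for one
    sampled frame (how the vectors are pulled from the decoder is assumed)."""
    # Step 1: split the magnitudes into two classes (k-means, k=2):
    # macroblocks with large vs. small motion estimation vectors.
    km = KMeans(n_clusters=2, n_init=10).fit(mv_mag.reshape(-1, 1))
    big = (km.labels_ == int(np.argmax(km.cluster_centers_))).reshape(mv_mag.shape)

    # Step 2: a preset rectangle counts as marked when the share of
    # large-vector macroblocks inside it exceeds the 80% (adjustable) threshold.
    rh, rw = rect
    marked = np.zeros(big.shape, dtype=bool)
    for r in range(big.shape[0] - rh + 1):
        for c in range(big.shape[1] - rw + 1):
            if big[r:r + rh, c:c + rw].mean() > big_ratio:
                marked[r:r + rh, c:c + rw] = True

    # Step 3: merge overlapping rectangles into regions and discard any
    # region whose area falls below 10% (adjustable) of the preset area.
    labeled, n = ndimage.label(marked)
    return [np.argwhere(labeled == i) for i in range(1, n + 1)
            if (labeled == i).sum() >= min_area_ratio * rh * rw]
```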
The intelligent terminal then continuously detects the key frames, i.e. the I-frames, among the acquired video frames. If a marked region persists throughout a video frame sequence of preset, relatively long duration, the terminal starts sampling and analyzing the key frames in that sequence and, for each sampled frame, identifies and locates the candidate objects and their positions using algorithms such as a neural network, thereby identifying the scene information. In this way the key candidate objects are recognized.
Specifically: if the previously obtained reference is present in the currently sampled video frame sequence, an object identified in the marked regions is marked as a candidate object class when 1) the object class appears in the marked regions of the successive video frames, and 2) the position vector of each object of that class relative to the reference of the respective frames changes continuously. Further, if there is more than one candidate object class, the scene information also records additional parameters such as the object duration, the relative speed of the object's movement, and the object count.
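A small sketch of this persistence test under the stated conditions; the per-frame detection dictionaries, the coordinate convention and the sampling rate are assumptions:

```python
import math

def persistent_candidates(frames, sample_fps=2.0):
    """frames: one dict per sampled key frame, mapping an object class to its
    position relative to the reference, e.g. {"flower": (dx, dy)}. The upstream
    detector, the coordinate convention and sample_fps are assumptions."""
    shared = set.intersection(*(set(f) for f in frames))  # condition 1): present in every frame
    scene = {}
    for cls in shared:
        track = [f[cls] for f in frames]
        # condition 2): the position vector relative to the reference keeps changing
        if all(a != b for a, b in zip(track, track[1:])):
            duration = len(frames) / sample_fps
            path = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
            scene[cls] = {"duration_s": duration,             # extra parameters the
                          "relative_speed": path / duration}  # text says to record
    return scene
```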
For example, in a specific implementation the neural network mentioned above may adopt the structure of AlexNet: 8 layers in total, of which the first 5 are convolutional layers and the last 3 are fully connected layers, with a softmax classifier as the final layer. Specifically: the 1st layer convolves the input at a specific template stride, applies ReLU as the activation function, and pools after normalization; its output feeds the 2nd convolutional layer, and the remaining 4 convolutional layers are similar to the 1st but use lower-dimensional convolution templates. In the last 3 fully connected layers, each ReLU is followed by dropout before the next full connection. Finally, the softmax loss is used as the loss function.
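For reference, a minimal PyTorch sketch of such an AlexNet-style network; the kernel sizes, strides, channel counts and class count below are the standard AlexNet values, assumed here since the patent does not fix them:

```python
import torch
import torch.nn as nn

class AlexNetLike(nn.Module):
    """5 convolutional layers + 3 fully connected layers, ReLU activations,
    normalization and pooling after the early convolutions, dropout in the
    fully connected stack; softmax is applied inside the loss."""
    def __init__(self, num_classes=1000):        # class count is an assumption
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
            nn.LocalResponseNorm(5), nn.MaxPool2d(3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.LocalResponseNorm(5), nn.MaxPool2d(3, stride=2),
            # the later layers use lower-dimensional (3x3) templates
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(), nn.Dropout(),
            nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(),
            nn.Linear(4096, num_classes),        # logits; softmax is in the loss
        )

    def forward(self, x):                        # x: (N, 3, 227, 227)
        return self.classifier(torch.flatten(self.features(x), 1))

criterion = nn.CrossEntropyLoss()                # log-softmax + NLL, i.e. the "softmax loss"
```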
In this step, if the previously obtained reference object is not present in the currently sampled video frame sequence, the search is abandoned and the process ends.
For example: if the neural network detects a large area of flowers in the current picture, the edge contour of the flowers can be found; if the flowers are also detected swaying to the right with a large amplitude, wind blowing from left to right can be inferred from the swaying direction, and the wind level from the swaying amplitude. If a person is detected in the picture at the same time, the positions and number of people are marked, and the speed of relative movement between them is derived over several frames, and so on. The information obtained in this way is the scene information required in this step.
Step 101: the intelligent terminal sends the identified scene information to the controller so that the controller starts multidimensional control according to the scene information.
The intelligent terminal transmits the identified scene information to the controller, for example by broadcasting it. In the example above, the scene information may include: the type and approximate number of flowers; the wind direction and wind level; and the number of people and their relative movement speed.
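Such a broadcast might look like the following sketch; the UDP transport, port and JSON field names are illustrative assumptions, since the patent does not fix a message format:

```python
import json
import socket

# Scene information identified in the example above; field names are assumptions.
scene_info = {
    "flower": {"type": "osmanthus", "count": 12},
    "wind":   {"direction": "left-to-right", "level": 3},
    "people": {"count": 2, "relative_speed": 0.15},   # screens per second
}

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sock.sendto(json.dumps(scene_info).encode(), ("255.255.255.255", 50000))
```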
A controller that needs to start multidimensional experience control then uses this scene information to perform the corresponding control.
For each controller, the method then further includes: the controller identifies, from the acquired scene information corresponding to the currently played video content, an instruction requiring it to start multidimensional experience control, and performs the corresponding control.
The controller in the present invention may include, but is not limited to: a vibration controller, and/or an odor controller, and/or a spray controller, and/or a light controller, and/or a sound controller, etc.
The controllers may be deployed in a distributed or centralized manner. With distributed deployment, each controller communicates with the intelligent terminal directly; with centralized deployment, the controllers can be built into one device, such as a wearable device, which makes the experience more convenient for the user. The controller and the intelligent terminal may communicate over Ethernet, WiFi, Bluetooth, or the like.
In this step, correspondences between different object categories and control information are preset in the controller, and an instruction to start the corresponding multidimensional experience control is determined when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met.
For example, for the vibration controller the correspondence may be set as: when an object in the acquired scene information belongs to an object category that triggers vibration, such as rocks, and the trigger condition is met (for example, more than one object, moving faster than 1/8 of the screen per second, for longer than 3 seconds), the vibration controller is started to trigger the vibration effect;
For another example, for the odor controller the correspondence may be set as: when an object in the acquired scene information belongs to an object category that triggers an odor, such as osmanthus blossom, and the trigger condition is met (for example, lasting more than 6 seconds with more than 10 objects), the odor controller is started to emit an osmanthus fragrance.
As a further example, for a sound controller the correspondence may be: when an object in the acquired scene information belongs to an object category that triggers sound, such as a person appearing in the picture, and trigger conditions such as the person's position, movement direction and movement speed are met, the sound controller is started to produce footsteps that shift gradually with the person's direction of movement.
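The correspondence tables in these examples can be pictured as a small rule set; in the sketch below the field names and rule encoding are assumptions, while the thresholds mirror the examples above:

```python
# One trigger item per controller: the triggering object category plus its conditions.
TRIGGER_RULES = {
    "vibration": {"category": "rock",
                  "min_count": 2, "min_speed": 1 / 8, "min_duration_s": 3.0},
    "odor":      {"category": "osmanthus",
                  "min_count": 11, "min_speed": 0.0, "min_duration_s": 6.0},
}

def should_trigger(controller: str, obj: dict) -> bool:
    """obj: one entry of the identified scene information, e.g.
    {"category": "rock", "count": 3, "speed": 0.2, "duration_s": 4.0},
    with speed in screens per second as in the text."""
    rule = TRIGGER_RULES.get(controller)
    return (rule is not None
            and obj["category"] == rule["category"]
            and obj["count"] >= rule["min_count"]
            and obj["speed"] >= rule["min_speed"]
            and obj["duration_s"] >= rule["min_duration_s"])
```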
FIG. 2 is a flow chart of another method for implementing multidimensional control according to the present invention, as shown in FIG. 2, comprising:
step 200: the intelligent terminal analyzes the obtained video content which is currently played so as to identify scene information corresponding to the controller which initiates the request.
Beforehand, the method also includes: after one or more controllers are started, they send a query command to the intelligent terminal to query the device information of the intelligent terminal in the current network, and listen for information broadcast by the intelligent terminal;
the intelligent terminal, acting as a convergence point, listens for queries from the controllers and, when it detects a query, returns its own device description information to the controller that initiated the query request;
the controller that receives the query response then initiates a session to the intelligent terminal as a client, and a session is established between the intelligent terminal and the controller.
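This query/response/session exchange resembles SSDP-style device discovery; a minimal sketch follows, in which the transport, port and message format are assumptions (the patent does not fix a protocol, and the terminal's TCP session server is omitted for brevity):

```python
import json
import socket

DISCOVERY_PORT = 50001          # assumed; the patent does not specify ports

def terminal_listen():
    """Intelligent-terminal side: answer controller queries with the
    terminal's device description."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", DISCOVERY_PORT))
    while True:
        data, addr = sock.recvfrom(1024)
        if data == b"QUERY":
            desc = {"device": "intelligent-terminal", "session_port": 50002}
            sock.sendto(json.dumps(desc).encode(), addr)   # query response

def controller_discover(timeout=2.0):
    """Controller side: broadcast a query, read the description from the
    query response, then open the session as the client."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(timeout)
    sock.sendto(b"QUERY", ("255.255.255.255", DISCOVERY_PORT))
    raw, addr = sock.recvfrom(1024)
    desc = json.loads(raw)
    session = socket.create_connection((addr[0], desc["session_port"]))
    return session, desc
```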
The specific implementation of this step is identical to step 100, except that here the intelligent terminal collects the corresponding scene information according to the controller's request. For example, if the query request was sent by the vibration controller, the intelligent terminal only identifies object categories that trigger vibration, such as rocks; that is, the scene information returned at this point contains only objects of vibration-triggering categories.
Step 201: and the intelligent terminal determines whether the multidimensional experience control needs to be started according to the identified scene information.
In this step, correspondences between different object categories and control information are preset in the intelligent terminal, and the corresponding multidimensional experience control is started when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met.
The specific implementation otherwise matches the correspondence-based matching described for the method of FIG. 1 and is not repeated here.
Step 202: and when the multi-dimensional experience control is determined to be started, the corresponding control information is issued to the corresponding controller.
In the step, the intelligent terminal directly transmits the final control information to the controller, and the controller only needs to start and trigger corresponding actions according to the received control instruction.
Fig. 3 is a schematic diagram of the composition structure of an intelligent terminal according to the present invention. As shown in fig. 3, it includes at least a first analysis module and a broadcasting module; wherein:
the first analysis module is configured to analyze, after the multidimensional experience function is started, the acquired, currently played video content to identify the scene information corresponding to the video content;
and the broadcasting module is configured to send the identified scene information to the controller so that the controller starts multidimensional control according to the scene information.
The first analysis module is specifically configured to:
When the video is playing, the module samples and analyzes video frames, trying to find candidate objects: for each sampled frame it obtains the motion estimation vectors, classifies them into two classes using a classification algorithm such as k-means clustering (macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors), and demarcates a number of regions within the large-vector macroblock set as marked regions. Marked regions whose area is too small are discarded; objects located outside the marked regions serve as references.
It then continuously detects key frames among the currently played video frames; if a marked region persists throughout a preset long-duration video frame sequence, it starts sampling and analyzing the key frames in that sequence and, for each sampled frame, identifies and locates the candidate objects and their positions using algorithms such as a neural network, thereby obtaining the scene information.
Fig. 4 is a schematic diagram of the composition structure of another intelligent terminal according to the present invention. As shown in fig. 4, it includes at least a second analysis module and a determination module; wherein:
the second analysis module is configured to analyze, after the multidimensional experience function is started, the acquired, currently played video content to identify the scene information corresponding to the controller that initiated a request;
and the determination module is configured to determine, according to the identified scene information, whether multidimensional experience control needs to be started, and to issue the corresponding control information to the corresponding controller when it does.
The intelligent terminal shown in fig. 4 further includes an establishing module configured to: listen for query commands from one or more controllers, return the device description information of its intelligent terminal to the controller that initiated the query request, and establish a session with the controller that initiates a session.
The second analysis module is specifically configured to:
when the video is playing, sample and analyze video frames, obtaining the motion estimation vectors for each sampled frame, classifying them into large-vector and small-vector macroblocks, and demarcating marked regions as described above; then continuously detect key frames among the currently played video frames and, if a marked region persists throughout a preset long-duration video frame sequence, start sampling and analyzing the key frames in that sequence, identifying and locating (using algorithms such as a neural network) in each sampled frame the candidate objects, and their positions, relevant to the controller that initiated the query and established the session, thereby identifying the scene information corresponding to that controller.
The determination module is specifically configured to: when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met, start the corresponding multidimensional experience control and issue the corresponding control information to the corresponding controller.
FIG. 5 is a schematic diagram of the structure of the controller according to the present invention. As shown in FIG. 5, it includes at least an acquisition module and a control module; wherein:
the acquisition module is configured to acquire the scene information corresponding to the currently played video content;
and the control module is configured to perform the corresponding control when it determines, according to the acquired scene information, that multidimensional experience control needs to be started.
Correspondences between different object categories and control information are preset in the control module, which is specifically configured to start multidimensional experience control when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met.
The acquisition module is further configured to: send a query command to query the device information of intelligent terminals in the current network, and listen for information broadcast by the intelligent terminal.
The following describes in detail specific embodiments.
Fig. 6 is a schematic diagram of a networking architecture in which the controllers are deployed in a centralized manner. As shown in fig. 6, the first embodiment assumes that the controllers are deployed centrally, for example in a wearable device. In this embodiment the query request is initiated by the vibration controller, and the intelligent terminal determines whether the vibration controller needs to be started to trigger a vibration effect. The steps are as follows:
First, after the vibration controller is started, it sends a query command to the intelligent terminal, queries the device description information of the intelligent terminal in the current network, and listens for the terminal's broadcast information. The intelligent terminal, acting as a convergence point, reads its own device description information when it detects the vibration controller's query and returns it in a query response. The vibration controller then initiates a session as a client; the intelligent terminal accepts it, establishing a session between itself and the vibration controller.
Then, while playing the video, the intelligent terminal first samples and analyzes the video frames, trying to find candidate objects, i.e. obtaining the motion estimation vectors for each sampled frame. Using a classification algorithm, the motion estimation vectors of a frame are divided into macroblocks with large vectors and macroblocks with small vectors. A number of regions within the large-vector macroblock set are demarcated as marked regions; marked regions whose area is too small are discarded. Objects located outside the marked regions serve as references.
If a marked region is always present in a sufficiently long video frame sequence, the frames in that sequence are sampled and analyzed, and for each sampled frame the main objects and their positions are identified and located using algorithms such as a neural network. As described above, the network may adopt the AlexNet structure: 8 layers in total, the first 5 convolutional and the last 3 fully connected, with a softmax classifier as the final layer. The 1st convolutional layer convolves at a specific template stride, applies ReLU, and pools after normalization before feeding the 2nd layer; the remaining convolutional layers are similar but use lower-dimensional templates; each fully connected layer applies ReLU followed by dropout; and the softmax loss is used as the loss function.
Then, if the previously obtained reference is present in the currently sampled video frame sequence, an object identified in the marked regions is marked as a candidate object class when 1) the object class appears in the marked regions of the successive video frames, and 2) the position vector of each object of that class relative to the reference of the respective frames changes continuously. Further, if there is more than one candidate object class, the scene information also records additional parameters such as the object duration, the relative speed of the object's movement, and the object count.
In the first embodiment, correspondences between different object categories and control information are held in the intelligent terminal, and the corresponding multidimensional experience control is started when an object in the acquired scene information belongs to an object category preset to trigger control and meets the preset trigger condition. Assume several vibration-trigger correspondences are preset for the vibration controller, each trigger item specifying the triggering object category and the trigger condition; when a trigger item is satisfied, the vibration effect is triggered. For example: when an object in the acquired scene information belongs to a vibration-triggering category such as rocks, and the trigger condition is met (for example, more than one object moving faster than 1/8 of the screen per second for longer than 3 seconds), the vibration controller is started to trigger the vibration effect.
Finally, in the first embodiment, the intelligent terminal only needs to issue the corresponding control information, i.e. the instruction to trigger the vibration effect, to the vibration controller.
In the second embodiment, taking the odor controller as an example, assume the intelligent terminal determines whether the odor controller needs to be started to produce an odor effect, then generates a control command and sends it to the odor controller. The steps are as follows:
First, after the odor controller is started, it sends a query command to the intelligent terminal, queries the device description information of the intelligent terminal in the current network, and listens for the terminal's broadcast information. The intelligent terminal, acting as a convergence point, reads its own device description information when it detects the odor controller's query and returns it in the query response. The odor controller initiates a session as a client; the intelligent terminal accepts it, establishing a session between itself and the odor controller.
Next, in the second embodiment, the intelligent terminal classifies objects in the scene; in some scenes, environmental odors need to be produced to enrich the user experience, so identifiable objects and their corresponding odors are preset accordingly.
While the intelligent terminal plays the video, it samples one of every several key frames. An algorithm such as a convolutional neural network then identifies, over the samples, that a large number of flower clusters are present in the frame and remain there for a considerable period. The specific implementation is identical to that of the first embodiment and is not repeated here.
In the second embodiment, correspondences between scene information and control information are held in the intelligent terminal, and the corresponding multidimensional experience control is started when an object in the acquired scene information belongs to an object category preset to trigger control and meets the preset trigger condition. Assume several fragrance-trigger correspondences are preset for the odor controller, each trigger item specifying the triggering object category and the trigger condition; when a trigger item is satisfied, the odor effect is triggered. For example: when an object in the acquired scene information belongs to an odor-triggering category such as osmanthus blossom, and the trigger condition is met (for example, lasting more than 6 seconds with more than 10 objects), the odor controller is started to emit an osmanthus fragrance.
finally, in the second embodiment, the intelligent terminal only needs to send the corresponding control information, namely the triggering smell with the sweet osmanthus fragrance, to the smell controller.
FIG. 7 is a schematic diagram of a networking architecture in which the controllers are deployed in a distributed manner. As shown in FIG. 7, the third embodiment assumes distributed deployment among the controllers. Here the intelligent terminal only needs to identify the configured object categories and broadcast the identified scene information; each controller then decides for itself whether scene information within its control range requires it to trigger a multidimensional effect. The steps are as follows:
First, key frames among the currently played video frames are detected continuously. If the neural network detects a large area of flowers in the current picture, the edge contour of the flowers is found; if the flowers are also detected swaying to the right with a large amplitude, wind blowing from left to right is inferred from the swaying direction, and the wind level from the swaying amplitude. If a person is detected in the picture at the same time, the positions and number of people are marked, and the speed of relative movement between them is derived over several frames, and so on. The information obtained in this way is the scene information.
The intelligent terminal then broadcasts the obtained scene information: the type and approximate number of flowers; the wind direction and wind level; and the number of people and their relative movement speed.
Then, each controller processes the broadcast as follows (a decision sketch follows the list):
Each blowing controller determines, from the obtained scene information, its own position, and the preset correspondence between scene information and control information, whether to blow and how strongly. For example: if the wind in the scene information blows from left to right and the blowing controller is positioned on the left, it blows at the wind level given in the scene information; if the blowing controller is positioned on the right, it does not need to blow.
Each fragrance (odor) controller, according to the obtained scene information and the preset correspondence between scene information and control information, is triggered to release the flower fragrance indicated by the scene information.
Each sound controller selects the corresponding background sound, such as the rustle of wind through grass, according to the obtained scene information. According to the movement speed and direction of the person in the scene information and the preset correspondences between scene information and control information, the sound controller selects the intensity or gradual change of the footstep sound for its own audio channel, then superimposes the background sound and the footstep sound and outputs the mix, completing the sound output for that channel.
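As an illustration of this distributed decision logic, a sketch of the blowing controller's rule; the scene-information field names and the position encoding are assumptions:

```python
def blowing_decision(scene, own_side):
    """own_side: 'left' or 'right', the fan's position relative to the screen.
    Returns the wind level to blow at, or 0 to stay off."""
    wind = scene.get("wind")
    if wind is None:
        return 0                                   # no wind in this scene
    comes_from = wind["direction"].split("-")[0]   # "left-to-right" -> "left"
    return wind["level"] if own_side == comes_from else 0

# Example: with wind blowing left-to-right, only the left-hand fan starts.
scene = {"wind": {"direction": "left-to-right", "level": 3}}
print(blowing_decision(scene, "left"))    # 3
print(blowing_decision(scene, "right"))   # 0
```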
Thus, under the combined action of the various controllers, a scene of wind blowing across a sea of flowers while a person walks through it is simulated for the user.
The foregoing is merely a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A method for implementing multidimensional control, comprising: analyzing, by an intelligent terminal, the acquired, currently played video content to identify scene information corresponding to the video content;
sending, by the intelligent terminal, the scene information to a controller so that the controller starts multidimensional control according to the scene information;
wherein analyzing the acquired, currently played video content to identify the scene information corresponding to the video content comprises:
when the intelligent terminal plays the video, sampling and analyzing video frames to search for candidate objects: for each sampled frame, acquiring motion estimation vectors and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions;
and continuously detecting key frames among the video frames currently played by the intelligent terminal, and if a marked region persists throughout a video frame sequence of preset, relatively long duration, starting, by the intelligent terminal, to sample and analyze the key frames in the sequence, and identifying and locating the candidate objects and their positions in each sampled frame, so as to identify the scene information.
2. The method of claim 1, wherein demarcating the regions within the set of macroblocks having large motion estimation vectors as marked regions comprises:
classifying the acquired motion estimation vectors into two classes with a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors;
and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions, an object located outside the marked regions serving as a reference.
3. A method of implementing a multi-dimensional experience, comprising:
listening, by an intelligent terminal, for query commands from one or more controllers, and returning its own device description information to a controller that initiates a query request;
initiating, by the controller that receives the query response, a session to the intelligent terminal as a client, a session being established between the intelligent terminal and the controller;
analyzing, by the intelligent terminal, the acquired, currently played video content to identify scene information corresponding to the controller that initiated the query request;
determining, by the intelligent terminal according to the identified scene information, whether multidimensional experience control needs to be started;
and when it is determined that multidimensional experience control should be started, issuing the corresponding control information to the corresponding controller;
wherein analyzing the acquired, currently played video content to identify the scene information corresponding to the controller that initiated the query request comprises:
when the intelligent terminal plays the video, sampling and analyzing video frames to search for candidate objects: for each sampled frame, acquiring motion estimation vectors and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions;
and continuously detecting key frames among the acquired video frames, and if a marked region persists throughout a preset long-duration video frame sequence, starting to sample and analyze the key frames in the sequence, and identifying and locating in each sampled frame the candidate objects, and their positions, relevant to the controller that initiated the query request and established the session, so as to identify the scene information corresponding to that controller.
4. The method of claim 3, wherein demarcating the regions within the set of macroblocks having large motion estimation vectors as marked regions comprises:
classifying the acquired motion estimation vectors into two classes with a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors;
and demarcating a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions, an object located outside the marked regions serving as a reference.
5. The method of claim 3, wherein correspondences between different object categories and control information are preset in the intelligent terminal;
the intelligent terminal determining, according to the acquired scene information, whether to start multidimensional experience control comprises: when an object in the acquired scene information belongs to an object category preset to trigger control and a preset trigger condition is met, starting the corresponding multidimensional experience control and issuing the corresponding control information to the corresponding controller.
6. An intelligent terminal, characterized by comprising a first analysis module and a broadcasting module; wherein:
the first analysis module is configured to analyze, after a multidimensional experience function is started, the acquired, currently played video content to identify scene information corresponding to the video content;
the broadcasting module is configured to send the identified scene information to a controller so that the controller starts multidimensional control according to the scene information;
the first analysis module is specifically configured to: when the video is playing, sample and analyze video frames, obtaining motion estimation vectors for each sampled frame; classify the obtained motion estimation vectors into two classes with a classification algorithm: macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; demarcate a plurality of regions within the set of macroblocks having large motion estimation vectors as marked regions;
and continuously detect key frames among the currently played video frames; if a marked region persists throughout a long-duration video frame sequence, start sampling and analyzing the key frames in the sequence, identifying and locating the candidate objects and their positions in each sampled frame so as to identify the scene information.
7. The intelligent terminal is characterized by comprising a second analysis module and a determination module; wherein, the liquid crystal display device comprises a liquid crystal display device,
the second analysis module is used for analyzing the obtained video content which is currently played after the multidimensional experience function is started so as to identify and acquire scene information corresponding to the controller which initiates the query request;
the determining module is used for determining whether the multidimensional experience control needs to be started according to the identified scene information, and when the multidimensional experience control needs to be started, the corresponding control information is issued to the corresponding controller;
the system also comprises a building module, a query module and a query module, wherein the building module is used for monitoring query commands from one or more controllers and returning the device description information of the intelligent terminal to which the building module belongs to the controller which initiates the query request; establishing a session with a controller initiating the session;
wherein the second analysis module is specifically configured to:
when a video is playing, sample and analyze the video frames, obtaining a motion estimation vector for each sampled frame; classify the obtained motion estimation vectors into two categories with a classification algorithm, namely macroblocks with large motion estimation vectors and macroblocks with small motion estimation vectors; define regions within the set of macroblocks with large motion estimation vectors as marked areas, objects outside the marked areas being referred to as references;
and continuously detect key frames in the currently played video frames; if a marked area persists across a long-lasting sequence of video frames, start sampling and analyzing frames of that sequence, and for each sampled frame identify a main object and locate its position relative to the controller that initiated the query request and established a session, so as to identify the scene information corresponding to that controller.
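The positioning step in claim 7 might look like the sketch below, assuming an off-the-shelf detector supplies the main object's label and bounding box (the claim names no detector) and that each controller registers a coarse screen-side placement when it establishes its session; both assumptions are illustrative.

```python
def locate_relative(detect, frame, controller_side):
    """detect: callable returning (label, (x, y, w, h)) for the main
    object in a frame -- an assumed interface, not named by the claim.
    frame: HxWxC image array. controller_side: "left" | "center" |
    "right", registered when the controller established its session.

    Returns scene information for the querying controller: the main
    object and where it sits relative to that controller.
    """
    label, (x, y, w, h) = detect(frame)
    center = (x + w / 2) / frame.shape[1]      # horizontal center, 0..1
    side = "left" if center < 1 / 3 else "right" if center > 2 / 3 else "center"
    return {"object": label,
            "side": side,
            "near_controller": side == controller_side}
```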
8. The intelligent terminal of claim 7, wherein the determining module is specifically configured to: when an object in the obtained scene information belongs to an object category preset to trigger control and meets the preset trigger condition, start the corresponding multidimensional experience control and send the corresponding control information to the corresponding controller.
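For the building module of claim 7, a minimal discovery-and-session loop over UDP is sketched below; the JSON message shapes and port number are invented for illustration, since the claims fix no wire protocol.

```python
import json
import socket

DISCOVERY_PORT = 50000  # illustrative; the claims fix no port

def serve_discovery(device_description):
    """Answer controller query commands with this terminal's device
    description and record sessions, mirroring the building module of
    claim 7. Message formats here are assumptions, not the patent's.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", DISCOVERY_PORT))
    sessions = {}
    while True:
        data, addr = sock.recvfrom(4096)
        msg = json.loads(data.decode("utf-8"))
        if msg.get("type") == "query":
            # Return the device description to the querying controller.
            reply = {"type": "description", "device": device_description}
            sock.sendto(json.dumps(reply).encode("utf-8"), addr)
        elif msg.get("type") == "session":
            # Establish a session with the controller that initiated it.
            sessions[addr] = msg.get("controller_id")
            sock.sendto(b'{"type": "session_ack"}', addr)
```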
CN201610206745.2A 2016-04-05 2016-04-05 Method for realizing multidimensional control, intelligent terminal and controller Active CN105760141B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610206745.2A CN105760141B (en) 2016-04-05 2016-04-05 Method for realizing multidimensional control, intelligent terminal and controller
PCT/CN2017/079444 WO2017173976A1 (en) 2016-04-05 2017-04-05 Method for realizing multi-dimensional control, intelligent terminal and controller

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610206745.2A CN105760141B (en) 2016-04-05 2016-04-05 Method for realizing multidimensional control, intelligent terminal and controller

Publications (2)

Publication Number Publication Date
CN105760141A CN105760141A (en) 2016-07-13
CN105760141B true CN105760141B (en) 2023-05-09

Family

ID=56333468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610206745.2A Active CN105760141B (en) 2016-04-05 2016-04-05 Method for realizing multidimensional control, intelligent terminal and controller

Country Status (2)

Country Link
CN (1) CN105760141B (en)
WO (1) WO2017173976A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760141B (en) * 2016-04-05 2023-05-09 中兴通讯股份有限公司 Method for realizing multidimensional control, intelligent terminal and controller
CN106657975A (en) * 2016-10-10 2017-05-10 乐视控股(北京)有限公司 Video playing method and device
CN108063701B (en) * 2016-11-08 2020-12-08 华为技术有限公司 Method and device for controlling intelligent equipment
CN107743205A (en) * 2017-09-11 2018-02-27 广东欧珀移动通信有限公司 Image processing method and device, electronic installation and computer-readable recording medium
CN110475159A (en) * 2018-05-10 2019-11-19 中兴通讯股份有限公司 The transmission method and device of multimedia messages, terminal
CN109388719A (en) * 2018-09-30 2019-02-26 京东方科技集团股份有限公司 Multidimensional contextual data generating means and method based on Digitized Works
US20200213662A1 (en) * 2018-12-31 2020-07-02 Comcast Cable Communications, Llc Environmental Data for Media Content
CN110245628B (en) * 2019-06-19 2023-04-18 成都世纪光合作用科技有限公司 Method and device for detecting discussion scene of personnel
CN110493090B (en) * 2019-08-22 2022-01-28 三星电子(中国)研发中心 Method and system for realizing intelligent home theater
CN111031392A (en) * 2019-12-23 2020-04-17 广州视源电子科技股份有限公司 Media file playing method, system, device, storage medium and processor
CN112040289B (en) * 2020-09-10 2022-12-06 深圳创维-Rgb电子有限公司 Video playing control method and device, video playing equipment and readable storage medium
CN114885189A (en) * 2022-04-14 2022-08-09 深圳创维-Rgb电子有限公司 Control method, device and equipment for opening fragrance and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559713A (en) * 2013-11-10 2014-02-05 深圳市幻实科技有限公司 Method and terminal for providing augmented reality
CN103679727A (en) * 2013-12-16 2014-03-26 中国科学院地理科学与资源研究所 Multi-dimensional space-time dynamic linkage analysis method and device
CN103970892A (en) * 2014-05-23 2014-08-06 无锡清华信息科学与技术国家实验室物联网技术中心 Method for controlling multidimensional film-watching system based on intelligent home device
CN105306982A (en) * 2015-05-22 2016-02-03 维沃移动通信有限公司 Sensory feedback method for mobile terminal interface image and mobile terminal thereof

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8009923B2 (en) * 2006-03-14 2011-08-30 Celestial Semiconductor, Inc. Method and system for motion estimation with multiple vector candidates
CN101035279B (en) * 2007-05-08 2010-12-15 孟智平 Method for using the information set in the video resource
CN105072483A (en) * 2015-08-28 2015-11-18 深圳创维-Rgb电子有限公司 Smart home equipment interaction method and system based on smart television video scene
CN105760141B (en) * 2016-04-05 2023-05-09 中兴通讯股份有限公司 Method for realizing multidimensional control, intelligent terminal and controller

Also Published As

Publication number Publication date
CN105760141A (en) 2016-07-13
WO2017173976A1 (en) 2017-10-12

Similar Documents

Publication Publication Date Title
CN105760141B (en) Method for realizing multidimensional control, intelligent terminal and controller
US20220286728A1 (en) Information processing apparatus and information processing method, display equipped with artificial intelligence function, and rendition system equipped with artificial intelligence function
CN109922373B (en) Video processing method, device and storage medium
CN108230594B (en) Method for generating alarm in video monitoring system
CN117496643A (en) System and method for detecting and responding to visitor of smart home environment
EP3274910A2 (en) Computer vision systems
JP2015534202A (en) Image stabilization techniques for video surveillance systems.
CN105264879A (en) Computer vision application processing
JP2003529136A (en) Program Classification by Object Tracking
CN112581627A (en) System and apparatus for user-controlled virtual camera for volumetric video
US20190212719A1 (en) Information processing device and information processing method
CN112005281A (en) System and method for power management on smart devices
CN109791601A (en) Crowd's amusement
CN112330371A (en) AI-based intelligent advertisement pushing method, device, system and storage medium
TW201511544A (en) System and method for automatically switch channels
CN110493090B (en) Method and system for realizing intelligent home theater
KR101924715B1 (en) Techniques for enabling auto-configuration of infrared signaling for device control
CN106564059B (en) A kind of domestic robot system
US11151602B2 (en) Apparatus, systems and methods for acquiring commentary about a media content event
CN104185068A (en) Method for switching contextual models automatically according to television programs and television
CN115782908A (en) Human-computer interaction method of vehicle, nonvolatile storage medium and vehicle
CN108200390A (en) Video structure analyzing method and device
CN104075410A (en) Terminal control method and system based on audio signals
CN112804545B (en) Slow live broadcast processing method and system based on live broadcast streaming frame extraction algorithm
CN113946127B (en) Intelligent home system based on edge computing technology

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant