Audio-video creation method, playback method, and devices with interactive functions
Technical field
The present embodiments relate to the technical field of information processing, and in particular to a method and device for creating audio-video with interactive functions and a method and device for playing it.
Background
In the prior art, when TV programmes and audio programmes are played, usually only the playback function of the video or audio is available, supporting common operations such as fast-forward, rewind, play, and pause. With the development and spread of computer technology, intelligent technologies such as human-computer interaction provide convenient services in many aspects of people's lives. While watching a film or TV programme, a user may have many questions about the content being played, and interaction is needed to answer them. At present, audio-video interaction is confined to complex transcription or the parsing of picture elements. For example, when a song is playing in the background and the user wants to know about it, the user usually has to rely on another device, such as searching for the song with a phone's shake-to-search function, or has to record the song on the device and then search for it online. These approaches are cumbersome. How to interact with the user quickly and conveniently, during playback, about knowledge points related to the content being played has become an urgent problem to be solved.
Summary of the invention
To address the problems of the prior art, the present invention provides an audio-video creation method, a playback method, and corresponding devices with interactive functions.
The present invention provides an audio-video creation method with interactive functions, characterized in that the method includes:
Step 101: obtain the audio-video resource to be processed;
Step 102: convert the audio-video resource into content data and the play time corresponding to the content data;
Step 103: create multiple time tags, each associated with an audio-video play time;
Step 104: segment the content data according to the time tags, extract one or more pieces of key content data, and associate each time tag with the one or more pieces of key content data;
Step 105: create an interactive component corresponding to the one or more pieces of key content data;
Step 106: bind the key content data belonging to the same time tag, together with its corresponding interactive component, to the corresponding audio-video resource segment.
Further, step 102 specifically includes:
performing speech recognition and image recognition on the audio-video resource and outputting a recognition result, the recognition result including the content data and the play time corresponding to the content data;
storing the play time and the content data, indexed by play time.
Further, step 104 specifically includes:
analyzing the content data and splitting it into one or more semantic paragraphs;
based on the semantic paragraphs, determining the criticality of the word-segmented content within each paragraph and extracting the key content data;
associating the time tag with the one or more pieces of key content data.
Further, the interactive component includes an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data.
Further, the method also includes:
Step 107: encapsulate the audio-video resource segments bound to the key content tags and the key content data interactive components, constructing an interactive audio-video resource;
Step 108: store the interactive audio-video resource.
The present invention also provides an audio-video playback method with interactive functions, characterized in that the audio-video with interactive functions is created according to the foregoing audio-video creation method with interactive functions; the playback method further includes:
Step 201: determine whether a user instruction has been received; if so, determine the play time of the audio-video resource at the moment the user instruction was received;
Step 202: extract the time tag and key content data corresponding to the play time;
Step 203: determine whether the user instruction meets the first trigger condition; if it does, pause playback;
Step 204: according to the user instruction and the key content data, provide the user with an interactive service related to the user instruction.
Further, the user instruction may include a voice interaction instruction, and/or a motion-sensing interaction instruction, and/or a touch interaction instruction.
Further, step 201 also includes:
Step 2011: if no user instruction has been received, determine whether key content data exists for the time tag corresponding to the current play time;
Step 2012: if key content data exists, further determine the attribute of the key content data;
Step 2013: if the attribute is active display, call the interactive component of the key content data to interact with the user.
Further, determining whether the user instruction meets the first trigger condition specifically includes:
Step 2031: parse the user instruction and convert it into first text data;
Step 2032: determine whether the first text data meets the first trigger condition, where the first trigger condition includes the similarity between the first text data and the key content data being greater than a first preset threshold.
Further, providing the user with an interactive service related to the user instruction includes one or more of the following:
providing an explanation of the key content data;
calling an application, plug-in, or configuration file associated with the key content data, and communicating with the user through voice interaction or GUI interaction.
The present invention also provides an audio-video creation device with interactive functions, characterized in that the device includes:
an acquisition module, configured to obtain the audio-video resource to be processed;
a conversion module, configured to convert the audio-video resource into content data and the play time corresponding to the content data;
a time processing module, configured to create multiple time tags, each time tag including an audio-video play time or an audio-video play time interval;
a content processing module, configured to segment the content data according to the time tags, extract one or more pieces of key content data, and associate each time tag with the one or more pieces of key content data;
a component creation module, configured to create an interactive component corresponding to the one or more pieces of key content data;
a binding module, configured to bind the key content data belonging to the same time tag, together with its corresponding interactive component, to the corresponding audio-video resource segment.
Further, the conversion module is specifically configured to:
perform speech recognition and image recognition on the audio-video resource and output a recognition result, the recognition result including the content data and the play time corresponding to the content data;
store the play time and the content data, indexed by play time.
Further, the content processing module is specifically configured to:
analyze the content data and split it into one or more semantic paragraphs;
based on the semantic paragraphs, determine the criticality of the word-segmented content within each paragraph and extract the key content data;
associate the time tag with the one or more pieces of key content data.
Further, the interactive component includes an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data.
Further, the device also includes:
an encapsulation module, configured to encapsulate the audio-video resource segments bound to the key content tags and the key content data interactive components, constructing an interactive audio-video resource;
a storage module, configured to store the interactive audio-video resource.
The present invention also provides an audio-video playback device with interactive functions, characterized in that the audio-video with interactive functions is created according to the foregoing audio-video creation method with interactive functions; the device includes:
a receiving module, configured to determine whether a user instruction has been received and, if so, to determine the play time of the audio-video resource at the moment the user instruction was received;
an extraction module, configured to extract the time tag and key content data corresponding to the play time;
a judgment module, configured to determine whether the user instruction meets the first trigger condition and, if it does, to pause playback;
an interaction module, configured to provide the user, according to the user instruction and the key content data, with an interactive service related to the user instruction.
Further, the user instruction may include a voice interaction instruction, and/or a motion-sensing interaction instruction, and/or a touch interaction instruction.
Further, the device also includes:
a content judgment module, configured to determine, if no user instruction has been received, whether key content data exists for the time tag corresponding to the current play time;
a tag judgment module, configured to further determine the attribute of the key content data if key content data exists;
a calling module, configured to call the interactive component of the key content data to interact with the user if the attribute is active display.
Further, determining whether the user instruction meets the first trigger condition specifically includes:
parsing the user instruction and converting it into first text data;
determining whether the first text data meets the first trigger condition, where the first trigger condition includes the similarity between the first text data and the key content data being greater than a first preset threshold.
Further, providing the user with an interactive service related to the user instruction includes one or more of the following:
providing an explanation of the key content data;
calling an application, plug-in, or configuration file associated with the key content data, and communicating with the user through voice interaction or GUI interaction.
The present invention also provides a terminal device, characterized in that the terminal device includes a processor and a memory, the memory storing a computer program that can run on the processor, and the computer program implementing the method described above when executed by the processor.
The present invention also provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that can run on a processor, and the computer program implements the method described above when executed.
With the method of the present invention, content interaction services can be associated with points in time more conveniently, which allows quick interaction with the user, better meets the user's needs, and improves the user experience.
Brief description of the drawings
To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below evidently show some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 shows the audio-video creation method with interactive functions in one embodiment of the invention.
Fig. 2 shows the audio-video playback method with interactive functions in one embodiment of the invention.
Fig. 3 shows the audio-video creation device with interactive functions in one embodiment of the invention.
Fig. 4 shows the audio-video playback device with interactive functions in one embodiment of the invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the invention clearer, the embodiments of the invention are described in further detail below with reference to the drawings. The embodiments and their specific features are a detailed description of the technical solutions of the embodiments, not a limitation on the technical solutions of the invention; where there is no conflict, the embodiments and their technical features can be combined with one another.
The method of the invention can be applied to any apparatus or device with processing and/or playback functions, such as a computer, mobile terminal, in-vehicle device, or TV.
Embodiment one
The audio-video creation method with interactive functions of the invention is described below. Referring to Fig. 1, the method includes:
Step 101: obtain the audio-video resource to be processed;
Step 102: convert the audio-video resource into content data and the play time corresponding to the content data;
Step 103: create multiple time tags, each associated with an audio-video play time;
Step 104: segment the content data according to the time tags, extract one or more pieces of key content data, and associate each time tag with the one or more pieces of key content data;
Step 105: create an interactive component corresponding to the one or more pieces of key content data;
Step 106: bind the key content data belonging to the same time tag, together with its corresponding interactive component, to the corresponding audio-video resource segment.
Preferably, step 102 specifically includes:
performing speech recognition and image recognition on the audio-video resource and outputting a recognition result, the recognition result including the content data and the play time corresponding to the content data;
storing the play time and the content data, indexed by play time.
The content data is preferably text content data.
Specifically, for an audio-video programme A with a total duration of 30 minutes, the video and the audio of the audio-video resource are analyzed separately. For the video, feature detection can be performed on the video frames one by one in play-time order to extract the information elements in the frames; for example, from play time 00 min 05 s to 00 min 18 s a puppy and a truck appear in the video frames, and at play time 03 min 12 s a space shuttle appears. For the audio, if a subtitle file exists, the text with its timeline can be extracted directly; if no subtitle file exists, the audio can be converted by speech analysis into text and its corresponding play time. For example, from play time 00 min 05 s to 00 min 18 s of the programme, the text content data converted from the audio data is "Take a look in the old truck to see if anything else was left behind"; as another example, the background audio starting at 15 min 17 s is a song, and the lyric text is extracted. The converted content data and its corresponding play time can be stored in a data structure table.
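The time-indexed storage described above can be sketched as follows. This is a minimal illustration only: the record layout, the field names (`start`, `end`, `source`, `text`), and the helper `mmss` are assumptions made for the example and are not specified in the source, and the recognition step itself is taken as already done.

```python
from dataclasses import dataclass

@dataclass
class ContentRecord:
    start: int   # play time in seconds where the content begins
    end: int     # play time in seconds where it ends (equal to start for a single instant)
    source: str  # "video" or "audio" (hypothetical field)
    text: str    # recognised content, stored as text

def mmss(minutes: int, seconds: int) -> int:
    """Convert a 'XX min XX s' play time to seconds."""
    return minutes * 60 + seconds

# Recognition results taken from the programme A example in the text.
records = [
    ContentRecord(mmss(0, 5), mmss(0, 18), "video", "puppy, truck"),
    ContentRecord(mmss(0, 5), mmss(0, 18), "audio",
                  "Take a look in the old truck to see if anything else was left behind"),
    ContentRecord(mmss(3, 12), mmss(3, 12), "video", "space shuttle"),
]

# Index the content data by play time, as step 102 requires.
table = {}
for record in records:
    table.setdefault((record.start, record.end), []).append(record)
```

Keying the table on the (start, end) pair keeps video and audio content that share a play-time range together, which suits the later step of cutting content by time tag.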
Specifically, time tags are created in step 103. A time tag can correspond to a point in time or to a period of time; that is, the time tag includes an audio-video play time or an audio-video play time interval. For example, a time tag can be "XX min XX s" or "XX min XX s - XX min XX s".
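One possible representation of such a time tag is sketched below. The class name `TimeTag` and its methods are hypothetical; the source only states that a tag is either a play time or a play time interval.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TimeTag:
    start: int                 # play time in seconds
    end: Optional[int] = None  # None marks a point-in-time tag

    def covers(self, play_time: int) -> bool:
        """Return True if the given play time (seconds) falls under this tag."""
        if self.end is None:
            return play_time == self.start
        return self.start <= play_time <= self.end

    def label(self) -> str:
        """Render the tag in the 'XX min XX s' form used in the text."""
        def fmt(t: int) -> str:
            return f"{t // 60:02d} min {t % 60:02d} s"
        if self.end is None:
            return fmt(self.start)
        return f"{fmt(self.start)} - {fmt(self.end)}"

point_tag = TimeTag(3 * 60 + 12)
interval_tag = TimeTag(15 * 60 + 17, 16 * 60 + 57)
```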
Specifically, step 104 includes:
analyzing the content data according to the time tags;
splitting the content data into one or more semantic paragraphs;
based on the semantic paragraphs, determining the criticality of the word-segmented content within each paragraph and extracting the key content data;
associating the time tag with the one or more pieces of key content data.
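The sub-steps of step 104 can be sketched as below. The source does not say how semantic paragraphs are detected or how criticality is scored, so this example substitutes two simple stand-ins, both assumptions: paragraphs are split at sentence-ending punctuation, and criticality is plain word frequency after removing a small illustrative stop-word list. All function names are hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list (assumption); a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "in", "to", "of", "is", "if", "and"}

def split_paragraphs(text):
    """Stand-in for semantic paragraph splitting: cut at sentence-ending punctuation."""
    return [p.strip() for p in re.split(r"[.!?]+", text) if p.strip()]

def key_content(paragraph, top_n=2):
    """Score words by frequency (stand-in for criticality) and keep the top_n."""
    words = [w for w in re.findall(r"[a-z]+", paragraph.lower()) if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(top_n)]

def associate(tagged_text):
    """Map each time tag label to the key content extracted from its content data."""
    return {
        tag: [k for p in split_paragraphs(text) for k in key_content(p)]
        for tag, text in tagged_text.items()
    }
```

For example, `associate({"03 min 12 s": "Shuttle shuttle launch."})` yields `{"03 min 12 s": ["shuttle", "launch"]}`.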
Specifically, in step 105, an interactive component corresponding to the one or more pieces of key content data is created. The interactive component includes an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data. For example, for time tag 03 min 12 s, the key content data is the space shuttle. An explanation of the space shuttle can be retrieved from the network; it may include a basic definition such as "The space shuttle (Space Shuttle) is a manned, reusable spacecraft that travels between space and the ground", and may also include its history, basic structure, and so on. In addition, an application, plug-in, or configuration file related to the space shuttle can be associated, such as a space shuttle knowledge quiz app. As another example, for time tag 15 min 17 s - 16 min 57 s, the key content data is a song and a lyrics fragment. Through an Internet sharing platform, an explanation of the lyrics fragment is found by searching with the fragment, including the song title, singer, album name, and so on, or a link to the song on an associated music/download platform, such as XX Music. In addition, a music application, plug-in, or configuration file related to the song can also be associated.
After the interactive components of the key content data have been completed, a knowledge graph can be constructed so that the knowledge system related to the key content data is presented to the user systematically. The knowledge graph can be a semantic network, that is, a graph-based data structure. It includes multiple data nodes and one or more connecting edges that indicate the relationships between data nodes. The data nodes include one or more time nodes, one or more key content data nodes, and one or more interactive component nodes; the connecting edges correspond to the logical associations among time, key content data, and interactive component nodes. For example, the time node "time tag 15 min 17 s - 16 min 57 s" is associated with the key content data node "song", and the key content data node "song" is associated with the interactive component node "song title, singer". As those skilled in the art will appreciate, one time node can connect to multiple key content data nodes, and one key content data node can connect to multiple interactive component nodes. The interactive component nodes connected to a key content data node have different priority indexes, which can be adjusted dynamically according to frequency of use and user preference.
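A knowledge graph of this shape can be sketched with plain dictionaries. The class and method names below are hypothetical, and the priority update rule (adding a fixed boost on each use) is an assumption; the source only says that the priority index can be adjusted according to frequency of use and user preference.

```python
from collections import defaultdict

class KnowledgeGraph:
    def __init__(self):
        self.time_to_content = defaultdict(list)       # time node -> key content nodes
        self.content_to_component = defaultdict(list)  # key content node -> component nodes
        self.priority = {}                              # (content, component) -> priority index

    def link_time(self, time_node, content_node):
        self.time_to_content[time_node].append(content_node)

    def link_component(self, content_node, component_node, priority=0.0):
        self.content_to_component[content_node].append(component_node)
        self.priority[(content_node, component_node)] = priority

    def record_use(self, content_node, component_node, boost=1.0):
        """Raise a component's priority index each time the user picks it (assumed rule)."""
        self.priority[(content_node, component_node)] += boost

    def best_component(self, content_node):
        """Return the connected component with the highest priority index, or None."""
        components = self.content_to_component.get(content_node, [])
        if not components:
            return None
        return max(components, key=lambda c: self.priority[(content_node, c)])

graph = KnowledgeGraph()
graph.link_time("15 min 17 s - 16 min 57 s", "song")
graph.link_component("song", "song title, singer", priority=1.0)
graph.link_component("song", "music platform link", priority=0.5)
```

With these values `graph.best_component("song")` returns "song title, singer"; after two calls to `graph.record_use("song", "music platform link")` the link component overtakes it.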
Preferably, the method can also include:
Step 107: encapsulate the audio-video resource segments bound to the key content tags and the key content data interactive components, constructing an interactive audio-video resource;
Step 108: store the interactive audio-video resource.
Specifically, encapsulating the audio-video resource segments bound to the key content tags and the key content data interactive components merges the audio-video resource with the interactive components, so that interaction is also possible in the offline state.
Embodiment two
The audio-video playback method with interactive functions of the invention is described below. Referring to Fig. 2, the playback method further includes:
Step 201: determine whether a user instruction has been received; if so, determine the play time of the audio-video resource at the moment the user instruction was received;
Step 202: extract the time tag and key content data corresponding to the play time;
Step 203: determine whether the user instruction meets the first trigger condition; if it does, pause playback;
Step 204: according to the user instruction and the key content data, provide the user with an interactive service related to the user instruction.
The audio-video resource being played can be created in the manner of embodiment one.
Specifically, the user instruction may include a voice interaction instruction and/or a motion-sensing interaction instruction and/or a touch interaction instruction.
For example, during audio-video playback, a voice instruction issued by the user, such as "What song is this?", is received through a voice recording device; or the user's click on a specific target image on the screen is received through a touch device; or an instruction entered by the user in a dialog interface is received by touch or keyboard; or the user's movement is captured with a motion-sensing device to obtain a user action instruction.
Specifically, in step 201, the user issues the instruction "What song is this?", and the play time recorded at that moment is 15 min 50 s. In step 202, play time 15 min 50 s falls within time tag 15 min 17 s - 16 min 57 s, so that time tag and its key content data are extracted.
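The lookup in step 202 can be sketched as a binary search over time tags sorted by start time. The tuple layout and the assumption that tags do not overlap are made for the example only; the tag values are taken from the programme A example.

```python
from bisect import bisect_right

# (start_seconds, end_seconds, key content data), sorted by start; assumed non-overlapping.
TAGS = [
    (5, 18, ["puppy", "truck"]),
    (192, 192, ["space shuttle"]),            # 03 min 12 s
    (917, 1017, ["song", "lyrics fragment"]),  # 15 min 17 s - 16 min 57 s
]
STARTS = [tag[0] for tag in TAGS]

def find_tag(play_time):
    """Return the tag whose range contains play_time (seconds), or None."""
    i = bisect_right(STARTS, play_time) - 1  # last tag starting at or before play_time
    if i >= 0 and TAGS[i][0] <= play_time <= TAGS[i][1]:
        return TAGS[i]
    return None

# The user asks "What song is this?" at 15 min 50 s.
matched = find_tag(15 * 60 + 50)
```

Here `matched` is the 15 min 17 s - 16 min 57 s tag with the key content data "song" and "lyrics fragment"; a play time that falls between tags, such as 100 s, returns `None`.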
Specifically, step 201 also includes:
Step 2011: if no user instruction has been received, determine whether key content data exists for the time tag corresponding to the current play time;
Step 2012: if key content data exists, further determine the attribute of the key content data;
Step 2013: if the attribute is active display, call the interactive component of the key content data to interact with the user.
For example, an attribute of 0 or 1 is set for the key content data: 0 means the key content data is displayed passively, and 1 means it is displayed actively. When playback reaches 03 min 12 s, the key content data is the space shuttle and its attribute value is 1, so the explanation of the space shuttle is shown as a special-effect subtitle over the space shuttle image in the video. Alternatively, playback can be paused and an interactive question-and-answer assistant related to the space shuttle launched, which asks the user, "Dear children, do you know what a space shuttle is?", thereby entering the operation of the interactive Q&A assistant; after the user ends the Q&A assistant, the playback interface returns and playback resumes.
In addition, a proactive-interaction option can be provided for the audio-video resource and turned on or off by the user. If it is on, the judgment of the key content data attribute is activated, that is, steps 2011-2013 are executed; if it is off, steps 2011-2013 are not executed.
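Steps 2011-2013, together with the user-controlled switch, can be sketched as a single check run when no user instruction has been received. The dictionary keys, the `invoke` callback, and the function name are hypothetical; the 0/1 attribute values follow the example in the text.

```python
PASSIVE, ACTIVE = 0, 1  # attribute values from the example: 0 passive, 1 active display

def proactive_step(play_time, tags, proactive_enabled, invoke):
    """Run steps 2011-2013 once for the current play time.

    tags: list of dicts with 'start', 'end', 'key_content', 'attribute' (assumed layout).
    invoke: callback standing in for the interactive component call.
    Returns True if an interactive component was invoked.
    """
    if not proactive_enabled:          # switch turned off by the user: skip 2011-2013
        return False
    for tag in tags:
        in_range = tag["start"] <= play_time <= tag["end"]
        if in_range and tag["key_content"]:   # step 2011: key content data exists
            if tag["attribute"] == ACTIVE:     # step 2012: check the attribute
                invoke(tag["key_content"])     # step 2013: call the component
                return True
    return False

example_tags = [
    {"start": 192, "end": 192, "key_content": "space shuttle", "attribute": ACTIVE},
    {"start": 917, "end": 1017, "key_content": "song", "attribute": PASSIVE},
]
```

Passing `print` as `invoke` is enough to try it out; a player would pass a function that shows the subtitle or launches the Q&A assistant.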
Specifically, in step 203, determining whether the user instruction meets the first trigger condition includes:
Step 2031: parse the user instruction and convert it into first text data;
Step 2032: determine whether the first text data meets the first trigger condition, where the first trigger condition includes the similarity between the first text data and the key content data being greater than a first preset threshold.
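The threshold check in step 2032 can be sketched as follows. The source does not define the similarity measure or the threshold value, so both are assumptions here: similarity is taken as the best character-level match ratio (from Python's standard `difflib`) between any word of the instruction and any key content term, and the threshold defaults to 0.8. The instruction is assumed to have already been converted to text in step 2031.

```python
import re
from difflib import SequenceMatcher

def similarity(first_text, key_content):
    """Best match ratio between any instruction word and any key content term."""
    words = re.findall(r"\w+", first_text.lower())
    best = 0.0
    for term in key_content:
        for word in words:
            best = max(best, SequenceMatcher(None, word, term.lower()).ratio())
    return best

def meets_first_trigger(first_text, key_content, threshold=0.8):
    """First trigger condition: similarity greater than the first preset threshold."""
    return similarity(first_text, key_content) > threshold
```

With the running example, `meets_first_trigger("What song is this?", ["song", "lyrics fragment"])` is true because the word "song" matches the key content term exactly, while an unrelated instruction such as "Turn up the volume" does not meet the condition.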
Specifically, in step 204, providing the user with an interactive service related to the user instruction includes one or more of the following:
providing an explanation of the key content data;
calling an application, plug-in, or configuration file associated with the key content data, and communicating with the user through voice interaction or GUI interaction.
For example, based on the voice instruction "What song is this?" issued by the user at 15 min 50 s and the key content data of time tag 15 min 17 s - 16 min 57 s, matching against the knowledge graph determines that the currently matched key knowledge node is "song". The interactive component nodes associated with the key knowledge node "song" are then looked up. According to the priority indexes of those nodes, the interactive component node with the highest priority index can be selected to give the user song-related feedback; for example, the song title, singer, album name, and so on can be displayed as special-effect subtitles, or playback can be paused and a link or shortcut icon to the song on the music/download platform shown to the user. Depending on the user's choice, the operation interface of the music/download platform application or plug-in can further be shown in a floating window, or the display can jump to that operation interface.
Embodiment three
The audio-video creation device with interactive functions of the invention is described below. Referring to Fig. 3, the device includes:
an acquisition module, configured to obtain the audio-video resource to be processed;
a conversion module, configured to convert the audio-video resource into content data and the play time corresponding to the content data;
a time processing module, configured to create multiple time tags, each time tag including an audio-video play time or an audio-video play time interval;
a content processing module, configured to segment the content data according to the time tags, extract one or more pieces of key content data, and associate each time tag with the one or more pieces of key content data;
a component creation module, configured to create an interactive component corresponding to the one or more pieces of key content data;
a binding module, configured to bind the key content data belonging to the same time tag, together with its corresponding interactive component, to the corresponding audio-video resource segment.
Preferably, the conversion module is specifically configured to:
perform speech recognition and image recognition on the audio-video resource and output a recognition result, the recognition result including the content data and the play time corresponding to the content data;
store the play time and the content data, indexed by play time.
Preferably, the content processing module is specifically configured to:
analyze the content data and split it into one or more semantic paragraphs;
based on the semantic paragraphs, determine the criticality of the word-segmented content within each paragraph and extract the key content data;
associate the time tag with the one or more pieces of key content data.
Preferably, the interactive component includes an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data.
Preferably, the device also includes:
an encapsulation module, configured to encapsulate the audio-video resource segments bound to the key content tags and the key content data interactive components, constructing an interactive audio-video resource;
a storage module, configured to store the interactive audio-video resource.
Embodiment four
The audio-video playback device with interactive functions of the invention is described below. Referring to Fig. 4, the device includes:
a receiving module, configured to determine whether a user instruction has been received and, if so, to determine the play time of the audio-video resource at the moment the user instruction was received;
an extraction module, configured to extract the time tag and key content data corresponding to the play time;
a judgment module, configured to determine whether the user instruction meets the first trigger condition and, if it does, to pause playback;
an interaction module, configured to provide the user, according to the user instruction and the key content data, with an interactive service related to the user instruction.
The audio-video resource being played can be created in the manner of embodiment one.
Preferably, the user instruction may include a voice interaction instruction and/or a motion-sensing interaction instruction and/or a touch interaction instruction.
Preferably, the device also includes:
a content judgment module, configured to determine, if no user instruction has been received, whether key content data exists for the time tag corresponding to the current play time;
a tag judgment module, configured to further determine the attribute of the key content data if key content data exists;
a calling module, configured to call the interactive component of the key content data to interact with the user if the attribute is active display.
Preferably, determining whether the user instruction meets the first trigger condition specifically includes:
parsing the user instruction and converting it into first text data;
determining whether the first text data meets the first trigger condition, where the first trigger condition includes the similarity between the first text data and the key content data being greater than a first preset threshold.
Preferably, providing the user with an interactive service related to the user instruction includes one or more of the following:
providing an explanation of the key content data;
calling an application, plug-in, or configuration file associated with the key content data, and communicating with the user through voice interaction or GUI interaction.
The present invention provides a terminal device, characterized in that the terminal device includes a processor and a memory, the memory storing a computer program that can run on the processor, and the computer program implementing the method described above when executed by the processor.
The present invention provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that can run on a processor, and the computer program implements the method described above when executed.
Any combination of one or more computer-readable media can be used. A computer-readable medium can be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium can be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. A computer-readable storage medium may include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), flash memory, erasable programmable read-only memory (EPROM), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium can be any tangible medium that contains or stores a program, which can be used by or in connection with an instruction execution system, apparatus, or device.
Computer program code for carrying out the operations of the invention can be written in one or more programming languages or a combination thereof.
The above are merely examples given to aid understanding of the invention and are not intended to limit its scope. In a specific implementation, those skilled in the art can change, add, or remove components of the device according to the actual situation, and can change, add, remove, or reorder the steps of the method according to the actual situation, provided that the functions realised by the method are not affected.
Although embodiments of the invention have been shown and described, those skilled in the art should understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the invention. The scope of the invention is defined by the claims and their equivalents; improvements made without creative effort should fall within the protection scope of the invention.