CN109889921A

CN109889921A - Method and device for creating and playing audio/video with an interactive function

Info

Publication number: CN109889921A
Application number: CN201910263196.6A
Authority: CN
Inventors: 孙晓刚; 戴帅湘
Original assignee: Beijing Suddenly Cognitive Technology Co Ltd
Current assignee: Guangyulaite Digital Technology Shanghai Co ltd
Priority date: 2019-04-02
Filing date: 2019-04-02
Publication date: 2019-06-14
Anticipated expiration: 2039-04-02
Also published as: CN109889921B

Abstract

The invention discloses a method and device for creating and playing audio/video with an interactive function. The method comprises: obtaining audio/video resources to be processed; converting the audio/video resources into content data and play times corresponding to the content data; creating multiple time tags, each associated with an audio/video play time; segmenting the content data according to the time tags, extracting one or more pieces of key content data, and associating the time tags with the one or more pieces of key content data; creating interactive components corresponding to the one or more pieces of key content data; and binding the key content data belonging to the same time tag, together with its corresponding interactive components, to the audio/video resource fragment. With this method, time-point-based associations between content and interactive services can be built more easily, so that interaction with the user is fast, better matches the user's needs, and improves the user experience.

Description

Method and device for creating and playing audio/video with an interactive function

Technical field

Embodiments of the present invention relate to the technical field of information processing, and in particular to a method and device for creating and playing audio/video with an interactive function.

Background

In the prior art, when television or audio programs are played, usually only playback of the video or audio itself is provided, with support for general operations such as fast-forward, rewind, play, and pause. With the development and spread of computer technology, intelligent technologies such as human-computer interaction offer convenient services in many aspects of daily life. While watching a film or television program, a user may have many questions about the content being played and needs interaction that addresses them. At present, interaction with audio/video is limited to complex transcription or the parsing of picture elements. For example, when a song is playing in the background and the user wants to know about it, the user usually has to rely on another device, such as a phone's shake-to-identify song search, or record the audio on the playing device and then search online. These approaches are generally cumbersome. How to interact with the user quickly and conveniently during playback about knowledge points related to the content being played has become a problem in urgent need of a solution.

Summary of the invention

To address the problems in the prior art, the present invention provides a method and device for creating and playing audio/video with an interactive function.

The present invention provides a method for creating audio/video with an interactive function, characterized in that the method comprises:

Step 101, obtain audio/video resources to be processed;

Step 102, convert the audio/video resources into content data and play times corresponding to the content data;

Step 103, create multiple time tags, each associated with an audio/video play time;

Step 104, according to the time tags, segment the content data, extract one or more pieces of key content data, and associate the time tags with the one or more pieces of key content data;

Step 105, create interactive components corresponding to the one or more pieces of key content data;

Step 106, bind the key content data belonging to the same time tag, together with its corresponding interactive components, to the audio/video resource fragment.

Further, step 102 specifically comprises:

performing speech recognition and image recognition on the audio/video resources and outputting a recognition result, the recognition result comprising content data and play times corresponding to the content data;

storing the play times and the content data, indexed by play time.

Further, step 104 specifically comprises:

analyzing the content data and splitting it into one or more semantic paragraphs;

based on the semantic paragraphs, determining the criticality of the segmented words within each paragraph and extracting key content data;

associating the time tags with the one or more pieces of key content data.

Further, the interactive component comprises an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data.

Further, the method also comprises:

Step 107, encapsulate the audio/video resource fragments to which the key content tags and the interactive components of the key content data are bound, to construct an interactive audio/video resource;

Step 108, store the interactive audio/video resource.

The present invention also provides a method for playing audio/video with an interactive function, characterized in that the audio/video with an interactive function is created according to the foregoing creation method; the playback method further comprises:

Step 201, determine whether a user instruction has been received; if so, determine the play time of the audio/video resource at the moment the user instruction was received;

Step 202, extract the time tag and the key content data corresponding to that play time;

Step 203, determine whether the user instruction satisfies a first trigger condition; if it does, pause playback;

Step 204, according to the user instruction and the key content data, provide the user with an interactive service related to the user instruction.
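Steps 201 to 204 can be read as a single handler that runs once per polling cycle. The sketch below is only an illustration under stated assumptions: the names `Player`, `handle`, `contains_key`, and the `services` mapping are hypothetical, play times are expressed in seconds, and the trigger check is passed in as a function because the text defines it separately.

```python
from dataclasses import dataclass


@dataclass
class Player:
    """Minimal stand-in for the playback state (hypothetical)."""
    play_time: int = 0      # current play time in seconds
    paused: bool = False


def contains_key(text, key):
    """Placeholder trigger check: the key content word occurs in the instruction."""
    return key in text.lower()


def handle(player, instruction, tagged_content, is_triggered, services):
    """Run steps 201-204 once; return the service offered, or None."""
    if instruction is None:                            # step 201: nothing received
        return None
    now = player.play_time                             # step 201: play time at receipt
    for (start, end), key in tagged_content.items():   # step 202: tag and key content
        if start <= now <= end and is_triggered(instruction, key):
            player.paused = True                       # step 203: condition met, pause
            return services.get(key)                   # step 204: related service
    return None


tagged = {(917, 1017): "song"}                  # 15 min 17 s to 16 min 57 s
services = {"song": "song title, singer, album"}
player = Player(play_time=950)                  # 15 min 50 s
result = handle(player, "What song is this?", tagged, contains_key, services)
```

Passing the trigger check in as a parameter keeps the handler independent of how similarity is actually measured.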

Further, the user instruction may comprise a voice interaction instruction and/or a motion-sensing interaction instruction and/or a touch interaction instruction.

Further, step 201 also comprises:

Step 2011, if no user instruction has been received, determine whether key content data exists for the time tag corresponding to the current play time;

Step 2012, if key content data exists, further determine the attribute of the key content data;

Step 2013, if the tag attribute is active display, call the interactive component of the key content data to interact with the user.

Further, determining whether the user instruction satisfies the first trigger condition specifically comprises:

Step 2031, parse the user instruction and convert it into first text data;

Step 2032, determine whether the first text data satisfies the first trigger condition, wherein the first trigger condition comprises the similarity between the first text data and the key content data being greater than a first preset threshold.
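The text does not say how the similarity is computed, so the following is only one possible reading. It assumes a simple word-overlap measure (the share of key content words that also appear in the user's text) and a threshold of 0.5; the function names and both choices are illustrative assumptions, not the patent's method.

```python
import re


def tokens(text):
    """Lower-cased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def similarity(first_text, key_content):
    """Share of key content words that also occur in the user's text (0.0 to 1.0)."""
    key = tokens(key_content)
    if not key:
        return 0.0
    return len(key & tokens(first_text)) / len(key)


def meets_first_trigger(first_text, key_content, threshold=0.5):
    """First trigger condition: similarity strictly greater than the preset threshold."""
    return similarity(first_text, key_content) > threshold
```

With these assumptions, "What song is this?" triggers on the key content "song", while an unrelated instruction such as "Turn up the volume" does not trigger on "space shuttle".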

Further, providing the user with an interactive service related to the user instruction comprises one or more of the following:

providing an explanation of the key content data;

calling an application, plug-in, or configuration file associated with the key content data, and communicating with the user through voice interaction or GUI interaction.

The present invention also provides a device for creating audio/video with an interactive function, characterized in that the device comprises:

an obtaining module, for obtaining audio/video resources to be processed;

a conversion module, for converting the audio/video resources into content data and play times corresponding to the content data;

a time processing module, for creating multiple time tags, each time tag comprising an audio/video play time or an audio/video play time interval;

a content processing module, for segmenting the content data according to the time tags, extracting one or more pieces of key content data, and associating the time tags with the one or more pieces of key content data;

a component creation module, for creating interactive components corresponding to the one or more pieces of key content data;

a binding module, for binding the key content data belonging to the same time tag, together with its corresponding interactive components, to the audio/video resource fragment.

Further, the conversion module is specifically configured to:

store the play times and the content data, indexed by play time.

Further, the content processing module is specifically configured to:

associate the time tags with the one or more pieces of key content data.

Further, the device also comprises:

an encapsulation module, for encapsulating the audio/video resource fragments to which the key content tags and the interactive components of the key content data are bound, to construct an interactive audio/video resource;

a storage module, for storing the interactive audio/video resource.

The present invention also provides a device for playing audio/video with an interactive function, characterized in that the audio/video with an interactive function is created according to the foregoing creation method; the device comprises:

a receiving module, for determining whether a user instruction has been received and, if so, determining the play time of the audio/video resource at the moment the user instruction was received;

an extraction module, for extracting the time tag and the key content data corresponding to that play time;

a judgment module, for determining whether the user instruction satisfies a first trigger condition and, if it does, pausing playback;

an interactive module, for providing the user, according to the user instruction and the key content data, with an interactive service related to the user instruction.

Further, the device also comprises:

a content judgment module, for determining, if no user instruction has been received, whether key content data exists for the time tag corresponding to the current play time;

a tag judgment module, for further determining the attribute of the key content data if key content data exists;

a calling module, for calling the interactive component of the key content data to interact with the user if the tag attribute is active display.

parsing the user instruction and converting it into first text data;

determining whether the first text data satisfies the first trigger condition, wherein the first trigger condition comprises the similarity between the first text data and the key content data being greater than a first preset threshold.

providing an explanation of the key content data;

The present invention also provides a terminal device, characterized in that the terminal device comprises a processor and a memory, the memory storing a computer program that can run on the processor, the computer program implementing the method described above when executed by the processor.

The present invention also provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that can run on a processor, the computer program implementing the method described above when executed.

With the method of the present invention, time-point-based associations between content and interactive services can be built more easily, so that interaction with the user is fast, better matches the user's needs, and improves the user experience.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 shows the method for creating audio/video with an interactive function in one embodiment of the present invention.

Fig. 2 shows the method for playing audio/video with an interactive function in one embodiment of the present invention.

Fig. 3 shows the device for creating audio/video with an interactive function in one embodiment of the present invention.

Fig. 4 shows the device for playing audio/video with an interactive function in one embodiment of the present invention.

Detailed description of embodiments

To make the objects, technical solutions, and advantages of the present invention clearer, embodiments of the invention are described in further detail below with reference to the drawings. The embodiments and their specific features are a detailed explanation of the technical solutions of the embodiments, not a limitation on the technical solutions of this specification; where no conflict arises, the embodiments and their technical features may be combined with one another.

The method of the present invention can be applied to any device or equipment with processing and/or playback functions, such as a computer, mobile terminal, in-vehicle device, or television.

Embodiment one

The method of the present invention for creating audio/video with an interactive function is described below. Referring to Fig. 1, the method comprises:

Step 101, obtain audio/video resources to be processed;

Preferably, step 102 specifically comprises:

storing the play times and the content data, indexed by play time.

The content data is preferably text content data.

Specifically, take an audio/video program A with a total duration of 30 minutes. The video and the audio of the resource are analyzed separately. For the video, feature detection can be performed on the video frames one by one in play-time order to extract the information elements in each frame: for example, from 00 min 05 s to 00 min 18 s a puppy and a truck appear in the frames, and at 03 min 12 s a space shuttle appears. For the audio, if a subtitle file exists, the text it contains can be extracted directly together with its time axis; if no subtitle file exists, speech analysis can convert the audio into text and the corresponding play times. For example, from 00 min 05 s to 00 min 18 s the text content converted from the audio data is "Take a look in the old truck to see whether anything else was left behind"; as another example, the background audio starting at 15 min 17 s is a song, and its lyrics text is extracted. The converted content data and the corresponding play times can be stored in a table data structure.
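The table described here can be pictured as a list of records ordered by play time. The sketch below is a minimal illustration rather than the patented storage format; the names `ContentRecord`, `ContentTable`, and `mmss`, the "video"/"audio" source labels, and the use of seconds as the time unit are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ContentRecord:
    """One recognized piece of content and the play-time span it covers (seconds)."""
    start: int
    end: int
    source: str    # "video" or "audio" (hypothetical labels)
    content: str


@dataclass
class ContentTable:
    """Content data stored with play time as the index."""
    records: list = field(default_factory=list)

    def add(self, start, end, source, content):
        self.records.append(ContentRecord(start, end, source, content))
        # Keep the table ordered by play time so it can serve as the index.
        self.records.sort(key=lambda r: (r.start, r.end))

    def at(self, play_time):
        """Return every record whose span contains play_time."""
        return [r for r in self.records if r.start <= play_time <= r.end]


def mmss(minutes, seconds):
    """Convert minutes and seconds to a play time in seconds."""
    return minutes * 60 + seconds


table = ContentTable()
table.add(mmss(0, 5), mmss(0, 18), "video", "puppy, truck")
table.add(mmss(0, 5), mmss(0, 18), "audio",
          "Take a look in the old truck to see whether anything else was left behind")
table.add(mmss(3, 12), mmss(3, 12), "video", "space shuttle")
```

Looking up `table.at(mmss(3, 12))` then returns the single "space shuttle" record, while any time between 00 min 05 s and 00 min 18 s returns both the video and the audio record.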

Specifically, time tags are created in step 103. A time tag can correspond to a time point or to a time period; that is, the time tag comprises an audio/video play time or an audio/video play time interval. For example, a time tag can be "XX min XX s" or "XX min XX s to XX min XX s".
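A time tag that may be either a point or an interval can be modelled with an optional end time. This is a minimal sketch; the class name `TimeTag`, its methods, and the seconds-based representation are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimeTag:
    """A time tag: a single play time, or a play-time interval (seconds)."""
    start: int
    end: Optional[int] = None   # None means the tag marks a single time point

    def covers(self, play_time: int) -> bool:
        """True if play_time equals the point or lies inside the interval."""
        if self.end is None:
            return play_time == self.start
        return self.start <= play_time <= self.end

    def label(self) -> str:
        """Render the tag as MM:SS or MM:SS-MM:SS."""
        def fmt(t):
            return f"{t // 60:02d}:{t % 60:02d}"
        if self.end is None:
            return fmt(self.start)
        return f"{fmt(self.start)}-{fmt(self.end)}"


point_tag = TimeTag(3 * 60 + 12)                      # 03 min 12 s
interval_tag = TimeTag(15 * 60 + 17, 16 * 60 + 57)    # 15 min 17 s to 16 min 57 s
```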

Specifically, step 104 comprises:

analyzing the content data according to the time tags;

splitting the content data into one or more semantic paragraphs;

associating the time tags with the one or more pieces of key content data.
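The text names the stages of step 104 but not the algorithms behind them. The sketch below substitutes deliberately simple stand-ins so the data flow is visible: sentence splitting in place of semantic segmentation, and word frequency in place of criticality scoring. The stop-word list, function names, and sample sentence are all hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list only; a real system would use a proper lexicon.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "to", "and", "it", "was", "out", "off"}


def split_paragraphs(text):
    """Stand-in for semantic segmentation: split on sentence-ending punctuation."""
    return [p.strip() for p in re.split(r"[.!?]+", text) if p.strip()]


def key_content(paragraph, top_n=1):
    """Stand-in for criticality scoring: the most frequent non-stop-words."""
    words = [w for w in re.findall(r"[a-z]+", paragraph.lower()) if w not in STOP_WORDS]
    return [w for w, _ in Counter(words).most_common(top_n)]


def associate(tagged_content):
    """Map each time tag to the key content extracted from its content data."""
    result = {}
    for tag, text in tagged_content.items():
        keys = []
        for paragraph in split_paragraphs(text):
            for key in key_content(paragraph):
                if key not in keys:        # keep each key once per tag
                    keys.append(key)
        result[tag] = keys
    return result


tagged = {"03:12": "The shuttle rolls out and the shuttle is fueled. "
                   "The shuttle lifts off and the shuttle climbs."}
associations = associate(tagged)
```

The output maps the tag "03:12" to the key content "shuttle"; replacing either stand-in with a stronger model would not change the shape of the result.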

Specifically, in step 105, interactive components corresponding to the one or more pieces of key content data are created. An interactive component comprises an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data. For example, for the time tag 03 min 12 s the key content data is "space shuttle". An explanation of the space shuttle can be fetched from the network; it may include a basic definition such as "A space shuttle is a crewed, reusable spacecraft that travels between space and the ground", and may also cover its history, basic structure, and so on. In addition, an application, plug-in, or configuration file related to the space shuttle can be associated, such as a quiz app on space shuttle knowledge. As another example, for the time tag 15 min 17 s to 16 min 57 s the key data is the song and its lyrics fragment. Through an Internet sharing platform, the lyrics fragment is used to search for a corresponding explanation, including the song title, singer, album name, and so on, or for a link to the song on an associated music playback/download platform, such as XX Music. A music application, plug-in, or configuration file related to the song can also be associated.

After the interactive components of the key content data are completed, a knowledge graph can be built so that the body of knowledge related to the key content data is presented to the user systematically. The knowledge graph can be a semantic network, a graph-based data structure, comprising multiple data nodes and one or more connecting edges that indicate the relationships between data nodes. The data nodes comprise one or more time nodes, one or more key content data nodes, and one or more interactive component nodes; the connecting edges correspond to the logical associations among time, key content data, and interactive component nodes. For example, the time node "time tag 15 min 17 s to 16 min 57 s" is associated with the key content data node "song", and the key content data node "song" is associated with the interactive component node "song title, singer". As a person skilled in the art will appreciate, one time node can connect to multiple key content data nodes, and one key content data node can connect to multiple interactive component nodes. The interactive component nodes connected to a key content data node have different priority indexes, which can be adjusted dynamically according to frequency of use and the user's selection preferences.
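The graph described in this paragraph can be sketched with an adjacency list plus a priority table. The class and method names below are hypothetical, and raising the priority index by one on each use is only one assumed way of adjusting it according to frequency of use.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Graph of time nodes, key content nodes, and interactive component nodes."""

    def __init__(self):
        self.edges = defaultdict(list)   # node -> nodes it connects to
        self.priority = {}               # interactive component node -> priority index

    def link(self, source, target):
        """Add a connecting edge from source to target."""
        self.edges[source].append(target)

    def add_component(self, key_content, component, priority=0):
        """Attach an interactive component node to a key content node."""
        self.link(key_content, component)
        self.priority[component] = priority

    def record_use(self, component):
        """Raise a component's priority index each time the user picks it."""
        self.priority[component] += 1

    def components(self, key_content):
        """Interactive components for a key content node, highest priority first."""
        return sorted(self.edges[key_content], key=lambda c: -self.priority[c])


graph = KnowledgeGraph()
graph.link("15:17-16:57", "song")                            # time node -> key content node
graph.add_component("song", "song title, singer", priority=2)
graph.add_component("song", "music platform link", priority=1)
graph.record_use("music platform link")
graph.record_use("music platform link")
ranked = graph.components("song")
```

After two recorded uses the platform link overtakes the title-and-singer component, which is the kind of dynamic reordering the paragraph describes.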

Preferably, the method may also comprise:

Step 108, store the interactive audio/video resource.

Specifically, encapsulating the audio/video resource fragments to which the key content tags and the interactive components of the key content data are bound merges the audio/video resource with the interactive components, so that interaction is also possible offline.
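The text does not specify an encapsulation format. As one assumed possibility, the bindings could be written to a JSON manifest that travels with the media file, which would let a player resolve interactions without a network connection. The `package` function, the manifest field names, and the resource name "program_A" are hypothetical.

```python
import json


def package(resource_name, bindings):
    """Bundle each fragment's time tag, key content, and interactive components."""
    manifest = {
        "resource": resource_name,
        "fragments": [
            {"time_tag": tag, "key_content": key, "components": components}
            for tag, key, components in bindings
        ],
    }
    # ensure_ascii=False keeps non-ASCII titles and lyrics readable in the file.
    return json.dumps(manifest, ensure_ascii=False, indent=2)


packaged = package("program_A", [
    ("03:12", "space shuttle", [
        {"type": "explanation",
         "text": "A space shuttle is a crewed, reusable spacecraft that "
                 "travels between space and the ground."},
    ]),
])
```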

Embodiment two

The method of the present invention for playing audio/video with an interactive function is described below. Referring to Fig. 2, the playback method further comprises:

The audio/video resource being played can be created in the manner of Embodiment one.

Specifically, the user instruction may comprise a voice interaction instruction and/or a motion-sensing interaction instruction and/or a touch interaction instruction.

For example, during playback, a voice instruction issued by the user is received through a voice recording device: "What song is this?" Alternatively, a touch device receives the user's tap on a specific target image on the screen, a touch device or keyboard receives an instruction the user enters through a dialog interface, or a motion-sensing device captures the user's movement to obtain a user action instruction.

Specifically, in step 201, the user issues the instruction "What song is this?", and the play time recorded at that moment is 15 min 50 s. In step 202, the play time 15 min 50 s falls within the time tag 15 min 17 s to 16 min 57 s, so that time tag and its key content data are extracted.
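The lookup in step 202 amounts to finding every time tag whose span contains the recorded play time. A minimal sketch, assuming tags are stored as (start, end, key content) tuples in seconds; `TAGGED_CONTENT`, `to_seconds`, and `extract` are hypothetical names.

```python
def to_seconds(minutes, seconds):
    """Convert minutes and seconds to a play time in seconds."""
    return minutes * 60 + seconds


# (start, end, key content data); a point tag has start == end.
TAGGED_CONTENT = [
    (to_seconds(3, 12), to_seconds(3, 12), "space shuttle"),
    (to_seconds(15, 17), to_seconds(16, 57), "song"),
]


def extract(play_time):
    """Return (time tag, key content data) pairs whose tag covers play_time."""
    return [((start, end), key)
            for start, end, key in TAGGED_CONTENT
            if start <= play_time <= end]


hits = extract(to_seconds(15, 50))   # the instruction arrived at 15 min 50 s
```

For the play time 15 min 50 s only the song tag matches; a play time outside every tag yields an empty list, in which case there is no key content to interact with.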

Specifically, step 201 also comprises:

For example, an attribute of 0 or 1 is set for the key content data: 0 means the key content data is displayed passively; 1 means it is displayed actively. When playback reaches 03 min 12 s, the key content data is "space shuttle" and its attribute value is 1, so an explanation of the space shuttle is shown as a special-effect subtitle on the space shuttle image in the video. Alternatively, playback can be paused and an interactive Q&A assistant related to the space shuttle launched, which asks the user, "Dear children, do you know what a space shuttle is?" The assistant then takes over; after the user ends it, the playback interface returns and playback resumes.

In addition, the ability of the audio/video resource to initiate interaction can be made configurable, with the user choosing whether to enable it. If it is enabled, the determination of the key content data attribute is activated, that is, steps 2011-2013 are executed; if it is not enabled, steps 2011-2013 are not executed.
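The active-display path (steps 2011 to 2013) together with the user's on/off switch can be sketched as a check performed on each playback tick. The attribute values 0 and 1 come from the text; the names `PASSIVE`, `ACTIVE`, `KEY_CONTENT`, and `on_tick`, the returned message string, and the assignment of the passive attribute to the song are illustrative assumptions.

```python
PASSIVE, ACTIVE = 0, 1

# play time of the tag (seconds) -> (key content data, attribute)
KEY_CONTENT = {
    192: ("space shuttle", ACTIVE),   # 03 min 12 s, attribute 1 as in the text
    917: ("song", PASSIVE),           # 15 min 17 s, attribute assumed for illustration
}


def on_tick(play_time, instruction_received, active_mode_enabled):
    """Decide whether to start an interaction without a user instruction."""
    if instruction_received or not active_mode_enabled:
        return None                       # steps 2011-2013 are skipped
    entry = KEY_CONTENT.get(play_time)    # step 2011: key content at this tag?
    if entry is None:
        return None
    key, attribute = entry                # step 2012: read the attribute
    if attribute == ACTIVE:               # step 2013: actively displayed
        return f"show interactive component for {key}"
    return None
```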

Specifically, in step 203, determining whether the user instruction satisfies the first trigger condition specifically comprises:

Specifically, in step 204, providing the user with an interactive service related to the user instruction comprises one or more of the following:

providing an explanation of the key content data;

For example, based on the voice instruction "What song is this?" issued by the user at 15 min 50 s, and on the key content data of the time tag 15 min 17 s to 16 min 57 s, matching against the knowledge graph determines that the currently matched key knowledge node is "song". Next, the interactive component nodes associated with the key knowledge node "song" are looked up. According to the priority indexes of those nodes, the interactive component node with the highest priority index can be selected to give the user song-related feedback: for example, the song title, singer, album name, and so on can be shown as a special-effect subtitle, or playback can be paused and the user shown a link or shortcut icon for the song on a music playback/download platform. Depending on the user's choice, the operating interface of the music playback/download platform application or plug-in can then be shown in a floating window, or the player can jump to that operating interface.

Embodiment three

The device of the present invention for creating audio/video with an interactive function is described below. Referring to Fig. 3, the device comprises:

an obtaining module, for obtaining audio/video resources to be processed;

Preferably, the conversion module is specifically configured to:

store the play times and the content data, indexed by play time.

Preferably, the content processing module is specifically configured to:

associate the time tags with the one or more pieces of key content data.

Preferably, the interactive component comprises an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data.

Preferably, the device also comprises:

Embodiment four

The device of the present invention for playing audio/video with an interactive function is described below. Referring to Fig. 4, the device comprises:

Preferably, the user instruction may comprise a voice interaction instruction and/or a motion-sensing interaction instruction and/or a touch interaction instruction.

Preferably, the device also comprises:

Preferably, determining whether the user instruction satisfies the first trigger condition specifically comprises:

parsing the user instruction and converting it into first text data;

Preferably, providing the user with an interactive service related to the user instruction comprises one or more of the following:

providing an explanation of the key content data;

The present invention provides a terminal device, characterized in that the terminal device comprises a processor and a memory, the memory storing a computer program that can run on the processor, the computer program implementing the method described above when executed by the processor.

The present invention provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that can run on a processor, the computer program implementing the method described above when executed.

Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Computer-readable storage media may include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), flash memory, erasable programmable read-only memory (EPROM), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.

Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof.

The above are merely examples given to aid understanding of the present invention and are not intended to limit its scope. In a specific implementation, a person skilled in the art may change, add, or remove components of the device according to the actual situation, and may change, add, remove, or reorder the steps of the method according to the actual situation, provided the functions realized by the method are not affected.

Although embodiments of the present invention have been shown and described, a person skilled in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and purpose of the invention. The scope of the invention is defined by the claims and their equivalents; improvements made without creative effort shall fall within the scope of protection of the invention.

Claims

1. A method for creating audio/video with an interactive function, characterized in that the method comprises:

Step 101, obtain audio/video resources to be processed;

Step 104, according to the time tags, segment the content data, extract one or more pieces of key content data, and associate the time tags with the one or more pieces of key content data;

Step 106, bind the key content data belonging to the same time tag, together with its corresponding interactive components, to the audio/video resource fragment.

2. The method according to claim 1, characterized in that step 102 specifically comprises:

performing speech recognition and image recognition on the audio/video resources and outputting a recognition result, the recognition result comprising content data and play times corresponding to the content data;

storing the play times and the content data, indexed by play time.

3. The method according to claim 1, characterized in that step 104 comprises:

associating the time tags with the one or more pieces of key content data.

4. The method according to claim 1 or 3, characterized in that

the interactive component comprises an explanation of the key content data, or an application, plug-in, or configuration file associated with the key content data.

5. The method according to claim 1, characterized in that the method also comprises:

Step 107, encapsulate the audio/video resource fragments to which the key content tags and the interactive components of the key content data are bound, to construct an interactive audio/video resource;

Step 108, store the interactive audio/video resource.

6. A method for playing audio/video with an interactive function, characterized in that the audio/video with an interactive function is created according to the method of any one of claims 1-5; the playback method further comprises:

Step 201, determine whether a user instruction has been received; if so, determine the play time of the audio/video resource at the moment the user instruction was received;

Step 204, according to the user instruction and the key content data, provide the user with an interactive service related to the user instruction.

7. The method according to claim 6, characterized in that

the user instruction may comprise a voice interaction instruction and/or a motion-sensing interaction instruction and/or a touch interaction instruction.

8. The method according to claim 6, characterized in that step 201 also comprises:

Step 2011, if no user instruction has been received, determine whether key content data exists for the time tag corresponding to the current play time;

9. The method according to claim 6, characterized in that determining whether the user instruction satisfies the first trigger condition specifically comprises:

Step 2032, determine whether the first text data satisfies the first trigger condition, wherein the first trigger condition comprises the similarity between the first text data and the key content data being greater than a first preset threshold.

10. The method according to claim 6, characterized in that

providing the user with an interactive service related to the user instruction comprises one or more of the following:

providing an explanation of the key content data;

calling an application, plug-in, or configuration file associated with the key content data, and communicating with the user through voice interaction or GUI interaction.