CN109688475A - Video playing jump method, system and computer readable storage medium - Google Patents

Video playing jump method, system and computer readable storage medium

Info

Publication number
CN109688475A
Authority
CN
China
Prior art keywords
video
voice information
video playing
audio data
scene tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811654558.6A
Other languages
Chinese (zh)
Other versions
CN109688475B (en
Inventor
李其浪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL New Technology Co Ltd filed Critical Shenzhen TCL New Technology Co Ltd
Priority to CN201811654558.6A priority Critical patent/CN109688475B/en
Publication of CN109688475A publication Critical patent/CN109688475A/en
Priority to PCT/CN2019/126022 priority patent/WO2020135161A1/en
Application granted granted Critical
Publication of CN109688475B publication Critical patent/CN109688475B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/232 Content retrieval operation locally within server, e.g. reading video streams from disk arrays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention discloses a video playing jump method, a system and a computer-readable storage medium. The method comprises: receiving user voice information collected by a video playing terminal; recognizing the user voice information and extracting features of the voice information; matching the voice information features against the different scene tags in preset audio data to obtain the scene tag that matches the voice information features; and sending the matching scene tag to the video playing terminal, so as to control the video being played on the video playing terminal to jump to the corresponding position. The invention also discloses a video playing jump system and a computer-readable storage medium. Through speech recognition and semantic recognition on the server, the invention lets a user jump within a video by voice command, thereby improving the user experience.

Description

Video playing jump method, system and computer readable storage medium
Technical field
The present invention relates to the field of video playback technology, and in particular to a video playing jump method, a system and a computer-readable storage medium.
Background art
With the development of Internet technology, people no longer rely solely on receiving live television signals to watch live video; instead, they watch any video available on the network, including live video, over the Internet. They can not only choose video types according to their own preferences, but also adjust the playback progress at will while watching, jumping directly to the scene they want to watch.
To adjust the playback progress, the user can use a key on the TV remote control or a virtual key in the video playback software. For example, when the user presses the key on the remote control or the virtual key in the playback software, playback jumps forward or backward by a fixed amount of time; when the user holds the key down, playback keeps jumping forward or backward; and when the user sets a jump time, the TV or playback software loads the video and resumes playback from that time. In all of these cases the user has to operate keys manually to reach the desired scene, and it is difficult to get there in a single jump, so the user experience is poor.
Summary of the invention
The main purpose of the present invention is to provide a video playing jump method, a system and a computer-readable storage medium, intended to solve the technical problem that a user must repeatedly operate keys by hand to jump a video to the scene they want to watch, which results in a poor user experience.
To achieve the above object, the present invention provides a video playing jump method, comprising the following steps:
receiving user voice information collected by a video playing terminal;
recognizing the user voice information and extracting features of the voice information;
matching the voice information features against the different scene tags in preset audio data to obtain the scene tag that matches the voice information features;
sending the scene tag that matches the voice information features to the video playing terminal, so as to control the video being played on the video playing terminal to jump to the corresponding position.
Preferably, before the step of matching the voice information features against the different scene tags in the preset audio data to obtain the scene tag that matches the voice information features, the method comprises:
judging whether the voice information features include the name of a video to jump to;
if the voice information features do not include the name of a video to jump to, obtaining the name of the currently playing video;
the step of matching the voice information features against the different scene tags in the preset audio data to obtain the scene tag that matches the voice information features then comprises:
matching the voice information features and the name of the currently playing video against the different scene tags in the audio data to obtain the scene tag that matches the voice information features.
Preferably, after the step of judging whether the voice information features include the name of a video to jump to, the method comprises:
if the voice information features include the name of a video to jump to, executing the step of matching the voice information features against the different scene tags in the preset audio data to obtain the scene tag that matches the voice information features.
Preferably, the step of matching the voice information features against the different scene tags in the preset audio data to obtain the scene tag that matches the voice information features comprises:
judging whether the preset audio data includes the audio data corresponding to the currently playing video;
if the preset audio data does not include the audio data corresponding to the currently playing video, sending a request instruction to the video playing terminal;
receiving the audio data corresponding to the currently playing video sent by the video playing terminal, and saving that audio data into the preset audio data.
Preferably, after the step of matching the voice information features against the different scene tags in the preset audio data to obtain the scene tag that matches the voice information features, the method further comprises:
if no scene tag satisfying the voice information features is matched within a preset time, generating a match-failure prompt;
sending the match-failure prompt to the video playing terminal, so that the video playing terminal displays the prompt information.
In addition, to achieve the above object, the present invention also provides a video playing jump method, comprising the following steps:
collecting voice information input by a user;
sending the user voice information to a server, so that the server matches the voice information features against the different scene tags in the audio data and obtains the scene tag that matches the voice information features;
receiving the scene tag that matches the voice information features, and jumping the video being played on the video playing terminal to the corresponding position.
Preferably, after the step of sending the user voice information and the name information of the currently playing video to the server, the method further comprises:
receiving an audio data request instruction sent by the server;
sending the audio data corresponding to the currently playing video to the server.
Preferably, after the step of sending the user voice information to the server, the method further comprises:
if the server does not match a scene tag satisfying the voice information features within a preset time, receiving a match-failure prompt and displaying it on the video terminal interface to prompt the user.
In addition, to achieve the above object, the present invention also provides a video playing jump system, the video playing jump system comprising a video playing terminal and a server, wherein:
the video playing terminal collects voice information input by a user, and sends the user voice information and the name information of the currently playing video to the server;
the server receives the user voice information collected by the video playing terminal, recognizes the voice information, extracts the features of the voice information, matches the voice information features against the different scene tags in preset audio data to obtain the scene tag that matches the voice information features, and sends that scene tag to the video playing terminal;
the video playing terminal receives the scene tag that matches the voice information features, and jumps the video being played on the video playing terminal to the corresponding position.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a video playing terminal and a server, implements the video playing jump method described above.
The present invention is applied to an interactive system formed by a video playing terminal and a server. The server first receives the user voice information that the video playing terminal collects through a voice acquisition module such as a microphone, and recognizes it through speech recognition and semantic recognition to obtain the features of the user voice information, which mainly include the name of the video the user wants to jump to, the intended scene and similar information. The server then matches the voice information features against the different scene tags in the audio data to obtain the scene tag that matches the voice information features, and finally sends that scene tag to the video playing terminal so that the video jumps to the corresponding position. The user can thus jump within a video by voice command and land accurately on the desired scene, which improves the user experience.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system architecture involved in the embodiments of the present invention;
Fig. 2 is a flow diagram of the first embodiment of the video playing jump method of the present invention;
Fig. 3 is a flow diagram of the second embodiment of the video playing jump method of the present invention;
Fig. 4 is a flow diagram of the third embodiment of the video playing jump method of the present invention;
Fig. 5 is a flow diagram of the fourth embodiment of the video playing jump method of the present invention;
Fig. 6 is a structural schematic diagram of the first embodiment of the video playing jump system of the present invention.
The realization of the objects, the functional features and the advantages of the present invention will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
The main solution of the embodiments of the present invention is: receiving user voice information collected by a video playing terminal; recognizing the user voice information and extracting features of the voice information; matching the voice information features against the different scene tags in preset audio data to obtain the scene tag that matches the voice information features; and sending the scene tag that matches the voice information features to the video playing terminal, so as to control the video being played on the video playing terminal to jump to the corresponding position.
Since the prior art cannot jump video playback to the corresponding scene position based on the scene features in the user's speech, the present invention is needed to solve this problem.
The present invention provides a solution that lets the user jump within a video by voice command and land accurately on the scene the user wants, improving the user experience.
Fig. 1 is a schematic diagram of the system architecture of an embodiment of the video playing jump method of the present application.
Referring to Fig. 1, the system architecture 100 may include video playing terminals 101, 102 and 103, a network 104 and a server 105. The network 104 serves as the medium providing communication links between the video playing terminals 101, 102, 103 and the server 105. The network 104 may include various wired and wireless communication links, such as fiber-optic cables, mobile networks, WiFi, Bluetooth or hotspots.
A user can use the video playing terminals 101, 102, 103 to interact with the server 105 through the network 104, in order to receive or send messages and the like. Various communication client applications may be installed on the video playing terminals 101, 102, 103, such as video playback applications, web browser applications, shopping applications, search applications, instant messaging tools, mailbox clients and social platform software.
The video playing terminals 101, 102, 103 may be hardware or software. When they are hardware, they may be any of various electronic devices that have a display screen and support video playback, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers and desktop computers. When the video playing terminals 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented either as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, for example reading the video being played on the video playing terminals 101, 102, 103, or analyzing and otherwise processing the various voice information, instruction information and video/audio data it receives, and feeding the processing results, such as video clips, scene tags and instruction information, back to the video playing terminal so that the terminal completes the corresponding action according to the processing result.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
It should be noted that the video playing jump method provided by the embodiments of the present application may be executed by the video playing terminals 101, 102, 103 or by the server 105. Correspondingly, the apparatus for pushing information may be arranged in the video playing terminals 101, 102, 103 or in the server 105. No specific limitation is imposed here.
It should be understood that the numbers of video playing terminals, networks and servers in Fig. 1 are merely illustrative. There may be any number of video playing terminals, networks and servers, as the implementation requires.
Referring to Fig. 2, the first embodiment of the present invention provides a video playing jump method, comprising the following steps:
Step S10, receiving user voice information collected by a video playing terminal.
The present invention can be applied to an interactive system formed by a video playing terminal and a server, which are connected through a network and interact over it. In this embodiment the video playing terminal is a television set by way of example: the voice acquisition module of the television set collects the user's voice information in real time, and the collected voice information is sent to the server over a wireless network. The server receives, in real time, the user voice information sent by the television set at the other end of the network.
Step S20, recognizing the user voice information and extracting the features of the voice information.
The server performs speech recognition and semantic recognition on the received user voice information. Speech recognition converts the voice information, by means of an acoustic model and a language model, into text that a computer can process. Semantic recognition builds on speech recognition and performs intelligent analysis based on features such as the user's gender, hobbies and usual on-demand viewing preferences, in order to better understand the user's intent. For example, if the user's voice input is the full name of a specific film or TV series, speech recognition alone is enough for the server to find the film or TV series the user wants to watch; if the voice input is a vague phrase such as "romance movie", "popular action movie", "movie by a Hong Kong director" or "Hollywood blockbuster", the server also needs semantic recognition before it can jump accurately.
Based on its speech recognition and semantic recognition functions, the server can extract the features of the user voice information. For example, if the user's voice input is "Director Zhao is investigated in the TV series 'Name of **'", the server can recognize the speech and extract the features "TV series", "Name of **" and "Director Zhao is investigated".
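By way of illustration only, the feature extraction of step S20 can be sketched as follows, assuming speech recognition has already produced text. The `VoiceFeatures` type, the `COMMAND` pattern and the fixed command shape are hypothetical and are not part of the described method, which relies on semantic recognition rather than a single pattern.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical command shape: <scene> in [the] <video type> "<title>".
# A real system would rely on a trained semantic model, not one regex.
COMMAND = re.compile(
    r'^(?P<scene>.+?) in (?:the )?(?P<type>TV series|movie) "(?P<title>[^"]+)"$'
)


@dataclass
class VoiceFeatures:
    video_type: Optional[str]  # e.g. "TV series"
    title: Optional[str]       # name of the video to jump to, if spoken
    scene: str                 # description of the scene the user wants


def extract_features(text: str) -> VoiceFeatures:
    """Extract features from text already produced by speech recognition."""
    text = text.strip()
    m = COMMAND.match(text)
    if m:
        return VoiceFeatures(m["type"], m["title"], m["scene"])
    # No video type or title recognized: treat the whole text as the scene.
    return VoiceFeatures(None, None, text)
```

When no title is recognized, `title` stays `None`, which is the case in which the server would fall back to the currently playing video.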
Step S30, matching the voice information features against the different scene tags in the audio data to obtain the scene tag that matches the voice information features.
A large amount of audio data is preset in the server of the present invention, and speech recognition tagging is performed on all of it to generate the corresponding scene tags. The server can generate different scene tags for the different scenes in the audio data; a scene tag includes related information such as video type, title, scene description, characters, time and episode number. A scene tag can be placed at the beginning, the end or the climax of the corresponding scene's audio; in this application it is preferably placed at the beginning.
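One possible in-memory shape for such a scene tag is sketched below. The field names and the `tags_for_title` helper are assumptions chosen to mirror the information listed above (video type, title, scene description, characters, time, episode number); the source does not prescribe a storage format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SceneTag:
    video_type: str              # e.g. "TV series"
    title: str                   # name of the video
    description: str             # short scene description
    characters: Tuple[str, ...]  # people appearing in the scene
    episode: int                 # episode number
    start_s: float               # tag position: start of the scene, in seconds


def tags_for_title(tags: List[SceneTag], title: str) -> List[SceneTag]:
    """Return the tags of one video, ordered by episode and start time."""
    selected = (t for t in tags if t.title == title)
    return sorted(selected, key=lambda t: (t.episode, t.start_s))
```

Storing the start position on the tag reflects the stated preference for placing the tag at the beginning of the scene.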
It should be noted that in addition to the embodiments described above, server can be according to the magnanimity audio data in audio database Corresponding video clip or caption information are obtained from television set or network, and then video clip or caption information are carried out Intellectual analysis generates scene tag in the corresponding position of audio data.
In the present embodiment, user, which is intended to jump video playing terminal, jumps to the corresponding video corresponding period, such as TV play " people's justice of * * " is currently played in video playing terminal, and the voice command of user's typing this moment is the " TV play " name of * * Justice " in director Zhao looked into ", server judge first in user speech information extraction audio database with the TV play " name of * * Justice " related all audio-frequency informations.
According to the user for including in the user speech information characteristics scene information to be jumped, the scene that user to be jumped is believed Breath is matched with each scene tag in the audio data, the highest scene tag of matching degree is found out, such as user's typing voice Order is " director Zhao is looked into TV play " name of * * " ", then finds in corresponding audio data and own in audio database Scene tag, such as director Zhao is grabbed, Chen Yanshi fights excavator, Hou Liangping and Qi Tongwei sings and " deals with by intelligence ", Ou Yangjing is grabbed, and is looked for The scene tag to match out with " director Zhao is grabbed ".
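The source does not say how the matching degree is computed. As a minimal sketch, and purely as an assumption, the following uses word-overlap (Jaccard) similarity between the requested scene and each scene description; the `threshold` value is likewise hypothetical.

```python
import re
from typing import Iterable, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lower-case word set of a phrase."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_degree(query: str, label: str) -> float:
    """Jaccard similarity, a stand-in for the unspecified matching degree."""
    q, l = tokens(query), tokens(label)
    if not q or not l:
        return 0.0
    return len(q & l) / len(q | l)


def best_scene_tag(query: str, labels: Iterable[str],
                   threshold: float = 0.3) -> Optional[str]:
    """Return the label with the highest matching degree, or None."""
    best = max(labels, key=lambda label: match_degree(query, label), default=None)
    if best is None or match_degree(query, best) < threshold:
        return None
    return best
```

A production system would more plausibly compare semantic representations, since a request and its matching tag need not share words.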
Step S40, sending the scene tag that matches the voice information features to the video playing terminal, so as to control the video being played on the video playing terminal to jump to the corresponding position.
After obtaining the scene tag that matches the voice information features, the server sends it to the video playing terminal, so that the video playing terminal jumps to the corresponding position according to the scene tag.
It should be noted that, in addition to the above embodiment, the server can generate a jump instruction according to the scene tag that matches the voice information features. The jump instruction contains the position information of the scene tag, so that the video playing terminal can jump to the corresponding position according to the jump instruction.
In this embodiment, the server receives the user voice information collected by the video playing terminal together with the name information of the currently playing video, performs speech recognition and semantic recognition on the user voice information, and extracts the features of the voice information. Then, according to the name information of the currently playing video, it confirms that the audio database contains the audio data corresponding to the currently playing video, matches the voice information features against the different scene tags in that audio data, obtains the scene tag that matches the voice information features, and sends that scene tag to the video playing terminal so as to control the video being played on the video playing terminal to jump to the corresponding position. The present invention recognizes the user voice information features through the server's speech recognition function and, according to those features, matches the scene tag consistent with the user's voice command, so that the video playing terminal performs the jump and lands accurately on the scene the user wants, thereby improving the user experience.
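The server-side flow of steps S10 to S40 can be summarized in a short sketch. The recognizer, feature extractor and matcher are passed in as callables because the source does not specify them; every name and the shape of the returned dictionary are assumptions.

```python
from typing import Any, Callable, Dict, Optional


def handle_voice_request(
    voice: bytes,
    current_title: str,
    recognize: Callable[[bytes], str],
    extract: Callable[[str], Dict[str, Any]],
    match: Callable[[Dict[str, Any], str], Optional[Any]],
) -> Dict[str, Any]:
    """Process one request received from the video playing terminal (S10)."""
    text = recognize(voice)                          # S20: speech recognition
    features = extract(text)                         # S20: feature extraction
    title = features.get("title") or current_title   # fall back to current video
    tag = match(features, title)                     # S30: scene tag matching
    if tag is None:
        return {"status": "failed", "prompt": "No matching scene found"}
    return {"status": "ok", "scene_tag": tag}        # S40: reply to the terminal
```

Injecting the three stages keeps the sketch independent of any particular speech engine or tag store.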
Further, referring to Fig. 3, the second embodiment of the present invention provides a video playing jump method. Based on the embodiment shown in Fig. 2, before step S30 of matching the voice information features against the different scene tags in the preset audio data to obtain the scene tag that matches the voice information features, the method comprises:
Step S50, judging whether the voice information features include the name of a video to jump to.
To improve the accuracy of the query result, this embodiment also judges, before matching the tags, whether the voice information features include the name of a video to jump to. If they do not, step S60 is executed: obtaining the name of the currently playing video.
In this embodiment, the user's voice command contains no name of a video to jump to; those skilled in the art will understand that the object the user wants to jump within is then the video currently being played on the video playing terminal, so the server obtains the name of the currently playing video from the playing terminal. Step S30 is then replaced by step S31: matching the voice information features and the name of the currently playing video against the different scene tags in the audio data to obtain the scene tag that matches the voice information features.
After obtaining the video name, the server matches the user's voice and the name of the currently playing video against the different scene tags in the audio data to obtain the scene tag that matches the voice information features. For example, the video playing terminal is currently playing the TV series "Name of **" and the server obtains the name of the currently playing video from the video playing terminal; given the user's voice command "Director Zhao is investigated in the TV series 'Name of **'", the server first extracts from the audio database all audio information related to the TV series "Name of **", and then matches the features in the voice information within that audio information. Narrowing the search by video name first makes matching faster and the result more accurate. In addition, if the user's voice input is "skip to the end", the server extracts the feature "end" and jumps the currently playing video to the start of the last episode.
Of course, if the voice information features do not include the name of a video to jump to, it is also possible not to obtain the name of the currently playing video and instead to match the voice information features directly against the tags in all of the preset audio data; this approach has more audio data to query, so the query is slower.
If the voice information features include the name of a video to jump to, step S30 is executed: matching the voice information features against the different scene tags in the preset audio data to obtain the scene tag that matches the voice information features.
The server's execution in this case is the same as in step S31; the only difference is where the video name comes from. In one case it is contained in the user voice information, in the other the server obtains it from the video playing terminal.
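The branch between steps S50, S60 and S31 amounts to choosing where the video name comes from. A minimal sketch follows, in which `fetch_current_title` stands in for the (unspecified) request to the video playing terminal and `tags_by_title` for the tag store; both names are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def resolve_target_title(
    spoken_title: Optional[str],
    fetch_current_title: Callable[[], str],
) -> str:
    """S50/S60: use the spoken name if present, else ask the terminal."""
    if spoken_title:
        return spoken_title
    return fetch_current_title()


def candidate_tags(tags_by_title: Dict[str, List[str]], title: str) -> List[str]:
    """S31: restrict matching to the scene tags of the resolved video."""
    return tags_by_title.get(title, [])
```

The terminal is only queried when the voice command carries no name, which matches the motivation of searching a smaller set of tags first.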
Furthermore, if the user's voice input is "romance movie", "popular action movie", "movie by a Hong Kong director", "Hollywood blockbuster" or the like and contains no specific TV series or movie name, the server has to do the matching itself in the speech database. It can perform intelligent analysis based on features such as the user's gender, hobbies and usual on-demand viewing preferences to select a video that suits the user, so that the video playing terminal jumps to that video. The user can also give other instructions: for example, if the voice input is "advance 30 minutes", the server extracts the features "advance" and "30 minutes" and jumps the currently playing video forward by 30 minutes.
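A relative command such as "advance 30 minutes" reduces to an offset in seconds applied to the current position. The sketch below assumes a fixed English phrasing; the "go back" direction, the unit table and the clamping to the video duration are illustrative assumptions rather than details from the source.

```python
import re
from typing import Optional

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}
RELATIVE = re.compile(r"^(advance|go back) (\d+) (second|minute|hour)s?$")


def parse_relative_seek(text: str) -> Optional[int]:
    """Return a signed offset in seconds, or None if the text is not a seek."""
    m = RELATIVE.match(text.strip().lower())
    if not m:
        return None
    direction, amount, unit = m.groups()
    offset = int(amount) * UNIT_SECONDS[unit]
    return offset if direction == "advance" else -offset


def seek_target(current_s: float, offset_s: float, duration_s: float) -> float:
    """Apply the offset and keep the result inside the video."""
    return min(max(current_s + offset_s, 0), duration_s)
```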
By having the server judge whether the user voice information features contain the name of a video to jump to, the present invention can jump within the currently playing video, switch playback to another video, or switch playback to a specific scene of another video, which better meets the needs of the general public.
Further, step S30 of matching the voice information feature against the different scene tags in the preset audio data to obtain the scene tag matching the voice information feature includes:
Step S32: judging whether the preset audio data includes the audio data corresponding to the currently playing video;
If the preset audio data does not include the audio data corresponding to the currently playing video, step S33 is executed, followed by step S34.
Step S33: sending a request instruction to the video playing terminal.
Step S34: receiving the audio data corresponding to the currently playing video sent by the video playing terminal, and saving the audio data in the audio database.
If the audio database does not contain the audio data corresponding to the video currently playing on the video playing terminal, the server sends a request instruction to the video playing terminal. The request instruction requires the video playing terminal to send the audio data corresponding to the currently playing video. After receiving the audio data sent by the video playing terminal, the server saves it in the audio database. This makes the audio data in the audio database richer and more complete, and also ensures that, when the video object the user wants to jump within is the currently playing video, the scene tag the user needs can be matched in time.
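The check, request and save sequence can be sketched as follows. The `AudioDatabase` class, the `ensure_audio_data` function and the `request_from_terminal` callback are hypothetical names introduced for illustration; the actual transport between server and terminal is not described in the source.

```python
class AudioDatabase:
    """In-memory stand-in for the server's audio database (assumption for this sketch)."""

    def __init__(self):
        self._store = {}

    def has(self, video_name):
        return video_name in self._store

    def save(self, video_name, audio):
        self._store[video_name] = audio

    def get(self, video_name):
        return self._store[video_name]


def ensure_audio_data(db, video_name, request_from_terminal):
    """Check the database; on a miss, request the audio from the terminal and save it.

    Returns True if a request to the terminal was needed, False otherwise.
    """
    if db.has(video_name):
        return False
    audio = request_from_terminal(video_name)  # terminal replies with the audio data
    db.save(video_name, audio)
    return True
```

Because the data is saved after the first request, later jump commands for the same video are served from the database without contacting the terminal again.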
Further, referring to Fig. 4, a third embodiment of the present invention provides a video playing jump method. Based on the embodiment shown in Fig. 2 above, after step S30 of matching the voice information feature against the different scene tags in the audio data to obtain the scene tag matching the voice information feature, the method further includes:
Step S70: if no scene tag meeting the voice information feature is matched within a preset time, generating a matching failure prompt;
Step S80: sending the matching failure prompt to the video playing terminal, so that the video playing terminal displays the prompt information.
The voice information feature is matched against the different scene tags in the audio database. If the audio database does not contain the video object the user wants to jump to, the matching ends directly. If the audio database does contain that video object, the audio information corresponding to the video name the user wants to jump to is identified in the audio database, each scene tag corresponding to that audio information is obtained, and the feature is matched against each scene tag. If no scene tag meeting the voice information feature is matched within the preset time, the matching ends. After the matching ends, a matching failure prompt is generated and sent to the video playing terminal. When the video playing terminal receives the matching failure prompt, it can display the prompt directly on the video playing interface, or show it through a user prompt control of the terminal such as Toast or Snackbar. Of course, in addition to the matching failure prompt, the matching result may also recommend, according to the voice information feature, other video information in the audio database that is closer to the user's intention. If the user's voice input is "romance movie", "popular action movie", "a film by a Hong Kong director", "Hollywood blockbuster" or the like, the server performs the matching in the voice database on its own; it can carry out intelligent analysis based on features such as the user's gender, hobbies and usual on-demand viewing tendencies, and select a video suitable for the user, so that the video playing terminal jumps to that video.
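Matching within a preset time and producing a failure prompt can be sketched as below. The dictionary layout of the tags and of the result, the subset-based match rule, and the injectable `clock` parameter are assumptions made for the example; the source only states that a failure prompt is generated when no tag is matched in time.

```python
import time


def match_with_deadline(features, tags, timeout_s, clock=time.monotonic):
    """Try each scene tag until one matches or the preset time runs out."""
    failed = {"status": "failed", "prompt": "Matching failed"}
    wanted = set(features)
    if not wanted:
        return failed
    deadline = clock() + timeout_s
    for tag in tags:
        if clock() > deadline:
            break  # preset time exceeded: stop matching
        # Assumed rule: every spoken feature must appear among the tag keywords.
        if wanted <= set(tag["keywords"]):
            return {"status": "matched", "tag": tag}
    return failed
```

The returned dictionary stands in for the message the server would send to the video playing terminal, which then either jumps or displays the prompt.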
Referring to Fig. 5, a fourth embodiment of the present invention provides a video playing jump method, comprising the following steps:
Step S110: collecting the voice information input by the user.
In this embodiment, the video playing terminal may include both a video playing module and a voice acquisition module; it may also include only a video playing module, with an external voice acquisition module such as a microphone. A mobile phone, a television, a computer and the like can all serve as the video playing terminal. This embodiment takes a mobile phone as the video playing terminal: the user's voice information is collected through the microphone of the mobile phone, and a video playing application installed on the mobile phone plays the video the user wants to watch.
Step S120: sending the user voice information to the server, so that the server matches the voice information feature against the different scene tags in the audio data and obtains the scene tag matching the voice information feature.
The mobile phone sends the user voice information to the server. The voice information may include a plot keyword (such as "X jumps off the Zhuxian Terrace"), or may include both a series name keyword and a plot keyword (such as "series name A, plot B"), so that the server can parse the video object and scene information the user intends to jump to directly from the voice information. At the same time, the server checks, according to the name information of the currently playing video sent by the mobile phone, whether the audio database contains the audio data corresponding to the currently playing video. If not, the following steps are executed:
Step S121: receiving the audio data request instruction sent by the server.
Step S122: sending the audio data corresponding to the currently playing video to the server.
After receiving the audio data request instruction sent by the server, the mobile phone retrieves the audio data corresponding to the currently playing video in the background, packages it, and uploads it to the server, so that the audio database of the server contains the audio data corresponding to the currently playing video.
Step S130: receiving the scene tag matching the voice information feature, and jumping the video played on the video playing terminal to the corresponding position.
The mobile phone receives the matching result sent by the server in real time. If the matching result is a scene tag matching the voice information feature, the video playing application performs the jump according to the position information included in the scene tag. If the server has not matched a scene tag meeting the voice information feature and the mobile phone receives a matching failure prompt, text information is displayed on the screen of the mobile phone to prompt the user.
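The terminal-side handling of the matching result can be sketched as follows. The message fields (`type`, `position_s`, `prompt`) and the `Player` class are hypothetical; a real video playing application would call its own player and UI components instead.

```python
class Player:
    """Minimal stand-in for the video playing application (assumption for this sketch)."""

    def __init__(self):
        self.position_s = 0
        self.messages = []

    def seek(self, position_s):
        self.position_s = position_s

    def show_message(self, text):
        self.messages.append(text)


def handle_match_result(player, result):
    """Jump on a matched scene tag; show text on a matching failure prompt."""
    if result.get("type") == "scene_tag":
        # The scene tag carries the position information used for the jump.
        player.seek(result["position_s"])
    elif result.get("type") == "match_failed":
        player.show_message(result.get("prompt", "No matching scene found"))
```

Keeping both outcomes in one handler mirrors the description: the terminal receives a single matching result and reacts according to its content.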
In this embodiment, the video playing terminal collects the voice information input by the user through the microphone, obtains the name information of the currently playing video in the background, and sends the user voice information and the name information of the currently playing video to the server, so that the server matches the voice information feature against the different scene tags in the audio data and obtains the scene tag matching the voice information feature. The terminal then receives the scene tag matching the voice information feature and jumps the video played on the video playing terminal to the corresponding position. The invention thus allows the user to realize a video jump simply by issuing a voice command and to jump to the video scene the user wants to watch, thereby improving the user experience.
Referring to Fig. 6, which is a schematic diagram of a first embodiment of a video playing jump system of the present invention, the video playing jump system includes a video playing terminal and a server.
The video playing terminal collects the voice information input by the user and sends the user voice information to the server;
The server receives the user voice information collected by the video playing terminal, recognizes the voice information, extracts the feature of the voice information, matches the voice information feature against the different scene tags in the preset audio data, obtains the scene tag matching the voice information feature, and sends the scene tag matching the voice information feature to the video playing terminal;
The video playing terminal receives the scene tag matching the voice information feature and jumps the video played on the video playing terminal to the corresponding position.
In addition, an embodiment of the present invention also proposes a computer readable storage medium. A video playing jump program is stored on the computer readable storage medium, and the video playing jump program implements the following operations when executed by the video playing terminal and the server:
Receiving the user voice information collected by the video playing terminal;
Recognizing the user voice information and extracting the feature of the voice information;
Matching the voice information feature against the different scene tags in the preset audio data, and obtaining the scene tag matching the voice information feature;
Sending the scene tag matching the voice information feature to the video playing terminal, so as to control the video played on the video playing terminal to jump to the corresponding position.
Further, before the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature, the operations include:
Judging whether the voice information feature includes a jump video name;
If the voice information feature does not include a jump video name, obtaining the name of the currently playing video;
The step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature includes:
Matching the voice information feature and the name of the currently playing video against the different scene tags in the audio data, and obtaining the scene tag matching the voice information feature.
Further, after the step of judging whether the voice information feature includes a jump video name, the operations include:
If the voice information feature includes a jump video name, executing the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature.
Further, the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature includes:
Judging whether the preset audio data includes the audio data corresponding to the currently playing video;
If the preset audio data does not include the audio data corresponding to the currently playing video, sending a request instruction to the video playing terminal;
Receiving the audio data corresponding to the currently playing video sent by the video playing terminal, and saving the audio data in the preset audio data.
Further, after the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature, the operations further include:
If no scene tag meeting the voice information feature is matched within a preset time, generating a matching failure prompt;
Sending the matching failure prompt to the video playing terminal, so that the video playing terminal displays the prompt information.
A video playing jump program is stored on the computer readable storage medium, and the video playing jump program also implements the following operations when executed by the video playing terminal and the server:
Collecting the voice information input by the user;
Sending the user voice information to the server, so that the server matches the voice information feature against the different scene tags in the audio data and obtains the scene tag matching the voice information feature;
Receiving the scene tag matching the voice information feature, and jumping the video played on the video playing terminal to the corresponding position.
Further, after the step of sending the user voice information and the name information of the currently playing video to the server, the operations further include:
Receiving the audio data request instruction sent by the server;
Sending the audio data corresponding to the currently playing video to the server.
Further, after the step of sending the user voice information and the name information of the currently playing video to the server, the operations further include:
If the server does not match a scene tag meeting the voice information feature within the preset time, receiving the matching failure prompt and displaying it on the interface of the video terminal to prompt the user.
The specific embodiments of the computer readable storage medium of the present invention are basically the same as the embodiments of the video jump method described above, and are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or system that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions for causing a video playing terminal (which may be a mobile phone, a computer, a television, a network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the scope of the patent. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A video playing jump method, characterized by comprising the following steps:
Receiving the user voice information collected by a video playing terminal;
Recognizing the user voice information and extracting the feature of the voice information;
Matching the voice information feature against the different scene tags in preset audio data, and obtaining the scene tag matching the voice information feature;
Sending the scene tag matching the voice information feature to the video playing terminal, so as to control the video played on the video playing terminal to jump to the corresponding position.
2. The video playing jump method according to claim 1, characterized in that, before the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature, the method comprises:
Judging whether the voice information feature includes a jump video name;
If the voice information feature does not include a jump video name, obtaining the name of the currently playing video;
The step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature comprises:
Matching the voice information feature and the name of the currently playing video against the different scene tags in the audio data, and obtaining the scene tag matching the voice information feature.
3. The video playing jump method according to claim 2, characterized in that, after the step of judging whether the voice information feature includes a jump video name, the method comprises:
If the voice information feature includes a jump video name, executing the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature.
4. The video playing jump method according to claim 1, characterized in that the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature comprises:
Judging whether the preset audio data includes the audio data corresponding to the currently playing video;
If the preset audio data does not include the audio data corresponding to the currently playing video, sending a request instruction to the video playing terminal;
Receiving the audio data corresponding to the currently playing video sent by the video playing terminal, and saving the audio data in the preset audio data.
5. The video playing jump method according to claim 1, characterized in that, after the step of matching the voice information feature against the different scene tags in the preset audio data and obtaining the scene tag matching the voice information feature, the method further comprises:
If no scene tag meeting the voice information feature is matched within a preset time, generating a matching failure prompt;
Sending the matching failure prompt to the video playing terminal, so that the video playing terminal displays the prompt information.
6. A video playing jump method, characterized by comprising the following steps:
Collecting the voice information input by a user;
Sending the user voice information to a server, so that the server matches the voice information feature against the different scene tags in the audio data and obtains the scene tag matching the voice information feature;
Receiving the scene tag matching the voice information feature, and jumping the video played on the video playing terminal to the corresponding position.
7. The video playing jump method according to claim 6, characterized in that, after the step of sending the user voice information to the server, the method further comprises:
Receiving the audio data request instruction sent by the server;
Sending the audio data corresponding to the currently playing video to the server.
8. The video playing jump method according to claim 6, characterized in that, after the step of sending the user voice information to the server, the method further comprises:
If the server does not match a scene tag meeting the voice information feature within a preset time, receiving the matching failure prompt and displaying it on the interface of the video terminal to prompt the user.
9. A video playing jump system, characterized in that the video playing jump system comprises a video playing terminal and a server, wherein:
The video playing terminal collects the voice information input by the user and sends the user voice information to the server;
The server receives the user voice information collected by the video playing terminal, recognizes the voice information, extracts the feature of the voice information, matches the voice information feature against the different scene tags in preset audio data, obtains the scene tag matching the voice information feature, and sends the scene tag matching the voice information feature to the video playing terminal;
The video playing terminal receives the scene tag matching the voice information feature and jumps the video played on the video playing terminal to the corresponding position.
10. A computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a video playing terminal and a server, implements the video playing jump method according to any one of claims 1 to 8.
CN201811654558.6A 2018-12-29 2018-12-29 Video playing skipping method and system and computer readable storage medium Active CN109688475B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811654558.6A CN109688475B (en) 2018-12-29 2018-12-29 Video playing skipping method and system and computer readable storage medium
PCT/CN2019/126022 WO2020135161A1 (en) 2018-12-29 2019-12-17 Video playback jump method and system, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811654558.6A CN109688475B (en) 2018-12-29 2018-12-29 Video playing skipping method and system and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109688475A true CN109688475A (en) 2019-04-26
CN109688475B CN109688475B (en) 2020-10-02

Family

ID=66191672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811654558.6A Active CN109688475B (en) 2018-12-29 2018-12-29 Video playing skipping method and system and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN109688475B (en)
WO (1) WO2020135161A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110166845A (en) * 2019-05-13 2019-08-23 Oppo广东移动通信有限公司 Video broadcasting method and device
CN111209437A (en) * 2020-01-13 2020-05-29 腾讯科技(深圳)有限公司 Label processing method and device, storage medium and electronic equipment
WO2020135161A1 (en) * 2018-12-29 2020-07-02 深圳Tcl新技术有限公司 Video playback jump method and system, and computer readable storage medium
CN111601163A (en) * 2020-04-26 2020-08-28 百度在线网络技术(北京)有限公司 Play control method and device, electronic equipment and storage medium
CN111818172A (en) * 2020-07-21 2020-10-23 海信视像科技股份有限公司 Method and device for controlling intelligent equipment by management server of Internet of things
CN112261436A (en) * 2019-07-04 2021-01-22 青岛海尔多媒体有限公司 Video playing method, device and system
CN112632329A (en) * 2020-12-18 2021-04-09 咪咕互动娱乐有限公司 Video extraction method and device, electronic equipment and storage medium
CN112954426A (en) * 2021-02-07 2021-06-11 咪咕文化科技有限公司 Video playing method, electronic equipment and storage medium
CN113689856A (en) * 2021-08-20 2021-11-23 海信电子科技(深圳)有限公司 Voice control method for video playing progress of browser page and display device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150357001A1 (en) * 2012-08-31 2015-12-10 Amazon Technologies, Inc. Timeline interface for video content
CN105869623A (en) * 2015-12-07 2016-08-17 乐视网信息技术(北京)股份有限公司 Video playing method and device based on speech recognition
CN106162357A (en) * 2016-05-31 2016-11-23 腾讯科技(深圳)有限公司 Obtain the method and device of video content
CN107135418A (en) * 2017-06-14 2017-09-05 北京易世纪教育科技有限公司 A kind of control method and device of video playback
CN107155138A (en) * 2017-06-06 2017-09-12 深圳Tcl数字技术有限公司 Video playback jump method, equipment and computer-readable recording medium
CN107506385A (en) * 2017-07-25 2017-12-22 努比亚技术有限公司 A kind of video file retrieval method, equipment and computer-readable recording medium
CN107871500A (en) * 2017-11-16 2018-04-03 百度在线网络技术(北京)有限公司 One kind plays multimedia method and apparatus
CN107948729A (en) * 2017-12-13 2018-04-20 广东欧珀移动通信有限公司 Rich Media's processing method, device, storage medium and electronic equipment
US20180192120A1 (en) * 2006-08-04 2018-07-05 Gula Consulting Limited Liability Company Moving video tags

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6336093B2 (en) * 1998-01-16 2002-01-01 Avid Technology, Inc. Apparatus and method using speech recognition and scripts to capture author and playback synchronized audio and video
CN101329867A (en) * 2007-06-21 2008-12-24 西门子(中国)有限公司 Method and device for playing speech on demand
CN105677735B (en) * 2015-12-30 2020-04-21 腾讯科技(深圳)有限公司 Video searching method and device
CN107071542B (en) * 2017-04-18 2020-07-28 百度在线网络技术(北京)有限公司 Video clip playing method and device
CN107704525A (en) * 2017-09-04 2018-02-16 优酷网络技术(北京)有限公司 Video searching method and device
CN109688475B (en) * 2018-12-29 2020-10-02 深圳Tcl新技术有限公司 Video playing skipping method and system and computer readable storage medium

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020135161A1 (en) * 2018-12-29 2020-07-02 深圳Tcl新技术有限公司 Video playback jump method and system, and computer readable storage medium
CN110166845A (en) * 2019-05-13 2019-08-23 Oppo广东移动通信有限公司 Video broadcasting method and device
CN110166845B (en) * 2019-05-13 2021-10-26 Oppo广东移动通信有限公司 Video playing method and device
CN112261436A (en) * 2019-07-04 2021-01-22 青岛海尔多媒体有限公司 Video playing method, device and system
CN112261436B (en) * 2019-07-04 2024-04-02 青岛海尔多媒体有限公司 Video playing method, device and system
CN111209437A (en) * 2020-01-13 2020-05-29 腾讯科技(深圳)有限公司 Label processing method and device, storage medium and electronic equipment
CN111209437B (en) * 2020-01-13 2023-11-28 腾讯科技(深圳)有限公司 Label processing method and device, storage medium and electronic equipment
CN111601163B (en) * 2020-04-26 2023-03-03 百度在线网络技术(北京)有限公司 Play control method and device, electronic equipment and storage medium
CN111601163A (en) * 2020-04-26 2020-08-28 百度在线网络技术(北京)有限公司 Play control method and device, electronic equipment and storage medium
CN111818172A (en) * 2020-07-21 2020-10-23 海信视像科技股份有限公司 Method and device for controlling intelligent equipment by management server of Internet of things
CN112632329A (en) * 2020-12-18 2021-04-09 咪咕互动娱乐有限公司 Video extraction method and device, electronic equipment and storage medium
CN112954426A (en) * 2021-02-07 2021-06-11 咪咕文化科技有限公司 Video playing method, electronic equipment and storage medium
CN113689856B (en) * 2021-08-20 2023-11-03 Vidaa(荷兰)国际控股有限公司 Voice control method for video playing progress of browser page and display equipment
CN113689856A (en) * 2021-08-20 2021-11-23 海信电子科技(深圳)有限公司 Voice control method for video playing progress of browser page and display device

Also Published As

Publication number Publication date
CN109688475B (en) 2020-10-02
WO2020135161A1 (en) 2020-07-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant