CN106060629A - Picture extraction method and terminal - Google Patents

Picture extraction method and terminal

Info

Publication number
CN106060629A
CN106060629A CN201610592552.5A
Authority
CN
China
Prior art keywords
picture
target
data
video data
pending
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610592552.5A
Other languages
Chinese (zh)
Inventor
白斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Internet Security Software Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Internet Security Software Co Ltd filed Critical Beijing Kingsoft Internet Security Software Co Ltd
Priority to CN201610592552.5A priority Critical patent/CN106060629A/en
Publication of CN106060629A publication Critical patent/CN106060629A/en
Pending legal-status Critical Current

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/426Internal components of the client ; Characteristics thereof
    • H04N21/42653Internal components of the client ; Characteristics thereof for processing graphics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • H04N21/4394Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Graphics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The embodiment of the invention discloses a picture extraction method, which comprises the following steps: acquiring waveform data of audio data in audio and video data to be processed; acquiring target waveform data from the waveform data, and acquiring target audio data matched with the target waveform data from the audio data; acquiring target audio and video data corresponding to the target audio data from the audio and video data to be processed; and extracting a picture from the target audio and video data to obtain a target picture. The embodiment of the invention also discloses a terminal. By adopting the method and the device, the efficiency of extracting the target picture is improved, and the extraction cost is reduced.

Description

Picture extraction method and terminal
Technical field
The present invention relates to the field of electronic technology, and in particular to a picture extraction method and a terminal.
Background technology
A video usually contains many highlight moments. To make effective use of the pictures of these highlights, content providers currently have operators manually capture the highlight pictures during playback and then use them, for example, to produce advertisements or to make video trailers.
However, because the highlight pictures are captured manually, the result depends heavily on the personal preferences and skill of the operator, so the quality of the captured pictures is uncontrollable and cannot be guaranteed. In addition, a large amount of labor is needed to watch the video and perform the capture operation, which increases the provider's cost and makes picture extraction inefficient.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a picture extraction method and a terminal, which can improve the efficiency of extracting a target picture and reduce the extraction cost.
To solve the above technical problem, an embodiment of the present invention provides a picture extraction method, including:
acquiring waveform data of audio data in audio/video data to be processed;
acquiring target waveform data from the waveform data, and acquiring, from the audio data, target audio data matching the target waveform data;
acquiring, from the audio/video data to be processed, target audio/video data corresponding to the target audio data;
extracting a picture from the target audio/video data to obtain a target picture.
Wherein, acquiring target waveform data from the waveform data, and acquiring, from the audio data, target audio data matching the target waveform data includes:
detecting the waveform data, and acquiring waveform data whose amplitude is greater than a preset amplitude threshold;
setting the waveform data whose amplitude is greater than the preset amplitude threshold as the target waveform data.
Wherein, extracting a picture from the target audio/video data to obtain a target picture includes:
extracting target video data from the target audio/video data;
dividing the target video data into shots to obtain video data of each shot;
extracting pictures from the video data of each shot respectively to obtain at least one target picture.
Wherein, extracting pictures from the video data of each shot respectively to obtain at least one target picture includes:
extracting pictures from the video data of each shot respectively to obtain at least one candidate picture;
when only one candidate picture is obtained, setting the candidate picture as the target picture;
when at least two candidate pictures are obtained, filtering the at least two candidate pictures to obtain the at least one target picture.
Wherein, filtering the at least two candidate pictures to obtain the at least one target picture includes:
calculating a similarity between any two candidate pictures among the at least two candidate pictures;
judging whether the similarity is greater than a preset threshold;
when the similarity is greater than the preset threshold, filtering out either one of the two candidate pictures and setting the other candidate picture as the target picture;
when the similarity is less than or equal to the preset threshold, setting both of the two candidate pictures as target pictures.
Wherein, after extracting a picture from the target audio/video data to obtain a target picture, the method further includes:
splicing at least two target pictures into a video to obtain a target video, and outputting the target video.
An embodiment of the present invention further provides a terminal, including:
a first acquiring unit, configured to acquire waveform data of audio data in audio/video data to be processed;
a second acquiring unit, configured to acquire target waveform data from the waveform data and to acquire, from the audio data, target audio data matching the target waveform data;
a third acquiring unit, configured to acquire, from the audio/video data to be processed, target audio/video data corresponding to the target audio data;
an extraction unit, configured to extract a picture from the target audio/video data to obtain a target picture.
Wherein, the second acquiring unit includes:
a detection subunit, configured to detect the waveform data and acquire waveform data whose amplitude is greater than a preset amplitude threshold;
a first setting subunit, configured to set the waveform data whose amplitude is greater than the preset amplitude threshold as the target waveform data.
Wherein, the extraction unit includes:
a first extraction subunit, configured to extract target video data from the target audio/video data;
a division subunit, configured to divide the target video data into shots to obtain video data of each shot;
a second extraction subunit, configured to extract pictures from the video data of each shot respectively to obtain at least one target picture.
Wherein, the second extraction subunit includes:
a third extraction subunit, configured to extract pictures from the video data of each shot respectively to obtain at least one candidate picture;
a second setting subunit, configured to, when only one candidate picture is obtained, set the candidate picture as the target picture;
a processing subunit, configured to, when at least two candidate pictures are obtained, filter the at least two candidate pictures to obtain the at least one target picture.
Wherein, the processing subunit includes:
a calculation subunit, configured to calculate a similarity between any two candidate pictures among the at least two candidate pictures;
a judging subunit, configured to judge whether the similarity is greater than a preset threshold;
a filtering subunit, configured to, when the judging subunit judges that the similarity is greater than the preset threshold, filter out either one of the two candidate pictures and set the other candidate picture as the target picture;
a third setting subunit, configured to, when the judging subunit judges that the similarity is less than or equal to the preset threshold, set both of the two candidate pictures as target pictures.
Wherein, the terminal further includes:
a splicing unit, configured to splice at least two target pictures into a video to obtain a target video and output the target video.
An embodiment of the present invention further provides a terminal, including a housing, a processor, a memory, a circuit board and a power supply circuit, wherein the circuit board is arranged inside the space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit is configured to supply power to each circuit or device of the terminal; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the following steps:
acquiring waveform data of audio data in audio/video data to be processed;
acquiring target waveform data from the waveform data, and acquiring, from the audio data, target audio data matching the target waveform data;
acquiring, from the audio/video data to be processed, target audio/video data corresponding to the target audio data;
extracting a picture from the target audio/video data to obtain a target picture.
In the embodiments of the present invention, the terminal acquires waveform data of audio data in audio/video data to be processed, acquires target waveform data from the waveform data, acquires, from the audio data, target audio data matching the target waveform data, acquires, from the audio/video data to be processed, target audio/video data corresponding to the target audio data, and extracts a picture from the target audio/video data to obtain a target picture. This improves the efficiency of extracting target pictures from audio/video data and reduces the extraction cost.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative effort.
Fig. 1 is a schematic flowchart of an embodiment of a picture extraction method according to an embodiment of the present invention;
Fig. 2 is a structural diagram of an embodiment of a terminal according to an embodiment of the present invention;
Fig. 3 is a structural diagram of another embodiment of a terminal according to an embodiment of the present invention.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The executive body in the embodiments of the present invention may be a terminal, and the terminal may include intelligent terminals such as a computer, a tablet computer and a notebook computer. The terminals listed above are merely examples and are not exhaustive; the embodiments include but are not limited to the listed terminals.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of an embodiment of a picture extraction method according to an embodiment of the present invention. The picture extraction method of the embodiment of the present invention includes the following steps:
S100: acquiring waveform data of audio data in audio/video data to be processed.
In the embodiments of the present invention, the audio/video data consists of audio data and video data; the audio of the audio/video data can be output by an audio player and the video by a video player. For example, the audio/video data may be a broadcast television program with sound output, a recording with sound output on a mobile phone, or other similar audio/video data.
In the embodiments of the present invention, the audio/video data to be processed is the audio/video data that the user selects for processing. For example, audio/video data received by the terminal may serve as the audio/video data to be processed, or the terminal may store multiple pieces of audio/video data from which the user selects one as the audio/video data to be processed.
In the embodiments of the present invention, after the terminal determines the audio/video data to be processed, the terminal may decode the audio/video data to be processed and extract the audio data contained in it.
In the embodiments of the present invention, after the terminal obtains the audio data in the audio/video data to be processed, the terminal may process the audio data to obtain waveform data, where the waveform data may be the waveform of the audio data in the time domain. The waveform data and the audio data both correspond to the same time reference line.
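As an illustration of this step only, the following minimal sketch decodes the audio track of an audio/video file into a mono time-domain waveform. It assumes the terminal can invoke ffmpeg and uses NumPy; the sample rate and function names are illustrative and are not prescribed by the patent.
    # Minimal sketch (not the patented implementation): decode the audio stream
    # of an audio/video file into a mono float waveform in [-1, 1] via ffmpeg.
    import subprocess
    import numpy as np

    def load_waveform(av_path, sample_rate=16000):
        cmd = [
            "ffmpeg", "-i", av_path,
            "-f", "s16le", "-acodec", "pcm_s16le",  # raw signed 16-bit PCM
            "-ac", "1",                             # mono
            "-ar", str(sample_rate),                # resample for a uniform time base
            "-",                                    # write raw samples to stdout
        ]
        raw = subprocess.run(cmd, capture_output=True, check=True).stdout
        return np.frombuffer(raw, dtype=np.int16).astype(np.float32) / 32768.0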
S101: acquiring target waveform data from the waveform data, and acquiring, from the audio data, target audio data matching the target waveform data.
In the embodiments of the present invention, when background music appears during playback of the audio data, the amplitude of the corresponding waveform data increases. Therefore, the terminal may monitor the waveform data; when a sudden change in the waveform data is detected, the terminal may determine that background music starts in the audio data corresponding to that section of waveform data, and when the waveform later recovers to its shape before the sudden change, the terminal may determine that the background music in the corresponding audio data has ended. The terminal can therefore record the time points of these two changes and intercept the waveform data between the two time points as the target waveform data. In other words, the terminal may monitor the waveform data, acquire the waveform data whose amplitude is greater than or equal to a preset amplitude threshold, and set that waveform data as the target waveform data; the amplitude threshold can be set by the user and is not limited here.
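A sketch of the amplitude-threshold detection described above is given below. It computes a short-time mean amplitude and reports every span that stays at or above a preset threshold as a start/end pair of timestamps; the window length and the threshold value are assumed values, since the patent leaves them to the user.
    # Illustrative sketch: locate waveform spans whose smoothed amplitude is at
    # or above a preset threshold and return their start/end times in seconds.
    import numpy as np

    def find_target_segments(samples, sample_rate, threshold=0.3, window_s=0.5):
        win = max(1, int(window_s * sample_rate))
        n_win = len(samples) // win
        # Short-time mean absolute amplitude, one value per window.
        frames = np.abs(samples[:n_win * win]).reshape(n_win, win).mean(axis=1)
        above = frames >= threshold

        segments, start = [], None
        for i, flag in enumerate(above):
            if flag and start is None:
                start = i                       # a loud span begins
            elif not flag and start is not None:
                segments.append((start * window_s, i * window_s))
                start = None                    # the loud span has ended
        if start is not None:
            segments.append((start * window_s, n_win * window_s))
        return segments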
In the embodiments of the present invention, the acquired target waveform data may carry a timestamp, that is, a character sequence uniquely identifying a moment in time; here it covers the start time point and the end time point of the target waveform data. Since the waveform data and the audio data correspond to the same time reference line, the terminal can use the timestamp of the target waveform data to acquire the corresponding audio data as the target audio data, thereby obtaining the target audio data matching the target waveform data.
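Because the waveform and the audio data share one time reference line, mapping a target waveform segment back to target audio data amounts to slicing the decoded samples, as in this small sketch (the function and variable names are illustrative):
    # Sketch: slice the target audio data matching a target waveform segment.
    def target_audio(samples, sample_rate, start_s, end_s):
        return samples[int(start_s * sample_rate):int(end_s * sample_rate)]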
S102: acquiring, from the audio/video data to be processed, target audio/video data corresponding to the target audio data.
In the embodiments of the present invention, the audio data, the video data and the audio/video data all carry timestamps. Because the audio data in the audio/video data must be played in synchronization with the video data in the audio/video data, the timestamps of the audio data, the video data and the audio/video data all correspond to the same time reference line, so that the audio and the video are played synchronously when the terminal outputs the audio/video data. Therefore, the terminal can use the timestamp of the target audio data to acquire the corresponding audio/video data from the audio/video data to be processed, and set the audio/video data corresponding to that timestamp as the target audio/video data.
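One way to realise this step is to cut the segment between the two timestamps directly out of the source file, for example with ffmpeg stream copy so that neither the audio nor the video is re-encoded. This is only a sketch under the assumption that ffmpeg is available; the patent does not prescribe a particular tool.
    # Sketch: cut the target audio/video segment [start_s, end_s] out of the
    # source file without re-encoding (stream copy keeps audio and video as-is).
    import subprocess

    def cut_segment(av_path, start_s, end_s, out_path):
        subprocess.run([
            "ffmpeg", "-y",
            "-ss", f"{start_s:.3f}",            # seek to the segment start
            "-i", av_path,
            "-t", f"{end_s - start_s:.3f}",     # keep only the segment duration
            "-c", "copy",
            out_path,
        ], check=True)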
S103: extracting a picture from the target audio/video data to obtain a target picture.
In the embodiments of the present invention, the target audio/video data may include target audio data and target video data, and the terminal may extract the target video data contained in the target audio/video data.
After the terminal obtains the target video data, the terminal may extract at least one picture at at least one preset position in the target video data. The at least one position may be the start position, the midpoint position and the end position of the target video data, or any other position set by the user. Therefore, when the preset positions of the terminal include the start position, the midpoint position and the end position, the terminal may extract one picture at each of these positions in the target video data and save or output the pictures as target pictures.
Further, after the terminal obtains the target video data, the terminal may segment the target video data by shot to obtain the video data of each shot, and extract pictures from the video data of each shot to obtain target pictures. The terminal may extract at least one picture at at least one preset position in the video data of each shot; the at least one position may be one or more of the start position, the midpoint position and the end position of the video data of each shot, or any other position set by the user. Therefore, when the preset positions of the terminal include the start position, the midpoint position and the end position, the terminal may extract one picture at each of these positions in the video data of each shot and save or output the pictures as target pictures.
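The following sketch illustrates the picture-extraction step for one shot-level clip using OpenCV: it grabs one frame at the start position, the midpoint position and the end position. The three positions match the defaults mentioned above; other preset positions could be substituted, and the library choice is an assumption.
    # Sketch: extract candidate pictures at the start, midpoint and end of a clip.
    import cv2

    def extract_candidate_frames(clip_path):
        cap = cv2.VideoCapture(clip_path)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        frames = []
        for idx in (0, total // 2, max(total - 1, 0)):
            cap.set(cv2.CAP_PROP_POS_FRAMES, idx)   # jump to the preset position
            ok, frame = cap.read()
            if ok:
                frames.append(frame)
        cap.release()
        return frames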
Further, the terminal may also treat the pictures extracted above as candidate pictures. That is, the terminal may extract pictures from the video data of each shot respectively to obtain at least one candidate picture, count the number of candidate pictures obtained, and perform the corresponding step according to that number. Specifically, when the terminal obtains only one candidate picture, the terminal sets that candidate picture as the target picture; when the terminal obtains at least two candidate pictures, the terminal may filter all of the obtained candidate pictures to obtain the target pictures. The filtering may proceed as follows: the terminal calculates the similarity between any two of the obtained candidate pictures, for example by analysing the two pictures and computing the similarity of their content. After calculating the similarity between the two candidate pictures, the terminal judges whether the similarity is greater than a preset threshold. When the terminal judges that the similarity is greater than the preset threshold, the terminal may filter out either one of the two candidate pictures and set the other candidate picture as a target picture; when the terminal judges that the similarity is less than or equal to the preset threshold, the terminal may set both candidate pictures as target pictures. In this way the terminal obtains the target pictures. The terminal may pair the obtained candidate pictures two by two, so that calculating the similarity between any two candidate pictures means calculating the similarity for each pair.
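A sketch of this pairwise filtering is given below. Colour-histogram correlation stands in for the similarity measure, which the patent does not specify, and the 0.9 threshold is an assumed value: a candidate picture is kept only if it is not overly similar to any picture already kept.
    # Sketch: drop one of any pair of candidate pictures whose similarity exceeds
    # a preset threshold; the remaining pictures become target pictures.
    import cv2

    def similarity(a, b):
        ha = cv2.calcHist([a], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        hb = cv2.calcHist([b], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        cv2.normalize(ha, ha)
        cv2.normalize(hb, hb)
        return cv2.compareHist(ha, hb, cv2.HISTCMP_CORREL)

    def filter_pictures(candidates, threshold=0.9):
        kept = []
        for frame in candidates:
            if all(similarity(frame, k) <= threshold for k in kept):
                kept.append(frame)
        return kept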
In the embodiments of the present invention, after the terminal obtains the target picture, the terminal may output the target picture, or provide it to the user as a highlight picture for producing other content, such as a video trailer or an advertisement.
Further, in the embodiments of the present invention, when the terminal obtains at least two target pictures, the terminal may splice all of the target pictures into a video to obtain a target video and output the target video. The terminal may also determine the playback duration of the target video according to the number of target pictures and play the target video within that duration.
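The optional splicing step can be sketched as writing the target pictures out as a short video in which each picture is shown for a fixed duration; the per-picture duration, frame rate and codec below are assumptions for illustration.
    # Sketch: splice target pictures into a target video (each picture is held
    # on screen for a fixed number of seconds).
    import cv2

    def splice_pictures(pictures, out_path, seconds_per_picture=2.0, fps=25):
        h, w = pictures[0].shape[:2]
        writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"),
                                 fps, (w, h))
        for pic in pictures:
            frame = cv2.resize(pic, (w, h))     # unify the frame size
            for _ in range(int(seconds_per_picture * fps)):
                writer.write(frame)
        writer.release()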
In the embodiments of the present invention, the terminal acquires waveform data of audio data in audio/video data to be processed, acquires target waveform data from the waveform data, acquires, from the audio data, target audio data matching the target waveform data, acquires, from the audio/video data to be processed, target audio/video data corresponding to the target audio data, and extracts a picture from the target audio/video data to obtain a target picture. This improves the efficiency of extracting target pictures from audio/video data and reduces the extraction cost.
Referring to Fig. 2, Fig. 2 is a structural diagram of an embodiment of a terminal according to an embodiment of the present invention. The terminal of the embodiment of the present invention includes:
a first acquiring unit 100, configured to acquire waveform data of audio data in audio/video data to be processed;
a second acquiring unit 200, configured to acquire target waveform data from the waveform data and to acquire, from the audio data, target audio data matching the target waveform data;
a third acquiring unit 300, configured to acquire, from the audio/video data to be processed, target audio/video data corresponding to the target audio data;
an extraction unit 400, configured to extract a picture from the target audio/video data to obtain a target picture.
In the embodiments of the present invention, the audio/video data consists of audio data and video data; the audio of the audio/video data can be output by an audio player and the video by a video player. For example, the audio/video data may be a broadcast television program with sound output, a recording with sound output on a mobile phone, or other similar audio/video data.
In the embodiments of the present invention, the audio/video data to be processed is the audio/video data that the user selects for processing. For example, audio/video data received by the terminal may serve as the audio/video data to be processed, or the terminal may store multiple pieces of audio/video data from which the user selects one as the audio/video data to be processed.
In the embodiments of the present invention, after the terminal determines the audio/video data to be processed, the terminal may decode the audio/video data to be processed and extract the audio data contained in it.
In the embodiments of the present invention, after the terminal obtains the audio data in the audio/video data to be processed, the first acquiring unit 100 may process the audio data to obtain waveform data, where the waveform data may be the waveform of the audio data in the time domain. The waveform data and the audio data both correspond to the same time reference line.
In the embodiments of the present invention, when background music appears during playback of the audio data, the amplitude of the corresponding waveform data increases. Therefore, the second acquiring unit 200 may monitor the waveform data; when a sudden change in the waveform data is detected, the second acquiring unit 200 may determine that background music starts in the audio data corresponding to that section of waveform data, and when the waveform later recovers to its shape before the sudden change, the second acquiring unit 200 may determine that the background music in the corresponding audio data has ended. The second acquiring unit 200 can therefore record the time points of these two changes and intercept the waveform data between the two time points as the target waveform data. In other words, the second acquiring unit 200 may monitor the waveform data, acquire the waveform data whose amplitude is greater than or equal to a preset amplitude threshold, and set that waveform data as the target waveform data; the amplitude threshold can be set by the user and is not limited here.
In the embodiments of the present invention, the acquired target waveform data may carry a timestamp, that is, a character sequence uniquely identifying a moment in time; here it covers the start time point and the end time point of the target waveform data. Since the waveform data and the audio data correspond to the same time reference line, the second acquiring unit 200 can use the timestamp of the target waveform data to acquire the corresponding audio data as the target audio data, thereby obtaining the target audio data matching the target waveform data.
In the embodiments of the present invention, the audio data, the video data and the audio/video data all carry timestamps. Because the audio data in the audio/video data must be played in synchronization with the video data in the audio/video data, the timestamps of the audio data, the video data and the audio/video data all correspond to the same time reference line, so that the audio and the video are played synchronously when the terminal outputs the audio/video data. Therefore, the third acquiring unit 300 can use the timestamp of the target audio data to acquire the corresponding audio/video data from the audio/video data to be processed, and set the audio/video data corresponding to that timestamp as the target audio/video data.
In the embodiments of the present invention, the target audio/video data may include target audio data and target video data, and the extraction unit 400 may extract the target video data contained in the target audio/video data.
After the extraction unit 400 obtains the target video data, the extraction unit 400 may extract at least one picture at at least one preset position in the target video data. The at least one position may be the start position, the midpoint position and the end position of the target video data, or any other position set by the user. Therefore, when the preset positions of the terminal include the start position, the midpoint position and the end position, the extraction unit 400 may extract one picture at each of these positions in the target video data and save or output the pictures as target pictures.
Further, after the extraction unit 400 obtains the target video data, the extraction unit 400 may segment the target video data by shot to obtain the video data of each shot, and extract pictures from the video data of each shot to obtain target pictures. The extraction unit 400 may extract at least one picture at at least one preset position in the video data of each shot; the at least one position may be one or more of the start position, the midpoint position and the end position of the video data of each shot, or any other position set by the user. Therefore, when the preset positions of the terminal include the start position, the midpoint position and the end position, the extraction unit 400 may extract one picture at each of these positions in the video data of each shot and save or output the pictures as target pictures.
Further, the extraction unit 400 may also treat the pictures extracted above as candidate pictures. That is, the extraction unit 400 may extract pictures from the video data of each shot respectively to obtain at least one candidate picture, count the number of candidate pictures obtained, and perform the corresponding step according to that number. Specifically, when the extraction unit 400 obtains only one candidate picture, the extraction unit 400 sets that candidate picture as the target picture; when the extraction unit 400 obtains at least two candidate pictures, the extraction unit 400 may filter all of the obtained candidate pictures to obtain the target pictures. The filtering may proceed as follows: the extraction unit 400 calculates the similarity between any two of the obtained candidate pictures, for example by analysing the two pictures and computing the similarity of their content. After calculating the similarity between the two candidate pictures, the extraction unit 400 judges whether the similarity is greater than a preset threshold. When it judges that the similarity is greater than the preset threshold, the extraction unit 400 may filter out either one of the two candidate pictures and set the other candidate picture as a target picture; when it judges that the similarity is less than or equal to the preset threshold, the extraction unit 400 may set both candidate pictures as target pictures. In this way the extraction unit 400 obtains the target pictures. The extraction unit 400 may pair the obtained candidate pictures two by two, so that calculating the similarity between any two candidate pictures means calculating the similarity for each pair.
In the embodiments of the present invention, after the extraction unit 400 obtains the target picture, the terminal may output the target picture, or provide it to the user as a highlight picture for producing other content, such as a video trailer or an advertisement.
Further, in the embodiments of the present invention, when the extraction unit 400 obtains at least two target pictures, the terminal may splice all of the target pictures into a video to obtain a target video and output the target video. The terminal may also determine the playback duration of the target video according to the number of target pictures and play the target video within that duration.
Wherein, the second acquiring unit 200 includes:
a detection subunit, configured to detect the waveform data and acquire waveform data whose amplitude is greater than a preset amplitude threshold;
a first setting subunit, configured to set the waveform data whose amplitude is greater than the preset amplitude threshold as the target waveform data.
The extraction unit 400 includes:
a first extraction subunit, configured to extract target video data from the target audio/video data;
a division subunit, configured to divide the target video data into shots to obtain video data of each shot;
a second extraction subunit, configured to extract pictures from the video data of each shot respectively to obtain at least one target picture.
The second extraction subunit includes:
a third extraction subunit, configured to extract pictures from the video data of each shot respectively to obtain at least one candidate picture;
a second setting subunit, configured to, when only one candidate picture is obtained, set the candidate picture as the target picture;
a processing subunit, configured to, when at least two candidate pictures are obtained, filter the at least two candidate pictures to obtain the at least one target picture.
The processing subunit includes:
a calculation subunit, configured to calculate a similarity between any two candidate pictures among the at least two candidate pictures;
a judging subunit, configured to judge whether the similarity is greater than a preset threshold;
a filtering subunit, configured to, when the judging subunit judges that the similarity is greater than the preset threshold, filter out either one of the two candidate pictures and set the other candidate picture as the target picture;
a third setting subunit, configured to, when the judging subunit judges that the similarity is less than or equal to the preset threshold, set both of the two candidate pictures as target pictures.
The terminal further includes:
a splicing unit, configured to splice at least two target pictures into a video to obtain a target video and output the target video.
It can be understood that the functions of the functional units of the terminal of this embodiment may be implemented according to the methods in the foregoing method embodiment; for the specific implementation process, reference may be made to the related description of the foregoing method embodiment, and details are not repeated here.
Referring to Fig. 3, Fig. 3 is a structural diagram of another embodiment of a terminal according to the present invention. As shown in Fig. 3, the terminal of this embodiment includes:
a housing 301, a processor 302, a memory 303, a circuit board 307 and a power supply circuit 305, wherein the circuit board 307 is arranged inside the space enclosed by the housing 301, and the processor 302 and the memory 303 are arranged on the circuit board 307; the power supply circuit 305 is configured to supply power to each circuit or device of the terminal; the memory 303 is configured to store executable program code; and the processor 302 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 303, so as to perform the following steps:
acquiring waveform data of audio data in audio/video data to be processed;
acquiring target waveform data from the waveform data, and acquiring, from the audio data, target audio data matching the target waveform data;
acquiring, from the audio/video data to be processed, target audio/video data corresponding to the target audio data;
extracting a picture from the target audio/video data to obtain a target picture.
Wherein, the processor 302 acquiring target waveform data from the waveform data includes:
detecting the waveform data, and acquiring waveform data whose amplitude is greater than a preset amplitude threshold;
setting the waveform data whose amplitude is greater than the preset amplitude threshold as the target waveform data.
Wherein, the processor 302 extracting a picture from the target audio/video data to obtain a target picture includes:
extracting target video data from the target audio/video data;
dividing the target video data into shots to obtain video data of each shot;
extracting pictures from the video data of each shot respectively to obtain at least one target picture.
Wherein, the processor 302 extracting pictures from the video data of each shot respectively to obtain at least one target picture includes:
extracting pictures from the video data of each shot respectively to obtain at least one candidate picture;
when only one candidate picture is obtained, setting the candidate picture as the target picture;
when at least two candidate pictures are obtained, filtering the at least two candidate pictures to obtain the at least one target picture.
Wherein, the processor 302 filtering the at least two candidate pictures to obtain the at least one target picture includes:
calculating a similarity between any two candidate pictures among the at least two candidate pictures;
judging whether the similarity is greater than a preset threshold;
when the similarity is greater than the preset threshold, filtering out either one of the two candidate pictures and setting the other candidate picture as the target picture;
when the similarity is less than or equal to the preset threshold, setting both of the two candidate pictures as target pictures.
Wherein, after the processor 302 extracts a picture from the target audio/video data to obtain a target picture, the processor 302 further performs:
splicing at least two target pictures into a video to obtain a target video, and outputting the target video.
It can be understood that the functions of the functional units of the terminal of this embodiment may be implemented according to the methods in the foregoing method embodiment; for the specific implementation process, reference may be made to the related description of the foregoing method embodiment, and details are not repeated here.
In the embodiments of the present invention, the terminal acquires waveform data of audio data in audio/video data to be processed, acquires target waveform data from the waveform data, acquires, from the audio data, target audio data matching the target waveform data, acquires, from the audio/video data to be processed, target audio/video data corresponding to the target audio data, and extracts a picture from the target audio/video data to obtain a target picture. This improves the efficiency of extracting target pictures from audio/video data and reduces the extraction cost.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium, and when executed, the program may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM) or the like.
The above disclosure is merely preferred embodiments of the present invention and certainly cannot be used to limit the scope of the claims of the present invention. Therefore, equivalent variations made according to the claims of the present invention still fall within the scope of the present invention.

Claims (10)

1. the extracting method of a picture, it is characterised in that described method includes:
Obtain the Wave data of voice data in pending audio, video data;
From described Wave data, obtain target waveform data, obtain and described target waveform data in described voice data The target audio data joined;
The target sound video data corresponding with described target audio data is obtained in described pending audio, video data;
Carry out extracting picture from described target sound video data, it is thus achieved that target picture.
2. the method for claim 1, it is characterised in that described acquisition target waveform packet from described Wave data Include:
Detect described Wave data, obtain amplitude more than the Wave data presetting amplitude thresholds;
Described amplitude is set to target waveform data more than the Wave data presetting amplitude thresholds.
3. the method for claim 1, it is characterised in that described carrying out from described target sound video data extracts picture Face, it is thus achieved that target picture includes:
Extract the target video data in described target sound video data;
Described target video data is carried out camera lens division, it is thus achieved that the video data of each camera lens;
Picture extraction is carried out respectively, it is thus achieved that at least one target picture from the video data of described each camera lens.
4. method as claimed in claim 3, it is characterised in that described carry respectively from the video data of described each camera lens Take picture, it is thus achieved that at least one target picture includes:
Carry out respectively extracting picture from the video data of described each camera lens, it is thus achieved that at least one pending extraction picture;
When only getting a pending extraction picture, described pending extraction picture is set to target picture;
When getting the pending extraction picture of at least two, the extraction picture pending to described at least two filters Process, it is thus achieved that at least one target picture described.
5. method as claimed in claim 4, it is characterised in that the described extraction picture pending to described at least two is carried out Filter process, it is thus achieved that at least one target picture described includes:
In pending the extracting of described at least two, picture calculates any two pending similarities extracted between picture;
Judge that whether described similarity is more than the threshold value preset;
When described similarity is more than the threshold value preset, filter described any two pending any one extracted in picture Pending extraction picture, in described any two pending extracting except described any one pending extraction in picture Another pending extraction picture outside picture is set to described target picture;
When described similarity is less than the threshold value preset, described any two pending extraction pictures are disposed as described mesh Mark picture.
6. method as claimed in claim 3, it is characterised in that described carrying out from described target sound video data extracts picture Face, it is thus achieved that after target picture, also includes:
At least two target picture is carried out video-splicing, it is thus achieved that target video also exports described target video.
7. a terminal, it is characterised in that described terminal includes:
First acquiring unit, for obtaining the Wave data of the voice data in pending audio, video data;
Second acquisition unit, for from described Wave data obtain target waveform data, in described voice data obtain with The target audio data of described target waveform Data Matching;
3rd acquiring unit, for obtaining the mesh corresponding with described target audio data in described pending audio, video data Mark with phonetic symbols video data;
Extraction unit, for carrying out extraction picture, it is thus achieved that target picture from described target sound video data.
8. terminal as claimed in claim 7, it is characterised in that described second acquisition unit includes:
Detection sub-unit, is used for detecting described Wave data, obtains amplitude more than the Wave data presetting amplitude thresholds;
First arranges subelement, for described amplitude is set to target waveform data more than the Wave data presetting amplitude thresholds.
9. terminal as claimed in claim 7, it is characterised in that described extraction unit includes:
First extracts subelement, for extracting the target video data in described target sound video data;
Divide subelement, for described target video data is carried out camera lens division, it is thus achieved that the video data of each camera lens;
Second extracts subelement, for carrying out picture extraction from the video data of described each camera lens respectively, it is thus achieved that at least one Target picture.
10. a terminal, it is characterised in that including: housing, processor, memorizer, circuit board and power circuit, wherein, described Circuit board is placed in the interior volume that described housing surrounds, described processor and described memorizer and is arranged on described circuit board; Described power circuit, powers for each circuit or the device for described mobile terminal;Described memorizer is used for storing and can perform Program code;Described processor is run by the executable program code of storage in the described memorizer of reading and performs with described The program that program code is corresponding, for performing following steps:
Obtain the Wave data of voice data in pending audio, video data;
From described Wave data, obtain target waveform data, obtain and described target waveform data in described voice data The target audio data joined;
The target sound video data corresponding with described target audio data is obtained in described pending audio, video data;
Carry out extracting picture from described target sound video data, it is thus achieved that target picture.
CN201610592552.5A 2016-07-25 2016-07-25 Picture extraction method and terminal Pending CN106060629A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610592552.5A CN106060629A (en) 2016-07-25 2016-07-25 Picture extraction method and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610592552.5A CN106060629A (en) 2016-07-25 2016-07-25 Picture extraction method and terminal

Publications (1)

Publication Number Publication Date
CN106060629A true CN106060629A (en) 2016-10-26

Family

ID=57417395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610592552.5A Pending CN106060629A (en) 2016-07-25 2016-07-25 Picture extraction method and terminal

Country Status (1)

Country Link
CN (1) CN106060629A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1658663A (en) * 2004-02-18 2005-08-24 三星电子株式会社 Method and apparatus for summarizing a plurality of frames
CN101431689A (en) * 2007-11-05 2009-05-13 华为技术有限公司 Method and device for generating video abstract
US20100104261A1 (en) * 2008-10-24 2010-04-29 Zhu Liu Brief and high-interest video summary generation
CN102006498A (en) * 2010-12-10 2011-04-06 北京中科大洋科技发展股份有限公司 Safe broadcast monitoring method based on video and audio comparison
CN103546667A (en) * 2013-10-24 2014-01-29 中国科学院自动化研究所 Automatic news splitting method for volume broadcast television supervision

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108848411A (en) * 2018-08-01 2018-11-20 夏颖 System and method for defining program boundaries and advertisement boundaries based on audio signal waveforms
CN108848411B (en) * 2018-08-01 2020-09-25 夏颖 System and method for defining program boundaries and advertisement boundaries based on audio signal waveforms

Similar Documents

Publication Publication Date Title
CN103559150B (en) The implementation method of main frame external camera and device and mobile terminal
CN109190539A (en) Face identification method and device
CN108024079A (en) Record screen method, apparatus, terminal and storage medium
CN105744292A (en) Video data processing method and device
CN105338238A (en) Photographing method and electronic device
CN104469487B (en) A kind of detection method and device of scene switching point
CN107517313A (en) Awakening method and device, terminal and readable storage medium storing program for executing
CN108460120A (en) Data save method, device, terminal device and storage medium
CN104023176B (en) Handle method, device and the terminal device of audio and image information
RU2625336C2 (en) Method and device for content control in electronic device
CN105867899A (en) Method and device for identifying device
CN103152633B (en) A kind of recognition methods of keyword and device
CN109065017B (en) Voice data generation method and related device
CN106060629A (en) Picture extraction method and terminal
CN103685349A (en) Method for information processing and electronic equipment
CN114120969A (en) Method and system for testing voice recognition function of intelligent terminal and electronic equipment
JP2015082692A (en) Video editing device, video editing method, and video editing program
WO2019015411A1 (en) Screen recording method and apparatus, and electronic device
CN106210878A (en) Picture extraction method and terminal
CN109034059B (en) Silence type face living body detection method, silence type face living body detection device, storage medium and processor
CN105141739B (en) The method and device that fingerprint and volume key are recorded is combined in the standby state
CN104834549B (en) The application file update method and device of mobile terminal
CN104869232A (en) Terminal
CN108170800A (en) The classification storage method and terminal of image
CN115293985A (en) Super-resolution noise reduction method and device for image optimization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161026

RJ01 Rejection of invention patent application after publication