CN110211590A - Method, device, terminal device and storage medium for processing meeting hot spots - Google Patents
Method, device, terminal device and storage medium for processing meeting hot spots
- Publication number
- CN110211590A CN110211590A CN201910549987.5A CN201910549987A CN110211590A CN 110211590 A CN110211590 A CN 110211590A CN 201910549987 A CN201910549987 A CN 201910549987A CN 110211590 A CN110211590 A CN 110211590A
- Authority
- CN
- China
- Prior art keywords
- key
- meeting
- video data
- audio data
- hot
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/57—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8549—Creating video summaries, e.g. movie trailer
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Embodiments of the invention disclose a method, device, terminal device and storage medium for processing meeting hot spots. The method includes: obtaining audio data and/or video data of a meeting; identifying key scenes of the meeting based on the audio data and/or video data; obtaining first audio data within a first time period in which a key scene occurs; obtaining first text information by recognizing the first audio data; and generating hot-spot information of the meeting based on the first text information. The method can extract the key scenes of the meeting and derive the hot-spot information from them without relying on manual extraction, which improves the efficiency of obtaining hot-spot information and allows it to be obtained in real time, increasing the intelligence of the system; the method also improves the accuracy and validity of the hot-spot information obtained.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a method, device, terminal device and storage medium for processing meeting hot spots.
Background technique
Currently, countless meetings of all kinds are held every day. To record the newsworthy points of a meeting, editorial staff usually locate them from the transcripts, speeches, videos and other records of the meeting. However, this approach depends heavily on editors: they must participate in the entire meeting, which is costly in labor; if the meeting belongs to a specialized field, the editors also need strong professional knowledge; and if the meeting is long, important information is easily missed.
Summary of the invention
In view of this, the present invention provides an information processing method, device, terminal device and storage medium to at least partially solve the above problems.
The technical solution of the present invention is implemented as follows:
A processing method for meeting hot spots, the method comprising:
obtaining audio data and/or video data of a meeting;
identifying key scenes of the meeting based on the audio data and/or the video data;
obtaining first audio data within a first time period in which a key scene occurs;
obtaining first text information by recognizing the first audio data;
generating hot-spot information of the meeting based on the first text information.
In the above scheme, the method further includes:
obtaining first video data within the first time period in which the key scene occurs;
generating a hot-spot video of the meeting based on the first text information and the first video data.
In the above scheme, identifying the key scenes of the meeting based on the audio data and/or video data comprises:
performing audio recognition on the audio data to determine first key scenes of the meeting, wherein the first key scenes include at least one of: speech, applause, laughter, song; and/or
extracting key frames from the video data and identifying second key scenes of the meeting based on the key frames, wherein the second key scenes include at least one of: an interview, audience applause, a key person speaking.
In the above scheme, obtaining the first text information by recognizing the first audio data comprises:
recognizing the first audio data to obtain voiceprint information of at least one key person, and obtaining first sub-text information of the key person based on the voiceprint information of the key person.
Generating the hot-spot information of the meeting based on the first text information comprises:
inserting identification information of the key person into the first sub-text information to generate hot-spot information based on the key person.
In the above scheme, the method further includes:
extracting first sub-video data of a key person from the first video data based on the voiceprint information of one of the at least one key person;
generating a hot-spot video of the key person based on the first sub-video data.
In the above scheme, the method further includes:
extracting, from the first video data, second sub-video data corresponding to multiple key persons based on the voiceprint information of the multiple key persons;
extracting the video clips of the corresponding time periods from the corresponding second sub-video data based on the respective weight coefficients of the multiple key persons;
assembling the video clips corresponding to the multiple key persons to generate a hot-spot video of the multiple key persons.
In the above scheme, generating the hot-spot information of the meeting based on the first text information comprises:
extracting key information from the first text information, wherein the key information includes at least one of: keywords indicating key persons, keywords indicating actions, finance-related keywords and/or key sentences, technology-related keywords and/or key sentences, topical-news keywords and/or key sentences;
generating the hot-spot information of the meeting based on the key information.
An embodiment of the invention also provides a processing device for meeting hot spots, the device comprising:
a first obtaining module, configured to obtain audio data and/or video data of a meeting;
a first identification module, configured to identify key scenes of the meeting based on the audio data and/or the video data;
a second obtaining module, configured to obtain first audio data within a first time period in which a key scene occurs;
a second identification module, configured to obtain first text information by recognizing the first audio data;
a generation module, configured to generate hot-spot information of the meeting based on the first text information.
An embodiment of the invention also provides a terminal device comprising a processor and a memory for storing a computer program executable on the processor, wherein the processor, when running the computer program, implements the processing method for meeting hot spots described in any embodiment of the present invention.
An embodiment of the invention also provides a storage medium containing computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the processing method for meeting hot spots described in any embodiment of the present invention.
Embodiments of the invention disclose a processing method for meeting hot spots. By obtaining the audio data and/or video data of a meeting and identifying the key scenes of the meeting based on them, key scenes can be found automatically in the video data and/or audio data. By then obtaining the first audio data within the first time period in which a key scene occurs and recognizing it to obtain first text information, so that the hot-spot information of the meeting is generated based on that text, the audio of a key scene within a certain time range of the meeting can be converted to surface potentially hot information in the meeting. In this way, more comprehensive hot-spot information can be obtained based on the identification of key scenes, which improves the accuracy and validity of the hot-spot information obtained. Moreover, embodiments of the invention do not rely on editorial staff participating in the entire meeting to excerpt important hot-spot information, which saves labor cost and increases the intelligence of the system.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a processing method for meeting hot spots provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another processing method for meeting hot spots provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of yet another processing method for meeting hot spots provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a processing device for meeting hot spots provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the hardware structure of a terminal device provided by an embodiment of the present invention.
Specific embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which the invention belongs. The terms used in the specification of the present invention are for the purpose of describing specific embodiments only and are not intended to limit the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
As shown in Fig. 1, an embodiment of the invention provides a processing method for meeting hot spots, comprising:
Step 101: obtaining audio data and/or video data of a meeting;
Step 102: identifying key scenes of the meeting based on the audio data and/or the video data;
Step 103: obtaining first audio data within a first time period in which a key scene occurs;
Step 104: obtaining first text information by recognizing the first audio data;
Step 105: generating hot-spot information of the meeting based on the first text information.
The method described in the embodiments of the present invention is applied to a terminal device, which includes but is not limited to at least one of: a computer, a server, a mobile phone.
In some embodiments, the terminal device may be provided with an audio capture device and/or a video capture device. In this case, obtaining the audio data of the meeting may consist of capturing the audio data of the meeting directly with the audio capture device, and obtaining the video data of the meeting may consist of capturing the video data of the meeting directly with the video capture device.
In other embodiments, the terminal device obtains the audio data and/or video data of the meeting sent by another electronic device.
The meeting may be a working meeting, for example a mobilization meeting, an experience-exchange meeting, a work-arrangement meeting or a summary meeting; a professional meeting, for example a seminar, a forum, a workshop or a review meeting; a business meeting, for example a business negotiation, a celebration gala or a product presentation; an informational meeting, for example a news briefing, a press conference, a public lecture or a consultation meeting; or a decision-making meeting, for example a standing-committee meeting, a party-committee meeting or a council meeting.
In practical applications, if only the audio data of the meeting is obtained, the key scenes of the meeting can be determined based on the audio data; if both the audio data and the video data of the meeting are obtained, the key scenes of the meeting can be determined based on the audio data and the video data together.
The key scenes include but are not limited to at least one of: speech, applause, laughter, song, audience applause, a key person speaking, an interview. There may be one or more key scenes.
Here, the first time period may be several seconds, ten-odd seconds, tens of seconds, a few minutes, and so on. The first time period may be the time point or period at which the key scene occurs, or a period of time that contains the time point or period at which the key scene occurs.
For example, if the key scene "applause" occurs at the 10th to 12th second of the meeting, the first time period may be the 8th to 12th second of the meeting; obtaining the first audio data within the first time period in which the key scene occurs then means obtaining the audio data of the 8th to 12th second of the meeting.
For another example, if the key scene "interview" occurs at the 20th to 25th minute of the meeting, the first time period may be the 20th to 25th minute of the meeting; obtaining the first audio data within the first time period then means obtaining the audio data of the 20th to 25th minute of the meeting.
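As a concrete illustration of this windowing, the following is a minimal sketch; the function name, the 2-second leading pad and the mock 8 kHz recording are assumptions for illustration, not details from the patent.

```python
# Sketch of obtaining the "first audio data": slicing a padded window
# around a detected key scene out of a raw sample buffer.

def extract_audio_window(samples, sample_rate, scene_start_s, scene_end_s,
                         pad_before_s=2.0, pad_after_s=0.0):
    """Return the samples covering [scene_start - pad_before, scene_end + pad_after],
    clamped to the bounds of the recording."""
    start = max(0, int((scene_start_s - pad_before_s) * sample_rate))
    end = min(len(samples), int((scene_end_s + pad_after_s) * sample_rate))
    return samples[start:end]

# "Applause" detected at seconds 10-12 of a mock 60 s recording at 8 kHz;
# with 2 s of leading padding this yields the 8th-12th second, as in the text.
rate = 8000
recording = [0] * (60 * rate)
clip = extract_audio_window(recording, rate, 10, 12)
print(len(clip) / rate)  # 4.0 seconds
```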
One implementation of step 102 is as follows: an audio feature vector is extracted from the audio data, and a video feature vector is extracted from the video data; the audio feature vector is input into a first neural network model to identify the first key scenes of the meeting; and the video feature vector is input into a second neural network model to identify the second key scenes of the meeting.
Here, the key scenes include the first key scenes and the second key scenes. The first key scenes may be entirely different from the second key scenes, or some sub-scenes of the first key scenes may be identical to some sub-scenes of the second key scenes.
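The two-model arrangement of step 102 can be sketched as follows. The patent specifies neural network models; a nearest-centroid classifier stands in here purely to show the parallel audio/video data flow, and all feature vectors and prototypes are invented.

```python
# Stand-in for the two scene classifiers: one operating on audio feature
# vectors, the other on video feature vectors.

def nearest_centroid(feature, prototypes):
    """Return the scene label whose prototype vector is closest (squared L2)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: dist2(feature, prototypes[label]))

audio_prototypes = {"applause": [0.9, 0.1], "laughter": [0.2, 0.8]}
video_prototypes = {"interview": [1.0, 0.0, 0.0],
                    "key_person_speech": [0.0, 1.0, 0.2]}

first_scene = nearest_centroid([0.85, 0.2], audio_prototypes)       # audio path
second_scene = nearest_centroid([0.1, 0.9, 0.3], video_prototypes)  # video path
print(first_scene, second_scene)  # applause key_person_speech
```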
One implementation of step 104 is as follows: a speech analysis component is provided in the terminal device, and speech recognition is performed on the first audio data by the speech analysis component to obtain the first text information corresponding to the first audio data.
One implementation of step 105 is as follows: the hot-spot information of the meeting is generated based on the keywords and/or key sentences in the first text information.
In practical applications, speech conversion may also be performed on the audio data of the entire meeting to obtain candidate text information for the whole meeting; candidate key information is extracted from the candidate text information, and the hot-spot information of the meeting is generated based on the candidate key information together with the key information obtained from the first text information.
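A minimal sketch of keyword-based hot-spot generation as described for step 105; the keyword list, transcript and scoring rule are illustrative assumptions (a real embodiment would at least match on word boundaries rather than substrings).

```python
# Score transcript sentences by how many configured keywords they contain
# and keep the top hits as the meeting's hot-spot information.

def hot_spots(sentences, keywords, top_n=2):
    scored = []
    for s in sentences:
        score = sum(1 for kw in keywords if kw in s.lower())
        if score > 0:
            scored.append((score, s))
    scored.sort(key=lambda t: (-t[0], t[1]))  # highest score first
    return [s for _, s in scored[:top_n]]

transcript = [
    "Welcome everyone to the annual meeting.",
    "Revenue grew 20 percent and profit margins improved.",
    "Our new AI platform will launch next quarter.",
]
print(hot_spots(transcript, ["revenue", "profit", "ai", "launch"]))
```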
In embodiments of the present invention, the key scenes of the meeting can be identified based on the audio data and/or video data, and the first text information corresponding to the first audio data can be obtained from the first audio data within the first time period in which a key scene occurs. That is, the audio data of the important scenes is obtained, so that potential hot-spot information is extracted from the audio data of those scenes, which improves the accuracy and validity of the hot-spot information obtained. There is also no need to rely on editorial staff participating in the entire meeting to excerpt important hot-spot information, which simplifies the processing of hot-spot information, improves the efficiency of obtaining it, saves labor cost and increases the intelligence of the system.
Moreover, embodiments of the present invention can also acquire key scenes from a meeting that is being recorded and obtain hot news based on those key scenes, so that hot news can be obtained more promptly, improving the timeliness of obtaining hot news.
In some embodiments, step 102 includes:
performing audio recognition on the audio data to determine the first key scenes of the meeting, wherein the first key scenes include at least one of: speech, applause, laughter, song; and/or
extracting key frames from the video data and identifying the second key scenes of the meeting based on the key frames, wherein the second key scenes include at least one of: an interview, audience applause, a key person speaking.
Here, there may be one or more first key scenes, each corresponding to a different time period; there may be one or more second key scenes, each corresponding to a different time period. The first key scenes may be different from the second key scenes, or some scenes among the first key scenes may be identical to some scenes among the second key scenes.
In embodiments of the present invention, different key scenes can thus be determined based on the audio data and/or the video data. Since some scenes are easier to detect from audio data, the audio data can be used to obtain first key scenes such as laughter and song; since other scenes are easier to detect from video data, the video data can be used to obtain second key scenes such as an interview or a key person speaking. In this way, information such as the key points and/or points of interest of the meeting can be obtained more comprehensively and accurately, so that more accurate and effective hot-spot information is extracted.
In other embodiments, step 102 comprises:
performing audio recognition on the audio data to determine candidate first key scenes of the meeting; if the duration of a candidate first key scene is determined to be greater than a first threshold, determining that the candidate first key scene is a first key scene; and/or
extracting key frames from the video data; identifying candidate second key scenes of the meeting based on the key frames; if the duration of a candidate second key scene is determined to be greater than the first threshold, determining that the candidate second key scene is a second key scene.
In embodiments of the invention, the duration of a determined candidate key scene can thus be checked: only when the duration of the candidate scene is greater than a certain threshold is the candidate scene determined to be a key scene. In this way, the key points and/or points of interest of the meeting can be determined more reliably, improving the accuracy of the hot-spot information obtained.
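The duration check can be sketched as follows, assuming the scene detector reports hit timestamps that are first merged into contiguous candidate segments; all values below are invented.

```python
# Merge per-frame detection timestamps into candidate segments, then keep
# only candidates whose duration exceeds the first threshold.

def merge_hits(hit_times, max_gap):
    """Merge detection timestamps into [start, end] segments."""
    segments = []
    for t in sorted(hit_times):
        if segments and t - segments[-1][1] <= max_gap:
            segments[-1][1] = t          # extend the current segment
        else:
            segments.append([t, t])      # start a new segment
    return segments

def filter_by_duration(segments, threshold_s):
    return [seg for seg in segments if seg[1] - seg[0] > threshold_s]

hits = [10.0, 10.5, 11.0, 11.5, 12.0, 30.0, 30.2]    # detector fired here
segments = merge_hits(hits, max_gap=1.0)             # [[10.0, 12.0], [30.0, 30.2]]
print(filter_by_duration(segments, threshold_s=1.5)) # [[10.0, 12.0]]
```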
In some embodiments, extracting key frames from the video data includes but is not limited to at least one of the following:
extracting key frames at a preset time interval;
extracting a first number of key frames per unit time;
if it is determined that the displayed image changes significantly, extracting the key frames within a predetermined time period around the change.
The preset time interval may be determined according to the duration of the meeting. For example, if the duration of the meeting is 1 hour, the preset time interval may be set to 1 minute or half a minute. The unit time may be 1 second, 10 seconds, 1 minute, several minutes, and so on.
A significant change in the image includes but is not limited to at least one of: a key person in the image appears or disappears, the position of a key person in the image changes, the behavior or actions of a key person in the image change, or the location shown in the image changes.
In this way, in embodiments of the present invention only some of the video frames need to be analyzed, rather than all the video frames of the meeting. Obtaining and identifying key scenes and/or faces based only on the key frames reduces the amount of data to be processed and improves the efficiency of generating image-and-text news releases and video news.
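Two of the key-frame strategies above, interval sampling and change detection, might look like this; the mean-absolute-difference measure, the threshold and the tiny mock greyscale frames are assumptions for illustration.

```python
# Two key-frame selection strategies: fixed-interval sampling, and
# sampling when consecutive frames differ significantly.

def interval_keyframes(num_frames, fps, interval_s):
    """Indices of frames taken every interval_s seconds."""
    step = int(fps * interval_s)
    return list(range(0, num_frames, step))

def change_keyframes(frames, threshold):
    """Indices where a frame differs significantly from its predecessor
    (mean absolute pixel difference above threshold)."""
    picked = []
    for i in range(1, len(frames)):
        diff = sum(abs(a - b) for a, b in zip(frames[i], frames[i - 1])) / len(frames[i])
        if diff > threshold:
            picked.append(i)
    return picked

print(interval_keyframes(num_frames=100, fps=25, interval_s=1))  # [0, 25, 50, 75]
frames = [[10, 10, 10], [10, 11, 10], [200, 200, 200]]           # tiny mock frames
print(change_keyframes(frames, threshold=50))                    # [2]
```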
In some embodiments, the method further includes:
obtaining first video data within the first time period in which the key scene occurs;
generating a hot-spot video of the meeting based on the first text information and the first video data.
For example, suppose the key scenes are determined to be a speech by key person A, laughter, and an interview by reporter B, where the speech by key person A occurs at the 15th to 20th minute of the meeting, the laughter occurs at the 25th to 26th minute, and the interview by reporter B occurs at the 30th to 33rd minute. The video data of the 15th to 20th minute, the 24th to 26th minute and the 30th to 33rd minute can then be extracted. Here, the video data extracted for the laughter scene may include the video data of a certain time before the laughter scene occurs. The video data of the 15th to 20th minute, the 24th to 26th minute and the 30th to 33rd minute is spliced to generate the hot-spot video of the meeting, and the first text information is matched to the first video data and displayed as subtitles.
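The splicing of the selected time windows can be sketched as interval merging over minute offsets; the helper below reproduces the cut list of the example above (speech at 15-20, padded laughter at 24-26, interview at 30-33), and all names are illustrative.

```python
# Normalise the selected (start, end) time windows, merge any that overlap,
# and produce the cut list for the spliced hot-spot video.

def merge_windows(windows):
    """Merge overlapping (start, end) windows, returned in sorted order."""
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Speech 15-20, laughter padded from 25-26 back to 24-26, interview 30-33.
cut_list = merge_windows([(15, 20), (24, 26), (30, 33)])
print(cut_list)                         # [(15, 20), (24, 26), (30, 33)]
print(sum(e - s for s, e in cut_list))  # 10 minutes of hot-spot video
```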
In other embodiments, generating the hot-spot video of the meeting based on the first text information and the first video data comprises:
obtaining the key information in the first text information;
extracting the video data corresponding to the key information from the first video data to generate the hot-spot video of the meeting.
For example, suppose the key scene is determined to be a speech by key person A, which occurs at the 15th to 20th minute of the meeting. The first text information of key person A's speech is obtained, and at least one piece of key information is obtained based on the first text information, where the key information corresponds to the 16th to 18th minute of the meeting. The video data corresponding to the 16th to 18th minute then needs to be extracted from the video data of the 15th to 20th minute, and the hot-spot video of the meeting is obtained based on the video data corresponding to the 16th to 18th minute. In this way, the accuracy of the hot-spot video obtained can be further improved, and the extraction of unnecessary video data is greatly reduced.
In some embodiments, step 104 comprises:
recognizing the first audio data to obtain voiceprint information of at least one key person, and obtaining the first sub-text information of the key person based on the voiceprint information of the key person.
Step 105 comprises:
inserting the identification information of the key person into the first sub-text information to generate hot-spot information based on the key person.
Here, a key person may be a person who appears early in the meeting; a person whose speaking time exceeds a first time threshold; a person who delivers important information; a person who appears alone in a certain number of video frames; and so on.
In embodiments of the present invention, the voiceprint information of a key person can be obtained based on voiceprint recognition of the first audio data, so that the first sub-text information corresponding to the key person is obtained and the hot-spot information of the key person is established. Moreover, the identification information of the key person can be inserted at the position of the key person's corresponding speech, which makes it convenient to organize the speeches of different key persons and easy for users to read.
In some embodiments, before step 105, the method further includes:
obtaining the identification information of the key person based on the voiceprint information of the key person.
In practical applications, the terminal device may store a preset relation table; the relation table records the correspondence between a person's voiceprint information and identification information. In this way, when the voiceprint information of a key person is obtained, the relation table can be searched for identification information matching that voiceprint information; if a match is found, the identification information of the key person can be obtained directly.
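A minimal sketch of such a lookup, assuming the relation table stores a fixed-length voiceprint embedding per person (the embeddings, names, and similarity threshold below are invented for illustration):

```python
import math

# Hypothetical preset relation table: voiceprint embeddings mapped
# to identification information. Values are illustrative only.
RELATION_TABLE = {
    "Zhang Wei": [0.9, 0.1, 0.2],
    "Li Na":     [0.1, 0.8, 0.3],
}

def cosine(a, b):
    """Cosine similarity between two voiceprint embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def identify(voiceprint, threshold=0.9):
    """Return the identification information whose stored voiceprint
    best matches the query, or None if no entry is close enough."""
    best_name, best_sim = None, 0.0
    for name, stored in RELATION_TABLE.items():
        sim = cosine(voiceprint, stored)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= threshold else None
```

Returning None when no stored voiceprint matches leaves room for the fallback the description implies: the key person simply remains unlabeled.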
In some embodiments, as shown in Fig. 2, the method further includes:
Step 106a: based on the voiceprint information of one key person among the at least one key person, extracting the first sub-video data of the key person from the first video data; and generating the hot video of the key person based on the first sub-video data.
In the embodiments of the present invention, the first sub-video data of the key person can be extracted to establish the hot video of that key person.
In other embodiments, as shown in Fig. 2, the method further includes:
Step 106b: based on the voiceprint information of multiple key persons, extracting second sub-video data corresponding to the multiple key persons from the first video data; based on the respective weight coefficients of the multiple key persons, extracting video clips of corresponding periods from the corresponding second sub-video data; and assembling the video clips corresponding to the multiple key persons to generate the hot video of the multiple key persons.
Here, assembling the video clips corresponding to the multiple key persons may be: splicing the video clips of the multiple key persons together to form one complete video.
In some embodiments, determining the respective weight coefficients of the key persons includes: determining the respective weight coefficient of each key person according to at least one of the key person's speaking time in the meeting, the key person's seat, and the identity information and/or personal information of the key person found through a web search.
For example, if it is determined that a key person speaks near the beginning or the end of the meeting, the weight coefficient of that key person is set larger; if it is determined that a key person speaks near the middle, the weight coefficient of that key person is set smaller.
For another example, if the identity information of a key person found through the network is more important, the weight coefficient of that key person is set larger; if the identity information found through the network is less important, the weight coefficient of that key person is set smaller.
It can be understood that the larger the weight coefficient, the longer the period of the video clip extracted accordingly; the smaller the weight coefficient, the shorter the period of the video clip extracted accordingly.
In the embodiments of the present invention, the second sub-video data of multiple key persons can be extracted to establish a combined hot video that includes the speeches of multiple key persons. Further, hot videos with different speech durations can be established based on the importance of the key persons, so that more important persons are given longer speaking time in the video and less important persons are given shorter speaking time.
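The relationship between weight coefficient and clip length can be sketched as a proportional allocation; the roles and weights below are invented examples, not values from the patent:

```python
# Hypothetical sketch: allocate clip durations in proportion to each
# key person's weight coefficient, so a larger weight coefficient
# yields a longer extracted video clip.
def allocate_clip_durations(weights, total_seconds):
    """weights: dict mapping key person -> weight coefficient.
    Returns dict mapping key person -> clip duration in seconds."""
    total_weight = sum(weights.values())
    return {person: total_seconds * w / total_weight
            for person, w in weights.items()}

durations = allocate_clip_durations(
    {"chair": 3.0, "keynote": 2.0, "panelist": 1.0}, 120)
# The chair's clip is three times as long as the panelist's.
```

Splicing the clips in meeting order then yields the combined hot video of Step 106b.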
In some embodiments, step 105 includes:
extracting the key message in the first text information, where the key message includes at least one of: a keyword indicating a key person, a keyword indicating an action, a finance-related keyword and/or key sentence, a technology-related keyword and/or key sentence, and a topical-news keyword and/or key sentence; and
generating the hot information of the meeting based on the key message.
Here, the key message may also include indicator words related to time, place, and the like.
Here, the keywords may include introductory words such as key persons' names, times, and/or places; the keywords may also include action words such as conclusion, action, order, requirement, and/or execution; the keywords may also include technical terms of a professional field and/or hot words of topical news. For example, the technical terms of the professional field may be 5G, artificial intelligence, neural network, and the like; the hot words of topical news may be Huawei and the like.
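A minimal sketch of matching such preset keyword lists against the first text information (the word lists merely echo the examples above and are not an exhaustive vocabulary):

```python
# Illustrative keyword lists per category; the entries come from the
# examples in the description and are not a complete vocabulary.
KEYWORD_LISTS = {
    "technical": ["5G", "artificial intelligence", "neural network"],
    "action":    ["conclusion", "order", "requirement"],
    "news":      ["Huawei"],
}

def extract_key_messages(text):
    """Return the preset keywords found in the text, per category."""
    found = {}
    for category, words in KEYWORD_LISTS.items():
        hits = [w for w in words if w in text]
        if hits:
            found[category] = hits
    return found

msg = extract_key_messages(
    "The conclusion is that 5G and Huawei dominated the session.")
```

In practice the patent's multi-modal analysis would replace this plain substring match, but the categories of key messages are the same.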
In some scenarios, the embodiments of the present invention may also obtain, based on the audio data of the meeting, the alternative text information corresponding to the audio data; obtain the key message based on the alternative text information; and obtain the hot information of the meeting based on the key message, the video data of the key scenes, and the audio data of the key scenes.
In the embodiments of the present invention, the key scenes in the meeting can be identified, and the news points of the meeting can be extracted based on the keywords, key sentences, and the like in the audio data, so as to obtain the hot information of the meeting. In this way, the accuracy and validity of the obtained hot information can be further improved.
As shown in Fig. 3, an embodiment of the present invention discloses a processing method of a meeting hot spot, and the method includes the following steps:
Step S301: acquiring the content of a meeting.
Optionally, the terminal device acquires the content of the meeting; the content of the meeting includes audio data and video data.
Step S302: separating out the audio data of the meeting.
Optionally, the terminal device separates the audio data from the content of the meeting.
Step S303: extracting key frames of the video data.
Optionally, the terminal device extracts key frames from the video data at a preset time interval.
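Extracting key frames at a preset time interval amounts to sampling frame indices from the frame rate; a sketch of the index computation follows (the actual frame decoding is assumed to be handled by a video library):

```python
# Illustrative sketch: compute which frame indices to sample as key
# frames, given the video frame rate and a preset time interval.
def key_frame_indices(total_frames, fps, interval_seconds):
    """Return frame indices spaced interval_seconds apart."""
    step = int(fps * interval_seconds)
    return list(range(0, total_frames, step))

indices = key_frame_indices(total_frames=300, fps=25,
                            interval_seconds=2)
# Samples one frame every 50 frames of a 25 fps video.
```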
Step S304a: speech recognition.
Optionally, the terminal device performs speech recognition on the audio data to obtain the alternative text information corresponding to the audio data.
Step S304b: audio classification.
Optionally, the terminal device performs audio classification on the audio data, dividing the audio data into scenes of speech, applause, laughter, and song; and extracts from the audio data the first audio data corresponding to the scenes of speech, applause, laughter, and song.
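Assuming a pre-trained audio classifier that assigns one label per second (the classifier itself is outside this sketch), the division into speech, applause, laughter, and song scenes can be obtained by grouping contiguous runs of identical labels:

```python
# Illustrative sketch: group per-second audio-classification labels
# into contiguous labeled scene segments.
def labels_to_segments(labels):
    """labels: list of per-second class labels.
    Returns a list of (label, start_second, end_second) segments,
    with end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments

segs = labels_to_segments(
    ["speech", "speech", "applause", "applause", "speech"])
```

Each segment's time span can then be used to cut the corresponding first audio data out of the recording.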
Step S304c: voiceprint recognition.
Optionally, the terminal device performs voiceprint recognition on the audio data to obtain the voiceprint feature information of at least one key person, and extracts from the audio data the second audio data corresponding to the at least one key person.
Step S305a: scene recognition.
Optionally, the terminal device obtains the first image of the key frame; identifies, based on the first image, scenes of interviews, audience applause, and key-person speeches; and obtains from the video data the first video data corresponding to the scenes of interviews, audience applause, and key-person speeches.
Step S305b: face recognition.
Optionally, the terminal device obtains the second image of the key frame, and performs face recognition based on the second image to obtain the second video data of the key persons.
The face recognition may be performed using a neural network model. Specifically, the terminal device obtains an image training set containing training images with faces, where the training images include original images carrying key-feature-point annotation information; an initial convolutional neural network is trained repeatedly on the image training set until the loss meets a convergence condition, so as to obtain a trained neural network model. The first image is then processed with the trained neural network model to identify the faces contained in the first image.
Here, the loss function is also called the cost function and is the objective function of neural network optimization; the process of training or optimizing a neural network is the process of minimizing the loss function. The smaller the value of the loss function, the closer the corresponding prediction is to the true result.
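A toy illustration of this point, using a mean-squared-error loss (a generic example, not the patent's training objective): as the predictions approach the true values, the loss value shrinks.

```python
# Toy mean-squared-error loss: smaller values mean the predictions
# are closer to the true results.
def mse_loss(predictions, targets):
    return sum((p - t) ** 2
               for p, t in zip(predictions, targets)) / len(targets)

targets = [1.0, 0.0, 1.0]
rough = mse_loss([0.5, 0.5, 0.5], targets)   # early in training
better = mse_loss([0.9, 0.1, 0.9], targets)  # after more training
# The loss decreases as the predictions move toward the targets.
```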
Step S306: performing a comprehensive analysis of the multi-modal data.
Optionally, the alternative text information, the first audio data, the first video data, and the second video data are input into a multi-modal data model to carry out the comprehensive analysis of the multi-modal data.
The multi-modal data model may be a neural network model.
Step S307a: extracting the keywords of the meeting.
Optionally, the terminal device obtains the keywords of the meeting based on the comprehensive analysis of the multi-modal data.
The keywords may include introductory words such as key persons' names, times, and/or places; the keywords may also include action words such as conclusion, action, order, requirement, and/or execution; the keywords may also include technical terms of a professional field, hot words of topical news, and/or finance-related hot words. For example, the technical terms of the professional field may be 5G, artificial intelligence, neural network, blockchain, and the like; the hot words of topical news may be Huawei, the G20 summit, the Chinese space station, and the like; the finance-related hot words may be tax reduction, the macro leverage ratio, and the like.
Step S307b: extracting the key sentences of the meeting.
Optionally, the terminal device obtains the key sentences of the meeting based on the comprehensive analysis of the multi-modal data.
The key sentences include, but are not limited to, at least one of: a key sentence indicating an action, a technical sentence of a professional field, a hot sentence of topical news, and a finance-related key sentence.
Step S307c: extracting the potential news points of the meeting.
Optionally, the terminal device obtains scenes of interest in the meeting based on the comprehensive analysis of the multi-modal data, and extracts potential news points based on the scenes of interest.
Here, the scenes of interest may be scenes such as applause and laughter; the news points are the text information corresponding to the scenes of interest.
Step S308: generating a meeting outline.
Optionally, the terminal device generates the outline of the meeting based on the keywords, the key sentences, and the news points.
Here, the meeting outline is the hot information in the above embodiments.
Step S309: generating a picture-and-text news release.
Optionally, the terminal device generates a picture-and-text news release from the meeting outline using a preset format.
Here, the preset format can be used to provide a unified organizational form for each news point; for example, for a news release covering the speeches of multiple key persons, upper limits on the number of speakers, the technical points, and the word count can be set to provide a unified organizational form for the news release.
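A sketch of such a preset format, with invented upper limits on the number of speakers and the word count per news point:

```python
# Hypothetical preset format: cap the number of key persons and the
# word count per news point, then render a uniform release.
MAX_PERSONS = 3   # invented upper limit on speakers
MAX_WORDS = 12    # invented upper limit on words per news point

def render_release(title, speeches):
    """speeches: list of (person, summary) pairs, meeting order."""
    lines = [f"== {title} =="]
    for person, summary in speeches[:MAX_PERSONS]:
        words = summary.split()[:MAX_WORDS]
        lines.append(f"- {person}: {' '.join(words)}")
    return "\n".join(lines)

release = render_release("5G Forum", [
    ("Zhang Wei", "5G rollout will accelerate next year"),
    ("Li Na", "Edge computing complements the new networks"),
])
```

The same template idea applies to the video news release of Step S312, with clip durations in place of word counts.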
Step S310: performing segment cutting on the conference video.
Optionally, the terminal device performs segment cutting on the conference video based on the second audio data, the first video data, and the second video data, so as to obtain the video data corresponding to the key persons and/or key scenes.
Step S311: extracting a highlight video.
Optionally, the terminal device extracts the highlight video based on the video data corresponding to the key persons and/or key scenes and the meeting outline.
Here, the highlight video is the hot video in the above embodiments.
Step S312: generating a video news release.
Optionally, the terminal device generates a video news release from the highlight video using a preset format.
In an alternative embodiment, the terminal device uses the preset format to arrange the corresponding times at which each key person speaks.
In the embodiments of the present invention, at least some key scenes in the meeting, such as song, laughter, applause, and speech, can be determined based on audio classification; other key scenes in the meeting are further determined based on scene recognition of the video data. In this way, more comprehensive key news points of the meeting can be obtained, so that more accurate hot information can be obtained.
Moreover, in the embodiments of the present invention, the voiceprint features of the key persons can be obtained based on voiceprint recognition of the audio data, so as to obtain the audio data of the key persons; in this way, extraction of the hot news of the key persons can be achieved. Further, the video hotspot information of the key persons can also be separated out in the embodiments of the present invention.
Moreover, in the embodiments of the present invention, the picture-and-text news release and/or video news release can be generated based on the preset format, which can make the hot information more standardized and tidy, convenient for users to read or watch, and improve the user experience.
It should be noted that the following description of the processing apparatus of the meeting hot spot is similar to the above description of the processing method of the meeting hot spot, and the description of the beneficial effects of the method is likewise not repeated. For technical details not disclosed in the embodiments of the processing apparatus of the meeting hot spot of the present invention, please refer to the description of the embodiments of the processing method of the meeting hot spot of the present invention.
As shown in Fig. 4, an embodiment of the present invention also provides a processing apparatus of a meeting hot spot, and the apparatus includes:
a first obtaining module 41, configured to obtain audio data and/or video data of a meeting;
a first identification module 42, configured to identify the key scenes of the meeting based on the audio data and/or video data;
a second obtaining module 43, configured to obtain the first audio data within the first time period where the key scenes are located;
a second identification module 44, configured to obtain the first text information by identifying the first audio data; and
a generation module 45, configured to generate the hot information of the meeting based on the first text information.
In some embodiments, the generation module 45 is configured to generate the hot information of the meeting based on the keywords and/or key sentences in the first text information.
In some embodiments, the second obtaining module 43 is configured to obtain the first video data within the first time period where the key scenes are located; and
the generation module 45 is configured to generate the hot video of the meeting based on the first text information and the first video data.
In some embodiments, the generation module 45 is further configured to obtain the key message in the first text information, and extract the video data corresponding to the key message from the first video data to generate the hot video of the meeting.
In some embodiments, the first identification module 42 is configured to perform audio classification on the audio data to determine the first key scene of the meeting, where the first key scene includes at least one of: speech, applause, laughter, and song; and/or
configured to extract key frames from the video data and identify the second key scene of the meeting based on the key frames, where the second key scene includes at least one of: an interview, audience applause, and a key-person speech.
In some implementations, the first identification module 42 is further configured to: if it is determined that the duration of an alternative first key scene is greater than a first threshold, determine that the alternative first key scene is the first key scene; and/or
extract key frames from the video data; identify an alternative second key scene of the meeting based on the key frames; and if it is determined that the duration of the alternative second key scene is greater than the first threshold, determine that the alternative second key scene is the second key scene.
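The duration check for alternative key scenes can be sketched as a simple filter; the threshold value below is an invented example:

```python
# Illustrative sketch: confirm an alternative key scene only when
# its duration exceeds the first threshold, discarding brief ones.
FIRST_THRESHOLD = 5.0  # seconds; an example value, not from the patent

def confirm_key_scenes(candidates):
    """candidates: list of (label, start, end) alternative scenes.
    Returns the scenes whose duration exceeds the threshold."""
    return [(label, start, end) for label, start, end in candidates
            if end - start > FIRST_THRESHOLD]

scenes = confirm_key_scenes([("applause", 0.0, 3.0),
                             ("speech", 10.0, 40.0)])
# Only the 30-second speech scene survives the threshold.
```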
In some embodiments, the second identification module 44 is configured to obtain the voiceprint information of at least one key person by identifying the first audio data, and to obtain the first sub-text information of the key person based on the voiceprint information of the key person; and
the generation module 45 is configured to insert the identification information of the key person into the first sub-text information to generate the hot information based on the key person.
In some embodiments, the second obtaining module 43 is configured to extract the first sub-video data of one key person among the at least one key person from the first video data based on the voiceprint information of that key person; and
the generation module 45 is configured to generate the hot video of the key person based on the first sub-video data.
In some embodiments, the second obtaining module 43 is configured to extract, based on the voiceprint information of multiple key persons, the second sub-video data corresponding to the multiple key persons from the first video data; and to extract, based on the respective weight coefficients of the multiple key persons, the video clips of the corresponding periods from the corresponding second sub-video data; and
the generation module 45 is configured to assemble the video clips corresponding to the multiple key persons to generate the hot video of the multiple key persons.
In some embodiments, the generation module 45 is configured to extract the key message in the first text information, where the key message includes at least one of: a keyword indicating a key person, a keyword indicating an action, a finance-related keyword and/or key sentence, a technology-related keyword and/or key sentence, and a topical-news keyword and/or key sentence; and to generate the hot information of the meeting based on the key message.
As shown in Fig. 5, an embodiment of the present invention also discloses a terminal device. The terminal device includes a processor 51 and a memory 52 for storing a computer program that can run on the processor 51, where the processor 51 is configured to, when running the computer program, implement the processing method of the meeting hot spot applied to the terminal device.
In some embodiments, the memory in the embodiments of the present invention may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which serves as an external cache. By way of exemplary but non-restrictive description, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synclink dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM). The memories of the systems and methods described herein are intended to include, but are not limited to, these and any other suitable types of memory.
The processor may be an integrated circuit chip with signal processing capability. In implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The above processor may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in the embodiments of the present invention may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the field, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
In some embodiments, the embodiments described herein may be implemented by hardware, software, firmware, middleware, microcode, or a combination thereof. For hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described herein, or combinations thereof.
For software implementation, the techniques described herein may be implemented by modules (such as processes and functions) that perform the functions described herein. Software code may be stored in a memory and executed by a processor. The memory may be implemented in the processor or outside the processor.
A further embodiment of the present invention provides a computer storage medium storing an executable program. When the executable program is executed by a processor, the steps of the processing method of the meeting hot spot applied to the server or terminal device can be implemented, for example, one or more of the methods shown in Figs. 1 to 3.
In some embodiments, the computer storage medium may include various media that can store program code, such as a USB flash disk, a mobile hard disk, a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), and a magnetic disk or an optical disk.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways.
It should be understood that the technical solutions recorded in the embodiments of the present invention can be combined arbitrarily in the absence of conflict.
The above descriptions are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that can be easily conceived by those skilled in the art within the technical scope disclosed by the present invention shall be included within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A processing method of a meeting hot spot, wherein the method comprises:
obtaining audio data and/or video data of a meeting;
identifying key scenes of the meeting based on the audio data and/or video data;
obtaining first audio data within a first time period where the key scenes are located;
obtaining first text information by identifying the first audio data; and
generating hot information of the meeting based on the first text information.
2. The method according to claim 1, wherein the method further comprises:
obtaining first video data within the first time period where the key scenes are located; and
generating a hot video of the meeting based on the first text information and the first video data.
3. The method according to claim 1, wherein identifying the key scenes of the meeting based on the audio data and/or video data comprises:
performing audio classification on the audio data to determine a first key scene of the meeting, wherein the first key scene includes at least one of: speech, applause, laughter, and song;
and/or
extracting key frames from the video data, and identifying a second key scene of the meeting based on the key frames, wherein the second key scene includes at least one of: an interview, audience applause, and a key-person speech.
4. The method according to claim 1, wherein obtaining the first text information by identifying the first audio data comprises:
obtaining voiceprint information of at least one key person by identifying the first audio data, and obtaining first sub-text information of the key person based on the voiceprint information of the key person; and
generating the hot information of the meeting based on the first text information comprises:
inserting identification information of the key person into the first sub-text information to generate the hot information based on the key person.
5. The method according to claim 4, wherein the method further comprises:
extracting first sub-video data of one key person among the at least one key person from first video data based on the voiceprint information of that key person; and
generating a hot video of the key person based on the first sub-video data.
6. The method according to claim 4, wherein the method further comprises:
extracting second sub-video data corresponding to multiple key persons from first video data based on voiceprint information of the multiple key persons;
extracting video clips of corresponding periods from the corresponding second sub-video data based on respective weight coefficients of the multiple key persons; and
assembling the video clips corresponding to the multiple key persons to generate a hot video of the multiple key persons.
7. The method according to claim 1, wherein generating the hot information of the meeting based on the first text information comprises:
extracting a key message in the first text information, wherein the key message includes at least one of: a keyword indicating a key person, a keyword indicating an action, a finance-related keyword and/or key sentence, a technology-related keyword and/or key sentence, and a topical-news keyword and/or key sentence; and
generating the hot information of the meeting based on the key message.
8. A processing apparatus of a meeting hot spot, wherein the apparatus comprises:
a first obtaining module, configured to obtain audio data and/or video data of a meeting;
a first identification module, configured to identify key scenes of the meeting based on the audio data and/or video data;
a second obtaining module, configured to obtain first audio data within a first time period where the key scenes are located;
a second identification module, configured to obtain first text information by identifying the first audio data; and
a generation module, configured to generate hot information of the meeting based on the first text information.
9. A terminal device, wherein the terminal device comprises a processor and a memory for storing a computer program that can run on the processor, wherein the processor is configured to, when running the computer program, implement the processing method of the meeting hot spot according to any one of claims 1 to 7.
10. A storage medium having computer-executable instructions stored therein, wherein the computer-executable instructions are executed by a processor to implement the processing method of the meeting hot spot according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910549987.5A CN110211590B (en) | 2019-06-24 | 2019-06-24 | Conference hotspot processing method and device, terminal equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110211590A true CN110211590A (en) | 2019-09-06 |
CN110211590B CN110211590B (en) | 2021-12-03 |
Family
ID=67794249
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910549987.5A Active CN110211590B (en) | 2019-06-24 | 2019-06-24 | Conference hotspot processing method and device, terminal equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110211590B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111223487A (en) * | 2019-12-31 | 2020-06-02 | 联想(北京)有限公司 | Information processing method and electronic equipment |
CN111400511A (en) * | 2020-03-12 | 2020-07-10 | 北京奇艺世纪科技有限公司 | Multimedia resource interception method and device |
CN111597381A (en) * | 2020-04-16 | 2020-08-28 | 国家广播电视总局广播电视科学研究院 | Content generation method, device and medium |
CN111798870A (en) * | 2020-09-08 | 2020-10-20 | 共道网络科技有限公司 | Session link determining method, device and equipment and storage medium |
CN112231464A (en) * | 2020-11-17 | 2021-01-15 | 安徽鸿程光电有限公司 | Information processing method, device, equipment and storage medium |
CN116074137A (en) * | 2023-01-18 | 2023-05-05 | 京东方科技集团股份有限公司 | Recording method, recording device, electronic equipment and storage medium for meeting summary |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102388379A (en) * | 2009-10-27 | 2012-03-21 | 思科技术公司 | Automated and enhanced note taking for online collaborative computing sessions |
CN103064863A (en) * | 2011-10-24 | 2013-04-24 | 北京百度网讯科技有限公司 | Method and equipment of providing recommend information |
CN103137137A (en) * | 2013-02-27 | 2013-06-05 | 华南理工大学 | Eloquent speaker finding method in conference audio |
US20150067026A1 (en) * | 2013-08-30 | 2015-03-05 | Citrix Systems, Inc. | Acquiring online meeting data relating to an online meeting |
JP2015076673A (en) * | 2013-10-07 | 2015-04-20 | 富士ゼロックス株式会社 | Conference system, server device, client terminal and program |
CN105574182A (en) * | 2015-12-22 | 2016-05-11 | 北京搜狗科技发展有限公司 | News recommendation method and device as well as device for news recommendation |
CN106202427A (en) * | 2016-07-12 | 2016-12-07 | 腾讯科技(深圳)有限公司 | Application processing method and device |
US20160372154A1 (en) * | 2015-06-18 | 2016-12-22 | Orange | Substitution method and device for replacing a part of a video sequence |
CN106599137A (en) * | 2016-12-02 | 2017-04-26 | 北京薇途科技有限公司 | Novel scene content pushing system and device |
CN106612468A (en) * | 2015-10-21 | 2017-05-03 | 上海文广互动电视有限公司 | A video abstract automatic generation system and method |
CN106982344A (en) * | 2016-01-15 | 2017-07-25 | 阿里巴巴集团控股有限公司 | video information processing method and device |
CN107528899A (en) * | 2017-08-23 | 2017-12-29 | 广东欧珀移动通信有限公司 | Information recommendation method, device, mobile terminal and storage medium |
CN108305632A (en) * | 2018-02-02 | 2018-07-20 | 深圳市鹰硕技术有限公司 | A kind of the voice abstract forming method and system of meeting |
CN108346034A (en) * | 2018-02-02 | 2018-07-31 | 深圳市鹰硕技术有限公司 | A kind of meeting intelligent management and system |
CN109388701A (en) * | 2018-08-17 | 2019-02-26 | 深圳壹账通智能科技有限公司 | Minutes generation method, device, equipment and computer storage medium |
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102388379A (en) * | 2009-10-27 | 2012-03-21 | Cisco Technology, Inc. | Automated and enhanced note taking for online collaborative computing sessions
CN103064863A (en) * | 2011-10-24 | 2013-04-24 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and device for providing recommendation information
CN103137137A (en) * | 2013-02-27 | 2013-06-05 | South China University of Technology | Eloquent speaker finding method in conference audio
US20150067026A1 (en) * | 2013-08-30 | 2015-03-05 | Citrix Systems, Inc. | Acquiring online meeting data relating to an online meeting
JP2015076673A (en) * | 2013-10-07 | 2015-04-20 | Fuji Xerox Co., Ltd. | Conference system, server device, client terminal and program
US20160372154A1 (en) * | 2015-06-18 | 2016-12-22 | Orange | Substitution method and device for replacing a part of a video sequence
CN106612468A (en) * | 2015-10-21 | 2017-05-03 | Shanghai Interactive TV Co., Ltd. | Automatic video summary generation system and method
CN105574182A (en) * | 2015-12-22 | 2016-05-11 | Beijing Sogou Technology Development Co., Ltd. | News recommendation method and device, and device for news recommendation
CN106982344A (en) * | 2016-01-15 | 2017-07-25 | Alibaba Group Holding Ltd. | Video information processing method and device
CN106202427A (en) * | 2016-07-12 | 2016-12-07 | Tencent Technology (Shenzhen) Co., Ltd. | Application processing method and device
CN106599137A (en) * | 2016-12-02 | 2017-04-26 | Beijing Weitu Technology Co., Ltd. | Novel scene content pushing system and device
CN107528899A (en) * | 2017-08-23 | 2017-12-29 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Information recommendation method, device, mobile terminal and storage medium
CN108305632A (en) * | 2018-02-02 | 2018-07-20 | Shenzhen Yingshuo Technology Co., Ltd. | Speech summary generation method and system for meetings
CN108346034A (en) * | 2018-02-02 | 2018-07-31 | Shenzhen Yingshuo Technology Co., Ltd. | Intelligent meeting management method and system
CN109388701A (en) * | 2018-08-17 | 2019-02-26 | Shenzhen OneConnect Smart Technology Co., Ltd. | Meeting minutes generation method, device, equipment and computer storage medium
Non-Patent Citations (2)
Title |
---|
BING ZHU: "Research and implementation of hot topic detection system based on web", ICCC *
JIN HAI: "Audio Event Detection Based on Deep Neural Networks", China Masters' Theses Full-text Database, Information Science and Technology Series *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111223487A (en) * | 2019-12-31 | 2020-06-02 | Lenovo (Beijing) Ltd. | Information processing method and electronic equipment
CN111400511A (en) * | 2020-03-12 | 2020-07-10 | Beijing QIYI Century Science & Technology Co., Ltd. | Multimedia resource interception method and device
CN111597381A (en) * | 2020-04-16 | 2020-08-28 | Academy of Broadcasting Science, National Radio and Television Administration | Content generation method, device and medium
CN111798870A (en) * | 2020-09-08 | 2020-10-20 | Gongdao Network Technology Co., Ltd. | Session link determining method, device and equipment and storage medium
CN112231464A (en) * | 2020-11-17 | 2021-01-15 | Anhui Hongcheng Optoelectronic Co., Ltd. | Information processing method, device, equipment and storage medium
CN112231464B (en) * | 2020-11-17 | 2023-12-22 | Anhui Hongcheng Optoelectronic Co., Ltd. | Information processing method, device, equipment and storage medium
CN116074137A (en) * | 2023-01-18 | 2023-05-05 | BOE Technology Group Co., Ltd. | Recording method, recording device, electronic equipment and storage medium for meeting summary
Also Published As
Publication number | Publication date |
---|---|
CN110211590B (en) | 2021-12-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110211590A (en) | Processing method, apparatus, terminal device and storage medium for meeting hot spots | |
CN110517689B (en) | Voice data processing method, device and storage medium | |
CN107562760B (en) | Voice data processing method and device | |
Biel et al. | VlogSense: Conversational behavior and social attention in YouTube | |
Hong et al. | Video accessibility enhancement for hearing-impaired users | |
CN106777257B (en) | Intelligent dialogue model construction system and method based on dialect | |
CN110121116A (en) | Video generation method and device | |
CN106303658A (en) | Interaction method and device applied to live video streaming | |
CN109361825A (en) | Meeting summary recording method, terminal and computer storage medium | |
CN105681920A (en) | Network teaching method and system with voice recognition function | |
CN104463423A (en) | Formative video resume collection method and system | |
CN111028007B (en) | User portrait information prompting method, device and system | |
CN107507620A (en) | Voice broadcast sound setting method and device, mobile terminal and storage medium | |
CN112562677A (en) | Conference voice transcription method, device, equipment and storage medium | |
CN206672635U (en) | Voice interaction device based on a book service robot | |
CN111353439A (en) | Method, device, system and equipment for analyzing teaching behaviors | |
WO2024188276A1 (en) | Text classification method and refrigeration device system | |
CN117609548A (en) | Video multi-mode target element extraction and video abstract synthesis method and system based on pre-training model | |
CN117216206A (en) | Session processing method and device, electronic equipment and storage medium | |
CN111522992A (en) | Method, device and equipment for putting questions into storage and storage medium | |
CN111160051A (en) | Data processing method and device, electronic equipment and storage medium | |
CN111221987A (en) | Hybrid audio tagging method and apparatus | |
CN110351183A (en) | Resource collecting method and device in instant messaging | |
CN114155841A (en) | Voice recognition method, device, equipment and storage medium | |
CN107464196A (en) | Student dropout prediction method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||