CN103700370B - Broadcast television speech recognition method and system - Google Patents
- Publication number: CN103700370B (application CN201310648375.4A)
- Authority: CN (China)
- Prior art keywords: voice, data, mark, identification, speech
- Legal status: Active
Abstract
The invention discloses a broadcast television speech recognition method and system. The method includes: extracting audio data from broadcast television data; preprocessing the audio data to obtain feature text data; sending the feature text data to a cloud server for recognition processing to obtain male/female voice recognition, speaker recognition and speech recognition results; and merging the preprocessing results, male/female voice recognition, speaker recognition and speech recognition results and applying structured text marking to generate a structured speech recognition result. The method improves on existing speech recognition methods by combining broadcast television data preprocessing techniques with a broadcast television speech recognition method: the speech data is recognized according to the data processing requirements of the broadcast television industry, the individual recognition results are merged to generate a structured speech recognition result, basic data can be provided for the subsequent intelligent processing of other broadcast television program services, and processing speed and accuracy are improved.
Description
Technical field
The present invention relates to the field of audio and video processing technology, and in particular to a broadcast television speech recognition method and system.
Background technology
At present, in the field of broadcast television, speech recognition mainly relies on conventional speech recognition methods designed to be applicable to every industry. Traditional speech recognition mainly uses pattern matching and is divided into a training stage and a recognition stage. In the training stage, the user reads each word in the vocabulary in turn, and its feature vector is stored in a template library as a template. In the recognition stage, the feature vector of the input speech is compared for similarity with each template in the template library in turn, and the template with the highest similarity is output as the recognition result.
However, this kind of speech recognition exhibits the following problems when applied in the field of broadcast television:
1) The broadcast television industry often has particular speech recognition processes and operations that differ from other industries. Because the traditional speech recognition described above is designed for all industries, it is not tailored to broadcast television and cannot filter the non-speech content of broadcast television data according to the characteristics of the industry. Non-speech content is outside the scope of speech recognition processing; if it is not filtered out, it must still be transmitted and processed, which not only wastes transmission and computing resources, but also causes additional misrecognition due to the presence of non-speech content, and slows down processing.
2) Because traditional speech recognition technology lacks recognition functions specific to the broadcast television industry, the recognition results are insufficiently complete. For example, for a segment of broadcast television data it cannot determine important information such as the speaking scene and the identity of the speaker, cannot segment the speech content by speaker, and cannot identify the timestamp of each spoken word, so it cannot provide any valuable reference information for the subsequent intelligent, automated processing of other broadcast television services.
In summary, applying traditional speech recognition methods in the broadcast television industry suffers from problems such as wasted resources, slow processing speed, low accuracy, and insufficient information provided.
Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved by the present invention is how to perform speech recognition tailored to the characteristics of the broadcast television industry, so as to avoid the shortcomings of conventional speech recognition methods when applied in the broadcast television industry, and to provide sufficient usable basic data for the subsequent intelligent, automated processing of other broadcast television services.
(2) Technical scheme
To solve the above technical problem, the invention provides a broadcast television speech recognition method, including:
S1, extracting audio data from broadcast television data;
S2, preprocessing the audio data to obtain feature text data;
S3, sending the feature text data to a cloud server for recognition processing to obtain male/female voice recognition, speaker recognition and speech recognition results;
S4, merging the preprocessing results, male/female voice recognition, speaker recognition and speech recognition results and applying structured text marking to generate a structured speech recognition result.
Further, the preprocessing of the audio data in step S2 specifically includes:
S21, cutting and fragmenting the audio data to generate several sentence files;
S22, performing non-speech filtering on the sentence files, leaving only speech sentence files;
S23, performing wideband/narrowband discrimination on each speech sentence file, adding a wideband mark to speech sentence files determined to be wideband signals and a narrowband mark to those determined to be narrowband signals;
S24, performing audio feature extraction on the speech sentence files marked as wideband or narrowband to obtain feature text data, where the feature text data contains the start and end times of the speech sentence, the speech feature information, the name of the audio/video file the sentence belongs to, and the corresponding wideband/narrowband mark.
Further, in step S3 the recognition processing performed by the cloud server on the feature text data includes: male/female voice recognition, speaker recognition, speech content recognition and punctuation mark recognition, generating a speech recognition result containing marks.
Further, the merging and structured text marking of the speech recognition results in step S4 specifically includes:
S41, collecting and aligning each speech recognition result, and sorting them according to the start and end times they contain;
S42, marking the sorted speech recognition results according to a structured format, including speaker gender marks, speaker marks, speech content, punctuation marks and timestamps.
Further, the recognition processing of step S3 is performed according to a language model library, and the language model library is constantly updated through the collection of and learning from network text.
To solve the above technical problem, the invention also provides a broadcast television speech recognition system, including:
an extraction unit, which extracts audio data from broadcast television data;
a preprocessing terminal, which preprocesses the audio data to obtain feature text data and sends it to a cloud server;
a cloud server, which performs recognition processing on the feature text data to obtain speech recognition results, merges the speech recognition results and applies structured text marking to generate a structured speech recognition result.
Further, the preprocessing terminal includes:
a cutting module, which cuts and fragments the audio data to generate several sentence files;
a non-speech filtering module, which performs non-speech filtering on the sentence files, leaving only speech sentence files;
a wideband/narrowband discrimination module, which performs wideband/narrowband discrimination on each speech sentence file, adding a wideband mark to speech sentence files determined to be wideband signals and a narrowband mark to those determined to be narrowband signals;
an audio feature extraction module, which performs audio feature extraction on the speech sentence files marked as wideband or narrowband to obtain feature text data, where the feature text data contains the start and end times of the speech sentence, the name of the audio/video file it belongs to, and the corresponding wideband/narrowband mark.
Further, the cloud server includes:
a male/female voice recognition module, for performing male/female voice recognition on the feature text data;
a speaker recognition module, for performing speaker recognition on the feature text data;
a speech content and punctuation mark recognition module, for performing speech content recognition and punctuation mark recognition on the feature text data, generating a speech recognition result containing punctuation marks;
a recognition result processing module, which merges the speech recognition results and applies structured text marking to generate a structured speech recognition result.
Further, the recognition result processing module further includes:
a collection and sorting module, for collecting and aligning each speech recognition result and sorting them according to the start and end times they contain;
a marking module, for marking the sorted speech recognition results according to a structured format, including speaker gender marks, speaker marks, speech content, punctuation marks and timestamps.
Further, the cloud server also includes: a language model intelligent learning module, for periodically crawling network text and regularly updating the language model library by learning from the network text, so that during recognition processing the recognition is performed according to the regularly updated language model library.
(3) Beneficial effects
The embodiments of the invention provide a broadcast television speech recognition method and system, where the method includes: extracting audio data from broadcast television data; preprocessing the audio data to obtain feature text data; sending the feature text data to a cloud server for recognition processing to obtain male/female voice recognition, speaker recognition and speech recognition results; and merging the preprocessing results, male/female voice recognition, speaker recognition and speech recognition results and applying structured text marking to generate a structured speech recognition result. Based on cloud computing, the method improves on existing speech recognition methods by combining broadcast television data preprocessing techniques, male/female voice recognition technology, speaker recognition technology and a broadcast television speech recognition method. After preprocessing, the speech data is recognized specifically according to the data processing requirements of the broadcast television industry, and the broadcast television data preprocessing results, male/female voice recognition results, speaker recognition results and speech recognition results are merged and marked as structured text to generate a structured speech recognition result. This can provide basic data for later intelligent processing functions such as speech retrieval, subtitle recognition and presenter identification in broadcast television programs, and can speed up broadcast television speech recognition and improve its accuracy.
The basic data provided for the subsequent intelligent, automated processing of other broadcast television services specifically includes the following:
1) The speech recognition results and the marked timestamps of spoken words can provide basic data for retrieval services over broadcast television speech content;
2) The marked cutting time points of speech sentences and the wideband/narrowband discrimination results can provide boundary time point references for splitting broadcast television programs;
3) The recognition of speech content and punctuation marks in broadcast television can provide content references for subtitle recognition in broadcast television programs;
4) The speaker recognition of speech sentences and the wideband/narrowband discrimination results can provide a basis for presenter identification, guest identification and speaking scene recognition (indoor scene, outdoor scene) in broadcast television programs.
Brief description of the drawings
Fig. 1 is a step flowchart of a broadcast television speech recognition method provided by embodiment one of the present invention;
Fig. 2 is a step flowchart of the preprocessing operation provided by embodiment one;
Fig. 3 is a schematic diagram of the technical framework of the audio classification method used in the speech/non-speech discrimination provided by embodiment one;
Fig. 4 is a specific flowchart of performing speech recognition on broadcast television data provided by embodiment one;
Fig. 5 is a schematic composition diagram of a broadcast television speech recognition system provided by embodiment two of the present invention;
Fig. 6 is a schematic composition diagram of the preprocessing terminal provided by embodiment two;
Fig. 7 is a schematic composition diagram of the cloud server provided by embodiment two;
Fig. 8 is a work flowchart of the speech content and punctuation mark recognition module provided by embodiment two;
Fig. 9 is a schematic architecture diagram of the cloud service platform provided by embodiment two.
Detailed description of the invention
The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and embodiments. The following examples are used to illustrate the present invention but do not limit its scope.
Embodiment one
Embodiment one of the present invention provides a broadcast television speech recognition method whose step flow is shown in Fig. 1 and specifically includes the following steps:
Step S1, extracting audio data from broadcast television data.
Step S2, preprocessing the audio data to obtain feature text data.
Step S3, sending the feature text data to a cloud server for recognition processing to obtain male/female voice recognition, speaker recognition and speech recognition results.
Step S4, merging the preprocessing results, male/female voice recognition, speaker recognition and speech recognition results and applying structured text marking to generate a structured speech recognition result.
The above method first extracts audio data from the broadcast television data to be recognized (i.e., the audio/video data provided by the user) and obtains feature text data after preprocessing; the cloud server then performs recognition processing on it; finally, the preprocessing results, male/female voice recognition, speaker recognition and speech recognition results obtained are merged and marked as structured text, thereby generating a structured speech recognition result, which is returned to the user in the extensible markup language XML. Adding word timestamps, sentence timestamps, male/female voice marks and speaker marks to the speech recognition result can provide a basis for retrieval of broadcast television speech content, subtitle recognition, presenter identification and the like, facilitating the subsequent intelligent, automated processing of other broadcast television services and providing basic data for various operations and processes.
Preferably, before step S1 this embodiment also includes: receiving the broadcast television data sent by the user, where the broadcast television data includes audio/video data, which can be understood as audio data and video data. After the broadcast television data is received, it must first be determined whether it is an audio/video data type supported by the speech recognition system; if it is not supported, in other words not recognizable audio/video data, processing is refused.
Audio/video decoding in this embodiment uses the G.711 codec standard and the ffmpeg software decoding tool to decode the audio/video, extracting the audio portion and saving it in pcm format. It is compatible with the currently mainstream broadcast television audio/video data formats, such as wmv, wma, wav, mp3, asf, rm, mp4, avi and flv. If the data is judged to be recognizable audio/video data, it is decoded, the audio portion is extracted from it, and the resulting audio data is used as the data to be processed in step S2.
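As a sketch of this extraction step, the snippet below builds an ffmpeg command line that decodes the audio track of a container file and saves it as raw pcm. The file names, the 16-bit mono output, and the sampling rate are illustrative assumptions; the embodiment only specifies ffmpeg as the decoding tool and pcm as the output format.

```python
def ffmpeg_extract_pcm_cmd(src, dst, rate=8000):
    """Build an ffmpeg command that decodes the audio track of an
    audio/video file and saves it as raw 16-bit little-endian PCM."""
    return [
        "ffmpeg",
        "-i", src,               # input audio/video file (wmv, mp4, flv, ...)
        "-vn",                   # drop the video stream, keep audio only
        "-acodec", "pcm_s16le",  # 16-bit signed little-endian PCM samples
        "-ar", str(rate),        # output sampling rate in Hz
        "-ac", "1",              # mix down to a single (mono) channel
        "-f", "s16le",           # raw PCM container, as used by the pipeline
        dst,
    ]

# hypothetical file names, for illustration only
cmd = ffmpeg_extract_pcm_cmd("program.mp4", "program.pcm", rate=16000)
```

The command list would then be handed to a process runner such as subprocess.run.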
Preferably, step S2 of this embodiment preprocesses the audio data. The preprocessing mainly includes cutting and fragmenting the data according to a standard suitable for speech recognition, performing speech/non-speech and wideband/narrowband discrimination and marking on the fragmented sentence files, and finally extracting feature text data containing the speech features. The step flow of the preprocessing operation is shown in Fig. 2 and specifically includes the following steps:
Step S21, cutting and fragmenting the audio data to generate several sentence files.
Because the audio data received is a relatively complete data block, it needs to be cut and fragmented to generate several small sentence files suitable for processing by the speech recognition system. The specific cutting process is as follows:
First, the audio data is parsed and the energy signal value of each audio sampling point is analyzed to find silent positions; in this embodiment, 50 frames of 200 sampling points each are taken as the silence threshold, and when this threshold is exceeded the position is judged to be a silent position. After the silent positions are found, the audio data is cut at them, i.e. fragmented, to generate discrete sentence files, and each sentence file is given a time mark; the resulting sentence files are saved in pcm format.
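A minimal sketch of this silence-based cutting, assuming mono PCM samples given as a plain Python sequence. The 200-sample frame length and the 50-frame silent run mirror the thresholds of this embodiment, while the energy threshold value itself is an assumption:

```python
def split_on_silence(samples, frame_len=200, min_silent_frames=50,
                     energy_thresh=1e-4):
    """Cut a PCM sample sequence into sentence segments at silent runs.

    A frame is 'silent' when its mean energy is below energy_thresh;
    a run of at least min_silent_frames silent frames is treated as a
    cutting position (the 50-frame / 200-sample threshold of S21).
    Returns (start_frame, end_frame) pairs for the non-silent segments.
    """
    n_frames = len(samples) // frame_len
    silent = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        silent.append(energy < energy_thresh)

    segments, seg_start, silent_run = [], None, 0
    for i, is_sil in enumerate(silent):
        if is_sil:
            silent_run += 1
            # close the current segment once the silent run is long enough
            if seg_start is not None and silent_run >= min_silent_frames:
                segments.append((seg_start, i - silent_run + 1))
                seg_start = None
        else:
            silent_run = 0
            if seg_start is None:
                seg_start = i
    if seg_start is not None:
        segments.append((seg_start, n_frames))
    return segments
```

Each returned frame range would then be written out as one pcm sentence file with its time mark.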
Step S22, performing non-speech filtering on the sentence files, leaving only speech sentence files.
Because step S21 only cuts the audio data at silent positions, the result still includes a large amount of non-speech content. This content provides no help and no positive effect for subsequent audio recognition; on the contrary, its presence increases the transmission and computing load of the speech recognition system and causes misrecognition. The generated sentence files therefore need to be filtered for non-speech, i.e. speech/non-speech discrimination is performed on the fragmented sentence files and only the speech sentence files are kept. This step is specifically as follows:
First, each fragmented sentence file is parsed and, according to a speech/non-speech classification model, speech/non-speech discrimination is performed on each sentence file by a classifier.
Secondly, according to the discrimination results, the non-speech sentence files are marked for deletion and the sentence time positions are recorded.
This embodiment employs an audio classification method based on a support vector machine (SVM): first, based on an energy threshold, short sentences are divided into silence and non-silence; then, by selecting effective and robust audio features, the non-silent signals are divided into four classes: speech (pure speech, non-pure speech) and non-speech (music, ambient sound). The method has very high classification accuracy and processing speed; the technical framework of this audio classification method is shown in Fig. 3.
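The two-stage decision can be sketched as follows. The linear decision function w·x + b is what a trained linear SVM evaluates at classification time, but the weights, the energy threshold and the feature extractor here are placeholders for the offline-trained model and the robust audio features of Fig. 3:

```python
def classify_segment(frames, energy_thresh, svm_w, svm_b, extract_features):
    """Two-stage audio classification sketch after the SVM method of S22.

    Stage 1: mean frame energy below energy_thresh -> 'silence'.
    Stage 2: a (pre-trained) linear SVM decision function w.x + b on the
    segment's feature vector separates 'speech' from 'non-speech'.
    svm_w / svm_b are assumed to come from offline training.
    """
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    if sum(energies) / len(energies) < energy_thresh:
        return "silence"
    x = extract_features(frames)
    score = sum(wi * xi for wi, xi in zip(svm_w, x)) + svm_b
    return "speech" if score > 0 else "non-speech"
```

A real implementation would further split the two coarse classes into the four subclasses (pure/non-pure speech, music, ambient sound) with additional classifiers.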
Step S23, performing wideband/narrowband discrimination on each speech sentence file, adding a wideband mark to speech sentence files determined to be wideband signals and a narrowband mark to those determined to be narrowband signals.
Wideband/narrowband discrimination is performed on each speech sentence in order to provide a reference for selecting which speech recognition model to use in subsequent speech recognition. This step is specifically as follows:
First, the speech sentence segments suitable for the speech recognition system that remain after filtering are analyzed one by one to determine whether each speech sentence is wideband (high sampling rate) or narrowband (low sampling rate), providing a reference for model selection in subsequent speech recognition.
Secondly, every speech sentence is given a wideband/narrowband mark, i.e. a wideband mark is added to the speech sentence files of wideband signals, and a narrowband mark is added to those of narrowband signals.
Specifically, in this embodiment the wideband/narrowband discrimination is made by analyzing the spectral energy values of the audio signal: when the spectral energy above 8 kHz is greater than 0.1, the audio signal is a wideband signal; when it is less than or equal to 0.1, the audio signal is a narrowband signal.
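Under the assumption that the discrimination compares the share of spectral energy above 8 kHz against the 0.1 threshold, the check can be sketched with an FFT (NumPy is assumed to be available):

```python
import numpy as np

def is_wideband(samples, sample_rate, cutoff_hz=8000.0, thresh=0.1):
    """Wideband/narrowband check sketch for S23: the fraction of spectral
    energy above cutoff_hz is compared against thresh. The 8 kHz cutoff
    and the 0.1 threshold follow this embodiment; treating the threshold
    as an energy *fraction* is an assumption."""
    spectrum = np.abs(np.fft.rfft(np.asarray(samples, dtype=float)))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    energy = spectrum ** 2
    total = energy.sum()
    if total == 0.0:
        return False  # no signal energy at all: treat as narrowband
    high_ratio = energy[freqs >= cutoff_hz].sum() / total
    return high_ratio > thresh
```

The boolean result would then be written into the sentence file's wideband/narrowband mark.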
Step S24, performing audio feature extraction on the speech sentence files marked as wideband or narrowband to obtain feature text data, where the feature text data contains the start and end times of the speech sentence, the speech feature information, the name of the audio/video file the sentence belongs to, and the corresponding wideband/narrowband mark.
To save network bandwidth resources, after the wideband/narrowband marks are added to the speech sentence files, audio feature extraction is also performed to convert the audio data into text feature data and reduce the volume of data transmitted over the network, specifically as follows:
First, the speech sentence files marked as wideband or narrowband are analyzed one by one to extract MFCC (Mel Frequency Cepstral Coefficient) and PLP (Perceptual Linear Prediction) speech features, the two kinds of speech features commonly used in the field of speech recognition.
Secondly, every extracted speech feature is given a time mark, so that the resulting feature text data contains the start and end times of the speech sentence, the name of the audio/video file it belongs to, and the corresponding wideband/narrowband mark.
It should be noted that this step not only converts the input speech signal into speech features that are relatively robust and discriminative, for distinguishing different speakers, but also performs a certain amount of normalization on top of the feature extraction. The normalization includes:
1) cepstral mean normalization (CMN), which mainly reduces channel effects;
2) cepstral variance normalization (CVN), which mainly reduces the effect of additive noise;
3) vocal tract length normalization (VTLN), which mainly reduces the effect of vocal tract differences;
4) Gaussianization, an extension of CMN+CVN;
5) noise-robust algorithms, which reduce the effect of background noise on system performance, using the AWF and VTS algorithms.
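The CMN and CVN steps above amount to per-utterance mean and variance normalization of each feature dimension, which can be sketched as:

```python
def cmn_cvn(features, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization (the CMN +
    CVN steps listed above), over a frames x dims feature matrix given
    as nested lists. Each dimension is shifted to zero mean and scaled
    to unit variance across the utterance, reducing channel effects and
    additive noise."""
    n = len(features)
    dims = len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(dims)]
    variances = [
        sum((f[d] - means[d]) ** 2 for f in features) / n for d in range(dims)
    ]
    return [
        [(f[d] - means[d]) / (variances[d] ** 0.5 + eps) for d in range(dims)]
        for f in features
    ]
```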
Preferably, in step S3 of this embodiment the feature text data is sent to the cloud server to enter the speech recognition flow. In this embodiment the cloud server call module uses the Web Service interface protocol, and the broadcast television task information to be recognized is sent to the server side for speech recognition in the form of an XML message. The XML message of a recognition task contains the following content:
1) the name of the broadcast television file to be recognized;
2) the list of fragmented sentence files;
3) the speech/non-speech mark of each sentence file;
4) the wideband/narrowband mark of each sentence file;
5) the speech feature text of each sentence file identified as speech;
6) the start and end time marks of each sentence file.
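A hypothetical shape for such a task message, built with Python's xml.etree.ElementTree; the tag names are invented for illustration, since the patent does not give the actual schema:

```python
import xml.etree.ElementTree as ET

def build_task_message(file_name, sentences):
    """Build a recognition-task XML message carrying the six fields
    listed above. sentences: list of dicts with keys 'id', 'is_speech',
    'band', 'features', 'start', 'end'. All tag names are illustrative."""
    task = ET.Element("RecognitionTask")
    ET.SubElement(task, "FileName").text = file_name
    sent_list = ET.SubElement(task, "SentenceList")
    for s in sentences:
        el = ET.SubElement(sent_list, "Sentence", id=str(s["id"]))
        ET.SubElement(el, "SpeechFlag").text = (
            "speech" if s["is_speech"] else "non-speech")
        ET.SubElement(el, "Band").text = s["band"]          # wideband/narrowband
        ET.SubElement(el, "Features").text = s["features"]  # feature text
        ET.SubElement(el, "StartTime").text = str(s["start"])
        ET.SubElement(el, "EndTime").text = str(s["end"])
    return ET.tostring(task, encoding="unicode")
```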
After receiving the recognition task, the cloud server performs recognition processing that includes: male/female voice recognition, speaker recognition, speech content recognition and punctuation mark recognition, generating a speech recognition result containing marks. This step is specifically as follows:
(1) the speech feature text corresponding to each speech sentence file to be recognized is sent one by one, in the form of XML (extensible markup language) messages, to the remote server used for broadcast television speech recognition processing; besides the speech feature text data, each XML message also contains the following information: the start and end times of the speech sentence file, the name of the broadcast television audio/video file it belongs to, and its wideband/narrowband mark;
(2) the speech recognition system in the cloud server is built on a cloud computing architecture; when the feature text of a speech sentence is sent to the broadcast television speech recognition cloud, a controller reasonably allocates computing resources for the recognition of this speech sentence file according to how the computing resources in the cloud servers are occupied;
(3) the speech recognition system calls the allocated computing resources to perform male/female voice recognition, speaker recognition, and speech content and punctuation mark recognition on the speech features. Male/female voice recognition classifies and marks each sentence by a classifier according to a male/female voice classification model; speaker recognition identifies and marks the speaker of each sentence according to a speaker model library; speech content recognition and punctuation mark recognition recognize the speech content of each sentence, mark the punctuation at the same time, and give each recognized word a time mark.
Preferably, the merging and structured text marking of the speech recognition results in step S4 of this embodiment specifically includes:
Step S41, collecting and aligning each speech recognition result and sorting them according to the start and end times they contain. Specifically: the recognition results of each speech sentence are merged and collated; for each sentence, the different recognition results (male/female voice recognition, speaker recognition, speech content and punctuation mark recognition) are aligned according to time points within the broadcast television audio/video file the sentence belongs to, and sorted by time.
Step S42, marking the sorted speech recognition results according to a structured format, including speaker gender marks, speaker marks, speech content, punctuation marks and timestamps. Specifically: the sorted recognition results are marked as text results according to a specific structured format, and the mark content includes the speaker gender of each sentence file, the speaker, the speech content of the sentence, the timestamp of each spoken word in the sentence, and the punctuation at the sentence breaks.
A structured speech recognition result is ultimately generated and fed back to the user in the form of an XML message, where the XML message contains the following content:
1) the name of the recognized broadcast television file;
2) the list of fragmented sentence files;
3) the speech/non-speech mark of each sentence file;
4) the wideband/narrowband mark of each sentence file;
5) the speech recognition result of each sentence file;
6) the speaker mark of each sentence file;
7) the male/female voice mark of each sentence file;
8) the start and end time marks of each sentence file.
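Steps S41 and S42 can be sketched as a merge of per-sentence result maps followed by a time sort; all field names here are illustrative, not taken from the patent:

```python
def merge_results(gender, speaker, content):
    """Sketch of S41/S42: merge per-sentence recognition results keyed
    by sentence id, sort by start time, and emit structured records.

    gender / speaker map sentence id to a label; content maps sentence
    id to (start, end, text, punctuation)."""
    merged = []
    for sid, (start, end, text, punct) in content.items():
        merged.append({
            "sentence_id": sid,
            "start": start,
            "end": end,
            "gender": gender[sid],    # from male/female voice recognition
            "speaker": speaker[sid],  # from speaker recognition
            "text": text + punct,     # content plus trailing punctuation
        })
    merged.sort(key=lambda r: r["start"])  # time-order the sentences
    return merged
```

The resulting records would then be serialized into the XML result message listed above.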
Preferably, to guarantee the accuracy of speech recognition, in this embodiment the recognition processing in step S3 is performed according to an acoustic model library and a language model library, where the language model library is constantly updated through the collection of and learning from network text. Network text is periodically collected from the Internet, and the language model library is optimized by periodically learning from the network text, specifically as follows:
1) network text is periodically crawled from the Internet: a web crawler periodically captures web links from the major search engines (such as Baidu, Google, Soso, Sogou, Soku, etc.) and the major broadcast-television-related portal websites (such as CNTV, the regional web platforms, Sina, Sohu, etc.), collecting popular vocabulary and web documents;
2) the collected web documents are segmented into words, and word frequency and word count statistics are compiled; the word segmentation results, popular network word collection results and statistical data are entered into the language model library of this speech recognition system for reference by each speech recognition module, achieving regular updating of the language model library and guaranteeing the accuracy of broadcast television speech recognition.
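The word-frequency side of this update step can be sketched with a running Counter; the whitespace tokenizer used below stands in for a real word segmenter:

```python
from collections import Counter

def update_language_stats(stats, documents, tokenize):
    """Sketch of the periodic language-model update: tokenize newly
    crawled documents and fold their word frequencies into the running
    statistics backing the language model library. The tokenize
    callable stands in for a real segmenter."""
    for doc in documents:
        stats.update(tokenize(doc))
    return stats

# hypothetical usage with a whitespace tokenizer
stats = Counter()
update_language_stats(stats, ["news of the day", "news update"], str.split)
```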
Based on the above, the specific flow by which this embodiment performs speech recognition on broadcast television data is shown in Fig. 4 and proceeds as follows:
First, the broadcast television data is received and sent to the pretreatment terminal for audio/video decoding, and the audio data is extracted from it. The audio is then cut and fragmented, and each fragmented sentence file undergoes speech/non-speech discrimination: if it is speech, processing continues to the next step; otherwise it is marked as non-speech and not processed further. The speech sentence files then undergo wideband/narrowband discrimination and speech feature extraction, after which the resulting feature text data is submitted to the speech recognition "cloud": it is sent to the cloud server as a recognition task in an XML message for speech recognition processing. The cloud service platform on the server side performs gender recognition, speaker recognition, speech content recognition, and punctuation recognition on the data, then fuses the recognition results and feeds them back to the service platform. At the same time, new network words, popular vocabulary, and the like learned from the network are used to regularly update the language model library of the cloud service platform, ensuring speech recognition accuracy. Finally, the cloud server feeds the recognition results — i.e. the structured speech recognition results — back to the user in XML format for further intelligent processing such as reference and retrieval.
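The flow above submits each recognition task to the cloud server as an XML message. The patent does not disclose the message schema, so every element name below (`RecognitionTask`, `SourceFile`, `BandMark`, etc.) is an invented placeholder for illustration only:

```python
import xml.etree.ElementTree as ET

def build_recognition_task(task_id: str, audio_file: str, start: float,
                           end: float, band: str) -> str:
    """Serialize one speech sentence as a hypothetical XML task message."""
    task = ET.Element("RecognitionTask", id=task_id)
    ET.SubElement(task, "SourceFile").text = audio_file
    ET.SubElement(task, "StartTime").text = f"{start:.2f}"
    ET.SubElement(task, "EndTime").text = f"{end:.2f}"
    ET.SubElement(task, "BandMark").text = band  # "wideband" or "narrowband"
    return ET.tostring(task, encoding="unicode")

msg = build_recognition_task("t001", "evening_news.mp4", 12.5, 17.8, "wideband")
```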
The recognition method provided by this embodiment improves on existing speech recognition methods based on cloud computing. It integrates broadcast television data pretreatment technology, gender recognition technology, speaker recognition technology, and radio and television speech recognition methods; after the speech data is pretreated, recognition processing is performed specifically for the data processing needs of the broadcast television industry, and the pretreatment results, gender recognition results, speaker recognition results, and speech recognition results of the broadcast television data are fused and given structured text marking to generate structured speech recognition results. This can provide basic data for the subsequent intelligent and automated processing of other broadcast television services, specifically including the following:
1) The speech recognition results and the timestamp marks on each recognized word can provide basic data for retrieval services over radio and television speech content;
2) The marked cutting time points of speech sentences, together with the wideband/narrowband discrimination results, can provide boundary time point references for splitting broadcast TV programs;
3) The recognition of speech content and punctuation in radio and television can provide content references for subtitle generation in broadcast TV programs;
4) The speaker recognition results for speech sentences and the wideband/narrowband discrimination results can provide a basis for host identification, guest identification, and speaking-scene recognition (indoor scene, outdoor scene, etc.) in broadcast TV programs.
In addition, the processing speed is increased, enabling the system to cope with speech recognition over massive data; and because the language model library is periodically learned and updated, the accuracy of speech recognition can be improved.
Embodiment two
Embodiment two of the present invention further provides a radio and television speech recognition system, whose composition is shown in Fig. 5. The system includes:
Extraction unit 10, which extracts audio data from broadcast television data;
Pretreatment terminal 20, which pretreats the audio data, obtains feature text data, and sends it to cloud server 30;
Cloud server 30, which performs recognition processing on the feature text data, obtains speech recognition results, fuses the speech recognition results with structured text marking, and generates structured speech recognition results.
Preferably, the composition of pretreatment terminal 20 in this embodiment is shown in Fig. 6 and specifically includes:
Cutting module 21, which cuts and fragments the audio data to generate a number of sentence files;
Non-speech filtering module 22, which filters non-speech out of the sentence files, leaving speech sentence files;
Wideband/narrowband discrimination module 23, which performs wideband/narrowband discrimination on each speech sentence file, adding a wideband mark to speech sentence files determined to be wideband signals and a narrowband mark to speech sentence files determined to be narrowband signals;
Audio feature extraction module 24, which performs audio feature extraction on the speech sentence files carrying the wideband or narrowband marks to obtain feature text data, where the feature text data contains the start and end times of the speech sentence, its voice feature information, the name of the audio/video file to which the sentence belongs, and the corresponding wideband/narrowband mark.
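The fields that audio feature extraction module 24 places into the feature text data, together with one crude wideband/narrowband discrimination heuristic, might be sketched as follows. The patent does not specify the discrimination algorithm; classifying by sampling rate (telephone-quality audio is sampled at 8 kHz) is a simplification assumed here, and `FeatureTextData` is an illustrative record, not the patented format:

```python
from dataclasses import dataclass, field

@dataclass
class FeatureTextData:
    source_file: str            # audio/video file the sentence belongs to
    start_time: float           # sentence start, in seconds
    end_time: float             # sentence end, in seconds
    band_mark: str              # "wideband" or "narrowband"
    features: list = field(default_factory=list)  # voice feature vectors

def band_mark(sample_rate_hz: int) -> str:
    """Crude discrimination heuristic: telephone-quality audio is sampled at
    8 kHz, so treat anything at or below that rate as narrowband."""
    return "narrowband" if sample_rate_hz <= 8000 else "wideband"

rec = FeatureTextData("news.mp4", 0.0, 4.5, band_mark(16000))
```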
Preferably, the composition of cloud server 30 in this embodiment is shown in Fig. 7 and specifically includes:
Gender recognition module 31, for performing male/female voice recognition on the feature text data.
Physiologically and psychologically, male and female speech differ markedly, for example in the fundamental frequency produced by the vocal cords, the formant frequencies produced by the oral structures (larynx and pharynx, tongue, palate, lips, teeth, etc.), and the volume and power of the exhaled airflow. The speech signal therefore carries the speaker's gender characteristics. In this embodiment, gender recognition (i.e. speaker sex identification) is built on Total Variability Modeling within the GMM-SVM (Gaussian Mixture Model - Support Vector Machine) hybrid framework. When training the space matrix, total variability modeling does not separate speaker space from channel space; both are represented by a single total space, which simplifies the mathematical representation of the space and greatly reduces the dependence on training data. The final gender decision is produced by fusing multiple systems.
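The patented gender recognizer uses GMM-SVM total variability modeling with multi-system fusion; as a much simpler stand-in for intuition only, the median fundamental frequency (F0) alone already separates typical male voices (roughly 85-180 Hz) from typical female voices (roughly 165-255 Hz). The threshold and function below are illustrative assumptions, not the patent's method:

```python
def classify_gender(f0_track_hz: list) -> str:
    """Toy gender classifier from a pitch track: threshold the median F0.
    A deliberately simplified stand-in for GMM-SVM total variability
    modeling; 165 Hz is an assumed, not patented, decision boundary."""
    voiced = sorted(f for f in f0_track_hz if f > 0)  # drop unvoiced frames
    median_f0 = voiced[len(voiced) // 2]
    return "male" if median_f0 < 165.0 else "female"

print(classify_gender([0, 120, 118, 125, 0, 119]))  # male
```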
Speaker recognition module 32, for performing speaker recognition on the feature text.
In this embodiment, speaker recognition is based on two classes of differences between speakers: first, inherent differences in vocal tract spectral characteristics, reflected in different distributions of acoustic features during pronunciation; second, differences in high-level features, i.e. traits formed over a lifetime of differing living environments and backgrounds, such as idiomatic expressions, prosody, and linguistic structure. Most mainstream speaker recognition systems in the world are based on these features, solving the speaker recognition problem through statistical modeling. Specifically, the speaker recognition system includes the following two modules:
A. Speaker modeling tool module: models speakers using discriminatively trained methods such as support vector machines (SVM), or statistical modeling methods such as Gaussian mixture models (GMM), capturing each speaker's feature space distribution characteristics in order to distinguish between different speakers.
B. Speaker discrimination algorithm module: matches the features of the input speech against the corresponding speaker models and determines the speaker identity of the input speech according to the matching degree.
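Module B above — matching input-speech features against enrolled speaker models and deciding identity by matching degree — can be illustrated with cosine scoring of feature vectors. Real systems score against GMM or SVM models as the text describes; the vector representation, threshold, and helper names here are simplifying assumptions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify_speaker(utterance_vec, enrolled, threshold=0.5):
    """Return the enrolled speaker whose model best matches the utterance,
    or None when no model exceeds the (assumed) acceptance threshold."""
    best_name, best_score = None, threshold
    for name, model_vec in enrolled.items():
        score = cosine(utterance_vec, model_vec)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

enrolled = {"host": [1.0, 0.0, 0.0], "guest": [0.0, 1.0, 0.0]}
print(identify_speaker([0.9, 0.1, 0.0], enrolled))  # host
```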
Voice content and punctuation recognition module 33, for performing speech content recognition and punctuation recognition on the feature text and generating speech recognition results containing marks.
The module comprises four components: the acoustic model library, the language model library, search and decoding, and punctuation generation. Its workflow is shown in Fig. 8. After the speech features are input, the search and decoding module selects and calls the intelligently learned acoustic model library and language model library according to whether the features correspond to a wideband or a narrowband signal, and recognizes the speech content; the recognized text (sentences) is then fed into the punctuation generation module for punctuation recognition, finally producing speech recognition results carrying punctuation marks.
The recognition technologies adopted by the four components are introduced as follows:
A. Acoustic model library: this embodiment uses an acoustic model library based on CD-DNN-HMM (context-dependent deep neural network hidden Markov models), which achieves higher recognition accuracy than traditional acoustic model libraries based on GMM-HMM (Gaussian mixture model hidden Markov models).
B. Language model library: this embodiment uses an N-Gram language model. The model rests on the assumption that the occurrence of the N-th word depends only on the preceding N-1 words and is independent of any other word, so the probability of a whole sentence is simply the product of the occurrence probabilities of its words. These probabilities can be obtained by directly counting how often each sequence of N words occurs in the corpus. The N-Gram language model is simple and effective and is widely used in the speech recognition industry.
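The N-Gram assumption described above can be made concrete with a minimal bigram (N=2) model estimated from raw counts. This toy uses whitespace-split English tokens and no smoothing, both simplifications over any production language model library:

```python
from collections import Counter

class BigramModel:
    """Minimal N-Gram model with N=2: P(sentence) is the product of
    P(word_i | word_{i-1}), each estimated directly from corpus counts."""

    def __init__(self, corpus):
        self.bigrams = Counter()
        self.unigrams = Counter()
        for sentence in corpus:
            words = ["<s>"] + sentence.split()
            for prev, cur in zip(words, words[1:]):
                self.bigrams[(prev, cur)] += 1
                self.unigrams[prev] += 1

    def sentence_prob(self, sentence):
        words = ["<s>"] + sentence.split()
        prob = 1.0
        for prev, cur in zip(words, words[1:]):
            if self.unigrams[prev] == 0:
                return 0.0
            prob *= self.bigrams[(prev, cur)] / self.unigrams[prev]
        return prob

lm = BigramModel(["the news starts", "the news ends"])
print(lm.sentence_prob("the news starts"))  # 0.5
```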
C. Search and decoding: this embodiment uses dynamic programming methods such as the Viterbi search algorithm to find the optimal result given the models. At each state on each time point, the dynamic-programming Viterbi algorithm computes the posterior probability of the decoded state sequence given the observation sequence, retains the maximum-probability path, and records the corresponding state information at each node so that the word decoding sequence can finally be recovered by backtracking. Without sacrificing the optimal solution, the Viterbi algorithm simultaneously solves, in continuous speech recognition, the nonlinear time alignment between the HMM state sequence and the acoustic observation sequence, word boundary detection, and word recognition; it is also the basic strategy of conventional speech recognition search.
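A minimal dynamic-programming Viterbi decoder over a discrete HMM, illustrating the keep-the-best-path-and-backtrack strategy described above (real speech decoders operate over far larger state graphs with beam pruning; the toy HMM numbers below are illustrative only):

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Per time step, keep for each state the best-path probability plus a
    backpointer; recover the optimal state sequence by backtracking."""
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p][0] * trans_p[p][s])
            V[t][s] = (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                       prev)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

# Toy HMM with made-up numbers, just to exercise the decoder.
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
best_path = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
print(best_path)  # ['Sunny', 'Rainy', 'Rainy']
```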
D. Punctuation generation: this embodiment adopts a method that adds end-of-sentence punctuation to Chinese spoken sentences using plain-text information. The method models the relationship between global lexical information and punctuation from different granularity perspectives of the sentence, and uses a multilayer perceptron to fuse the punctuation models obtained at the different granularities, thereby generating punctuation (full stops, question marks, and exclamation marks).
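The patent's punctuation generator fuses multi-granularity models with a multilayer perceptron; as a deliberately simple stand-in, sentence-final punctuation can be chosen from lexical cues alone. The cue lists and rules below are invented illustrations, not the patented model:

```python
# Toy lexical cues (assumed, not from the patent): Chinese question
# particles/words, and common exclamatory intensifiers.
QUESTION_CUES = {"吗", "呢", "什么", "为什么", "怎么", "谁", "哪"}
EXCLAIM_CUES = {"太", "真", "啊"}

def add_sentence_punctuation(sentence_words):
    """Rule-based stand-in for the MLP punctuation model: pick a full stop,
    question mark, or exclamation mark from lexical cues in the sentence."""
    words = set(sentence_words)
    if words & QUESTION_CUES:
        return "".join(sentence_words) + "？"
    if words & EXCLAIM_CUES:
        return "".join(sentence_words) + "！"
    return "".join(sentence_words) + "。"
```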
Recognition result processing module 34, which fuses the speech recognition results with structured text marking and generates structured speech recognition results. In this embodiment, recognition result processing module 34 first collects and merges the speech recognition results of each speech sentence file in the broadcast television data (with punctuation marks, and with a timestamp on each recognized word).
Preferably, recognition result processing module 34 in this embodiment further includes:
Collection and ordering module, for collecting and aligning each speech recognition result and sorting the results by the start and end times they contain;
Marking module, for marking the sorted speech recognition results according to a structured format, including speaker gender marks, speaker marks, speech content, punctuation marks, and timestamps.
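The two sub-modules above — collect and sort by start time, then mark in a structured format — can be sketched together as follows. The XML element and attribute names are invented for illustration; the patent does not publish its structured format:

```python
import xml.etree.ElementTree as ET

def build_structured_result(segments):
    """Sort per-sentence recognition results by start time and emit them in a
    hypothetical structured XML format carrying speaker, gender, and times."""
    root = ET.Element("SpeechRecognitionResult")
    for seg in sorted(segments, key=lambda s: s["start"]):
        el = ET.SubElement(root, "Sentence",
                           start=f"{seg['start']:.2f}",
                           end=f"{seg['end']:.2f}",
                           speaker=seg["speaker"],
                           gender=seg["gender"])
        el.text = seg["text"]
    return ET.tostring(root, encoding="unicode")

result = build_structured_result([
    {"start": 5.0, "end": 9.2, "speaker": "spk2",
     "gender": "female", "text": "Thank you."},
    {"start": 0.0, "end": 4.8, "speaker": "spk1",
     "gender": "male", "text": "Good evening."},
])
```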
Preferably, cloud server 30 in this embodiment also includes: language model intelligent learning module 35, for periodically crawling network text and regularly updating the language model library by learning from it; during recognition processing, recognition is performed against the regularly updated language model library to guarantee the accuracy of speech recognition.
Cloud server 30 in this embodiment is implemented on a speech recognition cloud service platform 36. Specifically, the speech recognition cloud service platform is built on a cloud service platform framework that combines ICE with SOA: distributed computation is handled by the ICE framework, while cloud services are provided externally through the SOA framework, completing Web Service-based recognition tasks and the communication of recognition results.
In the service platform of this embodiment, the various recognition modules (i.e. gender recognition module 31, speaker recognition module 32, voice content and punctuation recognition module 33, and recognition result processing module 34) are encapsulated as plug-ins, forming standard cloud services configured in the framework and becoming part of the cloud service platform. The recognition modules can be added to and unloaded from the platform easily, without affecting normal system operation; when the volume of data to be recognized grows, the cloud service platform adaptively adds recognition modules to complete massive radio and television speech recognition tasks.
The cloud service platform framework is shown in Fig. 9. After the broadcast television data completes pretreatment, the speech recognition task is passed to the control unit as an XML task message via the data access interface. Based on the state of the current computing resources (collected by the monitoring unit) — mainly CPU, memory, and network state — combined with the task execution state of the recognition nodes, task priorities, and prior knowledge of execution efficiency, the control unit dynamically decides on and allocates the optimal computing resources to execute the recognition task.
In summary, the recognition system provided by this embodiment integrates broadcast television data pretreatment technology, gender recognition technology, speaker recognition technology, and radio and television speech recognition methods; after the speech data is pretreated, recognition processing is performed specifically for the data processing needs of the broadcast television industry, and the pretreatment results, gender recognition results, speaker recognition results, and speech recognition results of the broadcast television data are fused and given structured text marking to generate structured speech recognition results, which can provide basic data for the subsequent intelligent and automated processing of other broadcast television services. Moreover, because the fragmented speech data is processed in parallel, the processing speed is increased and the system can cope with speech recognition over massive data; and because the language model library periodically undergoes intelligent learning and updating, the accuracy of speech recognition can be improved.
The above embodiments merely illustrate the present invention and do not limit it. Those of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical schemes therefore fall within the scope of the present invention, whose patent protection shall be defined by the claims.
Claims (2)
1. A radio and television speech recognition method, characterized by comprising:
S1, extracting audio data from broadcast television data;
S2, pretreating said audio data to obtain feature text data; wherein the pretreatment of said audio data in step S2 specifically includes:
S21, cutting and fragmenting said audio data to generate a number of sentence files;
S22, performing non-speech filtering on said sentence files, leaving speech sentence files;
S23, performing wideband/narrowband discrimination on each speech sentence file, adding a wideband mark to speech sentence files determined to be wideband signals and a narrowband mark to speech sentence files determined to be narrowband signals;
S24, performing audio feature extraction on the speech sentence files carrying the wideband or narrowband marks to obtain feature text data, wherein said feature text data contains the start and end times of the speech sentence, its voice feature information, the name of the audio/video file to which the sentence belongs, and the corresponding wideband/narrowband mark;
S3, sending said feature text data to a cloud server for recognition processing to obtain gender recognition, speaker recognition, and speech recognition results; in step S3, the recognition processing performed by the cloud server on said feature text data includes gender recognition, speaker recognition, speech content recognition, and punctuation recognition, generating speech recognition results containing marks; and the recognition processing of step S3 is performed against a language model library that is continuously updated through network text collection and network text learning; the updating of said language model library includes:
S31, periodically crawling network text from the Internet;
S32, segmenting the collected web documents into words and counting word frequencies and word counts; entering the segmentation results, the collected network hot words, and the statistics into the language model library of the speech recognition system for reference by each speech recognition module, thereby achieving regular updating of the language model library and ensuring the accuracy of radio and television speech recognition;
S4, fusing said data pretreatment, gender recognition, speaker recognition, and speech recognition results and performing structured text marking to generate structured speech recognition results; wherein the fusing and structured text marking of the speech recognition results in step S4 specifically includes:
S41, collecting and aligning each speech recognition result and sorting the results by the start and end times they contain;
S42, marking the sorted speech recognition results according to a structured format, including speaker gender marks, speaker marks, speech content, punctuation marks, and timestamps.
2. A radio and television speech recognition system, characterized in that the system includes:
an extraction unit, which extracts audio data from broadcast television data;
a pretreatment terminal, which pretreats said audio data, obtains feature text data, and sends it to a cloud server; said pretreatment terminal includes:
a cutting module, which cuts and fragments said audio data to generate a number of sentence files;
a non-speech filtering module, which performs non-speech filtering on said sentence files, leaving speech sentence files;
a wideband/narrowband discrimination module, which performs wideband/narrowband discrimination on each speech sentence file, adding a wideband mark to speech sentence files determined to be wideband signals and a narrowband mark to speech sentence files determined to be narrowband signals;
an audio feature extraction module, which performs audio feature extraction on the speech sentence files carrying the wideband or narrowband marks to obtain feature text data, wherein said feature text data contains the start and end times of the speech sentence, the name of the audio/video file to which it belongs, and the corresponding wideband/narrowband mark;
a cloud server, which performs recognition processing on said feature text data, obtains speech recognition results, fuses the speech recognition results with structured text marking, and generates structured speech recognition results; said cloud server includes:
a gender recognition module, for performing male/female voice recognition on said feature text data;
a speaker recognition module, for performing speaker recognition on said feature text;
a voice content and punctuation recognition module, for performing speech content recognition and punctuation recognition on said feature text and generating speech recognition results containing punctuation marks;
a recognition result processing module, which fuses said speech recognition results with structured text marking and generates structured speech recognition results; said recognition result processing module further includes:
a collection and ordering module, for collecting and aligning each speech recognition result and sorting the results by the start and end times they contain;
a marking module, for marking the sorted speech recognition results according to a structured format, including speaker gender marks, speaker marks, speech content, punctuation marks, and timestamps;
said cloud server also includes: a language model intelligent learning module, for periodically crawling network text, regularly updating the language model library by learning from the network text, and performing recognition against the regularly updated language model library during recognition processing; said language model intelligent learning module is configured to perform the following steps:
S31, periodically crawling network text from the Internet;
S32, segmenting the collected web documents into words and counting word frequencies and word counts; entering the segmentation results, the collected network hot words, and the statistics into the language model library of the speech recognition system for reference by each speech recognition module, thereby achieving regular updating of the language model library and ensuring the accuracy of radio and television speech recognition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310648375.4A CN103700370B (en) | 2013-12-04 | 2013-12-04 | A kind of radio and television speech recognition system method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103700370A CN103700370A (en) | 2014-04-02 |
CN103700370B true CN103700370B (en) | 2016-08-17 |
Family
ID=50361876
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310648375.4A Active CN103700370B (en) | 2013-12-04 | 2013-12-04 | A kind of radio and television speech recognition system method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103700370B (en) |
Families Citing this family (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105895104B (en) * | 2014-05-04 | 2019-09-03 | 讯飞智元信息科技有限公司 | Speaker adaptation recognition methods and system |
CN104469616A (en) * | 2014-12-17 | 2015-03-25 | 天脉聚源(北京)教育科技有限公司 | Method and device for transmitting sound signals of intelligent teaching system |
CN104751847A (en) * | 2015-03-31 | 2015-07-01 | 刘畅 | Data acquisition method and system based on overprint recognition |
CN106162319A (en) * | 2015-04-20 | 2016-11-23 | 中兴通讯股份有限公司 | A kind of method and device of Voice command electronic programming |
CN104936020B (en) * | 2015-06-25 | 2019-04-02 | 四川迪佳通电子有限公司 | Far field pickup remote control method and system based on set-top box |
CN104994400A (en) * | 2015-07-06 | 2015-10-21 | 无锡天脉聚源传媒科技有限公司 | Method and device for indexing video by means of acquisition of host name |
CN106683661B (en) * | 2015-11-05 | 2021-02-05 | 阿里巴巴集团控股有限公司 | Role separation method and device based on voice |
CN105895102A (en) * | 2015-11-15 | 2016-08-24 | 乐视移动智能信息技术(北京)有限公司 | Recording editing method and recording device |
CN105679319B (en) * | 2015-12-29 | 2019-09-03 | 百度在线网络技术(北京)有限公司 | Voice recognition processing method and device |
CN105957517A (en) * | 2016-04-29 | 2016-09-21 | 中国南方电网有限责任公司电网技术研究中心 | Voice data structural transformation method based on open source API and system thereof |
CN105897360B (en) * | 2016-05-18 | 2018-12-11 | 国家新闻出版广电总局监管中心 | A kind of broadcasting-quality and effect method of discrimination and system |
CN106100777B (en) * | 2016-05-27 | 2018-08-17 | 西华大学 | Broadcast support method based on speech recognition technology |
CN108062954B (en) * | 2016-11-08 | 2020-12-08 | 科大讯飞股份有限公司 | Speech recognition method and device |
CN106649643B (en) * | 2016-12-08 | 2019-10-22 | 腾讯音乐娱乐(深圳)有限公司 | A kind of audio data processing method and its device |
CN106953887B (en) * | 2017-01-05 | 2020-04-24 | 北京中瑞鸿程科技开发有限公司 | Fine-grained radio station audio content personalized organization recommendation method |
CN106877955B (en) * | 2017-03-29 | 2018-12-25 | 西华大学 | Fm broadcast signal based on hidden Markov model gives the correct time characteristic recognition method |
CN106971721A (en) * | 2017-03-29 | 2017-07-21 | 沃航(武汉)科技有限公司 | A kind of accent speech recognition system based on embedded mobile device |
CN107203616A (en) * | 2017-05-24 | 2017-09-26 | 苏州百智通信息技术有限公司 | The mask method and device of video file |
CN107291676B (en) * | 2017-06-20 | 2021-11-19 | 广东小天才科技有限公司 | Method for cutting off voice file, terminal equipment and computer storage medium |
CN108648758B (en) * | 2018-03-12 | 2020-09-01 | 北京云知声信息技术有限公司 | Method and system for separating invalid voice in medical scene |
US10825458B2 (en) * | 2018-10-31 | 2020-11-03 | Rev.com, Inc. | Systems and methods for a two pass diarization, automatic speech recognition, and transcript generation |
KR20200063290A (en) * | 2018-11-16 | 2020-06-05 | 삼성전자주식회사 | Electronic apparatus for recognizing an audio scene and method for the same |
US20200220869A1 (en) * | 2019-01-08 | 2020-07-09 | Fidelity Information Services, Llc | Systems and methods for contactless authentication using voice recognition |
CN110110294B (en) * | 2019-03-26 | 2021-02-02 | 北京捷通华声科技股份有限公司 | Dynamic reverse decoding method, device and readable storage medium |
CN110580907B (en) * | 2019-08-28 | 2021-09-24 | 云知声智能科技股份有限公司 | Voice recognition method and system for multi-person speaking scene |
CN110910863B (en) * | 2019-11-29 | 2023-01-31 | 上海依图信息技术有限公司 | Method, device and equipment for extracting audio segment from audio file and storage medium |
CN112037792B (en) * | 2020-08-20 | 2022-06-17 | 北京字节跳动网络技术有限公司 | Voice recognition method and device, electronic equipment and storage medium |
CN112185357A (en) * | 2020-12-02 | 2021-01-05 | 成都启英泰伦科技有限公司 | Device and method for simultaneously recognizing human voice and non-human voice |
CN112818906B (en) * | 2021-02-22 | 2023-07-11 | 浙江传媒学院 | Intelligent cataloging method of all-media news based on multi-mode information fusion understanding |
CN113470652A (en) * | 2021-06-30 | 2021-10-01 | 山东恒远智能科技有限公司 | Voice recognition and processing method based on industrial Internet |
CN113593577A (en) * | 2021-09-06 | 2021-11-02 | 四川易海天科技有限公司 | Vehicle-mounted artificial intelligence voice interaction system based on big data |
CN113825009A (en) * | 2021-10-29 | 2021-12-21 | 平安国际智慧城市科技股份有限公司 | Audio and video playing method and device, electronic equipment and storage medium |
CN115456150B (en) * | 2022-10-18 | 2023-05-16 | 北京鼎成智造科技有限公司 | Reinforced learning model construction method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0952737A2 (en) * | 1998-04-21 | 1999-10-27 | International Business Machines Corporation | System and method for identifying and selecting portions of information streams for a television system |
CN101539929A (en) * | 2009-04-17 | 2009-09-23 | 无锡天脉聚源传媒科技有限公司 | Method for indexing TV news by utilizing computer system |
CN101924863A (en) * | 2010-05-21 | 2010-12-22 | 中山大学 | Digital television equipment |
CN103200463A (en) * | 2013-03-27 | 2013-07-10 | 天脉聚源(北京)传媒科技有限公司 | Method and device for generating video summary |
CN103413557A (en) * | 2013-07-08 | 2013-11-27 | 深圳Tcl新技术有限公司 | Voice signal bandwidth expansion method and device thereof |
2013-12-04: Application filed in China as CN201310648375.4A; granted as CN103700370B (status: Active).
Also Published As
Publication number | Publication date |
---|---|
CN103700370A (en) | 2014-04-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103700370B (en) | A kind of radio and television speech recognition system method and system | |
CN110364171B (en) | Voice recognition method, voice recognition system and storage medium | |
US11823678B2 (en) | Proactive command framework | |
CN108735201B (en) | Continuous speech recognition method, device, equipment and storage medium | |
Theodorou et al. | An overview of automatic audio segmentation | |
CN108428446A (en) | Audio recognition method and device | |
US20240021202A1 (en) | Method and apparatus for recognizing voice, electronic device and medium | |
CN103500579B (en) | Audio recognition method, Apparatus and system | |
US11276403B2 (en) | Natural language speech processing application selection | |
WO2022178969A1 (en) | Voice conversation data processing method and apparatus, and computer device and storage medium | |
CN112735383A (en) | Voice signal processing method, device, equipment and storage medium | |
CN109976702A (en) | A kind of audio recognition method, device and terminal | |
WO2016119604A1 (en) | Voice information search method and apparatus, and server | |
CN111489765A (en) | Telephone traffic service quality inspection method based on intelligent voice technology | |
CN103871424A (en) | Online speaking people cluster analysis method based on bayesian information criterion | |
CN111462758A (en) | Method, device and equipment for intelligent conference role classification and storage medium | |
CN111489743A (en) | Operation management analysis system based on intelligent voice technology | |
CN112669842A (en) | Man-machine conversation control method, device, computer equipment and storage medium | |
Kaushik et al. | Automatic audio sentiment extraction using keyword spotting. | |
CN105957517A (en) | Voice data structural transformation method based on open source API and system thereof | |
WO2023272616A1 (en) | Text understanding method and system, terminal device, and storage medium | |
CN112231440A (en) | Voice search method based on artificial intelligence | |
US10929601B1 (en) | Question answering for a multi-modal system | |
CN103247316A (en) | Method and system for constructing index in voice frequency retrieval | |
CN112201225B (en) | Corpus acquisition method and device, readable storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C53 | Correction of patent for invention or patent application | ||
CB03 | Change of inventor or designer information |
Inventor after: Chen Xinwei Inventor before: Chen Xinwei Inventor before: Xu Bo |
|
COR | Change of bibliographic data |
Free format text: CORRECT: INVENTOR; FROM: CHEN XINWEI XU BO TO: CHEN XINWEI |
|
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |