CN106782513B - Speech recognition realization method and system based on confidence level - Google Patents


Info

Publication number
CN106782513B
CN106782513B (granted publication of application CN201710060942.2A)
Authority
CN
China
Prior art keywords
speech recognition
phoneme
probability
information
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710060942.2A
Other languages
Chinese (zh)
Other versions
CN106782513A (en
Inventor
俞凯 (Kai Yu)
陈哲怀 (Zhehuai Chen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sipic Technology Co Ltd
Original Assignee
Shanghai Jiaotong University
Suzhou Speech Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University, Suzhou Speech Information Technology Co Ltd filed Critical Shanghai Jiaotong University
Priority to CN201710060942.2A priority Critical patent/CN106782513B/en
Publication of CN106782513A publication Critical patent/CN106782513A/en
Application granted granted Critical
Publication of CN106782513B publication Critical patent/CN106782513B/en
Status: Active


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063: Training
    • G10L 15/20: Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L 15/28: Constructional details of speech recognition systems
    • G10L 15/30: Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L 15/32: Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems

Abstract

A confidence-based speech recognition method and system. Decoding information obtained by phone-synchronous-decoding speech recognition of user speech is used to generate a phone-synchronous word-lattice acoustic structure, and a confusion network is generated from this lattice structure to model the competition between candidate recognition results, i.e. the confusion-network competition probability. At the same time, a full search space for speech recognition is built with an auxiliary search network based on a language model so that a complete, error-free full-search-space probability can be computed: combined with the phone-synchronous-decoding recognition, the search process over the generated full search space is recorded, and paths are backtracked through the entire search history to obtain the full-search-space probability. Finally, the confusion-network competition probability and the full-search-space probability are fused to obtain the judgment of the speech recognition result. On the one hand, the invention can attach an accurate confidence to each recognition result, improving the user experience of speech recognition; on the other hand, it can greatly reduce the computation and memory consumed by the speech recognition confidence algorithm.

Description

Speech recognition realization method and system based on confidence level
Technical field
The present invention relates to an accurate and efficient confidence measure (CM) technique applied to speech recognition, and specifically to a speech recognition method and system based on phone synchronous decoding, word lattices and confusion networks, and an auxiliary search space.
Background technique
Speech recognition is an artificial intelligence technology that lets a machine convert a voice signal into the corresponding text or command through a process of identification and understanding. Existing speech recognition technology still cannot be completely correct. Confidence is a technique by which a speech recognition system judges the reliability of its own recognition results, usually expressed as a reliability score or probability value attached to each result.
Traditional speech recognition confidence techniques mainly comprise predictor-feature-based confidence (Predictor-features-based CM) and posterior-probability-based confidence (Posterior-based CM). Their drawbacks include the following: multiple predictor features are often not statistically independent of one another; combining several predictor features requires an additional model-training stage, which hinders application across scenarios; and a speech recognition system is optimized to produce the correct text, so it is difficult for it to supply an accurate posterior probability. Concretely, filler-based posterior methods are both inaccurate and in need of an additional model-training stage, while lattice-based posterior methods do not construct a complete search space.
Summary of the invention
The present invention addresses the following defects of the prior art: the characterization of competing results in the decoding space is incomplete, so the resulting confidence is inaccurate; each model of the recognizer must be retrained, adding a large amount of extra processing; and building the decoding space is computationally expensive, which increases recognition latency and degrades the user experience. It therefore proposes a confidence-based speech recognition method and system that, on the one hand, can attach an accurate confidence to recognition results, improving the user experience, and, on the other hand, can greatly reduce the computation and memory consumed by the speech recognition confidence algorithm.
The present invention is achieved by the following technical solutions:
The present invention relates to a confidence-based speech recognition method. Decoding information obtained by phone-synchronous-decoding speech recognition of user speech is used to generate a phone-synchronous word-lattice acoustic structure, and a confusion network is generated from this structure to model the competition between candidate recognition results, i.e. the confusion-network competition probability. At the same time, a full search space for speech recognition is built with an auxiliary search network based on a language model, and the complete, error-free full-search-space probability is computed: combined with the phone-synchronous-decoding recognition, the search process over the generated full search space is recorded, and paths are backtracked through the entire search history to obtain the full-search-space probability. Finally, the confusion-network competition probability and the full-search-space probability are fused to obtain the judgment of the speech recognition result.
Technical effect
Compared with the prior art, the speech recognition confidence technique proposed by the present invention, based on phone synchronous decoding, word lattices and confusion networks, and an auxiliary search space, differs from conventional methods mainly as follows:
Word-lattice generation: conventional methods use frame-synchronous decoding, while the present invention uses phone synchronous decoding, giving a more accurate and efficient generation process.
Full-search-space construction: conventional methods are based on fillers or word lattices, while the present invention uses an auxiliary search space, so the constructed search space is more complete.
Confidence computation: conventional methods use the lattice posterior probability, while the present invention uses the confusion-network competition probability, so the recognition information is more accurate.
Description of the drawings
Fig. 1 is a schematic diagram of the system of the present invention;
Fig. 2 is a schematic diagram of the probability output in the embodiment;
In the figure: the vertical axis is the probability value, and the horizontal axis is time;
Fig. 3 is a schematic diagram of phone-synchronous-decoding speech recognition in the present invention;
Fig. 4 is a schematic diagram of the phone-synchronous word-lattice acoustic structure;
Fig. 5 is a schematic diagram of a confusion network;
Fig. 6 is a schematic diagram of the generation of the auxiliary search network;
Fig. 7 is a schematic diagram of the confidence decision module.
Specific embodiments
As shown in Fig. 1, the system of this embodiment comprises: a speech recognition module, a word-lattice generation module, a confusion-network competition-probability computation module, a full-search-space probability computation module, and a confidence decision module, in which: the phone-synchronous-decoding speech recognition module is connected to the word-lattice generation module and transmits complete phone information; the phone-synchronous word-lattice generation module builds a compact, lossless acoustic-information representation and outputs it to the confusion-network competition-probability computation module; the confusion-network competition-probability computation module extracts the competition probabilities in the phone lattice; the full-search-space probability computation module builds an auxiliary search space from the phone information and from it obtains the full-search-space probability; and the confidence decision module fuses the full-search-space probability with the competition probabilities to obtain a confidence used as the final judgment of whether the recognition result is correct.
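The data flow between the five modules above can be sketched end to end as follows; every internal computation is stubbed with toy values (all method names and numbers are illustrative assumptions, not taken from the patent), so only the wiring of Fig. 1 is shown.

```python
# End-to-end wiring of the five modules of Fig. 1 (sketch only: each
# module body is a stub returning toy values; real implementations
# would decode audio, build WFST lattices, etc.).

class ConfidenceASR:
    def decode(self, audio):             # speech recognition module (PSD)
        return {"phones": ["h", "ae", "v"], "text": "HAVE"}

    def make_lattice(self, dec):         # word-lattice generation module
        return {"arcs": dec["phones"]}

    def cn_competition_prob(self, lat):  # confusion-network competition module
        return 0.8                       # toy competition probability

    def full_space_prob(self, dec):      # full-search-space probability module
        return 0.6                       # toy full-search-space probability

    def run(self, audio, lam=0.5, threshold=0.6):
        """Confidence decision module: fuse both probabilities and judge."""
        dec = self.decode(audio)
        lat = self.make_lattice(dec)
        conf = (lam * self.cn_competition_prob(lat)
                + (1.0 - lam) * self.full_space_prob(dec))
        return dec["text"], conf, conf >= threshold

text, conf, accepted = ConfidenceASR().run(audio=None)
print(text, round(conf, 3), accepted)
```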
The present invention also relates to a speech recognition method using the above system, comprising the following steps:
Step 1) As shown in Fig. 3, perform phone-synchronous-decoding speech recognition on the user speech frame by frame to obtain decoding information, specifically:
1.1 establish a connectionist temporal classification (CTC) model, so that acoustic modeling is more accurate;
1.2 model the CTC targets with a neural network, so that the probability output distribution has the characteristic unimodal spikes;
1.3 during decoding, perform the linguistic network search to obtain decoding information only when a non-blank model output appears; otherwise discard the acoustic information of the current frame and move to the next frame.
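A minimal sketch of the gating rule in step 1.3, assuming a CTC-style acoustic model whose output index 0 is the blank (the index and the 0.5 threshold are illustrative assumptions): only frames carrying a non-blank spike reach the linguistic search, and blank-dominated frames are discarded.

```python
# Phone-synchronous frame gating (sketch): CTC posteriors are sharply
# peaked, so most frames are dominated by the blank symbol and can be
# skipped without ever touching the linguistic search network.

BLANK = 0  # assumed index of the CTC blank output

def phone_synchronous_frames(frame_posteriors, blank=BLANK, threshold=0.5):
    """Yield (frame_index, posteriors) only for frames with a non-blank
    spike; blank-dominated frames are dropped, as in step 1.3."""
    for t, post in enumerate(frame_posteriors):
        if post[blank] < threshold:   # a non-blank phone is active
            yield t, post             # hand this frame to the search
        # otherwise: discard this frame's acoustic info, go to the next

# Toy posteriors over {blank, /h/, /ae/}: only frames 1 and 3 spike.
posts = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.95, 0.02, 0.03],
    [0.20, 0.10, 0.70],
]
kept = [t for t, _ in phone_synchronous_frames(posts)]
print(kept)
```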
Step 2) Generate the phone-synchronous word-lattice acoustic structure from the decoding information obtained in step 1, specifically:
2.1 after each frame of acoustic features is input, the CTC model gives the occurrence probability of each phone in that frame.
The acoustic feature information consists of multiple physical features extracted from the speech.
2.2 if the current frame is a non-blank model frame, a weighted finite-state machine adapted to the acoustic modeling information is used to perform a linguistic search on the frame's acoustic features, and the resulting phone information is stored in the form of a weighted finite-state machine; otherwise the frame is discarded. Finally, the word-lattice acoustic structure is obtained by a merging process.
As shown in Fig. 4, the word-lattice acoustic structure is a phone-synchronous word lattice represented as a weighted finite-state machine. This synchronous lattice needs no pruning and is already a very compact phone-level lattice, with a compression ratio of 80% relative to the prior art. The phone-synchronous lattice is formed by connecting, pairwise, all candidate acoustic model outputs between two adjacent model output times.
Compared with the conventional method (frame-synchronous decoding), this structure reduces the theoretical search space by 90%, with a theoretical network compression ratio approaching 100:1, so that the resulting speech recognition information is accurate and efficient.
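The pairwise connection between adjacent model output times can be sketched as follows (the list-based data structures and toy scores are assumptions; the patent stores the lattice as a weighted finite-state machine):

```python
from itertools import product

def build_phone_lattice(spikes):
    """spikes: list of (time, [(phone, log_score), ...]) at the model's
    non-blank output times. Returns arcs connecting every candidate at
    one output time to every candidate at the next, i.e. the pairwise
    connection that forms the phone-synchronous lattice (sketch)."""
    arcs = []
    for (t0, cands0), (t1, cands1) in zip(spikes, spikes[1:]):
        for (p0, s0), (p1, s1) in product(cands0, cands1):
            arcs.append(((t0, p0), (t1, p1), s0 + s1))  # summed log-scores
    return arcs

# Three spike times with their candidate phones and toy log-scores.
spikes = [
    (3, [("h", -0.2), ("m", -1.6)]),
    (7, [("ae", -0.1)]),
    (12, [("v", -0.3), ("f", -1.2)]),
]
arcs = build_phone_lattice(spikes)
print(len(arcs))  # 2x1 + 1x2 arcs between adjacent spikes
```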
Step 3) Generate a confusion network from the word-lattice acoustic structure to model the competition between candidate recognition results, i.e. the competition probability of the confusion network, specifically:
3.1 generate the confusion-network cluster marks along the best decoding path;
3.2 cluster the time boundaries and phone information of each candidate word and merge them onto the confusion-network cluster marks;
3.3 re-extract the best decoding path on the clustered confusion network.
As shown in Fig. 5, the competition is represented by the confusion network (e.g. HAVE vs. MOVE), and the competition probability is obtained from the competition between candidate recognition results; it is more accurate than the traditional lattice posterior probability.
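The competition probability can be sketched per confusion-network slot (the dict representation and the numbers are illustrative assumptions): within each slot the candidates' posteriors are renormalized, so the share of the best word directly measures how strongly competitors such as HAVE and MOVE contend with it.

```python
def competition_probabilities(slots):
    """slots: one dict per confusion-network slot, mapping each candidate
    word to its (unnormalized) posterior mass. Returns, per slot, the
    best word and its share of the slot's total mass, i.e. its
    competition probability against the alternatives (sketch)."""
    result = []
    for slot in slots:
        total = sum(slot.values())
        best = max(slot, key=slot.get)
        result.append((best, slot[best] / total))
    return result

cn = [
    {"HAVE": 0.6, "MOVE": 0.3, "HALF": 0.1},  # strongly contested slot
    {"A": 0.9, "THE": 0.1},                   # nearly uncontested slot
]
print(competition_probabilities(cn))
```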
Step 4) Build the full search space for speech recognition with an auxiliary search network constructed from an n-gram language model, and compute the full-search-space probability of the decoding process, as shown in Fig. 6, specifically:
4.1 build a full pronunciation search space from the n-gram language model;
4.2 use the contextual information within the full pronunciation search space itself to build a context-dependent pronunciation search space;
4.3 incorporate the search-state modeling corresponding to the acoustic model to obtain the final full search space;
4.4 scan the full search space in combination with the phone information to obtain candidate competing units;
4.5 compute the full-search-space probability from the speech recognition decoding probabilities of the candidate competing units.
The n-gram language model may take phones, characters, or words as its units.
As shown in Fig. 6, the auxiliary search network simulates the full pronunciation search space.
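Steps 4.1 to 4.5 build the full search space as a weighted network; as a rough stand-in, the sketch below scores a sequence of candidate pronunciation units with a toy bigram language model and length-normalizes the result (the bigram table, back-off penalty, and averaging are illustrative assumptions, not the patent's WFST construction):

```python
import math

def full_space_log_prob(units, bigram, backoff=-5.0):
    """Score candidate pronunciation units against a toy bigram LM that
    stands in for the auxiliary search network; unseen bigrams fall back
    to a fixed penalty. Returns the average log-probability per unit,
    used here as a stand-in for the full-search-space probability."""
    logp, prev = 0.0, "<s>"
    for u in units:
        logp += bigram.get((prev, u), backoff)
        prev = u
    return logp / max(len(units), 1)

bigram = {
    ("<s>", "h"): math.log(0.5),
    ("h", "ae"): math.log(0.8),
    ("ae", "v"): math.log(0.7),
}
score = full_space_log_prob(["h", "ae", "v"], bigram)
print(round(score, 3))
```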
Step 5) Combined with the phone-synchronous-decoding recognition, record the search process over the generated full search space and backtrack paths through the entire search history to obtain the full-search-space probability; then, in the confidence decision module, combine the speech recognition result, the confusion-network competition probability, and the full-search-space probability to obtain the final speech recognition result.
As shown in Fig. 7, the decision process of the confidence decision module is specifically:
5.1 fuse the confusion-network competition probability and the full-search-space probability by interpolation to obtain the confidence;
5.2 when the fused confidence passes the acceptance threshold, take the output of the speech recognition module as the speech recognition result; otherwise recognition fails and the user is asked to speak again.
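Steps 5.1 and 5.2 can be sketched as linear interpolation followed by a threshold test; the interpolation weight and threshold are hypothetical tuning values, and the sketch assumes the intended decision is to accept the hypothesis when the fused confidence is high and ask the user to repeat otherwise.

```python
def judge(cn_prob, full_space_prob, lam=0.5, threshold=0.6):
    """Fuse the confusion-network competition probability with the
    full-search-space probability by interpolation (step 5.1) and accept
    the recognizer's output only if the confidence clears the threshold
    (step 5.2, under the accept-when-confident reading)."""
    confidence = lam * cn_prob + (1.0 - lam) * full_space_prob
    return confidence, confidence >= threshold

print(judge(0.8, 0.6))  # both sources agree: accept
print(judge(0.4, 0.3))  # low confidence: reject, user re-enters
```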
Those skilled in the art can make local adjustments to the above specific implementation in various ways without departing from the principle and purpose of the present invention. The protection scope of the present invention is defined by the claims and is not limited by the above specific implementation; each implementation within that scope is bound by the present invention.

Claims (5)

1. A confidence-based speech recognition method, characterized in that: decoding information obtained by phone-synchronous-decoding speech recognition of user speech is used to generate a phone-synchronous word-lattice acoustic structure, and a confusion network is generated from this structure to model the competition between candidate recognition results, i.e. the confusion-network competition probability; at the same time, a full search space for speech recognition is built with an auxiliary search network based on a language model, and the complete, error-free full-search-space probability is computed: combined with the phone-synchronous-decoding recognition, the search process over the generated full search space is recorded, and paths are backtracked through the entire search history to obtain the full-search-space probability; finally, the confusion-network competition probability and the full-search-space probability are fused to obtain the judgment of the speech recognition result;
the competition is obtained in the following manner:
3.1 generate the confusion-network cluster marks along the best decoding path;
3.2 cluster the time boundaries and phone information of each candidate word and merge them onto the confusion-network cluster marks;
3.3 re-extract the best decoding path on the clustered confusion network; the final competition is represented by the confusion network, and the competition probability is obtained from the competition between candidate recognition results;
the full-search-space probability is obtained in the following manner:
4.1 build a full pronunciation search space from an n-gram language model;
4.2 use the contextual information within the full pronunciation search space itself to build a context-dependent pronunciation search space;
4.3 incorporate the search-state modeling corresponding to the acoustic model to obtain the final full search space;
4.4 scan the full search space in combination with the phone information to obtain candidate competing units;
4.5 compute the full-search-space probability from the speech recognition decoding probabilities of the candidate competing units.
2. The method according to claim 1, characterized in that the phone synchronous decoding performs phone-synchronous-decoding speech recognition on the user speech frame by frame to obtain decoding information, specifically:
1.1 establish a connectionist temporal classification (CTC) model, so that acoustic modeling is more accurate;
1.2 model the CTC targets with a neural network, so that the probability output distribution has the characteristic unimodal spikes;
1.3 during decoding, perform the linguistic network search to obtain decoding information only when a non-blank model output appears; otherwise discard the acoustic information of the current frame and move to the next frame.
3. The method according to claim 1, characterized in that the word-lattice acoustic structure is obtained in the following manner:
2.1 after each frame of acoustic features is input, the CTC model gives the occurrence probability of each phone in that frame;
2.2 if the current frame is a non-blank model frame, a weighted finite-state machine adapted to the acoustic modeling information is used to perform a linguistic search on the frame's acoustic features, and the resulting phone information is stored in the form of a weighted finite-state machine; otherwise the frame is discarded; finally, the word-lattice acoustic structure is obtained by a merging process.
4. The method according to claim 1 or 3, characterized in that the word-lattice acoustic structure is a phone-synchronous word lattice represented as a weighted finite-state machine; the synchronous lattice needs no pruning and is already a very compact phone-level lattice; the phone-synchronous lattice is formed by connecting, pairwise, all candidate acoustic model outputs between two adjacent model output times.
5. A speech recognition system implementing the method of any preceding claim, characterized by comprising: a speech recognition module, a word-lattice generation module, a confusion-network competition-probability computation module, a full-search-space probability computation module, and a confidence decision module, in which: the phone-synchronous-decoding speech recognition module is connected to the word-lattice generation module and transmits complete phone information; the phone-synchronous word-lattice generation module builds a compact, lossless acoustic-information representation and outputs it to the confusion-network competition-probability computation module; the confusion-network competition-probability computation module extracts the competition probabilities in the phone lattice; the full-search-space probability computation module builds an auxiliary search space from the phone information and from it obtains the full-search-space probability; and the confidence decision module fuses the full-search-space probability with the competition probabilities to obtain a confidence as the final judgment of whether the recognition result is correct.
CN201710060942.2A 2017-01-25 2017-01-25 Speech recognition realization method and system based on confidence level Active CN106782513B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710060942.2A CN106782513B (en) 2017-01-25 2017-01-25 Speech recognition realization method and system based on confidence level

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710060942.2A CN106782513B (en) 2017-01-25 2017-01-25 Speech recognition realization method and system based on confidence level

Publications (2)

Publication Number Publication Date
CN106782513A CN106782513A (en) 2017-05-31
CN106782513B true CN106782513B (en) 2019-08-23

Family

ID=58943125

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710060942.2A Active CN106782513B (en) 2017-01-25 2017-01-25 Speech recognition realization method and system based on confidence level

Country Status (1)

Country Link
CN (1) CN106782513B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871499B (en) * 2017-10-27 2020-06-16 珠海市杰理科技股份有限公司 Speech recognition method, system, computer device and computer-readable storage medium
CN110164416B (en) * 2018-12-07 2023-05-09 腾讯科技(深圳)有限公司 Voice recognition method and device, equipment and storage medium thereof
CN110808032B (en) * 2019-09-20 2023-12-22 平安科技(深圳)有限公司 Voice recognition method, device, computer equipment and storage medium
CN113192535B (en) * 2021-04-16 2022-09-09 中国科学院声学研究所 Voice keyword retrieval method, system and electronic device
CN115394288B (en) * 2022-10-28 2023-01-24 成都爱维译科技有限公司 Language identification method and system for civil aviation multi-language radio land-air conversation

Citations (5)

Publication number Priority date Publication date Assignee Title
US7571098B1 (en) * 2003-05-29 2009-08-04 At&T Intellectual Property Ii, L.P. System and method of spoken language understanding using word confusion networks
CN101763855A (en) * 2009-11-20 2010-06-30 安徽科大讯飞信息科技股份有限公司 Method and device for judging confidence of speech recognition
CN101887725A (en) * 2010-04-30 2010-11-17 中国科学院声学研究所 Phoneme confusion network-based phoneme posterior probability calculation method
CN104157285A (en) * 2013-05-14 2014-11-19 腾讯科技(深圳)有限公司 Voice recognition method and device, and electronic equipment
CN105895081A (en) * 2016-04-11 2016-08-24 苏州思必驰信息科技有限公司 Speech recognition decoding method and speech recognition decoding device

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
KR102371188B1 (en) * 2015-06-30 2022-03-04 삼성전자주식회사 Apparatus and method for speech recognition, and electronic device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"A keyword spotting system for broadcast news speech" (in Chinese); Zhang Pengyuan et al.; National Symposium on Network and Information Security Technology; July 2007; pp. 398-405
"Research on confidence measures for speech recognition based on environmental features" (in Chinese); Guo Yujing et al.; 10th National Conference on Man-Machine Speech Communication and International Symposium on Speech and Language Processing; August 2009; pp. 298-302

Also Published As

Publication number Publication date
CN106782513A (en) 2017-05-31

Similar Documents

Publication Publication Date Title
CN106782513B (en) Speech recognition realization method and system based on confidence level
JP6916264B2 (en) Real-time speech recognition methods based on disconnection attention, devices, equipment and computer readable storage media
CN111739508B (en) End-to-end speech synthesis method and system based on DNN-HMM bimodal alignment network
CN106098059B (en) Customizable voice awakening method and system
Zayats et al. Disfluency detection using a bidirectional LSTM
CN102982811B (en) Voice endpoint detection method based on real-time decoding
JP7070894B2 (en) Time series information learning system, method and neural network model
US10714076B2 (en) Initialization of CTC speech recognition with standard HMM
KR20190125463A (en) Method and apparatus for detecting voice emotion, computer device and storage medium
CN108899013A (en) Voice search method, device and speech recognition system
CN106611597A (en) Voice wakeup method and voice wakeup device based on artificial intelligence
CN101604520A (en) Spoken language voice recognition method based on statistical model and syntax rule
CN111539199B (en) Text error correction method, device, terminal and storage medium
CN108364634A (en) Spoken language pronunciation evaluating method based on deep neural network posterior probability algorithm
JP2018060047A (en) Learning device for acoustic model and computer program therefor
CN113823257B (en) Speech synthesizer construction method, speech synthesis method and device
CN112309398A (en) Working time monitoring method and device, electronic equipment and storage medium
CN100431003C (en) Voice decoding method based on mixed network
CN111179941A (en) Intelligent device awakening method, registration method and device
Deng et al. History utterance embedding transformer lm for speech recognition
Huang et al. Towards word-level end-to-end neural speaker diarization with auxiliary network
CN113096646B (en) Audio recognition method and device, electronic equipment and storage medium
CN115240713A (en) Voice emotion recognition method and device based on multi-modal features and contrast learning
JP2905674B2 (en) Unspecified speaker continuous speech recognition method
CN113936642A (en) Pronunciation dictionary construction method, voice recognition method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200628

Address after: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Co-patentee after: AI SPEECH Co.,Ltd.

Patentee after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

Address before: 200240 Dongchuan Road, Shanghai, No. 800, No.

Co-patentee before: AI SPEECH Co.,Ltd.

Patentee before: SHANGHAI JIAO TONG University

TR01 Transfer of patent right

Effective date of registration: 20201105

Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee after: AI SPEECH Co.,Ltd.

Address before: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120

Patentee before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.

Patentee before: AI SPEECH Co.,Ltd.

TR01 Transfer of patent right
CP01 Change in the name or title of a patent holder

Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee after: Sipic Technology Co.,Ltd.

Address before: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu.

Patentee before: AI SPEECH Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Implementation Method and System of Confidence Based Speech Recognition

Effective date of registration: 20230726

Granted publication date: 20190823

Pledgee: CITIC Bank Limited by Share Ltd. Suzhou branch

Pledgor: Sipic Technology Co.,Ltd.

Registration number: Y2023980049433
