GB202117611D0 - Systems and methods for speech recognition

Systems and methods for speech recognition

Info

Publication number
GB202117611D0
Authority
GB
United Kingdom
Prior art keywords
systems
methods
speech recognition
speech
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GBGB2117611.0A
Other versions
GB2613581A (en)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-12-06
Filing date
2021-12-06
Publication date
2022-01-19
Application filed by Toshiba Corp
Priority to GB2117611.0A (GB2613581A)
Publication of GB202117611D0
Priority to JP2022136148A (JP7447202B2)
Priority to CN202211051579.5A (CN116229946A)
Publication of GB2613581A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G06N3/092 Reinforcement learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • User Interface Of Digital Computer (AREA)
GB2117611.0A, priority date 2021-12-06, filing date 2021-12-06: Systems and methods for speech recognition (Pending, GB2613581A (en))

Priority Applications (3)

Application Number | Publication Number | Priority Date | Filing Date | Title
GB2117611.0A | GB2613581A (en) | 2021-12-06 | 2021-12-06 | Systems and methods for speech recognition
JP2022136148A | JP7447202B2 (en) | 2021-12-06 | 2022-08-29 | Systems and methods for speech recognition
CN202211051579.5A | CN116229946A (en) | 2021-12-06 | 2022-08-30 | System and method for speech recognition

Applications Claiming Priority (1)

Application Number | Publication Number | Priority Date | Filing Date | Title
GB2117611.0A | GB2613581A (en) | 2021-12-06 | 2021-12-06 | Systems and methods for speech recognition

Publications (2)

Publication Number | Publication Date
GB202117611D0 (en) | 2022-01-19
GB2613581A (en) | 2023-06-14

Family

Family ID: 79270447

Family Applications (1)

Application Number | Status | Publication Number | Priority Date | Filing Date | Title
GB2117611.0A | Pending | GB2613581A (en) | 2021-12-06 | 2021-12-06 | Systems and methods for speech recognition

Country Status (3)

Country | Publication
JP (1) | JP7447202B2 (en)
CN (1) | CN116229946A (en)
GB (1) | GB2613581A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115049844A (en) * | 2022-06-29 | 2022-09-13 | 厦门大学 | Image description generation method for enhancing visual information flow
CN116757184A (en) * | 2023-08-18 | 2023-09-15 | 昆明理工大学 | Vietnam voice recognition text error correction method and system integrating pronunciation characteristics

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117292694B (en) * | 2023-11-22 | 2024-02-27 | 中国科学院自动化研究所 | Time-invariant-coding-based few-token neural voice encoding and decoding method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR20210141115A (en) | 2020-05-15 | 2021-11-23 | 삼성전자주식회사 | Method and apparatus for estimating utterance time

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115049844A (en) * | 2022-06-29 | 2022-09-13 | 厦门大学 | Image description generation method for enhancing visual information flow
CN115049844B (en) * | 2022-06-29 | 2024-06-04 | 厦门大学 | Image description generation method for enhancing visual information flow
CN116757184A (en) * | 2023-08-18 | 2023-09-15 | 昆明理工大学 | Vietnam voice recognition text error correction method and system integrating pronunciation characteristics
CN116757184B (en) * | 2023-08-18 | 2023-10-20 | 昆明理工大学 | Vietnam voice recognition text error correction method and system integrating pronunciation characteristics

Also Published As

Publication number | Publication date
JP2023084085A (en) | 2023-06-16
CN116229946A (en) | 2023-06-06
GB2613581A (en) | 2023-06-14
JP7447202B2 (en) | 2024-03-11

Similar Documents

Publication | Title
EP3752957A4 (en) | System and method for speech understanding via integrated audio and visual based speech recognition
GB202117611D0 (en) | Systems and methods for speech recognition
EP3501023A4 (en) | Speech recognition method and apparatus
EP3479376A4 (en) | Speech recognition method and apparatus based on speaker recognition
EP3721380A4 (en) | Method and system for facial recognition
EP3888084A4 (en) | Method and device for providing voice recognition service
EP3779667A4 (en) | Speech recognition device, speech recognition device cooperation system, and speech recognition device cooperation method
EP4026121A4 (en) | Speech recognition systems and methods
EP3850622A4 (en) | Method and device for speech recognition
EP3663905A4 (en) | Information processing device, speech recognition system, and information processing method
EP3869509A4 (en) | Voice recognition device and method
EP4128040A4 (en) | Systems and methods for object recognition
EP4099316A4 (en) | Speech synthesis method and system
EP3975172A4 (en) | Voiceprint recognition method, and device
GB2600987B (en) | Speech Recognition Systems and Methods
SG10202003089VA (en) | Object recognition system and object recognition method
IL311378A (en) | Systems and methods for item recognition
SG10202008401VA (en) | Object recognition system and method
EP4010998A4 (en) | System and method for event recognition
GB202003088D0 (en) | Method and system for action recognition
EP4214634A4 (en) | Systems and methods for object recognition
EP3908934A4 (en) | Systems and methods for contactless authentication using voice recognition
EP4091094A4 (en) | Systems and methods for stream recognition
EP3933388A4 (en) | Feature point recognition system and recognition method
GB2602976B (en) | Speech recognition systems and methods