CN110035043A - Story playback system and method based on speech recognition - Google Patents

Story playback system and method based on speech recognition

Info

Publication number
CN110035043A
Authority
CN
China
Prior art keywords
story
stream media
audio files
recording
speech recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810104033.9A
Other languages
Chinese (zh)
Inventor
朱建强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hua Zhen Electronic Technology Co Ltd
Original Assignee
Shanghai Hua Zhen Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hua Zhen Electronic Technology Co Ltd
Priority to CN201810104033.9A
Publication of CN110035043A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/632 Query formulation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/34 Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/61 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
    • H04L65/612 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mathematical Physics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a story playback system based on speech recognition, comprising a cloud server and a story player. The cloud server receives the recording sent by the story player, performs speech recognition on it, identifies the story name in the recording, queries the streaming audio file of the story corresponding to that name, and sends the streaming audio file containing the story content to the story player; if the story name cannot be recognized, it sends the story player a streaming audio file prompting the user to repeat the request. The story player sends the recording to the cloud server, receives the streaming audio file returned by the cloud server, and plays it. The invention also discloses a story playback method based on speech recognition.

Description

A story playback system and method based on speech recognition
Technical field
The invention belongs to the technical field of speech recognition, and more particularly relates to a story playback system and method based on speech recognition.
Background
A conventional story machine stores the story audio files on the device itself and is controlled by buttons or an infrared remote control. Because the storage capacity of the story machine is limited, the number of story audio files it can play is limited. Playback control is also very limited: to hear a particular story, the user can only select it with the buttons.
Summary of the invention
In view of this, the present invention provides a story playback system and method based on speech recognition that can effectively solve the above technical problem.
The technical solution of the invention is a story playback system based on speech recognition, comprising a cloud server and a story player. The cloud server receives the recording sent by the story player, performs speech recognition on it, identifies the story name in the recording, queries the streaming audio file of the story corresponding to that name, and sends the streaming audio file containing the story content to the story player; if the story name cannot be recognized, it sends the story player a streaming audio file prompting the user to repeat the request. The story player sends the recording to the cloud server, receives the streaming audio file returned by the cloud server, and plays it.
In a preferred embodiment, the cloud server includes:
An update and storage module, for updating and storing the streaming audio files of the stories;
A speech recognition engine module, for receiving the recording data, recognizing the story name corresponding to the recording data, and providing a recognition score for that story name;
A recognition score judgment module, for judging whether the recognition score is greater than a recognition score threshold; if so, the output is the story name; if not, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it", which is sent to the story player;
A condition query module, for querying the streaming audio file of the story corresponding to the recognized story name and sending that streaming audio file to the story player.
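The interplay of the recognition score judgment module and the condition query module can be illustrated with a minimal sketch. The threshold value, the prompt file name, and the names `Recognition` and `select_stream` are assumptions made for illustration; the source specifies none of them.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical values: the source gives no concrete threshold or file names.
RETRY_PROMPT = "prompts/please_repeat.mp3"
SCORE_THRESHOLD = 0.6


@dataclass
class Recognition:
    story_name: Optional[str]  # best-matching story name, or None if nothing matched
    score: float               # recognition score reported by the engine


def select_stream(result: Recognition, catalog: Dict[str, str],
                  threshold: float = SCORE_THRESHOLD) -> str:
    """Return the streaming audio file to send back to the story player."""
    # Score judgment: accept the name only when the score exceeds the threshold.
    if result.story_name is None or result.score <= threshold:
        return RETRY_PROMPT
    # Condition query: look the recognized name up in the story catalog.
    # Falling back to the prompt for a name missing from the catalog is an assumption.
    return catalog.get(result.story_name, RETRY_PROMPT)
```

In this sketch a low score and an unknown story name both lead to the same "please repeat" prompt, which keeps the story player's handling of the reply uniform: it always receives one streaming audio file to decode and play.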
In a preferred embodiment, the story player includes:
A recording module, for recording the received voice, where the recorded voice includes voice signal B of the user's speech and voice signal A of the content being played by the story player;
A front-end audio processing module, for processing the recorded voice, filtering out voice signal A by echo suppression, and outputting voice signal B;
A Wi-Fi module, for communicating with the cloud server, sending the output voice signal B to the cloud server, and receiving the streaming audio file returned by the cloud server;
A streaming media playback module, for decoding the received streaming audio file and playing it.
To solve the above technical problem, the present invention also provides a story playback method based on speech recognition, comprising the following steps:
S100, receive the recording and perform speech recognition on it, identify the story name in the recording, and query the streaming audio file of the story corresponding to that name; the output is the streaming audio file containing the story content; if the story name cannot be recognized, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it";
S200, decode and play the streaming audio file output by step S100.
In a preferred embodiment, step S100 specifically includes the following steps:
S101, update and store the streaming audio files of the stories;
S102, receive the recording data, recognize the story name corresponding to the recording data, and provide a recognition score for that story name;
S103, judge whether the recognition score is greater than the recognition score threshold; if so, the output is the story name, and the method proceeds to step S104; if not, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it", and the method proceeds to step S203;
S104, query the streaming audio file of the story corresponding to the recognized story name, and proceed to step S203.
In a preferred embodiment, step S200 specifically includes the following steps:
S201, record the voice, where the recorded voice includes voice signal B of the user's speech and voice signal A of the content being played by the story player;
S202, process the recorded voice, filter out voice signal A by echo suppression, output voice signal B, and proceed to step S102;
S203, decode and play the returned streaming audio file.
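Because the sub-steps of S100 and S200 hand over to each other (S202 leads to S102, and S103 or S104 leads to S203), the order in which one request passes through them can be traced with a short sketch. The function `run_once` and its callback parameters are hypothetical stand-ins for the recognition engine and the story query; S101 (catalog update) is assumed to run independently of any single request and is therefore not part of the trace.

```python
from typing import Callable, List, Tuple


def run_once(recognize: Callable[[bytes], Tuple[str, float]],
             lookup: Callable[[str], str],
             recording: bytes,
             threshold: float,
             retry_prompt: str) -> Tuple[List[str], str]:
    """Trace one request through the steps and return (steps visited, stream to play)."""
    # S201/S202 happen on the story player; `recording` is assumed to be
    # the echo-suppressed signal B that S202 outputs.
    steps = ["S201", "S202"]
    steps.append("S102")                 # server: recognize the name and its score
    name, score = recognize(recording)
    steps.append("S103")                 # server: compare the score with the threshold
    if score > threshold:
        steps.append("S104")             # server: query the story's streaming file
        stream = lookup(name)
    else:
        stream = retry_prompt            # server: answer with the "please repeat" prompt
    steps.append("S203")                 # player: decode and play whatever came back
    return steps, stream
```

Both branches end in S203, so in this reading the story player does not need to know whether it received a story or the prompt.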
The beneficial effects of the invention are as follows. The invention connects to the Internet over Wi-Fi and communicates with the cloud server, which performs the speech recognition and stores the story audio files. The user says the name of a story; speech recognition on the server identifies the story name, and the streaming audio file of the story stored in the cloud is played on the story player. Because the stories are stored on the cloud server, more stories can be stored than with the conventional approach. Playback is controlled by speech recognition, which makes the design intelligent and user-friendly and enriches the ways of playing stories.
Brief description of the drawings
Fig. 1 is a functional block diagram of the story playback system based on speech recognition according to the embodiment of the present invention;
Fig. 2 is a flow chart of the story playback method based on speech recognition according to the embodiment of the present invention;
Fig. 3 is a schematic diagram of the echo suppression process according to the embodiment of the present invention.
Description of reference numerals:
100 - cloud server, 200 - story player, 101 - update and storage module, 102 - speech recognition engine module, 103 - recognition score judgment module, 104 - condition query module, 201 - recording module, 202 - front-end audio processing module, 203 - Wi-Fi module, 204 - streaming media playback module.
Detailed description
The present invention is described in detail below.
Embodiment
As shown in Fig. 1, a story playback system based on speech recognition comprises a cloud server 100 and a story player 200. The cloud server 100 receives the recording sent by the story player, performs speech recognition on it, identifies the story name in the recording, queries the streaming audio file of the story corresponding to that name, and sends the streaming audio file containing the story content to the story player 200; if the story name cannot be recognized, it sends the story player 200 a streaming audio file prompting the user to repeat the request. The story player 200 sends the recording to the cloud server 100, receives the streaming audio file returned by the cloud server 100, and plays it.
In the above system, a massive number of story audio files are stored on the cloud server 100. The audio files can be queried through an index on the story name, and the story audio files on the server can be updated regularly to add the newest stories. The audio files support streaming transmission and playback. A speech recognition engine runs on the cloud server 100. It is a large-vocabulary speech recognition engine that can recognize a massive range of spoken content. The engine is multithreaded, so multiple story players can send recording data over the Internet at the same time and have speech recognition performed concurrently, yielding the story name recognized in each recording and the recognition score for that name. The recognition score is then compared with the recognition score threshold. If the score is higher than the threshold, the story name is output and used as an index to find the story's audio file, and the streaming file of that audio is sent to the story player 200, which plays it while downloading. If the score is lower than the threshold, the recording is judged unrecognizable, the result is returned to the story player 200, and the story player plays a prompt telling the user that this recognition attempt failed.
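A minimal sketch of the server side described here, under stated assumptions: the story index is modeled as a plain dictionary keyed by story name, concurrent requests are handled with a thread pool, and `recognize_stub` is a hypothetical stand-in for the large-vocabulary engine (it simply treats the bytes as UTF-8 text). The file paths and the 0.5 threshold are likewise invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

RETRY_PROMPT = "prompts/please_repeat.mp3"  # hypothetical prompt file


def recognize_stub(recording: bytes) -> Tuple[str, float]:
    """Stand-in for the speech recognition engine: returns (story name, score)."""
    text = recording.decode("utf-8")
    return text, (0.9 if text else 0.0)


def handle_request(device_id: str, recording: bytes,
                   index: Dict[str, str], threshold: float = 0.5) -> Tuple[str, str]:
    """Recognize one recording and pick the stream to return to that device."""
    name, score = recognize_stub(recording)
    if score > threshold and name in index:
        return device_id, index[name]       # story name used as the index key
    return device_id, RETRY_PROMPT


def serve_batch(requests: List[Tuple[str, bytes]], index: Dict[str, str],
                workers: int = 4) -> Dict[str, str]:
    """Process recordings from several story players concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(handle_request, dev, rec, index) for dev, rec in requests]
        return dict(f.result() for f in futures)
```

Adding a new story in this model is a single assignment such as `index["Snow White"] = "stories/snow_white.mp3"`, which corresponds to the regular catalog updates mentioned above. A production server would more likely keep the index in a database.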
In another embodiment, the cloud server 100 includes:
An update and storage module 101, for updating and storing the streaming audio files of the stories;
A speech recognition engine module 102, for receiving the recording data, recognizing the story name corresponding to the recording data, and providing a recognition score for that story name;
A recognition score judgment module 103, for judging whether the recognition score is greater than the recognition score threshold; if so, the output is the story name; if not, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it", which is sent to the story player 200;
A condition query module 104, for querying the streaming audio file of the story corresponding to the recognized story name and sending that streaming audio file to the story player 200.
In another embodiment, the story player 200 includes:
A recording module 201, for recording the received voice, where the recorded voice includes voice signal B of the user's speech and voice signal A of the content being played by the story player 200;
A front-end audio processing module 202, for processing the recorded voice, filtering out voice signal A by echo suppression, and outputting voice signal B. Specifically, the microphone also records the sound played by the loudspeaker, which greatly reduces the speech recognition rate. So that speech can still be recognized accurately during playback, an echo suppression function is used, as illustrated in Fig. 3: the voice signal played by the story player 200 is emitted by the loudspeaker, picked up again by the microphone, and mixed with the user's speech. A "subtraction" (echo suppression) operation is performed against the reference signal (fed into the front-end audio processing through a lead from the power amplifier chip) to suppress the reference signal. After front-end audio processing only the user's speech remains, which ensures that speech recognition keeps a high recognition rate even while the story player's loudspeaker is playing;
A Wi-Fi module 203, for communicating with the cloud server 100, sending the output voice signal B to the cloud server 100, and receiving the streaming audio file returned by the cloud server 100;
A streaming media playback module 204, for decoding the received streaming audio file and playing it.
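The source describes the echo suppression in module 202 only as a "subtraction" against the reference signal and does not name an algorithm. One common way to realize such a subtraction is a normalized least-mean-squares (NLMS) adaptive filter, sketched below in plain Python. The technique itself, the function name `nlms_echo_suppress`, and the parameters `taps`, `mu`, and `eps` are all assumptions made for illustration, not the method stated in the source.

```python
from typing import List, Sequence


def nlms_echo_suppress(mic: Sequence[float], ref: Sequence[float],
                       taps: int = 8, mu: float = 0.5, eps: float = 1e-8) -> List[float]:
    """Subtract an adaptively estimated echo of `ref` from `mic`.

    mic: microphone samples (user speech B mixed with the played-back signal A)
    ref: reference samples taken from the loudspeaker path
    Returns the residual, an estimate of the user's speech B.
    """
    w = [0.0] * taps      # adaptive estimate of the loudspeaker-to-microphone path
    buf = [0.0] * taps    # most recent reference samples, newest first
    out: List[float] = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))    # estimated echo
        e = d - y                                     # the "subtraction" step
        norm = sum(b * b for b in buf) + eps          # reference energy (normalization)
        g = mu * e / norm
        w = [wi + g * bi for wi, bi in zip(w, buf)]   # NLMS weight update
        out.append(e)
    return out
```

When the microphone picks up only the loudspeaker signal, the residual shrinks toward zero as the filter converges; when the user also speaks, the speech is not correlated with the reference and so remains in the residual. A real device would typically run this on short audio frames, with more taps, and pause adaptation while the user is talking.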
As shown in Fig. 2, to solve the above technical problem, the present invention also provides a story playback method based on speech recognition, comprising the following steps:
S100, receive the recording and perform speech recognition on it, identify the story name in the recording, and query the streaming audio file of the story corresponding to that name; the output is the streaming audio file containing the story content; if the story name cannot be recognized, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it";
S200, decode and play the streaming audio file output by step S100.
In another embodiment, step S100 specifically includes the following steps:
S101, update and store the streaming audio files of the stories;
S102, receive the recording data, recognize the story name corresponding to the recording data, and provide a recognition score for that story name;
S103, judge whether the recognition score is greater than the recognition score threshold; if so, the output is the story name, and the method proceeds to step S104; if not, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it", and the method proceeds to step S203;
S104, query the streaming audio file of the story corresponding to the recognized story name, and proceed to step S203.
In another embodiment, step S200 specifically includes the following steps:
S201, record the voice, where the recorded voice includes voice signal B of the user's speech and voice signal A of the content being played by the story player;
S202, process the recorded voice, filter out voice signal A by echo suppression, output voice signal B, and proceed to step S102;
S203, decode and play the returned streaming audio file.
In the above embodiment, when playback first starts, suppose the user says "I want to hear the story of Little Red Riding Hood". The story player 200 records the speech; because no story is playing yet, the recording contains only the user's speech. The recording is sent to the cloud server 100, which performs recognition, identifies the story name "Little Red Riding Hood", looks it up in the story audio list, and sends the streaming audio file of "Little Red Riding Hood" to the story player 200. On receiving it, the story player 200 decodes and plays the audio file of "Little Red Riding Hood". If the story name cannot be recognized, for example because the user speaks too quietly or is too far from the microphone, the story player 200 plays the prompt "The story name you said could not be recognized, please repeat it".
While "Little Red Riding Hood" is playing, the user says "Play Snow White". The front-end audio processing module of the story player 200 applies echo suppression to the loudspeaker sound, so that only the user's speech remains in the recording. The recording is sent to the cloud server 100; after the server recognizes it, the streaming audio file of "Snow White" is sent to the story player 200, which stops playing "Little Red Riding Hood" and switches to the streaming audio file of "Snow White".
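The switch-over in this scenario can be sketched as a small state holder on the story player. The class `StoryPlayer`, its `on_stream` method, and the file names are hypothetical; the sketch also assumes that any newly received stream replaces the current one, which the source states for a new story but does not spell out for the "please repeat" prompt.

```python
from typing import List, Optional


class StoryPlayer:
    """Tracks which streaming audio file is playing and logs the transitions."""

    def __init__(self) -> None:
        self.current: Optional[str] = None   # stream being played, if any
        self.log: List[str] = []             # record of stop/play actions

    def on_stream(self, stream: str) -> None:
        """Handle a streaming audio file returned by the cloud server."""
        if self.current is not None:
            # Stop the story in progress before starting the new one.
            self.log.append(f"stop {self.current}")
        self.current = stream
        self.log.append(f"play {stream}")
```

For example, calling `on_stream("little_red_riding_hood.mp3")` and then `on_stream("snow_white.mp3")` leaves `snow_white.mp3` as the current stream, with the earlier story stopped first.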
The embodiments described above express only specific implementations of the invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention.

Claims (6)

1. A story playback system based on speech recognition, characterized in that it comprises a cloud server and a story player; the cloud server receives the recording sent by the story player, performs speech recognition on it, identifies the story name in the recording, queries the streaming audio file of the story corresponding to that name, and sends the streaming audio file containing the story content to the story player; if the story name cannot be recognized, it sends the story player a streaming audio file prompting the user to repeat the request; the story player sends the recording to the cloud server, receives the streaming audio file returned by the cloud server, and plays it.
2. The story playback system based on speech recognition according to claim 1, characterized in that the cloud server includes:
An update and storage module, for updating and storing the streaming audio files of the stories;
A speech recognition engine module, for receiving the recording data, recognizing the story name corresponding to the recording data, and providing a recognition score for that story name;
A recognition score judgment module, for judging whether the recognition score is greater than a recognition score threshold; if so, the output is the story name; if not, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it", which is sent to the story player;
A condition query module, for querying the streaming audio file of the story corresponding to the recognized story name and sending that streaming audio file to the story player.
3. The story playback system based on speech recognition according to claim 1, characterized in that the story player includes:
A recording module, for recording the received voice, where the recorded voice includes voice signal B of the user's speech and voice signal A of the content being played by the story player;
A front-end audio processing module, for processing the recorded voice, filtering out voice signal A by echo suppression, and outputting voice signal B;
A Wi-Fi module, for communicating with the cloud server, sending the output voice signal B to the cloud server, and receiving the streaming audio file returned by the cloud server;
A streaming media playback module, for decoding the received streaming audio file and playing it.
4. A story playback method based on speech recognition, characterized in that it comprises the following steps:
S100, receive the recording and perform speech recognition on it, identify the story name in the recording, and query the streaming audio file of the story corresponding to that name; the output is the streaming audio file containing the story content; if the story name cannot be recognized, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it";
S200, decode and play the streaming audio file output by step S100.
5. The story playback method based on speech recognition according to claim 4, characterized in that step S100 specifically includes the following steps:
S101, update and store the streaming audio files of the stories;
S102, receive the recording data, recognize the story name corresponding to the recording data, and provide a recognition score for that story name;
S103, judge whether the recognition score is greater than the recognition score threshold; if so, the output is the story name, and the method proceeds to step S104; if not, the output is a streaming audio file containing the prompt "The story name you said could not be recognized, please repeat it", and the method proceeds to step S203;
S104, query the streaming audio file of the story corresponding to the recognized story name, and proceed to step S203.
6. The story playback method based on speech recognition according to claim 5, characterized in that step S200 specifically includes the following steps:
S201, record the voice, where the recorded voice includes voice signal B of the user's speech and voice signal A of the content being played by the story player;
S202, process the recorded voice, filter out voice signal A by echo suppression, output voice signal B, and proceed to step S102;
S203, decode and play the returned streaming audio file.
CN201810104033.9A 2018-02-02 2018-02-02 A kind of story play system and method based on speech recognition Withdrawn CN110035043A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810104033.9A CN110035043A (en) 2018-02-02 2018-02-02 A kind of story play system and method based on speech recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810104033.9A CN110035043A (en) 2018-02-02 2018-02-02 A kind of story play system and method based on speech recognition

Publications (1)

Publication Number Publication Date
CN110035043A true CN110035043A (en) 2019-07-19

Family

ID=67234590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810104033.9A Withdrawn CN110035043A (en) 2018-02-02 2018-02-02 A kind of story play system and method based on speech recognition

Country Status (1)

Country Link
CN (1) CN110035043A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111193940A (en) * 2019-12-09 2020-05-22 腾讯科技(深圳)有限公司 Audio playing method and device, computer equipment and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040117190A1 (en) * 2002-12-17 2004-06-17 Microsoft Corporation Computer system and method for enhancing experience using networked devices
CN201163780Y (en) * 2008-02-01 2008-12-10 广州汉音电子科技有限公司 Voice box and system for implementing voice box receiving
CN102831892A (en) * 2012-09-07 2012-12-19 深圳市信利康电子有限公司 Toy control method and system based on internet voice interaction
CN205516447U (en) * 2016-01-21 2016-08-31 上海市纺织科学研究院 Multi -functional removal intelligent toys based on radio communication
CN107103795A (en) * 2017-06-28 2017-08-29 广州播比网络科技有限公司 A kind of interactive player method of Story machine
CN107308657A (en) * 2017-07-31 2017-11-03 广州网嘉玩具科技开发有限公司 A kind of novel interactive intelligent toy system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040117190A1 (en) * 2002-12-17 2004-06-17 Microsoft Corporation Computer system and method for enhancing experience using networked devices
CN201163780Y (en) * 2008-02-01 2008-12-10 广州汉音电子科技有限公司 Voice box and system for implementing voice box receiving
CN102831892A (en) * 2012-09-07 2012-12-19 深圳市信利康电子有限公司 Toy control method and system based on internet voice interaction
CN205516447U (en) * 2016-01-21 2016-08-31 上海市纺织科学研究院 Multi -functional removal intelligent toys based on radio communication
CN107103795A (en) * 2017-06-28 2017-08-29 广州播比网络科技有限公司 A kind of interactive player method of Story machine
CN107308657A (en) * 2017-07-31 2017-11-03 广州网嘉玩具科技开发有限公司 A kind of novel interactive intelligent toy system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111193940A (en) * 2019-12-09 2020-05-22 腾讯科技(深圳)有限公司 Audio playing method and device, computer equipment and computer readable storage medium
CN111193940B (en) * 2019-12-09 2021-07-06 腾讯科技(深圳)有限公司 Audio playing method and device, computer equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN106463112B (en) Voice recognition method, voice awakening device, voice recognition device and terminal
CN108520748B (en) Intelligent device function guiding method and system
US20110218798A1 (en) Obfuscating sensitive content in audio sources
CN103152499B (en) Echo canceler
US20120271631A1 (en) Speech recognition using multiple language models
CN108962262A (en) Voice data processing method and device
AU2001289766A1 (en) System and methods for recognizing sound and music signals in high noise and distortion
US8620670B2 (en) Automatic realtime speech impairment correction
CN109147820B (en) Vehicle-mounted sound control method and device, electronic equipment and storage medium
CN113986187A (en) Method and device for acquiring range amplitude, electronic equipment and storage medium
CN107656977A (en) The acquisition of multimedia file and player method and device
CN104409087A (en) Method and system of playing song documents
CN109285556A (en) Audio-frequency processing method, device, equipment and storage medium
CN204539202U (en) The online song system of a kind of vehicle-carried sound-controlled playing speech on demand
CN102881309B (en) Lyrics file generates method and device
US8315867B1 (en) Systems and methods for analyzing communication sessions
US20090210229A1 (en) Processing Received Voice Messages
CN107767860B (en) Voice information processing method and device
CN110035043A (en) A kind of story play system and method based on speech recognition
CN109510891A (en) Voice control recording device and method
CN113707128B (en) Test method and system for full duplex voice interaction system
US20220215835A1 (en) Evaluating user device activations
CN104484426A (en) Multi-mode music searching method and system
CN207895504U (en) Automobile data recorder
CN115699168A (en) Voiceprint management method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20190719