CN110430465B - Learning method based on intelligent voice recognition, terminal and storage medium - Google Patents

Publication number
CN110430465B
Authority
CN
China
Prior art keywords
learning
video
user
playing
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910636555.8A
Other languages
Chinese (zh)
Other versions
CN110430465A (en)
Inventor
岳顺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Skyworth RGB Electronics Co Ltd
Original Assignee
Shenzhen Skyworth RGB Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Skyworth RGB Electronics Co Ltd
Priority to CN201910636555.8A
Publication of CN110430465A
Priority to PCT/CN2020/073079 (WO2021008128A1)
Application granted
Publication of CN110430465B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065 Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/482 End-user interface for program selection
    • H04N21/4825 End-user interface for program selection using a list of items to be played back in a given order, e.g. playlists
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention discloses a learning method based on intelligent voice recognition, a terminal and a storage medium. The learning method comprises the following steps: a terminal obtains online learning videos configured by a server and displays them in list form on a display screen of the terminal; when the terminal receives a play instruction input by a user, the terminal plays the online learning video according to the play instruction and enters a learning mode; and when the terminal enters the learning mode, the terminal receives a voice instruction input by the user through intelligent voice recognition and switches the online learning video to a corresponding learning scene according to the voice instruction. The invention extracts the learning scenes of an online video through intelligent voice recognition and converts them into learning data, so that a user watching a learning video can flexibly control the playback of knowledge points, which improves the user's learning ability.

Description

Learning method based on intelligent voice recognition, terminal and storage medium
Technical Field
The invention relates to the field of terminal applications, and in particular to a learning method based on intelligent voice recognition, a terminal and a storage medium.
Background
With the development of internet technology, consumers can more easily obtain online video resources and use them to watch and learn. With learning-oriented video resources, however, users mostly acquire the knowledge in a video by watching it repeatedly; they do not interact with the video content, so their understanding of the material is not reinforced.
With the development of hardware and software, intelligent voice recognition technology has also advanced rapidly. Intelligent voice recognition can use algorithms to accurately convert voice information into text and understand the user's intention. For a user who needs to learn, however, existing approaches cannot obtain the user's learning progress through intelligent voice recognition, and therefore cannot find the learning resources the user needs at the current stage or improve the user's learning ability.
Accordingly, the prior art has yet to be improved and developed.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the above defects of the prior art, a learning method based on intelligent voice recognition, a terminal and a storage medium, in which the learning scenes of an online video are extracted through intelligent voice recognition and converted into learning data, so that a user watching a learning video can flexibly control the playback of knowledge points, which improves the user's learning ability.
The technical solution adopted by the invention to solve the technical problem is as follows:
The invention provides a learning method based on intelligent voice recognition, comprising the following steps:
a terminal obtains online learning videos configured by a server and displays them in list form on a display screen of the terminal;
when the terminal receives a play instruction input by a user, the terminal plays the online learning video according to the play instruction and enters a learning mode;
and when the terminal enters the learning mode, the terminal receives a voice instruction input by the user through intelligent voice recognition and switches the online learning video to a corresponding learning scene according to the voice instruction.
Further, the learning scenes include a word learning scene, a dialogue learning scene, and a word-dialogue cross-learning scene.
Further, the step in which the terminal obtains the online learning videos configured by the server and displays them in list form on the display screen of the terminal specifically comprises the following steps:
the terminal sends a request for obtaining the online learning videos to the server;
and the terminal receives the list of online learning videos sent by the server and displays the list on the display screen of the terminal.
Further, the step of, when the terminal receives a play instruction input by a user, playing the online learning video according to the play instruction and entering a learning mode specifically comprises the following steps:
when the terminal receives the play instruction, judging whether the play instruction is an instruction to play the online learning video;
when the play instruction is an instruction to play the online learning video, sending a request to download the online learning video to the server;
receiving and playing the online learning video downloaded from the server, and prompting the user on the display screen of the terminal whether to enter the learning mode;
and when the user chooses to enter the learning mode, starting the intelligent voice recognition function of the terminal.
Further, after the request to download the online learning video is sent to the server when the play instruction is an instruction to play the online learning video, the method further comprises the following step:
when the download of the online learning video is complete, prompting the learning rules and the learning mode to the user on the display screen of the terminal.
Further, the step of, when the terminal enters the learning mode, receiving a voice instruction input by the user through intelligent voice recognition and switching the online learning video to a corresponding learning scene according to the voice instruction specifically comprises the following steps:
when the terminal enters the learning mode, receiving a voice instruction input by the user through intelligent voice recognition;
and switching the online learning video to the corresponding learning scene according to the voice instruction, and playing the corresponding learning content in the online learning video.
Further, the step of switching the online learning video to the corresponding learning scene according to the voice instruction and playing the corresponding learning content in the online learning video specifically comprises the following steps:
switching the online learning video to the corresponding learning scene according to the voice instruction, and playing the corresponding learning content in the online learning video;
while the terminal plays the learning content, judging whether the user inputs voice information within a preset time;
when the user inputs voice information within the preset time, jumping to the time point corresponding to the voice information;
and playing, from that time point, the learning segment in the learning content that corresponds to the voice information.
Further, after the terminal enters the learning mode, receives a voice instruction input by the user through intelligent voice recognition and switches the online learning video to the corresponding learning scene according to the voice instruction, the method further comprises the following steps:
when the terminal finishes playing, uploading the voice information input by the user in the learning scene to the server;
and receiving the score and error corrections sent by the server, and presenting corresponding learning suggestions to the user on the display screen of the terminal according to the voice information.
The invention also provides a terminal comprising a processor and a memory connected to the processor, wherein the memory stores a learning program based on intelligent voice recognition which, when executed by the processor, implements the above learning method based on intelligent voice recognition.
The invention also provides a storage medium storing a learning program based on intelligent voice recognition which, when executed by a processor, implements the above learning method based on intelligent voice recognition.
The invention provides a learning method based on intelligent voice recognition, a terminal and a storage medium. Online learning videos configured by a server are obtained and displayed in list form on the display screen of the terminal, so that a user can select the required learning video from the list. After the user selects an online learning video, the user is prompted to enter a learning mode. In the learning mode, the terminal enters the corresponding learning scene in the online learning video according to a voice instruction input by the user, so that the user can study the corresponding learning content in that scene. In addition, after the user finishes learning, corresponding learning suggestions are given according to the voice information the user input during learning, so that the user can correct errors in subsequent learning. The invention extracts the learning scenes of an online video through intelligent voice recognition and converts them into learning data, so that a user watching a learning video can flexibly control the playback of knowledge points, which improves the user's learning ability.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the learning method based on intelligent speech recognition in the present invention.
Fig. 2 is a functional block diagram of a terminal and a server in the present invention.
Fig. 3 is a sequence diagram of interactions during use by a user in the present invention.
Fig. 4 is a first processing flowchart of the terminal in the present invention.
Fig. 5 is a second processing flowchart of the terminal in the present invention.
Fig. 6 is a sequence diagram of the terminal and the server creating the learning content in the present invention.
Fig. 7 is a functional block diagram of a terminal in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the invention and are not intended to limit it.
Embodiment one
Fig. 1 is a flowchart of a learning method based on intelligent voice recognition according to a preferred embodiment of the present invention.
The learning method based on intelligent voice recognition comprises the following steps:
Step S100: the terminal obtains online learning videos configured by the server and displays them in list form on the display screen of the terminal.
As shown in Fig. 2, in this embodiment the learning method based on intelligent voice recognition is mainly implemented by two devices. One is a terminal device (i.e., a playing device) that can obtain and play online videos over a network. It must also have a voice capture function, which may use either far-field or near-field voice. In addition, it must support intelligent voice recognition, so that it can recognize the user's voice instructions and the audio data of the video being played, and make a preliminary determination of whether the played video is suitable for learning.
The other device is a background server with voice recognition capability. It can recognize and analyze the audio and video uploaded by the terminal device, process them to obtain the corresponding learning data, and store the learning data in a database so that the data can be retrieved when the terminal enters the learning mode. The server must also be able to process and analyze the voice information input by the user, i.e., analyze it to obtain a score and give correction suggestions and methods. After the score and suggestions are obtained, the server stores the user's scoring result and gives a comprehensive evaluation and learning suggestions over a service cycle.
In this embodiment, online learning videos must be configured on the server in advance. To configure them, the server analyzes the audio content of the online videos that users play in daily use to determine whether each online video is an online learning video; if it is, the learning segments in the video are stored as learning data to be used when a user later studies the video in the learning mode. After the server has configured an online learning video, it stores the video in a corresponding list so that users can find the learning content they need from the list.
When a user uses the terminal device, the terminal obtains the online learning videos configured by the server and displays them in list form on the display screen, so that the user can easily find the required learning content in the list. Specifically, to obtain the online learning videos configured by the server, the terminal sends a request for the online learning videos to the server; on receiving the request, the server sends the pre-configured list to the terminal, and the terminal displays the list on the display screen.
That is, step S100 specifically includes the following steps:
Step S110: the terminal sends a request for obtaining the online learning videos to the server;
Step S120: the terminal receives the list of online learning videos sent by the server and displays the list on the display screen of the terminal.
Because the list of online learning videos configured by the server is obtained and displayed on the display screen, the user can look up the corresponding learning content in the list in a targeted manner while learning.
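Steps S110 and S120 can be sketched as a simple request and response. The following is a minimal illustration only: the names `LearningVideo`, `InMemoryServer`, `handle_list_request` and `fetch_and_render_list` are hypothetical, and the server is replaced by an in-memory stand-in rather than a real network service, which the patent does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningVideo:
    video_id: str  # hypothetical identifier, not defined in the patent
    title: str


class InMemoryServer:
    """Stand-in for the background server holding a pre-configured list."""

    def __init__(self, videos):
        self._videos = list(videos)

    def handle_list_request(self):
        # Return a copy so the terminal cannot mutate the server's list.
        return list(self._videos)


def fetch_and_render_list(server):
    """S110: request the list. S120: receive it and format it for display."""
    videos = server.handle_list_request()
    return [f"{i}. {v.title}" for i, v in enumerate(videos, start=1)]


demo_server = InMemoryServer([
    LearningVideo("v1", "Animal words"),
    LearningVideo("v2", "At the airport"),
])
for line in fetch_and_render_list(demo_server):
    print(line)
```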
Step S200: when the terminal receives a play instruction input by the user, the terminal plays the online learning video according to the play instruction and enters a learning mode.
In this embodiment, when a user clicks an online video on the terminal device, the terminal device determines from the click operation whether the clicked video is an online learning video. Specifically, the display screen of the terminal is divided into a dedicated online learning video area and a non-learning video area; when the user operates within the dedicated online learning video area, the play instruction input by the user is judged to be an instruction to play an online learning video.
When the play instruction input by the user is judged to be an instruction to play an online learning video, the terminal downloads the corresponding learning video and the learning data for the learning mode (i.e., the learning segments) from the server. When the download is complete, the terminal prompts the learning rules and the learning mode to the user. When the terminal plays the downloaded learning video, it asks the user in a dialog box whether to enter the learning mode. When the user chooses to enter the learning mode, the terminal starts the intelligent voice recognition function so that the user can carry out voice dialogue learning in the learning mode.
That is, step S200 specifically includes the following steps:
Step S210: when the terminal receives the play instruction, judging whether the play instruction is an instruction to play the online learning video;
Step S220: when the play instruction is an instruction to play the online learning video, sending a request to download the online learning video to the server;
Step S230: when the download of the online learning video is complete, prompting the learning rules and the learning mode to the user on the display screen of the terminal;
Step S240: receiving and playing the online learning video downloaded from the server, and prompting the user on the display screen of the terminal whether to enter the learning mode;
Step S250: when the user chooses to enter the learning mode, starting the intelligent voice recognition function of the terminal.
By judging whether the video clicked by the user is an online learning video, the terminal can prompt the user, based on the click operation, whether to enter the learning mode, and then enter the learning mode according to the user's choice.
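The order of steps S210 to S250 can be made concrete with a small sketch that returns the sequence of actions the terminal would take. The function name, the zone strings and the action names are all hypothetical; in particular, playing a non-learning video normally is an assumption, since the patent does not describe that branch.

```python
def handle_play_instruction(zone, user_enters_learning_mode):
    """Return the ordered actions for a click in the given screen zone.

    zone: "learning" for the dedicated online learning video area,
          anything else for the non-learning video area (assumed labels).
    """
    # S210: the zone of the click decides whether this is a learning video.
    if zone != "learning":
        return ["play_normally"]  # assumption: ordinary playback

    actions = [
        "request_download",          # S220: ask the server for the video
        "show_rules_and_mode",       # S230: shown once the download is done
        "play_video",                # S240: play the downloaded video
        "prompt_learning_mode",      # S240: dialog box asking the user
    ]
    if user_enters_learning_mode:
        actions.append("start_voice_recognition")  # S250
    return actions


print(handle_play_instruction("learning", True))
```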
Step S300: when the terminal enters the learning mode, the terminal receives a voice instruction input by the user through intelligent voice recognition and switches the online learning video to the corresponding learning scene according to the voice instruction.
In this embodiment, there are mainly three learning scenes. The first is a word learning scene centered on words, in which the terminal plays a word and the user repeats it, one word at a time. The second is a dialogue learning scene, in which the terminal and the user learn through conversation. The third is a word-dialogue cross-learning scene, which combines the word learning scene and the dialogue learning scene.
When the user enters a learning scene, the playing device supports two types of voice input. One is a voice instruction, such as "next word", "resume" or "end". The other is the spoken content of a learning segment, such as a word or a line of dialogue. When the voice input is recognized, playback jumps to the corresponding content and the corresponding learning segment is played.
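The two input types can be told apart with a simple lookup once the speech has been converted to text. The sketch below assumes exact matching on normalized text, which is an illustrative simplification; the patent does not describe how recognized text is matched. The names `VOICE_INSTRUCTIONS` and `classify_voice_input` are hypothetical.

```python
# Control phrases quoted in the description; a real set may be larger.
VOICE_INSTRUCTIONS = {"next word", "resume", "end"}


def classify_voice_input(recognized_text, segment_labels):
    """Classify recognized text as an instruction, segment content, or neither."""
    normalized = recognized_text.strip().lower()
    if normalized in VOICE_INSTRUCTIONS:
        return "instruction"
    if normalized in {label.strip().lower() for label in segment_labels}:
        return "content"
    return "unmatched"


labels = ["apple", "How are you"]
print(classify_voice_input("Next word", labels))    # instruction
print(classify_voice_input("how are you", labels))  # content
```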
In this embodiment, a voice instruction input by the user is received through intelligent voice recognition, the online learning video is then switched to the corresponding learning scene according to the voice instruction, and the corresponding learning content in the online learning video is played. Specifically, while the terminal plays the learning content, it determines whether the user inputs voice information within a preset time (e.g., 10 seconds). When the user inputs voice information within the preset time, the terminal jumps the online learning video being played to the time point corresponding to the voice information. For example, if the user says "next word" while the online learning video is playing, the terminal jumps the currently playing online learning video to the time point of the next word and then plays, from that time point, the learning segment in the learning content that corresponds to the voice information.
In addition, in this embodiment, when the terminal finishes playing, it uploads the voice information input by the user during learning to the server, and then displays the score and error corrections returned by the server on the display screen, together with corresponding learning suggestions.
That is, step S300 specifically includes the following steps:
Step S310: when the terminal enters the learning mode, receiving a voice instruction input by the user through intelligent voice recognition;
Step S320: switching the online learning video to the corresponding learning scene according to the voice instruction, and playing the corresponding learning content in the online learning video;
Step S330: when the terminal finishes playing, uploading the voice information input by the user in the learning scene to the server;
Step S340: receiving the score and error corrections sent by the server, and presenting corresponding learning suggestions to the user on the display screen of the terminal according to the voice information.
Step S320 in turn specifically includes the following steps:
Step S321: switching the online learning video to the corresponding learning scene according to the voice instruction, and playing the corresponding learning content in the online learning video;
Step S322: while the terminal plays the learning content, judging whether the user inputs voice information within a preset time;
Step S323: when the user inputs voice information within the preset time, jumping to the time point corresponding to the voice information;
Step S324: playing, from that time point, the learning segment in the learning content that corresponds to the voice information.
Because the learning scenes of the online video are extracted through intelligent voice recognition and converted into learning data, the user can flexibly control the playback of knowledge points while watching the learning video, which deepens the learning impression.
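Steps S322 to S324 for the "next word" example can be sketched as a lookup in a sorted list of segment start times. This is an illustration under stated assumptions: the segment start times are assumed to be available in ascending order, the preset time is taken as the 10 seconds given as an example, and `jump_target` is a hypothetical name. Only the "next word" instruction is handled here.

```python
import bisect

PRESET_SECONDS = 10.0  # the example preset time from the description


def jump_target(voice_text, waited_seconds, start_times, current_position):
    """Return the time point to jump to, or None if no jump applies.

    start_times must be sorted in ascending order (assumed).
    """
    # S322: no voice input within the preset time means no jump.
    if voice_text is None or waited_seconds > PRESET_SECONDS:
        return None
    # S323: "next word" maps to the first segment starting after the
    # current playback position.
    if voice_text.strip().lower() == "next word":
        i = bisect.bisect_right(start_times, current_position)
        return start_times[i] if i < len(start_times) else None
    return None


starts = [0.0, 12.5, 30.0]
print(jump_target("next word", 3.0, starts, 13.0))  # 30.0
```

S324 would then start playback at the returned time point; returning `None` leaves playback where it is.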
The present embodiment is further described below with reference to Figs. 3 to 6.
In this embodiment, whether a video is an online learning video is mainly determined by analyzing the audio content of the online video played by the user; if it is, the learning segments of the video are analyzed and stored as learning data, which the user can then control by voice the next time the video is played in the learning mode.
When a user enters the learning mode to play a video, the playing device presents the learning content to the user, so that the user knows the learning content in advance and can study the corresponding segments in a targeted manner. When the user speaks voice input related to the learning content, the playing device jumps to the corresponding learning segment, plays it, and pauses when playback of the segment is finished. When the user speaks a voice control instruction, the playing device likewise jumps to the corresponding learning segment for playback.
When the user inputs voice, the playing device uploads the voice input to the server for voice analysis, scoring and error correction, so that effective suggestions can be given on the user's voice input and the user's learning ability and knowledge can gradually improve. The background server can also provide a learning curve based on the usage time period and usage cycle, so as to help ensure stable or gradually improving learning ability during the learning process.
Specifically, as shown in Fig. 3, when the user uses the playing device, the playing device obtains the data of the learning content area configured by the server (this data is composed of the learning data produced after users have played online videos, or is edited manually) and displays it to the user. When the user clicks a learning film source, the playing device obtains the corresponding learning data from the server. The learning data is an ordered list of learning segments and mainly comprises the type (keyword type or language segment type), the play label (keyword or voice segment), the play time point, the segment duration, and so on.
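The learning data described here maps naturally onto a small record type. The sketch below is one possible representation, not the patent's own format: the field names, the `SegmentKind` enum and the choice of seconds as the time unit are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class SegmentKind(Enum):
    KEYWORD = "keyword"            # keyword type
    LANGUAGE_SEGMENT = "segment"   # language segment type


@dataclass(frozen=True)
class LearningSegment:
    kind: SegmentKind
    label: str               # play label: the keyword or the spoken segment
    start_seconds: float     # play time point (unit assumed)
    duration_seconds: float  # segment duration (unit assumed)

    @property
    def end_seconds(self):
        return self.start_seconds + self.duration_seconds


def ordered(segments):
    """Return the segments as an ordered list, sorted by play time point."""
    return sorted(segments, key=lambda s: s.start_seconds)


learning_data = ordered([
    LearningSegment(SegmentKind.LANGUAGE_SEGMENT, "how are you", 40.0, 5.0),
    LearningSegment(SegmentKind.KEYWORD, "apple", 12.0, 2.5),
])
print([s.label for s in learning_data])
```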
When the playing equipment finishes obtaining, prompting a user whether to enter a learning mode, and when the user selects to enter the learning mode, giving a learning rule by the playing equipment; the learning rule mainly comprises the content of the learning segment and the rule of learning the voice recognition; such as: the voice content rules allow the user to speak keywords [ words learn/reply from voice scene ] or speak voice segments of the voice scene; voice instruction rules, let the user speak voice instructions that control learning, such as: again, next, come again, end, etc.
After the playback device enters the learning mode, if the user gives no voice input within 10 s, the playback device enters normal playing. During normal playing, if a voice instruction from the user is received and detected to be a keyword, the playback device jumps to the learning segment corresponding to the keyword and plays from the time point matched to the keyword, pausing when the segment finishes. If another voice instruction is received during playing, matching is performed again; if matching fails, a prompt is given. If the received voice instruction is a language segment, the playback device jumps to the time point corresponding to the language segment and plays from there, and records the context within the learning scene for scene learning.
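The timeout check and the jump-and-pause behavior described above might look like the following sketch. The tuple layout of `Segment`, the function names, and exact string matching are assumptions for illustration; only the 10 s timeout comes from the text.

```python
from typing import List, Optional, Tuple

# (label, start time in seconds, duration in seconds); hypothetical layout.
Segment = Tuple[str, float, float]


def spoke_within_timeout(first_voice_at_s: Optional[float], timeout_s: float = 10.0) -> bool:
    """True if the user spoke within the timeout after entering the learning mode."""
    return first_voice_at_s is not None and first_voice_at_s <= timeout_s


def play_window(utterance: str, segments: List[Segment]) -> Optional[Tuple[float, float]]:
    """Return (jump-to time, pause time) for the matched segment, or None if matching fails."""
    spoken = utterance.strip().lower()
    for label, start_s, duration_s in segments:
        if label.lower() == spoken:
            # Play from the matched time point and pause when the segment ends.
            return (start_s, start_s + duration_s)
    # Matching failed: the device would give a prompt instead of jumping.
    return None
```

If `spoke_within_timeout` returns False, the device would fall back to normal playing, during which `play_window` can still be applied to later voice instructions.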
During learning, whenever the user inputs speech, the playback device uploads the speech to the server for scoring and pronunciation correction and records the current speech and its score; after repeated use, the highest, lowest and average scores are computed, and multiple speech recordings are kept so that the user can be prompted and corrected. When the user finishes, the playback device presents the total score for the current video source together with pronunciation suggestions.
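The score statistics mentioned above can be sketched as below. The function name `summarize_scores` and the dictionary keys are hypothetical; the text does not state the score scale or how the total score is derived, so only the highest, lowest and average values are computed here.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize_scores(scores: Sequence[float]) -> Dict[str, float]:
    """Compute the highest, lowest and average of the recorded scores."""
    if not scores:
        raise ValueError("at least one recorded score is required")
    return {
        "highest": max(scores),
        "lowest": min(scores),
        "average": mean(scores),
    }
```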
Fig. 6 is a sequence diagram describing how learning data is created. When a user plays an online video source, the playback device decodes the audio of the online video and recognizes its content; the online video is judged to be a source that supports the learning mode if the following conditions are met:
1. the source consists mainly of words of a certain type, arranged at regular intervals;
2. the source consists mainly of scene dialogue of a certain type.
When the playback device identifies a source that supports the learning mode, it prompts the user whether to enter the learning mode. If the user chooses to enter the learning mode, the decoded audio of the online video is sent to the server, and the server analyzes the entire audio data and generates the corresponding learning segment information according to the above conditions. After the entire audio data has been analyzed, the server organizes the content of all learning segments to generate learning rules, for example by listing all the words or by arranging the corresponding learning scenes; finally, the server generates a corresponding language content instruction set according to the learning rules.
After the server finishes processing, it returns the result to the playback device, which displays it (the learning rules and the language content instruction set are displayed). When the playback device plays the online learning video and the user wants to enter the learning mode, the server sends the learning data to the playback device, and voice interaction then takes place among the user, the playback device and the server according to the usage sequence diagram in fig. 3. To ensure that the learning data is valid, the server background supports manual editing and adjustment as well as manual entry, which makes it easy to create a dedicated learning area.
As shown in fig. 4 and fig. 5, the processing performed by the playback device comprises the following steps:
step S11, acquiring and displaying the online learning video configured by the server;
step S12, judging whether the online video selected by the user is the online learning video; if so, go to step S13; if not, returning to step S11;
step S13, playing and prompting whether to enter a learning mode;
step S14, judging whether the user selects to enter the learning mode; if so, go to step S15; if not, returning to step S13;
step S15, downloading the learning content rules and the learning content list;
step S16, finishing downloading and prompting learning rules and learning modes;
step S17, judging whether the user has voice input within 10 s; if so, go to step S18; if not, go to step S21;
step S18, inputting the voice learning content;
step S19, switching to the corresponding learning content time point, playing, and uploading voice to the server;
step S20, finishing learning on demand;
step S21, entering normal playing;
step S22, receiving learning content input;
step S23, receiving a voice operation instruction;
step S24, switching to the corresponding learning content playing;
step S25, finishing playing;
step S26, giving scores and suggestions.
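The branching in steps S11 to S26 can be traced with the short sketch below. The function `trace_steps` and its boolean parameters are hypothetical. The text does not say which step follows S20, so the trace simply stops there, and the two "return to" branches are recorded once rather than looped.

```python
from typing import List


def trace_steps(is_learning_video: bool,
                enters_learning_mode: bool,
                voice_within_timeout: bool) -> List[str]:
    """List the step labels visited for one pass through the flow."""
    steps = ["S11", "S12"]
    if not is_learning_video:
        return steps + ["S11"]          # S12 "no": return to S11
    steps += ["S13", "S14"]
    if not enters_learning_mode:
        return steps + ["S13"]          # S14 "no": return to S13
    steps += ["S15", "S16", "S17"]
    if voice_within_timeout:
        # S17 "yes": learning on demand (what follows S20 is not specified)
        return steps + ["S18", "S19", "S20"]
    # S17 "no": normal playing, then voice-driven switching and scoring
    return steps + ["S21", "S22", "S23", "S24", "S25", "S26"]
```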
The learning method based on intelligent voice recognition in this embodiment moves beyond monotonous learning by watching videos, brings the fun of interactive learning, and can improve learning efficiency; replacing passive watching with voice interaction also improves spoken expression and makes learning within a learning scene a reality. Meanwhile, the server back end can collect the user's voice input for scoring and comparison with standard pronunciation, and provide learning suggestions and learning track analysis, which improves learning efficiency and makes error correction easier.
Example two
Referring to fig. 7, fig. 7 is a functional block diagram of a terminal according to a preferred embodiment of the present invention.
As shown in fig. 7, an embodiment of the present invention provides a terminal, which may be a mobile terminal (such as a mobile phone or a tablet computer) or an intelligent terminal (such as a smart television or another intelligent device). The terminal of this embodiment includes a processor 10 and a memory 20 connected to the processor 10;
the memory 20 stores a learning program based on intelligent voice recognition which, when executed by the processor 10, implements the learning method based on intelligent voice recognition described above.
Example three
An embodiment of the present invention provides a storage medium storing a learning program based on intelligent voice recognition which, when executed by a processor, implements the learning method based on intelligent voice recognition described above.
In summary, the present invention provides a learning method based on intelligent voice recognition, a terminal and a storage medium. The method obtains the online learning videos configured by a server and displays them as a list on the display screen of the terminal, so that the user can select the required learning video from the list; after the user selects an online learning video, the user is prompted to enter a learning mode; in the learning mode, the terminal enters the corresponding learning scene in the online learning video according to a voice instruction input by the user, so that the user can learn the corresponding content within that scene; in addition, after the user finishes learning, learning suggestions are given based on the voice information the user produced during learning, so that the user can correct errors in subsequent learning. The invention extracts the learning scenes of an online video through intelligent voice recognition and converts them into learning data, so that a user watching a learning video can flexibly control the playing of knowledge points, which improves the user's learning ability.
Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.), and the program may be stored in a computer readable storage medium, and when executed, the program may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, etc.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims (5)

1. A learning method based on intelligent voice recognition is characterized by comprising the following steps:
the method comprises the steps that a terminal obtains an online learning video configured by a server, and the online learning video is displayed on a display screen of the terminal in a list form;
wherein displaying the online learning video on the display screen of the terminal in list form comprises the following steps:
the terminal sends a request for acquiring the online learning video to the server;
the terminal receives the list of the online learning videos sent by the server and displays the list on a display screen of the terminal;
when the terminal receives a playing instruction input by a user, playing the online learning video according to the playing instruction, and entering a learning mode;
wherein playing the online learning video according to the playing instruction and entering the learning mode, when the terminal receives a playing instruction input by a user, specifically comprises the following steps:
when the terminal receives the playing instruction, judging whether the playing instruction is an instruction for playing the online learning video;
when the playing instruction is an instruction for playing the online learning video, sending a request for downloading the online learning video to the server;
receiving and playing the online learning video downloaded from the server, and prompting the user on a display screen of the terminal whether to enter the learning mode;
when the user selects to enter a learning mode, the terminal starts an intelligent voice recognition function;
judging whether the video clicked by the user is an online learning video or not, so that the terminal prompts the user whether to enter a learning mode or not according to the clicking operation of the user and then enters a corresponding learning mode according to the selection of the user;
when the playing instruction is an instruction for playing the online learning video, the method further comprises the following steps after sending a request for downloading the online learning video to the server:
when the online learning video is downloaded, prompting a learning rule and a learning mode to the user on a display screen of the terminal;
when the terminal enters the learning mode, receiving a voice command input by a user through intelligent voice recognition, and switching the online learning video to a corresponding learning scene according to the voice command;
when the terminal enters the learning mode, receiving a voice instruction input by a user through intelligent voice recognition, and switching the online learning video to a corresponding learning scene according to the voice instruction specifically comprises the following steps:
when the terminal enters the learning mode, receiving a voice instruction input by a user through intelligent voice recognition;
switching the online learning video to a corresponding learning scene according to the voice command, and playing corresponding learning content in the online learning video;
the switching the online learning video to the corresponding learning scene according to the voice instruction and playing the corresponding learning content in the online learning video specifically comprises the following steps:
switching the online learning video to a corresponding learning scene according to the voice command, and playing corresponding learning content in the online learning video;
when the terminal plays the learning content, judging whether the user inputs voice information within a preset time;
jumping to a time point corresponding to the voice information when the user inputs the voice information within a preset time;
playing a learning segment corresponding to the voice information in the learning content according to the time point;
in the normal playing process, if a voice instruction input by a user is received and the voice instruction is detected to be a keyword, jumping to the learning segment corresponding to the keyword and playing from the time point matched to the keyword;
when the playing is finished, pausing; if the voice instruction is received again in the playing process, matching is carried out again; if the matching fails, giving a prompt; and if the received voice instruction is a language segment, jumping to a time point corresponding to the language segment for playing, and recording the context in the learning scene for scene learning.
2. The intelligent speech recognition-based learning method of claim 1, wherein the learning scenarios comprise word learning scenarios, dialogue learning scenarios, and word-dialogue cross-learning scenarios.
3. The intelligent voice recognition-based learning method according to claim 1, wherein after the terminal enters the learning mode, receiving a voice command input by a user through intelligent voice recognition, and switching the online learning video to a corresponding learning scene according to the voice command, the method further comprises the following steps:
when the terminal finishes playing, uploading the voice information input by the user in the learning scene to the server;
and receiving the scores and the error correction sent by the server, and providing corresponding learning suggestions to the user on a display screen of the terminal according to the voice information.
4. A terminal comprising a processor and a memory coupled to the processor, the memory storing an intelligent speech recognition based learning program, the intelligent speech recognition based learning program being executable by the processor to implement the intelligent speech recognition based learning method of any one of claims 1-3.
5. A storage medium storing a learning program based on intelligent speech recognition, wherein the learning program based on intelligent speech recognition is executed by a processor to implement the learning method based on intelligent speech recognition according to any one of claims 1-3.
CN201910636555.8A 2019-07-15 2019-07-15 Learning method based on intelligent voice recognition, terminal and storage medium Active CN110430465B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910636555.8A CN110430465B (en) 2019-07-15 2019-07-15 Learning method based on intelligent voice recognition, terminal and storage medium
PCT/CN2020/073079 WO2021008128A1 (en) 2019-07-15 2020-01-20 Smart voice recognition-based learning method, terminal, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910636555.8A CN110430465B (en) 2019-07-15 2019-07-15 Learning method based on intelligent voice recognition, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN110430465A CN110430465A (en) 2019-11-08
CN110430465B true CN110430465B (en) 2021-06-01

Family

ID=68409523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910636555.8A Active CN110430465B (en) 2019-07-15 2019-07-15 Learning method based on intelligent voice recognition, terminal and storage medium

Country Status (2)

Country Link
CN (1) CN110430465B (en)
WO (1) WO2021008128A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110430465B (en) * 2019-07-15 2021-06-01 深圳创维-Rgb电子有限公司 Learning method based on intelligent voice recognition, terminal and storage medium
CN111028828A (en) * 2019-12-20 2020-04-17 京东方科技集团股份有限公司 Voice interaction method based on screen drawing, screen drawing and storage medium
WO2021155812A1 (en) * 2020-02-07 2021-08-12 海信视像科技股份有限公司 Receiving device, server, and speech information processing system
CN113344318A (en) * 2021-04-16 2021-09-03 华蔚集团(广东)有限公司 On-line evaluation system based on textbook course
CN114520003A (en) * 2022-02-28 2022-05-20 安徽淘云科技股份有限公司 Voice interaction method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105681920A (en) * 2015-12-30 2016-06-15 深圳市鹰硕音频科技有限公司 Network teaching method and system with voice recognition function
CN105872828A (en) * 2016-03-30 2016-08-17 乐视控股(北京)有限公司 Television interactive learning method and device
CN109191349A (en) * 2018-11-02 2019-01-11 北京唯佳未来教育科技有限公司 A kind of methods of exhibiting and system of English learning content
WO2019102463A1 (en) * 2017-11-23 2019-05-31 Nagler Almog Tal An interface for training content over a network of mobile devices

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110065276A (en) * 2010-06-25 2011-06-15 서동혁 Method and apparatus for pronunciation exercise using comparison video
CN106227335B (en) * 2016-07-14 2020-07-03 广东小天才科技有限公司 Interactive learning method for preview lecture and video course and application learning client
CN106293347B (en) * 2016-08-16 2019-11-12 广东小天才科技有限公司 A kind of learning method and device, user terminal of human-computer interaction
CN107135418A (en) * 2017-06-14 2017-09-05 北京易世纪教育科技有限公司 A kind of control method and device of video playback
CN108766071A (en) * 2018-04-28 2018-11-06 北京猎户星空科技有限公司 A kind of method, apparatus, storage medium and the relevant device of content push and broadcasting
CN110430465B (en) * 2019-07-15 2021-06-01 深圳创维-Rgb电子有限公司 Learning method based on intelligent voice recognition, terminal and storage medium

Also Published As

Publication number Publication date
WO2021008128A1 (en) 2021-01-21
CN110430465A (en) 2019-11-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant