CN108682437B - Information processing method, device, medium and computing equipment - Google Patents

Information processing method, device, medium and computing equipment

Info

Publication number: CN108682437B
Application number: CN201810486526.3A
Authority: CN (China)
Prior art keywords: text information, user, information, pronunciation, standard
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN108682437A
Inventors: 臧阳光, 沙泓州
Current assignee: Lede Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Lede Technology Co Ltd
Application filed by Lede Technology Co Ltd
Priority to CN201810486526.3A
Publication of CN108682437A
Application granted; publication of CN108682437B


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention provides an information processing method comprising the following steps: generating, from received user voice information, user text information corresponding to the voice information, wherein the user text information represents the user voice information; segmenting the user text information and the corresponding standard text information into at least one pronunciation unit according to pronunciation rules; comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information; and outputting prompt information indicating the pronunciation units that differ between the user text information and the standard text information. By segmenting and comparing the user text information and the standard text information, the invention can determine precisely which part of the user's pronunciation differs from the standard text, so the user can be explicitly told which specific pronunciation units are wrong. This helps the user improve, provides a better experience, and reduces the user's learning cost.

Description

Information processing method, device, medium and computing equipment
Technical Field
The embodiments of the invention relate to the field of computer technology, and in particular to an information processing method, apparatus, medium, and computing device.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
With continuing advances in technology and the diversified development of economy and culture, learning applications are increasingly used in many aspects of life and work. Currently, some language-learning applications can recognize a user's pronunciation and prompt the user as to whether it is accurate. However, these applications usually only tell the user whether the pronunciation of a word as a whole is accurate; the user cannot learn which part of the pronunciation was inaccurate and still has to try repeatedly, which is costly and makes for a poor user experience.
Disclosure of Invention
The prior art can thus only identify whether the pronunciation of a whole word is accurate; it cannot indicate which part of the pronunciation is inaccurate, which makes learning inconvenient for the user.
An improved processing method is therefore needed that can tell the user which parts of his or her pronunciation are accurate and which are not, helping the user find problems, reducing the learning cost, and improving the user experience.
In this context, embodiments of the present invention are intended to provide an information processing method, apparatus, medium, and computing device.
In a first aspect of embodiments of the present invention, there is provided an information processing method including: generating, from received user voice information, user text information corresponding to the voice information, wherein the user text information represents the user voice information; segmenting the user text information and the corresponding standard text information into at least one pronunciation unit according to pronunciation rules; comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information; and outputting prompt information, wherein the prompt information indicates the pronunciation units that differ between the user text information and the standard text information.
In an embodiment of the present invention, generating the user text information corresponding to the voice information from the received user voice information includes: determining user word text information corresponding to the user voice information from the received user voice information, and generating user phonetic symbol text information from the user word text information.
In an embodiment of the invention, the user text information includes user phonetic symbol text information and the standard text information includes standard phonetic symbol text information. Segmenting the user text information and the corresponding standard text information into at least one pronunciation unit includes: segmenting the user phonetic symbol text information and the standard phonetic symbol text information respectively into a plurality of pronunciation units, where each pronunciation unit comprises one phonetic symbol unit.
In an embodiment of the present invention, comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information includes: comparing the pronunciation units one by one in their order of arrangement, and/or comparing them through an edit distance algorithm.
In an embodiment of the present invention, outputting the prompt information includes: outputting presentation information indicating the differing pronunciation units in the user text information and the standard text information, and/or outputting voice information indicating those differing pronunciation units.
In a second aspect of embodiments of the present invention, there is provided an information processing apparatus including a generation module, a segmentation module, a comparison module, and an output module. The generation module generates, from received user voice information, user text information corresponding to the voice information, wherein the user text information represents the user voice information. The segmentation module segments the user text information and the corresponding standard text information into at least one pronunciation unit according to pronunciation rules. The comparison module compares the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information. The output module outputs prompt information, wherein the prompt information indicates the pronunciation units that differ between the user text information and the standard text information.
In an embodiment of the present invention, generating the user text information corresponding to the voice information from the received user voice information includes: determining user word text information corresponding to the user voice information from the received user voice information, and generating user phonetic symbol text information from the user word text information.
In an embodiment of the invention, the user text information includes user phonetic symbol text information and the standard text information includes standard phonetic symbol text information. Segmenting the user text information and the corresponding standard text information into at least one pronunciation unit includes: segmenting the user phonetic symbol text information and the standard phonetic symbol text information respectively into a plurality of pronunciation units, where each pronunciation unit comprises one phonetic symbol unit.
In an embodiment of the present invention, comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information includes: comparing the pronunciation units one by one in their order of arrangement, and/or comparing them through an edit distance algorithm.
In an embodiment of the present invention, outputting the prompt information includes: outputting presentation information indicating the differing pronunciation units in the user text information and the standard text information, and/or outputting voice information indicating those differing pronunciation units.
In a third aspect of embodiments of the present invention, there is provided a computing device comprising: one or more memories storing executable instructions and one or more processors executing the executable instructions to implement any of the methods described above.
In a fourth aspect of embodiments of the present invention there is provided a computer readable storage medium having stored thereon executable instructions which, when executed by a processing unit, cause the processing unit to perform any of the methods described above.
According to embodiments of the invention, by segmenting and comparing the user text information and the standard text information, it is possible to determine precisely which part of the user's pronunciation differs from the standard text, so the user can be explicitly told which specific pronunciation units differ. This helps the user improve, provides a better experience, and reduces the user's learning cost.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 schematically shows an application scenario according to an embodiment of the invention;
FIG. 2 schematically shows a flow chart of an information processing method according to an embodiment of the invention;
FIG. 3 schematically shows a schematic view of a readable storage medium according to an embodiment of the invention;
fig. 4 schematically shows a block diagram of an information processing apparatus according to an embodiment of the present invention;
fig. 5 schematically shows a computing device suitable for implementing the information processing method according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Thus, the present invention may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to an embodiment of the invention, an information processing method, an information processing device, an information processing medium and a computing device are provided.
In this document, it is to be understood that any number of elements in the figures are provided by way of illustration and not limitation, and any nomenclature is used for differentiation only and not in any limiting sense.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments of the invention.
Summary of The Invention
The inventors found that, in order to reduce the user's learning cost and improve learning efficiency, the user's pronunciation can be converted into corresponding user text information; the user text information and the standard text information can then each be segmented into at least one pronunciation unit, the pronunciation units compared, and the differing units identified to show the user which part of the pronunciation is inaccurate. The user can then learn and improve in a more targeted way, which provides a better experience.
Having described the general principles of the invention, various non-limiting embodiments of the invention are described in detail below.
Application scene overview
Referring first to fig. 1, fig. 1 schematically shows an application scenario according to an embodiment of the present invention.
As shown in fig. 1, the application scenario 100 includes a user 110 and an electronic device 120. The electronic device 120 is capable of receiving voice information of the user 110.
According to the embodiment of the present invention, the electronic device 120 may be various electronic devices having a microphone and a display screen, including but not limited to a smart phone, a tablet computer, a laptop portable computer, a desktop computer, and the like.
In the embodiment of the present invention, a user may perform pronunciation exercises. The electronic device 120 may receive the user's voice information and convert it into corresponding user text information; it may further segment the user text information and the corresponding standard text information into at least one pronunciation unit according to pronunciation rules, and then compare the pronunciation units of the two to find the differing units and prompt the user. For example, as shown in fig. 1, the differing pronunciation units can be marked in red to show the user which part of the pronunciation is inaccurate, thereby reducing the user's learning cost and improving learning efficiency.
Exemplary method
An information processing method according to an exemplary embodiment of the present invention is described below with reference to fig. 2 in conjunction with the application scenario of fig. 1. It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect. Rather, embodiments of the present invention may be applied to any scenario where applicable.
Fig. 2 schematically shows a flow chart of an information processing method according to an embodiment of the present invention.
As shown in fig. 2, the method includes operations S201 to S204.
In operation S201, user text information corresponding to the voice information is generated according to the received user voice information, and the user text information can represent the user voice information.
According to the embodiment of the invention, the electronic device can receive the user's voice information in a specific scenario, for example when the electronic device is currently running a corresponding language-learning application. Alternatively, the electronic device may receive user voice information in response to a specific condition, for example on receiving a start instruction from the user.
In the embodiment of the invention, speech recognition can be performed on the received user voice information, converting it into corresponding user text information.
In an embodiment of the present invention, generating the user text information corresponding to the voice information from the received user voice information may include: determining user word text information corresponding to the user voice information from the received user voice information, and generating user phonetic symbol text information from the user word text information.
For example, speech recognition may be performed on the user's received pronunciation, the word corresponding to that pronunciation determined, and the phonetic symbol information of the word obtained to generate the user phonetic symbol text information corresponding to the pronunciation. For example, in an English scenario, if the word corresponding to the user's pronunciation is "but", the user text information generated from the pronunciation may be "/bʌt/".
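The word-to-phonetic-symbol step above can be pictured as a dictionary lookup. The following is a minimal illustrative sketch, not the patent's implementation; the dictionary contents and function name are assumptions for demonstration.

```python
# Hypothetical mini pronunciation dictionary (word -> IPA transcription).
# A real system would use a full pronunciation lexicon or a
# grapheme-to-phoneme model; these two entries are illustrative only.
PRONUNCIATION_DICT = {
    "but": "/bʌt/",
    "cat": "/kæt/",
}

def word_to_phonetic_text(word: str) -> str:
    """Look up the phonetic symbol text for a recognized word."""
    return PRONUNCIATION_DICT[word.lower()]
```

For example, `word_to_phonetic_text("but")` yields `"/bʌt/"`, matching the English scenario described above.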
In another embodiment of the present invention, generating the user text information corresponding to the voice information from the received user voice information may further include: directly generating user phonetic symbol text information corresponding to the voice information from the received user voice information.
For example, the phonetic symbol information corresponding to each pronunciation unit in the user's pronunciation may be determined directly through speech recognition, thereby generating the user phonetic symbol text information corresponding to the pronunciation.
In operation S202, the user text information and the corresponding standard text information are segmented into at least one pronunciation unit according to pronunciation rules.
According to the embodiment of the invention, the pronunciation rules can differ for different languages. For example, in Chinese the user's pronunciation can be split according to the rules of initials and finals, while in English it can be split according to the rules of vowels and consonants.
In the embodiment of the present invention, the user text information may be user phonetic symbol text information, and the standard text information may be standard phonetic symbol text information. Segmenting the user text information and the corresponding standard text information into at least one pronunciation unit includes: segmenting the user phonetic symbol text information and the standard phonetic symbol text information respectively into a plurality of pronunciation units, where each pronunciation unit comprises one phonetic symbol unit.
For example, in an English scenario, the user phonetic symbol text information may be /bʌt/, which may be segmented according to pronunciation rules into [/b/, /ʌ/, /t/], where /b/, /ʌ/ and /t/ are each a pronunciation unit. The standard phonetic symbol text (rendered as an inline image in the original, with a vowel symbol differing from /ʌ/) may likewise be segmented according to pronunciation rules into /b/, the standard vowel (image), and /t/, each a pronunciation unit.
It can be understood that, in order to indicate more precisely where the user's pronunciation is inaccurate, the embodiment of the present invention may segment the phonetic symbol text at the smallest granularity the pronunciation rules allow. For example, in an English scenario, the phonetic symbol text may be segmented symbol by symbol into consonants, monophthongs, and diphthongs, so that each pronunciation unit contains exactly one phonetic symbol unit, that is, one consonant, one monophthong, or one diphthong.
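The minimal-granularity segmentation described above can be sketched as a longest-match scan over a phoneme inventory, so that a two-character diphthong is kept as one unit. This is an illustrative sketch under assumptions: the patent does not give a segmentation algorithm, and the diphthong inventory below is a small assumed subset.

```python
# Assumed subset of English diphthongs (two-character IPA sequences).
DIPHTHONGS = {"aɪ", "eɪ", "ɔɪ", "aʊ", "əʊ"}

def segment_phonetic_text(phonetic: str) -> list[str]:
    """Split a transcription like '/bʌt/' into minimal pronunciation units.

    Each returned unit is one consonant, one monophthong, or one diphthong.
    Uses longest-match first so diphthongs are not split into two units.
    """
    symbols = phonetic.strip("/")
    units = []
    i = 0
    while i < len(symbols):
        # Prefer a two-character diphthong over a single symbol.
        if symbols[i:i + 2] in DIPHTHONGS:
            units.append(symbols[i:i + 2])
            i += 2
        else:
            units.append(symbols[i])
            i += 1
    return units
```

With this sketch, `segment_phonetic_text("/bʌt/")` yields `["b", "ʌ", "t"]`, matching the segmentation of the English example above.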
In operation S203, at least one pronunciation unit of the user text information is compared with at least one pronunciation unit of the standard text information.
According to the embodiment of the invention, at least one pronunciation unit of the user text information and at least one pronunciation unit of the standard text information can be compared one by one according to the arrangement sequence of the pronunciation units.
For example, a first pronunciation unit of user text information may be compared to a first pronunciation unit of standard text information, a second pronunciation unit of user text information may be compared to a second pronunciation unit of standard text information, and so on.
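The one-by-one positional comparison just described can be sketched in a few lines; the function name is an assumption for illustration, not from the patent.

```python
def compare_positional(user_units: list[str],
                       standard_units: list[str]) -> list[int]:
    """Compare pronunciation units at the same position and return the
    indices where the user's unit differs from the standard unit."""
    return [
        i for i, (u, s) in enumerate(zip(user_units, standard_units))
        if u != s
    ]
```

For example, comparing `["b", "ʌ", "t"]` against a standard `["b", "æ", "t"]` flags position 1, the vowel, as the differing pronunciation unit.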
According to the embodiment of the invention, at least one pronunciation unit of the user text information can be compared with at least one pronunciation unit of the standard text information through an edit distance algorithm.
It can be understood that if the user's pronunciation is very inaccurate, the number of pronunciation units obtained by segmenting the user text information may differ from the number obtained by segmenting the standard text information. Therefore, the invention can also use an edit distance algorithm to determine the differing pronunciation units between the user text information and the standard text information.
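The edit-distance comparison can be sketched with a standard Levenshtein dynamic program over pronunciation units plus a backtrace that collects the standard-side units involved in edits. This is one possible realization under assumptions; the patent names the edit distance algorithm but does not specify how differing units are extracted from the alignment.

```python
def diff_units(user_units: list[str], standard_units: list[str]) -> list[str]:
    """Return the standard-side units the user substituted or missed,
    working even when the two segmentations have different lengths."""
    m, n = len(user_units), len(standard_units)
    # dp[i][j] = edit distance between user_units[:i] and standard_units[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if user_units[i - 1] == standard_units[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # extra user unit
                           dp[i][j - 1] + 1,        # missing standard unit
                           dp[i - 1][j - 1] + cost)  # match / substitution
    # Backtrace to collect the standard units involved in edits.
    wrong, i, j = [], m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1]
                and user_units[i - 1] == standard_units[j - 1]):
            i, j = i - 1, j - 1                      # exact match
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            wrong.append(standard_units[j - 1])      # substituted unit
            i, j = i - 1, j - 1
        elif j > 0 and dp[i][j] == dp[i][j - 1] + 1:
            wrong.append(standard_units[j - 1])      # missed standard unit
            j -= 1
        else:
            i -= 1                                   # extra user unit
    return list(reversed(wrong))
```

For example, `diff_units(["b", "t"], ["b", "ʌ", "t"])` reports the standard vowel as the missed unit even though the two unit lists have different lengths, which the positional comparison alone cannot handle.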
The two comparison methods of the embodiments of the present invention may be used alone or in combination, and the present invention is not limited thereto, and those skilled in the art can set the comparison method according to the actual application situation.
In operation S204, prompt information indicating the pronunciation units that differ between the user text information and the standard text information is output.
According to the embodiment of the invention, the display information can be output, and the display information can indicate different pronunciation units in the user text information and the standard text information.
For example, the differing pronunciation units may be presented separately, or highlighted in red or bold within the standard text information, to show the user which part of the pronunciation is inaccurate.
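As a minimal sketch of the display step, the differing units can be marked inline in the standard transcription. Brackets stand in for the red or bold styling the embodiment describes; the function name and marker choice are assumptions for illustration.

```python
def highlight(standard_units: list[str], wrong_indices: list[int]) -> str:
    """Render the standard transcription with differing units marked,
    e.g. units ['b', 'æ', 't'] with index 1 wrong -> '/b[æ]t/'."""
    marked = [
        f"[{u}]" if i in wrong_indices else u
        for i, u in enumerate(standard_units)
    ]
    return "/" + "".join(marked) + "/"
```

A real user interface would replace the brackets with color or font styling, but the structure of the prompt information is the same.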
According to the embodiment of the invention, the voice information can also be output, and the voice information can indicate the different pronunciation units in the user text information and the standard text information.
For example, different pronunciation units may be read separately, or the correct pronunciation may be read so that the user may improve learning.
The two output methods of the embodiment of the present invention may be used alone or in combination, and the present invention is not limited thereto, and those skilled in the art may set the two output methods according to the actual application situation.
According to the embodiment of the invention, the user's pronunciation is converted into corresponding user text information; the user text information and the standard text information are each segmented into at least one pronunciation unit; and the pronunciation units are compared to find the differing units and show the user where the pronunciation is inaccurate. The user can thus learn and improve in a more targeted way, which reduces the learning cost, improves learning efficiency, and provides a better experience.
Exemplary Medium
The exemplary embodiments of the present invention provide a computer-readable storage medium storing computer-executable instructions, which when executed by a processing unit, are used to implement the information processing method described in any one of the above method embodiments.
In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product including program code for causing a computing device to perform operations in the information processing methods according to various exemplary embodiments of the present invention described in the above section "exemplary methods" of this specification when the program product is run on the computing device, for example, the computing device may perform operation S201 as shown in fig. 2: generating user text information corresponding to the voice information according to the received user voice information, wherein the user text information can represent the user voice information; operation S202: according to pronunciation rules, segmenting the user text information and the corresponding standard text information into at least one pronunciation unit; operation S203: comparing at least one pronunciation unit of the user text information with at least one pronunciation unit of the standard text information; operation S204: and outputting prompt information which indicates different pronunciation units in the user text information and the standard text information.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
As shown in fig. 3, a program product 30 for an information processing method according to an embodiment of the present invention may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a computing device such as a personal computer. However, the program product of the present invention is not limited in this regard, and in this document a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the remote case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the internet using an internet service provider).
Exemplary devices
Having described the medium of an exemplary embodiment of the present invention, an information processing apparatus of an exemplary embodiment of the present invention is next described with reference to fig. 4.
Fig. 4 schematically shows a block diagram of an information processing apparatus 400 according to an embodiment of the present invention.
As shown in fig. 4, the information processing apparatus 400 includes a generation module 410, a segmentation module 420, a comparison module 430, and an output module 440.
The generating module 410 generates user text information corresponding to the voice information according to the received user voice information, wherein the user text information can represent the user voice information.
According to an embodiment of the invention, generating the user text information corresponding to the voice information from the received user voice information includes: determining user word text information corresponding to the user voice information from the received user voice information, and generating user phonetic symbol text information from the user word text information.
According to the embodiment of the present invention, the generating module 410 may, for example, perform the operation S201 described above with reference to fig. 2, which is not described herein again.
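A minimal sketch of the generation step (operation S201) may help make this concrete. It assumes speech recognition has already produced word text; the tiny pronunciation dictionary below is a hypothetical stand-in for a full lexicon (e.g. CMUdict) and is not part of the patent itself.

```python
# Hypothetical pronunciation lexicon; a real system would use a full
# dictionary covering the target language.
PRONUNCIATION_DICT = {
    "hello": "həˈləʊ",
    "world": "wɜːld",
}

def generate_phonetic_text(word_text: str) -> str:
    """Map recognized word text to phonetic-symbol text, word by word.

    Unknown words are marked with "?" in this sketch.
    """
    symbols = []
    for word in word_text.lower().split():
        symbols.append(PRONUNCIATION_DICT.get(word, "?"))
    return " ".join(symbols)

print(generate_phonetic_text("Hello world"))  # həˈləʊ wɜːld
```

The same transformation would be applied to the standard (reference) word text to obtain the standard phonetic symbol text information.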
The segmentation module 420 segments the user text information and the corresponding standard text information into at least one pronunciation unit according to the pronunciation rules.
According to an embodiment of the invention, the user text information includes user phonetic symbol text information and the standard text information includes standard phonetic symbol text information. Segmenting the user text information and the corresponding standard text information into at least one pronunciation unit includes: segmenting the user phonetic symbol text information and the standard phonetic symbol text information into a plurality of pronunciation units, respectively, where each pronunciation unit includes a phonetic symbol unit.
According to an embodiment of the present invention, the segmentation module 420 may, for example, perform operation S202 described above with reference to fig. 2, which is not described again here.
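The segmentation step (operation S202) can be sketched as follows. The digraph and modifier sets below are a simplified, assumed rule set for English IPA transcriptions, not the pronunciation rules the patent itself specifies; a real system would use the full rules of the target language.

```python
# Assumed rule set: IPA digraphs treated as single pronunciation units,
# and marks that modify a neighbouring symbol.
DIGRAPHS = {"tʃ", "dʒ", "aɪ", "eɪ", "ɔɪ", "aʊ", "əʊ", "ɪə", "eə", "ʊə"}

def segment_units(transcription: str) -> list[str]:
    """Split one word's phonetic transcription into pronunciation units."""
    units: list[str] = []
    i = 0
    while i < len(transcription):
        ch = transcription[i]
        if ch in ("ˈ", "ˌ"):            # stress marks: dropped in this sketch
            i += 1
        elif ch == "ː" and units:        # length mark joins the previous unit
            units[-1] += ch
            i += 1
        elif transcription[i:i + 2] in DIGRAPHS:
            units.append(transcription[i:i + 2])
            i += 2
        else:
            units.append(ch)
            i += 1
    return units

print(segment_units("wɜːld"))  # ['w', 'ɜː', 'l', 'd']
```

Both the user phonetic symbol text and the standard phonetic symbol text would be segmented with the same rules so that their units are comparable.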
The comparison module 430 compares at least one pronunciation unit of the user text information with at least one pronunciation unit of the standard text information.
According to an embodiment of the invention, comparing at least one pronunciation unit of the user text information with at least one pronunciation unit of the standard text information includes: comparing the pronunciation units of the user text information with those of the standard text information one by one, in their order of arrangement; and/or comparing at least one pronunciation unit of the user text information with at least one pronunciation unit of the standard text information using an edit distance algorithm.
According to the embodiment of the present invention, the comparing module 430 may, for example, perform the operation S203 described above with reference to fig. 2, which is not described herein again.
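The edit-distance variant of the comparison (operation S203) can be sketched with a plain Levenshtein dynamic program over the two unit lists; backtracking through the table recovers the positions where the user's units differ from the standard. This is a generic edit-distance implementation offered as an illustration, not the patent's specific algorithm.

```python
def diff_units(user: list[str], standard: list[str]) -> list[tuple]:
    """Align user and standard pronunciation units; return differing pairs.

    Each pair is (user_unit, standard_unit); None marks an inserted or
    missing unit.
    """
    m, n = len(user), len(standard)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if user[i - 1] == standard[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    # Backtrack to collect the differing units.
    diffs, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] and user[i - 1] == standard[j - 1]:
            i, j = i - 1, j - 1                      # exact match
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            diffs.append((user[i - 1], standard[j - 1]))
            i, j = i - 1, j - 1                      # substitution
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            diffs.append((user[i - 1], None))        # extra user unit
            i -= 1
        else:
            diffs.append((None, standard[j - 1]))    # missing unit
            j -= 1
    return list(reversed(diffs))

print(diff_units(["s", "ɪ", "t"], ["s", "iː", "t"]))  # [('ɪ', 'iː')]
```

A one-by-one positional comparison, by contrast, would simply zip the two lists and report unequal pairs; the edit-distance alignment additionally tolerates inserted or dropped sounds.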
The output module 440 outputs prompt information indicating the pronunciation units of the user text information that differ from those of the standard text information.
According to an embodiment of the invention, outputting the prompt information includes: outputting display information indicating the pronunciation units that differ between the user text information and the standard text information; and/or outputting voice information indicating those differing pronunciation units.
According to the embodiment of the present invention, the output module 440 may, for example, perform the operation S204 described above with reference to fig. 2, which is not described herein again.
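The prompt-output step (operation S204) can be sketched as rendering the differing units as a display string; the message wording below is illustrative only, and the voice-information branch (e.g. text-to-speech) is noted only in a comment.

```python
def format_prompt(diffs: list[tuple]) -> str:
    """Render (user_unit, standard_unit) pairs as display prompt text.

    A voice prompt would feed the same text to a TTS engine instead.
    """
    lines = []
    for user_unit, standard_unit in diffs:
        if user_unit is None:
            lines.append(f"missing sound: expected /{standard_unit}/")
        elif standard_unit is None:
            lines.append(f"extra sound: /{user_unit}/")
        else:
            lines.append(f"you said /{user_unit}/, expected /{standard_unit}/")
    return "\n".join(lines) if lines else "Pronunciation matches the standard."

print(format_prompt([("ɪ", "iː")]))  # you said /ɪ/, expected /iː/
```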
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, any of the generating module 410, the segmentation module 420, the comparison module 430, and the output module 440 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the generating module 410, the segmentation module 420, the comparison module 430, and the output module 440 may be implemented at least partially as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on a chip, a system on a substrate, a system on a package, or an application-specific integrated circuit (ASIC); or by hardware or firmware in any other reasonable manner of integrating or packaging a circuit; or by any one of the three implementations of software, hardware, and firmware, or by any suitable combination thereof. Alternatively, at least one of the generating module 410, the segmentation module 420, the comparison module 430, and the output module 440 may be at least partially implemented as a computer program module which, when executed, performs the corresponding function.
Exemplary computing device
Having described the method, medium, and apparatus of exemplary embodiments of the present invention, a computing device of an exemplary embodiment of the present invention for implementing the information processing method of the present invention will next be described with reference to fig. 5.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or program product. Thus, various aspects of the invention may take the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
In some possible embodiments, a computing device according to the present invention may include at least one processing unit and at least one storage unit, where the storage unit stores program code that, when executed by the processing unit, causes the processing unit to perform the steps of the information processing method according to the various exemplary embodiments of the present invention described in the section "Exemplary method" above. For example, the processing unit may perform operation S201 shown in fig. 2: generating user text information corresponding to the voice information according to the received user voice information, where the user text information represents the user voice information; operation S202: segmenting the user text information and the corresponding standard text information into at least one pronunciation unit according to pronunciation rules; operation S203: comparing at least one pronunciation unit of the user text information with at least one pronunciation unit of the standard text information; and operation S204: outputting prompt information indicating the pronunciation units that differ between the user text information and the standard text information.
A computing device 500 according to this embodiment of the invention is described below with reference to fig. 5. The computing device 500 shown in FIG. 5 is only one example and should not be taken to limit the scope of use and functionality of embodiments of the present invention.
As shown in fig. 5, computing device 500 is embodied in the form of a general purpose computing device. Components of computing device 500 may include, but are not limited to: at least one processing unit 510, at least one storage unit 520, and a bus 530 that couples the various system components (including the storage unit 520 and the processing unit 510).
Bus 530 includes a data bus, a control bus, an address bus, and the like.
The storage unit 520 may include volatile memory, such as a random access memory (RAM) 521 and/or a cache memory 522, and may further include a read-only memory (ROM) 523.
The storage unit 520 may also include a program/utility 525 having a set (at least one) of program modules 524, such program modules 524 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Computing device 500 may also communicate with one or more external devices 540 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.) via an input/output (I/O) interface 550. Moreover, computing device 500 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 560. As shown, the network adapter 560 communicates with the other modules of computing device 500 via the bus 530. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with computing device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
It should be noted that although the above detailed description mentions several units/modules or sub-units/modules of the apparatus, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functionality of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided among a plurality of units/modules.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be broken down into multiple steps.
While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, and that the division into aspects is for convenience of description only; features of the aspects may be combined to advantage. The invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. An information processing method comprising:
generating user text information corresponding to the voice information according to the received user voice information, wherein the user text information can represent the user voice information;
according to pronunciation rules, segmenting the user text information and the corresponding standard text information into at least one pronunciation unit;
comparing at least one pronunciation unit of the user text information with at least one pronunciation unit of the standard text information;
outputting prompt information which indicates different pronunciation units in the user text information and the standard text information;
the comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information includes:
comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information one by one according to the arrangement sequence of the pronunciation units; and/or
comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information using an edit distance algorithm.
2. The method of claim 1, wherein the generating user text information corresponding to the voice information from the received user voice information comprises:
determining user word text information corresponding to the user voice information according to the received user voice information;
and generating user phonetic symbol text information according to the user word text information.
3. The method of claim 1, wherein:
the user text information comprises user phonetic symbol text information;
the standard text information comprises standard phonetic symbol text information;
the segmenting the user text information and the corresponding standard text information into at least one pronunciation unit includes:
segmenting the user phonetic symbol text information and the standard phonetic symbol text information into a plurality of pronunciation units, respectively, wherein each pronunciation unit comprises a phonetic symbol unit.
4. The method of claim 1, wherein the outputting the prompt message comprises:
outputting presentation information indicating different pronunciation units in the user text information and the standard text information; and/or
outputting voice information indicating different pronunciation units in the user text information and the standard text information.
5. An information processing apparatus comprising:
the generating module is used for generating user text information corresponding to the voice information according to the received user voice information, wherein the user text information can represent the user voice information;
the segmentation module is used for segmenting the user text information and the corresponding standard text information into at least one pronunciation unit according to pronunciation rules;
the comparison module is used for comparing at least one pronunciation unit of the user text information with at least one pronunciation unit of the standard text information;
the output module is used for outputting prompt information which indicates different pronunciation units in the user text information and the standard text information;
the comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information includes:
comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information one by one according to the arrangement sequence of the pronunciation units; and/or
comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information using an edit distance algorithm.
6. The apparatus of claim 5, wherein the generating of the user text information corresponding to the voice information from the received user voice information comprises:
determining user word text information corresponding to the user voice information according to the received user voice information;
and generating user phonetic symbol text information according to the user word text information.
7. The apparatus of claim 5, wherein said comparing at least one pronunciation unit of the user text message to at least one pronunciation unit of the standard text message comprises:
comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information one by one according to the arrangement sequence of the pronunciation units; and/or
comparing the at least one pronunciation unit of the user text information with the at least one pronunciation unit of the standard text information using an edit distance algorithm.
8. The apparatus of claim 5, wherein the outputting the hint information comprises:
outputting presentation information indicating different pronunciation units in the user text information and the standard text information; and/or
outputting voice information indicating different pronunciation units in the user text information and the standard text information.
9. A computing device, comprising:
one or more memories storing executable instructions; and
one or more processors executing the executable instructions to implement the method of any one of claims 1-4.
10. A medium having stored thereon executable instructions which, when executed by a processor, implement a method according to any one of claims 1 to 4.
CN201810486526.3A 2018-05-18 2018-05-18 Information processing method, device, medium and computing equipment Active CN108682437B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810486526.3A CN108682437B (en) 2018-05-18 2018-05-18 Information processing method, device, medium and computing equipment

Publications (2)

Publication Number Publication Date
CN108682437A CN108682437A (en) 2018-10-19
CN108682437B true CN108682437B (en) 2020-12-11

Family

ID=63805273

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810486526.3A Active CN108682437B (en) 2018-05-18 2018-05-18 Information processing method, device, medium and computing equipment

Country Status (1)

Country Link
CN (1) CN108682437B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110176249A (en) * 2019-04-03 2019-08-27 苏州驰声信息科技有限公司 A kind of appraisal procedure and device of spoken language pronunciation

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101604522A (en) * 2009-07-16 2009-12-16 北京森博克智能科技有限公司 The embedded Chinese and English mixing voice recognition methods and the system of unspecified person
CN102253976A (en) * 2011-06-17 2011-11-23 苏州思必驰信息科技有限公司 Metadata processing method and system for spoken language learning
US8543400B2 (en) * 2007-06-11 2013-09-24 National Taiwan University Voice processing methods and systems
CN106531182A (en) * 2016-12-16 2017-03-22 上海斐讯数据通信技术有限公司 Language learning system
CN106898363A (en) * 2017-02-27 2017-06-27 河南职业技术学院 A kind of vocality study electron assistant articulatory system
CN107578778A (en) * 2017-08-16 2018-01-12 南京高讯信息科技有限公司 A kind of method of spoken scoring

Also Published As

Publication number Publication date
CN108682437A (en) 2018-10-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant