CN111651961A - Voice-based input method and device - Google Patents

Voice-based input method and device

Publication number
CN111651961A
Authority
CN
China
Prior art keywords
symbol
text
voice
formula
typesetting
Prior art date
Legal status
Pending
Application number
CN202010297397.0A
Other languages
Chinese (zh)
Inventor
崔释文
李健
武卫东
Current Assignee
Beijing Sinovoice Technology Co Ltd
Original Assignee
Beijing Sinovoice Technology Co Ltd
Application filed by Beijing Sinovoice Technology Co Ltd
Priority to CN202010297397.0A
Publication of CN111651961A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/111 Mathematical or scientific formatting; Subscripts; Superscripts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/189 Automatic justification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention provides a voice-based input method and device, and relates to the technical field of computers. In the voice-based input method, voice data is acquired and recognized to form a voice text, symbol coding information corresponding to the voice text is formed through a typesetting matching rule, and the symbol coding information is then converted into a formula for output. The user therefore does not need to input through a keyboard, which improves the input efficiency of complex formulas and saves time. The formula converted from the user's voice input can also be applied directly, which improves the convenience of entering formulas by voice.

Description

Voice-based input method and device
Technical Field
The invention relates to the technical field of computers, in particular to a voice-based input method and a voice-based input device.
Background
At present, formula input has many shortcomings. Mathematical theorems, physical formulas, chemical equations and the like are used frequently, yet the number of keys on a keyboard is limited, formula formats are unusual, the possible combinations are countless, and the many special symbols are hard to find, so formulas cannot be entered quickly with a conventional keyboard. Entering, typesetting and arranging formulas takes up more than half of a user's total time, which makes formula input cumbersome and inefficient.
Disclosure of Invention
In view of the above, the present invention has been made to provide a speech based input method and apparatus that overcomes or at least partially solves the above problems.
According to a first aspect of the present invention, there is provided a speech-based input method, the method comprising:
acquiring voice data;
recognizing the voice data and determining a corresponding voice text;
determining symbol coding information corresponding to the voice text based on a typesetting matching rule;
converting the symbol encoding information into a formula;
and outputting the formula.
According to a second aspect of the present invention, there is provided a speech based input device, the device comprising:
the data acquisition module is used for acquiring voice data;
the voice recognition module is used for recognizing the voice data and determining a corresponding voice text;
the text matching module is used for determining symbol coding information corresponding to the voice text based on a typesetting matching rule;
the symbol coding processing module is used for converting the symbol coding information into a formula;
and the data output module is used for outputting the formula.
According to the method and the device, the voice data are recognized to form the voice text, the symbol coding information corresponding to the voice text is formed through the typesetting matching rule, and then the symbol coding information is converted into the formula to be output, so that a user does not need to input the symbol coding information through a keyboard, the input efficiency of a complex formula is improved, the time is saved, meanwhile, the formula converted after the user inputs the voice can be directly applied, and the convenience of the voice input formula is improved.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings.
In the drawings:
FIG. 1 is a flow chart illustrating the steps of a method for voice-based input according to an embodiment of the present invention;
FIG. 2 is a diagram of the distribution of LaTeX rules provided in an embodiment of the present invention;
FIG. 3 is a flow chart of steps of another method for speech-based input provided by an embodiment of the present invention;
FIG. 4 is a block diagram of a speech-based input device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 is a flowchart illustrating steps of a method for inputting based on voice according to an embodiment of the present invention, where as shown in fig. 1, the method may include:
Step 101, acquiring voice data.
In the embodiment of the invention, the user can input a formula by voice, so that the user does not need to search for the symbols in the formula or adjust its typesetting. The user speaks the formula to be input, and the corresponding voice data is received.
Step 102, recognizing the voice data and determining a corresponding voice text.
In the embodiment of the invention, voice recognition processing can be performed on the voice data. Voice recognition refers to the process by which a machine converts a voice signal into the corresponding text through recognition and understanding. In one mode of voice recognition, the acoustic information in the voice data is recognized by an acoustic model and then processed by a language model to understand the corresponding voice information and convert it into the corresponding voice text.
For example, when voice recognition is performed on voice data input by a user, the corresponding voice text is obtained as "denominator sin bracket x plus y back bracket plus cos 2x plus the seventh power of x numerator a square plus b square jump out".
Step 103, determining symbol coding information corresponding to the voice text based on the typesetting matching rule.
In the embodiment of the invention, the formula can be identified and typeset based on various typesetting modes, and different typesetting modes correspond to different typesetting rules, so that the typesetting matching rules can be determined based on the typesetting modes, and the voice text is converted to obtain the symbol coding information required by the formula. Wherein the symbol encoding information refers to information of symbol encoding required to constitute a formula.
Taking LaTeX (a typesetting system) as an example, the typesetting matching rule is formed based on the LaTeX typesetting system, and the corresponding symbol codes are LaTeX symbol codes.
In an optional embodiment, the step of generating the layout matching rule includes:
Sub-step S11, determining a symbol code based on the typesetting rule, and determining the text data corresponding to the symbol code.
In the embodiment of the invention, the symbol codes required by the formula can be determined based on the typesetting rules corresponding to different typesetting modes, and the text data corresponding to the symbol codes can be determined. For example, the text data for a symbol code may be determined per language: for the symbol code ",", the corresponding text data is the word for "comma" in Chinese and "comma" in English. In other examples, a symbol code may correspond to several spoken expressions; for instance, the symbol code "+" may be expressed as "plus", "plus sign", and so on, so that one symbol code corresponds to a plurality of text data.
Sub-step S12, establishing a mapping relation between the symbol codes and the text data as the typesetting matching rule.
In the embodiment of the invention, after the text data corresponding to the symbol codes are determined, the mapping relation between the symbol codes and the text data can be established. The mapping from a symbol code to text data can be one-to-many, and the typesetting matching rule can be generated based on this mapping relation.
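Sub-steps S11 and S12 can be illustrated with a minimal Python sketch. The table contents below are a hypothetical excerpt loosely based on the operator rules, and the names `SYMBOL_TO_TEXT` and `build_matching_rule` are assumptions made for illustration, not part of the disclosure.

```python
# Hypothetical excerpt of a rule table: each symbol code maps to one or more
# spoken forms (one-to-many), as described in sub-step S11.
SYMBOL_TO_TEXT = {
    "+": ["plus", "plus sign"],
    "-": ["minus"],
    r"\times": ["multiplication", "cross multiplication"],
    r"\div": ["divided by"],
    ",": ["comma"],
}


def build_matching_rule(symbol_to_text):
    """Sub-step S12: invert the table into a spoken-text -> symbol-code lookup."""
    rule = {}
    for symbol, spoken_forms in symbol_to_text.items():
        for spoken in spoken_forms:
            key = spoken.lower()
            # Two different symbols sharing one spoken form would make
            # matching ambiguous, so this sketch rejects such tables.
            if key in rule and rule[key] != symbol:
                raise ValueError(f"ambiguous text data: {spoken!r}")
            rule[key] = symbol
    return rule


MATCHING_RULE = build_matching_rule(SYMBOL_TO_TEXT)
print(MATCHING_RULE["plus sign"])
print(MATCHING_RULE["divided by"])
```

Storing the inverted lookup means that matching a recognized text segment later is a single dictionary access, while the one-to-many table remains the form in which rules are authored.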
For example, a typesetting matching rule may be constructed based on the LaTeX typesetting system, and may include system rules and element rules. As shown in fig. 2, the system rules include a punctuation rule, a parenthesis rule, an operator rule, a minimum unit rule, and an angle-marking rule; the element rules include a fraction rule, a logarithm rule, a derivative rule, an integral rule, a limit rule, an exponential rule, a root rule, and a vector rule. In some typesetting matching rules, the mapping relationship between the LaTeX symbol codes required for forming a formula and the preset text data may be as shown in tables 1 to 8 below:
TABLE 1 Punctuation matching rules
LaTeX symbol code    Text data
,                    comma
.                    dot, point
                     ellipsis
                     blank space
!                    exclamation mark, factorial
:                    colon
;                    semicolon
°                    degree
‘                    minute, prime
‘’                   double prime, two primes
TABLE 2 Bracket matching rules
[Table 2 is provided as an image in the original publication.]
TABLE 3 Operator matching rules
LaTeX symbol code    Text data
+                    plus, plus sign
-                    minus
\pm                  plus-minus
\times               multiplication, cross multiplication
\div                 divided by
\cdot                dot product
>                    greater than
<                    less than
TABLE 4 Arabic numeral matching rules
LaTeX symbol code    Text data
1, 2, 3, 4, 5, …     1, 2, 3, 4, 5, …
TABLE 5 English letter matching rules
LaTeX symbol code    Text data
a, b, c, d, e, …     a, b, c, d, e, …
A, B, C, D, E, …     big A, big B, big C, big D, …
TABLE 6 Greek letter matching rules
LaTeX symbol code    Text data
\alpha               α
\beta                β
\gamma               γ
\Gamma               Γ
\Delta               Δ
TABLE 7 Minimum unit matching rules
LaTeX symbol code category    Examples
Algebraic expression          2xy, 16y^{2}, …
Angle                         ∠BOC, ∠α, …
Triangle                      ⊿ABC
TABLE 8 Exponent matching rules
LaTeX symbol code    Text data
x^{2}                x square, x squared
x^{3}                x cube, x cubed
x^{y}                x to the power of y
...                  ...
The LaTeX symbol codes can also be called LaTeX codes, and the matched symbol coding information can be called LaTeX source code.
Therefore, the mapping relation between the symbol codes and the text data can be established based on the typesetting mode, and the typesetting matching rule is thereby established. In the input process, the recognized voice text can then be matched against the text data in the typesetting matching rule, the symbol codes corresponding to the matched text data are determined, and these symbol codes together form the symbol coding information.
Based on the typesetting matching rule, the voice text "denominator sin bracket x plus y back bracket plus cos 2x plus the seventh power of x numerator a square plus b square jump out" can be converted into the corresponding symbol coding information: "\frac{a^{2}+b^{2}}{sin(x+y)+cos2x+x^{7}}".
Step 104, converting the symbol coding information into a formula.
In the embodiment of the invention, the matched symbol coding information can be converted into a formula based on different typesetting modes. Taking the LaTeX typesetting system as an example, on the basis of the symbol coding information, the formula corresponding to the voice data is determined according to the typesetting rule of the LaTeX typesetting system as follows:
[Formula image: a fraction with numerator a^{2}+b^{2} and denominator sin(x+y)+cos 2x+x^{7}]
Step 105, outputting the formula.
In the embodiment of the invention, after the matching of the formula is completed, the corresponding formula can be output based on the environment of the input method, for example, the formula is output in editing software, social software and the like, so that the input requirement of a user on the formula is met.
In summary, the input method based on speech provided by the embodiment of the present invention identifies speech data to form a speech text, forms symbol encoding information corresponding to the speech text by a typesetting matching rule, and converts the symbol encoding information into a formula to be output, so that a user does not need to input the symbol encoding information through a keyboard, thereby improving the input efficiency of the formula and saving time.
Fig. 3 is a flowchart illustrating steps of another speech-based input method according to an embodiment of the present invention, where as shown in fig. 3, the method may include:
Step 301, acquiring voice data.
In the embodiment of the invention, the user can input a formula by voice, so that the user does not need to search for the symbols in the formula or adjust its typesetting. The user speaks the formula to be input, and the corresponding voice data is received.
Step 302, recognizing the voice data and determining a corresponding voice text.
In the embodiment of the invention, voice recognition processing can be performed on the voice data. Voice recognition refers to the process by which a machine converts a voice signal into the corresponding text through recognition and understanding. In one mode of voice recognition, the acoustic information in the voice data is recognized by an acoustic model and then processed by a language model to understand the corresponding voice information and convert it into the corresponding voice text.
For example, when voice recognition is performed on voice data input by a user, the corresponding voice text is obtained as "denominator sin bracket x plus y back bracket plus cos 2x plus the seventh power of x numerator a square plus b square jump out".
In the process of recognizing the voice, the voice can also be analyzed based on information such as pauses in the user's speech, which improves the accuracy of the voice recognition. In an optional embodiment, if no text information is recognized within a set time during the recognition of the voice data, a punctuation mark is filled in at the end of the preceding text information.
In the embodiment of the invention, when the voice data is recognized, the voice data is monitored through endpoint detection. If no voice data is input within the set time, it is determined that no text information has been recognized, and a punctuation mark is filled in at the end of the text information recognized so far.
For example, the endpoint detection may use a VAD (voice activity detection) algorithm, which detects whether a voice signal is present in the current voice data. When a voice signal is detected, the state is recorded as "1"; when no voice signal is detected, the state is recorded as "0". The set time is n seconds, for example 2 s: if the state remains "0" for more than 2 s, a "comma" is automatically added at the end of the preceding text information. Taking the LaTeX typesetting system as an example, suppose the text information is "denominator sin bracket x plus y back bracket plus cos 2x plus the seventh power of x numerator small a square plus small b square jump out", and more than 2 s elapse after "denominator sin bracket x plus y back bracket plus cos 2x plus the seventh power of x" before "numerator small a square plus small b square jump out" is recognized. The new text information "denominator sin bracket x plus y back bracket plus cos 2x plus the seventh power of x comma numerator small a square plus small b square jump out" is then formed, and this text information is confirmed as the voice text corresponding to the voice data.
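The pause-based punctuation filling can be sketched as follows. This is a minimal illustration that assumes the VAD output has already been collapsed into recognized chunks with start and end times; the `Chunk` layout and the function name `fill_pause_punctuation` are hypothetical, and the sketch only handles pauses between two chunks.

```python
from typing import List, Tuple

# Assumed layout: (recognized text, start time in seconds, end time in seconds)
Chunk = Tuple[str, float, float]


def fill_pause_punctuation(chunks: List[Chunk], set_time: float = 2.0) -> str:
    """Join recognized chunks, adding "comma" after a chunk whenever the
    silence before the next chunk is longer than set_time seconds."""
    parts: List[str] = []
    previous_end = None
    for text, start, end in chunks:
        if previous_end is not None and start - previous_end > set_time:
            parts.append("comma")
        parts.append(text)
        previous_end = end
    return " ".join(parts)


chunks = [
    ("denominator sin bracket x plus y back bracket plus cos 2x "
     "plus the seventh power of x", 0.0, 6.0),
    ("numerator small a square plus small b square jump out", 8.5, 12.0),
]
print(fill_pause_punctuation(chunks))
```

With a gap of 2.5 s between the two chunks and a set time of 2 s, the word "comma" is placed between "the seventh power of x" and "numerator", which mirrors the example in the text.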
In a formula, some symbols appear in pairs, such as parentheses and quotation marks, and these can be called symbol pairs. A symbol pair comprises a first symbol and a second symbol; for the symbol pair "()", the first symbol is "(" and the second symbol is ")".
Therefore, in the process of voice recognition, if the first symbols of several symbol pairs have already appeared and a second symbol is recognized but it cannot be judged which symbol pair it belongs to, forward matching can be carried out to determine the symbol pair, and the text is then matched to the corresponding second symbol.
Therefore, in some optional embodiments of the present invention, if the text segment conforms to the preset text data, the symbol coding information is queried forward and the target symbol closest to the text segment is determined, where the target symbol is the first symbol in a symbol pair, and the symbol pair comprises the first symbol and the second symbol.
In the embodiment of the present invention, the preset text data is data preset for matching a symbol pair, such as data representing a second symbol. For example, when the user speaks the second symbol of a symbol pair imprecisely, it cannot be determined which symbol pair is meant, and so the corresponding second symbol cannot be determined. If the user says "back bracket" and it cannot be determined whether the text segment refers to small brackets, middle brackets or big brackets, the symbol coding information can be queried forward to find the target symbol closest to the text segment, which is the first symbol of the symbol pair. The first symbol is the opening symbol of a symbol pair, for example a left brace; the second symbol is the closing symbol corresponding to the first symbol, for example a right brace. The matched symbol pair is detected and the second symbol is backfilled automatically, so that the entry of symbol pairs is closer to how users actually speak, which improves the convenience of entering formulas by voice.
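The backfilling of an ambiguous "back bracket" can be sketched in Python as below. The function name `resolve_back_bracket` and the `PAIRS` table are hypothetical. Skipping pairs that have already been closed is an assumption of this sketch: the text only says that the closest first symbol is found by querying the symbol coding information forward.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}  # first symbol -> second symbol
CLOSERS = set(PAIRS.values())


def resolve_back_bracket(encoded_so_far):
    """Look back over the symbol codes produced so far and return the second
    symbol that closes the nearest first symbol which is still open."""
    open_stack = []
    for symbol in encoded_so_far:
        if symbol in PAIRS:
            open_stack.append(symbol)
        elif symbol in CLOSERS and open_stack and symbol == PAIRS[open_stack[-1]]:
            open_stack.pop()  # this pair is already complete
    if not open_stack:
        raise ValueError("no unmatched first symbol to close")
    return PAIRS[open_stack[-1]]


codes = ["[", "a", "+", "(", "b", "-", "c"]
codes.append(resolve_back_bracket(codes))  # closes the small bracket
codes.append(resolve_back_bracket(codes))  # closes the middle bracket
print("".join(codes))  # [a+(b-c)]
```

A stack is used so that a second "back bracket" closes the outer pair rather than the inner pair a second time.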
Determining the symbol coding information corresponding to the voice text based on the typesetting matching rule may include the following steps 303 to 305.
Step 303, segmenting the voice text to obtain corresponding text segments.
In the embodiment of the present invention, the segmentation processing is performed on the voice text, and the segmentation may be performed according to a formula unit required by a formula, for example, a letter, a number, a numeric symbol, and the like are one formula unit, and the text segment is text information corresponding to one formula unit in the formula.
For example, the voice text "denominator sin bracket x plus y back bracket plus cos 2x plus the seventh power of x comma numerator small a square plus small b square jump out" above is segmented into the following text segments:
"denominator", "sin", "bracket", "x", "plus", "y", "back bracket", "plus", "cos", "2x", "plus", "the seventh power of x", "comma", "numerator", "small a square", "plus", "small b square" and "jump out".
Step 304, matching the text segments according to the typesetting matching rule to obtain corresponding symbol codes.
In the embodiment of the invention, different typesetting matching rules can be formed based on various typesetting modes, the text segments are matched with the text data in the typesetting matching rules, the matched text data is determined, then the symbolic codes corresponding to the text data are determined, and the symbolic codes corresponding to the text segments can be obtained.
Step 305, combining the symbol codes to determine the corresponding symbol coding information.
In the embodiment of the invention, the symbol codes corresponding to the text segment are combined based on the segmentation sequence of the text segment to obtain the symbol code information corresponding to the voice data.
Taking the LaTeX typesetting system as an example, the symbol codes matched to the text segments are combined, and the symbol coding information is determined to be:
"\frac{a^{2}+b^{2}}{sin(x+y)+cos2x+x^{7}}"
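Steps 304 and 305 can be sketched in Python for the example segments. The rule table `FLAT_RULE` and the function `segments_to_latex` are hypothetical, and the table covers only the segments of this example. Treating "numerator", "denominator" and "jump out" as structural keywords, and dropping a pause-filled "comma" that directly precedes such a keyword, are assumptions of this sketch; the text does not specify how these cases are implemented.

```python
# Hypothetical rule excerpt: text segment -> LaTeX symbol code.
FLAT_RULE = {
    "sin": "sin", "cos": "cos", "bracket": "(", "back bracket": ")",
    "plus": "+", "comma": ",", "x": "x", "y": "y", "2x": "2x",
    "the seventh power of x": "x^{7}",
    "small a square": "a^{2}", "small b square": "b^{2}",
}
STRUCTURAL = {"numerator", "denominator", "jump out"}


def segments_to_latex(segments):
    """Match each text segment to a symbol code and combine them in order."""
    output = []                                  # codes outside any fraction
    parts = {"numerator": [], "denominator": []}
    current = output
    for i, seg in enumerate(segments):
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if seg in parts:                         # switch to that fraction part
            current = parts[seg]
        elif seg == "jump out":                  # leave the fraction, emit it
            numerator = "".join(parts["numerator"])
            denominator = "".join(parts["denominator"])
            output.append(r"\frac{" + numerator + "}{" + denominator + "}")
            parts = {"numerator": [], "denominator": []}
            current = output
        elif seg == "comma" and nxt in STRUCTURAL:
            continue                             # assumed: pause separator only
        else:
            current.append(FLAT_RULE[seg])
    return "".join(output)


segments = [
    "denominator", "sin", "bracket", "x", "plus", "y", "back bracket", "plus",
    "cos", "2x", "plus", "the seventh power of x", "comma", "numerator",
    "small a square", "plus", "small b square", "jump out",
]
print(segments_to_latex(segments))
```

For these segments the sketch prints `\frac{a^{2}+b^{2}}{sin(x+y)+cos2x+x^{7}}`, with exponents written in the `x^{2}` form used by the exponent matching rules.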
Step 306, converting the symbol coding information into a formula.
In the embodiment of the invention, the matched symbol coding information can be converted into a formula based on different typesetting modes. Taking the LaTeX typesetting system as an example, on the basis of the symbol coding information, the formula corresponding to the voice data is determined according to the typesetting rule of the LaTeX typesetting system as follows:
[Formula image: a fraction with numerator a^{2}+b^{2} and denominator sin(x+y)+cos 2x+x^{7}]
The accuracy of the converted formula is affected by errors in the above steps, such as recognition errors in the voice data, or errors introduced by analyzing information such as pauses in the user's speech.
Therefore, in an optional embodiment, the method further comprises at least one of the following verification steps:
Sub-step S31, detecting the symbol code of the text segment adjacent to a punctuation mark.
In the embodiment of the invention, positioning detection is carried out with one text segment as the unit, and the symbol code corresponding to the text segment adjacent to the text segment of the punctuation mark is determined.
Sub-step S32, deleting the punctuation mark if the symbol code is an operator.
In the embodiment of the invention, the obtained symbol code is matched. If it is matched as an operator, then based on the typesetting rule of the formula it is known that the previously recognized punctuation mark is an error, and the punctuation mark can be deleted to update the formula.
Sub-step S33, deleting one punctuation mark if two identical punctuation marks are detected to be adjacent.
In the embodiment of the invention, the obtained symbol code is matched. If it is matched as the same punctuation mark, then based on the typesetting rule of the formula it is known that the punctuation mark recognized or filled in during the previous step is an error, and one of the punctuation marks can be deleted to update the formula.
Sub-step S34, detecting the hierarchy of the formula and updating the symbols that do not conform to the hierarchical rule.
In the embodiment of the present invention, the hierarchical rule refers to operation levels included in the formula, for example, a numerator and a denominator belong to different hierarchies, and contents inside and outside a parenthesis are also different hierarchies. Some symbols are used to distinguish the hierarchy of the formula, so the detection of the hierarchy of the formula can be performed based on the symbols. Detecting the level of the symbol in the symbol coding information, determining the position of the symbol in the symbol coding information, matching based on a level rule, and updating the symbol to be the symbol conforming to the level.
For example, the symbols may be middle brackets and small brackets. Based on the typesetting rule of the formula, the level of the small brackets is higher than the level of the middle brackets. If, in the process of inputting the formula by voice, the middle brackets are nested inside the small brackets, it is detected that the levels of these symbols do not conform to the hierarchical rule, and the positions of the middle brackets and the small brackets can be exchanged. The levels of the symbols then conform to the hierarchical rule, and a formula conforming to the hierarchical rule is finally obtained.
Therefore, based on the verification step, the error of the voice input formula can be reduced, and the accuracy of the voice formula input is improved.
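The verification sub-steps can be sketched over a list of symbol codes. This is a minimal illustration under assumptions: the sets `OPERATORS` and `PUNCTUATION` and both function names are hypothetical, the punctuation check is a single pass, and the level check only handles middle brackets nested directly inside small brackets.

```python
OPERATORS = {"+", "-", r"\pm", r"\times", r"\div", r"\cdot", ">", "<"}
PUNCTUATION = {",", ".", ":", ";"}
OPEN_TO_CLOSE = {"(": ")", "[": "]"}


def check_punctuation(codes):
    """S31-S33: drop punctuation next to an operator and repeated punctuation."""
    kept = []
    for i, code in enumerate(codes):
        if code in PUNCTUATION:
            prev_code = kept[-1] if kept else None
            next_code = codes[i + 1] if i + 1 < len(codes) else None
            if prev_code in OPERATORS or next_code in OPERATORS:
                continue  # S32: adjacent symbol code is an operator
            if prev_code == code:
                continue  # S33: two identical punctuation marks in a row
        kept.append(code)
    return kept


def match_brackets(codes):
    """Return {open_index: (close_index, parent_open_index or None)}."""
    pairs, stack = {}, []
    for i, code in enumerate(codes):
        if code in OPEN_TO_CLOSE:
            stack.append(i)
        elif stack and code == OPEN_TO_CLOSE[codes[stack[-1]]]:
            open_index = stack.pop()
            pairs[open_index] = (i, stack[-1] if stack else None)
    return pairs


def check_bracket_levels(codes):
    """S34: while "[...]" sits directly inside "(...)", exchange the two pairs."""
    fixed = list(codes)
    while True:
        pairs = match_brackets(fixed)
        inner_open = next(
            (o for o, (_, parent) in pairs.items()
             if fixed[o] == "[" and parent in pairs and fixed[parent] == "("),
            None,
        )
        if inner_open is None:
            return fixed
        inner_close, outer_open = pairs[inner_open]
        outer_close = pairs[outer_open][0]
        fixed[outer_open], fixed[outer_close] = "[", "]"
        fixed[inner_open], fixed[inner_close] = "(", ")"


print(check_punctuation(["a", "+", ",", "b", ",", ",", "c"]))
print("".join(check_bracket_levels(list("(a+[b-c])"))))  # [a+(b-c)]
```

Each exchange moves a middle-bracket pair one level outward, so the loop in `check_bracket_levels` terminates once no middle bracket has a small bracket as its direct parent.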
Step 307, outputting the formula.
In the embodiment of the invention, after the matching of the formula is completed, the corresponding formula can be output based on the environment of the input method, for example, the formula is output in editing software, social software and the like, so that the input requirement of a user on the formula is met.
In summary, according to another input method based on speech provided by the embodiment of the present invention, speech data is recognized to form a speech text, symbol encoding information corresponding to the speech text is formed by a typesetting matching rule, and then the symbol encoding information is converted into a formula to be output, so that a user does not need to input the symbol encoding information through a keyboard, the formula input efficiency is improved, and time is saved. Meanwhile, the accuracy of the voice input formula can be improved by verifying the voice input formula.
Fig. 4 is a block diagram of a speech-based input device according to an embodiment of the present invention. As shown in fig. 4, the speech-based input device may include:
a data acquisition module 401, configured to acquire voice data;
a voice recognition module 402, configured to recognize the voice data and determine a corresponding voice text;
a text matching module 403, configured to determine symbol encoding information corresponding to the speech text based on a typesetting matching rule;
a symbol encoding processing module 404, configured to convert the symbol encoding information into a formula;
and a data output module 405, configured to output the formula.
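The cooperation of modules 401 to 405 can be sketched as a small Python class. The class name `VoiceFormulaInputDevice`, its field names, and the stub functions used in the example are all hypothetical stand-ins; in particular, no real speech recognizer is involved.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceFormulaInputDevice:
    acquire: Callable[[], bytes]        # data acquisition module 401
    recognize: Callable[[bytes], str]   # voice recognition module 402
    match: Callable[[str], str]         # text matching module 403
    convert: Callable[[str], str]       # symbol encoding processing module 404
    output: Callable[[str], None]       # data output module 405

    def run(self) -> str:
        """Pass the data through the five modules in order."""
        voice_data = self.acquire()
        voice_text = self.recognize(voice_data)
        symbol_info = self.match(voice_text)
        formula = self.convert(symbol_info)
        self.output(formula)
        return formula


# Stub modules for illustration only.
outputs = []
device = VoiceFormulaInputDevice(
    acquire=lambda: b"",
    recognize=lambda data: "x plus y",
    match=lambda text: "".join({"plus": "+"}.get(w, w) for w in text.split()),
    convert=lambda info: "$" + info + "$",
    output=outputs.append,
)
print(device.run())
```

Passing each module in as a callable keeps the sketch independent of any particular recognizer or renderer, in line with the statement that the modules may be combined or divided.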
Optionally, the text matching module 403 includes:
the text segmentation submodule is used for segmenting the voice text to obtain a corresponding text segment;
the information matching submodule is used for matching the text segments according to the typesetting matching rule and determining corresponding symbol codes;
and the information combination submodule is used for combining the symbol codes to obtain corresponding symbol code information.
Optionally, the apparatus further comprises:
the code query module is used for querying the symbol coding information forward if the text segment accords with preset text data, and determining a target symbol closest to the text segment, wherein the target symbol is a first symbol in a symbol pair, and the symbol pair comprises the first symbol and a second symbol;
a code matching module for matching the text segment to a second symbol in the pair of symbols.
Optionally, the apparatus further includes a sub-module for generating a layout matching rule:
the typesetting matching unit is used for determining symbol codes based on a typesetting rule and determining text data corresponding to the symbol codes;
and the typesetting analysis unit is used for establishing a mapping relation between the symbol codes and the text data as a typesetting matching rule.
Optionally, the apparatus further comprises at least one of the following modules:
the punctuation inspection module is used for detecting the symbol code of the text segment adjacent to a punctuation mark, deleting the punctuation mark if the symbol code is an operator, and deleting one punctuation mark if two identical punctuation marks are detected to be adjacent;
and the level checking module is used for detecting the level of the formula and updating the symbols which do not accord with the level rule.
Optionally, the apparatus further comprises:
and the punctuation filling module is used for filling punctuation marks at the tail of the previous section of text information if the text information is not identified after the set time in the voice data identification process.
In summary, the input device based on voice provided by the embodiment of the present invention recognizes voice data to form a voice text, forms symbol encoding information corresponding to the voice text by a typesetting matching rule, and converts the symbol encoding information into a formula to be output, so that a user does not need to input the symbol encoding information through a keyboard, thereby improving the input efficiency of a complex formula and saving time.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar the embodiments may be referred to one another.
As a person skilled in the art will readily appreciate, the above embodiments may be combined in any way, and any such combination is an embodiment of the present invention; for reasons of space, the combinations are not described one by one here.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, this method of disclosure should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
An electronic device, comprising:
one or more processors;
a memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the methods of the embodiments described above.
A computer-readable storage medium storing a computer program for use in conjunction with an electronic device, the computer program being executable by a processor to perform the voice-based input method of the above embodiments.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or terminal that comprises the element.
The voice-based input method and the voice-based input device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of these examples is intended only to aid understanding of the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A voice-based input method, the method comprising:
acquiring voice data;
recognizing the voice data and determining a corresponding voice text;
determining symbol coding information corresponding to the voice text based on a typesetting matching rule;
converting the symbol coding information into a formula;
and outputting the formula.
2. The method of claim 1, wherein the determining of symbol coding information corresponding to the voice text based on the typesetting matching rule comprises:
segmenting the voice text to obtain corresponding text segments;
matching the text segments according to a typesetting matching rule to obtain corresponding symbol codes;
and combining the symbol codes to determine the corresponding symbol coding information.
3. The method of claim 2, further comprising:
if the text segment matches preset text data, searching the symbol coding information in the forward direction and determining the target symbol closest to the text segment, wherein the target symbol is the first symbol of a symbol pair, and the symbol pair comprises the first symbol and a second symbol;
matching the text segment to the second symbol of the symbol pair.
4. The method according to claim 2, further comprising the step of generating the typesetting matching rule:
determining symbol codes based on a typesetting rule, and determining text data corresponding to the symbol codes;
and establishing a mapping relation between the symbol codes and the text data as a typesetting matching rule.
5. The method of claim 1, further comprising at least one of the following verification steps:
detecting symbol codes of text segments adjacent to punctuation marks, and deleting the punctuation marks if the symbol codes are operators;
if two identical punctuation marks are detected to be adjacent, deleting one punctuation mark;
and detecting the hierarchy of the formula, and updating the symbols which do not accord with the hierarchy rule.
6. The method of claim 1, further comprising:
if, during recognition of the voice data, no text information has been recognized after a set time has elapsed, appending a punctuation mark to the end of the preceding piece of text information.
7. A voice-based input device, the device comprising:
the data acquisition module is used for acquiring voice data;
the voice recognition module is used for recognizing the voice data and determining a corresponding voice text;
the text matching module is used for determining symbol coding information corresponding to the voice text based on a typesetting matching rule;
the symbol coding processing module is used for converting the symbol coding information into a formula;
and the data output module is used for outputting the formula.
8. The device of claim 7, wherein the text matching module comprises:
the text segmentation submodule is used for segmenting the voice text to obtain a corresponding text segment;
the information matching submodule is used for matching the text segments according to the typesetting matching rule and determining corresponding symbol codes;
and the information combination submodule is used for combining the symbol codes to obtain the corresponding symbol coding information.
9. An electronic device, comprising:
one or more processors;
a memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-6.
10. A computer-readable storage medium storing a computer program for use in conjunction with an electronic device, the computer program being executable by a processor to perform the method of any of claims 1-6.
CN202010297397.0A 2020-04-15 2020-04-15 Voice-based input method and device Pending CN111651961A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010297397.0A CN111651961A (en) 2020-04-15 2020-04-15 Voice-based input method and device

Publications (1)

Publication Number Publication Date
CN111651961A (en) 2020-09-11

Family

ID=72345547

Country Status (1)

Country Link
CN (1) CN111651961A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination