CN113298082B - Dictation data processing method and device, electronic equipment and storage medium - Google Patents


Publication number
CN113298082B
CN113298082B
Authority
CN
China
Prior art keywords
input operation
target
dictation
data
sequence
Prior art date
Legal status
Active
Application number
CN202110853809.9A
Other languages
Chinese (zh)
Other versions
CN113298082A (en)
Inventor
张恒志
李泽桐
姬传国
Current Assignee
Beijing Ape Power Future Technology Co Ltd
Original Assignee
Beijing Ape Power Future Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Ape Power Future Technology Co Ltd filed Critical Beijing Ape Power Future Technology Co Ltd
Priority to CN202110853809.9A
Publication of CN113298082A
Application granted
Publication of CN113298082B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 - Services
    • G06Q50/20 - Education
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words

Abstract

The disclosure provides a dictation data processing method and apparatus, an electronic device, and a storage medium, and relates to the field of computer technology. The method includes: in response to monitoring that a dictation data detection control in a dictation service interface is triggered, acquiring first sequence data to be detected, a target dictation character, and a first reference character; generating a character to be detected according to the first sequence data; determining a first matching degree between the character to be detected and the target dictation character; determining the similarity between the character to be detected and the first reference character when the first matching degree is smaller than a threshold; and generating and displaying a first detection result corresponding to the first sequence data according to the similarity. The character to be detected is thus checked against both the target dictation character and the reference character and the result is displayed, so that it can be detected whether the user's dictation result is accurate, the user can intuitively see the cause of the error, and the user can better master the dictation content.

Description

Dictation data processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for processing dictation data, an electronic device, and a storage medium.
Background
For primary and middle school students, content such as Chinese words and English words is of great importance in the learning process, and dictation is commonly used to check whether the students have mastered it.
With the continuous development of computer technology, primary and middle school students can now take dictation on electronic devices. How to process the dictation data generated during dictation has therefore become a popular research direction.
Disclosure of Invention
The present disclosure is directed to solving, at least to some extent, one of the technical problems in the related art.
An embodiment of a first aspect of the present disclosure provides a method for processing dictation data, including:
in response to monitoring that a dictation data detection control in a dictation service interface is triggered, acquiring first sequence data to be detected, a target dictation character and a first reference character, wherein the similarity between the first reference character and the target dictation character is greater than a threshold value;
generating characters to be detected according to the first sequence data;
determining a first matching degree between the character to be detected and the target dictation character;
determining the similarity between the character to be detected and the first reference character under the condition that the first matching degree is smaller than a threshold value;
and generating and displaying a first detection result corresponding to the first sequence data according to the similarity.
An embodiment of a second aspect of the present disclosure provides a processing apparatus for dictation data, including:
the acquisition module is used for responding to the fact that a dictation data detection control in a dictation service interface is triggered, acquiring first sequence data to be detected, a target dictation character and a first reference character, wherein the similarity between the first reference character and the target dictation character is larger than a threshold value;
the first generation module is used for generating characters to be detected according to the first sequence data;
the first determining module is used for determining a first matching degree between the character to be detected and the target dictation character;
the second determining module is used for determining the similarity between the character to be detected and the first reference character under the condition that the first matching degree is smaller than a threshold value;
and the second generation module is used for generating and displaying a first detection result corresponding to the first sequence data according to the similarity.
An embodiment of a third aspect of the present disclosure provides an electronic device, including: the device comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the processing method of dictation data as proposed in the embodiment of the first aspect of the disclosure.
A fourth aspect of the present disclosure provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the method for processing dictation data as set forth in the first aspect of the present disclosure.
A fifth aspect of the present disclosure provides a computer program product; when the instructions in the computer program product are executed by a processor, the method for processing dictation data set forth in the first aspect of the present disclosure is performed.
The processing method, the processing device, the electronic equipment and the storage medium for dictation data provided by the disclosure have the following beneficial effects:
In the embodiment of the disclosure, first, in response to monitoring that the dictation data detection control in the dictation service interface is triggered, the first sequence data to be detected, the target dictation character, and the first reference character are acquired; then the character to be detected is generated according to the first sequence data, and the first matching degree between the character to be detected and the target dictation character is determined; finally, when the first matching degree is smaller than the threshold, the similarity between the character to be detected and the first reference character is determined, and the first detection result corresponding to the first sequence data is generated and displayed according to the similarity. In this way, the character to be detected is checked against both the target dictation character and the reference character and the result is displayed, so that it can be detected whether the user's dictation result is accurate, the user can intuitively see the cause of the error, and the user can better master the dictation content.
Additional aspects and advantages of the disclosure will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
The foregoing and/or additional aspects and advantages of the present disclosure will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a processing method for dictation data according to an embodiment of the present disclosure;
fig. 2 is a schematic flow chart of a dictation data processing method according to another embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a device for processing dictation data according to an embodiment of the present disclosure;
FIG. 4 illustrates a block diagram of an exemplary electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
Reference will now be made in detail to the embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary and intended to be illustrative of the present disclosure, and should not be construed as limiting the present disclosure.
A method, an apparatus, an electronic device, and a storage medium for processing dictation data according to embodiments of the present disclosure are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a processing method for dictation data according to an embodiment of the present disclosure.
In the embodiments of the present disclosure, the dictation data processing method is described as being configured in a dictation data processing apparatus, and this apparatus can be applied to any electronic device, so that the electronic device can perform the dictation data processing function.
The electronic device may be a personal computer (PC), a cloud device, a mobile device, and the like; the mobile device may be a hardware device having an operating system, a touch screen, and/or a display screen, such as a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or an in-vehicle device.
As shown in fig. 1, the method for processing dictation data may include the following steps:
step 101, in response to monitoring that a dictation data detection control in a dictation service interface is triggered, acquiring first sequence data to be detected, a target dictation character and a first reference character, wherein the similarity between the first reference character and the target dictation character is greater than a threshold value.
The dictation service interface can be an interface with an input function, and a user can write in the interface according to the heard dictation content.
The dictation data detection control can be displayed at any position in the dictation service interface. The user can trigger the dictation data detection control to send a dictation data detection request to the electronic equipment, and then the electronic equipment can detect the acquired first sequence data.
It is to be understood that the position, style, or presentation form of the dictation data detection control may be in any form, and the disclosure is not limited thereto.
The first sequence data may include a plurality of pieces of ordered stroke data, for example the input time, input start position coordinates, and input end position coordinates of each stroke. The first sequence data may also take the form of one or more ordered arrays, which is not limited in this disclosure.
Optionally, the first sequence data may be the data of each input operation, such as the start and end position coordinates of each input, recorded in order of input time after the dictation process ends.
Optionally, since the user may make an input error during dictation and erase the wrong part through an erase operation, the first sequence data may be generated by processing the recorded data of each input operation. For example, the recorded data of each input operation is second sequence data, where the second sequence data includes the order, input position, and type of each input operation. Then, in response to determining that the type of any input operation in the second sequence data is erase, a first target input operation associated with that input operation is determined according to the order and input position of that input operation in the second sequence data, where that input operation is used to erase the data input by the first target input operation. Finally, that input operation and the first target input operation are removed from the second sequence data to generate the first sequence data.
It can be understood that, since the second sequence data is recorded in real time and may contain erase operations, the erase-type operations in the second sequence data and the input operations they erased can first be removed, so that the generated first sequence data contains only valid input operations.
In addition, for convenience of explanation, different labels may be used to distinguish erase operations from non-erase operations. For example, a non-erase input operation may be denoted as "1" and an erase operation as "0", or other notations may be used.
It should be noted that the above examples are only illustrative, and should not be taken as limitations on the second sequence data, the first sequence data, and the like in the embodiments of the present disclosure.
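Putting these steps together, a minimal Python sketch of the filtering step might look as follows. The dict fields (order, type, start, end) and the exact-position association rule are illustrative assumptions, since the disclosure does not fix a concrete data layout.

    # Sketch: generate first sequence data from second sequence data by
    # removing each erase operation together with the input operation it
    # erased. Field names and the association rule are assumptions.

    def to_first_sequence(second_sequence):
        removed = set()
        for i, op in enumerate(second_sequence):
            if op["type"] != "erase":          # non-erase ops ("1" in the text)
                continue
            removed.add(i)                     # drop the erase op itself ("0")
            # Find the most recent surviving input operation whose position
            # the erase covers: the "first target input operation".
            for j in range(i - 1, -1, -1):
                if j not in removed and second_sequence[j]["type"] == "input" \
                        and second_sequence[j]["start"] == op["start"]:
                    removed.add(j)
                    break
        return [op for i, op in enumerate(second_sequence) if i not in removed]

    ops = [
        {"order": 0, "type": "input", "start": (0, 0), "end": (10, 0)},
        {"order": 1, "type": "input", "start": (5, -5), "end": (5, 5)},
        {"order": 2, "type": "erase", "start": (5, -5), "end": (5, 5)},
        {"order": 3, "type": "input", "start": (0, 10), "end": (10, 10)},
    ]
    print([op["order"] for op in to_first_sequence(ops)])  # [0, 3]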
The target dictation character is the correct character corresponding to the currently played dictation audio.
The first reference character may be a character whose structure, strokes, or pronunciation is similar to that of the target dictation character. For example, if the target dictation character is "无", the corresponding first reference characters may include "元", "夫", and so on, which is not limited by this disclosure.
Optionally, the first reference character may be one or more pre-generated characters whose similarity to the target dictation character is greater than the threshold. It will be appreciated that a database may store the reference characters for each character.
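As a toy illustration of such a database, a pre-built lookup table keyed by dictation character could be as simple as the following sketch; the entries reuse the disclosure's own examples and are not an exhaustive or authoritative list.

    # Hypothetical pre-generated reference-character table: for each dictation
    # character, characters whose structure, strokes, or pronunciation are
    # similar (i.e. whose similarity to it exceeds the threshold).
    REFERENCE_CHARS = {
        "无": ["元", "夫"],
        "干": ["王", "玉"],
    }

    def get_first_reference_chars(target_char):
        return REFERENCE_CHARS.get(target_char, [])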
And 102, generating characters to be detected according to the first sequence data.
It can be understood that, since the first sequence data includes the start and end positions and the order of each input operation, the character to be detected that the user input can be determined based on the first sequence data.
For example, based on the start and end positions of each input operation in the first sequence data, the stroke corresponding to each input operation can be determined, such as "一" (horizontal) or "丨" (vertical), and based on the positional relationship among the strokes, the corresponding character to be detected can be determined, for example "无".
It should be noted that the above examples are only simply illustrative, and should not be taken as specific limitations of the first sequence data and the characters to be detected in the embodiments of the present disclosure.
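One minimal way to realize this step, assuming a small template table that maps ordered stroke sequences to characters, is sketched below; a real system would use a handwriting-recognition model, and the two-way stroke classifier and template table are toy assumptions.

    # Sketch: classify each input operation as a stroke from its start/end
    # coordinates, then look the ordered stroke tuple up in a template table.

    def stroke_of(op):
        (x0, y0), (x1, y1) = op["start"], op["end"]
        return "一" if abs(x1 - x0) >= abs(y1 - y0) else "丨"  # horizontal vs vertical

    TEMPLATES = {
        ("一", "一", "丨"): "干",
        ("一", "一", "丨", "一"): "王",
    }

    def character_to_detect(first_sequence):
        strokes = tuple(stroke_of(op) for op in first_sequence)
        return TEMPLATES.get(strokes)  # None when no template matches

    ops = [{"start": (0, 0), "end": (10, 0)},
           {"start": (0, 5), "end": (10, 5)},
           {"start": (5, -2), "end": (5, 8)}]
    print(character_to_detect(ops))  # 干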
Step 103, determining a first matching degree between the character to be detected and the target dictation character.
Optionally, the first matching degree between the character to be detected and the target dictation character may be calculated using the Euclidean distance or the Manhattan distance; alternatively, the cosine similarity between the character to be detected and the target dictation character may be calculated and used as the first matching degree, which is not limited in this disclosure.
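For instance, if both characters are embedded as equal-length feature vectors (the disclosure does not fix a feature extractor), the cosine-similarity variant can be computed as in the following sketch.

    import math

    # First matching degree as cosine similarity between two feature vectors;
    # how the vectors are produced from the characters is left open here.
    def cosine_match(vec_a, vec_b):
        dot = sum(a * b for a, b in zip(vec_a, vec_b))
        norm_a = math.sqrt(sum(a * a for a in vec_a))
        norm_b = math.sqrt(sum(b * b for b in vec_b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    print(cosine_match([1.0, 0.5, 0.0], [0.9, 0.6, 0.1]))  # ≈ 0.99, a likely match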
And 104, determining the similarity between the character to be detected and the first reference character under the condition that the first matching degree is smaller than the threshold value.
It can be understood that, when the first matching degree is smaller than the threshold, it indicates that the character to be detected is not the target dictation character and the user's dictation result is incorrect; the character to be detected may then be matched against the first reference character to determine the similarity between the two.
It should be noted that, for determining the specific implementation manner of the similarity between the character to be detected and the first reference character, reference may be made to the detailed description of other embodiments in the present disclosure, and details are not repeated here.
And 105, generating and displaying a first detection result corresponding to the first sequence data according to the similarity.
The first detection result may include where the dictation result is wrong, the pronunciation and meaning of the character to be detected, the correct way of writing the target dictation character, the differences between the target dictation character and the character to be detected, and the like, which is not limited by the present disclosure.
Optionally, one or more first reference characters with the highest similarity to the character to be detected may be selected and displayed as the first detection result.
It can be understood that the generation and display of the first detection result corresponding to the first sequence data can help the user to find out the reason of the error of the dictation result and help to distinguish the difference between the dictation result and the target dictation character, thereby enabling the user to learn better.
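Tying steps 103 to 105 together, a sketch of the decision flow might read as follows; the threshold value, the embed callable, and the top-3 cut-off are illustrative assumptions, and cosine_match is reused from the earlier sketch.

    THRESHOLD = 0.8  # illustrative; the disclosure does not fix a value

    def detect(char_vec, target_vec, reference_chars, embed):
        """Return a first detection result: pass, or the closest reference chars."""
        if cosine_match(char_vec, target_vec) >= THRESHOLD:
            return {"correct": True}
        # Rank the pre-generated first reference characters by similarity to
        # the character the user actually wrote and keep the most similar ones.
        ranked = sorted(reference_chars,
                        key=lambda ref: cosine_match(char_vec, embed(ref)),
                        reverse=True)
        return {"correct": False, "closest_references": ranked[:3]}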
In the embodiment of the disclosure, first, in response to monitoring that the dictation data detection control in the dictation service interface is triggered, the first sequence data to be detected, the target dictation character, and the first reference character are acquired; then the character to be detected is generated according to the first sequence data, and the first matching degree between the character to be detected and the target dictation character is determined; finally, when the first matching degree is smaller than the threshold, the similarity between the character to be detected and the first reference character is determined, and the first detection result corresponding to the first sequence data is generated and displayed according to the similarity. In this way, the character to be detected is checked against both the target dictation character and the reference character and the result is displayed, so that it can be detected whether the user's dictation result is accurate, the user can intuitively see the cause of the error, and the user can better master the dictation content.
Fig. 2 is a schematic flow chart of a processing method of dictation data according to an embodiment of the present disclosure, and as shown in fig. 2, the processing method of dictation data may include the following steps:
step 201, responding to that the currently played audio data is the last audio data in the audio sequence corresponding to the target dictation content, and displaying a dictation data detection control on a display interface.
In the present disclosure, in order to occupy as little of the user input interface as possible, the dictation data detection control may be displayed on the display interface only after the last piece of audio data has been played. In this way, the detection control does not encroach on the effective input area of the input interface during dictation, and appears only when dictation is finished, which also lets the user clearly tell that the current dictation process has ended.
Step 202, in response to monitoring that a dictation data detection control in a dictation service interface is triggered, acquiring first sequence data to be detected, a target dictation character and a first reference character, wherein the similarity between the first reference character and the target dictation character is greater than a threshold value.
And step 203, generating the character to be detected according to the first sequence data.
Step 204, determining a first matching degree between the character to be detected and the target dictation character.
The specific implementation form of steps 202 to 204 may refer to the detailed description of other embodiments in the present disclosure, and will not be described in detail here.
And step 205, under the condition that the first matching degree is greater than or equal to the threshold, generating and displaying a second detection result corresponding to the first sequence data according to the input position, type and/or sequence of each input operation in the first sequence data.
The second detection result may include a correct stroke order of the target dictation character, whether the strokes corresponding to the first sequence data are correct, a reference character similar to the target dictation character, and the like, which is not limited by the present disclosure.
It can be understood that, when the first matching degree is greater than or equal to the threshold, the final form of the user's dictation result is correct; however, the user may have performed erase operations while writing, or the user's stroke order may differ from the correct way of writing the target dictation character. Therefore, the first sequence data may be further examined so that the user can accurately master the target dictation character.
Optionally, when an erase-type input operation is included in the first sequence data, the second detection result may be generated and displayed according to the order and input positions of the erase-type input operations in the first sequence data.
It can be understood that, if an erase-type input operation is included in the first sequence data, this indicates that the user is not yet familiar with the target dictation character. Therefore, a plurality of similar characters can be acquired according to the order and input positions of the erase-type input operations in the first sequence data and recommended to the user for study, so as to deepen the user's impression.
Optionally, third reference characters whose strokes and stroke order match those of the erased input may be screened from the first reference characters corresponding to the target dictation character according to the order and input positions of the erase-type input operations, and finally a second detection result containing the target dictation character and the third reference characters is generated and displayed.
For example, if the target dictation character is "无" and the first sequence data includes an erase-type input operation after the stroke "丨", with the erased stroke being "丨", the corresponding third reference characters may be determined to include "干", "王", "玉", and so on.
Optionally, according to the order and input positions of the erase-type input operations in the first sequence data, the second detection result may be generated and displayed through the following process:
(1) determining the associated second target input operation and third target input operation according to the order and input positions of the erase-type input operations in the first sequence data, wherein the erase-type input operation is used for erasing the data input by the second target input operation, and the third target input operation is the next input operation adjacent to the erase-type input operation;
(2) acquiring a second reference character according to the data input by the second target input operation and the data input by the third target input operation, wherein the second reference character contains the data input by the second target input operation and/or the data input by the third target input operation;
(3) generating and displaying a second detection result containing the target dictation character and the second reference character.
The second detection result may include the stroke order corresponding to the first sequence data, the target dictation character and its correct stroke order, and the second reference characters with their correct stroke orders, which is not limited in this disclosure.
For example, suppose the target dictation character is "干" and its correct stroke order is "一, 一, 丨". If the first sequence data includes an erase-type input operation, the stroke sequence corresponding to the first sequence data may be "一, 一, 丨", where the second "一" is the data input by the second target input operation (subsequently erased) and "丨" is the data input by the third target input operation; the acquired second reference characters may then be "日", "上", and so on.
Optionally, when no erase-type input operation is included in the first sequence data, the second detection result is generated and displayed according to the input position and order of each input operation in the first sequence data.
The second detection result may include the stroke order corresponding to the first sequence data, the target dictation character and its correct stroke order, errors in the stroke order corresponding to the first sequence data, and the like, which is not limited by the present disclosure.
Optionally, target sequence data corresponding to the target dictation character may be determined, where the target sequence data includes the order of each target stroke; the input stroke corresponding to each input operation is then determined according to the input position of each input operation; and the second detection result is generated and displayed according to a second matching degree between the order of the input strokes and the order of the target strokes.
For example, if the target dictation character is "王", the target stroke order included in the corresponding target sequence data is "一, 一, 丨, 一". If the stroke order corresponding to the first sequence data is "一, 丨", the determined second matching degree may be 50%. The second detection result may display the second matching degree, the target dictation character "王" with its stroke order "一, 一, 丨, 一", and the stroke order "一, 丨" corresponding to the first sequence data, and the third and fourth strokes "丨, 一" of "一, 一, 丨, 一" may be highlighted.
In the embodiment of the disclosure, under the condition that the first matching degree between the character to be detected and the target dictation character is greater than or equal to the threshold value, the second detection result corresponding to the first sequence data can be generated and displayed according to the input position, type and/or sequence of each input operation in the first sequence data. Therefore, the method and the device not only help the user to visually see errors occurring in the writing process, but also help the user to better master the correct writing mode of the target dictation character.
In order to implement the above embodiments, the present disclosure further provides a device for processing dictation data.
Fig. 3 is a schematic structural diagram of a device for processing dictation data according to an embodiment of the present disclosure.
As shown in fig. 3, the apparatus 300 for processing dictation data may include: the device comprises an acquisition module 310, a first generation module 320, a first determination module 330, a second determination module 340 and a second generation module 350.
The obtaining module 310 is configured to obtain first sequence data to be detected, a target dictation character and a first reference character in response to monitoring that a dictation data detection control in a dictation service interface is triggered, where a similarity between the first reference character and the target dictation character is greater than a threshold.
And a first generating module 320, configured to generate the character to be detected according to the first sequence data.
The first determining module 330 is configured to determine a first matching degree between the character to be detected and the target dictation character.
The second determining module 340 is configured to determine a similarity between the character to be detected and the first reference character when the first matching degree is smaller than the threshold.
And a second generating module 350, configured to generate and display a first detection result corresponding to the first sequence data according to the similarity.
In a possible implementation manner, the obtaining module is specifically configured to:
acquiring second sequence data, wherein the second sequence data comprises the sequence, the input position and the type of each input operation;
in response to the fact that the type of any input operation in the second sequence data is erasing, determining a first target input operation associated with any input operation according to the sequence and input positions of any input operation in the second sequence data, wherein any input operation is used for erasing data input by the first target input operation;
and clearing that input operation and the first target input operation from the second sequence data to generate the first sequence data.
In a possible implementation manner, the apparatus further includes a third generating module, configured to generate and display a second detection result corresponding to the first sequence data when the first matching degree is greater than or equal to the threshold. The third generating module includes:
the generating unit is used for generating and displaying a second detection result according to the sequence and the input position of the input operation with the type of erasure in the first sequence data under the condition that the input operation with the type of erasure is contained in the first sequence data;
or,
and the generating unit is used for generating and displaying a second detection result according to the input position and the sequence of each input operation in the first sequence data under the condition that the input operation with the type of erasure is not contained in the first sequence data.
In a possible implementation manner, the generating unit is specifically configured to:
determining a second target input operation and a third target input operation which are related according to the sequence and the input position of the input operation with the type of erasure in the first sequence data, wherein the input operation with the type of erasure is used for erasing data input by the second target input operation, and the third target input operation is a next input operation adjacent to the input operation with the type of erasure;
acquiring a second reference character according to data input by a second target input operation and data input by a third target input operation, wherein the second reference character comprises the data input by the second target input operation and/or the data input by the third target input operation;
and generating and displaying a second detection result containing the target dictation character and a second reference character.
In a possible implementation manner, the generating unit is further specifically configured to:
determining target sequence data corresponding to target dictation characters, wherein the target sequence data comprises the sequence of each target stroke;
determining an input stroke corresponding to each input operation according to the input position of each input operation;
and generating and displaying a second detection result according to the second matching degree between the sequence of each input stroke and the sequence of each target stroke.
In a possible implementation manner, before it is monitored that the dictation data detection control in the dictation service interface is triggered, the apparatus is further configured to:
and responding to the audio data played currently as the last audio data in the audio sequence corresponding to the target dictation content, and displaying a dictation data detection control on a display interface.
The functions and specific implementation principles of the modules in the embodiments of the present disclosure may refer to the embodiments of the methods, and are not described herein again.
The dictation data processing apparatus of the embodiment of the disclosure first acquires, in response to monitoring that the dictation data detection control in the dictation service interface is triggered, the first sequence data to be detected, the target dictation character, and the first reference character; then generates the character to be detected according to the first sequence data and determines the first matching degree between the character to be detected and the target dictation character; and finally, when the first matching degree is smaller than the threshold, determines the similarity between the character to be detected and the first reference character and generates and displays the first detection result corresponding to the first sequence data according to the similarity. In this way, the character to be detected is checked against both the target dictation character and the reference character and the result is displayed, so that it can be detected whether the user's dictation result is accurate, the user can intuitively see the cause of the error, and the user can better master the dictation content.
In order to implement the above embodiments, the present disclosure also provides an electronic device, including: the device comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, and when the processor executes the program, the processing method of dictation data as proposed by the foregoing embodiments of the present disclosure is realized.
In order to achieve the above embodiments, the present disclosure also proposes a non-transitory computer-readable storage medium storing a computer program, which when executed by a processor implements a processing method of dictation data as proposed by the foregoing embodiments of the present disclosure.
In order to implement the foregoing embodiments, the present disclosure also proposes a computer program product, which when executed by an instruction processor in the computer program product, executes the processing method of dictation data as proposed by the foregoing embodiments of the present disclosure.
FIG. 4 illustrates a block diagram of an exemplary electronic device suitable for use in implementing embodiments of the present disclosure. The electronic device 12 shown in fig. 4 is only an example and should not bring any limitations to the functionality and scope of use of the embodiments of the present disclosure.
As shown in FIG. 4, electronic device 12 is embodied in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. These architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus, to name a few.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 28 may include computer system readable media in the form of volatile Memory, such as Random Access Memory (RAM) 30 and/or cache Memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact disk Read Only Memory (CD-ROM), a Digital versatile disk Read Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the disclosure.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally perform the functions and/or methodologies of the embodiments described in this disclosure.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public Network such as the Internet) via Network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing, for example, implementing the methods mentioned in the foregoing embodiments, by executing programs stored in the system memory 28.
According to the technical solution of the present disclosure, first, in response to monitoring that the dictation data detection control in the dictation service interface is triggered, the first sequence data to be detected, the target dictation character, and the first reference character are acquired; then the character to be detected is generated according to the first sequence data, and the first matching degree between the character to be detected and the target dictation character is determined; finally, when the first matching degree is smaller than the threshold, the similarity between the character to be detected and the first reference character is determined, and the first detection result corresponding to the first sequence data is generated and displayed according to the similarity. In this way, the character to be detected is checked against both the target dictation character and the reference character and the result is displayed, so that it can be detected whether the user's dictation result is accurate, the user can intuitively see the cause of the error, and the user can better master the dictation content.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present disclosure in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present disclosure.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present disclosure have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present disclosure, and that changes, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present disclosure.

Claims (16)

1. A method for processing dictation data, comprising:
in response to monitoring that a dictation data detection control in a dictation service interface is triggered, acquiring first sequence data to be detected, a target dictation character and a first reference character, wherein the similarity between the first reference character and the target dictation character is greater than a threshold value, the first reference character comprises characters with a structure similar to that of the target dictation character, a stroke similar to that of the target dictation character or a pronunciation similar to that of the target dictation character, and the first sequence data comprises start and stop positions and sequence of each input operation;
generating characters to be detected according to the first sequence data;
determining a first matching degree between the character to be detected and the target dictation character;
determining the similarity between the character to be detected and the first reference character under the condition that the first matching degree is smaller than a threshold value;
and generating and displaying a first detection result corresponding to the first sequence data according to the similarity, wherein the first detection result comprises one or more first reference characters with the highest similarity with the character to be detected.
2. The method of claim 1, wherein said obtaining first sequence data to be detected comprises:
acquiring second sequence data, wherein the second sequence data comprises the sequence, the input position and the type of each input operation;
in response to the fact that the type of any input operation in the second sequence data is erasing, determining a first target input operation associated with the any input operation according to the sequence and input positions of the any input operation in the second sequence data, wherein the any input operation is used for erasing data input by the first target input operation;
clearing the any input operation and the first target input operation in the second sequence data to generate the first sequence data.
3. The method of claim 1, wherein after said determining a first degree of match between said character to be detected and said target dictation character, further comprising:
and under the condition that the first matching degree is greater than or equal to the threshold, generating and displaying a second detection result corresponding to the first sequence data according to the input position, type and/or sequence of each input operation in the first sequence data.
4. The method as claimed in claim 3, wherein the generating and displaying the second detection result corresponding to the first sequence data according to the input position, type and/or sequence of each input operation in the first sequence data comprises:
under the condition that the first sequence data comprises input operation of which the type is erasure, generating and displaying a second detection result according to the sequence and the input position of the input operation of which the type is erasure in the first sequence data;
or,
and under the condition that the input operation with the type of erasure is not included in the first sequence data, generating and displaying a second detection result according to the input position and the sequence of each input operation in the first sequence data.
5. The method as claimed in claim 4, wherein the generating and displaying the second detection result according to the sequence of the input operation with the type of erasure and the input position in the first sequence data comprises:
determining a second target input operation and a third target input operation which are related according to the sequence and the input position of the input operation with the type of erasure in the first sequence data, wherein the input operation with the type of erasure is used for erasing the data input by the second target input operation, and the third target input operation is a next input operation adjacent to the input operation with the type of erasure;
acquiring a second reference character according to the data input by the second target input operation and the data input by the third target input operation, wherein the second reference character comprises the data input by the second target input operation and/or the data input by the third target input operation;
and generating and displaying a second detection result containing the target dictation character and the second reference character.
6. The method as claimed in claim 4, wherein the generating and displaying a second detection result according to the input position and the sequence of each input operation in the first sequence data comprises:
determining target sequence data corresponding to the target dictation character, wherein the target sequence data comprises the sequence of each target stroke;
determining an input stroke corresponding to each input operation according to the input position of each input operation;
and generating and displaying a second detection result according to the second matching degree between the sequence of each input stroke and the sequence of each target stroke.
7. The method of any of claims 1-6, wherein before it is monitored that the dictation data detection control in the dictation service interface is triggered, the method further comprises:
and responding to the audio data played currently as the last audio data in the audio sequence corresponding to the target dictation content, and displaying the dictation data detection control on a display interface.
8. A device for processing dictation data, comprising:
the acquisition module is used for responding to the fact that a dictation data detection control in a dictation service interface is triggered, acquiring first sequence data to be detected, target dictation characters and first reference characters, wherein the similarity between the first reference characters and the target dictation characters is larger than a threshold value, the first reference characters comprise characters which are similar to the target dictation characters in structure, strokes or pronunciation, and the first sequence data comprise start and stop positions and sequence of each input operation;
the first generation module is used for generating characters to be detected according to the first sequence data;
the first determining module is used for determining a first matching degree between the character to be detected and the target dictation character;
the second determining module is used for determining the similarity between the character to be detected and the first reference character under the condition that the first matching degree is smaller than a threshold value;
and the second generating module is used for generating and displaying a first detection result corresponding to the first sequence data according to the similarity, wherein the first detection result comprises one or more first reference characters with the highest similarity to the character to be detected.
9. The apparatus of claim 8, wherein the obtaining module is specifically configured to:
acquiring second sequence data, wherein the second sequence data comprises the sequence, the input position and the type of each input operation;
in response to the fact that the type of any input operation in the second sequence data is erasing, determining a first target input operation associated with the any input operation according to the sequence and input positions of the any input operation in the second sequence data, wherein the any input operation is used for erasing data input by the first target input operation;
clearing the any input operation and the first target input operation in the second sequence data to generate the first sequence data.
10. The apparatus of claim 8, wherein the apparatus further comprises:
and the third generating module is used for generating and displaying a second detection result corresponding to the first sequence data according to the input position, type and/or sequence of each input operation in the first sequence data under the condition that the first matching degree is greater than or equal to the threshold value.
11. The apparatus of claim 10, wherein the third generation module comprises:
the generating unit is used for generating and displaying a second detection result according to the sequence and the input position of the input operation with the type of erasure in the first sequence data under the condition that the first sequence data comprises the input operation with the type of erasure;
alternatively,
the generating unit is used for generating and displaying a second detection result according to the input position and the sequence of each input operation in the first sequence data under the condition that the input operation with the type of erasure is not included in the first sequence data.
12. The apparatus of claim 11, wherein the generating unit is specifically configured to:
determining an associated second target input operation and third target input operation according to the sequence and the input position of the input operation with the type of erasure in the first sequence data, wherein the input operation with the type of erasure is used for erasing the data input by the second target input operation, and the third target input operation is the next input operation adjacent to the input operation with the type of erasure;
acquiring a second reference character according to the data input by the second target input operation and the data input by the third target input operation, wherein the second reference character comprises the data input by the second target input operation and/or the data input by the third target input operation;
and generating and displaying a second detection result containing the target dictation character and the second reference character.
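Claim 12 uses the hesitation itself as a signal: the character the user erased and the one written in its place are both candidates for what the user was confusing with the target. A brief sketch of assembling that second detection result (function and key names are hypothetical):

```python
def build_second_result(target_char, erased_data, rewritten_data, recognize):
    # The second reference character is drawn from the data erased by the
    # second target input operation and/or the data of the third target
    # input operation (the stroke written immediately after the erase).
    references = {recognize(erased_data), recognize(rewritten_data)}
    references.discard(target_char)  # keep only genuine alternatives
    return {"target": target_char, "second_references": sorted(references)}
```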
13. The apparatus of claim 11, wherein the generating unit is further specifically configured to:
determining target sequence data corresponding to the target dictation character, wherein the target sequence data comprises the sequence of each target stroke;
determining an input stroke corresponding to each input operation according to the input position of each input operation;
and generating and displaying a second detection result according to a second matching degree between the sequence of the input strokes and the sequence of the target strokes.
14. The apparatus of any of claims 8-13, wherein, prior to the monitoring that a dictation data detection control in a dictation service interface is triggered, the apparatus is further configured to:
display the dictation data detection control on a display interface in response to the currently played audio data being the last audio data in the audio sequence corresponding to the target dictation content.
15. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing a method for processing dictation data as claimed in any one of claims 1 to 7 when executing the program.
16. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out a method for processing dictation data as claimed in any one of claims 1 to 7.
CN202110853809.9A 2021-07-28 2021-07-28 Dictation data processing method and device, electronic equipment and storage medium Active CN113298082B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110853809.9A CN113298082B (en) 2021-07-28 2021-07-28 Dictation data processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113298082A CN113298082A (en) 2021-08-24
CN113298082B CN113298082B (en) 2021-11-09

Family

ID=77331255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110853809.9A Active CN113298082B (en) 2021-07-28 2021-07-28 Dictation data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113298082B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114629707A * 2022-03-16 2022-06-14 Sangfor Technologies Inc. Method and device for detecting messy codes, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112068747A * 2020-11-16 2020-12-11 Nanjing Zibohui Information Technology Co., Ltd. Processing method of dictation information

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6430197B2 * 2014-09-30 2018-11-28 Toshiba Corp Electronic apparatus and method
CN106095778A * 2016-05-26 2016-11-09 DataGrand Information Technology (Shanghai) Co., Ltd. Automatic error correction method for Chinese search words in search engines
CN111081079A * 2019-05-10 2020-04-28 Guangdong Xiaotiancai Technology Co., Ltd. Dictation control method and device based on dictation condition
CN110222584A * 2019-05-14 2019-09-10 Shenzhen Transsion Holdings Co., Ltd. Handwriting input recognition method and device

Similar Documents

Publication Publication Date Title
CN109086203B (en) Page detection method and device
CN106919333B (en) Method and device for recording writing content on electronic writing board
CN109491902B (en) Interactive testing method, device and system
KR101825154B1 (en) Overlapped handwriting input method
CN107657853B (en) Recitation assisting method and device
CN110090444B (en) Game behavior record creating method and device, storage medium and electronic equipment
CN109102824B (en) Voice error correction method and device based on man-machine interaction
CN107180117B (en) Chart recommendation method and device and computer equipment
CN113298082B (en) Dictation data processing method and device, electronic equipment and storage medium
US20200334290A1 (en) Facilitating contextual video searching using user interactions with interactive computing environments
CN110850982A (en) AR-based human-computer interaction learning method, system, device and storage medium
CN112685672A (en) Method and device for backtracking page session behavior track and electronic equipment
CN115147474B (en) Method and device for generating point cloud annotation model, electronic equipment and storage medium
CN113849106B (en) Page turning handwriting processing method, device, electronic device and storage medium
CN111833847A (en) Speech processing model training method and device
CN109299294B (en) Resource searching method and device in application, computer equipment and storage medium
CN114639056A (en) Live content identification method and device, computer equipment and storage medium
CN110647826B (en) Method and device for acquiring commodity training picture, computer equipment and storage medium
CN116048374B (en) Online examination method and system for virtual invisible keyboard
CN113743704A (en) Event monitoring method and device, computer equipment and storage medium
CN105847991A (en) Multimedia data playing method and terminal
CN110910476A (en) Handwriting erasing method, device, medium and electronic equipment
CN109887491B (en) Acoustic model training method and device, electronic equipment and computer readable medium
CN111159433A (en) Content positioning method and electronic equipment
CN112698739B (en) Control method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant