CN113157194B - Text display method, electronic equipment and storage device - Google Patents


Info

Publication number
CN113157194B
CN113157194B (Application No. CN202110277573.9A)
Authority
CN
China
Prior art keywords
text
target text
target
original
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110277573.9A
Other languages
Chinese (zh)
Other versions
CN113157194A (en)
Inventor
孙国俊
杨宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Xunfei Reading And Writing Technology Co ltd
Original Assignee
Hefei Xunfei Reading And Writing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Xunfei Reading And Writing Technology Co ltd
Priority to CN202110277573.9A
Publication of CN113157194A
Application granted
Publication of CN113157194B
Legal status: Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04886Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures by partitioning the display area of the touch-screen or the surface of the digitising tablet into independently controllable areas, e.g. virtual keyboards or menus

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses a text display method, electronic equipment and a storage device. The text display method comprises the following steps: acquiring an image to be recognized, wherein the image to be recognized contains an original text; performing text recognition on the image to be recognized to obtain a recognition text corresponding to the original text; displaying the recognition text in a first area of the display interface, and displaying a first target text in the recognition text in a first preset highlighting format; and displaying the original text in a second area of the display interface, and displaying a second target text corresponding to the first target text in the original text in a second preset highlighting format. By means of the scheme, text proofreading efficiency can be improved.

Description

Text display method, electronic equipment and storage device
Technical Field
The present disclosure relates to the field of text processing technologies, and in particular, to a text display method, an electronic device, and a storage device.
Background
With the development of information technology, automatic recognition of text in an image by a computer has been applied to various industries and scenes. In a real scene, after recognition is finished, it is usually necessary to manually check whether errors exist in the recognition result, which is time-consuming and inefficient. In view of this, how to improve text proofreading efficiency is a very valuable subject.
Disclosure of Invention
The present application mainly provides a text display method, an electronic device, and a storage device, which can solve the above technical problem and improve text proofreading efficiency.
In order to solve the above problem, a first aspect of the present application provides a text display method, including: acquiring an image to be recognized, wherein the image to be recognized contains an original text; performing text recognition on the image to be recognized to obtain a recognition text corresponding to the original text; displaying the recognition text in a first area of the display interface, and displaying a first target text in the recognition text in a first preset highlighting format; and displaying the original text in a second area of the display interface, and displaying a second target text corresponding to the first target text in the original text in a second preset highlighting format.
In order to solve the above problem, a second aspect of the present application provides an electronic device, including a memory, a man-machine interaction circuit, and a processor, where the memory and the man-machine interaction circuit are coupled to the processor, and the memory stores program instructions, and the processor is configured to execute the program instructions to implement the text display method in the first aspect.
In order to solve the above-mentioned problem, a third aspect of the present application provides a storage device storing program instructions executable by a processor for implementing the text display method in the above-mentioned first aspect.
According to the above scheme, the image to be recognized, which contains the original text, is obtained, and text recognition is performed on it to obtain the recognition text corresponding to the original text. The recognition text is then displayed in the first area of the display interface, with the first target text in the recognition text displayed in the first preset highlighting format; the original text is displayed in the second area of the display interface, with the second target text corresponding to the first target text displayed in the second preset highlighting format. In this way, the original text and the recognition text can be displayed simultaneously in different areas of the display interface, and the associated target texts in the two can be highlighted simultaneously. In the text proofreading process, the user only needs to pay attention to the target texts highlighted in the different areas of the display interface in order to quickly proofread the recognition text, which helps to improve proofreading efficiency.
Drawings
FIG. 1 is a flow chart of an embodiment of a text display method of the present application;
FIG. 2 is a schematic state diagram of an embodiment of a text display method of the present application;
FIG. 3 is a flow diagram of acquiring the second target text in the case where the first target text is acquired first;
FIG. 4 is a flow diagram of acquiring the first target text in the case where the second target text is acquired first;
FIG. 5 is a flow diagram of one embodiment of obtaining a first target text in an edit mode;
FIG. 6 is a schematic state diagram of another embodiment of a text display method of the present application;
FIG. 7 is a schematic diagram of a frame of an embodiment of an electronic device of the present application;
FIG. 8 is a schematic diagram of a frame of an embodiment of a storage device of the present application.
Detailed Description
The following describes the embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association relationship between associated objects, meaning that there may be three relationships; for example, A and/or B may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship. Further, "a plurality" herein means two or more.
Referring to fig. 1, fig. 1 is a flow chart illustrating an embodiment of a text display method of the present application.
Specifically, the method may include the steps of:
step S11: acquiring an image to be recognized, wherein the image to be recognized contains the original text.
In one implementation scenario, the image to be recognized may be obtained by photographing. For example, a printed article such as a book, journal, or newspaper may be photographed to obtain the image to be recognized, in which case the original text may include text printed on the printed article; alternatively, handwritten records such as classroom notes, reading notes, and meeting minutes may be photographed to obtain the image to be recognized, in which case the original text may include the handwritten text contained in the handwritten records.
In another implementation scenario, the embodiment of the disclosure may be applied to an electronic device such as a tablet computer, where a user may write characters on a touch screen of the electronic device through a stylus, or where the electronic device may also be connected to a touch pad, where a user may write characters on the touch pad through the stylus. In this case, the electronic device may display a writing area, and capture a writing track of the user in the writing area, so as to obtain an original text based on the writing track, and further obtain, after receiving the save instruction, an image to be identified based on the writing area and the original text on the writing area.
Step S12: performing text recognition on the image to be recognized to obtain the recognition text corresponding to the original text.
In one implementation scenario, text detection may be performed on the image to be recognized to obtain a text region in the image to be recognized, and text recognition may then be performed on the text region to obtain the recognition text corresponding to the original text in the text region. That is, text recognition can be implemented in two stages, namely "text detection" and "text recognition".
In a specific implementation scenario, in order to improve text detection efficiency, a text detection network may be used to perform text detection on the image to be recognized, so as to obtain the text region in the image to be recognized. The text detection network may include, but is not limited to: Faster R-CNN, TextBoxes, DMPNet (Deep Matching Prior Network), CTPN (Connectionist Text Proposal Network), etc., which is not limited herein.
In another specific implementation scenario, in order to improve text recognition efficiency, a text recognition network may be used to perform text recognition on the text region, so as to obtain the recognition text corresponding to the original text in the text region. The text recognition network may include, but is not limited to: CNN (Convolutional Neural Network) + softmax, CNN + RNN (Recurrent Neural Network) + CTC (Connectionist Temporal Classification), and the like, which is not limited herein.
In another implementation scenario, in order to further improve the efficiency of text recognition, an end-to-end text recognition network may be trained in advance, so that an image to be recognized may be recognized by using the text recognition network, and a recognition text corresponding to the original text may be obtained. In particular, the end-to-end text recognition network may include, but is not limited to: FOTS (Fast Oriented Text Spotting), etc., without limitation herein.
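The two-stage pipeline described above (detection, then recognition of each detected region) can be sketched as follows. This is a structural sketch only: the detector and recognizer below are hypothetical stubs standing in for trained networks such as CTPN or CNN+RNN+CTC, and the dict-based "image" is a placeholder for real pixel data.

```python
def detect_text_regions(image):
    """Stand-in for a text detection network: return bounding boxes
    (x, y, w, h) of the text regions found in the image."""
    # In this sketch the "image" is a dict mapping region boxes to
    # their content, a placeholder for real pixel data.
    return list(image.keys())

def recognize_region(image, box):
    """Stand-in for a text recognition network: return the recognition
    text string for one detected region."""
    return image[box]

def recognize_image(image):
    """Two-stage pipeline: detect text regions, then recognize each one."""
    return [recognize_region(image, box) for box in detect_text_regions(image)]
```

An end-to-end network such as FOTS would collapse both stubs into a single model call, which is the efficiency gain the paragraph above alludes to.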
Step S13: displaying the recognition text in the first area of the display interface, and displaying the first target text in the recognition text in the first preset highlighting format.
In one implementation scenario, the original text contains a plurality of original characters, and the recognition text contains recognition characters respectively corresponding to the plurality of original characters; through text recognition of the image to be recognized, both the recognition characters corresponding to the original characters and the position coordinates of the original characters in the image to be recognized can be obtained. In this case, the recognition character corresponding to an original character may be displayed in the first area of the display interface based on the position coordinates of that original character. Specifically, the distance from the original character to the edge of the image to be recognized may be obtained based on the position coordinates of the original character, the display position in the first area of the recognition character corresponding to the original character may be determined based on that distance, and the recognition character may then be displayed at that display position.
In one specific implementation scenario, referring to table 1 in combination, table 1 is a schematic diagram of recognition text display positions. As shown in table 1, in the image to be recognized: if the distance from the last original character in the i-th line of original text to the right edge of the image to be recognized is not less than a preset distance, and the distance from the first original character in the (i+1)-th line of original text to the left edge of the image to be recognized is not less than the preset distance, then when displayed in the first area, the recognition text corresponding to the (i+1)-th line is indented by a preset number (e.g., 2) of characters and used as a new paragraph; if the distance from the last original character in the i-th line to the right edge is not less than the preset distance, but the distance from the first original character in the (i+1)-th line to the left edge is less than the preset distance, then the recognition text corresponding to the (i+1)-th line is used as a new paragraph without indentation; if the distance from the last original character in the i-th line to the right edge is less than the preset distance, but the distance from the first original character in the (i+1)-th line to the left edge is not less than the preset distance, then the recognition text corresponding to the (i+1)-th line is likewise indented by the preset number (e.g., 2) of characters and used as a new paragraph; and if both distances are less than the preset distance, the recognition text corresponding to the (i+1)-th line is continued directly after the recognition character corresponding to the last original character in the i-th line.
Table 1: Schematic table of recognition text display positions

Last char of line i to right edge | First char of line i+1 to left edge | Display of recognition text for line i+1
>= preset distance | >= preset distance | New paragraph, indented by preset number of characters
>= preset distance | < preset distance | New paragraph, no indentation
< preset distance | >= preset distance | New paragraph, indented by preset number of characters
< preset distance | < preset distance | Continued after line i
In another specific implementation scenario, the preset distance may be set to 100px, 55px, etc., which is not limited herein. In addition, the preset distance can be set according to practical application conditions. For example, the original text and the recognized text may be displayed on the display screen of the electronic device at the same time, and if the screen size of the display screen is larger than the preset size (e.g., 10 inches), the preset distance may be set to be slightly larger, for example, 100px, whereas if the screen size of the display screen is not larger than the preset size, the preset distance may be set to be slightly smaller, for example, 55px.
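The four layout rules summarized in Table 1 can be sketched as a small decision function. This is an illustrative sketch, assuming a default preset distance of 100px; the function and return-value names are not from the patent.

```python
def layout_action(right_gap, next_left_gap, preset=100):
    """Decide how line i+1 of the recognition text is displayed.

    right_gap: distance (px) from line i's last original character
               to the right edge of the image to be recognized.
    next_left_gap: distance (px) from line i+1's first original
               character to the left edge of the image to be recognized.
    """
    if next_left_gap >= preset:
        # Line i+1 starts away from the left edge: new paragraph,
        # indented by the preset number (e.g., 2) of characters.
        # (This covers both rows of Table 1 where the left gap is large.)
        return "new_paragraph_indented"
    if right_gap >= preset:
        # Line i ends early but line i+1 starts flush with the edge:
        # new paragraph without indentation.
        return "new_paragraph_no_indent"
    # Both lines are flush with the image edges: the recognition text
    # for line i+1 continues on the same displayed line.
    return "continue_line"
```

Note that in Table 1 the left-edge gap alone decides whether the new paragraph is indented, which is why the first branch does not inspect `right_gap`.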
In yet another embodiment, please refer to fig. 2 in combination; fig. 2 is a schematic state diagram of an embodiment of the text display method of the present application. Taking the foregoing preset distance of 100px as an example, as shown in fig. 2, the left and right dot-dash lines in the second area represent the left edge and the right edge of the image to be recognized, respectively. The last original character of the line 2 original text is a colon, and its distance to the right edge of the image to be recognized is not less than 100px; meanwhile, the distance from the first original character of the line 3 original text "(1) AI intelligent recognition: ..." to the left edge of the image to be recognized is not less than 100px. Therefore, when displayed in the first area, the recognition text corresponding to the line 3 original text is used as a new paragraph and indented by 2 characters. In contrast, the distance from the last original character of the line 1 original text to the right edge of the image to be recognized is less than 100px, and the distance from the first original character of the line 2 original text to the left edge is less than 100px, so the recognition text corresponding to the line 2 original text is continued directly after the recognition text corresponding to the line 1 original text.
In one implementation scenario, the recognition text includes a number of first sub-texts, and the number of first sub-texts each end with a preset clause symbol, that is, the recognition text is divided into a number of first sub-texts by the preset clause symbol. And the first target text belongs to a plurality of first sub-texts contained in the identification text, namely the first target text is one of the first sub-texts.
In one specific implementation scenario, the preset clause symbol may include, but is not limited to: a question mark, an exclamation mark, a period, a comma, and the like. It should be noted that the preset clause symbol is not limited to either the Chinese-state symbol or the English-state symbol; for example, both the Chinese-state period "。" and the English-state period "." can be regarded as preset clause symbols, and other cases can be similar and are not exemplified here. In addition, the preset clause symbol may further include a line-feed symbol; that is, if a certain sub-text does not end with a punctuation mark but the paragraph ends at that sub-text, the sub-text may also be used as a first sub-text.
In another specific implementation scenario, please continue to refer to fig. 2, taking the recognition text in the first region as an example. The recognition text is divided by the preset clause symbols into the first sub-text "currently completely synchronized on T1B,", the first sub-text "at the same time,", and the first sub-text "the noise reduction algorithm is optimized.". In addition, although the sub-text "includes the following:" does not end with a punctuation mark such as a question mark, an exclamation mark, a period, or a comma, it ends with a line-feed symbol (i.e., the paragraph ends at that sub-text), so it can also be used as a first sub-text. Other situations can be similar and are not exemplified here.
In yet another specific implementation scenario, the first sub-texts may be used as the first target text one by one, in order from first to last in the recognition text, and the second target text corresponding to the current first target text in the original text may be obtained, so that the user can check the first target text and the corresponding second target text; when a user check-completion instruction is received, the next first sub-text is automatically used as the first target text, and so on, which is not further exemplified here. In this way, the first target text can be highlighted in the first area automatically and in order, and the second target text corresponding to it can be highlighted in the second area, so that text proofreading efficiency and the degree of automation can be further improved.
In yet another specific implementation scenario, the first target text may also be selected by the user among a number of first sub-texts. Based on the above, the second target text corresponding to the first target text may be obtained based on the first target text selected by the user, and the specific process may refer to the following disclosure embodiments, which are not described herein.
In another implementation scenario, the first preset highlighting format may be set according to actual application needs. For example, the first preset highlighting format may include, but is not limited to: the reverse display, the bold display, the underline display, etc., are not limited herein. With continued reference to fig. 2, as shown in fig. 2, the first target text "(2) supports two simultaneous interpretation modes of simultaneous interpretation, along with transliteration-mesoscopic, along with transliteration-english. "may be bolded to highlight the first target text; alternatively, the first target text may be displayed in reverse, that is, the background color of the first target text may be set to black and the text color may be set to white, so as to highlight the first target text; alternatively, an underline may be displayed under the first target text to highlight the first target text; alternatively, a box (such as a dashed box) may be displayed on the periphery of the first target text to highlight the first target text, which is not limited herein.
Step S14: displaying the original text in the second area of the display interface, and displaying the second target text corresponding to the first target text in the original text in the second preset highlighting format.
It should be noted that, in the actual application process, step S13 and step S14 are not limited to a particular execution order. In one implementation scenario, step S13 may be performed first: as described above, the first target text in the recognition text may be obtained automatically based on the order of the first sub-texts in the recognition text, or based on the user's selection; once the first target text is obtained, the second target text corresponding to it in the original text may be obtained based on the first target text, and then step S14 may be performed. Alternatively, in another implementation scenario, the second target text in the original text may be obtained first, step S14 may be performed, and the first target text corresponding to the second target text in the recognition text may be obtained based on the second target text, after which step S13 is performed. Alternatively, in still another implementation scenario, whichever target text is obtained first, its counterpart may be obtained from it, and step S13 and step S14 may then be performed simultaneously. In the actual application process, the execution order of step S13 and step S14 may be set according to actual application requirements, which is not limited herein.
In one implementation scenario, the original text may include a number of second sub-texts, and the number of second sub-texts each end with a preset clause symbol, that is, the original text is divided into a number of second sub-texts by the preset clause symbol. And the second target text belongs to a plurality of second sub-texts contained in the original text, namely the second target text is one of the second sub-texts.
In one specific implementation scenario, the preset clause symbol may include, but is not limited to: question marks (i.e.'.
In another specific implementation scenario, please continue to refer to fig. 2, taking the original text in the second region as an example. Through text recognition, the recognition characters corresponding to the original characters can be obtained, and these recognition characters include the preset clause symbols "，", "。", and a line-feed symbol. In this case, the original text can be divided into the second sub-text "currently completely synchronized on T1B,", the second sub-text "at the same time,", the second sub-text "the noise reduction algorithm is optimized.", and the second sub-text "includes the following:". Other situations can be similar and are not exemplified here. It should be noted that, through text recognition, not only the recognition character corresponding to each original character but also the position information (e.g., position coordinates) of the original character in the image to be recognized can be obtained. On this basis, each second sub-text in the original text can be distinguished by the position information of the original characters it contains. For example, the original characters whose position coordinates in the image to be recognized are (x0, y0), ..., (x1, y0) belong to the second sub-text 01, the original characters whose position coordinates are (x2, y0), ..., (x3, y0) belong to the second sub-text 02, and so on, which is not further exemplified here.
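The association between each second sub-text and the position coordinates of its original characters can be sketched as follows. The input format here is hypothetical, a list of (recognition_character, (x, y)) pairs in reading order, and the function name is illustrative; the point is that each sub-text carries the coordinates needed to later highlight it in the second area.

```python
# Same preset clause symbols as used for splitting the recognition text.
CLAUSE_SYMBOLS = set("？！。，?!.,\n")

def group_by_sub_text(chars_with_coords):
    """Group recognized characters into second sub-texts, keeping the
    position coordinates of the original characters in each group."""
    groups, text, coords = [], "", []
    for ch, xy in chars_with_coords:
        text += ch
        coords.append(xy)
        if ch in CLAUSE_SYMBOLS:
            groups.append({"text": text, "coords": coords})
            text, coords = [], ""
            text, coords = "", []
    if text:  # trailing sub-text without a closing clause symbol
        groups.append({"text": text, "coords": coords})
    return groups
```

Given such groups, highlighting the second target text in the second area reduces to drawing the highlight over the coordinate set of the selected group.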
In yet another specific implementation scenario, the second sub-texts may be used as the second target text one by one, in order from first to last in the original text, and the first target text corresponding to the current second target text in the recognition text may be obtained, so that the user can check the first target text and the second target text; when a user check-completion instruction is received, the next second sub-text is automatically used as the second target text, and so on, which is not further exemplified here. In this way, the second target text can be highlighted in the second area automatically and in order, and the first target text corresponding to it can be highlighted in the first area, so that text proofreading efficiency and the degree of automation can be further improved.
In yet another specific implementation scenario, the second target text may also be selected by the user among a number of second sub-texts. Based on the above, the first target text corresponding to the second target text may be obtained based on the second target text selected by the user, and the specific process may refer to the following disclosure embodiments, which are not described herein.
In another implementation scenario, the second preset highlighting format may be set according to actual application needs. For example, the second preset highlighting format may include, but is not limited to: background highlighting, reverse display, bold display, underline display, and the like, which is not limited herein. With continued reference to fig. 2, the second target text "(2) supports two simultaneous interpretation modes of simultaneous interpretation, along with transliteration-mesoscopic, along with transliteration-english." may be highlighted with a grey ground tint so as to highlight the second target text.
In yet another implementation scenario, as described above, the first target text may be acquired first, and then the second target text corresponding to it in the original text may be acquired based on the first target text; alternatively, the second target text may be acquired first, and then the first target text corresponding to it in the recognition text may be acquired based on the second target text. In either way, after the mutually corresponding first target text and second target text are acquired, the first target text is displayed in the first preset highlighting format and the second target text is displayed in the second preset highlighting format, so that the associated target texts in the original text and the recognition text are highlighted simultaneously in different areas of the display interface. In the text proofreading process, the user therefore only needs to pay attention to the target texts highlighted in the different areas. With continued reference to fig. 2, in the first region the first target text "(2) supports two simultaneous interpretation modes of simultaneous interpretation, along with transliteration-mesoscopic, along with transliteration-english." is displayed in bold, and in the second region the corresponding second target text is highlighted; during proofreading, only these two highlighted texts need to be focused on. Other situations can be similar and are not exemplified here.
In one implementation, the first region and the second region may be disposed above and below the display interface. For example, when the electronic device is placed on a portrait screen, the first area and the second area may be disposed above and below the display interface. Specifically, as shown in fig. 2, the first area may be disposed at a lower portion of the display interface, and the second area may be disposed at an upper portion of the display interface; alternatively, the first region may be provided at an upper portion of the display interface, and the second region may be provided at a lower portion of the display interface.
In another implementation scenario, the first area and the second area may also be disposed side by side on the display interface. For example, when the electronic device is in landscape orientation, the first area and the second area may be arranged horizontally. Specifically, the first area may be disposed at the left portion of the display interface and the second area at the right portion; alternatively, the first area may be disposed at the right portion and the second area at the left portion, which is not limited herein.
With continued reference to fig. 2, in the case where there are multiple images to be identified, in order to facilitate the user to quickly locate the images to be identified that are desired to be displayed, the display interface may further include a third area, and the third area may display thumbnails of the multiple images to be identified. On the basis, the image to be recognized which the user desires to display can be determined based on the click coordinates of the user in the third area, so that the original text in the image to be recognized can be displayed in the second area.
With continued reference to fig. 2, in order to enhance the user interaction experience, the display interface may further include a fourth area, and the fourth area may display a number of function keys. For example, the fourth area may display the time (e.g., "6:12 pm" in fig. 2); alternatively, it may display the battery level; alternatively, it may display a print key, and send the original text and/or the identification text to a printer for printing upon receiving a trigger instruction of the user for the print key; alternatively, it may display an export key, and export the identification text into a document of a preset format (e.g., ".doc" format, ".txt" format, etc.) upon receiving a trigger instruction of the user for the export key. The function keys displayed in the fourth area may be set according to actual application needs, which is not limited herein.
According to the above scheme, an image to be recognized containing an original text is acquired, and text recognition is performed on it to obtain an identification text corresponding to the original text. The identification text is displayed in the first area of the display interface, with the first target text in the identification text displayed in a first preset highlighting format; the original text is displayed in the second area of the display interface, with the second target text corresponding to the first target text displayed in a second preset highlighting format. The original text and the identification text can thus be displayed simultaneously in different areas of the display interface, and the associated target texts in both can be highlighted simultaneously in those areas. In the text proofreading process, the user only needs to pay attention to the highlighted target texts, which helps the user proofread the identification text quickly and improves proofreading efficiency.
Referring to fig. 3, fig. 3 is a schematic flow chart of acquiring a second target text in the case of acquiring the first target text. In the embodiment of the disclosure, the recognition text includes a plurality of first sub-texts, and each first sub-text includes a plurality of recognition characters, and the specific division manner of the first sub-text may refer to the related description in the foregoing embodiment of the disclosure, which is not repeated herein. Specifically, embodiments of the present disclosure may include the steps of:
step S31: and acquiring the position information of the original characters corresponding to the identification characters in the first target text in the original text.
As described above, in the embodiment of the present disclosure, a first target text is obtained from an identification text, and then a second target text corresponding to the first target text in an original text is obtained based on the first target text. For convenience of description, the process of acquiring the first target text is explained first.
In one implementation scenario, a first click coordinate of the user in the first area may be acquired, and the identification character at the first click coordinate may be taken as a first target character; on this basis, the first sub-text to which the first target character belongs may be taken as the first target text. In this manner, the first target text at the user's click position can be obtained simply by the user clicking in the first area, which helps improve the convenience of acquiring the first target text.
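The click-to-target lookup described above can be sketched as follows. This is a minimal illustration only; the per-character box layout and the sub-text membership structure are assumptions for the sketch, not part of the embodiment:

```python
from dataclasses import dataclass

@dataclass
class RecChar:
    text: str           # the identification (recognized) character
    x: float            # top-left corner of its rendered box in the first area
    y: float
    w: float            # rendered box size
    h: float
    sub_text_id: int    # which first sub-text this character belongs to

def hit_test(chars, click_x, click_y):
    """Return the identification character whose rendered box contains the click."""
    for c in chars:
        if c.x <= click_x <= c.x + c.w and c.y <= click_y <= c.y + c.h:
            return c
    return None

def first_target_text(chars, click_x, click_y):
    """First target text = the whole first sub-text containing the clicked character."""
    target = hit_test(chars, click_x, click_y)  # the first target character
    if target is None:
        return None
    return "".join(c.text for c in chars if c.sub_text_id == target.sub_text_id)
```

A click anywhere inside a character's rendered box selects that character's entire sub-text, matching the single-click interaction described above.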
In a specific implementation scenario, an editing cursor may be displayed at the first click coordinate in the first area, and the editing cursor may specifically be displayed in a blinking form. Alternatively, the editing cursor may be set to be invisible, which is not limited herein.
In another specific implementation scenario, referring still to fig. 2, the user may click at a recognition character, e.g., the character "译", in the first sub-text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." (in the original Chinese, "(2)支持中译英、英译中两种同声传译模式。") displayed in the first area, so that the first click coordinate at that position can be acquired. On this basis, the identification character at the first click coordinate can be obtained, and the first sub-text to which it belongs can be taken as the first target text. Other cases can be deduced similarly and are not exemplified here.
It should be noted that, in the embodiment of the present disclosure, the position information includes the position coordinates of the original character in the image to be recognized, and the position information is obtained by performing text recognition on the image to be recognized. The text recognition may be specifically described with reference to the foregoing disclosed embodiments, and will not be repeated herein.
In one implementation scenario, the position coordinates may include a distance from the original character to the left edge of the image to be recognized and a distance from the original character to the upper edge of the image to be recognized; alternatively, the position coordinates may also include a distance from the original character to the right edge of the image to be recognized and a distance from the original character to the upper edge of the image to be recognized; alternatively, the position coordinates may also include a distance from the original character to the left edge of the image to be recognized and a distance from the original character to the lower edge of the image to be recognized; alternatively, the position coordinates may also include a distance from the original character to the lower edge of the image to be recognized and a distance from the original character to the right edge of the image to be recognized, which are not limited herein.
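Since the four conventions above differ only in which image edges the two distances are measured from, they can be normalized to a single anchor point in a common coordinate system. A small sketch (the function name and signature are illustrative, not from the embodiment):

```python
def anchor_point(dist_h, dist_v, img_w, img_h, h_edge="left", v_edge="top"):
    """Convert a character's two edge distances into an (x, y) anchor point in
    image coordinates with the origin at the upper-left corner of the image.

    h_edge: which vertical edge dist_h is measured from ("left" or "right").
    v_edge: which horizontal edge dist_v is measured from ("top" or "bottom").
    """
    x = dist_h if h_edge == "left" else img_w - dist_h
    y = dist_v if v_edge == "top" else img_h - dist_v
    return x, y
```

Normalizing once at recognition time lets all later steps (highlighting, hit-testing) assume a single left/top convention.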
In another implementation scenario, the character size of the original character in the image to be recognized can also be obtained by performing text recognition on the image to be recognized. Specifically, the character size may be the size of the smallest bounding box bounding the original character. For example, the minimum bounding box may be specifically a minimum rectangle that encloses the original character, or the minimum bounding box may be specifically a minimum circle that encloses the original character, which is not limited herein.
In yet another implementation scenario, still taking the first target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." as an example, the position information, in the image to be recognized, of the original characters corresponding to each identification character in the first target text can be acquired. Other cases can be deduced similarly and are not exemplified here.
Step S32: and obtaining a second target text based on the position information of the corresponding original character.
In one implementation scenario, after the position information of the corresponding original characters is acquired, the second target text corresponding to the first target text in the original text may be obtained through that position information. Still taking the first target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." as an example, the position information of the original characters corresponding to the identification characters "(", "2", ")", "支", ……, "模", "。" can be acquired, so that the sub-text composed of the original characters at those positions can be taken as the second target text. On this basis, the image to be recognized may be displayed in the second preset highlighting format at the positions corresponding to the above position information, e.g., highlighted at those positions.
In a specific implementation scenario, in the case where the character size of the original character is acquired, for the original character corresponding to each identification character, the original character may be displayed in the second preset highlighting format at its position in the image to be recognized and within its character size range. Taking the identification character "译" in the first target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." as an example, suppose the position information of its corresponding original character includes a first distance to the left edge of the image to be recognized and a second distance to the upper edge, and the character size of the original character includes a first width and a first height; then a rectangle with the first width and the first height can be determined, taking the position at the first distance from the left edge and the second distance from the upper edge as its upper-left vertex, and the rectangle can be displayed in the second preset highlighting format, e.g., displayed with gray shading.
In another specific implementation scenario, in the case where the character size of the original character is unknown, for the original character corresponding to each identification character, the original character may be displayed in the second preset highlighting format at its position in the image to be recognized and within a preset size range. Again taking the identification character "译" in the first target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." as an example, suppose the position information of its corresponding original character includes a first distance to the left edge of the image to be recognized and a second distance to the upper edge; then a rectangle with a preset width and a preset height can be determined, taking the position at the first distance from the left edge and the second distance from the upper edge as its upper-left vertex, and the rectangle can be displayed in the second preset highlighting format, e.g., displayed with gray shading.
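The two scenarios above (known character size vs. a preset fallback size) can be sketched together. The preset width/height values here are invented placeholders, not values from the embodiment:

```python
PRESET_W, PRESET_H = 24, 32  # hypothetical fallback size, in pixels

def highlight_rect(left, top, char_w=None, char_h=None):
    """Rectangle (x, y, w, h) to draw, e.g. as gray shading, over one original
    character. (left, top) are the character's distances to the left/upper image
    edges; the recognized character size is used when available, otherwise the
    preset size."""
    w = char_w if char_w is not None else PRESET_W
    h = char_h if char_h is not None else PRESET_H
    return (left, top, w, h)
```

One such rectangle per original character of the second target text yields the highlighted region in the second area.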
Different from the foregoing embodiment, each first sub-text contains a number of identification characters; on this basis, the position information, in the original text, of the original characters corresponding to the identification characters in the first target text is acquired, where the position information includes the position coordinates of the original characters in the image to be recognized and is obtained by performing text recognition on the image to be recognized; the second target text is then obtained based on the position information of the corresponding original characters. That is, the second target text is determined according to the position information of the original characters corresponding to the identification characters in the first target text, which helps improve the accuracy of acquiring the second target text.
Referring to fig. 4, fig. 4 is a schematic flow chart of acquiring a first target text in the case of acquiring a second target text. In the embodiment of the disclosure, the original text includes a plurality of second sub-texts, and each second sub-text includes a plurality of original characters, and the specific division manner of the second sub-text may refer to the related description in the foregoing embodiment of the disclosure, which is not repeated herein. Specifically, embodiments of the present disclosure may include the steps of:
step S41: and acquiring the identification characters corresponding to the original characters in the second target text in the identification text respectively.
As described above, in the embodiment of the present disclosure, the second target text is obtained from the original text, and then the first target text corresponding to the second target text in the identification text is obtained based on the second target text. For convenience of description, the process of acquiring the second target text is explained first.
In one implementation scenario, a second click coordinate of the user in the second area may be acquired, and the original character at the second click coordinate may be taken as a second target character; on this basis, the second sub-text to which the second target character belongs may be taken as the second target text. In this manner, the second target text at the user's click position can be obtained simply by the user clicking in the second area, which helps improve the convenience of acquiring the second target text.
In a specific implementation scenario, during text display, a certain offset distance may exist between the coordinate origin of the image to be recognized (e.g., the upper-left vertex of the image) and the origin of the second area (e.g., the upper-left vertex of the second area). After the second click coordinate of the user in the second area is acquired, the position coordinate, in the image to be recognized, of the user's click position can be obtained based on the second click coordinate and the offset distance, and the original character whose position information is closest to that position coordinate can then be taken as the second target character.
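The offset compensation and nearest-character lookup just described can be sketched as follows (the tuple layout for characters is an assumption for the sketch):

```python
def click_to_image_coords(click_x, click_y, offset_x, offset_y):
    """Map a click in the second area to image coordinates, compensating for the
    offset between the second area's origin and the displayed image's origin."""
    return click_x - offset_x, click_y - offset_y

def nearest_original_char(chars, px, py):
    """Second target character = the original character whose position coordinate
    is closest to the click position; chars are (character, x, y) tuples."""
    return min(chars, key=lambda c: (c[1] - px) ** 2 + (c[2] - py) ** 2)
```

Using squared distance avoids a square root while preserving the nearest-neighbor ordering.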
In another specific implementation scenario, as in the foregoing disclosed embodiments, each second sub-text in the original text may be distinguished by the position information of the original characters it contains. For example, the original characters whose position coordinates in the image to be recognized are (x0, y0), ……, (x1, y0) belong to the second sub-text 01, and the original characters whose position coordinates are (x2, y0), ……, (x3, y0) belong to the second sub-text 02. On this basis, the second sub-text to which the second target character belongs can be determined from the position information of the second target character and the position information of the original characters contained in each second sub-text. For example, if the position information of the second target character belongs to the set of position coordinates (x0, y0), ……, (x1, y0), it may be determined that the second target character belongs to the second sub-text 01. Other cases can be deduced similarly and are not exemplified here.
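The membership test described above might be sketched with a coordinate-set index; the sub-text names and coordinate values below are illustrative, not from the embodiment:

```python
# Hypothetical index: each second sub-text keyed by the set of position
# coordinates of the original characters it contains.
SUB_TEXTS = {
    "second sub-text 01": {(0, 0), (1, 0)},   # (x0, y0) ... (x1, y0)
    "second sub-text 02": {(2, 0), (3, 0)},   # (x2, y0) ... (x3, y0)
}

def owning_sub_text(char_pos):
    """Determine which second sub-text the second target character belongs to,
    by testing its position against each sub-text's coordinate set."""
    for name, coords in SUB_TEXTS.items():
        if char_pos in coords:
            return name
    return None
```

Set membership makes the lookup O(1) per sub-text regardless of how many characters the sub-text contains.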
In one implementation scenario, taking the second target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." as an example, the identification characters obtained by performing text recognition on the second target text may be acquired: "(", "2", ")", "支", ……, "模", "。". Other cases can be deduced similarly and are not exemplified here.
Step S42: and taking the sub-text formed by the corresponding recognition characters as a first target text.
In one implementation scenario, still taking the second target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." as an example, after the corresponding identification characters are acquired, the sub-text composed of those identification characters may be taken as the first target text. Other cases can be deduced similarly and are not exemplified here.
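Assuming the recognition step yields a per-position alignment from original characters to identification characters (a hypothetical structure with illustrative values), steps S41–S42 can be sketched as:

```python
# Hypothetical character-level alignment produced by text recognition: the
# position of each original character maps to the identification character
# recognized at that position.
ALIGNMENT = {
    (0, 0): "(", (1, 0): "2", (2, 0): ")",
    (3, 0): "支", (4, 0): "模", (5, 0): "。",
}

def first_target_from_second(second_target_positions):
    """Steps S41-S42: collect the identification character corresponding to each
    original character of the second target text; their concatenation is the
    first target text."""
    return "".join(ALIGNMENT[p] for p in second_target_positions)
```

The reverse direction (first target text to second target text, fig. 3) can reuse the same alignment inverted.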
Different from the foregoing embodiment, each second sub-text includes a plurality of original characters, and by acquiring the corresponding recognition characters of each original character in the second target text in the recognition text, and using the sub-text formed by the corresponding recognition characters as the first target text, the accuracy of acquiring the first target text can be improved.
Referring to fig. 5, fig. 5 is a flowchart illustrating an embodiment of obtaining a first target text in an editing mode. In the embodiment of the disclosure, the recognition text may include a plurality of first sub-texts, and the plurality of first sub-texts are all terminated by a preset clause symbol, and specifically, reference may be made to the related description in the foregoing embodiment of the disclosure, which is not repeated herein. Specifically, embodiments of the present disclosure may include the steps of:
step S51: and under the condition that a trigger instruction of the user for the editing mode is acquired, acquiring a third click coordinate of the user in the first area, and moving an editing cursor to the third click coordinate.
In one implementation, the display interface may set an icon for triggering the edit mode, and in the event that a user click on the icon is detected, the edit mode may be triggered. In the case of triggering the editing mode, the virtual keyboard may be displayed at a preset position of the display interface, so that the user edits the recognition text by using the virtual keyboard.
In a specific implementation scenario, referring to fig. 2 in combination, an icon for triggering the editing mode may be displayed in the upper right corner position of the first area as shown in fig. 2. In addition, an icon for triggering the editing mode may be displayed at a position such as an upper left corner, a lower left corner, or a lower right corner of the first area according to actual application needs, which is not limited herein.
In another specific implementation scenario, please refer to fig. 6, which is a schematic diagram of a state of another embodiment of the text display method of the present application. As shown in fig. 6, in the case of triggering the editing mode, a virtual keyboard may be displayed at the lower side of the display interface. Alternatively, the virtual keyboard may be displayed at the upper side of the display interface according to actual application needs, which is not limited herein.
Step S52: and taking the first sub-text where the editing cursor is positioned as a first target text.
With continued reference to fig. 6, the first sub-text in which the editing cursor (not shown) is located is "It is very suitable for Chinese and English recognition.", and this first sub-text may be taken as the first target text. Other cases can be deduced similarly and are not exemplified here.
In one implementation scenario, after the first target text is acquired, a second target text corresponding to the first target text in the original text may be further acquired based on the first target text, the first target text in the identification text is displayed in a first preset highlighting format in a first area of the display interface, and the second target text corresponding to the first target text in the original text is displayed in a second preset highlighting format in a second area of the display interface. Reference may be made specifically to the foregoing descriptions of the disclosed embodiments, and details are not repeated herein.
In a specific implementation scenario, if the second target text corresponding to the first target text in the original text is not displayed in the second area, the original text displayed in the second area may be automatically scrolled, so that the second target text is displayed in the second area.
In another specific implementation scenario, in the edit mode, an edit cursor may be set to be visible in the display interface, in which case the first target text may or may not be displayed in any highlighted format.
In another implementation scenario, if, in the editing mode, the preset clause symbol in the first target text is deleted or modified into a non-clause symbol, the first target text and the first sub-text following it may be combined into a new first target text, and the second target text corresponding to the new first target text in the original text may be displayed in the second area in the second preset highlighting format. In this manner, the corresponding second target text is updated and highlighted synchronously as the first target text is edited, which helps improve the accuracy of text display.
In a specific implementation scenario, the non-clause symbols may include symbols other than the preset clause symbols. For example, in the case where the preset clause symbols include the period (i.e., "。"), the question mark (i.e., "?"), and the like, the non-clause symbols may include the comma (i.e., ","), the pause mark (i.e., "、"), and the like, which is not limited herein.
In another specific implementation scenario, please continue to refer to fig. 2, still taking the first target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." as an example. In the case where the preset clause symbol "。" at its end is deleted, or is modified into a non-clause symbol, the first target text and the first sub-text following it may be combined into a new first target text. On this basis, the second target text corresponding to the new first target text may be displayed in the second area in the second preset highlighting format. For the specific process of obtaining the second target text corresponding to the new first target text, reference may be made to the related description in the foregoing disclosed embodiments, which is not repeated herein. Other cases can be deduced similarly and are not exemplified here.
In another specific implementation scenario, after obtaining the new first target text, the new first target text may also be displayed in the first area in a first preset highlighting format; alternatively, as previously described, in the editing mode, the editing cursor may be set to be visible in the display interface, in which case the first target text may be displayed without any highlighting format.
In yet another implementation scenario, if, in the editing mode, a new preset clause symbol is inserted into the first target text, the first target text may be divided into two first sub-texts each ending with a preset clause symbol, and the first sub-text in which the editing cursor is located may be taken as the new first target text; on this basis, the second target text corresponding to the new first target text in the original text is displayed in the second area in the second preset highlighting format. In this manner, the corresponding second target text is updated and highlighted synchronously as the first target text is edited, which helps improve the accuracy of text display.
In a specific implementation scenario, the first target text may be divided into the two first sub-texts ending with a preset clause symbol, with the newly inserted preset clause symbol as the boundary. With continued reference to fig. 2, still taking the first target text "(2) supports two simultaneous interpretation modes: Chinese-to-English and English-to-Chinese." (in the original Chinese, "(2)支持中译英、英译中两种同声传译模式。") as an example, if a preset clause symbol "。" is inserted after "中译英" (Chinese-to-English), the first target text may be divided, with that preset clause symbol as the boundary, into the first sub-text "(2)支持中译英。" and the first sub-text "英译中两种同声传译模式。". Other cases can be deduced similarly and are not exemplified here.
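Both the merging behavior (when a preset clause symbol is deleted) and the dividing behavior (when one is inserted) follow from simply re-segmenting the edited identification text by the preset clause symbols. A sketch, under the assumption that "。", "?" and "!" are the preset clause symbols:

```python
PRESET_CLAUSE_SYMBOLS = "。?!"  # assumed set; the embodiment names "。" and "?"

def split_sub_texts(text):
    """Divide identification text into first sub-texts, each ending with a
    preset clause symbol; a trailing remainder becomes its own sub-text."""
    subs, start = [], 0
    for i, ch in enumerate(text):
        if ch in PRESET_CLAUSE_SYMBOLS:
            subs.append(text[start:i + 1])
            start = i + 1
    if start < len(text):
        subs.append(text[start:])
    return subs
```

Deleting the clause symbol at the end of a sub-text makes re-segmentation merge it with the following sub-text, while inserting one mid-sub-text makes re-segmentation divide it in two; the sub-text containing the editing cursor then becomes the new first target text.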
In another specific implementation scenario, after obtaining the new first target text, the new first target text may also be displayed in the first area in a first preset highlighting format; alternatively, as previously described, in the editing mode, the editing cursor may be set to be visible in the display interface, in which case the first target text may be displayed without any highlighting format.
In yet another implementation scenario, after the editing is completed, the editing mode may be exited in response to a save instruction of the user on the editing result, in which case the virtual keyboard may be retracted so that the user continues to collate the original text and recognize the text.
Different from the foregoing embodiment, in the case where a trigger instruction of the user for the editing mode is acquired, a third click coordinate of the user in the first area is acquired and the editing cursor is moved to the third click coordinate; on this basis, the first sub-text in which the editing cursor is located is taken as the first target text, so that the first target text can be updated synchronously with the movement of the editing cursor in the editing mode, which helps improve the accuracy of text display.
Referring to fig. 7, fig. 7 is a schematic diagram of a frame of an embodiment of an electronic device 70 of the present application. The electronic device 70 comprises a memory 71, a man-machine interaction circuit 72 and a processor 73, the memory 71 and the man-machine interaction circuit 72 being coupled to the processor 73, the memory 71 having stored therein program instructions, the processor 73 being adapted to execute the program instructions to implement the steps of any of the above-described embodiments of the text display method. In particular, the electronic device 70 may include, but is not limited to: a mobile phone, a tablet computer, a notebook computer, etc., are not limited herein.
In particular, the processor 73 is configured to control itself and the memory 71, the man-machine interaction circuit 72 to implement the steps of any of the above-described text display method embodiments. The processor 73 may also be referred to as a CPU (Central Processing Unit ). The processor 73 may be an integrated circuit chip with signal processing capabilities. The processor 73 may also be a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a Field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. In addition, the processor 73 may be commonly implemented by an integrated circuit chip.
In the disclosed embodiment, the processor 73 is configured to acquire an image to be identified; the image to be identified contains an original text; the processor 73 is configured to perform text recognition on the image to be recognized to obtain a recognition text corresponding to the original text; the processor 73 is configured to control the man-machine interaction circuit 72 to display the identification text in a first area of the display interface and display a first target text in the identification text in a first preset highlighting format; and the processor 73 is configured to control the man-machine interaction circuit 72 to display the original text in a second area of the display interface and display a second target text corresponding to the first target text in the original text in a second preset highlighting format.
According to the above scheme, an image to be recognized containing an original text is acquired, and text recognition is performed on it to obtain an identification text corresponding to the original text. The identification text is displayed in the first area of the display interface, with the first target text in the identification text displayed in a first preset highlighting format; the original text is displayed in the second area of the display interface, with the second target text corresponding to the first target text displayed in a second preset highlighting format. The original text and the identification text can thus be displayed simultaneously in different areas of the display interface, and the associated target texts in both can be highlighted simultaneously in those areas. In the text proofreading process, the user only needs to pay attention to the highlighted target texts, which helps the user proofread the identification text quickly and improves proofreading efficiency.
In some disclosed embodiments, the recognition text includes a plurality of first sub-texts, and each of the plurality of first sub-texts ends with a preset clause symbol, the first target text is selected from the plurality of first sub-texts by the user, and the second target text is obtained based on the first target text.
In contrast to the foregoing embodiment, the identification text includes a plurality of first sub-texts each ending with a preset clause symbol; the first target text is selected by the user from the plurality of first sub-texts, and the second target text is obtained based on the first target text. That is, in the text display process, the user may select the first target text from the plurality of first sub-texts contained in the identification text, and on this basis the first target text and its corresponding second target text are highlighted in different areas of the display interface, which helps improve the user experience.
In some disclosed embodiments, each first sub-text contains a number of recognition characters, and the processor 73 is configured to obtain position information of the original characters in the original text that correspond to the recognition characters in the first target text, where the position information includes position coordinates of the original characters in the image to be recognized and is obtained by performing text recognition on the image to be recognized. The processor 73 is further configured to obtain the second target text based on the position information of the corresponding original characters.
Unlike the foregoing embodiment, each first sub-text includes a number of recognition characters. On this basis, position information of the original characters in the original text that correspond to the recognition characters in the first target text is obtained, where the position information includes position coordinates of the original characters in the image to be recognized and is obtained by performing text recognition on the image to be recognized. The second target text is then obtained based on the position information of the corresponding original characters; that is, the second target text is determined from the positions of the original characters corresponding to the recognition characters in the first target text, which helps improve the accuracy of obtaining the second target text.
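A minimal sketch of using per-character position coordinates to locate the second target text, assuming the recognition step yields a bounding box in image coordinates for each recognized character (the `RecChar` structure and the `(x0, y0, x1, y1)` box format are illustrative assumptions, not part of the patent):

```python
from dataclasses import dataclass

@dataclass
class RecChar:
    char: str    # the recognized character
    box: tuple   # (x0, y0, x1, y1): coordinates of the corresponding
                 # original character in the image to be recognized

def original_region(chars):
    """Union of the position coordinates of the original characters that
    correspond to the recognized characters in the first target text;
    this region locates the second target text in the original image."""
    xs0, ys0, xs1, ys1 = zip(*(c.box for c in chars))
    return (min(xs0), min(ys0), max(xs1), max(ys1))
```

The returned rectangle can then be intersected with the original text layout to pick out the second target text to be highlighted in the second area.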
In some disclosed embodiments, the processor 73 is configured to obtain a first click coordinate of the user in the first area, and take an identification character where the first click coordinate is located as a first target character; the processor 73 is configured to take, as the first target text, a first sub-text to which the first target character belongs.
Unlike the foregoing embodiment, by obtaining the first click coordinate of the user in the first area, taking the recognition character at the first click coordinate as the first target character, and then taking the first sub-text to which the first target character belongs as the first target text, the first target text at the user's click position can be obtained with a single click in the first area, which helps improve the convenience of obtaining the first target text.
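The click-to-sub-text step can be sketched as a two-stage lookup: hit-test the click coordinate against the on-screen character boxes, then map the hit character's index back to its sub-text. The data shapes below (a list of `(char, box)` pairs, sub-texts that concatenate to the full displayed text) are assumptions for illustration.

```python
def char_at_click(chars_with_boxes, click_xy):
    """Return the index of the character whose on-screen box contains the
    click coordinate, or None if the click hits no character."""
    x, y = click_xy
    for i, (_ch, (x0, y0, x1, y1)) in enumerate(chars_with_boxes):
        if x0 <= x <= x1 and y0 <= y <= y1:
            return i
    return None

def sub_text_of_char(sub_texts, char_index):
    """Return the sub-text to which the character at char_index belongs,
    assuming the sub-texts concatenate to the full displayed text."""
    offset = 0
    for st in sub_texts:
        if char_index < offset + len(st):
            return st
        offset += len(st)
    return None
```

The same pattern applies symmetrically to the second area: hit-test the second click coordinate against the original characters, then take the containing second sub-text.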
In some disclosed embodiments, the original text includes a plurality of second sub-texts, and the plurality of second sub-texts each end with a preset clause symbol, the second target text is selected from the plurality of second sub-texts by the user, and the first target text is acquired based on the second target text.
Unlike the foregoing embodiment, the original text includes a plurality of second sub-texts, each ending with a preset clause symbol; the second target text is selected by the user from among these second sub-texts, and the first target text is obtained based on the second target text. That is, during text display the user may select the second target text from the second sub-texts contained in the original text, and on this basis the second target text and its corresponding first target text are highlighted in different areas of the display interface, which helps improve the user experience.
In some disclosed embodiments, each of the second sub-texts includes a plurality of original characters, and the processor 73 is configured to obtain, in the recognition text, the recognition characters respectively corresponding to the original characters in the second target text; the processor 73 is configured to take the sub-text composed of the corresponding recognition characters as the first target text.
Unlike the foregoing embodiment, each second sub-text includes a plurality of original characters. By obtaining the recognition characters in the recognition text that correspond to the original characters in the second target text, and taking the sub-text composed of the corresponding recognition characters as the first target text, the accuracy of obtaining the first target text can be improved.
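The reverse lookup (original characters → recognized characters) can be sketched with a per-character index map produced at recognition time; the `orig_to_rec` mapping is an assumption of this sketch, and original characters the recognizer missed simply have no entry in it.

```python
def first_target_from_second(second_target_indices, orig_to_rec, rec_text):
    """Assemble the first target text from the recognized characters that
    correspond to the given original-character indices of the second
    target text. orig_to_rec maps original-character index -> index in
    rec_text (an illustrative assumption about the recognizer's output)."""
    rec_indices = sorted(orig_to_rec[i] for i in second_target_indices
                         if i in orig_to_rec)
    return "".join(rec_text[j] for j in rec_indices)
```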
In some disclosed embodiments, the processor 73 is configured to obtain a second click coordinate of the user in the second area, and take an original character where the second click coordinate is located as a second target character; the processor 73 is configured to use the second sub-text to which the second target character belongs as the second target text.
Unlike the foregoing embodiment, by obtaining the second click coordinate of the user in the second area, taking the original character at the second click coordinate as the second target character, and taking the second sub-text to which the second target character belongs as the second target text, the second target text can be obtained with a single click in the second area, which helps improve the convenience of obtaining the second target text.
In some disclosed embodiments, the recognition text includes a plurality of first sub-texts, and the plurality of first sub-texts are all terminated by a preset clause symbol, and the processor 73 is configured to obtain a third click coordinate of the user in the first area and move the editing cursor to the third click coordinate if a trigger instruction of the user on the editing mode is obtained; the processor 73 is configured to take the first sub-text where the editing cursor is located as the first target text.
Unlike the foregoing embodiment, when a trigger instruction of the user for the editing mode is obtained, the third click coordinate of the user in the first area is obtained and the editing cursor is moved to the third click coordinate; on this basis, the first sub-text where the editing cursor is located is taken as the first target text, so that the first target text can be updated synchronously as the editing cursor moves in the editing mode, which helps improve the accuracy of text display.
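Finding the sub-text that contains the editing cursor can be sketched as a scan over the clause symbols; the symbol set and the half-open cursor convention (a cursor on or before a clause symbol belongs to that sub-text) are assumptions of this sketch.

```python
def sub_text_at_cursor(text, cursor, clause_symbols=".!?;"):
    """Return (start, end) of the sub-text containing the editing cursor,
    where each sub-text ends with a clause symbol; the trailing fragment
    (if any) counts as the last sub-text."""
    start = 0
    for i, ch in enumerate(text):
        if ch in clause_symbols:
            if cursor <= i:
                return (start, i + 1)
            start = i + 1
    return (start, len(text))
```

Re-running this lookup whenever the cursor moves keeps the highlighted first target text in sync with the editing position.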
In some disclosed embodiments, the processor 73 is configured to combine the first target text and the first sub-text after the first target text into a new first target text if the preset clause symbol in the first target text is deleted or the preset clause symbol is modified to a non-clause symbol; and the processor 73 is configured to control the man-machine interaction circuit 72 to display, in the second area, a second target text corresponding to the new first target text in the original text in a second preset highlighting format.
Unlike the foregoing embodiment, when the preset clause symbol in the first target text is deleted or modified into a non-clause symbol, the first target text and the first sub-text following it are combined into a new first target text, and the second target text corresponding to the new first target text in the original text is displayed in the second area in the second preset highlighting format. The corresponding second target text can thus be updated and highlighted synchronously as the first target text is edited, which helps improve the accuracy of text display.
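The merge step can be sketched as a list operation on the sub-texts; this sketch assumes `sub_texts[k]` already reflects the edit (its trailing clause symbol has been deleted or replaced), so merging it with its successor restores the invariant that every sub-text except possibly the last ends with a clause symbol.

```python
def merge_after_delete(sub_texts, k):
    """After the clause symbol ending sub_texts[k] is deleted or changed
    to a non-clause character, merge sub_texts[k] with the following
    sub-text into one new first target text. Returns the updated list
    and the new first target text."""
    if k + 1 < len(sub_texts):
        merged = sub_texts[k] + sub_texts[k + 1]
        return sub_texts[:k] + [merged] + sub_texts[k + 2:], merged
    return sub_texts, sub_texts[k]  # last sub-text: nothing to merge with
```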
In some disclosed embodiments, the processor 73 is configured to divide the first target text into two first sub-texts ending with the preset clause symbol and take the first sub-text where the editing cursor is located as the new first target text in the case that the new preset clause symbol is inserted into the first target text; and the processor 73 is configured to control the man-machine interaction circuit 72 to display, in the second area, a second target text corresponding to the new first target text in the original text in a second preset highlighting format.
Unlike the foregoing embodiment, when a new preset clause symbol is inserted into the first target text, the first target text is divided into two first sub-texts each ending with a preset clause symbol, and the first sub-text where the editing cursor is located is taken as the new first target text; the second target text corresponding to the new first target text in the original text is then displayed in the second area in the second preset highlighting format. The corresponding second target text can thus be updated and highlighted synchronously as the first target text is edited, which helps improve the accuracy of text display.
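The split step is the inverse of the merge: the string is cut just after the newly typed symbol, and the half containing the editing cursor becomes the new first target text. Which half the cursor lands in depends on the editor; the convention below (cursor positions at or before the symbol fall in the left half) is an assumption of this sketch.

```python
def split_on_new_symbol(target, symbol_pos, cursor):
    """A new clause symbol has been typed at index symbol_pos of the first
    target text; split it into two sub-texts and take the one containing
    the editing cursor as the new first target text."""
    left = target[:symbol_pos + 1]   # ends with the newly typed symbol
    right = target[symbol_pos + 1:]  # ends with the original symbol
    new_target = left if cursor <= symbol_pos else right
    return left, right, new_target
```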
Referring to fig. 8, fig. 8 is a schematic diagram illustrating the framework of an embodiment of a storage device 80 of the present application. The storage device 80 stores program instructions 81 executable by a processor, and the program instructions 81 are used to implement the steps of any of the text display method embodiments described above.
According to the above scheme, the related target texts in the original text and the recognition text can be highlighted in different areas of the display interface, so that during text correction the user only needs to focus on the highlighted target texts, which helps the user correct the recognition text quickly and improves text correction efficiency.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The foregoing description of the various embodiments focuses on the differences between the embodiments; for parts that are the same or similar, the embodiments may be referred to each other, and these parts are not repeated herein for brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If the integrated units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (12)

1. A text display method, comprising:
acquiring an image to be identified; the image to be identified contains an original text;
performing text recognition on the image to be recognized to obtain the original text and a recognition text corresponding to the original text;
displaying the identification text in a first area of a display interface and displaying a first target text in the identification text in a first preset highlighting format; and
displaying the original text in a second area of a display interface and displaying a second target text corresponding to the first target text in the original text in a second preset highlighting format;
the display interface further comprises a third area for displaying thumbnails of the images to be identified, and the method further comprises: acquiring click coordinates of a user in the third area, determining the image to be recognized expected to be displayed by the user based on the click coordinates, and displaying the original text of the image to be recognized in the second area;
the display interface further comprises a fourth area, a plurality of function keys are displayed in the fourth area, and the method further comprises: transmitting the original text and/or the identification text to a printing device in response to a user triggering a print key; and responding to the user triggering an export key, and exporting the identification text based on a preset format.
2. The method of claim 1, wherein the identified text comprises a number of first sub-texts, and wherein each of the number of first sub-texts ends with a preset clause symbol, wherein the first target text is selected from the number of first sub-texts by a user, and wherein the second target text is obtained based on the first target text.
3. The method of claim 2, wherein each of the first sub-texts comprises a number of recognition characters; the step of obtaining the second target text comprises the following steps:
acquiring position information of an original character corresponding to the identification character in the first target text in the original text; the position information comprises position coordinates of the original characters in the image to be recognized, and is obtained by performing text recognition on the image to be recognized;
and obtaining the second target text based on the position information of the corresponding original character.
4. The method of claim 3, wherein the step of obtaining the first target text comprises:
acquiring a first click coordinate of a user in the first area, and taking an identification character where the first click coordinate is positioned as a first target character;
and taking the first sub-text to which the first target character belongs as the first target text.
5. The method of claim 1, wherein the original text comprises a number of second sub-texts, and the number of second sub-texts each end with a preset clause symbol, the second target text being selected by a user among the number of second sub-texts, the first target text being obtained based on the second target text.
6. The method of claim 5, wherein each of the second sub-texts comprises a plurality of original characters; the step of obtaining the first target text comprises the following steps:
acquiring identification characters corresponding to the original characters in the identification text respectively in the second target text;
and taking the sub-text formed by the corresponding recognition characters as the first target text.
7. The method of claim 6, wherein the step of obtaining the second target text comprises:
acquiring a second click coordinate of a user in the second area, and taking an original character where the second click coordinate is located as a second target character;
and taking the second sub-text to which the second target character belongs as the second target text.
8. The method of claim 1, wherein the identified text comprises a number of first sub-texts, and wherein the number of first sub-texts each end with a preset clause symbol; the method further comprises the steps of:
under the condition that a trigger instruction of a user for an editing mode is acquired, acquiring a third click coordinate of the user in the first area, and moving an editing cursor to the third click coordinate;
and taking the first sub-text where the editing cursor is positioned as the first target text.
9. The method of claim 8, wherein the method further comprises:
under the condition that the preset clause symbol in the first target text is deleted or the preset clause symbol is modified into a non-clause symbol, combining the first target text and one first sub-text after the first target text into a new first target text; and
and displaying a second target text corresponding to the new first target text in the original text in the second area in the second preset highlighting format.
10. The method of claim 8, wherein the method further comprises:
under the condition that a new preset clause symbol is inserted into the first target text, dividing the first target text into two first sub-texts ending with the preset clause symbol, and taking the first sub-text where the editing cursor is located as a new first target text; and
and displaying a second target text corresponding to the new first target text in the original text in the second area in the second preset highlighting format.
11. An electronic device comprising a memory, man-machine interaction circuitry, and a processor, the memory and man-machine interaction circuitry coupled to the processor, the memory having program instructions stored therein, the processor for executing the program instructions to implement the text display method of any of claims 1-10.
12. A storage device storing program instructions executable by a processor for implementing the text display method of any one of claims 1 to 10.
CN202110277573.9A 2021-03-15 2021-03-15 Text display method, electronic equipment and storage device Active CN113157194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110277573.9A CN113157194B (en) 2021-03-15 2021-03-15 Text display method, electronic equipment and storage device


Publications (2)

Publication Number Publication Date
CN113157194A CN113157194A (en) 2021-07-23
CN113157194B true CN113157194B (en) 2023-08-08

Family

ID=76887170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110277573.9A Active CN113157194B (en) 2021-03-15 2021-03-15 Text display method, electronic equipment and storage device

Country Status (1)

Country Link
CN (1) CN113157194B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2434433A2 (en) * 2007-03-22 2012-03-28 Sony Ericsson Mobile Communications AB Translations and display of text in picture
CN106527945A (en) * 2016-11-09 2017-03-22 广东小天才科技有限公司 Text information extracting method and device
CN107797754A (en) * 2015-03-03 2018-03-13 广东欧珀移动通信有限公司 Text copying method and device, and medium product
CN109272440A (en) * 2018-08-14 2019-01-25 阿基米德(上海)传媒有限公司 Thumbnail generation method and system combining text and picture content
CN110674814A (en) * 2019-09-25 2020-01-10 深圳传音控股股份有限公司 Picture identification and translation method, terminal and medium
CN111338540A (en) * 2020-02-11 2020-06-26 Oppo广东移动通信有限公司 Picture text processing method and device, electronic equipment and storage medium
CN111753556A (en) * 2020-06-24 2020-10-09 掌阅科技股份有限公司 Bilingual comparison reading method, terminal and computer storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8787670B2 (en) * 2011-08-15 2014-07-22 Victor John Cowley Software for text and image edit recognition for editing of images that contain text


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Samsa: a speech analysis, mining and summary application for outbound telephone calls; J.W. Cooper; Proceedings of the 34th Annual Hawaii International Conference on System Sciences; full text *


Similar Documents

Publication Publication Date Title
US10778928B2 (en) Device and method for inputting note information into image of photographed object
US20120131520A1 (en) Gesture-based Text Identification and Selection in Images
US10353997B1 (en) Freeform annotation transcription
KR100359961B1 (en) Handwriting information processing system with character segmentation user interface
US7715630B2 (en) Interfacing with ink
US9251428B2 (en) Entering information through an OCR-enabled viewfinder
US20030214531A1 (en) Ink input mechanisms
CN105631393A (en) Information recognition method and device
CN109685870B (en) Information labeling method and device, labeling equipment and storage medium
WO2019000681A1 (en) Information layout method, device, apparatus and computer storage medium
CN104636322A (en) Text copying method and text copying device
JP2022536320A (en) Object identification method and device, electronic device and storage medium
US7280693B2 (en) Document information input apparatus, document information input method, document information input program and recording medium
CN104951749A (en) Image content recognition device and image content recognition method
CN111027533B (en) Click-to-read coordinate transformation method, system, terminal equipment and storage medium
CN113157194B (en) Text display method, electronic equipment and storage device
WO2023051384A1 (en) Display method, information sending method, and electronic device
CN110674825A (en) Character recognition method, device and system applied to intelligent voice mouse and storage medium
CN111582281B (en) Picture display optimization method and device, electronic equipment and storage medium
Vats et al. On-the-fly historical handwritten text annotation
CN103699890B (en) Scanning input device and scanning input method
CN111062377A (en) Question number detection method, system, storage medium and electronic equipment
CN111753767B (en) Method and device for automatically correcting operation, electronic equipment and storage medium
CN115994521A (en) Document editing method, presentation method, and document paragraph identification method and device
CN116909414A (en) Handwriting key recognition method and device, terminal, handwriting suite and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant