CN112288835A

CN112288835A - Image text extraction method and device and electronic equipment

Info

Publication number: CN112288835A
Application number: CN202011179137.XA
Authority: CN
Inventors: 任静
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2020-10-29
Filing date: 2020-10-29
Publication date: 2021-01-29

Abstract

The application discloses an image text extraction method and device and electronic equipment, and belongs to the technical field of communication. The method comprises the following steps: receiving a dragging operation of a user on a target edit box on a first image; in response to the dragging operation, determining a text to be edited in the first image according to a dragging parameter of the dragging operation; editing the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image; and extracting image text in the second image. The application can reduce the operation steps of the user and improve the efficiency of extracting the image text.

Description

Image text extraction method and device and electronic equipment

Technical Field

The application belongs to the technical field of communication, and particularly relates to an image text extraction method and device and electronic equipment.

Background

With the rapid development of artificial intelligence technology, image text extraction has been widely applied to intelligent terminal devices. When a user uses an intelligent image recognition tool to extract text from an image and some of the extracted characters need to be added, replaced or deleted, the user usually has to perform the text extraction first, then add, delete or modify the extracted content, and can output the finally required text only after this secondary integration. During editing, the text has to be edited with reference to the position information of the text content in the image before the final valid text content is obtained.

Such a text extraction approach is cumbersome, and the text extraction efficiency is low.

Disclosure of Invention

The embodiments of the present application aim to provide an image text extraction method, an image text extraction device and electronic equipment, which can solve the problems in the prior art that the text extraction approach is cumbersome and the text extraction efficiency is low.

In order to solve the technical problem, the present application is implemented as follows:

in a first aspect, an embodiment of the present application provides an image text extraction method, where the method includes:

receiving a dragging operation of a user on a target edit box on a first image;

in response to the dragging operation, determining a text to be edited in the first image according to a dragging parameter of the dragging operation;

editing the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image;

and extracting image text in the second image.

In a second aspect, an embodiment of the present application provides an image text extraction apparatus, including:

the dragging operation receiving module is used for receiving dragging operation of a target edit box on the first image by a user;

the edited text determining module is used for responding to the dragging operation and determining a text to be edited in the first image according to the dragging parameter of the dragging operation;

the second image generation module is used for editing the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image;

and the image text extraction module is used for extracting the image text in the second image.

In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the image text extraction method according to the first aspect.

In a fourth aspect, the present application provides a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the image text extraction method according to the first aspect.

In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the image text extraction method according to the first aspect.

In the embodiment of the application, a dragging operation of a user on a target edit box on a first image is received; in response to the dragging operation, a text to be edited in the first image is determined according to a dragging parameter of the dragging operation; the text to be edited in the first image is edited according to the editing type corresponding to the target edit box to generate a second image; and the image text in the second image is extracted. Because the text in the image is edited by the user dragging an edit box on the image, and the image text can be extracted once the editing is finished, the user does not need to extract the image text first and then edit it while checking it against the text positions. The operation steps of the user can therefore be reduced, and the efficiency of extracting the image text is improved.

Drawings

Fig. 1 is a flowchart illustrating steps of an image text extraction method according to an embodiment of the present application;

Fig. 2 is a schematic diagram of an image displayed in a graphic interface according to an embodiment of the present application;

Fig. 3 is a schematic diagram of adding text according to an embodiment of the present application;

Fig. 4 is a schematic diagram of replacing text according to an embodiment of the present application;

Fig. 5 is a schematic diagram of deleting text according to an embodiment of the present application;

Fig. 6 is a schematic diagram of a fused image according to an embodiment of the present application;

Fig. 7 is a schematic diagram of an image with annotation text according to an embodiment of the present application;

Fig. 8 is a schematic diagram of an image without annotation text according to an embodiment of the present application;

Fig. 9 is a schematic structural diagram of an image text extraction apparatus according to an embodiment of the present application;

Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application;

Fig. 11 is a schematic structural diagram of another electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first", "second" and the like in the description and claims of the present application are used to distinguish between similar objects and not necessarily to describe a particular sequence or chronological order. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second" and the like are generally of one type, and the number of objects is not limited; for example, there may be one or more first objects. In addition, "and/or" in the description and claims means at least one of the connected objects, and the character "/" generally indicates that the preceding and following objects are in an "or" relationship.

The image text extraction scheme provided by the embodiment of the present application is described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.

Referring to fig. 1, a flowchart illustrating steps of an image text extraction method provided in an embodiment of the present application is shown, and as shown in fig. 1, the image text extraction method may specifically include the following steps:

Step 101: receiving a dragging operation of a user on a target edit box on a first image.

The embodiment of the application can be applied to a scenario in which the text to be edited in an image is determined according to where the user drags an edit box on the image, and that text is then edited according to the type of the edit box.

The first image refers to an image that needs to be subjected to text editing and text extraction.

In some examples, the first image may be an image containing text that is selected from an album of the electronic device. For example, if the images containing text in the album of a mobile phone are image 1, image 2 and image 3, the user may select image 1 as the first image, or may select image 1 and image 3 as first images.

It can be understood that the above examples are only examples listed for better understanding of the technical solution of the embodiment of the present application, and in a specific implementation, the first image may also be acquired in other ways, and in particular, may be determined according to business requirements, and the present embodiment does not limit this.

When the characters in the first image need to be extracted, the user may input the first image in the recognition interface. As shown in fig. 2, after the user inputs the first image, the text in the first image, the position of the text in the image, and the like may be displayed in the recognition interface.

The target edit box refers to an edit box selected by the user to be dragged on the first image, and it is understood that each edit box corresponds to an edit type, such as a text deletion type, a text replacement type, a text addition type, and the like.

In this embodiment, a circle-selection tool may be displayed on the image editing page, and the tool may include a plurality of selection boxes, each corresponding to one editing type; that is, each selection box is regarded as one edit box.
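As a minimal sketch of this one-box-one-type correspondence, the following Python fragment models each edit box as carrying exactly one editing type. The names `EditType`, `EditBox`, `SELECTION_TOOL` and `box_for` are hypothetical and do not appear in the application. The dashed style and the solid style with an X follow the examples given for fig. 4 and fig. 5; the style of the addition box is not specified in the text and is only a placeholder.

```python
from dataclasses import dataclass
from enum import Enum


class EditType(Enum):
    """Editing types named in the description."""
    ADD = "add"
    REPLACE = "replace"
    DELETE = "delete"


@dataclass(frozen=True)
class EditBox:
    """One box of the selection tool; each box carries exactly one editing type."""
    edit_type: EditType
    style: str  # hypothetical label describing how the box is drawn


# Hypothetical tool contents: one box per editing type.
SELECTION_TOOL = [
    EditBox(EditType.ADD, "unspecified"),        # style not given in the text
    EditBox(EditType.REPLACE, "dashed box"),     # per the fig. 4 example
    EditBox(EditType.DELETE, "solid box with X"),  # per the fig. 5 example
]


def box_for(edit_type: EditType) -> EditBox:
    """Return the box of the tool whose editing type matches."""
    return next(box for box in SELECTION_TOOL if box.edit_type is edit_type)
```

Keeping the editing type on the box itself means that later steps only need the dragged box to know which edit to perform.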

After the first image is acquired, when the user needs to edit a text in the first image, the user may select a target edit box and drag it on the first image; at this time, the electronic device receives the dragging operation of the user on the target edit box on the first image.

After receiving a drag operation of the target edit box on the first image by the user, executing step 102.

Step 102: in response to the dragging operation, determining the text to be edited in the first image according to the dragging parameter of the dragging operation.

The text to be edited refers to a text in the first image that needs to be edited. For example, the first image includes a text 1, a text 2, and a text 3; when the text 1 needs to be edited, the text 1 is used as the text to be edited in the first image, and when the text 2 and the text 3 need to be edited, the text 2 and the text 3 are used as the texts to be edited in the first image.

The dragging parameter refers to a parameter corresponding to the dragging operation. In this example, the dragging parameter may be a position parameter of the dragging operation: when the user drags the target edit box to a certain position on the first image, stops dragging, and lifts the finger from the target edit box, the position of the target edit box on the first image at that moment may be regarded as the dragging parameter corresponding to the dragging operation.
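The position parameter described above can be sketched as follows. This is only an illustration under stated assumptions: touch input is modeled as a list of hypothetical `TouchEvent` records with kinds "down", "move" and "up", and `drag_parameter` is a hypothetical helper name; the application does not prescribe any particular event model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TouchEvent:
    """Hypothetical touch record: kind is "down", "move" or "up"."""
    kind: str
    x: int
    y: int


def drag_parameter(events: List[TouchEvent]) -> Optional[Tuple[int, int]]:
    """Return the position at which the finger left the edit box.

    The drag is treated as finished only when the last event is "up";
    otherwise no dragging parameter is available yet and None is returned.
    """
    if events and events[-1].kind == "up":
        return (events[-1].x, events[-1].y)
    return None
```

For example, a drag that goes down at (5, 5), moves, and is released at (42, 61) yields the position (42, 61) as the dragging parameter.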

After the dragging operation of the user on the target edit box on the first image is received, the text to be edited in the first image is determined, in response to the dragging operation, according to the dragging parameter of the dragging operation.
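One possible way to map the release position to a text is a simple hit test against the bounding boxes of the recognized text. The sketch below assumes that text recognition has already produced one axis-aligned rectangle per text; `TextRegion` and `text_at_drop_position` are hypothetical names, and the application does not state how the mapping is actually performed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TextRegion:
    """A recognized text and its bounding box in image pixels (assumed input)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        """True if the point lies inside or on the border of the box."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def text_at_drop_position(regions: List[TextRegion],
                          drop: Tuple[int, int]) -> Optional[TextRegion]:
    """Return the first region containing the drop point, or None if none does."""
    x, y = drop
    for region in regions:
        if region.contains(x, y):
            return region
    return None
```

A drop that lands between two text lines returns None in this sketch; a real implementation might instead pick the nearest region or ask the user to retry.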

After the text to be edited in the first image is determined, step 103 is performed.

Step 103: editing the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image.

The edit type refers to a type of text editing corresponding to the target edit box, and in this embodiment, the edit type may include at least one of a text deletion type, a text replacement type, a text addition type, and other edit types.

The second image is an image obtained after the text to be edited in the first image is edited.

After the text to be edited in the first image is determined, the text to be edited in the first image can be edited according to the editing type corresponding to the target edit box to generate a second image. Specifically, the scheme of editing the text to be edited in the first image according to the editing type to generate the second image is described in detail with reference to the following specific implementations.

In a specific implementation manner of the present application, the step 103 may include:

Substep A1: in the case that the editing type is a text addition type, acquiring a first text edited by the user, adding the first text at the position of the text to be edited in the first image, and generating a second image with the text added.

In this embodiment, the first text refers to a text that is input by the user and needs to be added in the first image.

When the editing type is a text adding type, a first text edited by a user can be acquired, and the first text is added to a position where a text to be edited in the first image is located, so that a second image to which the text is added can be generated.

It should be understood that the above examples are only examples for better understanding of the technical solutions of the embodiments of the present application, and are not to be taken as the only limitation to the embodiments.

In another specific implementation manner of the present application, the step 103 may further include:

Substep A2: in the case that the editing type is a text replacement type, acquiring a second text edited by the user, replacing the text to be edited in the first image with the second text, and generating a second image after text replacement.

In this embodiment, the second text refers to a text that is input by the user and is to replace the text to be edited in the first image.

For example, as shown in fig. 4, after the user selects the button of the replacement selection box (that is, the edit box of the text replacement type), the selection box is displayed as a dashed box. After the text to be edited is selected with the box, a confirmation popup window may be displayed in which the user inputs the replacement text, that is, the second text. The text to be edited in the first image may then be replaced with the second text, and the second image after text replacement is generated.

Substep A3: in the case that the editing type is a text deletion type, deleting the text to be edited in the first image, and generating a second image after text deletion.

In this embodiment, in the case that the editing type of the target edit box is the text deletion type, the text to be edited in the first image may be deleted to generate the second image after text deletion. For example, as shown in fig. 5, after the user selects the deletion selection box (that is, the edit box of the text deletion type), the selection box is displayed as a solid-line box with an X. After the text to be edited is selected with the box, a popup window directly displays the specific text content to be deleted (that is, the text to be edited) so that the user can choose whether to delete it; if the user confirms the deletion, the second image after text deletion is obtained.

It should be understood that the above examples are only for better understanding of the technical solutions of the embodiments of the present application and are not to be taken as the only limitation of the present embodiment.

In the embodiment of the application, the image text can be edited according to the type of the edit box, which simplifies the user's editing of the image text and improves the user experience.
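The three substeps can be sketched as a single dispatch on the editing type. The sketch below makes a strong simplifying assumption: an image is modeled only as the list of its recognized text lines, so generating the second image reduces to producing a new list, whereas a real implementation would also re-render pixels. The function name `apply_edit` and the choice to place added text directly after the selected text are likewise assumptions, since the description only says the text is added at the position of the text to be edited.

```python
from typing import List, Optional


def apply_edit(lines: List[str], index: int, edit_type: str,
               user_text: Optional[str] = None) -> List[str]:
    """Return the text lines of the second image after one edit.

    lines     -- recognized text lines of the first image (simplified model)
    index     -- position of the text to be edited within lines
    edit_type -- "add", "replace" or "delete"
    user_text -- the first text (add) or the second text (replace)
    """
    result = list(lines)  # copy, so the first image's lines stay untouched
    if edit_type in ("add", "replace") and user_text is None:
        raise ValueError(f"edit type {edit_type!r} needs user_text")
    if edit_type == "add":
        # Assumption: the added text goes directly after the selected text.
        result.insert(index + 1, user_text)
    elif edit_type == "replace":
        result[index] = user_text
    elif edit_type == "delete":
        del result[index]
    else:
        raise ValueError(f"unknown edit type: {edit_type!r}")
    return result
```

Returning a new list rather than mutating the input mirrors the distinction the description draws between the first image and the generated second image.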

After the text to be edited in the first image is edited according to the editing type corresponding to the target edit box to generate the second image, step 104 is executed.

Step 104: extracting image text in the second image.

After the second image is generated, image text in the second image may be extracted.

In the embodiment of the application, the text in the image can be extracted once the editing is finished, and the user does not need to edit the extracted text with reference to the text positions after extraction, so that the operation steps of the user can be reduced and the efficiency of extracting the image text is improved.

In this embodiment, the generated second image may be an image including the modification mark, or may be an image not including the modification mark.

When the second image is an image containing a modification mark, an image text containing the modification mark in the second image may be extracted, as shown in fig. 7, and the extracted image text contains modification marks such as "add", "delete", and the like.

When the second image is an image not containing the modification mark, the image text not containing the modification mark in the second image may be extracted, as shown in fig. 8.

By outputting the image text with the modification marks, the embodiment of the application enables the user to clearly identify the edited image text, which improves the user experience.
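The two output variants can be illustrated with a small sketch. It assumes that the second image is represented as a list of (text, mark) pairs, where the mark is None for unchanged text and "add" or "delete" for edited text; the bracketed output format and the name `extract_text` are hypothetical, since the actual appearance of the marks is shown only in the figures.

```python
from typing import List, Optional, Tuple

# (text, mark): mark is None for unchanged text, or "add" / "delete".
Segment = Tuple[str, Optional[str]]


def extract_text(segments: List[Segment], with_marks: bool) -> str:
    """Join the segments into extracted text, with or without modification marks."""
    out = []
    for text, mark in segments:
        if with_marks:
            # Keep every segment and prefix edited ones with their mark.
            out.append(f"[{mark}] {text}" if mark else text)
        elif mark != "delete":
            # Without marks, deleted text is simply left out.
            out.append(text)
    return "\n".join(out)
```

With marks enabled, deleted text stays visible alongside its mark so the user can review what changed; with marks disabled, only the final text is returned.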

In this embodiment, the association between the text to be edited and the target edit box may also be stored. When the text of another image is subsequently processed and that image contains the same text as the text to be edited, the editing operation of the editing type corresponding to the target edit box may be performed on that text, so that the image text in the other image is edited automatically. This is described in detail with reference to the following specific implementation.

In another specific implementation manner of the present application, after the step 102, the method may further include:

Step B1: storing the correspondence between the text to be edited and the target edit box.

In the embodiment of the application, after the text to be edited has been edited according to the editing type corresponding to the target edit box, the correspondence between the text to be edited and the target edit box can be saved.

After saving the correspondence between the text to be edited and the target edit box, step B2 is executed.

Step B2: when the text in the third image needs to be extracted, determining whether the third image contains the third text which is the same as the text to be edited.

The third image is an image which needs to be subjected to image text extraction.

The third text refers to the same text as the text to be edited included in the third image. For example, when the text to be edited is "aaabbbbdddsss" and the text of "aaabbbbdddsss" is included in the third image, "aaabbbbdddsss" in the third image is regarded as the third text.

When the image text in the third image needs to be extracted, whether the third image contains the same third text as the text to be edited or not can be determined.

After determining that the third image contains the same third text as the text to be edited, step B3 is executed.

Step B3: in the case that the third text exists, executing an editing operation of the editing type corresponding to the target edit box on the third text.

When the third text is included in the third image, an editing operation of an editing type corresponding to the target edit box may be performed on the third text.

In the embodiment of the application, the correspondence between the text to be edited and the target edit box is stored in advance. When text is subsequently extracted from other images and those images contain the same text as the text to be edited, that text is edited directly without the user editing manually again, which reduces the operation steps of the user and improves the user experience.
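Steps B1 to B3 can be sketched as a lookup table that is consulted for each text of the third image. The sketch assumes exact string matching and again models an image as a list of text lines. It also assumes that the user's text is stored together with the editing type, because replacement and addition cannot be repeated without it; the description itself only mentions storing the correspondence with the target edit box. The names `Rule` and `apply_saved_edits` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Saved correspondence (step B1):
# text to be edited -> (editing type, user text or None).
Rule = Tuple[str, Optional[str]]


def apply_saved_edits(lines: List[str], rules: Dict[str, Rule]) -> List[str]:
    """Apply saved edits to every line of the third image that matches exactly."""
    result: List[str] = []
    for line in lines:
        rule = rules.get(line)  # step B2: is this line a "third text"?
        if rule is None:
            result.append(line)  # no match, keep the line unchanged
            continue
        edit_type, user_text = rule  # step B3: repeat the saved edit
        if edit_type == "delete":
            continue
        if user_text is None:
            raise ValueError(f"edit type {edit_type!r} needs stored user text")
        if edit_type == "replace":
            result.append(user_text)
        elif edit_type == "add":
            result.extend([line, user_text])
        else:
            raise ValueError(f"unknown edit type: {edit_type!r}")
    return result
```

For instance, if "aaabbbbdddsss" was previously deleted with the deletion box, a later image containing the same text has it removed without further user input.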

In the image text extraction method provided by the embodiment of the application, a dragging operation of a user on a target edit box on a first image is received; in response to the dragging operation, a text to be edited in the first image is determined according to the dragging parameter of the dragging operation; the text to be edited in the first image is edited according to the editing type corresponding to the target edit box to generate a second image; and the image text in the second image is extracted. Because the text in the image is edited by the user dragging an edit box on the image, and the image text can be extracted once the editing is finished, the user does not need to extract the image text first and then edit it while checking it against the text positions. The operation steps of the user can therefore be reduced, and the efficiency of extracting the image text is improved.

It should be noted that the image text extraction method provided in the embodiment of the present application may be executed by an image text extraction device, or by a control module in the image text extraction device for executing the image text extraction method. In the embodiment of the present application, the image text extraction device is described by taking the case in which the image text extraction device executes the image text extraction method as an example.

Referring to fig. 9, a schematic structural diagram of an image text extraction apparatus provided in an embodiment of the present application is shown, and as shown in fig. 9, the image text extraction apparatus 900 may specifically include the following modules:

a dragging operation receiving module 910, configured to receive a dragging operation of a target edit box on a first image by a user;

an edited text determining module 920, configured to determine, in response to the dragging operation, a text to be edited in the first image according to a dragging parameter of the dragging operation;

a second image generating module 930, configured to edit the text to be edited in the first image according to the editing type corresponding to the target editing box, so as to generate a second image;

and an image text extracting module 940, configured to extract an image text in the second image.

Optionally, the second image generation module 930 includes:

and the added text image generating unit is used for acquiring a first text edited by the user under the condition that the editing type is a text adding type, adding the first text to the position of the text to be edited in the first image, and then generating a second image added with the text.

Optionally, the second image generation module 930 includes:

and the replacing text image generating unit is used for acquiring a second text edited by the user under the condition that the editing type is a text replacing type, replacing the text to be edited in the first image with the second text, and generating a second image after replacing the text.

Optionally, the second image generation module 930 includes:

and the deleted text image generating unit is used for deleting the text to be edited in the first image and generating a second image after the text is deleted under the condition that the editing type is a text deleting type.

Optionally, the image text extraction module 940 includes:

and the image text extraction unit is used for extracting the image text containing the modification mark in the second image when the second image is the image containing the modification mark.

Optionally, the apparatus further comprises:

the corresponding relation storage module is used for storing the corresponding relation between the text to be edited and the target edit box;

the third text determining module is used for determining whether a third text which is the same as the text to be edited is contained in the third image when the text in the third image needs to be extracted;

and the editing operation executing module is used for executing the editing operation of the editing type corresponding to the target editing box on the third text under the condition that the third text exists.

The image text extraction device provided by the embodiment of the application receives a dragging operation of a user on a target edit box on a first image; in response to the dragging operation, determines a text to be edited in the first image according to a dragging parameter of the dragging operation; edits the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image; and extracts the image text in the second image. Because the text in the image is edited by the user dragging an edit box on the image, and the image text can be extracted once the editing is finished, the user does not need to extract the image text first and then edit it while checking it against the text positions. The operation steps of the user can therefore be reduced, and the efficiency of extracting the image text is improved.

The image text extraction device in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.

The image text extraction device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and the embodiments of the present application are not specifically limited in this respect.

The image text extraction device provided in the embodiment of the present application can implement each process implemented in the method embodiment of fig. 1, and is not described here again to avoid repetition.

Optionally, as shown in fig. 10, an electronic device 1000 is further provided in this embodiment of the present application, and includes a processor 1001, a memory 1002, and a program or an instruction stored in the memory 1002 and executable on the processor 1001, where the program or the instruction is executed by the processor 1001 to implement each process of the above-mentioned embodiment of the image text extraction method, and can achieve the same technical effect, and in order to avoid repetition, it is not described here again.

It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.

Fig. 11 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.

The electronic device 1100 includes, but is not limited to: a radio frequency unit 1101, a network module 1102, an audio output unit 1103, an input unit 1104, a sensor 1105, a display unit 1106, a user input unit 1107, an interface unit 1108, a memory 1109, a processor 1110, and the like.

Those skilled in the art will appreciate that the electronic device 1100 may further include a power source (e.g., a battery) for supplying power to the various components, and the power source may be logically connected to the processor 1110 via a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented via the power management system. The electronic device structure shown in fig. 11 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than those shown, combine some components, or arrange the components differently, which is not described in detail here.

The user input unit 1107 is configured to receive a drag operation of the target edit box on the first image by the user;

The processor 1110 is configured to determine, in response to the dragging operation, a text to be edited in the first image according to a dragging parameter of the dragging operation; edit the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image; and extract image text in the second image.

In the embodiment of the application, the text in the image is edited by the user dragging an edit box on the image, and the image text can be extracted once the editing is finished, so that the user does not need to extract the image text first and then edit it while checking it against the text positions. The operation steps of the user can therefore be reduced, and the efficiency of extracting the image text is improved.

Optionally, the processor 1110 is further configured to, when the editing type is a text addition type, obtain a first text edited by the user, add the first text to a position of a text to be edited in the first image, and generate a second image to which the text is added.

Optionally, the processor 1110 is further configured to, when the editing type is a text replacement type, obtain a second text edited by the user, replace the text to be edited in the first image with the second text, and generate a second image after replacing the text.

Optionally, the processor 1110 is further configured to delete the text to be edited in the first image and generate a second image after the text is deleted, when the editing type is a text deletion type.

Optionally, the processor 1110 is further configured to extract an image text containing a modification mark in the second image, in a case that the second image is an image containing a modification mark.

Optionally, the processor 1110 is further configured to store a corresponding relationship between the text to be edited and the target edit box; when a text in a third image needs to be extracted, determining whether the third image contains a third text which is the same as the text to be edited; and in the case that the third text exists, executing an editing operation of an editing type corresponding to the target editing box on the third text.

In the embodiments of the present application, the correspondence between the text to be edited and the target edit box is stored. When text that is the same as the text to be edited needs to be extracted from another image, the editing operation of the editing type corresponding to the target edit box can be performed on that text without the user editing it again, which further reduces the user's operation steps for extracting text and improves the user experience.
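The stored correspondence can be illustrated with a small lookup table. The sketch below assumes that the correspondence is kept as a mapping from the previously edited text to an editing type name and the text supplied by the user, and that the third image is represented by its text layer; SavedEdits and replay_saved_edits are hypothetical names, and the application does not specify a storage format.

```python
from typing import Dict, Tuple

# Assumed storage format: edited text -> (editing type name, user-supplied text).
SavedEdits = Dict[str, Tuple[str, str]]


def replay_saved_edits(third_image_text: str, saved: SavedEdits) -> str:
    """Apply every stored edit whose target text also occurs in the third image."""
    result = third_image_text
    for target, (edit_type, user_text) in saved.items():
        if target not in result:
            continue  # no third text identical to the stored text to be edited
        if edit_type == "replace":
            result = result.replace(target, user_text)
        elif edit_type == "delete":
            result = result.replace(target, "")
        elif edit_type == "add":
            # Assumption: added text is placed in front of the matched text.
            result = result.replace(target, user_text + target)
        else:
            raise ValueError(f"unsupported edit type: {edit_type}")
    return result


# Example: edits recorded on an earlier image are replayed on a new one.
saved = {"DRAFT": ("delete", ""), "Inc.": ("replace", "Incorporated")}
edited = replay_saved_edits("DRAFT Acme Inc. report", saved)
```

Exact string matching is used here because the text only requires the third text to be the same as the text to be edited; a tolerant match would be a separate design decision.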

It should be understood that, in the embodiments of the present application, the input unit 1104 may include a graphics processing unit (GPU) 11041 and a microphone 11042. The graphics processing unit 11041 processes image data of a still image or a video obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 1106 may include a display panel 11061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 1107 includes a touch panel 11071 and other input devices 11072. The touch panel 11071, also called a touch screen, may include two parts: a touch detection device and a touch controller. Other input devices 11072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described in detail here. The memory 1109 may be used to store software programs and various data including, but not limited to, application programs and an operating system. The processor 1110 may integrate an application processor, which primarily handles the operating system, user interface, applications, and the like, and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor may alternatively not be integrated into the processor 1110.

The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the above-mentioned image text extraction method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.

The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.

The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement each process of the above image text extraction method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the description is omitted here.

It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order depending on the functions involved; for example, the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. An image text extraction method, comprising:

and extracting image text in the second image.

2. The method according to claim 1, wherein the editing the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image comprises:

in a case where the editing type is a text addition type, acquiring a first text edited by the user, adding the first text at the position of the text to be edited in the first image, and generating a second image to which the text has been added.

3. The method according to claim 1, wherein the editing the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image comprises:

in a case where the editing type is a text replacement type, acquiring a second text edited by the user, replacing the text to be edited in the first image with the second text, and generating a second image after the text is replaced.

4. The method according to claim 1, wherein the editing the text to be edited in the first image according to the editing type corresponding to the target edit box to generate a second image comprises:

in a case where the editing type is a text deletion type, deleting the text to be edited in the first image, and generating a second image after the text is deleted.

5. The method of claim 1, wherein extracting image text from the second image comprises:

in a case where the second image is an image containing a modification mark, extracting the image text containing the modification mark from the second image.

6. The method of claim 1, after the extracting image text in the second image, further comprising:

storing a correspondence between the text to be edited and the target edit box;

when a text in a third image needs to be extracted, determining whether the third image contains a third text which is the same as the text to be edited;

and in a case where the third text exists, performing on the third text the editing operation of the editing type corresponding to the target edit box.

7. An image text extraction device, comprising:

8. The apparatus of claim 7, wherein the second image generation module comprises:

9. The apparatus of claim 7, wherein the second image generation module comprises:

10. An electronic device comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, the program or instructions when executed by the processor implementing the steps of the image text extraction method according to any one of claims 1-6.