CN112560854A - Method, apparatus, device and storage medium for processing image

Info

Publication number
CN112560854A
Authority
CN
China
Prior art keywords
image
text
seal
stamp
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011501453.4A
Other languages
Chinese (zh)
Inventor
马振宇
吕鹏原
章成全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011501453.4A priority Critical patent/CN112560854A/en
Publication of CN112560854A publication Critical patent/CN112560854A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method, apparatus, device and storage medium for processing images, and relates to the field of artificial intelligence, in particular to the technical fields of computer vision and deep learning. The specific implementation scheme is as follows: acquiring a stamp image; performing text positioning on the stamp image to determine a stamp text image; transforming the stamp text image to obtain a target text image; and recognizing the text in the target text image to determine the text in the stamp image. This implementation can recognize a stamp in an image, and the characters in the stamp, through relatively simple steps.

Description

Method, apparatus, device and storage medium for processing image
Technical Field
The present application relates to the field of artificial intelligence technology, in particular to the field of computer vision and deep learning, and more particularly to a method, apparatus, device and storage medium for processing images.
Background
Stamps are widely used on documents as a means of signing or authentication. With the development of information technology, the demand for stamp recognition in office and government automation is increasing. However, unlike ordinary character recognition, stamp recognition is difficult for the following reasons: 1) the text appears in multiple layouts, including horizontal text, curved text, and multi-line text; 2) the curved text generally has a large arc.
Existing stamp recognition methods are generally complex.
Disclosure of Invention
A method, apparatus, device, and storage medium for processing an image are provided.
According to a first aspect, there is provided a method for processing an image, comprising: acquiring a stamp image; performing text positioning on the stamp image to determine a stamp text image; transforming the stamp text image to obtain a target text image; and recognizing the text in the target text image to determine the text in the stamp image.
According to a second aspect, there is provided an apparatus for processing an image, comprising: an image acquisition unit configured to acquire a stamp image; a text positioning unit configured to perform text positioning on the stamp image and determine a stamp text image; an image transformation unit configured to transform the stamp text image to obtain a target text image; and a text recognition unit configured to recognize the text in the target text image and determine the text in the stamp image.
According to a third aspect, there is provided an electronic device for processing an image, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the first aspect.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method as described in the first aspect.
According to a fifth aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method as described in the first aspect.
According to the technology of the application, a relatively simple stamp detection and recognition method is provided, which can recognize a stamp in an image, and the characters in the stamp, through relatively simple steps.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for processing an image according to the present application;
FIG. 3 is a schematic illustration of an application scenario of a method for processing an image according to the present application;
FIG. 4 is a flow diagram of another embodiment of a method for processing an image according to the present application;
FIG. 5 is a schematic block diagram of one embodiment of an apparatus for processing images according to the present application;
fig. 6 is a block diagram of an electronic device for implementing a method for processing an image according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for processing images or the apparatus for processing images of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. For example, the user may capture an image with a stamp through the terminal devices 101, 102, 103 and send the image to the server 105. The terminal devices 101, 102, 103 may also be connected to an image capturing device for capturing images with stamps. Various communication client applications, such as an image processing application, a social platform application, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices, including but not limited to smart phones, tablet computers, in-vehicle computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. This is not particularly limited herein.
The server 105 may be a server that provides various services, such as a background server that processes images transmitted by the terminal devices 101, 102, 103. The background server can perform seal detection and identification on the received image and feed back the identified characters to the terminal devices 101, 102 and 103.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the method for processing an image provided in the embodiment of the present application may be executed by the terminal devices 101, 102, and 103, or may be executed by the server 105. Accordingly, the apparatus for processing images may be provided in the terminal devices 101, 102, 103, or in the server 105. It should be noted that if the method for processing an image is executed by the terminal apparatuses 101, 102, and 103, the network 104 and the server 105 may not be included in the above-described architecture diagram.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for processing an image according to the present application is shown. The method for processing the image of the embodiment comprises the following steps:
step 201, obtaining a stamp image.
In this embodiment, the executing body of the method for processing an image (for example, the terminal devices 101, 102, 103 or the server 105 shown in fig. 1) may acquire a stamp image in various ways. For example, the stamp image may be acquired in real time by an image acquisition device connected to the executing body, or obtained through an application installed on the executing body. The stamp image may include various types of stamps. The color and shape of the stamp are not limited: the stamp may be black, red, or the like, and may be a round stamp, an elliptical stamp, a square stamp, or the like.
Step 202, performing text positioning on the stamp image, and determining the stamp text image.
After the executing body obtains the stamp image, it may perform text positioning on the stamp image and determine a stamp text image. Specifically, the executing body may input the stamp image into a pre-trained text positioning model, and the resulting output is the stamp text image. In some specific applications, the text positioning model may adopt the text detection algorithm proposed in the paper "EAST: An Efficient and Accurate Scene Text Detector" (CVPR 2017) combined with a lightweight backbone network, which can greatly improve the accuracy and efficiency of the model and has the clear advantages of a smaller model size and higher speed.
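As an illustrative, non-limiting sketch, an EAST-style detector of the kind named above can be run with OpenCV's DNN module as follows. The model file name, the 320x320 input size and the score threshold are assumptions made only for illustration, and decoding of the geometry map into text boxes is omitted; the application itself does not prescribe a particular implementation.

    import cv2
    import numpy as np

    def detect_text_score_map(stamp_image,
                              model_path="frozen_east_text_detection.pb",
                              score_threshold=0.5):
        # Load the pre-trained EAST model; input dimensions must be multiples of 32.
        net = cv2.dnn.readNet(model_path)
        blob = cv2.dnn.blobFromImage(stamp_image, 1.0, (320, 320),
                                     (123.68, 116.78, 103.94),
                                     swapRB=True, crop=False)
        net.setInput(blob)
        # Score map (text confidence) and geometry map (box offsets and angle).
        scores, geometry = net.forward(["feature_fusion/Conv_7/Sigmoid",
                                        "feature_fusion/concat_3"])
        # Binary mask of locations whose text confidence exceeds the threshold.
        text_mask = (scores[0, 0] > score_threshold).astype(np.uint8)
        return text_mask, geometry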
Step 203, transforming the stamp text image to obtain a target text image.
After the executing body obtains the stamp text image, it may transform the stamp text image to obtain a target text image. During the transformation, the image region corresponding to each character in the stamp text image may be rotated so that each character in the resulting target text image is upright. Alternatively, when the executing body determines that the stamp text image is curved, it may transform the image into a horizontal shape.
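By way of illustration only, the per-character rotation described above may be sketched as follows. How the rotation angle of each character is estimated is not specified in the application, so it is taken here as an input.

    import cv2

    def upright_character(char_crop, angle_degrees):
        # Rotate a single character crop around its center so the character is upright.
        h, w = char_crop.shape[:2]
        rotation = cv2.getRotationMatrix2D((w / 2, h / 2), angle_degrees, 1.0)
        return cv2.warpAffine(char_crop, rotation, (w, h))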
Step 204, recognizing the text in the target text image and determining the text in the stamp image.
After the transformation, the executing body may recognize the text in the target text image and determine the text in the stamp image. Specifically, the executing body may use an existing text recognition algorithm, such as an OCR (Optical Character Recognition) algorithm. Alternatively, the executing body may input the target text image into a pre-trained text recognition model, and the resulting output is the text in the stamp image. In some specific applications, the text recognition model may use a lightweight network structure based on a CNN and a BiLSTM. The CNN (Convolutional Neural Network) may be used to extract high-level abstract visual features from the input image to obtain a feature vector. The obtained feature vector is then input into a BiLSTM (Bi-directional Long Short-Term Memory, a network formed by combining a forward LSTM and a backward LSTM), which adds contextual semantic information to the feature vector, so that a richer deep feature representation of the target text image is obtained. The network requires little computation, recognizes characters in a short time, supports a character set of up to twenty thousand characters, and essentially covers various rarely used characters.
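As an illustrative sketch of the kind of lightweight CNN + BiLSTM recognizer described above (a CRNN-style network), the following PyTorch code is given. The layer sizes, the pooling scheme and the 20,000-class output layer are assumptions chosen for illustration; the application does not disclose the exact architecture.

    import torch
    import torch.nn as nn

    class StampTextRecognizer(nn.Module):
        def __init__(self, num_classes=20000, hidden_size=256):
            super().__init__()
            # Small convolutional backbone: extracts visual features and collapses the
            # image height so that each column of the feature map becomes one time step.
            self.cnn = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
                nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d((1, None)),  # height -> 1, width kept as sequence
            )
            # Bidirectional LSTM adds contextual semantic information over the sequence.
            self.bilstm = nn.LSTM(128, hidden_size, bidirectional=True, batch_first=True)
            self.classifier = nn.Linear(2 * hidden_size, num_classes)

        def forward(self, x):                            # x: (batch, 3, H, W)
            features = self.cnn(x)                       # (batch, 128, 1, W')
            seq = features.squeeze(2).permute(0, 2, 1)   # (batch, W', 128)
            context, _ = self.bilstm(seq)                # (batch, W', 2 * hidden_size)
            return self.classifier(context)              # per-column character logits

    # Example: per-column logits for a 32x256 target text image.
    # logits = StampTextRecognizer()(torch.randn(1, 3, 32, 256))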
With continued reference to fig. 3, a schematic illustration of one application scenario of the method for processing an image according to the present application is shown. In the application scenario of fig. 3, a user acquires a stamp image through an image acquisition device 302 connected to a terminal 301. Then, after the stamp image is processed through steps 202-204, the text in the stamp image is determined to be "Zhangsan Lisi Co., Ltd.". The terminal 301 may output the text for copying or other processing by the user.
The method for processing an image provided by the above embodiment of the application can recognize a stamp, and the characters in the stamp, through relatively simple steps.
With continued reference to FIG. 4, a flow 400 of another embodiment of a method for processing an image according to the present application is shown. As shown in fig. 4, the method of the present embodiment may include the following steps:
step 401, acquiring a target image; and determining the seal image from the target image and a seal detection model trained in advance.
In this embodiment, the executing body may first acquire a target image in various ways. The target image may include at least one stamp image. After obtaining the target image, the executing body may perform stamp detection on the target image to obtain at least one stamp image. Specifically, the executing body may input the target image into a pre-trained stamp detection model, and the resulting output is the stamp image. Alternatively, the executing body may first identify a circle, ellipse, or rectangle in the target image and then recognize the characters inside it. If the recognized text includes specific text (e.g., "company" or "seal"), the circle, ellipse, or rectangle is determined to be a stamp image.
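A minimal sketch of the shape-based alternative described above, using OpenCV's Hough circle transform, might look as follows. The parameter values are assumptions, and recognize_text is a hypothetical placeholder standing in for the text recognition of step 404.

    import cv2
    import numpy as np

    def find_stamp_candidates(target_image, keywords=("company", "seal"),
                              recognize_text=lambda crop: ""):
        gray = cv2.cvtColor(target_image, cv2.COLOR_BGR2GRAY)
        gray = cv2.medianBlur(gray, 5)
        # Detect circular candidates (round stamps) in the target image.
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=80,
                                   param1=120, param2=60, minRadius=30, maxRadius=300)
        stamps = []
        if circles is None:
            return stamps
        for x, y, r in np.round(circles[0]).astype(int):
            # Crop the candidate region, clipped to the image bounds.
            x0, y0 = max(x - r, 0), max(y - r, 0)
            crop = target_image[y0:y + r, x0:x + r]
            # Keep the candidate only if its recognized text contains a keyword.
            if any(keyword in recognize_text(crop) for keyword in keywords):
                stamps.append(crop)
        return stamps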
Step 402, determining a text region labeling box according to the stamp image and a pre-trained text positioning model; and determining a stamp text image according to the text region labeling box.
In this embodiment, the executing body may input the stamp image into a pre-trained text positioning model to determine a text region labeling box. The text positioning model is used to represent the correspondence between the stamp image and the text region labeling box. The text region labeling box may include multiple boxes, each box representing a connected component. The executing body may use the image enclosed by the text region labeling box as the stamp text image.
In some optional implementations of this embodiment, each text region labeling box may include multiple regression points, and the regression points together represent the position of the text region labeling box. For example, a regression point may be set at each corner point of the text region labeling box, and an edge of the box may be represented by setting multiple regression points between two corner points. The executing body may determine the position of each regression point according to the position of the text region labeling box.
And 403, converting the stamp text image to a preset canvas according to the positions of the multiple regression points in the text region labeling frame to obtain a target text image.
After the positions of the multiple regression points in the text region labeling frame are determined, the execution main body can convert the stamp text image into a preset canvas according to the blowing of the multiple regression points to obtain a target text image. Here, the preset canvas may be a canvas in which a size and a shape are previously set, and the execution subject may transform the stamp text image to the preset canvas by using positions of the plurality of regression points through various algorithms. Specifically, the execution subject may uniformly distribute the multiple regression points to a border of a preset canvas, so as to obtain the target text image.
In some optional implementations of this embodiment, the multiple regression points are uniformly distributed on the upper and lower sides of the text region labeling box, and the preset canvas is a rectangle. Step 403 above may be specifically implemented through the following steps, not shown in fig. 4: uniformly distributing the regression points on the upper side of the text region labeling box along the upper side of the rectangle, and uniformly distributing the regression points on the lower side of the text region labeling box along the lower side of the rectangle, to obtain the target text image.
In this implementation, the executing body may uniformly distribute the regression points located above the text region labeling box along the top edge of the rectangle, and uniformly distribute the regression points located below the box along the bottom edge, so as to obtain the target text image. For example, the text region labeling box may include 16 regression points, 8 of which are distributed on the upper side and 8 on the lower side. The executing body may uniformly distribute the 8 points on the upper side along the top edge of the rectangle and the 8 points on the lower side along the bottom edge, thereby realizing the transformation of the stamp text image and obtaining the target text image.
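As an illustrative sketch, the transformation described above can be realized as a piecewise perspective warp: each quadrilateral bounded by a consecutive pair of upper regression points and the corresponding pair of lower regression points is mapped onto an equal-width slice of the rectangular canvas. The application does not name a specific warping algorithm, so this particular warp, the canvas size, and the assumption of a three-channel input image are illustrative only.

    import cv2
    import numpy as np

    def warp_to_canvas(stamp_text_image, top_points, bottom_points,
                       canvas_w=256, canvas_h=32):
        # top_points / bottom_points: (K, 2) arrays of regression points on the upper
        # and lower sides of the text region labeling box, ordered left to right.
        k = len(top_points)                       # K points -> K - 1 quadrilaterals
        slice_w = canvas_w // (k - 1)
        canvas = np.zeros((canvas_h, slice_w * (k - 1), 3), dtype=np.uint8)
        for i in range(k - 1):
            src = np.float32([top_points[i], top_points[i + 1],
                              bottom_points[i + 1], bottom_points[i]])
            dst = np.float32([[0, 0], [slice_w, 0],
                              [slice_w, canvas_h], [0, canvas_h]])
            # Warp this curved quadrilateral into an upright rectangular slice.
            matrix = cv2.getPerspectiveTransform(src, dst)
            piece = cv2.warpPerspective(stamp_text_image, matrix, (slice_w, canvas_h))
            canvas[:, i * slice_w:(i + 1) * slice_w] = piece
        return canvas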
Step 404, recognizing the text in the target text image and determining the text in the stamp image.
Step 405, outputting the category of the stamp in the stamp image.
In this embodiment, the executing body may further output the category of the stamp in the stamp image. The categories of stamps may include round stamps, oval stamps, square stamps, and so on. The executing body may output the category of the stamp for the user to view.
The method for processing an image provided by the above embodiment of the present application realizes the transformation of the image by means of the regression points, requires little computation, and can obtain a high-quality image.
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for processing an image, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 5, the apparatus 500 for processing an image of the present embodiment includes: an image acquisition unit 501, a text positioning unit 502, an image transformation unit 503, and a text recognition unit 504.
An image acquisition unit 501 configured to acquire a stamp image.
And a text positioning unit 502 configured to perform text positioning on the stamp image and determine the stamp text image.
An image transformation unit 503 configured to transform the stamp text image to obtain a target text image.
A text recognition unit 504 configured to identify text in the target text image and determine text in the stamp image.
In some optional implementations of this embodiment, the text positioning unit 502 may be further configured to: determine a text region labeling box according to the stamp image and a pre-trained text positioning model; and determine the stamp text image according to the text region labeling box.
In some optional implementations of this embodiment, the text region labeling box includes a plurality of regression points. The image transformation unit 503 may be further configured to: transform the stamp text image onto a preset canvas according to the positions of the plurality of regression points in the text region labeling box to obtain the target text image.
In some optional implementations of this embodiment, the plurality of regression points are uniformly distributed on the upper and lower sides of the text region labeling box, and the preset canvas is a rectangle. The image transformation unit 503 is further configured to: uniformly distribute the regression points on the upper side of the text region labeling box along the upper side of the rectangle, and uniformly distribute the regression points on the lower side of the text region labeling box along the lower side of the rectangle, to obtain the target text image.
In some optional implementations of this embodiment, the image acquisition unit 501 may be further configured to: acquire a target image; and determine the stamp image according to the target image and a pre-trained stamp detection model.
In some optional implementations of this embodiment, the apparatus 500 may further include a category output unit, not shown in fig. 5, configured to: output the category of the stamp in the stamp image.
It should be understood that the units 501 to 504 recited in the apparatus 500 for processing an image correspond to respective steps in the method described with reference to fig. 2, respectively. Thus, the operations and features described above for the method for processing an image are equally applicable to the apparatus 500 and the units included therein and will not be described in detail here.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 6 is a block diagram of an electronic device for performing the method for processing an image according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application described and/or claimed herein.
As shown in fig. 6, the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 6, one processor 601 is taken as an example.
The memory 602 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the methods provided herein for processing images. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein for processing images.
Various implementations of the systems and techniques described herein may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. The program code described above may be packaged as a computer program product. These program code or computer program products may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor 601, causes the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
The memory 602, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the method for processing an image (e.g., the image acquisition unit 501, the text positioning unit 502, the image transformation unit 503, and the text recognition unit 504 shown in fig. 5) in the embodiments of the present application. The processor 601 executes various functional applications of the server and data processing, i.e., implements the method for processing images performed in the above-described method embodiments, by executing non-transitory software programs, instructions, and modules stored in the memory 602.
The memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of an electronic device that performs processing for an image, and the like. Further, the memory 602 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 602 optionally includes memory located remotely from the processor 601, which may be connected via a network to an electronic device executing instructions for processing images. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device performing the method for processing an image may further include: an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603 and the output device 604 may be connected by a bus or other means, and fig. 6 illustrates the connection by a bus as an example.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for processing an image, and may be an input device such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, or a joystick. The output device 604 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiment of the application, a stamp in an image, and the characters in the stamp, can be recognized through relatively simple steps.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, and this is not limited herein as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (15)

1. A method for processing an image, comprising:
acquiring a stamp image;
performing text positioning on the stamp image to determine a stamp text image;
transforming the stamp text image to obtain a target text image;
and recognizing the text in the target text image to determine the text in the stamp image.
2. The method of claim 1, wherein the performing text positioning on the stamp image to determine a stamp text image comprises:
determining a text region labeling box according to the stamp image and a pre-trained text positioning model;
and determining the stamp text image according to the text region labeling box.
3. The method of claim 2, wherein the text region labeling box comprises a plurality of regression points; and
the transforming the stamp text image to obtain a target text image comprises:
transforming the stamp text image onto a preset canvas according to the positions of the plurality of regression points in the text region labeling box to obtain the target text image.
4. The method of claim 3, wherein the plurality of regression points are uniformly distributed on the upper and lower sides of the text region labeling box, and the preset canvas is a rectangle; and
the transforming the stamp text image onto a preset canvas according to the positions of the plurality of regression points to obtain the target text image comprises:
uniformly distributing the regression points on the upper side of the text region labeling box along the upper side of the rectangle, and uniformly distributing the regression points on the lower side of the text region labeling box along the lower side of the rectangle, to obtain the target text image.
5. The method of claim 1, wherein the acquiring a stamp image comprises:
acquiring a target image;
and determining the stamp image according to the target image and a pre-trained stamp detection model.
6. The method of claim 1, wherein the method further comprises:
outputting the category of the stamp in the stamp image.
7. An apparatus for processing an image, comprising:
an image acquisition unit configured to acquire a stamp image;
a text positioning unit configured to perform text positioning on the stamp image and determine a stamp text image;
an image transformation unit configured to transform the stamp text image to obtain a target text image;
and a text recognition unit configured to recognize the text in the target text image and determine the text in the stamp image.
8. The apparatus of claim 7, wherein the text positioning unit is further configured to:
determine a text region labeling box according to the stamp image and a pre-trained text positioning model;
and determine the stamp text image according to the text region labeling box.
9. The apparatus of claim 8, wherein the text region labeling box comprises a plurality of regression points; and
the image transformation unit is further configured to:
transform the stamp text image onto a preset canvas according to the positions of the plurality of regression points in the text region labeling box to obtain the target text image.
10. The apparatus of claim 9, wherein the plurality of regression points are uniformly distributed on the upper and lower sides of the text region labeling box, and the preset canvas is a rectangle; and
the image transformation unit is further configured to:
uniformly distribute the regression points on the upper side of the text region labeling box along the upper side of the rectangle, and uniformly distribute the regression points on the lower side of the text region labeling box along the lower side of the rectangle, to obtain the target text image.
11. The apparatus of claim 7, wherein the image acquisition unit is further configured to:
acquire a target image;
and determine the stamp image according to the target image and a pre-trained stamp detection model.
12. The apparatus of claim 7, wherein the apparatus further comprises a category output unit configured to:
output the category of the stamp in the stamp image.
13. An electronic device for processing an image, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a computing unit, implements the method according to any one of claims 1-6.
CN202011501453.4A 2020-12-18 2020-12-18 Method, apparatus, device and storage medium for processing image Pending CN112560854A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011501453.4A CN112560854A (en) 2020-12-18 2020-12-18 Method, apparatus, device and storage medium for processing image

Publications (1)

Publication Number Publication Date
CN112560854A (en) 2021-03-26

Family

ID=75063492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011501453.4A Pending CN112560854A (en) 2020-12-18 2020-12-18 Method, apparatus, device and storage medium for processing image

Country Status (1)

Country Link
CN (1) CN112560854A (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659647A (en) * 2019-09-11 2020-01-07 杭州睿琪软件有限公司 Seal image identification method and device, intelligent invoice identification equipment and storage medium
CN111284154A (en) * 2020-03-09 2020-06-16 江苏尚博信息科技有限公司 Seal control machine seal control method, device and system based on image recognition
CN111950353A (en) * 2020-06-30 2020-11-17 深圳市雄帝科技股份有限公司 Seal text recognition method and device and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361404A (en) * 2021-06-02 2021-09-07 北京百度网讯科技有限公司 Method, apparatus, device, storage medium and program product for recognizing text
CN113436080A (en) * 2021-06-30 2021-09-24 平安科技(深圳)有限公司 Seal image processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111428008B (en) Method, apparatus, device and storage medium for training a model
CN112507946A (en) Method, apparatus, device and storage medium for processing image
CN111626202B (en) Method and device for identifying video
US20210312172A1 (en) Human body identification method, electronic device and storage medium
CN112149741B (en) Training method and device for image recognition model, electronic equipment and storage medium
CN111860362A (en) Method and device for generating human face image correction model and correcting human face image
CN111598164A (en) Method and device for identifying attribute of target object, electronic equipment and storage medium
EP3876163B1 (en) Model training, image processing method, device, storage medium, and program product
CN111709875B (en) Image processing method, device, electronic equipment and storage medium
CN111539897A (en) Method and apparatus for generating image conversion model
CN111709873A (en) Training method and device of image conversion model generator
CN112241716B (en) Training sample generation method and device
CN111753911A (en) Method and apparatus for fusing models
CN112052825B (en) Method, apparatus, device and storage medium for processing image
CN112560854A (en) Method, apparatus, device and storage medium for processing image
CN112507090A (en) Method, apparatus, device and storage medium for outputting information
CN114863437A (en) Text recognition method and device, electronic equipment and storage medium
CN111563541B (en) Training method and device of image detection model
CN111523292B (en) Method and device for acquiring image information
CN112016523B (en) Cross-modal face recognition method, device, equipment and storage medium
CN113792876A (en) Backbone network generation method, device, equipment and storage medium
US20210224476A1 (en) Method and apparatus for describing image, electronic device and storage medium
CN112529181A (en) Method and apparatus for model distillation
CN112328088A (en) Image presenting method and device
CN111833391A (en) Method and device for estimating image depth information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination