CN114596576A

CN114596576A - Image processing method and device, electronic equipment and storage medium

Info

Publication number: CN114596576A
Application number: CN202210242840.3A
Authority: CN
Inventors: 王晓燕; 吕鹏原; 范森; 章成全; 姚锟
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2022-03-11
Filing date: 2022-03-11
Publication date: 2022-06-07

Abstract

The present disclosure provides an image processing method, an image processing apparatus, an electronic device and a storage medium, which relate to the technical field of artificial intelligence, and further relate to the technical field of computer vision and deep learning, so as to at least solve the technical problem of low efficiency in identifying a target object in the related art. The specific implementation scheme is as follows: acquiring a target image, wherein the target image comprises an object to be identified; detecting a target image to obtain target pixel data, wherein the target pixel data is used for expressing the position relation between at least one pixel in an object to be identified and the vertex coordinates of the object to be identified; and correcting the target image based on the target pixel data to obtain a correction result.

Description

Image processing method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, and further relates to the field of computer vision and deep learning technologies, and in particular, to an image processing method and apparatus, an electronic device, and a storage medium.

Background

Character recognition for express bills generally comprises two parts: character detection and character recognition. In actual transportation, sorting and pickup, however, express parcels are placed arbitrarily and the shooting angle is not fixed, so the captured picture may be forward, inverted, inclined, distorted, and so on. Performing character detection and recognition directly on such pictures is difficult, while recognition after manual alignment greatly increases labor and time costs. Therefore, the accuracy of detecting and recognizing express bills with the prior art is low.

Disclosure of Invention

The disclosure provides an image processing method, an image processing device, electronic equipment and a storage medium, which are used for at least solving the technical problem of low accuracy of detection of an express delivery object in the related art.

According to an aspect of the present disclosure, there is provided an image processing method including: acquiring a target image, wherein the target image comprises an object to be identified; detecting a target image to obtain target pixel data, wherein the target pixel data is used for representing the position relation between at least one pixel in an object to be identified and the vertex coordinates of the object to be identified; and correcting the target image based on the target pixel data to obtain a correction result.

According to still another aspect of the present disclosure, there is provided an image processing apparatus including: an acquisition module configured to acquire a target image, wherein the target image comprises an object to be identified; a detection module configured to detect the target image to obtain target pixel data, wherein the target pixel data is used for representing the position relation between at least one pixel in the object to be identified and the vertex coordinates of the object to be identified; and a correction module configured to correct the target image based on the target pixel data to obtain a correction result.

According to still another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the image processing method proposed by the present disclosure.

According to still another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the image processing method proposed by the present disclosure.

According to yet another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the image processing method as set forth in the present disclosure.

In the disclosure, a target image of a target scene is first acquired, wherein the target image includes an object to be identified; then, detecting the target image to obtain target pixel data, wherein the target pixel data is used for expressing the position relation between at least one pixel in the object to be identified and the vertex coordinates of the object to be identified; and finally, correcting the target image based on the target pixel data to obtain a correction result. The target image recognition efficiency is improved. It is easy to notice that the target pixel data can be used to represent the position relationship between at least one pixel in the object to be identified and the vertex coordinates of the object to be identified, and then the target image is corrected based on the target pixel data, so that the accuracy of identification can be further improved, the situation of false detection can be reduced, and the technical problem of low accuracy in detecting the express delivery object in the related art can be further solved.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a partial view of a courier slip according to an embodiment of the present disclosure;

FIG. 2 is a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing an image processing method according to an embodiment of the present disclosure;

FIG. 3 is a flow chart of an image processing method according to a first embodiment of the present disclosure;

FIG. 4a is a schematic diagram of an irregularly placed express waybill according to an embodiment of the present disclosure;

FIG. 4b is a schematic diagram of an irregularly placed express waybill according to an embodiment of the present disclosure;

FIG. 5a is a detection diagram of an outer frame of an express bill according to an embodiment of the present disclosure;

FIG. 5b is a correction diagram of an outer frame of an express bill according to an embodiment of the present disclosure;

FIG. 5c is a flow chart of another image processing method according to a second embodiment of the present disclosure;

FIG. 6a is a sample image according to an embodiment of the present disclosure;

FIG. 6b is a diagram of a central Gaussian distribution region of a sample image according to an embodiment of the present disclosure;

FIG. 6c is a flow chart of another image processing method according to a third embodiment of the present disclosure;

FIG. 7 is a block diagram of an image processing apparatus according to an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

With the development of e-commerce and transportation, the express delivery industry has grown vigorously. In 2021, the number of express deliveries in China exceeded 10 billion, playing a very important role in promoting consumption and accelerating the economic cycle. There are dozens of common express companies on the market, each with express bills of various styles and complicated information. In the daily process of sorting and delivering parcels at an express site, each person handles hundreds of parcels. If bill numbers and recipient information are entered manually in order to update the logistics status in a logistics company's information management system, or to divide deliveries by region according to information such as telephone number and address, a large amount of labor and time is consumed. In an environment that demands delivery speed, manual operation is also error-prone and may lead to complaints.

At present, the following methods are mainly used for improving the target detection and identification:

Method 1 is a four-direction classification method. A four-direction classification model outputs one of 4 directions for the express waybill, namely up, down, left or right, and the express waybill picture is then corrected by rotating it by 90, 180 or 270 degrees according to that direction.

Method 2 is a regression method, in which the 4 vertices of the express waybill body are detected directly by a regression model.

Method 3 is a segmentation method. Based on a segmentation algorithm, it outputs the position of the body region, the position of the 1/2 region in the text-forward direction, and the position of the upper-left 1/4 region in the text-forward direction. The 4 vertex coordinates of the body and the starting vertex are then determined by combining the body region position with the 1/4 region position.

The related art has the following problems. Method 1, the four-direction classification method, has difficulty classifying a picture that is shot obliquely by a camera and therefore exhibits an affine transformation angle, or a picture rotated by about 45 degrees. Even when the classification is correct, the rotated characters are still inclined at some angle, which affects the accuracy of subsequent character detection and recognition. Method 2, the regression method, may produce inaccurate vertex positions when express waybills have many layouts and complex styles. Method 3, the segmentation method, struggles with small or partial images of the express waybill, such as the partial image shown in fig. 1: the feature distribution is dispersed, the segmentation accuracy is easily affected by large-area image features such as bar codes, and the map of the 1/4 region is prone to error, so the determination of the vertex starting point is erroneous.
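The limitation of the four-direction classification method can be illustrated with a little arithmetic. The following is a minimal sketch, not part of the disclosure: the function names and the assumption that the classifier snaps to the nearest multiple of 90 degrees are hypothetical.

```python
def four_direction_class(angle_deg: float) -> int:
    """Hypothetical 4-way label: 0, 1, 2, 3 stand for 0, 90, 180, 270 degrees."""
    return int(round(angle_deg / 90.0)) % 4


def residual_tilt(angle_deg: float) -> float:
    """Tilt left over after rotating back by the nearest multiple of 90 degrees."""
    nearest = 90.0 * round(angle_deg / 90.0)
    return angle_deg - nearest


# A waybill rotated by 100 degrees is classified as "90 degrees",
# so 10 degrees of tilt remain after the coarse correction.
print(four_direction_class(100.0), residual_tilt(100.0))
```

Under this assumption, a picture rotated by about 45 degrees lies roughly halfway between two classes, which is consistent with the classification difficulty described above.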

In accordance with an embodiment of the present disclosure, an image processing method is provided. It should be noted that the steps illustrated in the flowcharts of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from that shown here.

The method embodiments provided by the present disclosure may be executed in a mobile terminal, a computer terminal or a similar electronic device. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein. Fig. 2 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing the image processing method.

As shown in fig. 2, the computer terminal 200 includes a computing unit 201 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM)202 or a computer program loaded from a storage unit 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data necessary for the operation of the computer terminal 200 can also be stored. The computing unit 201, the ROM 202, and the RAM 203 are connected to each other via a bus 204. An input/output (I/O) interface 205 is also connected to bus 204.

A number of components in the computer terminal 200 are connected to the I/O interface 205, including: an input unit 206 such as a keyboard, a mouse, or the like; an output unit 207 such as various types of displays, speakers, and the like; a storage unit 208, such as a magnetic disk, optical disk, or the like; and a communication unit 209 such as a network card, modem, wireless communication transceiver, etc. The communication unit 209 allows the computer terminal 200 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

Computing unit 201 can be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 201 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 201 performs the image processing method described herein. For example, in some embodiments, the image processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 208. In some embodiments, part or all of the computer program may be loaded and/or installed onto the computer terminal 200 via the ROM 202 and/or the communication unit 209. When the computer program is loaded into the RAM 203 and executed by the computing unit 201, one or more steps of the image processing method described herein may be performed. Alternatively, in other embodiments, the computing unit 201 may be configured to perform the image processing method by any other suitable means (e.g. by means of firmware).

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

It should be noted that in some alternative embodiments, the electronic device shown in fig. 2 may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should also be noted that fig. 2 is only one particular example and is intended to illustrate the types of components that may be present in the electronic device described above.

In the above operating environment, the present disclosure provides the image processing method as shown in fig. 3, which may be executed by the computer terminal shown in fig. 2 or a similar electronic device. Fig. 3 is a flowchart of an image processing method according to a first embodiment of the disclosure. As shown in fig. 3, the method may include the steps of:

Step S301, a target image is obtained, wherein the target image comprises an object to be identified.

The target image may be an image of an express package containing an object to be identified, where the object to be identified may be an express receipt on the express package.

The express receipt may contain information such as the express bill number and the addressee information.

The above-mentioned object to be identified may also be an invoice, an electronic card, a poster, a document, or the like in an image.

In an alternative embodiment, the target image may be acquired by a shooting device, wherein the shooting device may be a mobile phone, a camera, or the like.

In another optional embodiment, during actual transportation and sorting, express packages are placed at random and the shooting angle of the camera is not fixed, so the express receipt in a captured express package image may be forward, inverted, inclined, or even distorted. Figs. 4a and 4b show irregularly placed express receipt images in the present disclosure. In the distribution scenes shown in figs. 4a and 4b, performing character detection and recognition directly on such images is difficult, while recognition after manual alignment greatly increases labor and time costs. In the present disclosure, the express receipt in an express package image can be corrected after the image is acquired, so that a forward express receipt is obtained and the detection accuracy for the express receipt is improved.

Step S302, detecting a target image to obtain target pixel data, wherein the target pixel data is used for representing the position relation between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

The target pixel data may be the positional relationship between all pixels in the object to be recognized and the vertex coordinates of the object to be recognized. The target pixel data may also be the positional relationship between a single pixel in the object to be recognized and the vertex coordinates, or the positional relationship between a plurality of pixels in the object to be recognized and the vertex coordinates. Optionally, the target pixel data may be the horizontal and vertical coordinate differences between the pixels inside the body frame of the express document and the 4 vertexes of the body frame.
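The coordinate-difference form of the target pixel data can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `vertex_offsets`, the (x, y) tuple representation and the sign convention (vertex minus pixel) are hypothetical and are not specified in the disclosure.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def vertex_offsets(pixel: Point, vertices: Sequence[Point]) -> List[float]:
    """Return [dx1, dy1, ..., dx4, dy4], the horizontal and vertical
    differences between each vertex and the given pixel.

    Assumed sign convention: vertex coordinate minus pixel coordinate.
    """
    px, py = pixel
    offsets: List[float] = []
    for vx, vy in vertices:
        offsets.extend([vx - px, vy - py])
    return offsets


# A pixel at the center of a 10 x 10 body frame.
frame = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(vertex_offsets((5, 5), frame))
```

With 4 vertexes, each pixel carries 8 values, which would correspond to 8 output channels if such offsets were predicted per pixel.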

In an optional embodiment, the at least one pixel may be a pixel in a central area of the express delivery document, and since the pixel in the central area has more text information, the pixel in the central area can be used to more accurately represent the position relationship between the pixel and the vertex coordinate.

In another alternative embodiment, the target image may be detected by using a detection model to obtain the target pixel data. Optionally, the detection model may adopt a multi-channel segmentation method to output the horizontal and vertical coordinate differences between the pixels inside the body frame and the 4 vertexes, so that the coordinates of the 4 vertexes of the body can be calculated and the starting point among the 4 vertexes can be determined. The starting point can be determined according to the orientation of the text; optionally, the starting point among the 4 vertexes can be the vertex at the upper left corner when the text is in the forward direction.
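One way to recover the 4 vertexes from per-pixel coordinate differences is to let every pixel "vote" for each vertex and average the votes. The sketch below assumes this averaging scheme, the vertex-minus-pixel sign convention, and that offset pair 0 always refers to the starting point; none of these details is stated in the disclosure, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def decode_vertices(pixels: Sequence[Point],
                    offsets: Sequence[Sequence[float]]) -> List[Point]:
    """Average the per-pixel votes (pixel + offset) for each of the 4 vertexes.

    offsets[i] holds [dx1, dy1, ..., dx4, dy4] for pixels[i].
    Assumption: index 0 of the result is the starting point.
    """
    if not pixels:
        raise ValueError("at least one pixel is required")
    sums = [[0.0, 0.0] for _ in range(4)]
    for (px, py), off in zip(pixels, offsets):
        for k in range(4):
            sums[k][0] += px + off[2 * k]
            sums[k][1] += py + off[2 * k + 1]
    n = len(pixels)
    return [(sx / n, sy / n) for sx, sy in sums]


pixels = [(5, 5), (2, 3)]
offsets = [[-5, -5, 5, -5, 5, 5, -5, 5],
           [-2, -3, 8, -3, 8, 7, -2, 7]]
print(decode_vertices(pixels, offsets))
```

Averaging over many pixels is one simple way to reduce the effect of a noisy prediction at any single pixel; other aggregation rules would fit the same description.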

The detection model may be a Convolutional Neural Network (CNN), that is, a type of feed-forward neural network that includes convolution calculations and has a deep structure.

The starting point may be determined by the direction of the text, and in an alternative embodiment, the vertex at the top left corner when the text is in the forward direction may be regarded as the starting point.

The vertex coordinates are coordinates of points at four corners of the express document.

In another alternative embodiment, the target object may be detected in all directions from different angles based on the pixel information, so as to obtain target pixel data corresponding to the target image. Further, since the target pixel data is a position relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized, the 4 vertex coordinates and the starting points of the 4 vertex coordinates can be determined by the target pixel data, and further, the object to be recognized in the target image can be corrected according to the 4 vertex coordinates and the starting points of the 4 vertex coordinates, so that the object to be recognized is in the forward direction.

Step S303, the target image is corrected based on the target pixel data, and a correction result is obtained.

In an alternative embodiment, logical calculation may be performed according to the target pixel data to obtain 4 vertices and starting points of the object to be recognized in the target image.

In another optional embodiment, the object to be recognized in the target image may be corrected according to the 4 vertices and the starting points, so that the object to be recognized may be displayed in a forward direction, and since the text information in the object to be recognized displayed in the forward direction is in the forward direction, when detecting the text information in the object to be recognized, the accuracy of text information detection may be improved.

In another alternative embodiment, the object to be recognized in the target image may be corrected by affine transformation to obtain a correction result. In geometry, an affine transformation applies a linear transformation to a vector space, followed by a translation, mapping it into another vector space. After these steps the characters are in the forward direction, which greatly reduces the difficulty of character detection and recognition and markedly improves character recognition accuracy, without manual correction.
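An affine transformation in the plane has the form x' = a*x + b*y + c, y' = d*x + e*y + f, and is fully determined by three non-collinear point correspondences. The following minimal sketch fits these six coefficients with Cramer's rule; the function names are hypothetical, and using exactly three vertex pairs is an assumption made for illustration, not a detail of the disclosure.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def solve_affine(src: Sequence[Point], dst: Sequence[Point]) -> Affine:
    """Fit the affine map that sends the 3 points in src to the 3 points in dst."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")

    def fit(v1: float, v2: float, v3: float) -> Tuple[float, float, float]:
        # Cramer's rule for [x_i, y_i, 1] . [p, q, r] = v_i
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        r = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    return fit(*(p[0] for p in dst)), fit(*(p[1] for p in dst))


def apply_affine(m: Affine, point: Point) -> Point:
    """Apply the linear part, then the translation, to one point."""
    (a, b, c), (d, e, f) = m
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


# A 90-degree rotation followed by a translation of (2, 3).
m = solve_affine([(0, 0), (1, 0), (0, 1)], [(2, 3), (2, 4), (1, 3)])
print(apply_affine(m, (1, 1)))
```

In practice an image library would typically be used to warp every pixel with such a matrix; the sketch only shows how the matrix relates to vertex correspondences.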

In another optional embodiment, after the corrected target image is obtained, an express bill identification Software Development Kit (SDK) can be integrated into hardware such as a mobile phone, a scanner gun, a high-speed camera and the like. The SDK can automatically extract the bill number, the recipient information and the like from the express bill in real time, so that high identification precision can be ensured and the manual entry workload can be greatly reduced.

Fig. 5a is an express bill outer frame detection diagram in the present disclosure, and fig. 5b is an express bill outer frame correction diagram in the present disclosure. As shown in figs. 5a and 5b, in the present disclosure a target image is detected by a detection model to obtain target pixel data, so that the 4 vertexes of an express waybill region in any orientation, such as forward, inverted, inclined or distorted, can be quickly detected, and the vertex starting point and vertex order can be determined according to the character direction, where the vertex starting point may be the upper left corner when the characters are in the forward direction. The express waybill, from its region down to its characters, is thereby corrected to the forward direction, which improves the accuracy of subsequent character detection and recognition.

According to the present disclosure, in the steps S301 to S303, a target image of a target scene is first obtained, wherein the target image includes an object to be recognized; then, detecting the target image to obtain target pixel data, wherein the target pixel data is used for expressing the position relation between at least one pixel in the object to be identified and the vertex coordinates of the object to be identified; and finally, correcting the target image based on the target pixel data to obtain a correction result. The target image recognition efficiency is improved. It is easy to note that the target pixel data may be used to represent the position relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized, and then the target image is corrected based on the target pixel data, so that the recognition accuracy can be further improved, the false detection situation can be reduced, and the technical problem of low accuracy in detecting the express object in the related art can be further solved.

Fig. 5c is a flowchart of an image processing method according to a second embodiment of the present disclosure, as shown in fig. 5c, the method comprising the steps of:

Step S501, a target image is obtained, wherein the target image comprises an object to be identified.

Step S502, detecting a target image to obtain target pixel data, wherein the target pixel data is used for representing the position relation between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

Step S503, the target image is corrected based on the target pixel data, and a correction result is obtained.

Step S504, the target image is identified based on the correction result to obtain an identification result, wherein the identification result is used for representing text information of the object to be identified in the target image.

In an optional embodiment, the forward image of the object to be recognized in the target image can be determined according to the correction result, and text information recorded in the object to be recognized, such as recipient information, an express bill number and the like, can be obtained by recognizing the forward image of the object to be recognized, so that the target image can be recognized accurately, and further the text information of the object to be recognized with high accuracy can be obtained.

When the target image is corrected, the coordinates of the 4 vertexes of the detected body and the coordinate information of the starting point can be obtained, and the body area of the express waybill in any orientation can be corrected to a picture with forward characters through affine transformation. After this operation the characters are in the forward direction, which greatly reduces the difficulty of character detection and recognition and markedly improves character recognition precision, without manual correction. After the correction is finished, the target image can be recognized based on the correction result, and the obtained recognition result can be used for representing the text information of the object to be recognized. In an alternative embodiment, recognition can be performed on the corrected image of the object, which has higher precision, so that the accuracy of the information recognized from the object is improved.
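The order of steps S502 to S504 can be summarized as a small pipeline. The sketch below is only an outline under stated assumptions: `detect`, `correct` and `recognize` are hypothetical callables standing in for the detection model, the affine correction and the character recognizer, none of which is specified here.

```python
from typing import Any, Callable


def recognize_target_image(image: Any,
                           detect: Callable[[Any], Any],
                           correct: Callable[[Any, Any], Any],
                           recognize: Callable[[Any], Any]) -> Any:
    """Run detection, correction and recognition in the order of S502 to S504."""
    pixel_data = detect(image)              # S502: target pixel data
    corrected = correct(image, pixel_data)  # S503: correction result
    return recognize(corrected)             # S504: identification result


# Stub callables, used only to show the data flow.
result = recognize_target_image(
    "raw-image",
    detect=lambda img: {"vertices": 4},
    correct=lambda img, data: f"corrected({img})",
    recognize=lambda img: f"text-of({img})",
)
print(result)
```

Passing the three stages as parameters keeps the sketch independent of any particular model or image library.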

Optionally, correcting the target image based on the target pixel data to obtain a correction result, including: determining vertex coordinates of an object to be identified and a sorting order of the vertex coordinates based on the target pixel data; and correcting the target image based on the vertex coordinates and the sequencing order of the vertex coordinates to obtain a correction result.

The above-mentioned sorting order may be clockwise or counterclockwise.

In an alternative embodiment, the 4 vertex coordinates (x1, y1), (x2, y2), (x3, y3) and (x4, y4) of the body area of the courier slip picture can be obtained through logic calculation, wherein the coordinates can be ordered in a clockwise or counterclockwise direction. Here (x1, y1) is the vertex at the upper left corner when the characters are in the forward direction, and can be regarded as the starting point.
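The ordering described above can be sketched as follows: sort the 4 vertexes clockwise around their centroid, then rotate the list so that the starting point comes first. The function name and the centroid-angle sorting are assumptions made for illustration; the disclosure only states that the order may be clockwise or counterclockwise.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def order_clockwise(vertices: Sequence[Point], start: Point) -> List[Point]:
    """Return the vertexes in clockwise order, beginning at the starting point.

    Image coordinates are assumed (y grows downward), so sorting by
    increasing atan2 angle around the centroid is clockwise on screen.
    """
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    ordered = sorted(vertices, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    i = ordered.index(start)  # raises ValueError if start is not a vertex
    return ordered[i:] + ordered[:i]


# An upside-down slip: the text's upper left corner sits at the image's
# lower right, so the starting point is (10, 10).
print(order_clockwise([(0, 10), (10, 0), (0, 0), (10, 10)], start=(10, 10)))
```

The starting point itself must come from the detection result; the sketch only arranges vertexes that are already known.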

In another alternative embodiment, according to the obtained vertex information and the corresponding starting point information, the body area of the express waybill picture in any orientation can be corrected to the character-forward direction through affine transformation, so that a correction result is obtained. An affine transformation, also called an affine mapping, applies in geometry a linear transformation to a vector space, followed by a translation, mapping it into another vector space. Forward characters greatly reduce the difficulty of character detection and recognition and markedly improve character recognition precision, without manual correction.

Optionally, the correcting the target image based on the vertex coordinates and the sorting order of the vertex coordinates to obtain a correction result, further comprising: determining a target coordinate in the vertex coordinates according to the sorting sequence of the vertex coordinates, wherein the target coordinate is a starting point coordinate in the vertex coordinates; and correcting the object to be recognized based on the target coordinate and the vertex coordinate to obtain a correction result.

The target coordinates in the vertices described above may be the start point coordinates. Wherein, the starting point is the upper left corner on the premise of forward text.

By applying an affine transformation to the obtained four vertex coordinates (x1, y1), (x2, y2), (x3, y3) and (x4, y4), with (x1, y1) as the starting point, the main body area of the express waybill picture in any orientation can be corrected into a text-forward picture, which further improves the accuracy of recognizing the express waybill.
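Before a transformation can be estimated, the corrected picture needs target corner positions. One plausible choice, sketched below in Python, derives an upright rectangle from the edge lengths of the detected quadrilateral. This assumes the four vertices are listed clockwise beginning at the starting point; the function name `destination_rectangle` is hypothetical, and the disclosure does not state how the output size is chosen.

```python
import math

def destination_rectangle(vertices):
    """Corners of an upright rectangle sized from the quadrilateral's edges.

    `vertices` must hold four (x, y) tuples, clockwise, starting point first.
    """
    p1, p2, p3, p4 = vertices
    # Top and bottom edges give the width; right and left edges give the height.
    width = round(max(math.dist(p1, p2), math.dist(p4, p3)))
    height = round(max(math.dist(p2, p3), math.dist(p1, p4)))
    # The starting point maps to the upper-left corner of the corrected picture.
    return [(0, 0), (width, 0), (width, height), (0, height)]

# Upside-down waybill: the starting point (4, 2) is at the bottom-right of the image.
target = destination_rectangle([(4, 2), (0, 2), (0, 0), (4, 0)])
```

Pairing each detected vertex with the corner at the same index gives the point correspondences from which the correcting transformation can be estimated.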

Optionally, detecting the target image to obtain target pixel data includes: detecting the target image by using a detection model to obtain the target pixel data.

In an alternative embodiment, the detection model may be used to detect pixels of an object to be recognized in a target image, so as to obtain target pixel data.

In an optional embodiment, a plurality of target images can be detected simultaneously through the detection model, so that the detection efficiency can be greatly improved.

In another optional example, a plurality of parcels in one target image can be detected simultaneously through the detection model, so that the detection efficiency of the target image is further improved.

Optionally, a raw sample is obtained, wherein the raw sample comprises a sample image and sample coordinates corresponding to the sample image, the sample coordinates being vertex coordinates of an object to be identified in the sample image; sample pixel data is determined based on the sample image and the sample coordinates, wherein the sample pixel data is used for representing the position relation between pixels in the sample image and the object to be identified; training data is determined based on the sample pixel data and the sample image; and the initial model is trained based on the training data to obtain a detection model.

The sample image may be an image of an express parcel containing an object to be identified, wherein the object to be identified may be forward, upside down, tilted, or even distorted in the sample image.

The sample coordinates may be vertex coordinates of an object to be recognized in the sample image.

The sample pixel data may be a positional relationship between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized.

The training data is data for training the initial model.

In an optional embodiment, an original sample may be obtained, where the original sample may include a plurality of sample images and the sample coordinates corresponding to them. It should be noted that these sample coordinates may be labeled manually: after the plurality of sample images are obtained, the vertex coordinates in each sample image may be labeled manually to obtain the sample coordinates corresponding to that sample image, and the original sample is then generated from the sample images and their corresponding sample coordinates.

Further, at least one pixel in the sample image and the sample coordinates may be logically calculated to determine sample pixel data, and in order to reduce the amount of calculation, the pixel in the target area in the sample image and the sample coordinates may be logically calculated to determine sample pixel data. The training data may be determined from the plurality of sample images and sample pixel data corresponding to the plurality of sample images. Optionally, the sample pixel data and the sample image may be constructed as a sample pair, a plurality of sample pairs may be obtained, training data may be generated according to the plurality of sample pairs, and the initial model may be trained according to the training data to obtain the detection model.

Furthermore, the detection model can be used to detect the target image and obtain the target pixel data of the object to be recognized in the target image. Because the target pixel data includes the positional relationship between pixels and vertex coordinates, the pixels of the object to be recognized in the target image can be identified, and logic calculation on the target pixel data yields the sorting order of the vertex coordinates of the object to be recognized and the starting point coordinate among them. The object to be recognized can then be corrected according to the sorting order and the starting point coordinate so that it is displayed in the forward direction, which makes it easier to recognize and can greatly improve the detection efficiency of the target image.
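The disclosure does not spell out the logic calculation that turns target pixel data into vertex coordinates. One possible reading, sketched below in Python, assumes that each detected pixel carries eight offsets (dx, dy to each of the four vertices, starting point first), that a vertex estimate is the pixel position plus its offset, and that the estimates from all pixels are averaged. The function name `decode_vertices`, the offset layout and the averaging step are all assumptions made for illustration.

```python
def decode_vertices(pixel_offsets):
    """Average per-pixel vertex estimates.

    `pixel_offsets` maps (px, py) to [dx1, dy1, dx2, dy2, dx3, dy3, dx4, dy4],
    where each offset is assumed to be vertex minus pixel position.
    Returns four (x, y) tuples in the order given by the offset layout.
    """
    if not pixel_offsets:
        raise ValueError("no pixels were predicted inside the object")
    sums = [0.0] * 8
    for (px, py), offsets in pixel_offsets.items():
        for k in range(4):
            sums[2 * k] += px + offsets[2 * k]          # x estimate of vertex k
            sums[2 * k + 1] += py + offsets[2 * k + 1]  # y estimate of vertex k
    n = len(pixel_offsets)
    return [(sums[2 * k] / n, sums[2 * k + 1] / n) for k in range(4)]

# Two pixels inside the object that agree on the same four vertices.
predictions = {
    (6, 4): [-4, -2, 4, -2, 4, 2, -4, 2],
    (5, 4): [-3, -2, 5, -2, 5, 2, -3, 2],
}
decoded = decode_vertices(predictions)
```

Under this reading the sorting order and the starting point fall out of the fixed offset layout: the first decoded vertex is the starting point and the others follow in the predicted order.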

Optionally, determining sample pixel data based on the sample image and the sample coordinates comprises: acquiring a target area of an object to be identified in a sample image; and acquiring a difference value between the pixel in the target area and the sample coordinate, and determining sample pixel data.

In an optional embodiment, the target region may be the central region of the object to be recognized. The central region of the object to be recognized in the sample image may be obtained, and the difference between the pixels of the central region and the sample coordinates may be computed; optionally, this is the difference between the coordinate at which a pixel is located and each sample coordinate. The positional relationship between the pixel and the sample coordinates is determined from the difference, and the sample pixel data is determined accordingly.

The target area may be an area of the object to be recognized in which the characters are larger, an area in which the characters are clear, the central area of the object to be recognized, or an area containing a large amount of information. In an alternative embodiment, the target region may be regarded as a central Gaussian distribution region, where the Gaussian distribution is the normal distribution.

In the present disclosure, only the pixels in the Gaussian distribution region of the object to be recognized in the sample image are extracted, and the horizontal and vertical coordinate differences between these pixels and the 4 vertices in the sample coordinates are calculated to determine the sample pixel data. This greatly reduces the number of candidate positive samples that are output, thereby reducing the amount of calculation and improving detection performance.
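The construction of sample pixel data described above can be sketched in Python as follows. The sketch assumes a Gaussian weight centred on the centroid of the labeled quadrilateral, keeps only pixels whose weight exceeds a threshold, and stores the offset vertex minus pixel for each of the four vertices. The function name `sample_pixel_data`, the parameters `sigma_scale` and `threshold`, and the sign convention of the difference are assumptions; the disclosure only states that differences to the 4 vertices are computed for pixels in the Gaussian region.

```python
import math

def sample_pixel_data(vertices, width, height, sigma_scale=0.25, threshold=0.5):
    """Return {(px, py): [dx1, dy1, ..., dx4, dy4]} for pixels in the central region."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    cx, cy = sum(xs) / 4.0, sum(ys) / 4.0
    # Standard deviations scale with the object's extent (assumed heuristic).
    sx = max(sigma_scale * (max(xs) - min(xs)), 1e-6)
    sy = max(sigma_scale * (max(ys) - min(ys)), 1e-6)
    data = {}
    for py in range(height):
        for px in range(width):
            weight = math.exp(-0.5 * (((px - cx) / sx) ** 2 + ((py - cy) / sy) ** 2))
            if weight >= threshold:          # pixel is a candidate positive sample
                offsets = []
                for vx, vy in vertices:
                    offsets.extend((vx - px, vy - py))
                data[(px, py)] = offsets
    return data

# A 12x8 sample image with a labeled waybill whose vertices start at the upper left.
labels = [(2, 2), (10, 2), (10, 6), (2, 6)]
data = sample_pixel_data(labels, width=12, height=8)
```

Only a small patch around the centre (6, 4) is kept, so far fewer than the 96 pixels of the image become positive samples, which is the reduction in calculation the paragraph above refers to.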

In the disclosure, a target image is acquired, wherein the target image comprises an object to be identified; detecting a target image to obtain target pixel data, wherein the target pixel data is used for representing the position relation between at least one pixel in an object to be identified and the vertex coordinates of the object to be identified; and correcting the target image based on the target pixel data to obtain a correction result. When an object to be recognized in a target image is detected, pixel-level direction supervision can be introduced, the target image is detected through a detection model, the position relation between pixels in the object to be recognized and vertex coordinates in the object to be recognized can be obtained, the target image is processed through position information in target pixel data, a more accurate object image can be obtained, and therefore the accuracy of the obtained object image is improved.

Fig. 6c is a flowchart of an image processing method according to a third embodiment of the present disclosure, as shown in fig. 6c, the method comprising the steps of:

Step S601, obtaining an original sample, where the original sample includes a sample image and sample coordinates corresponding to the sample image, the sample coordinates being vertex coordinates of an object to be identified in the sample image.

Step S602, determining sample pixel data based on the sample image and the sample coordinates, wherein the sample pixel data is used for representing a position relationship between a pixel in the sample image and the object to be identified.

Step S603, determining training data based on the sample pixel data and the sample image.

Step S604, training the initial model based on the training data to obtain a detection model.

Step S605, a target image is acquired, wherein the target image includes an object to be recognized.

Step S606, detecting the target image to obtain target pixel data, where the target pixel data is used to represent a position relationship between at least one pixel in the object to be recognized and a vertex coordinate of the object to be recognized.

Step S607, correcting the target image based on the target pixel data to obtain a correction result.

Step S608, recognizing the target image based on the correction result to obtain a recognition result, where the recognition result is used to represent text information of the object to be recognized in the target image.
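The inference part of the flow (steps S605 to S608) can be summarized as a small Python sketch in which each stage is passed in as a callable. The names `run_pipeline`, `detect`, `correct` and `recognize` are hypothetical placeholders for the detection model, the correction step and the text recognizer; they are not interfaces defined by the disclosure.

```python
def run_pipeline(image, detect, correct, recognize):
    """Chain detection, correction and recognition on one acquired target image."""
    pixel_data = detect(image)               # S606: obtain target pixel data
    corrected = correct(image, pixel_data)   # S607: obtain the correction result
    return recognize(corrected)              # S608: obtain the recognition result

# Stand-in stages, used only to show how data flows between the steps.
result = run_pipeline(
    "img",                                    # S605: the acquired target image
    detect=lambda im: {"pixels": 1},
    correct=lambda im, data: im.upper(),
    recognize=lambda corrected: "text:" + corrected,
)
```

Keeping the stages as separate callables mirrors the module split of the apparatus, so any stage can be replaced without touching the others.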

In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other handling of users' personal information all comply with relevant laws and regulations and do not violate public order and good customs.

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions to enable a terminal device (which may be a mobile phone, a computer, a server, or a network device) to execute the methods of the embodiments of the present disclosure.

The present disclosure also provides an image processing apparatus, which is used to implement the above embodiments and preferred embodiments, and the description of the apparatus that has been already made is omitted. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.

Fig. 7 is a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in fig. 7, the image processing apparatus 700 includes: an obtaining module 701, a detecting module 702, and a correcting module 703.

An obtaining module 701, configured to obtain a target image, where the target image includes an object to be identified;

a detection module 702, configured to detect a target image to obtain target pixel data, where the target pixel data is used to represent a position relationship between at least one pixel in an object to be recognized and a vertex coordinate of the object to be recognized;

the correcting module 703 is configured to correct the target image based on the target pixel data, so as to obtain a correction result.

Optionally, the correction module 703 includes: a first determination unit configured to determine vertex coordinates of an object to be recognized and a sorting order of the vertex coordinates based on target pixel data; and the correction unit is used for correcting the target image based on the vertex coordinates and the sequencing order of the vertex coordinates to obtain a correction result.

Optionally, the correction unit includes: a determining subunit, configured to determine a target coordinate among the vertex coordinates according to the sorting order of the vertex coordinates, wherein the target coordinate is the starting point coordinate among the vertex coordinates; and a corrector subunit, configured to correct the object to be recognized based on the target coordinate and the vertex coordinates to obtain a correction result.

Optionally, the detection module includes: a detection unit, configured to detect the target image by using the detection model to obtain target pixel data.

Optionally, the detection module includes: an obtaining unit, configured to obtain an original sample, wherein the original sample includes a sample image and sample coordinates corresponding to the sample image, the sample coordinates being vertex coordinates of an object to be identified in the sample image; a second determination unit, configured to determine sample pixel data based on the sample image and the sample coordinates, wherein the sample pixel data is used to represent a positional relationship between a pixel in the sample image and the object to be identified; the second determination unit is further configured to determine training data based on the sample pixel data and the sample image; and the second determination unit is further configured to train the initial model based on the training data to obtain a detection model.

Optionally, the second determining unit includes: the acquisition subunit is used for acquiring a target area of an object to be identified in the sample image; the obtaining subunit is further configured to obtain a difference between the pixel in the target region and the sample coordinate, and determine sample pixel data.

Optionally, the apparatus further comprises: an identification module, configured to identify the target image based on the correction result to obtain an identification result, wherein the identification result is used for representing the text information of the object to be identified in the target image.

It should be noted that the above modules may be implemented by software or hardware. For the latter, this may be achieved in the following ways, but is not limited to them: the modules are all located in the same processor; or the modules are located in different processors in any combination.

According to an embodiment of the present disclosure, there is also provided an electronic device including a memory having stored therein computer instructions and at least one processor configured to execute the computer instructions to perform the steps in any of the above method embodiments.

Optionally, the electronic device may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.

Alternatively, in the present disclosure, the processor may be configured to execute the following steps by a computer program:

S1, acquiring a target image, wherein the target image comprises an object to be identified;

S2, detecting the target image to obtain target pixel data, wherein the target pixel data is used for expressing the position relation between at least one pixel in the object to be recognized and the vertex coordinates of the object to be recognized;

S3, correcting the target image based on the target pixel data to obtain a correction result.

Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.

According to an embodiment of the present disclosure, there is also provided a non-transitory computer readable storage medium having stored therein computer instructions, wherein the computer instructions are arranged to perform the steps in any of the above method embodiments when executed.

Alternatively, in the present embodiment, the above-mentioned non-transitory computer readable storage medium may be configured to store a computer program for executing the steps of:

S1, acquiring a target image, wherein the target image comprises an object to be recognized;

Alternatively, in the present embodiment, the non-transitory computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The present disclosure also provides a computer program product according to an embodiment of the present disclosure. Program code for implementing the image processing methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the above embodiments of the present disclosure, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present disclosure, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a Read Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

The foregoing is merely a preferred embodiment of the present disclosure, and it should be noted that modifications and embellishments could be made by those skilled in the art without departing from the principle of the present disclosure, and these should also be considered as the protection scope of the present disclosure.

Claims

1. An image processing method, comprising:

acquiring a target image, wherein the target image comprises an object to be identified;

detecting the target image to obtain target pixel data, wherein the target pixel data is used for representing the position relation between at least one pixel in the object to be identified and the vertex coordinates of the object to be identified;

and correcting the target image based on the target pixel data to obtain a correction result.

2. The method of claim 1, wherein said correcting the target image based on the target pixel data to obtain a correction result comprises:

determining the vertex coordinates of the object to be recognized and the sorting order of the vertex coordinates based on the target pixel data;

and correcting the target image based on the vertex coordinates and the sequencing order of the vertex coordinates to obtain the correction result.

3. The method of claim 2, wherein the correcting the target image based on the vertex coordinates and the sorted order of the vertex coordinates to obtain a correction result comprises:

determining a starting point coordinate in the vertex coordinates according to the sorting sequence of the vertex coordinates;

and correcting the object to be recognized based on the starting point coordinate and the vertex coordinate to obtain the correction result.

4. The method of claim 1, wherein the method further comprises:

obtaining a raw sample, wherein the raw sample comprises: a sample image and sample coordinates corresponding to the sample image, the sample coordinates being the vertex coordinates of the object to be identified in the sample image;

determining sample pixel data based on the sample image and the sample coordinates, wherein the sample pixel data is used for representing the position relation between the pixels in the sample image and the object to be identified;

determining training data based on the sample pixel data and the sample image;

training an initial model based on the training data to obtain a detection model;

detecting the target image to obtain target pixel data, including:

and detecting the target image by using the detection model to obtain target pixel data.

5. The method of claim 4, wherein the determining sample pixel data based on the sample image and the sample coordinates comprises:

acquiring a target area of an object to be identified in the sample image;

and acquiring a difference value between the pixel in the target area and the sample coordinate, and determining the sample pixel data.

6. The method of claim 1, wherein the method further comprises:

and identifying the target image based on the correction result to obtain an identification result, wherein the identification result is used for representing text information of the object to be identified in the target image.

7. An image processing apparatus, comprising:

an acquisition module, configured to acquire a target image, wherein the target image comprises an object to be recognized;

the detection module is used for detecting the target image to obtain target pixel data, wherein the target pixel data is used for representing the position relation between at least one pixel in the object to be identified and the vertex coordinates of the object to be identified;

and the correction module is used for correcting the target image based on the target pixel data to obtain a correction result.

8. The apparatus of claim 7, wherein the correction module comprises:

a first determination unit configured to determine vertex coordinates of the object to be recognized and a sorting order of the vertex coordinates based on the target pixel data;

and the correction unit is used for correcting the target image based on the vertex coordinates and the sequencing order of the vertex coordinates to obtain the correction result.

9. The apparatus of claim 8, wherein the correction unit comprises:

the determining subunit is configured to determine a target coordinate in the vertex coordinates according to a sorting order of the vertex coordinates, where the target coordinate is a start point coordinate in the vertex coordinates;

and the corrector subunit is used for correcting the object to be recognized based on the target coordinate and the vertex coordinate to obtain the correction result.

10. The apparatus of claim 7, wherein the detection module comprises:

an obtaining unit configured to obtain an original sample, wherein the original sample includes: a sample image and sample coordinates corresponding to the sample image, the sample coordinates being the vertex coordinates of the object to be identified in the sample image;

a second determination unit, configured to determine sample pixel data based on the sample image and the sample coordinates, where the sample pixel data is used to represent a positional relationship between a pixel in the sample image and the object to be recognized;

the second determination unit is further configured to determine training data based on the sample pixel data and the sample image;

the second determining unit is further configured to train the initial model based on the training data to obtain a detection model;

wherein, the detection module further comprises:

and the detection unit is used for detecting the target image by using the detection model to obtain target pixel data.

11. The apparatus of claim 10, wherein the second determining unit comprises:

the acquisition subunit is used for acquiring a target area of an object to be identified in the sample image;

the obtaining subunit is further configured to obtain a difference between the pixel in the target region and the sample coordinate, and determine the sample pixel data.

12. The apparatus of claim 7, wherein the apparatus further comprises:

and the identification module is used for identifying the target image based on the correction result to obtain an identification result, wherein the identification result is used for representing the text information of the object to be identified in the target image.

13. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.

14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.

15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.