CN115756461A - Annotation template generation method, image identification method and device and electronic equipment - Google Patents

Publication number
CN115756461A
CN115756461A (application CN202211477254.3A)
Authority
CN
China
Prior art keywords
image
target
interface
template
user
Prior art date
Legal status
Pending
Application number
CN202211477254.3A
Other languages
Chinese (zh)
Inventor
刘会霞
郑邦东
胡雅伦
熊博颖
谢小容
Current Assignee
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date
Filing date
Publication date
Application filed by CCB Finetech Co Ltd filed Critical CCB Finetech Co Ltd
Priority to CN202211477254.3A
Publication of CN115756461A

Abstract

The disclosure provides an annotation template generation method, relates to the field of computer technology, and can be applied to the field of financial technology. The method includes: in response to receiving an annotation template configuration instruction from a user, determining a first interface to be displayed according to the configuration instruction; in response to receiving an image from the user, determining a second interface to be displayed according to the image; displaying a configuration interface to the user according to the first interface to be displayed and the second interface to be displayed; in response to receiving an annotation template generation instruction from the user, determining the user's operation information for the target image on the configuration interface; and generating, based on the operation information, a target annotation template corresponding to the image, where the target annotation template is used for image recognition. The present disclosure also provides an image recognition method, apparatus, device, storage medium, and program product.

Description

Annotation template generation method, image identification method and device and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and may be applied to the field of financial technologies, and more particularly, to an annotation template generation method, an image recognition method, an apparatus, a device, a medium, and a program product.
Background
Existing image classification applications mainly adopt deep-learning-based image recognition, which requires developers to prepare a large number of real images as training data to train an image classification model.
With existing annotation template generation methods, developers must either train a dedicated deep learning model or hand-craft a template, and specific post-processing code must be written and redeployed for each image type; this is usually time-consuming and inefficient for development. In addition, the template pictures required by different models come in various sizes, their resolutions are not fixed, and the pictures may be captured at various angles, all of which makes template making inconvenient.
Disclosure of Invention
In view of the foregoing, the present disclosure provides an annotation template generation method, an image recognition method, an apparatus, a device, a medium, and a program product. With the annotation template generation method, a user can upload an image of any size. Before the target template generation area is generated, the size of the target image in the second interface to be displayed, and of the corresponding target template generation area, is determined from the size of the uploaded image. On the one hand, this helps ensure the clarity of the target image, avoiding both the blurring caused when the target template generation area is too small and the scrolling required when it is too large. On the other hand, each pixel in the target template generation area corresponds to a coordinate on the target image, so the coordinate information in the annotation template is determined accurately. A target annotation template corresponding to the image is then generated based on the user's operation information. As a result, configuring a template is more convenient and flexible for the user, and the resulting annotation template is more accurate.
According to a first aspect of the present disclosure, there is provided an annotation template generation method, including: responding to a received annotation template configuration instruction from a user, and determining a first interface to be displayed according to the configuration instruction; in response to receiving the image from the user, determining a second interface to be displayed according to the image; the second interface to be displayed comprises a target image corresponding to the image and a target template generating area corresponding to the target image; displaying a configuration interface to a user according to the first interface to be displayed and the second interface to be displayed; in response to receiving an annotation template generation instruction from the user, determining operation information of the user on the configuration interface aiming at the target image; and generating a target labeling template corresponding to the image based on the operation information, wherein the target labeling template is used for image recognition.
According to an embodiment of the present disclosure, the determining, in response to receiving the image from the user, a second interface to be presented according to the image includes: determining the size of the target image according to the size of the image and the size of the first interface to be displayed; and processing the image based on the size of the target image to obtain the target image in the second interface to be displayed.
According to an embodiment of the present disclosure, the processing the image based on the size of the target image to obtain the target image in the second interface to be displayed includes: determining a zoom ratio of the image based on a size of the target image; and under the condition that the zoom ratio is determined to be beyond the preset range, adjusting the zoom ratio of the image to obtain the target image.
According to the embodiment of the disclosure, the configuration interface includes a configuration item, and the determining, in response to receiving an annotation template generation instruction from the user, operation information of the user on the configuration interface for the target image includes: in response to receiving a notification that the user selects the configuration item, monitoring a frame selection operation of the user in the target template generation area by adopting the configuration item; and determining a target identification area in the target labeling template based on the first coordinate information of the area corresponding to the frame selection operation.
According to an embodiment of the present disclosure, the method further comprises: identifying the target image, and determining character information in the target image and second coordinate information corresponding to each character information; and determining a target field corresponding to the target identification area in the target labeling template according to the matching relation between the first coordinate information and the second coordinate information.
According to the embodiment of the present disclosure, the annotation template configuration instruction includes an instruction generated according to the predetermined item selected by the user, and the determining, according to the configuration instruction, the first interface to be displayed in response to receiving the annotation template configuration instruction from the user includes: and determining a first interface to be displayed matched with the preset item according to the configuration instruction.
According to an embodiment of the present disclosure, the method further comprises: acquiring a configuration component, wherein the configuration component is used for generating an interactive interface; wherein the interactive interface comprises the configuration interface, and the configuration interface comprises configuration items.
A second aspect of the present disclosure provides an image recognition method, including: acquiring an image to be recognized; and obtaining, for the image to be recognized, a recognition result based on the target annotation template, where the target annotation template is obtained according to the method provided by the present disclosure.
A third aspect of the present disclosure provides an annotation template generation apparatus, including: the first determining module is used for responding to a received annotation template configuration instruction from a user and determining a first interface to be displayed according to the configuration instruction; the second determining module is used for responding to the image received from the user and determining a second interface to be displayed according to the image; the second interface to be displayed comprises a target image corresponding to the image and a target template generating area corresponding to the target image; the display module is used for displaying a configuration interface to a user according to the first interface to be displayed and the second interface to be displayed; a third determining module, configured to determine, in response to receiving an annotation template generation instruction from the user, operation information of the user on the configuration interface for the target image; and the generating module is used for generating a target labeling template corresponding to the image based on the operation information, and the target labeling template is used for image recognition.
A fourth aspect of the present disclosure provides an image recognition apparatus, including: an acquisition module for acquiring an image to be recognized; and a recognition module for obtaining, for the image to be recognized, a recognition result based on the target annotation template, where the target annotation template is obtained according to the apparatus provided by the present disclosure.
A fifth aspect of the present disclosure provides an electronic device, comprising: one or more processors; memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above disclosed method.
A sixth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-disclosed method.
A seventh aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the method disclosed above.
According to the annotation template generation method provided by the disclosure, a user can upload an image of any size. Before the target template generation area is generated, the size of the target image in the second interface to be displayed, and of the corresponding target template generation area, is determined from the size of the uploaded image. On the one hand, this helps ensure the clarity of the target image, avoiding both the blurring caused when the target template generation area is too small and the scrolling required when it is too large. On the other hand, each pixel of the target template generation area corresponds to a coordinate on the target image, so the coordinate information in the annotation template is determined accurately. A target annotation template corresponding to the image is then generated based on the user's operation information. As a result, configuring a template is more convenient and flexible for the user, and the resulting annotation template is more accurate.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of an annotation template generation method, an image recognition method, apparatus, device, medium, and program product according to embodiments of the present disclosure;
FIG. 2 schematically illustrates a flow chart of an annotation template generation method according to an embodiment of the present disclosure;
FIG. 3 schematically shows a flow chart of an image recognition method according to an embodiment of the present disclosure;
FIG. 4 is a block diagram schematically illustrating the structure of an annotation template generation apparatus according to an embodiment of the present disclosure;
fig. 5 schematically shows a block diagram of the structure of an image recognition apparatus according to an embodiment of the present disclosure; and
fig. 6 schematically illustrates a block diagram of an electronic device adapted to implement an annotation template generation method and/or an image recognition method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "A, B and at least one of C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include, but not be limited to, systems that have a alone, B alone, C alone, a and B together, a and C together, B and C together, and/or A, B, C together, etc.).
The embodiment of the disclosure provides a method and a device for generating an annotation template, wherein a first interface to be displayed is determined according to a configuration instruction in response to receiving an annotation template configuration instruction from a user; in response to receiving the image from the user, determining a second interface to be displayed according to the image; the second interface to be displayed comprises a target image corresponding to the image and a target template generation area corresponding to the target image; displaying a configuration interface to a user according to the first interface to be displayed and the second interface to be displayed; in response to receiving an annotation template generation instruction from a user, determining operation information of the user on a configuration interface aiming at a target image; and generating a target labeling template corresponding to the image based on the operation information, wherein the target labeling template is used for image identification.
Fig. 1 schematically illustrates an application scenario diagram of an annotation template generation method, an image recognition method, an apparatus, a device, a medium, and a program product according to embodiments of the present disclosure.
As shown in fig. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may use terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the annotation template generation method and/or the image recognition method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the annotation template generation device and/or the image recognition device provided by the embodiment of the present disclosure can be generally disposed in the server 105. The annotation template generation method and/or the image recognition method provided by the embodiment of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the annotation template generation device and/or the image recognition device provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.
The annotation template generation method of the disclosed embodiment will be described in detail below with reference to fig. 2 based on the scenario described in fig. 1.
FIG. 2 schematically shows a flow chart of an annotation template generation method according to an embodiment of the present disclosure.
As shown in FIG. 2, the embodiment includes operations S210-S250, and the annotation template generation method can be executed by a server.
In the technical solution of the present disclosure, the acquisition, collection, storage, use, processing, transmission, provision, disclosure, application, and other handling of data all comply with relevant laws and regulations, necessary security measures have been taken, and public order and good morals are not violated.
In operation S210, in response to receiving a configuration instruction of the annotation template from the user, a first interface to be displayed is determined according to the configuration instruction.
The first interface to be displayed may include a display area for showing the target image and a template generation area; an operation bar and a template generation area can be configured within it. Because different recognition models require different templates, the interface determined in this step may not yet include a template generation area.
In operation S220, in response to receiving the image from the user, determining a second interface to be presented according to the image; the second interface to be displayed comprises a target image corresponding to the image and a target template generating area corresponding to the target image.
The user can upload an image of any size; in this step, the size of the target image is adjusted according to the size of the uploaded image.

The size of the target template generation area can be configured according to the size of the target image. The target template generation area can be understood as a canvas on which the user performs content annotation operations in order to generate the annotation template.
In operation S230, a configuration interface is presented to a user according to the first interface to be presented and the second interface to be presented.
The configuration interface includes the first interface to be displayed at the bottom layer, an operation bar for user configuration, a template generation area for user operation, and the target image displayed to the user; for example, the user can frame-select a recognition area within the template generation area.
The target template generation area is arranged in the first interface to be displayed, and the area of the target template generation area is smaller than the first interface to be displayed.
In operation S240, in response to receiving an annotation template generation instruction from a user, operation information of the user on a configuration interface for a target image is determined.
The operation information may include the user's frame-selection on the configuration interface; adjusting, enlarging, and dragging the target area and the target image; clearing a drawn frame; and the like.

The user can autonomously draw the template and configure the recognition area, its field name, and the post-processing rules.
In operation S250, a target annotation template corresponding to the image is generated based on the operation information, and the target annotation template is used for image recognition.
The template generation area corresponding to the operation information is taken as the target annotation template configured by the user and is automatically converted into a template file, which is subsequently used to classify and recognize images.
Current image recognition applications for a specific picture type involve two steps: (1) determining whether the picture is of the specified type, and (2) recognizing the picture content. For the first step there are two approaches: the developer trains a model, or the developer configures a template; only the template approach is described here. In the template approach, a developer draws an image template targeted at the characteristics of the image (for example, the layout of field names in a bill) and then classifies images with a template-matching algorithm. Template making mainly involves: selecting a standard picture for each image type; drawing the template according to the image's characteristics; recording the text content and coordinates of each positioning field and each recognition area in the picture that can serve as an image feature; and finally saving these as a template file. N image types yield N template files.
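As a rough illustration of the template file described above, the sketch below serializes one image type's positioning fields and recognition areas, each with its text and coordinates. The structure and names (`anchorFields`, `recognitionAreas`, etc.) are illustrative assumptions, not the patent's actual file format.

```javascript
// Hypothetical sketch of a template file for one image type. Each
// positioning field and each recognition area records text content and
// coordinates, as the description above requires.
function buildTemplateFile(name, anchors, regions) {
  return JSON.stringify({
    templateName: name,
    // Positioning fields: fixed text used to classify the picture by
    // template matching.
    anchorFields: anchors,
    // Recognition areas: regions whose content is extracted after
    // classification succeeds.
    recognitionAreas: regions,
  }, null, 2);
}

const file = buildTemplateFile(
  "vat_invoice", // assumed example type; N types would yield N such files
  [{ text: "Invoice No.", x: 40, y: 20, w: 120, h: 24 }],
  [{ field: "amount", x: 300, y: 180, w: 90, h: 24 }]
);
```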
Determining the second interface to be displayed according to the image allows the user to upload an image of any size. For example, when the user selects an image as the template picture, the aspect ratio of the template generation area is set equal to that of the target image and the canvas size is adjusted dynamically, so that the target image fills the entire template generation area. This avoids coordinates in the template generation area that fall outside the target image and reduces interference from useless data.
The target annotation template can be used for image recognition. For example, it may record the coordinates and text content of each positioning field and recognition area in each type of picture.
According to the annotation template generation method provided by this embodiment, the user can upload an image of any size. Before the target template generation area is generated, the size of the target image in the second interface to be displayed, and of the corresponding target template generation area, is determined from the size of the uploaded image. On the one hand, this helps ensure the clarity of the target image, avoiding both the blurring caused when the target template generation area is too small and the scrolling required when it is too large. On the other hand, each pixel of the target template generation area corresponds to a coordinate on the target image, so the coordinate information in the annotation template is determined accurately. A target annotation template corresponding to the image is then generated based on the user's operation information. As a result, configuring a template is more convenient and flexible for the user, and the resulting annotation template is more accurate.
In response to receiving the image from the user, determining a second interface to be presented according to the image, including: determining the size of a target image according to the size of the image and the size of the first interface to be displayed; and processing the image based on the size of the target image to obtain the target image in the second interface to be displayed.
For example, after the image is received from the user, its size (the height and width of the original picture) is obtained. Then, to ensure that the image can be displayed completely and clearly in the first interface to be displayed, the size of the target image is determined from the size of the image and the size of the first interface to be displayed: scaleX = (display area width - operation bar width) / picture width, and scaleY = display area height / picture height. If the resulting size meets the requirement, (scaleX, scaleY) is taken as the size of the target image, and the image is processed accordingly to obtain the target image in the second interface to be displayed.
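The fit computation described above can be sketched as a small helper. The parameter names (`displayW`, `toolbarW`, etc.) are illustrative assumptions; only the two ratio formulas come from the description.

```javascript
// Fit the uploaded image into the display area of the first interface to
// be displayed, reserving room for the operation bar:
//   scaleX = (display area width - operation bar width) / picture width
//   scaleY =  display area height / picture height
function fitScale(imgW, imgH, displayW, displayH, toolbarW) {
  const scaleX = (displayW - toolbarW) / imgW; // horizontal fit ratio
  const scaleY = displayH / imgH;              // vertical fit ratio
  return { scaleX, scaleY };
}

// e.g. a 2000x1000 picture in a 1280x800 display area with a 280px bar
const { scaleX, scaleY } = fitScale(2000, 1000, 1280, 800, 280);
// scaleX = 0.5, scaleY = 0.8
```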
The annotation template generation method provided by this embodiment makes full use of the display area of the first interface to be displayed. Meanwhile, since the target template generation area corresponds to the target image, processing the image based on the size of the target image to obtain the target image in the second interface to be displayed ensures the clarity of the target template generation area and avoids an unclear background caused by an image resolution that is too large or too small.
Processing the image based on the size of the target image to obtain a target image in a second interface to be displayed, including: determining a scaling ratio of the image based on the size of the target image; and under the condition that the zoom ratio is determined to be beyond the preset range, adjusting the zoom ratio of the image to obtain the target image.
For example, scaleX and scaleY are computed from the size of the received image and the size of the first interface to be displayed, as described above. If these ratios do not meet the requirement, the smaller of the two is taken as the scaling ratio (scale) of the original picture and compared with a preset range: a scale above the maximum of the range is set to the maximum, and a scale below the minimum is set to the minimum. The height of the template generation area is then set to the image height multiplied by the final scaling ratio, and its width to the image width multiplied by the final scaling ratio. In this way, the size of the target image and the size of the target template generation area are both determined.
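The clamping step can be sketched as follows. The [0.2, 3] range is an illustrative assumption; the description only says "preset range".

```javascript
// Take the smaller fit ratio as the picture's scale, clamp it to a preset
// range, then size the template generation area (canvas) from the clamped
// scale so it keeps the image's aspect ratio.
function templateAreaSize(imgW, imgH, scaleX, scaleY, min = 0.2, max = 3) {
  let scale = Math.min(scaleX, scaleY);        // keep the whole image visible
  scale = Math.min(Math.max(scale, min), max); // clamp to the preset range
  return { scale, canvasW: imgW * scale, canvasH: imgH * scale };
}

const area = templateAreaSize(2000, 1000, 0.5, 0.8);
// scale = 0.5 → 1000x500 canvas, same aspect ratio as the 2000x1000 image
```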
In the annotation template generation method provided by this embodiment, the range of the zoom ratio of the original picture is limited in order to ensure display clarity, which helps avoid both the blurring caused when the target template generation area is too small and the scrolling required when it is too large.
The configuration interface includes configuration items, and determining, in response to receiving an annotation template generation instruction from the user, the user's operation information for the target image on the configuration interface includes: in response to receiving a notification that the user has selected a configuration item, monitoring the user's frame-selection operation in the target template generation area using the configuration item; and determining a target recognition area in the target annotation template based on the first coordinate information of the area corresponding to the frame-selection operation.
The configuration items can support functions such as enlarging, reducing, restoring, dragging, rotating, and cropping the template generation area, so that the user can adjust the template generation areas generated for images of different resolutions and angles, improving the flexibility of the annotation operation.
For example, when the user selects the "frame" configuration item, a corresponding JavaScript script monitors the user's mouse operations. When the user presses the mouse button, the initial coordinate point is recorded; as the mouse moves, a rectangle is drawn on the template generation area (i.e., the canvas), with the initial point as the starting point of its diagonal and the coordinate of the current mouse position as the end point; when the user releases the mouse button, drawing of the recognition frame is complete. The script monitors the rectangle-drawing operation and transmits the coordinate values of the rectangle to a server for storage, thereby determining the target identification area in the target annotation template.
For example, if the coordinate values of the start point are (x1, y1) and those of the end point are (x2, y2), the width of the area corresponding to the frame selection operation is the absolute value of the difference between x1 and x2, and the height is the absolute value of the difference between y1 and y2.
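The computation above can be sketched as a small helper that turns the diagonal start and end points of the drag into the region's coordinate information; only the width/height formulas come from the text, while the returned object shape and the function name are illustrative assumptions:

```javascript
// Hypothetical helper: convert a drag's start and end points into the
// first coordinate information of the frame-selected area.
function regionFromDrag(start, end) {
  return {
    x: Math.min(start.x, end.x),        // top-left corner of the rectangle
    y: Math.min(start.y, end.y),
    width: Math.abs(end.x - start.x),   // |x1 - x2|
    height: Math.abs(end.y - start.y),  // |y1 - y2|
  };
}
```

A mouseup handler would call this with the recorded initial point and the release point, then send the result to the server for storage.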
Further, when the script detects that the rectangle has been drawn, the user can be prompted to select a field name for the target identification area. The field name may come from the predetermined item selected when the annotation template configuration instruction was generated, for example a field name or field model built into a document under that predetermined item.
The identification area corresponds to the target frame formed by the user's frame selection operation, and the target frame serves as the target identification area. During image recognition, the identification area determines which region of the image, such as a text region, needs to be recognized.
According to the annotation template generation method provided by this embodiment, the user can frame-select regions in the target template generation area as needed, allowing each identification area to be configured flexibly and helping the user annotate identification areas on the uploaded image. Performing image recognition on the target identification areas improves recognition accuracy and the development efficiency of the annotation template.
The annotation template generation method further includes: recognizing the target image, and determining text information in the target image and second coordinate information corresponding to each piece of text information; and determining a target field corresponding to the target identification area in the target annotation template according to the matching relationship between the first coordinate information and the second coordinate information.
For example, the target field may include a positioning field, i.e., a fixed text area in each type of image.
Because the target image corresponds to the target template generation area, the target field corresponding to the target identification area in the target annotation template can be determined from the matching relationship between the first coordinate information and the second coordinate information, thereby determining the text information.
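The matching step can be sketched as follows; the text only states that the two sets of coordinate information are matched, so deciding the match by whether a recognized text box's center falls inside the identification region is an assumption, as are the function and field names:

```javascript
// Match recognized text boxes (second coordinate information) against a
// frame-selected identification region (first coordinate information).
// A text box matches when its center point lies inside the region.
function matchFieldToRegion(region, textBoxes) {
  const hits = textBoxes.filter((t) => {
    const cx = t.x + t.width / 2;
    const cy = t.y + t.height / 2;
    return cx >= region.x && cx <= region.x + region.width &&
           cy >= region.y && cy <= region.y + region.height;
  });
  // Concatenate the matched text as the field's content
  return hits.map((t) => t.text).join('');
}
```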
The method for generating the annotation template provided by the embodiment can determine the target field corresponding to the target identification area in the target annotation template, thereby facilitating the positioning of the field during image identification.
The annotation template configuration instruction includes an instruction generated according to a predetermined item selected by the user, and determining the first interface to be displayed according to the configuration instruction, in response to receiving the annotation template configuration instruction from the user, includes: determining, according to the configuration instruction, a first interface to be displayed that matches the predetermined item.
The predetermined items may be projects and documents. For example, before creating a template, the user may first select any built-in document within any project. Projects are used to isolate the different data model requirements of different customers: for instance, one bank requires identity cards, bank cards, and receipts, while another bank requires only bank cards and receipts, and the two customers are distinguished by establishing project a and project b. Documents are used to preset general document field information. For example, a document "identity card" can be built under project a with its commonly used fields: name, sex, and identity card number; and a document "bank card" can be built under project a with its commonly used fields: card number and bank name.
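One illustrative way to organize the projects and documents described above is a nested structure mapping each project to its documents and each document to its preset fields; the exact data model is not fixed by the text, so this shape is an assumption:

```javascript
// Illustrative preset data, mirroring the example in the text:
// projects isolate customers, documents preset common field names.
const projects = {
  'project a': {
    'identity card': ['name', 'sex', 'identity card number'],
    'bank card': ['card number', 'bank name'],
  },
  'project b': {
    // a second customer's documents are kept isolated here
  },
};
```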
Selecting preset projects, documents, and the like isolates data, so that the user can set corresponding annotation templates for different customers and different demand scenarios, increasing the diversity of the data.
According to the annotation template generation method provided by this embodiment, the first interface to be displayed that matches the predetermined item is determined according to the configuration instruction, which isolates data and generates a matching first interface to be displayed, so that a matching configuration interface is presented.
The method for generating the labeling template further comprises the following steps: acquiring a configuration component, wherein the configuration component is used for generating an interactive interface; the interactive interface comprises a configuration interface, and the configuration interface comprises configuration items.
For example, the configuration component may be the canvas component. It is understood that canvas is a tag introduced in HTML5 for generating images in real time on a web page and manipulating their content: it defines a bitmap drawing API that supports scripted client-side drawing operations, and the bitmaps can be manipulated in JavaScript.
Interaction with the user is achieved through the configuration component, for example by providing an interactive interface. The interactive interface may be the configuration interface displayed to the user; an operation bar may be provided in the configuration interface, and the operation bar may include a plurality of configuration items, such as drawing a frame, adjusting, magnifying, dragging, selecting, and clearing, to facilitate the user's configuration of the annotation template.
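As an illustration, the operation bar can be modeled as a list of configuration items that switch the canvas interaction mode; the item names mirror the text, while the function name and dispatch mechanism are assumptions:

```javascript
// Configuration items offered by the operation bar (names from the text).
const configItems = ['frame', 'adjust', 'magnify', 'drag', 'select', 'clear'];
let activeMode = null;

function selectConfigItem(item) {
  if (!configItems.includes(item)) {
    throw new Error(`unknown configuration item: ${item}`);
  }
  activeMode = item; // subsequent mouse events on the canvas are routed by mode
  return activeMode;
}
```

Selecting "frame", for example, would put the canvas into the box-selection mode monitored by the drawing script.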
According to the annotation template generation method provided by this embodiment, the interactive interface and the configuration items are provided through the configuration component, improving the convenience and flexibility of configuring the annotation template.
Fig. 3 schematically shows a flow chart of an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 3, the embodiment includes operations S310 to S320, and the image recognition method may be performed by a server.
In the technical scheme of the disclosure, the data acquisition, collection, storage, use, processing, transmission, provision, disclosure, application and other processing are all in accordance with the regulations of relevant laws and regulations, necessary security measures are taken, and the public order and good custom are not violated.
In operation S310, an image to be recognized is acquired.
In operation S320, for the image to be recognized, a recognition result of the image to be recognized is obtained based on the target annotation template, where the target annotation template is obtained according to the annotation template generation method described above.
For example, after the user uploads a picture to be recognized, the text content of the whole picture can first be recognized with a general image recognition model. The coordinates and text content of the positioning fields are then located according to the target annotation template: the text identical to each positioning field in the target annotation template is searched for in the known recognition result, yielding the real coordinates of each positioning field in the uploaded picture. Using the template coordinates and real coordinates of all positioning fields together with the template coordinates of each identification area, the real coordinates of the identification areas in the uploaded image and their recognition results are computed by means such as linear regression and rectangular-area calculation. Finally, using the real coordinates and recognition results of each identification area, a template matching classification algorithm is invoked to classify the image, and secondary recognition is performed according to the model and post-processing rules configured in the template to obtain the final recognition result.
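One plausible reading of the linear-regression step is a per-axis least-squares fit (scale and offset) from template coordinates to real coordinates over the located positioning fields, which is then applied to each identification area. The sketch below is built on that assumption; the text does not fix the exact regression formula, and all names are illustrative:

```javascript
// Least-squares fit of real ≈ a * template + b along one axis.
function fitAxis(templateVals, realVals) {
  const n = templateVals.length;
  const mean = (xs) => xs.reduce((s, v) => s + v, 0) / n;
  const mt = mean(templateVals), mr = mean(realVals);
  let num = 0, den = 0;
  for (let i = 0; i < n; i++) {
    num += (templateVals[i] - mt) * (realVals[i] - mr);
    den += (templateVals[i] - mt) ** 2;
  }
  const a = den === 0 ? 1 : num / den;
  return { a, b: mr - a * mt };
}

// Map an identification area from template coordinates into the uploaded
// image, using the positioning fields as anchors:
// anchors = [{ template: {x, y}, real: {x, y} }, ...]
function mapRegion(region, anchors) {
  const fx = fitAxis(anchors.map((p) => p.template.x), anchors.map((p) => p.real.x));
  const fy = fitAxis(anchors.map((p) => p.template.y), anchors.map((p) => p.real.y));
  return {
    x: fx.a * region.x + fx.b,
    y: fy.a * region.y + fy.b,
    width: fx.a * region.width,
    height: fy.a * region.height,
  };
}
```

With anchors related by a uniform 2× scale and a fixed offset, a template region is mapped to twice its size at the shifted position, as expected.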
According to the image identification method provided by the embodiment, the identification result of the image to be identified can be obtained by using the target labeling template, and the identification result is accurate.
Based on the annotation template generation method, the disclosure also provides an annotation template generation apparatus. The apparatus will be described in detail below with reference to fig. 4.
Fig. 4 schematically shows a block diagram of the annotation template generation apparatus according to an embodiment of the present disclosure.
As shown in fig. 4, the annotation template generation apparatus 400 of this embodiment includes a first determination module 410, a second determination module 420, a presentation module 430, a third determination module 440, and a generation module 450.
The first determining module 410 is configured to, in response to receiving a configuration instruction of an annotation template from a user, determine a first interface to be displayed according to the configuration instruction; a second determining module 420, configured to, in response to receiving the image from the user, determine a second interface to be presented according to the image; the second interface to be displayed comprises a target image corresponding to the image and a target template generating area corresponding to the target image; the display module 430 is configured to display a configuration interface to a user according to the first interface to be displayed and the second interface to be displayed; a third determining module 440, configured to determine, in response to receiving an annotation template generation instruction from the user, operation information of the user on the configuration interface for the target image; and a generating module 450, configured to generate a target annotation template corresponding to the image based on the operation information, where the target annotation template is used for image recognition.
In some embodiments, the second determining module comprises: the first determining submodule is used for determining the size of the target image according to the size of the image and the size of the first interface to be displayed; and the processing module is used for processing the image based on the size of the target image to obtain the target image in the second interface to be displayed.
In some embodiments, the processing module comprises: a determination unit configured to determine a scaling ratio of the image based on a size of the target image; and an adjusting unit, configured to adjust the zoom ratio of the image to obtain the target image when it is determined that the zoom ratio is beyond a predetermined range.
In some embodiments, the configuration interface includes configuration items, and the generation module includes: the monitoring sub-module is used for responding to the received notification that the user selects the configuration item, and monitoring the framing operation of the user in the target template generation area by adopting the configuration item; and the second determining submodule is used for determining a target identification area in the target labeling template based on the first coordinate information of the area corresponding to the frame selection operation.
In some embodiments, the apparatus further includes a matching module, configured to identify the target image, determine text information in the target image and second coordinate information corresponding to each text information; and determining a target field corresponding to the target identification area in the target labeling template according to the matching relation between the first coordinate information and the second coordinate information.
In some embodiments, the annotation template configuration instruction includes an instruction generated according to a predetermined item selected by the user, and the first determining module is configured to determine, according to the configuration instruction, a first interface to be presented that matches the predetermined item.
In some embodiments, the apparatus further comprises an obtaining module configured to obtain a configuration component, the configuration component configured to generate the interactive interface; wherein the interactive interface comprises the configuration interface, and the configuration interface comprises configuration items.
According to an embodiment of the present disclosure, any plurality of the first determining module 410, the second determining module 420, the presenting module 430, the third determining module 440, and the generating module 450 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first determining module 410, the second determining module 420, the presenting module 430, the third determining module 440, and the generating module 450 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of three implementations of software, hardware, and firmware, or in a suitable combination of any of them. Alternatively, at least one of the first determining module 410, the second determining module 420, the presenting module 430, the third determining module 440 and the generating module 450 may be at least partially implemented as a computer program module, which when executed, may perform a corresponding function.
Based on the image recognition method, the disclosure also provides an image recognition apparatus. The apparatus will be described in detail below with reference to fig. 5.
Fig. 5 schematically shows a block diagram of the structure of an image recognition apparatus according to an embodiment of the present disclosure.
As shown in fig. 5, the image recognition apparatus 500 of this embodiment includes an acquisition module 510 and a recognition module 520.
The acquisition module 510 is configured to acquire an image to be recognized; the recognition module 520 is configured to obtain, for the image to be recognized, a recognition result of the image to be recognized based on the target annotation template, where the target annotation template is obtained by the annotation template generation apparatus.
According to an embodiment of the present disclosure, any plurality of the obtaining module 510 and the identifying module 520 may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the obtaining module 510 and the identifying module 520 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of three implementations of software, hardware, and firmware, or in any suitable combination of any of them. Alternatively, at least one of the obtaining module 510 and the identifying module 520 may be at least partially implemented as a computer program module, which when executed may perform a corresponding function.
Fig. 6 schematically illustrates a block diagram of an electronic device adapted to implement an annotation template generation method and/or an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 6, an electronic device 600 according to an embodiment of the present disclosure includes a processor 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. Processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 601 may also include on-board memory for caching purposes. Processor 601 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the disclosure.
In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. The processor 601 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or RAM 603. It is to be noted that the programs may also be stored in one or more memories other than the ROM 602 and RAM 603. The processor 601 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
Electronic device 600 may also include input/output (I/O) interface 605, input/output (I/O) interface 605 also connected to bus 604, according to an embodiment of the disclosure. The electronic device 600 may also include one or more of the following components connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is mounted in the storage section 608 as necessary.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM602 and/or RAM 603 described above and/or one or more memories other than the ROM602 and RAM 603.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to realize the annotation template generation method provided by the embodiment of the disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 601. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, downloaded and installed through the communication section 609, and/or installed from the removable medium 611. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program, when executed by the processor 601, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, these computer programs may be implemented using high level procedural and/or object oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that various combinations and/or combinations of features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or combinations are not expressly recited in the present disclosure. In particular, various combinations and/or combinations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (13)

1. A method for generating an annotation template comprises the following steps:
in response to receiving an annotation template configuration instruction from a user, determining a first interface to be displayed according to the configuration instruction;
in response to receiving an image from the user, determining a second interface to be presented according to the image; the second interface to be displayed comprises a target image corresponding to the image and a target template generating area corresponding to the target image;
displaying a configuration interface to a user according to the first interface to be displayed and the second interface to be displayed;
in response to receiving an annotation template generation instruction from the user, determining operation information of the user on the configuration interface for the target image; and
generating a target annotation template corresponding to the image based on the operation information, wherein the target annotation template is used for image recognition.
2. The method of claim 1, wherein the determining, in response to receiving the image from the user, a second interface to be presented from the image comprises:
determining the size of the target image according to the size of the image and the size of the first interface to be displayed; and
processing the image based on the size of the target image to obtain the target image in the second interface to be displayed.
3. The method according to claim 2, wherein the processing the image based on the size of the target image to obtain the target image in the second interface to be displayed comprises:
determining a zoom ratio of the image based on a size of the target image; and
adjusting the zoom ratio of the image to obtain the target image when it is determined that the zoom ratio exceeds the predetermined range.
4. The method of claim 1, wherein the configuration interface includes a configuration item, and the determining, in response to receiving an annotation template generation instruction from the user, operational information of the user at the configuration interface for the target image comprises:
in response to receiving a notification that the user selects the configuration item, monitoring a frame selection operation of the user in the target template generation area by adopting the configuration item; and
determining a target identification area in the target annotation template based on first coordinate information of the area corresponding to the frame selection operation.
5. The method of claim 4, further comprising:
identifying the target image, and determining character information in the target image and second coordinate information corresponding to each character information; and
determining a target field corresponding to the target identification area in the target annotation template according to the matching relationship between the first coordinate information and the second coordinate information.
6. The method of claim 1, wherein the annotation template configuration instruction comprises an instruction generated according to a predetermined item selected by the user, and the determining a first interface to be presented according to the configuration instruction in response to receiving the annotation template configuration instruction from the user comprises:
determining, according to the configuration instruction, a first interface to be displayed that matches the predetermined item.
7. The method of claim 1, further comprising:
acquiring a configuration component, wherein the configuration component is used for generating an interactive interface; wherein the interactive interface comprises the configuration interface, and the configuration interface comprises configuration items.
8. An image recognition method, comprising:
acquiring an image to be identified; and
obtaining, for the image to be recognized, a recognition result of the image to be recognized based on a target annotation template;
wherein the target annotation template is obtained according to the method of any one of claims 1 to 7.
9. An annotation template generation apparatus comprising:
a first determining module configured to determine, in response to receiving an annotation template configuration instruction from a user, a first interface to be displayed according to the configuration instruction;
a second determining module configured to determine, in response to receiving an image from the user, a second interface to be displayed according to the image, wherein the second interface to be displayed comprises a target image corresponding to the image and a target template generation area corresponding to the target image;
a display module configured to display a configuration interface to the user according to the first interface to be displayed and the second interface to be displayed;
a third determining module configured to determine, in response to receiving an annotation template generation instruction from the user, operation information of the user on the configuration interface for the target image; and
a generating module configured to generate, based on the operation information, a target annotation template corresponding to the image, wherein the target annotation template is used for image recognition.
10. An image recognition apparatus comprising:
an acquisition module configured to acquire an image to be recognized; and
a recognition module configured to obtain, for the image to be recognized, a recognition result of the image to be recognized based on the target annotation template;
wherein the target annotation template is obtained by the apparatus of claim 9.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
12. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 8.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 8.
CN202211477254.3A 2022-11-23 2022-11-23 Annotation template generation method, image identification method and device and electronic equipment Pending CN115756461A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211477254.3A CN115756461A (en) 2022-11-23 2022-11-23 Annotation template generation method, image identification method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN115756461A true CN115756461A (en) 2023-03-07

Family

ID=85336329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211477254.3A Pending CN115756461A (en) 2022-11-23 2022-11-23 Annotation template generation method, image identification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115756461A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116402026A (en) * 2023-04-13 2023-07-07 广州文石信息科技有限公司 Application content annotating method, device, equipment and storage medium
CN116402026B (en) * 2023-04-13 2023-12-19 广州文石信息科技有限公司 Application content annotating method, device, equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination