CN116935028A - Text anchor box generation method, device, equipment and medium - Google Patents

Info

Publication number
CN116935028A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210331749.9A
Other languages
Chinese (zh)
Inventor
吴增程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd and Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The application discloses a method for generating a text anchor box, which comprises the following steps: acquiring initial text boxes containing text information; obtaining a plurality of initial anchor boxes from the initial text boxes, and calculating a first overlapping degree between each initial text box and each initial anchor box; clustering the initial text boxes according to the first overlapping degree, and taking the initial anchor box at each clustering center as a center anchor box; and, based on the first overlapping degree, calculating weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a first size parameter of the center anchor box, so that the fit between the anchor boxes and initial text boxes of different sizes is improved.

Description

Text anchor box generation method, device, equipment and medium
Technical Field
The present application relates to the field of computer technologies, and in particular to a method, an apparatus, a device, and a medium for generating a text anchor box.
Background
Currently, anchor-box-based deep learning text detection is an important research direction in the field of text detection, and different text detection projects require different anchor models to be built and trained. For example, because text takes different forms, images of mathematics exercises contain mostly small text boxes, whereas ordinary PDF documents contain mostly long text boxes, so the two cases need different anchor boxes for model construction and training.
In the existing Fast R-CNN (Fast Region-based Convolutional Neural Network), the anchor boxes are set manually, whereas in the existing YOLO (You Only Look Once: Unified, Real-Time Object Detection) target detection algorithm, a K-means clustering algorithm is generally used to cluster the bounding boxes in the training set. However, when the initial text boxes are large, the prior-art anchor box generation methods produce large errors, and it is difficult to obtain anchor boxes suited to the current project.
Disclosure of Invention
The purpose of the application is to provide a text anchor box generation method, apparatus, device and medium that improve the fit between an anchor box and initial text boxes of different sizes.
In order to achieve the above purpose, the application adopts the following technical scheme:
The text anchor box generation method comprises the following steps:
acquiring initial text boxes containing text information;
obtaining a plurality of initial anchor boxes from the initial text boxes, and calculating a first overlapping degree between each initial text box and each initial anchor box;
clustering the initial text boxes according to the first overlapping degree, and taking the initial anchor box at each clustering center as a center anchor box;
and, based on the first overlapping degree, calculating weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a first size parameter of the center anchor box.
The application also provides a device for generating the text anchor box, which comprises:
a text box recognition unit for acquiring initial text boxes containing text information;
an overlapping degree calculation unit for obtaining a plurality of initial anchor boxes from the initial text boxes and calculating a first overlapping degree between each initial text box and each initial anchor box;
a center anchor box generation unit for clustering the initial text boxes according to the first overlapping degree and taking the initial anchor box at each clustering center as a center anchor box;
and a size calculation unit for calculating, based on the first overlapping degree, weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a first size parameter of the center anchor box.
The application also provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the method for generating the text anchor box according to any one of the above.
The present application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method for generating the text anchor box according to any one of the above.
According to the method for generating the text anchor box, calculating the overlapping degree between each initial text box and each initial anchor box keeps the length and width differences among the initial text boxes from distorting the clustering result. The initial text boxes are clustered according to the overlapping degree to obtain the center anchor boxes, and a weight is calculated for each initial text box; introducing the weights balances the contribution of each initial text box to its center anchor box, so that the resulting size of the center anchor box better fits all the initial text boxes assigned to it.
Drawings
FIG. 1 is a flow chart of a method for generating text anchor boxes according to an embodiment;
FIG. 2 is a schematic diagram of a text anchor box generating device according to an embodiment;
FIG. 3 is a schematic block diagram of the structure of a computer device according to an embodiment.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, modules, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, modules, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Referring to FIG. 1, a flow chart of the method for generating a text anchor box according to the present application, the method includes:
S1: acquiring initial text boxes containing text information;
S2: obtaining a plurality of initial anchor boxes from the initial text boxes, and calculating a first overlapping degree between each initial text box and each initial anchor box;
S3: clustering the initial text boxes according to the first overlapping degree, and taking the initial anchor box at each clustering center as a center anchor box;
S4: based on the first overlapping degree, calculating weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a first size parameter of the center anchor box.
As described in step S1, when detecting target text, a convolutional neural network first performs preliminary recognition on the target image to obtain a plurality of candidate text regions containing text information, namely the initial text boxes. Since the target image may contain text in several different arrangements, the initial text boxes tend to differ from one another in size and shape.
As described in step S2, an anchor box generally serves as a prior in target detection and is used to mark the portion of the image to be recognized. When the target in the image is small and a large anchor box is used, the anchor box mostly marks non-target area; when the target is large and a small anchor box is used, the anchor box marks only a small part of the target area. In both cases the offset of the anchor box is large, which makes the error of the final detection result large, so the size of the finally obtained anchor box must not differ too much from the size of the text box to be detected. However, generating a separate anchor box for every initial text box of a different size would greatly increase the computation required for target detection, so a method is needed that reduces the detection error while keeping the computation efficient.
Specifically, in this embodiment, a plurality of initial anchor boxes are selected from the initial text boxes according to a preset rule. One possible rule is to divide the target image uniformly into regions and select the same number of initial text boxes in each region as initial anchor boxes. The positional relationship between each remaining initial text box and each initial anchor box is then determined by calculating the first overlapping degree between them.
As described in the above step S3, the clustering distance between each initial text box and each initial anchor box may be obtained according to the first overlapping degree, where the first overlapping degree is inversely related to the clustering distance. It will be appreciated that the greater the first degree of overlap of the initial text box and the initial anchor box, the smaller the distance therebetween can be considered.
Specifically, after the clustering distances are obtained, they are compared for each initial text box, and the initial anchor box with the smallest clustering distance to the current initial text box is taken as the clustering center of that initial text box, namely its center anchor box. Since the first overlapping degree is inversely related to the clustering distance, the initial anchor box with the smallest clustering distance is the initial anchor box with the largest overlapping degree with the initial text box. In a specific embodiment, if several initial anchor boxes have the same clustering distance to the same initial text box, one of them may be selected at random as the center anchor box of that initial text box.
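The assignment rule described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `assign_to_anchors`, the row-per-text-box layout of `iou_matrix`, and the optional `rng` argument are all assumptions introduced here.

```python
import random


def assign_to_anchors(iou_matrix, rng=None):
    """Return, for each text box, the index of its center anchor box.

    iou_matrix[i][j] is the overlapping degree (IOU) between text box i
    and initial anchor box j (hypothetical layout chosen for this sketch).
    """
    rng = rng or random.Random()
    assignments = []
    for row in iou_matrix:
        # Clustering distance is 1 - IOU, so a larger overlap means a smaller distance.
        distances = [1.0 - value for value in row]
        best = min(distances)
        # Several anchors may tie for the smallest distance; pick one at random.
        candidates = [j for j, d in enumerate(distances) if d == best]
        assignments.append(rng.choice(candidates))
    return assignments


example = assign_to_anchors([[0.3, 0.6], [0.8, 0.1], [0.5, 0.5]], random.Random(0))
```

The third row of the example is a tie, so either anchor index is a valid result for it.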
As described in step S4, from a probabilistic point of view, an initially selected anchor box is more likely to belong to the most numerous type of initial text box. For example, if the initial text boxes include 10 medium-sized boxes, 2 small boxes and 1 large box, the initial anchor box is most likely to be medium-sized. Therefore, to increase the contribution of the less numerous types of initial text boxes to the anchor box size, such initial text boxes can be given higher weights so that they have more influence on the final anchor box size.
In sum, calculating the overlapping degree between each initial text box and each initial anchor box keeps the length and width differences among the initial text boxes from distorting the clustering result. The initial text boxes are clustered according to the overlapping degree to obtain the center anchor boxes, and a weight is calculated for each initial text box; introducing the weights balances the contribution of each initial text box to its center anchor box, so that the resulting size of the center anchor box better fits all the initial text boxes assigned to it.
In one embodiment, after the obtaining the first size parameter of the center anchor box, the method further includes:
calculating a second overlapping degree between each center anchor box and each of its corresponding initial text boxes based on the first size parameter;
according to the second overlapping degree, calculating the weights again for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a second size parameter of the center anchor box;
when the second size parameter is the same as the first size parameter, taking the second size parameter as the final size parameter of the center anchor box;
and when the second size parameter is different from the first size parameter, iterating from the second size parameter to obtain a new size parameter of the center anchor box, stopping when the new size parameter is the same as the size parameter obtained in the previous iteration, and taking the new size parameter as the final size parameter.
After the first size parameter is obtained, the size of the center anchor box is set to the first size parameter. Because the size of the center anchor box has changed, its overlapping degree with each of its corresponding initial text boxes also changes to some extent. The new overlapping degree can therefore be recalculated and the size of the center anchor box adjusted again accordingly, repeating until the size equals the size obtained in the previous iteration. This yields the center anchor box with the final size parameter, which better fits the different initial text boxes in its cluster.
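The iteration above might look like the following sketch. Several points are assumptions made only for illustration: boxes are reduced to (w, h) pairs and compared as if aligned at a common corner (as is common in K-means anchor clustering), exact equality between iterations is replaced by a small tolerance `tol`, and a `max_iter` cap is added as a safeguard. The names `wh_iou` and `refine_anchor` are hypothetical.

```python
import math


def wh_iou(a, b):
    """IOU of two (w, h) boxes, assuming they share a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def refine_anchor(anchor, members, tol=1e-6, max_iter=100):
    """Re-weight the member text boxes until the anchor size stops changing."""
    for _ in range(max_iter):
        # Distance 1 - IOU of every member to the current anchor size.
        dists = [1.0 - wh_iou(anchor, m) for m in members]
        # Softmax over the distances (shifted by the maximum for stability).
        peak = max(dists)
        exps = [math.exp(d - peak) for d in dists]
        total = sum(exps)
        new = (
            sum(e * m[0] for e, m in zip(exps, members)) / total,
            sum(e * m[1] for e, m in zip(exps, members)) / total,
        )
        # Stop once the size matches the previous iteration (within tol).
        if abs(new[0] - anchor[0]) <= tol and abs(new[1] - anchor[1]) <= tol:
            return new
        anchor = new
    return anchor


final_w, final_h = refine_anchor((8.0, 3.0), [(10.0, 4.0), (20.0, 6.0), (12.0, 5.0)])
```

Because each update is a convex combination of the member sizes, the refined width and height always stay within the range spanned by the member boxes.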
In one embodiment, the calculating the first overlapping degree of each initial text box and each initial anchor box includes:
identifying the overlapping region between each initial text box and each initial anchor box, and calculating the area of each overlapping region;
calculating the sum of the areas of the initial text box and the initial anchor box corresponding to the overlapping region, and subtracting the area of the overlapping region from that sum to obtain an area difference;
and calculating the ratio of the area of the overlapping region to the area difference to obtain the first overlapping degree.
As described above, the first overlapping degree may be the IOU (Intersection over Union), which is calculated as $\mathrm{IOU} = A_{\mathrm{overlap}} / (A_{\mathrm{text}} + A_{\mathrm{anchor}} - A_{\mathrm{overlap}})$, where $A_{\mathrm{overlap}}$ is the area of the overlapping region, $A_{\mathrm{text}}$ is the area of the initial text box and $A_{\mathrm{anchor}}$ is the area of the initial anchor box. Calculated in this way, the first overlapping degree is a normalized coverage ratio between 0 and 1 that does not depend on the absolute length and width of the initial text box, so the influence of length and width on the clustering of the center anchor boxes is avoided.
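The three sub-steps above can be written as a short function. This is a sketch under the assumption that boxes are axis-aligned and given as (x1, y1, x2, y2) corner coordinates; the application does not specify a box representation, and the name `iou` is chosen here for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Overlapping region: clamp to zero when the boxes do not intersect.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    # Sum of both areas minus the overlap gives the union area.
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap area 1, union area 7
```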
In one embodiment, the clustering the initial text boxes according to the first overlapping degree, and taking the initial anchor box of the clustering center as the center anchor box includes:
taking the difference value between 1 and the first overlapping degree as a clustering distance between the initial text box and the corresponding initial anchor box;
and determining a clustering center corresponding to each initial text box according to the clustering distance, and taking the initial anchor box of the clustering center as a central anchor box corresponding to the initial text box.
As described above, the clustering distance between an initial text box and an initial anchor box is 1-IOU, so the clustering distance is negatively correlated with the first overlapping degree (IOU): the larger the IOU, the smaller the distance.
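Written out, the relationship is as follows. The symbols $b_i$ (the i-th initial text box), $a_j$ (the j-th initial anchor box), $d_{ij}$ and $c(i)$ are notation introduced here for illustration and do not appear in the application.

```latex
d_{ij} = 1 - \mathrm{IOU}(b_i, a_j), \qquad
c(i) = \arg\min_{j} d_{ij} = \arg\max_{j} \mathrm{IOU}(b_i, a_j)
```

Minimizing the clustering distance is therefore the same as choosing the initial anchor box with the largest overlapping degree.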
In one embodiment, the calculating the weight of the initial text box with the cluster center being the same center anchor box based on the first overlapping degree includes:
and calculating the weight corresponding to the initial text box belonging to the same central anchor box through a softmax function based on the first overlapping degree.
As described above, softmax(distance) exponentiates the 1-IOU value of each text box and divides it by the sum of the exponentials of the 1-IOU values of all text boxes belonging to the same initial anchor box. Illustratively, if a total of 3 text boxes belong to the same initial anchor box and their IOU values are 0.3, 0.5 and 0.6, then the distances, i.e. the 1-IOU values, are 0.7, 0.5 and 0.4, and softmax(distance) is $e^{0.7}/(e^{0.7}+e^{0.5}+e^{0.4})$, $e^{0.5}/(e^{0.7}+e^{0.5}+e^{0.4})$ and $e^{0.4}/(e^{0.7}+e^{0.5}+e^{0.4})$, respectively, i.e. approximately 0.39, 0.32 and 0.29.
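The worked example can be reproduced with a few lines of standard-library Python. The helper name `softmax` and the max-shift for numerical stability are choices made for this sketch; the IOU values are the ones from the example above.

```python
import math


def softmax(values):
    """Softmax of a list of floats, shifted by the maximum for stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


ious = [0.3, 0.5, 0.6]
distances = [1.0 - v for v in ious]  # 0.7, 0.5, 0.4
weights = softmax(distances)         # roughly 0.39, 0.32, 0.29
```

The text box with the smallest IOU receives the largest weight, which matches the stated aim of giving less well-represented boxes more influence on the anchor size.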
In one embodiment, the weighting calculation is performed on the size of the initial text box according to the weight calculation result, so as to obtain a first size parameter of the center anchor box, which includes:
acquiring a size representation of the initial text box, wherein the size representation comprises a first representation parameter and a second representation parameter;
and respectively carrying out weighted calculation on the first characterization parameter and the second characterization parameter of the initial text box with the same clustering center as the center anchor box to obtain a first size parameter of the center anchor box.
As described above, the size characterization may be (w, h), where w is the width and h is the height. softmax(1-IOU) is calculated over all the initial text boxes belonging to the same initial anchor box; the resulting coefficient of each initial text box is multiplied by its w and h, and the weighted w values and the weighted h values of all these initial text boxes are summed separately to give the w and h of the corresponding center anchor box.
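A minimal sketch of this weighted size calculation is given below. The function name `weighted_anchor_size` and its argument layout (a list of (w, h) sizes plus the matching list of IOU values) are assumptions for illustration only.

```python
import math


def weighted_anchor_size(sizes, ious):
    """Combine member (w, h) sizes using softmax(1 - IOU) weights."""
    distances = [1.0 - v for v in ious]
    # Softmax over the distances, shifted by the maximum for stability.
    peak = max(distances)
    exps = [math.exp(d - peak) for d in distances]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sums of the widths and of the heights, computed separately.
    w = sum(wt * s[0] for wt, s in zip(weights, sizes))
    h = sum(wt * s[1] for wt, s in zip(weights, sizes))
    return w, h


anchor_w, anchor_h = weighted_anchor_size([(10.0, 4.0), (20.0, 6.0)], [0.9, 0.3])
```

With equal IOU values the result reduces to the plain average of the member sizes; with unequal values the box that overlaps the anchor less pulls the size further toward itself.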
In one embodiment, the obtaining a plurality of initial anchor boxes according to the initial text boxes includes:
and randomly selecting a plurality of text boxes in the initial text box to serve as the initial anchor box.
As described above, to reduce the influence of hand-crafted rules on the selection of the initial anchor boxes, this embodiment randomly selects a portion of the initial text boxes of the target image as the initial anchor boxes. In a specific embodiment, the selection may follow a preset ratio, for example a ratio of initial anchor boxes to initial text boxes of 1:10, which avoids the heavy computation caused by too many initial anchor boxes or too many initial text boxes.
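Random selection at a preset ratio could be sketched as follows. The function name `pick_initial_anchors`, the `ratio` default of 0.1 (the 1:10 example above), and the choice to always keep at least one anchor are assumptions of this sketch.

```python
import random


def pick_initial_anchors(text_boxes, ratio=0.1, rng=None):
    """Randomly pick about ratio * len(text_boxes) boxes as initial anchors."""
    if not text_boxes:
        return []
    rng = rng or random.Random()
    # Keep at least one anchor so that clustering always has a center.
    count = max(1, round(len(text_boxes) * ratio))
    # sample() draws without replacement, so no box is picked twice.
    return rng.sample(text_boxes, count)


boxes = [(i, i, i + 10, i + 4) for i in range(30)]
anchors = pick_initial_anchors(boxes, ratio=0.1, rng=random.Random(42))
```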
Referring to FIG. 2, a block diagram of a text anchor box generating device according to the present application, the device includes:
a text box recognition unit 100 for acquiring initial text boxes containing text information;
an overlapping degree calculating unit 200, configured to obtain a plurality of initial anchor boxes from the initial text boxes, and calculate a first overlapping degree between each initial text box and each initial anchor box;
a center anchor box generating unit 300, configured to cluster the initial text boxes according to the first overlapping degree, and take the initial anchor box at each clustering center as a center anchor box;
and a size calculating unit 400, configured to calculate, based on the first overlapping degree, weights for the initial text boxes whose clustering center is the same center anchor box, and perform a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a first size parameter of the center anchor box.
In one embodiment, the size calculation unit 400 is specifically configured to:
calculating a second overlapping degree between each center anchor box and each of its corresponding initial text boxes based on the first size parameter;
according to the second overlapping degree, calculating the weights again for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a second size parameter of the center anchor box;
when the second size parameter is the same as the first size parameter, taking the second size parameter as the final size parameter of the center anchor box;
and when the second size parameter is different from the first size parameter, iterating from the second size parameter to obtain a new size parameter of the center anchor box, stopping when the new size parameter is the same as the size parameter obtained in the previous iteration, and taking the new size parameter as the final size parameter.
In one embodiment, the overlapping degree calculating unit 200 is specifically configured to:
identifying the overlapping region between each initial text box and each initial anchor box, and calculating the area of each overlapping region;
calculating the sum of the areas of the initial text box and the initial anchor box corresponding to the overlapping region, and subtracting the area of the overlapping region from that sum to obtain an area difference;
and calculating the ratio of the area of the overlapping region to the area difference to obtain the first overlapping degree.
In one embodiment, the center anchor box generating unit 300 is specifically configured to:
taking the difference value between 1 and the first overlapping degree as a clustering distance between the initial text box and the corresponding initial anchor box;
and determining a clustering center corresponding to each initial text box according to the clustering distance, and taking the initial anchor box of the clustering center as a central anchor box corresponding to the initial text box.
In one embodiment, the size calculation unit 400 is specifically configured to:
and calculating the weight corresponding to the initial text box belonging to the same central anchor box through a softmax function based on the first overlapping degree.
In one embodiment, the size calculation unit 400 is specifically configured to:
acquiring a size representation of the initial text box, wherein the size representation comprises a first representation parameter and a second representation parameter;
and respectively carrying out weighted calculation on the first characterization parameter and the second characterization parameter of the initial text box with the same clustering center as the center anchor box to obtain a first size parameter of the center anchor box.
In one embodiment, the overlapping degree calculating unit 200 is specifically configured to:
and randomly selecting a plurality of text boxes in the initial text box to serve as the initial anchor box.
Referring to FIG. 3, an embodiment of the present application further provides a computer device, whose internal structure may be as shown in FIG. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores data such as the generated text anchor boxes. The network interface of the computer device communicates with an external terminal through a network connection. The computer program, when executed by the processor, implements a method for generating text anchor boxes, comprising the steps of: acquiring initial text boxes containing text information; obtaining a plurality of initial anchor boxes from the initial text boxes, and calculating a first overlapping degree between each initial text box and each initial anchor box; clustering the initial text boxes according to the first overlapping degree, and taking the initial anchor box at each clustering center as a center anchor box; and, based on the first overlapping degree, calculating weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weights to obtain a first size parameter of the center anchor box.
It will be appreciated by those skilled in the art that the architecture shown in fig. 3 is merely a block diagram of a portion of the architecture in connection with the present inventive arrangements and is not intended to limit the computer devices to which the present inventive arrangements are applicable.
An embodiment of the present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of generating a text anchor box. It is understood that the computer readable storage medium in this embodiment may be a volatile readable storage medium or a nonvolatile readable storage medium.
According to the method, the device, the equipment and the medium for generating the text anchor box, calculating the overlapping degree between each initial text box and each initial anchor box keeps the length and width differences among the initial text boxes from distorting the clustering result. The initial text boxes are clustered according to the overlapping degree to obtain the center anchor boxes, and a weight is calculated for each initial text box; introducing the weights balances the contribution of each initial text box to its center anchor box, so that the resulting size of the center anchor box better fits all the initial text boxes assigned to it.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium, which, when executed, may carry out the steps of the method embodiments described above. Any reference to memory, storage, a database, or another medium provided by the present application and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, apparatus, article, or method that comprises the element.
The foregoing describes only preferred embodiments of the present application and is not intended to limit its scope. Any equivalent structure or equivalent process derived from the description and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present application.

Claims (10)

1. A method for generating a text anchor box, characterized by comprising the following steps:
acquiring an initial text box containing text information;
obtaining a plurality of initial anchor boxes according to the initial text boxes, and calculating a first overlapping degree between each initial text box and each initial anchor box;
clustering the initial text boxes according to the first overlapping degree, and taking the initial anchor box at the clustering center as a center anchor box;
and, based on the first overlapping degree, calculating weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weight calculation results to obtain a first size parameter of the center anchor box.
2. The method for generating a text anchor box according to claim 1, further comprising, after the obtaining the first size parameter of the center anchor box:
calculating, based on the first size parameter, a second overlapping degree between each initial text box and its corresponding center anchor box;
according to the second overlapping degree, again calculating weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weight calculation results to obtain a second size parameter of the center anchor box;
when the second size parameter is the same as the first size parameter, taking the second size parameter as the final size parameter of the center anchor box;
and when the second size parameter is different from the first size parameter, performing iterative calculation based on the second size parameter to obtain a new size parameter of the center anchor box, stopping the iteration once the new size parameter is the same as the size parameter obtained in the previous iteration, and taking the new size parameter as the final size parameter.
3. The method for generating a text anchor box according to claim 1, wherein the calculating the first overlapping degree of each initial text box and each initial anchor box comprises:
identifying the overlapping region between each initial text box and each initial anchor box, and calculating the overlapping area of each overlapping region;
calculating the area sum of the initial text box and the initial anchor box corresponding to the overlapping region, and calculating the area difference between the area sum and the overlapping area of that overlapping region;
and calculating the ratio of the overlapping area of the overlapping region to the area difference to obtain the first overlapping degree.
4. The method for generating a text anchor box according to claim 1, wherein the clustering the initial text boxes according to the first overlapping degree, and taking the initial anchor box of the clustering center as a center anchor box, comprises:
taking the difference value between 1 and the first overlapping degree as a clustering distance between the initial text box and the corresponding initial anchor box;
and determining a clustering center corresponding to each initial text box according to the clustering distance, and taking the initial anchor box of the clustering center as a central anchor box corresponding to the initial text box.
5. The method for generating a text anchor box according to claim 1, wherein the calculating, based on the first overlapping degree, weights for the initial text boxes whose clustering center is the same center anchor box comprises:
calculating, through a softmax function and based on the first overlapping degree, the weight corresponding to each initial text box belonging to the same center anchor box.
6. The method for generating a text anchor box according to claim 1, wherein the performing a weighted calculation on the sizes of the initial text boxes according to the weight calculation results to obtain a first size parameter of the center anchor box comprises:
acquiring a size representation of each initial text box, wherein the size representation comprises a first representation parameter and a second representation parameter;
and performing a weighted calculation separately on the first representation parameters and on the second representation parameters of the initial text boxes whose clustering center is the same center anchor box, to obtain the first size parameter of the center anchor box.
7. The method for generating a text anchor box according to claim 1, wherein the obtaining a plurality of initial anchor boxes according to the initial text boxes comprises:
randomly selecting a plurality of text boxes from the initial text boxes to serve as the initial anchor boxes.
8. A text anchor box generation device, characterized by comprising:
a text box recognition unit for acquiring an initial text box containing text information;
an overlapping degree calculation unit for obtaining a plurality of initial anchor boxes according to the initial text boxes and calculating a first overlapping degree between each initial text box and each initial anchor box;
a center anchor box generation unit for clustering the initial text boxes according to the first overlapping degree and taking the initial anchor box at the clustering center as a center anchor box;
and a size calculation unit for calculating, based on the first overlapping degree, weights for the initial text boxes whose clustering center is the same center anchor box, and performing a weighted calculation on the sizes of those initial text boxes according to the weight calculation results to obtain a first size parameter of the center anchor box.
9. A computer device comprising a memory and a processor, the memory having stored therein a computer program, characterized in that the processor, when executing the computer program, carries out the steps of the method of generating a text anchor box as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of generating a text anchor box as claimed in any one of claims 1 to 7.
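The weighting and iteration steps recited in claims 2, 5 and 6 can be sketched roughly as follows. This is a minimal, non-authoritative Python sketch: the names `softmax`, `overlap_degree` and `refine_center_anchor` are hypothetical, the two representation parameters are assumed to be width and height, boxes are assumed to be aligned at a common corner, and the "same size" stopping test is approximated by a small tolerance `tol` with an iteration cap `max_iter`. None of these choices is specified by the claims.

```python
import math


def overlap_degree(box, anchor):
    """Overlap area / (area sum - overlap area) for (width, height) boxes
    aligned at a common corner (an assumption of this sketch)."""
    overlap_area = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return overlap_area / (box[0] * box[1] + anchor[0] * anchor[1] - overlap_area)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def refine_center_anchor(members, anchor, tol=1e-6, max_iter=100):
    """Iteratively re-estimate one center anchor's (width, height).

    members: the initial text boxes whose clustering center is this anchor.
    Each round: overlapping degree -> softmax weights -> weighted size.
    Stops when the size no longer changes (within tol) or after max_iter.
    """
    anchor = (float(anchor[0]), float(anchor[1]))
    for _ in range(max_iter):
        weights = softmax([overlap_degree(m, anchor) for m in members])
        new_anchor = (
            sum(w * m[0] for w, m in zip(weights, members)),
            sum(w * m[1] for w, m in zip(weights, members)),
        )
        if all(abs(n - o) <= tol for n, o in zip(new_anchor, anchor)):
            return new_anchor
        anchor = new_anchor
    return anchor


# Illustrative data only: three boxes assigned to the same center anchor.
members = [(100, 20), (80, 16), (120, 30)]
final_size = refine_center_anchor(members, anchor=(100, 20))
```

Because the softmax weights are positive and sum to 1, the refined size is always a weighted average of the member boxes and therefore stays within their range of widths and heights.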
CN202210331749.9A 2022-03-30 2022-03-30 Text anchor box generation method, device, equipment and medium Pending CN116935028A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210331749.9A CN116935028A (en) 2022-03-30 2022-03-30 Text anchor box generation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210331749.9A CN116935028A (en) 2022-03-30 2022-03-30 Text anchor box generation method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN116935028A true CN116935028A (en) 2023-10-24

Family

ID=88388398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210331749.9A Pending CN116935028A (en) 2022-03-30 2022-03-30 Text anchor box generation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116935028A (en)

Similar Documents

Publication Publication Date Title
CN108846340B (en) Face recognition method and device, classification model training method and device, storage medium and computer equipment
CN106803071B (en) Method and device for detecting object in image
CN109241904B (en) Character recognition model training, character recognition method, device, equipment and medium
CN111950329A (en) Target detection and model training method and device, computer equipment and storage medium
CN112613515B (en) Semantic segmentation method, semantic segmentation device, computer equipment and storage medium
CN110728295B (en) Semi-supervised landform classification model training and landform graph construction method
CN110942012A (en) Image feature extraction method, pedestrian re-identification method, device and computer equipment
CN112001406B (en) Text region detection method and device
CN111914908B (en) Image recognition model training method, image recognition method and related equipment
CN112232426A (en) Training method, device and equipment of target detection model and readable storage medium
CN112464945A (en) Text recognition method, device and equipment based on deep learning algorithm and storage medium
CN114626524A (en) Target service network determining method, service processing method and device
EP4174769A1 (en) Method and apparatus for marking object outline in target image, and storage medium and electronic apparatus
CN114398059A (en) Parameter updating method, device, equipment and storage medium
CN114201572A (en) Interest point classification method and device based on graph neural network
CN117372971A (en) Object recognition method, device, computer equipment and storage medium
CN116935028A (en) Text anchor box generation method, device, equipment and medium
CN116824572A (en) Small sample point cloud object identification method, system and medium based on global and part matching
CN113076823B (en) Training method of age prediction model, age prediction method and related device
CN114742990A (en) Target detection method, device and equipment based on artificial intelligence and storage medium
CN111507188B (en) Face recognition model training method, device, computer equipment and storage medium
EP4007173A1 (en) Data storage method, and data acquisition method and apparatus therefor
CN113986245A (en) Object code generation method, device, equipment and medium based on HALO platform
CN116802651A (en) Information processing apparatus, selection output method, and selection output program
CN117932337B (en) Method and device for training neural network based on embedded platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination