WO2020199472A1 - Method and device for optimizing a recognition model - Google Patents

Method and device for optimizing a recognition model

Info

Publication number
WO2020199472A1
Authority
WO
WIPO (PCT)
Prior art keywords
field picture
training
recognition model
recognition
field
Prior art date
Application number
PCT/CN2019/103009
Other languages
English (en)
French (fr)
Inventor
许洋
刘鹏
王健宗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020199472A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Definitions

  • This application relates to the technical field of image processing. Specifically, it relates to a method and device for optimizing a recognition model.
  • At present, text recognition technology mainly works by training a field recognition model and then using it to recognize target fields.
  • However, the inventors realized that generating a field recognition model is often limited by the data provided by the business side, and that it is difficult to optimize the field recognition model in a short time, which hinders improving the recognition accuracy of the recognition model.
  • This application provides a method for optimizing a recognition model, which includes the following steps: obtaining a first field picture to be recognized by the recognition model in production; sending the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to the annotation information of the first field picture; and adding the training field picture to the training data set of the recognition model, and using the training data set to optimize the recognition model.
  • This application also provides a device for optimizing a recognition model, which includes: an acquisition module for obtaining the first field picture to be recognized by the recognition model in production; an annotation module for sending the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; a receiving module for periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform and obtaining a training field picture according to that annotation information; and an optimization module for adding the training field picture to the training data set of the recognition model and using the training data set to optimize the recognition model.
  • This application also provides a server, which includes: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute a method for optimizing a recognition model. The method includes: obtaining the first field picture to be recognized by the recognition model in production; sending the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to that annotation information; and adding the training field picture to the training data set of the recognition model, and using the training data set to optimize the recognition model.
  • This application also provides a computer-readable non-volatile storage medium storing a computer program which, when executed by a processor, implements a method for optimizing a recognition model. The method includes: obtaining the first field picture to be recognized by the recognition model in production; sending the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to that annotation information; and adding the training field picture to the training data set of the recognition model, and using the training data set to optimize the recognition model.
  • The method and device provided by this application use the crowdsourcing platform to recognize and annotate a large number of field pictures, so that a large number of highly credible training field pictures can be supplied to the recognition model in a relatively short time. This avoids the situation in which highly credible training field pictures cannot be obtained in a short time, and also addresses the problem that the large amount of field data generated in production has low credibility because it has not been annotated. The training samples of the recognition model are updated promptly, the recognition model can be optimized at short intervals, and its recognition ability is improved.
  • FIG. 1 is a flowchart of a method for optimizing a recognition model according to an embodiment of this application;
  • FIG. 2 is a flowchart of a method for optimizing a recognition model according to another embodiment of this application;
  • FIG. 3 is a flowchart of a method for optimizing a recognition model according to yet another embodiment of this application;
  • FIG. 4 is a flowchart of a specific embodiment of the method for optimizing the recognition model of this application;
  • FIG. 5 is a schematic diagram of a device for optimizing a recognition model according to an embodiment of this application;
  • FIG. 6 is a schematic structural diagram of a server according to an embodiment of this application.
  • S110: Obtain the first field picture to be recognized by the recognition model in production.
  • In this step, the server obtains the field pictures generated in business production.
  • These field pictures are the material used to provide the training data set to the recognition model.
  • In this embodiment, such a field picture is designated the first field picture.
  • The first field picture is a picture of field content cropped from the object to be recognized. The field content has not yet been through result recognition. If it were used directly as a training data set to optimize the recognition model, the recognition ability of the resulting model would be affected to some extent.
  • S120: Send the first field picture to the crowdsourcing platform for annotation, and obtain the corresponding annotation information.
  • In this step, users of the crowdsourcing platform recognize and annotate the first field picture.
  • The server sends the field content in the first field picture obtained in step S110 to the crowdsourcing platform.
  • Users on the crowdsourcing platform recognize the field content in the first field picture, annotate the first field picture according to the recognition result, and obtain and return the annotation information for the corresponding first field picture.
  • Because all users on the crowdsourcing platform can recognize and annotate the field content in the first field picture, the crowdsourcing platform may collect more than one recognition result for the field content of a single first field picture.
  • If multiple recognition results are produced on the crowdsourcing platform for one first field picture, the proportion of each recognition result can be computed, and these proportions are used to decide which recognition result is used to annotate the corresponding first field picture.
  • In this embodiment, when the server sends the field content in the first field picture to the crowdsourcing platform, it may periodically compress the field content into a compressed data package and send that package to the crowdsourcing platform.
  • The crowdsourcing platform receives the compressed data package and decompresses it so that users on the platform can recognize its content.
  • S130: Periodically receive the annotation information of the first field picture returned by the crowdsourcing platform, and obtain the training field picture according to the annotation information of the first field picture.
  • The user sets the frequency at which the annotation information of the first field picture is received according to how often the recognition model is to be optimized.
  • The server receives the annotation information of the first field picture returned by the crowdsourcing platform at that receiving frequency.
  • From the first field picture and its corresponding annotation information, the server forms data that can be included in the training data set of the recognition model, namely the training field picture. Because the training field pictures are annotated by users of the crowdsourcing platform, and are derived statistically from a large number of collected recognition results, they have a high accuracy rate and can be included directly in the recognition model's training data as training samples.
  • S140: Add the training field picture to the training data set of the recognition model, and use the training data set to optimize the recognition model.
  • The training field picture obtained in step S130 is used as a training sample and added to the training data set of the recognition model; the training data set is thereby updated, and the recognition model is optimized with the updated training data set to continuously improve its recognition ability.
  • In the method provided by this application, the first field picture to be recognized is obtained for the recognition model, the first field picture is annotated on the crowdsourcing platform to obtain a training field picture containing the corresponding annotation information, and the training field picture is used to optimize the recognition model.
  • By annotating the first field pictures from production through the crowdsourcing platform, this application obtains highly credible training samples that the recognition model can use directly, which solves the problem that business data is plentiful but of low credibility.
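  • For step S120, the annotation information includes the annotation code of the first field picture and the crowdsourcing platform's recognition result for the first field picture.
  • The annotation codes correspond one-to-one with the first field pictures, so that the corresponding first field picture can be saved later for further use as training data.
  • To obtain the recognition result, a reference value can be set for the proportion of recognition results described above; the reference value is at least one half of the total number of recognition results collected.
  • If the proportion of one recognition result of the corresponding first field picture is higher than the reference value, that recognition result is considered to be accepted by most users and is used to annotate the corresponding first field picture.
  • If the proportions of all recognition results of the corresponding first field picture are lower than the reference value, no single recognition result can be taken directly as the final recognition result of that first field picture.
  • In that case, based on the text layout of the corresponding first field picture (for example, obtained by scanning), the characters or symbols are separated according to the scan result into multiple individual characters or symbols, each of which is treated as an independent object of judgment.
  • Each individual character or symbol is recognized via the crowdsourcing platform; if the proportion of one of its recognition results is higher than the reference value, that result is confirmed as the final recognition result of the character or symbol.
  • After all individual characters or symbols in the first field picture have been recognized, all the recognition results are combined to give the final recognition result of the first field picture, which is then annotated.
  • This annotation may differ from the annotation used when the recognition result is obtained in a single pass, so that the recognition model can mark it specially.
  • If, after the first field picture has been separated into individual characters or symbols, it is still not the case that every individual character or symbol has a recognition result whose proportion is higher than the reference value, then, when the number of individual characters or symbols whose proportion is higher than the reference value reaches a set ratio of the total, a recognition result is predicted from the crowdsourcing platform users' judgment of the type of the first field picture, and the picture is annotated. This annotation may differ from all the other annotations above, so that the recognition model marks it in a specific form, improving recognition accuracy.
  • If the number of individual characters or symbols whose proportion is higher than the reference value fails to reach the set ratio of the total, the first field picture is judged to be in an abnormal state, corresponding abnormality prompt information is returned to the server, and an instruction to re-acquire the corresponding first field picture is triggered.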
  • FIG. 2 is a flowchart of a method for optimizing a recognition model according to another embodiment. On the basis of the above, step S130 includes the steps:
  • S131: Periodically receive the annotation information of the first field picture returned by the crowdsourcing platform.
  • The step of obtaining the training field picture according to the annotation information of the first field picture includes:
  • S132: Encode the first recognition result of the corresponding first field picture according to the annotation code.
  • After step S120, the recognition result of the first field picture is received periodically, giving the first recognition result.
  • The first recognition result is encoded with the annotation code of the corresponding first field picture, so as to confirm the correspondence between the recognition result and the corresponding first field picture.
  • For a first field picture carrying the annotation information obtained in the two steps above (the annotation code and the recognition result), the corresponding first recognition result was obtained through annotation on the crowdsourcing platform, so its credibility is much higher than that of the earlier business data that had not been annotated. The first field picture together with its annotation information can therefore be used directly as a training field picture to provide training samples for the recognition model.
  • The annotation information includes an annotation code and the first recognition result of the first field picture corresponding to that code, so that different first field pictures annotated with the same recognition result are not confused with one another, which facilitates subsequent training data optimization or data enhancement processing.
  • FIG. 3 is a flowchart of a method for optimizing a recognition model according to another embodiment. Building on the above, before step S140 of adding the training field picture to the training data set of the recognition model, the method further includes steps S1 to S4.
  • In step S1, to increase the robustness of the recognition model, data enhancement processing is performed on the first field picture to obtain a corresponding second field picture. That is, the quality of the first field picture is reduced, so that the recognition model can also recognize the correct text in field pictures of poor quality.
  • In step S2, the second field picture obtained in step S1 is sent to the crowdsourcing platform again and annotated.
  • The annotation code of the resulting recognition result may be the same as that of the corresponding first field picture, or a numeric suffix counting the annotation passes may be appended to it. For example, for the same original field picture, the annotation code for the first field picture is N123-1, and the annotation code for the second field picture obtained after data enhancement processing is N123-2.
  • The second field picture is sent to the crowdsourcing platform for recognition, and the corresponding second recognition result is obtained. The specific process of obtaining the second recognition result is the same as the process of obtaining the recognition result described above.
  • In step S3, the first recognition result is compared with the second recognition result, so as to prevent field pictures that have undergone excessive data enhancement processing from being used as training samples, which would reduce the recognition ability of the recognition model.
  • In step S4, if the comparison in step S3 shows that the two results are consistent, that is, users of the crowdsourcing platform can still recognize the content of the first field picture after data enhancement processing, the second field picture obtained at this point can be used as a training field picture to provide training samples for the recognition model.
  • For step S1, performing data enhancement processing on the first field picture to obtain the second field picture, at least the following methods can be used:
  • A1. Identify the effective content of the first field picture, and determine the first effective area of the first field picture;
  • A2. Crop the first field picture outside the boundary of the first effective area, wherein the border of the cropped area is the detection frame;
  • A3. Shrink the detection frame inward by a number of pixels to obtain a second effective area, and extract the second field picture according to the second effective area.
  • In this method, the first field picture is cropped.
  • The effective content is the field content of the first field picture.
  • The first field picture may be binarized to obtain the first effective area in which its field content is located. The first field picture is then cropped outside the boundary of the first effective area; the cropped area is the detection area, the boundary of the detection area is the detection frame, and the detection frame covers the entire first effective area.
  • The cropping is done by randomly shrinking the detection frame inward by a number of pixels, giving the second field picture with a smaller detection frame.
  • The detection frame is shrunk only within the region outside the boundary of the effective area, so that the field content inside the detection frame is not cropped.
  • The method further includes the following step:
  • A31. Extend the detection frame outward by a number of pixels to obtain the first effective area, wherein the number of pixels extended outward is greater than the number shrunk inward.
  • The detection frame is extended outward by a number of pixels to obtain the first effective area, so as to simulate slight fluctuations of the detection frame output for the cropped area. The recognition model is thereby exposed to different data enhancement conditions, which ultimately improves its recognition ability.
  • The number of pixels by which the detection frame is extended outward is greater than the number by which it is shrunk inward, so that a subsequent inward shrink of the detection frame does not reach into the effective area and affect the integrity of the field content.
  • Another data enhancement method is to apply motion blur to the first field picture. The first field picture is moved in a random direction: it may be shaken in several directions, or moved quickly in any one direction. Because of the movement, the second field picture is blurred relative to the corresponding first field picture. Adding second field pictures of reduced quality increases the training field pictures of the recognition model and strengthens its recognition ability.
  • Down-sampling may be applied to the first field picture to reduce its picture quality, giving a second field picture with a lower resolution. No fixed down-sampling factor is prescribed for the first field picture, as long as the same recognition result can still be obtained after down-sampling.
  • For data enhancement processing, the first field picture may also be rotated to and placed in any orientation. The user can choose to adjust the rotation of the first field picture before recognizing it.
  • The annotation information obtained after annotating the first field picture may include the field picture as rotated by the user of the crowdsourcing platform, so as to standardize the training field pictures of the recognition model. At the same time, the orientation of the first field picture is not restricted, so that training field pictures covering different situations are added to the recognition model, strengthening its recognition ability.
  • One or more of the above methods may be selected to process the first field picture and obtain the corresponding second field pictures, so as to increase the training field pictures of the recognition model and improve its recognition ability.
  • FIG. 4 is a flowchart of a specific embodiment of the above method for optimizing the recognition model of this application. The specific embodiment is described below:
  • S403: Receive the first recognition result obtained by annotating the first field picture.
  • S405: Perform data enhancement processing on the first field picture to obtain a second field picture, and go to step S402.
  • S406: Receive the second recognition result obtained by annotating the second field picture.
  • If the two are consistent, go to step S409.
  • If the recognition results of the two are consistent, go to step S410.
  • In this embodiment, the first field picture from business production is annotated through the crowdsourcing platform, and the corresponding recognition result is output as a training field picture of the recognition model, which is provided as a training sample so that the recognition model is optimized in a timely manner.
  • An embodiment of this application also provides a device for optimizing a recognition model, as shown in FIG. 5, including:
  • an acquisition module 510, configured to obtain the first field picture to be recognized by the recognition model in production;
  • an annotation module 520, configured to send the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture;
  • a receiving module 530, configured to periodically receive the annotation information of the first field picture returned by the crowdsourcing platform, and to obtain the training field picture according to the annotation information of the first field picture;
  • an optimization module 540, configured to add the training field picture to the training data set of the recognition model, and to use the training data set to optimize the recognition model.
  • FIG. 6 is a schematic diagram of the internal structure of the server in an embodiment.
  • The server includes a processor 610, a non-volatile storage medium 620, a memory 630, and a network interface 640 connected through a system bus.
  • The non-volatile storage medium 620 of the server stores an operating system, a database, and computer-readable instructions.
  • The database may store control information sequences.
  • The processor 610 can implement the functions of the acquisition module 510, the annotation module 520, the receiving module 530, and the optimization module 540 of the recognition model optimization device in the embodiment shown in FIG. 5.
  • The processor 610 of the server provides computing and control capabilities to support the operation of the entire server.
  • The memory 630 of the server may store computer-readable instructions which, when executed by the processor 610, cause the processor 610 to execute a method for optimizing the recognition model.
  • The network interface 640 of the server is used to connect to and communicate with the terminal.
  • This application also proposes a non-volatile storage medium storing computer-readable instructions.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps: obtaining the first field picture to be recognized by the recognition model in production; sending the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining the training field picture according to the annotation information of the first field picture; and adding the training field picture to the training data set of the recognition model, and using the training data set to optimize the recognition model; wherein users of the crowdsourcing platform recognize and annotate the first field picture.
  • In the method and device for optimizing the recognition model provided by this application, users of the crowdsourcing platform directly annotate the first field picture to be recognized that the recognition model obtains in production, recognizing the field content of the first field picture. According to the corresponding annotation information, the corresponding training field picture is obtained and added as a training sample to the training data set of the recognition model, and the recognition model is continuously optimized.
  • This application uses the crowdsourcing platform to recognize and annotate a large number of field pictures, which can provide the recognition model with a large number of highly credible training field pictures in a short time. The training samples of the recognition model are updated promptly, the recognition model can be optimized at short intervals, and its recognition ability is improved.
  • This application further provides an optimization solution in which data enhancement processing is performed on the first field picture to obtain the second field picture.
  • The crowdsourcing platform is also used to annotate the second field picture to obtain the corresponding recognition result.
  • Second field pictures that have undergone moderate data enhancement processing are selected as training field pictures, which adds training samples to the recognition model and further improves its recognition ability.
  • The method and device of this application avoid the situation in which highly credible training field pictures cannot be obtained in a short time, and also address the problem that the large amount of field data generated in production has low credibility because it has not been annotated. The result is a solution that can make use of the large amount of field data generated in production and quickly optimize the recognition model.
  • The aforementioned non-volatile storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
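The cropping-based data enhancement (steps A1 to A3) and the consistency check between the first and second recognition results (steps S3 and S4) described earlier in this section can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the box layout `(left, top, right, bottom)`, the function names, and the `max_pixels` parameter are hypothetical and do not come from the application.

```python
import random


def shrink_detection_frame(frame, effective, max_pixels, rng=random):
    """Shrink the detection frame inward by a random number of pixels per side.

    Boxes are (left, top, right, bottom) tuples (an assumed layout).
    Each side moves by at most `max_pixels` and never crosses the
    boundary of the effective area, so the field content is not cropped.
    """
    f_left, f_top, f_right, f_bottom = frame
    e_left, e_top, e_right, e_bottom = effective

    def step(room):
        # `room` is the margin between the frame and the effective area.
        return rng.randint(0, max(0, min(max_pixels, room)))

    return (
        f_left + step(e_left - f_left),
        f_top + step(e_top - f_top),
        f_right - step(f_right - e_right),
        f_bottom - step(f_bottom - e_bottom),
    )


def select_training_picture(first_result, second_result, second_picture):
    """Keep the second field picture only if both recognition results agree."""
    return second_picture if first_result == second_result else None
```

A second field picture whose recognition result differs from that of the first field picture is treated here as over-enhanced and dropped; how such rejected pictures are handled in practice is not specified in the text.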

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

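A method and device for optimizing a recognition model, relating to the technical field of image processing. The method includes: obtaining a first field picture to be recognized by the recognition model in production (S110); sending the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information (S120); periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to that annotation information (S130); and adding the training field picture to the training data set of the recognition model and using the training data set to optimize the recognition model (S140). The method avoids the problem that highly credible training field pictures cannot be obtained in a short time, ensures that the training samples of the recognition model are updated promptly, and improves the recognition ability of the recognition model.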

Description

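Method and device for optimizing a recognition model
This application claims priority to Chinese patent application No. 2019102700383, entitled "Method and device for optimizing a recognition model", filed with the Chinese Patent Office on April 4, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of image processing. Specifically, it relates to a method and device for optimizing a recognition model.
Background
With the trend toward intelligent recognition, intelligent text recognition has been applied in many fields. At present, text recognition technology mainly works by training a field recognition model and then recognizing target fields. However, the inventors realized that generating a field recognition model is often limited by the data provided by the business side, and that it is difficult to optimize the field recognition model in a short time, which hinders improving the recognition accuracy of the recognition model.
Summary
To overcome the above technical problems, in particular the problem that recognition models in the prior art are often limited by the data provided by the business side, the following technical solutions are proposed: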
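To solve the above technical problem, this application provides a method for optimizing a recognition model, including the following steps: obtaining a first field picture to be recognized by the recognition model in production; sending the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to the annotation information of the first field picture; and adding the training field picture to the training data set of the recognition model, and using the training data set to optimize the recognition model.
To solve the above technical problem, this application also provides a device for optimizing a recognition model, which includes: an acquisition module, configured to obtain the first field picture to be recognized by the recognition model in production; an annotation module, configured to send the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; a receiving module, configured to periodically receive the annotation information of the first field picture returned by the crowdsourcing platform and to obtain a training field picture according to that annotation information; and an optimization module, configured to add the training field picture to the training data set of the recognition model and to use the training data set to optimize the recognition model.
To solve the above technical problem, this application also provides a server, which includes: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to execute a method for optimizing a recognition model. The method includes: obtaining the first field picture to be recognized by the recognition model in production; sending the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to that annotation information; and adding the training field picture to the training data set of the recognition model, and using the training data set to optimize the recognition model.
To solve the above technical problem, this application also provides a computer-readable non-volatile storage medium storing a computer program which, when executed by a processor, implements a method for optimizing a recognition model. The method includes: obtaining the first field picture to be recognized by the recognition model in production; sending the first field picture to the crowdsourcing platform for annotation to obtain the corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to that annotation information; and adding the training field picture to the training data set of the recognition model, and using the training data set to optimize the recognition model.
The method and device for optimizing a recognition model provided by this application use the crowdsourcing platform to recognize and annotate a large number of field pictures, so that a large number of highly credible training field pictures can be supplied to the recognition model in a relatively short time. This avoids the situation in which highly credible training field pictures cannot be obtained in a short time, and also addresses the problem that the large amount of field data generated in production has low credibility because it has not been annotated. The training samples of the recognition model are updated promptly, the recognition model can be optimized at short intervals, and its recognition ability is improved.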
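Brief Description of the Drawings
FIG. 1 is a flowchart of a method for optimizing a recognition model according to an embodiment of this application;
FIG. 2 is a flowchart of a method for optimizing a recognition model according to another embodiment of this application;
FIG. 3 is a flowchart of a method for optimizing a recognition model according to yet another embodiment of this application;
FIG. 4 is a flowchart of a specific embodiment of the method for optimizing the recognition model of this application;
FIG. 5 is a schematic diagram of a device for optimizing a recognition model according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of a server according to an embodiment of this application.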
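Detailed Description
Embodiments of this application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain this application, and are not to be construed as limiting it.
Those skilled in the art will understand that, unless specifically stated, the singular forms "a", "an", "said" and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" used in the specification of this application refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes all or any unit and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It should also be understood that terms such as those defined in general-purpose dictionaries should be understood to have a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as here, are not to be interpreted in an idealized or overly formal sense.
To solve the problem that field recognition is currently limited by the data provided by the business side, which hampers optimization of the recognition model, this application provides a method for optimizing a recognition model. Referring to FIG. 1, which is a flowchart of a method for optimizing a recognition model according to one embodiment, the method includes the following steps: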
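S110: Obtain the first field picture to be recognized by the recognition model in production.
In this step, the server obtains the field pictures generated in business production. These field pictures are the material used to provide the training data set to the recognition model; in this embodiment, such a field picture is designated the first field picture. The first field picture is a picture of field content cropped from the object to be recognized. The field content has not yet been through result recognition; if it were used directly as a training data set to optimize the recognition model, the recognition ability of the resulting model would be affected to some extent.
S120: Send the first field picture to the crowdsourcing platform for annotation, and obtain the corresponding annotation information.
In this step, users of the crowdsourcing platform recognize and annotate the first field picture.
The server sends the field content in the first field picture obtained in step S110 to the crowdsourcing platform. Users on the crowdsourcing platform recognize the field content in the first field picture, annotate the first field picture according to the recognition result, and obtain and return the annotation information for the corresponding first field picture.
Because all users on the crowdsourcing platform can recognize and annotate the field content in the first field picture, the crowdsourcing platform may collect more than one recognition result for the field content of a single first field picture. If multiple recognition results are produced on the crowdsourcing platform for one first field picture, the proportion of each recognition result can be computed, and these proportions are used to decide which recognition result is used to annotate the corresponding first field picture.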
在本实施例中,服务器将第一字段图片中的字段内容发送至所述众包 平台,可以是定期对所述第一字段图片中的字段内容经过压缩形成数据压缩包,并发送至所述众包平台。所述众包平台接收到所述数据压缩包,并对其进行解压,供众包平台上的用户对其内容进行识别。
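The packaging step described above can be sketched as follows. This is only a minimal illustration using Python's standard `zipfile` module; the source does not specify the archive format, and the function names and the "file name to raw bytes" mapping are assumptions made for the example.

```python
import io
import zipfile
from typing import Dict


def pack_field_pictures(pictures: Dict[str, bytes]) -> bytes:
    """Compress field pictures (file name -> raw bytes) into one ZIP archive.

    Hypothetical helper: the ZIP format is an assumption, not taken from the source.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in pictures.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def unpack_field_pictures(package: bytes) -> Dict[str, bytes]:
    """Decompress the archive on the receiving (platform) side."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

In such a setup the server would call `pack_field_pictures` on a schedule and transmit the resulting bytes; how the transfer itself happens is left open by the source.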
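S130: periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to that annotation information.
A user sets the frequency at which the annotation information of the first field picture is received according to the optimization frequency of the recognition model. The server receives the annotation information returned by the crowdsourcing platform at this frequency. From the first field picture and its corresponding annotation information, the server forms an entry that can be included in the training data set of the recognition model, namely a training field picture. Because the training field picture has been annotated by users of the crowdsourcing platform, with a large number of recognition results collected and tallied, its accuracy is high and it can be included directly in the recognition model's data as a training sample.
S140: adding the training field picture to the training data set of the recognition model, and optimizing the recognition model using the training data set.
The training field picture obtained in step S130 is added to the training data set of the recognition model as a training sample, thereby updating the training data set. The recognition model is then optimized with the updated training data set, continuously improving its recognition capability.
In the method for optimizing a recognition model provided by the present application, the first field picture to be recognized by the recognition model is acquired, the first field picture is annotated on the crowdsourcing platform to obtain a training field picture containing the corresponding annotation information, and the recognition model is optimized using that training field picture. By annotating first field pictures from production on the crowdsourcing platform, the present application obtains highly reliable training samples that the recognition model can use directly, which solves the problem that business data is plentiful but of low reliability. It also solves the problem that optimization of the recognition model is held back because the training field pictures supplied by the business are limited or are not updated in time.
For step S120, the annotation information includes an annotation code of the first field picture and the crowdsourcing platform's recognition result for the first field picture. The annotation codes correspond one-to-one to the first field pictures, so that the corresponding first field picture can later be retained for further processing of the training data.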
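As to how the recognition result is obtained: specifically, a reference value may be set for the proportion of recognition results mentioned above, the reference value being at least half of the total number of recognition results collected.
If the proportion of one recognition result of the corresponding first field picture is higher than the reference value, that recognition result is considered to be endorsed by the great majority of users, and the corresponding first field picture is annotated with it.
If the proportions of all recognition results of the corresponding first field picture are lower than the reference value, none of the recognition results can be taken directly as the final recognition result of that first field picture. In this case, the picture is processed according to its text layout: for example, it is scanned, and the characters or symbols are separated according to the scan result to form multiple individual characters or symbols, each of which is treated as an independent object of judgment. Each individual character or symbol is recognized on the crowdsourcing platform; if the proportion of one of its recognition results is higher than the reference value, that result is confirmed as the final recognition result of the individual character or symbol. Once all individual characters or symbols in the first field picture have been recognized, the results are combined into the final recognition result of the first field picture, and the picture is annotated. This annotation may differ from the annotation used when the recognition result is obtained in a single pass, so that the recognition model can mark such pictures specially.
If, after the first field picture has been separated into individual characters or symbols, it is still not the case that every individual character or symbol has a recognition result whose proportion exceeds the reference value, then, provided the number of individual characters or symbols whose proportion exceeds the reference value reaches a set ratio of the total, a recognition result is predicted according to the crowdsourcing platform users' judgment of the type of the first field picture, and the picture is annotated with it. This annotation may differ from all the other annotations above, so that the recognition model can mark it in a specific form and improve recognition accuracy.
If the number of individual characters or symbols in the corresponding first field picture whose proportion exceeds the reference value does not reach the set ratio of the total, the first field picture is judged to be in an abnormal state, corresponding abnormality prompt information is returned to the server, and an instruction to re-acquire the corresponding first field picture is issued.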
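The decision cascade above can be illustrated with a small sketch. It is not the applicant's implementation: the function names, the default reference value of 0.5, and the set ratio of 0.8 are assumptions. In particular, the "predicted" branch here simply falls back to the most frequent whole-picture answer, which is a stand-in for the type-based prediction that the text describes without detail.

```python
from collections import Counter
from typing import List, Optional, Tuple


def majority_result(answers: List[str], reference: float = 0.5) -> Optional[str]:
    """Return the answer whose share of all answers exceeds `reference`, else None."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count / len(answers) > reference else None


def label_field_picture(
    whole_answers: List[str],
    per_char_answers: List[List[str]],
    reference: float = 0.5,
    min_resolved: float = 0.8,  # hypothetical "set ratio"
) -> Tuple[Optional[str], str]:
    """Return (text, status); status is one of
    "direct", "per_character", "predicted", "abnormal"."""
    direct = majority_result(whole_answers, reference)
    if direct is not None:
        return direct, "direct"

    # Fall back to voting on each separated character or symbol.
    resolved = [majority_result(a, reference) for a in per_char_answers]
    if resolved and all(r is not None for r in resolved):
        return "".join(resolved), "per_character"

    share = sum(r is not None for r in resolved) / len(resolved) if resolved else 0.0
    if share >= min_resolved:
        # Assumption: use the most frequent whole-picture answer as the prediction.
        return Counter(whole_answers).most_common(1)[0][0], "predicted"

    # Too few characters resolved: flag the picture for re-acquisition.
    return None, "abnormal"
```

The status string plays the role of the distinct annotations mentioned in the text, so that downstream training code can treat directly agreed, per-character, and predicted labels differently.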
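Referring to Figure 2, which is a flowchart of a method for optimizing a recognition model according to a further embodiment, on the basis of the above, step S130 includes the following steps:
S131: periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform;
wherein the step of obtaining a training field picture according to the annotation information of the first field picture includes:
S132: encoding the first recognition result of the corresponding first field picture according to the annotation code;
S133: according to the result of the encoding, assigning the corresponding annotation code to the first recognition result, and taking the first field picture whose recognition is complete as a training field picture.
Corresponding to steps S131 to S133, the recognition result obtained by annotating the first field picture in step S120 is received periodically as the first recognition result. The first recognition result is encoded in correspondence with the annotation code of the corresponding first field picture, so that the correspondence between the recognition result and the first field picture can be confirmed.
The first field picture carrying the annotation information obtained from these two steps, such as the annotation code and the recognition result, has been annotated on the crowdsourcing platform to obtain its first recognition result. Its reliability is therefore much higher than that of unannotated business data, and the first field picture together with its annotation information can be used directly as a training field picture that provides a training sample for the recognition model.
Moreover, because the annotation information includes the annotation code and the first recognition result of the first field picture corresponding to that code, different first field pictures are not confused with one another merely because they are annotated with the same recognition result. This benefits subsequent training data optimization or data augmentation.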
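One way to picture the pairing of annotation code and recognition result is the following sketch. The class and function names are hypothetical, and keying pictures by annotation code in a dictionary is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AnnotationInfo:
    annotation_code: str     # one-to-one with a first field picture
    recognition_result: str  # first recognition result from the platform


@dataclass(frozen=True)
class TrainingFieldPicture:
    annotation_code: str
    image: bytes
    label: str


def build_training_pictures(
    first_pictures: Dict[str, bytes],
    annotations: List[AnnotationInfo],
) -> List[TrainingFieldPicture]:
    """Join returned annotations to their pictures by annotation code."""
    training = []
    for info in annotations:
        image = first_pictures.get(info.annotation_code)
        if image is None:
            continue  # annotation refers to an unknown picture; skip it
        training.append(
            TrainingFieldPicture(info.annotation_code, image, info.recognition_result)
        )
    return training
```

Because each entry keeps its own code, two pictures that happen to share the same label (for example the same date) remain distinguishable in the training set.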
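Referring to Figure 3, which is a flowchart of a method for optimizing a recognition model according to another embodiment, on the basis of the above, before the step in S140 of adding the training field picture to the training data set of the recognition model, the method further includes:
S1: performing data augmentation on the first field picture to obtain a second field picture;
S2: sending the second field picture to the crowdsourcing platform for annotation, and obtaining a second recognition result of the second field picture according to the corresponding annotation code;
S3: comparing the first recognition result with the second recognition result;
S4: if the two are consistent, taking the second field picture as a training field picture.
In step S1, to increase the robustness of the recognition model, data augmentation is applied to the first field picture to obtain a corresponding second field picture. That is, the quality of the first field picture is reduced, so that the recognition model can still recognize the correct text when it encounters field pictures of poor quality.
In step S2, the second field picture obtained in step S1 is sent to the crowdsourcing platform again and annotated. The annotation code of the recognition result obtained by annotating the same original field picture again is either the same as that of the corresponding first field picture, or extends it with a number indicating the annotation round. For example, for the same original field picture, if the annotation code used when annotating its first field picture is N123-1, the annotation code used when annotating the second field picture obtained after data augmentation is N123-2. This makes it possible to quickly query or search the data processing history of a field picture, which provides a reference for adjusting the optimization method of the recognition model.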
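The round-numbered code scheme in the example (N123-1, N123-2) could be handled with two small helpers. The helper names and the strict "base-round" format are assumptions; the source only gives the single example.

```python
from typing import Tuple


def annotation_code(base: str, round_number: int) -> str:
    """Build a code such as 'N123-2' from the picture's base code and annotation round."""
    if round_number < 1:
        raise ValueError("round_number must be at least 1")
    return f"{base}-{round_number}"


def parse_annotation_code(code: str) -> Tuple[str, int]:
    """Split 'N123-2' into ('N123', 2); raise ValueError if there is no round suffix."""
    base, sep, round_text = code.rpartition("-")
    if not sep or not base or not round_text.isdigit():
        raise ValueError(f"not a round-numbered annotation code: {code!r}")
    return base, int(round_text)
```

Grouping records by the parsed base code then yields the processing history of one original field picture across annotation rounds.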
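Specifically, the second field picture is sent to the crowdsourcing platform for recognition and the corresponding second recognition result is obtained; the process of obtaining the second recognition result is the same as the process of obtaining the recognition result described above.
In step S3, the first recognition result is compared with the second recognition result. This prevents field pictures that have been over-augmented from being used as training samples of the recognition model and degrading its recognition capability.
Accordingly, in step S4, if the comparison in step S3 shows that the two results are consistent, users of the crowdsourcing platform can still make out the content of the first field picture after data augmentation. The second field picture obtained in this case can therefore be used as a training field picture that provides a training sample for the recognition model.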
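For step S1 above, in which data augmentation is applied to the first field picture to obtain a second field picture, at least the following approaches may be used:
The first approach has the following steps:
A1: recognizing the valid content of the first field picture and determining a first valid region of the first field picture;
A2: cropping the first field picture outside the boundary of the first valid region, wherein the border of the cropped region is a detection box;
A3: shrinking the detection box inward by several pixels to obtain a second valid region, and cropping out the second field picture according to the second valid region.
When the data augmentation is cropping of the first field picture, the valid content of the first field picture must be recognized and the first valid region determined, in order to keep the content of the first field picture intact during cropping. The valid content is the field content of the first field picture.
In this embodiment, the first field picture may be binarized to obtain the first valid region in which its field content lies. Based on this first valid region, the first field picture is cropped outside the boundary of the first valid region. The cropped region is the detection region and its border is the detection box, which covers the entire first valid region.
The cropping consists of randomly shrinking the detection box inward by several pixels, which yields a second field picture with a reduced detection box. The shrinking stays outside the boundary of the valid region, so that the field content inside the detection box is not cut off.
Before the step in A3 of shrinking the detection box inward by several pixels, the following step is also included:
A31: expanding the detection box outward by several pixels to obtain the first valid region, wherein the number of pixels of outward expansion is greater than the number of pixels of inward shrinking.
Before the detection box is shrunk inward in step A3, it is expanded outward by several pixels to obtain the first valid region. This simulates small fluctuations of the detection box output for the cropped region, so that the recognition model is exposed to different data augmentation cases and its recognition capability is ultimately improved.
The outward expansion of the detection box is larger than the inward shrinking, which prevents the subsequent shrinking from reaching into the valid region and damaging the integrity of the field content.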
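A rough sketch of this crop-based augmentation follows, assuming a grayscale picture stored as a list of rows with dark text on a light background. The threshold of 128, the function names, and the default pixel counts are illustrative assumptions rather than values from the source. The per-side clamp is an added safeguard so the crop never cuts into the content even when the expansion is limited by the picture border.

```python
import random
from typing import List, Optional, Tuple

Image = List[List[int]]          # grayscale rows, 0 = black, 255 = white
Box = Tuple[int, int, int, int]  # left, top, right, bottom (inclusive)


def content_box(image: Image, threshold: int = 128) -> Optional[Box]:
    """Binarize the picture and return the bounding box of dark pixels, or None if blank."""
    coords = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value < threshold]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def jittered_crop(image: Image, expand: int = 3, max_shrink: int = 2,
                  rng: Optional[random.Random] = None) -> Image:
    """Expand the content box outward, then shrink each side by a random, smaller amount."""
    if not 0 <= max_shrink < expand:
        raise ValueError("expansion must be larger than the maximum shrink")
    rng = rng or random.Random()
    box = content_box(image)
    if box is None:
        raise ValueError("no field content found")
    left, top, right, bottom = box
    height, width = len(image), len(image[0])

    # Expand outward, clamped to the picture.
    exp_left, exp_top = max(left - expand, 0), max(top - expand, 0)
    exp_right = min(right + expand, width - 1)
    exp_bottom = min(bottom + expand, height - 1)

    # Shrink each side inward by a random amount, never past the content itself.
    new_left = min(exp_left + rng.randint(0, max_shrink), left)
    new_top = min(exp_top + rng.randint(0, max_shrink), top)
    new_right = max(exp_right - rng.randint(0, max_shrink), right)
    new_bottom = max(exp_bottom - rng.randint(0, max_shrink), bottom)

    return [row[new_left:new_right + 1] for row in image[new_top:new_bottom + 1]]
```

Calling `jittered_crop` several times on the same picture gives slightly different margins around the same text, which is the "small fluctuation of the detection box" the description refers to.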
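The second approach has the following steps:
B1: applying motion blur to the first field picture by moving it in a random direction, to obtain the second field picture.
When the data augmentation is motion blur applied to the first field picture, the first field picture is moved in a random direction; this may mean shaking it in several directions or moving it quickly in any single direction. Relative to the corresponding first field picture, the second field picture is blurred by the movement. This yields second field pictures of reduced quality, which increases the number of training field pictures available to the recognition model and improves its recognition capability.
No specific value is prescribed for the speed at which the first field picture is moved; it suffices that the same recognition result is obtained after motion blur.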
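As an illustration of approach B1, a linear motion blur can be approximated by averaging each pixel with its neighbours along one direction. The sketch below works on a grayscale picture stored as a list of rows; the `radius` parameter, the nearest-pixel sampling, and the edge clamping are assumptions chosen for simplicity, not details from the source.

```python
import math
import random
from typing import List, Optional

Image = List[List[int]]  # grayscale rows, values 0..255


def motion_blur(image: Image, radius: int = 2, angle: Optional[float] = None,
                rng: Optional[random.Random] = None) -> Image:
    """Average each pixel over 2 * radius + 1 samples along a (random) direction."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    rng = rng or random.Random()
    if angle is None:
        angle = rng.uniform(0.0, math.pi)  # random blur direction in radians
    dx, dy = math.cos(angle), math.sin(angle)
    height, width = len(image), len(image[0])
    offsets = [(round(t * dx), round(t * dy)) for t in range(-radius, radius + 1)]

    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for ox, oy in offsets:
                sx = min(max(x + ox, 0), width - 1)   # clamp to picture edges
                sy = min(max(y + oy, 0), height - 1)
                total += image[sy][sx]
            row.append(round(total / len(offsets)))
        blurred.append(row)
    return blurred
```

A larger `radius` corresponds to faster movement; since the text sets no fixed speed, the radius would in practice be chosen so that annotators still return the same recognition result.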
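The third approach has the following steps:
C1: downsampling the first field picture by a certain factor to reduce its resolution, to obtain the second field picture.
In this embodiment, the first field picture is downsampled, which lowers its picture quality and yields a second field picture of reduced resolution. No specific value is prescribed for the downsampling factor; it suffices that the same recognition result is obtained after downsampling.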
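Approach C1 can be sketched with simple block averaging on a grayscale picture stored as a list of rows. Block averaging is only one of several possible downsampling schemes and is an assumption here, as is the integer `factor` parameter.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, values 0..255


def downsample(image: Image, factor: int) -> Image:
    """Reduce resolution by averaging each factor x factor block into one pixel.

    Blocks at the right and bottom edges may be smaller and are averaged as they are.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    height, width = len(image), len(image[0])
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [image[y][x]
                     for y in range(top, min(top + factor, height))
                     for x in range(left, min(left + factor, width))]
            row.append(round(sum(block) / len(block)))
        result.append(row)
    return result
```

For example, a factor of 2 halves both dimensions; whether the result is still usable is decided by comparing the annotators' result for it with the first recognition result.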
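The first field picture subjected to the data augmentation described above may be rotated to any orientation. When it is sent to the crowdsourcing platform for annotation, the user may choose to rotate the first field picture into position before recognizing it.
Specifically, the annotation information obtained after annotating the first field picture may include the field picture as re-oriented by the crowdsourcing platform user, which standardizes the training field pictures of the recognition model. At the same time, the orientation of the first field picture is not restricted, so that training field pictures covering different cases are added for the recognition model, enhancing its recognition capability.
In the data augmentation of the first field picture, one or more of the above approaches may be chosen to process the first field picture, each yielding a corresponding second field picture. This increases the number of training field pictures available to the recognition model and improves its recognition capability.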
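Referring to Figure 4, which is a flowchart of a specific embodiment of the method for optimizing a recognition model of the present application, the specific embodiment is described below:
S401: acquiring a first field picture to be recognized by the recognition model in production;
S402: sending it to the crowdsourcing platform for annotation;
S403: receiving the first recognition result obtained by annotating the first field picture;
S404: encoding the first recognition result and obtaining the corresponding annotation code;
S405: performing data augmentation on the first field picture to obtain a second field picture, and returning to step S402;
S406: receiving the second recognition result obtained by annotating the second field picture;
S407: encoding the second recognition result and obtaining the corresponding annotation code;
S408: checking whether the first recognition result and the second recognition result are consistent;
if the two are consistent, proceeding to step S409:
S409: taking the second field picture as a training field picture;
if the two are inconsistent, proceeding to step S410:
S410: discarding the first field picture and no longer recognizing it.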
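The flow of Figure 4 can be summarized in a short sketch. Here `annotate` and `augment` are hypothetical callables standing in for the crowdsourcing round trip (S402 to S404, S406 to S407) and for the data augmentation of S405; neither name comes from the source, and the annotation codes are omitted to keep the example small.

```python
from typing import Callable, List, Optional, Tuple


def accepted_augmented_pictures(
    first_pictures: List[bytes],
    annotate: Callable[[bytes], Optional[str]],
    augment: Callable[[bytes], bytes],
) -> List[Tuple[bytes, str]]:
    """Return (second field picture, label) pairs that pass the consistency check."""
    accepted = []
    for first in first_pictures:             # S401
        first_result = annotate(first)       # S402-S404
        if first_result is None:
            continue                         # no usable first recognition result
        second = augment(first)              # S405
        second_result = annotate(second)     # S406-S407
        if second_result == first_result:    # S408
            accepted.append((second, second_result))  # S409
        # Otherwise the picture is dropped from this round (S410).
    return accepted
```

Passing the two operations in as callables keeps the sketch independent of any particular crowdsourcing interface or image library.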
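In the method for optimizing a recognition model provided by the present application, first field pictures from business production are annotated on the crowdsourcing platform, and the corresponding recognition results are output so as to provide the recognition model with training field pictures as training samples, so that the recognition model can be optimized in a timely manner.
Based on the same inventive concept as the above method for optimizing a recognition model, an embodiment of the present application further provides an apparatus for optimizing a recognition model which, as shown in Figure 5, includes:
an acquisition module 510, configured to acquire a first field picture to be recognized by the recognition model in production;
an annotation module 520, configured to send the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture;
a receiving module 530, configured to periodically receive the annotation information of the first field picture returned by the crowdsourcing platform, and to obtain a training field picture according to that annotation information;
an optimization module 540, configured to add the training field picture to the training data set of the recognition model, and to optimize the recognition model using the training data set.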
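Referring to Figure 6, which is a schematic diagram of the internal structure of a server in one embodiment, the server includes a processor 610, a non-volatile storage medium 620, a memory 630 and a network interface 640 connected through a system bus. The non-volatile storage medium 620 of the server stores an operating system, a database and computer-readable instructions; the database may store a sequence of control information. When executed by the processor 610, the computer-readable instructions cause the processor 610 to implement a method for optimizing a recognition model, and the processor 610 can implement the functions of the acquisition module 510, the annotation module 520, the receiving module 530 and the optimization module 540 of the apparatus for optimizing a recognition model in the embodiment shown in Figure 5. The processor 610 of the server provides computing and control capabilities and supports the operation of the entire server. The memory 630 of the server may store computer-readable instructions which, when executed by the processor 610, cause the processor 610 to perform a method for optimizing a recognition model. The network interface 640 of the server is used to connect to and communicate with a terminal. Those skilled in the art will understand that the structure shown in Figure 6 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the server to which the solution is applied; a specific server may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, the present application further provides a non-volatile storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring a first field picture to be recognized by the recognition model in production; sending the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture; periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to that annotation information; and adding the training field picture to the training data set of the recognition model, and optimizing the recognition model using the training data set.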
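From the above embodiments, the greatest beneficial effects of the present application are as follows:
In the method and apparatus for optimizing a recognition model provided by the present application, users of a crowdsourcing platform directly annotate the first field pictures to be recognized that the recognition model encounters in production, recognizing the field content of each first field picture. According to the corresponding annotation information, the corresponding training field pictures are obtained and added to the training data set of the recognition model as training samples, so that the recognition model is optimized continuously. The present application uses the crowdsourcing platform to recognize and annotate a large number of field pictures, so that a large number of highly reliable training field pictures can be supplied to the recognition model within a short time. The training samples of the recognition model are thus updated in time, the recognition model can be optimized at short intervals, and its recognition capability is improved.
The present application further provides a refinement in which data augmentation is applied to the first field picture to obtain a second field picture. The crowdsourcing platform is likewise used to annotate the second field picture and obtain the corresponding recognition result. By comparing the recognition result of the first field picture with that of the second field picture, second field pictures that have undergone a moderate degree of data augmentation are selected as training field pictures, which adds training samples for the recognition model and further improves its recognition capability.
In summary, by using the crowdsourcing platform, the method and apparatus for optimizing a recognition model of the present application avoid the situation in which highly reliable training field pictures cannot be obtained within a short time, and also solve the problem that the large amount of field data generated in production has low reliability because it has not been annotated. The result is a solution in which the large amount of field data generated in production can be used to optimize the recognition model quickly.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable non-volatile storage medium and, when executed, may include the processes of the embodiments of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (ROM), or a random access memory (RAM), and so on.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments have been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments represent only several implementations of the present application. Their description is relatively specific and detailed, but it should not be understood as limiting the patent scope of the present application. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within its scope of protection. The scope of protection of the present application shall therefore be determined by the appended claims.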

Claims (20)

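  1. A method for optimizing a recognition model, comprising the following steps:
    acquiring a first field picture to be recognized by the recognition model in production;
    sending the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture;
    periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to the annotation information of the first field picture;
    adding the training field picture to a training data set of the recognition model, and optimizing the recognition model using the training data set.
  2. The method according to claim 1, wherein the annotation information comprises an annotation code of the first field picture and a first recognition result given by the crowdsourcing platform for the first field picture bearing the annotation code;
    the step of obtaining a training field picture according to the annotation information of the first field picture comprises:
    encoding the first recognition result of the corresponding first field picture according to the annotation code;
    according to the result of the encoding, assigning the corresponding annotation code to the first recognition result, and taking the first field picture whose recognition is complete as the training field picture.
  3. The method according to claim 2, further comprising, before the step of adding the training field picture to the training data set of the recognition model:
    performing data augmentation on the first field picture to obtain a second field picture;
    sending the second field picture to the crowdsourcing platform for annotation, and obtaining a second recognition result of the second field picture according to the corresponding annotation code;
    comparing the first recognition result with the second recognition result;
    if the two are consistent, taking the second field picture as a training field picture.
  4. The method according to claim 3, wherein the step of performing data augmentation on the first field picture to obtain a second field picture comprises:
    recognizing valid content of the first field picture and determining a first valid region of the first field picture;
    cropping the first field picture outside the boundary of the first valid region, wherein the border of the cropped region is a detection box;
    shrinking the detection box inward by several pixels to obtain a second valid region, and cropping out the second field picture according to the second valid region;
    wherein the valid content is the field content of the first field picture.
  5. The method according to claim 4, further comprising, before the step of shrinking the valid region inward by several pixels:
    expanding the detection box outward by several pixels to obtain the first valid region, wherein the number of pixels of outward expansion is greater than the number of pixels of inward shrinking.
  6. The method according to claim 3, wherein the step of performing data augmentation on the first field picture to obtain a second field picture comprises:
    applying motion blur to the first field picture by moving it in a random direction, to obtain the second field picture.
  7. The method according to claim 3, wherein the step of performing data augmentation on the first field picture to obtain a second field picture comprises:
    downsampling the first field picture by a certain factor to reduce the resolution of the first field picture, to obtain the second field picture.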
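  8. An apparatus for optimizing a recognition model, comprising:
    an acquisition module, configured to acquire a first field picture to be recognized by the recognition model in production;
    an annotation module, configured to send the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture;
    a receiving module, configured to periodically receive the annotation information of the first field picture returned by the crowdsourcing platform, and to obtain a training field picture according to the annotation information of the first field picture;
    an optimization module, configured to add the training field picture to a training data set of the recognition model, and to optimize the recognition model using the training data set.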
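  9. A server, comprising:
    one or more processors;
    a memory;
    one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the one or more computer programs being configured to perform a method for optimizing a recognition model, the method comprising the following steps:
    acquiring a first field picture to be recognized by the recognition model in production;
    sending the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture;
    periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to the annotation information of the first field picture;
    adding the training field picture to a training data set of the recognition model, and optimizing the recognition model using the training data set.
  10. The server according to claim 9, wherein the annotation information comprises an annotation code of the first field picture and a first recognition result given by the crowdsourcing platform for the first field picture bearing the annotation code;
    the step of obtaining a training field picture according to the annotation information of the first field picture comprises:
    encoding the first recognition result of the corresponding first field picture according to the annotation code;
    according to the result of the encoding, assigning the corresponding annotation code to the first recognition result, and taking the first field picture whose recognition is complete as the training field picture.
  11. The server according to claim 10, further comprising, before the step of adding the training field picture to the training data set of the recognition model:
    performing data augmentation on the first field picture to obtain a second field picture;
    sending the second field picture to the crowdsourcing platform for annotation, and obtaining a second recognition result of the second field picture according to the corresponding annotation code;
    comparing the first recognition result with the second recognition result;
    if the two are consistent, taking the second field picture as a training field picture.
  12. The server according to claim 11, wherein the step of performing data augmentation on the first field picture to obtain a second field picture comprises:
    recognizing valid content of the first field picture and determining a first valid region of the first field picture;
    cropping the first field picture outside the boundary of the first valid region, wherein the border of the cropped region is a detection box;
    shrinking the detection box inward by several pixels to obtain a second valid region, and cropping out the second field picture according to the second valid region;
    wherein the valid content is the field content of the first field picture.
  13. The server according to claim 12, further comprising, before the step of shrinking the valid region inward by several pixels:
    expanding the detection box outward by several pixels to obtain the first valid region, wherein the number of pixels of outward expansion is greater than the number of pixels of inward shrinking.
  14. The server according to claim 11, wherein the step of performing data augmentation on the first field picture to obtain a second field picture comprises:
    applying motion blur to the first field picture by moving it in a random direction, to obtain the second field picture.
  15. The server according to claim 11, wherein the step of performing data augmentation on the first field picture to obtain a second field picture comprises:
    downsampling the first field picture by a certain factor to reduce the resolution of the first field picture, to obtain the second field picture.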
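  16. A computer-readable non-volatile storage medium storing a computer program which, when executed by a processor, implements a method for optimizing a recognition model, the method comprising the following steps:
    acquiring a first field picture to be recognized by the recognition model in production;
    sending the first field picture to a crowdsourcing platform for annotation to obtain corresponding annotation information, wherein users of the crowdsourcing platform recognize and annotate the first field picture;
    periodically receiving the annotation information of the first field picture returned by the crowdsourcing platform, and obtaining a training field picture according to the annotation information of the first field picture;
    adding the training field picture to a training data set of the recognition model, and optimizing the recognition model using the training data set.
  17. The non-volatile storage medium according to claim 16, wherein the annotation information comprises an annotation code of the first field picture and a first recognition result given by the crowdsourcing platform for the first field picture bearing the annotation code;
    the step of obtaining a training field picture according to the annotation information of the first field picture comprises:
    encoding the first recognition result of the corresponding first field picture according to the annotation code;
    according to the result of the encoding, assigning the corresponding annotation code to the first recognition result, and taking the first field picture whose recognition is complete as the training field picture.
  18. The non-volatile storage medium according to claim 17, further comprising, before the step of adding the training field picture to the training data set of the recognition model:
    performing data augmentation on the first field picture to obtain a second field picture;
    sending the second field picture to the crowdsourcing platform for annotation, and obtaining a second recognition result of the second field picture according to the corresponding annotation code;
    comparing the first recognition result with the second recognition result;
    if the two are consistent, taking the second field picture as a training field picture.
  19. The non-volatile storage medium according to claim 18, wherein the step of performing data augmentation on the first field picture to obtain a second field picture comprises:
    recognizing valid content of the first field picture and determining a first valid region of the first field picture;
    cropping the first field picture outside the boundary of the first valid region, wherein the border of the cropped region is a detection box;
    shrinking the detection box inward by several pixels to obtain a second valid region, and cropping out the second field picture according to the second valid region;
    wherein the valid content is the field content of the first field picture.
  20. The non-volatile storage medium according to claim 19, further comprising, before the step of shrinking the valid region inward by several pixels:
    expanding the detection box outward by several pixels to obtain the first valid region, wherein the number of pixels of outward expansion is greater than the number of pixels of inward shrinking.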
PCT/CN2019/103009 2019-04-04 2019-08-28 识别模型的优化方法和装置 WO2020199472A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910270038.3A CN110135409B (zh) 2019-04-04 2019-04-04 识别模型的优化方法和装置
CN201910270038.3 2019-04-04

Publications (1)

Publication Number Publication Date
WO2020199472A1 true WO2020199472A1 (zh) 2020-10-08

Family

ID=67569369

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103009 WO2020199472A1 (zh) 2019-04-04 2019-08-28 识别模型的优化方法和装置

Country Status (2)

Country Link
CN (1) CN110135409B (zh)
WO (1) WO2020199472A1 (zh)


Also Published As

Publication number Publication date
CN110135409A (zh) 2019-08-16
CN110135409B (zh) 2023-11-03


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19923496

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19923496

Country of ref document: EP

Kind code of ref document: A1