WO2021196935A1 - Data verification method and apparatus, electronic device, and storage medium - Google Patents

Data verification method and apparatus, electronic device, and storage medium

Info

Publication number
WO2021196935A1
WO2021196935A1 PCT/CN2021/078082 CN2021078082W WO2021196935A1 WO 2021196935 A1 WO2021196935 A1 WO 2021196935A1 CN 2021078082 W CN2021078082 W CN 2021078082W WO 2021196935 A1 WO2021196935 A1 WO 2021196935A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
field name
type
image file
data
Application number
PCT/CN2021/078082
Other languages
English (en)
Chinese (zh)
Inventor
刘振涛
Original Assignee
深圳壹账通智能科技有限公司
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021196935A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Definitions

  • This application relates to the field of data processing technology, and in particular to a data verification method and apparatus, an electronic device, and a storage medium.
  • A data verification method includes: obtaining the business type of a target business and the image file that needs to be verified for the target business; determining the file type of the image file according to the image file, determining the file type of the target verification file according to the business type and the file type of the image file, and determining the data source identifier and the file identifier of the target verification file according to the image file, wherein the target verification file is a file for verifying the image file;
  • inputting the business type and the file type of the image file into a pre-trained first machine learning model, and outputting the first field name that needs to be verified in the image file corresponding to the file type, wherein the pre-trained first machine learning model is obtained by training on sample data that includes the business type, the file type of the image file, and the first field name that needs to be verified in the image file;
  • inputting the file type of the image file, the business type, the file type of the target verification file, and the first field name into a pre-trained second machine learning model, and outputting the second field name in the target verification file that is used to verify the field value data in the first field name, wherein the pre-trained second machine learning model is obtained by training on sample data that includes the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name;
  • acquiring the field value data in the first field name according to the first field name, and acquiring the target verification file according to the data source identifier of the target verification file and the file identifier of the target verification file; and
  • verifying the field value data in the first field name based on the field value data in the second field name in the target verification file.
  • A data verification device includes: a first acquisition unit, used to obtain the business type of a target business and the image file that needs to be verified for the target business; a first execution unit, used to determine the file type of the image file according to the image file, determine the file type of the target verification file according to the business type and the file type of the image file, and determine the data source identifier and the file identifier of the target verification file according to the image file, wherein the target verification file is a file for verifying the image file;
  • a second execution unit, used to input the business type and the file type of the image file into the pre-trained first machine learning model and output the first field name that needs to be verified in the image file corresponding to the file type, wherein the pre-trained first machine learning model is obtained by training on sample data that includes the business type, the file type of the image file, and the first field name that needs to be verified in the image file;
  • a third execution unit, used to input the file type of the image file, the business type, the file type of the target verification file, and the first field name into the pre-trained second machine learning model and output the second field name in the target verification file that is used to verify the field value data in the first field name, wherein the pre-trained second machine learning model is obtained by training on sample data that includes the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name; and
  • a verification unit, used to verify the field value data in the first field name based on the field value data in the second field name in the target verification file, once the field value data in the first field name and the target verification file have been obtained.
  • An electronic device includes a memory and a processor. The memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the processor executes the following steps:
  • obtaining the business type of a target business and the image file that needs to be verified for the target business; determining the file type of the image file according to the image file, determining the file type of the target verification file according to the business type and the file type of the image file, and determining the data source identifier and the file identifier of the target verification file according to the image file, wherein the target verification file is a file for verifying the image file;
  • inputting the business type and the file type of the image file into the pre-trained first machine learning model and outputting the first field name that needs to be verified in the image file, wherein the pre-trained first machine learning model is obtained by training on sample data that includes the business type, the file type of the image file, and the first field name that needs to be verified in the image file;
  • inputting the file type of the image file, the business type, the file type of the target verification file, and the first field name into the pre-trained second machine learning model and outputting the second field name in the target verification file that is used to verify the field value data in the first field name, wherein the pre-trained second machine learning model is obtained by training on sample data that includes the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name;
  • acquiring the field value data in the first field name according to the first field name, and acquiring the target verification file according to its data source identifier and file identifier; and verifying the field value data in the first field name based on the field value data in the second field name in the target verification file.
  • A storage medium stores computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps of the data verification method described above.
  • This application can quickly and accurately verify each image file.
  • Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
  • Fig. 2 is a flowchart of a data verification method shown in an exemplary embodiment of the application.
  • FIG. 3 is a specific flowchart of step S220 of the data verification method shown in an exemplary embodiment of the application.
  • Fig. 4 is a flowchart of a data verification method shown in an exemplary embodiment of the application.
  • Fig. 5 is a flowchart of a data verification method shown in an exemplary embodiment of the application.
  • Fig. 6 is a block diagram of a data verification device shown in an exemplary embodiment of the present application.
  • Fig. 7 is an exemplary block diagram of an electronic device for implementing the foregoing data verification method according to an exemplary embodiment of the present application.
  • Fig. 8 shows a computer-readable storage medium for implementing the above-mentioned data verification method according to an exemplary embodiment of the present application.
  • Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
  • The system architecture may include a client (as shown in FIG. 1, one or more of a smart phone 101, a tablet computer 102, and a portable computer 103; it may also be a desktop computer, etc.), a network 104, and a server 105.
  • The network 104 is the medium that provides a communication link between the client and the server 105.
  • The network 104 may include various connection types, such as wired communication links and wireless communication links.
  • The numbers of clients, networks, and servers in FIG. 1 are merely illustrative. There can be any number of clients, networks, and servers according to implementation needs.
  • The server 105 may be a server cluster composed of multiple servers. The user can use the client to interact with the server 105 through the network 104 to receive or send messages.
  • The server 105 can be a server that provides various services, such as a data verification service.
  • The client obtains the business type of the target business and the image file that needs to be verified for the target business; determines the file type of the image file according to the image file; determines the file type of the target verification file according to the business type and the file type of the image file; and determines the data source identifier and the file identifier of the target verification file according to the image file, where the target verification file is the file for verifying the image file.
  • The client inputs the business type and the file type of the image file into the pre-trained first machine learning model and outputs the first field name that needs to be verified in the image file. The pre-trained first machine learning model is obtained by training on sample data that includes the business type, the file type of the image file, and the first field name that needs to be verified in the image file.
  • The client inputs the file type of the image file, the business type, the file type of the target verification file, and the first field name into the pre-trained second machine learning model and outputs the second field name in the target verification file that is used to verify the field value data in the first field name. The pre-trained second machine learning model is obtained by training on sample data that includes the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name.
  • The client then obtains the field value data in the first field name according to the first field name, obtains the target verification file according to its data source identifier and file identifier, and verifies the field value data in the first field name based on the field value data in the second field name in the target verification file.
  • With the pre-trained first machine learning model, the first field name that needs to be verified in each image file can be determined quickly according to the business type of the target business and the image file that needs to be verified, which avoids verifying the field value data in other field names that do not need to be verified.
  • With the pre-trained second machine learning model, the second field name in the target verification file that is used to verify the field value data in the first field name can be determined according to the file type of the image file, the business type, the file type of the target verification file, and the first field name. The verification file to be used, and the field value data in the second field name that checks the field value data in the first field name, are thereby determined quickly and accurately.
  • the data verification method provided in the embodiments of the present application is generally executed by the client, and correspondingly, the data verification device is generally set in the client.
  • The server 105 may also have functions similar to those of the client, so as to execute the data verification method provided in the embodiments of the present application. The implementation details of the technical solutions of the embodiments of the present application are described below.
  • FIG. 2 is a flowchart of a data verification method shown in an exemplary embodiment of this application.
  • The execution subject of the data verification method in this embodiment is the client shown in FIG. 1. The method may include the following steps S210 to S260, which are described in detail as follows.
  • In step S210, the business type of the target business and the image file that needs to be verified for the target business are obtained.
  • The target business is a specific business that the user can handle.
  • Business types may include, for example, insurance policy loans, mortgage loans, and personal housing loans.
  • The image file that needs to be verified for the target business is the image file that the user submits when handling the business.
  • The user can enter the business type and the image file that needs to be verified through a virtual button provided on the business handling page of the client.
  • There may be one or more image files; the number is determined by the actual needs of the business being handled.
  • In step S220, the file type of the image file is determined according to the image file; the file type of the target verification file is determined according to the business type and the file type of the image file; and the data source identifier and the file identifier of the target verification file are determined according to the image file, wherein the target verification file is a file for verifying the image file.
  • The file type of the image file is the file type determined by recognizing the image file.
  • File types differ between image files. For example, the file type of an image file can be ID card, insurance policy, real estate certificate, or mortgage contract.
  • The file type can be determined based on the character data contained in the image file.
  • FIG. 3 is a specific flowchart of step S220 of the data verification method shown in an exemplary embodiment of the application.
  • Step S220 may include step S310 to step S320, which are described in detail as follows.
  • Step S310: Perform OCR character recognition on the image file to obtain recognized text information.
  • When the file type of the image file is determined according to the image file, OCR character recognition may be performed on the image file to obtain the recognized text information.
  • The recognized text information is the set of character data obtained by recognizing all the character data in the image file.
  • The character data set includes the character string corresponding to each field name in the image file and the character string corresponding to the field value data in each field name.
  • For example, the character strings corresponding to field names may be "Insured name", "Insurance amount", "Insurance company name", "Insurance policy number", etc.
  • The character string corresponding to the field value data in the field name "Insured name" can be "Zhang San"; for "Insurance amount" it can be "10000.00"; for "Insurance company name" it can be "Ping An Insurance Company of China"; and for "Insurance policy number" it can be "5485426232".
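  • The split of recognized text into field names and field value data can be sketched as follows. This is a minimal illustration only: it assumes the OCR output is plain text with one "field name: value" pair per line, which the application does not specify, and the function name `parse_recognized_text` is hypothetical.

```python
def parse_recognized_text(text: str) -> dict:
    """Split OCR output into {field name: field value data}.

    Assumption (not from the application): one 'name: value' pair per line.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        # Skip lines that have no separator or no field name.
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


recognized_text = (
    "Insured name: Zhang San\n"
    "Insurance amount: 10000.00\n"
    "Insurance company name: Ping An Insurance Company of China\n"
    "Insurance policy number: 5485426232\n"
)
fields = parse_recognized_text(recognized_text)
```

  • Real OCR output is rarely this regular, so a production system would likely need layout-aware extraction rather than line splitting.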
  • Step S320: Determine the file type of the image file according to the key field names included in the recognized text information.
  • Image files can be classified based on the key field names that distinguish them, so as to determine the file type of each image file. For example, in a loan scenario, if the recognized text information of an image file contains the four key field names "name of applicant", "name of insurance company", "policy number", and "type of insurance", the file type of the image file can be determined to be an insurance policy. It should be pointed out that a key field name is generally a specific field name that identifies the image file.
  • There can be one or more such specific field names; the number can be determined according to the actual classification situation.
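  • A minimal sketch of classification by key field names is shown below. The insurance policy entry uses the four key field names from the example above; the "ID card" entry and all identifiers (`KEY_FIELDS_BY_FILE_TYPE`, `determine_file_type`) are assumptions made for illustration.

```python
# Key field names per file type. The insurance policy entry follows the
# example in the text; the ID card entry is a hypothetical addition.
KEY_FIELDS_BY_FILE_TYPE = {
    "insurance policy": {
        "name of applicant",
        "name of insurance company",
        "policy number",
        "type of insurance",
    },
    "ID card": {"name", "ID number"},
}


def determine_file_type(recognized_field_names):
    """Return the first file type whose key field names all appear, else None."""
    names = {name.lower() for name in recognized_field_names}
    for file_type, key_fields in KEY_FIELDS_BY_FILE_TYPE.items():
        if {key.lower() for key in key_fields} <= names:
            return file_type
    return None
```

  • For instance, `determine_file_type(["Name", "ID number"])` returns `"ID card"` under these assumed key fields, and an unrecognized set of field names returns `None`.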
  • The verification file is a file for verifying the character data contained in the image file, and its file type is associated with the file type of the image file and the business type.
  • The file type of the verification file used to verify the image file can be determined according to the mapping relationship among the file type of the verification file, the file type of the image file, and the business type.
  • For example, in the policy loan business, the image files that need to be verified include the insurance policy image file, the ID card image file, and the loan note image file input by the user.
  • When determining the verification file for each image file, for the policy image file input by the user in the policy loan business, it can be determined according to the mapping relationship that the insurance company's real policy file is needed to verify the policy image file input by the user.
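  • The mapping relationship can be pictured as a lookup keyed by business type and image file type. The sketch below is illustrative only; the table contents and the names `VERIFICATION_FILE_TYPE` and `target_verification_file_type` are hypothetical, and the application does not say how the mapping is stored.

```python
# Hypothetical mapping: (business type, image file type) -> verification file type.
VERIFICATION_FILE_TYPE = {
    ("policy loan", "insurance policy"): "insurer's real policy file",
}


def target_verification_file_type(business_type, image_file_type):
    """Look up the verification file type; None means no mapping is configured."""
    return VERIFICATION_FILE_TYPE.get((business_type, image_file_type))
```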
  • The data source identifier may be the identification information of the external data server or local data server that stores the verification file. The file identifier of the verification file is the unique identification information that identifies the verification file, such as a data ticket number.
  • The data source identifier and the file identifier of the verification file can be determined according to the character data in the image file.
  • For example, OCR character recognition can be performed on the policy image file input by the user to obtain the recognized text information, which includes all the character data in the policy image file.
  • The field value data "Ping An Insurance Company of China" in the field name "insurance company name" of the recognized text information is used as the data source identifier of the policy file to be verified.
  • The field value data "5485426232" in the field name "insurance policy number" of the recognized text information is used as the file identifier of the policy file to be verified, which makes it possible to obtain that policy file according to its data source identifier and file identifier.
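  • The derivation of the two identifiers from recognized fields can be sketched as follows. The field names follow the example above; the `VerificationFileRef` type and the function name are assumptions made for illustration.

```python
from typing import NamedTuple


class VerificationFileRef(NamedTuple):
    """Hypothetical container for the two identifiers described in the text."""
    data_source_id: str
    file_id: str


def reference_from_policy_fields(fields):
    """Build the reference from the recognized fields of a policy image file."""
    return VerificationFileRef(
        data_source_id=fields["insurance company name"],
        file_id=fields["insurance policy number"],
    )


ref = reference_from_policy_fields({
    "insurance company name": "Ping An Insurance Company of China",
    "insurance policy number": "5485426232",
})
```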
  • In step S230, the business type and the file type of the image file are input into the pre-trained first machine learning model, and the first field name that needs to be verified in the image file corresponding to the file type is output.
  • The pre-trained first machine learning model is obtained by training on sample data that includes the business type, the file type of the image file, and the first field name that needs to be verified in the image file.
  • The first field name is the name of a field in the image file whose field value data needs to be verified. It should be pointed out that when different businesses are handled, the file types of the image files and the field names that need to be verified in them will differ. The field names that need to be verified in an image file are associated with the type of business handled and the file type of the image file. For example, in a loan scenario, when the business handled is the policy loan business and the image file input by the user is an ID card image file, the first field names that need to be verified in the ID card image file are "name" and "ID number"; that is, only the field value data in these two first field names needs to be verified.
  • The business type of the target business and the file type of each image file input by the user can be input into the pre-trained first machine learning model to determine the first field name that needs to be verified in each image file.
  • The field names that need to be verified can be all of the field names contained in the image file or only some of them.
  • FIG. 4 is a flowchart of a data verification method shown in an exemplary embodiment of this application, which may include steps S410 to S420, which are described in detail as follows.
  • In step S410, the training set sample data used to train the first machine learning model is obtained. Each piece of sample data in the training set includes the business type, the file type of the image file, and the first field name that needs to be verified in the image file.
  • The pre-trained first machine learning model is obtained by training a machine learning model on training sample data.
  • The first machine learning model may be a CNN (convolutional neural network) model or a deep neural network model.
  • The specific training process of the first machine learning model is as follows. Obtain the training set sample data. Each piece of sample data in the training set includes the business type of an existing target business, the file type of each image file that needs to be verified for that business, and the first field name that needs to be verified in each image file.
  • In step S420, the first machine learning model to be trained is trained on the acquired training set sample data to obtain the trained first machine learning model.
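  • The shape of the training data and of the model's input and output can be illustrated with a deliberately simple stand-in. The sketch below is not the CNN or deep neural network named in the text: it is a frequency-based lookup that keeps the most common label for each (business type, file type) pair, and all names in it are hypothetical.

```python
from collections import Counter, defaultdict


def train_first_model(samples):
    """Fit a lookup from (business type, file type) to the first field names.

    samples: iterable of (business_type, file_type, first_field_names).
    Stand-in for the neural model: keeps the most frequent label per input pair.
    """
    votes = defaultdict(Counter)
    for business_type, file_type, field_names in samples:
        votes[(business_type, file_type)][frozenset(field_names)] += 1
    return {key: counter.most_common(1)[0][0] for key, counter in votes.items()}


def predict_first_fields(model, business_type, file_type):
    """Return the predicted first field names, or an empty set for unseen inputs."""
    return model.get((business_type, file_type), frozenset())


training_samples = [
    ("policy loan", "ID card", ["name", "ID number"]),
    ("policy loan", "ID card", ["name", "ID number"]),
    ("policy loan", "ID card", ["name"]),
]
model = train_first_model(training_samples)
```

  • A learned model would generalize to unseen combinations, which a lookup table cannot; the sketch only shows what goes in and what comes out.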
  • FIG. 5 is a flowchart of a data verification method shown in an exemplary embodiment of this application, which may include steps S510 to S530, which are described in detail as follows.
  • In step S510, test set sample data used to verify the trained first machine learning model is obtained. Each piece of sample data in the test set includes the business type, the file type of the image file, and the first field name that needs to be verified in the image file.
  • The trained first machine learning model can also be verified with test sample data.
  • Each piece of sample data in the test set also includes the business type of an existing target business, the file type of each image file that needs to be verified for that business, and the first field name that needs to be verified in each image file.
  • In step S520, the business type and the file type of the image file from each piece of sample data in the test set are input into the trained first machine learning model, and the predicted first field name that needs to be verified in the image file is output.
  • In step S530, if the number of pieces of sample data in which the known first field name that needs to be verified matches the predicted first field name, as a proportion of the total number of pieces of sample data in the test set, exceeds a predetermined proportion threshold, the trained first machine learning model is identified as the pre-trained first machine learning model.
  • That is, if the proportion of test set samples in which the known field names that need to be verified for image files of a given file type are the same as the predicted field names exceeds the predetermined proportion threshold, the verification passes. Otherwise, the verification fails, and the first machine learning model needs to continue to be trained until the verification passes.
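  • The acceptance check in step S530 amounts to an exact-match ratio compared against a threshold. A minimal sketch follows; the function name, the stub predictor, and the threshold values are assumptions, since the application does not give a specific threshold.

```python
def passes_verification(predict, test_samples, threshold):
    """Return True if the share of exactly matching predictions exceeds threshold.

    predict: callable (business_type, file_type) -> iterable of field names.
    test_samples: list of (business_type, file_type, expected_field_names).
    """
    if not test_samples:
        raise ValueError("test set is empty")
    matches = sum(
        1
        for business_type, file_type, expected in test_samples
        if set(predict(business_type, file_type)) == set(expected)
    )
    return matches / len(test_samples) > threshold


def stub_predict(business_type, file_type):
    # Hypothetical predictor used only to exercise the check.
    return {"name", "ID number"} if file_type == "ID card" else set()


test_samples = [
    ("policy loan", "ID card", ["name", "ID number"]),
    ("policy loan", "ID card", ["ID number", "name"]),
    ("policy loan", "insurance policy", ["policy number"]),
]
```

  • Here two of the three samples match, so the check passes for a threshold of 0.6 and fails for 0.9, in which case training would continue.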
  • In step S240, the file type of the image file, the business type, the file type of the target verification file, and the first field name are input into the pre-trained second machine learning model, and the second field name in the target verification file that is used to verify the field value data in the first field name is output.
  • The pre-trained second machine learning model is obtained by training on sample data that includes the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name. The field value data in the second field name is used to verify the field value data in the first field name.
  • Differences in these inputs lead to differences in the second field name that needs to be used in the target verification file.
  • For example, the first field names whose field value data needs to be verified in the loan note image file include "Lender name", "Lender ID", and "Lender's mobile phone number".
  • The second field names in the insurance policy that are used to verify the field value data in those first field names include "Insured name", "Insured's ID card", and "Insured's mobile phone number".
  • The field value data in the second field name "Insured name" is used to verify the field value data in the first field name "Lender name"; the field value data in "Insured's ID card" is used to verify the field value data in "Lender ID"; and the field value data in "Insured's mobile phone number" is used to verify the field value data in "Lender's mobile phone number".
  • The second machine learning model may be a CNN (convolutional neural network) model or a deep neural network model.
  • The training sample data of the second machine learning model includes the business type, the file type of the verification file, and the second field name in the verification file that is used to verify the field value data in the first field name.
  • The field value data in the second field name is used to verify the field value data in the first field name. Because the training process of the second machine learning model is similar to that of the first machine learning model, it is not repeated here.
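  • The input and output of the second model can be pictured with a lookup built from the loan note and insurance policy example. This is a stand-in for the trained model, not the model itself; the "policy loan" business type in the keys and all identifiers are assumptions.

```python
# Hypothetical learned associations, keyed by the four model inputs:
# (image file type, business type, verification file type, first field name).
SECOND_FIELD_BY_INPUT = {
    ("loan note", "policy loan", "insurance policy", "Lender name"): "Insured name",
    ("loan note", "policy loan", "insurance policy", "Lender ID"): "Insured's ID card",
    ("loan note", "policy loan", "insurance policy", "Lender's mobile phone number"):
        "Insured's mobile phone number",
}


def second_field_name(image_file_type, business_type, verification_file_type, first_field_name):
    """Return the second field name for the given inputs, or None if unknown."""
    key = (image_file_type, business_type, verification_file_type, first_field_name)
    return SECOND_FIELD_BY_INPUT.get(key)
```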
  • In step S250, the field value data in the first field name is obtained according to the first field name, and the target verification file is obtained according to the data source identifier of the target verification file and the file identifier of the target verification file.
  • The field value data in the first field name that needs to be verified can be obtained from the character data corresponding to the first field name in the image file and used as the field value data to be verified.
  • The target server from which the target verification file is to be obtained can be determined according to the data source identifier of the target verification file, and the required target verification file can then be obtained from that server according to the file identifier.
  • In step S260, the field value data in the first field name is verified based on the field value data in the second field name in the target verification file.
  • The field value data in the second field name of the target verification file is used to verify the field value data in the first field name that needs to be verified.
  • Verifying the field value data in the first field name that needs to be verified in the image file ensures that each image file is verified accurately, which improves the accuracy of verification. In addition, because verification is performed only on the field value data in the first field name that needs to be verified, it avoids verifying the field value data in every field name contained in the image file, which improves the efficiency of verification.
  • The pre-trained first machine learning model can quickly determine, according to the business type of the target business and the image file that needs to be verified, the first field name in each image file that needs to be verified, which avoids verifying the field value data in other field names of the image file that do not need to be verified.
  • The pre-trained second machine learning model can determine, according to the file type of the image file, the business type, the file type of the target verification file, and the first field name, the second field name in the target verification file that is used to verify the field value data in the first field name. The verification file and the field value data in the second field name are thus determined quickly and accurately, so that each image file is verified quickly and accurately while the accuracy of the verification result is ensured.
  • Even when there are multiple business types and multiple image files, only the training data of the pre-trained machine learning models needs to be adjusted.
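  • The inputs and outputs of the two models can be illustrated with the sketch below. Plain lookup tables are substituted for the pre-trained machine learning models purely to show the interfaces; the table contents, type names, and function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Stand-in for the pre-trained first model:
# (business type, image file type) -> first field names that need to be verified.
FIRST_MODEL: Dict[Tuple[str, str], List[str]] = {
    ("loan_application", "id_card"): ["name", "id_number"],
}

# Stand-in for the pre-trained second model:
# (image file type, business type, verification file type, first field name) -> second field name.
SECOND_MODEL: Dict[Tuple[str, str, str, str], str] = {
    ("id_card", "loan_application", "household_register", "name"): "holder_name",
    ("id_card", "loan_application", "household_register", "id_number"): "holder_id",
}


def predict_first_field_names(business_type: str, image_file_type: str) -> List[str]:
    """Return the field names in the image file that need to be verified."""
    return FIRST_MODEL.get((business_type, image_file_type), [])


def predict_second_field_name(
    image_file_type: str,
    business_type: str,
    verification_file_type: str,
    first_field_name: str,
) -> Optional[str]:
    """Return the field in the target verification file used to check first_field_name."""
    key = (image_file_type, business_type, verification_file_type, first_field_name)
    return SECOND_MODEL.get(key)


# Pair each field to be verified with its counterpart in the verification file.
field_pairs = {
    first: predict_second_field_name("id_card", "loan_application", "household_register", first)
    for first in predict_first_field_names("loan_application", "id_card")
}
```

  • Only the fields returned by the first stand-in are paired and later checked, which mirrors the efficiency argument above: fields that do not need verification are never looked up.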
  • After step S250, the method may further include the step of: obtaining the verification result of verifying the field value data in the first field name based on the field value data in the second field name in the target verification file, and displaying the verification result.
  • When the verification result is displayed, it can be imported into the corresponding display document template, according to the correspondence between the file type of the verification file, the file type of the image file input by the user, and the display document template, to generate a display document, so that the corresponding verification result can be viewed more conveniently and intuitively.
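  • Selecting a display document template from the two file types can be sketched as follows. The template text, the file type names, and the `render_result` helper are hypothetical; the application does not describe the template format.

```python
from string import Template
from typing import Dict, Tuple

# Hypothetical correspondence between
# (verification file type, image file type) and a display document template.
TEMPLATES: Dict[Tuple[str, str], Template] = {
    ("household_register", "id_card"): Template(
        "Field '$field' of the ID card: $status (checked against the household register)"
    ),
}


def render_result(verification_file_type: str, image_file_type: str, field: str, passed: bool) -> str:
    """Import one verification result into the template selected by the two file types."""
    template = TEMPLATES[(verification_file_type, image_file_type)]
    return template.substitute(field=field, status="passed" if passed else "failed")


report = render_result("household_register", "id_card", "name", True)
```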
  • FIG. 6 is a block diagram of a data verification device shown in an exemplary embodiment of the present application.
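  • The data verification device 600 may be integrated in the above-mentioned client, and may specifically include a first acquiring unit 610, a first execution unit 620, a second execution unit 630, a third execution unit 640, a second acquiring unit 650, and a verification unit 660.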
  • The first acquiring unit 610 is used to obtain the service type of the target service and the image file that needs to be verified for the target service.
  • The first execution unit 620 is used to determine the file type of the image file according to the image file, determine the file type of the target verification file according to the service type and the file type of the image file, and determine the data source identifier of the target verification file and the file identifier of the target verification file according to the image file, wherein the target verification file is a file for verifying the image file.
  • The second execution unit 630 is used to input the service type and the file type of the image file into the pre-trained first machine learning model and output the first field name that needs to be verified in the image file corresponding to the file type. The pre-trained first machine learning model is obtained by training on sample data containing the business type, the file type of the image file, and the first field name in the image file that needs to be verified.
  • The third execution unit 640 is used to input the file type of the image file, the service type, the file type of the target verification file, and the first field name into the pre-trained second machine learning model and output the second field name in the target verification file that is used to verify the field value data in the first field name. The pre-trained second machine learning model is obtained by training on sample data containing the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name. The field value data in the second field name is used to verify the field value data in the first field name.
  • The second acquiring unit 650 is used to obtain the field value data in the first field name according to the first field name, and to obtain the target verification file according to the data source information of the target verification file and the file identifier of the target verification file.
  • The verification unit 660 is used to verify the field value data in the first field name based on the field value data in the second field name in the target verification file.
  • The first execution unit includes: a recognition sub-unit for performing OCR character recognition on the image file to obtain recognized text information; and an execution sub-unit for obtaining the field names in the image file based on the recognized text information and determining the file type of the image file according to the field names.
  • The data verification device further includes: a display unit, configured to obtain the verification result of verifying the field value data in the first field name based on the field value data in the second field name in the target verification file, and to display the verification result.
  • The data verification device further includes: a third acquiring unit, configured to acquire training set sample data used for training the first machine learning model to be trained, where each piece of sample data in the training set sample data includes the business type, the file type of the image file, and the first field name in the image file that needs to be verified; and a training unit, used to train the first machine learning model to be trained with the training set sample data to obtain the trained first machine learning model.
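  • The shape of the training set can be sketched as follows. The text above does not name a learning algorithm, so a simple lookup table built from the samples is substituted here for the machine learning model; the sample values and the `train_first_model` name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# One piece of sample data: (business type, image file type, first field name).
Sample = Tuple[str, str, str]


def train_first_model(samples: List[Sample]) -> Dict[Tuple[str, str], Set[str]]:
    """Build a lookup from (business type, file type) to the field names to verify.

    This table is only a stand-in for a trained model, used to show the data flow.
    """
    model: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for business_type, file_type, field_name in samples:
        model[(business_type, file_type)].add(field_name)
    return dict(model)


training_set: List[Sample] = [
    ("loan_application", "id_card", "name"),
    ("loan_application", "id_card", "id_number"),
    ("reimbursement", "invoice", "amount"),
]
model = train_first_model(training_set)
```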
  • The data verification device further includes: a fourth acquiring unit, configured to acquire test set sample data used to verify the trained first machine learning model, where each piece of sample data in the test set sample data includes the business type, the file type of the image file, and the first field name in the image file that needs to be verified; a fourth execution unit, used to input the business type and the file type of the image file of each piece of sample data in the test set sample data into the trained first machine learning model and output the predicted first field name that needs to be verified in the image file; and a detection unit, used to identify the trained first machine learning model as the pre-trained first machine learning model if, among the total number of pieces of sample data in the test set sample data, the proportion of pieces for which the first field name that needs to be verified is the same as the predicted first field name exceeds a predetermined ratio threshold.
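  • The acceptance test performed by the detection unit can be sketched as follows. The `toy_predict` function, the sample values, and the threshold of 0.7 are hypothetical; treating an empty test set as a failure is also an assumption.

```python
from typing import Callable, List, Tuple

# One piece of test sample data: (business type, image file type, expected first field name).
Sample = Tuple[str, str, str]


def passes_acceptance(
    predict: Callable[[str, str], str],
    test_set: List[Sample],
    threshold: float,
) -> bool:
    """Accept the trained model only if the share of matching predictions exceeds the threshold."""
    if not test_set:
        return False  # assumption: a model cannot be accepted without test data
    matches = sum(
        1
        for business_type, file_type, expected in test_set
        if predict(business_type, file_type) == expected
    )
    return matches / len(test_set) > threshold


def toy_predict(business_type: str, file_type: str) -> str:
    """Hypothetical trained model that only looks at the file type."""
    return {"id_card": "name", "invoice": "amount"}.get(file_type, "")


test_set: List[Sample] = [
    ("loan", "id_card", "name"),
    ("loan", "invoice", "amount"),
    ("claim", "invoice", "amount"),
    ("claim", "id_card", "id_number"),  # the toy model gets this one wrong
]
accepted = passes_acceptance(toy_predict, test_set, threshold=0.7)
```

  • Three of the four predictions match, so the proportion is 0.75. Because the text says the proportion must exceed the threshold, the comparison is strict: a threshold of exactly 0.75 would reject this model.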
  • The example embodiments described here can be implemented by software, or by combining software with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, U disk, mobile hard disk, etc.) or on a network, and which includes several instructions that cause a computing device (which may be a personal computer, a server, a mobile terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
  • an electronic device capable of implementing the above method is also provided.
  • FIG. 7 is an exemplary block diagram of an electronic device for implementing the foregoing data verification method according to an exemplary embodiment of the application.
  • the electronic device 700 shown in FIG. 7 is only an example, and should not bring any limitation to the functions and scope of use of the embodiments of the present application.
  • the electronic device 700 is represented in the form of a general-purpose computing device.
  • the components of the electronic device 700 may include, but are not limited to: the aforementioned at least one processing unit 710, the aforementioned at least one storage unit 720, and a bus 730 connecting different system components (including the storage unit 720 and the processing unit 710).
  • The storage unit stores program code that can be executed by the processing unit 710, so that the processing unit 710 executes the steps of the various exemplary embodiments described in the “Exemplary Method” section of this specification.
  • the processing unit 710 may perform the following steps:
  • obtaining the service type of the target service and the image file that needs to be verified for the target service;
  • determining the file type of the image file according to the image file, determining the file type of the target verification file according to the service type and the file type of the image file, and determining the data source identification of the target verification file and the file identification of the target verification file according to the image file, wherein the target verification file is a file for verifying the image file;
  • inputting the service type and the file type of the image file into the pre-trained first machine learning model, and outputting the first field name that needs to be verified in the image file corresponding to the file type, wherein the pre-trained first machine learning model is obtained by training on sample data containing the service type, the file type of the image file, and the first field name in the image file that needs to be verified;
  • inputting the file type of the image file, the service type, the file type of the target verification file, and the first field name into the pre-trained second machine learning model, and outputting the second field name in the target verification file that is used to verify the field value data in the first field name, wherein the pre-trained second machine learning model is obtained by training on sample data containing the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name, and the field value data in the second field name is used to verify the field value data in the first field name;
  • obtaining the field value data in the first field name according to the first field name, and obtaining the target verification file according to the data source information of the target verification file and the file identification of the target verification file;
  • verifying the field value data in the first field name based on the field value data in the second field name in the target verification file.
  • the storage unit 720 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 7201 and/or a cache storage unit 7202, and may further include a read-only storage unit (ROM) 7203.
  • the storage unit 720 may also include a program/utility tool 7204 having a set of (at least one) program module 7205.
  • The program module 7205 includes but is not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
  • The bus 730 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of multiple bus structures.
  • The electronic device 700 may also communicate with one or more external devices 900 (such as keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 700, and/or with any device (e.g., a router, modem, etc.) that enables the electronic device 700 to communicate with one or more other computing devices. Such communication may be performed through an input/output (I/O) interface 740.
  • the electronic device 700 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 760.
  • The network adapter 760 communicates with other modules of the electronic device 700 through the bus 730. It should be understood that although not shown in the figure, other hardware and/or software modules can be used in conjunction with the electronic device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • The example embodiments described here can be implemented by software, or by combining software with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, U disk, mobile hard disk, etc.) or on a network, and which includes several instructions that cause a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
  • a computer-readable storage medium is also provided.
  • The computer-readable storage medium may be volatile or non-volatile, and may store a program product capable of implementing the method described above.
  • the computer-readable storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps:
  • obtaining the service type of the target service and the image file that needs to be verified for the target service;
  • determining the file type of the image file according to the image file, determining the file type of the target verification file according to the service type and the file type of the image file, and determining the data source identification of the target verification file and the file identification of the target verification file according to the image file, wherein the target verification file is a file for verifying the image file;
  • inputting the service type and the file type of the image file into the pre-trained first machine learning model, and outputting the first field name that needs to be verified in the image file corresponding to the file type, wherein the pre-trained first machine learning model is obtained by training on sample data containing the service type, the file type of the image file, and the first field name in the image file that needs to be verified;
  • inputting the file type of the image file, the service type, the file type of the target verification file, and the first field name into the pre-trained second machine learning model, and outputting the second field name in the target verification file that is used to verify the field value data in the first field name, wherein the pre-trained second machine learning model is obtained by training on sample data containing the business type, the file type of the target verification file, and the second field name in the target verification file that verifies the field value data in the first field name, and the field value data in the second field name is used to verify the field value data in the first field name;
  • obtaining the field value data in the first field name according to the first field name, and obtaining the target verification file according to the data source information of the target verification file and the file identification of the target verification file;
  • verifying the field value data in the first field name based on the field value data in the second field name in the target verification file.
  • Various aspects of the present application can also be implemented in the form of a program product, which includes program code. When the program product runs on a terminal device, the program code is used to make the terminal device execute the steps according to the various exemplary embodiments of the present application described in the above-mentioned "Exemplary Method" section of this specification.
  • FIG. 8 is a computer-readable storage medium for implementing the above-mentioned data verification method according to an exemplary embodiment of the present application.
  • FIG. 8 depicts a program product 800 for implementing the above method according to an embodiment of the present application, which may adopt a portable compact disk read-only memory (CD-ROM), include program code, and run on an electronic device such as a personal computer.
  • the program product of this application is not limited to this.
  • The readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the program product can use any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the above. More specific examples (a non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the foregoing.
  • the program code used to perform the operations of the present application can be written in any combination of one or more programming languages.
  • The programming languages include object-oriented programming languages, such as Java and C++, as well as conventional procedural programming languages, such as the "C" language or similar programming languages.
  • The program code can be executed entirely on the user's computing device, partly on the user's device, as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • In the case of a remote computing device, the remote computing device can be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (for example, via the Internet using an Internet service provider).
  • All the above-mentioned data can also be stored in a node of a blockchain. For example, the image files, the first field name, the second field name, and so on can be stored in blockchain nodes.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
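  • The idea of data blocks linked by cryptographic methods can be illustrated with the sketch below. It is a single-process hash chain only, with no network, consensus mechanism, or encryption; the block layout, the use of SHA-256, and the stored field names are assumptions and are not taken from the application.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_PREV_HASH = "0" * 64  # placeholder predecessor hash for the first block


def block_hash(block: Dict[str, Any]) -> str:
    """Hash the block's index, data, and predecessor hash with SHA-256."""
    payload = json.dumps(
        {key: block[key] for key in ("index", "data", "prev_hash")}, sort_keys=True
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Dict[str, Any]], data: Dict[str, Any]) -> None:
    """Append a block that records the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    block: Dict[str, Any] = {"index": len(chain), "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)


def chain_is_valid(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash and check every link to detect tampering."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev or block["hash"] != block_hash(block):
            return False
    return True


chain: List[Dict[str, Any]] = []
append_block(chain, {"first_field_name": "name", "second_field_name": "holder_name"})
append_block(chain, {"verification_result": "passed"})
valid_before = chain_is_valid(chain)

# Changing stored data without recomputing the hashes breaks validation.
chain[0]["data"]["first_field_name"] = "address"
valid_after = chain_is_valid(chain)
```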

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Finance (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed in the present invention are a data verification method and apparatus, an electronic device, and a storage medium, which relate to the technical field of data processing. The data verification method comprises the steps of: acquiring the type of a target service and an image file that needs to be verified in the target service; then determining the type of the image file according to the image file, determining the type of a target verification file that is to perform the verification according to the type of the service and the type of the image file, and determining a data source identifier of the target verification file and an identifier of the target verification file according to the image file, the target verification file being a file that verifies the image file. The technical solution of the present invention makes it possible to verify image files quickly and accurately.
PCT/CN2021/078082 2020-04-01 2021-02-26 Data verification method and apparatus, electronic device, and storage medium WO2021196935A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010249650.5 2020-04-01
CN202010249650.5A CN111598122B (zh) 2020-04-01 2020-04-01 数据校验方法、装置、电子设备和存储介质

Publications (1)

Publication Number Publication Date
WO2021196935A1 true WO2021196935A1 (fr) 2021-10-07

Family

ID=72183396

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/078082 WO2021196935A1 (fr) 2020-04-01 2021-02-26 Data verification method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN111598122B (fr)
WO (1) WO2021196935A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114760533A (zh) * 2022-05-17 2022-07-15 北京达佳互联信息技术有限公司 校验值存储方法、帧数据校验方法、装置、电子设备
CN117726300A (zh) * 2023-12-22 2024-03-19 国网江苏省电力工程咨询有限公司 用于招标代理业务资料校验的自动化智能处理系统

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598122B (zh) * 2020-04-01 2022-02-08 深圳壹账通智能科技有限公司 数据校验方法、装置、电子设备和存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8233751B2 (en) * 2006-04-10 2012-07-31 Patel Nilesh V Method and system for simplified recordkeeping including transcription and voting based verification
CN106127659A (zh) * 2016-08-26 2016-11-16 南威软件股份有限公司 一种社区网格化管理系统
CN108388831A (zh) * 2018-01-10 2018-08-10 链家网(北京)科技有限公司 一种备件识别和信息整理方法及装置
CN109034816A (zh) * 2018-06-08 2018-12-18 平安科技(深圳)有限公司 用户信息验证方法、装置、计算机设备及存储介质
CN109815792A (zh) * 2018-12-13 2019-05-28 平安普惠企业管理有限公司 图片文件识别方法、装置、计算机设备及存储介质
CN110751110A (zh) * 2019-10-24 2020-02-04 泰康保险集团股份有限公司 身份影像信息核验方法、装置、设备及存储介质
CN111598122A (zh) * 2020-04-01 2020-08-28 深圳壹账通智能科技有限公司 数据校验方法、装置、电子设备和存储介质

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL202028A (en) * 2009-11-10 2016-06-30 Icts Holding Company Ltd Product, devices and methods for computerized authentication of electronic documents
RU2641225C2 (ru) * 2014-01-21 2018-01-16 Общество с ограниченной ответственностью "Аби Девелопмент" Способ выявления необходимости обучения эталона при верификации распознанного текста
CN107067044B (zh) * 2017-05-31 2024-03-29 北京空间飞行器总体设计部 一种财务报销全票据智能审核系统
CN108446621A (zh) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 票据识别方法、服务器及计算机可读存储介质
US10540579B2 (en) * 2018-05-18 2020-01-21 Sap Se Two-dimensional document processing
US10795752B2 (en) * 2018-06-07 2020-10-06 Accenture Global Solutions Limited Data validation
CN110619252B (zh) * 2018-06-19 2022-11-04 百度在线网络技术(北京)有限公司 识别图片中表单数据的方法、装置、设备及存储介质
US10452897B1 (en) * 2018-08-06 2019-10-22 Capital One Services, Llc System for verifying the identity of a user
CN110070081A (zh) * 2019-03-13 2019-07-30 深圳壹账通智能科技有限公司 自动信息录入方法、装置、存储介质及电子设备
CN110288755B (zh) * 2019-05-21 2023-05-23 平安银行股份有限公司 基于文本识别的发票检验方法、服务器及存储介质
CN110348975A (zh) * 2019-05-24 2019-10-18 深圳壹账通智能科技有限公司 报关单信息校验方法及装置、电子设备和存储介质

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114760533A (zh) * 2022-05-17 2022-07-15 北京达佳互联信息技术有限公司 校验值存储方法、帧数据校验方法、装置、电子设备
CN114760533B (zh) * 2022-05-17 2024-04-09 北京达佳互联信息技术有限公司 校验值存储方法、帧数据校验方法、装置、电子设备
CN117726300A (zh) * 2023-12-22 2024-03-19 国网江苏省电力工程咨询有限公司 用于招标代理业务资料校验的自动化智能处理系统
CN117726300B (zh) * 2023-12-22 2024-05-24 国网江苏省电力工程咨询有限公司 用于招标代理业务资料校验的自动化智能处理系统

Also Published As

Publication number Publication date
CN111598122B (zh) 2022-02-08
CN111598122A (zh) 2020-08-28

Similar Documents

Publication Publication Date Title
WO2021196935A1 (fr) Data verification method and apparatus, electronic device, and storage medium
WO2021120677A1 (fr) Warehousing model training method and apparatus, computer device, and storage medium
CN111210335B (zh) 用户风险识别方法、装置及电子设备
WO2019200810A1 (fr) User data authenticity analysis method and apparatus, storage medium, and electronic device
WO2022174491A1 (fr) Artificial-intelligence-based method and apparatus for medical record quality control, computer device, and storage medium
EP4006909B1 (fr) Quality control method, apparatus, and device, and storage medium
CN108921552B (zh) 一种验证证据的方法及装置
CN112181835B (zh) 自动化测试方法、装置、计算机设备及存储介质
CN112990294B (zh) 行为判别模型的训练方法、装置、电子设备及存储介质
WO2020232902A1 (fr) Abnormal object identification method and apparatus, computer device, and storage medium
CN110351672B (zh) 信息推送方法、装置及电子设备
WO2021159669A1 (fr) Secure system login method and apparatus, computer device, and storage medium
WO2019056496A1 (fr) Method for generating an image review probability interval and method for determining image review
WO2021174814A1 (fr) Answer verification method and apparatus for a crowdsourcing task, computer device, and storage medium
US20150178346A1 (en) Using biometric data to identify data consolidation issues
US11222143B2 (en) Certified information verification services
WO2021072864A1 (fr) Text similarity acquisition method and apparatus, electronic device, and computer-readable storage medium
WO2020252925A1 (fr) Method and apparatus for searching a user feature group for an optimized user feature, electronic device, and non-volatile computer-readable storage medium
WO2020252880A1 (fr) Reverse Turing verification method and apparatus, storage medium, and electronic device
CN111210109A (zh) 基于关联用户预测用户风险的方法、装置和电子设备
CN115545753A (zh) 一种基于贝叶斯算法的合作伙伴预测方法及相关设备
US20220309084A1 (en) Record matching in a database system
CN111859985B (zh) Ai客服模型测试方法、装置、电子设备及存储介质
CN110348190B (zh) 基于用户操作行为的用户设备归属判断方法及装置
CN111369375A (zh) 一种社交关系确定方法、装置、设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21778922

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.03.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21778922

Country of ref document: EP

Kind code of ref document: A1