CN117033795A - Label verification method and device, electronic equipment and storage medium

Label verification method and device, electronic equipment and storage medium

Info

Publication number
CN117033795A
Authority
CN
China
Prior art keywords
tag
data
target data
data tag
verification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311047418.3A
Other languages
Chinese (zh)
Inventor
范鹏丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN202311047418.3A
Publication of CN117033795A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The embodiments of the present application provide a tag verification method and apparatus, an electronic device, and a storage medium, and belong to the technical field of financial technology. The method comprises: acquiring a data tag to be verified; classifying the data tags and screening out target data tags of a numerical type; inputting the target data tags in batches into a preset decile tag model to generate a verification result for the target data tags; generating a data control chart based on the verification result; and judging whether the data distribution of the target data tag is reasonable according to the data control chart and a preset threshold range to obtain a judgment result, wherein the judgment result includes whether the target data tag is an abnormal data tag. In this way, customer tags can be verified quickly, efficiently, and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups by tag, reach customers precisely, and set more refined operating targets for customers.

Description

Label verification method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of financial technology (Fintech), and in particular, to a tag verification method and apparatus, an electronic device, and a storage medium.
Background
Customer tags are symbolic representations of customer features. Each data tag is one angle from which to recognize, observe, and describe a customer, and understanding a customer from all directions requires a large number of rich tags covering multiple angles and dimensions, for example: the customer's raw data (demographic attributes, location information, account information, etc.); the customer's platform activity information (information preferences, activity preferences, etc.); the customer's predictive tags (churn probability, recent demand, etc.); and the customer's strategy tags (customers to be won back, customers to be developed, etc.). Describing customers comprehensively through a large number of rich tags yields a relatively complete customer portrait.
In the insurance field, a massive number of customer tags must be verified so that insurance business personnel can analyze customer characteristics from multiple angles, select customer groups by tag, reach customers precisely, and set more refined operating targets for customers. In the prior art, however, this large number of tags is verified by manual testing, which poses a significant challenge for testers. How to verify customer tags quickly, efficiently, and accurately is therefore a technical problem to be solved.
Disclosure of Invention
The main purpose of the embodiments of the present application is to provide a tag verification method and apparatus, an electronic device, and a storage medium that can verify customer tags quickly, efficiently, and accurately, thereby helping insurance business personnel analyze customer characteristics from multiple angles, select customer groups by tag, reach customers precisely, and set more refined operating targets for customers, and improving the competitiveness of the insurance business personnel and of the insurance company they work for.
To achieve the above object, a first aspect of an embodiment of the present application provides a tag verification method, including:
acquiring a data tag to be verified;
classifying the data tags and screening out target data tags of a numerical type;
inputting the target data tags in batches into a preset decile tag model for decile verification processing, and generating a verification result for the target data tags;
generating a data control chart based on the verification result;
and judging whether the data distribution of the target data tag is reasonable according to the data control chart and a preset threshold range to obtain a judgment result, wherein the judgment result includes whether the target data tag is an abnormal data tag.
In some embodiments, acquiring the data tag to be verified includes:
connecting to a preset big data cluster;
reading a wide table to be verified from the big data cluster;
looking up the table structure corresponding to the wide table;
and acquiring the data tag to be verified from the table structure.
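The table-structure lookup above can be sketched as follows. This is only a minimal illustration: it uses an in-memory SQLite database as a stand-in for the big data cluster (the source describes a Hive-style cluster, not SQLite), and the table name `customer_wide`, its columns, and the helper `read_data_tags` are hypothetical.

```python
import sqlite3


def read_data_tags(conn, table_name):
    """Return (tag_name, declared_type) pairs taken from the table structure."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table_name})").fetchall()
    return [(row[1], row[2].lower()) for row in rows]


# Stand-in for "connecting to a preset big data cluster" (assumption: SQLite).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_wide ("
    "customer_id TEXT, premium_amt DECIMAL(18,2), city_name TEXT)"
)

# Each column of the wide table is treated as one data tag to be verified.
tags = read_data_tags(conn, "customer_wide")
```

Against a real cluster, the same idea would presumably use the warehouse's own schema query in place of `PRAGMA table_info`.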
In some embodiments, classifying the data tags and screening out the target data tags of a numerical type includes:
identifying the root word of the data tag's name;
checking the root word against preset keywords to obtain a check result;
and screening out the target data tags of a numerical type according to the check result.
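A minimal sketch of the root-word check above, under stated assumptions: the source does not say how the root word is extracted or which keywords are preset, so the rule "last underscore-separated token" and the keyword set below are both hypothetical.

```python
# Hypothetical preset keywords that suggest a numerical tag.
NUMERIC_KEYWORDS = {"amt", "cnt", "num", "rate", "score"}


def root_word(tag_name):
    """Assumed rule: the root word is the last underscore-separated token."""
    return tag_name.lower().rsplit("_", 1)[-1]


def screen_numeric_tags(tag_names, keywords=NUMERIC_KEYWORDS):
    """Keep only the tags whose root word matches a preset keyword."""
    return [name for name in tag_names if root_word(name) in keywords]


selected = screen_numeric_tags(["premium_amt", "city_name", "claim_cnt", "gender"])
```

Here `selected` keeps `premium_amt` and `claim_cnt` and drops the two tags whose root words are not in the keyword set.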
In some embodiments, inputting the target data tags in batches into the preset decile tag model for decile verification processing and generating the verification result for the target data tags includes:
inputting the target data tags in batches into the preset decile tag model for decile verification processing, wherein a tag processing function is defined in the decile tag model;
determining the number of missing indicator values, the proportion of missing indicator values, the average value, the maximum value, and the minimum value of the target data tag;
dividing the numerical range between the maximum value and the minimum value equally into ten parts to obtain the decile points;
embedding the target data tag into the tag processing function to generate the amount of data of the target data tag at each decile point;
and determining the verification result for the target data tag according to the number of missing indicator values, the proportion of missing indicator values, the average value, the maximum value, the minimum value, and the amount of data at each decile point.
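The statistics listed above can be sketched for a single tag as follows. This is an illustrative sketch, not the patent's tag processing function: the function name `decile_profile`, the use of `None` for missing values, and the result keys are assumptions. Note that the range between minimum and maximum is split into ten equal-width parts, as the text describes, rather than into quantile-based deciles.

```python
from statistics import mean


def decile_profile(values):
    """Summarise one numerical tag; None stands for a missing (NULL) value."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    result = {
        "missing_count": missing,
        "missing_ratio": missing / len(values) if values else 0.0,
    }
    if not present:
        return result

    lo, hi = min(present), max(present)
    width = (hi - lo) / 10  # equal-width split of [min, max] into ten parts
    points = [lo + i * width for i in range(1, 11)]

    counts = [0] * 10
    for v in present:
        # The maximum value falls on the last boundary, so clamp it into bin 9.
        idx = 0 if width == 0 else min(int((v - lo) / width), 9)
        counts[idx] += 1

    result.update(
        mean=mean(present), max=hi, min=lo,
        decile_points=points, decile_counts=counts,
    )
    return result


profile = decile_profile([0, 5, 10, 20, 35, 50, 75, 100, None, None])
```

For this sample, two of the ten values are missing, and the eight present values are counted into ten bins of width 10.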
In some embodiments, embedding the target data tag into the tag processing function to generate the amount of data of the target data tag at each decile point includes:
importing the target data tags into a YAML file;
and reading the target data tags from the YAML file in a loop and inputting each of them into the tag processing function to obtain the amount of data of the target data tag at each decile point.
In some embodiments, judging whether the data distribution of the target data tag is reasonable according to the data control chart and the preset threshold range to obtain the judgment result, the judgment result including whether the target data tag is an abnormal data tag, includes:
converting the data control chart into a normal distribution plot;
determining the position of the threshold range in the normal distribution plot;
and judging that the target data tag is an abnormal data tag when the target data tag is determined to lie outside the threshold range in the normal distribution plot.
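One way to read the judgment above is as a check of whether an observed statistic lies inside a central interval of a fitted normal distribution. The sketch below makes that reading concrete under several assumptions that are not in the source: the statistic being monitored (here a missing ratio per batch), the `history` sample used to fit the distribution, and the `coverage` value of 0.9973 (roughly plus or minus three standard deviations) are all hypothetical.

```python
from statistics import NormalDist


def is_abnormal(history, observed, coverage=0.9973):
    """Flag `observed` if it lies outside the central `coverage` interval
    of a normal distribution fitted to `history`."""
    dist = NormalDist.from_samples(history)  # needs at least two samples
    tail = (1 - coverage) / 2
    lower, upper = dist.inv_cdf(tail), dist.inv_cdf(1 - tail)
    return not (lower <= observed <= upper)


# Hypothetical missing ratios of one tag over earlier batches.
history = [0.10, 0.11, 0.09, 0.10, 0.12, 0.10, 0.09, 0.11]

normal_case = is_abnormal(history, 0.11)
abnormal_case = is_abnormal(history, 0.45)
```

A tag flagged this way would then trigger the abnormality prompt; a narrower `coverage` makes the check stricter.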
In some embodiments, after judging whether the data distribution of the target data tag is reasonable according to the data control chart and the preset threshold range to obtain the judgment result, the judgment result including whether the target data tag is an abnormal data tag, the method further includes:
issuing an abnormality prompt when the target data tag is judged to be an abnormal data tag.
To achieve the above object, a second aspect of the embodiments of the present application provides a tag verification apparatus, the apparatus comprising:
an acquisition module, configured to acquire a data tag to be verified;
a classification module, configured to classify the data tags and screen out target data tags of a numerical type;
a verification module, configured to input the target data tags in batches into a preset decile tag model for decile verification processing and to generate a verification result for the target data tags;
a generation module, configured to generate a data control chart based on the verification result;
and a judging module, configured to judge whether the data distribution of the target data tag is reasonable according to the data control chart and a preset threshold range to obtain a judgment result, wherein the judgment result includes whether the target data tag is an abnormal data tag.
To achieve the above object, a third aspect of the embodiments of the present application proposes an electronic device, including a memory storing a computer program and a processor implementing the method according to the first aspect when the processor executes the computer program.
To achieve the above object, a fourth aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect.
According to the tag verification method and apparatus, electronic device, and storage medium provided by the present application, a data tag to be verified is acquired; the data tags are classified and target data tags of a numerical type are screened out; the target data tags are input in batches into a preset decile tag model to generate a verification result for the target data tags; a data control chart is generated based on the verification result; and whether the data distribution of the target data tag is reasonable is judged according to the data control chart and a preset threshold range to obtain a judgment result, which includes whether the target data tag is an abnormal data tag. In other words, target data tags of a numerical type are screened out of the data tags to be verified, the decile method is applied to them by feeding them into the decile-based intelligent customer tag model, and the amount of data of each target data tag at each decile point is generated; the data distribution of each target data tag is then checked against the data control chart and the preset threshold range, so that abnormal data tags among a large number of tags are identified quickly. Customer tags can thus be verified quickly, efficiently, and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups by tag, reach customers precisely, and set more refined operating targets for customers. Tag verification becomes more efficient, less manpower has to be invested, costs are reduced and efficiency is increased, and the competitiveness of the insurance business personnel and of the insurance company they work for is improved.
Drawings
FIG. 1 is a flowchart of a tag verification method provided by an embodiment of the present application;
FIG. 2 is a flowchart of step S101 in FIG. 1;
FIG. 3 is a flowchart of step S102 in FIG. 1;
FIG. 4 is a flowchart of step S103 in FIG. 1;
FIG. 5 is a flowchart of step S404 in FIG. 4;
FIG. 6 is a flowchart of step S105 in FIG. 1;
FIG. 7 is a schematic structural diagram of a tag verification apparatus according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
First, several terms used in the present application are explained:
Artificial intelligence (AI): a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the nature of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence; research in this field includes robotics, language recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking. Artificial intelligence is also the theory, method, technology, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Insurance: a commercial insurance arrangement in which the policyholder pays premiums to the insurer as agreed in the contract, and the insurer bears liability to pay compensation for property loss caused by an accident covered by the contract, or bears liability to pay insurance benefits when the insured dies, becomes disabled, falls ill, or reaches the age or term agreed in the contract. From an economic perspective, insurance is a financial arrangement for sharing the losses caused by accidents; from a legal perspective, insurance is a contractual act, an arrangement in which one party agrees to compensate the other party's loss; from a social perspective, insurance is an important component of the social and economic security system and a finely tuned stabilizer of social production and social life; from a risk management perspective, insurance is one method of risk management.
Insurance agent: a party that conducts insurance business under the insurer's entrusted authorization and charges an agency fee. The insurance agent carries out business activities on behalf of the insurer within the scope authorized by the insurer, including soliciting business, accepting applications, issuing temporary or formal insurance policies, collecting premiums, and handling surveys and claims as an agent. The agency fee is typically paid in proportion to business volume. By scope of business, insurance agents can be classified into general agents, local agents, part-time agents, and so on. By mode of agency, there are exclusive agents, who act for a single insurance company, and independent agents, who operate independently and may act for several insurance companies at the same time.
Business development: the general term for the activities by which loan, insurance, wealth management, and other business staff look for customers and carry out the corresponding business. The main channels of insurance business development are direct business development, agent business development, and broker business development. Direct business development means that the insurer wins business through its own staff; agent business development means that an agent promotes policies within the scope authorized by the insurer, and agents can be divided into professional agents and part-time agents. Property insurance relies mainly on direct business development and part-time agents, while personal insurance uses direct business development and, generally, professional agents to solicit business.
Internet Control Message Protocol (ICMP): a sub-protocol of the TCP/IP protocol suite used to pass control messages between IP hosts and routers. Control messages are messages about the network itself, such as whether the network is reachable, whether a host is reachable, and whether a route is available. Although these control messages do not carry user data, they play an important role in the delivery of user data. ICMP uses the basic support of IP as if it were a higher-level protocol, but ICMP is actually an integral part of IP and must be implemented by every IP module.
Hive: a Hadoop-based data warehouse tool used to extract, transform, and load data; it provides a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Hive can map structured data files to database tables, provide SQL query functionality, and convert SQL statements into MapReduce jobs for execution. Its advantages are a low learning cost and the ability to implement MapReduce statistics quickly through SQL-like statements, which makes MapReduce simpler to use and removes the need to develop dedicated MapReduce applications. Hive is well suited to statistical analysis in data warehouses.
Macro: a term for batch processing. In computer science, a macro is an abstraction that replaces certain text patterns according to a set of predefined rules. An interpreter or compiler performs this pattern replacement automatically when it encounters a macro. For compiled languages, macro expansion happens at compile time, and the tool that performs it is often called a macro expander. The term is also used in many similar contexts derived from the concept of macro expansion, including keyboard macros and macro languages. In most cases, the word "macro" implies expanding a short command or action into a series of instructions. A macro is a group of commands organized together as a single command to accomplish a particular task.
Normal distribution: also called the Gaussian distribution. It was first obtained by Abraham de Moivre as an asymptotic formula for the binomial distribution. C. F. Gauss derived it from another angle when studying measurement errors, and P. S. Laplace and Gauss studied its properties. It is a very important probability distribution in mathematics, physics, engineering, and other fields, and has a major influence on many aspects of statistics. The normal curve is bell-shaped, low at both ends, high in the middle, and symmetric about its center, so it is often called the bell curve.
In the insurance field, a massive number of customer tags must be verified so that insurance business personnel can analyze customer characteristics from multiple angles, select customer groups by tag, reach customers precisely, and set more refined operating targets for customers. In the prior art, however, this large number of tags is verified by manual testing, which poses a significant challenge for testers. How to verify customer tags quickly, efficiently, and accurately is therefore a technical problem to be solved.
Based on the above, the embodiments of the present application provide a tag verification method and apparatus, an electronic device, and a storage medium, in which a data tag to be verified is acquired; the data tags are classified and target data tags of a numerical type are screened out; the target data tags are input in batches into a preset decile tag model to generate a verification result for the target data tags; a data control chart is generated based on the verification result; and whether the data distribution of the target data tag is reasonable is judged according to the data control chart and a preset threshold range to obtain a judgment result, which includes whether the target data tag is an abnormal data tag. In other words, target data tags of a numerical type are screened out of the data tags to be verified, the decile method is applied to them by feeding them into the decile-based intelligent customer tag model, and the amount of data of each target data tag at each decile point is generated; the data distribution of each target data tag is then checked against the data control chart and the preset threshold range, so that abnormal data tags among a large number of tags are identified quickly. Customer tags can thus be verified quickly, efficiently, and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups by tag, reach customers precisely, and set more refined operating targets for customers. Tag verification becomes more efficient, less manpower has to be invested, costs are reduced and efficiency is increased, and the competitiveness of the insurance business personnel and of the insurance company they work for is improved.
The tag verification method and apparatus, electronic device, and storage medium provided by the embodiments of the present application are described in detail through the following embodiments; the tag verification method is described first.
The embodiments of the present application can acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
The embodiments of the present application provide a tag verification method, which relates to the technical field of artificial intelligence. The tag verification method provided by the embodiments of the present application can be applied in a terminal, on a server side, or in software running in a terminal or on a server side. In some embodiments, the terminal may be a smartphone, tablet, notebook computer, desktop computer, etc.; the server side may be configured as an independent physical server, as a server cluster or distributed system composed of multiple physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms; the software may be an application that implements the tag verification method, but is not limited to the above forms.
The application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It should be noted that, in each specific embodiment of the present application, when related processing is required according to user information, user behavior data, user history data, user location information, and other data related to user identity or characteristics, permission or consent of the user is obtained first, and the collection, use, processing, and the like of the data comply with related laws and regulations and standards. In addition, when the embodiment of the application needs to acquire the sensitive personal information of the user, the independent permission or independent consent of the user is acquired through popup or jump to a confirmation page and the like, and after the independent permission or independent consent of the user is definitely acquired, the necessary relevant data of the user for enabling the embodiment of the application to normally operate is acquired.
Fig. 1 is an optional flowchart of a tag verification method according to an embodiment of the present application, where the method in fig. 1 may include, but is not limited to, steps S101 to S105.
Step S101, acquiring a data tag to be verified;
Step S102, classifying the data tags and screening out target data tags of a numerical type;
Step S103, inputting the target data tags in batches into a preset decile tag model for decile verification processing, and generating a verification result for the target data tags;
Step S104, generating a data control chart based on the verification result;
Step S105, judging whether the data distribution of the target data tag is reasonable according to the data control chart and a preset threshold range to obtain a judgment result, wherein the judgment result includes whether the target data tag is an abnormal data tag.
In step S101 of some embodiments, a data tag to be verified is acquired. The data tags to be verified may be customer tags. Customer tags are symbolic representations of customer features, and each data tag is one angle from which to recognize, observe, and describe a customer; understanding a customer from all directions requires a large number of rich tags covering multiple angles and dimensions. In the insurance field, customer tags include: the customer's raw data (demographic attributes, location information, account information, etc.); the customer's platform activity information (information preferences, activity preferences, etc.); the customer's predictive tags (churn probability, recent demand, etc.); and the customer's strategy tags (customers to be won back, customers to be developed, etc.). Acquiring a large number of data tags makes it possible to describe customers comprehensively and form a relatively complete customer portrait, so that insurance business personnel can analyze customer characteristics from multiple angles, select customer groups by tag, reach customers precisely, and set more refined operating targets for customers.
In step S102 of some embodiments, the data tags are classified and the target data tags of numerical type are screened out. Two data types are commonly used in the big data model: decimal and string. Decimal stores data of numerical type, and string stores data of non-numerical type. On this basis, the data tags are classified into decimal tags and string tags, and the decimal tags that need to be verified, namely the target data tags of numerical type, are screened out. Illustratively, in the insurance field, customer tags are classified into decimal tags and string tags, and the decimal tags that need to be verified, namely the customer tags of numerical type, are screened out.
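The type-based split described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_tags_by_type`, the sample tag names, and the assumption that the schema is available as a mapping from tag name to declared column type are all hypothetical.

```python
def split_tags_by_type(schema):
    """Split tags into (numeric, text) lists by declared column type.

    `schema` maps a tag name to its declared type, e.g. "decimal(18,2)"
    or "string". The mapping shape is an assumption made for this sketch.
    """
    numeric, text = [], []
    for name, col_type in schema.items():
        # Declarations such as "DECIMAL(18,2)" still count as decimal.
        if col_type.strip().lower().startswith("decimal"):
            numeric.append(name)
        else:
            text.append(name)
    return numeric, text


# Hypothetical customer tags used only for illustration.
example_schema = {
    "age": "decimal(3,0)",
    "city": "string",
    "premium_amount": "DECIMAL(18,2)",
}
target_tags, other_tags = split_tags_by_type(example_schema)
```

Only the tags in `target_tags` would go on to the verification step; the string tags are set aside.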
In step S103 of some embodiments, the target data tags are input in batches into a preset decile tag model for decile verification processing, so as to generate a verification result of the target data tags. Illustratively, in the insurance field, customer tags of numerical type are input into the decile tag model for decile verification processing. The preset decile tag model applies decile processing to each customer tag: it takes the maximum and minimum values of the customer tag, divides the numerical range of the data tag into 10 equal parts at the configured decile points, and returns the value of the data tag at each decile point, thereby characterizing the data distribution of the customer tag and generating the verification result of the customer tags of numerical type. The verification result may also include the indicator missing count of the customer tag (e.g., the number of NULL records in the full-population wide table), the indicator missing ratio (e.g., the ratio of the number of NULL records to the total number of records in the full-population wide table), the average, the maximum, the minimum, and the data volume at each decile point.
In step S104 of some embodiments, a data control chart is generated based on the verification result. Illustratively, the data control chart is generated from the verification result output by the decile tag model; it displays the data distribution of the customer tags intuitively, so that abnormal customer tags can be found quickly.
In step S105 of some embodiments, the rationality of the data distribution of the target data tags is judged according to the data control chart and the preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag. Illustratively, in the insurance field, the position of the threshold range in a normal distribution diagram is set, and when a customer tag is determined to fall outside the threshold range in the data control chart, the customer tag is judged to be an abnormal data tag. Abnormal data tags among a massive number of customer tags are thus obtained quickly, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, and the insurance company can reduce costs and improve efficiency.
Steps S101 to S105 of the embodiment of the present application obtain a data tag to be verified; classify the data tags and screen out the target data tags of numerical type; input the target data tags in batches into a preset decile tag model and generate a verification result of the target data tags; generate a data control chart based on the verification result; and judge the rationality of the data distribution of the target data tags according to the data control chart and a preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag. In this way, the target data tags of numerical type are screened out from the data tags to be verified, the decile method is applied to them by feeding them into the decile intelligent customer tag model, the data volume of each target data tag at each decile is generated, and the data distribution of each target data tag is judged for rationality against the data control chart and the preset threshold range, so that abnormal data tags among a massive number of tags are obtained quickly. Customer tags can therefore be verified quickly, efficiently and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, costs fall and efficiency rises, and the competitiveness of the insurance business personnel and of their insurance company improves.
Referring to Fig. 2, in some embodiments, step S101 may include, but is not limited to, steps S201 to S204:
Step S201, connecting to a preset big data cluster;
Step S202, reading the wide table to be verified from the big data cluster;
Step S203, looking up the table structure corresponding to the wide table;
Step S204, obtaining the data tags to be verified from the table structure.
In some embodiments, a connection is made to a preset big data cluster in which a massive number of data tags are stored. For example, the connection may be to a Hive data cluster; Hive is a Hadoop-based data warehouse tool open-sourced by Facebook that maps structured data files to tables and provides SQL-like query functionality. The wide table to be verified is read from the big data cluster (a wide table is a database table with a large number of fields), the table structure corresponding to the wide table is looked up, and the data tags to be verified are obtained from the table structure; the data tags to be verified can then be persisted. In the insurance field in particular, insurance business personnel can quickly obtain a massive number of customer tags from the data cluster, which improves their working efficiency.
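As a rough sketch of how tags might be pulled out of a table structure, the helper below parses rows shaped like the output of a Hive `DESCRIBE <table>` statement. The connection and cursor are deliberately left out; the row shape `(col_name, data_type, comment)`, the function names, and the table name `customer_wide` are assumptions for illustration only.

```python
def describe_statement(table_name):
    """Build the statement used to look up the table structure."""
    return f"DESCRIBE {table_name}"


def tags_from_table_structure(rows):
    """Return {tag_name: declared_type} from DESCRIBE-style rows.

    Each row is assumed to be (col_name, data_type, comment). Parsing
    stops at the first blank or '#'-prefixed row, which is assumed to
    open the partition-information section rather than list a column.
    """
    tags = {}
    for row in rows:
        name = (row[0] or "").strip()
        if not name or name.startswith("#"):
            break
        tags[name] = (row[1] or "").strip()
    return tags


# Hypothetical rows, as a database cursor might return them.
sample_rows = [
    ("age", "decimal(3,0)", ""),
    ("city", "string", "home city"),
    ("", None, None),
    ("# Partition Information", None, None),
]
tags = tags_from_table_structure(sample_rows)
```

In a real deployment the rows would come from executing `describe_statement("customer_wide")` on a cluster connection, which is outside the scope of this sketch.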
Referring to Fig. 3, in some embodiments, step S102 may include, but is not limited to, steps S301 to S303:
Step S301, identifying the root word of the name of the data tag;
Step S302, comparing the name root against preset keywords to obtain a comparison result;
Step S303, screening out the target data tags of numerical type according to the comparison result.
In some embodiments, the root word of the data tag's name is identified; the root word determines the meaning of the tag's name. The name root is then compared against preset keywords, which may include, but are not limited to, words that imply a numerical type, such as number of times, number of people, and amount of money. For example, when the name root of a data tag matches the keyword "number of times" among the preset keywords, the data tag is screened out as a target data tag of numerical type; conversely, when the name root of a data tag matches none of the preset keywords, the data tag is not screened out as a target data tag of numerical type. In the insurance field, insurance business personnel can thus quickly find customer tags of numerical type among a massive number of customer tags, which improves their working efficiency.
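The keyword comparison can be illustrated with a short sketch. It assumes, purely for illustration, that tag names are underscore-separated English words and that the preset keywords are the hypothetical roots `times`, `people` and `amount`; the source does not specify the naming convention or the keyword list.

```python
# Hypothetical keyword roots that imply a numerical type.
NUMERIC_KEYWORDS = frozenset({"times", "people", "amount"})


def name_roots(tag_name):
    """Split a tag name into root words (underscore-separated by assumption)."""
    return [part for part in tag_name.lower().split("_") if part]


def is_numeric_tag(tag_name, keywords=NUMERIC_KEYWORDS):
    """True when any root word of the name matches a preset keyword."""
    return any(root in keywords for root in name_roots(tag_name))


def screen_numeric_tags(tag_names, keywords=NUMERIC_KEYWORDS):
    """Keep only the tags whose name roots match a keyword."""
    return [name for name in tag_names if is_numeric_tag(name, keywords)]


selected = screen_numeric_tags(["claim_times", "customer_city", "premium_amount"])
```

Matching whole root words rather than substrings avoids false hits such as a keyword appearing inside an unrelated longer word.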
Referring to Fig. 4, in some embodiments, step S103 may include, but is not limited to, steps S401 to S405:
Step S401, inputting the target data tags in batches into a preset decile tag model for decile verification processing, wherein the decile tag model defines a tag processing function;
Step S402, determining the indicator missing count, the indicator missing ratio, the average, the maximum and the minimum of the target data tag;
Step S403, dividing the numerical range between the maximum and the minimum into ten equal parts to obtain the decile points;
Step S404, passing the target data tag to the tag processing function to generate the data volume of the target data tag at each decile point;
Step S405, determining the verification result of the target data tag according to the indicator missing count, the indicator missing ratio, the average, the maximum, the minimum and the data volume at each decile point.
In some embodiments, the target data tags are input in batches into a preset decile tag model for decile verification processing. The decile tag model defines a tag processing function, which processes each target data tag to obtain its indicator missing count (the number of records in the full-population wide table for which the tag is NULL), its indicator missing ratio (the ratio of the number of NULL records to the total number of records in the full-population wide table), its average, maximum and minimum, and the data volume at each decile point. The decile method works as follows: the range between the indicator's minimum and maximum is divided into ten equal parts, for example at the points 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.99 (note that the start and end points are taken at 0.01 and 0.99 rather than at 0 and 1), and the value at each decile point is returned, thereby characterizing the distribution of the data. In the insurance field, the target data tags of numerical type are screened out from the data tags to be verified, the decile method is applied to them by feeding them into the decile intelligent customer tag model, and the data volume of each target data tag at each decile is generated.
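One possible reading of steps S402 to S404 is sketched below: the range between the minimum and maximum is cut into ten equal-width intervals and the number of records in each interval is counted. The function name `decile_profile`, the use of `None` to stand in for NULL, and the returned dictionary keys are assumptions of this sketch; it does not reproduce the 0.01 and 0.99 end-point offsets mentioned above.

```python
from typing import Optional, Sequence


def decile_profile(values: Sequence[Optional[float]]) -> dict:
    """Summarise one numeric tag column; None stands in for NULL."""
    total = len(values)
    present = [v for v in values if v is not None]
    missing = total - len(present)
    profile = {
        "missing_count": missing,
        "missing_ratio": missing / total if total else 0.0,
        "mean": None,
        "max": None,
        "min": None,
        "cut_points": [],
        "bin_counts": [],
    }
    if not present:
        return profile

    lo, hi = min(present), max(present)
    profile.update(mean=sum(present) / len(present), max=hi, min=lo)

    width = (hi - lo) / 10
    # Eleven boundaries delimit ten equal-width intervals between min and max.
    profile["cut_points"] = [lo + i * width for i in range(11)]

    counts = [0] * 10
    for v in present:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first interval
        else:
            # The maximum itself is clamped into the last interval.
            idx = min(int((v - lo) / width), 9)
        counts[idx] += 1
    profile["bin_counts"] = counts
    return profile


# Hypothetical column: eleven evenly spaced values plus one NULL.
profile = decile_profile([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, None])
```

A strongly uneven `bin_counts` list, or a high `missing_ratio`, is the kind of signal the later rationality judgment could act on.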
Referring to Fig. 5, in some embodiments, step S404 may include, but is not limited to, steps S501 to S502:
Step S501, importing the target data tags into a YAML file;
Step S502, reading the target data tags in the YAML file in a loop, and inputting the target data tags into the tag processing function to obtain the data volume of the target data tags at each decile point.
In some embodiments, to process tags in batches, parameterization may be achieved through file-based data-driven techniques, for example by using a YAML file to drive the data. Specifically, the YAML file stores the parameters (mainly table names and fields); the parameters in the YAML file are read in a loop and fed into the tag processing macro to obtain the data volume of the target data tags at each decile point. This speeds up batch processing of the target data tags and improves the working efficiency of insurance business personnel.
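The loop over YAML parameters might look like the sketch below. To stay self-contained it starts from an already parsed configuration; in practice the dictionary could be produced by PyYAML's `yaml.safe_load`. The key names `tables`, `name` and `fields`, the table name `customer_wide`, and the callback signature are all hypothetical.

```python
# Parsed form of a hypothetical YAML file such as:
#
#   tables:
#     - name: customer_wide
#       fields: [age, premium_amount]
#
CONFIG = {
    "tables": [
        {"name": "customer_wide", "fields": ["age", "premium_amount"]},
    ]
}


def iter_tag_params(config):
    """Yield one (table, field) pair per tag listed in the configuration."""
    for table in config["tables"]:
        for field in table["fields"]:
            yield table["name"], field


def run_batch(config, process_tag):
    """Feed every (table, field) pair to the tag processing callback."""
    return {
        f"{table}.{field}": process_tag(table, field)
        for table, field in iter_tag_params(config)
    }


# Stand-in callback; a real one would compute the decile statistics.
results = run_batch(CONFIG, lambda table, field: len(field))
```

Keeping table names and fields in a file means new tags can be added to the batch without touching the processing code.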
Referring to Fig. 6, in some embodiments, step S105 may include, but is not limited to, steps S601 to S603:
Step S601, converting the data control chart into a normal distribution diagram;
Step S602, determining the position of the threshold range in the normal distribution diagram;
Step S603, when the target data tag is determined to fall outside the threshold range in the normal distribution diagram, judging that the target data tag is an abnormal data tag.
In some embodiments, the data control chart generated from the verification result is converted into a normal distribution diagram, the position of the threshold range in the normal distribution diagram is determined, and when the target data tag is determined to fall outside the threshold range in the normal distribution diagram, the target data tag is judged to be an abnormal data tag. Illustratively, in the insurance field, take the customer age tag as an example with a preset threshold range of 0 to 150 years. If the age tag of some customer falls outside the threshold range in the normal distribution diagram, for example an age below 0 or above 150, the value of that age tag is an abnormal value, and the customer's age tag is therefore judged to be an abnormal data tag. Customer tags can thus be verified quickly, efficiently and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, costs fall and efficiency rises, and the competitiveness of the insurance business personnel and of their insurance company improves.
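The age example can be expressed as a simple range check. This sketch covers only the threshold comparison, not the conversion of the control chart into a normal distribution diagram; the function names and the treatment of the bounds as inclusive are assumptions.

```python
def out_of_range_values(values, low, high):
    """Return the non-NULL values lying outside [low, high]."""
    return [v for v in values if v is not None and not (low <= v <= high)]


def is_abnormal_tag(values, low, high):
    """A tag is judged abnormal when any value falls outside the range."""
    return bool(out_of_range_values(values, low, high))


# Hypothetical age column checked against the 0 to 150 range from the text.
ages = [34, -1, 151, 150, 0, None]
offending = out_of_range_values(ages, 0, 150)
abnormal = is_abnormal_tag(ages, 0, 150)
```

Here the ages -1 and 151 are reported, while the boundary values 0 and 150 are treated as valid.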
In some embodiments, step S105 may be followed by, but is not limited to, step S106:
Step S106, issuing an abnormality prompt when the target data tag is judged to be an abnormal data tag.
In some embodiments, in the insurance field, when the target data tag is judged to be an abnormal data tag, an abnormality prompt is issued promptly to remind insurance business personnel to handle the abnormal data tag accordingly, for example by removing it so that the normal data tags can be analyzed. This helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, costs fall and efficiency rises, and the competitiveness of the insurance business personnel and of their insurance company improves.
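A prompt of this kind could be as simple as a logged warning, as in the sketch below. The source does not say how the prompt is delivered, so the use of Python's standard `logging` module, the logger name, and both function names are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("tag_verification")  # hypothetical logger name


def report_abnormal_tags(judgments):
    """Log a warning for each abnormal tag and return their names.

    `judgments` maps a tag name to True when the tag was judged abnormal.
    """
    abnormal = sorted(name for name, flagged in judgments.items() if flagged)
    for name in abnormal:
        logger.warning("abnormal data tag detected: %s", name)
    return abnormal


def remove_abnormal_tags(tag_names, abnormal):
    """Keep only the tags that were not flagged, preserving their order."""
    flagged = set(abnormal)
    return [name for name in tag_names if name not in flagged]


flagged_tags = report_abnormal_tags({"age": True, "premium_amount": False})
usable_tags = remove_abnormal_tags(["age", "premium_amount"], flagged_tags)
```

Removing the flagged tags leaves only normal tags for downstream analysis, which mirrors the handling example given in the text.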
The tag verification method of the present application is further described below in connection with specific embodiments.
In the insurance field, a data tag to be verified is obtained. The data tag to be verified may be a customer tag, which is a symbolic representation of a customer feature; each data tag is one angle from which the customer is recognized, observed and described. Recognizing customers from all directions requires a massive, rich set of tags covering many angles and dimensions. Customer tags may include: raw customer data, such as demographic attributes, location information and account information; platform activity information, such as information preferences and activity preferences; predictive tags, such as churn probability and recent demand; and strategy tags, such as customers to be won back and customers to be developed. Two data types are commonly used in the big data model: decimal and string. Decimal stores data of numerical type, and string stores data of non-numerical type. The data tags are classified into decimal tags and string tags, and the decimal tags that need to be verified, namely the target data tags of numerical type, are screened out. Accordingly, the customer tags are classified into decimal tags and string tags, and the decimal tags that need to be verified, namely the customer tags of numerical type, are screened out.
The customer tags of numerical type are input into the decile tag model for decile verification processing. The preset decile tag model applies decile processing to each customer tag: it takes the maximum and minimum values of the customer tag, divides the numerical range of the data tag into 10 equal parts at the configured decile points, and returns the value of the data tag at each decile point, thereby characterizing the data distribution of the customer tag and generating the verification result of the customer tags of numerical type. The verification result may also include the indicator missing count of the customer tag (e.g., the number of NULL records in the full-population wide table), the indicator missing ratio (e.g., the ratio of the number of NULL records to the total number of records in the full-population wide table), the average, the maximum, the minimum, and the data volume at each decile point. A data control chart is generated from the verification result output by the decile tag model; it displays the data distribution of the customer tags intuitively, so that abnormal customer tags can be found quickly.
The position of the threshold range in the normal distribution diagram is set, and when a customer tag is determined to fall outside the threshold range in the data control chart, the customer tag is judged to be an abnormal data tag. Abnormal data tags among a massive number of customer tags are thus obtained quickly, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, and the insurance company can reduce costs and improve efficiency.
On this basis, the embodiment of the present application obtains a data tag to be verified; classifies the data tags and screens out the target data tags of numerical type; inputs the target data tags in batches into a preset decile tag model and generates a verification result of the target data tags; generates a data control chart based on the verification result; and judges the rationality of the data distribution of the target data tags according to the data control chart and a preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag. In this way, the target data tags of numerical type are screened out from the data tags to be verified, the decile method is applied to them by feeding them into the decile intelligent customer tag model, the data volume of each target data tag at each decile is generated, and the data distribution of each target data tag is judged for rationality against the data control chart and the preset threshold range, so that abnormal data tags among a massive number of tags are obtained quickly. Customer tags can therefore be verified quickly, efficiently and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, costs fall and efficiency rises, and the competitiveness of the insurance business personnel and of their insurance company improves.
Referring to Fig. 7, an embodiment of the present application further provides a tag verification apparatus that can implement the tag verification method described above, where the apparatus includes:
an obtaining module 710, configured to obtain a data tag to be verified;
a classification module 720, configured to classify the data tags and screen out the target data tags of numerical type;
a verification module 730, configured to input the target data tags in batches into a preset decile tag model for decile verification processing, and generate a verification result of the target data tags;
a generating module 740, configured to generate a data control chart based on the verification result;
a judging module 750, configured to judge the rationality of the data distribution of the target data tags according to the data control chart and a preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag.
In some embodiments of the present application, the obtaining module 710 obtains a data tag to be verified; the classification module 720 classifies the data tags and screens out the target data tags of numerical type; the verification module 730 inputs the target data tags in batches into a preset decile tag model for decile verification processing and generates a verification result of the target data tags; the generating module 740 generates a data control chart based on the verification result; and the judging module 750 judges the rationality of the data distribution of the target data tags according to the data control chart and a preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag.
In some embodiments of the present application, the obtaining module 710 obtains a data tag to be verified. The data tag to be verified may be a customer tag, which is a symbolic representation of a customer feature; each data tag is one angle from which the customer is recognized, observed and described. Recognizing customers from all directions requires a massive, rich set of tags covering many angles and dimensions. In the insurance field, customer tags include: raw customer data, such as demographic attributes, location information and account information; platform activity information, such as information preferences and activity preferences; predictive tags, such as churn probability and recent demand; and strategy tags, such as customers to be won back and customers to be developed. Acquiring a large number of data tags gives a comprehensive picture of the customer and forms a relatively complete customer portrait, so that insurance business personnel can analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets.
In some embodiments of the present application, the classification module 720 classifies the data tags and screens out the target data tags of numerical type. Two data types are commonly used in the big data model: decimal and string. Decimal stores data of numerical type, and string stores data of non-numerical type. On this basis, the data tags are classified into decimal tags and string tags, and the decimal tags that need to be verified, namely the target data tags of numerical type, are screened out. Illustratively, in the insurance field, customer tags are classified into decimal tags and string tags, and the decimal tags that need to be verified, namely the customer tags of numerical type, are screened out.
In some embodiments of the present application, the verification module 730 inputs the target data tags in batches into a preset decile tag model for decile verification processing and generates a verification result of the target data tags. Illustratively, in the insurance field, customer tags of numerical type are input into the decile tag model for decile verification processing. The preset decile tag model applies decile processing to each customer tag: it takes the maximum and minimum values of the customer tag, divides the numerical range of the data tag into 10 equal parts at the configured decile points, and returns the value of the data tag at each decile point, thereby characterizing the data distribution of the customer tag and generating the verification result of the customer tags of numerical type. The verification result may also include the indicator missing count of the customer tag (e.g., the number of NULL records in the full-population wide table), the indicator missing ratio (e.g., the ratio of the number of NULL records to the total number of records in the full-population wide table), the average, the maximum, the minimum, and the data volume at each decile point.
In some embodiments of the present application, the generating module 740 generates a data control chart based on the verification result. Illustratively, the data control chart is generated from the verification result output by the decile tag model; it displays the data distribution of the customer tags intuitively, so that abnormal customer tags can be found quickly.
In some embodiments of the present application, the judging module 750 judges the rationality of the data distribution of the target data tags according to the data control chart and the preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag. Illustratively, in the insurance field, the position of the threshold range in a normal distribution diagram is set, and when a customer tag is determined to fall outside the threshold range in the data control chart, the customer tag is judged to be an abnormal data tag. Abnormal data tags among a massive number of customer tags are thus obtained quickly, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, and the insurance company can reduce costs and improve efficiency.
On this basis, in the tag verification apparatus of the embodiment of the present application, the obtaining module 710 obtains a data tag to be verified; the classification module 720 classifies the data tags and screens out the target data tags of numerical type; the verification module 730 inputs the target data tags in batches into a preset decile tag model for decile verification processing and generates a verification result of the target data tags; the generating module 740 generates a data control chart based on the verification result; and the judging module 750 judges the rationality of the data distribution of the target data tags according to the data control chart and a preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag.
In this way, the target data tags of numerical type are screened out from the data tags to be verified, the decile method is applied to them by feeding them into the decile intelligent customer tag model, the data volume of each target data tag at each decile is generated, and the data distribution of each target data tag is judged for rationality against the data control chart and the preset threshold range, so that abnormal data tags among a massive number of tags are obtained quickly. Customer tags can therefore be verified quickly, efficiently and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, costs fall and efficiency rises, and the competitiveness of the insurance business personnel and of their insurance company improves.
The specific implementation of the tag verification apparatus is substantially the same as the specific embodiment of the tag verification method described above, and will not be described herein.
An embodiment of the present application further provides an electronic device comprising a memory and a processor, where the memory stores a computer program and the processor implements the tag verification method described above when executing the computer program. The electronic device may be any intelligent terminal, including a tablet computer, a vehicle-mounted computer, and the like.
Referring to fig. 8, fig. 8 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:
The processor 801 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to implement the technical solutions provided by the embodiments of the present application.
The memory 802 may be implemented in the form of read-only memory (Read Only Memory, ROM), static storage, dynamic storage, or random access memory (Random Access Memory, RAM). The memory 802 may store an operating system and other application programs. When the technical solution provided in the embodiments of the present disclosure is implemented by software or firmware, the relevant program code is stored in the memory 802 and is invoked by the processor 801 to execute the tag verification method of the embodiments of the present disclosure, namely: obtaining a data tag to be verified; classifying the data tags and screening out the target data tags of numerical type; inputting the target data tags in batches into a preset decile tag model and generating a verification result of the target data tags; generating a data control chart based on the verification result; and judging the rationality of the data distribution of the target data tags according to the data control chart and a preset threshold range to obtain a judgment result, where the judgment result includes whether a target data tag is an abnormal data tag. In this way, the target data tags of numerical type are screened out from the data tags to be verified, the decile method is applied to them by feeding them into the decile intelligent customer tag model, the data volume of each target data tag at each decile is generated, and the data distribution of each target data tag is judged for rationality against the data control chart and the preset threshold range, so that abnormal data tags among a massive number of tags are obtained quickly.
Customer tags can therefore be verified quickly, efficiently and accurately, which helps insurance business personnel analyze customer characteristics from multiple angles, select customer groups according to the tags, reach customers precisely, and provide customers with more refined operational targets. Because tag verification is more efficient, less manpower is invested, costs fall and efficiency rises, and the competitiveness of the insurance business personnel and of their insurance company improves.
An input/output interface 803 for implementing information input and output.
The communication interface 804 is configured to implement communication between the device and other devices, either in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
A bus transfers information between the various components of the device (e.g., the processor 801, the memory 802, the input/output interface 803, and the communication interface 804).
The processor 801, the memory 802, the input/output interface 803, and the communication interface 804 are communicatively connected to one another inside the device through the bus.
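As an illustration of the decile step that the processor 801 executes from the memory 802, the following minimal Python sketch counts missing values, computes the mean, maximum, and minimum of one numeric tag, splits the range between the minimum and the maximum into ten equal parts, and counts the data falling into each part. The function name `decile_profile`, the use of `None` to mark a missing value, and the layout of the returned dictionary are assumptions made for this example and do not come from the application.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence


def decile_profile(values: Sequence[Optional[float]]) -> Dict[str, object]:
    """Profile one numeric tag; None marks a missing value (an assumption)."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("tag has no non-missing values to profile")

    missing_count = len(values) - len(present)
    lo, hi = min(present), max(present)
    width = (hi - lo) / 10  # size of each of the ten equal parts

    counts: List[int] = [0] * 10
    for v in present:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first part
        else:
            # The maximum sits on the last boundary, so clamp it into part 9.
            idx = min(int((v - lo) / width), 9)
        counts[idx] += 1

    return {
        "missing_count": missing_count,
        "missing_ratio": missing_count / len(values),
        "mean": mean(present),
        "min": lo,
        "max": hi,
        # Upper edge of each of the ten parts (hypothetical representation).
        "decile_points": [lo + width * (i + 1) for i in range(10)],
        "counts": counts,
    }


if __name__ == "__main__":
    print(decile_profile([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None]))
```

Note that this follows the equal-width division of the value range described in the method, which differs from quantile-based deciles that would place an equal share of records in each part.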
An embodiment of the present application also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the tag verification method described above.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The tag verification method, tag verification apparatus, electronic device, and storage medium provided by the embodiments of the present application acquire the data tag to be verified; classify the data tag and screen out a target data tag of numeric type; input the target data tag in batches into a preset decile tag model to generate a verification result of the target data tag; generate a data control chart based on the verification result; and judge the rationality of the data distribution of the target data tag according to the data control chart and a preset threshold range to obtain a judgment result, where the judgment result includes whether the target data tag is an abnormal data tag. In this way, target data tags of numeric type are screened from the data tags to be verified and fed into the intelligent customer-tag decile model, which applies the decile method to produce the amount of data that each target data tag has in each decile. The data distribution of each target data tag is then judged for rationality using the data control chart and the preset threshold range, so that abnormal data tags can be identified quickly among a large number of tags. Based on this, customer tags can be verified quickly, efficiently, and accurately. This helps insurance business personnel analyze customer characteristics from multiple angles, select customer segments by tag, reach customers precisely, and set more refined operating targets. It also improves tag verification efficiency, reduces the human resources required, lowers cost while raising efficiency, and improves the competitiveness of insurance business personnel and of the insurance company they work for.
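The final judgment step summarized above can be sketched as a simple control-chart check. The sketch below assumes, purely for illustration, that one summary metric per tag (here a missing ratio) is compared against limits derived from a baseline sample as the mean plus or minus k standard deviations. The application itself only specifies a data control chart and a preset threshold range, so the choice of metric, the value k = 3, the function names, and the tag names are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence, Tuple


def control_limits(baseline: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Return (lower, upper) limits as mean -/+ k population standard deviations."""
    center = mean(baseline)
    sigma = pstdev(baseline)
    return center - k * sigma, center + k * sigma


def judge_tags(metric_by_tag: Dict[str, float],
               limits: Tuple[float, float]) -> Dict[str, bool]:
    """Map each tag to True when its metric falls outside the threshold range."""
    lower, upper = limits
    return {tag: not (lower <= value <= upper)
            for tag, value in metric_by_tag.items()}


if __name__ == "__main__":
    # Hypothetical baseline of missing ratios observed for well-behaved tags.
    baseline = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.09]
    limits = control_limits(baseline)

    # Hypothetical metrics for two tags under verification.
    observed = {"age": 0.11, "premium_amount": 0.65}
    for tag, abnormal in judge_tags(observed, limits).items():
        print(tag, "abnormal" if abnormal else "normal")
```

Deriving the limits from a separate baseline, rather than from the values being judged, keeps a single extreme tag from widening the limits enough to hide itself.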
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those skilled in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable programs, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media.
The embodiments described herein are intended to explain the technical solutions of the embodiments of the present application more clearly and do not limit those technical solutions. Those skilled in the art will recognize that, as technology evolves and new application scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
It will be appreciated by persons skilled in the art that the embodiments of the application are not limited by the illustrations, and that more or fewer steps than those shown may be included, or certain steps may be combined, or different steps may be included.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and covers three cases; for example, "A and/or B" may mean: only A is present, only B is present, or both A and B are present, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or plural.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be in electrical, mechanical, or other form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing a program.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings; this description does not limit the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims (10)

1. A method of tag verification, the method comprising:
acquiring a data tag to be verified;
classifying the data tag, and screening out a target data tag of numeric type;
inputting the target data tag in batches into a preset decile tag model for decile verification processing, and generating a verification result of the target data tag;
generating a data control chart based on the verification result;
and judging the rationality of the data distribution of the target data tag according to the data control chart and a preset threshold range to obtain a judgment result, wherein the judgment result includes whether the target data tag is an abnormal data tag.
2. The method of claim 1, wherein acquiring the data tag to be verified comprises:
connecting to a preset big data cluster;
reading a wide table to be verified from the big data cluster;
looking up the table structure corresponding to the wide table;
and acquiring the data tag to be verified from the table structure.
3. The method of claim 1, wherein classifying the data tag and screening out a target data tag of numeric type comprises:
identifying the root word of the name of the data tag;
checking the root word against preset keywords to obtain a check result;
and screening out the target data tag of numeric type according to the check result.
4. The method of claim 1, wherein inputting the target data tag in batches into a preset decile tag model for decile verification processing and generating a verification result of the target data tag comprises:
inputting the target data tag in batches into the preset decile tag model for decile verification processing, wherein a tag processing function is defined in the decile tag model;
determining the number of missing indicator values, the proportion of missing indicator values, the average value, the maximum value, and the minimum value of the target data tag;
dividing the numerical range between the maximum value and the minimum value into ten equal parts to obtain ten decile points;
passing the target data tag into the tag processing function to generate the data amount at each decile point of the target data tag;
and determining the verification result of the target data tag according to the number of missing indicator values, the proportion of missing indicator values, the average value, the maximum value, the minimum value, and the data amount at each decile point.
5. The method of claim 4, wherein passing the target data tag into the tag processing function to generate the data amount at each decile point of the target data tag comprises:
importing the target data tag into a YAML file;
and reading the target data tag from the YAML file in a loop and inputting it into the tag processing function to obtain the data amount at each decile point of the target data tag.
6. The method according to claim 1, wherein judging the rationality of the data distribution of the target data tag according to the data control chart and a preset threshold range to obtain a judgment result, the judgment result including whether the target data tag is an abnormal data tag, comprises:
converting the data control chart into a normal distribution chart;
determining the position of the threshold range in the normal distribution chart;
and judging that the target data tag is an abnormal data tag when the target data tag is determined to lie outside the threshold range in the normal distribution chart.
7. The method according to any one of claims 1 to 6, wherein after judging the rationality of the data distribution of the target data tag according to the data control chart and a preset threshold range to obtain a judgment result, the judgment result including whether the target data tag is an abnormal data tag, the method further comprises:
issuing an abnormality prompt when the target data tag is judged to be an abnormal data tag.
8. A tag verification apparatus, the apparatus comprising:
the acquisition module is used for acquiring the data tag to be verified;
the classification module is used for classifying the data tag and screening out a target data tag of numeric type;
the verification module is used for inputting the target data tag in batches into a preset decile tag model for decile verification processing and generating a verification result of the target data tag;
the generation module is used for generating a data control chart based on the verification result;
the judging module is used for judging the rationality of the data distribution of the target data tag according to the data control chart and a preset threshold range to obtain a judgment result, wherein the judgment result includes whether the target data tag is an abnormal data tag.
9. An electronic device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the tag verification method of any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the tag verification method of any one of claims 1 to 7.
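To illustrate the name-based screening recited in claim 3, the following minimal Python sketch splits a tag name into its root words and keeps the tag when any root word matches a preset keyword. The underscore-separated naming convention, the keyword set, and the sample tag names are assumptions chosen for the example; the application does not specify them.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical keywords that suggest a numeric tag (amount, count, number, ...).
NUMERIC_KEYWORDS: FrozenSet[str] = frozenset({"amt", "cnt", "num", "rate", "age"})


def name_roots(tag_name: str) -> List[str]:
    """Split a tag name such as 'premium_amt' into lower-case root words."""
    return [part for part in tag_name.lower().split("_") if part]


def is_numeric_tag(tag_name: str,
                   keywords: FrozenSet[str] = NUMERIC_KEYWORDS) -> bool:
    """Check the root words of the name against the preset keywords."""
    return any(root in keywords for root in name_roots(tag_name))


def screen_numeric_tags(tag_names: Iterable[str],
                        keywords: FrozenSet[str] = NUMERIC_KEYWORDS) -> List[str]:
    """Keep only the tags whose names indicate a numeric type."""
    return [name for name in tag_names if is_numeric_tag(name, keywords)]


if __name__ == "__main__":
    tags = ["policy_cnt", "gender_cd", "premium_amt", "city_name"]
    print(screen_numeric_tags(tags))
```

Matching whole root words rather than substrings avoids false positives such as a keyword appearing inside an unrelated longer word.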
CN202311047418.3A 2023-08-18 2023-08-18 Label verification method and device, electronic equipment and storage medium Pending CN117033795A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311047418.3A CN117033795A (en) 2023-08-18 2023-08-18 Label verification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311047418.3A CN117033795A (en) 2023-08-18 2023-08-18 Label verification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117033795A (en) 2023-11-10

Family

ID=88631451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311047418.3A Pending CN117033795A (en) 2023-08-18 2023-08-18 Label verification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117033795A (en)

Similar Documents

Publication Publication Date Title
CN110119413B (en) Data fusion method and device
US11244011B2 (en) Ingestion planning for complex tables
EP4195112A1 (en) Systems and methods for enriching modeling tools and infrastructure with semantics
US9459950B2 (en) Leveraging user-to-tool interactions to automatically analyze defects in IT services delivery
CN110782129B (en) Business progress monitoring method, device and system and computer readable storage medium
US20160203330A1 (en) Code repository intrusion detection
EP3872637A1 (en) Application programming interface assessment
CN114021970A (en) Enterprise data asset model construction method based on data middlebox
CN112805697A (en) System and method for analyzing and modeling content
US20170109638A1 (en) Ensemble-Based Identification of Executions of a Business Process
CN113297287B (en) Automatic user policy deployment method and device and electronic equipment
CN112016138A (en) Method and device for automatic safe modeling of Internet of vehicles and electronic equipment
CN113868498A (en) Data storage method, electronic device, device and readable storage medium
US11860727B2 (en) Data quality-based computations for KPIs derived from time-series data
CN115936895A (en) Risk assessment method, device and equipment based on artificial intelligence and storage medium
CN110689211A (en) Method and device for evaluating website service capability
WO2016093839A1 (en) Structuring of semi-structured log messages
CN117057866A (en) Service recommendation method and device, electronic equipment and storage medium
CN116523622A (en) Object risk prediction method and device, electronic equipment and storage medium
CN112712270B (en) Information processing method, device, equipment and storage medium
CN117033795A (en) Label verification method and device, electronic equipment and storage medium
KR20230103025A (en) Method, Apparatus, and System for provision of corporate credit analysis and rating information
Gopala Krishnan et al. Predictive algorithm and criteria to perform big data analytics
CN111563178A (en) Rule logic diagram comparison method, device, medium and electronic equipment
CN117973872B (en) Supply chain risk identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination