CN117725060A - Data tag system processing method and device and storage medium - Google Patents

Data tag system processing method and device and storage medium


Publication number
CN117725060A
Authority
CN
China
Prior art keywords
data
label
business
tag
processing target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311707872.7A
Other languages
Chinese (zh)
Inventor
张海龙
阚宗挺
向尚志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University Yangtze River Delta Wisdom Oasis Innovation Center
Original Assignee
Zhejiang University Yangtze River Delta Wisdom Oasis Innovation Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University Yangtze River Delta Wisdom Oasis Innovation Center
Priority to CN202311707872.7A
Publication of CN117725060A
Legal status: Pending


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a method and an apparatus for processing a data tag system, and a storage medium. The method includes: determining a processing target, acquiring the data of the processing target that is to be placed under tag management, designing a tag system for the data, developing tags based on the tag system, and verifying the developed tags; evaluating the verified tags based on at least one of the coverage, accuracy, and timeliness of the tags to obtain an evaluation result; and calculating an evaluation value of the tag system based on the evaluation result and the set dimensions and dimension weights, judging from the calculated evaluation value whether the tag system meets the evaluation requirement, and, when it does not, determining a new tag system and repeating until the evaluation requirement is met. The method allows a label system to be created more quickly through a process-driven workflow, is simple and convenient, allows label accuracy to be verified, and keeps cost controllable, and therefore has considerable commercial and social value.

Description

Data tag system processing method and device and storage medium
Technical Field
The present disclosure relates to a data tag system creation technology, and in particular, to a data tag system processing method and apparatus, and a storage medium.
Background
A data middle platform that integrates and manages data in a unified way can provide a single outlet for data applications, and many application scenarios, such as user portraits, recommendation algorithms, data analysis and mining, and refined operations, rely on tags. Labels sit on top of the data warehouse and provide classification information for a standardized data system. Data should be labeled before it flows to the data mart, and label construction is not a one-off job: it requires continuous optimization and governance.
At present, enterprises and teams build their data tag systems without a unified standard or specification, each starting from its own current business and requirements. Data tag construction depends on the preparation of basic data, the sorting out of the business, and the model and maturity of the data warehouse; these operations consume much time and energy and are not cost-effective. Moreover, such approaches do not suit start-up or transforming enterprises that have no data governance or data construction in place. Taking the clothing industry as an example, the related data are unlabeled and no data construction and governance scheme exists. A scheme that helps clothing enterprises quickly locate construction targets, determine the data infrastructure, develop labels, verify the label system, and manage the label life cycle, and thereby build an index system quickly, effectively, and at low cost, has become an urgent need.
Disclosure of Invention
The present disclosure provides a method and apparatus for processing a data tag system, and a storage medium, so as to at least solve the above technical problems in the prior art.
According to a first aspect of the present disclosure, there is provided a data tag system processing method, comprising:
determining a processing target, acquiring the data of the processing target that is to be placed under tag management, designing a tag system for the data, developing tags based on the designed tag system, and verifying the developed tags;
evaluating the verified tags based on at least one of the coverage, accuracy, and timeliness of the tags to obtain an evaluation result;
and calculating an evaluation value of the tag system based on the evaluation result and the set dimensions and dimension weights, judging from the calculated evaluation value whether the tag system meets the evaluation requirement, and, when the evaluation requirement is not met, re-designing a new tag system for the data and continuing to evaluate the new tag system until the evaluation requirement is met.
In some executable embodiments, the method further comprises:
determining the life cycle of the tag according to different types of the data, the service type and the application to which the data belongs;
creating a data warehouse platform for the data, and importing the life cycle of the tag into the data warehouse platform;
and updating and maintaining the life cycle of the tag on the data warehouse platform based on the asset of the data and the map of the data so as to realize quality and abnormality monitoring of the tag.
In some executable embodiments, the determining the processing target includes:
according to the business requirement of business in the enterprise, at least one of the following data of the business is determined: business materials, customer preferences, market quotations, network hotspots, business processes, business related personnel information, business related products, business life cycle, business generated data and business data processing processes;
and determining a management and evaluation mode of the data based on the determined business data, and taking the determined management and evaluation mode as the processing target.
In some executable embodiments, the developing the tag based on the designed tag system and verifying the developed tag includes:
determining attribute information of the processing target, obtaining possible association relations of the processing target through association reasoning based on the attribute information and through algorithm recommendation, and determining corresponding labels for the processing target based on the possible association relations;
classifying the labels based on at least the business category, the user type, and the demand type of the processing target, and storing the labels in the data warehouse platform according to the resulting types; the label types include at least a phenotype label, a fact label, a rule label, and a model label; wherein the phenotype label is stored in the data detail layer (Data Warehouse Detail, DWD) of the data warehouse platform;
carrying out engineering development according to the label rules and the data source conditions, and testing and verifying label quality by combining business requirements with engineering results, so as to ensure label accuracy;
configuring a label calling script that runs on a schedule to acquire labels from the data warehouse platform; and at the same time configuring script monitoring and quality monitoring to monitor the running of scripts, including the calling script, so as to ensure that the scripts run normally.
In some executable embodiments, testing and verifying label quality by combining business requirements with engineering results, so as to ensure label accuracy, includes:
determining, by means of the target group index (Target Group Index, TGI), whether contradictory underlying logic exists among the plurality of labels of the processing target, and correcting the contradictory labels; or
using third-party data from a third-party platform to verify the accuracy of the plurality of labels of the processing target, and correcting labels inconsistent with the third-party data based on the third-party data; or
selecting target groups under different labels of the processing target, performing a split between-group experiment (A/B test), and verifying label accuracy according to the delivery results.
According to a second aspect of the present disclosure there is provided a data tag system processing apparatus comprising:
the design unit is used for determining a processing target, acquiring data to be subjected to tag management of the processing target, and designing a tag system for the data;
the development unit is used for developing labels based on the designed label system;
the verification unit is used for verifying the developed label;
the evaluation unit is used for evaluating the verified tags based on at least one of the coverage, accuracy, and timeliness of the tags to obtain an evaluation result;
the judging unit is used for calculating the evaluation value of the tag system based on the evaluation result and the set dimension and dimension weight, judging whether the tag system meets the evaluation requirement or not based on the calculated evaluation value, and re-designing a new tag system for the data when the evaluation requirement is not met, and continuously evaluating the new tag system until the evaluation requirement is met.
In some executable embodiments, the apparatus further comprises:
the determining unit is used for determining the life cycle of the tag according to different types of the data, the service type and the application to which the data belong;
a creation unit for creating a data warehouse platform for the data, and importing the life cycle of the tag into the data warehouse platform;
and the maintenance unit is used for updating and maintaining the life cycle of the tag on the basis of the asset of the data and the map of the data on the data warehouse platform so as to realize quality and abnormality monitoring of the tag.
In some executable embodiments, the design unit is further configured to:
according to the business requirement of business in the enterprise, at least one of the following data of the business is determined: business materials, customer preferences, market quotations, network hotspots, business processes, business related personnel information, business related products, business life cycle, business generated data and business data processing processes;
and determining a management and evaluation mode of the data based on the determined business data, and taking the determined management and evaluation mode as the processing target.
In some executable embodiments, the design unit is further configured to:
determining attribute information of the processing target, obtaining possible association relations of the processing target through association reasoning based on the attribute information and through algorithm recommendation, and determining corresponding labels for the processing target based on the possible association relations;
classifying the labels based on at least the business category, the user type, and the demand type of the processing target, and storing the labels in the data warehouse platform according to the resulting types; the label types include at least a phenotype label, a fact label, a rule label, and a model label; wherein the phenotype label is stored in the data detail layer (DWD) of the data warehouse platform;
carrying out engineering development according to the label rules and the data source conditions, and testing and verifying label quality by combining business requirements with engineering results, so as to ensure label accuracy;
configuring a label calling script that runs on a schedule to acquire labels from the data warehouse platform; and at the same time configuring script monitoring and quality monitoring to monitor the running of scripts, including the calling script, so as to ensure that the scripts run normally.
In some executable embodiments, the verification unit is further configured to:
determining, by means of the target group index (TGI), whether contradictory underlying logic exists among the plurality of labels of the processing target, and correcting the contradictory labels; or
using third-party data from a third-party platform to verify the accuracy of the plurality of labels of the processing target, and correcting labels inconsistent with the third-party data based on the third-party data; or
selecting target groups under different labels of the processing target, performing a split between-group experiment (A/B test), and verifying label accuracy according to the delivery results.
According to a third aspect of the present disclosure there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the steps of the aforementioned data tag system processing method.
The technical scheme of the present disclosure places no requirements on the data related to the processing target and can be put into operation directly; it provides a process-driven workflow, covers the problems encountered in various scenarios, and effectively guides the establishment of a label system. Label accuracy can be measured, and a verification and evaluation mechanism is provided. Because label quality affects how well business requirements are met, label accuracy can be analyzed while the labels are being built; at the same time, the scoring mechanism makes it convenient to bring tasks online and optimize them, so that an optimal label system is used rather than the most complete one, which greatly reduces cost. The technical scheme is particularly suitable for data governance in the clothing industry, namely data label construction and management: it allows a label system to be created more quickly through a process-driven workflow, is simple and convenient, allows label accuracy to be verified, and keeps cost controllable, and therefore has considerable commercial and social value.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which the same or corresponding reference numerals indicate the same or corresponding parts:
FIG. 1 is a schematic diagram of an implementation flow of a data tag system processing method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the composition structure of a data tag system processing apparatus according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the composition structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, features and advantages of the present disclosure more comprehensible, the technical solutions in the embodiments of the present disclosure will be clearly described in conjunction with the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, but not all embodiments. Based on the embodiments in this disclosure, all other embodiments that a person skilled in the art would obtain without making any inventive effort are within the scope of protection of this disclosure.
The embodiments of the present disclosure provide a scheme for constructing, evaluating, and implementing a data tag system. The scheme is mainly intended for, and best suited to, the clothing industry, where data currently have no associated labels. It applies to any specific subclass of clothing and places no requirements on the enterprise's data foundation or data capabilities. A label system is first constructed according to a specified flow; the label system is then evaluated, and label optimization and life cycle management are carried out. Label system construction comprises four steps: determining a target, designing a label system, developing labels, and verifying labels; the target serves as the guide, and business and data drive the labels into practical use. Label system evaluation includes: evaluating the generated labels according to their coverage, accuracy, and timeliness, and judging the label system with a scoring mechanism. Label life cycle management includes: managing the life cycle of labels in a unified way by means of data asset management and the data map, and managing labels by means of a data warehouse construction platform, so as to monitor the quality and abnormalities of label tasks.
The technical scheme of the embodiments starts from the target and proceeds through detailed design, development, and verification to system evaluation and life cycle management. It covers the whole flow of an index system: where labels come from, how they are processed and verified, how they are used, how abnormalities are handled, when they are taken offline, and so on. It is suitable for clothing companies and businesses at every stage.
The technical solutions of the embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic implementation flow diagram of a data tag system processing method according to an embodiment of the disclosure, and as shown in fig. 1, the data tag system processing method according to an embodiment of the disclosure includes the following processing steps:
Step 101, determining a processing target, acquiring the data of the processing target that is to be placed under tag management, designing a tag system for the data, developing tags based on the designed tag system, and verifying the developed tags.
In the embodiment of the disclosure, the processing target can be a person, an article, abstract information, and the like; tags mainly serve the data mart layer and are application-oriented. One processing target may correspond to many tags with different meanings, so tags are set according to the business or the data asset construction. Specifically, determining the processing target may include: according to the business requirements of the enterprise, determining at least one of the following data of the business: business materials, customer preferences, market quotations, network hotspots, business processes, business related personnel information, business related products, business life cycle, business generated data and business data processing processes; and determining a management and evaluation mode of the data based on the determined business data, and taking the determined management and evaluation mode as the processing target. The method is mainly oriented to the core business and the main business.
In the embodiment of the disclosure, labels are assigned to the data of the enterprise's main business as far as possible; for example, the clothing design industry needs to consider clothing materials, customer preferences, market quotations, and network hotspots. Sorting out the main business means determining the business processes, the persons related to the business, the products related to the business, the business life cycle, the data generated by the business, the business data processing flow, and the scenarios in which the business data are used.
Step 102, evaluating the verified tags based on at least one of the coverage, accuracy, and timeliness of the tags to obtain an evaluation result.
In the embodiment of the disclosure, at least one of the following data of the business is determined according to the business requirements of the enterprise: business materials, customer preferences, market quotations, network hotspots, business processes, business related personnel information, business related products, business life cycle, business generated data and business data processing processes; and a management and evaluation mode of the data is determined based on the determined business data and taken as the processing target. Specifically, the relationship between a tag and the processing target may be obtained explicitly, or may be derived from attributes through association reasoning and algorithm recommendation. For example, if a user buys baby clothing, it can be inferred that the user has a child; if a female user frequently buys anime and game cosplay clothing, it can be inferred that the user is an anime, comics, and games (ACG) enthusiast. The rules for determining such relationships need to be agreed jointly by business staff and technical staff; once the rules are formed, the electronic device determines the correspondence between tags and the processing target according to them. Labels can also be classified by business line, by user, and by demand. Taking the clothing industry as an example, labels can be divided based on human body part, gender, hairstyle, posture, fabric type, sleeve length, trouser length, collar height, skirt length, waistline height, color, clothing style, silhouette and decoration, occasion and style impression, collar subdivision, shoulder subdivision, sleeve subdivision, front cut subdivision, hem subdivision, pocket subdivision, process elements, and the like.
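The rule-based inference described above can be sketched as follows. This is only an illustration under stated assumptions: the category names (`baby_clothing`, `anime_game_cosplay`), the tag names, and the purchase-count threshold are hypothetical and stand in for rules that business and technical staff would agree on.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Purchase:
    user_id: str
    category: str  # hypothetical category name, e.g. "baby_clothing"


# A rule maps a purchase history to a tag, or to None when it does not apply.
Rule = Callable[[List[Purchase]], Optional[str]]


def has_child_rule(purchases: List[Purchase]) -> Optional[str]:
    # Any baby-clothing purchase suggests that the user has a child.
    if any(p.category == "baby_clothing" for p in purchases):
        return "has_child"
    return None


def acg_fan_rule(purchases: List[Purchase], min_count: int = 3) -> Optional[str]:
    # "Frequently buys" is modelled as at least min_count purchases (assumed threshold).
    count = sum(1 for p in purchases if p.category == "anime_game_cosplay")
    return "acg_enthusiast" if count >= min_count else None


RULES: List[Rule] = [has_child_rule, acg_fan_rule]


def infer_tags(purchases: List[Purchase]) -> Set[str]:
    """Apply every agreed rule and collect the tags that fire."""
    tags: Set[str] = set()
    for rule in RULES:
        tag = rule(purchases)
        if tag is not None:
            tags.add(tag)
    return tags
```

Keeping each rule as a separate function makes it easy for business staff to review one rule at a time and for engineers to add or retire rules without touching the others.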
In the embodiment of the present disclosure, on the basis of step 102, the life cycle of the tag needs to be determined according to the different types of the data, the service types and applications to which the data belong; creating a data warehouse platform for the data, and importing the life cycle of the tag into the data warehouse platform; and updating and maintaining the life cycle of the tag on the data warehouse platform based on the asset of the data and the map of the data so as to realize quality and abnormality monitoring of the tag.
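A minimal sketch of such life cycle bookkeeping is given below. The source does not specify how a life cycle is represented, so the retention periods per label type, the `TagRecord` fields, and the expiry rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Hypothetical retention periods per label type, in days.
LIFECYCLE_DAYS = {"fact": 365, "rule": 180, "model": 90}


@dataclass
class TagRecord:
    name: str
    tag_type: str          # one of the keys of LIFECYCLE_DAYS
    last_refreshed: date   # date of the last successful update


def is_expired(tag: TagRecord, today: date) -> bool:
    """A tag is expired when it has not been refreshed within its life cycle."""
    return today - tag.last_refreshed > timedelta(days=LIFECYCLE_DAYS[tag.tag_type])


def tags_to_retire(tags: List[TagRecord], today: date) -> List[str]:
    """Names of tags that should be reviewed or taken offline."""
    return [t.name for t in tags if is_expired(t, today)]
```

In practice the records would live in the data warehouse platform, and the list returned by `tags_to_retire` would feed the quality and abnormality monitoring rather than trigger deletion directly.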
In the embodiment of the disclosure, tags are developed as follows: attribute information of the processing target is determined, possible association relations of the processing target are obtained through association reasoning based on the attribute information and through algorithm recommendation, and corresponding labels are determined for the processing target based on the possible association relations;
the labels are classified based on at least the business category, the user type, and the demand type of the processing target, and are stored in the data warehouse platform according to the resulting types; the label types include at least a phenotype label, a fact label, a rule label, and a model label; wherein the phenotype label is stored in the data detail layer (DWD) of the data warehouse platform; engineering development is carried out according to the label rules and the data source conditions, and label quality is tested and verified by combining business requirements with engineering results, so as to ensure label accuracy; a label calling script is configured to run on a schedule to acquire labels from the data warehouse platform; and at the same time script monitoring and quality monitoring are configured to monitor the running of scripts, including the calling script, so as to ensure that the scripts run normally.
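The combination of a calling script with script monitoring and quality monitoring might look like the sketch below. It covers a single invocation only; scheduling would be handled by an external scheduler such as cron. The `fetch_tags` callable, the row format, and both thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RunReport:
    status: str   # "ok", "failed", or "quality_alert"
    rows: int
    detail: str = ""


def run_monitored(
    fetch_tags: Callable[[], List[Dict[str, Optional[str]]]],
    min_rows: int = 1,
    max_null_ratio: float = 0.1,
) -> RunReport:
    """Run the calling script once and report script and quality status."""
    # Script monitoring: any exception means the script did not run normally.
    try:
        rows = fetch_tags()
    except Exception as exc:
        return RunReport("failed", 0, repr(exc))

    # Quality monitoring: too few rows or too many missing tags raise an alert.
    if len(rows) < min_rows:
        return RunReport("quality_alert", len(rows), "too few rows")
    nulls = sum(1 for r in rows if r.get("tag") is None)
    if rows and nulls / len(rows) > max_null_ratio:
        return RunReport("quality_alert", len(rows), "too many null tags")
    return RunReport("ok", len(rows))
```

Returning a report instead of raising keeps the scheduler loop simple: it can log every run and decide separately which statuses should notify an operator.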
In the embodiment of the disclosure, the labels are verified after they are obtained; specifically, label quality is tested and verified by combining business requirements with engineering results, so as to ensure label accuracy. As an implementation, the labels may be verified by sampling: 6% of the data is extracted from each label category and verified, so that the local verification results are used to check the labels as a whole. Alternatively, the target group index (TGI) is used to determine whether contradictory underlying logic exists among the plurality of labels of the processing target, and the contradictory labels are corrected; for example, if a piece of women's underwear carries both a female tag and a male tag, the combination is neither logical nor reasonable given the group the underwear is aimed at. Alternatively, third-party data from a third-party platform is used to verify the accuracy of the plurality of labels of the processing target, and labels inconsistent with the third-party data are corrected based on the third-party data; for example, third-party data obtained through Guangdiantong (广点通) can be used to verify accuracy. Alternatively, target groups under different labels of the processing target are selected, a split between-group experiment (A/B test) is performed, and label accuracy is verified according to the delivery results. In a concrete implementation, several rounds of small-scale testing can be carried out in an actual business scenario before the label system is brought online at large scale, which avoids large-scale adjustment and modification.
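Two of the verification methods above can be sketched in a few lines. The TGI formula used here is the common definition (share of a feature in the target group divided by its share in the overall population, times 100); the source does not state which formula it uses or how a contradiction is derived from the index, so that part is left to the caller. The 6% ratio comes from the text; the function names and the fixed random seed are assumptions.

```python
import random
from typing import Dict, List, Sequence


def tgi(target_with_feature: int, target_total: int,
        overall_with_feature: int, overall_total: int) -> float:
    """Target group index: 100 means the target group matches the overall population."""
    target_share = target_with_feature / target_total
    overall_share = overall_with_feature / overall_total
    return target_share / overall_share * 100


def sample_for_check(records_by_label: Dict[str, Sequence],
                     percent: int = 6, seed: int = 0) -> Dict[str, List]:
    """Draw about `percent` % of the records of each label category for manual checking."""
    rng = random.Random(seed)  # fixed seed so that a check can be reproduced
    samples: Dict[str, List] = {}
    for label, records in records_by_label.items():
        # Integer ceiling division, so small non-empty categories still yield one record.
        k = (len(records) * percent + 99) // 100
        samples[label] = rng.sample(list(records), k)
    return samples
```

A TGI far from what the label implies, for instance an unexpectedly high index for a male tag among buyers of women's underwear, would be a signal to inspect the underlying rule.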
Step 103, calculating an evaluation value of the tag system based on the evaluation result and the set dimensions and dimension weights, judging from the calculated evaluation value whether the tag system meets the evaluation requirement, and, when the evaluation requirement is not met, re-designing a new tag system for the data and continuing to evaluate the new tag system until the evaluation requirement is met.
In the embodiment of the disclosure, the quality of the label system can be observed through verification, and an evaluation system can be used to evaluate the labels as a whole. The evaluation of the labels can be carried out as follows:
the dimensions and scores may be adjusted according to the business and scenario, starting from the suggested dimensions and score values given below:
coverage: 15 points at most, accounting for 15%;
accuracy: 15 points at most, accounting for 15%;
timeliness: 15 points at most, accounting for 15%;
usage: 15 points at most, accounting for 15%;
attention: 15 points at most, accounting for 15%;
utility (practical usefulness): 15 points at most, accounting for 15%;
data desensitization: 10 points at most, accounting for 10%;
label score = coverage score + accuracy score + timeliness score + usage score + attention score + utility score + data desensitization score.
Based on the above scoring model, the score of the tag system is calculated and then compared with the score value set when the target was determined, to judge whether the requirement is met.
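The scoring model above can be written down directly. In the sketch below the maximum points per dimension follow the suggested values in the text (they sum to 100); how each dimension's evaluation result is turned into points is not specified in the source, so the sketch assumes each dimension is rated on a scale from 0 to 1 and multiplied by its maximum. The function names and the target value are hypothetical.

```python
from typing import Dict

# Suggested maximum points per dimension, as listed in the text (total 100).
MAX_POINTS: Dict[str, int] = {
    "coverage": 15,
    "accuracy": 15,
    "timeliness": 15,
    "usage": 15,
    "attention": 15,
    "utility": 15,
    "desensitization": 10,
}


def tag_system_score(ratings: Dict[str, float]) -> float:
    """Sum of rating * maximum points; a missing dimension counts as 0."""
    total = 0.0
    for dimension, max_points in MAX_POINTS.items():
        rating = ratings.get(dimension, 0.0)
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"rating for {dimension} must be in [0, 1]")
        total += rating * max_points
    return total


def meets_requirement(ratings: Dict[str, float], target_score: float) -> bool:
    """Compare the computed score with the score set when the target was determined."""
    return tag_system_score(ratings) >= target_score
```

When `meets_requirement` returns `False`, the flow described in Step 103 returns to the design stage, and the new label system is scored again with the same function.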
In the embodiment of the disclosure, tags serve as the portrait of the processing target and are an important data asset of the enterprise, which is safeguarded through refined operation in the manner described above.
Fig. 2 is a schematic diagram showing a composition structure of a data tag system processing apparatus according to an embodiment of the present disclosure, and as shown in fig. 2, the data tag system processing apparatus according to an embodiment of the present disclosure includes:
the design unit 20 is used for determining a processing target, acquiring data to be subjected to tag management of the processing target, and designing a tag system for the data;
a development unit 21 for performing tag development based on the designed tag system;
a verification unit 22 for verifying the developed tag;
an evaluation unit 23, configured to evaluate the verified tag based on at least one of coverage, accuracy, and timeliness of the tag, to obtain an evaluation result;
and the judging unit 24 is used for calculating the evaluation value of the tag system based on the evaluation result and the set dimension and dimension weight, judging whether the tag system meets the evaluation requirement or not based on the calculated evaluation value, and re-designing a new tag system for the data when the evaluation requirement is not met, and continuously evaluating the new tag system until the evaluation requirement is met.
On the basis of the data tag system processing apparatus shown in fig. 2, the data tag system processing apparatus according to the embodiment of the present disclosure further includes:
a determining unit (not shown in fig. 2) for determining a life cycle of the tag according to different types of the data, the service type and the application to which the data belongs;
a creation unit (not shown in fig. 2) for creating a data warehouse platform for the data, importing the lifecycle of the tag into the data warehouse platform;
a maintenance unit (not shown in fig. 2) for updating and maintaining the life cycle of the tag based on the asset of the data and the map of the data on the data warehouse platform to realize quality and anomaly monitoring of the tag.
As an implementation, the design unit 20 is further configured to:
according to the business requirement of business in the enterprise, at least one of the following data of the business is determined: business materials, customer preferences, market quotations, network hotspots, business processes, business related personnel information, business related products, business life cycle, business generated data and business data processing processes;
and determining a management and evaluation mode of the data based on the determined business data, and taking the determined management and evaluation mode as the processing target.
As an implementation, the design unit 20 is further configured to:
determining attribute information of the processing target, performing association reasoning based on the attribute information, recommending by an algorithm to obtain a possible association relation of the processing target, and determining a corresponding label for the processing target based on the possible association relation;
classifying and dividing the labels at least based on the business category, the user type and the demand type of the processing target, and storing the labels in the data warehouse platform based on the classified types; the label types at least comprise a phenotype label, a fact label, a rule label and a model label; wherein, the phenotype label is stored in a data detail layer DWD in the data warehouse platform;
carrying out engineering development according to the label rules and the condition of the data sources, and testing and verifying label quality by combining business demands with engineering results, so as to ensure label accuracy;
configuring a label calling script so that the calling script runs on a schedule to acquire labels from the data warehouse platform; and at the same time configuring script monitoring and quality monitoring to monitor the operation of the scripts, including the calling script, so as to ensure that the scripts run normally.
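By way of a non-limiting sketch, script monitoring and quality monitoring may be combined in a wrapper around the label calling script. The disclosure does not name the monitored quantities, so the row format, the `min_rows` and `max_null_ratio` thresholds, and the function `run_monitored` are assumptions; the scheduler that invokes the wrapper at fixed times (for example, a cron job) is likewise assumed and not shown.

```python
import time
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]  # hypothetical row shape: {"user_id": ..., "tag": ...}


def run_monitored(fetch_tags: Callable[[], List[Row]],
                  min_rows: int = 1,
                  max_null_ratio: float = 0.1) -> Dict[str, object]:
    """Run one scheduled call of the label script and report run and quality status."""
    report: Dict[str, object] = {"ok": False, "error": None, "rows": 0, "null_ratio": None}
    started = time.monotonic()
    try:
        rows = fetch_tags()  # script monitoring: any exception is recorded below
    except Exception as exc:
        report["error"] = repr(exc)
    else:
        # Quality monitoring: enough rows, and not too many missing tag values.
        nulls = sum(1 for row in rows if row.get("tag") is None)
        null_ratio = nulls / len(rows) if rows else 0.0
        report["rows"] = len(rows)
        report["null_ratio"] = null_ratio
        report["ok"] = len(rows) >= min_rows and null_ratio <= max_null_ratio
    report["seconds"] = time.monotonic() - started
    return report
```

A report whose `ok` field is false would then trigger an alert, whether the cause is a failed run or a quality threshold being exceeded.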
As an implementation, the verification unit 22 is further configured to:
determining, by means of a Target Group Index (TGI), whether contradictory underlying logic exists among a plurality of labels of the processing target, and correcting the contradictory labels; or
using third-party data from a third-party platform to check the accuracy of a plurality of labels of the processing target, and correcting the labels inconsistent with the third-party data based on the third-party data; or
selecting target groups under different labels of the processing target, performing a split-group A/B test, and checking the accuracy of the labels according to the delivery results.
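Two of the checks above can be sketched numerically. The disclosure gives no formulas, so the following assumes the commonly used definition of TGI (the share of a target group having a characteristic, divided by the share of the overall population having it, multiplied by 100) and assumes that the A/B delivery results are compared with a two-proportion z-test. The function names and the choice of test are illustrative only.

```python
import math
from typing import Tuple


def tgi(target_with: int, target_total: int,
        overall_with: int, overall_total: int) -> float:
    """Target Group Index: 100 means the target group matches the overall population."""
    target_share = target_with / target_total
    overall_share = overall_with / overall_total
    if overall_share == 0:
        raise ValueError("characteristic is absent from the overall population")
    return target_share / overall_share * 100


def ab_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float]:
    """Two-proportion z-test on conversions of groups A and B.

    Returns (z, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value
```

Under these assumptions, a TGI well below 100 for a characteristic that a label is supposed to imply may indicate contradictory labels, and a significant difference in response between a labeled group and a control group lends support to the label's accuracy.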
In an exemplary embodiment, each processing unit in the data tag system processing apparatus of the embodiments of the present disclosure may be implemented by one or more central processing units (CPU, Central Processing Unit), graphics processing units (GPU, Graphics Processing Unit), application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), digital signal processors (DSP, Digital Signal Processor), programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors (Microprocessor), or other electronic elements.
The specific manner in which the various modules and units of the apparatus in the above embodiments perform their operations has been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device and a readable storage medium.
Fig. 3 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. As shown in fig. 3, the electronic device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in electronic device 800 are connected to I/O interface 805, including: an input unit 806 such as a keyboard, mouse, etc.; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, or the like. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various specialized artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the various methods and processes described above, such as the data tag system processing method. For example, in some embodiments, the data tag system processing method of the disclosed embodiments may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more of the steps of the data tag system processing method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the steps of the data tag system processing method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
The foregoing is merely specific embodiments of the present disclosure, but the protection scope of the disclosure is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope of the disclosure shall fall within the protection scope of the disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (11)

1. A method of processing a data tag system, the method comprising:
determining a processing target, acquiring data to be subjected to tag management of the processing target, designing a tag system for the data, developing tags based on the designed tag system, and checking the developed tags;
evaluating the verified label based on at least one of coverage, accuracy and timeliness of the label to obtain an evaluation result;
and calculating the evaluation value of the tag system based on the evaluation result and the set dimension and dimension weight, judging whether the tag system meets the evaluation requirement or not based on the calculated evaluation value, and re-designing a new tag system for the data when the evaluation requirement is not met, and continuously evaluating the new tag system until the evaluation requirement is met.
2. The method according to claim 1, wherein the method further comprises:
determining the life cycle of the tag according to different types of the data, the service type and the application to which the data belongs;
creating a data warehouse platform for the data, and importing the life cycle of the tag into the data warehouse platform;
and updating and maintaining the life cycle of the tag on the data warehouse platform based on the asset of the data and the map of the data so as to realize quality and abnormality monitoring of the tag.
3. The method of claim 2, wherein the determining the processing target comprises:
according to the business requirement of business in the enterprise, at least one of the following data of the business is determined: business materials, customer preferences, market quotations, network hotspots, business processes, business related personnel information, business related products, business life cycle, business generated data and business data processing processes;
and determining a management and evaluation mode of the data based on the determined business data, and taking the determined management and evaluation mode as the processing target.
4. A method according to claim 3, wherein the developing of tags based on the designed tag system and verifying the developed tags comprises:
determining attribute information of the processing target, performing association reasoning based on the attribute information, recommending by an algorithm to obtain a possible association relation of the processing target, and determining a corresponding label for the processing target based on the possible association relation;
classifying and dividing the labels at least based on the business category, the user type and the demand type of the processing target, and storing the labels in the data warehouse platform based on the classified types; the label types at least comprise a phenotype label, a fact label, a rule label and a model label; wherein, the phenotype label is stored in a data detail layer DWD in the data warehouse platform;
engineering development is carried out according to label rules and data source conditions, and label quality is tested and checked by combining business appeal and engineering results, so that label accuracy is ensured;
configuring a label calling script to enable the calling script to run at fixed time so as to acquire a label from the data warehouse platform; and simultaneously configuring script monitoring and quality monitoring to perform operation monitoring on the script comprising the calling script, thereby ensuring normal operation of the script.
5. The method of claim 4, wherein said combining business appeal and engineering results to test label quality, to ensure label accuracy, comprises:
determining, by means of a Target Group Index (TGI), whether contradictory underlying logic exists among a plurality of labels of the processing target, and correcting the contradictory labels; or
using third-party data from a third-party platform to check the accuracy of a plurality of labels of the processing target, and correcting the labels inconsistent with the third-party data based on the third-party data; or
selecting target groups under different labels of the processing target, performing a split-group A/B test, and checking the accuracy of the labels according to the delivery results.
6. A data tag system processing apparatus, the apparatus comprising:
the design unit is used for determining a processing target, acquiring data to be subjected to tag management of the processing target, and designing a tag system for the data;
the development unit is used for developing labels based on the designed label system;
the verification unit is used for verifying the developed label;
the evaluation unit is used for evaluating the verified label based on at least one of coverage, accuracy and timeliness of the label to obtain an evaluation result;
the judging unit is used for calculating the evaluation value of the tag system based on the evaluation result and the set dimension and dimension weight, judging whether the tag system meets the evaluation requirement or not based on the calculated evaluation value, and re-designing a new tag system for the data when the evaluation requirement is not met, and continuously evaluating the new tag system until the evaluation requirement is met.
7. The apparatus of claim 6, wherein the apparatus further comprises:
the determining unit is used for determining the life cycle of the tag according to different types of the data, the service type and the application to which the data belong;
a creation unit for creating a data warehouse platform for the data, and importing the life cycle of the tag into the data warehouse platform;
and the maintenance unit is used for updating and maintaining the life cycle of the tag on the basis of the asset of the data and the map of the data on the data warehouse platform so as to realize quality and abnormality monitoring of the tag.
8. The apparatus of claim 7, wherein the design unit is further configured to:
according to the business requirement of business in the enterprise, at least one of the following data of the business is determined: business materials, customer preferences, market quotations, network hotspots, business processes, business related personnel information, business related products, business life cycle, business generated data and business data processing processes;
and determining a management and evaluation mode of the data based on the determined business data, and taking the determined management and evaluation mode as the processing target.
9. The apparatus of claim 8, wherein the design unit is further configured to:
determining attribute information of the processing target, performing association reasoning based on the attribute information, recommending by an algorithm to obtain a possible association relation of the processing target, and determining a corresponding label for the processing target based on the possible association relation;
classifying and dividing the labels at least based on the business category, the user type and the demand type of the processing target, and storing the labels in the data warehouse platform based on the classified types; the label types at least comprise a phenotype label, a fact label, a rule label and a model label; wherein, the phenotype label is stored in a data detail layer DWD in the data warehouse platform;
engineering development is carried out according to label rules and data source conditions, and label quality is tested and checked by combining business appeal and engineering results, so that label accuracy is ensured;
configuring a label calling script to enable the calling script to run at fixed time so as to acquire a label from the data warehouse platform; and simultaneously configuring script monitoring and quality monitoring to perform operation monitoring on the script comprising the calling script, thereby ensuring normal operation of the script.
10. The apparatus of claim 9, wherein the verification unit is further configured to:
determining, by means of a Target Group Index (TGI), whether contradictory underlying logic exists among a plurality of labels of the processing target, and correcting the contradictory labels; or
using third-party data from a third-party platform to check the accuracy of a plurality of labels of the processing target, and correcting the labels inconsistent with the third-party data based on the third-party data; or
selecting target groups under different labels of the processing target, performing a split-group A/B test, and checking the accuracy of the labels according to the delivery results.
11. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the steps of the data tag system processing method of any one of claims 1 to 5.
CN202311707872.7A 2023-12-12 2023-12-12 Data tag system processing method and device and storage medium Pending CN117725060A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311707872.7A CN117725060A (en) 2023-12-12 2023-12-12 Data tag system processing method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311707872.7A CN117725060A (en) 2023-12-12 2023-12-12 Data tag system processing method and device and storage medium

Publications (1)

Publication Number Publication Date
CN117725060A true CN117725060A (en) 2024-03-19

Family

ID=90208177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311707872.7A Pending CN117725060A (en) 2023-12-12 2023-12-12 Data tag system processing method and device and storage medium

Country Status (1)

Country Link
CN (1) CN117725060A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination