CN110968596A - Data processing method based on label system - Google Patents

Data processing method based on label system

Info

Publication number
CN110968596A
Authority
CN
China
Prior art keywords: data, data processing, label, processing, label system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911211040.XA
Other languages
Chinese (zh)
Inventor
王勇庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Original Assignee
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority to CN201911211040.XA
Publication of CN110968596A
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a data processing method based on a label system. The method first establishes a correspondence between a label system and processing rules, and uses the label system to identify data characteristics and analyze the data in a structured way, so that the processing rules are standardized and generalized. It then extracts data by label item to achieve multi-level analysis and reuse of the data. By standardizing and generalizing the processing rules and by extracting data by label item, the method greatly improves the value of the data.

Description

Data processing method based on label system
Technical Field
The invention relates to the technical field of big data, in particular to a data processing method based on a label system.
Background
Data is a representation of facts, concepts, or instructions that can be processed manually or by automated means. Once data is interpreted and given meaning, it becomes information. Data processing is the collection, storage, retrieval, processing, transformation, and transmission of data.
The basic purpose of data processing is to extract and derive data that is valuable and meaningful to particular people from large volumes of disordered, hard-to-understand data.
Data processing is a basic link in systems engineering and automatic control, and it pervades every field of social production and social life. The development of data processing technology, and the breadth and depth of its application, have greatly influenced the progress of human society.
However, in the big data era, with the rapid rise and spread of internet technology, the volume of data collected in different fields has grown to an unprecedented scale.
The industry summarizes the characteristics of big data as four "V"s: Volume, Variety, Value, and Velocity.
First, the data volume is huge, having grown from the TB level to the PB level;
second, the data types are diverse, including weblogs, videos, pictures, geographical location information, and so on;
third, the value density is low: taking video as an example, in continuous monitoring footage the potentially useful data may last only one or two seconds;
fourth, the processing speed is fast, which is a fundamental difference from conventional data mining technology.
In summary, the arrival of the big data era has transformed how data is generated, stored, and processed. Because people's work and life can now largely be represented digitally, an effective way of processing data is increasingly important.
In the traditional data processing mode, a data processing engineer carries out a series of processing steps according to business requirements. These steps are tightly coupled to the business, so the processing rules cannot be standardized and the data results cannot be unified; that is, the data cannot be reused. The complex and repetitive work makes data processing inefficient, and because the data is not standardized it cannot be shared, which reduces its value.
With the advent of the big data era, data has grown in volume, variety, and value, and data processing is becoming more important and more complex. How to find an effective data processing mode that brings out the value of data and makes data easier to use has become an urgent problem.
In view of the above, the present invention provides a data processing method based on a label system.
Disclosure of Invention
In order to make up for the defects of the prior art, the invention provides a simple and efficient data processing method based on a label system.
The invention is realized by the following technical scheme:
A data processing method based on a label system, characterized in that: first, a correspondence between a label system and processing rules is established, and the label system is used to identify data characteristics and analyze the data in a structured way, so that the processing rules are standardized and generalized; then, data is extracted by label item to achieve multi-level analysis and reuse of the data.
In this method, the original data is first analyzed according to business requirements, and a label system organized as a label tree is created.
If personnel of a specific business group tie the analysis too closely to their own business, the processed data cannot be used generally by other business departments. To avoid this, personnel of the basic data processing group and personnel of the specific business group create the label system together.
Within the label system, a correspondence between data processing rules and labels is established. The data processing rules are fine-grained: each label has a corresponding data processing rule, and the processed data is assigned the corresponding label value. When the data is processed further, a later step can build on the analysis of the label value from the previous step, so that the earlier label value takes part in the next calculation. Each processing step is independent, and the processed data can be analyzed further under different rules, so that both the processing rules and the data are reused. The data processing rules are kept as simple as possible, and the processing performed by each rule matches the meaning expressed by its label.
After the label system is established, a computation and storage scheme is constructed: the data processing rules are calculated, the processing results are stored, and the processing results are displayed.
To monitor and handle all executed rules uniformly and to make the flow clearer, all processing modes are integrated behind a unified interface. The processing mode is identified when a rule is defined, and the unified interface is used to execute the operation.
When the data is regular and the data volume is small, a traditional relational database is used to store the processing results.
When the data volume is large and fast queries on the processing results are required, a non-relational database such as Elasticsearch is used. It can store and quickly query large volumes of data, and its built-in data processing functions make it easier for operators to explore the data further.
The beneficial effects of the invention are as follows: by establishing the correspondence between the label system and the processing rules, the method standardizes and generalizes the processing rules; by extracting data by label item, it achieves multi-level analysis and reuse of the data, which greatly improves the value of the data.
Drawings
FIG. 1 is a schematic diagram of a data processing method based on a label system according to the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiment of the present invention will be clearly and completely described below with reference to the embodiment of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The data processing method based on the label system first establishes a correspondence between a label system and processing rules, and uses the label system to identify data characteristics and analyze the data in a structured way, so that the processing rules are standardized and generalized. It then extracts data by label item to achieve multi-level analysis and reuse of the data.
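Extraction by label item can be pictured with a minimal sketch. It assumes, purely for illustration, that each processed record stores its label values under slash-separated label paths; the function name `extract` and the sample labels are hypothetical and do not come from the patent text.

```python
from typing import Any, Dict, List

TaggedRecord = Dict[str, Any]


def extract(records: List[TaggedRecord], label_prefix: str) -> List[TaggedRecord]:
    """Keep, for every record, only the label values at or below label_prefix."""
    selected = []
    for record in records:
        subset = {
            label: value
            for label, value in record.items()
            if label == label_prefix or label.startswith(label_prefix + "/")
        }
        selected.append(subset)
    return selected


# Hypothetical records that have already been labelled by earlier rules.
records = [
    {"customer/region": "north", "customer/contact/email_valid": True, "order/amount": 120.0},
    {"customer/region": "south", "customer/contact/email_valid": False, "order/amount": 80.0},
]

customer_view = extract(records, "customer")          # coarse level
contact_view = extract(records, "customer/contact")   # finer level
```

The same labelled records serve both views without being reprocessed, which is one way to read "multi-level analysis and reuse".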
In this method, the original data is first analyzed according to business requirements, and a label system organized as a label tree is created.
A well-built label system lets the personnel of each business group clearly locate the data they need, so that they can quickly build their own business processes on top of it. If personnel of a specific business group tie the analysis too closely to their own business, the processed data cannot be used generally by other business departments. To avoid this, personnel of the basic data processing group and personnel of the specific business group create the label system together.
A well-built label system makes data processing structured and streamlined, and therefore clear and reliable. It reduces the data processing workload and extracts as much value from the data as possible.
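A label tree of this kind might be represented as follows. This is only a sketch under assumptions: the patent does not specify a data structure, and the class name `Label`, the slash-separated paths, and the sample labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass
class Label:
    """One node of the label tree (illustrative names only)."""
    name: str
    children: Dict[str, "Label"] = field(default_factory=dict)

    def add(self, name: str) -> "Label":
        # Return the existing child if there is one, so adding is idempotent.
        return self.children.setdefault(name, Label(name))

    def find(self, path: str) -> Optional["Label"]:
        # Resolve a slash-separated path such as "customer/contact/email".
        node = self
        for part in path.split("/"):
            node = node.children.get(part)
            if node is None:
                return None
        return node

    def paths(self, prefix: str = "") -> Iterator[str]:
        # Yield the full path of every label below this node, depth first.
        for child in self.children.values():
            full = f"{prefix}/{child.name}" if prefix else child.name
            yield full
            yield from child.paths(full)


root = Label("root")
customer = root.add("customer")
contact = customer.add("contact")
contact.add("email")
contact.add("phone")
customer.add("region")

all_paths = list(root.paths())
```

Listing `all_paths` gives each business group a browsable index of where its data requirements sit in the tree.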
Within the label system, a correspondence between data processing rules and labels is established. The data processing rules are fine-grained: each label has a corresponding data processing rule, each processing step achieves one effect, and the processed data is assigned the corresponding label value. When the data is processed further, the label value from the previous step can take part in the next calculation. Each processing step is independent, and the processed data can be analyzed further under different rules. Both the processing rules and the data are thus reused, which greatly improves the efficiency of data processing.
In addition, the data processing rules are kept as simple as possible, and the processing performed by each rule matches the meaning expressed by its label. Because the creation of data processing rules is closely tied to the label hierarchy, rule creators should not pack several steps into one rule for convenience; doing so confuses the overall label hierarchy.
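The one-rule-per-label idea, with later rules consuming earlier label values, can be sketched as below. The labels, field names, and the ordered-list representation are assumptions made for illustration, not details given in the patent.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Rule = Callable[[Record], Any]

# Ordered (label, rule) pairs. Each label has exactly one small rule, and a
# later rule may read the label values written by earlier ones.
# All labels and field names here are hypothetical.
RULES: List[Tuple[str, Rule]] = [
    ("order/amount", lambda r: float(r["raw_amount"])),
    ("order/is_large", lambda r: r["order/amount"] >= 1000.0),
    ("order/review_needed", lambda r: r["order/is_large"] and r["country"] != "CN"),
]


def apply_rules(record: Record, rules: List[Tuple[str, Rule]]) -> Record:
    tagged = dict(record)  # copy, so the input record is left untouched
    for label, rule in rules:
        tagged[label] = rule(tagged)  # store the label value for later rules
    return tagged


raw = {"raw_amount": "1250.50", "country": "DE"}
result = apply_rules(raw, RULES)
```

Because each rule does one thing, `order/is_large` can be reused by any other rule or report without repeating the amount parsing.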
After the label system is established, a computation and storage scheme is constructed: the data processing rules are calculated, the processing results are stored, and the processing results are displayed.
Because data comes from different sources, many processing modes are in use, for example traditional modes such as Excel processing and relational database processing, as well as currently popular big data processing frameworks such as Spark. To monitor and handle all executed rules uniformly and to make the flow clearer, all processing modes are integrated behind a unified interface. The processing mode is identified when a rule is defined, and the unified interface is used to execute the operation.
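One possible shape for such a unified interface is a registry keyed by processing mode. The sketch below is an assumption, not the patent's implementation: it uses plain Python callables and an in-memory SQLite table as two stand-in processing modes, and the names `executor`, `run_rule`, and `execution_log` are hypothetical.

```python
import sqlite3
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Executor = Callable[[Any, List[Row]], List[Any]]

EXECUTORS: Dict[str, Executor] = {}
execution_log: List[str] = []  # one place to monitor every executed rule


def executor(mode: str) -> Callable[[Executor], Executor]:
    """Register a backend under a processing-mode identifier."""
    def wrap(fn: Executor) -> Executor:
        EXECUTORS[mode] = fn
        return fn
    return wrap


@executor("python")
def run_python(body: Callable[[Row], Any], rows: List[Row]) -> List[Any]:
    return [body(row) for row in rows]


@executor("sql")
def run_sql(body: str, rows: List[Row]) -> List[Any]:
    # Load the rows into an in-memory table named "data", run the rule's
    # SELECT statement, and take the first column of each result row.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE data (name TEXT, amount REAL)")
        conn.executemany("INSERT INTO data VALUES (:name, :amount)", rows)
        return [r[0] for r in conn.execute(body)]
    finally:
        conn.close()


def run_rule(rule: Dict[str, Any], rows: List[Row]) -> List[Any]:
    """Unified entry point: dispatch on the mode recorded in the rule."""
    values = EXECUTORS[rule["mode"]](rule["body"], rows)
    execution_log.append(f"{rule['label']} via {rule['mode']}: {len(values)} values")
    return values


rows = [{"name": "a", "amount": 5.0}, {"name": "b", "amount": 20.0}]
py_values = run_rule(
    {"label": "amount/doubled", "mode": "python", "body": lambda r: r["amount"] * 2},
    rows,
)
sql_values = run_rule(
    {"label": "amount/is_high", "mode": "sql",
     "body": "SELECT amount > 10 FROM data ORDER BY rowid"},
    rows,
)
```

A Spark or Excel backend would be registered the same way under its own mode identifier; callers and monitoring code would not change.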
The storage scheme for processing results depends on the specific scenario. When the data is regular and the data volume is small, a traditional relational database is used to store the processing results.
When the data volume is large and fast queries on the processing results are required, a non-relational database such as Elasticsearch is used. It can store and quickly query large volumes of data, and its built-in data processing functions make it easier for operators to explore the data further.
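The storage choice can be written down as a small decision function. The patent gives only qualitative criteria, so the numeric threshold `small_limit`, the `ResultProfile` fields, and the `"undecided"` fallback below are assumptions added for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultProfile:
    """Hypothetical description of one set of processing results."""
    row_count: int
    regular_structure: bool
    needs_fast_query: bool


def choose_store(profile: ResultProfile, small_limit: int = 1_000_000) -> str:
    # small_limit is an arbitrary illustrative cut-off, not a value from the source.
    if profile.regular_structure and profile.row_count <= small_limit:
        return "relational"
    if profile.row_count > small_limit and profile.needs_fast_query:
        return "non_relational"
    # The source text does not say what to do in the remaining cases.
    return "undecided"


small = choose_store(ResultProfile(50_000, True, False))
large = choose_store(ResultProfile(80_000_000, False, True))
other = choose_store(ResultProfile(80_000_000, True, False))
```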
Compared with the prior art, the data processing method based on the label system has the following characteristics:
First, with a label system in place, a data processor can quickly and clearly locate the data it needs and quickly build its own business on it. This also avoids repeating the analysis and calculation over a large amount of data every time the same batch of data is used.
Second, the data processing rules support diversity: rules that perform mathematical operations, enumeration, regular expression matching, or text analysis on the data can all be applied. This flexibility widens the range of application of the labels.
Third, the unified rule calculation interface helps technicians monitor and manage the whole processing flow more effectively.
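The four kinds of rule named above might look like this in practice. The field names, the enumeration table, the email pattern, and the keyword are hypothetical examples chosen for the sketch; the text analysis shown is a deliberately simple keyword count.

```python
import re
from typing import Any, Dict

Record = Dict[str, Any]


def total_price(record: Record) -> float:
    # Mathematical operation.
    return record["price"] * record["quantity"]


def gender(record: Record) -> str:
    # Enumeration: map a raw code to a fixed set of label values.
    return {"M": "male", "F": "female"}.get(record["gender_code"], "unknown")


def email_valid(record: Record) -> bool:
    # Regular expression (a loose illustrative pattern, not a full validator).
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", record["email"]) is not None


def refund_mentions(record: Record) -> int:
    # Text analysis: count occurrences of one keyword in free text.
    words = re.findall(r"[a-z]+", record["comment"].lower())
    return words.count("refund")


RULES = {
    "order/total_price": total_price,
    "customer/gender": gender,
    "customer/email_valid": email_valid,
    "feedback/refund_mentions": refund_mentions,
}

record = {
    "price": 2.5,
    "quantity": 4,
    "gender_code": "F",
    "email": "user@example.com",
    "comment": "Please refund my order, the refund is late",
}
labels = {label: rule(record) for label, rule in RULES.items()}
```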
The data processing method based on a label system in the embodiment of the present invention has been described in detail above. The specific examples are provided to assist in understanding the core concepts of the present invention; all other embodiments that those skilled in the art can obtain without departing from the spirit of the present invention shall fall within its scope of protection.

Claims (8)

1. A data processing method based on a label system, characterized in that: first, a correspondence between a label system and processing rules is established, and the label system is used to identify data characteristics and analyze the data in a structured way, so that the processing rules are standardized and generalized; then, data is extracted by label item to achieve multi-level analysis and reuse of the data.
2. The data processing method based on a label system according to claim 1, characterized in that: the original data is first analyzed according to business requirements, and a label system organized as a label tree is created.
3. The data processing method based on a label system according to claim 2, characterized in that: to prevent personnel of a specific business group from tying the analysis too closely to their own business, so that the processed data cannot be used generally by other business departments, personnel of the basic data processing group and personnel of the specific business group create the label system together.
4. The data processing method based on a label system according to claim 2 or 3, characterized in that: within the label system, a correspondence between data processing rules and labels is established; the data processing rules are fine-grained, each label has a corresponding data processing rule, and the processed data is assigned the corresponding label value; when the data is processed further, a later step can build on the analysis of the label value from the previous step, so that the earlier label value takes part in the next calculation; each processing step is independent, and the processed data can be analyzed further under different rules, so that both the processing rules and the data are reused; and the data processing rules are kept as simple as possible, the processing performed by each rule matching the meaning expressed by its label.
5. The data processing method based on a label system according to claim 4, characterized in that: after the label system is established, a computation and storage scheme is constructed, the data processing rules are calculated, the processing results are stored, and the processing results are displayed.
6. The data processing method based on a label system according to claim 5, characterized in that: to monitor and handle all executed rules uniformly and to make the flow clearer, all processing modes are integrated behind a unified interface, the processing mode is identified when a rule is defined, and the unified interface is used to execute the operation.
7. The data processing method based on a label system according to claim 5, characterized in that: when the data is regular and the data volume is small, a traditional relational database is used to store the processing results.
8. The data processing method based on a label system according to claim 5, characterized in that: when the data volume is large and fast queries on the processing results are required, a non-relational database is used, which can store and quickly query large volumes of data and whose built-in data processing functions make it easier for operators to explore the data further.
CN201911211040.XA 2019-12-02 2019-12-02 Data processing method based on label system Withdrawn CN110968596A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911211040.XA CN110968596A (en) 2019-12-02 2019-12-02 Data processing method based on label system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911211040.XA CN110968596A (en) 2019-12-02 2019-12-02 Data processing method based on label system

Publications (1)

Publication Number Publication Date
CN110968596A (en) 2020-04-07

Family

ID=70032505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911211040.XA Withdrawn CN110968596A (en) 2019-12-02 2019-12-02 Data processing method based on label system

Country Status (1)

Country Link
CN (1) CN110968596A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112637065A (en) * 2021-03-05 2021-04-09 南京中孚信息技术有限公司 Distributed processing system and method based on label addressing calculation
CN114238699A (en) * 2021-12-16 2022-03-25 浪潮卓数大数据产业发展有限公司 Data processing method based on label rule

Similar Documents

Publication Publication Date Title
CN104331435B (en) A kind of efficient mass data abstracting method of low influence based on Hadoop big data platforms
CN108268600B (en) AI-based unstructured data management method and device
CN107506389B (en) Method and device for extracting job skill requirements
KR101617696B1 (en) Method and device for mining data regular expression
CN109214642B (en) Automatic extraction and classification method and system for building construction process constraints
CN110489749B (en) Business process optimization method of intelligent office automation system
CN111125300A (en) Intelligent analysis system based on knowledge graph information data
CN112000773A (en) Data association relation mining method based on search engine technology and application
CN109582837A (en) A kind of visualized data processing method based on cloud and system
CN110968596A (en) Data processing method based on label system
CN108304382A (en) Mass analysis method based on manufacturing process text data digging and system
CN104252616A (en) Human face marking method, device and equipment
CN114254102B (en) Natural language-based collaborative emergency response SOAR script recommendation method
CN116049379A (en) Knowledge recommendation method, knowledge recommendation device, electronic equipment and storage medium
CN112000790A (en) Legal text accurate retrieval method, terminal system and readable storage medium
CN111026940A (en) Network public opinion and risk information monitoring system and electronic equipment for power grid electromagnetic environment
CN111666263A (en) Method for realizing heterogeneous data management in data lake environment
CN111159411A (en) Knowledge graph fused text position analysis method, system and storage medium
CN109976271B (en) Method for calculating information structure order degree by using information representation method
CN105930453A (en) Repeatability analyzing method and device
CN111813555A (en) Super-fusion infrastructure layered resource management system based on internet technology
CN111125198A (en) Computer data mining clustering method based on time sequence
CN110399337A (en) File automating method of servicing and system based on data-driven
CN116994076B (en) Small sample image recognition method based on double-branch mutual learning feature generation
CN112307278B (en) Topic context real-time generation method and system with arbitrary scale

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200407
