CN110968596A - Data processing method based on label system - Google Patents
Data processing method based on label system
- Publication number
- CN110968596A
- Authority
- CN
- China
- Prior art keywords
- data
- data processing
- label
- processing
- label system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a data processing method based on a label system. The method first establishes a correspondence between the label system and the processing rules, and uses the label system to identify data characteristics and analyse the data in a structured way, so that the processing rules become standardized and general-purpose. Data is then extracted by label item, which enables multi-level analysis and reuse of the data. By tying the processing rules to the label system and extracting data by label item, the method standardizes and generalizes the rules, supports multi-level analysis and reuse of the data, and greatly increases the value of the data.
Description
Technical Field
The invention relates to the technical field of big data, in particular to a data processing method based on a label system.
Background
Data is a representation of facts, concepts, or instructions that can be processed manually or by automated means. Once data is interpreted and given meaning, it becomes information. Data processing is the collection, storage, retrieval, processing, transformation, and transmission of data.
The basic purpose of data processing is to extract and derive data that is valuable and meaningful to particular users from large volumes of cluttered, hard-to-understand data.
Data processing is a basic part of systems engineering and automatic control, and it runs through every field of social production and social life. The development of data processing technology, and the breadth and depth of its application, have greatly influenced the progress of human society.
However, in the big data era, with the rapid rise and spread of internet technology, the volume of data collected in different fields has grown to an unprecedented scale.
The industry summarizes the characteristics of big data as the four "V"s: Volume, Variety, Value, and Velocity.
First, the volume of data is huge, growing from the TB level to the PB level;
second, the types of data are varied, including weblogs, videos, pictures, geographical location information, and so on;
third, the value density is low: taking video as an example, during continuous monitoring only one or two seconds of the data may be useful;
fourth, processing must be fast, which is a fundamental difference from conventional data mining technology.
In summary, the arrival of the big data era has transformed the way data is generated, stored, and processed. People's work and life can now largely be represented digitally, so an effective way of processing data is increasingly important.
In the traditional data processing mode, a data processing engineer carries out a series of processing steps according to business requirements. These steps are tightly coupled to the business, so the processing rules cannot be standardized and the data results cannot be unified; in other words, the data cannot be reused. The complex and repetitive work makes data processing inefficient, and because the data is not standardized it cannot be shared, which reduces its value.
With the advent of the big data era, data has become large in volume, diverse, and valuable, and data processing is becoming more important and more complex. Finding an effective data processing method that brings out the value of the data and makes the data more convenient to use has become an urgent problem.
In view of the above, the present invention provides a data processing method based on a label system.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a simple and efficient data processing method based on a label system.
The invention is realized by the following technical solution:
A data processing method based on a label system, characterized in that: first, a correspondence between the label system and the processing rules is established, and the label system is used to identify data characteristics and analyse the data in a structured way, so that the processing rules are standardized and generalized; data is then extracted by label item to achieve multi-level analysis and reuse of the data.
In this method, the original data is first analysed according to business requirements, and a label system organized as a label tree is created.
Personnel of a specific business group tend to stay too close to their own business during analysis, so the processed data may not be usable by other business departments. To avoid this, personnel of the basic data processing group and personnel of the specific business group create the label system together.
Within the label system, a correspondence between the data processing rules and the labels is established. The data processing rules are fine-grained: each label has its own data processing rule, and the processed data is assigned the corresponding label value. When the data is processed further, a later step can build on the label values of an earlier step, which take part in its calculation. Each processing step is independent, and the processed data can be analysed further under different rules, so both the processing rules and the data are reused. The data processing rules are kept as simple as possible, and each rule performs processing that matches the meaning of its label.
After the label system is created, a computation and storage scheme is constructed, the data processing rules are computed, and the processing results are stored and displayed.
To monitor and handle all executed rules uniformly and to make the flow clearer, all processing modes are integrated behind a uniform interface: the processing mode is identified when a rule is defined, and the rule is executed through the uniform interface.
When the data is regular and the data volume is small, a traditional relational database is used to store the processing results.
When the data volume is large and the processing results must be queried quickly, a non-relational database such as Elasticsearch is used. It can store large volumes of data and query them quickly, and its built-in data processing functions make it easier for operators to explore the data further.
The beneficial effects of the invention are as follows: by establishing a correspondence between the label system and the processing rules, the method standardizes and generalizes the processing rules; by extracting data by label item, it enables multi-level analysis and reuse of the data and greatly increases the value of the data.
Drawings
FIG. 1 is a schematic diagram of a data processing method based on a label system according to the present invention.
Detailed Description
To help those skilled in the art better understand the technical solution of the invention, the technical solution is described clearly and completely below with reference to an embodiment. The described embodiment is only an example of the invention and does not limit its full scope. All other embodiments that a person skilled in the art can derive from the embodiment given here without creative effort fall within the protection scope of the invention.
The data processing method based on a label system first establishes a correspondence between the label system and the processing rules, and uses the label system to identify data characteristics and analyse the data in a structured way, so that the processing rules are standardized and generalized; data is then extracted by label item to achieve multi-level analysis and reuse of the data.
In this method, the original data is first analysed according to business requirements, and a label system organized as a label tree is created.
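The label tree can be pictured with a small data structure. The following is only a minimal sketch: the class name `Label`, its methods, and the example labels (`customer`, `age_band`, and so on) are hypothetical and do not appear in the patent text.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Label:
    """One node of a label tree (hypothetical structure for illustration)."""
    name: str
    children: List["Label"] = field(default_factory=list)

    def add(self, name: str) -> "Label":
        """Create a child label and return it so deeper levels can be chained."""
        child = Label(name)
        self.children.append(child)
        return child

    def find(self, name: str) -> Optional["Label"]:
        """Depth-first search for a label by name; None if it is absent."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def paths(self, prefix: str = "") -> Iterator[str]:
        """Yield the full path of every label, parents before children."""
        path = f"{prefix}/{self.name}" if prefix else self.name
        yield path
        for child in self.children:
            yield from child.paths(path)


# Example tree; the label names are invented for this sketch.
root = Label("customer")
basic = root.add("basic_profile")
basic.add("age_band")
behaviour = root.add("behaviour")
behaviour.add("purchase_frequency")

all_paths = list(root.paths())
```

Listing the paths gives each business group a single place to look up where its data requirement sits in the hierarchy.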
A well-built label system lets the personnel of each business group clearly locate the data they need, so that they can quickly build their own business processes from it. Personnel of a specific business group tend to stay too close to their own business during analysis, so the processed data may not be usable by other business departments; to avoid this, personnel of the basic data processing group and of the specific business group create the label system together.
A well-built label system makes data processing structured and streamlined, clear and reliable. It reduces the data processing workload and extracts as much value from the data as possible.
Within the label system, a correspondence between the data processing rules and the labels is established. The data processing rules are fine-grained: each label has its own data processing rule, each processing step achieves one effect, and the processed data is assigned the corresponding label value. When the data is processed further, the label values of an earlier step can take part in the next calculation. Each processing step is independent, and the processed data can be analysed further under different rules, so both the processing rules and the data are reused, which greatly improves the efficiency of data processing.
In addition, the data processing rules should be kept as simple as possible, and each rule should perform processing that matches the meaning of its label. Because the creation of data processing rules is closely tied to the label hierarchy, rule creators should not pack multiple steps into one rule for convenience, as this confuses the overall label hierarchy.
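The one-rule-per-label idea, with a later rule reading the label values written by earlier ones, might look like the sketch below. It is an illustration under assumptions: the rule list, the field names (`age`, `monthly_spend`), and the helpers `apply_rules` and `extract` are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Rule = Callable[[Record], Any]

# Ordered (label, rule) pairs. Each rule does exactly one simple thing,
# matching the meaning of its label. Labels and fields are invented.
RULES: List[Tuple[str, Rule]] = [
    ("age_band", lambda r: "adult" if r["age"] >= 18 else "minor"),
    ("high_spender", lambda r: r["monthly_spend"] > 1000),
    # This rule reads the label values produced by the two rules above.
    ("premium_candidate",
     lambda r: r["age_band"] == "adult" and r["high_spender"]),
]


def apply_rules(record: Record, rules: List[Tuple[str, Rule]] = RULES) -> Record:
    """Return a copy of the record with one label value added per rule."""
    labelled = dict(record)
    for label, rule in rules:
        labelled[label] = rule(labelled)
    return labelled


def extract(records: List[Record], label: str, value: Any) -> List[Record]:
    """Extract the records whose label item has the given value."""
    return [r for r in records if r.get(label) == value]


raw = [
    {"id": 1, "age": 34, "monthly_spend": 1500},
    {"id": 2, "age": 16, "monthly_spend": 2000},
    {"id": 3, "age": 52, "monthly_spend": 300},
]
labelled = [apply_rules(r) for r in raw]
premium_ids = [r["id"] for r in extract(labelled, "premium_candidate", True)]
adult_ids = [r["id"] for r in extract(labelled, "age_band", "adult")]
```

Because the labelled records keep every intermediate label value, a different analysis (here, all adults) can reuse them without recomputing anything from the raw data.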
After the label system is created, a computation and storage scheme is constructed, the data processing rules are computed, and the processing results are stored and displayed.
Because data sources differ, the processing modes used are varied. Examples include traditional data processing, Excel processing, relational database processing, and the currently popular big data processing framework Spark. To monitor and handle all executed rules uniformly and to make the flow clearer, all processing modes are integrated behind a uniform interface: the processing mode is identified when a rule is defined, and the rule is executed through the uniform interface.
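One way to read the uniform interface is as a common base class with one implementation per processing mode, plus a dispatcher that logs every execution. The sketch below assumes only two modes, plain Python and SQL (run on an in-memory SQLite database from the standard library); the class names, the rule dictionary layout, and `EXECUTION_LOG` are hypothetical.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class Processor(ABC):
    """Uniform interface that every processing mode implements."""

    @abstractmethod
    def execute(self, expression: Any, rows: List[Dict[str, Any]]) -> List[Any]:
        ...


class PythonProcessor(Processor):
    """Mode 'python': the expression is a callable applied to each row."""

    def execute(self, expression, rows):
        return [expression(row) for row in rows]


class SqlProcessor(Processor):
    """Mode 'sql': the expression is a SQL column expression.

    The expression is interpolated into the query, so this sketch is only
    suitable for trusted rule definitions.
    """

    def execute(self, expression, rows):
        conn = sqlite3.connect(":memory:")
        try:
            conn.execute("CREATE TABLE t (id INTEGER, amount REAL)")
            conn.executemany("INSERT INTO t VALUES (:id, :amount)", rows)
            cur = conn.execute(f"SELECT {expression} FROM t ORDER BY id")
            return [r[0] for r in cur.fetchall()]
        finally:
            conn.close()


PROCESSORS = {"python": PythonProcessor(), "sql": SqlProcessor()}
EXECUTION_LOG = []  # one entry per executed rule, for monitoring


def run_rule(rule: Dict[str, Any], rows: List[Dict[str, Any]]) -> List[Any]:
    """Execute a rule through the processor named by its 'mode' field."""
    processor = PROCESSORS[rule["mode"]]
    values = processor.execute(rule["expression"], rows)
    EXECUTION_LOG.append((rule["label"], rule["mode"], len(values)))
    return values


rows = [{"id": 1, "amount": 50.0}, {"id": 2, "amount": 150.0}]
doubled = run_rule(
    {"label": "double_amount", "mode": "python",
     "expression": lambda row: row["amount"] * 2},
    rows,
)
size = run_rule(
    {"label": "order_size", "mode": "sql",
     "expression": "CASE WHEN amount > 100 THEN 'large' ELSE 'small' END"},
    rows,
)
```

Since every rule passes through `run_rule`, the execution log gives a single place to monitor all rules regardless of which backend ran them. Other modes, such as Excel or Spark, would be further `Processor` subclasses and are not shown here.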
The storage scheme for processing results depends on the specific scenario. When the data is regular and the data volume is small, a traditional relational database is used to store the processing results.
When the data volume is large and the processing results must be queried quickly, a non-relational database such as Elasticsearch is used. It can store large volumes of data and query them quickly, and its built-in data processing functions make it easier for operators to explore the data further.
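The storage choice described above can be sketched as a small decision function. The text gives no numeric boundary between "small" and "large", so the threshold `SMALL_DATA_ROWS` is purely an assumption, as are the function names and the `label_values` table. Only the relational path is exercised, using the standard-library `sqlite3` module; no Elasticsearch client calls are shown.

```python
import sqlite3
from typing import Iterable, Tuple

# Hypothetical threshold; the source text does not define "small" or "large".
SMALL_DATA_ROWS = 1_000_000


def choose_store(row_count: int, is_regular: bool, needs_fast_query: bool) -> str:
    """Pick a storage family following the two cases named in the text."""
    if is_regular and row_count <= SMALL_DATA_ROWS:
        return "relational"
    if row_count > SMALL_DATA_ROWS and needs_fast_query:
        return "non_relational"
    # The text does not say what to do in the remaining cases.
    return "unspecified"


def save_relational(conn: sqlite3.Connection,
                    results: Iterable[Tuple[int, str, str]]) -> None:
    """Store (record_id, label, value) rows in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS label_values "
        "(record_id INTEGER, label TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO label_values VALUES (?, ?, ?)", results)
    conn.commit()


conn = sqlite3.connect(":memory:")
results = [
    (1, "age_band", "adult"),
    (2, "age_band", "minor"),
    (1, "high_spender", "true"),
]
if choose_store(len(results), is_regular=True, needs_fast_query=False) == "relational":
    save_relational(conn, results)

# Query the stored processing results by label item.
adult_ids = [
    row[0]
    for row in conn.execute(
        "SELECT record_id FROM label_values "
        "WHERE label = ? AND value = ? ORDER BY record_id",
        ("age_band", "adult"),
    )
]
```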
Compared with the prior art, the data processing method based on a label system has the following characteristics:
First, with a label system in place, data processing personnel can quickly and clearly locate the data they need and quickly build their own services; it also avoids repeating the analysis and calculation over a large amount of data every time the same batch of data is used.
Second, the data processing rules support diversity: rules that apply mathematical operations, enumeration, regular expressions, or text analysis to the data can all be used. This flexibility widens the range of application of the labels.
Third, the uniform rule calculation interface helps technicians monitor and manage the whole processing flow more effectively.
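The four rule kinds named in the second point can each be expressed as a small rule factory. The sketch below is illustrative only: the factory names, the field names, and the keyword-matching stand-in for "text analysis" are assumptions rather than anything specified in the patent.

```python
import re
from typing import Any, Callable, Dict, Iterable

Rule = Callable[[Dict[str, Any]], Any]


def math_rule(field: str, factor: float) -> Rule:
    """Mathematical operation: scale a numeric field."""
    return lambda r: r[field] * factor


def enum_rule(field: str, mapping: Dict[Any, Any], default: Any = "unknown") -> Rule:
    """Enumeration: map a code to a named value."""
    return lambda r: mapping.get(r[field], default)


def regex_rule(field: str, pattern: str) -> Rule:
    """Regular expression: label whether the field matches the pattern."""
    compiled = re.compile(pattern)
    return lambda r: bool(compiled.search(r[field]))


def keyword_rule(field: str, keywords: Iterable[str]) -> Rule:
    """Very simple text analysis: does the text mention any keyword?"""
    wanted = {k.lower() for k in keywords}
    return lambda r: bool(wanted & set(re.findall(r"\w+", r[field].lower())))


# One rule per label; label and field names are invented for this sketch.
RULES: Dict[str, Rule] = {
    "amount_cents": math_rule("amount", 100),
    "region_name": enum_rule("region_code", {"N": "north", "S": "south"}),
    "has_email": regex_rule("contact", r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "mentions_refund": keyword_rule("comment", ["refund", "return"]),
}

record = {
    "amount": 12,
    "region_code": "S",
    "contact": "reach me at a.b@example.com",
    "comment": "I would like a refund please",
}
labels = {name: rule(record) for name, rule in RULES.items()}
```

All four kinds share the same call signature, which is what lets a single label system mix them freely.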
The data processing method based on a label system in the embodiment of the invention has been described in detail above. The specific examples are provided to help in understanding the core concepts of the invention; all other embodiments that those skilled in the art can obtain without departing from the spirit of the invention fall within the protection scope of the invention.
Claims (8)
1. A data processing method based on a label system, characterized in that: first, a correspondence between the label system and the processing rules is established, and the label system is used to identify data characteristics and analyse the data in a structured way, so that the processing rules are standardized and generalized; data is then extracted by label item to achieve multi-level analysis and reuse of the data.
2. The data processing method based on a label system according to claim 1, characterized in that: the original data is first analysed according to business requirements, and a label system organized as a label tree is created.
3. The data processing method based on a label system according to claim 2, characterized in that: to avoid the situation in which personnel of a specific business group stay too close to their own business during analysis, so that the processed data cannot be used generally by other business departments, personnel of the basic data processing group and personnel of the specific business group create the label system together.
4. The data processing method based on a label system according to claim 2 or 3, characterized in that: within the label system, a correspondence between the data processing rules and the labels is established; the data processing rules are fine-grained, each label has its own data processing rule, and the processed data is assigned the corresponding label value; when the data is processed further, a later step can build on the label values of an earlier step, which take part in its calculation; each processing step is independent, and the processed data can be analysed further under different rules, so that both the processing rules and the data are reused; and the data processing rules are kept as simple as possible, each rule performing processing that matches the meaning of its label.
5. The data processing method based on a label system according to claim 4, characterized in that: after the label system is created, a computation and storage scheme is constructed, the data processing rules are computed, and the processing results are stored and displayed.
6. The data processing method based on a label system according to claim 5, characterized in that: to monitor and handle all executed rules uniformly and to make the flow clearer, all processing modes are integrated behind a uniform interface; the processing mode is identified when a rule is defined, and the rule is executed through the uniform interface.
7. The data processing method based on a label system according to claim 5, characterized in that: when the data is regular and the data volume is small, a traditional relational database is used to store the processing results.
8. The data processing method based on a label system according to claim 5, characterized in that: when the data volume is large and the processing results must be queried quickly, a non-relational database is used, which can store large volumes of data and query them quickly, and whose built-in data processing functions make it easier for operators to explore the data further.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911211040.XA CN110968596A (en) | 2019-12-02 | 2019-12-02 | Data processing method based on label system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911211040.XA CN110968596A (en) | 2019-12-02 | 2019-12-02 | Data processing method based on label system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110968596A | 2020-04-07 |
Family
ID=70032505
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911211040.XA Withdrawn CN110968596A (en) | 2019-12-02 | 2019-12-02 | Data processing method based on label system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110968596A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112637065A (en) * | 2021-03-05 | 2021-04-09 | 南京中孚信息技术有限公司 | Distributed processing system and method based on label addressing calculation |
CN114238699A (en) * | 2021-12-16 | 2022-03-25 | 浪潮卓数大数据产业发展有限公司 | Data processing method based on label rule |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104331435B (en) | A kind of efficient mass data abstracting method of low influence based on Hadoop big data platforms | |
CN108268600B (en) | AI-based unstructured data management method and device | |
CN107506389B (en) | Method and device for extracting job skill requirements | |
KR101617696B1 (en) | Method and device for mining data regular expression | |
CN109214642B (en) | Automatic extraction and classification method and system for building construction process constraints | |
CN110489749B (en) | Business process optimization method of intelligent office automation system | |
CN111125300A (en) | Intelligent analysis system based on knowledge graph information data | |
CN112000773A (en) | Data association relation mining method based on search engine technology and application | |
CN109582837A (en) | A kind of visualized data processing method based on cloud and system | |
CN110968596A (en) | Data processing method based on label system | |
CN108304382A (en) | Mass analysis method based on manufacturing process text data digging and system | |
CN104252616A (en) | Human face marking method, device and equipment | |
CN114254102B (en) | Natural language-based collaborative emergency response SOAR script recommendation method | |
CN116049379A (en) | Knowledge recommendation method, knowledge recommendation device, electronic equipment and storage medium | |
CN112000790A (en) | Legal text accurate retrieval method, terminal system and readable storage medium | |
CN111026940A (en) | Network public opinion and risk information monitoring system and electronic equipment for power grid electromagnetic environment | |
CN111666263A (en) | Method for realizing heterogeneous data management in data lake environment | |
CN111159411A (en) | Knowledge graph fused text position analysis method, system and storage medium | |
CN109976271B (en) | Method for calculating information structure order degree by using information representation method | |
CN105930453A (en) | Repeatability analyzing method and device | |
CN111813555A (en) | Super-fusion infrastructure layered resource management system based on internet technology | |
CN111125198A (en) | Computer data mining clustering method based on time sequence | |
CN110399337A (en) | File automating method of servicing and system based on data-driven | |
CN116994076B (en) | Small sample image recognition method based on double-branch mutual learning feature generation | |
CN112307278B (en) | Topic context real-time generation method and system with arbitrary scale |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20200407 |