CN116414827A - Terminal data acquisition, analysis and management method based on intelligent library - Google Patents
Terminal data acquisition, analysis and management method based on intelligent library
- Publication number: CN116414827A
- Application number: CN202111673102.6A
- Authority: CN (China)
- Prior art keywords: data, classification, physical data, terminal, physical
- Prior art date
- Legal status: Withdrawn (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a terminal data acquisition, analysis and management method based on an intelligent library. Data are acquired through both physical terminals and virtual terminals, so that a wider variety of data can be collected. Data from multiple devices in the same domain are grouped, repeated and near-duplicate data are merged, and obviously erroneous data are removed. The data are then marked with keywords to form a unit management database, which is analyzed and managed according to later requirements.
Description
Technical Field
The invention relates to the technical field of computer science, in particular to a terminal data acquisition, analysis and management method based on an intelligent library.
Background
Computer science and technology are developing rapidly, and the database lies at their core. The information carried by data is key to human development and progress: computer systems process and analyze different data in turn and output the required results, which are applied in daily life and production and continually drive scientific and technological progress. Through continuous understanding, collection, comparison, analysis and learning, computers extract the useful information that people need from massive amounts of data.
In this whole process, data collection is the most important part, because only full and comprehensive collection lets people understand the data comprehensively. Many data collection technologies are currently on the market, but their collection mode is simple: they only collect the data and do not process it. The collected data stays in its raw form, any associations within it are not reflected, and it contains repeated, outdated and redundant items. When analyzing such massive data, people must first spend a great deal of time removing these items, so efficiency is low. A data acquisition method is therefore needed that improves collection efficiency over the prior art and makes the collected data easier to analyze and manage comprehensively.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a terminal data acquisition, analysis and management method based on an intelligent library. The method acquires data through a physical terminal and a virtual terminal, groups data from multiple devices in the same domain, merges repeated and near-duplicate data, and performs analysis and management according to later requirements, thereby greatly improving analysis and management efficiency.
A terminal data acquisition, analysis and management method based on an intelligent library, characterized in that the method comprises the following specific steps:
1) Acquiring data using a physical data acquisition terminal and a virtual data acquisition terminal, and gathering the acquired data in an intelligent library;
the virtual data acquisition covers stored data and real-time network data;
the physical data acquisition terminal comprises a communication access terminal, a remote acquisition device and a handheld data acquisition device;
2) Preprocessing the obtained data;
classifying data of the same or similar domain into one group according to the requirements of the application scenario;
3) Removing redundancy from the grouped data;
let the n signal samples in one group of data be the observation objects, each object having m physical data, so that a sample sequence $D = [d_1, d_2, \ldots, d_m]^T$ is obtained, where $d_i$ denotes all sample points of the $i$-th physical data, $i = 1, 2, \ldots, m$; starting-point zeroing is applied to the samples of the m physical data (formula not reproduced);
the starting-point zero image of the $i$-th physical data is obtained for each $i$, and these images form the initialized sample matrix; for all $i \le j$, $i, j = 1, 2, \ldots, m$, the association coefficients $|s_i|$ and $|s_j|$ of the $i$-th and $j$-th physical data are computed (formula not reproduced);
the gray absolute correlation degree between the physical data is then obtained from these coefficients (formula not reproduced);
finding the values of the gray absolute correlation degree between physical data that are larger than 0.8, regarding the corresponding physical data as strongly correlated, and randomly removing one of the two physical data in each such pair;
4) Marking the unit management database with keywords according to the classification mode;
5) Analyzing the corresponding unit management database according to actual needs to obtain an analysis result.
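The formulas for step 3) are not reproduced in this text. For orientation only, the wording of that step (starting-point zero image, coefficients $|s_i|$ and $|s_j|$, gray absolute correlation degree) matches the standard absolute degree of grey incidence from grey system theory, which can be written as below. The symbols $d_i^0$, $D^0$, $s_i$ and $\varepsilon_{ij}$ are assumed notation, and the formulas of the invention may differ in detail.

```latex
\begin{align}
d_i^0(k) &= d_i(k) - d_i(1), \qquad k = 1, 2, \ldots, n \\
D^0 &= [d_1^0, d_2^0, \ldots, d_m^0]^T \\
|s_i| &= \left| \sum_{k=2}^{n-1} d_i^0(k) + \tfrac{1}{2}\, d_i^0(n) \right| \\
|s_i - s_j| &= \left| \sum_{k=2}^{n-1} \bigl( d_i^0(k) - d_j^0(k) \bigr)
              + \tfrac{1}{2} \bigl( d_i^0(n) - d_j^0(n) \bigr) \right| \\
\varepsilon_{ij} &= \frac{1 + |s_i| + |s_j|}{1 + |s_i| + |s_j| + |s_i - s_j|}
\end{align}
```

Under this assumed definition $\varepsilon_{ij}$ lies in $(0, 1]$ and equals 1 when the two zeroed sequences coincide; step 3) treats any pair with $\varepsilon_{ij} > 0.8$ as strongly correlated.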
As a further improvement of the invention, the classification modes of the application scenario include classification by field type, by data structure, by the perspective from which things are described, by data processing perspective, by data granularity and by update mode.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a terminal data acquisition, analysis and management method based on an intelligent library. Data are acquired through both physical terminals and virtual terminals, so that a wider variety of data can be collected. Data from multiple devices in the same domain are grouped, repeated and near-duplicate data are merged, and obviously erroneous data are removed. The data are then marked with keywords to form a unit management database, which is analyzed and managed according to later requirements.
Drawings
Fig. 1 is a schematic diagram of the principle and structure of the terminal data acquisition, analysis and management method of the invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawing and the detailed description.
As a specific embodiment, the invention provides a terminal data acquisition, analysis and management method based on an intelligent library, comprising the following specific steps:
1) Acquiring data using a physical data acquisition terminal and a virtual data acquisition terminal, and gathering the acquired data in an intelligent library;
the virtual data acquisition covers stored data and real-time network data;
the physical data acquisition terminal comprises a communication access terminal, a remote acquisition device and a handheld data acquisition device;
2) Preprocessing the obtained data;
classifying data of the same or similar domain into one group according to the requirements of the application scenario;
the classification modes of the application scenario include classification by field type, by data structure, by the perspective from which things are described, by data processing perspective, by data granularity and by update mode.
Classification by field type includes text, numeric and time classes.
Text class data is typically used for descriptive fields such as names, addresses and transaction summaries. Such data is not quantitative and cannot be used directly in the four arithmetic operations. In use, a field can first be standardized (for example, address standardization) and then matched character by character, or it can be fuzzy-matched directly.
Numeric class data either describes quantitative attributes or serves as a code. Transaction amounts, commodity counts, product counts, customer scores and the like are quantitative attributes; they can be used directly in the four arithmetic operations and are the core fields for everyday indicator calculation. Postal codes, ID card numbers, card numbers and the like are codes, which encode a set of enumerated values according to rules. Arithmetic can be performed on them, but the results have no real business meaning; many codes serve as dimensions.
Time class data only describes when an event occurs. Time is a very important dimension in business statistics and analysis.
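The two ways of matching text class data described above (standardize first and then compare, or fuzzy-match directly) can be sketched as follows. This is only an illustration: the `standardize` rule and the sample addresses are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever matching technique an implementation actually uses.

```python
from difflib import SequenceMatcher


def standardize(address):
    """Hypothetical normalization: lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in address.lower()
    )
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio in [0, 1]; 1.0 means the standardized strings are identical."""
    return SequenceMatcher(None, standardize(a), standardize(b)).ratio()


# Character matching after standardization: formatting differences disappear.
exact = standardize("12 Main St., Springfield") == standardize("12  main st springfield")

# Fuzzy matching: an abbreviation lowers the ratio but keeps it high.
score = similarity("12 Main St., Springfield", "12 Main Street, Springfield")
```

A threshold on `score` would then decide whether two records refer to the same address; the threshold value is a design choice that the text does not specify.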
Classification by data structure includes structured data, semi-structured data and unstructured data.
Structured data generally refers to data recorded in relational-database form: the data is stored in tables and fields, and the fields are independent of one another.
Semi-structured data is data recorded as self-describing text. Because self-describing data does not have to satisfy the strict structure and relationships of a relational database, it is very convenient to use. Many websites and application access logs use this format, as do web pages themselves.
Unstructured data generally refers to data in formats such as voice, pictures and video. Such data is usually encoded in an application-specific format, is very large in volume, and cannot simply be converted into structured data.
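As a small illustration of the difference between semi-structured and structured data, the sketch below maps self-describing log records onto a fixed set of columns. The log lines, the field names and the choice of JSON as the self-describing format are assumptions made for the example only.

```python
import json

# Hypothetical access-log lines in a self-describing (JSON) format.
log_lines = [
    '{"ts": "2021-12-31T08:00:00", "path": "/books", "status": 200}',
    '{"ts": "2021-12-31T08:00:05", "path": "/login", "user": "u42"}',
]

# Fixed schema of the structured target table.
columns = ["ts", "path", "status", "user"]


def to_row(line):
    """Map one self-describing record onto the fixed columns.

    Fields missing from the record become None, since a relational row
    must supply a value (or NULL) for every column.
    """
    record = json.loads(line)
    return tuple(record.get(col) for col in columns)


rows = [to_row(line) for line in log_lines]
```

Each record carries its own field names, so the two lines may differ in shape; the structured form forces both into the same four columns.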
Classification from the perspective of describing things includes state class data, event class data and mixed class data.
Describing the objective world with data can generally be approached from two aspects. The first aspect describes the entities of the objective world, that is, individual objects such as people, tables and accounts. Each object has features, and different kinds of objects have different features: the features of a person include name, gender and age, while the features of a table include color and texture. Different individuals of the same kind have different feature values, for example Zhang San (male) and Li Si (female, 24 years old). Some features are stable, while others change constantly: gender generally does not change, but an account balance or a person's location may change at any time. Each object can therefore be described by a set of feature data that changes over time (the change in the data depends partly on the change in the object and partly on the delay with which that change is reflected in the data). The data at each point in time reflects the state of the object at that time and is therefore called state class data.
The second aspect describes the relationships between objects in the objective world: how they interact and how they respond to one another. The record of such an interaction or response is called event class data. For example, when a customer buys a piece of clothing in a store, three objects are involved (the customer, the store and the clothing), and a transaction relationship arises among them.
Mixed class data in theory also belongs to event class data. The difference is that the event described by mixed class data lasts longer: the event has not finished when the data is recorded and will continue to change. For example, an order takes some time to go from creation to settlement; the order data is first recorded when the order is created, and the order status and amount may change several times afterwards.
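The three kinds of data can be contrasted with a minimal sketch. The class names, fields and sample values below are hypothetical and serve only to show how a state snapshot, a completed event and a still-changing order might be represented.

```python
from dataclasses import dataclass, field


@dataclass
class AccountState:
    """State class data: a snapshot of an object's features at one point in time."""
    account_id: str
    balance: float
    as_of: str


@dataclass
class Purchase:
    """Event class data: a completed interaction between objects."""
    customer: str
    store: str
    item: str
    at: str


@dataclass
class Order:
    """Mixed class data: an event still in progress when it is first recorded."""
    order_id: str
    status: str
    amount: float
    history: list = field(default_factory=list)

    def update(self, status, amount):
        # Keep the earlier values so that changes to the order stay traceable.
        self.history.append((self.status, self.amount))
        self.status, self.amount = status, amount


order = Order("o-1", "created", 100.0)
order.update("paid", 95.0)
order.update("settled", 95.0)
```

Keeping a history on the order is one possible design; the text only states that the status and amount may change after the first record.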
Classification from the data processing perspective includes raw data and derived data.
Raw data refers to data from an upstream system that has not been processed. Although a large amount of derived data is generated from the raw data, a copy of the raw data is kept without any modification, so that if the derived data turns out to be wrong it can be recalculated from the raw data at any time.
Derived data refers to data generated by processing the raw data. It includes various data marts, summary layers, wide tables, and data analysis and mining results. By purpose, derived data falls into two cases: one improves the efficiency of data delivery, which covers data marts, summary layers and wide tables; the other solves business problems, which covers data analysis and mining results.
Classification by data granularity includes detail data and summary data.
Raw data obtained from a business system is usually fine-grained and contains a large amount of business detail. For example, the customer table contains the gender, age, name and other data of each customer, and the transaction table contains the time, place, amount and other data of each transaction. Such data is called detail data. Although detail data contains the richest business detail, analysis and mining on it often require a great deal of computation, so efficiency is low.
To improve the efficiency of data analysis, the data is preprocessed, usually by summarizing it along commonly used dimensions such as time, region and product; the result is summary data. During analysis, summary data is used first, and detail data is used only when the summary data cannot meet the need, which improves the efficiency of data use.
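Summarizing detail data along common dimensions can be sketched as below. The transaction records and the `summarize` helper are hypothetical; the point is only that the same detail data yields coarser or finer summaries depending on the dimensions chosen.

```python
from collections import defaultdict

# Hypothetical detail data: one record per transaction.
transactions = [
    {"date": "2021-12-30", "region": "north", "amount": 40.0},
    {"date": "2021-12-30", "region": "north", "amount": 60.0},
    {"date": "2021-12-30", "region": "south", "amount": 25.0},
    {"date": "2021-12-31", "region": "north", "amount": 10.0},
]


def summarize(records, dimensions):
    """Sum the amount for each combination of the given dimensions."""
    totals = defaultdict(float)
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        totals[key] += rec["amount"]
    return dict(totals)


# Summary by time and region, and a coarser summary by region only.
by_day_region = summarize(transactions, ["date", "region"])
by_region = summarize(transactions, ["region"])
```

An analysis that needs only regional totals can read `by_region` and never touch the detail records, which is the efficiency gain the text describes.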
Classification by update mode includes batch data and real-time data.
Source systems provide data in different ways, which fall mainly into two modes. One is the batch mode, in which data is provided at intervals and each delivery contains all changes in that period. The batch mode has lower timeliness: most traditional systems use a T+1 mode, so business users can at best analyze the previous day's data and read the previous day's reports.
The other is the real-time mode, in which data is provided immediately whenever it changes or new data is generated. This mode is highly timely and can effectively serve business with high timeliness requirements, such as scenario-based marketing. However, it places higher demands on the technology: the system must be sufficiently stable, because a data error can easily have a serious business impact.
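The contrast between the two update modes can be illustrated with a toy source system. The `ChangeFeed` class and its methods are hypothetical; a real-time consumer is modeled as a callback that fires on every change, and a T+1 batch consumer as a query for the previous day's changes.

```python
from datetime import date, timedelta


class ChangeFeed:
    """Toy source system that can deliver changes in batch or in real time."""

    def __init__(self):
        self.changes = []       # (day, record) pairs kept for batch delivery
        self.subscribers = []   # callbacks invoked immediately on each change

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def record(self, day, rec):
        self.changes.append((day, rec))
        for callback in self.subscribers:   # real-time path: push at once
            callback(rec)

    def batch_for(self, today):
        """T+1 batch path: everything that changed on the previous day."""
        yesterday = today - timedelta(days=1)
        return [rec for day, rec in self.changes if day == yesterday]


feed = ChangeFeed()
seen_live = []
feed.subscribe(seen_live.append)
feed.record(date(2021, 12, 30), "order-1")
feed.record(date(2021, 12, 31), "order-2")
batch = feed.batch_for(date(2021, 12, 31))
```

The real-time subscriber has already seen both records, while the batch run on 2021-12-31 returns only the record from the day before.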
3) Removing redundancy from the grouped data;
let the n signal samples in one group of data be the observation objects, each object having m physical data, so that a sample sequence $D = [d_1, d_2, \ldots, d_m]^T$ is obtained, where $d_i$ denotes all sample points of the $i$-th physical data, $i = 1, 2, \ldots, m$; starting-point zeroing is applied to the samples of the m physical data (formula not reproduced);
the starting-point zero image of the $i$-th physical data is obtained for each $i$, and these images form the initialized sample matrix; for all $i \le j$, $i, j = 1, 2, \ldots, m$, the association coefficients $|s_i|$ and $|s_j|$ of the $i$-th and $j$-th physical data are computed (formula not reproduced);
the gray absolute correlation degree between the physical data is then obtained from these coefficients (formula not reproduced);
finding the values of the gray absolute correlation degree between physical data that are larger than 0.8, regarding the corresponding physical data as strongly correlated, and randomly removing one of the two physical data in each such pair.
4) Marking the unit management database with keywords according to the classification mode;
5) Analyzing the corresponding unit management database according to actual needs.
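The redundancy removal of step 3) can be sketched as follows. Because the formulas are not reproduced in this text, the sketch assumes the standard absolute degree of grey incidence from grey system theory; the function names, the sample data and the injected random generator are all hypothetical, and the actual formulas of the invention may differ.

```python
import random


def zero_start(seq):
    """Starting-point zero image: subtract the first sample from every sample."""
    return [x - seq[0] for x in seq]


def s_value(zeroed):
    """Signed term s of one zeroed sequence (assumed standard form)."""
    return sum(zeroed[1:-1]) + 0.5 * zeroed[-1]


def gray_absolute_degree(a, b):
    """Assumed absolute degree of grey incidence of two equal-length sequences."""
    sa, sb = s_value(zero_start(a)), s_value(zero_start(b))
    base = 1 + abs(sa) + abs(sb)
    return base / (base + abs(sa - sb))


def remove_redundant(data, threshold=0.8, rng=None):
    """Drop one of each pair of physical data whose degree exceeds the threshold.

    data maps a physical-data name to its list of n samples.
    Returns the names that are kept, in their original order.
    """
    rng = rng or random.Random()
    names = list(data)
    removed = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in removed or b in removed:
                continue
            if gray_absolute_degree(data[a], data[b]) > threshold:
                removed.add(rng.choice([a, b]))  # random removal, as in step 3)
    return [name for name in names if name not in removed]


group = {
    "temp_a": [20.0, 21.0, 22.0, 23.0],
    "temp_b": [20.1, 21.1, 22.1, 23.1],  # same shape as temp_a
    "load": [5.0, 1.0, 9.0, 2.0],
}
kept = remove_redundant(group, rng=random.Random(0))
```

In this example `temp_a` and `temp_b` have nearly identical zeroed curves, so one of them is removed at random, while `load` is retained because its degree with either temperature series stays below 0.8.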
The above is only a preferred embodiment of the invention and does not limit the invention in any other form; any modification or equivalent variation made according to the technical spirit of the invention still falls within the scope of protection claimed by the invention.
Claims (2)
1. A terminal data acquisition, analysis and management method based on an intelligent library, characterized in that the method comprises the following specific steps:
1) Acquiring data using a physical data acquisition terminal and a virtual data acquisition terminal, and gathering the acquired data in an intelligent library;
the virtual data acquisition covers stored data and real-time network data;
the physical data acquisition terminal comprises a communication access terminal, a remote acquisition device and a handheld data acquisition device;
2) Preprocessing the obtained data;
classifying data of the same or similar domain into one group according to the requirements of the application scenario;
3) Removing redundancy from the grouped data;
let the n signal samples in one group of data be the observation objects, each object having m physical data, so that a sample sequence $D = [d_1, d_2, \ldots, d_m]^T$ is obtained, where $d_i$ denotes all sample points of the $i$-th physical data, $i = 1, 2, \ldots, m$; starting-point zeroing is applied to the samples of the m physical data (formula not reproduced);
the starting-point zero image of the $i$-th physical data is obtained for each $i$, and these images form the initialized sample matrix; for all $i \le j$, $i, j = 1, 2, \ldots, m$, the association coefficients $|s_i|$ and $|s_j|$ of the $i$-th and $j$-th physical data are computed (formula not reproduced);
the gray absolute correlation degree between the physical data is then obtained from these coefficients (formula not reproduced);
finding the values of the gray absolute correlation degree between physical data that are larger than 0.8, regarding the corresponding physical data as strongly correlated, and randomly removing one of the two physical data in each such pair;
4) Marking the unit management database with keywords according to the classification mode;
5) Analyzing the corresponding unit management database according to actual needs to obtain an analysis result.
2. The terminal data acquisition, analysis and management method based on an intelligent library as claimed in claim 1, characterized in that the classification modes of the application scenario include classification by field type, by data structure, by the perspective from which things are described, by data processing perspective, by data granularity and by update mode.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111673102.6A CN116414827A (en) | 2021-12-31 | 2021-12-31 | Terminal data acquisition, analysis and management method based on intelligent library |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111673102.6A CN116414827A (en) | 2021-12-31 | 2021-12-31 | Terminal data acquisition, analysis and management method based on intelligent library |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116414827A true CN116414827A (en) | 2023-07-11 |
Family
ID=87055142
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111673102.6A Withdrawn CN116414827A (en) | 2021-12-31 | 2021-12-31 | Terminal data acquisition, analysis and management method based on intelligent library |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116414827A (en) |
-
2021
- 2021-12-31 CN CN202111673102.6A patent/CN116414827A/en not_active Withdrawn
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Guevara et al. | diverse: an R Package to Analyze Diversity in Complex Systems. | |
CN111831636B (en) | Data processing method, device, computer system and readable storage medium | |
CN108363821A (en) | A kind of information-pushing method, device, terminal device and storage medium | |
CN103605651A (en) | Data processing showing method based on on-line analytical processing (OLAP) multi-dimensional analysis | |
CN113836131B (en) | Big data cleaning method and device, computer equipment and storage medium | |
CN102693299A (en) | System and method for parallel video copy detection | |
CN112307232A (en) | Intelligent classification storage processing method for big data content | |
Caruso et al. | Deprivation and the dimensionality of welfare: a variable‐selection cluster‐analysis approach | |
CN116644184B (en) | Human resource information management system based on data clustering | |
CN111859070A (en) | Mass internet news cleaning system | |
CN114119057A (en) | User portrait model construction system | |
CN116842142B (en) | Intelligent retrieval system for medical instrument | |
CN115204436A (en) | Method, device, equipment and medium for detecting abnormal reasons of business indexes | |
JP3185167B2 (en) | Data processing system | |
CN116862434A (en) | Material data management system and method based on big data | |
CN116414827A (en) | Terminal data acquisition, analysis and management method based on intelligent library | |
CN116340387A (en) | Statistical analysis method and system for personal information disclosure condition of data table | |
CN115034762A (en) | Post recommendation method and device, storage medium, electronic equipment and product | |
CN113779110A (en) | Family relation network extraction method and device, computer equipment and storage medium | |
CN113408207A (en) | Data mining method based on social network analysis technology | |
CN113538011A (en) | Method for associating non-registered contact information with registered user in power system | |
CN111666378A (en) | Chinese yearbook title classification method based on word vectors | |
Kettenring et al. | Cluster analysis applied to the validation of course objectives | |
CN112215627B (en) | Customer information data processing system | |
CN113392203B (en) | Intelligent question-answering method, intelligent question-answering device, electronic equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20230711 |