CN111597399A

CN111597399A - Computer data processing system and method based on data fusion

Info

Publication number: CN111597399A
Application number: CN202010426699.3A
Authority: CN
Inventors: 尹大伟
Original assignee: Laiwu Vocational and Technical College
Current assignee: Laiwu Vocational and Technical College
Priority date: 2020-05-19
Filing date: 2020-05-19
Publication date: 2020-08-28

Abstract

The invention belongs to the technical field of computers and relates in particular to a computer data processing system and method based on data fusion. The system comprises: a data evaluation unit, which evaluates the data to be processed and acquires its data information, the data information comprising at least data size, data type and data structure; a resource allocation unit, which allocates the computer's computing resources for data processing according to a preset resource allocation model, based on the data information acquired by the data evaluation unit; and a data fusion unit, which calls the computing resources allocated by the resource allocation unit, fuses the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data. Fusion processing of computer data is thus realized through data evaluation and resource allocation, with the advantages of high processing efficiency and high resource utilization.

Description

Computer data processing system and method based on data fusion

Technical Field

The invention belongs to the technical field of computers, and particularly relates to a computer data processing system and method based on data fusion.

Background

Data is a representation of facts, concepts or instructions that can be processed by manual or automated means. Once data is interpreted and given meaning, it becomes information. Data processing is the collection, storage, retrieval, processing, transformation and transmission of data.

The basic purpose of data processing is to extract and derive data that is valuable and meaningful to particular people from large amounts of data that may be disordered and hard to understand.

Data processing is a basic link in systems engineering and automatic control, and it runs through every field of social production and social life. The development of data processing technology, and the breadth and depth of its application, have greatly influenced the progress of human society.

Data fusion technology comprises collecting, transmitting, integrating, filtering, correlating and synthesizing the useful information provided by various information sources, so as to assist people in situation and environment assessment, planning, detection, verification and diagnosis. It is extremely important for acquiring useful battlefield information promptly and accurately, for evaluating battlefield conditions, threats and their importance promptly and completely, for supporting tactical and strategic decision making, and for the command and control of combat troops. The future battlefield changes from moment to moment, and the factors that influence decisions are increasingly complex, so a commander must make the most accurate judgment of the battlefield situation in the shortest time and exercise the most effective command and control over the troops. Meeting this series of "most" requirements depends on the most advanced data processing technology as a basic guarantee. Otherwise, even brilliant military leaders and commanders will be overwhelmed by a vast sea of data, and will either misjudge the situation or delay their decisions and miss the opportunity for battle, with disastrous results.

System resources are used to track the running of application programs rather than to run them, much as a highway can be impassable when it carries too many cars, yet does not become drivable simply because a few cars are removed. It can therefore be said with certainty that the performance of a computer system is affected by other factors and not by the amount of available system resources. When the performance of a user's computer system degrades noticeably, the cause should be sought elsewhere rather than immediately suspecting system resources.

From a hardware perspective, too little memory, which forces the system to use virtual memory frequently, is one of the main causes of degraded system performance.

From a software perspective, because Windows is a multitasking operating system, users commonly run several applications at the same time, whether or not they are all needed at that moment. Programmers who write and debug these applications generally consider only operation in a single-task environment and spend little effort on design and debugging for a multitasking environment. As a result, many applications do not cooperate well, and running several of them simultaneously can degrade system performance because they conflict with one another. Imperfections in the Windows 9X multitask management mechanism are, of course, another major cause of this problem.

Patent No. CN201410047608.XA discloses a multi-platform point cloud data fusion method in the field of surveying, mapping and engineering measurement, comprising the following steps. Data acquisition: acquiring original data on the terrain and ground features of a target area with data acquisition equipment carried by a mobile platform and with fixed ground laser scanning equipment. Data preprocessing: applying project organization and management, filtering, denoising and similar preprocessing to the collected original data. Data fusion: analyzing the precision of the filtered and denoised point cloud data, correcting the remaining data against the point cloud data of highest precision, and converting the coordinates of the data acquired by the fixed ground laser scanning equipment, as well as non-field control point coordinates, based on the point cloud data acquired by the mobile platform. Although this method can fuse data from various fields and structures and denoises the data, its data processing is complex and occupies considerable system resources, so resources are wasted when certain low-order data fusion is performed.

Patent No. CN201610191767.6A discloses a processing method, and its application, for data fusion and intelligent search over multiple data sources. The sensors are arranged in a planar layout, located on the same plane to form a sensor network. The data fusion comprises fusion of data from multiple sensors of the same type and fusion of data from different sensors. The data characteristics and data types of the sensors are acquired by polling, the acquisition nodes perform redundancy processing on the data as it is acquired, and an adaptive algorithm based on batch estimation is used. According to that method, the data of each sensor is acquired dynamically, so that the identification time of the system for the data of the sensors is prolonged, the data precision is improved, and the data accuracy is increased. However, that data fusion is specific to sensor systems and does not analyze the data itself before fusing it, so the fusion efficiency is low and the resource occupancy is high.

Disclosure of Invention

In view of the above, the main objective of the present invention is to provide a computer data processing system and method based on data fusion, which perform system resource allocation and data fusion based on data evaluation, and have the advantages of high processing efficiency and high resource utilization rate.

In order to achieve the purpose, the technical scheme of the invention is realized as follows:

A computer data processing system based on data fusion, the system comprising: a data evaluation unit, which evaluates the data to be processed and acquires its data information, the data information comprising at least data size, data type and data structure; a resource allocation unit, which allocates the computer's computing resources for data processing according to a preset resource allocation model, based on the data information acquired by the data evaluation unit; and a data fusion unit, which calls the computing resources allocated by the resource allocation unit, fuses the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data.
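The three units and the data flow between them can be pictured with a minimal Python sketch. Everything here is illustrative and not taken from the patent: the names `DataInfo`, `evaluate`, `allocate`, `fuse` and `process` are hypothetical, the evaluation rule (dominant Python type, nested or flat) is a stand-in for the trained identification subunits, and the fusion step is a placeholder that only concatenates and de-duplicates.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class DataInfo:
    """Data information handed from the evaluation unit to the other units."""
    size: int        # data size, here simply the number of records
    data_type: str   # dominant data type, e.g. "int" or "float"
    structure: str   # coarse data structure label: "flat" or "nested"


def evaluate(records: Sequence[object]) -> DataInfo:
    """Data evaluation unit (stand-in): derive size, dominant type and structure."""
    type_counts = Counter(type(r).__name__ for r in records)
    dominant = type_counts.most_common(1)[0][0] if type_counts else "none"
    nested = any(isinstance(r, (list, tuple, dict)) for r in records)
    return DataInfo(len(records), dominant, "nested" if nested else "flat")


def allocate(info: DataInfo, model: Callable[[DataInfo], float]) -> float:
    """Resource allocation unit: apply a preset model and clamp the share to [0, 1]."""
    return min(1.0, max(0.0, model(info)))


def fuse(sources: Sequence[Sequence[object]]) -> List[object]:
    """Data fusion unit (placeholder): concatenate sources and drop repeated items."""
    fused: List[object] = []
    seen = set()
    for source in sources:
        for item in source:
            key = repr(item)
            if key not in seen:
                seen.add(key)
                fused.append(item)
    return fused


def process(sources: Sequence[Sequence[object]],
            model: Callable[[DataInfo], float]) -> Tuple[DataInfo, float, List[object]]:
    """Run evaluation, allocation and fusion in the order the system describes."""
    flat = [item for source in sources for item in source]
    info = evaluate(flat)
    share = allocate(info, model)  # a real fusion unit would size its workers from this
    return info, share, fuse(sources)
```

For example, `process([[1, 2], [2, 3.5]], lambda info: info.size / 10)` evaluates four records, requests a share of 0.4 and returns the fused list `[1, 2, 3.5]`.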

Further, the data evaluation unit includes a plurality of data identification subunits, each of which is trained on a plurality of dimensions and a plurality of feature spaces. A trained data identification subunit analyzes the data to be processed in its corresponding dimension and feature space to obtain an analysis result. The data evaluation unit further includes an analysis integration unit, which integrates the analysis results of all the data identification subunits to obtain the data size, data type and data structure of the data to be processed.

Further, the dimensions are defined as the features of the data, namely data size, data type and data structure. The training process specifically comprises: the data identification subunits extract data features from pre-collected training data samples in the data size dimension, the data type dimension or the data structure dimension, respectively, and count the number of times the data features conform to each feature space by using the following formula:

where $N$ is the number of times the feature space is conformed to, $S$ is the number of data items, $\lambda_i$ is the weight of the $i$-th training sample, $M$ is the number of features in each feature space, and $\mathrm{count}_j$ is the number of data features of the $i$-th training sample. According to the counted number of times the training samples conform to each feature space, the priorities of the feature spaces are set from high to low in order of decreasing count, which completes the training of the data feature spaces. When data to be processed is evaluated, each data identification subunit performs feature space mapping on the data in its corresponding dimension, counts the mapping results, and takes the most frequent mapping result as the identification result.
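The identification step at the end of this paragraph (map each item to a feature space, count the results, take the most frequent one) can be sketched as follows. This is a minimal sketch under assumptions: the function name `identify`, the representation of a feature space as a name plus a predicate, and the rule that an item is mapped to the first feature space it matches in priority order are all hypothetical choices, not details given in the text.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Optional


def identify(items: Iterable[object],
             feature_spaces: Dict[str, Callable[[object], bool]]) -> Optional[str]:
    """Return the feature space that the items map to most often.

    `feature_spaces` maps a (hypothetical) feature-space name to a predicate.
    The dict order stands for the trained priority, highest first, and each
    item is mapped to the first feature space whose predicate accepts it.
    """
    counts: Counter = Counter()
    for item in items:
        for name, matches in feature_spaces.items():
            if matches(item):
                counts[name] += 1
                break  # one mapping result per item
    # most_common(1) yields the most frequent mapping result, if any item mapped
    return counts.most_common(1)[0][0] if counts else None


# Example feature spaces for the data type dimension (illustrative only).
TYPE_SPACES = {
    "float": lambda v: isinstance(v, float),
    "int": lambda v: isinstance(v, int),
    "text": lambda v: isinstance(v, str),
}
```

With these example spaces, `identify([1.0, 2.5, 3, "a"], TYPE_SPACES)` maps two items to `"float"` and one each to `"int"` and `"text"`, so the identification result is `"float"`.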

Further, in allocating the computer's computing resources for data processing according to a preset resource allocation model, based on the data information acquired by the data evaluation unit, the resource allocation unit performs the following steps: establishing a resource allocation model, which is represented by the following expression:

where $F(x)$ is the percentage of resources allocated, $x$ is the weighted average of the data information (data size, data type and data structure), $\alpha$ is a constant with $\alpha > 3$, and the model further includes a standard average value, which is a set constant. According to the established resource allocation model, the weighted average of the data information is first calculated by the following formula:

$$x = (\text{data size}) \times A + (\text{weight of the data type}) \times B + (\text{weight of the data structure}) \times C$$

where the weight of the data type is defined by presetting a different numerical value for each data type, and the weight of the data structure is defined by presetting a different numerical value for each data structure. The percentage of computer resources to allocate is then calculated with the resource allocation model, and the result is sent to the data fusion unit.
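The weighted average that feeds the allocation model can be illustrated with a short Python sketch. The weight tables, the coefficient values and the function names below are hypothetical. The expression for F(x) itself is not recoverable from this text, so the model is passed in as a caller-supplied function rather than guessed.

```python
from typing import Callable, Mapping

# Hypothetical preset weights: one numerical value per data type / data structure.
TYPE_WEIGHTS: Mapping[str, float] = {"int": 1.0, "float": 2.0, "str": 1.5}
STRUCT_WEIGHTS: Mapping[str, float] = {"flat": 1.0, "nested": 3.0}


def weighted_average(size: float, data_type: str, structure: str,
                     a: float, b: float, c: float) -> float:
    """x = size * A + type weight * B + structure weight * C."""
    return size * a + TYPE_WEIGHTS[data_type] * b + STRUCT_WEIGHTS[structure] * c


def allocation_percentage(x: float, model: Callable[[float], float]) -> float:
    """Apply a caller-supplied model F(x) and clamp the result to 0..100 percent."""
    return min(100.0, max(0.0, model(x)))
```

As a usage example, `weighted_average(1000, "float", "nested", a=0.001, b=0.5, c=0.25)` gives 1.0 + 1.0 + 0.75 = 2.75, and `allocation_percentage(2.75, lambda x: 10 * x)` gives 27.5 for that purely illustrative linear model.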

Further, in calling the computing resources allocated by the resource allocation unit, fusing the data to be processed based on the data information acquired by the data evaluation unit, and storing the fused data, the data fusion unit performs the following steps: calling computer resources according to the calculated percentage; extracting the data space of the data to be processed, and classifying the data to be processed into different target heterogeneous databases according to that data space; normalizing the target heterogeneous databases to obtain a classified target heterogeneous data matrix; and mapping and matching the classified target heterogeneous data matrix against each directional data space group by using the following formula:

where $\mathrm{sim}(d_j, d_k)$ is the mapping-matching result;

$d_j$ is the product target heterogeneous data matrix, $w_{ji}$ is its matrix row value, and $\|d_j\|$ is the value of the corresponding matrix determinant;

$d_k$ is the directional data space group, $w_{ki}$ is its matrix row value, and $\|d_k\|$ is the value of the corresponding matrix determinant. According to the final mapping-matching results, the directional data space group corresponding to the minimum value of $\mathrm{sim}(d_j, d_k)$ is taken as the data space corresponding to the product information, which completes the construction of the data space; chaotic fuzzy matching is then performed on the constructed data space to complete the integration of the different heterogeneous data.
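The matching formula is not reproduced in this text. The symbols that survive (entries $w_{ji}$ and $w_{ki}$, and the magnitudes $\|d_j\|$ and $\|d_k\|$) resemble a cosine-style measure, so the sketch below assumes that form purely for illustration, and it treats the magnitudes as Euclidean norms. Both are assumptions, as are the function names. The selection rule follows the text, which picks the group with the minimum value; with a true cosine similarity the closest match would instead be the maximum, so which reading applies depends on the original formula.

```python
import math
from typing import Dict, Sequence


def sim(d_j: Sequence[float], d_k: Sequence[float]) -> float:
    """Assumed cosine-style measure: sum_i(w_ji * w_ki) / (|d_j| * |d_k|)."""
    dot = sum(w_ji * w_ki for w_ji, w_ki in zip(d_j, d_k))
    norm_j = math.sqrt(sum(w * w for w in d_j))
    norm_k = math.sqrt(sum(w * w for w in d_k))
    if norm_j == 0.0 or norm_k == 0.0:
        raise ValueError("sim is undefined for an all-zero vector")
    return dot / (norm_j * norm_k)


def select_group(d_j: Sequence[float], groups: Dict[str, Sequence[float]]) -> str:
    """Pick the directional data space group with the minimum sim value,
    as the text states."""
    return min(groups, key=lambda name: sim(d_j, groups[name]))
```

For instance, `sim([1, 0], [1, 0])` is 1.0 and `sim([1, 0], [0, 1])` is 0.0, so `select_group([1, 0], {"a": [1, 0], "b": [0, 1]})` returns `"b"` under the minimum rule.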

A computer data processing method based on data fusion, the method performing the following steps: the data evaluation unit evaluates the data to be processed and acquires its data information, the data information comprising at least data size, data type and data structure; the resource allocation unit allocates the computer's computing resources for data processing according to a preset resource allocation model, based on the data information acquired by the data evaluation unit; and the data fusion unit calls the computing resources allocated by the resource allocation unit, fuses the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data.

Further, the dimensions are defined as the features of the data, namely data size, data type and data structure. The training process specifically comprises: the data identification subunits extract data features from pre-collected training data samples in the data size dimension, the data type dimension or the data structure dimension, respectively, and count the number of times the data features conform to each feature space by using the following formula:

where $N$ is the number of times the feature space is conformed to, $S$ is the number of data items, $\lambda_i$ is the weight of the $i$-th training sample, $M$ is the number of features in each feature space, and $\mathrm{count}_j$ is the number of data features of the $i$-th training sample. According to the counted number of times the training samples conform to each feature space, the priorities of the feature spaces are set from high to low in order of decreasing count, which completes the training of the data feature spaces. When data to be processed is evaluated, each data identification subunit performs feature space mapping on the data in its corresponding dimension, counts the mapping results, and takes the most frequent mapping result as the identification result.

The resource allocation model includes a standard average value, which is a set constant. According to the established resource allocation model, the weighted average of the data information is first calculated as: (data size) × A + (weight of the data type) × B + (weight of the data structure) × C, where the weight of the data type is defined by presetting a different numerical value for each data type, and the weight of the data structure is defined by presetting a different numerical value for each data structure. The percentage of computer resources to allocate is then calculated with the resource allocation model, and the result is sent to the data fusion unit.

Furthermore, in calling the computing resources allocated by the resource allocation unit, fusing the data to be processed based on the data information acquired by the data evaluation unit, and storing the fused data, the data fusion unit performs the following steps: calling computer resources according to the calculated percentage; extracting the data space of the data to be processed, and classifying the data to be processed into different target heterogeneous databases according to that data space; normalizing the target heterogeneous databases to obtain a classified target heterogeneous data matrix; and mapping and matching the classified target heterogeneous data matrix against each directional data space group by using the following formula:

where $\mathrm{sim}(d_j, d_k)$ is the mapping-matching result;

$d_k$ is the directional data space group, $w_{ki}$ is its matrix row value, and $\|d_k\|$ is the value of the corresponding matrix determinant. According to the final mapping-matching results, the directional data space group corresponding to the minimum value of $\mathrm{sim}(d_j, d_k)$ is taken as the data space corresponding to the product information, which completes the construction of the data space; chaotic fuzzy matching is then performed on the constructed data space to complete the integration of the different heterogeneous data.

The computer data processing system and method based on data fusion have the following beneficial effects: the invention performs system resource allocation and data fusion on the basis of data evaluation, and has the advantages of high processing efficiency and high resource utilization. These effects arise in two ways. 1. The data size, data type and data structure of the data to be processed are evaluated, so that the basic condition of the data is known as a whole, which facilitates the subsequent data fusion and resource allocation. In practice the data type, data structure and data size are not always specified in advance. Once this information has been obtained, system resources can be allocated more reasonably according to the actual condition of the data to be processed: when the data type is more complex (floating point data, for example), the data structure is more variable and complex, and the data size is larger, more system resources are allocated so that the result is obtained more quickly; when the data structure is simple and the data size is small, fewer resources are allocated, which avoids wasting resources. 2. During data fusion, a heterogeneous data matrix is established and mapped and matched against each directional data space group, so that data from the same source but with different data structures can be fused in a single pass, with higher fusion accuracy and higher fusion efficiency. When occupying the same system resources, the computer data processing system of the invention processes data more conveniently and accurately.

Drawings

FIG. 1 is a system diagram of a computer data processing system based on data fusion according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a computer data processing method based on data fusion according to an embodiment of the present invention;

FIG. 3 is a schematic flow chart of the data fusion performed by the data fusion unit of the computer data processing system and method based on data fusion according to an embodiment of the present invention;

FIG. 4 is a schematic diagram comparing the experimental curve of data fusion efficiency of the computer data processing system and method based on data fusion according to an embodiment of the present invention with that of the prior art;

FIG. 5 is a schematic diagram comparing the experimental curve of resource utilization of the computer data processing system and method based on data fusion according to an embodiment of the present invention with that of the prior art.

In the figures: 1 - experimental curve of the invention; 2 - experimental curve of the prior art.

Detailed Description

The method of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments of the invention.

Example 1

As shown in FIGS. 1, 2 and 3, a computer data processing system based on data fusion comprises: a data evaluation unit, which evaluates the data to be processed and acquires its data information, the data information comprising at least data size, data type and data structure; a resource allocation unit, which allocates the computer's computing resources for data processing according to a preset resource allocation model, based on the data information acquired by the data evaluation unit; and a data fusion unit, which calls the computing resources allocated by the resource allocation unit, fuses the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data.

With this technical scheme, system resource allocation and data fusion are performed on the basis of data evaluation, with the advantages of high processing efficiency and high resource utilization. These effects arise in two ways. 1. The data size, data type and data structure of the data to be processed are evaluated, so that the basic condition of the data is known as a whole, which facilitates the subsequent data fusion and resource allocation. In practice the data type, data structure and data size are not specified in advance. Once this information has been obtained, system resources can be allocated more reasonably according to the actual condition of the data to be processed: when the data type is more complex (floating point data, for example), the data structure is more variable and complex, and the data size is larger, more system resources are allocated so that the result is obtained more quickly; when the data structure is simple and the data size is small, fewer resources are allocated, which avoids wasting resources. 2. During data fusion, a heterogeneous data matrix is established and mapped and matched against each directional data space group, so that data from the same source but with different data structures can be fused in a single pass, with higher fusion accuracy and higher fusion efficiency. When occupying the same system resources, the computer data processing system of the invention processes data more conveniently and accurately.

Example 2

On the basis of the above embodiment, the data evaluation unit includes a plurality of data identification subunits, each of which is trained on a plurality of dimensions and a plurality of feature spaces. A trained data identification subunit analyzes the data to be processed in its corresponding dimension and feature space to obtain an analysis result. The data evaluation unit further includes an analysis integration unit, which integrates the analysis results of all the data identification subunits to obtain the data size, data type and data structure of the data to be processed.

Specifically, from the sensing layer to the application layer of the Internet of Things, the types and quantity of information multiply and the amount of data to be analyzed increases sharply. Data fusion among heterogeneous networks and multiple systems is also involved, so extracting hidden information and valid data from massive data in a timely manner poses a huge challenge to data processing. How to integrate, mine and intelligently process massive data reasonably and effectively is therefore a difficult problem for the Internet of Things. Combining it with distributed computing technologies such as P2P and cloud computing has become one way to solve these problems. Cloud computing provides a new, efficient computing model for the Internet of Things: dynamically scalable, inexpensive computing can be provided over the network on demand, and the data center is relatively reliable and secure. It offers the convenience and low cost of Internet services together with the capacity of a mainframe, makes it easy to share data and applications among different devices, and relieves users of worries such as information leakage and hacker intrusion. Cloud computing is a milestone in the development of information technology; it emphasizes the aggregation, optimization and dynamic allocation of information resources, reduces information costs and greatly improves the efficiency of data centers.

Example 3

On the basis of the above embodiment, the dimensions are defined as the features of the data, namely data size, data type and data structure. The training process specifically comprises: the data identification subunits extract data features from pre-collected training data samples in the data size dimension, the data type dimension or the data structure dimension, respectively, and count the number of times the data features conform to each feature space by using the following formula:

where $N$ is the number of times the feature space is conformed to, $S$ is the number of data items, $\lambda_i$ is the weight of the $i$-th training sample, $M$ is the number of features in each feature space, and $\mathrm{count}_j$ is the number of data features of the $i$-th training sample. According to the counted number of times the training samples conform to each feature space, the priorities of the feature spaces are set from high to low in order of decreasing count, which completes the training of the data feature spaces. When data to be processed is evaluated, each data identification subunit performs feature space mapping on the data in its corresponding dimension, counts the mapping results, and takes the most frequent mapping result as the identification result.
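The training outcome described here, a priority order over feature spaces from the highest count to the lowest, can be sketched in a few lines of Python. The counting formula is not reproduced in this text, so `weighted_count` below is only a hypothetical stand-in that sums the weights of the training samples accepted by a feature space; it is not the patent's formula. The function names are likewise assumptions.

```python
from typing import Callable, List, Mapping, Sequence


def weighted_count(samples: Sequence[object], weights: Sequence[float],
                   matches: Callable[[object], bool]) -> float:
    """Hypothetical stand-in for the count N: total weight of matching samples."""
    return sum(w for sample, w in zip(samples, weights) if matches(sample))


def rank_feature_spaces(counts: Mapping[str, float]) -> List[str]:
    """Order feature spaces from highest to lowest count (priority order).

    sorted() is stable, so feature spaces with equal counts keep their
    original relative order.
    """
    return sorted(counts, key=lambda name: counts[name], reverse=True)
```

For example, `rank_feature_spaces({"int": 3, "float": 7, "str": 1})` yields `["float", "int", "str"]`, which would then be the order in which feature spaces are tried during evaluation.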

With this technical scheme, feature space mapping is performed on the data to be processed in each corresponding dimension, the mapping results are counted, and the most frequent mapping result is taken as the identification result. This improves the efficiency of data evaluation: the same data to be processed may contain several different data structures, among which one is dominant, and identifying and evaluating every data structure individually would be too slow. The invention adopts an identification method based on a piecewise function formula and on feature space mapping, so that evaluation and identification remain efficient while accuracy is ensured.

Example 4

As shown in FIG. 4, on the basis of the previous embodiment, in allocating the computer's computing resources for data processing according to a preset resource allocation model, based on the data information acquired by the data evaluation unit, the resource allocation unit performs the following steps: establishing a resource allocation model, which is represented by the following expression:

Specifically, in computer science, a system resource is any physical or virtual component of a computer system that limits its computing power. Any device connected to a computer system is a resource, such as a keyboard or a screen. Any component within a computer system is a resource, such as the CPU or RAM. Virtual software components in a computer system, including files, network connections and memory blocks, are also resources. Allocating system resources means allocating the software and hardware resources of the computer so that system resources are fully utilized and the system does not deadlock. System resource allocation is divided into four categories, including processor allocation, memory allocation and I/O device allocation.
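For processor allocation, one simple way to turn an allocated resource percentage into something concrete is to convert it into a worker count. The sketch below is an illustration under assumptions, not the patent's mechanism: the function name `workers_for_share` and the rounding rule (round down, but always keep at least one worker) are hypothetical. It relies on `os.cpu_count()`, which may return `None` on some platforms, hence the fallback to 1.

```python
import math
import os
from typing import Optional


def workers_for_share(percent: float, total_cpus: Optional[int] = None) -> int:
    """Convert an allocated percentage (0..100) into a number of worker processes."""
    total = total_cpus or os.cpu_count() or 1   # fall back to a single CPU
    share = min(100.0, max(0.0, percent)) / 100.0
    # Round down, but never return fewer than one worker.
    return max(1, math.floor(total * share))
```

On an 8-CPU machine, `workers_for_share(50, 8)` gives 4 workers, and a request above 100 percent is clamped to all 8.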


Example 5

As shown in FIG. 5, on the basis of the previous embodiment, in calling the computing resources allocated by the resource allocation unit, fusing the data to be processed based on the data information acquired by the data evaluation unit, and storing the fused data, the data fusion unit performs the following steps: calling computer resources according to the calculated percentage; extracting the data space of the data to be processed, and classifying the data to be processed into different target heterogeneous databases according to that data space; normalizing the target heterogeneous databases to obtain a classified target heterogeneous data matrix; and mapping and matching the classified target heterogeneous data matrix against each directional data space group by using the following formula:

where $\mathrm{sim}(d_j, d_k)$ is the mapping-matching result;

$d_j$ is the product target heterogeneous data matrix, $w_{ji}$ is its matrix row value, and $\|d_j\|$ is the value of the corresponding matrix determinant;

$d_k$ is the directional data space group, $w_{ki}$ is its matrix row value, and $\|d_k\|$ is the value of the corresponding matrix determinant. According to the final mapping-matching results, the directional data space group corresponding to the minimum value of $\mathrm{sim}(d_j, d_k)$ is taken as the data space corresponding to the product information, which completes the construction of the data space; chaotic fuzzy matching is then performed on the constructed data space to complete the integration of the different heterogeneous data.
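The two preparatory steps of this embodiment, classifying the data by data space and normalizing the result into a matrix, can be sketched as follows. The text does not say how a data space is extracted or which normalization is used, so both choices here are assumptions: `classify` groups records by a caller-supplied key function, and `normalize_rows` scales each row to unit Euclidean length. The function names are hypothetical.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Sequence


def classify(records: Sequence[object],
             data_space_of: Callable[[object], Hashable]) -> Dict[Hashable, List[object]]:
    """Group records into target collections keyed by their data space."""
    groups: Dict[Hashable, List[object]] = defaultdict(list)
    for record in records:
        groups[data_space_of(record)].append(record)
    return dict(groups)


def normalize_rows(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every row to unit Euclidean length; all-zero rows stay zero."""
    normalized = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        normalized.append([v / norm if norm else 0.0 for v in row])
    return normalized
```

For example, `classify([1, "a", 2], lambda r: type(r).__name__)` returns `{"int": [1, 2], "str": ["a"]}`, and `normalize_rows([[3, 4]])` scales the row to approximately `[0.6, 0.8]`.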

Example 6

Example 7

Example 8

On the basis of the above embodiment, the dimensions are defined as: characteristics of the data, i.e., data size, data type, and data structure; the training process specifically comprises: the data identification subunit extracts data features based on pre-collected training data samples under the data scale dimension, the data type dimension or the data structure dimension, and counts the number of times the data features conform to each feature space by using the following formula:

Example 9

On the basis of the previous embodiment, the method for allocating the computing resources of the computer for data processing by the resource allocation unit according to the preset resource allocation model and based on the data information acquired by the data evaluation unit executes the following steps: establishing a resource allocation model, wherein the resource allocation model is represented by the following general expression:

wherein F(x) is the percentage of allocated resources; x is the data information, namely the weighted average of the data size, data type and data structure; α is a constant, α > 3; and the remaining parameter of the model is a standard average value, which is a set constant. According to the established resource allocation model, the weighted average of the data information is first calculated by the following formula: weighted average = data scale size × A + weight value corresponding to the data type × B + weight value corresponding to the data structure × C; wherein the weight value corresponding to the data type is defined by presetting different numerical values as the weights of different data types, and the weight value corresponding to the data structure is defined by presetting different numerical values as the weights of different data structures. The percentage of the computer resources that should be allocated is then calculated using the resource allocation model, and the calculation result is sent to the data fusion unit.
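The weighted-average step can be sketched as follows. The preset weight tables, the coefficient values passed for A, B and C, and the function names are hypothetical. Because the expression of the resource allocation model is not reproduced in this text, the model is passed in as a callable rather than implemented, and clamping the result to the range 0 to 100 is an added assumption.

```python
from typing import Callable, Dict

# Hypothetical preset weights; the embodiment only says that different
# numerical values are preset for different data types and data structures.
TYPE_WEIGHTS: Dict[str, float] = {"text": 1.0, "image": 2.0, "video": 3.0}
STRUCTURE_WEIGHTS: Dict[str, float] = {
    "structured": 1.0,
    "semi_structured": 2.0,
    "unstructured": 3.0,
}


def weighted_data_information(
    size: float, data_type: str, data_structure: str,
    a: float, b: float, c: float,
) -> float:
    """x = size * A + weight(data type) * B + weight(data structure) * C."""
    return (
        size * a
        + TYPE_WEIGHTS[data_type] * b
        + STRUCTURE_WEIGHTS[data_structure] * c
    )


def allocated_percentage(x: float, model: Callable[[float], float]) -> float:
    """Apply a caller-supplied allocation model and clamp to 0..100.

    The clamp is an assumption of this sketch, not part of the embodiment.
    """
    return max(0.0, min(100.0, model(x)))


if __name__ == "__main__":
    x = weighted_data_information(4.0, "image", "unstructured", 0.5, 0.3, 0.2)
    # Placeholder model for demonstration only; it is NOT the patent's F(x).
    print(allocated_percentage(x, lambda value: 10.0 * value))
```

Keeping the model as a parameter lets the same weighted-average code be reused once the actual expression of F(x) is available.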

Example 10

On the basis of the previous embodiment, the data fusion unit calls the computing resources allocated by the resource allocation unit, performs data fusion on the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data by the following steps: according to the calculated percentage of the computer resources, calling the computer resources, extracting the data space of the data to be processed, and classifying the data to be processed into different target heterogeneous databases according to the data space of the data to be processed; carrying out normalization processing on the target heterogeneous database to obtain a classified target heterogeneous data matrix; and respectively mapping and matching the classification target heterogeneous data matrix with each directional data space group by using the following formula:

wherein sim(d_j, d_k) is the mapping matching result.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiments, and will not be described herein again.

It should be noted that, the system provided in the foregoing embodiment is only illustrated by dividing the functional modules, and in practical applications, the functions may be allocated to different functional modules according to needs, that is, the modules or steps in the embodiment of the present invention are further decomposed or combined, for example, the modules in the foregoing embodiment may be combined into one module, or may be further decomposed into multiple sub-modules, so as to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

Those of skill in the art will appreciate that the various illustrative modules and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the programs corresponding to the software modules and method steps may reside in Random Access Memory (RAM), internal memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. To clearly illustrate this interchangeability of electronic hardware and software, various illustrative components and steps have been described above generally in terms of their functionality. Whether these functions are performed as electronic hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.

The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims

1. A computer data processing system based on data fusion, the system comprising: a data evaluation unit for evaluating the data to be processed and acquiring the data information of the data to be processed, the data information comprising at least: data size, data type and data structure; a resource allocation unit for allocating computing resources of the computer for data processing according to a preset resource allocation model and based on the data information acquired by the data evaluation unit; and a data fusion unit for calling the computing resources allocated by the resource allocation unit, performing data fusion on the data to be processed based on the data information acquired by the data evaluation unit, and storing the fused data.

2. The system of claim 1, wherein the data evaluation unit comprises: a plurality of data identification subunits, each of which is trained based on a plurality of dimensions and a plurality of feature spaces; the trained data identification subunit can analyze the data to be processed under the corresponding dimension and feature space to obtain an analysis result; the data evaluation unit further comprises: an analysis integration unit for integrating the analysis results of all the data identification subunits to obtain the data scale, the data type and the data structure of the data to be processed.

3. The system of claim 2, wherein the dimensions are defined as: characteristics of the data, i.e., data size, data type, and data structure; the training process specifically comprises: the data identification subunit extracts data features based on training data samples collected in advance under the data scale dimension, the data type dimension or the data structure dimension, and counts the number of times the data features conform to each feature space by using the following formula:

4. The system of claim 3, wherein the resource allocation unit allocates computing resources of the computer for data processing according to a preset resource allocation model and based on the data information acquired by the data evaluation unit by performing the following steps: establishing a resource allocation model, wherein the resource allocation model is represented by the following formula:

wherein F(x) is the percentage of allocated resources; x is the data information, namely the weighted average of the data size, data type and data structure; α is a constant, α > 3; and the remaining parameter of the model is a standard average value, which is a set constant; according to the established resource allocation model, the weighted average of the data information is first calculated by the following formula: weighted average = data scale size × A + weight value corresponding to the data type × B + weight value corresponding to the data structure × C; wherein the weight value corresponding to the data type is defined by presetting different numerical values as the weights of different data types; the weight value corresponding to the data structure is defined by presetting different numerical values as the weights of different data structures; and the percentage of the computer resources that should be allocated is then calculated using the resource allocation model, and the calculation result is sent to the data fusion unit.

5. The system of claim 4, wherein the data fusion unit calls the computing resources allocated by the resource allocation unit, performs data fusion on the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data by performing the following steps: according to the calculated percentage of the computer resources, calling the computer resources, extracting the data space of the data to be processed, and classifying the data to be processed into different target heterogeneous databases according to the data space of the data to be processed; carrying out normalization processing on the target heterogeneous database to obtain a classified target heterogeneous data matrix; and, using the following formula, respectively mapping and matching the classified target heterogeneous data matrix with each directional data space group:

wherein sim(d_j, d_k) is the mapping matching result.

6. A computer data processing method based on data fusion, using the system of any one of claims 1 to 5, wherein the method performs the following steps: the data evaluation unit evaluates the data to be processed and acquires the data information of the data to be processed, the data information comprising at least: data size, data type and data structure; the resource allocation unit allocates computing resources of the computer for data processing according to a preset resource allocation model and based on the data information acquired by the data evaluation unit; and the data fusion unit calls the computing resources allocated by the resource allocation unit, performs data fusion on the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data.

7. The method of claim 6, wherein the data evaluation unit comprises: a plurality of data identification subunits, each of which is trained based on a plurality of dimensions and a plurality of feature spaces; the trained data identification subunit can analyze the data to be processed under the corresponding dimension and feature space to obtain an analysis result; the data evaluation unit further comprises: an analysis integration unit for integrating the analysis results of all the data identification subunits to obtain the data scale, the data type and the data structure of the data to be processed.

8. The method of claim 7, wherein the dimensions are defined as: characteristics of the data, i.e., data size, data type, and data structure; the training process specifically comprises: the data identification subunit extracts data features based on training data samples collected in advance under the data scale dimension, the data type dimension or the data structure dimension, and counts the number of times the data features conform to each feature space by using the following formula:

9. The method of claim 8, wherein the resource allocation unit allocates computing resources of the computer for data processing according to a preset resource allocation model and based on the data information acquired by the data evaluation unit by performing the following steps: establishing a resource allocation model, wherein the resource allocation model is represented by the following formula:

wherein F(x) is the percentage of allocated resources; x is the data information, namely the weighted average of the data size, data type and data structure; and α is a constant, α > 3.

10. The method according to claim 9, wherein the data fusion unit calls the computing resources allocated by the resource allocation unit, performs data fusion on the data to be processed based on the data information acquired by the data evaluation unit, and stores the fused data by performing the following steps: according to the calculated percentage of the computer resources, calling the computer resources, extracting the data space of the data to be processed, and classifying the data to be processed into different target heterogeneous databases according to the data space of the data to be processed; carrying out normalization processing on the target heterogeneous database to obtain a classified target heterogeneous data matrix; and, using the following formula, respectively mapping and matching the classified target heterogeneous data matrix with each directional data space group:

wherein sim(d_j, d_k) is the mapping matching result.