CN116662839A - Associated big data cluster analysis method and device based on multidimensional intelligent acquisition - Google Patents

Info

Publication number
CN116662839A
Authority
CN
China
Prior art keywords
data
attribute
covariance
projection
carrying
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310516513.7A
Other languages
Chinese (zh)
Inventor
张煇
刘俊龙
杨勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanxi Changhe Technology Co ltd
Original Assignee
Shanxi Changhe Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi Changhe Technology Co ltd filed Critical Shanxi Changhe Technology Co ltd
Priority to CN202310516513.7A priority Critical patent/CN116662839A/en
Publication of CN116662839A publication Critical patent/CN116662839A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G06F18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data cluster analysis and discloses an associated big data cluster analysis method and device based on multidimensional intelligent acquisition. The method comprises: performing attribute analysis on target data to obtain data attributes; performing linear transformation on the data attributes to obtain attribute linear values, performing normal distribution processing on each attribute in the data attributes to obtain an attribute normal distribution map, and calculating the probability density of each graph in the attribute normal distribution map; determining the expected value corresponding to the data attributes, constructing a covariance matrix of the attribute covariance, and determining the covariance structure of the target data; performing sharpening noise reduction processing on the target data to obtain noise reduction data, performing feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, performing sharpening projection on the feature data to obtain a data projection, and performing cluster analysis on the feature data to obtain a clustering result of the associated big data. The invention aims to improve the accuracy of cluster analysis of associated big data acquired through multidimensional intelligent acquisition.

Description

Associated big data cluster analysis method and device based on multidimensional intelligent acquisition
Technical Field
The invention relates to the technical field of data cluster analysis, and in particular to an associated big data cluster analysis method and device based on multidimensional intelligent acquisition.
Background
At present, the rapid development of cloud computing, intelligent technology and sensing technology is driving explosive growth in data, and the processing and analysis of data has become an important factor in today's society. In the big data era, large amounts of data are generated every day in research fields of different dimensions, such as government services, biology, medicine and celestial body research. Because the data dimensions are so diverse, mining the potential information in the data is particularly important, and the main method for mining data is cluster analysis.
Existing cluster analysis methods mainly calculate the relevance between data by combining the semantics of the data with data keywords, and then cluster the data according to that relevance. However, uncertain data can be generated at each stage of data acquisition, transmission and processing, and the potential relationships within the data are not mined and analyzed, which reduces the accuracy of cluster analysis. A method is therefore needed that can improve the accuracy of cluster analysis of associated big data acquired through multidimensional intelligent acquisition.
Disclosure of Invention
The invention provides an associated big data cluster analysis method and device based on multidimensional intelligent acquisition, whose main purpose is to improve the accuracy of cluster analysis of associated big data acquired through multidimensional intelligent acquisition.
In order to achieve the above purpose, the associated big data cluster analysis method based on multidimensional intelligent acquisition provided by the invention comprises the following steps:
acquiring associated big data to be analyzed, performing data filtering on the associated big data to obtain target data, and performing attribute analysis on the target data to obtain data attributes;
performing linear transformation on the data attributes to obtain attribute linear values, performing normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution map, and calculating the probability density of each graph in the attribute normal distribution map through the following formula;
wherein F represents the probability density of each graph in the attribute normal distribution map, beta represents the attribute mean of the data attributes, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution map, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution map;
determining expected values corresponding to the data attributes according to the probability density, calculating covariance among each attribute in the data attributes according to the expected values to obtain attribute covariance, constructing a covariance matrix of the attribute covariance, and determining a covariance structure of the target data according to the covariance matrix;
carrying out sharpening noise reduction processing on the target data according to the covariance structure to obtain noise reduction data, carrying out feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, carrying out sharpening projection on the feature data by combining the conversion matrix to obtain data projection, and carrying out cluster analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
Optionally, the performing data filtering on the associated big data to obtain target data includes:
carrying out standardization processing on the associated big data to obtain standardized data;
vectorizing the standardized data to obtain standardized vectors, and calculating cosine values of included angles among the standardized vectors;
and performing de-duplication processing on the standardized data according to the cosine value of the included angle to obtain target data.
Optionally, the performing attribute analysis on the target data to obtain a data attribute includes:
extracting a data tag corresponding to each data in the target data, and calculating the weight of each tag in the data tag through the following formula to obtain a tag weight;
wherein D_i represents the tag weight of each tag in the data tags, B_i represents the tag vector of the i-th tag in the data tags, the covariance term represents the vector covariance corresponding to the tag vector of the i-th tag in the data tags, and trace() represents a spatial filtering function;
and extracting the characteristic labels in the data labels according to the label weights, and carrying out attribute analysis on the characteristic labels to obtain data attributes.
Optionally, calculating the covariance of each attribute in the data attributes according to the expected value to obtain an attribute covariance, including:
the covariance between each of the data attributes is calculated by the following formula:
Cov(m,m+1)=E[m,m+1]-E[m]E[m+1]
wherein Cov(m, m+1) represents the covariance between each attribute in the data attributes, m and m+1 represent the sequence numbers of the data attributes, E[m] represents the expected value corresponding to the m-th data attribute, and E[m+1] represents the expected value corresponding to the (m+1)-th data attribute.
Optionally, the sharpening noise reduction processing is performed on the target data according to the covariance structure to obtain noise reduction data, including:
according to the covariance structure, carrying out feature decomposition on the covariance matrix to obtain matrix features;
calculating the feature weight of each feature in the matrix features, and screening the target data according to the feature weights to obtain screening data;
performing dimension reduction processing on the screening data to obtain dimension reduction data;
and carrying out sharpening processing on the data with reduced dimension to obtain sharpened data, and carrying out noise reduction processing on the sharpened data to obtain noise reduction data.
Optionally, the performing feature decomposition on the covariance matrix according to the covariance structure to obtain a matrix feature includes:
performing feature decomposition on the covariance matrix through the following formula:
wherein G represents the matrix features of the covariance matrix, Cov_z represents the covariance structure, Q represents the orthonormal matrix, Σ is the diagonal matrix corresponding to the covariance matrix, and Q^-1 is the inverse of the orthonormal matrix.
Optionally, the constructing a transformation matrix of the feature data includes:
calculating the characteristic value of each data in the characteristic data, and sequencing the characteristic values to obtain sequenced characteristic values;
counting the number of the characteristic values to obtain the characteristic number, and identifying the data dimension of the characteristic data;
setting a retention coefficient of the sorting characteristic values according to the characteristic quantity and the data dimension;
filtering the sorting characteristic values according to the retention coefficient to obtain target characteristic values;
and carrying out vector conversion on the characteristic data corresponding to the target characteristic value to obtain a characteristic vector, and constructing a conversion matrix of the characteristic data according to the target characteristic value and the characteristic vector.
Optionally, the performing cluster analysis on the feature data according to the sharpened projection to obtain a cluster result of the associated big data includes:
obtaining the coordinates of each projection in the sharpened projections to obtain projection coordinates, and calculating the projection similarity of each projection according to the projection coordinates;
determining potential association degrees of the characteristic data according to the projection similarity;
and carrying out cluster analysis on the characteristic data according to the potential association degree to obtain a cluster result of the association big data.
Optionally, the calculating the projection similarity of each projection according to the projection coordinates includes:
calculating the projection similarity of each projection by the following formula:
wherein S represents the projection similarity of each projection, k represents the distance parameter, l and l+1 represent the sequence numbers of the projections, w represents the total number of projections, X_l and Y_l represent the projection coordinates of the l-th projection, and X_(l+1) and Y_(l+1) represent the projection coordinates of the (l+1)-th projection.
An associated big data cluster analysis device based on multidimensional intelligent acquisition, characterized in that the device comprises:
the attribute analysis module is used for acquiring associated big data to be analyzed, carrying out data filtering on the associated big data to obtain target data, and carrying out attribute analysis on the target data to obtain data attributes;
The matrix construction module is used for carrying out linear transformation on the data attributes to obtain attribute linear values, carrying out normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution diagram, and calculating the probability density of each graph in the attribute normal distribution diagram through the following formula;
wherein F represents the probability density of each graph in the attribute normal distribution map, beta represents the attribute mean of the data attributes, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution map, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution map;
the feature extraction module is used for determining expected values corresponding to the data attributes according to the probability density, calculating covariance among each attribute in the data attributes according to the expected values to obtain attribute covariance, constructing a covariance matrix of the attribute covariance, and determining a covariance structure of the target data according to the covariance matrix;
and the cluster analysis module is used for carrying out sharpening noise reduction processing on the target data according to the covariance structure to obtain noise reduction data, carrying out feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, carrying out sharpening projection on the feature data by combining the conversion matrix to obtain data projection, carrying out cluster analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
According to the method, acquiring the associated big data to be analyzed and performing data filtering on it removes unimportant or incomplete data from the associated big data, which makes subsequent processing of the data more convenient. Therefore, the associated big data cluster analysis method and device based on multidimensional intelligent acquisition can improve the accuracy of cluster analysis of associated big data acquired through multidimensional intelligent acquisition.
Drawings
FIG. 1 is a schematic flow chart of an associated big data cluster analysis method based on multidimensional intelligent acquisition according to an embodiment of the present invention;
FIG. 2 is a functional block diagram of an associated big data cluster analysis device based on multidimensional intelligent acquisition according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an electronic device for implementing the associated big data cluster analysis method based on multidimensional intelligent acquisition according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The embodiment of the application provides an associated big data cluster analysis method based on multidimensional intelligent acquisition. In the embodiment of the application, the execution subject of the method includes at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided by the embodiment of the application. In other words, the method can be executed by software or hardware installed in a terminal device or a server device, wherein the software can be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms.
Referring to FIG. 1, a flow chart of an associated big data cluster analysis method based on multidimensional intelligent acquisition according to an embodiment of the present invention is shown. In this embodiment, the associated big data cluster analysis method based on multidimensional intelligent acquisition includes steps S1-S4.
S1, acquiring associated big data to be analyzed, performing data filtering on the associated big data to obtain target data, and performing attribute analysis on the target data to obtain data attributes.
By acquiring the associated big data to be analyzed and performing data filtering on it, the invention removes unimportant or incomplete data from the associated big data, which makes subsequent processing of the data more convenient. The associated big data is data with a certain degree of correlation. For example, social management in government services covers many areas, such as public security, traffic and institutions, and generates a large amount of data. The associated data can be retrieved and used across these areas, which facilitates better management of the data. For example, in public security management, information about a vehicle held in traffic management can be retrieved to facilitate statistics on that vehicle, which improves the efficiency of data processing. The target data is the data obtained after the associated big data has been processed by filtering, deletion and the like. Further, the associated big data to be analyzed can be acquired by a data collector, and the data collector is implemented by a script.
According to the invention, the invalid data in the associated big data can be removed by carrying out data filtering on the associated big data so as to improve the efficiency of subsequent data processing, wherein the target data is the data obtained by removing the invalid data in the associated big data.
As an embodiment of the present invention, the performing data filtering on the associated big data to obtain target data includes: and carrying out standardization processing on the associated big data to obtain standardized data, carrying out vectorization operation on the standardized data to obtain standardized vectors, calculating an included angle cosine value between the standardized vectors, and carrying out de-duplication processing on the standardized data according to the included angle cosine value to obtain target data.
The standardized data is the data obtained after the format of the associated big data has been unified, the standardized vector is the vector representation corresponding to the standardized data, and the included angle cosine value is the cosine of the included angle between the standardized vectors; the closer the included angle is to zero, the more similar the two vectors are.
Furthermore, the standardization of the associated big data can be implemented by a standard deviation normalization method, the vectorization of the standardized data can be implemented by the word2vec algorithm, the cosine value of the included angle between the standardized vectors can be calculated by a cosine function, and the de-duplication of the standardized data can be implemented by a de-duplication tool written in a scripting language.
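The filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the records are assumed to be numeric vectors already (the word2vec step is omitted), and the function names `zscore_columns`, `cosine` and `deduplicate` as well as the 0.999 threshold are hypothetical.

```python
import math

def zscore_columns(rows):
    """Standard-deviation (z-score) normalisation applied column by column."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def cosine(u, v):
    """Cosine of the included angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def deduplicate(vectors, threshold=0.999):
    """Keep a vector only if it is not near-parallel to a vector already kept."""
    kept = []
    for v in vectors:
        if all(cosine(v, k) < threshold for k in kept):
            kept.append(v)
    return kept

# Hypothetical records: the second row duplicates the first.
raw = [[1.0, 10.0], [1.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
target = deduplicate(zscore_columns(raw))
```

Here the duplicate second record is dropped because its cosine with the first record is approximately 1. The threshold controls how close to parallel two vectors must be before one of them is treated as a duplicate.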
According to the method, the relevant attribute information of each data in the target data can be known by carrying out attribute analysis on the target data, so that the knowledge of the target data is increased, and a premise is provided for the follow-up calculation of the attribute covariance, wherein the data attribute is the property corresponding to each data in the target data.
As an embodiment of the present invention, the performing attribute analysis on the target data to obtain a data attribute includes: extracting a data tag corresponding to each data in the target data, calculating the weight of each tag in the data tag to obtain a tag weight, extracting a characteristic tag in the data tag according to the tag weight, and carrying out attribute analysis on the characteristic tag to obtain a data attribute.
The data labels are information such as identifiers or marks corresponding to each piece of data in the target data, the label weight represents the importance degree of each label in the data labels, and the characteristic labels are representative labels in the data labels.
Further, the data tag corresponding to each data in the target data can be extracted by a tag extractor written in Java, the feature tag can be obtained by using an extraction function, such as the LEFT function, to extract the tag with the largest tag weight, and the attribute analysis of the feature tag can be implemented by an attribute analysis method, such as funnel analysis.
Further, as an optional embodiment of the present invention, the calculating the weight of each tag in the data tag to obtain a tag weight includes:
the weight of each of the data tags is calculated by the following formula:
wherein D_i represents the tag weight of each tag in the data tags, B_i represents the tag vector of the i-th tag in the data tags, the covariance term represents the vector covariance corresponding to the tag vector of the i-th tag in the data tags, and trace() represents a spatial filtering function.
S2, carrying out linear transformation on the data attributes to obtain attribute linear values, carrying out normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution diagram, and calculating the probability density of each graph in the attribute normal distribution diagram.
By performing linear transformation on the data attributes, the invention obtains the linear values of the data attributes and provides a basis for the subsequent normal distribution processing. The attribute linear value is the numerical representation corresponding to the data attribute, the attribute normal distribution map is the variable frequency distribution map corresponding to the data attribute, the probability density is the area of each graph in the normal distribution map, that is, the occurrence probability corresponding to each attribute, and the expected value is the average of the output values corresponding to the data attribute.
Further, as an alternative embodiment of the present invention, the linear transformation of the data attributes may be implemented by a linear function, such as a first-degree function, the normal distribution processing of each of the data attributes may be implemented by a Gaussian function, and the expected value may be obtained by calculating an integral of the probability density.
As one embodiment of the present invention, the calculating the probability density of each graph in the attribute normal distribution map includes:
the probability density of each graph in the attribute normal distribution graph is calculated by the following formula:
wherein F represents the probability density of each graph in the attribute normal distribution map, beta represents the attribute mean of the data attributes, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution map, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution map.
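As an illustration of the density calculation, the sketch below assumes the usual Gaussian probability density, with beta as the mean and C_j playing the role of the standard deviation. That reading of C_j is an assumption, since the text defines it only as a graph parameter; the function name `normal_density` is hypothetical.

```python
import math

def normal_density(b_j, beta, c_j):
    """Gaussian probability density evaluated at b_j.

    Assumption: beta is the mean and c_j is the standard deviation.
    """
    if c_j <= 0:
        raise ValueError("c_j must be positive")
    z = (b_j - beta) / c_j
    return math.exp(-0.5 * z * z) / (c_j * math.sqrt(2.0 * math.pi))

# Density at the mean of a distribution with mean 0 and deviation 1.
peak = normal_density(0.0, 0.0, 1.0)
```

Under this assumption the density is highest at the attribute mean and falls off symmetrically on both sides.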
S3, determining expected values corresponding to the data attributes according to the probability density, calculating covariance among each attribute in the data attributes according to the expected values to obtain attribute covariance, constructing a covariance matrix of the attribute covariance, and determining a covariance structure of the target data according to the covariance matrix.
The expected value corresponding to the data attribute is determined according to the probability density so that the gap corresponding to the data attribute can be understood. The expected value is the average of the output values corresponding to the data attribute; further, it can be determined by calculating the mean of the data attribute.
By calculating the covariance between each attribute in the data attributes according to the expected value, the invention reveals the degree of difference between the data attributes and provides a basis for the subsequent construction of the covariance matrix of the attribute covariance. The attribute covariance is the overall error between each attribute in the data attributes: a positive attribute covariance indicates a larger error in the target data, and a negative attribute covariance indicates a smaller error in the target data.
As an embodiment of the present invention, the calculating the covariance between each attribute in the data attributes according to the expected value, to obtain an attribute covariance includes:
the covariance between each of the data attributes is calculated by the following formula:
Cov(m,m+1)=E[m,m+1]-E[m]E[m+1]
wherein Cov(m, m+1) represents the covariance between each attribute in the data attributes, m and m+1 represent the sequence numbers of the data attributes, E[m] represents the expected value corresponding to the m-th data attribute, and E[m+1] represents the expected value corresponding to the (m+1)-th data attribute.
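The covariance formula above can be sketched in a few lines. This is an illustrative sketch only: it estimates each expected value as the arithmetic mean of the samples, reads E[m, m+1] as the expected value of the product of the two attributes, and computes every attribute pair rather than only adjacent ones. The names `expectation`, `covariance` and `covariance_matrix` and the sample values are hypothetical.

```python
def expectation(values):
    """Expected value estimated as the arithmetic mean of the samples."""
    return sum(values) / len(values)

def covariance(x, y):
    """Cov(x, y) = E[x*y] - E[x]*E[y] (population form)."""
    if len(x) != len(y):
        raise ValueError("attributes must have the same number of samples")
    products = [a * b for a, b in zip(x, y)]
    return expectation(products) - expectation(x) * expectation(y)

def covariance_matrix(attributes):
    """Matrix whose entry (m, n) is the covariance of attribute m with attribute n."""
    return [[covariance(a, b) for b in attributes] for a in attributes]

# Three hypothetical attributes observed on four records.
attrs = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # moves with the first attribute
    [4.0, 3.0, 2.0, 1.0],   # moves against the first attribute
]
cov = covariance_matrix(attrs)
```

The diagonal of the matrix holds each attribute's variance, and the sign of an off-diagonal entry shows whether two attributes move together or in opposite directions.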
By constructing the covariance matrix of the attribute covariance, the invention reveals the structure of the covariance matrix and the kind of differences that exist within the target data. Further, the covariance matrix of the attribute covariance can be constructed by a matrix function written in a programming language, and the covariance structure of the target data can be determined from the structure type of the covariance matrix.
And S4, carrying out sharpening noise reduction treatment on the target data according to the covariance structure to obtain noise reduction data, carrying out feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, carrying out sharpening projection on the feature data in combination with the conversion matrix to obtain data projection, and carrying out clustering analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
By performing sharpening noise reduction processing on the target data according to the covariance structure, the invention removes uncertain data and redundant data from the target data, which improves the accuracy of the subsequent feature extraction on the noise reduction data. The noise reduction data is the data obtained by removing the redundant data and the uncertain data from the target data.
As an embodiment of the present invention, the sharpening noise reduction processing is performed on the target data according to the covariance structure, to obtain noise reduction data, including: according to the covariance structure, carrying out feature decomposition on the covariance matrix to obtain matrix features, calculating feature weights of each feature in the matrix features, screening the target data according to the feature weights to obtain screening data, carrying out dimension reduction processing on the screening data to obtain dimension reduction data, carrying out sharpening processing on the dimension reduction data to obtain sharpening data, and carrying out noise reduction processing on the sharpening data to obtain noise reduction data.
The matrix features are features corresponding to the covariance matrix, the feature weights represent importance of the matrix features, the screening data are data obtained after the target data are screened according to the size of the matrix feature values, the dimension reduction data are data obtained after the target data are reduced from high dimension to low dimension, and the sharpening data are data obtained after the definition of the dimension reduction data is improved.
Further, as an optional embodiment of the present invention, the feature weight of each feature in the matrix feature may be implemented by an analytic hierarchy process, the screening of the target data may be implemented by a screening function, such as a look up function, the dimension reduction processing of the screening data may be implemented by a naive bayes method, the sharpening processing of the dimension reduction data may be implemented by a sharpening tool, the sharpening tool is compiled by a scripting language, and the noise reduction processing of the sharpening data may be implemented by a mean filtering method.
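The mean filtering step mentioned above can be illustrated with a minimal sketch. It assumes the sharpened data can be treated as a one-dimensional numeric sequence; the function name `mean_filter` and the `window` parameter are hypothetical and do not come from the patent.

```python
def mean_filter(values, window=3):
    """Smooth a 1-D sequence with a centered moving average.

    `window` is a hypothetical parameter (positive odd integer). At the
    edges the average is taken over the samples that exist, without padding.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        neighbourhood = values[lo:hi]
        smoothed.append(sum(neighbourhood) / len(neighbourhood))
    return smoothed
```

A single outlier is spread over its neighbours and thereby damped, which is the usual reason for choosing a mean filter; a larger window smooths more strongly at the cost of detail.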
Further, as an optional embodiment of the present invention, the performing feature decomposition on the covariance matrix according to the covariance structure to obtain matrix features includes:
performing feature decomposition on the covariance matrix through the following formula:
wherein G represents the matrix characteristics of the covariance matrix, Cov_z represents the covariance structure, Q represents the orthonormal matrix, and Σ() is the diagonal matrix corresponding to the covariance matrix.
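As an illustration of the decomposition step, the sketch below reads "feature decomposition" as the standard eigendecomposition of a symmetric covariance matrix, Cov = Q Σ Q^T with Q orthonormal and Σ diagonal. That reading is an assumption. The function names are hypothetical, and the variance-share weighting in `feature_weights` is a simple stand-in for the analytic hierarchy process named in the text, not the patent's own weighting.

```python
import numpy as np

def decompose_covariance(cov):
    """Return (eigenvalues, Q), sorted so the largest eigenvalue comes first.

    Assumes `cov` is symmetric, so that cov == Q @ diag(eigenvalues) @ Q.T.
    """
    cov = np.asarray(cov, dtype=float)
    # eigh is intended for symmetric matrices; it returns ascending eigenvalues
    eigenvalues, q = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], q[:, order]

def feature_weights(eigenvalues):
    """Assumed weighting: each eigenvalue's share of the total variance."""
    return eigenvalues / eigenvalues.sum()
```

Under this reading, components with small weights contribute little variance and are natural candidates for removal during the screening step.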
The invention can obtain the characteristic part in the noise reduction data by carrying out characteristic extraction on the noise reduction data, and provides guarantee for the subsequent construction of a conversion matrix, wherein the characteristic data is representative data in the noise reduction data, and further, the characteristic extraction of the noise reduction data can be realized by a principal component analysis method.
According to the invention, the conversion matrix of the characteristic data is constructed so as to facilitate the subsequent sharpening projection processing of the characteristic data through the conversion matrix, so that the accuracy of the subsequent data clustering analysis is improved, wherein the conversion matrix is a square matrix for converting the characteristic data.
As an embodiment of the present invention, the constructing a transformation matrix of the feature data includes: calculating the characteristic value of each data in the characteristic data, sorting the characteristic values to obtain sorted characteristic values, counting the number of the characteristic values to obtain characteristic numbers, identifying the data dimension of the characteristic data, setting a retention coefficient of the sorted characteristic values according to the characteristic numbers and the data dimension, filtering the sorted characteristic values according to the retention coefficient to obtain a target characteristic value, carrying out vector conversion on the characteristic data corresponding to the target characteristic value to obtain a characteristic vector, and constructing a conversion matrix of the characteristic data according to the target characteristic value and the characteristic vector.
The feature value is a feature score value corresponding to each data in the feature data, the sorting feature value is obtained after sorting according to the numerical value of the feature value, the feature quantity is the total number of the feature values, the data dimension is a space dimension of the feature data, such as a two-dimensional space, a three-dimensional space and the like, the retention coefficient is a proportion of retention of the sorting feature value, the target feature value is obtained after filtering the sorting feature value according to the retention coefficient, and the feature vector is a vector expression form corresponding to the feature data.
Further, calculating the characteristic value of each data in the characteristic data can be achieved through a characteristic value calculator, sorting of the characteristic values can be achieved through a bubble sort algorithm, counting the number of the characteristic values can be achieved through a moving weighted average method, identification of the data dimension of the characteristic data can be achieved through key values of the characteristic data, filtering of the sorted characteristic values can be achieved through a Bloom filter, vector conversion of the characteristic data corresponding to the target characteristic values can be achieved through the Word2vec algorithm, and construction of the conversion matrix of the characteristic data can be achieved through a matrix function.
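The construction steps above (compute characteristic values, sort them, keep a fraction given by a retention coefficient, and assemble a matrix from the corresponding vectors) resemble the usual principal component recipe. The following is a minimal sketch under that assumption; it treats the characteristic values as eigenvalues of the feature covariance, and `build_conversion_matrix`, `project` and the `retention` parameter are hypothetical names. The resulting matrix has one column per retained eigenvalue, so it is square only when `retention` is 1.

```python
import math
import numpy as np

def build_conversion_matrix(features, retention=0.5):
    """Return (target eigenvalues, conversion matrix) for the given features.

    features: array-like of shape (n_samples, n_dims).
    retention: assumed retention coefficient in (0, 1], the fraction of the
    sorted eigenvalues that is kept.
    """
    x = np.asarray(features, dtype=float)
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]              # sort descending
    keep = max(1, math.ceil(retention * len(order)))   # number retained
    selected = order[:keep]
    return eigenvalues[selected], eigenvectors[:, selected]

def project(features, conversion):
    """Project mean-centered features onto the retained directions."""
    x = np.asarray(features, dtype=float)
    return (x - x.mean(axis=0)) @ conversion
```

Calling `project(features, conversion)` yields low-dimensional coordinates for each sample, which could then serve as the projection coordinates used in the later clustering step.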
The method and the device can project the characteristic data into the coordinates by sharpening projection, so that cluster analysis is conveniently carried out on the sharpened data, wherein the data projection is an image obtained after the characteristic data are projected to the corresponding coordinates, and further, the sharpening projection of the characteristic data can be realized through a projection tool, such as a data projector.
According to the data projection, the clustering analysis is carried out so that the sharpened data can be subjected to the aggregation classification, and the classification of the data with different dimensions is finished, wherein the clustering result is obtained after the clustering analysis of the sharpened data.
As an embodiment of the present invention, the performing cluster analysis on the feature data according to the sharpened projection to obtain a cluster result of the associated big data includes: obtaining coordinates of each projection in the sharpened projections to obtain projection coordinates, calculating projection similarity of each projection according to the projection coordinates, determining potential association degree of the feature data according to the projection similarity, and carrying out cluster analysis on the feature data according to the potential association degree to obtain a clustering result of the associated big data.
The projection coordinates are the point coordinates of each projection in the sharpened projections; the projection similarity is the degree of similarity between projections and represents the correlation between the sharpened projections; and the potential association degree is the hidden correlation between the characteristic data, which is not readily apparent from the surface of the data.
Further, as an alternative embodiment of the present invention, the obtaining the coordinates of each projection in the sharpened projections may be implemented by a coordinate identifier, where the coordinate identifier is compiled by C language, and the cluster analysis of the feature data may be implemented by a K-means algorithm.
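A minimal sketch of the K-means step follows, assuming the projection coordinates are available as two-dimensional points. It is a plain Lloyd-style iteration written for illustration; the function name and the `iterations` and `seed` parameters are hypothetical, and the patent's similarity-based association degree is not modeled here (plain Euclidean distance is used instead).

```python
import numpy as np

def kmeans(points, k, iterations=100, seed=0):
    """Cluster points into k groups; returns (labels, centers)."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # start from k distinct data points chosen at random
    centers = pts[rng.choice(len(pts), size=k, replace=False)]
    labels = np.zeros(len(pts), dtype=int)
    for _ in range(iterations):
        # Euclidean distance of every point to every center, shape (n, k)
        dists = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each center; keep the old one if a cluster became empty
        new_centers = np.array([
            pts[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

In practice a library implementation with better initialization would usually be preferred; the sketch only shows how projected coordinates feed into an assign-and-update loop.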
Further, as an optional embodiment of the present invention, the calculating the projection similarity of each projection according to the projection coordinates includes:
calculating the projection similarity of each projection by the following formula:
wherein S represents the projection similarity of each projection, k represents the distance parameter, l and l+1 represent the sequence numbers of the projections, w represents the total number of projections, X_l and Y_l represent the projection coordinates of the l-th projection, and X_(l+1) and Y_(l+1) represent the projection coordinates of the (l+1)-th projection.
According to the method, the associated big data to be analyzed is acquired and filtered, so that unimportant or incomplete data in the associated big data can be removed, which facilitates subsequent processing of the data. Therefore, the associated big data cluster analysis method based on multidimensional intelligent acquisition can improve the accuracy of cluster analysis of associated big data obtained by multidimensional intelligent acquisition.
Fig. 2 is a functional block diagram of an associated big data cluster analysis device based on multidimensional intelligent acquisition according to an embodiment of the present invention.
The associated big data cluster analysis device 100 based on multidimensional intelligent acquisition can be installed in electronic equipment. Depending on the implementation function, the associated big data cluster analysis device 100 based on multidimensional intelligent collection may include an attribute analysis module 101, a matrix construction module 102, a feature extraction module 103 and a cluster analysis module 104. The module of the invention, which may also be referred to as a unit, refers to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the attribute analysis module 101 is configured to obtain associated big data to be analyzed, perform data filtering on the associated big data to obtain target data, and perform attribute analysis on the target data to obtain a data attribute;
the matrix construction module 102 is configured to perform linear transformation on the data attributes to obtain attribute linear values, perform normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution map, and calculate probability density of each graph in the attribute normal distribution map according to the following formula;
wherein F represents the probability density of each graph in the attribute normal distribution map, β represents the attribute mean value of the data attribute, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution map, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution map;
the feature extraction module 103 is configured to determine an expected value corresponding to the data attribute according to the probability density, calculate covariance between each attribute in the data attribute according to the expected value, obtain attribute covariance, construct a covariance matrix of the attribute covariance, and determine a covariance structure of the target data according to the covariance matrix;
the cluster analysis module 104 is configured to perform sharpening noise reduction processing on the target data according to the covariance structure to obtain noise reduction data, perform feature extraction on the noise reduction data to obtain feature data, construct a transformation matrix of the feature data, perform sharpening projection on the feature data in combination with the transformation matrix to obtain data projection, and perform cluster analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
In detail, each module in the associated big data cluster analysis device 100 based on multidimensional intelligent acquisition in the embodiment of the present application adopts the same technical means as the associated big data cluster analysis method based on multidimensional intelligent acquisition described in fig. 1 and can produce the same technical effects, which are not repeated here.
Fig. 3 is a schematic structural diagram of an electronic device 1 for implementing a multidimensional intelligent acquisition-based associated big data cluster analysis method according to an embodiment of the present application.
The electronic device 1 may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as an associated big data cluster analysis method program based on multidimensional intelligent acquisition.
The processor 10 may in some embodiments be formed by an integrated circuit, for example a single packaged integrated circuit, or by a plurality of packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), a microprocessor, a digital processing chip, a graphics processor, a combination of various control chips, and so on. The processor 10 is the control unit (Control Unit) of the electronic device 1. It connects the various components of the entire electronic device using various interfaces and lines, runs or executes programs or modules stored in the memory 11 (for example, the associated big data cluster analysis method program based on multidimensional intelligent acquisition), and invokes data stored in the memory 11 to perform the various functions of the electronic device and to process data.
The memory 11 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only to store application software installed in an electronic device and various data, such as codes of related big data cluster analysis method programs based on multidimensional intelligent collection, but also to temporarily store data that has been output or is to be output.
The communication bus 12 may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.
The communication interface 13 is used for communication between the electronic device 1 and other devices, including a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a Display (Display), an input unit such as a Keyboard (Keyboard), or alternatively a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device and for displaying a visual user interface.
Fig. 3 shows only an electronic device with certain components. It will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, which may comprise fewer or more components than shown, may combine certain components, or may use a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device 1 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described herein.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The associated big data cluster analysis method program based on multidimensional intelligent acquisition stored in the memory 11 in the electronic device 1 is a combination of a plurality of instructions, which when run in the processor 10 can realize:
acquiring associated big data to be analyzed, performing data filtering on the associated big data to obtain target data, and performing attribute analysis on the target data to obtain data attributes;
performing linear transformation on the data attributes to obtain attribute linear values, performing normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution map, and calculating the probability density of each graph in the attribute normal distribution map through the following formula;
wherein F represents the probability density of each graph in the attribute normal distribution map, β represents the attribute mean value of the data attribute, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution map, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution map;
determining expected values corresponding to the data attributes according to the probability density, calculating covariance among each attribute in the data attributes according to the expected values to obtain attribute covariance, constructing a covariance matrix of the attribute covariance, and determining a covariance structure of the target data according to the covariance matrix;
and carrying out sharpening noise reduction treatment on the target data according to the covariance structure to obtain noise reduction data, carrying out feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, carrying out sharpening projection on the feature data by combining the conversion matrix to obtain data projection, and carrying out cluster analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
In particular, the specific implementation method of the above instructions by the processor 10 may refer to the description of the relevant steps in the corresponding embodiment of the drawings, which is not repeated herein.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-only memory (ROM).
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
acquiring associated big data to be analyzed, performing data filtering on the associated big data to obtain target data, and performing attribute analysis on the target data to obtain data attributes;
performing linear transformation on the data attributes to obtain attribute linear values, performing normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution map, and calculating the probability density of each graph in the attribute normal distribution map through the following formula;
wherein F represents the probability density of each graph in the attribute normal distribution map, β represents the attribute mean value of the data attribute, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution map, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution map;
determining expected values corresponding to the data attributes according to the probability density, calculating covariance among each attribute in the data attributes according to the expected values to obtain attribute covariance, constructing a covariance matrix of the attribute covariance, and determining a covariance structure of the target data according to the covariance matrix;
and carrying out sharpening noise reduction treatment on the target data according to the covariance structure to obtain noise reduction data, carrying out feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, carrying out sharpening projection on the feature data by combining the conversion matrix to obtain data projection, and carrying out cluster analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware combined with software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims may also be implemented by a single unit or means through software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present application without departing from the spirit and scope of the technical solution of the present application.

Claims (10)

1. The associated big data cluster analysis method based on multidimensional intelligent acquisition is characterized by comprising the following steps of:
acquiring associated big data to be analyzed, performing data filtering on the associated big data to obtain target data, and performing attribute analysis on the target data to obtain data attributes;
performing linear transformation on the data attributes to obtain attribute linear values, performing normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution map, and calculating the probability density of each graph in the attribute normal distribution map through the following formula;
wherein F represents the probability density of each graph in the attribute normal distribution map, β represents the attribute mean value of the data attribute, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution map, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution map;
determining expected values corresponding to the data attributes according to the probability density, calculating covariance among each attribute in the data attributes according to the expected values to obtain attribute covariance, constructing a covariance matrix of the attribute covariance, and determining a covariance structure of the target data according to the covariance matrix;
and carrying out sharpening noise reduction treatment on the target data according to the covariance structure to obtain noise reduction data, carrying out feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, carrying out sharpening projection on the feature data by combining the conversion matrix to obtain data projection, and carrying out cluster analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
2. The multi-dimensional intelligent acquisition-based associated big data cluster analysis method of claim 1, wherein the performing data filtering on the associated big data to obtain target data comprises:
carrying out standardization processing on the associated big data to obtain standardized data;
vectorizing the standardized data to obtain standardized vectors, and calculating cosine values of included angles among the standardized vectors;
and performing de-duplication processing on the standardized data according to the cosine value of the included angle to obtain target data.
3. The multi-dimensional intelligent acquisition-based associated big data cluster analysis method according to claim 1, wherein the performing attribute analysis on the target data to obtain data attributes comprises:
extracting a data tag corresponding to each data in the target data, and calculating the weight of each tag in the data tag through the following formula to obtain a tag weight;
wherein D is i Representing the tag weight of each of the data tags, B i A tag vector representing an i-th tag of the data tags,representing vector covariance corresponding to a label vector of an ith label in the user labels, wherein trace () represents a spatial filtering function;
and extracting the characteristic labels in the data labels according to the label weights, and carrying out attribute analysis on the characteristic labels to obtain data attributes.
4. The multi-dimensional intelligent collection-based associative big data cluster analysis method according to claim 1, wherein the calculating the covariance of each attribute in the data attributes according to the expected value to obtain an attribute covariance comprises:
the covariance between each of the data attributes is calculated by the following formula:
Cov(m,m+1)=E[m,m+1]-E[m]E[m+1]
wherein Cov(m,m+1) represents the covariance between each attribute in the data attributes, m and m+1 represent the sequence numbers of the data attributes, E[m] represents the expected value corresponding to the m-th data attribute, and E[m+1] represents the expected value corresponding to the (m+1)-th data attribute.
5. The multi-dimensional intelligent acquisition-based associated big data cluster analysis method according to claim 1, wherein the sharpening noise reduction processing is performed on the target data according to the covariance structure to obtain noise reduction data, and the method comprises the following steps:
according to the covariance structure, carrying out feature decomposition on the covariance matrix to obtain matrix features;
calculating the feature weight of each feature in the matrix features, and screening the target data according to the feature weights to obtain screening data;
performing dimension reduction processing on the screening data to obtain dimension reduction data;
and carrying out sharpening processing on the data with reduced dimension to obtain sharpened data, and carrying out noise reduction processing on the sharpened data to obtain noise reduction data.
6. The method for clustering analysis of associated big data based on multidimensional intelligent acquisition according to claim 5, wherein the performing feature decomposition on the covariance matrix according to the covariance structure to obtain matrix features comprises:
performing feature decomposition on the covariance matrix through the following formula:
wherein G represents the matrix characteristics of the covariance matrix, Cov_z represents the covariance structure, Q represents the orthonormal matrix, and ΣQ^(-1) is the reciprocal sum of the orthogonal matrix.
7. The multi-dimensional intelligent acquisition-based associative big data cluster analysis method according to claim 1, wherein the constructing the transformation matrix of the feature data comprises:
calculating the characteristic value of each data in the characteristic data, and sequencing the characteristic values to obtain sequenced characteristic values;
counting the number of the characteristic values to obtain the characteristic number, and identifying the data dimension of the characteristic data;
setting a retention coefficient of the sorting characteristic values according to the characteristic quantity and the data dimension;
filtering the sorting characteristic values according to the retention coefficient to obtain target characteristic values;
and carrying out vector conversion on the characteristic data corresponding to the target characteristic value to obtain a characteristic vector, and constructing a conversion matrix of the characteristic data according to the target characteristic value and the characteristic vector.
8. The method for clustering analysis of associated big data based on multidimensional intelligent acquisition according to claim 1, wherein performing cluster analysis on the feature data according to the sharpened projection to obtain a clustering result of the associated big data comprises:
obtaining the coordinates of each projection in the sharpened projection to obtain projection coordinates, and calculating the projection similarity of each projection according to the projection coordinates;
determining the potential association degree of the feature data according to the projection similarity;
and carrying out cluster analysis on the feature data according to the potential association degree to obtain the clustering result of the associated big data.
9. The method for clustering analysis of associated big data based on multidimensional intelligent acquisition according to claim 8, wherein calculating the projection similarity of each projection according to the projection coordinates comprises:
calculating the projection similarity of each projection by the following formula:
wherein S represents the projection similarity of each projection, k represents the distance parameter, l and l+1 represent the sequence numbers of the projections, w represents the total number of projections, X_l and Y_l represent the projection coordinates of the l-th projection, and X_{l+1} and Y_{l+1} represent the projection coordinates of the (l+1)-th projection.
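The similarity formula is not present in this text, so the following is only an illustrative sketch of the idea in claims 8 and 9. It assumes a similarity of the form 1 / (1 + k * d), where d is the Euclidean distance between projection l and projection l+1; this form is a stand-in and is not taken from the patent. The grouping rule (start a new cluster when the similarity falls below a threshold) and the names `consecutive_similarity` and `group_by_similarity` are also hypothetical.

```python
import math


def consecutive_similarity(points, k=1.0):
    """Similarity between projection l and projection l+1 for each l.

    points -- list of (X, Y) projection coordinates
    k      -- distance parameter; larger k penalises distance more strongly
    Uses the assumed form 1 / (1 + k * d), d being the Euclidean distance.
    """
    sims = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d = math.hypot(x1 - x0, y1 - y0)
        sims.append(1.0 / (1.0 + k * d))
    return sims


def group_by_similarity(points, k=1.0, threshold=0.5):
    """Group projection indices; a new cluster starts when similarity < threshold."""
    if not points:
        return []
    clusters = [[0]]
    for l, s in enumerate(consecutive_similarity(points, k)):
        if s >= threshold:
            clusters[-1].append(l + 1)
        else:
            clusters.append([l + 1])
    return clusters
```

For the points `(0, 0)`, `(0, 1)`, `(5, 1)`, `(5, 1.5)` with `k=1.0` and `threshold=0.5`, the first two projections end up in one group and the last two in another, because the jump from `(0, 1)` to `(5, 1)` has low similarity.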
10. An associated big data cluster analysis device based on multidimensional intelligent acquisition, characterized in that the device comprises:
an attribute analysis module, used for acquiring associated big data to be analyzed, carrying out data filtering on the associated big data to obtain target data, and carrying out attribute analysis on the target data to obtain data attributes;
a matrix construction module, used for carrying out linear transformation on the data attributes to obtain attribute linear values, carrying out normal distribution processing on each attribute in the data attributes according to the attribute linear values to obtain an attribute normal distribution diagram, and calculating the probability density of each graph in the attribute normal distribution diagram through the following formula:
wherein F represents the probability density of each graph in the attribute normal distribution diagram, β represents the attribute mean value of the data attributes, exp represents the exponential function, B_j represents the random variable of the j-th attribute normal distribution diagram, and C_j represents the graph parameter corresponding to the j-th attribute normal distribution diagram;
a feature extraction module, used for determining expected values corresponding to the data attributes according to the probability density, calculating the covariance between each pair of attributes in the data attributes according to the expected values to obtain attribute covariance, constructing a covariance matrix of the attribute covariance, and determining a covariance structure of the target data according to the covariance matrix;
and a cluster analysis module, used for carrying out sharpening noise reduction processing on the target data according to the covariance structure to obtain noise reduction data, carrying out feature extraction on the noise reduction data to obtain feature data, constructing a conversion matrix of the feature data, carrying out sharpening projection on the feature data in combination with the conversion matrix to obtain a data projection, and carrying out cluster analysis on the feature data according to the data projection to obtain a clustering result of the associated big data.
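The density and covariance steps described for the matrix construction and feature extraction modules can be sketched as follows. Two assumptions are made because the formula is not present in this text: the graph parameter C_j is treated as the standard deviation of a normal density with mean β, and the expected value of each attribute is estimated by its sample mean. The function names `normal_density` and `covariance_matrix` are hypothetical.

```python
import math


def normal_density(b, beta, c):
    """Normal density at b with mean beta and spread c.

    Assumption: c plays the role of a standard deviation (the patent's
    graph parameter C_j), so it must be positive.
    """
    if c <= 0:
        raise ValueError("c must be positive")
    return math.exp(-((b - beta) ** 2) / (2.0 * c * c)) / (c * math.sqrt(2.0 * math.pi))


def covariance_matrix(rows):
    """Sample covariance matrix; rows are records, columns are attributes.

    The per-attribute sample mean stands in for the expected value.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two records")
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(dims)
        ]
        for i in range(dims)
    ]
```

For the records `(1, 2)`, `(2, 4)`, `(3, 6)`, `covariance_matrix` gives variances 1 and 4 on the diagonal and covariance 2 off the diagonal, reflecting that the second attribute is exactly twice the first.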
CN202310516513.7A 2023-05-09 2023-05-09 Associated big data cluster analysis method and device based on multidimensional intelligent acquisition Pending CN116662839A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310516513.7A CN116662839A (en) 2023-05-09 2023-05-09 Associated big data cluster analysis method and device based on multidimensional intelligent acquisition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310516513.7A CN116662839A (en) 2023-05-09 2023-05-09 Associated big data cluster analysis method and device based on multidimensional intelligent acquisition

Publications (1)

Publication Number Publication Date
CN116662839A (en) 2023-08-29

Family

ID=87712751

Family Applications (1)

Application Number Legal Status Publication Number Priority Date Filing Date Title
CN202310516513.7A Pending CN116662839A (en) 2023-05-09 2023-05-09 Associated big data cluster analysis method and device based on multidimensional intelligent acquisition

Country Status (1)

Country Link
CN (1) CN116662839A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934486A (en) * 2023-09-15 2023-10-24 深圳格隆汇信息科技有限公司 Decision evaluation method and system based on deep learning
CN117113119A (en) * 2023-10-24 2023-11-24 陕西女娲神草农业科技有限公司 Equipment association relation analysis method and system applied to gynostemma pentaphylla preparation scene

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934486A (en) * 2023-09-15 2023-10-24 深圳格隆汇信息科技有限公司 Decision evaluation method and system based on deep learning
CN116934486B (en) * 2023-09-15 2024-01-12 深圳市蓝宇飞扬科技有限公司 Decision evaluation method and system based on deep learning
CN117113119A (en) * 2023-10-24 2023-11-24 陕西女娲神草农业科技有限公司 Equipment association relation analysis method and system applied to gynostemma pentaphylla preparation scene
CN117113119B (en) * 2023-10-24 2023-12-26 陕西女娲神草农业科技有限公司 Equipment association relation analysis method and system applied to gynostemma pentaphylla preparation scene

Similar Documents

Publication Publication Date Title
Zenggang et al. Research on image retrieval algorithm based on combination of color and shape features
CN116662839A (en) Associated big data cluster analysis method and device based on multidimensional intelligent acquisition
CN113705462B (en) Face recognition method, device, electronic equipment and computer readable storage medium
CN113157927B (en) Text classification method, apparatus, electronic device and readable storage medium
CN113378970B (en) Sentence similarity detection method and device, electronic equipment and storage medium
CN114398557B (en) Information recommendation method and device based on double images, electronic equipment and storage medium
CN114979120B (en) Data uploading method, device, equipment and storage medium
CN107315984B (en) Pedestrian retrieval method and device
CN115238670A (en) Information text extraction method, device, equipment and storage medium
CN113656690B (en) Product recommendation method and device, electronic equipment and readable storage medium
CN114220536A (en) Disease analysis method, device, equipment and storage medium based on machine learning
CN109800215A (en) Method, apparatus, computer storage medium and the terminal of a kind of pair of mark processing
CN111402068B (en) Premium data analysis method and device based on big data and storage medium
CN111651625A (en) Image retrieval method, image retrieval device, electronic equipment and storage medium
CN115409041B (en) Unstructured data extraction method, device, equipment and storage medium
CN115661472A (en) Image duplicate checking method and device, computer equipment and storage medium
CN113343306B (en) Differential privacy-based data query method, device, equipment and storage medium
CN113515591B (en) Text defect information identification method and device, electronic equipment and storage medium
CN114996386A (en) Business role identification method, device, equipment and storage medium
CN111444159B (en) Refined data processing method, device, electronic equipment and storage medium
CN111652281B (en) Information data classification method, device and readable storage medium
CN113343102A (en) Data recommendation method and device based on feature screening, electronic equipment and medium
CN116522105B (en) Method, device, equipment and medium for integrally constructing data based on cloud computing
CN116257488B (en) Geotechnical engineering investigation big data archiving method, device, electronic equipment and medium
CN113704411B (en) Word vector-based similar guest group mining method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 030013 Room 707, Block A, Gaoxin Guozhi Building, No. 3, Dong'e'er Lane, Taiyuan Xuefu Park, Shanxi Comprehensive Reform Demonstration Zone, Taiyuan City, Shanxi Province

Applicant after: Changhe Information Co.,Ltd.

Address before: 030013 Room 707, Block A, Gaoxin Guozhi Building, No. 3, Dong'e'er Lane, Taiyuan Xuefu Park, Shanxi Comprehensive Reform Demonstration Zone, Taiyuan City, Shanxi Province

Applicant before: Shanxi Changhe Technology Co.,Ltd.
