CN112989153B - Data processing method and device and computer equipment - Google Patents

Info

Publication number
CN112989153B
Authority
CN
China
Prior art keywords
clustering
distribution
target
sql
multidimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911284244.6A
Other languages
Chinese (zh)
Other versions
CN112989153A (en)
Inventor
王哲
蔡博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201911284244.6A
Publication of CN112989153A
Application granted
Publication of CN112989153B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G06F16/903 Querying
    • G06F16/90335 Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application provide a data processing method, a data processing device and computer equipment. In the embodiments, a plurality of target parameters of a plurality of objects are determined, and the objects are clustered multidimensionally according to the parameter data of the target parameters corresponding to each object, yielding a plurality of clustering results. A multidimensional clustering distribution map is then generated according to the distribution of the clustering results in a clustering space coordinate system established based on the target parameters, and the map is output. The multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the clustering results. The application allows the query condition of the target object to be determined more accurately, which in turn improves retrieval precision.

Description

Data processing method and device and computer equipment
Technical Field
The embodiment of the application relates to the technical field of networks, in particular to a data processing method, a data processing device and computer equipment.
Background
With the rapid development of network technology, retrieving a target object that meets specific conditions generally requires a combined search, or multiple searches, based on query conditions set from the multidimensional features of the target object.
Taking a cloud database as an example, user demand for database audit services, such as database security, database performance diagnosis and detection of abnormal database behavior, is increasingly urgent. Accurately tracking and locating databases that exhibit security events or performance problems, and tracing the root causes of those events or problems, is a primary goal of database audit services.
In the prior art, query conditions (such as keywords, time, database name, operation type, execution state, number of scanned records and execution time) are set based on the multidimensional features of the database, and the user searches for databases that satisfy at least one input query condition, thereby tracking and locating them. Whether the desired database is actually tracked and located, however, depends entirely on how well the query conditions entered by the user match it. Because many combinations of factors can cause security events or performance problems in a database, users can only set query conditions from experience, which leads to inaccurate query conditions and reduced retrieval accuracy.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device and computer equipment, which enable a user to intuitively determine the query condition of a target object according to the cluster distribution condition of a plurality of objects in a multi-dimensional cluster distribution map by generating the multi-dimensional cluster distribution map of the plurality of objects, thereby further improving the retrieval accuracy of the target object.
In a first aspect, an embodiment of the present application provides a data processing method, including:
determining a plurality of target parameters of a plurality of objects;
performing multidimensional clustering on the plurality of objects according to the parameter data of the plurality of target parameters corresponding to each of the plurality of objects, to obtain a plurality of clustering results;
generating a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the plurality of target parameters;
outputting the multidimensional clustering distribution map, wherein the multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the plurality of clustering results.
In a second aspect, an embodiment of the present application provides a data processing method, including:
determining a plurality of target parameters of a plurality of databases, wherein the plurality of target parameters include an average number of scan lines and an average time consumption;
performing multidimensional clustering on the plurality of databases according to the average scan line number data and the average time consumption data corresponding to each of the databases, to obtain a plurality of clustering results;
generating a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the average number of scan lines and the average time consumption;
outputting the multidimensional clustering distribution map, wherein the multidimensional clustering distribution map is used for determining the query condition of a target database based on the distribution of the plurality of clustering results.
In a third aspect, an embodiment of the present application provides a data processing apparatus, including:
a determining module, used for determining a plurality of target parameters of a plurality of objects;
a clustering module, used for performing multidimensional clustering on the plurality of objects according to the parameter data of the plurality of target parameters corresponding to each of the plurality of objects, to obtain a plurality of clustering results;
a distribution map generation module, used for generating a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the plurality of target parameters;
an output module, used for outputting the multidimensional clustering distribution map, wherein the multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the plurality of clustering results.
In a fourth aspect, an embodiment of the present application provides a data processing apparatus, including:
a determining module, used for determining a plurality of target parameters of a plurality of databases, wherein the plurality of target parameters include an average number of scan lines and an average time consumption;
a clustering module, used for performing multidimensional clustering on the plurality of databases according to the average scan line number data and the average time consumption data corresponding to each of the databases, to obtain a plurality of clustering results;
a distribution map generation module, used for generating a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the average number of scan lines and the average time consumption;
a first output module, used for outputting the multidimensional clustering distribution map, wherein the multidimensional clustering distribution map is used for determining the query condition of a target database based on the distribution of the plurality of clustering results.
In a fifth aspect, in an embodiment of the present application, a computer device includes a processing component, a display component, and a storage component; the storage component is used for storing one or more computer instructions, wherein the one or more computer instructions are used for being called by the processing component for execution;
The processing component is configured to:
determine a plurality of target parameters of a plurality of objects;
perform multidimensional clustering on the plurality of objects according to the parameter data of the plurality of target parameters corresponding to each of the plurality of objects, to obtain a plurality of clustering results;
generate a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the plurality of target parameters;
The display component is configured to output the multidimensional clustering distribution map, wherein the multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the plurality of clustering results.
Compared with the prior art, the application can obtain the following technical effects:
The embodiments of the application provide a data processing method, a data processing device and computer equipment. A plurality of target parameters of a plurality of objects are determined, and the objects are clustered multidimensionally according to the parameter data of the target parameters corresponding to each object, yielding a plurality of clustering results. A multidimensional clustering distribution map is generated according to the distribution of the clustering results in a clustering space coordinate system established based on the target parameters, and the map is output. The multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the clustering results. By viewing the displayed map, a user can see clearly and intuitively how the plurality of objects are distributed in the clustering space coordinate system established based on the target parameters. This gives the user an accurate reference for determining the query condition of the target object quickly and accurately, so that retrieval based on that query condition is more precise.
These and other aspects of the application will be more readily apparent from the following description of the embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of one embodiment of a data processing method according to the present application;
FIG. 2 shows a schematic representation of a multi-dimensional cluster map provided in accordance with the present application;
FIG. 3 is a flow chart illustrating another embodiment of a data processing method according to the present application;
FIG. 4 shows a schematic representation of another multi-dimensional cluster map provided in accordance with the present application;
FIG. 5 shows a schematic representation of yet another multi-dimensional cluster map provided in accordance with the present application;
FIG. 6 is a flow chart illustrating a further embodiment of a data processing method according to the present application;
FIG. 7 is a schematic diagram illustrating the construction of one embodiment of a data processing apparatus provided in accordance with the present application;
FIG. 8 is a schematic diagram showing the structure of a further embodiment of a data processing apparatus according to the present application;
FIG. 9 is a schematic diagram showing the structure of another embodiment of a data processing apparatus according to the present application;
FIG. 10 is a schematic structural view of an embodiment of a computer device according to the present application.
Detailed Description
In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present application with reference to the accompanying drawings.
In some of the flows described in the specification and claims of the present application and in the foregoing figures, a plurality of operations occurring in a particular order are included, but it should be understood that the operations may be performed out of order or performed in parallel, with the order of operations such as 101, 102, etc., being merely used to distinguish between the various operations, the order of the operations themselves not representing any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, etc., and do not represent a sequence, and are not limited to the "first" and the "second" being different types.
In the context of big data, when a target object meeting specific conditions is to be retrieved, those skilled in the art usually set query conditions from experience according to the multidimensional features of the target object, so the query conditions are often not accurate enough. To compensate for the resulting low retrieval precision, they resort to combined retrieval or multiple retrievals, which makes the retrieval operation cumbersome.
Therefore, in order to obtain accurate query conditions and improve retrieval accuracy, the inventors arrived at the present technical scheme through a series of studies. In the embodiments of the application, a plurality of target parameters of a plurality of objects are determined, and the objects are clustered multidimensionally according to the parameter data of the target parameters corresponding to each object, yielding a plurality of clustering results. A multidimensional clustering distribution map is generated according to the distribution of the clustering results in a clustering space coordinate system established based on the target parameters, and the map is output. The multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the clustering results. The map shows clearly and intuitively how the plurality of objects are distributed in the clustering space coordinate system established based on the target parameters, and gives the user an accurate reference for determining the query condition of the target object quickly and accurately, so that retrieval based on that query condition is more precise.
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
Fig. 1 is a schematic flow chart of an embodiment of a data processing method according to an embodiment of the present application. The method may comprise the steps of:
101: Determining a plurality of target parameters of a plurality of objects.
In practice, a target parameter may be a dimension used to describe the object. Generally, when an object having a certain feature is retrieved, a query condition may be set from a plurality of dimensions describing the object so that the target object can be retrieved accurately. For example, when a document is searched for in a paper library, the target document can be searched for along dimensions such as author, time, keywords and title; when an abnormal database is tracked in a cloud database, it can be tracked along dimensions such as the average scanning line number and average time consumption of the database.
It will be appreciated that multi-dimensional data is not only widely used in the field of search queries, but is also applicable in the field of multi-dimensional data analysis. Multidimensional data analysis can be used to perform multi-angle, comprehensive analysis of business-related objects for a business. For example, sales analysis of a certain product in business-oriented analysis can be performed from different dimensions of channels, time, users, regions and the like; particularly, with the advent of the cloud era, the multi-dimensional data analysis based on the cloud processing platform can provide more powerful service decision-making force, insight discovery force, service flow optimization capability and the like for users.
In the embodiments of the application, the plurality of objects may be objects of the same class, such as a plurality of databases in a cloud database, or objects of different classes that share the same dimensions, such as different types of products with the same sales dimensions; this is not specifically limited herein.
In practical applications, the target parameters may be set according to the object dimensions the user actually cares about. For example, if the user wants to track abnormal databases, the dimensions that characterize database abnormality may be set as the target parameters; optionally, the target parameters of the database may be set to the average scanning line number and the average time consumption.
102: Performing multidimensional clustering on the plurality of objects according to the parameter data of the plurality of target parameters corresponding to each of the plurality of objects, to obtain a plurality of clustering results.
Taking a database in a cloud database as the object, the target parameters are set to the average scanning line number and the average time consumption. When a user issues a tracking and positioning instruction for abnormal databases, the plurality of databases belonging to that user in the cloud database are first determined, and their data in these two dimensions, namely the average scanning line number data and the average time consumption data, are obtained so that the databases can be clustered multidimensionally.
In practical applications, a clustering space coordinate system corresponding to the plurality of objects can be established according to the number of target parameters, and each target parameter is assigned to a dimension of that coordinate system. For example, a two-dimensional clustering space coordinate system is established for two target parameters, with the average time consumption parameter as the X dimension and the average scanning line number parameter as the Y dimension, and a clustering threshold is set for the target parameter in each dimension according to the clustering quality the user requires. After the parameter data of the databases are clustered multidimensionally in this coordinate system based on the corresponding clustering thresholds, a plurality of clustering results are obtained, n in total, where n is the total number of clustering results actually obtained. It can be understood that, when clustering a plurality of objects multidimensionally, setting the clustering thresholds as small as possible improves the quality of the clustering results; however, the volume of object data to be clustered is large, so a trade-off must be struck between clustering quality and clustering efficiency. When the clustering thresholds are nonzero, the actual clustering results are expressed in terms of two thresholds: the clustering threshold of the target parameter corresponding to the X dimension and the clustering threshold of the target parameter corresponding to the Y dimension.
The clustering method is not limited to the one described above; any prior-art clustering method suited to the application scenario may be used to cluster the plurality of objects over the data dimensions determined by the plurality of target parameters, which is not limited herein.
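As an illustration of threshold-based clustering in a two-dimensional coordinate system, the following minimal sketch groups databases whose values fall into the same threshold-sized cell on each dimension. The grid-binning rule, the function name `grid_cluster`, the sample databases and the threshold values are all assumptions made for illustration; the text does not specify the clustering algorithm.

```python
from collections import defaultdict


def grid_cluster(points, x_threshold, y_threshold):
    """Group (name, x, y) points that share a threshold-sized grid cell.

    Assumption: a "clustering threshold" is treated as the cell width on
    its dimension. This is one possible reading, not the patent's method.
    """
    if x_threshold <= 0 or y_threshold <= 0:
        raise ValueError("thresholds must be positive")
    cells = defaultdict(list)
    for name, x, y in points:
        # Integer cell index along each dimension.
        key = (int(x // x_threshold), int(y // y_threshold))
        cells[key].append((name, x, y))
    clusters = []
    for key in sorted(cells):
        members = cells[key]
        # Represent each cluster by the mean of its members.
        cx = sum(m[1] for m in members) / len(members)
        cy = sum(m[2] for m in members) / len(members)
        clusters.append({"center": (cx, cy),
                         "members": [m[0] for m in members]})
    return clusters


# Hypothetical data: (database, average time in microseconds, average scan lines).
databases = [
    ("db_a", 1200.0, 50.0),
    ("db_b", 1800.0, 80.0),
    ("db_c", 15000.0, 2400.0),
    ("db_d", 15500.0, 2450.0),
    ("db_e", 400.0, 1500.0),
]
clusters = grid_cluster(databases, x_threshold=5000.0, y_threshold=500.0)
for cluster in clusters:
    print(cluster["center"], cluster["members"])
```

Smaller thresholds produce more, finer clusters at a higher processing cost, which mirrors the quality versus efficiency trade-off described above.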
103: Generating a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the plurality of target parameters.
As an implementation manner, the generating the multidimensional clustering distribution map according to the distribution situation of the clustering results in the clustering space coordinate system established based on the target parameters may include:
establishing a clustering space coordinate system based on the plurality of target parameters;
determining the distribution positions of the plurality of clustering results in the clustering space coordinate system;
marking the distribution positions with a preset graphic, and generating the multidimensional clustering distribution map.
Optionally, the multidimensional clustering distribution map can be drawn in the clustering space coordinate system established based on the plurality of target parameters. Once the plurality of clustering results are obtained, the distribution position of each clustering result in the clustering space coordinate system can be determined. The distribution position of each clustering result can be marked with a preset graphic, which may be a diamond, circle, star, triangle or other shape, and is not particularly limited herein.
As an alternative embodiment, the preset graphic may include a scatter point; marking the distribution positions in the clustering space coordinate system with a preset graphic and generating the multidimensional clustering distribution map may include:
marking the distribution positions in the clustering space coordinate system with scatter points, and generating a multidimensional scatter plot.
In practical applications, the scatter points may be solid circles or hollow circles, and they mark the positions of the clustering results in the clustering space coordinate system. Fig. 2 shows a two-dimensional scatter plot generated from the distribution of the plurality of clustering results in a clustering space coordinate system established with the average time consumption parameter of the database as the X axis and the average scanning line number parameter as the Y axis.
Optionally, in practical applications, a visual drawing tool may be used to render the multidimensional clustering distribution map. Any drawing tool based on an existing visualization technology, such as Canvas, SVG (Scalable Vector Graphics), WebGL or OpenGL, may be applied to the scheme of the application; each has its own advantages and disadvantages and may be selected according to the actual application scenario, which is not limited herein. Rendering the multidimensional clustering distribution map with a visual drawing tool displays the multidimensional data of the plurality of objects visually in the form of a multidimensional clustering distribution map.
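The scatter-plot generation described above can be sketched with SVG, one of the technologies the text mentions. The sketch below assumes each clustering result is represented by a single non-negative (x, y) position; the function name `scatter_svg`, the canvas size, the linear scaling and the sample positions are illustrative assumptions, not part of the described scheme.

```python
def scatter_svg(centers, width=400, height=300, margin=30, radius=4):
    """Render cluster positions as solid circles in a minimal SVG document.

    Assumes non-negative coordinates; each axis is scaled linearly so the
    largest value lands on the far edge of the plot area.
    """
    if not centers:
        raise ValueError("at least one cluster position is required")
    max_x = max(x for x, _ in centers) or 1.0
    max_y = max(y for _, y in centers) or 1.0
    plot_w = width - 2 * margin
    plot_h = height - 2 * margin
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    # X axis (average time consumption) and Y axis (average scan lines).
    parts.append(
        f'<line x1="{margin}" y1="{height - margin}" '
        f'x2="{width - margin}" y2="{height - margin}" stroke="black"/>'
    )
    parts.append(
        f'<line x1="{margin}" y1="{margin}" '
        f'x2="{margin}" y2="{height - margin}" stroke="black"/>'
    )
    for x, y in centers:
        px = margin + x / max_x * plot_w
        # SVG's y axis points downward, so flip it for a conventional plot.
        py = height - margin - y / max_y * plot_h
        parts.append(
            f'<circle cx="{px:.1f}" cy="{py:.1f}" r="{radius}" fill="steelblue"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical cluster positions: (average time in microseconds, average scan lines).
centers = [(1500.0, 65.0), (400.0, 1500.0), (15250.0, 2425.0)]
svg = scatter_svg(centers)
print(svg)
```

Writing the returned string to a `.svg` file and opening it in a browser displays one solid circle per clustering result.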
It can be understood that the visual display mode of the multidimensional data of the plurality of objects in the embodiment of the application is not limited to the mode of generating the multidimensional clustering distribution map by establishing the clustering space coordinate system, and can also be used for carrying out visual display in the modes of a normal distribution map, a bar graph, a histogram, a sector graph and the like according to the number of the clustered objects respectively corresponding to the plurality of clustering results and the probability distribution condition thereof, so that the distribution condition of the clustering results can be seen, and the proportion of the objects corresponding to each clustering result to the total number can be seen more intuitively. In the embodiment of the application, the visual display mode of the multidimensional data of the plurality of objects is not particularly limited, and the visual display mode can be set according to the requirements of users or actual application scenes.
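For the alternative displays mentioned above, the proportion of objects in each clustering result has to be computed first. The following small sketch does this and prints a text bar chart; the cluster names, sizes and function names are hypothetical.

```python
def cluster_shares(cluster_sizes):
    """Return each cluster's share of the total number of clustered objects."""
    total = sum(cluster_sizes.values())
    if total == 0:
        raise ValueError("no clustered objects")
    return {name: size / total for name, size in cluster_sizes.items()}


def text_bar_chart(shares, width=20):
    """Draw one text bar per cluster, with length proportional to its share."""
    lines = []
    for name, share in shares.items():
        bar = "#" * round(share * width)
        lines.append(f"{name:<10} {bar} {share:.0%}")
    return "\n".join(lines)


# Hypothetical number of objects per clustering result.
sizes = {"cluster_1": 2, "cluster_2": 1, "cluster_3": 2}
shares = cluster_shares(sizes)
chart = text_bar_chart(shares)
print(chart)
```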
104: Outputting the multidimensional clustering distribution map.
The multi-dimensional clustering distribution diagram is used for determining the query condition of the target object based on the distribution condition of the clustering results.
In practical application, the multidimensional clustering distribution map can be output in the visual display interface, so that a user obtains multidimensional data information of a plurality of objects according to the multidimensional clustering distribution map displayed in the visual display interface, and at least one clustering result matched with the target object is determined as a query condition by taking the position distribution of the plurality of clustering results in the multidimensional clustering distribution map as a reference, so that the target object is obtained by searching according to the determined query condition.
It is to be understood that the target object may be one object or a plurality of objects, which is not particularly limited herein. The actual query condition needs to match the retrieval requirement for the target object. For example, suppose the user wants to find abnormal databases among a plurality of databases, and the target parameters are set to the average time consumption and the average scanning line number. A sharp increase in the average scanning line number of a database suggests a likely performance problem, and a long average time consumption suggests a likely performance problem or network link problem. Therefore, at least one target clustering result that meets the anomaly thresholds corresponding to the target parameters can be selected and determined as the query condition. As shown in fig. 2, if the anomaly threshold for the average time consumption parameter is set to greater than 10000 microseconds and the anomaly threshold for the average scanning line number parameter is set to greater than 1000 lines, then each clustering result whose data value in any dimension meets the corresponding anomaly threshold may be selected as a target clustering result, and the at least one target clustering result is determined as the query condition.
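The threshold-based selection in the Fig. 2 example can be sketched as follows. The two threshold values come from the text; the cluster records, the field names and the shape of the resulting query condition are assumptions made for illustration.

```python
# Anomaly thresholds taken from the Fig. 2 example in the text.
TIME_THRESHOLD_US = 10000.0
SCAN_LINE_THRESHOLD = 1000.0


def select_target_clusters(clusters,
                           time_threshold=TIME_THRESHOLD_US,
                           scan_threshold=SCAN_LINE_THRESHOLD):
    """Keep clusters that exceed the anomaly threshold in ANY dimension."""
    return [
        c for c in clusters
        if c["avg_time_us"] > time_threshold
        or c["avg_scan_lines"] > scan_threshold
    ]


def to_query_condition(target_clusters):
    """Hypothetical query condition: the databases in the selected clusters."""
    names = sorted({n for c in target_clusters for n in c["members"]})
    return {"database_names": names}


# Hypothetical clustering results.
clusters = [
    {"avg_time_us": 1500.0, "avg_scan_lines": 65.0, "members": ["db_a", "db_b"]},
    {"avg_time_us": 400.0, "avg_scan_lines": 1500.0, "members": ["db_e"]},
    {"avg_time_us": 15250.0, "avg_scan_lines": 2425.0, "members": ["db_c", "db_d"]},
]
targets = select_target_clusters(clusters)
condition = to_query_condition(targets)
print(condition)
```

The second cluster is selected on its scan line count alone, which reflects the "any dimension" wording of the example.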
In the embodiments of the application, a plurality of target parameters of the plurality of objects that the user cares about are determined based on the user's retrieval requirement for the target object. The target parameters define the clustering dimensions, and the objects are clustered multidimensionally based on the parameter data corresponding to the target parameters to obtain a plurality of clustering results. Cluster analysis of the objects helps reveal their distribution structure across the different data dimensions. Displaying the common characteristics of the objects' multidimensional data as a multidimensional clustering distribution map shows clearly and intuitively how that data is distributed in the clustering dimensions, and gives the user an accurate reference for determining, quickly and accurately, a query condition that matches the retrieval requirement for the target object, so that retrieval based on that query condition is more precise.
Fig. 3 is a flowchart of another embodiment of a data processing method according to an embodiment of the present application. The method may comprise the steps of:
301: a plurality of target parameters for a plurality of objects are determined.
302: Object attributes of the plurality of objects are determined separately.
In practical application, an object may further have object attribute features in multiple dimensions. For example, when the object is a commodity, the attribute features may include a material attribute, a usage attribute, a model attribute, a color attribute, a size attribute, and the like; when the object is a database, the attribute features may include at least a region attribute, an operation attribute, a data type attribute, a service attribute, and so on.
303: And carrying out multidimensional clustering on the objects according to the parameter data of the target parameters corresponding to the objects and the object attributes corresponding to the objects to obtain a plurality of clustering results.
As an optional implementation manner, the multi-dimensional clustering of the plurality of objects according to the parameter data of the target parameters respectively corresponding to the plurality of objects and the object attributes respectively corresponding to the plurality of objects may include:
classifying the objects based on object attributes respectively corresponding to the objects to obtain a plurality of attribute categories;
And respectively carrying out multidimensional clustering on the plurality of objects under the attribute categories according to the parameter data of the target parameters corresponding to the plurality of objects to obtain a plurality of clustering results.
Taking a cloud database as an example, the object attribute of the plurality of objects may be an operation attribute. The operation attribute refers to the type of operation performed on a database and may be classified into insert operations, select operations, update operations, display operations, and the like; each database in the cloud database may correspond to at least one of these operation types.
Therefore, the databases are classified according to the operation type corresponding to each database, yielding the databases corresponding to the insert operation type, the select operation type, the update operation type, and the display operation type. The databases belonging to the same operation type are then clustered in multiple dimensions, giving a plurality of clustering results for each operation type.
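The classify-then-cluster procedure above can be sketched as below. The source does not name a clustering algorithm, so a simple greedy "leader" clustering is swapped in purely as a stand-in; the record layout (`name`, `op_type`, `params`), the `radius` value, and the sample data are all hypothetical. The sketch also assumes each database has exactly one operation type, whereas the text allows more than one.

```python
from collections import defaultdict
from math import dist


def leader_cluster(points, radius):
    """Greedy leader clustering (stand-in algorithm, not from the source).

    A point joins the first cluster whose leader lies within `radius`;
    otherwise it starts a new cluster and becomes its leader.
    """
    clusters = []
    for name, point in points:
        for cluster in clusters:
            if dist(cluster["leader"], point) <= radius:
                cluster["members"].append(name)
                break
        else:
            clusters.append({"leader": point, "members": [name]})
    return clusters


def cluster_by_category(databases, radius):
    """Group databases by operation type, then cluster inside each group."""
    by_category = defaultdict(list)
    for db in databases:
        by_category[db["op_type"]].append((db["name"], db["params"]))
    return {cat: leader_cluster(pts, radius) for cat, pts in by_category.items()}


# params = (average time consumption, average scanned rows); made-up values.
databases = [
    {"name": "db1", "op_type": "insert", "params": (100.0, 10.0)},
    {"name": "db2", "op_type": "insert", "params": (110.0, 12.0)},
    {"name": "db3", "op_type": "insert", "params": (9000.0, 800.0)},
    {"name": "db4", "op_type": "select", "params": (105.0, 11.0)},
]
result = cluster_by_category(databases, radius=50.0)
```

Note that `db4` is never merged with `db1` and `db2` even though its parameter data is close to theirs, because clustering runs separately inside each attribute category.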
304: And generating a multidimensional clustering distribution diagram according to the distribution condition of the clustering results in a clustering space coordinate system established based on the target parameters.
As an implementation manner, the generating the multidimensional clustering distribution map according to the distribution situation of the clustering results in the clustering space coordinate system established based on the target parameters may include:
Establishing a clustering space coordinate system based on the plurality of target parameters;
determining distribution positions of the plurality of clustering results in the clustering space coordinate system respectively;
Determining preset graphs corresponding to the attribute categories respectively;
Marking each distribution position with the preset graph corresponding to the attribute category to which the clustering result belongs, and generating the multidimensional clustering distribution map.
In practical application, in order to display the object attribute information of the plurality of objects in the multidimensional clustering distribution map, a different preset graph may be set for each of the attribute categories obtained by classifying on object attributes. The preset graphs may be distinguished by color, shape, and the like, which is not specifically limited herein.
As an implementation manner, the marking of the distribution positions with the preset graphs corresponding to the attribute categories to which the clustering results belong, and the generating of the multidimensional clustering distribution map, may include:
determining preset graphs corresponding to the clustering results respectively based on attribute categories corresponding to the clustering results respectively;
Determining the graph sizes of preset graphs corresponding to the clustering results respectively based on the number of the clustering objects contained in the clustering results respectively;
Marking the distribution positions with the preset graphs corresponding to the clustering results at the corresponding graph sizes, and generating the multidimensional clustering distribution map.
In practical application, the size of a preset graph in the multidimensional clustering distribution map may be proportional to the number of clustered objects in the corresponding clustering result: the more clustered objects a clustering result contains, the larger its preset graph, and vice versa.
Alternatively, quantity ranges corresponding to different sizes may be preset. For example, when the number of objects is in the range of 0 to 100, the graph size of the preset graph may be 0.01 mm; when the number of objects is 100 to 1000, the graph size is 0.02 mm; when the number of objects is 1000 to 10000, the graph size is 0.04 mm; and so on. The ranges may be set according to the actual situation and are not specifically limited herein.
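The range-based size lookup can be sketched with a sorted list of range boundaries. The ranges and sizes come from the example in the text; two points are assumptions because the text leaves them open: each range is treated as half-open (a count of exactly 100 falls in the 100 to 1000 range), and counts of 10000 or more are clamped to the largest listed size.

```python
from bisect import bisect_right

# Range boundaries and sizes from the example in the text.
UPPER_BOUNDS = [100, 1000, 10000]
SIZES_MM = [0.01, 0.02, 0.04]


def marker_size_mm(object_count):
    """Map the number of clustered objects to a preset graph size in mm."""
    if object_count < 0:
        raise ValueError("object_count must be non-negative")
    index = bisect_right(UPPER_BOUNDS, object_count)
    # Assumption: the text gives no size beyond the last range, so clamp.
    index = min(index, len(SIZES_MM) - 1)
    return SIZES_MM[index]
```

For example, `marker_size_mm(50)` gives 0.01 and `marker_size_mm(5000)` gives 0.04. A proportional mapping, as described in the preceding paragraph, would replace the lookup with a scale factor.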
305: Outputting the multidimensional clustering distribution map.
The multi-dimensional clustering distribution diagram is used for determining the query condition of the target object based on the distribution condition of the clustering results.
As shown in fig. 4, classifying the plurality of databases according to the operation attribute of each database yields four operation categories: insert, select, update, and display. The insert operation may be set to correspond to a circular icon, the select operation to a triangular icon, the update operation to a diamond icon, and the display operation to a star icon.
In practical application, the proportions of the different operation types among the database access operations can reflect the running state of the instances in the database. For example, a large number of insert operations may degrade database performance. To find the cause that actually affects the database, whether the degradation is due to the large number of insert operations can be judged from the proportion of the insert operation category in the at least one target clustering result that satisfies the anomaly threshold of the target parameters. Therefore, further classifying the clustering results by object attribute in the multidimensional clustering distribution map intuitively reflects the object attribute dimension, so that the range of the query condition can be narrowed further and target clustering results that better match the retrieval requirement can be screened out, improving the accuracy of the query condition.
As an alternative embodiment, the method may further include:
Determining an association parameter associated with the plurality of target parameters;
Acquiring associated data of the plurality of objects corresponding to the associated parameters respectively;
sorting the objects according to the size of the associated data to generate an object list;
And outputting the object list.
In the embodiment of the application, the data dimensions available for screening the target object are extended by adding or removing association parameters associated with the target parameters. This avoids the limit that the multidimensional clustering distribution map places on the data dimensions for screening the target object and improves the accuracy of the query condition, thereby further improving the retrieval precision of the target object.
Further, in order to display associated parameter information of a plurality of objects under different attribute categories, when the multidimensional clustering distribution graph includes a plurality of attribute categories corresponding to the plurality of clustering results, the method may further include:
Based on the associated data corresponding to each of a plurality of objects in the same attribute category, respectively determining the maximum associated data in the attribute categories;
Generating a correlation distribution map based on the maximum correlation data corresponding to each of the attribute categories;
And outputting the association distribution diagram.
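The object list and the association distribution map described in the steps above can be sketched as two small aggregations. The record fields (`name`, `category`, `exec_count`) are hypothetical, with `exec_count` standing in for the associated data; descending order is also an assumption, since the text only says the objects are sorted by the size of the associated data.

```python
def build_object_list(objects, key="exec_count"):
    """Sort objects by their associated data, largest first (assumed order)."""
    return sorted(objects, key=lambda o: o[key], reverse=True)


def max_by_category(objects, key="exec_count"):
    """Return the maximum associated data found under each attribute category."""
    maxima = {}
    for obj in objects:
        category = obj["category"]
        if category not in maxima or obj[key] > maxima[category]:
            maxima[category] = obj[key]
    return maxima


# Made-up sample data for illustration only.
objects = [
    {"name": "db1", "category": "insert", "exec_count": 50},
    {"name": "db2", "category": "insert", "exec_count": 900},
    {"name": "db3", "category": "select", "exec_count": 300},
]
object_list = build_object_list(objects)
association_map = max_by_category(objects)
```

The per-category maxima are the values that an association distribution map would plot, one bar or point per attribute category.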
As an alternative implementation manner, combining the plurality of target parameters with their association parameters increases the data dimensions of the query condition and can thereby further improve its accuracy. Thus, the method may further comprise:
And establishing a linkage relation among the multidimensional clustering distribution map, the object list and the association distribution map based on the association relation between the target parameter and the association parameter.
Establishing the linkage relation among the multidimensional clustering distribution map, the object list, and the association distribution map lays a foundation for offering the user interactive operations over more data dimensions, so that target objects can be screened more accurately and simply.
As shown in fig. 5, (A) is the multidimensional clustering distribution map corresponding to a plurality of databases, (B) is the list of SQL templates generated when the association parameter is the number of executions of database operations, and (C) is the association distribution map of the maximum number of executions among the objects belonging to each operation category. It can be seen from (C) that the maximum number of executions differs across operation categories, which indicates a certain association between operation category and number of executions.
In practical application, the object list shown in (B) may not only include the plurality of databases ordered by number of executions, but may also display the associated data, parameter data, and other related information of each database, so as to give the user more useful information for reference and analysis. This is not limited herein and may be set according to actual requirements.
In the embodiment of the application, the clustering dimensions of multidimensional clustering are generally limited by existing clustering algorithms, so that data of higher dimension cannot be clustered. When the retrieval requirement involves many data dimensions of the objects, this limitation can be mitigated through object attribute features, association parameters, and the like. Reference information in more data dimensions is thereby provided to the user: displaying the multidimensional clustering distribution map, the object list, and the association distribution map in a visual interface gives the user data in more dimensions, which improves the accuracy of the determined query condition and its match with the retrieval requirement, and thus greatly improves the accuracy of retrieving the target object.
Fig. 6 is a flowchart of another embodiment of a data processing method according to an embodiment of the present application. The method may comprise the steps of:
601: a plurality of target parameters for a plurality of objects are determined.
602: And carrying out multidimensional clustering on the plurality of objects according to the parameter data of the plurality of target parameters corresponding to the plurality of objects respectively to obtain a plurality of clustering results.
603: And generating a multidimensional clustering distribution diagram according to the distribution condition of the clustering results in a clustering space coordinate system established based on the target parameters.
604: Outputting the multidimensional clustering distribution map.
The multi-dimensional clustering distribution diagram is used for determining the query condition of the target object based on the distribution condition of the clustering results.
The implementation of steps 601 to 604 has been described in detail above and is not repeated here.
605: Based on a selection operation for any distribution area containing at least one clustering result in the multidimensional clustering distribution graph, a query instruction taking the at least one clustering result as a query condition is generated.
In practical application, the user can interact with the multidimensional clustering distribution map output in the visual interactive interface. According to the distribution positions of the clustering results displayed in the multidimensional clustering distribution map, the user determines at least one clustering result as the query condition and, through a selection operation, selects the clustering area containing the at least one clustering result, whereupon a query instruction taking the at least one clustering result as the query condition is generated.
In practical application, when the determined clustering results are scattered, they may fall into a plurality of clustering areas, and the user may select the plurality of clustering areas simultaneously to generate the query instruction.
Optionally, the preset graph marking a clustering result in the multidimensional clustering distribution map may serve as a selection control, so that to select a clustering result as the query condition the user only needs to select the selection control marking its distribution position, which is not particularly limited herein.
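Turning a region selection into a query instruction can be sketched as below for a two-dimensional map. Everything here is illustrative: the rectangular region shape, the inclusive bounds, the `cluster_positions` mapping, and the dictionary layout of the returned instruction are assumptions, since the text does not define a concrete instruction format.

```python
def in_region(position, region):
    """Check whether a 2D position lies inside a rectangular region.

    `region` is ((x_min, x_max), (y_min, y_max)); bounds are inclusive.
    """
    (x_min, x_max), (y_min, y_max) = region
    x, y = position
    return x_min <= x <= x_max and y_min <= y <= y_max


def build_query_instruction(cluster_positions, regions):
    """Collect the clusters inside any selected region into a query instruction."""
    selected = sorted(
        cluster_id
        for cluster_id, position in cluster_positions.items()
        if any(in_region(position, region) for region in regions)
    )
    if not selected:
        return None  # nothing selected, so no instruction is generated
    # Hypothetical instruction layout.
    return {"action": "query", "condition": {"cluster_ids": selected}}


# Positions are (average time consumption, average scanned rows); made-up values.
cluster_positions = {0: (800.0, 120.0), 1: (15000.0, 300.0), 2: (2000.0, 4500.0)}
# Two separately selected areas, as when the clustering results are scattered.
regions = [
    ((10000.0, 20000.0), (0.0, 1000.0)),
    ((0.0, 5000.0), (1000.0, 5000.0)),
]
instruction = build_query_instruction(cluster_positions, regions)
```

Accepting a list of regions mirrors the case where the user selects several clustering areas at once.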
606: And obtaining at least one target object matched with the at least one clustering result based on the query instruction.
In practical application, the clustering quality also affects the accuracy of the query condition: the smaller the clustering threshold range, the better the target objects retrieved by the query instruction generated from the clustering results match the user's retrieval requirement, and vice versa. Therefore, in the embodiment of the application, the clustering quality can be improved by improving the clustering algorithm, so as to further improve the accuracy of the query condition.
607: And outputting the at least one target object and parameter data corresponding to the at least one target object respectively.
Outputting the at least one retrieved target object and its corresponding parameter data in the visual interaction interface lets the user obtain the retrieval result conveniently and intuitively, and analyze in depth how well the retrieved target objects match the retrieval requirement based on the displayed parameter data of each target object, so as to determine whether the query condition for the target object needs to be adjusted.
As an alternative embodiment, the outputting the at least one target object and the parameter data corresponding to the at least one target object respectively may include:
sequencing the at least one target object according to the size of parameter data corresponding to the preset target parameters to generate a target object list;
and outputting the target object list.
As another optional implementation manner, when the multidimensional clustering distribution graph includes a plurality of attribute categories corresponding to the plurality of clustering results, the method may further include:
Based on the parameter data of the preset target parameters corresponding to at least one target object in the same attribute category, respectively determining the maximum parameter data in the attribute categories;
generating a preset target distribution diagram based on the maximum parameter data corresponding to the attribute categories respectively;
outputting the preset target distribution diagram.
In practical application, the preset target parameters may be selected according to the importance degree of each target parameter to the query condition, or any one of the target parameters may be selected as the preset target parameter, which is not limited herein.
In the embodiment of the application, the target object list and the preset target distribution map are generated in a manner similar to the object list and the association distribution map, which is not repeated here. The multidimensional clustering distribution map, the target object list, and the preset target distribution map are displayed simultaneously in the same interactive interface, so that the accuracy of the query condition determined by the user can be verified in reverse against the retrieval result. Based on the verification result, the user can judge whether the accuracy of the retrieval result needs to be improved, for example by increasing the data dimensions or object features used to retrieve the target object.
In practical application, the user can trigger to generate and output the object list and the associated distribution map corresponding to the associated parameters associated with the target parameters after generating the multidimensional clustering distribution map corresponding to the objects. In some embodiments, after establishing the linkage relationship among the multidimensional clustering distribution graph, the object list and the association distribution graph based on the association relationship between the target parameter and the association parameter, the method may further include:
generating a query instruction taking at least one clustering result as a query condition based on a selection operation of any distribution area containing the at least one clustering result in the multi-dimensional clustering distribution diagram;
Based on the query instruction, at least one target object matched with the at least one clustering result is obtained;
Updating the object list and the association distribution map in linkage according to the at least one target object.
Since the linkage relation among the multidimensional clustering distribution map, the object list, and the association distribution map is established in advance, when the user triggers a selection operation on a clustering area containing at least one clustering result, at least one first target object matching the at least one clustering result is obtained. The association parameter is then used as a second-level query condition to further screen the at least one first target object, and at least one second target object that satisfies the anomaly threshold corresponding to the association parameter is determined among the at least one first target object. Since the object list sorts the at least one first target object based on the size of the parameter data, the first M or the last M first target objects in the sorted order may be selected as the at least one second target object, where M > 0, so as to locate and track the target object more accurately.
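The second-level screening of the first target objects can be sketched as a sort followed by taking the first or last M entries. The field name `exec_count` and the descending sort order are hypothetical; the text states only that the objects are sorted by size and that the first M or the last M may be chosen, with M > 0.

```python
def second_level_screen(first_targets, m, key="exec_count", take_first=True):
    """Pick the first M or last M first-level targets from the sorted list."""
    if m <= 0:
        raise ValueError("M must be greater than 0")
    # Assumed ordering: largest value first.
    ordered = sorted(first_targets, key=lambda o: o[key], reverse=True)
    return ordered[:m] if take_first else ordered[-m:]


# Made-up first-level targets for illustration only.
first_targets = [
    {"name": "db1", "exec_count": 50},
    {"name": "db2", "exec_count": 900},
    {"name": "db3", "exec_count": 300},
]
top_two = second_level_screen(first_targets, m=2)
bottom_one = second_level_screen(first_targets, m=1, take_first=False)
```

A threshold on the association parameter, as mentioned in the text, could be applied instead of a fixed M by filtering the sorted list.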
The embodiment of the application is not limited by the data dimension of the associated parameters, the associated parameters which are important for the retrieval requirement can be preferentially selected as the newly added data dimension of the screening target object according to the requirement of the retrieval precision, and when the associated parameters are more, the priority of each associated parameter can be set so as to determine the retrieval priority of the corresponding query condition according to the priority.
To further simplify user operation and improve interaction efficiency, the association distribution map corresponding to the association parameter can be used to determine the retrieval condition quickly and accurately, further improving the retrieval precision of the target object. The updating of the object list and the association distribution map in linkage according to the at least one target object may include:
Based on the associated data sizes respectively corresponding to the at least one target object, obtaining a sequencing result of the at least one target object;
updating the object list based on the sorting result;
And updating the association distribution diagram based on the maximum association data of the at least one target object under the attribute categories.
Through the association distribution map, the user can learn how strongly the association parameter affects the objects under different attribute categories. Taking (C) in fig. 5 as an example, the association distribution map of the databases allows the number of executions under different operation types to be analyzed further: the higher the number of executions, the greater the contribution to the database abnormality, and the lower the number, the smaller the contribution. This makes it possible to analyze whether an operation type is one of the causes of the abnormality and provides richer data dimensions for further analysis of the specific cause.
In the embodiment of the application, in order to simplify the user's retrieval operation and improve retrieval efficiency, the retrieval service for the target object is provided to the user through visual interaction with the multidimensional clustering distribution map. The user selects at least one clustering result as the query condition according to the distribution of the clustering results displayed in the multidimensional clustering distribution map, a query instruction is generated to screen out at least one target object matching the at least one clustering result, and the at least one target object and its related parameter data are displayed visually. In the embodiment of the application, the visual display of the at least one target object includes, but is not limited to, a target object list; any other visual display manner may be used and is not specifically limited herein.
Further, to obtain a more accurate query condition that better matches the retrieval requirement, the data dimensions of the query condition can be increased by establishing a linkage relation among the object list, the association distribution map, and the multidimensional clustering distribution map, so that the association parameter and the plurality of target parameters are combined as the data dimensions of the query condition. With the linkage relation set, after the user triggers a selection operation on a clustering area, the association parameter and the maximum associated data are used in linkage as data dimensions of the query condition at the same time. This simplifies the interaction between the user and the visualized multidimensional data, improves interaction efficiency, and greatly improves the retrieval precision of the target object.
Fig. 7 is a schematic structural diagram of an embodiment of a data processing apparatus according to an embodiment of the present application. The apparatus may include:
A determining module 701, configured to determine a plurality of target parameters of a plurality of objects;
and the clustering module 702 is configured to perform multidimensional clustering on the plurality of objects according to parameter data of a plurality of target parameters corresponding to the plurality of objects, so as to obtain a plurality of clustering results.
A distribution map generating module 703, configured to generate a multidimensional clustering distribution map according to distribution situations of the multiple clustering results in a clustering spatial coordinate system established based on the multiple target parameters.
A first output module 704, configured to output the multi-dimensional clustering distribution map.
The multi-dimensional clustering distribution diagram is used for determining the query condition of the target object based on the distribution condition of the clustering results.
As an implementation manner, the profile generation module 703 may specifically be used for:
Establishing a clustering space coordinate system based on the plurality of target parameters;
determining distribution positions of the plurality of clustering results in the clustering space coordinate system respectively;
And marking the distribution positions by a preset graph, and generating the multi-dimensional clustering distribution diagram.
As an alternative embodiment, the preset graphic may include a scatter point; the profile generation module 703 may specifically be configured to:
And marking the distribution positions in the clustering space coordinate system by using the scattered points, and generating a multidimensional scattered point diagram.
The specific implementation of the embodiments of the present application has been described in detail above and is not repeated here.
In the embodiment of the application, a plurality of target parameters of a plurality of objects that the user is concerned with are determined based on the user's retrieval requirement for the target object. The target parameters establish the clustering dimensions, and the objects are clustered in multiple dimensions based on the parameter data corresponding to the target parameters to obtain a plurality of clustering results. Cluster analysis of the objects helps reveal the distribution structure of the objects across different data dimensions and the common characteristics of their multidimensional data. Presenting the results visually as a multidimensional clustering distribution map shows the distribution of the objects' multidimensional data in the clustering dimensions intuitively and clearly, and gives the user an accurate reference for quickly determining a query condition that matches the retrieval requirement for the target object, thereby further improving the retrieval precision of the target object based on the query condition.
Fig. 8 is a schematic structural diagram of an embodiment of a data processing apparatus according to an embodiment of the present application. The apparatus may include:
A determining module 801 is configured to determine a plurality of target parameters of a plurality of objects.
And a clustering module 802, configured to perform multidimensional clustering on the multiple objects according to parameter data of multiple target parameters corresponding to the multiple objects, so as to obtain multiple clustering results.
The clustering module 802 may include:
an object attribute determining unit 811 is configured to determine object attributes of the plurality of objects, respectively.
The clustering result obtaining unit 812 is configured to perform multidimensional clustering on the plurality of objects according to the parameter data of the target parameters respectively corresponding to the plurality of objects and the object attributes respectively corresponding to the plurality of objects, so as to obtain a plurality of clustering results.
And the distribution map generating module 803 is used for generating a multidimensional clustering distribution map according to the distribution condition of the clustering results in the clustering space coordinate system established based on the target parameters.
A first output module 804 is configured to output the multi-dimensional clustering distribution map.
As an optional implementation manner, in performing multidimensional clustering on the plurality of objects according to the parameter data of the target parameters respectively corresponding to the plurality of objects and the object attributes respectively corresponding to the plurality of objects, the clustering result obtaining unit 812 may specifically be configured for:
classifying the objects based on object attributes respectively corresponding to the objects to obtain a plurality of attribute categories;
And respectively carrying out multidimensional clustering on the plurality of objects under the attribute categories according to the parameter data of the target parameters corresponding to the plurality of objects to obtain a plurality of clustering results.
As an implementation manner, the profile generation module 803 may specifically be configured to:
Establishing a clustering space coordinate system based on the plurality of target parameters;
determining distribution positions of the plurality of clustering results in the clustering space coordinate system respectively;
Determining preset graphs corresponding to the attribute categories respectively;
And respectively marking the distribution positions by using the attribute categories to which the clustering results belong to corresponding preset graphs, and generating the multidimensional clustering distribution map.
As an implementation manner, in marking the distribution positions with the preset graphs corresponding to the attribute categories to which the clustering results belong and generating the multidimensional clustering distribution map, the distribution map generating module 803 may specifically be configured for:
determining preset graphs corresponding to the clustering results respectively based on attribute categories corresponding to the clustering results respectively;
Determining the graph sizes of preset graphs corresponding to the clustering results respectively based on the number of the clustering objects contained in the clustering results respectively;
And marking the distribution positions of preset patterns corresponding to the clustering results by corresponding pattern sizes respectively, and generating the multidimensional clustering distribution map.
As an alternative embodiment, the apparatus may further include:
an associated parameter determining module, configured to determine an associated parameter associated with the plurality of target parameters;
an associated data acquisition module, configured to acquire the associated data of each of the plurality of objects for the associated parameter;
an object list generation module, configured to sort the plurality of objects by the magnitude of their associated data to generate an object list;
a second output module, configured to output the object list.
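The object list generation can be sketched in a few lines. The sort direction is not stated in the text, so descending order is assumed here, and the associated parameter name `exec_count` is hypothetical.

```python
def build_object_list(objects, associated_key, descending=True):
    """Sort objects by the magnitude of their associated data."""
    return sorted(objects, key=lambda obj: obj[associated_key], reverse=descending)


objects = [
    {"name": "q1", "exec_count": 50},
    {"name": "q2", "exec_count": 200},
    {"name": "q3", "exec_count": 120},
]
object_list = build_object_list(objects, "exec_count")
names = [obj["name"] for obj in object_list]
```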
Further, in order to display the associated-parameter information of objects under different attribute categories, when the multidimensional clustering distribution map includes a plurality of attribute categories corresponding to the plurality of clustering results, the apparatus may further include:
a maximum association data determining module, configured to determine the maximum association data under each attribute category based on the association data of the objects under that attribute category;
an association distribution map generation module, configured to generate an association distribution map based on the maximum association data corresponding to each of the attribute categories;
a third output module, configured to output the association distribution map.
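The data behind the association distribution map can be sketched as a per-category maximum. The field names `category` and `exec_count` are hypothetical, and the result is left as a plain mapping that a chart component could render, for example as one bar per attribute category.

```python
def max_associated_by_category(objects, associated_key):
    """Return the largest associated-data value found in each attribute category."""
    best = {}
    for obj in objects:
        category = obj["category"]
        value = obj[associated_key]
        if category not in best or value > best[category]:
            best[category] = value
    return best


objects = [
    {"name": "q1", "category": "SELECT", "exec_count": 50},
    {"name": "q2", "category": "SELECT", "exec_count": 200},
    {"name": "q3", "category": "UPDATE", "exec_count": 120},
]
association_chart = max_associated_by_category(objects, "exec_count")
```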
As an alternative implementation, the plurality of target parameters may be combined with their associated parameter to increase the data dimension of the query condition and thereby further improve its accuracy. Thus, the apparatus may further include:
an association module, configured to establish a linkage relation among the multidimensional clustering distribution map, the object list and the association distribution map based on the association relation between the target parameters and the associated parameter.
The specific implementation of this embodiment has been described in detail above and is not repeated here.
In this embodiment of the application, the number of dimensions that multidimensional clustering can handle is generally limited by existing clustering algorithms, so data of higher dimension cannot be clustered directly. When a search requirement involves many data dimensions of the objects, this limitation can be worked around by additionally using the attribute characteristics of the objects, the associated parameters, and the like. Reference information covering more data dimensions is thus provided to the user: by displaying the multidimensional clustering distribution map, the object list and the association distribution map in a visual interface, the user obtains information in more dimensions, which improves the accuracy of the determined query condition and its match with the search requirement, and in turn the accuracy of searching for the target object.
Fig. 9 is a schematic structural diagram of another embodiment of a data processing apparatus according to an embodiment of the present application. The apparatus may include:
a determining module 901, configured to determine a plurality of target parameters of a plurality of objects;
a clustering module 902, configured to perform multidimensional clustering on the plurality of objects according to the parameter data of the plurality of target parameters corresponding to each object, to obtain a plurality of clustering results;
a distribution map generating module 903, configured to generate a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the plurality of target parameters;
a first output module 904, configured to output the multidimensional clustering distribution map, where the multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the clustering results;
a first query instruction generating module 905, configured to generate, based on a selection operation on any distribution region of the multidimensional clustering distribution map that contains at least one clustering result, a query instruction that uses the at least one clustering result as the query condition;
a first target object obtaining module 906, configured to obtain, based on the query instruction, at least one target object that matches the at least one clustering result;
a fourth output module 907, configured to output the at least one target object and the parameter data corresponding to each target object.
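The path from a region selection to the matched target objects can be sketched as below. It assumes a rectangular selection given as one `(low, high)` range per axis and a dictionary-shaped query instruction; the text fixes neither the shape of the region nor the format of the query instruction, and all identifiers are hypothetical.

```python
def clusters_in_region(markers, region):
    """Return the ids of clusters whose position lies inside the selected region.

    `region` holds one (low, high) range per axis of the coordinate system.
    """
    selected = []
    for marker in markers:
        inside = all(lo <= v <= hi for v, (lo, hi) in zip(marker["position"], region))
        if inside:
            selected.append(marker["cluster_id"])
    return selected


def build_query(cluster_ids):
    """Build a query instruction that uses the clustering results as the condition."""
    return {"filter": {"cluster_id": {"in": sorted(cluster_ids)}}}


def run_query(query, objects):
    """Return the objects that belong to any of the queried clustering results."""
    wanted = set(query["filter"]["cluster_id"]["in"])
    return [obj for obj in objects if obj["cluster_id"] in wanted]


# Hypothetical distribution map markers and clustered objects.
markers = [
    {"cluster_id": 0, "position": (10.0, 1.0)},
    {"cluster_id": 1, "position": (900.0, 80.0)},
    {"cluster_id": 2, "position": (950.0, 95.0)},
]
objects = [
    {"name": "q1", "cluster_id": 0},
    {"name": "q2", "cluster_id": 1},
    {"name": "q3", "cluster_id": 2},
    {"name": "q4", "cluster_id": 1},
]

selected = clusters_in_region(markers, region=[(500.0, 1000.0), (50.0, 90.0)])
query = build_query(selected)
targets = run_query(query, objects)
```

Here the selection covers only cluster 1, so the targets are the two objects assigned to it.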
As an alternative embodiment, the fourth output module 907 may be specifically configured to:
sort the at least one target object by the magnitude of the parameter data corresponding to a preset target parameter to generate a target object list;
output the target object list.
As another optional implementation, when the multidimensional clustering distribution map includes a plurality of attribute categories corresponding to the plurality of clustering results, the apparatus may further include:
a maximum parameter data determining module, configured to determine the maximum parameter data under each attribute category based on the parameter data of the preset target parameter corresponding to the at least one target object under that attribute category;
a preset target distribution map generation module, configured to generate a preset target distribution map based on the maximum parameter data corresponding to each of the attribute categories;
a fifth output module, configured to output the preset target distribution map.
In some embodiments, after the linkage relation among the multidimensional clustering distribution map, the object list and the association distribution map is established based on the association relation between the target parameters and the associated parameter, the apparatus may further include:
a second query instruction generation module, configured to generate, based on a selection operation on any distribution region of the multidimensional clustering distribution map that contains at least one clustering result, a query instruction that uses the at least one clustering result as the query condition;
a second target object acquisition module, configured to obtain, based on the query instruction, at least one target object that matches the at least one clustering result;
a linkage module, configured to update the object list and the association distribution map in linkage according to the at least one target object.
In order to further simplify user operation and improve interaction efficiency, the search condition can be determined quickly and accurately through the association distribution map corresponding to the associated parameter, which further improves the search accuracy for the target object. The linkage module may be specifically configured to:
obtain a sorting result of the at least one target object based on the magnitude of the associated data corresponding to each target object;
update the object list based on the sorting result;
update the association distribution map based on the maximum association data of the at least one target object under each of the attribute categories.
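The linked update can be sketched with a small class that keeps the object list and the association distribution data in step with the current selection. The class name, the field names and the descending sort order are assumptions made for illustration only.

```python
class LinkedViews:
    """Keeps an object list and association chart data tied to a selection."""

    def __init__(self, objects, associated_key):
        self.objects = objects
        self.key = associated_key
        self.object_list = []
        self.association_chart = {}
        self.refresh(objects)  # initially show every object

    def refresh(self, targets):
        # Object list: targets sorted by associated data (descending assumed).
        self.object_list = sorted(targets, key=lambda o: o[self.key], reverse=True)
        # Association chart: maximum associated data per attribute category.
        chart = {}
        for obj in targets:
            value = obj[self.key]
            chart[obj["category"]] = max(chart.get(obj["category"], value), value)
        self.association_chart = chart

    def on_selection(self, cluster_ids):
        """Called when clusters are selected in the clustering distribution map."""
        targets = [o for o in self.objects if o["cluster_id"] in cluster_ids]
        self.refresh(targets)


objects = [
    {"name": "q1", "category": "SELECT", "cluster_id": 0, "exec_count": 50},
    {"name": "q2", "category": "SELECT", "cluster_id": 1, "exec_count": 200},
    {"name": "q3", "category": "UPDATE", "cluster_id": 1, "exec_count": 120},
    {"name": "q4", "category": "UPDATE", "cluster_id": 2, "exec_count": 300},
]
views = LinkedViews(objects, "exec_count")
initial_chart = dict(views.association_chart)
views.on_selection({1})
```

After the selection of cluster 1, both dependent views reflect only `q2` and `q3`.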
The specific implementation of this embodiment has been described in detail above and is not repeated here.
In this embodiment of the application, in order to simplify the user's search operation and improve search efficiency, a search service for target objects is provided to the user through visual interaction with the multidimensional clustering distribution map. According to the distribution of the clustering results displayed in the map, the user selects at least one clustering result as the query condition; a query instruction is then generated to screen out at least one target object matching the at least one clustering result, and the at least one target object and its related parameter data are displayed visually. In practice, the visual display of the at least one target object includes, but is not limited to, a target object list; other visual display manners may also be used and are not specifically limited here.
Further, in order to obtain a more accurate query condition that better matches the search requirement, the data dimension of the query condition can be increased by establishing a linkage relation among the object list, the association distribution map and the multidimensional clustering distribution map, so that the associated parameter is combined with the plurality of target parameters as data dimensions of the query condition. With the linkage relation in place, once the user triggers a selection operation on a clustering region, the associated parameter and the maximum association data are used, in linkage, as data dimensions of the query condition at the same time. This simplifies the interaction between the user and the visualized multidimensional data, improves interaction efficiency, and improves the search precision for the target object.
Fig. 10 is a schematic structural diagram of an embodiment of a computer device according to an embodiment of the present application, where the computer device may include a processing component 1001, a display component 1002, and a storage component 1003.
The storage component 1003 is configured to store one or more computer instructions; the one or more computer instructions are configured to be invoked by the processing component 1001 for execution.
The processing component 1001 may be configured to:
determining a plurality of target parameters of a plurality of objects;
performing multidimensional clustering on the plurality of objects according to the parameter data of the plurality of target parameters corresponding to each object, to obtain a plurality of clustering results;
generating a multidimensional clustering distribution map according to the distribution of the plurality of clustering results in a clustering space coordinate system established based on the plurality of target parameters.
The display component 1002 outputs the multidimensional clustering distribution map; the multidimensional clustering distribution map is used for determining the query condition of a target object based on the distribution of the clustering results.
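The processing steps above can be sketched end to end as follows. The clustering algorithm is not named in the text, so a small k-means with deterministic initialization (the first `k` points) is used purely as an assumption; the two axes stand for hypothetical target parameters such as an average row count and an average elapsed time.

```python
from math import dist
from statistics import fmean


def kmeans(points, k, iterations=20):
    """Plain k-means; initial centroids are the first k points (assumption)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(tuple(fmean(axis) for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def distribution_map(points, k, axes):
    """Cluster the points and describe each cluster by position and size."""
    labels, centroids = kmeans(points, k)
    markers = [
        {"cluster_id": j, "position": centroids[j], "count": labels.count(j)}
        for j in range(k)
    ]
    return {"axes": list(axes), "markers": markers, "labels": labels}


# Hypothetical parameter data: (avg_rows, avg_ms) per object.
points = [(1.0, 1.0), (1.0, 2.0), (10.0, 10.0), (10.0, 11.0)]
result = distribution_map(points, k=2, axes=["avg_rows", "avg_ms"])
```

In practice the parameter data would usually be normalized per axis before clustering, since parameters measured in different units would otherwise contribute unequally to the distance.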
The processing component 1001 may include one or more processors that execute computer instructions to perform all or part of the steps of the methods described above. The processing component may also be implemented as one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic elements for executing the methods described above.
The storage component 1003 is configured to store various types of data to support operations in the server. The storage component may be implemented by any type of volatile or nonvolatile memory device, or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
Of course, the computer device may also include other components, such as input/output interfaces and communication components.
The input/output interface provides an interface between the processing component and a peripheral interface module, which may be an output device, an input device, etc.
The communication component is configured to facilitate communication between the server and other devices, either wired or wireless, such as communication with a terminal.
The embodiment of the application also provides a computer readable storage medium, which stores a computer program, and the computer program can implement the data processing methods of the embodiments shown in fig. 1, 3 and 6 when executed by a computer.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The apparatus embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (17)

1. A method of data processing, comprising:
determining a plurality of target parameters of a plurality of SQL templates, wherein the plurality of target parameters comprise the average number of scanned rows and the average time consumption;
according to parameter data of a plurality of target parameters respectively corresponding to the SQL templates, carrying out multidimensional clustering on the SQL templates to obtain a plurality of clustering results;
Generating a multidimensional clustering distribution diagram according to the distribution condition of the clustering results in a clustering space coordinate system established based on the target parameters;
Outputting the multidimensional clustering distribution map; the multidimensional clustering distribution diagram is used for determining the query condition of the target SQL template based on the distribution condition of the clustering results.
2. The method of claim 1, wherein generating a multi-dimensional cluster profile from the distribution of the plurality of cluster results in a cluster space coordinate system established based on the plurality of target parameters comprises:
Establishing a clustering space coordinate system based on the plurality of target parameters;
determining distribution positions of the plurality of clustering results in the clustering space coordinate system respectively;
And marking the distribution positions by a preset graph, and generating the multi-dimensional clustering distribution diagram.
3. The method of claim 2, wherein the preset graph comprises scatter points; marking the distribution positions in the clustering space coordinate system with the preset graph and generating the multidimensional clustering distribution map comprises:
marking the distribution positions in the clustering space coordinate system with the scatter points, and generating a multidimensional scatter plot.
4. The method of claim 1, wherein the multi-dimensional clustering the plurality of SQL templates according to the parameter data of the plurality of target parameters corresponding to the plurality of SQL templates, respectively, to obtain a plurality of clustering results includes:
Determining operation attributes of the SQL templates respectively;
And carrying out multidimensional clustering on the SQL templates according to the parameter data of the target parameters respectively corresponding to the SQL templates and the operation attributes respectively corresponding to the SQL templates to obtain a plurality of clustering results.
5. The method of claim 4, wherein the multi-dimensional clustering the plurality of SQL templates according to the parameter data of the target parameters respectively corresponding to the plurality of SQL templates and the operation attributes respectively corresponding to the plurality of SQL templates, to obtain a plurality of clustering results comprises:
Classifying the SQL templates based on the operation attributes respectively corresponding to the SQL templates to obtain a plurality of attribute categories;
And respectively carrying out multidimensional clustering on the SQL templates under the attribute categories according to the parameter data of the target parameters respectively corresponding to the SQL templates to obtain a plurality of clustering results.
6. The method of claim 5, wherein generating a multi-dimensional cluster profile based on the distribution of the plurality of cluster results in a cluster space coordinate system established based on the plurality of target parameters comprises:
Establishing a clustering space coordinate system based on the plurality of target parameters;
determining distribution positions of the plurality of clustering results in the clustering space coordinate system respectively;
Determining preset graphs corresponding to the attribute categories respectively;
And respectively marking the distribution positions by using the attribute categories to which the clustering results belong to corresponding preset graphs, and generating the multidimensional clustering distribution map.
7. The method of claim 6, wherein the identifying the distribution locations with the attribute categories to which the plurality of clustering results belong corresponding to a preset graph, respectively, and generating the multi-dimensional clustering distribution graph comprises:
determining preset graphs corresponding to the clustering results respectively based on attribute categories corresponding to the clustering results respectively;
Determining the graph size of the preset graph corresponding to each clustering result based on the number of clustered SQL templates contained in that clustering result;
marking each distribution position with the preset graph of the corresponding clustering result at the corresponding graph size, and generating the multidimensional clustering distribution map.
8. The method according to claim 1, wherein the method further comprises:
Determining an association parameter associated with the plurality of target parameters;
Acquiring associated data of the plurality of SQL templates corresponding to the associated parameters respectively;
Ordering the SQL templates according to the associated data size to generate an SQL template list;
and outputting the SQL template list.
9. The method of claim 8, wherein when the multidimensional clustering profile includes a plurality of attribute categories corresponding to the plurality of clustering results, the method further comprises:
based on the associated data corresponding to each of a plurality of SQL templates under the same attribute category, respectively determining the maximum associated data under the attribute categories;
Generating a correlation distribution map based on the maximum correlation data corresponding to each of the attribute categories;
And outputting the association distribution diagram.
10. The method according to claim 9, wherein the method further comprises:
and establishing a linkage relation among the multidimensional clustering distribution map, the SQL template list and the association distribution map based on the association relation between the target parameter and the association parameter.
11. The method of claim 1, wherein after said outputting said multi-dimensional cluster map, further comprising:
generating a query instruction taking at least one clustering result as a query condition based on a selection operation of any distribution area containing the at least one clustering result in the multi-dimensional clustering distribution diagram;
Based on the query instruction, obtaining at least one target SQL template matched with the at least one clustering result;
And outputting the at least one target SQL template and the parameter data corresponding to the at least one target SQL template respectively.
12. The method of claim 11, wherein outputting the at least one target SQL template and the parameter data corresponding to the at least one target SQL template, respectively, comprises:
sequencing the at least one target SQL template according to the parameter data size corresponding to the preset target parameters to generate a target SQL template list;
and outputting the target SQL template list.
13. The method of claim 12, wherein when the multidimensional clustering profile includes a plurality of attribute categories corresponding to the plurality of clustering results, the method further comprises:
Based on the parameter data of the preset target parameters corresponding to at least one target SQL template under the same attribute category, respectively determining the maximum parameter data under the attribute categories;
generating a preset target distribution diagram based on the maximum parameter data corresponding to the attribute categories respectively;
outputting the preset target distribution diagram.
14. The method according to claim 10, wherein after establishing the linkage relationship of the multidimensional clustering distribution graph, the SQL template list, and the association distribution graph based on the association relationship of the target parameter and the association parameter, the method further comprises:
generating a query instruction taking at least one clustering result as a query condition based on a selection operation of any distribution area containing the at least one clustering result in the multi-dimensional clustering distribution diagram;
Based on the query instruction, obtaining at least one target SQL template matched with the at least one clustering result;
and updating the SQL template list and the association distribution diagram in a linkage way according to the at least one target SQL template.
15. The method of claim 14, wherein the updating the SQL template list and the association profile based on the at least one target SQL template linkage comprises:
Based on the associated data sizes respectively corresponding to the at least one target SQL template, obtaining a sequencing result of the at least one target SQL template;
Updating the SQL template list based on the sorting result;
and updating the association distribution diagram based on the maximum association data of the at least one target SQL template under the plurality of attribute categories.
16. A data processing apparatus, comprising:
The determining module is used for determining a plurality of target parameters of a plurality of SQL templates, wherein the target parameters comprise the average number of scanned rows and the average time consumption;
the clustering module is used for carrying out multidimensional clustering on the SQL templates according to the parameter data of the target parameters corresponding to the SQL templates to obtain a plurality of clustering results;
The distribution map generation module is used for generating a multidimensional clustering distribution map according to the distribution condition of the clustering results in a clustering space coordinate system established based on the target parameters;
The output module is used for outputting the multidimensional clustering distribution graph; the multidimensional clustering distribution diagram is used for determining the query condition of the target SQL template based on the distribution condition of the clustering results.
17. A computer device comprising a processing component, a display component, and a storage component; the storage component is used for storing one or more computer instructions, wherein the one or more computer instructions are used for being called by the processing component for execution;
The processing assembly is configured to:
determining a plurality of target parameters of a plurality of SQL templates, wherein the plurality of target parameters comprise the average number of scanned rows and the average time consumption;
according to parameter data of a plurality of target parameters respectively corresponding to the SQL templates, carrying out multidimensional clustering on the SQL templates to obtain a plurality of clustering results;
Generating a multidimensional clustering distribution diagram according to the distribution condition of the clustering results in a clustering space coordinate system established based on the target parameters;
The display component outputs the multi-dimensional clustering distribution diagram; the multidimensional clustering distribution diagram is used for determining the query condition of the target SQL template based on the distribution condition of the clustering results.
CN201911284244.6A 2019-12-13 2019-12-13 Data processing method and device and computer equipment Active CN112989153B (en)


Publications (2)

Publication Number Publication Date
CN112989153A (en) 2021-06-18
CN112989153B (en) 2024-05-24

Family

ID=76341832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911284244.6A Active CN112989153B (en) 2019-12-13 2019-12-13 Data processing method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN112989153B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326405B (en) * 2021-06-30 2022-12-13 数云科际(深圳)技术有限公司 Park entrance recommendation method and system based on BIM technology

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080011949A (en) * 2006-08-01 2008-02-11 (주)윕스 Grouping system of documents and method thereof and recording medium thereof
CN106874349A (en) * 2016-12-26 2017-06-20 深圳市位和科技有限责任公司 Multidimensional data analysis method and system based on interactive visual
CA3078148A1 (en) * 2017-01-20 2018-07-26 10353744 Canada Ltd. Search method and apparatus, and non-temporary computer-readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8676802B2 (en) * 2006-11-30 2014-03-18 Oracle Otc Subsidiary Llc Method and system for information retrieval with clustering
US9336277B2 (en) * 2013-05-31 2016-05-10 Google Inc. Query suggestions based on search data
CN106021362B (en) * 2016-05-10 2018-04-13 百度在线网络技术(北京)有限公司 Generation, image searching method and the device that the picture feature of query formulation represents




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant