CN109670267B - Data processing method and device

Info

Publication number: CN109670267B
Application number: CN201811643437.1A
Authority: CN
Inventors: 杨方廷; 贾彦江
Original assignee: Beijing Aerospace Data Co ltd
Current assignee: Beijing Aerospace Data Co ltd
Priority date: 2018-12-29
Filing date: 2018-12-29
Publication date: 2023-06-13
Anticipated expiration: 2038-12-29
Also published as: CN109670267A

Abstract

The invention discloses a data processing method, device and equipment. The data processing method comprises the following steps: acquiring a data identification request of a user, wherein the data identification request carries data to be identified; determining a data identification industrial model for the user based on the attribute information of the data to be identified; inputting the data to be identified into the data identification industrial model for identification processing to obtain an identification result; and sending the identification result to the user. With this method and device, the corresponding data identification industrial model can be found from the user's data identification request, which shortens the time the user spends looking up the model that corresponds to the data to be identified and improves the user's working efficiency. Because the model obtained through the request is specific to the data to be identified carried in that request, the result it produces when processing that data is more accurate.

Description

Data processing method and device

Technical Field

The present disclosure relates to the field of data processing, and in particular, to a data processing method and apparatus.

Background

With the rapid development of industry, companies in the industrial field hold a growing number of intangible digital assets, such as production and manufacturing know-how, technological processes, operating ideas and management experience. To make these digital assets easier to manage, they are converted into industrial algorithm models, so the number of industrial algorithm models keeps increasing.

In the prior art, when a user has data to be tested, the user cannot quickly find the industrial algorithm model corresponding to that data among the many industrial algorithm models, so the data cannot be tested in time and the user's working efficiency is reduced.

Disclosure of Invention

In view of the foregoing, an object of the present application is to provide a data processing method and apparatus, which solve the problem of low data processing efficiency in the prior art.

In a first aspect, an embodiment of the present application provides a data processing method, including:

acquiring a data identification request of a user, wherein the data identification request carries data to be identified;

determining a data identification industrial model for the user based on the attribute information of the data to be identified;

inputting the data to be identified into the data identification industrial model for identification processing to obtain an identification result;

and sending the identification result to the user.

Optionally, the determining the data identification industrial model for the user based on the attribute information of the data to be identified includes:

acquiring the use right information of a preset model corresponding to the attribute information;

and determining that the user completes the use right transaction on the preset model according to the use right information, and taking the preset model which completes the use right transaction as the data identification industrial model.

Optionally, the data identification industrial model is trained as follows:

acquiring training data from a sample database;

configuring initial parameters of each preset model based on the training data and a plurality of preset models;

configuring the training data into each preset model, and training each preset model to obtain each preset model after training, a training result corresponding to each preset model and model accuracy;

judging whether the training result meets preset conditions, if not, adjusting parameters of each preset model according to the training result, and retraining the preset models after the parameters are adjusted until the training result meets the preset conditions;

and determining the data identification industrial model from the preset models which are trained based on the model accuracy corresponding to the preset models.

Optionally, after determining the data identification industrial model for the user, the method further comprises:

configuring a first operation environment for the data identification industrial model according to a first number of users who use the data identification industrial model through data identification requests;

virtualizing the data identification industrial model into a second number of industrial model copies according to a second number of users who use the data identification industrial model through data identification requests;

configuring a second operation environment for the user according to the user information corresponding to the second number of users and the first operation environment;

running the data identification industrial model copy configured for the user in the second operation environment.

Optionally, the preset model is constructed according to the following manner:

editing a plurality of algorithms in an algorithm editor, and storing the algorithms in an algorithm list;

selecting an algorithm from the algorithm list to process preset training data to obtain an algorithm file corresponding to each algorithm in the algorithm list;

comparing the algorithm files, selecting an algorithm from the algorithm list according to a comparison result, and generating the preset model according to the selected algorithm.

Optionally, the sample database is constructed according to the following manner, including:

acquiring operation data of equipment;

according to a preset mapping relation, mapping the operation data into metadata;

the metadata is formed into the sample database.
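
The mapping of operation data into metadata can be sketched as follows. This is a minimal illustration, not the patented implementation: the field names in `MAPPING` and the functions `to_metadata` and `build_sample_database` are hypothetical, and the "preset mapping relation" is assumed to be a simple rename of raw field names to metadata names.

```python
from typing import Any, Dict, List

# Hypothetical preset mapping relation: raw equipment field -> metadata name.
MAPPING: Dict[str, str] = {
    "temp_c": "temperature",
    "rpm": "spindle_speed",
    "ts": "timestamp",
}


def to_metadata(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Map one operation-data record into metadata, dropping unmapped fields."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def build_sample_database(
    records: List[Dict[str, Any]], mapping: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Form the sample database (here just a list) from mapped records."""
    return [to_metadata(record, mapping) for record in records]


records = [{"temp_c": 71.5, "rpm": 1200, "ts": 1, "debug": "x"}]
sample_db = build_sample_database(records, MAPPING)
```

Fields without an entry in the mapping (such as `debug` here) are dropped, which is one possible reading of the mapping step; a real system might instead reject or log them.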

Optionally, the method further comprises:

acquiring original data of a data source;

analyzing the source address carried by the original data to obtain detailed data information of the original data;

determining a data source and a data structure to which the original data belong, and extracting a field matched with the mapped data characteristic from the detailed data information according to the data characteristics mapped by the determined data source and the data structure to obtain the data characteristics of the original data;

extracting the data characteristics of the original data according to a preset target metadata mapping table to obtain target data characteristics to be stored;

and storing the target data characteristics to be stored in an industrial metadata base.

Optionally, after obtaining each trained preset model, the method further includes:

acquiring a search instruction, wherein the search instruction comprises keyword information of a model;

extracting model information matched with the keyword information according to the keyword information of the search instruction;

arranging the model information according to a preset ordering rule, loading the model information into a preset visual interface, and displaying a visualized model loaded with the model information;

inputting the acquired data information into the model for operation, and generating an operation result of the model;

and loading the data information and the operation result into a preset model operation report template to generate a model operation report.
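
The search, ordering and report steps above can be sketched as follows. Everything named here is an assumption made for illustration: the model records in `MODELS`, the choice of "accuracy, descending" as the preset ordering rule, and the three-line `REPORT` template. The visual interface is omitted.

```python
from string import Template
from typing import Callable, Dict, List

# Hypothetical catalogue of trained models.
MODELS: List[Dict] = [
    {"name": "tool life predictor", "accuracy": 0.85},
    {"name": "tool classifier", "accuracy": 0.90},
    {"name": "bearing wear predictor", "accuracy": 0.80},
]

# Hypothetical model operation report template.
REPORT = Template("Model: $name\nInput: $data\nResult: $result")


def search_models(keyword: str, models: List[Dict]) -> List[Dict]:
    """Return models whose name contains the keyword, best accuracy first."""
    hits = [m for m in models if keyword.lower() in m["name"].lower()]
    return sorted(hits, key=lambda m: m["accuracy"], reverse=True)


def run_report(model: Dict, data: List[float], run: Callable[[List[float]], float]) -> str:
    """Run the model on the data and fill the report template."""
    return REPORT.substitute(name=model["name"], data=data, result=run(data))


hits = search_models("tool", MODELS)
report = run_report(hits[0], [1, 2], sum)  # `sum` stands in for the model
```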

Optionally, the configuring initial parameters of each preset model includes:

visually configuring a data source, generating a metadata table according to the metadata mapping relation corresponding to the data source, and combining data of a target end according to the metadata table;

and configuring initial parameters of each preset model according to historical parameters in the industrial knowledge base.

In a second aspect, an embodiment of the present application provides a data processing apparatus, including:

the acquisition module is used for acquiring a data identification request of a user, wherein the data identification request carries data to be identified;

the determining module is used for determining a data identification industrial model for the user based on the attribute information of the data to be identified;

the computing module is used for inputting the data to be identified into the data identification industrial model for identification processing to obtain an identification result;

and the feedback module is used for sending the identification result to the user.

According to the data processing method provided by the embodiment of the invention, the data to be identified carried in the data identification request of the user is used for determining the data identification industrial model required by the user, the data to be identified is input into the data identification industrial model to obtain the identification result, the result is fed back to the user, and the user can obtain the required result.

In order to make the above objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a data processing method according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of an industrial model transaction method according to an embodiment of the present disclosure;

FIG. 3 is a flowchart of a method for identifying an industrial model using training data according to an embodiment of the present application;

FIG. 4 is a schematic flow chart of an operation environment configuration of a data identification industrial model according to an embodiment of the present application;

FIG. 5 is a flow chart of a method for constructing a preset model according to an embodiment of the present application;

FIG. 6 is a flowchart of a method for constructing a sample database according to an embodiment of the present disclosure;

FIG. 7 is a flow chart of a method for building an industrial metadata database according to an embodiment of the present application;

FIG. 8 is a flowchart of a method for generating a model operation report according to an embodiment of the present application;

FIG. 9 is a schematic flow chart of configuring initial parameters for a preset model according to an embodiment of the present application;

FIG. 10 is a flow chart of a method for constructing an industrial knowledge graph according to an embodiment of the present application;

FIG. 11 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 12 is a schematic structural diagram of a computer device 1200 according to an embodiment of the present application.

Detailed Description

For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.

Considering that the middle team simulation training system in the prior art sets a single-chip microcomputer in the combat equipment in an embedded manner in order to collect operation data of the combat equipment, the combat equipment has to be refitted, and refitted combat equipment can adversely affect actual combat results when used in actual combat. Based on this, embodiments of the present invention provide a fixed structure, a sensor, and an analog system, which are described below by way of embodiments.

As shown in fig. 1, an embodiment of the present application provides a data processing method, including:

101, acquiring a data identification request of a user, wherein the data identification request carries data to be identified;

here, the data recognition request may be a request for tool life prediction, tool classification, or the like, and the data to be recognized is data of the object to be recognized.

For example, the data identification request is a tool life prediction, the object to be identified is a tool a, and the data to be identified is a production date, thickness, width, length, and the like of the tool a.

102, determining a data identification industrial model for the user based on the attribute information of the data to be identified;

here, the attribute information is information for characterizing data to be identified, such as tool life prediction, tool classification, and the like, and the data identification industrial model is a model that satisfies user demands and completes training.

Specifically, a preset view, which is an interface that displays the information required by the user, contains data identification industrial models of several categories. When the data identification industrial model is determined, the model corresponding to the attribute information of the data to be identified is looked up among these categories.

For example, the attribute information of the data to be identified is a tool life prediction, and the preset view includes a data identification industrial model with the category of the tool life prediction and a data identification industrial model with the category of the tool classification, and the determined data identification industrial model is the data identification industrial model with the category of the tool life prediction according to the attribute information of the data to be identified.

103, inputting the data to be identified into the data identification industrial model for identification processing to obtain an identification result;

here, the recognition result includes a calculation result of the data recognition industrial model.

Specifically, the data identification industrial model comprises at least one parameter, the corresponding parameter in the data identification industrial model is assigned according to each data in the data to be identified, and the data identification industrial model is calculated based on the assigned parameter value in the data identification industrial model to obtain an identification result.

104, sending the identification result to the user.

Specifically, the recognition result may be fed back to the user through display in the user interface, or may be fed back to the user through transmitting the recognition result to the mobile terminal of the user.

In the embodiments of the present application, the data to be identified carried in the user's data identification request is used to determine the data identification industrial model required by the user; the data to be identified is input into that model to obtain the identification result, and the result is fed back to the user, so that the user obtains the required result.
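
Steps 101 to 104 can be sketched as a small dispatch routine. This is an illustrative sketch under stated assumptions, not the patented implementation: the request layout, the registry `MODEL_REGISTRY`, and the two toy models (`predict_tool_life`, `classify_tool`) with their input fields are all hypothetical.

```python
from typing import Callable, Dict, Mapping


def predict_tool_life(data: Mapping[str, float]) -> dict:
    """Toy stand-in for a tool life prediction model (assumed rule)."""
    remaining = max(0.0, data["rated_hours"] - data["hours_used"])
    return {"remaining_hours": remaining}


def classify_tool(data: Mapping[str, float]) -> dict:
    """Toy stand-in for a tool classification model (assumed rule)."""
    return {"category": "thick" if data["thickness_mm"] >= 5.0 else "thin"}


# Models keyed by the attribute information of the data to be identified.
MODEL_REGISTRY: Dict[str, Callable[[Mapping[str, float]], dict]] = {
    "tool_life_prediction": predict_tool_life,
    "tool_classification": classify_tool,
}


def handle_request(request: dict) -> dict:
    data = request["data"]                          # step 101: data to be identified
    model = MODEL_REGISTRY.get(request["attribute"])  # step 102: pick the model
    if model is None:
        raise KeyError(f"no model for attribute {request['attribute']!r}")
    result = model(data)                            # step 103: identification processing
    return {"user": request["user"], "result": result}  # step 104: send back


response = handle_request({
    "user": "user-1",
    "attribute": "tool_life_prediction",
    "data": {"rated_hours": 1000.0, "hours_used": 650.0},
})
```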

For determining the data identification industrial model for the user based on the attribute information of the data to be identified in step 102, the application provides an industrial model transaction method, as shown in FIG. 2, which includes:

201, obtaining the use right information of a preset model corresponding to the attribute information;

202, determining, according to the use right information, that the user has completed the use right transaction for the preset model, and taking the preset model for which the use right transaction has been completed as the data identification industrial model.

Here, the preset model is a model set in advance, and the use right information of the preset model includes, but is not limited to: single use and its price, multiple uses and the corresponding price, monthly use and its price, annual use and its price, and use right transfer and its transfer price. The use right transaction includes: single use right transaction, multiple use right transaction, monthly use right transaction, annual use right transaction, use right transfer transaction, and the like.

Specifically, after the user obtains the corresponding preset model according to the attribute information, the user selects the use right information corresponding to the preset model according to the self requirement, the corresponding transaction information is displayed in the preset view according to the use right information selected by the user, the user carries out transaction according to the transaction information, and the preset model is the data identification industrial model which can be used by the user after the transaction is completed.

For example, a preset model is obtained according to the attribute information, and the user clicks the virtual button representing single use in the use right information of the preset model. A single-use transaction interface for the preset model is displayed in the preset view, and the user transfers the single-use price to the corresponding account as prompted by the transaction interface. Once that account receives the single-use price, the user is determined to have completed the use right transaction for the preset model, and the preset model becomes the data identification industrial model that the user can use.

The embodiment of the application realizes the conversion of the market trading mode and the business mode of the preset model through the use right trade, enhances the utilization rate of the digital assets of the enterprise, and improves the benefit of the enterprise. Therefore, various data of each enterprise can be changed into tangible digital assets, the transfer and conversion efficiency of the digital assets is improved, benefits are created for the enterprises, and the effective utilization of the digital assets of the enterprises is facilitated.
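
Steps 201 and 202 can be sketched as below. This is a minimal illustration under assumptions: the classes `UseRight` and `TradingPlatform`, the integer price unit, and the rule that a transaction completes once the paid amount covers the listed price are all hypothetical, not details given by the application.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class UseRight:
    kind: str   # e.g. "single", "monthly", "annual"
    price: int  # price in the smallest currency unit (assumed)


@dataclass
class TradingPlatform:
    # model id -> use right kind -> use right information
    catalog: Dict[str, Dict[str, UseRight]]
    granted: Set[Tuple[str, str]] = field(default_factory=set)

    def use_rights(self, model_id: str) -> Dict[str, UseRight]:
        """Step 201: obtain the use right information of a preset model."""
        return self.catalog[model_id]

    def complete_transaction(self, user: str, model_id: str, kind: str, paid: int) -> bool:
        """Step 202: grant the model once the use right price has been received."""
        right = self.catalog[model_id][kind]
        if paid < right.price:
            return False
        self.granted.add((user, model_id))
        return True

    def can_use(self, user: str, model_id: str) -> bool:
        return (user, model_id) in self.granted


platform = TradingPlatform(
    catalog={"tool_life": {"single": UseRight("single", 500), "monthly": UseRight("monthly", 9000)}}
)
underpaid = platform.complete_transaction("user-1", "tool_life", "single", paid=100)
paid_ok = platform.complete_transaction("user-1", "tool_life", "single", paid=500)
```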

The data identification industrial model determined for the user in step 102 is trained, as shown in FIG. 3, as follows:

301, acquiring training data from a sample database;

302, configuring initial parameters of each preset model based on the training data and a plurality of preset models;

In step 302, configuring the initial parameters of each preset model includes the following steps:

Step 1, acquiring data of a plurality of different features according to the preset model.

Specifically, for example, if the multiple models are to process digital assets in the real estate industry, the different feature data obtained may be: area used, floor space, pattern, floor, etc.

Step 2, arranging the data to form a matrix model.

Specifically, when the acquired data are arranged, the matrix model is formed with the features of the data as columns and the specific values of the data as rows. For example, after the acquired use area, floor space, pattern and floor are arranged, the first column may be the use area, the second column the floor space, the third column the pattern, and the fourth column the floor, with the specific values corresponding to the use area, floor space, pattern and floor forming the rows of the matrix model.

Step 3, performing data serialization on the resilient distributed datasets (RDDs) of the matrix model, performing association analysis on the RDDs of the matrix model, and selecting and combining the RDDs whose degree of association is smaller than a preset association degree value to obtain the training data corresponding to the industrial model.

Specifically, the features of the RDDs of the matrix model are analyzed by a trained algorithm to judge their degree of association, for example the degree of association between the RDD corresponding to a first feature and the RDD corresponding to a second feature. If the degree of association is greater than the preset association degree value, the two RDDs are considered interchangeable, and training data obtained by combining them would make the training result of the industrial model inaccurate. Therefore, when the training data is obtained, association analysis is performed on the RDDs of the matrix model to screen and optimize the training data, which improves the accuracy of the training result of the industrial model.

Further, when the training data is obtained, a plurality of different training data can be obtained, the different training data are subjected to joint analysis, the optimal training data are selected from the different training data to train the industrial model, the accuracy of training the industrial model is improved, and the optimal training data are the training data with the lowest association degree of the data characteristics.
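
The screening in Step 3 can be sketched as follows. Two substitutions are made for illustration and should be read as assumptions: plain Python lists stand in for the RDDs, and the absolute Pearson correlation stands in for the "degree of association", which the application does not define. The function names and sample values are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation of two equally long columns (0.0 if one is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns: Dict[str, List[float]], max_assoc: float) -> List[str]:
    """Keep a feature only if its association with every kept feature is below max_assoc."""
    kept: List[str] = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[other])) < max_assoc for other in kept):
            kept.append(name)
    return kept


# Columns of the matrix model: one list of values per feature.
columns = {
    "use_area": [50.0, 80.0, 100.0, 120.0],
    "floor_space": [60.0, 96.0, 120.0, 144.0],  # proportional to use_area
    "floor": [3.0, 12.0, 1.0, 7.0],
}
selected = select_features(columns, max_assoc=0.9)
```

Here `floor_space` is dropped because it is almost perfectly correlated with `use_area`, which mirrors the idea that interchangeable features should not both enter the training data. The greedy, order-dependent selection is a design shortcut of this sketch.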

303, configuring the training data into each preset model, and training each preset model to obtain each trained preset model, a training result corresponding to each preset model and model accuracy;

The training data includes test data and processing results. Specifically, step 303 includes:

and acquiring a preset model and corresponding test data.

The test data are used for testing the preset model, and the processing results are marked in the test data correspondingly.

And configuring the test data into a preset model, and calling a plurality of preset algorithms by the preset model to respectively calculate the test data to obtain a plurality of calculation results.

Specifically, the algorithms respectively process the test data to obtain a plurality of calculation results.

Analyzing a plurality of calculation results and corresponding processing results of the test data, selecting an algorithm with highest accuracy in the calculation results as a reference algorithm of a preset model, and taking the calculation results obtained by calculation of the reference algorithm as the training results.

The calculation result obtained by each algorithm is statistically compared with the labeled processing result to obtain the accuracy of that algorithm. Repeating this for every algorithm yields the accuracy of each of the preset algorithms; the algorithm with the highest accuracy is selected as the reference algorithm of the preset model, and the calculation result it produces is taken as the training result.
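
Selecting the reference algorithm can be sketched as below. This is an illustrative sketch: the three toy algorithms, the sample layout (input, labeled processing result), and the use of the fraction of matching results as "accuracy" are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]  # (test input, labeled processing result)


def accuracy(algorithm: Callable[[float], int], samples: List[Sample]) -> float:
    """Fraction of samples whose calculation result matches the labeled result."""
    correct = sum(1 for x, label in samples if algorithm(x) == label)
    return correct / len(samples)


def pick_reference_algorithm(
    algorithms: Dict[str, Callable[[float], int]], samples: List[Sample]
) -> Tuple[str, Dict[str, float]]:
    """Score every preset algorithm and return the most accurate one with all scores."""
    scores = {name: accuracy(fn, samples) for name, fn in algorithms.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical preset algorithms called by the preset model.
algorithms = {
    "threshold_3": lambda x: int(x > 3.0),
    "threshold_5": lambda x: int(x > 5.0),
    "always_zero": lambda x: 0,
}
samples = [(1.0, 0), (4.0, 0), (6.0, 1), (8.0, 1)]
reference, scores = pick_reference_algorithm(algorithms, samples)
```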

304, judging whether the training result meets a preset condition, if not, adjusting parameters of each preset model according to the training result, and retraining the preset model after adjusting the parameters until the training result meets the preset condition;

Specifically, the training result of the preset model includes the accuracy of the reference algorithm in the preset model. If the training result is greater than or equal to a preset value, the training result is considered to meet the preset condition and the training of the preset model ends. If the training result is smaller than the preset value, the preset condition is not met, and the parameters of the preset model are adjusted according to the training result; the preset model may be adjusted automatically based on the training result. The preset model with adjusted parameters is then retrained. During retraining, the reference algorithm of the preset model is reselected and the calculation result of the reselected reference algorithm is used as the training result of the preset model, until the training result meets the preset condition. In this way the accuracy of the algorithm in the preset model is continuously improved, the operation result of the preset model is continuously adjusted, and the accuracy of that operation result is improved.
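
The retraining loop of step 304 can be sketched as follows. This is a minimal illustration under assumptions: the "preset model" is reduced to a single threshold parameter, the adjustment rule (shift the threshold against the dominant error type) is hypothetical, and a round limit is added so that the loop always terminates.

```python
from typing import List, Tuple

Sample = Tuple[float, int]  # (input value, labeled result)


def evaluate(threshold: float, samples: List[Sample]) -> float:
    """Accuracy of the rule 'predict 1 when x > threshold'."""
    correct = sum(1 for x, label in samples if int(x > threshold) == label)
    return correct / len(samples)


def train_until_ok(
    samples: List[Sample],
    threshold: float,
    preset_accuracy: float,
    step: float = 0.5,
    max_rounds: int = 20,
) -> Tuple[float, float, int]:
    """Retrain until the accuracy reaches the preset value or rounds run out."""
    for round_no in range(1, max_rounds + 1):
        acc = evaluate(threshold, samples)
        if acc >= preset_accuracy:          # preset condition met
            return threshold, acc, round_no
        # Adjust the parameter according to the training result.
        false_pos = sum(1 for x, y in samples if x > threshold and y == 0)
        false_neg = sum(1 for x, y in samples if x <= threshold and y == 1)
        threshold += step if false_pos >= false_neg else -step
    return threshold, evaluate(threshold, samples), max_rounds


samples = [(1.0, 0), (4.0, 0), (6.0, 1), (8.0, 1)]
final_threshold, final_accuracy, rounds = train_until_ok(samples, 2.0, preset_accuracy=1.0)
```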

305, determining the data identification industrial model from the preset models after training based on the model accuracy corresponding to each preset model.

Specifically, among the trained preset models belonging to the same preset classification, the one with the highest model accuracy is determined to be the data identification industrial model.

Continuing with the example of calculating the accuracy of the preset model in step 102, the accuracy of prediction model A₁ is 80%, the accuracy of prediction model B₁ is 70%, and the accuracy of prediction model C₁ is 85%. Among the three preset models, prediction model C₁ has the highest accuracy, so prediction model C₁ is the target model required by the user.
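
The selection in step 305 amounts to taking the maximum accuracy within each preset classification. A short sketch using the accuracies from the example above; the record layout and the classification label "prediction" are assumptions.

```python
from typing import Dict, List, Tuple

# (preset classification, model name, model accuracy)
trained_models: List[Tuple[str, str, float]] = [
    ("prediction", "A1", 0.80),
    ("prediction", "B1", 0.70),
    ("prediction", "C1", 0.85),
]


def best_per_classification(models: List[Tuple[str, str, float]]) -> Dict[str, str]:
    """For each classification, keep the name of the most accurate trained model."""
    best: Dict[str, Tuple[str, float]] = {}
    for classification, name, acc in models:
        if classification not in best or acc > best[classification][1]:
            best[classification] = (name, acc)
    return {classification: name for classification, (name, _) in best.items()}


target_models = best_per_classification(trained_models)
```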

The embodiment of the invention provides a preset model training method, which comprises the following steps: acquiring a preset model and training data, and configuring initial parameters of the preset model; configuring the training data into the preset model, and training the preset model to obtain a training result; judging whether the training result meets the preset condition, and if not, adjusting the parameters of the preset model according to the training result and retraining the preset model with the adjusted parameters until the training result meets the preset condition. In this method the preset model is trained with the training data and its parameters are adjusted continuously during training, so that its accuracy keeps improving and the trained preset model converges and is suitable for the industrial field. The method is universal and can be applied to various industrial fields, and the preset model can be updated in time by updating the training data. This solves the technical problems of the prior art, in which existing methods for training a preset model are not universal, can only meet the production requirements of fixed products, cannot train the preset model in time, and cannot keep the preset model effectively updated over the long term.

As shown in FIG. 4, after the data identification industrial model is determined for the user in step 102, the method further comprises:

401, configuring a first operation environment for the data identification industrial model according to a first number of users who use the data identification industrial model through data identification requests;

in the embodiment of the application, after a user logs in an industrial model integrated transaction platform, a user requirement containing keyword information is input from a displayed search box, and the industrial model integrated transaction platform analyzes a received data identification request. As an alternative embodiment, the user needs to know how long a certain model of device can still function normally, and thus, the input data identification request may be: model XXXX.

In step 401, configuring the first operation environment for the data identification industrial model according to the first number of users comprises:

counting the number of data identification industrial models used by the current requests;

counting the number of users corresponding to each data identification industrial model;

and configuring a first operation environment for the industrial model according to the number of users corresponding to each data identification industrial model, the number of data identification industrial models, the type of the data identification industrial model, the current available resources and a preset operation environment configuration strategy.

Here, the operation environment configuration policy may be: the larger the number of users corresponding to the data identification industrial model, the more complex the operations of its type, and the smaller the number of data identification industrial models, the more resources are allocated to the first operation environment configured for that data identification industrial model.

Data identification industry model types include, but are not limited to: equipment-class industry model, production-class industry model, decision-class industry model, and enterprise-class industry model. For example, for decision-making industry models, the operations are complex.

In this embodiment, as an optional embodiment, the operation environments of the data identification industrial models are integrated and allocated by the atomic calculation unit of the industrial model engine. A virtual background resource space is allocated to each running data identification industrial model, forming an operation environment that includes a virtual space; the operation environments of the data identification industrial models are independent of one another. During subsequent operation, the data identification industrial model requests and calls resources within its virtual space.
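
One way to turn the qualitative configuration policy into numbers is sketched below. The application gives no formula, so everything here is an assumption: the complexity weights in `COMPLEXITY` (only the statement that decision-class models are complex comes from the text), the abstract "resource units", and the rule "users times complexity, divided by the number of running models, capped by the available resources".

```python
# Assumed complexity weights per model type; the source only states that
# decision-class models involve complex operations.
COMPLEXITY = {"equipment": 1, "production": 2, "enterprise": 3, "decision": 4}


def first_environment_units(
    users: int, model_count: int, model_type: str, available_units: int
) -> int:
    """Resource units for the first operation environment of one model.

    More users and a more complex type raise the allocation; more running
    models lower it; the result never exceeds the available resources.
    """
    demand = users * COMPLEXITY[model_type]
    share = -(-demand // model_count)  # ceiling division
    return min(available_units, max(1, share))


decision_units = first_environment_units(10, 2, "decision", available_units=100)
equipment_units = first_environment_units(10, 2, "equipment", available_units=100)
capped_units = first_environment_units(100, 1, "decision", available_units=64)
```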

402, virtualizing the data identification industrial model into a second number of industrial model copies according to a second number of users who use the data identification industrial model through data identification requests;

Specifically, the target industrial model is virtualized into industrial model copies matching the second number of users, so that each user of the target industrial model has a corresponding industrial model copy. The user logs in to the industrial model integrated transaction platform, selects a target industrial classification from the displayed classification list, and then selects the next-level target classification from the displayed next-level classification list corresponding to that classification, until the required target industrial model is selected.

403, configuring a second operation environment for the user according to the user information corresponding to the second user number and the first operation environment;

In step 403, configuring a second operation environment for the user according to the user information corresponding to the second user number and the first operation environment includes:

performing a comprehensive evaluation according to the level and authority of each user using the target industrial model, the configuration condition of each user, and the use price each user can afford, and configuring a second operation environment for each user respectively, wherein the second operation environment configured for each user matches the first operation environment, and the second operation environments configured for different users are independent of one another.

Specifically, the data identification industrial model copies share the algorithm and resources of the target industrial model, while the data that each user inputs into a data identification industrial model copy is isolated from the data of other users.

After the comprehensive evaluation is performed according to the level and authority of each user using the target industrial model, the configuration condition of each user, and the use price each user can afford, and before the second operation environment is configured for each user, the method further comprises:

determining whether the user has completed the use right transaction of the target industrial model, and if so, executing the step of configuring a second operation environment for each user respectively.

The use right transaction includes: a single-use right transaction, a multiple-use right transaction, a monthly use right transaction, an annual use right transaction, and a use right transfer transaction.

Specifically, the use price of the data identification industrial model can be set according to the specific requirements and specifications of the data identification industrial model. After the user clicks to use the model, it is determined that a use right transaction for the data identification industrial model is to be conducted; after the use right transaction is completed, the industrial model comprehensive transaction platform collects a commission on the transaction according to the proportion in the allocation policy. The use right price may also be distributed according to a preset use right price allocation scheme, for example, distributed to the owner and the developer of the data identification industrial model according to the allocation policy.
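As a rough illustration of the allocation described above, the following sketch splits one use right payment between the platform, the model owner, and the model developer. The function name and the two proportions are hypothetical; the source does not give concrete rates.

```python
from decimal import Decimal

CENT = Decimal("0.01")


def split_payment(price, platform_rate, owner_share):
    """Split a use right payment between platform, owner, and developer.

    platform_rate: proportion collected by the platform (assumed parameter).
    owner_share:   owner's proportion of the remainder (assumed parameter).
    The developer receives whatever is left, so the parts always sum to price.
    """
    price = Decimal(price)
    commission = (price * Decimal(platform_rate)).quantize(CENT)
    remainder = price - commission
    owner = (remainder * Decimal(owner_share)).quantize(CENT)
    developer = remainder - owner
    return {"platform": commission, "owner": owner, "developer": developer}
```

`Decimal` is used instead of floating point so that monetary amounts are rounded to the cent in a controlled way.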

404, configuring a data identification industrial model copy for the user in the second operation environment.

Specifically, the subsequent operation of the data identification industrial model copy is based on the pre-allocated second operation environment.

When a plurality of users simultaneously select the industrial classification model prediction engine for use, the industrial classification model is virtualized into a plurality of data identification industrial model copies that execute concurrently to meet the users' needs, with each user corresponding to one virtual data identification industrial model copy. Through resource isolation, the virtual data identification industrial model copies achieve both resource sharing and concurrent data processing, including but not limited to: sharing model parameter information and isolating user data resources. User data resources include, but are not limited to: target data input by the user, prediction results output by a task, and operation control parameters set by the user.
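The sharing and isolation described above can be sketched as follows. The class names, the `virtualize` helper, and the toy linear scoring are hypothetical stand-ins for the real model; the point is only that every copy references the same read-only parameters while keeping its own user data.

```python
from types import MappingProxyType


class IndustrialModel:
    """Holds the shared model parameters (hypothetical structure)."""

    def __init__(self, name, parameters):
        self.name = name
        # Read-only view, so that no copy can modify the shared parameters.
        self.parameters = MappingProxyType(dict(parameters))


class ModelCopy:
    """One virtual copy per user: shared parameters, private user data."""

    def __init__(self, model, user_id):
        self.model = model  # shared reference, not duplicated
        self.user_id = user_id
        self.user_data = {"inputs": [], "results": [], "controls": {}}

    def predict(self, value):
        # Toy linear scoring that stands in for the real algorithm.
        params = self.model.parameters
        result = params["scale"] * value + params["offset"]
        self.user_data["inputs"].append(value)
        self.user_data["results"].append(result)
        return result


def virtualize(model, user_ids):
    """Create one model copy for each concurrent user."""
    return {user_id: ModelCopy(model, user_id) for user_id in user_ids}
```

A call made through one user's copy leaves the `user_data` of every other copy untouched, while all copies read the same `parameters` object.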

When the same data identification industrial model is executed concurrently by different users, a second operation environment containing a virtual operation resource space can be constructed to serve the users executing concurrently. First, the operating resources required by the data identification industrial model, such as memory size, number of CPU/GPU cores, cache size, and interaction space, are obtained from the operation environment configuration parameters of the target industrial model. Then, a virtualized resource pool is constructed by integrating distributed physical resources; for example, according to the first user number of users currently requesting the industrial model, the concurrency of operation, the type of GPU or CPU, the number of cores of the operation unit, the virtual memory size, the cache space size, and the like, resources with a higher idle rate are selected from a preset resource pool to form an independent virtualized subspace, which gives the virtual operation resource space (the first operation environment) of the industrial model. Finally, according to the virtual operation resource space of the target industrial model, a virtual operation resource space is allocated to the data identification industrial model copy corresponding to each user. In this way, the user, the industrial model, and the virtual resource space form a multi-dimensional resource pool virtualization space that can be presented as a three-dimensional space. For example, with the user as the x-coordinate, the industrial model as the y-coordinate, and the virtual resource space as the z-coordinate, the virtual resource space in the resource pool occupied by a specific user and industrial model is uniquely determined in the three-dimensional coordinate system.

Take as an example 5 users who need to use 2 industrial models, where 2 users use the first industrial model, 2 users use the second industrial model, and 1 user uses both the first and the second industrial model at the same time. The 5 users correspond to 5 points on the x-axis, the two industrial models correspond to 2 points on the y-axis, and the virtual resource space allocated to each user for each industrial model used is a point on the z-axis. The x, y, and z coordinates together form a three-dimensional resource pool architecture, and the resource pool is independent for each user.
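The five-user example can be written out as a small lookup structure. This is only a sketch: the `build_resource_pool` helper, the space identifiers, and the unit count are hypothetical, and the (user, model) key simply plays the role of the x and y coordinates while the stored value plays the role of z.

```python
def build_resource_pool(usage, units_per_copy=1):
    """Map each (user, model) pair to its own virtual resource space.

    usage maps a user to the list of models that user runs.
    """
    pool = {}
    for user, models in usage.items():
        for model in models:
            pool[(user, model)] = {
                "space_id": f"vspace-{user}-{model}",  # hypothetical identifier
                "units": units_per_copy,
            }
    return pool


# Two users on the first model, two on the second, one on both.
usage = {
    "user1": ["model1"],
    "user2": ["model1"],
    "user3": ["model2"],
    "user4": ["model2"],
    "user5": ["model1", "model2"],
}
pool = build_resource_pool(usage)
```

Because the fifth user runs both models, the pool holds six independent virtual resource spaces for five users.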

In the embodiment of the application, the operation environment of the data identification industrial model is configured according to the requirements of the user, so that the diversified requirements of the user can be met, the resource utilization rate can be improved, and the configuration efficiency of the operation environment of the data identification industrial model is improved.

As shown in fig. 5, before training the preset model, the preset model is constructed according to the following manner:

501, editing a plurality of algorithms in an algorithm editor, and storing the algorithms in an algorithm list;

In step 501, editing a plurality of algorithms in an algorithm editor and saving the plurality of algorithms in an algorithm list includes:

The algorithm editor integrates algorithm engine technology.

Through the algorithm engine technology, algorithm code is edited online in the algorithm editor, debugged and executed online, and the output result is obtained online.

When the algorithm code is edited in the algorithm editor, event attributes of the algorithm code are set, and preset operations are executed automatically according to the event attributes while the algorithm code is running.

Specifically, the event attributes include execution timeout and the like, and the event attribute values include sending mail, suspending the algorithm, waiting, and the like. For example, if the event attribute is execution timeout and the preset event attribute value is sending mail, a mail reminder is sent.

The basic properties of the algorithm code are configured on the algorithm editor.

Specifically, the basic attributes include the name of the algorithm, the version of the algorithm, the creation time of the algorithm, a parameter list and the like, and the configuration of the basic attributes of the algorithm codes on the algorithm editor facilitates the management and inquiry of the algorithm codes.

The event attributes, the basic attributes, and the corresponding algorithm code are stored together to form an algorithm, and the algorithm is saved in the algorithm list.
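One possible shape for an entry in the algorithm list is sketched below. The field names, the sample algorithm, and the `on_event` helper are hypothetical; the source specifies only that basic attributes, event attributes, and code are stored together.

```python
from dataclasses import dataclass, field


@dataclass
class Algorithm:
    # Basic attributes named in the text.
    name: str
    version: str
    created_at: str
    parameters: list
    # The algorithm code itself, kept as source text.
    code: str
    # Event attribute -> event attribute value, e.g. timeout -> send mail.
    events: dict = field(default_factory=dict)


def on_event(algorithm, event):
    """Return the preset operation for an event, or None if none is set."""
    return algorithm.events.get(event)


algorithm_list = [
    Algorithm(
        name="threshold_detector",  # hypothetical example algorithm
        version="1.0",
        created_at="2018-12-29",
        parameters=["threshold"],
        code="result = value > threshold",
        events={"execution_timeout": "send_mail"},
    )
]
```

Looking up `on_event(algorithm_list[0], "execution_timeout")` returns `"send_mail"`, which the running environment would then carry out.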

502, selecting an algorithm from the algorithm list to process preset training data to obtain an algorithm file corresponding to each algorithm in the algorithm list;

Specifically, the training data comprises a plurality of historical data items labelled with processing results. An algorithm is selected from the algorithm list to process the preset training data, yielding the processing result and the algorithm file corresponding to that algorithm. The algorithm file includes the operation accuracy of the algorithm, which is obtained by comparing the processing result of the algorithm with the labelled processing results. By analogy, each algorithm in the algorithm list processes the training data, yielding a processing result and an algorithm file for each algorithm.

503, comparing the algorithm files, selecting an algorithm from the algorithm list according to the comparison result, and generating the preset model according to the selected algorithm.

Specifically, the algorithm file comprises the operation accuracy of the algorithm, the operation time of the algorithm, and the like. The algorithm files are compared by the operation accuracy they record, the algorithm with the highest operation accuracy is selected, and the preset model is generated according to the selected algorithm.
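Steps 502 and 503 can be sketched as an evaluate-then-select loop. The toy threshold algorithms and the labelled data are invented for illustration, and the "algorithm file" is represented here as a plain dictionary holding accuracy and run time.

```python
import time


def evaluate(algorithm, training_data):
    """Run one algorithm over labelled data and build its algorithm file."""
    start = time.perf_counter()
    predictions = [algorithm(value) for value, _ in training_data]
    elapsed = time.perf_counter() - start
    correct = sum(
        prediction == label
        for prediction, (_, label) in zip(predictions, training_data)
    )
    return {"accuracy": correct / len(training_data), "seconds": elapsed}


def select_best(algorithms, training_data):
    """Return the name of the most accurate algorithm and all algorithm files."""
    files = {name: evaluate(fn, training_data) for name, fn in algorithms.items()}
    best = max(files, key=lambda name: files[name]["accuracy"])
    return best, files


# Toy labelled history: (sensor value, is_fault).
training_data = [(1, False), (4, False), (6, True), (9, True)]
algorithms = {
    "threshold_5": lambda value: value > 5,
    "threshold_8": lambda value: value > 8,
}
best, files = select_best(algorithms, training_data)
```

On this data `threshold_5` classifies all four samples correctly and `threshold_8` misses one, so `threshold_5` is the algorithm from which the preset model would be generated.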

In an embodiment of the present application, the industrial model is used for processing digital assets, and the method comprises: editing a plurality of algorithms in an algorithm editor, and storing the algorithms in an algorithm list; selecting an algorithm from the algorithm list to process preset training data to obtain an algorithm file corresponding to each algorithm in the algorithm list; and comparing the algorithm files, selecting an algorithm from the algorithm list according to the comparison result, and generating a preset model according to the selected algorithm. With this preset model generation method, the digital assets of an enterprise can be converted into a preset model, and the converted industrial model can then be placed on the model market for transaction. The digital assets can thus be reused and their market value realized, while the efficiency with which the enterprise manages its digital assets is improved and the storage and circulation of the digital assets are achieved.

Before training data is obtained from a sample database, the sample database needs to be constructed, as shown in fig. 6, and the present application provides a method for constructing the sample database, which includes:

601, acquiring operation data of equipment;

Here, the device is an industrial device whose data is to be acquired, and the operation data includes one or more of: data acquired by a detection device provided on the industrial device, the hardware address of the industrial device itself, and a memory address storing an operation parameter of the industrial device. The operation data may take various forms, such as a time-series signal, an analog signal, a constant parameter, or an image.

Specifically, a data model is set to perform preliminary processing on the obtained operation data of the device, the data model comprises a source end data source and a target end data source, and the source end data source is connected with the target end data source and is used for analyzing the encrypted operation data of the device.

The acquiring of the operation data comprises the following steps:

Step 1, a device-side collector acquires the operation data of the device in real time and packs the operation data collected within a preset period of time.

Step 2, the source-end data source acquires the packed operation data from the device side.

Here, the device-side collector refers to equipment that is installed near the running device and collects the running state of the device in real time.

Specifically, the device-side collector transmits only new or changed data generated by the device; the target-end data source acquires the packed operation data from the device-side collector according to a preset device-side collector address, and transmits the packed operation data acquired from the device-side collector to the source-end data source.
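The behaviour of the device-side collector, which forwards only new or changed data and packs what it gathered within a period, might look like the following sketch. The `Collector` class and the reading names are hypothetical, and the timing of the preset period is left to the caller.

```python
class Collector:
    """Buffers only new or changed readings and hands them over as a package."""

    def __init__(self):
        self._last_sent = {}  # most recent value recorded for each reading
        self._buffer = []     # changes gathered during the current period

    def sample(self, readings):
        """Record one real-time sample, keeping only new or changed values."""
        delta = {
            name: value
            for name, value in readings.items()
            if name not in self._last_sent or self._last_sent[name] != value
        }
        if delta:
            self._buffer.append(delta)
            self._last_sent.update(delta)

    def pack(self):
        """Return the package for the elapsed period and start a new one."""
        package, self._buffer = self._buffer, []
        return package
```

A caller would invoke `sample` on every acquisition and `pack` once per preset period; a sample identical to the previous one adds nothing to the package.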

602, mapping the operation data into metadata according to a preset mapping relation;

here, the preset mapping relationship refers to a correspondence relationship between the numerical value and the type of the operation data of the device and the metadata. The metadata refers to a sample database feature (specifically, a feature value) corresponding to a feature of the equipment operation data at the current moment in the mapping relation.

The operation data of the device is data collected from different device ends, and generally has different forms, namely multi-source heterogeneous data. The data type and the numerical value of the operation data need to be mapped into metadata according to a preset mapping relation.

603, forming the metadata into the sample database.

Specifically, the sample database stores the sample database features corresponding to the operation data of all the devices at the previous moment, and the specific form of the sample database features is a feature value. The operational data of all devices corresponds to a number of feature values, so the sample database features corresponding to the operational data of all devices are actually feature vectors. Mapping the address of the equipment end into logic characteristics according to a preset mapping relation; a sample database is formed based on the logical characteristics and the metadata. The mapping relation also comprises a storage position of the characteristic vector corresponding to the address of the equipment end and the operation data generated by the equipment in the sample database. According to the address of the equipment end, the feature vector corresponding to the operation data of the equipment at the current moment is put into the address corresponding to the sample database according to the mapping relation, and the features of the sample database of the equipment are updated to form a new sample database.
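Steps 602 and 603 can be sketched as two lookups: one that maps each item of operation data to a feature value, and one that maps the device address to a storage position in the sample database. Both mapping tables, the device address, and the field names below are hypothetical.

```python
# Hypothetical preset mapping: (field name, data kind) -> feature value function.
MAPPING = {
    ("temperature", "analog"): lambda value: float(value),
    ("status", "constant"): lambda value: 1.0 if value == "running" else 0.0,
}

# Hypothetical mapping from device address to storage position.
SLOTS = {"10.0.0.5": "device_A"}


def to_feature_vector(operation_data):
    """Map heterogeneous operation data to a list of feature values."""
    return [MAPPING[(name, kind)](value) for name, kind, value in operation_data]


def update_sample_database(sample_db, address, operation_data):
    """Overwrite the stored feature vector of one device with the current one."""
    sample_db[SLOTS[address]] = to_feature_vector(operation_data)
    return sample_db
```

Each update replaces the feature vector stored for that device, which corresponds to forming a new sample database from the data at the current moment.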

In the prior art, complex and diverse industrial data is identified and processed by separate, type-specific techniques, and no sample database can be formed. In contrast, the present method forms a standard identification model and improves the efficiency of acquiring useful information from the operation data of the equipment.

Metadata is stored in an industrial metadata base. As shown in fig. 7, an embodiment of the present application further provides a method for constructing an industrial metadata base, including:

701, obtaining the original data of the data source.

Data sources include, but are not limited to, different file systems, such as ordinary files and Hadoop Distributed File System (HDFS) files, and different databases, such as MySQL, HBase, and MongoDB. When the original data is obtained from the data source, data source attributes are automatically configured for the original data. The attributes mainly comprise the source address of the original data, which is in URL (Uniform Resource Locator) format and comprises the name or IP address of the data source where the original data is located, the port used by the data source, the path to the original data, and the name of the original data. For example, the source address is configured as hdfs://xxx.x.x/a and jdbc:mysql://x.x.x.x:3306/db.
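A source address in URL format can be decomposed into the components listed above with the Python standard library. This is a minimal sketch: the `parse_source_address` helper and the host names in the usage note are hypothetical, and the `jdbc:` prefix is stripped by hand because `urlsplit` would otherwise treat `jdbc` as the scheme.

```python
from posixpath import basename
from urllib.parse import urlsplit


def parse_source_address(address):
    """Split a source address into type, host, port, path, and data name."""
    if address.startswith("jdbc:"):
        # JDBC addresses wrap an ordinary URL: jdbc:mysql://host:port/db
        address = address[len("jdbc:"):]
    parts = urlsplit(address)
    return {
        "type": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "name": basename(parts.path),
    }
```

For instance, `parse_source_address("jdbc:mysql://db.example:3306/db")` reports type `mysql`, host `db.example`, port 3306, and data name `db`.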

702, resolving the source address carried by the original data to obtain detailed data information of the original data.

In step 702, resolving the source address carried by the original data to obtain detailed data information of the original data includes the following steps:

Step 1, resolving the source address carried by the original data.

Resolving the source address includes determining whether the source address exists, whether the source address can be connected to, and whether there is permission to read the file at the source address.

Step 2, querying the mapping relation between source addresses and data source connection modes to obtain the data source connection mode mapped to the source address carried by the original data.

Step 3, acquiring detailed data information of the original data according to the obtained data source connection mode.

Different data sources extract the original data according to the mapped data source connection mode. A common file or an HDFS file can be divided into a plurality of blocks that are extracted asynchronously; a text file is extracted and cleaned in MapReduce (MR) mode, and an HDFS file is extracted in Spark mode. HBase, MongoDB, and the like extract the data meeting the conditions by direct query in ordinary SQL, and databases such as MySQL and HBase extract data in Sqoop mode.
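The mapping between source type and connection mode queried in step 2 can be pictured as a simple lookup table. The table below is an assumption for illustration: it picks one mode per source type, whereas the text lists HBase under both direct SQL query and Sqoop, and the mode names are plain labels rather than calls into any of those tools.

```python
# Hypothetical lookup table: source type -> extraction mode label.
CONNECTION_MODES = {
    "text": "mapreduce",
    "hdfs": "spark",
    "mongodb": "sql_query",
    "mysql": "sqoop",
}


def connection_mode(source_type):
    """Return the extraction mode mapped to a source type."""
    try:
        return CONNECTION_MODES[source_type.lower()]
    except KeyError:
        raise ValueError(f"no connection mode mapped for {source_type!r}") from None
```

Raising an error for an unmapped source type keeps an unknown data source from being silently extracted with the wrong mode.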

703, determining the data source and the data structure to which the original data belongs, and extracting the field matched with the mapped data feature from the detailed data information according to the data features mapped by the determined data source and the data structure to obtain the data feature of the original data.

Since the detailed data information contains a large amount of information content, if all the detailed data information is received, the subsequent data storage management is difficult, so that the detailed data information is screened according to the preset data characteristics of the original data, and the required data characteristics are extracted, wherein the data characteristics comprise, but are not limited to, data fields, data constraints, data values, data types, data structures and the like. Data types include, but are not limited to, int, long, string, varchar, timestamp, date, etc.; data structures include, but are not limited to Xml, json, parquet, keyvalue, lp, datagrid, etc.

704, extracting from the data characteristics of the original data according to a preset target metadata mapping table to obtain the target data characteristics to be stored.

Specifically, the data characteristics that match the data characteristics in the target metadata mapping table are extracted from the data characteristics of the original data.

Defining basic data types of metadata in a data source in advance, defining data types of target data by adopting an object-oriented language, converting the data types of any metadata into the data types of the target data through a Protobuf protocol, correlating the data types of the metadata and the data types of the target data through a structure ID, and storing the correlation in a target metadata mapping table. In addition to performing data type conversion, conversion between data features such as data constraint can also be performed.

The data characteristics of the original data comprise data structures. A data structure is selected from the target metadata mapping table, and the data structures matching the selected data structure are extracted from the data characteristics of the original data to obtain the target data to be stored. If a user needs to process the data further, the embodiment of the invention provides auxiliary data structures: the user only needs to input data, and other indexes to be calculated from the data, or outputs in report form, can be generated according to the auxiliary data structures.
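Step 704 can be sketched as filtering and converting fields through a mapping table. The table contents and field names are hypothetical, and plain Python type constructors are used here in place of the Protobuf-based conversion named in the text.

```python
# Hypothetical target metadata mapping table:
# source field -> (target field, target data type).
TARGET_MAPPING = {
    "dev_id": ("device_id", str),
    "temp": ("temperature", float),
    "ts": ("timestamp", int),
}


def to_target_features(record):
    """Keep only mapped fields, renaming them and converting their types."""
    target = {}
    for source_field, value in record.items():
        if source_field in TARGET_MAPPING:  # unmatched fields are dropped
            name, convert = TARGET_MAPPING[source_field]
            target[name] = convert(value)
    return target
```

Fields absent from the mapping table are discarded, which reflects the screening of detailed data information down to the required data characteristics.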

705, storing the target data characteristic to be stored in an industrial metadata base.

According to the preset attributes of the target data, attributes are configured for the target data, including basic attributes such as the storage address of the target data, the way the target data can be called, the URL path of the target data, and the account and password. The target data is divided into static data and dynamic data: for example, the equipment ID and the equipment approach time are static data, while time-series data is dynamic data. When dynamic data is stored, an item of information whose data structure is keyvalue and whose data field is a timestamp needs to be added. The constructed industrial metadata base can integrate data from different data sources and with different structures into a data combination with a uniform data structure, which facilitates storage management and application services such as adding, deleting, and querying.

In the industry, data from different data sources and different data structures are referred to as multi-source heterogeneous data, and data from the same data source but different data structures are referred to as complex heterogeneous data.

When the original data is multi-source heterogeneous data, as a specific embodiment, the multi-source heterogeneous data are respectively from a MongoDB data source, a MySQL data source and an ftp data source, detailed data information of the original data is obtained by analyzing a source address, and corresponding data features are extracted from the detailed data information according to preset data features of the original data, and the specific data features are shown in table (1):

Table (1)

The multi-source heterogeneous data is converted into HDFS data through a metadata mapping table, and the data is specifically shown in a table (2):

Table (2)

The target data integrated into the same data structure is stored in the industrial metadata base.

When the original data is complex heterogeneous data, in one embodiment, the complex heterogeneous data is derived from an ftp data source, including data in both xml format and json format. The detailed data information of the original data is obtained by analyzing the source address, and corresponding data characteristics are extracted from the detailed data information according to the preset data characteristics of the original data, wherein the detailed data characteristics are specifically shown in a table (3):

Table (3)

The complex heterogeneous data is converted into the HDFS data through the metadata mapping table, and the specific steps are shown in the table (4):

Table (4)

As shown in fig. 8, after each trained preset model is obtained, the method further includes:

801, obtaining a retrieval instruction, wherein the retrieval instruction comprises keyword information of a model.

The user converts intangible digital assets in the industrial field into a data identification industrial model through training and publishes the data identification industrial model on the model market, where it can be displayed after being reviewed by platform staff. The data identification industrial models in the model market are classified according to keyword information, which comprises the algorithm used to construct the data identification industrial model, the technical field, the application range, and the like, for example equipment type, predictive maintenance, discrete manufacturing, and process manufacturing. The model market acquires the algorithm used to construct the data identification industrial model and generates a corresponding algorithm icon; it acquires the training data and test data of the data identification industrial model and generates a corresponding data model icon; the algorithm icon and the data model icon together form the icon of the data identification industrial model in the model market. A model is normally displayed in the default layout, but model display schemes in different styles are provided in the background of the model market. The model display schemes are either free or paid, and after a user selects or purchases a model display scheme, the scheme is applied in the background to display the model in a personalized way. In addition, users can customize the model display layout in the background according to their own requirements, or develop a model display layout themselves using the development interface of the industrial model market platform and publish it to the background for other users to use.

The model market displays the data identification industrial models and provides value-added services for them. Based on the price of a data identification industrial model, the resources it uses, and its characteristics, the industrial model market platform provides an open business model with sustained growth: the owner of a data identification industrial model obtains revenue from the use right of the model, and the industrial model market platform obtains, through users' use of the data identification industrial model, the price difference on the transaction and the fees for resource use. Before being displayed in the model market, the data identification industrial model is obtained by the user improving a lower-value preset model; the improved data identification industrial model has higher value, so after it is uploaded to the model market it is priced higher than the original preset model, which solves the problem of the low value of the preset model and increases user revenue.

802, extracting model information matched with the keyword information from a database according to the keyword information of the retrieval instruction.

803, arranging the model information according to a preset ordering rule, loading the model information into a preset visual interface, and displaying the visualized model loaded with the model information.

The model information comprises the time at which the user uploaded the model, and the models are arranged from most recent to least recent by upload time, so that the user can find the data identification industrial models most recently uploaded to the platform. The model information comprises the download count of the model, and the models are arranged from most to fewest downloads, so that the user can find the data identification industrial models that other users download and use most often. The model information also comprises the price of the model, and the models are arranged in ascending or descending order of price.
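The preset ordering rules of step 803 can be sketched as a table of sort keys. The rule names and the sample model records below are invented for illustration.

```python
from operator import itemgetter

# Rule name -> (field to sort on, whether to sort in descending order).
SORT_RULES = {
    "newest": ("uploaded", True),
    "most_downloaded": ("downloads", True),
    "price_asc": ("price", False),
    "price_desc": ("price", True),
}


def arrange(models, rule):
    """Return the model information sorted by the named rule."""
    key, descending = SORT_RULES[rule]
    return sorted(models, key=itemgetter(key), reverse=descending)


# Hypothetical model information; ISO dates sort correctly as strings.
models = [
    {"name": "A", "uploaded": "2018-11-02", "downloads": 40, "price": 30},
    {"name": "B", "uploaded": "2018-12-20", "downloads": 15, "price": 10},
    {"name": "C", "uploaded": "2018-10-15", "downloads": 90, "price": 20},
]
```

With these records, the `newest` rule returns B, A, C and the `most_downloaded` rule returns C, A, B.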

Optionally, the model information includes model price information, and generating the model price information includes the steps of:

Step 1, acquiring, from the model information, the actual amount of data predicted by the model and the algorithm used to construct the model.

Step 2, calculating the prediction accuracy of the model.

Step 3, calculating the comprehensive score of the model according to the actual amount of data predicted by the model, the construction algorithm, and the prediction accuracy.

Step 4, the model information comprises the evaluation degree and the favorable rating rate of the model, and the comprehensive cost of the model is calculated according to the evaluation degree, the favorable rating rate, and the comprehensive score.

The evaluation degree of the model is obtained by weighting the user score and the prediction accuracy of the model, where the user score has a maximum of ten points and a minimum of one point; [1,6) is poor, [6,8) is medium, and [8,10] is good.

Step 5, calculating and generating the model price information according to the comprehensive cost.
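The user-score bands and the weighted evaluation degree used in step 4 can be sketched as follows. The sketch assumes the bands are half-open, so that a score of exactly 6 counts as medium and exactly 8 as good, and the equal weighting is a hypothetical default; the source gives neither the weights nor the formulas for the comprehensive score, comprehensive cost, or price.

```python
def rating_band(user_score):
    """Classify a user score on the 1 to 10 scale (half-open bands assumed)."""
    if not 1 <= user_score <= 10:
        raise ValueError("user score must lie between 1 and 10")
    if user_score < 6:
        return "poor"
    if user_score < 8:
        return "medium"
    return "good"


def evaluation_degree(user_score, accuracy, score_weight=0.5):
    """Weighted mix of the normalized user score and the prediction accuracy.

    accuracy is expected in [0, 1]; score_weight is an assumed parameter.
    """
    return score_weight * (user_score / 10) + (1 - score_weight) * accuracy
```

Under these assumptions a model with a user score of 8 and a prediction accuracy of 0.9 falls in the good band.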

804, inputting the acquired data information into the model for operation, and generating an operation result of the model.

805, loading the data information and the operation result into a preset model operation report template to generate a model operation report.

The generation of the model operation report template in step 805 specifically includes the steps of:

Step 1, determining a construction algorithm, automatically matching a system template, and setting attribute information of the construction algorithm.

Step 2, obtaining the metadata configuration, the construction algorithm, and the knowledge base model.

Step 3, determining the RDD (Resilient Distributed Datasets) relation among the metadata configuration, the construction algorithm, and the knowledge base model.

In one embodiment, the relation runs from the metadata configuration to industrial algorithm 1, from industrial algorithm 1 to industrial algorithm 2, and from industrial algorithm 2 to the knowledge base model.

Step 4, generating a model operation report template according to the RDD relation and the attribute information.

Step 5, storing the model operation report template.

In addition to using the model operation report template in the default style, the user may also personalize the templates so that the report style and content of each template are different.

The update of the data information can be classified into the following three cases: when the model runs again after parameters are adjusted or data are input, the model running report is automatically updated on the basis of the original report; when the algorithm of the model or the data of the model are automatically matched, the model running report is automatically adapted and updated; the model run report also automatically adapts to updates as the user of the model changes or the run environment changes.

According to the model operation report generation method, a user retrieves standard data identification industrial models that have been converted from intangible digital assets in the industrial field and are uniformly stored in the model market, so that the various experience-based soft assets of an enterprise become lasting digital assets, which facilitates the permanent storage and circulation of enterprise digital assets. Meanwhile, the model market also prevents false and poor-quality data identification industrial models from being passed off as genuine to profit illegally in the model market, thereby solving the technical problem that digital assets in the prior art cannot be uniformly managed and commercialized.

As shown in fig. 9, the configuring initial parameters of each preset model includes:

901, visually configuring a data source, generating a metadata table according to a metadata mapping relation corresponding to the data source, and combining data of a target terminal according to the metadata table;

The metadata comprises a data source, a connection attribute, target data, data characteristics and the like.

Specifically, a data source is configured in an industrial model management background in a dragging mode, metadata corresponding to the data source is mapped according to a mapping relation of the metadata, a metadata table is obtained after the mapping result is tidied, and when the target end needs to combine and configure data of the target end corresponding to the metadata, quick and flexible mapping can be performed through the metadata table.

When configuring the data source, if the data source is an information system, a knowledge base or an FTP server, etc., the corresponding connection attribute, such as URL address, etc., is configured to establish the source-to-destination data connection.

In step 901 above, visually configuring a data source, generating a metadata table according to the metadata mapping relation corresponding to the data source, and combining the data of the target end according to the metadata table includes the following: when the data source is a complex heterogeneous data source, integrating the plurality of data sources in the complex heterogeneous data source according to their respective connection modes and reading modes to obtain a complex heterogeneous data source configuration scheme.

Specifically, when the data source is formed by combining multiple data sources with different data characteristics, the processing method corresponding to each data source, namely its connection mode and reading mode, needs to be acquired first, and all the processing methods are then integrated into one configuration scheme for the data source.

Configuring the complex heterogeneous data source according to the complex heterogeneous data source configuration scheme.

Specifically, each data source is configured with its corresponding connection attributes according to the configuration scheme.

Generating a data source assembly from the various data sources in the complex heterogeneous data source.

Specifically, the multiple data sources in the complex heterogeneous data source are mapped into the same data source by metadata mapping and are then combined into a data source assembly.

Combining the data source assembly with the auxiliary data source features to generate metadata corresponding to the data source assembly.

Specifically, the metadata of the data source assembly is generated after the data source assembly and the auxiliary data source features are combined.

Generating an assembly metadata table according to the metadata mapping relation corresponding to the data source assembly.

Specifically, the metadata of the data source assembly is mapped, and the result is organized into an assembly metadata table.

Combining the data of the target end according to the assembly metadata table.

Specifically, when the target end needs to combine and configure its data corresponding to the metadata of the data source assembly, fast and flexible mapping can be performed through the assembly metadata table.
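The handling of a complex heterogeneous data source can be illustrated with the following minimal sketch. It assumes two hypothetical sources (a CSV export and a JSON export) whose payloads are given inline; the reader functions, the `scheme` structure and the field names are assumptions made for illustration only.

```python
import csv
import io
import json


# Hypothetical reading modes, one per kind of source.
def read_csv_text(text):
    return list(csv.DictReader(io.StringIO(text)))


def read_json_text(text):
    return json.loads(text)


# Configuration scheme: payload (standing in for the connection),
# reading mode and metadata mapping for every source.
scheme = [
    {"name": "erp", "payload": "id,len\n1,120\n", "reader": read_csv_text,
     "mapping": {"id": "tool_id", "len": "blade_length"}},
    {"name": "mes", "payload": '[{"tool": "1", "mat": "carbide"}]',
     "reader": read_json_text,
     "mapping": {"tool": "tool_id", "mat": "material"}},
]


def build_assembly(scheme):
    """Map every source onto shared target field names and merge rows by tool_id."""
    assembly = {}
    for src in scheme:
        for row in src["reader"](src["payload"]):
            mapped = {src["mapping"][k]: v
                      for k, v in row.items() if k in src["mapping"]}
            assembly.setdefault(mapped["tool_id"], {}).update(mapped)
    return assembly


assembly = build_assembly(scheme)
```

Each source keeps its own reading mode, while the shared target field names let rows from different sources be merged into one assembly record.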

Step 902: configuring initial parameters of each preset model according to historical parameters in an industrial knowledge base.

Specifically, historical training parameters are stored in the industrial knowledge base, and each preset model is preliminarily configured with these historical parameters before model training.
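A minimal sketch of step 902 is given below, assuming the industrial knowledge base can be viewed as a mapping from a model name to a chronological list of historical parameter sets. The default values, the parameter names and the rule of taking the most recent entry are assumptions, not details disclosed in the application.

```python
# Hypothetical fallback values used when no history exists.
DEFAULTS = {"learning_rate": 0.01, "epochs": 10}


def initial_parameters(model_name, knowledge_base):
    """Start from defaults and override them with the latest historical parameters."""
    history = knowledge_base.get(model_name, [])
    params = dict(DEFAULTS)
    if history:
        params.update(history[-1])
    return params


knowledge_base = {
    "tool_life": [{"learning_rate": 0.1},
                  {"learning_rate": 0.05, "epochs": 30}],
}
tool_life_params = initial_parameters("tool_life", knowledge_base)
new_model_params = initial_parameters("unseen_model", knowledge_base)
```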

The report generated after the operation of the data identification industrial model comprises an industrial knowledge graph. As shown in fig. 10, the application provides a construction method of the industrial knowledge graph, which comprises the following steps:

Step 1001: acquiring a feature vector of data to be processed.

In the embodiment of the application, the data to be processed may be a plurality of operation data of industrial equipment, such as data collected by a sensor on the equipment, a hardware address of the equipment, and data stored in a memory of the equipment. The data to be processed may also be parameters of a part or product; for a cutter, for example, these may be parameters such as the size, the material and the sharpness of the cutter. The data to be processed is multi-source heterogeneous data and cannot be processed directly by the algorithm model, so it is mapped into the form of a feature vector. The method for constructing the industrial knowledge graph provided by this embodiment is described below by taking the prediction of the service life of a cutter as an example.

In step 1001, the feature vector of the data to be processed is obtained, which includes the following steps:

Step 1: establishing a data model, wherein the data model comprises a source end data source, a target end data source and a mapping relation between the source end data source and the target end data source.

Step 2: acquiring data to be processed from a source end through the source end data source.

Step 3: acquiring the target end data source to which the data to be processed is mapped, based on the mapping relation between the source end data source and the target end data source, and obtaining the feature vector of the data to be processed.

Specifically, the source end data source of the data model is connected with a source end. When the data to be processed relates to equipment, the source end can be a memory, a file or a database storing equipment parameters, or an acquisition component; when the data to be processed relates to a part or a product, the source end can be a memory, a file or a database storing parameters of the product or the part. The generated feature vector can be obtained from the target end data source. The mapping relation between the source end data source and the target end data source specifically refers to the configuration for processing the data to be processed into the feature vector. The mapping relation is stored in a metadata base of the data model. When the source end data source collects the data to be processed from the source end, the metadata mapping component in the data model invokes the mapping relation in the metadata base to map the data to be processed into the feature vector.

For example, the source end may be an uploaded file containing the size information of a given type of cutter, such as the length of the cutting edge, the length of the cutter body, the width of the cutter body and the length of the cutter handle. The source end may also be material information of the cutter stored in a database, such as the material of the cutter body and the material of the cutter handle. The source end data source acquires this information, and the metadata mapping component in the data model calls the mapping relation between the source end data source and the target end data source stored in the metadata base. As one possible mapping relation, all the size information of the cutter corresponds to one feature value. As another possible mapping relation, the length of the cutter body, the width of the cutter body and the length of the cutter handle correspond to one feature value, and the length of the cutting edge corresponds to another feature value. The obtained feature values are spliced in a set order to form a high-dimensional feature vector.
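The two possible mapping relations can be illustrated with the following sketch. The application does not state how several parameters are reduced to one feature value, so the arithmetic mean used here is purely an assumption, as are the function name, the field names and the numeric values.

```python
def to_feature_vector(record, mapping, order):
    """Compute one feature value per group (here: the mean, an assumed rule)
    and splice the values in the given order."""
    features = {name: sum(record[f] for f in fields) / len(fields)
                for name, fields in mapping.items()}
    return [features[name] for name in order]


# Hypothetical size information of one cutter, in millimetres.
record = {"edge_length": 40.0, "body_length": 100.0,
          "body_width": 20.0, "handle_length": 60.0}

# Mapping relation 1: all size information corresponds to one feature value.
mapping_one = {"size": ["edge_length", "body_length", "body_width", "handle_length"]}

# Mapping relation 2: body and handle sizes form one feature value,
# the cutting edge length forms another.
mapping_two = {"body": ["body_length", "body_width", "handle_length"],
               "edge": ["edge_length"]}

vector_one = to_feature_vector(record, mapping_one, ["size"])
vector_two = to_feature_vector(record, mapping_two, ["body", "edge"])
```

The `order` argument plays the role of the set sequence in which feature values are spliced, so the same record yields vectors of different dimension under different mapping relations.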

Step 1002: establishing and initializing an algorithm model, wherein the algorithm model includes a first deep learning network and a second deep learning network.

In this embodiment, the feature vector is trained by the first deep learning network of the algorithm model to form knowledge (that is, features extracted from the data), the knowledge is fed into the second deep learning network of the algorithm model for training to form new knowledge (that is, newly extracted features), and a relationship between the knowledge and the new knowledge is established.

As an alternative embodiment, initializing the algorithm model includes the following (1) and (2):

(1) Configuring the index of the data to be processed in the first deep learning network and the second deep learning network.

(2) Setting a training algorithm of the first deep learning network and a training algorithm of the second deep learning network.

Specifically, the index of the data to be processed refers to the address of the data to be processed at the source end, through which the first deep learning network and the second deep learning network can acquire the data to be processed for training to form knowledge. The data to be processed acquired by the first deep learning network and by the second deep learning network may be different. The index of the data to be processed is configured on the page for configuring the first deep learning network, and the algorithm configuration is completed by dragging an algorithm. The second deep learning network is configured in the same way, but with an algorithm different from that of the first deep learning network.

Step 1003: generating a first knowledge unit based on the first deep learning network and the feature vector of the data to be processed.

Step 1003 specifically includes inputting the feature vector of the data to be processed into the first deep learning network to obtain a model training result, and inputting the model training result into a knowledge base, so that the knowledge base generates a first knowledge unit according to the model training result and the mapping relation between model training results and knowledge units.

Specifically, the first deep learning network is used for extracting features of the data to be processed, the model training result (prediction result) refers to an evaluation level of the data to be processed, and the first knowledge unit refers to the evaluation corresponding to that evaluation level. For example, parameters of the cutter (such as its size, material and cutting edge sharpness) are collected and input into the first deep learning network for training, and a symbol representing the evaluation level of the service life of the cutter is output, wherein the evaluation level symbols comprise four levels A, B, C and D. The knowledge base stores the evaluation corresponding to each evaluation level; for example, A corresponds to a very long service life, B to a long service life, C to an average service life, and D to a short service life.
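The lookup from evaluation level to first knowledge unit can be sketched as below. The threshold function is only a stand-in for the first deep learning network, which the application does not specify in detail; the thresholds, the function names and the exact wording of the four evaluations are assumptions.

```python
# Knowledge base: evaluation stored for each evaluation level (assumed wording).
EVALUATIONS = {
    "A": "very long service life",
    "B": "long service life",
    "C": "average service life",
    "D": "short service life",
}


def predict_level(feature_vector):
    """Stand-in for the first deep learning network: threshold the mean score."""
    score = sum(feature_vector) / len(feature_vector)
    if score >= 0.75:
        return "A"
    if score >= 0.5:
        return "B"
    if score >= 0.25:
        return "C"
    return "D"


def first_knowledge_unit(feature_vector):
    """Map the predicted level to the evaluation stored in the knowledge base."""
    level = predict_level(feature_vector)
    return {"level": level, "evaluation": EVALUATIONS[level]}


good_tool = first_knowledge_unit([0.9, 0.8])
worn_tool = first_knowledge_unit([0.1, 0.2])
```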

Step 1004: generating a second knowledge unit, and a weight value between the first knowledge unit and the second knowledge unit, based on the second deep learning network and the first knowledge unit.

As an optional implementation manner, inputting the first knowledge unit and the feature vector of the data to be processed into a training layer of a second deep learning network to obtain a second knowledge unit output by the training layer; and inputting the first knowledge unit and the second knowledge unit into a weight value generation layer of the second deep learning network to generate weight values of the first knowledge unit and the second knowledge unit.

Specifically, the second knowledge unit refers to a new knowledge unit formed by combining the first knowledge unit with the data to be processed. The first knowledge unit generated in step 1003 and the feature vector of the data to be processed are input to the training layer of the second deep learning network for further training, so as to form the second knowledge unit. The first knowledge unit and the second knowledge unit (the training result of the training layer of the second deep learning network) are then input to the weight value generation layer of the second deep learning network, and the second deep learning network outputs a weight value. The weight value represents how closely the two knowledge units are related: the greater the weight value, the closer the relationship between the two knowledge units.

After the service life of the cutter is evaluated, data and evaluation of the service life of the cutter are input into a training layer of a second deep learning network for retraining, so that more accurate evaluation, such as influence factors of the service life, is obtained. And simultaneously inputting the evaluation of the service life of the cutter and the influence factors of the service life into the weight generation layer to obtain the weight value of the connection tightness degree of the evaluation of the service life of the cutter and the influence factors of the service life, wherein the higher the numerical value, the higher the tightness degree.

As another alternative embodiment, the first knowledge unit and the feature vector of other data are input to a training layer of the second deep learning network, and a second knowledge unit output by the training layer is obtained.

For example, after the service life of the cutter is evaluated, the noise data of the machine operation and the evaluation are input into the training layer of the second deep learning network for retraining, so that the noise generated when the cutter is used is obtained. The evaluation of the service life of the cutter and the noise generated when the cutter is used are then input into the weight value generation layer to obtain the weight value between the evaluation of the service life of the cutter and the noise generated when the cutter is used.
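The application does not disclose how the weight value generation layer computes its output. Purely as an illustration of a score that grows as two knowledge units become more closely related, the sketch below substitutes cosine similarity between two hypothetical embedding vectors of the knowledge units; it is not the layer described above, and the vectors are made-up values.

```python
import math


def cosine_weight(u, v):
    """Cosine similarity of two vectors, used here as a stand-in weight value."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical embeddings of three knowledge units.
life_evaluation = [1.0, 0.0]
influence_factors = [1.0, 0.0]
unrelated_unit = [0.0, 1.0]

close_weight = cosine_weight(life_evaluation, influence_factors)
far_weight = cosine_weight(life_evaluation, unrelated_unit)
```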

Step 1005: generating an industrial knowledge graph comprising a directed graph from the first knowledge unit to the second knowledge unit according to the weight values.

Specifically, the first knowledge unit and the second knowledge unit represent the knowledge unit and the trained knowledge unit, respectively. Each knowledge unit can be represented by a node. Because the trained knowledge unit is obtained by training the knowledge unit, knowledge flows from the knowledge unit to the trained knowledge unit, and this flow can be represented by the direction of the connecting line. The weight value indicates how closely the knowledge unit and the trained knowledge unit are related and can be represented by the length of the connecting line. In this way, a directed graph of knowledge units and trained knowledge units is constructed according to the weight values to form the knowledge graph. The user can select a display mode of the knowledge graph, which may be a topological graph, a chart or a mind map; if the display mode is a topological graph or a mind map, the user can edit the knowledge graph by dragging, and the edited knowledge graph can be exported.
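A directed graph of this kind can be sketched with a plain adjacency dictionary. The node labels and weight values below are invented for illustration, and the rule that a larger weight is drawn as a shorter line is an assumption; the application only says that the weight can be represented by the length of the connecting line.

```python
def build_knowledge_graph(edges):
    """Build an adjacency dict from (first_unit, second_unit, weight) triples.
    Each edge points from the knowledge unit to the trained knowledge unit."""
    graph = {}
    for src, dst, weight in edges:
        graph.setdefault(src, {})[dst] = weight
        graph.setdefault(dst, {})   # make sure the target node exists
    return graph


def edge_length(weight, base=100.0):
    """Hypothetical layout rule: a closer relation (larger weight) gives a shorter line."""
    return base / (1.0 + weight)


graph = build_knowledge_graph([
    ("tool life: long", "influence factors", 0.8),
    ("tool life: long", "noise level", 0.3),
])
```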

As shown in fig. 11, an embodiment of the present application provides a data processing apparatus, including:

an obtaining module 1101, configured to obtain a data identification request of a user, where the data identification request carries data to be identified;

a determining module 1102, configured to determine, for the user, a data identification industrial model based on attribute information of the data to be identified;

the computing module 1103 is configured to input the data to be identified to the data identification industrial model for identification processing, so as to obtain an identification result;

and a feedback module 1104, configured to send the identification result to the user.
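The cooperation of modules 1101 to 1104 can be sketched as a single class. The request layout, the use of a `type` attribute to choose the model and the toy vibration model are assumptions made for illustration; the application does not prescribe them.

```python
class DataProcessingApparatus:
    def __init__(self, models):
        # Maps an attribute of the data (here: its type) to a callable model.
        self.models = models

    def obtain(self, request):            # obtaining module 1101
        return request["data"]

    def determine(self, data):            # determining module 1102
        return self.models[data["type"]]

    def compute(self, model, data):       # computing module 1103
        return model(data["values"])

    def feedback(self, user, result):     # feedback module 1104
        return {"user": user, "result": result}

    def handle(self, request):
        data = self.obtain(request)
        model = self.determine(data)
        return self.feedback(request["user"], self.compute(model, data))


# Hypothetical model: flags vibration data whose peak exceeds a threshold.
apparatus = DataProcessingApparatus(
    {"vibration": lambda values: "abnormal" if max(values) > 5 else "normal"}
)
response = apparatus.handle(
    {"user": "u1", "data": {"type": "vibration", "values": [1, 7, 2]}}
)
```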

Optionally, the determining module 1102 is specifically configured to:

Optionally, the data processing apparatus further includes: a training module 1105, the training module 1105 being configured to:

acquiring training data from a sample database;

Optionally, the data processing apparatus further includes: an environment configuration module 1106, the environment configuration module 1106 being specifically configured to:

Optionally, the data processing apparatus further includes: a model building module 1107, the model building module 1107 being specifically configured to:

comparing the algorithm files, selecting an algorithm from the algorithm list according to a comparison result, and generating the preset model according to the selected algorithm.

Optionally, the data processing apparatus further includes: a data preparation module 1108, the data preparation module 1108 being specifically configured to:

acquiring operation data of equipment;

the metadata is formed into the sample database.

Optionally, the data processing apparatus further includes: an industrial metadata base construction module 1109, the industrial metadata base construction module 1109 being specifically configured to:

acquiring original data of a data source;

Optionally, the data processing apparatus further includes: a generating module 1110, where the generating module 1110 is specifically configured to:

Optionally, the data processing apparatus further includes: an initial parameter configuration module 1111, where the initial parameter configuration module 1111 is specifically configured to:

Corresponding to the data processing method in fig. 1, the embodiment of the present application further provides a computer device 1200, as shown in fig. 12, where the device includes a memory 1201, a processor 1202, and a computer program stored in the memory 1201 and capable of running on the processor 1202, where the steps of the data processing method are implemented when the processor 1202 executes the computer program.

Specifically, the memory 1201 and the processor 1202 may be a general-purpose memory and a general-purpose processor, which are not specifically limited herein. When the processor 1202 runs the computer program stored in the memory 1201, the data processing method described above can be executed, which solves the problem of low data processing efficiency in the prior art: the data identification industrial model required by the user is determined from the data to be identified carried in the user's data identification request, the data to be identified is input into the data identification industrial model to obtain an identification result, and the result is fed back to the user, so that the user obtains the required result.

Corresponding to the data processing method in fig. 1, the embodiment of the present application further provides a computer readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, performs the steps of the data processing method described above.

Specifically, the storage medium can be a general storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the data processing method described above can be executed, which solves the problem of low data processing efficiency in the prior art: the data identification industrial model required by the user is determined from the data to be identified carried in the user's data identification request, the data to be identified is input into the data identification industrial model to obtain an identification result, and the result is fed back to the user, so that the user obtains the required result. Because the corresponding data identification industrial model can be found through the user's data identification request, the time the user spends looking up the data identification industrial model corresponding to the data to be identified is shortened and the user's work efficiency is improved. In addition, the data identification industrial model obtained through the data identification request is targeted at the data to be identified in that request, so the result obtained when the data identification industrial model processes the data to be identified is more accurate.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is merely a division by logical function, and other divisions are possible in actual implementation; as a further example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interface, device or unit, and may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments provided in the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

It should be noted that: like reference numerals and letters in the following figures denote like items, and thus once an item is defined in one figure, no further definition or explanation of it is required in the following figures, and furthermore, the terms "first," "second," "third," etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.

Finally, it should be noted that the foregoing examples are merely specific embodiments of the present application, intended to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing examples, those skilled in the art will appreciate that any person skilled in the art may, within the technical scope disclosed in the present application, modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be encompassed within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of data processing, comprising:

sending the identification result to the user;

wherein the data identification industrial model is obtained by training in the following manner:

acquiring training data from a sample database;

determining the data identification industrial model from all the preset models which finish training based on the model accuracy corresponding to all the preset models;

wherein, for each preset model, initial parameters of each preset model are configured, comprising the following steps:

acquiring data of a plurality of different features according to a preset model;

arranging the data to form a matrix model;

carrying out data serialization processing on the resilient distributed dataset of the matrix model, carrying out association analysis on the resilient distributed dataset of the matrix model, and selecting and combining the resilient distributed datasets of the matrix model whose association degree is smaller than a preset association degree value to obtain training data corresponding to the industrial model;

after said determining the data identification industrial model for the user, the method further comprises:

2. The method of claim 1, wherein the determining the data identification industrial model for the user based on the attribute information of the data to be identified comprises:

3. The method of claim 1, wherein the preset model is constructed in the following manner:

4. The method of claim 1, wherein constructing the sample database according to the following manner comprises:

acquiring operation data of equipment;

the metadata is formed into the sample database.

5. The method as recited in claim 1, further comprising:

acquiring original data of a data source;

6. The method of claim 1, further comprising, after said obtaining each training-completed preset model:

7. The method of claim 1, wherein said configuring initial parameters of each of said preset models comprises:

8. A data processing apparatus, comprising:

the feedback module is used for sending the identification result to the user;

The data processing apparatus further comprises a training module, wherein the training module is configured to:

acquiring training data from a sample database;

the training module is further configured to configure initial parameters of each preset model for each preset model, including:

arranging the data to form a matrix model;

The data processing apparatus further comprises an environment configuration module for: