CN116775568A - Data service release method, device, equipment and medium based on business domain - Google Patents

Publication number
CN116775568A
Authority
CN
China
Prior art keywords
service
data
data set
original
service data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310648725.0A
Other languages
Chinese (zh)
Inventor
李广
贺长荣
沈亮
杨凯
张峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Duodian Life Chengdu Technology Co ltd
Original Assignee
Duodian Life Chengdu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Duodian Life Chengdu Technology Co ltd filed Critical Duodian Life Chengdu Technology Co ltd
Priority to CN202310648725.0A priority Critical patent/CN116775568A/en
Publication of CN116775568A publication Critical patent/CN116775568A/en

Abstract

Embodiments of the present disclosure disclose a business domain-based data service release method, apparatus, device, and medium. One embodiment of the method comprises: creating a data service directory; extracting an original business data set from a business database corresponding to a target business domain; storing the original business data set in a central database; performing data cleaning on the original business data set stored in the central database to obtain a business data set; receiving extended input data; generating a service data set from the business data set and the extended input data; creating a data service document corresponding to the service data set; generating a data service file from the service data set and the data service document; and, in response to receiving a data service release request, inserting the data service file name into the data service directory. This embodiment can provide data services to users in the form of data service files, reducing the number of times service data is regenerated and thereby reducing the waste of computer resources.

Description

Data service release method, device, equipment and medium based on business domain
Technical Field
Embodiments of the present disclosure relate to the technical field of computers, and in particular to a business domain-based data service release method, apparatus, device, and medium.
Background
A data service refers to the process of cleaning original business data to generate service data that helps users make business decisions. Currently, data services are generally implemented as follows: in response to a user's data service request, data cleaning is performed on the original business data, and the resulting service data is provided directly to the user.
However, the inventors have found that when data services are implemented in the above manner, the following technical problems often arise:
First, each time a user applies for a data service, the service data needs to be regenerated, which wastes computer resources.
Second, because business data involves background knowledge of the corresponding business field, abnormal business data is detected manually, which is time-consuming and inefficient and wastes computer resources.
The above information disclosed in this background section is only for enhancement of understanding of the background of the inventive concept and, therefore, may contain information that does not form the prior art that is already known to those of ordinary skill in the art in this country.
Disclosure of Invention
This section is intended to introduce concepts in a simplified form that are further described below in the Detailed Description. This section is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a business domain-based data service release method, apparatus, electronic device, and computer-readable medium to solve one or more of the technical problems mentioned in the Background section above.
In a first aspect, some embodiments of the present disclosure provide a business domain-based data service release method, the method comprising: creating a data service directory in a central database; extracting an original business data set from a business database corresponding to a target business domain; storing the extracted original business data set in the central database; performing data cleaning on the original business data set stored in the central database to obtain a business data set; receiving extended input data, wherein the extended input data comprises at least one of manual input data and external call data; generating a service data set from the business data set and the extended input data; creating a data service document corresponding to the service data set, wherein the data service document comprises preset output format information corresponding to the service data set; generating a data service file from the service data set and the data service document, wherein the data service file corresponds to a data service file name; and, in response to receiving a data service release request, inserting the data service file name into the data service directory.
In a second aspect, some embodiments of the present disclosure provide a business domain-based data service release apparatus, the apparatus comprising: a first creation unit configured to create a data service directory in a central database; an extraction unit configured to extract an original business data set from a business database corresponding to a target business domain; a storage unit configured to store the extracted original business data set in the central database; a data cleaning unit configured to perform data cleaning on the original business data set stored in the central database to obtain a business data set; a receiving unit configured to receive extended input data, wherein the extended input data comprises at least one of manual input data and external call data; a first generation unit configured to generate a service data set from the business data set and the extended input data; a second creation unit configured to create a data service document corresponding to the service data set, wherein the data service document comprises preset output format information corresponding to the service data set; a second generation unit configured to generate a data service file from the service data set and the data service document, wherein the data service file corresponds to a data service file name; and an insertion unit configured to insert the data service file name into the data service directory in response to receiving a data service release request.
In a third aspect, some embodiments of the present disclosure provide an electronic device comprising: one or more processors; and a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the implementations of the first aspect above.
In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method described in any of the implementations of the first aspect above.
The above embodiments of the present disclosure have the following advantageous effects: with the business domain-based data service release method of some embodiments of the present disclosure, data services can be provided to users in the form of data service files, which reduces the number of times service data is regenerated and reduces the waste of computer resources. Specifically, computer resources are wasted because service data needs to be regenerated each time a user applies for a data service. Based on this, the business domain-based data service release method of some embodiments of the present disclosure first creates a data service directory in a central database. Thus, business domain users can look up generated data service files through the data service directory. Then, an original business data set is extracted from the business database corresponding to the target business domain. Thus, the original business data set corresponding to the target business domain can be obtained. Then, the extracted original business data set is stored in the central database. Next, data cleaning is performed on the original business data set stored in the central database to obtain a business data set. Next, extended input data is received, where the extended input data includes at least one of manual input data and external call data. Then, a service data set is generated from the business data set and the extended input data. Thus, the service data set used to generate the data service file can be obtained. Thereafter, a data service document corresponding to the service data set is created, where the data service document includes preset output format information corresponding to the service data set. Thus, the data service document used to generate the data service file can be obtained. Next, a data service file is generated from the service data set and the data service document, where the data service file corresponds to a data service file name. Thus, a data service file containing the service data set corresponding to the target business domain can be obtained. Finally, in response to receiving a data service release request, the data service file name is inserted into the data service directory. Thus, a business domain user can look up the file name of the data service file in the data service directory to obtain the data service file. Moreover, because the data service is provided to the user in the form of a data service file, the number of times the service data is regenerated can be reduced, which reduces the waste of computer resources.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of some embodiments of a business domain-based data service release method according to the present disclosure;
FIG. 2 is a schematic structural diagram of some embodiments of a business domain-based data service release apparatus according to the present disclosure;
FIG. 3 is a schematic structural diagram of an electronic device suitable for implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings. Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "one" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The present disclosure involves operations such as the collection, storage, and use of users' personal information (e.g., original business data). Before performing the corresponding operations, the relevant organization or individual shall fulfil its obligations as fully as possible, including carrying out a personal information security impact assessment, fulfilling the duty to inform the personal information subject, and obtaining the prior authorized consent of the personal information subject.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates a flow 100 of some embodiments of a business domain-based data service release method according to the present disclosure. The business domain-based data service release method comprises the following steps:
Step 101, creating a data service directory in a central database.
In some embodiments, an execution body (e.g., a computing device) of the business domain-based data service release method may create a data service directory in a central database. The central database may be a database corresponding to the execution body. The data service directory may be a component or a table in the central database.
The execution body may be hardware or software. When the execution body is hardware, it may be implemented as a distributed cluster formed by a plurality of servers or terminal devices, or as a single server or a single terminal device. When the execution body is software, it may be installed in the hardware devices listed above. It may be implemented as a plurality of pieces of software or software modules, for example, for providing distributed services, or as a single piece of software or a single software module. No particular limitation is imposed herein. It should be appreciated that there may be any number of computing devices, as required by the implementation.
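As a minimal sketch of step 101, assume the central database is a relational store (an in-memory SQLite database stands in for it here) and that the data service directory is realized as a table. The table name `data_service_directory` and its columns are hypothetical and are not taken from the disclosure.

```python
import sqlite3


def create_data_service_directory(conn: sqlite3.Connection) -> None:
    """Create the directory table if it does not exist yet (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS data_service_directory (
            file_name    TEXT PRIMARY KEY,  -- data service file name
            domain       TEXT NOT NULL,     -- business domain the file belongs to
            published_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()


# In-memory database used as a stand-in for the central database.
central_db = sqlite3.connect(":memory:")
create_data_service_directory(central_db)
```

Using `CREATE TABLE IF NOT EXISTS` keeps the call idempotent, so the directory can be created at startup without checking whether it is already present.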
Step 102, extracting an original business data set from a business database corresponding to the target business domain.
In some embodiments, the execution body may extract the original business data set from the business database corresponding to the target business domain. The target business domain may be the business domain corresponding to the business system that issues a data service request to the execution body. A business domain may be a business sub-domain obtained by manually dividing a business field according to business type. For example, the business field may be the logistics field. The business domains may include, but are not limited to, an order domain, a distribution domain, and an item domain. The business systems include, but are not limited to, an order system corresponding to the order domain, a distribution system corresponding to the distribution domain, and an item system corresponding to the item domain. The business database may be the database corresponding to the target business domain. In practice, the execution body may extract the original business data set from the business database corresponding to the target business domain through a data extraction tool (e.g., an ETL tool). The original business data set may be a data set that is generated by the business system corresponding to the target business domain and has not undergone data cleaning. For example, the target business domain may be the item domain. The original business data set may then be an item data set generated by the item system corresponding to the item domain that has not undergone data cleaning. The item data may include an item name field, an item number field, an item value field, and a promotion status field.
Step 103, storing the extracted original business data set to a central database.
In some embodiments, the execution body may store the extracted original business data set in the central database.
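Steps 102 and 103 can be sketched as a plain copy between two databases. This is only an illustration under stated assumptions: both databases are SQLite, and the table names `items` and `raw_business_data` and the column names (modelled on the item fields mentioned above) are hypothetical. A real deployment would more likely use an ETL tool, as the text notes.

```python
import sqlite3


def extract_and_store(business_db: sqlite3.Connection,
                      central_db: sqlite3.Connection,
                      domain: str) -> int:
    """Copy all rows of the business table into a raw staging table; return the row count."""
    central_db.execute(
        "CREATE TABLE IF NOT EXISTS raw_business_data ("
        "domain TEXT, item_name TEXT, item_number TEXT, "
        "item_value REAL, promotion_status TEXT)"
    )
    # Step 102: extract the original (uncleaned) business data.
    rows = business_db.execute(
        "SELECT item_name, item_number, item_value, promotion_status FROM items"
    ).fetchall()
    # Step 103: store it unchanged in the central database.
    central_db.executemany(
        "INSERT INTO raw_business_data VALUES (?, ?, ?, ?, ?)",
        [(domain, *row) for row in rows],
    )
    central_db.commit()
    return len(rows)


# Toy business database for the item domain.
business_db = sqlite3.connect(":memory:")
business_db.execute(
    "CREATE TABLE items (item_name TEXT, item_number TEXT, "
    "item_value REAL, promotion_status TEXT)"
)
business_db.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [("apple", "A-001", 3.5, "on_sale"), ("pear", "A-002", 4.0, "none")],
)
central_db = sqlite3.connect(":memory:")
copied = extract_and_store(business_db, central_db, "item")
```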
Step 104, performing data cleaning on the original business data set stored in the central database to obtain a business data set.
In some embodiments, the execution body may perform data cleaning on the original business data set stored in the central database to obtain a business data set. The business data set may be the data set obtained by performing data cleaning on the original business data set.
In some optional implementations of some embodiments, the execution body may perform data cleaning on the original business data set stored in the central database to obtain the business data set by:
First step, for each piece of original business data in the original business data set, performing the following processing steps:
A first sub-step of structuring the original business data in response to determining that the original business data is unstructured or semi-structured data. The unstructured data may be text format data. In practice, the execution body may first determine from the format of the original business data whether it is unstructured or semi-structured data. The execution body may then convert semi-structured data into structured data by extracting JSON attributes or XML tags as fields. The execution body may convert unstructured data into structured data through a natural language processing library (e.g., the Stanford CoreNLP library or the NLTK library).
A second sub-step of repairing the abnormal values in the original business data in response to determining that the original business data is abnormal.
Second step, de-duplicating the processed original business data set and taking the pieces of original business data remaining after de-duplication as the business data set. In practice, the execution body may de-duplicate the processed original business data set through a data de-duplication tool (e.g., an ETL tool).
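The structuring and de-duplication steps above can be sketched as follows. This is a simplified illustration: it assumes semi-structured records arrive as flat JSON objects with hashable values, omits the anomaly repair sub-step, and uses hypothetical function names.

```python
import json


def structure_record(raw):
    """Turn a flat JSON string (semi-structured) into a dict; copy dicts as they are."""
    if isinstance(raw, str):
        return dict(json.loads(raw))  # JSON attributes become fields
    return dict(raw)


def deduplicate(records):
    """Drop exact duplicates while keeping first-seen order."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))  # order-independent fingerprint
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


raw_set = [
    '{"item_name": "apple", "item_value": 3.5}',
    {"item_name": "apple", "item_value": 3.5},
    '{"item_name": "pear", "item_value": 4.0}',
]
cleaned_set = deduplicate([structure_record(r) for r in raw_set])
```

Here the first two inputs describe the same record in different encodings, so only one copy survives de-duplication.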
In some optional implementations of some embodiments, the execution body may repair the abnormal values in the original business data, in response to determining that the original business data is abnormal, by:
First step, for each piece of field data included in the original business data, performing the following steps:
A first sub-step of inputting the field data into an abnormal data detection model to obtain a classification label corresponding to the field data. The abnormal data detection model may be a model that detects whether the field data is abnormal. The abnormal data detection model may be a trained decision tree model. The classification label may be information, output by the abnormal data detection model, that represents the detection result for the field data.
A second sub-step of selecting, in response to determining that the classification label is an abnormal data label, filling data having the same field name as the field data from a preset filling data table of the central database as repair data. In practice, the classification label may be a Boolean variable. For example, a classification label of "TRUE" indicates that the field data is abnormal, and a classification label of "FALSE" indicates that it is not. The preset filling data table may be a database table whose fields match the fields of the original business data. The filling data in the preset filling data table may be default values, provided by the target business domain user, for each field of the original business data. The target business domain user may be a user of the business system corresponding to the target business domain.
A third sub-step of replacing the field data with the repair data.
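The three sub-steps above can be sketched as a per-field loop. In this sketch the detection model is passed in as a plain callable returning a Boolean label, and the filling data table is an in-memory dict keyed by field name. Both are hypothetical stand-ins for the trained model and the database table described in the text.

```python
from typing import Any, Callable, Dict


def repair_record(
    record: Dict[str, Any],
    is_abnormal: Callable[[str, Any], bool],
    filler_table: Dict[str, Any],
) -> Dict[str, Any]:
    """Replace every field flagged as abnormal with the default for that field name."""
    repaired = {}
    for field_name, value in record.items():
        # Sub-step 1: classify the field; sub-step 2: look up filling data by field name.
        if is_abnormal(field_name, value) and field_name in filler_table:
            repaired[field_name] = filler_table[field_name]  # sub-step 3: replace
        else:
            repaired[field_name] = value
    return repaired


def toy_detector(field_name: str, value: Any) -> bool:
    """Hypothetical stand-in for the model: flags missing or negative item values."""
    return field_name == "item_value" and (value is None or value < 0)


filler = {"item_value": 0.0}
fixed = repair_record({"item_name": "apple", "item_value": -1.0}, toy_detector, filler)
```

Fields flagged as abnormal but lacking an entry in the filling table are left unchanged in this sketch; the text does not say how that case is handled.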
In some optional implementations of some embodiments, the abnormal data detection model described above may be generated by:
First step, receiving a business standard information set, a business standard priority information set, and a model adjustment sample set corresponding to the target business domain. The business standard information in the business standard information set corresponds to the business standard priority information in the business standard priority information set. Each model adjustment sample in the model adjustment sample set comprises sample business data and a sample classification label corresponding to the sample business data. The sample business data may be business data provided by the target business domain user for training the abnormal data detection model. The sample classification label may be information indicating whether the corresponding sample business data is abnormal. The business standard information set may be provided by the target business domain user. The business standard information in the business standard information set may be valid data information corresponding to a business data field in the business data set. The valid data information may be a range of values, a set of values, or data format information. For example, the target business domain may be the order domain. The business data set may be an order data set generated by the order system corresponding to the order domain and obtained after data cleaning. The order data in the order data set may include an "order value" field and an "order date" field. The valid data information corresponding to the "order value" field in the order data may be a range of values, e.g., "110-125". The valid data information corresponding to the "order date" field in the order data may be data format information, e.g., "YYYY-MM-DD".
In practice, the business standard priority information in the business standard priority information set may be a numerical value representing the priority of the corresponding business standard information; the larger the value, the higher the priority of the corresponding business standard information.
Second step, sorting the business standard information in the business standard information set according to the business standard priority information in the business standard priority information set to obtain a business standard information sequence. In practice, the business standard information sequence may be obtained by sorting the business standard information in the business standard information set in descending order of the values of the corresponding business standard priority information.
Third step, constructing an initial abnormal data detection model according to the business standard information sequence. The initial abnormal data detection model may be a decision tree model. In practice, the initial abnormal data detection model may be built through an interface in a machine learning library (e.g., the scikit-learn library).
Fourth step, based on the model adjustment sample set and the initial abnormal data detection model, executing the following model training steps:
A first sub-step of inputting the sample business data of at least one model adjustment sample in the model adjustment sample set into the initial abnormal data detection model to obtain a classification label corresponding to each of the at least one model adjustment sample.
A second sub-step of comparing the classification label corresponding to each of the at least one model adjustment sample with the corresponding sample classification label. In practice, the Boolean value of the obtained classification label may be compared with the Boolean value of the sample classification label. If the two values are the same, the detection result for the corresponding sample business data is correct. If they differ, the detection result for the corresponding sample business data is wrong.
A third sub-step of determining the classification accuracy of the initial abnormal data detection model according to the comparison results. In practice, the ratio of the number of pieces of sample business data with correct detection results to the total number of pieces of sample business data may be determined as the classification accuracy of the initial abnormal data detection model.
A fourth sub-step of determining the initial abnormal data detection model as the abnormal data detection model in response to determining that the classification accuracy is greater than or equal to a preset classification accuracy. The specific value of the preset classification accuracy is not limited here. For example, the preset classification accuracy may be 99%.
A fifth sub-step of, in response to determining that the classification accuracy is less than the preset classification accuracy, adjusting the initial abnormal data detection model, forming a model adjustment sample set from the unused model adjustment samples, determining the adjusted initial abnormal data detection model as the initial abnormal data detection model, and executing the model training steps again. In practice, the initial abnormal data detection model may be adjusted by a pruning operation.
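The priority sorting, accuracy computation, and acceptance check can be illustrated with a deliberately simplified sketch. It swaps the decision tree for a rule-based stand-in, so the pruning adjustment of the fifth sub-step is not shown. The field names, the tuple layout of a standard, and the sample data are all assumptions made for illustration.

```python
import re

# Hypothetical business standards for an order domain:
# (field name, priority value, validity check).
standards = [
    ("order_date", 1, lambda v: re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(v)) is not None),
    ("order_value", 2, lambda v: isinstance(v, (int, float)) and 110 <= v <= 125),
]

# Second step: sort by priority value, largest (highest priority) first.
standard_sequence = sorted(standards, key=lambda s: s[1], reverse=True)


def predict_abnormal(field_name, value):
    """Rule-based stand-in for the model: the first matching standard decides."""
    for name, _priority, is_valid in standard_sequence:
        if name == field_name:
            return not is_valid(value)
    return False  # no standard for this field: treat as normal


def classification_accuracy(samples):
    """samples: list of ((field_name, value), expected_label) pairs."""
    correct = sum(predict_abnormal(*data) == label for data, label in samples)
    return correct / len(samples)


samples = [
    (("order_value", 118), False),
    (("order_value", 300), True),
    (("order_date", "2023-06-02"), False),
    (("order_date", "06/02/2023"), True),
]
accuracy = classification_accuracy(samples)
accepted = accuracy >= 0.99  # preset classification accuracy, 99% as in the example
```

With a real decision tree, a failed acceptance check would trigger the adjustment (for example, pruning) and another pass over the unused samples.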
The above content is an inventive point of the embodiments of the present disclosure and solves the second technical problem mentioned in the Background: because business data involves background knowledge of the corresponding business field, abnormal business data is detected manually, which is time-consuming and inefficient and wastes computer resources. Computer resources are wasted mainly because of this manual detection. If this factor is addressed, the waste of computer resources can be reduced. To achieve this effect, the present disclosure introduces an abnormal data detection model. First, for each piece of field data included in the original business data, the following steps are performed: the field data is input into the abnormal data detection model to obtain a classification label corresponding to the field data. Thus, the abnormality detection result for the field data can be obtained, and it can be determined whether the field data is abnormal. In response to determining that the classification label is an abnormal data label, filling data having the same field name as the field data is selected from the preset filling data table of the central database as repair data. Thus, if the field data is abnormal, the repair data matching the field data can be determined. The field data is then replaced with the repair data. In this way, abnormal field data can be identified and repaired without manual detection, which reduces the waste of computer resources.
Step 105, receiving extended input data.
In some embodiments, the execution body may receive extended input data. The extended input data includes at least one of manual input data and external call data. The manual input data may be business data input by the target business domain user. The external call data may be a data service file, generated by the execution body, that corresponds to another business domain.
Step 106, generating a service data set from the business data set and the extended input data.
In some embodiments, the execution body may generate a service data set from the business data set and the extended input data.
In some optional implementations of some embodiments, the execution body may generate the service data set from the business data set and the extended input data by:
First step, determining the business data set and the extended input data as a target data set.
Second step, aggregating the target data set according to a preset aggregation condition to obtain an aggregated data set. The preset aggregation condition may be a logical expression representing the aggregation operation to be performed on the target data set. In practice, the execution body may perform the aggregation operation on the target data set in the central database according to the preset aggregation condition to obtain the aggregated data set.
Third step, determining the aggregated data set as the service data set.
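The three steps above can be sketched in memory as follows. The aggregation condition is assumed here to be "sum one numeric field per distinct value of a grouping field"; the disclosure leaves the condition open, and the field names and sample records are hypothetical.

```python
from collections import defaultdict


def aggregate(target_data_set, group_field, value_field):
    """Sum value_field per distinct group_field value (one possible aggregation condition)."""
    totals = defaultdict(float)
    for record in target_data_set:
        totals[record[group_field]] += record[value_field]
    return [{group_field: key, "total_" + value_field: total}
            for key, total in totals.items()]


cleaned_data_set = [
    {"item_name": "apple", "quantity": 2},
    {"item_name": "pear", "quantity": 1},
]
extended_input = [{"item_name": "apple", "quantity": 3}]  # e.g. manually entered data

# First step: cleaned data plus extended input form the target data set.
target_data_set = cleaned_data_set + extended_input
# Second and third steps: aggregate, and use the result as the service data set.
service_data_set = aggregate(target_data_set, "item_name", "quantity")
```

In a database-backed implementation the same operation would typically be expressed as a `GROUP BY` query executed in the central database.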
Step 107, creating a data service document corresponding to the service data set.
In some embodiments, the execution body may create a data service document corresponding to the service data set. The data service document comprises preset output format information corresponding to the service data set. The preset output format information may be format information representing the output format of the data service file. The specific setting of the preset output format information is not limited here. For example, the preset output format information may include, but is not limited to, format information characterizing a CSV file format, format information characterizing a model parameter file format (e.g., the ONNX file format), format information characterizing a PDF file format, and format information characterizing an Excel file format. The data service document may be a document that describes the service data set. The data service document may include the preset output format information and a field description information set corresponding to the service data set. The field description information set may be provided by the target business domain user. Each piece of field description information in the field description information set may be text describing the meaning of the corresponding field in the service data set.
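One way to picture the data service document is as a small record holding the output format and the field descriptions. The class name, attribute names, and example descriptions below are hypothetical; the disclosure does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataServiceDocument:
    """Describes a service data set: its output format and the meaning of each field."""
    output_format: str  # e.g. "csv", "onnx", "pdf", "xlsx"
    field_descriptions: Dict[str, str] = field(default_factory=dict)

    def describe(self, field_name: str) -> str:
        """Return the description for a field, or an empty string if none was given."""
        return self.field_descriptions.get(field_name, "")


doc = DataServiceDocument(
    output_format="csv",
    field_descriptions={
        "item_name": "Display name of the item",
        "total_quantity": "Summed quantity across all sources",
    },
)
```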
Step 108, generating a data service file from the service data set and the data service document.
In some embodiments, the execution body may generate a data service file from the service data set and the data service document. The data service file corresponds to a data service file name. The data service file may be stored in the central database. The data service file may be a file containing the service data set. For example, the data service file name may be "item domain-item real-time list", indicating that the corresponding data service file contains an item real-time data set.
In some optional implementations of some embodiments, the execution body may generate the data service file from the service data set and the data service document through the following steps:
First, determining the output format information corresponding to the service data set in the data service document.
Second, exporting the service data set in the file format corresponding to the determined output format information to obtain the data service file.
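The two steps above can be sketched as follows. This is a minimal sketch that assumes the service data set is a list of dictionaries and that the document is a dictionary with an `output_format` key; both representations, and the function name, are illustrative assumptions. Only the CSV case is implemented, and other formats raise an error rather than being guessed at.

```python
import csv
import io


def generate_data_service_file(service_data, document):
    """Export the service data set in the format named by the document."""
    # Step one: determine the output format recorded in the document.
    output_format = document["output_format"]

    # Step two: export the service data set in that format.
    if output_format == "csv":
        if not service_data:
            return ""
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=list(service_data[0]),
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(service_data)
        return buffer.getvalue()

    # Other formats (PDF, Excel, model parameters) are out of scope here.
    raise ValueError(f"unsupported output format: {output_format}")


items = [{"item_id": 1, "price": 9.5}, {"item_id": 2, "price": 12.0}]
csv_text = generate_data_service_file(items, {"output_format": "csv"})
```

In this sketch the export returns the file content as text; writing it to the central database or to disk is left to the caller.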
In some optional implementations of some embodiments, the execution body may alternatively generate the data service file from the service data set and the data service document through the following steps:
First, in response to determining that the output format information corresponding to the service data set in the data service document is format information characterizing a model parameter information file format (e.g., the ONNX file format), performing the following training step based on the service data set and an initial preset service test model:
A first sub-step of determining the service data set as a service data sample set.
A second sub-step of inputting at least one service data sample in the service data sample set into the initial preset service test model to obtain service test data information corresponding to each of the at least one service data sample. The initial preset service test model may be a convolutional neural network model or a deep neural network model. The service test data information may be information characterizing a prediction result for the corresponding service data sample.
A third sub-step of determining a loss function value according to a preset service data loss function, each of the at least one service data sample, and the corresponding service test data information. The preset service data loss function may be a mean square error function. In practice, the execution body may substitute each of the at least one service data sample and the corresponding service test data information into the preset service data loss function to determine the loss function value.
A fourth sub-step of determining the initial preset service test model as the service test model in response to determining that the loss function value is less than a preset loss threshold. Here, the specific setting of the preset loss threshold is not limited.
Second, generating a model parameter information file according to the model parameters of the determined service test model. In practice, the execution body may export the model parameters of the determined service test model by a method provided by a deep learning framework (for example, the torch.save() function of the PyTorch framework) to generate the model parameter information file.
Third, determining the model parameter information file as the data service file.
In some optional implementations of some embodiments, the training step may further include the following steps:
First, in response to determining that the loss function value is greater than the preset loss threshold, adjusting the model parameters of the initial preset service test model. In practice, the execution body may adjust the initial preset service test model by using a back propagation algorithm and a gradient descent algorithm.
Second, forming the unused service data samples into a service data sample set.
Third, determining the adjusted initial preset service test model as the initial preset service test model, and executing the training step again.
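The control flow of the training step (predict, compute the loss, stop if it is below the threshold, otherwise adjust the parameters and repeat on the unused samples) can be sketched in plain Python. To keep the sketch dependency-free, it deliberately swaps in a one-variable linear model for the convolutional or deep neural network named in the text, and serializes the parameters as JSON instead of calling a deep learning framework. The function names, the learning rate, the batch size, the `max_rounds` guard, and the choice to recycle the full sample set once all samples have been used are all assumptions of this sketch.

```python
import json


def mean_square_error(predictions, targets):
    """Mean square error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def train(samples, threshold=1e-6, batch_size=4, lr=0.05, max_rounds=10_000):
    """Fit y = w * x + b until the batch loss drops below `threshold`."""
    w, b = 0.0, 0.0
    unused = list(samples)
    for _ in range(max_rounds):
        if not unused:
            # Assumption: start over with the full set once it is exhausted.
            unused = list(samples)
        # Take at least one sample; the rest stay in the unused sample set.
        batch, unused = unused[:batch_size], unused[batch_size:]
        xs = [x for x, _ in batch]
        ys = [y for _, y in batch]
        predictions = [w * x + b for x in xs]
        loss = mean_square_error(predictions, ys)
        if loss < threshold:
            # The current model becomes the service test model.
            return {"w": w, "b": b}
        # Gradient descent on the mean square error.
        n = len(batch)
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(predictions, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    raise RuntimeError("loss did not fall below the threshold")


# Noise-free samples of y = 2x + 1 on a small input range.
samples = [(i / 4, 2 * (i / 4) + 1) for i in range(8)]
params = train(samples)
# Stand-in for the model parameter information file.
model_parameter_file = json.dumps(params)
```

With a real neural network, the same loop would typically be written with a framework's optimizer and the parameters exported with that framework's own serialization routine.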
Step 109, in response to receiving a data service publication request, inserting the data service file name into the data service directory.
In some embodiments, the execution body may insert the data service file name into the data service directory in response to receiving a data service publication request. The data service publication request may be issued by a business system corresponding to the target business domain.
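One way to picture the publication step is a directory table in the central database that receives a row when a publication request arrives. The sketch below uses an in-memory SQLite database as a stand-in for the central database; the table name, column name, and function names are hypothetical, and treating a repeated publication as a no-op is a design choice of this sketch rather than something the description specifies.

```python
import sqlite3


def create_directory(conn):
    """Create the data service directory table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data_service_directory ("
        "file_name TEXT PRIMARY KEY)"
    )


def publish(conn, file_name):
    """Handle a publication request by inserting the file name."""
    conn.execute(
        "INSERT OR IGNORE INTO data_service_directory (file_name) VALUES (?)",
        (file_name,),
    )
    conn.commit()


def search(conn, keyword):
    """Return published file names containing the keyword."""
    rows = conn.execute(
        "SELECT file_name FROM data_service_directory "
        "WHERE file_name LIKE ? ORDER BY file_name",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_directory(conn)
publish(conn, "item domain-item real-time list")
publish(conn, "item domain-item real-time list")  # second request is ignored
```

A business domain user would then call something like `search(conn, "item")` to find the published file name and retrieve the corresponding file.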
The above embodiments of the present disclosure have the following beneficial effects: the business domain-based data service release method of some embodiments of the present disclosure can provide data services to users in the form of data service files, which reduces the number of times service data is regenerated and thus the waste of computer resources. Specifically, computer resources are wasted because service data has to be regenerated every time a user applies for a data service. Based on this, the business domain-based data service release method of some embodiments of the present disclosure first creates a data service directory in a central database. Thus, a business domain user can search for generated data service files through the data service directory. Next, an original business data set is extracted from the business database corresponding to the target business domain. Thus, the original business data set corresponding to the target business domain can be obtained. Then, the extracted original business data set is stored in the central database. After that, data cleaning processing is performed on the original business data set stored in the central database to obtain a business data set. Next, extended input data is received, where the extended input data includes at least one of manual input data and external call data. Then, a service data set is generated according to the business data set and the extended input data. Thus, a service data set for generating the data service file can be obtained. Thereafter, a data service document corresponding to the service data set is created, where the data service document includes preset output format information corresponding to the service data set. Thus, a data service document for generating the data service file can be obtained. Next, a data service file is generated according to the service data set and the data service document, where the data service file corresponds to a data service file name. Thus, a data service file including the service data set corresponding to the target business domain can be obtained. Finally, in response to receiving a data service publication request, the data service file name is inserted into the data service directory. Thus, a business domain user can search the data service directory for the data service file name to obtain the data service file. Moreover, since the data service is provided to the user in the form of a data service file, the number of times service data is regenerated can be reduced, which in turn reduces the waste of computer resources.
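The cleaning and generation stages summarized above can be sketched as two small functions, following the sub-steps listed in claims 2 and 3: structure semi-structured records, repair abnormal values, de-duplicate, then merge the cleaned data (called the business data set in this sketch) with the extended input data and aggregate. The record layout, the rule that a missing or negative price counts as abnormal, the default repair value, and the choice of summing prices per category as the aggregation condition are all illustrative assumptions.

```python
import json
from collections import defaultdict


def clean(original_records, default_price=0.0):
    """Structure, repair, and de-duplicate the original records."""
    cleaned = []
    for record in original_records:
        # Structuring: here, semi-structured JSON strings become dicts.
        if isinstance(record, str):
            record = json.loads(record)
        record = dict(record)
        # Repair: here, a missing or negative price counts as abnormal.
        price = record.get("price")
        if price is None or price < 0:
            record["price"] = default_price
        cleaned.append(record)
    # De-duplication that preserves the original order.
    seen, unique = set(), []
    for record in cleaned:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def aggregate(business_data, extended_input, key="category"):
    """Merge both inputs into a target data set and sum prices per key."""
    target = business_data + extended_input
    totals = defaultdict(float)
    for record in target:
        totals[record[key]] += record["price"]
    return [{key: k, "total_price": v} for k, v in sorted(totals.items())]


raw = [
    '{"category": "fruit", "price": 3.0}',  # semi-structured record
    {"category": "fruit", "price": 3.0},    # duplicate once structured
    {"category": "dairy", "price": -1},     # abnormal value
]
business_data = clean(raw)
service_data = aggregate(business_data, [{"category": "dairy", "price": 4.5}])
```

Here the extended input data is a single manually entered record; data returned by an external call would be merged in the same way.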
With continued reference to fig. 2, as an implementation of the method illustrated in the above figures, the present disclosure provides some embodiments of a business domain-based data service release apparatus. These apparatus embodiments correspond to the method embodiments illustrated in fig. 1, and the apparatus is applicable to a variety of electronic devices.
As shown in fig. 2, the business domain-based data service release apparatus 200 of some embodiments includes: a first creation unit 201, an extraction unit 202, a storage unit 203, a data cleaning unit 204, a receiving unit 205, a first generation unit 206, a second creation unit 207, a second generation unit 208, and an insertion unit 209. The first creation unit 201 is configured to create a data service directory in a central database; the extraction unit 202 is configured to extract an original business data set from a business database corresponding to the target business domain; the storage unit 203 is configured to store the extracted original business data set in the central database; the data cleaning unit 204 is configured to perform data cleaning processing on the original business data set stored in the central database to obtain a business data set; the receiving unit 205 is configured to receive extended input data, wherein the extended input data includes at least one of manual input data and external call data; the first generation unit 206 is configured to generate a service data set according to the business data set and the extended input data; the second creation unit 207 is configured to create a data service document corresponding to the service data set, wherein the data service document includes preset output format information corresponding to the service data set; the second generation unit 208 is configured to generate a data service file according to the service data set and the data service document, wherein the data service file corresponds to a data service file name; the insertion unit 209 is configured to insert the data service file name into the data service directory in response to receiving a data service publication request.
It will be appreciated that the elements described in the apparatus 200 correspond to the various steps in the method described with reference to fig. 1. Thus, the operations, features and resulting benefits described above for the method are equally applicable to the apparatus 200 and the units contained therein, and are not described in detail herein.
Referring now to fig. 3, a schematic diagram of an electronic device 300 (e.g., a computing device) suitable for use in implementing some embodiments of the present disclosure is shown. The electronic devices in some embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, as well as stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 3 is merely an example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 3, the electronic device 300 may include a processing means 301 (e.g., a central processing unit, a graphics processor, etc.) that may perform various suitable actions and processes in accordance with a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage means 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the electronic apparatus 300 are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
In general, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 308 including, for example, magnetic tape, hard disk, etc.; and communication means 309. The communication means 309 may allow the electronic device 300 to communicate with other devices wirelessly or by wire to exchange data. While fig. 3 shows an electronic device 300 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 3 may represent one device or a plurality of devices as needed.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via communications device 309, or from storage device 308, or from ROM 302. The above-described functions defined in the methods of some embodiments of the present disclosure are performed when the computer program is executed by the processing means 301.
It should be noted that, the computer readable medium described in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication (e.g., a communication network) in any form or medium. Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be contained in the electronic device, or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: create a data service directory in a central database; extract an original business data set from a business database corresponding to a target business domain; store the extracted original business data set in the central database; perform data cleaning processing on the original business data set stored in the central database to obtain a business data set; receive extended input data, wherein the extended input data includes at least one of manual input data and external call data; generate a service data set according to the business data set and the extended input data; create a data service document corresponding to the service data set, wherein the data service document includes preset output format information corresponding to the service data set; generate a data service file according to the service data set and the data service document, wherein the data service file corresponds to a data service file name; and, in response to receiving a data service publication request, insert the data service file name into the data service directory.
Computer program code for carrying out operations of some embodiments of the present disclosure may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including a first creation unit, an extraction unit, a storage unit, a data cleaning unit, a receiving unit, a first generation unit, a second creation unit, a second generation unit, and an insertion unit. In some cases, the names of these units do not limit the units themselves; for example, the first creation unit may also be described as "a unit that creates a data service directory in a central database".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by substituting the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.

Claims (9)

1. A business domain-based data service release method, comprising:
creating a data service directory in a central database;
extracting an original business data set from a business database corresponding to a target business domain;
storing the extracted original business data set in the central database;
performing data cleaning processing on the original business data set stored in the central database to obtain a business data set;
receiving extended input data, wherein the extended input data comprises at least one of manual input data and external call data;
generating a service data set according to the business data set and the extended input data;
creating a data service document corresponding to the service data set, wherein the data service document comprises preset output format information corresponding to the service data set;
generating a data service file according to the service data set and the data service document, wherein the data service file corresponds to a data service file name;
and in response to receiving a data service publication request, inserting the data service file name into the data service directory.
2. The method of claim 1, wherein the performing data cleaning processing on the original business data set stored in the central database to obtain a business data set comprises:
for each piece of original business data in the original business data set, performing the following processing steps:
in response to determining that the original business data is unstructured data or semi-structured data, performing structuring processing on the original business data;
in response to determining that the original business data is abnormal, repairing the abnormal value in the original business data;
and performing de-duplication processing on the processed original business data set, and taking the de-duplicated original business data as the business data set.
3. The method of claim 1, wherein the generating a service data set according to the business data set and the extended input data comprises:
determining the business data set and the extended input data as a target data set;
aggregating the target data set according to a preset aggregation condition to obtain an aggregated data set;
and determining the aggregated data set as the service data set.
4. The method of claim 1, wherein the generating a data service file according to the service data set and the data service document comprises:
determining output format information corresponding to the service data set in the data service document;
and exporting the service data set in a file format corresponding to the determined output format information to obtain the data service file.
5. The method of claim 1, wherein the generating a data service file according to the service data set and the data service document comprises:
in response to determining that output format information corresponding to the service data set in the data service document is format information characterizing a model parameter information file format, performing the following training step based on the service data set and an initial preset service test model:
determining the service data set as a service data sample set;
inputting at least one service data sample in the service data sample set into the initial preset service test model to obtain service test data information corresponding to each service data sample in the at least one service data sample;
determining a loss function value according to a preset service data loss function, each service data sample in the at least one service data sample, and the corresponding service test data information;
in response to determining that the loss function value is less than a preset loss threshold, determining the initial preset service test model as a service test model;
generating a model parameter information file according to the model parameters of the determined service test model;
and determining the model parameter information file as the data service file.
6. The method of claim 5, wherein the training step further comprises:
in response to determining that the loss function value is greater than a preset loss threshold, adjusting model parameters of an initial preset service test model;
forming unused service data samples into a service data sample set;
and determining the adjusted initial preset service test model as an initial preset service test model, and executing the training step again.
7. A business domain-based data service release apparatus, comprising:
a first creation unit configured to create a data service directory in a central database;
an extraction unit configured to extract an original business data set from a business database corresponding to a target business domain;
a storage unit configured to store the extracted original business data set in the central database;
a data cleaning unit configured to perform data cleaning processing on the original business data set stored in the central database to obtain a business data set;
a receiving unit configured to receive extended input data, wherein the extended input data includes at least one of manual input data and external call data;
a first generation unit configured to generate a service data set according to the business data set and the extended input data;
a second creation unit configured to create a data service document corresponding to the service data set, wherein the data service document includes preset output format information corresponding to the service data set;
a second generation unit configured to generate a data service file according to the service data set and the data service document, wherein the data service file corresponds to a data service file name;
an insertion unit configured to insert the data service file name into the data service directory in response to receiving a data service publication request.
8. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 6.
9. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1 to 6.
CN202310648725.0A 2023-05-30 2023-05-30 Data service release method, device, equipment and medium based on business domain Pending CN116775568A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310648725.0A CN116775568A (en) 2023-05-30 2023-05-30 Data service release method, device, equipment and medium based on business domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310648725.0A CN116775568A (en) 2023-05-30 2023-05-30 Data service release method, device, equipment and medium based on business domain

Publications (1)

Publication Number Publication Date
CN116775568A true CN116775568A (en) 2023-09-19

Family

ID=88012512

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310648725.0A Pending CN116775568A (en) 2023-05-30 2023-05-30 Data service release method, device, equipment and medium based on business domain

Country Status (1)

Country Link
CN (1) CN116775568A (en)

Similar Documents

Publication Publication Date Title
CN111104479A (en) Data labeling method and device
CN115145560B (en) Business orchestration method, apparatus, device, computer-readable medium, and program product
CN111563163A (en) Text classification model generation method and device and data standardization method and device
CN111680799B (en) Method and device for processing model parameters
CN112947919A (en) Method and device for constructing service model and processing service request
CN115357469B (en) Abnormal alarm log analysis method and device, electronic equipment and computer medium
CN116775568A (en) Data service release method, device, equipment and medium based on business domain
CN116881097B (en) User terminal alarm method, device, electronic equipment and computer readable medium
CN117235535B (en) Abnormal supply end power-off method and device, electronic equipment and medium
CN115374320B (en) Text matching method and device, electronic equipment and computer medium
CN116800834B (en) Virtual gift merging method, device, electronic equipment and computer readable medium
CN116720202B (en) Service information detection method, device, electronic equipment and computer readable medium
CN115204150B (en) Information verification method and device, electronic equipment and computer readable medium
CN116703263B (en) Power equipment distribution method, device, electronic equipment and computer readable medium
CN116702168B (en) Method, device, electronic equipment and computer readable medium for detecting supply end information
CN113468053B (en) Application system testing method and device
CN116701181B (en) Information verification flow display method, device, equipment and computer readable medium
CN116629984B (en) Product information recommendation method, device, equipment and medium based on embedded model
CN116703262B (en) Distribution resource adjustment method, distribution resource adjustment device, electronic equipment and computer readable medium
CN117132245B (en) Method, device, equipment and readable medium for reorganizing online article acquisition business process
CN113111181B (en) Text data processing method and device, electronic equipment and storage medium
CN116204740A (en) Label determining method, information recommending method, device, equipment and storage medium
CN113362097A (en) User determination method and device
CN115842819A (en) Method, device and equipment for downloading test data of automatic driving system
CN113986959A (en) Logistics information acquisition method and device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination