CN114416174A - Model reconstruction method and device based on metadata, electronic equipment and storage medium - Google Patents

Publication number
CN114416174A
Authority
CN
China
Prior art keywords
field
model
candidate
reconstructed
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210078236.1A
Other languages
Chinese (zh)
Inventor
周维
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210078236.1A priority Critical patent/CN114416174A/en
Publication of CN114416174A publication Critical patent/CN114416174A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/72 Code refactoring

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of artificial intelligence, and particularly discloses a metadata-based model reconstruction method, a metadata-based model reconstruction device, an electronic device and a storage medium. The method comprises the following steps: extracting feature fields from metadata corresponding to a model to be reconstructed to obtain at least one feature field, wherein each feature field is used for identifying a feature of the service corresponding to the model to be reconstructed; determining a service domain of the model to be reconstructed according to the at least one feature field; determining a reconstruction template of the model to be reconstructed according to the service domain; determining at least one target field in the at least one feature field, wherein the frequency of occurrence of each target field is greater than a first threshold; determining standard processing logic for each target field; and reconstructing the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain a reconstructed model.

Description

Model reconstruction method and device based on metadata, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a model reconstruction method and device based on metadata, electronic equipment and a storage medium.
Background
With the advent of the digital society, data, as an enterprise asset, is increasingly important to the survival and development of an enterprise, and digital transformation has become an essential step for modern enterprises. However, in their early stages most enterprises have no standard for developing business models because the business must grow quickly, and developers generally build business models according to their own habits, so a large amount of repeated, siloed development ("chimney" development) exists in enterprise systems. The consequences of chimney development are: wasted computing resources, because the same index is computed in multiple data processing tasks; high development and maintenance costs, because developers must develop the same index repeatedly and maintain multiple tasks; and low data asset value, because the same index name carries different statistical definitions and logic, which confuses data users and hinders the accumulation of data assets. Therefore, the existing stock of non-standard historical business models makes enterprise digital transformation difficult and can easily cause it to be abandoned halfway.
To address this situation, the existing approach is to assign personnel to sort out the models again, analyze the existing indexes, and then reconstruct the models using a corresponding modeling method. However, this scheme requires a considerable number of professional model designers, which makes model reconstruction costly. Moreover, because no follow-up measures are provided, newly added data models again have to be standardized and reconstructed later, so the results of the reconstruction are short-lived.
Disclosure of Invention
In order to solve the above problems in the prior art, embodiments of the present application provide a metadata-based model reconstruction method and apparatus, an electronic device, and a storage medium, which can automatically screen and reconstruct the business models stored in a system without requiring a large number of professional model designers, thereby reducing reconstruction cost, and which can also periodically inspect the quality of existing models, thereby prolonging the period over which the reconstruction results remain effective.
In a first aspect, an embodiment of the present application provides a metadata-based model reconstruction method, including:
extracting feature fields of metadata corresponding to a model to be reconstructed to obtain at least one feature field, wherein each feature field in the at least one feature field is used for identifying the feature of a service corresponding to the model to be reconstructed;
determining a service domain of a model to be reconstructed according to at least one characteristic field;
determining a reconstruction template of a model to be reconstructed according to the service domain;
determining at least one target field in the at least one characteristic field, wherein the frequency of occurrence of each target field in the at least one target field is greater than a first threshold;
determining standard processing logic for each target field;
and reconstructing the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain the reconstruction model.
In a second aspect, an embodiment of the present application provides a metadata-based model reconstruction apparatus, including:
the extraction module is used for extracting the characteristic field of the metadata corresponding to the model to be reconstructed to obtain at least one characteristic field, wherein each characteristic field in the at least one characteristic field is used for identifying the characteristic of the service corresponding to the model to be reconstructed;
the processing module is used for determining a service domain of a model to be reconstructed according to the at least one characteristic field, determining a reconstruction template of the model to be reconstructed according to the service domain, and determining at least one target field in the at least one characteristic field, wherein the occurrence frequency of each target field in the at least one target field is greater than a first threshold value, and determining the standard processing logic of each target field;
and the reconstruction module is used for reconstructing the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain the reconstruction model.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor coupled to the memory, the memory for storing a computer program, the processor for executing the computer program stored in the memory to cause the electronic device to perform the method of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having a computer program stored thereon, the computer program causing a computer to perform the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The implementation of the embodiment of the application has the following beneficial effects:
In the embodiments of the application, the metadata of the model to be reconstructed is acquired, and at least one feature field identifying the features of the service corresponding to the model to be reconstructed is determined from it. The service domain corresponding to the model to be reconstructed is then determined according to the at least one feature field, and a standard model template for the model, namely the reconstruction template, is determined from the service domain. This standardizes the reconstructed model and keeps the reconstruction results effective for longer; in addition, if the standard changes later, all models of the same type can be brought into line by modifying the reconstruction template once. Next, according to the frequency of occurrence of each feature field, the feature fields whose frequency of occurrence is greater than the first threshold are taken as target fields to be considered preferentially during reconstruction, so that the reconstructed model can meet the basic operating requirements of the corresponding service. Finally, the standard processing logic of each target field is determined, and the model to be reconstructed is reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain the reconstructed model. In this way, the existing business models in the system are reconstructed automatically, a large number of professional model designers is not needed, and the reconstruction cost is reduced. The method provided by the embodiments can also be used to periodically inspect the reconstructed models, which prolongs the period over which the reconstruction results remain effective.
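Illustratively, the selection of target fields by frequency of occurrence may be sketched as follows. This is a minimal sketch under stated assumptions and is not part of the disclosed embodiments: the function name `select_target_fields`, the sample field names, and the choice to count occurrences over a flat list of field occurrences are all hypothetical, since the text does not specify where the occurrences are counted.

```python
from collections import Counter


def select_target_fields(field_occurrences, first_threshold):
    """Return the fields whose occurrence count is greater than first_threshold.

    field_occurrences is assumed to be a flat list with one entry per
    occurrence of a feature field. Fields are returned in first-seen order.
    """
    counts = Counter(field_occurrences)
    return [field for field, n in counts.items() if n > first_threshold]


# Hypothetical sample data: "order_id" occurs 3 times, "amount" twice, "region" once.
occurrences = ["order_id", "amount", "order_id", "region", "amount", "order_id"]
targets = select_target_fields(occurrences, first_threshold=1)
print(targets)
```

The comparison is strict (`>`), matching the wording "greater than a first threshold".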
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic hardware structure diagram of a metadata-based model reconstruction apparatus according to an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of a metadata-based model reconstruction method according to an embodiment of the present disclosure;
Fig. 3 is a schematic flowchart of a method for extracting a feature field from metadata corresponding to a model to be reconstructed to obtain at least one feature field according to an embodiment of the present disclosure;
Fig. 4 is a schematic flowchart of a method for determining a field group corresponding to each character string in at least one second candidate field according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a method for determining a service domain of the model to be reconstructed according to the at least one feature field according to an embodiment of the present application;
Fig. 6 is a block diagram illustrating functional modules of a metadata-based model reconstruction apparatus according to an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application are within the scope of protection of the present application.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, result, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
First, referring to fig. 1, fig. 1 is a schematic diagram of a hardware structure of a metadata-based model reconstruction apparatus according to an embodiment of the present disclosure. The metadata-based model reconstruction apparatus 100 includes at least one processor 101, a communication line 102, a memory 103, and at least one communication interface 104.
In this embodiment, the processor 101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs according to the present disclosure.
The communication line 102, which may include a path, carries information between the aforementioned components.
The communication interface 104 may be any transceiver or other device (e.g., an antenna) for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The memory 103 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
In this embodiment, the memory 103 may be independent and connected to the processor 101 through the communication line 102. The memory 103 may also be integrated with the processor 101. The memory 103 provided in the embodiments of the present application may generally have a nonvolatile property. The memory 103 is used for storing computer-executable instructions for executing the scheme of the application, and is controlled by the processor 101 to execute. The processor 101 is configured to execute computer-executable instructions stored in the memory 103, thereby implementing the methods provided in the embodiments of the present application described below.
In alternative embodiments, computer-executable instructions may also be referred to as application code, which is not specifically limited in this application.
In alternative embodiments, processor 101 may include one or more CPUs, such as CPU0 and CPU1 of FIG. 1.
In alternative embodiments, the metadata-based model reconstruction apparatus 100 may include a plurality of processors, such as the processor 101 and the processor 107 of FIG. 1. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In an alternative embodiment, the metadata-based model reconstruction apparatus 100 may be a server, for example an independent server, or a cloud server that provides basic cloud computing services such as cloud service, cloud database, cloud computing, cloud function, cloud storage, web service, cloud communication, middleware service, domain name service, security service, Content Delivery Network (CDN), big data, and artificial intelligence platform. The metadata-based model reconstruction apparatus 100 may further include an output device 105 and an input device 106. The output device 105 is in communication with the processor 101 and may display information in a variety of ways. For example, the output device 105 may be a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display device, a Cathode Ray Tube (CRT) display device, a projector, or the like. The input device 106 is in communication with the processor 101 and may receive user input in a variety of ways. For example, the input device 106 may be a mouse, a keyboard, a touch screen device, or a sensing device, among others.
The above-described metadata-based model reconstruction apparatus 100 may be a general-purpose device or a dedicated device. The present embodiment is not limited to the type of the metadata-based model reconstruction apparatus 100.
Next, it should be noted that the embodiments disclosed in the present application may acquire and process related data based on artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, robotics, biometric recognition, speech processing, natural language processing, machine learning/deep learning, and the like.
Hereinafter, a metadata-based model reconstruction method disclosed in the present application will be explained:
Referring to fig. 2, fig. 2 is a schematic flowchart of a metadata-based model reconstruction method according to an embodiment of the present disclosure. The metadata-based model reconstruction method comprises the following steps:
201: Extract feature fields from the metadata corresponding to the model to be reconstructed to obtain at least one feature field.
In this embodiment, a large number of system models have accumulated in the system. Some were added for the current operation scheme, and some belong to historical operation schemes and have been abandoned. In other words, the obsolete system models are stored in the model base of the system together with the system models currently in use. Because the operation schemes corresponding to the obsolete system models are already completed and will not be called again in the short term, reconstructing them would increase the workload and waste reconstruction cost. In view of this, in the present embodiment, before model reconstruction is performed, the system models that are worth reconstructing are first selected from the accumulated system models as the models to be reconstructed. Specifically, a system model that is still used by the system within a time period ending at the current time may be regarded as worth reconstructing.
Based on this, in the present embodiment, whether an existing model is still in use may be determined by obtaining, for each of the at least one system model accumulated in the system, the time of its last data query or data use, and the models to be reconstructed are then determined among the at least one system model according to that time. For example, a system model for which the interval between the time of the last data query or data use and the current time is less than or equal to a second threshold may be taken as a model to be reconstructed. Conversely, if a system model is found not to have been accessed within 3 months, the model has no reconstruction value, so it may be recorded in a model offline list and removed from the model reconstruction list.
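Illustratively, the screening described above may be sketched as follows. This is only a sketch under stated assumptions: the function name `split_models`, the model names, the dates, and the representation of last-access times as a dictionary are hypothetical, and the 90-day value merely approximates the 3-month example.

```python
from datetime import datetime, timedelta


def split_models(last_access, now, second_threshold):
    """Split models into a reconstruction list and an offline list.

    last_access maps a model name to the time of its last data query or
    data use. A model is kept for reconstruction when the interval between
    that time and `now` is less than or equal to second_threshold.
    """
    to_reconstruct, offline = [], []
    for name, accessed_at in last_access.items():
        if now - accessed_at <= second_threshold:
            to_reconstruct.append(name)
        else:
            offline.append(name)
    return to_reconstruct, offline


# Hypothetical sample data.
now = datetime(2022, 1, 24)
last_access = {
    "sales_daily": datetime(2022, 1, 20),     # used 4 days ago
    "legacy_campaign": datetime(2021, 6, 1),  # unused for more than 3 months
}
to_reconstruct, offline = split_models(last_access, now, timedelta(days=90))
print(to_reconstruct, offline)
```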
Meanwhile, in this embodiment, the metadata may refer to service metadata in the system, and the service metadata is used to store relevant information of the corresponding system model, for example: table name, field Chinese name, table data update time and other related information. Therefore, the dimension of the system model, the use frequency of the indexes, the dimension indexes of the related service scenes and the like can be visually seen from the service metadata. Based on this, at least one feature field for identifying the feature of the service corresponding to the model to be reconstructed can be obtained by extracting the feature field from the service metadata.
Based on this, the present embodiment provides a method for extracting a feature field from metadata corresponding to a model to be reconstructed to obtain at least one feature field, as shown in fig. 3, the method includes:
301: Determine at least one field name according to the model structure of the model to be reconstructed.
In this embodiment, each field name of the at least one field name is used to identify naming information of a corresponding field in the model to be reconstructed. Specifically, for different models, the storage locations of important fields in the model may be determined according to their corresponding service attributes, operating logic, model structure, and the like, and then the names of these locations are obtained as field names.
302: Determine, in the metadata and according to the at least one field name, at least one character string in one-to-one correspondence with the at least one field name.
In this embodiment, the service metadata stores the relevant information of the corresponding system model, for example the table name, the Chinese name of each field, the table data update time and other related information. Therefore, the character string corresponding to each of the at least one field name can be obtained from the service metadata by field name matching, so as to obtain at least one character string.
303: Perform text segmentation on each of the at least one character string to obtain at least one field group in one-to-one correspondence with the at least one character string.
In this embodiment, text segmentation may first be performed on each character string to obtain at least one first candidate field corresponding to that character string. For example, a separator set may be preset, which includes separators commonly used in Chinese, such as punctuation marks, special marks, diagrams, conjunctions, stop words, etc. Each character string is matched against the separator set, and every separator found in the character string is replaced with a space, which splits the character string into at least one sub-character string. Each sub-character string is then matched against a general word segmentation dictionary using forward maximum matching. When part of a sub-character string matches a word in the dictionary, the matched word is extracted, so as to obtain the at least one first candidate field corresponding to the character string.
Part-of-speech information of each first candidate field may then be determined, and at least one second candidate field is determined among the at least one first candidate field based on that information. In this embodiment, the part-of-speech information is information describing the nature of a field, such as verb, noun, or named entity, and may be determined by analyzing the sentence structure and semantics of each first candidate field. The first candidate fields whose part-of-speech information satisfies a screening condition are then selected as the at least one second candidate field. For example, the screening condition may be that the part-of-speech information is a named entity, in which case the first candidate fields whose part-of-speech information is a named entity are selected as second candidate fields.
Finally, the field group corresponding to each character string is determined from the at least one second candidate field, so as to obtain at least one field group. In this embodiment, the fields obtained by word segmentation may include pieces of a long field that was split into several parts during segmentation. The meaning of the long field as a whole may differ from, or even conflict with, the meanings of its parts. Therefore, in order to obtain a field group that represents the precise meaning of each character string, the long fields that were split need to be recovered, and the split parts need to be replaced by the long fields. Based on this, the present embodiment provides a method for determining the field group corresponding to each character string from the at least one second candidate field, as shown in fig. 4, where the method includes:
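Illustratively, the separator replacement and the forward maximum matching described above may be sketched as follows. This is a minimal sketch and not the disclosed implementation: the function names, the English sample text, the separator set and the dictionary are hypothetical stand-ins for the Chinese separator set and general word segmentation dictionary.

```python
import re


def split_on_separators(text, separators):
    """Replace every separator in text with a space and split into substrings."""
    if not separators:
        return text.split()
    # Try longer separators first so that a short one does not pre-empt them.
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(sep) for sep in ordered)
    return re.sub(pattern, " ", text).split()


def forward_max_match(substring, dictionary):
    """Extract dictionary words from substring by forward maximum matching."""
    max_len = max((len(word) for word in dictionary), default=1)
    words, i = [], 0
    while i < len(substring):
        # Try the longest possible window first, then shrink it.
        for size in range(min(max_len, len(substring) - i), 0, -1):
            candidate = substring[i:i + size]
            if candidate in dictionary:
                words.append(candidate)
                i += size
                break
        else:
            i += 1  # no dictionary word starts here; skip one character
    return words


# Hypothetical sample data.
separators = {",", ";"}
dictionary = {"customer", "id", "order", "amount", "region"}
substrings = split_on_separators("customerid,orderamount;region", separators)
first_candidates = [w for s in substrings for w in forward_max_match(s, dictionary)]
print(first_candidates)
```

Characters that do not begin any dictionary word are skipped rather than emitted, which follows the statement that only successfully matched words are extracted.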
401: Combine a first adjacent field and a second adjacent field in the at least one second candidate field to obtain at least one third candidate field.
In this embodiment, the first adjacent field and the second adjacent field are any two different second candidate fields whose field interval is smaller than the first threshold. Specifically, they are two adjacent fields among the second candidate fields, and the field interval can be understood as the number of characters between the positions of the two fields in the corresponding character string. Illustratively, for the character string "I graduated in 2021 from Fudan University in Shanghai", the following second candidate fields may be obtained after word segmentation and screening: "2021", "Shanghai", "Fudan" and "University". The number of characters between the positions of the second candidate fields "2021" and "Shanghai" in the original character string is 3, so the field interval between them is 3. The number of characters between the positions of "Fudan" and "University" in the original character string is 0, so the field interval between them is 0. (The counts refer to the original Chinese text, not to this English rendering.)
In this embodiment, the first threshold may be set to 2. Taking the above character string as an example, the pairs of second candidate fields that satisfy the requirement are "Shanghai" and "Fudan", and "Fudan" and "University". Thus, the third candidate fields "Shanghai Fudan" and "Fudan University" can be obtained.
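Illustratively, step 401 may be sketched as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the Chinese sentence in the sample data is an assumed back-translation of the example, chosen so that the field intervals equal the values 3 and 0 given in the text. Taking the merged field as the span of the original string covering both fields is also an assumption; the text only says that the two fields are combined.

```python
def locate(text, fields):
    """Return (field, start) pairs, searching the text from left to right."""
    located, cursor = [], 0
    for field in fields:
        start = text.index(field, cursor)
        located.append((field, start))
        cursor = start + len(field)
    return located


def merge_adjacent(text, fields, threshold):
    """Merge consecutive fields whose field interval is smaller than threshold.

    The field interval is the number of characters between the end of the
    first field and the start of the second field in the original string.
    """
    located = locate(text, fields)
    merged = []
    for (first, first_start), (second, second_start) in zip(located, located[1:]):
        interval = second_start - (first_start + len(first))
        if interval < threshold:
            # Assumption: the merged field is the covering span of the original text.
            merged.append(text[first_start:second_start + len(second)])
    return merged


# Assumed Chinese rendering of "I graduated in 2021 from Fudan University in Shanghai".
text = "我于2021年毕业于上海复旦大学"
second_candidates = ["2021年", "上海", "复旦", "大学"]
third_candidates = merge_adjacent(text, second_candidates, threshold=2)
print(third_candidates)
```

With a threshold of 2, the pair "2021年" and "上海" (interval 3) is not merged, while the other two pairs (interval 0) are.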
402: Perform semantic extraction on each of the at least one third candidate field to obtain at least one semantic vector in one-to-one correspondence with the at least one third candidate field.
403: Determine at least one fourth candidate field among the at least one third candidate field based on the at least one semantic vector.
In this embodiment, similarity calculation may be performed on the semantic vector and the standard vector corresponding to the semantic vector, and when the calculated similarity is greater than a preset threshold, a third candidate field of the semantic vector corresponding to the similarity is used as a fourth candidate field.
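Illustratively, the similarity screening of step 403 may be sketched as follows. The text does not name the similarity measure or say how the standard vectors are obtained, so this sketch assumes cosine similarity and a hypothetical mapping from each third candidate field to its standard vector; the three-dimensional toy vectors and the threshold of 0.8 are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_fourth_candidates(semantic_vectors, standard_vectors, threshold):
    """Keep the third candidate fields whose similarity exceeds the threshold."""
    return [
        field
        for field, vector in semantic_vectors.items()
        if field in standard_vectors
        and cosine_similarity(vector, standard_vectors[field]) > threshold
    ]


# Hypothetical toy vectors.
semantic_vectors = {
    "Shanghai Fudan": [0.2, 0.9, 0.1],
    "Fudan University": [0.9, 0.1, 0.0],
}
standard_vectors = {
    "Shanghai Fudan": [1.0, 0.0, 0.0],
    "Fudan University": [1.0, 0.0, 0.0],
}
fourth_candidates = select_fourth_candidates(semantic_vectors, standard_vectors, 0.8)
print(fourth_candidates)
```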
404: and deleting the second candidate fields forming each fourth candidate field in the at least one fourth candidate field from the at least one second candidate field to obtain at least one fifth candidate field.
In this embodiment, the fifth candidate field is the second candidate field remaining after the second candidate field forming each of the at least one fourth candidate field is removed. Illustratively, following the above-described example of the string "i am graduating university in the shanghai in 2021", assuming that the fourth candidate field determined by the semantic similarity calculation is "compound-denier university", since the fourth candidate field "compound-denier university" is composed of the second candidate fields "compound-denier" and "university", the second candidate fields "compound-denier" and "university" are derived from the original several second candidate fields: the second candidate fields "2021 year" and "shanghai" are the fifth candidate fields if "2021 year", "shanghai", "counterdenier" and "university" are removed.
405: combining the at least one fourth candidate field and the at least one fifth candidate field to obtain a field group corresponding to each character string.
Illustratively, following the above example of the string "I graduated from Fudan University in Shanghai in 2021", combining the fourth candidate field "Fudan University" with the fifth candidate fields "2021" and "Shanghai" yields the field group corresponding to the string: "2021", "Shanghai", and "Fudan University".
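Steps 404 and 405 can be sketched as below. This is a minimal illustration, not the implementation of the embodiment: the function name and the representation of a fourth candidate field as a mapping from the combined field to the second candidate fields that form it are assumptions.

```python
def build_field_group(second_candidates, fourth_candidates):
    """Form the field group of one character string.

    second_candidates: ordered list of second candidate fields
    fourth_candidates: dict mapping each fourth candidate field to the
        tuple of second candidate fields it is composed of (assumed layout)
    """
    # Step 404: remove the second candidate fields that form a fourth candidate.
    used = {part for parts in fourth_candidates.values() for part in parts}
    fifth_candidates = [f for f in second_candidates if f not in used]
    # Step 405: combine the fifth and fourth candidate fields.
    return fifth_candidates + list(fourth_candidates)


second = ["2021", "Shanghai", "Fudan", "University"]
fourth = {"Fudan University": ("Fudan", "University")}
field_group = build_field_group(second, fourth)
```

Here `field_group` holds "2021", "Shanghai" and "Fudan University", as in the example above.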
304: at least one characteristic field is determined in at least one field group.
In this embodiment, all fields in at least one field group may be summarized and subjected to deduplication processing, so as to obtain at least one feature field.
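The summarizing and deduplication step might look like the following sketch. The function name is hypothetical, and preserving the order of first appearance is a design choice of this example rather than something the text requires.

```python
def collect_feature_fields(field_groups):
    """Merge all field groups and drop duplicates, keeping first-seen order."""
    merged = (field for group in field_groups for field in group)
    # dict.fromkeys keeps insertion order and discards repeated keys.
    return list(dict.fromkeys(merged))


groups = [["2021", "Shanghai", "Fudan University"], ["Shanghai", "AUTO"]]
feature_fields = collect_feature_fields(groups)
```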
202: determining the service domain of the model to be reconstructed according to the at least one characteristic field.
In the present embodiment, the processing logic of different services differs because their processing requirements and processing purposes differ. Therefore, the services in the system can be classified, and services with similar processing logic can be assigned to the same service domain. By determining the service domain of the model to be reconstructed, the general processing logic of the model to be reconstructed can thus be determined.
Meanwhile, in the present embodiment, the at least one feature field is extracted from the service metadata, which stores information related to the corresponding system model. The at least one feature field can therefore comprehensively characterize the service corresponding to the model to be reconstructed. Accordingly, the present embodiment provides a method for determining the service domain of the model to be reconstructed according to the at least one feature field, as shown in fig. 5, where the method includes:
501: determining a service domain group corresponding to each characteristic field in the at least one characteristic field to obtain at least one service domain group in one-to-one correspondence with the at least one characteristic field.
In this embodiment, the service metadata of the history model may be analyzed to obtain a service domain table, in which one or more service domains corresponding to each feature field are recorded. For example, the feature field "AUTO" may correspond to a vehicle-related business domain, such as: business domains such as vehicle purchase loan, vehicle insurance, vehicle claim settlement, vehicle mortgage and the like. Therefore, the service domain group corresponding to each characteristic field can be obtained by inquiring the service domain table.
502: counting the at least one service domain group, and determining the score of each service domain contained in each service domain group in the at least one service domain group.
For example, the at least one service domain group may be scanned sequentially, and each time a service domain is scanned, its score is increased by 1, until all service domain groups have been scanned, so as to obtain the score of each service domain contained in the service domain groups. Specifically, assume that there are 3 service domain groups: service domain group 1 [car insurance, claims settlement, mortgage], service domain group 2 [claims settlement, loan], and service domain group 3 [claims settlement, car insurance, purchase]. Counting the 3 service domain groups gives the following: car insurance occurs 2 times in total and scores 2 points; claims settlement occurs 3 times in total and scores 3 points; mortgage occurs 1 time in total and scores 1 point; loan occurs 1 time in total and scores 1 point; purchase occurs 1 time in total and scores 1 point.
503: taking the service domain with the highest score as the service domain of the model to be reconstructed.
Specifically, following the above example, since claims settlement has the highest score, namely 3, the service domain of the model to be reconstructed is the claims settlement service domain.
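The scoring in steps 502 and 503 amounts to counting occurrences and taking the maximum, which can be sketched as follows. The function name is hypothetical, and the text does not say how ties are broken; in this sketch the domain scanned first wins a tie.

```python
from collections import Counter


def pick_service_domain(domain_groups):
    """Score each service domain by its number of occurrences across all
    service domain groups and return (best_domain, scores)."""
    scores = Counter()
    for group in domain_groups:
        # Each scanned service domain adds 1 to its score.
        scores.update(group)
    best_domain, _ = scores.most_common(1)[0]
    return best_domain, scores


domain_groups = [
    ["car insurance", "claims settlement", "mortgage"],
    ["claims settlement", "loan"],
    ["claims settlement", "car insurance", "purchase"],
]
best_domain, scores = pick_service_domain(domain_groups)
```

For the three groups of the example, `best_domain` is "claims settlement" with a score of 3.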
203: determining a reconstruction template of the model to be reconstructed according to the service domain.
In this embodiment, since the processing logic of the services in the same service domain is similar, the general processing logic of each service domain can be extracted, and a corresponding reconstruction template can be generated according to the pain points of that service domain. Therefore, when a model is reconstructed, general operations can be generated quickly by calling the general reconstruction template, which reduces the time and cost of model reconstruction and improves its efficiency.
204: at least one target field is determined among the at least one characteristic field.
In this embodiment, the frequency of occurrence of each of the at least one target field is greater than a first threshold. Illustratively, a target field is a field that appears frequently in the service metadata of the corresponding system model, which means that it is frequently called or used by the system model and is of higher importance to the system model. Therefore, these fields can be given priority as the basic fields of the reconstruction model.
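A minimal sketch of this frequency filter is given below. How the frequency of occurrence is measured is not detailed in the text; this example assumes it is the number of times a field appears in a flat list of tokens taken from the service metadata. The function name, the field names and the threshold value are hypothetical.

```python
from collections import Counter


def select_target_fields(feature_fields, metadata_tokens, first_threshold):
    """Return the feature fields whose occurrence count in the metadata
    tokens is strictly greater than the first threshold."""
    counts = Counter(metadata_tokens)
    return [field for field in feature_fields if counts[field] > first_threshold]


# Hypothetical data for illustration only.
tokens = ["AUTO", "claim", "AUTO", "policy", "AUTO", "claim"]
target_fields = select_target_fields(["AUTO", "claim", "policy"], tokens, 1)
```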
205: the standard processing logic for each target field is determined.
In this embodiment, the metadata base may be searched according to each target field to obtain at least one service metadata, and the system model base may then be searched according to the at least one service metadata to obtain at least one system model in one-to-one correspondence with the at least one service metadata. Then, in each system model of the at least one system model, the processing logic corresponding to the target field may be determined, resulting in at least one candidate processing logic in one-to-one correspondence with the at least one system model. Finally, the ratio of each candidate processing logic among the at least one candidate processing logic is determined, and the candidate processing logic with the highest ratio is used as the standard processing logic of the target field.
Specifically, assuming that n candidate processing logics are finally obtained, when n = 1, the processing logic of all the system models is consistent for the target field, and the candidate processing logic can therefore be used directly as the standard processing logic of the field; when n > 1, the target field has multiple processing logics in the system, and in this case the ratio of each of the n candidate processing logics can be counted, and the candidate processing logic with the highest ratio can be selected as the standard processing logic of the target field.
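The selection by ratio can be sketched as follows. This example assumes that each system model contributes one processing logic for the target field, that a processing logic can be represented by a comparable identifier such as a string, and that the ratio of a logic is the share of system models using it. The function name and the logic identifiers are hypothetical.

```python
from collections import Counter


def standard_processing_logic(candidate_logics):
    """Pick the processing logic with the highest ratio.

    candidate_logics: one logic identifier per system model
    Returns (standard_logic, ratios).
    """
    if not candidate_logics:
        raise ValueError("at least one candidate processing logic is required")
    counts = Counter(candidate_logics)
    total = len(candidate_logics)
    ratios = {logic: count / total for logic, count in counts.items()}
    # With a single distinct logic (n = 1) this simply returns that logic.
    standard = max(ratios, key=ratios.get)
    return standard, ratios


# Hypothetical logic identifiers collected from four system models.
logic, ratios = standard_processing_logic(
    ["round_to_cents", "round_to_cents", "truncate", "round_to_cents"]
)
```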
206: reconstructing the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain the reconstruction model.
In this embodiment, the processing order of the standard processing logic corresponding to each target field may be determined from the reconstruction template, so as to obtain the corresponding process chain. The standard processing logic corresponding to each target field is then filled into the corresponding position in the reconstruction template according to the process chain to obtain the reconstruction model.
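One way to picture the filling step is sketched below. The structure of a reconstruction template is not specified in the text; this example assumes the template is an ordered list of slots, each named after a field, so that the slot order gives the process chain. All names are hypothetical.

```python
def build_reconstructed_model(template_slots, standard_logic):
    """Fill the standard processing logic of each target field into the
    template in the order given by the template.

    template_slots: ordered list of field names (assumed template layout)
    standard_logic: dict mapping target field -> standard processing logic
    """
    # The process chain is the template order restricted to the target fields.
    process_chain = [field for field in template_slots if field in standard_logic]
    return [(field, standard_logic[field]) for field in process_chain]


template_slots = ["policy_no", "claim_amount", "settle_date"]
standard_logic = {"claim_amount": "round_to_cents", "policy_no": "strip_whitespace"}
model = build_reconstructed_model(template_slots, standard_logic)
```

In this sketch a slot without a target field, such as "settle_date", is simply left out of the process chain.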
In summary, in the metadata-based model reconstruction method provided by the present invention, at least one feature field identifying a feature of the service corresponding to the model to be reconstructed is determined from the metadata of the model to be reconstructed. Then, the service domain corresponding to the model to be reconstructed is determined according to the at least one feature field, and the standard model template of the model, namely the reconstruction template, is determined accordingly. This standardizes the reconstruction model and ensures its achievement period; in addition, when the standard changes at a later stage, the standardized change of all models of the same type can be completed by modifying the reconstruction template once. Next, according to the occurrence frequency of each feature field in the at least one feature field, the feature fields whose occurrence frequency is greater than the first threshold are taken as target fields to be considered preferentially during reconstruction, so that the reconstructed model can meet the basic operation requirements of the corresponding service. Finally, the standard processing logic of each target field is determined, and the model to be reconstructed is reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain the reconstructed model. In this way, the existing business models in the system are reconstructed automatically, a large number of professional model designers are not needed, and the reconstruction cost is reduced. Meanwhile, the method provided by the embodiment can also be used to inspect the reconstructed model periodically, thereby prolonging the achievement period of the reconstructed model.
Referring to fig. 6, fig. 6 is a block diagram illustrating functional modules of a metadata-based model reconstruction apparatus according to an embodiment of the present disclosure. As shown in fig. 6, the metadata-based model reconstruction apparatus 600 includes:
an extracting module 601, configured to perform feature field extraction on metadata corresponding to a model to be reconstructed to obtain at least one feature field, where each feature field in the at least one feature field is used to identify a feature of a service corresponding to the model to be reconstructed;
a processing module 602, configured to determine a service domain of a model to be reconstructed according to at least one feature field, determine a reconstruction template of the model to be reconstructed according to the service domain, determine at least one target field in the at least one feature field, where an occurrence frequency of each target field in the at least one target field is greater than a first threshold, and determine a standard processing logic of each target field;
and the reconstruction module 603 is configured to reconstruct the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field, so as to obtain a reconstruction model.
In an embodiment of the present invention, in terms of extracting a feature field from metadata corresponding to a model to be reconstructed to obtain at least one feature field, the extracting module 601 is specifically configured to:
determining at least one field name according to the model structure of the model to be reconstructed, wherein each field name in the at least one field name is used for identifying the naming information of a corresponding field in the model to be reconstructed;
determining at least one character string in the metadata according to the at least one field name, wherein the at least one character string is in one-to-one correspondence with the at least one field name;
performing text segmentation processing on each character string in at least one character string to obtain at least one field group, wherein the at least one field group corresponds to the at least one character string one to one;
at least one characteristic field is determined in at least one field group.
In an embodiment of the present invention, in terms of performing text segmentation processing on each character string in at least one character string to obtain at least one field group, the extraction module 601 is specifically configured to:
performing text segmentation processing on each character string to obtain at least one first candidate field corresponding to each character string;
determining part-of-speech information of each of at least one first candidate field;
determining at least one second candidate field in the at least one first candidate field according to the part-of-speech information of each first candidate field;
and determining a field group corresponding to each character string in at least one second candidate field to obtain at least one field group.
In an embodiment of the present invention, in determining, in at least one second candidate field, a field group corresponding to each character string, the extracting module 601 is specifically configured to:
combining a first adjacent field and a second adjacent field in at least one second candidate field to obtain at least one third candidate field, wherein the first adjacent field and the second adjacent field are any two different second candidate fields, and the field interval between the first adjacent field and the second adjacent field is smaller than a first threshold value;
performing semantic extraction on each third candidate field in the at least one third candidate field to obtain at least one semantic vector, wherein the at least one semantic vector is in one-to-one correspondence with the at least one third candidate field;
determining at least one fourth candidate field among the at least one third candidate field according to the at least one semantic vector;
deleting the second candidate fields forming each fourth candidate field in the at least one fourth candidate field from the at least one second candidate field to obtain at least one fifth candidate field;
and combining the at least one fourth candidate field and the at least one fifth candidate field to obtain a field group corresponding to each character string.
In an embodiment of the present invention, the metadata-based model reconstruction apparatus 600 may further include: a screening module (not shown) configured to, before obtaining metadata corresponding to the model to be reconstructed:
acquiring the occurrence time of the last data query or data use of each system model in at least one system model;
and determining a model to be reconstructed in at least one system model according to the occurrence time of the last data query or data use of each system model, wherein the interval between the occurrence time of the last data query or data use corresponding to the model to be reconstructed and the current time is less than or equal to a second threshold value.
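The screening performed before the metadata is obtained can be sketched as follows. The condition follows the wording above (interval less than or equal to the second threshold); the function name, the model names, the timestamps and the 30-day threshold are hypothetical.

```python
from datetime import datetime, timedelta


def select_models_to_reconstruct(last_used, now, second_threshold):
    """Return the system models whose last data query or data use lies
    within the second threshold of the current time.

    last_used: dict mapping model name -> datetime of last query or use
    now: current time
    second_threshold: timedelta
    """
    return [
        name
        for name, used_at in last_used.items()
        if now - used_at <= second_threshold
    ]


now = datetime(2022, 1, 24)
last_used = {
    "claims_model": datetime(2022, 1, 20),
    "legacy_model": datetime(2020, 6, 1),
}
to_reconstruct = select_models_to_reconstruct(last_used, now, timedelta(days=30))
```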
In an embodiment of the present invention, in determining a service domain of a model to be reconstructed according to at least one feature field, the processing module 602 is specifically configured to:
determining a service domain group corresponding to each characteristic field in at least one characteristic field to obtain at least one service domain group, wherein the at least one service domain group is in one-to-one correspondence with the at least one characteristic field;
counting at least one service domain group, and determining the score of each service domain contained in each service domain group in the at least one service domain group;
and taking the business domain with the highest score as the business domain of the model to be reconstructed.
In an embodiment of the present invention, in terms of determining a standard processing logic of each target field, the processing module 602 is specifically configured to:
retrieving a metadata base according to each target field to obtain at least one service metadata;
searching a system model base according to at least one service metadata to obtain at least one system model, wherein the at least one system model corresponds to the at least one service metadata one to one;
determining a processing logic corresponding to each target field in each system model of at least one system model to obtain at least one candidate processing logic, wherein the at least one candidate processing logic is in one-to-one correspondence with the at least one system model;
the ratio of each candidate processing logic in at least one candidate processing logic is determined, and the candidate processing logic with the highest ratio is used as the standard processing logic of each target field.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 7, the electronic device 700 includes a transceiver 701, a processor 702, and a memory 703, which are connected to each other by a bus 704. The memory 703 is used to store computer programs and data, and may transfer the data stored in the memory 703 to the processor 702.
The processor 702 is configured to read the computer program in the memory 703 to perform the following operations:
extracting feature fields of metadata corresponding to a model to be reconstructed to obtain at least one feature field, wherein each feature field in the at least one feature field is used for identifying the feature of a service corresponding to the model to be reconstructed;
determining a service domain of a model to be reconstructed according to at least one characteristic field;
determining a reconstruction template of a model to be reconstructed according to the service domain;
determining at least one target field in the at least one characteristic field, wherein the frequency of occurrence of each target field in the at least one target field is greater than a first threshold;
determining standard processing logic for each target field;
and reconstructing the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain the reconstruction model.
In an embodiment of the present invention, in terms of extracting a feature field from metadata corresponding to a model to be reconstructed to obtain at least one feature field, the processor 702 is specifically configured to perform the following operations:
determining at least one field name according to the model structure of the model to be reconstructed, wherein each field name in the at least one field name is used for identifying the naming information of a corresponding field in the model to be reconstructed;
determining at least one character string in the metadata according to the at least one field name, wherein the at least one character string is in one-to-one correspondence with the at least one field name;
performing text segmentation processing on each character string in at least one character string to obtain at least one field group, wherein the at least one field group corresponds to the at least one character string one to one;
at least one characteristic field is determined in at least one field group.
In an embodiment of the present invention, in terms of performing text segmentation processing on each character string of at least one character string to obtain at least one field group, the processor 702 is specifically configured to perform the following operations:
performing text segmentation processing on each character string to obtain at least one first candidate field corresponding to each character string;
determining part-of-speech information of each of at least one first candidate field;
determining at least one second candidate field in the at least one first candidate field according to the part-of-speech information of each first candidate field;
and determining a field group corresponding to each character string in at least one second candidate field to obtain at least one field group.
In an embodiment of the present invention, in determining a field group corresponding to each character string in the at least one second candidate field, the processor 702 is specifically configured to perform the following operations:
combining a first adjacent field and a second adjacent field in at least one second candidate field to obtain at least one third candidate field, wherein the first adjacent field and the second adjacent field are any two different second candidate fields, and the field interval between the first adjacent field and the second adjacent field is smaller than a first threshold value;
performing semantic extraction on each third candidate field in the at least one third candidate field to obtain at least one semantic vector, wherein the at least one semantic vector is in one-to-one correspondence with the at least one third candidate field;
determining at least one fourth candidate field among the at least one third candidate field according to the at least one semantic vector;
deleting the second candidate fields forming each fourth candidate field in the at least one fourth candidate field from the at least one second candidate field to obtain at least one fifth candidate field;
and combining the at least one fourth candidate field and the at least one fifth candidate field to obtain a field group corresponding to each character string.
In an embodiment of the present invention, before obtaining metadata corresponding to a model to be reconstructed, the processor 702 is further configured to:
acquiring the occurrence time of the last data query or data use of each system model in at least one system model;
and determining a model to be reconstructed in at least one system model according to the occurrence time of the last data query or data use of each system model, wherein the interval between the occurrence time of the last data query or data use corresponding to the model to be reconstructed and the current time is less than or equal to a second threshold value.
In an embodiment of the present invention, in determining a service domain of a model to be reconstructed according to at least one feature field, the processor 702 is specifically configured to:
determining a service domain group corresponding to each characteristic field in at least one characteristic field to obtain at least one service domain group, wherein the at least one service domain group is in one-to-one correspondence with the at least one characteristic field;
counting at least one service domain group, and determining the score of each service domain contained in each service domain group in the at least one service domain group;
and taking the business domain with the highest score as the business domain of the model to be reconstructed.
In an embodiment of the present invention, in terms of standard processing logic for determining each target field, the processor 702 is specifically configured to:
retrieving a metadata base according to each target field to obtain at least one service metadata;
searching a system model base according to at least one service metadata to obtain at least one system model, wherein the at least one system model corresponds to the at least one service metadata one to one;
determining a processing logic corresponding to each target field in each system model of at least one system model to obtain at least one candidate processing logic, wherein the at least one candidate processing logic is in one-to-one correspondence with the at least one system model;
the ratio of each candidate processing logic in at least one candidate processing logic is determined, and the candidate processing logic with the highest ratio is used as the standard processing logic of each target field.
It should be understood that the metadata-based model reconstruction device in the present application may include a smart phone (e.g., an Android phone, an iOS phone, a Windows Phone, etc.), a tablet computer, a palmtop computer, a notebook computer, a Mobile Internet Device (MID), a robot, a wearable device, or the like. The devices listed above are merely examples and are not exhaustive; the metadata-based model reconstruction device includes but is not limited to them. In practical applications, the metadata-based model reconstruction device may further include an intelligent vehicle-mounted terminal, computer equipment, and the like.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention can be implemented by combining software with a hardware platform. Based on this understanding, all or part of the technical solutions of the present invention that contribute over the background art can be embodied in the form of a software product, which can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes instructions for causing a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments or some parts of the embodiments.
Accordingly, the present application also provides a computer readable storage medium, which stores a computer program, the computer program being executed by a processor to implement part or all of the steps of any one of the metadata-based model reconstruction methods as described in the above method embodiments. For example, the storage medium may include a hard disk, a floppy disk, an optical disk, a magnetic tape, a magnetic disk, a flash memory, and the like.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any one of the metadata-based model reconstruction methods as described in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are all alternative embodiments and that the acts and modules referred to are not necessarily required by the application.
In the above embodiments, the description of each embodiment has its own emphasis, and for parts not described in detail in a certain embodiment, reference may be made to the description of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is merely a logical division, and other divisions are possible in practice; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the associated hardware. The program may be stored in a computer-readable memory, and the memory may include: a flash memory disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the methods and their core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for metadata-based model reconstruction, the method comprising:
extracting feature fields of metadata corresponding to a model to be reconstructed to obtain at least one feature field, wherein each feature field in the at least one feature field is used for identifying features of a service corresponding to the model to be reconstructed;
determining a service domain of the model to be reconstructed according to the at least one characteristic field;
determining a reconstruction template of the model to be reconstructed according to the service domain;
determining at least one target field in the at least one characteristic field, wherein the frequency of occurrence of each target field in the at least one target field is greater than a first threshold;
determining standard processing logic for each of the target fields;
and reconstructing the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain a reconstructed model.
2. The method according to claim 1, wherein the extracting the feature field of the metadata corresponding to the model to be reconstructed to obtain at least one feature field comprises:
determining at least one field name according to the model structure of the model to be reconstructed, wherein each field name in the at least one field name is used for identifying the naming information of the corresponding field in the model to be reconstructed;
determining at least one character string in the metadata according to the at least one field name, wherein the at least one character string is in one-to-one correspondence with the at least one field name;
performing text segmentation processing on each character string in the at least one character string to obtain at least one field group, wherein the at least one field group is in one-to-one correspondence with the at least one character string;
determining the at least one characteristic field in the at least one field group.
3. The method of claim 2, wherein the text segmentation processing each of the at least one string to obtain at least one field group comprises:
performing text segmentation processing on each character string to obtain at least one first candidate field corresponding to each character string;
determining part-of-speech information of each of the at least one first candidate field;
determining at least one second candidate field in the at least one first candidate field according to the part-of-speech information of each first candidate field;
and determining a field group corresponding to each character string in the at least one second candidate field to obtain the at least one field group.
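The segmentation and part-of-speech filtering of claim 3 might look like the sketch below. The claim names neither a tokenizer nor a tagger, so the whitespace/underscore split, the lexicon `TOY_POS`, and the set of kept tags are all assumptions made for illustration; a real system would likely use a trained tagger.

```python
import re

# Hypothetical part-of-speech lexicon standing in for a real tagger.
TOY_POS = {"customer": "noun", "order": "noun", "amount": "noun",
           "of": "prep", "the": "det", "total": "adj"}

# Assumption: only nouns and adjectives carry business meaning.
KEPT_TAGS = {"noun", "adj"}

def segment(text):
    """Split a metadata string into lowercase first candidate fields."""
    return [t.lower() for t in re.split(r"[\s_]+", text) if t]

def filter_by_pos(candidates, lexicon=TOY_POS, kept=KEPT_TAGS):
    """Keep candidates whose tag is in `kept`; unknown words are dropped."""
    return [c for c in candidates if lexicon.get(c) in kept]

first = segment("total amount of the customer_order")
second = filter_by_pos(first)
print(second)
```

Here `first` holds the first candidate fields and `second` the second candidate fields that survive the part-of-speech filter.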
4. The method according to claim 3, wherein the determining, in the at least one second candidate field, the field group corresponding to each character string comprises:
combining a first adjacent field and a second adjacent field in the at least one second candidate field to obtain at least one third candidate field, wherein the first adjacent field and the second adjacent field are any two different second candidate fields, and a field interval between the first adjacent field and the second adjacent field is smaller than a first threshold;
performing semantic extraction on each third candidate field in the at least one third candidate field to obtain at least one semantic vector, wherein the at least one semantic vector is in one-to-one correspondence with the at least one third candidate field;
determining at least one fourth candidate field among the at least one third candidate field according to the at least one semantic vector;
deleting the second candidate fields forming each fourth candidate field in the at least one second candidate field to obtain at least one fifth candidate field;
and combining the at least one fourth candidate field and the at least one fifth candidate field to obtain a field group corresponding to each character string.
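One possible reading of claim 4 is sketched below. Several points are assumptions, not taken from the claim: the field interval is treated as the difference in list position, the semantic vectors come from a hypothetical lookup table `TOY_VECTORS` in place of a trained encoder, and a merged phrase becomes a fourth candidate field when its cosine similarity to a hypothetical reference vector reaches `min_sim`.

```python
import math

# Hypothetical phrase embeddings; a real system would use a trained encoder.
TOY_VECTORS = {
    "customer order": (1.0, 0.0),
    "total amount": (0.0, 1.0),
}
# Hypothetical reference vectors for known business terms.
REFERENCE = [(1.0, 0.0)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0

def build_field_group(second, max_gap=2, min_sim=0.95):
    # Third candidates: pairs of distinct positions closer than max_gap.
    third = [(i, j) for i in range(len(second))
             for j in range(i + 1, len(second)) if j - i < max_gap]
    used, fourth = set(), []
    for i, j in third:
        phrase = f"{second[i]} {second[j]}"
        vec = TOY_VECTORS.get(phrase, (0.0, 0.0))
        # Fourth candidates: merged phrases close to a reference vector.
        if i not in used and j not in used and any(
                cosine(vec, ref) >= min_sim for ref in REFERENCE):
            fourth.append(phrase)
            used.update((i, j))
    # Fifth candidates: second candidates not absorbed into a fourth candidate.
    fifth = [f for k, f in enumerate(second) if k not in used]
    return fourth + fifth

group = build_field_group(["total", "amount", "customer", "order"])
print(group)
```

In this example "customer" and "order" are merged into one field, while "total" and "amount" remain separate because their merged phrase is not close to the reference vector.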
5. The method according to claim 1, wherein before obtaining the metadata corresponding to the model to be reconstructed, the method further comprises:
acquiring the occurrence time of the last data query or data use of each system model in at least one system model;
and determining the model to be reconstructed in the at least one system model according to the occurrence time of the last data query or data use of each system model, wherein the interval between the occurrence time of the last data query or data use corresponding to the model to be reconstructed and the current time is less than or equal to a second threshold value.
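The selection rule of claim 5 can be sketched as a simple time filter. The model names, timestamps, and the 30-day value for the second threshold are hypothetical; the claim only requires that the interval between the last query or use and the current time be less than or equal to the second threshold.

```python
from datetime import datetime, timedelta

def select_models_to_reconstruct(last_used, now, second_threshold):
    """Return models whose last query or use lies within second_threshold of now."""
    return [name for name, ts in last_used.items()
            if now - ts <= second_threshold]

# Hypothetical system models and their last query/use times.
now = datetime(2022, 1, 22)
last_used = {
    "orders_model": datetime(2022, 1, 20),
    "legacy_model": datetime(2021, 6, 1),
}
chosen = select_models_to_reconstruct(last_used, now, timedelta(days=30))
print(chosen)
```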
6. The method of claim 1, wherein the determining the service domain of the model to be reconstructed according to the at least one feature field comprises:
determining a service domain group corresponding to each characteristic field in the at least one characteristic field to obtain at least one service domain group, wherein the at least one service domain group is in one-to-one correspondence with the at least one characteristic field;
counting the at least one service domain group, and determining the score of each service domain contained in each service domain group in the at least one service domain group;
and taking the service domain with the highest score as the service domain of the model to be reconstructed.
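The scoring in claim 6 can be read as a vote across service domain groups. The sketch below assumes each domain scores one point every time it appears in a field's group; the claim does not specify the scoring rule. The mapping `DOMAIN_GROUPS` and the domain names are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from characteristic field to its service domain group.
DOMAIN_GROUPS = {
    "policy": ["insurance", "compliance"],
    "premium": ["insurance", "finance"],
    "claim": ["insurance"],
}

def pick_service_domain(feature_fields, domain_groups=DOMAIN_GROUPS):
    """Return the highest-scoring service domain, or None if nothing matches."""
    scores = Counter()
    for field in feature_fields:
        # Assumption: each domain in a field's group scores one point.
        scores.update(domain_groups.get(field, []))
    if not scores:
        return None
    # Ties resolve to the domain that was seen first.
    return max(scores, key=scores.get)

domain = pick_service_domain(["policy", "premium", "claim"])
print(domain)
```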
7. The method of claim 1, wherein the determining standard processing logic for each of the target fields comprises:
retrieving a metadata base according to each target field to obtain at least one service metadata;
searching a system model base according to the at least one service metadata to obtain at least one system model, wherein the at least one system model is in one-to-one correspondence with the at least one service metadata;
determining a processing logic corresponding to each target field in each system model of the at least one system model to obtain at least one candidate processing logic, wherein the at least one candidate processing logic is in one-to-one correspondence with the at least one system model;
determining the ratio of each candidate processing logic in the at least one candidate processing logic, and taking the candidate processing logic with the highest ratio as the standard processing logic of each target field.
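The final step of claim 7, choosing the candidate processing logic with the highest ratio, can be sketched as a majority count. The retrieval of service metadata and system models is omitted here, and the candidate logic strings are hypothetical placeholders for whatever representation a real system would use.

```python
from collections import Counter

def standard_logic(candidate_logics):
    """Return the most common candidate logic and its share of all candidates."""
    if not candidate_logics:
        return None, 0.0
    logic, count = Counter(candidate_logics).most_common(1)[0]
    return logic, count / len(candidate_logics)

# Hypothetical processing logic found for one target field in three system models.
candidates = ["ROUND(amount, 2)", "ROUND(amount, 2)", "TRUNC(amount)"]
logic, share = standard_logic(candidates)
print(logic, share)
```

Returning the share alongside the winning logic makes it possible to reject a "standard" that is only weakly dominant, although the claim itself does not require such a check.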
8. An apparatus for metadata-based model reconstruction, the apparatus comprising:
the extraction module is used for extracting a feature field of metadata corresponding to a model to be reconstructed to obtain at least one feature field, wherein each feature field in the at least one feature field is used for identifying the feature of a service corresponding to the model to be reconstructed;
the processing module is used for determining a service domain of the model to be reconstructed according to the at least one characteristic field, determining a reconstruction template of the model to be reconstructed according to the service domain, determining at least one target field in the at least one characteristic field, wherein the occurrence frequency of each target field in the at least one target field is greater than a first threshold, and determining the standard processing logic of each target field;
and the reconstruction module is used for reconstructing the model to be reconstructed according to the reconstruction template and the standard processing logic of each target field to obtain a reconstructed model.
9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the one or more programs including instructions for performing the steps in the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method according to any one of claims 1-7.
CN202210078236.1A 2022-01-22 2022-01-22 Model reconstruction method and device based on metadata, electronic equipment and storage medium Pending CN114416174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210078236.1A CN114416174A (en) 2022-01-22 2022-01-22 Model reconstruction method and device based on metadata, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210078236.1A CN114416174A (en) 2022-01-22 2022-01-22 Model reconstruction method and device based on metadata, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114416174A (en) 2022-04-29

Family

ID=81276398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210078236.1A Pending CN114416174A (en) 2022-01-22 2022-01-22 Model reconstruction method and device based on metadata, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114416174A (en)

Similar Documents

Publication Publication Date Title
US11797607B2 (en) Method and apparatus for constructing quality evaluation model, device and storage medium
US11521603B2 (en) Automatically generating conference minutes
WO2022218186A1 (en) Method and apparatus for generating personalized knowledge graph, and computer device
US20120158742A1 (en) Managing documents using weighted prevalence data for statements
CN111125116B (en) Method and system for positioning code field in service table and corresponding code table
CN111143556A (en) Software function point automatic counting method, device, medium and electronic equipment
CN112818200A (en) Data crawling and event analyzing method and system based on static website
CN110968664A (en) Document retrieval method, device, equipment and medium
CN114037007A (en) Data set construction method and device, computer equipment and storage medium
CN111538903B (en) Method and device for determining search recommended word, electronic equipment and computer readable medium
CN112579781A (en) Text classification method and device, electronic equipment and medium
CN117216214A (en) Question and answer extraction generation method, device, equipment and medium
CN107368464B (en) Method and device for acquiring bidding product information
CN114416174A (en) Model reconstruction method and device based on metadata, electronic equipment and storage medium
KR20230059364A (en) Public opinion poll system using language model and method thereof
CN110413757B (en) Word paraphrase determining method, device and system
AU2019290658B2 (en) Systems and methods for identifying and linking events in structured proceedings
CN113919352A (en) Database sensitive data identification method and device
RU2549118C2 (en) Iterative filling of electronic glossary
CN113434631A (en) Emotion analysis method and device based on event, computer equipment and storage medium
CN113822013A (en) Labeling method and device for text data, computer equipment and storage medium
CN112579841B (en) Multi-mode database establishment method, retrieval method and system
CN117573956B (en) Metadata management method, device, equipment and storage medium
CN114969385B (en) Knowledge graph optimization method and device based on document attribute assignment entity weight
CN114238572B (en) Multi-database data extraction method and device based on artificial intelligence and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination