WO2019024231A1

WO2019024231A1 - Automatic data matching method, electronic device and computer-readable storage medium

Info

Publication number: WO2019024231A1
Application number: PCT/CN2017/104820
Authority: WO
Inventors: 陈娴娴; 李菲菲; 徐亮; 肖京
Original assignee: 平安科技（深圳）有限公司
Priority date: 2017-08-04
Filing date: 2017-09-30
Publication date: 2019-02-07
Also published as: CN107679544A

Abstract

An automatic data matching method, the method comprising the following steps: acquiring a classified feature obtained by a feature extraction operation (S31); performing, according to a preset dynamic list, normalization processing of the classified feature obtained by the feature extraction operation, to obtain a normalized classified feature (S32); extracting, from the normalized classified feature, a special field comprising a splittable character, splitting the special field into several field segments according to the position of the splittable character in the special field, and comparing the split field segments with a target category (S33); and on the basis of a preset field logic inclusion relationship, comparing a field which does not match with the target category (S34). The method can improve the rate of successful matching between a classified feature and a target category and the accuracy thereof. Further disclosed are an electronic device and a computer-readable storage medium.

Description

Automatic data matching method, electronic device and computer readable storage medium

The present application claims priority to Chinese Patent Application No. CN201710660957.2, entitled "Data Automatic Matching Method, Electronic Device and Computer-Readable Storage Medium" and filed on August 4, 2017, the entire content of which is incorporated herein by reference.

Technical field

The present invention relates to the field of computer information technology, and in particular, to an automatic data matching method, an electronic device, and a computer readable storage medium.

Background art

Feature extraction is an important step in various data mining prediction models, and in the data preprocessing stage it is important to match the extracted classification features against the existing target classification. However, directly matching uncleaned classification features with the target classification yields an extremely low matching success rate and accuracy, which cannot meet model requirements. Moreover, as massive data continues to flow in, the data volume has far exceeded what manual matching can handle. Therefore, the data matching algorithms in the prior art are not well designed and need to be improved.

Summary of the invention

In view of this, the present invention provides an automatic data matching method, an electronic device, and a computer readable storage medium. Through structured splitting of special fields and a field logical inclusion relationship, the matching success rate and accuracy between the classification feature and the target classification are effectively improved.

First, in order to achieve the above object, the present invention provides an electronic device including a memory, a processor, and an automatic data matching program stored on the memory, where the automatic data matching program, when executed by the processor, implements the following steps:

Obtaining the classification feature obtained by the feature extraction operation;

Performing normalization on the classification features obtained by the feature extraction operation according to a preset dynamic list to obtain a normalized classification feature;

Extracting a special field containing a detachable character from the normalized classification feature, splitting the special field into a plurality of field segments according to the position of the detachable character in the special field, and matching the split field segments with the target classification; and

Matching the unsuccessfully matched fields with the target classification by means of a preset field logical inclusion relationship.

Preferably, the normalizing the classification feature obtained by the feature extraction operation according to the preset dynamic list comprises:

If the preset dynamic list is the first type dynamic list, extracting the first type special characters stored in the first type dynamic list, and deleting or replacing, according to the extracted first type special characters, the classification feature obtained by the feature extraction operation to obtain a normalized first type classification feature;

If the preset dynamic list is the second type dynamic list, extracting the second type special characters stored in the second type dynamic list, and deleting or replacing, according to the extracted second type special characters, the classification feature obtained by the feature extraction operation to obtain a normalized second type classification feature;

If the preset dynamic list is the third type dynamic list, extracting the third type special characters stored in the third type dynamic list, and deleting or replacing, according to the extracted third type special characters, the classification feature obtained by the feature extraction operation to obtain a normalized third type classification feature.

Preferably, splitting the special field into several field segments comprises:

Recording the position of the detachable character in the special field as a split point; and

The field segment before the split point and the field segment after the split point are extracted separately.

Preferably, splitting the special field into several field segments comprises:

If the preset dynamic list is the first type dynamic list, extracting, from the normalized first type classification feature, a first type special field containing a detachable character, and splitting the first type special field into a plurality of field segments according to the position of the detachable character in the first type special field;

If the preset dynamic list is the second type dynamic list, extracting, from the normalized second type classification feature, a second type special field containing a detachable character, and splitting the second type special field into a plurality of field segments according to the position of the detachable character in the second type special field; and

If the preset dynamic list is the third type dynamic list, extracting, from the normalized third type classification feature, a third type special field containing a detachable character, and splitting the third type special field into a plurality of field segments according to the position of the detachable character in the third type special field.

Preferably, matching the unsuccessfully matched fields with the target classification by means of the preset field logical inclusion relationship includes:

Calculating a semantic similarity value between the unsuccessfully matched field and the target classification according to a semantic logic similarity calculation algorithm; and

If the semantic similarity value is greater than a preset threshold, determining that the unsuccessfully matched field has a logical inclusion relationship with the target classification, and marking the unsuccessfully matched field as having a matching relationship with the target classification.

Preferably, the preset dynamic list is dynamically adjusted according to data changes of the data source.

Preferably, the target classification is preset rule data in an internal data platform.

In addition, in order to achieve the above object, the present invention further provides an automatic data matching method, which is applied to an electronic device, and the method includes:

Performing normalization processing on the classification features obtained by the feature extraction operation according to a preset dynamic list to obtain normalized classification features;

If the preset dynamic list is the first type dynamic list, extracting the first type special characters stored in the first type dynamic list, and deleting or replacing, according to the extracted first type special characters, the classification features obtained by the feature extraction operation to obtain normalized first type classification features;

Preferably, splitting the special field into several field segments comprises:

Further, in order to achieve the above object, the present invention also provides a computer readable storage medium storing an automatic data matching program, the automatic data matching program being executable by at least one processor, so that the at least one processor performs the steps of the automatic data matching method as described above.

Compared with the prior art, the electronic device, automatic data matching method and computer readable storage medium proposed by the invention effectively improve the matching success rate and accuracy between the classification feature and the target classification through structured splitting of special fields. Further, the field logical inclusion relationship (or field semantic inclusion relationship) solves the matching problem of irregular and missing fields, thereby further improving the matching success rate and accuracy between the classification feature and the target classification.

Brief description of the drawings

FIG. 1 is a schematic diagram of an optional hardware architecture of an electronic device of the present invention;

FIG. 2 is a block diagram showing an embodiment of an automatic data matching program in an electronic device of the present invention;

FIG. 3 is a schematic diagram of an implementation process of an embodiment of an automatic data matching method according to the present invention.

Reference numerals:

Electronic device	2
Memory	21
Processor	22
Network interface	23
Automatic data matching program	20
Acquisition module	201
Processing module	202
First matching module	203
Second matching module	204
Process steps	S31-S34

The implementation, functional features, and advantages of the present invention will be further described in conjunction with the embodiments.

Detailed description

The present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It is understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.

It should be noted that the descriptions of "first", "second" and the like in the present invention are for the purpose of description only, and are not to be construed as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or impossible to implement, it should be considered that the combination does not exist, and it is not within the scope of protection claimed by the present invention.

It is further to be understood that the terms "comprises", "includes" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device that comprises a series of elements includes not only those elements but also other elements not explicitly listed, or elements that are inherent to such a process, method, article, or device. An element that is defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.

First, the present invention proposes an electronic device 2.

Referring to FIG. 1, a schematic diagram of an optional hardware architecture of the electronic device 2 of the present invention is shown. In this embodiment, the electronic device 2 may include, but is not limited to, a memory 21, a processor 22, and a network interface 23 that can communicate with each other through a system bus. It is pointed out that FIG. 1 only shows the electronic device 2 with the components 21-23, but it should be understood that not all illustrated components are required to be implemented, and more or fewer components may be implemented instead.

The electronic device 2 may be a computing device such as a rack server, a blade server, or a tower server. The electronic device 2 may be an independent server or a server cluster composed of multiple servers.

The memory 21 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read only memory (ROM), an electrically erasable programmable read only memory (EEPROM), a programmable read only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the electronic device 2, such as a hard disk or memory of the electronic device 2. In other embodiments, the memory 21 may also be an external storage device of the electronic device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 2. Of course, the memory 21 may also include both an internal storage unit of the electronic device 2 and an external storage device thereof. In this embodiment, the memory 21 is generally used to store an operating system installed in the electronic device 2 and various types of application software, such as the automatic data matching program 20. Further, the memory 21 can also be used to temporarily store various types of data that have been output or are to be output.

The processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the electronic device 2, such as performing control and processing related to data interaction or communication with the electronic device 2. In this embodiment, the processor 22 is configured to run program code or process data stored in the memory 21, such as running the data automatic matching program 20 and the like.

The network interface 23 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the electronic device 2 and other electronic devices. For example, the network interface 23 is configured to connect the electronic device 2 to an external data platform through a network, and to establish a data transmission channel and a communication connection between the electronic device 2 and the external data platform. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.

So far, the application environment and the hardware structure and functions of the related devices of the various embodiments of the present invention have been described in detail. Hereinafter, various embodiments of the present invention will be proposed based on the above-described application environment and related equipment.

Referring to FIG. 2, a block diagram of an embodiment of the automatic data matching program 20 in the electronic device 2 of the present invention is shown. In this embodiment, the automatic data matching program 20 may be divided into one or more program modules, and the one or more program modules are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to carry out the present invention. For example, in FIG. 2, the automatic data matching program 20 can be divided into an obtaining module 201, a processing module 202, a first matching module 203, and a second matching module 204. A program module as used herein refers to a series of computer program instructions that are capable of performing a particular function. The functions of each of the program modules 201-204 will be described in detail below.

The obtaining module 201 is configured to acquire a classification feature obtained by the feature extraction operation. The feature extraction operation is a pre-processing step of various data mining prediction models. Preferably, in the embodiment, the classification features include, but are not limited to, text data such as drug name, diagnostic information, medical order information, medical equipment, surgery type, family history, and the like.

The processing module 202 is configured to perform normalization processing on the classification feature obtained by the feature extraction operation according to a preset dynamic list to obtain a normalized classification feature.

Preferably, in this embodiment, the preset dynamic list includes dynamic lists corresponding to different types of data sources, such as a dynamic list corresponding to the first type of data source (such as a dynamic list corresponding to the MS SQL Server data source, hereinafter referred to as the "first type dynamic list"), a dynamic list corresponding to the second type of data source (such as a dynamic list corresponding to the Oracle data source, hereinafter referred to as the "second type dynamic list"), and a dynamic list corresponding to the third type of data source (such as a dynamic list corresponding to the MySQL data source, hereinafter referred to as the "third type dynamic list"). Those skilled in the art should understand that in other embodiments, the number of dynamic lists may also be increased or decreased according to the number of data source types.

Preferably, in this embodiment, different special characters are stored in the dynamic lists corresponding to different types of data sources, and are used for performing classification feature normalization processing on the respective types of data sources. For example, the first type dynamic list stores first type special characters for performing classification feature normalization processing on the first type of data source; the second type dynamic list stores second type special characters for performing classification feature normalization processing on the second type of data source; and the third type dynamic list stores third type special characters for performing classification feature normalization processing on the third type of data source.

Preferably, in this embodiment, the preset dynamic list is dynamically adjusted according to data changes of the data source, for example by adding new special characters. For example, the first type dynamic list is dynamically adjusted according to data changes of the first type of data source, the second type dynamic list is dynamically adjusted according to data changes of the second type of data source, and the third type dynamic list is dynamically adjusted according to data changes of the third type of data source.

Preferably, in this embodiment, normalizing the classification feature obtained by the feature extraction operation according to the preset dynamic list comprises: extracting the special characters stored in the preset dynamic list, and performing normalization processing, such as deletion or replacement, on the classification feature obtained by the feature extraction operation according to the extracted special characters.

Specifically, if the preset dynamic list is the first type dynamic list, the first type special characters (such as "/" and "\") stored in the first type dynamic list are extracted, and the classification feature obtained by the feature extraction operation is deleted or replaced according to the extracted first type special characters to obtain a normalized first type classification feature.

If the preset dynamic list is the second type dynamic list, the second type special characters stored in the second type dynamic list are extracted, and the classification feature obtained by the feature extraction operation is deleted or replaced according to the extracted second type special characters to obtain a normalized second type classification feature.
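By way of a non-limiting illustration, the normalization described above can be sketched as follows. The names DYNAMIC_LISTS, normalize and add_special_character are hypothetical and do not appear in the application. The first type characters "/" and "\" follow the example given above, while the contents of the second and third type lists are assumptions chosen only for demonstration.

```python
from typing import Dict

# Hypothetical dynamic lists, one per data-source type. Each maps a special
# character to its replacement; an empty replacement means "delete".
# Only "/" and "\" are taken from the description; the rest is assumed.
DYNAMIC_LISTS: Dict[str, Dict[str, str]] = {
    "first": {"/": "", "\\": ""},    # e.g. MS SQL Server data source
    "second": {"#": "", "_": " "},   # e.g. Oracle data source (assumed characters)
    "third": {"*": ""},              # e.g. MySQL data source (assumed characters)
}


def normalize(feature: str, source_type: str) -> str:
    """Delete or replace every special character listed for the source type."""
    for special, replacement in DYNAMIC_LISTS[source_type].items():
        feature = feature.replace(special, replacement)
    # Collapse whitespace left behind by deletions or replacements.
    return " ".join(feature.split())


def add_special_character(source_type: str, special: str, replacement: str = "") -> None:
    """Adjust a dynamic list when the data source starts producing a new character."""
    DYNAMIC_LISTS[source_type][special] = replacement
```

Keeping one mapping per data-source type mirrors the first, second and third type dynamic lists, and the add_special_character helper corresponds to the dynamic adjustment of a list as the data source changes.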

The first matching module 203 is configured to extract, from the normalized classification feature, a special field that contains a detachable character, split the special field into several field segments according to the position of the detachable character in the special field, and match the split field segments with the target classification. The target classification may be preset rule data in an internal data platform (such as a Hadoop data platform).

Preferably, in this embodiment, splitting the special field into a plurality of field segments comprises: recording the position of the detachable character in the special field as a split point; and separately extracting the field segment before the split point and the field segment after the split point.

For example, if the normalized classification feature includes a special field "a+b" or "a//b", where "+" and "//" are detachable characters, then the special field is split into the field segments "a" and "b", and the split field segments "a" and "b" are respectively matched with the target classification.

If a special field (such as "a+b" or "a//b") is matched directly with the target classification, the match is likely to fail. However, if the special field is split into the field segments "a" and "b", and the split field segments "a" and "b" are respectively matched with the target classification, the matching success rate is greatly improved. Therefore, the present invention can effectively improve the matching success rate and accuracy between the classification feature and the target classification through the structured splitting of special fields described for the first matching module 203.
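The splitting step can be illustrated with the following minimal sketch, which is an assumption-laden example rather than the implementation of the application. The detachable characters "+" and "//" come from the example above; the function names split_special_field and match_field, and the use of exact set membership as the matching test, are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Set

# Detachable characters from the example in the description.
DETACHABLE = ("//", "+")


def split_special_field(field: str, detachable: Iterable[str] = DETACHABLE) -> List[str]:
    """Split a field at every detachable character and return the segments."""
    # Longer tokens first, so a multi-character token is never split in half.
    tokens = sorted(detachable, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in tokens))
    segments, start = [], 0
    for match in pattern.finditer(field):
        # match.start() is the recorded split point; keep the segment before it.
        segments.append(field[start:match.start()])
        start = match.end()
    segments.append(field[start:])  # segment after the last split point
    return [s.strip() for s in segments if s.strip()]


def match_field(field: str, targets: Set[str]) -> Dict[str, bool]:
    """Match each segment against the target classification (exact match assumed)."""
    return {segment: segment in targets for segment in split_special_field(field)}
```

With a hypothetical target classification {"a", "b"}, the field "a+b" fails as a whole but both of its segments match, which is the improvement the paragraph above describes.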

Preferably, in this embodiment, if the preset dynamic list is the first type dynamic list, a first type special field containing a detachable character is extracted from the normalized first type classification feature, the first type special field is split into a plurality of field segments according to the position of the detachable character in the first type special field, and the split field segments are matched with the target classification.

If the preset dynamic list is the second type dynamic list, a second type special field containing a detachable character is extracted from the normalized second type classification feature, the second type special field is split into a plurality of field segments according to the position of the detachable character in the second type special field, and the split field segments are matched with the target classification.

If the preset dynamic list is the third type dynamic list, a third type special field containing a detachable character is extracted from the normalized third type classification feature, the third type special field is split into a plurality of field segments according to the position of the detachable character in the third type special field, and the split field segments are matched with the target classification.

The second matching module 204 is configured to match the unsuccessful matching field with the target classification by using a preset field logical inclusion relationship (or a field semantic inclusion relationship).

Preferably, in this embodiment, the data matching in the first matching module 203 can be recorded as a first match, and the first match includes: matching of special fields (that is, splitting the special field into field segments and matching them with the target classification) and matching of non-special fields (that is, matching the non-special fields in the normalized classification feature with the target classification). Further, the data matching in the second matching module 204 can be recorded as a second match, and the second match includes: matching the fields for which the first match was unsuccessful with the target classification.

Preferably, in this embodiment, matching the unsuccessfully matched fields with the target classification by means of the preset field logical inclusion relationship includes:

Calculating, according to a semantic logic similarity calculation algorithm (such as an algorithm for calculating semantic similarity based on a tree hierarchy), a semantic similarity value between the unsuccessfully matched field (that is, a field for which the first match was unsuccessful) and the target classification;

If the semantic similarity value is greater than a preset threshold (e.g., 80%), determining that the unsuccessfully matched field has a logical inclusion relationship with the target classification, and marking the field as having a matching relationship with the target classification, that is, changing the unsuccessfully matched field into a successfully matched field.

For example, if the fields for which the first match was unsuccessful include "aspirin tablets" and the target classification contains the field "aspirin", then, since "aspirin tablets" and "aspirin" have a semantic logical inclusion relationship, the unsuccessfully matched field "aspirin tablets" is changed to a successfully matched field.
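The threshold test can be sketched as below. The application does not specify which tree-hierarchy algorithm is used, so this example substitutes a Wu-Palmer style measure, 2 * depth(lowest common ancestor) / (depth(a) + depth(b)), over a small hypothetical taxonomy. The taxonomy PARENT and the functions path_to_root, tree_similarity and second_match are all assumptions introduced for illustration.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical taxonomy stored as child -> parent; the root has parent None.
PARENT: Dict[str, Optional[str]] = {
    "drug": None,
    "analgesic": "drug",
    "aspirin": "analgesic",
    "ibuprofen": "analgesic",
    "aspirin tablets": "aspirin",
}


def path_to_root(term: str) -> List[str]:
    """Return the chain of nodes from the term up to the root, term first."""
    path: List[str] = []
    node: Optional[str] = term
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_similarity(a: str, b: str) -> float:
    """Wu-Palmer style similarity; the depth of the root counts as 1."""
    if a not in PARENT or b not in PARENT:
        return 0.0  # unknown terms cannot be compared in this sketch
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    lca = next((n for n in path_a if n in ancestors_b), None)
    if lca is None:
        return 0.0
    return 2 * len(path_to_root(lca)) / (len(path_a) + len(path_b))


def second_match(field: str, targets: Iterable[str], threshold: float = 0.8) -> Optional[str]:
    """Return the best target whose similarity exceeds the threshold, else None."""
    best = max(targets, key=lambda t: tree_similarity(field, t), default=None)
    if best is not None and tree_similarity(field, best) > threshold:
        return best
    return None
```

Under this assumed taxonomy, "aspirin tablets" and "aspirin" score 6/7, which exceeds the 80% threshold, while "aspirin tablets" and "ibuprofen" score 4/7 and remain unmatched.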

Since the second matching module 204 further matches the fields for which the first match in the first matching module 203 was unsuccessful, if such a field is found to have a logical inclusion relationship (or a semantic inclusion relationship) with the target classification, the field is changed to a successfully matched field. Therefore, the present invention solves the matching problem of irregular and missing fields through the field logical inclusion relationship (or the field semantic inclusion relationship) in the second matching module 204, thereby further improving the matching success rate and accuracy between the classification feature and the target classification. The matching efficiency also has a significant advantage over manual matching, which greatly reduces the workload of manual matching.

It should be noted that, in other embodiments, for example where the first matching success rate is already high (e.g., greater than 90%), the second matching module 204 may also be omitted.

Through the above program modules 201-204, the automatic data matching program 20 proposed by the present invention effectively improves the matching success rate and accuracy between the classification feature and the target classification through structured splitting of special fields. Further, the field logical inclusion relationship (or field semantic inclusion relationship) solves the matching problem of irregular and missing fields, thereby further improving the matching success rate and accuracy between the classification feature and the target classification.

In addition, the present invention also proposes an automatic data matching method.

Referring to FIG. 3, it is a schematic flowchart of an implementation of an embodiment of the data automatic matching method of the present invention. In this embodiment, the order of execution of the steps in the flowchart shown in FIG. 3 may be changed according to different requirements, and some steps may be omitted.

Step S31: Acquire a classification feature obtained by the feature extraction operation. The feature extraction operation is a pre-processing step of various data mining prediction models. Preferably, in the embodiment, the classification features include, but are not limited to, text data such as drug name, diagnostic information, medical order information, medical equipment, surgery type, family history, and the like.

Step S32, normalizing the classification features obtained by the feature extraction operation according to the preset dynamic list to obtain a normalized classification feature.

Step S33, extracting a special field containing a detachable character from the normalized classification feature, splitting the special field into several field segments according to the position of the detachable character in the special field, and matching the split field segments with the target classification. The target classification may be preset rule data in an internal data platform (such as a Hadoop data platform).

If a special field (such as "a+b" or "a//b") is matched directly with the target classification, the match is likely to fail. However, if the special field is split into the field segments "a" and "b", and the split field segments "a" and "b" are respectively matched with the target classification, the matching success rate is greatly improved. Therefore, the present invention can effectively improve the matching success rate and accuracy between the classification feature and the target classification through the structured splitting of special fields described in step S33.

Step S34, matching the unsuccessful matching field with the target classification by using a preset field logical inclusion relationship (or a field semantic inclusion relationship).

Preferably, in this embodiment, the data matching in step S33 may be recorded as a first match, and the first match includes: matching of special fields (that is, splitting the special field into field segments and matching them with the target classification) and matching of non-special fields (that is, matching the non-special fields in the normalized classification feature with the target classification). Further, the data matching in step S34 may be recorded as a second match, and the second match includes: matching the fields for which the first match was unsuccessful with the target classification.

Preferably, in this embodiment, the step of matching the unsuccessful matching field with the target classification by using the preset field logical inclusion relationship includes:

Since the fields for which the first match in step S33 was unsuccessful are further matched in step S34, if such a field is found to have a logical inclusion relationship (or a semantic inclusion relationship) with the target classification, the field is changed to a successfully matched field. Therefore, the present invention solves the matching problem of irregular and missing fields through the field logical inclusion relationship (or the field semantic inclusion relationship) described in step S34, thereby further improving the matching success rate and accuracy between the classification feature and the target classification. The matching efficiency also has a significant advantage over manual matching, which greatly reduces the workload of manual matching.

It should be noted that, in other embodiments, in some cases, for example, in the case where the first matching success rate is already high (eg, greater than 90%), the step S34 may also be removed.

Through the above steps S31-S34, the automatic data matching method proposed by the present invention effectively improves the matching success rate and accuracy between the classification feature and the target classification through structured splitting of special fields. Further, the field logical inclusion relationship (or the field semantic inclusion relationship) solves the matching problem of irregular and missing fields, thereby further improving the matching success rate and accuracy between the classification feature and the target classification.
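As a non-limiting illustration of how steps S31-S34 might fit together, the following sketch chains normalization, splitting, first matching and the optional second matching. The function run_pipeline, its parameters, and the toy containment similarity are hypothetical. The 80% threshold and the 90% cutoff for skipping step S34 are the example values mentioned in the description, and the sketch assumes at least one detachable character is supplied.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional, Set


def containment(field: str, target: str) -> float:
    """Toy stand-in for a semantic similarity: 1.0 if the target occurs in the field."""
    return 1.0 if target in field else 0.0


def run_pipeline(
    features: Iterable[str],              # S31: features from feature extraction
    special_chars: Dict[str, str],        # dynamic list: character -> replacement
    detachable: Iterable[str],            # detachable characters (non-empty, assumed)
    targets: Set[str],                    # target classification
    similarity: Callable[[str, str], float] = containment,
    threshold: float = 0.8,               # example threshold from the description
    second_pass_cutoff: float = 0.9,      # skip S34 above this first-match rate
) -> Dict[str, Optional[str]]:
    tokens = sorted(detachable, key=len, reverse=True)
    splitter = re.compile("|".join(re.escape(t) for t in tokens))
    results: Dict[str, Optional[str]] = {}
    unmatched: List[str] = []
    for field in features:
        for special, replacement in special_chars.items():   # S32: normalize
            field = field.replace(special, replacement)
        for segment in splitter.split(field):                # S33: split and match
            segment = segment.strip()
            if not segment or segment in results:
                continue
            if segment in targets:
                results[segment] = segment
            else:
                results[segment] = None
                unmatched.append(segment)
    if not results or not targets:
        return results
    first_rate = (len(results) - len(unmatched)) / len(results)
    if first_rate <= second_pass_cutoff:                     # S34: second match
        for segment in unmatched:
            best = max(targets, key=lambda t: similarity(segment, t))
            if similarity(segment, best) > threshold:
                results[segment] = best
    return results
```

In this sketch each key of the returned dictionary is a normalized field segment and each value is the matched target, or None when neither matching pass succeeds.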

Further, in order to achieve the above object, the present invention also provides a computer readable storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk), wherein the computer readable storage medium stores an automatic data matching program 20, and the automatic data matching program 20 can be executed by at least one processor 22 to cause the at least one processor 22 to perform the steps of the automatic data matching method as described above.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the foregoing embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and can also be implemented by hardware, but in many cases the former is the better implementation. Based on such understanding, the part of the technical solution of the present invention that is essential or that contributes over the prior art may be embodied in the form of a software product, which is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a cell phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the methods described in the various embodiments of the present invention.

The preferred embodiments of the present invention have been described above with reference to the drawings, and are not intended to limit the scope of the invention. The serial numbers of the embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments. Additionally, although a logical sequence is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that described herein.

A person skilled in the art can implement the invention in various variants without departing from the scope and spirit of the invention. For example, the features of one embodiment can be used in another embodiment to obtain a further embodiment. Any equivalent structure or equivalent process transformation made using the contents of the present specification and drawings, whether applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

An electronic device, comprising a memory and a processor, wherein the memory stores an automatic data matching program, and the automatic data matching program, when executed by the processor, implements the following steps:

Acquiring a classification feature obtained by a feature extraction operation;

Normalizing the classification feature obtained by the feature extraction operation according to a preset dynamic list, to obtain a normalized classification feature;

Extracting, from the normalized classification feature, a special field containing a detachable character, splitting the special field into a plurality of field segments according to the position of the detachable character in the special field, and matching the split field segments against a target classification; and

Matching the fields that were not successfully matched against the target classification by means of a preset field logical inclusion relationship.
The electronic device according to claim 1, wherein normalizing the classification feature obtained by the feature extraction operation according to the preset dynamic list comprises:

If the preset dynamic list is a first type dynamic list, extracting the first type special character stored in the first type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted first type special character, to obtain a normalized first type classification feature;

If the preset dynamic list is a second type dynamic list, extracting the second type special character stored in the second type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted second type special character, to obtain a normalized second type classification feature;

If the preset dynamic list is a third type dynamic list, extracting the third type special character stored in the third type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted third type special character, to obtain a normalized third type classification feature.
The electronic device of claim 2, wherein splitting the special field into a plurality of field segments comprises:

Recording the position of the detachable character in the special field as a split point; and

Extracting the field segment before the split point and the field segment after the split point separately.
The electronic device of claim 3, wherein splitting the special field into a plurality of field segments comprises:

If the preset dynamic list is a first type dynamic list, extracting, from the normalized first type classification feature, a first type special field containing a detachable character, and splitting the first type special field into a plurality of field segments according to the position of the detachable character in the first type special field;

If the preset dynamic list is a second type dynamic list, extracting, from the normalized second type classification feature, a second type special field containing a detachable character, and splitting the second type special field into a plurality of field segments according to the position of the detachable character in the second type special field; and

If the preset dynamic list is a third type dynamic list, extracting, from the normalized third type classification feature, a third type special field containing a detachable character, and splitting the third type special field into a plurality of field segments according to the position of the detachable character in the third type special field.
The electronic device according to claim 1, wherein matching the fields that were not successfully matched against the target classification by means of the preset field logical inclusion relationship comprises:

Calculating a semantic similarity value between a field that was not successfully matched and the target classification according to a semantic logic similarity calculation algorithm; and

If the semantic similarity value is greater than a preset threshold, determining that the field that was not successfully matched has a logical inclusion relationship with the target classification, and marking that field as having a matching relationship with the target classification.
The electronic device according to claim 1, wherein the preset dynamic list is dynamically adjusted according to data changes of the data source.
The electronic device according to claim 1, wherein the target classification is rule data preset in an internal data platform.
An automatic data matching method applied to an electronic device, the method comprising:

Acquiring a classification feature obtained by a feature extraction operation;

Normalizing the classification feature obtained by the feature extraction operation according to a preset dynamic list, to obtain a normalized classification feature;

Extracting, from the normalized classification feature, a special field containing a detachable character, splitting the special field into a plurality of field segments according to the position of the detachable character in the special field, and matching the split field segments against a target classification; and

Matching the fields that were not successfully matched against the target classification by means of a preset field logical inclusion relationship.
The automatic data matching method according to claim 8, wherein normalizing the classification feature obtained by the feature extraction operation according to the preset dynamic list comprises:

If the preset dynamic list is a first type dynamic list, extracting the first type special character stored in the first type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted first type special character, to obtain a normalized first type classification feature;

If the preset dynamic list is a second type dynamic list, extracting the second type special character stored in the second type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted second type special character, to obtain a normalized second type classification feature;

If the preset dynamic list is a third type dynamic list, extracting the third type special character stored in the third type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted third type special character, to obtain a normalized third type classification feature.
The automatic data matching method according to claim 8, wherein splitting the special field into a plurality of field segments comprises:

Recording the position of the detachable character in the special field as a split point; and

Extracting the field segment before the split point and the field segment after the split point separately.
The automatic data matching method according to claim 10, wherein splitting the special field into a plurality of field segments comprises:

If the preset dynamic list is a first type dynamic list, extracting, from the normalized first type classification feature, a first type special field containing a detachable character, and splitting the first type special field into a plurality of field segments according to the position of the detachable character in the first type special field;

If the preset dynamic list is a second type dynamic list, extracting, from the normalized second type classification feature, a second type special field containing a detachable character, and splitting the second type special field into a plurality of field segments according to the position of the detachable character in the second type special field; and

If the preset dynamic list is a third type dynamic list, extracting, from the normalized third type classification feature, a third type special field containing a detachable character, and splitting the third type special field into a plurality of field segments according to the position of the detachable character in the third type special field.
The automatic data matching method according to claim 8, wherein matching the fields that were not successfully matched against the target classification by means of the preset field logical inclusion relationship comprises:

Calculating a semantic similarity value between a field that was not successfully matched and the target classification according to a semantic logic similarity calculation algorithm; and

If the semantic similarity value is greater than a preset threshold, determining that the field that was not successfully matched has a logical inclusion relationship with the target classification, and marking that field as having a matching relationship with the target classification.
The automatic data matching method according to claim 8, wherein the preset dynamic list is dynamically adjusted according to data changes of the data source.
The automatic data matching method according to claim 8, wherein the target classification is rule data preset in an internal data platform.
A computer readable storage medium storing an automatic data matching program, the automatic data matching program being executable by at least one processor to cause the at least one processor to perform the following steps:

Acquiring a classification feature obtained by a feature extraction operation;

Normalizing the classification feature obtained by the feature extraction operation according to a preset dynamic list, to obtain a normalized classification feature;

Extracting, from the normalized classification feature, a special field containing a detachable character, splitting the special field into a plurality of field segments according to the position of the detachable character in the special field, and matching the split field segments against a target classification; and

Matching the fields that were not successfully matched against the target classification by means of a preset field logical inclusion relationship.
The computer readable storage medium according to claim 15, wherein normalizing the classification feature obtained by the feature extraction operation according to the preset dynamic list comprises:

If the preset dynamic list is a first type dynamic list, extracting the first type special character stored in the first type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted first type special character, to obtain a normalized first type classification feature;

If the preset dynamic list is a second type dynamic list, extracting the second type special character stored in the second type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted second type special character, to obtain a normalized second type classification feature;

If the preset dynamic list is a third type dynamic list, extracting the third type special character stored in the third type dynamic list, and performing deletion or replacement processing on the classification feature obtained by the feature extraction operation according to the extracted third type special character, to obtain a normalized third type classification feature.
The computer readable storage medium of claim 15, wherein splitting the special field into a plurality of field segments comprises:

Recording the position of the detachable character in the special field as a split point; and

Extracting the field segment before the split point and the field segment after the split point separately.
The computer readable storage medium according to claim 17, wherein splitting the special field into a plurality of field segments comprises:

If the preset dynamic list is a first type dynamic list, extracting, from the normalized first type classification feature, a first type special field containing a detachable character, and splitting the first type special field into a plurality of field segments according to the position of the detachable character in the first type special field;

If the preset dynamic list is a second type dynamic list, extracting, from the normalized second type classification feature, a second type special field containing a detachable character, and splitting the second type special field into a plurality of field segments according to the position of the detachable character in the second type special field; and

If the preset dynamic list is a third type dynamic list, extracting, from the normalized third type classification feature, a third type special field containing a detachable character, and splitting the third type special field into a plurality of field segments according to the position of the detachable character in the third type special field.
The computer readable storage medium according to claim 15, wherein matching the fields that were not successfully matched against the target classification by means of the preset field logical inclusion relationship comprises:

Calculating a semantic similarity value between a field that was not successfully matched and the target classification according to a semantic logic similarity calculation algorithm; and

If the semantic similarity value is greater than a preset threshold, determining that the field that was not successfully matched has a logical inclusion relationship with the target classification, and marking that field as having a matching relationship with the target classification.
The computer readable storage medium of claim 15, wherein the preset dynamic list is dynamically adjusted according to data changes of the data source.