CN109101634B - Data recording processing method, device, electronic equipment and storage medium - Google Patents
- Publication number
- CN109101634B (application CN201810931008.8A)
- Authority
- CN
- China
- Prior art keywords
- data record
- database
- matching
- value
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The disclosure relates to a data recording processing method, a data recording processing device, an electronic device and a storage medium, and aims to improve the matching efficiency of data records. The method comprises the following steps: obtaining a first database and a second database describing the same set of objects; determining, according to a preset matching rule, a matching value between a first data record in the first database and each data record to be matched in the second database, wherein the first data record is used for describing a first object in the object set; and determining a second data record for describing the first object from the second database according to the matching value corresponding to each data record to be matched in the second database.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data recording processing method and apparatus, an electronic device, and a storage medium.
Background
In the business process of each enterprise, a large amount of data, such as user data, business data, etc., is generally generated. Over time, this data is gradually accumulated into the data resources of the enterprise. Different enterprises may process the owned data resources in different ways, and then store the processed data records in a database to provide references for business decisions of enterprise operators.
However, as enterprises' processing requirements for data resources become increasingly complex, there may be a need to match data records across databases that were built using different processing approaches. In the related art, such data records are matched manually, and the matching efficiency is low.
Disclosure of Invention
The present disclosure aims to provide a data recording processing method, a data recording processing device, an electronic device and a storage medium, so as to improve the matching efficiency of data records.
In order to achieve the above object, a first aspect of the embodiments of the present disclosure provides a data recording processing method, where the method includes:
obtaining a first database and a second database describing the same set of objects;
determining a matching value of a first data record in the first database and each data record to be matched in the second database according to a preset matching rule, wherein the first data record is used for describing a first object in the object set;
and determining a second data record for describing the first object from the second database according to the matching value corresponding to each data record to be matched in the second database.
Optionally, determining a second data record describing the first object from the second database includes:
sorting the matching values corresponding to the data records to be matched in the second database;
determining a difference between the highest match value and the next highest match value;
and determining the data record with the highest corresponding matching value as the second data record under the condition that the difference value is larger than a preset threshold value.
Optionally, the method further comprises:
under the condition that the difference value is not larger than the preset threshold value, outputting prompt information, wherein the prompt information is used for prompting a user to select one data record from the data record with the highest corresponding matching value and the data record with the next highest corresponding matching value;
determining a second data record from the second database describing the first object, comprising:
determining the user-selected data record as the second data record.
Optionally, the preset matching rule includes a plurality of sub-matching rules; determining, according to the preset matching rule, a matching value between the first data record in the first database and any data record to be matched in the second database includes:
determining a matching initial value of the first data record and any data record to be matched in the second database according to each sub-matching rule;
and determining a matching value corresponding to the data record to be matched in the second database according to the matching initial value corresponding to each sub-matching rule and the weight value of each sub-matching rule.
Optionally, after determining a second data record describing the first object from the second database, the method further comprises:
and storing the first data record, the second data record and the matching relation between the first data record and the second data record to a third database for describing the object set.
Optionally, the method further comprises:
and when a data record acquisition request for the first object is detected, acquiring the first data record and/or the second data record from the third database.
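The storing and acquisition steps above can be sketched as follows. This is a minimal illustration only: the in-memory SQLite database, the table name `matched_records`, the `object_id` key and the JSON encoding of records are all assumptions made for the example and are not specified by the disclosure.

```python
import json
import sqlite3


def create_third_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for the third database (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE matched_records ("
        " object_id TEXT PRIMARY KEY,"
        " first_record TEXT NOT NULL,"
        " second_record TEXT NOT NULL)"
    )
    return conn


def store_match(conn, object_id, first_record, second_record):
    """Store both records in one row; sharing a row encodes the matching relation."""
    conn.execute(
        "INSERT INTO matched_records VALUES (?, ?, ?)",
        (object_id, json.dumps(first_record), json.dumps(second_record)),
    )
    conn.commit()


def get_records(conn, object_id):
    """Serve a data record acquisition request for one object."""
    row = conn.execute(
        "SELECT first_record, second_record FROM matched_records"
        " WHERE object_id = ?",
        (object_id,),
    ).fetchone()
    if row is None:
        return None  # no matched pair stored for this object
    return json.loads(row[0]), json.loads(row[1])


third_db = create_third_database()
store_match(
    third_db,
    "chengdu",  # hypothetical object identifier
    {"name": "Chengdu", "zip": 610000},
    {"name": "Chengdu (CD)", "zip": 610000},
)
first, second = get_records(third_db, "chengdu")
print(first, second)
```

Keeping the two records in the same row is only one possible way to record the matching relation; a separate relation table keyed by record identifiers would serve equally well.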
Optionally, the preset matching rule includes:
a general matching rule, or a special matching rule configured based on characteristic parameters of objects in the object set, or a combination of the general matching rule and the special matching rule, wherein the general matching rule comprises: a fuzzy matching rule, or an equivalence matching rule, or a combination of both.
Optionally, the characteristic parameter of the objects in the object set is a geographical location; the special matching rule comprises: a longitude and latitude matching rule and/or an administrative region level matching rule.
A second aspect of the embodiments of the present disclosure provides a data recording processing apparatus, including:
an obtaining module for obtaining a first database and a second database for describing the same object set;
a matching value determining module, configured to determine, according to a preset matching rule, a matching value between a first data record in the first database and each data record to be matched in the second database, where the first data record is used to describe a first object in the object set;
and the data record determining module is used for determining a second data record for describing the first object from the second database according to the matching value corresponding to each data record to be matched in the second database.
Optionally, the data record determining module includes:
the sorting submodule is used for sorting the matching values corresponding to the data records to be matched in the second database;
a first determining submodule for determining a difference between a highest match value and a next highest match value;
and the second determining submodule is used for determining the data record with the highest corresponding matching value as the second data record under the condition that the difference value is larger than a preset threshold value.
Optionally, the apparatus further comprises:
the output module is used for outputting prompt information under the condition that the difference value is not larger than the preset threshold value, wherein the prompt information is used for prompting a user to select one data record from the data record with the highest corresponding matching value and the data record with the next highest corresponding matching value;
the data record determination module comprises:
a third determining submodule, configured to determine the data record selected by the user as the second data record.
Optionally, the matching rule comprises a plurality of sub-matching rules; the matching value determination module includes:
a matching initial value determining submodule, configured to determine, according to each sub-matching rule, a matching initial value of the first data record and any data record to be matched in the second database;
and the matching value determining submodule is used for determining the matching value corresponding to the data record to be matched in the second database according to the matching initial value corresponding to each sub-matching rule and the weight value of each sub-matching rule.
Optionally, the apparatus further comprises:
and the storage module is used for storing the first data record, the second data record and the matching relation between the first data record and the second data record into a third database for describing the object set.
Optionally, the apparatus further comprises:
an obtaining module, configured to obtain the first data record and/or the second data record from the third database when a data record obtaining request for the first object is detected.
Optionally, the preset matching rule includes:
a general matching rule, or a special matching rule configured based on characteristic parameters of objects in the object set, or a combination of the general matching rule and the special matching rule, wherein the general matching rule comprises: a fuzzy matching rule, or an equivalence matching rule, or a combination of both.
Optionally, the characteristic parameter of the object in the set of objects is a geographical location; the special matching rules include: longitude and latitude matching rules and/or administrative region level matching rules.
A third aspect of the disclosed embodiments provides an electronic device, comprising a processor; a memory for storing processor-executable instructions; wherein, the processor is used for executing the steps of the data recording processing method.
A fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium on which computer program instructions are stored, which program instructions, when executed by a processor, implement the steps of the above-mentioned data recording processing method.
According to the technical scheme, after a plurality of databases used for describing the same object set are obtained, according to a preset matching rule, the matching value of the data record in one database of the databases and the data records in other databases except the database is determined, and finally, according to the determined matching value, the data records used for describing the same object in the object set in the databases are determined. Therefore, the data records in the databases can be automatically matched without manual matching, and the matching efficiency is improved.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
Fig. 1 is a flowchart of a data recording processing method according to an embodiment of the present disclosure.
Fig. 2 is another flowchart of a data recording processing method according to an embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a data recording processing apparatus according to an embodiment of the disclosure.
Fig. 4 is another schematic diagram of a data recording processing apparatus according to an embodiment of the disclosure.
Fig. 5 is a block diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
The embodiment of the disclosure provides a data record processing method, which includes after obtaining a plurality of databases for describing the same object set, determining a matching value between a data record in one of the databases and a data record in another database except the database according to a preset matching rule, and finally determining a data record in the databases for describing the same object in the object set according to the determined matching value. Therefore, the data records in the databases can be automatically matched without manual matching, and the matching efficiency is improved.
The following describes the data recording processing method provided by the embodiments of the present disclosure in detail with reference to specific embodiments.
Referring to Fig. 1, which is a flowchart of a data recording processing method provided by an embodiment of the present disclosure, the method includes the following steps:
Step S11: obtaining a first database and a second database describing the same set of objects;
Step S12: determining, according to a preset matching rule, a matching value between a first data record in the first database and each data record to be matched in the second database, wherein the first data record is used for describing a first object in the object set;
Step S13: determining a second data record for describing the first object from the second database according to the matching value corresponding to each data record to be matched in the second database.
The object set is a set of a plurality of objects. The objects in the object set belong to the same type and share the same characteristic parameters, while the values of those characteristic parameters differ from object to object. Illustratively, the object set is a set of cities, and each city in the object set has the same characteristic parameters, including: name, definition, zip code, longitude and latitude, administrative region level, etc. The names, definitions, zip codes, longitude and latitude values, and administrative region levels of different cities in the object set are different. For example, for Beijing, the name is: Beijing; the definition is: the capital of China; the zip code is: 100000; the longitude and latitude values are: N39°54′11.97″, E116°24′3.52″; and the administrative region level is: municipality directly under the central government. For Chengdu, the name is: Chengdu; the definition is: the largest provincial capital city in the southwest; the zip code is: 610000; the longitude and latitude values are: N30°34′21.63″, E104°03′44.20″; and the administrative region level is: provincial capital (belonging to Sichuan Province).
A database is a collection of data records, each of which describes one object in the object set.
Illustratively, one data record in the first database is as follows:
Name: Chengdu; definition: the largest provincial capital city in the southwest; zip code: 610000; longitude and latitude values: N30°34′21.63″, E104°03′44.20″; administrative region level: provincial capital (Chengdu belongs to Sichuan Province).
Illustratively, one data record in the second database is as follows:
Name: Chengdu (CD); definition: a provincial capital city in the southwest, famous for its leisurely, slow pace of life; zip code: 610000; longitude and latitude values: N30°34′21.63″, E104°03′44.20″; administrative region level: provincial capital (belonging to Sichuan Province).
In a practical application scenario, there may be multiple databases each describing the same set of objects. Illustratively, the set of objects is a set of cities, and the databases of the plurality of e-commerce type enterprises are all used for describing the set of objects; as another example, a set of objects is a set of stores, and a database of a plurality of takeaway-type enterprises is used to describe the set of objects.
It is understood that there may be more than two databases describing the same object set. To match the data records in multiple databases, one of the databases may be taken as the first database and any other one of them as the second database; the data recording processing method provided by the embodiment of the present disclosure is then executed to match the data records in these two databases. By executing the method repeatedly, the data records in all of the databases can be matched.
To address the situation in which data records that describe the same object are inconsistent across multiple databases, the embodiment of the present disclosure proposes matching the data records in different databases according to a preset matching rule. The preset matching rule comprises: a general matching rule, or a special matching rule configured based on characteristic parameters of objects in the object set, or a combination of the general matching rule and the special matching rule, wherein the general matching rule comprises: a fuzzy matching rule, or an equivalence matching rule, or a combination of both. Optionally, the characteristic parameter of the objects in the object set is a geographical location, and the special matching rule includes: a longitude and latitude matching rule and/or an administrative region level matching rule.
In the disclosed embodiments, the general matching rule is applicable to matching data records in databases describing any set of objects.
Illustratively, the object set is a set of cities, and the database of enterprise A and the database of enterprise B are both used for describing the object set; in the process of determining, from the database of enterprise B, a data record matching one data record in the database of enterprise A, the general matching rule can be used as the preset matching rule. As another example, the object set is a set of stores, and the database of enterprise C and the database of enterprise D are both used for describing the object set; in the process of determining, from the database of enterprise D, a data record matching one data record in the database of enterprise C, the general matching rule can also be used as the preset matching rule.
The general matching rule includes: a fuzzy matching rule, or an equivalence matching rule, or a combination of both. The fuzzy matching rule is applicable to text-type data items in a data record, and the equivalence matching rule is applicable to numerical-type data items in a data record. If the data records include only text-type data items, only the fuzzy matching rule may be selected as the general matching rule; if the data records include only numerical-type data items, only the equivalence matching rule may be selected as the general matching rule; and if the data records include both text-type and numerical-type data items, both the fuzzy matching rule and the equivalence matching rule may be used as general matching rules.
Illustratively, one data record in the first database is as follows: name: Chengdu; definition: the largest provincial capital city in the southwest; zip code: 610000. One data record in the second database is as follows: name: Chengdu (CD); definition: a provincial capital city in the southwest, famous for its leisurely, slow pace of life; zip code: 610000. Since the name and definition data items are text-type data items and the zip code data item is a numerical-type data item, both the fuzzy matching rule and the equivalence matching rule are used as general matching rules.
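The two general matching rules can be sketched as follows. The disclosure does not name a particular fuzzy matching algorithm, so the use of `difflib.SequenceMatcher` from the Python standard library is an assumption for illustration, as are the function names `fuzzy_match` and `equivalence_match` and the convention that a matching initial value lies between 0 and 1.

```python
from difflib import SequenceMatcher


def fuzzy_match(text_a: str, text_b: str) -> float:
    """Fuzzy rule for text-type items: similarity ratio in [0.0, 1.0].

    SequenceMatcher is one possible similarity measure (an assumption);
    any other string similarity could be substituted.
    """
    return SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()


def equivalence_match(value_a: int, value_b: int) -> float:
    """Equivalence rule for numerical-type items: 1.0 if equal, else 0.0."""
    return 1.0 if value_a == value_b else 0.0


# Name (text type) and zip code (numerical type) of the two Chengdu records.
name_score = fuzzy_match("Chengdu", "Chengdu (CD)")
zip_score = equivalence_match(610000, 610000)
print(f"name similarity: {name_score:.2f}, zip code match: {zip_score}")
```

The name "Chengdu (CD)" is not identical to "Chengdu", so an equivalence test on the name would fail, while the fuzzy rule still reports a high similarity; this is why text-type items are handled by the fuzzy rule.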
In the embodiment of the present disclosure, the special matching rule (or personalized rule) is a matching rule established specifically for databases describing a particular object set, according to the characteristic parameters of the objects in that object set (for the characteristic parameters of the objects, refer to the foregoing description, which is not repeated here). Because the characteristic parameters of the objects differ between object sets, the special matching rules applicable to databases describing different object sets also differ.
Illustratively, the object set is a set of cities, and each city in the object set has the same characteristic parameters, including: name, definition, zip code, longitude and latitude, administrative region level, etc. The special matching rules established for the object set include: a longitude and latitude matching rule and an administrative region level matching rule. The database of enterprise A and the database of enterprise B are both used for describing the object set, so the matching rules between the database of enterprise A and the database of enterprise B are determined as: the longitude and latitude matching rule and the administrative region level matching rule.
Illustratively, the object set is a set of stores, and each store in the object set has the same characteristic parameters, including: POI (Point of Interest; each POI contains four aspects of information: name, category, coordinates, and classification), merchant contact phone, contact person, brand, etc. The special matching rules established for the object set include: a POI matching rule, a merchant contact phone matching rule, a contact person matching rule, and a brand matching rule. The database of enterprise C and the database of enterprise D are both used for describing the object set, so the matching rules between the database of enterprise C and the database of enterprise D are determined as: the POI matching rule, the merchant contact phone matching rule, the contact person matching rule, and the brand matching rule.
It can be seen that, since the database of enterprise A and the database of enterprise B both describe the set of cities while the database of enterprise C and the database of enterprise D both describe the set of stores, and the characteristic parameters of cities differ from those of stores, the special matching rules applied between the database of enterprise A and the database of enterprise B differ from those applied between the database of enterprise C and the database of enterprise D.
In one embodiment, the general matching rule and the special matching rule can be combined, that is, the preset matching rule comprises both the general matching rule and the special matching rule. On the one hand, the matching accuracy is greatly improved because more matching rules are applied; on the other hand, different special matching rules can be set for the different objects described by the data records, which improves the flexibility of data record matching.
Illustratively, the database of enterprise A and the database of enterprise B are both used to describe a set of cities, and in determining, from the database of enterprise B, a data record matching one data record in the database of enterprise A, the following rules are taken as the preset matching rules:
1) universal matching rules, including fuzzy matching rules (for matching names of cities and definitions of cities) and equivalence matching rules (for matching zip codes of cities);
2) the special matching rules comprise longitude and latitude matching rules (used for matching longitude and latitude values of the city) and administrative region level matching rules (used for matching administrative region levels of the city).
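A minimal sketch of the two special matching rules for the city example follows. The disclosure does not state how two longitude and latitude values are compared, so the great-circle (haversine) distance, the `tolerance_km` parameter and all function names are assumptions for illustration; the administrative region level rule is written as a plain equality test.

```python
import math


def dms_to_decimal(degrees: float, minutes: float, seconds: float) -> float:
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


def haversine_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in km between two points in decimal degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def lat_lon_match(point_a, point_b, tolerance_km: float = 1.0) -> float:
    """Hypothetical rule: 1.0 if the points lie within tolerance_km, else 0.0."""
    return 1.0 if haversine_km(*point_a, *point_b) <= tolerance_km else 0.0


def admin_level_match(level_a: str, level_b: str) -> float:
    """1.0 if both records carry the same administrative region level."""
    return 1.0 if level_a == level_b else 0.0


# Coordinates taken from the city examples in the description.
CHENGDU = (dms_to_decimal(30, 34, 21.63), dms_to_decimal(104, 3, 44.20))
BEIJING = (dms_to_decimal(39, 54, 11.97), dms_to_decimal(116, 24, 3.52))

print(lat_lon_match(CHENGDU, CHENGDU), lat_lon_match(CHENGDU, BEIJING))
print(admin_level_match("provincial capital", "municipality"))
```

A distance tolerance rather than strict equality is chosen here because two databases may store slightly different coordinates for the same city; a graded score that decays with distance would be another reasonable design.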
In the embodiment of the present disclosure, although the first database and the second database both describe the same set of objects, it is not known in advance which data record in the second database describes the object (e.g., the first object) described by a given data record (e.g., the first data record) in the first database. The preset matching rule therefore needs to be applied: the first data record in the first database is compared one by one with each data record to be matched in the second database, the matching value between the first data record and each such data record is determined, and the data record describing the first object (i.e., the second data record) is then determined from the second database.
The data records to be matched are the data records in the second database that have not yet been successfully matched. For example, when a data record matching one data record (e.g., the first data record) in the first database is determined from the second database for the first time, the data records to be matched are all of the data records in the second database. Once the second data record has been determined from the second database as the match for the first data record, the second data record counts as successfully matched. When a data record matching another data record (e.g., a third data record) in the first database is then determined from the second database, the data records to be matched are the remaining data records in the second database other than the second data record.
Optionally, determining a second data record describing the first object from the second database includes:
sorting the matching values corresponding to the data records to be matched in the second database;
determining a difference between the highest match value and the next highest match value;
determining the data record with the highest corresponding matching value as the second data record under the condition that the difference value is larger than a preset threshold value;
under the condition that the difference value is not larger than the preset threshold value, outputting prompt information, wherein the prompt information is used for prompting a user to select one data record from the data record with the highest corresponding matching value and the data record with the next highest corresponding matching value; determining the user-selected data record as the second data record.
In the embodiment of the present disclosure, the data record to be matched in the second database that has the highest matching value may be used as the data record matching the first data record (i.e., the second data record).
Alternatively, the matching values corresponding to the data records to be matched in the second database may be sorted to determine the highest matching value and the next highest matching value, and the difference between the two is then determined. If the difference is greater than the preset threshold, the data record with the highest matching value is clearly a better match than the data record with the next highest matching value, so the former is directly used as the second data record. If the difference is not greater than the preset threshold, the two data records are hard to tell apart and either of them may be the data record matching the first data record; in this case, prompt information is output so that the user selects one of the two, and the data record selected by the user is used as the second data record.
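The threshold-based selection can be sketched as follows. The function name `select_match`, the `(record_id, matching_value)` pair layout and the `choose` callback, which stands in for prompting the user, are hypothetical; returning the only candidate when a single record remains is likewise an assumption, since the disclosure does not cover that case.

```python
def select_match(scored, threshold, choose):
    """Pick the second data record from scored candidates.

    scored:    list of (record_id, matching_value) pairs
    threshold: preset threshold on the gap between the top two values
    choose:    callback(best_id, second_id) -> chosen id; a stand-in for
               the prompt shown to the user (hypothetical interface)
    """
    if not scored:
        raise ValueError("no data records to be matched")
    # Sort by matching value, highest first.
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]  # assumption: a lone candidate is accepted
    (best_id, best), (second_id, second) = ranked[0], ranked[1]
    if best - second > threshold:
        return best_id  # clear winner: no user involvement needed
    return choose(best_id, second_id)  # ambiguous: defer to the user


# Clear gap: the highest-scoring record is selected automatically.
auto = select_match([("b", 0.9), ("b1", 0.2)], 0.1, choose=lambda x, y: y)
# Small gap: the (simulated) user picks the second candidate.
manual = select_match([("b", 0.55), ("b2", 0.5)], 0.1, choose=lambda x, y: y)
print(auto, manual)
```

Passing the user interaction in as a callback keeps the selection logic testable without any real prompt.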
It is understood that the number of data records in the first database may be multiple, one data record in the first database may be used as the first data record, the first data record is compared with all data records in the second database, and steps S12-S13 are executed until a data record (i.e., the second data record) matching the first data record is determined from the second database, and thus, the matching of the first data record in the first database with the second data record in the second database is completed. Similarly, one data record in the first database except the first data record is taken as a new first data record, the new first data record is compared with the remaining data records in the second database except the second data record, and the steps S12-S13 are executed until a data record matching the new first data record is determined from the second database.
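The repeated execution of steps S12 and S13 over a shrinking pool of data records to be matched can be sketched as follows. The function name `match_databases`, the dictionary-based database layout and the `matching_value` callback are assumptions for illustration, and the sketch uses the simplest selection strategy (highest matching value) rather than the threshold variant.

```python
def match_databases(first_db, second_db, matching_value):
    """Match each record of first_db against the still-unmatched
    records of second_db.

    first_db, second_db: dicts mapping record id -> record
    matching_value:      callback(record_1, record_2) -> float
    Returns a dict mapping first-database ids to second-database ids.
    """
    to_match = dict(second_db)  # the data records to be matched
    matches = {}
    for first_id, first_record in first_db.items():
        if not to_match:
            break  # every record in the second database is already matched
        # Step S12: score each candidate; step S13: keep the best one.
        best_id = max(
            to_match,
            key=lambda rid: matching_value(first_record, to_match[rid]),
        )
        matches[first_id] = best_id
        del to_match[best_id]  # successfully matched, so leave the pool
    return matches


first_db = {"a1": {"zip": 610000}, "a2": {"zip": 100000}}
second_db = {"b1": {"zip": 100000}, "b2": {"zip": 610000}}
same_zip = lambda r1, r2: 1.0 if r1["zip"] == r2["zip"] else 0.0
result = match_databases(first_db, second_db, same_zip)
print(result)
```

Removing each matched record from the pool means later comparisons run against fewer candidates, which is the effect described for the third data record above.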
In one embodiment, the preset matching rule comprises a plurality of sub-matching rules; accordingly, step S12 includes:
determining a matching initial value of the first data record and the data record in the second database according to each sub-matching rule;
and obtaining the matching value according to the matching initial value corresponding to each sub-matching rule and the weight value of each sub-matching rule.
In the embodiment of the present disclosure, the preset matching rule may be a general matching rule, a special matching rule, or a combination of the two. There may be multiple general matching rules and multiple special matching rules, so the preset matching rule may consist of multiple matching rules, each of which is a sub-matching rule.
Illustratively, the preset matching rule includes four sub-matching rules: (1) a fuzzy matching rule, (2) an equivalence matching rule, (3) a longitude and latitude matching rule, and (4) an administrative region level matching rule.
Different sub-matching rules evaluate the matching degree between two data records from different databases from different angles, so when the matching value between two such data records is determined, each sub-matching rule needs to be given its own weight value. The weight value of each sub-matching rule may be a default value or may be determined according to the reliability of that sub-matching rule, and the reliability of a sub-matching rule may be obtained through neural network learning.
For each sub-matching rule, a matching initial value between the first data record in the first database and one data record in the second database is determined; applying the plurality of sub-matching rules thus yields a plurality of matching initial values. Each matching initial value is then multiplied by the weight value of the sub-matching rule that produced it, giving one product per sub-matching rule, and the products are finally added to obtain the matching value between the first data record in the first database and that data record in the second database.
Illustratively, the preset matching rule includes four sub-matching rules: (1) a fuzzy matching rule, (2) an equivalence matching rule, (3) a longitude and latitude matching rule, and (4) an administrative region level matching rule, with weight values a1, a2, a3 and a4, respectively. For data record a in the database of enterprise A and data record b in the database of enterprise B: applying the fuzzy matching rule gives a matching initial value of Score1; applying the equivalence matching rule gives a matching initial value of Score2; applying the longitude and latitude matching rule gives a matching initial value of Score3; and applying the administrative region level matching rule gives a matching initial value of Score4. The matching value between data record a in enterprise A's database and data record b in enterprise B's database is then Score1 × a1 + Score2 × a2 + Score3 × a3 + Score4 × a4.
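The weighted combination can be written as a short function. The numeric matching initial values and weight values below are hypothetical and chosen only to make the arithmetic visible; the disclosure leaves them open.

```python
def matching_value(initial_values, weights):
    """Weighted sum of matching initial values:
    Score1 * a1 + Score2 * a2 + ... (one term per sub-matching rule)."""
    if len(initial_values) != len(weights):
        raise ValueError("one weight value is needed per sub-matching rule")
    return sum(score * weight for score, weight in zip(initial_values, weights))


# Hypothetical values for Score1..Score4 and a1..a4.
scores = [0.8, 1.0, 1.0, 1.0]
weights = [0.4, 0.2, 0.3, 0.1]
value = matching_value(scores, weights)
print(round(value, 2))  # 0.8*0.4 + 1.0*0.2 + 1.0*0.3 + 1.0*0.1
```

In this sketch the weight values sum to 1, so the matching value stays in the same range as the matching initial values; this normalization is a convenience and is not required by the description.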
The following is a complete example of how to determine, from the second database, the data record that matches a first data record in the first database.
The database of enterprise A and the database of enterprise B both describe a set of cities, and the preset matching rule includes four sub-matching rules: (1) a fuzzy matching rule, (2) an equivalence matching rule, (3) a longitude and latitude matching rule, and (4) an administrative region level matching rule, with weight values a1, a2, a3 and a4, respectively.
Data record a in enterprise A's database is as follows:
Name: Chengdu; definition: the largest provincial capital city in the southwest; zip code: 610000; longitude and latitude: N30°34'21.63", E104°03'44.20"; administrative level: provincial capital (of Sichuan Province).
Data record b in enterprise B's database is as follows:
Name: Chengdu (CD); definition: a provincial capital city in the southwest, famous for its leisurely, slow pace of life; zip code: 610000; longitude and latitude: N30°34'21.63", E104°03'44.20"; administrative level: provincial capital (of Sichuan Province).
Data record b1 in enterprise B's database is as follows:
Name: Beijing; definition: capital of China; zip code: 100000; longitude and latitude: N39°54'11.97", E116°24'3.52"; administrative level: municipality directly under the central government.
The fuzzy matching rule is applied to data record a in the database of enterprise A against data record b and data record b1 in the database of enterprise B, and the matching initial values determined are Score1 and Score1', respectively. Applying the equivalence matching rule yields matching initial values Score2 and Score2' (Score2' is zero because the zip code of Chengdu differs from that of Beijing). Applying the longitude and latitude matching rule yields matching initial values Score3 and Score3'. Applying the administrative region level matching rule yields matching initial values Score4 and Score4' (Score4' is zero because the administrative levels do not match: a provincial capital and a municipality directly under the central government are two different administrative levels). The matching value between data record a in the database of enterprise A and data record b in the database of enterprise B is then Score = Score1 × a1 + Score2 × a2 + Score3 × a3 + Score4 × a4, and the matching value between data record a in the database of enterprise A and data record b1 in the database of enterprise B is Score' = Score1' × a1 + Score2' × a2 + Score3' × a3 + Score4' × a4.
Score and Score' are then compared. Because Score is greater than Score', data record b in the database of enterprise B is a better match for data record a in the database of enterprise A than data record b1 is.
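As a rough illustration of the complete example, the sketch below scores the two candidate records of enterprise B against the record of enterprise A. Everything beyond the record contents is an assumption made for illustration: the field names, the English renderings of the city records, the decimal-degree conversion of the coordinates, the equal weight values, and the concrete form of each sub-matching rule (in particular, the fuzzy rule is approximated here with `difflib.SequenceMatcher` from the Python standard library, which the disclosure does not prescribe).

```python
from difflib import SequenceMatcher

# Records from the example; coordinates are approximate decimal degrees.
record_a = {"name": "Chengdu", "zip_code": "610000", "lat": 30.5727, "lon": 104.0623,
            "definition": "the largest provincial capital city in the southwest",
            "admin_level": "provincial capital"}
record_b = {"name": "Chengdu (CD)", "zip_code": "610000", "lat": 30.5727, "lon": 104.0623,
            "definition": "a provincial capital city in the southwest",
            "admin_level": "provincial capital"}
record_b1 = {"name": "Beijing", "zip_code": "100000", "lat": 39.9033, "lon": 116.4010,
             "definition": "capital of China",
             "admin_level": "municipality"}


def fuzzy_rule(r1, r2):
    # Assumed fuzzy rule: string similarity of the definitions, in [0, 1].
    return SequenceMatcher(None, r1["definition"], r2["definition"]).ratio()


def equivalence_rule(r1, r2):
    # Assumed equivalence rule: zip codes must be identical.
    return 1.0 if r1["zip_code"] == r2["zip_code"] else 0.0


def lat_lon_rule(r1, r2, tolerance_deg=0.01):
    # Assumed rule: both coordinates agree within a small tolerance.
    close = (abs(r1["lat"] - r2["lat"]) <= tolerance_deg
             and abs(r1["lon"] - r2["lon"]) <= tolerance_deg)
    return 1.0 if close else 0.0


def admin_level_rule(r1, r2):
    # Assumed rule: administrative levels must be identical.
    return 1.0 if r1["admin_level"] == r2["admin_level"] else 0.0


RULES = [fuzzy_rule, equivalence_rule, lat_lon_rule, admin_level_rule]
WEIGHTS = [0.25, 0.25, 0.25, 0.25]  # hypothetical a1..a4


def match_value(r1, r2):
    # Weighted sum of the matching initial values of all sub-matching rules.
    return sum(w * rule(r1, r2) for rule, w in zip(RULES, WEIGHTS))


score = match_value(record_a, record_b)         # Score
score_prime = match_value(record_a, record_b1)  # Score'
best = max([record_b, record_b1], key=lambda r: match_value(record_a, r))
```

With these assumed rules, three of the four sub-matching rules return zero for the Beijing record, so `score` exceeds `score_prime` regardless of the fuzzy similarity, and `best` is the Chengdu record of enterprise B.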
Fig. 2 is another flowchart of the data record processing method according to an embodiment of the present disclosure. Referring to fig. 2, in one implementation, the data record processing method provided by the embodiment of the present disclosure further includes the following step:
Step S14: storing the first data record, the second data record, and the matching relationship between the first data record and the second data record in a third database.
Optionally, as shown in fig. 2, the data record processing method provided in the embodiment of the present disclosure further includes the following step:
Step S15: when a data record acquisition request for the first object is detected, acquiring the first data record and/or the second data record from the third database.
In the embodiments of the present disclosure, data records in databases produced by different processing modes may need to be fused into a new database. To this end, after the data records in the different databases have been matched one by one, two or more matched data records and the matching relationship between them are stored in a third database (a database different from the first database and the second database), thereby fusing and unifying the data records of the plurality of databases. Subsequently, if a data record acquisition request for an object in the object set (for example, the first object) is detected, the third database can be called and the data record describing the object can be read from it: only the first data record, only the second data record, or both the first data record and the second data record.
Illustratively, upon determining that data record b in the database of enterprise B matches data record a in the database of enterprise A, a new data record is created that comprises data record a, data record b, and the matching relationship between the two. The newly created data record is then stored in another database.
Thereafter, if a user wants to know how the database of enterprise B describes the city, the newly created data record can be obtained from the other database, and data record b can then be extracted from it.
Alternatively, if the user already knows data record b used in the database of enterprise B and wants to know which data record in the database of enterprise A matches data record b, the newly created data record can be obtained from the other database, and the matching relationship between data record b and data record a can then be extracted from it.
Alternatively, when each data record describing the city needs to be merged with other data in real time, the newly created data record can be obtained from the other database, and data record a and data record b can be extracted together and merged with the other data in real time.
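Steps S14 and S15 could be sketched as follows, using an in-memory SQLite database as a stand-in for the third database. The table layout, the function names (`create_third_db`, `store_match`, `fetch_records`) and the `which` parameter are hypothetical; the disclosure does not specify how the third database is organised. Here the matching relationship is represented simply by storing both records in the same row.

```python
import json
import sqlite3


def create_third_db():
    # In-memory stand-in for the third database (assumed schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fused ("
        "object_key TEXT PRIMARY KEY, "
        "first_record TEXT NOT NULL, "
        "second_record TEXT NOT NULL)"
    )
    return conn


def store_match(conn, object_key, first_record, second_record):
    # Step S14: store both records; sharing a row encodes the match.
    conn.execute(
        "INSERT INTO fused VALUES (?, ?, ?)",
        (object_key, json.dumps(first_record), json.dumps(second_record)),
    )
    conn.commit()


def fetch_records(conn, object_key, which="both"):
    # Step S15: return the first record, the second record, or both.
    row = conn.execute(
        "SELECT first_record, second_record FROM fused WHERE object_key = ?",
        (object_key,),
    ).fetchone()
    if row is None:
        return None
    first, second = json.loads(row[0]), json.loads(row[1])
    if which == "first":
        return first
    if which == "second":
        return second
    return first, second


conn = create_third_db()
store_match(conn, "chengdu", {"source": "A", "zip_code": "610000"},
            {"source": "B", "zip_code": "610000"})
only_b = fetch_records(conn, "chengdu", which="second")
both = fetch_records(conn, "chengdu")
```

Reading with `which="second"` corresponds to the case where a user only wants enterprise B's description of the city, while the default returns both records for merging with other data.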
Based on the same inventive concept, the embodiment of the present disclosure also provides a data record processing apparatus. Referring to fig. 3, fig. 3 is a schematic diagram of a data record processing apparatus according to an embodiment of the present disclosure. As shown in fig. 3, the data record processing apparatus 300 provided in the embodiment of the present disclosure includes:
an obtaining module 301, configured to obtain a first database and a second database that describe a same object set;
a matching value determining module 302, configured to determine, according to a preset matching rule, a matching value between a first data record in the first database and each data record to be matched in the second database, where the first data record is used to describe a first object in the object set;
a data record determining module 303, configured to determine, according to a matching value corresponding to each data record to be matched in the second database, a second data record used for describing the first object from the second database.
Optionally, the data record determining module includes:
the sorting submodule is used for sorting the matching values corresponding to the data records to be matched in the second database;
a first determining submodule for determining a difference between a highest match value and a next highest match value;
and the second determining submodule is used for determining the data record with the highest corresponding matching value as the second data record under the condition that the difference value is larger than a preset threshold value.
Optionally, the apparatus further comprises:
the output module is used for outputting prompt information under the condition that the difference value is not larger than the preset threshold value, wherein the prompt information is used for prompting a user to select one data record from the data record with the highest corresponding matching value and the data record with the next highest corresponding matching value;
the data record determination module comprises:
a third determining submodule, configured to determine the data record selected by the user as the second data record.
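The behaviour of the sorting submodule, the determining submodules and the output module can be sketched as a single function. The function name, the `ask_user` callback that stands in for the prompt information, and the handling of an empty or single-candidate input are assumptions added for illustration.

```python
def select_second_record(match_values, threshold, ask_user=None):
    """Pick the second data record from {record_id: matching value}.

    If the highest matching value exceeds the next highest by more than
    `threshold`, the top record is chosen automatically; otherwise the
    choice between the top two records is delegated to `ask_user`.
    """
    # Sort the matching values from highest to lowest.
    ranked = sorted(match_values.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None  # assumption: nothing to match against
    if len(ranked) == 1:
        return ranked[0][0]  # assumption: a single candidate is accepted as is
    (best_id, best_value), (next_id, next_value) = ranked[0], ranked[1]
    if best_value - next_value > threshold:
        return best_id
    if ask_user is None:
        raise ValueError("ambiguous match and no user prompt available")
    # Stand-in for the prompt information: the user picks one of the two.
    return ask_user(best_id, next_id)


clear = select_second_record({"b": 0.9, "b1": 0.2}, threshold=0.1)
ambiguous = select_second_record({"b": 0.55, "b1": 0.5}, threshold=0.1,
                                 ask_user=lambda first, second: second)
```

In the first call the gap of 0.7 exceeds the threshold, so `"b"` is selected automatically; in the second call the gap is too small, so the result is whatever the callback returns.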
Optionally, the matching rule comprises a plurality of sub-matching rules; the matching value determination module includes:
a matching initial value determining submodule, configured to determine, according to each sub-matching rule, a matching initial value of the first data record and any data record to be matched in the second database;
and the matching value determining submodule is used for determining the matching value corresponding to the data record to be matched in the second database according to the matching initial value corresponding to each sub-matching rule and the weight value of each sub-matching rule.
Optionally, fig. 4 is a schematic diagram of a data record processing apparatus provided in the embodiment of the present disclosure. As shown in fig. 4, the data record processing apparatus 300 according to the embodiment of the present disclosure further includes:
a storage module 304, configured to store the first data record, the second data record, and the matching relationship between the first data record and the second data record in a third database describing the object set.
Optionally, as shown in fig. 4, the data record processing apparatus 300 according to the embodiment of the present disclosure further includes:
an obtaining module 305, configured to obtain the first data record and/or the second data record from the third database when a data record obtaining request for the first object is detected.
Optionally, the preset matching rule includes:
a general matching rule, or a specific matching rule configured based on feature parameters of objects in the object set, or a combination of the general matching rule and the specific matching rule, wherein the general matching rule comprises: fuzzy matching rules, or equivalence matching rules, or a combination of both.
Optionally, the characteristic parameter of the object in the set of objects is a geographical location; the special matching rules include: longitude and latitude matching rules and/or administrative region level matching rules.
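One possible form of a longitude and latitude matching rule is a score that decreases with the great-circle distance between the two positions. The sketch below uses the haversine formula; the conversion helper, the linear falloff and the 50 km cutoff (`max_km`) are assumptions chosen for illustration and are not specified by the disclosure.

```python
import math


def dms_to_decimal(degrees, minutes, seconds):
    # Convert degrees/minutes/seconds to decimal degrees.
    return degrees + minutes / 60.0 + seconds / 3600.0


def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance in kilometres on a sphere of radius 6371 km.
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def lat_lon_score(p, q, max_km=50.0):
    # Assumed scoring: 1.0 at zero distance, falling linearly to 0.0 at max_km.
    distance = haversine_km(p[0], p[1], q[0], q[1])
    return max(0.0, 1.0 - distance / max_km)


chengdu = (dms_to_decimal(30, 34, 21.63), dms_to_decimal(104, 3, 44.20))
beijing = (dms_to_decimal(39, 54, 11.97), dms_to_decimal(116, 24, 3.52))
same_place = lat_lon_score(chengdu, chengdu)
far_apart = lat_lon_score(chengdu, beijing)
```

A graded score like this tolerates small coordinate differences between databases, whereas cities that are far apart, such as the two in the example, score zero.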
It should be noted that, regarding the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated herein.
Fig. 5 is a block diagram of an electronic device provided by an embodiment of the present disclosure. For example, the electronic device 100 may be provided as a data processing server. Referring to fig. 5, electronic device 100 includes a processor 1122, which can be one or more in number, and a memory 1132 for storing computer programs executable by processor 1122. The computer programs stored in memory 1132 may include one or more modules that each correspond to a set of instructions. Further, the processor 1122 may be configured to execute the computer program to perform the data recording processing method described above.
Additionally, the electronic device 100 may also include a power component 1126 and a communication component 1150. The power component 1126 may be configured to perform power management of the electronic device 100, and the communication component 1150 may be configured to enable communication, e.g., wired or wireless communication, of the electronic device 100. In addition, the electronic device 100 may also include input/output (I/O) interfaces 1158. The electronic device 100 may operate based on an operating system stored in the memory 1132, such as Windows Server™, Mac OS X™, Unix™, Linux™, and the like.
In another exemplary embodiment, there is also provided a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the data record processing method described above. For example, the computer readable storage medium may be the memory 1132 described above comprising program instructions that are executable by the processor 1122 of the electronic device 100 to perform the data recording processing method described above.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solution of the present disclosure within its technical concept, and all such simple modifications fall within the protection scope of the present disclosure. In addition, it should be noted that the various technical features described in the above embodiments may be combined in any suitable manner provided there is no contradiction. To avoid unnecessary repetition, the various possible combinations are not described further in the present disclosure.
Claims (10)
1. A data record processing method, characterized in that the method comprises:
obtaining a first database and a second database describing the same set of objects; the first database is one of a plurality of databases, and the second database is any one of the databases except the first database;
determining a matching value of a first data record in the first database and each data record to be matched in the second database according to a preset matching rule, wherein the first data record is used for describing a first object in the object set; the data records to be matched are data records which are not successfully matched;
determining a second data record used for describing the first object from the second database according to the matching value corresponding to each data record to be matched in the second database;
and storing the first data record, the second data record and the matching relation between the first data record and the second data record to a third database for describing the object set.
2. The method of claim 1, wherein determining a second data record from the second database that describes the first object comprises:
sorting the matching values corresponding to the data records to be matched in the second database;
determining a difference between the highest match value and the next highest match value;
and determining the data record with the highest corresponding matching value as the second data record under the condition that the difference value is larger than a preset threshold value.
3. The method of claim 2, further comprising:
under the condition that the difference value is not larger than the preset threshold value, outputting prompt information, wherein the prompt information is used for prompting a user to select one data record from the data record with the highest corresponding matching value and the data record with the next highest corresponding matching value;
determining a second data record from the second database describing the first object, comprising:
determining the user-selected data record as the second data record.
4. The method according to claim 1, wherein the preset matching rule comprises a plurality of sub-matching rules; and determining a matching value of a first data record in the first database and any data record to be matched in the second database according to the preset matching rule comprises:
determining a matching initial value of the first data record and any data record to be matched in the second database according to each sub-matching rule;
and determining a matching value corresponding to the data record to be matched in the second database according to the matching initial value corresponding to each sub-matching rule and the weight value of each sub-matching rule.
5. The method of claim 1, further comprising:
and when a data record acquisition request for the first object is detected, acquiring the first data record and/or the second data record from the third database.
6. The method of claim 1, wherein the preset matching rule comprises:
a general matching rule, or a specific matching rule configured based on feature parameters of objects in the object set, or a combination of the general matching rule and the specific matching rule, wherein the general matching rule comprises: fuzzy matching rules, or equivalence matching rules, or a combination of both.
7. The method of claim 6, wherein the characteristic parameter of the object in the set of objects is a geographic location; the special matching rules include: longitude and latitude matching rules and/or administrative region level matching rules.
8. A data record processing apparatus, characterized in that the apparatus comprises:
an obtaining module for obtaining a first database and a second database for describing the same object set; the first database is one of a plurality of databases, and the second database is any one of the databases except the first database;
a matching value determining module, configured to determine, according to a preset matching rule, a matching value between a first data record in the first database and each data record to be matched in the second database, where the first data record is used to describe a first object in the object set; the data records to be matched are data records which are not successfully matched;
the data record determining module is used for determining a second data record for describing the first object from the second database according to the matching value corresponding to each data record to be matched in the second database;
and the storage module is used for storing the first data record, the second data record and the matching relation between the first data record and the second data record into a third database for describing the object set.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which computer program instructions are stored, which program instructions, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810931008.8A CN109101634B (en) | 2018-08-15 | 2018-08-15 | Data recording processing method, device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109101634A CN109101634A (en) | 2018-12-28 |
CN109101634B CN109101634B (en) | 2021-06-11 |
Family
ID=64849999
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810931008.8A Active CN109101634B (en) | 2018-08-15 | 2018-08-15 | Data recording processing method, device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109101634B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5991758A (en) * | 1997-06-06 | 1999-11-23 | Madison Information Technologies, Inc. | System and method for indexing information about entities from different information sources |
CN107145574A (en) * | 2017-05-05 | 2017-09-08 | 恒生电子股份有限公司 | database data processing method, device and storage medium and electronic equipment |
CN107291951A (en) * | 2017-07-24 | 2017-10-24 | 北京都在哪智慧城市科技有限公司 | Data processing method, device, storage medium and processor |
Also Published As
Publication number | Publication date |
---|---|
CN109101634A (en) | 2018-12-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||