CN117271849A - Method, device and medium for searching quick address matching based on graph database - Google Patents


Publication number
CN117271849A
Authority
CN
China
Prior art keywords
address
matching
graph
points
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311317854.8A
Other languages
Chinese (zh)
Inventor
张晨
周研
蒋阔
吴菁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Create Link Technology Co ltd
Original Assignee
Zhejiang Create Link Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/909 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Library & Information Science (AREA)
  • Automation & Control Theory (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a quick address matching search method, a quick address matching search device and a storage medium based on a graph database. The method comprises the following steps: creating a graph model, wherein the graph model comprises a plurality of address points or unit name points, and a plurality of matching edges are formed among the address points, among the unit name points, or between the address points and the unit name points; acquiring an input address; and using a fuzzy matching algorithm to find the matching address of the input address from the graph model. The scheme provided by the embodiment of the invention exploits the graph database's ability to search multi-hop association relations efficiently. It combines a graph connectivity algorithm with the monotonically increasing internal ids of the graph database to quickly find the connected point with the smallest id, which is returned to the user as the initially matched address.

Description

Method, device and medium for searching quick address matching based on graph database
Technical Field
The invention relates to the technical field of computers, in particular to a method, a device and a medium for searching for quick address matching based on a graph database.
Background
In commercial applications, especially in banks' consumer loan and credit card application businesses, the addresses and unit names in application information must be matched in order to merge customer information, identify duplicate records, detect hits on blacklisted addresses/unit names, and so on. However, such matching is often difficult because of data quality, naming conventions, and user input habits. Using existing matching results to inform subsequent matching is more challenging still.
To illustrate finding an address/unit name that matches an input address/unit name, assume that address D finds a matching address B, and that address B has itself previously found a matching address A through the fuzzy matching service. There are then several options for the matching result returned to the user:
1. Finding the earliest matching result through continuous chained queries;
2. Finding the matching result with the highest matching score through continuous chained queries;
3. Returning a variety of alternative matching results.
In anti-fraud business, if an address/unit name is associated with the information of multiple applicants, the address/unit name itself is considered high-risk. The more points the address can be associated with, the greater the risk may be. If the result with the highest matching score is selected, then in the worst case each new match scores higher than the previous one, a new address/unit name is returned each time, and the purpose of association is not achieved. If the earliest matching address/unit name is selected, all information that matches this address/unit name becomes associated, and the association has a propagation effect that carries the risk further.
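The difference between the two return policies can be illustrated with a small sketch. The stream, the applicant labels and the function names below are hypothetical; the sketch only models the worst case described above, in which every new address matches the one that arrived just before it.

```python
from collections import defaultdict

# Hypothetical stream: (applicant, input address, address it fuzzily matched).
# Worst case from the text: every new address matches the one just before it.
STREAM = [
    ("applicant-1", "A1", "A"),
    ("applicant-2", "A2", "A1"),
    ("applicant-3", "A3", "A2"),
]


def returned_address(policy, matched, parent):
    """Pick the address handed back to the caller under a given policy."""
    if policy == "highest_score":
        # Worst case: the direct match always scores highest, so it is returned.
        return matched
    # "earliest": walk back through earlier matches to the first address.
    while matched in parent:
        matched = parent[matched]
    return matched


def group_applicants(policy):
    """Group applicants by the address each lookup would have returned."""
    parent = {}                 # address -> address it matched when it arrived
    groups = defaultdict(list)  # returned address -> applicants
    for applicant, address, matched in STREAM:
        groups[returned_address(policy, matched, parent)].append(applicant)
        parent[address] = matched
    return dict(groups)
```

Under the "earliest" policy all three applicants are keyed to address A, while under the "highest_score" policy each applicant is keyed to a different address and no association emerges.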
By extension, if address B finds an address A1 through fuzzy matching, and address A1 in turn finds a matching address A2 through fuzzy matching, it is easy to see that this may produce an address matching chain whose earliest link is An -> A, i.e. B -> A1 -> A2 -> A3 -> ... -> An -> A. In this case, a scheme is needed that effectively identifies the historical matches and returns A as the fuzzy matching address of B. Some existing schemes store historical matching pairs in a key/value database and then advance step by step through continuous chained queries until the original matching address/unit name is found.
That is, in a real-time fuzzy matching scenario, new addresses/unit names arrive continuously. When a new address/unit name B matches an address/unit name A1 in the history base, how to efficiently find the earliest matching address A of A1 is the problem to be solved.
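For comparison, the existing key/value approach mentioned above can be sketched as follows. A plain Python dict stands in for the key/value database, and the function name and hop counter are illustrative assumptions; the point is that each link of the chain costs one separate query.

```python
def earliest_match_kv(store, address):
    """Follow stored match pairs one lookup at a time until the chain ends.

    `store` maps an address to the address it matched when it was inserted.
    Returns the earliest address on the chain and the number of lookups used.
    """
    hops = 0
    seen = {address}
    while address in store:
        address = store[address]  # one round trip to the key/value store
        hops += 1
        if address in seen:
            raise ValueError("cycle in stored match pairs")
        seen.add(address)
    return address, hops


# Chain B -> A1 -> A2 -> A3 -> A with hypothetical labels.
CHAIN = {"B": "A1", "A1": "A2", "A2": "A3", "A3": "A"}
```

With this chain, `earliest_match_kv(CHAIN, "B")` needs four lookups to reach A, and the count grows linearly with the length of the chain.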
Disclosure of Invention
In view of the technical drawbacks mentioned in the background art, an object of an embodiment of the present invention is to provide a method, an apparatus and a storage medium for fast address matching search based on a graph database.
In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a method for searching for fast matching of addresses based on a graph database, including:
creating a graph model; the graph model comprises a plurality of address points or unit name points, and a plurality of matching edges are formed among the address points, among the unit name points or between the address points and the unit name points;
acquiring an input address;
and adopting a fuzzy matching algorithm to find the matching address of the input address from the graph model.
As a preferred implementation of the present application, after finding the matching address of the input address, the method further includes:
using a graph connectivity algorithm on the graph model to find the initial address of the input address.
Further, as a preferred implementation of the present application, the method further includes:
if the fuzzy matching algorithm finds no matching address of the input address in the graph model, determining the address obtained by matching the input address from the graph model as a first matching item;
inserting the input address and the first matching item into the graph model as new address points;
establishing matching edges between the input address and the new address point, and between the new address point and the initial address, respectively.
Further, as a preferred implementation of the present application, after establishing the matching edges between the input address and the new address point, and between the new address point and the initial address, respectively, the method further includes:
returning the matching address or the unmatched result to the user.
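A minimal in-memory sketch of such a graph model is given below. It is not tied to any particular graph database; the class name `MatchGraph`, its methods and the sample texts are assumptions made for illustration. Point ids are drawn from a counter so that they increase monotonically with insertion order, which is the property the scheme relies on.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class MatchGraph:
    """In-memory stand-in for the graph model (hypothetical, not a real graph database)."""
    points: dict = field(default_factory=dict)  # point id -> (kind, text)
    edges: dict = field(default_factory=dict)   # point id -> set of neighbour ids
    _ids: count = field(default_factory=count, repr=False)

    def add_point(self, kind, text):
        """Insert an address point or unit name point and return its id."""
        if kind not in ("address", "unit_name"):
            raise ValueError(f"unknown point kind: {kind}")
        point_id = next(self._ids)  # ids grow with insertion order
        self.points[point_id] = (kind, text)
        self.edges[point_id] = set()
        return point_id

    def add_matching_edge(self, a, b):
        """Record an undirected matching edge between two existing points."""
        self.edges[a].add(b)
        self.edges[b].add(a)


graph = MatchGraph()
first = graph.add_point("address", "12 West Lake Road, Hangzhou")
second = graph.add_point("unit_name", "West Lake Trading Co.")
graph.add_matching_edge(first, second)
```

Edges are stored in both directions because a matching relation is treated here as symmetric; a real deployment might instead keep directed edges and traverse them in both directions.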
In a second aspect, an embodiment of the present invention provides an address fast matching search apparatus based on a graph database, including:
a creation unit for creating a graph model; the graph model comprises a plurality of address points or unit name points, and a plurality of matching edges are formed among the address points, among the unit name points or between the address points and the unit name points;
an acquisition unit configured to acquire an input address;
and the processing unit is used for finding out the matching address of the input address from the graph model by adopting a fuzzy matching algorithm.
As a preferred implementation of the present application, the processing unit is further configured to:
use a graph connectivity algorithm on the graph model to find the initial address of the input address;
if the fuzzy matching algorithm finds no matching address of the input address in the graph model, determine the address obtained by matching the input address from the graph model as a first matching item;
insert the input address and the first matching item into the graph model as new address points;
establish matching edges between the input address and the new address point, and between the new address point and the initial address, respectively;
and return the matching address or the unmatched result to the user.
In a third aspect, an embodiment of the present invention further provides an address quick matching search apparatus based on a graph database, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is configured to store a computer program, where the computer program includes program instructions, and where the processor is configured to invoke the program instructions to perform the method steps described in the first aspect above.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium storing a computer program comprising program instructions which, when executed by a processor, implement the method steps as described in the first aspect above.
The address quick matching search scheme based on the graph database provided by the embodiment of the invention exploits the graph database's ability to search multi-hop association relations efficiently. It combines a graph connectivity algorithm with the monotonically increasing internal ids of the graph database to quickly find the connected point with the smallest id, which is returned to the user as the initially matched address.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flow chart of a method for fast address matching based on a graph database according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a method for fast address matching based on a graph database according to a second embodiment of the present invention;
FIG. 3 is a block diagram of an address quick matching device based on a graph database according to a first embodiment of the present invention;
FIG. 4 is another structural diagram of the device shown in FIG. 3.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should be noted that the implementation principle of the technical scheme is as follows:
The scheme exploits the graph database's ability to search multi-hop edge relations quickly. Every two matched addresses/unit names are connected by an edge. A graph connectivity algorithm, that is, an algorithm that starts from a starting node and returns all nodes reachable from it, is then used to find the previously matched addresses/unit names. Among these, the initially inserted matching address or unit name is identified by comparing ids (a smaller id indicates an earlier insertion time).
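The principle can be sketched with a breadth-first traversal over an adjacency map. The source does not name a specific connectivity algorithm, so breadth-first search is an assumption here, as are the function name and the sample ids.

```python
from collections import deque


def earliest_connected_point(edges, start):
    """Return the smallest id reachable from `start` over matching edges.

    `edges` maps a point id to the set of ids it shares a matching edge with.
    Because ids are assumed to grow with insertion time, the smallest
    reachable id is the point that was inserted first.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in edges.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return min(seen)


# Hypothetical ids: 0 was inserted first, 3 last; point 4 has no matches.
EDGES = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
```

Starting from point 3, the traversal reaches points 2, 1 and 0 and returns 0; starting from the isolated point 4, it returns 4 itself.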
Referring to FIG. 1, the method for searching for fast matching of addresses based on a graph database according to the first embodiment of the present invention may include the following steps:
S1, creating a graph model.
The graph model comprises a plurality of address points or unit name points, and a plurality of matching edges are formed among the address points, among the unit name points, or between the address points and the unit name points.
S2, acquiring an input address.
S3, using a fuzzy matching algorithm to find the matching address of the input address from the graph model, and using a graph connectivity algorithm on the graph model to find the initial address of the input address.
S4, if the fuzzy matching algorithm finds no matching address of the input address in the graph model, determining the address obtained by matching the input address from the graph model as a first matching item.
S5, inserting the input address and the first matching item into the graph model as new address points.
S6, establishing matching edges between the input address and the new address point, and between the new address point and the initial address, respectively.
S7, returning the matching address or the unmatched result to the user.
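Steps S1 to S7 can be put together in one small sketch. This is one possible reading of the flow, not the patented implementation: `difflib.SequenceMatcher` stands in for the unspecified fuzzy matching algorithm, the 0.8 threshold is arbitrary, and the class `AddressMatcher` with its separate history list is a hypothetical construction.

```python
from collections import deque
from difflib import SequenceMatcher

THRESHOLD = 0.8  # hypothetical similarity cut-off, not taken from the source


class AddressMatcher:
    def __init__(self):
        self.history = []  # every address seen so far: the fuzzy-matching base
        self.ids = {}      # S1: address text -> point id, assigned in insertion order
        self.text = []     # point id -> address text
        self.edges = []    # point id -> set of neighbouring point ids

    def _point(self, address):
        """Return the id of the point for `address`, inserting it if needed (S5)."""
        if address not in self.ids:
            self.ids[address] = len(self.text)
            self.text.append(address)
            self.edges.append(set())
        return self.ids[address]

    def _fuzzy_match(self, address):
        """S3: best-scoring historical address at or above THRESHOLD, else None."""
        best, best_score = None, THRESHOLD
        for candidate in self.history:
            score = SequenceMatcher(None, address, candidate).ratio()
            if score >= best_score:
                best, best_score = candidate, score
        return best

    def _initial(self, start):
        """S3: smallest id connected to `start`, i.e. the earliest inserted point."""
        seen, queue = {start}, deque([start])
        while queue:
            for neighbour in self.edges[queue.popleft()]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return min(seen)

    def lookup(self, address):
        """S2 to S7 for one input address; returns the initial address or None."""
        matched = self._fuzzy_match(address)
        self.history.append(address)
        if matched is None:
            return None                       # S7: unmatched result
        first = self._point(matched)          # S4/S5: matched item enters the graph first
        new = self._point(address)            # S5: then the input address
        if first != new:
            self.edges[first].add(new)        # S6: undirected matching edge
            self.edges[new].add(first)
        return self.text[self._initial(new)]  # S7: earliest connected address


matcher = AddressMatcher()
results = [
    matcher.lookup("12 West Lake Road, Hangzhou"),
    matcher.lookup("12 West Lake Rd, Hangzhou"),
    matcher.lookup("12 W Lake Rd, Hangzhou"),
    matcher.lookup("88 Binjiang Avenue, Ningbo"),
]
```

The matched item is inserted before the input address so that it receives the smaller id. The second and third lookups therefore both return the first address, even though the third input scores highest against the second one.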
Referring to FIG. 2, in another method embodiment, the method for searching for fast matching of addresses based on the graph database includes:
Step one, creating a graph model: address/unit name points are created and matching edges are created.
Step two, assuming that the fuzzy matching algorithm finds a matching address for the input address/unit name, a graph connectivity algorithm is applied to the matched address/unit name on the created matching graph to find the original address/unit name point. If none is found, the matched address/unit name is considered to have no earlier match, that is, the currently matched address is the first matching item, and both the current matching item and the input item need to be inserted into the graph.
Step three, writing the input address/unit name and the matching item into the matching graph (graph model) as points.
Step four, writing the matching edge established between the input address/unit name and the matched address/unit name into the graph (graph model).
Step five, returning the initial matching address to the user.
As the above description shows, the address fast matching search scheme based on the graph database provided by the embodiment of the invention exploits the graph database's ability to search multi-hop association relations efficiently. It combines a graph connectivity algorithm with the monotonically increasing internal ids of the graph database to quickly find the connected point with the smallest id, which is returned to the user as the initially matched address.
Based on the same inventive concept, an embodiment of the present invention provides a device for searching for fast address matching based on a graph database, as shown in fig. 3, the device includes:
a creation unit for creating a graph model; the graph model comprises a plurality of address points or unit name points, and a plurality of matching edges are formed among the address points, among the unit name points or between the address points and the unit name points;
an acquisition unit configured to acquire an input address;
and the processing unit is used for finding out the matching address of the input address from the graph model by adopting a fuzzy matching algorithm.
Further, the processing unit is further configured to:
use a graph connectivity algorithm on the graph model to find the initial address of the input address;
if the fuzzy matching algorithm finds no matching address of the input address in the graph model, determine the address obtained by matching the input address from the graph model as a first matching item;
insert the input address and the first matching item into the graph model as new address points;
establish matching edges between the input address and the new address point, and between the new address point and the initial address, respectively;
and return the matching address or the unmatched result to the user.
Alternatively, as another preferred embodiment of the present invention, as shown in fig. 4, the apparatus may include: one or more processors 101, one or more input devices 102, one or more output devices 103, and a memory 104, the processors 101, input devices 102, output devices 103, and memory 104 being interconnected by a bus 105. The memory 104 is used for storing a computer program comprising program instructions, the processor 101 being configured to invoke the program instructions to perform the method steps described in the method embodiments.
It should be appreciated that in embodiments of the present invention, the processor 101 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The input device 102 may include a keyboard or the like, and the output device 103 may include a display (LCD or the like), a speaker or the like.
The memory 104 may include read-only memory and random access memory, and provides instructions and data to the processor 101. A portion of the memory 104 may also include non-volatile random access memory. For example, the memory 104 may also store information about the device type.
In a specific implementation, the processor 101, the input device 102, and the output device 103 described in the embodiments of the present invention may execute the implementation described in the embodiments of the method for searching for fast matching addresses based on a graph database provided in the embodiments of the present invention, which is not described herein again.
It should be noted that, for a more specific workflow description of the electronic device, please refer to the foregoing method embodiment section, and the description is omitted here.
Furthermore, corresponding to the foregoing method embodiments and electronic devices, embodiments of the present invention provide a computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, implement the address quick matching search method based on a graph database.
The computer readable storage medium may be an internal storage unit of the electronic device according to any of the foregoing embodiments, for example, a hard disk or a memory of a system. The computer readable storage medium may also be an external storage device of the system, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the system. Further, the computer readable storage medium may also include both internal storage units and external storage devices of the system. The computer readable storage medium is used to store the computer program and other programs and data required by the system. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
While the invention has been described with reference to certain preferred embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (9)

1. A quick address matching search method based on a graph database, characterized by comprising the following steps:
creating a graph model; the graph model comprises a plurality of address points or unit name points, and a plurality of matching edges are formed among the address points, among the unit name points or between the address points and the unit name points;
acquiring an input address;
and adopting a fuzzy matching algorithm to find the matching address of the input address from the graph model.
2. The method of claim 1, wherein after finding a matching address for the input address, the method further comprises:
using a graph connectivity algorithm on the graph model to find the initial address of the input address.
3. The method of claim 2, wherein the method further comprises:
if the fuzzy matching algorithm finds no matching address of the input address in the graph model, determining the address obtained by matching the input address from the graph model as a first matching item;
inserting the input address and the first matching item into the graph model as new address points;
establishing matching edges between the input address and the new address point, and between the new address point and the initial address, respectively.
4. The method of claim 3, wherein after establishing matching edges between the input address and the new address point, and between the new address point and the initial address, respectively, the method further comprises:
returning the matching address or the unmatched result to the user.
5. An address quick match search device based on a graph database, comprising:
a creation unit for creating a graph model; the graph model comprises a plurality of address points or unit name points, and a plurality of matching edges are formed among the address points, among the unit name points or between the address points and the unit name points;
an acquisition unit configured to acquire an input address;
and the processing unit is used for finding out the matching address of the input address from the graph model by adopting a fuzzy matching algorithm.
6. The apparatus of claim 5, wherein the processing unit is further configured to:
use a graph connectivity algorithm on the graph model to find the initial address of the input address.
7. The apparatus of claim 6, wherein the processing unit is further configured to:
if the fuzzy matching algorithm finds no matching address of the input address in the graph model, determine the address obtained by matching the input address from the graph model as a first matching item;
insert the input address and the first matching item into the graph model as new address points;
establish matching edges between the input address and the new address point, and between the new address point and the initial address, respectively;
and return the matching address or the unmatched result to the user.
8. An address quick match search device based on a graph database, characterized in that the electronic device is configured to perform user rights management for the graph database, comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method steps according to any of the claims 1-4.
9. A computer readable storage medium storing a computer program comprising program instructions, characterized in that the program instructions when executed by a processor implement the method steps of any of claims 1-4.
CN202311317854.8A 2023-10-11 2023-10-11 Method, device and medium for searching quick address matching based on graph database Pending CN117271849A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311317854.8A CN117271849A (en) 2023-10-11 2023-10-11 Method, device and medium for searching quick address matching based on graph database


Publications (1)

Publication Number Publication Date
CN117271849A true CN117271849A (en) 2023-12-22

Family

ID=89212219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311317854.8A Pending CN117271849A (en) 2023-10-11 2023-10-11 Method, device and medium for searching quick address matching based on graph database

Country Status (1)

Country Link
CN (1) CN117271849A (en)

Similar Documents

Publication Publication Date Title
US11626972B2 (en) Data processing method and apparatus
EP3591510A1 (en) Method and device for writing service data in block chain system
CN104598439B (en) Method and device for correcting title of information object and method for pushing information object
US7865505B2 (en) Efficient exact set similarity joins
CN111737499B (en) Data searching method based on natural language processing and related equipment
US11501317B2 (en) Methods, apparatuses, and devices for generating digital document of title
CN109408522A (en) A kind of update method and device of user characteristic data
CN115048435B (en) Intelligent database storage method and system
CN108897729B (en) Transaction template sharing method and device, electronic equipment and storage medium
CN112150305A (en) Enterprise power user information verification method and system, computer equipment and medium
CN111310137B (en) Block chain associated data evidence storing method and device and electronic equipment
CN113129150A (en) Transaction data processing method and device, terminal device and readable storage medium
CN117195185A (en) User authority management method for graph database, electronic equipment and medium
CN113434582B (en) Service data processing method and device, computer equipment and storage medium
CN110489416B (en) Information storage method based on data processing and related equipment
CN109901991A (en) A kind of method, apparatus and electronic equipment for analyzing exception call
CN110992039B (en) Transaction processing method, device and equipment
CN110457332B (en) Information processing method and related equipment
CN115544214B (en) Event processing method, device and computer readable storage medium
CN117271849A (en) Method, device and medium for searching quick address matching based on graph database
CN110046180B (en) Method and device for locating similar examples and electronic equipment
CN112612817A (en) Data processing method and device, terminal equipment and computer readable storage medium
CN111597368A (en) Data processing method and device
CN114328755B (en) Data writing method, data reading device and electronic equipment
CN113486627B (en) Single number generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination