CN113779331A - Address alias identification method and device, electronic equipment and computer storage medium - Google Patents

Info

Publication number
CN113779331A
Authority
CN
China
Prior art keywords
address name
target
address
training sample
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111128210.5A
Other languages
Chinese (zh)
Other versions
CN113779331B (en)
Inventor
何天赋
陈国春
颜萍
王晟宇
袁野
李瑞远
鲍捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong City Beijing Digital Technology Co Ltd
Original Assignee
Jingdong City Beijing Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong City Beijing Digital Technology Co Ltd filed Critical Jingdong City Beijing Digital Technology Co Ltd
Priority to CN202111128210.5A priority Critical patent/CN113779331B/en
Publication of CN113779331A publication Critical patent/CN113779331A/en
Application granted granted Critical
Publication of CN113779331B publication Critical patent/CN113779331B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an address alias identification method, an address alias identification device, electronic equipment and a computer storage medium, wherein the method comprises the following steps: in a receiving address database, a first user with a receiving address name as a first address name and a second user with a receiving address name as a second address name are inquired and obtained; aiming at each target user, acquiring a report point of the target user in a report point database according to the identification of the target user; extracting a feature matrix of the target address name according to all report points corresponding to the target address name; inputting the characteristic matrix of the target address name into an analysis model to obtain a characteristic value of the target address name; determining an alias relationship score according to the characteristic value of the first address name and the characteristic value of the second address name; if the alias relationship score is greater than the threshold, determining that the first address name and the second address name have an alias relationship. Therefore, the aim of quickly and accurately identifying the address alias is fulfilled.

Description

Address alias identification method and device, electronic equipment and computer storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an address alias identification method and apparatus, an electronic device, and a computer storage medium.
Background
A geographical entity is often referred to by an alias in addition to its standard name. For example, the Beijing CITIC Tower, located in the Central Business District, is popularly known as "China Zun".
In the prior art, alias acquisition depends mainly on manual collection and entry. Collection methods include crowdsourcing, in which a city is divided into partitions and the alias collection task for each partition is assigned to people familiar with the local geographic information, and obtaining the registration information of buildings. Such manual collection and entry consumes considerable labor and time, and the aliases obtained lag seriously behind actual usage.
Disclosure of Invention
In view of the above, the present application provides an address alias identification method, an address alias identification device, an electronic device, and a computer storage medium, which can quickly and accurately identify an address alias.
The first aspect of the present application provides an address alias identification method, including:
acquiring a first address name and a second address name;
in a receiving address database, inquiring to obtain a first user with a receiving address name as the first address name and a second user with the receiving address name as the second address name;
for each target user, acquiring report points of the target user from a report point database according to the identifier of the target user, and associating the report points of the target user with the target address name corresponding to the target user; a report point is a positioning address and a timestamp obtained when the user performs an operation in an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
extracting a feature matrix of the target address name according to all report points corresponding to the target address name;
inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
determining an alias relationship score according to the characteristic value of the first address name and the characteristic value of the second address name;
if the alias relationship score is greater than a threshold value, determining that the first address name and the second address name have an alias relationship.
Optionally, the method for constructing the analysis model includes:
constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
extracting a feature matrix of the address name of the target training sample according to all report points corresponding to the address name of the target training sample; wherein the target training sample address name is a first training sample address name or a second training sample address name;
inputting the feature matrix of the target training sample address name into a deep convolutional neural network to obtain a feature value of the target training sample address name;
obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relationship according to the characteristic value of the first address name and the characteristic value of the second address name;
and continuously adjusting parameters in the deep convolutional neural network by using the error between the prediction result and the real alias corresponding to the address name of the training sample until the error between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the address name of the training sample meets a preset convergence condition, and determining the adjusted deep convolutional neural network as an analysis model.
Optionally, the extracting the feature matrix of the target address name according to all the report points corresponding to the target address name includes:
equally dividing the city map to which the target address name belongs into an N × N grid of cells; wherein N is a positive integer;
counting the number of report points corresponding to the target address name that fall within each grid cell;
and normalizing the count in each grid cell to obtain a feature matrix of the target address name.
Optionally, the determining an alias relationship score according to the feature value of the first address name and the feature value of the second address name includes:
calculating a similarity value between the characteristic value of the first address name and the characteristic value of the second address name;
and calculating an alias relationship score between the first address name and the second address name by using the similarity value.
Optionally, before, for each target user, obtaining a report point of the target user in a report point database according to the identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user, the method further includes:
encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user;
wherein, for each target user, obtaining a report point of the target user in a report point database according to the identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user, includes:
and aiming at each target user, acquiring a report point of the target user in a report point database according to the encrypted identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user.
A second aspect of the present application provides an address alias identification apparatus, including:
an acquisition unit configured to acquire a first address name and a second address name;
the query unit is used for querying a first user with a receiving address name as the first address name and a second user with the receiving address name as the second address name in a receiving address database;
the association unit is used for acquiring, for each target user, report points of the target user from a report point database according to the identifier of the target user, and associating the report points of the target user with the target address name corresponding to the target user; a report point is a positioning address and a timestamp obtained when the user performs an operation in an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
the first extraction unit is used for extracting a feature matrix of the target address name according to all report points corresponding to the target address name;
the analysis unit is used for inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
a first determining unit, configured to determine an alias relationship score according to a feature value of the first address name and a feature value of the second address name;
a second determining unit, configured to determine that an alias relationship exists between the first address name and the second address name if the alias relationship score is greater than a threshold value.
Optionally, the construction unit of the analysis model includes:
the training sample construction unit is used for constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
the second extraction unit is used for extracting a feature matrix of the address name of the target training sample according to all report points corresponding to the address name of the target training sample; wherein the target training sample address name is a first training sample address name or a second training sample address name;
the input unit is used for inputting the feature matrix of the target training sample address name into a deep convolutional neural network to obtain a feature value of the target training sample address name;
the prediction unit is used for obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relationship according to the characteristic value of the first address name and the characteristic value of the second address name;
and the adjusting unit is used for continuously adjusting the parameters in the deep convolutional neural network by using the error between the prediction result and the real alias corresponding to the training sample address name until the error between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the training sample address name meets a preset convergence condition, and determining the adjusted deep convolutional neural network as an analysis model.
Optionally, the first extracting unit includes:
the equal dividing unit is used for equally dividing the city map to which the target address name belongs into an N × N grid of cells; wherein N is a positive integer;
the counting unit is used for counting the number of report points corresponding to the target address name that fall within each grid cell;
and the first extraction subunit is used for normalizing the count in each grid cell to obtain a feature matrix of the target address name.
Optionally, the first determining unit includes:
the first calculation unit is used for calculating a similarity value between the characteristic value of the first address name and the characteristic value of the second address name;
and the second calculating unit is used for calculating an alias relationship score between the first address name and the second address name by using the similarity value.
Optionally, the apparatus for identifying an address alias further includes:
the encryption unit is used for encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user;
wherein the association unit is configured to:
and aiming at each target user, acquiring a report point of the target user in a report point database according to the encrypted identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user.
A third aspect of the present application provides an electronic device comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of identifying address aliases of any of the first aspects.
A fourth aspect of the present application provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for identifying address aliases according to any one of the first aspects.
As can be seen from the above aspects, the present application provides an address alias identification method, an address alias identification device, an electronic device, and a computer storage medium, where the address alias identification method includes: firstly, acquiring a first address name and a second address name; then, in a receiving address database, a first user with a receiving address name as the first address name and a second user with a receiving address name as the second address name are obtained through inquiry; then aiming at each target user, acquiring a report point of the target user in a report point database according to the identification of the target user, and associating the report point of the target user with a target address name corresponding to the target user; the report point is a positioning address and a timestamp which are obtained when the user uses an application program to operate; then, extracting and obtaining a feature matrix of the target address name according to all report points corresponding to the target address name; inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; finally, determining an alias relationship score according to the characteristic value of the first address name and the characteristic value of the second address name; if the alias relationship score is greater than a threshold value, determining that the first address name and the second address name have an alias relationship. Therefore, the aim of quickly and accurately identifying the address alias is fulfilled.
Drawings
In order to more clearly illustrate the technical solutions in the present embodiment or the prior art, the drawings needed to be used in the description of the embodiment or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only the embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a specific flowchart of an address alias identification method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a distribution of report points according to another embodiment of the present application;
Fig. 3 is a flowchart of a method for extracting a feature matrix of a target address name according to another embodiment of the present application;
Fig. 4 is a schematic diagram of extracting a 3 × 3 feature matrix for address name A according to another embodiment of the present application;
Fig. 5 is a flowchart of a method for constructing an analysis model according to another embodiment of the present application;
Fig. 6 is a flowchart of a method for determining an alias relationship score according to another embodiment of the present application;
Fig. 7 is a schematic diagram of an address alias identification apparatus according to another embodiment of the present application;
Fig. 8 is a schematic diagram of an electronic device implementing an address alias identification method according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments will be described clearly and completely with reference to the drawings in the embodiments, and it is obvious that the described embodiments are only a part of the embodiments, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the scope of protection.
It should be noted that the terms "first", "second", and the like, referred to in this application, are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence of functions performed by these devices, modules or units, but the terms "include", or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or includes elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiment of the application provides an address alias identification method, as shown in fig. 1, specifically including the following steps:
S101, acquiring a first address name and a second address name.
It should be noted that the first address name and the second address name in this application may be two manually specified address names, or two address names randomly selected in the system, which is not limited herein.
S102, inquiring a first user with a receiving address name of a first address name and a second user with a receiving address name of a second address name in a receiving address database.
Specifically, querying the receiving address database for the first user whose receiving address name is the first address name and the second user whose receiving address name is the second address name may be implemented by, but is not limited to, matching address text strings.
For example, the users who filled in address name A as their receiving address are user 1, user 2, and user 3, so the application can generate the user list: address name A → {user 1, user 2, user 3}. Similarly, the users who filled in address name D as their receiving address are user 3 and user 4, so the application can generate the user list: address name D → {user 3, user 4}.
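The lookup in S102 can be sketched as follows. This is a minimal illustration, assuming that the receiving address database is a list of (user identifier, receiving address name) records and that matching is exact string equality after trimming whitespace. The record layout and the function name `users_by_address` are hypothetical and are not taken from the application.

```python
def users_by_address(records, address_names):
    """Map each queried address name to the set of users who used it as a receiving address."""
    wanted = {name.strip() for name in address_names}
    result = {name: set() for name in wanted}
    for user_id, address_name in records:
        key = address_name.strip()
        if key in wanted:
            result[key].add(user_id)
    return result


# Hypothetical records mirroring the example in the text.
records = [
    ("user 1", "A"), ("user 2", "A"), ("user 3", "A"),
    ("user 3", "D"), ("user 4", "D"), ("user 5", "E"),
]
user_lists = users_by_address(records, ["A", "D"])
```

A production system would more likely push this matching into the database query itself; the in-memory loop only makes the address name → user list mapping explicit.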
S103, aiming at each target user, acquiring a report point of the target user in a report point database according to the identification of the target user, and associating the report point of the target user with a target address name corresponding to the target user.
A report point is a positioning address and a timestamp obtained when a user performs an operation in an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name.
It should be noted that the operations that trigger a report point may be specified in advance; a report point need not be acquired on every operation, which is not limited herein.
Continuing with the above example, the following associations can be obtained: address name A → {user 1, user 2, user 3} → {report points of user 1, report points of user 2, report points of user 3}; address name D → {user 3, user 4} → {report points of user 3, report points of user 4}. For a specific distribution of report points, see Fig. 2.
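The association in S103 can be sketched as a join between the user lists and the report point database. The sketch assumes the report point database is a dictionary keyed by user identifier and that each report point is a (latitude, longitude, Unix timestamp) tuple; this layout, the coordinates, and the function name `report_points_for_address` are illustrative assumptions only.

```python
def report_points_for_address(user_lists, report_point_db):
    """Collect the report points of every user associated with each address name."""
    associated = {}
    for address_name, users in user_lists.items():
        points = []
        for user_id in sorted(users):
            # Users with no recorded report points contribute nothing.
            points.extend(report_point_db.get(user_id, []))
        associated[address_name] = points
    return associated


# Invented sample data: (latitude, longitude, unix_timestamp).
report_point_db = {
    "user 1": [(39.910, 116.460, 1630000000)],
    "user 3": [(39.911, 116.461, 1630000100), (39.912, 116.459, 1630000200)],
    "user 4": [(39.909, 116.462, 1630000300)],
}
user_lists = {"A": {"user 1", "user 2", "user 3"}, "D": {"user 3", "user 4"}}
points_by_address = report_points_for_address(user_lists, report_point_db)
```

Note that user 3 appears under both address names, so that user's report points are counted for both A and D.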
Optionally, in another embodiment of the present application, to protect user privacy, an implementation of the address alias identification method further includes, before performing step S103:
Encrypting the identifier of each target user by using a preset encryption mode to obtain the encrypted identifier of the target user.
It should be noted that the preset encryption mode may be, but is not limited to, the MD5 algorithm.
As a result, the report points handled during processing are associated only with anonymized users, making it difficult to recover the detailed location information of any specific user.
One implementation manner of step S103 includes:
For each target user, acquiring the report points of the target user from the report point database according to the encrypted identifier of the target user, and associating the report points of the target user with the target address name corresponding to the target user.
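A minimal sketch of the identifier step, assuming the MD5 option mentioned above and assuming that identifiers are strings encoded as UTF-8. The function name `anonymize_user_id` is hypothetical. Because MD5 is a one-way digest, the lookup only works if the report point database is keyed by the same digest.

```python
import hashlib


def anonymize_user_id(user_id: str) -> str:
    """Return the MD5 hex digest of a user identifier (UTF-8 encoded)."""
    return hashlib.md5(user_id.encode("utf-8")).hexdigest()


# Both databases would be keyed by the digest rather than the raw identifier.
key = anonymize_user_id("user 1")
```

The same identifier always maps to the same 32-character key, so records from the receiving address database and the report point database can still be joined without exposing the raw identifier.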
S104, extracting a feature matrix of the target address name according to all report points corresponding to the target address name.
Optionally, in another embodiment of the present application, an implementation manner of step S104, as shown in fig. 3, includes:
S301, equally dividing the city map to which the target address name belongs into an N × N grid of cells.
Wherein N is a positive integer.
S302, counting the number of report points corresponding to the target address name that fall within each grid cell.
S303, normalizing the count in each grid cell to obtain the feature matrix of the target address name.
Specifically, Fig. 4 shows a simple schematic diagram of extracting a 3 × 3 feature matrix for address name A.
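Steps S301 to S303 can be sketched as below. The application does not state which normalization is used, so this sketch assumes division by the total number of points, which makes the matrix sum to 1. The bounding-box representation of the city map and the function name `feature_matrix` are likewise assumptions.

```python
def feature_matrix(points, bounds, n):
    """Count report points per cell of an n x n grid over `bounds`, then normalize."""
    min_lat, max_lat, min_lon, max_lon = bounds
    counts = [[0] * n for _ in range(n)]
    for lat, lon in points:
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue  # ignore points outside the city map
        # Clamp so that points exactly on the upper edge fall in the last cell.
        row = min(int((lat - min_lat) / (max_lat - min_lat) * n), n - 1)
        col = min(int((lon - min_lon) / (max_lon - min_lon) * n), n - 1)
        counts[row][col] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * n for _ in range(n)]
    # Assumed normalization: each cell holds its share of all counted points.
    return [[count / total for count in row] for row in counts]


# Toy 3 x 3 example on an invented coordinate range.
points = [(0.5, 0.5), (0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]
matrix = feature_matrix(points, bounds=(0.0, 3.0, 0.0, 3.0), n=3)
```

Other normalizations, such as dividing by the maximum cell count, would fit the description equally well; the choice affects only the scale of the values fed to the analysis model.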
S105, inputting the feature matrix of the target address name into the analysis model to obtain the feature value of the target address name.
The analysis model is obtained by training the deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names. The deep convolutional neural network is formed by superposing a plurality of convolutional layers and pooling layers.
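To make the convolution-and-pooling structure concrete, the following plain-Python sketch applies a single convolution stage (with ReLU) and a single max-pooling stage to a feature matrix and flattens the result into a vector. It is only an illustration: the kernels are fixed, invented values, whereas the analysis model would stack several such stages with learned weights, and the function names are hypothetical.

```python
def conv2d(matrix, kernel):
    """Valid 2-D cross-correlation of a square matrix with a square kernel, followed by ReLU."""
    k = len(kernel)
    size = len(matrix) - k + 1
    out = []
    for i in range(size):
        row = []
        for j in range(size):
            acc = sum(matrix[i + a][j + b] * kernel[a][b]
                      for a in range(k) for b in range(k))
            row.append(max(acc, 0.0))
        out.append(row)
    return out


def max_pool(matrix, window=2):
    """Non-overlapping max pooling; rows and columns that do not fill a window are dropped."""
    size = len(matrix) // window
    return [[max(matrix[i * window + a][j * window + b]
                 for a in range(window) for b in range(window))
             for j in range(size)]
            for i in range(size)]


def embed(matrix, kernels):
    """Apply one conv + pool stage per kernel and flatten the results into one vector."""
    vector = []
    for kernel in kernels:
        pooled = max_pool(conv2d(matrix, kernel))
        vector.extend(value for row in pooled for value in row)
    return vector


# Invented kernels and a toy 3 x 3 feature matrix.
kernels = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
grid = [
    [0.5, 0.0, 0.0],
    [0.0, 0.25, 0.0],
    [0.0, 0.0, 0.25],
]
feature_value = embed(grid, kernels)
```

In practice a deep learning framework would be used for this step; the sketch only shows how a spatial count matrix is reduced to a compact feature value.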
Optionally, in another embodiment of the present application, an implementation manner of the method for constructing an analysis model, as shown in fig. 5, includes:
S501, constructing a training sample set.
The training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names.
Note that label data may be added to each training sample address name to indicate what the alias of the training sample address name is.
S502, extracting a feature matrix of the address name of the target training sample according to all report points corresponding to the address name of the target training sample.
Wherein the target training sample address name is a first training sample address name or a second training sample address name.
It should be noted that, for a specific implementation manner of step S502, refer to step S104, which is not described herein again.
S503, inputting the feature matrix of the target training sample address name into a deep convolutional neural network to obtain a feature value of the target training sample address name.
It should be noted that, for a specific implementation manner of step S503, refer to step S105, which is not described herein again.
S504, obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relation according to the characteristic value of the first address name and the characteristic value of the second address name.
The prediction result is either that the first training sample address name and the second training sample address name have an alias relationship, or that they do not.
S505, judging whether the error between the prediction result and the real alias corresponding to the address name of the training sample meets a preset convergence condition.
Specifically, if it is determined that the error between the prediction result and the real alias corresponding to the address name of the training sample does not satisfy the preset convergence condition, step S506 is executed; if the error between the prediction result and the real alias corresponding to the address name of the training sample is determined to satisfy the preset convergence condition, step S507 is executed.
S506, continuously adjusting parameters in the deep convolutional neural network.
S507, determining the deep convolutional neural network as an analysis model.
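The control flow of S504 to S507 (predict, check the convergence condition, adjust parameters, repeat) can be illustrated with a deliberately tiny stand-in model. The sketch below does not train a convolutional network: it assumes each training pair has already been reduced to a single similarity number and fits a two-parameter logistic model by gradient descent. The loss function, learning rate, tolerance, and data are all invented for illustration.

```python
import math


def train(pairs, lr=0.5, tolerance=0.2, max_steps=10_000):
    """Adjust parameters until the mean error meets the convergence condition."""
    w, b = 0.0, 0.0
    loss = float("inf")
    for _ in range(max_steps):
        grad_w = grad_b = loss = 0.0
        for similarity, label in pairs:
            # S504: predict the probability that the pair is an alias pair.
            p = 1.0 / (1.0 + math.exp(-(w * similarity + b)))
            loss += -(label * math.log(p) + (1 - label) * math.log(1 - p))
            grad_w += (p - label) * similarity
            grad_b += p - label
        loss /= len(pairs)
        if loss < tolerance:               # S505: convergence condition met
            break                          # S507: keep the current parameters
        w -= lr * grad_w / len(pairs)      # S506: adjust the parameters
        b -= lr * grad_b / len(pairs)
    return w, b, loss


# Invented training pairs: (similarity of the two feature values, 1 if real alias else 0).
pairs = [(0.9, 1), (0.8, 1), (0.2, 0), (0.1, 0)]
w, b, final_loss = train(pairs)
```

With a real network the same loop would backpropagate the error through the convolutional layers instead of updating two scalars.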
S106, determining an alias relationship score according to the characteristic value of the first address name and the characteristic value of the second address name.
Optionally, in another embodiment of the present application, an implementation manner of step S106, as shown in fig. 6, includes:
S601, calculating a similarity value between the characteristic value of the first address name and the characteristic value of the second address name.
Specifically, the following calculation formula may be used to calculate the similarity value between the characteristic value of the first address name and the characteristic value of the second address name:
Figure BDA0003279578500000101
wherein δ represents the similarity value between the characteristic value of the first address name and the characteristic value of the second address name; V represents the characteristic value of the first address name; V' represents the characteristic value of the second address name.
S602, calculating an alias relationship score between the first address name and the second address name by using the similarity value.
Specifically, the alias relationship score between the first address name and the second address name can be calculated by using the following calculation formula:
Figure BDA0003279578500000111
where s represents an alias relationship score between the first address name and the second address name.
S107, if the alias relationship score is larger than the threshold value, determining that the first address name and the second address name have the alias relationship.
The threshold is a value set by a technician or the like, and can be set and changed according to subsequent experimental results, application conditions and the like, which is not limited herein.
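Steps S601, S602 and S107 can be illustrated with a short sketch. Because the two formula images are not reproduced in this text, the functions below are assumptions rather than the formulas of the application: cosine similarity stands in for δ, a linear rescaling of δ to the range [0, 1] stands in for s, and the default threshold of 0.9 is an arbitrary example value.

```python
import math

def cosine_similarity(v, v_prime):
    """Assumed stand-in for delta: cosine similarity of two feature vectors."""
    if len(v) != len(v_prime):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(v, v_prime))
    norm = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(b * b for b in v_prime))
    if norm == 0.0:
        return 0.0  # no usable signal if either vector is all zeros
    return dot / norm

def alias_score(v, v_prime):
    """Assumed stand-in for s: rescale delta from [-1, 1] to [0, 1]."""
    return (1.0 + cosine_similarity(v, v_prime)) / 2.0

def has_alias_relationship(v, v_prime, threshold=0.9):
    """S107: an alias relationship is declared only if the score exceeds the threshold."""
    return alias_score(v, v_prime) > threshold
```

Whatever formulas are actually used, the decision in S107 is a strict comparison, so a score exactly equal to the threshold does not count as an alias relationship.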
According to the scheme, the application provides an address alias identification method, which comprises the following steps: firstly, acquiring a first address name and a second address name; then, in a receiving address database, a first user with a receiving address name as a first address name and a second user with a receiving address name as a second address name are inquired; then aiming at each target user, acquiring a report point of the target user in a report point database according to the identification of the target user, and associating the report point of the target user with a target address name corresponding to the target user; the report point is a positioning address and a time stamp which are obtained when a user uses an application program to operate; then, extracting a feature matrix of the target address name according to all report points corresponding to the target address name; inputting the characteristic matrix of the target address name into an analysis model to obtain a characteristic value of the target address name; finally, determining an alias relationship score according to the characteristic value of the first address name and the characteristic value of the second address name; if the alias relationship score is greater than the threshold, determining that the first address name and the second address name have an alias relationship. Therefore, the aim of quickly and accurately identifying the address alias is fulfilled.
Another embodiment of the present application provides an address alias identification device, as shown in fig. 7, which specifically includes:
An obtaining unit 701, configured to obtain a first address name and a second address name.
The query unit 702 is configured to query, in the receiving address database, a first user whose receiving address name is a first address name and a second user whose receiving address name is a second address name.
The associating unit 703 is configured to, for each target user, obtain a report point of the target user in a report point database according to the identifier of the target user, and associate the report point of the target user with a target address name corresponding to the target user.
The report point is a positioning address and a timestamp obtained when a user operates an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name.
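The association performed by unit 703 amounts to a join between the users found for each address name and the report points stored for each user. A minimal sketch follows, assuming in-memory dictionaries in place of the two databases and a `(latitude, longitude, timestamp)` tuple for each report point; these structures and names are hypothetical.

```python
def associate_report_points(users_by_address, report_points_by_user):
    """Attach each user's report points to the address name the user ships to.

    users_by_address: {address_name: [user_id, ...]}, the result of the
        receiving address query (hypothetical structure).
    report_points_by_user: {user_id: [(lat, lon, timestamp), ...]}, a
        stand-in for the report point database.
    Returns {address_name: [(lat, lon, timestamp), ...]}.
    """
    points_by_address = {}
    for address_name, user_ids in users_by_address.items():
        points = []
        for user_id in user_ids:
            # Users without any report points simply contribute nothing.
            points.extend(report_points_by_user.get(user_id, []))
        points_by_address[address_name] = points
    return points_by_address
```

For example, two users who both list "Building A" as their receiving address would have their report points pooled under that single address name.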
The first extracting unit 704 is configured to extract a feature matrix of the target address name according to all report points corresponding to the target address name.
Optionally, in another embodiment of the present application, an implementation manner of the first extraction unit 704 includes:
An equal dividing unit, configured to equally divide the city map to which the target address name belongs into N × N grid cells.
Wherein N is a positive integer.
A counting unit, configured to count, in each grid cell, the number of report points corresponding to the target address name.
A first extraction subunit, configured to normalize the number in each grid cell to obtain the feature matrix of the target address name.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 3, which is not described herein again.
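The grid-based feature extraction described for the first extraction unit 704 can be sketched as below. The application does not state which normalization is used, so dividing by the largest cell count is an assumption here, as are the rectangular `bounds` representation of the city map and the function name.

```python
def feature_matrix(points, bounds, n):
    """Build an n x n normalized count matrix from report point positions.

    points: iterable of (lat, lon) positions of the report points.
    bounds: (min_lat, min_lon, max_lat, max_lon) of the city map (assumed
        to be an axis-aligned rectangle).
    n: number of grid cells per side, a positive integer.
    """
    min_lat, min_lon, max_lat, max_lon = bounds
    counts = [[0] * n for _ in range(n)]
    for lat, lon in points:
        # Ignore report points that fall outside the city map.
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue
        # Points on the upper boundary are clamped into the last cell.
        row = min(int((lat - min_lat) / (max_lat - min_lat) * n), n - 1)
        col = min(int((lon - min_lon) / (max_lon - min_lon) * n), n - 1)
        counts[row][col] += 1
    peak = max(max(row_counts) for row_counts in counts)
    if peak == 0:
        return [[0.0] * n for _ in range(n)]
    # Assumed normalization: scale every count by the largest cell count.
    return [[c / peak for c in row_counts] for row_counts in counts]
```

With this scaling, two address names whose users cluster in the same cells yield similar matrices even when one address has many more report points than the other.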
The analyzing unit 705 is configured to input the feature matrix of the target address name into the analysis model, so as to obtain a feature value of the target address name.
The analysis model is obtained by training the deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names.
Optionally, in another embodiment of the present application, an implementation manner of the construction unit of the analysis model includes:
and the training sample construction unit is used for constructing a training sample set.
The training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names.
And the second extraction unit is used for extracting a feature matrix of the address name of the target training sample according to all report points corresponding to the address name of the target training sample.
Wherein the target training sample address name is a first training sample address name or a second training sample address name.
And the input unit is used for inputting the characteristic matrix of the target training sample address name into the deep convolutional neural network to obtain the characteristic value of the target training sample address name.
And the prediction unit is used for obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relationship according to the characteristic value of the first training sample address name and the characteristic value of the second training sample address name.
And the adjusting unit is used for continuously adjusting the parameters in the deep convolutional neural network by using the error between the prediction result and the real alias corresponding to the address name of the training sample until the error between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the address name of the training sample meets a preset convergence condition, and determining the adjusted deep convolutional neural network as an analysis model.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 5, which is not described herein again.
A first determining unit 706, configured to determine an alias relationship score according to the feature value of the first address name and the feature value of the second address name.
Optionally, in another embodiment of the present application, an implementation manner of the first determining unit 706 includes:
The first calculating unit is used for calculating a similarity value between the characteristic value of the first address name and the characteristic value of the second address name;
and the second calculating unit is used for calculating an alias relationship score between the first address name and the second address name by using the similarity value.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 6, which is not described herein again.
A second determining unit 707 configured to determine that the first address name and the second address name have an alias relationship if the alias relationship score is greater than a threshold value.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 1, which is not described herein again.
Optionally, in another embodiment of the present application, an implementation manner of the apparatus for identifying an address alias further includes:
and the encryption unit is used for encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user.
Wherein, the associating unit 703 is configured to:
and aiming at each target user, acquiring a report point of the target user in a report point database according to the encrypted identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user.
For specific working processes of the units disclosed in the above embodiments of the present application, reference may be made to the contents of the corresponding method embodiments, which are not described herein again.
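The application leaves the "preset encryption mode" open. One possible reading, shown purely as an assumption, is a keyed one-way hash (HMAC-SHA256) applied to the user identifier, so that the report point database is keyed by the protected identifier and never sees the plain one. The function names and the dictionary standing in for the database are hypothetical.

```python
import hashlib
import hmac

def encrypt_user_id(user_id: str, secret_key: bytes) -> str:
    """Derive a stable protected identifier from a plain user identifier.

    HMAC-SHA256 is a keyed one-way hash, used here as an assumed example of
    the preset encryption mode; it is deterministic for a given key.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

def lookup_report_points(user_id: str, secret_key: bytes, report_point_db: dict) -> list:
    """Fetch report points from a database keyed by encrypted identifiers."""
    return report_point_db.get(encrypt_user_id(user_id, secret_key), [])
```

Because the mapping is deterministic, the same user always resolves to the same database key, which is what allows the association unit 703 to join on the encrypted identifier.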
As can be seen from the above, the present application provides an address alias identification device: first, an obtaining unit 701 obtains a first address name and a second address name; then, the query unit 702 queries, in the receiving address database, a first user whose receiving address name is a first address name and a second user whose receiving address name is a second address name; the association unit 703 acquires, for each target user, a report point of the target user in a report point database according to the identifier of the target user, and associates the report point of the target user with a target address name corresponding to the target user; the report point is a positioning address and a time stamp which are obtained when a user uses an application program to operate; then, the first extraction unit 704 extracts a feature matrix of the target address name according to all report points corresponding to the target address name; the analysis unit 705 inputs the feature matrix of the target address name into the analysis model to obtain a feature value of the target address name; finally, the first determining unit 706 determines an alias relationship score according to the feature value of the first address name and the feature value of the second address name; if the alias relationship score is greater than the threshold value, the second determination unit 707 determines that the first address name and the second address name have an alias relationship. Therefore, the aim of quickly and accurately identifying the address alias is fulfilled.
Another embodiment of the present application provides an electronic device, as shown in fig. 8, including:
one or more processors 801.
A storage device 802 on which one or more programs are stored.
The one or more programs, when executed by the one or more processors 801, cause the one or more processors 801 to implement a method of identifying address aliases as described in any of the above embodiments.
Another embodiment of the present application provides a computer storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method for identifying address aliases according to any one of the above embodiments.
In the above embodiments disclosed in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present disclosure may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part. The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a live broadcast device, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
Those skilled in the art can make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An address alias identification method, comprising:
acquiring a first address name and a second address name;
in a receiving address database, inquiring to obtain a first user with a receiving address name as the first address name and a second user with the receiving address name as the second address name;
for each target user, acquiring a report point of the target user from a report point database according to the identification of the target user, and associating the report point of the target user with a target address name corresponding to the target user; the report point is a positioning address and a timestamp obtained when the user operates an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
extracting a feature matrix of the target address name according to all report points corresponding to the target address name;
inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
determining an alias relationship score according to the characteristic value of the first address name and the characteristic value of the second address name;
if the alias relationship score is greater than a threshold value, determining that the first address name and the second address name have an alias relationship.
2. The identification method according to claim 1, wherein the analytical model is constructed by a method comprising:
constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
extracting a feature matrix of the address name of the target training sample according to all report points corresponding to the address name of the target training sample; wherein the target training sample address name is a first training sample address name or a second training sample address name;
inputting the feature matrix of the target training sample address name into a deep convolutional neural network to obtain a feature value of the target training sample address name;
obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relationship according to the characteristic value of the first training sample address name and the characteristic value of the second training sample address name;
and continuously adjusting parameters in the deep convolutional neural network by using the error between the prediction result and the real alias corresponding to the address name of the training sample until the error between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the address name of the training sample meets a preset convergence condition, and determining the adjusted deep convolutional neural network as an analysis model.
3. The identification method according to claim 1, wherein the extracting a feature matrix of the target address name according to all report points corresponding to the target address name comprises:
equally dividing the city map to which the target address name belongs into small grids of N x N; wherein N is a positive integer;
counting the number of all report points corresponding to the target address name in each small piece grid;
and normalizing the number in each small grid to obtain a feature matrix of the target address name.
4. The identification method according to claim 1, wherein the determining an alias relationship score according to the characteristic value of the first address name and the characteristic value of the second address name comprises:
calculating a similarity value between the characteristic value of the first address name and the characteristic value of the second address name;
and calculating an alias relationship score between the first address name and the second address name by using the similarity value.
5. The identification method according to claim 1, wherein before, for each target user, obtaining a report point of the target user in a report point database according to the identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user, the method further comprises:
encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user;
wherein, for each target user, obtaining a report point of the target user in a report point database according to the identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user, includes:
and aiming at each target user, acquiring a report point of the target user in a report point database according to the encrypted identifier of the target user, and associating the report point of the target user with a target address name corresponding to the target user.
6. An address alias identification apparatus, comprising:
an acquisition unit configured to acquire a first address name and a second address name;
the query unit is used for querying a first user with a receiving address name as the first address name and a second user with the receiving address name as the second address name in a receiving address database;
the association unit is used for acquiring, for each target user, a report point of the target user from a report point database according to the identification of the target user, and associating the report point of the target user with a target address name corresponding to the target user; the report point is a positioning address and a timestamp obtained when the user operates an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
the first extraction unit is used for extracting a feature matrix of the target address name according to all report points corresponding to the target address name;
the analysis unit is used for inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
a first determining unit, configured to determine an alias relationship score according to a feature value of the first address name and a feature value of the second address name;
a second determining unit, configured to determine that an alias relationship exists between the first address name and the second address name if the alias relationship score is greater than a threshold value.
7. The identification device according to claim 6, wherein the unit for constructing the analysis model comprises:
the training sample construction unit is used for constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
the second extraction unit is used for extracting a feature matrix of the address name of the target training sample according to all report points corresponding to the address name of the target training sample; wherein the target training sample address name is a first training sample address name or a second training sample address name;
the input unit is used for inputting the feature matrix of the target training sample address name into a deep convolutional neural network to obtain a feature value of the target training sample address name;
the prediction unit is used for obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relationship according to the characteristic value of the first training sample address name and the characteristic value of the second training sample address name;
and the adjusting unit is used for continuously adjusting the parameters in the deep convolutional neural network by using the error between the prediction result and the real alias corresponding to the training sample address name until the error between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the training sample address name meets a preset convergence condition, and determining the adjusted deep convolutional neural network as an analysis model.
8. The identification device according to claim 6, wherein the first extraction unit comprises:
the equal dividing unit is used for equally dividing the city map to which the target address name belongs into small grids of N x N; wherein N is a positive integer;
the counting unit is used for counting the number of all report points corresponding to the target address name in each small piece grid;
and the first extraction subunit is used for normalizing the number of each small piece of grid to obtain a feature matrix of the target address name.
9. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of identifying address aliases of any of claims 1-5.
10. A computer storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of identifying address aliases of any one of claims 1 to 5.
CN202111128210.5A 2021-09-26 2021-09-26 Address alias identification method and device, electronic equipment and computer storage medium Active CN113779331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111128210.5A CN113779331B (en) 2021-09-26 2021-09-26 Address alias identification method and device, electronic equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN113779331A true CN113779331A (en) 2021-12-10
CN113779331B CN113779331B (en) 2024-02-06

Family

ID=78853391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111128210.5A Active CN113779331B (en) 2021-09-26 2021-09-26 Address alias identification method and device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN113779331B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866797A (en) * 2018-08-28 2020-03-06 珠海市卓优信息技术有限公司 Timed supply chain pre-selling system and method thereof
CN110866711A (en) * 2018-08-28 2020-03-06 珠海市卓优信息技术有限公司 Electronic commerce purchaser system of timing supply chain and method thereof
CN111882224A (en) * 2020-07-30 2020-11-03 上加下信息技术成都有限公司 Method and device for classifying consumption scenes

Also Published As

Publication number Publication date
CN113779331B (en) 2024-02-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant