CN113779331B - Address alias identification method and device, electronic equipment and computer storage medium - Google Patents

Address alias identification method and device, electronic equipment and computer storage medium

Info

Publication number
CN113779331B
Authority
CN
China
Prior art keywords
address name
target
address
training sample
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111128210.5A
Other languages
Chinese (zh)
Other versions
CN113779331A (en
Inventor
何天赋
陈国春
颜萍
王晟宇
袁野
李瑞远
鲍捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong City Beijing Digital Technology Co Ltd
Original Assignee
Jingdong City Beijing Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong City Beijing Digital Technology Co Ltd filed Critical Jingdong City Beijing Digital Technology Co Ltd
Priority to CN202111128210.5A priority Critical patent/CN113779331B/en
Publication of CN113779331A publication Critical patent/CN113779331A/en
Application granted granted Critical
Publication of CN113779331B publication Critical patent/CN113779331B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an address alias identification method and apparatus, an electronic device, and a computer storage medium. The method comprises: querying a receiving address database for a first user whose receiving address name is a first address name and a second user whose receiving address name is a second address name; for each target user, obtaining the report points of the target user from a report point database according to the identifier of the target user; extracting a feature matrix of the target address name from all report points corresponding to the target address name; inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; determining an alias relation score from the feature value of the first address name and the feature value of the second address name; and, if the alias relation score is greater than a threshold, determining that the first address name and the second address name have an alias relation. Address aliases can thereby be identified quickly and accurately.

Description

Address alias identification method and device, electronic equipment and computer storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for identifying an address alias, an electronic device, and a computer storage medium.
Background
In addition to its standard name, a geographic entity is often known by an alias. For example, the Beijing Zhongxin (CITIC) Building, located in the central business district, is commonly called "China Zun".
In the prior art, aliases are acquired mainly through manual collection and entry. Collection modes may include crowd-sourced collection, in which a city is divided into sections and the alias collection task for each section is assigned to people familiar with the local geography, and acquisition of filed building registration records. Manual collection and entry therefore consume a great deal of labor and time, and the aliases obtained lag well behind actual usage.
Disclosure of Invention
In view of the above, the present application provides a method, apparatus, electronic device, and computer storage medium for identifying an address alias, which can quickly and accurately identify the address alias.
The first aspect of the present application provides a method for identifying an address alias, including:
acquiring a first address name and a second address name;
inquiring a first user with a receiving address name of the first address name and a second user with a receiving address name of the second address name in a receiving address database;
for each target user, acquiring the report points of the target user in a report point database according to the identification of the target user, and associating the report points of the target user with the target address name corresponding to the target user; a report point is a positioning address and a time stamp obtained when the user performs an operation in an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
extracting and obtaining a feature matrix of the target address name according to all the report points corresponding to the target address name;
inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
determining an alias relation score according to the characteristic value of the first address name and the characteristic value of the second address name;
and if the alias relation score is larger than a threshold value, determining that the first address name and the second address name have an alias relation.
Optionally, the method for constructing the analysis model includes:
constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
extracting a feature matrix of the target training sample address name according to all report points corresponding to the target training sample address name; the target training sample address name is a first training sample address name or a second training sample address name;
inputting the feature matrix of the target training sample address name into a deep convolutional neural network to obtain a feature value of the target training sample address name;
obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relation according to the feature value of the first training sample address name and the feature value of the second training sample address name;
and continuously adjusting parameters in the deep convolutional neural network by utilizing errors between the prediction result and the real aliases corresponding to the address names of the training samples until the errors between the prediction result output by the adjusted deep convolutional neural network and the real aliases corresponding to the address names of the training samples meet preset convergence conditions, and determining the adjusted deep convolutional neural network as an analysis model.
Optionally, the extracting, according to all the report points corresponding to the target address name, the feature matrix of the target address name includes:
equally dividing the city map to which the target address name belongs into N×N small grids; wherein N is a positive integer;
counting the number of all the report points corresponding to the target address name in each small grid;
and normalizing the quantity in each small grid to obtain the feature matrix of the target address name.
Optionally, the determining an alias relation score according to the feature value of the first address name and the feature value of the second address name includes:
calculating to obtain a similar value of the characteristic value of the first address name and the characteristic value of the second address name according to the characteristic value of the first address name and the characteristic value of the second address name;
and calculating by using the similar value to obtain an alias relation score between the first address name and the second address name.
Optionally, before the obtaining, for each target user, the report point of the target user in the report point database according to the identifier of the target user and associating the report point of the target user with the target address name corresponding to the target user, the method further includes:
encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user;
the method for obtaining the report point of each target user in the report point database according to the identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user comprises the following steps:
and aiming at each target user, acquiring the report point of the target user in a report point database according to the encrypted identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user.
A second aspect of the present application provides an address alias identification apparatus, including:
the acquisition unit is used for acquiring the first address name and the second address name;
the inquiry unit is used for inquiring a first user with the receiving address name of the first address name and a second user with the receiving address name of the second address name in the receiving address database;
the association unit is used for acquiring, for each target user, the report points of the target user in a report point database according to the identification of the target user, and associating the report points of the target user with the target address name corresponding to the target user; a report point is a positioning address and a time stamp obtained when the user performs an operation in an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
the first extraction unit is used for extracting and obtaining a feature matrix of the target address name according to all the report points corresponding to the target address name;
the analysis unit is used for inputting the characteristic matrix of the target address name into an analysis model to obtain a characteristic value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
a first determining unit, configured to determine an alias relation score according to the feature value of the first address name and the feature value of the second address name;
and the second determining unit is used for determining that the first address name and the second address name have an alias relation if the alias relation score is larger than a threshold value.
Optionally, the construction unit of the analysis model includes:
the training sample construction unit is used for constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
the second extraction unit is used for extracting and obtaining a feature matrix of the target training sample address name according to all report points corresponding to the target training sample address name; the target training sample address name is a first training sample address name or a second training sample address name;
the input unit is used for inputting the feature matrix of the target training sample address name into the deep convolutional neural network to obtain the feature value of the target training sample address name;
the prediction unit is used for obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relation according to the feature value of the first training sample address name and the feature value of the second training sample address name;
and the adjusting unit is used for continuously adjusting parameters in the deep convolutional neural network by utilizing the error between the prediction result and the real alias corresponding to the address name of the training sample until the error between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the address name of the training sample meets a preset convergence condition, and determining the adjusted deep convolutional neural network as an analysis model.
Optionally, the first extraction unit includes:
an equally dividing unit, configured to equally divide the city map to which the target address name belongs into N×N small grids; wherein N is a positive integer;
the counting unit is used for counting the quantity of all the report points corresponding to the target address name in each small grid;
and the first extraction subunit is used for normalizing the quantity in each small grid to obtain a feature matrix of the target address name.
Optionally, the first determining unit includes:
the first calculating unit is used for calculating to obtain a similar value of the characteristic value of the first address name and the characteristic value of the second address name according to the characteristic value of the first address name and the characteristic value of the second address name;
and the second calculation unit is used for calculating and obtaining the alias relation score between the first address name and the second address name by using the similar value.
Optionally, the address alias identification device further includes:
the encryption unit is used for encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user;
wherein, the association unit is used for:
and aiming at each target user, acquiring the report point of the target user in a report point database according to the encrypted identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user.
A third aspect of the present application provides an electronic device, comprising:
One or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of identifying an address alias as in any one of the first aspects.
A fourth aspect of the present application provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for identifying an address alias according to any one of the first aspects.
As can be seen from the above solutions, the present application provides a method, an apparatus, an electronic device, and a computer storage medium for identifying an address alias, where the method for identifying an address alias includes: firstly, acquiring a first address name and a second address name; then, in a receiving address database, inquiring a first user with the receiving address name being the first address name and a second user with the receiving address name being the second address name; then, aiming at each target user, acquiring the report point of the target user in a report point database according to the identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user; the report point is a positioning address and a time stamp obtained when the user uses an application program to operate; then, extracting and obtaining a feature matrix of the target address name according to all the report points corresponding to the target address name; inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; finally, determining an alias relation score according to the characteristic value of the first address name and the characteristic value of the second address name; and if the alias relation score is larger than a threshold value, determining that the first address name and the second address name have an alias relation. Thereby achieving the purpose of quickly and accurately identifying the address alias.
Drawings
In order to more clearly illustrate the present embodiment or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiment or the prior art will be briefly described below, and it is obvious that the drawings in the description below are only examples of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a specific flowchart of a method for identifying an address alias according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a report distribution according to another embodiment of the present disclosure;
FIG. 3 is a flowchart of a method for extracting the feature matrix of a target address name according to another embodiment of the present application;
FIG. 4 is a schematic diagram of extracting a 3×3 feature matrix for address name A according to another embodiment of the present application;
FIG. 5 is a flowchart of a method for constructing an analytical model according to another embodiment of the present application;
FIG. 6 is a flow chart of a method of determining an alias relationship score according to another embodiment of the present application;
FIG. 7 is a schematic diagram of an address alias identification apparatus according to another embodiment of the present application;
FIG. 8 is a schematic diagram of an electronic device for implementing an address alias identification method according to another embodiment of the present application.
Detailed Description
The technical solutions in this embodiment will be clearly and completely described below with reference to the drawings in this embodiment, and it is obvious that the described embodiments are only some of the embodiments, not all of the embodiments. All other embodiments, based on the embodiments herein, which would be apparent to one of ordinary skill in the art without undue burden are within the scope of this disclosure.
It should be noted that the terms "first," "second," and the like in this application are used merely to distinguish between different devices, modules, or units and are not intended to limit the order or interdependence of functions performed by such devices, modules, or units, but the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but also other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The embodiment of the application provides an address alias identification method, as shown in fig. 1, specifically including the following steps:
S101, acquiring a first address name and a second address name.
It should be noted that the first address name and the second address name in the present application may be two manually specified address names, or two address names selected at random in the system, which is not limited herein.
S102, querying, in a receiving address database, a first user whose receiving address name is the first address name and a second user whose receiving address name is the second address name.
Specifically, the query in the receiving address database may be implemented by, but is not limited to, matching address text strings.
For example, suppose the users who filled in address name A as their shipping address are user 1, user 2 and user 3. The present application can then generate a user list for address name A: address name A → {user 1, user 2, user 3}. Similarly, the users who filled in address name D as their shipping address are user 3 and user 4, giving the user list address name D → {user 3, user 4}.
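The query in S102 can be sketched as follows. This is a minimal illustration that assumes exact string matching and an in-memory list of rows; the names `SHIPPING_ROWS` and `users_by_address_name` are hypothetical and do not come from the application.

```python
from collections import defaultdict

# Hypothetical rows of a receiving (shipping) address database:
# (user identifier, address name the user filled in).
SHIPPING_ROWS = [
    ("user1", "address name A"),
    ("user2", "address name A"),
    ("user3", "address name A"),
    ("user3", "address name D"),
    ("user4", "address name D"),
]


def users_by_address_name(rows, address_names):
    """Return {address name: sorted user ids}, using exact string matching."""
    wanted = set(address_names)
    users = defaultdict(set)
    for user_id, name in rows:
        if name in wanted:
            users[name].add(user_id)
    # Names with no matching user map to an empty list.
    return {name: sorted(users[name]) for name in address_names}


user_lists = users_by_address_name(SHIPPING_ROWS, ["address name A", "address name D"])
```

A user may appear in more than one list, as user 3 does here, because the same person can ship to the same place under both names.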
S103, aiming at each target user, acquiring the report point of the target user in a report point database according to the identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user.
A report point is a positioning address and a time stamp obtained when a user performs an operation in an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name.
It should be noted that specific operations may be set, and not every operation needs to obtain a report of the user, which is not limited herein.
Continuing the above example: address name A → {user 1, user 2, user 3} → {report points of user 1, report points of user 2, report points of user 3}; address name D → {user 3, user 4} → {report points of user 3, report points of user 4}. The specific report point distribution is shown in FIG. 2.
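The association in S103 can be sketched as follows, assuming the report point database is a mapping from user identifier to a list of (longitude, latitude, timestamp) tuples. The layout, the coordinates and the name `associate_report_points` are illustrative assumptions only.

```python
# Hypothetical report point database: user id -> [(longitude, latitude, timestamp), ...].
REPORT_POINTS = {
    "user1": [(116.46, 39.91, 1630000000)],
    "user2": [(116.47, 39.92, 1630000100), (116.46, 39.91, 1630000200)],
    "user3": [(116.45, 39.90, 1630000300)],
    "user4": [(116.46, 39.91, 1630000400)],
}


def associate_report_points(user_lists, report_points):
    """Map each address name to all report points of the users who ship to it."""
    associated = {}
    for address_name, user_ids in user_lists.items():
        points = []
        for user_id in user_ids:
            # Users without any report point simply contribute nothing.
            points.extend(report_points.get(user_id, []))
        associated[address_name] = points
    return associated


user_lists = {
    "address name A": ["user1", "user2", "user3"],
    "address name D": ["user3", "user4"],
}
points_by_name = associate_report_points(user_lists, REPORT_POINTS)
```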
Optionally, in another embodiment of the present application, to protect the privacy of users, an implementation of the address alias identification method further includes, before performing step S103:
encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user.
It should be noted that the preset encryption mode may be, but is not limited to, the MD5 algorithm.
In this way, report points are associated only with anonymized users during processing, which makes it difficult to recover the detailed location information of any specific user.
One embodiment of step S103 includes:
and aiming at each target user, acquiring the report point of the target user in a report point database according to the encrypted identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user.
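The encrypted lookup can be sketched with Python's standard `hashlib` module. The sketch assumes the report point database is itself keyed by the MD5 digest of the user identifier; the application does not state how the database is keyed, and the names `encrypt_user_id` and `lookup_report_points` are hypothetical.

```python
import hashlib


def encrypt_user_id(user_id: str) -> str:
    """Return the hexadecimal MD5 digest of a user identifier."""
    return hashlib.md5(user_id.encode("utf-8")).hexdigest()


def lookup_report_points(user_id, hashed_report_db):
    """Fetch report points using only the digest, never the raw identifier."""
    return hashed_report_db.get(encrypt_user_id(user_id), [])


# Assumption: the report point database stores digests as keys.
hashed_db = {encrypt_user_id("user3"): [(116.45, 39.90, 1630000300)]}
points = lookup_report_points("user3", hashed_db)
```

MD5 is a one-way hash rather than a reversible cipher, so the lookup only works if both databases apply the same digest to the same identifier.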
S104, extracting and obtaining a feature matrix of the target address name according to all the report points corresponding to the target address name.
Optionally, in another embodiment of the present application, an implementation of step S104, as shown in fig. 3, includes:
S301, equally dividing the city map to which the target address name belongs into N×N small grids.
Wherein N is a positive integer.
S302, counting the number of all the report points corresponding to the target address name in each small grid.
S303, normalizing the number in each small grid to obtain the feature matrix of the target address name.
Specifically, FIG. 4 shows a simple schematic diagram of extracting a 3×3 feature matrix for address name A.
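Steps S301 to S303 can be sketched as follows. The sketch makes two assumptions that the application does not specify: the city map is a rectangle given by `bounds`, and normalization divides each cell count by the total number of report points. The function name `feature_matrix` and the sample coordinates are hypothetical.

```python
def feature_matrix(points, bounds, n):
    """Count report points per cell of an n x n grid and normalize the counts.

    points: iterable of (longitude, latitude) pairs.
    bounds: (min_lon, min_lat, max_lon, max_lat) of the city map,
            assumed rectangular and non-degenerate.
    """
    min_lon, min_lat, max_lon, max_lat = bounds
    counts = [[0] * n for _ in range(n)]
    for lon, lat in points:
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue  # ignore points outside the city map
        # Points exactly on the upper boundary fall into the last cell.
        col = min(int((lon - min_lon) / (max_lon - min_lon) * n), n - 1)
        row = min(int((lat - min_lat) / (max_lat - min_lat) * n), n - 1)
        counts[row][col] += 1
    total = sum(sum(r) for r in counts)
    if total == 0:
        return [[0.0] * n for _ in range(n)]
    # Assumed normalization: each cell becomes its share of all report points.
    return [[c / total for c in r] for r in counts]


points_a = [(0.5, 0.5), (1.5, 1.5), (1.6, 1.4), (2.5, 0.2)]
matrix_a = feature_matrix(points_a, bounds=(0.0, 0.0, 3.0, 3.0), n=3)
```

With this normalization the matrix does not depend on how many users an address name has, only on where their report points concentrate, which is what two names for the same place should share.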
S105, inputting the feature matrix of the target address name into the analysis model to obtain the feature value of the target address name.
The analysis model is obtained by training a deep convolutional neural network with a plurality of training sample address names and the real aliases corresponding to the training sample address names. The deep convolutional neural network is composed of a plurality of convolutional layers and pooling layers.
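To make the role of the convolutional and pooling layers concrete, the following sketch applies one convolution stage and one max pooling stage to a feature matrix and flattens the result into a feature value. It is an illustration only: the kernels in `KERNELS` are fixed by hand, whereas in the analysis model they would be learned, and the layer sizes, the ReLU activation and all function names are assumptions.

```python
def conv2d_valid(matrix, kernel):
    """2-D cross-correlation without padding, followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(matrix) - kh + 1):
        row = []
        for j in range(len(matrix[0]) - kw + 1):
            acc = sum(matrix[i + a][j + b] * kernel[a][b]
                      for a in range(kh) for b in range(kw))
            row.append(max(acc, 0.0))
        out.append(row)
    return out


def max_pool(matrix, size):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    return [
        [max(matrix[i + a][j + b] for a in range(size) for b in range(size))
         for j in range(0, len(matrix[0]) - size + 1, size)]
        for i in range(0, len(matrix) - size + 1, size)
    ]


def feature_value(matrix, kernels, pool_size=2):
    """Apply each kernel, pool the result, and flatten into one vector."""
    vector = []
    for kernel in kernels:
        pooled = max_pool(conv2d_valid(matrix, kernel), pool_size)
        vector.extend(v for row in pooled for v in row)
    return vector


# Hypothetical hand-picked kernels standing in for learned parameters.
KERNELS = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.25, 0.25], [0.25, 0.25]],
]
grid = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.25, 0.0, 0.0],
    [0.0, 0.25, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
v = feature_value(grid, KERNELS)
```

For a 5×5 input, each 2×2 kernel yields a 4×4 map, pooling reduces it to 2×2, and the two kernels together give a feature value of length 8.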
Optionally, in another embodiment of the present application, an implementation of a method for constructing an analytical model, as shown in fig. 5, includes:
S501, constructing a training sample set.
The training sample set comprises all report points corresponding to the address names of the training samples and real aliases corresponding to the address names of the training samples.
It should be noted that label data may be added to each training sample address name to mark what its alias is, or which address name it is an alias of.
S502, extracting a feature matrix of the target training sample address name according to all report points corresponding to the target training sample address name.
The target training sample address name is the first training sample address name or the second training sample address name.
It should be noted that, for the specific implementation manner of step S502, reference may be made to step S104, which is not described herein.
S503, inputting the feature matrix of the target training sample address name into the deep convolutional neural network to obtain the feature value of the target training sample address name.
It should be noted that, for the specific implementation manner of step S503, reference may be made to step S105, which is not described herein.
S504, obtaining, according to the feature value of the first training sample address name and the feature value of the second training sample address name, a prediction result of whether the first training sample address name and the second training sample address name have an alias relation.
The prediction result is that the first training sample address name and the second training sample address name have an alias relation, or the first training sample address name and the second training sample address name have no alias relation.
S505, judging whether the error between the predicted result and the real alias corresponding to the training sample address name meets a preset convergence condition.
Specifically, if it is determined that the error between the prediction result and the real alias corresponding to the address name of the training sample does not meet the preset convergence condition, step S506 is executed; if it is determined that the error between the prediction result and the real alias corresponding to the training sample address name meets the preset convergence condition, step S507 is executed.
S506, continuously adjusting parameters in the deep convolutional neural network.
S507, determining the deep convolutional neural network as an analysis model.
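The control flow of S504 to S507 (predict, measure the error, check convergence, adjust parameters) can be sketched with a deliberately tiny stand-in model. Instead of a deep convolutional neural network, the sketch trains a two-parameter logistic model on the distance between two feature values. The model, the cross-entropy error, the learning rate and the tolerance are all assumptions made for illustration.

```python
import math


def predict(delta, w, b):
    """Probability that a pair is an alias pair, given its distance delta."""
    return 1.0 / (1.0 + math.exp(-(w * delta + b)))


def train(pairs, lr=0.5, tol=0.2, max_steps=10000):
    """pairs: list of (delta, label), label 1 for a real alias pair, else 0."""
    w, b = 0.0, 0.0
    loss = float("inf")
    for _ in range(max_steps):
        # S504/S505: predict for every pair and measure the mean error.
        loss, grad_w, grad_b = 0.0, 0.0, 0.0
        for delta, label in pairs:
            p = predict(delta, w, b)
            loss -= label * math.log(p) + (1 - label) * math.log(1 - p)
            grad_w += (p - label) * delta
            grad_b += p - label
        loss /= len(pairs)
        if loss < tol:
            break  # S507: convergence condition met, keep current parameters
        # S506: adjust the parameters by gradient descent and try again.
        w -= lr * grad_w / len(pairs)
        b -= lr * grad_b / len(pairs)
    return w, b, loss


# Hypothetical training pairs: small distances are alias pairs, large ones are not.
PAIRS = [(0.1, 1), (0.2, 1), (2.0, 0), (3.0, 0)]
w, b, final_loss = train(PAIRS)
```

In the method itself the adjusted parameters are the weights of the deep convolutional neural network, so the feature values change during training; here they are held fixed to keep the loop readable.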
S106, determining the alias relation score according to the characteristic value of the first address name and the characteristic value of the second address name.
Optionally, in another embodiment of the present application, an implementation of step S106, as shown in fig. 6, includes:
S601, calculating a similarity value between the feature value of the first address name and the feature value of the second address name.
Specifically, the following calculation formula may be used to calculate the similarity value between the feature value of the first address name and the feature value of the second address name:
where δ denotes the similarity value between the feature value of the first address name and the feature value of the second address name; v denotes the feature value of the first address name; and v' denotes the feature value of the second address name.
S602, calculating an alias relation score between the first address name and the second address name by using the similarity value.
Specifically, the following calculation formula may be used to calculate the alias relation score between the first address name and the second address name:
Where s represents an alias relationship score between the first address name and the second address name.
S107, if the alias relation score is greater than the threshold value, determining that the first address name and the second address name have an alias relation.
The threshold is a value set by a technician or the like, and can be set and changed according to a subsequent experimental result, an application condition or the like, and is not limited herein.
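Steps S601, S602 and S107 can be sketched together as follows. The calculation formulas themselves are not reproduced in this text, so the sketch substitutes assumed stand-ins: the Euclidean distance for δ (a smaller distance meaning more similar characteristic values) and s = 1 / (1 + δ) for the score. The function names and the default threshold of 0.5 are likewise hypothetical.

```python
import math


def similarity_value(v, v_prime):
    """Assumed stand-in for delta: Euclidean distance between two feature vectors."""
    if len(v) != len(v_prime):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, v_prime)))


def alias_score(delta):
    """Assumed stand-in for s: maps a distance in [0, inf) to a score in (0, 1]."""
    return 1.0 / (1.0 + delta)


def has_alias_relation(v, v_prime, threshold=0.5):
    """S107: the pair is treated as aliases only if the score exceeds the threshold."""
    return alias_score(similarity_value(v, v_prime)) > threshold
```

With these stand-ins, identical characteristic values give a score of 1.0, and the threshold can be tuned against experimental results as the text describes.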
As can be seen from the above schemes, the present application provides a method for identifying an address alias: firstly, acquiring a first address name and a second address name; then, in the receiving address database, inquiring a first user with the receiving address name being a first address name and a second user with the receiving address name being a second address name; then, aiming at each target user, acquiring the report point of the target user in a report point database according to the identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user; the report point is a positioning address and a time stamp obtained when a user uses an application program to operate; then, extracting and obtaining a feature matrix of the target address name according to all the report points corresponding to the target address name; inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; finally, determining an alias relation score according to the characteristic value of the first address name and the characteristic value of the second address name; if the alias relation score is greater than the threshold, it is determined that the first address name and the second address name have an alias relation. Thereby achieving the purpose of quickly and accurately identifying the address alias.
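The query and association steps summarized above can be sketched as a simple join. This is an illustration under stated assumptions: the receiving address database is modelled as a list of `(user_id, receiving_address_name)` pairs, the report point database as a dictionary from user identifier to `(latitude, longitude, timestamp)` tuples, and all names and sample values are hypothetical.

```python
from collections import defaultdict


def associate_report_points(address_names, receiving_db, report_db):
    """Map each queried address name to the report points of the users
    whose receiving address name equals it."""
    wanted = set(address_names)
    points_by_name = defaultdict(list)
    for user_id, name in receiving_db:
        if name in wanted:
            # Associate this user's report points with the address name.
            points_by_name[name].extend(report_db.get(user_id, []))
    return dict(points_by_name)


# Hypothetical sample data.
receiving_db = [("u1", "Sunshine Garden"), ("u2", "Sunshine Gdn"), ("u3", "Other Place")]
report_db = {
    "u1": [(39.90, 116.40, 1632614400)],
    "u2": [(39.90, 116.40, 1632618000), (39.91, 116.41, 1632621600)],
}
associated = associate_report_points(["Sunshine Garden", "Sunshine Gdn"], receiving_db, report_db)
```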
Another embodiment of the present application provides an address alias identification device, as shown in fig. 7, specifically including:
An obtaining unit 701, configured to obtain a first address name and a second address name.
A query unit 702, configured to query a receiving address database for a first user whose receiving address name is the first address name and a second user whose receiving address name is the second address name.
An association unit 703, configured to obtain, for each target user, the report points of the target user from a report point database according to the identifier of the target user, and to associate the report points of the target user with the target address name corresponding to the target user.
A report point is a positioning address and a timestamp obtained when a user operates an application program; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name.
The first extracting unit 704 is configured to extract, according to all the report points corresponding to the target address name, a feature matrix of the target address name.
Optionally, in another embodiment of the present application, an implementation of the first extraction unit 704 includes:
An equally dividing unit, configured to equally divide the city map to which the target address name belongs into N×N small grids.
Wherein N is a positive integer.
And the statistics unit is used for counting the number of all the report points corresponding to the target address name in each small grid.
And the first extraction subunit is used for normalizing the quantity in each small grid to obtain a feature matrix of the target address name.
The specific working process of the unit disclosed in the foregoing embodiments of the present application may refer to the content of the corresponding method embodiment, as shown in fig. 3, which is not described herein again.
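The grid-based extraction performed by the first extraction unit can be sketched as follows. This is a minimal sketch under assumptions: the city map is taken to be a latitude/longitude bounding box split into a square n-by-n grid, and normalization is taken to mean dividing each cell count by the total count, which the text does not specify. The names `feature_matrix`, `bounds` and `n` are hypothetical.

```python
def feature_matrix(points, bounds, n):
    """points: iterable of (lat, lon); bounds: (min_lat, min_lon, max_lat, max_lon);
    n: number of grid cells per side. Returns an n-by-n list of normalized counts."""
    min_lat, min_lon, max_lat, max_lon = bounds
    counts = [[0] * n for _ in range(n)]
    for lat, lon in points:
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue  # ignore report points outside the city map
        # Clamp so that points exactly on the upper edge fall into the last cell.
        row = min(int((lat - min_lat) / (max_lat - min_lat) * n), n - 1)
        col = min(int((lon - min_lon) / (max_lon - min_lon) * n), n - 1)
        counts[row][col] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return [[0.0] * n for _ in range(n)]
    # Assumed normalization: each cell becomes its share of all report points.
    return [[c / total for c in row] for row in counts]


matrix = feature_matrix([(0.1, 0.1), (0.1, 0.2), (0.9, 0.9), (5.0, 5.0)], (0.0, 0.0, 1.0, 1.0), 2)
```

Other normalizations, such as dividing by the maximum cell count, would fit the description equally well.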
And the analysis unit 705 is configured to input the feature matrix of the target address name into the analysis model, and obtain a feature value of the target address name.
The analysis model is obtained by training the deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names.
Optionally, in another embodiment of the present application, an implementation of the building unit of the analysis model includes:
and the training sample construction unit is used for constructing a training sample set.
The training sample set comprises all report points corresponding to the address names of the training samples and real aliases corresponding to the address names of the training samples.
And the second extraction unit is used for extracting and obtaining the feature matrix of the address name of the target training sample according to all the report points corresponding to the address name of the target training sample.
The target training sample address name is the first training sample address name or the second training sample address name.
And the input unit is used for inputting the feature matrix of the target training sample address name into the deep convolutional neural network to obtain the feature value of the target training sample address name.
And the prediction unit is used for obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relation according to the characteristic value of the first training sample address name and the characteristic value of the second training sample address name.
The adjusting unit is used for continuously adjusting parameters in the deep convolutional neural network by utilizing errors between the prediction result and the real alias corresponding to the training sample address name until the errors between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the training sample address name meet preset convergence conditions, and determining the adjusted deep convolutional neural network as an analysis model.
The specific working process of the unit disclosed in the foregoing embodiments of the present application may refer to the content of the corresponding method embodiment, as shown in fig. 5, which is not described herein again.
The first determining unit 706 is configured to determine an alias relation score according to the feature value of the first address name and the feature value of the second address name.
Optionally, in another embodiment of the present application, an implementation manner of the first determining unit 706 includes:
the first calculation unit is used for calculating to obtain a similar value of the characteristic value of the first address name and the characteristic value of the second address name according to the characteristic value of the first address name and the characteristic value of the second address name;
and the second calculation unit is used for calculating and obtaining the alias relation score between the first address name and the second address name by using the similar value.
The specific working process of the unit disclosed in the foregoing embodiments of the present application may refer to the content of the corresponding method embodiment, as shown in fig. 6, which is not described herein again.
The second determining unit 707 is configured to determine that the first address name and the second address name have an alias relationship if the alias relationship score is greater than a threshold value.
The specific working process of the unit disclosed in the foregoing embodiments of the present application may refer to the content of the corresponding method embodiment, as shown in fig. 1, which is not repeated herein.
Optionally, in another embodiment of the present application, an implementation manner of the address alias identification device further includes:
the encryption unit is used for encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user.
Wherein, the association unit 703 is configured to:
and aiming at each target user, acquiring the report point of the target user in a report point database according to the encrypted identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user.
The specific working process of the unit disclosed in the foregoing embodiments of the present application may refer to the content of the corresponding method embodiment, which is not described herein again.
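The lookup by encrypted identifier can be sketched as follows. The text only refers to "a preset encryption mode", so the sketch assumes HMAC-SHA256 as a stand-in; strictly speaking this is keyed hashing rather than reversible encryption. The key, function names and sample data are hypothetical, and the report point database is assumed to be keyed by the same transformed identifier.

```python
import hashlib
import hmac


def encrypt_identifier(user_id: str, key: bytes) -> str:
    """Assumed stand-in for the preset encryption mode: HMAC-SHA256 hex digest."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def report_points_for(user_id: str, key: bytes, report_db: dict) -> list:
    """Look up report points using the encrypted identifier, never the raw one."""
    return report_db.get(encrypt_identifier(user_id, key), [])


# Hypothetical sample data: the database stores only encrypted identifiers.
key = b"demo-key"
report_db = {encrypt_identifier("u1", key): [(39.90, 116.40, 1632614400)]}
points = report_points_for("u1", key, report_db)
```

Because both sides apply the same transformation, the two databases can be joined without exposing the plain user identifier.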
As can be seen from the above scheme, the present application provides an address alias identification device: first, the obtaining unit 701 obtains a first address name and a second address name; then, the query unit 702 queries the receiving address database for a first user whose receiving address name is the first address name and a second user whose receiving address name is the second address name; the association unit 703 obtains, for each target user, the report points of the target user from the report point database according to the identifier of the target user, and associates the report points of the target user with the target address name corresponding to the target user, a report point being a positioning address and a timestamp obtained when a user operates an application program; then, the first extracting unit 704 extracts a feature matrix of the target address name according to all the report points corresponding to the target address name; the analysis unit 705 inputs the feature matrix of the target address name into the analysis model to obtain a feature value of the target address name; finally, the first determining unit 706 determines an alias relation score according to the feature value of the first address name and the feature value of the second address name; if the alias relation score is greater than the threshold value, the second determining unit 707 determines that the first address name and the second address name have an alias relation. The address alias is thereby identified quickly and accurately.
Another embodiment of the present application provides an electronic device, as shown in fig. 8, including:
one or more processors 801.
A storage device 802 on which one or more programs are stored.
The one or more programs, when executed by the one or more processors 801, cause the one or more processors 801 to implement the method of identifying an address alias as in any of the embodiments described above.
Another embodiment of the present application provides a computer storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements a method for identifying an address alias according to any one of the embodiments above.
In the above embodiments of the disclosure, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus and method embodiments described above are merely illustrative, for example, flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in various embodiments of the present disclosure may be integrated together to form a single portion, or each module may exist alone, or two or more modules may be integrated to form a single portion. The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present disclosure, in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present disclosure. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Those skilled in the art can make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An address alias identification method, comprising:
acquiring a first address name and a second address name;
inquiring a first user with a receiving address name of the first address name and a second user with a receiving address name of the second address name in a receiving address database;
for each target user, acquiring a report point of the target user in a report point database according to the identification of the target user, and associating the report point of the target user with a target address name corresponding to the target user; the report point is a positioning address and a time stamp obtained when the user uses an application program to operate; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
extracting and obtaining a feature matrix of the target address name according to all the report points corresponding to the target address name;
inputting the feature matrix of the target address name into an analysis model to obtain a feature value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
Determining an alias relation score according to the characteristic value of the first address name and the characteristic value of the second address name;
and if the alias relation score is larger than a threshold value, determining that the first address name and the second address name have an alias relation.
2. The method for identifying according to claim 1, wherein the method for constructing the analysis model comprises:
constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
extracting a feature matrix of the target training sample address name according to all report points corresponding to the target training sample address name; the target training sample address name is a first training sample address name or a second training sample address name;
inputting the feature matrix of the target training sample address name into a deep convolutional neural network to obtain a feature value of the target training sample address name;
obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relation according to the characteristic value of the first training sample address name and the characteristic value of the second training sample address name;
And continuously adjusting parameters in the deep convolutional neural network by utilizing errors between the prediction result and the real aliases corresponding to the address names of the training samples until the errors between the prediction result output by the adjusted deep convolutional neural network and the real aliases corresponding to the address names of the training samples meet preset convergence conditions, and determining the adjusted deep convolutional neural network as an analysis model.
3. The identification method according to claim 1, wherein the extracting the feature matrix of the target address name according to all the report points corresponding to the target address name includes:
equally dividing the city map to which the target address name belongs into N×N small grids; wherein, N is a positive integer;
counting the number of all the report points corresponding to the target address name in each small grid;
and normalizing the quantity in each small grid to obtain the feature matrix of the target address name.
4. The method of identifying of claim 1, wherein determining an alias relationship score from the characteristic value of the first address name and the characteristic value of the second address name comprises:
calculating a similarity value between the characteristic value of the first address name and the characteristic value of the second address name;
and calculating an alias relation score between the first address name and the second address name by using the similarity value.
5. The method for identifying according to claim 1, wherein for each target user, before obtaining the report point of the target user in the report point database according to the identifier of the target user and associating the report point of the target user with the target address name corresponding to the target user, the method further comprises:
encrypting the identification of each target user by using a preset encryption mode to obtain the encrypted identification of the target user;
the method for obtaining the report point of each target user in the report point database according to the identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user comprises the following steps:
and aiming at each target user, acquiring the report point of the target user in a report point database according to the encrypted identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user.
6. An address alias identification apparatus, comprising:
the acquisition unit is used for acquiring the first address name and the second address name;
the inquiry unit is used for inquiring a first user with the receiving address name of the first address name and a second user with the receiving address name of the second address name in the receiving address database;
the association unit is used for acquiring the report point of each target user in a report point database according to the identification of the target user, and associating the report point of the target user with the target address name corresponding to the target user; the report point is a positioning address and a time stamp obtained when the user uses an application program to operate; the target user is the first user or the second user; the target address name is the first address name or the second address name; the target address name corresponding to the first user is the first address name; the target address name corresponding to the second user is the second address name;
the first extraction unit is used for extracting and obtaining a feature matrix of the target address name according to all the report points corresponding to the target address name;
The analysis unit is used for inputting the characteristic matrix of the target address name into an analysis model to obtain a characteristic value of the target address name; the analysis model is obtained by training a deep convolutional neural network by a plurality of training sample address names and real aliases corresponding to the training sample address names;
a first determining unit, configured to determine an alias relation score according to the feature value of the first address name and the feature value of the second address name;
and the second determining unit is used for determining that the first address name and the second address name have an alias relation if the alias relation score is larger than a threshold value.
7. The apparatus according to claim 6, wherein the construction unit of the analysis model includes:
the training sample construction unit is used for constructing a training sample set; the training sample set comprises all report points corresponding to a plurality of training sample address names and real aliases corresponding to the training sample address names;
the second extraction unit is used for extracting and obtaining a feature matrix of the target training sample address name according to all report points corresponding to the target training sample address name; the target training sample address name is a first training sample address name or a second training sample address name;
The input unit is used for inputting the feature matrix of the target training sample address name into the deep convolutional neural network to obtain the feature value of the target training sample address name;
the prediction unit is used for obtaining a prediction result of whether the first training sample address name and the second training sample address name have an alias relation according to the characteristic value of the first training sample address name and the characteristic value of the second training sample address name;
and the adjusting unit is used for continuously adjusting parameters in the deep convolutional neural network by utilizing the error between the prediction result and the real alias corresponding to the address name of the training sample until the error between the prediction result output by the adjusted deep convolutional neural network and the real alias corresponding to the address name of the training sample meets a preset convergence condition, and determining the adjusted deep convolutional neural network as an analysis model.
8. The identification device of claim 6, wherein the first extraction unit comprises:
an equally dividing unit, configured to equally divide the city map to which the target address name belongs into N×N small grids; wherein, N is a positive integer;
The counting unit is used for counting the quantity of all the report points corresponding to the target address name in each small grid;
and the first extraction subunit is used for normalizing the quantity in each small grid to obtain a feature matrix of the target address name.
9. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of identifying address aliases of any one of claims 1 to 5.
10. A computer storage medium, having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of identifying an address alias according to any one of claims 1 to 5.
CN202111128210.5A 2021-09-26 2021-09-26 Address alias identification method and device, electronic equipment and computer storage medium Active CN113779331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111128210.5A CN113779331B (en) 2021-09-26 2021-09-26 Address alias identification method and device, electronic equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111128210.5A CN113779331B (en) 2021-09-26 2021-09-26 Address alias identification method and device, electronic equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN113779331A CN113779331A (en) 2021-12-10
CN113779331B true CN113779331B (en) 2024-02-06

Family

ID=78853391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111128210.5A Active CN113779331B (en) 2021-09-26 2021-09-26 Address alias identification method and device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN113779331B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866711A (en) * 2018-08-28 2020-03-06 珠海市卓优信息技术有限公司 Electronic commerce purchaser system of timing supply chain and method thereof
CN110866797A (en) * 2018-08-28 2020-03-06 珠海市卓优信息技术有限公司 Timed supply chain pre-selling system and method thereof
CN111882224A (en) * 2020-07-30 2020-11-03 上加下信息技术成都有限公司 Method and device for classifying consumption scenes

Also Published As

Publication number Publication date
CN113779331A (en) 2021-12-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant