CN117896717A - Target wifi screening method, device, medium and equipment - Google Patents

Info

Publication number
CN117896717A
Authority
CN
China
Prior art keywords
target
wifi
information
preset
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410063321.XA
Other languages
Chinese (zh)
Inventor
董霖
宋彤彤
方宏源
边彤洁
段永康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Merit Interactive Co Ltd
Original Assignee
Merit Interactive Co Ltd
Application filed by Merit Interactive Co Ltd
Priority to CN202410063321.XA
Publication of CN117896717A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of data processing, and in particular to a target wifi screening method, device, medium and equipment. The method comprises: obtaining a geohash string corresponding to a target object, and the corresponding candidate wifi, according to target name information and target address information; obtaining a similarity list between the target object and each candidate wifi according to a target information list and the names of all candidate wifi, and from it the target similarity; and determining each candidate wifi whose target similarity is greater than a preset similarity threshold to be a target wifi. The number of intermediate users and the target name keyword, obtained from the target name information and the target address information, characterize the size of the target object, which improves the accuracy of the geohash string. The multi-dimensional similarity between the target object and each candidate wifi comprehensively characterizes their association relationship, which improves the accuracy with which that relationship is characterized and therefore the screening accuracy of the target wifi.

Description

Target wifi screening method, device, medium and equipment
Technical Field
The invention relates to the field of data processing, in particular to a target wifi screening method, a device, a medium and equipment.
Background
With the popularization of smart devices and the rapid development of networks, companies and enterprises increasingly rely on smart devices and wifi networks to carry out daily tasks. Screening out, from a large number of wifi networks, the target wifi that corresponds to a company is therefore important for understanding the company's business situation, understanding the working conditions of its staff, and guiding the company's direction of work.
Current wifi screening methods calculate the similarity between the company name and each wifi name, and select from the available wifi the one whose name best matches the target company. However, wifi naming is highly arbitrary, so the similarity between the company name and the wifi name alone cannot accurately characterize the association relationship between the company and the wifi, and the screening accuracy is poor.
How to improve the screening accuracy of the target wifi is therefore a problem to be solved.
Disclosure of Invention
To address the above technical problem, the invention adopts a target wifi screening method comprising the following steps:
Acquiring a target information list corresponding to a target object, wherein the target information comprises target name information, target mailbox address information, target website information and target address information.
Acquiring a geohash string corresponding to the target object according to the target name information and the target address information.
Determining the wifi within the geographic area corresponding to the geohash string as candidate wifi.
Obtaining a similarity list set between the target object and each candidate wifi according to the target information list and the names of all candidate wifi, wherein the similarity list set comprises a first similarity list between the target name information and the name of the corresponding candidate wifi, a second similarity list between the target mailbox address information and the name of the corresponding candidate wifi, and a third similarity list between the target website information and the name of the corresponding candidate wifi.
Acquiring the target similarity between the target object and each candidate wifi according to the similarity list set and a preset priority list.
Determining each candidate wifi whose target similarity is greater than a preset similarity threshold as a target wifi.
The invention also provides a target wifi screening device, which comprises:
A target information acquisition module, configured to acquire a target information list corresponding to a target object, wherein the target information comprises target name information, target mailbox address information, target website information and target address information.
A character string acquisition module, configured to acquire a geohash string corresponding to the target object according to the target name information and the target address information.
A candidate wifi acquisition module, configured to determine the wifi within the geographic area corresponding to the geohash string as candidate wifi.
A similarity list acquisition module, configured to obtain a similarity list set between the target object and each candidate wifi according to the target information list and the names of all candidate wifi, wherein the similarity list set comprises a first similarity list between the target name information and the name of the corresponding candidate wifi, a second similarity list between the target mailbox address information and the name of the corresponding candidate wifi, and a third similarity list between the target website information and the name of the corresponding candidate wifi.
A target similarity acquisition module, configured to acquire the target similarity between the target object and each candidate wifi according to the similarity list set and a preset priority list.
A target wifi screening module, configured to determine each candidate wifi whose target similarity is greater than a preset similarity threshold as a target wifi.
The invention also provides a non-transitory computer-readable storage medium storing at least one instruction or at least one program segment, which is loaded and executed by a processor to implement the target wifi screening method.
The invention also provides an electronic device comprising a processor and the non-transitory computer readable storage medium described above.
The invention has at least the following beneficial effects. First, the number of intermediate users corresponding to the target object and the target name keyword are obtained from the target name information and the target address information to characterize the size of the target object. This determines the geohash level of the target object and hence the length of the geohash string, so that the precision of the geohash string matches the size of the target object, which improves the accuracy of the geohash string obtained from the geohash level and the target address information. Second, the wifi within the geographic area corresponding to the geohash string is determined as candidate wifi, and, according to the target information list and the names of all candidate wifi, a first similarity between the target name information and the name of each candidate wifi, a second similarity between the target mailbox address information and the name of each candidate wifi, and a third similarity between the target website information and the name of each candidate wifi are obtained. These three similarities comprehensively characterize the association relationship between the target object and each candidate wifi and improve the accuracy with which it is characterized. Third, the target similarity between the target object and each candidate wifi is obtained according to the similarity list set and the preset priority list, and each candidate wifi whose target similarity is greater than the preset similarity threshold is determined as a target wifi, which improves the screening accuracy of the target wifi.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a target wifi screening method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a target user screening method according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a target object screening method according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a target wifi screening device according to a fourth embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Embodiment 1
The first embodiment provides a target wifi screening method which, as shown in Fig. 1, includes the following steps:
S10, acquiring a target information list corresponding to a target object, wherein the target information comprises target name information, target mailbox address information, target website information and target address information.
The target object may be any object with a certain number of employees for which employee screening is needed, such as a company, an enterprise, a studio or a factory. The target name information may be a name of the target object, for example its Chinese name or English name; the target mailbox address information may be the address of a mailbox dedicated to the target object; the target website information may be the address of a website dedicated to the target object; and the target address information may be the geographic location of the target object. The target name information, target mailbox address information, target website information and target address information each correspond uniquely to the target object and can serve as its identity. The association relationship between the target object and a wifi can therefore be characterized by analyzing the association between these four items and the wifi, which in turn completes the task of screening the target wifi corresponding to the target object.
By using the target name information, target mailbox address information, target website information and target address information as the identity of the target object, the task of analyzing the association between the target object and a wifi is converted into the task of analyzing the association between these items and the wifi, which improves the screening accuracy of the target wifi.
In one embodiment, the target information list is obtained by:
Acquiring a preset information list corresponding to the target object, wherein the preset information list comprises preset name information, preset mailbox address information, preset website information and preset address information;
Performing data cleaning on the preset information list to obtain the target information list corresponding to the target object.
Data cleaning is performed on the preset information list so that the subsequent similarity analysis against wifi names is easier and less noisy, which improves the accuracy with which the association between the target object and the wifi is characterized and thus the screening accuracy of the target wifi.
In a specific embodiment, the step of performing data cleaning on the preset information list to obtain the target information list corresponding to the target object further includes the following steps:
Acquiring a preset first name keyword list, a preset mailbox keyword list, a preset website keyword list and preset characters, wherein the preset first name keyword list comprises a plurality of preset first name keywords, the preset mailbox keyword list comprises a plurality of preset mailbox keywords, and the preset website keyword list comprises a plurality of preset website keywords;
Removing every first name keyword that appears in the preset name information to obtain intermediate name information;
Performing character conversion on the intermediate name information according to a preset character conversion form to obtain the target name information;
Removing every mailbox keyword that appears in the preset mailbox address information to obtain intermediate mailbox address information;
Replacing each preset character in the intermediate mailbox address information with an empty character to obtain the target mailbox address information;
Removing every website keyword that appears in the preset website information to obtain intermediate website information;
Replacing each preset character in the intermediate website information with an empty character to obtain the target website information.
A first name keyword may be a preset keyword related to company names, such as "limited company", "limited liability company", "stock limited company", "office", "studio", "store", "center", "factory" or "department". A mailbox keyword may be a preset keyword related to mailbox names, such as "qq", "163", "126", "gmail" or "vip". A website keyword may be a preset keyword related to network addresses, such as "http", "https", "com" or "cn". A preset character may be a preset character such as "_", "-" or "//".
The first name keyword, the mailbox keyword, the website keyword and the preset character can be set by an implementer according to actual conditions.
Removing the keywords from the preset name information, preset mailbox address information and preset website information, and replacing the preset characters, normalizes the name, mailbox address and website information. When similarity is later calculated against wifi names, interference from these keywords and characters is reduced, which improves the accuracy with which the association between the target object and the wifi is characterized.
In one embodiment, the target address information and the preset address information may be identical.
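The cleaning steps above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary keys, case-insensitive matching, and the use of lowercasing as the "preset character conversion form" are all assumptions. The keyword and character lists reuse only the examples named in the text.

```python
import re

# Example lists taken from the description; real lists are set by the implementer.
NAME_KEYWORDS = ["limited liability company", "limited company", "studio",
                 "store", "center", "factory", "department"]
MAILBOX_KEYWORDS = ["qq", "163", "126", "gmail", "vip"]
WEBSITE_KEYWORDS = ["http", "https", "com", "cn"]
PRESET_CHARS = ["_", "-", "//"]


def remove_all(text, tokens):
    """Delete every occurrence of each token (hypothetical helper).

    Longer tokens are removed first so that, for example, "https" is
    removed as a whole instead of leaving a stray "s" after "http".
    Matching is case-insensitive, which is an assumption.
    """
    for token in sorted(tokens, key=len, reverse=True):
        text = re.sub(re.escape(token), "", text, flags=re.IGNORECASE)
    return text


def clean_info_list(preset):
    """Turn a preset information list into a target information list."""
    # Name: remove keywords, then apply the character conversion
    # (assumed here to be trimming plus lowercasing).
    name = remove_all(preset["name"], NAME_KEYWORDS).strip().lower()
    # Mailbox and website: remove keywords, then drop the preset characters.
    mailbox = remove_all(remove_all(preset["mailbox"], MAILBOX_KEYWORDS), PRESET_CHARS)
    website = remove_all(remove_all(preset["website"], WEBSITE_KEYWORDS), PRESET_CHARS)
    # The address is passed through unchanged.
    return {"name": name, "mailbox": mailbox,
            "website": website, "address": preset["address"]}
```

With only the example characters from the text, punctuation such as ":" and "." survives cleaning; an implementer would likely extend the preset character list to cover it.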
S20, acquiring a geohash string corresponding to the target object according to the target name information and the target address information.
A wifi that is close to the target object can be regarded as a candidate wifi that may be associated with the target object. To obtain all candidate wifi that may be associated with the target object, and then screen out the target wifi from them, this embodiment combines the target name information and the target address information to obtain a geohash string for the target object. The geohash string represents the location of the target object and serves as the basis for obtaining the candidate wifi.
In a specific embodiment, S20 further includes the following steps:
Acquiring the number of intermediate users corresponding to the target object according to the target name information;
Acquiring the target name keyword in the target name information according to a preset keyword extraction algorithm;
When the target name keyword matches a preset second name keyword, or the number of intermediate users is less than or equal to a preset first number threshold, determining that the geohash level corresponding to the target object is level 7;
When the target name keyword matches a preset third name keyword, or the number of intermediate users is greater than or equal to a preset second number threshold, determining that the geohash level corresponding to the target object is level 5;
When the target name keyword matches neither the preset second name keyword nor the preset third name keyword, and the number of intermediate users is greater than the preset first number threshold and less than the preset second number threshold, determining that the geohash level corresponding to the target object is level 6;
Acquiring the geohash string corresponding to the target object according to the geohash level and the target address information.
Geohash is an address encoding method that divides the whole geographic area into cells and encodes two-dimensional longitude and latitude data into a single string. The longer the geohash string, the smaller each cell and the higher the precision of the division.
Therefore, to improve the accuracy of the geohash string, this embodiment determines the geohash level of the target object from the target name information and the target address information, which improves the accuracy with which the candidate wifi is obtained.
Specifically, the target name keyword is a keyword in the target name information that indicates, to some extent, the size of the target object, such as "company", "store", "center", "factory" or "department". Correspondingly, a second name keyword is a keyword such as "store" that indicates a relatively small target object, and a third name keyword is a keyword such as "factory" that indicates a relatively large target object. The number of intermediate users corresponding to the target object also indicates its size to some extent. The geohash level of the target object is therefore determined by combining the number of intermediate users with the target name keyword extracted from the target name information by the keyword extraction algorithm.
The target name keyword, the second name keyword and the third name keyword can be set by an implementer according to actual situations.
A person skilled in the art will understand that any keyword extraction algorithm in the prior art may be used and falls within the protection scope of the present invention; the details are not repeated here.
In this way, the number of intermediate users corresponding to the target object and the target name keyword are obtained to characterize the size of the target object, which determines the geohash level and hence the length of the geohash string. The precision of the geohash string thus matches the size of the target object, which improves the accuracy of the geohash string obtained from the geohash level and the target address information.
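The level selection and string generation in S20 can be sketched as follows. The encoder is the standard public geohash algorithm (base-32 alphabet, longitude and latitude bits interleaved), not something specified by the patent. The function names and parameters are hypothetical, the address is assumed to have already been geocoded to a latitude and longitude by some external service, and checking the level-7 condition before the level-5 condition is an assumption, since the text does not say which rule wins when both match.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_level(name_keyword, user_count, small_keywords, large_keywords,
                  first_threshold, second_threshold):
    """Pick the geohash level (string length) from the size of the target object."""
    if name_keyword in small_keywords or user_count <= first_threshold:
        return 7  # small object: finest cells
    if name_keyword in large_keywords or user_count >= second_threshold:
        return 5  # large object: coarsest cells
    return 6      # everything in between


def geohash_encode(lat, lon, precision):
    """Encode a latitude/longitude pair as a geohash string of the given length."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


# Example: a mid-sized company maps to level 6.
level = geohash_level("company", 100, {"store"}, {"factory"}, 20, 200)
target_geohash = geohash_encode(57.64911, 10.40744, level)
```

Because a shorter geohash is always a prefix of a longer one for the same point, lowering the level simply widens the cell around the target object.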
S30, determining the wifi within the geographic area corresponding to the geohash string as candidate wifi.
Different wifi networks cover areas of different sizes. To reduce the adverse effect of these differences on the screening result, this embodiment treats every wifi within the geographic area corresponding to the geohash string as a candidate wifi, so that the candidate set covers, as far as possible, all target wifi corresponding to the target object. This improves the accuracy with which the target wifi is obtained when it is subsequently screened out of the candidate wifi.
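One simple way to realize S30 is a prefix match on geohash strings, sketched below. The record layout is an assumption: each wifi record is taken to carry a name and a geohash of its observed location that is at least as long as the target string. A wifi lies inside the target cell exactly when its geohash starts with the target geohash string.

```python
def candidate_wifi(target_geohash, wifi_records):
    """Return the wifi records located inside the target object's geohash cell."""
    return [w for w in wifi_records if w["geohash"].startswith(target_geohash)]


# Hypothetical wifi observations.
wifi_records = [
    {"name": "acme-office", "geohash": "u4pruydqq"},
    {"name": "cafe-free", "geohash": "u4pruyf01"},
    {"name": "far-away", "geohash": "u4pxw2k9z"},
]

level6 = candidate_wifi("u4pruy", wifi_records)   # acme-office and cafe-free
level7 = candidate_wifi("u4pruyd", wifi_records)  # acme-office only
```

A prefix match ignores wifi just across a cell boundary; whether neighbouring cells should also be searched is not addressed in the text.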
S40, obtaining a similarity list set between the target object and each candidate wifi according to the target information list and the names of all candidate wifi, wherein the similarity list set comprises a first similarity list between the target name information and the names of the corresponding candidate wifi, a second similarity list between the target mailbox address information and the names of the corresponding candidate wifi, and a third similarity list between the target website information and the names of the corresponding candidate wifi.
In this step, the first similarity between the target name information and the name of each candidate wifi, the second similarity between the target mailbox address information and the name of each candidate wifi, and the third similarity between the target website information and the name of each candidate wifi are obtained separately. Measuring similarity from these three aspects comprehensively characterizes the association relationship between the target object and each candidate wifi and improves the accuracy with which it is characterized.
In a specific embodiment, S40 further includes the following steps:
Acquiring a first edit distance and a first common string length between the target name information and the name of each candidate wifi;
Obtaining a first edit distance similarity between the target name information and the name of the corresponding candidate wifi according to the first edit distance;
Obtaining a first common string similarity between the target name information and the name of the corresponding candidate wifi according to the first common string length, the string length of the name of the corresponding candidate wifi, and the string length of the target name information;
Obtaining the first similarity list between the target name information and the corresponding candidate wifi according to the first edit distance similarity and the first common string similarity.
The edit distance is the number of single-character editing steps needed to change one string into another, and it characterizes how similar two strings are. The common string length is the length of the string shared by two strings, and it also characterizes how similar they are.
This embodiment therefore converts the target name information into a corresponding target name string, computes the edit distance and the common string length between the target name string and the name of each candidate wifi, and takes these as the first edit distance and the first common string length between the target name information and the name of each candidate wifi.
In one embodiment, the first edit distance similarity Similarity_lev satisfies:
Similarity_lev = 1 / (EditDistance + 1)
where EditDistance is the corresponding first edit distance.
In one embodiment, the first common string similarity Similarity_common satisfies:
Similarity_common = LengthO / max(Length1, Length2)
where Length1 is the string length of the corresponding target name information, Length2 is the string length of the name of the corresponding candidate wifi, LengthO is the length of the corresponding first common string, and max() is the function that returns the maximum value.
In one embodiment, the first similarity Similarity1 satisfies:
Similarity1 = ψ1 × Similarity_lev + ψ2 × Similarity_common
where ψ1 is the weight of the edit distance similarity and ψ2 is the weight of the common string similarity.
Combining the edit distance and the common string length to characterize the first similarity between the target name information and the corresponding candidate wifi improves the accuracy of the first similarity.
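The three formulas above can be sketched as follows. Two points are assumptions rather than statements from the text: the edit distance is taken to be the Levenshtein distance, and the common string is taken to be the longest common contiguous substring. The default weights of 0.5 are placeholders for ψ1 and ψ2.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def common_string_length(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, cur[j])
        prev = cur
    return best


def similarity(target, wifi_name, psi1=0.5, psi2=0.5):
    """Similarity = psi1 * Similarity_lev + psi2 * Similarity_common."""
    if not target or not wifi_name:
        return 0.0  # guard against division by zero for empty strings
    sim_lev = 1 / (edit_distance(target, wifi_name) + 1)
    sim_common = common_string_length(target, wifi_name) / max(len(target), len(wifi_name))
    return psi1 * sim_lev + psi2 * sim_common
```

The same function can be applied unchanged to the target mailbox address information and the target website information to produce the second and third similarities.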
In a specific embodiment, S40 further includes the following steps:
Acquiring a second edit distance and a second common string length between the target mailbox address information and the name of each candidate wifi;
Obtaining a second edit distance similarity between the target mailbox address information and the name of the corresponding candidate wifi according to the second edit distance;
Obtaining a second common string similarity between the target mailbox address information and the name of the corresponding candidate wifi according to the second common string length, the string length of the name of the corresponding candidate wifi, and the string length of the target mailbox address information;
Obtaining the second similarity list between the target mailbox address information and the corresponding candidate wifi according to the second edit distance similarity and the second common string similarity.
In a specific embodiment, S40 further includes the following steps:
Acquiring a third edit distance and a third common string length between the target website information and the name of each candidate wifi;
Obtaining a third edit distance similarity between the target website information and the name of the corresponding candidate wifi according to the third edit distance;
Obtaining a third common string similarity between the target website information and the name of the corresponding candidate wifi according to the third common string length, the string length of the name of the corresponding candidate wifi, and the string length of the target website information;
Obtaining the third similarity list between the target website information and the corresponding candidate wifi according to the third edit distance similarity and the third common string similarity.
S50, obtaining the target similarity between the target object and each candidate wifi according to the similarity list set and a preset priority list.
The preset priority list includes a first priority corresponding to the first similarity, a second priority corresponding to the second similarity, and a third priority corresponding to the third similarity. The target similarity between the target object and each candidate wifi, obtained from the similarity list set and the preset priority list, characterizes the degree of association between the target object and that candidate wifi.
The specific values of the first priority, the second priority and the third priority can be set by an implementer according to actual situations.
This step takes into account how important the first similarity (for the target name information), the second similarity (for the target mailbox address information) and the third similarity (for the target website information) each are when measuring the degree of association between the target object and a candidate wifi. Each preset priority is used as the weight of the corresponding similarity to obtain the target similarity between the target object and each candidate wifi, which improves the accuracy of the target similarity.
S60, determining each candidate wifi whose target similarity is greater than a preset similarity threshold as a target wifi.
The larger the target similarity, the higher the degree of association between the target object and the corresponding candidate wifi. Each candidate wifi whose target similarity is greater than the preset similarity threshold is therefore determined as a target wifi.
The specific value of the preset similarity threshold may be set by the implementer according to the actual situation.
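Steps S50 and S60 can be sketched as follows. The text states that each preset priority acts as the weight of its similarity; combining the weighted terms by summation is an assumption, as are the function names, the example priorities and the example threshold.

```python
def target_similarity(similarities, priorities):
    """Weight the (first, second, third) similarities by their priorities and sum."""
    return sum(p * s for p, s in zip(priorities, similarities))


def select_target_wifi(similarity_sets, priorities, threshold):
    """Keep the candidate wifi whose target similarity exceeds the threshold."""
    return [name for name, sims in similarity_sets.items()
            if target_similarity(sims, priorities) > threshold]


# Hypothetical similarity list set: wifi name -> (first, second, third) similarity.
similarity_sets = {
    "acme-office": (0.9, 0.6, 0.8),
    "cafe-free": (0.1, 0.0, 0.1),
}
priorities = (0.5, 0.2, 0.3)  # placeholder first, second and third priorities

targets = select_target_wifi(similarity_sets, priorities, threshold=0.5)
```

Here "acme-office" scores about 0.81 and is kept, while "cafe-free" scores about 0.08 and is discarded.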
In summary, the number of intermediate users corresponding to the target object and the target name keyword are obtained from the target name information and the target address information to characterize the size of the target object. This determines the geohash level of the target object and hence the length of the geohash string, so that the precision of the geohash string matches the size of the target object, which improves the accuracy of the geohash string obtained from the geohash level and the target address information. The wifi within the geographic area corresponding to the geohash string is determined as candidate wifi, and, according to the target information list and the names of all candidate wifi, the first similarity between the target name information and the name of each candidate wifi, the second similarity between the target mailbox address information and the name of each candidate wifi, and the third similarity between the target website information and the name of each candidate wifi are obtained. These three dimensions of similarity comprehensively characterize the association relationship between the target object and each candidate wifi and improve the accuracy with which it is characterized. Finally, the target similarity between the target object and each candidate wifi is obtained according to the similarity list set and the preset priority list, and each candidate wifi whose target similarity is greater than the preset similarity threshold is determined as a target wifi, which improves the screening accuracy of the target wifi.
Example II
On the basis of the first embodiment, the second embodiment provides a target user screening method, where the target user screening method includes the following steps, as shown in fig. 2:
S1, obtaining first candidate users corresponding to a target object and the wifi interaction set corresponding to each first candidate user.
The target object may be any entity with a certain number of employees that requires employee screening, such as a company, an enterprise, a studio or a factory. The wifi interaction set may include a plurality of target wifi, where a target wifi is a wifi used by the target object and each target wifi corresponds uniquely to the target object. A first candidate user may be a user who has an interaction behavior with a target wifi corresponding to the target object, where the interaction behavior may be a scanning behavior or a connection behavior. The target user screening method screens the employees of the target object out of the first candidate users and takes them as the target users.
Any user within a certain range of a wifi can scan it, so a first candidate user who scans a target wifi may be an employee of the target object, an employee of another object near the target object, or a user passing by the geographic location of the target object. Meanwhile, an employee of the target object may scan and connect to a target wifi, or may scan it without connecting.
The screening of the target user in the target object cannot be accurately completed according to the scanning interaction behavior between the first candidate user and the target wifi alone or according to the connection interaction behavior between the first candidate user and the target wifi alone, so that in order to improve the screening accuracy of the target user, the embodiment acquires the wifi interaction set corresponding to each first candidate user, and completes the screening of the target user by combining the scanning interaction behavior and the connection interaction behavior between the first candidate user and the target wifi.
In a specific embodiment, the wifi interaction set comprises a wifi scanning subset and a wifi connection subset. The wifi scanning subset comprises the target wifi scanned by the corresponding first candidate user and the number of scans between that user and each scanned target wifi; the wifi connection subset comprises the target wifi connected by the corresponding first candidate user and the number of connections between that user and each connected target wifi.
The greater the number of scans and the number of connections between a first candidate user and the target wifi, the closer the association relationship between that first candidate user and the target object. Therefore, in addition to the target wifi scanned and the target wifi connected by each first candidate user, the number of scans between each first candidate user and each scanned target wifi and the number of connections between each first candidate user and each connected target wifi are also obtained, so that the interaction relationship between the first candidate users and the target wifi is fully represented.
According to the method, the wifi interaction set corresponding to each first candidate user is obtained to represent the scanning interaction behavior and the connection interaction behavior between the first candidate users and the target wifi, the scanning interaction behavior and the connection interaction behavior are used as a basis for analyzing the association relationship between the first candidate users and the target object, a data basis is provided for screening the target users, and therefore screening accuracy of the target users is improved.
S2, screening and obtaining first-type key users corresponding to the target object from all first candidate users according to the wifi interaction set.
In a specific embodiment, S2 further comprises the following steps:
acquiring a first quantity of target wifi scanned by each first candidate user according to the wifi scanning subset corresponding to each first candidate user;
Obtaining a second number of target wifi connected by each first candidate user according to the wifi connection subset corresponding to each first candidate user;
Obtaining the selection probability corresponding to each first candidate user according to the first duty ratio of the first quantity corresponding to each first candidate user in the total quantity of the target wifi, the second duty ratio of the second quantity in the total quantity of the target wifi, the sum of the corresponding scanning times, the sum of the corresponding connection times, the first weight corresponding to the first duty ratio, the second weight corresponding to the second duty ratio, the third weight corresponding to the scanning times and the fourth weight corresponding to the connection times;
And when the selection probability is larger than a preset selection probability threshold value, determining the first candidate users corresponding to the selection probability as the first type key users.
The greater the number of scans and connections between a first candidate user and the target wifi, the closer the association relationship between that user and the target object. Therefore, a preset selection probability formula is applied to the first duty ratio of the first number in the total number of target wifi, the second duty ratio of the second number in the total number of target wifi, the sum of scanning times, the sum of connection times and the corresponding weights, yielding the selection probability of each first candidate user, which represents the probability that the user is selected as a first type key user of the target object.
In a specific embodiment, the preset selection probability formula is $P = \gamma_1 \cdot (x_1/x_0) + \gamma_2 \cdot (x_2/x_0) + \gamma_3 \cdot y_1 + \gamma_4 \cdot y_2$, where $x_1$ refers to the first number corresponding to the first candidate user, $x_2$ refers to the second number corresponding to the first candidate user, $x_0$ refers to the total number of target wifi, $y_1$ refers to the sum of scanning times corresponding to the first candidate user, $y_2$ refers to the sum of connection times corresponding to the first candidate user, $\gamma_1$ refers to the first weight, $\gamma_2$ refers to the second weight, $\gamma_3$ refers to the third weight, and $\gamma_4$ refers to the fourth weight.
It will be appreciated that $\gamma_1 < \gamma_2$ and $\gamma_3 < \gamma_4$, and that the specific values of $\gamma_1$, $\gamma_2$, $\gamma_3$ and $\gamma_4$ may be set by the implementer according to the actual situation.
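The selection probability formula and the threshold comparison of S2 can be sketched as follows. This is a minimal illustration only: the function names, the dictionary layout of the wifi scanning and connection subsets, and the default weight values are assumptions made for the example and are not specified by this embodiment (the weights are merely chosen so that the first weight is smaller than the second and the third smaller than the fourth).

```python
def selection_probability(scan_counts, connect_counts, total_target_wifi,
                          weights=(0.1, 0.3, 0.01, 0.02)):
    """Compute P = g1*(x1/x0) + g2*(x2/x0) + g3*y1 + g4*y2 for one first candidate user.

    scan_counts / connect_counts map a target wifi identifier to the number of
    scans / connections by this user (hypothetical layout of the two subsets).
    The default weights are illustrative values, not values from the source.
    """
    if total_target_wifi <= 0:
        raise ValueError("total_target_wifi must be positive")
    g1, g2, g3, g4 = weights
    x1 = len(scan_counts)              # first number: target wifi scanned
    x2 = len(connect_counts)           # second number: target wifi connected
    y1 = sum(scan_counts.values())     # sum of scanning times
    y2 = sum(connect_counts.values())  # sum of connection times
    return (g1 * x1 / total_target_wifi + g2 * x2 / total_target_wifi
            + g3 * y1 + g4 * y2)


def first_type_key_users(interaction_sets, total_target_wifi, threshold):
    """Keep the users whose selection probability exceeds the preset threshold."""
    return {
        user
        for user, (scans, connects) in interaction_sets.items()
        if selection_probability(scans, connects, total_target_wifi) > threshold
    }


interaction_sets = {
    "u1": ({"w1": 5, "w2": 3}, {"w1": 4}),  # scans and connects frequently
    "u2": ({"w1": 1}, {}),                  # scanned once, never connected
}
key_users = first_type_key_users(interaction_sets, total_target_wifi=2, threshold=0.2)
```

Because the last two terms are unnormalized counts, the value of P is not bounded by 1 under this formula, so the threshold has to be chosen together with the weights.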
In the above, the selection probability of each first candidate user is obtained from the wifi interaction set, based on the first duty ratio of the first number of target wifi scanned by the user in the total number of target wifi, the second duty ratio of the second number of target wifi connected by the user in the total number of target wifi, the sum of scanning times, the sum of connection times and the corresponding weights. The selection probability represents the probability that the first candidate user is selected as a first type key user of the target object, which improves the accuracy of the screened first type key users.
S3, acquiring an IP list corresponding to each first type key user and a user list corresponding to each candidate IP, wherein the IP list comprises a plurality of candidate IPs corresponding to each first type key user, and the user list comprises a plurality of second candidate users corresponding to each candidate IP.
The candidate IP corresponding to a first type key user may identify a device of that first type key user, and may therefore serve as an identity of the corresponding first type key user.
A second candidate user is a user related to the corresponding candidate IP; the second candidate users may include users who are also first candidate users.
By acquiring the IP list corresponding to each first type key user and the user list corresponding to each candidate IP, the association relationship between the first type key users and the candidate IPs can be analyzed. Candidate IPs closely associated with the target object are screened as target IPs according to the known first type key users, second candidate users closely associated with the target object are then screened as second type key users according to the known target IPs, and so on until the stop condition for screening target users is met, completing the screening task of the target users.
S4, screening out the first type target IPs corresponding to the target object from all candidate IPs according to the IP lists corresponding to all first type key users.
The first type of target IP is the IP which is screened from the candidate IP and has a close association relationship with the target object.
In a specific embodiment, S4 further includes the following steps:
Acquiring a third number of the first type key users corresponding to each candidate IP according to the IP list corresponding to all the first type key users;
Obtaining a third duty ratio of a third number in the total number of the first type key users;
And when the third duty ratio is larger than a preset first quantity duty ratio threshold value, determining the candidate IP corresponding to the third duty ratio as the first type target IP.
The larger the third duty ratio of the third number of the first type key users corresponding to each candidate IP in the total number of the first type key users is, the closer the association relation between the corresponding candidate IP and the target object is represented, so that the first type target IP is obtained through screening according to the comparison of the third duty ratio and the preset first number duty ratio threshold.
According to the third duty ratio of the third number of the first type key users corresponding to each candidate IP in the total number of the first type key users, and the comparison of the third duty ratio and the preset first number duty ratio threshold, the first type target IP is obtained through screening, and is used as a basis for updating the key users and further updating the target IP, so that screening accuracy of the key users and the target IP is improved.
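A minimal sketch of the S4 screening step is given below, assuming the IP lists are available as a mapping from each first type key user to the candidate IPs of that user. The function name `screen_first_type_target_ips` and the data layout are hypothetical and only illustrate the third duty ratio comparison.

```python
from collections import Counter


def screen_first_type_target_ips(ip_lists, ratio_threshold):
    """Return candidate IPs whose third duty ratio exceeds the threshold.

    ip_lists maps each first type key user to an iterable of candidate IPs
    (assumed layout). The third number of an IP is the count of key users
    whose IP list contains it; the third duty ratio divides that count by
    the total number of first type key users.
    """
    total_key_users = len(ip_lists)
    if total_key_users == 0:
        return set()
    # set(ips) ensures a user is counted at most once per candidate IP.
    third_number = Counter(ip for ips in ip_lists.values() for ip in set(ips))
    return {
        ip
        for ip, count in third_number.items()
        if count / total_key_users > ratio_threshold
    }


ip_lists = {
    "u1": ["ip1", "ip2"],
    "u2": ["ip1"],
    "u3": ["ip1", "ip3"],
}
target_ips = screen_first_type_target_ips(ip_lists, ratio_threshold=0.5)
```

In this example "ip1" is shared by all three key users (ratio 1.0) and is kept, while "ip2" and "ip3" each appear for one user only (ratio about 0.33) and are dropped.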
S5, screening and obtaining second type key users corresponding to the target objects from all second candidate users according to user lists corresponding to all first type target IPs, wherein the number of the second type key users is greater than or equal to that of the first type key users.
The second type key users are users screened from the second candidate users and have close association relation with the target object, and the second type key users comprise all first type key users.
In a specific embodiment, S5 further includes the following steps:
Acquiring a fourth number of first type target IPs corresponding to each second candidate user except the first type key user according to the user list corresponding to all the first type target IPs;
acquiring a fourth duty ratio of a fourth number in the total number of the first type target IPs;
When the fourth duty ratio is larger than a preset second quantity duty ratio threshold value, determining second candidate users corresponding to the fourth duty ratio as first class intermediate candidate users;
And determining the first type key users and the first type intermediate candidate users as the second type key users.
The larger the fourth duty ratio of the fourth number of the first type target IPs corresponding to each second candidate user in the total number of the first type target IPs, the closer the association relationship between the corresponding second candidate user and the target object is represented, so that the first type intermediate candidate users are obtained through screening according to the comparison of the fourth duty ratio and a preset second number duty ratio threshold value, the second type key users are obtained through further combination with the first type key users, and the number of the second type key users is larger than or equal to that of the first type key users.
According to the fourth duty ratio of the fourth number of the first type target IPs corresponding to each second candidate user in the total number of the first type target IPs, the fourth duty ratio is compared with the preset second number duty ratio threshold value, the first type intermediate candidate users are obtained through screening, and serve as bases for updating the target IPs and further updating the key users, so that screening accuracy of the key users and the target IPs is improved, the expansion of the key users is completed, and screening accuracy of the target users is improved.
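The S5 expansion step can be sketched in the same way, assuming the user lists are available as a mapping from each first type target IP to its second candidate users. The function name `expand_key_users` and the data layout are hypothetical.

```python
from collections import Counter


def expand_key_users(key_users, user_lists, ratio_threshold):
    """Return the second type key users (first type key users plus newcomers).

    user_lists maps each first type target IP to an iterable of second
    candidate users (assumed layout). For every second candidate user that is
    not already a key user, the fourth number counts the target IPs whose user
    list contains the user; the fourth duty ratio divides it by the total
    number of first type target IPs.
    """
    key_users = set(key_users)
    total_target_ips = len(user_lists)
    if total_target_ips == 0:
        return key_users
    fourth_number = Counter(
        user
        for users in user_lists.values()
        for user in set(users)
        if user not in key_users
    )
    intermediate = {
        user
        for user, count in fourth_number.items()
        if count / total_target_ips > ratio_threshold
    }
    # The result always contains every first type key user.
    return key_users | intermediate


user_lists = {
    "ip1": ["u1", "u2", "u3", "u4"],
    "ip2": ["u1", "u3"],
}
second_type = expand_key_users({"u1", "u2"}, user_lists, ratio_threshold=0.5)
```

Taking the union with the existing key users is what guarantees that the number of second type key users is greater than or equal to the number of first type key users.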
S6, according to IP lists corresponding to all N-th type key users, N-th type target IPs corresponding to target objects are obtained through screening from all candidate IPs, wherein the number of the N-th type target IPs is greater than or equal to that of the N-1-th type target IPs, N is an integer greater than 1, and when N=2, the N-th type key users are consistent with the second type key users.
In a specific embodiment, S6 further includes the following steps:
Acquiring a fifth number of N-th type key users corresponding to each candidate IP except the N-1-th type target IP according to the IP list corresponding to all N-th type key users;
obtaining a fifth duty ratio of a fifth number in the total number of the N-th type key users;
When the fifth duty ratio is larger than a preset first quantity duty ratio threshold value, determining candidate IPs corresponding to the fifth duty ratio as N-1 class intermediate target IPs;
And determining the N-1 type target IP and the N-1 type intermediate target IP as the N type target IP.
After the N-th type key users are obtained, the N-th type target IPs corresponding to the target object are screened from all candidate IPs using the IP lists corresponding to the N-th type key users. The target IPs are thereby updated again and serve as the basis for the next update of the key users, which further improves the screening accuracy of the key users and the target IPs and completes another expansion of the target IPs, thereby improving the screening accuracy of the target users.
S7, screening and obtaining N+1th type key users corresponding to the target object from all second candidate users according to user lists corresponding to all N type target IPs, wherein the number of the N+1th type key users is greater than or equal to that of the N type key users.
In a specific embodiment, S7 further includes the following steps:
Acquiring a sixth number of N-type target IPs corresponding to each second candidate user except the N-type key user according to the user list corresponding to all N-type target IPs;
Obtaining a sixth duty ratio of the sixth number in the total number of the nth class of target IPs;
When the sixth duty ratio is larger than a preset second quantity duty ratio threshold value, determining a second candidate user corresponding to the sixth duty ratio as an N-th intermediate candidate user;
and determining the N-type key users and the N-type intermediate candidate users as N+1-type key users.
After the N-th type target IPs are obtained, the N+1th type key users corresponding to the target object are screened from all second candidate users using the user lists corresponding to the N-th type target IPs. The key users are thereby updated again and serve as the basis for the next update of the target IPs, which further improves the screening accuracy of the key users and the target IPs and completes another expansion of the key users, thereby improving the screening accuracy of the target users.
S8, when the number of the N+1th type key users is equal to the number of the N type key users, determining the N+1th type key users as target users.
When the number of N+1th type key users is equal to the number of N-th type key users, no key users were added when the N+1th type key users were acquired, which indicates that the number of key users has reached its upper limit. The N+1th type key users therefore include all key users corresponding to the target object and are determined to be the target users.
In the above, when the screening condition is met, the number of key users has reached its upper limit, so the target users are obtained at that point, which improves the acquisition accuracy of the target users.
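Steps S4 to S8 together form an alternating expansion that stops once no key user is added. The sketch below illustrates that loop under stated assumptions: `user_ips` (user to candidate IPs) and `ip_users` (candidate IP to second candidate users) are hypothetical lookup tables, and the helper `_share_filter` is an illustrative name rather than part of the method.

```python
from collections import Counter


def _share_filter(groups, excluded, threshold):
    """Items (outside `excluded`) contained in more than `threshold` of the groups."""
    total = len(groups)
    if total == 0:
        return set()
    counts = Counter(
        item for group in groups for item in set(group) if item not in excluded
    )
    return {item for item, count in counts.items() if count / total > threshold}


def screen_target_users(first_key_users, user_ips, ip_users,
                        ip_threshold, user_threshold):
    """Alternate target IP expansion (S4/S6) and key user expansion (S5/S7)."""
    key_users = set(first_key_users)
    target_ips = set()
    while True:
        # S4/S6: add candidate IPs shared by enough of the current key users.
        ip_groups = [user_ips.get(user, ()) for user in key_users]
        target_ips |= _share_filter(ip_groups, target_ips, ip_threshold)
        # S5/S7: add second candidate users seen on enough of the target IPs.
        user_groups = [ip_users.get(ip, ()) for ip in target_ips]
        expanded = key_users | _share_filter(user_groups, key_users, user_threshold)
        # S8: stop when the number of key users no longer grows.
        if len(expanded) == len(key_users):
            return expanded
        key_users = expanded


user_ips = {
    "u1": ["ip1"],
    "u2": ["ip1", "ip2"],
    "u3": ["ip1", "ip2"],
    "u4": ["ip9"],
}
ip_users = {
    "ip1": ["u1", "u2", "u3"],
    "ip2": ["u2", "u3"],
    "ip9": ["u4"],
}
target_users = screen_target_users({"u1", "u2"}, user_ips, ip_users, 0.6, 0.6)
```

The loop terminates because the set of key users only grows and is bounded by the finite set of second candidate users.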
In a specific embodiment, the target user screening method further includes the following steps:
Acquiring the number of intermediate users of a target object;
Acquiring the expansion efficiency corresponding to the N+1th type key users according to the number of the N-th type key users, the number of the N+1th type key users and the number of intermediate users;
And when the number of the N+1th class key users is larger than the number of the N class key users and the expansion efficiency is smaller than a preset efficiency threshold, determining the N+1th class key users as target users.
As N grows, fewer key users are added by each expansion. Therefore, to improve the screening efficiency of the target users and save screening cost, the expansion efficiency corresponding to the N+1th type key users is calculated from the number of N-th type key users, the number of N+1th type key users and the number of intermediate users using a preset expansion efficiency calculation formula; when the expansion efficiency is smaller than a preset efficiency threshold, the expansion of the key users is stopped and the N+1th type key users are determined as the target users.
The intermediate user may refer to a participating user in the target object.
The preset efficiency threshold may be set by an implementer according to actual situations.
In a specific embodiment, the preset expansion efficiency calculation formula is $\eta = (Q_{N+1} - Q_N)/(2 \times \theta)$, where $Q_{N+1}$ is the number of N+1th type key users, $Q_N$ is the number of N-th type key users, and $\theta$ is the number of intermediate users of the target object.
According to the number of the N-th type key users, the number of the N+1-th type key users and the number of the intermediate users, the corresponding expansion efficiency of the N+1-th type key users is calculated, and when the expansion efficiency is smaller than the preset efficiency threshold, the expansion of the key users is stopped, so that the screening efficiency of target users is improved, and the screening cost is saved.
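The early-stop rule based on the expansion efficiency can be sketched as follows. The function names and the sample numbers are illustrative assumptions; only the formula and the two stop conditions come from the description above.

```python
def expansion_efficiency(q_next, q_curr, intermediate_users):
    """eta = (Q_{N+1} - Q_N) / (2 * theta)."""
    if intermediate_users <= 0:
        raise ValueError("intermediate_users must be positive")
    return (q_next - q_curr) / (2 * intermediate_users)


def should_stop(q_next, q_curr, intermediate_users, efficiency_threshold):
    """Stop when no key user was added (S8) or when the expansion is too slow."""
    if q_next == q_curr:
        return True
    eta = expansion_efficiency(q_next, q_curr, intermediate_users)
    return eta < efficiency_threshold


# 5 key users added for a target object with 50 intermediate users.
eta = expansion_efficiency(105, 100, 50)
```

With a hypothetical efficiency threshold of 0.1, adding 5 users (efficiency 0.05) would stop the expansion, whereas adding 30 users (efficiency 0.3) would let it continue.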
According to the wifi interaction set, the first type key users corresponding to the target objects are obtained through screening from all the first candidate users, the first type target IPs corresponding to the target objects are obtained through screening from all the candidate IPs according to the IP lists corresponding to all the first type key users, and the second type key users corresponding to the target objects are obtained through screening from all the second candidate users according to the user lists corresponding to all the first type target IPs, so that the screening accuracy of the key users and the target IPs is improved as a basis for further updating the target IPs and further updating the key users; according to the IP lists corresponding to all N-th key users, N-th target IP corresponding to the target object is obtained through screening from all candidate IPs, and according to the user lists corresponding to all N-th target IPs, N+1-th key users corresponding to the target object are obtained through screening from all second candidate users, and the N+1-th key users are used as the basis for further updating the target IP and further updating the key users, so that the screening accuracy of the key users and the target IP is further improved, and the expansion of the target IP and the key users is completed again; when the number of the N+1st type key users is equal to the number of the N type key users, the N+1st type key users are determined to be target users, and when the number of the key users reaches the upper limit, the corresponding target users are obtained, so that the screening accuracy of the target users is improved.
Example III
On the basis of the first embodiment and the second embodiment, the third embodiment provides a target object screening method, where the target object screening method includes the following steps, as shown in fig. 3:
S100, acquiring a wifi connection information list $A = \{A_1, A_2, \ldots, A_i, \ldots, A_m\}$ corresponding to the target users, where $A_i = \{A_{i1}, A_{i2}, \ldots, A_{ij}, \ldots, A_{in(i)}\}$ and $A_{ij} = \{A_{ij}^{1}, A_{ij}^{2}, A_{ij}^{3}\}$; $A_{ij}^{1}$ refers to the j-th preset wifi connected by the i-th target user; $A_{ij}^{2} = \{A_{ij}^{21}, A_{ij}^{22}, \ldots, A_{ij}^{2h}, \ldots, A_{ij}^{2r(ij)}\}$, where $A_{ij}^{2h}$ refers to the connection time point of the h-th connection between the i-th target user and $A_{ij}^{1}$; $A_{ij}^{3} = \{A_{ij}^{31}, A_{ij}^{32}, \ldots, A_{ij}^{3h}, \ldots, A_{ij}^{3r(ij)}\}$, where $A_{ij}^{3h}$ refers to the connection duration of the h-th connection between the i-th target user and $A_{ij}^{1}$; $i = 1, 2, \ldots, m$, where m refers to the number of target users; $j = 1, 2, \ldots, n(i)$, where n(i) refers to the number of wifi connected by the i-th target user; $h = 1, 2, \ldots, r(ij)$, where r(ij) refers to the number of times the i-th target user connected to the j-th wifi.
The target user may refer to an employee of an initial target object, where the initial target object may be a company, an enterprise, a studio, a factory or a similar entity with a certain number of employees for which object screening is required. A preset wifi is a wifi connected by a target user within a period of time. A preset object is a company having a certain association relationship with the target object; for example, it may be a subsidiary of the target object, or another company having frequent business dealings with the target object. The target object screening method can screen out the subsidiaries of the target object and other companies with which it has business dealings.
The connection time points and connection durations between a target user and a preset wifi can represent the association relationship between them, which may arise from daily life or from work. The preset wifi can therefore be screened according to this association relationship to pick out the wifi that have a work association with the target user, and all preset objects can then be classified according to the screened wifi, completing the screening task of the target object.
In the above, the wifi connection information list corresponding to the target users provides a data basis for screening target objects.
S200, according to A, a preset first working time range $[b_1, c_1]$ and a preset second working time range $[b_2, c_2]$, obtaining a target connection times list $B = \{B_1, B_2, \ldots, B_i, \ldots, B_m\}$ corresponding to A, where $B_i = \{B_{i1}, B_{i2}, \ldots, B_{ij}, \ldots, B_{in(i)}\}$ and the target connection times $B_{ij}$ corresponding to $A_{ij}^{1}$ is obtained through the following steps:
S210, when $A_{ij}^{2h}$ is within $[b_1, c_1]$ or within $[b_2, c_2]$, determining the h-th connection between the i-th target user and $A_{ij}^{1}$ as a target connection behavior corresponding to $A_{ij}^{1}$.
S220, traversing $A_{ij}^{2}$ to obtain $B_{ij} = S(ij)$, where $S(ij)$ refers to the total number of target connection behaviors corresponding to $A_{ij}^{1}$.
Whether a connection behavior between a target user and a preset wifi is a work-related target connection behavior can be judged by checking whether its connection time point falls within the preset first working time range $[b_1, c_1]$ or the preset second working time range $[b_2, c_2]$. The wifi connection behaviors of the target user are screened in this way: connection behaviors unrelated to work are eliminated, the target connection behaviors are retained, and their total number is obtained. This total serves as the basis for representing the work association between the preset wifi and the target user, so that the target object is screened by finding the service association wifi, which improves the screening accuracy of the target object.
The preset first working time range $[b_1, c_1]$ and the preset second working time range $[b_2, c_2]$ can be set by the implementer according to the actual situation. For example, the connection time point may be a time of day on a 24-hour clock, $b_1$ may be 8:00, $c_1$ may be 12:00, $b_2$ may be 14:00, and $c_2$ may be 18:00.
In the above, by comparing the connection time points between the target user and the preset wifi with $[b_1, c_1]$ and $[b_2, c_2]$, the work-related target connection behaviors are screened out of all connection behaviors between the target user and the preset wifi, and the total number of target connection behaviors is obtained. This eliminates the adverse effect of connection behaviors unrelated to work on screening the target object, thereby improving the screening accuracy of the target object.
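Steps S210 and S220 can be sketched as follows, using the example working time ranges 8:00 to 12:00 and 14:00 to 18:00 given above. The function names are hypothetical, and treating the range endpoints as inclusive is an assumption based on the closed-interval notation.

```python
from datetime import time

# Example working time ranges [b1, c1] and [b2, c2] from the description.
WORK_RANGES = [(time(8, 0), time(12, 0)), (time(14, 0), time(18, 0))]


def is_target_connection(time_point, ranges=WORK_RANGES):
    """S210: a connection is a target connection behavior if its time point
    falls inside any working time range (endpoints assumed inclusive)."""
    return any(start <= time_point <= end for start, end in ranges)


def target_connection_count(time_points, ranges=WORK_RANGES):
    """S220: count the target connection behaviors for one user and one wifi."""
    return sum(1 for t in time_points if is_target_connection(t, ranges))


# Connection time points of one target user to one preset wifi.
time_points = [time(9, 30), time(13, 0), time(15, 45), time(20, 0)]
b_ij = target_connection_count(time_points)
```

Here only the 9:30 and 15:45 connections fall inside a working time range, so the lunch-break and evening connections are not counted.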
S300, according to A and all target connection behaviors, obtaining a target connection duration list $C = \{C_1, C_2, \ldots, C_i, \ldots, C_m\}$ corresponding to A, where $C_i = \{C_{i1}, C_{i2}, \ldots, C_{ij}, \ldots, C_{in(i)}\}$ and the target connection duration $C_{ij}$ corresponding to $A_{ij}^{1}$ is equal to the sum of the $S(ij)$ connection durations of the $S(ij)$ target connection behaviors corresponding to $A_{ij}^{1}$.
From the target connection behaviors between each target user and each preset wifi and the connection duration of each target connection behavior, the target connection duration between each target user and each preset wifi can be obtained by summation. It represents the degree of work association between the target user and the preset wifi and excludes the connection durations of behaviors unrelated to work, which would otherwise adversely affect the screening of the target object, thereby improving the screening accuracy of the target object.
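A minimal sketch of S300 is shown below: the durations of the connections whose time points fall inside a working time range are summed. The function name, the use of minutes as the duration unit and the inclusive range endpoints are assumptions made for the example.

```python
from datetime import time


def target_connection_duration(time_points, durations, ranges):
    """Sum the durations of the connections that are target connection behaviors.

    time_points[h] and durations[h] describe the h-th connection between one
    target user and one preset wifi; ranges is a list of (start, end) pairs.
    """
    if len(time_points) != len(durations):
        raise ValueError("time_points and durations must have the same length")
    return sum(
        duration
        for point, duration in zip(time_points, durations)
        if any(start <= point <= end for start, end in ranges)
    )


ranges = [(time(8, 0), time(12, 0)), (time(14, 0), time(18, 0))]
time_points = [time(9, 30), time(13, 0), time(15, 45), time(20, 0)]
durations_min = [30, 45, 60, 120]  # connection durations in minutes (assumed unit)
c_ij = target_connection_duration(time_points, durations_min, ranges)
```

Only the 30-minute and 60-minute connections started inside a working time range, so the long evening connection does not inflate the target connection duration.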
S400, acquiring a service association wifi list $D = \{D_1, D_2, \ldots, D_i, \ldots, D_m\}$, where $D_i = \{D_i^{1}, D_i^{2}, \ldots, D_i^{v}, \ldots, D_i^{t(i)}\}$, $D_i^{v}$ refers to the v-th service association wifi corresponding to the i-th target user, $v = 1, 2, \ldots, t(i)$, and t(i) refers to the total number of service association wifi corresponding to the i-th target user.
The service association wifi refers to wifi which has service association with a target user and a target object, and is used as a basis for determining a service association company.
In one embodiment, D i is obtained by:
S410, according to $B_{ij}$ and $C_{ij}$, acquiring the service association degree $E_{ij} = \alpha_1 \cdot B_{ij} + \alpha_2 \cdot C_{ij}$ corresponding to the j-th preset wifi connected by the i-th target user, where $\alpha_1$ refers to a first preset priority and $\alpha_2$ refers to a second preset priority;
S420, if $E_{ij} > E_0$, determining $A_{ij}^{1}$ as a service association wifi corresponding to the i-th target user, where $E_0$ refers to a preset service association degree threshold;
S430, traversing $B_i$ and $C_i$ to obtain $D_i = \{D_i^{1}, D_i^{2}, \ldots, D_i^{v}, \ldots, D_i^{t(i)}\}$.
The greater the number of target connections and the longer the target connection duration between a target user and a preset wifi, the higher the service association degree between that preset wifi and the target user and target object. Therefore, the service association degree $E_{ij}$ is obtained from $B_{ij}$ and $C_{ij}$ in combination with the preset $\alpha_1$ and $\alpha_2$, and when $E_{ij} > E_0$, $A_{ij}^{1}$ is determined as a service association wifi corresponding to the i-th target user. This completes the screening of all service association wifi, which serves as the basis for further screening.
The specific value of the preset service association degree threshold $E_0$ may be set by the implementer according to the actual situation.
In the above, the target connection times and the target connection duration between a target user and a preset wifi are combined to represent the service association degree between the preset wifi and the target user and target object. By comparing the service association degree with the service association degree threshold, the wifi that have a service association with the target user and the target object are screened out of all preset wifi, which improves the screening accuracy of the service association wifi and, in turn, of the target object.
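Steps S410 to S430 for a single target user can be sketched as follows. The function name, the dictionary layout, the priority values and the threshold are illustrative assumptions; the description leaves all of these to the implementer.

```python
def service_association_wifi(counts, durations, alpha1, alpha2, e0):
    """Return the service association wifi list D_i of one target user.

    counts[wifi] is the target connection times B_ij and durations[wifi] is
    the target connection duration C_ij (assumed layout). A preset wifi is
    kept when E_ij = alpha1 * B_ij + alpha2 * C_ij exceeds the threshold e0.
    """
    selected = []
    for wifi, b_ij in counts.items():
        e_ij = alpha1 * b_ij + alpha2 * durations[wifi]  # S410
        if e_ij > e0:                                    # S420
            selected.append(wifi)
    return selected                                      # S430


counts = {"office-5G": 20, "cafe-guest": 2}           # B_ij per preset wifi
durations = {"office-5G": 80.0, "cafe-guest": 1.5}    # C_ij in hours (assumed unit)
d_i = service_association_wifi(counts, durations, alpha1=1.0, alpha2=0.5, e0=10.0)
```

Since $B_{ij}$ is a count and $C_{ij}$ is a duration, the two priorities also act as unit conversion factors, so they need to be chosen with the duration unit in mind.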
S500, classifying the preset object corresponding to each service association wifi into a first target object or a second target object according to D.
The first target object refers to a subsidiary of the target object, and the second target object refers to a company, other than a subsidiary, that has frequent business dealings with the target object.
In a specific embodiment, S500 further includes the following steps:
S510, according to D, obtaining the number of target users corresponding to each service association wifi;
S520, classifying each service association wifi as a first service association wifi or a second service association wifi according to the number of target users corresponding to each service association wifi;
S530, determining a preset object corresponding to the first service association wifi as a first target object;
S540, determining a preset object corresponding to the second service association wifi as a second target object.
In a specific embodiment, S520 further includes the following steps:
S521, when the number of target users corresponding to the service association wifi is larger than a first target user number threshold, classifying the corresponding service association wifi as a first service association wifi;
S522, when the number of target users corresponding to the service association wifi is smaller than or equal to the first target user number threshold and larger than the second target user number threshold, classifying the corresponding service association wifi as a second service association wifi.
Wherein the second target user number threshold is less than the first target user number threshold.
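Steps S510 to S522 can be sketched as follows, assuming D is available as a mapping from each target user to the service association wifi of that user. The function name `classify_service_wifi` and the threshold values in the example are hypothetical.

```python
from collections import Counter


def classify_service_wifi(service_wifi_by_user, first_threshold, second_threshold):
    """Split service association wifi into first and second classes.

    service_wifi_by_user maps each target user to that user's service
    association wifi list D_i (assumed layout). Wifi whose user count does not
    exceed the second threshold are left unclassified.
    """
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be smaller than first_threshold")
    # S510: number of target users per service association wifi.
    user_count = Counter(
        wifi for wifis in service_wifi_by_user.values() for wifi in set(wifis)
    )
    # S521: more users than the first threshold.
    first = {w for w, n in user_count.items() if n > first_threshold}
    # S522: between the second threshold (exclusive) and the first (inclusive).
    second = {
        w for w, n in user_count.items() if second_threshold < n <= first_threshold
    }
    return first, second


d = {
    "u1": ["wifi_a", "wifi_b"],
    "u2": ["wifi_a", "wifi_b"],
    "u3": ["wifi_a", "wifi_c"],
    "u4": ["wifi_a"],
}
first_wifi, second_wifi = classify_service_wifi(d, first_threshold=3, second_threshold=1)
```

In this example "wifi_a" is used by four target users and falls into the first class (a candidate subsidiary), "wifi_b" is used by two and falls into the second class, and "wifi_c" is used by one only and is left unclassified.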
In the above, by comparing the connection time points between the target user and the preset wifi with $[b_1, c_1]$ and $[b_2, c_2]$, the work-related target connection behaviors are screened out of all connection behaviors between the target user and the preset wifi and their total number is obtained, which eliminates the adverse effect of connection behaviors unrelated to work on screening the target object; according to the target connection behaviors and the connection duration corresponding to each target connection behavior, the target connection duration between each target user and each preset wifi is obtained to represent the degree of work association between them, which eliminates the adverse effect of the connection durations of behaviors unrelated to work; the target connection times and the target connection duration between the target user and the preset wifi are combined to represent the service association degree between the preset wifi and the target user and target object, and the wifi having a service association with the target user and the target object are screened out of all preset wifi by comparing the service association degree with the service association degree threshold, which improves the screening accuracy of the service association wifi; and the preset object corresponding to each service association wifi is classified as a first target object or a second target object, thereby improving the screening accuracy of the target objects.
Example four
On the basis of the first embodiment, the fourth embodiment provides a target wifi screening device, where the target wifi screening device includes, as shown in fig. 4:
The target information obtaining module 41 is configured to obtain a target information list corresponding to a target object, where the target information includes target name information, target mailbox address information, target website information, and target address information.
The character string obtaining module 42 is configured to obtain a geohash character string corresponding to the target object according to the target name information and the target address information.
The candidate wifi obtaining module 43 is configured to determine the wifi within the geographic area corresponding to the geohash character string as candidate wifi.
The similarity list obtaining module 44 is configured to obtain a similarity list set between the target object and each candidate wifi according to the target information list and the names of all candidate wifi, where the similarity list set includes a first similarity list between the target name information and the name of the corresponding candidate wifi, a second similarity list between the target mailbox address information and the name of the corresponding candidate wifi, and a third similarity list between the target website information and the name of the corresponding candidate wifi.
The target similarity obtaining module 45 is configured to obtain, according to the similarity list set and a preset priority list, a target similarity between the target object and each candidate wifi.
The target wifi screening module 46 is configured to determine, as the target wifi, a candidate wifi whose corresponding target similarity is greater than a preset similarity threshold.
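As an illustration of what the candidate wifi obtaining module 43 does, the following minimal sketch selects the wifi whose own geohash falls inside the cell named by the target object's geohash character string. It assumes each wifi record already carries a full-precision geohash; the `Wifi` record and the function `candidate_wifi` are hypothetical names. The sketch relies on a general property of geohashes: a cell contains exactly the points whose geohash starts with the cell's string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wifi:
    name: str
    geohash: str  # assumed: precomputed geohash of the access point location


def candidate_wifi(target_geohash: str, all_wifi: list[Wifi]) -> list[Wifi]:
    """Keep the wifi located inside the geohash cell of the target object."""
    return [w for w in all_wifi if w.geohash.startswith(target_geohash)]
```

With a level 5 string the cell is larger and more wifi are retained than with a level 7 string, which matches the idea of sizing the search area to the target object.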
In one embodiment, the target information obtaining module 41 includes the following sub-modules:
The preset information list acquisition sub-module is used for acquiring a preset information list corresponding to the target object, wherein the preset information list comprises preset name information, preset mailbox address information, preset website information and preset address information.
The data cleaning sub-module is used for performing data cleaning on the preset information list to obtain the target information list corresponding to the target object.
In one embodiment, the data cleaning sub-module includes the following units:
The preset data acquisition unit is used for acquiring a preset first name keyword list, a preset mailbox keyword list, a preset website keyword list and a preset character, wherein the preset first name keyword list comprises a plurality of preset first name keywords, the preset mailbox keyword list comprises a plurality of preset mailbox keywords, and the preset website keyword list comprises a plurality of preset website keywords.
The intermediate name information acquisition unit is used for eliminating all the first name keywords from the preset name information and acquiring the intermediate name information.
The target name information acquisition unit is used for performing character conversion on the intermediate name information according to a preset character conversion form to acquire the target name information.
The intermediate mailbox address information acquisition unit is used for eliminating all mailbox keywords from preset mailbox address information and acquiring the intermediate mailbox address information.
The target mailbox address information acquisition unit is used for replacing the corresponding preset character in the intermediate mailbox address information with a null character to acquire the target mailbox address information.
The intermediate website information acquisition unit is used for removing all website keywords from preset website information to acquire intermediate website information.
The target website information acquisition unit is used for replacing the corresponding preset character in the intermediate website information with a null character to acquire the target website information.
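The cleaning units above can be sketched as follows. This is only an illustration under stated assumptions: the keyword lists and preset characters in the usage example are invented, the "preset character conversion form" is assumed to be lowercasing, and replacing a preset character with a null character is read as deleting it. The helper names `remove_keywords`, `clean_name` and `clean_address_like` are hypothetical.

```python
def remove_keywords(text: str, keywords: list[str]) -> str:
    """Delete every keyword occurrence, longest keywords first."""
    for kw in sorted(keywords, key=len, reverse=True):
        text = text.replace(kw, "")
    return text


def clean_name(preset_name: str, first_name_keywords: list[str]) -> str:
    """Preset name -> intermediate name -> target name."""
    intermediate = remove_keywords(preset_name, first_name_keywords)
    # Assumed conversion form: lowercase and trim surrounding whitespace.
    return intermediate.lower().strip()


def clean_address_like(preset_value: str, keywords: list[str],
                       preset_chars: str) -> str:
    """Shared cleaning for mailbox address and website information."""
    intermediate = remove_keywords(preset_value, keywords)
    for ch in preset_chars:
        # "Replacing with a null character" is read here as deletion.
        intermediate = intermediate.replace(ch, "")
    return intermediate
```

For example, under these assumptions `clean_address_like("hr@acme-tech.com", ["hr@", ".com"], "-_.")` yields a string that can be compared directly with a wifi name such as `acmetech-5g`.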
In one embodiment, the character string obtaining module 42 includes the following sub-modules:
The intermediate user quantity acquisition sub-module is used for acquiring the number of intermediate users corresponding to the target object according to the target name information.
The keyword extraction sub-module is used for obtaining the target name keyword in the target name information according to a preset keyword extraction algorithm.
The first level obtaining sub-module is configured to determine that the geohash level corresponding to the target object is level 7 when the target name keyword is consistent with a preset second name keyword or the number of intermediate users is less than or equal to a preset first number threshold.
The second level obtaining sub-module is configured to determine that the geohash level corresponding to the target object is level 5 when the target name keyword is consistent with a preset third name keyword or the number of intermediate users is greater than or equal to a preset second number threshold.
The third level obtaining sub-module is configured to determine that the geohash level corresponding to the target object is level 6 when the target name keyword is inconsistent with the preset second name keyword, the target name keyword is inconsistent with the preset third name keyword, the number of intermediate users is greater than the preset first number threshold, and the number of intermediate users is less than the preset second number threshold.
The character string obtaining sub-module is used for obtaining the geohash character string corresponding to the target object according to the geohash level and the target address information.
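The level selection and the generation of the geohash character string can be sketched as below. Several points are assumptions made for illustration: the second and third name keywords are modelled as sets, the level 7 condition is checked before the level 5 condition when both could apply, and the target address information is taken to be already resolved to a latitude and longitude. The encoder is the standard public geohash algorithm, not a procedure specified by this document; the names `geohash_level` and `geohash_encode` are hypothetical.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_level(name_keyword: str, user_count: int,
                  second_keywords: set[str], third_keywords: set[str],
                  first_threshold: int, second_threshold: int) -> int:
    """Pick the geohash level (string length) from object size indicators."""
    if name_keyword in second_keywords or user_count <= first_threshold:
        return 7  # small object, small cell
    if name_keyword in third_keywords or user_count >= second_threshold:
        return 5  # large object, large cell
    return 6


def geohash_encode(lat: float, lon: float, level: int) -> str:
    """Standard geohash encoding; `level` is the number of characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars: list[str] = []
    bits, bit_count, use_lon = 0, 0, True  # bits alternate lon, lat, lon, ...
    while len(chars) < level:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)
```

Because a shorter geohash is a prefix of a longer one for the same point, lowering the level from 7 to 5 widens the search area without moving its anchor.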
In one embodiment, the similarity list obtaining module 44 includes the following sub-modules:
The first distance obtaining sub-module is used for obtaining a first editing distance and a first public character string length between the target name information and the name of each candidate wifi according to the target name information and the name of each candidate wifi.
The first editing distance similarity obtaining sub-module is used for obtaining the first editing distance similarity between the target name information and the corresponding candidate wifi name according to the first editing distance.
The first public character string similarity obtaining sub-module is used for obtaining the first public character string similarity between the target name information and the corresponding candidate wifi name according to the first public character string length, the character string length corresponding to the corresponding candidate wifi name and the character string length corresponding to the target name information.
The first similarity list obtaining sub-module is used for obtaining a first similarity list between the target name information and the corresponding candidate wifi according to the first editing distance similarity and the first public character string similarity.
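A minimal sketch of the two similarities that make up the first similarity list follows. The exact formulas are not given in this part of the text, so both normalisations are assumptions: the editing distance similarity is taken as one minus the Levenshtein distance divided by the longer string length, and the public character string similarity is taken as twice the longest common substring length divided by the sum of the two string lengths. All function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def common_substring_length(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            run = prev[j - 1] + 1 if ca == cb else 0
            cur.append(run)
            best = max(best, run)
        prev = cur
    return best


def edit_distance_similarity(a: str, b: str) -> float:
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def common_substring_similarity(a: str, b: str) -> float:
    total = len(a) + len(b)
    return 0.0 if total == 0 else 2 * common_substring_length(a, b) / total


def similarity_list(target_info: str, wifi_name: str) -> list[float]:
    """One similarity list for a (target field, candidate wifi name) pair."""
    return [edit_distance_similarity(target_info, wifi_name),
            common_substring_similarity(target_info, wifi_name)]
```

The same `similarity_list` function can be applied unchanged to the cleaned mailbox address and website information to produce the second and third similarity lists.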
In a specific embodiment, the similarity list obtaining module 44 further includes the following sub-modules:
The second distance obtaining sub-module is used for obtaining a second editing distance and a second public character string length between the target mailbox address information and the name of each candidate wifi according to the target mailbox address information and the name of each candidate wifi.
The second editing distance similarity obtaining sub-module is used for obtaining the second editing distance similarity between the target mailbox address information and the name of the corresponding candidate wifi according to the second editing distance.
The second public character string similarity obtaining sub-module is used for obtaining the second public character string similarity between the target mailbox address information and the names of the corresponding candidate wifi according to the second public character string length, the character string length corresponding to the names of the corresponding candidate wifi and the character string length corresponding to the target mailbox address information.
The second similarity list obtaining sub-module is configured to obtain a second similarity list between the target mailbox address information and the corresponding candidate wifi according to the second edit distance similarity and the second public character string similarity.
In a specific embodiment, the similarity list obtaining module 44 further includes the following sub-modules:
The third distance obtaining sub-module is used for obtaining a third editing distance and a third public character string length between the target website information and the name of each candidate wifi according to the target website information and the name of each candidate wifi.
The third editing distance similarity obtaining sub-module is used for obtaining the third editing distance similarity between the target website information and the name of the corresponding candidate wifi according to the third editing distance.
The third public character string similarity obtaining sub-module is used for obtaining the third public character string similarity between the target website information and the corresponding candidate wifi name according to the third public character string length, the character string length corresponding to the corresponding candidate wifi name and the character string length corresponding to the target website information.
The third similarity list obtaining sub-module is configured to obtain a third similarity list between the target website information and the corresponding candidate wifi according to the third edit distance similarity and the third public character string similarity.
It should be noted that, because the content of information interaction and execution process between the modules and the embodiment of the method of the present invention are based on the same concept, specific functions and technical effects thereof may be referred to in the method embodiment section, and details thereof are not repeated herein.
Example five
A fifth embodiment of the present invention provides a non-transitory computer readable storage medium having at least one instruction or at least one program stored therein, the at least one instruction or the at least one program loaded and executed by a processor to implement the steps of:
S10, acquiring a target information list corresponding to a target object, wherein the target information comprises target name information, target mailbox address information, target website information and target address information.
S20, acquiring a geohash character string corresponding to the target object according to the target name information and the target address information.
S30, determining the wifi within the geographic area corresponding to the geohash character string as candidate wifi.
S40, obtaining a similarity list set between the target object and each candidate wifi according to the target information list and the names of all candidate wifi, wherein the similarity list set comprises a first similarity list between the target name information and the names of the corresponding candidate wifi, a second similarity list between the target mailbox address information and the names of the corresponding candidate wifi, and a third similarity list between the target website information and the names of the corresponding candidate wifi.
S50, obtaining the target similarity between the target object and each candidate wifi according to the similarity list set and a preset priority list.
S60, determining the candidate wifi with the corresponding target similarity larger than a preset similarity threshold as the target wifi.
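Steps S50 and S60 can be sketched as follows. How the preset priority list combines the three similarity lists is not spelled out in this part of the text, so the rule used here is purely an assumption for illustration: the fields are visited in priority order and the best score of the first field with a non-zero score is taken as the target similarity. The names `target_similarity` and `screen_target_wifi` and the field keys are hypothetical.

```python
def target_similarity(similarity_lists: dict[str, list[float]],
                      priority: list[str]) -> float:
    """S50 (assumed rule): best score of the highest-priority matching field."""
    for field in priority:
        scores = similarity_lists.get(field, [])
        if scores and max(scores) > 0:
            return max(scores)
    return 0.0


def screen_target_wifi(per_wifi: dict[str, dict[str, list[float]]],
                       priority: list[str],
                       threshold: float) -> list[str]:
    """S60: keep candidate wifi whose target similarity exceeds the threshold."""
    return [name for name, lists in per_wifi.items()
            if target_similarity(lists, priority) > threshold]
```

A usage example under the same assumptions: with `priority = ["name", "mailbox", "website"]` and a threshold of 0.6, a candidate whose name list is `[0.0, 0.0]` but whose mailbox list is `[0.7, 0.8]` is retained, because the mailbox field is the first one that matches at all.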
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium; when executed, the program may carry out the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional units and modules is given as an example; in practical application, the above functions may be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above.
Example six
A sixth embodiment of the present invention provides an electronic device including a processor and a non-transitory computer-readable storage medium in the fifth embodiment of the present invention.
The present invention is not limited to the above-mentioned embodiments. Any person skilled in the art may make changes or modifications that result in equivalent embodiments without departing from the scope of the present invention, and all simple modifications, equivalent changes and modifications made according to the technical substance of the present invention fall within the scope of the technical solution of the present invention.

Claims (10)

1. A target wifi screening method, characterized by comprising the following steps:
acquiring a target information list corresponding to a target object, wherein the target information comprises target name information, target mailbox address information, target website information and target address information;
acquiring a geohash character string corresponding to the target object according to the target name information and the target address information;
determining the wifi within the geographic area corresponding to the geohash character string as candidate wifi;
acquiring a similarity list set between the target object and each candidate wifi according to the target information list and the names of all candidate wifi, wherein the similarity list set comprises a first similarity list between the target name information and the names of the corresponding candidate wifi, a second similarity list between the target mailbox address information and the names of the corresponding candidate wifi, and a third similarity list between the target website information and the names of the corresponding candidate wifi;
acquiring target similarity between the target object and each candidate wifi according to the similarity list set and a preset priority list;
and determining the candidate wifi with the corresponding target similarity larger than a preset similarity threshold as the target wifi.
2. The target wifi screening method according to claim 1, wherein the target information list is obtained by:
acquiring a preset information list corresponding to a target object, wherein the preset information list comprises preset name information, preset mailbox address information, preset website information and preset address information;
And cleaning the data of the preset information list to obtain a target information list corresponding to the target object.
3. The method for filtering target wifi according to claim 2, wherein the step of performing data cleaning on the preset information list to obtain the target information list corresponding to the target object further includes the following steps:
acquiring a preset first name keyword list, a preset mailbox keyword list, a preset website keyword list and a preset character, wherein the preset first name keyword list comprises a plurality of preset first name keywords, the preset mailbox keyword list comprises a plurality of preset mailbox keywords, and the preset website keyword list comprises a plurality of preset website keywords;
Removing all first name keywords from the preset name information to obtain intermediate name information;
performing character conversion on the intermediate name information according to a preset character conversion form to obtain target name information;
Removing all the mailbox keywords from the preset mailbox address information to obtain intermediate mailbox address information;
replacing the corresponding preset character in the intermediate mailbox address information with a null character to obtain the target mailbox address information;
removing all website keywords from the preset website information to obtain intermediate website information;
and replacing the corresponding preset character in the intermediate website information with a null character to obtain the target website information.
4. The target wifi screening method according to claim 1, wherein the step of acquiring the geohash character string corresponding to the target object according to the target name information and the target address information further includes the following steps:
Acquiring the number of intermediate users corresponding to the target object according to the target name information;
acquiring target name keywords in the target name information according to a preset keyword extraction algorithm;
when the target name keywords are consistent with the preset second name keywords or the number of the intermediate users is smaller than or equal to a preset first number threshold, determining that the geohash level corresponding to the target object is 7 level;
When the target name keywords are consistent with preset third name keywords or the number of the intermediate users is greater than or equal to a preset second number threshold, determining that the geohash level corresponding to the target object is 5 level;
When the target name keyword is inconsistent with a preset second name keyword, the target name keyword is inconsistent with a preset third name keyword, the number of intermediate users is greater than the preset first number threshold and the number of intermediate users is less than the preset second number threshold, determining that the geohash level corresponding to the target object is level 6;
and acquiring the geohash character string corresponding to the target object according to the geohash level and the target address information.
5. The method for screening target wifi according to claim 1, wherein the step of obtaining a similarity list set between the target object and each candidate wifi according to the target information list and names of all candidate wifi further comprises the following steps:
acquiring a first editing distance and a first public character string length between the target name information and the name of each candidate wifi according to the target name information and the name of each candidate wifi;
acquiring a first editing distance similarity between the target name information and the corresponding candidate wifi name according to the first editing distance;
Acquiring first public character string similarity between the target name information and the names of the corresponding candidate wifi according to the first public character string length, the character string length corresponding to the names of the corresponding candidate wifi and the character string length corresponding to the target name information;
And acquiring a first similarity list between the target name information and the corresponding candidate wifi according to the first editing distance similarity and the first public character string similarity.
6. The method for screening target wifi according to claim 1, wherein the step of obtaining a similarity list set between the target object and each candidate wifi according to the target information list and names of all candidate wifi further comprises the following steps:
Acquiring a second editing distance and a second public character string length between the target mailbox address information and the names of each candidate wifi according to the target mailbox address information and the names of each candidate wifi;
Acquiring second editing distance similarity between the target mailbox address information and the names of the corresponding candidate wifi according to the second editing distance;
obtaining a second public character string similarity between the target mailbox address information and the names of the corresponding candidate wifi according to the second public character string length, the character string length corresponding to the names of the corresponding candidate wifi and the character string length corresponding to the target mailbox address information;
And acquiring a second similarity list between the target mailbox address information and the corresponding candidate wifi according to the second edit distance similarity and the second public character string similarity.
7. The method for screening target wifi according to claim 1, wherein the step of obtaining a similarity list set between the target object and each candidate wifi according to the target information list and names of all candidate wifi further comprises the following steps:
Acquiring a third editing distance and a third public character string length between the target website information and the name of each candidate wifi according to the target website information and the name of each candidate wifi;
According to the third editing distance, obtaining third editing distance similarity between the target website information and the corresponding name of the candidate wifi;
Obtaining third public character string similarity between the target website information and the names of the corresponding candidate wifi according to the third public character string length, the character string length corresponding to the names of the corresponding candidate wifi and the character string length corresponding to the target website information;
And acquiring a third similarity list between the target website information and the corresponding candidate wifi according to the third editing distance similarity and the third public character string similarity.
8. A target wifi screening device, the device comprising:
The target information acquisition module is used for acquiring a target information list corresponding to a target object, wherein the target information comprises target name information, target mailbox address information, target website information and target address information;
The character string acquisition module is used for acquiring a geohash character string corresponding to the target object according to the target name information and the target address information;
The candidate wifi obtaining module is used for determining the wifi within the geographic area corresponding to the geohash character string as candidate wifi;
A similarity list obtaining module, configured to obtain a similarity list set between the target object and each candidate wifi according to the target information list and the names of all candidate wifi, where the similarity list set includes a first similarity list between the target name information and the names of the corresponding candidate wifi, a second similarity list between the target mailbox address information and the names of the corresponding candidate wifi, and a third similarity list between the target website information and the names of the corresponding candidate wifi;
The target similarity acquisition module is used for acquiring the target similarity between the target object and each candidate wifi according to the similarity list set and a preset priority list;
And the target wifi screening module is used for determining the candidate wifi with the corresponding target similarity larger than a preset similarity threshold value as the target wifi.
9. A non-transitory computer readable storage medium having at least one instruction or at least one program stored therein, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the target wifi screening method according to any of claims 1-7.
10. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 9.
CN202410063321.XA 2024-01-16 2024-01-16 Target wifi screening method, device, medium and equipment Pending CN117896717A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410063321.XA CN117896717A (en) 2024-01-16 2024-01-16 Target wifi screening method, device, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410063321.XA CN117896717A (en) 2024-01-16 2024-01-16 Target wifi screening method, device, medium and equipment

Publications (1)

Publication Number Publication Date
CN117896717A true CN117896717A (en) 2024-04-16

Family

ID=90647081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410063321.XA Pending CN117896717A (en) 2024-01-16 2024-01-16 Target wifi screening method, device, medium and equipment

Country Status (1)

Country Link
CN (1) CN117896717A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination