CN109492103B - Label information acquisition method and device, electronic equipment and computer readable medium - Google Patents


Info

Publication number
CN109492103B
Authority
CN
China
Prior art keywords
fulfillment
address
information
user
classification result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811333350.4A
Other languages
Chinese (zh)
Other versions
CN109492103A (en)
Inventor
倪嘉呈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Liangxin Technology Co., Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN201811333350.4A
Publication of CN109492103A
Priority to CA3060822A (CA3060822A1)
Application granted
Publication of CN109492103B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a label information acquisition method, a label information acquisition device, electronic equipment, and a computer-readable medium, and belongs to the technical field of the internet. The label information acquisition method comprises: classifying a user's fulfillment addresses to obtain a classification result for each fulfillment address, where a fulfillment address is an address at which the user's order is fulfilled; and analyzing the user's fulfillment behavior at the fulfillment address in combination with the classification result to obtain label information of the user. By classifying users' fulfillment addresses and analyzing their fulfillment behavior, the method obtains label information with relatively strong financial attributes, such as occupation, real estate value, and living habits, so that a user's spending power can be evaluated without acquiring the user's sensitive information.

Description

Label information acquisition method and device, electronic equipment and computer readable medium
Technical Field
The present disclosure generally relates to the field of internet technologies, and in particular to a tag information acquisition method and apparatus, an electronic device, and a computer-readable medium.
Background
In the traditional financial industry, information such as a user's income level, spending power, and repayment ability is generally obtained from bank statements, social security and housing provident fund records, personal tax certificates, property certificates, and proof of employment, combined with information declared by the customer. An internet financial platform, however, has no direct way to obtain information about a user's occupation or real estate value.
Therefore, there is still a need for improvement in the prior art solutions.
The above information disclosed in this background section is only for enhancement of understanding of the background of the disclosure and therefore it may contain information that does not constitute prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
The present disclosure provides a tag information acquisition method, a tag information acquisition apparatus, an electronic device, and a computer-readable medium, which solve at least one of the above problems.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a tag information acquisition method, including:
classifying a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address at which the user's order is fulfilled; and analyzing the fulfillment behavior of the user at the fulfillment address in combination with the classification result to obtain the tag information of the user.
In an embodiment of the present disclosure, analyzing the fulfillment behavior of the user at the fulfillment address in combination with the classification result to obtain the tag information of the user includes:
performing attribute analysis on the classification result in combination with an industry information database to obtain attribute information of the fulfillment address; and
analyzing the fulfillment behavior of the user at the fulfillment address in combination with the classification result and the attribute information to obtain the tag information of the user.
In an embodiment of the present disclosure, classifying the fulfillment address to obtain a classification result of the fulfillment address includes:
classifying and labeling historical fulfillment addresses based on word vectors to obtain a mapping between fulfillment addresses and classifications, wherein the historical fulfillment addresses are addresses at which historical orders of a plurality of users on the platform were fulfilled; and
for the fulfillment address, obtaining the classification result from the mapping between fulfillment addresses and classifications.
In one embodiment of the present disclosure, classifying and labeling the historical fulfillment addresses based on word vectors includes:
extracting backbone information from the historical fulfillment address; segmenting the backbone information with a word segmentation technique to obtain a plurality of address segments; converting the plurality of address segments into word vectors; clustering the word vectors; and labeling the backbone information of the fulfillment addresses with classifications according to the clustering result.
In an embodiment of the present disclosure, classifying the fulfillment address to obtain a classification result of the fulfillment address includes:
performing Softmax training on historical fulfillment addresses based on word features to obtain a text classification model, wherein the historical fulfillment addresses are addresses at which historical orders of a plurality of users on the platform were fulfilled; and
inputting the fulfillment address into the text classification model and outputting the classification result.
In one embodiment of the disclosure, performing Softmax training on the historical fulfillment addresses based on word features includes:
configuring correspondence rules between address texts and classifications; matching the historical fulfillment addresses by multi-pattern matching and, if a historical fulfillment address matches an address text, outputting the corresponding classification result according to the correspondence rule; and segmenting the fulfillment address, forming n-gram combinations of the resulting segments, and performing Softmax training on features of single segments or multiple segments to obtain the text classification model.
In an embodiment of the present disclosure, before performing attribute analysis on the classification result in combination with an industry information database, the method further includes:
preprocessing industry classification information; and constructing the industry information database from the preprocessed industry classification information;
wherein the industry information database comprises a plurality of pieces of information, each of which comprises:
a label; a classification result; and attribute information;
the classification result comprises at least one of a residence, a hospital, a hotel, an office building, and a leisure and entertainment venue; and the attribute information comprises at least one of a house price, a hospital type, a hotel star level, an office building grade, and a leisure and entertainment grade.
In an embodiment of the present disclosure, performing attribute analysis on the classification result in combination with an industry information database to obtain attribute information includes:
obtaining the attribute information corresponding to the fulfillment address by forward maximum matching between the classification result and the information in the industry information database.
In an embodiment of the present disclosure, before analyzing the fulfillment behavior of the user at the fulfillment address in combination with the classification result, the method further includes:
acquiring the fulfillment behavior of the user at the fulfillment address, wherein the fulfillment behavior includes at least one of: the number of fulfillments on working days, the number of fulfillments on non-working days, the date span of the fulfillments, and the information with which the user has labeled the fulfillment address.
In an embodiment of the present disclosure, analyzing the fulfillment behavior of the user at the fulfillment address in combination with the classification result to obtain the tag information of the user includes:
if the number of the user's fulfillments on working days is greater than or equal to a first threshold and the classification result is a hospital, obtaining tag information indicating that the user's occupation is medical staff.
In an embodiment of the present disclosure, analyzing the fulfillment behavior of the user at the fulfillment address in combination with the classification result and the attribute information to obtain the tag information of the user includes:
if the number of the user's fulfillments on non-working days is greater than or equal to a second threshold, the classification result is a residence, and the house price in the attribute information is greater than or equal to a third threshold, obtaining tag information indicating that the user lives in a high-end residential community.
According to still another aspect of the present disclosure, there is provided a tag information acquisition apparatus including: an address classification module configured to classify a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address at which the user's order is fulfilled; and a tag analysis module configured to analyze the fulfillment behavior of the user at the fulfillment address in combination with the classification result to obtain the tag information of the user.
According to yet another aspect of the present disclosure, there is provided an electronic device comprising a processor; and a memory storing instructions for the processor to perform the method steps described above.
According to another aspect of the present disclosure, there is provided a computer-readable medium having stored thereon computer-executable instructions that, when executed by a processor, implement the method steps described above.
According to the tag information acquisition method, apparatus, electronic device, and computer-readable medium provided by the embodiments of the disclosure, the fulfillment addresses of the user are classified and analyzed in combination with the user's fulfillment behavior to obtain tag information with relatively strong financial attributes, such as occupation, real estate value, and living habits, so that the user's spending power is evaluated without acquiring the user's sensitive information.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
Fig. 1 shows a flowchart of a tag information obtaining method provided in an embodiment of the present disclosure.
Fig. 2 shows a flowchart of another tag information obtaining method provided in an embodiment of the present disclosure.
Fig. 3 shows a flowchart of classification labeling based on word vectors in an embodiment of the present disclosure.
FIG. 4 is a flow chart illustrating training of text classification based on word features in an embodiment of the present disclosure.
Fig. 5 illustrates a flow chart for classifying a fulfillment address of a user in an embodiment of the present disclosure.
Fig. 6 shows a schematic diagram of a tag information acquisition apparatus provided in another embodiment of the present disclosure.
Fig. 7 is a schematic diagram illustrating another tag information acquisition apparatus provided in another embodiment of the present disclosure.
Fig. 8 shows a schematic structural diagram of an electronic device suitable for implementing an embodiment of the present application, provided by an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
In related embodiments, a platform generally characterizes some attributes of a user from the user's direct consumption behavior (group purchases, takeout, reservations, movies, ticketing, and so on); for example, the user's age and preferences may be inferred from the movies or tickets that the user browses or buys on the platform. However, it is not easy to discover the user's financial attributes from transaction and browsing behavior, which is also limited by the categories the platform offers, so the user's financial attributes are not mined sufficiently.
Based on the above problems, some embodiments of the present disclosure provide a tag information obtaining method, device, electronic device, and computer readable medium, where a fulfillment address of a user is structured by a natural language processing technology, and a real estate value, an occupation, a living habit, and the like of the user are further obtained from the structured information, so as to extract a relatively strong financial attribute.
Fig. 1 shows a flowchart of a tag information obtaining method provided in an embodiment of the present disclosure, which includes the following steps:
As shown in fig. 1, in step S110, a fulfillment address of a user is classified to obtain a classification result of the fulfillment address, where the fulfillment address is an address at which the user performs fulfillment on an order.
As shown in fig. 1, in step S120, the fulfillment behavior of the user at the fulfillment address is analyzed in combination with the classification result to obtain the tag information of the user.
Fig. 2 further shows a flowchart of another tag information obtaining method provided in an embodiment of the present disclosure, which includes the following steps:
As shown in fig. 2, in step S210, a fulfillment address of a user is classified to obtain a classification result of the fulfillment address, where the fulfillment address is an address at which the user performs fulfillment on an order.
As shown in fig. 2, in step S220, an attribute analysis is performed on the classification result in combination with an industry information database to obtain attribute information of the fulfillment address.
As shown in fig. 2, in step S230, the fulfillment behavior of the user at the fulfillment address is analyzed in combination with the classification result and the attribute information to obtain the tag information of the user.
The flow shown in fig. 2 differs from the flow shown in fig. 1 in that it further performs attribute analysis on the classification result using an industry information database, so that the user's fulfillment behavior is analyzed more deeply, in combination with both the classification result and the attribute information, to obtain the tag information of the user.
With the tag information acquisition method of this exemplary embodiment, the user's fulfillment addresses are classified and their attributes identified, and the result is analyzed in combination with the user's fulfillment behavior to obtain tag information with relatively strong financial attributes, such as occupation, real estate value, and living habits, so that the user's spending power is evaluated without acquiring the user's sensitive information.
The following takes the flow shown in fig. 2 as an example, and further describes each step in the tag information obtaining method in the embodiment of the present disclosure.
In step S210, the fulfillment address of the user is classified to obtain a classification result of the fulfillment address.
In one embodiment of the present disclosure, the fulfillment address is an address provided by the user for fulfilling an order, for example the address that the user fills in when placing an O2O (online-to-offline) order that has to be fulfilled, such as a takeout order or a ride-hailing order. A takeout order contains one fulfillment address, and a ride-hailing order contains two fulfillment addresses (the origin and the destination). This embodiment mainly concerns the fulfillment address of a takeout order, but it also applies to the origin and destination addresses of a ride-hailing order.
In an embodiment of the present disclosure, classifying the fulfillment address in step S210 to obtain a classification result of the fulfillment address may be implemented with either of the following two offline model training approaches:
1) Classifying and labeling the historical fulfillment addresses based on word vectors to obtain a mapping between fulfillment addresses and classifications;
and, for the fulfillment address, obtaining the classification result from that mapping; or
2) Performing Softmax training on the historical fulfillment addresses based on word features to obtain a text classification model;
and inputting the fulfillment address into the text classification model and outputting the classification result.
A historical fulfillment address is an address at which a historical order of one of the plurality of users on the platform was fulfilled.
Fig. 3 shows a flowchart of classification labeling based on word vectors, comprising the following steps:
As shown in fig. 3, in step S301, backbone information is extracted from the historical fulfillment address.
In one embodiment of the present disclosure, before this step, all historical fulfillment addresses obtained from the platform are filtered; the filtering includes removing duplicate addresses.
For example, a complete fulfillment address may be as follows:
Shanghai, Changning Road, Meifeng Square (4th floor, opposite Small South China, Sugar Town)
The backbone information in this fulfillment address is "Shanghai, Changning Road, Meifeng Square"; the information in parentheses is not considered in this step.
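The backbone extraction in step S301 can be sketched as follows. This is a minimal illustration, assuming that the non-backbone information is always enclosed in ASCII or full-width parentheses; the function name `split_address` is hypothetical and does not come from the patent.

```python
import re

# Matches one parenthesized remark, in ASCII or full-width parentheses.
_PAREN = re.compile(r"[(（]([^()（）]*)[)）]")

def split_address(address):
    """Return (backbone, remarks): the text outside parentheses and the texts inside them."""
    remarks = [m.strip() for m in _PAREN.findall(address)]
    backbone = _PAREN.sub(" ", address)
    backbone = " ".join(backbone.split())  # collapse leftover whitespace
    return backbone, remarks

backbone, remarks = split_address(
    "Shanghai Changning Road Meifeng Square (4th floor, Sugar Town)"
)
```

The remarks are kept rather than discarded because the online prediction stage described later uses them.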
As shown in fig. 3, in step S302, the backbone information is segmented with a word segmentation technique to obtain a plurality of address segments.
In one embodiment of the present disclosure, an English address is segmented using spaces as separators. Many segmentation techniques exist for Chinese addresses; segmentation based on dictionary matching or on hidden Markov models can be selected as required.
Still taking the above fulfillment address as an example, the address segments obtained after step S302 are:
Shanghai / Changning Road / Meifeng Square
As shown in fig. 3, in step S303, the plurality of address segments are converted into word vectors.
In one embodiment of the present disclosure, this step uses word2vec (for example, a skip-gram model) to convert the address segments into word vectors; the word vector technique trains a neural network model on the context relationships between words.
Still taking the above fulfillment address as an example, a word vector is obtained for each of the address segments "Shanghai", "Changning Road", and "Meifeng Square", and the word vectors are then summed to obtain the vector corresponding to the address text.
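The accumulation of segment vectors into one address vector can be sketched as below. The three-dimensional embeddings are invented for illustration; in practice they would come from a trained word2vec model, and the handling of unknown segments (skipping them) is an assumption.

```python
# Hypothetical 3-dimensional embeddings; a trained word2vec model would supply real ones.
EMBEDDINGS = {
    "Shanghai": [0.1, 0.3, 0.0],
    "Changning Road": [0.2, 0.1, 0.4],
    "Meifeng Square": [0.7, 0.0, 0.5],
}

def address_vector(segments, embeddings, dim=3):
    """Sum the vectors of all known segments; unknown segments contribute nothing."""
    total = [0.0] * dim
    for seg in segments:
        vec = embeddings.get(seg)
        if vec is None:
            continue  # assumption: out-of-vocabulary segments are skipped
        total = [t + v for t, v in zip(total, vec)]
    return total

vec = address_vector(["Shanghai", "Changning Road", "Meifeng Square"], EMBEDDINGS)
```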
As shown in fig. 3, in step S304, the word vectors are clustered.
In one embodiment of the present disclosure, the fulfillment addresses may be clustered by k-means, for example into 1000 clusters. The number of clusters is set as needed; the more clusters there are, the greater the subsequent labeling workload.
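The clustering step can be illustrated with a small pure-Python k-means (Lloyd's algorithm). This is a sketch under stated assumptions: the initial centers are simply the first k points so that the result is reproducible, whereas production systems normally use randomized seeding and a library implementation.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centers start at the first k points (an assumption)."""
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = mean(members)
    return centers, assignment

# Toy 2-dimensional address vectors forming two obvious groups.
points = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
centers, assignment = kmeans(points, k=2)
```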
As shown in fig. 3, in step S305, the backbone information of the fulfillment addresses is labeled with classifications according to the clustering result.
In one embodiment of the present disclosure, the fulfillment addresses in each cluster are labeled according to the clustering result. For example, in each of the 1000 clusters the fulfillment address closest to the cluster center is labeled manually, and that label is applied to the rest of the cluster, so that the backbones of all fulfillment addresses are labeled. For example, Meifeng Square is labeled as a shopping mall, Longmeng Hotel as a hotel, and Kaixin Haoyuan as a residence.
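A sketch of the labeling step: pick the member closest to each cluster center, have it labeled by hand, and copy that label to every address in the same cluster. The vectors, centers, and the two extra addresses ("Meifeng Square Tower B", "Kaixin Haoyuan East") are invented for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def representatives(vectors, centers, assignment):
    """For each cluster, return the index of the member closest to its center."""
    best = {}
    for i, (vec, c) in enumerate(zip(vectors, assignment)):
        d = squared_distance(vec, centers[c])
        if c not in best or d < best[c][0]:
            best[c] = (d, i)
    return {c: i for c, (d, i) in best.items()}

def propagate(assignment, cluster_labels):
    """Give every address the label chosen for its cluster."""
    return [cluster_labels[c] for c in assignment]

addresses = ["Meifeng Square", "Kaixin Haoyuan",
             "Meifeng Square Tower B", "Kaixin Haoyuan East"]
vectors = [[0.0, 0.1], [5.0, 5.1], [0.3, 0.0], [5.4, 4.9]]
centers = [[0.0, 0.0], [5.0, 5.0]]
assignment = [0, 1, 0, 1]

reps = representatives(vectors, centers, assignment)
# A human annotator labels only the representative addresses.
manual = {"Meifeng Square": "shopping mall", "Kaixin Haoyuan": "residence"}
cluster_labels = {c: manual[addresses[i]] for c, i in reps.items()}
labels = propagate(assignment, cluster_labels)
```

With 1000 clusters this reduces the manual work to 1000 labels, at the cost of mislabeling addresses that were clustered with the wrong neighbors.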
FIG. 4 shows a flow diagram of text classification training based on word features, comprising the steps of:
As shown in fig. 4, in step S401, correspondence rules of address texts and classifications are configured.
In one embodiment of the present disclosure, a manual dictionary may be configured that contains the correspondence rules between address texts and classifications, in the format: text -> classification.
As shown in fig. 4, in step S402, the historical fulfillment addresses are matched by multi-pattern matching; if a historical fulfillment address matches an address text, the corresponding classification result is output according to the correspondence rule.
In an embodiment of the present disclosure, multi-pattern matching determines whether the antecedent of any rule is contained in the address text. Combining the correspondence rules with multi-pattern matching makes it possible to handle some failure cases (bad cases). The antecedent of a rule is the address text and the consequent is the classification, for example hotel.
For example, given the rule "hotel -> hotel" in the manual dictionary, an address text that contains "hotel" matches the rule, so its classification is hotel.
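The rule lookup can be sketched as a containment test against every rule antecedent. The rules below are hypothetical, and the simple loop stands in for a true multi-pattern matcher (for example an Aho-Corasick automaton), which would be preferable for large dictionaries. Preferring the longest matching antecedent is an assumption; the source does not say how competing rules are resolved.

```python
# Hypothetical manual dictionary: address text (antecedent) -> classification (consequent).
RULES = {
    "hotel": "hotel",
    "hospital": "hospital",
    "apartment": "residence",
}

def match_rules(address, rules=RULES):
    """Return the classification of the longest rule text contained in the address, or None."""
    text = address.lower()
    hits = [pattern for pattern in rules if pattern in text]
    if not hits:
        return None
    return rules[max(hits, key=len)]
```

For instance, `match_rules("Longmeng Hotel")` hits the "hotel" rule, while an address with no matching antecedent falls through to the trained model.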
As shown in fig. 4, in step S403, the fulfillment address is segmented, n-gram combinations of the resulting segments are formed, and Softmax training is performed on features of single segments or multiple segments to obtain the text classification model.
In one embodiment of the disclosure, for the manually labeled corpus, each fulfillment address is segmented and the segments are combined into bigrams (2 segments) and trigrams (3 segments); a text classification model is then trained with Softmax on the unigram (single segment), bigram, and trigram features.
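The feature construction and Softmax training might look like the sketch below: a bag-of-n-grams multinomial logistic regression trained by stochastic gradient descent. The two training samples, the learning rate, and the epoch count are all assumptions chosen to keep the example small; a real system would train on the full labeled corpus with a library implementation.

```python
import math

def ngram_features(segments, max_n=3):
    """Join every run of 1..max_n adjacent segments into one feature string."""
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(segments) - n + 1):
            feats.append("_".join(segments[i:i + n]))
    return feats

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def score(weights, feats, n_classes):
    return [sum(weights.get((f, c), 0.0) for f in feats) for c in range(n_classes)]

def train(samples, classes, epochs=50, lr=0.5):
    """Gradient descent on cross-entropy; weights map (feature, class index) to a float."""
    weights = {}
    for _ in range(epochs):
        for segments, label in samples:
            feats = ngram_features(segments)
            probs = softmax(score(weights, feats, len(classes)))
            for c in range(len(classes)):
                grad = probs[c] - (1.0 if classes[c] == label else 0.0)
                for f in feats:
                    weights[(f, c)] = weights.get((f, c), 0.0) - lr * grad
    return weights

def predict(weights, classes, segments):
    scores = score(weights, ngram_features(segments), len(classes))
    return classes[scores.index(max(scores))]

CLASSES = ["hotel", "residence"]
SAMPLES = [  # hypothetical labeled corpus
    (["Longmeng", "Hotel"], "hotel"),
    (["Kaixin", "Haoyuan"], "residence"),
]
weights = train(SAMPLES, CLASSES)
```

Because the unigram "Hotel" receives positive weight for the hotel class, an unseen address such as `["Grand", "Hotel"]` is also classified as a hotel.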
After the offline models shown in figs. 3 and 4 have been trained, newly obtained fulfillment addresses are classified in an online prediction process. Fig. 5 shows a flowchart for classifying the fulfillment address of a user, including the following steps:
As shown in fig. 5, in step S501, the backbone information of the fulfillment address is extracted, and a classification label is obtained from the word-vector-based address labeling, that is, from the flow shown in fig. 3.
As shown in fig. 5, in step S502, the word feature model is used for prediction: the backbone information of the fulfillment address and the part outside the backbone information (for example, the part in parentheses) are predicted in turn to obtain a prediction result. For example, when predicting with the flow shown in fig. 4, if a rule in the manual dictionary is hit, the classification result is returned directly according to that rule; otherwise the classification result is obtained from the trained Softmax model.
The prediction method based on word vectors and clustering does not consider the content outside the backbone information (such as the content in parentheses), because the parenthesized content introduces many interfering terms that would affect the clustering result. During online prediction, however, the offline-trained model is applied to the backbone information and the non-backbone information in turn, and the prediction result for the content in parentheses is then adopted, because the address in parentheses is more precise and specific.
For example, the classification result when only the backbone information is considered is:
Shanghai, Changning Road, Meifeng Square -> shopping mall
When the non-backbone information in parentheses is also considered, the classification result is:
Shanghai, Changning Road, Meifeng Square (4th floor, opposite Small South China, Sugar Town) -> company/enterprise
Because Sugar Town is a maker space (co-working space), the correct classification is company/enterprise rather than shopping mall, so the latter classification result is more accurate.
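The online prediction order can be sketched as follows: rules are tried first, the trained model is the fallback, and a result obtained from the parenthesized remark overrides the result for the backbone. The rule table and the `no_model` stand-in are hypothetical; in a real system `model` would be the trained Softmax classifier, and the override policy is a simplification of the behavior described above.

```python
def classify_text(text, rules, model):
    """Rules first; fall back to the trained model when no rule is hit."""
    lowered = text.lower()
    for pattern, category in rules.items():
        if pattern in lowered:
            return category
    return model(text)

def classify_address(backbone, remarks, rules, model):
    """Classify the backbone, then each remark; a remark result overrides the backbone."""
    result = classify_text(backbone, rules, model)
    for remark in remarks:
        remark_result = classify_text(remark, rules, model)
        if remark_result is not None:
            result = remark_result
    return result

# Hypothetical stand-ins for the manual dictionary and the Softmax model.
RULES = {"square": "shopping mall", "sugar town": "company/enterprise"}

def no_model(text):
    return None  # placeholder: a trained classifier would return a category here

backbone_only = classify_address(
    "Shanghai Changning Road Meifeng Square", [], RULES, no_model)
with_remark = classify_address(
    "Shanghai Changning Road Meifeng Square",
    ["4th floor, opposite Small South China, Sugar Town"], RULES, no_model)
```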
Before the prediction in step S502, location words in the fulfillment address are handled; for example, if the fulfillment address contains "opposite ...", the part marked by the location word "opposite" is ignored.
In step S220, an attribute analysis is performed on the classification result by combining with an industry information database to obtain attribute information of the fulfillment address.
In an embodiment of the present disclosure, before performing attribute analysis on the classification result in combination with an industry information database, the method further includes constructing the industry information database:
first, preprocessing industry classification information; and second, constructing the industry information database from the preprocessed industry classification information.
The preprocessing may obtain information such as addresses and industries through external procurement and/or crawling of public data, and then clean and structure it to obtain information in triple form.
The resulting industry information database comprises a plurality of pieces of information, each of which is a triple comprising:
a label; a classification result; attribute information.
The classification result comprises at least one of a residence, a hospital, a hotel, an office building, and a leisure and entertainment venue; the attribute information comprises at least one of a house price, a hospital type, a hotel star level, an office building grade, and a leisure and entertainment grade.
For example, the information in the industry information database is as follows:
Building = Kaixin Haoyuan; residence; house price = 100000 yuan/sq m
Hotel = Shangri-La; hotel; hotel star level = five-star
Hospital = Zhongshan Hospital; hospital; hospital type = Grade III hospital
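One possible in-memory representation of these triples is sketched below. The class name `IndustryRecord` and the split of the attribute into a name and a value are assumptions made for illustration; the values are transcribed from the example entries.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndustryRecord:
    label: str           # entity name, e.g. a building, hotel, or hospital
    classification: str  # residence, hospital, hotel, ...
    attribute: str       # attribute name, e.g. "house price"
    value: str           # attribute value

INDUSTRY_DB = [
    IndustryRecord("Kaixin Haoyuan", "residence", "house price", "100000 yuan/sq m"),
    IndustryRecord("Shangri-La", "hotel", "hotel star level", "five-star"),
    IndustryRecord("Zhongshan Hospital", "hospital", "hospital type", "Grade III hospital"),
]

def records_for(classification, db=INDUSTRY_DB):
    """Filter the database on its classification column."""
    return [r for r in db if r.classification == classification]
```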
In an embodiment of the present disclosure, performing attribute analysis on the classification result in combination with the industry information database in this step to obtain the attribute information specifically includes:
obtaining the attribute information corresponding to the fulfillment address by performing forward maximum matching between the classification result and the information in the industry information database.
The forward maximum matching algorithm cuts a substring of bounded length from the front of a character string and matches it against the words in a dictionary. If the match succeeds, the next round of matching starts from the remaining string; otherwise, one character is removed from the end of the substring and matching is attempted again. This repeats until the whole string has been processed.
For example, the fulfillment address is:
Kaixin Haoyuan (Building 12, Room 306)
First, the classification result is obtained as residence. The industry information database is then filtered on its second column (the classification result) to obtain the entities related to residences. The fulfillment address is then matched against these entities by forward maximum matching, which hits the entity Kaixin Haoyuan and yields the attribute corresponding to the fulfillment address, namely the housing price information: the housing price is 100000 yuan/square meter.
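The forward maximum matching described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation of the disclosure: the function names forward_max_match and lookup_attributes, the character-level window, and the rule of skipping one character when nothing matches are all choices made for this example.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Scan text left to right, always taking the longest dictionary word.

    dictionary may be any container of strings (a set, or a dict keyed by word).
    Characters that start no dictionary word are skipped.
    """
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    matches = []
    i = 0
    while i < len(text):
        # Take the longest allowed window starting at position i.
        end = min(len(text), i + max_len)
        # Remove one character from the end until the window is a known word.
        while end > i and text[i:end] not in dictionary:
            end -= 1
        if end > i:
            matches.append(text[i:end])
            i = end          # continue matching after the matched word
        else:
            i += 1           # nothing matched here; move on by one character
    return matches


def lookup_attributes(address, entities):
    """Return (entity, attribute name, attribute value) for each matched entity."""
    return [(name, *entities[name]) for name in forward_max_match(address, entities)]


# Hypothetical residence entities, already filtered by classification result.
RESIDENCE_ENTITIES = {"Kaixin Haoyuan": ("housing price", "100000 yuan/sq m")}
```

Calling lookup_attributes("Kaixin Haoyuan (Building 12, Room 306)", RESIDENCE_ENTITIES) matches the estate name at the start of the address and ignores the building and room suffix.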
In step S230, the fulfillment behavior of the user for the fulfillment address is analyzed in combination with the classification result and the attribute information to obtain the tag information of the user.
In an embodiment of the present disclosure, before performing the specific analysis, the method further includes:
acquiring the fulfillment behavior of the user for the fulfillment address, wherein the fulfillment behavior includes at least one of: the number of fulfillments on working days, the number of fulfillments on non-working days, the fulfillment date span, and information with which the user has labeled the fulfillment address.
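As a rough illustration, the behavior items listed above could be derived from the dates of a user's orders at one address. The Python sketch below is an assumption-laden example, not the disclosed method: the function name fulfillment_behavior and the dictionary keys are hypothetical, and working days are approximated as Monday to Friday, ignoring public holidays and make-up working days.

```python
from datetime import date


def fulfillment_behavior(order_dates, user_label=None):
    """Summarize one user's fulfillments at one address.

    order_dates: iterable of datetime.date, one per fulfilled order.
    user_label: optional text the user attached to the address, e.g. "work place".
    """
    dates = list(order_dates)
    # Assumption: Monday (0) to Friday (4) count as working days.
    weekday_count = sum(1 for d in dates if d.weekday() < 5)
    span_days = (max(dates) - min(dates)).days if dates else 0
    return {
        "weekday_count": weekday_count,
        "non_weekday_count": len(dates) - weekday_count,
        "date_span_days": span_days,
        "user_label": user_label,
    }


# Example: a Monday, a Friday and a Saturday in November 2018.
example = fulfillment_behavior(
    [date(2018, 11, 5), date(2018, 11, 9), date(2018, 11, 10)],
    user_label="home",
)
```

A production system would more likely consult a holiday calendar than rely on the day of the week alone.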
In an embodiment of the present disclosure, analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result and/or the attribute information to obtain the tag information of the user includes:
if the number of fulfillments by the user on working days is greater than or equal to a first threshold and the classification result is a hospital, the tag information of the user is obtained as occupation: medical staff; or, if the number of fulfillments by the user on non-working days is greater than or equal to a second threshold, the classification result is a residence, and the housing price in the attribute information is greater than or equal to a third threshold, the tag information of the user is obtained as high-end residential community.
The thresholds used in this mapping may be set as needed. For example, the first threshold, for the number of fulfillments on working days, may be set to 5, and the third threshold, for the housing price, may be set differently for different cities.
Analysis and mapping are performed according to the text extracted from the fulfillment address given by the user and the user's fulfillment behavior for that address (information such as the number of fulfillments on working days, the number of fulfillments on holidays, the fulfillment date span, and the user's labeling of the fulfillment address) to obtain the tag information of the user.
For example, if the fulfillment address is classified as an office building and the user has labeled the fulfillment address as a workplace, the user is presumed to be a white-collar worker, that is, the tag information is white-collar.
If the address is classified as a residence, the developer is Vanke, the housing price is 75635.0 yuan/square meter, the fulfillment date span at this address is greater than 1 year, and the number of holiday fulfillments at this address is greater than 5, the user is presumed to reside in a high-end residential community, that is, the tag is high-end residential community.
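The threshold rules and examples above amount to a small rule table. The following Python sketch shows one possible encoding; it is illustrative only. The function name derive_tags and the dictionary keys are hypothetical, the first threshold of 5 follows the example in the text, and the second and third threshold defaults are placeholders chosen for this example, since the text leaves them to be set as needed.

```python
def derive_tags(classification, behavior, attributes,
                first_threshold=5, second_threshold=5, third_threshold=70000.0):
    """Map classification result, fulfillment behavior and attributes to tags.

    second_threshold and third_threshold are placeholder values; the housing
    price threshold in particular would differ from city to city.
    """
    tags = []
    # Frequent working-day fulfillments at a hospital suggest medical staff.
    if classification == "hospital" and behavior.get("weekday_count", 0) >= first_threshold:
        tags.append("occupation: medical staff")
    # Frequent non-working-day fulfillments at an expensive residence.
    if (classification == "residence"
            and behavior.get("non_weekday_count", 0) >= second_threshold
            and attributes.get("housing_price", 0.0) >= third_threshold):
        tags.append("high-end residential community")
    # An office building that the user has labeled as a workplace.
    if classification == "office building" and behavior.get("user_label") == "work place":
        tags.append("white-collar")
    return tags
```

With the figures from the residence example (housing price 75635.0 yuan/square meter and more than 5 holiday fulfillments), the second rule fires under these placeholder thresholds.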
The population fact tags obtained from the structured fulfillment address information have a certain ability to rank users by amount and risk, and can serve as strong financial attributes of the user.
Based on the above process, the present disclosure does not need to obtain the user's information directly. By classifying the user's fulfillment address, it can determine whether the user fulfills orders at a place of residence (the fulfillment address is classified as a residence), at a workplace (the address is classified as an office building or a company or enterprise), or at some other place. Then, on the basis of the classification, the constructed industry information database is associated with the fulfillment address so that the attribute information of the user's fulfillment address can be further analyzed. Combined with the user's fulfillment behavior (fulfillment frequency, number of fulfillments on working days, number of fulfillments on holidays, time span of fulfillment dates, and the like), this yields tag information of the user, such as property value, occupation, and living habits, from which strong financial attributes of the user are extracted.
In summary, the tag information acquisition method provided in this embodiment classifies and identifies the user's fulfillment address and analyzes the user's fulfillment behavior to obtain tag information of the user, namely relatively strong financial attributes such as occupation, real estate value, and living habits, and thereby evaluates the user's spending power without obtaining sensitive information of the user.
Fig. 6 is a schematic diagram of a tag information acquiring apparatus provided in another embodiment of the present disclosure, and as shown in fig. 6, the apparatus 600 includes: an address classification module 610 and a tag analysis module 620.
The address classification module 610 is configured to classify a fulfillment address of a user to obtain a classification result of the fulfillment address, where the fulfillment address is an address where the user performs fulfillment on an order; the tag analysis module 620 is configured to analyze the fulfillment behavior of the user for the fulfillment address in combination with the classification result to obtain tag information of the user.
Fig. 7 is a schematic diagram of another tag information acquiring apparatus provided in another embodiment of the present disclosure. As shown in Fig. 7, the apparatus 700 includes: an address classification module 710, an attribute identification module 720, and a tag analysis module 730.
The address classification module 710 is configured to classify a fulfillment address of a user to obtain a classification result of the fulfillment address, where the fulfillment address is an address provided by the user for fulfilling an order; the attribute identification module 720 is configured to perform attribute analysis on the classification result by combining with an industry information database to obtain attribute information of the fulfillment address; the tag analysis module 730 is configured to analyze the fulfillment behavior of the user for the fulfillment address in combination with the classification result and the attribute information to obtain tag information of the user.
The functions of each module in the apparatus are described in the above method embodiments and are not repeated here.
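To make the division of work among the three modules concrete, the following Python sketch wires them together as interchangeable callables. It is only an illustration of the data flow: the class name TagInfoApparatus and the stub functions passed to it are hypothetical and stand in for the address classification, attribute identification and tag analysis modules.

```python
class TagInfoApparatus:
    """Chains the three modules: classify, identify attributes, analyze tags."""

    def __init__(self, classify, identify_attributes, analyze):
        self.classify = classify                        # address -> classification result
        self.identify_attributes = identify_attributes  # (address, classification) -> attributes
        self.analyze = analyze                          # (classification, attributes, behavior) -> tags

    def run(self, address, behavior):
        classification = self.classify(address)
        attributes = self.identify_attributes(address, classification)
        return self.analyze(classification, attributes, behavior)


# Stub modules, for demonstration only.
apparatus = TagInfoApparatus(
    classify=lambda address: "hospital" if "Hospital" in address else "other",
    identify_attributes=lambda address, classification: {},
    analyze=lambda classification, attributes, behavior: (
        ["occupation: medical staff"]
        if classification == "hospital" and behavior["weekday_count"] >= 5
        else []
    ),
)
```

Keeping each module behind a plain callable means a rule-based classifier could be swapped for a trained text classification model without touching the other two stages.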
In summary, the tag information acquiring apparatus in this embodiment classifies and identifies the user's fulfillment address and analyzes the user's fulfillment behavior to obtain tag information of the user, namely relatively strong financial attributes such as occupation, real estate value, and living habits, and thereby evaluates the user's spending power without obtaining sensitive information of the user.
In another aspect, the present disclosure also provides an electronic device including a processor and a memory, where the memory stores instructions that, when executed, cause the processor to carry out the following method:
classifying a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address at which the user fulfills an order; and analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result to obtain the tag information of the user; or
classifying a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address at which the user fulfills an order; performing attribute analysis on the classification result in combination with an industry information database to obtain attribute information of the fulfillment address; and analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result and the attribute information to obtain the tag information of the user.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use in implementing the electronic device of an embodiment of the present application. The electronic device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in Fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The RAM 803 also stores various programs and data necessary for the operation of the system 800. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read from it can be installed into the storage section 808 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. The computer program executes the above-described functions defined in the system of the present application when executed by the Central Processing Unit (CPU) 801.
It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including a transmitting unit, an obtaining unit, a determining unit, and a first processing unit. The names of these units do not, in some cases, limit the units themselves; for example, the transmitting unit may also be described as a "unit that sends a picture acquisition request to a connected server".
In another aspect, the present disclosure also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following method steps:
classifying a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address provided by the user for fulfilling an order; and analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result to obtain the tag information of the user; or
classifying a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address at which the user fulfills an order; performing attribute analysis on the classification result in combination with an industry information database to obtain attribute information of the fulfillment address; and analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result and the attribute information to obtain the tag information of the user.
It should be clearly understood that this disclosure describes how to make and use particular examples, but the principles of this disclosure are not limited to any details of these examples. Rather, these principles can be applied to many other embodiments based on the teachings of the present disclosure.
Exemplary embodiments of the present disclosure are specifically illustrated and described above. It is to be understood that the present disclosure is not limited to the precise arrangements or instrumentalities described herein; on the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (13)

1. A tag information acquisition method, characterized by comprising:
classifying a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address at which the user fulfills an order;
performing attribute analysis on the classification result in combination with an industry information database to obtain attribute information of the fulfillment address;
and analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result and the attribute information to obtain the tag information of the user.
2. The tag information acquisition method of claim 1, wherein the classifying the fulfillment address to obtain a classification result of the fulfillment address comprises:
classifying and labeling historical fulfillment addresses based on word vectors to obtain a mapping relation between fulfillment addresses and classifications, wherein the historical fulfillment addresses are addresses at which a plurality of users in the platform fulfilled historical orders;
and, for the fulfillment address, obtaining the classification result in combination with the mapping relation between fulfillment addresses and classifications.
3. The tag information acquisition method of claim 2, wherein the classifying and labeling the historical fulfillment addresses based on the word vectors comprises:
extracting trunk information from the historical fulfillment addresses;
performing word segmentation on the trunk information through a word segmentation technology to obtain a plurality of address word segments;
converting the plurality of address word segments into word vectors;
clustering the word vectors;
and classifying and labeling the trunk information of the fulfillment addresses correspondingly according to the clustering result.
4. The tag information acquisition method of claim 1, wherein the classifying the fulfillment address to obtain a classification result of the fulfillment address comprises:
performing Softmax training on historical fulfillment addresses based on word features to obtain a text classification model, wherein the historical fulfillment addresses are addresses at which a plurality of users in the platform fulfilled historical orders;
and inputting the fulfillment address into the text classification model, and outputting the classification result.
5. The tag information acquisition method of claim 4, wherein performing Softmax training on the historical fulfillment addresses based on the word features comprises:
configuring correspondence rules between address texts and classifications;
matching the historical fulfillment addresses by multi-pattern matching, and if a historical fulfillment address matches an address text, outputting the corresponding classification result according to the correspondence rules;
performing word segmentation on the fulfillment addresses, combining the obtained word segments into multi-element (n-gram) combinations, and performing Softmax training based on the features of single word segments or of multiple word segments to obtain the text classification model.
6. The tag information acquisition method according to claim 1, wherein before performing attribute analysis on the classification result in combination with an industry information database, the method further comprises:
preprocessing industry classification information;
constructing the industry information database according to the preprocessed industry classification information;
wherein the industry information database comprises a plurality of pieces of information, and each piece of information comprises:
a label; a classification result; attribute information;
the classification result comprises at least one of: residence, hospital, hotel, office building, and leisure and entertainment venue;
the attribute information comprises at least one of: housing price, hospital type, hotel star rating, office building grade, and leisure and entertainment grade.
7. The tag information acquisition method according to claim 6, wherein performing attribute analysis on the classification result in combination with the industry information database to obtain the attribute information comprises:
obtaining the attribute information corresponding to the fulfillment address by performing forward maximum matching between the classification result and the information in the industry information database.
8. The tag information acquisition method of claim 1, wherein before analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result, the method further comprises:
acquiring the fulfillment behavior of the user for the fulfillment address;
wherein the fulfillment behavior includes at least one of: the number of fulfillments on working days, the number of fulfillments on non-working days, the fulfillment date span, and information with which the user has labeled the fulfillment address.
9. The tag information acquisition method of claim 8, wherein analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result to obtain the tag information of the user comprises:
if the number of fulfillments by the user on working days is greater than or equal to a first threshold and the classification result is a hospital, obtaining the tag information of the user as occupation: medical staff.
10. The tag information acquisition method of claim 8, wherein analyzing the fulfillment behavior of the user for the fulfillment address in combination with the classification result and the attribute information to obtain the tag information of the user comprises:
if the number of fulfillments by the user on non-working days is greater than or equal to a second threshold, the classification result is a residence, and the housing price in the attribute information is greater than or equal to a third threshold, obtaining the tag information of the user as high-end residential community.
11. A tag information acquisition apparatus, characterized by comprising:
an address classification module configured to classify a fulfillment address of a user to obtain a classification result of the fulfillment address, wherein the fulfillment address is an address at which the user fulfills an order;
an attribute identification module configured to perform attribute analysis on the classification result in combination with an industry information database to obtain attribute information of the fulfillment address;
and a tag analysis module configured to analyze the fulfillment behavior of the user for the fulfillment address in combination with the classification result and the attribute information to obtain the tag information of the user.
12. An electronic device, comprising:
a processor;
a memory storing instructions that cause the processor to perform the method steps of any one of claims 1-10.
13. A computer-readable medium having stored thereon computer-executable instructions which, when executed by a processor, perform the method steps of any one of claims 1-10.
CN201811333350.4A 2018-11-09 2018-11-09 Label information acquisition method and device, electronic equipment and computer readable medium Active CN109492103B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811333350.4A CN109492103B (en) 2018-11-09 2018-11-09 Label information acquisition method and device, electronic equipment and computer readable medium
CA3060822A CA3060822A1 (en) 2018-11-09 2019-11-01 Label information acquistion method and apparatus, electronic device and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811333350.4A CN109492103B (en) 2018-11-09 2018-11-09 Label information acquisition method and device, electronic equipment and computer readable medium

Publications (2)

Publication Number Publication Date
CN109492103A CN109492103A (en) 2019-03-19
CN109492103B true CN109492103B (en) 2019-12-17

Family

ID=65694177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811333350.4A Active CN109492103B (en) 2018-11-09 2018-11-09 Label information acquisition method and device, electronic equipment and computer readable medium

Country Status (2)

Country Link
CN (1) CN109492103B (en)
CA (1) CA3060822A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111861046B (en) * 2019-04-02 2023-12-29 南京大学 Intelligent patent value assessment system based on big data and deep learning
CN110213239B (en) * 2019-05-08 2021-06-01 创新先进技术有限公司 Suspicious transaction message generation method and device and server
CN112434154A (en) * 2019-08-26 2021-03-02 北京星选科技有限公司 Object processing method and device, electronic equipment and storage medium
CN111310462A (en) * 2020-02-07 2020-06-19 北京三快在线科技有限公司 User attribute determination method, device, equipment and storage medium
CN112765386A (en) * 2020-06-14 2021-05-07 黄雨勤 Information management method and system based on big data and Internet and cloud server
CN111966730A (en) * 2020-10-23 2020-11-20 北京淇瑀信息科技有限公司 Risk prediction method and device based on permanent premises and electronic equipment
CN112488103A (en) * 2020-11-30 2021-03-12 上海寻梦信息技术有限公司 Address information extraction method, model training method and related equipment
CN112417251A (en) * 2020-11-30 2021-02-26 华能大理风力发电有限公司 Transaction information retrieval method and device based on wind power bidding
CN112561479B (en) * 2020-12-16 2023-09-19 中国平安人寿保险股份有限公司 Intelligent decision-making-based enterprise personnel increasing method and device and computer equipment
CN116521827A (en) * 2023-05-19 2023-08-01 北京百度网讯科技有限公司 Geographic position place category determination method and device, electronic equipment and medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108345596A (en) * 2017-01-22 2018-07-31 分众(中国)信息技术有限公司 Building information converged services platform

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9882979B2 (en) * 2015-03-16 2018-01-30 International Business Machines Corporation Image file transmission
US10497045B2 (en) * 2016-08-05 2019-12-03 Accenture Global Solutions Limited Social network data processing and profiling
CN108287850B (en) * 2017-01-10 2021-09-21 创新先进技术有限公司 Text classification model optimization method and device
CN108287858B (en) * 2017-03-02 2021-08-10 腾讯科技(深圳)有限公司 Semantic extraction method and device for natural language
CN108711004A (en) * 2018-05-14 2018-10-26 北京京东金融科技控股有限公司 Methods of risk assessment and device and computer readable storage medium

Also Published As

Publication number Publication date
CN109492103A (en) 2019-03-19
CA3060822A1 (en) 2020-05-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20200624
Address after: Room 301, building 2, No. 18, Tianshan West Road, Changning District, Shanghai, 200335
Patentee after: Shanghai Liangxin Technology Co., Ltd
Address before: 100083 Beijing Haidian District North Fourth Ring Road West, No. 9 2106-030
Patentee before: BEIJING SANKUAI ONLINE TECHNOLOGY Co.,Ltd.