CN111753195A - Label system construction method, device, equipment and storage medium
- Publication number
- CN111753195A (application number CN202010556296.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a label system construction method, a label system construction device, label system construction equipment and a storage medium, and relates to the technical field of data processing, in particular to the technical field of artificial intelligence and intelligent search. The specific implementation scheme is as follows: determining at least two target search terms describing a point of interest category label; determining at least two category labels according to the at least two target search terms; and establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points associated with the at least two target search terms to obtain a structured label system of the interest points. According to the technology of the application, the determination cost of the label system is reduced, and the accuracy of the label system is improved.
Description
Technical Field
The application relates to the technical field of data processing, in particular to the technical field of artificial intelligence and intelligent search. Specifically, the embodiment of the application provides a label system construction method, a label system construction device, label system construction equipment and a storage medium.
Background
The tags of a point of interest (POI) can provide POI information and decision support for a user. POI tags therefore play an important role in a map retrieval system.
POI tags typically include structured tags. A structured tag generalizes an attribute of a point of interest along some dimension. For example, if the category label of a point of interest is "snack" and the industry category to which snacks belong is "food", both are structured tags. Existing structured label systems are set up manually, and both their setup cost and their accuracy leave room for improvement.
Disclosure of Invention
The disclosure provides a label system construction method, a device, equipment and a storage medium.
According to an aspect of the present disclosure, there is provided a label system construction method, including:
determining at least two target search terms describing a point of interest category label;
determining at least two category labels according to the at least two target search terms;
and establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points associated with the at least two target search terms to obtain a structured label system of the interest points.
According to another aspect of the present disclosure, there is provided a label system construction apparatus, including:
the search term determining module is used for determining at least two target search terms describing interest point category labels;
the label determining module is used for determining at least two category labels according to the at least two target search terms;
and the relationship establishing module is used for establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points associated with the at least two target search terms to obtain a structured label system of the interest points.
According to still another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the embodiments of the present application.
According to yet another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of the embodiments of the present application.
According to the technology of the application, the determination cost of the label system is reduced, and the accuracy of the label system is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flowchart of a label system construction method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a label system provided by an embodiment of the present application;
FIG. 3 is a flowchart of another label system construction method provided by an embodiment of the present application;
FIG. 4 is a flowchart of another label system construction method provided by an embodiment of the present application;
FIG. 5 is a flowchart of yet another label system construction method provided by an embodiment of the present application;
FIG. 6 is a flowchart of yet another label system construction method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the construction of a structured label system provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the construction of an unstructured label system provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of label annotation provided in an embodiment of the present application;
FIG. 10 is a schematic view of an application scenario provided in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a label system construction apparatus provided in an embodiment of the present application;
FIG. 12 is a block diagram of an electronic device for implementing the label system construction method according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is a flowchart of a label system construction method provided in an embodiment of the present application. This embodiment is applicable to constructing a structured label system for points of interest. The method may be performed by a label system construction apparatus, which may be implemented in software and/or hardware. Referring to FIG. 1, the label system construction method provided in this embodiment includes:
S110, determining at least two target search terms describing point of interest category labels.
A point of interest category label is a label describing the category of a point of interest, such as food, Chinese restaurant, or Sichuan dishes.
Target search terms refer to search terms that describe point of interest category labels.
The search term refers to search text or retrieval text input by the user.
For example, the target search term may be: nearby food.
In one embodiment, the determining at least two target search terms describing a point of interest category label may include:
determining the target search term from the historical search terms according to the point of interest information in the click data associated with the historical search terms.
Here, a historical search term is a search term for which retrieval has already been completed.
The click data associated with a historical search term is the data on which retrieval results of that historical search term users clicked.
Specifically, if the click data associated with a historical search term includes point of interest information, the historical search term is taken as a target search term.
Optionally, the point of interest information may be the name of the point of interest or other attribute information of the point of interest.
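The selection rule for S110 can be sketched as a filter over a search log. This is a minimal illustration only: the `SearchRecord` type, its field names, and the sample log are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field


@dataclass
class SearchRecord:
    """One historical search term plus the POIs users clicked in its results (hypothetical shape)."""
    term: str
    clicked_pois: list = field(default_factory=list)


def select_target_terms(history):
    """Keep the historical terms whose click data includes at least one point of interest."""
    return [record.term for record in history if record.clicked_pois]


# Made-up log entries for illustration.
history = [
    SearchRecord("nearby food", ["restaurant_a", "bbq_buffet_b"]),
    SearchRecord("weather tomorrow", []),
    SearchRecord("nearby buffet", ["bbq_buffet_b"]),
]
targets = select_target_terms(history)
print(targets)
```

In this sketch a term with an empty click list is simply dropped; a real log would likely need deduplication and aggregation of clicks per term first.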
S120, determining at least two category labels according to the at least two target search terms.
Wherein, the category label refers to a label describing the point of interest category.
In one embodiment, the determining at least two category labels from the at least two target search terms comprises:
extracting a category label from each of the at least two target search terms to obtain the at least two category labels.
S130, establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points related to the at least two target search terms, and obtaining a structured label system of the interest points.
The target interest points refer to interest points related to target retrieval results clicked by users, and the target retrieval results refer to retrieval results of target search terms.
The target point of interest may also be understood as a point of interest viewed by the user through the search results of the target search term.
Optionally, the distribution information may be distribution information of the target interest point in a geographic location space, or may also be distribution information of the target interest point in a semantic space.
The structured tag system of the interest point refers to a structured tag system which describes a category to which the interest point belongs.
The structured label system is a tree or forest organized by labels, and has clear hierarchical division and parent-child relationship.
Parent-child relationships between at least two category tags may also be understood as dependencies between different category tags.
Illustratively, referring to FIG. 2, the structured label system for points of interest may include top-level labels such as food and shopping. Food may in turn include Chinese restaurants, buffets, and the like. Chinese restaurants may include Sichuan dishes and Beijing dishes. Buffets may include seafood buffets, barbecue buffets, hot pot buffets, and the like. Shopping may include farmers' markets, clothing, and the like. Farmers' markets may include meat stores, vegetable stores, seafood stores, and the like. Clothing may include men's wear, women's wear, children's wear, and the like.
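One possible in-memory form of such a tree is a nested dictionary, sketched below. The label strings are English renderings of the FIG. 2 example, and the `path_to` helper is a hypothetical name introduced only for illustration.

```python
# Nested dict: each key is a label, each value holds that label's child labels.
label_system = {
    "food": {
        "Chinese restaurant": {"Sichuan dishes": {}, "Beijing dishes": {}},
        "buffet": {"seafood buffet": {}, "barbecue buffet": {}, "hot pot buffet": {}},
    },
    "shopping": {
        "farmers' market": {"meat store": {}, "vegetable store": {}, "seafood store": {}},
        "clothing": {"men's wear": {}, "women's wear": {}, "children's wear": {}},
    },
}


def path_to(label, tree, prefix=()):
    """Return the chain of labels from a root down to `label`, or None if it is absent."""
    for name, children in tree.items():
        here = prefix + (name,)
        if name == label:
            return here
        found = path_to(label, children, here)
        if found:
            return found
    return None


print(path_to("barbecue buffet", label_system))
```

Because each label has exactly one parent in this form, the path from the root doubles as the full structured tag of a point of interest.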
According to the technical scheme of this embodiment, the category labels of points of interest are determined from the target search terms, and the parent-child relationships among the category labels are determined from the distribution information of the target points of interest associated with those terms, so that the structured label system is determined automatically. Because the scheme requires no human participation, subjective human factors are not introduced, the cost of determining the label system is reduced, and its accuracy is improved.
FIG. 3 is a flowchart of another label system construction method provided in an embodiment of the present application. Building on the preceding scheme, this scheme refines the step of determining at least two category labels according to the at least two target search terms. Referring to FIG. 3, the label system construction method provided by this scheme includes:
S210, determining at least two target search terms describing point of interest category labels.
S220, clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities.
Wherein, the category community is a set formed by target search terms of the same category.
Each category community includes at least one target search term.
Optionally, before clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities, the method further includes:
determining the similarity between the at least two target search terms according to the semantic information of the at least two target search terms and/or the overlap between their associated clicked points of interest.
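The click-overlap signal can be sketched with a set-overlap ratio. The text does not name a specific measure, so the Jaccard index used here is an assumption, as are the function name and the sample POI identifiers.

```python
def click_overlap_similarity(pois_a, pois_b):
    """Jaccard index of two sets of clicked POIs (an assumed choice of overlap measure)."""
    if not pois_a and not pois_b:
        return 0.0
    return len(pois_a & pois_b) / len(pois_a | pois_b)


food_clicks = {"restaurant_a", "bbq_buffet_b", "hotpot_buffet_c"}
buffet_clicks = {"bbq_buffet_b", "hotpot_buffet_c"}
print(click_overlap_similarity(food_clicks, buffet_clicks))
```

A semantic similarity (for example, a distance between term embeddings) could be combined with this score, which the "and/or" in the text allows.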
S230, for each of the at least two category communities, determining the category label of the community according to the search terms in that community, to obtain the at least two category labels.
In one embodiment, determining the category label of each community according to the search terms in that community to obtain the at least two category labels includes:
determining the category label of each of the at least two category communities according to any one search term in that community, to obtain the at least two category labels.
S240, establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points related to the at least two target search terms, and obtaining a structured label system of the interest points.
According to the scheme, at least two category communities are obtained by clustering the at least two target search terms; and determining the category labels according to the search terms in the category communities, thereby reducing the repetition rate of the category labels and improving the extraction efficiency of the category labels.
To improve the accuracy of the category labels, determining the category label of each community according to the search terms in that community to obtain the at least two category labels includes:
determining the category label of each of the at least two category communities according to the center search term of that community, to obtain the at least two category labels;
where the center search term is the search term located at the center of the community in the semantic vector space. The center search term describes the community more accurately.
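A minimal sketch of picking a center search term follows, assuming each term in a community already has a semantic vector. The two-dimensional vectors, the Euclidean distance, and the function name are illustrative assumptions; the text does not specify the embedding or the distance.

```python
import math


def center_search_term(vectors):
    """Return the term whose vector lies closest to the centroid of the community."""
    dims = len(next(iter(vectors.values())))
    centroid = [sum(v[i] for v in vectors.values()) / len(vectors) for i in range(dims)]
    return min(vectors, key=lambda term: math.dist(vectors[term], centroid))


# Hypothetical 2-D embeddings for one category community.
community = {
    "nearby buffet": [0.9, 0.1],
    "buffet restaurants": [0.8, 0.2],
    "all you can eat near me": [0.4, 0.6],
}
print(center_search_term(community))
```

Choosing an actual member of the community (rather than the centroid itself) guarantees that the label source is a real search term.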
Further, before clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities, the method further includes:
determining a search term to be calculated from the at least two target search terms according to the search heat of the target search terms;
and determining the similarity between the search term to be calculated and other target search terms to obtain the similarity between the at least two target search terms.
The higher the search heat of a target search term, the more user attention the corresponding category label receives. The search term to be calculated is therefore determined according to search heat, and the at least two target search terms are clustered based on the similarity between the search term to be calculated and the other target search terms, so that the category labels users care about are extracted and the accuracy of the category labels is improved.
To avoid missing category labels that users care about, determining a search term to be calculated from the at least two target search terms according to the search heat of the target search terms includes:
sorting the at least two target search terms according to their search heat;
taking each of the at least two target search terms in turn, in the sorted order, as the search term to be calculated.
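One way to read the heat-ordered procedure is as a greedy grouping: visit terms from hottest to coldest, let each still-unassigned term seed a community, and attach the remaining terms that are similar enough to the seed. The sketch below follows that reading. The similarity threshold, the Jaccard measure, and all names and sample values are assumptions and are not stated in the text.

```python
def jaccard(a, b):
    """Overlap ratio of two POI sets (assumed similarity measure)."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def cluster_by_heat(heat, clicked, threshold=0.5):
    """Greedy grouping: hotter terms are considered first as 'search terms to be calculated'."""
    ordered = sorted(heat, key=heat.get, reverse=True)
    assigned = set()
    communities = []
    for seed in ordered:
        if seed in assigned:
            continue
        community = [seed]
        assigned.add(seed)
        for other in ordered:
            if other not in assigned and jaccard(clicked[seed], clicked[other]) >= threshold:
                community.append(other)
                assigned.add(other)
        communities.append(community)
    return communities


heat = {"nearby buffet": 90, "buffet near me": 40, "sichuan food": 70}
clicked = {
    "nearby buffet": {"p1", "p2", "p3"},
    "buffet near me": {"p1", "p2"},
    "sichuan food": {"p7", "p8"},
}
communities = cluster_by_heat(heat, clicked)
print(communities)
```

Because every term is eventually visited in sorted order, no term is left without a community, which matches the stated goal of not missing labels users care about.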
FIG. 4 is a flowchart of another label system construction method provided in an embodiment of the present application. Building on the preceding scheme, this scheme refines the step of determining at least two target search terms describing point of interest category labels. Referring to FIG. 4, the label system construction method provided by this scheme includes:
S310, determining the target search terms from the historical search terms according to the point of interest information in the click data associated with the historical search terms and the point of interest information in the historical search terms themselves.
Specifically, the point of interest information in a historical search term may be whether the historical search term contains the name of a point of interest.
In one embodiment, the determining the target search term from the historical search terms according to the information of the interest points included in the click data associated with the historical search terms and the information of the interest points included in the historical search terms comprises:
if the click data associated with a historical search term includes a point of interest, taking the historical search term as a candidate search term;
if the candidate search term contains only information other than a point of interest name, taking the candidate search term as a target search term.
A target search term determined in this way is a generalized search term describing points of interest, and the category labels of points of interest can be extracted from such generalized search terms.
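The two-stage filter can be sketched as follows. The substring test for "contains a POI name", the dictionary shape of the log, and the sample names are illustrative assumptions; a production system would more plausibly use entity recognition than substring matching.

```python
def is_generalized(term, poi_names):
    """True if the term mentions no known POI name (simple case-insensitive substring check)."""
    lowered = term.lower()
    return not any(name.lower() in lowered for name in poi_names)


def select_target_terms(history, poi_names):
    """Stage 1: keep terms with clicked POIs. Stage 2: drop terms that name a specific POI."""
    candidates = [term for term, clicks in history.items() if clicks]
    return [term for term in candidates if is_generalized(term, poi_names)]


# Hypothetical log: search term -> POIs clicked in its results.
history = {
    "nearby food": ["Golden Wok"],
    "Golden Wok opening hours": ["Golden Wok"],
    "bus timetable": [],
}
targets = select_target_terms(history, poi_names={"Golden Wok"})
print(targets)
```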
S320, determining at least two category labels according to the at least two target search terms.
S330, establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points associated with the at least two target search terms to obtain a structured label system of the interest points.
According to this scheme, the target search terms are determined from the historical search terms according to the point of interest information in the click data associated with the historical search terms and the point of interest information in the historical search terms themselves, so that generalized search terms describing points of interest are extracted as the target search terms. Because such generalized search terms describe the category information of points of interest, category labels determined from them are more accurate.
FIG. 5 is a flowchart of another label system construction method provided in an embodiment of the present application. This scheme extends the preceding scheme. Referring to FIG. 5, the scheme includes:
S410, determining at least two target search terms describing point of interest category labels.
S420, determining at least two category labels according to the at least two target search terms.
S430, establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points associated with the at least two target search terms, and obtaining a structured label system of the interest points.
S440, inputting at least one of the name, the position information, the signboard text information, the comment information and the commodity category information of the interest point to be labeled into a pre-trained labeling model, and outputting the structured label of the interest point to be labeled.
The labeling model is trained on sample data annotated with labels from the structured label system.
In one embodiment, the signboard text information and the commodity category information may be obtained by recognizing an exterior storefront image and/or an interior merchandise image of the point of interest to be labeled.
The embodiments of the present application do not limit the entity that executes the above steps. Optionally, S440 may be executed by a different entity from the other steps.
According to the scheme, at least one of the name, the position information, the signboard text information, the comment information and the commodity category information of the interest point to be marked is input into a pre-trained marking model, and the label of the interest point to be marked is output, so that the automatic marking of the label of the interest point to be marked is realized.
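The train-then-annotate interface of S440 can be sketched with a deliberately simple stand-in model. The text does not describe the model architecture, so the word-count scorer below is only a placeholder for whatever pre-trained model is actually used; the field names and samples are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical feature fields of a point of interest.
FIELDS = ("name", "signboard_text", "comments", "goods_categories")


def tokens(poi):
    """Lower-cased words drawn from the available text fields of a POI."""
    return " ".join(str(poi.get(key, "")) for key in FIELDS).lower().split()


def train(samples):
    """Count words per structured label from (poi, label) training pairs."""
    model = defaultdict(Counter)
    for poi, label in samples:
        model[label].update(tokens(poi))
    return model


def annotate(model, poi):
    """Return the label whose training words best match the POI's words."""
    words = tokens(poi)
    return max(model, key=lambda label: sum(model[label][w] for w in words))


samples = [
    ({"name": "Lao Sichuan", "comments": "spicy mapo tofu"}, "Sichuan dishes"),
    ({"name": "Ocean Buffet", "signboard_text": "seafood buffet",
      "comments": "unlimited crab"}, "seafood buffet"),
]
model = train(samples)
label = annotate(model, {"name": "Crab House", "comments": "great seafood and crab"})
print(label)
```

The point of the sketch is the data flow: labels from the structured label system annotate the training samples, and the trained model maps POI attributes back to one of those labels.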
FIG. 6 is a flowchart of another label system construction method provided in an embodiment of the present application. Building on the preceding scheme, this scheme details the step of establishing a parent-child relationship between the at least two category labels according to the distribution information of the target points of interest associated with the at least two target search terms to obtain a structured label system of the points of interest. Referring to FIG. 6, the scheme includes:
S510, determining at least two target search terms describing point of interest category labels.
S520, determining at least two category labels according to the at least two target search terms.
S530, determining the inclusion relationship between the distributions of the target points of interest associated with the at least two target search terms, according to the distribution information of the target points of interest.
S540, determining a parent-child relationship between the at least two category labels according to the inclusion relationship.
S550, connecting the at least two category labels according to the determined parent-child relationship to obtain the structured label system.
Illustratively, suppose the first of the at least two target search terms is "nearby food" and the second is "nearby buffet". The target points of interest associated with the first term are a restaurant, a barbecue buffet and a hot pot buffet, and those associated with the second term are the same barbecue buffet and hot pot buffet. Because the points of interest associated with the first term include those associated with the second, the category label "buffet" corresponding to the second term can be determined to be a lower-level description of the category label "food" corresponding to the first term. Connecting "food" and "buffet" accordingly yields the structured label system.
According to this scheme, the inclusion relationship between the distributions of the target points of interest associated with the at least two target search terms is determined according to the distribution information of the target points of interest; the parent-child relationship between the at least two category labels is determined according to the inclusion relationship; and the at least two category labels are connected according to the determined parent-child relationship to obtain the structured label system, thereby establishing the hierarchy among the category labels.
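Steps S530 to S550 can be sketched with set inclusion over clicked POIs. The strict-subset test and the pruning of indirect (grandparent) links are assumptions about how inclusion is turned into a tree; the function name and POI identifiers are hypothetical.

```python
def direct_parent_child_pairs(label_pois):
    """Derive (parent, child) edges: a child's POI set is a strict subset of its parent's."""
    contains = {
        (a, b)
        for a, pois_a in label_pois.items()
        for b, pois_b in label_pois.items()
        if pois_b < pois_a  # strict subset, so a label is never its own parent
    }
    labels = list(label_pois)
    # Keep only direct edges: drop (a, c) when some b sits between a and c.
    return sorted(
        (a, c)
        for a, c in contains
        if not any((a, b) in contains and (b, c) in contains for b in labels)
    )


label_pois = {
    "food": {"restaurant_a", "bbq_buffet_b", "hotpot_buffet_c"},
    "buffet": {"bbq_buffet_b", "hotpot_buffet_c"},
    "barbecue buffet": {"bbq_buffet_b"},
}
edges = direct_parent_child_pairs(label_pois)
print(edges)
```

With real click data, exact inclusion is unlikely to hold, so a tolerance (for example, "at least 90% of the child's POIs appear under the parent") would probably be needed; that tolerance is not part of this sketch.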
This scheme is an optional scheme provided on the basis of the preceding schemes. It includes a structured label system construction stage, an unstructured label system construction stage, and a label annotation stage.
Referring to FIG. 7, the structured label system construction stage can be described as follows:
based on a random walk algorithm, determining target interest points related to target search terms according to a click relation network formed by the target search terms and the target interest points;
determining the similarity between different target search terms according to the overlapping relation of the different target search terms and the associated target interest points;
clustering target search terms based on a k-means clustering method according to the similarity between different target search terms to obtain at least two category communities;
determining a category label of each category community according to the target search term of the vector space center position of each category community;
determining parent-child relationships among the category labels according to the inclusion relationship of the target interest point distribution associated with the target search term; and establishing a structured label system of the interest points according to the parent-child relationship.
For example, the set of points of interest clicked by users who searched for the label "Chinese restaurant" includes the sets of points of interest clicked for the two labels "Sichuan dishes" and "Northeast dishes", while the click sets for "Sichuan dishes" and "Northeast dishes" are independent of each other. It follows that Sichuan dishes and Northeast dishes both belong to the Chinese restaurant category. Proceeding layer by layer in this way yields the hierarchy of the structured label system.
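The first step of this stage, a random walk over the click relation network, can be sketched as below. The text does not give the walk's parameters, so the alternating term-to-POI-to-term walk, the step and walk counts, and all names and sample data are assumptions.

```python
import random
from collections import Counter, defaultdict


def random_walk_pois(term_to_pois, start_term, steps=3, walks=200, seed=0):
    """Count how often each POI is reached by short walks starting from one search term."""
    rng = random.Random(seed)
    poi_to_terms = defaultdict(set)
    for term, pois in term_to_pois.items():
        for poi in pois:
            poi_to_terms[poi].add(term)
    visits = Counter()
    for _ in range(walks):
        term = start_term
        for _ in range(steps):
            poi = rng.choice(sorted(term_to_pois[term]))   # term -> clicked POI
            visits[poi] += 1
            term = rng.choice(sorted(poi_to_terms[poi]))   # POI -> a term that led to it
    return visits


# Hypothetical click relation network: search term -> clicked POIs.
clicks = {
    "nearby food": {"restaurant_a", "bbq_buffet_b", "hotpot_buffet_c"},
    "nearby buffet": {"bbq_buffet_b", "hotpot_buffet_c"},
    "clothing store": {"boutique_d"},
}
visits = random_walk_pois(clicks, "nearby buffet")
print(visits.most_common())
```

POIs in a disconnected part of the network (here the clothing boutique) are never reached, while POIs clicked under related terms can be, which is what lets the walk associate a term with more than its directly clicked POIs.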
Referring to FIG. 8, the unstructured tag architecture build phase can be described as follows:
words of the sentiment description class are extracted from web pages, reviews and comments related to the set of points of interest, to serve as unstructured labels of the points of interest.
For example, in restaurant-related web pages, reviews and comments, sentiment-describing expressions mainly include "very good", "good taste" and "especially like a certain dish at the restaurant". Phrases with sentiment tendency are extracted by a language component model, phrases of the same class are then merged and de-duplicated, and the most frequent ones are taken as the unstructured labels of that class.
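The merge, de-duplicate and rank step can be sketched as a frequency count, assuming the sentiment phrases have already been extracted by some upstream model. The normalization rule (lower-casing and collapsing whitespace), the cut-off `k`, and the sample phrases are illustrative assumptions.

```python
from collections import Counter


def top_unstructured_labels(phrases, k=2):
    """Normalize phrases, merge duplicates, and return the k most frequent ones."""
    normalized = [" ".join(phrase.lower().split()) for phrase in phrases]
    return [phrase for phrase, _ in Counter(normalized).most_common(k)]


# Phrases as they might come out of an upstream extraction step (made-up data).
extracted = ["Good taste", "very good", "good  taste", "Very good", "good taste", "friendly staff"]
labels = top_unstructured_labels(extracted)
print(labels)
```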
Referring to FIG. 9, the label annotation stage can be described as follows:
inputting at least one of the name, the position information, the signboard text information, the comment information and the commodity category information of the interest point to be labeled into a pre-trained labeling model, and outputting a structured label and/or an unstructured label of the interest point to be labeled;
and the labeling model is obtained by training sample data labeled with the structured label and/or the unstructured label.
The signboard text information and the commodity category information can be obtained by recognizing an exterior storefront image and/or an interior commodity image of the interest point to be labeled.
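The "at least one of" input contract of the labeling stage can be sketched as below. The labeling model itself is not described in enough detail to reproduce, so `toy_model` is a hypothetical keyword rule standing in for the pre-trained model; `PoiFeatures`, `to_model_input` and `label_poi` are likewise illustrative names, and joining the fields into one string is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PoiFeatures:
    """Optional inputs for one interest point to be labeled."""
    name: Optional[str] = None
    position: Optional[str] = None
    signboard_text: Optional[str] = None
    comments: Optional[str] = None
    commodity_category: Optional[str] = None

    def to_model_input(self) -> str:
        """Join whichever fields are present; at least one is required."""
        fields = [self.name, self.position, self.signboard_text,
                  self.comments, self.commodity_category]
        present = [f for f in fields if f]
        if not present:
            raise ValueError("at least one feature is required")
        return " | ".join(present)


def label_poi(poi: PoiFeatures, model: Callable[[str], List[str]]) -> List[str]:
    """Run the (pre-trained) labeling model on the assembled input."""
    return model(poi.to_model_input())


def toy_model(text: str) -> List[str]:
    # Hypothetical stand-in for the pre-trained labeling model.
    return ["Sichuan dish"] if "sichuan" in text.lower() else []


result = label_poi(PoiFeatures(name="Old Sichuan Hotpot"), toy_model)
```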
In one embodiment, an application scenario of the present solution is as follows: a user searches for nearby restaurants in a map application; the map application screens out the interest points that meet the requirements based on the tags of the interest points and feeds them back to the user. The feedback information is shown in fig. 10.
In this scheme, the structured label system is determined automatically from the inclusion relationships among the sets of interest points that users click after entering search terms. The structured label system can also be updated promptly as new search term information arrives, which improves the accuracy of the label system.
The label of the interest point to be labeled is determined according to the name, the position information, the signboard text information, the comment information and the commodity category information of the interest point to be labeled, so that the label labeling accuracy is improved.
Fig. 11 is a schematic structural diagram of a label architecture building apparatus according to an embodiment of the present application. Referring to fig. 11, the label system building apparatus 1100 provided in this embodiment includes: a search term determination module 1101, a tag determination module 1102, and a relationship establishment module 1103.
Wherein, the search term determining module 1101 is configured to determine at least two target search terms describing the interest point category tag;
a tag determining module 1102, configured to determine at least two category tags according to the at least two target search terms;
a relationship establishing module 1103, configured to establish a parent-child relationship between the at least two category tags according to distribution information of the target interest points associated with the at least two target search terms, so as to obtain a structured tag system of the interest points.
According to the technical scheme of the embodiment of the application, the category label of the interest point is determined according to the target search term; and determining the parent-child relationship among the category labels according to the distribution information of the target interest points associated with the target search terms, thereby realizing the automatic determination of the structured label system. Because the scheme does not need human participation, the introduction of artificial subjective factors is avoided, the determining cost of the label system is reduced, and the accuracy of the label system is improved.
Further, the tag determination module includes:
the clustering unit is used for clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities;
and the label determining unit is used for determining the category labels of the community according to the search terms of the community in the at least two category communities to obtain the at least two category labels.
Further, the tag determination unit is specifically configured to:
determining category labels of the communities according to center search terms of the communities in the at least two category communities to obtain the at least two category labels;
the central search term refers to a search term located at the central position of the semantic vector space of the community.
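The clustering unit and the choice of a central search term can be illustrated with a small sketch. The text does not name a clustering algorithm, so connected components of a thresholded cosine-similarity graph are used here purely as a stand-in; the two-dimensional vectors, the threshold of 0.8 and the function names are all assumptions.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_terms(vectors: Dict[str, Vector], threshold: float = 0.8) -> List[List[str]]:
    """Group terms into communities: connected components of the graph
    that links two terms when their cosine similarity >= threshold."""
    unvisited = set(vectors)
    communities: List[List[str]] = []
    for term in vectors:  # insertion order keeps the output deterministic
        if term not in unvisited:
            continue
        unvisited.discard(term)
        stack, community = [term], []
        while stack:
            current = stack.pop()
            community.append(current)
            neighbors = [t for t in unvisited
                         if cosine(vectors[current], vectors[t]) >= threshold]
            unvisited.difference_update(neighbors)
            stack.extend(neighbors)
        communities.append(sorted(community))
    return communities


def center_term(community: List[str], vectors: Dict[str, Vector]) -> str:
    """Return the term whose vector is closest to the community centroid."""
    dim = len(vectors[community[0]])
    centroid = [sum(vectors[t][i] for t in community) / len(community)
                for i in range(dim)]
    return max(community, key=lambda t: cosine(vectors[t], centroid))


# Invented 2-D semantic vectors for illustration only.
vectors = {
    "hot pot": [1.0, 0.1],
    "spicy hot pot": [0.9, 0.2],
    "hotpot restaurant": [0.95, 0.0],
    "coffee": [0.0, 1.0],
    "cafe": [0.1, 0.9],
}
communities = cluster_terms(vectors)
category_labels = [center_term(c, vectors) for c in communities]
```

In this toy data the first community is labeled by "hot pot", the term nearest the centroid of its three members.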
Further, the apparatus further comprises:
the search term selection module is used for determining a search term to be calculated from the at least two target search terms according to the search heat of the target search terms before clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities;
and the similarity determining module is used for determining the similarity between the search term to be calculated and other target search terms to obtain the similarity between the at least two target search terms.
Further, the search term selection module includes:
the search term ordering unit is used for ordering the at least two target search terms according to the search heat of the target search terms;
and the search term selection unit is used for respectively taking the at least two target search terms as the search terms to be calculated according to the sorting result.
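One way to read the heat-ordered selection is sketched below: terms are sorted by search heat, each is taken in turn as the search term to be calculated, and its similarity to the remaining terms is computed. Comparing each term only with lower-ranked terms (so that no pair is computed twice) is an assumption of this sketch, as are the function names, the token-overlap similarity and the heat values.

```python
from typing import Callable, Dict, Tuple


def pairwise_similarity_by_heat(
    heat: Dict[str, int],
    similarity: Callable[[str, str], float],
) -> Dict[Tuple[str, str], float]:
    """Visit terms from hottest to coldest and score each against the
    terms ranked below it."""
    ranked = sorted(heat, key=lambda t: heat[t], reverse=True)
    scores: Dict[Tuple[str, str], float] = {}
    for i, term in enumerate(ranked):
        for other in ranked[i + 1:]:
            scores[(term, other)] = similarity(term, other)
    return scores


def token_jaccard(a: str, b: str) -> float:
    # Hypothetical stand-in for a semantic similarity measure.
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Invented search heat values for illustration only.
heat = {"hot pot": 900, "spicy hot pot": 300, "coffee": 500}
scores = pairwise_similarity_by_heat(heat, token_jaccard)
```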
Further, the search term determination module includes:
and the search term determining unit is used for determining the target search term from the historical search terms according to the information of the interest points in the click data associated with the historical search terms and the information of the interest points in the historical search terms.
Further, the search term determination unit is specifically configured to:
if the associated click data of the historical search term comprises the interest point, taking the historical search term as a candidate search term;
and if the candidate search term only comprises the information except the interest point name, taking the candidate search term as the target search term.
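The two filtering conditions can be sketched as follows. The data layout (a mapping from each historical search term to the names of the results clicked for it), the made-up interest point names and the case-sensitive substring check are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def select_target_terms(click_log: Dict[str, List[str]], poi_names: Set[str]) -> List[str]:
    """Keep historical terms whose clicks include an interest point and
    whose text contains no interest point name."""
    targets: List[str] = []
    for term, clicked in click_log.items():
        # Candidate: at least one clicked result is a known interest point.
        if not any(c in poi_names for c in clicked):
            continue
        # Target: the term carries only information other than a POI name.
        if any(name in term for name in poi_names):
            continue
        targets.append(term)
    return targets


# Invented interest point names and click log for illustration only.
poi_names = {"Golden Wok", "Bean House"}
click_log = {
    "hot pot": ["Golden Wok"],
    "Golden Wok near me": ["Golden Wok"],
    "weather today": [],
}
target_terms = select_target_terms(click_log, poi_names)
```

The term "Golden Wok near me" is a candidate but is dropped because it names a specific interest point rather than describing a category.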
Further, the apparatus further comprises:
the label labeling module is used for, after the parent-child relationship between the at least two category labels is established according to the distribution information of the target interest points associated with the at least two target search terms and the structured label system of the interest points is obtained, inputting at least one of the name, the position information, the signboard text information, the comment information and the commodity category information of the interest point to be labeled into a pre-trained labeling model, and outputting the structured label of the interest point to be labeled;
and the labeling model is obtained by utilizing label training in the structured label system.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 12 is a block diagram of an electronic device for implementing the label system construction method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 12, the electronic device includes: one or more processors 1201, memory 1202, and interfaces for connecting the various components, including a high speed interface and a low speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 12 takes one processor 1201 as an example.
The memory 1202 is a non-transitory computer readable storage medium, and can be used for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the tag architecture construction method in the embodiment of the present application (for example, the search term determination module 1101, the tag determination module 1102, and the relationship establishment module 1103 shown in fig. 11). The processor 1201 executes various functional applications of the server and data processing by executing non-transitory software programs, instructions, and modules stored in the memory 1202, that is, implements the tag architecture construction method in the above method embodiment.
The memory 1202 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created from use of the electronic device for label system construction, and the like. Further, the memory 1202 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 1202 may optionally include memory located remotely from the processor 1201, which may be connected to the electronic device for label system construction via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for implementing the label system construction method may further include: an input device 1203 and an output device 1204. The processor 1201, the memory 1202, the input device 1203, and the output device 1204 may be connected by a bus or other means, and the bus connection is taken as an example in fig. 12.
The input device 1203 may receive input numeric or character information and generate key signal inputs related to user settings and function controls of the tag architecture-building electronic device, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or other input device. The output devices 1204 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technology of the application, the determination cost of the label system is reduced, and the accuracy of the label system is improved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (18)
1. A label system construction method comprises the following steps:
determining at least two target search terms describing a point of interest category label;
determining at least two category labels according to the at least two target search terms;
and establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points associated with the at least two target search terms to obtain a structured label system of the interest points.
2. The method of claim 1, wherein said determining at least two category labels from said at least two target search terms comprises:
clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities;
and determining the category labels of the community according to the search terms of the community in the at least two category communities to obtain the at least two category labels.
3. The method according to claim 2, wherein the determining the category labels of the community according to the search terms of the community in the at least two category communities to obtain the at least two category labels comprises:
determining category labels of the communities according to center search terms of the communities in the at least two category communities to obtain the at least two category labels;
the central search term refers to a search term located at the central position of the semantic vector space of the community.
4. The method according to claim 2, wherein before clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities, the method further comprises:
determining a search term to be calculated from the at least two target search terms according to the search heat of the target search terms;
and determining the similarity between the search term to be calculated and other target search terms to obtain the similarity between the at least two target search terms.
5. The method of claim 4, wherein the determining a search term to be computed from the at least two target search terms according to the search heat of the target search term comprises:
sequencing the at least two target search terms according to the search heat of the target search terms;
and according to the sorting result, respectively taking the at least two target search terms as the search terms to be calculated.
6. The method of any of claims 1-5, wherein the determining at least two target search terms that describe a point of interest category label comprises:
and determining the target search term from the historical search terms according to the information of the interest points in the click data associated with the historical search terms and the information of the interest points in the historical search terms.
7. The method of claim 6, wherein determining the target search term from the historical search terms according to the click data associated with the historical search terms including information of points of interest and the historical search terms including information of points of interest comprises:
if the associated click data of the historical search term comprises the interest point, taking the historical search term as a candidate search term;
and if the candidate search term only comprises the information except the interest point name, taking the candidate search term as the target search term.
8. The method according to any one of claims 1 to 5, wherein after the parent-child relationship between the at least two category tags is established according to the distribution information of the target interest points associated with the at least two target search terms, and a structured tag system of the interest points is obtained, the method further includes:
inputting at least one of the name, the position information, the signboard text information, the comment information and the commodity category information of the interest point to be labeled into a pre-trained labeling model, and outputting a structured label of the interest point to be labeled;
and the labeling model is obtained by utilizing label training in the structured label system.
9. A label architecture building apparatus comprising:
the search term determining module is used for determining at least two target search terms describing interest point category labels;
the label determining module is used for determining at least two category labels according to the at least two target search terms;
and the relationship establishing module is used for establishing a parent-child relationship between the at least two category labels according to the distribution information of the target interest points associated with the at least two target search terms to obtain a structured label system of the interest points.
10. The apparatus of claim 9, wherein the tag determination module comprises:
the clustering unit is used for clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities;
and the label determining unit is used for determining the category labels of the community according to the search terms of the community in the at least two category communities to obtain the at least two category labels.
11. The apparatus of claim 10, wherein the tag determination unit is specifically configured to:
determining category labels of the communities according to center search terms of the communities in the at least two category communities to obtain the at least two category labels;
the central search term refers to a search term located at the central position of the semantic vector space of the community.
12. The apparatus of claim 10, wherein the apparatus further comprises:
the search term selection module is used for determining a search term to be calculated from the at least two target search terms according to the search heat of the target search terms before clustering the at least two target search terms according to the similarity between the at least two target search terms to obtain at least two category communities;
and the similarity determining module is used for determining the similarity between the search term to be calculated and other target search terms to obtain the similarity between the at least two target search terms.
13. The apparatus of claim 12, wherein the search term selection module comprises:
the search term ordering unit is used for ordering the at least two target search terms according to the search heat of the target search terms;
and the search term selection unit is used for respectively taking the at least two target search terms as the search terms to be calculated according to the sorting result.
14. The apparatus of any of claims 9-13, wherein the search term determination module comprises:
and the search term determining unit is used for determining the target search term from the historical search terms according to the information of the interest points in the click data associated with the historical search terms and the information of the interest points in the historical search terms.
15. The apparatus of claim 14, wherein the search term determination unit is specifically configured to:
if the associated click data of the historical search term comprises the interest point, taking the historical search term as a candidate search term;
and if the candidate search term only comprises the information except the interest point name, taking the candidate search term as the target search term.
16. The apparatus according to any one of claims 9-13, the apparatus further comprising:
the label labeling module is used for, after the parent-child relationship between the at least two category labels is established according to the distribution information of the target interest points associated with the at least two target search terms and the structured label system of the interest points is obtained, inputting at least one of the name, the position information, the signboard text information, the comment information and the commodity category information of the interest point to be labeled into a pre-trained labeling model, and outputting the structured label of the interest point to be labeled;
and the labeling model is obtained by utilizing label training in the structured label system.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010556296.0A CN111753195B (en) | 2020-06-17 | 2020-06-17 | Label system construction method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010556296.0A CN111753195B (en) | 2020-06-17 | 2020-06-17 | Label system construction method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111753195A (en) | 2020-10-09 |
CN111753195B (en) | 2024-01-09 |
Family
ID=72674736
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010556296.0A Active CN111753195B (en) | 2020-06-17 | 2020-06-17 | Label system construction method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111753195B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112328896A (en) * | 2020-11-26 | 2021-02-05 | 北京百度网讯科技有限公司 | Method, apparatus, electronic device, and medium for outputting information |
CN113139110A (en) * | 2021-04-28 | 2021-07-20 | 北京百度网讯科技有限公司 | Regional feature processing method, device, equipment, storage medium and program product |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103069849A (en) * | 2010-08-13 | 2013-04-24 | 诺基亚公司 | Method and apparatus for secure shared personal map layer |
US20140244651A1 (en) * | 2013-02-25 | 2014-08-28 | Telenav, Inc. | Navigation system with data driven category label creation mechanism and method of operation thereof |
US8880535B1 (en) * | 2011-11-29 | 2014-11-04 | Google Inc. | System and method for selecting user generated content related to a point of interest |
CN107133263A (en) * | 2017-03-31 | 2017-09-05 | 百度在线网络技术(北京)有限公司 | POI recommends method, device, equipment and computer-readable recording medium |
US20170357381A1 (en) * | 2016-06-10 | 2017-12-14 | Apple Inc. | Labeling a significant location based on contextual data |
CN108876509A (en) * | 2018-05-11 | 2018-11-23 | 上海赢科信息技术有限公司 | Utilize the method and system of POI analysis user tag |
CN109145219A (en) * | 2018-09-10 | 2019-01-04 | 百度在线网络技术(北京)有限公司 | Point of interest Effective judgement method and apparatus based on internet text mining |
CN109376205A (en) * | 2018-09-07 | 2019-02-22 | 顺丰科技有限公司 | Excavate method, apparatus, equipment and the storage medium of address point of interest relationship |
CN110263248A (en) * | 2019-05-21 | 2019-09-20 | 平安科技(深圳)有限公司 | A kind of information-pushing method, device, storage medium and server |
CN110390054A (en) * | 2019-07-25 | 2019-10-29 | 北京百度网讯科技有限公司 | Point of interest recalls method, apparatus, server and storage medium |
CN110472163A (en) * | 2019-08-22 | 2019-11-19 | 百度在线网络技术(北京)有限公司 | Map search result shows determining method, apparatus, electronic equipment and medium |
CN110674349A (en) * | 2019-09-27 | 2020-01-10 | 北京字节跳动网络技术有限公司 | Video POI (Point of interest) identification method and device and electronic equipment |
-
2020
- 2020-06-17 CN CN202010556296.0A patent/CN111753195B/en active Active
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103069849A (en) * | 2010-08-13 | 2013-04-24 | 诺基亚公司 | Method and apparatus for secure shared personal map layer |
US8880535B1 (en) * | 2011-11-29 | 2014-11-04 | Google Inc. | System and method for selecting user generated content related to a point of interest |
US20140244651A1 (en) * | 2013-02-25 | 2014-08-28 | Telenav, Inc. | Navigation system with data driven category label creation mechanism and method of operation thereof |
US20170357381A1 (en) * | 2016-06-10 | 2017-12-14 | Apple Inc. | Labeling a significant location based on contextual data |
CN107133263A (en) * | 2017-03-31 | 2017-09-05 | 百度在线网络技术(北京)有限公司 | POI recommends method, device, equipment and computer-readable recording medium |
CN108876509A (en) * | 2018-05-11 | 2018-11-23 | 上海赢科信息技术有限公司 | Utilize the method and system of POI analysis user tag |
CN109376205A (en) * | 2018-09-07 | 2019-02-22 | 顺丰科技有限公司 | Excavate method, apparatus, equipment and the storage medium of address point of interest relationship |
CN109145219A (en) * | 2018-09-10 | 2019-01-04 | 百度在线网络技术(北京)有限公司 | Point of interest Effective judgement method and apparatus based on internet text mining |
US20200081908A1 (en) * | 2018-09-10 | 2020-03-12 | Baidu Online Network Technology (Beijing) Co., Ltd. | Internet text mining-based method and apparatus for judging validity of point of interest |
CN110263248A (en) * | 2019-05-21 | 2019-09-20 | 平安科技(深圳)有限公司 | A kind of information-pushing method, device, storage medium and server |
CN110390054A (en) * | 2019-07-25 | 2019-10-29 | 北京百度网讯科技有限公司 | Point of interest recalls method, apparatus, server and storage medium |
CN110472163A (en) * | 2019-08-22 | 2019-11-19 | 百度在线网络技术(北京)有限公司 | Map search result shows determining method, apparatus, electronic equipment and medium |
CN110674349A (en) * | 2019-09-27 | 2020-01-10 | 北京字节跳动网络技术有限公司 | Video POI (Point of interest) identification method and device and electronic equipment |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112328896A (en) * | 2020-11-26 | 2021-02-05 | 北京百度网讯科技有限公司 | Method, apparatus, electronic device, and medium for outputting information |
CN112328896B (en) * | 2020-11-26 | 2024-03-15 | 北京百度网讯科技有限公司 | Method, apparatus, electronic device, and medium for outputting information |
CN113139110A (en) * | 2021-04-28 | 2021-07-20 | 北京百度网讯科技有限公司 | Regional feature processing method, device, equipment, storage medium and program product |
CN113139110B (en) * | 2021-04-28 | 2023-09-22 | 北京百度网讯科技有限公司 | Regional characteristic processing method, regional characteristic processing device, regional characteristic processing equipment, storage medium and program product |
Also Published As
Publication number | Publication date |
---|---|
CN111753195B (en) | 2024-01-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11714816B2 (en) | Information search method and apparatus, device and storage medium | |
CN110955764B (en) | Scene knowledge graph generation method, man-machine conversation method and related equipment | |
CN111522967B (en) | Knowledge graph construction method, device, equipment and storage medium | |
CN111782977B (en) | Point-of-interest processing method, device, equipment and computer readable storage medium | |
CN111522994A (en) | Method and apparatus for generating information | |
CN112100524B (en) | Information recommendation method, device, equipment and storage medium | |
CN111814077B (en) | Information point query method, device, equipment and medium | |
CN112650907A (en) | Search word recommendation method, target model training method, device and equipment | |
CN111026937A (en) | Method, device and equipment for extracting POI name and computer storage medium | |
CN111246257B (en) | Video recommendation method, device, equipment and storage medium | |
CN111783468A (en) | Text processing method, device, equipment and medium | |
CN112559901B (en) | Resource recommendation method and device, electronic equipment, storage medium and computer program product | |
CN111597433A (en) | Resource searching method and device and electronic equipment | |
CN111737430B (en) | Entity linking method, device, equipment and storage medium | |
CN111091006A (en) | Entity intention system establishing method, device, equipment and medium | |
CN111339406A (en) | Personalized recommendation method, device, equipment and storage medium | |
CN111506803A (en) | Content recommendation method and device, electronic equipment and storage medium | |
CN111522940A (en) | Method and device for processing comment information | |
JP7206514B2 (en) | Method for sorting geolocation points, training method for sorting model, and corresponding device | |
CN111753195A (en) | Label system construction method, device, equipment and storage medium | |
CN112328896A (en) | Method, apparatus, electronic device, and medium for outputting information | |
CN112100454A (en) | Searching method, searching device, electronic equipment and readable storage medium | |
CN111984876A (en) | Interest point processing method, device, equipment and computer readable storage medium | |
CN111984883A (en) | Label mining method, device, equipment and storage medium | |
CN113590914A (en) | Information processing method, device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||