CN110688850A - Catering type determination method and device - Google Patents


Publication number
CN110688850A
Authority
CN
China
Prior art keywords
vector
entity
relationship
extracted
statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910892671.6A
Other languages
Chinese (zh)
Inventor
范聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN201910892671.6A priority Critical patent/CN110688850A/en
Publication of CN110688850A publication Critical patent/CN110688850A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/12Hotels or restaurants

Landscapes

  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the present disclosure provides a catering type determination method, which includes: vectorizing a first entity and a second entity through Word2vec to generate a first set; vectorizing a pre-established relationship between a first entity and a second entity through TransE to generate a second set; and extracting, through remote supervision, from statements of a knowledge graph and network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set. According to the embodiment of the present disclosure, the relationship between each pair of entities in the entity set is extracted from the knowledge graph and the network encyclopedia data through remote supervision. Compared with the prior art, remote supervision depends little on manual work, so the extracted relationships can be greatly expanded and the extraction is more comprehensive, while the overall accuracy of the extracted relationships can be ensured. The relationship between the first entity and the second entity, namely the catering type corresponding to a catering name, can therefore be determined comprehensively and accurately in subsequent processing.

Description

Catering type determination method and device
Technical Field
The present disclosure relates to the field of data analysis technologies, and in particular, to a catering type determination method, a catering type determination apparatus, an electronic device, and a computer-readable storage medium.
Background
As the catering industry becomes more intelligent, users can be offered more convenient services and a better service experience. For example, when a user orders takeout or is out shopping and wants to eat a certain type of food, the user can enter the catering type to look up the catering names under that type. This service is mainly accomplished by classifying catering names.
However, at present, the catering type is mainly determined by manual labeling, that is, the catering names and catering types that are related are determined manually, and the association between them is then established and stored.
Manual labeling depends on human effort, whose efficiency is very limited, so the catering types and names that can be labeled are limited, and so are the determined relationships. Moreover, some catering names belong to several catering types, which manual labeling struggles to analyze comprehensively, so some relationships go unlabeled and relationships between catering names and catering types are omitted.
Disclosure of Invention
The present disclosure provides a catering type determination method, a catering type determination apparatus, an electronic device, and a computer-readable storage medium to overcome technical problems in the related art.
According to a first aspect of an embodiment of the present disclosure, a catering type determination method is provided, including:
vectorizing a first entity and a second entity through Word2vec to generate a first set of a first vector containing the first entity and a second vector containing the second entity, wherein the first entity comprises a restaurant name and the second entity comprises a restaurant type;
vectorizing a pre-established relationship between a first entity and a second entity through TransE to generate a second set of third vectors containing the relationship;
and extracting, through remote supervision, from statements of a knowledge graph and network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set.
Optionally, the extracting, through remote supervision, from statements of the knowledge graph and the network encyclopedia data, of the third vector in the second set that corresponds to the first vector and the second vector in the first set comprises:
determining, in the knowledge graph and the network encyclopedia data, a statement to be extracted that contains the first vector and the second vector;
and extracting, from the statement to be extracted, the third vector in the second set that corresponds to the first vector and the second vector in the first set.
Optionally, the method further comprises:
filtering noise in the extracted third vector.
Optionally, the filtering of noise in the extracted third vector comprises:
generating a statement to be analyzed according to the third vector corresponding to the target relationship and the first vector and the second vector corresponding to the first entity and the second entity in the first set conforming to the target relationship;
matrixing the statement to be analyzed to obtain a matrix of the statement to be analyzed;
inputting the matrix into a neural network to output the probability that the statement to be analyzed is noise;
and deleting the third vector corresponding to the statement with the probability greater than the preset probability from the extracted third vector.
Optionally, the neural network comprises a long short-term memory network layer, a fully connected layer and a softmax classifier.
According to a second aspect of the embodiments of the present disclosure, a catering type determination device is provided, including:
a first vectorization module, configured to vectorize a first entity and a second entity through Word2vec to generate a first set containing a first vector of the first entity and a second vector of the second entity, wherein the first entity comprises a restaurant name and the second entity comprises a restaurant type;
a second vectorization module, configured to vectorize, through TransE, a pre-established relationship between a first entity and a second entity to generate a second set of third vectors containing the relationship;
and a relationship extraction module, configured to extract, through remote supervision, from statements of a knowledge graph and network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set.
Optionally, the relationship extraction module includes:
a statement determining submodule, configured to determine, in the knowledge graph and the network encyclopedia data, a statement to be extracted that contains the first vector and the second vector;
and a relationship extraction submodule, configured to extract, from the statement to be extracted, the third vector in the second set that corresponds to the first vector and the second vector in the first set.
Optionally, the apparatus further comprises:
a noise filtering module, configured to filter noise in the extracted third vector.
Optionally, the noise filtering module includes:
a statement generation submodule, configured to generate a statement to be analyzed according to the third vector corresponding to the target relationship and the first vector and the second vector corresponding to the first entity and the second entity in the first set that conform to the target relationship;
a matrixing submodule, configured to matrix the statement to be analyzed to obtain a matrix of the statement to be analyzed;
a filtering submodule, configured to input the matrix into a neural network to output the probability that the statement to be analyzed is noise;
and a probability comparison submodule, configured to delete, from the extracted third vector, the third vector corresponding to a statement whose probability is greater than the preset probability.
Optionally, the neural network comprises a long short-term memory network layer, a fully connected layer and a softmax classifier.
According to a third aspect of the embodiments of the present disclosure, an electronic device is provided, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any of the above embodiments.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is proposed, on which a computer program is stored, which when executed by a processor implements the steps in the method according to any of the embodiments described above.
According to the embodiment of the present disclosure, the relationship between each pair of entities in the entity set is extracted from the knowledge graph and the network encyclopedia data through remote supervision. Compared with the prior art, remote supervision depends little on manual work, so the extracted relationships can be greatly expanded and the extraction is more comprehensive, while the overall accuracy of the extracted relationships can be ensured. The relationship between the first entity and the second entity, namely the catering type corresponding to a catering name, can therefore be determined comprehensively and accurately in subsequent processing.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic flow chart diagram illustrating a restaurant type determination method according to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating a restaurant type determination method according to an embodiment of the disclosure.
Fig. 3 is a schematic diagram illustrating a neural network, according to an embodiment of the present disclosure.
Fig. 4 is a corresponding schematic diagram of statements and entities and relationships shown in accordance with an embodiment of the present disclosure.
Fig. 5 is a schematic flow chart diagram illustrating another restaurant type determination method according to an embodiment of the present disclosure.
Fig. 6 is a schematic flow chart diagram illustrating yet another restaurant type determination method according to an embodiment of the present disclosure.
Fig. 7 is a schematic flow chart diagram illustrating yet another restaurant type determination method according to an embodiment of the present disclosure.
Fig. 8 is a schematic flow chart diagram illustrating yet another restaurant type determination method according to an embodiment of the present disclosure.
Fig. 9 is a hardware configuration diagram of a terminal or a server where the restaurant type determining apparatus is located according to an embodiment of the disclosure.
Fig. 10 is a schematic block diagram illustrating a restaurant type determination apparatus according to an embodiment of the present disclosure.
Fig. 11 is a schematic block diagram illustrating another restaurant type determination apparatus according to an embodiment of the present disclosure.
Fig. 12 is a schematic block diagram illustrating yet another restaurant type determination apparatus according to an embodiment of the present disclosure.
Fig. 13 is a schematic block diagram illustrating yet another restaurant type determination apparatus according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a schematic flow chart diagram illustrating a restaurant type determination method according to an embodiment of the present disclosure. The restaurant type determining method shown in this embodiment may be applicable to electronic devices, such as terminals of mobile phones, tablet computers, wearable devices, and the like, and may also be applicable to servers, where the servers may be servers in the takeaway industry or servers in the restaurant query industry, and thus, embodiments of the present disclosure are not limited thereto.
As shown in fig. 1, a restaurant type determining method according to an embodiment of the disclosure may include:
step S1, vectorizing a first entity and a second entity through Word2vec to generate a first set including a first vector of the first entity and a second vector of the second entity, wherein the first entity includes a restaurant name and the second entity includes a restaurant type;
step S2, vectorizing the pre-established relationship between the first entity and the second entity through Translating Embedding (TransE), so as to generate a second set of third vectors containing the relationship;
and step S3, extracting, through remote supervision, from statements of the knowledge graph and the network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set.
In one embodiment, relationships between a plurality of first entities and second entities may be constructed in advance, and the second set may be generated based on these relationships. The second set thus holds relationships between a plurality of restaurant names and a plurality of restaurant types; these relationships may be set manually and can therefore all be regarded as accurate.
In one embodiment, the first entities and second entities vectorized in step S1 are those whose relationship has not yet been determined, that is, the objects whose relationship is to be determined in this embodiment. The number of first and second entities with pre-established relationships may be smaller than the number of vectorized first and second entities; in other words, this embodiment determines the relationships among a large number of first and second entities based on the relationships among a small number of them.
The first entity and the second entity can be vectorized through Word2vec, and the pre-established relationship between the first entity and the second entity can be vectorized through TransE. Because a vector carries information in multiple dimensions (that is, multiple features), it characterizes the first entity, the second entity and their relationship more comprehensively. When the target relationship is later extracted from statements through remote supervision, the statements containing the first vector, the second vector and the third vector can therefore be determined accurately on the basis of these vectors.
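As an illustration of why a relationship vector is useful, TransE models a valid triple so that the first vector plus the relationship vector lies close to the second vector (h + r ≈ t). The following is a minimal sketch of that scoring idea only; the three-dimensional vectors are hand-picked, hypothetical values, and no training procedure from the disclosure is implied.

```python
import math

def transe_score(h, r, t):
    """Return the L2 distance ||h + r - t||; a lower value means a better fit."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Hypothetical embeddings, chosen by hand for illustration only.
name_vec = [0.2, 0.1, 0.5]        # first vector (restaurant name)
relation_vec = [0.3, 0.4, -0.1]   # third vector (relationship)
good_type = [0.5, 0.5, 0.4]       # second vector that satisfies h + r = t
bad_type = [-0.6, 0.9, 0.0]       # unrelated second vector

good = transe_score(name_vec, relation_vec, good_type)
bad = transe_score(name_vec, relation_vec, bad_type)
print(good, bad)
```

A triple whose score is small is one the embedding regards as plausible; in this toy example the matching type scores near zero and the unrelated type scores much higher.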
In one embodiment, a third vector expressing a relationship between a first entity and a second entity may be determined based on the second set. Through remote supervision, a statement is then determined from the knowledge graph and the network encyclopedia data that contains a first vector and a second vector belonging to the first set and a third vector belonging to the second set, and the third vector in the second set that corresponds to the first vector and the second vector is extracted from that statement.
For example, a relationship between the first entity and the second entity is established in advance, and it may be expressed with words such as "category is", "counts as" or "is classified as". The first entity includes a catering name, such as Sha County or Old Horse, and the second entity includes a catering type, such as snack, ramen or barbecue. Through remote supervision, a statement determined from the knowledge graph and the network encyclopedia data may include the phrase "Old Horse is a ramen shop from northwest China". Based on this statement, it may be determined that the first entity is Old Horse and the second entity is ramen, and the extracted relationship, that is, the third vector, is: the category of Old Horse is ramen. The relationship between the first entity and the second entity is thereby determined.
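The labeling heuristic behind remote supervision (usually called distant supervision in the English literature) can be sketched as follows: any statement that mentions both entities of a known triple is assumed to express that triple's relationship. The function name, the relation label `category_is` and the sample statements are hypothetical, and plain substring matching stands in for the vector-based matching described in the disclosure.

```python
def label_sentences(seed_triples, sentences):
    """Pair each sentence with every seed triple whose two entities it mentions."""
    labeled = []
    for sentence in sentences:
        for name, relation, food_type in seed_triples:
            # Distant-supervision assumption: co-occurrence implies the relation.
            if name in sentence and food_type in sentence:
                labeled.append((name, relation, food_type, sentence))
    return labeled

# Hypothetical seed relationship and corpus.
seed_triples = [("Old Horse", "category_is", "ramen")]
sentences = [
    "Old Horse is a ramen shop from northwest China.",
    "Sha County snacks are sold nationwide.",
]
labeled = label_sentences(seed_triples, sentences)
print(labeled)
```

Because the assumption is only a heuristic, some labeled statements will not actually express the relationship, which is why a later noise-filtering step is needed.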
In one embodiment, statements in the knowledge graph and the network encyclopedia data in real scenarios may be quite complex. To determine the statements containing the first vector and the second vector more accurately, the statements may be divided into a plurality of words by a word segmentation tool (e.g., Jieba). Because words are short and carry relatively few features, it is easier to determine which words are equivalent to the first vector and the second vector; the statement in which the determined words are located is the statement used for extracting the third vector.
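To make the segmentation step concrete without depending on Jieba itself, the sketch below uses forward maximum matching over a small dictionary. This is a simpler stand-in technique, not Jieba's algorithm, and the dictionary and the Chinese sentence (a hypothetical back-translation of the "Old Horse" example) are assumptions for illustration.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary
    entry that matches, falling back to a single character."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical dictionary containing entity names and common words.
vocabulary = {"老马", "拉面", "西北", "一家", "来自"}
tokens = forward_max_match("老马是一家来自西北的拉面店", vocabulary)
print(tokens)
```

After segmentation, the short tokens "老马" (Old Horse) and "拉面" (ramen) can be compared against entity vectors individually instead of matching against the whole statement.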
In one embodiment, the network encyclopedia data includes, but is not limited to, Wikipedia and encyclopedia data, and the knowledge graph includes, but is not limited to, an open knowledge graph (OpenKG, where KG stands for Knowledge Graph).
The network encyclopedia data contains a large amount of text recording relationships between entities, but its statements are more complex than those of the knowledge graph, so the relationships between first and second entities extracted from it are large in number but lower in accuracy. The statements in the knowledge graph are simpler than those in the network encyclopedia data, but the knowledge graph records fewer relationships between first and second entities, so the relationships extracted from it are fewer in number but higher in accuracy.
The relationship between each pair of entities in the entity set is therefore extracted from both the knowledge graph and the network encyclopedia data through remote supervision. Compared with the prior art, remote supervision depends little on manual work, so the extracted relationships can be greatly expanded and are more comprehensive, and their overall accuracy is improved. The relationship between the first entity and the second entity, namely the catering type corresponding to a catering name, can thus be determined comprehensively and accurately in subsequent processing.
Fig. 2 is a schematic diagram illustrating a restaurant type determination method according to an embodiment of the disclosure.
In one embodiment, as shown in fig. 2, Wikipedia (and possibly also a knowledge graph), a small number of entity relationships (i.e., the relationships between first and second entities in the second set), and all entities (i.e., the first and second entities in the first set) may be input into a statement-level attention module, which may, based on remote supervision, extract from the statements of the knowledge graph and the network encyclopedia data the third vector in the second set that corresponds to a first vector and a second vector in the first set.
Furthermore, the sentences in Wikipedia, the sentences in the knowledge graph, all entities (i.e., the first and second entities to be vectorized), and the pre-established relationships between first and second entities (not shown in fig. 2) may be vectorized, with Word2vec or TransE used as needed. In addition, a longer sentence may be divided by the word segmentation tool Jieba into a plurality of shorter words, so as to determine which words are equivalent to the first vector and the second vector and thereby determine the sentence in which those words are located, that is, the sentence used for extracting the third vector.
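The disclosure does not state how a word is judged "equivalent" to an entity vector. One plausible reading, sketched here purely as an assumption, is to compare word vectors with the entity vector by cosine similarity and accept the best match above a threshold. The function names, the two-dimensional vectors and the 0.9 threshold are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_match(entity_vec, word_vectors, threshold=0.9):
    """Return the word most similar to entity_vec, or None if below threshold."""
    word, score = max(
        ((w, cosine(entity_vec, v)) for w, v in word_vectors.items()),
        key=lambda pair: pair[1],
    )
    return word if score >= threshold else None

# Hypothetical word vectors for the tokens of one segmented sentence.
word_vectors = {"ramen": [0.9, 0.1], "station": [0.1, 0.9]}
matched = best_match([1.0, 0.0], word_vectors)
unmatched = best_match([0.7, 0.7], word_vectors)
print(matched, unmatched)
```

A sentence would be kept as a candidate only when both the first vector and the second vector find a matching word in it.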
Since the statements in the knowledge graph and the network encyclopedia data take various forms, remote supervision may extract an incorrect relationship between the first entity and the second entity, that is, an incorrect third vector, from a statement. The extracted results can therefore be filtered to improve the accuracy of the determined relationships. The filtering model can be a neural network, which can be obtained by collecting machine learning samples through remote supervision and training on those samples.
Fig. 3 is a schematic diagram illustrating a neural network, according to an embodiment of the present disclosure.
In one embodiment, the neural network serving as the filtering model may include an LSTM (Long Short-Term Memory) layer, which may specifically be a BiLSTM; an output layer, which may specifically be a softmax classifier; and a fully connected layer between the LSTM layer and the output layer.
A small number of the extracted entity relationships (or all of them) are input into the filtering model, which outputs, for each relationship, the probability that the statement to be analyzed formed from that relationship is noise. Relationships with a higher probability are deleted and those with a lower probability are retained, so that the relationship between the first entity and the second entity can be determined accurately based on the retained relationships.
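The last two stages of such a filtering model, a fully connected layer followed by a softmax classifier, can be sketched in a few lines. The LSTM layer is deliberately omitted here; `features` merely stands in for its final hidden state, and the weights are hand-picked, hypothetical values rather than trained parameters.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def noise_probability(features, weights, bias):
    """Fully connected layer with two outputs, then softmax; returns P(noise)."""
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)[1]  # index 0: valid, index 1: noise

# Hypothetical stand-in for the LSTM output and untrained example weights.
features = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, 0.3], [-0.3, 0.5, -0.4]]
bias = [0.0, 0.0]
p_noise = noise_probability(features, weights, bias)
print(round(p_noise, 3))
```

In a real model the weights would be learned from the samples obtained through remote supervision, and the returned probability would be compared with the preset probability.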
Fig. 4 is a corresponding schematic diagram of statements and entities and relationships shown in accordance with an embodiment of the present disclosure.
In one embodiment, as shown in fig. 4, for the statement "Lanzhou beef noodles, also known as Lanzhou broth beef noodles", the relationship (i.e., the third vector) extracted from it is that the category of the first entity "Lanzhou" is the second entity "ramen"; for the statement "Malan Ramen was honored as the '2008-2009 most influential fast food brand in China'", the relationship (i.e., the third vector) extracted from it is that the category of the first entity "Malan" is the second entity "ramen".
Obviously, the relationship that the category of "Lanzhou" is "ramen" is incorrect, whereas the relationship that the category of "Malan" is "ramen" is correct.
Fig. 5 is a schematic flow chart diagram illustrating another restaurant type determination method according to an embodiment of the present disclosure. As shown in fig. 5, the extracting, through remote supervision, from statements of the knowledge graph and the network encyclopedia data, of the third vector in the second set that corresponds to the first vector and the second vector in the first set includes:
step S31, determining, in the knowledge graph and the network encyclopedia data, statements to be extracted that contain the first vector and the second vector;
step S32, extracting, from the statement to be extracted, the third vector in the second set that corresponds to the first vector and the second vector in the first set.
In one embodiment, the knowledge graph and the network encyclopedia data contain an enormous number of statements, and it is difficult to extract the relationship between the first entity and the second entity from such a large amount of data following the process of the embodiment shown in fig. 1. The statements to be extracted, which contain the first vector and the second vector, can therefore be determined in advance. Such a statement has a high probability of containing the relationship between the first vector and the second vector, that is, the relationship between the first entity and the second entity, so the third vector in the second set that corresponds to the first vector and the second vector in the first set can be extracted from it. This reduces the number of statements the extraction operation must consider and helps reduce the amount of data to be processed.
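One way to realize this pre-selection, shown here only as a sketch under stated assumptions, is to build an inverted index from entity mentions to statement ids once, and then intersect the id sets of a name and a type. The function names and sample data are hypothetical, and substring matching again stands in for vector-based matching.

```python
from collections import defaultdict

def build_index(sentences, entities):
    """Map each entity mention to the set of sentence ids that contain it."""
    index = defaultdict(set)
    for sid, sentence in enumerate(sentences):
        for entity in entities:
            if entity in sentence:
                index[entity].add(sid)
    return index

def candidate_sentences(index, name, food_type):
    """Ids of sentences mentioning both the restaurant name and the type."""
    return sorted(index.get(name, set()) & index.get(food_type, set()))

# Hypothetical corpus and entity list.
sentences = [
    "Old Horse is a ramen shop from northwest China.",
    "Sha County is known for snack food.",
    "The weather is fine today.",
]
entities = ["Old Horse", "Sha County", "ramen", "snack"]
index = build_index(sentences, entities)
print(candidate_sentences(index, "Old Horse", "ramen"))
print(candidate_sentences(index, "Sha County", "ramen"))
```

Only the returned candidates need to pass through relationship extraction, so the third statement above is never examined.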
Fig. 6 is a schematic flow chart diagram illustrating yet another restaurant type determination method according to an embodiment of the present disclosure. As shown in fig. 6, the method further comprises:
and step S4, filtering noise in the extracted third vector.
In one embodiment, since the statements in the knowledge graph and the network encyclopedia data take various forms, remote supervision may extract an incorrect relationship between the first entity and the second entity, that is, an incorrect third vector, from a statement. The extracted results may therefore be filtered to improve the accuracy of the determined relationships. The filtering model can be a neural network, which can be obtained by collecting machine learning samples through remote supervision and training on those samples.
For example, the knowledge graph and the network encyclopedia data include the statement "Old Horse is a ramen shop from northwest China". The first entity determined based on the statement is Old Horse, the second entity is ramen, and the extracted third vector represents the following relationship between the first entity and the second entity: the category of Old Horse is ramen. In this case the determined relationship between the first entity and the second entity is correct, i.e., the extracted third vector is correct.
As another example, the knowledge graph and the network encyclopedia data include the statement "Old Horse ate dumplings at the railway station while catching a train". The first entity determined based on the statement is Old Horse, the second entity is dumpling, and the extracted third vector represents the following relationship between the first entity and the second entity: the category of Old Horse is dumpling. This clearly does not match the meaning of the statement, so the relationship determined between the first entity and the second entity is incorrect in this case, i.e., the extracted third vector is wrong and belongs to the noisy relationships.
If there are many such noisy relationships, incorrect relationships may be extracted, so that the catering type of a catering name is determined incorrectly.
Fig. 7 is a schematic flow chart diagram illustrating yet another restaurant type determination method according to an embodiment of the present disclosure. As shown in fig. 7, filtering noise in the extracted third vector includes:
step S41, generating a statement to be analyzed according to the third vector corresponding to the target relationship, and the first vector and the second vector corresponding to the first entity and the second entity in the first set conforming to the target relationship;
step S42, matrixing the statement to be analyzed to obtain a matrix of the statement to be analyzed;
step S43, inputting the matrix into a neural network to output the probability that the statement to be analyzed is noise;
and step S44, deleting the third vector corresponding to the statement with the probability greater than the preset probability from the extracted third vector.
In one embodiment, noise may be filtered by using a pre-trained neural network as the filtering model. To filter noise accurately, a statement to be analyzed may be generated according to the third vector corresponding to the target relationship and the first vector and the second vector corresponding to the first entity and the second entity in the first set that conform to the target relationship, for example by fusing the first vector, the second vector and the third vector.
Fig. 8 is a schematic diagram illustrating fusing a first vector, a second vector, and a third vector according to an embodiment of the present disclosure.
As shown in fig. 8, the first entity corresponding to the first vector h is "Sha County", the second entity corresponding to the second vector t is "snack", and the third vector r indicates that the category of the first entity is the second entity, that is, the category of "Sha County" is "snack".
The first vector, the second vector and the third vector may be fused, for example by splicing together the three vectors shown in fig. 8. Based on the spliced vectors, a statement to be analyzed may be formed from the first entity corresponding to the first vector, the relationship represented by the third vector, and the second entity corresponding to the second vector, namely: the category of Sha County is snack. Besides directly splicing the three vectors together, fusion may also be achieved by pooling, which includes, but is not limited to, max pooling, min pooling and average pooling.
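The fusion options named above can be sketched as follows. Splicing concatenates h, r and t end to end, while pooling combines them element by element and therefore assumes the three vectors have the same length; the function names and sample values are hypothetical.

```python
def concat(h, r, t):
    """Splice the three vectors end to end."""
    return list(h) + list(r) + list(t)

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def pool(vectors, op):
    """Element-wise pooling across equally sized vectors; op is max, min or mean."""
    return [op(column) for column in zip(*vectors)]

# Hypothetical two-dimensional vectors for h, r and t.
h = [1.0, 4.0]
r = [3.0, 2.0]
t = [2.0, 6.0]

spliced = concat(h, r, t)
max_pooled = pool([h, r, t], max)
min_pooled = pool([h, r, t], min)
avg_pooled = pool([h, r, t], mean)
print(spliced, max_pooled, min_pooled, avg_pooled)
```

Splicing preserves every component at the cost of a vector three times as long, whereas pooling keeps the original dimension but discards which of the three vectors each value came from.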
For the sentence to be analyzed, whether the relation contained in the sentence to be analyzed is accurate needs to be determined, the relation can be input into a neural network for filtering noise after matrixing, the neural network can output the probability that the sentence to be analyzed is noise, if the probability is higher, for example, higher than a preset probability, the sentence to be analyzed can be determined to be noise to a greater extent, so that the third vector corresponding to the sentence with the probability higher than the preset probability can be deleted from the extracted third vector, and the relation represented by the retained third vector is relatively accurate, so that the relation between the first entity and the second entity can be accurately determined.
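The threshold comparison described above can be sketched as follows. This is only an illustration: `filter_noise`, the candidate dictionary layout, the threshold value 0.5, and `toy_scorer` (a stand-in for the trained neural network) are all hypothetical and are not taken from the disclosure.

```python
def filter_noise(candidates, noise_probability, preset_probability=0.5):
    """Keep only candidates whose noise probability does not exceed the threshold.

    candidates: list of dicts with a "matrix" key (the matrixed statement).
    noise_probability: callable mapping a matrix to a probability in [0, 1].
    """
    kept = []
    for candidate in candidates:
        p_noise = noise_probability(candidate["matrix"])
        if p_noise > preset_probability:
            continue  # treated as noise: drop the corresponding third vector
        kept.append(candidate)
    return kept


def toy_scorer(matrix):
    # Hypothetical stand-in for the trained network: it simply reads the
    # first cell of the matrix as the noise probability.
    return matrix[0][0]


candidates = [
    {"triple": ("Sha County", "category", "snack"), "matrix": [[0.1, 0.3]]},
    {"triple": ("Sha County", "category", "hot pot"), "matrix": [[0.9, 0.2]]},
]
kept = filter_noise(candidates, toy_scorer, preset_probability=0.5)
```

In this toy run only the first candidate survives, because the second one scores above the preset probability.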
Optionally, the neural network comprises a long short-term memory network layer (e.g., BiLSTM), a fully connected layer, and a softmax classifier.
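To make the layer stack concrete, the sketch below runs a forward pass through a bidirectional LSTM, a fully connected layer, and a softmax, using only the Python standard library. It is an assumption-laden illustration: the weights are random and untrained, the class names `LSTMCell` and `NoiseClassifier`, the dimensions, and the convention that class index 1 means "noise" are all invented here, and a practical system would use a deep learning framework and trained parameters.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """One LSTM direction with random (untrained) weights."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.W = {k: rand_matrix(hidden_dim, input_dim + hidden_dim, rng)
                  for k in self.GATES}

    def step(self, x, h, c):
        z = list(x) + list(h)
        pre = {k: matvec(self.W[k], z) for k in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        g = [math.tanh(v) for v in pre["candidate"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def run(self, rows):
        """Process the rows in order and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in rows:
            h, c = self.step(x, h, c)
        return h


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class NoiseClassifier:
    """BiLSTM -> fully connected layer -> softmax over {not noise, noise}."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.fwd = LSTMCell(input_dim, hidden_dim, rng)
        self.bwd = LSTMCell(input_dim, hidden_dim, rng)
        self.W_fc = rand_matrix(2, 2 * hidden_dim, rng)

    def noise_probability(self, matrix):
        h_fwd = self.fwd.run(matrix)                  # left-to-right pass
        h_bwd = self.bwd.run(list(reversed(matrix)))  # right-to-left pass
        logits = matvec(self.W_fc, h_fwd + h_bwd)
        return softmax(logits)[1]  # index 1 = "noise" (assumed convention)


# Each row of the matrix is one vector of the statement to be analyzed.
matrix = [[0.1, 0.2, 0.3, 0.4], [0.0, 0.1, 0.0, 0.1], [0.5, 0.4, 0.3, 0.2]]
p_noise = NoiseClassifier(input_dim=4, hidden_dim=3).noise_probability(matrix)
```

Because the weights are untrained, `p_noise` carries no meaning beyond being a valid probability; the sketch only shows how the matrix flows through the three components.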
Corresponding to the embodiment of the catering type determination method, the embodiment of the catering type determination device is further provided.
The embodiment of the catering type determination device can be applied to a terminal or a server. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the device, as a logical device, is formed when the processor of the terminal or server where the device is located reads the corresponding computer program instructions from the nonvolatile memory into the memory and runs them. In terms of hardware, fig. 9 shows a hardware structure diagram of the terminal or server where the catering type determination device of the present disclosure is located. In addition to the processor, the memory, the network interface, and the nonvolatile memory shown in fig. 9, the terminal or server where the device is located may also include other hardware according to its actual function, which is not described again here.
Fig. 10 is a schematic block diagram illustrating a restaurant type determination apparatus according to an embodiment of the present disclosure. The catering type determination device shown in this embodiment may be applicable to electronic devices, such as a mobile phone, a tablet computer, a wearable device, and other terminals, and may also be applicable to a server, where the server may be a server in a takeaway industry, and may also be a server in a catering query industry, and thus, embodiments of the present disclosure are not limited thereto.
As shown in fig. 10, the restaurant type determining apparatus according to the embodiment of the disclosure may include:
a first vectorization module 1, configured to vectorize a first entity and a second entity through Word2vec to generate a first set containing a first vector of the first entity and a second vector of the second entity, wherein the first entity comprises a restaurant name and the second entity comprises a restaurant type;
a second vectorization module 2, configured to vectorize, through TransE, a pre-established relationship between the first entity and the second entity to generate a second set containing a third vector of the relationship;
and a relationship extraction module 3, configured to extract, through remote supervision, from statements of the knowledge graph and the network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set.
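As background for the TransE vectorization mentioned above: TransE represents a relationship as a translation, so that for a valid triple the head vector plus the relationship vector lies close to the tail vector (h + r ≈ t). The sketch below scores triples with that distance. The function names, the relationship names, and the toy vectors are assumptions for illustration, and choosing the closest relationship this way is not presented as the extraction procedure of the disclosure.

```python
import math


def transe_distance(h, r, t):
    """L2 distance ||h + r - t||; a smaller value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def best_relation(h, t, relation_vectors):
    """Return the name of the relationship whose vector best translates h onto t."""
    return min(
        relation_vectors,
        key=lambda name: transe_distance(h, relation_vectors[name], t),
    )


# Toy 2-dimensional vectors; real embeddings would be learned and much larger.
h = [1.0, 0.0]                      # first vector, e.g. a restaurant name
t = [1.0, 1.0]                      # second vector, e.g. a restaurant type
relation_vectors = {
    "category": [0.0, 1.0],         # hypothetical relationship names
    "located_in": [1.0, 0.0],
}

chosen = best_relation(h, t, relation_vectors)
```

Here `chosen` is `"category"`, since h + r equals t exactly for that relationship vector.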
Fig. 11 is a schematic block diagram illustrating another restaurant type determination apparatus according to an embodiment of the present disclosure. As shown in fig. 11, the relationship extraction module 3 includes:
a sentence determination submodule 31 for determining a sentence to be extracted including the first vector and the second vector in the knowledge graph and the network encyclopedia data;
and a relationship extraction submodule 32, configured to extract, from the statement to be extracted, the third vector in the second set that corresponds to the first vector and the second vector in the first set.
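The two submodules follow the usual remote (distant) supervision idea: a statement that mentions both entities of a known triple is assumed to express that triple's relationship. A minimal sketch of this, working on entity strings rather than vectors for readability, is shown below; `label_statements`, the example statements, and the triple are hypothetical.

```python
def label_statements(statements, known_triples):
    """Pair each statement with every known triple whose two entities it mentions.

    known_triples: iterable of (first_entity, relationship, second_entity).
    Returns a list of (statement, first_entity, relationship, second_entity).
    """
    labelled = []
    for statement in statements:
        for first, relationship, second in known_triples:
            # Statement determination: both entities must appear in the text.
            if first in statement and second in statement:
                # Relationship extraction: reuse the known relationship as label.
                labelled.append((statement, first, relationship, second))
    return labelled


statements = [
    "Sha County is a well-known snack chain.",
    "Sha County has opened a new branch.",
    "The weather is nice today.",
]
known_triples = [("Sha County", "category", "snack")]

labelled = label_statements(statements, known_triples)
```

Only the first statement is labelled, because it is the only one containing both entities. Labels produced this way can be wrong when a statement mentions both entities without expressing the relationship, which is why a separate noise filtering step is useful.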
Fig. 12 is a schematic block diagram of another restaurant type determination apparatus shown in accordance with an embodiment of the present disclosure. As shown in fig. 12, the apparatus further includes:
and the noise filtering module 4 is used for filtering the noise in the extracted third vector.
Fig. 13 is a schematic block diagram illustrating yet another restaurant type determination apparatus according to an embodiment of the present disclosure. As shown in fig. 13, the noise filtering module 4 includes:
a statement generating submodule 41, configured to generate a statement to be analyzed according to the third vector corresponding to the target relationship, and the first vector and the second vector corresponding to the first entity and the second entity in the first set that meet the target relationship;
a matrixing submodule 42, configured to matrixing the statement to be analyzed to obtain a matrix of the statement to be analyzed;
a filtering submodule 43, configured to input the matrix into a neural network, so as to output a probability that the statement to be analyzed is noise;
and the probability comparison submodule 44 is configured to delete the third vector corresponding to the statement with the probability greater than the preset probability from the extracted third vector.
Optionally, the neural network comprises a long short-term memory network layer, a fully connected layer, and a softmax classifier.
With regard to the apparatus in the above embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments of the related method, and will not be described in detail here.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the disclosed solution. One of ordinary skill in the art can understand and implement it without inventive effort.
An embodiment of the present disclosure also provides an electronic device, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any of the above embodiments.
Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in the method according to any of the above embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A catering type determination method is characterized by comprising the following steps:
vectorizing a first entity and a second entity through Word2vec to generate a first set containing a first vector of the first entity and a second vector of the second entity, wherein the first entity comprises a restaurant name and the second entity comprises a restaurant type;
vectorizing a pre-established relationship between the first entity and the second entity through TransE to generate a second set containing a third vector of the relationship;
and extracting, through remote supervision, from statements of the knowledge graph and the network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set.
2. The method of claim 1, wherein the extracting, through remote supervision, from statements of the knowledge graph and the network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set comprises:
determining a statement to be extracted containing the first vector and the second vector in the knowledge graph and the network encyclopedia data;
and extracting, from the statement to be extracted, the third vector in the second set that corresponds to the first vector and the second vector in the first set.
3. The method of claim 1, further comprising:
filtering noise in the extracted third vector.
4. The method of claim 3, wherein the filtering noise in the extracted third vector comprises:
generating a statement to be analyzed according to the third vector corresponding to the target relationship and the first vector and the second vector corresponding to the first entity and the second entity in the first set conforming to the target relationship;
matrixing the statement to be analyzed to obtain a matrix of the statement to be analyzed;
inputting the matrix into a neural network to output the probability that the statement to be analyzed is noise;
and deleting the third vector corresponding to the statement with the probability greater than the preset probability from the extracted third vector.
5. The method of claim 4, wherein the neural network comprises a long short-term memory network layer, a fully connected layer, and a softmax classifier.
6. A restaurant type determining apparatus, comprising:
a first vectorization module, configured to vectorize a first entity and a second entity through Word2vec to generate a first set containing a first vector of the first entity and a second vector of the second entity, wherein the first entity comprises a restaurant name and the second entity comprises a restaurant type;
a second vectorization module, configured to vectorize a pre-established relationship between the first entity and the second entity through TransE to generate a second set containing a third vector of the relationship;
and a relationship extraction module, configured to extract, through remote supervision, from statements of the knowledge graph and the network encyclopedia data, the third vector in the second set that corresponds to a first vector and a second vector in the first set.
7. The apparatus of claim 6, wherein the relationship extraction module comprises:
the sentence determining submodule is used for determining a sentence to be extracted containing the first vector and the second vector in the knowledge graph and the network encyclopedia data;
and a relationship extraction submodule, configured to extract, from the statement to be extracted, the third vector in the second set that corresponds to the first vector and the second vector in the first set.
8. The apparatus of claim 6, further comprising:
and the noise filtering module is used for filtering the noise in the extracted third vector.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1 to 5.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
CN201910892671.6A 2019-09-20 2019-09-20 Catering type determination method and device Pending CN110688850A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910892671.6A CN110688850A (en) 2019-09-20 2019-09-20 Catering type determination method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910892671.6A CN110688850A (en) 2019-09-20 2019-09-20 Catering type determination method and device

Publications (1)

Publication Number Publication Date
CN110688850A true CN110688850A (en) 2020-01-14

Family

ID=69109843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910892671.6A Pending CN110688850A (en) 2019-09-20 2019-09-20 Catering type determination method and device

Country Status (1)

Country Link
CN (1) CN110688850A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528045A (en) * 2020-12-23 2021-03-19 中译语通科技股份有限公司 Method and system for judging domain map relation based on open encyclopedia map
CN112528045B (en) * 2020-12-23 2024-04-02 中译语通科技股份有限公司 Method and system for judging domain map relation based on open encyclopedia map

Similar Documents

Publication Publication Date Title
CN110019843B (en) Knowledge graph processing method and device
CN107193962B (en) Intelligent map matching method and device for Internet promotion information
CN110909182B (en) Multimedia resource searching method, device, computer equipment and storage medium
EP4040310A1 (en) Image and text data hierarchical classifiers
CN108734159B (en) Method and system for detecting sensitive information in image
CN110765774B (en) Training method and device of information extraction model and information extraction method and device
CN111144370B (en) Document element extraction method, device, equipment and storage medium
CN112818162B (en) Image retrieval method, device, storage medium and electronic equipment
CN110909868A (en) Node representation method and device based on graph neural network model
CN109522399B (en) Method and apparatus for generating information
CN110968664A (en) Document retrieval method, device, equipment and medium
CN113283432A (en) Image recognition and character sorting method and equipment
CN113343012B (en) News matching method, device, equipment and storage medium
CN111191133A (en) Service search processing method, device and equipment
CN113094287B (en) Page compatibility detection method, device, equipment and storage medium
CN113343936B (en) Training method and training device for video characterization model
CN110276013A (en) A kind of recommended method of maintenance technician, device and storage medium
CN110688850A (en) Catering type determination method and device
JP2020502710A (en) Web page main image recognition method and apparatus
CN111597336A (en) Processing method and device of training text, electronic equipment and readable storage medium
CN110580297A (en) Merchant and dish matching method and device based on dish image and electronic equipment
CN116304155A (en) Three-dimensional member retrieval method, device, equipment and medium based on two-dimensional picture
CN111797622A (en) Method and apparatus for generating attribute information
CN113297482B (en) User portrayal describing method and system of search engine data based on multiple models
CN115565042A (en) Commodity image feature representation method and device, equipment, medium and product thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination