CN111597458B - Scene element extraction method, device, equipment and storage medium - Google Patents

Scene element extraction method, device, equipment and storage medium

Info

Publication number
CN111597458B
CN111597458B (application CN202010295057.4A)
Authority
CN
China
Prior art keywords
scene element
scene
labels
word
comment information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010295057.4A
Other languages
Chinese (zh)
Other versions
CN111597458A (en)
Inventor
李千
史亚冰
蒋烨
柴春光
朱勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010295057.4A
Publication of CN111597458A
Application granted
Publication of CN111597458B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9537 - Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval of unstructured textual data
    • G06F16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 - Ontology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G06F40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a scene element extraction method, device, equipment and storage medium, relating to the field of knowledge graphs. The method includes the following steps: obtaining comment information of a point of interest; and extracting the comment information of the point of interest through a sequence labeling model to obtain the scene element tags in the comment information. Scene elements are thus extracted from comment information, and a scene graph can further be built based on the scene elements for POI recommendation.

Description

Scene element extraction method, device, equipment and storage medium
Technical Field
The application relates to the field of computer technologies, in particular to the field of knowledge graph technologies, and provides a scene element extraction method, device, equipment and storage medium.
Background
In personalized recommendation scenarios of products such as maps, user demands often change with the scene the user is in. Currently, a graph suitable for scenario-based recommendation is built based on knowledge graph technology to meet the requirements of such recommendation. Building a graph suitable for scenario-based recommendation requires mining the relationships between points of interest and the various scene elements that combine into a scene.
Therefore, to meet the requirements of scenario-based recommendation, a scheme for establishing the relationships between scene elements and points of interest is needed.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the related art to some extent.
Therefore, the application provides a scene element extraction method, a device, equipment and a storage medium.
An embodiment of a first aspect of the present application provides a method for extracting a scene element, including:
obtaining comment information of a point of interest;
and extracting the comment information of the point of interest through a sequence labeling model to obtain the scene element tags in the comment information.
An embodiment of a second aspect of the present application provides a device for extracting a scene element, including:
the acquisition module is used for acquiring comment information of the interest points;
and the extraction module is used for extracting the comment information of the interest point through the sequence labeling model so as to acquire scene element labels in the comment information.
An embodiment of a third aspect of the present application provides an electronic device, including at least one processor, and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of extracting a scene element as described in the embodiments of the first aspect.
An embodiment of a fourth aspect of the present application proposes a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method for extracting a scene element according to the embodiment of the first aspect.
One embodiment of the above application has the following advantages or benefits: comment information of a point of interest is obtained, and the comment information is extracted through the sequence labeling model to obtain the scene element tags in it. Scene elements are thus extracted from the comment information of points of interest, and a scene graph can further be built based on the scene elements for POI recommendation.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:
fig. 1 is a flow chart of a method for extracting scene elements according to an embodiment of the present application;
FIG. 2 is a schematic diagram of generating a scene element tag according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating another method for extracting scene elements according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating another method for extracting scene elements according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a scene element extraction device according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of another scene element extraction device according to an embodiment of the present application;
fig. 7 illustrates a block diagram of an exemplary electronic device suitable for use in implementing embodiments of the application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In personalized recommendation for products such as maps, user demands generally change with the scene the user is in, and users in the same scene mostly have similar demands. Currently, POI (Point of Interest) information is enriched based on knowledge graph technology, and a graph suitable for scenario-based recommendation is built to meet the requirements of such recommendation. Building a graph suitable for scenario-based recommendation requires mining the relationships between POIs and the various scene elements that combine into a scene.
The embodiment of the application provides a scene element extraction method that extracts scene elements from comment information, so that a scene graph can further be built based on the scene elements for POI recommendation.
Fig. 1 is a flow chart of a method for extracting a scene element according to an embodiment of the present application, as shown in fig. 1, where the method includes:
and step 101, obtaining comment information of the interest points.
In this embodiment, comment information corresponding to the point of interest may be obtained first, and as an example, the point of interest refers to a geographic object that may be abstracted into a point, for example, the point of interest includes a store, a bar, a gas station, a hospital, a station, and the like, and the comment information is, for example, "here, suitable for taking friends to get a meal" and the like.
As a possible implementation, the comment information of a point of interest can be crawled through a related application. For example, when obtaining the comment information of a restaurant point of interest, a relevant restaurant application can be used to search for that point of interest, and the comment information related to it is then collected.
In one embodiment of the present application, the obtained comment information may be a whole passage of text; therefore, after the comment information of the point of interest is obtained, it may be further preprocessed. Optionally, punctuation marks in the comment information are identified, the comment information is split at those punctuation marks to generate a plurality of clauses, and the clauses are then segmented into a plurality of words. As an example, the comment "这里适合带朋友聚餐" (this place is suitable for a dinner with friends) is first split into clauses by punctuation, and each clause is then segmented into words such as "这里" (here), "适合" (suitable for), "带" (taking), "朋友" (friends) and "聚餐" (dinner). The punctuation marks may be Chinese punctuation marks such as ，。！？… or English punctuation marks such as , . ! ? and quotation marks.
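As an illustrative sketch of this preprocessing step only: the patent does not name a segmentation tool, so the jieba library is used here as a stand-in, and the regular expression is an assumed punctuation set.

```python
# Sketch of the preprocessing described above: split a comment at punctuation
# marks into clauses, then segment each clause into words.
import re
import jieba  # stand-in Chinese word segmenter; not named by the patent

PUNCT = r'[，。！？…,.!?"“”]'  # assumed set of Chinese/English punctuation

def preprocess(comment: str) -> list:
    """Return a list of clauses, each a list of segmented words."""
    clauses = [c for c in re.split(PUNCT, comment) if c.strip()]
    return [list(jieba.cut(c)) for c in clauses]

print(preprocess("这里适合带朋友聚餐！环境不错。"))
# e.g. [['这里', '适合', '带', '朋友', '聚餐'], ['环境', '不错']]
```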
Step 102, extracting the comment information of the point of interest through a sequence labeling model to obtain the scene element tags in the comment information.
In this embodiment, sample data annotated with scene element tags may be collected, and a sequence labeling model is trained in advance on the sample data; the model takes comment information as input and outputs the scene element tags in it. After the comment information of a point of interest is obtained, it is processed by the sequence labeling model to obtain the scene element tags in the comment information. The scene element tags include crowd tags, time tags, place tags, demand tags and emotion tags. The sequence labeling model may consist of a pre-training model, at least one bidirectional gated recurrent unit (GRU) layer and a conditional random field (CRF) layer, where the pre-training model is an ERNIE model.
In one embodiment of the application, the scene element tags may comprise 11 tags in total: O, B-WHO (crowd), I-WHO (crowd), B-WHEN (time), I-WHEN (time), B-WHERE (place), I-WHERE (place), B-DEM (demand), I-DEM (demand), B-EMO (emotion) and I-EMO (emotion). B marks the beginning of a tagged span, I marks its continuation, and O marks a non-target item, i.e., an empty result.
As an example, for the comment "这里适合带朋友聚餐" (this place is suitable for a dinner with friends), the sequence labeling model tags each character in turn: the characters of "朋友" (friends) are tagged B-WHO and I-WHO, the characters of "聚餐" (dinner) are tagged B-DEM and I-DEM, and all remaining characters are tagged O. Extraction of the scene elements in the comment information is thus achieved.
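To make the BIO scheme concrete, the following minimal sketch decodes per-character tags back into scene-element spans; the helper function and example data are illustrative, not part of the patent.

```python
# Decode per-character BIO tags into (element_text, tag_type) spans.
def bio_to_spans(chars, tags):
    spans, buf, label = [], [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):          # a new span begins
            if buf:
                spans.append(("".join(buf), label))
            buf, label = [ch], tag[2:]
        elif tag.startswith("I-") and label == tag[2:]:
            buf.append(ch)                # the span continues
        else:                             # "O" (or a stray I-) ends the span
            if buf:
                spans.append(("".join(buf), label))
            buf, label = [], None
    if buf:
        spans.append(("".join(buf), label))
    return spans

chars = list("这里适合带朋友聚餐")
tags = ["O", "O", "O", "O", "O", "B-WHO", "I-WHO", "B-DEM", "I-DEM"]
print(bio_to_spans(chars, tags))  # [('朋友', 'WHO'), ('聚餐', 'DEM')]
```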
The processing procedure of the sequence labeling model is described below.
In this embodiment, the sequence labeling model includes a pre-training model, two bidirectional gated recurrent unit (GRU, gate recurrent unit) layers and one conditional random field (CRF, conditional random field) layer.
The pre-training model is used to encode Chinese characters, i.e., convert them into vectors. The words in the clauses are converted by the pre-training model to generate the code corresponding to each word. For example, for the clause "this place is suitable for a dinner with friends", each word in the clause is converted by the pre-training model into a corresponding word code.
The bidirectional GRU layer is used to take the code of a target word and generate the context information of the target word according to the codes of the words in the clause. That is, the bidirectional GRU layer takes as input the word code of each word in a clause and outputs, for each word, its word code together with its context information, where the input word codes come from the pre-training model and the context information includes the word codes of adjacent words. For example, for the clause "this place is suitable for a dinner with friends", the context information of the target word "dinner" includes the word codes of adjacent words such as "friends". In this embodiment, two bidirectional GRU layers are used, with the output of the first layer fed as the input of the second layer, which improves labeling precision and recall.
The CRF layer is used to generate the scene element tag of the target word according to the code of the target word and the context information of the target word input by the bidirectional GRU layer. That is, for each word in the clause, the CRF layer takes the word code of the word and the word codes of adjacent words, and outputs the category of the word's scene element tag; the decoding layer of the CRF then decodes this category and maps it to the scene element tag of the word.
Referring to fig. 2, the input of the sequence labeling model is "朋友聚餐" (dinner with friends). First, each input character is encoded by the pre-training model to generate a word vector of 768 dimensions. The word vectors are fed into Bi-GRU layer 1, which outputs 512-dimensional vectors obtained by concatenating the two GRU directions; these concatenated vectors are fed into Bi-GRU layer 2, which again outputs 512-dimensional vectors. The output of Bi-GRU layer 2 is then passed through a fully connected layer to obtain, for each character, a vector of 11 values representing the scores of the scene element tags; the tag sequence is finally obtained by CRF decoding. As shown in fig. 2, the scene element tags corresponding to the characters of "朋友聚餐" are, in order, [B-WHO, I-WHO, B-DEM, I-DEM].
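The architecture just walked through can be summarized in the following minimal PyTorch sketch. It is an approximation under stated assumptions: the plain embedding layer stands in for the ERNIE encoder, the CRF comes from the third-party pytorch-crf package, and everything except the dimensions named in the text (768-dim inputs, 512-dim GRU outputs, 11 tag scores) is illustrative.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # assumes the pytorch-crf package (pip install pytorch-crf)

NUM_TAGS = 11  # O plus B-/I- tags for WHO, WHEN, WHERE, DEM, EMO

class SceneElementTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=768, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)  # stand-in for ERNIE
        # Two stacked Bi-GRU layers; 2 x 256 hidden units give 512-dim outputs.
        self.gru1 = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.gru2 = nn.GRU(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, NUM_TAGS)  # 11 scores per character
        self.crf = CRF(NUM_TAGS, batch_first=True)

    def emissions(self, char_ids):
        x = self.embed(char_ids)   # (batch, seq, 768)
        x, _ = self.gru1(x)        # (batch, seq, 512)
        x, _ = self.gru2(x)        # (batch, seq, 512)
        return self.fc(x)          # (batch, seq, 11)

    def loss(self, char_ids, tags):
        return -self.crf(self.emissions(char_ids), tags)  # negative log-likelihood

    def decode(self, char_ids):
        return self.crf.decode(self.emissions(char_ids))  # best tag sequence

model = SceneElementTagger(vocab_size=8000)
ids = torch.randint(0, 8000, (1, 4))  # e.g. the four characters of "朋友聚餐"
print(model.decode(ids))              # per-character tag indices
```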
As an example, the training process is as follows. The scene element tags are defined, for example the 11 tags O, B-WHO (crowd), I-WHO (crowd), B-WHEN (time), I-WHEN (time), B-WHERE (place), I-WHERE (place), B-DEM (demand), I-DEM (demand), B-EMO (emotion) and I-EMO (emotion). Training samples are obtained, such as sentence-level results from a related unstructured tag extraction system, and the tokens in the training samples are annotated with scene element tags. 90% of the training samples are then used as the input sequences for training, the remaining 10% serve as the test set, and the model parameters are adjusted according to the model predictions and the annotations until the model converges.
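A brief sketch of that training procedure, reusing the SceneElementTagger sketch above; the 90/10 split follows the text, while batching, masking and the convergence criterion are simplified assumptions.

```python
import torch

def train(model, samples, epochs=5, lr=1e-3):
    """samples: list of (char_ids, tag_ids) LongTensor pairs of equal length."""
    split = int(0.9 * len(samples))             # 90% train, 10% test
    train_set, test_set = samples[:split], samples[split:]
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):                     # "until the model converges"
        for char_ids, tags in train_set:
            opt.zero_grad()
            loss = model.loss(char_ids.unsqueeze(0), tags.unsqueeze(0))
            loss.backward()
            opt.step()
    return test_set  # compare model.decode(...) with annotations on this 10%
```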
According to the scene element extraction method of the embodiment of the application, comment information of a point of interest is obtained, and the comment information is extracted through the sequence labeling model to obtain the scene element tags in it. Scene elements are thus extracted from comment information, and a scene graph can further be built based on the scene elements for POI recommendation. In addition, compared with tag extraction schemes based on syntactic templates, multiple templates do not need to be configured, which avoids complex template development and improves development efficiency.
Based on the above embodiment, after the comment information of the interest point is extracted through the sequence labeling model to obtain the scene element tag in the comment information, the extraction result of the scene element tag may be further processed to improve the accuracy of scene element extraction.
Fig. 3 is a flowchart of another method for extracting a scene element according to an embodiment of the present application, as shown in fig. 3, after extracting comment information of an interest point by using a sequence labeling model to obtain a scene element tag in the comment information, the method further includes:
step 301, word segmentation is performed on the comment information to form a plurality of word segments, and a first word boundary corresponding to each word segment is obtained.
In this embodiment, the comment information may be segmented by a word segmentation tool to form a plurality of segmented words, and the first word boundary corresponding to each segmented word is obtained. The word segmentation tool splits sentences at word granularity.
As an example, the comment information "ABCDE" is segmented into the words "AB", "CD" and "E", that is, AB is one word, CD is one word and E is one word; the first word boundary corresponding to each segmented word is then obtained as "AB|CD|E".
Step 302, determining a second word boundary corresponding to each scene element according to the scene element tags.
In this embodiment, the scene element tags include crowd tags, time tags, place tags, demand tags and emotion tags, and the second word boundary corresponding to each scene element can be determined according to whether adjacent characters carry the same or consecutive scene element tags.
As an example, for the comment information "ABCDE", if the scene element tags obtained through the sequence labeling model are, in order, B-WHEN, I-WHEN, B-WHO, I-WHO and O, the second word boundary is determined to be "AB|CD|E"; if the scene element tags obtained through the sequence labeling model are, in order, B-WHEN, I-WHEN, I-WHEN, B-WHO and I-WHO, the second word boundary is determined to be "ABC|DE".
Step 303, determining a target scene element meeting the preset condition from the scene elements, and deleting the scene element label corresponding to the target scene element.
In this embodiment, the second word boundary is matched with the first word boundary, and if it is determined that the preset condition is met according to the second word boundary and the first word boundary, the corresponding scene element tag is deleted. If the preset condition is not met according to the second word boundary and the first word boundary, determining that the scene element labels are accurate, and reserving the corresponding scene element labels.
Wherein the preset condition includes that there is no first word boundary included in the second word boundary.
As an example, with the first word boundary "AB|CD|E" and the second word boundary "AB|CD|E", the scene elements AB and CD are determined not to satisfy the preset condition, and the scene element tags of AB and CD are retained. With the first word boundary "AB|CD|E" and the second word boundary "A|BC|DE", where BC carries a demand tag, BC crosses a part-of-speech boundary, so BC is determined to satisfy the preset condition and the demand tag corresponding to BC is deleted.
As another example, with the first word boundary "AB|CD|E" and the second word boundary "A|BCD|E", the element BCD is extended to ABCD according to the word boundary and then retained; for example, if the scene element tag of BCD is a crowd tag, A is also labeled with the crowd tag.
It should be noted that the above implementation of processing scene element tags according to word boundaries is merely exemplary. As another example, if the second word boundary is contained in the first word boundary, the second word boundary is expanded according to the first word boundary: for example, with the first word boundary "ABCD|E" and the second word boundary "A|BCD|E", the scene element tag corresponding to the scene element BCD is obtained, and A is also labeled with that scene element tag.
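A minimal sketch of this boundary check, under the simplifying assumption that extracted elements are represented as character-offset spans and that an element is kept only when both offsets align with segmenter boundaries; the tag-expansion variants above are not covered.

```python
def boundary_offsets(words):
    """Cut positions implied by a segmentation, e.g. AB|CD|E -> {0, 2, 4, 5}."""
    offsets, pos = {0}, 0
    for w in words:
        pos += len(w)
        offsets.add(pos)
    return offsets

def filter_elements(words, elements):
    """Drop (start, end, tag) elements whose span crosses a word boundary."""
    cuts = boundary_offsets(words)
    return [(s, e, t) for s, e, t in elements if s in cuts and e in cuts]

# First word boundary AB|CD|E; AB aligns and is kept, BC crosses and is dropped.
print(filter_elements(["AB", "CD", "E"], [(0, 2, "WHO"), (1, 3, "DEM")]))
# -> [(0, 2, 'WHO')]
```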
According to the extraction method of the scene elements, provided by the embodiment of the application, the scene elements crossing part-of-speech boundaries can be identified and filtered by matching the word segmentation result of the comment information with the extraction result of the scene element labels, so that the extraction accuracy of the scene elements is improved.
Fig. 4 is a flow chart of another method for extracting scene elements according to an embodiment of the present application, as shown in fig. 4, the method includes:
and step 401, processing the scene elements corresponding to the scene element labels through the semantic smoothness model to obtain the semantic smoothness corresponding to each scene element.
In this embodiment, the semantic smoothness model takes a word or sentence as input and outputs its semantic smoothness, a numerical value, for example in the range from 0 to positive infinity. After the comment information of the point of interest has been extracted through the sequence labeling model to obtain the scene element tags in it, each scene element is input into the semantic smoothness model to obtain the semantic smoothness corresponding to that scene element.
As an example, for the comment information "suitable for a dinner", where "dinner" corresponds to a demand tag, the scene element "dinner" is input into the semantic smoothness model to obtain its semantic smoothness.
Step 402, determining a target scene element with semantic smoothness smaller than a preset threshold, and deleting a scene element label corresponding to the target scene element.
In this embodiment, the semantic smoothness of each scene element is compared with a preset threshold value, a target scene element with the semantic smoothness smaller than the preset threshold value is obtained, and a scene element tag corresponding to the target scene element is deleted.
The preset threshold can be set as needed or determined from a large amount of experimental data; for example, with the threshold set to 1000, words of low smoothness are filtered out of the scene element extraction results.
In one embodiment of the application, if the semantic smoothness of a scene element is greater than or equal to the preset threshold, the scene element tag corresponding to that scene element is retained.
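A minimal sketch of this smoothness filter: since the patent does not specify the internals of the smoothness model, the scoring function is caller-supplied here, and the demo scorer is a toy stand-in only.

```python
def filter_by_smoothness(elements, score, threshold=1000.0):
    """Keep elements whose smoothness score meets the threshold; the tags of
    the remaining (target) elements are deleted, i.e. the elements are dropped."""
    return [e for e in elements if score(e) >= threshold]

demo_score = lambda e: 600.0 * len(e)  # toy scorer: longer strings score higher
print(filter_by_smoothness(["聚餐", "深夜食堂", "聚"], demo_score))
# -> ['聚餐', '深夜食堂']  ("聚" scores 600 < 1000 and is filtered out)
```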
It should be noted that the occurrence frequency of each scene element may also be counted and compared with a preset threshold, so as to determine the target scene elements whose occurrence frequency is below the threshold and delete the scene element tags corresponding to those target scene elements.
In one embodiment of the application, after the scene element tags in the comment information are acquired, the tags of the interest points are generated according to the scene element tags, and the recommendation is performed according to the tags of the interest points. For example, comment information of the interest point can be collected, scene element tags are extracted according to the comment information, candidate scene elements with occurrence times larger than a preset threshold value are selected for each type of scene element tags, the tags of the interest point are generated according to the candidate scene elements and the corresponding scene element tags, and then recommendation is performed according to the tags of the interest point.
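A minimal sketch of that tag-generation step; the (element, tag) representation and the threshold value are illustrative assumptions.

```python
from collections import Counter

def poi_tags(extracted, min_count=3):
    """extracted: (element_text, tag_type) pairs gathered from all comments of
    one point of interest; keep elements seen more than min_count times."""
    counts = Counter(extracted)
    return [(elem, tag) for (elem, tag), n in counts.items() if n > min_count]

elements = [("朋友", "WHO")] * 5 + [("聚餐", "DEM")] * 4 + [("深夜", "WHEN")]
print(poi_tags(elements))  # [('朋友', 'WHO'), ('聚餐', 'DEM')]
```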
According to the scene element extraction method of the embodiment of the application, semantic smoothness is computed for the extraction results of the scene element tags, and scene elements that are not fluent are filtered out, which improves the accuracy of scene element extraction.
In order to implement the above embodiments, the application further provides a scene element extraction device.
Fig. 5 is a schematic structural diagram of a scene element extraction device according to an embodiment of the present application; as shown in fig. 5, the device includes: an acquisition module 10 and an extraction module 20.
The obtaining module 10 is configured to obtain comment information of the point of interest.
And the extraction module 20 is used for extracting the comment information of the interest point through the sequence labeling model to obtain scene element tags in the comment information.
On the basis of fig. 5, the extracting device for a scene element shown in fig. 6 further includes: the device comprises a preprocessing module 30, a conversion module 40, a recommendation module 50, a first processing module 60 and a second processing module 70.
The preprocessing module 30 is configured to obtain punctuation marks in the comment information; dividing the comment information according to punctuation marks to generate a plurality of clauses; the plurality of clauses are segmented to form a plurality of words.
A conversion module 40, configured to convert, by using a pre-training model, a word among a plurality of clauses to generate a code corresponding to the word.
In one embodiment of the application, the sequence annotation model comprises: at least one bidirectional gate cycle unit GRU layer, which is used for obtaining the codes of the target words and generating the context information of the target words according to the codes corresponding to the words in the clauses; and a conditional random field CRF layer, configured to generate a scene element tag of the target word according to the code of the target word and the context information of the target word input by the GRU layer. The scene element labels comprise crowd labels, time labels, place labels, demand labels and emotion labels.
A recommendation module 50, configured to generate a tag of the interest point according to a scene element tag; and recommending according to the labels of the interest points.
The first processing module 60 is configured to perform word segmentation on the comment information to form a plurality of word segments, and obtain a first word boundary corresponding to each word segment; determining a second word boundary corresponding to each scene element according to the scene element tag; and determining target scene elements meeting preset conditions from all the scene elements, and deleting scene element labels corresponding to the target scene elements, wherein the preset conditions comprise that no first word boundary contained in the second word boundary exists.
The second processing module 70 is configured to process the scene elements corresponding to the scene element tags through the semantic smoothness model, and obtain semantic smoothness corresponding to each scene element; and determining a target scene element with the semantic smoothness smaller than a preset threshold value, and deleting a scene element label corresponding to the target scene element.
The explanation of the extraction method of the scene element in the foregoing embodiment is also applicable to the extraction device of the scene element in this embodiment, and will not be repeated here.
According to the scene element extraction device of the embodiment of the application, comment information of a point of interest is obtained, and the comment information is extracted through the sequence labeling model to obtain the scene element tags in it. Scene elements are thus extracted from comment information, and a scene graph can further be built based on the scene elements for POI recommendation. In addition, compared with tag extraction schemes based on syntactic templates, multiple templates do not need to be configured, which avoids complex template development and improves development efficiency.
To achieve the above embodiments, the present application also proposes a computer program product which, when executed by a processor, implements the scene element extraction method according to any of the previous embodiments.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
Fig. 7 shows a block diagram of an electronic device for the scene element extraction method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 7, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 701 is illustrated in fig. 7.
Memory 702 is a non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method for extracting scene elements provided by the application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to execute the extraction method of scene elements provided by the present application.
The memory 702 is used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the acquisition module 10 and the extraction module 20 shown in fig. 5) corresponding to the extraction method of scene elements in the embodiment of the application. The processor 701 executes various functional applications of the server and data processing, i.e., implements the extraction method of scene elements in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 702.
Memory 702 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created according to the use of the electronic device, etc. In addition, the memory 702 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 702 may optionally include memory located remotely from processor 701, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the extraction method of the scene element may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or otherwise, in fig. 7 by way of example.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, such as a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, and like input devices. The output device 704 may include a display apparatus, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computing programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed embodiments are achieved, and are not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (12)

1. A method for extracting a scene element, comprising:
comment information of interest points is obtained, wherein the interest points refer to geographic objects abstracted to points;
extracting comment information of the interest point through a sequence labeling model to obtain scene element labels in the comment information, wherein the sequence labeling model comprises a pre-training model, at least one bidirectional gate cycle unit GRU layer and a conditional random field CRF layer, and the scene element labels comprise crowd labels, time labels, place labels, demand labels and emotion labels;
the method further comprises the steps of:
generating a label of the interest point according to the scene element label; and
recommending according to the labels of the interest points;
after extracting comment information of the interest point through a sequence labeling model to obtain scene element labels in the comment information, the method further comprises the following steps:
performing word segmentation on the comment information to form a plurality of word segments, and acquiring a first word boundary corresponding to each word segment;
determining a second word boundary corresponding to each scene element according to the scene element tag;
determining target scene elements meeting preset conditions from all the scene elements, and deleting scene element labels corresponding to the target scene elements, wherein the preset conditions comprise that a first word boundary contained in the second word boundary does not exist;
the recommending according to the label of the interest point comprises the following steps:
selecting candidate scene elements with occurrence times larger than a preset threshold value for each type of scene element labels;
generating a label of the interest point according to the candidate scene element and the corresponding scene element label;
and recommending according to the labels of the interest points.
2. The method for extracting a scene element according to claim 1, further comprising, after obtaining comment information of the point of interest:
acquiring punctuation marks in the comment information;
dividing the comment information according to the punctuation marks to generate a plurality of clauses;
and cutting words from the multiple clauses to form multiple words.
3. The method for extracting a scene element according to claim 2, further comprising:
and converting words in the multiple clauses through a pre-training model to generate codes corresponding to the words.
4. A method of extracting a scene element according to claim 3,
at least one bidirectional gate cycle unit GRU layer, which is used for obtaining the codes of the target words and generating the context information of the target words according to the codes corresponding to the words in the clauses; and
and the conditional random field CRF layer is used for generating a scene element tag of the target word according to the code of the target word and the context information of the target word, which are input by the GRU layer.
5. The method of claim 1, further comprising, after extracting comment information of the point of interest by a sequence annotation model to obtain scene element tags among the comment information:
processing the scene elements corresponding to the scene element labels through a semantic smoothness model to obtain the semantic smoothness corresponding to each scene element;
and determining a target scene element with the semantic smoothness smaller than a preset threshold value, and deleting a scene element label corresponding to the target scene element.
6. A scene element extraction device, comprising:
the system comprises an acquisition module, a judgment module and a judgment module, wherein the acquisition module is used for acquiring comment information of interest points, wherein the interest points refer to geographic objects abstracted to be points;
the extraction module is used for extracting comment information of the interest points through a sequence labeling model to obtain scene element labels in the comment information, wherein the sequence labeling model comprises a pre-training model, at least one bidirectional gate cycle unit GRU layer and a conditional random field CRF layer, and the scene element labels comprise crowd labels, time labels, place labels, demand labels and emotion labels;
the device further comprises:
the recommendation module is used for generating the label of the interest point according to the scene element label; and
recommending according to the labels of the interest points;
the first processing module is used for performing word segmentation on the comment information to form a plurality of word segments, and acquiring a first word boundary corresponding to each word segment;
determining a second word boundary corresponding to each scene element according to the scene element tag;
determining target scene elements meeting preset conditions from all the scene elements, and deleting scene element labels corresponding to the target scene elements, wherein the preset conditions comprise that a first word boundary contained in the second word boundary does not exist;
the recommendation module is specifically configured to:
selecting candidate scene elements with occurrence times larger than a preset threshold value for each type of scene element labels;
generating a label of the interest point according to the candidate scene element and the corresponding scene element label;
and recommending according to the labels of the interest points.
7. The extraction apparatus of scene elements according to claim 6, further comprising:
the preprocessing module is used for acquiring punctuation marks in the comment information;
dividing the comment information according to the punctuation marks to generate a plurality of clauses;
and cutting words from the multiple clauses to form multiple words.
8. The extraction apparatus of scene elements according to claim 7, further comprising:
and the conversion module is used for converting words in the multiple clauses through a pre-training model so as to generate codes corresponding to the words.
9. The extraction device of scene elements according to claim 8, characterized by at least one bidirectional gate cycle unit GRU layer for obtaining the codes of the target words and generating the context information of the target words according to the codes corresponding to the words in the clauses; and
and the conditional random field CRF layer is used for generating a scene element tag of the target word according to the code of the target word and the context information of the target word, which are input by the GRU layer.
10. The extraction apparatus of scene elements according to claim 6, further comprising:
the second processing module is used for processing the scene elements corresponding to the scene element labels through the semantic smoothness model to obtain the semantic smoothness corresponding to each scene element;
and determining a target scene element with the semantic smoothness smaller than a preset threshold value, and deleting a scene element label corresponding to the target scene element.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of extracting a scene element according to any one of claims 1-5.
12. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of extracting a scene element according to any one of claims 1-5.
CN202010295057.4A 2020-04-15 2020-04-15 Scene element extraction method, device, equipment and storage medium Active CN111597458B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010295057.4A CN111597458B (en) 2020-04-15 2020-04-15 Scene element extraction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010295057.4A CN111597458B (en) 2020-04-15 2020-04-15 Scene element extraction method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111597458A CN111597458A (en) 2020-08-28
CN111597458B true CN111597458B (en) 2023-11-17

Family

ID=72183213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010295057.4A Active CN111597458B (en) 2020-04-15 2020-04-15 Scene element extraction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111597458B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094483B (en) * 2021-03-30 2023-04-25 东风柳州汽车有限公司 Method and device for processing vehicle feedback information, terminal equipment and storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007264789A (en) * 2006-03-27 2007-10-11 Toshiba Corp Scene information extraction method, scene extraction method and extraction device
CN104602042A (en) * 2014-12-31 2015-05-06 合一网络技术(北京)有限公司 User behavior based label setting method
CN107465754A (en) * 2017-08-23 2017-12-12 北京搜狐新媒体信息技术有限公司 A kind of news recommends method and apparatus
CN108153856A (en) * 2017-12-22 2018-06-12 北京百度网讯科技有限公司 For the method and apparatus of output information
CN108363695A (en) * 2018-02-23 2018-08-03 西南交通大学 A kind of user comment attribute extraction method based on bidirectional dependency syntax tree characterization
CN108388560A (en) * 2018-03-17 2018-08-10 北京工业大学 GRU-CRF meeting title recognition methods based on language model
WO2019227505A1 (en) * 2018-06-02 2019-12-05 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for training and using chatbot
CN110674629A (en) * 2019-09-27 2020-01-10 上海智臻智能网络科技股份有限公司 Punctuation mark model and its training method, equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Improved BLSTM Food Review Sentiment Analysis with Positional Attention Mechanisms; Li Yong; Journal of Zhengzhou University (Engineering Science); full text *
Product review opinion extraction method based on deep learning and CRFs; 睢国钦; 那日萨; 彭振; Journal of Intelligence (情报杂志), no. 05; full text *
A GRU+CRF method for entity-attribute extraction; 王仁武; 孟现茹; 孔琦; Modern Information (现代情报), no. 10; full text *
慕容伟波. Research and analysis of Chinese word segmentation based on a GRU neural network combined with CRF. China Master's Theses Full-text Database (Information Science and Technology), 2019, no. 1, pp. 24-26. *
靳健 et al. Research on extracting use cases from product reviews for user demand analysis. Information Studies: Theory & Application (情报理论与实践), 2020, vol. 43, no. 1, p. 111. *

Also Published As

Publication number Publication date
CN111597458A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN111967268B (en) Event extraction method and device in text, electronic equipment and storage medium
CN111522994B (en) Method and device for generating information
CN111241832B (en) Core entity labeling method and device and electronic equipment
CN111079442B (en) Vectorization representation method and device of document and computer equipment
CN111241282B (en) Text theme generation method and device and electronic equipment
CN111522967B (en) Knowledge graph construction method, device, equipment and storage medium
CN112000792A (en) Extraction method, device, equipment and storage medium of natural disaster event
CN111339268B (en) Entity word recognition method and device
CN111680145A (en) Knowledge representation learning method, device, equipment and storage medium
CN111078878B (en) Text processing method, device, equipment and computer readable storage medium
CN111325020A (en) Event argument extraction method and device and electronic equipment
CN111428514A (en) Semantic matching method, device, equipment and storage medium
CN111611990B (en) Method and device for identifying tables in images
CN112541359B (en) Document content identification method, device, electronic equipment and medium
CN113220836A (en) Training method and device of sequence labeling model, electronic equipment and storage medium
CN111078825A (en) Structured processing method, structured processing device, computer equipment and medium
CN111738015B (en) Article emotion polarity analysis method and device, electronic equipment and storage medium
CN111859953A (en) Training data mining method and device, electronic equipment and storage medium
CN112434492A (en) Text labeling method and device and electronic equipment
CN115688920A (en) Knowledge extraction method, model training method, device, equipment and medium
CN113220835A (en) Text information processing method and device, electronic equipment and storage medium
CN114218940B (en) Text information processing and model training method, device, equipment and storage medium
CN111597458B (en) Scene element extraction method, device, equipment and storage medium
CN111639234B (en) Method and device for mining core entity attention points
CN112015866B (en) Method, device, electronic equipment and storage medium for generating synonymous text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant